Hi. I'm learning R and I have the following matrix with categorical variables.
V1 V2 V3 V4 V5
YES NO NO YES NO
YES YES YES NO YES
NO YES YES YES NO
YES NO YES YES YES
YES NO YES NO YES
YES YES NO YES YES
NO YES NO NO NO
YES YES NO YES YES
YES YES YES YES NO
NO NO YES YES YES
I'm looking for a way to count how many times each category appears in each variable and build a single matrix with the counts for all the columns together. Something like this:
YES 7 6 8 5 3
NO 3 4 2 5 7
Any suggestions? As far as I can tell, the count and table functions in R only handle one column at a time.
Thank you!
df <- read.table(text = "V1 V2 V3 V4 V5
YES NO NO YES NO
YES YES YES NO YES
NO YES YES YES NO
YES NO YES YES YES
YES NO YES NO YES
YES YES NO YES YES
NO YES NO NO NO
YES YES NO YES YES
YES YES YES YES NO
NO NO YES YES YES",
header = TRUE)
sapply(X = df,
       FUN = table)
#> V1 V2 V3 V4 V5
#> NO 3 4 4 3 4
#> YES 7 6 6 7 6
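One caveat with the call above: if a column happens to contain only one of the categories, the per-column tables have different lengths and sapply() returns a list instead of a matrix. A minimal sketch of one way around that, assuming the only categories are YES and NO (the names `lvls` and `counts` are just for illustration), is to convert each column to a factor with explicit levels first. This also lets you put YES in the first row, as in the question.

```r
df <- read.table(text = "V1 V2 V3 V4 V5
YES NO NO YES NO
YES YES YES NO YES
NO YES YES YES NO
YES NO YES YES YES
YES NO YES NO YES
YES YES NO YES YES
NO YES NO NO NO
YES YES NO YES YES
YES YES YES YES NO
NO NO YES YES YES",
header = TRUE)

# assumed set of categories, in the row order we want in the result
lvls <- c("YES", "NO")

# tabulate each column against the same fixed levels, so every column
# yields a table of the same length (with 0 for any missing category)
counts <- sapply(df, function(x) table(factor(x, levels = lvls)))
counts
#>     V1 V2 V3 V4 V5
#> YES  7  6  6  7  6
#> NO   3  4  4  3  4
```

Since `counts` is an ordinary integer matrix, you can index it directly, e.g. `counts["YES", "V1"]`.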
Another approach, using pipes, that leaves you with a tibble/data frame:
library(tidyverse)
# set up the data frame
df <- read.table(text = "V1 V2 V3 V4 V5
YES NO NO YES NO
YES YES YES NO YES
NO YES YES YES NO
YES NO YES YES YES
YES NO YES NO YES
YES YES NO YES YES
NO YES NO NO NO
YES YES NO YES YES
YES YES YES YES NO
NO NO YES YES YES",
header = TRUE)
df %>%                                  # start with the data frame
  map_df(table) %>%                     # use map_df() from the purrr package to table() each column
  rownames_to_column("response") %>%    # convert the row names to a column named response
  # table() sorts the categories alphabetically, so row 1 holds the NO counts and row 2 the YES counts
  mutate(resp = case_when(response == 1 ~ "No",       # change the resulting 1s to No in resp
                          response == 2 ~ "Yes")) %>% # change the resulting 2s to Yes in resp
  select(resp, everything(), -response) # reorder the columns with resp at the front, removing response
#> # A tibble: 2 x 6
#> resp V1 V2 V3 V4 V5
#> <chr> <int> <int> <int> <int> <int>
#> 1 No 3 4 4 3 4
#> 2 Yes 7 6 6 7 6
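The pipeline above relabels the rows by position, and map_df() is marked as superseded in recent purrr releases, so its handling of table objects may differ from the output shown depending on your package versions. As a hedged alternative, here is a sketch that keeps the original category labels by reshaping to long format, counting, and reshaping back. It assumes dplyr and tidyr (version 1.0 or later, for pivot_longer() and pivot_wider()) are installed; the column names `variable` and `resp` and the object name `counts` are just illustrative choices.

```r
library(dplyr)
library(tidyr)

df <- read.table(text = "V1 V2 V3 V4 V5
YES NO NO YES NO
YES YES YES NO YES
NO YES YES YES NO
YES NO YES YES YES
YES NO YES NO YES
YES YES NO YES YES
NO YES NO NO NO
YES YES NO YES YES
YES YES YES YES NO
NO NO YES YES YES",
header = TRUE)

counts <- df %>%
  # stack all columns into two columns: the variable name and its response
  pivot_longer(everything(), names_to = "variable", values_to = "resp") %>%
  # count each (variable, response) pair; the count lands in a column named n
  count(variable, resp) %>%
  # spread the variables back out into columns, using 0 for absent pairs
  pivot_wider(names_from = variable, values_from = n, values_fill = 0L)

counts
```

The result should be a tibble with one row per category (labelled NO and YES exactly as in the data) and one count column per variable, so no manual relabelling with case_when() is needed.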