Let's say I have a dataset with two categorical variables and I want to create a summary table that shows two things:
The number of instances (counts) of each combination of the categorical variables
The number of instances of one categorical variable by itself
Is there an elegant way of producing this summary table within a pipe workflow? My current approach works but the use of {} feels awkward:
library(dplyr)
starwars %>%
  filter(gender %in% c("male", "female"),
         eye_color %in% c("brown", "blue", "black")) %>%
  { bind_rows(count(., gender, eye_color),
              count(., eye_color) %>% mutate(gender = "any")) } %>%
  arrange(gender, desc(n))
#> # A tibble: 9 x 3
#> gender eye_color n
#> <chr> <chr> <int>
#> 1 any brown 21
#> 2 any blue 19
#> 3 any black 9
#> 4 female blue 6
#> 5 female brown 5
#> 6 female black 2
#> 7 male brown 16
#> 8 male blue 13
#> 9 male black 7
Honestly, your original is about as clear as you can get, but here's an option with Sam Firke's janitor package:
library(tidyverse)
starwars %>%
  filter(gender %in% c("male", "female"),
         eye_color %in% c("brown", "blue", "black")) %>%
  # tabyl() replaces crosstab(), which was removed in janitor 2.0
  janitor::tabyl(gender, eye_color) %>%
  janitor::adorn_totals('row') %>%
  gather(eye_color, n, -gender) %>%
  arrange(gender, desc(n))
#> gender eye_color n
#> 1 female blue 6
#> 2 female brown 5
#> 3 female black 2
#> 4 male brown 16
#> 5 male blue 13
#> 6 male black 7
#> 7 Total brown 21
#> 8 Total blue 19
#> 9 Total black 9
This mirrors how you could do it in base R with table() and addmargins():
library(dplyr)
starwars %>%
  filter(gender %in% c("male", "female"),
         eye_color %in% c("brown", "blue", "black")) %>%
  select(gender, eye_color) %>%
  table() %>%
  addmargins(1) %>%    # add a "Sum" row over gender (margin 1)
  as_tibble() %>%      # as_data_frame() is deprecated in favour of as_tibble()
  arrange(gender, desc(n))
#> # A tibble: 9 x 3
#> gender eye_color n
#> <chr> <chr> <dbl>
#> 1 female blue 6
#> 2 female brown 5
#> 3 female black 2
#> 4 male brown 16
#> 5 male blue 13
#> 6 male black 7
#> 7 Sum brown 21
#> 8 Sum blue 19
#> 9 Sum black 9
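If the braces are the main annoyance, one more possibility is to move the two count() calls into a small function so the pipe itself stays linear. The following is only a sketch: count_with_totals() and its total_label argument are hypothetical names, not part of dplyr, and the example uses a toy tibble rather than starwars so it does not depend on how a given dplyr version codes that dataset. It assumes the grouping column is character, so that the label can be bound into it.

```r
library(dplyr)

# Hypothetical helper (not a dplyr function): counts every combination of
# `group` and `item`, then appends the counts of `item` alone, labelled
# with `total_label` in the `group` column.
count_with_totals <- function(data, group, item, total_label = "any") {
  bind_rows(
    count(data, {{ group }}, {{ item }}),
    count(data, {{ item }}) %>% mutate("{{ group }}" := total_label)
  )
}

# Toy data standing in for the filtered starwars columns
toy <- tibble(
  g = c("a", "a", "b", "b", "b"),
  x = c("u", "v", "u", "u", "v")
)

toy %>%
  count_with_totals(g, x) %>%
  arrange(g, desc(n))
```

This does the same thing as the original bind_rows() approach; the only difference is that the dot placeholder and braces are hidden inside the function body.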