Is there a tidyeval way to group_by() all columns in a data frame EXCEPT for the ones you don't want grouped? I often need to come up with shares over 1 dimension, so I use group_by_() to select all names except the ones I want.
This is useful for me for programming as I don't always know the names of the other columns in the data frame. My current method seems inefficient, though.
In the example below, I group by all columns except Sex and Freq (the data). This returns the shares by sex, grouped by Hair and Eye.
Thanks. This is a helpful start. I'm still trying to wrap my head about tidyeval, but is there a solution that doesn't require me to explicitly know that Sex and Freq are the names of the excluded variables, but rather, I can pass the string c("Sex", "Freq") into the vars() function?
I can't recommend the webinar by Lionel Henry enough for wrapping your head around tidyeval: link
In this specific case you can do something like this:
library(tidyverse)
vars <- c("Sex", "Freq")
dots <- purrr::map(vars, function(variable){
dot <- rlang::sym(variable)
rlang::quo(-!!dot)
})
as.data.frame(HairEyeColor) %>%
group_by_at(vars(!!!dots)) %>%
mutate(share = Freq / sum(Freq)) %>%
ungroup()
# A tibble: 32 x 5
Hair Eye Sex Freq share
<fct> <fct> <fct> <dbl> <dbl>
1 Black Brown Male 32.0 0.471
2 Brown Brown Male 53.0 0.445
3 Red Brown Male 10.0 0.385
4 Blond Brown Male 3.00 0.429
5 Black Blue Male 11.0 0.550
6 Brown Blue Male 50.0 0.595
7 Red Blue Male 10.0 0.588
8 Blond Blue Male 30.0 0.319
9 Black Hazel Male 10.0 0.667
10 Brown Hazel Male 25.0 0.463
# ... with 22 more rows