I am having trouble passing a filtering condition to a function through the dots (...). The first example (ex1) works fine, but the second one (ex2) does not. How can I pass the filtering condition from the dots into the summarise() call inside my function?
Do you have any other suggestions for solving this problem?
Thank you for your help.
library(tidyverse)
dfex <- tibble(cat = sample(c("A", "B", "C", NA_character_), size = 1000, replace = TRUE),
               subcat = sample(c(letters, NA_character_), size = 1000, replace = TRUE))
# ex1, OK
dfex %>%
  filter(is.na(subcat)) %>%
  count(cat, sort = TRUE) %>%
  mutate(p = n / sum(n))
#> # A tibble: 4 x 3
#> cat n p
#> <chr> <int> <dbl>
#> 1 C 11 0.367
#> 2 B 10 0.333
#> 3 <NA> 5 0.167
#> 4 A 4 0.133
fraction1 <- function(df, group, ...){
  group = enquo(group)
  df %>%
    filter(...) %>%
    count(!! group, sort = TRUE) %>%
    mutate(p = n / sum(n))
}
dfex %>% fraction1(cat, is.na(subcat))
#> # A tibble: 4 x 3
#> cat n p
#> <chr> <int> <dbl>
#> 1 C 11 0.367
#> 2 B 10 0.333
#> 3 <NA> 5 0.167
#> 4 A 4 0.133
dfex %>% fraction1(cat, subcat == "x")
#> # A tibble: 4 x 3
#> cat n p
#> <chr> <int> <dbl>
#> 1 <NA> 13 0.351
#> 2 C 12 0.324
#> 3 B 8 0.216
#> 4 A 4 0.108
# ex2, NOT OK
dfex %>%
  group_by(cat) %>%
  summarise(n_condition = sum(is.na(subcat)),
            n = n()) %>%
  mutate(p = n_condition / n) %>%
  arrange(desc(p))
#> # A tibble: 4 x 4
#> cat n_condition n p
#> <chr> <int> <int> <dbl>
#> 1 B 10 244 0.0410
#> 2 C 11 276 0.0399
#> 3 <NA> 5 239 0.0209
#> 4 A 4 241 0.0166
fraction2 <- function(df, group, ...){
  group = enquo(group)
  df %>%
    group_by(!! group) %>%
    summarise(n_condition = sum(...),
              n = n()) %>%
    mutate(p = n_condition / n) %>%
    arrange(desc(p))
}
dfex %>% fraction2(cat, is.na(subcat))
#> Error: object 'subcat' not found
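The error most likely comes from where the dots end up. In fraction1 they are handed directly to filter(), which quotes its arguments and evaluates them against the data. In fraction2 they are handed to sum(), an ordinary function that evaluates its arguments in the calling environment, where no object named subcat exists. One possible workaround, sketched below, is to forward the dots to summarise() itself and write the whole summary expression at the call site. This is only an assumption about one way to do it, not necessarily the suggestion referred to in the follow-up; the name fraction2_dots and the seed are hypothetical.

```r
library(dplyr)

set.seed(1)  # hypothetical seed, only so the sketch is reproducible
dfex <- tibble(
  cat    = sample(c("A", "B", "C", NA_character_), size = 1000, replace = TRUE),
  subcat = sample(c(letters, NA_character_), size = 1000, replace = TRUE)
)

# Hypothetical variant of fraction2: the dots go straight to summarise(),
# which quotes them, so column names such as subcat are looked up in the data.
fraction2_dots <- function(df, group, ...) {
  group <- enquo(group)
  df %>%
    group_by(!!group) %>%
    summarise(..., n = n()) %>%
    mutate(p = n_condition / n) %>%  # assumes the caller names a summary n_condition
    arrange(desc(p))
}

# The caller has to supply the full, named summary expression.
res_dots <- dfex %>% fraction2_dots(cat, n_condition = sum(is.na(subcat)))
```

The trade-off is that the function now silently relies on the caller naming the summary n_condition, so part of the logic lives outside the function.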
Thank you for your suggestion. It is a perfect solution and it solves my problem, but I have one concern.
I have to include the definition of the n_condition variable in the function call, but my original intention was to define this variable only inside the function. Logically, I think the summing (counting the rows where the condition is met) should be part of the function definition.
I am still learning tidyeval, and sometimes I feel that the learning curve is a bit steep. I had hoped that solving my original problem would help me better understand how tidyeval, closures, and environments work.
However, I do want to mention that this way of doing things is rather obscure and is likely to lead to confusion down the line. Obviously, this is an example or approximation of your real problem, so it's difficult to say for certain whether that will be an issue for your specific use case. Just a word of caution.
I am still wondering how to decide when it is enough to use only the dots and when quotation is required. In the first example, both solutions work.
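A rough rule of thumb, as far as I understand it: the dots can be forwarded unchanged when they go directly into a quoting dplyr verb such as filter(...) or summarise(...). Explicit quoting is needed when the expression has to be placed inside another call, such as sum(), before that verb sees it. The sketch below shows both cases. The names fraction1_quoted and condition, and the seed, are hypothetical and only for illustration.

```r
library(dplyr)

set.seed(1)  # hypothetical seed, only so the sketch is reproducible
dfex <- tibble(
  cat    = sample(c("A", "B", "C", NA_character_), size = 1000, replace = TRUE),
  subcat = sample(c(letters, NA_character_), size = 1000, replace = TRUE)
)

# Case 1a: the dots are passed directly to filter(), so forwarding is enough.
fraction1 <- function(df, group, ...) {
  group <- enquo(group)
  df %>%
    filter(...) %>%
    count(!!group, sort = TRUE) %>%
    mutate(p = n / sum(n))
}

# Case 1b: the same function with explicit quoting (enquos) and splicing (!!!).
# Here this should behave the same as plain forwarding.
fraction1_quoted <- function(df, group, ...) {
  group <- enquo(group)
  conds <- enquos(...)
  df %>%
    filter(!!!conds) %>%
    count(!!group, sort = TRUE) %>%
    mutate(p = n / sum(n))
}

# Case 2: the condition is nested inside sum(), so it is captured as a
# single named argument with enquo() and unquoted with !! inside the call.
fraction2 <- function(df, group, condition) {
  group <- enquo(group)
  condition <- enquo(condition)
  df %>%
    group_by(!!group) %>%
    summarise(n_condition = sum(!!condition), n = n()) %>%
    mutate(p = n_condition / n) %>%
    arrange(desc(p))
}

res1  <- dfex %>% fraction1(cat, is.na(subcat))
res1q <- dfex %>% fraction1_quoted(cat, is.na(subcat))
res2  <- dfex %>% fraction2(cat, is.na(subcat))
```

With this version of fraction2, n_condition is defined entirely inside the function and the caller only supplies the bare condition.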