I've been wondering about the embrace operator {{...}} and how it behaves differently in tidy selection and data masking contexts.
For dplyr verbs that use tidy selection, I can write a function using the embrace operator and it will accept an unquoted column name, a quoted column name, or an embraced variable holding the name as its argument. For example:
library(dplyr)
f <- function(df, col) {
  range(pull(df, {{col}}))
}
f(mtcars, disp)
#> [1] 71.1 472.0
f(mtcars, "disp")
#> [1] 71.1 472.0
var <- "disp"
f(mtcars, {{var}})
#> [1] 71.1 472.0
all give the same result.
However, in a data masking context, only the unquoted column name works:
f <- function(df, col) {
  df <- mutate(df, new = {{col}} * 100)
  tibble(df[1,])
}
f(mtcars, disp)
#> # A tibble: 1 × 12
#> mpg cyl disp hp drat wt qsec vs am gear carb new
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 21 6 160 110 3.9 2.62 16.5 0 1 4 4 16000
f(mtcars, "disp")
#> Error in `mutate()`:
#> ℹ In argument: `new = "disp" * 100`.
#> Caused by error in `"disp" * 100`:
#> ! non-numeric argument to binary operator
var <- "disp"
f(mtcars, {{var}})
#> Error in `mutate()`:
#> ℹ In argument: `new = "disp" * 100`.
#> Caused by error in `"disp" * 100`:
#> ! non-numeric argument to binary operator
Presumably this is because {{col}} injects the string itself, which mutate() then evaluates as a character value rather than as a column.
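One way to look into this is to inspect what actually gets captured from the argument. The sketch below uses a hypothetical helper, capture(), which is not part of the code above and only wraps rlang::enquo(). It suggests that an unquoted name arrives as a symbol, while a quoted name arrives as a plain string. My understanding is that tidy selection treats a string as a column name, whereas data masking just evaluates it as a string.

```r
# Hypothetical helper (an assumption for illustration): return the
# quosure captured from the argument, as embracing would pass it on.
capture <- function(col) rlang::enquo(col)

# Extract the expression stored inside each quosure.
sym_expr <- rlang::quo_get_expr(capture(disp))
str_expr <- rlang::quo_get_expr(capture("disp"))

is.symbol(sym_expr)     # the unquoted name is captured as a symbol
is.character(str_expr)  # the quoted name is captured as a string
```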
I can get around this by using rlang::ensym() with bang-bang (!!), which works the same in both tidy selection
f <- function(df, col) {
  col <- rlang::ensym(col)
  range(pull(df, !!col))
}
f(mtcars, disp)
#> [1] 71.1 472.0
f(mtcars, "disp")
#> [1] 71.1 472.0
var <- "disp"
f(mtcars, {{var}})
#> [1] 71.1 472.0
and data masking
f <- function(df, col) {
  col <- rlang::ensym(col)
  df <- mutate(df, new = !!col * 100)
  tibble(df[1,])
}
f(mtcars, disp)
#> # A tibble: 1 × 12
#> mpg cyl disp hp drat wt qsec vs am gear carb new
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 21 6 160 110 3.9 2.62 16.5 0 1 4 4 16000
f(mtcars, "disp")
#> # A tibble: 1 × 12
#> mpg cyl disp hp drat wt qsec vs am gear carb new
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 21 6 160 110 3.9 2.62 16.5 0 1 4 4 16000
var <- "disp"
f(mtcars, {{var}})
#> # A tibble: 1 × 12
#> mpg cyl disp hp drat wt qsec vs am gear carb new
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 21 6 160 110 3.9 2.62 16.5 0 1 4 4 16000
contexts.
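The captured symbol can also be converted to a string with rlang::as_name() for code that uses standard evaluation. A minimal sketch, assuming base R `[[` indexing is the standard-evaluation target; the name f_se is made up for this example:

```r
# Hypothetical helper, not from the code above: capture the column with
# ensym(), then convert it to a string for base-R style indexing.
f_se <- function(df, col) {
  col <- rlang::ensym(col)
  range(df[[rlang::as_name(col)]])
}

# Both call styles should reach the same column.
f_se(mtcars, disp)
f_se(mtcars, "disp")
```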
So, for me, the best option when writing functions that call dplyr verbs or set ggplot aesthetics seems to be the rlang::ensym() plus bang-bang approach. However, ensym() isn't mentioned at all in the Programming with dplyr or the Using ggplot2 in Packages vignettes. It seems like a "catch-all" solution, with the added bonus that I can use rlang::as_name() to pass the argument to standard evaluation functions. Still, I feel like I must be missing something, given that other approaches get more airtime and the rlang documentation states that "expr(), enquo(), and enquos() are sufficient for most purposes but rlang provides these other operations, either for completeness or because they are useful to experts", and I'm not sure I'm much of an expert in this area!
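One difference I can see, sketched under my own assumptions: as far as I understand the rlang documentation, ensym() only accepts a symbol or a string, so a function built on it cannot take an expression such as disp / cyl, whereas the embraced version can. The helper names f_embrace and f_ensym are hypothetical and exist only for this comparison.

```r
library(dplyr)

# Hypothetical helpers for comparison; not from the code above.
f_embrace <- function(df, col) {
  mutate(df, new = {{ col }} * 100)
}
f_ensym <- function(df, col) {
  col <- rlang::ensym(col)
  mutate(df, new = !!col * 100)
}

# An expression is passed through by the embrace version.
ok <- f_embrace(mtcars, disp / cyl)

# ensym() rejects anything that is not a symbol or a string,
# so the same call fails; the error is caught and flagged here.
res <- tryCatch(
  f_ensym(mtcars, disp / cyl),
  error = function(e) "error"
)
```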
Does anyone else have any thoughts on this?