A confusing result from `tidyeval` (Can you believe it!?)

A friend in our local R-stats meetup posted a question today that I found intriguing, and I wondered if anyone here could shed light on it:

``````suppressMessages(library(dplyr))
wtf <- function(df = mtcars, test = "stupid"){

df %>%
mutate(!! test := mpg) %>%
mutate(!! test := (!! test) + 1)
}

wtf_2 <- function(df = mtcars, test = "stupid"){

df %>%
mutate(!! test := mpg) %>%
mutate(!! test := (!! as.name(test)) + 1)
}

wtf()
#> Error in mutate_impl(.data, dots): Evaluation error: non-numeric argument to binary operator.

wtf_2() %>% head()
#>    mpg cyl disp  hp drat    wt  qsec vs am gear carb stupid
#> 1 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4   22.0
#> 2 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4   22.0
#> 3 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1   23.8
#> 4 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1   22.4
#> 5 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2   19.7
#> 6 18.1   6  225 105 2.76 3.460 20.22  1  0    3    1   19.1
``````

Created on 2018-08-22 by the reprex package (v0.2.0).

Obviously, we got it to work by adding in the `as.name()` function around `test`, but that doesn't feel very...tidy.

We also managed to identify the source of the issue:

``````> quo(mutate(df, !! test := (!! test) + 1))
<quosure>
expr: ^mutate(df, "stupid" := "stupid" + 1)
env:  global
> quo(mutate(df, !! test := (!! as.name(test)) + 1))
<quosure>
expr: ^mutate(df, "stupid" := stupid + 1)
env:  global
``````
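The same difference can be reproduced without dplyr. The sketch below uses base R's `bquote()` as a stand-in for `!!` (an analogy only, not how rlang implements unquoting) to show that splicing in a string and splicing in a symbol build two different expressions. The `data` list is made up for illustration.

```r
test <- "stupid"

# Splicing the string itself: the expression contains the literal "stupid".
with_string <- bquote(.(test) + 1)
# Splicing a symbol built from the string: the expression refers to a variable.
with_symbol <- bquote(.(as.name(test)) + 1)

print(with_string)  # "stupid" + 1
print(with_symbol)  # stupid + 1

# Only the symbol version can be evaluated against data holding that column.
data <- list(stupid = c(21.0, 22.8))
symbol_result <- eval(with_symbol, data)
string_result <- tryCatch(
  eval(with_string, data),
  error = function(e) conditionMessage(e)
)
print(symbol_result)
print(string_result)
```

The string version fails for the same reason `wtf()` does: adding 1 to a character value is not defined.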

But, nobody could come up with the best function to get this to actually work using the `tidyeval` format. Do any of you have insight?

3 Likes
``````suppressMessages(library(dplyr))
wtf <- function(df = mtcars, test = quo(stupid)){

df %>%
mutate(!! test := mpg) %>%
mutate(!! test := (!! test) + 1)
}

wtf() %>% head()
#>    mpg cyl disp  hp drat    wt  qsec vs am gear carb stupid
#> 1 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4   22.0
#> 2 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4   22.0
#> 3 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1   23.8
#> 4 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1   22.4
#> 5 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2   19.7
#> 6 18.1   6  225 105 2.76 3.460 20.22  1  0    3    1   19.1
``````

Created on 2018-08-22 by the reprex package (v0.2.0.9000).

4 Likes

And using `enexpr()` should let you pass an unquoted argument:

``````suppressMessages(library(dplyr))
wtf <- function(df = mtcars, test = stupid){
test <- enexpr(test)
df %>%
mutate(!! test := mpg) %>%
mutate(!! test := !! test + 1)
}

wtf() %>% head()
#>    mpg cyl disp  hp drat    wt  qsec vs am gear carb stupid
#> 1 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4   22.0
#> 2 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4   22.0
#> 3 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1   23.8
#> 4 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1   22.4
#> 5 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2   19.7
#> 6 18.1   6  225 105 2.76 3.460 20.22  1  0    3    1   19.1
``````

Created on 2018-08-22 by the reprex package (v0.2.0).
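
For readers more familiar with base R: `enexpr()` plays a role similar to `substitute()`, capturing the expression the caller typed rather than its value. A minimal base-R sketch of that idea follows; the helper name `capture` is made up for illustration and is not part of rlang.

```r
# Return the unevaluated expression supplied for `x`.
capture <- function(x) substitute(x)

captured_name <- capture(stupid)    # a bare name becomes a symbol
captured_call <- capture(mpg + 1)   # a compound expression becomes a call

print(captured_name)
print(class(captured_name))   # "name"
print(class(captured_call))   # "call"
```

Neither `stupid` nor `mpg` has to exist when `capture()` is called, because the argument is never evaluated.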

7 Likes

Well, shoot, I get

Error: The LHS of `:=` must be a string or a symbol

with your `quo(stupid)` code. I'm working with the CRAN versions of rlang/dplyr only, though; maybe this is a newer change?

1 Like

Either way, Edgar's version with `enexpr()` makes more sense, I think!

2 Likes

While `enexpr` is cleaner, you can also do it with `enquo` and `quo_name`:

``````suppressMessages(library(dplyr))
wtf <- function(df = mtcars, test = stupid){
test <- enexpr(test)
df %>%
mutate(!! test := mpg) %>%
mutate(!! test := !! test + 1)
}

wtf_2 <- function(df = mtcars, test = stupid){
test <- enquo(test)
test_name <- quo_name(test)
df %>%
mutate(!! test_name := mpg) %>%
mutate(!! test_name := !! test + 1)
}

identical(wtf(), wtf_2())
#> [1] TRUE
``````

Created on 2018-08-22 by the reprex package (v0.2.0).

Clearly, in this case, `enexpr` seems more appropriate since it takes care of both `enquo` and `quo_name` steps in one, but this way may be handy in other cases, so I thought I would post.

4 Likes

You get an error because, as you saw in your "identifying the issue" code, what you need on the RHS of `:=` is a symbol (also called a name), not a string. You managed to get one with `as.name`.
Per the rlang `quotation` documentation:

"Symbols represent the name that is given to an object in a particular context."

This is what you want.

The equivalent function in `rlang` is `sym`. So you can make it work by changing just that, while still providing `test = "stupid"` as a string:

``````suppressMessages(library(dplyr))
wtf <- function(df = mtcars, test = "stupid"){
df %>%
mutate(!! test := mpg) %>%
mutate(!! test := !! sym(test) + 1)
}

wtf() %>% head()
#>    mpg cyl disp  hp drat    wt  qsec vs am gear carb stupid
#> 1 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4   22.0
#> 2 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4   22.0
#> 3 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1   23.8
#> 4 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1   22.4
#> 5 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2   19.7
#> 6 18.1   6  225 105 2.76 3.460 20.22  1  0    3    1   19.1
``````

Created on 2018-08-22 by the reprex package (v0.1.1.9000).
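
As a quick check of the string-versus-symbol distinction, here is a small sketch. It assumes only base R for the main part; the `rlang::sym()` comparison runs only if rlang is installed.

```r
# Base R: as.name() (synonym: as.symbol()) turns a string into a symbol.
sym_base <- as.name("stupid")

print(is.symbol(sym_base))                 # TRUE
print(is.character(sym_base))              # FALSE
print(identical(sym_base, quote(stupid)))  # TRUE

# rlang::sym() builds the same kind of object (needs the rlang package).
if (requireNamespace("rlang", quietly = TRUE)) {
  print(identical(rlang::sym("stupid"), sym_base))
}
```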

All the other options mentioned in this thread are also correct if you want to go full tidyeval and provide `test` directly as an expression rather than as a string.

7 Likes

Perfect! Thanks for all the help!

A more generalized example:

``````suppressMessages(library(dplyr))

wtf <- function(df = mtcars, var = mpg, test = stupid){
test <- enexpr(test)
var <- enexpr(var)
df %>%
mutate(!! test :=  !!var)
}

mtcars %>%
wtf(mpg + 1) %>%
head()
#>    mpg cyl disp  hp drat    wt  qsec vs am gear carb stupid
#> 1 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4   22.0
#> 2 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4   22.0
#> 3 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1   23.8
#> 4 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1   22.4
#> 5 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2   19.7
#> 6 18.1   6  225 105 2.76 3.460 20.22  1  0    3    1   19.1

iris %>%
wtf(Sepal.Length + 1) %>%
head()
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width Species stupid
#> 1          5.1         3.5          1.4         0.2  setosa    6.1
#> 2          4.9         3.0          1.4         0.2  setosa    5.9
#> 3          4.7         3.2          1.3         0.2  setosa    5.7
#> 4          4.6         3.1          1.5         0.2  setosa    5.6
#> 5          5.0         3.6          1.4         0.2  setosa    6.0
#> 6          5.4         3.9          1.7         0.4  setosa    6.4
``````

Created on 2018-08-22 by the reprex package (v0.2.0).

10 Likes

Here is another attempt to explain the distinctions. Please consider the following 3 examples.

``````library("dplyr")

# 1. Plain column name
mtcars %>%
mutate(newcol := mpg) %>%
mutate(newcol := newcol + 1) %>%
head()

# 2. Unquoting a symbol
SYMBOL <- rlang::sym("newcol")
mtcars %>%
mutate(!!SYMBOL := mpg) %>%
mutate(!!SYMBOL := !!SYMBOL + 1) %>%
head()

# 3. Unquoting a string
SYMSTR <- "newcol"
mtcars %>%
mutate(!!SYMSTR := mpg) %>%
mutate(!!SYMSTR := !!SYMSTR + 1) %>%
head()
``````

The third one (SYMSTR) errors out. This means that strings are not shorthand for symbols (fair enough). The issue is that the mental model in which "`!!`" substitutes in `newcol` is wrong: it in fact substitutes in `"newcol"` (notice the quotes).

What saves a lot of examples is that `dplyr::mutate()` is willing to accept quoted strings on the left-hand side of assignments (it does not insist on unquoted symbols).

``````mtcars %>%
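mutate("newcol" := mpg) %>%
mutate("newcol" := newcol + 1) %>%
head()
# works: a string is accepted on the left-hand side

mtcars %>%
mutate("newcol" := mpg) %>%
mutate("newcol" := "newcol" + 1) %>%
head()
# errors: on the right-hand side, "newcol" is a string, not a column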
``````

Additional complexity comes from the fact that "`!!`" is willing to substitute both names (column names, names of variables, and so on) and values (strings, possibly numbers). This clearly confuses some users, who expect name substitution only.

Our function `wrapr::let()` tries to stay closer to a names-only substitution model.

``````library("wrapr")
library("dplyr")
let(
c(NEWCOL = "newcol"),
mtcars %>%
mutate(NEWCOL = mpg) %>%
mutate(NEWCOL = NEWCOL + 1) %>%
head()
)
``````

We have a formal write up on `wrapr::let()` here.

7 Likes

You might like to try `friendlyeval` to help resolve the correct `rlang` function in cases where it's unclear.

There's a string you want to use as a column name. There's a function for that:

``````library(friendlyeval)
wtf_3 <- function(df = mtcars, test = "stupid"){

df %>%
mutate(!! friendlyeval::treat_string_as_col(test) := mpg) %>%
mutate(!! friendlyeval::treat_string_as_col(test) := !! friendlyeval::treat_string_as_col(test) + 1)
}
``````

Looks ugly, no? No worries, transpile it away with the RStudio addin to:

``````wtf_3 <- function(df = mtcars, test = "stupid"){

df %>%
mutate(!! rlang::sym(test) := mpg) %>%
mutate(!! rlang::sym(test) := !! rlang::sym(test) + 1)
}

> identical(wtf_2(), wtf_3())
# [1] TRUE
``````

Turns out @cderv was on the money!

4 Likes

This is a property of R generally, not of dplyr specifically:

``````"x" <- 1:10
"mean"(x)
#> [1] 5.5
mean("x" = x)
#> [1] 5.5
``````

Created on 2018-09-11 by the reprex package (v0.2.0).

Interesting.

However, notice strings and free-names are discernible in the presence of "`:=`" in the "`...`" region (which was the `mutate()` "assignment" case most of us were discussing).

``````f <- function(...) { match.call() }

f(x := 7)
# f(`:=`(x, 7))

f("x" := 7)
# f(`:=`("x", 7))
``````

So `dplyr` could, in principle, decide in some cases whether or not to accept such notation (strings on the left-hand side of `:=`).
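
To make that concrete, here is a minimal base-R sketch of how a function could tell the two spellings apart once it has the quoted `:=` call. The helper `lhs_kind()` is hypothetical and is not how dplyr is implemented.

```r
# Classify the left-hand side of a quoted `:=` call.
lhs_kind <- function(expr) {
  stopifnot(is.call(expr), identical(expr[[1]], as.name(":=")))
  lhs <- expr[[2]]
  if (is.symbol(lhs)) {
    "symbol"
  } else if (is.character(lhs)) {
    "string"
  } else {
    "other"
  }
}

print(lhs_kind(quote(x := 7)))      # "symbol"
print(lhs_kind(quote("x" := 7)))    # "string"
```

Because the parser keeps the string as a string here, the receiving function has enough information to accept it, reject it, or convert it to a symbol.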

Exceptions to an "R always does this" understanding include:

``````quote("X"$a)
# "X"$a

X <- list(a = 5)
X$a
# [1] 5
"X"$a
# Error in "X"$a : $ operator is invalid for atomic vectors
``````
``````function("X" = 1) { X }
Error: unexpected string constant in "function("X""
``````

Oh, interesting. Just to clarify (for myself and any future browser unfamiliar with `match.call()`, as I was), R gets rid of the `""`* in the absence of `:=` like so:

``````f <- function(...) { match.call() }
f("mean"(x))
#> f(mean(x))
f(mean("x" = x))
#> f(mean(x = x))

f(mean("x" := x))
#> f(mean(`:=`("x", x)))
f(mean(x := x))
#> f(mean(`:=`(x, x)))
``````

Created on 2018-09-11 by the reprex package (v0.2.0.9000)

* distinguishes between free-names and strings

1 Like

Yes. The `match.call()` is quoting `f()` and arguments (into `language` objects, not `character`). So one can also write those examples as:

``````quote("mean"(x))
# mean(x)
quote(mean("x" = x))
# mean(x = x)

quote(mean("x" := x))
# mean(`:=`("x", x))
quote(mean(x := x))
# mean(`:=`(x, x))
``````
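
A small sketch to confirm this equivalence programmatically, comparing the argument captured by `match.call()` with the result of `quote()` (base R only; `x` is never evaluated, so it does not need to exist):

```r
f <- function(...) match.call()

captured <- f("mean"(x))
print(class(captured))   # "call": a language object, not a character string

# The first argument of the captured call is the same object quote() produces.
inner <- captured[[2]]
print(identical(inner, quote(mean(x))))

# The quoted argument name is also normalised to a plain name.
named <- f(mean("x" = x))[[2]]
print(identical(named, quote(mean(x = x))))
```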

A case of this sort of thing I had to deal with in the `wrapr::let()` implementation is (notes here):

``````quote(d$"X")
# d$X
``````

Though it is worth repeating that "`$`" isn't symmetric.

``````quote("X"$a)
# "X"$a

X <- list(a = 5)
X$a
# [1] 5
"X"$a
# Error in "X"$a : $ operator is invalid for atomic vectors
``````

So it is something I have thought a bit about; it just wasn't at the top of my mind (or worth the length) in my first note.

The one that gave me chills is when Gabe Becker taught me the following:

``````quote(7 -> x)
# x <- 7
``````
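
The rewrite happens in the parser, so the two spellings end up as the same language object. A short base-R check of that claim:

```r
right <- quote(7 -> x)
left <- quote(x <- 7)

# Both forms parse to the same call; the function being called is `<-`.
print(identical(right, left))
print(as.character(right[[1]]))   # "<-"

# The same holds for the super-assignment arrows.
print(identical(quote(7 ->> x), quote(x <<- 7)))
```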
2 Likes