rkahne
August 22, 2018, 4:03pm
1
A friend in our local R-stats meetup posted this question today that I found intriguing and wondered if anyone could shed light upon:
suppressMessages(library(dplyr))
wtf <- function(df = mtcars, test = "stupid"){
  df %>%
    mutate(!! test := mpg) %>%
    mutate(!! test := (!! test) + 1)
}
wtf_2 <- function(df = mtcars, test = "stupid"){
  df %>%
    mutate(!! test := mpg) %>%
    mutate(!! test := (!! as.name(test)) + 1)
}
wtf()
#> Error in mutate_impl(.data, dots): Evaluation error: non-numeric argument to binary operator.
head(wtf_2())
#> mpg cyl disp hp drat wt qsec vs am gear carb stupid
#> 1 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4 22.0
#> 2 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4 22.0
#> 3 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1 23.8
#> 4 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1 22.4
#> 5 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2 19.7
#> 6 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1 19.1
Created on 2018-08-22 by the reprex package (v0.2.0).
Obviously, we got it to work by adding the as.name() function around test, but that doesn't feel very... tidy.
We also managed to identify the source of the issue:
> quo(mutate(df, !! test := (!! test) + 1))
<quosure>
expr: ^mutate(df, "stupid" := "stupid" + 1)
env: global
> quo(mutate(df, !! test := (!! as.name(test)) + 1))
<quosure>
expr: ^mutate(df, "stupid" := stupid + 1)
env: global
But nobody could come up with the best way to get this to actually work using the tidyeval format. Do any of you have insight?
3 Likes
mara
August 22, 2018, 4:08pm
2
suppressMessages(library(dplyr))
wtf <- function(df = mtcars, test = quo(stupid)){
  df %>%
    mutate(!! test := mpg) %>%
    mutate(!! test := (!! test) + 1)
}
head(wtf())
#> mpg cyl disp hp drat wt qsec vs am gear carb stupid
#> 1 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4 22.0
#> 2 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4 22.0
#> 3 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1 23.8
#> 4 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1 22.4
#> 5 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2 19.7
#> 6 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1 19.1
Created on 2018-08-22 by the reprex package (v0.2.0.9000).
4 Likes
And using enexpr() should let you use an unquoted argument:
suppressMessages(library(dplyr))
wtf <- function(df = mtcars, test = stupid){
  test <- enexpr(test)
  df %>%
    mutate(!! test := mpg) %>%
    mutate(!! test := !! test + 1)
}
head(wtf())
#> mpg cyl disp hp drat wt qsec vs am gear carb stupid
#> 1 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4 22.0
#> 2 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4 22.0
#> 3 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1 23.8
#> 4 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1 22.4
#> 5 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2 19.7
#> 6 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1 19.1
Created on 2018-08-22 by the reprex package (v0.2.0).
7 Likes
Well, shoot, I get "Error: The LHS of := must be a string or a symbol" with your quo(stupid) code. I'm working on CRAN versions of rlang/dplyr only, though; maybe this is a newer change?
1 Like
mara
August 22, 2018, 4:20pm
5
Either way, Edgar's version with enexpr()
makes more sense, I think!
2 Likes
While enexpr is cleaner, you can also do it with enquo and quo_name:
suppressMessages(library(dplyr))
wtf <- function(df = mtcars, test = stupid){
  test <- enexpr(test)
  df %>%
    mutate(!! test := mpg) %>%
    mutate(!! test := !! test + 1)
}
wtf_2 <- function(df = mtcars, test = stupid){
  test <- enquo(test)
  test_name <- quo_name(test)
  df %>%
    mutate(!! test_name := mpg) %>%
    mutate(!! test_name := !! test + 1)
}
identical(wtf(), wtf_2())
#> [1] TRUE
Created on 2018-08-22 by the reprex package (v0.2.0).
Clearly, in this case, enexpr seems more appropriate since it takes care of both the enquo and quo_name steps in one, but this way may be handy in other cases, so I thought I would post it.
4 Likes
cderv
August 22, 2018, 4:45pm
7
You have an error because, as you saw in your "identifying the issue" code, you don't want a string on the RHS of := but a symbol (or a name). You managed to do that with as.name.
Per the rlang quotation doc, "Symbols represent the name that is given to an object in a particular context"; this is what you want.
The equivalent function in rlang is sym. So you could make it work by changing just that, while still providing test = "stupid" as a string:
suppressMessages(library(dplyr))
#> Warning: package 'dplyr' was built under R version 3.4.4
wtf <- function(df = mtcars, test = "stupid"){
  df %>%
    mutate(!! test := mpg) %>%
    mutate(!! test := !! sym(test) + 1)
}
head(wtf())
#> Warning: package 'bindrcpp' was built under R version 3.4.4
#> mpg cyl disp hp drat wt qsec vs am gear carb stupid
#> 1 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4 22.0
#> 2 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4 22.0
#> 3 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1 23.8
#> 4 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1 22.4
#> 5 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2 19.7
#> 6 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1 19.1
Created on 2018-08-22 by the reprex package (v0.1.1.9000).
All the other options mentioned in this thread are also correct if you want to go full tidyeval and provide test not as a string but directly as an expression.
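To make the string-versus-symbol distinction concrete outside of mutate(), here is a minimal sketch. It assumes the rlang package is installed; the variable `test` and the one-element list standing in for a data frame are illustrative only.

```r
# Assumes rlang is installed; `test` and the toy data list are illustrative.
library(rlang)

test <- "stupid"

# Unquoting the string inlines a character constant into the expression.
expr(!!test + 1)
#> "stupid" + 1

# Unquoting a symbol inlines a reference to a column or variable instead.
expr(!!sym(test) + 1)
#> stupid + 1

# sym() builds the same kind of object as base R's as.name().
identical(sym(test), as.name(test))
#> [1] TRUE

# Evaluated against data, only the symbol version finds the column.
eval_tidy(expr(!!sym(test) + 1), data = list(stupid = 1))
#> [1] 2
```

This is why both as.name(test) and sym(test) fix the original function: each turns the string into a symbol before it is unquoted on the right-hand side.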
7 Likes
rkahne
August 22, 2018, 5:03pm
8
Perfect! Thanks for all the help!
A more generalized example:
suppressMessages(library(dplyr))
wtf <- function(df = mtcars, var = mpg, test = stupid){
  test <- enexpr(test)
  var <- enexpr(var)
  df %>%
    mutate(!! test := !! var)
}
mtcars %>%
  wtf(mpg + 1) %>%
  head()
#> mpg cyl disp hp drat wt qsec vs am gear carb stupid
#> 1 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4 22.0
#> 2 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4 22.0
#> 3 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1 23.8
#> 4 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1 22.4
#> 5 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2 19.7
#> 6 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1 19.1
iris %>%
  wtf(Sepal.Length + 1) %>%
  head()
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species stupid
#> 1 5.1 3.5 1.4 0.2 setosa 6.1
#> 2 4.9 3.0 1.4 0.2 setosa 5.9
#> 3 4.7 3.2 1.3 0.2 setosa 5.7
#> 4 4.6 3.1 1.5 0.2 setosa 5.6
#> 5 5.0 3.6 1.4 0.2 setosa 6.0
#> 6 5.4 3.9 1.7 0.4 setosa 6.4
Created on 2018-08-22 by the reprex package (v0.2.0).
10 Likes
Here is another attempt to explain the distinctions. Please consider the following 3 examples.
library("dplyr")
mtcars %>%
  mutate(newcol := mpg) %>%
  mutate(newcol := newcol + 1) %>%
  head() # works
SYMBOL <- rlang::sym("newcol")
mtcars %>%
  mutate(!!SYMBOL := mpg) %>%
  mutate(!!SYMBOL := !!SYMBOL + 1) %>%
  head() # works
SYMSTR <- "newcol"
mtcars %>%
  mutate(!!SYMSTR := mpg) %>%
  mutate(!!SYMSTR := !!SYMSTR + 1) %>%
  head() # errors out
The third one (SYMSTR) errors out. This means that strings are not shorthand for symbols (fair enough). The issue is that the mental model that "!!" substitutes in newcol is wrong; it in fact substitutes in "newcol" (notice the quotes).
What saves a lot of examples is that dplyr::mutate() is willing to accept quoted strings on the left-hand sides of assignments (it does not insist on unquoted symbols).
mtcars %>%
  mutate("newcol" := mpg) %>%
  mutate("newcol" := newcol + 1) %>%
  head() # works
mtcars %>%
  mutate("newcol" := mpg) %>%
  mutate("newcol" := "newcol" + 1) %>%
  head() # errors out
Additional complexity comes from the fact that "!!" is willing to substitute both names (column names, names of variables, and so on) and values (strings, possibly numbers). This is obviously confusing for some users (as some expect only name substitution).
Our function wrapr::let() tries to stay closer to a names-only substitution model.
library("wrapr")
library("dplyr")
let(
  c(NEWCOL = "newcol"),
  mtcars %>%
    mutate(NEWCOL = mpg) %>%
    mutate(NEWCOL = NEWCOL + 1) %>%
    head()
)
We have a formal write up on wrapr::let()
here .
7 Likes
You might like to try friendlyeval to help resolve the correct rlang function in cases where it's unclear.
There's a string you want to use as a column name. There's a function for that:
library(friendlyeval)
wtf_3 <- function(df = mtcars, test = "stupid"){
  df %>%
    mutate(!! friendlyeval::treat_string_as_col(test) := mpg) %>%
    mutate(!! friendlyeval::treat_string_as_col(test) := !! friendlyeval::treat_string_as_col(test) + 1)
}
Looks ugly no? No worries, transpile it away with the RStudio addin to:
wtf_3 <- function(df = mtcars, test = "stupid"){
  df %>%
    mutate(!! rlang::sym(test) := mpg) %>%
    mutate(!! rlang::sym(test) := !! rlang::sym(test) + 1)
}
> identical(wtf_2(), wtf_3())
# [1] TRUE
Turns out @cderv was on the money!
4 Likes
hadley
September 11, 2018, 3:35pm
12
This is a property of R generally, not of dplyr specifically:
"x" <- 1:10
"mean"(x)
#> [1] 5.5
mean("x" = x)
#> [1] 5.5
Created on 2018-09-11 by the reprex package (v0.2.0).
Interesting.
However, notice that strings and free names are discernible in the presence of ":=" in the "..." region (which was the mutate() "assignment" case most of us were discussing).
f <- function(...) { match.call() }
f(x := 7)
# f(`:=`(x, 7))
f("x" := 7)
# f(`:=`("x", 7))
So dplyr could, in principle, decide in some cases whether or not to accept such notation (strings on the left-hand sides of :=).
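As a rough illustration of that point, here is a base-R sketch of how a function could inspect the left-hand side of := and tell the two forms apart. The helper name `lhs_kind` is hypothetical and this is not how dplyr itself is implemented; it only shows that the information survives parsing.

```r
# Hypothetical helper (not part of dplyr or rlang): reports whether the
# left-hand side of a captured `:=` call is a string or a bare symbol.
lhs_kind <- function(arg) {
  captured <- substitute(arg)  # the unevaluated expression passed as `arg`
  stopifnot(is.call(captured), identical(captured[[1]], as.name(":=")))
  lhs <- captured[[2]]
  if (is.character(lhs)) {
    "string"
  } else if (is.symbol(lhs)) {
    "symbol"
  } else {
    "other"
  }
}

lhs_kind(x := 7)
#> [1] "symbol"
lhs_kind("x" := 7)
#> [1] "string"
```

A function receiving these arguments could therefore choose to reject one form or to treat both the same way, which is the choice being described here.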
Exceptions to an "R always does this" understanding include:
quote("X"$a)
# "X"$a
X <- list(a = 5)
X$a
# [1] 5
"X"$a
# Error in "X"$a : $ operator is invalid for atomic vectors
function("X" = 1) { X }
# Error: unexpected string constant in "function("X""
mara
September 11, 2018, 4:05pm
14
Oh, interesting. Just to clarify (for myself and any future browser unfamiliar with match.call(), as I was), R gets rid of the ""* in the absence of := like so:
f <- function(...) { match.call() }
f("mean"(x))
#> f(mean(x))
f(mean("x" = x))
#> f(mean(x = x))
f(mean("x" := x))
#> f(mean(`:=`("x", x)))
f(mean(x := x))
#> f(mean(`:=`(x, x)))
Created on 2018-09-11 by the reprex package (v0.2.0.9000)
* distinguishes between free-names and strings
1 Like
Yes. The match.call() is quoting f() and its arguments (into language objects, not character). So one can also write those examples as:
quote("mean"(x))
# mean(x)
quote(mean("x" = x))
# mean(x = x)
quote(mean("x" := x))
# mean(`:=`("x", x))
quote(mean(x := x))
# mean(`:=`(x, x))
A case of this sort of thing I had to deal with in the wrapr::let()
implementation is (notes here ):
quote(d$"X")
# d$X
Though it is worth repeating that "$" isn't symmetric.
quote("X"$a)
# "X"$a
X <- list(a = 5)
X$a
# [1] 5
"X"$a
# Error in "X"$a : $ operator is invalid for atomic vectors
So this is something I have thought a bit about; it just wasn't at the top of my mind (or worth the length) in my first note.
The one that gave me chills is when Gabe Becker taught me the following:
quote(7 -> x)
# x <- 7
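A small base-R check of that last example, for anyone who wants to confirm it: the parser rewrites right assignment into left assignment, so the two quoted calls come out as the same language object. The variable name `flipped` is just for illustration.

```r
# The parser stores `7 -> x` as a call to `<-` with the operands swapped.
flipped <- quote(7 -> x)

identical(flipped, quote(x <- 7))
#> [1] TRUE

# The function being called is `<-`; there is no `->` in the parsed call.
as.character(flipped[[1]])
#> [1] "<-"
```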
2 Likes