I'm not sure if this is a bug or I'm doing something wrong, but when I try to drop variables contained in a quosures object via select, the behavior is different depending on whether there are one or more variables in the quosures object.
select_not <- function(d, ...) {
  to_drop <- rlang::quos(...)
  dplyr::select(d, -!!!to_drop)
}
dd <- data.frame(x = 1:10, y = 11:20, z = 21:30)
# Good: returns x and y
select_not(dd, z)
# Not good: Returns y and z
select_not(dd, y, z)
> devtools::session_info()
Session info ---------------------------------------------------------------------------------------------------------------------
 setting  value
 version  R version 3.4.3 (2017-11-30)
 system   x86_64, darwin15.6.0
 ui       RStudio (1.1.383)
 language (EN)
 collate  en_US.UTF-8
 tz       America/Denver
 date     2018-01-03
Packages -------------------------------------------------------------------------------------------------------------------------
 package    * version date       source
 assertthat   0.2.0   2017-04-11 CRAN (R 3.4.0)
 base       * 3.4.3   2017-12-07 local
 bindr        0.1     2016-11-13 CRAN (R 3.4.0)
 bindrcpp     0.2     2017-06-17 CRAN (R 3.4.0)
 compiler     3.4.3   2017-12-07 local
 datasets   * 3.4.3   2017-12-07 local
 devtools     1.13.3  2017-08-02 CRAN (R 3.4.1)
 digest       0.6.12  2017-01-27 CRAN (R 3.4.0)
 dplyr        0.7.4   2017-09-28 CRAN (R 3.4.2)
 glue         1.2.0   2017-10-29 cran (@1.2.0)
 graphics   * 3.4.3   2017-12-07 local
 grDevices  * 3.4.3   2017-12-07 local
 magrittr     1.5     2014-11-22 CRAN (R 3.4.0)
 memoise      1.1.0   2017-04-21 CRAN (R 3.4.0)
 methods    * 3.4.3   2017-12-07 local
 pkgconfig    2.0.1   2017-03-21 CRAN (R 3.4.0)
 R6           2.2.2   2017-06-17 CRAN (R 3.4.0)
 Rcpp         0.12.14 2017-11-23 cran (@0.12.14)
 rlang        0.1.6   2017-12-21 CRAN (R 3.4.3)
 stats      * 3.4.3   2017-12-07 local
 tibble       1.3.4   2017-08-22 cran (@1.3.4)
 tools        3.4.3   2017-12-07 local
 utils      * 3.4.3   2017-12-07 local
 withr        2.1.0   2017-11-01 cran (@2.1.0)
 yaml         2.1.14  2016-11-12 CRAN (R 3.4.0)
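One plausible reading of the behavior reported above is that `!!!` splices every captured variable into the argument list of a single call to `-`. With one variable that is a unary minus; with two it becomes a subtraction. The sketch below only builds the expressions and does not touch any data. It uses `rlang::exprs()` with hard-coded names `y` and `z` as stand-ins for the quosures in the original function, and it assumes a reasonably recent rlang.

```r
library(rlang)

# Illustration only: construct what `-!!!` produces, without evaluating it.
one_var  <- expr(-!!!exprs(z))
two_vars <- expr(-!!!exprs(y, z))

print(one_var)   # a unary minus applied to z
print(two_vars)  # one binary call to `-`, with y and z as its two arguments
```

If that reading is right, select() under dplyr 0.7.4 would evaluate the second form on column positions (2 - 3 = -1) and drop the first column, which matches the "returns y and z" result. Newer dplyr versions may raise an error here instead.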
Thanks @mara! How can I "flatten" the quosures object inside the function, to get the same result but with the variables passed in separately through ... ?
select has a handy tool for these situations: you can negate the result of one_of. However, this requires getting the variable names into a character vector.
After to_drop <- rlang::quos(...), getting the character strings is easy enough with as.character, but each string carries an extraneous ~ character. A quick sub can drop it, and you can then pass the cleaned names to -one_of().
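A minimal sketch of the one_of idea, under some assumptions: `select_not_chr` is a hypothetical name, and `rlang::as_name()` is swapped in for the as.character plus sub step, because the `~` prefix depends on how rlang 0.1.x represented quosures and may not survive in later releases.

```r
library(dplyr)
library(rlang)

# Hypothetical helper: turn the bare names in ... into strings,
# then negate one_of() to drop those columns.
select_not_chr <- function(d, ...) {
  # as_name() is an assumption here; it expects each argument to be a bare name
  to_drop <- vapply(quos(...), as_name, character(1))
  select(d, -one_of(unname(to_drop)))
}

dd <- data.frame(x = 1:10, y = 11:20, z = 21:30)
names(select_not_chr(dd, y, z))
```

Because the names end up as plain strings, the same helper body would also work if the caller already had a character vector of column names.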
Personally, I find the following maintains the look and feel of a tidyverse function, but gets to the heart of the issue more efficiently.
select_not <- function(d, ...) {
  # get expressions in ... as characters
  to_drop <-
    vapply(substitute(list(...)),
           as.character,
           character(1))[-1] # the first one ends up being "list", and we don't need it
  d[!names(d) %in% to_drop]
}
Based on @lionel's answer to a related question on SO:
You can use rlang::lang, though I admit that the quosures created by that are a little weird (of the form -~z). It may end up a bit brittle because of that, I suspect.
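A rough sketch of the call-building idea, with two assumptions stated up front: it uses `rlang::call2()`, which, as far as I know, is the name `rlang::lang()` was given in later rlang releases, and it captures plain symbols with `ensyms()` rather than quosures, so the generated calls look like `-z` instead of `-~z`. `select_not_call` is a hypothetical name.

```r
library(dplyr)
library(rlang)

# Hypothetical helper: wrap each captured symbol in a unary minus call,
# then splice the resulting calls into select().
select_not_call <- function(d, ...) {
  vars <- ensyms(...)                                # bare names as symbols
  negated <- lapply(vars, function(v) call2("-", v)) # builds -y, -z, ...
  select(d, !!!negated)
}

dd <- data.frame(x = 1:10, y = 11:20, z = 21:30)
names(select_not_call(dd, y, z))
```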
The lang helpfile is a lesson. I hope the dives into rlang and NSE make us collectively better R users and programmers and we're not just learning a series of one-off tricks. I'll admit it hasn't come together for me yet, but I haven't tried too hard and am still optimistic.
I expect there may be some good resources available for learning rlang in the next year or so; the underlying code seems to be in enough flux at the moment that creating definitive documentation probably isn't a hugely worthwhile endeavor (beyond using it at a surface level).
I'll give you "better R users." However, I think "better R programmers" is up for debate. I'm not opposed to people programming with tidy evaluation, but it isn't all roses and rainbows. It has a performance cost. How much that performance cost affects your decisions depends on how you envision your work being used. If it gets used once, probably not a big deal. If it gets used in any kind of resampling, it can be a very big deal.
For instance, using the examples here, avoiding quosures altogether results in an execution time of about 200 microseconds. Using the quosures takes about 20,000 microseconds. If I needed to run this in a routine 10,000 times (say for a bootstrap procedure), that translates into 2 seconds with standard evaluation and over 3 minutes with quosures. There's a pretty good thread on the subject here
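To check the cost on your own machine, something like the following could be a starting point. It is only a sketch: `drop_se` and `drop_tidy` are hypothetical helpers, the repetition count is arbitrary, and the timings will differ from the figures quoted above depending on package versions and hardware.

```r
library(dplyr)
library(rlang)

dd <- data.frame(x = 1:10, y = 11:20, z = 21:30)

# Standard evaluation: column names passed as strings (hypothetical helper).
drop_se <- function(d, cols) d[!names(d) %in% cols]

# Tidy evaluation: bare names captured and negated (hypothetical helper).
drop_tidy <- function(d, ...) {
  negated <- lapply(exprs(...), function(e) expr(-!!e))
  select(d, !!!negated)
}

n <- 1000  # assumed repetition count, pick whatever suits your use case
time_se   <- system.time(for (i in seq_len(n)) drop_se(dd, c("y", "z")))
time_tidy <- system.time(for (i in seq_len(n)) drop_tidy(dd, y, z))

# Elapsed seconds for n calls of each version
print(c(se = time_se[["elapsed"]], tidy = time_tidy[["elapsed"]]))
```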
If your question's been answered, would you mind selecting the solution? (I believe it's Nick's.) That way we know that your problem's solved, and someone in the future knows what the solution was (I like how Discourse puts the solution right at the bottom of the question).
If you're the OP, there will be a little check box at the footer of replies to the thread. To select a solution, you just click on it.
Hi all, this is how I would go about that function. I start with exprs to get the list of fields back, and then add the minus sign to each expression using expr and map to iterate through all of the dots arguments (fields):
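The approach described above might look roughly like this. It is a reconstruction from the description (exprs, expr, and map), not necessarily the exact code that was posted, and it assumes dplyr, rlang, and purrr are installed.

```r
library(dplyr)
library(rlang)
library(purrr)

select_not <- function(d, ...) {
  fields  <- exprs(...)                    # capture the dots as a list of expressions
  negated <- map(fields, ~ expr(-!!.x))    # prepend a minus sign to each one
  select(d, !!!negated)                    # splice -y, -z, ... into select()
}

dd <- data.frame(x = 1:10, y = 11:20, z = 21:30)
names(select_not(dd, z))
names(select_not(dd, y, z))
```

Each variable gets its own unary minus before splicing, so select() receives separate `-y` and `-z` arguments rather than one call containing both.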
@edgararuiz, that's awesome. Any recommended reading besides the programming with dplyr vignette and the rlang helpfiles for getting one's head around this stuff?
Thanks. I know this chapter is still in flight, but Hadley's updated version of Advanced R may be the best place to go. Specifically, the idea behind expr(- !!.x), which is to combine variables and operators to create a new expression, is covered in this section: https://github.com/hadley/adv-r/blob/master/Quotation.Rmd#generating-code
Cheers! This is the same strategy I used when I wrote an answer for this question on Stack Overflow. If anyone wants an explanation of each step here, see my answer.