Implementing tidyverse's operators `-`/`!` to select all but specific variables

Is there any documentation on how to implement the tidyverse's approach to selecting all but specific variables, using `-` or `!` before unquoted vectors in selection verbs?

For example:

tibble(x = 1, y = 2, z = 3) |> select(-c(x, z))

I'd be interested in implementing that feature in some functions of my package. I've had a look at the tidyselect source code, but that's a fair bit of code to scan. I feel like the magic happens inside the data mask, but any hint would be welcome.


This part lives in its own package, {tidyselect}, which relies on {rlang}. You can check the vignette.
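As a minimal sketch of what the vignette describes, assuming {tidyselect} and {rlang} are installed: `tidyselect::eval_select()` takes a selection expression and a data frame and returns the matching column positions, so `-` and `!` come for free. The name `keep_cols()` is hypothetical and only there for illustration.

```r
# Hypothetical helper: keep the columns selected through `...`.
keep_cols <- function(data, ...) {
  # Wrap the dots in a single unevaluated `c(...)` expression and let
  # tidyselect resolve it to a named integer vector of column positions.
  pos <- tidyselect::eval_select(rlang::expr(c(...)), data)
  data[pos]
}

d <- data.frame(x = 1, y = 2, z = 3)

names(keep_cols(d, x, z))
#> [1] "x" "z"
names(keep_cols(d, -c(x, z)))
#> [1] "y"
names(keep_cols(d, !c(x, z)))
#> [1] "y"
```

Delegating to `eval_select()` means the function does not have to interpret the operators itself; it only ever sees positions.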


The untidy way is just as simple interactively:

d <- data.frame(x = 1, y = 2, z = 3)

# interactively
subset(d, select = -c(x, z))
#>   y
#> 1 2

but using names is messy:

d[!(names(d) %in% c("x", "z"))]
#>   y
#> 1 2

which is why I much prefer to use numeric identifiers:

# scripts or interactive
d[,-c(1,3)]
#> [1] 2

Created on 2023-11-07 with reprex v2.0.2

Sorry, I meant using it with unquoted variables passed through the ellipsis (`...`).

For example:

tibble::tibble(x = 1, y = 2, z = 3) |> 
  (function(data, ...) {
    dots <- as.character(rlang::enexprs(...))
    data[, dots]
  })(x, z)
#> # A tibble: 1 × 2
#>       x     z
#>   <dbl> <dbl>
#> 1     1     3

Here, using `-` or `!` won't work.

I know that verbs like `select()` internally use column indexes instead of names, which allows the use of `-` but not `!`.

tibble::tibble(x = 1, y = 2, z = 3) |> 
  (function(data, ...) {
    dots <- rlang::expr(c(...))
    data[, rlang::eval_tidy(dots)]
  })(-2)
#> # A tibble: 1 × 2
#>       x     z
#>   <dbl> <dbl>
#> 1     1     3

Created on 2023-11-08 by the reprex package (v2.0.1)

I was wondering how they handle these operators in expressions. I assume they parse the expression and process the data differently depending on the operator, but I can't find it in the code, and I was curious to see how they do it. It seems to happen inside tidyselect:::vars_select_eval(), and probably inside the data_mask object coded in C.
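To make the "parse the expression and branch on the operator" idea concrete, here is a base-R sketch. It is not how tidyselect does it, only an illustration of the general technique: walk the captured expression and turn it into column positions. The names `sel_pos()` and `pick_cols()` are hypothetical, and only bare names, literal positions, `c()`, parentheses, and unary `-`/`!` are handled. Combining arguments is a plain union here, which is simpler than tidyselect's rules.

```r
# Hypothetical helper: turn one selection expression into column positions.
sel_pos <- function(expr, vars) {
  all_pos <- seq_along(vars)
  if (is.numeric(expr)) {
    # A literal position such as 2
    return(as.integer(expr))
  }
  if (is.symbol(expr)) {
    # A bare name such as x: look it up among the column names
    pos <- match(as.character(expr), vars)
    if (is.na(pos)) stop("Unknown column: ", as.character(expr))
    return(pos)
  }
  if (is.call(expr)) {
    head <- expr[[1]]
    fn <- if (is.symbol(head)) as.character(head) else ""
    args <- as.list(expr)[-1]
    if (fn %in% c("c", "(")) {
      # c(a, b) is treated as the union of its arguments
      return(unique(unlist(lapply(args, sel_pos, vars = vars))))
    }
    if (fn %in% c("-", "!") && length(args) == 1) {
      # Unary - and ! both mean "every column except these"
      return(setdiff(all_pos, sel_pos(args[[1]], vars)))
    }
  }
  stop("Unsupported selection: ", deparse(expr))
}

# Hypothetical wrapper: capture `...` unevaluated and select the columns.
pick_cols <- function(data, ...) {
  dots <- as.list(substitute(list(...)))[-1]
  pos <- unique(unlist(lapply(dots, sel_pos, vars = names(data))))
  data[pos]
}

d <- data.frame(x = 1, y = 2, z = 3)

names(pick_cols(d, x, z))
#> [1] "x" "z"
names(pick_cols(d, -c(x, z)))
#> [1] "y"
names(pick_cols(d, !c(x, z)))
#> [1] "y"
names(pick_cols(d, -2))
#> [1] "x" "z"
```

Because the expression is never evaluated as ordinary R code, `!` does not have to be a logical negation and `-` does not have to be arithmetic; each one is just a node in the call tree that the walker interprets as a complement.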

Thanks, that's useful. But I realize I really need to better understand how rlang::expr()/enexpr() and the like work…
