I'm playing around with some code for a data frame in which a particular "role" ("foo" below) has been defined to refer to a specific column (bar) at construction time:
set_foo_col <- \(x, col) `attr<-`(x, "foo", col)
get_foo_col <- \(x) attr(x, "foo")
data <- tibble::tibble(bar = 1:10) |>
  set_foo_col("bar")
I wanted to save myself from having to write a bunch of code like dplyr::mutate(x, baz = .data[[get_foo_col(x)]]), instead hoping I could do some meta-programming shenanigans to use .foo as an alias in data-mask contexts (i.e. dplyr::mutate(x, baz = .foo) instead).
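For reference, here is the verbose form I'd like to avoid, as a self-contained snippet. It repeats the two helpers from above so it runs on its own; the name `verbose` is only there for illustration.

```r
# Helpers repeated from above so the snippet is self-contained.
set_foo_col <- \(x, col) `attr<-`(x, "foo", col)
get_foo_col <- \(x) attr(x, "foo")

data <- set_foo_col(tibble::tibble(bar = 1:10), "bar")

# Look the role column up by name through the `.data` pronoun.
# `data` is not a column, so it resolves to the data frame in the
# calling environment and `get_foo_col(data)` returns "bar".
verbose <- dplyr::mutate(data, baz = .data[[get_foo_col(data)]])
```

This works, but the lookup has to be spelled out in every expression that needs the column.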
This naïve solution fails, but expresses my intent, I hope:
my_mutate <- function(.data, ...) {
  data <- rlang::enquo(.data)
  dots <- rlang::enquos(...)
  expr <- rlang::expr(dplyr::mutate(!!data, !!!dots))
  env <- rlang::env(
    rlang::caller_env(),
    .foo = rlang::sym(get_foo_col(.data)) # This is wrong, but what to do instead?
  )
  rlang::eval_tidy(expr, env = env)
}
This fails with:
data |> my_mutate(baz = .foo)
#> Error in `dplyr::mutate()`:
#> ℹ In argument: `baz = .foo`.
#> Caused by error:
#> ! object '.foo' not found
Created on 2024-07-18 with reprex v2.1.1
Some things I have tried that didn't work:
- I can create a data mask from scratch (with rlang::as_data_mask and friends), but I can only figure out how to use this new mask by reimplementing tons of dplyr internals. That is, I don't know how to just "pass" it to the existing dplyr::mutate implementation.
- I could maybe modify the existing data mask in .data with dplyr:::peek_mask, but that is not publicly exposed, so I'd rather avoid that.
- I tried delaying execution with rlang::env_bind_active, but cannot find a solution that works in general. In particular, the combination with dplyr::pick seemed attractive, but I cannot quite make it work.
- Ultimately, I considered just walking the AST in dots to replace .foo leaf symbols with .data[[get_foo_col(x)]], but that feels hacky?
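To make that last idea concrete, here is a rough sketch of what I mean by walking the AST. The names `replace_foo` and `my_mutate_walk` are made up for illustration, and the sketch rests on some assumptions: it only rewrites the top-level expression of each captured quosure, it does not descend into nested quosures or into default arguments of inline function definitions, and it would also rewrite `.foo` if it appeared in function-call position.

```r
# Helpers repeated from above so the snippet is self-contained.
set_foo_col <- \(x, col) `attr<-`(x, "foo", col)
get_foo_col <- \(x) attr(x, "foo")

# Hypothetical helper: recursively replace the symbol `.foo` with
# `.data[["<col>"]]` inside an expression.
replace_foo <- function(expr, col) {
  if (rlang::is_symbol(expr, ".foo")) {
    return(rlang::call2("[[", rlang::sym(".data"), col))
  }
  if (rlang::is_call(expr)) {
    for (i in seq_along(expr)) {
      # Skip empty arguments such as the first index in `x[, 1]`.
      if (!identical(expr[[i]], quote(expr = ))) {
        expr[[i]] <- replace_foo(expr[[i]], col)
      }
    }
  }
  expr
}

# Hypothetical wrapper: rewrite each captured expression, keep its
# original environment, then hand everything to dplyr::mutate().
my_mutate_walk <- function(.data, ...) {
  col <- get_foo_col(.data)
  dots <- lapply(rlang::enquos(...), function(q) {
    rlang::quo_set_expr(q, replace_foo(rlang::quo_get_expr(q), col))
  })
  dplyr::mutate(.data, !!!dots)
}

data <- set_foo_col(tibble::tibble(bar = 1:10), "bar")
out <- my_mutate_walk(data, baz = .foo * 2L)
```

Because the rewritten expressions still go through dplyr::mutate and the .data pronoun, I would expect this to behave like the verbose form on grouped data too, but I have not checked that carefully.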
Any pointers on how to approach something like this?