Premature evaluation of !!! inside mutate

I have the following code which throws an error (this is not a reprex, but I hope it is clear what is going on anyway since the question is more conceptual).

test_df <- test_reg %>%
  mutate(
    vars = map2(data, lm_obj, function(org_df, reg_obj) {
      org_df %>% 
        mutate(
          var_combinations = interaction(!!!syms(all.vars(reg_obj[["terms"]])[-1]), sep = "_", drop = TRUE)
        ) 
    })
  )

Error in all.vars(reg_obj[["terms"]]) : object 'reg_obj' not found

despite the fact that reg_obj is clearly defined in the function.

However, if I define a function with !!! and call that, the code runs fine.

add_interaction_var <- function(df, vars) {
  df %>% 
    mutate(
      var_combinations = interaction(!!!vars, sep = "_", drop = TRUE)
    )
}

test_df <- test_reg %>%
  mutate(
    vars = map2(data, lm_obj, function(org_df, reg_obj) {
      xvars <- syms(all.vars(reg_obj[["terms"]])[-1])
      add_interaction_var(org_df, xvars)
    })
  )

This seems a bit crazy, so I was wondering if there is a way to prevent !!! from being evaluated until the code actually runs, or some other way to solve this that does not involve defining a new function.

Edit: I should probably also mention that the function has to be defined outside, otherwise you end up with the same error.

Could you dput() a small part of test_reg and share it so I can run your examples?

Thank you. I would be happy to, but even with a 0.05% sample (148 observations) test_reg ends up being almost 5,000 lines long when using dput. If that is okay, I can post that.

Can you sample your data and fit new, simpler regressions?
Could you provide some code that builds a test_reg object by fitting a model on the iris dataset or some other built-in dataset?

That makes more sense. I will get that done later today.

That was not as bad as expected. Here is the reprex. I think that it might be a problem with map2 rather than the fact that I am using regression models.

# showing problem with premature evaluation of !!! using mtcars
suppressMessages(library(tidyverse))
test_reg <- mtcars %>% 
  group_by(am) %>% 
  nest() %>% 
  mutate(
    lm_obj = map(data, ~ lm(mpg ~ factor(gear) + factor(carb), data = .))
  )

# Does not work
test_df <- test_reg %>% 
  mutate(
    vars = map2(data, lm_obj, function(org_df, reg_obj) {
      org_df %>% 
        mutate(
          var_combinations = interaction(!!!syms(all.vars(reg_obj[["terms"]])[-1]), sep = "_", drop = TRUE)
        ) 
    })
  )
#> Error in all.vars(reg_obj[["terms"]]): object 'reg_obj' not found

# Works
add_interaction_var <- function(df, vars) {
  df %>%
    mutate(
      var_combinations = interaction(!!!vars, sep = "_", drop = TRUE)
    )
}

test_df <- test_reg %>%
  mutate(
    vars = map2(data, lm_obj, function(org_df, reg_obj) {
      xvars <- syms(all.vars(reg_obj[["terms"]])[-1])
      add_interaction_var(org_df, xvars)
    })
  )

Created on 2020-05-28 by the reprex package (v0.3.0)

I think the first attempt can be repaired by hoisting the definition of the previously anonymous function out to an earlier step:

# showing problem with premature evaluation of !!! using mtcars
suppressMessages(library(tidyverse))
test_reg <- mtcars %>% 
  group_by(am) %>% 
  nest() %>% 
  mutate(
    lm_obj = map(data, ~ lm(mpg ~ factor(gear) + factor(carb), data = .))
  )

tryfunc <- function(org_df, reg_obj) {
  org_df %>% 
    mutate(
      var_combinations = interaction(
        !!!syms(all.vars(reg_obj[["terms"]])[-1]),
        sep = "_", drop = TRUE
      )
    )
}

test_df <- test_reg %>% 
  mutate(
    vars = map2(data, lm_obj, ~ tryfunc(.x, .y))
  )

Yes, that is why the second approach I showed had the function add_interaction_var defined outside. However, it seems weird to have to define a function to get map2 to perform a simple operation. Is there another way to get mutate to hold off on evaluating !!! until there is actually something there to evaluate?

Your second version still uses an anonymous function rather than a named one as in mine; it's just that your anonymous function calls your named function, so it is again more complex than mine. I also don't think it's unreasonable that a function should exist in the environment before map2 uses it to iterate extensively. Maybe there are ways to quote and then evaluate, but the resulting code would surely be more convoluted than the example with the named function.
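For what it's worth, one way to keep the anonymous function is to sidestep !!! altogether. The sketch below reuses the mtcars reprex from earlier in the thread and relies on base R's interaction() accepting a single list (here, a data frame subset) of factors in place of separate arguments. It is only one possible workaround, and it assumes that adding the column by assignment instead of an inner mutate() is acceptable.

```r
suppressMessages(library(tidyverse))

test_reg <- mtcars %>%
  group_by(am) %>%
  nest() %>%
  mutate(
    lm_obj = map(data, ~ lm(mpg ~ factor(gear) + factor(carb), data = .))
  )

test_df <- test_reg %>%
  mutate(
    vars = map2(data, lm_obj, function(org_df, reg_obj) {
      # Predictor names as plain strings: no symbols, so nothing to unquote
      xvars <- all.vars(reg_obj[["terms"]])[-1]
      # interaction() treats a single list argument as the set of factors
      org_df$var_combinations <- interaction(org_df[xvars], sep = "_", drop = TRUE)
      org_df
    })
  )
```

Because the function body contains no !!!, the outer mutate() has nothing to evaluate ahead of time, so reg_obj is only looked up when map2() actually calls the function.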

Very true, and thank you for the help. I was, however, not so much thinking about the complexity of having functions as about whether the evaluation order/timing was intended behavior. I found this issue, which raises a similar question but has no real answer as to whether it is intended: https://github.com/tidyverse/purrr/issues/541
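On the timing itself, the behavior can be reproduced without mutate or map2. The sketch below uses rlang::expr(), which is assumed here to process !! and !!! the same way mutate() does when it captures its arguments: unquoting happens over the whole captured expression, including the body of a function literal, before that function ever runs. It illustrates the mechanism only and says nothing about whether the behavior is intended.

```r
library(rlang)

x <- "outer"

# !!x is resolved while the expression is captured, so the function
# body ends up holding the value of the outer x, not the argument x
e <- expr(function(x) !!x)
f <- eval(e)
result <- f("inner")
print(result)

# The same mechanism with an argument name that has no outer binding:
# unquoting is attempted at capture time, when reg_obj does not exist yet
msg <- tryCatch(
  expr(function(reg_obj) interaction(!!!syms(reg_obj))),
  error = function(err) conditionMessage(err)
)
print(msg)
```

This would also explain why hoisting the function out of the mutate() call works: the body is then captured by an ordinary function definition, and !!! is only processed when the inner mutate() runs with reg_obj already bound.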