Hello,
I love purrr and use it more or less daily. But there's one thing which keeps bugging me, and I was wondering whether someone can help clarify the following for me.
My question is about how to refer to function arguments in .f
The documentation states that .f is
A function, specified in one of the following ways:
- A named function, e.g. mean.
- An anonymous function, e.g. \(x) x + 1 or function(x) x + 1.
- A formula, e.g. ~ .x + 1. You must use .x to refer to the first argument. Only recommended if you require backward compatibility with older versions of R.
Below are my examples:
library(tidyverse)
my_df <- tibble(col_num=1:3)
fn_square <- function(z) {z^2}
#NAMED FUNCTION
my_df %>%
mutate(num_2=map_dbl(.x=col_num, .f=fn_square)) #named function
#> # A tibble: 3 × 2
#> col_num num_2
#> <int> <dbl>
#> 1 1 1
#> 2 2 4
#> 3 3 9
#ANONYMOUS FUNCTION
my_df %>%
mutate(num_2=map_dbl(.x=col_num, .f=\(x) x^2))
#> # A tibble: 3 × 2
#> col_num num_2
#> <int> <dbl>
#> 1 1 1
#> 2 2 4
#> 3 3 9
#FORMULA
my_df %>%
mutate(num_2=map_dbl(.x=col_num, .f=~fn_square(z=.x)))
#> # A tibble: 3 × 2
#> col_num num_2
#> <int> <dbl>
#> 1 1 1
#> 2 2 4
#> 3 3 9
What I love about the formula approach is that it makes the relation between the input and where it is fed into the function very clear: .x goes into the function argument z. The other two variants don't have this; .x is never mentioned in what is passed as .f, which I always find somewhat obscure. The issue is still negligible with a simple map function, but becomes more pertinent with map2 or pmap functions.
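To illustrate what I mean with map2, here is a minimal sketch. Note that fn_power and its argument names base and exponent are hypothetical, made up purely for this illustration. With the formula, the routing of .x and .y is spelled out; with the named function, the reader has to know that the inputs are matched by position.

```r
library(purrr)

# fn_power is a hypothetical two-argument helper, made up for this illustration
fn_power <- function(base, exponent) {base^exponent}

# Formula: .x and .y are visibly routed to the named arguments
res_formula <- map2_dbl(.x = 1:3, .y = c(2, 2, 3),
                        .f = ~ fn_power(base = .x, exponent = .y))

# Named function: .x and .y are matched to base and exponent by position only
res_named <- map2_dbl(.x = 1:3, .y = c(2, 2, 3), .f = fn_power)

res_formula
#> [1]  1  4 27
```

Both calls give the same result here, but only the first one shows which input ends up in which argument.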
Now my question/issue:
At least in the latest purrr release (maybe I didn't notice it earlier), the documentation now states that the formula (my preferred approach) is "Only recommended if you require backward compatibility with older versions of R." Why is this? Or, to put it differently, is there a recommended approach that lets me clearly state which input goes into which function argument?
I would have hoped that the approach below works, but unfortunately - and surprisingly - no.
my_df %>%
mutate(num_2=map_dbl(.x=col_num, .f=fn_square(z=.x)))
#> Error in `mutate()`:
#> ℹ In argument: `num_2 = map_dbl(.x = col_num, .f = fn_square(z = .x))`.
#> Caused by error in `fn_square()`:
#> ! object '.x' not found
#> Backtrace:
#> ▆
#> 1. ├─my_df %>% ...
#> 2. ├─dplyr::mutate(., num_2 = map_dbl(.x = col_num, .f = fn_square(z = .x)))
#> 3. ├─dplyr:::mutate.data.frame(., num_2 = map_dbl(.x = col_num, .f = fn_square(z = .x)))
#> 4. │ └─dplyr:::mutate_cols(.data, dplyr_quosures(...), by)
#> 5. │ ├─base::withCallingHandlers(...)
#> 6. │ └─dplyr:::mutate_col(dots[[i]], data, mask, new_columns)
#> 7. │ └─mask$eval_all_mutate(quo)
#> 8. │ └─dplyr (local) eval()
#> 9. ├─purrr::map_dbl(.x = col_num, .f = fn_square(z = .x))
#> 10. │ └─purrr:::map_("double", .x, .f, ..., .progress = .progress)
#> 11. │ └─purrr::as_mapper(.f, ...)
#> 12. ├─global fn_square(z = .x)
#> 13. └─base::.handleSimpleError(`<fn>`, "object '.x' not found", base::quote(fn_square(z = .x)))
#> 14. └─dplyr (local) h(simpleError(msg, call))
#> 15. └─rlang::abort(message, class = error_class, parent = parent, call = error_call)
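For comparison, wrapping the same call in an anonymous function does run. My understanding (an assumption on my part, not something I found spelled out in the documentation) is that .f then receives a function, whereas a bare fn_square(z=.x) is a call that R tries to evaluate before map_dbl() ever iterates, at which point no .x exists. A minimal sketch without the mutate() wrapper:

```r
library(purrr)

fn_square <- function(z) {z^2}

# Anonymous function: x is passed explicitly to the argument z
res_anon <- map_dbl(.x = 1:3, .f = \(x) fn_square(z = x))
res_anon
#> [1] 1 4 9
```

This keeps the input-to-argument mapping explicit, although the placeholder is now called x rather than .x.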
I am sure the tidyverse team has good reason for this behavior. I personally would have found it the most accessible formulation.
I'd be grateful if anyone has an idea about this.
Many thanks