Is there a more concise version of the 'map() inside mutate()' pattern?

I find myself typing the following pattern very frequently when working with listcols:

my_tbl %>%
  mutate(listcol = map(listcol, some_function)) %>%
  mutate(listcol = map(listcol, some_other_function)) %>%
  mutate(listcol = map(listcol, ~ a_bit_different(.x)))

The mutate(listcol = map(listcol, part of this pattern is a lot of typing. It would be great if there were a more concise way to perform these map-in-place mutations – something like:

my_tbl %>%
  mutmap(listcol, some_function) %>%
  mutmap(listcol, ~ some_other_function(.x))

Sometimes it's possible to work around this by unnest()ing whatever is in listcol, but a) listcol sometimes contains something other than a tibble/data frame, b) that requires extra lines of code to nest and unnest and c) clashing column names or different table dimensions in the values of listcol can make this a pain.

Is there a function like this in purrr or elsewhere? Thanks!

There is a new function in the dev version of tidyr called hoist that can be used for rectangling, but its use case is extracting elements from a list, not applying an arbitrary function to each element.

As far as I know, there isn't such a function in the tidyverse, but a) you can write a package for that or b) in your specific example, you can use purrr::compose and have only one mutate that applies all of the functions in one go (however, it's likely that your use case is more general than this example).
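To make option b) concrete, here is a minimal sketch of the purrr::compose approach. The table and the three function bodies are hypothetical stand-ins for the ones in the question, and .dir = "forward" assumes purrr 0.3.0 or later:

```r
library(dplyr)
library(purrr)

# Hypothetical example data: a list column of numeric vectors
my_tbl <- tibble(id = 1:2, listcol = list(c(3, 1, 2), c(6, 5, 4)))

# Hypothetical stand-ins for the functions named in the question
some_function       <- function(x) sort(x)
some_other_function <- function(x) x * 2
a_bit_different     <- function(x) head(x, 2)

# .dir = "forward" applies the functions left to right;
# compose() defaults to right to left
pipeline <- compose(
  some_function,
  some_other_function,
  a_bit_different,
  .dir = "forward"
)

# A single mutate()/map() pair replaces the three separate calls
result <- my_tbl %>%
  mutate(listcol = map(listcol, pipeline))
```

The composed function can be stored and reused, which helps when the same chain is applied to several list columns.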


Here are two ways you can reduce typing. They differ by the "cleanliness" of code:

  1. Stack all the transformations in a single mutate() call, which is a fairly common approach; they are applied consecutively, in the order written:
my_tbl %>%
 mutate(
   listcol = map(listcol, some_function),
   listcol = map(listcol, some_other_function),
   listcol = map(listcol, ~ a_bit_different(.x))
 )
  2. Use two capabilities of the pipe %>%: starting a chain with . creates a function, and wrapping an expression in {} turns it into a lambda expression (. serves as the argument). For more information see this help page.
my_tbl %>% 
  mutate(
    listcol = map(
      .x = listcol,
      .f = . %>% some_function() %>%
        some_other_function() %>%
        {a_bit_different(.)}
    )
  )
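As a small standalone illustration of those two pipe features, the sketch below builds functions from chains that start with . and then passes one to map(). The names root_of_sum and sum_plus_one are made up for this example:

```r
library(magrittr)
library(purrr)

# A chain that starts with `.` is not evaluated; it returns a function
root_of_sum <- . %>% sum() %>% sqrt()

# Braces make the right-hand side a lambda, with `.` as its argument
sum_plus_one <- . %>% sum() %>% { . + 1 }

root_of_sum(c(9, 16))   # sqrt(9 + 16)
sum_plus_one(1:3)       # (1 + 2 + 3) + 1

# Such a function can be passed to map() like any other
roots <- map(list(c(9, 16), c(1, 3)), root_of_sum)
```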

You can write your own using a little rlang. Using rlang 0.4's new {{ }} operator for interpolation (i.e. enquo and !! in one):

library(tidyverse)

mutmap <- function(.data, .col, .f, ...){
    dplyr::mutate(.data, {{.col}} := purrr::map({{.col}}, .f, ...))
}

mtcars %>% 
    nest(-cyl) %>% print() %>%
    mutmap(data, ~summarise_all(.x, mean)) %>% print() %>% 
    unnest()
#> # A tibble: 3 x 2
#>     cyl data              
#>   <dbl> <list>            
#> 1     6 <tibble [7 × 10]> 
#> 2     4 <tibble [11 × 10]>
#> 3     8 <tibble [14 × 10]>
#> # A tibble: 3 x 2
#>     cyl data             
#>   <dbl> <list>           
#> 1     6 <tibble [1 × 10]>
#> 2     4 <tibble [1 × 10]>
#> 3     8 <tibble [1 × 10]>
#> # A tibble: 3 x 11
#>     cyl   mpg  disp    hp  drat    wt  qsec    vs    am  gear  carb
#>   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1     6  19.7  183. 122.   3.59  3.12  18.0 0.571 0.429  3.86  3.43
#> 2     4  26.7  105.  82.6  4.07  2.29  19.1 0.909 0.727  4.09  1.55
#> 3     8  15.1  353. 209.   3.23  4.00  16.8 0     0.143  3.29  3.5

That said, I think there's something to be said for at least a little verbosity for this sort of code—it makes it much easier to read later. In particular, while sometimes you'll want to overwrite the old column, sometimes you'll want to add a new column, and seeing some assignment in there (albeit with =) makes the flow of data much easier to understand.
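If the helper is still wanted, one way to cover the "add a new column" case is an optional output-column argument. The following is only a sketch: the .into argument is an invention for this example, not part of any package, and it has to be passed by name because it comes after the dots:

```r
library(dplyr)
library(purrr)
library(rlang)

# Variant of mutmap() with a hypothetical `.into` argument.
# When `.into` is omitted, the input column is overwritten.
mutmap <- function(.data, .col, .f, ..., .into = NULL) {
  into <- enquo(.into)
  if (quo_is_null(into)) {
    into <- enquo(.col)
  }
  out_name <- as_name(into)
  mutate(.data, !!out_name := map({{ .col }}, .f, ...))
}

# Hypothetical example data
tbl <- tibble(id = 1:2, x = list(1:3, 4:6))

overwritten <- tbl %>% mutmap(x, sum)                 # replaces x
added       <- tbl %>% mutmap(x, sum, .into = total)  # keeps x, adds total
```

Naming the output at the call site keeps some of the readability that the explicit assignment in mutate() provides.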
