Simplest way to modify the same column in multiple dataframes in a list

tbradley · August 23, 2018, 3:13pm

Yes, as far as I understand it, your interpretation of that bit of code is exactly right. The ~{ .x } notation is handy when you want to write a more complicated anonymous function without pulling it out into its own named function. With map2(), you use .x for the first argument and .y for the second.
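To make the .x / .y point concrete, here is a minimal sketch with map2(). The objects dfs and cols are made up for illustration (they are not from your data): each data frame keeps its ID in a differently named column, and the second argument says which column to convert.

```r
library(purrr)

# Hypothetical data: two small data frames whose ID column has a different name
dfs <- list(
  a = data.frame(id  = c("1", "2"), stringsAsFactors = FALSE),
  b = data.frame(key = c("3", "4"), stringsAsFactors = FALSE)
)
cols <- c("id", "key")

# .x is the current data frame, .y is the matching column name
converted <- map2(dfs, cols, ~{
  .x[[.y]] <- as.integer(.x[[.y]])
  .x
})

str(converted)
```

The braces let the lambda span several statements; the last expression (.x) is what each call returns.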

You could do it with tidyeval like this:

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

library(purrr)

# generate sample data
n <- 100
df1 <- data.frame(x = runif(n), y = rnorm(n), m = seq_len(n))
df2 <- data.frame(x = runif(n), y = rnorm(n), m = as.character(seq_len(n)))
df3 <- data.frame(x = runif(n), y = rnorm(n), m = as.factor(seq_len(n)))
list_of_dataframes <- list(df1 = df1, df2 = df2, df3 = df3)

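my_func <- function(data, my_col){
  # capture the bare column name supplied by the caller
  my_col <- enexpr(my_col)
  
  # unquote it on both sides of := to overwrite that column in place,
  # and return the modified data frame visibly
  data %>% 
    mutate(!!my_col := as.integer(!!my_col))
}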

new_df <- map_dfr(list_of_dataframes, ~my_func(.x, m))

as_tibble(new_df)
#> # A tibble: 300 x 3
#>         x      y     m
#>     <dbl>  <dbl> <int>
#>  1 0.0728  0.652     1
#>  2 0.0534 -1.26      2
#>  3 0.735  -1.05      3
#>  4 0.305  -0.245     4
#>  5 0.746  -0.362     5
#>  6 0.101  -0.615     6
#>  7 0.0868  0.255     7
#>  8 0.865  -0.523     8
#>  9 0.818  -1.29      9
#> 10 0.0190 -1.28     10
#> # ... with 290 more rows

check_df <- map_dfr(list_of_dataframes, ~{
  .x %>% 
    mutate(m = as.integer(m))
})

identical(new_df, check_df)
#> [1] TRUE

Created on 2018-08-23 by the reprex package (v0.2.0).
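
One caveat about the as.integer() call used above: on a factor it returns the internal level codes, not the numbers the labels spell out. In the sample data that makes no difference for df3, because its levels happen to be in numeric order, but it can matter for real data. A small base R sketch, using a made-up factor f:

```r
# Hypothetical factor whose labels are numbers stored as text
f <- factor(c("10", "2", "30"))

# as.integer() on a factor gives the level codes
codes <- as.integer(f)

# going through as.character() first recovers the labelled numbers
values <- as.integer(as.character(f))

codes
values
```

If the column might be a factor, converting with as.integer(as.character(m)) inside mutate() is probably the safer choice.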