Suppose we have a function that returns a named data frame or list containing more than one output variable.
Question: how can I run that function against a data frame and create more than one new variable at a time, and what is the most natural way to do it with dplyr/purrr?
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(purrr)
toy_fun <- function(chr) {
  data.frame(let_rank = which(letters == chr),
             rando = rnorm(n = length(chr)))
}
# for example, given
df <- data.frame(lets = sample(letters, 10))
# what is the correct dplyr/purrr way to map the function over `lets`
# and capture all of the columns it returns?
# this, of course, fails:
df %>% mutate(toy_fun(lets), .id = names(toy_fun(lets)))
#> Error in mutate_impl(.data, dots): Evaluation error: argument "chr" is missing, with no default.
# here's one way which works, but which seems unwieldy and ugly:
bind_cols(df, map_df(df$lets, toy_fun))
#>    lets let_rank       rando
#> 1     d        4 -0.30910301
#> 2     j       10  1.54324912
#> 3     i        9 -0.57664505
#> 4     u       21  1.15671969
#> 5     b        2 -0.03828406
#> 6     f        6 -0.64202232
#> 7     l       12  0.50793796
#> 8     y       25  0.98867755
#> 9     x       24  0.02617367
#> 10    h        8 -0.75882107
# this doesn't work like I thought it would:
map_dfc(df$lets, toy_fun)
#>   let_rank     rando let_rank1   rando1 let_rank2   rando2 let_rank3
#> 1        4 0.8284374        10 1.342487         9 2.982913        21
#>     rando3 let_rank4    rando4 let_rank5    rando5 let_rank6   rando6
#> 1 0.568951         2 0.7849361         6 -0.180958        12 2.808249
#>   let_rank7    rando7 let_rank8       rando8 let_rank9    rando9
#> 1        25 0.4196204        24 -0.008707815         8 -0.5353852
Created on 2018-11-22 by the reprex package (v0.2.1)
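For comparison, here is a rough sketch of two other shapes this could take. It is not part of the reprex above and rests on assumptions: the first variant needs tidyr (not loaded in the original), the second relies on dplyr >= 1.0.0, where an unnamed data frame returned inside mutate() is unpacked into columns, and the names res_unnest and res_rowwise are only illustrative. As far as I can tell, map_dfc() gave one wide row because it column-binds each one-row result, whereas what is wanted is row-binding.

```r
library(dplyr)
library(purrr)

toy_fun <- function(chr) {
  data.frame(let_rank = which(letters == chr),
             rando = rnorm(n = length(chr)))
}

# stringsAsFactors = FALSE keeps `lets` a character column on any R version
df <- data.frame(lets = sample(letters, 10), stringsAsFactors = FALSE)

# Variant 1 (assumes tidyr is installed): store each one-row result in a
# list-column, then unnest it into regular columns
res_unnest <- df %>%
  mutate(out = map(lets, toy_fun)) %>%
  tidyr::unnest(out)

# Variant 2 (assumes dplyr >= 1.0.0): call the function once per row and
# let mutate() unpack the unnamed data frame it returns
res_rowwise <- df %>%
  rowwise() %>%
  mutate(toy_fun(lets)) %>%
  ungroup()
```

Both variants should give one row per letter with let_rank and rando added next to lets. The rando values will differ between them because rnorm() is called separately each time.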