Here is a way to do this using dplyr and purrr that utilizes group_by, summarize_at, and purrr::partial.
Putting aside the purrr::partial portion for now, I had to change your my_means function so that it works with the group_by/summarize workflow instead of split/map. To see my thoughts on the differences you can see this thread. The function now takes a vector rather than a dataframe and returns only the mean, which meets the requirement that a function passed to summarize return a single value.
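As a quick illustration of the vector-in, single-value-out shape described above, here is a minimal sketch. The function body matches the reprex further down; the vector example_values is made up purely for this demonstration.

```r
# Takes a numeric vector and a cutoff, returns one number:
# the mean of the values strictly below the cutoff.
my_mean <- function(x, less_than) {
  mean(x[x < less_than])
}

# Hypothetical data, not from the original question
example_values <- c(10, 20, 30, 40)

# Only 10, 20 and 30 are below 35, so the result is their mean
result <- my_mean(example_values, less_than = 35)
result
#> [1] 20

# A length-one result is what summarize() expects from each function
length(result)
#> [1] 1
```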
So, now the fun part. purrr::partial lets you pre-fill some of a function's arguments and returns a new function with those values fixed. If you call partial inside of a map call, you get one preset function per input value, conveniently collected in a list. Now the tricky part: how do we run a list of functions on a specific subset of columns of our dataframe? Luckily, with rlang (here using functions reexported by dplyr) we can splice our function list into funs() inside summarize_at with !!!. The output has one column per function in the list and one row per group.
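To make the partial-inside-map idea concrete, here is a small sketch separate from the dplyr pipeline. The names cutoffs, mean_fns, and x are illustrative only, and x is made-up data.

```r
library(purrr)

my_mean <- function(x, less_than) {
  mean(x[x < less_than])
}

# map() builds one partially applied function per cutoff and
# returns them together in a list.
cutoffs <- c(25, 30, 35)
mean_fns <- map(cutoffs, ~ partial(my_mean, less_than = .x))

# Each list element now only needs the data vector.
x <- c(20, 28, 33, 40)  # hypothetical values

mean_fns[[1]](x)  # cutoff 25: only 20 qualifies
#> [1] 20
mean_fns[[3]](x)  # cutoff 35: mean of 20, 28, 33
#> [1] 27
```

Each element behaves like my_mean with less_than already supplied, which is why the list can be handed to summarize_at as a set of single-argument summary functions.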
One other important thing to note is that if you want to call the list of functions from summarize_at as shown, the list has to be named. That is the reason for building the names dynamically and applying them with purrr::set_names.
Here is the reprex:
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#>     filter, lag
#> The following objects are masked from 'package:base':
#>
#>     intersect, setdiff, setequal, union
library(purrr)
my_mean <- function(x, less_than) {
  # Mean of only the values below the cutoff
  mean(x[x < less_than])
}
my_cutoffs <- c(25, 30, 35)
# One name per cutoff; map_chr() returns a character vector for set_names()
my_means_names <- purrr::map_chr(my_cutoffs, ~paste0("mean_lt_", .x))
# One partially applied function per cutoff, collected in a named list
my_partial_mean <- purrr::map(my_cutoffs, ~purrr::partial(my_mean, less_than = .x)) %>%
  purrr::set_names(nm = my_means_names)
mtcars %>%
  group_by(cyl) %>%
  summarize_at(vars(mpg), funs(!!!my_partial_mean))
#> # A tibble: 3 x 4
#>     cyl mean_lt_25 mean_lt_30 mean_lt_35
#>   <dbl>      <dbl>      <dbl>      <dbl>
#> 1     4       22.6       23.7       26.7
#> 2     6       19.7       19.7       19.7
#> 3     8       15.1       15.1       15.1
Created on 2018-10-19 by the reprex package (v0.2.0).
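A note for anyone reading this on a newer dplyr: funs() was deprecated in dplyr 0.8.0, and summarize_at was superseded by across() in dplyr 1.0.0. The following is a sketch of how the same idea might look with across(), assuming dplyr 1.0.0 or later. It is not part of the original reprex, and the result object name is only for illustration.

```r
library(dplyr)
library(purrr)

my_mean <- function(x, less_than) {
  mean(x[x < less_than])
}

my_cutoffs <- c(25, 30, 35)

# Named list of partially applied functions, one per cutoff
my_partial_mean <- my_cutoffs %>%
  map(~ partial(my_mean, less_than = .x)) %>%
  set_names(paste0("mean_lt_", my_cutoffs))

# across() accepts a named list of functions directly, so no !!! is needed.
# .names = "{.fn}" uses just the list names as the output column names.
result <- mtcars %>%
  group_by(cyl) %>%
  summarize(across(mpg, my_partial_mean, .names = "{.fn}"))

result
```

This should produce the same four columns as the summarize_at version above (cyl, mean_lt_25, mean_lt_30, mean_lt_35), with one row per cyl group.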
I recently wrote a blog post using this same workflow to calculate multiple quantiles for different groups with dplyr.