First, I don't think this works: if you pass "Column A" as a string, it will be interpreted as a single constant group called "Column A" instead of the column of the data frame named `Column A`. Here is a reprex that I believe matches what you have:
library(tidyverse)
set.seed(1)
data <- list(
  tibble(`Column A` = letters[1:2] |> rep(3) |> rep(each = 2),
         `Column B` = LETTERS[1:3] |> rep(each = 4),
         Vol = rnorm(12)),
  tibble(`Column A` = letters[1:2] |> rep(3) |> rep(each = 2),
         `Column B` = LETTERS[1:3] |> rep(each = 4),
         Vol = rnorm(12))
)
# Currently adjusting group by parameters manually in below sample
vol_fn <- function(df){
  df <- df %>%
    group_by(`Column A`, `Column B`) %>%
    summarize(Vol = sum(Vol))
  return(df)
}
vol1 <- map(data, ~vol_fn(.))
#> `summarise()` has grouped output by 'Column A'. You can override using the
#> `.groups` argument.
#> `summarise()` has grouped output by 'Column A'. You can override using the
#> `.groups` argument.
vol1
#> [[1]]
#> # A tibble: 6 × 3
#> # Groups: Column A [2]
#> `Column A` `Column B` Vol
#> <chr> <chr> <dbl>
#> 1 a A -0.443
#> 2 a B -0.491
#> 3 a C 0.270
#> 4 b A 0.760
#> 5 b B 1.23
#> 6 b C 1.90
#>
#> [[2]]
#> # A tibble: 6 × 3
#> # Groups: Column A [2]
#> `Column A` `Column B` Vol
#> <chr> <chr> <dbl>
#> 1 a A -2.84
#> 2 a B 0.928
#> 3 a C 1.70
#> 4 b A 1.08
#> 5 b B 1.42
#> 6 b C -1.91
Created on 2024-03-08 with reprex v2.0.2
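To make the point about the quoted name concrete, here is a minimal sketch on a small made-up data frame (`df`, `by_string` and `by_column` are hypothetical names, not part of your code). It compares grouping by the string "Column A" with grouping by the backticked column:

```r
library(dplyr)

# Hypothetical toy data, not the data from the question
df <- tibble(`Column A` = c("a", "a", "b", "b"), Vol = c(1, 2, 3, 4))

# A string is treated as a constant, so every row falls into one group
by_string <- df %>%
  group_by("Column A") %>%
  summarize(Vol = sum(Vol))

# Backticks refer to the actual column, so there is one row per value
by_column <- df %>%
  group_by(`Column A`) %>%
  summarize(Vol = sum(Vol))

nrow(by_string)
nrow(by_column)
```

The first result has a single row holding the overall sum, while the second has one row each for a and b.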
Now, on to your question: have a look at the "Programming with dplyr" vignette. group_by() uses data masking, and inside your function x is an env-variable that refers to a data-variable, so you need to embrace it with {{ }}:
vol_fn <- function(df, x){
  # {{x}} forwards the unquoted column supplied as x to group_by()
  df <- df %>%
    group_by({{x}}) %>%
    summarize(Vol = sum(Vol))
  return(df)
}
vol2 <- map(data, ~vol_fn(., `Column A`))
vol2
Or, to stick with quoted variables (column names passed as strings), use the .data pronoun:
vol_fn <- function(df, x){
  # x is a string here; .data[[x]] looks up the column with that name
  df <- df %>%
    group_by(.data[[x]]) %>%
    summarize(Vol = sum(Vol))
  return(df)
}
vol2 <- map(data, ~vol_fn(., "Column A"))
Finally, this works fine for one column, but you will run into a problem with two columns, for example if you want to call map(data, ~vol_fn(., `Column A`, `Column B`)). In that case you can pass the grouping columns through with ...:
vol_fn <- function(df, ...){
  # ... forwards any number of unquoted columns to group_by()
  df <- df %>%
    group_by(...) %>%
    summarize(Vol = sum(Vol))
  return(df)
}
vol2 <- map(data, ~vol_fn(., `Column A`, `Column B`))
vol2
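If you would rather keep passing strings and still group by several columns, one option is across() with all_of(). This is only a sketch: `vol_fn_chr`, `cols`, `toy` and `res` are hypothetical names, and the `.groups = "drop"` argument is my own addition to silence the summarise() message and return an ungrouped tibble.

```r
library(dplyr)

# Hypothetical variant of vol_fn: `cols` is a character vector of column names
vol_fn_chr <- function(df, cols) {
  df %>%
    group_by(across(all_of(cols))) %>%
    summarize(Vol = sum(Vol), .groups = "drop")
}

# Made-up data for illustration only
toy <- tibble(
  `Column A` = c("a", "a", "b", "b"),
  `Column B` = c("A", "B", "A", "A"),
  Vol = c(1, 2, 3, 4)
)

res <- vol_fn_chr(toy, c("Column A", "Column B"))
res
```

With your list of data frames, the call would presumably look like map(data, ~vol_fn_chr(., c("Column A", "Column B"))).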