How can I use dplyr group_by in a function?

This is an example.

library('tidyverse')

org_dat = tibble( dat = sample( LETTERS[1:4], 100, replace = TRUE ) ,
                  num = sample( 1:100 , replace = TRUE ) )

subsetting = function( data, col ,var ){
  return( data %>%
            filter( .[col] == var ) %>%
            group_by( .[col] ) %>%
            summarise( SUM = sum( num ) ) )
}

subsetting(data = org_dat, col = 'dat' , var = 'A' )
Error: Column .[col] is of unsupported class data.frame

How can I set the group by value to get the following result?

(image: desired result)

Hi @choi,

Try this:

org_dat = tibble( dat = sample( LETTERS[1:4], 100, replace = TRUE ) ,
                  num = sample( 1:100 , replace = TRUE ) )
  
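subsetting <- function(data, col, var) {
  data %>%
    # {{ col }} forwards the unquoted column name to dplyr
    filter({{ col }} == var) %>%
    group_by({{ col }}) %>%
    # named SUM, as in the question, because this is a sum and not a mean
    summarise(SUM = sum(num))
}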

subsetting(data = org_dat, col = dat, var = 'A')

Working with dplyr functions inside your own functions can be tricky because of tidy evaluation. The "Programming with dplyr" vignette covers it in more detail. dplyr functions expect unquoted variable names (literally without quotation marks), so if you want your function to work with dplyr functions, you need to pass it unquoted column names, as in col = dat above. Inside the function, wrap the argument in the special "embrace" operator {{ }}, which does the quoting/unquoting magic for you.
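If you would rather keep passing the column name as a string, as in the original call with col = 'dat', a minimal sketch could look like the one below. It assumes dplyr 1.0.0 or later. The names subsetting_chr and toy are made up for illustration and are not part of the answer above. It uses the .data pronoun to look the column up by name in filter() and across(all_of()) to group by it.

```r
library(dplyr)

# Hypothetical variant of subsetting() that takes the column name as a string.
subsetting_chr <- function(data, col, var) {
  data %>%
    # .data[[col]] looks up the column whose name is stored in `col`
    filter(.data[[col]] == var) %>%
    # across(all_of(col)) groups by that column and keeps its original name
    group_by(across(all_of(col))) %>%
    summarise(SUM = sum(num))
}

# Small made-up data set so the result is easy to check by hand
toy <- tibble(dat = c("A", "A", "B"), num = c(1, 2, 10))

subsetting_chr(data = toy, col = "dat", var = "A")
```

With this version the call looks like the one in the question (col = 'dat'), while the embrace version expects col = dat without quotes.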


Thank you!

