`summarise()` has grouped output by 'X'. You can override using the `.groups` argument.

The message is emitted by summarize() itself, so it describes the result at the moment summarize() returns, before the subsequent ungroup() runs. Your tibble named data_group is ungrouped in the next step, but summarize() has no way of knowing that. If you habitually end such blocks of code with ungroup(), you can instead set the .groups argument to "drop", which returns an ungrouped result and suppresses the message. Below, the first example shows how the default behavior affects a subsequent summarize() call, and the second shows the same pipeline followed by ungroup().

library(tidyverse)

data <- tibble(gr1    = rep(LETTERS[1:4], each = 3),
               gr2    = rep(letters[1:2], times = 6),
               values = 101:112)

data_group <- data |>
  group_by(gr1, gr2) |>
  summarise(gr_sum = sum(values)) # ungroup() deliberately omitted
#> `summarise()` has grouped output by 'gr1'. You can override using the `.groups`
#> argument.
data_group
#> # A tibble: 8 × 3
#> # Groups:   gr1 [4]
#>   gr1   gr2   gr_sum
#>   <chr> <chr>  <int>
#> 1 A     a        204
#> 2 A     b        102
#> 3 B     a        105
#> 4 B     b        210
#> 5 C     a        216
#> 6 C     b        108
#> 7 D     a        111
#> 8 D     b        222

data_group |> summarize(Total = sum(gr_sum)) #data_group is grouped by gr1
#> # A tibble: 4 × 2
#>   gr1   Total
#>   <chr> <int>
#> 1 A       306
#> 2 B       315
#> 3 C       324
#> 4 D       333

data_group2 <- data |>
  group_by(gr1, gr2) |>
  summarise(gr_sum = sum(values)) |>  
  ungroup()
#> `summarise()` has grouped output by 'gr1'. You can override using the `.groups`
#> argument.

data_group2 |> summarize(Total = sum(gr_sum)) #data_group2 has no groups
#> # A tibble: 1 × 1
#>   Total
#>   <int>
#> 1  1278

Created on 2023-10-03 with reprex v2.0.2
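
For completeness, here is a minimal sketch of the `.groups = "drop"` variant mentioned above, using the same toy data. The name data_group3 is just for illustration. It should behave like the ungroup() version but without the message, since the message only appears when `.groups` is left unspecified.

```r
library(dplyr)

# Same toy data as in the reprex above
data <- tibble(gr1    = rep(LETTERS[1:4], each = 3),
               gr2    = rep(letters[1:2], times = 6),
               values = 101:112)

# .groups = "drop" removes all grouping from the result,
# so no trailing ungroup() is needed and no message is printed
data_group3 <- data |>
  group_by(gr1, gr2) |>
  summarise(gr_sum = sum(values), .groups = "drop")

is_grouped_df(data_group3)
#> [1] FALSE

# With no groups left, this collapses to a single row
total <- data_group3 |> summarize(Total = sum(gr_sum))
total$Total
#> [1] 1278
```

If you are on dplyr 1.1.0 or later, per-operation grouping with `summarise(gr_sum = sum(values), .by = c(gr1, gr2))` is another option that always returns an ungrouped result, though the row order may differ from the group_by() version.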
I don't know why the default is "drop_last". My reflex would be to make "drop" the default but people way smarter than me decided otherwise.
