The message about summarize() grouping the output is emitted by summarize() itself, so it describes the state of the data as the code leaves summarize(), before the subsequent ungroup() runs. Your tibble named data_group is ungrouped in the next step, but summarize() has no way of knowing that. If you habitually end such blocks of code with ungroup(), you could instead set the .groups argument of summarize() to "drop", which also silences the message. Here is a demonstration of the default behavior when another summarize() follows, and then an example that uses ungroup().
library(tidyverse)
data <- tibble(gr1 = rep(LETTERS[1:4], each = 3),
               gr2 = rep(letters[1:2], times = 6),
               values = 101:112)
data_group <- data |>
  group_by(gr1, gr2) |>
  summarise(gr_sum = sum(values)) # eliminate the ungroup() function
#> `summarise()` has grouped output by 'gr1'. You can override using the `.groups`
#> argument.
data_group
#> # A tibble: 8 × 3
#> # Groups: gr1 [4]
#> gr1 gr2 gr_sum
#> <chr> <chr> <int>
#> 1 A a 204
#> 2 A b 102
#> 3 B a 105
#> 4 B b 210
#> 5 C a 216
#> 6 C b 108
#> 7 D a 111
#> 8 D b 222
data_group |> summarize(Total = sum(gr_sum)) #data_group is grouped by gr1
#> # A tibble: 4 × 2
#> gr1 Total
#> <chr> <int>
#> 1 A 306
#> 2 B 315
#> 3 C 324
#> 4 D 333
data_group2 <- data |>
  group_by(gr1, gr2) |>
  summarise(gr_sum = sum(values)) |>
  ungroup()
#> `summarise()` has grouped output by 'gr1'. You can override using the `.groups`
#> argument.
data_group2 |> summarize(Total = sum(gr_sum)) #data_group2 has no groups
#> # A tibble: 1 × 1
#> Total
#> <int>
#> 1 1278
Created on 2023-10-03 with reprex v2.0.2
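For completeness, here is a minimal sketch of the .groups = "drop" variant mentioned above, reusing the same data. The names data_group3 and total are mine, and I load only dplyr rather than the whole tidyverse. The .groups argument needs dplyr 1.0.0 or later.

```r
library(dplyr)

# Same example data as in the reprex above
data <- tibble(gr1 = rep(LETTERS[1:4], each = 3),
               gr2 = rep(letters[1:2], times = 6),
               values = 101:112)

# .groups = "drop" returns an ungrouped tibble and suppresses the message,
# so no trailing ungroup() is needed
data_group3 <- data |>
  group_by(gr1, gr2) |>
  summarise(gr_sum = sum(values), .groups = "drop")

# FALSE when no grouping is left on the result
is_grouped_df(data_group3)

# With no groups, summarize() collapses everything into a single row
total <- data_group3 |> summarize(Total = sum(gr_sum))
total
```

The result should match data_group2 from the reprex: eight rows, no groups, and a single grand total when summarized again.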
I don't know why the default is "drop_last". My reflex would be to make "drop" the default but people way smarter than me decided otherwise.