First two columns don't exist after group_by?

cgra · February 11, 2026, 6:17pm

I am new to R, currently learning dplyr and the tidyverse, and am not sure how to resolve the following issue. If I run this code:

pacman::p_load(tidyverse, nycflights13)
data(weather)

rain_daily <- weather %>%
  group_by(origin, year, month, day) %>% # drop hour as unit of observation
  summarize(sum_precip_day = sum(precip)) %>% # sum hourly precip into daily
  group_by(year, month, day) %>% # drop airport as unit of observation
  summarize(median_precip = median(sum_precip_day)) %>% # median daily precip in NYC area
  mutate(rain_any = ifelse(median_precip > 0, 1, 0)) %>% # bin. yes/no rain on a day
  mutate(rain_cat = case_when( # cat. no-heavy rain
    0.00 <= median_precip & median_precip < 0.01 ~ 0,
    0.01 <= median_precip & median_precip < 0.10 ~ 1,
    0.10 <= median_precip & median_precip < 0.25 ~ 2,
    0.25 <= median_precip & median_precip < 0.50 ~ 3,
    0.50 <= median_precip & median_precip < 0.75 ~ 4,
    0.75 <= median_precip & median_precip < 1.00 ~ 5,
    1.00 <= median_precip & median_precip        ~ 6
  )) %>%
  mutate(across(c(1, 2, 3, 6), factor)) # turn year, month, day, rain_cat into factor vars

I get the following error:
Error in mutate():
In argument: across(c(1, 2, 3, 6), factor).
Caused by error in across():
! Can't select columns past the end.
Location 6 doesn't exist.
There are only 4 columns.

If I drop the last mutate(across(...)) line, the code works and gives me 6 variables in rain_daily. I don't know how R differentiates columns from variables. In playing around with column names versus column numbers in mutate(across(...)), it seems that R is not recognizing my first two variables, year and month. That makes me suspect the issue is caused by my group_by statement, but I can't figure out how to fix it. Thanks in advance!

cgra · February 11, 2026, 6:55pm

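Solved: add .groups = "drop" to the summarize() calls. I had tried .groups = "keep" earlier and it didn't work, so I assumed .groups was the wrong path to follow. As far as I can tell, summarize() by default drops only the last grouping variable, so my result was still grouped by year and month. across() does not count grouping columns when selecting by position, which is why it saw only 4 columns (day, median_precip, rain_any, rain_cat) instead of 6. "keep" preserves the full grouping (year, month, day), while "drop" removes all grouping, so every column becomes selectable again.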

rain_daily <- weather %>%
  group_by(origin, year, month, day) %>%
  summarize(sum_precip_day = sum(precip), .groups = "drop") %>%
  group_by(year, month, day) %>%
  summarize(median_precip = median(sum_precip_day), .groups = "drop") %>%
  # ...followed by the same mutate() steps as in the original post
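For anyone who lands here later, this is a minimal sketch of what seems to be going on, using a small made-up tibble instead of nycflights13. The data and the names toy, still_grouped, ungrouped and result are placeholders I made up for illustration, and it assumes dplyr 1.0.0 or later (where the .groups argument exists). It compares the grouping left behind by the default summarize() with .groups = "drop", then converts columns to factors by name rather than by position.

```r
library(dplyr)

# Toy stand-in for the daily data (made-up values, not from nycflights13)
toy <- tibble(
  year   = c(2013, 2013, 2013, 2013),
  month  = c(1, 1, 2, 2),
  day    = c(1, 1, 1, 1),
  precip = c(0, 0.2, 0.1, 0.3)
)

# Default behavior: summarize() drops only the last grouping level (day),
# so the result is still grouped by year and month.
still_grouped <- toy %>%
  group_by(year, month, day) %>%
  summarize(total = sum(precip))
group_vars(still_grouped)

# With .groups = "drop" the result has no grouping at all.
ungrouped <- toy %>%
  group_by(year, month, day) %>%
  summarize(total = sum(precip), .groups = "drop")
group_vars(ungrouped)

# Selecting by name inside across() does not depend on column positions,
# which shift when some columns are grouping variables.
result <- ungrouped %>%
  mutate(across(c(year, month, day), factor))
```

Calling ungroup() just before the final mutate() should have the same effect as .groups = "drop", and referring to columns by name, as in the last step, keeps the selection from breaking if the column order changes.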
