I often find myself writing code that analyses the same dataset at different levels of aggregation. I almost always end up keeping the results in separate dataframes, as in the example below:
library(tidyr)
library(dplyr)
library(purrr)

car_reg <- function(data) {
  lm(mpg ~ cyl + hp, data = data)
}

by_vs_gear <- mtcars %>%
  group_by(vs, gear) %>%
  nest() %>%
  mutate(car_model = map(data, car_reg))

by_vs <- mtcars %>%
  group_by(vs) %>%
  nest() %>%
  mutate(car_model = map(data, car_reg))
Are there any tidyr or purrr idioms that would let me 'peel off' a level of aggregation and keep the original aggregated results all in one dataframe? Or maybe this is just a bad idea?
Ultimately, I would like to end up with a single dataframe that looks like this:
library(tidyr)
library(dplyr)
library(purrr)

car_reg <- function(data) {
  lm(mpg ~ cyl + hp, data = data)
}

by_vs_gear <- mtcars %>%
  group_by(vs, gear) %>%
  nest() %>%
  mutate(car_model = map(data, car_reg))

by_vs <- mtcars %>%
  group_by(vs) %>%
  nest() %>%
  mutate(car_model = map(data, car_reg), gear = NA)

aggregated <- by_vs_gear %>% bind_rows(by_vs)
but without the extra work of keeping two intermediate dataframes around.
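For what it's worth, the closest I can sketch myself is something like the following. Note that `fit_by` and `groupings` are names I made up for illustration, not an existing tidyr or purrr idiom. The sketch assumes dplyr >= 1.0 (for `across()` and `all_of()`), and it still builds one nested dataframe per level internally before binding them together:

```r
library(tidyr)
library(dplyr)
library(purrr)

car_reg <- function(data) {
  lm(mpg ~ cyl + hp, data = data)
}

# Hypothetical helper: nest `df` by the columns named in `group_vars`
# and fit one model per group.
fit_by <- function(df, group_vars) {
  df %>%
    group_by(across(all_of(group_vars))) %>%
    nest() %>%
    ungroup() %>%
    mutate(car_model = map(data, car_reg))
}

# One entry per level of aggregation, from finest to coarsest.
groupings <- list(c("vs", "gear"), "vs")

# bind_rows() fills grouping columns that a level does not use (here `gear`) with NA.
aggregated <- groupings %>%
  map(~ fit_by(mtcars, .x)) %>%
  bind_rows()
```

This at least avoids naming the intermediate results and makes it easy to add another level by extending `groupings`, but it re-nests the raw data for every level rather than 'peeling off' a level from an already aggregated result.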