DBScan
April 20, 2023, 1:52pm
1
I am starting to incorporate the purrr package in my daily work. One thing I can't accomplish is splitting a dataframe into multiple dataframes. Following the example of purrr, I can do this:
mtcars %>%
split(mtcars$gear)
Which splits the dataframe into three smaller dataframes by "gear".
Now I would like to split the dataframe again, for instance by "am".
I have tried this:
mtcars %>%
split(mtcars$gear) %>%
map(split, mtcars$am)
which works, but throws a warning.
Following purrr to fit a model, I tried this, but I got an error:
mtcars %>%
split(mtcars$gear) %>%
map(split, mtcars$am) %>%
map(\(df) lm(mpg ~ wt, data = df)) |>
map(summary) %>%
map_dbl("r.squared")
"object 'wt' not found".
How could I fix this?
Leon
April 20, 2023, 2:08pm
2
I recommend going about it using a tidy approach. See my example here:
This might not be exactly what you are looking for, but based on what you are trying to achieve, it might be a good approach to look into:
# Load Libraries ----------------------------------------------------------
library("tidyverse")
library("broom")
# Create Example Data -----------------------------------------------------
map(1:6, \(i) write_csv(x = tibble(x = rnorm(10), y = rnorm(10)),
file = str_c("~/my_csv_file_", i, ".csv")) )
# Load Data ---------------------------------------------------------------
my_csv_files <- list.files(path = "~", full.names = TRUE, pattern = "csv$")
my_data <- my_csv_files %>%
map(read_csv)
# Run tests --------------------…
Leon
April 20, 2023, 2:17pm
3
In your case, that translates to something along the lines of:
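library("tidyverse")
library("broom")
mtcars %>%
  # One row per gear/am combination; that group's rows go into the list-column `data`
  group_by(gear, am) %>%
  nest() %>%
  # Fit one model per group, then get a one-row summary of each fit
  mutate(mdl = map(data, ~lm(mpg ~ wt, data = .x)),
         mdl_summary = map(mdl, glance)) %>%
  unnest(mdl_summary) %>%
  ungroup() %>%
  select(gear, am, r.squared)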
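If you would rather stay close to your original split/map pipeline, here is a minimal sketch of one way it could work. As far as I can tell, the warning comes from splitting each sub-dataframe by the full-length mtcars$am, which no longer lines up with its rows. The error follows because each element is then a list of data frames rather than a data frame, so lm() cannot find the columns. The sketch assumes R 4.1 or later (for `|>` and `\(df)`); the names `pieces` and `r_squared` are just for illustration. It splits on both variables in one call to split(), so every element stays a plain data frame:

```r
library(purrr)

# Split on both variables at once. drop = TRUE discards gear/am
# combinations that have no rows (e.g. gear = 3 with am = 1 in mtcars).
pieces <- split(mtcars, list(mtcars$gear, mtcars$am), drop = TRUE)

# Each element of `pieces` is a data frame, so lm() can find mpg and wt.
r_squared <- pieces |>
  map(\(df) lm(mpg ~ wt, data = df)) |>
  map(summary) |>
  map_dbl("r.squared")

r_squared
```

The result is a named numeric vector, with names built from the gear and am values joined by a dot. The nested tibble above keeps gear and am as proper columns, which is usually easier to work with afterwards.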
DBScan
April 21, 2023, 6:15am
4
Interesting approach, works like a charm!