This post is inspired by the GitHub issue "`unnest` removes `rsample` attributes" (Issue #688, tidyverse/tidyr).
The code is adapted from the rsample article "Bootstrap Confidence Intervals".
Let's say I want to create bootstraps, fit models to those, and then analyze the models. That process is documented well. However, I can't seem to run the same post-model analyses on grouped data that I can run on ungrouped data, at least not with the int_*() functions.
Let's look at the analysis with one group:
library(tidymodels)
library(nlstools)
# data
O2Ka <- O2K %>% mutate(group = "A")
ggplot(O2Ka, aes(x = t, y = VO2)) +
  geom_point()
# build formula
nlin_formula <-
  as.formula(
    VO2 ~ (t <= 5.883) * VO2rest +
      (t > 5.883) *
      (VO2rest + (VO2peak - VO2rest) * (1 - exp(-(t - 5.883) / mu)))
  )
# Starting values from visual inspection
start_values <- list(VO2rest = 400, VO2peak = 1600, mu = 1)
single_nls <- nls(nlin_formula, start = start_values, data = O2Ka)
tidy(single_nls)
# Will be used to fit the models to different bootstrap data sets:
fit_nls_to_bootstraps <- function(split, ...) {
  # We could check for convergence, make new parameters, etc.
  nls(formula = nlin_formula,
      data = analysis(split),
      start = start_values, ...)
}
set.seed(462)
nlin_boot_no_grouping <-
  bootstraps(O2Ka, times = 1000, apparent = TRUE) %>%
  mutate(models = map(splits, fit_nls_to_bootstraps),
         coefs = map(models, tidy))
nlin_boot_no_grouping
# Notice it says "# Bootstrap sampling with apparent sample" above table output
ci_no_grouping <- int_pctl(nlin_boot_no_grouping, coefs)
ci_no_grouping
That last step, int_pctl(), works fine.
But try this on grouped data, and it throws an error.
# Let's have two groups now (even if just using the same data frame)
O2Kb <- O2K %>% mutate(group = "B")
O2Kg <- bind_rows(O2Ka, O2Kb)
# need bootstrap function to make it work in the nested workflow
bootstrapped <- function(x) bootstraps(x, times = 1000, apparent = TRUE)
set.seed(462)
nlin_boot_nested_grouping <-
  O2Kg %>%
  group_nest(group) %>%
  mutate(straps = map(data, bootstrapped)) %>%
  # the next unnest step seems to remove the rsample attributes of the bootstraps
  unnest(straps) %>%
  mutate(models = map(splits, fit_nls_to_bootstraps),
         coefs = map(models, tidy))
nlin_boot_nested_grouping
# compare to
nlin_boot_no_grouping
# Percentile intervals don't work
ci_nested_grouping <- int_pctl(nlin_boot_nested_grouping, coefs)
# Error `.data` should be an `rset` object generated from `bootstraps()`
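A quick way to check the suspicion in the comment above is to compare the classes of the two objects. This is only a diagnostic sketch: it reuses `nlin_boot_no_grouping` and `nlin_boot_nested_grouping` from the code above and assumes that int_pctl() decides what it accepts based on the `bootstraps` class, which is what the error message suggests.

```r
library(tidymodels)

# Class vector of the ungrouped object, which int_pctl() accepts
class(nlin_boot_no_grouping)

# Class vector of the grouped object after unnest(), which int_pctl() rejects
class(nlin_boot_nested_grouping)

# Direct check for the class the error message refers to
inherits(nlin_boot_no_grouping, "bootstraps")
inherits(nlin_boot_nested_grouping, "bootstraps")
```

If the two inherits() calls disagree, the rset class is being lost somewhere in the nested pipeline rather than inside int_pctl() itself.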
There are workarounds, of course, but I wanted to see if anyone else has encountered this problem, as I can't find much on it through internet searches.
One possible workaround:
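nlin_boot_nested_grouping %>%
  unnest(coefs) %>%
  group_by(group, term) %>%
  # Compute both quantiles before overwriting `estimate`: summarise() evaluates
  # its arguments in order, so a later quantile(estimate, ...) would see the median.
  summarise(lwr_CI = quantile(estimate, 0.025),
            upr_CI = quantile(estimate, 0.975),
            estimate = median(estimate))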
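Another option, sketched here rather than offered as a definitive fix, is to never unnest the bootstraps: fit the models and call int_pctl() inside each group's rset, and unnest only the resulting interval tables. It reuses `O2Kg`, `bootstrapped()` and `fit_nls_to_bootstraps()` from above; the helpers `fit_and_tidy()` and `pctl_ci()` are names made up for this example. It assumes that mutate() on a single bootstraps object keeps its rset class, as it does in the ungrouped example.

```r
library(tidymodels)

# Hypothetical helper: fit the models and tidy the coefficients within one rset,
# using the same mutate() call as the ungrouped example.
fit_and_tidy <- function(boots) {
  boots %>%
    mutate(models = map(splits, fit_nls_to_bootstraps),
           coefs = map(models, tidy))
}

# Hypothetical helper: percentile intervals for one rset.
pctl_ci <- function(boots) int_pctl(boots, coefs)

set.seed(462)
ci_by_group <-
  O2Kg %>%
  group_nest(group) %>%
  mutate(straps = map(data, bootstrapped),   # one rset per group, kept nested
         straps = map(straps, fit_and_tidy), # models and coefs inside each rset
         ci = map(straps, pctl_ci)) %>%      # int_pctl() sees a real rset
  select(group, ci) %>%
  unnest(ci)                                 # only plain tibbles are unnested

ci_by_group
```

Because int_pctl() runs before any unnest() touches the bootstraps, the class problem should not come up, and the result has one row per group and term.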