I've primarily been using modeltime but having a play around with fable to explore bagging. I'm able to recreate the examples from the fpp3 book but what I would like to do is calculate accuracy metrics for the mean or median of the bagged forecast so I can compare with the standard model e.g. ETS vs baggedETS and so on.

I see the examples to create models, forecast, and calculate accuracy for single series and (for example) I can take my monthly data, create a training set, fit a number of models and then plot the forecast against the full data along with the accuracy.

It's not clear to me how to do the same thing with baggedETS. I can create train data, generate 100 simulations, and then forecast that forwards. I do not see how to calculate accuracy from there though (I guess because the resulting fable also has an additional .rep column compared to a 'normal' forecast) and I can't find any detail about this on the web or in the book.

Most of the required code is in the book. The only thing left to do is to turn the bagged forecast into a fable object. If you're only interested in point forecasts, we can just create a degenerate distribution based on the mean. To do full distributional forecasting, we would need to create a mixture distribution. Either way, once we have a fable, the accuracy function can be applied. Here is an example using a degenerate distribution based on the mean.

library(fpp3)
#> ── Attaching packages ──────────────────────────────────────────── fpp3 0.4.0 ──
#> ✓ tibble 3.1.4 ✓ tsibble 1.0.1
#> ✓ dplyr 1.0.7 ✓ tsibbledata 0.3.0
#> ✓ tidyr 1.1.3 ✓ feasts 0.2.2
#> ✓ lubridate 1.7.10 ✓ fable 0.3.1
#> ✓ ggplot2 3.3.5
#> ── Conflicts ───────────────────────────────────────────────── fpp3_conflicts ──
#> x lubridate::date() masks base::date()
#> x dplyr::filter() masks stats::filter()
#> x tsibble::intersect() masks base::intersect()
#> x tsibble::interval() masks lubridate::interval()
#> x dplyr::lag() masks stats::lag()
#> x tsibble::setdiff() masks base::setdiff()
#> x tsibble::union() masks base::union()
library(distributional)
# Training data
cement <- aus_production %>%
filter(year(Quarter) >= 1988, year(Quarter) <= 2008) %>%
select(Quarter, Cement)
# ETS forecasts
ets_forecasts <- cement %>%
model(ets = ETS(Cement)) %>%
forecast(h = 6) %>%
select(-.model)
# Bagged forecasts (averaged over 100 replicates)
bagged_forecasts <- cement %>%
model(stl = STL(Cement)) %>%
generate(new_data = cement, times = 100, bootstrap_block_size = 8) %>%
select(-.model, -Cement) %>%
model(ets = ETS(.sim)) %>%
forecast(h = 6) %>%
summarise(bagged_mean = mean(.mean)) %>%
mutate(dist = dist_degenerate(bagged_mean)) %>%
as_fable(response = "Cement", distribution = dist)
#> Warning: The dimnames of the fable's distribution are missing and have been set
#> to match the response variables.
# Compare accuracy
bind_rows(
ets_forecasts %>% accuracy(aus_production) %>% mutate(method = "ETS"),
bagged_forecasts %>% accuracy(aus_production) %>% mutate(method = "Bagged ETS")
) %>%
select(method, everything())
#> # A tibble: 2 × 10
#> method .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
#> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 ETS Test -157. 178. 161. -7.55 7.71 1.63 1.38 -0.327
#> 2 Bagged ETS Test -222. 242. 222. -10.7 10.7 2.25 1.87 -0.403

^{Created on 2021-09-03 by the reprex package (v2.0.1)}