Cross-Validate a list of models

willmjr · November 21, 2021, 12:38pm

Hi there,

I need to run one-step-ahead cross-validation on a list of models, but I am struggling with modeltime_resample_accuracy.

My idea is to write a function that cross-validates a list of models. Before doing that, I tried to understand how modeltime_resample_accuracy works on a single model, and I got an error:

Error: In metric: `mase`
Problem with `summarise()` column `.estimate`.
i `.estimate = metric_fn(...)`.
x `truth` must have a length greater than `m` to compute the out-of-sample naive mean absolute error.
i The error occurred in group 1: .model_id = 1, .model_desc = "LM", .resample_id = "Slice1", .type = "Resamples".
Run `rlang::last_error()` to see where the error occurred.

I was running the following code:

library(tidyverse)
library(tidymodels)
library(modeltime)
library(modeltime.resample)
library(timetk)
library(lubridate)
library(zoo)
library(fpp3)


list_of_regressors <- c(1, us_change[3:6] %>% colnames())



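# Build every combination of regressors and the matching model formula.
# .dependent_var: name of the response variable (left-hand side of the formula)
# .regressors:    character vector of candidate regressors
# .min_regs, .max_regs: smallest and largest number of regressors per model
regressor_combination <- function(.dependent_var, 
                                  .regressors = c(""),
                                  .min_regs = 1,
                                  .max_regs = length(.regressors)) {
  
  # For each size m, combn() returns an m-row matrix with one combination
  # per column; split it by column so each combination is its own vector
  regressors_list <- map(.x = .min_regs:.max_regs,
                         .f = ~ combn(x = .regressors,
                                      m = .x) %>%
                           split(x = .,
                                 f = rep(1:ncol(.), each = nrow(.)))) %>%
    do.call(what = c,
            args = .) %>%
    unname()
  
  # One row per combination, with an id and the formula stored as a string
  output <- tibble(
    regressors = regressors_list
  ) %>%
    mutate(id = row_number()) %>%
    select(id, everything()) %>%
    mutate(formula = map_chr(.x = regressors,
                             ~ paste0(.dependent_var, " ~ ", paste(.x, collapse = " + "))))
  
  return(output)
  
}


tbl_models <- regressor_combination(.dependent_var = "Consumption",
                                    .regressors = list_of_regressors)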


data <- us_change %>%
  as_tibble()


# Split once and reuse the same split object for both sets
splits <- initial_time_split(data = data, prop = 0.8)
train_data <- training(splits)
test_data <- testing(splits)



resamples_tscv <- time_series_cv(
  data = data,
  date_var = Quarter,
  assess = 1,
  initial = nrow(data)-4*2
)

model <- linear_reg() %>%
  set_engine("lm") %>%
  fit(as.formula(tbl_models$formula[11]), data = train_data)


modeltime_table(model) %>%
  modeltime_fit_resamples(
    resamples = resamples_tscv,
    control = control_resamples(verbose = FALSE)
  ) %>%
  modeltime_resample_accuracy()
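
One possible reading of the error, offered as a sketch and not a confirmed diagnosis: with assess = 1, every resample slice holds a single observation, and yardstick's mase needs more than m observations (m defaults to 1) in `truth` to compute the naive error it scales by. The snippet below reproduces that condition outside modeltime with yardstick alone. The two numbers are made-up values for illustration, not taken from us_change.

```r
library(yardstick)

# A one-step-ahead assessment window: one actual and one prediction
# (hypothetical values, for illustration only)
truth    <- 0.62
estimate <- 0.55

# MAE only needs the point error, so a single observation is enough
mae_result <- mae_vec(truth = truth, estimate = estimate)

# MASE scales by the naive forecast error computed from `truth` itself,
# which needs more than m = 1 observations; catch the resulting error
mase_result <- tryCatch(
  mase_vec(truth = truth, estimate = estimate),
  error = function(e) "mase failed"
)

mae_result
mase_result
```

If that is indeed the cause, one option would be to pass a metric set that leaves mase out, for example `modeltime_resample_accuracy(metric_set = metric_set(mae, rmse))`, or to use a larger assess window so that each slice has more than one observation.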
