Hi, I'm new to machine learning, so apologies if I'm just making a stupid mistake here: I'm trying to fit an elastic net model to some UK polling data in RStudio, using 10-fold cross-validation. When I get to tuning the model, I get this error:
"x Fold01: preprocessor 1/1, model 1/10: Error: y should be one of the following classes: 'data.frame', 'matrix', 'factor'" for each fold/model.
I've got this exact code to work before, but for a different dataset. I've tried removing NA entries and various other cleaning methods. I've also tried converting variables to factors and wrapping the data in as.data.frame(). Every time I get the same error.
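In case it helps anyone diagnosing this, here is a minimal sketch of how the class of each column can be inspected before tuning. It uses a made-up data frame (`toy` and its values are hypothetical, not the BES data), and the `haven_labelled` class vector is set by hand as an assumption about what `read_dta()` may return for labelled Stata variables. That is a guess at the cause, not a confirmed diagnosis.

```r
# Hypothetical stand-in for two BES columns; the values are made up.
toy <- data.frame(
  turnoutUKGeneral    = c(5, 4, 5, 1),
  generalElectionVote = c(1, 2, 1, 3)
)

# Assumption: mimic the class vector that labelled Stata columns may carry
# after read_dta(). Set by hand here so the sketch needs only base R.
class(toy$turnoutUKGeneral) <- c("haven_labelled", "vctrs_vctr", "double")

# Collect the class vector of every column.
col_classes <- lapply(toy, class)

# Flag columns that are not a plain numeric, integer, or factor vector.
plain <- vapply(
  col_classes,
  function(cl) length(cl) == 1 && cl %in% c("numeric", "integer", "factor"),
  logical(1)
)
needs_attention <- names(plain)[!plain]
print(needs_attention)
```

On the real data, the equivalent check would be `lapply(bes_train[c("turnoutUKGeneral", "generalElectionVote")], class)`.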
I would be interested to see if anyone else has run into this before, or could advise me on it! I can't find much about it online.
I'm using several packages, mostly from the tidyverse and tidymodels (the full list is in the session info below).
Here's my code for reference:
# Packages
library(tidyverse)
library(tidymodels)
library(haven) # read_dta()

# Importing
bes <- read_dta("Data/BES2019_W20_v0.1-3.dta")
bes[bes == 9999] <- NA
bes <- as.data.frame(bes)

# Splitting data
set.seed(1111)
bes_split <- initial_split(bes, prop = 0.8)
bes_train <- training(bes_split)
bes_test <- testing(bes_split)

set.seed(1111)
bes_folds <- vfold_cv(bes_train, v = 10)

# Recipe
elastic_recipe <-
  recipe(
    formula = turnoutUKGeneral ~ generalElectionVote,
    data = bes_train,
    na.rm = TRUE
  ) %>%
  step_zv(all_predictors()) %>%
  step_normalize(all_predictors(), -all_nominal()) %>%
  step_dummy(all_nominal()) %>%
  step_nzv(all_nominal())

# Model specification
elastic_spec <-
  linear_reg(penalty = tune(), mixture = tune()) %>%
  set_mode("regression") %>%
  set_engine("glmnet")

# Workflow
elastic_workflow <-
  workflow() %>%
  add_recipe(elastic_recipe) %>%
  add_model(elastic_spec)

# Tuning
elastic_grid <- grid_regular(
  penalty(range = c(-10, -2)),
  mixture(),
  levels = c(10, 10)
)

# ERROR WHEN RUNNING THIS CHUNK
elastic_tune <- elastic_workflow %>%
  tune_grid(
    resamples = bes_folds,
    grid = elastic_grid
  )

elastic_tune %>%
  collect_metrics()
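For context on the tuning grid in the code above: as far as I understand, `penalty()` in dials takes its range on the log10 scale, so `range = c(-10, -2)` spans 1e-10 to 1e-2. The base-R sketch below approximates what a 10 x 10 regular grid over those ranges looks like. It assumes `mixture()` runs from 0 to 1, which may differ between dials versions, and `sketch_grid` is a hypothetical name, not the object returned by `grid_regular()`.

```r
# Penalty values: 10 points evenly spaced on the log10 scale.
penalty_values <- 10^seq(-10, -2, length.out = 10)

# Assumption: mixture spans 0 (ridge) to 1 (lasso).
mixture_values <- seq(0, 1, length.out = 10)

# Every combination of penalty and mixture.
sketch_grid <- expand.grid(
  penalty = penalty_values,
  mixture = mixture_values
)

nrow(sketch_grid) # 100 candidate combinations
range(sketch_grid$penalty)
```

This would be consistent with the "model 1/10" in the error message if the ten penalty values are handled within a single glmnet fit per mixture value, though that reading is an assumption.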
Session info:
R version 4.1.1 (2021-08-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 22000)
Matrix products: default
locale:
[1] LC_COLLATE=English_United Kingdom.1252 LC_CTYPE=English_United Kingdom.1252
[3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] glmnet_4.1-3 Matrix_1.3-4 vctrs_0.3.8 rlang_0.4.12 styler_1.6.2
[6] janitor_2.1.0 textrecipes_0.4.1 themis_0.1.4 rpart.plot_3.1.0 rpart_4.1-15
[11] ranger_0.13.1 parttree_0.0.1.9000 gt_0.3.1 modelsummary_0.9.3 haven_2.4.3
[16] patchwork_1.1.1 vip_0.3.2 gridExtra_2.3 jtools_2.1.4 tidytext_0.3.2
[21] yardstick_0.0.8 workflowsets_0.1.0 workflows_0.2.4 tune_0.1.6 rsample_0.1.1
[26] recipes_0.1.17 parsnip_0.1.7 modeldata_0.1.1 infer_1.0.0 dials_0.0.10
[31] scales_1.1.1 broom_0.7.10 tidymodels_0.1.4 forcats_0.5.1 stringr_1.4.0
[36] dplyr_1.0.7 purrr_0.3.4 readr_2.0.2 tidyr_1.1.4 tibble_3.1.5
[41] ggplot2_3.3.5 tidyverse_1.3.1
loaded via a namespace (and not attached):
[1] readxl_1.3.1 mlr_2.19.0 backports_1.3.0 fastmatch_1.1-3 plyr_1.8.6 splines_4.1.1
[7] listenv_0.8.0 SnowballC_0.7.0 digest_0.6.28 foreach_1.5.1 htmltools_0.5.2 fansi_0.5.0
[13] magrittr_2.0.1 checkmate_2.0.0 BBmisc_1.11 unbalanced_2.0 doParallel_1.0.16 tzdb_0.2.0
[19] globals_0.14.0 modelr_0.1.8 gower_0.2.2 R.utils_2.11.0 hardhat_0.1.6 colorspace_2.0-2
[25] rvest_1.0.2 xfun_0.27 crayon_1.4.2 jsonlite_1.7.2 survival_3.2-11 iterators_1.0.13
[31] glue_1.4.2 gtable_0.3.0 ipred_0.9-12 R.cache_0.15.0 shape_1.4.6 future.apply_1.8.1
[37] DBI_1.1.1 Rcpp_1.0.7 GPfit_1.0-8 lava_1.6.10 prodlim_2019.11.13 httr_1.4.2
[43] FNN_1.1.3 ellipsis_0.3.2 R.methodsS3_1.8.1 pkgconfig_2.0.3 ParamHelpers_1.14 nnet_7.3-16
[49] dbplyr_2.1.1 utf8_1.2.2 tidyselect_1.1.1 DiceDesign_1.9 munsell_0.5.0 cellranger_1.1.0
[55] tools_4.1.1 cli_3.1.0 generics_0.1.1 evaluate_0.14 fastmap_1.1.0 yaml_2.2.1
[61] tables_0.9.6 knitr_1.36 fs_1.5.0 pander_0.6.4 RANN_2.6.1 future_1.23.0
[67] R.oo_1.24.0 xml2_1.3.2 tokenizers_0.2.1 compiler_4.1.1 rstudioapi_0.13 reprex_2.0.1
[73] lhs_1.1.3 stringi_1.7.5 lattice_0.20-44 pillar_1.6.4 lifecycle_1.0.1 furrr_0.2.3
[79] data.table_1.14.2 R6_2.5.1 janeaustenr_0.1.5 parallelly_1.28.1 codetools_0.2-18 MASS_7.3-54
[85] assertthat_0.2.1 ROSE_0.0-4 withr_2.4.2 parallel_4.1.1 hms_1.1.1 grid_4.1.1
[91] timeDate_3043.102 class_7.3-19 rmarkdown_2.11 snakecase_0.11.0 parallelMap_1.5.1 pROC_1.18.0
[97] lubridate_1.8.0