How to refit a stacks ensemble to the whole data set (training + testing) to finally predict new data?

MxNl · November 23, 2022, 11:13am

Hello,

I have a question about the stacks package and couldn't find any resource during a quick web search.

I tuned multiple models (svm, mlp, nnetar and prophet) using cross-validation resampling on the training split of initial_split().
Then, I stacked the tuned models with stacks() and applied blend_predictions() to make an ensemble, which I fitted with fit_members().
Then, I tested the ensemble performance using the prediction of the test split of initial_split().
The question now is: How do I now refit the ensemble using all of the data (training + testing) to obtain my final model and use it for predicting new data?

Maybe I am missing something really obvious. Thanks for any hint.
If it is not clear what I mean, I'll try to provide a reprex.

simoncouch · November 23, 2022, 1:21pm

Thanks for the post, @MxNl!

Your workflow as-is sounds complete! The fit_members() step will take care of fitting members with the full training set. We don't provide an interface for fitting to both the training and testing set, as this would leave no data for evaluating efficacy of the model.
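
For reference, the workflow described above can be sketched roughly as follows. This is only a minimal illustration under stated assumptions: `mtcars`, the two candidate models (a plain linear regression and a penalty-tuned glmnet model), the fold count, and the grid size are placeholders chosen so the example is self-contained, not the svm/mlp/nnetar/prophet setup or the time series data from the question.

```r
library(tidymodels)
library(stacks)

set.seed(1)

# Placeholder data: split once, resample only the training portion
split <- initial_split(mtcars)
train <- training(split)
test  <- testing(split)
folds <- vfold_cv(train, v = 5)

rec <- recipe(mpg ~ ., data = train)

# Candidate 1 (placeholder): untuned linear regression, resampled.
# control_stack_resamples() keeps the out-of-fold predictions stacks needs.
lm_res <- workflow(rec, linear_reg()) %>%
  fit_resamples(resamples = folds, control = control_stack_resamples())

# Candidate 2 (placeholder): lasso with a tuned penalty.
# control_stack_grid() plays the same role for tuning results.
glmnet_spec <- linear_reg(penalty = tune(), mixture = 1) %>%
  set_engine("glmnet")
glmnet_res <- workflow(rec, glmnet_spec) %>%
  tune_grid(resamples = folds, grid = 4, control = control_stack_grid())

# Stack the candidates, blend them, then fit the retained members
# on the full training set
ens <- stacks() %>%
  add_candidates(lm_res) %>%
  add_candidates(glmnet_res) %>%
  blend_predictions() %>%
  fit_members()

# Evaluate on the held-out test split
test_pred <- predict(ens, test) %>% bind_cols(test)
rmse(test_pred, truth = mpg, estimate = .pred)
```

The test split is touched only in the last two lines, which is what keeps the reported metric an honest estimate of performance on new data.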

MxNl · November 23, 2022, 3:27pm

Thanks a lot for the quick reply, @simoncouch! I understand that point, but once you know your ensemble's or model's performance, you usually retrain it on both splits to include all available data, no? In my case the data is a time series with predictors, so it would be good to train the model up to the most recent date before the prediction period starts. Do you have a hint on how you would do this without leaving the tidymodels logic?

Thanks again for your time!
