Forecasting using all possible combinations of coincident indicators

A lot of people, when forecasting some variable, like to run many regressions and compare their out-of-sample error, usually the RMSE, before settling on a final forecast.

So, for example, imagine you want to forecast industrial production (variable Y), and you have many coincident indicators, like payroll, industry sentiment, electricity usage, truck traffic, and so on (variables X1, X2, ..., Xn).

One would like to run Y as a function of all possible combinations of these variables, like:

Y ~ X1

Y ~ X2

Y ~ X3

Y ~ X1 + X2

Y ~ X1 + X3

Y ~ X2 + X3

Y ~ X1 + X2 + X3

and so on.
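The list above can be generated rather than typed by hand. Here is a minimal base R sketch, assuming the regressors are columns called X1, X2 and X3 (placeholder names, not from real data). Note that n regressors give 2^n - 1 models, so the list grows very quickly.

```r
# Placeholder regressor names; replace with the columns in your own data.
regressors <- c("X1", "X2", "X3")

# For each subset size k, build every combination and join its terms with " + ".
rhs <- unlist(lapply(seq_along(regressors), function(k) {
  combn(regressors, k, FUN = paste, collapse = " + ")
}))

# Prepend the response to get one model string per combination.
model_strings <- paste("Y ~", rhs)
print(model_strings)
```

With three regressors this gives the seven models listed above, in the same order.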

Update: I found this post, and it helped me a lot: .

Update 3: I was able to create a vector of characters with all models I want to run, but the function "model" does not recognize it. What can I do?
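One likely cause is that a character string is not a formula, so it has to be converted with as.formula() before a modelling function will accept it. The sketch below shows the conversion with base R lm() on simulated data (the data and column names are made up for illustration). With fable, the analogous step would presumably be to wrap as.formula(s) in TSLM() and splice the resulting list into model() with !!!, but I have not verified that here, so treat it as an assumption.

```r
# Simulated data, for illustration only.
set.seed(1)
df <- data.frame(X1 = rnorm(40), X2 = rnorm(40))
df$Y <- 1 + 2 * df$X1 - df$X2 + rnorm(40)

model_strings <- c("Y ~ X1", "Y ~ X2", "Y ~ X1 + X2")

# Convert each string into a formula object; names keep track of which is which.
model_formulas <- lapply(model_strings, as.formula)
names(model_formulas) <- model_strings

# Fit one lm() per formula.
fits <- lapply(model_formulas, lm, data = df)
r_squared <- sapply(fits, function(f) summary(f)$r.squared)
print(r_squared)
```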

Is it possible to try stepwise regressions? Suppose you have a set of variables Z and you want to check whether each regression model should include any of them. The model would then be Y ~ X1 + Z, where each z in Z is tested for inclusion in the regression based on its p-value, for example. (Here is an example of stepwise regression: R Stepwise & Multiple Linear Regression [Step by Step Example])
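As a rough illustration of that idea, base R's step() can do forward selection while always keeping X1 in the model. Two caveats: step() selects by AIC rather than by p-value, and the data and the names Z1 to Z3 below are simulated placeholders.

```r
# Simulated data: Y depends on X1 and Z1 only; Z2 and Z3 are noise.
set.seed(99)
n_obs <- 80
df <- data.frame(X1 = rnorm(n_obs), Z1 = rnorm(n_obs),
                 Z2 = rnorm(n_obs), Z3 = rnorm(n_obs))
df$Y <- 1 + 2 * df$X1 + 1.5 * df$Z1 + rnorm(n_obs)

# Start from Y ~ X1 and let step() add candidates from Z.
# `lower` forces X1 to stay in; `upper` is the largest model allowed.
base_fit <- lm(Y ~ X1, data = df)
selected <- step(
  base_fit,
  scope = list(lower = ~ X1, upper = ~ X1 + Z1 + Z2 + Z3),
  direction = "forward",
  trace = 0
)
print(formula(selected))
```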

Using an expanding-window regression and forecasting 1 step ahead at each origin, you can get the out-of-sample RMSE of each of these models.

Finally, one would fit all models on the whole sample to forecast next month's industrial production (for example). Looking at the predictions of the top 100 models, one can get a distribution of predictions and a scatter plot of predictions vs RMSE, and finally, using the inverse of the RMSE as the weight, aggregate all forecasts into the final prediction.
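To make the expanding-window step concrete, here is a base R sketch with lm() on simulated data. The helper one_step_errors() and the starting window of 36 observations are my own choices for illustration. It uses the observed regressor values at t + 1, which is reasonable only because the indicators are coincident and therefore known when the forecast is made.

```r
# Simulated data, for illustration only.
set.seed(123)
n_obs <- 60
df <- data.frame(X1 = rnorm(n_obs), X2 = rnorm(n_obs))
df$Y <- 1 + 2 * df$X1 - df$X2 + rnorm(n_obs, sd = 0.5)

# Hypothetical helper: fit on rows 1..t, predict row t + 1, for t = init, ..., n - 1.
one_step_errors <- function(formula, data, init = 36) {
  response <- all.vars(formula)[1]
  origins <- init:(nrow(data) - 1)
  errors <- sapply(origins, function(t) {
    fit <- lm(formula, data = data[1:t, ])
    pred <- predict(fit, newdata = data[t + 1, , drop = FALSE])
    data[[response]][t + 1] - pred
  })
  unname(errors)
}

formulas <- list(m1 = Y ~ X1, m2 = Y ~ X2, m12 = Y ~ X1 + X2)

# Out-of-sample RMSE of the 1-step-ahead forecasts, one value per model.
rmse <- sapply(formulas, function(f) sqrt(mean(one_step_errors(f, df)^2)))
print(rmse)
```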
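The inverse-RMSE weighting amounts to a few lines of arithmetic. The predictions and RMSE values below are invented purely to show the calculation.

```r
# Hypothetical next-month predictions and out-of-sample RMSEs for three models.
predictions <- c(m1 = 1.2, m2 = 0.8, m3 = 1.0)
rmse <- c(m1 = 0.5, m2 = 1.0, m3 = 2.0)

# Weight each model by 1 / RMSE, normalised so that the weights sum to one.
weights <- (1 / rmse) / sum(1 / rmse)

# The final forecast is the weighted average of the individual predictions.
combined <- sum(weights * predictions)
print(weights)
print(combined)
```

Models with a lower RMSE pull the combined forecast towards their own prediction; here m1 gets four times the weight of m3.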

Update 2: I found the part of the book about cross-validation. It's awesome. The function stretch_tsibble is amazing. One can find more about cross-validation in section 5.9 of the fpp3 book.

I am trying to implement this with fable. I am reading Forecasting: Principles and Practice (3rd ed.) and trying to figure out how to do it.

Should the data be in a long or wide format in the tsibble in order to get all possible combinations? If I build all possible combinations as a vector of strings (I don't know if that's the best way), how can I run all the models and test the out-of-sample forecasts to get the RMSE of the 1-step-ahead predictions?

Is there a function or a package related to fable that implements this routine?

In order to get the 6-month moving average of the RMSE of each model, is there a function that could help, instead of writing a loop over every 1-step expanding-window sample?
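One way to avoid the loop, assuming the 1-step-ahead errors are already stored as a vector in time order, is a trailing 6-observation mean of the squared errors followed by a square root. This sketch uses stats::filter() from base R on made-up errors; slider::slide_dbl() would be another option in the tidyverse ecosystem.

```r
# Hypothetical 1-step-ahead forecast errors for 24 consecutive months.
set.seed(7)
errors <- rnorm(24)

# Trailing 6-month mean of squared errors (sides = 1 uses current and past values only),
# then the square root to get a rolling RMSE. The first 5 values are NA.
rolling_mse <- stats::filter(errors^2, rep(1 / 6, 6), sides = 1)
rolling_rmse <- sqrt(as.numeric(rolling_mse))
print(round(rolling_rmse, 3))
```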

Appreciate any help!

When forecasting with exogenous regressors using fable, you will need to provide the future values of these regressors to the forecast() function via the new_data argument.

This is also needed when computing cross-validated forecast errors. In that case you will also need to prepare your new_data so that it matches the cross-validation folds.
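A rough sketch of what that could look like is below. It assumes the fable, tsibble and dplyr packages are installed and uses simulated monthly data. The column names, the 36-month initial window and the two models are illustrative choices, so treat it as a starting point rather than a tested recipe.

```r
library(fable)
library(tsibble)
library(dplyr)

# Simulated monthly data, for illustration only.
set.seed(42)
n_obs <- 48
data <- tibble(
  month = yearmonth("2018 Jan") + 0:(n_obs - 1),
  X1 = rnorm(n_obs),
  X2 = rnorm(n_obs)
) |>
  mutate(Y = 1 + 2 * X1 - X2 + rnorm(n_obs, sd = 0.5)) |>
  as_tsibble(index = month)

# Expanding-window training folds, keyed by .id. The last row is held back
# so that every fold has an observed value to forecast.
train <- data |>
  slice(1:(n() - 1)) |>
  stretch_tsibble(.init = 36, .step = 1)

# One future row per fold, with the regressor values for that month joined in.
test <- new_data(train, n = 1) |>
  left_join(select(as_tibble(data), month, X1, X2), by = "month")

# Fit each model on every fold and forecast 1 step ahead using the fold's new_data.
fc <- train |>
  model(m12 = TSLM(Y ~ X1 + X2), m1 = TSLM(Y ~ X1)) |>
  forecast(new_data = test)

# Compare the forecasts with the observed values; one RMSE per model.
acc <- accuracy(fc, data)
print(select(acc, .model, RMSE))
```

The key point is that test has the same .id key as train, so each fold's forecast picks up the regressor values for its own next month.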
