ML preprocessing to achieve stationarity


I would like to use machine learning models on multivariate time series data to forecast long horizons (for example, 400 items with their historical sales over the last year, plus content features).

From many papers, blogs, and Kaggle notebooks I understood that the time series must be stationary before I use classical ML algorithms. The reason is that tree-based ML models such as XGBoost and CatBoost can't extrapolate into the future, beyond the range of values seen in training.

If I make the mean and variance stationary and add seasonal attributes (such as lags), then it should be fine.

To make the variance stationary I can use a log or Box-Cox power transformation.
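For reference, here is a minimal sketch of that variance-stabilizing step and its inverse, written with NumPy only. The helper names `boxcox_forward` and `boxcox_inverse`, the sample sales values, and the fixed `lam = 0.5` are assumptions for illustration; in practice lambda is usually estimated from the data (for example with `scipy.stats.boxcox`).

```python
import numpy as np

def boxcox_forward(y, lam):
    """Box-Cox transform for strictly positive y; lam == 0 is the log case."""
    y = np.asarray(y, dtype=float)
    if lam == 0:
        return np.log(y)
    return (y**lam - 1.0) / lam

def boxcox_inverse(z, lam):
    """Map transformed values (or predictions) back to the original scale."""
    z = np.asarray(z, dtype=float)
    if lam == 0:
        return np.exp(z)
    return (lam * z + 1.0) ** (1.0 / lam)

# Made-up sales values, purely illustrative.
sales = np.array([12.0, 15.0, 9.0, 30.0, 45.0, 28.0])
lam = 0.5  # assumed value, not estimated from data

z = boxcox_forward(sales, lam)      # train the model on z
restored = boxcox_inverse(z, lam)   # apply the same inverse to predictions
```

The model would be trained on `z`, and its predictions passed through `boxcox_inverse` with the same `lam`.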

For the mean, though, I can't find a practical approach to enforce stationarity. I tried differencing, but since I have a long horizon (such as 90 future points) I got very bad results.
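To make the differencing step concrete, here is a small sketch of first differencing and of mapping predicted differences back to levels with a cumulative sum. The numbers and the "predicted" differences are made up; the last part shows one plausible reason long horizons suffer, namely that a small constant bias in each predicted difference grows linearly once the differences are summed.

```python
import numpy as np

# Made-up history in the original scale.
history = np.array([100.0, 104.0, 103.0, 110.0, 115.0])

# First differences: this is what the model would be trained on.
diffs = np.diff(history)  # [4, -1, 7, 5]

# Pretend a model predicted the next three differences (hypothetical values).
predicted_diffs = np.array([3.0, 2.0, 4.0])

# Invert the differencing: last known level plus the running sum of differences.
forecast = history[-1] + np.cumsum(predicted_diffs)

# A constant bias of 0.5 per step accumulates to 0.5 * h at horizon h.
bias = 0.5
biased_forecast = history[-1] + np.cumsum(predicted_diffs + bias)
level_error = biased_forecast - forecast
```

With a horizon of 90 steps the same per-step bias would produce a level error of 45 at the last step, which is consistent with differenced forecasts drifting badly far ahead.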

I am familiar with the two types of trend: stochastic and deterministic.

Can someone help with how to enforce stationarity in the mean and then transform the predictions back to their original scale? If anyone has a Python code example for such a task, that would be great!
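One possible shape for such an example, assuming the trend is deterministic and roughly linear: fit the trend on a time index, train on the detrended residual, then add the extrapolated trend back to the predictions. Everything here is an assumption for illustration, including the synthetic series, the weekly period `season = 7`, and the seasonal-naive forecast that stands in for a real ML model such as XGBoost.

```python
import numpy as np

rng = np.random.default_rng(0)
n, horizon, season = 200, 90, 7

# Synthetic series: linear trend + weekly seasonality + noise (illustrative only).
t = np.arange(n)
y = 50 + 0.3 * t + 5 * np.sin(2 * np.pi * t / season) + rng.normal(0, 1, n)

# 1. Fit a deterministic linear trend on the time index.
slope, intercept = np.polyfit(t, y, deg=1)
trend = intercept + slope * t

# 2. Remove the trend; the residual is roughly stationary in the mean.
residual = y - trend

# 3. Stand-in for the ML model: seasonal-naive forecast of the residual,
#    repeating the last observed season across the horizon.
t_future = np.arange(n, n + horizon)
resid_forecast = residual[-season:][np.arange(horizon) % season]

# 4. Transform back: add the trend extrapolated to the future time index.
forecast = resid_forecast + intercept + slope * t_future
```

Because the trend is extrapolated by the linear fit rather than by the tree model, the model only has to predict values inside the range it saw during training. This sketch does not cover a stochastic trend.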

Regards, Boris

Referred here by Forecasting: Principles and Practice, by Rob J Hyndman and George Athanasopoulos
