Solutions to scalable time series forecasting


I'm looking for suggestions for a solution in R that will allow me to forecast up to 50,000 time series each week.

My data consists of weekly sales data for up to 50,000 products, with approx. 2 years of history for each time series, i.e. 104 points per series.

I'm researching solutions and I'm aware of approaches in tidymodels using workflowsets with multiple recipes and models. These approaches seem to be better suited to small numbers of time series (please correct me if this is untrue). Here is an approach using tidymodels with multiple models and multiple time series. Assuming I apply this approach to my data using a large compute resource (high compute power and RAM), how will this solution fail?

Matt Dancho has developed modeltime with nested data frames on Spark, but upon further research it doesn't appear that hyperparameter tuning is possible with data frames on Spark using sparklyr.

Are there other solutions that you could suggest for scalable time series forecasting in R?



If you want to create separate models for each series of 104 data points, that should be easy with tidymodels and modeltime.

I think that RStudio Connect parameterized documents are probably a good place to do the tuning and fitting en masse.

For each model, you can version and deploy them using vetiver.
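
As a rough illustration of the "separate model for each series" idea above, here is a minimal sketch in base R. It is not the tidymodels/modeltime workflow itself: it swaps in stats::arima() with a fixed AR(1) order purely to keep the example dependency-free, and it runs on simulated data. The names sales, product_id, units, horizon and forecast_one are hypothetical, chosen only for this example.

```r
# Minimal sketch: one independent model per series, base R only.
# Assumption: long-format data with one row per product per week.
set.seed(42)

n_series <- 20   # small stand-in for the 50,000 products
n_weeks  <- 104  # two years of weekly data
horizon  <- 8    # hypothetical forecast horizon, in weeks

# Simulated sales data (illustrative only, not real figures).
sales <- data.frame(
  product_id = rep(sprintf("P%03d", seq_len(n_series)), each = n_weeks),
  week       = rep(seq_len(n_weeks), times = n_series),
  units      = rpois(n_series * n_weeks, lambda = 50)
)

# Fit and forecast a single series. Returns NULL on failure so that
# one problematic series does not stop the whole batch.
forecast_one <- function(df) {
  df <- df[order(df$week), ]
  tryCatch({
    fit <- stats::arima(df$units, order = c(1, 0, 0))
    fc  <- stats::predict(fit, n.ahead = horizon)
    data.frame(
      product_id = df$product_id[1],
      step       = seq_len(horizon),
      forecast   = as.numeric(fc$pred)
    )
  }, error = function(e) NULL)
}

# One data frame per product.
series_list <- split(sales, sales$product_id)

# The series are independent, so lapply() could be swapped for
# parallel::mclapply() or parallel::parLapply() to use more cores.
forecasts <- do.call(rbind, lapply(series_list, forecast_one))
rownames(forecasts) <- NULL

head(forecasts)
```

Because each series is fitted on its own, the work splits cleanly into batches of product IDs, which is what makes running it en masse (for example, one batch per parameterized document) plausible.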


Thanks Max, I'll check them out.
