Advice for working with lots of data (hourly sales forecast)

I’m currently working on predicting hourly sales for a store with a lot of sections.
It’s more of a concept question than a code one:
what would be good predictors, and which algorithm would suit this case best?
Also, because I’m manipulating a huge amount of data, my RStudio session sometimes aborts.
Do I need a specific library to manipulate the data, or should I work in RStudio Cloud?
Any advice would be helpful here.

A rule of thumb is to start with 200 historical observations at whatever frequency is of interest, fit a model, and forecast only out to a horizon at which the confidence bands stay within realistic bounds (for example, the bands should not include negative numbers of calls in an hourly interval). Only then should you take a longer lookback to capture other seasonal patterns. You may find it advantageous to aggregate the data to daily, weekly or monthly frequencies. See Hyndman.
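
To make the two data-reduction ideas above concrete (keep only the most recent 200 observations, or aggregate hourly data to a coarser frequency), here is a minimal sketch in plain Python, used only for illustration. The function names `aggregate_daily` and `last_n` and the sample data are made up for this example; they do not come from any package.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def aggregate_daily(hourly):
    """Sum (timestamp, sales) pairs into one total per calendar day."""
    totals = defaultdict(float)
    for ts, sales in hourly:
        totals[ts.date()] += sales
    # Return (date, total) pairs in chronological order.
    return sorted(totals.items())


def last_n(series, n=200):
    """Keep only the most recent n observations of a time-ordered series."""
    return series[-n:]


# Hypothetical data: 30 days of hourly sales, 1 unit sold every hour.
start = datetime(2023, 1, 1)
hourly = [(start + timedelta(hours=h), 1.0) for h in range(30 * 24)]

daily = aggregate_daily(hourly)   # 30 daily totals instead of 720 hourly rows
recent = last_n(hourly, 200)      # the 200 most recent hourly observations
```

Either step shrinks the data before any model is fitted, which may also help with the session aborts mentioned in the question.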


There are two newer package families that you could consider. Both are capable of forecasting at scale.

The first is timetk and modeltime.

The second is the tidyverts, which has a ton of functionality for working with the tsibble, or time series tibble. This is my personal favorite workflow, especially for constructing forecast aggregations, but YMMV. The previous reply links to a text on forecasting that uses this workflow, written by some of the package authors/contributors.

The choice of algorithm depends on the structure of the data. ARIMA and ETS are the workhorses. Those and others should be available in both of the package groups mentioned above, as well as in others.
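
For intuition about what the ETS family does, its simplest member is simple exponential smoothing (no trend, no seasonality). The sketch below is a hand-rolled illustration in plain Python, not how any of the packages above implement it; the function name `simple_exp_smoothing`, the fixed `alpha` and the sample numbers are assumptions made for the example. Real hourly sales would almost certainly need seasonal terms, and the smoothing parameter would normally be estimated from the data.

```python
def simple_exp_smoothing(y, alpha):
    """Return the one-step-ahead forecast after the last observation.

    The level starts at the first observation and is updated as a
    weighted average of each new observation and the previous level.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = y[0]
    for obs in y[1:]:
        level = alpha * obs + (1 - alpha) * level
    return level


# Hypothetical sales for six consecutive hours.
sales = [12.0, 15.0, 11.0, 14.0, 13.0, 16.0]
forecast = simple_exp_smoothing(sales, alpha=0.3)
```

A larger `alpha` makes the forecast react faster to recent hours; a smaller one smooths more heavily.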

Predictors? No idea what data you have. For univariate forecasting, consider hour of day, day of week, and the other basic seasonal indicators one could construct; there are probably a few, with some overlap.
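
As a rough illustration of the calendar predictors mentioned above, the following plain-Python sketch derives them from a timestamp. The function name `calendar_features` and the particular set of features (including the weekend flag and month) are assumptions for the example; which ones actually help depends on the data.

```python
from datetime import datetime


def calendar_features(ts):
    """Build simple seasonal indicators from a single timestamp."""
    return {
        "hour": ts.hour,                  # 0-23, captures the within-day pattern
        "day_of_week": ts.weekday(),      # Monday = 0, Sunday = 6
        "is_weekend": ts.weekday() >= 5,  # overlaps with day_of_week
        "month": ts.month,                # 1-12, captures the yearly pattern
    }


# Hypothetical observation: Sunday 1 January 2023 at 14:00.
features = calendar_features(datetime(2023, 1, 1, 14))
```

The weekend flag is an example of the overlap noted above: it carries no information beyond day of week, but it can be a more compact predictor.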
