Hyndman's recipe

```
library(forecast)
retail <- read.csv("https://robjhyndman.com/data/ausretail.csv",header=FALSE)
retail <- ts(retail[,-1],f=12,s=1982+3/12)
ns <- ncol(retail)
h <- 24
fcast <- matrix(NA,nrow=h,ncol=ns)
for(i in 1:ns)
fcast[,i] <- forecast(retail[,i],h=h)$mean
write(t(fcast),file="retailfcasts.csv",sep=",",ncol=ncol(fcast))
```

requires understanding several steps. Since the 2013 post, new tools have appeared that make this easier.

Every `R` problem can be thought of, with advantage, as the interaction of three objects: an existing object, `x`, a desired object, `y`, and a function, `f`, that will return a value of `y` given `x` as an argument.

f(x) = y

Any or all of these three objects (in `R`, *everything* is an object) may contain other objects, including functions. We say that functions are composable, as in f(g(x)).
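A minimal base-R illustration of composition (the names `f` and `g` are placeholders, not part of the forecasting problem):

```
g <- function(x) x^2   # inner function runs first
f <- function(x) x + 1 # outer function consumes its result

f(g(3))  # 10
```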

In this case, `x` is a composite of the 2,000 products and their respective 36-element time series. `y` is a composite of 2,000 time series models, perhaps including the results of forecasting against a held-out portion of the data.

Before attempting to do anything 2,000 times at once, it is preferable to design a function `f` to do it once.

This begins with extracting an object for a single SKU from `x`. Each SKU is in its own row, a vector containing a (presumably character) SKU identifier and a numeric vector of length 36. As toy data:

```
SKU <- "blue towel"
dat <- 1:36
```

(`dat` in preference to `data` because the latter is a built-in function, and some operations give precedence to the latter.)

Assuming that `x` is in the global environment as a data frame `SKU`:

```
sku <- SKU[1, 2:37]
```

The subset operator is row first, column second. The lowercase name is intentional: it can be reused for every other row, since only one row at a time is involved.

This is the first opportunity to make a function

```
pick_one <- function(i) SKU[i, 2:37]  # i is the row index
```

and this revises the corresponding line above

```
sku <- pick_one(1)
```
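To exercise `pick_one()` end to end, here is a toy `SKU` data frame; the two SKU names and the filler values are invented for illustration:

```
# two SKUs, each with an id plus 36 monthly observations
SKU <- data.frame(id = c("blue towel", "red towel"),
                  matrix(seq_len(72), nrow = 2))

pick_one <- function(i) SKU[i, 2:37]

sku <- pick_one(1)
length(sku)  # 36
```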

Next, a function to convert `dat` into a time series object.

```
mk_ts <- function() ts(dat, start = c(2018,1), frequency = 12)
```

In a non-toy example, there would here be some error trapping for gaps, etc.

```
series <- mk_ts()
```
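One way to sketch that error trapping, assuming 36 complete monthly observations are expected (this variant takes the data as an argument, and the checks are illustrative, not exhaustive):

```
dat <- 1:36  # toy data, as above

mk_ts_checked <- function(x) {
  if (any(is.na(x)))   stop("series contains missing values")
  if (length(x) != 36L) stop("expected 36 monthly observations")
  ts(x, start = c(2018, 1), frequency = 12)
}

series <- mk_ts_checked(dat)
```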

The next step is to model `series` with one or more of the baseline models: `MEAN`, `NAIVE`, `SNAIVE`, and `RW` (drift is specified as `RW(y ~ drift())`). These come from {fable}, which loading {fpp3} attaches. The reason for that package will become obvious when it comes to running multiple models in one go:

```
# model() expects a tsibble; as_tsibble() on a ts names the measured column "value"
mk_model <- function() as_tsibble(series) %>% model(Naive = NAIVE(value))
the_model <- mk_model()
```

This requires {dplyr} or {magrittr} to be loaded for the `%>%` pipe operator; loading {fpp3} already attaches {dplyr}.
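The payoff of moving to {fpp3} is that several baseline models fit in a single `model()` call, and one `forecast()` covers them all. A sketch, using toy data in place of a real series (the model labels on the left are arbitrary names):

```
library(fpp3)  # attaches {fable}, {tsibble}, {feasts}, {dplyr}, ...

set.seed(1)
series <- ts(rnorm(36), start = c(2018, 1), frequency = 12)  # toy stand-in

fits <- as_tsibble(series) %>%
  model(Naive  = NAIVE(value),
        Mean   = MEAN(value),
        Drift  = RW(value ~ drift()),
        SNaive = SNAIVE(value))

fc <- fits %>% forecast(h = 12)  # one call forecasts every model
```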

Between `mk_ts()` and `mk_model()`, there should, of course, be diagnostics for autocorrelation and trend, which will have a bearing on the choice of model, and program-flow logic based on those results. Also consider inflation adjustments, if applicable, as well as log transformation or differencing. The best way to understand these is to work thoroughly through the examples in Hyndman's book for a single SKU, and then pick another few at random.
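A few of those diagnostics can be sketched with {feasts} (attached by {fpp3}); which features matter, and what to do about them, remains a judgment call:

```
library(fpp3)

set.seed(1)
tsb <- as_tsibble(ts(rnorm(36), start = c(2018, 1), frequency = 12))  # toy

tsb %>% features(value, feat_stl)         # trend and seasonal strength
tsb %>% features(value, unitroot_ndiffs)  # differences needed for stationarity
tsb %>% ACF(value)                        # autocorrelation function
```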

Given a model, the residual diagnostics must be considered. See Hyndman § 5.7.
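In {fpp3} terms, the standard residual checks look like this; a sketch, with toy data standing in for `the_model` built above:

```
library(fpp3)

set.seed(1)
series <- ts(rnorm(36), start = c(2018, 1), frequency = 12)  # toy
the_model <- as_tsibble(series) %>% model(Naive = NAIVE(value))

the_model %>% gg_tsresiduals()           # residual plot, ACF, histogram
augment(the_model) %>%
  features(.innov, ljung_box, lag = 10)  # portmanteau test on residuals
```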

When this is complete, thought should be given as to what model features should be captured and the appropriate object to contain them.

There is no golden road to forecasting even a single series. Although there are tests, not all of them should be automated, because their application may require judgment. Do not attempt to fully automate even model creation without an informed understanding of all the steps required to make forecasts that can be defended.
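Once the single-SKU workflow is understood and defensible, the scaled-up version is short, because `model()` fits every key group in one pass. A sketch, assuming a long-format tsibble with columns `sku`, `month`, and `qty` (all invented names, with three toy SKUs standing in for the 2,000):

```
library(fpp3)

set.seed(1)
# toy long-format data: 3 SKUs x 36 months
sales <- expand.grid(sku = c("a", "b", "c"), m = 1:36) %>%
  mutate(month = yearmonth("2018 Jan") + m - 1,
         qty   = rnorm(n(), mean = 100, sd = 10)) %>%
  select(sku, month, qty) %>%
  as_tsibble(index = month, key = sku)

# one model() call per key (SKU); one forecast() call for everything
fc <- sales %>%
  model(Naive  = NAIVE(qty),
        SNaive = SNAIVE(qty)) %>%
  forecast(h = 24)
```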