My goal to understand the `forecast::forecast`

function's inner workings and manually verify the forecast calculations. To achieve this, I first generate data from an AR(2) process.

### Generate data

```
set.seed(123)
# Generate data
y = vector(length = 48)
y[1] = 10
y[2] = 10
for (i in 3:48) {
y[i] = 5 + 1.0*y[i-1] - 0.70*y[i-2] + rnorm(n = 1, sd = 1)
}
# Save it as a time-series object
ts <- as.ts(y, frequency = 1)
# Check data and its ACF (if desired)
# par(mfrow = c(1,2))
# plot(ts, main = "Time series Y", ylab = "Values")
# Acf(ts, main = "ACF for time series Y")
# par(mfrow=c(1,1))
```

### ARIMA modeling

To model the data, I applied `forecast::auto.arima`

, which correctly fitted an AR(2) process with \phi_1 and \phi_2 values of 0.9380345 and -0.7773760, respectively. Additionally, an intercept y_0 of 7.3359679 was included. It is important to note that I used only half of the data points to fit the model.

```
# Fit ARIMA model and check model
fit <- auto.arima(y = ts[1:24], seasonal = FALSE)
summary(fit)
plot(fit)
```

The final model is

where \epsilon is assumed to be a sampled innovation with known mean and variance.

### Forecasting

Using the model `fit`

, I forecasted 24 steps ahead. For this analysis, I want to verify the first forecast. Assuming t = 24, the first one-step-ahead forecast is y_{25}. The code to obtain the forecasts is as follows:

```
# One-step-ahead forecast
fit_forecast = forecast(fit, h = 24)
plot(fit_forecast)
```

Adapting the equation above to this forecast at hand, we have

To calculate this initial forecast, I assume the `forecast`

function relies on the x values within the model `fit`

, i.e., `fit$x`

. These values represent the data on which the model was trained. Looking up y_{24} and y_{23}, I find their respective values are 6.049662 and 5.779341. Therefore, to manually calculate y_{25}, I should use the formula:

When I check the forecasts in `fit_forecast$mean`

, I see that the first value, y_{25}, is 7.339453. To manually verify this value, I need to know the value sampled for \epsilon_{t+1}. However, I couldn't find this value anywhere in `fit_forecast`

.

One approach is to calculate \epsilon_{t+1} as the difference between the forecasted y_{25} and the expected value without the error term, 8.518039. This method seems counter-intuitive. I assume that `forecast::forecast`

must have sampled a value for \epsilon_{t+1}, but it does not seem to keep a history of it.

If the mean of the innovation is zero, we would have

I did inspect `forecast$fitted`

, but the first value is 8.576613, different from `fit_forecast$mean[1]`

.

**Does forecast::forecast not provide a history of sampled innovations or have I missed something?**