I want to estimate ARIMA models and use them to make predictions following two different paradigms.

One I will call **self-feeding** and the other **data-dependent**. The **self-feeding** prediction, which I know how to implement, feeds the predictions back into the model without the need to rely on past data, except for the first $\max\{p, q\}$ values, where $p$ is the autoregressive order and $q$ is the moving-average order. The **data-dependent** prediction, which I am not sure how to implement, does not feed the model its own output. Instead, the model keeps being fed the data the user has.

The **self-feeding** paradigm allows the forecasts to "roam freer" whereas the **data-dependent** paradigm has a more corrective nature. In fact, one can think of the **data-dependent** paradigm as a way to keep the forecasts informed at each one-step-ahead prediction.
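To make the contrast concrete before getting to ARIMA, here is a toy sketch with a hypothetical AR(1); the coefficient `phi` and the held-out values are made up purely for illustration:

```r
# Toy contrast of the two paradigms with a hypothetical AR(1):
# x_t = phi * x_{t-1}. All numbers here are made up for illustration.
phi <- 0.9
x_last <- 10                     # last training observation
new_data <- c(9.5, 8.7, 8.1)     # held-out observations the user has

# Self-feeding: each forecast builds on the previous *forecast*.
self_fed <- numeric(3)
prev <- x_last
for (i in 1:3) {
  prev <- phi * prev
  self_fed[i] <- prev
}
self_fed                         # approx. 9, 8.1, 7.29

# Data-dependent: each one-step forecast builds on the *observed* value.
data_dep <- phi * c(x_last, new_data[1:2])
data_dep                         # approx. 9, 8.55, 7.83
```

The two sequences agree on the first step and then drift apart, because the self-feeding forecasts never see the held-out data.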

Here's a bit of code to help drive the point home. In this first naïve example, I expect the self-feeding prediction to output a sequence that starts from the last training data point, in this case 30, and to follow closely from it, i.e., {31, 32, 33, ...}. I know how to do this with `forecast::forecast`.

```
# Generate some data
set.seed(123)
# Suppose I have a dataset D = {1, 2, 3, ..., 60}
# but choose to train the model on only the first half of it.
x = 1:30
uncertainty = runif(n = length(x))
y = x + uncertainty
```

Now, I fit a model and make a prediction 10 steps ahead. The output of `predictions$mean`, shown further below, is a forecast sequence that closely follows the results I expected.

```
library(forecast)
fit <- forecast::auto.arima(y, seasonal = FALSE)
```

Let's inspect the model. As one can see, the model has a drift, as expected, and two autoregressive coefficients.

```
summary(fit)
# Series: y
# ARIMA(2,1,0) with drift
#
# Coefficients:
#           ar1      ar2   drift
#       -0.6302  -0.4995  0.9938
# s.e.   0.1617   0.1590  0.0282
#
# sigma^2 = 0.1102:  log likelihood = -7.97
# AIC=23.93   AICc=25.6   BIC=29.4
#
# Training set error measures:
#                       ME      RMSE       MAE       MPE     MAPE      MASE        ACF1
# Training set 0.006437989 0.3089803 0.2553583 0.4930526 2.514604 0.2566012 -0.05921137
```

Finally, let's make a **self-feeding** prediction.

```
predictions <- forecast(fit, h = 10)
predictions$mean
# Time Series:
# Start = 31
# End = 40
# Frequency = 1
# [1] 31.37578 32.28944 33.21643 34.29237
# [5] 35.26779 36.23214 37.25369 38.24472
# [9] 39.22641 40.22923
```

Now, suppose I try to use the model under the **data-dependent** paradigm. This time we don't want the model to be fed the predictions it outputs. Instead, I want to use the data I have but didn't use to train the model, {31, 32, ..., 60}. For the first prediction, I feed the model the data points 29 and 30 to get the one-step-ahead forecast, $t+1$, from the forecast horizon $T$:

$$\hat{x}_{t+1} = 30\,\phi_1 + 29\,\phi_2 + \eta,$$

where $\eta$ is a random realization of known mean and variance, and $\phi_1, \phi_2$ are the autoregressive coefficients. The following forecast is then fed the observed values 31 and 30,

$$\hat{x}_{t+2} = 31\,\phi_1 + 30\,\phi_2 + \eta,$$

and so on. Note that if we had used a **self-feeding** paradigm, the previous equation would be $\hat{x}_{t+2} = \hat{x}_{t+1}\,\phi_1 + 30\,\phi_2 + \eta$, where $\hat{x}_{t+1}$ is not necessarily equal to 31.
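For concreteness, here is a minimal sketch of the data-dependent loop I have in mind, computed by hand from the fitted coefficients. The `one_step` helper is something I made up; I also fix the specification to the ARIMA(2,1,0)-with-drift found above rather than re-running `auto.arima`, and I assume the full series carries noise throughout:

```r
library(forecast)

# Rebuild the data: D = {1, ..., 60} plus noise; train on the first half.
set.seed(123)
y_full  <- 1:60 + runif(60)
y_train <- y_full[1:30]

# Fix the specification to the ARIMA(2,1,0) with drift found above.
fit <- forecast::Arima(y_train, order = c(2, 1, 0), include.drift = TRUE)

ar <- unname(fit$coef[c("ar1", "ar2")])
mu <- unname(fit$coef["drift"])

one_step <- function(y) {
  # Hand-rolled one-step-ahead forecast for an ARIMA(2,1,0) with drift:
  # forecast the next first difference, then undo the differencing.
  w      <- diff(tail(y, 3))     # last two first differences
  w_next <- mu + ar[1] * (w[2] - mu) + ar[2] * (w[1] - mu)
  tail(y, 1) + w_next
}

# Data-dependent predictions: each step uses only *observed* data.
preds <- sapply(31:60, function(t) one_step(y_full[1:(t - 1)]))
```

Each `preds[i]` is the one-step-ahead forecast of `y_full[30 + i]` built only from observed data up to that point; under the self-feeding paradigm those forecasts would instead be chained into one another.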

Is there any function, either from a package or `base R`, that does this job? I know the function `predict` does it with the argument `newdata`, but I am not sure this would work for an ARIMA model because there is no independent variable, per se, to be fed into the model.