Simulate from an ARIMA Model with constant within a loop in fable in differenced-form

Hi,

If I was to simulate from an AR(1) model with constant, within a loop in fable I would do something like:

library(fpp3)

constant <- +150.0
phi1 <- +0.7
mean <- (constant/(1.0 - phi1))
mean # theoretical mean (or isolating behaviour) as you look at the graph

nburn <- 100
nobs <- 200
T <- nburn+nobs 

y <- vector()
y[1] <- 0
e <- rnorm(T, 0, 1)
for(i in 2:T)
  y[i] <- constant + phi1*y[i-1] + e[i]
sim <- tsibble(idx = seq_len(nburn+nobs), y = y, index = idx)
sim

p <- sim |> 
  slice((nburn+1):n()) |>
  autoplot(y) + 
  ggplot2::ggtitle("AR(|1|)\ny(t) = 150.0 + 0.7*y(t-1) + eps")
p

# fit ARIMA (as the model is written down)
fit <- sim |> slice((nburn+1):n()) |>
  model('sim'= ARIMA( y ~ 1 + pdq(1, 0, 0),
                      fixed = list(ar1 = NA, constant = NA), 
                      transform.pars = FALSE,
                      method = "CSS-ML", optim.control = list(maxit = 1000))
  )	
fit
report(fit)

How do I simulate model from an AR(1) model with constant and d=1, i.e. the equivalent

sim.ts.m2 <- arima.sim(list(order = c(1,1,0), ar = 0.7), n = 200) + mean
ts.plot(sim.ts.m2)

within a loop in fable? And I think with d=0 in arima.sim that would be the same as the sim model above.

thanks,
Amarjit

I suggest you use the gratis package:

library(gratis)
library(tsibble)
library(feasts)

constant <- 150.0
phi1 <- 0.7
mean <- (constant / (1.0 - phi1))

ar1 <- arima_model(p = 1, d = 0, q = 0, phi = phi1, constant = mean, sigma = 1)
generate(ar1, length = 200, nseries = 1) |>
  autoplot(value)


ari1 <- arima_model(p = 1, d = 1, q = 0, phi = phi1, constant = 0, sigma = 1)
generate(ari1, length = 200, nseries = 1) |>
  dplyr::mutate(value = value + mean) |>
  autoplot(value)

Created on 2024-07-24 with reprex v2.1.1

Thanks!

My point was that adding the mean onto arima.sim may produce an inaccurate simulation as the
mean <- (constant/(1.0 - phi1 - ... - phip)) calculation might produce rounding errors in higher order AR models, whereas within the loop version I have the constant as specified within the theoretical model.

I will use the methodology as in your examples using library(gratis).

However, to be clear with the following:

Is sim.ts.m2 correct?

  • fable::ARIMA, the constant and phi1 are estimated, and mean_est = constant/(1.0-phi1)
  • forecast::Arima, the drift and phi1 are estimated, and mean_est = drift+mean

Both are the same - are those correct?

  • stats::arima, I cannot include mean/intercept term, as it is ignored for differenced data, and the ar1 term is not the same as the others. So how to estimate with stats::arima?
# Simulate ARIMA(1,1,0) ARIMA-IN-DIFFS
# ------------------------------------

constant <- +150.0
phi1 <- +0.7
mean <- (constant/(1.0 - phi1))
mean # theoretical mean (or isolating behaviour) as you look at the graph


sim.ts.m2 <- arima.sim(list(order = c(1,1,0), ar = 0.7), n = 200) + mean
ts.plot(sim.ts.m2)


# Base R
sim.df <- as.data.frame(sim.ts.m2)
sim.df |>
  stats::arima( order=c(1,1,0),
                fixed = list(ar1 = NA), 
                transform.pars = FALSE,
                method = "CSS-ML", optim.control = list(maxit = 1000)
  )


# library(forecast)
f.fit <- sim.ts.m2 |>
  forecast::Arima( order=c(1,1,0), include.constant = TRUE,
                   fixed = list(ar1 = NA, drift = NA), 
                   transform.pars = FALSE,
                   method = "CSS-ML", optim.control = list(maxit = 1000)
  )
f.fit
mean_est <- f.fit$coef[[2]] + mean
mean_est


# library(fable)
library(fable)
tb <- as_tsibble(sim.ts.m2)
fit <- tb |>
  model('tb'= ARIMA( value ~ 1 + pdq(1, 1, 0),
                     fixed = list(ar1 = NA, constant = NA), 
                     transform.pars = FALSE,
                     method = "CSS-ML", optim.control = list(maxit = 1000))
  )	
fit
report(fit)

mean_est <- (((fit[[1]][[1]]$fit$par$estimate[2]) / (1.0 - fit[[1]][[1]]$fit$par$estimate[1]))) + mean
mean_est

There are two different parameterizations for ARIMA models in common use. See Estimate an ARIMA model — ARIMA • fable

Your sim.ts.m2 model does not include a drift coefficient, yet you try to estimate one (so its estimated value is close to 0). A drift term is equivalent to fitting a regression with a linear trend plus ARIMA(p,1,q) errors. See 10.4 Stochastic and deterministic trends | Forecasting: Principles and Practice (3rd ed)

The mean (or any added constant) is not estimable in an ARIMA model with d = 1 because it is differenced away.

Apologies - I should have looked at the gratis example more carefully, thus I have
added:

  • nothing onto sim.ts.m2
  • the mean onto the simulated series
  • the mean onto the estimated ar1 coeff
    Correct?
# Simulate ARIMA(1,1,0) with constant i.e. ARIMA-in-1st-diff
# ----------------------------------------------------------
constant <- +150.0
phi1 <- +0.7
mean <- (constant/(1.0 - phi1))
mean # theoretical mean (or isolating behaviour) as you look at the graph

sim.ts.m2 <- arima.sim(list(order = c(1,1,0), ar = 0.7), n = 200) # + 0
ts.plot(sim.ts.m2)
ts.plot(sim.ts.m2 + mean)

# Base R
sim.df <- as.data.frame(sim.ts.m2[2:201])
b.fit <- sim.df |>
  stats::arima( order=c(1,1,0),
         fixed = list(ar1 = NA), 
         transform.pars = FALSE,
         method = "CSS-ML", optim.control = list(maxit = 1000)
  )
b.fit
b.mean_est <- b.fit$coef[[1]] + mean
b.mean_est

# library(forecast)
f.fit <- sim.ts.m2[2:201] |>
  forecast::Arima( order=c(1,1,0),
                   fixed = list(ar1 = NA), 
                   transform.pars = FALSE,
                   method = "CSS-ML", optim.control = list(maxit = 1000)
  )
f.fit
f.mean.est <- f.fit$coef[[1]] + mean
f.mean.est

# library(fable)
library(fable)
tb <- as_tsibble(sim.ts.m2)
fit <- tb |>
  dplyr::slice(-1) |>
  model('tb'= ARIMA( value ~ 0 + pdq(1, 1, 0),
                     fixed = list(ar1 = NA), 
                     transform.pars = FALSE,
                     method = "CSS-ML", optim.control = list(maxit = 1000))
  )	
fit
report(fit)
mean_est <- fit[[1]][[1]]$fit$par$estimate[1] + mean
mean_est

Further, I can simulate an ARIMA(0,0,1) with constant as:

#===============================
# Simulate ARIMA(0,0,1) with constant ARIMA-IN-LEVELS
# ---------------------------------------------------
# y = 20 + e(t) + 0.8*e(t-1) 
# --------------------------

# R BURN-IN: arima.sim, n.start = NA, the default, a reasonable value is computed
# ---------
library(fpp3)

constant <- +20
theta1 <- +0.8
mean <- constant

ma1  <- arima.sim(model=list(ma=theta1), sd=sqrt(1), n=100) + constant
index <- c(1:100)
tb <- bind_cols(index=index, ma1=ma1) |>
  gather("Series", "value", -index) |>
  as_tsibble(index=index, key=Series)
tb

p1_R <- tb  |> autoplot(value) +
  ggplot2::ggtitle("MA(1)\ny = 20 + e(t) + 0.8*e(t-1)")


# fit ARIMA
fit <- tb |>
  model('tb'= ARIMA( value ~ 1 + pdq(0, 0, 1) + PDQ(0, 0, 0), 
                     method = "CSS-ML", optim.control = list(maxit = 1000))
  )	
fit
report(fit)

and in library(gratis) as:

#===============================
# Simulate ARIMA(0,0,1) ARIMA-IN-LEVELS
# -------------------------------------
# y = 20 + e(t) + 0.8*e(t-1) 
# --------------------------
library(fpp3)

constant <- +20
theta1 <- +0.8
mean <- constant

ma1 <- gratis::arima_model(p = 0, d = 0, q = 1,
  theta = theta1,
  constant = constant,
  sigma = 1
)

ma1_d0 <- gratis::generate(ma1, length = 100, nseries = 1)
ma1_d0


ma1_d0 |>
  autoplot(value)  +
  ggplot2::ggtitle("MA(1)\ny = 20 + e(t) + 0.8*e(t-1)")


# fit ARIMA (as the model is written down)
fit <- ma1_d0 |>
  model('ma1_d0_sim'= ARIMA( value ~ 1 + pdq(0, 0, 1),
                      fixed = list(ma1 = NA, constant = NA), 
                      transform.pars = FALSE,
                      method = "CSS-ML", optim.control = list(maxit = 1000))
  )	
fit
report(fit)

mean_est <- fit[[1]][[1]]$fit$par$estimate[2]
mean_est

Correct?

What about an
(i) ARIMA(0,1,1) with constant ?
(ii) ARIMA(1,0,1) with constant ?
(iii)ARIMA(1,1,1) with constant ?

[a]
I'll assume both the ARIMA(0,0,1) with constant examples are correct, except for mean_est calculations. Please let me know if not.

[b]

Yes, I think it's because e.g. for an AR(1):

Y(t) = c + phi1 * Y(t-1) + eps(t)
Y(t-1) = c + phi1 * Y(t-2) + eps(t-1)
and
Y(t) - Y(t-1) = 0 + phi1 * (Y(t-1)-Y(t-2)) + eps(t) - eps(t-1)
OR
d(Y(t)) = phi1 * (d(Y(t-1))) + d(eps(t))

[c]
Also, how do I write down the estimated simulated models in differenced form?

For example if I am simulating

# ARIMA(1,1,0)+c with parameters
constant <- +150.0
phi1 <- +0.7
mean <- (constant / (1.0 - phi1))

Do I write down the estimated model without the constant (even though the simulation equation has a constant)
d(Y(t)) = 0.7d(Y(t-1)) + d(eps(t))
OR in backshift notation
(1−0.7
B)y(t) = eps(t), where y(t)=Y(t)-Y(t-1), Y(t) is the original series

and similarly

# ARIMA(0,1,1)+c with parameters
constant <- +20
theta1 <- +0.8
mean <- constant

write down the estimated model without the constant (even though the simulation equation has a constant)
d(Y(t)) = 0.8*(eps(t-1)) + eps(t)
OR in backshift notation
y(t) = (1-0.8*B)*eps(t), where y(t)=Y(t)-Y(t-1), Y(t) is the original series

Is that the proper procedure for each of the simulations/estimated models?

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.