Simulating Time Series Data using arima.sim() and getting unexpected results

I would like to create an auto-correlated time series using an AR model. However, when I look at my simulated data, it appears that the lag effects are not having the impact I expect.

set.seed(123)
x <- arima.sim(model = list(ar=.9,.9,.9), n = 100) #autocorrelated
y <- arima.sim(model = list(ar=.01), n = 100) #not very autocorrelated
z<- arima.sim(model = list(ar=.9,-.1,.9,-.1), n =100) #weirdly autocorrelated
plot.ts(cbind(x,y,z))

Good! My time series seem strongly autocorrelated for x, very noisy/random for y, and somewhere in between for z.

par(mfrow=(c(1,3)))
acf(x)
acf(y)
acf(z)

I'm surprised here mostly by the output of the last graph. I have very weak lag terms in my AR model for the even-numbered lags, but the acf() function seems to be finding a temporal pattern very similar to that of my first 'x' time series.

par(mfrow=(c(1,3)))
pacf(x)
pacf(y)
pacf(z)

The pacf() function finds strong temporal correlation in the first year for time series 'x' and then finds almost none. I suspect this is because the first strong autocorrelation in AR(1) is soaking up so much of the variance between years that there is not any leftover for the second and third year lag effects to explain. Does anyone know if this is a reasonable way to think about this?

I tried to test this with my 'z' model, where I created strong correlation between the present year (t) and the previous year (t-1), weak correlation between the previous year (t-1) and the second previous year (t-2), and strong correlation again between the second previous year (t-2) and the third previous year (t-3). Yet pacf() still does not seem to notice a relationship past the first year.

Hi, and welcome!

Take a look at help(acf) and its first example and you will see similar behavior.

The default for lag.max is 10*log10(N/m), where N is the number of observations and m the number of series.
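As a rough check of that default (my reading of help(acf), so treat it as a sketch): with N = 100 observations and m = 1 series the formula gives 20, which is why the printed autocorrelations stop at lag 20. The variable names below are only for illustration, and the series is a plain AR(1) stand-in rather than one of the models from the question.

```r
# N and m match the simulations in this thread: 100 observations, 1 series
N <- 100
m <- 1
default_lag_max <- floor(10 * log10(N / m))
default_lag_max

# Illustrative AR(1) series, used only to inspect the lags acf() returns
set.seed(123)
x <- arima.sim(model = list(ar = 0.9), n = N)

acf_default <- acf(x, plot = FALSE)
max(acf_default$lag)   # largest lag computed when lag.max is left at its default

# Pass lag.max explicitly to look further out
acf_long <- acf(x, lag.max = 40, plot = FALSE)
length(acf_long$acf)   # lags 0 through 40
```

Setting lag.max by hand is mostly useful when you expect structure beyond the default window.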

It doesn't appear as if models x and z are all that much different.

set.seed(123)
x <- arima.sim(model = list(ar=.9,.9,.9), n = 100) #autocorrelated
y <- arima.sim(model = list(ar=.01), n = 100) #not very autocorrelated
z<- arima.sim(model = list(ar=.9,-.1,.9,-.1), n =100) #weirdly autocorrelated

xs <- acf(x)

zs <- acf(z)


xs
#> 
#> Autocorrelations of series 'x', by lag
#> 
#>      0      1      2      3      4      5      6      7      8      9 
#>  1.000  0.878  0.766  0.667  0.549  0.477  0.433  0.394  0.340  0.280 
#>     10     11     12     13     14     15     16     17     18     19 
#>  0.235  0.176  0.093  0.049  0.007 -0.030 -0.069 -0.084 -0.105 -0.118 
#>     20 
#> -0.108
zs
#> 
#> Autocorrelations of series 'z', by lag
#> 
#>      0      1      2      3      4      5      6      7      8      9 
#>  1.000  0.836  0.725  0.629  0.553  0.446  0.379  0.369  0.334  0.322 
#>     10     11     12     13     14     15     16     17     18     19 
#>  0.284  0.255  0.214  0.186  0.155  0.104  0.030  0.008 -0.006  0.002 
#>     20 
#>  0.007

Created on 2019-10-23 by the reprex package (v0.3.0)
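A likely reason x and z look so alike, offered as my reading of the code rather than something I have confirmed with the original poster: in list(ar=.9,.9,.9) only the first value is named ar, and the other two become unnamed list elements. As far as I can tell arima.sim() only reads model$ar, so both x and z would be simulated as AR(1) with coefficient 0.9. To pass several coefficients they need to be wrapped in c(). The sketch below checks this; the coefficients c(.5, .2, .1) are hypothetical values I picked because they are stationary, not values from the question.

```r
# Only the first value is named "ar"; the others are unnamed list elements
model_x <- list(ar = .9, .9, .9)
names(model_x)        # "ar" "" ""
length(model_x$ar)    # 1, i.e. a single AR coefficient

# With the same seed, the call from the question and a plain AR(1) can be compared
set.seed(1)
a <- arima.sim(model = list(ar = .9, .9, .9), n = 100)
set.seed(1)
b <- arima.sim(model = list(ar = .9), n = 100)
same_series <- isTRUE(all.equal(as.numeric(a), as.numeric(b)))
same_series

# Wrapping the values in c() gives a true AR(3), but these particular
# coefficients are not stationary, so arima.sim() stops with an error
err <- tryCatch(
  arima.sim(model = list(ar = c(.9, .9, .9)), n = 100),
  error = function(e) conditionMessage(e)
)
err

# A stationary AR(3) with hypothetical coefficients (absolute values sum to < 1)
set.seed(1)
w <- arima.sim(model = list(ar = c(.5, .2, .1)), n = 100)
```

The same applies to list(ar=.9,-.1,.9,-.1): written with c(), those four coefficients also describe a non-stationary process, so a different set would be needed to get the "weirdly autocorrelated" series.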

I suggest trying all the examples on both that page and plot.acf {stats} to see if they change your intuition. (Not that it's necessarily wrong, I've been away from ts for a few years.)
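Along the same lines, it may help to compare the sample plots with the theoretical values, which ARMAacf() in stats can compute. This is only a sketch: the AR(3) coefficients are hypothetical stationary values chosen for illustration. For a pure AR(p) process the theoretical PACF is zero beyond lag p, so a PACF with one spike at lag 1 is what an AR(1) should produce; it is not a matter of the first lag soaking up variance.

```r
# Theoretical PACF of an AR(1) with coefficient 0.9
pacf_ar1 <- ARMAacf(ar = 0.9, lag.max = 6, pacf = TRUE)
round(pacf_ar1, 3)   # spike at lag 1, essentially zero afterwards

# Theoretical PACF of a stationary AR(3); coefficients are hypothetical
phi <- c(0.5, 0.2, 0.1)
pacf_ar3 <- ARMAacf(ar = phi, lag.max = 6, pacf = TRUE)
round(pacf_ar3, 3)   # non-zero through lag 3, essentially zero afterwards

# Theoretical ACF of the AR(1): geometric decay, 0.9^k at lag k
acf_ar1 <- ARMAacf(ar = 0.9, lag.max = 6)
round(acf_ar1, 3)
```

The slow decay in the ACF alongside a single PACF spike is the usual signature of an AR(1), which is consistent with what acf() and pacf() show for x and z.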
