Class Web Page



1. Effect of \(\bigtriangledown_s\) on Trend


a) Linear Trend

Suppose \(s=12\), and your observation \(Y_t\) has linear trend, as well as the seasonality with period \(s=12\). \[ Y_t = a+b t + S_t + X_t. \] Then

\[\begin{align*} \bigtriangledown_s Y_{t} &= \hspace{3mm} Y_{t}-Y_{t-12}\\\\ &= \hspace{3mm} a+b (t) + S_{t} + X_{t} \\\\ & \hspace{10mm} - \Big( \hspace{3mm} a+b (t-12) \hspace{5mm} + S_{t-12} \hspace{5mm} + X_{t-12} \Big) \\\\ &= \hspace{3mm} b (12) + X_{t} - X_{t-12}\\ \end{align*}\]

This is statonary series with constant \(b(12)\). (called drift by auto.arima())

Quadratic Trend

If your observation \(Y_t\) has quadratic trend, as well as the seasonality. \[ Y_t = a+b t + c t^2 + S_t + X_t. \] Then, \[\begin{align*} \bigtriangledown_s Y_{t} &= \hspace{3mm} Y_{t}-Y_{t-12} \\\\ &= \hspace{3mm} a+b (t) + c(t)^2 + S_{t} + X_{t} \\\\ &\hspace{10mm} - \Big( \hspace{3mm} a+b (t-12) + c(t-12)^2 + S_{t-12} + X_{t-12} \Big) \\\\ &= \hspace{3mm} b (12) + c(24 t) + c(12^2) + X_{t} - X_{t-12}\\ \end{align*}\]


This is statonary series with linear trend \(b(12) + c(12^2) + c(24) t\).


If you take \(\bigtriangledown\) again, who will be left?

  library(forecast)
  source('https://nmimoto.github.io/R/TS-00.txt')

  #--- Linear trend only
  X = 2+5*(1:200)

  layout(matrix(1:2, 1,2))
  plot(X, type="o")
  plot(diff(X, 12), type="o")

  #--- Quadratic trend only
  X = 2+3*(1:200)^2
  layout(matrix(1:3,1,3))
  plot(X, type="o")
  plot(diff(X, 12), type="o")
  plot(diff(diff(X, 12)), type="o")


b) Linear Trend plus WN

Suppose \(s=12\), and your observation \(Y_t\) has linear trend, \[ Y_t = a+b t + S_t + \epsilon_t \hspace{10mm} \mbox{ where } \epsilon_t \sim N(0,1) \]

Taking \(\bigtriangledown_{12}\) yields, \[\begin{align*} Y_{t+12}-Y_{t} &= \hspace{3mm} a+b (t) + S_{t} + e_{t} \\\\ &- ( a+b (t-12) \hspace{15mm} + S_{t-12} \hspace{5mm} + e_{t-12} ) \\\\ &= \hspace{3mm} \hspace{10mm} b(12) + e_{t} - e_{t-12} \end{align*}\]

What kind of process is this? \[ K_t \hspace{3mm} = \hspace{3mm} e_{t} - e_{t-12} \]

Linar regression with Seasonal average will be a better approach.

Ex. Linear Trend plus WN

  library(forecast)
  source('https://nmimoto.github.io/R/TS-00.txt')

  #--- Linear trend + WN
  set.seed(3462)
  X = 2+.5*(1:200) + rnorm(200,0,5)

  layout(matrix(1:2,1,2))
  plot(X, type="o")

  plot(diff(X, 12), type="o")


Set frequency!

  X = ts(X, start=c(1,1), freq=12)           # set freq=12

  Fit2 = auto.arima(X, D=1, stepwise=FALSE, approximation=FALSE)
  Fit2
## Series: X 
## ARIMA(0,0,0)(2,1,0)[12] with drift 
## 
## Coefficients:
##          sar1     sar2   drift
##       -0.6780  -0.2089  0.5142
## s.e.   0.0719   0.0747  0.0181
## 
## sigma^2 = 29.7:  log likelihood = -586.82
## AIC=1181.63   AICc=1181.85   BIC=1194.58
  Randomness.tests(Fit2$resid)

##   B-L test H0: the series is uncorrelated
##   M-L test H0: the square of the series is uncorrelated
##   J-B test H0: the series came from Normal distribution
##   SD         : Standard Deviation of the series
##       BL15 BL20  BL25  ML15  ML20    JB    SD
## [1,] 0.303 0.11 0.078 0.049 0.098 0.273 5.254


Check for sign of (seasonal) over-differencing

  Fit3 = Arima(X, order=c(0,0,0), seasonal=c(0,1,1))
  Fit3
## Series: X 
## ARIMA(0,0,0)(0,1,1)[12] 
## 
## Coefficients:
##         sma1
##       0.0767
## s.e.  0.0542
## 
## sigma^2 = 82.46:  log likelihood = -681.05
## AIC=1366.1   AICc=1366.17   BIC=1372.57

No sign of over-differencing

Don’t forget the drift!

  Fit4 = Arima(X, order=c(0,0,0), seasonal=c(0,1,1), include.drift=TRUE)
  Fit4
## Series: X 
## ARIMA(0,0,0)(0,1,1)[12] with drift 
## 
## Coefficients:
##          sma1   drift
##       -1.0000  0.5090
## s.e.   0.0933  0.0057
## 
## sigma^2 = 21.79:  log likelihood = -572.31
## AIC=1150.61   AICc=1150.74   BIC=1160.32

Now there’s clear sign of (seasonal) over-differencing.

c) Random Trend

Suppose \(s=12\), and your observation \(Y_t\) has random walk without drift as trend, together with the seasonality. \[ Y_t = W_t + S_t + X_t. \] Where \[ W_t \hspace{3mm} = \hspace{3mm} \sum_{i=1}^t e_i \hspace{10mm} e_i \sim_{iid} N(0,\sigma^2). \]

If you take seasonal difference, \[\begin{align*} \bigtriangledown_{12} Y_{t} &= \hspace{3mm} W_t \hspace{5mm} + S_t \hspace{5mm} + X_t \\ \\ &- \hspace{3mm} W_{t-12} - S_{t-12} - X_{t-12} \\ \\ &= \hspace{3mm} (W_t - W_{t-12}) + (X_t - X_{t-12}) \\ \end{align*}\]

The first part, \[ W_{t} - W_{t-12} \hspace{3mm} = \hspace{3mm} \sum_{i=0}^{t-1} e_{t-i} - \sum_{i=12}^{t-1} e_{t-i} \\ \hspace{3mm} = \hspace{3mm} \sum_{i=0}^{11} e_{t-i} \hspace{10mm} \sim N(0,12\sigma^2) \]

This is MA(11) with unit root. (non-invertible)

Ex. Random Trend w/o Drift

  library(forecast)
  source('https://nmimoto.github.io/R/TS-00.txt')

  #--- RW without drift ---
  set.seed(5351)
  RW1 = cumsum(rnorm(200, 0, 1))
  layout(matrix(1:2, 1,2))
  plot(RW1, type="l")
  plot(diff(RW1, 12), type="o")

  RW2 = ts(RW1, start=c(1,1), freq=12)
  auto.arima(RW2, D=1)
## Series: RW2 
## ARIMA(1,0,0)(1,1,0)[12] with drift 
## 
## Coefficients:
##          ar1     sar1    drift
##       0.9458  -0.5853  -0.0473
## s.e.  0.0224   0.0590   0.0751
## 
## sigma^2 = 1.301:  log likelihood = -293.38
## AIC=594.75   AICc=594.97   BIC=607.7
  acf(diff(RW2, 12))
  pacf(diff(RW2, 12))

  Arima(RW2, order=c(0,0,12), seasonal=c(0,1,0))
## Series: RW2 
## ARIMA(0,0,12)(0,1,0)[12] 
## 
## Coefficients:
##          ma1     ma2     ma3     ma4     ma5     ma6     ma7     ma8
##       0.9835  0.9878  1.0476  1.0295  1.0445  1.0440  1.0296  1.0460
## s.e.  0.0877  0.0926  0.1073  0.1188  0.1250  0.1207  0.1203  0.1103
##          ma9    ma10    ma11     ma12
##       0.9855  0.9827  0.9643  -0.0351
## s.e.  0.1063  0.0963  0.1055   0.0823
## 
## sigma^2 = 0.9189:  log likelihood = -269.31
## AIC=564.62   AICc=566.71   BIC=606.69
  Arima(RW2, order=c(0,0,12), seasonal=c(0,1,0), include.drift=TRUE)
## Series: RW2 
## ARIMA(0,0,12)(0,1,0)[12] with drift 
## 
## Coefficients:
##          ma1     ma2     ma3     ma4     ma5     ma6     ma7     ma8
##       0.9804  0.9841  1.0436  1.0251  1.0400  1.0394  1.0251  1.0419
## s.e.  0.0877  0.0927  0.1070  0.1184  0.1244  0.1202  0.1198  0.1100
##          ma9    ma10    ma11     ma12    drift
##       0.9816  0.9795  0.9610  -0.0382  -0.0471
## s.e.  0.1061  0.0964  0.1055   0.0826   0.0660
## 
## sigma^2 = 0.9217:  log likelihood = -269.06
## AIC=566.11   AICc=568.54   BIC=611.42


Ex. Random Trend w Drift

  #--- RW with drift ---
  set.seed(5351)
  RW.d = ts(cumsum(rnorm(200, .5, 1)), start=c(1,1), freq=12)
  plot(RW.d, type="l")

  # Take seasonal difference
  plot(diff(RW.d, 12), type="o")

  auto.arima(RW.d, D=1)    #- picks up the drift (.5)*(12)
## Series: RW.d 
## ARIMA(1,0,0)(1,1,0)[12] with drift 
## 
## Coefficients:
##          ar1     sar1   drift
##       0.9458  -0.5853  0.4527
## s.e.  0.0224   0.0590  0.0751
## 
## sigma^2 = 1.301:  log likelihood = -293.38
## AIC=594.75   AICc=594.97   BIC=607.7
  Arima(RW.d, order=c(0,0,12), seasonal=c(0,1,0) )
## Series: RW.d 
## ARIMA(0,0,12)(0,1,0)[12] 
## 
## Coefficients:
##          ma1     ma2     ma3     ma4     ma5     ma6     ma7     ma8
##       1.1366  1.1777  1.2771  1.2982  1.3251  1.3280  1.3023  1.2888
## s.e.  0.1262  0.0954  0.1498  0.1316  0.1749  0.1423  0.1709  0.1286
##          ma9    ma10    ma11    ma12
##       1.1925  1.1437  1.1138  0.1110
## s.e.  0.1524  0.0976  0.1399  0.0733
## 
## sigma^2 = 1.099:  log likelihood = -286.3
## AIC=598.61   AICc=600.7   BIC=640.68
  Arima(RW.d, order=c(0,0,12), seasonal=c(0,1,0), include.drift=TRUE)
## Series: RW.d 
## ARIMA(0,0,12)(0,1,0)[12] with drift 
## 
## Coefficients:
##          ma1     ma2     ma3     ma4     ma5     ma6     ma7     ma8
##       0.9804  0.9841  1.0436  1.0251  1.0400  1.0395  1.0251  1.0419
## s.e.  0.0877  0.0927  0.1070  0.1184  0.1244  0.1202  0.1198  0.1100
##          ma9    ma10    ma11     ma12   drift
##       0.9816  0.9795  0.9610  -0.0382  0.4529
## s.e.  0.1061  0.0964  0.1055   0.0826  0.0660
## 
## sigma^2 = 0.9217:  log likelihood = -269.06
## AIC=566.11   AICc=568.54   BIC=611.42



2. Forecasting sARMA

Suppose we have ARIMA\((1,0,1) \times (1,1,0)_{12}\) model, \[ (1-\phi_1 B) (1-\Phi_1 B^{12}) \bigtriangledown_{12} Y_t = (1-\theta_1 B) \epsilon_t \]

This is same as modeling \(\bigtriangledown_{12} Y_t\) with ARMA(13,1), \[ X_t \hspace{3mm} = \hspace{3mm} \bigtriangledown_{12} Y_t \\ \Big(1-\phi_1 B - \Phi_1 B^{12} + \phi_1 \Phi_1 B^{13} \Big) X_t \hspace{3mm} = \hspace{3mm} (1-\theta_1 B) \epsilon_t \]

We know how to get a prediction \(\hat X(h)\) for ARMA(\(p,q\)), Our predictor for \(Y_t\) will be \[\begin{align*} Y_{n+1} &= \hspace{3mm} Y_{n+1-12} + \hat X(1) \\ Y_{n+2} &= \hspace{3mm} Y_{n+1-12} + \hat X(2) \\ &\vdots \end{align*}\]

3. Tests for Seasonality

How do you check if you need to take a seasonal difference, instead of regular difference?

OCSB test (\(H_0\): Seasonal Unit Root Exists)

  • Default method in auto.arima()

  • Osborn-Chui-Smith-Birchenhall (1988)

CH test (\(H_0\): Deterministic Seasonality Exists)

  • Canova-Hansen (1995)

Fore more detail: http://robjhyndman.com/hyndsight/forecast3/


OCSB test

Osborn-Chui-Smith-Birchenhall (1988) test for

Default choice in for choosing value of \(D\). \[ \left\{ \begin{array}{ll} H_0: \mbox{ Seasonal Unit Root Exists}\\ H_A: H_0 \mbox{ is false} \\ \end{array} \right. \]

Small p-value means “Take Seasonal Difference”.

4. Warning about drift vs XREG in Arima()

set.seed(9823)
X = 2+5*(1:200) + rnorm(200,0,5)
X = ts(X, start=1, freq=12)
time(X)
##          Jan       Feb       Mar       Apr       May       Jun
## 1   1.000000  1.083333  1.166667  1.250000  1.333333  1.416667
## 2   2.000000  2.083333  2.166667  2.250000  2.333333  2.416667
## 3   3.000000  3.083333  3.166667  3.250000  3.333333  3.416667
## 4   4.000000  4.083333  4.166667  4.250000  4.333333  4.416667
## 5   5.000000  5.083333  5.166667  5.250000  5.333333  5.416667
## 6   6.000000  6.083333  6.166667  6.250000  6.333333  6.416667
## 7   7.000000  7.083333  7.166667  7.250000  7.333333  7.416667
## 8   8.000000  8.083333  8.166667  8.250000  8.333333  8.416667
## 9   9.000000  9.083333  9.166667  9.250000  9.333333  9.416667
## 10 10.000000 10.083333 10.166667 10.250000 10.333333 10.416667
## 11 11.000000 11.083333 11.166667 11.250000 11.333333 11.416667
## 12 12.000000 12.083333 12.166667 12.250000 12.333333 12.416667
## 13 13.000000 13.083333 13.166667 13.250000 13.333333 13.416667
## 14 14.000000 14.083333 14.166667 14.250000 14.333333 14.416667
## 15 15.000000 15.083333 15.166667 15.250000 15.333333 15.416667
## 16 16.000000 16.083333 16.166667 16.250000 16.333333 16.416667
## 17 17.000000 17.083333 17.166667 17.250000 17.333333 17.416667
##          Jul       Aug       Sep       Oct       Nov       Dec
## 1   1.500000  1.583333  1.666667  1.750000  1.833333  1.916667
## 2   2.500000  2.583333  2.666667  2.750000  2.833333  2.916667
## 3   3.500000  3.583333  3.666667  3.750000  3.833333  3.916667
## 4   4.500000  4.583333  4.666667  4.750000  4.833333  4.916667
## 5   5.500000  5.583333  5.666667  5.750000  5.833333  5.916667
## 6   6.500000  6.583333  6.666667  6.750000  6.833333  6.916667
## 7   7.500000  7.583333  7.666667  7.750000  7.833333  7.916667
## 8   8.500000  8.583333  8.666667  8.750000  8.833333  8.916667
## 9   9.500000  9.583333  9.666667  9.750000  9.833333  9.916667
## 10 10.500000 10.583333 10.666667 10.750000 10.833333 10.916667
## 11 11.500000 11.583333 11.666667 11.750000 11.833333 11.916667
## 12 12.500000 12.583333 12.666667 12.750000 12.833333 12.916667
## 13 13.500000 13.583333 13.666667 13.750000 13.833333 13.916667
## 14 14.500000 14.583333 14.666667 14.750000 14.833333 14.916667
## 15 15.500000 15.583333 15.666667 15.750000 15.833333 15.916667
## 16 16.500000 16.583333 16.666667 16.750000 16.833333 16.916667
## 17 17.500000 17.583333
plot(X)

lm(X ~ time(X))  # slope b is 60
## 
## Call:
## lm(formula = X ~ time(X))
## 
## Coefficients:
## (Intercept)      time(X)  
##      -53.55        60.07
Arima(X, order=c(1,0,1), xreg=time(X))  # xreg tells you b is 60
## Series: X 
## Regression with ARIMA(1,0,1) errors 
## 
## Coefficients:
##           ar1     ma1  intercept     xreg
##       -0.6822  0.5770   -53.5502  60.0739
## s.e.   0.2507  0.2791     0.6324   0.0604
## 
## sigma^2 = 19.61:  log likelihood = -579.39
## AIC=1168.78   AICc=1169.09   BIC=1185.27
mean( diff(X, 12) )  # supposed to be b(12) it's actually b(12)(1/12)
## [1] 59.96646
Arima(X, order=c(2,0,0), seasonal=c(0,1,0), include.drift=TRUE)
## Series: X 
## ARIMA(2,0,0)(0,1,0)[12] with drift 
## 
## Coefficients:
##           ar1     ar2   drift
##       -0.1085  0.1250  4.9972
## s.e.   0.0722  0.0721  0.0377
## 
## sigma^2 = 37.86:  log likelihood = -606.85
## AIC=1221.7   AICc=1221.92   BIC=1234.65
           # They actually tell you b/12 instead of b.



Summary

  • If you take \(\bigtriangledown_{12}\) of Linear trend + Seasonality, it eliminates the seasonality, and leaves constant \(12 (slope)\).

  • But if you do the same with (Linear trend + Sesonality + WN), then it turns into 12(slope) + (non-invertible sMA(1)). Which is troublesome. If this is the case, force fit sMA(1) will produce \(\Theta_1\) that is close to 1.

  • If you have (Linear Trend + Seasonality + WN), then it should be modeled by regression and seasonal average, not by \(\bigtriangledown_{12}\).

  • If you take \(\bigtriangledown_{12}\) of (Random Trend + Seasonality), then it leaves 12(drift) + (non-invertible MA(11)). Force fit MA(11) to find out.