5b: Box-Jenkins Differencing Method

1. Box-Jenkins Differencing Method

This is called ARIMA modeling.

a) Backward and Difference Operator

Recall the backward operator $B$, \[ B \, X_t = X_{t-1}. \]

Then define difference operator $\bigtriangledown$ (del), \[ \bigtriangledown = (1-B). \]

For example, \[ \bigtriangledown X_t = (1-B)X_t = X_t - X_{t-1} \\ \\ \bigtriangledown^2 X_t = (1-B)(1-B) X_t = X_t - 2X_{t-1} + X_{t-2} \]

b) Differencing Linear Trend

Suppose your time series have a line plus a zero-mean staionary noise. \[ Y_t = a + bt + X_t \] That means \[ Y_{t-1} = a + b(t-1) + X_{t-1} \\ \\ \bigtriangledown Y_t = Y_t - Y_{t-1} = b + X_t - X_{t-1} \]

If $X_t$ is stationary, then $X_t-X_{t-1}$ is also stationary. Thus we have now \[ \bigtriangledown Y_t = \mu + M_t \] We can try to model this with ARMA(p,q) with intercept $\mu = b$.

  set.seed(42352)
  t   = 1:100
  Y   = 3 - 10*t + arima.sim(n=100, list(ar = c(.7,.2)), sd=100)

  Y2 = diff(Y)              # take the difference

  layout(matrix(1:2,1,2))
  plot(Y,type="o")           # Y is simulated data
  plot(Y2, type="o" )

  mean(Y2)

## [1] -9.461098

c) Differencing Quadratic Trend

If $m_t$ is quadratic, then \[\begin{align} Y_t &= a + bt + c t^2 + X_t \\\\ Y_{t-1} &= a + b(t-1) + c (t-1)^2+ X_{t-1} \\\\ Y_{t-2} &= a + b(t-2) + c (t-2)^2 + X_{t-2}. \end{align}\]

That means

\[\begin{align} \bigtriangledown^2 Y_t &= Y_t - 2Y_{t-1} + Y_{t-2} \\\\ &= 2c + X_t - 2X_{t-1} + X_{t-2} \end{align}\]

If $X_t$ is stationary, so is $X_t - 2X_{t-1} + X_{t-2}$.

If $m_t$ is polynomial of deg $k$, with coefficients $c_0, c_1, $ then applying $k$ power of difference operator will remove the trend. \[ \bigtriangledown^k Y_t = k! c_k + \bigtriangledown^k X_t. \]

Then you will end up with some statonary series $\bigtriangledown^k X_t$ with constant trend $k! c_k$.

  t   = 1:100
  tsq = t^2
  Y   = 3 - .5*t + .1*tsq + arima.sim(n=100, list(ar = c(.7,.2)))*10

  layout(matrix(1:2, 1, 2))
  plot(Y,type="o")
  plot(diff(Y),type="o")

  plot(diff(diff(Y)),type="o")

Key Points

- Box-Jenkins Differencing Turns many non-stationary trends into constants.

If you difference linar trend $a+bt$ once, it becomes constant $b$.

If you difference quadratic trend $a+bt+ct^2$ twice, it becomes constant $2c$.

If your data is from (Linear Trend + ARMA), then

Direct linear regression should yield stationary residuals that can be fit with ARMA.

Differencing should result in a stationary series with significant mean.

2. Differencing Random Trends

a) Random Walk

Suppose we have a model with random trend \[ Y_t = M_t + X_t, \] where $X_t$ is a stationary series and $M_t$ is a random walk: \[ M_t = \sum_{i=1}^t e_i \mbox{ where } e_i \sim_{iid} N(0, 1). \]

With iid errors $(e_1,e_2,e_3,\ldots)$ random walk is generated as

\[\begin{align} M_1 &= e_1 \\ M_2 &= e_1 + e_2 \\ M_3 &= e_1 + e_2 + e_3 \\ M_4 &= e_1 + e_2 + e_3 + e_4 \\ &\vdots \end{align}\]

We can use cumsum() function in R to do this easily.

Mean of RW is \[ E[M_t] = E\Big(\sum_{i=1}^t e_i\Big) = \sum_{i=1}^t E(e_i) = 0 \]

Variance of RW is \[ V[M_t] = V\Big(\sum_{i=1}^t e_i\Big) = \sum_{i=1}^t V(e_i) = \sigma^2 t \]

  #- RW without drift
  M = cumsum(rnorm(100))
  plot(M, type="l", ylim=c(-40,40), main="50 random walks")

  for (i in 1:50) {
    e = rnorm(100)      #- copy and paste these 3 lines many times
    M = cumsum(e)
    lines(M)
  }

Note that mean is 0, but variance increase as time goes on.

b) Differencing RW

Since \[ M_t = \sum_{i=1}^t e_i, \] we have \[ \bigtriangledown M_t = M_t - M_{t-1} = \sum_{i=1}^t e_i - \sum_{i=1}^{n-1} e_i = \epsilon_t \] And we know that $\epsilon_t \sim N(0,\sigma)$.

So if $M_t$ is Random Walk, then $\bigtriangledown M_t$ should look like iid $N(0,\sigma)$ noise.

Ex: Differencing RW

  e = rnorm(100)      #- 100 random number from N(0,1)
  M = cumsum(e)

  layout(matrix(1:2, 1, 2))
  plot(M,type="o", main="Random Walk")
  plot(diff(M), type="o", main="difference of RW");

  acf(diff(M), main="ACF of the RW difference")
  hist(diff(M))

c) RW with Drift

Random Walk with drift $\delta$ is \[ M_t = \sum_{i=1}^t e_i \mbox{ where } e_i \sim_{iid} N(\delta, 1). \]

Now each step have little more chance of being positive than negative:

If $M_t$ is Random Walk with drift, then $\bigtriangledown M_t$ should look like iid $N(\delta,\sigma)$ noise.

Mean of RW is now \[ E[M_t] = E\Big(\sum_{i=1}^t e_i\Big) = \sum_{i=1}^t E(e_i) = \delta \]

Variance of RW is unchanged: \[ V[M_t] = V\Big(\sum_{i=1}^t e_i\Big) = \sum_{i=1}^t V(e_i) = \sigma^2 t \]

  layout(matrix(1:2, 1, 2))

  #- RW with drift (delta = .3)
  e = rnorm(100, .3, 1)      #- 100 random number from N(1,1)
  M = cumsum(e)

  #- repeat this 50 times
  plot(M,type="l", ylim=c(-40,40), main="50 RW with drift .3")
  for (i in 1:50) {
    e = rnorm(100, .3, 1)
    M = cumsum(e)
    lines(M)
  }


  #- RW without drift
  e = rnorm(100)      #- 100 random number from N(0,1)
  M = cumsum(e)

  #- repeat this 50 times
  plot(M,type="l", ylim=c(-40,40), main="50 RW w/o drift")
  for (i in 1:50) {
    e = rnorm(100)      #- copy and paste these 3 lines many times
    M = cumsum(e)
    lines(M)
  }

d) Testing RW for Drift

  set.seed(87656)

  e = rnorm(100, .3, 1)
  M = cumsum(e)

  plot(M,type="o")

Does this RW has drift or not?

Test if mean of the difference is 0.

  plot(diff(M), type="o", main="difference of RW")

  c(mean(diff(M)), sd(diff(M)), 1.96*sd(diff(M))/sqrt(100) )

## [1] 0.3292302 1.0239543 0.2006950

95% CI for the mean is $.33 \pm .20$, so the sample mean is significantly different from 0. We conclude that the above RW has a drift.

Random Walk as Trend

If $Y_t = M_t + X_t$, then \[ Y_t = \sum_{i=1}^t e_i + X_t \\Y_{t-1} = \sum_{i=1}^{t-1} e_i + X_{t-1} \] That means \[ \bigtriangledown Y_t = Y_t - Y_{t-1} = \epsilon_t + X_t - X_{t-1} \] Since $X_t, \epsilon_t$ are stationary, so is $\epsilon_t+X_t-X_{t-1}$.

Key Points

If you difference (RW without drift), it becomes N($0,\sigma^2$) iid noise.

If you difference (RW with drift $\delta$), it becomes N($\delta,\sigma^2$) iid noise.

When difference is applied to (RW w/o drift + ARMA), it should result in stationary series with mean 0, that can be fit with another ARMA.

If you difference (RW w drift + ARMA), it should result in stationary series with mean $\delta$, that can be fit with another ARMA.

3. Ex: AAPL

Daily adjusted close price of Apple, Inc. Between 10/17/2007 to 8/4/2008 from Yahoo! finance website

  library(forecast)

## Registered S3 method overwritten by 'quantmod':
##   method            from
##   as.zoo.data.frame zoo

  library(quantmod)      # Do install.packages("quantmod")  if not installed

## Loading required package: xts

## Loading required package: zoo

## 
## Attaching package: 'zoo'

## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric

## Loading required package: TTR

  source('https://nmimoto.github.io/R/TS-00.txt')

  getSymbols("AAPL")     #- download from Yahoo!

## 'getSymbols' currently uses auto.assign=TRUE by default, but will
## use auto.assign=FALSE in 0.5-0. You will still be able to use
## 'loadSymbols' to automatically load data. getOption("getSymbols.env")
## and getOption("getSymbols.auto.assign") will still be checked for
## alternate defaults.
## 
## This message is shown once per session and may be disabled by setting 
## options("getSymbols.warning4.0"=FALSE). See ?getSymbols for details.

## [1] "AAPL"

  X = as.ts(AAPL$AAPL.Adjusted[400:200])
  plot(X, type="o")

  layout(matrix(1:2, 1, 2))
  plot(diff(X), type="o")
  acf(diff(X))

  Fit1 = auto.arima(X, stepwise=FALSE, approximation=FALSE)
  Fit1

## Series: X 
## ARIMA(0,1,0) 
## 
## sigma^2 = 0.01962:  log likelihood = 109.34
## AIC=-216.67   AICc=-216.65   BIC=-213.38

  # Arima(0,1,0) w/o drift is suggested

  # Force drift term. Is it significant?
  Arima(X, order=c(0,1,0), include.drift=TRUE)

## Series: X 
## ARIMA(0,1,0) with drift 
## 
## Coefficients:
##         drift
##       -0.0030
## s.e.   0.0099
## 
## sigma^2 = 0.01971:  log likelihood = 109.38
## AIC=-214.76   AICc=-214.7   BIC=-208.17

  # Force AR(1) after the difference.
  Arima(X, order=c(1,1,0))

## Series: X 
## ARIMA(1,1,0) 
## 
## Coefficients:
##           ar1
##       -0.0261
## s.e.   0.0706
## 
## sigma^2 = 0.0197:  log likelihood = 109.41
## AIC=-214.81   AICc=-214.75   BIC=-208.21

  # Force MA(1) after the difference.
  Arima(X, order=c(0,1,1))

## Series: X 
## ARIMA(0,1,1) 
## 
## Coefficients:
##           ma1
##       -0.0242
## s.e.   0.0680
## 
## sigma^2 = 0.01971:  log likelihood = 109.4
## AIC=-214.8   AICc=-214.74   BIC=-208.2

  Randomness.tests(Fit1$residuals)

## 
##     'tseries' version: 0.10-49
## 
##     'tseries' is a package for time series analysis and
##     computational finance.
## 
##     See 'library(help="tseries")' for details.

##   B-L test H0: the series is uncorrelated
##   M-L test H0: the square of the series is uncorrelated
##   J-B test H0: the series came from Normal distribution
##   SD         : Standard Deviation of the series

##       BL15  BL20 BL25  ML15  ML20    JB   SD
## [1,] 0.389 0.563 0.52 0.096 0.155 0.001 0.14

5b: Box-Jenkins Differencing Method

Nao Mimoto - Dept. of Statistics : The University of Akron

1. Box-Jenkins Differencing Method

a) Backward and Difference Operator

b) Differencing Linear Trend

c) Differencing Quadratic Trend

Key Points

2. Differencing Random Trends

a) Random Walk

b) Differencing RW

Ex: Differencing RW

c) RW with Drift

d) Testing RW for Drift

Random Walk as Trend

Key Points

3. Ex: AAPL