Class Web Page



1. Box-Jenkins Differencing Method

This is called ARIMA modeling.

a) Backward and Difference Operator

Recall the backward operator \(B\), \[ B \, X_t = X_{t-1}. \]

Then define difference operator \(\bigtriangledown\) (del), \[ \bigtriangledown = (1-B). \]

For example, \[ \bigtriangledown X_t = (1-B)X_t = X_t - X_{t-1} \\ \\ \bigtriangledown^2 X_t = (1-B)(1-B) X_t = X_t - 2X_{t-1} + X_{t-2} \]

b) Differencing Linear Trend

Suppose your time series have a line plus a zero-mean staionary noise. \[ Y_t = a + bt + X_t \] That means \[ Y_{t-1} = a + b(t-1) + X_{t-1} \\ \\ \bigtriangledown Y_t = Y_t - Y_{t-1} = b + X_t - X_{t-1} \]

If \(X_t\) is stationary, then \(X_t-X_{t-1}\) is also stationary. Thus we have now \[ \bigtriangledown Y_t = \mu + M_t \] We can try to model this with ARMA(p,q) with intercept \(\mu = b\).

  set.seed(42352)
  t   = 1:100
  Y   = 3 - 10*t + arima.sim(n=100, list(ar = c(.7,.2)), sd=100)

  Y2 = diff(Y)              # take the difference

  layout(matrix(1:2,1,2))
  plot(Y,type="o")           # Y is simulated data
  plot(Y2, type="o" )

  mean(Y2)
## [1] -9.461098


c) Differencing Quadratic Trend

If \(m_t\) is quadratic, then \[\begin{align} Y_t &= a + bt + c t^2 + X_t \\\\ Y_{t-1} &= a + b(t-1) + c (t-1)^2+ X_{t-1} \\\\ Y_{t-2} &= a + b(t-2) + c (t-2)^2 + X_{t-2}. \end{align}\]

That means

\[\begin{align} \bigtriangledown^2 Y_t &= Y_t - 2Y_{t-1} + Y_{t-2} \\\\ &= 2c + X_t - 2X_{t-1} + X_{t-2} \end{align}\]

If \(X_t\) is stationary, so is \(X_t - 2X_{t-1} + X_{t-2}\).

If \(m_t\) is polynomial of deg \(k\), with coefficients $c_0, c_1, $ then applying \(k\) power of difference operator will remove the trend. \[ \bigtriangledown^k Y_t = k! c_k + \bigtriangledown^k X_t. \]

Then you will end up with some statonary series \(\bigtriangledown^k X_t\) with constant trend \(k! c_k\).

  t   = 1:100
  tsq = t^2
  Y   = 3 - .5*t + .1*tsq + arima.sim(n=100, list(ar = c(.7,.2)))*10

  layout(matrix(1:2, 1, 2))
  plot(Y,type="o")
  plot(diff(Y),type="o")

  plot(diff(diff(Y)),type="o")


Key Points

- Box-Jenkins Differencing Turns many non-stationary trends into constants.

  • If you difference linar trend \(a+bt\) once, it becomes constant \(b\).

  • If you difference quadratic trend \(a+bt+ct^2\) twice, it becomes constant \(2c\).

  • If your data is from (Linear Trend + ARMA), then

    • Direct linear regression should yield stationary residuals that can be fit with ARMA.
    • Differencing should result in a stationary series with significant mean.



3. Ex: AAPL

Daily adjusted close price of Apple, Inc. Between 10/17/2007 to 8/4/2008 from Yahoo! finance website

  library(forecast)
## Registered S3 method overwritten by 'quantmod':
##   method            from
##   as.zoo.data.frame zoo
  library(quantmod)      # Do install.packages("quantmod")  if not installed
## Loading required package: xts
## Loading required package: zoo
## 
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric
## Loading required package: TTR
  source('https://nmimoto.github.io/R/TS-00.txt')

  getSymbols("AAPL")     #- download from Yahoo!
## 'getSymbols' currently uses auto.assign=TRUE by default, but will
## use auto.assign=FALSE in 0.5-0. You will still be able to use
## 'loadSymbols' to automatically load data. getOption("getSymbols.env")
## and getOption("getSymbols.auto.assign") will still be checked for
## alternate defaults.
## 
## This message is shown once per session and may be disabled by setting 
## options("getSymbols.warning4.0"=FALSE). See ?getSymbols for details.
## [1] "AAPL"
  X = as.ts(AAPL$AAPL.Adjusted[400:200])
  plot(X, type="o")

  layout(matrix(1:2, 1, 2))
  plot(diff(X), type="o")
  acf(diff(X))

  Fit1 = auto.arima(X, stepwise=FALSE, approximation=FALSE)
  Fit1
## Series: X 
## ARIMA(0,1,0) 
## 
## sigma^2 = 0.01962:  log likelihood = 109.34
## AIC=-216.67   AICc=-216.65   BIC=-213.38
  # Arima(0,1,0) w/o drift is suggested

  # Force drift term. Is it significant?
  Arima(X, order=c(0,1,0), include.drift=TRUE)
## Series: X 
## ARIMA(0,1,0) with drift 
## 
## Coefficients:
##         drift
##       -0.0030
## s.e.   0.0099
## 
## sigma^2 = 0.01971:  log likelihood = 109.38
## AIC=-214.76   AICc=-214.7   BIC=-208.17
  # Force AR(1) after the difference.
  Arima(X, order=c(1,1,0))
## Series: X 
## ARIMA(1,1,0) 
## 
## Coefficients:
##           ar1
##       -0.0261
## s.e.   0.0706
## 
## sigma^2 = 0.0197:  log likelihood = 109.41
## AIC=-214.81   AICc=-214.75   BIC=-208.21
  # Force MA(1) after the difference.
  Arima(X, order=c(0,1,1))
## Series: X 
## ARIMA(0,1,1) 
## 
## Coefficients:
##           ma1
##       -0.0242
## s.e.   0.0680
## 
## sigma^2 = 0.01971:  log likelihood = 109.4
## AIC=-214.8   AICc=-214.74   BIC=-208.2
  Randomness.tests(Fit1$residuals)
## 
##     'tseries' version: 0.10-49
## 
##     'tseries' is a package for time series analysis and
##     computational finance.
## 
##     See 'library(help="tseries")' for details.

##   B-L test H0: the series is uncorrelated
##   M-L test H0: the square of the series is uncorrelated
##   J-B test H0: the series came from Normal distribution
##   SD         : Standard Deviation of the series
##       BL15  BL20 BL25  ML15  ML20    JB   SD
## [1,] 0.389 0.563 0.52 0.096 0.155 0.001 0.14