( Stationarity Check ) Does \(X_t\) look stationary?
( De-mean ) Is the mean of this process 0?
( Model Selection ) What model should we use? AR(p), or something else?
( Order Selection ) How did we decide on the value of \(p\) to fit?
ACF and PACF
AIC and log-likelihood
Parameter Significance
( Parameter Estimation ) What estimator was used for \(\phi_i\)? Can you reject the null hypothesis that \(\phi_i=0\)?
( Residual Analysis ) How were the residuals calculated? What do they say about how well the model fits the data? (A code sketch of these steps follows this checklist.)
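As a rough illustration of this checklist in R, here is a minimal sketch. The forecast package, the simulated series, and the names Y and Fit1 are assumptions made for illustration; in the notes, Y is the series actually being analyzed, so the numbers here will not match the output shown further below.

#--- Minimal sketch of the fitting checklist (illustrative only)
library(forecast)

set.seed(42)
Y <- arima.sim(list(ar = c(0.6, 0.25)), n = 100)   # stand-in AR(2) data

plot(Y)         # (Stationarity Check) does the series look stationary?
mean(Y)         # (De-mean) is the sample mean close to 0?

acf(Y)          # (Order Selection) ACF decays gradually for an AR process
pacf(Y)         #   PACF cutting off after lag p suggests AR(p)

Fit1 <- Arima(Y, order = c(2, 0, 0), include.mean = FALSE)   # candidate AR(2)
Fit1$aic        # compare AIC across candidate orders

#- (Parameter Estimation) approximate t-statistic = phi.hat / s.e.;
#  |t| > 2 roughly rejects H0: phi_i = 0 at the 5% level.
coef(Fit1) / sqrt(diag(Fit1$var.coef))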
Residuals come directly from flipping the AR(p) equation backwards. (That is why the first \(p\) values of \(\hat \epsilon_t\) are NAs.)
Say we have an AR(2) model \[ X_t = \phi_1 X_{t-1} + \phi_2 X_{t-2} + \epsilon_t, \] where \(X_t\) are the observations. Since we never get to observe \(\epsilon_t\), we must estimate it using the fitted coefficients: \[ \hat \epsilon_t = X_t - \hat \phi_1 X_{t-1} - \hat \phi_2 X_{t-2}, \hspace{10mm} t = 3, 4, \ldots, n. \] This is why, for an AR(\(p\)) model, the first \(p\) residuals are NA.
If the chosen model fits the data adequately, \(\hat \epsilon_t\) should behave like \(\epsilon_t\); i.e., the residuals should behave like white noise.
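The output below (the summary of an ARIMA(2,0,0) fit to the series Y, the printed residual series, and three individual residual values) comes from the data used in the notes. Continuing the sketch above (Y and Fit1 as defined there), the residuals can be recomputed by hand from the flipped AR(2) equation; for \(t > p\) they should agree with the residuals returned by the fit.

#--- Manual residuals from the flipped AR(2) equation (continues the sketch above)
phi <- unname(coef(Fit1))                 # estimated (ar1, ar2)

res.manual <- rep(NA, length(Y))          # first p = 2 entries stay NA
for (t in 3:length(Y)) {
  res.manual[t] <- Y[t] - phi[1] * Y[t-1] - phi[2] * Y[t-2]
}

#- For t > p the manual residuals match the ones returned by the fit
cbind(from.fit = resid(Fit1), manual = res.manual)[1:5, ]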
## Series: Y
## ARIMA(2,0,0) with zero mean
##
## Coefficients:
## ar1 ar2
## 0.6279 0.2530
## s.e. 0.0958 0.0959
##
## sigma^2 estimated as 1.088: log likelihood=-145.78
## AIC=297.55 AICc=297.8 BIC=305.37
## Time Series:
## Start = 1
## End = 100
## Frequency = 1
## [1] -0.125120376 -0.516514985 0.371576940 -1.107009588 -2.102633877 0.149898985
## [7] -0.924339580 -1.063014563 1.937233661 1.479867875 0.206238128 -0.074583443
## [13] -0.562946698 1.107662240 -1.406661721 0.732666286 -1.009395211 0.857768994
## [19] -1.483114286 -0.315466261 0.538839192 -0.600311196 1.455587318 0.350725689
## [25] -0.032241007 0.496018048 -1.582838152 -0.410956217 1.509867542 -0.700668997
## [31] -1.645742613 -0.687316378 -0.224103502 -2.481482569 -0.903373124 0.957356315
## [37] 0.545280800 -0.299999646 -2.929796634 1.634819088 -0.402916836 -2.521157208
## [43] -0.129219223 0.335421656 -2.046475081 1.180352070 0.182626415 -0.676085651
## [49] -0.197933089 1.009032404 1.026029210 -0.088155778 -0.611498470 -0.531325661
## [55] 0.469679768 0.624261534 0.651314043 -0.876104882 1.104851743 0.983531787
## [61] -0.044519851 -0.064338560 0.004113708 2.246773379 0.398652275 -0.051180491
## [67] 0.085646365 0.509691218 -0.516394560 1.015704897 0.874144891 -0.626109628
## [73] 0.498649583 -1.456258337 1.065650387 0.567119069 0.277276538 0.048982999
## [79] 0.984373577 1.280681352 0.891782794 -0.975160790 -0.051029099 -1.809801283
## [85] -0.689138734 -0.063599760 -1.728563304 -0.022074558 0.424939853 -0.198056122
## [91] -0.077164248 0.712087762 -0.248736398 0.995859847 -0.027582447 2.729472001
## [97] 0.550102057 -0.052067553 -0.887269883 -0.575951928
## [1] 0.3715769
## [1] -1.10701
## [1] -2.102634
After a time series model is fit, we want to check the residuals for
Randomness (no autocorrelation)
Conditional heteroscedasticity (constant conditional variance)
Normality (not as important for ARMA)
Given the residuals \(\hat \epsilon_t\), we want to see whether they form an uncorrelated sequence. The usual tools are listed below (a code sketch follows this list).
Plot the ACF/PACF
Ljung-Box test for randomness
McLeod-Li test for conditional heteroscedasticity
Q-Q plot and Jarque-Bera test for normality
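Here is a minimal sketch of these individual checks, assuming res holds the residuals from the fit above. The lag of 10 in the Ljung-Box calls and the use of the tseries package for the Jarque-Bera test are illustrative choices, not necessarily the ones used in class.

#--- Individual residual checks (illustrative)
library(tseries)                     # for jarque.bera.test()

res <- resid(Fit1)

par(mfrow = c(2, 2))
acf(res,   main = "ACF of residuals")
pacf(res,  main = "PACF of residuals")
acf(res^2, main = "ACF of squared residuals")    # heteroscedasticity check
qqnorm(res); qqline(res)                         # normality check

Box.test(res,   lag = 10, type = "Ljung-Box", fitdf = 2)   # randomness (fitdf = p + q)
Box.test(res^2, lag = 10, type = "Ljung-Box")              # McLeod-Li test
jarque.bera.test(as.numeric(res))                          # normality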
All of the above checks are put into my function called Randomness.tests(). You can load it into R by copying and pasting the following line.
#--- Copy and paste this line to load the Basic Functions from the class web page
source("https://nmimoto.github.io/R/TS-00.txt")
#- Use the function as below (Fit1 is the fitted model object; res is its residuals)
res = Fit1$resid
Randomness.tests(res)
Check the residuals for:
Randomness (no autocorrelation): Ljung-Box test
Heteroscedasticity (randomness of the squared series): McLeod-Li test
Normality: Q-Q plot and Jarque-Bera test (not as important as the above two)
Use the code below to load the all-in-one function (you need the required package installed).