


Modeling Procedure:

( Stationarity Check ) Does \(X_t\) look stationary?

  • Perform a stationarity transformation if it does not look stationary.

( De-mean ) Is the mean of this process 0?

  • We usually de-mean, unless we have a reason to believe the true mean is equal to 0.

( Model Selection ) What model should we use: AR(p), or something else?

( Order Selection ) How did we decide on the value of \(p\) to fit?

  • ACF and PACF

  • AIC and log-likelihood

  • Parameter Significance

( Parameter Estimation ) What estimator was used for \(\phi_i\)? Can you reject the null hypothesis that \(\phi_i=0\)?

  • Use the approximate 95% CI \(\hat \phi_i \pm 2(S.E.)\) to check significance: the parameter is significant if the interval does not contain 0.

( Residual Analysis ) How was the residual calculated? What does it say about how the model fits the data?
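
The residual-analysis step can be sketched in base R as follows. This is only an illustration under assumptions: the simulated AR(2) series, the seed, the choice of 10 lags, and the object names (fit, res, r, lb) are not part of the procedure itself. For an arima() fit, the residuals are the estimated innovations, i.e. the one-step-ahead prediction errors; if the model fits well they should look like white noise.

```r
set.seed(3563)                                      # arbitrary seed (assumption)
X   <- arima.sim(list(ar = c(.6, .3)), n = 100) + 8.5
fit <- arima(X, order = c(2, 0, 0), method = "ML")  # AR(2) with non-zero mean

res <- residuals(fit)                               # estimated innovations

# Sample ACF of the residuals at lags 1..10 (lag 0 dropped)
r     <- acf(res, lag.max = 10, plot = FALSE)$acf[-1]
bound <- 2 / sqrt(length(res))                      # approximate 95% white-noise band
sum(abs(r) > bound)                                 # number of lags outside the band

# Ljung-Box test; fitdf = 2 subtracts the two estimated AR coefficients
lb <- Box.test(res, lag = 10, type = "Ljung-Box", fitdf = 2)
lb$p.value                                          # a large p-value gives no evidence against white noise
```

Few or no lags outside the band, together with a large Ljung-Box p-value, suggest that the fitted model has captured the autocorrelation in the data.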

1. Use ACF and PACF

The ACF of an AR(\(p\)) process tails off, while its PACF cuts off after lag \(p\). The theoretical PACF at lag \(p\) equals \(\phi_p\), and it is 0 for every lag \(h > p\).

  set.seed(3563)
  X = arima.sim(list(ar = c(.6, .3) ), 100)  + 8.5
  plot(X, type="o")

  layout(matrix(1:2, 1, 2))  # two plots side by side
  acf(X)                     # plot sample ACF
  pacf(X)                    # plot sample PACF

  layout(1)                  # return to 1 plot layout
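
The sample plots can be compared with the theoretical values. A minimal sketch using base R's ARMAacf() with the same AR(2) coefficients; the object names phi, rho and alpha are only illustrative.

```r
phi   <- c(.6, .3)                                      # AR(2) coefficients used in the simulation
rho   <- ARMAacf(ar = phi, lag.max = 10)                # theoretical ACF, lags 0..10
alpha <- ARMAacf(ar = phi, lag.max = 10, pacf = TRUE)   # theoretical PACF, lags 1..10

round(rho, 3)     # decays gradually: the ACF tails off
round(alpha, 3)   # lag 2 equals phi_2 = 0.3; lags 3 and beyond are 0
```

Note that the lag-1 PACF equals the lag-1 autocorrelation, \(\phi_1/(1-\phi_2) \approx 0.857\), not \(\phi_1\); only the PACF at lag \(p\) coincides with the last AR coefficient.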



2. Use AIC and log-likelihood

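Likelihood function \[ L(\theta) = f(X_1, \ldots, X_n ; \theta) = f(X_1) \, f(X_2 \mid X_1) \, \cdots \, f(X_n \mid X_{n-1}, \ldots, X_1) \] Because the observations of a time series are dependent, the joint density factors into conditional densities rather than marginal ones.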

Akaike Information Criterion: \[ \mbox{AIC} = -2 \, \log(\mbox{maximum likelihood}) + 2 k \] where \(k=p+1\) if a non-zero mean is included in the model, and \(k=p\) if not.

Choose the \(p\) that minimizes AIC.
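
To make the formula concrete, the sketch below computes AIC by hand from the maximized log-likelihood of a base-R arima() fit. The seed and the names ll, k_notes, k_R, aic_notes and aic_R are assumptions made for illustration. One detail worth knowing: R also counts the innovation variance \(\sigma^2\) as an estimated parameter, so its reported AIC is larger by 2 than the value obtained with \(k=p+1\). The shift is the same for every candidate model, so the ranking of models is unchanged.

```r
set.seed(1)                                          # arbitrary seed (assumption)
X   <- arima.sim(list(ar = c(.6, .3)), n = 100) + 8.5
fit <- arima(X, order = c(2, 0, 0), method = "ML")   # AR(2) with non-zero mean

ll      <- as.numeric(logLik(fit))    # maximized log-likelihood
k_notes <- 2 + 1                      # p + 1: two AR coefficients plus the mean
k_R     <- attr(logLik(fit), "df")    # R's count: p + 1, plus sigma^2

aic_notes <- -2 * ll + 2 * k_notes
aic_R     <- -2 * ll + 2 * k_R

c(notes = aic_notes, R = aic_R, builtin = AIC(fit))  # the last two agree
```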

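The auto.arima() function in the forecast package chooses \(p\) automatically by minimizing an information criterion. Its default is the small-sample corrected AICc; pass ic="aic" to use plain AIC.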

AIC Simulation Study

If we always choose \(p\) by minimum AIC, what is the probability that we end up with the correct \(p\)?

Set \(p=2\). Generate a simulated AR(2) series. Use auto.arima(X) to search for the best \(p\) by minimum AIC. Repeat 1000 times. How many times do we end up with the right \(p\)?

  library(forecast)
  #- Simulation Study of choosing p by AIC
  Result = numeric(1000)                      # chosen p for each replication
  for (i in 1:1000){
    X    = arima.sim(list(ar = c(.6, .3) ), 100 )  + 8.5
    Fit7 = auto.arima(X, d=0, max.q=0)        # find AR(p) by AIC
    Result[i] = length(Fit7$model$phi)        # save value of p chosen
  }
  head(Result, 100)                       # first 100 p
##   [1] 3 2 1 2 3 2 1 3 2 2 2 2 1 2 2 3 1 2 4 2 2 3 3 2 3 1 2 3 2 2
##  [31] 2 3 2 2 2 2 2 2 2 2 2 2 2 1 2 1 2 3 2 2 2 2 1 2 2 2 1 2 2 2
##  [61] 2 2 1 1 2 2 2 2 2 2 1 2 2 2 2 2 2 2 2 2 2 2 1 1 2 2 1 1 2 2
##  [91] 2 2 2 3 2 5 2 2 2 2
  mean(Result==2)                         # proportion of times p=2
## [1] 0.765
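
The same kind of study can be sketched without the forecast package by using base R's ar(), which fits AR models by Yule-Walker by default and picks the order with the smallest AIC. The seed, the 500 replications, the cap order.max = 5 and the name chosen are assumptions for illustration. Because the estimator and the criterion differ from auto.arima(), the proportion will not match the value above exactly.

```r
set.seed(2024)                        # arbitrary seed (assumption)
n_rep  <- 500                         # number of simulated series (assumption)
chosen <- integer(n_rep)

for (i in seq_len(n_rep)) {
  X <- arima.sim(list(ar = c(.6, .3)), n = 100) + 8.5
  chosen[i] <- ar(X, aic = TRUE, order.max = 5)$order   # order with minimum AIC
}

table(chosen)         # how often each p was selected
mean(chosen == 2)     # proportion of times the true order p = 2 was selected
```

Tabulating the chosen orders shows not only how often AIC is right, but also whether its mistakes tend toward too small or too large a \(p\).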



3. Use Parameter Significance

After \(p\) has been suggested by AIC, test the last parameter estimate for significance using the large-sample 95% CI, i.e. \(\hat \phi_p \pm 2 \, S.E.\) If the last parameter is not significant, reduce \(p\) by 1 and fit again.

Ex: When AIC fails, use parameter significance

AIC does not always pick the correct \(p\). What can we do when it picks one that is too large?

  • Set \(p=2\). Generate simulated AR(2) series until the minimum-AIC search returns the wrong order, \(p=3\).

  • Check the last coefficient of that AR(3) fit using \(\hat \phi_3 \pm 2 \, S.E.\)

  • If it is not significant, reduce \(p\) by 1 and fit AR(2).

  set.seed(1532)

  #- Run until p=3 comes out
  for (i in 1:999){
    X    = arima.sim(list(ar = c(.6, .3) ), 100 )  + 8.5
    Fit7 = auto.arima(X, d=0, max.q=0)        # find AR(p) by AIC
    Result = length(Fit7$model$phi)           # save value of p chosen
    if (Result==3){
      break
    }
  }

  auto.arima(X, d=0, max.q=0)   # search for best p by AIC
## Series: X 
## ARIMA(3,0,0) with non-zero mean 
## 
## Coefficients:
##          ar1     ar2      ar3    mean
##       0.6314  0.4514  -0.1932  8.8675
## s.e.  0.0972  0.1063   0.0990  0.8142
## 
## sigma^2 = 0.9548:  log likelihood = -138.39
## AIC=286.79   AICc=287.43   BIC=299.82
  Arima(X, order=c(2,0,0))      # fit AR(2)
## Series: X 
## ARIMA(2,0,0) with non-zero mean 
## 
## Coefficients:
##          ar1     ar2    mean
##       0.5633  0.3466  8.9965
## s.e.  0.0928  0.0938  0.9787
## 
## sigma^2 = 0.9818:  log likelihood = -140.26
## AIC=288.52   AICc=288.94   BIC=298.94
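
In the AR(3) output above, the interval for the last coefficient is \(-0.1932 \pm 2(0.0990) = (-0.3912, \, 0.0048)\), which contains 0, so ar3 is not significant and the order is reduced to \(p=2\). The check can be automated along the following lines. This is a sketch under assumptions: it uses base R's arima() rather than the forecast package, simulates its own series (so the numbers will differ from the output above), and the names fit3, est, se and ci are illustrative.

```r
set.seed(1532)                                        # arbitrary seed (assumption)
X    <- arima.sim(list(ar = c(.6, .3)), n = 100) + 8.5
fit3 <- arima(X, order = c(3, 0, 0), method = "ML")   # deliberately over-fit AR(3)

est <- coef(fit3)                     # ar1, ar2, ar3, intercept
se  <- sqrt(diag(vcov(fit3)))         # large-sample standard errors

# Approximate 95% intervals: estimate +/- 2 S.E.
ci <- data.frame(estimate = est, lower = est - 2 * se, upper = est + 2 * se)
ci$significant <- ci$lower > 0 | ci$upper < 0         # TRUE if the interval excludes 0
ci
```

If the row for ar3 is flagged as not significant, refit with order = c(2, 0, 0) and repeat the check on ar2.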



Summary

  • Plot the ACF and PACF to get a feel for the value of \(p\).

  • Use the minimum Akaike Information Criterion (AIC) to look for the best \(p\). auto.arima() automates this search; its default criterion is the corrected version, AICc.

  • Check the significance of the estimated parameters.

  • After getting the suggested “best” \(p\) from AIC, fit the model with \(p+1\), and check whether the last parameter estimate comes out non-significant.