w6a: Order Selection


Modeling Procedure:

( Stationarity Check ) Does \(X_t\) look stationary?

Perform a stationarity transformation if it does not look stationary.

( De-mean ) Is the mean of this process 0?

We usually de-mean, unless we have a reason to believe the true mean is equal to 0.

( Model Selection ) What model should we use? AR(p)? or something else?

( Order Selection ) How did we decide on the value of \(p\) to fit?

ACF and PACF
AIC and log-likelihood
Parameter Significance

( Parameter Estimation ) What estimator was used for \(\phi_i\)? Can you reject the null hypothesis that \(\phi_i=0\)?

Use \(\hat \phi \pm 2(S.E.)\) CI to check significance.

( Residual Analysis ) How was the residual calculated? What does it say about how the model fits the data?
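
As a rough illustration of the residual-analysis step, the sketch below fits an AR(2) to simulated data and checks whether the residuals look like white noise. The seed, the sample size of 200, and the Ljung-Box lag of 10 are arbitrary choices made for this illustration only.

```r
set.seed(1)                                        # arbitrary seed, for reproducibility only
X   = arima.sim(list(ar = c(.6, .3)), 200) + 8.5   # simulated AR(2) with mean 8.5
Fit = arima(X, order = c(2, 0, 0))                 # AR(2); a non-zero mean is included by default

Res = residuals(Fit)                               # one-step-ahead prediction errors
acf(Res)                                           # ideally no significant spikes at lags >= 1
LB  = Box.test(Res, lag = 10, type = "Ljung-Box", fitdf = 2)  # fitdf = number of AR coefficients
LB$p.value                                         # a large p-value gives no evidence of leftover correlation
```

If the residual ACF still shows structure, or the Ljung-Box p-value is small, the chosen order is probably too low.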

1. Use ACF and PACF

The ACF of an AR(p) process tails off, while its PACF cuts off after lag \(p\). The theoretical PACF at lag \(p\) equals \(\phi_p\), and it is 0 for every lag \(h > p\).

  set.seed(3563)
  X = arima.sim(list(ar = c(.6, .3) ), 100)  + 8.5
  plot(X, type="o")

  layout(matrix(1:2, 1, 2))  # two plots side by side
  acf(X)                     # plot sample ACF
  pacf(X)                    # plot sample PACF

  layout(1)                  # return to 1 plot layout
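
The sample plots can be compared with the theoretical values. A minimal sketch, using `ARMAacf()` from base R with the same AR(2) coefficients as the simulation; the maximum lag of 6 is an arbitrary choice.

```r
phi   = c(.6, .3)                                    # AR(2) coefficients used in arima.sim() above
rho   = ARMAacf(ar = phi, lag.max = 6)               # theoretical ACF at lags 0..6 (tails off)
alpha = ARMAacf(ar = phi, lag.max = 6, pacf = TRUE)  # theoretical PACF at lags 1..6 (cuts off)
round(rho, 3)
round(alpha, 3)
```

Here the PACF at lag 1 is \(\phi_1/(1-\phi_2) = 0.6/0.7\), the PACF at lag 2 is \(\phi_2 = 0.3\), and all later lags are 0.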

2. Use AIC and log-likelihood

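Likelihood function (the joint density of the observations, viewed as a function of the parameters \(\theta\)). Because time series observations are dependent, it factors into conditional densities: \[ L(\theta) = f(X_1, \ldots, X_n) = f(X_1) \, f(X_2 \mid X_1) \, \cdots f(X_n \mid X_{n-1}, \ldots, X_1) \]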

Akaike Information Criterion: \[ \mbox{AIC} = -2 \, \log(\mbox{maximum likelihood}) + 2 k \]

Here \(k=p+1\) if a non-zero mean is included in the model, and \(k=p\) if not.

Choose the \(p\) that minimizes AIC.
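
The minimum-AIC rule can also be carried out by hand with base R. The sketch below fits AR(p) for p = 0, ..., 5 and picks the order with the smallest AIC; the upper limit `p.max` and the use of `method = "ML"` are assumptions made for this illustration.

```r
set.seed(3563)                                       # same seed as the earlier example
X = arima.sim(list(ar = c(.6, .3)), 100) + 8.5

p.max = 5                                            # largest order considered (arbitrary choice)
aic   = sapply(0:p.max, function(p) {
  AIC(arima(X, order = c(p, 0, 0), method = "ML"))   # full ML fit for every candidate p
})
names(aic) = paste0("p=", 0:p.max)
round(aic, 2)                                        # AIC for each candidate order
p.hat = (0:p.max)[which.min(aic)]                    # order with the smallest AIC
p.hat
```

R counts the innovation variance as an extra parameter when computing AIC for an `arima()` fit. This adds the same constant for every \(p\), so the ranking of the candidate orders is unchanged.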

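The auto.arima() function in the forecast package chooses \(p\) automatically by minimizing an information criterion. Its default is AICc, a small-sample correction of AIC; setting ic = "aic" selects plain AIC.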

AIC Simulation Study

If we use AIC all the time, what is the probability that we end up with the correct \(p\)?

Set \(p=2\). Generate a simulated AR(2) series. Use auto.arima(X) to search for the best \(p\) by its AIC-type criterion. Repeat 1000 times. How many times do we end up with the right \(p\)?

  library(forecast)


  #- Simulation Study of choosing p by AIC
  Result = numeric(1000)                    # storage for the chosen p
  for (i in 1:1000){
    X    = arima.sim(list(ar = c(.6, .3) ), 100 )  + 8.5
    Fit7 = auto.arima(X, d=0, max.q=0)        # AR(p) only; p chosen by auto.arima's default criterion
    Result[i] = length(Fit7$model$phi)        # save value of p chosen
  }
  head(Result, 100)                       # first 100 p

##   [1] 3 2 1 2 3 2 1 3 2 2 2 2 1 2 2 3 1 2 4 2 2 3 3 2 3 1 2 3 2 2
##  [31] 2 3 2 2 2 2 2 2 2 2 2 2 2 1 2 1 2 3 2 2 2 2 1 2 2 2 1 2 2 2
##  [61] 2 2 1 1 2 2 2 2 2 2 1 2 2 2 2 2 2 2 2 2 2 2 1 1 2 2 1 1 2 2
##  [91] 2 2 2 3 2 5 2 2 2 2

  mean(Result==2)                         # proportion of times p=2

## [1] 0.765

3. Use Parameter Significance

After \(p\) has been suggested by AIC, test the last parameter estimate for significance using the large-sample 95% CI, i.e. \(\hat \phi_p \pm 2 (S.E.)\). If the last parameter is not significant, reduce \(p\) by 1 and fit again.
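
The reduce-and-refit rule can be sketched as a loop. The starting order `p = 4` below is hypothetical, standing in for an order suggested by AIC; the standard errors are taken from `var.coef`, the estimated covariance matrix stored in the `arima()` fit.

```r
set.seed(3563)                                      # same seed as the earlier example
X = arima.sim(list(ar = c(.6, .3)), 100) + 8.5

p = 4                                               # hypothetical starting order
while (p >= 1) {
  Fit  = arima(X, order = c(p, 0, 0), method = "ML")
  last = paste0("ar", p)                            # name of the last AR coefficient
  est  = unname(coef(Fit)[last])                    # estimate of phi_p
  se   = unname(sqrt(diag(Fit$var.coef))[last])     # its large-sample standard error
  if (abs(est) > 2 * se) break                      # CI excludes 0: keep this p
  p = p - 1                                         # not significant: reduce p and refit
}
p                                                   # order kept by the rule
c(lower = est - 2 * se, upper = est + 2 * se)       # CI from the last fit that was examined
```

If the loop runs down to `p = 0`, no AR coefficient was significant and the interval shown belongs to the AR(1) fit that was just rejected.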

Ex: When AIC fails use Param Sig