( Stationarity Check ) Does \(X_t\) look stationary?
( De-mean ) Is the mean of this process 0?
( Model Selection ) What model should we use? AR(p)? or something else?
( Order Selection ) How did we decide on the value of \(p\) to fit?
ACF and PACF
AIC and log-likelihood
Parameter Significance
( Parameter Estimation ) What estimator was used for \(\phi_i\)? Can you reject the null hypothesis that \(\phi_i=0\)?
( Residual Analysis ) How was the residual calculated? What does it say about how the model fits the data?
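As a rough illustration of the residual-analysis question, the sketch below fits an AR(2) with base R's arima() to a simulated series and applies a Ljung-Box test to the residuals. The seed, the AR coefficients, the ML fitting method, and the choice of 10 lags are assumptions made for illustration only.

```r
# Minimal sketch (assumed example data): residual check for a fitted AR(2)
set.seed(1)                                         # arbitrary seed
X <- arima.sim(list(ar = c(.6, .3)), 100) + 8.5     # simulated AR(2) with mean 8.5
fit <- arima(X, order = c(2, 0, 0), method = "ML")  # intercept included by default
res <- residuals(fit)                               # one-step-ahead prediction errors
# Ljung-Box test; fitdf = 2 accounts for the two estimated AR coefficients
lb <- Box.test(res, lag = 10, type = "Ljung-Box", fitdf = 2)
lb$p.value  # a large p-value gives no evidence of leftover autocorrelation
```

If the model fits well, the residuals should behave like white noise, so acf(res) should show no clearly significant spikes.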
The ACF of an AR(\(p\)) process tails off; the PACF of an AR(\(p\)) process cuts off after lag \(p\). The theoretical PACF at lag \(p\) equals \(\phi_p\), and it is 0 for every lag \(h > p\).
set.seed(3563)
X = arima.sim(list(ar = c(.6, .3) ), 100) + 8.5 # simulate AR(2) with mean 8.5
plot(X, type="o")
layout(matrix(1:2, 1, 2)) # two plots side by side
acf(X) # plot sample ACF
pacf(X) # plot sample PACF
layout(1) # return to 1 plot layout
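For comparison with the sample plots, the theoretical ACF and PACF of the same AR(2) can be computed with base R's ARMAacf(). This is a small sketch; the variable names are chosen here for illustration. Note that the PACF at lag 1 is \(\phi_1/(1-\phi_2)\), not \(\phi_1\); only the PACF at lag \(p=2\) coincides with the coefficient \(\phi_2\).

```r
# Theoretical ACF and PACF of the AR(2) with phi1 = 0.6, phi2 = 0.3
phi <- c(.6, .3)
acf_theory  <- ARMAacf(ar = phi, lag.max = 5)               # lags 0..5
pacf_theory <- ARMAacf(ar = phi, lag.max = 5, pacf = TRUE)  # lags 1..5
round(acf_theory, 3)   # decays gradually: the ACF tails off
round(pacf_theory, 3)  # lag 2 equals phi2 = 0.3; lags 3 and beyond are 0
```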
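Likelihood function (the joint density of the observations, which for dependent data factors into conditional densities) \[ L(\theta) = f(X_1, \ldots, X_n) = f(X_1) \, f(X_2 \mid X_1) \, \cdots f(X_n \mid X_{n-1}, \ldots, X_1) \]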
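Akaike Information Criterion: \[ \mbox{AIC} = -2 \, \log(\mbox{maximum likelihood}) + 2 k \] where \(k=p+1\) if a non-zero mean is included in the model, and \(k=p\) if not. R also counts the innovation variance \(\sigma^2\) as a parameter, so the AIC values it reports use \(k+1\); this adds the same constant to every model and leaves the ranking unchanged.
Choose the \(p\) that minimizes AIC.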
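The AIC formula can be checked by hand against what R reports. The sketch below uses base R's arima() on a simulated series (seed and ML fitting method are assumptions for illustration) and counts \(\sigma^2\) as one extra parameter, which is how arima() computes its own AIC.

```r
# Minimal sketch: recompute AIC from the maximized log-likelihood
set.seed(2)                                         # arbitrary seed
X <- arima.sim(list(ar = c(.6, .3)), 100) + 8.5
fit <- arima(X, order = c(2, 0, 0), method = "ML")
k <- length(coef(fit)) + 1       # ar1, ar2, intercept, plus sigma^2
aic_manual <- -2 * fit$loglik + 2 * k
c(manual = aic_manual, reported = fit$aic)          # the two values agree
```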
auto.arima() in the forecast package chooses \(p\) automatically by minimizing an information criterion (AICc by default; set ic="aic" to use AIC itself).
If we always select \(p\) by AIC, what is the probability that we end up with the correct \(p\)?
Set \(p=2\). Generate a simulated AR(2) series. Use auto.arima(X) to search for the best \(p\) by minimum AIC. Repeat 1000 times. How many times do we end up with the right \(p\)?
library(forecast)
#- Simulation study of choosing p by AIC
Result = numeric(1000) # order p chosen in each replication
for (i in 1:1000){
  X = arima.sim(list(ar = c(.6, .3) ), 100 ) + 8.5
  Fit7 = auto.arima(X, d=0, max.q=0) # select AR(p): no differencing, no MA terms
  Result[i] = length(Fit7$model$phi) # save value of p chosen
}
head(Result, 100) # first 100 p
## [1] 3 2 1 2 3 2 1 3 2 2 2 2 1 2 2 3 1 2 4 2 2 3 3 2 3 1 2 3 2 2
## [31] 2 3 2 2 2 2 2 2 2 2 2 2 2 1 2 1 2 3 2 2 2 2 1 2 2 2 1 2 2 2
## [61] 2 2 1 1 2 2 2 2 2 2 1 2 2 2 2 2 2 2 2 2 2 2 1 1 2 2 1 1 2 2
## [91] 2 2 2 3 2 5 2 2 2 2
mean(Result==2) # proportion of times p=2
## [1] 0.765
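The same kind of experiment can be sketched with base R's ar(), which selects the order by AIC when aic = TRUE. This is not the code that produced the 0.765 above: it uses Yule-Walker estimation (the ar() default), only 200 replications, an arbitrary seed, and a maximum order of 5, all of which are assumptions made here to keep the sketch fast and self-contained.

```r
# Minimal sketch: how often does AIC-based selection in ar() pick p = 2?
set.seed(3)                                   # arbitrary seed
n_rep <- 200                                  # fewer replications than the study above
p_hat <- numeric(n_rep)
for (i in seq_len(n_rep)) {
  X <- arima.sim(list(ar = c(.6, .3)), 100) + 8.5
  p_hat[i] <- ar(X, aic = TRUE, order.max = 5)$order  # order with minimum AIC
}
table(p_hat)        # distribution of the selected orders
mean(p_hat == 2)    # proportion of replications with the true order
```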
After \(p\) is suggested by AIC, test the parameter estimates for significance using the large-sample 95% CI, i.e. \(\hat\phi_i \pm 2 \, \mbox{SE}\). If the last parameter \(\hat\phi_p\) is not significant (its interval contains 0), reduce \(p\) by 1 and fit again.
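One way to carry out this check in code is sketched below with base R's arima(). The seed, the ML fitting method, and the deliberately oversized order \(p=3\) are assumptions for illustration; the standard errors are taken from the estimated covariance matrix stored in the fit.

```r
# Minimal sketch: large-sample 95% intervals (estimate +/- 2 SE) for an AR fit
set.seed(4)                                         # arbitrary seed
X <- arima.sim(list(ar = c(.6, .3)), 100) + 8.5
fit <- arima(X, order = c(3, 0, 0), method = "ML")  # one lag more than the truth
est <- coef(fit)
se  <- sqrt(diag(fit$var.coef))                     # standard errors of the estimates
ci  <- cbind(estimate = est, lower = est - 2 * se, upper = est + 2 * se)
ci
significant <- ci[, "lower"] > 0 | ci[, "upper"] < 0  # TRUE if the interval excludes 0
significant["ar3"]  # if FALSE, drop to p = 2 and refit
```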
Example: a case where AIC selects \(p=3\) for data simulated from an AR(2).
Generate simulated AR(2) series as before and stop at the first one for which auto.arima() selects \(p=3\).
Then compare that fit with the AR(2) fit and check the significance of the last parameter.
set.seed(1532)
#- Run until p=3 comes out
for (i in 1:999){
  X = arima.sim(list(ar = c(.6, .3) ), 100 ) + 8.5
  Fit7 = auto.arima(X, d=0, max.q=0) # find AR(p) by AIC
  Result = length(Fit7$model$phi) # save value of p chosen
  if (Result==3){
    break
  }
}
auto.arima(X, d=0, max.q=0) # search for best p by AIC
## Series: X
## ARIMA(3,0,0) with non-zero mean
##
## Coefficients:
## ar1 ar2 ar3 mean
## 0.6314 0.4514 -0.1932 8.8675
## s.e. 0.0972 0.1063 0.0990 0.8142
##
## sigma^2 = 0.9548: log likelihood = -138.39
## AIC=286.79 AICc=287.43 BIC=299.82
Arima(X, order=c(2,0,0)) # fit AR(2)
## Series: X
## ARIMA(2,0,0) with non-zero mean
##
## Coefficients:
## ar1 ar2 mean
## 0.5633 0.3466 8.9965
## s.e. 0.0928 0.0938 0.9787
##
## sigma^2 = 0.9818: log likelihood = -140.26
## AIC=288.52 AICc=288.94 BIC=298.94
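Applying the \(\pm 2\) SE rule to the ar3 estimate in the ARIMA(3,0,0) output above is simple arithmetic. The sketch below only copies the two printed numbers; the variable names are chosen here for illustration.

```r
# Numbers copied from the ARIMA(3,0,0) output above
ar3_hat <- -0.1932
ar3_se  <-  0.0990
ci <- ar3_hat + c(-2, 2) * ar3_se   # large-sample 95% interval
ci                                  # approximately -0.3912 and 0.0048
ci[1] < 0 && ci[2] > 0              # TRUE: the interval covers 0
```

The interval narrowly covers 0, so by this rule ar3 is not significant and the AR(2) fit is preferred, even though the AR(3) fit has the slightly smaller AIC (286.79 versus 288.52).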
Plot the ACF and PACF to get a feel for the value of \(p\).
Use the minimum Akaike Information Criterion (AIC) to look for the best \(p\). auto.arima() runs this search automatically (it minimizes AICc by default; set ic="aic" to use AIC itself).
Check the significance of the estimated parameters.
After getting the suggested “best” \(p\) from AIC, fit the model with \(p+1\) and check that the last parameter estimate comes out non-significant.
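The last step (overfitting by one lag) might look like the sketch below. It uses base R's ar() for the AIC search and arima() for the \(p+1\) fit instead of auto.arima(); the seed, the maximum order of 5, and the ML fitting method are assumptions made for illustration.

```r
# Minimal sketch: pick p by AIC, then fit p + 1 and inspect the extra coefficient
set.seed(5)                                         # arbitrary seed
X <- arima.sim(list(ar = c(.6, .3)), 100) + 8.5
p_aic <- ar(X, aic = TRUE, order.max = 5)$order     # order suggested by minimum AIC
fit_over <- arima(X, order = c(p_aic + 1, 0, 0), method = "ML")
last <- paste0("ar", p_aic + 1)                     # name of the extra coefficient
z <- coef(fit_over)[last] / sqrt(diag(fit_over$var.coef))[last]
z            # estimate divided by its standard error
abs(z) > 2   # FALSE supports staying with p = p_aic
```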