The ARMA(1,1) model is defined as \[
X_t - \phi_1 X_{t-1} \hspace{3mm} = \hspace{3mm} \epsilon_t + \theta_1
\epsilon_{t-1}
\] where \(\epsilon_t\sim
WN(0,\sigma^2)\). Using the backshift operator \(B\) (defined by \(B X_t = X_{t-1}\)), this is the same as
\[
(1- \phi_1 B) \, X_t \hspace{3mm} = \hspace{3mm} (1 + \theta_1 B) \,
\epsilon_t
\] \[
{\large
\Phi(B) \, X_t \hspace{3mm} = \hspace{3mm} \Theta(B) \, \epsilon_t
}
\]
Similarly, we can write
\[
X_t - \phi_1 X_{t-1} - \phi_2 X_{t-2} - \cdots - \phi_p X_{t-p}
\hspace{3mm} = \hspace{3mm} \epsilon_t + \theta_1 \epsilon_{t-1} +
\theta_2 \epsilon_{t-2} + \cdots + \theta_q \epsilon_{t-q}
\] \[
(1- \phi_1 B - \phi_2 B^2 - \cdots - \phi_p B^p) \,\, X_t
\hspace{3mm} = \hspace{3mm} (1 + \theta_1 B + \theta_2 B^2 + \cdots +
\theta_q B^q) \,\, \epsilon_t
\] \[
{\large
\Phi(B) \,\, X_t \hspace{3mm} = \hspace{3mm} \Theta(B) \,\,
\epsilon_t
}
\]
For example, an ARMA(3,2) model looks like \[ {\large X_t - \phi_1 X_{t-1} - \phi_2 X_{t-2} - \phi_3 X_{t-3} \hspace{3mm} = \hspace{3mm} \epsilon_t + \theta_1 \epsilon_{t-1} + \theta_2 \epsilon_{t-2} } \]
\[ {\large (1- \phi_1 B - \phi_2 B^2 - \phi_3 B^3) \, X_t \hspace{3mm} = \hspace{3mm} (1 + \theta_1 B + \theta_2 B^2) \, \epsilon_t } \]
\[
{\large
\Phi(B) X_t \hspace{3mm} = \hspace{3mm} \Theta(B) \epsilon_t
}
\]
If the AR characteristic polynomial \(\Phi(z)\) has all of its roots outside the
unit circle, the ARMA(\(p,q\)) process is causal and can be represented as \[
{\large
X_t \hspace{3mm} = \hspace{3mm} \sum_{i=0}^\infty \psi_i \, \epsilon_{t-i}
}
\] with an absolutely summable sequence \(\psi_i\). Such a process is stationary.
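For instance, for a causal ARMA(1,1) the \(\psi\)-weights can be read off by expanding \(\Theta(B)/\Phi(B)\) as a geometric series (shown here for concreteness): \[
\psi(B) \;=\; \frac{1+\theta_1 B}{1-\phi_1 B}
\;=\; (1+\theta_1 B)\sum_{j=0}^\infty \phi_1^{\,j} B^{\,j}
\quad\Longrightarrow\quad
\psi_0 = 1, \qquad \psi_j = (\phi_1+\theta_1)\,\phi_1^{\,j-1} \;\; (j\ge 1),
\] which is absolutely summable whenever \(|\phi_1|<1\).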
If the MA characteristic polynomial \(\Theta(z)\) has all of its roots outside the unit circle, the ARMA(\(p,q\)) process is invertible and can be represented as \[ {\large \epsilon_t \hspace{3mm} = \hspace{3mm} \sum_{i=0}^\infty \pi_i \, X_{t-i} } \] with an absolutely summable sequence \(\pi_i\).
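As a sketch, the ARMA(1,1) weights can be checked numerically with stats::ARMAtoMA(), using assumed illustration values \(\phi_1=0.5\), \(\theta_1=0.4\); the closed forms \(\psi_j=(\phi_1+\theta_1)\phi_1^{j-1}\) and \(\pi_j=(-1)^j(\phi_1+\theta_1)\theta_1^{j-1}\) come from expanding \(\Theta(B)/\Phi(B)\) and \(\Phi(B)/\Theta(B)\) respectively:

```r
# Numerical check of the ARMA(1,1) psi- and pi-weight formulas
phi <- 0.5; theta <- 0.4                               # assumed example values
psi <- ARMAtoMA(ar = phi, ma = theta, lag.max = 5)     # psi_1, ..., psi_5
all.equal(psi, (phi + theta) * phi^(0:4))              # TRUE
# pi-weights of (phi, theta) are the psi-weights of (-theta, -phi),
# since Phi(B)/Theta(B) = (1 + (-phi)B) / (1 - (-theta)B)
pi_wts <- ARMAtoMA(ar = -theta, ma = -phi, lag.max = 5)      # pi_1, ..., pi_5
all.equal(pi_wts, (-1)^(1:5) * (phi + theta) * theta^(0:4))  # TRUE
```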
We assume from now on that all ARMA(\(p,q\)) processes are
causal and invertible. Again, nothing is lost in this assumption.
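Both conditions are easy to check numerically with polyroot(); a quick sketch, with assumed example coefficients:

```r
# Check causality/invertibility from the roots of the characteristic polynomials
phi   <- c(.4, .2)                 # assumed AR coefficients
theta <- c(.6, .2)                 # assumed MA coefficients
ar_roots <- polyroot(c(1, -phi))   # roots of Phi(z) = 1 - phi1 z - phi2 z^2
ma_roots <- polyroot(c(1, theta))  # roots of Theta(z) = 1 + theta1 z + theta2 z^2
all(Mod(ar_roots) > 1)             # TRUE -> causal
all(Mod(ma_roots) > 1)             # TRUE -> invertible
```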
For an ARMA(\(p,q\)) process, both the ACF and the PACF tail off gradually rather than cutting off after a fixed lag (unlike the pure AR and MA cases).
phis = c(.4, .2)
thetas = c(.6, .2)
# sign is like [Brockwell]: (1-phi B) X_t = (1+theta B) \epsilon_t
Th.acf = ARMAacf(ar=phis, ma=thetas, lag.max=20) # Theoretical ACF
Th.pacf = ARMAacf(ar=phis, ma=thetas, lag.max=20, pacf=T) # Theoretical PACF
#--- Basic Simulation with ARMA(p,q) ---
mu = 5
x = arima.sim(n = 250, list(ar = phis, ma = thetas)) + mu # Simulate ARMA(2,2)
plot(x, type="o");
layout(matrix(1:2, 1, 2))
acf(x); lines(0:20, Th.acf, type='p', col="red") # sample ACF; Theo ACF
pacf(x); lines(1:20, Th.pacf, type='p', col="red") # sample PACF; Theo PACF
layout(1)
A bias-corrected version of AIC was suggested by Hurvich and Tsai (1989): \[ \mbox{AICC} = - 2 \log(\mbox{maximized likelihood}) + \frac{2(k+1)(k+2)}{n-k-2} \] where \(k=p+q+1\) if a non-zero mean is in the model, and \(k=p+q\) if no mean is in the model.
Compare this to AIC, which was designed to be an approximately unbiased estimate of the Kullback–Leibler index of the fitted model relative to the true model: \[ {\large \mbox{AIC} = - 2 \log(\mbox{maximized likelihood}) + 2k } \]
The Bayesian Information Criterion: \[
{\large
\mbox{BIC} = - 2 \log(\mbox{maximized likelihood}) + k \log(n)
}
\] where \(k=p+q+1\) if a non-zero
mean is in the model, and \(k=p+q\) if
no mean is in the model.
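A minimal sketch of computing the three criteria by hand from a fitted model's log-likelihood, using the formulas above. The seed, simulated series, and ARMA(2,2) fit are assumptions for illustration; package-reported values may differ slightly because stats::arima() and forecast count parameters a bit differently:

```r
# Compute AIC, AICC, and BIC by hand from the formulas above
set.seed(42)                                    # assumed seed
y   <- arima.sim(n = 250, list(ar = c(.6, .2), ma = c(.4, .2))) + 5
fit <- arima(y, order = c(2, 0, 2))             # ARMA(2,2) with a mean term
ll  <- as.numeric(logLik(fit))
n   <- length(y)
k   <- 2 + 2 + 1                                # p + q + 1 (non-zero mean)
aic  <- -2*ll + 2*k
aicc <- -2*ll + 2*(k+1)*(k+2)/(n-k-2)
bic  <- -2*ll + k*log(n)
c(AIC = aic, AICC = aicc, BIC = bic)
```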
How much better is AICC than AIC? In what scenario?
Simulate an ARMA(2,1) with non-zero mean.
Pick the ARMA(\(p,q\)) order based on AICc, AIC, and BIC.
Repeat 1000 times and see how many times each criterion picked the right \((p,q)\).
library(forecast) #if not installed, do: install.packages("forecast")
# initialize the object that saves result
Result1 = Result2 = Result3 = matrix(0, 1000, 7)
Result4 = Result5 = Result6 = matrix(0, 1000, 7)
for (i in 1:1000){
Y = arima.sim(list(ar = c(.6, -.6), ma = c(.8)), 100) + 10 # case 1
#Y = arima.sim(list(ar = c(.6, .3), ma = c(.5)), 100) + 10 # case 2
#- picks model based on AICC, AIC, and BIC
Fit1 = auto.arima(Y, max.order=6, max.d=0, max.D=0, ic=c("aicc"))
Fit2 = auto.arima(Y, max.order=6, max.d=0, max.D=0, ic=c("aic"))
Fit3 = auto.arima(Y, max.order=6, max.d=0, max.D=0, ic=c("bic"))
#- all combo method (takes time)
Fit4 = auto.arima(Y, max.order=6, max.d=0, max.D=0, ic=c("aicc"), stepwise=FALSE)
print(i) #- print order on screen (optional)
Result1[i,] = Fit1$arma # checks if it picked ARMA(2,1) with mean
Result2[i,] = Fit2$arma
Result3[i,] = Fit3$arma
Result4[i,] = Fit4$arma
}
R1 = apply( Result1, 1, function(x){ all(x == c(2,1,0,0,1,0,0)) } ) # exact match; setequal() would also accept permuted orders
R2 = apply( Result2, 1, function(x){ all(x == c(2,1,0,0,1,0,0)) } )
R3 = apply( Result3, 1, function(x){ all(x == c(2,1,0,0,1,0,0)) } )
R4 = apply( Result4, 1, function(x){ all(x == c(2,1,0,0,1,0,0)) } )
c(mean(R1), mean(R2), mean(R3), mean(R4))
# Results
# AICc AIC BIC AICc w All
# 0.867 0.835 0.943 0.716 for Case 1 ar=(.6, -.6) ma=c(.8) mu=10
# 0.352 0.374 0.172 0.293 for Case 2 ar=(.6, .5) ma=c(.8) mu=10
In these two cases the stepwise method picked the correct \((p,q)\) more frequently than the all-combinations method.
Still, use stepwise=FALSE when time allows.