We start with the AR(\(p\)) equation. Let \(p=3\) for now, \[ X_t - \phi_1 X_{t-1} - \phi_2 X_{t-2} - \phi_3 X_{t-3} \hspace3mm = \hspace3mm \epsilon_t. \] We multiply both sides by \(X_t\) and take the expectation, \[ E\Big( X_t \, [X_t - \phi_1 X_{t-1} - \phi_2 X_{t-2} - \phi_3 X_{t-3}] \Big) \hspace3mm = \hspace3mm E\Big( X_t \, [ \epsilon_t ]\Big). \]
Recall the formula for covariance, \[ \mbox{Cov}(X_t,X_{t-1}) = E(X_t \, X_{t-1}) - E(X_t)E(X_{t-1}). \] So if \(E(X_t)=0\), then \[ \gamma(1) = E(X_t \, X_{t-1}). \]
Then the equation \[ E\Big( X_t \, [X_t - \phi_1 X_{t-1} - \phi_2 X_{t-2} - \phi_3 X_{t-3}] \Big) \hspace3mm = \hspace3mm E\Big( X_t \, [ \epsilon_t ]\Big). \] can be written as \[ \gamma(0) - \phi_1 \gamma(1) - \phi_2 \gamma(2) - \phi_3 \gamma(3) = \sigma^2 \hspace30mm \mbox{ (Eqn 1) } \]
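The right-hand side of (Eqn 1) takes one more step to justify. As a sketch, assume the process is causal, so that \(\epsilon_t\) is uncorrelated with the past values \(X_{t-1}, X_{t-2}, X_{t-3}\), and that \(\epsilon_t\) has mean 0 and variance \(\sigma^2\). Substituting the AR(3) equation for \(X_t\) then gives \[ E( X_t \, \epsilon_t ) \hspace3mm = \hspace3mm E\Big( [\phi_1 X_{t-1} + \phi_2 X_{t-2} + \phi_3 X_{t-3} + \epsilon_t] \, \epsilon_t \Big) \hspace3mm = \hspace3mm 0 + 0 + 0 + E(\epsilon_t^2) \hspace3mm = \hspace3mm \sigma^2. \] On the left-hand side, each term \(E(X_t \, X_{t-h})\) equals \(\gamma(h)\) because \(E(X_t)=0\).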
Now repeat the process starting with the equation for \(X_{t+1}\) instead of \(X_t\), \[ X_{t+1} - \phi_1 X_{t} - \phi_2 X_{t-1} - \phi_3 X_{t-2} \hspace3mm = \hspace3mm \epsilon_{t+1}. \] Multiplying both sides by \(X_t\) and taking the expectation gives \[ E\Big( X_{t} \, [X_{t+1} - \phi_1 X_{t} - \phi_2 X_{t-1} - \phi_3 X_{t-2}] \Big) \hspace3mm = \hspace3mm E\Big( X_t \, [ \epsilon_{t+1} ]\Big). \] Since \(\epsilon_{t+1}\) is uncorrelated with \(X_t\), the right-hand side is 0, and we get \[ \gamma(1) - \phi_1 \gamma(0) - \phi_2 \gamma(1) - \phi_3 \gamma(2) \hspace3mm = \hspace3mm 0. \hspace30mm \mbox{ (Eqn 2) } \]
If we use the AR(3) equation for \(X_{t+2}\) instead of for \(X_t\), we get \[ E\Big( X_{t} \, [X_{t+2} - \phi_1 X_{t+1} - \phi_2 X_{t} - \phi_3 X_{t-1}] \Big) \hspace3mm = \hspace3mm E\Big( X_t \, [ \epsilon_{t+2} ]\Big). \] \[ \gamma(2) - \phi_1 \gamma(1) - \phi_2 \gamma(0) - \phi_3 \gamma(1) \hspace3mm = \hspace3mm 0. \hspace30mm \mbox{ (Eqn 3) } \]
Repeating the process one more time with the equation for \(X_{t+3}\), we get the four equations \[ \gamma(0) - \phi_1 \gamma(1) - \phi_2 \gamma(2) - \phi_3 \gamma(3) \hspace3mm = \hspace3mm \sigma^2\\ \\ \gamma(1) - \phi_1 \gamma(0) - \phi_2 \gamma(1) - \phi_3 \gamma(2) \hspace3mm = \hspace3mm 0\\ \gamma(2) - \phi_1 \gamma(1) - \phi_2 \gamma(0) - \phi_3 \gamma(1) \hspace3mm = \hspace3mm 0\\ \gamma(3) - \phi_1 \gamma(2) - \phi_2 \gamma(1) - \phi_3 \gamma(0) \hspace3mm = \hspace3mm 0. \]
We set aside the first equation, and re-write the remaining equations \[ \gamma(1) - \phi_1 \gamma(0) - \phi_2 \gamma(1) - \phi_3 \gamma(2) \hspace3mm = \hspace3mm 0\\ \gamma(2) - \phi_1 \gamma(1) - \phi_2 \gamma(0) - \phi_3 \gamma(1) \hspace3mm = \hspace3mm 0\\ \gamma(3) - \phi_1 \gamma(2) - \phi_2 \gamma(1) - \phi_3 \gamma(0) \hspace3mm = \hspace3mm 0 \] as \[ \phi_1 \gamma(0) + \phi_2 \gamma(1) + \phi_3 \gamma(2) \hspace3mm = \hspace3mm \gamma(1) \\ \phi_1 \gamma(1) + \phi_2 \gamma(0) + \phi_3 \gamma(1) \hspace3mm = \hspace3mm \gamma(2) \\ \phi_1 \gamma(2) + \phi_2 \gamma(1) + \phi_3 \gamma(0) \hspace3mm = \hspace3mm \gamma(3), \] which can be put in matrix form as \[ \left[ \begin{array}{ccc} \gamma(0) & \gamma(1) & \gamma(2) \\ \gamma(1) & \gamma(0) & \gamma(1) \\ \gamma(2) & \gamma(1) & \gamma(0) \\ \end{array} \right] \hspace2mm \left[ \begin{array}{c} \phi_1 \\ \phi_2 \\ \phi_3\\ \end{array}\right] = \left[ \begin{array}{c} \gamma(1) \\ \gamma(2) \\ \gamma(3) \\ \end{array} \right]. \]
We still have the first equation, \[ \gamma(0) - \phi_1 \gamma(1) - \phi_2 \gamma(2) - \phi_3 \gamma(3) \hspace3mm = \hspace3mm \sigma^2. \]
These two equations are called the Yule-Walker equations.
For AR(3), the Y-W equations are: \[ \gamma(0) - \phi_1 \gamma(1) - \phi_2 \gamma(2) - \phi_3 \gamma(3) \hspace3mm = \hspace3mm \sigma^2 \] \[ \left[ \begin{array}{ccc} \gamma(0) & \gamma(1) & \gamma(2) \\ \gamma(1) & \gamma(0) & \gamma(1) \\ \gamma(2) & \gamma(1) & \gamma(0) \\ \end{array} \right] \hspace2mm \left[ \begin{array}{c} \phi_1 \\ \phi_2 \\ \phi_3\\ \end{array}\right] \hspace3mm = \hspace3mm \left[ \begin{array}{c} \gamma(1) \\ \gamma(2) \\ \gamma(3) \\ \end{array} \right]. \] The matrix equation can be solved for \(\phi_i\), \[ \left[ \begin{array}{c} \phi_1 \\ \phi_2 \\ \phi_3\\ \end{array}\right] \hspace3mm = \hspace3mm \left[ \begin{array}{ccc} \gamma(0) & \gamma(1) & \gamma(2) \\ \gamma(1) & \gamma(0) & \gamma(1) \\ \gamma(2) & \gamma(1) & \gamma(0) \\ \end{array} \right] ^{-1} \hspace2mm \left[ \begin{array}{c} \gamma(1) \\ \gamma(2) \\ \gamma(3) \\ \end{array} \right]. \]
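The Y-W equations for AR(3) can be checked numerically. The following is a minimal R sketch, not part of the original notes: the values in `phi` and `sigma2` are hypothetical, and the theoretical autocorrelations are taken from `ARMAacf()` in the stats package.

```r
# Hypothetical stationary AR(3) coefficients and noise variance (assumed values)
phi    = c(0.5, 0.2, 0.1)
sigma2 = 1

# Theoretical autocorrelations rho(0), ..., rho(3)
rho = unname(ARMAacf(ar = phi, lag.max = 3))

# Dividing the first Y-W equation by gamma(0) gives
# gamma(0) = sigma^2 / (1 - phi_1 rho(1) - phi_2 rho(2) - phi_3 rho(3))
gamma0  = sigma2 / (1 - sum(phi * rho[2:4]))
gamma.h = gamma0 * rho                 # gamma(0), ..., gamma(3)

Gamma3 = toeplitz(gamma.h[1:3])        # 3x3 matrix with entries gamma(|i - j|)
gamma3 = gamma.h[2:4]                  # (gamma(1), gamma(2), gamma(3))

phi.solved = solve(Gamma3, gamma3)     # solve the matrix equation for phi
phi.solved
```

Solving the matrix equation returns the coefficients that were used to build the ACVF, which is the sense in which the ACVF and the AR parameters determine each other.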
We can use the Y-W equations in reverse, plugging in the sample ACVF to estimate the parameters. \[ \left[ \begin{array}{c} \hat \phi_1 \\ \hat \phi_2 \\ \hat \phi_3\\ \end{array}\right] \hspace3mm = \hspace3mm \left[ \begin{array}{ccc} \hat \gamma(0) & \hat \gamma(1) & \hat \gamma(2) \\ \hat \gamma(1) & \hat \gamma(0) & \hat \gamma(1) \\ \hat \gamma(2) & \hat \gamma(1) & \hat \gamma(0) \\ \end{array} \right]^{-1} \hspace2mm \left[ \begin{array}{c} \hat \gamma(1) \\ \hat \gamma(2) \\ \hat \gamma(3) \\ \end{array} \right]. \\ \\ \mathbf{\hat \phi_3} \hspace3mm = \hspace3mm \mathbf{ \hat \Gamma^{-1}_3} \hspace2mm \mathbf{\hat \gamma_3} \]
\[
\hat \sigma^2 \hspace3mm = \hspace3mm \hat \gamma(0) - \hat \phi_1 \hat \gamma(1) - \hat \phi_2 \hat \gamma(2) - \hat \phi_3 \hat \gamma(3)
\]
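The estimator above can be computed by hand from the sample ACVF. The sketch below is an illustration under assumptions: the series `Ysim`, the seed, and the sample size are made up for the example, and the comparison with `ar()` relies on `ar()` using the Yule-Walker method by default.

```r
set.seed(1)                                          # arbitrary seed
Ysim = arima.sim(list(ar = c(0.6, 0.3)), n = 200)    # simulated AR(2) series

# Sample ACVF: gamma-hat(0), ..., gamma-hat(3)
g = drop(acf(Ysim, lag.max = 3, type = "covariance", plot = FALSE)$acf)

Gamma.hat = toeplitz(g[1:3])       # 3x3 matrix of sample autocovariances
gamma.hat = g[2:4]                 # (gamma-hat(1), gamma-hat(2), gamma-hat(3))

phi.hat   = solve(Gamma.hat, gamma.hat)         # Y-W estimates of phi
sigSq.hat = g[1] - sum(phi.hat * gamma.hat)     # Y-W estimate of sigma^2

Fit = ar(Ysim, aic = FALSE, order.max = 3)      # same fit via ar()
rbind(by.hand = phi.hat, ar = Fit$ar)
```

The two rows should agree up to numerical precision. The value in `Fit$var.pred` may differ slightly from `sigSq.hat`, since `ar()` appears to apply a degrees-of-freedom adjustment to the variance estimate.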
layout(1) # reset the layout
Fit2 = ar(Y, aic=FALSE, order.max=3) # fits AR(3) by Y-W method(default)
#Fit2 = ar(Y, aic=F, order.max=2, method="mle") # fits AR(2) by MLE
?ar # opens help page of function ar()
\(\mathbf{\hat \phi_p}\) is approximately multivariate normal, \(N_p(\mathbf{\phi_p}, \, \sigma^2 \mathbf \Gamma^{-1}_p/n)\), when \(n\) is large. (B-D p141)
That means an approximate 95% confidence interval for \(\phi_{j}\) is \[ \hat \phi_{j} \pm 1.96 \sqrt{ \frac{\hat{\sigma}^2 \, \hat{\Gamma}^{-1}_{jj}}{n} } \] where \(\hat{\Gamma}^{-1}_{jj}\) is the \(j\)th diagonal element of the inverse matrix \(\mathbf{\hat{\Gamma}^{-1}}\).
If you use the ar() function, the $asy.var.coef component of the output is the estimate of \(\sigma^2 \mathbf \Gamma^{-1}/n\).
Y = arima.sim(list(ar = c(.6, .3) ), 100 ) # Simulate AR(2)
Fit2 = ar(Y, aic=FALSE, order.max=3) # fit AR(3) by Y-W(default)
Fit2
##
## Call:
## ar(x = Y, aic = FALSE, order.max = 3)
##
## Coefficients:
## 1 2 3
## 0.4824 0.2901 0.0135
##
## Order selected 3 sigma^2 estimated as 1.013
str(Fit2) # list the components of the fitted object
## List of 15
## $ order : num 3
## $ ar : num [1:3] 0.4824 0.2901 0.0135
## $ var.pred : num 1.01
## $ x.mean : num -0.776
## $ aic : Named num [1:4] 70.25 7.21 0 1.98
## ..- attr(*, "names")= chr [1:4] "0" "1" "2" "3"
## $ n.used : int 100
## $ n.obs : int 100
## $ order.max : num 3
## $ partialacf : num [1:3, 1, 1] 0.6915 0.2966 0.0135
## $ resid : Time-Series [1:100] from 1 to 100: NA NA NA -0.215 -0.64 ...
## $ method : chr "Yule-Walker"
## $ series : chr "Y"
## $ frequency : num 1
## $ call : language ar(x = Y, aic = FALSE, order.max = 3)
## $ asy.var.coef: num [1:3, 1:3] 0.01041 -0.00507 -0.00309 -0.00507 0.01196 ...
## - attr(*, "class")= chr "ar"
phi.hat = Fit2$ar # phi-hats
sigSq.hat = Fit2$var.pred # sig^2 hat
Fit2$asy.var.coef # Variance of phi-hats
## [,1] [,2] [,3]
## [1,] 0.010414782 -0.005065612 -0.003089063
## [2,] -0.005065612 0.011962401 -0.005065612
## [3,] -0.003089063 -0.005065612 0.010414782
phi.hat # print phi-hats
## [1] 0.48239711 0.29006128 0.01345101
sqrt(diag(Fit2$asy.var.coef)) # standard errors of phi-hats (diagonal only; off-diagonal covariances are negative)
## [1] 0.1020528 0.1093728 0.1020528
## [,1] [,2]
## [1,] 0.009498555 -0.006568091
## [2,] -0.006568091 0.009498555
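The 95% confidence interval formula can be applied directly to the output of `ar()`. This is a minimal sketch under assumptions: the series `Ysim` and the seed are made up for the example, so the numbers will not match the output shown above.

```r
set.seed(1)                                          # arbitrary seed
Ysim = arima.sim(list(ar = c(0.6, 0.3)), n = 100)    # simulated AR(2) series
Fit  = ar(Ysim, aic = FALSE, order.max = 3)          # fit AR(3) by Y-W

# Standard errors: square roots of the diagonal of the covariance matrix.
# Take diag() first, because the off-diagonal covariances can be negative.
SE = sqrt(diag(Fit$asy.var.coef))

CI = cbind(lower    = Fit$ar - 1.96 * SE,
           estimate = Fit$ar,
           upper    = Fit$ar + 1.96 * SE)
CI
```

An interval for \(\phi_3\) that contains 0 is consistent with the data coming from a model of order lower than 3.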
Recall the Yule-Walker equation, \[ \left[ \begin{array}{c} \phi_1 \\ \phi_2 \\ \phi_3\\ \end{array}\right] \hspace3mm = \hspace3mm \left[ \begin{array}{ccc} \gamma(0) & \gamma(1) & \gamma(2) \\ \gamma(1) & \gamma(0) & \gamma(1) \\ \gamma(2) & \gamma(1) & \gamma(0) \\ \end{array} \right]^{-1} \hspace2mm \left[ \begin{array}{c} \gamma(1) \\ \gamma(2) \\ \gamma(3) \\ \end{array} \right]. \] \[ \mathbf{\phi_3} \hspace3mm = \hspace3mm \mathbf{ \Gamma^{-1}_3} \hspace2mm \mathbf{\gamma_3} \] Note that this equation can be extended to more than 3 parameters, even though we only have 3 \(\phi\)’s.
For example, if we use \(\mathbf{\Gamma_5}\), then \[
\left[ \begin{array}{c}
\phi_1 \\ \phi_2 \\ \phi_3\\ 0\\ 0\\
\end{array}\right]
\hspace3mm = \hspace3mm
\left[ \begin{array}{ccccc}
\gamma(0) & \gamma(1) &\gamma(2) & \gamma(3) & \gamma(4) \\
\gamma(1) & \gamma(0) &\gamma(1) & \gamma(2) & \gamma(3) \\
\gamma(2) & \gamma(1) &\gamma(0) & \gamma(1) & \gamma(2) \\
\gamma(3) & \gamma(2) &\gamma(1) & \gamma(0) & \gamma(1) \\
\gamma(4) & \gamma(3) &\gamma(2) & \gamma(1) & \gamma(0) \\
\end{array} \right]^{-1}
\hspace2mm
\left[ \begin{array}{c}
\gamma(1) \\ \gamma(2) \\ \gamma(3) \\ \gamma(4) \\ \gamma(5) \\
\end{array} \right] \\
\] \[
\mathbf{\phi_5} \hspace3mm = \hspace3mm
\mathbf{ \Gamma^{-1}_5} \hspace2mm \mathbf{\gamma_5}.
\]
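The extension to \(\mathbf{\Gamma_5}\) can be checked numerically. In this sketch the AR(3) coefficients in `phi` are hypothetical. Autocorrelations from `ARMAacf()` are used in place of autocovariances, which leaves the solution unchanged because both sides of the equation are divided by \(\gamma(0)\).

```r
phi = c(0.5, 0.2, 0.1)                          # hypothetical AR(3) coefficients
rho = unname(ARMAacf(ar = phi, lag.max = 5))    # rho(0), ..., rho(5)

R5   = toeplitz(rho[1:5])        # 5x5 matrix with entries rho(|i - j|)
phi5 = solve(R5, rho[2:6])       # extended Y-W solution of length 5
round(phi5, 10)
```

The first three elements reproduce `phi`, and the last two are zero up to rounding.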
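The partial ACF at lag \(h\), \(\alpha(h)\), is defined as the last element of the vector \[ \mathbf{\phi_h} \hspace3mm = \hspace3mm \mathbf{ \Gamma^{-1}_h} \hspace2mm \mathbf{\gamma_h}. \] For AR(p), \[ \left\{ \begin{array}{ll} \alpha(0) = 1 \\ \alpha(p) = \phi_p \\ \alpha(h) = 0 & \mbox{ if } p < h \\ \end{array} \right. \] For \(1 \leq h < p\), \(\alpha(h)\) is the last element of \(\mathbf{\phi_h}\), which in general differs from the AR coefficient \(\phi_h\). In the ar() output above, $partialacf and $ar agree only in the last element.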
PACF of AR(\(p\)) cuts off after lag \(p\).
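The definition can be turned into a short computation. The sketch below uses hypothetical AR(3) coefficients in `phi`, takes the last element of each solution vector for lags 1 through 6, and compares the result with the PACF returned by `ARMAacf(..., pacf = TRUE)`.

```r
phi = c(0.5, 0.2, 0.1)                          # hypothetical AR(3) coefficients
rho = unname(ARMAacf(ar = phi, lag.max = 6))    # rho(0), ..., rho(6)

# alpha(h) = last element of the solution of the h x h system
pacf.by.hand = sapply(1:6, function(h) {
  phi.h = solve(toeplitz(rho[1:h]), rho[2:(h + 1)])
  phi.h[h]
})

pacf.builtin = as.numeric(ARMAacf(ar = phi, lag.max = 6, pacf = TRUE))
round(rbind(pacf.by.hand, pacf.builtin), 6)
```

The value at lag 3 equals \(\phi_3\), and the values at lags 4 to 6 are zero up to rounding, which is the cut-off after lag \(p\).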
The sample version of the PACF at lag \(h\), \(\hat \alpha(h)\), is the last element of \[ \mathbf{\hat \phi_h} \hspace3mm = \hspace3mm \mathbf{ \hat \Gamma^{-1}_h} \hspace2mm \mathbf{\hat \gamma_h}. \] For data from an AR(\(p\)) process, \(\hat \alpha(h)\) is approximately \(N(0, 1/n)\) for \(h > p\) when \(n\) is large, so the sample PACF stays within \(\pm 1.96/\sqrt{n}\) at most lags beyond \(p\).
\(\alpha(k)\) is the correlation between prediction errors \[
X_k - \mbox{Pred}(X_k|X_1,\cdots,X_{k-1})
\hspace2mm \mbox{ and } \hspace2mm
X_0 - \mbox{Pred}(X_0|X_1,\cdots,X_{k-1}).
\]
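The sample PACF can be computed the same way from the sample ACF. This sketch uses a made-up simulated series `Ysim` and compares the by-hand values with `pacf()`. The agreement relies on `pacf()` being computed from the same sample autocorrelations, which is an assumption about its internals.

```r
set.seed(1)                                          # arbitrary seed
Ysim = arima.sim(list(ar = c(0.6, 0.3)), n = 200)    # simulated AR(2) series

# Sample autocorrelations rho-hat(0), ..., rho-hat(5)
r = drop(acf(Ysim, lag.max = 5, plot = FALSE)$acf)

# alpha-hat(h) = last element of the solution of the h x h sample system
pacf.by.hand = sapply(1:5, function(h) {
  solve(toeplitz(r[1:h]), r[2:(h + 1)])[h]
})

pacf.R = drop(pacf(Ysim, lag.max = 5, plot = FALSE)$acf)
bound  = 1.96 / sqrt(length(Ysim))     # approximate 95% bound under AR(p), h > p
round(rbind(pacf.by.hand, pacf.R), 4)
```

Values beyond lag 2 that fall inside `-bound` to `bound` are consistent with the AR(2) model used in the simulation.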
# read in daily Dow Jones data from my website
D = read.csv("https://nmimoto.github.io//datasets/dowj.csv")
D1 = ts(D, start=c(1,1), freq=1) # turn the data into TS object
Y1 = diff(log(D1)) # Take log-difference log(X_{t}) - log(X_{t-1})
Y = Y1[-1] # remove the first observation because it's NA
head(Y)
## [1] -0.0023516653 0.0011765240 0.0017170489 0.0008123111 -0.0034342555
## [6] 0.0009048955
layout(matrix(1:2, 1, 2)) # make next two plot side by side (you don't have to do this)
acf(Y) # Sample ACF
pacf(Y) # Sample Partial ACF (see below)
One method for estimating the parameters of AR(\(p\)) is to use the Yule-Walker estimator.
The Yule-Walker estimator makes use of the relationship between the ACVF and the AR(\(p\)) parameters.
When \(n\) is large, the standard error of the Y-W estimator can be calculated using the large-sample formula.
Partial ACF (PACF) is defined on p48 of Cryer, and it is as characteristic of an AR(\(p\)) process as the ACF.
The ACF of an AR(\(p\)) process decays, whereas the PACF of an AR(\(p\)) process cuts off after lag \(p\).