


1. Forecasting ARMA model


1-step forecasting

If we have an ARMA(2,2) model, the formula for the next observation is \[ {\large X_{n+1} = \phi_1 X_n + \phi_2 X_{n-1} + e_{n+1} + \theta_1 e_n + \theta_2 e_{n-1} } \]


The best 1-step predictor replaces the unpredictable future error \(e_{n+1}\) by its mean, 0:

\[ {\large \hat X(1) = \phi_1 X_n + \phi_2 X_{n-1} + \theta_1 e_n + \theta_2 e_{n-1} } \]


Since we don't know the parameters or the true error terms, we replace them with their estimates:

\[ {\large \hat X(1) = \hat \phi_1 X_n + \hat \phi_2 X_{n-1} + \hat \theta_1 \hat e_n + \hat \theta_2 \hat e_{n-1} } \]


Then, as in the AR(p) and MA(q) cases, the 1-step prediction MSE is

\[ {\large pMSE_1 = \sigma^2 } \] which is the variance of the error term \(e_t\).
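
To make this concrete, here is a minimal Python sketch of the 1-step forecast for a fitted zero-mean ARMA(2,2). The estimates `phi_hat`, `theta_hat` and the residuals `e_hat` are hypothetical placeholders, not values from the notes:

```python
import numpy as np

def one_step_forecast(x, e_hat, phi_hat, theta_hat):
    """1-step forecast for a fitted zero-mean ARMA(2,2):
    X_hat(1) = phi1*X_n + phi2*X_{n-1} + theta1*e_hat_n + theta2*e_hat_{n-1}."""
    ar_part = phi_hat[0] * x[-1] + phi_hat[1] * x[-2]
    ma_part = theta_hat[0] * e_hat[-1] + theta_hat[1] * e_hat[-2]
    return ar_part + ma_part

# Hypothetical estimates and data, for illustration only
phi_hat = np.array([0.5, -0.3])
theta_hat = np.array([0.4, 0.2])
x = np.array([1.2, -0.7, 0.3, 0.9])       # observed series
e_hat = np.array([0.1, -0.2, 0.05, 0.3])  # estimated residuals

print(one_step_forecast(x, e_hat, phi_hat, theta_hat))
```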




h-step forecasting

As in the AR(p) case, the prediction tends to 0 (or to the mean of the series) as \(h\) increases.


The long-run prediction MSE is the unconditional variance \[ {\large pMSE_{\infty} = V(X_t) = \gamma(0). } \]
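
One common way to produce the \(h\)-step forecasts (a sketch under the same hypothetical estimates as above, not spelled out in the notes) is to iterate the ARMA recursion, replacing every unknown future error by its mean, 0. The printed forecasts visibly shrink toward 0:

```python
import numpy as np

def h_step_forecasts(x, e_hat, phi_hat, theta_hat, h):
    """Iterate the zero-mean ARMA(2,2) recursion h steps ahead.
    Unknown future errors are replaced by their mean, 0."""
    xs = list(x)                   # observed values, then forecasts
    es = list(e_hat) + [0.0] * h   # future innovations set to 0
    n = len(x)
    for i in range(h):
        t = n + i                  # index of the value being forecast
        pred = (phi_hat[0] * xs[t - 1] + phi_hat[1] * xs[t - 2]
                + es[t] + theta_hat[0] * es[t - 1] + theta_hat[1] * es[t - 2])
        xs.append(pred)
    return np.array(xs[n:])

phi_hat, theta_hat = np.array([0.5, -0.3]), np.array([0.4, 0.2])
x = np.array([1.2, -0.7, 0.3, 0.9])
e_hat = np.array([0.1, -0.2, 0.05, 0.3])
print(h_step_forecasts(x, e_hat, phi_hat, theta_hat, h=10))
```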

2. Innovations Algorithm

(Brockwell, p. 71) Recall that to find the best \(h\)-step linear predictor, \[ \hat Y_n (h) = a_0 + a_1 Y_n + \cdots + a_n Y_1, \]

we need to minimize prediction MSE, \[ \mbox{MSE} = E \Big[\big(\hat Y_n (h) - Y_{n+h}\big)^2\Big]. \]

To find such \(a_0, a_1, \ldots, a_n\), we end up with a matrix equation (shown here for \(n=4\), with \(a_0 = 0\) for a zero-mean series), \[ \left[ \begin{array}{c} a_1 \\ a_2 \\ a_3\\ a_4\\ \end{array}\right] \hspace{3mm} = \hspace{3mm} \left[ \begin{array}{cccc} \gamma(0) & \gamma(1) & \gamma(2) & \gamma(3) \\ \gamma(1) & \gamma(0) & \gamma(1) & \gamma(2) \\ \gamma(2) & \gamma(1) & \gamma(0) & \gamma(1) \\ \gamma(3) & \gamma(2) & \gamma(1) & \gamma(0) \\ \end{array} \right] ^{-1} \hspace{3mm} \left[ \begin{array}{l} \gamma(h) \\ \gamma(h+1) \\ \gamma(h+2) \\ \gamma(h+3) \\ \end{array} \right]. \]
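
For moderate \(n\) this system can be solved directly; here is a minimal numpy sketch with made-up autocovariances \(\gamma(0), \ldots, \gamma(4)\) (the values are assumptions, purely for illustration):

```python
import numpy as np
from scipy.linalg import toeplitz

# Hypothetical autocovariances gamma(0), ..., gamma(4)
gamma = np.array([2.0, 1.2, 0.5, 0.1, 0.05])

n, h = 4, 1
Gamma = toeplitz(gamma[:n])      # [gamma(|i-j|)], the 4x4 matrix above
rhs = gamma[h:h + n]             # (gamma(h), ..., gamma(h+3))
a = np.linalg.solve(Gamma, rhs)  # predictor coefficients a_1, ..., a_4
print(a)
```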

What happens when \(n\) is large? Can we invert a \(1000\times 1000\) matrix? A recursive formula is needed.

Innovations are the one-step prediction errors, \(Y_n - \hat Y_n(1)\), where \(\hat Y_n(1)\) is the prediction of \(Y_n\) from the first \(n-1\) observations:

\[\begin{align} \mbox{ Use } 0 \mbox{ obs, $\hspace{7mm}$ innov: } Y_1 - \hat Y_1(1), \hspace{10mm} \hat Y_1(1) & = 0 \\ \mbox{ Use } 1 \mbox{ obs, $\hspace{7mm}$ innov: } Y_2 - \hat Y_2(1), \hspace{10mm} \hat Y_2(1) & = a_{11} Y_1 \\ \mbox{ Use } 2 \mbox{ obs, $\hspace{7mm}$ innov: } Y_3 - \hat Y_3(1), \hspace{10mm} \hat Y_3(1) & = a_{21} Y_2 + a_{22} Y_1 \\ \mbox{ Use } 3 \mbox{ obs, $\hspace{7mm}$ innov: } Y_4 - \hat Y_4(1), \hspace{10mm} \hat Y_4(1) & = a_{31} Y_3 + a_{32} Y_2 + a_{33} Y_1 \\ & \hspace{4mm} \vdots \end{align}\]

\[\begin{align} Y_1 - \hat Y_1(1) &= Y_1 - 0\\ Y_2 - \hat Y_2(1) &= Y_2 - a_{11} Y_1\\ Y_3 - \hat Y_3(1) &= Y_3 - a_{21} Y_2 -a_{22} Y_1\\ Y_4 - \hat Y_4(1) &= Y_4 - a_{31} Y_3 -a_{32} Y_2 - a_{33} Y_1\\ & \vdots \end{align}\]


In matrix form, \[ \left[ \begin{array}{c} Y_1 - \hat Y_1(1) \\ Y_2 - \hat Y_2(1) \\ Y_3 - \hat Y_3(1) \\ Y_4 - \hat Y_4(1) \\ \end{array}\right] \hspace{3mm} = \hspace{3mm} \left[ \begin{array}{cccc} 1 & 0 & 0 & 0 \\ -a_{11} & 1 & 0 & 0 \\ -a_{22} & -a_{21} & 1 & 0 \\ -a_{33} & -a_{32} & -a_{31} & 1 \\ \end{array} \right] \hspace{2mm} \left[ \begin{array}{l} Y_1 \\ Y_2 \\ Y_3 \\ Y_4 \\ \end{array} \right] \\ \\ \mathbf U_4 \hspace{3mm} = \hspace{3mm} \mathbf A_4 \mathbf Y_4 \] Note that \(\mathbf A_n\) is a lower triangular matrix with ones on the diagonal.


We have another way to write the left hand side, \[ \left[ \begin{array}{c} Y_1 - \hat Y_1(1) \\ Y_2 - \hat Y_2(1) \\ Y_3 - \hat Y_3(1) \\ Y_4 - \hat Y_4(1) \\ \end{array}\right] \hspace{3mm} = \hspace{3mm} \left[ \begin{array}{l} Y_1 \\ Y_2 \\ Y_3 \\ Y_4 \\ \end{array} \right] - \left[ \begin{array}{l} \hat Y_1(1) \\ \hat Y_2(1) \\ \hat Y_3(1) \\ \hat Y_4(1) \\ \end{array} \right] \\ \mathbf U_n \hspace{3mm} = \hspace{3mm} \mathbf Y_n \hspace{2mm} - \hspace{2mm} \mathbf{\hat Y_n} \]

So we have two equations, \[\begin{align} \mathbf U_n &= \mathbf Y_n - \mathbf{\hat Y_n} \\ \mathbf U_n &= \mathbf A_n \mathbf Y_n \end{align}\]

Rearranging each equation gives \[\begin{align} \mathbf{\hat Y_n} &= \mathbf Y_n - \mathbf U_n \\ \mathbf Y_n &= \mathbf A_n^{-1} \mathbf U_n \end{align}\]

Substituting, we can write the predictors as \[\begin{align} \mathbf{\hat Y_n} &= \mathbf A_n^{-1} \mathbf U_n - \mathbf U_n \\ &= \Big(\mathbf A_n^{-1} - \mathbf I \Big) \mathbf U_n. \end{align}\]

The inverse of a lower triangular matrix is again lower triangular, and it keeps the ones on the diagonal. Call the below-diagonal elements of \(\mathbf A_n^{-1}\) \(\theta_{ij}\), ordered so that \(\theta_{n1}\) multiplies the most recent innovation: \[ \mathbf A_4^{-1} \hspace{3mm} = \hspace{3mm} \left[ \begin{array}{cccc} 1 & 0 & 0 & 0 \\ \theta_{11} & 1 & 0 & 0 \\ \theta_{22} & \theta_{21} & 1 & 0 \\ \theta_{33} & \theta_{32} & \theta_{31} & 1 \\ \end{array} \right] \]

We now have the formula \[ \mathbf{\hat Y_n} \hspace{3mm} = \hspace{3mm} \Big(\mathbf A_n^{-1} - \mathbf I \Big) \mathbf U_n: \] \[ \left[ \begin{array}{l} \hat Y_1(1) \\ \hat Y_2(1) \\ \hat Y_3(1) \\ \hat Y_4(1) \\ \end{array} \right] \hspace{3mm} = \hspace{3mm} \left[ \begin{array}{cccc} 0 & 0 & 0 & 0 \\ \theta_{11} & 0 & 0 & 0 \\ \theta_{22} & \theta_{21} & 0 & 0 \\ \theta_{33} & \theta_{32} & \theta_{31} & 0 \\ \end{array} \right] \left[ \begin{array}{l} Y_1 - \hat Y_1(1) \\ Y_2 - \hat Y_2(1) \\ Y_3 - \hat Y_3(1) \\ Y_4 - \hat Y_4(1) \\ \end{array} \right] \\ \]

This lets us compute the predictors recursively, since the row of \(\mathbf A_n^{-1} - \mathbf I\) that produces \(\hat Y_{n+1}(1)\) only uses innovations up to \(Y_n - \hat Y_n(1)\). \[ \hat Y_{n+1}(1) \hspace{3mm} = \hspace{3mm} \left\{ \begin{array}{ll} 0 & \mbox{ if } n=0 \\ \sum_{j=1}^n \theta_{nj} \Big(Y_{n+1-j} - \hat Y_{n+1-j}(1)\Big)& \mbox{ if } n>0 \\ \end{array} \right. \]
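
A short Python sketch of this recursion, assuming the coefficients \(\theta_{nj}\) have already been computed (for example by the Innovations Algorithm formula given below; `theta` is a hypothetical container with `theta[n][j-1]` \(= \theta_{nj}\)):

```python
def one_step_predictors(y, theta):
    """Compute Y_hat_{n+1}(1) for n = 0, ..., len(y) via
    Y_hat_{n+1}(1) = sum_j theta_{nj} * (Y_{n+1-j} - Y_hat_{n+1-j}(1)).
    theta[n][j-1] holds theta_{nj}; y is 0-based, so y[0] = Y_1."""
    y_hat = [0.0]                  # Y_hat_1(1) = 0: no observations used
    for n in range(1, len(y) + 1):
        pred = sum(theta[n][j - 1] * (y[n - j] - y_hat[n - j])
                   for j in range(1, n + 1))
        y_hat.append(pred)
    return y_hat                   # y_hat[k] is the prediction of Y_{k+1}
```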

Innovations Algorithm Formula

This is a recursive formula that can also be used for non-stationary series. (If \(Y_t\) is invertible, \(\theta_{nj} \to \theta_j\) as \(n \to \infty\).) The coefficients \(\theta_{ij}\) are calculated by setting \[ \nu_0 \hspace{3mm} = \hspace{3mm} \gamma(0) \] and, for \(k = 0, 1, \ldots, n-1\), \[ \theta_{n,n-k} \hspace{3mm} = \hspace{3mm} \Big( \gamma(n-k) - \sum_{j=0}^{k-1} \theta_{k,k-j} \, \theta_{n,n-j} \, \nu_j \Big)/ \nu_k \\ \nu_n \hspace{3mm} = \hspace{3mm} \gamma(0) - \sum_{j=0}^{n-1} \theta_{n,n-j}^2 \nu_j. \]

You compute successively: \(\nu_0; \quad \theta_{11}, \nu_1; \quad \theta_{22}, \theta_{21}, \nu_2;\) \(\quad \theta_{33}, \theta_{32}, \theta_{31}, \nu_3;\) and so on.

It looks like: \[\begin{align} \nu_0 &= \hat \gamma(0) \\ \theta_{11} &= \hat \gamma(1) / \nu_0 \\\\ \nu_1 &= \hat \gamma(0) - \theta_{11}^2 \nu_0 \\ \theta_{22} &= \hat \gamma(2) / \nu_0 \\ \theta_{21} &= [\hat \gamma(1) - \theta_{11} \theta_{22} \nu_0] / \nu_1 \\\\ \nu_2 &= \hat \gamma(0) - \theta_{22}^2 \nu_0 -\theta_{21}^2 \nu_1 \\ \theta_{33} &= \hat \gamma(3) / \nu_0 \\ \theta_{32} &= [\hat \gamma(2) - \theta_{11} \theta_{33} \nu_0] / \nu_1 \\ \theta_{31} &= [\hat \gamma(1) - \theta_{22} \theta_{33} \nu_0 - \theta_{21} \theta_{32} \nu_1] / \nu_2 \\ &\vdots \end{align}\]
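
Here is a minimal Python sketch of the full recursion, with the autocovariances supplied as an array (the sample values below are assumptions, purely for illustration):

```python
import numpy as np

def innovations(gamma, N):
    """Innovations Algorithm: returns theta[n] = [theta_{n1}, ..., theta_{nn}]
    for n = 1..N, and the one-step prediction MSEs nu[0..N]."""
    nu = [gamma[0]]                # nu_0 = gamma(0)
    theta = {}                     # theta[n][m-1] holds theta_{n,m}
    for n in range(1, N + 1):
        th = [0.0] * n
        for k in range(n):         # this iteration computes theta_{n,n-k}
            s = sum(theta[k][k - 1 - j] * th[n - 1 - j] * nu[j]
                    for j in range(k))
            th[n - 1 - k] = (gamma[n - k] - s) / nu[k]
        theta[n] = th
        nu.append(gamma[0] - sum(th[n - 1 - j] ** 2 * nu[j]
                                 for j in range(n)))
    return theta, nu

gamma_hat = np.array([2.0, 1.2, 0.5, 0.1, 0.05])  # assumed sample ACVF
theta, nu = innovations(gamma_hat, N=4)
print(theta[2], nu[2])   # [theta_21, theta_22] and nu_2
```

Each row \(n\) only needs \(\nu_0, \ldots, \nu_{n-1}\) and the earlier rows of \(\theta\), so no matrix inversion is ever required.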



Summary

  • The Innovations Algorithm is a general method for forecasting time series.

  • Applied to an ARMA model, the Innovations Algorithm can be written as a rather simple formula and computed recursively to forecast the time series.