Suppose we have an MA(1) model (watch the sign!) \[ X_t = \epsilon_t + \theta_1 \epsilon_{t-1}. \] This is already a causal representation.
Rearranging the equation, we get \[ X_t - \theta_1 \epsilon_{t-1} \hspace3mm = \hspace3mm \epsilon_t. \]
The same holds for \(X_{t-1}\): \[ X_{t-1} - \theta_1 \epsilon_{t-2} \hspace3mm = \hspace3mm \epsilon_{t-1}. \] We can substitute this expression for \(\epsilon_{t-1}\) into the equation above.
Substituting this expression for \(\epsilon_{t-1}\) into the first equation, \[ X_t - \theta_1 \epsilon_{t-1} \hspace3mm = \hspace3mm \epsilon_t \\ \\ X_t - \theta_1 \big(X_{t-1} - \theta_1 \epsilon_{t-2} \big) \hspace3mm = \hspace3mm \epsilon_t \\ \\ X_t - \theta_1 X_{t-1} + \theta_1^2 \epsilon_{t-2} \hspace3mm = \hspace3mm \epsilon_t \]
Repeating this \(n\) times, we get \[ X_t - \theta_1 X_{t-1} + \theta_1^2 X_{t-2} - \cdots + (-\theta_1)^{n-1} X_{t-n+1} + (-\theta_1)^n \epsilon_{t-n} \hspace3mm = \hspace3mm \epsilon_t \] If \(|\theta_1|<1\), the last term vanishes as \(n \to \infty\), and we can write \[ \epsilon_t \hspace3mm = \hspace3mm \sum_{i=0}^\infty (-\theta_1)^i \, X_{t-i}. \] This is called the invertible representation.
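The geometric decay of the weights can be checked numerically. Below is a minimal sketch in Python/NumPy (variable names are my own): it simulates an MA(1) path with \(\theta_1 = 0.6\) and recovers the last innovation from the truncated sum \(\sum_i (-\theta_1)^i X_{t-i}\), where the alternating sign comes from the recursion \(\epsilon_t = X_t - \theta_1 \epsilon_{t-1}\).

```python
import numpy as np

# Sketch: simulate X_t = eps_t + theta1 * eps_{t-1} and recover eps_t
# from the truncated series sum_{i=0}^{m-1} (-theta1)^i * X_{t-i}.
rng = np.random.default_rng(0)
theta1 = 0.6                  # |theta1| < 1, so the series converges
n = 500
eps = rng.standard_normal(n)
X = eps.copy()
X[1:] += theta1 * eps[:-1]    # MA(1) with Brockwell's sign convention

t = n - 1
eps_hat = sum((-theta1) ** i * X[t - i] for i in range(200))
print(abs(eps_hat - eps[t]))  # essentially zero: truncation error ~ 0.6**200
```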
\[ \epsilon_t \hspace3mm = \hspace3mm \sum_{i=0}^\infty \pi_i X_{t-i} \\ \\ \hspace3mm = \hspace3mm (\pi_0 + \pi_1 B + \pi_2 B^2 + \pi_3 B^3 + \cdots ) \, X_{t} \\ \\ \hspace3mm = \hspace3mm \Pi(B) \, X_{t} \]
So for our MA(\(q\)) model \[ X_t \hspace3mm = \hspace3mm \Theta(B) \epsilon_t \\ \\ \Pi(B) \, X_{t} \hspace3mm = \hspace3mm \epsilon_t \]
Just as with the causal representation of AR(\(p\)), we have the identity \[ \Pi(z) \hspace3mm = \hspace3mm \frac{1}{\Theta(z)} \] \[ (\pi_0 + \pi_1 z + \pi_2 z^2 + \pi_3 z^3 + \cdots ) \hspace3mm = \hspace3mm \frac{1}{(1+\theta_1 z + \theta_2 z^2 + \cdots + \theta_q z^q)} \]
We can calculate the coefficients by multiplying both sides by \(\Theta(z)\) and matching powers of \(z\): \[ (1+\theta_1 z + \theta_2 z^2 + \cdots + \theta_q z^q)(\pi_0 + \pi_1 z + \pi_2 z^2 + \pi_3 z^3 + \cdots ) \hspace3mm = \hspace3mm 1 \]
For example, if we have MA(1), \[ (1+\theta_1 z )(\pi_0 + \pi_1 z + \pi_2 z^2 + \pi_3 z^3 + \cdots ) \hspace3mm = \hspace3mm 1 \]
To find the coefficients \(\pi_i\), we match the orders of \(z\). \[ \begin{align} \pi_0 &= \hspace3mm 1 \hspace15mm (\mbox{ coefficient without $z$ })\\ \theta_1 \pi_0 + \pi_1 &= \hspace3mm 0 \hspace15mm (\mbox{ coefficient of $z$})\\ \theta_1 \pi_1 + \pi_2 &= \hspace3mm 0 \hspace15mm (\mbox{ coefficient of $z^2$})\\ \theta_1 \pi_2 + \pi_3 &= \hspace3mm 0 \hspace15mm (\mbox{ coefficient of $z^3$})\\ &\vdots& \end{align} \]
\[ \begin{align} \pi_0 &= \hspace3mm 1 \\ \pi_1 &= \hspace3mm -\theta_1 \pi_0 \\ \pi_2 &= \hspace3mm -\theta_1 \pi_1 \\ \pi_3 &= \hspace3mm -\theta_1 \pi_2 \\ &\vdots& \end{align} \]
\[
\begin{align}
\pi_0 &= \hspace3mm 1 \\
\pi_1 &= \hspace3mm -\theta_1 \\
\pi_2 &= \hspace3mm \theta_1^2\\
\pi_3 &= \hspace3mm -\theta_1^3\\
&\vdots&
\end{align}
\]
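The matching step above is a one-line recursion for any MA(\(q\)): \(\pi_0 = 1\) and \(\pi_k = -\sum_{j=1}^{\min(k,q)} \theta_j \pi_{k-j}\). A sketch in Python (the function name is my own):

```python
def pi_weights(theta, n):
    """First n coefficients of Pi(z) = 1 / Theta(z),
    with Theta(z) = 1 + theta[0] z + ... + theta[q-1] z^q."""
    q = len(theta)
    pi = [1.0]  # pi_0 = 1 (constant term)
    for k in range(1, n):
        # matching the coefficient of z^k: pi_k + sum_j theta_j pi_{k-j} = 0
        pi.append(-sum(theta[j - 1] * pi[k - j]
                       for j in range(1, min(k, q) + 1)))
    return pi

# MA(1) check: the weights come out as pi_i = (-theta1)^i
print(pi_weights([0.6], 5))
```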
Now we get the invertible representation for MA(1), \[ \epsilon_t \hspace3mm = \hspace3mm \sum_{i=0}^\infty \pi_i \, X_{t-i} = \sum_{i=0}^\infty (-\theta_1)^i \, X_{t-i}. \]
To write today’s error (innovation) \(\epsilon_t\), we need infinitely many past observations \(X_{t-i}\).
When can we do this?
If all roots of the characteristic polynomial \[ \Theta(z) = 1 + \theta_1 z + \theta_2 z^2 + \cdots + \theta_q z^q \] lie outside the unit circle, then MA(\(q\)) admits an invertible representation.
This is the same as the causality condition for AR(\(p\)).
Watch the sign! We have the same sign as \(\Phi(z)\) in AR(\(p\)), because we started with Brockwell’s convention, \(X_t = \epsilon_t + \theta_1 \epsilon_{t-1}\).
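Checking the root condition numerically is straightforward; here is a sketch using `numpy.roots` (the function name `is_invertible` is my own). Note that `numpy.roots` takes coefficients from the highest power down.

```python
import numpy as np

def is_invertible(theta):
    """True if all roots of Theta(z) = 1 + theta_1 z + ... + theta_q z^q
    lie outside the complex unit circle."""
    coeffs = [*theta[::-1], 1.0]   # highest degree first: theta_q, ..., theta_1, 1
    roots = np.roots(coeffs)
    return bool(np.all(np.abs(roots) > 1.0))

print(is_invertible([0.6]))   # True:  root z = -1/0.6 has |z| > 1
print(is_invertible([2.0]))   # False: root z = -0.5 lies inside the unit circle
```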
We assume that all MA(\(q\)) processes we deal with are invertible.
If we have an MA(3) process, \[ X_t = \epsilon_t + \hat \theta_1 \epsilon_{t-1} + \hat \theta_2 \epsilon_{t-2} + \hat \theta_3 \epsilon_{t-3} \] with estimated parameters \(\hat \theta_i\), we want to compute the residuals \(\hat \epsilon_t\) and check their randomness.
In AR(\(p\)), this was intuitive. For example, with AR(2) the residual was \[ X_t - \hat \phi_1 X_{t-1} - \hat \phi_2 X_{t-2} = \hat \epsilon_t \]
In MA(\(q\)), we must use the invertible representation, \[ \hat \epsilon_t = \hat \pi_0 X_t + \hat \pi_1 X_{t-1} + \hat \pi_2 X_{t-2} + \cdots \]
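In practice the infinite sum is truncated, or equivalently the residuals are computed recursively, conditioning on zero pre-sample innovations: with the sign convention \(X_t = \epsilon_t + \hat\theta_1 \epsilon_{t-1} + \cdots\) used throughout these notes, \(\hat\epsilon_t = X_t - \hat\theta_1 \hat\epsilon_{t-1} - \cdots - \hat\theta_q \hat\epsilon_{t-q}\). A sketch (this is the simple conditional recursion, not necessarily what any particular package implements):

```python
def ma_residuals(X, theta):
    """Residuals of X_t = eps_t + theta_1 eps_{t-1} + ... + theta_q eps_{t-q},
    computed recursively with pre-sample innovations set to zero."""
    q = len(theta)
    eps = []
    for t, x in enumerate(X):
        # subtract the MA part built from the already-computed residuals
        past = sum(theta[j] * eps[t - 1 - j] for j in range(min(t, q)))
        eps.append(x - past)
    return eps
```

For an invertible model, the effect of the (wrong) zero initial values dies out geometrically, which is exactly why invertibility matters here.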
We haven’t talked about forecasting yet, but invertibility will be used in forecasting in much the same way as for residuals.
An MA(\(q\)) process is defined by the equation \[ X_t = \Theta(B) \epsilon_t, \] and you can check the roots of the characteristic polynomial \[ \Theta(z) = 1 + \theta_1 z + \theta_2 z^2 + \cdots + \theta_q z^q. \] If all roots lie outside the (complex) unit circle, then MA(\(q\)) admits an invertible representation.
The invertible representation allows us to write today’s error as an infinite sum of past observations. \[ \epsilon_t = \sum_{i=0}^\infty \pi_i X_{t-i} \]
Because we can’t directly observe \(\epsilon_t\), invertibility is important in calculating residuals and forecasts.