( Stationarity Check ) Does \(X_t\) look stationary?
( De-mean ) Is the mean of this process 0?
( Model Selection ) What model should we use? AR(p)? or something else?
( Order Selection ) How did we decide on the value of \(p\) to fit?
( Parameter Estimation ) What estimator was used for \(\phi_i\)? Can you reject the null hypothesis that \(\phi_i=0\)?
( Residual Analysis ) How was the residual calculated? What does it say about how the model fits the data?
Theoretically, AR(\(1\)) has a mean of 0: \[ X_t = \phi X_{t-1} + \epsilon_t, \hspace{10mm} \epsilon_t \sim WN(0,\sigma^2). \] If the observed series looks like AR(\(1\)) with a constant mean, our model is \[ Y_t = \mu + X_t, \hspace{10mm} \mbox{ where } \mu \mbox{ is a constant and $X_t$ is AR(1).} \] That means \(Y_t\) has a mean of \(\mu\), and we need to de-mean (de-trend) by letting \[ \hat X_t = Y_t - \hat \mu = Y_t - \bar Y \] before modeling \(\hat X_t\) with zero-mean AR(\(1\)).
include.mean=TRUE is the default in Arima() from the forecast package.
If you use the include.mean=FALSE option, our model is \[
Y_t = X_t,
\] and \(Y_t\) is believed to have a mean of \(0\). \(Y_t\) is modeled directly with a zero-mean AR(p) model.
## [1] 8.11923
library(forecast) # install.packages("forecast") if you haven't
Fit01 = Arima(X, order=c(2,0,0)) # de-mean (include.mean=TRUE is the default)
Fit02 = Arima(X, order=c(2,0,0), include.mean=TRUE) # de-mean, same as Fit01 but stated explicitly
Fit03 = Arima(X, order=c(2,0,0), include.mean=FALSE) # don't de-mean
Fit04 = Arima(X-mean(X), order=c(2,0,0), include.mean=FALSE) # de-mean by hand
Fit01 and Fit02 are the same fit: AR(2) with a mean, estimated by MLE.
Fit03 fits a zero-mean AR(2) by MLE, without de-meaning.
Fit04 fits a zero-mean AR(2) by MLE after de-meaning by hand.
- We usually de-mean, unless we have a reason to believe the true mean is equal to 0.
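To make explicit what "with mean" means for the AR(2) fits above, the model can be written in two equivalent ways. This is only a sketch, assuming the process is stationary; the symbol \(c\) for the intercept is introduced here for illustration and does not appear elsewhere in these notes. \[ Y_t - \mu = \phi_1 (Y_{t-1}-\mu) + \phi_2 (Y_{t-2}-\mu) + \epsilon_t \hspace{5mm} \Longleftrightarrow \hspace{5mm} Y_t = c + \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + \epsilon_t, \hspace{5mm} c = \mu(1-\phi_1-\phi_2). \] Setting \(\mu=0\) (include.mean=FALSE) forces \(c=0\) as well, so the two forms agree on what a zero-mean model is.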
# read in daily Dow Jones data from my website
D = read.csv("https://nmimoto.github.io/datasets/dowj.csv")
head(D) # see only first 6 lines
## dowj
## 1 110.94
## 2 110.69
## 3 110.43
## 4 110.56
## 5 110.75
## 6 110.84
## [1] FALSE
## Time Series:
## Start = 1
## End = 6
## Frequency = 1
## [1] 110.94 110.69 110.43 110.56 110.75 110.84
## [1] TRUE
## Time Series:
## Start = 2
## End = 7
## Frequency = 1
## [1] -0.0022560132 -0.0023516653 0.0011765240 0.0017170489 0.0008123111
## [6] -0.0034342555
#--- Fit AR(3) ---
library(forecast) # install.packages("forecast")
Fit1 = Arima(X, order=c(3,0,0), include.mean=FALSE) # fit zero-mean AR(3) to X
Fit1 # see the result of fit
## Series: X
## ARIMA(3,0,0) with zero mean
##
## Coefficients:
##          ar1     ar2     ar3
##       0.4201  0.1127  0.0760
## s.e.  0.1147  0.1250  0.1156
##
## sigma^2 estimated as 1.093e-05: log likelihood=331.92
## AIC=-655.85 AICc=-655.29 BIC=-646.47
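As a rough check of the question "can you reject \(\phi_i=0\)?", one common approach is to divide each estimate by its standard error and compare the absolute value with 1.96. This assumes the estimators are approximately normal, so treat it as an approximate 5% level test rather than an exact one. Using the numbers printed above: \[ \frac{0.4201}{0.1147} \approx 3.66, \hspace{5mm} \frac{0.1127}{0.1250} \approx 0.90, \hspace{5mm} \frac{0.0760}{0.1156} \approx 0.66. \] Only the ratio for ar1 exceeds 1.96, so under this approximation we reject \(\phi_1=0\) but cannot reject \(\phi_2=0\) or \(\phi_3=0\).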
- When differencing is used, (true mean = 0) has a special meaning.
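One way to see why, sketched under the assumption that the differenced series is a constant \(\mu\) plus a zero-mean stationary process \(X_t\): summing the differences back up gives \[ \nabla Y_t = Y_t - Y_{t-1} = \mu + X_t \hspace{5mm} \Longrightarrow \hspace{5mm} Y_t = Y_0 + \mu t + \sum_{s=1}^{t} X_s. \] A nonzero \(\mu\) in the differenced series therefore shows up as a linear drift \(\mu t\) in the original series, and (true mean = 0) says that the original series has no such drift.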