5e: Picking Trend Types

1. Types of Trends

Now we can handle following possibilities:

Linear Trend with ARMA noise
Quadratic Trend with ARMA noise
RW without drift only
RW with drift only
RW without drift + ARMA noise
RW with drift + ARMA noise

2. Simulaiton: Linear Trend vs RW with Drift

If the true model is a linear trend + ARMA noise, then residuals after linear regression should be stationary.

If the true model is RW with Drift, residuals after linear regression should be non-stationary.

# Simulate RW w drift.  See if it can be caught by above method
slope = phi1 = phi1.se = adf = 0
options(warn=-1)  # supress warning

for (i in 1:100){

  drift = .3
  e = rnorm(100, drift, 1)
  M = cumsum(e)

  #plot(M,type="o")
  Reg01 = lm(M~t)
  slope[i] = Reg01$coef[2]
  Fit01=auto.arima(Reg01$resid, d=0, stepwise=FALSE, approximation=FALSE)
  phi1[i] = Fit01$coef[1]
  phi1.se[i] = sqrt(Fit01$var.coef[1])

  # Randomness.tests(Fit01$resid)
  adf[i] = Stationarity.tests(Reg01$resid)[2]  # use ADF test
}

hist(slope)   # drift term is showing up as slope

# If we use AR1 parameter being significantly different from 1
#  ar1.check = ((phi1-2*phi1.se)<1 & (phi1+2*phi1.se)>1)
#  cbind(phi1-2*phi1.se, phi1+2*phi1.se, ar1.check)
#  mean(ar1.check)    # Using AR1 ne 1
#  # .34

# If we use ADF test as a guide
mean(adf >.05)

## [1] 0.93

3. Ex: Sheep Data

Preliminary

Annual sheep population (1000s) in England and Wales 1867 – 1939

  library(forecast)
  source('https://nmimoto.github.io/R/TS-00.txt')

  D  = read.csv("https://nmimoto.github.io/datasets/sheep.csv", header=T)
  Sheep = ts(D[,2], start=c(1867), freq=1)

  plot(Sheep, type="o")

  Stationarity.tests(Sheep)

##        KPSS   ADF    PP
## p-val: 0.01 0.554 0.211

–> All three pointing to non-stationarity (small, Large, Large)

Ask auto.arima()

  #--- ARIMA fit ---
  Fit1 = auto.arima(Sheep, stepwise=FALSE, approximation=FALSE)
  Fit1

## Series: Sheep 
## ARIMA(3,1,0) 
## 
## Coefficients:
##          ar1      ar2      ar3
##       0.4210  -0.2018  -0.3044
## s.e.  0.1193   0.1363   0.1243
## 
## sigma^2 = 4991:  log likelihood = -407.56
## AIC=823.12   AICc=823.71   BIC=832.22

  Randomness.tests(Fit1$resid)

##   B-L test H0: the series is uncorrelated
##   M-L test H0: the square of the series is uncorrelated
##   J-B test H0: the series came from Normal distribution
##   SD         : Standard Deviation of the series

##       BL15  BL20  BL25  ML15  ML20    JB     SD
## [1,] 0.984 0.808 0.836 0.915 0.899 0.985 68.846

It has all parameters significant, and also passed the residual analysis.

We should also look at ARIMA(4,1,0) and ARIMA(3,1,1) to see the significance of the last parameters. (omitted here)

We should check the significance of the drift term.

Check the signifiance of the drift term.

You must use Arima and force to include the drift term.

  Fit2 = Arima(Sheep, order=c(3,1,0), include.drift=TRUE)
  Fit2

## Series: Sheep 
## ARIMA(3,1,0) with drift 
## 
## Coefficients:
##          ar1      ar2      ar3    drift
##       0.4134  -0.2045  -0.3115  -5.8707
## s.e.  0.1192   0.1357   0.1241   7.4553
## 
## sigma^2 = 5021:  log likelihood = -407.26
## AIC=824.51   AICc=825.42   BIC=835.9

Drift term is NOT significant.

–> ARIMA(3,1,0) w/o drift is fitting well.

Stationarity check for d=1

d=1 is suggested by auto.arima(), which means that KPSS test said Sheep is stationary after one differencing.

Take the difference by hand, and see what the other two tests says.

  layout(matrix(1:2, 1, 2))
  plot(Sheep, main="Sheep")
  plot(diff(Sheep), main="diff(Sheep)")

  Stationarity.tests(diff(Sheep))

##        KPSS   ADF   PP
## p-val:  0.1 0.022 0.01

–> All 3 tests points to stationarity after d=1. (Large, small, small)

ARIMA(3,1,0) w/o drift is fitting well

$Y_t$ is ARIMA(3,1,0) without drift. \[\begin{align*} & Y_t \mbox{ is the observation} \\ & \bigtriangledown Y_t \hspace{3mm} = \hspace{3mm} X_t \\ & $X_t$ is ARMA(3,0) \end{align*}\] \[ X_t - \phi_1 X_{t-1} - \phi_2 X_{t-2} - \phi_3 X_{x-3} = \epsilon_t \]

What does this mean about the trend?

This is a evidence toward \[ Y_t \hspace{3mm} = \hspace{3mm} T_t + M_t \] where $T_t$ is a RW without drift and $M_t$ is a stationary time series.

This is an evidence against models \[ Y_t \hspace{3mm} = \hspace{3mm} a + b t + M_t \\ \\ Y_t \hspace{3mm} = \hspace{3mm} T_t + M_t \] where $T_t$ is Random Walk with drift $\delta$. Because both models should have significant drift term when the difference is taken.

Key Points

ARIMA(3,1,0) without drift was adequate model for Sheep data.

This is an evidence against the model that Sheep had linear trend going down.

If Sheep data has linear trend $a+bt$, then ARIMA(p,1,q) model should have a drift term of $b$.

The model is also an evidence against the model that The Sheep had random trend with negative drift.

Forcing Linear Trend Model

Here’s another method of rulng out the linear trend possiblity.

If linear trend + stationary series is the correct model for Sheep, then the residual after a regression must be stationary.

Look at ADF stationarity test after the regression (use lm()).

  Reg1 =  lm(Sheep ~ time(Sheep))
  summary(Reg1)

## 
## Call:
## lm(formula = Sheep ~ time(Sheep))
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -380.54  -67.68   16.21   77.20  232.84 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 17319.1627  1502.3855   11.53  < 2e-16 ***
## time(Sheep)    -8.1253     0.7894  -10.29 1.01e-15 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 142.1 on 71 degrees of freedom
## Multiple R-squared:  0.5987, Adjusted R-squared:  0.5931 
## F-statistic: 105.9 on 1 and 71 DF,  p-value: 1.01e-15

  layout(matrix(1:2, 1, 2))
  plot(Sheep, type="o", main="Sheep")
  abline(Reg1, col="red")

  plot(Reg1$residuals, type="o", main="Reg1 resid")

  Stationarity.tests(Reg1$residuals)

##        KPSS   ADF    PP
## p-val:  0.1 0.554 0.211

–> Regression residual is not stationary (by ADF test)

Sheep should be modeled by random trend without drift instead of linear trend.

Sheep had random trend, which dissapeared when d=1 difference was taken. (When KPSS test was applied to Sheep, it could not reject the null of the presence of the random trend.)

Fitting a line and fitting ARMA(0,3) seems to be ok, since residuals are uncorrelated. However, the stationarity is questionable.

Fitting ARIMA(3,1,0) with drift, shows non-significant drift term. If linear trend is correct, then drift term should match the slope term.