
Univariate Time Series Modelling

OlaOluwa S. Yaya

2022-12-13

Bibliography
• Box, G. E. P., and G. Jenkins. 1976. Time Series Analysis: Forecasting and Control. 2nd ed. San Francisco: Holden-Day.
• Racine, J. S. 2019. Reproducible Econometrics Using R. New York: Oxford University Press.
• Schwarz, G. 1978. "Estimating the Dimension of a Model." The Annals of Statistics 6:461–64.

Introduction
• Univariate linear time series models are often used when the time series econometrician's interest is in predicting future values yt+k of a series from its past behaviour y1 , y2 , . . . , yt = {yi }ᵗᵢ₌₁ , where k is the time lag (forecast horizon)
• Thus, the researcher may know little or nothing about causal relationships between the variable to be forecasted
and/or the potential explanatory variable(s)
• The goal is to model the stochastic process underlying the series and to use this for forecasting purposes
• Therefore, we concentrate on the forecasts, their errors, and the variance of these forecast errors

• Time series modelling requires time series observations to be stationary before they can be modeled via
equations with fixed coefficients that can be estimated from past data
• Recall that a process is said to be stationary if its stochastic properties are invariant with respect to time, while the properties of a nonstationary process vary with time
• A consequence of stationarity is that the mean, variance, and covariances of the series are all time-
invariant
• Linear time series models are typically estimated via the method of maximum-likelihood, which you
have studied previously

MA(q) Models
• If a time series has been generated by a moving average process of order q, then it can be expressed as

yt = µ + ϵt − θ1 ϵt−1 − · · · − θq ϵt−q ,

where ϵt ∼ (0, σϵ²) is i.i.d.


• We define the backward shift operator B so that Bat = at−1 , B²at = at−2 , and so on
• We can therefore express an MA(q) process as

yt = µ + ϵt − θ1 Bϵt − · · · − θq B^q ϵt
   = µ + (1 − θ1 B − · · · − θq B^q )ϵt
   = µ + Θ(B)ϵt ,

where Θ(B) is a polynomial in the backward shift operator B
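As a quick numerical illustration (a minimal sketch; the θ values and sample size are arbitrary, and note that arima.sim() and arima() in base R parameterize the MA polynomial with plus signs, so their ma coefficients are the negatives of the θs above):

set.seed(123)
## Simulate an MA(2) process with mean 10; R's convention is
## y_t = mu + e_t + theta_1 e_{t-1} + theta_2 e_{t-2}, the sign-flip of the slides
y <- arima.sim(model=list(ma=c(0.5,-0.4)), n=500) + 10
arima(y, order=c(0,0,2))  ## ma1, ma2 near 0.5, -0.4; intercept near 10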

Example
The code below uses the arima() function in base R (package stats) to fit moving average models of order
q = 2, 4, and 6 to the volume of electricity sold to residential customers in South Australia each year from 1989
to 2008 (elecsales is in the R package fpp)
## The data elecsales is from the fpp package
require(fpp)

## Loading required package: fpp


## Loading required package: fma
## Loading required package: expsmooth
## Loading required package: lmtest
data(elecsales)
plot(elecsales, main="Residential Electricity Sales",ylab="GWh", xlab="Year")
lines(fitted(arima(elecsales,c(0,0,2))),col=2,lty=2)
lines(fitted(arima(elecsales,c(0,0,4))),col=3,lty=3)
lines(fitted(arima(elecsales,c(0,0,6))),col=4,lty=4)
legend("topleft",c("Data","MA(2)","MA(4)","MA(6)"),col=1:4,lty=1:4,bty="n")

[Figure: Residential Electricity Sales (GWh, 1989–2008) with fitted MA(2), MA(4), and MA(6) values overlaid.]

• An MA(q) process has mean given by

E[yt ] = E[µ + ϵt − θ1 ϵt−1 − · · · − θq ϵt−q ]
       = µ + E[ϵt ] − θ1 E[ϵt−1 ] − · · · − θq E[ϵt−q ]
       = µ
• It has variance given by

γ0 = E[(yt − E[yt ])²]
   = E[(ϵt − θ1 ϵt−1 − · · · − θq ϵt−q )²]
   = E[ϵ²t + θ1² ϵ²t−1 + · · · + θq² ϵ²t−q + cross-product terms]
   = (1 + θ1² + · · · + θq² )σϵ² ,

since the cross-product terms have expectation zero (E[ϵt ϵs ] = 0 for t ≠ s)

• It has covariance given by

γk = Cov[yt , yt−k ]
   = E[(yt − E[yt ])(yt−k − E[yt−k ])]
   = E[(ϵt − θ1 ϵt−1 − · · · − θq ϵt−q )(ϵt−k − θ1 ϵt−k−1 − · · · − θq ϵt−k−q )]

• This is equal to

γk = (−θk + θk+1 θ1 + · · · + θq θq−k )σϵ²

for k ≤ q, and γk = 0 for k > q
• Moving average processes are characterized by dependency on a finite past given by the order of the
process, q

• The autocorrelation function for an MA(q) process is given by

ρk = γk /γ0 = (−θk + θk+1 θ1 + · · · + θq θq−k )/(1 + θ1² + · · · + θq² )

for k ≤ q, and ρk = 0 for k > q

• So, for an MA(1) process (q = 1), ρk = 0 for k > 1 and

ρ1 = γ1 /γ0 = −θ1 /(1 + θ1² )

• For an MA(2) process (q = 2), ρk = 0 for k > 2 and

ρ1 = γ1 /γ0 = (−θ1 + θ2 θ1 )/(1 + θ1² + θ2² )
ρ2 = γ2 /γ0 = −θ2 /(1 + θ1² + θ2² )
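These closed-form autocorrelations are easy to verify numerically with ARMAacf() from base R's stats package (a minimal sketch; the θ values are arbitrary, and the sign flip again reflects R's MA convention):

theta <- c(0.5, 0.3)
## rho_1 and rho_2 from the formulas above
rho1 <- (-theta[1] + theta[2]*theta[1])/(1 + sum(theta^2))
rho2 <- -theta[2]/(1 + sum(theta^2))
c(rho1, rho2)
## Theoretical ACF from R (lags 1 and 2 should match rho1 and rho2)
ARMAacf(ma=-theta, lag.max=2)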

• Given that the mean, variance, and covariances of MA(q) models do not depend on time, the requirements for stationarity are satisfied if the process has finite variance
• Recall that the variance is given by

γ0 = (1 + θ1² + · · · + θq² )σϵ²

• The restriction required for an MA(q) process to have finite variance is therefore

θ1² + · · · + θq² < ∞

• Identification of moving average processes is quite straightforward
• We do so on the basis of the sample autocorrelation and partial autocorrelation functions
• We exploit the fact that the partial autocorrelation function gives an exponentially decaying pattern, while the autocorrelation function cuts off to 0 after lag q. The cutoff is assessed using Bartlett's approximation
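Under Bartlett's approximation, if the process is MA(q) then the sample autocorrelations at lags k > q have standard error approximately sqrt((1 + 2(ρ1² + · · · + ρq² ))/n), so sample autocorrelations beyond lag q lying inside roughly ±2 such standard errors are consistent with the cutoff. A minimal sketch (the choice q = 2 and the use of elecsales are purely illustrative):

require(fpp)
data(elecsales)
n <- length(elecsales)
r <- acf(elecsales, lag.max=10, plot=FALSE)$acf[-1]  ## sample ACF at lags 1..10
q <- 2  ## hypothesized MA order (illustrative)
se.bartlett <- sqrt((1 + 2*sum(r[1:q]^2))/n)
## Compare autocorrelations beyond lag q with the approximate 95% band
cbind(lag=(q+1):10, r=r[(q+1):10], band=1.96*se.bartlett)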

set.seed(42)
par(mfrow=c(1,3))
pacf(arima.sim(model=list(ma=c(-.8)),n=1000),lag.max=5,main="MA(1)")
pacf(arima.sim(model=list(ma=c(-.5,.4)),n=1000),lag.max=5,main="MA(2)")
pacf(arima.sim(model=list(ma=c(-.3,.2,.3)),n=1000),lag.max=5,main="MA(3)")

[Figure: Sample partial autocorrelation functions (lags 1–5) of simulated MA(1), MA(2), and MA(3) series.]

par(mfrow=c(1,1))

set.seed(42)
par(mfrow=c(1,3))
acf(arima.sim(model=list(ma=c(-.8)),n=1000),lag.max=5,main="MA(1)")
acf(arima.sim(model=list(ma=c(-.5,.4)),n=1000),lag.max=5,main="MA(2)")
acf(arima.sim(model=list(ma=c(-.3,.2,.3)),n=1000),lag.max=5,main="MA(3)")

[Figure: Sample autocorrelation functions (lags 0–5) of the same simulated MA(1), MA(2), and MA(3) series.]

par(mfrow=c(1,1))

AR(p) Models
• If a time series has been generated by an autoregressive process of order p, then it can be expressed as

yt = δ + ϕ1 yt−1 + · · · + ϕp yt−p + ϵt ,

where ϵt ∼ (0, σϵ²) is i.i.d.


• Using the backward shift operator B, we can express this as

yt − ϕ1 Byt − · · · − ϕp B^p yt = δ + ϵt
(1 − ϕ1 B − · · · − ϕp B^p )yt = δ + ϵt
Φ(B)yt = δ + ϵt

• These models are extremely flexible and are capable of handling a wide range of time series relationships

Example
## The data elecsales is from the fpp package
require(fpp)
data(elecsales)
plot(elecsales, main="Residential Electricity Sales",ylab="GWh", xlab="Year")
lines(fitted(arima(elecsales,c(2,0,0))),col=2,lty=2)
lines(fitted(arima(elecsales,c(4,0,0))),col=3,lty=3)
lines(fitted(arima(elecsales,c(6,0,0))),col=4,lty=4)
legend("topleft",c("Data","AR(2)","AR(4)","AR(6)"),col=1:4,lty=1:4,bty="n")

[Figure: Residential Electricity Sales (GWh, 1989–2008) with fitted AR(2), AR(4), and AR(6) values overlaid.]

• Autoregressive processes are characterized by dependency on an infinite past


• If the process is stationary then the dependency between elements of the series decreases as the distance between these elements increases
• An AR(p) process has mean given by

E[yt ] = δ + ϕ1 E[yt−1 ] + · · · + ϕp E[yt−p ]

• If yt is stationary, the mean is invariant with respect to time, therefore a stationary AR(p) process has mean given by

µy = δ + ϕ1 µy + · · · + ϕp µy

• This is equal to

µy = δ/(1 − ϕ1 − · · · − ϕp )
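A quick simulation check of this mean formula (a minimal sketch; the values of δ and ϕ1 are arbitrary):

set.seed(42)
n <- 100000
delta <- 2; phi <- 0.6
y <- numeric(n)
y[1] <- delta/(1 - phi)  ## start at the theoretical mean
for(t in 2:n) y[t] <- delta + phi*y[t-1] + rnorm(1)
c(sample.mean=mean(y), theoretical=delta/(1 - phi))  ## both near 5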

• Without loss of generality, assume that δ = 0


• We know that this does not change the variance or covariances

• The variance is

Var[yt ] = γ0 = E[(yt − E[yt ])²] = E[yt² ] = E[yt yt ]
         = E[yt (ϕ1 yt−1 + · · · + ϕp yt−p + ϵt )]
         = ϕ1 E[yt yt−1 ] + · · · + ϕp E[yt yt−p ] + E[yt ϵt ]
         = ϕ1 γ1 + · · · + ϕp γp + σϵ²

• From the above, for stationarity, we can check the restrictions on the parameters ϕi required for finite
variance
• By way of example, consider an AR(1) model given by

yt = δ + ϕ1 yt−1 + ϵt

• The variance and covariance functions up to order p = 1 are given by

γ0 = ϕ1 γ1 + σϵ² ,
γ1 = ϕ1 γ0

• Solving for the variance γ0 yields

γ0 = σϵ²/(1 − ϕ1² )

• Under stationarity, this must be a finite positive number, therefore we require that |ϕ1 | < 1
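This is again easy to check by simulation (a minimal sketch with ϕ1 = 0.7 and σϵ² = 1):

set.seed(42)
phi <- 0.7
y <- arima.sim(model=list(ar=phi), n=100000)  ## innovation sd = 1 by default
c(sample.var=var(y), theoretical=1/(1 - phi^2))  ## both near 1.96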

• Consider an AR(2) model given by

yt = δ + ϕ1 yt−1 + ϕ2 yt−2 + ϵt

• The variance and covariance functions up to order p = 2 are given by

γ0 = ϕ1 γ1 + ϕ2 γ2 + σϵ² ,
γ1 = ϕ1 γ0 + ϕ2 γ1 ,
γ2 = ϕ1 γ1 + ϕ2 γ0

• Solving for the variance γ0 yields

γ0 = (1 − ϕ2 )σϵ² / [(1 + ϕ2 )(1 − ϕ1 − ϕ2 )(1 + ϕ1 − ϕ2 )]

• Under stationarity, this must be a positive finite number therefore we require that |ϕ2 | < 1, ϕ1 + ϕ2 < 1,
and −ϕ1 + ϕ2 < 1
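Equivalently, these restrictions require the roots of the characteristic polynomial 1 − ϕ1 z − ϕ2 z² to lie outside the unit circle, which can be checked with base R's polyroot() (a minimal sketch; the ϕ values are illustrative and satisfy the conditions above):

phi <- c(0.5, 0.3)  ## illustrative AR(2) coefficients
roots <- polyroot(c(1, -phi))  ## coefficients of 1 - phi1*z - phi2*z^2
Mod(roots)  ## both moduli exceed 1, so the process is stationary
all(Mod(roots) > 1)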

• The covariance for k = 1 is given by

γ1 = E[(yt − E[yt ])(yt−1 − E[yt−1 ])]
   = E[yt−1 yt ]
   = E[yt−1 (ϕ1 yt−1 + · · · + ϕp yt−p + ϵt )]
   = ϕ1 E[yt−1 yt−1 ] + · · · + ϕp E[yt−1 yt−p ] + E[yt−1 ϵt ]
   = ϕ1 γ0 + · · · + ϕp γp−1

• In general, this is given by

γk = E[yt−k (ϕ1 yt−1 + · · · + ϕp yt−p + ϵt )]
   = ϕ1 E[yt−k yt−1 ] + · · · + ϕp E[yt−k yt−p ] + E[yt−k ϵt ]
   = ϕ1 γk−1 + · · · + ϕp γk−p

• For k = 0 to k = p, we therefore have

γ0 = ϕ1 γ1 + · · · + ϕp γp ,
γ1 = ϕ1 γ0 + · · · + ϕp γp−1 ,
...
γp = ϕ1 γp−1 + · · · + ϕp γ0

• The autocorrelation functions for AR(p) processes are

ρ0 = 1,
ρ1 = (ϕ1 γ0 + · · · + ϕp γp−1 )/γ0 ,
...
ρp = (ϕ1 γp−1 + · · · + ϕp γ0 )/γ0

• We write these as

ρ0 = 1,
ρ1 = ϕ1 + ϕ2 ρ1 + · · · + ϕp ρp−1 ,
...
ρp = ϕ1 ρp−1 + ϕ2 ρp−2 + · · · + ϕp

• These are known as the Yule-Walker equations
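For an AR(2), for example, the first two Yule-Walker equations solve to ρ1 = ϕ1 /(1 − ϕ2 ) and ρ2 = ϕ1 ρ1 + ϕ2 , which can be checked against ARMAacf() (a minimal sketch; the ϕ values are illustrative):

phi <- c(0.5, 0.3)
rho1 <- phi[1]/(1 - phi[2])
rho2 <- phi[1]*rho1 + phi[2]
c(rho1, rho2)
ARMAacf(ar=phi, lag.max=2)  ## lags 0, 1, 2 should equal 1, rho1, rho2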

• A useful rule of thumb for an autoregressive model is that, if the sum of the coefficients is less than one and no coefficient exceeds one in absolute value, then it is likely that the relationship is stationary
• If an AR(p) process is stationary we say that the characteristic roots of the polynomial Φ(B) lie outside
of the unit circle

• Identification of autoregressive processes is also quite straightforward
• We do so on the basis of the sample autocorrelation and partial autocorrelation functions
• We exploit the fact that the autocorrelation function gives an exponentially decaying pattern, while the partial autocorrelation function cuts off to 0 after lag p. The cutoff is assessed using Bartlett's approximation

## The data elecsales is from the fpp package


require(fpp)
data(elecsales)
acf(elecsales,lag.max=6)

[Figure: Sample autocorrelation function of the elecsales series (lags 0–6).]

## The data elecsales is from the fpp package


require(fpp)
data(elecsales)
pacf(elecsales,lag.max=6)

[Figure: Sample partial autocorrelation function of the elecsales series (lags 1–6).]

Non-Seasonal ARMA(p, q) Models


• If a time series has been generated by a mixed autoregressive moving average (ARMA) process of orders
p and q, then it can be expressed as

yt = δ + ϕ1 yt−1 + · · · + ϕp yt−p + ϵt − θ1 ϵt−1 − · · · − θq ϵt−q ,

where ϵt ∼ (0, σϵ²) is i.i.d.
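A minimal simulate-and-estimate sketch for an ARMA(1,1) (parameter values arbitrary; the ma sign again follows R's convention):

set.seed(42)
y <- arima.sim(model=list(ar=0.7, ma=-0.4), n=1000)
arima(y, order=c(1,0,1))  ## ar1 near 0.7, ma1 near -0.4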

Non-Seasonal ARIMA(p, d, q) Models


• If a homogeneous non-stationary process of order d, wt = ∆^d yt = (1 − B)^d yt , where ∆^d denotes the dth difference of a series, has been generated by a mixed ARMA process of orders p and q, then it can be expressed as

∆^d yt = δ + ϕ1 ∆^d yt−1 + · · · + ϕp ∆^d yt−p + ϵt − θ1 ϵt−1 − · · · − θq ϵt−q

• Or, we may write this as

wt = δ + ϕ1 wt−1 + · · · + ϕp wt−p + ϵt − θ1 ϵt−1 − · · · − θq ϵt−q ,

where ϵt ∼ (0, σϵ²) is i.i.d. Each of these is a well-specified autoregressive integrated moving average (ARIMA(p, d, q)) model

• We say that yt is I(d) or integrated of order d. The term comes from calculus; if dy/dt = x, then y is
the integral of x
• In discrete time series, if ∆yt = xt , then y might also be viewed as the integral, or sum over t, of x
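A one-line check of this discrete "integration" analogy (a minimal sketch):

set.seed(1)
x <- rnorm(10)
y <- cumsum(x)  ## y is the running sum ("integral") of x
all.equal(diff(y), x[-1])  ## TRUE: differencing y recovers x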
• Using the backwards shift operator we may express our ARIMA(p, d, q) model as

wt = δ + ϕ1 Bwt + · · · + ϕp B^p wt + ϵt − θ1 Bϵt − · · · − θq B^q ϵt

• This can be rearranged to obtain

(1 − ϕ1 B − · · · − ϕp B^p )wt = δ + (1 − θ1 B − · · · − θq B^q )ϵt

• This is often written as


Φ(B)wt = δ + Θ(B)ϵt

• This is also written as

wt = (δ + Θ(B)ϵt )/Φ(B)

• Here, Φ(B) and Θ(B) are polynomials of order p and q in the backwards shift operator B
• This is a parsimonious way of writing an ARIMA(p, d, q) model

• Many of the models we have studied above are special cases of the ARIMA(p, d, q) model:

Model                     ARIMA(p, d, q) representation
White noise               ARIMA(0, 0, 0)
Random walk               ARIMA(0, 1, 0) without a constant
Random walk with drift    ARIMA(0, 1, 0) with a constant
AR(p)                     ARIMA(p, 0, 0)
MA(q)                     ARIMA(0, 0, q)

Stationarity of ARIMA(p, d, q) Models


• For stationarity, we require that this representation for ∆^d yt is convergent (i.e., that this stochastic difference equation is stable)
• Intuitively, the series cannot explode: the mean, variance, and covariances must be time-independent, and the dependence of the series must approach zero as observations become further apart in time
• Strictly speaking, we require that θ1² + · · · + θq² < ∞ (finite variance of the MA(q) component) and that the characteristic roots of the polynomial Φ(B) lie outside of the unit circle (finite variance of the AR(p) component)

Identification of ARIMA(p, d, q) Processes


• First check whether yt is stationary by examining the sample autocorrelation function
– If it appears to be non-stationary, difference the series and check whether the differenced series is
itself stationary
– As well, check that the mean is stationary, that is, make sure that there is no upwards or downwards
trend which indicates that the mean is changing over time
• Next, use the sample autocorrelation function to try to infer the order of the moving average component

• Next, use the partial autocorrelation function to try to infer the order of the autoregressive component
• Having found likely candidates for p, d, and q, we estimate the model and then conduct some basic
diagnostic checks
– If the model is correctly specified, then the residuals should be white noise
– If they are not, then you may have the wrong order of the autoregressive or moving average
component
– Fortunately, it would appear that most time series can be characterized by low order ARIMA(p, d, q)
processes
• There is also a handy function named auto.arima() in the forecast package that can be helpful for
locating a candidate model
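These identification steps can be sketched in a few lines of R (a minimal sketch; ndiffs() and auto.arima() are in the forecast package, and elecsales is used purely for illustration):

require(fpp)
require(forecast)
data(elecsales)
ndiffs(elecsales)  ## suggested number of ordinary differences d
dy <- diff(elecsales)  ## difference once and re-examine
acf(dy); pacf(dy)  ## infer q from the ACF and p from the PACF
auto.arima(elecsales)  ## automated candidate model for comparison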

Estimation of ARIMA(p, d, q) Processes


• We can write our process as

Φ(B)∆^d yt − δ = Θ(B)ϵt

or

ϵt = (Φ(B)∆^d yt − δ)/Θ(B)

• These models are typically estimated via maximum likelihood (ML)


• For instance, under the assumption of Gaussian errors, we obtain the log-likelihood function given by

L = −(T /2) ln(2π) − (T /2) ln(σ² ) − (1/(2σ² )) Σ ϵt² ,

where the sum runs over t = 1, . . . , T

• When starting values for the ϕ parameters are required, we use the partial autocorrelation function
ϕ̂i = ai for i = 1, . . . , p
• When R estimates an ARIMA(p, d, q) model, it uses ML, which delivers parameter values that maximize
the probability of obtaining the time series that we have observed
• For ARIMA(p, d, q) models, ML is similar to the least squares estimates that would be obtained by
minimizing the sum of squared residuals
• However, you need to realize that ARIMA(p, d, q) models are much more complicated to estimate than
simple regression models, and different software implementations can give slightly different answers
because they employ different estimation procedures and algorithms
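To make the estimation idea concrete, here is a minimal sketch of conditional least squares for an AR(1), compared with arima(..., method="CSS"); this is a deliberate simplification, since R's default method "CSS-ML" uses the full Gaussian likelihood:

set.seed(42)
y <- arima.sim(model=list(ar=0.7), n=500)
## Conditional sum of squared residuals, conditioning on the first observation
css <- function(par, y) {
  mu <- par[1]; phi <- par[2]
  e <- (y[-1] - mu) - phi*(y[-length(y)] - mu)
  sum(e^2)
}
opt <- optim(c(mean(y), 0), css, y=y)
opt$par  ## (mu, phi) estimates
arima(y, order=c(1,0,0), method="CSS")  ## should be close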

Trends, Constants, and ARIMA(p, d, q) Models


• A non-seasonal ARIMA(p, d, q) model can be expressed as

Φ(B)(1 − B)^d yt = δ + Θ(B)ϵt ,

where δ = µ(1 − ϕ1 − · · · − ϕp ) and where µ is the mean of (1 − B)^d yt

• This can also be expressed as

Φ(B)(1 − B)^d (yt − µt^d /d!) = Θ(B)ϵt

• R uses this parameterization of the model

• There is a feature of this model that is worth noting, namely, that including a constant in a non-
stationary ARIMA(p, d, q) model is equivalent to incorporating a polynomial trend of order d in the
model

• In base R there is a function named arima(), and in the R forecast package there is a similar but
slightly different function named Arima() (R is case sensitive)
• Both can be used to fit ARIMA(p, d, q) models, but they differ in one aspect that is relevant when
forecasting series containing a trend
• The arima() function sets δ = µ = 0 when d > 0, while when d = 0 it provides an estimate of
µ = δ/(1 − ϕ1 − · · · − ϕp ). The parameter µ is called the intercept in the model summary
• The arima() function accepts the argument include.mean which only has an effect when d = 0 (setting
include.mean=FALSE will set δ = µ = 0)

• The Arima() function, on the other hand, provides more flexibility for the inclusion of a (linear) trend
• It accepts the argument include.mean, which has identical functionality to the corresponding argument for arima(), but also accepts the argument include.drift, which allows µ ≠ 0 when d = 1 (when d = 1 the parameter µ is called the drift in the R output)
• The Arima() function also accepts the argument include.constant which, if TRUE, will set
include.mean=TRUE if d = 0 and include.drift=TRUE when d = 1
• Note that when d > 1 a constant is ignored by Arima() since quadratic and higher order trends can
produce erratic and unreasonable forecasts in this setting

• When d = 0 and include.drift=TRUE is set, the fitted model from Arima() is

Φ(B)(yt − a − bt) = Θ(B)ϵt

• An extra linear trend term bt is thereby included in the model by Arima() that is not present when
using arima()
• The R output will label a as the intercept and b as the drift coefficient
• When d = 1 and include.drift=TRUE is set, the fitted model from Arima() is

Φ(B)(∆yt − c) = Θ(B)ϵt

• When d = 1 and include.drift=TRUE is set, an extra constant term c is included in the model by
Arima() that is not present when using arima(), which is equivalent to including a linear trend
• In this case, the R output will label c the drift coefficient
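The practical difference between the two functions is easy to see on a drifting series (a minimal sketch; the simulated random walk has drift roughly 1 per period):

require(forecast)
set.seed(42)
y <- ts(cumsum(rnorm(200, mean=1)))
arima(y, order=c(0,1,0))  ## sets delta = mu = 0 since d > 0
Arima(y, order=c(0,1,0), include.drift=TRUE)  ## reports a drift near 1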

Model Selection Criteria, Trends, and Stationarity


• A trend stationary process is said to contain a deterministic trend, while a difference stationary process
such as a random walk with drift is said to contain a stochastic trend
• Thus, a trend stationary process is nonstationary and can be made stationary by removing the trend. Likewise, a difference stationary process can be made stationary by differencing, as discussed earlier

• Consider the practical issue of how one might distinguish between a non-stationary random walk process with drift (a difference stationary process) and a non-stationary autoregressive process that is trend stationary (a process with a deterministic trend)

• One way would be to estimate a set of candidate models and allow some model selection criteria to determine which model is the best
• In what follows we consider the Bayes Information Criterion (Schwarz 1978) and use the R command BIC to compute the criterion function value for each of the candidate models (smaller values are preferred)
• We consider two data generating processes for this exercise, one a non-stationary AR(1) containing a
time trend given by
yt = ayt−1 + bt + ϵt , |a| < 1

• The other is a non-stationary random walk with drift given by

yt = yt−1 + d + ϵt

• The former is trend stationary while the latter is difference stationary

• We simulate two series plotted in the figure that follows, one from each process, and compute four
candidate models:
– A classical AR(1) model (‘Model yt ’ in what follows)
– A classical first-differenced AR(1) model (‘Model (1 − B)yt ’ in what follows)
– An augmented AR(1) model with a linear trend (‘Model yt − a − bt’ in what follows)
– An augmented first-differenced AR(1) model with a linear trend (‘Model (1 − B)(yt − ct)’ in what
follows)
• We present the BIC values for these models for each series in the table that follows
require(forecast)
set.seed(42)
n <- 250
## Simulate a trend stationary AR(1) process
x.ard <- numeric()
x.ard[1] <- 0
for(t in 2:n) x.ard[t] <- 0.5*x.ard[t-1] + 0.5*t + rnorm(1,sd=25)
x.ard <- ts(x.ard)
## Simulate a random walk with drift
x.rwd <- ts(cumsum(rnorm(n,mean=2,sd=25)))
## Fit four candidate models for the AR(1) series
model.ard.nd <- Arima(x.ard,c(1,0,0))
model.ard.nd.diff <- Arima(x.ard,c(1,1,0))
model.ard <- Arima(x.ard,c(1,0,0),include.drift=TRUE)
model.ard.diff <- Arima(x.ard,c(1,1,0),include.drift=TRUE)
## Compute the BIC criterion for each model
ard.BIC <- c(BIC(model.ard.nd),
BIC(model.ard.nd.diff),
BIC(model.ard),
BIC(model.ard.diff))
## Fit four candidate models for the random walk series
model.rwd.nd <- Arima(x.rwd,c(1,0,0))

model.rwd.nd.diff <- Arima(x.rwd,c(1,1,0))
model.rwd <- Arima(x.rwd,c(1,0,0),include.drift=TRUE)
model.rwd.diff <- Arima(x.rwd,c(1,1,0),include.drift=TRUE)
## Compute the BIC criterion for each model
rwd.BIC <- c(BIC(model.rwd.nd),
BIC(model.rwd.nd.diff),
BIC(model.rwd),
BIC(model.rwd.diff))

## Plot the two series


ylim <- range(c(x.ard,x.rwd))
plot(x.ard,ylim=ylim,ylab="$y_t$")
lines(x.rwd,lty=2,col=2)
legend("topleft",c("Trend stationary AR(1)",
"Random walk with drift"),col=1:2,lty=1:2,bty="n")

[Figure: Simulated trend stationary AR(1) series and random walk with drift (n = 250).]

## Compare models
foo <- data.frame(cbind(ard.BIC,rwd.BIC))
colnames(foo) <- c("AR(1)","RWD")
rownames(foo) <- c("Model $y_t$",
"Model $(1-B)y_t$",
"Model $y_t-a-bt$",
"Model $(1-B)(y_t-ct)$")

knitr::kable(foo,caption="BIC Model Selection Criterion Values for Four Candidate Models Based on a Trend Stationary AR(1) Process and a Random Walk with Drift (smaller values are better).",
escape=FALSE)

Table 2: BIC Model Selection Criterion Values for Four Candidate Models Based on a Trend Stationary AR(1) Process and a Random Walk with Drift (smaller values are better).

Model                    AR(1)      RWD
Model yt                 2392.883   2322.317
Model (1 − B)yt          2366.227   2306.183
Model yt − a − bt        2326.711   2320.778
Model (1 − B)(yt − ct)   2371.132   2311.251

• For the trend stationary AR(1) process, we observe that the BIC criterion selected Model yt − a − bt
(BIC = 2326.711)
• For the non-stationary random walk with drift, we can see that the BIC criterion selected Model
(1 − B)yt (BIC = 2306.183)
• The model selection criteria that we study later in the course therefore appear to have the ability to
distinguish between these two cases
• Note, however, that these selection procedures are based on a random sample and are not guaranteed
to deliver the true model in applied settings, even in the unlikely event that it was contained in the set
of candidate models

Model Selection via auto.arima()


• The function auto.arima() returns the best ARIMA(p, d, q) model according to either its AIC, AICc
or BIC value by conducting a search over possible models within the order constraints provided
• The function auto.arima() is not guaranteed to deliver the best model according to the selection
criterion used
• You might therefore wish to consider the model produced to be a candidate model and then proceed
with some of the diagnostics that follow to see whether you might improve upon the candidate
• We will study model selection via these criteria later in this course
• The following figure plots the series and an h=10 step-ahead prediction for the Canadian Lynx data
(i.e., a sequence of 1-10 year horizon forecasts)

Example - Canadian Lynx Data (auto.arima())


data(lynx)
model <- auto.arima(lynx,stepwise=FALSE,approximation=FALSE)
plot(forecast(model,h=10))
lines(fitted(model),col=2,lty=2)
legend("topleft",c("Data","fitted(model)","forecast(model,h=10)"),lty=c(1,2,1),col=c(1,2,4),bty="n")

[Figure: Forecasts from ARIMA(4,0,0) with non-zero mean: the lynx series with fitted values and 10-step-ahead forecasts, 1820–1940.]

Diagnostics for ARIMA(p, d, q) Models


• If a model provides an adequate fit to the data, the residuals from the model ought to be white noise
• There is a useful function in base R called tsdiag() that you can apply to a candidate model
• This is a generic function that plots
– The residuals, often standardized
– The autocorrelation function of the residuals
– The p-values of a portmanteau test for all lags up to gof.lag

tsdiag() Illustration For the Canadian Lynx Data


tsdiag(model)

[Figure: tsdiag() output for the lynx model: standardized residuals, ACF of residuals, and p-values for the Ljung-Box statistic by lag.]

Seasonal ARIMA(p, d, q)(P, D, Q)m Models


• If we were selling ceiling fans then we might predict this July’s sales using last July’s sales (lag 12)
• This relationship of predicting using last year’s data would hold for any month of the year
• We might also expect that this June’s sales would also be useful (lag 1)
• The non-seasonal ARIMA(p, d, q) model cannot incorporate seasonal data
• It can, however, be overloaded to model seasonal data
• We can overload the non-seasonal ARIMA(p, d, q) model to not only include first differences, second
differences and so on, but also seasonal differences

Seasonal ARIMA(p, d, q)(P, D, Q)m Models


• We overload a non-seasonal ARIMA(p, d, q) model by including additional seasonal parts
• The seasonal parts of the model consist of terms that are analogous to the non-seasonal terms, but
instead involve simple backshifts of the seasonal period, denoted m
• m represents the number of periods in the season (see the R function frequency())
• These models are denoted ARIMA(p, d, q)(P, D, Q)m
• The (P, D, Q)m component denotes the seasonal part of the model
• You specify a seasonal ARIMA(0, 1, 3)(2, 0, 0)12 model in R as follows:
Arima(UKDriverDeaths,order=c(0,1,3),seasonal=list(order=c(2,0,0),period=12))

Seasonal ARIMA(p, d, q)(P, D, Q)m Models
Example - ARIMA(1, 1, 1)(1, 1, 1)4
• An ARIMA(1, 1, 1)(1, 1, 1)4 model (without a constant) constructed for quarterly data (m = 4) can be expressed as

(1 − ϕ1 B)(1 − Φ1 B^4 )(1 − B)(1 − B^4 )yt = (1 − θ1 B)(1 − Θ1 B^4 )ϵt

• The terms involving B^4 are the seasonal components
– ϵt ∼ (0, σϵ²) is i.i.d.
– The first term on the left hand side, (1 − ϕ1 B), is the non-seasonal AR(1) component
– The second term, (1 − Φ1 B^4 ), is the seasonal AR(1) component
– The third term, (1 − B), is the non-seasonal difference component
– The fourth term, (1 − B^4 ), is the seasonal difference component

Seasonal ARIMA(p, d, q)(P, D, Q)m Models


Example - ARIMA(1, 1, 1)(1, 1, 1)4
• The first term on the right hand side, (1 − θ1 B), is the non-seasonal MA(1) component
• The second term, (1 − Θ1 B^4 ), is the seasonal MA(1) component
• Letting wt = ∆yt = (1 − B)yt = yt − yt−1 , we can see that

(1 − ϕ1 B)(1 − Φ1 B^4 )(1 − B)(1 − B^4 )yt = wt − ϕ1 wt−1 − (1 + Φ1 )(wt−4 − ϕ1 wt−5 ) + Φ1 (wt−8 − ϕ1 wt−9 )

Seasonal ARIMA(p, d, q)(P, D, Q)m Models


Example - ARIMA(1, 1, 1)(1, 1, 1)4
• Similarly,

(1 − θ1 B)(1 − Θ1 B^4 )ϵt = ϵt − θ1 ϵt−1 − Θ1 (ϵt−4 − θ1 ϵt−5 )

• So,

wt = ϕ1 wt−1 + (1 + Φ1 )(wt−4 − ϕ1 wt−5 ) − Φ1 (wt−8 − ϕ1 wt−9 ) + ϵt − θ1 ϵt−1 − Θ1 (ϵt−4 − θ1 ϵt−5 )

• This is simply an ARIMA(9, 0, 5) model in wt with a large number of parameter constraints, or an ARIMA(10, 1, 5) model in yt
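The lag pattern claimed above can be verified by multiplying out the polynomials with convolve() (a minimal sketch; the parameter values are illustrative):

## Expand (1 - phi1*B)(1 - Phi1*B^4)(1 - B^4), the operator applied to w_t
phi1 <- 0.5; Phi1 <- 0.3  ## illustrative values
polymul <- function(a, b) convolve(a, rev(b), type="open")
coefs <- polymul(polymul(c(1, -phi1), c(1, 0, 0, 0, -Phi1)), c(1, 0, 0, 0, -1))
round(coefs, 10)  ## nonzero entries at B^0, B^1, B^4, B^5, B^8, B^9 only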

Seasonal ARIMA(p, d, q)(P, D, Q)m Models


Estimation, Properties, Forecasts, and Forecast Error Variances
• ARIMA(p, d, q)(P, D, Q)m models are essentially ARIMA(p′ , d, q ′ ) models with a large number of
constraints (zero parameters and so forth)

• Therefore, there is nothing new about these models when it comes to properties, estimation, forecasting,
forecast errors, and forecast error variances
• We can leverage existing results for identification (white noise tests etc.)
• Here we might use both ndiffs() and nsdiffs() in addition to the acf() and pacf() functions
• We can also use auto.arima() for identifying a candidate model
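For seasonal data, the suggested ordinary and seasonal differencing orders can be obtained as follows (a minimal sketch using the h02 series introduced below):

require(fpp)
require(forecast)
data(h02)
ndiffs(h02)  ## suggested d
nsdiffs(h02)  ## suggested seasonal D (m = 12 is taken from frequency(h02))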

Seasonal ARIMA(p, d, q)(P, D, Q)m Models


Example - Modeling and Forecasting European Quarterly Retail Trade
• You will need to install the R packages fpp and forecast to run the following examples
• We will use the function auto.arima() to determine a candidate ARIMA(p, d, q)(P, D, Q)m model,
compare the series with the fitted values, and conduct h=10 step-ahead prediction
• The function auto.arima() returns the best ARIMA(p, d, q)(P, D, Q)m model according to either the
AIC, AICc or BIC values by conducting a search over possible models within the order constraints
provided
• The auto.arima() function is in the R package forecast, while the euretail data is in the R package
fpp so these packages need to be installed prior to running this code

Example - Modeling and Forecasting European Quarterly Retail Trade

require(fpp)
require(forecast)
data(euretail)
ggseasonplot(euretail,year.labels=TRUE)

[Figure 1: ggseasonplot of European Quarterly Retail Trade.]
Example - Modeling and Forecasting European Quarterly Retail Trade
## The euretail data is in the fpp package, auto.arima() in the forecast package
require(fpp)
require(forecast)
data(euretail)
model <- auto.arima(euretail,stepwise=FALSE,approximation=FALSE)
plot(forecast(model,h=10))
lines(fitted(model),col=2,lty=2)
legend("topleft",c("Data","fitted(model)","forecast(model,h=10)"),lty=1:2,col=c(1,2,4),bty="n")

[Figure: Forecasts from ARIMA(0,1,3)(0,1,1)[4]: the euretail series with fitted values and 10-step-ahead forecasts.]

Example - Modeling Monthly Corticosteroid Drug Sales

• We consider a data set from the fpp package, h02
• h02 is a series of monthly corticosteroid drug sales in Australia from 1992 to 2008
• The following table tabulates h=12 step-ahead forecasts (horizons 1 through 12, monthly data) computed using forecast(model,h=12)
• Make sure that the fpp package is installed prior to running this code
• We also present a figure with the time series, fitted values, and predictions along with prediction intervals for h=24 step-ahead forecasts

Example - Modeling Monthly Corticosteroid Drug Sales

require(fpp)
require(forecast)
data(h02)
ggseasonplot(h02,year.labels=TRUE)

[Figure 2: ggseasonplot of Monthly Corticosteroid Drug Sales.]

Example - Modeling Monthly Corticosteroid Drug Sales


require(fpp)
require(forecast)
data(h02)
model <- auto.arima(h02,stepwise=FALSE,approximation=FALSE)
plot(forecast(model,h=24))
lines(fitted(model),col=2,lty=2)
legend("topleft",c("Data","fitted(model)","forecast(model,h=24)"),lty=1:2,col=c(1,2,4),bty="n")

[Figure: Forecasts from ARIMA(3,1,1)(0,1,1)[12]: the h02 series with fitted values and 24-step-ahead forecasts.]

Example - Modeling Monthly Corticosteroid Drug Sales

## The h02 data is in the fpp package, auto.arima() in the forecast package
require(fpp)
require(forecast)
data(h02)
model <- auto.arima(h02,stepwise=FALSE,approximation=FALSE)
knitr::kable(data.frame(forecast(model,h=12)),
caption="Australian Monthly Corticosteroid Drug Sales Forecasts.")

Table 3: Australian Monthly Corticosteroid Drug Sales Forecasts.

Point.Forecast Lo.80 Hi.80 Lo.95 Hi.95
Jul 2008 1.0166464 0.9481087 1.0851840 0.9118270 1.1214657
Aug 2008 1.0566558 0.9878375 1.1254740 0.9514073 1.1619042
Sep 2008 1.0981460 1.0248512 1.1714409 0.9860513 1.2102408
Oct 2008 1.1617772 1.0838991 1.2396553 1.0426729 1.2808814
Nov 2008 1.1685904 1.0894889 1.2476919 1.0476152 1.2895657
Dec 2008 1.2000699 1.1186714 1.2814684 1.0755816 1.3245582
Jan 2009 1.2467725 1.1638749 1.3296701 1.1199916 1.3735535
Feb 2009 0.7093491 0.6253626 0.7933355 0.5809029 0.8377952
Mar 2009 0.7130936 0.6279837 0.7982036 0.5829292 0.8432580
Apr 2009 0.7525096 0.6665585 0.8384607 0.6210587 0.8839605
May 2009 0.8225565 0.7358643 0.9092487 0.6899722 0.9551407
Jun 2009 0.8340600 0.7467027 0.9214174 0.7004585 0.9676616

External Predictors
• Sometimes you may have additional predictor variables that you believe affect yt but are not themselves lagged
values of either yt or ϵt

• These predictors (often called external predictors) turn out to be straightforward to incorporate in the ARIMA(p, d, q)(P, D, Q)m framework
• They simply appear as additional explanatory variables on the right hand side of the model
• In R, the arima(), Arima() and auto.arima() functions support the argument xreg=
• Note that xreg is an optional vector or matrix of external predictors, which must have the same number of rows as y

Example - UK Driver Deaths and Compulsory Seat Belt Laws


• UKDriverDeaths is a time series giving the monthly totals of car drivers in Great Britain killed or seriously injured from January 1969 to December 1984
• Compulsory wearing of seat belts was introduced on 31 January 1983
• The figure that follows presents the ggseasonplot for this series
• You can see a secular decline in fatalities from 1983 onward

Example - UK Driver Deaths and Compulsory Seat Belt Laws

require(forecast)
ggseasonplot(UKDriverDeaths,year.labels=TRUE)

[Figure 3: ggseasonplot of Monthly Totals of Car Drivers in Great Britain Killed or Seriously Injured, January 1969 to December 1984.]

Example - UK Driver Deaths and Compulsory Seat Belt Laws


• We wish to introduce an external predictor (a dummy variable for the compulsory wearing of seat belts) into a seasonal ARIMA(p, d, q)(P, D, Q)m model
• We use the auto.arima() function to locate a candidate model, then fit that candidate with the external predictor seatbelt using the Arima() function, as in the code below, which also computes coefficient estimates, standard errors, z-statistics, and p-values
• Note that the external predictor seatbelt is statistically significant and accounts for an average monthly decrease in the number of drivers killed or seriously injured
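require(forecast)
## Create a dummy variable taking value 0 prior to January 31 1983, 1 afterwards
seatbelt <- as.matrix(c(rep(0,169),rep(1,23)))
## Fit a seasonal ARIMA model with the additional predictor `seatbelt`
## (p,d,q)(P,D,Q)_m taken from auto.arima()
## model <- auto.arima(UKDriverDeaths,stepwise=FALSE,approximation=FALSE)
model <- Arima(UKDriverDeaths,
order=c(0,1,3),
seasonal=list(order=c(2,0,0),period=12),
xreg=seatbelt)
## Extract the model coefficients and standard errors, then compute
## z-statistics and two-sided p-values
params <- coef(model)
se <- sqrt(diag(vcov(model)))
z <- params/se
p <- 2*(1-pnorm(abs(z)))

The vectors params, se, z, and p then contain the coefficient summary referred to above.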
