Lecture-9-Univariate-Time-Series-Modelling - Part 1
OlaOluwa S. Yaya
2022-12-13
Bibliography
• Box, G. E. P., and G. M. Jenkins. 1976. Time Series Analysis: Forecasting and Control. 2nd ed. San Francisco: Holden-Day.
• Racine, J. S. 2019. Reproducible Econometrics Using R. New York: Oxford University Press.
• Schwarz, G. 1978. "Estimating the Dimension of a Model." The Annals of Statistics 6: 461–64.
Introduction
• Univariate linear time series models are often used when the time series econometrician's interest lies in predicting future values of a series $y_t$ on the basis of its past behaviour $y_1, y_2, \ldots, y_t = \{y_i\}_{i=1}^{t}$
• Thus, the researcher may know little or nothing about the causal relationships between the variable to be forecasted and the potential explanatory variable(s)
• The goal is to model the stochastic process underlying the series and to use this for forecasting purposes
• Therefore, we concentrate on the forecasts, their errors, and the variance of these forecast errors
• Time series modelling requires time series observations to be stationary before they can be modeled via
equations with fixed coefficients that can be estimated from past data
• Recall that a process is said to be stationary if its stochastic properties are invariant with respect to time, while those of a nonstationary process vary with time
• A consequence of stationarity is that the mean, variance, and covariances of the series are all time-
invariant
• Linear time series models are typically estimated via the method of maximum-likelihood, which you
have studied previously
MA(q) Models
• If a time series has been generated by a moving average process of order q, then it can be expressed as
$$y_t = \mu + \epsilon_t - \theta_1\epsilon_{t-1} - \cdots - \theta_q\epsilon_{t-q}$$
• We can therefore express an MA(q) process as
$$\begin{aligned}
y_t &= \mu + \epsilon_t - \theta_1 B\epsilon_t - \cdots - \theta_q B^q\epsilon_t\\
&= \mu + (1 - \theta_1 B - \cdots - \theta_q B^q)\epsilon_t\\
&= \mu + \Theta(B)\epsilon_t
\end{aligned}$$
Example
The code below uses the arima() function from base R (the stats package) to fit moving average models of orders q = 2, 4, and 6 to the volume of electricity sold to residential customers in South Australia each year from 1989 to 2008 (elecsales is in the R package fpp)
## The data elecsales is from the fpp package
require(fpp)
data(elecsales)
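A sketch of the plotting code, by analogy with the AR(p) example later in these notes (the line types, colours, and legend labels are assumptions):
plot(elecsales, main="Residential Electricity Sales",ylab="GWh", xlab="Year")
lines(fitted(arima(elecsales,c(0,0,2))),col=2,lty=2)
lines(fitted(arima(elecsales,c(0,0,4))),col=3,lty=3)
lines(fitted(arima(elecsales,c(0,0,6))),col=4,lty=4)
legend("topleft",c("Data","MA(2)","MA(4)","MA(6)"),col=1:4,lty=1:4,bty="n")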
[Figure: Residential electricity sales (GWh) by year, with fitted MA(2), MA(4), and MA(6) series overlaid]
• It has variance given by
$$\begin{aligned}
\gamma_0 &= E\left[(y_t - \mu)^2\right]\\
&= E\left[\epsilon_t^2 + \sum_{i=1}^{q}\theta_i^2\epsilon_{t-i}^2 + \ldots\right]\\
&= \left(1 + \sum_{i=1}^{q}\theta_i^2\right)\sigma_\epsilon^2,
\end{aligned}$$
where the omitted cross-product terms have expectation zero because the $\epsilon_t$ are uncorrelated
• The covariance at lag $k$ is given by
$$\gamma_k = (-\theta_k + \theta_{k+1}\theta_1 + \cdots + \theta_q\theta_{q-k})\sigma_\epsilon^2$$
for $k \le q$, and $\gamma_k = 0$ for $k > q$
• Moving average processes are characterized by dependency on a finite past given by the order of the
process, q
• Given that the mean, variance, and covariances of MA(q) models do not depend on time, the requirements for stationarity are satisfied provided the process has finite variance
• Recall that the variance is given by
$$\gamma_0 = \left(1 + \sum_{i=1}^{q}\theta_i^2\right)\sigma_\epsilon^2$$
• The restriction required for an MA(q) process to have finite variance is therefore
$$\sum_{i=1}^{q}\theta_i^2 < \infty$$
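• As a quick numerical check of the variance formula (a sketch; the MA(2) coefficients are arbitrary, and arima.sim() uses a + sign convention on the $\theta_i$, which leaves the variance unchanged)
## Verify gamma_0 = (1 + sum(theta_i^2)) * sigma_e^2 by simulation
set.seed(42)
theta <- c(-.5,.4)
y <- arima.sim(model=list(ma=theta),n=100000)
c(sample.var=var(y),theoretical=1+sum(theta^2))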
• Identification of moving average processes is quite straightforward
• We do so on the basis of the sample autocorrelation and partial autocorrelation functions
• We exploit the fact that the partial autocorrelation function decays exponentially, while the autocorrelation function cuts off to 0 after lag q. The cut-off point is determined using the Bartlett approximation
set.seed(42)
par(mfrow=c(1,3))
pacf(arima.sim(model=list(ma=c(-.8)),n=1000),lag.max=5,main="MA(1)")
pacf(arima.sim(model=list(ma=c(-.5,.4)),n=1000),lag.max=5,main="MA(2)")
pacf(arima.sim(model=list(ma=c(-.3,.2,.3)),n=1000),lag.max=5,main="MA(3)")
[Figure: sample PACFs of simulated MA(1), MA(2), and MA(3) series, lags 1-5]
par(mfrow=c(1,1))
set.seed(42)
par(mfrow=c(1,3))
acf(arima.sim(model=list(ma=c(-.8)),n=1000),lag.max=5,main="MA(1)")
acf(arima.sim(model=list(ma=c(-.5,.4)),n=1000),lag.max=5,main="MA(2)")
acf(arima.sim(model=list(ma=c(-.3,.2,.3)),n=1000),lag.max=5,main="MA(3)")
[Figure: sample ACFs of simulated MA(1), MA(2), and MA(3) series, lags 0-5]
par(mfrow=c(1,1))
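• The Bartlett cut-off can be checked directly; the sketch below uses the simplest white-noise version of the band, $\pm 1.96/\sqrt{n}$, which is the dashed band that acf() draws by default
set.seed(42)
y <- arima.sim(model=list(ma=c(-.5,.4)),n=1000)
## sample ACF at lags 1 to 10 (drop lag 0)
rho <- acf(y,lag.max=10,plot=FALSE)$acf[-1]
## approximate 95% band under the white-noise null
band <- 1.96/sqrt(length(y))
which(abs(rho) > band) ## for an MA(2) we expect lags 1 and 2 to stand out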
AR(p) Models
• If a time series has been generated by an autoregressive process of order p, then it can be expressed as
$$\begin{aligned}
y_t &= \delta + \phi_1 y_{t-1} + \cdots + \phi_p y_{t-p} + \epsilon_t\\
y_t - \phi_1 By_t - \cdots - \phi_p B^p y_t &= \delta + \epsilon_t\\
(1 - \phi_1 B - \cdots - \phi_p B^p)y_t &= \delta + \epsilon_t\\
\Phi(B)y_t &= \delta + \epsilon_t
\end{aligned}$$
• These models are extremely flexible and are capable of handling a wide range of time series relationships
Example
## The data elecsales is from the fpp package
require(fpp)
data(elecsales)
plot(elecsales, main="Residential Electricity Sales",ylab="GWh", xlab="Year")
lines(fitted(arima(elecsales,c(2,0,0))),col=2,lty=2)
lines(fitted(arima(elecsales,c(4,0,0))),col=3,lty=3)
lines(fitted(arima(elecsales,c(6,0,0))),col=4,lty=4)
legend("topleft",c("Data","AR(2)","AR(4)","AR(6)"),col=1:4,lty=1:4,bty="n")
[Figure: Residential Electricity Sales (GWh) by year, with fitted AR(2), AR(4), and AR(6) series overlaid]
• If yt is stationary, the mean is invariant with respect to time, therefore a stationary AR(p) process has
mean given by
$$\mu_y = \delta + \phi_1\mu_y + \cdots + \phi_p\mu_y$$
• This is equal to
$$\mu_y = \frac{\delta}{1 - \phi_1 - \cdots - \phi_p}$$
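• A quick simulation check of this formula (a sketch; the parameter values are arbitrary)
## For an AR(1) with delta = 2 and phi = 0.6 the mean should be 2/(1-0.6) = 5
set.seed(42)
n <- 100000
delta <- 2; phi <- 0.6
y <- numeric(n); y[1] <- delta/(1-phi)
for(t in 2:n) y[t] <- delta + phi*y[t-1] + rnorm(1)
c(sample.mean=mean(y),theoretical=delta/(1-phi))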
• The variance is (taking $\delta = 0$ so that $E[y_t] = 0$, without loss of generality)
$$\begin{aligned}
Var[y_t] = \gamma_0 &= E[(y_t - E[y_t])^2]\\
&= E[y_t^2]\\
&= E[y_t y_t]\\
&= E\left[y_t(\phi_1 y_{t-1} + \cdots + \phi_p y_{t-p} + \epsilon_t)\right]\\
&= \phi_1 E[y_t y_{t-1}] + \cdots + \phi_p E[y_t y_{t-p}] + E[y_t\epsilon_t]\\
&= \phi_1\gamma_1 + \cdots + \phi_p\gamma_p + \sigma_\epsilon^2
\end{aligned}$$
• From the above, for stationarity, we can check the restrictions on the parameters ϕi required for finite
variance
• By way of example, consider an AR(1) model given by
$$y_t = \delta + \phi_1 y_{t-1} + \epsilon_t$$
$$\gamma_0 = \phi_1\gamma_1 + \sigma_\epsilon^2,\qquad \gamma_1 = \phi_1\gamma_0,$$
so that $\gamma_0 = \sigma_\epsilon^2/(1 - \phi_1^2)$
• Under stationarity, this must be a finite positive number, therefore we require that $|\phi_1| < 1$
• Next, consider an AR(2) model given by
$$y_t = \delta + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \epsilon_t$$
$$\gamma_0 = \phi_1\gamma_1 + \phi_2\gamma_2 + \sigma_\epsilon^2,\qquad
\gamma_1 = \phi_1\gamma_0 + \phi_2\gamma_1,\qquad
\gamma_2 = \phi_1\gamma_1 + \phi_2\gamma_0$$
• Solving these equations for $\gamma_0$ yields
$$\gamma_0 = \frac{(1 - \phi_2)\sigma_\epsilon^2}{(1 + \phi_2)(1 - \phi_1 - \phi_2)(1 + \phi_1 - \phi_2)}$$
• Under stationarity, this must be a positive finite number, therefore we require that $|\phi_2| < 1$, $\phi_1 + \phi_2 < 1$, and $-\phi_1 + \phi_2 < 1$
• The covariances for $k = 0, 1, \ldots, p$ are given by
$$\begin{aligned}
\gamma_0 &= \phi_1\gamma_1 + \cdots + \phi_p\gamma_p + \sigma_\epsilon^2,\\
\gamma_1 &= \phi_1\gamma_0 + \cdots + \phi_p\gamma_{p-1},\\
&\;\;\vdots\\
\gamma_p &= \phi_1\gamma_{p-1} + \cdots + \phi_p\gamma_0
\end{aligned}$$
• Dividing the covariances by $\gamma_0$ gives the autocorrelations
$$\begin{aligned}
\rho_0 &= 1,\\
\rho_1 &= \frac{\phi_1\gamma_0 + \cdots + \phi_p\gamma_{p-1}}{\gamma_0},\\
&\;\;\vdots\\
\rho_p &= \frac{\phi_1\gamma_{p-1} + \cdots + \phi_p\gamma_0}{\gamma_0}
\end{aligned}$$
• We write these as
$$\begin{aligned}
\rho_0 &= 1,\\
\rho_1 &= \phi_1 + \phi_2\rho_1 + \cdots + \phi_p\rho_{p-1},\\
&\;\;\vdots\\
\rho_p &= \phi_1\rho_{p-1} + \phi_2\rho_{p-2} + \cdots + \phi_p
\end{aligned}$$
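• The autocorrelations implied by these relations can be computed in base R with ARMAacf(); for example, for a hypothetical AR(2)
## Theoretical ACF of an AR(2) with phi1 = 0.5 and phi2 = 0.3, obtained by
## solving the Yule-Walker relations (lags 0 through 5)
ARMAacf(ar=c(.5,.3),lag.max=5)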
• A useful rule of thumb for an autoregressive model is that if the sum of the coefficients is less than one and no coefficient exceeds one in absolute value, then it is likely that the relationship is stationary
• If an AR(p) process is stationary we say that the characteristic roots of the polynomial Φ(B) lie outside
of the unit circle
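• This condition is easy to check numerically; a minimal sketch using base R's polyroot() (the AR(2) coefficients are arbitrary)
## Roots of Phi(B) = 1 - 0.5B - 0.3B^2; stationarity requires all roots
## to lie outside the unit circle (modulus greater than 1)
phi <- c(.5,.3)
roots <- polyroot(c(1,-phi))
Mod(roots) ## both moduli exceed 1, so this AR(2) is stationary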
• Identification of autoregressive processes is also quite straightforward
• We do so on the basis of the sample autocorrelation and partial autocorrelation functions
• We exploit the fact that the autocorrelation function decays exponentially, while the partial autocorrelation function cuts off to 0 after lag p. The cut-off point is determined using the Bartlett approximation
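• The sample ACF and PACF of the elecsales series shown below were presumably produced by something along the following lines (a sketch; the lag range is inferred from the plots)
require(fpp)
data(elecsales)
acf(elecsales,lag.max=6)
pacf(elecsales,lag.max=6)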
[Figure: sample ACF and PACF of the elecsales series, lags 0-6]
where $\epsilon_t \sim (0, \sigma_\epsilon^2)$ is i.i.d., and each of the models is a well-specified Autoregressive Integrated Moving Average [ARIMA(p, d, q)] model
• We say that yt is I(d) or integrated of order d. The term comes from calculus; if dy/dt = x, then y is
the integral of x
• In discrete time series, if ∆yt = xt , then y might also be viewed as the integral, or sum over t, of x
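• In R this correspondence is immediate, as the short sketch below illustrates
## Cumulative summation ("integration") and differencing are inverse
## operations on a discrete series
set.seed(42)
x <- rnorm(10)
y <- cumsum(x) ## y is the running sum ("integral") of x
all.equal(diff(y),x[-1]) ## differencing y recovers x (TRUE)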
• Using the backwards shift operator, and letting $w_t = \Delta^d y_t$ denote the $d$-times differenced series, we may express our ARIMA(p, d, q) model as
$$\begin{aligned}
w_t &= \delta + \phi_1 Bw_t + \cdots + \phi_p B^p w_t + \epsilon_t - \theta_1 B\epsilon_t - \cdots - \theta_q B^q\epsilon_t\\
(1 - \phi_1 B - \cdots - \phi_p B^p)w_t &= \delta + (1 - \theta_1 B - \cdots - \theta_q B^q)\epsilon_t
\end{aligned}$$
• Here, Φ(B) and Θ(B) are polynomials of order p and q in the backwards shift operator B
• This is a parsimonious way of writing an ARIMA(p, d, q) model
• Many of the models we have studied above are special cases of the ARIMA(p, d, q) model:
– White noise: ARIMA(0, 0, 0)
– Random walk: ARIMA(0, 1, 0) with no constant
– Random walk with drift: ARIMA(0, 1, 0) with a constant
– AR(p): ARIMA(p, 0, 0)
– MA(q): ARIMA(0, 0, q)
Identification of ARIMA(p, d, q) Processes
• Use the sample autocorrelation function to try to infer the order of the moving average component, and the partial autocorrelation function to try to infer the order of the autoregressive component
• Having found likely candidates for p, d, and q, we estimate the model and then conduct some basic
diagnostic checks
– If the model is correctly specified, then the residuals should be white noise
– If they are not, then you may have the wrong order of the autoregressive or moving average
component
– Fortunately, it would appear that most time series can be characterized by low order ARIMA(p, d, q)
processes
• There is also a handy function named auto.arima() in the forecast package that can be helpful for
locating a candidate model
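• For example, a minimal sketch applied to the elecsales series from earlier (default settings only)
require(forecast)
require(fpp)
data(elecsales)
auto.arima(elecsales) ## searches over candidate (p,d,q) using AICc by default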
• When starting values for the $\phi$ parameters are required, we use the partial autocorrelation function, setting $\hat{\phi}_i = a_i$ for $i = 1, \ldots, p$
• When R estimates an ARIMA(p, d, q) model, it uses ML, which delivers parameter values that maximize
the probability of obtaining the time series that we have observed
• For ARIMA(p, d, q) models, ML is similar to the least squares estimates that would be obtained by
minimizing the sum of squared residuals
• However, you need to realize that ARIMA(p, d, q) models are much more complicated to estimate than
simple regression models, and different software implementations can give slightly different answers
because they employ different estimation procedures and algorithms
• There is a feature of this model that is worth noting, namely, that including a constant in a non-
stationary ARIMA(p, d, q) model is equivalent to incorporating a polynomial trend of order d in the
model
• In base R there is a function named arima(), and in the R forecast package there is a similar but
slightly different function named Arima() (R is case sensitive)
• Both can be used to fit ARIMA(p, d, q) models, but they differ in one aspect that is relevant when
forecasting series containing a trend
• The arima() function sets δ = µ = 0 when d > 0, while when d = 0 it provides an estimate of
µ = δ/(1 − ϕ1 − · · · − ϕp ). The parameter µ is called the intercept in the model summary
• The arima() function accepts the argument include.mean which only has an effect when d = 0 (setting
include.mean=FALSE will set δ = µ = 0)
• The Arima() function, on the other hand, provides more flexibility for the inclusion of a (linear) trend
• It accepts the argument include.mean which has identical functionality to the corresponding argument
for arima(), but also accepts the argument include.drift which allows µ ̸= 0 when d = 1 (when
d = 1 the parameter µ is called the drift in the R output)
• The Arima() function also accepts the argument include.constant which, if TRUE, will set
include.mean=TRUE if d = 0 and include.drift=TRUE when d = 1
• Note that when d > 1 a constant is ignored by Arima() since quadratic and higher order trends can
produce erratic and unreasonable forecasts in this setting
• When d = 0 and include.drift=TRUE is set, the fitted model from Arima() is
$$\Phi(B)(y_t - a - bt) = \Theta(B)\epsilon_t$$
• An extra linear trend term $bt$ is thereby included in the model by Arima() that is not present when using arima()
• The R output will label $a$ the intercept and $b$ the drift coefficient
• When d = 1 and include.drift=TRUE is set, the fitted model from Arima() is
$$\Phi(B)(\Delta y_t - c) = \Theta(B)\epsilon_t$$
• In this case an extra constant term $c$ is included in the model by Arima() that is not present when using arima(), which is equivalent to including a linear trend
• The R output will label $c$ the drift coefficient
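• The difference is easy to see on a simulated random walk with drift (a sketch; the series is hypothetical)
require(forecast)
set.seed(42)
y <- ts(cumsum(rnorm(100,mean=1)))
arima(y,order=c(0,1,0)) ## no constant is estimated when d = 1
Arima(y,order=c(0,1,0),include.drift=TRUE) ## reports a 'drift' coefficient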
• Consider the practical issue of how one might distinguish between a non-stationary random walk process with drift (a case of difference stationarity) and a non-stationary autoregressive process containing a deterministic time trend (a case of trend stationarity)
• One way would be to estimate a set of candidate models and allow some model selection criteria to
determine which model is the best
• In what follows we consider the Bayes Information Criterion (Schwarz 1978) and use the R command BIC to compute the criterion function value for each of the candidate models (smaller values are preferred)
• We consider two data generating processes for this exercise, one a non-stationary AR(1) containing a time trend given by
$$y_t = ay_{t-1} + bt + \epsilon_t,\quad |a| < 1$$
• The other is a non-stationary random walk with drift given by
$$y_t = y_{t-1} + d + \epsilon_t$$
• We simulate two series plotted in the figure that follows, one from each process, and compute four
candidate models:
– A classical AR(1) model (‘Model yt ’ in what follows)
– A classical first-differenced AR(1) model (‘Model (1 − B)yt ’ in what follows)
– An augmented AR(1) model with a linear trend (‘Model yt − a − bt’ in what follows)
– An augmented first-differenced AR(1) model with a linear trend (‘Model (1 − B)(yt − ct)’ in what
follows)
• We present the BIC values for these models for each series in the table that follows
require(forecast)
set.seed(42)
n <- 250
## Simulate a trend stationary AR(1) process
x.ard <- numeric()
x.ard[1] <- 0
for(t in 2:n) x.ard[t] <- 0.5*x.ard[t-1] + 0.5*t + rnorm(1,sd=25)
x.ard <- ts(x.ard)
## Simulate a random walk with drift
x.rwd <- ts(cumsum(rnorm(n,mean=2,sd=25)))
## Fit four candidate models for the AR(1) series
model.ard.nd <- Arima(x.ard,c(1,0,0))
model.ard.nd.diff <- Arima(x.ard,c(1,1,0))
model.ard <- Arima(x.ard,c(1,0,0),include.drift=TRUE)
model.ard.diff <- Arima(x.ard,c(1,1,0),include.drift=TRUE)
## Compute the BIC criterion for each model
ard.BIC <- c(BIC(model.ard.nd),
BIC(model.ard.nd.diff),
BIC(model.ard),
BIC(model.ard.diff))
## Fit four candidate models for the random walk series
model.rwd.nd <- Arima(x.rwd,c(1,0,0))
model.rwd.nd.diff <- Arima(x.rwd,c(1,1,0))
model.rwd <- Arima(x.rwd,c(1,0,0),include.drift=TRUE)
model.rwd.diff <- Arima(x.rwd,c(1,1,0),include.drift=TRUE)
## Compute the BIC criterion for each model
rwd.BIC <- c(BIC(model.rwd.nd),
BIC(model.rwd.nd.diff),
BIC(model.rwd),
BIC(model.rwd.diff))
[Figure: simulated trend stationary AR(1) series and random walk with drift series]
## Compare models
foo <- data.frame(cbind(ard.BIC,rwd.BIC))
colnames(foo) <- c("AR(1)","RWD")
rownames(foo) <- c("Model $y_t$",
"Model $(1-B)y_t$",
"Model $y_t-a-bt$",
"Model $(1-B)(y_t-ct)$")
knitr::kable(foo,caption="BIC Model Selection Criterion Values for Four Candidate Models Based on a Trend Stationary AR(1) Process and a Random Walk with Drift (smaller values are better).",
             escape=FALSE)
Table: BIC Model Selection Criterion Values for Four Candidate Models Based on a Trend Stationary AR(1) Process and a Random Walk with Drift (smaller values are better).

|                         |    AR(1) |      RWD |
|:------------------------|---------:|---------:|
| Model $y_t$             | 2392.883 | 2322.317 |
| Model $(1-B)y_t$        | 2366.227 | 2306.183 |
| Model $y_t - a - bt$    | 2326.711 | 2320.778 |
| Model $(1-B)(y_t - ct)$ | 2371.132 | 2311.251 |
• For the trend stationary AR(1) process, we observe that the BIC criterion selected Model yt − a − bt
(BIC = 2326.711)
• For the non-stationary random walk with drift, we can see that the BIC criterion selected Model
(1 − B)yt (BIC = 2306.183)
• The model selection criteria that we study later in the course therefore appear to have the ability to
distinguish between these two cases
• Note, however, that these selection procedures are based on a random sample and are not guaranteed
to deliver the true model in applied settings, even in the unlikely event that it was contained in the set
of candidate models
[Figure: Forecasts from ARIMA(4,0,0) with non-zero mean — data, fitted(model), and forecast(model,h=10)]
[Figure: residual diagnostics — standardized residuals, ACF of residuals, and p-values by lag]
Seasonal ARIMA(p, d, q)(P, D, Q)m Models
Example - ARIMA(1, 1, 1)(1, 1, 1)4
• An ARIMA(1, 1, 1)(1, 1, 1)4 model (without a constant) constructed for quarterly data (m = 4) can be
expressed as
$$(1 - \phi_1 B)(1 - \Phi_1 B^4)(1 - B)(1 - B^4)y_t = (1 - \theta_1 B)(1 - \Theta_1 B^4)\epsilon_t$$
• So, multiplying out the polynomials shows that this is simply a restricted ARMA model expressed in powers of $B$, of order 10 on the autoregressive side and order 5 on the moving average side
• Therefore, there is nothing new about these models when it comes to properties, estimation, forecasting,
forecast errors, and forecast error variances
• We can leverage existing results for identification (white noise tests etc.)
• Here we might use both ndiffs() and nsdiffs() in addition to the acf() and pacf() functions
• We can also use auto.arima() for identifying a candidate model
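• As a sketch, applied to the euretail series used in the next example
require(forecast)
require(fpp)
data(euretail)
ndiffs(euretail) ## suggested number of first differences, d
nsdiffs(euretail) ## suggested number of seasonal differences, D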
[Figure: Seasonal plot of the quarterly euretail series by year, 1996-2011]
Example - Modeling and Forecasting European Quarterly Retail Trade
## The euretail data is in the fpp package, auto.arima() in the forecast package
require(fpp)
require(forecast)
data(euretail)
model <- auto.arima(euretail,stepwise=FALSE,approximation=FALSE)
plot(forecast(model,h=10))
lines(fitted(model),col=2,lty=2)
legend("topleft",c("Data","fitted(model)","forecast(model,h=10)"),lty=1:2,col=c(1,2,4),bty="n")
[Figure: Forecasts from ARIMA(0,1,3)(0,1,1)[4] for euretail — data, fitted(model), and forecast(model,h=10)]
Example - Modeling Monthly Corticosteroid Drug Sales
[Figure: Seasonal plot of the monthly h02 (corticosteroid drug sales) series by year, 1991-2008]
[Figure: Forecasts from ARIMA(3,1,1)(0,1,1)[12] for h02 — data, fitted(model), and forecast(model,h=24)]
External Predictors
• Sometimes you may have additional predictor variables that you believe affect yt but are not themselves lagged
values of either yt or ϵt
• These predictors (often called external predictors) turn out to be straightforward to incorporate in the
ARIMA(p, d, q)(P, D, Q)m framework
• They simply appear as additional explanatory variables on the right hand side of the model
• In R, the arima(), Arima() and auto.arima() functions support the argument xreg=
• Note that xreg is an optional vector or matrix of external predictors, which must have the same number of
rows as y
[Figure: Seasonal plot of monthly UKDriverDeaths by year, 1969-1984]
Figure 3: ggseasonplot of Monthly Totals of Car Drivers in Great Britain Killed or Seriously Injured, January 1969 to December 1984.
• Note that the external predictor seatbelt is statistically significant, and it accounts for an average decrease in deaths or serious injuries per month after the seatbelt law took effect; the code fitting this model appears below
require(forecast)
## Create a dummy variable taking value 0 prior to January 31 1983, 1 afterwards
seatbelt <- as.matrix(c(rep(0,169),rep(1,23)))
## Fit a seasonal ARIMA model with the additional predictor `seatbelt`
## (p,d,q)(P,D,Q)_m taken from auto.arima()
## model <- auto.arima(UKDriverDeaths,stepwise=FALSE,approximation=FALSE)
model <- Arima(UKDriverDeaths,
order=c(0,1,3),
seasonal=list(order=c(2,0,0),period=12),
xreg=seatbelt)
## Extract the model coefficients, standard errors, compute
## the z-statistic and associated p-values
params <- coef(model)
se <- sqrt(diag(vcov(model)))
z <- params/se
p <- 2*(1-pnorm(abs(z))) ## two-sided p-values
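## A compact way to display the results (a sketch, not part of the original notes)
round(cbind(Estimate=params,Std.Error=se,z=z,p.value=p),4)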