
STA457 Week 7 Notes

The document covers Integrated ARMA (ARIMA) models and the process of building these models for time series analysis. It outlines steps for fitting ARIMA models, including data plotting, transformation, identifying model orders, parameter estimation, and diagnostics. Additionally, it provides examples, including the analysis of U.S. GNP data, demonstrating the application of these concepts in real-world scenarios.

STA457: Time Series Analysis

Lecture 11

Lijia Wang

Department of Statistical Sciences


University of Toronto

Lijia Wang (UofT) STA457: Time Series Analysis 1 / 28


Overview

Last Time:
1 Forecasting
2 Estimation
Today:
1 Integrated ARMA (ARIMA) models
2 Building ARIMA models



Outline

1 Integrated Models for Nonstationary Data

2 Building ARIMA Models



Motivation 1

We consider the model

$$x_t = \mu_t + y_t,$$

where $\mu_t = \beta_0 + \beta_1 t$ and $y_t$ is stationary. Differencing such a process will
lead to a stationary process:

$$\nabla x_t = x_t - x_{t-1} = \beta_1 + y_t - y_{t-1} = \beta_1 + \nabla y_t.$$



Motivation 2

Another model that leads to first differencing is the case in which $\mu_t$ is
stochastic and slowly varying according to a random walk. That is,

$$\mu_t = \mu_{t-1} + \varepsilon_t,$$

where $\varepsilon_t$ is stationary. In this case,

$$\nabla x_t = \varepsilon_t + \nabla y_t$$

is stationary.
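
As a rough numerical check of the linear-trend case, the sketch below differences a simulated series and confirms that the differenced series fluctuates around the slope. The coefficient values, the sample size, and the choice of Gaussian white noise for the stationary part are illustrative assumptions, not values from the lecture.

```python
import random


def diff(x, lag=1):
    """Return the differenced series x[t] - x[t - lag]."""
    return [x[t] - x[t - lag] for t in range(lag, len(x))]


random.seed(0)
n = 200
beta0, beta1 = 2.0, 0.5  # assumed trend coefficients (illustrative only)

# Stationary part y_t: Gaussian white noise is the simplest stationary choice.
y = [random.gauss(0.0, 1.0) for _ in range(n)]
x = [beta0 + beta1 * t + y[t] for t in range(n)]

dx = diff(x)                 # equals beta1 + (y_t - y_{t-1})
mean_dx = sum(dx) / len(dx)  # should be close to beta1
print(round(mean_dx, 3))
```

The trend is gone after one difference, but note that the differenced noise $y_t - y_{t-1}$ is now correlated at lag 1.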



The Integrated ARMA, or ARIMA, Model

Definition: A process $x_t$ is said to be ARIMA(p, d, q) if

$$\nabla^d x_t = (1 - B)^d x_t$$

is ARMA(p, q). In general, we will write the model as

$$\phi(B)(1 - B)^d x_t = \theta(B) w_t.$$

If $E(\nabla^d x_t) = \mu$, we write the model as

$$\phi(B)(1 - B)^d x_t = \delta + \theta(B) w_t,$$

where $\delta = \mu(1 - \phi_1 - \phi_2 - \cdots - \phi_p)$.
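
To make the operator notation concrete, here are two small worked expansions. The choice of $d = 2$ and of an ARIMA(1, 1, 0) model is only for illustration.

```latex
% Second difference, d = 2:
\nabla^2 x_t = (1 - B)^2 x_t = (1 - 2B + B^2) x_t = x_t - 2x_{t-1} + x_{t-2}.

% ARIMA(1,1,0) with no constant: (1 - \phi B)(1 - B) x_t = w_t.
% Multiplying out the operators gives 1 - (1 + \phi) B + \phi B^2, so
x_t = (1 + \phi) x_{t-1} - \phi x_{t-2} + w_t.
```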



ARIMA forecasting

It should be clear that, since $y_t = \nabla^d x_t$ is ARMA, we can use the
previously introduced methods to obtain forecasts of $y_t$, which in turn lead
to forecasts for $x_t$.

For example, if $d = 1$, given forecasts $y^n_{n+m}$ for $m = 1, 2, \ldots$, we have
$y^n_{n+m} = x^n_{n+m} - x^n_{n+m-1}$, so that

$$x^n_{n+m} = y^n_{n+m} + x^n_{n+m-1},$$

with initial condition $x^n_{n+1} = y^n_{n+1} + x_n$.
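
The recursion for $d = 1$ amounts to a running sum started at the last observation. A minimal sketch follows; the function name and the numbers are made up for illustration, and the forecasts of $y_t$ are assumed to come from some previously fitted ARMA model.

```python
def integrate_forecasts(x_last, y_forecasts):
    """Convert forecasts of y_t = x_t - x_{t-1} into forecasts of x_t.

    Applies x^n_{n+m} = y^n_{n+m} + x^n_{n+m-1}, starting from
    x^n_{n+1} = y^n_{n+1} + x_n.
    """
    x_forecasts = []
    previous = x_last
    for y_hat in y_forecasts:
        previous = y_hat + previous
        x_forecasts.append(previous)
    return x_forecasts


x_n = 10.0                  # last observed value (made up)
y_hat = [0.5, 0.25, 0.125]  # forecasts of the differenced series (made up)
x_hat = integrate_forecasts(x_n, y_hat)
print(x_hat)
```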



Example: IMA(1,1) and EWMA model

The ARIMA(0,1,1), or IMA(1,1), model is of interest because many
economic time series can be successfully modeled this way. In addition, the
model leads to a frequently used forecasting method called exponentially
weighted moving averages (EWMA). We will write the model as

$$x_t = x_{t-1} + w_t - \lambda w_{t-1},$$

with $|\lambda| < 1$, for $t = 1, 2, \ldots$, and $x_0 = 0$. We could also include a drift
term in the formula.



Example: IMA(1,1) and EWMA model

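If we write

$$y_t = w_t - \lambda w_{t-1},$$

we may write the IMA(1,1) as $x_t = x_{t-1} + y_t$. Because $|\lambda| < 1$, $y_t$ has an
invertible representation, $y_t = -\sum_{j=1}^{\infty} \lambda^j y_{t-j} + w_t$, and substituting
$y_t = x_t - x_{t-1}$, we may write

$$x_t = \sum_{j=1}^{\infty} (1 - \lambda)\lambda^{j-1} x_{t-j} + w_t$$

as an approximation for large $t$ (put $x_t = 0$ for $t \le 0$). The weights
$(1 - \lambda)\lambda^{j-1}$ decay exponentially in $j$, which is why this is called the
exponentially weighted moving average (EWMA).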
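
Dropping the noise term in the weighted sum gives a one-step-ahead predictor, and that predictor satisfies the familiar EWMA updating rule: new prediction = $(1 - \lambda)$ × latest observation + $\lambda$ × previous prediction. The sketch below checks numerically that the recursion and the truncated weighted sum agree. The function names, the data, and the value of $\lambda$ are illustrative assumptions.

```python
def ewma_recursive(x, lam):
    """One-step-ahead EWMA prediction via the updating rule."""
    pred = 0.0  # corresponds to setting x_t = 0 for t <= 0
    for value in x:
        pred = (1.0 - lam) * value + lam * pred
    return pred


def ewma_weighted_sum(x, lam):
    """Same prediction via the sum of (1 - lam) * lam**(j-1) * x_{n+1-j}."""
    n = len(x)
    return sum((1.0 - lam) * lam ** (j - 1) * x[n - j] for j in range(1, n + 1))


data = [1.0, 2.0, 3.0, 4.0]  # made-up observations x_1, ..., x_4
lam = 0.5                    # made-up smoothing parameter
print(ewma_recursive(data, lam), ewma_weighted_sum(data, lam))
```

A smaller $\lambda$ puts more weight on the most recent observation, so the predictions track the data more closely.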





Steps for building ARIMA Models

There are a few basic steps to fitting ARIMA models to time series data.
These steps involve:
1 Plotting the data and interpreting the plot.
2 Possibly transforming the data (e.g., log, first difference).
3 Identifying the dependence orders of the model (p, d, q).
4 Parameter estimation.
5 Diagnostics of residuals (interpretation, normality assumptions, ACF
graphs).
6 Model choice.



Steps 1 and 2: Plotting the data and transforming

1. Plotting the Data: First, as with any data analysis, we should
construct a time plot of the data and inspect the graph for any
anomalies.
2. Transforming the Data: If the variability in the data grows with time,
it will be necessary to transform the data to stabilize the variance. In
such cases, the Box–Cox class of power transformations can be
employed.
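
For reference, the Box–Cox transformation maps a positive value $x$ to $(x^\lambda - 1)/\lambda$ when $\lambda \neq 0$ and to $\log x$ when $\lambda = 0$. A minimal sketch follows; the function name and the example values are illustrative, and how to choose $\lambda$ is not addressed here.

```python
import math


def box_cox(x, lam):
    """Box-Cox power transformation of a sequence of positive values."""
    if any(v <= 0 for v in x):
        raise ValueError("Box-Cox requires strictly positive data")
    if lam == 0:
        return [math.log(v) for v in x]          # limiting case as lam -> 0
    return [(v ** lam - 1.0) / lam for v in x]


print(box_cox([1.0, 4.0, 9.0], 0.5))   # square-root-type transform
print(box_cox([1.0, 10.0, 100.0], 0))  # log transform
```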



Step 3: Identifying the dependence orders (p, d, q)

3. Identifying the Dependence Orders of the Model: After suitably
transforming the data, the next step is to identify preliminary values
of the autoregressive order, p, the order of differencing, d, and the
moving average order, q.



Step 3.1: Identifying the differencing order d

3.1 Identifying the differencing order d:

The time plot and the ACF plot can help indicate whether
differencing is needed. A slow decay in the sample ACF is an indication
that differencing may be needed.
If the plots show that differencing is called for, then difference
the data once, d = 1, and inspect the time plot of $\nabla x_t$. If we see
that additional differencing is necessary, then try differencing again
and inspect a time plot of $\nabla^2 x_t$. We repeat the procedure until
the differenced series appears stationary.
Watch out for over-differencing: Be careful not to over-difference,
because this may introduce dependence where none exists. For
example, $x_t = w_t$ is serially uncorrelated, but $\nabla x_t = w_t - w_{t-1}$ is
MA(1).
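
The over-differencing warning can be seen numerically. For $\nabla x_t = w_t - w_{t-1}$ the theoretical lag-1 autocorrelation is $-1/2$, and the sketch below compares the sample ACF of simulated white noise before and after differencing. The seed, the sample size, and the helper name `sample_acf` are arbitrary choices for this illustration.

```python
import random


def sample_acf(x, max_lag):
    """Sample autocorrelations rho_hat(0), ..., rho_hat(max_lag)."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    gamma0 = sum(d * d for d in dev) / n
    return [
        sum(dev[t] * dev[t + h] for t in range(n - h)) / n / gamma0
        for h in range(max_lag + 1)
    ]


random.seed(1)
w = [random.gauss(0.0, 1.0) for _ in range(5000)]  # white noise
dw = [w[t] - w[t - 1] for t in range(1, len(w))]   # over-differenced series

acf_w = sample_acf(w, 2)    # lag-1 value should be near 0
acf_dw = sample_acf(dw, 2)  # lag-1 value should be near -0.5
print([round(r, 2) for r in acf_w], [round(r, 2) for r in acf_dw])
```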



Step 3.2: Identifying the AR order p and MA order q

3.2 Identifying the autoregressive order p and moving average order q:

When preliminary values of d have been settled, the next step is to
look at the sample ACF and PACF of $\nabla^d x_t$ for whatever values of d
have been chosen.
We use the plots to identify parameters p and q following the
properties of ACF and PACF introduced in previous lectures.



Step 3.2: Identifying the AR order p and MA order q

3.2 Identifying the autoregressive order p and moving average order q:


Note that it cannot be the case that both the ACF and PACF cut off.
Because we are dealing with estimates from real data, it will not always be
clear whether the sample ACF or PACF is tailing off or cutting off.
Also, two models that are seemingly different can actually be very
similar.
With this in mind, we should not worry about being too precise at this
stage of the model fitting. At this point, with a few preliminary values
of p, d, and q at hand, we can start estimating the parameters.



Step 4: Estimate model parameters

4. Estimate model parameters: We estimate model parameters using the
method of moments (MOM) and maximum likelihood estimation (MLE)
approaches that were introduced in previous lectures.
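
As a reminder of what a method-of-moments fit looks like in the simplest case, the sketch below applies the Yule–Walker equations to a simulated AR(1) series: $\hat\phi = \hat\gamma(1)/\hat\gamma(0)$ and $\hat\sigma_w^2 = \hat\gamma(0)(1 - \hat\phi^2)$. The true parameter value, the seed, and the sample size are assumptions made for the illustration; MLE would normally be done with a statistics package and is not shown.

```python
import random


def autocov(x, h):
    """Sample autocovariance at lag h (divisor n)."""
    n = len(x)
    mean = sum(x) / n
    return sum((x[t] - mean) * (x[t + h] - mean) for t in range(n - h)) / n


random.seed(3)
phi_true = 0.6  # assumed AR(1) coefficient for the simulation
n = 5000
x = [0.0] * n
for t in range(1, n):
    x[t] = phi_true * x[t - 1] + random.gauss(0.0, 1.0)

gamma0, gamma1 = autocov(x, 0), autocov(x, 1)
phi_hat = gamma1 / gamma0                   # Yule-Walker estimate of phi
sigma2_hat = gamma0 * (1.0 - phi_hat ** 2)  # implied white-noise variance
print(round(phi_hat, 3), round(sigma2_hat, 3))
```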



Step 5: Model Diagnostics

5. Diagnostics: This investigation focuses on the analysis of the
residuals. The diagnostic results provide a basis for model selection.

The standardized innovations or residuals can be computed by

$$e_t = \frac{x_t - \hat{x}_t^{t-1}}{\sqrt{\hat{P}_t^{t-1}}},$$

where $\hat{x}_t^{t-1}$ is the one-step-ahead prediction of $x_t$ based on the fitted
model and $\hat{P}_t^{t-1}$ is the estimated one-step-ahead error variance. If the
model fits well, the standardized residuals should behave as an iid
sequence with mean zero and variance one. A normal probability plot
or a Q-Q plot can help in identifying departures from normality.



Step 5: Model Diagnostics

A good check on the correlation structure of the residuals is to plot
$\hat{\rho}_e(h)$ versus $h$ along with the error bounds of $\pm 2/\sqrt{n}$.
The Ljung–Box–Pierce test can be used to assess whether the values
$\hat{\rho}_e(h)$ are jointly small in magnitude over lags $h = 1, \ldots, H$.
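
The Ljung–Box–Pierce statistic is $Q = n(n+2)\sum_{h=1}^{H} \hat\rho_e^2(h)/(n-h)$, which is compared with a chi-squared distribution with $H - p - q$ degrees of freedom. The sketch below computes $Q$ and the $\pm 2/\sqrt{n}$ bounds in plain Python. The helper names and the tiny input series are illustrative, and the p-value is left out because it needs a chi-squared CDF from a statistics library.

```python
import math


def sample_acf(x, max_lag):
    """Sample autocorrelations rho_hat(0), ..., rho_hat(max_lag)."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    gamma0 = sum(d * d for d in dev) / n
    return [
        sum(dev[t] * dev[t + h] for t in range(n - h)) / n / gamma0
        for h in range(max_lag + 1)
    ]


def ljung_box(x, H):
    """Ljung-Box-Pierce Q statistic using lags 1..H."""
    n = len(x)
    rho = sample_acf(x, H)
    return n * (n + 2) * sum(rho[h] ** 2 / (n - h) for h in range(1, H + 1))


residuals = [1.0, 2.0, 3.0, 4.0]         # toy input, not real residuals
bound = 2.0 / math.sqrt(len(residuals))  # the +/- 2/sqrt(n) error bound
print(ljung_box(residuals, 1), bound)
```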



Step 6: Model choice

6. Model choice: There may be multiple candidate models after Step
3. We choose the best one based on the diagnostic results.



Example: Analysis of US GNP Data

In this example, we consider the analysis of quarterly U.S. GNP from
1947(1) to 2002(3), n = 223 observations. The data are real U.S. gross
national product in billions of chained 1996 dollars and have been
seasonally adjusted. The data were obtained from the Federal Reserve
Bank of St. Louis (http://research.stlouisfed.org/).



Example: Step 1&2

Figure: Quarterly U.S. GNP from 1947(1) to 2002(3)



Example: Step 3

When reports of GNP and similar economic indicators are given, it is often
the growth rate (percent change), rather than the actual (or adjusted)
values, that is of interest. The growth rate, say $x_t = \nabla \log(y_t)$, is
plotted, and it appears to be a stable process.

Figure: U.S. GNP quarterly growth rate
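
The reason $\nabla \log(y_t)$ is read as a growth rate is that $\log(y_t/y_{t-1}) \approx (y_t - y_{t-1})/y_{t-1}$ when the change is small. A quick sketch with made-up numbers (not the GNP data):

```python
import math


def growth_rate(y):
    """Differenced logs: log(y_t) - log(y_{t-1})."""
    return [math.log(y[t]) - math.log(y[t - 1]) for t in range(1, len(y))]


levels = [100.0, 101.0, 102.01]  # made-up series growing 1% per period
log_diff = growth_rate(levels)
pct_change = [(levels[t] - levels[t - 1]) / levels[t - 1] for t in range(1, len(levels))]
print(log_diff, pct_change)
```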



Example: Step 3

Figure: Sample ACF and PACF of the GNP quarterly growth rate



Example: Step 3

Inspecting the sample ACF and PACF, we might feel that two models are
suitable for the data:
1 The ACF is cutting off at lag 2 and the PACF is tailing off. This
would suggest the GNP growth rate follows an MA(2) process, or log
GNP follows an ARIMA(0,1,2) model.
2 The ACF is tailing off and the PACF is cutting off at lag 1. This
suggests an AR(1) model for the growth rate, or ARIMA(1,1,0) for
log GNP.
Rather than focus on one model, we will fit both models.



Example: Step 4

Using MLE to fit the models:

1 For the MA(2) model:

$$\hat{x}_t = .008_{(.001)} + .303_{(.065)} \hat{w}_{t-1} + .204_{(.064)} \hat{w}_{t-2} + \hat{w}_t, \quad \text{with } \hat{\sigma}_w = .0094.$$

2 For the AR(1) model:

$$\hat{x}_t = .008_{(.001)} (1 - .347) + .347_{(.063)} \hat{x}_{t-1} + \hat{w}_t, \quad \text{with } \hat{\sigma}_w = .0095.$$

The values in parentheses are the corresponding estimated standard errors.
All of the regression coefficients are significant, including the constant.
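
The significance claim can be checked by dividing each estimate by its standard error and comparing with the usual approximate 5% cutoff of 1.96 (the cutoff is an assumption about the intended test). The same sketch also lists the first few $\psi$-weights $\phi^j$ of the fitted AR(1), which gives a rough sense of how its implied moving-average form compares with the fitted MA(2) coefficients.

```python
# (estimate, standard error) pairs copied from the fitted models above
estimates = {
    "MA(2) constant": (0.008, 0.001),
    "MA(2) theta1": (0.303, 0.065),
    "MA(2) theta2": (0.204, 0.064),
    "AR(1) constant": (0.008, 0.001),
    "AR(1) phi": (0.347, 0.063),
}

ratios = {name: est / se for name, (est, se) in estimates.items()}
significant = {name: abs(r) > 1.96 for name, r in ratios.items()}

# Causal representation of the AR(1): x_t = sum_j phi**j * w_{t-j} (plus mean)
phi = 0.347
psi = [phi ** j for j in range(1, 4)]

for name, r in ratios.items():
    print(f"{name}: ratio = {r:.2f}, significant = {significant[name]}")
print("AR(1) psi-weights:", [round(p, 3) for p in psi])
```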



Example: Step 5

We take the MA(2) fit as an example:

Figure: Residual plots of the GNP quarterly growth rate



Example: Step 5

Performing the Ljung-Box Test

The figure shows the p-values associated with the Ljung-Box Q-statistic,
at lags H = 3 through H = 20 (with corresponding degrees of freedom
$H - 2$, since the MA(2) fit has p + q = 2 parameters).



STA457: Time Series Analysis
Lecture 12

Lijia Wang

Department of Statistical Sciences


University of Toronto



Overview

Last Time:
1 Integrated ARMA (ARIMA) models
2 Building ARIMA models
Today:
1 Regression with Auto-correlated Errors
2 Multiplicative Seasonal ARIMA Models



Outline

1 Regression with Auto-correlated Errors

2 Multiplicative Seasonal ARIMA Models



Regression with Auto-correlated Errors: Introduction

Definition: We consider the regression model with correlated errors $x_t$,

$$y_t = \sum_{j=1}^{r} \beta_j z_{tj} + x_t,$$

where $x_t$ is a process with some covariance function $\gamma_x(s, t)$. Such a
model is called Regression with Auto-correlated Errors.



Procedure to identify a model

1 First, run an ordinary regression of $y_t$ on $z_{t1}, \ldots, z_{tr}$ (acting as if the
errors are uncorrelated). Retain the residuals,

$$\hat{x}_t = y_t - \sum_{j=1}^{r} \hat{\beta}_j z_{tj}.$$

2 Identify ARMA model(s) for the residuals $\hat{x}_t$.
3 Run weighted least squares (or MLE) on the regression model with
autocorrelated errors using the model specified in step 2.
4 Inspect the residuals $\hat{w}_t$ for whiteness, and adjust the model if
necessary.
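
The four steps can be sketched on simulated data with one regressor and AR(1) errors. This is only an illustration under stated assumptions: the true parameter values and seed are made up, the AR(1) order in step 2 is taken as known rather than identified from plots, and step 3 is carried out by ordinary least squares on the transformed variables (a Cochrane–Orcutt style stand-in for the weighted least squares or MLE named in the procedure).

```python
import random


def ols(z, y):
    """Least-squares intercept and slope for y = a + b * z."""
    n = len(y)
    zbar, ybar = sum(z) / n, sum(y) / n
    b = sum((zi - zbar) * (yi - ybar) for zi, yi in zip(z, y)) / sum(
        (zi - zbar) ** 2 for zi in z
    )
    return ybar - b * zbar, b


def lag1_corr(v):
    """Lag-1 sample autocorrelation."""
    n = len(v)
    m = sum(v) / n
    d = [u - m for u in v]
    return sum(d[t] * d[t - 1] for t in range(1, n)) / sum(u * u for u in d)


random.seed(2)
n, phi_true, a_true, b_true = 2000, 0.7, 1.0, 2.0  # assumed values
z = [random.gauss(0.0, 1.0) for _ in range(n)]
x = [0.0] * n                                      # AR(1) error process
for t in range(1, n):
    x[t] = phi_true * x[t - 1] + random.gauss(0.0, 1.0)
y = [a_true + b_true * z[t] + x[t] for t in range(n)]

# Step 1: ordinary regression, keep the residuals.
a0, b0 = ols(z, y)
resid = [y[t] - a0 - b0 * z[t] for t in range(n)]

# Step 2: AR(1) assumed for the residuals; estimate phi by their lag-1 ACF.
phi_hat = lag1_corr(resid)

# Step 3: regress the transformed response on the transformed regressor.
y_star = [y[t] - phi_hat * y[t - 1] for t in range(1, n)]
z_star = [z[t] - phi_hat * z[t - 1] for t in range(1, n)]
a_star, b1 = ols(z_star, y_star)

# Step 4: the new residuals should look like white noise.
w_hat = [y_star[i] - a_star - b1 * z_star[i] for i in range(n - 1)]
print(round(phi_hat, 3), round(b1, 3), round(lag1_corr(w_hat), 3))
```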



If the error is AR(p)

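If the error term has an AR(p) representation,

$$\phi(B) x_t = w_t,$$

then multiplying the regression equation through by the transformation $\phi(B)$
yields

$$\phi(B) y_t = \sum_{j=1}^{r} \beta_j \phi(B) z_{tj} + \phi(B) x_t.$$

Since $\phi(B) x_t = w_t$ is white noise, the transformed regression has
uncorrelated errors.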
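
For instance, in the simplest case $p = 1$ (chosen here only for illustration), the transformation is $\phi(B) = 1 - \phi B$ and the transformed regression reads:

```latex
% AR(1) errors: x_t = \phi x_{t-1} + w_t, so \phi(B) = 1 - \phi B.
y_t - \phi y_{t-1}
  = \sum_{j=1}^{r} \beta_j \left( z_{tj} - \phi z_{t-1,j} \right) + w_t ,
  \qquad t = 2, \ldots, n .
```

The coefficients $\beta_j$ are unchanged by the transformation; only the response and the regressors are filtered.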



Example: Mortality, Temperature and Pollution

We consider the following analyses relating mean adjusted temperature $T_t$
and particulate levels $P_t$ to cardiovascular mortality $M_t$. We consider the
regression model

$$M_t = \beta_1 + \beta_2 t + \beta_3 T_t + \beta_4 T_t^2 + \beta_5 P_t + x_t,$$

where, for now, we assume that $x_t$ is white noise.



Sample ACF and PACF of the residuals

Figure: Sample ACF and PACF of the mortality residuals indicating an AR(2)
process.


Fit the correlated error model

Our next step is to fit the correlated error model, where $x_t$ is now AR(2),

$$x_t = \phi_1 x_{t-1} + \phi_2 x_{t-2} + w_t,$$

and $w_t$ is white noise.





Multiplicative Seasonal ARIMA Models

In this section, we introduce several modifications made to the ARIMA
model to account for seasonal and nonstationary behavior. Often, the
dependence on the past tends to occur most strongly at multiples of some
underlying seasonal lag s.



ARMA(P, Q)_s

The pure seasonal autoregressive moving average model, say
ARMA(P, Q)_s, can be written as

$$\Phi_P(B^s) x_t = \Theta_Q(B^s) w_t,$$

where the operators

$$\Phi_P(B^s) = 1 - \Phi_1 B^s - \Phi_2 B^{2s} - \cdots - \Phi_P B^{Ps}$$

and

$$\Theta_Q(B^s) = 1 + \Theta_1 B^s + \Theta_2 B^{2s} + \cdots + \Theta_Q B^{Qs}$$

are the seasonal autoregressive operator and the seasonal moving
average operator of orders P and Q, respectively, with seasonal period s.



Properties of the ARMA(P, Q)_s

Analogous to the properties of nonseasonal ARMA models, the pure
seasonal ARMA(P, Q)_s is causal only when the roots of $\Phi_P(z^s)$ lie
outside the unit circle, and it is invertible only when the roots of $\Theta_Q(z^s)$
lie outside the unit circle.



First-order seasonal AR

A first-order seasonal autoregressive series that might run over months could
be written as
$$(1 - \Phi B^{12}) x_t = w_t$$
or
$$x_t = \Phi x_{t-12} + w_t.$$

This model exhibits the series $x_t$ in terms of past lags at multiples
of the yearly seasonal period s = 12 months.
It is clear from the above form that estimation and forecasting for
such a process involve only straightforward modifications of the unit
lag case already treated.
In particular, the causal condition requires $|\Phi| < 1$.



First-order seasonal MA

A first-order seasonal moving average series could be written as

$$x_t = w_t + \Theta w_{t-12}.$$

We can verify that the autocovariance function is

$$\gamma(0) = (1 + \Theta^2)\sigma^2, \qquad \gamma(\pm 12) = \Theta \sigma^2, \qquad \gamma(h) = 0 \ \text{otherwise}.$$
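
The verification is a short calculation using only the fact that white noise has variance $\sigma^2$ and is uncorrelated across time:

```latex
\gamma(0)  = \operatorname{var}(w_t + \Theta w_{t-12})
           = \sigma^2 + \Theta^2 \sigma^2 = (1 + \Theta^2)\sigma^2 ,
\\
\gamma(12) = \operatorname{cov}(w_{t+12} + \Theta w_{t},\; w_t + \Theta w_{t-12})
           = \Theta \operatorname{var}(w_t) = \Theta \sigma^2 ,
\\
\rho(\pm 12) = \frac{\gamma(12)}{\gamma(0)} = \frac{\Theta}{1 + \Theta^2} .
```

At any other nonzero lag the two expressions share no common noise term, so the covariance is zero.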



Behavior of the ACF and PACF for pure SARMA models

Table: Behavior of the ACF and PACF for Pure SARMA Models

       AR(P)_s                   MA(Q)_s                   ARMA(P, Q)_s
ACF    Tails off at lags ks,     Cuts off after lag Qs     Tails off at lags ks
       k = 1, 2, ...
PACF   Cuts off after lag Ps     Tails off at lags ks,     Tails off at lags ks
                                 k = 1, 2, ...

Note that the values of the ACF and PACF at nonseasonal lags $h \neq ks$, for
k = 1, 2, ..., are zero.



ARMA(p, q) × (P, Q)_s

In general, we can combine the seasonal and nonseasonal operators into a
multiplicative seasonal autoregressive moving average model, denoted by

ARMA(p, q) × (P, Q)_s,

and write
$$\Phi_P(B^s) \phi(B) x_t = \Theta_Q(B^s) \theta(B) w_t.$$



Example

Consider an ARMA(0, 1) × (1, 0)_12 model

$$x_t = \Phi x_{t-12} + w_t + \theta w_{t-1},$$

where $|\Phi| < 1$ and $|\theta| < 1$. Find the ACF of $x_t$.
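
One way to check an answer to this exercise is numerically. Writing the model in causal form gives $\psi$-weights $\psi_{12k} = \Phi^k$ and $\psi_{12k+1} = \theta\Phi^k$ (all others zero), and $\gamma(h) = \sigma^2 \sum_j \psi_j \psi_{j+h}$. The sketch below compares the resulting ACF with the closed form $\rho(12h) = \Phi^h$ and $\rho(12h - 1) = \rho(12h + 1) = \frac{\theta}{1+\theta^2}\Phi^h$, with zero elsewhere. The parameter values and the truncation length are arbitrary choices for the illustration.

```python
def psi_weights(Phi, theta, n_seasons):
    """Truncated psi-weights of x_t = Phi*x_{t-12} + w_t + theta*w_{t-1}."""
    psi = [0.0] * (12 * n_seasons + 2)
    for k in range(n_seasons + 1):
        psi[12 * k] = Phi ** k              # weight on w_{t-12k}
        psi[12 * k + 1] = theta * Phi ** k  # weight on w_{t-12k-1}
    return psi


def acf_from_psi(psi, max_lag):
    """ACF implied by psi-weights: gamma(h) is proportional to sum psi_j psi_{j+h}."""
    gamma = [
        sum(psi[j] * psi[j + h] for j in range(len(psi) - h))
        for h in range(max_lag + 1)
    ]
    return [g / gamma[0] for g in gamma]


Phi, theta = 0.8, 0.5  # illustrative values with |Phi| < 1 and |theta| < 1
rho = acf_from_psi(psi_weights(Phi, theta, 200), 25)
c = theta / (1.0 + theta ** 2)

for h in (1, 2, 11, 12, 13, 24):
    print(h, round(rho[h], 4))
```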



SARIMA model

Definition: The multiplicative seasonal autoregressive integrated moving
average model, or SARIMA model, is given by

$$\Phi_P(B^s) \phi(B) \nabla_s^D \nabla^d x_t = \delta + \Theta_Q(B^s) \theta(B) w_t,$$

where $w_t$ is the usual Gaussian white noise process. The general model is
denoted as ARIMA(p, d, q) × (P, D, Q)_s.
The ordinary AR and MA components are represented by polynomials
$\phi(B)$ and $\theta(B)$ of orders p and q, respectively.
The seasonal AR and MA components are represented by $\Phi_P(B^s)$
and $\Theta_Q(B^s)$ of orders P and Q, respectively.
The ordinary and seasonal difference components are represented by
$\nabla^d = (1 - B)^d$ and $\nabla_s^D = (1 - B^s)^D$.



STA457: Time Series Analysis
Lecture 13

Lijia Wang

Department of Statistical Sciences


University of Toronto



Overview

Last Time:
1 Regression with Auto-correlated Errors
2 Multiplicative Seasonal ARIMA Models
Today:
1 An Example of SARIMA model
2 R code



Outline

1 An Example of SARIMA model

2 R code



Introduction

We consider the R data set AirPassengers, which are the monthly totals of
international airline passengers, 1949 to 1960, taken from Box & Jenkins
(1970).



Data transformation

1 Note that x is the original series, which shows trend plus increasing
variance.
2 The logged data are in lx, and the transformation stabilizes the
variance.
3 The logged data are then differenced to remove trend, and are stored
in dlx.
4 It is clear that there is still persistence in the seasons, so a
twelfth-order (lag-12 seasonal) difference is applied and stored in ddlx.
5 The transformed data appear to be stationary and we are now ready
to fit a model.
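
The sequence of transformations can be sketched as follows. The variable names x, lx, dlx, and ddlx follow the slides, but the series itself is a synthetic stand-in (exponential trend times a fixed seasonal pattern), not the AirPassengers data, and the helper `diff` is a hypothetical name for this illustration.

```python
import math


def diff(series, lag=1):
    """Lag-`lag` difference: series[t] - series[t - lag]."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


n = 144  # 12 years of monthly observations
# Synthetic stand-in: growth trend multiplied by a repeating seasonal factor.
x = [
    100.0 * math.exp(0.01 * t) * (1.0 + 0.1 * math.sin(2.0 * math.pi * t / 12.0))
    for t in range(n)
]

lx = [math.log(v) for v in x]  # log stabilizes the growing variability
dlx = diff(lx)                 # first difference removes the trend
ddlx = diff(dlx, 12)           # lag-12 difference removes the seasonal pattern

print(len(lx), len(dlx), len(ddlx))
print(max(abs(v) for v in ddlx))  # essentially zero for this noise-free series
```

Because the synthetic series has no noise, ddlx is zero up to rounding; with real data the remaining variation is what the ARMA part of the model describes.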



Trajectory plots

Figure: The monthly totals of international airline passengers x, and the
transformed data: lx = $\log x_t$, dlx = $\nabla \log x_t$ and ddlx = $\nabla_{12} \nabla \log x_t$.


ACF and PACF plots

Figure: Sample ACF and PACF of ddlx


Model specification

Seasonal Component: It appears that at the seasons, the ACF is cutting
off at lag 1s (s = 12), whereas the PACF is tailing off at lags 1s, 2s, 3s, 4s,
.... These results imply an SMA(1), P = 0, Q = 1, in the season
(s = 12).

Non-Seasonal Component: Inspecting the sample ACF and PACF at the
lower lags, it appears as though both are tailing off. This suggests an
ARMA(1, 1) within the seasons, p = q = 1.



Model selection
We first try an ARIMA(1, 1, 1) × (0, 1, 1)_12 model.

Figure: Results of ARIMA(1, 1, 1) × (0, 1, 1)_12

The AR parameter is not significant, so we should try dropping one
parameter from the within-seasons part.
In this case, we try both an ARIMA(0, 1, 1) × (0, 1, 1)_12 and an
ARIMA(1, 1, 0) × (0, 1, 1)_12 model.


Model selection

Figure: Results of ARIMA(0, 1, 1) × (0, 1, 1)_12 and
ARIMA(1, 1, 0) × (0, 1, 1)_12
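
To see what the first of these two candidates says in difference-equation form, the operators can be multiplied out. Here $x_t$ is assumed to denote the logged series, no constant is included, and $\theta$ and $\Theta$ are the nonseasonal and seasonal MA parameters.

```latex
% ARIMA(0,1,1) x (0,1,1)_{12}:
(1 - B)(1 - B^{12}) x_t = (1 + \theta B)(1 + \Theta B^{12}) w_t
\\
% Expanding both sides:
x_t = x_{t-1} + x_{t-12} - x_{t-13}
      + w_t + \theta w_{t-1} + \Theta w_{t-12} + \theta\Theta w_{t-13} .
```

The multiplicative structure shows up in the coefficient of $w_{t-13}$, which is the product $\theta\Theta$ rather than a free parameter.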



Model diagnostics

Figure: Residual analysis for the ARIMA(0, 1, 1) × (0, 1, 1)_12 fit to the
logged air passengers data set.


Forecasting

Figure: Twelve month forecast using the ARIMA(0, 1, 1) × (0, 1, 1)_12 model
on the logged air passenger data set.
