Study Guide 2022
SCIENCES
SSTA031
STUDY GUIDE
(2022)
INTRODUCTION
A time series is a collection of observations made sequentially over time. Examples occur
in a variety of fields, ranging from economics to engineering. The module will introduce a
variety of examples of time series from diverse areas of application. Studying models that
incorporate dependence is the key concept in time series analysis.
LECTURER INFORMATION:
STUDY COMPONENTS
A. PURPOSE
To predict or forecast future values of a series based on the history of that series.
B. CHAPTERS
1. INTRODUCTION TO TIME SERIES
2. THE MODEL BUILDING STRATEGY
2.2 Introduction
2.5 Filtering
3. MODELS FOR STATIONARY TIME SERIES
3.2 Introduction
4. MODELS FOR NON-STATIONARY SERIES
5. PARAMETER ESTIMATION
5.2 Introduction
6. MODEL DIAGNOSTICS
6.2 Introduction
6.4 Overfitting
7. FORECASTING
C. ACTIVITIES
You have to attend 3 lectures per week. The times are as follows:
A student is admitted to the final exam based on a module mark of at least 40%.
CALCULATION OF MARKS
Your performance in this module is assessed through two written quizzes, tests and assignments. Each quiz is written a week before the corresponding test.
D. COURSE PLAN:
This module has a duration of 15 weeks and a tentative course plan is given hereunder.
SCHEDULE OF LECTURES
IMPORTANT DATES
?? ?? 1,2,3 QUIZ 1
?? ?? 1,2,3 TEST 1
?? ?? 4,5,6 QUIZ 2
?? ?? 4,5,6,7 TEST 2
?? ?? 4,5,6,7 ASSIGNMENT 1
Take note of the main aspects (like test and quiz dates) indicated in this study guide, and also read the headings and sub-headings of the relevant material in the textbook or other study materials mentioned in this study guide.
You are expected to purchase at least one of the following recommended textbooks:
RECOMMENDED TEXTS
Chatfield, C. (2004). The Analysis of Time Series: An Introduction, Sixth Edition, Chapman & Hall/CRC.
Cryer, J. D. and Chan, K.-S. (2008). Time Series Analysis with Applications in R, 2nd Edition, Springer.
Wei, W. W. S. (2006). Time Series Analysis: Univariate and Multivariate Methods, Second Edition, Pearson Addison Wesley.
SUPPLEMENTARY TEXTS
Brockwell, P. J. and Davis, R. A. (2002). Introduction to Time Series and Forecasting, Second Edition, Springer.
LECTURE NOTES
A copy of the lecture notes will be available on Blackboard. The notes do not include proofs of stationarity/invertibility, so students are advised to take additional notes during lectures.
EXAMPLES
As already noted, there are examples in every chapter to work on. It is very
important that you attempt them. Your understanding of the material you have
studied in the chapter will be greatly improved if you do the examples yourself.
ASSIGNMENTS
You will be given two assignments; make sure that you hand in each assignment on or before the due date. All work submitted by each student must be an authentic product of his or her own efforts and must not be copied from another student.
CHAPTER 1
INTRODUCTION TO TIME SERIES
1.2 Introduction
What is a time series? A time series may be defined as a set of observations of a random variable arranged in chronological (time) order. We can also say it is a series of observations recorded sequentially at equally spaced intervals of time.
The aim of time series analysis is "to identify any recurring patterns which could be useful in estimating future values of the time series". Time series analysis assumes that the actual values of a random variable in a time series are influenced by a variety of environmental forces operating over time.
Time series analysis attempts to isolate and quantify the influence of these different
environmental forces operating on the time series into a number of different components.
This is achieved through a process known as decomposition of the time series.
Once identified and quantified, these components are used to estimate future values of the
time series. An important assumption in time series analysis is the continuation of past
patterns into the future (i.e. the environment in which the time series occurs is stable.)
Notation: the observed series is denoted $x_t$, the value of the variable of interest $x$ at time $t$.
The most important step in time series analysis is to plot the observations against time. This
graph should show up important features of the series such as a trend, seasonality, outliers
and discontinuities. The plot is vital, both to describe the data and to help in formulating a
sensible model. This is basically a plot of the response or variable of interest $x$ against time $t$, denoted $x_t$.
Plot the series and examine the main features: This is usually done with the aid of
some computer package e.g. SPSS, SAS, etc.
One way to examine a time series is to break it into components. A standard approach is to find components corresponding to a long-term trend, any cyclic behavior, seasonal behavior and a residual, irregular part.
Trend (T): a smooth or regular underlying movement of a series over a fairly long period of time; a gradual and consistent pattern of change.
Seasonal variation (S): movement in a time series which recurs year after year in certain months or quarters with more or less the same intensity.
Cyclical variation (C): periodic variations extending over a long period of time, caused by different factors such as cycles of recession, depression, recovery, etc.
Irregular variation (I): variations caused by readily identifiable special events such as elections, wars, floods, earthquakes, strikes, etc.
The main aim of time series analysis here is to isolate the influence of each of the four components on the actual series. The multiplicative time series model is used to analyse the influence of each of these four components. The multiplicative model is based on the idea that the actual values of a time series $x_t$ can be found by multiplying the trend component $T$ by the cyclical component $C$, the seasonal index $S$ and the irregular component $I$. Thus, the multiplicative time series model is defined as: $x_t = T \times C \times S \times I$.
Smoothing methods are used in attempting to get rid of the irregular, random component of
the series.
A moving average (ma) of order M is produced by calculating the average value of a variable
over a set of M values of the series.
A running median of order M is produced by calculating the median value of a variable over
a set of M values of the series.
Exponential smoothing produces the updated forecast $\hat{x}_{t+1} = \alpha x_t + (1-\alpha)\hat{x}_t$, where $\alpha$ is the smoothing constant.
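As a rough illustration (not part of the prescribed material), the recursion above, read here as the simple exponential smoothing update, can be coded directly; the series values and the smoothing constant alpha = 0.3 below are arbitrary choices.

```python
import numpy as np

def exp_smooth(x, alpha=0.3):
    """One-step-ahead forecasts from x_hat[t+1] = alpha*x[t] + (1 - alpha)*x_hat[t]."""
    xhat = np.empty(len(x) + 1)
    xhat[0] = x[0]                      # initialise with the first observation
    for t in range(len(x)):
        xhat[t + 1] = alpha * x[t] + (1 - alpha) * xhat[t]
    return xhat[1:]

x = np.array([47.6, 48.9, 51.5, 55.3, 57.9])   # illustrative values only
print(exp_smooth(x).round(2))
```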
The trend in a time series can be identified by averaging out the short term fluctuations in
the series. This will result in either a smooth curve or a straight line.
The three-year moving total for an observation x would be the sum of the observation immediately before x, x itself and the observation immediately after x. The three-year moving average would be each of these moving totals divided by 3.
The five-year moving total for an observation x would be the sum of the two observations immediately before x, x itself and the two observations immediately after x. The five-year moving average would be each of these moving totals divided by 5.
Example: 1.1
Australia’s official development assistance (ODA) from 1984-85 until 1992-93 is shown (at
current prices, $ million) in Table 1.
Table 1
Year       ODA ($ million)
1984-85    1011
1985-86    1031
1986-87     976
1987-88    1020
1988-89    1195
1989-90    1174
1990-91    1261
1991-92    1330
1992-93    1384
(a) Find the three-year moving averages to obtain the trend of the data.
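One possible way to compute the three-year moving averages for Example 1.1 (a sketch using pandas; the module itself does not prescribe any software):

```python
import pandas as pd

oda = pd.Series(
    [1011, 1031, 976, 1020, 1195, 1174, 1261, 1330, 1384],
    index=["1984-85", "1985-86", "1986-87", "1987-88", "1988-89",
           "1989-90", "1990-91", "1991-92", "1992-93"],
)

# Three-year moving totals centred on the middle year, then divide by 3.
moving_total = oda.rolling(window=3, center=True).sum()
moving_average = moving_total / 3
print(moving_average.round(1))
```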
A trend line isolates the trend (T) component only. It shows the general direction in which
the series is moving and is therefore best represented by a straight line. The method of least
squares from regression analysis is used to determine the trend line of best fit.
The regression (trend) line is defined by $y = b_0 + b_1 x$. If the values of the time variable $x$ are not given, they must be coded (e.g. $x = 1, 2, 3, \ldots$).
Example 1.2
Table 2
Y:  2  6  1  5  3  7  2
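A minimal sketch of fitting the least-squares trend line to the Table 2 values, assuming the time points are coded as t = 1, 2, ..., 7 (the coding is not shown in the guide):

```python
import numpy as np

t = np.arange(1, 8)                    # coded time points (an assumption)
y = np.array([2, 6, 1, 5, 3, 7, 2])    # Table 2 values

# np.polyfit returns the coefficients of the least-squares line, highest power first.
b1, b0 = np.polyfit(t, y, deg=1)
print(f"trend line: y = {b0:.3f} + {b1:.3f} t")
```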
Seasonal analysis isolates the influence of seasonal forces on a time series. The ratio-to-
moving average method is used to measure these influences. The seasonal influence is
expressed as an index number.
Step 1: Compute the centred moving average of the series to isolate the trend-cycle component.
Step 2: Divide each actual value by the corresponding moving average to obtain a seasonal ratio.
Step 3: Average the seasonal ratios for corresponding periods across the years.
Step 4: Adjust the averaged ratios so that they sum to the number of periods in a year, giving the seasonal indices.
Example: 1. 3
The average daily sales (in litres) of milk at a country store are shown in Table 3 for each of
the years 1983 to 1985.
Table 3
Year   Quarter   Average daily sales (litres)
1983   1         47.6
1983   2         48.9
1983   3         51.5
1983   4         55.3
1984   1         57.9
1984   2         61.7
1984   3         65.3
1984   4         70.2
1985   1         76.1
1985   2         84.7
1985   3         93.2
1985   4         97.2
(b) Calculate the seasonal index by making use of the ratio-to-moving-average method.
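One possible implementation of the ratio-to-moving-average calculation for the quarterly data in Table 3, using a centred four-quarter moving average (a sketch only, not the module's official solution):

```python
import numpy as np
import pandas as pd

sales = np.array([47.6, 48.9, 51.5, 55.3, 57.9, 61.7, 65.3, 70.2,
                  76.1, 84.7, 93.2, 97.2])
quarters = np.tile([1, 2, 3, 4], 3)
n = len(sales)

# Step 1: centred four-quarter moving average (average of two successive 4-term means).
cma = np.full(n, np.nan)
for t in range(2, n - 2):
    cma[t] = (sales[t - 2:t + 2].mean() + sales[t - 1:t + 3].mean()) / 2

# Step 2: seasonal ratios, actual / moving average.
ratios = sales / cma

# Step 3: average the ratios for each quarter across the years.
seasonal = pd.DataFrame({"quarter": quarters, "ratio": ratios}).groupby("quarter")["ratio"].mean()

# Step 4: scale so that the four indices average to 100.
index = 100 * seasonal / seasonal.mean()
print(index.round(1))
```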
CHAPTER 2
THE MODEL BUILDING STRATEGY
2.2 Introduction
Perhaps the most important question we ask now is "how do we decide on the model to use?" Finding appropriate models for time series is not an easy task. We will follow the model building strategy developed by Box and Jenkins (1976).
There are three main steps in the Box-Jenkins procedure, each of which may be used
several times:
Model specification
Model fitting.
Model diagnostics.
In model specification (or identification) we select classes of time series that may be
appropriate for a given observed series. In this step we look at the time plot of the series,
compute many different statistics from the data, and also apply knowledge from the subject
area in which the data arise, such as economics, physics, chemistry, or biology. The model
chosen at this point is tentative and may be revised later in the analysis. In the process of
model selection we shall try to adhere to the principle of parsimony.
Definition 2.3.1 (The principle of parsimony): The model used should require the smallest
possible number of parameters that will adequately represent the data.
For the data set to be a purely random sequence (white noise), the sample autocorrelations must satisfy $|\hat{\rho}_k| \le \dfrac{2}{\sqrt{n}}$ (i.e. the sample ACF must lie within the boundaries $\pm 2/\sqrt{n}$).
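The $\pm 2/\sqrt{n}$ check is easy to automate; a minimal sketch on simulated white noise (the Table 4 values themselves are not reproduced in this guide):

```python
import numpy as np

rng = np.random.default_rng(0)
z = rng.normal(size=200)               # simulated series, n = 200
n, zbar = len(z), z.mean()

# Sample ACF at lags 1-5.
acf = np.array([np.sum((z[:n - k] - zbar) * (z[k:] - zbar)) for k in range(1, 6)])
acf = acf / np.sum((z - zbar) ** 2)

bound = 2 / np.sqrt(n)
print("sample ACF:", acf.round(3))
print("all within +/- 2/sqrt(n)?", bool(np.all(np.abs(acf) < bound)))
```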
Example 2.1
200 observations on a stationary series were analyzed and gave the following sample autocorrelations:
Table 4
k 1 2 3 4 5
Model fitting consists of finding the best possible estimates of the unknown parameters within a given model. After we have identified the model and estimated the unknown parameters, we need to check whether the model is a good model; this is done through diagnostic checking.
Here we are concerned with analyzing the quality of the model that we have specified and
estimated. We ask the following questions to guide us:
2.4.1 Transformations
If the process is not stationary, the series must be transformed to make it stationary before it is analyzed. There are various transformations that we can use to make a time series stationary. Some of them are:
Differencing.
Log transformation.
Arcsine transformation.
Power transformation.
Series that are not stationary, when subjected to differencing, often yield processes that are stationary. If we difference a time series once we write $\nabla x_t = x_t - x_{t-1}$, and we use the operator $\nabla$ to denote the difference operation. In some instances, differencing once may not yield a stationary process, and the series must be differenced again.
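A quick sketch of first and second differencing with numpy, using the ODA figures from Table 1 purely as example data:

```python
import numpy as np

x = np.array([1011, 1031, 976, 1020, 1195, 1174, 1261, 1330, 1384], dtype=float)

d1 = np.diff(x)        # first differences  x_t - x_{t-1}
d2 = np.diff(x, n=2)   # second differences, used when differencing once is not enough
print(d1)
print(d2)
```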
2.5.1 Filtering
Definition 2.5.1: A linear filter is a linear transformation or operator which converts one time series $x_t$ into another series $y_t$ through the linear operation $y_t = \sum_{r} a_r x_{t-r}$, where the $a_r$ are the filter weights.
Example: 2.2
(a) Consider the following two filters: $A = \left(\tfrac{1}{2}, \tfrac{1}{2}\right)$ and $B = \left(\tfrac{1}{2}, \tfrac{1}{2}\right)$. Compute $A * B$, where $*$ denotes the convolution operator.
(b) Consider the following two filters: $A = (1, 1)$ and $B = (1, 1)$. Compute $A * B$, where $*$ denotes the convolution operator.
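Taking the filters as written above, the convolutions in Example 2.2 can be checked with numpy.convolve (a sketch only):

```python
import numpy as np

# (a) A = (1/2, 1/2) and B = (1/2, 1/2)
print(np.convolve([0.5, 0.5], [0.5, 0.5]))   # [0.25 0.5  0.25]

# (b) A = (1, 1) and B = (1, 1)
print(np.convolve([1, 1], [1, 1]))           # [1 2 1]
```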
Definition 2.6.1: A time series is a sample path or realization of a stochastic (random) process $\{x_t, t \in T\}$, where $T$ is an ordered set.
Definition 2.6.2: Let $\{x_t, t \in T\}$ be a stochastic process. The set of all possible realizations or sample paths of the process is called the ensemble for the process $x_t$.
Remarks: In the time series literature, the terms "time series" and "process" are often used interchangeably.
CHAPTER 3
MODELS FOR STATIONARY TIME SERIES
Define strictly stationary, weakly stationary, purely random, IID noise and random walk processes.
Identify the properties of AR and MA processes.
Express and manipulate time series models using the backshift operator (B-notation).
Calculate the ACF and PACF for both AR and MA processes.
Classify a model as an ARMA/ARIMA process.
Determine whether a process is stationary, invertible, or both.
3.2 Introduction
In order to be able to analyze or make meaningful inference about the data generating
process it is necessary to make some simplifying and yet reasonable assumption about the
process. A characteristic feature of time series data which distinguishes it from other types
of data is that the observations are, in general, correlated or dependent and one principal
aim of time series analysis is to study, investigate, explore and model this unknown correlation structure.
Definition 3.2.1: A time series $x_t$ is said to be strictly stationary if the joint density functions depend only on the relative locations of the observations, so that $f(x_{t_1+h}, x_{t_2+h}, \ldots, x_{t_k+h}) = f(x_{t_1}, x_{t_2}, \ldots, x_{t_k})$, meaning that $(x_{t_1+h}, x_{t_2+h}, \ldots, x_{t_k+h})$ and $(x_{t_1}, x_{t_2}, \ldots, x_{t_k})$ have the same joint distribution for all $h$ and for all choices of the time points $t_i$.
Definition 3.2.2: A stochastic process $z_t$ is weakly stationary (or second-order stationary) if both the mean function and the autocovariance function do not depend on time $t$; thus $E(z_t) = \mu$ for all $t$, and $\text{cov}(z_t, z_{t+k}) = \gamma_k$ depends only on the lag $k$.
Example: 3.1
Prove whether the following process is covariance stationary:
(a) $z_t = (-1)^t A$, where $A$ is a random variable with zero mean and unit variance.
Definition 3.3.1: Suppose a stationary stochastic process $x_t$ has mean $\mu$, variance $\sigma^2$ and acv.f. $\gamma_k$; then the autocorrelation function (ac.f.) is given by: $\rho_k = \dfrac{\gamma_k}{\gamma_0} = \dfrac{\gamma_k}{\sigma^2}$.
Properties
1. $\rho_0 = 1$.
2. $\rho_{-k} = \rho_k$.
3. $|\rho_k| \le 1$.
Definition 3.4.1: A discrete-time process is called a purely random process (white noise) if it consists of a sequence of random variables $z_t$ which are mutually independent and identically distributed. It follows that both the mean and variance are constant, and the acv.f. is
$\gamma_k = \text{cov}(z_t, z_{t+k}) = 0$ for $k = \pm 1, \pm 2, \ldots$
The ac.f. is given by
$\rho_k = \begin{cases} 1, & k = 0 \\ 0, & k = \pm 1, \pm 2, \ldots \end{cases}$
A process $x_t$ is said to be IID noise with mean 0 and variance $\sigma_x^2$, written $x_t \sim \text{IID}(0, \sigma_x^2)$, if the random variables $x_t$ are independent and identically distributed with $E(x_t) = 0$ and $\text{var}(x_t) = \sigma_x^2$.
Let $z_1, z_2, \ldots$ be independent, identically distributed random variables, each with mean 0 and variance $\sigma_z^2$. The observed time series $x_t$ is called a random walk if it can be expressed as $x_t = z_1 + z_2 + z_3 + \cdots + z_t$.
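A short simulation sketch: a random walk is just the cumulative sum of IID noise.

```python
import numpy as np

rng = np.random.default_rng(1)
z = rng.normal(loc=0.0, scale=1.0, size=100)   # z_1, ..., z_100 ~ IID(0, 1)
x = np.cumsum(z)                               # x_t = z_1 + z_2 + ... + z_t
print(x[:5].round(3))
```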
Backshift operator
The backshift operator is used to express and manipulate time series models. The backshift operator, denoted $B$, operates on the time index of a series and shifts time back one unit to form a new series, i.e. $Bx_t = x_{t-1}$. More generally, $B^k x_t = x_{t-k}$.
The MA(1) model for the actual data, as opposed to deviations from the mean, will be written as $x_t = \mu + z_t - \theta_1 z_{t-1}$, or $x_t - \mu = z_t - \theta_1 z_{t-1}$, where $\mu$ is the mean of the series.
Invertibility conditions
The MA(2) process is stationary for all values of $\theta_1$ and $\theta_2$. However, it is invertible only if the roots of the characteristic equation $1 - \theta_1 B - \theta_2 B^2 = 0$ lie outside the unit circle, that is,
(i) $\theta_2 + \theta_1 < 1$
(ii) $\theta_2 - \theta_1 < 1$
(iii) $-1 < \theta_2 < 1$
Example: 3.2
Find the ACF of the following process: $x_t = z_t - \theta_1 z_{t-1} - \theta_2 z_{t-2}$.
Definition 3.6.1: Suppose that $z_t$ is a purely random process such that $E(z_t) = 0$ and $\text{var}(z_t) = \sigma_z^2$. Then a process $x_t$ is said to be a moving average process of order $q$, written MA($q$), if $x_t = z_t - \theta_1 z_{t-1} - \cdots - \theta_q z_{t-q}$.
Example: 3.3
Definition 3.7.1: The partial autocorrelation function (PACF) at lag $k$, denoted $\phi_{kk}$, is defined as the correlation between $x_t$ and $x_{t+k}$ after the effect of the intervening variables $x_{t+1}, \ldots, x_{t+k-1}$ has been removed. In particular,
$\phi_{11} = \rho_1$, $\qquad \phi_{22} = \dfrac{\rho_2 - \rho_1^2}{1 - \rho_1^2}$, $\qquad \phi_{33} = \dfrac{\begin{vmatrix} 1 & \rho_1 & \rho_1 \\ \rho_1 & 1 & \rho_2 \\ \rho_2 & \rho_1 & \rho_3 \end{vmatrix}}{\begin{vmatrix} 1 & \rho_1 & \rho_2 \\ \rho_1 & 1 & \rho_1 \\ \rho_2 & \rho_1 & 1 \end{vmatrix}}$.
Definition 3.7.2: The standard error of the sample PACF $\hat{\phi}_{kk}$ is approximately $\dfrac{1}{\sqrt{n}}$.
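A small numeric sketch of the first two partial autocorrelations and the PACF standard error computed from the formulas above; the ACF values and sample size used here are illustrative, not taken from the guide.

```python
import numpy as np

rho1, rho2 = 0.6, 0.4                       # illustrative autocorrelations
phi11 = rho1
phi22 = (rho2 - rho1 ** 2) / (1 - rho1 ** 2)

n = 100                                     # illustrative sample size
se = 1 / np.sqrt(n)                         # approximate standard error of the sample PACF
print(phi11, round(phi22, 4), round(se, 3))
```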
Definition 3.8.1: Let $z_t$ be a purely random process with mean zero and variance $\sigma_z^2$. Then a process $x_t$ is said to be an autoregressive process of order $p$, written AR($p$), if $x_t = \phi_1 x_{t-1} + \cdots + \phi_p x_{t-p} + z_t$.
Example: 3.4
Consider the model $x_t = \phi_1 x_{t-1} + z_t$, where $z_t$ is white noise.
Stationarity condition
For stationarity, the roots of $\phi(B) = 1 - \phi_1 B - \phi_2 B^2 = 0$ must lie outside the unit circle, which implies that the parameters $\phi_1$ and $\phi_2$ must lie in the triangular region:
(i) $\phi_2 + \phi_1 < 1$
(ii) $\phi_2 - \phi_1 < 1$
(iii) $-1 < \phi_2 < 1$
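The root condition can also be checked numerically; a sketch using the coefficients of Example 3.5 as reconstructed below (numpy.roots expects the polynomial coefficients from the highest power down):

```python
import numpy as np

phi1, phi2 = 1.0, -0.25                        # coefficients of Example 3.5 (as reconstructed below)
roots = np.roots([-phi2, -phi1, 1])            # solves 1 - phi1*B - phi2*B^2 = 0
print(roots, bool(np.all(np.abs(roots) > 1)))  # stationary if all roots lie outside the unit circle
```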
The general solution for the autocorrelation function of a stationary AR($p$) process is $\rho_k = A_1 \pi_1^{|k|} + \cdots + A_p \pi_p^{|k|}$, where the $\pi_i$ are the roots of the auxiliary equation.
Example: 3.5
Consider the AR(2) process given by $x_t = x_{t-1} - \frac{1}{4}x_{t-2} + z_t$.
Stationary process
Invertible process
Example: 3.6
Consider the following process: $x_t = \frac{1}{2}x_{t-1} + z_t - 2z_{t-1}$.
3.11.2 The $\psi$ and $\pi$ weights
From the relations $\psi_1 - \phi_1\psi_0 = -\theta_1$ (so that $\psi_1 = \phi_1 - \theta_1$) and $\psi_j = \phi_1\psi_{j-1}$ for $j > 1$, we find that the $\psi$ weights are given by $\psi_j = (\phi_1 - \theta_1)\phi_1^{\,j-1}$ for $j \ge 1$, and similarly it is easily seen that $\pi_j = (\phi_1 - \theta_1)\theta_1^{\,j-1}$ for $j \ge 1$, for the stationary and invertible ARMA(1,1) process.
Example: 3.7
Find the $\psi$ weights and $\pi$ weights for the ARMA(1,1) process given by $x_t = 0.5x_{t-1} + z_t - 0.3z_{t-1}$.
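A sketch of the $\psi$ and $\pi$ weights for Example 3.7 using the closed-form expressions of Section 3.11.2; note that the signs in the example are reconstructed here (convention $x_t = \phi_1 x_{t-1} + z_t - \theta_1 z_{t-1}$), so treat them as an assumption.

```python
import numpy as np

phi1, theta1 = 0.5, 0.3
j = np.arange(1, 6)

psi = (phi1 - theta1) * phi1 ** (j - 1)    # psi_j = (phi1 - theta1) * phi1^(j-1)
pi_ = (phi1 - theta1) * theta1 ** (j - 1)  # pi_j  = (phi1 - theta1) * theta1^(j-1)
print(psi.round(4))
print(pi_.round(4))
```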
Definition 3.12.1: A process $x_t$ is called a seasonal ARMA process of non-seasonal order $(p, q)$ and seasonal order $(P, Q)$ with seasonal period $s$ if it can be written as $\phi(B)\Phi(B^s)x_t = \theta(B)\Theta(B^s)z_t$.
CHAPTER 4
MODELS FOR NON-STATIONARY SERIES
If the $d$-th difference $w_t = \nabla^d x_t = (1 - B)^d x_t$ follows a stationary ARMA($p, q$) process, then $x_t$ is ARIMA($p, d, q$).
Example: 4.1
Identify the following models:
a) $(1 - 0.8B)(1 - B)x_t = z_t$
b) $(1 - B)x_t = (1 - 0.75B)z_t$
Example: 4.2
Let us consider the ARIMA$(0,1,1)\times(0,1,1)_{12}$ model, where $W_t = (1 - B)(1 - B^{12})x_t = (1 - \theta B)(1 - \Theta B^{12})z_t$.
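If software is used, a model of this form can be fitted, for instance, with statsmodels; the sketch below runs on simulated monthly data and is purely illustrative (the data, seed and settings are assumptions, not part of the guide).

```python
import numpy as np
from statsmodels.tsa.statespace.sarimax import SARIMAX

rng = np.random.default_rng(0)
t = np.arange(120)
y = np.cumsum(rng.normal(size=120)) + 10 * np.sin(2 * np.pi * t / 12)  # trend + seasonality

model = SARIMAX(y, order=(0, 1, 1), seasonal_order=(0, 1, 1, 12))
result = model.fit(disp=False)
print(result.summary().tables[1])   # estimates of the MA and seasonal MA parameters
```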
CHAPTER 5
PARAMETER ESTIMATION
5.2 Introduction
Having tentatively specified an ARIMA($p, d, q$) model, the next step is to estimate the parameters of this model. This chapter focuses on the estimation of the parameters of AR and MA models. We shall deal with the most commonly used methods of estimating parameters; these are:
Method of moments.
Least squares estimation.
Maximum-likelihood method.
This method consists of equating sample moments, such as the sample mean $\bar{x}$, the sample variance $\hat{\gamma}_0$ and the sample autocorrelation function, to their theoretical counterparts and solving the resulting equation(s).
5.3.1 Mean
With only a single realization (of length $n$) of the process, a natural estimator of the mean $\mu$ is the sample mean $\bar{x} = \dfrac{1}{n}\sum_{t=1}^{n} x_t$, where $\bar{x}$ is the time average of the $n$ observations.
Since $E(\bar{x}) = \dfrac{1}{n}\sum_{t=1}^{n} E(x_t) = \mu$, $\bar{x}$ is an unbiased estimator of $\mu$, and
$\text{var}(\bar{x}) = \dfrac{\gamma_0}{n}\left[1 + 2\sum_{k=1}^{n-1}\left(1 - \dfrac{k}{n}\right)\rho_k\right]$.
ESTIMATION OF $\gamma_k$ AND $\rho_k$
Suppose that we have $n$ observations $x_1, x_2, \ldots, x_n$; then the corresponding sample autocovariance at lag $k$ is
$\hat{\gamma}_k = \dfrac{1}{n}\sum_{t=1}^{n-k}(x_t - \bar{x})(x_{t+k} - \bar{x})$,
and the sample autocorrelation is $\hat{\rho}_k = \hat{\gamma}_k / \hat{\gamma}_0$.
As an example, $\hat{\gamma}_1 = \dfrac{1}{n}\sum_{t=1}^{n-1}(x_t - \bar{x})(x_{t+1} - \bar{x})$ and $\hat{\rho}_1 = \dfrac{\sum_{t=1}^{n-1}(x_t - \bar{x})(x_{t+1} - \bar{x})}{\sum_{t=1}^{n}(x_t - \bar{x})^2}$ are used to estimate $\gamma_1$ and $\rho_1$.
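A sketch of these estimators coded directly on simulated data (the divisor $n$ follows the formula above):

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=200)
n, xbar = len(x), x.mean()

def gamma_hat(k):
    """Sample autocovariance at lag k, with divisor n."""
    return np.sum((x[:n - k] - xbar) * (x[k:] - xbar)) / n

rho_hat_1 = gamma_hat(1) / gamma_hat(0)
print(round(gamma_hat(0), 3), round(gamma_hat(1), 3), round(rho_hat_1, 3))
```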
Example: 5.1
and $z_t \sim \text{IID}(0, \sigma_z^2)$, where IID stands for Independent, Identically Distributed noise.
Example: 5.2
moments.
The method of moments is not convenient when applied to moving average models. However, for purposes of illustration, we shall consider the MA(1) process given in the following example:
Example: 5.3
The method of Least Squares Estimation (LSE) is an estimation procedure developed for standard regression models. In this section we discuss the LSE procedure and its associated problems in time series analysis.
The least squares estimate of the slope in simple linear regression is given by: $\hat{\beta} = \dfrac{n\sum x_t y_t - \sum x_t \sum y_t}{n\sum x_t^2 - \left(\sum x_t\right)^2}$.
In the next subsection we shall apply the LSE method to a time series model.
Model (*) can be viewed as a regression model with predictor variable $x_{t-1}$ and response variable $x_t$. In the LSE method we minimize the sum of squares of the differences $x_t - \phi x_{t-1}$, that is,
$S(\phi) = \sum z_t^2 = \sum (x_t - \phi x_{t-1})^2$.
Differentiating with respect to $\phi$ gives
$\dfrac{dS}{d\phi} = -2\sum (x_t - \phi x_{t-1})x_{t-1}$. (**)
Setting this equal to zero and solving for $\phi$ yields $-2\sum (x_t - \hat{\phi}x_{t-1})x_{t-1} = 0 \;\Rightarrow\; \hat{\phi} = \dfrac{\sum x_{t-1}x_t}{\sum x_{t-1}^2}$.
Example: 5.4
Consider an AR(1) model: $x_t - \mu = \phi(x_{t-1} - \mu) + z_t$.
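A sketch of the least squares estimate $\hat{\phi} = \sum x_{t-1}x_t / \sum x_{t-1}^2$ applied to a simulated AR(1) series (the mean is taken as zero here for simplicity):

```python
import numpy as np

rng = np.random.default_rng(3)
phi_true, n = 0.7, 500
x = np.zeros(n)
for t in range(1, n):
    x[t] = phi_true * x[t - 1] + rng.normal()   # simulate x_t = phi*x_{t-1} + z_t

phi_hat = np.sum(x[:-1] * x[1:]) / np.sum(x[:-1] ** 2)
print(round(phi_hat, 3))
```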
For a stationary process, the sample mean is approximately distributed as
$\bar{X} \sim N\!\left(\mu,\; \dfrac{\gamma_0}{n}\left[1 + 2\sum_{k=1}^{n-1}\left(1 - \dfrac{k}{n}\right)\rho_k\right]\right)$.
If $\gamma_0$ and the $\rho_k$'s are known, then a $100(1-\alpha)\%$ confidence interval for $\mu$ is
$\bar{X} \pm z_{\alpha/2}\sqrt{\dfrac{\gamma_0}{n}\left[1 + 2\sum_{k=1}^{n-1}\left(1 - \dfrac{k}{n}\right)\rho_k\right]}$,
where $z_{\alpha/2}$ is the upper $\alpha/2$ quantile of the standard normal distribution. Note that if $\rho_k = 0$ for all $k$, then this confidence interval formula reduces to $\bar{X} \pm z_{\alpha/2}\sqrt{\dfrac{\gamma_0}{n}}$.
Example 5.5
Suppose that in a sample of size 100 from an AR(1) process $x_t = \phi_1 x_{t-1} + z_t$ with mean $\mu$, we have $\phi_1 = 0.6$.
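A sketch of the confidence interval calculation for a setting like Example 5.5; because the sample mean and sample variance are not reproduced in this guide, the values of xbar and gamma0 below are hypothetical placeholders. For an AR(1) process, $\rho_k = \phi_1^k$.

```python
import numpy as np

n, phi = 100, 0.6
xbar, gamma0 = 10.0, 2.0                 # hypothetical sample mean and sample variance
k = np.arange(1, n)
rho = phi ** k                           # rho_k = phi^k for an AR(1) process

var_xbar = gamma0 / n * (1 + 2 * np.sum((1 - k / n) * rho))
half_width = 1.96 * np.sqrt(var_xbar)
print(f"95% CI for the mean: {xbar:.2f} +/- {half_width:.2f}")
```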
CHAPTER 6
MODEL DIAGNOSTICS
6.2 Introduction
Model diagnostics is primarily concerned with testing the goodness of fit of a tentative model. Two complementary approaches, the analysis of residuals from the fitted model and the analysis of an overparameterized model, will be considered in this chapter.
Before a model can be used for inference, the assumptions of the model should be assessed using residuals. Recall from regression analysis that residuals are given by: residual = observed value $-$ fitted value, i.e. $\hat{z}_t = x_t - \hat{x}_t$.
Residuals can be used to assess whether the ARMA model is adequate and whether the parameter estimates are close to the true values. Model adequacy is checked by assessing whether the model assumptions are satisfied.
The basic assumption is that the $z_t$ are white noise, that is, they possess the properties of independent, identically and normally distributed random variables with zero mean and constant variance $\sigma_z^2$.
A good model is one with residuals that satisfy these properties, that is, it should have residuals which are:
Independent.
Normally distributed with zero mean.
Of constant variance.
Examining the ACF: Compute the sample ACF of the residuals. The residuals are independent if the sample autocorrelations do not form any pattern and are statistically insignificant, that is, they lie within $\pm z_{\alpha/2}$ standard deviations (approximately $\pm 2/\sqrt{n}$) of zero.
Constancy of variance can be inspected by plotting the residuals over time. If the model is adequate, we expect the plot to suggest a rectangular scatter around a zero horizontal level with no trends whatsoever.
The basic idea behind ARIMA modeling is to account for any autocorrelation pattern in the series $x_t$ with a parsimonious combination of AR and MA terms, leaving the random terms $z_t$ as white noise. If the residuals are white noise, this implies that they are uncorrelated, that is, they are serially independent. To determine whether the residual ACF values are significantly different from zero we use the following Portmanteau test.
This test uses the magnitude of the residual autocorrelations as a group to check for model adequacy.
Hypothesis
$H_0$: the residuals are white noise ($\rho_1 = \rho_2 = \cdots = \rho_K = 0$) versus $H_1$: the residuals are correlated (at least one $\rho_k \ne 0$).
Test statistic: $Q = N\sum_{k=1}^{K} r_{z,k}^2$
Decision rule: Reject $H_0$ if $Q > \chi^2_{K,\,1-\alpha}$ and conclude that the random terms $z_t$ from the estimated model are correlated and that the estimated model may be inadequate.
Note: The maximum lag $K$ is taken large enough so that the $\psi$ weights are negligible for $j > K$.
Alternative test: the Ljung-Box test, $Q^* = N(N+2)\sum_{k=1}^{K} \dfrac{r_{z,k}^2}{N-k}$.
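Both statistics are simple to compute from the residual autocorrelations; a sketch on simulated stand-in residuals with N = 121 and K = 24, matching the setting of Example 6.1 below:

```python
import numpy as np

rng = np.random.default_rng(4)
z = rng.normal(size=121)                 # stand-in residuals (simulated)
N, K = len(z), 24
zbar = z.mean()

r = np.array([np.sum((z[:N - k] - zbar) * (z[k:] - zbar)) for k in range(1, K + 1)])
r = r / np.sum((z - zbar) ** 2)          # residual autocorrelations r_{z,k}

Q_bp = N * np.sum(r ** 2)                                            # Box-Pierce
Q_lb = N * (N + 2) * np.sum(r ** 2 / (N - np.arange(1, K + 1)))      # Ljung-Box
print(round(Q_bp, 2), round(Q_lb, 2))
```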
Example: 6.1
The AR(2) model $x_t = C + \phi_1 x_{t-1} + \phi_2 x_{t-2} + z_t$ was fitted to a data set of length 121.
a) The Box-Pierce statistic value was $Q = 121\sum_{k=1}^{24} r_{z,k}^2 = 31.5$. At the 95% level of significance, test the hypothesis that the AR(2) model fits the data.
The last four values in the series are 7.07, 6.90, 6.63, 6.20.
Another basic diagnostic tool is overfitting. In this diagnostic check we add another coefficient to see whether the model improves. Recall that any ARMA($p, q$) model can be considered as a special case of a more general ARMA model with the additional parameters equal to zero. Thus, after specifying and fitting our tentative model, we fit a more general model that contains the original model as a special case.
There is no unique way to overfit a model, but one should be careful not to add coefficients to both sides of the model. Overfitting with both AR and MA terms at the same time leads to estimation problems because of parameter redundancy, as well as violating the principle of parsimony.
If our tentative model is AR(2), we might overfit with AR(3). The original AR(2) will be confirmed if:
The estimate of the additional parameter $\phi_3$ is not significantly different from zero.
The estimates of the parameters $\phi_1$ and $\phi_2$ do not change significantly from their original values.
CHAPTER 7
FORECASTING
The minimum mean square error (MMSE) forecast $\hat{x}_n(l)$ of $x_{n+l}$ at the forecast origin $n$ is the conditional expectation $\hat{x}_n(l) = E(x_{n+l} \mid x_n, x_{n-1}, \ldots)$.
Example: 7.1
Assume that we have $x_1, x_2, \ldots, x_n$ and we want to forecast $x_{n+l}$ (i.e. the $l$ step-ahead forecast from origin $n$). In the general linear process form, the actual value is $x_{n+l} = z_{n+l} + \psi_1 z_{n+l-1} + \psi_2 z_{n+l-2} + \cdots$
The minimum mean square error forecast for $x_{n+l}$ is $\hat{x}_n(l) = \psi_l z_n + \psi_{l+1} z_{n-1} + \psi_{l+2} z_{n-2} + \cdots$
This form is not very useful for computing forecasts, but it is useful in finding the forecast error.
The forecast error variance is $\text{var}(e_n(l)) = \sigma_z^2\left(1 + \psi_1^2 + \psi_2^2 + \cdots + \psi_{l-1}^2\right) = \sigma_z^2\sum_{i=0}^{l-1}\psi_i^2$, where $\psi_0 = 1$.
A 95% prediction interval for the $l$ step-ahead forecast is $\hat{x}_n(l) \pm Z_{0.025}\sqrt{\text{var}(e_n(l))}$.
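A sketch of $l$-step-ahead MMSE forecasts with 95% prediction intervals for a zero-mean AR(1) process, for which $\hat{x}_n(l) = \phi^l x_n$ and $\psi_j = \phi^j$ (the numerical values are illustrative):

```python
import numpy as np

phi, sigma_z, x_n = 0.8, 1.0, 2.5        # illustrative parameter values and last observation

for l in range(1, 4):
    forecast = phi ** l * x_n                        # MMSE forecast of x_{n+l}
    psi = phi ** np.arange(l)                        # psi_0, ..., psi_{l-1}
    var_e = sigma_z ** 2 * np.sum(psi ** 2)          # forecast error variance
    half = 1.96 * np.sqrt(var_e)
    print(f"l = {l}: {forecast:.3f} +/- {half:.3f}")
```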
Example: 7.2
For each of the following models,
MA(1) process: $x_t = z_t - \theta z_{t-1}$.
Example: 7.3
For the process $(1 - \phi B)x_t = (1 - \theta B)z_t$.