
UNIT 13

CORRELATION ANALYSIS IN
TIME SERIES

Structure
13.1  Introduction
      Expected Learning Outcomes
13.2  Autocovariance and Autocorrelation Functions
13.3  Estimation of Autocovariance and Autocorrelation Functions
13.4  Partial Autocorrelation Function
13.5  Correlogram
13.6  Interpretation of Correlogram
13.7  Summary
13.8  Terminal Questions
13.9  Solution/Answers

13.1 INTRODUCTION
With the help of time series data, we try to fit a time series model so that
we can forecast future observations. One of the essential requirements of time
series modelling is stationarity. In the previous unit, you studied what
stationary and nonstationary time series are and how to detect and transform a
nonstationary time series into a stationary one. As you know, a time series
is a collection of observations taken over time; therefore, there is a chance
that a value at the present time may be related to, or depend on, past values. In most
time series, we observe such relationships. To study the degree of
relationship between past values and the current value, we have to
study the covariance and correlation between them before modelling the time
series. Therefore, in this unit, you will study correlation analysis in time series.
We begin with a simple introduction to the autocovariance and autocorrelation
functions in Sec. 13.2. In Sec. 13.3, we discuss how to estimate
the autocovariance and autocorrelation functions using time series data.
When we study the autocorrelation between observations in the presence of
intermediate variables, it does not give the true picture of the relationship.
Therefore, to remove the effect of the intermediate variables, we use the partial autocorrelation
function, which is discussed in Sec. 13.4. To present the autocorrelation/partial
autocorrelation in the form of graphs/diagrams, we use a correlogram. In
Sec. 13.5, we describe what a correlogram is and how to plot it. The
interpretation of the correlogram is also explained in Sec. 13.6. In the next
unit, you will study different models for time series.

Expected Learning Outcomes


After studying this unit, you would be able to:
 describe the concept of covariance and correlation in time series;
 explain autocovariance and autocorrelation functions;
 describe partial autocorrelation function; and
 plot and interpret the correlogram.

13.2 AUTOCOVARIANCE AND AUTOCORRELATION FUNCTIONS
As you know, a time series is a collection of observations with respect to time.
Since time series data are chronologically ordered, there is a chance
that a value at the present time may depend on past values. For example,
the temperature in the next hour is not a random event, since in most cases it
depends on the current temperature or on the temperatures observed during the past
24 hours. In other words, the past temperature has a strong impact on the future
temperature, so there exists a strong relationship between the current temperature
and the next hour's temperature. Similarly, the current sales of a company depend
on past sales, and if a stock is up today, it is more likely to be up tomorrow, and so on.
For measuring such a linear relationship, we use covariance or correlation.
Since we calculate the covariance/correlation between two values of the same time
series, it is called autocovariance/autocorrelation; the correlation between a series
and its lags is called autocorrelation. The information provided by the
autocovariance/autocorrelation is used to understand the properties of time series
data, fit appropriate models, and forecast future values of the series.
You already have some idea about covariance and correlation. Let us now revise
the basic concepts of both.
Covariance
Covariance is defined as a measure of the relationship between two variables.
It measures how much two variables change together. If X and Y are two
variables, then covariance is defined mathematically as

    Cov(X, Y) = (1/n) Σ_{i=1}^{n} (X_i − X̄)(Y_i − Ȳ)
It takes values from −∞ to +∞. The covariance tells us whether both variables
vary in the same direction (positive covariance) or in opposite directions
(negative covariance). If it is positive, it indicates a direct dependency, i.e.,
increasing the value of one variable will result in an increase in the value of the
other variable and vice versa. On the other hand, a negative value indicates that
the two variables have an inverse dependency, i.e., increasing the value of one
variable will result in a decrease in the value of the other variable and vice versa.
A zero value indicates no linear relationship between the variables.
The main problem with the covariance is that it is hard to interpret due to its
wide range (−∞ to +∞). For example, our data set could return a value of, say, 5
or 500. The covariance may take a large value simply because the variables X and Y
take large values. Therefore, a large covariance does not by itself indicate a strong
relationship between the variables: a value of 500 tells us that the variables are
related, but unlike the correlation coefficient, that number does not tell us how
strong the relationship is. Only the sign of the covariance is directly meaningful,
not its numerical size. To overcome this problem, the covariance is divided by the
product of the standard deviations of the two variables to obtain the correlation
coefficient.

Correlation

Correlation is a measure for identifying and quantifying the linear relationship
between two variables. This relationship can range from complete linear
dependence to complete independence. Correlation tells us how much a change in
one variable is accompanied by a proportional change in the other variable. One
of the most popular measures of the level of correlation between two variables is
the Pearson correlation coefficient. It measures the intensity or degree of the
linear relationship between two variables. If X and Y are two variables, then the
Pearson correlation coefficient (r) is defined mathematically as

    r_XY = r_YX = Cov(X, Y) / √( Var(X) Var(Y) )
                = Σ_{i=1}^{n} (X_i − X̄)(Y_i − Ȳ) / √( Σ_{i=1}^{n} (X_i − X̄)²  Σ_{i=1}^{n} (Y_i − Ȳ)² )

(When two variables are related in such a way that a change in the value of one
variable affects the value of the other variable, the variables are said to be
correlated.)

The value of the coefficient of correlation can range from −1 to +1, with a
negative value indicating an inverse relationship and a positive value
indicating a direct relationship. It reveals not only the nature of the relationship
but also its strength. If it is near ±1, the variables are highly correlated;
on the other hand, if it is near zero, it indicates a poor relationship.
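To make the two measures concrete, here is a minimal Python sketch (our own illustration, not part of the unit); the arrays x and y are made-up values, and the divisor n matches the formulas above.

    import numpy as np

    # Hypothetical illustrative data (not from the unit)
    x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])
    y = np.array([1.0, 3.5, 6.2, 8.1, 9.9])

    n = len(x)

    # Covariance with divisor n, as in the formula above
    cov_xy = np.sum((x - x.mean()) * (y - y.mean())) / n

    # Pearson correlation: covariance scaled by the product of the standard deviations
    r_xy = cov_xy / np.sqrt(np.var(x) * np.var(y))   # np.var also uses divisor n

    print(round(cov_xy, 3), round(r_xy, 3))          # r_xy always lies in [-1, +1]

Because the correlation divides out the scales of X and Y, the printed r value is directly interpretable, unlike the raw covariance.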
To understand autocovariance/autocorrelation, you have to understand what
lag is.

Lag

The number of intervals between the two observations is the lag. For example,
the lag between the current and previous observations is one. If you go back
one more interval, the lag is two, and so on. In mathematical terms, if the
observations Yt and Yt+k are separated by k time units, then the lag is k. This
lag can be days, quarters, or years depending on the nature of the data. When
k = 1, you are assessing adjacent observations.
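As a small illustration of lagging (again our own sketch, with made-up numbers), a lagged version of a series can be produced simply by shifting it; here pandas' shift is used for lags k = 1 and k = 2.

    import pandas as pd

    # Hypothetical series; the values are only for illustration
    y = pd.Series([22, 23, 23, 24, 23, 25, 26, 28])

    lagged = pd.DataFrame({
        "y_t":   y,
        "lag_1": y.shift(1),   # value one interval earlier (k = 1)
        "lag_2": y.shift(2),   # value two intervals earlier (k = 2)
    })
    print(lagged)   # the first k entries of each lagged column are NaN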
We now come to our main topics, autocovariance and autocorrelation, and define them formally.
Autocovariance

If we are interested in finding a linear relationship between two consecutive
observations of a time series, say Yt and Yt+1, or more generally in the
relationship between observations at lag k apart, i.e. Yt and Yt+k, then we use
autocovariance/autocorrelation. Let us start with autocovariance; we will
introduce the autocorrelation function after that.
Autocovariance can be defined as follows:
The covariance between a given time series and a lagged version of itself over
successive time intervals is called autocovariance.
If Yt and Yt+k (t = 1, 2, …; k = 0, 1, 2, …) denote the values of the time series at
times t and t + k, respectively, then the covariance between Yt and Yt+k is
called the autocovariance at lag k. Mathematically, we can define the
autocovariance function as

    γ_k = γ_−k = Cov(Yt, Yt+k) = (1/N) Σ_{t=1}^{N−k} {Yt − mean(Yt)}{Yt+k − mean(Yt+k)}

where N is the size of the time series. (You may notice that the sum is divided
by N rather than N − k, as you might expect. This is done because dividing by N
ensures that the estimated covariance matrix is a nonnegative definite matrix.)

The autocovariance function is denoted by γ_k, where γ is read as gamma and k
represents the lag. Since the mean of a stationary time series remains constant,

    mean(Yt) = mean(Yt+k) = μ

Thus,

    γ_k = γ_−k = Cov(Yt, Yt+k) = (1/N) Σ_{t=1}^{N−k} (Yt − μ)(Yt+k − μ)

When the lag is zero, that is, k = 0, then

    γ_0 = Cov(Yt, Yt) = (1/N) Σ_{t=1}^{N} (Yt − μ)(Yt − μ) = (1/N) Σ_{t=1}^{N} (Yt − μ)² = Var(Yt)

The autocovariance is the same as the covariance. The only difference is that
the autocovariance is applied to the same time series data, i.e., you compute
the covariance of the data say temperature Y with the same data temperature
Y, but from a previous period.
Autocorrelation
In time series analysis, the autocorrelation is the fundamental technique for
calculating the degree of correlation between a series and its lags. This
method is fairly similar to the Pearson correlation coefficient but
autocorrelation uses the same time series twice: one in its original form and
the second lagged one or more time periods as in autocovariance. We now
define autocorrelation as
Autocorrelation is a measure of the degree of relationship between a
given time series and a lagged version of itself over successive time intervals.
If Yt and Yt+k denote the values of a stationary time series at times
t and t + k, respectively, then the autocorrelation function/coefficient
between the time series Yt and its lagged value Yt+k is defined as

    ρ_k = Cov(Yt, Yt+k) / √( Var(Yt) Var(Yt+k) )

Since for stationary time series variance of the series remains constant,
therefore,
Var ( Yt ) = Var ( Yt +k )

Thus, the autocorrelation function at lag k becomes

    ρ_k = Cov(Yt, Yt+k) / Var(Yt) = Σ_{t=1}^{N−k} (Yt − μ)(Yt+k − μ) / Σ_{t=1}^{N} (Yt − μ)²

The autocorrelation function ρ_k can also be written in terms of the
autocovariance function as

    ρ_k = γ_k / γ_0

When the lag is zero, that is, k = 0, then

    ρ_0 = Σ_{t=1}^{N} (Yt − μ)(Yt − μ) / Σ_{t=1}^{N} (Yt − μ)² = γ_0 / γ_0 = 1

The degree of correlation between a series and its lags indicates the
pattern/characteristics of the series. For example, if a time series has a
seasonality component say monthly then we will observe a strong correlation
with its seasonal lags, say, 12, 24, and 36 months.
Some important properties of time series can be studied with the help of
autocovariance and autocorrelation functions. They measure the linear
relationship between observations at different time lags apart. They provide
useful descriptive properties of the time series under study. This is also an
important tool for guessing a suitable model for the time series data.
After understanding the concept of autocovariance and autocorrelation
functions, we now study how to estimate them using sample data.

13.3 ESTIMATION OF AUTOCOVARIANCE AND


AUTOCORRELATION FUNCTIONS
In the previous section, we considered the theoretical aspects of the autocovariance
and autocorrelation functions of a time series. In practice, we have a finite time
series, and based on its observations we estimate the mean, autocovariance and
autocorrelation functions. Suppose y1, y2, ..., yn are the observations of a finite
time series, assumed to be a sample from the theoretical time series Yt. We can
estimate the mean (μ) of the time series by the sample mean as

    μ̂ = ȳ = (1/n) Σ_{t=1}^{n} y_t
and estimate the autocovariance function as

    γ̂_k = c_k = (1/n) Σ_{t=1}^{n−k} (y_t − ȳ)(y_{t+k} − ȳ);  k = 1, 2, …, n − 1

It is known as the sample autocovariance function.


You may have noticed that the summation in the formula for the sample
autocovariance is divided by n instead of n − k, as you might expect.
This is done because dividing by n ensures that the estimate of the covariance
matrix is a nonnegative definite matrix.
Similarly, we estimate the autocorrelation function at lag k as

    ρ̂_k = r_k = Σ_{t=1}^{n−k} (y_t − ȳ)(y_{t+k} − ȳ) / Σ_{t=1}^{n} (y_t − ȳ)² = c_k / c_0;  k = 1, 2, …, n − 1

It is known as the sample autocorrelation function.


As you know, the correlation coefficient is calculated between two variables
having the same number of values. Therefore, to compute the sample
autocorrelation, we first construct two series of the same length, as
discussed in Example 1.
You will also notice that as we increase the lag k, that is, as we calculate the
autocorrelation between observations further and further apart, the two series
yt and yt+k provide only n − k overlapping pairs of observations; therefore,
as k increases, the number of usable observations decreases. After a
while, the estimates of autocovariance and autocorrelation become more
and more unreliable. Hence, to obtain reliable estimates of the autocorrelation
function, we should have at least about 50 observations, and the sample
autocorrelation function should be calculated only up to lag k = n/4, where n is the
number of observations in the time series. For illustration purposes, we consider
here only small time series data sets (fewer than 50 observations).
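The two estimators above translate directly into code. The following sketch (an illustration in Python; the function names are our own) computes the sample autocovariance c_k with divisor n and the sample autocorrelation r_k = c_k/c_0 for any lag k.

    import numpy as np

    def sample_autocovariance(y, k):
        """c_k: uses divisor n, as in the formula above (not n - k)."""
        y = np.asarray(y, dtype=float)
        n = len(y)
        ybar = y.mean()
        return np.sum((y[: n - k] - ybar) * (y[k:] - ybar)) / n

    def sample_autocorrelation(y, k):
        """r_k = c_k / c_0."""
        return sample_autocovariance(y, k) / sample_autocovariance(y, 0)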
Let's look at an example which helps you to understand how to calculate the
sample autocovariance and autocorrelation functions.
Example 1: The meteorological department collected the following data of
temperature (in oC) in a particular area on different days:
Day   Temperature      Day   Temperature
 1        22             9        28
 2        23            10        30
 3        23            11        31
 4        24            12        30
 5        23            13        31
 6        25            14        31
 7        26            15        30
 8        28

Calculate mean, variance and autocorrelation functions for the given data.
Solution: As you know, the autocovariance/autocorrelation function is
calculated between variables having the same number of values. Therefore, to
compute the sample autocorrelation, first of all, we make two series of the same
length. If yt denotes the value of the temperature/series at a particular time t,
then yt+1 denotes the value of the temperature/series one time unit later; each
observation is therefore paired with its lag 1 value, as shown in the following
table:
Day  Temperature (yt)  yt+1      Day  Temperature (yt)  yt+1
 1        22            --        9        28           28
 2        23            22       10        30           28
 3        23            23       11        31           30
 4        24            23       12        30           31
 5        23            24       13        31           30
 6        25            23       14        31           31
 7        26            25       15        30           31
 8        28            26

Since yt and yt+1 have different lengths (the first has 15 observations, while the
second has 14), we use the data from day 2 to day 15 so that both series have
equal length for k = 1. Consequently, our data are as follows:
Day  Temperature (yt)  yt+1
 1        22            --
 2        23            22
 3        23            23
 4        24            23
 5        23            24
 6        25            23
 7        26            25
 8        28            26
 9        28            28
10        30            28
11        31            30
12        30            31
13        31            30
14        31            31
15        30            31

Since there are 15 observations, therefore, we prepare the data up to n/4


= 15/4 ~ 4 lags in a similar way as shown below:
Day  Temperature (yt)  yt+1  yt+2  yt+3  yt+4
 1        22
 2        23            22
 3        23            23    22
 4        24            23    23    22
 5        23            24    23    23    22
 6        25            23    24    23    23
 7        26            25    23    24    23
 8        28            26    25    23    24
 9        28            28    26    25    23
10        30            28    28    26    25
11        31            30    28    28    26
12        30            31    30    28    28
13        31            30    31    30    28
14        31            31    30    31    30
15        30            31    31    30    31
Total    405

(For k = 2, we consider the data from day 3 onwards; for k = 3, we start from day 4, and so on.)

Since for the calculation of the autocorrelation function, we assume that the
time series is stationary, therefore, mean and variance of the series will be
constant. Thus, we calculate the sample mean and variance of the given
original time series and make the necessary calculations for calculating the
autocovariance and autocorrelation function in the following table:

yt−ȳ  (yt−ȳ)²  yt+1−ȳ  yt+2−ȳ  yt+3−ȳ  yt+4−ȳ  (yt−ȳ)(yt+1−ȳ)  (yt−ȳ)(yt+2−ȳ)  (yt−ȳ)(yt+3−ȳ)  (yt−ȳ)(yt+4−ȳ)
 −5     25
 −4     16      −5                                    20
 −4     16      −4      −5                            16              20
 −3      9      −4      −4      −5                    12              12              15
 −4     16      −3      −4      −4      −5            12              16              16              20
 −2      4      −4      −3      −4      −4             8               6               8               8
 −1      1      −2      −4      −3      −4             2               4               3               4
  1      1      −1      −2      −4      −3            −1              −2              −4              −3
  1      1       1      −1      −2      −4             1              −1              −2              −4
  3      9       1       1      −1      −2             3               3              −3              −6
  4     16       3       1       1      −1            12               4               4              −4
  3      9       4       3       1       1            12               9               3               3
  4     16       3       4       3       1            12              16              12               4
  4     16       4       3       4       3            16              12              16              12
  3      9       4       4       3       4            12              12               9              12
Total  164      −3      −7     −11     −14           137             111              77              46

Therefore,

    Mean = (1/n) Σ_{t=1}^{n} y_t = 405/15 = 27

    Variance = c_0 = (1/n) Σ_{t=1}^{n} (y_t − ȳ)² = 164/15 = 10.933

Autocovariance function:

    c_1 = (1/n) Σ_{t=1}^{n−1} (y_t − ȳ)(y_{t+1} − ȳ) = (1/15) × 137 = 9.133

    c_2 = (1/n) Σ_{t=1}^{n−2} (y_t − ȳ)(y_{t+2} − ȳ) = (1/15) × 111 = 7.4

    c_3 = (1/n) Σ_{t=1}^{n−3} (y_t − ȳ)(y_{t+3} − ȳ) = (1/15) × 77 = 5.133

    c_4 = (1/n) Σ_{t=1}^{n−4} (y_t − ȳ)(y_{t+4} − ȳ) = (1/15) × 46 = 3.067

After calculating the autocovariance function, we now calculate the sample
autocorrelation function as

    r_1 = c_1/c_0 = 9.133/10.933 = 0.835

    r_2 = c_2/c_0 = 7.4/10.933 = 0.677

    r_3 = c_3/c_0 = 5.133/10.933 = 0.470

    r_4 = c_4/c_0 = 3.067/10.933 = 0.280
You may like to try the following Self Assessment Question before studying
further.

SAQ 1
A researcher wants to study the pattern of the unemployment rate in his
country. He collected quarterly unemployment rate data and given in the
following table:
Quarter   Unemployment rate      Quarter   Unemployment rate
   1              91                 7              64
   2              45                 8              99
   3              89                 9              64
   4              36                10              89
   5              72                11              68
   6              51                12             108

Compute:
(i) mean and variance, and
(ii) Autocovariance and autocorrelation functions.

13.4 PARTIAL AUTOCORRELATION FUNCTION


In the previous section, you studied the autocorrelation function, which measures
the linear dependency between a time series Yt and its own lagged values
Yt+k. However, a time series tends to carry information and dependency
structures in steps, and therefore the autocorrelation at lag k is also influenced by
the intermediate variables Yt+1, Yt+2, …, Yt+k–1. Therefore, autocorrelation is not
the correct measure of the mutual correlation between Yt and Yt+k in the
presence of the intermediate variables. Partial autocorrelation solves this
problem by measuring the correlation between Yt and Yt+k after the influence
of the intermediate variables has been removed. Hence, the partial autocorrelation
in time series analysis is the correlation between Yt and Yt+k which is not
accounted for by lags t + 1 to t + k – 1. The partial autocorrelation function is
similar to the autocorrelation function except that it displays only the correlation
between two observations after removing the effect of the intermediate variables.
For example, if we are interested in the direct relationship between today's
consumption of petrol and that of a year ago, we do not consider what
happens in between. The consumption of the previous 12 months has an
effect on the consumption of the previous 11 months, and the cycle continues
until the most current period. In partial autocorrelation estimates, these indirect
effects are ignored. Therefore, we can define the partial autocorrelation
function as
function as
The partial autocorrelation function calculates the degree of relationship
between a time series Yt with its own lagged values Yt+k after their mutual
linear dependency on the intervening variables Yt+1, Yt+2,…, Yt+k–1 has
been removed.
You can understand the same using the diagram given in Fig. 13.1.

Fig. 13.1: PACF of order 2.

In other words, we can define the partial autocorrelation function between Yt


and Yt+k as
The conditional correlation between Yt and Yt+k, conditional
on Yt +1, Yt + 2 ,..., Yt +k −1 (the set of observations that come between the time
points Yt and Yt+k), is known as the kth order PACF.
Therefore, we can define the kth order (lag) partial autocorrelation function
mathematically as

    φ_kk = Cov(Yt, Yt+k | Yt+1, Yt+2, …, Yt+k−1) / √( Var(Yt | Yt+1, …, Yt+k−1) Var(Yt+k | Yt+1, …, Yt+k−1) )

This is the correlation between values k time periods apart, conditional on
knowledge of the values in between. (By the way, the two variances in the
denominator are equal for a stationary series.) Therefore,

    φ_kk = Cov(Yt, Yt+k | Yt+1, Yt+2, …, Yt+k−1) / Var(Yt | Yt+1, Yt+2, …, Yt+k−1)

The formula for calculating the partial autocorrelation function looks difficult;
therefore, in practice we calculate it from the autocorrelation function instead.
The 1st order partial autocorrelation function equals the 1st order
autocorrelation function, that is,

    φ_11 = ρ_1

Similarly, we can define the 2nd order (lag) partial autocorrelation function in
terms of the autocorrelation function as

    φ_22 = (ρ_2 − ρ_1²) / (1 − ρ_1²)

The general form for calculating the partial autocorrelation function in terms of
the ACF is given in matrix form as

    | φ_1k |   | 1        ρ_1      ρ_2      …  ρ_{k−1} |⁻¹  | ρ_1 |
    | φ_2k |   | ρ_1      1        ρ_1      …  ρ_{k−2} |    | ρ_2 |
    | φ_3k | = | ρ_2      ρ_1      1        …  ρ_{k−3} |    | ρ_3 |
    |  ⋮   |   | ⋮        ⋮        ⋮           ⋮       |    |  ⋮  |
    | φ_kk |   | ρ_{k−1}  ρ_{k−2}  ρ_{k−3}  …  1       |    | ρ_k |

or

    φ_k = P_k⁻¹ Ψ_k

where φ_k = (φ_1k, φ_2k, …, φ_kk)ᵀ, Ψ_k = (ρ_1, ρ_2, …, ρ_k)ᵀ and P_k is the
k × k autocorrelation matrix whose (i, j)th element is ρ_|i−j|.

(Cramer's rule applies to any system of n linear equations in n unknowns,
Ax = b. If |A| ≠ 0, the system has a unique solution whose ith component is
x_i = |A_i| / |A|, where A_i is the matrix A with its ith column replaced by b.)
In the above expression, the last coefficient, φ_kk, is the partial autocorrelation
function of order k. Since we are interested only in this coefficient, we can
solve the above system for φ_kk using Cramer's rule. We get

    φ_kk = |P_k*| / |P_k|

where |·| denotes the determinant and P_k* is given as

           | 1        ρ_1      ρ_2      …  ρ_1 |
           | ρ_1      1        ρ_1      …  ρ_2 |
    P_k* = | ρ_2      ρ_1      1        …  ρ_3 |
           | ⋮        ⋮        ⋮           ⋮   |
           | ρ_{k−1}  ρ_{k−2}  ρ_{k−3}  …  ρ_k |

That is, P_k* is the matrix P_k in which the kth column is replaced by Ψ_k.

Therefore, the 3rd order partial autocorrelation function is

    φ_33 = |P_3*| / |P_3|

where

           | 1    ρ_1  ρ_1 |              | 1    ρ_1  ρ_2 |
    P_3* = | ρ_1  1    ρ_2 |   and  P_3 = | ρ_1  1    ρ_1 |
           | ρ_2  ρ_1  ρ_3 |              | ρ_2  ρ_1  1   |

As you saw, the autocorrelation function helps assess the properties of a time
series. In contrast, the partial autocorrelation function (PACF) is more useful
for finding the order of an autoregressive (AR) or autoregressive integrated
moving average (ARIMA) model. You will study these models in the next unit.

Sample Partial Autocorrelation Function


In practice, we have a finite time series and, on the basis of its observations,
we estimate the partial autocorrelation function. The
estimate of the partial autocorrelation function is known as sample partial
autocorrelation and the formulae for the same are obtained by replacing
autocorrelation function (ρ) with sample autocorrelation function (r) which are
given as follows:

    φ̂_11 = r_1

We define the 2nd order (lag) sample partial autocorrelation function as

    φ̂_22 = (r_2 − r_1²) / (1 − r_1²)
The general form for calculating the sample partial autocorrelation function of
order k is obtained in the same way, with each ρ replaced by its sample
estimate r:

    φ̂_kk = |P̂_k*| / |P̂_k|

where P̂_k is the k × k matrix whose (i, j)th element is r_|i−j| and P̂_k* is
P̂_k with its kth column replaced by (r_1, r_2, …, r_k)ᵀ.

Let's consider an example which helps you to understand how to calculate the
sample partial autocorrelation function.

Example 2: For the data given in Example 2 of Unit 12, calculate the sample
partial autocorrelation up to order 3.

Solution: For calculating the sample partial autocorrelation, first of all, we have to
compute the sample autocorrelation function. We have already calculated these
values in Example 1; therefore, to save time, we simply write them here:

    r_1 = 0.835, r_2 = 0.677, r_3 = 0.470, r_4 = 0.280

Since, the 1st-order partial autocorrelation function equals the 1st-order


autocorrelation function, therefore,

φ11 = r1 = 0.835

We can calculate the 2nd order (lag) sample partial autocorrelation function as

    φ̂_22 = (r_2 − r_1²) / (1 − r_1²) = (0.677 − (0.835)²) / (1 − (0.835)²) = −0.020 / 0.303 = −0.067
Similarly, we now compute the 3rd order sample partial autocorrelation
function as

    φ̂_33 = |P̂_3*| / |P̂_3|

where

           | 1    r_1  r_1 |              | 1    r_1  r_2 |
    P̂_3* = | r_1  1    r_2 |   and  P̂_3 = | r_1  1    r_1 |
           | r_2  r_1  r_3 |              | r_2  r_1  1   |

The numerator determinant is

    | 1      0.835  0.835 |
    | 0.835  1      0.677 |
    | 0.677  0.835  0.470 |

    = 1 × (1 × 0.470 − 0.677 × 0.835) − 0.835 × (0.835 × 0.470 − 0.677 × 0.677)
      + 0.835 × (0.835 × 0.835 − 1 × 0.677)

    = −0.095 + 0.055 + 0.017 = −0.023

Similarly, the denominator determinant is

    | 1      0.835  0.677 |
    | 0.835  1      0.835 |
    | 0.677  0.835  1     |

    = 1 × (1 − 0.835 × 0.835) − 0.835 × (0.835 × 1 − 0.835 × 0.677)
      + 0.677 × (0.835 × 0.835 − 0.677 × 1)

    = 0.303 − 0.225 + 0.014 = 0.092

Therefore,

    φ̂_33 = −0.023 / 0.092 ≈ −0.25  (about −0.256 if full precision is kept throughout)
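As a quick numerical check (not part of the original solution), solving the corresponding Yule-Walker system with the r values of Example 1 gives approximately the same answer:

    import numpy as np

    r = [0.835, 0.677, 0.470]          # sample autocorrelations from Example 1
    rho = np.array([1.0] + r)

    P3 = np.array([[rho[abs(i - j)] for j in range(3)] for i in range(3)])
    psi3 = rho[1:4]
    print(round(np.linalg.solve(P3, psi3)[-1], 3))   # phi_33, approximately -0.256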
Before going to the next section, you may like to compute the sample partial
autocorrelation function yourself. Let us try a Self Assessment Question.

SAQ 2
For the data given in SAQ 1, calculate the sample partial autocorrelation
function up to order 2.

13.5 CORRELOGRAM
In the previous sections, you learnt about the autocovariance, autocorrelation, and
partial autocorrelation functions, which are used to understand the properties of
a time series, fit appropriate models, and forecast future values of the series.
With the help of the autocorrelation/partial autocorrelation function, we can
also diagnose whether the time series is stationary or not. However, a long list of
autocorrelation values is difficult to read and can easily be misinterpreted. If we
present the autocorrelation/partial autocorrelation function in the form of a
graph/diagram, it is more appealing to the reader and can be understood better.
A plot in which we take the autocorrelation function on the vertical axis and
different lags on the horizontal axis is known as a correlogram. The technique
of drawing a correlogram is the same as that of a simple bar diagram. The
only difference is that we just take a line instead of a bar of the same width.
Each bar in the correlogram represents the level of correlation between the
series and its lags in chronological order. A correlogram is also known as an
autocorrelation function (ACF) plot or autocorrelation plot. It gives
a summary of autocorrelation at different lags. With the help of a
correlogram, we can easily examine the nature of the time series and
diagnose a suitable model for the time series data.
A correlogram typically shows that observations at smaller lags are positively
correlated and that the autocorrelation decreases as the lag k increases. In most
time series, it is noticed that the absolute value of rk, i.e. |rk|, decreases as k
increases. This is because observations located far apart in time are usually not
much related to each other, whereas observations close together may be strongly
positively or negatively correlated.
Let us understand how we plot a correlogram with the help of an example.
Example 3: For the data given in Example 2 of Unit 12, plot the correlogram.
Solution: A correlogram is a plot of the autocorrelation function against its lag;
therefore, first of all, we have to compute the sample autocorrelation
coefficients. We have already calculated these in Example 1; to save time, we
simply write them here:

    r_1 = 0.835, r_2 = 0.677, r_3 = 0.470, r_4 = 0.280

For the correlogram, we take lags on the X-axis and sample autocorrelation
function on the Y-axis. At each lag, we draw a line, which represents the level
of correlation between the series and a lagged version of itself, as shown in
the following Fig. 13.2.

Fig. 13.2: The correlogram for lag k = 1, 2, 3 and 4.
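A correlogram such as Fig. 13.2 can be drawn with a simple stem (vertical-line) plot; the following matplotlib sketch (our own, not part of the unit) uses the four r values computed above.

    import matplotlib.pyplot as plt

    lags = [1, 2, 3, 4]
    r = [0.835, 0.677, 0.470, 0.280]    # sample autocorrelations from Example 1

    plt.stem(lags, r)                   # one vertical line per lag
    plt.axhline(0, linewidth=0.8)       # zero reference line
    plt.xlabel("Lag k")
    plt.ylabel("Sample autocorrelation r_k")
    plt.title("Correlogram")
    plt.show()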

After learning what is correlogram and how we plot it, we now understand how
the correlogram helps us to recognise the nature of a time series.

13.6 INTERPRETATION OF CORRELOGRAM


A correlogram is a graph used to interpret a set of autocorrelation values, in
which the autocorrelation function is plotted against its lag. It is often very
helpful for visual inspection to recognise the nature of time series, though it is
not always easy. We now describe certain types of time series and the nature
of their correlograms.
Random Series
A time series is completely random if it contains only independent
observations. Therefore, the values of the autocorrelation function for such a
series are approximately zero, that is, rk ≈ 0, and the correlogram of such a
random time series moves around the zero line. A typical correlogram is
shown in Fig. 13.3.

Fig. 13.3: The correlogram of random series.
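You can see this behaviour by simulating a purely random series and computing its sample autocorrelations; in the sketch below (illustrative only, with a fixed random seed) all the r_k come out close to zero.

    import numpy as np

    rng = np.random.default_rng(42)
    y = rng.normal(size=200)            # purely random series of 200 independent values

    n, ybar = len(y), y.mean()
    c0 = np.sum((y - ybar) ** 2) / n
    for k in range(1, 6):
        rk = np.sum((y[: n - k] - ybar) * (y[k:] - ybar)) / n / c0
        print(k, round(rk, 3))          # all values should be close to zero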

Alternating Series
If a time series behaves in a very rough, zig-zag manner, alternating between
values above and below the mean, this is reflected in the correlogram by
autocorrelations that alternate in sign: a negative rk followed by a positive rk+1,
and vice versa. The correlogram of an alternating time series is shown in
Fig. 13.4.

Fig. 13.4: The correlogram of alternating series.

Stationary Time Series

A time series is said to be stationary if its mean, variance and covariance are
almost constant and it is free from trend and seasonal effects. The correlogram
of a stationary series has a few large autocorrelations (in absolute value) at
small lags k, and they tend to zero very rapidly as the lag k increases (see
Fig. 13.5). A model called an autoregressive model (which you will study in the
next unit) may be appropriate for a series of this type.

Fig. 13.5: The correlogram of stationary time series.
Nonstationary Time Series
A time series is said to be nonstationary if its mean, variance, and covariance
change over time. Therefore, a time series which contains trend, seasonality,
cycles, random walks, or combinations of these is nonstationary. Such a
series is usually very smooth in nature and its autocorrelations go to zero very
slowly because the observations are dominated by the trend. We should remove
the trend from such a time series before doing any further analysis. Time series
with trend and seasonal effects behave as follows:
(i) Trend Time Series
If a time series has a trend effect, then a time plot will show an upward
or downward pattern, as you have seen in Unit 10. For such a time
series, the correlogram decreases in an almost linear fashion as the lag
increases, as shown in Fig. 13.6. Hence a correlogram of this type is a
clear indication of a trend.

Fig. 13.6: The correlogram of time series having trend effect.

(ii) Seasonal Time Series

If a time series has a dominant seasonal pattern, then a time plot will
show cyclical behaviour with the periodicity of the season. The
correlogram will also exhibit an oscillating behaviour, as shown in Fig.
13.7. If there is seasonality of, say, 12 months, then the ACF value will be
large and positive at lag 12 and possibly also at lags 24, 36, …. Similarly,
for quarterly seasonal data, a large ACF value will be seen at lag 4 and
possibly also at lags 8, 12, …. In this case, the correlogram may not contain
much more information than what is already given by the time plot of the
time series; however, if the seasonal variation is removed from the data
first, the correlogram of the deseasonalised series may provide useful
additional information.

Fig. 13.7: The correlogram of time series having seasonal effect.

In general, the interpretation of a correlogram is not easy and requires a lot of


experience and insight.
You may like to try the following Self Assessment Question.

SAQ 3
A share market expert wants to study the pattern of a particular share price.
For that, he calculates the autocorrelation for different lags which are given as
follows:
r0 = 1, r1 = 0.482, r2 = 0.050, r3 = −0.159, r4 = 0.253, r5 = −0.024, r6 = 0.053,
r7 = 0.025, r8 = −0.252, r9 = −0.177, r10 = 0.006, r11 = 0.390, r12 = −0.838,
r13 = 0.407, r14 = 0.010, r15 = −0.181, r16 = −0.257, r17 = −0.057, r18 = 0.016,
r19 = −0.051

For the above information:


(i) Plot the correlogram.
(ii) Interpret the correlogram. Is the seasonality apparent in the
correlogram?

We end this unit by giving a summary of its contents.



13.7 SUMMARY
In this unit, we have discussed:
• Role of correlation analysis in time series.
• The covariance between a given time series and a lagged version of itself
  over successive time intervals is called autocovariance. The formula for
  calculating the autocovariance function is

      γ_k = γ_−k = Cov(Yt, Yt+k) = (1/N) Σ_{t=1}^{N−k} (Yt − μ)(Yt+k − μ)

  and its estimate using sample data is

      γ̂_k = c_k = (1/n) Σ_{t=1}^{n−k} (y_t − ȳ)(y_{t+k} − ȳ);  k = 1, 2, …, n − 1

• Autocorrelation is a measure of the degree of relationship between a
  given time series and a lagged version of itself over successive time
  intervals. The formula for calculating the autocorrelation function is

      ρ_k = Σ_{t=1}^{N−k} (Yt − μ)(Yt+k − μ) / Σ_{t=1}^{N} (Yt − μ)² = γ_k / γ_0

  and its estimate using sample data is

      ρ̂_k = r_k = Σ_{t=1}^{n−k} (y_t − ȳ)(y_{t+k} − ȳ) / Σ_{t=1}^{n} (y_t − ȳ)² = c_k / c_0;  k = 1, 2, …, n − 1

• The partial autocorrelation function calculates the degree of relationship
  between a time series Yt and its own lagged values Yt+k after their mutual
  linear dependency on the intervening variables Yt+1, Yt+2, …, Yt+k–1 has
  been removed. We can calculate it from the autocorrelation function as

      φ_11 = ρ_1,   φ_22 = (ρ_2 − ρ_1²) / (1 − ρ_1²),   φ_kk = |P_k*| / |P_k|

• A plot in which we take the autocorrelation function on the vertical axis


and different lags on the horizontal axis is known as a correlogram.

13.8 TERMINAL QUESTIONS


For the data which is obtained after the first difference of the time series data
(sales of a new single house in a region) of TQ 1 of Unit 12

(i) Calculate ACF.

(ii) Interpret the correlogram. Is the trend apparent in the correlogram?



13.9 SOLUTION/ANSWERS
Self Assessment Questions (SAQs)
1. Since there are 12 observations, therefore, we prepare the data up to
n/4 = 12/4 = 3 lags as follows:
Quarter Unemployment (yt) yt+1 yt+2 yt+3
1 91
2 45 91
3 89 45 91
4 36 89 45 91
5 72 36 89 45
6 51 72 36 89
7 64 51 72 36
8 99 64 51 72
9 64 99 64 51
10 89 64 99 64
11 68 89 64 99
12 108 68 89 64
Total 876

For the calculation of the autocorrelation function, we assume that the


time series is stationary, therefore, mean and variance of the series will
be constant. Thus, we calculate the sample mean and variance of the
given original time series and make the necessary calculations for
calculating the autocovariance and autocorrelation function in the
following table:

yt−ȳ   (yt−ȳ)²   yt+1−ȳ   yt+2−ȳ   yt+3−ȳ   (yt−ȳ)(yt+1−ȳ)   (yt−ȳ)(yt+2−ȳ)   (yt−ȳ)(yt+3−ȳ)
 18       324
−28       784       18                              −504
 16       256      −28       18                     −448              288
−37      1369       16      −28       18            −592             1036              −666
 −1         1      −37       16      −28              37              −16                28
−22       484       −1      −37       16              22              814              −352
 −9        81      −22       −1      −37             198                9               333
 26       676       −9      −22       −1            −234             −572               −26
 −9        81       26       −9      −22            −234               81               198
 16       256       −9       26       −9            −144              416              −144
 −5        25       16       −9       26             −80               45              −130
 35      1225       −5       16       −9            −175              560              −315
  0      5562                                      −2154             2661             −1074

Therefore,

    Mean = (1/n) Σ_{t=1}^{n} y_t = 876/12 = 73

    Variance = c_0 = (1/n) Σ_{t=1}^{n} (y_t − ȳ)² = 5562/12 = 463.5
Autocovariance:

    c_1 = (1/n) Σ_{t=1}^{n−1} (y_t − ȳ)(y_{t+1} − ȳ) = (1/12) × (−2154) = −179.5

    c_2 = (1/n) Σ_{t=1}^{n−2} (y_t − ȳ)(y_{t+2} − ȳ) = (1/12) × 2661 = 221.75

    c_3 = (1/n) Σ_{t=1}^{n−3} (y_t − ȳ)(y_{t+3} − ȳ) = (1/12) × (−1074) = −89.5

After calculating the autocovariance function, we now calculate the sample
autocorrelation function as

    r_1 = c_1/c_0 = −179.5/463.5 = −0.387,  r_2 = c_2/c_0 = 221.75/463.5 = 0.478,  r_3 = c_3/c_0 = −89.5/463.5 = −0.193
2.  In SAQ 1, we have already calculated the sample autocorrelation
    coefficients, which are as follows:

    r_1 = −179.5/463.5 = −0.387,  r_2 = 221.75/463.5 = 0.478,  r_3 = −89.5/463.5 = −0.193

    Since the 1st order partial autocorrelation equals the 1st order
    autocorrelation,

        φ̂_11 = r_1 = −0.387

    We can compute the 2nd order (lag) sample partial autocorrelation
    function as

        φ̂_22 = (r_2 − r_1²) / (1 − r_1²) = (0.478 − (−0.387)²) / (1 − (−0.387)²) = 0.328 / 0.850 = 0.386
3. For plotting the correlogram, we take lags on the X-axis and sample
autocorrelation coefficients on the Y-axis. At each lag, we draw a line,
which represents the level of correlation between the series and its lags,
as shown in the following Fig. 13.8.

Fig. 13.8: The correlogram of time series of the share price.

Since the correlogram shows an oscillation, the time series of the share
price is not stationary. The frequency of the oscillations is almost the
same; therefore, it has a seasonal effect.

Terminal Questions (TQs)


1. For calculating the first sample autocorrelation of the first-order
difference, we prepare the data of lags. Since there are 14
observations, therefore, we prepare the data up to n/4 = 14/4 ~ 4
lags as follows:
Month   First Difference (yt)   yt+1   yt+2   yt+3   yt+4
2 38
3 21 38
4 32 21 38
5 18 32 21 38
6 5 18 32 21 38
7 15 5 18 32 21
8 25 15 5 18 32
9 20 25 15 5 18
10 10 20 25 15 5
11 15 10 20 25 15
12 30 15 10 20 25
13 8 30 15 10 20
14 32 8 30 15 10
15 25 32 8 30 15

For the calculation of the autocorrelation function, we assume that the


time series is stationary, therefore, mean and variance of the series will
be constant. Thus, we calculate the sample mean and variance of the
time series ( first difference data) and make the necessary calculations
for calculating the autocovariance and autocorrelation function in the
following table:

yt−ȳ  (yt−ȳ)²  yt+1−ȳ  yt+2−ȳ  yt+3−ȳ  yt+4−ȳ  (yt−ȳ)(yt+1−ȳ)  (yt−ȳ)(yt+2−ȳ)  (yt−ȳ)(yt+3−ȳ)  (yt−ȳ)(yt+4−ȳ)
 17      289
  0        0      17                                      0
 11      121       0      17                              0             187
 −3        9      11       0      17                    −33               0             −51
−16      256      −3      11       0      17             48            −176               0            −272
 −6       36     −16      −3      11       0             96              18             −66               0
  4       16      −6     −16      −3      11            −24             −64             −12              44
 −1        1       4      −6     −16      −3             −4               6              16               3
−11      121      −1       4      −6     −16             11             −44              66             176
 −6       36     −11      −1       4      −6             66               6             −24              36
  9       81      −6     −11      −1       4            −54             −99              −9              36
−13      169       9      −6     −11      −1           −117              78             143              13
 11      121     −13       9      −6     −11           −143              99             −66            −121
  4       16      11     −13       9      −6             44             −52              36             −24
Total   1272                                           −110             −41              33            −109
    Mean = (1/n) Σ_{t=1}^{n} y_t = 294/14 = 21

    Variance = c_0 = (1/n) Σ_{t=1}^{n} (y_t − ȳ)² = 1272/14 = 90.86

Autocovariance:

    c_1 = (1/n) Σ_{t=1}^{n−1} (y_t − ȳ)(y_{t+1} − ȳ) = (1/14) × (−110) = −7.86

    c_2 = (1/n) Σ_{t=1}^{n−2} (y_t − ȳ)(y_{t+2} − ȳ) = (1/14) × (−41) = −2.93

    c_3 = (1/n) Σ_{t=1}^{n−3} (y_t − ȳ)(y_{t+3} − ȳ) = (1/14) × 33 = 2.36

    c_4 = (1/n) Σ_{t=1}^{n−4} (y_t − ȳ)(y_{t+4} − ȳ) = (1/14) × (−109) = −7.79

After calculating the autocovariance, we now calculate the sample
autocorrelation function as

    r_1 = c_1/c_0 = −7.86/90.86 = −0.086,  r_2 = c_2/c_0 = −2.93/90.86 = −0.032

    r_3 = c_3/c_0 = 2.36/90.86 = 0.026,  r_4 = c_4/c_0 = −7.79/90.86 = −0.086

For the correlogram, we take lags on the X-axis and sample


autocorrelation function on the Y-axis. At each lag, we draw a line,
which represents the level of correlation between the series and its lags,
as shown in the following Fig. 13.9.

Fig. 13.9: The correlogram of time series data of sales of new single houses.

Since the autocorrelation function is approximately zero and moves
around the zero line, the time series is stationary and no trend
appears in the correlogram.
