Dire Dawa University
College of Business and Economics
Department of Economics
Econometrics II
By Desalegn N.
October 23, 2023
Chapter Three: Introduction to Basic Regression Analysis with Time
Series Data
Outline
The nature of Time Series Data
Stationary and non-stationary stochastic Processes
Trend Stationary and Difference Stationary Stochastic Processes
Integrated Stochastic Process
Tests of Stationarity: The Unit Root Test
The nature of Time Series Data
A time series is a set of observations on the values that a variable takes at different
times.
Time series data are collected at regular intervals: daily, weekly, monthly, quarterly,
annually, or decennially.
A time series is generated by a stochastic (random) process.
The observed series is one particular realization of that stochastic process.
In time series data, the current observation typically depends on previous observations.
For instance, data may be recorded daily (e.g., stock prices, weather reports),
weekly (e.g., money supply figures), monthly (e.g., the unemployment rate, the
Consumer Price Index [CPI]), quarterly (e.g., GDP), or annually (e.g., government
budgets).
A realization (an observed time series) is to a stochastic process what a sample is to a
population in cross-section data.
Aims of Time Series Analysis:
(a) Description: estimate a model for Yt and analyze it
(b) Forecasting future values of Yt
Problems with time series data
Non-stationarity of the data (random walk nature of the variables)
Autocorrelation, often arising from the non-stationarity of the variables
Spurious regression: a high R² even though there is no meaningful relationship
between the variables.
Stationary and non-stationary stochastic Processes
A stochastic process, or time series process, is a sequence of random variables
indexed by time.
It is a collection of random variables ordered in time.
If we let Y denote a random variable, we denote it as Y(t) if it is continuous
and as Yt if it is discrete.
Since most economic data are collected at discrete points in time, for our purpose
we will use the notation Yt.
If we let Y represent GDP, our data are Y1, Y2, Y3, ... , Yn,
where the subscript 1 denotes the first observation of GDP, the subscript 2 the second, and so on.
Accordingly, the stochastic process is the collection (sequence) of these random variables
ordered in time: {Yt} = {Y1, Y2, ..., Yn}.
A. Stationary stochastic process: a stochastic process whose mean and
variance are constant over time and whose covariance between two
time periods depends only on the distance (gap or lag) between the two time
periods, not on the actual time at which the covariance is computed.
If a time series is stationary, its mean, variance, and auto-covariance (at various
lags) remain the same no matter at what point we measure them; that is, they are
time invariant.
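The time-invariance property can be checked informally on simulated data. The sketch below is only an illustration: the helper names (autocovariance, half_sample_moments), the seed, and the sample size are assumptions, not part of the lecture. It draws white noise, which is stationary, and compares the mean and variance over the two halves of the sample, then prints the sample autocovariance at a few lags.

```python
import random

def autocovariance(series, lag):
    """Sample autocovariance at the given lag (divides by n, one common convention)."""
    n = len(series)
    mean = sum(series) / n
    return sum((series[t] - mean) * (series[t + lag] - mean)
               for t in range(n - lag)) / n

def half_sample_moments(series):
    """(mean, variance) of the first half and of the second half of a series."""
    half = len(series) // 2
    moments = []
    for part in (series[:half], series[half:]):
        m = sum(part) / len(part)
        v = sum((x - m) ** 2 for x in part) / len(part)
        moments.append((m, v))
    return moments

# Hypothetical example: Gaussian white noise with mean 0 and variance 1.
random.seed(0)
white_noise = [random.gauss(0.0, 1.0) for _ in range(2000)]

first, second = half_sample_moments(white_noise)
print("first half  (mean, variance):", first)
print("second half (mean, variance):", second)
print("autocovariance at lags 0..3:",
      [round(autocovariance(white_noise, k), 3) for k in range(4)])
```

For a stationary series the two halves should give similar means and variances; for white noise the autocovariances at lags above zero should also be close to zero.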
B. Non-stationary stochastic process: a stochastic process that has a time-varying
mean, a time-varying variance, or both.
The classic example of a non-stationary stochastic process is the random walk model
(RWM).
For instance, asset prices such as stock prices or exchange rates are often said to follow a random
walk; that is, they are non-stationary.
The random walk model is classified into (1) random walk without drift (i.e., no constant or
intercept term) and (2) random walk with drift (i.e., a constant term is present).
Random Walk without Drift: Suppose Ut is a white noise error term with zero
mean and constant variance. Then the series Yt is said to be a random walk
without drift if: Yt=Yt-1+Ut.
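Starting from an initial value Y0 and substituting successively:
Y1 = Y0 + u1 -------------------------(1)
Y2 = Y1 + u2 = Y0 + u1 + u2--------------------------(2)
Y3 = Y2 + u3 = Y0 + u1 + u2 + u3------------------------(3)
In general, Yt = Y0 + u1 + u2 + ... + ut, so every past shock remains in Yt.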
Random Walk with Drift: Yt=α+Yt-1+Ut-------------------(4)
Yt − Yt−1 = ∆Yt = α + Ut-----------------------------(5)
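Because a random walk without drift accumulates all past shocks (Yt = Y0 + u1 + ... + ut), its variance is tσ², which grows with t. The following sketch illustrates this by simulation; the function names, the seed, the number of paths, and the choice σ = 1 are illustrative assumptions.

```python
import random

def random_walk(n, drift=0.0, y0=0.0, sigma=1.0, rng=random):
    """Return [Y1, ..., Yn] with Yt = drift + Y(t-1) + Ut, Ut ~ N(0, sigma^2)."""
    path, y = [], y0
    for _ in range(n):
        y = drift + y + rng.gauss(0.0, sigma)
        path.append(y)
    return path

def cross_path_variance(paths, t):
    """Variance of Yt across the simulated paths (t is 1-based)."""
    values = [p[t - 1] for p in paths]
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)

# Hypothetical setup: 2000 independent random walks without drift, 100 periods each.
rng = random.Random(42)
paths = [random_walk(100, rng=rng) for _ in range(2000)]

for t in (10, 50, 100):
    # Each printed variance should be roughly t * sigma^2 = t.
    print(t, round(cross_path_variance(paths, t), 1))
```

Setting drift to a nonzero value in random_walk gives the random walk with drift of Eq. (4); its mean then also changes with t.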
Trend stationary (TS) and Difference stationary (DS) stochastic processes
Suppose a first-order autoregressive model, AR(1), is given by: Yt = ρYt−1 + Ut------(6)
If ρ = 1, we face what is known as the unit root problem, that is, a situation of
non-stationarity; in this case the variance of Yt is not constant.
On the contrary, if |ρ| < 1, that is, if the absolute value of ρ is less than one, then it
can be shown that the time series Yt is stationary in the sense we have defined it.
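For |ρ| < 1 the AR(1) process has the constant variance σ²/(1 − ρ²), whereas for ρ = 1 the variance keeps growing. The sketch below compares the two cases by simulation; the helper names, seed, sample size, and the values ρ = 0.5 and σ = 1 are assumptions chosen only for illustration.

```python
import random

def simulate_ar1(rho, n, sigma=1.0, y0=0.0, seed=0):
    """Simulate Yt = rho * Y(t-1) + Ut with Ut ~ N(0, sigma^2)."""
    rng = random.Random(seed)
    y, out = y0, []
    for _ in range(n):
        y = rho * y + rng.gauss(0.0, sigma)
        out.append(y)
    return out

def sample_variance(xs):
    """Sample variance (divides by n)."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

stationary = simulate_ar1(0.5, 20000)   # |rho| < 1
unit_root = simulate_ar1(1.0, 20000)    # rho = 1

# Stationary case: the sample variance should be close to 1 / (1 - 0.5**2).
print(sample_variance(stationary), 1.0 / (1.0 - 0.5 ** 2))

# Unit root case: the sample variance does not settle down as the span grows.
print(sample_variance(unit_root[:2000]), sample_variance(unit_root))
```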
If the trend in a time series is completely predictable and not variable, we call it a
deterministic trend, whereas if it is not predictable, we call it a stochastic trend. To
make the definition more formal, consider the following model of the time series.
Yt = β1 + β2t + β3Yt−1 + Ut----------------------------------(7)
If in the above model β1 = 0, β2 = 0, β3 = 1, we get Yt = Yt−1 + Ut------------------(8)
which is the random walk model without drift and is non-stationary. Its first difference is ∆Yt = (Yt − Yt−1) = Ut.
Because the first difference of Eq. (8) is stationary, we call Eq. (8) a difference stationary process (DSP).
Now consider the random walk with drift, obtained from Eq. (7)
if β1 ≠ 0, β2 = 0, β3 = 1.
The model becomes Yt = β1 + Yt−1 + Ut-------(9), which can be written as
∆Yt = Yt − Yt−1 = β1 + Ut. Yt then drifts upward (β1 > 0) or downward (β1 < 0); this is a stochastic trend.
It is also a difference stationary process (DSP) because the non-stationarity in Yt can be eliminated by
taking first differences of the time series.
On the contrary, if β1 ≠ 0, β2 ≠ 0, β3 = 0, Eq. (7) becomes
Yt = β1 + β2t + Ut, which is called a trend stationary process (TSP).
The mean of Yt is β1 + β2t, which is not constant, but if the values of β1 and β2 are
known we can forecast the mean perfectly.
Thus, if we subtract the mean of Yt from Yt, the resulting series will be stationary;
hence such a process is known as trend stationary.
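In practice β1 and β2 are usually unknown, so one common approach is to estimate them by OLS and work with the residuals. The sketch below detrends a simulated trend stationary series in that way; the parameter values (5.0 and 0.3), the seed, and the function names are hypothetical choices for illustration only.

```python
import random

def fit_linear_trend(y):
    """OLS of Yt on a constant and t = 1..n; returns (b1, b2)."""
    n = len(y)
    t = list(range(1, n + 1))
    t_bar = sum(t) / n
    y_bar = sum(y) / n
    b2 = (sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y))
          / sum((ti - t_bar) ** 2 for ti in t))
    b1 = y_bar - b2 * t_bar
    return b1, b2

def detrend(y):
    """Subtract the fitted trend b1 + b2*t from Yt."""
    b1, b2 = fit_linear_trend(y)
    return [yi - (b1 + b2 * (i + 1)) for i, yi in enumerate(y)]

# Hypothetical trend stationary process: Yt = 5.0 + 0.3*t + Ut, Ut ~ N(0, 1).
rng = random.Random(1)
y = [5.0 + 0.3 * t + rng.gauss(0.0, 1.0) for t in range(1, 501)]

b1, b2 = fit_linear_trend(y)
residuals = detrend(y)
print("estimated b1, b2:", round(b1, 3), round(b2, 4))
print("mean of detrended series:", sum(residuals) / len(residuals))
```

The detrended series fluctuates around zero with no trend, which is what the definition of a trend stationary process requires.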
Integrated Stochastic Process
The random walk model is but a specific case of a more
general class of stochastic processes known as integrated processes. The order of integration is the number of times a non-stationary
series must be differenced to become stationary.
The random walk model without drift is non-stationary, but its first difference
is stationary.
Thus, we say that the RWM without drift is integrated of order 1, denoted as I(1).
Similarly, if a time series becomes stationary only after being differenced twice, it is
integrated of order two, I(2).
In general, if a (nonstationary) time series has to be differenced d times to make it
stationary, the time series is said to be integrated of order d, I(d).
On the other hand, if a time series Yt is stationary to begin with (i.e., it does not
require any differencing), it is said to be integrated of order zero, denoted by Yt ∼ I(0).
Most economic time series are generally I(1); that is, they generally become stationary
only after taking their first differences.
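Differencing d times is a simple mechanical operation. The sketch below is a minimal illustration with a hypothetical helper named difference and a made-up shock sequence: the shocks are cumulated twice to build a series that needs two differences, and differencing twice recovers the shocks.

```python
from itertools import accumulate

def difference(series, d=1):
    """Apply the first-difference operator d times."""
    out = list(series)
    for _ in range(d):
        out = [out[i] - out[i - 1] for i in range(1, len(out))]
    return out

# Hypothetical shocks, cumulated once (I(1)-type series) and twice (I(2)-type series).
shocks = [1, -2, 3, 0, 1]
once = list(accumulate(shocks))    # [1, -1, 2, 2, 3]
twice = list(accumulate(once))     # [1, 0, 2, 4, 7]

print(difference(twice, 1))  # [-1, 2, 2, 3], the once-cumulated series after its first value
print(difference(twice, 2))  # [3, 0, 1], the original shocks after the first two values
```

Each round of differencing shortens the series by one observation, which is why the recovered values start later than the originals.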
Properties of Integrated Series
1. If Xt ∼ I(0) and Yt ∼ I(1), then Zt = (Xt + Yt) ∼ I(1); that is, a linear combination or
sum of stationary and nonstationary time series is nonstationary.
2. If Xt ∼ I(d), then Zt = (a + bXt) ∼ I(d), where a and b are constants. That is, a linear
combination of an I(d) series is also I(d). Thus, if Xt ∼ I(0), then Zt = (a + bXt) ∼
I(0).
3. If Xt ∼ I(d1) and Yt ∼ I(d2), then Zt = (aXt + bYt) ∼ I(d2), where d1 < d2.
4. If Xt ∼ I(d) and Yt ∼ I(d), then Zt = (aXt + bYt) ∼ I(d*); d* is generally equal to d,
but in some cases d* < d.
Thus, we need to pay careful attention when combining series integrated of different orders. Suppose
a simple linear regression (SLR) is given by: Yt = β1 + β2Xt + Ut. Assume that Xt is I(1) and Yt is I(0).
Because Xt is non-stationary in levels, its variance increases indefinitely as
the time span grows.
As a result, the estimate of β2 approaches zero as the sample size increases, and it will not
even have an asymptotic distribution.
Spurious Regression
To see why stationary time series are so important, consider the following two
random walk models:
Yt = Yt−1 + ut ------------------------(10)
Xt = Xt−1 + vt-------------------------(11)
Both Yt and Xt are non-stationary in levels; each is I(1).
Assume that the initial values of both Y and X are zero. Assume also
that Ut and Vt are serially uncorrelated as well as mutually uncorrelated.
Now regress Yt on Xt. Since Yt and Xt are uncorrelated I(1) processes, the R² from
the regression of Y on X should tend to zero; that is, there should not be any
relationship between the two variables.
However, we may obtain a high R² and a statistically significant coefficient.
This is called a spurious or nonsense regression.
It occurs because both series contain stochastic trends, so an apparent correlation between them can persist even though they are unrelated.
The spuriousness of the regression can be detected by regressing the first difference of
Yt (∆Yt) on the first difference of Xt (∆Xt):
∆Yt = β1 + β2∆Xt + Ut
If the R² from this regression is close to zero, the relationship obtained by regressing Yt on Xt in levels
is spurious (nonsense).
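The contrast between the regression in levels and the regression in first differences can be seen in a small Monte Carlo experiment. The sketch below is illustrative only: the number of replications (300), the sample size (100), the seed, and the helper names are assumptions. It generates pairs of independent random walks and averages the R² from each regression.

```python
import random

def r_squared(x, y):
    """R^2 from OLS of y on a constant and x (the squared sample correlation)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)

def random_walk(n, rng):
    """Random walk without drift, started at zero, with N(0, 1) shocks."""
    y, out = 0.0, []
    for _ in range(n):
        y += rng.gauss(0.0, 1.0)
        out.append(y)
    return out

def first_difference(series):
    return [series[i] - series[i - 1] for i in range(1, len(series))]

rng = random.Random(7)
levels_r2, diffs_r2 = [], []
for _ in range(300):
    y = random_walk(100, rng)
    x = random_walk(100, rng)   # generated independently of y
    levels_r2.append(r_squared(x, y))
    diffs_r2.append(r_squared(first_difference(x), first_difference(y)))

avg_levels = sum(levels_r2) / len(levels_r2)
avg_diffs = sum(diffs_r2) / len(diffs_r2)
print("average R^2, levels:     ", round(avg_levels, 3))
print("average R^2, differences:", round(avg_diffs, 3))
```

Although the two series are unrelated by construction, the average R² in levels is typically far from zero, while the average R² in first differences stays close to zero.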
Tests of Stationarity: Unit root test.
There are several tests of stationarity, but we discuss only those that are
prominently discussed in the literature.
Accordingly, graphical analysis and the Dickey–Fuller (DF) and Augmented Dickey–Fuller (ADF)
tests of stationarity are discussed in this chapter.
Graphical Analysis for Unit root Test
This is an informal method of testing stationarity.
Here, we plot the values of the given variable recorded over time
against the time period.
If the variable grows over time (shows an upward trend), the plot
suggests that the mean and variance of the variable have been changing and that the variable
is not stationary.
For instance, the GDP of Ethiopia over time is given by:
[Figure: time plot of gdp (vertical axis from 0 to 1E+11) against year (horizontal axis from 1985 to 2025)]
As presented in the above graph, GDP shows an upward trend.
As a result, the mean and variance of GDP change over time, and the graph suggests
that GDP is not stationary.
Dickey–Fuller test
Suppose a stochastic process is given by: Yt = ρYt−1 + Ut------------------(12)
If −1 < ρ < 1, the series is stationary.
If ρ = 1, the series has a unit root and is non-stationary.
The idea is that, to test for a unit root, we regress Yt on Yt−1: if ρ = 1, Yt is
non-stationary, but if the absolute value of ρ is less than 1, Yt is stationary.
However, for practical purposes we subtract Yt−1 from both sides:
Yt − Yt−1 = ρYt−1 − Yt−1 + Ut
∆Yt= (ρ-1)Yt-1+Ut
∆Yt =δYt−1 + Ut---------------------------------------(13)
Where δ=ρ-1
After estimating Eq. (13), we test the hypotheses
H0: δ = 0 (that is, ρ = 1; there is a unit root)
H1: δ < 0 (that is, ρ < 1; the series is stationary)
Thus, if we fail to reject H0, we have a unit root, meaning the time series under
consideration is nonstationary.
However, if H0 is rejected in favor of δ < 0, we conclude that Yt is stationary.
The next question is how to test the significance of δ. The
usual t test cannot be used, because under the null hypothesis the t value of the estimated coefficient of Yt−1
does not follow the t distribution even in large samples; that is, it does not have an
asymptotic normal distribution.
Alternative: Dickey and Fuller showed that, under the null hypothesis that δ = 0, the
estimated t value of the coefficient of Yt−1 in Eq. (13) follows the τ (tau) statistic.
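The tau statistic is computed exactly like an ordinary t ratio; only the critical values differ. The sketch below estimates the regression ∆Yt = δYt−1 + Ut (no constant, no trend) by OLS and returns the estimate of δ with its tau value. The function name, the seed, and the simulated AR(1) series with ρ = 0.5 are assumptions for illustration; the critical values themselves must still be taken from the DF or MacKinnon tables.

```python
import math
import random

def df_tau_no_constant(y):
    """OLS of dY_t on Y_(t-1) without a constant; returns (delta_hat, tau)."""
    lagged = y[:-1]                                    # Y_(t-1)
    dy = [y[t] - y[t - 1] for t in range(1, len(y))]   # dY_t
    sxx = sum(v * v for v in lagged)
    delta_hat = sum(l * d for l, d in zip(lagged, dy)) / sxx
    residuals = [d - delta_hat * l for l, d in zip(lagged, dy)]
    s2 = sum(e * e for e in residuals) / (len(dy) - 1)  # one estimated parameter
    se = math.sqrt(s2 / sxx)
    return delta_hat, delta_hat / se

# Hypothetical stationary series: Yt = 0.5*Y(t-1) + Ut, so the true delta is -0.5.
rng = random.Random(3)
y, series = 0.0, []
for _ in range(500):
    y = 0.5 * y + rng.gauss(0.0, 1.0)
    series.append(y)

delta_hat, tau = df_tau_no_constant(series)
print("delta_hat:", round(delta_hat, 3), "tau:", round(tau, 2))
```

For this stationary series the tau value is strongly negative; for a series with a unit root it would usually be much closer to zero.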
These authors have computed the critical values of the tau statistic on the basis of
Monte Carlo simulations.
In the literature the tau statistic test is known as the Dickey–Fuller (DF) test.
The DF test is estimated in three different forms, that is, under three different
specifications of the null hypothesis. In each case the null hypothesis is that δ = 0 (there is a unit root).
∆Yt = δYt−1 + Ut------------------------------(14) (Yt is a random walk without drift)
∆Yt = β1 + δYt−1 + Ut--------------------------(15) (Yt is a random walk with drift)
∆Yt = β1 + β2t + δYt−1 + Ut--------(16) (Yt is a random walk with drift around a deterministic trend)
After estimating Eq. (14), Eq. (15), and Eq. (16) by OLS, we calculate the tau statistic (the estimated δ divided by its standard error).
If the tau statistic is more negative than the DF or MacKinnon critical tau value (that is, it is negative and larger in absolute value), we
reject H0 and conclude that Yt is stationary.
But if the tau statistic is not more negative than the critical tau value, we fail to reject
H0 and conclude that Yt is nonstationary.
The Augmented Dickey–Fuller (ADF) test
In the Dickey–Fuller (DF) test above we assumed that the error terms Ut are uncorrelated.
For the case in which the Ut are correlated, Dickey and Fuller developed a test known as
the augmented Dickey–Fuller (ADF) test.
This test is conducted by “augmenting” the preceding three equations (Eq. (14),
Eq. (15), and Eq. (16)) by adding lagged values of the dependent variable ∆Yt.
The ADF test here consists of estimating the following regression:
∆Yt = β1 + β2t + δYt−1 + Σ(i=1 to m) αi∆Yt−i + Ut----------------------(17)
where m is the number of lagged difference terms included.
In ADF we still test whether δ = 0 and the ADF test follows the same asymptotic
distribution as the DF statistic.
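An ADF regression with a constant, a trend, and m lagged differences can be estimated with ordinary least squares. The sketch below uses a small hand-written OLS routine (normal equations solved by Gauss-Jordan elimination) so that it needs only the standard library; the function names, the seed, the choice m = 1, and the simulated AR(1) series are all assumptions for illustration. It returns only the tau value, which must then be compared with tabulated DF or MacKinnon critical values.

```python
import math
import random

def ols(X, y):
    """OLS via the normal equations; returns (coefficients, standard errors)."""
    n, k = len(X), len(X[0])
    # Build [X'X | I] and reduce it to [I | (X'X)^-1] by Gauss-Jordan elimination.
    a = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
         + [1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(a[r][c]))   # partial pivoting
        a[c], a[p] = a[p], a[c]
        pivot = a[c][c]
        a[c] = [v / pivot for v in a[c]]
        for r in range(k):
            if r != c:
                f = a[r][c]
                a[r] = [vr - f * vc for vr, vc in zip(a[r], a[c])]
    inv = [row[k:] for row in a]
    xty = [sum(X[r][i] * y[r] for r in range(n)) for i in range(k)]
    beta = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    resid = [y[r] - sum(X[r][j] * beta[j] for j in range(k)) for r in range(n)]
    s2 = sum(e * e for e in resid) / (n - k)
    se = [math.sqrt(s2 * inv[i][i]) for i in range(k)]
    return beta, se

def adf_tau(y, m=1):
    """tau for delta in dY_t = b1 + b2*t + delta*Y_(t-1) + sum_i a_i*dY_(t-i) + U_t."""
    dy = [y[t] - y[t - 1] for t in range(1, len(y))]   # dy[j] = y[j+1] - y[j]
    X, target = [], []
    for j in range(m, len(dy)):
        # Regressors: constant, trend, lagged level, then m lagged differences.
        X.append([1.0, float(j + 1), y[j]] + [dy[j - i] for i in range(1, m + 1)])
        target.append(dy[j])
    beta, se = ols(X, target)
    return beta[2] / se[2]

# Hypothetical stationary series: Yt = 0.5*Y(t-1) + Ut.
rng = random.Random(5)
y, series = 0.0, []
for _ in range(500):
    y = 0.5 * y + rng.gauss(0.0, 1.0)
    series.append(y)

tau_adf = adf_tau(series, m=1)
print("ADF tau:", round(tau_adf, 2))
```

Dropping the trend column gives the augmented version of Eq. (15), and dropping both the constant and the trend gives the augmented version of Eq. (14).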
Example: unit root test for GDP.
Augmented Dickey-Fuller test for unit root          Number of obs = 28

                       Interpolated Dickey-Fuller
         Test         1% Critical    5% Critical    10% Critical
         Statistic    Value          Value          Value
Z(t)     0.943        -4.352         -3.588         -3.233

MacKinnon approximate p-value for Z(t) = 1.0000
As presented in the table, the tau statistic (0.943) is not more negative than the critical tau
values at the 10%, 5%, and 1% levels (-3.233, -3.588, and -4.352). Thus, we fail to reject H0,
and GDP is non-stationary in levels.
N.B.: The test is one-sided (H1: δ < 0). H0 is rejected only when the tau statistic is negative and exceeds the critical tau value in absolute value.
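The decision rule for the GDP example can be written out as a short check. The sketch below uses the test statistic and critical values reported in the table; the function name df_decision is a hypothetical helper, not part of any statistical package.

```python
def df_decision(tau, critical_values):
    """For each level, True means H0 (unit root) is rejected: tau is below the critical value."""
    return {level: tau < cv for level, cv in critical_values.items()}

# Values reported in the table for GDP.
critical = {"1%": -4.352, "5%": -3.588, "10%": -3.233}
decision = df_decision(0.943, critical)
print(decision)  # {'1%': False, '5%': False, '10%': False}
```

Since H0 is not rejected at any level, the check agrees with the conclusion that GDP is non-stationary in levels.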