Lecture 2

STAT 411 - Time Series

Dr Mabikwa

BIUST

February 13, 2023



CHAPTER 2

1 Review of Expectations
    Expectation
    Variance
    Covariance
    Correlation
    Auto-correlation
    Correlogram
2 Backshift Operator
3 Stochastic Processes
    Expectations of Stochastic Processes
    Examples of Stochastic Processes
        White Noise
        Random Walk
4 Fitted Models and Diagnostic Plots
    Simulated Random Walk Series
    Random walk with drift
5 Auto Regressive Models
    Stationary and non-stationary AR processes
    Second-order properties of an AR(1) model



Expectation

The expected value, commonly abbreviated to expectation, E, of a variable, or of a function of a variable, is its mean value in a population.

Assume X is a continuous random variable with pdf f(x). The expected value of X is defined as

E(X) = ∫_{-∞}^{∞} x f(x) dx

provided ∫_{-∞}^{∞} |x| f(x) dx < ∞. Otherwise, E(X) is undefined. We sometimes write µ_X = E(X).
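A minimal R sketch (illustrative, not from the lecture): for X ~ N(0, 1), a Monte Carlo average approximates E(X), and numerical integration of x f(x) gives the same value.

set.seed(1)
x <- rnorm(1e5)                  # draw a large sample from X ~ N(0, 1)
mean(x)                          # Monte Carlo estimate of E(X), close to 0
integrate(function(x) x * dnorm(x), -Inf, Inf)$value   # E(X) by numerical integration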



Properties

1 If h(x) is a function such that ∫_{-∞}^{∞} |h(x)| f(x) dx < ∞, then

E(h(X)) = ∫_{-∞}^{∞} h(x) f(x) dx.

2 If X and Y have a joint pdf f(x, y) and ∫_{-∞}^{∞} ∫_{-∞}^{∞} |h(x, y)| f(x, y) dx dy < ∞, then

E(h(X, Y)) = ∫_{-∞}^{∞} ∫_{-∞}^{∞} h(x, y) f(x, y) dx dy.

3 For a, b, c real, we have

E(aX + bY + c) = aE(X) + bE(Y) + c.



Variance

The variance of X is defined as

V(X) = E[(X - µ_X)^2]

provided E(X) and E(X^2) are defined.

Properties
1 V(X) ≥ 0
2 V(aX + b) = a^2 V(X)
3 If X and Y are independent (i.e. f(x, y) = f(x)g(y), where f and g are the pdfs of X and Y, respectively), then

V(X + Y) = V(X) + V(Y)

4 V(X) = E(X^2) - µ_X^2

The standard deviation of X is σ_X = +√V(X). We sometimes write σ_X^2 = V(X).
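A quick simulation check of properties 2 and 3 (an illustrative sketch, not from the slides; the distributions are arbitrary choices):

set.seed(2)
x <- rnorm(1e5, mean = 1, sd = 2)   # V(X) = 4
y <- rnorm(1e5, mean = 3, sd = 1)   # independent of x, V(Y) = 1
var(3 * x + 5)                      # close to 3^2 * V(X) = 36
var(x + y)                          # close to V(X) + V(Y) = 5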



Covariance

The covariance of X and Y is defined as

Cov(X, Y) = E[(X - µ_X)(Y - µ_Y)]

provided E(X), E(Y), and E(XY) are all defined.

Properties
1 If X and Y are independent, then Cov(X, Y) = 0.
2 Cov(aX + b, cY + d) = ac Cov(X, Y)
3 Cov(X, Y) = E(XY) - E(X)E(Y)
4 V(X + Y) = V(X) + V(Y) + 2 Cov(X, Y)
5 Cov(X + Y, Z) = Cov(X, Z) + Cov(Y, Z)
6 Cov(X, X) = V(X)
7 Cov(X, Y) = Cov(Y, X)



Correlation

Correlation is a dimensionless measure of the linear association between a pair of variables (X, Y), obtained by standardising the covariance: it is divided by the product of the standard deviations of the variables. The correlation coefficient of X and Y is defined as

Corr(X, Y) = Cov(X, Y) / (σ_X σ_Y) = E[(X - µ_X)(Y - µ_Y)] / (σ_X σ_Y)

provided that the variances and covariance exist and σ_X > 0 and σ_Y > 0. We sometimes write ρ = Corr(X, Y).

Properties
1 |Corr(X, Y)| ≤ 1
2 Corr(aX + b, cY + d) = sgn(ac) Corr(X, Y), where

sgn(s) = 1 if s > 0; 0 if s = 0; -1 if s < 0



3 |Corr(X, Y)| = 1 iff there are constants a and b such that P(Y = aX + b) = 1.

Sample Correlation

Corr(X, Y) = Cov(X, Y) / (sd(X) sd(Y))

Example (in R, the sample correlation and covariance functions are cor and cov):

x <- c(1, 0, 3, 4, 5)
y <- c(0, 2, 1, 5, 7)
cor(x, y)

## [1] 0.7856876

cov(x, y)

## [1] 4.75
Auto-correlation

The mean and variance play an important role in the study of statistical distributions because they summarise two key distributional properties: central location and spread.

Similarly, in time series models a key role is played by the second-order properties, which include the mean, the variance, and the serial correlations.

A correlation of a variable with itself at different times is known as autocorrelation or serial correlation.

The lag k autocorrelation function (acf), ρ_k, is defined by

ρ_k = E[(Y_t - µ)(Y_{t-k} - µ)] / σ^2 = γ_k / σ^2

These can be estimated from the sample equivalents: r_k = c_k / c_0.
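As an illustration (not from the slides; the variable names are arbitrary), r_k can be computed directly from the definition and compared with R's acf():

set.seed(3)
y <- rnorm(200)
n <- length(y); ybar <- mean(y); k <- 1
c0 <- sum((y - ybar)^2) / n                                    # sample variance c_0
ck <- sum((y[1:(n - k)] - ybar) * (y[(1 + k):n] - ybar)) / n   # lag-k autocovariance c_k
ck / c0                                 # r_k; matches acf(y, plot = FALSE)$acf[k + 1]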



Example

Suppose X and Y are random variables with µ_X = 0, σ_X = 1, µ_Y = 10, σ_Y = 3, ρ = 1/2. Then

1 E(2X + Y) = 2E(X) + E(Y) = 2(0) + 10 = 10

2 Using the properties of the covariance and the definition of ρ:

V(X - Y) = V(X) + V(-Y) + 2 Cov(X, -Y)
         = V(X) + (-1)^2 V(Y) - 2 Cov(X, Y)
         = V(X) + V(Y) - 2ρ σ_X σ_Y
         = 1 + 9 - 2(1/2)(1)(3) = 7

3 Using the properties of the covariance:

Cov(X - Y, Y) = Cov(X, Y) + Cov(-Y, Y)
              = Cov(X, Y) - Cov(Y, Y)
              = Cov(X, Y) - V(Y)
              = (1/2)(1)(3) - 9 = -15/2
R example
unemp <- read.csv("Desktop/bw_youth_unempl.csv")
acf(unemp$percent)



Correlogram

The acf function produces a plot of r_k against k, which is called the correlogram.

The x-axis gives the lag (k) and the y-axis gives the autocorrelation (r_k) at each lag. The unit of lag is the sampling interval (0.1 second for the wave-height example below). Correlation is dimensionless, so there is no unit for the y-axis.

If ρ_k = 0, the sampling distribution of r_k is approximately normal, with a mean of -1/n and a variance of 1/n. The dotted lines on the correlogram are drawn at

-1/n ± 2/√n

At lag zero, r_k is always equal to 1.

The correlogram for wave heights has a well-defined shape that appears like a sampled damped cosine function (typical of an AR(2) process; covered later).
Backshift operator

Definition: The backward shift operator B is defined by

B Y_t = Y_{t-1}

The backward shift operator is sometimes called the 'lag operator'. By repeatedly applying B, it follows that

B^n Y_t = Y_{t-n}

Using B, the random walk Y_t = Y_{t-1} + e_t can be written as

Y_t = B Y_t + e_t  ⟹  (1 - B) Y_t = e_t  ⟹  Y_t = (1 - B)^{-1} e_t

Treating B as a number, this can be expanded as a geometric series, so that

Y_t = (1 + B + B^2 + ...) e_t = Φ(B) e_t = e_t + e_{t-1} + e_{t-2} + ...



NB: Φ(B) = 0 is called the characteristic equation of the series. The roots of the polynomial determine stationarity:

a If all roots exceed unity in absolute value, the process is stationary.
b The random walk has root B = 1, so it is non-stationary.

Difference Operator

Differencing adjacent terms of a series can transform a non-stationary series into a stationary one. For example, if Y_t is a random walk it is non-stationary, but first differencing makes it stationary: e_t = Y_t - Y_{t-1}. Differencing is thus a useful filtering procedure:

∇x_t = x_t - x_{t-1} = (1 - B) x_t

and in general

∇^n = (1 - B)^n

Proof: exercise.
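A small R check (illustrative, not from the slides) that R's diff() implements ∇ and that its differences argument applies ∇^n:

set.seed(4)
y <- cumsum(rnorm(20))          # a short random walk, non-stationary
d1 <- diff(y)                   # first differences recover the white noise steps
all.equal(diff(y, differences = 2), diff(diff(y)))   # ∇^2 computed two ways agrees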
Stochastic Processes

Definition: A stochastic process is a collection of random variables that are ordered in time.

We write Y(t) if time is continuous (-∞ < t < ∞) and Y_t if time is discrete (t = 0, ±1, ±2, ...).

For now we will only consider stochastic processes indexed by discrete time and write {Y_t : t = 0, ±1, ±2, ...} to represent {..., Y_{-1}, Y_0, Y_1, ...}.

The finite set of observations {Y_1, ..., Y_n} that we usually have will be considered a random sample from {Y_t : t = 0, ±1, ±2, ...}.



Expectations of Stochastic Processes

Consider the stochastic process {Y_t : t = 0, ±1, ±2, ...}.

The mean function is the expected value of the process at time t, defined by

µ_t = E(Y_t),  t = 0, ±1, ±2, ...

The autocovariance function (ACVF) is defined as

γ_{t,s} = Cov(Y_t, Y_s) = E(Y_t Y_s) - µ_t µ_s,  s, t = 0, ±1, ±2, ...

The autocorrelation function (ACF) is defined by

ρ_{t,s} = Corr(Y_t, Y_s) = γ_{t,s} / √(γ_{t,t} γ_{s,s}),  s, t = 0, ±1, ±2, ...



Properties
1 γ_{t,t} = V(Y_t), ρ_{t,t} = 1
2 |γ_{t,s}| ≤ √(γ_{t,t} γ_{s,s}), |ρ_{t,s}| ≤ 1
3 If a_1, ..., a_n and b_1, ..., b_m are constants and t_1, ..., t_n and s_1, ..., s_m are time points, then

Cov( ∑_{i=1}^{n} a_i Y_{t_i}, ∑_{j=1}^{m} b_j Y_{s_j} ) = ∑_{i=1}^{n} ∑_{j=1}^{m} a_i b_j γ_{t_i,s_j}



Examples of Stochastic Processes

1 White Noise

A white noise process is a sequence of iid random variables {e_t} with a normal distribution. It is the simplest stationary process, characterised by the following:

constant mean

constant variance:

γ_k = Cov(e_t, e_{t+k}) = V(e_t) if k = 0; 0 if k ≠ 0

no correlation over time:

ρ_k = 1 if k = 0; 0 if k ≠ 0



White Noise Examples



Simulation of White Noise
set.seed(1)
w1 <- rnorm(50)
plot(w1, type = "l", xlab = "Time")

Figure: Time plot of simulated Gaussian white noise series



Simulation of White Noise: Mean=4 & SD=2
set.seed(20)
w2 <- rnorm(50,4,2)
plot(w2, type = "l", xlab = "Time")

Figure: Simulated Gaussian white noise series with mean=4 and sd=2



Correlogram
par(mfrow=c(1,2))
acf(w1)
acf(w2)

Figure: Correlogram of simulated white noise series. The underlying autocorrelations are all zero (except at lag 0); the statistically significant value at lag 4 (w2) is due to sampling variation.



2 The Random Walk

Definition: A random walk has no specified mean or variance and strong dependence over time; its changes over time are white noise (WN). Let

e_1, e_2, ...

be iid random variables with mean 0 and variance σ_e^2. Now construct a time series {Y_t} as

Y_1 = e_1
Y_2 = e_1 + e_2
...
Y_t = e_1 + e_2 + ... + e_t

or equivalently

Y_t = Y_{t-1} + e_t, where Y_0 = 0 with probability 1.

e_t is the step size taken at time t by a random walker.
Y_t is the position of the random walker at time t, whose initial position is the origin.
The differences Y_t - Y_{t-1} (diff(Y) in R) are iid white noise.

Random Walk Examples



Random Walk Expectations

The mean function is

µ_t = E(Y_t) = E(e_1 + ... + e_t) = E(e_1) + ... + E(e_t) = 0.

The variance is

V(Y_t) = V(e_1 + ... + e_t) = V(e_1) + ... + V(e_t) = t σ_e^2,

which increases linearly with time.

The ACVF for 1 ≤ t ≤ s is

γ_{t,s} = Cov(Y_t, Y_s) = Cov( ∑_{i=1}^{t} e_i, ∑_{j=1}^{s} e_j ) = ∑_{i=1}^{t} ∑_{j=1}^{s} γ_{i,j}
        = ∑_{i=1}^{t} γ_{i,i} + ∑_{i=1}^{t} ∑_{j=1, j≠i}^{s} γ_{i,j} = t σ_e^2

since γ_{i,j} = 0 whenever i ≠ j.



The autocorrelation function is

ρ_{t,s} = γ_{t,s} / √(γ_{t,t} γ_{s,s}) = t σ_e^2 / √(t σ_e^2 · s σ_e^2) = √(t/s)

Example: Consider a very simple sequence of independent random variables e_1, e_2, ... each taking the values +1 and -1 with probability 1/2, i.e.

P(e_i = +1) = P(e_i = -1) = 1/2,  i = 1, 2, ...

The mean and variance of e_i are

E(e_i) = (1)(0.5) + (-1)(0.5) = 0 and

V(e_i) = E(e_i^2) = (1)^2(0.5) + (-1)^2(0.5) = 1.

Thus the random walk {Y_t} given by Y_t = Y_{t-1} + e_t for t ≥ 1 (with P(Y_0 = 0) = 1) has µ_t = 0 and γ_{t,s} = t for s ≥ t.
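A short R sketch of this ±1 walk (illustrative, not from the slides): across many replications the sample variance of the position at time t is close to t, matching V(Y_t) = t σ_e^2 = t.

set.seed(5)
steps <- matrix(sample(c(-1, 1), 100 * 500, replace = TRUE), nrow = 100)  # 100 walks, 500 steps
walks <- apply(steps, 1, cumsum)   # each column is one walk of length 500
var(walks[100, ])                  # sample variance of Y_100 across walks, close to 100
plot(walks[, 1], type = "l", xlab = "Time", ylab = "Y")   # one realisation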
Simulation of Random Walk in R
x <- w <- rnorm(1000)
for (t in 2:1000) x[t] <- x[t - 1] + w[t]
plot(x, type = "l", xlab="Time")

Figure: Time plot of a simulated random walk

The series exhibits an increasing trend. However, this is purely stochastic and due to the high serial correlation.
Correlogram
acf(x)

Figure: The correlogram for the simulated random walk.

A gradual decay from a high serial correlation is a notable feature of a random walk series.
Fitted Models and Diagnostic Plots
1 Simulated Random Walk Series
The first-order differences of a random walk are a white noise series, so the
correlogram of the series of differences can be used to assess whether a
given series is reasonably modelled as a random walk. For example
acf(diff(x))

Figure: Correlogram of differenced series



Interpretation

There are no obvious patterns in the correlogram (only a couple of marginally statistically significant values). These significant values can be ignored because they are small in magnitude, and roughly 5% of values are expected to exceed the significance bounds by chance alone. Thus, there is good evidence that the simulated series in x follows a random walk.

2 Random walk with drift

Generally, stakeholders expect their investment to increase in value despite the volatility of financial markets. The random walk model can be adapted to allow for this by including a drift parameter c such that

Y_t = c + Y_{t-1} + ε_t
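A hedged simulation sketch (not in the slides; the drift c = 0.1 and the noise scale are arbitrary choices):

set.seed(6)
e <- rnorm(500)
y <- cumsum(0.1 + e)     # Y_t = c + Y_{t-1} + e_t with c = 0.1 and Y_0 = 0
plot(y, type = "l", xlab = "Time")
mean(diff(y))            # estimates the drift c, close to 0.1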



R Exercise

Import the stock data provided by the code below and interpret your results.

www <- "http://www.massey.ac.nz/~pscowper/ts/HP.txt"


HP.dat <- read.table(www, header = T) ; attach(HP.dat)
plot (as.ts(Price))
DP <- diff(Price) ; plot (as.ts(DP)) ; acf (DP)
mean(DP) + c(-2, 2) * sd(DP)/sqrt(length(DP))



Auto Regressive Models

Definition: The series Y_t is an autoregressive process of order p, abbreviated to AR(p), if

Y_t = α_1 Y_{t-1} + α_2 Y_{t-2} + ... + α_p Y_{t-p} + e_t

where e_t is white noise and the α_k are the model parameters, with α_p ≠ 0 for an order p process. This can be expressed as a polynomial of order p in terms of the backward shift operator:

Φ_p(B) Y_t = (1 - α_1 B - α_2 B^2 - ... - α_p B^p) Y_t = e_t

The following should be noted (a short simulation sketch follows these notes):

1 The random walk is the special case AR(1) with α_1 = 1. [Refer to slide 23]
2 The exponential smoothing model is the special case α_i = α(1 - α)^i for i = 1, 2, ..., with p → ∞.



3 The model is a regression of Y_t on past terms from the same series; hence the use of the term 'autoregressive'.
4 The prediction at time t is given by

Ŷ_t = α_1 Y_{t-1} + α_2 Y_{t-2} + ... + α_p Y_{t-p}

(the white noise term e_t is unpredictable and so does not appear in the prediction).
5 The model parameters can be estimated by minimising the sum of squared errors.
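As an illustration (not from the slides), stats::arima.sim simulates a stationary AR(p) directly; the AR(2) coefficients below are arbitrary but satisfy the stationarity condition discussed next.

set.seed(7)
y <- arima.sim(n = 200, model = list(ar = c(0.5, 0.25)))   # AR(2) with alpha_1 = 0.5, alpha_2 = 0.25
plot(y, xlab = "Time")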



Stationary and non-stationary AR processes

The equation Φ_p(B) = 0, where B is treated as a real or complex number, is called the characteristic equation. The roots of the characteristic polynomial Φ_p(B) must all exceed unity in absolute value for the process to be stationary. For example, the random walk has Φ(B) = 1 - B with root B = 1, so it is non-stationary.

The following four examples illustrate the procedure for determining whether an AR process is stationary or non-stationary (see the R sketch after the examples):

1 The AR(1) model Y_t = (1/2) Y_{t-1} + e_t is stationary because the root of 1 - (1/2)B = 0 is B = 2, which is greater than 1 in absolute value.
2 The AR(2) model Y_t = Y_{t-1} - (1/4) Y_{t-2} + e_t is stationary. The proof of this result is obtained by first expressing the model in terms of the backward shift operator: (1/4)(B^2 - 4B + 4) Y_t = e_t, i.e., (1/4)(B - 2)^2 Y_t = e_t. The roots are given by solving Φ(B) = (1/4)(B - 2)^2 = 0 and are therefore B = 2 (repeated). Since the roots are greater than unity, this AR(2) model is stationary.
3 The model Y_t = (1/2) Y_{t-1} + (1/2) Y_{t-2} + e_t is non-stationary because one of the roots is unity. To prove this, first express the model in terms of the backward shift operator: -(1/2)(B^2 + B - 2) Y_t = e_t, i.e., -(1/2)(B - 1)(B + 2) Y_t = e_t. The polynomial Φ(B) = -(1/2)(B - 1)(B + 2) has roots B = 1, -2. Since there is a unit root (B = 1), the model is non-stationary. NB: the other root (B = -2) exceeds unity in absolute value; however, the presence of the unit root makes the whole process non-stationary.
4 The AR(2) model Y_t = -(1/4) Y_{t-2} + e_t is stationary because the roots of 1 + (1/4)B^2 = 0 are B = ±2i, complex numbers with i = √(-1), each having an absolute value of 2, exceeding unity.
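These checks can be automated with base R's polyroot(), which takes the polynomial coefficients in increasing order of power (an illustrative sketch, not from the slides):

Mod(polyroot(c(1, -0.5)))        # 1 - 0.5B: root modulus 2, stationary
Mod(polyroot(c(1, -1, 0.25)))    # 1 - B + 0.25B^2: repeated root, modulus 2, stationary
Mod(polyroot(c(1, -0.5, -0.5)))  # 1 - 0.5B - 0.5B^2: moduli 1 and 2; unit root, non-stationary
Mod(polyroot(c(1, 0, 0.25)))     # 1 + 0.25B^2: complex roots, both modulus 2, stationary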



Second-order properties of an AR(1) model

From the AR equation provided above, the AR(1) process is given by

Y_t = α Y_{t-1} + e_t

where e_t is a white noise series with mean zero and variance σ^2. It can be shown that the second-order properties of AR(1) are

µ_t = 0

γ_k = α^k σ^2 / (1 - α^2)
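A quick numerical check (illustrative): simulate a long AR(1) series and compare the sample moments with the formula; with α = 0.7 and σ^2 = 1, γ_0 = 1/(1 - 0.49) ≈ 1.96.

set.seed(9)
alpha <- 0.7
y <- arima.sim(n = 1e5, model = list(ar = alpha))   # AR(1) with sigma^2 = 1
var(y)                         # close to sigma^2 / (1 - alpha^2) = 1.96
cov(y[-1], y[-length(y)])      # lag-1 autocovariance, close to alpha * 1.96 = 1.37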



Proofs

Using B, a stable AR(1) process (|α| < 1) can be written as

(1 - αB) Y_t = e_t

⟹ Y_t = (1 - αB)^{-1} e_t = e_t + α e_{t-1} + α^2 e_{t-2} + ... = ∑_{i=0}^{∞} α^i e_{t-i}

Hence, the mean is given by

E(Y_t) = E( ∑_{i=0}^{∞} α^i e_{t-i} ) = ∑_{i=0}^{∞} α^i E(e_{t-i}) = 0



and the autocovariance is as follows:

γ_k = Cov(Y_t, Y_{t+k}) = Cov( ∑_{i=0}^{∞} α^i e_{t-i}, ∑_{j=0}^{∞} α^j e_{t+k-j} )
    = ∑_{j=k+i} α^i α^j Cov(e_{t-i}, e_{t+k-j})
    = α^k σ^2 ∑_{i=0}^{∞} α^{2i} = α^k σ^2 / (1 - α^2)



Correlogram of an AR(1) Process

The autocorrelation function for an AR(1) process is

ρ_k = α^k ; (k ≥ 0)

where |α| < 1. Thus, the correlogram decays to zero more rapidly for smaller |α|.

Example: two correlograms, for positive and negative values of α:

rho <- function(k, alpha) alpha^k
layout(1:2)
plot(0:10, rho(0:10, 0.7), type = "b", ylab = "rho",
     xlab = "lag k (alpha = 0.7)")
plot(0:10, rho(0:10, -0.7), type = "b", ylab = "rho",
     xlab = "lag k (alpha = -0.7)")

Exercise: Experiment with other values of α; for example, use much smaller values and observe a more rapid decay to zero in the correlogram.
Partial autocorrelation

Definition: The partial autocorrelation at lag k is the correlation that results after removing the effect of any correlations due to the terms at shorter lags.

Generally, the partial autocorrelation at lag k is the kth coefficient of a fitted AR(k) model; if the underlying process is AR(p), then the coefficients α_k will be zero for all k > p. Thus, an AR(p) process has a correlogram of partial autocorrelations that is zero after lag p. Hence, a plot of the estimated partial autocorrelations can be useful when determining the order of a suitable AR process for a time series.

In R, the function pacf can be used to calculate the partial autocorrelations of a time series and produce a plot of the partial autocorrelations against lag (the 'partial correlogram').



Simulation: An AR(1) Process

An AR(1) process can be simulated in R as follows:

set.seed(8)
x <- w <- rnorm(100)
for (t in 2:100) x[t] <- 0.7 * x[t - 1] + w[t]
layout(1:3)
plot(x, type = "l", xlab="Time Series plot")
acf(x)
pacf(x)



Graphs

Figure: A simulated AR(1) process, Y_t = 0.7 Y_{t-1} + e_t. NB: in the partial correlogram only the first lag is significant, which is usually the case when the underlying process is AR(1).



R session

Based on the simulated data, perform the following tasks.


1 Fit an AR model to the data giving the parameter estimates and
order of the fitted AR process.
2 Construct 95% confidence intervals for the parameter estimates of the
fitted model. Do the model parameters fall within the confidence
intervals? Explain your results
Hint: Use the ar function available in R software
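A hedged starting point (one possible approach, not a full solution): ar() selects the order by AIC by default and returns the coefficient estimates together with their asymptotic variances.

fit <- ar(x)                 # x is the simulated AR(1) series from the earlier slide
fit$order                    # order selected by AIC
fit$ar                       # parameter estimate(s)
se <- sqrt(drop(fit$asy.var.coef))   # asymptotic standard error of the estimate
fit$ar + c(-2, 2) * se       # approximate 95% confidence interval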

