
Lecture 14

This document summarizes an example of spectral analysis of monthly streamflow data from 1979-2008. Key points: 1) Spectral density analysis was used to identify periodicities in the streamflow time series. Peaks in the power spectrum indicate periodicities like annual and semi-annual cycles. 2) Statistical tests like variance decomposition were applied to determine if identified periodicities were statistically significant. 3) Significant periodicities were removed by transforming the original time series into a standardized series with the mean and standard deviation removed for each month. This isolates the stochastic component of the streamflow.

Uploaded by

Fofo Elorfi

INDIAN INSTITUTE OF SCIENCE

STOCHASTIC HYDROLOGY
Lecture -14
Course Instructor : Prof. P. P. MUJUMDAR
Department of Civil Engg., IISc.
Summary of the previous lecture

• Frequency domain analysis
– Spectral density
– Test for significance of periodicities
– Removing periodicities
– Standardizing the data

Frequency Domain Analysis
• Spectral density I(k) is the amount of variance per interval of frequency
• Spectral analysis helps identify the significant periodicities themselves
• A peak in the spectrum indicates an important contribution to variance at frequencies close to the peak
• Prominent spikes indicate periodicity
• Line spectrum: inconsistent estimate
• Power spectrum: consistent estimate
[Figure: sketch of a spectrum, I(k) against ωk]

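The idea that a periodicity shows up as a spike in the spectrum can be sketched numerically. The example below is only an illustration on synthetic data, not the river series; the scaling I(k) = (N/2)(αk² + βk²) and the function name `line_spectrum` are assumptions made for this sketch.

```python
import numpy as np

def line_spectrum(x):
    """Return angular frequencies w_k = 2*pi*k/N and line-spectrum ordinates I(k)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    t = np.arange(1, n + 1)
    ks = np.arange(1, n // 2 + 1)
    w = 2.0 * np.pi * ks / n
    # Fourier coefficients of each harmonic
    alpha = np.array([2.0 / n * np.sum(x * np.cos(wk * t)) for wk in w])
    beta = np.array([2.0 / n * np.sum(x * np.sin(wk * t)) for wk in w])
    # Assumed scaling: I(k) = (N/2) * (alpha_k^2 + beta_k^2)
    return w, n / 2.0 * (alpha**2 + beta**2)

# Synthetic monthly series with a 12-month cycle (hypothetical data)
rng = np.random.default_rng(0)
t = np.arange(1, 349)
x = 100.0 + 80.0 * np.sin(2 * np.pi * t / 12) + rng.normal(0.0, 10.0, t.size)

w, I = line_spectrum(x)
w_peak = w[np.argmax(I)]  # frequency of the largest spike
print(w_peak)
```

For a 12-month cycle the spike is expected at ω = 2π/12 ≈ 0.5236, the frequency of the first peak discussed in the example that follows.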
Example – 1
(Spectral Analysis)
Monthly streamflow data (in cumec) for a river, 1979-2008, are selected for the study; N = 348. (Part of the data is shown below.)

Year  Month      S.No.  Flow
1979  June       1      54.6
      July       2      325.4
      August     3      509.5
      September  4      99.4
      October    5      53.5
      November   6      25.8
      December   7      12.5
1980  January    8      5.6
      February   9      3.1
      March      10     2.2
      April      11     0.9
      May        12     0.81

Example – 1 (contd.)
(Spectral Analysis)

The time series plot

[Figure: time series of the original data. x-axis: Time (0 to 400); y-axis: Flow in cumec (0 to 900)]

Example – 1 (contd.)
(Spectral Analysis)
Time series plot of Zt, where

$$Z_t = X_t - Y_t$$

$$Y_t = \mu + \hat{\alpha}_1 \cos(\omega_1 t) + \hat{\beta}_1 \sin(\omega_1 t) + \hat{\alpha}_2 \cos(\omega_2 t) + \hat{\beta}_2 \sin(\omega_2 t)$$

[Figure: time series of Zt. x-axis: Time (0 to 400); y-axis: Flow in cumec (-300 to 500)]

Example – 1 (contd.)
(Spectral Analysis)

Correlogram of Zt

[Figure: correlogram of Zt, ρk against lag k (0 to 140); inset: correlogram of the original series]

Example – 1 (contd.)
(Spectral Analysis)

Power Spectrum of Zt

[Figure: power spectrum of Zt, I(k) against ω(k) (0 to 2.5); inset: power spectrum of the original series]

Example – 1 (contd.)
(Spectral Analysis)

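• Significance test:

$$I = \frac{\gamma^2 (N - 2)}{4 \hat{\rho}_1}$$

where $\gamma^2 = \alpha^2 + \beta^2$ and

$$\hat{\rho}_1 = \frac{1}{N} \left[ \sum_{t=1}^{N} \left\{ x_t - \hat{\alpha} \cos(\omega_k t) - \hat{\beta} \sin(\omega_k t) \right\} \right]$$

For the first peak, ω1 = 0.5236, α1 = 29.28, β1 = 172.93

Therefore

$$\gamma^2 = 29.28^2 + 172.93^2 = 30762$$
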
Example – 1 (contd.)
(Spectral Analysis)
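$$\hat{\rho}_1 = \frac{1}{N} \left[ \sum_{t=1}^{N} \left\{ x_t - \alpha_1 \cos(\omega_1 t) - \beta_1 \sin(\omega_1 t) \right\} \right] = \frac{1}{348} \times 36810.56 = 105.78$$

$$I = \frac{\gamma^2 (N - 2)}{4 \hat{\rho}_1} = \frac{30762\,(348 - 2)}{4 \times 105.78} = 25155$$

From the F distribution table at the 5% significance level (95% confidence),

F(2, 346) = 3.0
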
Example – 1 (contd.)
(Spectral Analysis)

I > F(2, 346)

Therefore the periodicity is significant.

The values for the other periodicities are as follows:

ωk      Statistic  F(2, N-2)
0.5236  25155      3.0
1.0472  11242      3.0
1.5708  4104       3.0
2.0944  1295       3.0

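The arithmetic of the test for the first peak can be reproduced with a few lines. This is a minimal sketch that plugs in the slide's values of α1, β1 and ρ̂1; the function name `periodicity_statistic` is made up for illustration, and the critical value 3.0 is the tabulated F(2, 346) quoted in the example rather than something computed here.

```python
def periodicity_statistic(alpha, beta, rho1, n):
    """Test statistic I = gamma^2 (N - 2) / (4 rho1), with gamma^2 = alpha^2 + beta^2."""
    gamma2 = alpha**2 + beta**2
    return gamma2 * (n - 2) / (4.0 * rho1)

N = 348
F_CRIT = 3.0  # tabulated F(2, 346), taken from the example

stat = periodicity_statistic(alpha=29.28, beta=172.93, rho1=105.78, n=N)
significant = stat > F_CRIT  # periodicity is significant when I exceeds F
print(round(stat), significant)
```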
Example – 1 (contd.)
(Spectral Analysis)

• The periodicities in the time series are removed by transforming the series into a standardized one.
• The series {Xt} is expressed as the new series {Z't}, where

$$Z'_t = \frac{X_t - \bar{X}_i}{S_i}$$

and X̄i and Si are the mean and standard deviation of month i. The mean and standard deviation for each month are tabulated.

Month  Mean    Stdev.
Jun    117.49  52.24
Jul    474.50  150.18
Aug    421.39  126.53
Sep    145.94  77.65
Oct    66.61   30.67
Nov    22.99   13.26
Dec    10.30   9.82
Jan    5.55    9.16
Feb    1.91    0.74
Mar    1.09    0.54
Apr    0.76    0.51
May    0.80    0.60

Example – 1 (contd.)
(Spectral Analysis)

$$Z'_1 = \frac{54.6 - 117.49}{52.24} = -1.204 \quad \text{(June)}$$

$$Z'_2 = \frac{325.4 - 474.5}{150.18} = -0.993 \quad \text{(July)}$$

$$Z'_3 = \frac{509.5 - 421.39}{126.53} = 0.696 \quad \text{(August)}$$

And so on.

Example – 1 (contd.)
(Spectral Analysis)

• Series of Z't (part data shown)

Year  Month      S.No.  Xt     Z't
1979  June       1      54.6   -1.204
      July       2      325.4  -0.993
      August     3      509.5  0.696
      September  4      99.4   -0.599
      October    5      53.5   -0.428
      November   6      25.8   0.212
      December   7      12.5   0.224
1980  January    8      5.6    0.006
      February   9      3.1    1.609
      March      10     2.2    2.063
      April      11     0.9    0.272
      May        12     0.81   0.019

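The standardization can be sketched as follows. The sketch uses only the first twelve flows and the tabulated monthly means and standard deviations from the example; the function name `standardize` and the assumption that the series starts in the first tabulated month (June) are choices made for this illustration. Small differences in the third decimal relative to the slides can arise from rounding of the tabulated statistics.

```python
import numpy as np

# First 12 values of the series (June 1979 to May 1980), from the example
flow = np.array([54.6, 325.4, 509.5, 99.4, 53.5, 25.8,
                 12.5, 5.6, 3.1, 2.2, 0.9, 0.81])

# Monthly means and standard deviations, June to May, from the example
month_mean = np.array([117.49, 474.50, 421.39, 145.94, 66.61, 22.99,
                       10.30, 5.55, 1.91, 1.09, 0.76, 0.80])
month_std = np.array([52.24, 150.18, 126.53, 77.65, 30.67, 13.26,
                      9.82, 9.16, 0.74, 0.54, 0.51, 0.60])

def standardize(x, mean, std, period=12):
    """Z'_t = (X_t - mean_i) / std_i, where i is the month of observation t."""
    x = np.asarray(x, dtype=float)
    month_index = np.arange(x.size) % period  # assumes the series starts in month 0 (June)
    return (x - mean[month_index]) / std[month_index]

z = standardize(flow, month_mean, month_std)
print(np.round(z[:3], 3))
```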
Example – 1 (contd.)
(Spectral Analysis)

Time series of standardized data.

[Figure: time series of the standardized data, values from -4 to 7 against Time (0 to 400); inset: time series of the original series]

Example – 1 (contd.)
(Spectral Analysis)

Correlogram of standardized data.

[Figure: correlogram of the standardized data, ρk against lag k (0 to 100); inset: correlogram of the original series]

Example – 1 (contd.)
(Spectral Analysis)

Spectrum of standardized data.

[Figure: power spectrum of the standardized data, I(k) against ω(k) (0 to 1.8); inset: power spectrum of the original series]

Example – 1 (contd.)
(Spectral Analysis)

Test for significance for standardized data:

ωk      Statistic  F(2, N-2)
0.5236  -4.7E-12   3.0
1.0472  -3.2E-12   3.0
1.5708  -3.5E-11   3.0

I < F(2, 346)

The periodicities are insignificant.

ARIMA MODELS

ARIMA Models

Regression:
Y = f(X1, X2, X3, X4, …)

Auto Regression:
Xt = f(Xt-1, Xt-2, Xt-3, …)

e.g., AR(1) model
Xt = φ1Xt-1 + εt

where εt is the error (random component, noise, residual).

[Figure: schematic of the variables X1, X2, X3 and X4 related to Y]

ARIMA Models
AR(2) model
Xt = φ1Xt-1 + φ2Xt-2 + εt

AR(p) model
Xt = φ1Xt-1 + φ2Xt-2 + … + φpXt-p + εt

$$X_t = \sum_{j=1}^{p} \phi_j X_{t-j} + \varepsilon_t$$

{φj} are the AR parameters

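To make the AR(p) recursion concrete, the sketch below generates a synthetic series from given AR parameters. The function name `simulate_ar`, the Gaussian noise, the burn-in length and the value φ1 = 0.6 are all assumptions for illustration; none of them come from the lecture.

```python
import numpy as np

def simulate_ar(phi, n, sigma=1.0, burn_in=200, seed=0):
    """Generate n values of X_t = sum_j phi_j X_{t-j} + eps_t (burn_in must be >= p)."""
    phi = np.asarray(phi, dtype=float)
    p = phi.size
    rng = np.random.default_rng(seed)
    eps = rng.normal(0.0, sigma, n + burn_in)
    x = np.zeros(n + burn_in)
    for t in range(p, n + burn_in):
        # x[t-p:t][::-1] holds (X_{t-1}, ..., X_{t-p})
        x[t] = phi @ x[t - p:t][::-1] + eps[t]
    return x[burn_in:]  # drop the start-up transient

x = simulate_ar([0.6], n=5000)
r1 = np.corrcoef(x[:-1], x[1:])[0, 1]  # sample lag-1 correlation
print(r1)
```

For an AR(1) series the lag-1 correlation should come out close to φ1, which gives a quick sanity check on the generated data.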
Partial Auto Correlation
Partial Auto Correlation (PAC):
Indicates the dependence of Xt on Xt-k when the dependence on all the intermediate variables Xt-1, Xt-2, …, Xt-k+1 is removed.

e.g., if Y is regressed upon X1 and X2, it is of interest to ask how much explanatory power X1 has when the effect of X2 is partialled out.

This means regressing Y on X2, getting the residual errors from this analysis, and regressing the residuals on X1.

Partial Auto Correlation
Y = f(X1, X2)
Y = f(X2): get the errors {ei}
e = f(X1): how much of the relationship is explained by X1 alone

For the AR(1) model
Xt = φ1Xt-1 + εt
φ1 is the Partial Auto Correlation (PAC) of order 1.

For the AR(2) model
Xt = φ1Xt-1 + φ2Xt-2 + εt
φ2 is the PAC of order 2.

Partial Auto Correlation
AR(p) model
Xt = φ1Xt-1 + φ2Xt-2 + … + φpXt-p + εt

φp is the PAC of order p.

Calculation of Partial Auto Correlations (Yule Walker equations):
The pth order Yule Walker equations are solved to get φp.

$$P_p \, \phi_p = \rho_p$$

Pp: matrix built from the auto correlation function
φp: vector of AR parameters, whose last element is the partial auto correlation
ρp: vector of auto correlations

Partial Auto Correlation
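The solution gives the partial auto correlation of order p (the last element, φp):

$$\underbrace{\begin{bmatrix} 1 & \rho_1 & \rho_2 & \cdots & \rho_{p-1} \\ \rho_1 & 1 & \rho_1 & \cdots & \rho_{p-2} \\ \rho_2 & \rho_1 & 1 & \cdots & \rho_{p-3} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \rho_{p-1} & \rho_{p-2} & \rho_{p-3} & \cdots & 1 \end{bmatrix}}_{p \times p} \underbrace{\begin{bmatrix} \phi_1 \\ \phi_2 \\ \vdots \\ \phi_p \end{bmatrix}}_{p \times 1} = \underbrace{\begin{bmatrix} \rho_1 \\ \rho_2 \\ \vdots \\ \rho_p \end{bmatrix}}_{p \times 1}$$
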
Partial Auto Correlation
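For PAC of order 1,

$$[1][\phi_1] = [\rho_1] \quad \Rightarrow \quad \phi_1 = \rho_1$$

For PAC of order 2,

$$\begin{bmatrix} 1 & \rho_1 \\ \rho_1 & 1 \end{bmatrix} \begin{bmatrix} \phi_1 \\ \phi_2 \end{bmatrix} = \begin{bmatrix} \rho_1 \\ \rho_2 \end{bmatrix}$$

$$\phi_1 + \rho_1 \phi_2 = \rho_1$$
$$\rho_1 \phi_1 + \phi_2 = \rho_2$$
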
Partial Auto Correlation
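Substituting φ2 = ρ2 - ρ1φ1 (from the second equation) into the first equation:

$$\phi_1 + \rho_1 (\rho_2 - \rho_1 \phi_1) = \rho_1$$
$$\phi_1 + \rho_1 \rho_2 - \rho_1^2 \phi_1 = \rho_1$$
$$\phi_1 = \frac{\rho_1 (1 - \rho_2)}{1 - \rho_1^2}$$

$$\phi_2 = \rho_2 - \frac{\rho_1^2 (1 - \rho_2)}{1 - \rho_1^2} = \frac{\rho_2 - \rho_2 \rho_1^2 - \rho_1^2 + \rho_2 \rho_1^2}{1 - \rho_1^2} = \frac{\rho_2 - \rho_1^2}{1 - \rho_1^2}$$

φ2 is the PAC of order 2.
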
Example – 2
Obtain φ1 and φ2 (the PACs of order 1 and order 2) for
r1 = 0.57, r2 = 0.07

Since the PAC of order 1 is φ1 = r1,
φ1 = 0.57

$$\phi_2 = \frac{r_2 - r_1^2}{1 - r_1^2} = \frac{0.07 - 0.57^2}{1 - 0.57^2} = -0.38$$

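The same result can be obtained by solving the Yule Walker system numerically for each order and keeping the last element of the solution. This is a minimal sketch; the function name `pac_from_autocorr` is made up for illustration, and the input is the pair r1, r2 from the example.

```python
import numpy as np

def pac_from_autocorr(rho):
    """rho = [rho_1, ..., rho_m]; returns the PACs of order 1, ..., m."""
    rho = np.asarray(rho, dtype=float)
    pac = []
    for p in range(1, rho.size + 1):
        r = np.concatenate(([1.0], rho[:p - 1]))  # rho_0 ... rho_{p-1}
        # p x p autocorrelation matrix with entries rho_|i-j|
        P = np.array([[r[abs(i - j)] for j in range(p)] for i in range(p)])
        phi = np.linalg.solve(P, rho[:p])  # p-th order Yule Walker equations
        pac.append(phi[-1])                # last element is the PAC of order p
    return np.array(pac)

pac = pac_from_autocorr([0.57, 0.07])
print(np.round(pac, 2))
```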
ARIMA Models
Box Jenkins Time series models:
• For stationary time series
• If the time series is stationary, the correlogram dies
down fairly quickly (e.g., within 4 or 5 lags, in most
hydrologic applications)
• If the time series is non stationary, the decay is very
slow
[Figure: two sketched correlograms, ρk against k: rapid decay for a stationary time series, slow decay for a non-stationary time series]
ARIMA Models
• If the time series is non-stationary, convert it to a stationary time series.
• One way is to standardize the time series, as described in the spectral analysis example.
• Another way is to simply difference the time series.

ARIMA Models
• Differencing:

Yt = X't = Xt - Xt-1

X't is the first order difference of Xt.

e.g., {Xt} = 2, 4, 6, 8, 10, … gives {Yt} = 2, 2, 2, …

[Figure: sketches of Xt against t and Yt against t]
ARIMA Models

X''t is the second order difference of Xt:

$$\begin{aligned} X''_t &= X'_t - X'_{t-1} \\ &= (X_t - X_{t-1}) - (X_{t-1} - X_{t-2}) \\ &= X_t - 2X_{t-1} + X_{t-2} \end{aligned}$$

Example – 3
(Differencing)
With X't = Xt - Xt-1 and X''t = X't - X't-1:

Period, t  Xt     X't     X''t
1          54.6   -       -
2          325.4  270.8   -
3          509.5  184.1   -86.7
4          99.4   -410.1  -594.2
5          53.5   -45.9   364.2
6          25.8   -27.7   18.2
7          12.5   -13.3   14.4
8          5.6    -6.9    6.4
9          3.1    -2.5    4.4
10         2.2    -0.9    1.6
11         0.9    -1.3    -0.4
12         0.81   -0.09   1.21
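First and second order differences are easy to compute with `numpy.diff`, which returns x[t] - x[t-1]. The sketch below applies it to the twelve flows of the example and to the linear sequence 2, 4, 6, 8, 10, and checks the second difference against the expanded form Xt - 2Xt-1 + Xt-2. The variable names are chosen for this illustration only.

```python
import numpy as np

x = np.array([54.6, 325.4, 509.5, 99.4, 53.5, 25.8,
              12.5, 5.6, 3.1, 2.2, 0.9, 0.81])

d1 = np.diff(x, n=1)  # first order:  X_t - X_{t-1}
d2 = np.diff(x, n=2)  # second order: X'_t - X'_{t-1}

# Second difference written out as X_t - 2 X_{t-1} + X_{t-2}
d2_expanded = x[2:] - 2.0 * x[1:-1] + x[:-2]

trend = np.array([2.0, 4.0, 6.0, 8.0, 10.0])
trend_diff = np.diff(trend)  # a linear trend differences to a constant

print(d1[:3], d2[:3], trend_diff)
```

One value is lost at the start of the series for each order of differencing, which is why the first rows of the difference columns are empty.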
Example – 4
Monthly streamflow data (in cumec) for a river, 1979-2008, are selected for the study. (Part of the data is shown below.)

Year  Month      S.No.  Flow
1979  June       1      54.6
      July       2      325.4
      August     3      509.5
      September  4      99.4
      October    5      53.5
      November   6      25.8
      December   7      12.5
1980  January    8      5.6
      February   9      3.1
      March      10     2.2
      April      11     0.9
      May        12     0.81

Example – 4 (contd.)
[Figure: three panels for the original data: time series (Flow in cumec against Time), correlogram (ρk against lag k), spectrum (I(k) against W(k))]

Example – 4 (contd.)
First order differenced data, X't = Xt - Xt-1
[Figure: three panels for the first order differenced data: time series, correlogram, spectrum]
Example – 4 (contd.)
Second order differenced data
[Figure: three panels for the second order differenced data: time series, correlogram, spectrum]
Example – 4 (contd.)
Third order differenced data
[Figure: three panels for the third order differenced data: time series, correlogram, spectrum]
Example – 4 (contd.)
Standardized data
[Figure: three panels for the standardized data: time series (Flow against Time), correlogram (ρk against lag k), spectrum (I(k) against ω(k))]
ARIMA Models
• Backshift operator B:
The effect of the operator B is to shift the argument back by one time step.

BXt = Xt-1
BXt-1 = Xt-2

AR(1) model: Xt = φ1Xt-1 + εt
Xt = φ1BXt + εt
Xt(1 – φ1B) = εt

(1 – φ1B) is the AR(1) component.
ARIMA Models
AR(2) model: Xt = φ1Xt-1 + φ2Xt-2 + εt
Xt = φ1BXt + φ2BXt-1 + εt
Xt = φ1BXt + φ2B²Xt + εt
Xt(1 – φ1B – φ2B²) = εt

(1 – φ1B – φ2B²) is the AR(2) component.

The generalized form for an AR(p) model is

$$X_t \left( 1 - \sum_{i=1}^{p} \phi_i B^i \right) = \varepsilon_t$$
ARIMA Models
Auto Regressive Integrated Moving Average models:

ARIMA(p, d, q)

p: number of auto-regressive terms
d: order of differencing
q: number of moving average terms

ARIMA Models
Auto Regressive Moving Average models:

ARMA(p, q)

Xt = φ1Xt-1 + φ2Xt-2 + … + φpXt-p + θ1et-1 + θ2et-2 + … + θqet-q + et

The φ terms form the AR part of order p; the θ terms act on the residuals and form the part of order q.

ARIMA Models
First order differencing:
Xt – Xt-1 = et
Xt – BXt = et
Xt(1 – B) = et

Second order differencing:

$$\begin{aligned} X''_t &= X'_t - X'_{t-1} \\ &= (X_t - X_{t-1}) - (X_{t-1} - X_{t-2}) \\ &= X_t - 2X_{t-1} + X_{t-2} \\ &= X_t - 2BX_t + B^2 X_t \\ &= (1 - B)^2 X_t \end{aligned}$$
ARIMA Models
In general, the dth order difference is $(1 - B)^d X_t$.

ARIMA(1, 1, 1), with Yt = Xt - Xt-1:

$$Y_t = \phi_1 Y_{t-1} + \theta_1 e_{t-1} + e_t$$
$$X_t - X_{t-1} = \phi_1 (X_{t-1} - X_{t-2}) + \theta_1 e_{t-1} + e_t$$
$$X_t - BX_t = \phi_1 (BX_t - B^2 X_t) + \theta_1 B e_t + e_t$$
$$X_t (1 - B - \phi_1 B + \phi_1 B^2) = e_t (1 + \theta_1 B)$$

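The left-hand operator of an ARIMA model is the product of the AR polynomial and (1 - B)^d, and multiplying polynomials in B is a convolution of their coefficient lists. The sketch below checks the ARIMA(1, 1, 1) expansion for an illustrative value φ1 = 0.5; the function name `arima_ar_side` and the chosen φ1 are assumptions made for this example.

```python
import numpy as np

def arima_ar_side(phi, d):
    """Coefficients of (1 - phi_1 B - ... - phi_p B^p)(1 - B)^d, ascending powers of B."""
    poly = np.concatenate(([1.0], -np.asarray(phi, dtype=float)))
    for _ in range(d):
        poly = np.convolve(poly, [1.0, -1.0])  # multiply by (1 - B)
    return poly

phi1 = 0.5  # illustrative value, not from the lecture
coeffs = arima_ar_side([phi1], d=1)

# Expected from the derivation: 1 - (1 + phi1) B + phi1 B^2
expected = np.array([1.0, -(1.0 + phi1), phi1])
print(coeffs, expected)
```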
ARIMA Models
Procedure for fitting Box-Jenkins type time series
models:

3 steps
1. Identification of the model structure
2. Parameter estimation and calibration
3. Model testing / Validation

ARIMA Models
1. Identification of the model structure:

• Check whether the series is stationary.
– Plot the correlogram (the correlogram shows a rapid decay for a stationary series)
• Remove non-stationarity, if any, by differencing or standardization.
• Obtain the order of the AR and MA components of the model.
• The PAC function determines the order of the AR process.

ARIMA Models
For example, an AR(1) process:
• Correlogram (ρk against k): exponentially decaying, with only the first few correlations significant
• Power spectrum (I(k) against ωk): lower frequencies dominant
• PAC function (φk against k): exactly one PAC (the first one) is significant
[Figure: sketches of the time series (xt against t), correlogram, power spectrum and PAC function]
ARIMA Models
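AR(2) process:
• Correlogram (ρk against k): decays in a sinusoidal wave form
• Power spectrum (I(k) against ωk): the dominant frequencies are neither low nor high
• PAC function (φk against k): exactly two PACs are significant
[Figure: sketches of the time series (xt against t), correlogram, power spectrum and PAC function]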
ARIMA Models
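Another AR(2) process:
• Correlogram (ρk against k): exponentially decaying, with many correlations significant
• Power spectrum (I(k) against ωk): lower frequencies dominant
• PAC function (φk against k): exactly two PACs are significant
[Figure: sketches of the time series (xt against t), correlogram, power spectrum and PAC function]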
ARIMA Models
• Behavior of an AR process:
– Decaying auto correlation function (either exponentially or as a damped sine wave)
– The order of the AR process is determined by the number of significant PACs

ARIMA Models
MA(1) process:
• Correlogram (ρk against k): exactly one auto correlation is significant
[Figure: sketches of the time series (xt against t), correlogram, power spectrum (I(k) against ωk) and PAC function (φk against k)]
ARIMA Models
MA(2) process:
• Correlogram (ρk against k): exactly two auto correlations are significant
• PAC function (φk against k): decays in a sinusoidal wave
[Figure: sketches of the time series (xt against t), correlogram, power spectrum (I(k) against ωk) and PAC function]
ARIMA Models
• Behavior of an MA process:
– The order of the MA process is determined by the number of significant auto correlations
– Decaying PAC function (either exponentially or as a damped sine wave)

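The identification rules for AR and MA processes can be tried out on synthetic series. The sketch below is illustrative only: the helper names `sample_acf` and `sample_pacf`, the parameter value 0.7, the series length and the approximate 95% band ±1.96/√N are all assumptions, not part of the lecture. The PAC is obtained by solving the Yule Walker equations at each order.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample auto correlations r_1 ... r_max_lag."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    c0 = np.dot(x, x)
    return np.array([np.dot(x[:x.size - k], x[k:]) / c0
                     for k in range(1, max_lag + 1)])

def sample_pacf(x, max_lag):
    """PACs of order 1 ... max_lag from the Yule Walker equations."""
    rho = sample_acf(x, max_lag)
    out = []
    for p in range(1, max_lag + 1):
        r = np.concatenate(([1.0], rho[:p - 1]))
        P = np.array([[r[abs(i - j)] for j in range(p)] for i in range(p)])
        out.append(np.linalg.solve(P, rho[:p])[-1])
    return np.array(out)

rng = np.random.default_rng(1)
n = 5000
e = rng.normal(size=n + 1)

# Synthetic AR(1): X_t = 0.7 X_{t-1} + e_t
ar1 = np.zeros(n)
for t in range(1, n):
    ar1[t] = 0.7 * ar1[t - 1] + e[t]

# Synthetic MA(1): X_t = e_t + 0.7 e_{t-1}
ma1 = e[1:] + 0.7 * e[:-1]

bound = 1.96 / np.sqrt(n)  # assumed approximate 95% band for "significant"
ar1_pacf = sample_pacf(ar1, 5)
ma1_acf = sample_acf(ma1, 5)
print(np.abs(ar1_pacf) > bound)  # significant PACs of the AR(1) series
print(np.abs(ma1_acf) > bound)   # significant auto correlations of the MA(1) series
```

For the AR(1) series the first PAC should stand out clearly, and for the MA(1) series the first auto correlation should; higher lags are expected to be small, although an individual lag can occasionally fall just outside the band by chance.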
