CH-3

The document discusses statistical principles and techniques for time series modeling, focusing on probability distributions, including discrete and continuous variables. It covers concepts such as probability mass functions (PMF), cumulative distribution functions (CDF), joint distributions, and various statistical properties like kurtosis and Chebyshev's inequality. Additionally, it addresses methods for estimating parameters of probability distributions and techniques for selecting appropriate distributions based on data analysis.



Statistical Principles
and Techniques for
Time Series Modeling

3.0 Concept of probability distribution

A. Frequency
For discrete random variables, the number of
occurrences of a variate is generally called
frequency. When the number of occurrences of
a variate, or the frequency, is plotted against the
variate, a pattern of distribution is obtained.
The pattern is called the frequency
distribution.

B. Relative frequency function


If the number of observations ni in interval i is divided by the total number of observations n, the result is called the relative frequency function
fs(x) = ni/n = P(x)
This relative frequency also serves as an estimate of the probability of occurrence of the variate, as computed in the sketch below.
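A tiny Python sketch (illustrative, not part of the original slides; NumPy assumed, with made-up observations) of the relative frequency computation fs(x) = ni/n:

```python
import numpy as np

data = np.array([2, 3, 3, 5, 2, 3, 4, 5, 5, 3])     # hypothetical observed variates
values, counts = np.unique(data, return_counts=True)
rel_freq = counts / counts.sum()                     # ni / n for each variate; sums to 1
for v, f in zip(values, rel_freq):
    print(v, f)
```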

1
12/5/2024

C. Probability mass function (PMF) and cumulative distribution function (CDF) of a discrete random variable

PMF: The probability function, also known as the probability mass function, gives the probability associated with each possible value of a random variable.
For a discrete random variable X, the PMF P(xi) is given by
• P(xi) = P[X = xi]
Properties of PMF: 0 ≤ P(xi) ≤ 1 for every xi, and Σi P(xi) = 1.

CDF: The CDF is the probability of the event that the random variable X is less than or equal to x, for every value x:
F(x) = P[X ≤ x]
For a discrete random variable, the CDF is found by summing up the probabilities of all values not exceeding x.
With the values xi arranged in increasing order, P[X = xi] can be recovered from the CDF by
P[X = xi] = F(xi) - F(xi-1)

(Figure: a discrete PMF and the corresponding step-function CDF.)

2
12/5/2024

Example
Assume that X has the PMF given by
PX(0) = 1/4, PX(1) = 1/2, PX(2) = 1/4, and 0 otherwise.
Find the CDF of X. Also plot the PMF and the CDF.


Solution
The PMF of the random variable X is
PX(0) = 1/4, PX(1) = 1/2, PX(2) = 1/4, and 0 otherwise.
Here, the random variable changes its value at x = 0, x = 1, and x = 2. The cumulative values are 0, 0 + 1/4 = 1/4, 1/4 + 1/2 = 3/4, and 3/4 + 1/4 = 1. Thus, the CDF of X is given by
F(x) = 0 for x < 0; 1/4 for 0 ≤ x < 1; 3/4 for 1 ≤ x < 2; 1 for x ≥ 2.

(Figure: the PMF and the step-function CDF for this example.)
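A minimal Python sketch of this construction (illustrative, not from the slides; NumPy and Matplotlib assumed): the CDF values are obtained by cumulatively summing the PMF.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.array([0, 1, 2])
pmf = np.array([0.25, 0.5, 0.25])
cdf = np.cumsum(pmf)            # [0.25, 0.75, 1.0], matching the worked values above

fig, (ax1, ax2) = plt.subplots(1, 2)
ax1.stem(x, pmf)                # PMF as vertical bars
ax1.set_title("PMF")
ax2.step(x, cdf, where="post")  # CDF as a right-continuous step function
ax2.set_title("CDF")
plt.show()
```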


Example: The plot of the CDF of a discrete random variable X is shown in the figure below. Find the PMF of X.

Solution
The random variable takes on values with nonzero probability at X = 1, X = 2, X = 4 and X = 6.
The size of the jump at X = 1 is 1/3, the size of the jump at X = 2 is 1/2 - 1/3 = 1/6, the size of the jump at X = 4 is 3/4 - 1/2 = 1/4, and the size of the jump at X = 6 is 1 - 3/4 = 1/4. Thus, the PMF of X is given by
PX(1) = 1/3, PX(2) = 1/6, PX(4) = 1/4, PX(6) = 1/4, and 0 otherwise.

Example: Find the PMF of a discrete random variable X whose CDF is given by:
F(x) = 0 for x < 0; 1/6 for 0 ≤ x < 2; 1/2 for 2 ≤ x < 4; 5/8 for 4 ≤ x < 6; 1 for x ≥ 6.

Solution:
Here the CDF changes value at x = 0, x = 2, x = 4 and x = 6, which means that these are the values of the random variable that have nonzero probabilities.


The next task after isolating these values with nonzero probabilities is to determine their
probabilities.
The first value is PX(0), which is 1/6.
At x = 2 the size of the jump is 1/2−1/6=1/3=PX(2).
Similarly, at x = 4 the size of the jump is 5/8−1/2=1/8=PX(4).
Finally, at x = 6 the size of the jump is 1−5/8=3/8=PX(6).
Therefore, the PMF of X is given by
PX(0) = 1/6, PX(2) = 1/3, PX(4) = 1/8, PX(6) = 3/8, and 0 otherwise.
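A short Python sketch of the same differencing step (illustrative, not from the slides; NumPy assumed): the PMF values are the jump sizes of the CDF.

```python
import numpy as np

x = np.array([0, 2, 4, 6])
F = np.array([1/6, 1/2, 5/8, 1.0])   # CDF value at and after each jump
pmf = np.diff(F, prepend=0.0)        # jump sizes: [1/6, 1/3, 1/8, 3/8]
print(dict(zip(x.tolist(), pmf.round(4).tolist())))
```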

D. Probability density function (PDF) and cumulative distribution function (CDF) of a continuous random variable

In the case of a continuous random variable, we cannot assign a probability to every individual value it can take. Instead, the x-axis is divided into a large number of mutually exclusive intervals, each of infinitesimal length dx, and we think of a function f(x) such that the probability that X lies in the interval x to x + dx is f(x)dx. Such a function is called the probability density function (PDF).
P[x1 ≤ X ≤ x2] = ∫_{x1}^{x2} f(x) dx
where f(x) represents the probability density at the point x.

f(x) ≥ 0 and ∫_{-∞}^{+∞} f(x) dx = 1,
where
x = specified value of the variable X,
f(x) = probability density at x, and
P[a ≤ X ≤ b] = probability that X lies between a and b.


• Knowing the CDF, P[a ≤ X ≤ b] can be computed as
• P[a ≤ X ≤ b] = F(b) - F(a)
• The relationship between the PDF and the CDF is
• f(x) = dF(x)/dx, where f(x) = PDF and F(x) = CDF
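A hedged Python sketch of these two relations (illustrative only; SciPy assumed, and an exponential distribution chosen arbitrarily since the slides do not specify one):

```python
from scipy.stats import expon

# P[a <= X <= b] = F(b) - F(a)
a, b = 0.5, 2.0
print(expon.cdf(b) - expon.cdf(a))

# f(x) = dF(x)/dx: a central-difference derivative of the CDF approximates the PDF
x, h = 1.0, 1e-6
pdf_numeric = (expon.cdf(x + h) - expon.cdf(x - h)) / (2 * h)
print(pdf_numeric, expon.pdf(x))   # the two values agree closely
```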



E. Joint probability distribution (bivariate or multivariate)

A joint probability distribution represents the probability distribution of more than one random variable.
For two discrete random variables X and Y, the joint probability function is
p(x, y) = P[X = x, Y = y]
Conditions: p(x, y) ≥ 0 and Σ_x Σ_y p(x, y) = 1

The joint CDF is F(x, y) = P[X ≤ x, Y ≤ y] = Σ_{s ≤ x} Σ_{t ≤ y} p(s, t)
for -∞ < x < +∞, -∞ < y < +∞, where p(s, t) is the value of the joint probability function of X and Y at (s, t).

For two continuous random variables X and Y with joint PDF f(x, y), the joint CDF is
F(x, y) = P[X ≤ x, Y ≤ y] = ∫_{-∞}^{x} ∫_{-∞}^{y} f(s, t) dt ds
for -∞ < x < +∞, -∞ < y < +∞, where f(s, t) is the value of the joint probability density function of X and Y at (s, t).


F. Marginal distribution
Given the joint probability distribution of two random variables, a marginal distribution is the distribution of one of the variables, obtained by summing or integrating out the variable that is no longer of interest.
Given p(x, y) or f(x, y) = joint probability distribution of two random variables X and Y, the marginals are
g(x) = Σ_y p(x, y) (discrete) or g(x) = ∫_{-∞}^{+∞} f(x, y) dy (continuous)
h(y) = Σ_x p(x, y) (discrete) or h(y) = ∫_{-∞}^{+∞} f(x, y) dx (continuous)
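A minimal Python sketch (illustrative; NumPy assumed, with a made-up joint PMF table) showing the marginals as row and column sums:

```python
import numpy as np

# Hypothetical joint PMF p(x, y); rows index X, columns index Y, entries sum to 1
p_xy = np.array([[0.10, 0.20],
                 [0.30, 0.40]])

g_x = p_xy.sum(axis=1)   # marginal of X: sum out Y -> [0.3, 0.7]
h_y = p_xy.sum(axis=0)   # marginal of Y: sum out X -> [0.4, 0.6]
print(g_x, h_y)
```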


G. Conditional distributions
A conditional distribution is the distribution of values of one variable given fixed values of the other variables. This type of distribution allows one to assess the behaviour of a variable of interest under specific conditions.
If f(x, y) = joint pdf of two random variables X and Y, g(x) = marginal pdf of X, and h(y) = marginal pdf of Y, then
f(x|y) = f(x, y) / h(y) and f(y|x) = f(x, y) / g(x)

Example: conditional distribution tables for a discrete pair (X, Y)

f(x|y), each column sums to 1:
              Y = 0   Y = 10   Y = 25   Y = 35
   X = 0        0       0        0        1
   X = 1        0       1        1        0
   X = 2        1       0        0        0
   sum          1       1        1        1

f(y|x), each row sums to 1:
              Y = 0   Y = 10   Y = 25   Y = 35   sum
   X = 0        0       0        0        1       1
   X = 1        0       0.5      0.5      0       1
   X = 2        1       0        0        0       1
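A short Python sketch (illustrative; NumPy assumed) computing conditional PMFs from the hypothetical joint table used in the marginal-distribution sketch above, by dividing the joint probabilities by the appropriate marginal:

```python
import numpy as np

p_xy = np.array([[0.10, 0.20],
                 [0.30, 0.40]])          # hypothetical joint PMF; rows X, columns Y
g_x = p_xy.sum(axis=1)                   # marginal of X
h_y = p_xy.sum(axis=0)                   # marginal of Y

f_y_given_x = p_xy / g_x[:, None]        # f(y|x): each row sums to 1
f_x_given_y = p_xy / h_y[None, :]        # f(x|y): each column sums to 1
print(f_y_given_x.sum(axis=1), f_x_given_y.sum(axis=0))
```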

If two random variables X and Y have the joint pdf given by

Find the marginal density of X, marginal density of Y and verify


whether two random variables are independent.
Solution:
Marginal PDF of X

Marginal PDF of Y
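Since the joint pdf of this example is not reproduced above, here is a hedged SymPy sketch with a stand-in joint pdf f(x, y) = 4xy on the unit square, illustrating the same steps (marginals by integration, then the independence check f(x, y) = g(x)·h(y)):

```python
import sympy as sp

x, y = sp.symbols("x y", nonnegative=True)
f_xy = 4 * x * y                          # stand-in joint pdf on 0 <= x, y <= 1

g_x = sp.integrate(f_xy, (y, 0, 1))       # marginal of X: 2*x
h_y = sp.integrate(f_xy, (x, 0, 1))       # marginal of Y: 2*y

# X and Y are independent iff f(x, y) equals g(x) * h(y) everywhere
print(g_x, h_y, sp.simplify(f_xy - g_x * h_y) == 0)   # True -> independent
```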


H. Derived distributions
Given the pdf of an independent variable, the derived distribution is the distribution obtained for a functionally dependent variable.
Given f(x) as the pdf of X and a functional relationship y = y(x), the pdf of Y, say g(y), is computed by noting that for a small element the area under the pdf of X equals the area under the pdf of Y:
f(x) dx = g(y) dy, so that g(y) = f(x) |dx/dy|
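An illustrative Python sketch (not from the slides; NumPy assumed) for the hypothetical transformation Y = X² with X ~ Uniform(0, 1), where the rule above gives g(y) = f(x)|dx/dy| = 1/(2√y) for 0 < y < 1; a simulation histogram is compared with this derived density:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=100_000)
y = x ** 2                                    # functionally dependent variable

hist, edges = np.histogram(y, bins=20, range=(0.01, 1.0), density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
g_y = 1.0 / (2.0 * np.sqrt(centers))          # derived pdf g(y) = f(x)|dx/dy|
print(np.round(hist - g_y, 2))                # close to zero: simulation roughly matches g(y)
```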


Statistical properties
Sample statistics
Population statistics

Kurtosis is a statistical measure that describes how heavily the tails of a distribution differ from the tails of a normal distribution. In other words, kurtosis identifies whether the tails of a given distribution contain extreme values.

If K = 3, the curve has a normal peak and is called a mesokurtic curve.
If K > 3, the curve is more peaked than the normal and is called a leptokurtic curve.
If K < 3, the curve is less peaked than the normal and is called a platykurtic curve.
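A quick Python sketch (illustrative; SciPy assumed) computing the Pearson kurtosis K of a sample and classifying it against the normal value K = 3:

```python
import numpy as np
from scipy.stats import kurtosis

rng = np.random.default_rng(1)
sample = rng.normal(size=10_000)

k = kurtosis(sample, fisher=False)   # fisher=False returns Pearson kurtosis (normal: K = 3)
label = "mesokurtic" if abs(k - 3) < 0.1 else ("leptokurtic" if k > 3 else "platykurtic")
print(round(k, 3), label)
```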


Covariance signifies the direction of the linear relationship between two variables:
Cov(X, Y) = E[(X - µX)(Y - µY)]
A positive covariance means the two variables tend to increase together, while a negative covariance means one tends to decrease as the other increases.
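A one-line check in Python (illustrative; NumPy assumed) with made-up data in which y increases with x, so the sample covariance comes out positive:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])   # increases with x
print(np.cov(x, y)[0, 1])                 # positive sample covariance
```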


3.3 Chebyshev's inequality

Knowing the mean and standard deviation of a variable, an approximate probability bound can be obtained using Chebyshev's inequality. It states that, for any k > 1, at least 1 - 1/k² of the data from a sample must fall within k standard deviations of the mean.
Chebyshev's inequality is a theorem in probability theory that characterizes the dispersion of data away from its mean (average):
P(|X - µ| ≥ kσ) ≤ 1/k²

For k = 2 we have 1 - 1/k² = 1 - 1/4 = 3/4 = 75%, so Chebyshev's inequality says that at least 75% of the data values of any distribution must be within two standard deviations of the mean. For k = 3 we have 1 - 1/k² = 1 - 1/9 = 8/9 ≈ 89%.

Chebyshev's inequality thus guarantees that a definite fraction of values will be found within a specific distance from the mean of any distribution: no more than 1/k² of the values can lie more than k standard deviations away from the mean.
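A small empirical check in Python (illustrative; NumPy assumed, with an arbitrary skewed sample): the observed fraction within k standard deviations always meets the 1 - 1/k² bound.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.exponential(scale=1.0, size=100_000)   # any distribution works
mu, sigma = x.mean(), x.std()

for k in (2, 3):
    frac = np.mean(np.abs(x - mu) < k * sigma)
    print(k, round(frac, 3), ">=", round(1 - 1 / k**2, 3))
```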


3.3 Chebyshev's inequality

Proof
(Diagram: the pdf of X with the interval from µ - kσ to µ + kσ marked around the mean µ.)

3.4 Moment generating function

Moment generating functions are a way to find moments such as the mean (µ) and the variance (σ²). They are an alternative way to represent a probability distribution with a simple one-variable function:
M_X(t) = E[e^(tX)]
The nth moment about the origin is obtained by differentiating M_X(t) n times with respect to t and evaluating the result at t = 0.
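A hedged SymPy sketch (illustrative; the exponential distribution is an arbitrary choice) recovering the mean and variance from the MGF M(t) = λ/(λ - t):

```python
import sympy as sp

t, lam = sp.symbols("t lambda", positive=True)
M = lam / (lam - t)                      # MGF of an exponential distribution

m1 = sp.diff(M, t, 1).subs(t, 0)         # first moment  E[X]   = 1/lambda
m2 = sp.diff(M, t, 2).subs(t, 0)         # second moment E[X^2] = 2/lambda^2
variance = sp.simplify(m2 - m1**2)       # Var(X) = 1/lambda^2
print(m1, variance)
```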




3.5 Normal (Gaussian) distribution

The graph of the Gaussian distribution depends on two factors: the mean and the standard deviation. The mean of the distribution determines the location of the center of the graph, and the standard deviation determines the height and width of the graph.
The normal distribution, also known as the Gaussian distribution, is a probability distribution that is symmetric about the mean, showing that data near the mean are more frequent in occurrence than data far from the mean.
The normal (Gaussian) distribution is widely used because it fits many natural phenomena and conveniently describes hydrological variables aggregated over large intervals, such as annual runoff and annual rainfall. Random errors associated with any hydrological measurement can be approximated by a normal distribution. This type of distribution is also useful in hypothesis testing.

The PDF of the normal distribution is given by
f(x) = (1 / (σ√(2π))) exp(-(x - µ)² / (2σ²)), for -∞ < x < +∞

The CDF at a value x, F(x) = ∫_{-∞}^{x} f(u) du, cannot be evaluated analytically. The alternative is to transform N(µ, σ) to the standardized form z = (x - µ)/σ, which follows the standard normal distribution N(0, 1), whose CDF is tabulated.
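A short Python sketch (illustrative; SciPy assumed, with made-up values of µ and σ) showing that the standardized calculation and the direct call give the same probability:

```python
from scipy.stats import norm

mu, sigma = 100.0, 15.0                                  # hypothetical parameters
a, b = 85.0, 130.0

z_a, z_b = (a - mu) / sigma, (b - mu) / sigma            # standardize
print(norm.cdf(z_b) - norm.cdf(z_a))                     # via the standard normal CDF
print(norm.cdf(b, mu, sigma) - norm.cdf(a, mu, sigma))   # same result directly
```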


3.6 Central limit theorem

The central limit theorem states that, given a sufficiently large sample size, the sampling distribution of the mean of a variable will approximate a normal distribution. Equivalently, if you have a population with mean µ and standard deviation σ and take sufficiently large random samples from the population with replacement, then the distribution of the sample means will be approximately normally distributed.

It also states that if a sequence of random variables Xi is independently and identically distributed with mean µ and variance σ², then the distribution of the sum of n such random variables approaches a normal distribution with mean nµ and variance nσ² as n becomes large.

If a daily event such as daily runoff or daily precipitation is considered to be a random variable, the corresponding annual event is the result of the summation of 365 random variables. Though the daily events are not truly independent, since 365 is a fairly large number, by the central limit theorem annual hydrologic variables can be taken to follow a normal distribution, as the simulation sketch below illustrates.
Random errors associated with hydrologic measurements are usually symmetrically distributed and can be well approximated by the normal distribution.
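A hedged simulation in Python (illustrative; NumPy and SciPy assumed, with exponential "daily" values as an arbitrary stand-in for daily rainfall or runoff): the totals of 365 strongly skewed daily values come out nearly symmetric, as the central limit theorem predicts.

```python
import numpy as np
from scipy.stats import skew

rng = np.random.default_rng(3)
daily = rng.exponential(scale=5.0, size=(10_000, 365))  # 10,000 simulated "years" of daily values
annual = daily.sum(axis=1)                              # each annual value is a sum of 365 variables

print(round(skew(daily.ravel()), 2))   # about 2: daily values are strongly skewed
print(round(skew(annual), 2))          # near 0: annual totals are approximately normal
```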


3.7 Estimating the parameters of a probability distribution

a. Method of moments (Karl Pearson method)
The expected values of the random variable give the population moments, whereas the sample moments are computed from the given time series data.
The parameters of the time series model are functions of the population moments.
Therefore, each sample moment is equated to the corresponding population moment, and the resulting equations are solved for the parameters.
For k parameters, the first k population and sample moments are equated and solved simultaneously, as in the sketch below.
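A minimal method-of-moments sketch in Python (illustrative; NumPy assumed, and a two-parameter gamma distribution chosen only as an example): equating the first two moments, mean = kθ and variance = kθ², and solving for the parameters.

```python
import numpy as np

rng = np.random.default_rng(4)
data = rng.gamma(shape=2.0, scale=3.0, size=5_000)   # synthetic sample with known parameters

mean, var = data.mean(), data.var()
theta_hat = var / mean              # from variance / mean = theta
k_hat = mean / theta_hat            # from mean / theta = k
print(round(k_hat, 2), round(theta_hat, 2))   # close to the true values 2.0 and 3.0
```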

b. Method of maximum likelihood (Fisher method)

The likelihood of a set of data is the probability of obtaining that particular set of data, given the chosen probability distribution model. This expression contains the unknown model parameters; the values of these parameters that maximize the sample likelihood are known as the maximum likelihood estimates.
Maximum likelihood estimation is a method that determines values for the parameters of a model. The parameter values are found such that they maximise the likelihood that the process described by the model produced the data that were actually observed.
According to this method, the best values of the parameters of a distribution are those that maximize the likelihood function, i.e. the joint probability of occurrence of the observed sample.
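A hedged sketch in Python (illustrative; NumPy and SciPy assumed, with an exponential model chosen only as an example): the rate λ is estimated by minimizing the negative log-likelihood, and the numerical optimum reproduces the closed-form MLE 1/mean.

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(5)
x = rng.exponential(scale=2.0, size=1_000)   # synthetic sample, true rate lambda = 0.5

def neg_log_lik(lam):
    # log-likelihood of an exponential sample: n*log(lambda) - lambda*sum(x)
    return -(len(x) * np.log(lam) - lam * x.sum())

res = minimize_scalar(neg_log_lik, bounds=(1e-6, 10.0), method="bounded")
print(round(res.x, 3), round(1.0 / x.mean(), 3))   # both are about 0.5
```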




3.8 Selection of distribution

a. Probability plotting
A probability plot is a plot between the random variable and its probability of exceedance.
Method
• Arrange the data in descending order and assign ranks (m) starting from 1.
• Compute the plotting position P(X ≥ xm) and the return period T (T = 1/P).
• Plot the given data versus P or T on probability paper.
If the plot is made on the probability paper of an assumed distribution and the plot is a straight line, then the chosen distribution is valid for the given data.
Such a plot is also useful for interpolation or extrapolation.
California formula for plotting position: P(X ≥ xm) = m/n (the simplest formula)
Weibull's formula for plotting position: P(X ≥ xm) = m/(n + 1)
where m = rank and n = total number of values. A small worked sketch follows.
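An illustrative Python sketch (NumPy assumed, with a made-up annual-maximum series) computing Weibull plotting positions and return periods:

```python
import numpy as np

data = np.array([120.0, 95.0, 150.0, 80.0, 110.0])   # hypothetical annual maxima
x_sorted = np.sort(data)[::-1]        # descending order; rank m = 1 for the largest value
n = len(x_sorted)
m = np.arange(1, n + 1)

p_exceed = m / (n + 1)                # Weibull formula: P(X >= xm) = m / (n + 1)
T = 1.0 / p_exceed                    # return period
for xi, pi, ti in zip(x_sorted, p_exceed, T):
    print(xi, round(pi, 3), round(ti, 2))
```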



b. Testing goodness of fit

i) Chi-squared test
The chi-squared test is used to determine whether there is a statistically significant difference between the expected frequencies and the observed frequencies.

• Divide the range of the random variable into k class intervals.
• Select the probability distribution whose adequacy is to be tested and obtain its parameters.
• Determine the probability pi with which the random variable lies in each class interval; the expected frequency in interval i is then Ei = n·pi.
• Compute the test statistic χ² = Σ (Oi - Ei)²/Ei, where Oi is the observed frequency in interval i, and compare it with the tabulated chi-squared value at the chosen significance level.
A hedged example follows.

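A minimal Python sketch (illustrative; SciPy assumed, with made-up class counts and class probabilities) of the chi-squared goodness-of-fit computation:

```python
import numpy as np
from scipy.stats import chisquare

observed = np.array([18, 30, 28, 14, 10])        # hypothetical observed counts in k = 5 classes
p_i = np.array([0.15, 0.30, 0.30, 0.15, 0.10])   # class probabilities from the fitted distribution
expected = p_i * observed.sum()                  # expected frequencies E_i = n * p_i

stat, p_value = chisquare(observed, f_exp=expected)
print(round(stat, 3), round(p_value, 3))         # large p-value -> no significant lack of fit
```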


ii) Kolmogorov-Smirnov test (K-S test or KS test)

The Kolmogorov-Smirnov test is used to decide whether a sample comes from a population with a specific distribution. The Kolmogorov-Smirnov statistic quantifies the distance between the empirical distribution function of the sample and the cumulative distribution function of the reference distribution.
(Figure: the red line is a model CDF, the blue line is an empirical CDF, and the black arrow is the KS statistic.)
Procedure
• Compute the empirical cumulative probability P(xm) by Weibull's formula (P(xm) = 1 - exceedance probability).
• Compute the theoretical cumulative probability F(xm) for the assumed distribution.
• Compute the Kolmogorov-Smirnov test statistic D = max |F(xm) - P(xm)| and compare it with the critical value at the chosen significance level.
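A short Python sketch (illustrative; NumPy and SciPy assumed, with a synthetic sample) of the KS test against a normal distribution fitted with the sample mean and standard deviation:

```python
import numpy as np
from scipy.stats import kstest

rng = np.random.default_rng(6)
sample = rng.normal(loc=50.0, scale=10.0, size=200)   # synthetic data

stat, p_value = kstest(sample, "norm", args=(sample.mean(), sample.std()))
print(round(stat, 3), round(p_value, 3))   # small D and large p-value: the normal fit is not rejected
```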


For a normal fit, the theoretical cumulative probabilities F(xm) are obtained by standardizing, Z = (X - mean)/SD, and reading the value from the Z-table, while the empirical probabilities are P(xm) = 1 - exceedance probability.


