CH-3
Statistical Principles
and Techniques for
Time Series Modeling
A. Frequency
For discrete random variables, the number of
occurrences of a variate is generally called its
frequency. When the frequency is plotted against the
variate, a pattern of distribution is obtained.
This pattern is called the frequency
distribution.
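As a small illustration, a frequency distribution can be tabulated by counting how often each variate occurs. The sketch below assumes a made-up sample (the values are hypothetical, not taken from these notes) and uses Python's standard collections.Counter.

```python
from collections import Counter

# Hypothetical sample of a discrete variate (illustrative values only).
sample = [0, 1, 1, 2, 2, 2, 3, 3, 4, 2]

def frequency_distribution(values):
    """Return {variate: number of occurrences}, ordered by variate."""
    counts = Counter(values)
    return dict(sorted(counts.items()))

freq = frequency_distribution(sample)

# Print a crude text plot of frequency against variate.
for variate, count in freq.items():
    print(f"{variate}: {'*' * count} ({count})")
```

Dividing each count by the sample size would turn the frequencies into relative frequencies.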
12/5/2024
PMF
CDF: The cumulative distribution function (CDF) is the probability that the random variable X is less than or equal to x,
for every value x.
F[x] = P[X ≤ x]
For a discrete random variable, the CDF is found by summing the probabilities of all values up to and including x.
P[X = x_i] can then be computed from the CDF as
P[X = x_i] = F[x_i] − F[x_(i−1)]
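The summation that defines the CDF of a discrete random variable can be sketched as follows. The PMF values here are hypothetical and chosen only for illustration; exact fractions are used so the sums carry no rounding error.

```python
from fractions import Fraction

# Hypothetical PMF (assumed for illustration): value -> probability.
pmf = {1: Fraction(1, 5), 2: Fraction(3, 10), 3: Fraction(1, 2)}

def cdf(x, pmf):
    """F[x] = P[X <= x]: sum the probabilities of all values not exceeding x."""
    return sum((p for value, p in pmf.items() if value <= x), Fraction(0))

# The CDF is a step function: it is defined for every x, not only at the
# values that X can take.
for x in (0, 1, 2.5, 3, 10):
    print(x, cdf(x, pmf))
```

Between two consecutive values of X the CDF stays flat, which is why only the jumps carry probability.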
[Figure: PMF and CDF plots]
Example
Assume that X has the PMF given by
[Figure: PMF and CDF plots]
Example: The plot of the CDF of a discrete random variable X is shown in the figure below. Find the
PMF of X.
Solution
The random variable takes on values with nonzero probability at X = 1, X = 2, X = 4 and X = 6.
The size of the jump at X = 1 is 1/3, the size of the jump at X = 2 is 1/2 − 1/3 = 1/6, the size of the
jump at X = 4 is 3/4 − 1/2 = 1/4, and the size of the jump at X = 6 is 1 − 3/4 = 1/4. Thus, the PMF
of X is given by
PX(1) = 1/3, PX(2) = 1/6, PX(4) = 1/4, PX(6) = 1/4, and PX(x) = 0 otherwise.
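Example: Find the PMF of a discrete random variable X whose CDF is given by:
FX(x) = 0      for x < 0
FX(x) = 1/6    for 0 ≤ x < 2
FX(x) = 1/2    for 2 ≤ x < 4
FX(x) = 5/8    for 4 ≤ x < 6
FX(x) = 1      for x ≥ 6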
Solution:
Here the CDF changes value at x = 0, x = 2, x = 4 and x = 6, which means that these are the values
of the random variable that have nonzero probabilities.
The next task after isolating these values with nonzero probabilities is to determine their
probabilities.
The first value is PX(0), which is 1/6.
At x = 2 the size of the jump is 1/2−1/6=1/3=PX(2).
Similarly, at x = 4 the size of the jump is 5/8−1/2=1/8=PX(4).
Finally, at x = 6 the size of the jump is 1−5/8=3/8=PX(6).
Therefore, the PMF of X is given by
PX(0) = 1/6, PX(2) = 1/3, PX(4) = 1/8, PX(6) = 3/8, and PX(x) = 0 otherwise.
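The jump sizes in this example can be checked with a short sketch. The CDF levels below are the ones quoted in the solution (1/6, 1/2, 5/8 and 1); the function name jumps is an illustrative choice, not something defined in these notes.

```python
from fractions import Fraction

# CDF level reached at each jump point, as quoted in the worked solution.
cdf_steps = [(0, Fraction(1, 6)), (2, Fraction(1, 2)),
             (4, Fraction(5, 8)), (6, Fraction(1))]

def jumps(steps):
    """Return {x: size of the CDF jump at x}; the CDF is 0 before the first step."""
    pmf, previous = {}, Fraction(0)
    for x, level in steps:
        pmf[x] = level - previous
        previous = level
    return pmf

pmf = jumps(cdf_steps)
for x, p in pmf.items():
    print(f"PX({x}) = {p}")
```

A quick sanity check on any PMF recovered this way is that the jump sizes add up to 1.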
p (x=0) = 0.25
for −∞ < x < +∞, −∞ < y < +∞, where f(s, t) is the value of the joint
probability function of X and Y at (s, t).
F. Marginal distribution
Given the joint probability distribution of two random variables, the
marginal distribution is the distribution of one of the variables,
obtained by integrating out the variable that is no longer of
interest.
Given P(x, y) = joint probability distribution of two random variables
X and Y, the marginal distributions are
P(x) = ∫ P(x, y) dy and P(y) = ∫ P(x, y) dx
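For discrete variables, integrating out a variable becomes summing over it. The sketch below assumes a small hypothetical joint table (the probabilities are illustrative and not from these notes) and sums over the variable that is not of interest.

```python
from fractions import Fraction as F

# Hypothetical joint distribution P(x, y); illustrative values that sum to 1.
joint = {
    (0, 0): F(1, 8), (0, 1): F(1, 8), (0, 2): F(1, 4),
    (1, 0): F(1, 4), (1, 1): F(1, 8), (1, 2): F(1, 8),
}

def marginal(joint, axis):
    """Sum the joint probabilities over the variable that is not of interest.

    axis=0 returns the marginal of X, axis=1 the marginal of Y.
    """
    result = {}
    for key, p in joint.items():
        result[key[axis]] = result.get(key[axis], 0) + p
    return result

p_x = marginal(joint, 0)   # sum over y
p_y = marginal(joint, 1)   # sum over x
print(p_x, p_y)
```

Each marginal is itself a valid distribution, so its probabilities add up to 1.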
G. Conditional Distributions
A conditional distribution is the distribution of one variable given that the
other variables take specified values. This type of distribution makes it possible to assess
the dispersion of a variable of interest under specific conditions.
If f(x,y) is the joint pdf of two random variables X and Y, g(x) is the marginal pdf of X, and h(y) is the
marginal pdf of Y, then the conditional pdfs are
f(x/y) = f(x,y) / h(y), for h(y) ≠ 0
f(y/x) = f(x,y) / g(x), for g(x) ≠ 0
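Conditional distribution of X given Y (each column sums to 1):
                   Y
F(x,y)     0     10    25    35
X = 0      0     0     0     1
X = 1      0     1     1     0
X = 2      1     0     0     0
f(x/y)     1     1     1     1

Conditional distribution of Y given X (each row sums to 1):
                   Y
F(x,y)     0     10    25    35    f(y/x)
X = 0      0     0     0     1     1
X = 1      0     0.5   0.5   0     1
X = 2      1     0     0     0     1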
Marginal PDF of Y
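The conditional values tabulated above can be reproduced with a short sketch that divides a joint distribution by the appropriate marginal, f(y/x) = f(x,y)/g(x) and f(x/y) = f(x,y)/h(y). The joint values are not stated in these notes, so the joint table below is an assumption: equal probability 1/4 on the four nonzero cells, which is one choice consistent with the tabulated conditionals.

```python
from fractions import Fraction as F

# ASSUMED joint pmf f(x, y): the notes do not give it. Equal weights on the
# four nonzero cells are one choice that reproduces the tabulated conditionals.
joint = {(0, 35): F(1, 4), (1, 10): F(1, 4), (1, 25): F(1, 4), (2, 0): F(1, 4)}

def marginal(joint, axis):
    """Marginal of X (axis=0) or of Y (axis=1) by summing out the other variable."""
    out = {}
    for key, p in joint.items():
        out[key[axis]] = out.get(key[axis], 0) + p
    return out

def conditional_y_given_x(joint):
    """f(y/x) = f(x, y) / g(x), defined where g(x) > 0."""
    g = marginal(joint, 0)
    return {(x, y): p / g[x] for (x, y), p in joint.items()}

def conditional_x_given_y(joint):
    """f(x/y) = f(x, y) / h(y), defined where h(y) > 0."""
    h = marginal(joint, 1)
    return {(x, y): p / h[y] for (x, y), p in joint.items()}

f_y_given_x = conditional_y_given_x(joint)
f_x_given_y = conditional_x_given_y(joint)
print(f_y_given_x)
print(f_x_given_y)
```

Cells that are absent from the dictionaries correspond to the zero entries of the tables.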
H. Derived distributions
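Given the pdf of an independent variable, the derived distribution is the
distribution obtained for a functionally dependent variable.
Given: f(x) as the pdf of X, and Y as a function of X.
To compute: the pdf of Y, say g(y).
For a small element of the pdf,
area under the pdf of X = area under the pdf of Y
f(x)dx = g(y)dy, so that g(y) = f(x) |dx/dy| for a one-to-one relation between X and Y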
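The equal-area relation f(x)dx = g(y)dy can be checked numerically. The sketch below assumes, purely for illustration, that X is exponential with unit rate and that Y = 3X + 2; neither choice comes from these notes. It derives g(y) = f(x)|dx/dy| and compares the area under f over 0 ≤ x ≤ 1 with the area under g over the matching range 2 ≤ y ≤ 5.

```python
import math

def f(x):
    """ASSUMED pdf of X: exponential with unit rate (illustrative choice)."""
    return math.exp(-x) if x >= 0 else 0.0

def g(y):
    """Derived pdf of Y = 3X + 2, from g(y) = f(x) * |dx/dy|."""
    x = (y - 2.0) / 3.0      # inverse relation x(y)
    dx_dy = 1.0 / 3.0        # derivative of the inverse relation
    return f(x) * abs(dx_dy)

def integrate(func, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

# Equal areas: P(0 <= X <= 1) should match P(2 <= Y <= 5).
area_x = integrate(f, 0.0, 1.0)
area_y = integrate(g, 2.0, 5.0)
print(area_x, area_y)
```

The same pattern applies to any one-to-one relation, provided the inverse relation and its derivative are supplied.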
Statistical properties
Sample statistics
Population statistics
If K > 3, the curve is more peaked than the normal curve and is called leptokurtic.
If K < 3, the curve is less peaked than the normal curve and is called platykurtic.
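The classification by K can be sketched in code. The definition of K is not reproduced in the text here, so the sketch assumes the usual moment coefficient of kurtosis, K = m4 / m2², computed from population moments; the two data sets are hypothetical.

```python
def kurtosis(data):
    """ASSUMED definition: K = m4 / m2**2 using population central moments."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    return m4 / m2 ** 2

def classify(k, tol=1e-9):
    """Label a curve by comparing K with 3, the value for the normal curve."""
    if k > 3 + tol:
        return "leptokurtic"
    if k < 3 - tol:
        return "platykurtic"
    return "mesokurtic"

flat = [1, 2, 3, 4, 5]             # hypothetical flat sample
peaked = [0] * 8 + [-5, 5]         # hypothetical sample with heavy tails

for name, sample in (("flat", flat), ("peaked", peaked)):
    k = kurtosis(sample)
    print(name, round(k, 3), classify(k))
```

Some software reports excess kurtosis (K − 3) instead, in which case the comparison value is 0 rather than 3.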
Chebyshev's inequality is a result in probability theory that guarantees that at least a definite fraction of values lies
within a specified distance from the mean of a distribution. For any k > 0, no more than 1/k² of the values can lie
k or more standard deviations away from the mean: P(|X − µ| ≥ kσ) ≤ 1/k².
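Chebyshev's bound, P(|X − µ| ≥ kσ) ≤ 1/k², holds for any distribution with a finite variance, so it can be checked on an arbitrary data set. The sketch below uses a hypothetical sample (not from these notes) and the population standard deviation.

```python
import math

def fraction_outside(data, k):
    """Fraction of values lying at least k standard deviations from the mean."""
    n = len(data)
    mean = sum(data) / n
    sigma = math.sqrt(sum((x - mean) ** 2 for x in data) / n)  # population sigma
    return sum(1 for x in data if abs(x - mean) >= k * sigma) / n

# Hypothetical data set (illustrative values only).
data = [0] * 18 + [-10, 10]

for k in (1.5, 2, 3):
    observed = fraction_outside(data, k)
    bound = 1 / k ** 2
    print(f"k = {k}: observed fraction {observed:.3f}, Chebyshev bound {bound:.3f}")
```

The bound is deliberately loose for most distributions; it becomes informative only for k > 1.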
Proof
[Figure: distribution of X about the mean µ, with the interval from µ − kσ to µ + kσ marked]
a. Probability plotting
= (1000-1662)/1350