Random Variables

The University of Aberdeen's Lecture series on Probability and Statistics

Uploaded by

Sudeep Roy

Probability & Statistics

LECTURE 3: RANDOM VARIABLES AND DISCRETE PROBABILITY DISTRIBUTIONS
Dr Peter D. Dunning

ENGINEERING RISK & RELIABILITY ANALYSIS


Random variables
o Any quantity that has an uncertain value
o Discrete (count data): e.g. number of failed components

o Continuous (measure data): e.g. yield strength of metal

o May be bounded:
o Example: number of components failed in a batch of 200: $n \in [0, 200]$



Probability density and mass function
o A probability mass function (PMF) describes the probability that a discrete random
variable is equal to a certain value
➢ Example, dice roll, uniform distribution:

$P(X = x) = f(x) = 1/6, \quad x \in \{1, 2, \dots, 6\}$

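o A probability density function (PDF) describes how the probability of a continuous
random variable is distributed over its range. The probability of any single exact value
is zero, so probabilities are obtained by integrating the PDF over an interval
➢ Example, heights in a population, normal distribution:

$f(x) = \frac{1}{(2\pi)^{1/2} \sigma_x} \exp\left[ -\frac{1}{2} \left( \frac{x - \mu_x}{\sigma_x} \right)^2 \right]$

[Figure: normal PDF, f(x) plotted against x]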



Cumulative distribution function
o Cumulative distribution function (CDF) describes the probability that the random
variable is less than or equal to a certain value

o Evaluate by integrating PDF, or summing a PMF up to a certain value:

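[Figure: f(x) plotted against x with the limit a marked, and F(a) plotted against a]

Discrete (PMF): $P(X \le a) = F_x(a) = \sum_{x_i \le a} f_x(x_i)$

Continuous (PDF): $P(X \le a) = F_x(a) = \int_{-\infty}^{a} f_x(x)\,dx$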

o Total area under a PDF, or sum of PMF, must equal 1 (axiom of probability theory)
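
The summation form of the CDF can be illustrated with the dice roll example. The following is a minimal sketch, not part of the lecture; the names `pmf` and `cdf` are chosen here for illustration, and exact fractions are used so the unit-sum check has no rounding error.

```python
from fractions import Fraction

# PMF of a fair six-sided dice roll: f(x) = 1/6 for x = 1, ..., 6
pmf = {x: Fraction(1, 6) for x in range(1, 7)}


def cdf(a, pmf):
    """F(a) = P(X <= a): sum the PMF over all values x_i <= a."""
    return sum((p for x, p in pmf.items() if x <= a), Fraction(0))


total = sum(pmf.values())  # sum of the whole PMF, must equal 1
print(total)               # 1
print(cdf(4, pmf))         # 2/3
```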



Cumulative distribution function
o Probability that a random variable lies between two limits is
equal to the difference between CDF values

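Discrete: $P(a < X \le b) = \sum_{a < x_i \le b} f_x(x_i) = F_x(b) - F_x(a)$

Continuous: $P(a < X \le b) = \int_{-\infty}^{b} f_x(x)\,dx - \int_{-\infty}^{a} f_x(x)\,dx = F_x(b) - F_x(a)$

[Figure: f(x) plotted against x with the limits a and b marked]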
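
As a rough illustration of the CDF difference, the sketch below uses `statistics.NormalDist` from the Python standard library for the heights example. The mean of 170 cm and standard deviation of 10 cm are hypothetical values chosen only for this sketch, and `prob_between` is an illustrative helper name.

```python
from statistics import NormalDist

# Hypothetical population of heights in cm (values assumed for illustration)
heights = NormalDist(mu=170.0, sigma=10.0)


def prob_between(dist, a, b):
    """P(a < X <= b) = F(b) - F(a)."""
    return dist.cdf(b) - dist.cdf(a)


# Probability of a height within one standard deviation of the mean
p = prob_between(heights, 160.0, 180.0)
print(round(p, 4))  # 0.6827
```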



Statistical moments
o PMF and PDFs are often described by their derived properties

o Such as their statistical moments: mean, variance, etc …

o Mean (1st moment) is the probability-weighted average, or expected value, of the random variable (not necessarily its most likely value):

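Discrete: $E[X] = \mu_x = \sum_i x_i f_x(x_i)$

Continuous: $E[X] = \mu_x = \int_{-\infty}^{\infty} x f_x(x)\,dx$

[Figure: dice roll PMF with mean μ = 3.5, and a PDF f(x) with the mean μ marked on the x axis]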



Statistical moments
o Variance (2nd moment) measures the amount of randomness about the mean:

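Discrete: $E[(X - \mu_x)^2] = \mathrm{var}(X) = \sum_i (x_i - \mu_x)^2 f_x(x_i)$

Continuous: $E[(X - \mu_x)^2] = \mathrm{var}(X) = \int_{-\infty}^{\infty} (x - \mu_x)^2 f_x(x)\,dx$

Equivalent form: $\mathrm{var}(X) = E[X^2] - \mu_x^2$

o Standard deviation: $\sigma_x = [\mathrm{var}(X)]^{0.5}$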

o Coefficient of variation: Vx = σx / μx
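
The first two moments can be checked on the dice roll example. This is a small sketch rather than lecture material; the variable names are illustrative, and both forms of the variance are evaluated to show that they agree.

```python
from fractions import Fraction
from math import sqrt

# PMF of a fair six-sided dice roll
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

# Mean: probability-weighted average of the possible values
mean = sum(x * p for x, p in pmf.items())

# Variance from the definition E[(X - mu)^2]
var = sum((x - mean) ** 2 * p for x, p in pmf.items())

# Variance from the equivalent form E[X^2] - mu^2
var_alt = sum(x**2 * p for x, p in pmf.items()) - mean**2

std = sqrt(var)          # standard deviation
cov = std / float(mean)  # coefficient of variation

print(mean)  # 7/2
print(var)   # 35/12
```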



Statistical moments
o Skewness (3rd moment) measures the lack of symmetry about the mean:

$E[(X - \mu_x)^3] = \int_{-\infty}^{\infty} (x - \mu_x)^3 f_x(x)\,dx, \qquad \gamma_1 = \frac{E[(X - \mu_x)^3]}{\sigma_x^3}$

o Kurtosis (4th moment) measures the ‘flatness’ of a distribution:



$E[(X - \mu_x)^4] = \int_{-\infty}^{\infty} (x - \mu_x)^4 f_x(x)\,dx, \qquad \gamma_2 = \frac{E[(X - \mu_x)^4]}{\sigma_x^4}$
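
The same definitions apply to discrete variables, with the integral replaced by a sum over the PMF. The sketch below evaluates γ1 and γ2 for the dice roll example; `central_moment` is a helper name introduced here for illustration, not something defined in the lecture.

```python
from fractions import Fraction


def central_moment(pmf, k):
    """k-th central moment E[(X - mu)^k] of a discrete PMF given as {value: probability}."""
    mean = sum(x * p for x, p in pmf.items())
    return sum((x - mean) ** k * p for x, p in pmf.items())


# PMF of a fair six-sided dice roll
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

var = central_moment(pmf, 2)

# Skewness: third central moment divided by sigma^3 (zero for a symmetric PMF)
gamma1 = float(central_moment(pmf, 3)) / float(var) ** 1.5

# Kurtosis: fourth central moment divided by sigma^4
gamma2 = central_moment(pmf, 4) / var**2

print(gamma1)  # 0.0
print(gamma2)  # 303/175
```

The kurtosis of about 1.73 is below the value of 3 for a normal distribution, consistent with the uniform PMF being flatter.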



Name some engineering random variables?



Basic variables
o Fundamental variables used in reliability calculations
o ideally they are independent (but not always possible)

o dependent variables add complexity to analysis (joint PDFs)

o basic variables can be related to even more fundamental variables

o Examples: material properties (stiffness, strength), dimensions, loads

o Uncertainty in the value of a random basic variable is represented using a PMF or PDF



Discrete uniform distribution
o The simplest discrete PMF, assumes a limited number of possible values,
each with an equal probability

o For n possible values, PMF: $f(x) = 1/n, \quad x \in \{1, 2, \dots, n\}$

o Not often useful in engineering



Binomial distribution
o Discrete random variable: the number of successes in a set of trials

o Each trial is independent and the probability of success is constant

o Binomial distribution: probability of x successes in n trials with probability of success p:


PMF: $f(x) = \frac{n!}{x!\,(n - x)!}\, p^x (1 - p)^{n - x}, \quad x = 0, 1, 2, \dots, n$

CDF: $P(X \le a) = F(a) = \sum_{y=0}^{a} \frac{n!}{y!\,(n - y)!}\, p^y (1 - p)^{n - y}$

Mean: $\mu_x = np$

Variance: $\sigma_x^2 = np(1 - p)$



Binomial distribution
o A process that can be described using a Binomial distribution (the event
is either a success or failure) is called a Bernoulli process
o Useful in many engineering applications when a process is repeated, the
outcome is either a success or failure and each repeat is independent

o Examples:
➢ Quality control on a production line: a product either passes or fails inspection
➢ In an earthquake zone, a building may or may not be damaged during one year

[Portrait: Jacob Bernoulli, 1654-1705]



Example
o A factory produces identical components. The probability that a component will pass the quality
inspection is 0.99.

1. What is the probability that in a batch of 100, 2 or fewer components fail inspection?

Use the CDF for the Binomial distribution. For convenience define the event of a failed inspection
as a “success.” Thus, n = 100, p = 0.01:
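
$P(X \le 2) = F(2) = \sum_{y=0}^{2} \frac{100!}{y!\,(100 - y)!} \times 0.01^y \times 0.99^{100 - y}$

$F(2) = \frac{100!}{100!} \times 0.99^{100} + \frac{100!}{99!} \times 0.01 \times 0.99^{99} + \frac{100!}{2 \times 98!} \times 0.01^2 \times 0.99^{98}$

$F(2) = 0.3660 + 0.3697 + 0.1849 = \mathbf{0.9206}$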



Example
2. What is the probability that exactly 3 components fail inspection?

Use the Binomial PMF with a “success” being a passed inspection: then p = 0.99, n = 100:

$P(X = 97) = f(97) = \frac{100!}{3! \times 97!} \times 0.99^{97} \times 0.01^3 = \mathbf{0.061}$

Alternatively, define a failed inspection as a “success,” then p = 0.01, n = 100:

$P(X = 3) = f(3) = \frac{100!}{97! \times 3!} \times 0.01^3 \times 0.99^{97} = \mathbf{0.061}$
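
Both parts of this example can be reproduced numerically. The following is a minimal sketch using only the Python standard library; the function names `binomial_pmf` and `binomial_cdf` are chosen here for illustration.

```python
from math import comb


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def binomial_cdf(a, n, p):
    """Probability of a or fewer successes: sum of the PMF for y = 0, ..., a."""
    return sum(binomial_pmf(y, n, p) for y in range(a + 1))


n = 100
p_fail = 0.01  # a failed inspection is treated as the "success" event

p_two_or_fewer = binomial_cdf(2, n, p_fail)
p_exactly_three = binomial_pmf(3, n, p_fail)
p_97_pass = binomial_pmf(97, n, 1 - p_fail)  # same event, counted as passes

print(round(p_two_or_fewer, 4))   # 0.9206
print(round(p_exactly_three, 3))  # 0.061
```

Counting 3 failures or 97 passes gives the same probability, since the two definitions of "success" describe the same event.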



Geometric distribution
o Discrete random variable: number of trials until a success occurs

o Again, each trial is independent and the probability of success is constant

o Geometric distribution: probability that trial number x is the first success,
with probability of success for each trial p:

PMF: $f(x) = (1 - p)^{x - 1} p, \quad x = 1, 2, \dots$

CDF: $P(X \le a) = F(a) = \sum_{y=1}^{a} (1 - p)^{y - 1} p = 1 - (1 - p)^a$

Mean: $\mu_x = 1/p$

Variance: $\sigma_x^2 = (1 - p)/p^2$
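
The closed-form CDF can be checked against the direct sum of the PMF. The sketch below does this for an arbitrary illustrative value p = 0.2 (not taken from the lecture); the function names are also chosen here.

```python
def geometric_pmf(x, p):
    """Probability that trial number x is the first success."""
    return (1 - p) ** (x - 1) * p


def geometric_cdf_sum(a, p):
    """CDF by direct summation of the PMF for y = 1, ..., a."""
    return sum(geometric_pmf(y, p) for y in range(1, a + 1))


def geometric_cdf(a, p):
    """CDF from the closed form 1 - (1 - p)^a."""
    return 1 - (1 - p) ** a


p = 0.2  # illustrative value only
for a in (1, 5, 10):
    print(a, geometric_cdf_sum(a, p), geometric_cdf(a, p))
```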



Geometric distribution
o In a time (or space) sequence that is appropriately discretized into intervals, such that the
probability of an event occurring (success) within an interval is constant and independent of events
in other intervals:
➢ First occurrence time = the number of intervals until first success (i.e. the event occurs)
➢ The PMF for the recurrence time of the event is equal to the PMF of the first occurrence time,
thus it can be modelled using a geometric distribution

o Mean recurrence time is known as the (average) return period $\bar{T}$:

$\bar{T} = E[T] = \sum_{t=1}^{\infty} t\, p (1 - p)^{t - 1} = \frac{1}{p}$

t = time measured in intervals (t = 1 is one time interval), $\bar{T}$ = average number of time intervals between events
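
The result that the mean recurrence time equals 1/p can be checked numerically by truncating the infinite sum. This is a sketch under stated assumptions: the value p = 0.05 and the cut-off `max_t` are illustrative choices, and the neglected tail of the series is vanishingly small at that cut-off.

```python
def mean_recurrence_time(p, max_t=10_000):
    """Approximate E[T] = sum over t of t * p * (1 - p)^(t - 1), truncated at max_t."""
    return sum(t * p * (1 - p) ** (t - 1) for t in range(1, max_t + 1))


p = 0.05  # illustrative probability of the event in one interval
approx = mean_recurrence_time(p)
exact = 1 / p

print(approx, exact)
```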



Example
o A fixed offshore platform is designed for a wave height of 8m above the mean sea level
o This wave height corresponds to a 5% probability of being exceeded per year
1. What is the probability that the platform will be subjected to the design wave height
within the return period?
First, the return period is: 𝑇ത = 1/p = 1/0.05 = 20 years

Define success as a wave height above 8m, with p = 0.05; then we can use the geometric distribution CDF:

$P(H > 8 \text{ in 20 years}) = P(X \le 20) = F(20) = 1 - (1 - 0.05)^{20} = 0.6415$



Example
2. What is the probability that the first exceedance of the design wave height will occur
after the third year?
Use the complementary event and the geometric distribution CDF:
$P(H > 8 \text{ after 3 years}) = 1 - P(H > 8 \text{ in first 3 years}) = 1 - P(X \le 3)$
$P(H > 8 \text{ after 3 years}) = 1 - F(3) = 1 - [1 - (1 - 0.05)^3] = 0.8574$

Note - another way of asking the same question:
What is the probability that the design wave height is not exceeded in the first 3 years?
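
The two offshore platform results can be reproduced with the closed-form geometric CDF. This is a minimal sketch; `geometric_cdf` and the variable names are illustrative.

```python
def geometric_cdf(a, p):
    """P(X <= a): probability that the first success occurs within a trials."""
    return 1 - (1 - p) ** a


p = 0.05               # annual probability of exceeding the design wave height
return_period = 1 / p  # mean recurrence time in years

# 1. Design wave height exceeded at least once within the return period
p_within_return_period = geometric_cdf(round(return_period), p)

# 2. First exceedance occurs after the third year (complementary event)
p_after_three_years = 1 - geometric_cdf(3, p)

print(round(p_within_return_period, 4))  # 0.6415
print(round(p_after_three_years, 4))     # 0.8574
```

The second result equals 0.95 cubed, the probability of three consecutive years without an exceedance, which matches the rephrased question.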



Poisson distribution
o Discrete random variable:
number of times an event occurs in a time interval or spatial region
o Poisson process:
o The number of events in one interval / region is independent of the number
of events in another interval / region (no overlap in the intervals / regions)
o The probability an event occurs during a very short time interval (or small
region) is proportional to the length of the time interval (size of region) and
does not depend on the number of events occurring outside the time
interval / region

o The probability that more than one event will occur in such a short time
interval, or fall in such a small region, is negligible

[Portrait: Siméon Denis Poisson, 1781-1840]



Poisson distribution
o The given time interval may be of any length, such as a minute, a day, a week, a
month, or even a year
o The specified region could be a line segment, an area, a volume, or a piece of material
o Examples:
➢ Number of times a flight is delayed in a year
➢ Number of defects in a 1km length of pipe
➢ Number of severe storms in a region during ten years



Poisson distribution
o The Poisson process is related to the Binomial distribution

o Binomial distribution: in one interval (trial) an event either occurs or does not, thus an event
can occur at most once during an interval

o Poisson distribution: an event can occur more than once during an interval

o As the time interval gets shorter, the two descriptions converge, because the likelihood of
multiple events occurring in one interval decreases: a Binomial model with many short intervals
approaches the Poisson distribution
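
The link between the two distributions can be seen numerically by splitting one period into n short intervals, each with success probability λ/n, and comparing the Binomial PMF with the Poisson PMF with parameter λ. The sketch below assumes illustrative values λ = 2 and x = 3; the function names are chosen here.

```python
from math import comb, exp, factorial


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def poisson_pmf(x, lam):
    """Probability of exactly x events when the mean number of events is lam."""
    return lam**x * exp(-lam) / factorial(x)


lam = 2.0  # mean number of events in the whole period (illustrative)
x = 3      # number of events of interest (illustrative)

target = poisson_pmf(x, lam)
errors = []
for n in (10, 100, 1000, 10000):
    # n short intervals, each with probability lam / n of containing an event
    approx = binomial_pmf(x, n, lam / n)
    errors.append(abs(approx - target))
    print(n, approx, target)
```

The gap between the two values shrinks as n grows, that is, as each interval becomes shorter.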



Poisson distribution
o Poisson distribution: probability that x events occur in a time period (or region) of size t, given
the average rate of occurrence v (mean number of events per unit time or unit region):

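Poisson parameter: $\lambda = vt$

PMF: $f(x) = \frac{\lambda^x}{x!} e^{-\lambda}, \quad x = 0, 1, 2, \dots$

CDF: $P(X \le a) = F(a) = \sum_{y=0}^{a} \frac{\lambda^y}{y!} e^{-\lambda}$

Mean: $\mu_x = \lambda$

Variance: $\sigma_x^2 = \lambda$

[Figure: Poisson PMF for λ = 2]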



Example
o Historical records of severe storms at a North Sea offshore platform indicate an
average of 7.3 severe storms per year. Assuming that the occurrence of
severe storms can be modelled as a Poisson process:

1. What is the probability that there will not be any severe storms next year?
2. What is the probability of 7 severe storms next year?
3. What is the probability of 2 or more severe storms in the next 6 months?



Example
1. What is the probability that there would not be any severe storms next year?
Define the time interval as one year (t = 1), then v = 7.3 and λ = 1×7.3 = 7.3
Use Poisson distribution PMF: $P(X = 0) = f(0) = \frac{7.3^0}{0!} e^{-7.3} = \mathbf{6.76 \times 10^{-4}}$

2. What is the probability of 7 severe storms next year?


$P(X = 7) = f(7) = \frac{7.3^7}{7!} e^{-7.3} = \mathbf{0.148}$



Example
3. What is the probability of 2 or more severe storms in the next 6 months?
First, we need to redefine the Poisson parameter (change in period):
Time interval is now 6 months, or half a year (t = 0.5) and λ = 0.5×7.3 = 3.65
Then use the Poisson CDF and the complementary event (1 period is now 6 months):

$P(X \ge 2) = 1 - P(X \le 1) = 1 - F(1) = 1 - e^{-3.65} \left( \frac{3.65^0}{0!} + \frac{3.65^1}{1!} \right) = \mathbf{0.879}$
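
All three parts of the storm example can be reproduced with a few lines of Python. This is a minimal sketch using the standard library only; the function and variable names are illustrative.

```python
from math import exp, factorial


def poisson_pmf(x, lam):
    """Probability of exactly x events when the Poisson parameter is lam."""
    return lam**x * exp(-lam) / factorial(x)


def poisson_cdf(a, lam):
    """Probability of a or fewer events: sum of the PMF for y = 0, ..., a."""
    return sum(poisson_pmf(y, lam) for y in range(a + 1))


rate = 7.3  # average number of severe storms per year (v)

# Parts 1 and 2: period of one year, lambda = v * t = 7.3
lam_year = rate * 1.0
p_none = poisson_pmf(0, lam_year)   # about 6.755e-4
p_seven = poisson_pmf(7, lam_year)  # about 0.148

# Part 3: period of six months, lambda = 7.3 * 0.5 = 3.65
lam_half_year = rate * 0.5
p_two_or_more = 1 - poisson_cdf(1, lam_half_year)  # about 0.879

print(p_none, p_seven, p_two_or_more)
```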



Dependent events in a sequence
o Both the Binomial and Poisson distributions assume the occurrence of an event within
an interval (or for a trial) is independent of events within previous intervals or trials

o However, the likelihood of an event occurring may be dependent on previous events (or
earlier trials), and thus could involve conditional probabilities

o If this conditional probability depends on the immediately preceding trial (or interval),
the resulting model is a Markov chain (or Markov Process) – beyond this course!



Identify the correct distributions
Random variables are:
RV1: number of years until a magnitude 7 earthquake
RV2: number of accidents at a factory in one year

A) RV1 = Binomial, RV2 = Geometric
B) RV1 = Geometric, RV2 = Poisson
C) RV1 = Poisson, RV2 = Binomial
D) RV1 = Geometric, RV2 = Binomial
E) RV1 = Binomial, RV2 = Poisson
F) RV1 = Poisson, RV2 = Geometric



Summary of discrete distributions
| Distribution | Random variable | PMF |
|---|---|---|
| Binomial | Number of successes in n trials | $f(x) = \frac{n!}{x!\,(n - x)!}\, p^x (1 - p)^{n - x}, \quad x = 0, 1, 2, \dots, n$ |
| Geometric | First success in a sequence of trials | $f(x) = (1 - p)^{x - 1} p, \quad x = 1, 2, \dots$ |
| Poisson | Number of events in a time period (or region) t, average no. events v | $\lambda = vt, \quad f(x) = \frac{\lambda^x}{x!} e^{-\lambda}, \quad x = 0, 1, 2, \dots$ |
