Chapter 7
Functions of random variables
Outline of the chapter
1. Motivation and objective
2. Cdf approach
3. Method of transformation
4. Method of mgf
5. Some distributions related to the normal
5.1 Chi-squared distribution
5.2 Student distribution
5.3 Fisher distribution
1. Motivation and objective
Motivation
In the statistical part of the course, we will try to make inference about
a population based on the information contained in a collected sample.
From a modeling point of view, each collected observation will be
considered as the realization of a r.v..
Example
• In order to estimate the value of the true unknown mean µ of a population, we have a
collection Y1, . . . , Yn, and we compute Ȳ := n^{−1} Σ_{i=1}^{n} Yi.
• Y1, . . . , Yn are r.v.'s ⇒ Ȳ is also a r.v., i.e. for different samples the observed value of
Ȳ will be different.
• In order to make the link between Ȳ (a r.v.) and µ (deterministic), we will need to be
able to compute probabilities with respect to Ȳ.
In general: if Y1 , . . . , Yn are r.v’s and g is a known function, what is the
cdf of X = g(Y1 , . . . , Yn )?
1. Motivation and objective
Example 1
Let us consider a production line with two independent workstations.
Times of assembly: Y1 ∼ Exp(β), Y2 ∼ Exp(β), for some β > 0.
Total duration of assembly: X = Y1 + Y2 . What is the distribution of X ?
Recall that:
m_{Yj}(t) = (1 − βt)^{−1}, j ∈ {1, 2}.
There is a one-to-one correspondence between mgf and distribution.
Hence, since Y1 ⊥⊥ Y2,
m_X(t) = E(e^{tY1}) E(e^{tY2}) = (1 − βt)^{−2},
which is the mgf of a r.v. with Gamma(2, β) distribution.
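A quick Monte Carlo check of this result (a sketch; Exp(β) is taken with mean β, matching the mgf above):

```python
import numpy as np

rng = np.random.default_rng(0)
beta = 2.0
n_sim = 200_000

# Y1, Y2 independent Exp(beta), mean beta (scale parametrization)
y1 = rng.exponential(scale=beta, size=n_sim)
y2 = rng.exponential(scale=beta, size=n_sim)
x = y1 + y2

# Gamma(2, beta) has mean 2*beta = 4 and variance 2*beta**2 = 8
print(x.mean(), x.var())
```

A histogram of x can likewise be compared with the Gamma(2, β) density.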
1. Motivation and objective
Example 2
Y1 , . . . , Yn iid ∼ Exp(β), for some β > 0.
X := min(Y1 , . . . , Yn ). What is the distribution of X ?
We compute
F_X(t) = P(X ≤ t) = 1 − P(X > t)
       = 1 − P(Y1 > t, . . . , Yn > t)        by definition of X
       = 1 − P(Y1 > t) · · · P(Yn > t)        by independence
       = 1 − (1 − F_Y(t))^n                   by ‘identically distributed’
       = 1 − [1 − (1 − e^{−t/β})]^n           by the exponential distribution (mean β)
       = 1 − e^{−nt/β},  t > 0.
Hence X ∼ Exp(β/n): the minimum is again exponential, with mean β/n.
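Stated parametrization-free: the minimum of n iid exponential r.v.'s with mean m is again exponential, with mean m/n. A minimal simulation sketch of this:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5          # sample size
m = 3.0        # common mean of the Y_i
n_sim = 200_000

# Each row holds one sample Y_1, ..., Y_n; X is the row-wise minimum
y = rng.exponential(scale=m, size=(n_sim, n))
x = y.min(axis=1)

# X should be exponential with mean m/n = 0.6 (and variance (m/n)**2 = 0.36)
print(x.mean(), x.var())
```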
1. Motivation and objective
Objective
Find the distribution (or pmf or pdf) of U = h(Y1 , . . . , Yn ), with Y1 , . . . , Yn
independent r.v’s. Classical approaches, written for n = 1:
(1) Via the cdf:
• Discrete: P(h(Y) ∈ A) = Σ_y 1(h(y) ∈ A) p_Y(y).
• Continuous: P(h(Y) ∈ A) = ∫ 1(h(y) ∈ A) f_Y(y) dy.
(2) Via the method of ‘transformation’, a special case of (1) for strictly
monotone h.
(3) Via the mgf.
Remark
Any one of these may be employed to find the distribution of a given function
of the variables, but one of the methods usually leads to a simpler derivation
than the others. The method that works “best” varies from one application to
another.
2. Cdf approach
Principle
Recall that:
Discrete: P(h(Y) ∈ A) = Σ_y 1(h(y) ∈ A) p_Y(y).
Continuous: P(h(Y) ∈ A) = ∫ 1(h(y) ∈ A) f_Y(y) dy.
Hence, the cdf of U is
F_U(u) = P(U ≤ u) = Σ_{y: h(y) ≤ u} p_Y(y)      (discrete case),
F_U(u) = P(U ≤ u) = ∫_{y: h(y) ≤ u} f_Y(y) dy   (continuous case).
Note that this approach always works, but the calculations may be heavy.
2. Cdf approach
Example 1
The result of a game of chance is a r.v. Y with probability function

  y          −2     −1     0     1     2
  P(Y = y)   0.05   0.2    0.3   0.4   0.05

The profit (+ or −) is a r.v. U = 2Y², with probability function

  u          0     2     8
  P(U = u)   0.3   0.6   0.1
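The second table follows mechanically from the first by the discrete cdf approach: sum p_Y(y) over all y with h(y) = u. A small sketch:

```python
# pmf of Y, read off the first table
p_Y = {-2: 0.05, -1: 0.2, 0: 0.3, 1: 0.4, 2: 0.05}

def h(y):
    """The transformation U = 2Y^2."""
    return 2 * y**2

# pmf of U: accumulate p_Y(y) over all y mapping to the same u
p_U = {}
for y, p in p_Y.items():
    p_U[h(y)] = p_U.get(h(y), 0.0) + p

# u = 0 -> 0.3, u = 2 -> 0.6, u = 8 -> 0.1 (up to float rounding)
print(p_U)
```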
2. Cdf approach
Example 2
A density function sometimes used by engineers to model lengths of life of
electronic components is the Rayleigh density, given by
f_Y(y) = (2y/θ) e^{−y²/θ} 1(y > 0),
for some θ > 0.
Let U = Y 2 . What is the pdf fU of U?
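Working through the cdf approach gives F_U(u) = P(Y ≤ √u) = 1 − e^{−u/θ}, i.e. U ∼ Exp(θ) with mean θ. A simulation sketch (Rayleigh samples are drawn by inverse transform, inverting F_Y):

```python
import numpy as np

rng = np.random.default_rng(2)
theta = 4.0
n_sim = 200_000

# Inverse transform for the Rayleigh cdf F_Y(y) = 1 - exp(-y^2/theta):
# Y = sqrt(-theta * ln(1 - V)) with V ~ Unif[0, 1]
v = rng.uniform(size=n_sim)
y = np.sqrt(-theta * np.log1p(-v))

u = y**2

# U should be exponential with mean theta = 4 (variance theta**2 = 16)
print(u.mean(), u.var())
```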
2. Cdf approach
Example 3
Consider a bivariate density for (Y1 , Y2 ) given by
f (y1 , y2 ) = 3y1 1(0 ≤ y2 ≤ y1 ≤ 1).
Let U = Y1 − Y2 . What is the pdf fU of U?
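A simulation sketch. To sample from f, note (my derivation from the density above, not part of the slides) that the marginal of Y1 is 3y1² on [0, 1], so Y1 = V^{1/3} for V uniform, and that, given Y1, Y2 is uniform on [0, Y1]:

```python
import numpy as np

rng = np.random.default_rng(3)
n_sim = 200_000

# Marginal of Y1: integrating f over y2 in [0, y1] gives 3*y1**2 on [0, 1],
# so inverse-transform sampling yields Y1 = V**(1/3) with V uniform
y1 = rng.uniform(size=n_sim) ** (1.0 / 3.0)
# Conditional of Y2 given Y1 = y1: uniform on [0, y1]
y2 = rng.uniform(size=n_sim) * y1

u = y1 - y2

# E[Y1] = 3/4 and E[Y2] = E[Y1]/2 = 3/8, hence E[U] = 3/8 = 0.375
print(u.mean())
```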
3. Method of transformation
Let Y have probability density function f_Y(y). If h(y) is strictly monotone and
differentiable for all y such that f_Y(y) > 0, then U = h(Y) has density function
f_U(u) = f_Y(h^{−1}(u)) · |d[h^{−1}(u)]/du|.
Proof
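For strictly decreasing h, the cdf approach gives the result directly (the increasing case is analogous):

```latex
F_U(u) = P(h(Y) \le u)
       = P\bigl(Y \ge h^{-1}(u)\bigr)
       = 1 - F_Y\bigl(h^{-1}(u)\bigr)
\quad\Longrightarrow\quad
f_U(u) = -f_Y\bigl(h^{-1}(u)\bigr)\,\frac{d[h^{-1}(u)]}{du}
       = f_Y\bigl(h^{-1}(u)\bigr)\left|\frac{d[h^{-1}(u)]}{du}\right|,
```

since d[h^{−1}(u)]/du < 0 when h is decreasing.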
3. Method of transformation
Example 1
What is the pdf f_U of U = −4Y + 3 where Y has pdf
f_Y(y) = 2y 1(0 ≤ y ≤ 1)?
The function h(y) = −4y + 3 is strictly decreasing for all y. We have
• h^{−1}(u) = (3 − u)/4,
• d[h^{−1}(u)]/du = −1/4.
Hence,
f_U(u) = 2 · ((3 − u)/4) · |−1/4| · 1(0 ≤ (3 − u)/4 ≤ 1)
       = ((3 − u)/8) 1(−1 ≤ u ≤ 3).
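A numerical sanity check of this density (a sketch; Y is sampled by inverse transform, since F_Y(y) = y² on [0, 1]):

```python
import numpy as np

rng = np.random.default_rng(4)
n_sim = 200_000

# F_Y(y) = y**2 on [0, 1], so inverse-transform sampling gives Y = sqrt(V)
y = np.sqrt(rng.uniform(size=n_sim))
u = -4.0 * y + 3.0

# From f_U(u) = (3 - u)/8 on [-1, 3]:
# E[U] = integral of u*(3 - u)/8 du over [-1, 3] = 1/3
print(u.mean())
```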
3. Method of transformation
Example 2
Let U ∼ Unif [0, 1]. What is the pdf of Y = −β ln(1 − U) for some β > 0?
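Inverting the map shows F_Y(y) = P(−β ln(1 − U) ≤ y) = 1 − e^{−y/β}, i.e. Y ∼ Exp(β) with mean β. This is the classical inverse-transform recipe for simulating exponential r.v.'s; a sketch:

```python
import numpy as np

rng = np.random.default_rng(5)
beta = 2.0
n_sim = 200_000

u = rng.uniform(size=n_sim)     # U ~ Unif[0, 1]
y = -beta * np.log1p(-u)        # Y = -beta * ln(1 - U)

# Y should be exponential with mean beta = 2 (variance beta**2 = 4)
print(y.mean(), y.var())
```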
4. Method of mgf
Principle already applied several times in the course. Particularly useful
when U = Σ_{i=1}^{n} Yi with Y1, . . . , Yn independent r.v.'s.
  Summands                Sum
  Yi ∼ Be(p)              U ∼ Bin(n, p)
  Yi ∼ Bin(ni, p)         U ∼ Bin(Σ_{i=1}^{n} ni, p)
  Yi ∼ P(λi)              U ∼ P(Σ_{i=1}^{n} λi)
  Yi ∼ Gamma(αi, β)       U ∼ Gamma(Σ_{i=1}^{n} αi, β)
  Yi ∼ Exp(β)             U ∼ Gamma(n, β)
  Yi ∼ N(µi, σi²)         U ∼ N(Σ_{i=1}^{n} µi, Σ_{i=1}^{n} σi²)
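One row of the table, checked by simulation (a sketch; the Poisson row, with distinct rates λi):

```python
import numpy as np

rng = np.random.default_rng(6)
lams = np.array([0.5, 1.0, 2.5])   # the rates lambda_i
n_sim = 200_000

# Each column j holds draws of Y_j ~ P(lambda_j); U is the row sum
y = rng.poisson(lam=lams, size=(n_sim, len(lams)))
u = y.sum(axis=1)

# U should be Poisson(0.5 + 1.0 + 2.5 = 4): mean = variance = 4
print(u.mean(), u.var())
```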
5. Some distributions related to the normal
To conclude this chapter, we consider different continuous distributions
that are related to the normal distribution.
As we will see later, the normal distribution plays a central role in
statistics.
Although the need for defining these distributions is not apparent in this
chapter, these tools will be essential for the statistical part of the course.
The distributions are the following:
‘Chi-squared’
‘Student’
‘Fisher’
5.1 Chi-squared distribution
Probability density function
A r.v. X defined on R+ follows a Chi-squared distribution with ν degrees
of freedom, denoted X ∼ χ²ν for ν ∈ N0, if it is a gamma-distributed random
variable with parameters α = ν/2 and β = 2.
Link with the Normal distribution
If Z1, . . . , Zn are iid N(0, 1), then:
• Y = Zi² (for any single i = 1, . . . , n) satisfies Y ∼ χ²1;
• Y = Σ_{i=1}^{n} Zi² satisfies Y ∼ χ²n.
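Not a proof, but a quick Monte Carlo check of the second statement (a sketch; χ²n = Gamma(n/2, 2) has mean n and variance 2n):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 6
n_sim = 200_000

# Sum of n squared iid standard normals
z = rng.standard_normal(size=(n_sim, n))
y = (z**2).sum(axis=1)

# chi^2_n should have mean n = 6 and variance 2n = 12
print(y.mean(), y.var())
```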
Proof
5.1 Chi-squared distribution
Probability density function
[Figure: plot of the χ²ν probability density function]
5.2 The Student t - distribution
Probability density function
Let Z and Y be two independent r.v.'s, Z ∼ N(0, 1) and Y ∼ χ²n for n ∈ N0. A
r.v. T defined on R follows a Student t distribution with n degrees of
freedom, denoted T ∼ tn, if
T = Z / √(Y/n).
Its pdf is
f_T(t) = [Γ((n + 1)/2) / (√(nπ) Γ(n/2))] · (1 + t²/n)^{−(n+1)/2},  t ∈ R.
5.2 The Student t - distribution
Properties
Like the normal distribution, the tn distribution is symmetric around 0.
As n grows, the tn distribution approaches the N(0, 1) distribution:
lim_{n→∞} [Γ((n + 1)/2) / (√(nπ) Γ(n/2))] · (1 + t²/n)^{−(n+1)/2} = (1/√(2π)) e^{−t²/2}.
In practice, if T ∼ tn with n ≥ 30, then one may simply assume that
T ∼ N(0, 1).
E(T) = 0 = E(Z), but V(T) = n/(n − 2) for n > 2, while V(Z) = 1.
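These properties can be checked by building T directly from its definition (a sketch):

```python
import numpy as np

rng = np.random.default_rng(8)
n = 10
n_sim = 400_000

# T = Z / sqrt(Y/n) with Z ~ N(0,1) independent of Y ~ chi^2_n
z = rng.standard_normal(n_sim)
y = rng.chisquare(df=n, size=n_sim)
t = z / np.sqrt(y / n)

# E[T] = 0 and V[T] = n/(n - 2) = 1.25 for n = 10
print(t.mean(), t.var())
```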
5.2 The Student t - distribution
Probability density function
[Figure: plot of the tn probability density function]
5.3 The Fisher distribution
Probability density function
Let Y1 and Y2 be two independent r.v.'s, Y1 ∼ χ²n1 and Y2 ∼ χ²n2 for n1, n2 ∈ N0.
A r.v. F defined on R+ follows a Fisher distribution with n1 and n2
degrees of freedom, denoted F ∼ F_{n1,n2}, if
F = (Y1/n1) / (Y2/n2).
Properties
Defined on R+ (ratio of two positive values).
If F ∼ Fn1 ,n2 , then 1/F ∼ Fn2 ,n1 .
If T ∼ tn , then T 2 ∼ F1,n .
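The last property can be checked by simulation (a sketch; the comparison uses E[F_{1,n}] = n/(n − 2), a standard fact about the Fisher distribution):

```python
import numpy as np

rng = np.random.default_rng(9)
n = 8
n_sim = 400_000

# T^2 with T ~ t_n, built from the definition of the Student t
z = rng.standard_normal(n_sim)
t2 = (z / np.sqrt(rng.chisquare(df=n, size=n_sim) / n)) ** 2

# F ~ F_{1,n}, built from two independent chi-squares
f = rng.chisquare(df=1, size=n_sim) / (rng.chisquare(df=n, size=n_sim) / n)

# Both should have the same distribution; E[F_{1,n}] = n/(n - 2) = 4/3
print(t2.mean(), f.mean())
```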
5.3 The Fisher distribution
Probability density function
[Figure: plot of the F_{n1,n2} probability density function]