18.05 Lecture 11
February 28, 2005
A pair (X, Y) of random variables:
f(x, y): joint p.f. (discrete), joint p.d.f. (continuous)
Marginal Distributions: $f(x) = \sum_y f(x, y)$ - p.f. of X (discrete)
$f(x) = \int f(x, y)\,dy$ - p.d.f. of X (continuous)
Conditional Distributions
Discrete Case:
$$P(X = x \mid Y = y) = \frac{P(X = x, Y = y)}{P(Y = y)} = \frac{f(x, y)}{f(y)} = f(x \mid y)$$
conditional p.f. of X given Y = y. Note: defined when f(y) is positive.
$$f(y \mid x) = \frac{f(x, y)}{f(x)}$$
conditional p.f. of Y given X = x. Note: defined when f(x) is positive.
If the marginal probabilities are zero, conditional probability is undefined.
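The discrete formulas can be sketched in a few lines of Python. This is a minimal illustration, not part of the lecture: the joint table `joint` and the helper names `marginal_x`, `marginal_y`, and `conditional_x_given_y` are hypothetical, and the probabilities are made up for the example.

```python
from fractions import Fraction

# Hypothetical joint p.f. f(x, y) on a 2x2 grid (values chosen for illustration).
joint = {
    (0, 0): Fraction(1, 8), (0, 1): Fraction(3, 8),
    (1, 0): Fraction(2, 8), (1, 1): Fraction(2, 8),
}

def marginal_x(joint):
    """f(x) = sum over y of f(x, y)."""
    out = {}
    for (x, _y), p in joint.items():
        out[x] = out.get(x, 0) + p
    return out

def marginal_y(joint):
    """f(y) = sum over x of f(x, y)."""
    out = {}
    for (_x, y), p in joint.items():
        out[y] = out.get(y, 0) + p
    return out

def conditional_x_given_y(joint, y):
    """f(x | y) = f(x, y) / f(y); raises if f(y) = 0, where it is undefined."""
    fy = marginal_y(joint).get(y, 0)
    if fy == 0:
        raise ValueError("f(y) = 0, so f(x | y) is undefined")
    return {x: p / fy for (x, yy), p in joint.items() if yy == y}
```

With this table, `marginal_y(joint)` assigns 3/8 to y = 0 and 5/8 to y = 1, and `conditional_x_given_y(joint, 1)` divides the column y = 1 by 5/8. Asking for a value of y with zero marginal probability raises an error, matching the note above.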
Continuous Case:
The formulas are the same, but we cannot treat them as exact probabilities at fixed points, since P(Y = y) = 0.
Consider instead in terms of probability density:
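Conditional c.d.f. of X given Y = y:
$$P(X \le x \mid Y \in [y - \delta, y + \delta]) = \frac{P(X \le x,\ Y \in [y - \delta, y + \delta])}{P(Y \in [y - \delta, y + \delta])}$$
Joint p.d.f. f(x, y), $P(A) = \iint_A f(x, y)\,dx\,dy$:
$$= \frac{\frac{1}{2\delta} \int_{y-\delta}^{y+\delta} \int_{-\infty}^{x} f(x, y)\,dx\,dy}{\frac{1}{2\delta} \int_{y-\delta}^{y+\delta} \int_{-\infty}^{\infty} f(x, y)\,dx\,dy}$$
As $\delta \to 0$, this converges to:
$$\frac{\int_{-\infty}^{x} f(x, y)\,dx}{\int_{-\infty}^{\infty} f(x, y)\,dx} = \frac{\int_{-\infty}^{x} f(x, y)\,dx}{f(y)}$$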
Conditional c.d.f.:
$$P(X \le x \mid Y = y) = \frac{\int_{-\infty}^{x} f(x, y)\,dx}{f(y)}$$
Conditional p.d.f.:
$$f(x \mid y) = \frac{\partial}{\partial x} P(X \le x \mid Y = y) = \frac{f(x, y)}{f(y)}$$
Same result as discrete.
Also, f(x|y) is only defined when f(y) > 0.
Multiplication Rule
f (x, y) = f (x|y)f (y)
Bayes’s Theorem:
$$f(y \mid x) = \frac{f(x, y)}{f(x)} = \frac{f(x \mid y) f(y)}{\int f(x, y)\,dy} = \frac{f(x \mid y) f(y)}{\int f(x \mid y) f(y)\,dy}$$
Bayes’s formula for random variables: for each y, you know the distribution of x, that is, f(x|y).
Note: in the discrete case, replace $\int$ with $\sum$.
In statistics, after observing the data, use Bayes’s formula to infer the parameter.
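As a rough sketch of the discrete version of Bayes’s formula, the snippet below treats y as an unknown parameter with a prior f(y) and x as the observed data. The setup is an assumption made for illustration only (a coin whose heads probability y is either 1/4 or 3/4, flipped n = 3 times); the names `posterior`, `binom_pf`, and `prior` are hypothetical.

```python
from fractions import Fraction
from math import comb

def posterior(prior, likelihood, x):
    """Return f(y | x) = f(x | y) f(y) / sum_y f(x | y) f(y) for each y."""
    joint = {y: likelihood(x, y) * p for y, p in prior.items()}  # f(x | y) f(y)
    fx = sum(joint.values())  # marginal f(x)
    if fx == 0:
        raise ValueError("f(x) = 0, so f(y | x) is undefined")
    return {y: v / fx for y, v in joint.items()}

def binom_pf(x, y, n=3):
    """Assumed likelihood f(x | y): x heads in n flips with heads probability y."""
    return comb(n, x) * y**x * (1 - y) ** (n - x)

# Assumed prior f(y): the two candidate biases are equally likely.
prior = {Fraction(1, 4): Fraction(1, 2), Fraction(3, 4): Fraction(1, 2)}

# Observe 3 heads in 3 flips and update.
post = posterior(prior, binom_pf, x=3)
```

After observing three heads, the weights f(x|y)f(y) are 1/128 and 27/128, so normalizing by their sum 28/128 gives posterior probabilities 1/28 and 27/28 for y = 1/4 and y = 3/4.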
Example: Draw X uniformly on [0, 1], then draw Y uniformly on [X, 1].
p.d.f.:
$$f(x) = 1 \times I(0 \le x \le 1), \quad f(y \mid x) = \frac{1}{1 - x} \times I(x \le y \le 1)$$
Joint p.d.f.:
$$f(x, y) = f(y \mid x) f(x) = \frac{1}{1 - x} \times I(0 \le x \le y \le 1)$$
Marginal:
$$f(y) = \int f(x, y)\,dx = \int_0^y \frac{1}{1 - x}\,dx = -\ln(1 - x)\Big|_0^y = -\ln(1 - y)$$
Keep in mind that this holds only for $y \in [0, 1]$; f(y) = 0 if $y \notin [0, 1]$.
Conditional (of X given Y):
$$f(x \mid y) = \frac{f(x, y)}{f(y)} = \frac{-1}{(1 - x)\ln(1 - y)}\, I(0 \le x \le y \le 1)$$
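One way to sanity-check the marginal f(y) = −ln(1 − y) is by simulation. The sketch below draws (X, Y) as in the example and compares the empirical c.d.f. of Y with the integral of −ln(1 − t) from 0 to y, which works out to y + (1 − y) ln(1 − y). The function names, sample size, and seed are arbitrary choices made for illustration.

```python
import math
import random

def sample_xy(rng):
    """Draw X uniformly on [0, 1], then Y uniformly on [X, 1]."""
    x = rng.random()
    y = rng.uniform(x, 1.0)
    return x, y

def cdf_y(y):
    """Integral of -ln(1 - t) over [0, y], i.e. y + (1 - y) ln(1 - y)."""
    if y <= 0:
        return 0.0
    if y >= 1:
        return 1.0
    return y + (1.0 - y) * math.log(1.0 - y)

def empirical_cdf_y(y, n=200_000, seed=0):
    """Fraction of n simulated draws with Y <= y."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n) if sample_xy(rng)[1] <= y)
    return hits / n
```

For a few values of y, for instance 0.25, 0.5, and 0.9, `empirical_cdf_y(y)` should agree with `cdf_y(y)` up to Monte Carlo error, which is on the order of 0.001 at this sample size.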
Multivariate Distributions
Consider n random variables: $X_1, X_2, \ldots, X_n$
Joint p.f.: $f(x_1, x_2, \ldots, x_n) = P(X_1 = x_1, \ldots, X_n = x_n) \ge 0$, $\sum f = 1$
Joint p.d.f.: $f(x_1, x_2, \ldots, x_n) \ge 0$, $\int f\,dx_1\,dx_2 \cdots dx_n = 1$
Marginal and conditional distributions are defined in the same way.
Define notation as vectors to simplify:
$\vec{X} = (X_1, \ldots, X_n)$, $\vec{x} = (x_1, \ldots, x_n)$
$\vec{X} = (\vec{Y}, \vec{Z})$ subsets of coordinates: $\vec{Y} = (X_1, \ldots, X_k)$, $\vec{y} = (y_1, \ldots, y_k)$
$\vec{Z} = (X_{k+1}, \ldots, X_n)$, $\vec{z} = (z_1, \ldots, z_{n-k})$
Joint p.d.f. or joint p.f. of $\vec{X}$: $f(\vec{x}) = f(\vec{y}, \vec{z})$
Marginal:
$$f(\vec{y}) = \int f(\vec{y}, \vec{z})\,d\vec{z}, \quad f(\vec{z}) = \int f(\vec{y}, \vec{z})\,d\vec{y}$$
Conditional:
$$f(\vec{y} \mid \vec{z}) = \frac{f(\vec{y}, \vec{z})}{f(\vec{z})}, \quad f(\vec{z} \mid \vec{y}) = \frac{f(\vec{y} \mid \vec{z}) f(\vec{z})}{\int f(\vec{y} \mid \vec{z}) f(\vec{z})\,d\vec{z}}$$
Functions of Random Variables
Consider a random variable X and a function $r: \mathbb{R} \to \mathbb{R}$.
Let Y = r(X); we want to calculate the distribution of Y.
Discrete Case:
Discrete p.f.:
$$f(y) = P(Y = y) = P(r(X) = y) = P(x : r(x) = y) = \sum_{x : r(x) = y} P(X = x) = \sum_{x : r(x) = y} f(x)$$
(very similar to “change of variable”)
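The discrete formula amounts to grouping the probabilities of all x that map to the same y. A minimal sketch, assuming a hypothetical X that is uniform on {−2, −1, 0, 1, 2} and r(x) = x² (neither is from the lecture); the helper name `pf_of_function` is also an illustration:

```python
from fractions import Fraction

def pf_of_function(pf_x, r):
    """Return the p.f. of Y = r(X): f(y) = sum of f(x) over all x with r(x) = y."""
    out = {}
    for x, p in pf_x.items():
        y = r(x)
        out[y] = out.get(y, 0) + p
    return out

# Assumed example: X uniform on {-2, -1, 0, 1, 2}, Y = X^2.
pf_x = {x: Fraction(1, 5) for x in (-2, -1, 0, 1, 2)}
pf_y = pf_of_function(pf_x, lambda x: x * x)
```

Here y = 1 and y = 4 each collect the mass of two values of x, so they get probability 2/5, while y = 0 gets 1/5.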
Continuous Case:
Find the c.d.f. of Y = r(X) first.
$$P(Y \le y) = P(r(X) \le y) = P(x : r(x) \le y) = P(A(y)) = \int_{A(y)} f(x)\,dx$$
p.d.f.:
$$f(y) = \frac{\partial}{\partial y} \int_{A(y)} f(x)\,dx$$
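A short worked instance of the continuous recipe may help. The choice of X uniform on [0, 1] and r(x) = x² is an assumption made for illustration, not an example from the lecture. For 0 ≤ y ≤ 1, the set A(y) = {x : x² ≤ y} meets the support of X in [0, √y]:

```latex
\begin{align*}
P(Y \le y) &= P(X^2 \le y) = \int_{A(y)} f(x)\,dx = \int_0^{\sqrt{y}} 1\,dx = \sqrt{y},
  \qquad 0 \le y \le 1, \\
f(y) &= \frac{\partial}{\partial y} \sqrt{y} = \frac{1}{2\sqrt{y}},
  \qquad 0 < y \le 1,
\end{align*}
```

and f(y) = 0 outside [0, 1]. As a check, integrating 1/(2√y) over (0, 1] gives √1 − √0 = 1.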
** End of Lecture 11