Lecture 03
Exercise 1.1 (Countably infinite coin tosses). Consider a sequence of coin tosses, such that the
sample space is Ω = {H, T}^N. For the sets of outcomes E_n ≜ {ω ∈ Ω : ω_n = H}, we consider an event
space generated by F ≜ σ({E_n : n ∈ N}). Let F_n be the event space generated by the first n coin
tosses, i.e. F_n ≜ σ({E_i : i ∈ [n]}). Let A_n be the set of outcomes corresponding to at least one head
in the first n outcomes, A_n ≜ {ω ∈ Ω : ω_i = H for some i ∈ [n]} = ∪_{i=1}^n E_i ∈ F, and B_n be the set of
outcomes corresponding to the first head at the nth outcome, B_n ≜ {ω ∈ Ω : ω_1 = · · · = ω_{n−1} = T, ω_n = H} =
∩_{i=1}^{n−1} E_i^c ∩ E_n ∈ F.
1. Show that F = σ ({Fn : n ∈ N}).
Theorem 1.2 (Law of total probability). For a probability space (Ω, F, P), consider a sequence of events B ∈ F^N
that partitions the sample space Ω, i.e. B_m ∩ B_n = ∅ for all m ≠ n, and ∪_{n∈N} B_n = Ω. Then, for any event A ∈ F,
we have

P(A) = ∑_{n∈N} P(A ∩ B_n).
Proof. We can expand any event A ∈ F in terms of any partition B of the sample space Ω as

A = A ∩ Ω = A ∩ (∪_{n∈N} B_n) = ∪_{n∈N} (A ∩ B_n).

From the mutual disjointness of the events B ∈ F^N, it follows that the sequence (A ∩ B_n ∈ F : n ∈ N) is
mutually disjoint. The result follows from the countable additivity of probability for disjoint events.
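As a quick sanity check of this identity, the partition expansion can be verified by exact enumeration on a small finite model; the two-toss sample space and the partition by the first toss below are illustrative choices, not part of the theorem.

```python
from fractions import Fraction
from itertools import product

# Finite model: two fair coin tosses, uniform probability on singletons.
omega = [''.join(t) for t in product('HT', repeat=2)]
prob = {w: Fraction(1, 4) for w in omega}

def P(event):
    """Probability of an event given as a set of outcomes."""
    return sum(prob[w] for w in event)

# Partition of the sample space by the outcome of the first toss.
B = [{w for w in omega if w[0] == 'H'}, {w for w in omega if w[0] == 'T'}]

# Event A: at least one head in the two tosses.
A = {w for w in omega if 'H' in w}

# Law of total probability: P(A) = sum over n of P(A ∩ B_n).
assert P(A) == sum(P(A & Bn) for Bn in B) == Fraction(3, 4)
```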
Example 1.3 (Countably infinite coin tosses). Consider the sample space Ω = {H, T}^N and event space F
generated by the sequence E ∈ F^N defined in Exercise 1.1. We observe that any event A ∈ F_n can be written
as a union of cylinder sets,

A = ∪_{ω∈A} ∩_{i=1}^n A_i(ω), where A_i(ω) ≜ E_i if ω ∈ E_i, and A_i(ω) ≜ E_i^c if ω ∉ E_i.
2 Independence
Definition 2.1 (Independence of events). For a probability space (Ω, F, P), a family of events A ∈ F^I is said
to be independent, if for any finite set F ⊆ I, we have

P(∩_{i∈F} A_i) = ∏_{i∈F} P(A_i).
Remark 1. The certain event Ω and the impossible event ∅ are independent of every event A ∈ F.
Example 2.2 (Two coin tosses). Consider two coin tosses, such that the sample space is Ω =
{HH, HT, TH, TT}, and the event space is F = P(Ω). It suffices to define a probability function P : F → [0, 1]
on the singleton outcomes. We define one such probability function P, such that

P({HH}) = P({HT}) = P({TH}) = P({TT}) = 1/4.
Let events E1 ≜ {HH, HT} and E2 ≜ {HH, TH} correspond to getting a head on the first and the second toss,
respectively.
From the defined probability function, the probability of getting a tail on the first or the second toss is
1/2, identical to the probability of getting a head on the first or the second toss. That is, P(E1) = P(E2) = 1/2,
and the intersecting event E1 ∩ E2 = {HH} has probability P(E1 ∩ E2) = 1/4. Thus, for events E1, E2 ∈ F,
we have

P(E1 ∩ E2) = P(E1)P(E2),

i.e. the events E1 and E2 are independent.
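The computation in this example can be checked mechanically by enumerating the four outcomes; the helper `P` below is an illustrative construction.

```python
from fractions import Fraction

# Uniform probability on the four outcomes of two coin tosses.
omega = ['HH', 'HT', 'TH', 'TT']
prob = {w: Fraction(1, 4) for w in omega}

def P(event):
    return sum(prob[w] for w in event)

E1 = {'HH', 'HT'}  # head on the first toss
E2 = {'HH', 'TH'}  # head on the second toss

# P(E1) = P(E2) = 1/2 and P(E1 ∩ E2) = 1/4 = P(E1) P(E2).
assert P(E1) == P(E2) == Fraction(1, 2)
assert P(E1 & E2) == P(E1) * P(E2) == Fraction(1, 4)
```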
Example 2.3 (Countably infinite coin tosses). Consider the outcome space Ω = {H, T}^N and event space
F generated by the sequence E defined in Exercise 1.1. We define a probability function P : F → [0, 1] by
P(∩_{i∈F} E_i) = p^{|F|} for any finite subset F ⊆ N. By definition, E ∈ F^N is a sequence of independent events.
Consider A, B ∈ F^N, where A_n ≜ ∪_{i=1}^n E_i and B_n ≜ ∩_{i=1}^{n−1} E_i^c ∩ E_n ∈ F for all n ∈ N. It follows that
P(A_n) = 1 − (1 − p)^n and P(B_n) = p(1 − p)^{n−1} for all n ∈ N.
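Assuming each toss shows a head with probability p independently of the others, these two formulas can be verified by exact enumeration of the first n tosses; the values of p and n below are arbitrary illustrative choices.

```python
from fractions import Fraction
from itertools import product

p = Fraction(1, 3)  # bias of the coin; an arbitrary illustrative value
n = 5

outcomes = [''.join(t) for t in product('HT', repeat=n)]

def prob(w):
    """Product measure: factor p per head, (1 - p) per tail."""
    k = w.count('H')
    return p**k * (1 - p)**(n - k)

# A_n: at least one head in the first n tosses.
P_An = sum(prob(w) for w in outcomes if 'H' in w)
# B_n: first head occurs at the nth toss, i.e. the outcome T...TH.
P_Bn = sum(prob(w) for w in outcomes if w == 'T' * (n - 1) + 'H')

assert P_An == 1 - (1 - p)**n
assert P_Bn == p * (1 - p)**(n - 1)
```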
For any ω ∈ Ω, we can define the number of heads in the first n trials by k_n(ω) ≜ ∑_{i=1}^n 1_{H}(ω_i) =
∑_{i=1}^n 1{ω ∈ E_i}. For any general event A ∈ F_n = σ({E_i : i ∈ [n]}), we can write

P(A) = ∑_{ω∈A} ∏_{i=1}^n [P(E_i)1{ω ∈ E_i} + P(E_i^c)1{ω ∉ E_i}] = ∑_{ω∈A} p^{k_n(ω)} (1 − p)^{n−k_n(ω)}.
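The weights p^{k_n(ω)} (1 − p)^{n−k_n(ω)} can be checked to sum to one over all length-n outcomes, and summed over any event A ∈ F_n to give P(A); the bias p, the length n, and the event "exactly two heads" below are illustrative choices.

```python
from fractions import Fraction
from itertools import product
from math import comb

p = Fraction(1, 4)  # arbitrary illustrative bias
n = 4
outcomes = [''.join(t) for t in product('HT', repeat=n)]

def weight(w):
    """p^{k_n(w)} (1 - p)^{n - k_n(w)}, with k_n(w) the number of heads."""
    k = w.count('H')
    return p**k * (1 - p)**(n - k)

# The weights sum to one over all length-n outcomes ...
assert sum(weight(w) for w in outcomes) == 1

# ... and summing them over an event A in F_n gives P(A);
# e.g. A = {exactly two heads in four tosses}.
A = [w for w in outcomes if w.count('H') == 2]
assert sum(weight(w) for w in A) == comb(4, 2) * p**2 * (1 - p)**2
```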
Example 2.4 (Counter example). Consider a probability space (Ω, F, P) and the events A1 , A2 , A3 ∈ F. The
condition P( A1 ∩ A2 ∩ A3 ) = P( A1 ) P( A2 ) P( A3 ) is not sufficient to guarantee independence of the three
events. In particular, we see that if
P(A1 ∩ A2 ∩ A3) = P(A1)P(A2)P(A3),    P(A1 ∩ A2 ∩ A3^c) ≠ P(A1)P(A2)P(A3^c),

then P(A1 ∩ A2) = P(A1 ∩ A2 ∩ A3) + P(A1 ∩ A2 ∩ A3^c) ≠ P(A1)P(A2).
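A concrete instance (constructed here, not part of the notes) makes the counterexample explicit: on a uniform eight-point space, the sets below satisfy the triple product condition while failing pairwise independence.

```python
from fractions import Fraction

# Uniform probability on Omega = {1, ..., 8}.
omega = set(range(1, 9))

def P(event):
    return Fraction(len(event), len(omega))

A1 = {1, 2, 3, 4}
A2 = {1, 2, 3, 5}
A3 = {1, 6, 7, 8}

# The triple product condition holds: both sides equal 1/8 ...
assert P(A1 & A2 & A3) == P(A1) * P(A2) * P(A3)
# ... yet A1 and A2 are not independent (3/8 vs 1/4),
# so the family (A1, A2, A3) is not independent.
assert P(A1 & A2) != P(A1) * P(A2)
```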
Definition 2.5. A family of collections of events (A_i ⊆ F : i ∈ I) is called independent, if for any finite set
F ⊆ I and A_i ∈ A_i for all i ∈ F, we have

P(∩_{i∈F} A_i) = ∏_{i∈F} P(A_i).
3 Conditional Probability
Consider N trials of a random experiment over an outcome space Ω and an event space F. Let ωn ∈ Ω
denote the outcome of the experiment of the nth trial. Consider two events A, B ∈ F and denote the number
of times event A and event B occurs by N ( A) and N ( B) respectively. We denote the number of times both
events A and B occurred by N ( A ∩ B). Then, we can write these numbers in terms of indicator functions as
N(A) = ∑_{n=1}^N 1{ω_n ∈ A},    N(B) = ∑_{n=1}^N 1{ω_n ∈ B},    N(A ∩ B) = ∑_{n=1}^N 1{ω_n ∈ A ∩ B}.
We denote the relative frequency of events A, B, A ∩ B in N trials by N(A)/N, N(B)/N, N(A ∩ B)/N
respectively. We can find the relative frequency of event A on the trials where B occurred as

(N(A ∩ B)/N) / (N(B)/N) = N(A ∩ B)/N(B).
Inspired by the relative frequency, we define the conditional probability function conditioned on events.
Definition 3.1. For a fixed event B ∈ F such that P(B) > 0, we can define the conditional probability P(·|B) :
F → [0, 1] of any event A ∈ F conditioned on the event B as

P(A|B) = P(A ∩ B)/P(B).
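The frequency motivation above can be illustrated with a short simulation: for two independent fair tosses per trial, the relative frequency N(A ∩ B)/N(B) should approach P(A|B) = 1/2. The trial count and seed below are arbitrary illustrative choices.

```python
import random

random.seed(0)           # fixed seed for reproducibility
N = 100_000              # number of trials
# Each trial: two independent fair coin tosses (True = head).
trials = [(random.random() < 0.5, random.random() < 0.5) for _ in range(N)]

# B: head on the first toss; A: head on the second toss.
N_B = sum(1 for first, _ in trials if first)
N_AB = sum(1 for first, second in trials if first and second)

# Relative frequency N(A ∩ B)/N(B) approximates P(A|B) = P(A ∩ B)/P(B) = 1/2.
freq = N_AB / N_B
assert abs(freq - 0.5) < 0.02
```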
Lemma 3.2 (Conditional probability). For any event B ∈ F such that P( B) > 0, the conditional probability
P(·| B) : F → [0, 1] is a probability measure on space (Ω, F ).
Proof. We will show that the conditional probability satisfies all three axioms of a probability measure.
Non-negativity: For all events A ∈ F, we have P( A| B) ⩾ 0 since P( A ∩ B) ⩾ 0.
σ-additivity: For an infinite sequence of mutually disjoint events (A_i ∈ F : i ∈ N) such that A_i ∩ A_j = ∅
for all i ≠ j, we have P(∪_{i∈N} A_i|B) = ∑_{i∈N} P(A_i|B). This follows from the disjointness of the sequence
(A_i ∩ B ∈ F : i ∈ N) and the countable additivity of P, since P(∪_{i∈N} A_i|B) = P(∪_{i∈N} (A_i ∩ B))/P(B) =
∑_{i∈N} P(A_i ∩ B)/P(B) = ∑_{i∈N} P(A_i|B).
Certainty: Since Ω ∩ B = B, we have P(Ω| B) = 1.
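The three axioms can also be checked numerically on a small finite space; the space, the conditioning event B, and the disjoint pair below are illustrative, with finite additivity standing in for σ-additivity.

```python
from fractions import Fraction

# Uniform probability on Omega = {1, ..., 6}.
omega = set(range(1, 7))

def P(event):
    return Fraction(len(event), len(omega))

B = {1, 2, 3, 4}  # conditioning event, P(B) = 2/3 > 0

def Pcond(A):
    """Conditional probability P(A|B) = P(A ∩ B)/P(B)."""
    return P(A & B) / P(B)

# Certainty: P(Omega|B) = 1.
assert Pcond(omega) == 1
# Non-negativity holds since P(A ∩ B) >= 0 for every event A.
assert all(Pcond({k}) >= 0 for k in omega)
# Additivity on a disjoint pair (finite stand-in for sigma-additivity).
A1, A2 = {1, 2}, {3, 5}
assert Pcond(A1 | A2) == Pcond(A1) + Pcond(A2)
```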
Remark 2. For two independent events A, B ∈ F such that P( A ∩ B) > 0, we have P( A| B) = P( A) and
P( B| A) = P( B). If either P( A) = 0 or P( B) = 0, then P( A ∩ B) = 0.
Remark 3. For any partition B of the sample space Ω, if P( Bn ) > 0 for all n ∈ N, then from the law of total
probability and the definition of conditional probability, we have
P(A) = ∑_{n∈N} P(A|B_n)P(B_n).
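A minimal sketch of this identity, using a fair die and the parity partition as an illustrative example:

```python
from fractions import Fraction

# Fair die: uniform probability on Omega = {1, ..., 6}.
omega = set(range(1, 7))

def P(event):
    return Fraction(len(event), len(omega))

B = [{1, 3, 5}, {2, 4, 6}]  # partition of Omega by parity
A = {1, 2, 3}               # event of interest

def Pcond(A, Bn):
    """Conditional probability P(A|B_n) = P(A ∩ B_n)/P(B_n)."""
    return P(A & Bn) / P(Bn)

# P(A) = sum over n of P(A|B_n) P(B_n).
assert P(A) == sum(Pcond(A, Bn) * P(Bn) for Bn in B) == Fraction(1, 2)
```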
4 Conditional Independence
Definition 4.1 (Conditional independence of events). For a probability space (Ω, F, P), a family of events
A ∈ F^I is said to be conditionally independent given an event C ∈ F such that P(C) > 0, if for any finite set
F ⊆ I, we have

P(∩_{i∈F} A_i|C) = ∏_{i∈F} P(A_i|C).
Remark 4. Let C ∈ F be an event such that P(C ) > 0. Two events A, B ∈ F are said to be conditionally
independent given event C, if
P ( A ∩ B | C ) = P ( A | C ) P ( B | C ).
If C = Ω, this reduces to the independence of the events A and B.
Remark 5. Two events may be independent, but not conditionally independent and vice versa.
Example 4.2. Consider two independent events A, B ∈ F such that P( A ∩ B) > 0 and P( A ∪ B) < 1. Then
the events A and B are not conditionally independent given A ∪ B. To see this, we observe that
P(A ∩ B|A ∪ B) = P((A ∩ B) ∩ (A ∪ B))/P(A ∪ B) = P(A ∩ B)/P(A ∪ B) = P(A)P(B)/P(A ∪ B) = P(A|A ∪ B)P(B),

since A ⊆ A ∪ B implies P(A|A ∪ B) = P(A)/P(A ∪ B). We further observe that P(B|A ∪ B) = P(B)/P(A ∪ B) ≠ P(B),
and hence P(A ∩ B|A ∪ B) ≠ P(A|A ∪ B)P(B|A ∪ B).
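The same computation can be carried out exactly for two fair coin tosses, with A and B the heads events of Example 2.2 (an illustrative instantiation satisfying P(A ∩ B) > 0 and P(A ∪ B) < 1):

```python
from fractions import Fraction
from itertools import product

# Two fair coin tosses; A and B are the independent heads events.
omega = [''.join(t) for t in product('HT', repeat=2)]
prob = {w: Fraction(1, 4) for w in omega}

def P(event):
    return sum(prob[w] for w in event)

A = {w for w in omega if w[0] == 'H'}  # head on the first toss
B = {w for w in omega if w[1] == 'H'}  # head on the second toss
C = A | B                              # conditioning event A ∪ B, P(C) = 3/4 < 1

def Pcond(E):
    """Conditional probability given C = A ∪ B."""
    return P(E & C) / P(C)

# Independent unconditionally ...
assert P(A & B) == P(A) * P(B)
# ... but not conditionally independent given A ∪ B: 1/3 != (2/3)(2/3).
assert Pcond(A & B) != Pcond(A) * Pcond(B)
```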
Example 4.3. Consider two non-independent events A, B ∈ F such that P( A) > 0. Then the events A and B
are conditionally independent given A. To see this, we observe that
P(A ∩ B|A) = P(A ∩ B)/P(A) = P(B|A) = P(B|A)P(A|A),

since P(A|A) = 1.
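A small numeric instance (constructed here, not from the notes) with dependent events A and B:

```python
from fractions import Fraction

# Uniform probability on Omega = {1, ..., 4}; A and B are dependent.
omega = set(range(1, 5))

def P(event):
    return Fraction(len(event), len(omega))

A = {1, 2}
B = {1, 2, 3}

# A and B are not independent: P(A ∩ B) = 1/2 != P(A) P(B) = 3/8.
assert P(A & B) != P(A) * P(B)

def Pcond(E):
    """Conditional probability given A."""
    return P(E & A) / P(A)

# Given A, the events A and B are conditionally independent.
assert Pcond(A & B) == Pcond(A) * Pcond(B)
```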