Math Section A
Conditional probability answers the question ‘how does the probability of an event change if we have
extra information’. We’ll illustrate with an example.
Example 1(b). A fair coin is tossed 3 times. Suppose we are told that the first toss was heads. Given this
information, how should we compute the probability of 3 heads?
Answer: We have a new (reduced) sample space: Ω' = {HHH, HHT, HTH, HTT}.
All outcomes are equally likely, and exactly one of the four (HHH) has 3 heads, so
P(3 heads given that the first toss is heads) = 1/4.
This is called conditional probability, since it takes into account additional conditions. To
develop the notation, we rephrase (b) in terms of events.
Rephrased (b) Let A be the event ‘all three tosses are heads’ = {HHH}.
Let B be the event ‘the first toss is heads’ = {HHH, HHT, HTH, HTT}. The conditional probability of A
knowing that B occurred is written
P(A|B)
This is read as
‘the conditional probability of A given B’ or ‘the probability of A conditioned on B’ or simply
‘the probability of A given B’.
We can visualize conditional probability as follows. Think of P(A) as the proportion of the area of the
whole sample space taken up by A. For P(A|B) we restrict our attention to B.
That is, P(A|B) is the proportion of the area of B taken up by A, i.e. P(A ∩ B)/P(B).
Note, A ⊂ B in the right-hand figure, so there are only two colors shown.
The formal definition of conditional probability catches the gist of the above example and visualization.
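Formal definition of conditional probability
Let A and B be events. The conditional probability of A given B is defined as
P(A|B) = P(A ∩ B)/P(B), provided P(B) ≠ 0. (1)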
Let’s redo the coin tossing example using the definition in Equation (1). Recall A = ‘3 heads’
and B = ‘first toss is heads’. We have P(A) = 1/8 and P(B) = 1/2. Since A ∩ B = A, we
also have P(A ∩ B) = 1/8. Now according to (1), P(A|B) = (1/8)/(1/2) = 1/4, which agrees with
our answer in Example 1(b).
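The calculation above can also be checked by brute-force enumeration. The following is a minimal Python sketch, assuming a fair coin so that all 8 outcomes are equally likely; the names omega, prob, A and B are illustrative choices, not notation from the text. It computes the ratio P(A ∩ B)/P(B) and, separately, counts outcomes inside the reduced sample space B.

```python
from fractions import Fraction
from itertools import product

# Sample space for three tosses of a fair coin: 8 equally likely outcomes.
omega = ["".join(t) for t in product("HT", repeat=3)]

def prob(event):
    """Probability of an event (a set of outcomes) when all outcomes are equally likely."""
    return Fraction(len(event), len(omega))

A = {w for w in omega if w == "HHH"}   # all three tosses are heads
B = {w for w in omega if w[0] == "H"}  # the first toss is heads

# Ratio form: P(A|B) = P(A ∩ B) / P(B), valid because P(B) > 0.
p_a_given_b = prob(A & B) / prob(B)

# Counting inside the reduced sample space B gives the same value.
p_by_counting = Fraction(len(A & B), len(B))

print(p_a_given_b, p_by_counting)  # prints: 1/4 1/4
```

Exact fractions are used instead of floats so that the two values can be compared for equality without rounding error.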
The law of total probability will allow us to use the multiplication rule to find probabilities in more
interesting examples. It involves a lot of notation, but the idea is fairly simple. We state the law when
the sample space is divided into 3 pieces. It is a simple matter to extend the rule when there are more
than 3 pieces.
Law of Total Probability
Suppose the sample space Ω is divided into 3 disjoint events B1, B2, B3 (see the figure below). Then for
any event A:
P(A) = P(A ∩ B1) + P(A ∩ B2) + P(A ∩ B3)
P(A) = P(A|B1) P(B1) + P(A|B2) P(B2) + P(A|B3) P(B3)
The top equation says ‘if A is divided into 3 pieces then P(A) is the sum of the probabilities of the
pieces’. The bottom equation is called the law of total probability. It is just a rewriting of the top
equation using the multiplication rule P(A ∩ Bi) = P(A|Bi) P(Bi), applied to each piece.
The law holds if we divide Ω into any number of events, so long as they are disjoint and
cover all of Ω. Such a division is often called a partition of Ω.
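To see the law at work on a small case, here is a hedged Python sketch that reuses the three-toss sample space. The partition (by the number of heads in the first two tosses) and the event A (at least two heads in total) are hypothetical choices made only for this illustration; they do not come from the text. The sketch checks that the pieces really form a partition and then evaluates both forms of the law.

```python
from fractions import Fraction
from itertools import product

# Three tosses of a fair coin: 8 equally likely outcomes.
omega = ["".join(t) for t in product("HT", repeat=3)]

def prob(event):
    """Probability of an event under equally likely outcomes."""
    return Fraction(len(event), len(omega))

def cond_prob(a, b):
    """P(a|b) = P(a ∩ b) / P(b); assumes P(b) > 0."""
    return prob(a & b) / prob(b)

# Hypothetical partition: B1, B2, B3 = 0, 1, or 2 heads in the first two tosses.
partition = [{w for w in omega if w[:2].count("H") == k} for k in (0, 1, 2)]

# Partition check: the pieces cover omega and, since their sizes add up
# to the size of omega, they are pairwise disjoint.
assert set().union(*partition) == set(omega)
assert sum(len(b) for b in partition) == len(omega)

# Hypothetical event A: at least two heads in total.
A = {w for w in omega if w.count("H") >= 2}

# Sum of the probabilities of the pieces A ∩ Bi.
sum_of_pieces = sum(prob(A & b) for b in partition)

# Law of total probability: sum of P(A|Bi) * P(Bi).
total_probability = sum(cond_prob(A, b) * prob(b) for b in partition)

print(prob(A), sum_of_pieces, total_probability)  # prints: 1/2 1/2 1/2
```

The same loop works unchanged for a partition with any number of pieces, which matches the remark that the law is not special to 3 events.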
Our first example will be one where we already know the answer and can verify the law.
Example 2
An urn contains 5 red balls and 2 green balls. Two balls are drawn one after the other. What is the
probability that the second ball is red?
Answer: The sample space is Ω = {rr, rg, gr, gg}.
Let R1 be the event ‘the first ball is red’, G1 = ‘first ball is green’, R2 = ‘second ball is red’, G2 = ‘second
ball is green’. We are asked to find P(R2).
The fast way to compute this is just like P(S2) in the card example above. Every ball is equally likely to be
the second ball. Since 5 out of 7 balls are red, P(R2) = 5/7.
Let’s compute this same value using the law of total probability. First, we’ll find the conditional
probabilities. This is a simple counting exercise.
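P(R2|R1) = 4/6, P(R2|G1) = 5/6.
Since R1 and G1 partition Ω, with P(R1) = 5/7 and P(G1) = 2/7, the law of total probability gives
P(R2) = P(R2|R1) P(R1) + P(R2|G1) P(G1) = (4/6)(5/7) + (5/6)(2/7) = 30/42 = 5/7,
which agrees with the direct computation.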
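The urn computation can be verified by enumerating every ordered draw. The sketch below is one possible check, assuming the balls are drawn without replacement (as the conditional probabilities 4/6 and 5/6 imply) and that all 7 × 6 = 42 ordered pairs of distinct balls are equally likely. The helper names are illustrative only.

```python
from fractions import Fraction
from itertools import permutations

# Urn contents: 5 red balls and 2 green balls, indexed 0..6.
balls = ["r"] * 5 + ["g"] * 2

# All ordered draws of two distinct balls: 7 * 6 = 42 equally likely pairs.
draws = list(permutations(range(len(balls)), 2))

def prob(event):
    """Probability that a draw satisfies the predicate `event`."""
    return Fraction(sum(1 for d in draws if event(d)), len(draws))

def first_is(color):
    return lambda d: balls[d[0]] == color

def second_is(color):
    return lambda d: balls[d[1]] == color

def both(e1, e2):
    return lambda d: e1(d) and e2(d)

p_r1 = prob(first_is("r"))  # P(R1)
p_g1 = prob(first_is("g"))  # P(G1)

# Conditional probabilities from the ratio P(R2 ∩ X) / P(X).
p_r2_given_r1 = prob(both(first_is("r"), second_is("r"))) / p_r1
p_r2_given_g1 = prob(both(first_is("g"), second_is("r"))) / p_g1

# Law of total probability over the partition {R1, G1}.
p_r2_total = p_r2_given_r1 * p_r1 + p_r2_given_g1 * p_g1

# Direct count of draws whose second ball is red.
p_r2_direct = prob(second_is("r"))

print(p_r2_given_r1, p_r2_given_g1, p_r2_total, p_r2_direct)
# prints: 2/3 5/6 5/7 5/7
```

Note that Fraction reduces 4/6 to 2/3 automatically; the value is the same as in the text.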
Bayes’ Theorem
Bayes’ theorem is a pillar of both probability and statistics and it is central to the rest of this course. For
two events A and B with P(A) ≠ 0, Bayes’ theorem (also called Bayes’ rule and Bayes’ formula) says
P(B|A) = P(A|B) P(B) / P(A)
Comments:
1. Bayes’ rule tells us how to ‘invert’ conditional probabilities, i.e. to find P(B|A) from P(A|B).
2. In practice, P(A) is often computed using the law of total probability.
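As an illustration of both comments, the following sketch applies the standard form of Bayes’ rule, P(B|A) = P(A|B) P(B) / P(A), to the urn numbers from Example 2. The question it answers (the probability that the first ball was red given that the second is red) is not posed in the text; it is chosen here only to show the inversion, and the function name bayes is hypothetical.

```python
from fractions import Fraction

def bayes(p_a_given_b, p_b, p_a):
    """Return P(B|A) = P(A|B) * P(B) / P(A). Requires P(A) > 0."""
    if p_a == 0:
        raise ValueError("P(A) must be positive")
    return p_a_given_b * p_b / p_a

# Urn from Example 2: 5 red and 2 green balls, two draws without replacement.
p_r1 = Fraction(5, 7)           # P(R1)
p_g1 = Fraction(2, 7)           # P(G1)
p_r2_given_r1 = Fraction(4, 6)  # P(R2|R1)
p_r2_given_g1 = Fraction(5, 6)  # P(R2|G1)

# Comment 2: the denominator P(R2) comes from the law of total probability.
p_r2 = p_r2_given_r1 * p_r1 + p_r2_given_g1 * p_g1

# Comment 1: invert P(R2|R1) to obtain P(R1|R2).
p_r1_given_r2 = bayes(p_r2_given_r1, p_r1, p_r2)

print(p_r2, p_r1_given_r2)  # prints: 5/7 2/3
```

Here P(R1|R2) equals P(R2|R1) = 2/3, which is a consequence of P(R1) = P(R2) in this example rather than a general property.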