Notes
AXIOMATIC PROBABILITY
So far we have seen a few types of probability problems. Axiomatic probability
was introduced by A. N. Kolmogorov to unify the ideas of the subject. First, we fix the
following notation:
• Ω → Sample space
• $2^{\Omega}$ → Power set of Ω. It is the set of all possible subsets of the sample space
• 𝐴 → An event. It is a subset of Ω
• 𝐹 → A set of events. It is a subset of $2^{\Omega}$
Suppose the events $A_1, A_2, \ldots$ which make up 𝐹 satisfy the following conditions:
• $\phi \in F$ and $\Omega \in F$
• $A^C \in F \;\; \forall\, A \in F$
• $\bigcup_{i=1}^{\infty} A_i \in F$ whenever $A_i \in F$ for every $i \in \mathbb{N}$
Then 𝐹 is called a σ-field. We can use this to formally define probability as a mapping
from the σ-field to the set of real numbers between 0 and 1:
$$P : F \to [0, 1]$$
The combination (Ω, 𝐹, 𝑃) is called the Probability Space. On the other hand, the tuple (Ω, 𝐹 )
is called the measurable space.
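The three σ-field conditions can be checked mechanically when Ω is finite. The following is a minimal sketch, not part of the original notes: the function name `is_sigma_field` and the example families are illustrative assumptions. For a finite family, closure under pairwise unions is enough to give closure under every countable union.

```python
from itertools import combinations


def is_sigma_field(omega, family):
    """Check the sigma-field conditions for a family of subsets of a finite omega."""
    omega = frozenset(omega)
    family = {frozenset(s) for s in family}
    # Condition 1: the empty set and omega must both be present.
    if frozenset() not in family or omega not in family:
        return False
    # Condition 2: closed under complement.
    if any(omega - a not in family for a in family):
        return False
    # Condition 3: closed under union. For a finite family, pairwise
    # closure implies closure under every countable union.
    return all(a | b in family for a, b in combinations(family, 2))


omega = {1, 2, 3, 4}
trivial = [set(), omega]                  # the smallest possible field
split = [set(), {1, 2}, {3, 4}, omega]    # field generated by {1, 2}
broken = [set(), {1}, omega]              # missing the complement {2, 3, 4}

print(is_sigma_field(omega, trivial))  # True
print(is_sigma_field(omega, split))    # True
print(is_sigma_field(omega, broken))   # False
```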
NOTE
From here on we shall work with 𝐹 rather than with the full collection of subsets
$2^{\Omega}$. The reason is that Ω can be countably or uncountably infinite. As soon as Ω is
infinite, $2^{\Omega}$ becomes uncountably infinite and makes things unnecessarily complex.
Instead, we use 𝐹, a subset of $2^{\Omega}$ that can be chosen much smaller (even finite).
To read more on countability, check the Appendix.
IMP OBSERVATIONS, LEMMAS AND PROOFS
Lemma
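If $A_1, A_2, \ldots, A_n \in F$, then $\bigcup_{i=1}^{n} A_i$ also belongs to 𝐹
Proof
As per the conditions of the σ-field, for any countable sequence of sets in 𝐹 we have
$$\bigcup_{i=1}^{\infty} A_i \in F$$
Since $\phi \in F$, we may extend the finite list by setting $A_i = \phi$ for all $i > n$. Then
$$\bigcup_{i=1}^{n} A_i = \bigcup_{i=1}^{\infty} A_i \in F$$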
HENCE PROVED
Lemma
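If $A_1, A_2, \ldots \in F$, then $\bigcap_{i=1}^{\infty} A_i \in F$
Proof
Since each $A_i \in F$, each complement $A_i^C \in F$. As per the conditions of the σ-field, we have
$$\bigcup_{i=1}^{\infty} A_i^C \in F$$
By De Morgan's law,
$$\bigcup_{i=1}^{\infty} A_i^C = A_1^C \cup A_2^C \cup \ldots = (A_1 \cap A_2 \cap \ldots)^C = \left(\bigcap_{i=1}^{\infty} A_i\right)^C$$
From the properties of a σ-field, we can say that if $A \in F$, then $A^C \in F$. Taking the
complement once more, we can now write
$$\bigcap_{i=1}^{\infty} A_i \in F$$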
HENCE PROVED
Lemma
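If $A_1, A_2, \ldots, A_n \in F$, then $\bigcap_{i=1}^{n} A_i$ also belongs to 𝐹
Proof
As per the proof above, for any countable sequence of sets in 𝐹 we have
$$\bigcap_{i=1}^{\infty} A_i \in F$$
Since $\Omega \in F$, we may extend the finite list by setting $A_i = \Omega$ for all $i > n$. Then
$$\bigcap_{i=1}^{n} A_i = \bigcap_{i=1}^{\infty} A_i \in F$$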
HENCE PROVED
Lemma
If 𝐴 and 𝐵 are in 𝐹, then 𝐴 − 𝐵 is also in 𝐹
Proof
From set theory, we can write –
$$A - B = A \cap B^C$$
From the properties of 𝐹 and the proof mentioned above, we can write
$$B^C \in F \quad \text{since } B \in F$$
$$A \cap B^C \in F \quad \text{since } A \in F \text{ and } B^C \in F$$
Therefore, we can write
$$A - B \in F$$
HENCE PROVED
A probability function 𝑃 on 𝐹 satisfies the following three properties:
• Probability is a mapping from the σ-field to the set of real numbers in the range
0 to 1 (inclusive): $P : F \to [0, 1]$
• $P(\Omega) = 1$
• If $A_1, A_2, \ldots$ are mutually disjoint events, then
$$P\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} P(A_i)$$
This is called the Countable/Sigma Additivity property of the probability function. Now, we shall
use these properties to derive many other properties of the probability function.
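On a finite sample space the three properties can be verified exhaustively. The sketch below is only an illustration under stated assumptions: the three-outcome experiment and its weights are hypothetical, 𝐹 is taken to be the full power set, and additivity is checked for pairs of disjoint events only (which is all that is needed when Ω is finite).

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical biased experiment with three outcomes (not from the notes).
weights = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}
omega = frozenset(weights)


def P(event):
    """Probability of an event: the sum of the weights of its outcomes."""
    return sum((weights[w] for w in event), Fraction(0))


def power_set(s):
    """All subsets of s, used here as the sigma-field F."""
    items = list(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


F = power_set(omega)
axiom1 = all(0 <= P(A) <= 1 for A in F)                    # values lie in [0, 1]
axiom2 = P(omega) == 1                                     # P(omega) = 1
axiom3 = all(P(A | B) == P(A) + P(B)                       # additivity for
             for A in F for B in F if A.isdisjoint(B))     # disjoint events
print(axiom1, axiom2, axiom3)  # True True True
```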
Lemma
The probability function is a monotone, non-decreasing function. That means the values of
the function can only increase or stay the same, but never decrease, as the input grows. In our
case, input 𝐴 is bigger than input 𝐵 if $A \supseteq B$. So, our lemma becomes
$$A \supseteq B \implies P(A) \geq P(B)$$
Proof
Since $A \supseteq B$, we can write $A = B \cup (A - B)$. From basic set theory, we know that 𝐵 and
$(A - B)$ are mutually disjoint sets. As a result, we can use the additivity property of the
probability function and write
$$P(A) = P(B) + P(A - B)$$
The value of $P(A - B)$ is a real number in the range [0, 1], so it is non-negative. Thus, we can conclude that
$$P(A) \geq P(B)$$
HENCE PROVED
Lemma
𝑃(𝐴 ∪ 𝐵) = 𝑃 (𝐴) + 𝑃(𝐵) − 𝑃 (𝐴 ∩ 𝐵)
Proof
From the basic set theory knowledge, we can write –
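$$A \cup B = A \cup [B - (A \cap B)]$$
Again, we can see that 𝐴 and $[B - (A \cap B)]$ are mutually disjoint sets. Thus,
$$P(A \cup B) = P(A \cup [B - (A \cap B)]) = P(A) + P(B - [A \cap B]) \quad \longrightarrow (1)$$
From set theory, we can also write
$$B = [B - (A \cap B)] \cup (A \cap B)$$
Since $[B - (A \cap B)]$ and $(A \cap B)$ are mutually disjoint, we can say that
$$P(B) = P(B - [A \cap B]) + P(A \cap B) \quad \longrightarrow (2)$$
Substituting $P(B - [A \cap B])$ from (2) into (1), we get
$$P(A \cup B) = P(A) + P(B) - P(A \cap B)$$
HENCE PROVED
Lemma
$$P(A^C) = 1 - P(A)$$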
Proof
From set theory, we have –
$$A^C = \Omega - A$$
We can also write
$$\Omega = A \cup (\Omega - A)$$
Since 𝐴 and $\Omega - A$ are mutually disjoint sets, we can write
$$P(\Omega) = P(A) + P(\Omega - A)$$
$$1 = P(A) + P(A^C)$$
Thus,
$$P(A^C) = 1 - P(A)$$
HENCE PROVED
NOTE
If we take 𝐴 = Ω, then the above case becomes –
𝑃(𝜙) = 1 − 1 = 0
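The derived properties (monotonicity, the union formula and the complement rule) can be spot-checked numerically. This is a small sketch assuming a fair six-sided die with the uniform measure; the events `A`, `B` and `C` are arbitrary illustrative choices, not taken from the notes.

```python
from fractions import Fraction

omega = frozenset(range(1, 7))  # one throw of a fair die (assumed example)


def P(event):
    """Uniform probability: |event| / |omega|."""
    return Fraction(len(frozenset(event) & omega), len(omega))


A = frozenset({1, 2, 3, 4})
B = frozenset({3, 4})   # B is a subset of A
C = frozenset({4, 5})   # C overlaps A in the single outcome 4

monotone = P(A) >= P(B)                               # A contains B
union_rule = P(A | C) == P(A) + P(C) - P(A & C)       # P(A u C) formula
complement_rule = P(omega - A) == 1 - P(A)            # P(A^C) = 1 - P(A)
empty_rule = P(frozenset()) == 0                      # P(empty set) = 0

print(monotone, union_rule, complement_rule, empty_rule)  # True True True True
print(P(A | C))  # 5/6
```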
Let us now formally define the limit of a sequence of sets $A_1, A_2, \ldots$. Two candidate
limit sets collect the elements 𝜔 that are:
1. Absent from only finitely many sets, i.e. present in every $A_k$ from some index onwards.
2. Present in infinitely many sets.
We can write these 2 sets as follows:
$$\underline{A} = \{\omega : \exists\, n \text{ such that } \omega \in A_k \;\forall\, k \geq n\}$$
$$\overline{A} = \{\omega : \forall\, n \;\exists\, k \geq n \text{ such that } \omega \in A_k\}$$
Here, we call $\underline{A}$ and $\overline{A}$ the limit infimum and limit supremum respectively. The lim sup is the
set of elements that appear infinitely often in the sequence $A_n$. The lim inf is the set of elements that
appear in all sets $A_n$ except for finitely many. When the two coincide, their common value is
the limit of the sequence. We can formally define these sets as
$$\limsup_{n \to \infty} A_n = \bigcap_{n=1}^{\infty} \bigcup_{k=n}^{\infty} A_k$$
$$\liminf_{n \to \infty} A_n = \bigcup_{n=1}^{\infty} \bigcap_{k=n}^{\infty} A_k$$
PROOF
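Let us take the case of the supremum. We can write
$$B_n = \bigcup_{k=n}^{\infty} A_k$$
In this case, we are taking the union of the sets $A_k$ for all $k \geq n$. This way, $B_n$ contains
every 𝜔 that is present in at least 1 of the sets from index 𝑛 onwards. Now, we have infinitely
many such sets:
$$B_1 = \bigcup_{k=1}^{\infty} A_k; \quad B_2 = \bigcup_{k=2}^{\infty} A_k; \quad B_3 = \bigcup_{k=3}^{\infty} A_k; \quad B_4 = \bigcup_{k=4}^{\infty} A_k; \quad \ldots$$
Now, if we take the intersection of these sets, we keep only the elements that, for every 𝑛, are
present in some $A_k$ with $k \geq n$. These are exactly the elements present in infinitely
many sets, which is the definition of the limit supremum. Hence, we get
$$\limsup_{n \to \infty} A_n = \bigcap_{n=1}^{\infty} B_n = \bigcap_{n=1}^{\infty} \bigcup_{k=n}^{\infty} A_k$$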
HENCE PROVED
Let us now take the case of the infimum. We can write
$$B_n = \bigcap_{k=n}^{\infty} A_k$$
In this case, we are taking the intersection of the sets $A_k$ for all $k \geq n$. This way, $B_n$
contains every 𝜔 that is present in all of the sets from index 𝑛 onwards. Now, we have
infinitely many such sets:
$$B_1 = \bigcap_{k=1}^{\infty} A_k; \quad B_2 = \bigcap_{k=2}^{\infty} A_k; \quad B_3 = \bigcap_{k=3}^{\infty} A_k; \quad B_4 = \bigcap_{k=4}^{\infty} A_k; \quad \ldots$$
Now, we take the union of these sets, which results in the set of elements that belong to
every $A_k$ with $k \geq n$ for some large enough 𝑛. Equivalently, the set contains the
elements that are absent from only a finite number of sets, which is the definition of the limit infimum.
Thus,
$$\liminf_{n \to \infty} A_n = \bigcup_{n=1}^{\infty} B_n = \bigcup_{n=1}^{\infty} \bigcap_{k=n}^{\infty} A_k$$
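The two formulas can be tried out on a concrete sequence. The sketch below is an approximation under stated assumptions: the infinite sequence is truncated to 40 terms, and only the first 20 tails are used so that every tail is still long enough to be representative. This is exact for the periodic (alternating) example chosen here, but it is not a general algorithm for arbitrary infinite sequences. The helper names are hypothetical.

```python
def tail_union(sets, n):
    """B_n = union of A_k for k >= n (list index n corresponds to A_{n+1})."""
    out = set()
    for s in sets[n:]:
        out |= s
    return out


def tail_intersection(sets, n):
    """B_n = intersection of A_k for k >= n."""
    out = set(sets[n])
    for s in sets[n + 1:]:
        out &= s
    return out


def lim_sup(sets, horizon):
    """Intersect the tail unions for the first `horizon` starting indices."""
    result = tail_union(sets, 0)
    for n in range(1, horizon):
        result &= tail_union(sets, n)
    return result


def lim_inf(sets, horizon):
    """Unite the tail intersections for the first `horizon` starting indices."""
    result = set()
    for n in range(horizon):
        result |= tail_intersection(sets, n)
    return result


# Alternating sequence: A_n = {1, 2} for odd n and {2, 3} for even n.
A = [{1, 2} if n % 2 == 1 else {2, 3} for n in range(1, 41)]
print(lim_sup(A, 20))  # {1, 2, 3}
print(lim_inf(A, 20))  # {2}
```

Here the limit supremum strictly contains the limit infimum, so this alternating sequence has no limit.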
Question
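Find the limit (if it exists) for the sequence of sets
$$A_n = \left[0, \frac{n}{n+1}\right)$$
Answer
Let us try to find the limit supremum and the limit infimum first.
$$\limsup_{n \to \infty} A_n = \bigcap_{n=1}^{\infty} \bigcup_{k=n}^{\infty} A_k$$
Since $\frac{k}{k+1}$ approaches 1 as 𝑘 grows without ever reaching it, every tail union is the same set:
$$\bigcup_{k=n}^{\infty} A_k = [0, 1) = B_n \text{ (say)}$$
Therefore,
$$\limsup_{n \to \infty} A_n = \bigcap_{n=1}^{\infty} B_n = [0, 1)$$
For the limit infimum, we have
$$\liminf_{n \to \infty} A_n = \bigcup_{n=1}^{\infty} \bigcap_{k=n}^{\infty} A_k$$
Since the sequence $A_n$ is monotonically increasing ($A_n \subseteq A_{n+1}$), we can say that
$$\bigcap_{k=n}^{\infty} A_k = A_n = \left[0, \frac{n}{n+1}\right) = C_n$$
Therefore, we get
$$\liminf_{n \to \infty} A_n = \bigcup_{n=1}^{\infty} C_n = [0, 1)$$
Since the limit supremum and limit infimum are equal, the limit exists for the sequence and is [0, 1)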
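The answer [0, 1) can be sanity-checked point by point: a value 𝑥 in [0, 1) should belong to every set of the sequence from some index onwards. The sketch below assumes exact rational arithmetic and a hypothetical helper `first_index`, derived from solving x < n/(n+1) for n.

```python
from fractions import Fraction
from math import floor


def in_A(x, n):
    """Membership test for A_n = [0, n/(n+1))."""
    return 0 <= x < Fraction(n, n + 1)


def first_index(x):
    """Smallest n >= 1 with x in A_n, for 0 <= x < 1.

    x < n/(n+1) is equivalent to n > x/(1-x).
    """
    return max(1, floor(x / (1 - x)) + 1)


x = Fraction(999, 1000)
n0 = first_index(x)
print(n0)                  # 1000
print(in_A(x, n0 - 1))     # False
print(in_A(x, n0))         # True
# The endpoint 1 is in none of the sets checked, matching the open end of [0, 1).
print(any(in_A(Fraction(1), n) for n in range(1, 10_000)))  # False
```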
Question
Find the limit (if it exists) for the sequence of sets
$$A_n = \left[0, 1 + \frac{1}{n}\right)$$
Answer
Let us start with the supremum:
$$\overline{A} = \bigcap_{n=1}^{\infty} \bigcup_{k=n}^{\infty} A_k$$
Let,
$$B_n = \bigcup_{k=n}^{\infty} A_k$$
The interesting thing to note here is that the sequence provided is monotone non-increasing,
because as the value of 𝑛 grows larger, the set has fewer elements in it ($A_n \supseteq A_{n+1}$). Therefore,
we can write
$$B_n = \bigcup_{k=n}^{\infty} A_k = A_n$$
Hence, we get
$$\overline{A} = \bigcap_{n=1}^{\infty} A_n$$
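Let us now take the case of the limit infimum:
$$\underline{A} = \bigcup_{n=1}^{\infty} \bigcap_{k=n}^{\infty} A_k$$
We can expand the above expression as follows:
$$\underline{A} = [A_1 \cap A_2 \cap A_3 \cap \ldots] \cup [A_2 \cap A_3 \cap A_4 \cap \ldots] \cup [A_3 \cap A_4 \cap A_5 \cap \ldots] \cup \ldots$$
Since the sequence of sets is monotone decreasing, dropping the first few (larger) sets does not
change an intersection, so every bracketed term equals the first one:
$$A_n \cap A_{n+1} \cap A_{n+2} \cap \ldots = A_1 \cap A_2 \cap A_3 \cap \ldots$$
Hence, we can write
$$\underline{A} = A_1 \cap A_2 \cap A_3 \cap \ldots = \bigcap_{n=1}^{\infty} A_n$$
The limit supremum and limit infimum are equal, so the limit exists:
$$\lim_{n \to \infty} A_n = \underline{A} = \overline{A} = \bigcap_{n=1}^{\infty} A_n = [0, 1]$$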
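For a monotone increasing sequence ($A_n \subseteq A_{n+1}$), the same steps give a union instead.
Starting with the supremum,
$$\overline{A} = \bigcap_{n=1}^{\infty} \bigcup_{k=n}^{\infty} A_k$$
Since the sets only grow, dropping the first few (smaller) sets does not change a union:
$$\bigcup_{k=n}^{\infty} A_k = \bigcup_{k=1}^{\infty} A_k$$
Therefore,
$$\overline{A} = \bigcup_{n=1}^{\infty} A_n$$
Now, let us take the case of the infimum:
$$\underline{A} = \bigcup_{n=1}^{\infty} \bigcap_{k=n}^{\infty} A_k$$
$$\bigcap_{k=n}^{\infty} A_k = A_n$$
Therefore, we get
$$\underline{A} = \bigcup_{n=1}^{\infty} A_n$$
$$\lim_{n \to \infty} A_n = \underline{A} = \overline{A} = \bigcup_{n=1}^{\infty} A_n$$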
HENCE PROVED
NOTE
From De Morgan's law, we can say
$$\left(\liminf_{n \to \infty} A_n\right)^C = \limsup_{n \to \infty} A_n^C$$
Lemma
The limit supremum is the superset of the limit infimum
Proof
This holds for every sequence of sets. Let us prove this. Let us take an
element 𝑥 which belongs to the limit infimum of the sequence of sets.
$$x \in \bigcup_{n=1}^{\infty} \bigcap_{k=n}^{\infty} A_k$$
Then there exists some index $n_0$ such that
$$x \in \bigcap_{k=n_0}^{\infty} A_k$$
This means that 𝑥 belongs to each set $A_k \;\forall\, k \geq n_0$. Now take any $n \in \mathbb{N}$. The index
$k = \max(n, n_0)$ satisfies $k \geq n$ and $x \in A_k$, so
$$x \in \bigcup_{k=n}^{\infty} A_k$$
Since 𝑛 was arbitrary, this holds for all $n \in \mathbb{N}$, and we can write
$$x \in \bigcap_{n=1}^{\infty} \bigcup_{k=n}^{\infty} A_k$$
Therefore, we get
$$x \in \overline{A}$$
In short, if an element is in the limit infimum, it is also in the limit supremum. As a result, we can say that
the limit supremum is a superset of (or equal to) the limit infimum.
HENCE PROVED
Lemma
The limit supremum and the limit infimum always exist for any given sequence
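Proof
Both sets are built only from countable unions and intersections of the sets $A_k$, and these
operations are defined for any sequence of sets. Hence both always exist, even when the limit
itself does not.
Lemma
If $A_n \in F \;\forall\, n \in \mathbb{N}$, then the limit supremum and the limit infimum also belong to 𝐹
Proof
This is a very intuitive proof. Let us write the expression for the supremum first:
$$\overline{A} = \bigcap_{n=1}^{\infty} \bigcup_{k=n}^{\infty} A_k$$
We know that $A_k \in F \;\forall\, k \in \mathbb{N}$. As per the property of the event field 𝐹, we know that if $A_1, A_2, \ldots$
belong to 𝐹, then their union also belongs to 𝐹. Therefore, we can say
$$\bigcup_{k=n}^{\infty} A_k = B_n \in F$$
Additionally, we have also proven that if $A_1, A_2, \ldots$ belong to 𝐹, then their intersection also
belongs to 𝐹. Therefore, we can say
$$\bigcap_{n=1}^{\infty} B_n \in F$$
$$\overline{A} \in F \text{ if } A_n \in F \;\forall\, n \in \mathbb{N}$$
We can similarly prove the same for the infimum as well.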
LEFT CONTINUITY OF THE PROBABILITY FUNCTION
A function $f(x)$ is said to be left continuous if, for every monotonically increasing sequence $x_n$
with $\lim_{n \to \infty} x_n = x$, the limit and the function can be interchanged:
$$\lim_{n \to \infty} f(x_n) = f\left(\lim_{n \to \infty} x_n\right) = f(x)$$
The probability function is also left continuous. Basically, for a
monotone increasing sequence of events $A_n$, we have
$$\lim_{n \to \infty} P(A_n) = P\left(\lim_{n \to \infty} A_n\right)$$
As we have seen above for monotone increasing sequences, we can write
$$\lim_{n \to \infty} A_n = \bigcup_{n=1}^{\infty} A_n$$
Proof
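Let us define a sequence of sets as follows:
$$B_n = A_n - A_{n-1}, \quad \text{with } A_0 = \phi$$
Thus, we get
$$B_1 = A_1$$
$$B_2 = A_2 - A_1$$
$$B_3 = A_3 - A_2$$
And so on. The interesting thing here is that the sets $\{B_n\}$ are mutually disjoint.
This means we can use the property of σ-additivity for them. Another interesting thing about
the sequence is that
$$\bigcup_{n=1}^{\infty} B_n = \bigcup_{n=1}^{\infty} A_n$$
and, for every finite 𝑁,
$$\bigcup_{n=1}^{N} B_n = \bigcup_{n=1}^{N} A_n = A_N$$
Therefore, we get
$$P\left(\lim_{n \to \infty} A_n\right) = P\left(\bigcup_{n=1}^{\infty} B_n\right) = \sum_{n=1}^{\infty} P(B_n) = \lim_{N \to \infty} \sum_{n=1}^{N} P(B_n) = \lim_{N \to \infty} P\left(\bigcup_{n=1}^{N} B_n\right) = \lim_{N \to \infty} P(A_N)$$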
HENCE PROVED
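The disjoint-pieces argument can be followed numerically. The sketch below assumes a hypothetical probability measure on Ω = {1, 2, 3, ...} with P({n}) = (1/2)^n, and the increasing events A_N = {1, ..., N}; neither appears in the notes. In this setting each piece A_n − A_{n−1} is the single outcome {n}, so the partial sums of the pieces are exactly P(A_N), and they approach P(Ω) = 1.

```python
from fractions import Fraction


def P_point(n):
    """Hypothetical measure on {1, 2, 3, ...}: P({n}) = (1/2)**n."""
    return Fraction(1, 2 ** n)


def P_A(N):
    """P(A_N) for A_N = {1, ..., N}, computed as the sum over the disjoint pieces {n}."""
    return sum((P_point(n) for n in range(1, N + 1)), Fraction(0))


for N in (1, 5, 20):
    # The gap to P(omega) = 1 shrinks as N grows.
    print(N, P_A(N), 1 - P_A(N))
# 1 1/2 1/2
# 5 31/32 1/32
# 20 1048575/1048576 1/1048576
```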
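RIGHT CONTINUITY OF THE PROBABILITY FUNCTION
A function $f(x)$ is said to be right continuous if, for every monotonically decreasing sequence
$x_n$ with $\lim_{n \to \infty} x_n = x$, the limit and the function can be interchanged:
$$\lim_{n \to \infty} f(x_n) = f\left(\lim_{n \to \infty} x_n\right) = f(x)$$
The probability function is also right continuous.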
Proof
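For a sequence of events $\{A_n\}$, we need to prove that
$$\lim_{n \to \infty} P(A_n) = P\left(\lim_{n \to \infty} A_n\right)$$
In this case, the sequence $A_n$ is monotone decreasing. This means that $A_n \supseteq A_{n+1}$,
and the limit of such a sequence is $\lim_{n \to \infty} A_n = \bigcap_{n=1}^{\infty} A_n = A$ (say). Let us define a sequence of sets as follows:
$$B_n = A_n - A_{n+1}$$
Thus, we get
$$B_1 = A_1 - A_2$$
$$B_2 = A_2 - A_3$$
$$B_3 = A_3 - A_4$$
And so on. The interesting thing here is that the sets $\{B_n\}$ are mutually disjoint, and each of
them is also disjoint from 𝐴. This means we can use the property of σ-additivity for them. Another
interesting thing about the sequence is that
$$A_1 = A \cup \bigcup_{n=1}^{\infty} B_n$$
and, for every finite 𝑁,
$$\bigcup_{n=1}^{N-1} B_n = A_1 - A_N$$
Therefore, we get
$$P(A_1) = P(A) + \sum_{n=1}^{\infty} P(B_n) = P(A) + \lim_{N \to \infty} \left[P(A_1) - P(A_N)\right]$$
$$P\left(\lim_{n \to \infty} A_n\right) = P(A) = \lim_{N \to \infty} P(A_N)$$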
HENCE PROVED
Let us assume that we have a set of outcomes Ω and a collection of events $A \subseteq 2^{\Omega}$. Then, we
define the smallest σ-field over 𝐴, written $\sigma(A)$, as the σ-field such that
• $A \subseteq \sigma(A)$
• If there is any other σ-field $\sigma^*$ that contains 𝐴, then $\sigma(A) \subseteq \sigma^*$
We have already seen that if we have a bunch of σ-fields defined over the same outcome
set, then their intersection is also a σ-field (refer to Tutorial 1, Problem 2b). The
smallest σ-field over a collection of events 𝐴 is the one obtained by taking the
intersection of all the σ-fields $\sigma_i$ that contain 𝐴:
$$\sigma(A) = \bigcap_{i} \sigma_i$$
CONDITIONAL PROBABILITY
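Let 𝐷 be an event, i.e. $D \in \mathbb{F}$, such that $P(D) > 0$. Then, for any event $A \in \mathbb{F}$, the
conditional probability of event 𝐴 occurring given that event 𝐷 has occurred is defined as
$$P(A \mid D) = \frac{P(A \cap D)}{P(D)} = P_D(A)$$
Here, we have defined a new measure $P_D : \mathbb{F} \to [0, 1]$. But is this a probability measure? Let's
check the 3 properties of a probability measure:
• Since $A \cap D \subseteq D$, monotonicity gives $0 \leq P(A \cap D) \leq P(D)$, so $P_D(A) \in [0, 1]$
• $P_D(\Omega) = \frac{P(\Omega \cap D)}{P(D)} = \frac{P(D)}{P(D)} = 1$
• If $A_1, A_2, \ldots$ are mutually disjoint, then so are the sets $A_i \cap D$, and therefore
$$P_D\left(\bigcup_{i=1}^{\infty} A_i\right) = \frac{P\left(\bigcup_{i=1}^{\infty} (A_i \cap D)\right)}{P(D)} = \frac{\sum_{i=1}^{\infty} P(A_i \cap D)}{P(D)} = \sum_{i=1}^{\infty} P_D(A_i)$$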
Therefore, we can see that 𝑃𝐷 satisfies all the conditions and can hence be deemed a
Probability measure. With the concept of conditional probability, we can now define a few
other important concepts.
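Let $A_1, A_2, \ldots, A_n$ denote a partition of Ω (mutually exclusive and exhaustive events) such that $P(A_k) > 0 \;\forall\, k \in [1, n]$.
Then, for any event $B \in \mathbb{F}$, we can say
$$P(B) = \sum_{k=1}^{n} P(B \cap A_k) = \sum_{k=1}^{n} P(B \mid A_k) P(A_k)$$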
Proof
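Since it is given that the $A_k$ form a partition, we can say that
$$B = (B \cap A_1) \cup (B \cap A_2) \cup (B \cap A_3) \cup \ldots \cup (B \cap A_n)$$
Additionally, as the $A_k$ form a partition, the sets $(B \cap A_k)$ are mutually disjoint.
Therefore, by the sigma additivity property and the definition of conditional probability, we can write
$$P(B) = \sum_{k=1}^{n} P(B \cap A_k) = \sum_{k=1}^{n} P(B \mid A_k) P(A_k)$$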
HENCE PROVED
BAYES THEOREM
As per the Bayes theorem, if $P(B) > 0$, then
$$P(A_i \mid B) = \frac{P(B \mid A_i) P(A_i)}{P(B)} = \frac{P(B \mid A_i) P(A_i)}{\sum_{k=1}^{n} P(B \mid A_k) P(A_k)}$$
Proof
We can see that,
$$\frac{P(B \mid A_i) P(A_i)}{P(B)} = \frac{\frac{P(B \cap A_i)}{P(A_i)} \cdot P(A_i)}{P(B)} = \frac{P(B \cap A_i)}{P(B)} = P(A_i \mid B)$$
HENCE PROVED
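A worked numerical instance may help here. The sketch below is illustrative only: the three-part partition, its prior probabilities and the conditional probabilities P(B | A_k) are hypothetical numbers chosen for the example. It computes P(B) by summing over the partition and then applies the Bayes formula to each part.

```python
from fractions import Fraction

# Hypothetical partition A1, A2, A3 of the sample space with P(A_k) > 0.
prior = {"A1": Fraction(1, 2), "A2": Fraction(3, 10), "A3": Fraction(1, 5)}
# Hypothetical conditional probabilities P(B | A_k).
likelihood = {"A1": Fraction(1, 100), "A2": Fraction(2, 100), "A3": Fraction(5, 100)}


def total_probability(prior, likelihood):
    """P(B) = sum over k of P(B | A_k) * P(A_k)."""
    return sum((likelihood[k] * prior[k] for k in prior), Fraction(0))


def bayes(prior, likelihood):
    """P(A_i | B) for every part of the partition."""
    p_b = total_probability(prior, likelihood)
    return {k: likelihood[k] * prior[k] / p_b for k in prior}


p_b = total_probability(prior, likelihood)
posterior = bayes(prior, likelihood)
print(p_b)                      # 21/1000
print(posterior["A3"])          # 10/21
print(sum(posterior.values()))  # 1
```

Although A3 has the smallest prior, it ends up with the largest posterior because B is most likely under A3.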
INDEPENDENT EVENTS
Two events – 𝐴 and 𝐵 are said to be independent under a probability measure 𝑃 if –
𝑃 (𝐴 ∩ 𝐵) = 𝑃 (𝐴) ∗ 𝑃(𝐵)
Lemma
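If 2 events 𝐴 and 𝐵 in a probability space are independent, then show that their complements
$A^C$ and $B^C$ are also independent in the same space.
Proof
Since we have 𝐴 ⫫ 𝐵, we can write
$$P(A \cap B) = P(A) \, P(B)$$
We also know from previous work that
$$P(A^C) = 1 - P(A)$$
With these points in mind, we can derive
$$P(A^C \cap B^C) = P[(\Omega - A) \cap (\Omega - B)] = P[\Omega - (A \cup B)] = 1 - P(A \cup B) \quad \longrightarrow (1)$$
From the lemma on unions and the independence of 𝐴 and 𝐵,
$$P(A \cup B) = P(A) + P(B) - P(A) P(B) = P(A)[1 - P(B)] + P(B)$$
Thus,
$$P(A \cup B) = P(A) P(B^C) + 1 - P(B^C)$$
$$= 1 + P(B^C)[P(A) - 1]$$
$$= 1 - P(A^C) P(B^C)$$
Substituting this back in (1), we get
$$P(A^C \cap B^C) = 1 - [1 - P(A^C) P(B^C)] = P(A^C) \, P(B^C)$$
HENCE PROVED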
Lemma
If events 𝐴 and 𝐵 are independent over a probability space, then the events 𝐴 and 𝐵𝐶 are
also independent over the same space.
Proof
From set theory, we know that –
$$A \cap B^C = A - (A \cap B)$$
Thus,
$$P(A \cap B^C) = P(A) - P(A \cap B)$$
$$= P(A) - P(A) P(B)$$
$$= P(A)[1 - P(B)]$$
$$= P(A) P(B^C)$$
Therefore, we get
$$P(A \cap B^C) = P(A) \, P(B^C)$$
HENCE PROVED
Suppose we have a collection of events, say $A = \{A_1, A_2, \ldots, A_n\}$. These events are said
to be independent under probability measure 𝑃 if, for any sub-collection of 𝑘 events $A_{\alpha_1}, A_{\alpha_2}, \ldots, A_{\alpha_k}$,
$$P\left(\bigcap_{j=1}^{k} A_{\alpha_j}\right) = \prod_{j=1}^{k} P(A_{\alpha_j})$$
Here, 𝑘 can vary from 2 to 𝑛. When 𝑘 = 2, we are taking pairs, when 𝑘 = 3, we are taking
triplets and so on. Therefore,
$$\text{Total no. of comparisons} = \binom{n}{2} + \binom{n}{3} + \cdots + \binom{n}{n} = 2^n - n - 1$$
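The need to check every sub-collection, and not only pairs, shows up in a standard small example. The sketch below assumes two tosses of a fair coin and three events chosen for illustration (none of this is from the notes); `failed_conditions` is a hypothetical helper that tests the product rule for every sub-collection of size 2 or more.

```python
from fractions import Fraction
from itertools import combinations, product
from math import prod

omega = list(product("HT", repeat=2))  # two fair coin tosses (assumed example)


def P(event):
    """Uniform probability on the four outcomes."""
    return Fraction(len(event), len(omega))


A = {w for w in omega if w[0] == "H"}    # first toss is heads
B = {w for w in omega if w[1] == "H"}    # second toss is heads
C = {w for w in omega if w[0] == w[1]}   # both tosses agree
events = {"A": A, "B": B, "C": C}


def failed_conditions(events):
    """Return (number of checks, sub-collections whose product rule fails)."""
    checks, failures = 0, []
    for k in range(2, len(events) + 1):
        for names in combinations(sorted(events), k):
            checks += 1
            joint = set.intersection(*(events[n] for n in names))
            if P(joint) != prod(P(events[n]) for n in names):
                failures.append(names)
    return checks, failures


checks, failures = failed_conditions(events)
print(checks)    # 4, which is 2**3 - 3 - 1
print(failures)  # [('A', 'B', 'C')]
```

Every pair passes, but the triple fails, so these three events are pairwise independent without being independent as a collection.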
BOREL 𝝈 – FIELD
Borel sets are the sets that can be formed from open sets (or equivalently, closed sets)
through the operations of countable union, countable intersection, and relative complement.
We start with the collection of all open sets in a topological space 𝑋. Then, the Borel 𝝈 –
field, denoted by 𝔹(𝑿), is the smallest 𝝈 – field containing all open sets in 𝑋. A Borel set is
any set that belongs to 𝔹(𝑋).
In our 1st tutorial, we saw a collection of sets as follows:
$$A = \{(-\infty, x] : x \in \mathbb{R}\}$$
Every set in this collection is a classic example of a Borel set. We shall be using the concept of
Borel sets and fields in the upcoming topic.
RANDOM VARIABLES
We usually work in the probability space of interest (PSI), which is denoted by (Ω, 𝔽, 𝑃). This
is the space where we define our measures, events etc. However, mathematical theory has
to be developed for all situations, so results are usually worked out in some standard
space. Thus, instead of developing solutions directly in our space, we first translate our
problem to the standard space, solve it using the results available there, and then translate
back to our space of interest.
In the case of probability, we take the standard space (SS) as (ℝ, 𝔹) where ℝ is the set of
real numbers and 𝔹 is a Borel field. The translation from PSI to SS in our case is done using
Random Variables.
A random variable 𝑿 is a function that maps outcomes in PSI to real values in SS. Basically,
𝑋∶Ω→ ℝ
That means there exist $B \in \mathbb{B}$ and $A \in \mathbb{F}$ such that $X(A) = B$. In expanded terms,
$$B = X(A) = \{r \in \mathbb{R} : r = X(\omega) \text{ for some } \omega \in A\}$$
Going in the opposite direction, we take the pre-image of 𝐵:
$$\tilde{A} = X^{-1}(B) = \{\omega \in \Omega : X(\omega) \in B\}$$
Here is the interesting part. Multiple outcomes 𝜔 can be mapped to a single value
in 𝐵, and some of these outcomes may lie outside 𝐴. Thus, we can safely
say that
$$\tilde{A} \supseteq A$$
We can now define probability measures in both PSI and SS. We take $P : \mathbb{F} \to [0, 1]$ and
$P_X : \mathbb{B} \to [0, 1]$ such that
$$P_X(B) = P(X^{-1}(B))$$
The question now arises – Can any mapping 𝑋 ∶ Ω → ℝ be a random variable? Not really.
One thing to note in the probability relation is that the equality only stands if 𝑋−1 (𝐵) ∈ 𝔽.
Otherwise, 𝑃 (𝑋−1 (𝐵)) is not defined. In other words, 𝑿 has to be Borel Measurable.
Let us take 2 measurable spaces, $(X, \mathbb{B}(X))$ and $(Y, \mathbb{B}(Y))$, where $\mathbb{B}(\cdot)$ is a Borel field on the
respective set. Then a function $f : X \to Y$ is said to be Borel measurable if $\forall\, B \in \mathbb{B}(Y)$, the pre-image
$f^{-1}(B) \in \mathbb{B}(X)$.
Extending this definition to our case, we get that 𝑋 is Borel measurable if ∀ 𝐵 ∈ 𝔹, the pre-
image 𝑋−1 (𝐵) ∈ 𝔽. Therefore, the 2 conditions for a mapping 𝑿 to be a random variable are
–
1. 𝑋 ∶ Ω → ℝ
2. 𝑋 is Borel measurable
For example, let us take the case of a dice throw. In this case –
Ω = {1,2,3,4,5,6}
Now, let us consider a mapping –
𝑋(𝜔) = 2 ∗ 𝜔 ∀ 𝜔 ∈ Ω
With these defined, we now define fields (in PSI) as follows –
𝐹1 = {𝜙, Ω}
As 2 and 4 are the only values of $X(\omega)$ that are in 𝐵, the pre-image is $X^{-1}(B) = \{1, 2\}$. Now, for 𝑋 to be
Borel measurable, we need $X^{-1}(B) = \{1, 2\} \in \mathbb{F}$. Thus, 𝑋 is a random variable over field $\mathbb{F}_2$ but not over field $\mathbb{F}_1$.
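The dice example can be replayed in code. This is a sketch under explicit assumptions: the richer field `F2` is taken to be the full power set of Ω (a hypothetical choice; the notes' own second field is not reproduced here), and measurability is tested only on sets of the form (−∞, x], which is the collection the notes later use as a sufficient check.

```python
from itertools import combinations

omega = (1, 2, 3, 4, 5, 6)


def X(w):
    """The mapping from the example: X(w) = 2 * w."""
    return 2 * w


def preimage_half_line(x):
    """X^{-1}((-inf, x]) = set of outcomes w with X(w) <= x."""
    return frozenset(w for w in omega if X(w) <= x)


F1 = {frozenset(), frozenset(omega)}
# Assumption: F2 is the power set of omega, the largest possible field.
F2 = {frozenset(c) for r in range(7) for c in combinations(omega, r)}


def is_measurable(field):
    """Check that every pre-image of (-inf, x] lies in the field.

    X takes finitely many values, so the pre-image only changes at those
    values; one threshold below all of them plus each value covers every x.
    """
    thresholds = [min(map(X, omega)) - 1] + sorted(X(w) for w in omega)
    return all(preimage_half_line(x) in field for x in thresholds)


print(sorted(preimage_half_line(4)))         # [1, 2]
print(preimage_half_line(4) in F1)           # False
print(is_measurable(F1), is_measurable(F2))  # False True
```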
For a mapping 𝑋 to be Borel measurable, we need that $\forall\, B \in \mathbb{B}$, $X^{-1}(B) \in \mathbb{F}$. However, how
do we check this for all the sets in the Borel field? There are uncountably many of them. To do this,
we make use of a well-known lemma:
Lemma – Let us consider 2 spaces – (Ω1 , 𝔽1 ) and (Ω2 , 𝔽2 ) and a mapping 𝑋 ∶ Ω1 → Ω2 . Also
consider the set 𝔸 which is a collection of subsets of Ω2 such that 𝔽2 is the smallest 𝝈 – field
over 𝔸. If 𝑋−1 (𝐵) ∈ 𝔽1 ∀ 𝐵 ∈ 𝔸, then 𝑋−1 (𝐴) ∈ 𝔽1 ∀ 𝐴 ∈ 𝔽2
In short, if there is a collection of sets in SS such that the pre-image of every set in the collection
is in 𝔽, then every set in the smallest σ-field over the said collection also
has its pre-image in 𝔽. Also recall that the Borel field is the smallest σ-field containing all open
sets; on ℝ it is equally the smallest σ-field over the collection of intervals $(-\infty, x]$.
Therefore, instead of finding the pre-image of every set in the Borel field, we can simply find
the pre-image of a collection of Borel sets instead. This collection is as follows –
𝔸 = {(−∞, 𝑥] ∀ 𝑥 ∈ ℝ}
Thus, 𝑋 is Borel measurable if 𝑋−1 ((−∞,𝑥]) ∈ 𝔽 for all 𝑥 ∈ ℝ. The problem scope has
reduced significantly, but the collection of Borel sets is still uncountably infinite. We need to
simplify this problem further to a more manageable scope. In this situation, we make use of
another popular lemma –
Lemma – If we have a probability measure 𝑃𝑋 over the Borel space 𝔹, then 𝑃𝑋 can be
perfectly described by just stating 𝑃𝑋((−∞, 𝑥]) ∀ 𝑥 ∈ ℝ
In short, the probability measure we defined in SS can be described by just applying it
to the sets of this collection. Using the relation between the
probability measures in SS and PSI, we can say
$$P_X((-\infty, x]) = P\left(X^{-1}((-\infty, x])\right) = P(\{\omega \in \Omega : X(\omega) \leq x\})$$
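A sequence $(a_n)$ is said to be convergent to $a \in R$ if, for every $\epsilon > 0$, there exists an $X \in N$ such that
$$|a_n - a| < \epsilon \quad \forall\, n \geq X$$
The above definition also means that for a large enough value of 𝑛, the sequence values lie
in the ϵ-neighbourhood of 𝑎 on the number line. If we choose a smaller 𝜖, we would
need to choose a higher value of 𝑋.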
If a sequence doesn’t satisfy the above conditions, then it is called divergent. Let us take an
example question.
Question
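Prove that the sequence $(a_n) = \frac{1}{n}$; $n \in N$ is a convergent sequence and converges to zero
Answer
Suppose we take some $\epsilon > 0$. Then, we need to find an 𝑋 such that $\forall\, n \geq X$, we have
$$|a_n - 0| = |a_n| = \left|\frac{1}{n}\right| = \frac{1}{n} < \epsilon \quad (\text{because } n \in N)$$
We can choose a value $X \in N$ such that $X \cdot \epsilon > 1$. Then, we get
$$\epsilon > \frac{1}{X}$$
Since we have established that $n \geq X$, we can say that
$$\frac{1}{n} \leq \frac{1}{X}$$
Therefore, we can prove that
$$\frac{1}{n} \leq \frac{1}{X} < \epsilon$$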
HENCE PROVED.
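The choice of 𝑋 in this proof can be made concrete. The sketch below assumes exact rational values of ϵ and a hypothetical helper `threshold` that returns the smallest natural number 𝑋 with 𝑋·ϵ > 1; it then spot-checks the inequality 1/n < ϵ on a finite range of indices, which illustrates the proof but of course does not replace it.

```python
from fractions import Fraction
from math import floor


def threshold(eps):
    """Smallest natural number X with X * eps > 1."""
    return floor(1 / eps) + 1


for eps in (Fraction(1, 10), Fraction(3, 100)):
    X = threshold(eps)
    # Spot check: 1/n < eps for a finite stretch of indices n >= X.
    ok = all(Fraction(1, n) < eps for n in range(X, X + 1000))
    print(eps, X, ok)
# 1/10 11 True
# 3/100 34 True
```

A smaller ϵ forces a larger 𝑋, as noted above.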
Question
Prove that the sequence (𝑎 𝑛 ) = (−1) 𝑛 ; 𝑛 ∈ 𝑁 is divergent.
Answer
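Let us assume that the given sequence converges to a point $a \in R$. Let us also take a
value $\epsilon > 0$. Since we have assumed the sequence to be convergent, there is
a value $X \in N$ such that
$$|a_n - a| < \epsilon \quad \text{for } n \geq X$$
$$|a_n - a| = |(-1)^n - a| = \begin{cases} |-1 - a| & \text{if } n \text{ is odd} \\ |1 - a| & \text{if } n \text{ is even} \end{cases}$$
Both odd and even indices occur beyond any 𝑋, so both $|-1 - a| < \epsilon$ and $|1 - a| < \epsilon$ must hold.
Taking $\epsilon = 1$, the triangle inequality gives
$$2 = |1 - (-1)| \leq |1 - a| + |a - (-1)| < 1 + 1 = 2$$
which is a contradiction. Hence no such 𝑎 exists and the sequence is divergent.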
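$$\lim_{n \to \infty} \left(\frac{a_n}{b_n}\right) = \frac{\lim_{n \to \infty} a_n}{\lim_{n \to \infty} b_n} \quad \left(\text{provided } \lim_{n \to \infty} b_n \neq 0\right)$$
$$\lim_{n \to \infty} (x \cdot a_n) = x \cdot \lim_{n \to \infty} a_n$$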
Now suppose we have 3 sequences such that $a_n \leq c_n \leq b_n$, and $a_n$ and $b_n$ are convergent
with $\lim_{n \to \infty} a_n = \lim_{n \to \infty} b_n$. Then $c_n$ is also a convergent sequence with
$$\lim_{n \to \infty} c_n = \lim_{n \to \infty} a_n = \lim_{n \to \infty} b_n$$
Basically, given a subset 𝑀 of real numbers, the supremum is the lowest value that is greater
than or equal to every value present in the subset. Similarly, the infimum is the highest value
that is less than or equal to every value present in the subset.
If the set 𝑀 is not bounded from below, then the infimum becomes −∞. On the other hand, if the set 𝑀
is an empty set, then the infimum becomes ∞.
Suppose we have a sequence $(a_n), n \in N$. Then, we can define another infinite
sequence $(a_{n_k}), k \in N$ by picking elements of $(a_n)$ along strictly increasing
indices $n_1 < n_2 < n_3 < \ldots$. This sequence is called a sub-sequence. One useful thing to note
here is that if the original sequence converges to a point, then every sub-sequence also
converges to that point, i.e.
$$\lim_{n \to \infty} a_n = \lim_{k \to \infty} a_{n_k}$$
The application of using sub-sequences comes in a lot of cases. Let us take the example of
a sequence –
(𝑎 𝑛 ) = (−1) 𝑛
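We have already seen previously that this sequence doesn't converge and hence doesn't
have a limit value. Now, we create 2 sub-sequences as follows:
$$(a_{2n}) = (-1)^{2n} = 1 \quad \forall\, n \in N$$
$$(a_{2n+1}) = (-1)^{2n+1} = -1 \quad \forall\, n \in N$$
Each sub-sequence converges, and the values 1 and −1 they converge to are accumulation values of
the sequence. In general, we write
$$\limsup_{n \to \infty} a_n = a, \quad \liminf_{n \to \infty} a_n = b$$
where 𝑎 and 𝑏 are the highest and lowest accumulation values of the sequence. Also note
that $a, b \in R \cup \{-\infty, \infty\}$. What this means is that the limit superior and limit inferior can take
improper accumulation values as well.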
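The two sub-sequences of $(-1)^n$ are easy to tabulate. The sketch below is only an illustration: it looks at finitely many terms, and taking the maximum and minimum of a finite tail matches the highest and lowest accumulation values here only because this particular sequence keeps repeating the same two values.

```python
def a(n):
    """The sequence a_n = (-1)**n."""
    return (-1) ** n


even_sub = [a(2 * n) for n in range(1, 11)]      # a_2, a_4, ..., a_20
odd_sub = [a(2 * n + 1) for n in range(1, 11)]   # a_3, a_5, ..., a_21
tail = [a(n) for n in range(100, 200)]           # a finite stretch far along

print(set(even_sub), set(odd_sub))  # {1} {-1}
print(max(tail), min(tail))         # 1 -1
```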