MODULE 2: PROBABILITY &

PROBABILITY DISTRIBUTION

INTRODUCTION TO PROBABILITY
Permutations and Combinations
Principle of Counting: If an event can happen in 𝑛₁ ways and thereafter, for each of these, a second event can happen in 𝑛₂ ways, and for each of these first and second events, a third event can happen in 𝑛₃ ways, and so on, then the number of ways these 𝑚 events can happen together is given by the product: 𝑛₁ ⋅ 𝑛₂ ⋅ 𝑛₃ ⋯ 𝑛ₘ

Permutations: A permutation of a number of objects is their arrangement in some definite order.


Given three letters 𝑎, 𝑏, 𝑐 we can permute them two at a time as 𝑏𝑐, 𝑐𝑏, 𝑐𝑎, 𝑎𝑐, 𝑎𝑏, 𝑏𝑎 yielding 6
permutations. The combinations or groupings are only 3, i.e., 𝑏𝑐, 𝑐𝑎, 𝑎𝑏. Here the order is
immaterial.

The number of permutations of 𝑛 different things taken 𝑟 at a time is:

𝑛(𝑛 − 1)(𝑛 − 2) … (𝑛 − 𝑟 + 1), which is denoted by ⁿ𝑃ᵣ. Thus, ⁿ𝑃ᵣ = 𝑛!/(𝑛 − 𝑟)!

Permutations with repetitions: The number of permutations of 𝑛 objects of which 𝑛₁ are alike, 𝑛₂ are alike, and 𝑛₃ are alike is: 𝑛!/(𝑛₁! 𝑛₂! 𝑛₃!)

Combinations: The number of combinations of 𝑛 different objects taken 𝑟 at a time is denoted by ⁿ𝐶ᵣ. If we take any one of the combinations, its 𝑟 objects can be arranged in 𝑟! ways. So, the total number of combinations which can be obtained from all the arrangements is:

ⁿ𝐶ᵣ = ⁿ𝑃ᵣ / 𝑟! = 𝑛! / (𝑟! (𝑛 − 𝑟)!)

Note: ⁿ𝐶ᵣ = ⁿ𝐶ₙ₋ᵣ
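These counting formulas can be checked directly with Python's standard library (`math.perm` and `math.comb`); the following sketch mirrors the 𝑎, 𝑏, 𝑐 example above:

```python
from math import comb, factorial, perm

# Ordered arrangements vs unordered selections of a, b, c taken 2 at a time.
n, r = 3, 2
assert perm(n, r) == 6       # bc, cb, ca, ac, ab, ba
assert comb(n, r) == 3       # bc, ca, ab (order immaterial)

# nPr = n!/(n-r)!  and  nCr = nPr / r!
assert perm(n, r) == factorial(n) // factorial(n - r)
assert comb(n, r) == perm(n, r) // factorial(r)

# Symmetry: nCr = nC(n-r)
assert comb(10, 3) == comb(10, 7)
```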

Basic Terminology and Definition of Probability


Random experiment: Experiments which are performed essentially under the same conditions
and whose results cannot be predicted are known as random experiments. e.g., Tossing a coin or
rolling a die are random experiments.
Sample space: The set of all possible outcomes of a random experiment is called sample space
for that experiment and is denoted by 𝑆. The elements of the sample space 𝑆 are called the sample
points. e.g., On tossing a coin, the possible outcomes are the head (𝐻) and the tail (𝑇).
Thus 𝑆 = { 𝐻, 𝑇 }.

Event: The outcome of a random experiment is called an event. Thus, every subset of a sample
space 𝑆 is an event. The null set 𝜙 is also an event and is called an impossible event.

Exhaustive events. A set of events is said to be exhaustive, if it includes all the possible events.
For example, in tossing a coin there are two exhaustive cases either head or tail and there is no
third possibility.

Mutually exclusive events. If the occurrence of one of the events precludes the occurrence of all
other, then such a set of events is said to be mutually exclusive. Just as tossing a coin, either head
comes up or the tail and both can’t happen at the same time, i.e., these are two mutually exclusive
cases.

Equally likely events. If one of the events cannot be expected to happen in preference to another
then such events are said to be equally likely. For instance, in tossing a coin, the coming of the
head or the tail is equally likely.

Thus, when a die is thrown, the turning up of the six different faces of the die are exhaustive,
mutually exclusive and equally likely events.

Independent events: Two events are said to be independent, if happening or failure of one does
not affect the happening or failure of the other. Otherwise, the events are said to be dependent.

Definition of Probability: If there are 𝑛 exhaustive, mutually exclusive and equally likely cases
of which 𝑚 are favourable to an event 𝐴, then the probability (𝑝) of the happening of 𝐴 is 𝑃(𝐴) = 𝑚/𝑛.

As there are 𝑛 − 𝑚 cases in which 𝐴 will not happen (denoted by 𝐴′), the chance of 𝐴 not
happening is 𝑞 or 𝑃(𝐴′), so that 𝑞 = (𝑛 − 𝑚)/𝑛 = 1 − 𝑚/𝑛 = 1 − 𝑝, i.e., 𝑃(𝐴′) = 1 − 𝑃(𝐴), so that
𝑃(𝐴) + 𝑃(𝐴′) = 1

i.e., if an event is certain to happen then its probability is unity, while if it is certain not to happen,
its probability is zero. Thus, probability of an impossible event is zero, i.e., 𝑃(ϕ) = 0.

Axioms of Probability.
(i) The numerical value of probability lies between 0 and 1. i.e., for any event 𝐴 of 𝑆,
0 ≤ 𝑃(𝐴) ≤ 1.
(ii) The probability of the entire sample space is unity, i.e., 𝑃(𝑆) = 1.
(iii) The probability of an event made up of two or more mutually exclusive events is the sum of their probabilities.
Notations.
(i) Probability of happening of events 𝐴 or 𝐵 is written as 𝑃(𝐴 + 𝐵) 𝑜𝑟 𝑃(𝐴 ∪ 𝐵).
(ii) Probability of happening of both the events 𝐴 and 𝐵 is written as 𝑃(𝐴𝐵) 𝑜𝑟 𝑃(𝐴 ∩ 𝐵)
(iii) Event A implies (⇒) event B is expressed as 𝐴 ⊆ 𝐵.
(iv) Events A and B are mutually exclusive is expressed as 𝐴 ∩ 𝐵 = 𝜙

Theorems of Probability
Addition Law of Probability
1. If the probability of an event A happening as a result of a trial is P(A) and the probability of a
mutually exclusive event B happening is P(B), then the probability of either of the events
happening as a result of the trial is 𝑃(𝐴 + 𝐵) 𝑜𝑟 𝑃(𝐴 ∪ 𝐵) = 𝑃(𝐴) + 𝑃(𝐵).

In general, for 𝑛 mutually exclusive events 𝐴₁, 𝐴₂, …, 𝐴ₙ:

𝑃(𝐴₁ + 𝐴₂ + ⋯ + 𝐴ₙ) 𝑜𝑟 𝑃(𝐴₁ ∪ 𝐴₂ ∪ … ∪ 𝐴ₙ) = 𝑃(𝐴₁) + 𝑃(𝐴₂) + ⋯ + 𝑃(𝐴ₙ)

2. If A, B are any two events (not mutually exclusive), then


𝑃(𝐴 + 𝐵) = 𝑃(𝐴) + 𝑃(𝐵) − 𝑃(𝐴𝐵) or 𝑃(𝐴 ∪ 𝐵) = 𝑃(𝐴) + 𝑃(𝐵) − 𝑃(𝐴 ∩ 𝐵)
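The general addition law can be verified by brute-force enumeration; a short Python sketch over the 36 outcomes of two fair dice (the particular events 𝐴 and 𝐵 chosen here are illustrative, not from the text):

```python
from fractions import Fraction

# The 36 equally likely outcomes of rolling two dice.
outcomes = [(i, j) for i in range(1, 7) for j in range(1, 7)]

def prob(event):
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

first_is_six = lambda o: o[0] == 6          # event A
sum_is_seven = lambda o: o[0] + o[1] == 7   # event B (not mutually exclusive with A)

# P(A or B) = P(A) + P(B) - P(A and B)
lhs = prob(lambda o: first_is_six(o) or sum_is_seven(o))
rhs = (prob(first_is_six) + prob(sum_is_seven)
       - prob(lambda o: first_is_six(o) and sum_is_seven(o)))
assert lhs == rhs == Fraction(11, 36)
```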

Conditional Probability
For two dependent events A and B, the symbol 𝑃(𝐵/𝐴) denotes the probability of occurrence of
B when A has already occurred. It is known as the conditional probability and is read as
'the probability of B given A'.

Theorem of compound probability (Multiplication law of probability)


1. If the probability of an event A happening as a result of a trial is P(A), and after A has happened
   the probability of an event B happening (i.e., the conditional probability of B given A) is
   𝑃(𝐵/𝐴), then the probability of both the events A and B happening as a result of the trial is
   𝑃(𝐴𝐵) 𝑜𝑟 𝑃(𝐴 ∩ 𝐵) = 𝑃(𝐴) ⋅ 𝑃(𝐵/𝐴)

2. If the events A and B are independent, 𝑃(𝐵/𝐴) = 𝑃(𝐵) and 𝑃(𝐴/𝐵) = 𝑃(𝐴).
   ∴ 𝑃(𝐴𝐵) 𝑜𝑟 𝑃(𝐴 ∩ 𝐵) = 𝑃(𝐴) ⋅ 𝑃(𝐵).

In general, for 𝑛 independent events, 𝑃(𝐴₁𝐴₂ … 𝐴ₙ) 𝑜𝑟 𝑃(𝐴₁ ∩ 𝐴₂ ∩ … ∩ 𝐴ₙ) = 𝑃(𝐴₁) ⋅ 𝑃(𝐴₂) … 𝑃(𝐴ₙ).

Law of Total Probability


Suppose 𝐵₁, 𝐵₂, …, 𝐵ₖ are mutually exclusive and exhaustive events in a sample space 𝑆 with 𝑃(𝐵ᵢ) ≠ 0
for 𝑖 = 1, 2, …, 𝑘. Then for any event A of S,

𝑃(𝐴) = 𝑃(𝐵₁)𝑃(𝐴/𝐵₁) + 𝑃(𝐵₂)𝑃(𝐴/𝐵₂) + ⋯ + 𝑃(𝐵ₖ)𝑃(𝐴/𝐵ₖ)


BAYES THEOREM (THEOREM OF INVERSE PROBABILITY)
Bayes Theorem
An event 𝐴 corresponds to a number of exhaustive events 𝐵₁, 𝐵₂, 𝐵₃, …, 𝐵ₙ. If 𝑃(𝐵ᵢ) and 𝑃(𝐴/𝐵ᵢ)
are given, then

𝑃(𝐵ᵢ/𝐴) = 𝑃(𝐵ᵢ)𝑃(𝐴/𝐵ᵢ) / ∑ 𝑃(𝐵ᵢ)𝑃(𝐴/𝐵ᵢ)

Proof: By the multiplication law of probability,

𝑃(𝐴𝐵ᵢ) = 𝑃(𝐴)𝑃(𝐵ᵢ/𝐴) = 𝑃(𝐵ᵢ)𝑃(𝐴/𝐵ᵢ) … (1)

∴ 𝑃(𝐵ᵢ/𝐴) = 𝑃(𝐵ᵢ)𝑃(𝐴/𝐵ᵢ) / 𝑃(𝐴) … (2)

Since the event 𝐴 corresponds to 𝐵₁, 𝐵₂, 𝐵₃, …, 𝐵ₙ, by the addition law of probability,

𝑃(𝐴) = 𝑃(𝐴𝐵₁) + 𝑃(𝐴𝐵₂) + ⋯ + 𝑃(𝐴𝐵ₙ) = ∑𝑃(𝐴𝐵ᵢ) = ∑𝑃(𝐵ᵢ)𝑃(𝐴/𝐵ᵢ)

Hence from (2) we have 𝑃(𝐵ᵢ/𝐴) = 𝑃(𝐵ᵢ)𝑃(𝐴/𝐵ᵢ) / ∑𝑃(𝐵ᵢ)𝑃(𝐴/𝐵ᵢ)

Note: The probabilities 𝑃(𝐵ᵢ), 𝑖 = 1, 2, …, 𝑛, are called "a priori" probabilities because they exist
before we get any information from the experiment.

The probabilities 𝑃(𝐵ᵢ/𝐴), 𝑖 = 1, 2, …, 𝑛, are called "a posteriori" probabilities, because they are
found after the experiment results are known.

Examples
1. There are three bags: first containing 1 white, 2 red, 3 green balls; second 2 white, 3 red, 1
green ball; and third 3 white, 1 red, 2 green balls. Two balls are drawn from a randomly chosen
bag. These are found to be one white and one red. Find the probability that the balls so drawn
came from the second bag.

Solution: Let 𝐵₁, 𝐵₂, 𝐵₃ denote the events that the first, second, and third bag is chosen, and let 𝐴 be the event
that the two balls drawn are one white and one red.
Now, 𝑃(𝐵₁) = 𝑃(𝐵₂) = 𝑃(𝐵₃) = 1/3
𝑃(𝐴/𝐵₁) = 𝑃(𝑎 𝑤ℎ𝑖𝑡𝑒 𝑎𝑛𝑑 𝑎 𝑟𝑒𝑑 𝑏𝑎𝑙𝑙 𝑎𝑟𝑒 𝑑𝑟𝑎𝑤𝑛 𝑓𝑟𝑜𝑚 𝑡ℎ𝑒 𝑓𝑖𝑟𝑠𝑡 𝑏𝑎𝑔) = (¹𝐶₁ × ²𝐶₁)/⁶𝐶₂ = 2/15
Similarly, 𝑃(𝐴/𝐵₂) = (²𝐶₁ × ³𝐶₁)/⁶𝐶₂ = 2/5 and 𝑃(𝐴/𝐵₃) = (³𝐶₁ × ¹𝐶₁)/⁶𝐶₂ = 1/5
By Bayes' theorem,
𝑃(𝐵₂/𝐴) = 𝑃(𝐵₂)𝑃(𝐴/𝐵₂) / [𝑃(𝐵₁)𝑃(𝐴/𝐵₁) + 𝑃(𝐵₂)𝑃(𝐴/𝐵₂) + 𝑃(𝐵₃)𝑃(𝐴/𝐵₃)]
= [(1/3) × (2/5)] / [(1/3) × (2/15) + (1/3) × (2/5) + (1/3) × (1/5)]
= 6/11
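The arithmetic of this example can be reproduced exactly with Python's `fractions` module; a sketch of the same computation:

```python
from fractions import Fraction
from math import comb

# Bags as (white, red, green) counts; each bag has prior probability 1/3.
bags = [(1, 2, 3), (2, 3, 1), (3, 1, 2)]
priors = [Fraction(1, 3)] * 3

def likelihood(bag):
    # P(one white and one red | this bag) = (wC1 * rC1) / (totalC2)
    w, r, g = bag
    return Fraction(comb(w, 1) * comb(r, 1), comb(w + r + g, 2))

# Law of total probability, then Bayes' theorem for the second bag.
evidence = sum(pr * likelihood(b) for pr, b in zip(priors, bags))
posterior_bag2 = priors[1] * likelihood(bags[1]) / evidence

assert likelihood(bags[0]) == Fraction(2, 15)
assert posterior_bag2 == Fraction(6, 11)
```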

2. In a certain college, 4% of the boys and 1% of the girls are taller than 1.8 m. Furthermore, 60%
of the students are girls. If a student is selected at random and is found to be taller than 1.8 m,
what is the probability that the student is a girl?

Solution: Let B₁ be the event that the student is a boy, B₂ be the event that the student is a girl and
A be the event that the student is taller than 1.8 m.
We have: P(B₁) = 0.4, P(B₂) = 0.6 (since 60% of students are girls, the remaining 40% are boys),
P(A/B₁) = 0.04 (4% of boys are taller than 1.8 m),
P(A/B₂) = 0.01 (1% of girls are taller than 1.8 m)
By Bayes' Theorem, the probability that a randomly selected student who is taller than 1.8 m is a
girl is:
𝑃(𝐵₂/𝐴) = 𝑃(𝐵₂)𝑃(𝐴/𝐵₂) / [𝑃(𝐵₁)𝑃(𝐴/𝐵₁) + 𝑃(𝐵₂)𝑃(𝐴/𝐵₂)]
= (0.6 × 0.01) / [(0.4 × 0.04) + (0.6 × 0.01)]
= 0.006 / 0.022
= 3/11 ≈ 0.2727
Thus, the probability that the student is a girl, given that they are taller than 1.8 m, is 0.2727

3. Three machines 𝑀₁, 𝑀₂, and 𝑀₃ produce identical items. Of their respective output, 5%, 4%,
   and 3% of items are faulty. On a certain day, 𝑀₁ has produced 25% of the total output, 𝑀₂ has
   produced 30%, and 𝑀₃ the remainder. An item selected at random is found to be faulty. What
   are the chances that it was produced by the machine with the highest output?

Solution: Let the event of drawing a faulty item from any of the machines be 𝐴, and the event that
an item drawn at random was produced by 𝑀ᵢ be 𝐵ᵢ. We have to find 𝑃(𝐵₃/𝐴), for which we
proceed as follows:

                   𝑀₁             𝑀₂             𝑀₃             Remarks
𝑃(𝐵ᵢ)              0.25           0.30           0.45           Sum = 1
𝑃(𝐴/𝐵ᵢ)            0.05           0.04           0.03
𝑃(𝐵ᵢ)𝑃(𝐴/𝐵ᵢ)       0.0125         0.012          0.0135         Sum = 0.038
𝑃(𝐵ᵢ/𝐴)            0.0125/0.038   0.012/0.038    0.0135/0.038   By Bayes' theorem

The highest output being from 𝑀₃, the required probability = 0.0135/0.038 ≈ 0.355.
Exercise
1. In a certain college, 25% of boys and 10% of girls are studying mathematics. The girls
   constitute 60% of the student body. (a) What is the probability that mathematics is being
   studied? (b) If a student is selected at random and is found to be studying mathematics, find
   the probability that the student is a girl; (c) a boy. (Ans: 4/25, 3/8, 5/8)

2. Three suppliers X, Y, Z supply items to an establishment in the proportion of 1/2, 1/3 and 1/6
respectively. Of the items supplied by X, Y, Z, 5%, 6% and 8% respectively are found to be
defective. An item taken at random from the lot of all items supplied is found to be defective.
What are the probabilities that it was supplied by X, Y and Z respectively?
(Ans: X:15/35, Y:12/35, Z:8/35)

3. In a bolt factory there are four machines A, B, C and D manufacturing 20%, 15%, 25% and
40% respectively. Of their outputs 5%, 4%, 3% and 2% in the same order are defective bolts.
A bolt is chosen at random from the factory’s production and is found defective. What is the
probability that it was manufactured by Machine A or Machine D? (Ans: 0.3175, 0.254)

RANDOM VARIABLE
If a real variable 𝑋 be associated with the outcome of a random experiment, then since the values
which 𝑋 takes depend on chance, it is called a random variable or a stochastic variable or simply
a variate.

For instance, if 𝐸 consists of two tosses of a coin, we may consider the random variable as the
number of heads (0, 1 or 2). Then 𝑋 is the random variable. It is a function whose values are real
numbers and depend on chance. The set of values which 𝑋 takes is called the spectrum of the
random variable.
Outcome HH HT TH TT
Value of X 2 1 1 0

If in a random experiment, the event corresponding to a number 𝑎 occurs, then the corresponding
random variable 𝑋 is said to assume the value 𝑎 and the probability of the event is denoted by
𝑃(𝑋 = 𝑎). Similarly, the probability of the event 𝑋 assuming any value in the interval 𝑎 < 𝑋 < 𝑏
is denoted by 𝑃(𝑎 < 𝑋 < 𝑏). The probability of the event 𝑋 < 𝑐 is written as 𝑃(𝑋 < 𝑐).
Thus, for the example given above, 𝑃(𝑋 ≤ 1) = 𝑃(𝑇𝑇, 𝐻𝑇, 𝑇𝐻) = 3/4.

If a random variable takes a finite set of values, it is called a discrete variate. On the other hand,
if it assumes an infinite number of uncountable values, it is called a continuous variate.
Discrete Probability Distribution (Probability Mass Function)
Suppose a discrete variate 𝑋 is the outcome of some experiment. If the probability that 𝑋 takes the
value 𝑥ᵢ is 𝑝ᵢ, then 𝑃(𝑋 = 𝑥ᵢ) = 𝑝ᵢ, or 𝑝(𝑥ᵢ), for 𝑖 = 1, 2, …, where
i) 𝑝(𝑥ᵢ) ≥ 0 for all values of 𝑖,
ii) ∑ 𝑝(𝑥ᵢ) = 1
The set of values 𝑥ᵢ with their probabilities 𝑝ᵢ, i.e., (𝑥ᵢ, 𝑝ᵢ), constitute a discrete probability
distribution of the discrete variate 𝑋.

For example, the discrete probability distribution for 𝑋, the minimum of the two numbers that
appear in a single throw of a pair of fair dice is given by:

𝑋=𝑥 1 2 3 4 5 6

𝑃(𝑋 = 𝑥 ) 11/36 9/36 7/36 5/36 3/36 1/36

Since the event "minimum 5" can appear only as (5,5), (5,6), (6,5), 𝑋 assigns to this event of the
sample space the real number 5. The probability of this event happening is 3/36 since there are
36 exhaustive cases.
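This distribution can be reproduced by enumerating the 36 outcomes; a short Python check:

```python
from fractions import Fraction

# pmf of X = min of the two numbers shown by a pair of fair dice, by enumeration.
pmf = {x: Fraction(0) for x in range(1, 7)}
for i in range(1, 7):
    for j in range(1, 7):
        pmf[min(i, j)] += Fraction(1, 36)

assert pmf[1] == Fraction(11, 36)
assert pmf[5] == Fraction(3, 36)     # exactly (5,5), (5,6), (6,5)
assert sum(pmf.values()) == 1
```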

Distribution function of the discrete variable 𝑿.


The distribution function 𝐹(𝑥) of the discrete variate 𝑋 is defined by

𝐹(𝑥) = 𝑃(𝑋 ≤ 𝑥) = ∑ₓᵢ≤ₓ 𝑝(𝑥ᵢ), where 𝑥 is any integer.

The distribution function is also sometimes called cumulative distribution function.

Example
1. The probability mass function of a variate 𝑋 is

𝑋 0 1 2 3 4 5 6

𝑝(𝑋) 𝑘 3𝑘 5𝑘 7𝑘 9𝑘 11𝑘 13𝑘

i) Find 𝑃(𝑋 < 4), 𝑃(𝑋 ≥ 5) and 𝑃(3 < 𝑋 ≤ 6)


ii) What will be the minimum value of 𝑘 so that 𝑃(𝑋 ≤ 2) > 0.3
Solution: Since 𝑋 is a random variable,

∑ 𝑝(𝑋) = 1 ⇒ 𝑘 + 3𝑘 + 5𝑘 + 7𝑘 + 9𝑘 + 11𝑘 + 13𝑘 = 49𝑘 = 1 ⇒ 𝑘 = 1/49

i) 𝑃(𝑋 < 4) = 𝑘 + 3𝑘 + 5𝑘 + 7𝑘 = 16𝑘 = 16/49


𝑃(𝑋 ≥ 5) = 11𝑘 + 13𝑘 = 24𝑘 = 24/49
𝑃(3 < 𝑋 ≤ 6) = 9𝑘 + 11𝑘 + 13𝑘 = 33𝑘 = 33/49
ii) 𝑃(𝑋 ≤ 2) > 0.3
⇒ 𝑘 + 3𝑘 + 5𝑘 > 0.3, i.e., 9𝑘 > 0.3 ⇒ 𝑘 > 1/30
Thus, the minimum value of 𝑘 is 1/30
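A quick Python check of these values using exact fractions:

```python
from fractions import Fraction

# p(X = x) is proportional to the odd numbers 1, 3, ..., 13 for x = 0..6,
# so the normalising constant k comes from requiring the total probability to be 1.
weights = [1, 3, 5, 7, 9, 11, 13]
k = Fraction(1, sum(weights))              # k = 1/49
pmf = [k * w for w in weights]

assert k == Fraction(1, 49)
assert sum(pmf[:4]) == Fraction(16, 49)    # P(X < 4)
assert sum(pmf[5:]) == Fraction(24, 49)    # P(X >= 5)
assert sum(pmf[4:]) == Fraction(33, 49)    # P(3 < X <= 6)
```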

Exercise
1. A random variable 𝑋 has following probability function

𝑋       0    1    2     3     4     5     6      7

𝑝(𝑋)    0    𝑘    2𝑘    2𝑘    3𝑘    𝑘²    2𝑘²    7𝑘² + 𝑘

Evaluate 𝑃(𝑋 < 6), 𝑃(𝑋 ≥ 6) and 𝑃(0 < 𝑋 < 5)


(Ans: 81/100, 19/100, 4/5)

Continuous Probability Distribution (Probability density function)


The probability distribution of a continuous variate 𝑥 is defined by a function 𝑓(𝑥) such that the
probability of the variate 𝑥 falling in the small interval 𝑥 − ½𝑑𝑥 to 𝑥 + ½𝑑𝑥 is 𝑓(𝑥)𝑑𝑥.
Symbolically it can be expressed as 𝑃(𝑥 − ½𝑑𝑥 ≤ 𝑥 ≤ 𝑥 + ½𝑑𝑥) = 𝑓(𝑥)𝑑𝑥. Then 𝑓(𝑥) is called the
probability density function and the curve 𝑦 = 𝑓(𝑥) is called the probability curve.
When the range is finite, it is convenient to consider it as infinite by supposing the density function
to be zero outside the given range.

The density function 𝑓(𝑥) is always positive and ∫₋∞^∞ 𝑓(𝑥)𝑑𝑥 = 1 (i.e., the total area under the
probability curve and the x-axis is unity, which corresponds to the requirement that the total
probability of happening of an event is unity).
Distribution function of the continuous variable 𝑿:
If 𝐹(𝑥) = 𝑃(𝑋 ≤ 𝑥) = ∫₋∞^𝑥 𝑓(𝑡)𝑑𝑡, then 𝐹(𝑥) is defined as the cumulative distribution function
or simply the distribution function of the continuous variate 𝑋.

The distribution function 𝐹(𝑥) has the following properties:


i) 𝐹′(𝑥) = 𝑓(𝑥) ≥ 0; thus 𝐹(𝑥) is a non-decreasing function.
ii) 𝐹(−∞) = 0.
iii) 𝐹(∞) = 1.
iv) 𝑃(𝑎 ≤ 𝑥 ≤ 𝑏) = ∫ₐᵇ 𝑓(𝑥)𝑑𝑥 = ∫₋∞ᵇ 𝑓(𝑥)𝑑𝑥 − ∫₋∞ᵃ 𝑓(𝑥)𝑑𝑥 = 𝐹(𝑏) − 𝐹(𝑎)

Note: In the case of a discrete random variable, the probability at a point, i.e., 𝑃(𝑋 = 𝑐), need not be
zero for a fixed 𝑐. However, in the case of a continuous random variable, the probability at a point is
always zero, i.e., 𝑃(𝑋 = 𝑐) = 0 for all possible values of 𝑐.

Example
1. The diameter of an electric cable; say 𝑋, is assumed to be a continuous random variable with
p.d.f. 𝑓(𝑥) = 6𝑥(1 − 𝑥), 0 ≤ 𝑥 ≤ 1.
i) Check that above is probability density function.
ii) Determine a number 𝑏 such that 𝑃 (𝑋 < 𝑏) = 𝑃(𝑋 > 𝑏)

Solution:
i) For 0 ≤ 𝑥 ≤ 1, 𝑓(𝑥) ≥ 0, and ∫₀¹ 𝑓(𝑥)𝑑𝑥 = ∫₀¹ 6𝑥(1 − 𝑥)𝑑𝑥 = 1. Thus 𝑓(𝑥) is a p.d.f. of the variate 𝑋.

ii) 𝑃(𝑋 < 𝑏) = 𝑃(𝑋 > 𝑏)

⇒ ∫₀ᵇ 𝑓(𝑥)𝑑𝑥 = ∫ᵇ¹ 𝑓(𝑥)𝑑𝑥

⇒ ∫₀ᵇ 6𝑥(1 − 𝑥)𝑑𝑥 = ∫ᵇ¹ 6𝑥(1 − 𝑥)𝑑𝑥

⇒ 3𝑏² − 2𝑏³ = 1 − 3𝑏² + 2𝑏³
⇒ 𝑏 = 1/2 is the only real value lying between 0 and 1 and satisfying the above equation.
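Both parts can be confirmed numerically; a small Python sketch using a midpoint Riemann sum (standard library only):

```python
# f(x) = 6x(1 - x) on [0, 1]. Verify numerically that the total area is 1
# and that b = 1/2 splits the probability into two equal halves.
def f(x):
    return 6 * x * (1 - x)

def integrate(g, a, b, n=100_000):
    # Composite midpoint rule; accurate to well below 1e-9 for this smooth f.
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

assert abs(integrate(f, 0, 1) - 1) < 1e-9
b = 0.5
assert abs(integrate(f, 0, b) - integrate(f, b, 1)) < 1e-9
```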

Exercise
1. Is the function

   𝑓(𝑥) = 𝑒⁻ˣ,  𝑥 ≥ 0
   𝑓(𝑥) = 0,  𝑥 < 0

   a density function? If so, determine the probability that the variate having this density will fall
   in the interval (1, 2). Also find the cumulative probability 𝐹(2). (Ans: 0.233, 0.865)
MATHEMATICAL EXPECTATION
The mean value (𝜇) of the probability distribution of a variate 𝑋 is commonly known as its
expectation and is denoted by 𝐸(𝑋). If 𝑝(𝑥 ) is the probability mass function and 𝑓(𝑥) the
probability density function of the variate 𝑋, then

𝐸(𝑋) = ∑ᵢ 𝑥ᵢ 𝑝(𝑥ᵢ) (for a discrete distribution)

𝐸(𝑋) = ∫₋∞^∞ 𝑥𝑓(𝑥)𝑑𝑥 (for a continuous distribution)

Variance of a distribution is given by:

𝜎² = ∑ᵢ 𝑝(𝑥ᵢ)(𝑥ᵢ − 𝜇)² = ∑ᵢ 𝑝(𝑥ᵢ)𝑥ᵢ² − 𝜇² (for a discrete distribution)
𝜎² = ∫₋∞^∞ 𝑓(𝑥)(𝑥 − 𝜇)² 𝑑𝑥 (for a continuous distribution)
where 𝜎 is the standard deviation of the distribution.

Example
1. 𝑋 is a continuous random variable with probability density function given by:
   𝑓(𝑥) = 𝑘𝑥,  0 ≤ 𝑥 < 2
   𝑓(𝑥) = 2𝑘,  2 ≤ 𝑥 < 4
   𝑓(𝑥) = −𝑘𝑥 + 6𝑘,  4 ≤ 𝑥 < 6
   Find 𝑘 and the mean value of 𝑋.

Solution: Since the total probability is unity, ∫₀⁶ 𝑓(𝑥)𝑑𝑥 = 1

⇒ ∫₀² 𝑘𝑥 𝑑𝑥 + ∫₂⁴ 2𝑘 𝑑𝑥 + ∫₄⁶ (−𝑘𝑥 + 6𝑘) 𝑑𝑥 = 1

⇒ 2𝑘 + 4𝑘 + (−10𝑘 + 12𝑘) = 1
⇒ 𝑘 = 1/8

Mean of 𝑋, 𝐸(𝑋) = ∫₀⁶ 𝑥𝑓(𝑥)𝑑𝑥 = ∫₀² 𝑘𝑥² 𝑑𝑥 + ∫₂⁴ 2𝑘𝑥 𝑑𝑥 + ∫₄⁶ (−𝑘𝑥 + 6𝑘)𝑥 𝑑𝑥 = 24𝑘 = 3

2. A die is tossed thrice. A success is getting 1 or 6 on a toss. Find the mean and variance of the
number of successes.

Solution: Probability of a success = 2/6 = 1/3; probability of a failure = 1 − 1/3 = 2/3.

Probability of no success = probability of all 3 failures = (2/3) × (2/3) × (2/3) = 8/27
Probability of one success and two failures = 3 × (1/3) × (2/3) × (2/3) = 4/9
Probability of two successes and one failure = 3 × (1/3) × (1/3) × (2/3) = 2/9
Probability of all 3 successes = (1/3) × (1/3) × (1/3) = 1/27

𝑥     0      1     2     3

𝑝     8/27   4/9   2/9   1/27

𝐸(𝑋) = 𝜇 = 0 × 8/27 + 1 × 4/9 + 2 × 2/9 + 3 × 1/27 = 1

σ² = ∑𝑝𝑥² − μ² = [0² × 8/27 + 1² × 4/9 + 2² × 2/9 + 3² × 1/27] − 1² = 2/3
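The same mean and variance follow from the binomial pmf with n = 3, p = 1/3; a Python check with exact fractions:

```python
from fractions import Fraction
from math import comb

# X = number of successes (a 1 or a 6) in 3 tosses: binomial with n = 3, p = 1/3.
p, q, n = Fraction(1, 3), Fraction(2, 3), 3
pmf = {x: comb(n, x) * p**x * q**(n - x) for x in range(n + 1)}

mean = sum(x * px for x, px in pmf.items())
var = sum(x * x * px for x, px in pmf.items()) - mean**2

assert pmf[0] == Fraction(8, 27) and pmf[3] == Fraction(1, 27)
assert mean == 1
assert var == Fraction(2, 3)
```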

Exercise
1. Determine the discrete probability distribution, mathematical expectation, variance, standard
deviation of a discrete random variable 𝑋 which denotes the minimum of the two numbers that
appear when a pair of fair dice is thrown once.
(Ans: 𝐸(𝑋) = 2.5, 𝜎 = 1.4)

Note: In a gambling game, expected value 𝐸 of the game is considered to be the value of the game
to the player. Game is favorable to the player if 𝐸 > 0, unfavorable if 𝐸 < 0 and fair if 𝐸 = 0

2. A player tosses 3 fair coins. He wins Rs. 500 if 3 heads occur, Rs.300 if 2 heads occur, Rs.100
if one head occurs. On the other hand, he loses Rs.1500 if 3 tails occur. Find the value of the
game to the player. Is it favorable? (Ans: 𝐸(𝑋) = 𝑅𝑠. 25, Favorable)

3. Suppose a continuous variate 𝑋 has the probability density

   𝑓(𝑥) = 𝑘(1 − 𝑥²),  0 < 𝑥 < 1
   𝑓(𝑥) = 0,  elsewhere

a. Find 𝑘
b. Find 𝑃(0.1 < 𝑥 < 0.2)
c. Find 𝑃(𝑥 > 0.5)
d. Using distribution function, determine the probabilities that:
1. 𝑥 is less than 0.3
2. 𝑥 is between 0.4 and 0.6
e. Calculate mean and variance for the probability density function
(Ans: a: 3/2, b: 0.1465, c: 0.3125, d-1: 0.4365, d-2: 0.224, e: 𝐸(𝑋) = 3/8 & 𝜎² = 19/320)
BERNOULLI TRIAL
 An experiment is repeated 𝑛 number of times, called 𝑛 trials where 𝑛 is a fixed integer.
 The outcome of each trial is classified into two mutually exclusive (dichotomous) categories
arbitrarily called a “success” and a “failure”.
 Probability of success, denoted by 𝑝, remains constant for all trials.
 The outcomes are independent (of the outcomes of the previous trials)

BINOMIAL DISTRIBUTION
Binomial Distribution is a discrete probability distribution concerned with trials of a repetitive
nature in which only the occurrence or non-occurrence, success or failure, acceptance or rejection,
yes or no of a particular event is of interest.

Examples:
 Number of defectives in a sample from production line
 Estimation of reliability of systems.
 Number of rounds fired from a gun hitting a target
 Radar detection.

If a Bernoulli trial is repeated 𝑛 times and if 𝑝 is the probability of a success and 𝑞 that of a failure,
then the probability of 𝑟 successes and 𝑛 − 𝑟 failures (in a given order) is 𝑝ʳ𝑞ⁿ⁻ʳ.
These 𝑟 successes and 𝑛 − 𝑟 failures can occur in any of the ⁿ𝐶ᵣ ways, in each of which the
probability is the same. Thus, the probability of 𝑟 successes is ⁿ𝐶ᵣ 𝑝ʳ𝑞ⁿ⁻ʳ.

The binomial variate 𝑋 is the number of successes in 𝑛 Bernoulli trials. 𝑋 is discrete since 𝑋 takes
only integer values. The binomial distribution is thus the probability distribution of this discrete
random variable 𝑋, and is given by:
𝑃(𝑋 = 𝑥) = ⁿ𝐶ₓ 𝑝ˣ𝑞ⁿ⁻ˣ,  𝑥 = 0, 1, 2, …, 𝑛

These probabilities are the successive terms in the expansion of the binomial (𝑞 + 𝑝)ⁿ, and hence
the distribution is called the binomial distribution.

The two independent constants 𝑛 and 𝑝 in the distribution are known as the parameters of the
distribution. The notation 𝑋~𝐵(𝑛, 𝑝) is used to denote that the random variable 𝑋 follows binomial
distribution with parameters 𝑛 and 𝑝.

The sum of the probabilities

= 𝑞ⁿ + ⁿ𝐶₁ 𝑝𝑞ⁿ⁻¹ + ⁿ𝐶₂ 𝑝²𝑞ⁿ⁻² + … + ⁿ𝐶ᵣ 𝑝ʳ𝑞ⁿ⁻ʳ + ⋯ + 𝑝ⁿ = (𝑞 + 𝑝)ⁿ = 1
Constants of the Binomial distribution
𝑀𝑒𝑎𝑛 = 𝑛𝑝
𝑆𝑡𝑎𝑛𝑑𝑎𝑟𝑑 𝐷𝑒𝑣𝑖𝑎𝑡𝑖𝑜𝑛 = √(𝑛𝑝𝑞)

Binomial frequency distribution.


If 𝑛 independent trials constitute one experiment and this experiment be repeated 𝑁 times, then
the frequency of 𝑟 successes is 𝑁 ⋅ ⁿ𝐶ᵣ 𝑝ʳ𝑞ⁿ⁻ʳ. The possible number of successes together with
these expected frequencies constitute the binomial frequency distribution.
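A short Python sketch of the binomial pmf, confirming the total probability and the constants numerically (the choice n = 12, p = 0.1 here matches the defective-pen setting of the next example):

```python
from math import comb, sqrt

# Binomial pmf: P(X = x) = nCx p^x q^(n-x)
def binom_pmf(x, n, p):
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 12, 0.1
probs = [binom_pmf(x, n, p) for x in range(n + 1)]

assert abs(sum(probs) - 1) < 1e-12                     # probabilities sum to (q+p)^n = 1
mean = sum(x * px for x, px in enumerate(probs))
var = sum(x * x * px for x, px in enumerate(probs)) - mean**2
assert abs(mean - n * p) < 1e-9                        # mean = np
assert abs(sqrt(var) - sqrt(n * p * (1 - p))) < 1e-9   # sd = sqrt(npq)
```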

Example
1. The probability that a pen manufactured by a company will be defective is 1/10. If 12 such pens
   are manufactured, find the probability that
   (a) exactly two will be defective.
   (b) at least two will be defective.
   (c) none will be defective.

Solution: The probability of a defective pen is 1/10 = 0.1

∴ The probability of a non-defective pen is 1 − 0.1 = 0.9
(a) The probability that exactly two will be defective = 𝑃(𝑋 = 2)
= ¹²𝐶₂ (0.1)² (0.9)¹⁰ = 0.2301
(b) The probability that at least two will be defective = 𝑃(𝑋 ≥ 2)
= 1 − (𝑝𝑟𝑜𝑏. 𝑡ℎ𝑎𝑡 𝑒𝑖𝑡ℎ𝑒𝑟 𝑛𝑜𝑛𝑒 𝑜𝑟 𝑜𝑛𝑒 𝑖𝑠 𝑑𝑒𝑓𝑒𝑐𝑡𝑖𝑣𝑒) = 1 − [𝑃(𝑋 = 0) + 𝑃(𝑋 = 1)]
= 1 − [¹²𝐶₀(0.9)¹² + ¹²𝐶₁(0.1)(0.9)¹¹] = 0.3412
(c) The probability that none will be defective = 𝑃(𝑋 = 0)
= ¹²𝐶₀(0.9)¹² = 0.2824.

2. In 256 sets of 12 tosses of a coin, in how many cases one can expect 8 heads and 4 tails.

Solution: 𝑃(ℎ𝑒𝑎𝑑) = 1/2 and 𝑃(𝑡𝑎𝑖𝑙) = 1/2

By the binomial distribution, the probability of 8 heads and 4 tails in 12 trials is

𝑃(𝑋 = 8) = ¹²𝐶₈ (1/2)⁸ (1/2)⁴ = (12!/(8! 4!)) (1/2)¹² = 495/4096

∴ The expected number of such cases in 256 sets

= 256 × 𝑃(𝑋 = 8) = 256 × 495/4096
= 30.9 ≈ 31.
3. In sampling a large number of parts manufactured by a machine, the mean number of defectives
in a sample of 20 is 2. Out of 1000 such samples, how many would be expected to contain at
least 3 defective parts?

Solution: Mean number of defectives = 2 = 𝑛𝑝 = 20𝑝.

∴ The probability of a defective part is 𝑝 = 2/20 = 0.1.
And the probability of a non-defective part = 0.9.

The probability of at least three defectives in a sample of 20

= 1 − (𝑝𝑟𝑜𝑏. 𝑡ℎ𝑎𝑡 𝑒𝑖𝑡ℎ𝑒𝑟 𝑛𝑜𝑛𝑒, 𝑜𝑟 𝑜𝑛𝑒, 𝑜𝑟 𝑡𝑤𝑜 𝑎𝑟𝑒 𝑑𝑒𝑓𝑒𝑐𝑡𝑖𝑣𝑒)
= 1 − [²⁰𝐶₀(0.9)²⁰ + ²⁰𝐶₁(0.1)(0.9)¹⁹ + ²⁰𝐶₂(0.1)²(0.9)¹⁸]
= 1 − (0.9)¹⁸ × 4.51 = 0.323

Thus, the number of samples having at least three defective parts out of 1000 samples
= 1000 × 0.323 = 323.

4. The following data are the number of seeds germinating out of 10 on damp filter paper for 80
sets of seeds. Fit a binomial distribution to these data and test its goodness of fit.

𝑥: 0 1 2 3 4 5 6 7 8 9 10

𝑓: 6 20 28 12 8 6 0 0 0 0 0

Solution: Here 𝑛 = 10 and 𝑁 = 𝛴𝑓 = 80

Mean = 𝛴𝑓𝑥 / 𝛴𝑓 = 174/80 = 2.175

Now the mean of a binomial distribution = 𝑛𝑝

∴ 𝑛𝑝 = 10𝑝 = 2.175 ⟹ 𝑝 = 0.2175, 𝑞 = 1 − 𝑝 = 0.7825

Hence the binomial distribution to be fitted is

𝑁(𝑞 + 𝑝)¹⁰ = 80 × (0.7825 + 0.2175)¹⁰
= 80 × [¹⁰𝐶₀ (0.7825)¹⁰ + ¹⁰𝐶₁ (0.7825)⁹ (0.2175)¹
+ ¹⁰𝐶₂ (0.7825)⁸ (0.2175)² + … + ¹⁰𝐶₁₀ (0.2175)¹⁰]
= 6.885 + 19.13 + 23.94 + … + 0.0007 + 0.00002
∴ The successive terms in the expansion give the expected or theoretical frequencies which are

𝑥: 0 1 2 3 4 5 6 7 8 9 10

𝑓: 6.9 19.1 24.0 17.8 8.6 2.9 0.7 0.1 0 0 0
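The fitting procedure can be reproduced in a few lines of Python; the expected frequencies match the table above up to rounding:

```python
from math import comb

# Fit a binomial to the germination data: n = 10 seeds per set, N = 80 sets,
# with p estimated from the observed mean number of germinating seeds.
freq = [6, 20, 28, 12, 8, 6, 0, 0, 0, 0, 0]
n, N = 10, sum(freq)
mean = sum(x * f for x, f in enumerate(freq)) / N     # = 2.175
p = mean / n                                          # = 0.2175
expected = [N * comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)]

assert N == 80 and abs(mean - 2.175) < 1e-12
assert abs(sum(expected) - 80) < 1e-9                 # expected frequencies total N
assert abs(expected[0] - 6.885) < 0.01
assert abs(expected[2] - 23.94) < 0.01
```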


5. A department in a manufacturing firm has 10 machines which may need adjustment from time
   to time during the day. Three of these machines are old, each having a probability of 1/11 of
   needing adjustment during the day, and 7 are new, having a corresponding probability of 1/21.
   Assuming that no machine needs adjustment twice on the same day, determine the probabilities
   that on a particular day:

   (a) just 2 old and no new machines need adjustment.

   (b) if just 2 machines need adjustment, they are of the same type.

Solution: Let 𝑝₁ = probability that an old machine needs adjustment = 1/11. Thus 𝑞₁ = 10/11.

And 𝑝₂ = probability that a new machine needs adjustment = 1/21. Thus 𝑞₂ = 20/21.
Then 𝑃₁(𝑟), the probability that 𝑟 old machines need adjustment, is:

𝑃₁(𝑟) = ³𝐶ᵣ 𝑝₁ʳ 𝑞₁³⁻ʳ = ³𝐶ᵣ (1/11)ʳ (10/11)³⁻ʳ

And 𝑃₂(𝑟), the probability that 𝑟 new machines need adjustment, is:

𝑃₂(𝑟) = ⁷𝐶ᵣ 𝑝₂ʳ 𝑞₂⁷⁻ʳ = ⁷𝐶ᵣ (1/21)ʳ (20/21)⁷⁻ʳ

(a) The probability that just two old and no new machines need adjustment is given by:

𝑃₁(2) × 𝑃₂(0) = ³𝐶₂ (1/11)² (10/11) × (20/21)⁷ = 0.016

(b) Similarly, the probability that just 2 new machines and no old machine need adjustment is:

𝑃₁(0) × 𝑃₂(2) = (10/11)³ × ⁷𝐶₂ (1/21)² (20/21)⁵ = 0.028

Thus, the probability that "if just two machines need adjustment, they are of the same type" is the
same as the probability that "either just 2 old and no new, or just 2 new and no old, machines need
adjustment".
∴ Required probability = 0.016 + 0.028 = 0.044

Exercise
1. Assume that 50% of all engineering students are good in mathematics. Determine the
probabilities that among 18 engineering students
a. exactly 10
b. at least 10
c. at most 8
d. at least 2 and at most 9, are good in math.
(Ans: a: 0.1670 b: 0.4073 c: 0.4073 d: 0.5920)
2. The incidence of occupational disease in an industry is such that the workers have a 20%
   chance of suffering from it. What is the probability that out of six workers chosen at random,
   four or more will suffer from the disease? (Ans: 0.01664)

3. If 𝑋 is a binomially distributed random variable with 𝐸(𝑋) = 2 and 𝑉𝑎𝑟(𝑋) = 4/3, find the
   distribution of 𝑋.
Ans:

𝑥       0        1         2         3         4        5        6

𝑓(𝑥)    64/729   192/729   240/729   160/729   60/729   12/729   1/729
NEGATIVE BINOMIAL DISTRIBUTION (PASCAL DISTRIBUTION)
The negative binomial distribution is a discrete probability distribution that models the number
of failures that occur before a specified number of successes are achieved in a sequence of
Bernoulli trials.

Let 𝑥 denote the number of failures before the 𝑟ᵗʰ success in 𝑥 + 𝑟 trials. Now, the last trial must
be a success, whose probability is 𝑝. In the remaining (𝑥 + 𝑟 − 1) trials, there must be (𝑟 − 1)
successes, whose probability is given by: ˣ⁺ʳ⁻¹𝐶ᵣ₋₁ 𝑝ʳ⁻¹𝑞ˣ

Thus, the probability that there are 𝑥 failures preceding the 𝑟ᵗʰ success in 𝑥 + 𝑟 trials is given by:
ˣ⁺ʳ⁻¹𝐶ᵣ₋₁ 𝑝ʳ⁻¹𝑞ˣ × 𝑝 = ˣ⁺ʳ⁻¹𝐶ᵣ₋₁ 𝑝ʳ𝑞ˣ

A random variable 𝑋 is said to follow a negative binomial distribution if its probability mass
function is given by:
𝑝(𝑥) = 𝑃(𝑋 = 𝑥) = ˣ⁺ʳ⁻¹𝐶ᵣ₋₁ 𝑝ʳ𝑞ˣ ;  𝑥 = 0, 1, 2, …

Also, ˣ⁺ʳ⁻¹𝐶ᵣ₋₁ = ˣ⁺ʳ⁻¹𝐶ₓ  (𝑆𝑖𝑛𝑐𝑒 ⁿ𝐶ᵣ = ⁿ𝐶ₙ₋ᵣ)

= (𝑥 + 𝑟 − 1)(𝑥 + 𝑟 − 2) … (𝑟 + 1)(𝑟) / 𝑥!  (𝑆𝑖𝑛𝑐𝑒 ⁿ𝐶ᵣ = 𝑛(𝑛 − 1) … (𝑛 − 𝑟 + 1)/𝑟!)

= (−1)ˣ (−𝑟)(−𝑟 − 1) … (−𝑟 − 𝑥 + 2)(−𝑟 − 𝑥 + 1) / 𝑥!

= (−1)ˣ (⁻ʳ𝐶ₓ)

Thus 𝑝(𝑥) = 𝑃(𝑋 = 𝑥)

= (−1)ˣ (⁻ʳ𝐶ₓ) 𝑝ʳ𝑞ˣ
= ⁻ʳ𝐶ₓ 𝑝ʳ(−𝑞)ˣ ;  𝑥 = 0, 1, 2, …, which is the (𝑥 + 1)ᵗʰ term in the expansion of
𝑝ʳ(1 − 𝑞)⁻ʳ (check!), a binomial expansion with a negative index. Hence the distribution is
known as the negative binomial distribution.

Note:

∑ₓ 𝑝(𝑥) = 𝑝ʳ ∑ₓ ⁻ʳ𝐶ₓ (−𝑞)ˣ = 𝑝ʳ(1 − 𝑞)⁻ʳ = 𝑝ʳ(𝑝)⁻ʳ = 1

Key difference between Binomial Distribution and Negative Binomial Distribution:


The binomial distribution models the number of successes in a fixed number 𝑛 of independent trials,
whereas the negative binomial distribution models the number of failures that occur before a specified
number of successes (𝑟) is achieved. Here, the number of successes 𝑟 is fixed and known, but
the number of trials is random and can vary.
Geometric Distribution: If we take 𝑟 = 1 in the negative binomial distribution,
we have 𝑝(𝑥) = 𝑞ˣ𝑝, which is the probability mass function of the geometric distribution. Hence the
negative binomial distribution may be regarded as a generalization of the geometric distribution.

Alternatively, negative binomial distribution can be defined as:


In a sequence of independent Bernoulli trials with probability of success 𝑝, let the random variable
𝑋 denote the trial at which the 𝑟ᵗʰ success occurs, where 𝑟 is a fixed integer. Then

𝑃(𝑋 = 𝑥) = ˣ⁻¹𝐶ᵣ₋₁ 𝑝ʳ𝑞ˣ⁻ʳ ,  𝑥 = 𝑟, 𝑟 + 1, 𝑟 + 2, …

Example
1. A scientist needs three diseased rabbits for an experiment. He has 5 rabbits available and
inoculates them one at a time with a serum, quitting if and when he gets 3 positive reactions.
If the probability is 0.25 that a rabbit can contract the disease from the serum, what is the
probability that the scientist is able to get 3 diseased rabbits from 5?

Solution: Given 𝑟 = 3, 𝑝 = 0.25, 𝑛 = 5.

Let 𝑋 represent the number of failures before getting 3 diseased rabbits.
𝑃(𝑋 = 𝑥) = ˣ⁺ʳ⁻¹𝐶ᵣ₋₁ 𝑝ʳ𝑞ˣ ;  𝑥 = 0, 1, 2, …
To get 3 diseased rabbits from 5 trials, 𝑋 ≤ 5 − 3 = 2
𝑃(𝑋 ≤ 2) = 𝑃(𝑋 = 0) + 𝑃(𝑋 = 1) + 𝑃(𝑋 = 2)
= ²𝐶₂(0.25)³ + ³𝐶₂(0.25)³(0.75) + ⁴𝐶₂(0.25)³(0.75)²
= 0.0156 + 0.0351 + 0.052 = 0.1033
The probability that the scientist is able to get 3 diseased rabbits from the 5 available is
approximately 0.1033.
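A Python sketch of this computation (the exact sum is 0.1035; the worked figure 0.1033 comes from rounding each term before adding):

```python
from math import comb

# Negative binomial pmf: X = number of failures before the r-th success,
# P(X = x) = C(x+r-1, r-1) p^r q^x.
def nbinom_pmf(x, r, p):
    return comb(x + r - 1, r - 1) * p**r * (1 - p)**x

# Rabbits: r = 3, p = 0.25; the scientist succeeds within 5 trials iff X <= 2.
r, p = 3, 0.25
prob = sum(nbinom_pmf(x, r, p) for x in range(3))     # P(X <= 2)

assert round(nbinom_pmf(0, r, p), 4) == 0.0156
assert round(prob, 4) == 0.1035
```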

2. An item is produced in large numbers. The machine is known to produce 5% defectives. A


quality control inspector is examining the items by taking them at random. What is the
probability that at least 4 items are to be examined in order to get 2 defectives?

Solution: Let us apply the alternative definition of the negative binomial distribution.
Given 𝑟 = 2 (defective items to be obtained) and 𝑝 = 0.05. Let 𝑋 denote the number of trials
required to obtain 2 defective items (𝑥 = 2, 3, 4, 5, …).
𝑃(𝑎𝑡 𝑙𝑒𝑎𝑠𝑡 4 𝑖𝑡𝑒𝑚𝑠 𝑎𝑟𝑒 𝑡𝑜 𝑏𝑒 𝑒𝑥𝑎𝑚𝑖𝑛𝑒𝑑) = 𝑃(𝑋 = 4) + 𝑃(𝑋 = 5) + 𝑃(𝑋 = 6) + ⋯

= ∑ₓ₌₄^∞ ˣ⁻¹𝐶₁ 𝑝²𝑞ˣ⁻²

= 1 − [𝑃(𝑋 = 2) + 𝑃(𝑋 = 3)]

= 1 − [¹𝐶₁(0.05)² + ²𝐶₁(0.05)²(0.95)] = 0.9928

The probability that at least 4 items are to be examined in order to get 2 defectives is 0.9928
3. A die is cast until 6 appears. What is the probability that it must be cast more than 5 times?

Solution: Let us apply the alternative definition of the negative binomial distribution. Let 𝑋 denote
the number of trials required for a 6 to appear (𝑥 = 1, 2, 3, 4, 5, …). Here 𝑝 = 1/6 and 𝑟 = 1.
𝑃(𝑋 > 5) = 1 − 𝑃(𝑋 ≤ 5)

= 1 − ∑ₓ₌₁⁵ ˣ⁻¹𝐶₀ 𝑝𝑞ˣ⁻¹

= 1 − [𝑝 + 𝑝𝑞 + 𝑝𝑞² + 𝑝𝑞³ + 𝑝𝑞⁴]  (𝑆𝑖𝑛𝑐𝑒 ˣ⁻¹𝐶₀ = 1)
= 1 − (1/6)[1 + 5/6 + (5/6)² + (5/6)³ + (5/6)⁴]
= 1 − 0.5981 = 0.4019
The probability that the die must be cast more than 5 times for a 6 to appear is 0.4019
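A quick Python check of this geometric computation:

```python
# Geometric special case (r = 1): the die shows its first 6 on throw X, with
# P(X = x) = q^(x-1) p. "More than 5 casts" is the complement of X <= 5.
p, q = 1 / 6, 5 / 6

p_at_most_5 = sum(q ** (x - 1) * p for x in range(1, 6))
p_more_than_5 = 1 - p_at_most_5

assert abs(p_more_than_5 - q**5) < 1e-12   # equivalently, five non-sixes in a row
assert round(p_more_than_5, 4) == 0.4019
```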

4. A boy is throwing stones at a target. What is the probability that his 10th throw is his 5th hit,
if the probability of hitting the target at any trial is 0.5?

Solution: Given: 𝑝 = 0.5, 𝑟 = 5, 𝑛 = 10.

Let 𝑋 represent the number of failures before getting the 5ᵗʰ hit.
𝑃(𝑋 = 𝑥) = ˣ⁺ʳ⁻¹𝐶ᵣ₋₁ 𝑝ʳ𝑞ˣ ;  𝑥 = 0, 1, 2, …
For the 5ᵗʰ hit on the 10ᵗʰ throw, 𝑥 = 5
𝑃(𝑋 = 5) = ⁹𝐶₄ (0.5)⁵(0.5)⁵ = 126(0.5)¹⁰ = 0.123

Alternatively,
Let 𝑋 denote the number of throws required for the stone to hit the target for the 5ᵗʰ time
(𝑥 = 5, 6, 7, …).
𝑃(10ᵗʰ 𝑡ℎ𝑟𝑜𝑤 𝑖𝑠 𝑡ℎ𝑒 5ᵗʰ ℎ𝑖𝑡) = 𝑃(𝑋 = 10) = ⁹𝐶₄ 𝑝⁵𝑞⁵
= ⁹𝐶₄ (0.5)⁵(0.5)⁵ = 126(0.5)¹⁰ = 0.123

Exercise
1. A student has taken a 5-answer multiple choice examination orally. He continues to answer
questions until he gets five correct answers. What is the probability that he gets them on or
before the eighth question if he guesses at each answer? (Ans: 0.0104)
POISSON DISTRIBUTION
Poisson distribution is the discrete probability distribution of a discrete random variable 𝑋, which
has no upper bound. It is suitable for ‘rare’ events for which the probability of occurrence 𝑝 is
very small and the number of trials 𝑛 is very large.

Example:
 The number of persons born blind per year in a large city
 The number of deaths by horse kick in an army corps
 Number of telephone calls received at a particular telephone exchange in some unit of time or
connections to wrong numbers in a telephone exchange.
 Number of suicides reported in a particular city.
 Number of air accidents in some unit of time.

This distribution can be derived as a limiting case of the binomial distribution when n→∞ and
p→0, keeping 𝑛𝑝 fixed (= 𝑚, say).

In binomial distribution the number of successes 𝑟 (occurrence of an event) out of a total definite
number of 𝑛 trials is determined, whereas in Poisson distribution the number of successes at a
random point of time and space is determined.

The probability of 𝑟 successes in a binomial distribution is given by:


𝑃(𝑋 = 𝑟) = nCr p^r q^(n−r) = [n!/(r!(n−r)!)] p^r q^(n−r)

= [n(n−1)(n−2)⋯(n−r+1)/r!] p^r q^(n−r)

= {[np(np−p)(np−2p)⋯(np−(r−1)p)]/r!} (1−p)^(n−r)

As 𝑛 → ∞, 𝑝 → 0 (while keeping 𝑛𝑝 = 𝑚), we obtain:

P(𝑋 = 𝑟) = (m^r/r!) · lim_{n→∞} (1 − m/n)^n / (1 − m/n)^r

Since lim_{n→∞} (1 − m/n)^n = e^(−m) and lim_{n→∞} (1 − m/n)^r = 1, we get:

𝑃(𝑋 = 𝑟) = e^(−m) m^r / r!

Here, 𝑚 is known as the parameter of the distribution. The notation 𝑋~𝑃(𝑚) is used to denote
that 𝑋 is a Poisson variate with parameter 𝑚.
Thus, the probabilities of 0, 1, 2, . . . , 𝑟, … successes in a Poisson distribution are given by:
e^(−m), m e^(−m), (m²/2!) e^(−m), …, (m^r/r!) e^(−m), …
The sum of these probabilities is unity.
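The limiting argument can also be seen numerically: for fixed m = np, binomial probabilities approach the Poisson values as n grows. A small sketch (the helper names are ours):

```python
from math import comb, exp, factorial

def binom_pmf(r, n, p):
    # binomial probability of r successes in n trials
    return comb(n, r) * p**r * (1 - p)**(n - r)

def poisson_pmf(r, m):
    # Poisson probability of r successes with parameter m
    return exp(-m) * m**r / factorial(r)

m, r = 2.0, 3
for n in (10, 100, 10000):
    # with p = m/n, the binomial pmf tends to the Poisson pmf
    print(n, binom_pmf(r, n, m / n))
```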

Constants of the Poisson distribution:


𝑀𝑒𝑎𝑛 = 𝑚
𝑉𝑎𝑟𝑖𝑎𝑛𝑐𝑒 = 𝑚
𝑆𝑡𝑎𝑛𝑑𝑎𝑟𝑑 𝐷𝑒𝑣𝑖𝑎𝑡𝑖𝑜𝑛 = √𝑚
The equality of the mean and variance is an important characteristic of the Poisson distribution.

Example
1. If the probability of a bad reaction from a certain injection is 0.001, determine the chance that
out of 2,000 individuals more than two will get a bad reaction.

Solution: It follows a Poisson distribution as the probability of occurrence is very small.


Mean 𝑚 = 𝑛𝑝 = 2000(0.001) = 2
Probability that more than 2 will get a bad reaction
= 1 − [𝑝𝑟𝑜𝑏. 𝑡ℎ𝑎𝑡 𝑛𝑜 𝑜𝑛𝑒 𝑔𝑒𝑡𝑠 𝑎 𝑏𝑎𝑑 𝑟𝑒𝑎𝑐𝑡𝑖𝑜𝑛 + 𝑝𝑟𝑜𝑏. 𝑡ℎ𝑎𝑡 𝑜𝑛𝑒 𝑔𝑒𝑡𝑠 𝑎 𝑏𝑎𝑑 𝑟𝑒𝑎𝑐𝑡𝑖𝑜𝑛
+ 𝑝𝑟𝑜𝑏. 𝑡ℎ𝑎𝑡 𝑡𝑤𝑜 𝑔𝑒𝑡 𝑏𝑎𝑑 𝑟𝑒𝑎𝑐𝑡𝑖𝑜𝑛]
= 1 − [e^(−m) + m e^(−m)/1! + m² e^(−m)/2!]   with m = 2

= 1 − [1/e² + 2/e² + 2/e²]

= 1 − 5/e² = 0.32
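The same figure can be obtained directly from the Poisson pmf (the helper name `poisson_pmf` is ours):

```python
from math import exp, factorial

def poisson_pmf(r, m):
    return exp(-m) * m**r / factorial(r)

m = 2000 * 0.001  # mean number of bad reactions
# P(more than 2 bad reactions) = 1 - P(0) - P(1) - P(2)
prob = 1 - sum(poisson_pmf(r, m) for r in range(3))
print(prob)  # ≈ 0.32, i.e. 1 - 5/e**2
```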

2. In a certain factory turning out razor blades, there is a small chance of 0.002 for any blade to
be defective. The blades are supplied in packets of 10, use Poisson distribution to calculate the
approximate number of packets containing no defective, one defective and two defective
blades respectively in a consignment of 10,000 packets.

Solution: We know that 𝑚 = 𝑛𝑝 = 10 × 0.002 = 0.02


e^(−0.02) = 1 − 0.02 + (0.02)²/2! − ⋯ ≈ 0.9802 approximately
Probability of no defective blade is e^(−m) = e^(−0.02) = 0.9802
Thus, the number of packets containing no defective blade is 10,000 × 0.9802 = 9802
Similarly, the number of packets containing one defective blade
= 10,000 × m e^(−m) = 10,000 × (0.02) × 0.9802 = 196
Finally, the number of packets containing two defective blades
= 10,000 × (m²/2!) e^(−m) = 10,000 × ((0.02)²/2) × 0.9802 = 2 approximately.

3. If 𝑋 is a Poisson variate such that 𝑃(𝑋 = 2) = 9𝑃(𝑋 = 4) + 90𝑃(𝑋 = 6).


Find the mean of 𝑋.

Solution: If 𝑋 is a Poisson variate with parameter 𝑚, then 𝑃(𝑋 = 𝑥) = e^(−m) m^x / x!,  𝑥 = 0, 1, 2, …
Hence, the given equation simplifies to:

e^(−m) m²/2! = 9 · e^(−m) m⁴/4! + 90 · e^(−m) m⁶/6!

e^(−m) m²/2! = (e^(−m) m⁴/8)[3 + m²]

⇒ m⁴ + 3m² − 4 = 0
Solving as a quadratic in m², we get m² = 1 (the root m² = −4 is rejected).
Since 𝑚 > 0, 𝑚 = 1.
Hence, 𝑚𝑒𝑎𝑛 (𝑚) = 1
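The solution m = 1 can be verified by substituting back into the given relation:

```python
from math import exp, factorial

def poisson_pmf(r, m):
    return exp(-m) * m**r / factorial(r)

m = 1.0
lhs = poisson_pmf(2, m)
rhs = 9 * poisson_pmf(4, m) + 90 * poisson_pmf(6, m)
print(lhs, rhs)  # both ≈ 0.1839, confirming P(X=2) = 9P(X=4) + 90P(X=6)
```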

4. In a book of 520 pages, 390 typographical errors occur. Assuming Poisson law for the number
of errors per page, find the probability that a random sample of 5 pages will contain no error.

Solution: The average number of typographical errors per page in the book is given by
𝑚 = 390/520 = 0.75
Hence, using the Poisson probability law, the probability of 𝑥 errors per page is given by:
𝑃(𝑋 = 𝑥) = e^(−m) m^x / x! = e^(−0.75) (0.75)^x / x!,  𝑥 = 0, 1, 2, …
The required probability that a random sample of 5 pages will contain no error is given by:
[𝑃(𝑋 = 0)]⁵ = (e^(−0.75))⁵ = e^(−3.75) ≈ 0.0235
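Evaluating the final expression numerically:

```python
from math import exp

m = 390 / 520       # average errors per page = 0.75
prob = exp(-m)**5   # P(no error on each of 5 independent pages)
print(prob)  # ≈ 0.0235, i.e. e**-3.75
```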

5. A car hire firm has two cars which it hires out day by day. The number of demands for a car
on each day is distributed as Poisson variate with mean 1.5. Calculate the proportion of days
on which (a) neither car is used, and (b) some demand is refused.

Solution: The proportion of days on which there are 𝑥 demands for a car
= P(𝑥 𝑑𝑒𝑚𝑎𝑛𝑑𝑠 𝑖𝑛 𝑎 𝑑𝑎𝑦) = P(𝑋 = 𝑥) = e^(−1.5) (1.5)^x / x!,  𝑥 = 0, 1, 2, …

Since the number of demands for a car on any day is a Poisson variate with mean 1.5
(a) Proportion of days on which neither car is used is given by: 𝑃(𝑋 = 0) = e^(−1.5) = 0.2231
(b) Proportion of days on which some demand is refused:
𝑃(𝑋 > 2) = 1 − 𝑃(𝑋 ≤ 2)
= 1 − [𝑃(𝑋 = 0) + 𝑃(𝑋 = 1) + 𝑃(𝑋 = 2)]
= 1 − e^(−1.5) [1 + 1.5 + (1.5)²/2!] = 1 − 0.2231 × 3.625 = 0.19126
2!
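Both proportions come from the same Poisson pmf with m = 1.5:

```python
from math import exp, factorial

m = 1.5
pmf = lambda r: exp(-m) * m**r / factorial(r)

neither_used = pmf(0)                                # ≈ 0.2231
# demand refused when more than 2 cars are wanted
demand_refused = 1 - sum(pmf(r) for r in range(3))   # ≈ 0.1913
print(neither_used, demand_refused)
```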

6. Fit a Poisson distribution to the set of observations:

𝑥: 0 1 2 3 4
𝑓: 122 60 15 2 1

Solution: Mean = Σ𝑓𝑥 / Σ𝑓 = 100/200 = 0.5
∴ mean of Poisson distribution i.e., 𝑚 = 0.5.
Hence the theoretical frequency for 𝑟 successes is
N e^(−m) m^r / r! = 200 e^(−0.5) (0.5)^r / r!,  where 𝑟 = 0, 1, 2, 3, 4
∴ the theoretical frequencies are

𝑥: 0 1 2 3 4
𝑓: 121 61 15 2 0
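The fit can be reproduced in a few lines; rounding the fitted values recovers the table above:

```python
from math import exp, factorial

x = [0, 1, 2, 3, 4]
f = [122, 60, 15, 2, 1]
N = sum(f)                                    # 200 observations
m = sum(xi * fi for xi, fi in zip(x, f)) / N  # sample mean = 0.5
fitted = [N * exp(-m) * m**r / factorial(r) for r in x]
print([round(v, 1) for v in fitted])  # [121.3, 60.7, 15.2, 2.5, 0.3]
```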

Exercise
1. Suppose that on an average one person in 1000 makes a numerical error in preparing an income
tax return (ITR). If 10000 forms are selected at random and examined, find the probability that
6, 7 or 8 of the forms will be in error. (Ans: 0.2657)

2. Determine the number of pages expected with 0, 1, 2, 3, and 4 errors in 1000 pages of a book
if on an average two errors are found in five pages by fitting a Poisson distribution.
(Ans: 670, 268, 54, 7, 1)
NORMAL DISTRIBUTION (GAUSSIAN DISTRIBUTION)
Any quantity whose variation depends on random causes is distributed according to the normal
law. Its importance lies in the fact that a large number of distributions approximate to the normal
distribution.

A continuous random variable 𝑋 is said to have a normal distribution with parameters 𝜇 (mean)
and 𝜎 (standard deviation) if its probability density function 𝑓(𝑥) is given by:

f(x) = (1/(σ√(2π))) exp[−(x − μ)²/(2σ²)],  for −∞ < x < ∞

The notation 𝑋~𝑁( 𝜇, 𝜎) is used to denote that 𝑋 is a normal variate with parameter 𝜇 and 𝜎.

Define a variate z = (x − μ)/σ so that 𝑧 is a normal variate with mean zero and standard deviation unity
i.e., 𝑧 ~ 𝑁(0,1)

Properties of the normal distribution:


i) The normal curve 𝑓(𝑥) is bell-shaped and is symmetrical about its mean. It is unimodal
with ordinates decreasing rapidly on both sides of the mean. As it is symmetrical, its mean,
median and mode are the same.

ii) The maximum value of 𝑓(𝑥) is 1/(σ√(2π)), attained at 𝑥 = 𝜇.

iii) The probability of 𝑥 lying between 𝑥₁ and 𝑥₂ is given by the area under the normal curve
from 𝑥₁ to 𝑥₂
i.e., 𝑃(𝑥₁ < 𝑥 < 𝑥₂) = ∫_{x₁}^{x₂} (1/(σ√(2π))) e^(−(x−μ)²/(2σ²)) dx = ∫_{z₁}^{z₂} (1/√(2π)) e^(−z²/2) dz = 𝑃(𝑧₂) − 𝑃(𝑧₁)

where 𝑃(𝑧) = (1/√(2π)) ∫₀^z e^(−z²/2) dz,  𝑧₁ = (𝑥₁ − 𝜇)/σ and 𝑧₂ = (𝑥₂ − 𝜇)/σ

The function 𝑃(𝑧) is called the probability integral or the error function due to its use in
the theory of sampling and the theory of errors.

Note: The probability integral is tabulated for values of 𝑧 from 0 to 3.9 and is known as the
normal table.

iv) The total area between the curve 𝑦 = 𝑓(𝑥) and the 𝑥 − 𝑎𝑥𝑖𝑠 is unity
i.e., 𝑃(−∞ < 𝑥 < ∞) = 1
v) The area covered on each side of the mean 𝜇 is 0.5, as the curve is symmetrical
about 𝜇. i.e., if z = (x − μ)/σ then 𝑃(0 < 𝑧 < ∞) = 0.5 = 𝑃(−∞ < 𝑧 < 0)
vi) By Symmetry 𝑃(𝑍 ≤ −𝑧) = 𝑃(𝑍 ≥ 𝑧)

Example
1. 𝑋 is a normal variate with mean 30 and S.D. 5, find the probabilities that
(𝑖) 26 ≤ 𝑋 ≤ 40, (𝑖𝑖) 𝑋 ≥ 45, (𝑖𝑖𝑖) |𝑋 − 30| > 5.

Solution: We have 𝜇 = 30 and 𝜎 = 5


z = (X − μ)/σ = (X − 30)/5
i) When 𝑋 = 26, 𝑧 = −0.8; when 𝑋 = 40, 𝑧 = 2
𝑃(26 ≤ 𝑋 ≤ 40) = 𝑃(−0.8 ≤ 𝑧 ≤ 2)
= 𝑃(−0.8 ≤ 𝑧 ≤ 0) + 𝑃(0 ≤ 𝑧 ≤ 2)
= 𝑃(0 ≤ 𝑧 ≤ 0.8) + 𝑃(0 ≤ 𝑧 ≤ 2) (By Symmetry)
= 0.2881 + 0.4772 = 0.7653

ii) When 𝑋 = 45, 𝑧 = 3


𝑃(𝑋 ≥ 45) = 𝑃(𝑧 ≥ 3) = 0.5 − 𝑃(0 ≤ 𝑧 ≤ 3) = 0.5 − 0.4986 = 0.0014

iii) 𝑃(|𝑋 − 30| ≤ 5) = 𝑃(25 ≤ 𝑋 ≤ 35)


= 𝑃(−1 ≤ 𝑧 ≤ 1) = 2𝑃(0 ≤ 𝑧 ≤ 1) = 2 × 0.3413 = 0.6826
∴ P(|X − 30| > 5) = 1 − P(|X − 30| ≤ 5) = 1 − 0.6826 = 0.3174
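The table lookups above can be reproduced with the standard normal CDF, which is expressible through the error function (`Phi` is our helper name; tiny differences from the table values are rounding):

```python
from math import erf, sqrt

def Phi(z):
    # standard normal CDF via the error function
    return 0.5 * (1 + erf(z / sqrt(2)))

mu, sigma = 30, 5
z = lambda x: (x - mu) / sigma

p1 = Phi(z(40)) - Phi(z(26))        # P(26 <= X <= 40) ≈ 0.7654
p2 = 1 - Phi(z(45))                 # P(X >= 45)       ≈ 0.0013
p3 = 1 - (Phi(z(35)) - Phi(z(25)))  # P(|X - 30| > 5)  ≈ 0.3173
print(p1, p2, p3)
```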

2. In a test on 2000 electric bulbs, it was found that the life of a particular make was normally
distributed with an average life of 2040 hours and S.D. of 60 hours. Estimate the number of
bulbs likely to burn for:
(a) More than 2150 hours. (b) Less than 1960 hours.
(c) More than 1920 hours but less than 2160 hours.
Solution: Let 𝑋~𝑁( 𝜇, 𝜎) with 𝜇 = 2040 hours and 𝜎 = 60 hours.
z = (X − μ)/σ = (X − 2040)/60

(a) For X = 2150, 𝑧 = 1.83.


The probability that a bulb will burn for more than 2150 hours is given by:
𝑃(𝑋 > 2150) = 𝑃(𝑧 > 1.83)
= 0.5 − 𝑃(0 ≤ 𝑧 ≤ 1.83) = 0.5 − 0.4664 = 0.0336
Thus, the number of bulbs expected to burn for more than 2150 hours = 0.0336 × 2000 = 67
approximately.

(b) For 𝑋 = 1960, 𝑧 = −1.33


The probability that a bulb will burn for less than 1960 hours is given by:
𝑃(𝑋 < 1960) = 𝑃(𝑧 < −1.33)
= P(𝑧 > 1.33) (By Symmetry)
= [0.5 − 𝑃(0 ≤ 𝑧 ≤ 1.33)] = 0.5 − 0.4082 = 0.0918
∴ the number of bulbs expected to burn for less than 1960 hours
= 0.0918 × 2000 = 184 approximately.

(c) When 𝑋 = 1920, 𝑧 = −2; When 𝑋 = 2160, 𝑧 = 2


The probability that a bulb will burn for more than 1920 hours but less than 2160 is given by:
P(1920 < 𝑋 < 2160) = P(−2 < 𝑧 < 2)
= 2 × 𝑃(0 < 𝑧 < 2) (By Symmetry)
= 2 × 0.4772 = 0.9544
Thus, the required number of bulbs = 0.9544 × 2000 = 1909 nearly.
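All three counts can be checked with the standard normal CDF (`Phi` is our helper; slight differences from the table values are rounding):

```python
from math import erf, sqrt

def Phi(z):
    # standard normal CDF via the error function
    return 0.5 * (1 + erf(z / sqrt(2)))

mu, sigma, n = 2040, 60, 2000
z = lambda x: (x - mu) / sigma

over_2150 = n * (1 - Phi(z(2150)))           # ≈ 67 bulbs
under_1960 = n * Phi(z(1960))                # ≈ 182 bulbs
between = n * (Phi(z(2160)) - Phi(z(1920)))  # ≈ 1909 bulbs
print(over_2150, under_1960, between)
```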

3. The mean yield for one-acre plot is 662 kilos with a standard deviation 32 kilos. Assuming
normal distribution, how many one-acre plots in a batch of 1000 plots would you expect to
have yield (a) over 700 kilos (b) below 650 kilos

Solution: Let 𝑋~𝑁( 𝜇, 𝜎) with 𝜇 = 662 𝑘𝑖𝑙𝑜𝑠 and 𝜎 = 32 𝑘𝑖𝑙𝑜𝑠.


z = (X − μ)/σ = (X − 662)/32

(a) When 𝑋 = 700, 𝑧 = 1.19


The probability that a plot has a yield over 700 kilos is given by:
𝑃(𝑋 > 700) = 𝑃(𝑧 > 1.19)
= 0.5 − 𝑃(0 ≤ 𝑧 ≤ 1.19) = 0.5 − 0.3830 = 0.1170
Hence in a batch of 1000 plots, the expected number of plots with yield over 700 kilos is:
1000 × 0.117 = 117

(b) 𝑋 = 650, 𝑧 = −0.38


Required number of plots with yield below 650 kilos is given by:
1000 × 𝑃(𝑋 < 650) = 1000 × 𝑃(𝑧 < −0.38)
= 1000 × 𝑃(𝑧 > 0.38) (By Symmetry)
= 1000 × [0.5 − 𝑃(0 ≤ 𝑧 ≤ 0.38)] = 1000 × [0.5 − 0.1480] = 352

4. In a normal distribution 7% of the items are under 35 and 89% are under 63. What are the
mean and standard deviation of the distribution?

Solution: If 𝑋~𝑁( 𝜇, 𝜎), then we are given:


𝑃(𝑋 < 63) = 0.89 ⇒ 𝑃(𝑋 > 63) = 0.11 and 𝑃(𝑋 < 35) = 0.07

The points 𝑋 = 63 and 𝑋 = 35 are located as shown in the figure. Since the value 𝑋 = 35 is
located to the left of the ordinate at 𝑋 = 𝜇, the corresponding value of 𝑧 is negative.

When 𝑋 = 35, z = (35 − μ)/σ = −𝑧₁ (say); and when 𝑋 = 63, z = (63 − μ)/σ = 𝑧₂ (say)


Thus, 𝑃(0 < 𝑧 < 𝑧₂) = 0.89 − 0.5 = 0.39 and
𝑃(𝑧 < −𝑧₁) = 0.07 ⇒ 𝑃(𝑧 > 𝑧₁) = 0.07 (By Symmetry)
⇒ 𝑃(0 < 𝑧 < 𝑧₁) = 0.5 − 0.07 = 0.43

Thus, from the normal table, 𝑧₁ = 1.48 and 𝑧₂ = 1.23


∴ (35 − μ)/σ = −1.48  𝑎𝑛𝑑  (63 − μ)/σ = 1.23
Solving simultaneously 𝜇 ≈ 50.3 and 𝜎 ≈ 10.33
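The table lookup and the simultaneous equations can both be mechanized: invert the standard normal CDF numerically, then solve the two linear equations for μ and σ. A sketch (`Phi` and `Phi_inv` are our helper names; `Phi_inv` uses plain bisection):

```python
from math import erf, sqrt

def Phi(z):
    # standard normal CDF via the error function
    return 0.5 * (1 + erf(z / sqrt(2)))

def Phi_inv(p):
    # invert the standard normal CDF by bisection
    lo, hi = -10.0, 10.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if Phi(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

z1 = Phi_inv(0.07)  # ≈ -1.48 (7% of items below 35)
z2 = Phi_inv(0.89)  # ≈ 1.23 (89% of items below 63)
# 35 = mu + z1*sigma and 63 = mu + z2*sigma
sigma = (63 - 35) / (z2 - z1)
mu = 35 - z1 * sigma
print(mu, sigma)  # ≈ 50.3 and ≈ 10.4
```

The small difference from the hand solution (σ ≈ 10.33) comes from the two-decimal table values used above.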
Exercise
1. In a normal distribution, 31% of the items are under 45 and 8% are over 64. Find the mean and
S.D. of the distribution. (Ans: 𝜇 = 50, 𝜎 = 10)

2. The scores of the performance of students in a certain subject in a public examination are
approximately normal with mean 55 and standard deviation 10. Find the percentage of students
with the score
a) greater than 80
b) between 45 and 70
c) less than 35
If students getting less than 35 fail, and there are 10,000 students appearing for the
examination, how many would fail? If those securing 80 or above get distinction in the subject,
how many are expected to qualify for distinction?
(Ans: 0.0062, 0.7745, 0.0228; Failed: 228, Distinction: 62)
