Department of Statistics Module Code: STA1142: School of Mathematical and Natural Sciences
Department of Statistics
Module Code: STA1142
Module Description: Introductory Probability
Year: 2020
Lecturer: Mr. T Ravele
1. COURSE CONTENT
1 Introduction
2 Counting techniques
3 Probability and Relative frequencies, properties
4 Addition rule and mutually exclusive events
5 Conditional probability, Bayes Theorem and independence
6 Random variables and probability distributions
7 Binomial, Poisson and Normal distributions, Binomial and Normal tables.
2. MODULE ASSESSMENT
3. READING LIST
Although lecture notes are provided, it is important that you reinforce this material by referring
to more detailed texts.
Recommended Text
“Probability and Statistics for Scientists and Engineers” by R.E Walpole & R.H Myers, any
edition.
Chapter 1: Introduction
Statistical (or random) Experiment – is a process or a course of action which has several
possible outcomes. The outcome that occurs cannot be predicted before the action is performed.
Some examples:
Toss a coin
Roll a die
Pick a card out of a regular pack of cards
Write a test which has 10 questions
Ask a consumer whether she prefers product A or B
Measure the body mass of a newborn baby
Sample space – is the set of all possible outcomes of a statistical experiment. Each outcome in a
sample space is called an element or a member of the sample space. The outcomes can either be
discrete (then they can be listed individually) or continuous (then they fall within an interval and
cannot be listed individually).
Discrete sets – we explicitly write down every single possible outcome. List them between curly
brackets and separate them with semicolons.
S = {1; 2; 3; 4; 5; 6}
Continuous sets – if the outcome of the experiment can be any value within an interval, then the
sample space is the interval. Call the result of the experiment x; if x can assume any value in the
interval [a, b] (including a and b), then we write the sample space as follows:
S = {x : a ≤ x ≤ b}
Null set (or empty set) – a null set, ∅, is a set which contains no elements.
Intersection (A ∩ B) – the intersection of two events, A and B, is the event containing all
elements that are common to A and B.
Mutually exclusive events – two events are said to be mutually exclusive if they have no
common elements.
Union (A ∪ B) – the union of two events, A and B, is the event containing all the elements that
belong to A or B or both.
Exhaustive events – two events, A and B, are said to be exhaustive if together they contain all the
elements in the sample space. Therefore, A ∪ B = S.
$P(A) = \dfrac{n(A)}{N}$
where n(A) is the number of outcomes corresponding to the event A, and N = n(S) is the total number
of possible outcomes in the sample space.
Basic rules of probability:
Rule 1: 0 ≤ P(A) ≤ 1
Rule 2:
N
elementary event.
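Rule 3: P(A′) = 1 − P(A), where A′ is the complement of A.
For any two events A and B: P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
If A and B are mutually exclusive: P(A ∪ B) = P(A) + P(B)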
Chapter 2: Counting Techniques
Theorem 1:
If an operation consists of two steps, of which the first can be made in n1 ways and for each of
these the second can be made in n2 ways, then the whole operation can be made in n1 · n2 ways.
Example 1: How many sample points are there in the sample space when a pair of dice is
thrown once?
Theorem 2:
If an operation consists of k steps, of which the first can be made in n1 ways, for each of these the
second can be made in n2 ways, for each of the first two the third can be made in n3 ways, and so
forth, then the whole operation can be made in n1 · n2 · n3 · … · nk ways.
Example 1: Sam is going to assemble a computer by himself. He has the choice of ordering
chips from two brands, a hard drive from four, memory from three, and an accessory bundle
from five local stores. How many different ways can Sam order the parts?
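The two multiplication-rule examples above can be checked with a short Python sketch. It assumes the dice are ordinary six-sided dice and that Sam makes exactly one choice at each step; the variable names are illustrative only.

```python
from itertools import product
from math import prod

# Pair of dice: 6 ways for the first die, 6 ways for the second.
dice_points = prod([6, 6])

# Cross-check by listing every ordered pair (first die, second die).
dice_space = list(product(range(1, 7), repeat=2))

# Sam's computer: 2 chip brands, 4 hard drives, 3 memory options, 5 stores.
sam_orders = prod([2, 4, 3, 5])

print(dice_points, len(dice_space), sam_orders)
```

Listing the pairs explicitly is only practical for small experiments, which is why the counting theorems are useful.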
Theorem 3:
The number of ways in which n distinct objects can be arranged in a row is n! (n factorial). Note
that 0! = 1.
Theorem 4:
The number of permutations [order is important] of n distinct objects taken r at a time is:
$_nP_r = \dfrac{n!}{(n-r)!}$, where r = 0, 1, 2, …, n.
Example 1: In one year, three awards (research, teaching, and service) will be given for a
class of 25 graduate students in a statistics department. If each student can receive at most
one award, how many possible selections are there?
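As a rough check of the awards example, the sketch below evaluates the permutation formula directly and compares it with Python's built-in `math.perm`. It assumes the three awards are distinct, so order matters.

```python
from math import factorial, perm

n_students = 25  # graduate students in the class
n_awards = 3     # research, teaching and service awards

# Theorem 4: n! / (n - r)!
by_formula = factorial(n_students) // factorial(n_students - n_awards)

# The same count from the standard library.
by_library = perm(n_students, n_awards)

print(by_formula, by_library)
```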
Theorem 5:
Theorem 6:
The number of distinct permutations of n things of which n1 are of one kind, n2 of a second
kind, …, nk of a kth kind is
$\dfrac{n!}{n_1!\, n_2! \cdots n_k!}$
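The notes give no worked example for this theorem, so the following sketch uses a hypothetical one: the number of distinct arrangements of the letters in the word STATISTICS. The helper name `distinct_permutations` is an illustrative choice, not something from the notes.

```python
from collections import Counter
from math import factorial

def distinct_permutations(items):
    """Return n! / (n1! n2! ... nk!) for a collection with repeated kinds."""
    counts = Counter(items)          # how many objects there are of each kind
    result = factorial(len(items))   # n!
    for c in counts.values():
        result //= factorial(c)      # divide by n_i! for each kind
    return result

# Hypothetical example: S and T appear 3 times each, I twice, A and C once.
arrangements = distinct_permutations("STATISTICS")
print(arrangements)
```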
Theorem 5:
The number of combinations [order is not important] of r objects selected from a set of n distinct
objects is:
$_nC_r = \dbinom{n}{r} = \dfrac{n!}{r!\,(n-r)!}$, where r = 0, 1, 2, …, n.
Example 1: A young boy asks his mother to get five Game-Boy cartridges from his
collection of 10 arcade and 5 sports games. How many ways are there that his mother
will get 3 arcade and 2 sports games, respectively?
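One way to work the cartridge example is to combine the combination formula with the multiplication rule from Theorem 1: choose the arcade games, then choose the sports games. A minimal sketch, assuming all 15 cartridges are distinct:

```python
from math import comb

arcade_ways = comb(10, 3)  # choose 3 of the 10 arcade games
sports_ways = comb(5, 2)   # choose 2 of the 5 sports games

# Multiplication rule: each arcade choice pairs with each sports choice.
total_ways = arcade_ways * sports_ways

print(arcade_ways, sports_ways, total_ways)
```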
One very useful method of calculating probabilities is to use a probability tree, in which the various
possible events of an experiment are represented by lines or branches of the tree. When you want
to construct a sample space for an experiment, a probability tree is a useful device for ensuring
that you have identified all simple events and assigned the associated probabilities.
Chapter 3: Probability and relative frequency
Definition: The probability of an event A is the sum of the weights of all sample points in A.
Therefore,
0 ≤ P(A) ≤ 1, P(∅) = 0, and P(S) = 1
Furthermore, if A1, A2, A3, … is a sequence of mutually exclusive events, then
P(A1 ∪ A2 ∪ A3 ∪ …) = P(A1) + P(A2) + P(A3) + …
Example 1: A coin is tossed twice. What is the probability that at least one head occurs?
Theorem: If an experiment can result in any one of N different equally likely outcomes, and if
exactly n of these outcomes correspond to event A, then the probability of event A is
$P(A) = \dfrac{n}{N}$
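The coin example can be worked with this theorem by listing the sample space and counting. The sketch below assumes a fair coin, so that the four outcomes are equally likely.

```python
from fractions import Fraction
from itertools import product

# Sample space for two tosses: HH, HT, TH, TT.
sample_space = list(product("HT", repeat=2))

# Event A: at least one head occurs.
event_a = [outcome for outcome in sample_space if "H" in outcome]

# P(A) = n / N for equally likely outcomes.
p_at_least_one_head = Fraction(len(event_a), len(sample_space))

print(p_at_least_one_head)
```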
Chapter 4: Addition rule and mutually exclusive events
Theorem: If A and B are two events, then
P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
Corollary: If A, B and C are any three events, then
P(A ∪ B ∪ C) = P(A) + P(B) + P(C) − P(A ∩ B) − P(A ∩ C) − P(B ∩ C) + P(A ∩ B ∩ C)
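The three-event corollary can be verified numerically on a small sample space. The events below (one roll of a fair die, with A = even, B = greater than 3, C = prime) are hypothetical choices made for illustration and do not come from the notes.

```python
from fractions import Fraction

S = set(range(1, 7))  # one roll of a fair die
A = {2, 4, 6}         # even number
B = {4, 5, 6}         # greater than 3
C = {2, 3, 5}         # prime number

def P(event):
    """Classical probability n(event) / n(S) for equally likely outcomes."""
    return Fraction(len(event), len(S))

lhs = P(A | B | C)  # probability of the union, counted directly
rhs = (P(A) + P(B) + P(C)
       - P(A & B) - P(A & C) - P(B & C)
       + P(A & B & C))

print(lhs, rhs)
```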
Example 2: If the probabilities are, respectively, 0.09, 0.15, 0.21, and 0.23 that a person
purchasing a new automobile will choose the colour green, white, red, or blue, what is the
probability that a given buyer will purchase a new automobile that comes in one of those
colours?
Example 3: If the probabilities that the automobile mechanic will service 3, 4, 5, 6, 7, or
8 or more cars on any given workday are, respectively, 0.12, 0.10, 0.28, 0.24, 0.10, and
0.07, what is the probability that he will service at least 5 cars on his next day at work?
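In Examples 2 and 3 the listed outcomes are mutually exclusive, so the addition rule reduces to a plain sum of the relevant probabilities. A sketch of both calculations, with dictionary keys chosen here purely for readability:

```python
# Example 2: probability of choosing one of the four listed colours.
colour_probs = {"green": 0.09, "white": 0.15, "red": 0.21, "blue": 0.23}
p_listed_colour = sum(colour_probs.values())

# Example 3: probability of servicing at least 5 cars.
cars_probs = {"3": 0.12, "4": 0.10, "5": 0.28, "6": 0.24, "7": 0.10, "8 or more": 0.07}
at_least_5 = ["5", "6", "7", "8 or more"]
p_at_least_5 = sum(cars_probs[k] for k in at_least_5)

# Round to 2 decimals to hide floating-point noise.
print(round(p_listed_colour, 2), round(p_at_least_5, 2))
```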
Chapter 5: Conditional probability, Bayes Theorem and independence
P(A│B) is called the conditional probability that event A will occur given that event B has
occurred.
$P(A \mid B) = \dfrac{P(A \cap B)}{P(B)}$, provided that P(B) > 0
Example 1: The probability that a regularly scheduled flight departs on time is P(D) = 0.83;
the probability that it arrives on time is P(A) = 0.82; and the probability that it departs and
arrives on time is P(D∩A) = 0.78. Find the probability that a plane
(a) Arrives on time given that it departed on time, and
(b) Departed on time given that it has arrived on time.
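Both parts of the flight example follow directly from the definition of conditional probability. A minimal sketch, with variable names chosen for illustration:

```python
p_d = 0.83        # P(D): departs on time
p_a = 0.82        # P(A): arrives on time
p_d_and_a = 0.78  # P(D ∩ A): departs and arrives on time

# (a) P(A | D) = P(D ∩ A) / P(D)
p_a_given_d = p_d_and_a / p_d

# (b) P(D | A) = P(D ∩ A) / P(A)
p_d_given_a = p_d_and_a / p_a

print(round(p_a_given_d, 2), round(p_d_given_a, 2))
```

The two answers differ because the conditioning event, and hence the denominator, is different in each part.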
Independence: two events A and B are independent if and only if P(A ∩ B) = P(A) P(B).
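The theorem of Total probability (or Rule of Elimination):
If the events B1, B2, …, Bk constitute a partition of the sample space S such that P(Bi) ≠ 0 for
i = 1, 2, …, k, then for any event A of S,
$P(A) = \sum_{i=1}^{k} P(B_i \cap A) = \sum_{i=1}^{k} P(B_i)\,P(A \mid B_i)$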
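Bayes’ Rule
If the events B1, B2, …, Bk constitute a partition of the sample space S such that P(Bi) ≠ 0 for
i = 1, 2, …, k, then for any event A in S such that P(A) ≠ 0,
$P(B_r \mid A) = \dfrac{P(B_r \cap A)}{\sum_{i=1}^{k} P(B_i \cap A)} = \dfrac{P(B_r)\,P(A \mid B_r)}{\sum_{i=1}^{k} P(B_i)\,P(A \mid B_i)}$, for r = 1, 2, …, k.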
Example 1: In a certain assembly plant, three machines, B1, B2 and B3, make 30%, 45%
and 25% of the products, respectively. It is known from past experience that 2%, 3% and
2% of the products made by each machine, respectively, are defective. Suppose that a
finished product is randomly selected. What is the probability that it is defective? Now
suppose a product is chosen randomly and found to be defective. What is the probability
that it was made by machine B3?
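The assembly-plant example uses the theorem of total probability for the first question and Bayes' rule for the second. The sketch below is one way to organise the arithmetic; the dictionary layout is an illustrative choice.

```python
# P(B_i): share of production made by each machine.
prior = {"B1": 0.30, "B2": 0.45, "B3": 0.25}

# P(D | B_i): probability that a product from each machine is defective.
defect_rate = {"B1": 0.02, "B2": 0.03, "B3": 0.02}

# Total probability: P(D) = sum of P(B_i) * P(D | B_i).
p_defective = sum(prior[m] * defect_rate[m] for m in prior)

# Bayes' rule: P(B3 | D) = P(B3) * P(D | B3) / P(D).
p_b3_given_defective = prior["B3"] * defect_rate["B3"] / p_defective

print(round(p_defective, 4), round(p_b3_given_defective, 4))
```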
Example 2: A certain government agency employs three consulting firms (A, B and C)
with probabilities 0.10, 0.35 and 0.25 respectively. From past experience it is known that
the probabilities of cost overruns for the firms are 0.05, 0.03 and 0.15 respectively. Suppose
a cost overrun is experienced by the agency.
Chapter 6: Random variables and probability distributions
Definition 1:
- A random variable is a quantity which can take on the values of a given set (sample space)
with specified probabilities.
- A random variable is the aspect of a random experiment which we are interested in studying.
Definition 2:
A random variable is said to be discrete if it can only assume a countable number of values.
Definition 3:
Definition 4:
If X is a discrete random variable, the function given by f(x) = P(X = x) for each x within the range
of X is called the probability distribution of X. It has the following properties:
1. f(x) ≥ 0
2. Σ f(x) = 1, where the sum is taken over all x in the range of X
3. P(X = x) = f(x)
Example 2: If a car agency sells 50% of its inventory of a certain foreign car equipped with
airbags, find a formula for the probability distribution of the number of cars with airbags among
the next 4 cars sold by the agency.
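One possible formula for the airbag example is f(x) = C(4, x)/16 for x = 0, 1, 2, 3, 4. This assumes each of the 4 sales is independent and has probability 1/2 of being a car with airbags, so that all 2⁴ = 16 sale patterns are equally likely. The sketch below tabulates it with exact fractions.

```python
from fractions import Fraction
from math import comb

def f(x, n=4):
    """P(X = x): x of the next n cars sold have airbags (assumes p = 1/2)."""
    return Fraction(comb(n, x), 2 ** n)

distribution = {x: f(x) for x in range(5)}

for x, prob in distribution.items():
    print(x, prob)

# The probabilities of a distribution must add up to 1.
total = sum(distribution.values())
```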
Definition 5
The cumulative distribution, F(x), of a discrete random variable X with probability function f(x)
is:
$F(x) = P(X \le x) = \sum_{t \le x} f(t)$, for −∞ < x < ∞
The values, F(x), of the cumulative function of a discrete random variable, X, satisfy the following
conditions:
1. F(−∞) = 0
2. F(∞) = 1
3. If a < b, then F(a) ≤ F(b) for any real numbers a and b.
Definition 5:
A function f(x), defined over the set of all real numbers, is called a probability density function
(pdf) of the continuous random variable X, if and only if
1. f(x) ≥ 0 for all real x
2. $\int_{-\infty}^{\infty} f(x)\,dx = 1$
3. $P(a \le X \le b) = \int_{a}^{b} f(x)\,dx$ for any real constants a and b with a ≤ b.
Example 4: Suppose that the error in the reaction temperature, in °C, for a controlled
laboratory experiment is a continuous random variable X having the probability density
function
$f(x) = \begin{cases} \dfrac{x^2}{3}, & -1 < x < 2 \\ 0, & \text{elsewhere} \end{cases}$
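If X is a continuous random variable and a and b are two real constants with a ≤ b, then
P(a ≤ X ≤ b) = P(a ≤ X < b) = P(a < X ≤ b) = P(a < X < b)
Definition 6:
If X is a continuous random variable, the function
$F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt$, for −∞ < x < ∞,
is called the cumulative distribution of X. [f(t) is the value of the probability density function of X
at t]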
Example 5: For the density function of Example 4 find F(x), and use it to evaluate P(0 <
X ≤ 1).
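For the density f(x) = x²/3 on −1 < x < 2, integrating from −1 to x gives F(x) = (x³ + 1)/9 on that interval, with F(x) = 0 to the left of it and F(x) = 1 to the right. The sketch below encodes this piecewise function with exact fractions and evaluates P(0 < X ≤ 1) = F(1) − F(0); the function name `F` simply mirrors the notation of the notes.

```python
from fractions import Fraction

def F(x):
    """Cumulative distribution for the density f(x) = x^2 / 3 on -1 < x < 2."""
    if x <= -1:
        return Fraction(0)      # no probability to the left of -1
    if x >= 2:
        return Fraction(1)      # all probability lies to the left of 2
    x = Fraction(x)
    return (x ** 3 + 1) / 9     # integral of t^2 / 3 from -1 to x

p_between_0_and_1 = F(1) - F(0)

print(F(0), F(1), p_between_0_and_1)
```

Because F(2) − F(−1) = 1, the same function also confirms that the density integrates to 1.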
Theorem 2:
$f(x) = \dfrac{dF(x)}{dx}$, where the derivative exists.
MATHEMATICAL EXPECTATION
Definition 1:
Let X be a random variable with probability distribution f(x). The mean or expected value of X is
defined as follows:
$E(X) = \sum_{x} x f(x)$ if X is discrete;
$E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$ if X is continuous.
Definition 2:
Let X be a random variable with probability distribution f(x) and mean μ. The variance of X is
defined as follows:
$\sigma^2 = E[(X-\mu)^2] = \sum_{x} (x-\mu)^2 f(x)$ if X is discrete;
$\sigma^2 = E[(X-\mu)^2] = \int_{-\infty}^{\infty} (x-\mu)^2 f(x)\,dx$ if X is continuous.
The positive square root of the variance, σ, is called the standard deviation of X.
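The discrete versions of these definitions can be illustrated with a small distribution. The sketch below assumes, purely for illustration, the distribution f(x) = C(4, x)/16 for x = 0, …, 4 (the number of successes in 4 independent trials with success probability 1/2).

```python
from fractions import Fraction
from math import comb

# Assumed illustrative distribution: f(x) = C(4, x) / 16 for x = 0..4.
f = {x: Fraction(comb(4, x), 16) for x in range(5)}

# E(X) = sum of x * f(x)
mean = sum(x * p for x, p in f.items())

# Variance = sum of (x - mean)^2 * f(x)
variance = sum((x - mean) ** 2 * p for x, p in f.items())

# Standard deviation = positive square root of the variance.
std_dev = float(variance) ** 0.5

print(mean, variance, std_dev)
```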
Chapter 7: Binomial, Poisson and Normal distributions, Binomial and Normal tables
The Binomial Distribution
$p(x) = \dbinom{n}{x} p^x (1-p)^{n-x}$, x = 0, 1, 2, …, n
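The binomial formula translates directly into a small function. In the sketch below the function name `binomial_pmf` and the parameter values n = 4, p = 0.5 are illustrative assumptions, not values taken from the notes.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for a binomial random variable with n trials and success probability p."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

# Illustrative parameters (assumed): 4 trials, success probability 0.5.
n, p = 4, 0.5
probabilities = [binomial_pmf(x, n, p) for x in range(n + 1)]

print(probabilities)
```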
Poisson experiments yield numerical values of a random variable X representing the number of
outcomes occurring during a given time interval or in a specified region.
$p(x) = \dfrac{e^{-\lambda} \lambda^x}{x!}$, x = 0, 1, 2, …
with mean μ = λ and variance σ² = λ
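A minimal sketch of the Poisson probability function follows. It writes the average number of outcomes per interval as `lam` (λ), and the value λ = 2 is a hypothetical choice for illustration. The loop also checks numerically that the mean of the distribution equals λ.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for a Poisson random variable with mean lam."""
    return exp(-lam) * lam ** x / factorial(x)

lam = 2.0  # hypothetical average number of outcomes per interval

# Truncate the infinite sum at 50 terms; the remaining tail is negligible here.
probs = [poisson_pmf(x, lam) for x in range(51)]
total = sum(probs)
mean = sum(x * p for x, p in enumerate(probs))

print(round(poisson_pmf(0, lam), 4), round(total, 6), round(mean, 6))
```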
The normal probability distribution or the normal curve is a bell-shaped (symmetric) curve. Its
mean is denoted by μ and its standard deviation by σ. A continuous random variable x that has a
normal distribution is called a normal random variable. Not all bell-shaped curves represent a
normal distribution curve.
2. The curve is symmetric about the mean.
3. Mean, median and mode of a normal distribution are equal.
4. Normal distributions are defined by two parameters, the mean (μ) and the standard
deviation (σ).
The standard normal distribution is a special case of the normal distribution. For the standard
normal distribution the value of the mean is equal to zero, and the value of the standard deviation
is equal to one. The normal distribution with μ = 0 and σ = 1 is called the standard normal distribution.
The random variable that possesses the standard normal distribution is denoted by z. The units
marked on the horizontal axis of the standard normal curve are denoted by z and are called the z
values or z scores. A specific value of z gives the distance between the mean and the point
represented by z in terms of the standard deviation. The z values on the right side of the mean are
positive and those on the left side of the mean are negative. Although the values of z on the left
side of the mean are negative, the area under the curve is always positive. The area under the
standard curve between any two points can be interpreted as the probability that z assumes a value
within that interval.
The normal distribution table gives the area to the left of a z value. To find the area to the right of
z, first we find the area to the left of z, then we subtract this area from 1 which is the total area
under the curve.
Examples:
a. Find the area under the standard normal curve to the left of z = 1.96.
b. Find the area under the standard normal curve from z = -2.15 to z = 0.
c. Find the following areas under the standard normal curve.
i. Area to the right of z = 2.38
ii. Area to the left of z = -1.57
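The table lookups in these examples can be cross-checked in Python, where `statistics.NormalDist().cdf(z)` returns the area to the left of z under the standard normal curve. This is a sketch for checking answers, not a replacement for learning to read the table.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

# a. Area to the left of z = 1.96.
area_a = Z.cdf(1.96)

# b. Area from z = -2.15 to z = 0: left of 0 minus left of -2.15.
area_b = Z.cdf(0) - Z.cdf(-2.15)

# c(i). Area to the right of z = 2.38: 1 minus the area to the left.
area_c1 = 1 - Z.cdf(2.38)

# c(ii). Area to the left of z = -1.57.
area_c2 = Z.cdf(-1.57)

print(round(area_a, 4), round(area_b, 4), round(area_c1, 4), round(area_c2, 4))
```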
In real world applications, a random variable may have a normal distribution with values of the
mean and standard deviation that are different from 0 and 1, respectively. The first step in such a
case is to convert the given normal distribution to the standard normal distribution. This procedure
is called standardizing a normal distribution. The units of the normal distribution are denoted by
x.
For a normal random variable x, a particular value of x can be converted to its corresponding z
value by the formula:
$z = \dfrac{x - \mu}{\sigma}$
where μ and σ are the mean and standard deviation of the normal distribution of x, respectively.
Thus, to find the z value for an x value, we calculate the difference between the given x value and
the mean μ, and divide this by the standard deviation σ. If the value of x is equal to μ, then its z
value is equal to zero. Note that we will always round z values to two decimal places.
Example
Let x be a continuous random variable that has a normal distribution with a mean of 50 and a
standard deviation of 10. Convert the x value of 35 to a z value and find the probability to the left of
this point.
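A short sketch of the standardizing step for this example: compute z = (x − μ)/σ, then look up the area to the left of z. Here `statistics.NormalDist` stands in for the printed table.

```python
from statistics import NormalDist

mu, sigma = 50, 10  # mean and standard deviation of x
x = 35

# Standardize: distance from the mean in units of standard deviation.
z = (x - mu) / sigma

# Area to the left of z under the standard normal curve.
p_left = NormalDist().cdf(z)

# The same probability computed directly on the original scale.
p_left_direct = NormalDist(mu, sigma).cdf(x)

print(z, round(p_left, 4), round(p_left_direct, 4))
```

The two probabilities agree, which is the point of standardizing: one table serves every normal distribution.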
The z value for an x value that is greater than μ is positive, the z value for an x value that is equal
to μ is zero, and the z value for an x value that is less than μ is negative.
To find the area between two values of x for a normal distribution, we first convert both values of
x to their respective z values. Then we find the area under the standard normal curve between those
two z values. The area between the two z values gives the area between the corresponding x values.
How to find the corresponding value of z or x when an area under a normal distribution curve is
known.
Example:
1. Find the point z such that the area under the standard normal curve to the left of z is 0.9251.
Solution: To find the required value of z, we locate 0.9251 in the body of the normal
distribution table. Next we read the numbers in the row and column for z that correspond to
0.9251, which are 1.4 and 0.04. Combining these two numbers, we obtain the required value
of z = 1.44.
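The reverse lookup can also be checked numerically. In the sketch below, `statistics.NormalDist().inv_cdf` returns the z value with a given area to its left, and the result is rounded to two decimal places as in the table.

```python
from statistics import NormalDist

area_to_left = 0.9251

# Inverse of the cumulative distribution: area -> z value.
z = NormalDist().inv_cdf(area_to_left)

print(round(z, 2))
```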
APPROXIMATIONS
Under certain conditions:
- The normal distribution can be used to approximate both the Binomial and Poisson
distributions.
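As an illustration of the binomial case, the sketch below compares an exact binomial probability with its normal approximation. The parameter values (n = 50, p = 0.5, k = 20) are hypothetical, and the continuity correction of 0.5 is a common refinement that the notes do not state. The normal curve is given the same mean np and standard deviation √(np(1 − p)) as the binomial.

```python
from math import comb, sqrt
from statistics import NormalDist

# Hypothetical parameters chosen for illustration.
n, p, k = 50, 0.5, 20

# Exact binomial probability P(X <= k).
exact = sum(comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(k + 1))

# Normal curve with the same mean and standard deviation as the binomial.
mu = n * p
sigma = sqrt(n * p * (1 - p))

# Continuity correction: treat the discrete value k as the interval up to k + 0.5.
approx = NormalDist(mu, sigma).cdf(k + 0.5)

print(round(exact, 4), round(approx, 4))
```

The approximation tends to be closest when n is large and p is not too close to 0 or 1.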