
School of Mathematical and Natural Sciences

Department of Statistics
Module Code: STA1142
Module Description: Introductory Probability

Year: 2020
Lecturer: Mr. T Ravele

1. COURSE CONTENT

1 Introduction
2 Counting techniques
3 Probability and Relative frequencies, properties
4 Addition rule and mutually exclusive events
5 Conditional probability, Bayes Theorem and independence
6 Random variables and probability distributions
7 Binomial, Poisson and Normal distributions, Binomial and Normal tables.

2. MODULE ASSESSMENT

2.1 Continuous Assessment

2.1.1 Test 1 50%

2.1.2 Test 2 50%

2.2 Examination [refer to School calendar]

2.3 Assessment Dates will be decided at a later date.

3. READING LIST

Although lecture notes are provided, it is important that you reinforce this material by referring
to more detailed texts.

Recommended Text

“Probability and Statistics for Engineers and Scientists” by R.E. Walpole & R.H. Myers, any edition.

Chapter 1: Introduction
Statistical (or random) Experiment – is a process or a course of action which has several
possible outcomes. The outcome that occurs cannot be predicted before the action is performed.

Some examples:

- Toss a coin
- Roll a die
- Pick a card out of a regular pack of cards
- Write a test which has 10 questions
- Ask a consumer if she prefers product A or B.
- The body mass of a new born baby.

Sample space – is the set of all possible outcomes of a statistical experiment. Each outcome in a
sample space is called an element or a member of the sample space. The outcomes can either be
discrete (then they can be listed individually) or continuous (then they fall within an interval and
cannot be listed individually).

Discrete sets – we explicitly write down every single possible outcome. List them between curly
brackets and separate them with semicolons.

S = {1; 2; 3; 4; 5; 6}

Continuous sets – if the outcome of the experiment can be any value within an interval, then the
sample space is the interval. Call the results of the experiment x, and if x can assume any value in
interval [a, b] (including a and b), then we write the sample space as follows:

S = {x : a ≤ x ≤ b}

Null set (or empty set) – a null set, ∅, is a set which contains no elements.

Event – an event is a subset of a sample space.

Venn diagram – is a pictorial illustration of a sample space and events.

Complement – if A is an event with respect to the sample space S, the complement of A (denoted by A′) is the subset of all elements that are not in A.

Intersection (A ∩ B) – the intersection of two events, A and B, is the event containing all
elements that are common to A and B.

Mutually exclusive events – two events are said to be mutually exclusive if they have no
common elements.

Union (A ∪ B) – the union of two events, A and B, is the event containing all the elements that
belong to A or B or both.

Exhaustive events – two events, A and B, are said to be exhaustive if together they contain all the
elements in the sample space. Therefore, A ∪ B = S.
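
These set operations can be tried out directly with Python's built-in set type. The snippet below is only an illustrative sketch; the sample space and events (one roll of a die) are assumed for the example and are not part of the notes.

    # Illustrative sketch: events for one roll of a die (assumed example)
    S = {1, 2, 3, 4, 5, 6}          # sample space
    A = {2, 4, 6}                   # event: an even number
    B = {1, 2, 3}                   # event: at most three

    complement_A = S - A            # {1, 3, 5}
    intersection = A & B            # A intersect B = {2}
    union = A | B                   # A union B = {1, 2, 3, 4, 6}
    mutually_exclusive = (A & {1, 3, 5} == set())   # A and "odd" share no elements: True
    exhaustive = (A | {1, 3, 5} == S)               # together they cover S: True
    print(complement_A, intersection, union, mutually_exclusive, exhaustive)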

The classical definition of probability


The probability of an event A is defined as follows:

P(A) = n(A) / N

Where n(A) = the number of outcomes corresponding to the event A. N = n(S), the total number
of possible outcomes in the sample space.

Basic rules of probability:

Rule 1: 0 ≤ P(A) ≤ 1

- Probability is a measure of the likelihood of an event occurring.
- Probability is a number between 0 and 1.
- Probability is never negative.
- Probability is never greater than 1.
- P(A) = 0 means event A is impossible.
- P(A) = 1 means event A will definitely happen.
- P(S) = 1.
- P(∅) = 0.

Rule 2: ∑ p(ei) = 1, where the sum runs over i = 1, 2, …, N, N = the number of elementary events (outcomes) in the sample space, and ei = the ith elementary event.

Rule 3: P(A′) = 1 − P(A)

Rule 4: Addition rule for probability

P(A ∪ B) = P(A) + P(B) − P(A ∩ B)

Rule 5: Addition rule for mutually exclusive events

P(A ∪ B) = P(A) + P(B)
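
The basic rules can be checked numerically. The sketch below again assumes a single roll of a fair die as the example and uses exact fractions so that the equalities hold exactly.

    from fractions import Fraction

    S = {1, 2, 3, 4, 5, 6}                     # one roll of a fair die (assumed example)
    P = lambda E: Fraction(len(E), len(S))     # classical definition: P(A) = n(A)/N

    A = {2, 4, 6}                              # even outcomes
    B = {4, 5, 6}                              # outcomes greater than 3
    C = {1, 3}                                 # mutually exclusive with A

    assert 0 <= P(A) <= 1                           # Rule 1
    assert sum(P({e}) for e in S) == 1              # Rule 2
    assert P(S - A) == 1 - P(A)                     # Rule 3
    assert P(A | B) == P(A) + P(B) - P(A & B)       # Rule 4 (addition rule)
    assert P(A | C) == P(A) + P(C)                  # Rule 5 (mutually exclusive events)
    print(P(A), P(B), P(A | B))                     # 1/2 1/2 2/3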

Chapter 2: Counting Techniques
Theorem 1:

If an operation consists of two steps, of which the first can be made in n1 ways and for each of
these the second can be made in n2 ways, then the whole operation can be made in n1.n2 ways.

Example 1: How many sample points are there in the sample space when a pair of dice is
thrown once?

Example 2: A developer of a new subdivision offers prospective home buyers a choice of Tudor, rustic, colonial, and traditional exterior styling in ranch, two-story, and split-level floor plans. In how many ways can a buyer order one of these homes?

Theorem 2:

If an operation consists of k steps, of which the first can be made in n1 ways, for each of these the second can be made in n2 ways, for each of the first two the third can be made in n3 ways, and so forth, then the whole operation can be made in n1 · n2 · n3 · … · nk ways.

Example 1: Sam is going to assemble a computer by himself. He has the choice of ordering
chips from two brands, a hard drive from four, memory from three, and an accessory bundle
from five local stores. How many different ways can Sam order the parts?
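
By Theorem 2 the count is the product 2 × 4 × 3 × 5. A one-line check (math.prod assumes Python 3.8+):

    from math import prod

    # 2 chip brands * 4 hard drives * 3 memory options * 5 accessory stores
    print(prod([2, 4, 3, 5]))   # 120 different ways to order the parts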

Theorem 3:

The number of ways in which n distinct objects can be arranged in a row is n! (n factorial). Note
that 0! = 1

Theorem 4:

The number of permutations [order is important] of n distinct objects taken r at a time is:

nPr = n! / (n − r)!, where r = 0, 1, 2, …, n.

Example 1: In one year, three awards (research, teaching, and service) will be given to a
class of 25 graduate students in a statistics department. If each student can receive at most
one award, how many possible selections are there?
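
The three awards are distinct, so order matters and Theorem 4 applies with n = 25 and r = 3. A short check, assuming Python 3.8+ for math.perm:

    import math

    # Permutations of 25 students taken 3 at a time (research, teaching, service)
    print(math.perm(25, 3))                            # 13800
    print(math.factorial(25) // math.factorial(22))    # same value via n!/(n - r)!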

Theorem 5:

The number of permutations of n objects arranged in a circle is (n – 1)!.

Theorem 6:

The number of distinct permutations of n things of which n1 are of one kind, n2 of a second kind, …, nk of a kth kind is

n! / (n1! n2! … nk!)

Theorem 7:

The number of combinations [order is not important] of r objects selected from a set of n distinct
objects is:

n n!
n
C r     , where r  0,1,2,......, n.
 r  r!(n  r )!

Example 1: A young boy asks his mother to get five Game-Boy cartridges from his
collection of 10 arcade and 5 sports games. How many ways are there that his mother
will get 3 arcade and 2 sports games, respectively?
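
Order does not matter within each group, so the count is C(10, 3) · C(5, 2). A quick check with math.comb (Python 3.8+):

    import math

    # 3 of the 10 arcade games and 2 of the 5 sports games
    ways = math.comb(10, 3) * math.comb(5, 2)
    print(ways)   # 120 * 10 = 1200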

Probability trees (Tree diagrams):

One very useful method of calculating probabilities is to use a probability tree, in which the various
possible events of an experiment are represented by lines or branches of the tree. When you want
to construct a sample space for an experiment, a probability tree is a useful device for ensuring
that you have identified all simple events and assigned the associated probabilities.
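
A probability tree can be imitated in code by listing every root-to-leaf path and multiplying the branch probabilities along the path. The two-toss coin experiment below is an assumed example, not one prescribed in the notes:

    from itertools import product

    # Each branch: an outcome and its probability for one toss of a fair coin
    branches = [("H", 0.5), ("T", 0.5)]

    # Every root-to-leaf path of a two-stage tree, with its probability
    for (o1, p1), (o2, p2) in product(branches, repeat=2):
        print(o1 + o2, p1 * p2)      # HH 0.25, HT 0.25, TH 0.25, TT 0.25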

Chapter 3: Probability and relative frequency

Definition: The probability of an event A is the sum of the weights of all sample points in A.
Therefore,
0 ≤ P(A) ≤ 1, P(∅) = 0, and P(S) = 1
Furthermore, if A1, A2, A3, … is a sequence of mutually exclusive events, then
P(A1 ∪ A2 ∪ A3 ∪ …) = P(A1) + P(A2) + P(A3) + …

Example 1: A coin is tossed twice. What is the probability that at least one head occurs?

Theorem: If an experiment can result in any one of N different equally likely outcomes, and if
exactly n of these outcomes correspond to event A, then the probability of event A is
P(A) = n / N

Example 1: A statistics class for engineers consists of 25 industrial, 10 mechanical, 10 electrical, and 8 civil engineering students. If a person is randomly selected by the instructor to answer a question, find the probability that the student chosen is (a) an industrial engineering major, (b) a civil engineering or electrical engineering major.
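
For both examples the classical definition P(A) = n(A)/N is all that is needed; a sketch of the arithmetic (the counts are taken from the examples above):

    from fractions import Fraction

    # Coin example: two tosses, at least one head
    sample_space = ["HH", "HT", "TH", "TT"]
    at_least_one_head = [s for s in sample_space if "H" in s]
    print(Fraction(len(at_least_one_head), len(sample_space)))   # 3/4

    # Engineering class: 25 industrial, 10 mechanical, 10 electrical, 8 civil
    N = 25 + 10 + 10 + 8                  # 53 students in total
    print(Fraction(25, N))                # (a) industrial major: 25/53
    print(Fraction(8 + 10, N))            # (b) civil or electrical major: 18/53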

Chapter 4: Addition rule and mutually exclusive events
Theorem: If A and B are two events, then
P(A ∪ B) = P(A) + P(B) − P(A ∩ B)

Corollary: If A and B are mutually exclusive, then

P(A ∪ B) = P(A) + P(B)
Corollary: If A1, A2, A3, …, An are mutually exclusive, then
P(A1 ∪ A2 ∪ A3 ∪ … ∪ An) = P(A1) + P(A2) + P(A3) + … + P(An)
Corollary: If A1, A2, A3, …, An is a partition of the sample space S, then
P(A1 ∪ A2 ∪ A3 ∪ … ∪ An) = P(A1) + P(A2) + P(A3) + … + P(An) = P(S) = 1

Corollary: For any three events A, B and C,
P(A ∪ B ∪ C) = P(A) + P(B) + P(C) − P(A ∩ B) − P(A ∩ C) − P(B ∩ C) + P(A ∩ B ∩ C)

Corollary: If A and A′ are complementary events, then

P(A) + P(A′) = 1

Proposition (Properties of the relative frequency):

1. P(S) = 1, where S is the certain event;
2. If A ∩ B = ∅, then P(A ∪ B) = P(A) + P(B).

Example 1: John is going to graduate from an industrial engineering department in a university by the end of the semester. After being interviewed at two companies he likes, he assesses that the probability of getting an offer from company A is 0.8 and the probability of getting an offer from company B is 0.6. If, on the other hand, he believes that the probability that he will get offers from both companies is 0.5, what is the probability that he will get at least one offer from these two companies?

Example 2: If the probabilities are, respectively, 0.09, 0.15, 0.21, and 0.23 that a person
purchasing a new automobile will choose the colour green, white, red, or blue, what is the
probability that a given buyer will purchase a new automobile that comes in one of those
colours?

Example 3: If the probabilities that an automobile mechanic will service 3, 4, 5, 6, 7, or 8 or more cars on any given workday are, respectively, 0.12, 0.19, 0.28, 0.24, 0.10, and 0.07, what is the probability that he will service at least 5 cars on his next day at work?
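
Since the listed outcomes are mutually exclusive, their probabilities simply add. A sketch of the arithmetic for Examples 2 and 3:

    # Example 2: buyer chooses green, white, red or blue (mutually exclusive colours)
    print(round(0.09 + 0.15 + 0.21 + 0.23, 2))   # 0.68

    # Example 3: P(at least 5 cars) = P(5) + P(6) + P(7) + P(8 or more)
    print(round(0.28 + 0.24 + 0.10 + 0.07, 2))   # 0.69, equivalently 1 - P(3) - P(4)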

Chapter 5: Conditional probability, Bayes Theorem and independence

Definition 1: Conditional probability

P (A│B) is called the conditional probability that event A will occur given that event B has
occurred.

P(A│B) = P(A ∩ B) / P(B), provided that P(B) > 0

Example 1: The probability that a regularly scheduled flight departs on time is P(D) = 0.83;
the probability that it arrives on time is P(A) = 0.82; and the probability that it departs and
arrives on time is P(D∩A) = 0.78. Find the probability that a plane
(a) Arrives on time given that it departed on time, and
(b) Departed on time given that it has arrived on time.
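
Applying Definition 1 to the flight example (the three probabilities are taken from the example above):

    P_D = 0.83            # departs on time
    P_A = 0.82            # arrives on time
    P_D_and_A = 0.78      # departs and arrives on time

    print(round(P_D_and_A / P_D, 4))   # (a) P(A | D) ≈ 0.9398
    print(round(P_D_and_A / P_A, 4))   # (b) P(D | A) ≈ 0.9512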

Example 2: The concept of conditional probability has countless uses in both industrial and biomedical applications. Consider an industrial process in the textile industry in which strips of a particular type of cloth are being produced. These strips can be defective in two ways, length and nature of texture. For the case of the latter, the process of identification is very complicated. It is known from historical information on the process that 10% of strips fail the length test, 5% fail the texture test, and only 0.8% fail both tests. If a strip is selected randomly from the process and a quick measurement identifies it as failing the length test, what is the probability that it is texture defective?

Definition 2: Independent Events

Two events, A and B, are said to be independent if and only if

P(A│B) = P(A) and P(B│A) = P(B), or equivalently P(A ∩ B) = P(A) P(B)

Definition 3: Independent Events

Two events, A and B, are said to be independent if and only if

P(A ∩ B) = P(A) P(B)
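
A small numerical illustration of independence, using two tosses of a fair coin as an assumed example (the result of the first toss tells us nothing about the second):

    from fractions import Fraction
    from itertools import product

    S = list(product("HT", repeat=2))                  # HH, HT, TH, TT
    P = lambda E: Fraction(len(E), len(S))

    A = [s for s in S if s[0] == "H"]                  # first toss is a head
    B = [s for s in S if s[1] == "H"]                  # second toss is a head
    A_and_B = [s for s in S if s in A and s in B]

    print(P(A_and_B) == P(A) * P(B))                   # True: A and B are independent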

The theorem of Total probability (or Rule of Elimination):

If the events B1, B2,….,Bk constitute a partition of the sample space S such that P(Bi)≠0 for

i = 1, 2,…., k, then for any event A of S,


P(A) = ∑ P(Bi ∩ A) = ∑ P(Bi) P(A│Bi), where both sums run over i = 1, 2, …, k

Bayes’ Rule

If the events B1, B2, …, Bk constitute a partition of the sample space S and A is any event of S such that P(A) ≠ 0, then for r = 1, 2, …, k,

P(Br│A) = P(Br ∩ A) / ∑ P(Bi ∩ A) = P(Br) P(A│Br) / ∑ P(Bi) P(A│Bi), where both sums run over i = 1, 2, …, k

Example 1: In a certain assembly plant, three machines, B1, B2 and B3, make 30%, 45% and 25% of the products respectively. It is known from past experience that 2%, 3% and 2% of the products made by each machine, respectively, are defective. Suppose that a finished product is randomly selected. What is the probability that it is defective? Now suppose a product is chosen randomly and found to be defective; what is the probability that it was made by machine B3?
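
Using the rule of elimination and then Bayes' rule with the numbers given in the example (machine shares 0.30, 0.45, 0.25 and defect rates 0.02, 0.03, 0.02):

    P_B = [0.30, 0.45, 0.25]          # P(B1), P(B2), P(B3): machine shares
    P_D_given_B = [0.02, 0.03, 0.02]  # P(D | Bi): defect rate of each machine

    # Total probability: P(D) = sum of P(Bi) * P(D | Bi)
    P_D = sum(p * q for p, q in zip(P_B, P_D_given_B))
    print(round(P_D, 4))                                   # 0.0245

    # Bayes' rule: P(B3 | D) = P(B3) * P(D | B3) / P(D)
    print(round(P_B[2] * P_D_given_B[2] / P_D, 4))         # ≈ 0.2041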

Example 2: A certain government agency employs three consulting firms (A, B and C)
with probabilities 0.10, 0.35 and 0.25 respectively. From past experience it is known that
the probabilities of cost overruns for the firms are 0.05, 0.03 and 0.15 respectively. Suppose
a cost overrun is experienced by the agency.

a. What is the probability that the consulting firm involved is company C?
b. What is the probability that it is company A?

Chapter 6: Random variables and probability distributions
Definition 1:

- A random variable is a quantity which can take on the values of a given set (sample space)
with specified probabilities.
- A random variable is the aspect of a random experiment which we are interested in studying.

Definition 2:

A random variable is said to be discrete if it can only assume a countable number of values.

Definition 3:

A random variable is said to be continuous if it can assume an uncountable number of values.

Definition 4:

If X is a discrete random variable, the function given by f(x) = P(X = x) for each x within the range
of X is called the probability distribution of X with the following properties:

1. f(x) ≥ 0 for each value within its domain.
2. ∑ f(x) = 1, where the summation extends over all the values within its domain.
3. P(X = x) = f(x)

Example 1: A shipment of 8 similar microcomputers to a retail outlet contains 3 that are defective. If a school makes a random purchase of 2 of these computers, find the probability distribution for the number of defectives.

Example 2: If a car agency sells 50% of its inventory of a certain foreign car equipped with
airbags, find a formula for the probability distribution of the number of cars with airbags among
the next 4 cars sold by the agency.
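
A sketch of the two distributions, assuming math.comb is available (Python 3.8+): the defective-computer example is a counting argument of the form C(3, x)C(5, 2 − x)/C(8, 2), and the airbag example uses f(x) = C(4, x)(1/2)^4.

    import math
    from fractions import Fraction

    # Defective computers: 3 defective and 5 non-defective in the shipment of 8, purchase of 2
    f = {x: Fraction(math.comb(3, x) * math.comb(5, 2 - x), math.comb(8, 2)) for x in range(3)}
    print(f)                        # f(0) = 5/14, f(1) = 15/28, f(2) = 3/28

    # Cars with airbags: f(x) = C(4, x) (1/2)^4 for x = 0, 1, 2, 3, 4
    g = {x: Fraction(math.comb(4, x), 2 ** 4) for x in range(5)}
    print(g, sum(g.values()))       # 1/16, 1/4, 3/8, 1/4, 1/16; the probabilities sum to 1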

Definition 5

The cumulative distribution, F(x), of a discrete random variable X with probability function f(x)
is:

F(x) = P(X ≤ x) = ∑ f(t), where the sum is over all t ≤ x, for −∞ < x < ∞

The values, F(x), of the cumulative function of a discrete random variable, X, satisfy the following
conditions:

1. F(−∞) = 0
2. F(∞) = 1
3. If a < b, then F(a) ≤ F(b) for any real numbers a and b.

Example 3: Find the cumulative distribution function of the random variable X in Example 2. Using F(x), verify that f(2) = 3/8.
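
A cumulative distribution is simply the running sum of the probability function. The check below reuses the airbag distribution f(x) = C(4, x)(1/2)^4 and confirms that f(2) = F(2) − F(1) = 3/8:

    import math
    from fractions import Fraction

    g = {x: Fraction(math.comb(4, x), 2 ** 4) for x in range(5)}   # airbag example
    F = lambda x: sum(g[t] for t in g if t <= x)                   # F(x) = P(X <= x)

    print(F(1), F(2))                      # 5/16, 11/16
    print(F(2) - F(1) == Fraction(3, 8))   # True: f(2) = 3/8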

Definition 6:

A function f(x), defined over the set of all real numbers, is called a probability density function
(pdf) of the continuous random variable X, if and only if

1. f(x) ≥ 0 for −∞ < x < ∞
2. ∫ f(x) dx = 1, where the integral is taken over the whole real line (−∞ < x < ∞)
3. P(a ≤ X ≤ b) = ∫ f(x) dx taken from a to b, for any real constants a and b with a ≤ b.

Example 4: Suppose that the error in the reaction temperature, in °C, for a controlled
laboratory experiment is a continuous random variable X having the probability density
function.
f(x) = x²/3 for −1 < x < 2, and f(x) = 0 elsewhere.

a. Verify condition 2 of Definition 6.
b. Find P(0 < X ≤ 1).
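
Both parts can be checked by integrating the density numerically. The snippet assumes SciPy (scipy.integrate.quad) is available and is only an illustration; the exact answers are 1 and 1/9.

    from scipy.integrate import quad

    f = lambda x: x ** 2 / 3 if -1 < x < 2 else 0.0

    total, _ = quad(f, -1, 2)        # condition 2: total area under f
    prob, _ = quad(f, 0, 1)          # P(0 < X <= 1)
    print(round(total, 6), round(prob, 6))   # 1.0 and 0.111111 (= 1/9)
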
Theorem 1:

If X is a continuous random variable and a and b are two real constants with a ≤ b, then

P(a ≤ X ≤ b) = P(a ≤ X < b) = P(a < X ≤ b) = P(a < X < b).

Definition 7:

If X is a continuous random variable, the function given by

F(x) = P(X ≤ x) = ∫ f(t) dt taken from −∞ to x, for −∞ < x < ∞,

is called the cumulative distribution of X. [f(t) is the value of the probability density function of X at t.]

Example 5: For the density function of Example 4 find F(x), and use it to evaluate P(0 <
X ≤ 1).
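
For this density the antiderivative gives F(x) = (x³ + 1)/9 on −1 < x < 2, with F(x) = 0 to the left and F(x) = 1 to the right, so P(0 < X ≤ 1) = F(1) − F(0). A small sketch, assuming that closed form:

    def F(x):
        # cumulative distribution for f(x) = x^2/3 on (-1, 2), obtained by integration
        if x < -1:
            return 0.0
        if x < 2:
            return (x ** 3 + 1) / 9
        return 1.0

    print(F(1) - F(0))    # P(0 < X <= 1) = 2/9 - 1/9 = 1/9 ≈ 0.1111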

Theorem 2:

If f(x) is the pdf and F(x), the cumulative distribution of X, then

P(a ≤ X ≤ b) = F(b) − F(a), for any real constants a and b with a ≤ b, and

f(x) = dF(x)/dx, where the derivative exists.

MATHEMATICAL EXPECTATION

Definition 1:

Let X be a random variable with probability distribution f(x). The mean or expected value of X is
defined as follows:

μ = E(X) = ∑ x f(x) if X is discrete;

μ = E(X) = ∫ x f(x) dx (integrating from −∞ to ∞) if X is continuous.

Definition 2:

Let X be a random variable with probability distribution f(x) and mean μ. The variance of X is defined as follows:

σ² = E[(X − μ)²] = ∑ (x − μ)² f(x) if X is discrete;

σ² = E[(X − μ)²] = ∫ (x − μ)² f(x) dx (integrating from −∞ to ∞) if X is continuous.

The positive square root of the variance, σ, is called the standard deviation of X.
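
Checking the definitions on the discrete airbag distribution f(x) = C(4, x)(1/2)^4, whose mean and variance should come out to 2 and 1:

    import math
    from fractions import Fraction

    g = {x: Fraction(math.comb(4, x), 2 ** 4) for x in range(5)}

    mu = sum(x * g[x] for x in g)                    # E(X)
    var = sum((x - mu) ** 2 * g[x] for x in g)       # E[(X - mu)^2]
    print(mu, var)                                   # 2 and 1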

Chapter 7: Binomial, Poisson and Normal distributions, Binomial and Normal
tables
The Binomial Distribution

- The random experiment consists of n independent repeated Bernoulli trials.
- The probability of “success”, p, remains constant from trial to trial.
- The Binomial random variable, X = the number of “successes” out of the n trials.

The Binomial distribution is given by:

p(x) = nCx · p^x · (1 − p)^(n−x), x = 0, 1, 2, …, n

n and p are called the parameters of the distribution; μ = np and σ² = np(1 − p).
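
A direct implementation of the pmf together with the mean and variance formulas; the parameters n = 10 and p = 0.3 below are assumed purely for illustration:

    import math

    def binom_pmf(x, n, p):
        # p(x) = C(n, x) p^x (1 - p)^(n - x)
        return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

    n, p = 10, 0.3                      # assumed example parameters
    probs = [binom_pmf(x, n, p) for x in range(n + 1)]
    print(round(sum(probs), 6))         # 1.0 (the pmf sums to one)
    print(round(n * p, 2), round(n * p * (1 - p), 2))   # mean 3.0 and variance 2.1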

The Poisson Distribution

Poisson experiments yield numerical values of a random variable X representing the number of
outcomes occurring during a given time interval or in a specified region.

The Poisson distribution is given by:

p(x) = e^(−λ) λ^x / x!, x = 0, 1, 2, …

Where λ is the average number of outcomes per unit time or region.

μ = λ and σ² = λ
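
The same kind of sketch for the Poisson pmf, with an assumed rate of λ = 4 outcomes per unit:

    import math

    def poisson_pmf(x, lam):
        # p(x) = e^(-lambda) * lambda^x / x!
        return math.exp(-lam) * lam ** x / math.factorial(x)

    lam = 4                                         # assumed example rate
    print(round(poisson_pmf(2, lam), 4))            # P(X = 2) ≈ 0.1465
    print(round(sum(poisson_pmf(x, lam) for x in range(100)), 6))   # ≈ 1.0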

The Normal Distribution

The normal probability distribution or the normal curve is a bell-shaped (symmetric) curve. Its mean is denoted by μ and its standard deviation by σ. A continuous random variable x that has a normal distribution is called a normal random variable. Not all bell-shaped curves represent a normal distribution curve.

Properties of Normal distribution.

1. The area under the normal curve is equal to 1.

2. The curve is symmetric about the mean.
3. Mean, median and mode of a normal distribution are equal.
4. Normal distributions are defined by two parameters, the mean (μ) and the standard
deviation (σ).

The standard Normal distribution

The standard normal distribution is a special case of the normal distribution. For the standard
normal distribution the value of the mean is equal to zero, and the value of the standard deviation
is equal to one. The normal distribution with μ = 0 and σ = 1 is called the standard normal distribution. The random variable that possesses the standard normal distribution is denoted by z. The units
marked on the horizontal axis of the standard normal curve are denoted by z and are called the z
values or z scores. A specific value of z gives the distance between the mean and the point
represented by z in terms of the standard deviation. The z values on the right side of the mean are
positive and those on the left side of the mean are negative. Although the values of z on the left
side of the mean are negative, the area under the curve is always positive. The area under the
standard curve between any two points can be interpreted as the probability that z assumes a value
within that interval.

The normal distribution table gives the area to the left of a z value. To find the area to the right of
z, first we find the area to the left of z, then we subtract this area from 1 which is the total area
under the curve.

Examples:

a. Find the area under the standard normal curve to the left of z = 1.96.
b. Find the area under the standard normal curve from z = -2.15 to z = 0.
c. Find the following areas under the standard normal curve.
i. Area to the right of z = 2.38
ii. Area to the left of z = -1.57
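
The table look-ups in these examples can be reproduced with the standard normal cdf. The snippet assumes SciPy (scipy.stats.norm) is available:

    from scipy.stats import norm

    print(round(norm.cdf(1.96), 4))                    # (a) area to the left of 1.96 ≈ 0.9750
    print(round(norm.cdf(0) - norm.cdf(-2.15), 4))     # (b) area from -2.15 to 0 ≈ 0.4842
    print(round(1 - norm.cdf(2.38), 4))                # (c)(i) area to the right of 2.38 ≈ 0.0087
    print(round(norm.cdf(-1.57), 4))                   # (c)(ii) area to the left of -1.57 ≈ 0.0582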

Standardizing a Normal Distribution

In real world applications, a random variable may have a normal distribution with values of the
mean and standard deviation that are different from 0 and 1, respectively. The first step in such a case is to convert the given normal distribution to the standard normal distribution. This procedure
is called standardizing a normal distribution. The units of the normal distribution are denoted by
x.

Converting an x value to a z value.

For a normal random variable x, a particular value of x can be converted to its corresponding z
value by the formula:

z = (x − μ) / σ
where μ and σ are the mean and standard deviation of the normal distribution of x, respectively. Thus, to find the z value for an x value, we calculate the difference between the given x value and the mean μ, and divide this by the standard deviation σ. If the value of x is equal to μ, then its z value is equal to zero. Note that we will always round z values to two decimal places.

Example

Let x be a continuous random variable that has a normal distribution with a mean of 50 and a standard deviation of 10. Convert the x value of 35 to a z value and find the probability to the left of this point.
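
For this example z = (35 − 50)/10 = −1.50, and the required probability is the area to the left of that z value (SciPy assumed for the cdf):

    from scipy.stats import norm

    mu, sigma, x = 50, 10, 35
    z = (x - mu) / sigma
    print(z)                          # -1.5
    print(round(norm.cdf(z), 4))      # P(X < 35) ≈ 0.0668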

The z value for an x value that is greater than μ is positive, the z value for an x value that is equal to μ is zero, and the z value for an x value that is less than μ is negative.

To find the area between two values of x for a normal distribution, we first convert both values of
x to their respective z values. Then we find the area under the standard normal curve between those
two z values. The area between the two z values gives the area between the corresponding x values.

Applications of the Normal Distribution

Determining the z and x values

How to find the corresponding value of z or x when an area under a normal distribution curve is
known.

Example:

1. Find the point z such that the area under the standard normal curve to the left of z is 0.9251.
Solution: To find the required value of z, we locate 0.9251 in the body of the normal distribution table. Next we read the numbers in the column and row for z that correspond to 0.9251, which are 1.4 and 0.04. Combining these two numbers, we obtain the required value of z = 1.44.
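
The reverse look-up (from an area back to z) is the inverse of the cdf, called the percent point function in SciPy (assumed available here):

    from scipy.stats import norm

    print(round(norm.ppf(0.9251), 2))   # z ≈ 1.44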

APPROXIMATIONS
Under certain conditions:

- The Binomial distribution can be used to approximate the Hypergeometric distribution.
- The Poisson distribution can be used to approximate the Binomial distribution.
- The Normal distribution can be used to approximate both the Binomial and Poisson distributions.
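
A small numerical illustration of the Poisson approximation to the Binomial, with assumed values n = 100 and p = 0.02 (so λ = np = 2); the two pmfs should be close because n is large and p is small:

    import math

    n, p = 100, 0.02                  # assumed example: large n, small p
    lam = n * p                       # matching Poisson rate, lambda = 2

    for x in range(5):
        binom = math.comb(n, x) * p ** x * (1 - p) ** (n - x)
        poisson = math.exp(-lam) * lam ** x / math.factorial(x)
        print(x, round(binom, 4), round(poisson, 4))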
