CH 1 Overview of Basic Probability Theory
1. INTRODUCTION
Statistics is one of the most important and useful tools for many scientific investigations. This
course on statistics begins with the fundamentals of probability, on the assumption that
students have taken an introductory course in statistics.
The field of statistics has two aspects (or broad sub-divisions): Descriptive statistics and
inferential (analytical) statistics.
Descriptive statistics – is concerned with the collection, processing, summarizing and
describing important features of the data without going beyond (i.e. without any attempt to
infer from the data).
Inferential (Analytical) statistics – is concerned with the process of using data obtained from
sample to make estimates or test hypotheses about the characteristics of a population. It
consists of a host of techniques that help decision makers to arrive at rational decisions under
uncertainty.
Actually, data are sought for a large group of elements (individuals, households, products,
etc). However, due to time, cost and other considerations, data are collected from only a
small portion of the group. Thus, economists, managers and other decision makers draw
conclusions, make estimates and test hypotheses about the characteristics of population from
the data for a small portion of the group. This process is referred to as statistical inference.
In descriptive statistics, we cannot generalize the results beyond the observed data
under consideration. Any question relating to the population from which the observed data
were drawn cannot be answered within the descriptive statistics framework. In order to be
able to do that we need the theoretical framework offered by probability theory. In effect,
probability theory develops a mathematical model that provides the logical foundation of
statistical inference procedures for analyzing observed data.
2. PROBABILITY THEORY
2.1 Introduction:
In our daily lives we are faced with many decision-making situations that involve
uncertainty. Perhaps you have asked yourself questions such as the following:
- What is the chance that I will score an "A" in statistics?
- What is the likelihood that my weekend picnic will be successful?
In such situations, we use the concept of probability without detailed knowledge of the
concept; in other words, we use it intuitively.
Professionally, much of statistical theory and practice rests on the concept of probability,
since conclusions concerning a population are drawn from samples and this is subject to a certain
amount of uncertainty. Besides, you may be asked one of the following:
As a business economist: What is the chance that sales or quantity demanded will increase if the
price of the product is reduced?
As a project analyst: How likely is it that the project will be completed on time?
The subject matter most useful in effectively dealing with such uncertainties is gathered
under the heading of probability. Probability can be thought of as a numerical measure of the
chance or likelihood that a particular event will occur.
2.2 Some basic concepts in probability theory
In probability theory, we are concerned with an experiment with an outcome depending on
chance, which is called a random experiment.
Experiment / Trial: An experiment or trial is an act that can be repeated under given identical
conditions.
Example: Throwing a die, tossing a coin are the examples of experiment or trial.
Outcome: An outcome is the result of an experiment.
Examples
If we throw a die once, we get 1, 2, 3, 4, 5 or 6. Individually, 1 is an outcome or
sample point, 2 is an outcome or sample point, and so on.
If we toss a coin once, we get a head or a tail. Individually, head and tail are two
outcomes.
Independent Events - Two events are said to be independent when the happening (or
non-happening) of one event does not affect the happening of the other event.
Example: In tossing a coin twice, the happening of Head or Tail in the first toss does
not affect the happening of head or tail in the second toss. The two tosses are
therefore independent.
Dependent Events - Two events are said to be dependent when the happening (or
non-happening) of one event affects the happening of another event.
Example: Suppose a person randomly selects a ball in two rounds from a box containing two
black and two white balls, without replacing the ball selected in the first round. The
probability of selecting a white (or black) ball in the second round then depends on
which colour was selected in the first round.
Mutually Exclusive Events: Events are said to be mutually exclusive if one and only one of them can
take place at a time.
Example: In tossing a coin once, the outcomes Head and Tail are mutually exclusive in
that only one of them can happen and not both at the same time.
Collectively Exhaustive Lists: When a set of events for an experiment includes every possible
outcome, the set is said to be collectively exhaustive event/list.
Example: In flipping a fair coin twice if the list or set has all the possible outcomes
(that is {HH, HT, TH, TT}), then it is said to be collectively exhaustive.
2.3 Definition of Probability
Experts disagree about the concept of probability, since there are various conceptual
approaches in defining probability. The most common are discussed below:
i. Classical Approach (or A priori Definition of Probability)
This approach is based on the assumption that each of the possible outcomes must be
mutually exclusive and equally likely. If a trial results in n exhaustive, mutually exclusive and
equally likely cases and m of them are favorable to the happening of an event E, then the
probability of event E happening, i.e. P (E), is given by:
P(E) = m / n = (number of outcomes favorable to E) / (total number of exhaustive, equally likely outcomes)
Equally likely means that each outcome of an experiment has the same chance of
happening as others have.
Example: In rolling a fair die once, the sample space is S = {1, 2, 3, 4, 5, 6}, so the probability of observing the outcome 1 is:
P(1) = 1/6
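As an illustration of the classical formula, the short Python sketch below (not part of the original handout; the event "an even number shows" is chosen only for illustration) counts favorable outcomes in the sample space of one roll of a fair die:

from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}                  # equally likely outcomes of one roll
event = {x for x in sample_space if x % 2 == 0}    # favorable outcomes: 2, 4, 6

p_event = Fraction(len(event), len(sample_space))  # m / n
print(p_event)                                     # 1/2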
Shortcoming of the Classical Approach
The classical method was originally developed in the analysis of gambling problems, where
the assumption of equally likely and mutually exclusive outcomes is often reasonable. It does
not provide ways to compute probabilities for events that are not equally likely.
ii. Relative Frequency / Statistical or Empirical Approach
According to this approach, the probability of an event is the proportion of times that this
event occurs over the long run if the experiment is repeated many times under uniform
conditions.
If a trial is repeated a number of times under essentially homogeneous and identical
conditions, then the limiting value of the ratio of the number of times the event happens to
the number of trials as the number of trials become indefinitely large is called the probability
of happening of the event. It is assumed that the limit is finite and unique.
Symbolically, if in n trials an event E happens m times, then the probability p of the
happening of event E is given by
P(E) = lim (m/n) as n → ∞
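The long-run behaviour of the ratio m/n can be illustrated with a small Python simulation (an illustrative sketch assuming a fair coin; the trial counts are arbitrary):

import random

random.seed(1)
for n in (10, 100, 10_000, 1_000_000):
    m = sum(random.random() < 0.5 for _ in range(n))   # number of heads in n tosses
    print(n, m / n)                                     # the ratio m/n settles near 0.5 as n grows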
In any trial, one of the outcomes in the sample space S must occur, and it is to this
certain event that we assign a probability of 1.
(c) Complement Rule: the complement of an event is everything in the sample space apart
from that event.
The complement of heads is tails, for example. If we write the complement of A as not-A, then
it follows that P(A) + P(not-A) = 1 and hence:
P(not-A) = 1 − P(A)
Most practical problems require the calculation of the probability of a set of outcomes or
compound events rather than just a single one, or the probability of a series of outcomes in
separate trials. These compound events (whose probabilities are called joint probabilities) are
made up of simple events compounded by the words "Or" and "And", which act as operators.
The following rules for manipulating probabilities show how to handle these operators and
thus how to calculate the probability of a compound event.
(d) Rules of Addition
This rule is associated with the operator OR, which is designated by the sign "∪". When we want
the probability of one outcome or another occurring, we add the probabilities of each.
Special rule of Addition: If two events A and B are mutually exclusive, the special rule of
addition states that the probability of one or the other events occurring equals the sum of
their probabilities. That is,
P(A or B) = P(A ∪ B) = P(A) + P(B)
For three mutually exclusive events designated, A, B and C the rule is written as,
P(A or B or C) = P(A ∪ B ∪ C) = P(A) + P(B) + P(C)
Example: If we roll a fair die once, what is the probability of a 5 or a 6?
Solution: Here there are two events, namely event A = {5} and event B = {6}. So that,
P(A or B) = P(A) + P(B) = 1/6 + 1/6 = 2/6 = 1/3
The general rule for addition: When two or more events are not mutually exclusive then we
use the general rule for addition. The rule is
P(A or B) = P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
Example: Mr. X feels that the probability that he will pass Mathematics is 2/3 and Statistics is
5/6. If the probability that he will pass both courses is 3/5, what is the probability that he
will pass at least one of the courses?
Solution: Let M and S be the events that he will pass Mathematics and Statistics
respectively. The event M ∪ S means that at least one of M and S occurs. Therefore,
P(M ∪ S) = P(M) + P(S) − P(M ∩ S)
= 2/3 + 5/6 − 3/5 = 20/30 + 25/30 − 18/30 = 27/30 = 9/10
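The arithmetic in this solution can be checked with exact fractions, as in the short Python sketch below (illustrative only; it simply restates the general rule of addition applied to the example):

from fractions import Fraction

p_m = Fraction(2, 3)    # P(M): pass Mathematics
p_s = Fraction(5, 6)    # P(S): pass Statistics
p_ms = Fraction(3, 5)   # P(M and S): pass both

p_at_least_one = p_m + p_s - p_ms   # general rule of addition
print(p_at_least_one)               # 9/10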
Joint Probability: When two events overlap, the probability is called a joint probability. In the
exercise below, the probability that Mr. Y gets A in both Calculus and Statistics is an example
of a joint probability. It is a probability that measures the likelihood that two or more events
(getting an A in both Calculus and Statistics) will happen concurrently (jointly). The individual
probabilities are referred to as marginal probabilities.
In general, the general rule of addition combines marginal and joint probabilities for
events that are not mutually exclusive.
Exercise: A student feels that the probability that he will get A in Calculus is 3/4, A in
Statistics is 4/5 and A in both courses is 3/5. What is the probability that the student will get
i. At least one A?
ii. No A's?
(e) Multiplication Rules
The multiplication rule is associated with use of the word ‘And’(∩) to combine events.
The special rule of multiplication: requires that two events A and B be Independent. Two
events are independent if the occurrence of one does not alter the probability of the other.
For two independent events, A and B, the probability that A and B will both occur (written as P(A ∩ B)) is given by:
P(A ∩ B) = P(A) × P(B)
For three independent events, A, B, and C, the probability of occurrence of all three is given by:
P(A ∩ B ∩ C) = P(A) × P(B) × P(C)
Example: In terms of the fair coin example, the probability of heads on two successive tosses
is the probability of heads on the first toss (which we shall call H1) times the probability of
heads on the second toss (H2). We have shown that the events are statistically independent,
because the probability of heads on any toss is 0.5, and P(H1 ∩ H2) = 0.5 x 0.5 = 0.25. Thus, the
probability of heads on both tosses is 0.25.
The general rule of multiplication: when two events are not independent, the probability that
both A and B occur is given by P(A ∩ B) = P(A) × P(B/A), where P(B/A) is the conditional
probability of B given that A has already occurred.
Example: Suppose a box contains ten rolls of film, three of which are defective, and two rolls
are selected one after the other without replacement. The probability that the second roll
selected is defective is:
2/9 if the first roll was defective. (Only two defective rolls remain in the box containing
nine rolls)
3/9 if the first roll selected was good. (All three defective rolls are still in the box
containing nine rolls.)
The fraction 2/9 (or 3/9) is aptly called a conditional probability because its value is
conditional on (dependent on) whether a defective or a good roll of film is chosen in the first
selection from the box. If selecting a defective roll is named event A and selecting a good roll
event B, with subscripts 1 and 2 denoting the selection rounds, the probability of
selecting a defective roll followed by another defective roll is computed as:
P(A1 ∩ A2) = P(A1) × P(A2/A1) = (3/10) × (2/9) = 6/90 = 1/15
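The conditional probabilities 2/9 and 3/9 and the joint probability 1/15 can be verified by enumerating every ordered pair of rolls, as in the Python sketch below (illustrative; it assumes the box of 3 defective and 7 good rolls described above):

from fractions import Fraction
from itertools import permutations

rolls = ['D'] * 3 + ['G'] * 7          # 3 defective, 7 good rolls of film
pairs = list(permutations(rolls, 2))   # ordered selections without replacement

both_defective = sum(1 for a, b in pairs if a == 'D' and b == 'D')
print(Fraction(both_defective, len(pairs)))   # 1/15, i.e. (3/10) x (2/9)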
Example 2: Samples of executives were surveyed about loyalty to their company. One of the
questions was, “if you were given an offer by another company equal to or slightly better
than your present position, would you remain with the company or take the other position?”
The responses of the 200 executives in the survey were cross-classified with their length of
service with the company as shown in the following table.
Length of Service
Loyalty                 Less than 1 year (B1)   1-5 years (B2)   6-10 years (B3)   More than 10 years (B4)   Total
Would remain, A1                 10                   30                5                    75               120
Would not remain, A2             25                   15               10                    30                80
Sub Total                        35                   45               15                   105               200
Now, using the table, determine the probability of randomly selecting an executive who is
loyal to the company (would remain) and who has more than 10 years of service.
Solution: Note that two events occur at the same time: the executive would remain with the
company (A1), and he or she has more than 10 years of service (B4).
I. Event A1 happens if a randomly selected executive will remain with the company
despite an equal or slightly better offer from another company. Out of the 200
executives in the survey, 120 of them would remain with the company. So, the
probability that event A1 will happen is P(A1) = 120/200 = 0.60.
II. Event B4 happens if a randomly selected executive has more than 10 years of service.
Thus, P(B4/A1) is the conditional probability that an executive who would remain with the
company despite an equal or slightly better offer from another company has more than
10 years of service. 75 of the 120 executives who would remain have more
than 10 years of service, so P(B4/A1) = 75/120 = 0.625.
III. Thus, the probability that a randomly selected executive will be one who would
remain with the company and who has more than 10 years of service is:
P(A1 ∩ B4) = P(A1) × P(B4/A1) = (120/200) × (75/120) = 75/200 = 0.375
Exercise: Using the above table, determine P(A2 ∩ B4) and P(A1 ∩ B1).
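The marginal, conditional and joint probabilities in this example can also be computed directly from the cell counts of the table, as in the Python sketch below (illustrative; the dictionary layout is just one possible way to store the table):

counts = {                      # cross-classification of the 200 executives
    ('A1', 'B1'): 10, ('A1', 'B2'): 30, ('A1', 'B3'): 5, ('A1', 'B4'): 75,
    ('A2', 'B1'): 25, ('A2', 'B2'): 15, ('A2', 'B3'): 10, ('A2', 'B4'): 30,
}
total = sum(counts.values())                                          # 200

p_a1 = sum(v for (a, _), v in counts.items() if a == 'A1') / total    # P(A1) = 0.60
p_b4_given_a1 = counts[('A1', 'B4')] / sum(
    v for (a, _), v in counts.items() if a == 'A1')                   # P(B4/A1) = 0.625
p_a1_and_b4 = counts[('A1', 'B4')] / total                            # P(A1 and B4) = 0.375

print(p_a1, p_b4_given_a1, p_a1_and_b4)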
2.5 Counting Procedures
We have learned that one of the steps in assigning probabilities to events is to list and count
the related events of interest from the sample space. However, when the number of possible
events in an experiment is large, it is tedious to count all the possibilities. For example,
it is easy to count the possible outcomes in rolling a single die, but it requires much more time
and effort to count the possible arrangements if the die is rolled, say, 10 times. However, we
can avoid the tedious task involved in counting a large number of possible arrangements by
using the rules of counting. We will start with the multiplication formula.
a. Multiplication Formula: states that if there are m ways of doing one thing and n ways of
doing another thing, there are a total of m x n ways of doing both things. It is applied to
find the number of possible arrangements for two or more groups.
Example: If a TV dealer can offer to sell 10 different brands of TVs and each brand of TV comes
with 6 different price levels, how many different TV sets and prices can the dealer offer?
Solution: Each brand of TV can be sold at 6 different prices. There are a total of 10 TV brands
and as a result the total number of TV sets the dealer can offer is given by 6 x 10 = 60.
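The same count can be obtained by listing every (brand, price) pair, as in the Python sketch below (illustrative; the brand and price labels are placeholders, not data from the example):

from itertools import product

brands = [f"brand_{i}" for i in range(1, 11)]   # 10 TV brands
prices = [f"price_{j}" for j in range(1, 7)]    # 6 price levels per brand

offers = list(product(brands, prices))          # every (brand, price) pair
print(len(offers))                              # 60 = 10 x 6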
b. Permutation: a permutation is any arrangement of r objects selected from a single group of n possible
objects. It is denoted by nPr. The permutation formula is applied to find the possible
number of arrangements when there is only one group of objects. It is given by the
formula:
nPr = n! / (n − r)!
Where n is the total number of objects and r is the number of objects selected. n! is read as n
factorial and denotes the product n(n-1)(n-2)(n-3)…(1). For example, 5! = 5*4*3*2*1 = 120
and 5!/3! = 120/6 = 20. By definition, zero factorial (0!) is equal to one.
Example 1: If there are three seats in which three persons, A, B and C, can sit, in how many
different ways can the three persons be seated, considering that the order of arrangement matters?
List: ABC, ACB, BCA, BAC, CAB, CBA. We can also determine it using the permutation formula.
There are three persons so n = 3 and because there are only 3 seats r= 3
3P3 = 3! / (3 − 3)! = 3! / 0! = 6 / 1 = 6
Note that in permutation the arrangement ABC is different from BCA.
In the previous example we selected and arranged all the people n = r. However, in many
cases only some objects are selected and arranged from n possible objects.
Example 2: Assume that there are only three seats but there are 8 people. In how many
different ways can the 8 people be arranged or seated?
8P3 = 8! / (8 − 3)! = 8! / 5! = 8 × 7 × 6 = 336
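Both permutation counts can be reproduced with Python's standard library, as sketched below (illustrative; math.perm and math.factorial are assumed to be available, i.e. Python 3.8 or later):

import math

print(math.perm(3, 3))   # 3P3 = 6 arrangements of 3 people in 3 seats
print(math.perm(8, 3))   # 8P3 = 336 ways to seat 3 of 8 people
print(math.factorial(8) // math.factorial(8 - 3))   # same count from the formula n!/(n-r)!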
Exercise: How many 4-digit numbers can be formed using the numbers 0 to 9? 1 to 9? 4 to 8?
c. Combination: if the order of the selected objects is not important, the selection is called a
combination. The formula to count the number of combinations of r objects selected from a set of n objects is:
nCr = n! / (r! (n − r)!)
For example, if Ketema, Chaltu and Tirhas are to be chosen as a committee to negotiate a
merger, there is only one possible combination of these three: the committee Ketema, Chaltu
and Tirhas is the same as the committee Tirhas, Chaltu, Ketema. The order of the
arrangement has no importance, so we use the combination formula:
3C3 = 3! / (3! (3 − 3)!) = 6 / (6 × 1) = 1
Example: The marketing department has been given the assignment of designing color codes
for the 42 different lines of compact disks sold by Google Records. Three colors are to be used
on each CD, but a combination of 3 colors used for one CD cannot be rearranged and used to
identify a different CD. That is, if green, yellow and violet were used to identify one line, then yellow,
green and violet or any other combination of these three colors cannot be used to identify
another line. Would seven colors taken three at a time be adequate to color-code the 42
lines?
Solution: There are 35 different combinations, found by
7C3 = 7! / (3! (7 − 3)!) = 7! / (3! × 4!) = 35
The seven colors taken three at a time (i.e. three colors to a line) would not be adequate to
color-code the 42 different lines because they would provide only 35 combinations.
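A quick check of the committee and color-code counts with Python's math.comb (an illustrative sketch; math.comb requires Python 3.8 or later):

import math

print(math.comb(3, 3))   # 1: only one committee of Ketema, Chaltu and Tirhas
print(math.comb(7, 3))   # 35 color codes from 7 colors taken 3 at a time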
Exercise 1: Would 8 colors taken three at a time be adequate for the above example?
Exercise 2: A family has 10 children. How many different possible combinations of boys and
girls might this family have?
2.6 Bayes' Theorem
In our discussion of conditional probability, we indicated that revising probabilities when new
information is obtained is an important phase of probability analysis. Often, we begin our
analysis with initial or prior probability estimates for specific events of interest. Then, from
sources such as a sample, a special report, or some other means, we obtain additional
information about the events. Given this new information, we update the prior probability
values by calculating revised probabilities, referred to as posterior probabilities. Bayes' theorem
provides a means for making these probability calculations. The steps in this probability
revision process are: first, prior probabilities; then new information; then application of Bayes'
theorem; and finally posterior probabilities.
Example: Consider a manufacturing firm that receives parts from two suppliers. Let A1 denote
the event that a part is from supplier 1 and A2 the event that a part is from supplier 2.
Currently, 65 percent of the parts purchased by the firm are from supplier 1 and the remaining
35 percent are from supplier 2, so the prior probabilities are P(A1) = 0.65 and P(A2) = 0.35.
The quality of the purchased parts varies with the source of supply. Historical data suggest
that the quality ratings of the two suppliers are as shown in the table below.
Historical Quality Levels of Two Suppliers
Supplier            Percentage of Good parts (G)   Percentage of Bad parts (B)
Supplier 1 (A1)                 98                              2
Supplier 2 (A2)                 95                              5
If we let G denote the event that a part is good, and B denote the event that a part is bad, the
information in the above table provides the following conditional probability values.
P (G/A1) = 0.98 P (B/A1) = 0.02
P (G/A2) = 0.95 P (B/A2) = 0.05
Based on the above information, we can compute the joint probabilities of a part being good
and coming from supplier 1, good and from supplier 2, bad and from supplier 1, and bad and
from supplier 2:
P(A1 ∩ G) = P(A1) × P(G/A1) = 0.65 × 0.98 = 0.637
P(A1 ∩ B) = P(A1) × P(B/A1) = 0.65 × 0.02 = 0.013
P(A2 ∩ G) = P(A2) × P(G/A2) = 0.35 × 0.95 = 0.3325
P(A2 ∩ B) = P(A2) × P(B/A2) = 0.35 × 0.05 = 0.0175
Suppose now that the parts from the two suppliers are used in the firm's manufacturing
process and that a machine breaks down because it attempts to process a bad part. Given
the information that the part is bad, what is the probability that it came from supplier 1, and
what is the probability that it came from supplier 2?
With the prior probabilities and the joint probabilities, Bayes' theorem can be used to
answer these questions.
Letting B denote the event that the part is bad, we are looking for the posterior probabilities
P (A1/B) and P (A2/B). From the law of conditional probability and marginal probability, we
know that:
P(A1/B) = P(A1 ∩ B) / P(B)
where
P(A1 ∩ B) = P(A1) × P(B/A1)   and   P(A2 ∩ B) = P(A2) × P(B/A2)
P(B) = P(A1 ∩ B) + P(A2 ∩ B) = P(A1) × P(B/A1) + P(A2) × P(B/A2)
Substituting the above equations, we obtain Bayes’ theorem for the case of two events.
P(A1/B) = P(A1 ∩ B) / P(B) = P(A1) × P(B/A1) / [P(A1) × P(B/A1) + P(A2) × P(B/A2)]
Similarly, we can determine P(A2/B) as follows:
P(A2/B) = P(A2 ∩ B) / P(B) = P(A2) × P(B/A2) / [P(A1) × P(B/A1) + P(A2) × P(B/A2)]
Using the above formula:
P(A1/B) = (0.65 × 0.02) / [(0.65 × 0.02) + (0.35 × 0.05)] = 0.0130 / 0.0305 = 0.4262
P(A2/B) = (0.35 × 0.05) / [(0.65 × 0.02) + (0.35 × 0.05)] = 0.0175 / 0.0305 = 0.5738
Note that in this application we started with a probability of .65 that a part selected at
random was from supplier 1. However, given information that the part is bad, the probability
that the part is from supplier 1 drops to .4262. In fact, if the part is bad, there is a better than
50-50 chance that the part came from supplier 2; that is, P (A2/B) = .5738.
Bayes’ theorem is applicable when the events for which we want to compute posterior
probabilities are mutually exclusive and their union is the entire sample space. Bayes’
theorem can be extended to the case where there are n mutually exclusive events A1, A2,…, An
whose union is the entire sample space. In such a case, Bayes' theorem for computing the
posterior probability P (Ai/B) can be written symbolically as:
P(Ai/B) = P(Ai) × P(B/Ai) / [P(A1) × P(B/A1) + P(A2) × P(B/A2) + ... + P(An) × P(B/An)]
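The general formula can be written as a short Python function, sketched below for illustration (the numeric values repeat the supplier example above; the function name is arbitrary):

def posterior(priors, likelihoods):
    """Bayes' theorem: posteriors P(Ai/B) from priors P(Ai) and likelihoods P(B/Ai)."""
    joints = [p * l for p, l in zip(priors, likelihoods)]   # P(Ai) x P(B/Ai)
    total = sum(joints)                                      # P(B), by the law of total probability
    return [j / total for j in joints]

print(posterior([0.65, 0.35], [0.02, 0.05]))   # approximately [0.4262, 0.5738]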