
DEFENCE UNIVERSITY

COLLEGE OF RESOURCE MANAGEMENT

Course note for the course


Business/Managerial Statistics (Stat 2072)

5. Course description This course is designed to introduce students to the application of statistics in managerial decision-making. In addition, this course introduces the application of inferential statistics as applied to managerial decision making: sampling theory and sampling distributions, statistical estimation, hypothesis testing, analysis of variance, the chi-square distribution, statistical forecasting (time series and regression analysis), and index numbers.

6. Course objectives The course enables students to:

Become familiar with the use and application of various statistical tools in the field of managerial decision making

Make valid inferences from data

Construct and test different types of hypotheses

Find the correlation between variables

Apply statistical tests in the preparation of research reports

Apply statistics in the various areas of business and industry, such as production, financial analysis, distribution, market research and manpower planning

7. Course content

Contact Hours Chapters to be Covered

Week 1 Chapter one: SAMPLING AND SAMPLING DISTRIBUTIONS


1.1. SAMPLING THEORY
1.1.2. Basic Definitions
1.1.3. The need for samples
1.1.4. Designing and conducting a sampling study
1.1.5. Bias and errors in sampling, non-sampling errors
1.1.6. Types of samples- random and non-random samples
1.2. SAMPLING DISTRIBUTIONS
1.2.1. Definitions
1.2.2. Sampling distributions of the mean and proportion
Sampling distribution of the difference between two means and two proportions
Week 2-4 Chapter 2 - STATISTICAL ESTIMATIONS

2.1. Basic concepts
2.2 Point estimators of the mean and proportion
2.3 Interval estimators of the mean and proportion
2.4. Interval estimation of the difference between two independent means
(concept and formula)
2.5. Student's t-distribution
2.6. Determining the sample size
Week 5-6 Chapter 3 - HYPOTHESIS TESTING
3.1. Basic concepts

3.2. Steps in Hypothesis testing

3.3. Type I and type II errors (concepts)

3.4. One-tailed vs. two-tailed hypothesis tests

3.5. Hypothesis testing of:

3.5.1. Population mean, proportion

3.5.2. The difference between two means and two proportions

Chapter 4 - CHI-SQUARE DISTRIBUTIONS
4.1. Areas of application
4.1.1 Tests for independence between two variables
4.1.2. Tests for the equality of several proportions
Week 7-9 4.1.3. Goodness-of-fit tests (binomial, normal, Poisson)
Chapter 5 - ANALYSIS OF VARIANCE
5.1. Areas of application
5.1.1. Comparison of the mean of more than two populations
5.1.2. Variance test
Week 10-13 Chapter 6 - REGRESSION AND CORRELATION
6.1 Linear correlation
6.1.1 The coefficient of correlation
6.1.2 Rank correlation coefficient
6.2. Simple linear regression
6.2.1. Curve fitting
6.2.2. The method of least squares, r²

Methods of Delivery: Lecture, exercises, group discussion, tutorial, presentation and reflection

Assessment techniques: Continuous assessment (quizzes, tests, individual and group assignments, discussion and group reflection) and final exam

References  Anderson, Statistics for Business and Economics.

 Bowen, Earl, Basic Statistics for Business and Economics.
 Hanke/Reitsch, Understanding Business Statistics.
 Hoel, Paul G. and Jessen, Raymond, Basic Statistics for Business and Economics.
 Kohler, Statistics for Business and Economics.
 Lapin, Statistics for Modern Business and Economics.
 Lind, Douglas A. and Robert D. Mason, Basic Statistics for Business and Economics.
 Neter/Wasserman, Fundamental Statistics for Business and Economics.
 Stockton and Clark, Introduction to Business and Economic Statistics.
 Van Matre/Gilbreath, Statistics for Business and Economics.

Unit One
Some Important Probability Distributions
1.1.1 Binomial Distribution
Binomial distribution is one of the simplest and most frequently used discreet probability
distribution and is very useful in many practical situations where a trial can be expressed in
exactly two mutually exclusive outcomes.

Dear Learner, to understand binomial probability and binomial distribution, you need to
know binomial experiment, binomial random variable and some associated notations; so we
cover those topics first. Then, after binomial distributions and binomial probabilities, you will
learn how to compute expected value, variance and standard deviation of a binomial random
variable. Finally, we will define the cumulative binomial probability and you will learn how to find binomial probabilities using the cumulative binomial probability table.
The Binomial Experiment
A binomial experiment (a sequence of independent Bernoulli trials) is a statistical experiment that has the following properties:
 The experiment consists of n repeated trials;
 Each trial results in one of exactly two possible outcomes, classified as a success or a failure;
 The probability of success, denoted by P, is the same on every trial;
 The trials are independent; that is, the outcome on one trial does not affect the
outcome on other trials.

Consider the following statistical experiment. You flip a coin 2 times and count the number
of times the coin lands on heads. This is a binomial experiment because:
a) the experiment consists of repeated trials. We flip a coin 2
times.
b) each trial can result in just two possible outcomes - heads or
tails.
c) the probability of success is constant - 0.5 on every trial.
d) the trials are independent; that is, getting heads on one trial
does not affect whether we get heads on other trials.

The Binomial Random Variable; X


A binomial random variable can be defined as the number of successes in n repeated trials of
a binomial experiment.
The following are some illustrative examples of a binomial random variable;
 X: The number of heads in 3 tosses of a coin
 X: The number of defective items in a randomly selected lot of items
 X: The number of boys in a randomly selected sample of 10 children
The binomial probability; b(x; n, p)
The binomial probability, denoted by b(x; n, p), is defined as b(x; n, p) = P(X = x), where

P(X = x) = \binom{n}{x} p^x q^{n-x}
 x: The number of successes that result from the binomial experiment.
 n: The number of trials in the binomial experiment.
 p: The probability of success on an individual trial.
 q : The probability of failure on an individual trial. (This is equal to 1 - p.)


 \binom{n}{x}: The number of combinations of n things, taken x at a time.
 b(x; n, p): The probability that an n-trial binomial experiment results in exactly x
successes, when the probability of success on an individual trial is p.

Example 1: Find the probability of getting 3 heads in 5 tosses of a fair coin.


Solution: Clearly, in each toss of the coin, the probability of getting a head is 1/2 (that is, p = 1/2 and hence q = 1 - p = 1 - 1/2 = 1/2). In the experiment, we toss the coin 5 times (that is, n = 5) and we are interested in getting 3 heads out of this experiment (that is, x = 3). In general, the probability of getting x successes in n trials of a binomial experiment is given by b(x; n, p) = \binom{n}{x} p^x q^{n-x}. Thus, for our particular case, the probability of getting 3 heads in 5 tosses of the fair coin is

b(3; 5, 0.5) = \binom{5}{3} (1/2)^3 (1/2)^{5-3} = 10 (1/8)(1/4) = 10/32 = 5/16 = 0.3125

Example 2: What is the probability of getting 2 fours in 5 tosses of a fair die?

Solution: Clearly, in each toss of the die, the probability of getting a four is 1/6 (that is, p = 1/6 and hence q = 1 - p = 1 - 1/6 = 5/6). In the experiment, we toss the die 5 times (that is, n = 5) and we are interested in getting 2 fours out of this experiment (that is, x = 2). Thus, the probability of getting 2 fours in 5 tosses of the fair die is

b(2; 5, 1/6) = \binom{5}{2} (1/6)^2 (5/6)^{5-2} = 10 (1/36)(125/216) = 1250/7776 ≈ 0.1608
Binomial Distribution
The probability distribution of a binomial random variable is called a binomial distribution (for a single trial, n = 1, it reduces to the Bernoulli distribution).

Suppose we flip a coin two times and count the number of heads (successes). The binomial
random variable is the number of heads, which can take on values of 0, 1, or 2. The
probability of getting no head, 1 head and 2 heads; respectively, are computed as follows.

b(0; 2, 1/2) = \binom{2}{0} (1/2)^0 (1/2)^{2-0} = 1 (1)(1/4) = 1/4 = 0.25

b(1; 2, 1/2) = \binom{2}{1} (1/2)^1 (1/2)^{2-1} = 2 (1/2)(1/2) = 2/4 = 0.5

b(2; 2, 1/2) = \binom{2}{2} (1/2)^2 (1/2)^{2-2} = 1 (1/4)(1) = 1/4 = 0.25

Now, the binomial probability distribution is presented below.

Number of heads (x) Probability (p)


0 0.25
1 0.50
2 0.25

Expected value, variance and standard deviation of a binomial random variable

Let X denote a binomial random variable. The mean (expected value), variance and standard deviation of X, denoted by E(X), Var(X) and SD(X), respectively, are defined as follows:

E(X) = np

Var(X) = npq

SD(X) = \sqrt{npq}

Example 3: Find the mean (expected value), the variance and the standard deviation for the
number of heads in the experiment of tossing a coin for 2 times.
Solution: There are two approaches to this problem: one is to construct the probability distribution for the experiment and calculate these quantities from scratch; the other is to use the simple formulas above. Now, let us see each of these two approaches one by one.
Method 1: We have already constructed the probability distribution for this experiment and
found to be the following;

Number of heads (x) Probability (p)


0 0.25
1 0.50
2 0.25

Now, E(X) = \sum x \cdot p(x) = 0(0.25) + 1(0.50) + 2(0.25) = 1 and E(X^2) = \sum x^2 \cdot p(x) = 0(0.25) + 1(0.50) + 4(0.25) = 1.5.

Thus, Var(X) = E(X^2) - [E(X)]^2 = 1.5 - 1 = 0.5.
Since SD(X) = \sqrt{Var(X)}, we have SD(X) = \sqrt{0.5} ≈ 0.71.
Method 2: Since the experiment is tossing a coin 2 times, we have n = 2, and since the coin is fair, the probability of getting a head in each toss is 0.5 (that is, p = 0.5 and q = 1 - p = 1 - 0.5 = 0.5). Thus,

E(X) = np = 2(0.5) = 1, Var(X) = npq = 2(0.5)(0.5) = 0.5, and since SD(X) = \sqrt{Var(X)}, we have SD(X) = \sqrt{0.5} ≈ 0.71.
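The following minimal Python sketch (my own check, standard library only) computes E(X), Var(X) and SD(X) for Example 3 both from the distribution itself and from the shortcut formulas np, npq and \sqrt{npq}.

from math import sqrt

dist = {0: 0.25, 1: 0.50, 2: 0.25}           # binomial distribution for n = 2, p = 0.5

mean = sum(x * p for x, p in dist.items())                 # E(X)
var = sum(x**2 * p for x, p in dist.items()) - mean**2     # E(X^2) - [E(X)]^2
print(mean, var, sqrt(var))                  # 1.0 0.5 0.7071...

n, p = 2, 0.5
print(n * p, n * p * (1 - p), sqrt(n * p * (1 - p)))       # same values via the formulas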

Cumulative Binomial Probability


A cumulative binomial probability refers to the probability that the binomial random
variable is less than or equal to a stated upper limit.

For example, we might be interested in the cumulative binomial probability of obtaining 15
or fewer heads in 20 tosses of a coin. This would be the sum of all these individual binomial
probabilities.

b(x ≤ 15; 20, 0.5) = b(x = 0; 20, 0.5) + b(x = 1; 20, 0.5) + ... + b(x = 14; 20, 0.5) + b(x = 15; 20, 0.5)

Dear Learner, have you noticed how many binomial probabilities you would have to calculate to arrive at the solution to the above example? Indeed, the number of binomial probabilities to compute is 16, which is undoubtedly tedious. Thus, it is better to use tables to find binomial probabilities, especially cumulative binomial probabilities. The cumulative binomial table is attached at the end of this module for your reference (Table-1). This table provides binomial probabilities of the form P(X ≤ x) for n values from 1 to 20 and for some selected values of p.
How to use the cumulative binomial table
A cumulative binomial distribution table shows a cumulative probability associated with a
particular x, n and p values. Table rows show the whole numbers n and x and table columns
show the p values. The cumulative probability appears in the cell of the table.

For example, a section of the cumulative binomial distribution table is presented below. To find the cumulative probability P(X ≤ 3) for n = 5 and p = 0.4, cross-reference the row of the table containing n = 5 and x = 3 with the column containing 0.40. Then, you will find 0.913.

                              p
n     x    0.05   0.10   0.20   0.30   0.40   0.50   0.60   0.70   0.80   0.90   0.95
n=1   0    0.950  0.900  0.800  0.700  0.600  0.500  0.400  0.300  0.200  0.100  0.050
      1    1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000
n=2   0    0.903  0.810  0.640  0.490  0.360  0.250  0.160  0.090  0.040  0.010  0.003
      1    0.998  0.990  0.960  0.910  0.840  0.750  0.640  0.510  0.360  0.190  0.098
      2    1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000
----  ---- -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----
n=5   0    0.774  0.590  0.328  0.168  0.078  0.031  0.010  0.002  0.000  0.000  0.000
      1    0.977  0.919  0.737  0.528  0.337  0.188  0.087  0.031  0.007  0.000  0.000
      2    0.999  0.991  0.942  0.837  0.683  0.500  0.317  0.163  0.058  0.009  0.001
      3    1.000  1.000  0.993  0.969  0.913  0.813  0.663  0.472  0.263  0.081  0.023
      4    1.000  1.000  1.000  0.998  0.990  0.969  0.922  0.832  0.672  0.410  0.226
      5    1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000
----  ---- -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----
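The entries of this table can also be reproduced with a few lines of Python. The sketch below is my own illustration (standard library only); it recomputes two entries used in the text: P(X ≤ 3) for n = 5, p = 0.4 and P(X ≤ 2) for n = 5, p = 0.3.

from math import comb

def binom_cdf(x: int, n: int, p: float) -> float:
    """P(X <= x) = sum of b(k; n, p) for k = 0..x."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

print(round(binom_cdf(3, 5, 0.4), 3))   # 0.913
print(round(binom_cdf(2, 5, 0.3), 3))   # 0.837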

Example 4: The probability that a student is accepted to DRMC is 0.3. If 5 students from the
same school apply, what is the probability that at most 2 are accepted?

Solution: The probability that a student is accepted to DRMC is 0.3 (that is, p = 0.3). In the experiment 5 students participate (that is, n = 5) and we are interested in at most 2 students being accepted by the college (that is, x ≤ 2). Thus, the probability that at most 2 of the 5 students who apply are accepted to the college, denoted by P(X ≤ 2), can be read from the cumulative binomial probability table across the row n = 5, x = 2 and down the column p = 0.3, and is found to be 0.837.

Example 5: What is the probability of obtaining 8 or fewer heads in 10 tosses of a coin?

Solution: Clearly, in each toss of the coin, the probability of getting a head is 1/2 (that is, p = 1/2). In the experiment, we toss the coin 10 times (that is, n = 10) and we are interested in getting 8 or fewer heads out of this experiment (that is, x ≤ 8). Thus, the probability of obtaining 8 or fewer heads in 10 tosses of a coin, denoted by P(X ≤ 8), can be read from the cumulative binomial probability table across the row n = 10, x = 8 and down the column p = 0.5, and is found to be 0.989.

Finding binomial probabilities using the cumulative binomial probability table


In practice we often encounter the following 9 forms of probabilities. However, the cumulative binomial probability table provides values for only one form of probability, namely probabilities of the form P(X ≤ x). Now the question is, how can we make use of the cumulative binomial distribution table to find other forms of probabilities such as P(X < x), P(X ≥ x), P(X = x), etc.? The following conversion formulas show how any form of probability can be converted to the form P(X ≤ x) so that you can use the cumulative binomial distribution table (a short numeric sketch follows the list).
1) P(X ≤ x) … can be read directly from the cumulative binomial probability table.
2) P(X < x) = P(X ≤ x - 1)
3) P(X ≥ x) = 1 - P(X ≤ x - 1)
4) P(X > x) = 1 - P(X ≤ x)
5) P(X = x) = P(X ≤ x) - P(X ≤ x - 1)
6) P(a ≤ X ≤ b) = P(X ≤ b) - P(X ≤ a - 1)
7) P(a < X ≤ b) = P(X ≤ b) - P(X ≤ a)
8) P(a ≤ X < b) = P(X ≤ b - 1) - P(X ≤ a - 1)
9) P(a < X < b) = P(X ≤ b - 1) - P(X ≤ a)
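The following Python sketch (my own illustration, standard library only, with a helper name of my choosing) shows the conversion idea in action for n = 6 and p = 0.7: every other form of binomial probability is rewritten in terms of the cumulative form P(X ≤ x) that the table provides.

from math import comb

def cdf(x, n, p):                       # P(X <= x), as in the table
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

n, p = 6, 0.7
print(cdf(3, n, p))                     # rule 1: P(X <= 3), read directly (about 0.2557)
print(cdf(1, n, p))                     # rule 2: P(X < 2)  = P(X <= 1)      (about 0.0109)
print(1 - cdf(1, n, p))                 # rule 3: P(X >= 2) = 1 - P(X <= 1)  (about 0.9891)
print(cdf(4, n, p) - cdf(1, n, p))      # P(2 <= X <= 4) = P(X <= 4) - P(X <= 1)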
Example 6: For 6 trials of a binomial experiment if probability of success in each trial is 0.7,
find:
a) P(X ≤ 3) b) P(X < 2) c) P(X ≥ x), for a stated value of x
Solution: In this example we are given that n = 6 and p = 0.7.
a) P(X ≤ 3).
This value can be read from the cumulative binomial probability table across the row n = 6, x = 3 and down the column p = 0.7, and we find it to be 0.256. Therefore, P(X ≤ 3) = 0.256.
b) P(X < 2)
By rule 2, P(X < 2) = P(X ≤ 1).
This value can be read from the cumulative binomial probability table across the row n = 6, x = 1 and down the column p = 0.7, and we find it to be 0.011. Therefore, P(X < 2) = 0.011.
c) P(X ≥ x)
By rule 3, P(X ≥ x) = 1 - P(X ≤ x - 1), which can again be read from the table.

Exercise

1) Find the mean (expected value), the variance and the standard deviation for the
number of heads in the experiment of tossing a coin for 4 times.

E[X] = _________, Var(X) = _________ and SD(X) = _________


2) For 6 trials of a binomial experiment if probability of success in each trial is 0.7, then:

a) = _____ b) c)

d) e) f)

3) Consider the experiment of tossing a coin for 20 times. Then, the probability of
getting:
a) at most 10 heads is _____
b) at least 12 heads is _____
c) more than 8 heads is _____
d) less than 13 heads is _____
e) at least 7 and at most 14 heads is _____
1.1.2 Poisson Distribution
Poisson distribution is another theoretical distribution which is useful for modeling certain
real situations. It differs from the binomial distribution in the sense that in the binomial
distribution we must be able to count the number of successes and the number of failures,
while in the Poisson distribution all we need to know is the average number of successes in a given unit of time or space.
In many situations, it is not possible to count the number of failures even though we can
know the number of successes. For instance, in the case of patients coming to a hospital for
emergency treatment, we can always count the number of patients arriving in a given hour. If
the number of patients arriving is considered as the number of successes, then we can not
know the number of failures, because it is not possible to count the number of patients not
coming for emergency treatment in that hour. Accordingly, it is not possible to determine the
total number of possible outcomes (successes and failures) and hence binomial distribution
cannot be applied as a decision making tool. In such a situation, we can use Poisson
distribution if we know the average number of patients arriving for emergency treatment per
hour. Other examples of Poisson distribution are:
 Telephone calls going through a switch board system,
 Number of cars passing in AAU main gate,
 Number of customers coming to Commercial Bank of Ethiopia (CBE) for service, etc.
All these arrivals can be described by a discrete random variable that takes on integer values
(0, 1, 2, 3, . . .)

Dear Learner! To understand Poisson probability and Poisson distribution, you need to
know Poisson experiment, Poisson random variable and some associated notations; so we
cover those topics first. Then, after Poisson distributions and Poisson probabilities, you will learn how to compute the expected value, variance and standard deviation of a Poisson random variable. Finally, we will define the cumulative Poisson probability and you will learn how to find Poisson probabilities using the cumulative Poisson probability table.
Poisson experiment

A Poisson experiment is a statistical experiment that has the following properties:
 The experiment results in outcomes that can be classified as successes or failures.
 The average number of successes (λ) that occur in a specified interval of time (Δt) is constant.
 The average number of successes (λ) that occur in a specified interval of time (Δt) is directly proportional to the size of the interval (λ ∝ Δt).
 The probability that a success will occur in an extremely small interval of time is virtually zero.

Poisson random variable


A Poisson random variable is the number of successes that result from a Poisson
experiment.
The following are some illustrative examples of a Poisson random variable;
 X: The number of cars entering the main gate of DRMC in every working day.
 X: The number of car accidents at Meskel square per day.
 X: The number of homes sold by Sunshine Real Estate Company per day.
 X: The number of persons who enter Commercial Bank of Ethiopia (CBE) Sillassie
Branch for service every 10 min.
The Poisson probability; p(x; λ)

The Poisson probability, denoted by p(x; λ), is defined as

P(X = x; λ) = \frac{e^{-λ} λ^x}{x!}

where
 x: The number of successes that result from the Poisson experiment.
 λ: The average number of successes within a given interval of time.
 e: the base of the natural logarithm, approximately equal to 2.71828.

Example 7: Customers arrive at a restaurant at an average rate of 6 every 15 min. The


number of arrivals is distributed accordingly to a Poisson distribution. What is the probability
that:
a) There will be no arrivals during any period of 15 min.
b) There will be exactly one arrival during this time period.
c) There will be more than 2 arrivals during this time period
Solution: Let X be the number of customers who arrive at the restaurant in every 15 min. Then

p(x) = \frac{e^{-λ} λ^x}{x!}, where λ = 6

a) p(0) = \frac{e^{-6} 6^0}{0!} = e^{-6} ≈ 0.0025

b) p(1) = \frac{e^{-6} 6^1}{1!} = 6 e^{-6} ≈ 0.0149

c) p(X > 2) = 1 - p(X ≤ 2)
   = 1 - [p(0) + p(1) + p(2)]
   = 1 - [e^{-6} + 6 e^{-6} + 18 e^{-6}]
   = 1 - 25 e^{-6}
   ≈ 0.938
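A minimal Python check of Example 7 (my own sketch, standard library only) follows.

from math import exp, factorial

def poisson_pmf(x: int, lam: float) -> float:
    """p(x; lambda) = e^(-lambda) * lambda^x / x!"""
    return exp(-lam) * lam**x / factorial(x)

lam = 6                                                  # 6 arrivals per 15 minutes
print(poisson_pmf(0, lam))                               # (a) about 0.0025
print(poisson_pmf(1, lam))                               # (b) about 0.0149
print(1 - sum(poisson_pmf(k, lam) for k in range(3)))    # (c) P(X > 2), about 0.938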
Poisson Distribution
The probability distribution of a Poisson random variable is called a Poisson distribution. To
illustrate this, let us consider the random variable
X: The number of cars entering the main gate of DU in every working day.
Assume that the average number of cars entering the main gate of DU in every working day
is 20 and the process follows the Poisson distribution.
Clearly, the random variable can take values 0, 1, 2, 3, . . . with probabilities

p(0) = \frac{e^{-20} 20^0}{0!} = e^{-20}

p(1) = \frac{e^{-20} 20^1}{1!} = 20 e^{-20}

p(2) = \frac{e^{-20} 20^2}{2!} = 200 e^{-20}

etc.

Therefore, the Poisson distribution of this random variable can be presented as follows:

X               0         1           2            …
p(X = x; λ)     e^{-20}   20 e^{-20}  200 e^{-20}   …

Expected value, variance and standard deviation of a Poisson random variable


Let X denote a Poisson random variable. The mean (expected value), variance and standard deviation of X, denoted by E(X), Var(X) and SD(X), respectively, are defined as follows:

E(X) = λ

Var(X) = λ

SD(X) = \sqrt{λ}

Example 8: Assume that on average 4 persons enter Commercial Bank of Ethiopia (CBE)
Sillassie Branch for service every 10 min. If the process follows Poisson distribution, find the
expected value, the variance and standard deviation of the distribution.
Solution: λ = 4

Therefore, E(X) = λ = 4, Var(X) = λ = 4, SD(X) = \sqrt{λ} = \sqrt{4} = 2.
Dear learner, the Poisson distribution is particularly useful in waiting-line (queuing) problems: knowing the rate at which units arrive at a service station and the rate at which they are served, the required number of service stations as well as the average waiting time for each arriving unit can be reasonably determined.
Cumulative Poisson Probability
A cumulative Poisson probability refers to the probability that the Poisson random variable
is less than or equal to a stated upper limit.
For example, we might be interested in a cumulative Poisson probability such as p(X ≤ 30; λ = 40). Straightforward calculation shows that

p(X ≤ 30; 40) = p(X = 0; 40) + p(X = 1; 40) + ... + p(X = 29; 40) + p(X = 30; 40)

Dear learner, have you noticed how many probabilities you would have to calculate to arrive at the solution to the above example? Indeed, the number of Poisson probabilities to compute is 31, which is undoubtedly tedious. Thus, it is better to use tables to find Poisson probabilities, especially cumulative Poisson probabilities. The cumulative Poisson table is attached at the end of this module for your reference (Table-2). This table provides values for Poisson probabilities of the form P(X ≤ x) for some selected values of x and λ.

How to use the cumulative Poisson table


A cumulative Poisson distribution table shows the cumulative probability associated with particular λ and x values. Table rows show the values of x and table columns show the values of λ. The cumulative probability appears in the cell of the table.

For example, a section of the cumulative Poisson distribution table is presented below. To find the cumulative probability P(X ≤ 3) for λ = 4, cross-reference the row of the table containing x = 3 with the column containing λ = 4.0. Then, you will find 0.4335.

λ= 0.5 1 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0
x= 0 0.6065 0.3679 0.2231 0.1353 0.0821 0.0498 0.0302 0.0183 0.0111 0.0067
1 0.9098 0.7358 0.5578 0.4060 0.2873 0.1991 0.1359 0.0916 0.0611 0.0404
2 0.9856 0.9197 0.8088 0.6767 0.5438 0.4232 0.3208 0.2381 0.1736 0.1247
3 0.9982 0.9810 0.9344 0.8571 0.7576 0.6472 0.5366 0.4335 0.3423 0.2650
----- ----- ----- ----- ----- ----- ----- ----- ----- ----- -----
λ= 5.5 6.0 6.5 7.0 7.5 8.0 8.5 9.0 9.5 10.0
x = 0 0.0041 0.0025 0.0015 0.0009 0.0006 0.0003 0.0002 0.0001 0.0001 0.0000
1 0.0266 0.0174 0.0113 0.0073 0.0047 0.0030 0.0019 0.0012 0.0008 0.0005
2 0.0884 0.0620 0.0430 0.0296 0.0203 0.0138 0.0093 0.0062 0.0042 0.0028
3 0.2017 0.1512 0.1118 0.0818 0.0591 0.0424 0.0301 0.0212 0.0149 0.0103
----- ----- ----- ----- ----- ----- ----- ----- ----- ----- -----
λ= 10.5 11.0 11.5 12.0 12.5 13.0 13.5 14.0 14.5 15.0
x = 0 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
1 0.0003 0.0002 0.0001 0.0001 0.0001 0.0000 0.0000 0.0000 0.0000 0.0000
2 0.0018 0.0012 0.0008 0.0005 0.0003 0.0002 0.0001 0.0001 0.0001 0.0000
3 0.0071 0.0049 0.0034 0.0023 0.0016 0.0011 0.0007 0.0005 0.0003 0.0002
----- ----- ----- ----- ----- ----- ----- ----- ----- ----- -----

Example 9: Suppose the average number of lions sighted on a 1-day safari is 12. What is the probability that tourists will sight at most ten lions on the next 1-day safari?

Solution: The average number of lions sighted on a 1-day safari is 12 (that is, λ = 12). The probability that tourists will sight at most ten lions on the next 1-day safari, denoted by P(X ≤ 10), can be read from the cumulative Poisson probability table across the row x = 10 and down the column λ = 12, and is found to be 0.3472.

Finding Poisson probabilities using the cumulative Poisson probability table


In practice, we often encounter the following 9 forms of probabilities. However, the cumulative Poisson probability table provides values for only one form of probability, namely probabilities of the form P(X ≤ x). Now the question is, how can we make use of the cumulative Poisson distribution table to find other forms of probabilities such as P(X < x), P(X ≥ x), P(X = x), etc.? The following conversion formulas show how any form of probability can be converted to the form P(X ≤ x) so that we can use the cumulative Poisson distribution table.
1) P(X ≤ x) … can be read directly from the cumulative Poisson probability table.
2) P(X < x) = P(X ≤ x - 1)
3) P(X ≥ x) = 1 - P(X ≤ x - 1)
4) P(X > x) = 1 - P(X ≤ x)
5) P(X = x) = P(X ≤ x) - P(X ≤ x - 1)
6) P(a ≤ X ≤ b) = P(X ≤ b) - P(X ≤ a - 1)
7) P(a < X ≤ b) = P(X ≤ b) - P(X ≤ a)
8) P(a ≤ X < b) = P(X ≤ b - 1) - P(X ≤ a - 1)
9) P(a < X < b) = P(X ≤ b - 1) - P(X ≤ a)

Example 10: On average, schools in Addis close 10 days each year, due to various reasons,
like public holidays, meetings, etc. What is the probability that schools in Addis will close at
least for 8 days next year?

Solution: Here we are given that λ = 10.

Let X be a random variable representing the number of days in a year on which schools in Addis are going to be closed. Now, we are asked to find P(X ≥ 8). Here we assume that X follows a Poisson distribution.

By rule 3, P(X ≥ 8) = 1 - P(X ≤ 7).
Therefore, P(X ≥ 8) = 1 - 0.2202 = 0.7798, reading P(X ≤ 7) for λ = 10 from the cumulative Poisson table.
Example 11: A typist makes, on average, 2 typing errors every 5 pages. What is the probability that the typist will make fewer than 8 errors on the next fifteen pages?
Solution:
Let X be the number of typing errors that the typist makes in fifteen pages.
Since the average number of typing errors that the typist makes in every 5 pages is 2, the average number of typing errors in every 15 pages will be 3 × 2 = 6. Therefore, λ = 6.

Now, we are asked to find P(X < 8). Here we assume that X follows a Poisson distribution.

By rule 2, P(X < 8) = P(X ≤ 7).
Therefore, P(X < 8) = 0.7440, reading P(X ≤ 7) for λ = 6 from the cumulative Poisson table.
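The Python sketch below (my own check, standard library only, with a helper name of my choosing) recomputes Examples 10 and 11 using the same conversion rules.

from math import exp, factorial

def poisson_cdf(x: int, lam: float) -> float:
    """P(X <= x) for a Poisson random variable with mean lambda."""
    return sum(exp(-lam) * lam**k / factorial(k) for k in range(x + 1))

print(1 - poisson_cdf(7, 10))   # Example 10: P(X >= 8), lambda = 10 -> about 0.7798
print(poisson_cdf(7, 6))        # Example 11: P(X < 8) = P(X <= 7), lambda = 6 -> about 0.7440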
Poisson Approximation to Binomial
Dear learner, in this section we shall give an approximation to the probability of k successes in n Bernoulli trials with probability of success p, i.e. b(k; n, p), for large n and small p such that np = λ is moderate. There is no hard and fast rule that determines the values of n and p for which we can apply the Poisson approximation to the binomial. However, as a rule of thumb, λ = np should not exceed 5.
For large n and small p, where np = λ is of moderate magnitude, the binomial probability b(k; n, p) can be approximated by the formula

b(k; n, p) ≈ \frac{e^{-λ} λ^k}{k!}
Example 12: There are 1000 workers in a factory. The probability that any worker will be
absent on a pre-assigned day is 0.001. What is the probability that on a certain day exactly 2
of the workers will be absent?
Solution:

Clearly, λ = np = 1000 × 0.001 = 1, which is moderate.

Therefore, b(k; n, p) ≈ \frac{e^{-λ} λ^k}{k!}, so

b(2; 1000, 0.001) ≈ \frac{e^{-1} 1^2}{2!} = \frac{1}{2e} ≈ 0.18394

If you use the binomial directly,

b(k; n, p) = \binom{n}{k} p^k q^{n-k} = \binom{1000}{2} (0.001)^2 (0.999)^{998} ≈ 0.184032
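A quick Python comparison (my own sketch, standard library only) of the Poisson approximation in Example 12 with the exact binomial value:

from math import comb, exp, factorial

n, p, k = 1000, 0.001, 2
lam = n * p                                              # lambda = np = 1

poisson_approx = exp(-lam) * lam**k / factorial(k)
exact_binomial = comb(n, k) * p**k * (1 - p)**(n - k)

print(poisson_approx)    # about 0.18394
print(exact_binomial)    # about 0.18403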

Exercise
Solve the following problems.

1) The average number of homes sold by the Sunshine Real Estate Company is 2 homes
per day. What is the probability that exactly 3 homes will be sold tomorrow?
2) Assume that on average 12 foreigners visit the National Museum within any 1-hour time interval from 8:30 A.M. to 5:30 P.M. during working days (Monday-Friday). What is the
probability that at least 6 and at most 15 foreigners visit the museum between 1:30
P.M and 2:30 P.M in the coming Wednesday?

1.1.3 Normal Distribution

Dear Learner! The binomial and Poisson distributions are both discrete probability distributions, whereas the normal probability distribution is a continuous distribution. The normal probability distribution plays a very important and pivotal role in statistical theory and practice, particularly in the areas of statistical inference and statistical quality control. Its importance is also due to the fact that, in practice, experimental results very often seem to follow the normal distribution, or bell-shaped curve.
The normal curve, which is the curve of the normal distribution, is symmetrical and defined by its mean (μ) and standard deviation (σ). The normal distribution has equal mean, mode and median.
Characteristics of normal curve
 All normal curves are symmetrical about the mean. That is, mean = median
 The height of the normal curve is at its maximum at the mean value. That is, mean =
mode
 The height of the curve declines as we go in either direction from the mean but never touches the base, so that the tails of the curve on both sides of the mean extend indefinitely.
 The first and the third quartiles are equidistant from the mean.
 The height of the normal curve Y at any value of the continuous random variable x is given by the equation

Y(x) = \frac{1}{σ \sqrt{2π}} e^{-\frac{1}{2} \left( \frac{x - μ}{σ} \right)^2}

where X is a normal random variable, μ is the mean, σ is the standard deviation, π is approximately 3.14159, and e is approximately 2.71828.

The normal equation is the probability density function for the normal distribution.

The Normal Curve

As we can observe from the normal equation, the graph of the normal distribution depends on
two factors - the mean and the standard deviation of the distribution. The mean of the
distribution determines the location of the center of the graph, and the standard deviation
determines the height and width of the graph. When the standard deviation is large, the
curve is short and wide; when the standard deviation is small, the curve is tall and narrow. All
normal distributions look like a symmetric, bell-shaped curve, as shown below.

The curve on the left is shorter and wider than the curve on the right, because the curve on the
left has a bigger standard deviation.

Probability and the Normal Curve

The normal distribution is a continuous probability distribution. This has several implications
for probability.

 The total area under the normal curve is equal to 1.


 The probability that a normal random variable X equals any particular value is 0.
 The probability that X is greater than a equals the area under the normal curve
bounded by a and plus infinity (as indicated by the non-shaded area in the figure
below).
 The probability that X is less than a equals the area under the normal curve bounded
by a and minus infinity (as indicated by the shaded area in the figure below).

Additionally, every normal curve (regardless of its mean or standard deviation) conforms to
the following "rule".

 About 68.26% of the area under the curve falls within 1 standard deviation of the
mean.
 About 95.44% of the area under the curve falls within 2 standard deviations of the
mean.
 About 99.74% of the area under the curve falls within 3 standard deviations of the
mean.

Collectively, these points are known as the empirical rule (the 68-95-99.7 rule).
Clearly, given a normal distribution, most outcomes will be within 3 standard deviations of
the mean.
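These areas can be checked numerically. The short Python sketch below is my own illustration (standard library only); it uses the standard normal cumulative distribution function Φ(z) = (1 + erf(z / √2)) / 2.

from math import erf, sqrt

def phi(z: float) -> float:
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1 + erf(z / sqrt(2)))

for k in (1, 2, 3):
    area = phi(k) - phi(-k)              # area within k standard deviations of the mean
    print(k, round(100 * area, 2))       # roughly 68%, 95% and 99.7%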

Standard Normal Distribution


The normal curve depends on two parameters: the mean (μ) and the standard deviation (σ) of the distribution. Each pair (μ, σ) determines a unique curve, so any change in the pair results in a different normal curve. This means that for each pair (μ, σ) we would need a separate normal table, and hence as many tables as there are different combinations of μ and σ. Clearly, there are infinitely many such combinations, and constructing infinitely many tables is impossible and even inconceivable. Fortunately, a mechanism was developed by which all normal distributions can be converted to a single distribution called the standard normal distribution (sometimes called the Z-distribution).
The normal curve has the following interesting properties.
 The two halves about the mean under the curve are exactly similar
 The area under the normal curve between the mean and any point x along the x-axis
can be identified by simply knowing the (directional) distance of x from the mean not
in units of the data but in terms of the number of standard deviations that the point x is
away from the mean and the standard deviation.
Now, let x be at a distance of n standard deviations from the mean μ. Then |x - μ| = nσ, n ∈ ℝ⁺; that is, x - μ = ±nσ.

If we let z = ±n, we get x - μ = zσ; that is,

z = \frac{x - μ}{σ}

The computed value z is also known as the z-score or standardized normal deviate. It can be shown that z follows a normal distribution with mean zero and standard deviation one. This distribution is called the standard normal distribution. This allows us to use only one table of areas for all types of normal distributions.
Example 13: The average annual income of government employees in Addis is Birr 10,000
with standard deviation of Birr 1,000. Compute the Z-value for an income of;
a) Birr 9,000 and interpret the result
b) Birr 12,100 and interpret the result
Solution:
Here we are given that μ = 10,000 and σ = 1,000.

a) X = 9,000

z = (X - μ)/σ = (9,000 - 10,000)/1,000 = -1

This can be interpreted as: an annual income of Birr 9,000 is 1 standard deviation below the average annual income.

b) X = 12,100

z = (X - μ)/σ = (12,100 - 10,000)/1,000 = 2.1

This can be interpreted as: an annual income of Birr 12,100 is 2.1 standard deviations above the average annual income.

How to use the Standard Normal Distribution Table


A standard normal distribution table shows a cumulative probability associated with a
particular z-score. Table rows show the whole number and tenths place of the z-score. Table
columns show the hundredths place. The cumulative probability (often from minus infinity to
the z-score) appears in the cell of the table.

For example, a section of the standard normal table is presented below. To find the
cumulative probability of a z-score equal to -1.31, cross-reference the row of the table
containing -1.3 with the column containing 0.01. The table shows that the probability that a
standard normal random variable will be less than -1.31 is 0.0951; that is, P(Z < -1.31) =
0.0951.

z      0.00   0.01   0.02   0.03   0.04   0.05   0.06   0.07   0.08   0.09

-3.4   0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0002
-3.3   0.0005 0.0005 0.0005 0.0004 0.0004 0.0004 0.0004 0.0004 0.0004 0.0003
...    ...    ...    ...    ...    ...    ...    ...    ...    ...    ...
-1.4   0.0808 0.0793 0.0778 0.0764 0.0749 0.0735 0.0721 0.0708 0.0694 0.0681
-1.3   0.0968 0.0951 0.0934 0.0918 0.0901 0.0885 0.0869 0.0853 0.0838 0.0823
-1.2   0.1151 0.1131 0.1112 0.1093 0.1075 0.1056 0.1038 0.1020 0.1003 0.0985
...    ...    ...    ...    ...    ...    ...    ...    ...    ...    ...
-0.2   0.4207 0.4168 0.4129 0.4090 0.4052 0.4013 0.3974 0.3936 0.3897 0.3859
-0.1   0.4602 0.4562 0.4522 0.4483 0.4443 0.4404 0.4364 0.4325 0.4286 0.4247
 0.0   0.5000 0.5040 0.5080 0.5120 0.5160 0.5199 0.5239 0.5279 0.5319 0.5359
 0.1   0.5398 0.5438 0.5478 0.5517 0.5557 0.5596 0.5636 0.5675 0.5714 0.5753
 0.2   0.5793 0.5832 0.5871 0.5910 0.5948 0.5987 0.6026 0.6064 0.6103 0.6141
...    ...    ...    ...    ...    ...    ...    ...    ...    ...    ...
 3.3   0.9995 0.9995 0.9995 0.9996 0.9996 0.9996 0.9996 0.9996 0.9996 0.9997
 3.4   0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9998
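The table lookup can also be reproduced in Python. The sketch below is my own check (standard library only) of the value P(Z < -1.31) = 0.0951 quoted above.

from math import erf, sqrt

def std_normal_cdf(z: float) -> float:
    """P(Z < z) for a standard normal random variable."""
    return 0.5 * (1 + erf(z / sqrt(2)))

print(round(std_normal_cdf(-1.31), 4))   # 0.0951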

Example 14: Let X be a normal random variable with mean 25 and standard deviation 4.
Find,
b)
Solution:- Here we have given that
a)

Therefore,

Therefore,
Finding normal probabilities using the standard cumulative normal probability table
To use the standard normal probability table, which provides values for probabilities of the form P(Z < z), we use the following conversion formulas and then convert the normal probabilities to standard normal probabilities as we did before.
 P(X ≤ x) = P(X < x) …………… already shown
 P(X > x) = P(X ≥ x) = 1 - P(X < x)

(The point probability of a continuous probability distribution in general, and of a normal distribution in particular, is always zero.)
Example 15: Let X be a normal random variable with mean 80 and standard deviation 5.
Find,

Solution:-Here we have given that


a)

Therefore,

b)

Therefore,

Normal Approximation to Binomial

The Normal distribution can be used to approximate binomial probabilities when n is large
and p is close to 0.5. In answer to the question "How large is large?", or "How close is
close?", a rule of thumb is that the approximation should only be used when both np>5 and
nq>5.

Let X be a binomial random variable resulting from n trials of a binomial experiment, and let Y be a normal random variable with mean μ = np and standard deviation σ = \sqrt{npq}. Under the assumption np > 5 and nq > 5, so that E(Y) = E(X) and SD(Y) = SD(X), the cumulative binomial probability P(X ≤ x) can be approximated by the cumulative normal probability P(Y ≤ x + 0.5), where 0.5 is a correction factor that results from approximating the discrete binomial probability distribution by the continuous normal probability distribution.

Example 16: Use normal approximation to find the probability of getting 4 heads in 16
tosses of a fair coin.
Solution: n = 16, x = 4, p = 1/2 and q = 1/2.

Now, μ = E(X) = np = 16(1/2) = 8 and σ = \sqrt{npq} = \sqrt{16(1/2)(1/2)} = \sqrt{4} = 2.

Thus,

P(X = 4) = P(X ≤ 4) - P(X ≤ 3)
         ≈ P(Y ≤ 4.5) - P(Y ≤ 3.5)

Now, let's find the z-scores corresponding to y₁ = 4.5 and y₂ = 3.5:

z₁ = (4.5 - 8)/2 = -1.75 and z₂ = (3.5 - 8)/2 = -2.25

Therefore,

P(X = 4) ≈ P(Y ≤ 4.5) - P(Y ≤ 3.5)
         = P(Z ≤ -1.75) - P(Z ≤ -2.25)
         = 0.0401 - 0.0122
         = 0.0279

Using the binomial probability formula directly, with n = 16, x = 4 and p = q = 1/2:

b(4; 16, 0.5) = \binom{16}{4} (1/2)^{16} = 1820/65536 ≈ 0.0278, which is very close to the approximation.
Exercise
Solve the following problems.

1) An average light bulb manufactured by the Day Light Engineering Company lasts 300
days with a standard deviation of 50 days. Assuming that bulb life is normally
distributed, what is the probability that a randomly chosen bulb of the company will
last at most 365 days?
2) Let X be a normal random variable with mean 50 and standard deviation 10. Find
p( 48< X <55 )

3) Tolessa earned a score of 940 on a national achievement test. The mean test score was
850 with a standard deviation of 100. What proportion of students had a higher score
than Tolessa? (Assume that test scores are normally distributed.)
4) Suppose scores on an IQ test are normally distributed. If the test has a mean of 100
and a standard deviation of 10, what is the probability that a person who takes the test
will score between 90 and 110?

Unit Two
Sampling Distribution

Introduction
The main purpose of selecting random samples is to elicit information about the unknown
population parameters. Suppose we wish to arrive at a conclusion concerning the proportion
of people in Ethiopia who prefer ‘Yirga Chefie’ coffee. It would be impossible to question
every Ethiopian and compute the parameter representing the true proportion. Instead, a large
random sample is selected and the proportion of this sample favoring ‘Yirga Chefie’ coffee is
calculated. This value is now used to make some inference concerning the true proportion.
In this unit, we will introduce the concept of sampling by examining how sample means and proportions from a population tend to be distributed, providing the fundamental knowledge needed for studying inferential statistics.
After completing this unit the students will be able to:
 define sampling distribution of a statistic
 construct the sampling distribution of sample means and proportions for all samples of a
given size selected from a certain population.
 state the Central Limit Theorem and apply whenever necessary.
 explain the concept of degree of freedom.

 list down the basic properties of the standard normal distribution, T-distribution and
chi-square distribution
2.1 The Concept of Sampling Distribution
In this section we shall define the word statistic in terms of the concept of random variables.
Then, you will be introduced to the concept of sampling distribution and standard error of the
sampling distribution of a given statistic.
Since many random samples are possible from the same population, we would expect the statistic to vary somewhat from sample to sample. Hence, a statistic is a random variable. To make this more concrete, consider a population of 5 integers, say {1, 3, 6, 9, 10}. If we take the sample {1, 9}, the sample mean is X̄ = (1 + 9)/2 = 10/2 = 5; whereas if we take the sample {6, 10}, the sample mean is X̄ = (6 + 10)/2 = 16/2 = 8. Thus, the sample statistic, in our case the sample mean X̄, is a random variable whose value depends on the sample we take. Therefore we can define a statistic in terms of the concept of a random variable as follows.
A statistic is a random variable that depends only on the observed random sample.

The following table provides you with some of the commonly used statistics, their symbols,
the specific values resulted from these statistics and the corresponding parameters.

Statistic Symbol Specific value Parameter

Mean X̄ x̄ μ

Proportion P p P

Variance S2 s2 σ2

Hypothetically, to use the sample statistic to estimate the population parameter one should
examine every possible sample that could occur. If this selection of all possible samples were
actually done, the distribution of the results would be referred to as a sampling distribution.
Now, we define sampling distribution as follows.
Sampling distribution is the probability distribution of a statistic of all samples of a given
size

The distribution of a statistic depends on the size of the population (N), the size of the sample
(n) and the method of choosing the random sample (whether it is sampling with replacement
or without replacement).
If the population size (N) is large or infinite, the statistic has the same distribution whether we sample with or without replacement. On the other hand, sampling with replacement from a small finite population gives a slightly different distribution for the statistic than sampling without replacement. Sampling with replacement from a finite population is equivalent to sampling from an infinite population, since there is no limit on the possible size of the sample selected. Thus, for a small finite population, it is advisable to use sampling with replacement to construct the sampling distribution.
Now let’s define a very important concept, standard error, which we usually encounter in any
sampling distribution
Standard error is the standard deviation of the sampling distribution of a statistic

2.2 Sampling distribution of the sample mean ( X̄ )


In the process of sampling, we may sample either from a normally distributed population or
from a population that is not normally distributed. Thus, in this section, you will be
introduced to the effect of the distribution of a given population on the sampling distribution
of a given statistic.
Sampling distribution of the mean is the probability distribution of the sample mean of all
samples of a given size.

To illustrate the concept, consider a population of five integers, say { 1,2,3,4 ,5 } . Now, let’s try
to construct the means of all samples of size 2. Clearly, from a population of size N, we can
draw Nn samples each of size n using sampling with replacement. In our example, N = 5 and
n 2
n = 2. Thus, from the given population we can draw N =5 =25 samples each of size 2
using sampling with replacement.
Now, the different samples and their corresponding means can be tabulated as follows.

Sample  X̄    Sample  X̄    Sample  X̄    Sample  X̄    Sample  X̄

1,1 1 2,1 1.5 3,1 2 4,1 2.5 5,1 3


1,2 1.5 2,2 2 3,2 2.5 4,2 3 5,2 3.5
1,3 2 2,3 2.5 3,3 3 4,3 3.5 5,3 4
1,4 2.5 2,4 3 3,4 3.5 4,4 4 5,4 4.5
1,5 3 2,5 3.5 3,5 4 4,5 4.5 5,5 5

Thus, the sampling distribution of the means of samples of size 2 will be the probability
distribution of the sample means calculated above.

X̄ f p ( X̄ )
1 1 1/25
1.5 2 2/25
2 3 3/25
2.5 4 4/25
3 5 5/25
3.5 4 4/25
4 3 3/25
4.5 2 2/25
5 1 1/25

∑ f =25 ∑ p ( X̄ )=1

Now, let us see how to find the expected value and the standard error of the sampling
distribution of the means.

Expected Value

The expected value of the sampling distribution of the means, denoted by μ_X̄ or E(X̄), is defined as:

E(X̄) = μ_X̄ = \sum x̄ \cdot p(x̄)

Thus, for the above sampling distribution of means, the expected value is:

E(X̄) = μ_X̄ = \sum x̄ \cdot p(x̄) = 1(1/25) + 1.5(2/25) + ... + 4.5(2/25) + 5(1/25) = (1 + 3 + ... + 9 + 5)/25 = 75/25 = 3

Standard Error
The standard deviation of the sampling distribution of the means, usually called the standard error and denoted by σ_X̄, is defined as:

σ_X̄ = \sqrt{E(X̄^2) - [E(X̄)]^2}

Thus, for the above sampling distribution of means, the standard error can be calculated as follows:

σ_X̄ = \sqrt{E(X̄^2) - [E(X̄)]^2}, where E(X̄) = μ_X̄ = 3 and

E(X̄^2) = \sum x̄^2 \cdot p(x̄) = (1)^2(1/25) + (1.5)^2(2/25) + ... + (4.5)^2(2/25) + (5)^2(1/25) = 10

Therefore, the standard error will be:

σ_X̄ = \sqrt{E(X̄^2) - [E(X̄)]^2} = \sqrt{10 - (3)^2} = \sqrt{10 - 9} = \sqrt{1} = 1

The following are the most important properties of the sampling distribution of means.
1) The mean of the sampling distribution of means and the population mean are equal. That is, μ_X̄ = μ.
For the sampling distribution of means constructed above, we found that the mean of the sampling distribution of the means is μ_X̄ = 3. Now, if we find the population mean, we get

μ = \frac{\sum x}{N} = \frac{1 + 2 + 3 + 4 + 5}{5} = \frac{15}{5} = 3

That is, μ_X̄ = μ.

2) The standard error of the sampling distribution of the means and the population standard deviation are related by the formula σ_X̄ = σ / \sqrt{n}.
For the sampling distribution of means constructed above, we found that the standard error of the sampling distribution of the means is σ_X̄ = 1. Now, if we find the population standard deviation, we get

σ = \sqrt{\frac{\sum (x - μ)^2}{N}} = \sqrt{\frac{(1-3)^2 + (2-3)^2 + (3-3)^2 + (4-3)^2 + (5-3)^2}{5}} = \sqrt{\frac{10}{5}} = \sqrt{2}

Thus, σ / \sqrt{n} = \sqrt{2} / \sqrt{2} = 1 = σ_X̄. Therefore, σ_X̄ = σ / \sqrt{n}.
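Both properties can be verified by brute-force enumeration. The Python sketch below is my own illustration (standard library only) for the population {1, 2, 3, 4, 5} and samples of size 2 drawn with replacement.

from itertools import product
from math import sqrt

population = [1, 2, 3, 4, 5]
means = [sum(s) / 2 for s in product(population, repeat=2)]   # all 25 sample means

mu_xbar = sum(means) / len(means)
sigma_xbar = sqrt(sum((m - mu_xbar) ** 2 for m in means) / len(means))

mu = sum(population) / len(population)
sigma = sqrt(sum((x - mu) ** 2 for x in population) / len(population))

print(mu_xbar, mu)                  # 3.0 3.0   -> mu_xbar = mu
print(sigma_xbar, sigma / sqrt(2))  # 1.0 1.0   -> sigma_xbar = sigma / sqrt(n)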
Normally, while we are sampling, we may sample from a population that is normally
distributed or from a population that is not normally distributed. Now, we will try to discuss
the different assumptions that we have to consider in each of these two sampling approaches.
Sampling from a Normally Distributed Population
It can be shown that if sampling is done from a population that is normally distributed with mean μ and standard deviation σ, the sampling distribution of the mean will be normally distributed with mean μ_X̄ = μ and standard deviation σ_X̄ = σ / \sqrt{n}. In the most elementary case, if samples of size 1 are taken, each possible sample mean is a single observation from the population. Therefore, if the population is normally distributed with mean μ and standard deviation σ, the sampling distribution of the mean will be normally distributed with mean μ_X̄ = μ and standard deviation σ_X̄ = σ / \sqrt{n} = σ / \sqrt{1} = σ.
As the sample size increases, the sampling distribution of the mean still follows a normal distribution with mean μ_X̄ = μ. However, as the sample size increases, the standard error of the means decreases, so that a larger proportion of sample means lie closer to the population mean. Thus, if sampling is done from a normal distribution, the population distribution and the sampling distribution of the means, which is also normally distributed, have the same mean (i.e. μ_X̄ = μ) but the population is more variable than the sampling distribution of the means (i.e. σ ≥ σ_X̄).

[Figure: the population distribution and the narrower sampling distribution of the mean, both centered at μ_X̄ = μ, with σ_X̄ < σ.]

Since the population is assumed to be normally distributed with mean μ and standard deviation σ, the z-score (the standardized normal variable) is defined as:

Z = \frac{X - μ}{σ}

Similarly, since the sampling distribution of the means is also normally distributed with mean μ_X̄ = μ and standard deviation σ_X̄ = σ / \sqrt{n}, the z-score corresponding to the sampling distribution of the means will be

Z = \frac{X̄ - μ_X̄}{σ_X̄} = \frac{X̄ - μ}{σ / \sqrt{n}}
Example 1:The marks in mathematics of high school students in Addis are normally
distributed with mean 75 and standard deviation 12.
a) If a random sample of 4 students is taken, what is the probability that the mean of their
scores is between 72 and 78?
b) What will happen to the answer for the problem in (a) if we increase the sample size?
Why?
c) If a random sample of 4 students is taken, what is the probability that the mean of their
scores is between 78 and 85?
d) What will happen to the answer for the problem in (c) if we increase the sample size?
Why?
Solution:

Given: μ = 75 and σ = 12

a) n = 4, P(72 ≤ X̄ ≤ 78) = ?

P(72 ≤ X̄ ≤ 78) = P(X̄ ≤ 78) - P(X̄ ≤ 72)

Let z₁ and z₂ be the z-scores corresponding to x̄₁ = 72 and x̄₂ = 78, where σ_X̄ = σ / \sqrt{n} = 12 / \sqrt{4} = 12/2 = 6.

Thus, z₁ = (72 - 75)/6 = -3/6 = -0.5 and z₂ = (78 - 75)/6 = 3/6 = 0.5.

Therefore,

P(72 ≤ X̄ ≤ 78) = P(Z ≤ 0.5) - P(Z ≤ -0.5) = 0.6915 - 0.3085 = 0.3830

b) Let's increase the size of our sample to 9, i.e. n = 9. Now,

P(72 ≤ X̄ ≤ 78) = P(X̄ ≤ 78) - P(X̄ ≤ 72)

Let z₁ and z₂ be the z-scores corresponding to x̄₁ = 72 and x̄₂ = 78, where σ_X̄ = σ / \sqrt{n} = 12 / \sqrt{9} = 12/3 = 4.

Thus, z₁ = (72 - 75)/4 = -0.75 and z₂ = (78 - 75)/4 = 0.75.

Therefore,

P(72 ≤ X̄ ≤ 78) = P(Z ≤ 0.75) - P(Z ≤ -0.75) = 0.7734 - 0.2266 = 0.5468

As the sample size increases, this probability increases, because the sample mean becomes more concentrated around the population mean.

c) n = 4, P(78 ≤ X̄ ≤ 85) = ?

P(78 ≤ X̄ ≤ 85) = P(X̄ ≤ 85) - P(X̄ ≤ 78)

Let z₁ and z₂ be the z-scores corresponding to x̄₁ = 78 and x̄₂ = 85, where σ_X̄ = σ / \sqrt{n} = 12 / \sqrt{4} = 6.

Thus, z₁ = (78 - 75)/6 = 0.5 and z₂ = (85 - 75)/6 = 10/6 ≈ 1.67.

Therefore,

P(78 ≤ X̄ ≤ 85) = P(Z ≤ 1.67) - P(Z ≤ 0.5) = 0.9525 - 0.6915 = 0.2610

d) Let's increase the size of our sample to 9, i.e. n = 9. Now,

P(78 ≤ X̄ ≤ 85) = P(X̄ ≤ 85) - P(X̄ ≤ 78)

Let z₁ and z₂ be the z-scores corresponding to x̄₁ = 78 and x̄₂ = 85, where σ_X̄ = σ / \sqrt{n} = 12 / \sqrt{9} = 4.

Thus, z₁ = (78 - 75)/4 = 0.75 and z₂ = (85 - 75)/4 = 2.5.

Therefore,

P(78 ≤ X̄ ≤ 85) = P(Z ≤ 2.5) - P(Z ≤ 0.75) = 0.9938 - 0.7734 = 0.2204

Therefore, as the sample size increases, the probability that a sample mean falls between 78 and 85, and hence far from the population mean, decreases.
However, in many instances, either it is known that the population is not normally distributed
or it is unrealistic to assume a normal distribution. Thus, the sampling distribution of the
means for populations that are not normally distributed needs to be examined.
Sampling from a Non-Normally Distributed Population
When we sample from a non-normally distributed population, we rely on a very important theorem in statistics called the central limit theorem. We state the theorem as follows.
The Central Limit Theorem
As the sample size n (that is, the number of observations in each sample) gets large enough (usually n ≥ 30), the sampling distribution of the means is approximately normally distributed with mean μ_X̄ = μ and standard deviation σ_X̄ = σ / \sqrt{n}.
Example : A certain population has a mean of 100 and a standard deviation of 56. Samples
of size 64 are randomly selected and their means are calculated. What value would you
expect for the
a) mean of all these sample means?
b) standard deviation of all these sample means?

Solution: Given: μ = 100, σ = 56, n = 64.

Required: μ_X̄ and σ_X̄.

Since the sample size is greater than 30, by the Central Limit Theorem the sampling distribution of the means is approximately normally distributed with mean μ_X̄ = μ and standard deviation σ_X̄ = σ / \sqrt{n}. Therefore, μ_X̄ = μ = 100 and σ_X̄ = σ / \sqrt{n} = 56 / \sqrt{64} = 56/8 = 7.
Exercise
1. Suppose that you have given a population of 9 integers; 1, 2, 3, 4, 5, 6, 7, 8, and 9.
If a random sample of 4 integers is selected, then what is the probability that the mean
of this particular sample is between 3.5 and 5?
How many samples of size 4 can we draw from this population if sampling is done
with replacement?
How many of the samples, whose size is 4, are expected to have a mean greater than
3.5? / Assume sampling with replacement/
2. Consider a population of 10 integers, Say 1, 2, 3,4,5,6,7,8,9 and 10.
I. The population mean is _________
II. The population standard deviation is _________
III. How many samples of size 3 can we select using sampling with replacement?
IV. The expected value of the sampling distribution of the means of all samples of size 3 is ______________
V. The standard deviation of the sampling distribution of the means (Standard error
of X̄ ) of all samples of size 3 is ______________
3. A certain population has a mean of 500 and a standard deviation of 30. Samples of size
36 are randomly selected and their mean calculated.
I. What value would you expect to find for the mean of all these sample means?
II. What value would you expect to find for the standard deviation of all these
sample means?
III. What shape do you expect the distribution of these entire sample means to have?
2.3 Sampling Distribution of the Difference of Two Means (X̄₁ - X̄₂)
Suppose that we have two populations, the first with mean μ₁ and standard deviation σ₁ and the second with mean μ₂ and standard deviation σ₂. Assume that the statistic X̄₁ represents the mean of a random sample of size n₁ selected from the first population and the statistic X̄₂ represents the mean of a random sample of size n₂ selected from the second population, independent of the sample from the first population. What can you say about the sampling distribution of the difference X̄₁ - X̄₂ for repeated samples of size n₁ and n₂? In Section 2.2, we have seen that the variables X̄₁ and X̄₂ are both approximately normally distributed with means μ₁ and μ₂ and standard deviations σ₁ / \sqrt{n₁} and σ₂ / \sqrt{n₂}, respectively. This approximation improves as n₁ and n₂ increase.

By choosing independent samples from the two populations, the variables X̄₁ and X̄₂ will be independent, and then we can say that X̄₁ - X̄₂ is approximately normally distributed with mean

μ_{X̄₁ - X̄₂} = μ_{X̄₁} - μ_{X̄₂} = μ₁ - μ₂

and variance

σ²_{X̄₁ - X̄₂} = σ²_{X̄₁} + σ²_{X̄₂} = \frac{σ₁²}{n₁} + \frac{σ₂²}{n₂}

Now, we state the following theorem.
If independent samples of size n₁ and n₂ are drawn at random from two populations, discrete or continuous, with means μ₁ and μ₂ and standard deviations σ₁ and σ₂, then the sampling distribution of the difference of means X̄₁ - X̄₂ is approximately normally distributed with mean μ_{X̄₁ - X̄₂} = μ₁ - μ₂ and variance σ²_{X̄₁ - X̄₂} = σ₁²/n₁ + σ₂²/n₂. Hence, the random variable

Z = \frac{(X̄₁ - X̄₂) - (μ₁ - μ₂)}{\sqrt{\frac{σ₁²}{n₁} + \frac{σ₂²}{n₂}}}

is approximately a standard normal random variable.

If both n₁, n₂ ≥ 30, the normal approximation for the distribution of X̄₁ - X̄₂ is very good.

Example 3: The TV picture tubes of manufacturer A have a mean life time of 6.5 years and
standard deviation of 0.9 years, while those of manufacturer B have a mean life time of 6.0
years and standard deviation of 0.8 years. What is the probability that a random sample of 36
tubes from manufacturer A will have a mean life time that is at least one year more than the
mean life time of a sample of 49 tubes from manufacturer B?
Solution:
We are given the following information:
Population 1: μ1 = 6.5, σ1 = 0.9, n1 = 36
Population 2: μ2 = 6.0, σ2 = 0.8, n2 = 49
Required: P(X̄1 − X̄2 ≥ 1.0)
Now, μ_{X̄1−X̄2} = μ1 − μ2 = 6.5 − 6.0 = 0.5 and
σ_{X̄1−X̄2} = √(σ1²/n1 + σ2²/n2) = √((0.9)²/36 + (0.8)²/49) = 0.189
The Z-score corresponding to x̄1 − x̄2 = 1 is
z = [(x̄1 − x̄2) − (μ1 − μ2)] / σ_{X̄1−X̄2} = (1 − 0.5)/0.189 = 2.646
Therefore, P(X̄1 − X̄2 ≥ 1.0) = P(Z > 2.646) = 1 − P(Z < 2.646) = 1 − 0.9959 = 0.0041
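As a quick check of Example 3, the same probability can be computed numerically. The short sketch below uses Python with scipy (our choice; the course notes do not prescribe any software); small differences from the worked answer come from rounding the standard error to 0.189.

```python
from math import sqrt
from scipy.stats import norm

# Given values for manufacturers A and B (Example 3)
mu1, sigma1, n1 = 6.5, 0.9, 36
mu2, sigma2, n2 = 6.0, 0.8, 49

mean_diff = mu1 - mu2                              # 0.5
se_diff = sqrt(sigma1**2 / n1 + sigma2**2 / n2)    # about 0.189

# P(X1bar - X2bar >= 1.0) under the normal approximation
z = (1.0 - mean_diff) / se_diff                    # about 2.65
prob = 1 - norm.cdf(z)                             # about 0.004
print(round(se_diff, 3), round(z, 3), round(prob, 4))
```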
Exercise
Brand A tyres have a mean tread life of 20,000 miles with a standard deviation of 1000
miles, while those of brand B tyres have a mean tread life of 19,500 miles with a standard
deviation of 1200 miles. What is the probability that a random sample of 40 tyres of
brand A will have a mean tread life that is at least 100 miles more than the mean tread life
of a sample of 50 tyres of brand B ?

2.4 Sampling Distribution of the Sample Proportion (P)
Sampling Distribution of the Proportion is the probability distribution of the sample
proportion of all samples of a given size.
To make it a bit clear, consider a population of 5 students as follows.

Student Sex

A F
B M

C M
D F
E M

Now, let's try to construct the proportions of female students for all samples of size 2. Clearly, N = 5 and n = 2. Thus, from the given population we can draw N^n = 5² = 25 samples, each of size 2, using sampling with replacement.
Note that:
Population proportion (p) = (number of successes in the population (x)) / (population size (N)), i.e. p = x/N
Sample proportion (P) = (number of successes in the sample (x)) / (sample size (n)), i.e. P = x/n

Now, the different samples and the corresponding proportions can be tabulated as follows.

Sample   Sex    Proportion (P)        Sample   Sex    Proportion (P)

A, A F, F 2/2 = 1 C, D M, F 1/2 = 0.5


A, B F, M 1/2 = 0.5 C, E M, M 0/2 = 0
A, C F, M 1/2 = 0.5 D, A F, F 2/2 = 1
A, D F, F 2/2 = 1 D, B F, M 1/2 = 0.5
A, E F, M 1/2 = 0.5 D, C F, M 1/2 = 0.5
B, A M, F 1/2 = 0.5 D, D F, F 2/2 = 1
B, B M, M 0/2 = 0 D, E F, M 1/2 = 0.5
B, C M, M 0/2 = 0 E, A M, F 1/2 = 0.5
B, D M, F 1/2 = 0.5 E, B M, M 0/2 = 0
B, E M, M 0/2 = 0 E, C M, M 0/2 = 0
C, A M, F 1/2 = 0.5 E, D M, F 1/2 = 0.5
C, B M, M 0/2 = 0 E, E M, M 0/2 = 0
C, C M, M 0/2 = 0

Thus, the sampling distribution of proportions will be the probability distribution of the
sample proportions computed above.

Proportion (P)     f       p(P)
0                  9       9/25
0.5                12      12/25
1                  4       4/25
                   Σf = 25  Σp(P) = 1
Expected Value
The expected value of the sampling distribution of proportions, denoted by μ_P or E(P), is defined as:
E(P) = μ_P = Σ P·p(P)
Thus, for the above sampling distribution of proportions, the expected value is:
μ_P = Σ P·p(P) = 0(9/25) + 0.5(12/25) + 1(4/25) = (0 + 6 + 4)/25 = 10/25 = 0.4
Standard Error
The standard deviation of the sampling distribution of proportions, usually called the standard error and denoted by σ_P, is defined as:
σ_P = √(E(P²) − [E(P)]²)
Thus, for the above sampling distribution of proportions, the standard error can be calculated as follows. We have E(P) = μ_P = 0.4 and
E(P²) = Σ P²·p(P) = (0)²(9/25) + (0.5)²(12/25) + (1)²(4/25) = 0.28
Therefore, the standard error will be:
σ_P = √(E(P²) − [E(P)]²) = √(0.28 − (0.4)²) = √0.12
Important properties regarding the sampling distribution of proportions
The mean of the sampling distribution of proportions and the population proportion are equal. That is, μ_P = p.
If we consider the sampling distribution of proportions constructed above, we found that the mean of the sampling distribution of the proportions is μ_P = 0.4. Now, if we find the population proportion, we get:
p = (number of female students in the population (x)) / (population size (N)) = 2/5 = 0.4
That is, μ_P = p.
The standard error of the sampling distribution of the proportions is given by the formula:
σ_P = √(p(1 − p)/n) = √(pq/n), where q = 1 − p
If we consider the sampling distribution of proportions constructed above, we found that the standard error of the sampling distribution of the proportions is σ_P = √0.12. Now, if we compute √(pq/n), we get:
√(pq/n) = √(0.4(1 − 0.4)/2) = √(0.4(0.6)/2) = √(0.24/2) = √0.12
That is, σ_P = √(pq/n).


If the sampling is done from a population having a binomial distribution with mean μ = np and standard deviation σ = √(npq), the sampling distribution of proportions will be approximately normally distributed with mean
μ_P = E(P) = E(X/n) = (1/n)E(X) = (1/n)(np) = p
and standard deviation
σ_P = σ(P) = σ(X/n) = (1/n)σ(X) = (1/n)√(npq) = √(pq/n)
provided that n is large and p is close to 1/2.
In answer to the question "How large is large?" or "How close is close?", a rule of thumb is that the approximation should only be used when both np > 5 and nq > 5, where q = 1 − p.
Thus, the Z-score corresponding to the sampling distribution of proportions is:
Z = (P − μ_P)/σ_P = (P − p)/√(pq/n) = (X/n − p)/√(pq/n) = (X − np)/√(npq)
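The properties above can be verified by brute force. The minimal Python sketch below (the language choice is ours, not the notes') enumerates all 25 ordered samples of size 2 drawn with replacement from the five students and checks that μ_P = p = 0.4 and σ_P = √(pq/n) = √0.12.

```python
from itertools import product
from math import sqrt

sexes = ['F', 'M', 'M', 'F', 'M']   # students A..E
n = 2                               # sample size

# all 25 ordered samples of size 2, drawn with replacement
props = [sample.count('F') / n for sample in product(sexes, repeat=n)]

mu_P = sum(props) / len(props)                                    # mean of the 25 proportions
sigma_P = sqrt(sum((x - mu_P) ** 2 for x in props) / len(props))  # their standard deviation

p = sexes.count('F') / len(sexes)                                 # population proportion = 0.4
print(mu_P, p)                                                    # both 0.4
print(round(sigma_P, 4), round(sqrt(p * (1 - p) / n), 4))         # both sqrt(0.12) ≈ 0.3464
```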
Example 4: It is known that 60% of youngsters of Addis are supporters of St. George
Football Team.
a) What is the probability that in a random sample of 100 youngsters 50% or less support
the team?
b) What is the probability that in a random sample of 100 youngsters 50% to 70% support
the team?
Solution: Given: p = 0.6, so μ_P = p = 0.6, and n = 100.
a) Required: P(P ≤ 0.5). Let p1 = 0.5.
Now, σ_P = √(pq/n) = √(0.6(0.4)/100) = √0.0024, so
z1 = (p1 − μ_P)/σ_P = (0.5 − 0.6)/√0.0024 = −2.04
Thus, P(P ≤ 0.5) = P(Z ≤ −2.04) = 0.0207
b) Required: P(0.5 ≤ P ≤ 0.7). Let p1 = 0.5 and p2 = 0.7.
Now, z1 = (0.5 − 0.6)/√0.0024 = −2.04 and z2 = (0.7 − 0.6)/√0.0024 = 2.04
Thus, P(0.5 ≤ P ≤ 0.7) = P(−2.04 ≤ Z ≤ 2.04) = P(Z ≤ 2.04) − P(Z ≤ −2.04) = 0.9793 − 0.0207 = 0.9586
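Both probabilities in Example 4 can be reproduced with the normal approximation, as in the sketch below (Python/scipy assumed, as before; minor differences from the table-based answers are rounding).

```python
from math import sqrt
from scipy.stats import norm

p, n = 0.60, 100                    # population proportion and sample size
se = sqrt(p * (1 - p) / n)          # standard error of P, about 0.049

# a) P(P <= 0.5)
prob_a = norm.cdf((0.50 - p) / se)                              # about 0.021
# b) P(0.5 <= P <= 0.7)
prob_b = norm.cdf((0.70 - p) / se) - norm.cdf((0.50 - p) / se)  # about 0.959
print(round(prob_a, 4), round(prob_b, 4))
```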

Exercise.
1. Experience shows that 15% of students in DRMC who are taking Statistics for
Management score “A”. In a random sample of 200 students, the probability that at
least 25 students score “A” is:
a) 0.8389 c) 0.6164
b) 0.1611 d) 0.7512
2. A toy store has determined that 15% of all video games sold during Christmas are
returned. In a random sample of 200 video games sold by the store, the probability that
at least 10% will be returned is :
a) 0.9761 b) 0.8761 c) 0.7761 d) 0.6761
2.5 Sampling Distribution of the Difference of Two Proportions (P1 − P2)
Suppose that we have two binomial populations with means n1p1 and n2p2 and variances n1p1q1 and n2p2q2, respectively. Assume that the statistic P1 represents the proportion of successes in the first sample, whose size is n1, and P2 represents the proportion of successes in the second sample, whose size is n2. From the discussion above we know that P1 and P2 are each approximately normally distributed with means p1 and p2 and variances p1q1/n1 and p2q2/n2, respectively. By choosing independent samples from the two populations, the variables P1 and P2 will be independent, and then we can say that P1 − P2 is approximately normally distributed with mean μ_{P1−P2} = p1 − p2 and standard deviation
σ_{P1−P2} = √(p1q1/n1 + p2q2/n2)
Now, we state the following theorem.
If two independent samples of size n1 and n2 are drawn at random from two binomial populations with means μ1 = n1p1 and μ2 = n2p2 and variances σ1² = n1p1q1 and σ2² = n2p2q2, respectively, then the sampling distribution of the difference of proportions P1 − P2 is approximately normally distributed with mean and standard deviation given by
μ_{P1−P2} = p1 − p2 and σ_{P1−P2} = √(p1q1/n1 + p2q2/n2)
Hence, the random variable
Z = [(P1 − P2) − (p1 − p2)] / √(p1q1/n1 + p2q2/n2)
is approximately a standard normal variable.
Note that, if n1p1, n1q1, n2p2, n2q2 ≥ 5, the normal approximation for the distribution of P1 − P2 is very good.
Exercise
1. In a population of 1000 high school students, 800 are females; whereas in a population of
400 college students, 240 are females. Calculate the standard error of the sampling
distribution of the difference in sample proportion of female students in high schools and
colleges, where p1 is the population proportion of female students in high schools and p 2
is the population proportion of female students in colleges
2.6 The Distribution of the Random Variable (n − 1)S²/σ² (Chi-square Distribution)
Before discussing the distribution of the random variable (n − 1)S²/σ², we need to introduce the concept of degrees of freedom and the concept of the chi-square distribution.
The Concept of Degrees of Freedom
To illustrate the concept of degrees of freedom, let's consider the sample variance
s² = Σ(xi − x̄)²/(n − 1)
In order to compute s², first x̄ needs to be known. For instance, if a sample of size 3, say {x1, x2, x3}, is considered and if the sample mean is x̄ = 10, then only two of the values, say x1 and x2, are free to vary. For instance, if x1 = 7 (free) and x2 = 12 (free), then necessarily x3 = 11 (fixed). Thus, the degrees of freedom is 2. Therefore, in general, if we consider a sample of size n, then only n − 1 of the sample values are free to vary once x̄ is fixed. This means that there are n − 1 degrees of freedom.
Chi-square Distribution
The distribution of the sum of the squares of independent standard normal random variables is referred to as the chi-square distribution. More formally, we define the chi-square distribution as follows.
If Z1, Z2, ..., Zn are independent standard normal random variables, then the random variable
χ² = Z1² + Z2² + ... + Zn² = Σ Zi²
is said to be a chi-square random variable with n degrees of freedom.
Now, let us see how to find the expected value of the chi-square distribution.
Expected Value
Since Zi for each i = 1, 2, ..., n is a standard normal random variable, we have E(Zi) = 0 and Var(Zi) = 1.
Now, from the variance formula Var(Zi) = E(Zi²) − [E(Zi)]², we have E(Zi²) = 1 for each i = 1, 2, ..., n. Therefore,
E(Σ Zi²) = Σ E(Zi²) = Σ 1 = n
Now, we have the following property.
The expected value of a chi-square random variable is equal to its degrees of freedom.

Suppose now that we have a sample X1, X2, ..., Xn from a normal population having mean μ and variance σ². Consider the sample variance S² defined by:
S² = Σ(Xi − X̄)²/(n − 1)
The following result follows:
If S² is the variance of a random sample of size n taken from a normal population having the variance σ², then the random variable
χ² = (n − 1)S²/σ² = Σ(Xi − X̄)²/σ²
has a chi-square distribution with n − 1 degrees of freedom.
Although a mathematical proof of the above theorem is beyond the scope of this course, we can obtain some understanding of why it is true. To begin, let us consider the standardized normal variables (Xi − μ)/σ, i = 1, 2, ..., n, where μ and σ are the population mean and standard deviation, respectively. Since these variables are independent standard normal variables, it follows that the sum of their squares,
Σ[(Xi − μ)/σ]² = Σ(Xi − μ)²/σ²,
has a chi-square distribution with n degrees of freedom. Now, if we substitute x̄ for μ above, then the new quantity
Σ(Xi − x̄)²/σ²
will remain a chi-square variable, but will lose 1 degree of freedom, because μ is replaced by its estimator x̄. Thus,
Σ(Xi − x̄)²/σ² = [Σ(Xi − x̄)²/(n − 1)]·[(n − 1)/σ²] = (n − 1)S²/σ²
has a chi-square distribution with n − 1 degrees of freedom.
The following are some of the characteristics of the chi-square distribution.
 The chi-square values are all non-negative.
 The distribution is non-symmetric (skewed to the right), but approaches symmetry as its degrees of freedom (v) increase.
 The mean is equal to its degrees of freedom. That is, E(χ²) = v.
 It has only one parameter, its degrees of freedom (v).

Chi-square distribution curve
How to use the Chi-square table
The chi-square distribution table shows the chi-square value that leaves an area of p from its
right under the chi-square distribution curve for a given degree of freedom. Table rows show
degrees of freedom and table columns show some selected values of p.
The chi-square distribution table is presented below. To find the chi-square value that leaves
an area of 0.05 from its right under the chi-square distribution curve for 3 degree of freedom,
cross-reference the row of the table containing 3 with the column containing 0.05. If you look
at the table accordingly, you will get 7.815.

Chi-square distribution table

DF 0.995 0.975 0.2 0.1 0.05 0.025 0.02 0.01 0.005 0.002 0.001

1 0.000 0.001 1.642 2.706 3.841 5.024 5.412 6.635 7.879 9.550 10.828
2 0.010 0.051 3.219 4.605 5.991 7.378 7.824 9.210 10.597 12.429 13.816
3 0.072 0.216 4.642 6.251 7.815 9.348 9.837 11.345 12.838 14.796 16.266
4 0.207 0.484 5.989 7.779 9.488 11.143 11.668 13.277 14.860 16.924 18.467
5 0.412 0.831 7.289 9.236 11.070 12.833 13.388 15.086 16.750 18.907 20.515
… … … … … … … … … … … …

19 6.844 8.907 23.900 27.204 30.144 32.852 33.687 36.191 38.582 41.610 43.820
… … … … … … … … … … … …

Example 5: For a chi-square random variable with 19 degree of freedom, find the chi-
square value having area
o 0.025 to its right
o 0.995 to its right
Solution:

To find the chi-square value that leaves an area of 0.025 to its right under the chi-square distribution curve for 19 degrees of freedom, cross-reference the row of the table containing 19 with the column containing 0.025. If you look at the table accordingly, you will get the required value, 32.852.

To find the chi-square value that leaves an area of 0.995 to its right under the chi-square distribution curve for 19 degrees of freedom, cross-reference the row of the table containing 19 with the column containing 0.995. If you look at the table accordingly, you will get the required value, 6.844.
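Instead of the printed table, the same critical values can be read off with software. The sketch below (Python/scipy assumed) reproduces the two lookups of Example 5; note that chi2.ppf takes the area to the left, so the right-tail area must be converted first.

```python
from scipy.stats import chi2

df = 19
# area 0.025 to the right  ->  area 0.975 to the left
print(round(chi2.ppf(1 - 0.025, df), 3))   # 32.852
# area 0.995 to the right  ->  area 0.005 to the left
print(round(chi2.ppf(1 - 0.995, df), 3))   # 6.844
```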

Exercise
 For a chi-square random variable with 10 degrees of freedom, find the chi-square value having area 0.05 to its right; that is, χ²_{0.05}.
 For a chi-square random variable with 10 degrees of freedom, find the chi-square value having area 0.975 to its right; that is, χ²_{0.975}.
So far we have seen the chi-square distribution. In the coming subtopic we shall see another
important probability distribution, the t-distribution, which has immense applications.

2.7 The Distribution of the Random Variable (X̄ − μ)/(S/√n) (t-distribution)
Dear learner, most of the time we are not fortunate enough to know the variance of the population from which we select our random samples. For samples of size n ≥ 30, a good estimate of σ² can be obtained by calculating a value for S². What then happens to our statistic (X̄ − μ)/(σ/√n) if we replace σ by S? As long as S² provides a good estimate of σ² and does not vary much from sample to sample, which is usually the case for n ≥ 30, the statistic (X̄ − μ)/(S/√n) is still approximately distributed as a standard normal variable Z. However, if the sample size is small (n < 30), the value of S² fluctuates considerably from sample to sample and the distribution of the random variable (X̄ − μ)/(S/√n) is no longer a standard normal distribution. Rather, the random variable follows what we call the t-distribution with n − 1 degrees of freedom. Thus, we usually denote the random variable (X̄ − μ)/(S/√n) by T, i.e.
T = (X̄ − μ)/(S/√n)
The following theorem relates the t-distribution to the standard normal distribution and the chi-square distribution.
If Z is a standard normal random variable and V is a chi-square random variable with v degrees of freedom, then the random variable Z/√(V/v) follows a t-distribution with v degrees of freedom.
Dear learner, here we do not go through the proof of the above theorem. But, at least to make you a bit comfortable, let us see the following computation. Let Z = (X̄ − μ)/σ_X̄ and let V = (n − 1)S²/σ². Clearly, Z is a standard normal random variable, whereas V is a chi-square random variable with v = n − 1 degrees of freedom. Now,
Z/√(V/v) = [(X̄ − μ)/(σ/√n)] / √[(n − 1)S²/σ²/(n − 1)] = [(X̄ − μ)/(σ/√n)] / (S/σ) = [(X̄ − μ)/(σ/√n)]·(σ/S) = (X̄ − μ)/(S/√n)
As we can see, the last expression shows that (X̄ − μ)/(S/√n), and hence Z/√(V/v), follows a t-distribution.
The following are some of the characteristics of the t-distribution.
1. Similar to the distribution of Z, the distribution of T:
a. is symmetric about a mean of 0
b. is bell shaped (uni-modal)
c. ranges from negative infinity to positive infinity
2. The distribution of T is more variable than the Z distribution, as its values depend on the fluctuation of two quantities, X̄ and S², whereas the Z distribution depends only on the changes of X̄ from sample to sample.
3. The variance of the t-distribution, Var(T) = v/(v − 2) for v > 2 degrees of freedom, depends on the sample size and is always greater than 1, whereas the variance of the Z distribution is always 1.
4. Since the variance of the t-distribution is always greater than 1 but the variance of the standard normal distribution is exactly 1, the shape of the t-distribution is slightly flatter in the middle than the standard normal distribution and has thicker tails.
5. The degrees of freedom (v) of a T distribution is given by v = n − 1, where n is the sample size.

t-distribution curve
How to use the T- distribution table
The T- distribution table shows the T- value that leaves an area of p from its right under the
T- distribution curve for a given degree of freedom. Table rows show degrees of freedom and
table columns show some selected values of p.

Dear learner, for your consideration a section of the T-distribution table is presented below.
To find the T- value that leaves an area of 0.01 from its right under the T- distribution curve
for 5 degree of freedom, cross-reference the row of the table containing 5 with the column
containing 0.01. If you look at the table accordingly, you will get 3.36493.

t table with right tail probabilities

df\p 0.40 0.25 0.10 0.05 0.025 0.01 0.005 0.0005


1 0.324920 1.000000 3.077684 6.313752 12.70620 31.82052 63.65674 636.6192
2 0.288675 0.816497 1.885618 2.919986 4.30265 6.96456 9.92484 31.5991
3 0.276671 0.764892 1.637744 2.353363 3.18245 4.54070 5.84091 12.9240
4 0.270722 0.740697 1.533206 2.131847 2.77645 3.74695 4.60409 8.6103
5 0.267181 0.726687 1.475884 2.015048 2.57058 3.36493 4.03214 6.8688
--- --- --- --- --- --- --- --- ---
10 0.260185 0.699812 1.372184 1.812461 2.22814 2.76377 3.16927 4.5869
--- --- --- --- --- --- --- --- ---
inf 0.253347 0.674490 1.281552 1.644854 1.95996 2.32635 2.57583 3.2905

Example 6: For a t-random variable with 10 degrees of freedom, find the t-value having area 0.05 to its right, that is, t_{0.05}.

Solution: To find the t-value that leaves an area of 0.05 to its right under the t-distribution curve for 10 degrees of freedom, cross-reference the row of the table containing 10 with the column containing 0.05. If you look at the table accordingly, you will get the required value, 1.812461.
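The same lookup can be done in software (Python/scipy assumed); as with the chi-square table, the right-tail area must be converted to a left-tail area for t.ppf.

```python
from scipy.stats import t

# t-value with area 0.05 to its right for 10 degrees of freedom
print(round(t.ppf(1 - 0.05, df=10), 6))   # 1.812461
```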
Exercise
1. For a t-random variable with 7 degrees of freedom, find the t-value having area 0.25 to its right, that is, t_{0.25}.
2. For a t-random variable with 12 degrees of freedom, find the t-value having area 0.1 to its right, that is, t_{0.1}.

2.8 The Distribution of the Random Variable (S1²/σ1²)/(S2²/σ2²) (F-distribution)
The F-distribution is named in honor of R. A. Fisher, who studied it in 1924. It is used for comparing the variances of two populations. It is defined in terms of the ratio of two independent chi-square random variables, each divided by its degrees of freedom. Hence, we can write
F = (U1/v1)/(U2/v2)
Since U1 and U2 have chi-square distributions with v1 and v2 degrees of freedom, we may write them as U1 = (n1 − 1)S1²/σ1² and U2 = (n2 − 1)S2²/σ2². Clearly, v1 = n1 − 1 and v2 = n2 − 1. Now,
F = (U1/v1)/(U2/v2) = [(n1 − 1)S1²/σ1²]/(n1 − 1) ÷ [(n2 − 1)S2²/σ2²]/(n2 − 1) = (S1²/σ1²)/(S2²/σ2²)
Thus, we have the following theorem.
Theorem
If S1² and S2² are the variances of independent random samples of size n1 and n2 taken from normal populations with variances σ1² and σ2², respectively, then
F = (S1²/σ1²)/(S2²/σ2²)
has an F-distribution with v1 = n1 − 1 and v2 = n2 − 1 degrees of freedom in the numerator and denominator, respectively.
Theorem
Writing f_α(v1, v2) for f_α with v1 and v2 degrees of freedom,
f_{1−α}(v1, v2) = 1/f_α(v2, v1)
Note that, if the independent random samples are selected from two normal populations with equal variances, i.e. if σ1² = σ2², then the statistic F reduces to the ratio of the two sample variances, i.e. F = S1²/S2², and hence the statistic F is sometimes called the variance ratio.
The following are some of the characteristics of F-distribution.
Since F-distribution is a direct consequence of chi-square distribution, many of the chi-square
properties carry over to the F distribution.
 The F-values are all non-negative
 The distribution is non-symmetric, but approaches symmetry as v increases.
 The mean is approximately 1

 There are two independent degrees of freedom, one for the numerator, and one for the
denominator.
 There are many different F distributions, one for each pair of degrees of freedom.

F-distribution Curve

How to use the F- distribution table

The F distribution is a right-skewed distribution used most commonly in Analysis of


Variance. When referencing the F distribution, the numerator degrees of freedom are
always given first, as switching the order of degrees of freedom changes the distribution
(e.g., F(10,12) does not equal F(12,10) ). For the four F tables below, the rows represent
denominator degrees of freedom and the columns represent numerator degrees of freedom.
The right tail area is given in the name of the table. For example, to determine the .05 critical
value for an F distribution with 10 and 12 degrees of freedom, look in the 10 column
(numerator) and 12 row (denominator) of the F Table for alpha=.05. F(.05, 10, 12) = 2.7534.

F Table for α = 0.05

df2\df1 1 2 3 4 5 6 7 8 9 10
1 161.4476 199.5000 215.7073 224.5832 230.1619 233.9860 236.7684 238.8827 240.5433 241.8817
2 18.5128 19.0000 19.1643 19.2468 19.2964 19.3295 19.3532 19.3710 19.3848 19.3959
3 10.1280 9.5521 9.2766 9.1172 9.0135 8.9406 8.8867 8.8452 8.8123 8.7855
4 7.7086 6.9443 6.5914 6.3882 6.2561 6.1631 6.0942 6.0410 5.9988 5.9644
5 6.6079 5.7861 5.4095 5.1922 5.0503 4.9503 4.8759 4.8183 4.7725 4.7351
6 5.9874 5.1433 4.7571 4.5337 4.3874 4.2839 4.2067 4.1468 4.0990 4.0600
7 5.5914 4.7374 4.3468 4.1203 3.9715 3.8660 3.7870 3.7257 3.6767 3.6365
8 5.3177 4.4590 4.0662 3.8379 3.6875 3.5806 3.5005 3.4381 3.3881 3.3472
9 5.1174 4.2565 3.8625 3.6331 3.4817 3.3738 3.2927 3.2296 3.1789 3.1373
10 4.9646 4.1028 3.7083 3.4780 3.3258 3.2172 3.1355 3.0717 3.0204 2.9782
11 4.8443 3.9823 3.5874 3.3567 3.2039 3.0946 3.0123 2.9480 2.8962 2.8536
12 4.7472 3.8853 3.4903 3.2592 3.1059 2.9961 2.9134 2.8486 2.7964 2.7534

Example 1

For an F-random variable with 6 and 10 degrees of freedom, find the f-value having area
0.05 to its right, that is
f 0. 05 ( 6 , 10 ) .
Answer: 3.2172

Example 2
For an F-random variable with 6 and 10 degrees of freedom, find the f-value having area
0.95 to its right, that is
f 0. 95 ( 6 , 10 ) .
Solution
f_{0.95}(6, 10) = 1/f_{0.05}(10, 6) = 1/4.0600 = 0.246
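Both examples can be checked with software (Python/scipy assumed), which also confirms the reciprocal relation f_{1−α}(v1, v2) = 1/f_α(v2, v1).

```python
from scipy.stats import f

# Example 1: area 0.05 to the right with (6, 10) degrees of freedom
f_05 = f.ppf(1 - 0.05, dfn=6, dfd=10)     # about 3.2172
# Example 2: area 0.95 to the right with (6, 10) degrees of freedom
f_95 = f.ppf(1 - 0.95, dfn=6, dfd=10)     # about 0.246
print(round(f_05, 4), round(f_95, 4))
print(round(1 / f.ppf(1 - 0.05, dfn=10, dfd=6), 4))   # same as f_95, via the reciprocal rule
```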

Exercise

1. For an F-random variable with 4 and 7 degrees of freedom, find the f-value having area 0.01 to its right, that is, f_{0.01}(4, 7).
2. For an F-random variable with 4 and 7 degrees of freedom, find the f-value having area 0.99 to its right, that is, f_{0.99}(4, 7).

Unit Three
Estimation Theory

Introduction
Dear Learner, first read the following introductory ideas.
The concept of statistical inference (estimation) basically relates sample characteristics to population characteristics. Our real interest is to draw conclusions about population parameters, like the population mean μ and the population proportion p, based on the results of the corresponding sample statistics, like the sample mean x̄ and the sample proportion P. Thus, the primary objective of sampling is to give an estimate of a population parameter on the basis of a sample statistic.
Sample statistic → (estimation) → Population parameter
Dear learner, in this unit, you will be introduced to point estimation and interval estimation,
how to construct confidence interval estimate for population mean, the difference of two
populations’ means, population proportion and the difference of two populations’
proportions.
As you study this unit, you are expected to:
 distinguish the two types of estimations;
 explain the advantages and disadvantages of point estimation and interval estimation;
 list down and explain the characteristics of best estimator;
 give point estimate for a population mean and for a population proportion;
 construct interval estimation for a population mean, the difference of two populations
means, population proportion and the difference of two populations proportions;
 find the maximum error that can be encountered in estimating a population mean by a
sample mean and a population proportion by a sample proportion and
 find the minimum sample size required to estimate a population mean by a sample
mean and a population proportion by a sample proportion for a given maximum
allowable error.

1.1. Types of Estimation


Basically, there are two types of estimations: point estimation and interval estimation.
Dear learner, in this section we will see the difference between these two types of estimations
and the characteristics of best estimator.
Well, Point estimation is a kind of estimation that uses a single sample statistic value to
estimate the corresponding parameter.

For instance, the sample mean x̄ is a point estimator of μ; the sample standard deviation s is a point estimator of σ; and the sample proportion P is a point estimator of p.

Example 1: Consider a random sample of 5 integers taken from a certain population of integers. Let us find the point estimates of the population mean (μ), the population proportion (p) of even numbers, and the population variance (σ²).

Solution: The point estimate of μ is the sample mean x̄ of the 5 observed values. The point estimate of the proportion (p) of even numbers in the population is the sample proportion P, i.e. the number of even values in the sample divided by 5. At last, the point estimate of σ² is the sample variance s².

The following are some of the advantages of point estimation.


One of the most important advantages of point estimation is that it is relatively easy to
compute when compared to interval estimation.
The other advantage of point estimation is that it is the base for making interval estimation.
The following is the main disadvantage of point estimation
One of the main disadvantages of point estimation is that it does not specify how close the
point estimate is to the true value (population parameter) that it is estimating.
Interval Estimation
Interval estimation is a kind of estimation that uses intervals to estimate population
parameters.
In applying interval estimation, we first need to find a point estimate and use this estimate to
construct an interval on both sides of the point estimate within which we can reasonably be
confident that the true parameter will lie. The interval constructed around the point estimate
is referred to as confidence interval and the degree of likelihood that the true parameter lies
within the confidence interval is called confidence level.
Well, the best estimator should be highly reliable and have such desired properties as
unbiasedness, consistency, efficiency and sufficiency.

a) Unbiasedness

An estimator (sample statistic) is said to be unbiased if and only if the expected value of the
sampling distribution of the statistic is equal to the corresponding population parameter.

Dear learner, in unit 3 of Module-I, we have shown that E(X̄) = μ, E(P) = p and E(S²) = σ². Thus, we can say that the sample mean, the sample proportion and the sample variance are unbiased estimators of the population mean, the population proportion and the population variance, respectively.
b) Consistency
An estimator is said to be consistent if and only if the estimator approaches the population
parameter as sample size increases.

c) Efficiency
An estimator is considered to be efficient if and only if its value remains more or less stable from sample to sample. Among the measures of central tendency, the mean is the more efficient measure, and among the measures of variation, the standard deviation (or the variance) is the more efficient measure.
d) Sufficiency
An estimator is said to be sufficient if and only if it uses all the sample values in its
computation. For instance, mean uses all the sample values in its computation, whereas mode
and median are not.

Dear learner, so far we have seen the concept of estimation, types of estimation and
characteristics of best estimator. Now, in the following section we will try to see how to
estimate population mean from the corresponding sample mean from different perspectives.
Exercise
1) What are the two types of estimation?
2) What is/are the advantage(s) of using point estimation over using interval estimation?
3) What is/are the advantage(s) of using interval estimation over using point estimation?
4) List down the characteristics of a best estimator?
5) A sample of 10 observations taken from a normally distributed population having
standard deviation of 5 produced the following data: 35 32 23 36 26 37 33
45 24 19

a) Give the point estimate of the population mean μ.


b) Give the point estimate of the population proportion of evens.

1.2. Estimating the Mean (μ)

Dear learner, in this section, as far as interval estimation of a population mean is concerned, we will try to consider three cases. The first is the case where σ is known and the population is normal (or n ≥ 30); the second is the case where σ is unknown, the population is normal and n < 30; and the third case is where σ is unknown and n ≥ 30.

A. Interval estimation for the population mean μ: σ is known and the population is normal (or n ≥ 30)

In section 3.2 of Module-I, we have seen that if our sample is selected from a normal population, or if the sample size n is sufficiently large (n ≥ 30), the variable
Z = (X̄ − μ)/(σ/√n)
will be a standard normal random variable.
Now, let z_{α/2} represent the Z value above which we find an area of α/2; the central area between −z_{α/2} and z_{α/2} under the standard normal curve is then 1 − α.
Then,
P(−z_{α/2} < Z < z_{α/2}) = 1 − α
If we substitute (X̄ − μ)/(σ/√n) for Z, we get the following:
P(−z_{α/2} < (X̄ − μ)/(σ/√n) < z_{α/2}) = 1 − α
If we multiply all terms in the parentheses by σ/√n, then subtract X̄ from each term and finally multiply all terms by −1, we get the following equation:
P(X̄ − z_{α/2}σ/√n < μ < X̄ + z_{α/2}σ/√n) = 1 − α
Thus, we state the result as follows.

Confidence interval for μ: σ known and population normal (or n ≥ 30)

A (1 − α)×100% confidence interval for μ is
x̄ − z_{α/2}σ/√n < μ < x̄ + z_{α/2}σ/√n
Or, in short, CI = x̄ ± z_{α/2}σ/√n,
where x̄ is the mean of a sample of size n from a population with known standard deviation σ, and z_{α/2} is the value of the standard normal distribution leaving an area of α/2 to the right.
For small samples selected from non-normal populations, we cannot expect our degree of confidence to be accurate. However, for samples of size n ≥ 30, regardless of the shape of the population distribution, sampling theory guarantees good results.

Example 2: Consider a sample of size 4, with mean x̄ = 4, that is selected from a population which is normally distributed with a standard deviation of 2. Give a 95% confidence interval estimate for μ.
Solution
Since (1 − α)×100% = 95%, α = 0.05 and z_{α/2} = z_{0.025} = 1.96. Here x̄ = 4, σ = 2 and n = 4, so σ/√n = 2/√4 = 1.
Therefore, CI = x̄ ± z_{α/2}σ/√n = 4 ± 1.96(1) = (2.04, 5.96)
This means, we are 95% confident that the population mean is between 2.04 and 5.96.
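Example 2 can be reproduced in a few lines, as in the sketch below (Python/scipy assumed; norm.ppf returns the 1.96 used above).

```python
from math import sqrt
from scipy.stats import norm

xbar, sigma, n, conf = 4, 2, 4, 0.95
z = norm.ppf(1 - (1 - conf) / 2)       # 1.96
margin = z * sigma / sqrt(n)           # 1.96
print(round(xbar - margin, 2), round(xbar + margin, 2))   # 2.04  5.96
```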
Dear learner, it is evident that while we are estimating population mean from sample mean,
we usually commit error. Thus, the following discussion will illustrate this error.

Error (e)

The (1 − α)×100% confidence interval provides an estimate of the accuracy of our point estimate. Clearly, at the center of our confidence interval we have the sample mean x̄. Now, if x̄ is actually equal to μ, then we say x̄ estimates μ without error. But, most of the time, x̄ will not be exactly equal to μ. Thus, in estimating μ by x̄, we usually commit an error. The size of this error is the difference between x̄ and μ, and we can be (1 − α)×100% confident that this difference will be less than z_{α/2}σ/√n. That is,
error (e) = |x̄ − μ| ≤ z_{α/2}σ/√n
Now, we have the following theorem.

If x̄ is used as an estimate of μ, we can be (1 − α)×100% confident that the error will be less than z_{α/2}σ/√n.

Frequently, we want to know how large a sample is necessary to ensure that the error in estimating μ will be less than a specified amount e. Thus, the following discussion will illustrate this case.

Sample size determination

From the above theorem, if we choose n such that z_{α/2}σ/√n = e, simple calculation shows that
n = (z_{α/2}σ/e)²
Now, we have the following theorem.

If x̄ is used as an estimate of μ, we can be (1 − α)×100% confident that the error will be less than a specified amount e when the sample size is n = (z_{α/2}σ/e)².

Note that the above formula is applicable only if we know the standard deviation of the population from which we are to select our sample. But, if we do not have any information about σ, we could take a preliminary sample of size n ≥ 30 to provide an estimate of σ.
Then, using the above theorem we could determine approximately how many observations
are needed to provide the desired degree of accuracy.
Example 3: Suppose that you want to determine the average time it takes a typist to type a
page of note. How large a sample size should you take so as to be 99% certain that your
sample mean will not differ from the true mean by more than 1 minute? Assume that similar
studies conducted before have established a SD of 2 minutes
Solution
Since (1 − α)×100% = 99%, α = 0.01 and z_{α/2} = z_{0.005} = 2.58. With σ = 2 minutes and e = 1 minute,
n = (z_{α/2}σ/e)² = (2.58 × 2/1)² = 26.6256
So, we take the sample size to be the next integer. That is 27
This means, if we take a sample size of at least 27, we can be 99% confident that the error,
the difference between the sample mean and the true population mean, is no more than 1
minute.
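A short sketch for the sample-size formula n = (z_{α/2}σ/e)² follows (Python/scipy assumed). It uses the exact value z_{0.005} ≈ 2.576 rather than the rounded 2.58 in the notes, so the intermediate number differs slightly, but the required sample size is still 27.

```python
from math import ceil
from scipy.stats import norm

sigma, e, conf = 2, 1, 0.99
z = norm.ppf(1 - (1 - conf) / 2)    # about 2.576
n = (z * sigma / e) ** 2            # about 26.5
print(ceil(n))                      # 27 -> always round up to the next integer
```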
B. Interval estimation for the population mean μ: σ unknown, population is normal and n < 30
Frequently we are attempting to estimate the mean of a population when the variance is unknown and it is impossible to obtain a sample of size n ≥ 30. Cost can often be a factor that limits our sample size. As long as our population is approximately bell shaped, confidence intervals can be computed when σ is unknown and the sample size is small (n < 30) by using the sampling distribution of T, where
T = (X̄ − μ)/(S/√n)
The procedure is the same as for large samples, except that we use the t-distribution instead of the standard normal distribution (Z-distribution).
Now, let t_{α/2} represent the T value above which we find an area of α/2. Then,
P(−t_{α/2} < T < t_{α/2}) = 1 − α
If we substitute (X̄ − μ)/(S/√n) for T, we get
P(−t_{α/2} < (X̄ − μ)/(S/√n) < t_{α/2}) = 1 − α
If we multiply all terms in the parentheses by S/√n, then subtract X̄ from each term and finally multiply all terms by −1, we get
P(X̄ − t_{α/2}S/√n < μ < X̄ + t_{α/2}S/√n) = 1 − α
Thus, we state the result as follows.

Confidence interval for μ: σ unknown, population is normal and n < 30

A (1 − α)×100% confidence interval for μ is
x̄ − t_{α/2}s/√n < μ < x̄ + t_{α/2}s/√n
Or, in short, CI = x̄ ± t_{α/2}s/√n,
where x̄ and s are the mean and standard deviation, respectively, of a sample of size n from an approximately normal population, and t_{α/2} is the value of the t-distribution with n − 1 degrees of freedom, leaving an area of α/2 to the right.

Example 4: It is desired to estimate the average age of students who graduate with PSM
diploma from DRMC. A random sample of 25 graduate students showed that the average age
was 27years with SD of 4. Construct a 99% confidence interval estimate of the true average
age of all such graduate students of the college. (Assume that the age of students who
graduate with PSM Diploma from DRMC follow a normal distribution.)
Solution
Given: n = 25, x̄ = 27 and s = 4.
Since the population SD is unknown and the sample size is less than 30, we use the t-score to form the confidence interval.
Since (1 − α)×100% = 99%, α = 0.01 and t_{α/2} = t_{0.005} = 2.797 with v = n − 1 = 24 degrees of freedom. Also, s/√n = 4/√25 = 0.8.
Therefore, CI = x̄ ± t_{α/2}s/√n = 27 ± 2.797(0.8) = 27 ± 2.24 = (24.76, 29.24)
This means, we are 99% confident that the average age of students who graduate with a PSM diploma from DRMC is somewhere between 24.76 years and 29.24 years.
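The same interval can be reproduced with software (Python/scipy assumed); t.ppf(0.995, 24) returns the 2.797 used above.

```python
from math import sqrt
from scipy.stats import t

xbar, s, n, conf = 27, 4, 25, 0.99
t_crit = t.ppf(1 - (1 - conf) / 2, df=n - 1)   # about 2.797
margin = t_crit * s / sqrt(n)                  # about 2.24
print(round(xbar - margin, 2), round(xbar + margin, 2))   # 24.76  29.24
```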

C. Interval estimation for the population mean μ: σ unknown and n ≥ 30

If the population standard deviation is unknown but the sample size is large (n ≥ 30), a (1 − α)×100% confidence interval for μ is given as follows:
CI = x̄ ± z_{α/2}s/√n
Example 5- Consider the sample

x 3 5 8 9 15

f 4 5 10 7 4

that is selected from a certain population. Construct a 95% confidence interval estimation for μ .

Solution

CI = x̄ ± z_{α/2}s/√n, where x̄ = 8, s = 3.4641, n = 30 and z_{α/2} = 1.96

Therefore, CI = 8 ± 1.96(3.4641)/√30 = 8 ± 1.24 = (6.76, 9.24)
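The sketch below recomputes Example 5 from the frequency table, confirming x̄ = 8, s ≈ 3.4641 and the interval (6.76, 9.24) (Python/scipy assumed).

```python
from math import sqrt
from scipy.stats import norm

values = [3, 5, 8, 9, 15]
freqs = [4, 5, 10, 7, 4]

n = sum(freqs)                                                      # 30
xbar = sum(v * f for v, f in zip(values, freqs)) / n                # 8.0
s2 = sum(f * (v - xbar) ** 2 for v, f in zip(values, freqs)) / (n - 1)
s = sqrt(s2)                                                        # about 3.4641

z = norm.ppf(0.975)                                                 # 1.96
margin = z * s / sqrt(n)                                            # about 1.24
print(round(xbar - margin, 2), round(xbar + margin, 2))             # 6.76  9.24
```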


Dear learner, so far we have seen how to estimate population mean from sample mean. Now,
in the next section we shall see how to estimate the difference between two population means
from the difference between two sample means.
Exercise.
1) A sample of 10 observations taken from a normally distributed population having
standard deviation of 5 produced the following data: 35 32 23 36 26 37 33
45 24 19. The 99% confidence interval for the population mean μ is:

a) (26.92,35.08) b) (24.92,34.08) c) (22.92,32.08)


2) A study was conducted to evaluate the stress level of senior business students at a
particular college. 40 students were selected at random from the senior business class,
and their stress level was found to be 35.8 micro-volts. In addition, the standard
deviation was found to be 3.5 microvolt. What would be the 95% confidence interval
on the true mean for all seniors in the class?
a) (34.78, 36.82) b) (35.03, 36.57) c) (34.72, 36.88)
3) For n = 121, sample mean = 96, and a known population standard deviation 24,
construct a 99% confidence interval for the population mean.
a) (92.72, 99.28) b) (93.51, 98.49) c) (90.37, 101.63)

4) For n = 25, sample mean = 645, and s = 45, construct a 95% confidence interval for
the population mean. Assume that population is normally distributed.
a) (614.23, 675.77) b) (622.30, 667.70) c) (626.42, 663.58)
5) A telephone company wants to estimate the mean number of minutes people in a city
spend talking long distance with 99% confidence. From past records, an estimate of
the standard deviation is 22 minutes. What is the minimum sample size required if the
desired error is 5 minutes?
a) 129 b) 39 c) 23
6) A cable TV company would like to estimate the average number of hours its
costumers spend watching cable TV per day. What sample size is needed if the
company wants to have 95% confidence that its estimate is correct to within 25
minutes? The standard deviation estimated from previous studies is 3hrs.
a) 200 b) 300 c) 400
7) The management of a local restaurant wants to estimate the average amount their
customers spend at the restaurant to within $1.50, with a 99% confidence. What is the
minimum sample size required, if the standard deviation is assumed to be $3.50?
a) 327 b) 189 c) 37
8) A 99% confidence interval for μ is being formulated based on a sample of size 26.
What is the appropriate value for t?
a) 2.78744 b) 2.94712 c) 2.13121

1.3. Estimating the Difference between Two Population Means (μ1 − μ2)

Dear learner, in this section, as far as interval estimation of the difference of two population means is concerned, we will try to consider three cases. The first is the case where σ1 and σ2 are known and the populations are normal (or n1, n2 ≥ 30); the second is the case where σ1 and σ2 are unknown, the populations are normal and n1, n2 < 30; and the third case is where σ1 and σ2 are unknown and n1, n2 ≥ 30.

A. Confidence interval for μ1 − μ2: σ1 and σ2 are known, populations are normal (or n1, n2 ≥ 30) (Independent Samples)

If we have two populations with means μ1 and μ2 and variances σ1² and σ2², respectively, a point estimator of the difference between μ1 and μ2 (i.e. μ1 − μ2) is given by the statistic X̄1 − X̄2. Therefore, to obtain a point estimate of μ1 − μ2, we select two independent random samples, one from each population, of sizes n1 and n2, and compute the difference x̄1 − x̄2 of the sample means.
If our independent samples are selected from normal populations, or if n1, n2 ≥ 30, we can establish a confidence interval for μ1 − μ2 by considering the sampling distribution of X̄1 − X̄2. In such a case, the sampling distribution of X̄1 − X̄2 is approximately normal with mean μ_{X̄1−X̄2} = μ1 − μ2 and standard deviation σ_{X̄1−X̄2} = √(σ1²/n1 + σ2²/n2). Therefore,
Z = [(X̄1 − X̄2) − (μ1 − μ2)] / √(σ1²/n1 + σ2²/n2)
is approximately a standard normal random variable.
Now, let z_{α/2} represent the Z value above which we find an area of α/2. Then,
P(−z_{α/2} < Z < z_{α/2}) = 1 − α
If we substitute the expression above for Z, we get the following:
P(−z_{α/2} < [(X̄1 − X̄2) − (μ1 − μ2)]/√(σ1²/n1 + σ2²/n2) < z_{α/2}) = 1 − α
If we multiply all terms in the parentheses by √(σ1²/n1 + σ2²/n2), then subtract X̄1 − X̄2 from each term and finally multiply all terms by −1, we get the following equation:
P((X̄1 − X̄2) − z_{α/2}√(σ1²/n1 + σ2²/n2) < μ1 − μ2 < (X̄1 − X̄2) + z_{α/2}√(σ1²/n1 + σ2²/n2)) = 1 − α
Thus, we state the result as follows.

Confidence interval for μ1 − μ2: σ1 and σ2 known, populations normal (or n1, n2 ≥ 30)

A (1 − α)×100% confidence interval for μ1 − μ2 is
(x̄1 − x̄2) − z_{α/2}√(σ1²/n1 + σ2²/n2) < μ1 − μ2 < (x̄1 − x̄2) + z_{α/2}√(σ1²/n1 + σ2²/n2)
Or, in short, CI = (x̄1 − x̄2) ± z_{α/2}√(σ1²/n1 + σ2²/n2),
where x̄1 and x̄2 are the means of independent random samples of size n1 and n2 from populations with known variances σ1² and σ2², respectively, and z_{α/2} is the value of the standard normal distribution leaving an area of α/2 to the right.
Example 6
Assume that two samples of size 16 and 25 are selected from two normally distributed populations having standard deviations 4 and 5, respectively. If the sample means are 60 and 50, respectively, construct a 95% confidence interval estimate for μ1 − μ2.
Solution:
CI = (x̄1 − x̄2) ± z_{α/2}√(σ1²/n1 + σ2²/n2), where x̄1 − x̄2 = 60 − 50 = 10, z_{α/2} = 1.96, σ1 = 4, σ2 = 5, n1 = 16 and n2 = 25.
Therefore, CI = (x̄1 − x̄2) ± z_{α/2}√(σ1²/n1 + σ2²/n2) = 10 ± 1.96√(4²/16 + 5²/25) = 10 ± 2.772 = (7.228, 12.772)
This means, we are 95% confident that the difference between the means is between 7.228 and 12.772.
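A quick check of Example 6 follows (Python/scipy assumed).

```python
from math import sqrt
from scipy.stats import norm

x1, var1, n1 = 60, 16, 16      # sigma1^2 = 4^2 = 16
x2, var2, n2 = 50, 25, 25      # sigma2^2 = 5^2 = 25

z = norm.ppf(0.975)                            # 1.96
margin = z * sqrt(var1 / n1 + var2 / n2)       # about 2.77
diff = x1 - x2
print(round(diff - margin, 3), round(diff + margin, 3))   # 7.228  12.772
```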

B. Confidence interval for μ1 − μ2: σ1 and σ2 are unknown, populations are normal and n1, n2 < 30 (Independent Samples)

If σ1 and σ2 are unknown, we can assume either of the following two cases.

Case 1: Assume σ1 = σ2 = σ (unknown)

As long as the samples are taken randomly and independently from normal populations, the variable
T = [(X̄1 − X̄2) − (μ1 − μ2)] / (s_p√(1/n1 + 1/n2))
will have a t-distribution with n1 + n2 − 2 degrees of freedom. Since σ is unknown, we estimate it by the pooled standard deviation s_p, which is defined by
s_p = √[((n1 − 1)s1² + (n2 − 1)s2²)/(n1 + n2 − 2)]
Now, let t_{α/2} represent the T value above which we find an area of α/2. Then,
P(−t_{α/2} < T < t_{α/2}) = 1 − α
If we substitute the expression above for T and rearrange the inequality as before, we get
P((X̄1 − X̄2) − t_{α/2}s_p√(1/n1 + 1/n2) < μ1 − μ2 < (X̄1 − X̄2) + t_{α/2}s_p√(1/n1 + 1/n2)) = 1 − α
Thus, we state the result as follows.

Confidence interval for μ1 − μ2: σ1 = σ2 but unknown, populations are normal and n1, n2 < 30

A (1 − α)×100% confidence interval for μ1 − μ2 is
(x̄1 − x̄2) − t_{α/2}s_p√(1/n1 + 1/n2) < μ1 − μ2 < (x̄1 − x̄2) + t_{α/2}s_p√(1/n1 + 1/n2)
Or, in short, CI = (x̄1 − x̄2) ± t_{α/2}s_p√(1/n1 + 1/n2),
where x̄1 and x̄2 are the means of small independent random samples of size n1 and n2 from normal populations, s_p is the pooled standard deviation, and t_{α/2} is the value of the t-distribution with n1 + n2 − 2 degrees of freedom, leaving an area of α/2 to the right.

Example 7: In a batch chemical process, two catalysts are compared for their effect on the output of
the process reaction. A sample of 12 batches is prepared using catalyst 1 and a sample of 10
batches was obtained using catalyst 2. The 12 batches for which catalyst 1 was used gave an
average yield of 85 with a sample standard deviation of 4, while the average for the second
sample gave an average of 81 and a sample standard deviation of 5. Find a 90% confidence
interval for the difference between the population means, assuming the populations are
approximately normally distributed with equal variances.
Solution

Let μ1 and μ2 represent the population mean yields using catalyst 1 and catalyst 2, respectively. We wish to find a 90% confidence interval for μ1 − μ2. Our point estimate of μ1 − μ2 is x̄1 − x̄2 = 85 − 81 = 4. The pooled estimate, s_p², of the common variance σ² is
s_p² = [(n1 − 1)s1² + (n2 − 1)s2²]/(n1 + n2 − 2) = [11(16) + 9(25)]/(12 + 10 − 2) = 20.05
Taking the square root, we have s_p = 4.478.
Using α = 0.10, we find in the t-table that t_{0.05} = 1.725 for v = n1 + n2 − 2 = 20 degrees of freedom. Therefore, substituting in the formula
CI = (x̄1 − x̄2) ± t_{α/2}s_p√(1/n1 + 1/n2)
we obtain the 90% confidence interval
4 − 1.725(4.478)√(1/12 + 1/10) < μ1 − μ2 < 4 + 1.725(4.478)√(1/12 + 1/10)
which simplifies to
0.69 < μ1 − μ2 < 7.31
Hence we are 90% confident that the interval from 0.69 to 7.31 contains the true difference of
the yields for the two catalysts. The fact that both limits are positive indicates that catalyst 1
is superior to catalyst 2.
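The pooled-variance interval of Example 7 can be reproduced as follows (Python/scipy assumed).

```python
from math import sqrt
from scipy.stats import t

x1, s1, n1 = 85, 4, 12
x2, s2, n2 = 81, 5, 10
conf = 0.90

sp = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))   # about 4.478
t_crit = t.ppf(1 - (1 - conf) / 2, df=n1 + n2 - 2)                 # about 1.725
margin = t_crit * sp * sqrt(1 / n1 + 1 / n2)                       # about 3.31
diff = x1 - x2
print(round(diff - margin, 2), round(diff + margin, 2))            # 0.69  7.31
```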

Case 2: Assume σ1 ≠ σ2 (both unknown)

If σ1 and σ2 are unknown and cannot be assumed equal, then as long as the two samples are selected randomly and independently from two normal populations, the statistic
T′ = [(X̄1 − X̄2) − (μ1 − μ2)] / √(S1²/n1 + S2²/n2)
has approximately a t-distribution with v degrees of freedom, where
v = (s1²/n1 + s2²/n2)² / [ (s1²/n1)²/(n1 − 1) + (s2²/n2)²/(n2 − 1) ]
Since v is seldom an integer, we round it off to the nearest whole number.
Now, let t_{α/2} represent the T value above which we find an area of α/2. Then,
P(−t_{α/2} < T′ < t_{α/2}) = 1 − α
If we substitute the expression above for T′ and rearrange the inequality as before, we get
P((X̄1 − X̄2) − t_{α/2}√(S1²/n1 + S2²/n2) < μ1 − μ2 < (X̄1 − X̄2) + t_{α/2}√(S1²/n1 + S2²/n2)) = 1 − α
Thus, we state the result as follows.

Confidence interval for μ1 − μ2: σ1 ≠ σ2 and unknown, populations are normal and n1, n2 < 30

A (1 − α)×100% confidence interval for μ1 − μ2 is
(x̄1 − x̄2) − t_{α/2}√(s1²/n1 + s2²/n2) < μ1 − μ2 < (x̄1 − x̄2) + t_{α/2}√(s1²/n1 + s2²/n2)
Or, in short, CI = (x̄1 − x̄2) ± t_{α/2}√(s1²/n1 + s2²/n2),
where x̄1, x̄2 and s1², s2² are the means and variances of small independent random samples of size n1 and n2 from normal populations, and t_{α/2} is the value of the t-distribution with v degrees of freedom, leaving an area of α/2 to the right.

Example 8: Records of the past 15 years have shown the average rainfall in a certain region
of the country for the month May to be 4.93 centimeter, with a standard deviation of 1.14
centimeters. A second region of the country has had an average rainfall in May of 2.64
centimeters of rain, with a standard deviation of 0.66 centimeters during the past 10 years.
Find a 95% confidence interval for the difference of the true average rainfalls in these two
regions, assuming that the observations came from normal populations with different
variances.

Solution: For the first region we have x̄1 = 4.93, s1 = 1.14 and n1 = 15. For the second region we have x̄2 = 2.64, s2 = 0.66 and n2 = 10. We wish to find a 95% confidence interval for μ1 − μ2. Since the population variances are assumed to be unequal and our sample sizes are not the same, we can only find an approximate 95% confidence interval based on the t-distribution with v degrees of freedom, where
v = (1.14²/15 + 0.66²/10)² / [ (1.14²/15)²/(15 − 1) + (0.66²/10)²/(10 − 1) ] = 22.7 ≈ 23
Our point estimate of μ1 − μ2 is x̄1 − x̄2 = 4.93 − 2.64 = 2.29. Using α = 0.05, we find in the t-table that t_{0.025} = 2.069 for v = 23 degrees of freedom. Therefore, substituting in the formula
CI = (x̄1 − x̄2) ± t_{α/2}√(s1²/n1 + s2²/n2)
we obtain the 95% confidence interval
2.29 − 2.069√(1.14²/15 + 0.66²/10) < μ1 − μ2 < 2.29 + 2.069√(1.14²/15 + 0.66²/10)
which simplifies to
1.54 < μ1 − μ2 < 3.04
Hence we are 95% confident that the interval from 1.54 to 3.04 contains the true difference of
the average rainfall for the two regions.
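The degrees-of-freedom formula and the interval of Example 8 can be checked numerically (Python/scipy assumed; like the worked solution, the sketch rounds v ≈ 22.7 to 23 before the critical-value lookup).

```python
from math import sqrt
from scipy.stats import t

x1, s1, n1 = 4.93, 1.14, 15
x2, s2, n2 = 2.64, 0.66, 10
conf = 0.95

a, b = s1**2 / n1, s2**2 / n2
v = (a + b) ** 2 / (a**2 / (n1 - 1) + b**2 / (n2 - 1))   # about 22.7 -> 23
t_crit = t.ppf(1 - (1 - conf) / 2, df=round(v))          # about 2.069
margin = t_crit * sqrt(a + b)                            # about 0.75
diff = x1 - x2                                           # 2.29
print(round(v, 1), round(diff - margin, 2), round(diff + margin, 2))   # 22.7  1.54  3.04
```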

C. Confidence interval for μ1 − μ2: σ1 and σ2 are unknown and n1, n2 ≥ 30 (Independent Samples)

If σ1 and σ2 are unknown and n1, n2 ≥ 30, a (1 − α)×100% confidence interval for μ1 − μ2 is given as follows:
CI = (x̄1 − x̄2) ± z_{α/2}√(s1²/n1 + s2²/n2)

Example 9: A standardized mathematics test was given to 50 girls and 75 boys. The girls
made an average grade of 76 with a standard deviation of 6, while the boys made an average
grade of 82 with a standard deviation of 8. Find a 96% confidence interval for the difference
μ1 − μ2, where μ1 is the mean score of all boys and μ2 is the mean score of all girls who might take this test.
Solution

The point estimate of μ1 − μ2 is x̄1 − x̄2 = 82 − 76 = 6. Since n1 and n2 are both large, we can substitute s1 for σ1 and s2 for σ2. Using the standard normal table, for a 96% confidence level we find z_{α/2} = z_{0.02} = 2.054. Therefore, substituting into the formula
CI = (x̄1 − x̄2) ± z_{α/2}√(s1²/n1 + s2²/n2)
we obtain the 96% confidence interval
6 − 2.054√(8²/75 + 6²/50) < μ1 − μ2 < 6 + 2.054√(8²/75 + 6²/50)
which simplifies to
3.42 < μ1 − μ2 < 8.58
This means, we are 96% confident that the difference between the mean score of all boys and the mean score of all girls who might take this test is between 3.42 and 8.58.
Dear learner, so far we have seen how to estimate the difference between population means
from the difference between sample means. In the coming section we will see how to
estimate population proportion from sample proportion.
Exercise
1) Suppose you are given the following information about two independent large samples:
x̄1 = ___, n1 = ___, s1 = ___ and x̄2 = ___, n2 = ___, s2 = ___ (n1, n2 ≥ 30)
I. The standard error for the difference of sample means (σ_{X̄1−X̄2}) is:
a) 1 b) 2 c) 0.5 d) 0.25
II. The 99% confidence interval for the difference between the populations' means (μ1 − μ2) is:
a) ___ b) ___ c) ___ d) ___
2) A standardized Economics test was given to 60 girls and 50 boys. The girls made an
average grade of 70 with standard deviation of 5, while the boys made an average
grade of 80 with standard deviation of 8. Find a 95% confidence interval estimate for
the difference μ1 − μ2, where μ1 is the mean score of all boys and μ2 is the mean
score of all girls who might take this course.
a. (5.96, 14.04) b. (4.23, 8.75) c. (7.45, 12.55) d. None
3) From a certain school 45 boys and 40 girls showed that their average age is 18 years
and 16 years with standard deviations 2 years and 2.5 years, respectively. The 99%
confidence interval estimate of the difference of the average age of all boys (μ1) and
the average age of all girls (μ2) in the school (μ1 − μ2) is:
a. (0.72, 3.28) b. (1.03, 2.97) c. (1.04, 2.96) d. None

1.4. Estimating a Proportion (P)

Dear learner, in section 3.4 of Module-I, we have discussed that the sample proportion P follows approximately a normal distribution with mean μ_P = p and standard deviation σ_P = √(pq/n), provided that n is large (np > 5 and nq > 5).
Thus, the standard normal random variable that corresponds to the normally distributed random variable P is given by
Z = (P − μ_P)/σ_P = (P − p)/√(pq/n)
Therefore, we can assert that
P(−z_{α/2} < Z < z_{α/2}) = 1 − α
That is,
P(−z_{α/2} < (P − p)/√(pq/n) < z_{α/2}) = 1 − α
If we multiply all terms in the parentheses by √(pq/n), then subtract P from each term and finally multiply all terms by −1, we get
P(P − z_{α/2}√(pq/n) < p < P + z_{α/2}√(pq/n)) = 1 − α
When n is large, very little error is introduced by substituting the sample proportion P for p under the radical sign. Thus, we state the result as follows.

Confidence interval for p

A (1 − α)×100% confidence interval for the population proportion p is
P − z_{α/2}√(Pq/n) < p < P + z_{α/2}√(Pq/n)
Or, in short, CI = P ± z_{α/2}√(Pq/n),
where P is the proportion of successes in a random sample of size n, q = 1 − P, and z_{α/2} is the value of the standard normal distribution leaving an area of α/2 to the right.

Example 10: A survey of 500 people shopping at a supermarket, selected at random, showed
that 340 of them used credit cards for their purchases and the rest used cash. Construct a 95%
confidence interval estimate of the proportion of all persons at the supermarket who use
credit card for shopping.
Solution:
Given: n = 500 and x = 340, so P = x/n = 340/500 = 0.68 and q = 1 − P = 0.32.
Since (1 − α)×100% = 95%, α = 0.05 and z_{α/2} = z_{0.025} = 1.96. Also,
σ_P = √(Pq/n) = √(0.68 × 0.32/500) = √0.000435 = 0.021
Therefore, CI = P ± z_{α/2}σ_P = 0.68 ± 1.96(0.021) = 0.68 ± 0.04 = (0.64, 0.72)
This means, we are 95% confident that the proportion of all persons at the supermarket who
use credit card for shopping is somewhere between 0.64 and 0.72.
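Example 10 in code (Python/scipy assumed):

```python
from math import sqrt
from scipy.stats import norm

x, n, conf = 340, 500, 0.95
P = x / n                               # 0.68
z = norm.ppf(1 - (1 - conf) / 2)        # 1.96
margin = z * sqrt(P * (1 - P) / n)      # about 0.041
print(round(P - margin, 3), round(P + margin, 3))   # 0.639  0.721
```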

It is evident that while we are estimating population proportion from sample proportion, we
usually commit error. Thus, the following discussion will illustrate this error.
Error (e)

The (1 − α)×100% confidence interval provides an estimate of the accuracy of our point estimate. Clearly, at the center of our confidence interval we have the sample proportion P. Now, if P is actually equal to p, then we say P estimates p without error. But, most of the time, P will not be exactly equal to p. Thus, in estimating p by P, we usually commit an error. The size of this error is the difference between P and p, and we can be (1 − α)×100% confident that this difference will be less than z_{α/2}√(Pq/n). That is,
error (e) = |P − p| ≤ z_{α/2}√(Pq/n)
Now, we have the following theorem.

If P is used as an estimate of p, we can be (1 − α)×100% confident that the error will be less than z_{α/2}√(Pq/n).

Frequently, we want to know how large a sample is necessary to ensure that the error in estimating p will be less than a specified amount e. Thus, the following discussion will illustrate this case.

Sample Size Determination

By the above theorem, to ensure that the error in estimating p is less than a specified amount e, we must choose n such that z_{α/2}√(pq/n) = e. Simple calculation shows that
n = z_{α/2}² p q / e²
Now, we have the following theorem.

If P is used as an estimate of p, we can be (1 − α)×100% confident that the error will be less than a specified amount e when the sample size is n = z_{α/2}² p q / e².

If no previous survey has been conducted, we take p = q = 0.5, since pq = p(1 − p) attains its maximum value of 1/4 at p = 1/2; in that case n = z_{α/2}²/(4e²).

Example 11: It is desired to estimate the proportion of children watching television on
Saturday morning in order to develop a promotional strategy for electronic games. We want
α=p(Type−I er or ) = p(Rejecting H / H is true ) 0 0

to be 95% confident that our estimate will be within =p(X<480/ μ=500) + p(X>520/ μ=500) of the true population proportion.
a) What sample size should we take if a previous survey showed that 40% of children
watched television on Saturday morning?
Solution

Given: e = 0.02, a 95% confidence level (so zα/2 = 1.96), and p̂ = 0.40, q̂ = 0.60 from the previous survey. Thus,

n = (zα/2)² p̂q̂ / e² = (1.96)² (0.40)(0.60) / (0.02)² = 0.9220 / 0.0004 ≈ 2304.96

So, we take n = 2305.

That means, if we take a sample of size at least 2,305, as far as the previous survey showed that 40% of children watched television on Saturday morning, we can be 95% confident that our estimate will be within 0.02 of the true population proportion.
b) What would the sample size be, for the same degree of confidence and same maximum
allowable error, if no such previous survey had been taken?
Solution

Since no previous survey has been taken, we set p̂ = q̂ = 0.5. Thus,

n = (zα/2)² p̂q̂ / e² = (1.96)² (0.5)(0.5) / (0.02)² = 3.8416 × 0.25 / 0.0004 = 0.9604 / 0.0004 = 2401

That means, if we take a sample of size at least 2,401, as far as no previous survey had been taken, we can be 95% confident that our estimate will be within 0.02 of the true population proportion.
Example 12: It is desired to estimate the proportion of junior executives who change their
first job within 5 years. The proportion is to be estimated within 3% of error and 99% degree
of confidence is to be used. A study conducted several years ago revealed that 30% of such
junior executives changed their first job within 5 years.
a) How large a sample is required to update the study?
Solution:
Given: e = 0.03, a 99% confidence level (so zα/2 = 2.58), and p̂ = 0.30, q̂ = 0.70 from the earlier study. Thus,

n = (zα/2)² p̂q̂ / e² = (2.58)² (0.30)(0.70) / (0.03)² = 6.6564 × 0.21 / 0.0009 = 1.3978 / 0.0009 ≈ 1553.2

So, we take n = 1554.

That means, if we take a sample of size at least 1,554, as far as the previous study showed that 30% of such junior executives changed their first job within 5 years, we can be 99% confident that our estimate will be within 0.03 of the true population proportion.
b) How large should the sample be if no such previous estimates are available?

Solution

Since no previous estimate is available, we set p̂ = q̂ = 0.5. Thus,

n = (zα/2)² p̂q̂ / e² = (2.58)² (0.5)(0.5) / (0.03)² = 6.6564 × 0.25 / 0.0009 = 1.6641 / 0.0009 = 1849

That means, if we take a sample of size at least 1,849, as far as no previous survey had been taken, we can be 99% confident that our estimate will be within 0.03 of the true population proportion.
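The sample-size formula is easy to automate. The following Python sketch (function name is illustrative) reproduces the figures in Examples 11 and 12; note that the module rounds zα/2 to 2.58 for 99% confidence, which shifts the result slightly.

```python
# Minimal sketch: required sample size for estimating a proportion within
# error e at a given confidence level, n = z^2 * p * q / e^2.
from math import ceil
from scipy.stats import norm

def sample_size(e, conf, p=0.5):
    z = norm.ppf(1 - (1 - conf) / 2)          # z_{alpha/2}
    return ceil(z**2 * p * (1 - p) / e**2)

print(sample_size(0.02, 0.95, p=0.40))  # 2305
print(sample_size(0.02, 0.95))          # 2401 when no prior estimate exists
print(sample_size(0.03, 0.99, p=0.30))  # 1549 with the exact z; the text uses z = 2.58 and gets 1554
print(sample_size(0.03, 0.99))          # 1844 with the exact z; 1849 with z = 2.58
```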

Exercise.
1) An anthropologist, studying the male/female ratio in a certain region, took a random
sample of 100 persons and found that 40 of them were females. A 99% confidence
interval estimate for the population proportion of females in the region is:
a) (0.127,0.273) b) (0.274,0.526) c) (0.327,0.473)

2) In a department store, it is found that out of a randomly selected 64 customers, 24 buy


cards. Construct a 99% confidence interval for the proportion of customers buying
cards.
a) (0.274, 0.526) b) (0.219, 0.531) c) (0.18, 0.48)

3) In a random sample of 1000 items, it was found that 120 of them are defective.
Construct a 99% confidence interval for the proportion of defective items in the
population.
a) (0.12, 0.28) b) (0.10, 0.30) c) (0.09, 0.15)
4) A sample of 36 students from D.R.M.C. revealed that 30 of them are males. What is
the maximum error that we encounter in estimating the population proportion of male
by the sample proportion of males? Assume that 95% level of confidence is required.
a) 0.2027 b) 0.1540 c) 0.1217
5) Assume that you want to estimate the proportion of major students of D.R.M.C. within a 0.15 error margin. What is the minimum sample size required if you want to be
99% confident in your estimate?
a) 97 b) 167 c) 74
1.5. Estimating the Difference between Two Proportions (p1 − p2)
Dear learner, consider two independent samples of sizes n1 & n2 drawn at random from two binomial populations with proportions p1 & p2 and variances p1q1/n1 & p2q2/n2 respectively. A point estimator of the difference between the two proportions p1 − p2 is given by p̂1 − p̂2.

A confidence interval for p1 − p2 can be established by considering the sampling distribution of p̂1 − p̂2, which is approximately normally distributed with mean p1 − p2 and standard deviation √(p1q1/n1 + p2q2/n2), provided that n1 and n2 are large.

Thus, the standard normal random variable that corresponds to the normally distributed random variable p̂1 − p̂2 will be given by

Z = [(p̂1 − p̂2) − (p1 − p2)] / √(p1q1/n1 + p2q2/n2)

Therefore, we can assert that

P(−zα/2 < Z < zα/2) = 1 − α.  That is,

P(−zα/2 < [(p̂1 − p̂2) − (p1 − p2)] / √(p1q1/n1 + p2q2/n2) < zα/2) = 1 − α.

If we multiply all terms in the parenthesis by √(p1q1/n1 + p2q2/n2), then subtract p̂1 − p̂2 from each term and at last multiply all terms by −1, we will get the following equation.

P((p̂1 − p̂2) − zα/2 √(p1q1/n1 + p2q2/n2) < p1 − p2 < (p̂1 − p̂2) + zα/2 √(p1q1/n1 + p2q2/n2)) = 1 − α

When n1 and n2 are large, very little error is introduced by substituting p̂1, q̂1, p̂2 and q̂2 under the radical sign. Then we write

P((p̂1 − p̂2) − zα/2 √(p̂1q̂1/n1 + p̂2q̂2/n2) < p1 − p2 < (p̂1 − p̂2) + zα/2 √(p̂1q̂1/n1 + p̂2q̂2/n2)) ≈ 1 − α

Thus, we state the result as follows.

Confidence interval for p1 − p2:

A (1 − α)100% confidence interval for p1 − p2 is

(p̂1 − p̂2) − zα/2 √(p̂1q̂1/n1 + p̂2q̂2/n2) < p1 − p2 < (p̂1 − p̂2) + zα/2 √(p̂1q̂1/n1 + p̂2q̂2/n2)

Or in short, (p̂1 − p̂2) ± zα/2 √(p̂1q̂1/n1 + p̂2q̂2/n2),

where p̂1 and p̂2 are the proportions of success in random samples of sizes n1 and n2 respectively, q̂1 = 1 − p̂1, q̂2 = 1 − p̂2, and zα/2 is the value of the standard normal distribution leaving an area of α/2 to the right.

Example 13: A certain change in a manufacturing procedure for component parts are being
considered. Samples are taken using both the existing and the new procedure in order to
determine if the new procedure results in an improvement. If 75 of 1500 items from the
existing were found to be defective and 80 of 2000 items from the new procedure were found
to be defective, find a 90% confidence interval for the true difference in the fraction of
defectives between the existing and the new process.

Solution

Let p1 and p2 be the true proportions of defectives for the existing and new procedures, respectively. Hence, p̂1 = 75/1500 = 0.05 and p̂2 = 80/2000 = 0.04, and the point estimate of p1 − p2 is p̂1 − p̂2 = 0.05 − 0.04 = 0.01. Using the standard normal table, we find z0.05 = 1.645. Therefore, substituting into the formula

(p̂1 − p̂2) ± zα/2 √(p̂1q̂1/n1 + p̂2q̂2/n2)

we obtain the 90% confidence interval

0.01 − 1.645 √((0.05)(0.95)/1500 + (0.04)(0.96)/2000) < p1 − p2 < 0.01 + 1.645 √((0.05)(0.95)/1500 + (0.04)(0.96)/2000)

which simplifies to

−0.0017 < p1 − p2 < 0.0217

Since the interval contains the zero value, there is no reason to believe that the new procedure
produces a significant decrease in the proportion of defectives over the existing method.
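The same interval can be verified with a short Python sketch (variable names are illustrative):

```python
# Minimal sketch: 90% confidence interval for the difference of two proportions,
# reproducing Example 13 (75/1500 defectives versus 80/2000 defectives).
from math import sqrt
from scipy.stats import norm

n1, n2 = 1500, 2000
p1, p2 = 75 / n1, 80 / n2               # 0.05 and 0.04
z = norm.ppf(1 - 0.10 / 2)              # z_{0.05} = 1.645 for 90% confidence
se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
diff = p1 - p2
print(diff - z * se, diff + z * se)     # roughly (-0.0017, 0.0217); contains zero
```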
Exercise

1) In a sample of 1000 high school students, 800 are females; whereas in a sample of
400 college students, 240 are females. Construct a 95% confidence interval estimate
of the difference in population proportion of female students in high schools and
colleges, where p1 is the population proportion of female students in high schools and
p2 is the population proportion of female students in colleges.

Unit Four
Hypothesis Testing

Introduction
In the previous unit, you have learnt how to use a sample statistic that is obtained from a
random sample to get an interval estimation of the corresponding population parameter. In
this unit, the focus is on hypothesis testing, another phase of statistical inference. Like
confidence interval estimation, hypothesis testing is based on sample information. A step-by-
step methodology is developed that enables us to make inferences about a population
parameter by analyzing differences between the results observed (the sample statistic) and the
results that are expected to be obtained if some underlying hypothesis is actually true.
In this unit you will be introduced to the hypothesis testing methodology, how to test
statistical hypothesis concerning population mean, the difference of two populations’ means,
population proportion and the difference of two populations’ proportions.
At the end of this unit, you are expected to:
 distinguish null hypothesis from alternative hypothesis;
 find critical value(s) of a test statistic;
 identify regions of rejection and regions of non-rejection;
 differentiate type-I error from type-II error;
 differentiate one-sided test from two-sided test;
 identify the type of test required to test a given statistical hypothesis;
 conduct hypothesis testing about a population mean, the difference of two populations’
means, population proportion and the difference of two populations’ proportions.

2.1 Hypothesis Testing Methodology


The testing of statistical hypothesis is perhaps the most important area of decision theory.
Typically, it begins with some theory, or assertion about a particular parameter of a
population.
To make the concept of statistical hypothesis a bit clear, let us consider the following
problem.
Decision problem
Assume a court claims that it provides final decision for a murder case in 8 days on average.
To test this claim, suppose you take 100 of such cases from the court’s document and
discover that the court took 10 days on average with standard deviation of 1 day. At a 0.01
level of significance, is there evidence that the court needs more than 8 days on average to
give final decision for a murder case?
You are not expected to answer the above question. But we will use the problem repeatedly
in the course of time and ultimately we will solve it. In the rest of this section, we will try to
discuss some important concepts that we usually encounter in any hypothesis testing
problems. These are, null hypothesis, alternative hypothesis, critical value of a test statistic,

regions of rejection and non-rejection, risks in decision making using hypothesis testing
methodology, one-tailed tests and two-tailed tests.
Null Hypothesis
A null hypothesis is an assertion about the population parameter that is being tested by the
sample results. It is an assertion that we hold as true unless we have sufficient evidence to
conclude otherwise. We usually formulate a null hypothesis with the hope of rejecting and it
is usually denoted by Ho.
A null hypothesis might assert that the population mean is equal to 100. Unless we obtain sufficient evidence that it is not 100, we will accept it as 100. Thus, we write the null hypothesis as:

H0: μ = 100

A hypothesis may assert that the parameter in question is at least or at most some value. For instance, the null hypothesis may assert that the population proportion is at least 40%. In this case, the null hypothesis becomes:

H0: p ≥ 0.40

Or, if the null hypothesis asserts that the population proportion is at most 30%, the null hypothesis becomes:

H0: p ≤ 0.30
In all cases, the equality sign “=” appears in the null hypothesis.
Although the idea of null hypothesis is simple, determining what the null hypothesis should
be in a given situation may be difficult. It is important to be clear about what exactly the null
hypothesis is; otherwise the test is meaningless. There are several ways to characterize the
null hypothesis. It can be

 the less interesting or less important situation, or the one that does not require taking
any corrective action.
 the thing you want to disprove.
 the status quo - the assumption that nothing has changed from the past.
 your default position - the thing you would assume unless someone provided strong
evidence to the contrary.
To make the above characterization of a null hypothesis a bit clear, let us consider the
following examples.
Imagine an automatic bottling machine that fills 2-liter bottles with cola. Suppose a customer advocate suspects that the average amount of cola is less than 2 liters and wants to test it. Clearly, from the customer advocate's point of view, if the amount of cola is greater than or equal to 2 liters, no corrective action is needed, and therefore µ ≥ 2 liters should be the null hypothesis.
Now, let us take another look at the same bottling operation. Assume that customers are satisfied with the bottles. But the owner of the bottling company suspects that the machine is filling more than 2 liters on average and thus wasting cola.

In this case, from the owner's point of view, if the amount of cola is less than or equal to 2 liters, no corrective action is needed, and therefore µ ≤ 2 liters should be the null hypothesis.
There is a third point of view regarding the bottling operation, and that is the engineering point of view. Suppose the engineer who is in charge of the accuracy of the machine wants to test the average amount filled.
Obviously, from the engineer's point of view, no corrective action is needed only if the average amount of cola is equal to 2 liters, and therefore µ = 2 liters should be the null hypothesis.
Let us come back to the court’s case. From your (the researcher’s) point of view, if the court
takes on average less than or equal to 8 days, no corrective action is needed, and therefore µ
≤ 8 days should be the null hypothesis.
Alternative Hypothesis
The rejection of a null hypothesis Ho leads to the acceptance of a statement called alternative
hypothesis and is usually denoted by H 1. Thus, the alternative hypothesis represents the
conclusion reached by rejecting the null hypothesis if there is sufficient evidence from the
sample information to decide that the null hypothesis is unlikely to be true. In the court’s
case, the null hypothesis will be rejected only if the collected sample showed that the court actually took more than 8 days on average. Thus the alternative hypothesis will be

H1: μ > 8 days
There are also several ways to characterize the alternative. It can be


 the thing you want to prove;
 that what was true in the past is no longer true and
 that something interesting or important or requiring action has occurred
Because the null hypothesis and the alternative hypothesis assert exactly opposite statements,
only one of them can be true. Rejecting one is equivalent to accepting the other. In general,
the null and alternative hypotheses pair can be given as follows:

H0: θ = θ0 vs H1: θ > θ0;   H0: θ = θ0 vs H1: θ < θ0;   H0: θ = θ0 vs H1: θ ≠ θ0.
Test Statistic
The logic behind the hypothesis testing methodology can be developed by thinking about how sample information can be used to determine the plausibility of the null hypothesis. In the court scenario, the null hypothesis is that the court takes at most 8 days on average to handle a murder case (H0: μ ≤ 8 days). The sample of 100 randomly selected murder cases from the court's document revealed that the court took 10 days on average (x̄ = 10 days) with a standard deviation of 1 day (s = 1 day). Even if the null hypothesis is true, the statistic x̄ can be greater than 8 days because of variation due to sampling. However, one expects the sample statistic x̄ to be less than or "very close" to 8 days if the null hypothesis (H0: μ ≤ 8 days) is true. In such a situation there is insufficient evidence to reject the null hypothesis. On the other hand, if the value of the statistic x̄ is "very much greater than" 8 days, the instinct is to conclude that the null hypothesis is unlikely to be true.
Unfortunately, the decision making process is not always so clear-cut and it can be left to the
individual’s subjective judgment to determine the meaning of “very close” , “very different”,
“very less than” and “very greater than”. Determining what “very close”, “very different”,
“very less than” and “very greater than” is arbitrary without definition. Hypothesis testing
provides clear definitions for evaluating such differences and enables you to quantify the
decision making process so that the probability of obtaining a given sample result can be
found if the null hypothesis is true. This is achieved by first determining the sampling
distribution of the sample statistic of interest (e.g. sample mean) and then computing the
particular test statistic based on the given sample result. Because the sampling distribution of
the test statistic often follows a well known statistical distribution (like Z, T, X 2), these
distributions can be used to determine the likelihood of a null hypothesis true.

Regions of Rejection and Non-rejection


The sampling distribution of a test statistic is divided into two regions: the region of rejection
(sometimes called the critical region) and the region of non-rejection. The rejection region of
a statistical hypothesis test is the range of numbers that will lead us to reject the null
hypothesis when the test statistic falls within this range. It is designed so that, before
sampling takes place, our test statistic will have a small probability of falling within the
region, if the null hypothesis is actually true.
The non-rejection region of a statistical hypothesis test is the range of numbers that will lead us not to reject the null hypothesis when the test statistic falls within this range. It is designed so that, before sampling takes place, our test statistic will have a high probability (1 − α) of falling within the region, if the null hypothesis is actually true.

[Figure: sampling distribution of a test statistic for a two-tailed test — the region of non-rejection lies between the two critical points, with a region of rejection in each tail.]
A point that separates the region of rejection from the region of non-rejection is referred to as
critical point. If the rejection region is on both tails of the distribution, then you can have two
distinct critical points. If the region of rejection is only on one tail of the distribution, either
on the right or on the left tail, then you can have only one critical point.
Well, failure to reject a null hypothesis is not a proof that it is true. One can never prove that
a null hypothesis is true, because the decision is based on the sample information, not on the
entire population. Therefore, if you fail to reject a null hypothesis, you can only conclude that
there is insufficient evidence to warrant its rejection. On the other hand, if the value of a test
statistic falls into the region of rejection, the null hypothesis is rejected, because the value is
unlikely if the null hypothesis is true. Therefore, rejection of a null hypothesis is to conclude
that it is false.
To make a decision concerning the null hypothesis, you first determine the critical value of
the test statistic. The critical value divides the non-rejection region from the rejection region.
The determination of this critical value depends on the size of rejection region. The size of
rejection region is directly related to the risk in using only sample evidence to make decisions
about a population parameter.

Risks in decision making using hypothesis testing methodology


In using a sample statistic to make decision about a population parameter, there is a risk that
an incorrect conclusion will be reached. Accordingly, two different types of errors can occur
when applying hypothesis testing methodology; Type-I error and Type-II error.
Type-I error
It is an error made in rejecting the null hypothesis when in fact it is true. For instance, in our
example, if the null hypothesis is true but the sample result leads us to reject it,
then we commit Type-I error. The probability of committing Type-I error is referred to as the
level of significance of the test.
Usually, one can control the Type-I error rate by deciding the risk level and that can be
tolerated in rejecting the null hypothesis, when in fact it is true. Because the level of
significance is specified before the test is performed, the risk of committing Type-I error, usually denoted by α, is directly under the control of the individual performing the test.

Most of the time, levels of 0.01, 0.05 or 0.10 are selected. The choice of a particular risk level
for making a Type-I error depends on the cost of making a Type-I error.

The complement (1 − α) of the probability of committing Type-I error is called the confidence coefficient. In terms of hypothesis testing methodology, the confidence coefficient represents the probability of concluding that the specified value of the parameter being tested under the null hypothesis is plausible when in fact it is true. When the confidence coefficient (1 − α) is multiplied by 100%, that means when it is expressed in terms of percent, then it is referred to as the confidence level.
Type-II error
It is an error resulting from accepting the null hypothesis when in fact it is false. For instance, in our example, if the null hypothesis is false but the sample result leads us not to reject it, then we commit Type-II error. The probability of committing Type-II error is usually denoted by β.
Unlike the Type-I error, which is controlled by the selection of α, the probability of making a Type-II error depends on the difference between the hypothesized and actual values of the population parameter. Because large differences are easier to find than small ones, if the difference between the hypothesized and actual values of the population parameter is large, the probability of making a Type-II error, β, is small. For instance, in the court's case, if the actual population mean were 15 days, then there would be only a small chance, β, of concluding that the population mean is less than or equal to 8 days. On the other hand, if the difference between the hypothesized and actual values of the population parameter is small, the probability of committing Type-II error, β, is large. Thus, again in the court's scenario, if the actual population mean were 9 days, then there would be a high probability, β, of concluding that the population mean is less than or equal to 8 days.

The complement (1 − β) of the Type-II error is called the power of a statistical test. It represents the probability of rejecting a null hypothesis when in fact it is false.

Now, we summarize what we have discussed regarding the level of significance (α), confidence level (1 − α), β-risk and power of a test (1 − β) using the following table.
                               Actual situation
Statistical Decision     H0 is true                        H0 is false
Accept H0                Correct Decision                  Type-II error
                         Confidence level = (1 − α)        P(Type-II error) = β
Reject H0                Type-I error                      Correct Decision
                         P(Type-I error) = α               Power of the test = (1 − β)
Note that

 Type-I and Type-II errors are inversely related. That is, an increase in one of them results in a decrease in the other.

 The size of the critical region, and therefore the probability of committing Type-I error, can always be reduced by adjusting the critical value(s).

 An increase in sample size, n, will reduce α and β simultaneously.

 If H0 is false, β is maximum when the true value of the parameter is close to the hypothesized value. The greater the distance between the true value and the hypothesized value, the smaller β will be.
One-tailed and two-tailed tests
A test of a statistical hypothesis, where the alternative is one-sided, such as

H0: μ = μ0 vs H1: μ > μ0    or    H0: μ = μ0 vs H1: μ < μ0

is called a one-sided test. The critical region for the alternative hypothesis H1: μ > μ0 lies entirely in the right tail of the distribution and therefore the test corresponding to such an alternative hypothesis is referred to as a right-tailed test (RTT). Whereas the critical region for the alternative hypothesis H1: μ < μ0 lies entirely in the left tail of the distribution and therefore the test corresponding to such an alternative hypothesis is referred to as a left-tailed test (LTT).

A test of a statistical hypothesis, where the alternative is two-sided, that is

H0: μ = μ0 vs H1: μ ≠ μ0,

is called a two-tailed test (TTT). The critical region for the alternative hypothesis H1: μ ≠ μ0 lies on both tails of the distribution.
[Figure: right-tailed, left-tailed and two-tailed tests — the rejection region lies in the right tail, the left tail, or both tails respectively, bounded by the critical point(s).]

Some frequently used critical values

α        zα/2 (two-tailed)      zα (one-tailed)
0.05     1.96                   1.645
0.02     2.33                   2.05
0.01     2.58                   2.33

A test is said to be significant if the null hypothesis is rejected at α = 0.05, and is considered highly significant if the null hypothesis is rejected at α = 0.01.
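The critical values in the table above come straight from the inverse CDF of the standard normal distribution; a short Python sketch confirms them:

```python
# Minimal sketch: standard normal critical values for common significance levels.
from scipy.stats import norm

for alpha in (0.05, 0.02, 0.01):
    z_two_tailed = norm.ppf(1 - alpha / 2)   # cuts off alpha/2 in each tail
    z_one_tailed = norm.ppf(1 - alpha)       # cuts off alpha in one tail
    print(alpha, round(z_two_tailed, 2), round(z_one_tailed, 2))
# prints 1.96 / 1.64(5), 2.33 / 2.05, and 2.58 / 2.33
```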
Exercise
1) _______________is an assertion about the population parameter that is being tested by
the sample results.
2) _______________is an error resulting from accepting the null hypothesis when in fact
it is false.
3) _______________is an error made in rejecting the null hypothesis when in fact it is
true.
4) _______________is a claim about a population parameter that is accepted when the
null hypothesis is rejected.
5) The probability of making Type-I error is referred to as_______________.
6) The complement of the probability of making Type-I error is called ___________.

7) The probability of making Type-II error is referred to as_______________.
8) The complement of the probability of making Type-II error is called __________.
2.2 Tests Concerning a Population Mean ( μ )
This type of testing involves decisions to check whether a reported population mean is
reasonable or not, compared to the sample mean computed from the sample taken from the
population.
The steps for testing a hypothesis about a population mean against the corresponding
alternatives are summarized as follows.
1) State the appropriate null hypothesis.
2) State the corresponding alternative hypothesis.
3) Choose the level of significance α.
4) Determine the test statistic (Z or T).
5) Determine the critical region (CR).
   a. For H1: μ < μ0 (LTT), the critical region is Z < −zα (or T < −tα).
   b. For H1: μ > μ0 (RTT), the critical region is Z > zα (or T > tα).
   c. For H1: μ ≠ μ0 (TTT), the critical region is Z < −zα/2 and Z > zα/2 (or T < −tα/2 and T > tα/2).
6) Computation: Compute the relevant test statistic (Z or T).
7) Decision and Conclusion: Reject H 0 if the test statistic value falls in the critical
region; otherwise do not reject H0.
Dear learner, the following table, together with the above seven-step procedure, may help you
make decisions regarding a test on population mean.

Assumptions, test statistics and critical regions for testing H0: μ = μ0:

Case 1: σ is known; the population is normal or n ≥ 30.
  Test statistic: Z = (X̄ − μ0)/(σ/√n)
  H1: μ < μ0 → reject H0 if Z < −zα
  H1: μ > μ0 → reject H0 if Z > zα
  H1: μ ≠ μ0 → reject H0 if Z < −zα/2 or Z > zα/2

Case 2: σ is unknown; n ≥ 30.
  Test statistic: Z = (X̄ − μ0)/(S/√n), with the same critical regions as Case 1.

Case 3: σ is unknown; the population is normal; n < 30.
  Test statistic: T = (X̄ − μ0)/(S/√n) with v = n − 1 degrees of freedom
  H1: μ < μ0 → reject H0 if T < −tα
  H1: μ > μ0 → reject H0 if T > tα
  H1: μ ≠ μ0 → reject H0 if T < −tα/2 or T > tα/2

Now, let’s restate the court’s case and decide upon it.
Example 2: Assume a court claims that it provides final decision for a murder case in 8 days
on average. To test this claim, suppose you take 100 of such cases from the court’s document
and discover that the court took 10 days on average with standard deviation of 1 day. At a
0.01 level of significance, is there evidence that the court needs more than 8 days on average
to give final decision for a murder case?
Solution: Given n = 100, x̄ = 10 days, s = 1 day, α = 0.01.

If the court provides final decision for a murder case in less than 8 days on average, that is well and good and hence no corrective action would be necessary. Thus, if the sample evidence indicates that the court can provide final decision for a murder case in less than 8 days on average, then we will not have any reason to reject its claim. We only reject the court's claim if sample evidence shows that it takes more than 8 days on average to provide final decision for a murder case. Therefore, the type of test we use is a right-tailed test.
We follow the seven-step procedure:
a) H0: μ = 8 days (i.e., μ ≤ 8 days)
b) H1: μ > 8 days (RTT)
c) α = 0.01
d) Test statistic: Since σ is unknown but n = 100 ≥ 30, the relevant test statistic will be Z = (X̄ − μ0)/(S/√n)
e) The critical region is Z > z0.01 = 2.33
f) Computation: Z = (10 − 8)/(1/√100) = 2/(1/10) = 20
g) Decision and Conclusion:
Clearly, 20 falls in the critical region. Therefore, we reject H 0. That means, the court provides
final decision for a murder case in more than 8 days on average.
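The same test can be run numerically with a short Python sketch (variable names are illustrative):

```python
# Minimal sketch: right-tailed Z test for the court example, from summary statistics.
from math import sqrt
from scipy.stats import norm

n, x_bar, s, mu0, alpha = 100, 10, 1, 8, 0.01
z = (x_bar - mu0) / (s / sqrt(n))      # = 20
z_crit = norm.ppf(1 - alpha)           # about 2.33 for a right-tailed test
print(z, z_crit, z > z_crit)           # 20 > 2.33, so H0 is rejected
```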
Example 3: Assume that the annual average income of government officers in Ethiopia is
reported by Census Bureau to be $18,750. There was some doubt whether the average yearly
income of government employees in Addis was representative of the national average. A
random sample of 100 government employees in Addis was taken and it was found that their
average salary was $19,240 with a standard deviation of $2,610. At a 0.05 level of significance, is there evidence that the average salary of government employees in Addis is not representative of the national average?
Solution: Given n = 100, x̄ = $19,240, s = $2,610, μ0 = $18,750, α = 0.05.

If the sample evidence indicates that the annual average income of government officers in Addis is somewhere around $18,750, then we remove the doubt. That is, we can conclude that the average yearly income of government employees in Addis is representative of the national average. Otherwise, if sample evidence shows that the average yearly income of government employees in Addis is far less than or far more than $18,750, then we say the average yearly income of government employees in Addis is not representative of the national average. Therefore, the type of test we use is a two-tailed test.
The phrases "somewhere around", "far less than" and "far more than" are made precise statistically by the test itself.

Let μ: The average yearly income of government employees in Addis.

Now, we follow the seven-step procedure:
a) H0: μ = 18,750
b) H1: μ ≠ 18,750 (TTT)
c) α = 0.05
d) Test statistic: Since σ is unknown but n = 100 ≥ 30, the relevant test statistic will be Z = (X̄ − μ0)/(S/√n)
e) The critical region is Z < −1.96 and Z > 1.96
f) Computation: Z = (19,240 − 18,750)/(2,610/√100) = 490/261 ≈ 1.877
g) Decision and Conclusion:
Clearly, 1.877 does not fall in the critical region. There is no sufficient evidence to reject H 0.
Example 4: The manufacturer of a light bulb claims that the light bulb lasts on an average of
1,600hrs. We want to test this claim. Assume that we reject the claim only if the average of
the sample taken lasts considerably less than 1,600 hrs. A random sample of 25 light bulbs revealed an average lifetime of 1,570 hrs with a standard deviation of 120 hrs. At a 0.05 level of significance, is there evidence that the average lifetime of the population of light bulbs is less than 1,600 hrs? (Assume that the lifetime of light bulbs is normally distributed.)
Solution: Given n = 25, x̄ = 1,570 hrs, s = 120 hrs, μ0 = 1,600 hrs, α = 0.05.

If the manufacturer of a light bulb produces bulbs whose lifetime is at least 1,600 hrs, then there will be no corrective measure to be taken. That is, the null hypothesis is H0: μ ≥ 1,600 hrs. Corrective measures will be taken only if the average lifetime of the bulbs is far less than 1,600 hrs.

Let μ: The average lifetime of a bulb.

Now, we follow the seven-step procedure:
a) H0: μ = 1,600 hrs (i.e., μ ≥ 1,600 hrs)
b) H1: μ < 1,600 hrs (LTT)
c) α = 0.05
d) Test statistic: Since σ is unknown, the population is normal and n = 25 < 30, the relevant test statistic will be T = (X̄ − μ0)/(S/√n) with v = n − 1 = 24 degrees of freedom
e) The critical region is T < −t0.05(24) = −1.711
f) Computation: T = (1,570 − 1,600)/(120/√25) = −30/24 = −1.25
g) Decision and Conclusion:
Clearly, -1.25 does not fall in the critical region. There is no sufficient evidence to reject H 0.
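The following Python sketch reproduces this left-tailed t test from the summary statistics; the 0.05 level is the one assumed above, since the original level was not legible in the source.

```python
# Minimal sketch: left-tailed one-sample t test for the light-bulb example.
from math import sqrt
from scipy.stats import t

n, x_bar, s, mu0, alpha = 25, 1570, 120, 1600, 0.05   # alpha assumed to be 0.05
t_stat = (x_bar - mu0) / (s / sqrt(n))     # = -1.25
t_crit = t.ppf(alpha, df=n - 1)            # left-tail critical value, about -1.711
print(t_stat, t_crit, t_stat < t_crit)     # -1.25 > -1.711, so H0 is not rejected
```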
Exercise
An educator claims that the average IQ of a certain college students is no more than 110. To
test this claim, a random sample of 150 students was taken and given relevant tests. Their
average IQ score was 111.2 with a standard deviation of 7.2. At a 0.01 level of significance, test
the claim of the educator.

The National Association of Employers claims that the average annual salary of business
degree graduates in accounting is birr 37,000. In a follow-up study, a sample of 48 graduating
accounting majors revealed a sample mean of birr 38,100 and a sample standard deviation of
birr 5,200. Test the claim of the association. (Use )

Dear learner, so far we have seen hypothesis testing concerning population mean. In the
coming section we will try to discuss hypothesis testing concerning the difference of two
populations’ means.
2.3 Tests Concerning the Difference of Two Populations' Means (μ1 − μ2)

If the two samples are independent, then we use either the Z-test, the pooled-variance T-test
or the separate-variance T-test which is appropriate depending on the assumptions we
consider.

Assumptions, test statistics and critical regions for testing H0: μ1 − μ2 = d0:

Case 1: σ1 and σ2 are known; populations are normal or n1, n2 ≥ 30.
  Test statistic: Z = [(X̄1 − X̄2) − d0] / √(σ1²/n1 + σ2²/n2)
  H1: μ1 − μ2 < d0 → reject H0 if Z < −zα
  H1: μ1 − μ2 > d0 → reject H0 if Z > zα
  H1: μ1 − μ2 ≠ d0 → reject H0 if Z < −zα/2 or Z > zα/2

Case 2: σ1 and σ2 are unknown; n1, n2 ≥ 30.
  Test statistic: Z = [(X̄1 − X̄2) − d0] / √(s1²/n1 + s2²/n2), with the same critical regions as Case 1.

Case 3: σ1 and σ2 are unknown but assumed equal; populations are normal; n1, n2 < 30 (pooled-variance T test).
  Test statistic: T = [(X̄1 − X̄2) − d0] / (sp √(1/n1 + 1/n2)), where
  sp = √[ ((n1 − 1)s1² + (n2 − 1)s2²) / (n1 + n2 − 2) ], with v = n1 + n2 − 2 degrees of freedom.
  H1: μ1 − μ2 < d0 → reject H0 if T < −tα
  H1: μ1 − μ2 > d0 → reject H0 if T > tα
  H1: μ1 − μ2 ≠ d0 → reject H0 if T < −tα/2 or T > tα/2

Case 4: σ1 and σ2 are unknown and not assumed equal; populations are normal; n1, n2 < 30 (separate-variance T test).
  Test statistic: T' = [(X̄1 − X̄2) − d0] / √(s1²/n1 + s2²/n2), with
  v = (s1²/n1 + s2²/n2)² / [ (s1²/n1)²/(n1 − 1) + (s2²/n2)²/(n2 − 1) ] degrees of freedom,
  and the same form of critical regions as Case 3.

Example 5: Nine dogs and ten cats were tested to determine if there is a difference in the
average number of days that the animals can survive without food. On average, the dogs
survive without food for 11 days with a standard deviation of 2 days whereas the cats survive
without food for 12 days with a standard deviation of 3 days. Do you think that there is a significant difference in the average number of days that the animals survive without food? (Use α = 0.05.)
(Assume that σ1² = σ2² and that the populations are normal.)
Solution: Givens
Dogs: n1 = 9, x̄1 = 11 days, s1 = 2 days;   Cats: n2 = 10, x̄2 = 12 days, s2 = 3 days

Let μ1: The average number of days that dogs survive without food
    μ2: The average number of days that cats survive without food
Now, we follow the seven-step procedure:
a) H0: μ1 − μ2 = 0
b) H1: μ1 − μ2 ≠ 0 (TTT)
c) α = 0.05
d) Test statistic: Since σ1 and σ2 are unknown but assumed equal, the samples are small and the populations are normal, we select the pooled-variance test statistic
   T = [(X̄1 − X̄2) − d0] / (sp √(1/n1 + 1/n2)), where sp = √[((n1 − 1)s1² + (n2 − 1)s2²)/(n1 + n2 − 2)],
   with v = n1 + n2 − 2 = 17 degrees of freedom
e) The critical region is T < −t0.025(17) = −2.110 and T > 2.110
f) Computation: sp = √[(8(4) + 9(9))/17] = √(113/17) ≈ 2.578
   Therefore, T = (11 − 12)/(2.578 √(1/9 + 1/10)) ≈ −1/1.185 ≈ −0.84
g) Decision and Conclusion: Clearly, -0.84 does not fall in the critical region.
There is no sufficient evidence to reject H 0. There is no significant
difference in the number of days that these animals (dogs and cats) survive
without food.
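A Python sketch of the same pooled-variance test (variable names are illustrative):

```python
# Minimal sketch: pooled-variance two-sample t test from summary statistics
# for the dogs-versus-cats example.
from math import sqrt
from scipy.stats import t

n1, x1, s1 = 9, 11, 2       # dogs
n2, x2, s2 = 10, 12, 3      # cats
sp = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))   # pooled std dev
t_stat = (x1 - x2) / (sp * sqrt(1 / n1 + 1 / n2))                  # about -0.84
t_crit = t.ppf(1 - 0.05 / 2, df=n1 + n2 - 2)                       # about 2.11
print(t_stat, abs(t_stat) > t_crit)   # |t| < 2.11, so H0 is not rejected
```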

Example 6: Do employees perform better at work with music playing? The music was turned
on during the working hours of a business with 45 employees. Their average productivity level was 5.2 with a standard deviation of 2.4. On a different day the music was turned off and there were 40 workers. The workers' average productivity level was 4.8 with a standard deviation of 1.2. What can we conclude at the 0.05 level?

With music: n1 = 45, x̄1 = 5.2, s1 = 2.4;   Without music: n2 = 40, x̄2 = 4.8, s2 = 1.2

Solution: Givens as above.
Let μ1: The average productivity level of employees who perform their work with music
    μ2: The average productivity level of employees who perform their work without music
Now, we follow the seven-step procedure:
a) H0: μ1 − μ2 = 0
b) H1: μ1 − μ2 > 0 (RTT)
c) α = 0.05
d) Test statistic: Since σ1 and σ2 are unknown but n1, n2 ≥ 30, the relevant test statistic will be
   Z = [(X̄1 − X̄2) − d0] / √(s1²/n1 + s2²/n2)
e) The critical region is Z > z0.05 = 1.645
f) Computation: Z = (5.2 − 4.8)/√(2.4²/45 + 1.2²/40) = 0.4/√0.164 ≈ 0.988
g) Decision and Conclusion:
Clearly, 0.988 does not fall in the critical region. There is no sufficient evidence to reject H 0.
That is, employees perform no better at work with music playing.
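The same large-sample Z test in a short Python sketch (variable names are illustrative):

```python
# Minimal sketch: right-tailed two-sample Z test for the music-productivity example.
from math import sqrt
from scipy.stats import norm

n1, x1, s1 = 45, 5.2, 2.4    # with music
n2, x2, s2 = 40, 4.8, 1.2    # without music
z = (x1 - x2) / sqrt(s1**2 / n1 + s2**2 / n2)   # about 0.988
z_crit = norm.ppf(1 - 0.05)                     # 1.645
print(z, z > z_crit)         # 0.988 < 1.645, so H0 is not rejected
```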
Exercise
Two independent samples are taken to compare the means of two normally distributed
populations having unknown population variance but assumed equal. The sample statistics
are given in table below.
Can we conclude that the mean of population A is greater than the mean of population B at
the 0.05 level of significance?

sample      n      x̄       s

A 10 51.5 6.2

B 15 54.4 10.6

Suppose a car company wishes to compare the performance of its two factories producing an
identical model of car. The factories are equipped with the same machinery but their outputs
might differ due to managerial ability, labor relation, etc. Senior management wishes to know
if there is any difference between the two factories. Output is monitored for 40 days, chosen
at random, with the following results.
Factory - 1 Factory - 2

Average daily output 420 408
Standard deviation of daily output 25 20
Number of days 40 40

Does this produce sufficient evidence of a real difference between the factories? That is, test H0: μ1 = μ2 against H1: μ1 ≠ μ2. (Use )
Dear learner, so far we have seen hypothesis testing concerning the difference of two
populations’ means. In the coming section, we will try to deal with hypothesis testing
concerning population proportion.

2.4 Tests Concerning a Population Proportion (p)

In some situations you might want to test a hypothesis pertaining to the population proportion p of values that are in a particular category. For instance, you may want to know the proportion of defective items in order to make certain arrangements for shipment, or you may need to know the fraction of voters in a certain locality favoring party X. In all such cases what you have to do is first select a random sample from the population and compute the sample proportion p̂ = x/n. Then, you can arrive at a decision by comparing this value with the hypothesized value.

If the number of successes (x) and the number of failures (n − x) are each at least 5, the sampling distribution of the sample proportion approximately follows a normal distribution. So, we base our decision criteria on the standard normal variable

Z = (p̂ − p0)/√(p0 q0/n) = (x − np0)/√(np0 q0)

To test a hypothesis about a population proportion using the normal curve approximation, we proceed as follows:
a) State the appropriate null hypothesis, H0: p = p0.
b) State the corresponding alternative hypothesis.
c) Choose the level of significance α.
d) Test statistic: Z = (x − np0)/√(np0 q0)
e) Determine the critical region (CR)
   a. For H1: p < p0 (LTT), the critical region is Z < −zα
   b. For H1: p > p0 (RTT), the critical region is Z > zα
   c. For H1: p ≠ p0 (TTT), the critical region is Z < −zα/2 and Z > zα/2
f) Computation: Find x from the sample of size n and compute z = (x − np0)/√(np0 q0).
g) Decision and Conclusion:
Reject H0 if the test statistic falls in the critical region; otherwise do not reject H 0.
Example 7:
A hunter claims that he hits at least 80% of the birds he shoots at. Would you agree with his
claim if on a given day he brings down 9 of the 15 birds he shoots at? At a 0.05 level of
significance, is there evidence that the hunter hits less than 80% of the birds he shoots at?
Solution: Let p: The proportion of successful shoots
Now, we follow the seven-step procedure:
a) H0: p = 0.8 (i.e., p ≥ 0.8)
b) H1: p < 0.8 (LTT)
c) α = 0.05
d) Test statistic: Z = (x − np0)/√(np0 q0)
e) The critical region is Z < −z0.05 = −1.645
f) Computation: z = (9 − 15(0.8))/√(15(0.8)(0.2)) = (9 − 12)/√2.4 ≈ −1.9365
g) Decision and Conclusion: Clearly, −1.9365 falls in the critical region. So, we reject H0. That is, the hunter hits less than 80% of the birds he shoots at.
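A Python sketch of this one-proportion test (variable names are illustrative):

```python
# Minimal sketch: left-tailed one-proportion Z test for the hunter example
# (9 hits out of 15 shots, claimed p >= 0.8).
from math import sqrt
from scipy.stats import norm

n, x, p0, alpha = 15, 9, 0.8, 0.05
p_hat = x / n                                   # 0.6
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)      # about -1.94
z_crit = -norm.ppf(1 - alpha)                   # -1.645
print(z, z < z_crit)         # -1.94 < -1.645, so H0 is rejected
```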
Example 8: The sponsor of a television show believes that his audience is divided equally
between men and women. Out of 400 persons attending the show one day, there were 230
men. At a 0.05 level of significance, is there evidence that the belief of the sponsor is
incorrect?
Solution: Let p: The proportion of male audiences of the TV show
Now, we follow the seven-step procedure:
a) H0: p = 0.5 (The audience divides equally between men and women; that is, 50% men and 50% women.)
b) H1: p ≠ 0.5 (TTT)
c) α = 0.05
d) Test statistic: Z = (x − np0)/√(np0 q0)
e) The critical region is Z < −1.96 and Z > 1.96
f) Computation: z = (230 − 400(0.5))/√(400(0.5)(0.5)) = (230 − 200)/√100 = 30/10 = 3
g) Decision and Conclusion:
Clearly, 3 > 1.96. That is, the test statistic falls in the critical region.
Therefore, we reject the null hypothesis.
This means, there is significant difference in the proportion of male audience and female
audience.
Example 9: A briefcase manufacturing company claims that at least 80% of company
executives carried briefcases produced by it. A random sample of 900 executives showed that
675 of them carried these briefcases. At a 0.05 level of significance, is there evidence that the
proportion of company executives carried briefcases produced by it is less than 80%?
Solution: Let p: The proportion of company executives who carry the company’s briefcases
Now, we follow the seven-step procedure:
a) H0: p = 0.8 (i.e., p ≥ 0.8)
b) H1: p < 0.8 (LTT)
c) α = 0.05
d) Test statistic: Z = (x − np0)/√(np0 q0)
e) The critical region is Z < −z0.05 = −1.645
f) Computation: z = (675 − 900(0.8))/√(900(0.8)(0.2)) = (675 − 720)/√144 = −45/12 = −3.75
g) Decision and Conclusion:
Clearly, −3.75 < −1.645. That is, the test statistic falls in the critical region.
Therefore, we reject the null hypothesis.
Example 10: An airline claims that at most 8% of its lost luggage is never found. A customer
advocacy agency wants to test this claim. In a study of 200 random cases of lost luggage, it
was found that in 22 cases the lost luggage was never found. At a 0.01 level of significance,
is there evidence that more than 8% of the airline’s lost luggage is never found?
Solution: Let p: The proportion of lost luggage that is never recovered
Now, we follow the seven-step procedure:
a) H0: p = 0.08 (i.e., p ≤ 0.08)
b) H1: p > 0.08 (RTT)
c) α = 0.01
d) Test statistic: Z = (x − np0)/√(np0 q0)
e) The critical region is Z > z0.01 = 2.33
f) Computation: z = (22 − 200(0.08))/√(200(0.08)(0.92)) = (22 − 16)/√14.72 ≈ 6/3.84 ≈ 1.56
g) Decision and Conclusion:
Clearly, 1.56 < 2.33. That is, the test statistic does not fall in the critical region.
Therefore, there is no sufficient evidence to reject the null hypothesis.
Exercise.
A school buys and distributes pencils for its students. The school purchaser claims that a
defective pencil is found in at most 5% of the cartons. The purchaser of the school bought 6
cartons and found defective pencils in 2 of them. Test whether the proportion of defective
pencils is less than what is expected of the school purchaser at 5% significant level.
To test the assumption of the management that at least 60% employees favor a new bonus
scheme, a sample of 200 employees was drawn and their opinion was taken whether they
favor it or not. Only 56 employees out of 200 favored the new bonus scheme. Test the
hypotheses at level of significance.

2.5 Tests Concerning the Difference of Two Populations' Proportions (p1 − p2)
It is important to make comparisons and analyze differences between two populations in terms of a categorical variable. For instance, we might try to prove that the proportion of 10th grade female students in Bahirdar is equal to the proportion of 10th grade female students in Mekele. A person might decide to give up smoking only if he is convinced that the proportion of smokers with lung cancer exceeds the proportion of nonsmokers with lung cancer. In evaluating such kinds of differences, tests on the difference of two populations' proportions can be used. The statistic Z that is used for this purpose assumes either of the following forms.

If H0: p1 − p2 = 0 is claimed, then

Z = (p̂1 − p̂2) / √( p̄ q̄ (1/n1 + 1/n2) );   where p̄ = (x1 + x2)/(n1 + n2) and q̄ = 1 − p̄.

If H0: p1 − p2 = d0 (with d0 ≠ 0) is claimed, then

Z = [(p̂1 − p̂2) − d0] / √( p̂1q̂1/n1 + p̂2q̂2/n2 );   where p̂1 = x1/n1, p̂2 = x2/n2, q̂1 = 1 − p̂1, and q̂2 = 1 − p̂2.
Example 11: A sample of 200 students at DRMC revealed that 36 of them were females. A
similar sample of 400 students at AAU revealed that 60 of them were females.

a) At 0.05 level of significance, test whether the difference between these two proportions
is significant enough to conclude that these proportions are indeed different.
b) At 0.05 level of significance, test whether p1 is greater than p2 by at least 0.08.
Solution:
a) We follow the seven-step procedure:
Let p1: The proportion of female students at DRMC, and p2: The proportion of female students at AAU.
a) H0: p1 − p2 = 0
b) H1: p1 − p2 ≠ 0 (TTT)
c) α = 0.05
d) Test statistic: z = (p̂1 − p̂2) / √( p̄ q̄ (1/n1 + 1/n2) )
e) The critical region is Z < −1.96 and Z > 1.96
f) Computation:
Clearly, p̂1 = 36/200 = 0.18, p̂2 = 60/400 = 0.15, and p̄ = (36 + 60)/(200 + 400) = 96/600 = 0.16. Hence,

z = (0.18 − 0.15) / √( 0.16 × 0.84 (1/200 + 1/400) ) = 0.03/√0.001008 = 0.03/0.031749 ≈ 0.945
g) Decision and Conclusion:
Clearly, 0.945 is not in the critical region. Therefore, there is no sufficient evidence to reject
H0.

b) We follow the seven-step procedure:
a) H0: p1 − p2 = 0.08 (i.e., p1 − p2 ≥ 0.08)
b) H1: p1 − p2 < 0.08 (LTT)
c) α = 0.05
d) Test statistic: Z = [(p̂1 − p̂2) − d0] / √( p̂1q̂1/n1 + p̂2q̂2/n2 )
e) The critical region is Z < −z0.05 = −1.645
f) Computation:
z = (0.18 − 0.15 − 0.08) / √( (0.18)(0.82)/200 + (0.15)(0.85)/400 ) = −0.05/0.0325 ≈ −1.5381
g) Decision and Conclusion: Clearly, −1.5381 is not in the critical region.
Therefore, there is no sufficient evidence to reject H0.
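Both parts of Example 11 can be verified with a short Python sketch (variable names are illustrative):

```python
# Minimal sketch: two-proportion Z tests for Example 11 (36/200 vs 60/400 females).
from math import sqrt
from scipy.stats import norm

n1, x1, n2, x2 = 200, 36, 400, 60
p1, p2 = x1 / n1, x2 / n2                  # 0.18 and 0.15

# (a) H0: p1 = p2, two-tailed, pooled estimate of the common proportion
p_pool = (x1 + x2) / (n1 + n2)             # 0.16
z_a = (p1 - p2) / sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))       # about 0.945
print(z_a, abs(z_a) > norm.ppf(0.975))     # 0.945 < 1.96: do not reject H0

# (b) H0: p1 - p2 >= 0.08, left-tailed, unpooled standard error
z_b = (p1 - p2 - 0.08) / sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)  # about -1.54
print(z_b, z_b < -norm.ppf(0.95))          # -1.54 > -1.645: do not reject H0
```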

Exercise

An insurance company believes that smokers have higher incidence of heart disease than
nonsmokers in men over 50 years of age. Accordingly, it is considering offering discounts on
its life insurance policies to nonsmokers. However before the decision can be made, an
analysis is undertaken to justify its claim that smokers are at a higher risk of heart disease
than nonsmokers. The company randomly selected 200 men of whom 80 were smokers. If 18
of the smokers suffered from heart disease and 15 of the nonsmokers suffered
from heart disease. At a 0.05 level of significance, is there evidence that smokers have a
higher incidence of heart disease than nonsmokers?

Unit Five
Goodness- of -Fit Test and Test of Independence
Introduction
Dear Learner, welcome to the third unit of this module
The previous 5 units of the course have given you a wide variety of statistical techniques that
are frequently used in decision making. We have discussed numerous descriptive tools and
techniques, as well as large sample estimation and hypothesis tests for one and two
populations, small sample estimation and hypothesis tests using a t-distribution. However, as
we have often mentioned, these statistical tools are limited to use under some conditions for
which they were originally developed. For example, the large sample tests based on the
standard normal distribution assume that the data can be measured at least at interval level.
The small sample tests that employ the t-distribution assume that the sampled populations are
normally distributed.
In these situations in which the conditions just mentioned are not satisfied we suggest that
nonparametric techniques shall be used. These procedures will be shown to be generally the
nonparametric equivalent of the classical procedures discussed in the previous units. The
obvious questions when faced with the realistic decision-making situation are "which test do I
use?”, “should I consider a nonparametric test?" These questions are generally followed by a
second question: "Do the data come from a normal distribution?" But recall that we have also
described situations involving data from Poisson or binomial distributions. How do we know
which distribution applies to our situation? Fortunately, a statistical technique called
goodness-of –fit test exists that can help to answer this question. Using goodness-of-fit tests,
we can decide whether a set of data come from a specific hypothesized distribution.
You will also encounter many businesses in which the level of data measurement for the
variable of interest is either nominal or ordinal, not interval or ratio. For example, a bank may
use a code to indicate whether a customer is a good or poor credit risk. The bank may also
have data for these customers that indicate, by a code, whether each person is buying or
renting home. The loan officer may be interested in determining whether credit-risk status is
independent of home ownership. Because both credit risk and home ownership are
qualitative, or categorical, variables, their measurement level is nominal and the previously
introduced statistical techniques cannot be used to analyze this problem. We, therefore, need
a new statistical tool to assist the manager in reaching an inference about the customer
population. That statistical tool is contingency analysis. Contingency analysis is a widely
used tool for analyzing the relationship between qualitative variables, one that decision
makers in all business areas find helpful for data analysis.

Dear learner, in this unit, you will be introduced to two applications of the chi-square
distribution; namely goodness-of-fit test and test of independence.
As you study this unit, you are expected to:

 Utilize the chi-square goodness-of-fit test to determine whether data from a process fit a specified distribution.
 Set up a contingency analysis table and perform a chi-square test of independence.

3-1 Introduction to Goodness- of -Fit Test


Many of the statistical procedures introduced in earlier units require that the sample data
come from populations that are normally distributed. For example, when we use the t-
distribution in confidence interval estimation or hypothesis testing about one or two
population means, the population(s) of interest is (are) assumed to be normally distributed. But how can you determine whether these assumptions are satisfied? In other instances, you may wish to employ a particular probability distribution to help solve a problem, and you need to know whether actual data from the process fit the probability distribution being considered. In such instances, a statistical technique known as a goodness-of-fit test can be used.
The term goodness-of-fit aptly describes the technique. Suppose a major retail department
store believes that the proportion of customers who use each of the four entrances to the store
is the same. This would mean that customer arrivals are uniformly distributed across the four
entrances. Suppose a sample of 1,000 customers is observed entering the store and entrance
(A, B, C or D) selected by each customer is recorded. The following table shows the results
of the sample.
Customer Door Entrance Data

Entrance Number of Customers

A 260

B 290

C 230

D 220

If the manager's assumption about the entrances being used uniformly holds true and if there
was no sampling error involved, we would expect one-fourth of the customers, or 250, to
enter through each door. When we allow for the potential of sampling error, we would still
expect close to 250 customers entering through each entrance. The question is, how "good is
the fit" between the sample data in the above table and the expected number of 250 people at
each entrance? At what point do we no longer believe that the differences between what is
actually observed at each entrance and what we expected can be attributed to sampling error?
If these differences get too big, we will reject the uniformity assumption and conclude that
customers prefer some entrances to others.

The chi-square goodness-of fit test can be used to determine whether the sample data come
from any hypothetical distribution. Consider the following application:
Betzata Health Guard- Betzata Health Guard, Addis Ababa health clinic with 25 offices is
open seven days a week. The operation manager was recently hired from a similar position at
a smaller chain of clinics in Addis Ababa. She is naturally concerned that the level of
staffing-physicians, nurses, and other support personnel-be balanced with patient demand.
Currently, the staffing level is balanced Monday through Friday, with reduced staffing on Saturday and Sunday. Her predecessor explained that patient demand is fairly level throughout the week and about 25% lower on weekends, but the new manager suspects that the staff simply want to have weekends free. Although she was willing to operate with this schedule for a while, she has decided to study patient demand to see whether the assumed demand pattern still applies.
The operations manager requested a sample of 20 days for each day of the week, showing the number of patients on each of the sample days. A portion of those data follow. For the 140 days observed, the total count was 56,000 patients. The total patient counts for each day of the week are shown in the following table.
Example 1: Patient count data for Betzata Health Guard

Day Total patient Counts

Sunday 4502
Monday 6623
Tuesday 8308
Wednesday 10420
Thursday 11032
Friday 10754
Saturday 4361

Total 56000

Recall that the previous operations manager at Betzata Health Guard based his staffing on the premise that from Monday to Friday the patient count remained essentially the same and on Saturdays and Sundays it went down by 25%. If this is so, how many of the 56,000 patients would we expect on Monday? How many on Tuesday, and so forth? To figure out this demand, we determine weighting factors by allocating four units each to the days Monday through Friday and three units each (representing the 25% reduction) to Saturday and Sunday. The total number of units is then (5 × 4) + (2 × 3) = 26. The percentage of total patients expected on each weekday is 4/26 = 0.154, or 15.4%, and the percentage expected on a weekend day is 3/26 = 0.115, or 11.5%. The expected number of patients on a weekday is 0.154 × 56,000 = 8,624, and the expected number on each weekend day is 0.115 × 56,000 = 6,440. The situation facing Betzata Health Guard is one for which a number of statistical tests
have been developed. One of the most frequently used is the chi-square goodness-of –fit test.
What we need to examine is how well the sample data fit the hypothesized distribution. The
following null and alternative hypotheses can represent this.
H0: The patient demand distribution is evenly spread throughout the weekdays and
25% lower on the weekend.
H1 The patient demand follows some other distribution.
The equation for the chi-square goodness-of-fit test statistic is given below. The logic behind this test is based on determining how far the actual observed frequency is from the expected frequency. Because we are interested in whether a difference exists, positive or negative, we remove the effect of negative values by squaring the differences. In addition, how important this difference really is depends on the magnitude of the expected frequency (a given difference matters more if the expected frequency is 10 than if the expected frequency is 1,000), so we divide the squared difference by the expected frequency. Finally, we sum these difference ratios for all categories. This sum is a statistic that has an approximate chi-square distribution.
Chi-Square Goodness-of-Fit Test Statistic

χ² = Σ (oi − ei)² / ei,  summed over i = 1, …, k

Where: oi = Observed cell frequency for category i
       ei = Expected cell frequency for category i
       k  = Number of categories

The χ² statistic is distributed approximately as a chi-square with k − 1 degrees of freedom only if the sample size is large.
Special Note: A sample size of at least 30 is sufficient in most cases provided that none of the expected frequencies is too small.
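As an illustration, the statistic for the door-entrance data above can be computed with a short Python sketch (the uniform expected count of 250 per entrance is the assumption stated earlier):

```python
# Minimal sketch: chi-square goodness-of-fit statistic for the entrance counts
# against a uniform distribution (250 expected per entrance).
from scipy.stats import chisquare, chi2

observed = [260, 290, 230, 220]
expected = [250, 250, 250, 250]
stat, p_value = chisquare(f_obs=observed, f_exp=expected)
crit = chi2.ppf(1 - 0.05, df=len(observed) - 1)   # 7.815 with 3 degrees of freedom
print(stat, crit, p_value)   # stat = 12.0 > 7.815, so uniform use would be rejected at the 5% level
```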
Example 2: The following data reflect the choice of majors by a sample of 200 incoming
freshman students.

Major Number of students

Liberal Arts 44

Engineering 30

Education 26

Business 45

Science 55

Using a significance level of 0.05, is there sufficient evidence to conclude that the
distribution of major choice is not uniform?
Solution:
H0: Distributions of choices of major is uniform across the fields.
H1: Distribution of choices is not uniform.
( p)
5
( o i−ei ) 2
χ =∑
2

Test statistic: i =1 ei
The critical region is
Where v = k – 1 = 5 – 1 = 4
Therefore,
Computation: Under the hypothesis of a uniform distribution, the total number of students;
that is 200, should be divided equally among the five departments. Thus, in each department

we expect of 200, that is 40 students.

5
( o i−ei ) 2 ( 44−40 )2 ( 30−40 )2 ( 26−40 )2 ( 45−40 )2 (55−40 )2
χ2 = ∑ ei
+ + + +
i =1 = 40 40 40 40 40

16 100 1 96 25 225 562


+ + + + =
= 40 40 40 40 40 40 = 14.05

Decision and Conclusion:
Clearly, χ² = 14.05 > 9.488 falls in the critical region. So, we reject H0.
That means, the choices are not uniform across the fields.
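The same computation can be reproduced with a short Python sketch (a minimal illustration only, assuming the NumPy and SciPy libraries are available; the variable names are ours, not part of the course material):

import numpy as np
from scipy import stats

# Observed choices: Liberal Arts, Engineering, Education, Business, Science
observed = np.array([44, 30, 26, 45, 55])
# Uniform hypothesis: 200/5 = 40 students expected in each major
expected = np.full(5, observed.sum() / 5)

chi2_stat, p_value = stats.chisquare(f_obs=observed, f_exp=expected)
print(chi2_stat)   # about 14.05
print(p_value)     # about 0.007, below 0.05, so H0 is rejected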
Example 3 : Suppose that throwing a die 72 times yields the following data:

Score on die 1 2 3 4 5 6

Frequency 6 15 15 7 15 14

Are these data consistent with the die being unbiased at 0.05 significance level?
Solution:
H0: the die is unbiased
H1: the die is biased

Test statistic: χ² = Σ (oᵢ − eᵢ)²/eᵢ, summed over the k = 6 categories.
The critical region is χ² > χ²(0.05, v), where v = k – 1 = 6 – 1 = 5.
Therefore, the critical region is χ² > 11.070.

Computation: On the basis of the null hypothesis, the expected values are based on the
uniform distribution, i.e., each number should come up an equal number of times. The
expected values are therefore 72/6 = 12 for each number on the die.



Score   Observed frequency (O)   Expected frequency (E)   O − E   (O − E)²   (O − E)²/E
1                 6                        12               −6       36         3
2                15                        12                3        9         0.75
3                15                        12                3        9         0.75
4                 7                        12               −5       25         2.0833
5                15                        12                3        9         0.75
6                14                        12                2        4         0.3333
Total                                                                           7.6667

Decision and Conclusion: Clearly, χ² = 7.6667 < 11.070 does not fall in the critical region. So, there is no
sufficient evidence to reject H0; the data are consistent with the die being unbiased.
Example 4: A third, more realistic example will now be examined to reinforce the message
about the use of the χ² distribution and to show how the expected values might be generated
in different ways. This example looks at road accident figures to see if there is any variation
through the year. Quarterly data on the number of people killed on British roads are used, and
the null hypothesis is that the number does not vary seasonally:
H0: There is no difference to fatal accidents between quarters.
H1: There is some difference in fatal accidents between quarters.
Such a study might be carried out by government, for example, to try to find the best means
of reducing road accidents.
The data, taken from the Key Data 1994/95 table (published by the Central Statistical Office), are given below.
Table: Road casualties in the UK, 1993

Quarter         I      II     III      IV    Total
Casualties     837    889    951    1142     3819

Test the above evidence using 5% level of significance


Solution:
H0: There is no difference in fatal accidents between quarters.
H1: There is some difference in fatal accidents between quarters.

Test statistic: χ² = Σ (oᵢ − eᵢ)²/eᵢ, summed over the k = 4 quarters.
The critical region is χ² > χ²(0.05, v), where v = k – 1 = 4 – 1 = 3.
Therefore, the critical region is χ² > 7.815.
Computation: On the basis of the null hypothesis, the expected values are based on the
uniform distribution, i.e., the number of fatal car accidents is the same in all the quarters.
Thus, the expected value is 3819/4 = 954.75 for each quarter.

Quarter   Observed deaths   Expected deaths     O − E      (O − E)²    (O − E)²/E
I               837              954.75        −117.75     13865.06     14.5222
II              889              954.75         −65.75      4323.06      4.5280
III             951              954.75          −3.75        14.06      0.0147
IV             1142              954.75         187.25     35062.56     36.7243
Total                                                                    55.7892

Decision and Conclusion:
Clearly, χ² = 55.7892 > 7.815 falls in the critical region. So, we reject H0.
That is, we can conclude that there is a difference between seasons in the car accident rate.
The reason for this difference might be the increased hours of darkness during winter months,
leading to more accidents.
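The critical values used in these examples come from a chi-square table; they can also be reproduced with a small Python sketch (assuming the SciPy library is available):

from scipy.stats import chi2

# Upper-tail critical values at the 5% significance level
print(chi2.ppf(0.95, df=4))   # about 9.488  (Example 2, v = 4)
print(chi2.ppf(0.95, df=5))   # about 11.070 (Example 3, v = 5)
print(chi2.ppf(0.95, df=3))   # about 7.815  (Example 4, v = 3)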
Exercises:
1. Assume that a die was rolled 60 times to check whether the die was fair or loaded. In the
experiment conducted, the actual number of times each face came up in sequence is as
follows:
8, 14, 6, 12, 16, 4
At a level of significance 5%, can we conclude that the die is fair?
2. Assume that 41% of the Addis Ababa population has type-A blood, 9% has type-B blood, 4% has type-
AB blood and 46% has type-O blood.

Suppose that there is a general belief that the distribution of blood types of those people
suffering from stomach cancer is the same as the distribution of blood type of the overall
population.

To test this hypothesis, assume that we take 200 stomach cancer patients and we observe:
92 having type-A blood
20 having type-B blood
4 having type-AB blood
84 having type-O blood
At a level of significance 5%, can we conclude that the distribution of blood types of those
people suffering from stomach cancer is the same as the distribution of blood type of the
overall population?
3.2 Introduction to Test of Independence
In unit 2 of this module, you were introduced to hypothesis tests involving one and two
populations' proportions. Although these techniques are useful in many cases you will also
encounter many situations involving multiple population proportions. For example, a major
mutual fund company offers six different mutual funds. The president of the company may
wish to determine if the proportion of customers selecting each mutual fund is related to the
four sales regions in which the customers reside. A hospital administrator who collects
service-satisfaction data from patients might be interested in determining whether there is a
significant difference in patient rating by hospital department. A personnel manager for a
large corporation might be interested in determining whether there is a relationship between levels of
employee job satisfaction and job classification. In each of these cases, the proportions relate
to characteristic categories of the variable of interest. The six mutual funds, four sales
regions, hospital departments, and job satisfaction levels are the specific categories.
These situations involving categorical data call for a new statistical tool known as contingency
analysis, which helps make decisions when multiple proportions are involved. Contingency analysis
can be used when the level of data measurement is either nominal or ordinal and the values are
determined by counting the number of occurrences in each category.
In this section, you will be introduced to the concept of a contingency table by considering 2
× 2 (two-by-two) and r × c (r-by-c) contingency tables.

3.2.1 A 2 × 2 CONTINGENCY TABLE


In one such study, a marketing manager questioned whether funding source and gender of the
yearbook editor were related in some manner. To analyze this issue,
we examine these two variables. Source of university funding is a categorical variable coded
as follows:
1 = Private funding
2 = State funding
Of the 221 respondents who provided data for this variable, 155 came from privately funded
colleges or universities and 66 were from publicly funded institutions.

The second variable, sex of the yearbook editor, is also a categorical variable, with two
response categories, coded as follows:
1 = Male
2 = Female
Of the 221 responses to the survey, 164 were from females and 57 were from males.
In cases in which the variables of interest are both categorical and the decision maker is
interested in determining whether a relationship exists between the two, a statistical technique
known as contingency analysis is useful. We first set up a two-dimensional table called a
contingency table. The contingency table for these two variables is shown below.

Contingency Table
It is a table used to classify sample observations according to two or more identifiable
characteristics. It is also called a cross tabulation table.

SOURCE OF FUNDING

GENDER Private State Total

Male 14 43 57
Female 141 23 164

Total 155 66 221

The table above shows that 14 of the respondents were males from schools that are privately
funded. The numbers at the extreme right and along the bottom are called the marginal
frequencies. For example, 57 respondents were males, and 155 respondents were from
privately funded institutions.
The issue of whether there is a relationship between responses to these two variables is
formally addressed through a hypothesis test, in which the null and alternative hypotheses are
stated as follows:
H0: Gender of yearbook editor is independent of the college's funding source.
H1: Gender of yearbook editor is not independent of the college's funding source.
If the null hypothesis is true, the population proportion of yearbook editors from private
institutions who are males should be equal to the proportion of male editors from state-funded
institutions. These two proportions should also equal the population proportion of male
editors without regard to a school's funding source. To illustrate, we can use the sample data
to determine the sample proportion of male editors as follows:

p(male editor) = 57/221 = 0.2579, or 25.79%

Then, if the null hypothesis is true, we would expect 25.79% of the 155 privately funded
schools, or 39.98 schools, to have a male yearbook editor. We would also expect 25.79% of
the 66 state-funded schools, or 17.02, to have male yearbook editors. (Note that the expected
numbers need not be integer values. Note also that the sum of the expected frequencies in any
column or row adds to the marginal frequency.) We can use this reasoning to determine the
expected number of respondents in each cell of the contingency table as shown below:

GENDER Private State Total

Male O11= 14 O12= 43 57


e11= 39.98 e12= 17.02

Female O21= 141 O22= 23 164


e21= 115.02 e22= 48.98

Total 155 66 221

You can simplify the calculations needed to produce the expected values for each cell. Note
that the first cell's expected value 39.98 was obtained by the following calculation:
e11 = (0.2579) (155) = 39.98
However, because the probability 0.2579 is calculated by dividing the row total, 57, by the
grand total, 221, the calculation can be represented as

e11 = (row total × column total)/grand total = (57 × 155)/221 = 39.98
Keep in mind that the row and column totals (the marginal frequencies) must be the same for
the expected values as for the observed values. Therefore, when there is only one cell in the
row or a column for which you must calculate an expected value, you can obtain it by
subtraction. So as an example, the expected value e12 could have been calculated as
e12 = 57 – 39.98 = 17.02.

Chi-Square Contingency Test Statistic

χ² = Σ Σ (oᵢⱼ − eᵢⱼ)²/eᵢⱼ, summed over all r × c cells, with df = (r – 1)(c – 1)

Where: oᵢⱼ = Observed frequency in cell (i, j)
eᵢⱼ = Expected frequency in cell (i, j)
r = Number of rows
c = Number of columns
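As a quick illustration, the expected frequencies for the yearbook table can be generated from the marginal totals with a few lines of Python (a sketch only, assuming the NumPy library; the array layout mirrors the contingency table above):

import numpy as np

observed = np.array([[14, 43],     # male:   private, state
                     [141, 23]])   # female: private, state

row_totals = observed.sum(axis=1)    # [57, 164]
col_totals = observed.sum(axis=0)    # [155, 66]
grand_total = observed.sum()         # 221

# e_ij = (row total × column total) / grand total
expected = np.outer(row_totals, col_totals) / grand_total
print(expected)   # approximately [[39.98, 17.02], [115.02, 48.98]]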

Example 5: Before releasing a major advertising campaign to the media, Barger advertising
runs a test on the material. Recently, it called 100 people and asked them to listen to a
commercial that was slated to run nationwide on the radio. At the end of the commercial, the
respondents were asked to name the company featured in the advertisement.
contingency table shows the results of the sampling.

Female Male Total

Correct Recall 33 25 58

Incorrect Recall 22 20 42

Total 55 45 100

Determine whether there is a relationship between gender and a person's ability to recall the
company’s name at the significance level of 0.01.
Solution:
H0: Ability to correctly recall the company name is independent of gender.
H1: Recall ability and gender are not independent.

Test statistic: χ² = Σ Σ (oᵢⱼ − eᵢⱼ)²/eᵢⱼ
The critical region is χ² > χ²(0.01, v), where v = (r – 1)(c – 1) = (2 – 1)(2 – 1) = 1.
Therefore, the critical region is χ² > 6.635.

Computation: The expected cell frequencies are determined by multiplying the row total by
the column total and dividing by grand total. For example, for the cell corresponding to
female and correct recall, we get:
Expected = e11 = (row total × column total)/grand total = (58 × 55)/100 = 31.90
The expected cell values for all cells are:

Female Male Total

Correct Recall 33 ( e = 31.90) 25 ( e = 26.10) 58

Incorrect Recall 22 ( e = 23.10) 20 ( e = 18.90) 42

Total 55 45 100

The test statistic is computed using χ² = Σ Σ (oᵢⱼ − eᵢⱼ)²/eᵢⱼ:

                                       Observed (O)   Expected (E)    O − E   (O − E)²   (O − E)²/E
Females who correctly recall                33            31.90         1.1      1.21      0.0379
Males who correctly recall                  25            26.10        −1.1      1.21      0.0464
Females who do not correctly recall         22            23.10        −1.1      1.21      0.0524
Males who do not correctly recall           20            18.90         1.1      1.21      0.0640
Total                                                                                      0.2007

Decision and Conclusion: Since χ² = 0.2007 < 6.635 is not in the critical region, there is no
sufficient evidence to reject H0; recall ability and gender appear to be independent.
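Example 5 can be checked with SciPy's contingency-table routine (a sketch only, assuming SciPy is available; correction=False switches off the Yates continuity correction that SciPy applies by default to 2 × 2 tables, so the result matches the hand calculation above):

import numpy as np
from scipy.stats import chi2_contingency

table = np.array([[33, 25],    # correct recall:   female, male
                  [22, 20]])   # incorrect recall: female, male

chi2_stat, p_value, df, expected = chi2_contingency(table, correction=False)
print(chi2_stat)   # about 0.20
print(df)          # 1
print(expected)    # [[31.9, 26.1], [23.1, 18.9]]
# p_value is far above 0.01, so H0 (independence) is not rejected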

Activity 12: 2 × 2 contingency tables


Show your steps clearly and neatly.
Take about 15 minutes.
Suppose we want to test whether there is any relationship between sex and opinion on
nuclear disarmament. To this end, assume we take 100 persons of whom 60 are men and their
responses are summarized as follows.
Sex
Male Female Total

Opinion Favor 35 25 60

Against 25 15 40

Total 60 40 100

With a significance level of 5%, can we conclude that there is no relationship between sex and
opinion on nuclear disarmament?
Assume that we want to test whether there is a relationship between heart disease and
smoking. To this end, we take a random sample of 100 persons and we obtain the following
results.
                              Smoking condition
                         Smoker    Non-smoker    Total
Heart disease  Positive     25          15         40
condition      Negative     20          40         60
               Total        45          55        100
Test the hypothesis at 5% level of significance that heart disease and smoking are
independent.

An r × c contingency table
As with the 2 × 2 contingency table analysis, the test for independence can be made using the
chi-square test, where the expected cell frequencies are compared to the actual cell frequencies
and the test statistic

χ² = Σ Σ (oᵢⱼ − eᵢⱼ)²/eᵢⱼ

is used. The logic of the test says that if the
actual and expected frequencies closely match, then the null hypothesis of independence is
not rejected. However, if the actual and expected cell frequencies are substantially different
overall, the null hypothesis of independence is rejected. The calculated chi-square statistic is
compared to a table critical value for the desired significance level and degrees of freedom equal to
(r − 1)(c − 1).
The expected cell frequencies are determined assuming that the row and column variables are
independent. This means, for example, that the probability of a married person being absent
more than 5 days during the year is the same as the probability of any employee being absent
more than 5 days. An easy way to compute the expected cell frequencies eᵢⱼ is given by the
formula:

eᵢⱼ = (ith row total × jth column total)/grand total
Example 6: Nefas Silk Company manufactures carpets and draperies in Addis Ababa area. It
pays market wages, provides competitive benefits, and attractive options for employees in an
effort to create a satisfied workforce and reduce turnover. Recently, however, several
supervisors have complained that employee absenteeism is becoming a problem. In response
to these complaints, the human resource manager studied a random sample of 500 employees.
One aim of this study was to determine whether there is a relationship between absenteeism
and marital status.

Absenteeism during the past year was broken down into three levels.
1) zero absence
2) 1 to 5 absences
3) over 5 absences
Marital status was divided into four categories:
1. Single 2. Married
3. Divorced 4. Widowed
The following contingency table shows the results for the sample of 500 employees.

ABSENTEE RATE

Marital Status Zero 1–5 over 5 Row Total

Single 84 82 34 200

Married 50 64 36 150

Divorced 50 34 16 100

Widowed 16 20 14 50

Column Total 200 200 100 500

Determine whether there is a relationship between absenteeism and marital status at the
α = 0.05 level of significance.

Solution:
H0: Absenteeism behavior is independent of marital status.
H1: Absenteeism behavior is not independent of marital status.

Test statistic: χ² = Σ Σ (oᵢⱼ − eᵢⱼ)²/eᵢⱼ
The critical region is χ² > χ²(0.05, v), where v = (r – 1)(c – 1) = (4 – 1)(3 – 1) = 6.
Therefore, the critical region is χ² > 12.592.
Computation: The expected cell frequencies are determined by multiplying the row total by
the column total and dividing by the grand total. For example, the expected frequency for the cell in row 1
and column 1 is

e11 = (200 × 200)/500 = 80

and the expected cell frequency for row 2, column 3 is

e23 = (150 × 100)/500 = 30
The expected cell values for all cells are:

ABSENTEE RATE

Marital Status Zero 1–5 over 5 Row Total

Single 84 ( e11 = 80) 82 (e12 = 80 ) 34( e13 = 40 ) 200

Married 50( e21 = 60) 64 ( e22 = 60 ) 36 ( e23 = 30) 150

Divorced 50( e31 = 40 ) 34 (e32 = 40 ) 16( e33 = 20 ) 100

Widowed 16( e41 = 20 ) 20 ( e42 = 20 ) 14( e43 = 10 ) 50

Column Total 200 200 100 500

The test statistic is computed using χ² = Σ Σ (oᵢⱼ − eᵢⱼ)²/eᵢⱼ:

χ² = (84 − 80)²/80 + (82 − 80)²/80 + (34 − 40)²/40 + (50 − 60)²/60 + (64 − 60)²/60 + (36 − 30)²/30
   + (50 − 40)²/40 + (34 − 40)²/40 + (16 − 20)²/20 + (16 − 20)²/20 + (20 − 20)²/20 + (14 − 10)²/10
   = 10.8833
Decision and Conclusion: Since χ² = 10.8833 < 12.592 is not in the critical region, there is no
sufficient evidence to reject H0; absenteeism behavior appears to be independent of marital status.
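The same figures can be verified programmatically (a minimal sketch, assuming the NumPy library; the rows follow the absenteeism table above):

import numpy as np

observed = np.array([[84, 82, 34],    # single
                     [50, 64, 36],    # married
                     [50, 34, 16],    # divorced
                     [16, 20, 14]])   # widowed

# Expected frequencies from the marginal totals, then the chi-square statistic
expected = np.outer(observed.sum(axis=1), observed.sum(axis=0)) / observed.sum()
chi2_stat = ((observed - expected) ** 2 / expected).sum()
print(chi2_stat)   # about 10.88, below the critical value 12.592 with 6 degrees of freedom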

Example 7: A survey of 100 firms found the following evidence regarding profitability
and market share:

Profitability                       Market share
                  < 15%       15 – 30%       > 30%       Total
Low                 18             7             8          33
Medium              13            11             8          32
High                 8            12            15          35
Total               39            30            31         100

Is there evidence that market share and profitability are associated?


Solution:
H0: there is no association between market share and profitability

H1: there is some association between market share and profitability

Test statistic: χ² = Σ Σ (oᵢⱼ − eᵢⱼ)²/eᵢⱼ
The critical region (at the 5% significance level) is χ² > χ²(0.05, v), where v = (r – 1)(c – 1) = (3 – 1)(3 – 1) = 4.
Therefore, the critical region is χ² > 9.488.
Computation:

A quick way to calculate the expected value in any cell is to multiply the appropriate row
total by the column total and divide by the grand total (100). For example, to get the
expected value for the Low / < 15% cell:

Expected value = (33 × 39)/100 = 12.87

In carrying out the analysis, care should again be taken to ensure that information is retained
about the sample size, i.e. the numbers in the table should be actual numbers and not
percentages or proportions. This can be checked by confirming that the grand total is the same as the
sample size.

Note that the χ² test is only valid if the expected value in each cell is not less than five.

Profitability Market share

< 15 %        15 – 30 %        > 30%        Total

Low 18 ( 12.87 ) 7 ( 9.90 ) 8 ( 10.23 ) 33

Medium 13 ( 12.48) 11( 9.60 ) 8 ( 9.92 ) 32

High 8 ( 13.65 ) 12 ( 10.50) 15 ( 10.85 ) 35

Total 39 30 31 100

The calculated value of the test statistic is obtained in the same way as before, the
formula being the same:

χ² = Σ Σ (oᵢⱼ − eᵢⱼ)²/eᵢⱼ

The evaluation of the test statistic then proceeds as follows, cell by cell:

χ² = 2.05 + 0.85 + 0.48 + 0.02 + 0.20 + 0.36 + 2.34 + 0.21 + 1.59
   = 8.118 (the total is based on the unrounded cell contributions)
Decision and Conclusion: Since χ² = 8.118 < 9.488 is not in the critical region, there is no
sufficient evidence to reject H0; there is no clear evidence that market share and profitability are associated.

Chi-square limitations
The chi-square distribution is only an approximation for the true distribution for contingency
analysis. We use the chi-square approximation because the true distribution is impractical to
compute in most instances. However, the approximation (and, therefore, the conclusion
reached) is quite good when all expected cell frequencies are at least 5.0. When expected cell
frequencies drop below 5.0, the calculated chi-square value tends to be inflated and may
inflate the true probability of a Type I error beyond the stated significance level. As a rule, if the null
hypothesis is not rejected, you do not need to worry when the expected cell frequencies drop
below 5.0.
There are two alternatives that can be used to overcome the small expected-cell-frequency
problem. The first is to increase the sample size. This may increase the marginal frequencies
in each row and column enough to increase the expected cell frequencies. The second option
is to combine the categories of the row and/or column variables. If you do decide to group
categories together, there should be some logic behind the resulting categories. You don't
want to lose the meaning of the results through poor groupings. You will need to examine
each situation individually to determine whether the option of grouping classes to increase
expected cell frequencies makes sense.
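A small sketch of how the "expected frequency of at least 5" rule might be checked before running the test is given below (an illustrative helper of our own, assuming the NumPy library; the threshold of 5 follows the rule quoted above):

import numpy as np

def check_expected_frequencies(observed, threshold=5.0):
    """Return the expected table and warn about cells below the threshold."""
    observed = np.asarray(observed, dtype=float)
    expected = np.outer(observed.sum(axis=1), observed.sum(axis=0)) / observed.sum()
    too_small = expected < threshold
    if too_small.any():
        print(f"Warning: {int(too_small.sum())} expected cell(s) below {threshold}; "
              "consider a larger sample or combining categories.")
    return expected

# Example 7 data: all expected cells turn out to be at least 5, so no warning is printed
check_expected_frequencies([[18, 7, 8], [13, 11, 8], [8, 12, 15]])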

Activity 13: r × c contingency tables
Show your steps clearly and neatly.
Take about 15 minutes.
To see if there is any dependency between the type of professional job held and one’s
religious affiliation, a random sample of 1000 individuals belonging to a national
organization of doctors, lawyers and engineers were chosen for a study. The results of the
sample are given in the following contingency table.
                                  Profession
                     Doctors    Lawyers    Engineers    Total
Religion  Orthodox      100        170         230        500
          Catholic       50        130         120        300
          Protestant    100         50          50        200
          Total         250        350         400       1000


Test the hypothesis at 5% level of significance that the profession of individuals and their
religious affiliations are independent.
Assume that we want to test whether there is a relationship between one’s economic status
and the presence of cancer. To this end, we take a random sample of 100 persons and we
obtain the following results.
                              Economic status
                        Low    Medium    High    Total
Cancer      Positive      5        6        9       20
condition   Negative     25       30       25       80
            Total        30       36       34      100
Test the hypothesis at 5% level of significance that one’s economic status and the presence of
cancer are independent.
Dear learner, I hope you have been following attentively every lesson discussed so far. Now,
to remind the main points of unit 3, look at the unit summary below.

 Summary
This unit has introduced two very useful statistical tools goodness-of-fit tests and contingency
analysis.
Goodness –of-fit testing is used when a decision maker wishes to determine whether sample
data come from a population having specific characteristics.
The goodness- of-fit procedure that was introduced in this unit is the chi-square goodness-of-
fit test.
This test relies on the idea that if the distribution of the sample data is substantially different
from the hypothesized population distribution, then the population distribution from which
these sample data came must not be what was hypothesized.
The chi-square goodness-of-fit test is most effective when the sample data have been
organized into a grouped frequency distribution.
Contingency analysis is a frequently useful tool that allows the decision maker to test whether
responses to two variables are independent.
A two-way classification of observations is known as a contingency table. The independence
or otherwise of the two variables may be tested using the χ² distribution, by comparing
observed values with those expected under the null hypothesis of independence.

EQUATIONS
Chi-Square Goodness-of-Fit Test Statistic

χ² = Σ (oᵢ − eᵢ)²/eᵢ, summed over the k categories

Chi-Square Contingency Test Statistic

χ² = Σ Σ (oᵢⱼ − eᵢⱼ)²/eᵢⱼ, with df = (r – 1)(c – 1)

Expected Cell Frequencies

eᵢⱼ = (ith row total × jth column total)/grand total

 Self Assessment Questions-3

1. Four different holiday firms which all carried equal numbers of holiday makers reported
the following numbers who expressed satisfaction with their holiday:
Firm A B C D
Number satisfied 576 558 580 546

Test the hypothesis at 5% level of significance that the four firms are more or less equally
satisfactory.
2. ( Business Application) A study of automobile drivers was conducted to determine
whether the number of traffic citations issued during a three-year period was independent of
the sex of the driver. The following data were collected.
Sex of Driver
Citations Issued Male Female
0 240 160
1 80 40
2 32 18
3 11 9
Over 3 5 4
Using an α = 0.05 level, determine whether the two variables (citations issued and sex of
driver) are independent.
3. In a recent labor negotiation, union officials collected data from a sample of union
members regarding how long they had been with the company and how long they would be
willing to stay out on strike if a strike were called. The following data were collected.
Strike- Length Toleration
Time with Company Under 1 Week 1 – 4 Week Over 4 Weeks
Under 1 Year 23 6 3
1 – 2 Years 19 15 8
2 – 5 Years 20 23 19
5 – 10 Years 4 21 29
Over 10 Years 2 5 18
Based on these data, can the union conclude that strike-length toleration is independent of
time with the company? Test at the α = 0.05 level.

Alright dear learner! If you are in doubt of your answers, you can refer to the notes or the
answer key section.

Unit Four

One-Way ANOVA
Contents
Introduction
4.1. The Rationale Behind ANOVA
4.2. Assumptions of ANOVA
4.3. Applying One-Way ANOVA
4.4. The Steps in Testing a Hypothesis for Equality of Several Means
Summary
Self Assessment Questions (SAQ-4)

Introduction
Dear Learner, first read the following introductory ideas.

In unit-2 of this module, procedures for determining whether or not two populations have
equal means were presented. However, the focus of this unit will be about procedures for
determining whether or not three or more populations have equal means. To conduct this test,
you will need a new tool called ANalysis Of VAriance (ANOVA). It was developed by R.A.
Fisher and is sometimes called the F-test.

ANOVA lets us test hypotheses about the mean (average) of a dependent variable across
different groups. While the t-test is used to compare the means between two groups, ANOVA
is used to compare means between 3 or more groups.

There are several varieties of ANOVA, such as one-factor (or one-way) ANOVA, two-factor
(or two-way) ANOVA, and so on, and also repeated measures ANOVA. The factors are the
independent variables, each of which must be measured on a categorical scale - that is, levels
of the independent variable must define separate groups.

One-way refers to the situation when only one factor (variable) is considered. One-factor
ANOVA, also called one-way ANOVA, is used when the study involves 3 or more levels of a
single independent variable.

For instance, to find out if there is any significant difference in the average sales figures of 4
salesmen employed by the same company, what we may assume is that, sales figure depends
on salesman selling ability. If we consider other factors affecting sales volume, like price
charged, extent of advertisement, etc, the corresponding statistical test is considered as multi-
way ANOVA.

As a second example, to find out if there is any significant difference in the average test
scores of students who are exposed to one of three different teaching techniques (three levels
of a single independent variable), what we may assume is that, test score depends on
teaching technique.

As a third example, to find out if the average monthly expenditures of families of 4 in five
localities are similar or not, what we may assume is that, monthly expenditure depends on
locality.

As you study this unit, you are expected to:


 State the assumptions of ANOVA;
 Describe the steps in ANOVA test;
 Perform ANOVA test and interpret the result.

4.1. The Rationale Behind ANOVA

Dear learner, the one-way Analysis of Variance (ANOVA) is used with one categorical
independent variable and one continuous variable. The independent variable can consist of
any number of groups (levels).

For example, an experimenter hypothesizes that learning in groups of three will be more
effective than learning in pairs or individually. Students are randomly assigned to three
groups and all students study a section of text. Those in group one study the text individually
(control group), those in group two study in groups of two and those in group three study in
groups of three. After studying for some set period of time all students complete a test over
the materials they studied. First, note that this is a between-subjects design since there are
different subjects in each experimental condition. Second, notice that, instead of two groups
(i.e., levels) of the independent variable, we now have three. The t-test, which is often used in
similar experiments with two groups, is only appropriate for situations where there are only
two levels of one independent variable. When there is a categorical independent variable and
a continuous dependent variable and there are more than two levels of the independent
variable and/or there is more than one independent variable (a case that would require a
multi-way, as opposed to one way ANOVA), then the appropriate analysis is the work horse
of experimental psychology research, the analysis of variance.

In the case where there are more than two levels of the independent variable the analysis goes
through two steps. First, we carry out an over-all F test to determine if there is any significant
difference existing among any of the means. If this F score is statistically significant, then we
carry out a second step in which we compare sets of two means at a time in order to
determine specifically where the significant difference lies. Let's say that we have run the
experiment on group learning and we recognize that this is an experiment for which the
appropriate analysis is the between-subjects one-way analysis of variance. We use a

statistical program and analyze the data with group as the independent variable and test score
as the dependent variable.

4.2. Assumptions of ANOVA
In analyzing equality of means of three or more populations, we assume the following:
1. The populations are normally distributed.
2. The populations have equal standard deviations.
3. The samples are selected independently.
4.3. Applying One-Way ANOVA
Dear learner, most ANOVA calculations nowadays are done using computer software such as
Excel and Minitab. The software packages will do all the computations while we focus on
interpreting the results. But, in this unit, we will illustrate the manual computational approach
of ANOVA. Recall from equation 6-1 that we can partition the total sum of squares into two
components: SST = SSB + SSW.
One-Way ANOVA Table:

Source of Variation     SS      df        MS                     F-Ratio
Between Samples         SSB     k − 1     MSB = SSB/(k − 1)      F = MSB/MSW
Within Samples          SSW     N − k     MSW = SSW/(N − k)
Total                   SST     N − 1

Where: k = Number of populations


N = Sum of the sample sizes from all populations
df = Degrees of freedom
SSB: Sum of Squares Between
SSW: Sum of Squares Within
MSB: Mean of Squares Between
MSW: Mean of Squares Within
Dear learner, there are two methods computing SSB and SSW.
Method 1: Traditional Method

SSB = Σ nᵢ(x̄ᵢ − x̄)², summed over the k samples
SSW = Σ Σ (xᵢⱼ − x̄ᵢ)², summed over all observations
SST = SSB + SSW

Where: x̄ᵢ = the mean of the ith sample
x̄ = the grand mean (the mean of all N observations)
k = the number of samples (groups)
nᵢ = the size of the ith sample
xᵢⱼ = the jth element in the ith sample

Method 2: Short-cut Method

SST = TS − CF,   SSB = Σ (Tᵢ²/nᵢ) − CF,   SSW = SST − SSB

Where CF (Correction Factor) = (Σ Σ xᵢⱼ)²/N (the square of the grand total divided by N),
TS (Total Sum of squares) = Σ Σ xᵢⱼ², and Tᵢ = the sum of the observations in the ith sample.

The "Between Groups" row represents what is often called "explained variance" or
"systematic variance". We can think of this as variance that is due to the independent
variable, the difference among the three groups. For example the difference between a
person's score in group one and a person's score in group two would represent explained
variance. The "Within Groups" variance represents what is often called "error variance". This
is the variance within your groups, variance that is not due to the independent variable. For
example, the difference between one person in group one and another person in group one
would represent error variance. Intuitively, it's important to understand that, at its heart, the
analysis of variance and the F score it yields is a ratio of explained variance to error variance.
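A minimal Python sketch of Method 1 (the traditional partition) is given below; the function name and structure are ours, not part of the module, and NumPy is assumed to be available:

import numpy as np

def one_way_anova(groups):
    """Partition total variation into between- and within-sample components and return F."""
    all_obs = np.concatenate([np.asarray(g, dtype=float) for g in groups])
    grand_mean = all_obs.mean()

    # SSB: weighted squared deviations of the group means from the grand mean
    ssb = sum(len(g) * (np.mean(g) - grand_mean) ** 2 for g in groups)
    # SSW: squared deviations of each observation from its own group mean
    ssw = sum(((np.asarray(g, dtype=float) - np.mean(g)) ** 2).sum() for g in groups)

    k, N = len(groups), len(all_obs)
    msb, msw = ssb / (k - 1), ssw / (N - k)
    return ssb, ssw, msb / msw   # the F-ratio is explained variance over error variance

Applied to the typist data of Example 1 below, this returns SSB = 130, SSW = 168 and an F-ratio of about 4.64, matching the hand computation.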

4.4. The Steps in Testing a Hypothesis for Equality of Several Means

1. State the hypotheses:
   H0: μ1 = μ2 = ... = μk
   H1: At least two means are not equal
2. Choose the level of significance (α).
3. Determine the test statistic (in our case, the test statistic is F = MSB/MSW).
4. Determine the critical region (CR): the critical region is F > F(α, k − 1, N − k).
5. Computation: compute the test statistic. (Make use of the ANOVA table in your computation.)
6. Decision and Conclusion: Reject H0 if the test statistic falls in the critical region; otherwise
   do not reject H0.

Example-1:

Suppose that a typewriter manufacturer prepared three different study manuals for use by
typists learning to operate a typewriter. Each manual was then studied by a simple random
sample of 5 typists. The time to achieve proficiency was recorded for each typist as given
below.

Manual-1 Manual-2 Manual-3
21 17 31
27 25 28
29 20 22
23 15 30
25 23 24

At a level of significance α = 0.05, test whether there is a significant difference in the
population mean learning times. (Assume that the populations are normally distributed with
equal standard deviations and the samples are selected independently.)

Solution:

H0: Population mean learning times are equal. That is,


H 0 : μ1=μ 2=μ 3

H1: At least two means are not equal


The level of significance of the test is α = 0.05.

The test statistic is F = MSB/MSW.

Determine the critical region (CR):
The critical region is F > F(α, k − 1, N − k) = F(0.05, 3 − 1, 15 − 3) = F(0.05, 2, 12) = 3.89.

Computation: Compute the test statistic.

Manual-1 (xᵢ₁)   Manual-2 (xᵢ₂)   Manual-3 (xᵢ₃)
      21               17               31
      27               25               28
      29               20               22
      23               15               30
      25               23               24
T₁ = 125          T₂ = 100         T₃ = 135

Now, we fill the One –way ANOVA Table as follows

Source of Variation    SS     df                    MS                   F-Ratio
Between Samples        130    k − 1 = 3 − 1 = 2     MSB = 130/2 = 65     F = 65/14 = 4.64
Within Samples         168    N − k = 15 − 3 = 12   MSW = 168/12 = 14
Total                  298    14

Decision and Conclusion: Clearly, the computed value of F (4.64 > 3.89) falls in the critical region.
Therefore, we reject H0. That means, at least two of the population mean learning times are
not equal.

The computation of the F- ratio using the method-2 (Short-cut Method) can be done as
follows.

Where CF (Correction Factor) = (Σ Σ xᵢⱼ)²/N and TS (Total Sum of squares) = Σ Σ xᵢⱼ².

Manual-1                Manual-2                Manual-3
xᵢ₁       xᵢ₁²          xᵢ₂       xᵢ₂²          xᵢ₃       xᵢ₃²
21        441           17        289           31        961
27        729           25        625           28        784
29        841           20        400           22        484
23        529           15        225           30        900
25        625           23        529           24        576
T₁ = 125  Σ = 3165      T₂ = 100  Σ = 2068      T₃ = 135  Σ = 3705

CF = (125 + 100 + 135)²/15 = 360²/15 = 8,640 and TS = 3,165 + 2,068 + 3,705 = 8,938.

Thus, SST = TS − CF = 8,938 − 8,640 = 298 and
SSB = (125²/5 + 100²/5 + 135²/5) − CF = 8,770 − 8,640 = 130.

So, SSW = SST − SSB = 298 − 130 = 168, which reproduces the ANOVA table above and gives F = 4.64.
Example-2:

Assume that you want to test whether the populations of 4 brands of light bulbs have equal
life time on average or not. To this end, you take 4, 5, 7 and 6 light bulbs from brand-1,
brand-2, brand-3 and brand-4, respectively and keep them lit until all of them burned out. As
a result, you discover the following data (burning times in hours).

Observation    Brand-1    Brand-2    Brand-3    Brand-4
     1           1000       1050        879        957
     2            980       1074        957        820
     3           1010       1102       1040       1020
     4           1200        990       1011       1150
     5                      1012       1000       1002
     6                                 1024       1030
     7                                 1200

At a level of significance α = 0.05, test whether there is a significant difference in the
average lifetime of the 4 brands of light bulbs. (Assume that the populations are normally
distributed with equal standard deviations and the samples are selected independently.)

Solution:
H0: The average lifetime of the 4 brands of light bulbs are equal. That is,
H 0 : μ1=μ 2=μ 3=μ 4
H1: At least two means are not equal
The level of significance of the test is α = 0.05.

The test statistic is F = MSB/MSW.

Determine the critical region (CR):
The critical region is F > F(α, k − 1, N − k) = F(0.05, 4 − 1, 22 − 4) = F(0.05, 3, 18) = 3.16.

Computation: Compute the test statistic using the short-cut method,
where CF (Correction Factor) = (Σ Σ xᵢⱼ)²/N and TS (Total Sum of squares) = Σ Σ xᵢⱼ².

Brand-1              Brand-2              Brand-3              Brand-4
xᵢ₁     xᵢ₁²         xᵢ₂     xᵢ₂²         xᵢ₃     xᵢ₃²         xᵢ₄     xᵢ₄²
1000   1000000       1050   1102500        879    772641        957    915849
 980    960400       1074   1153476        957    915849        820    672400
1010   1020100       1102   1214404       1040   1081600       1020   1040400
1200   1440000        990    980100       1011   1022121       1150   1322500
                     1012   1024144       1000   1000000       1002   1004004
                                          1024   1048576       1030   1060900
                                          1200   1440000
T₁ = 4190            T₂ = 5228            T₃ = 7111            T₄ = 5979
Σx² = 4420500        Σx² = 5474624        Σx² = 7280787        Σx² = 6016053

Σ Tᵢ²/nᵢ = 4190²/4 + 5228²/5 + 7111²/7 + 5979²/6 = 23,037,255.44

CF (Correction Factor) = (4190 + 5228 + 7111 + 5979)²/22 = 22508²/22 = 23,027,730.18 and
TS = 4,420,500 + 5,474,624 + 7,280,787 + 6,016,053 = 23,191,964.

Thus, SST = TS − CF = 23,191,964 − 23,027,730.18 = 164,233.82 and
SSB = Σ Tᵢ²/nᵢ − CF = 23,037,255.44 − 23,027,730.18 = 9,525.26.

So, SSW = SST − SSB = 164,233.82 − 9,525.26 = 154,708.56.

Now, we fill the One-Way ANOVA Table as follows:

Source of Variation    SS            df                    MS                              F-Ratio
Between Samples         9,525.26     k − 1 = 4 − 1 = 3     MSB = 9,525.26/3 = 3,175.09     F = 3,175.09/8,594.92 = 0.37
Within Samples        154,708.56     N − k = 22 − 4 = 18   MSW = 154,708.56/18 = 8,594.92
Total                 164,233.82     21

Decision and Conclusion: Clearly, the computed value of F (0.37) is not in the critical region.
Therefore, there is no sufficient reason to reject H0. That means, the population mean lifetimes
of the four brands of light bulbs are taken to be equal, unless different evidence comes to conclude
otherwise.
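This result can be checked in the same way as Example 1 (a sketch only, assuming the SciPy library; the groups follow the brand columns shown above):

from scipy.stats import f, f_oneway

brand_1 = [1000, 980, 1010, 1200]
brand_2 = [1050, 1074, 1102, 990, 1012]
brand_3 = [879, 957, 1040, 1011, 1000, 1024, 1200]
brand_4 = [957, 820, 1020, 1150, 1002, 1030]

f_stat, p_value = f_oneway(brand_1, brand_2, brand_3, brand_4)
print(f_stat)                       # about 0.37
print(f.ppf(0.95, dfn=3, dfd=18))   # critical value, about 3.16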

Dear learner, I hope you have been following attentively every lesson discussed so far. Now,
to remind the main points of unit 4, look at the unit summary below.

 Summary
ANOVA is an acronym for "ANalysis Of VAriance". It was developed by R.A. Fisher and is
sometimes called F-Test. ANOVA lets us test hypotheses about the mean (average) of a
dependent variable across different groups. While the t-test is used to compare the means
between two groups, ANOVA is used to compare means between 3 or more groups. There
are several varieties of ANOVA, such as one-factor (or one-way) ANOVA, two-factor (or
two-way) ANOVA, and so on, and also repeated measures ANOVA. The factors are the
independent variables, each of which must be measured on a categorical scale - that is,
levels of the independent variable must define separate groups. One-way refers to the
situation when only one factor (variable) is considered. One-factor ANOVA, also called
one-way ANOVA is used when the study involves 3 or more levels of a single
independent variable.

 Self Assessment Questions-4

1) Suppose there are three factories whose outputs have been sampled with the results shown
in Table below
Observation Factory 1 Factory 2 Factory 3
1 415 385 408
2 430 410 415
3 395 410 418
4 399 403 440
5 408 405 425
6 418 400
7 399

At a level of significance α = 0.05, test whether there is a significant difference in the
average outputs of the 3 factories. (Assume that the populations are normally distributed with
equal standard deviations and the samples are selected independently.)

