Chapter 2=Estimation

Chapter Two discusses statistical estimation, focusing on how to estimate population parameters based on sample statistics. It covers key concepts such as estimators, point and interval estimates, and the four important properties of estimators: unbiasedness, efficiency, consistency, and sufficiency. The chapter also explains how to construct confidence intervals for population means, providing examples to illustrate the estimation process.

Uploaded by

viza VS netsi


CHAPTER TWO

STATISTICAL ESTIMATION

Introduction
After developing the sampling distributions of sample statistics, it is
important to estimate where the corresponding population parameters
might be located. That is, the population parameters must be
estimated from the samples in hand.
As its name suggests, the objective of estimation is to determine the
approximate value of a population parameter on the basis of a sample
statistic. An estimator of a population parameter is a random variable
that is a function of the sample data. An estimate is a specific value
of this random variable, calculated from one particular sample.
The sampling distribution of the mean shows how far sample means
could be from a known population mean. Similarly, the sampling
distribution of the proportion shows how far sample proportions could
be from a known population proportion. In estimation, our aim is to
determine how far an unknown population mean could be from the
mean of a simple random sample selected from that population; or
how far an unknown population proportion could be from a sample
proportion. Those are the concerns of statistical inference, in which a
statement about an unknown population parameter is derived from
information contained in a random sample selected from the
population.

Basic concepts:
Estimation: is the process of using statistics as estimates of
parameters. It is any procedure where sample information is used
to estimate/predict the numerical value of some population
measure (called a parameter).
Estimator: refers to any sample statistic that is used to estimate
a population parameter. E.g. $\bar{x}$ for $\mu$, $\hat{p}$ for $p$.
Estimate: is a specific numerical value of our estimator. E.g.
9, 2, 5

$\bar{x}$, $s$, $\hat{p}$ ............ Estimators
$\mu$, $\sigma$, $p$ ............ items being estimated
1, 0.5, 9, 3 ............ Estimates

Four Important Properties of Estimators

A number of different estimators are possible for the same population
parameter, but some estimators are better than others. To understand
how, we need to look at four important properties of estimators:
unbiasedness, efficiency, consistency, and sufficiency.

Unbiasedness: An estimator exhibits unbiasedness when the mean
of its sampling distribution is equal to the population parameter:
$E(\hat{\theta}) = \theta$.
In general, unbiasedness is a desirable property for an estimator. The
sample mean is an unbiased estimator of the population mean.
Similarly, the sample variance (computed with divisor n - 1) is an
unbiased point estimator of the population variance because the mean
of the sampling distribution of the sample variance is equal to the
population variance. And the sample proportion is an unbiased
estimator of the population proportion. However, because standard
deviation is a nonlinear function of variance, the sample standard
deviation is not an unbiased estimator of population standard
deviation. The bias of a point estimator is:
Bias $= E(\hat{\theta}) - \theta$. If there are a number of unbiased
estimators to choose from, there are three other criteria that could be
used to select an estimator.
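The unbiasedness claims above can be checked exactly on a small case. The following is a minimal sketch, assuming a hypothetical five-value population and samples of size 2 drawn with replacement, so that every ordered sample is equally likely. The population values and variable names are illustrative assumptions, not taken from the chapter.

```python
from itertools import product
from statistics import mean, pstdev, pvariance, stdev, variance

# Hypothetical toy population (an assumption for illustration only).
population = [10, 20, 30, 40, 50]
n = 2  # sample size

# All ordered samples of size n drawn with replacement (equally likely).
samples = list(product(population, repeat=n))

# Average each estimator over its whole sampling distribution.
mean_of_sample_means = mean(mean(s) for s in samples)
mean_of_sample_vars = mean(variance(s) for s in samples)  # divisor n - 1
mean_of_sample_sds = mean(stdev(s) for s in samples)

mu = mean(population)
sigma_sq = pvariance(population)  # population variance, divisor N
sigma = pstdev(population)

print("E(x-bar) =", mean_of_sample_means, " mu =", mu)
print("E(s^2)   =", mean_of_sample_vars, " sigma^2 =", sigma_sq)
print("E(s)     =", round(mean_of_sample_sds, 3), " sigma =", round(sigma, 3))
```

In this sketch the first two pairs agree, while the average sample standard deviation falls below the population standard deviation, which is the bias described in the paragraph.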
Efficiency: Efficiency is another standard that can be used to
evaluate estimators. Efficiency refers to the size of the standard
error of the statistic. The most efficient estimator is the one with the
smallest variance. Thus, if there are two unbiased estimators of
$\theta$, $\hat{\theta}_1$ and $\hat{\theta}_2$, with
$E(\hat{\theta}_1) = E(\hat{\theta}_2) = \theta$, then $\hat{\theta}_1$ is
said to be more efficient than $\hat{\theta}_2$ if
$\mathrm{var}(\hat{\theta}_1) < \mathrm{var}(\hat{\theta}_2)$.
Consistency: A third property of estimators, consistency, is related
to their behavior as the sample gets large. A statistic is a consistent
estimator of a population parameter if, as the sample size increases, it
becomes almost certain that the value of the statistic comes very close
to the value of the population parameter.
An unbiased estimator is a consistent estimator if the variance
approaches 0 as n increases. For example, the sample mean is an
unbiased and a consistent estimator of population mean. Although the
sample standard deviation is not an unbiased estimator of population
standard deviation, it is a consistent estimator of population standard
deviation.
Sufficiency: The last property of a good estimator is sufficiency. A
sufficient statistic is an estimator that utilizes all the information a
sample contains about the parameter to be estimated. For example,
the sample mean is a sufficient estimator of the population mean. This
means that no other estimator of the population mean from the same
sample data, such as the sample median, can add any further
information about the parameter (population mean) that is being
estimated.
Types of Estimates:
We can make two types of estimates about a population: a point
estimate and an interval estimate.

A point estimate is a single number that is used to estimate an
unknown population parameter. It is a single value that is measured
from a sample and used as an estimate of the corresponding
population parameter.
The most important point estimates (given that they are single values)
are:
o Sample mean $\bar{x}$ for population mean $\mu$;
o Sample proportion $\hat{p}$ for population proportion $P$;
o Sample variance $s^2$ for population variance $\sigma^2$; and
o Sample standard deviation $s$ for population standard deviation $\sigma$.

An interval estimate is a range of values used to estimate a
population parameter. It describes the range of values within which
a parameter might lie. Stated differently, an interval estimate is a
range of values within which the analyst can declare with some
confidence that the population parameter will fall.
Example:
Suppose we have the sample 10, 20, 30, 40 and 50 selected randomly
from a population whose mean $\mu$ is unknown.

The sample mean, $\bar{x} = \frac{10 + 20 + 30 + 40 + 50}{5} = 30$, is a point estimate of $\mu$.
On the other hand, if we state that the mean, $\mu$, is between $\bar{x} \pm 10$, the
range of values from 20 (30-10) to 40 (30+10) is an interval estimate.
Interval Estimation
Point estimators of population parameters, while useful, do not convey
as much information as interval estimators. Point estimation produces
a single value as an estimate of the unknown population parameter.
The estimate may or may not be close to the parameter value; in other
words, the estimate may be incorrect. An interval estimate, on the
other hand, is a range of values that conveys the fact that estimation
is an uncertain process. The standard error of the point estimator is
used in creating a range of values; thus a measure of variability is
incorporated into interval estimation. Further, a measure of
confidence in the interval estimator is provided; consequently,
interval estimates are also called confidence intervals. For these
reasons, interval estimators are considered more desirable than point
estimators.
Interval estimation for population mean, $\mu$
As a result of the Central Limit Theorem (discussed in Chapter III) the
following z formula for sample means can be used when sample sizes
are large, regardless of the shape of the population distribution, or for
smaller sizes if the population is normally distributed.

$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$

Rearranging the formula:

$\mu = \bar{x} - z \frac{\sigma}{\sqrt{n}}$

Because the sample mean can be greater than or less than the
population mean, z can be positive or negative. Thus, the preceding
expression takes the form:

$\bar{x} \pm z \frac{\sigma}{\sqrt{n}}$

The value of the population mean, $\mu$, lies somewhere within this range.
Rewriting this expression yields the confidence interval for population
mean:

$\bar{x} - z \frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z \frac{\sigma}{\sqrt{n}}$
The confidence interval for population mean is affected by:

1. The population distribution, i.e., whether the population is
normally distributed or not
2. The standard deviation, i.e., whether $\sigma$ is known or not.
3. The sample size, i.e., whether the sample size, n, is large or
not.

Confidence interval estimate of $\mu$ - Normal population, $\sigma$ known
A confidence interval estimate for $\mu$ is an interval estimate together
with a statement of how confident we are that the interval estimate is
correct.
When the population distribution is normal and at the same time $\sigma$ is
known, we can estimate $\mu$ (regardless of the sample size) using the
following formula (see footnote 1):

$\bar{x} \pm Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$

Where: $\bar{x}$ = sample mean; Z = value from the standard normal table
reflecting confidence level; σ = population standard deviation; n =
sample size; α = the proportion of incorrect statements (α = 1 – C);
and $\mu$ = unknown population mean
From the above formula we can learn that an interval estimate is
constructed by adding and subtracting the error term to and from the
point estimate. That is, the point estimate is found at the center of the
confidence interval.

Footnote 1: This formula also works for problems that involve a large sample size (n > 30) even
though the population is not normally distributed. And if n > 0.05N, the finite population
correction factor may be used.

To find the interval estimate of population mean, we have the
following steps.
1. Compute the standard error of the mean, $\sigma_{\bar{x}} = \sigma / \sqrt{n}$
2. Compute $\alpha/2$ from the confidence coefficient.
3. Find the Z value for $\alpha/2$ from the table
4. Construct the confidence interval
5. Interpret the results
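The five steps above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the function name `z_confidence_interval` is hypothetical, and the Z value is taken from the standard library's `statistics.NormalDist` instead of a printed table, so it is 1.95996 and not the rounded 1.96.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(x_bar, sigma, n, confidence):
    """Interval for the mean when sigma is known (hypothetical helper)."""
    std_error = sigma / sqrt(n)               # step 1: standard error
    alpha = 1 - confidence                    # step 2: alpha, then alpha / 2
    z = NormalDist().inv_cdf(1 - alpha / 2)   # step 3: Z value for alpha / 2
    margin = z * std_error
    return x_bar - margin, x_bar + margin     # step 4: the interval


# Figures from the telephone-call example in this section:
# n = 60, sample mean 4.26 minutes, sigma = 1.1 minutes, C = 0.95.
lower, upper = z_confidence_interval(4.26, 1.1, 60, 0.95)
print(f"{lower:.2f} <= mu <= {upper:.2f}")
```

Step 5, the interpretation, stays with the analyst: the code only produces the two limits.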
Examples:
1. The vice president of operations for Ethiopian Tele
Communication Corporation (ETC) is in the process of developing a
strategic management plan. He believes that the ability to estimate
the length of the average phone call on the system is important. He
takes a random sample of 60 calls from the company records and finds
that the mean sample length for a call is 4.26 minutes. Past history
for these types of calls has shown that the population standard
deviation for call length is about 1.1 minutes. Assuming that the
population is normally distributed and he wants to have a 95%
confidence, help him in estimating the population mean.
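Solution:
n = 60 calls, $\bar{x}$ = 4.26 minutes, σ = 1.1 minutes, C = 0.95
i. $\sigma_{\bar{x}} = \sigma / \sqrt{n} = 1.1 / \sqrt{60} = 0.142$
ii. α = 1 – C = 1 – 0.95 = 0.05; α/2 = 0.05/2 = 0.025
iii. $Z_{\alpha/2} = Z_{0.025} = 1.96$
iv. $\bar{x} \pm Z_{\alpha/2} \sigma_{\bar{x}}$ = 4.26 ± 1.96(0.142)
= 4.26 ± 0.28
3.98 ≤ $\mu$ ≤ 4.54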
The vice-president of ETC can be 95% confident that the average
length of a call for the population is between 3.98 and 4.54 minutes.
2. A survey conducted by “Addis Zemen Gazetta” found that the
sample mean age of men was 44 years and the sample mean age of
women was 47 years. All together, 454 people from Addis were
included in the reader poll –340 women and 114 men. Assume that
the population standard deviation of age for both men and women is 8
years.
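a. Develop a 95% confidence interval estimate for the mean age of
the population of men who read the gazetta.
b. Develop a 95% confidence interval estimate for the mean age of
the population of women who read the gazetta.
c. Compare the widths of the two interval estimates from parts (a)
and (b). Which one has better precision? Why?
Solution:
a. n = 114 men, $\bar{x}$ = 44 years, σ = 8 years, C = 0.95
i. $\sigma_{\bar{x}} = \sigma / \sqrt{n} = 8 / \sqrt{114} = 0.75$
ii. α = 1 – C = 1 – 0.95 = 0.05; α/2 = 0.05/2 = 0.025
iii. $Z_{\alpha/2} = Z_{0.025} = 1.96$
iv. $\bar{x} \pm Z_{\alpha/2} \sigma_{\bar{x}}$ = 44 ± 1.96(0.75)
= 44 ± 1.47
42.53 ≤ $\mu$ ≤ 45.47
b. n = 340 women, $\bar{x}$ = 47 years, σ = 8 years, C = 0.95
i. $\sigma_{\bar{x}} = \sigma / \sqrt{n} = 8 / \sqrt{340} = 0.434$
ii. α = 1 – C = 1 – 0.95 = 0.05; α/2 = 0.05/2 = 0.025
iii. $Z_{\alpha/2} = Z_{0.025} = 1.96$
iv. $\bar{x} \pm Z_{\alpha/2} \sigma_{\bar{x}}$ = 47 ± 1.96(0.434)
= 47 ± 0.85
46.15 ≤ $\mu$ ≤ 47.85
c. Part (b) has better precision because its sample size is larger
than in part (a), which gives a smaller standard error and a narrower
interval (± 0.85 versus ± 1.47).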
3. Time magazine reports information on the time required for
caffeine from products such as coffee and soft drinks to leave the body
after consumption. Assume that the 99% confidence interval estimate
of the population mean time for adults is 5.6 hrs to 6.4 hrs.
a. What is the point estimate of the mean time for caffeine to leave
the body after consumption?
b. If the population standard deviation is 2 hrs, how large a sample
was used to provide the interval estimate?
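Solution:
C = 0.99 Confidence interval: 5.6 ≤ $\mu$ ≤ 6.4

a. Point estimate = $\bar{x} = \frac{5.6 + 6.4}{2}$ = 6 hours
Or, adding the two limits $\bar{x} - Z_{\alpha/2} \sigma_{\bar{x}} = 5.6$ and $\bar{x} + Z_{\alpha/2} \sigma_{\bar{x}} = 6.4$:
12 = 2$\bar{x}$
$\bar{x}$ = 6 hours

b. C = 0.99, σ = 2 hours, Confidence interval: 5.6 ≤ $\mu$ ≤ 6.4, n = ?
α = 1 – C = 1 – 0.99 = 0.01; α/2 = 0.005; $Z_{0.005}$ = 2.57
$\bar{x} + Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} = 6.4$, i.e. $6 + 2.57 \frac{2}{\sqrt{n}} = 6.4$; rearranging the expression
$\sqrt{n} = \frac{2.57(2)}{0.4} = 12.85$; squaring both sides
n = 165.12 ≈ 165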

We state with 99% confidence that the mean time required for
caffeine to leave the body after consumption lies between 5.6 and 6.4
hrs.
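Part (b) of the caffeine example solves the margin-of-error expression for n. A minimal sketch of that rearrangement follows, assuming the table value Z = 2.57 used elsewhere in this chapter. The function name `required_sample_size` is a hypothetical helper.

```python
from math import ceil


def required_sample_size(sigma, margin, z):
    """Return (z * sigma / margin)^2, the unrounded sample size."""
    return (z * sigma / margin) ** 2


# Margin of error = half the interval width = (6.4 - 5.6) / 2 = 0.4 hours.
raw_n = required_sample_size(sigma=2, margin=0.4, z=2.57)
print(round(raw_n, 2))  # unrounded value from the formula
print(ceil(raw_n))      # rounded up to the next whole observation
```

The example reports n = 165 by rounding to the nearest integer. When a sample is being planned, as opposed to recovered from a reported interval, rounding up is the usual convention because it keeps the margin of error at or below the target.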
Confidence interval estimate of $\mu$ - Normal population, $\sigma$ unknown, n large
If we know that the population is normal, and we know the population
standard deviation, the confidence interval for $\mu$ should be
constructed in the manner already shown, i.e., $\bar{x} \pm Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$. If the
population standard deviation is unknown, it has to be estimated from
the sample; i.e., when $\sigma$ is unknown, we use the sample standard
deviation: $s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$. Then, the standard error of the mean, $\sigma_{\bar{x}}$, is
estimated by the sample standard error of the mean: $s_{\bar{x}} = \frac{s}{\sqrt{n}}$.

Therefore, the confidence interval to estimate $\mu$ when the population
standard deviation is unknown, the population is normal and n is large is
(see footnote 2):

$\bar{x} \pm Z_{\alpha/2} \frac{s}{\sqrt{n}}$
Examples:
1. Suppose that a car rental firm in Addis wants to estimate the
average number of miles traveled by each of its rented cars. A
random sample of 110 rented cars reveals that the sample mean
travel distance per day is 85.5 miles, with a sample standard deviation
of 19.3 miles. Compute a 99% confidence interval to estimate $\mu$.

Solution:
n = 110 rented cars, $\bar{x}$ = 85.5 miles, s = 19.3 miles, C = 0.99
i. $s_{\bar{x}} = s / \sqrt{n} = 19.3 / \sqrt{110} = 1.84$
ii. α = 1 – C = 1 – 0.99 = 0.01; α/2 = 0.01/2 = 0.005
iii. $Z_{\alpha/2} = Z_{0.005} = 2.57$
iv. $\bar{x} \pm Z_{\alpha/2} s_{\bar{x}}$ = 85.5 ± 2.57(1.84)
= 85.5 ± 4.73
80.77 ≤ $\mu$ ≤ 90.23

We state with 99% confidence that the average distance traveled by
rented cars lies between 80.77 and 90.23 miles.

Footnote 2: This formula also works for a large sample size even though the parent population is
not normally distributed.
2. A study is being conducted in a company that has 800
engineers. A random sample of 50 of these engineers reveals that the
average sample age is 34.3 years, and the sample standard deviation
is 8 years. Assuming normality, construct a 98% confidence interval
to estimate the average age of all engineers in this company.
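Solution:
n = 50 engineers, N = 800 engineers, $\bar{x}$ = 34.3 years, s = 8 years, C = 0.98
i. $s_{\bar{x}} = \frac{s}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}} = \frac{8}{\sqrt{50}} \sqrt{\frac{800 - 50}{800 - 1}} = 1.10$ (see footnote 3)
ii. α = 1 – C = 1 – 0.98 = 0.02; α/2 = 0.02/2 = 0.01
iii. $Z_{\alpha/2} = Z_{0.01} = 2.33$
iv. $\bar{x} \pm Z_{\alpha/2} s_{\bar{x}}$ = 34.3 ± 2.33(1.10)
= 34.3 ± 2.56
31.74 ≤ $\mu$ ≤ 36.86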
We state with 98% confidence that the mean age of engineers lies
between 31.74 and 36.86 years.
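The engineers example applies the finite population multiplier because the sample is more than 5% of the population. A minimal sketch of that adjustment is given below. The helper names `needs_fpc` and `standard_error_with_fpc` are assumptions introduced for illustration, and Z = 2.33 is the table value used in the example.

```python
from math import sqrt


def needs_fpc(n, population_size):
    """Rule of thumb from the chapter: correct when n > 0.05 * N."""
    return n > 0.05 * population_size


def standard_error_with_fpc(s, n, population_size):
    """Sample standard error of the mean times the finite population multiplier."""
    fpc = sqrt((population_size - n) / (population_size - 1))
    return (s / sqrt(n)) * fpc


se = standard_error_with_fpc(s=8, n=50, population_size=800)
margin = 2.33 * se  # Z for 98% confidence, from the table
print(needs_fpc(50, 800), round(se, 2), round(margin, 2))
```

The margin printed here uses the unrounded standard error, so it can differ in the second decimal place from a hand calculation that first rounds the standard error to 1.10.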
Footnote 3: Since the sample size is greater than 5% of the population size, the finite population
multiplier is used to calculate the sample standard error of the mean.

Confidence interval for $\mu$ - $\sigma$ unknown, n small, population normal
If the sample size is small (n<30), we can develop an interval estimate
of a population mean only if the population has a normal probability
distribution. If the sample standard deviation s is used as an
estimator of the population standard deviation and if the population
has a normal distribution, interval estimation of the population mean
can be based upon a probability distribution known as the t-distribution.
Characteristics of t-distribution
1. The t-distribution is symmetric about its mean (0) and ranges
from - ∞ to ∞.
2. The t-distribution is bell-shaped (unimodal) and has
approximately the same appearance as the standard normal
distribution (Z- distribution).
3. The t-distribution depends on a parameter ν (Greek nu), called
the degrees of freedom of the distribution (see footnote 4). ν = n - 1,
where n is the sample size. The degrees of freedom, ν, refer to the
number of values we can choose freely.
4. The variance of the t-distribution is ν/ (ν-2) for ν>2.
5. The variance of the t-distribution always exceeds 1.
6. As ν increases, the variance of the t-distribution approaches 1
and the shape approaches that of the standard normal distribution.
7. Because the variance of the t-distribution exceeds 1.0 while the
variance of the Z-distribution equals 1, the t-distribution is slightly
flatter in the middle than the Z-distribution and has thicker tails.
8. The t-distribution is a family of distributions with a different
density function corresponding to each different value of the
parameter ν. That is, there is a separate t-distribution for each
sample size. In proper statistical language, we would say, “There
is a different t-distribution for each of the possible degrees of
freedom”.
9. The t formula for sample means when $\sigma$ is unknown, the sample size is
small, and the population is normally distributed is:

$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$

This formula is essentially the same as the z-formula, but the
distribution table values are not.
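Items 4 to 6 of the list above can be made concrete by evaluating ν/(ν - 2) for a few degrees of freedom. This is a small sketch only, and the function name `t_variance` is a hypothetical helper.

```python
def t_variance(nu):
    """Variance of the t-distribution, nu / (nu - 2), defined for nu > 2."""
    if nu <= 2:
        raise ValueError("the variance nu / (nu - 2) requires nu > 2")
    return nu / (nu - 2)


# The variance always exceeds 1 and shrinks toward 1 as nu grows.
for nu in (3, 5, 10, 30, 60, 120):
    print(nu, round(t_variance(nu), 4))
```

The printed values fall steadily toward 1, which is why the t and Z tables give nearly the same values for large samples.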

The confidence interval to estimate $\mu$ becomes:

$\bar{x} \pm t_{\alpha/2, \nu} \frac{s}{\sqrt{n}}$

Where: $\bar{x}$ = sample mean
α = 1 – C
ν = n – 1 (degrees of freedom)
s = sample standard deviation
n = sample size
$\mu$ = unknown population mean
Footnote 4: What are degrees of freedom? We can define them as the number of values we can
choose freely. In general, the degrees of freedom for a t statistic are the degrees of
freedom associated with the sum of squares used to obtain an estimate of the
variance. The variance estimate depends not only on the sample size but also on
how many parameters must be estimated with the sample:

$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$

Here we calculate the sample variance by using n observations and estimating one
parameter (the mean). Thus, there are (n – 1) degrees of freedom.

Steps:
i. Calculate the degrees of freedom (ν = n - 1) and the sample
standard error of the mean, $s_{\bar{x}} = s / \sqrt{n}$.
ii. Compute $\alpha/2$
iii. Look up $t_{\alpha/2, \nu}$
iv. Construct the confidence interval
v. Interpret results
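The steps above translate directly into code. The sketch below is a minimal illustration under one stated assumption: the t value is looked up in a printed table and passed in by the caller, because the Python standard library has no t-distribution quantile function. The function name `t_confidence_interval` is hypothetical.

```python
from math import sqrt


def t_confidence_interval(x_bar, s, n, t_value):
    """Small-sample interval for the mean; t_value is read from a t table."""
    std_error = s / sqrt(n)         # step i: sample standard error of the mean
    margin = t_value * std_error    # steps ii-iii supply t for alpha/2 and nu = n - 1
    return x_bar - margin, x_bar + margin  # step iv


# Cab-fare example from this section: n = 20, mean Br 2.50, s = Br 0.50,
# 90% confidence, so nu = 19 and the table value is 1.729.
lower, upper = t_confidence_interval(2.50, 0.50, 20, t_value=1.729)
print(f"{lower:.2f} <= mu <= {upper:.2f}")
```

Passing the table value explicitly keeps the sketch free of third-party dependencies. A statistics library with a t quantile function could replace the lookup.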
Examples:
1. A random sample of 27 items produces $\bar{x}$ = 128.4 and s = 20.6.
What is the 98% confidence interval for $\mu$? Assume that x is normally
distributed for the population. What is the point estimate?
Solution:
The point estimate of the population mean is the sample mean; in this
case 128.4 is the point estimate.

n = 27, $\bar{x}$ = 128.4, s = 20.6, C = 0.98

i. $s_{\bar{x}} = s / \sqrt{n} = 20.6 / \sqrt{27} = 3.96$; ν = n – 1 = 27 – 1 = 26
ii. α = 1 – C = 1 – 0.98 = 0.02; α/2 = 0.02/2 = 0.01
iii. $t_{\alpha/2, \nu} = t_{0.01, 26} = 2.479$
iv. $\bar{x} \pm t_{\alpha/2, \nu} s_{\bar{x}}$ = 128.4 ± 2.479(3.96)
= 128.4 ± 9.82
118.58 ≤ $\mu$ ≤ 138.22
We state with 98% confidence that the population mean lies between
118.58 and 138.22.
2. A sample of 20 cab fares in Bahir Dar city shows a sample mean
of Br 2.50 and a sample standard deviation of Br. 0.50. Develop a
90% confidence interval estimate of the mean cab fares in Bahir Dar
city. Assume the population of cab fares has a normal distribution.

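Solution:
n = 20, $\bar{x}$ = Birr 2.50, s = Birr 0.50, C = 0.90

i. $s_{\bar{x}} = s / \sqrt{n} = 0.50 / \sqrt{20} = 0.112$; ν = n – 1 = 20 – 1 = 19
ii. α = 1 – C = 1 – 0.90 = 0.10; α/2 = 0.10/2 = 0.05
iii. $t_{\alpha/2, \nu} = t_{0.05, 19} = 1.729$
iv. $\bar{x} \pm t_{\alpha/2, \nu} s_{\bar{x}}$ = 2.50 ± 1.729(0.112)
= 2.50 ± 0.194
2.31 ≤ $\mu$ ≤ 2.69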
We state with 90% confidence that the mean of cab fares in Bahir Dar
city lies between Birr 2.31 and 2.69.
3. Sales personnel for X Company are required to submit weekly
reports listing customer contacts made during the week. A sample of
61 weekly contact reports showed a mean of 22.4 customer contacts
per week for the sales personnel. The sample standard deviation was
5 contacts.
a. Develop a 95% confidence interval estimate for the mean
number of weekly customer contacts for the population of sales
personnel.
b. Assume that the population of weekly contact data has a normal
distribution. Use the t distribution to develop a 95% confidence
interval for the mean number of weekly customer contacts.
c. Compare your answers for parts (a) and (b). What do you
conclude from your results?
Solution:
a. n = 61 weekly contact reports (see footnote 5), $\bar{x}$ = 22.4 contacts, s = 5 contacts, C = 0.95
i. $s_{\bar{x}} = s / \sqrt{n} = 5 / \sqrt{61} = 0.64$
ii. α = 1 – C = 1 – 0.95 = 0.05; α/2 = 0.05/2 = 0.025
iii. $Z_{\alpha/2} = Z_{0.025} = 1.96$
iv. $\bar{x} \pm Z_{\alpha/2} s_{\bar{x}}$ = 22.4 ± 1.96(0.64)
= 22.4 ± 1.25
21.15 ≤ $\mu$ ≤ 23.65
I state with 95% confidence that the mean number of weekly contacts lies
between 21.15 and 23.65 contacts.
b. n = 61 weekly contact reports, $\bar{x}$ = 22.4 contacts, s = 5 contacts, C = 0.95
i. $s_{\bar{x}} = s / \sqrt{n} = 5 / \sqrt{61} = 0.64$; ν = n – 1 = 61 – 1 = 60
ii. α = 1 – C = 1 – 0.95 = 0.05; α/2 = 0.05/2 = 0.025
iii. $t_{\alpha/2, \nu} = t_{0.025, 60} = 2.00$
iv. $\bar{x} \pm t_{\alpha/2, \nu} s_{\bar{x}}$ = 22.4 ± 2.00(0.64)
= 22.4 ± 1.28
21.12 ≤ $\mu$ ≤ 23.68
I state with 95% confidence that the mean number of weekly contacts lies
between 21.12 and 23.68 contacts.
c. The two intervals are nearly identical: as the sample size
increases, the t-distribution approaches the Z (normal) distribution.

Footnote 5: Since the sample size is large, we use the Z-distribution to construct the confidence
interval.

Confidence interval for $\mu$ - n small, $\sigma$ unknown, population not normal
This is solved by non-parametric tests, which do not require
assumption about the underlying form of the population data.
Interval Estimation of the Population Proportion
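We know that a sample proportion, $\hat{p}$, is an unbiased estimator of a
population proportion P and, if the sample size is large, the
sampling distribution of $\hat{p}$ is approximately normal with mean $P$ and
standard error $\sigma_{\hat{p}} = \sqrt{PQ/n}$, where $Q = 1 - P$, so that

$Z = \frac{\hat{p} - P}{\sqrt{PQ/n}}$

However, here P is unknown and we want to estimate P by $\hat{p}$, and
hence Z becomes $Z = \frac{\hat{p} - P}{\sqrt{\hat{p}\hat{q}/n}}$. That is, $\sqrt{PQ/n}$ is substituted by $\sqrt{\hat{p}\hat{q}/n}$.

Solving for P results in $P = \hat{p} - Z \sqrt{\hat{p}\hat{q}/n}$, and since Z can assume both
positive and negative values, it becomes $P = \hat{p} \pm Z \sqrt{\hat{p}\hat{q}/n}$.

Since Z represents the confidence level we write it as

$\hat{p} \pm Z_{\alpha/2} \sqrt{\frac{\hat{p}\hat{q}}{n}}$

Where: $\hat{p}$ = sample proportion
$\hat{q}$ = 1 - $\hat{p}$
α = 1 – C
n = sample size
P = unknown population proportion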
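The interval for a proportion can be sketched the same way as the interval for a mean. The following is a minimal illustration, with a hypothetical function name, that takes the Z value from `statistics.NormalDist` in place of a table lookup.

```python
from math import sqrt
from statistics import NormalDist


def proportion_confidence_interval(p_hat, n, confidence):
    """Large-sample interval for a population proportion (hypothetical helper)."""
    q_hat = 1 - p_hat
    std_error = sqrt(p_hat * q_hat / n)       # sample standard error of p-hat
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * std_error
    return p_hat - margin, p_hat + margin


# Telemarketing example from this section: n = 87, p-hat = 0.39, C = 0.95.
lower, upper = proportion_confidence_interval(0.39, 87, 0.95)
print(f"{lower:.4f} <= P <= {upper:.4f}")
```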
Examples:
1. Recently, a study of 87 randomly selected companies with
telemarketing operation was completed. The study revealed that 39%
of the sampled companies had used telemarketing to assist them in
order processing. Using this information estimate the population
proportion of telemarketing companies who use their telemarketing
operation to assist them in order processing taking a 95% confidence
level.
Solution:
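n = 87, $\hat{p}$ = 0.39, $\hat{q}$ = 0.61, C = 0.95
i. $s_{\hat{p}} = \sqrt{\hat{p}\hat{q}/n} = \sqrt{0.39(0.61)/87} = 0.0523$
ii. α = 1 – C = 1 – 0.95 = 0.05; α/2 = 0.05/2 = 0.025
iii. $Z_{\alpha/2} = Z_{0.025} = 1.96$
iv. $\hat{p} \pm Z_{\alpha/2} s_{\hat{p}}$ = 0.39 ± 1.96(0.0523)
= 0.39 ± 0.1025
0.2875 ≤ P ≤ 0.4925
We state with 95% confidence that the proportion of companies which
use telemarketing to assist order processing lies between 0.2875 and
0.4925.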
2. A fast food restaurant took a random sample of 400 customers
to determine the proportion of customers who are female. A
confidence interval of .73 to .87 was reported.
a. Find the number of females and the sample proportion
b. Find the level of confidence of this interval

Solution:
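a. n = 400, 0.73 ≤ P ≤ 0.87, $\hat{p}$ = ?, Number of females = ?
Point estimate = $\hat{p} = \frac{0.73 + 0.87}{2}$ = 0.8
Or, adding the two limits $\hat{p} - Z_{\alpha/2} s_{\hat{p}} = 0.73$ and $\hat{p} + Z_{\alpha/2} s_{\hat{p}} = 0.87$:
1.60 = 2$\hat{p}$
$\hat{p}$ = 0.8
Number of females (X) = n$\hat{p}$ = 400 × 0.8 = 320
b.
0.87 = 0.8 + $Z_{\alpha/2} \sqrt{\frac{0.8(0.2)}{400}}$
0.07 = $Z_{\alpha/2} \sqrt{0.0004}$
0.07 = 0.02 $Z_{\alpha/2}$
3.50 = $Z_{\alpha/2}$
P(0 ≤ Z ≤ 3.50) = 0.49977
C = 0.49977 × 2
= 99.954%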
3. A random sample of 400 faculty members at AAU contained 120
people who believed that the University should improve its library
service. On the basis of this sample information, an analyst calculated
the confidence interval (.25, .35) for the population proportion of
faculty members favoring improvement. What is the level of
confidence of this interval?
Solution:
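n = 400, X = 120, $\hat{p}$ = 120/400 = 0.30, Interval estimate 0.25 ≤ P ≤ 0.35, C = ?

0.25 = 0.30 - $Z_{\alpha/2} \sqrt{\frac{0.30(0.70)}{400}}$
0.05 = $Z_{\alpha/2} \sqrt{0.000525}$
0.05 = 0.023 $Z_{\alpha/2}$
2.17 = $Z_{\alpha/2}$
P(0 ≤ Z ≤ 2.17) = 0.485
C = 0.485 × 2
= 97%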
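Examples 2 and 3 run the interval formula backwards: given the limits and n, they recover Z and then the confidence level. A minimal sketch of that reverse calculation follows. The function name is hypothetical, and the normal area comes from `statistics.NormalDist` and not from a table, so results can differ from the hand calculations in the later decimal places.

```python
from math import sqrt
from statistics import NormalDist


def confidence_level_from_interval(lower, upper, n):
    """Recover the confidence level C of a reported proportion interval."""
    p_hat = (lower + upper) / 2       # point estimate sits at the centre
    margin = (upper - lower) / 2      # margin of error is half the width
    std_error = sqrt(p_hat * (1 - p_hat) / n)
    z = margin / std_error
    # Area between -z and +z under the standard normal curve.
    return 2 * NormalDist().cdf(z) - 1


restaurant = confidence_level_from_interval(0.73, 0.87, 400)
faculty = confidence_level_from_interval(0.25, 0.35, 400)
print(round(restaurant, 4), round(faculty, 2))
```

The expression 2Φ(z) - 1 is the same quantity as doubling the table area between 0 and z, which is how the examples above obtain C.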

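Interval Estimation of the Difference between two independent Means
It is clear that the unbiased point estimate of the difference between
the means of two populations, $\mu_1 - \mu_2$, is the difference between two
sample means, $\bar{x}_1 - \bar{x}_2$, where each sample is a random sample taken
from the respective target population. The confidence interval is
constructed by adding to and subtracting from this point estimate the
product of the relevant standard error, which is called the standard
error of the difference between means, and the Z value for the
confidence level desired.

Interval Estimation of $\mu_1 - \mu_2$ - populations normal, $\sigma_1$ and $\sigma_2$ known
If the two parent populations are normal, then the sampling
distribution of the difference between two means will be normally
distributed regardless of n (sample size). And we can estimate $\mu_1 - \mu_2$
(regardless of $n_1$ and $n_2$) using the following formula, given that $\sigma_1$ and $\sigma_2$ are
known (see footnote 6):

$(\bar{x}_1 - \bar{x}_2) \pm Z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$

When $\sigma_1$ and $\sigma_2$ are not known, the standard error of the difference between two sample
means, $\sigma_{\bar{x}_1 - \bar{x}_2}$, is estimated by the sample standard error of the
difference between two sample means, $s_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$, and
the interval estimation takes the following form:

$(\bar{x}_1 - \bar{x}_2) \pm Z_{\alpha/2} s_{\bar{x}_1 - \bar{x}_2}$, given that the sample sizes are large.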
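A minimal sketch of the large-sample interval for a difference between two means is shown below. The function name and argument order are assumptions made for illustration. The function accepts either known population variances or sample variances from large samples, since the two formulas have the same form.

```python
from math import sqrt
from statistics import NormalDist


def diff_means_confidence_interval(mean_diff, var1, n1, var2, n2, confidence):
    """Interval for mu1 - mu2 from the difference of two sample means."""
    std_error = sqrt(var1 / n1 + var2 / n2)   # standard error of the difference
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * std_error
    return mean_diff - margin, mean_diff + margin


# Egg-production example from this section: difference of sample means 1.2,
# variance 4 in each sample, 100 hens per sample, 95% confidence.
lower, upper = diff_means_confidence_interval(1.2, 4, 100, 4, 100, 0.95)
print(f"{lower:.2f} <= mu1 - mu2 <= {upper:.2f}")
```

An interval that lies entirely above zero, as here, is what supports the conclusion that the first population mean is larger.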

Footnote 6: This formula also works for problems that involve large sample sizes
even though the parent population may not be normally distributed.

Example:
1. In a sex discrimination case, an employee alleged that a large
corporation paid men more than women for comparable work. Let
population 1 represent all male employees performing certain jobs
and population 2 represent all female employees performing
comparable jobs at the corporation. Independent samples are taken
of males and females; the sample means are
and , and the sample standard deviations
are and . Construct a 95% confidence interval
for $\mu_1 - \mu_2$. What do you conclude from this?
Solution:
Male employees Female employees
males females C= 0.95

Steps:
i. Calculate the (sample) standard error of the difference
between two means
$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
ii. Compute $\alpha/2$
α = 1-C = 1- 0.95 = 0.05
α/2 = 0.05/2 = 0.025
iii. Look up $Z_{\alpha/2} = Z_{0.025} = 1.96$
iv. Construct the confidence interval
$(\bar{x}_1 - \bar{x}_2) \pm Z_{\alpha/2} s_{\bar{x}_1 - \bar{x}_2}$ = 900 ± 765.40
134.60 ≤ $\mu_1 - \mu_2$ ≤ 1,665.40
We state with 95% confidence that the mean salary difference
between the male and female workers lies between Birr 134.60 and
Birr 1,665.40.
Because this interval contains only positive values, we can be quite
confident that $\mu_1 - \mu_2$ > 0. Thus, it is reasonable to conclude that the
mean salary for males exceeds the mean salary for females.
2. A farmer wants to determine if different types of feed can
influence the mean number of eggs that hens lay per month. In a
random sample of 100 hens that ate feed 1, the average number of
eggs per month was with variance 4. In a random sample of
100 hens that ate feed2, the average number of eggs per month was
with variance 4. Construct a 95% confidence interval for $\mu_1 - \mu_2$.
What do you conclude?
Solution:
Feed 1 Feed 2
hens hens C= 0.95

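Steps:
i. Calculate the (sample) standard error of the difference
between two means
$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = \sqrt{\frac{4}{100} + \frac{4}{100}} = 0.283$
ii. Compute $\alpha/2$
α = 1-C = 1- 0.95 = 0.05
α/2 = 0.05/2 = 0.025
iii. Look up $Z_{\alpha/2} = Z_{0.025} = 1.96$
iv. Construct the confidence interval
$(\bar{x}_1 - \bar{x}_2) \pm Z_{\alpha/2} s_{\bar{x}_1 - \bar{x}_2}$ = 1.2 ± 1.96(0.283)
= 1.2 ± 0.5547
0.6453 ≤ $\mu_1 - \mu_2$ ≤ 1.7547

We state with 95% confidence that the difference between the mean
numbers of eggs laid per month by hens fed the two types of feed lies
between 0.6453 eggs and 1.7547 eggs.
Since the interval contains only positive values, we conclude that hens
which ate feed type 1 are, on average, more productive than hens that
ate feed type 2.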

Interval estimation of $\mu_1 - \mu_2$ - populations normal, $\sigma_1$ and $\sigma_2$ unknown, $n_1$ and $n_2$ small
When the sample sizes are small, the population standard deviations
are unknown, and the population distributions are normal, we use the
t-distribution to construct a confidence interval for $\mu_1 - \mu_2$. Moreover, to
use a t-distribution we have to assume that the two variances
(standard deviations) are equal. In short, to use a t-distribution for
constructing a confidence interval for $\mu_1 - \mu_2$, we assume the following:
1. The population standard deviations $\sigma_1$ and $\sigma_2$ are not known.
2. The sample sizes are small ($n_1 < 30$ and $n_2 < 30$).
3. The populations are assumed to be approximately normally
distributed.
4. The two (unknown) population variances are equal, $\sigma_1^2 = \sigma_2^2 = \sigma^2$.
Given these assumptions, the sampling distribution of $\bar{x}_1 - \bar{x}_2$ is
normally distributed regardless of the sample sizes. Because of the equal
variances assumption, the standard error of the difference between means is written as

$\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma^2}{n_1} + \frac{\sigma^2}{n_2}} = \sqrt{\sigma^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}$

If the variance of the populations, $\sigma^2$, is known, $\sigma_{\bar{x}_1 - \bar{x}_2}$
can be used to develop the interval estimate of $\mu_1 - \mu_2$.
However, in most cases, $\sigma^2$ is unknown; thus the two sample variances
must be used to develop the estimate of $\sigma^2$. Since $\sigma_{\bar{x}_1 - \bar{x}_2}$
is based on the assumption that $\sigma_1^2 = \sigma_2^2 = \sigma^2$, we do not
need separate estimates of $\sigma_1^2$ and $\sigma_2^2$. In fact, we can combine the
data from the two samples to provide the best single estimate of $\sigma^2$.
The process of combining the results of two independent simple random
samples to provide one estimate of $\sigma^2$ is referred to as pooling. The
pooled estimator of variance, denoted by $s_p^2$, is the weighted
average of the two sample variances, $s_1^2$ and $s_2^2$, with the degrees of
freedom associated with each sample being used as the weights. The
formula for the pooled estimator of $\sigma^2$ is:

$s_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}$

Where:
$s_p^2$ = pooled estimate of the variance
$n_1$ = sample size drawn from population 1
$n_2$ = sample size drawn from population 2
$s_1^2$ = sample variance of the sample drawn from population 1
$s_2^2$ = sample variance of the sample drawn from population 2
n1+n2-2 = pooled degrees of freedom
Based on the assumption that the population standard deviations are
equal, the standard error of the difference between means is
estimated by the sample standard error of the difference between two
sample means, $s_{\bar{x}_1 - \bar{x}_2}$, according to the following equation:

$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}$

The confidence interval for $\mu_1 - \mu_2$ when the common standard
deviation is not known is based on the t-distribution, and is
given by:

$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2, \nu} s_{\bar{x}_1 - \bar{x}_2}$

Where:
ν = pooled degrees of freedom (n1 + n2 – 2)
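The pooling procedure can be sketched as three small functions. This is an illustration under stated assumptions: the function names are hypothetical, and the t value is supplied from a printed table because the Python standard library has no t quantile function.

```python
from math import sqrt


def pooled_variance(s1_sq, n1, s2_sq, n2):
    """Weighted average of two sample variances, weights n1 - 1 and n2 - 1."""
    return ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)


def pooled_standard_error(s1_sq, n1, s2_sq, n2):
    """Sample standard error of the difference between two sample means."""
    sp_sq = pooled_variance(s1_sq, n1, s2_sq, n2)
    return sqrt(sp_sq * (1 / n1 + 1 / n2))


def pooled_t_interval(mean_diff, s1_sq, n1, s2_sq, n2, t_value):
    """Interval for mu1 - mu2; t_value is read from a table for nu = n1 + n2 - 2."""
    margin = t_value * pooled_standard_error(s1_sq, n1, s2_sq, n2)
    return mean_diff - margin, mean_diff + margin


# Drill-tip figures from this section: variances 41 and 36, sample sizes 20 and 15.
sp_sq = pooled_variance(41, 20, 36, 15)
se = pooled_standard_error(41, 20, 36, 15)
print(round(sp_sq, 2), round(se, 2), "nu =", 20 + 15 - 2)
```

`pooled_t_interval` then only needs the difference of the two sample means and the table value of t for the pooled degrees of freedom.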
Examples:
1. Two manufacturing companies produce drill tips that are used
to cut holes in steel sheets. A customer wishing to know which drill
tips have the longer life purchases independent samples of $n_1 = 20$
drill tips from company 1 and $n_2 = 15$ drill tips from company 2. The
mean lives of the drill tips are minutes and minutes. The
population variances are unknown but assumed to be equal. The
sample variances are $s_1^2 = 41$ and $s_2^2 = 36$. Construct a 95% confidence
interval for $\mu_1 - \mu_2$, assuming that the two populations are normally
distributed.
Solution:
Company One Company Two
drill tips drill tips C = 0.95
minutes minutes
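$s_1^2$ = 41, $s_2^2$ = 36
i. Calculate the sample standard error of the difference
between two means and the pooled degrees of freedom
$s_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2} = \frac{19(41) + 14(36)}{33} = 38.88$
$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)} = \sqrt{38.88 \left( \frac{1}{20} + \frac{1}{15} \right)}$
= 2.13
ν = n1 +n2 -2
= 20 + 15 -2 = 33
ii. Compute $\alpha/2$ and look up $t_{\alpha/2, \nu}$
α = 1-C = 1- 0.95 = 0.05
α/2 = 0.05/2 = 0.025
$t_{\alpha/2, \nu} = t_{0.025, 33}$ = 2.04
iii. Construct the confidence interval
$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2, \nu} s_{\bar{x}_1 - \bar{x}_2} = (\bar{x}_1 - \bar{x}_2) \pm 2.04(2.13) = (\bar{x}_1 - \bar{x}_2) \pm 4.34$
-10.34 ≤ $\mu_1 - \mu_2$ ≤ -1.66
The 95% confidence interval is (-10.34 to –1.66). This interval contains
only negative values, indicating that the drill tips made by company 1
do not last as long, on average, as those made by company 2.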

2. Five-year-old children were being studied to determine whether
children whose parents are college graduates watched more or less
TV than children whose parents are not college graduates.
Independent random samples of 21 children were selected from each
population. The sample means and variances were
The population variances are
assumed to be equal and the populations are assumed to be normal.
Calculate the 95% confidence interval for the difference between the
two population means.
Solution:
College graduates’ children Non-college graduates’ children
children children C = 0.95
hours hours
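$s_1^2$ = 16, $s_2^2$ = 14
i. Calculate the sample standard error of the difference
between two means and the pooled degrees of freedom
$s_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2} = \frac{20(16) + 20(14)}{40} = 15$
$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)} = \sqrt{15 \left( \frac{1}{21} + \frac{1}{21} \right)}$
= 1.195 ≈ 1.2
ν = n1 +n2 -2
= 21 + 21 -2 = 40
ii. Compute $\alpha/2$ and look up $t_{\alpha/2, \nu}$
α = 1-C = 1- 0.95 = 0.05
α/2 = 0.05/2 = 0.025
$t_{\alpha/2, \nu} = t_{0.025, 40}$ = 2.021
iii. Construct the confidence interval
$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2, \nu} s_{\bar{x}_1 - \bar{x}_2} = (\bar{x}_1 - \bar{x}_2) \pm 2.021(1.195) = (\bar{x}_1 - \bar{x}_2) \pm 2.42$
-6.42 ≤ $\mu_1 - \mu_2$ ≤ -1.58
We state with 95% confidence that the mean difference between the
two populations lies between –6.42 and –1.58.
Therefore, children whose parents are college graduates
watched less TV than children whose parents are not college
graduates.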
To use the t-distribution to construct a confidence interval as above, we
assume that the populations are normal, the population standard
deviations are equal and the sample sizes are less than 30. However,
$\sigma_1^2$ and $\sigma_2^2$ may not be equal.
In such cases the estimated variance of the difference
between two sample means is calculated as:

$s_{\bar{x}_1 - \bar{x}_2}^2 = \frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}$

However, the degrees of freedom is calculated as:
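A minimal sketch of the degrees-of-freedom calculation for unequal variances is given below. It assumes the Welch-Satterthwaite approximation, which is the standard choice in this situation. Whether this is the exact form intended by the chapter is an assumption, and the function name is hypothetical.

```python
def welch_degrees_of_freedom(s1_sq, n1, s2_sq, n2):
    """Welch-Satterthwaite approximate degrees of freedom (assumed formula)."""
    v1 = s1_sq / n1   # estimated variance of the first sample mean
    v2 = s2_sq / n2   # estimated variance of the second sample mean
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))


# With equal variances and equal sample sizes the result matches the
# pooled degrees of freedom n1 + n2 - 2.
equal_case = welch_degrees_of_freedom(15, 21, 15, 21)
unequal_case = welch_degrees_of_freedom(16, 21, 14, 21)
print(round(equal_case, 2), round(unequal_case, 2))
```

The result is generally not a whole number. It is commonly rounded down before the t table is consulted, which gives a slightly wider and therefore more conservative interval.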

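Confidence interval for the difference between two population proportions
We know that the unbiased estimator of the difference between the
proportions of two populations, $P_1 - P_2$, is the difference between two
sample proportions, $\hat{p}_1 - \hat{p}_2$, where each sample is a random sample
taken from the respective target population. Moreover, based on the
CLT, if $n_1 P_1$, $n_1 Q_1$, $n_2 P_2$ and $n_2 Q_2$ are greater than 5, the sampling
distribution of $\hat{p}_1 - \hat{p}_2$ is approximately normal with mean $P_1 - P_2$ and
standard error $\sigma_{\hat{p}_1 - \hat{p}_2} = \sqrt{\frac{P_1 Q_1}{n_1} + \frac{P_2 Q_2}{n_2}}$, so that

$Z = \frac{(\hat{p}_1 - \hat{p}_2) - (P_1 - P_2)}{\sqrt{\frac{P_1 Q_1}{n_1} + \frac{P_2 Q_2}{n_2}}}$

However, here $P_1$ and $P_2$ are unknown, and we want to estimate them
by $\hat{p}_1$ and $\hat{p}_2$ respectively, and hence Z becomes:

$Z = \frac{(\hat{p}_1 - \hat{p}_2) - (P_1 - P_2)}{\sqrt{\frac{\hat{p}_1 \hat{q}_1}{n_1} + \frac{\hat{p}_2 \hat{q}_2}{n_2}}}$. That is, $\sqrt{\frac{P_1 Q_1}{n_1} + \frac{P_2 Q_2}{n_2}}$ is substituted by $\sqrt{\frac{\hat{p}_1 \hat{q}_1}{n_1} + \frac{\hat{p}_2 \hat{q}_2}{n_2}}$.

Solving for $P_1 - P_2$ results in:

$P_1 - P_2 = (\hat{p}_1 - \hat{p}_2) - Z \sqrt{\frac{\hat{p}_1 \hat{q}_1}{n_1} + \frac{\hat{p}_2 \hat{q}_2}{n_2}}$, and since Z can
assume both positive and negative values, it becomes:

$P_1 - P_2 = (\hat{p}_1 - \hat{p}_2) \pm Z \sqrt{\frac{\hat{p}_1 \hat{q}_1}{n_1} + \frac{\hat{p}_2 \hat{q}_2}{n_2}}$

Since Z represents the confidence level we write it as

$(\hat{p}_1 - \hat{p}_2) \pm Z_{\alpha/2} \sqrt{\frac{\hat{p}_1 \hat{q}_1}{n_1} + \frac{\hat{p}_2 \hat{q}_2}{n_2}}$

Where:
$\hat{p}_1$ = the sample proportion of success in the first
sample
$\hat{p}_2$ = the sample proportion of success in the second
sample
$\hat{q}_1$ = 1 - $\hat{p}_1$; $\hat{q}_2$ = 1 - $\hat{p}_2$
$n_1$ = sample size drawn from the first population
$n_2$ = sample size drawn from the second population
α=1-C
This formula holds true provided that $n_1 \hat{p}_1$, $n_1 \hat{q}_1$, $n_2 \hat{p}_2$ and $n_2 \hat{q}_2$ are all greater than 5.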
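The formula above can be sketched as follows. The function name is a hypothetical helper, and the Z value comes from `statistics.NormalDist` and not from a table. The example figures are those of the late-night talk show problem in this section.

```python
from math import sqrt
from statistics import NormalDist


def diff_proportions_confidence_interval(x1, n1, x2, n2, confidence):
    """Large-sample interval for P1 - P2 from two independent samples."""
    p1, p2 = x1 / n1, x2 / n2
    std_error = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * std_error
    diff = p1 - p2
    return diff - margin, diff + margin


# 175 of 400 viewers with the regular host, 185 of 500 with a guest host.
lower, upper = diff_proportions_confidence_interval(175, 400, 185, 500, 0.95)
print(f"{lower:.3f} <= P1 - P2 <= {upper:.3f}")
```

Because the code does not round the standard error before multiplying by Z, its limits can differ from a hand calculation in the third or fourth decimal place. The conclusion is unaffected as long as both limits stay on the same side of zero.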
Examples:
1. A TV executive is interested in determining if the proportion of
people who watch a late-night talk show is higher with the regular
host or a guest host. In a random sample of 400 people, 175 watch
the show when the regular host is on. In an independent random
sample of 500 people, 185 watch the show when a guest host is on.
Calculate a 95% confidence interval for $p_1 - p_2$. What do you conclude?
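Solution:
Regular host: $n_1 = 400$, $X_1 = 175$, $\bar{p}_1 = 175/400 = 0.4375$, $\bar{q}_1 = 0.5625$
Guest host: $n_2 = 500$, $X_2 = 185$, $\bar{p}_2 = 185/500 = 0.37$, $\bar{q}_2 = 0.63$
C = 0.95
i. Calculate the sample standard error of the difference between
two proportions
$$s_{\bar{p}_1-\bar{p}_2} = \sqrt{\frac{(0.4375)(0.5625)}{400}+\frac{(0.37)(0.63)}{500}} \approx 0.033$$
ii. Compute $\alpha/2$
α = 1 - C = 1 - 0.95 = 0.05
α/2 = 0.05/2 = 0.025
iii. Look up $Z_{\alpha/2} = Z_{0.025} = 1.96$
iv. Construct the confidence interval
$$(\bar{p}_1-\bar{p}_2) \pm Z_{\alpha/2}\,s_{\bar{p}_1-\bar{p}_2} = (0.4375-0.37) \pm 1.96(0.033)$$
= 0.0675 ± 0.065
$0.0025 \le p_1-p_2 \le 0.1325$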
We state with 95% confidence that the true difference between $p_1$ and $p_2$
is between 0.0025 and 0.1325. Since this interval contains only
positive values, it is reasonable to say that the proportion of people who
watch the show when the regular host is on is greater than when the guest
host is on.
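The interval for the difference between two proportions can be reproduced with a few lines of Python. This is a sketch under stated assumptions rather than part of the original notes: the function name `two_proportion_ci` is illustrative, and the critical value is computed with the standard library's `statistics.NormalDist` instead of being read from a Z table.

```python
import math
from statistics import NormalDist

def two_proportion_ci(x1, n1, x2, n2, confidence):
    """Confidence interval for p1 - p2 from two independent samples."""
    p1, p2 = x1 / n1, x2 / n2
    # Sample standard error of the difference between two proportions
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    # Two-sided critical value Z_{alpha/2}
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = p1 - p2
    return diff - z * se, diff + z * se

# Regular host: 175 of 400; guest host: 185 of 500; C = 0.95
low, high = two_proportion_ci(175, 400, 185, 500, 0.95)
print(round(low, 4), round(high, 4))
```

The unrounded limits come out near 0.003 and 0.132, slightly narrower than the hand calculation, which rounds the standard error up to 0.033. The conclusion is unchanged because both limits are positive.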
2. A city planner claims that home owners tend to have closer ties
to their community than do renters. Thus, home owners are more
willing to pay for good schools and recreational facilities than are
renters. In a random sample of 120 home owners, 51 stated that the
local tax rates were too high and 69 stated that tax rates were "about
right". In an independent random sample of 200 renters, 70 stated that
the tax rates were too high and 130 thought they were "about right".
a. Find a 99% confidence interval for the difference in proportions
who think that taxes are too high.
b. Do the data support the city planner's claim?
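Solution:
Home owners: $n_1 = 120$, $X_1 = 51$, $\bar{p}_1 = 51/120 = 0.425$, $\bar{q}_1 = 0.575$
Renters: $n_2 = 200$, $X_2 = 70$, $\bar{p}_2 = 70/200 = 0.35$, $\bar{q}_2 = 0.65$
C = 0.99
i. Calculate the sample standard error of the difference between
two proportions
$$s_{\bar{p}_1-\bar{p}_2} = \sqrt{\frac{(0.425)(0.575)}{120}+\frac{(0.35)(0.65)}{200}} \approx 0.056$$
ii. Compute $\alpha/2$
α = 1 - C = 1 - 0.99 = 0.01
α/2 = 0.01/2 = 0.005
iii. Look up $Z_{\alpha/2} = Z_{0.005} = 2.575$
iv. Construct the confidence interval
$$(\bar{p}_1-\bar{p}_2) \pm Z_{\alpha/2}\,s_{\bar{p}_1-\bar{p}_2} = (0.425-0.35) \pm 2.575(0.056)$$
= 0.075 ± 0.144
$-0.069 \le p_1-p_2 \le 0.219$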
We state with 99% confidence that the difference between the
proportions of home owners and renters who said that the tax rates are
too high lies between -0.069 and 0.219.
Since the confidence interval contains negative,
zero, and positive values, we cannot state with certainty that home
owners are more willing to pay for good schools and recreational
facilities than are renters. Hence, the data do not necessarily
support the city planner's claim.
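Both examples are decided by checking whether zero lies inside the interval. The small Python sketch below makes that reading explicit; the function name `interpret_difference_interval` and the wording of its messages are illustrative assumptions, and the limits passed in are the ones quoted in the two worked examples.

```python
def interpret_difference_interval(low, high):
    """Read the sign of a confidence interval for p1 - p2."""
    if low > 0:
        return "p1 appears larger than p2"
    if high < 0:
        return "p1 appears smaller than p2"
    # Zero is a plausible value for the difference
    return "interval contains zero: no clear difference"

print(interpret_difference_interval(0.0025, 0.1325))  # talk show example
print(interpret_difference_interval(-0.069, 0.219))   # tax rate example
```

The first call reports that $p_1$ appears larger, and the second reports that the interval contains zero, which matches the two conclusions drawn in the text.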

Determination of Sample Size

The reason for taking a sample from a population is that it would be
too costly to gather data for the whole population. But collecting
sample data also costs money; and the larger the sample, the higher
the cost. To hold cost down, we want to use as small a sample as
possible. On the other hand, we want a sample to be large enough to
provide “good” approximation/estimates of population parameters.
Consequently, the question is “How large should the sample be?”

The answer depends on three factors:
1) How precise (narrow) do we want a confidence interval to be?
2) How confident do we want to be that the interval estimate is
correct?
3) How variable is the population being sampled?

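Sample size for estimating population mean, $\mu$

The confidence interval for $\mu$ is $\bar{x} \pm Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$.

From the above expression, $Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$ is called the error of estimation (e), that
is, the difference between $\bar{x}$ and $\mu$ which results from the sampling
process. So $e = Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$.

Squaring both sides results in $e^2 = \frac{Z_{\alpha/2}^2\sigma^2}{n}$. Solving for n results in
$$n = \frac{Z_{\alpha/2}^2\sigma^2}{e^2}$$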

Examples:
1. A gasoline service station shows a standard deviation of Birr
6.25 for the charges made by its credit card customers. Assume that
the station's management would like to estimate the population mean
gasoline bill for its credit card customers to within ± Birr 1.00.
For a 95% confidence level, how large a sample would be necessary?
Solution:
e = Birr 1.00, σ = Birr 6.25, C = 0.95, $Z_{\alpha/2} = Z_{0.025} = 1.96$
$$n = \frac{Z_{\alpha/2}^2\sigma^2}{e^2} = \frac{(1.96)^2(6.25)^2}{(1.00)^2} = 150.06$$
Rounding up to the next larger integer gives n = 151.

2. The National Travel and Tour Organization (NTO) would like to
estimate the mean amount of money spent by a tourist to within
Birr 100 with 95% confidence. If the amount of money spent by a
tourist is considered to be normally distributed with a standard
deviation of Birr 200, what sample size would be necessary for the NTO
to meet its objective in estimating this mean amount?
Solution:
e = Birr 100, σ = Birr 200, C = 0.95, $Z_{\alpha/2} = Z_{0.025} = 1.96$
$$n = \frac{Z_{\alpha/2}^2\sigma^2}{e^2} = \frac{(1.96)^2(200)^2}{(100)^2} = 15.37 \approx 16$$
Note: If a procedure for determining sample size produces a non-integer value, always
round up to the next larger integer.
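The sample size formula for a mean is easy to wrap in a helper. The following Python sketch is an illustration, not part of the original notes: the function name `sample_size_mean` is an assumption, and the critical value comes from `statistics.NormalDist` rather than from a printed Z table.

```python
import math
from statistics import NormalDist

def sample_size_mean(sigma, e, confidence):
    """Smallest n for which Z * sigma / sqrt(n) does not exceed e."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # n = Z^2 * sigma^2 / e^2, rounded up to the next larger integer
    return math.ceil((z * sigma / e) ** 2)

print(sample_size_mean(sigma=6.25, e=1.00, confidence=0.95))  # gasoline station: 151
print(sample_size_mean(sigma=200, e=100, confidence=0.95))    # NTO: 16
```

`math.ceil` implements the rule of always rounding a non-integer result up, since rounding down would leave the error slightly larger than the target.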

If the population standard deviation, $\sigma$, is unknown we have to make an
educated guess or take a pilot sample and estimate it.
- A rough approximation is $\sigma \approx \text{range}/4$, because about 95.4% of the
population falls within $\mu \pm 2\sigma$.

Relationship between the error term and sample size

Reducing the error term of an interval estimate to 1/a of its
original amount, while holding the confidence level constant, requires
a sample size of $a^2$ times the original sample size.
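This relationship follows directly from the sample size formula for the mean. A short derivation, where $n'$ is a symbol introduced here for the new sample size:

```latex
n' = \frac{Z_{\alpha/2}^2\,\sigma^2}{(e/a)^2}
   = a^2\,\frac{Z_{\alpha/2}^2\,\sigma^2}{e^2}
   = a^2\,n
```

For instance, halving the error (a = 2) requires four times as many observations.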

Sample size for estimating population proportion, p

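The confidence interval for p is $\bar{p} \pm Z_{\alpha/2}\sqrt{\frac{\bar{p}\bar{q}}{n}}$. The expression
$Z_{\alpha/2}\sqrt{\frac{\bar{p}\bar{q}}{n}}$ is called the error term (e). That is,
$e = Z_{\alpha/2}\sqrt{\frac{\bar{p}\bar{q}}{n}}$. Squaring both sides gives
$$e^2 = \frac{Z_{\alpha/2}^2\,\bar{p}\bar{q}}{n}$$
and solving for n gives
$$n = \frac{Z_{\alpha/2}^2\,\bar{p}\bar{q}}{e^2}$$
Since we are trying to determine n, no sample has been taken yet and we cannot have $\bar{p}$ and $\bar{q}$. Instead,
we should use planning values p and q, so it becomes
$$n = \frac{Z_{\alpha/2}^2\,pq}{e^2}$$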

Examples:
1. Suppose that a production facility purchases a particular
component part in large lots from a supplier. The production
manager wants to estimate the proportion of defective parts received
from this supplier. She believes that the proportion of defects is no
more than 0.2 and wants to be within 0.02 of the true proportion of
defects with a 90% level of confidence. How large a sample should
she take?
Solution:
e = 0.02, p = 0.2, q = 0.8, C = 0.90, $Z_{\alpha/2} = Z_{0.05} = 1.645$
$$n = \frac{Z_{\alpha/2}^2\,pq}{e^2} = \frac{(1.645)^2(0.2)(0.8)}{(0.02)^2} = 1082.41 \approx 1083$$

2. What is the largest sample size that would be needed in
estimating a population proportion to within ± 0.02, with a
confidence coefficient of 0.95?
Solution:
e = 0.02, C = 0.95, $Z_{\alpha/2} = Z_{0.025} = 1.96$
The largest sample size would be obtained when p = 0.5. So,
$$n = \frac{(1.96)^2(0.5)(0.5)}{(0.02)^2} = 2401$$
If p is unknown and there is no possibility of estimating it, use 0.5 as
the value of p, because the product pq is largest at p = 0.5 and this
value therefore generates the greatest possible sample size.
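The two proportion examples, and the claim that p = 0.5 gives the largest sample, can be checked with a short Python sketch. It is not part of the original notes: the function name `sample_size_proportion` and its default `p=0.5` are illustrative assumptions, and the critical value is computed with `statistics.NormalDist`.

```python
import math
from statistics import NormalDist

def sample_size_proportion(e, confidence, p=0.5):
    """n = Z^2 * p * q / e^2, rounded up; p defaults to the worst case 0.5."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z ** 2 * p * (1 - p) / e ** 2)

print(sample_size_proportion(0.02, 0.90, p=0.2))  # defective parts: 1083
print(sample_size_proportion(0.02, 0.95))         # largest case: 2401

# The product p*q, and hence n, peaks at p = 0.5
for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(p, sample_size_proportion(0.02, 0.95, p))
```

Using 0.5 as the default is a conservative design choice: it can only overstate the required sample size, never understate it.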

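Determining Sample Size When Estimating $\mu_1 - \mu_2$

When taking two random samples and using the difference in sample
means to estimate the difference in population means, a researcher
should have an idea of how large the sample sizes need to be. Solving
for n from the formula
$$Z = \frac{(\bar{x}_1-\bar{x}_2)-(\mu_1-\mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1}+\frac{\sigma_2^2}{n_2}}}$$
does not look promising,
because the equation has nine variables, including two different values
of n. However, making some assumptions can generate a workable
sample size formula.
1. Variances of the two populations are the same: $\sigma_1^2 = \sigma_2^2 = \sigma^2$
2. The sample size for each sample is the same: $n_1 = n_2 = n$
The difference between $\bar{x}_1-\bar{x}_2$ and $\mu_1-\mu_2$ is the error of estimation, or
$e = (\bar{x}_1-\bar{x}_2)-(\mu_1-\mu_2)$.

Incorporating these assumptions into the z-formula yields
$$Z = \frac{e}{\sqrt{\frac{\sigma^2}{n}+\frac{\sigma^2}{n}}} = \frac{e}{\sqrt{\frac{2\sigma^2}{n}}} = \frac{e}{\sigma\sqrt{2/n}}$$

Solving for n produces the sample size:
$$n = \frac{2Z_{\alpha/2}^2\,\sigma^2}{e^2}$$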

The above formula suggests that the necessary sample sizes for
comparing two sample means are each twice as large as the sample
size required for estimating a single mean with the same σ, error,
and confidence level. The larger the sample, the more it costs, so
sample size formulas are effective aids in ensuring that a research
project's goals are met while the cost of sampling is kept to a minimum.

Examples:
1. A college admissions officer wants to estimate the difference in
the average GMAT scores of men and women. She plans to take a
random sample of men and women who have taken the GMAT at the
same time. She wants to be within 10 points of the true difference in
the mean scores of men and women and 95% confident of her results.
Past GMAT test results indicate that the standard deviation of GMAT
test scores is about 105 points. How large should the sample sizes be?
Solution:
e = 10 points, σ = 105 points, C = 0.95, $Z_{\alpha/2} = Z_{0.025} = 1.96$
n = ?
$$n = \frac{2Z_{\alpha/2}^2\,\sigma^2}{e^2} = \frac{2(1.96)^2(105)^2}{(10)^2} = 847.07 \approx 848 \text{ in each sample}$$

2. A researcher wants to estimate the difference between the
average price of a 21-inch black and white TV and the average price
of a 21-inch color TV set. He believes that the standard deviation of
the price of a 21-inch TV set is about Birr 100. He wants to be 99%
confident of his results and within Birr 20 of the true difference. How
large a sample should he take for each type of television set?
Solution:
e = Birr 20, σ = Birr 100, C = 0.99, $Z_{\alpha/2} = Z_{0.005} = 2.575$
n = ?
$$n = \frac{2Z_{\alpha/2}^2\,\sigma^2}{e^2} = \frac{2(2.575)^2(100)^2}{(20)^2} = 331.53 \approx 332 \text{ of each type}$$
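The equal-variance, equal-size formula for two means can be sketched in Python as follows. This is an illustration rather than part of the original notes: the function name `sample_size_two_means` is an assumption, and the critical value is computed with `statistics.NormalDist` instead of being read from a Z table.

```python
import math
from statistics import NormalDist

def sample_size_two_means(sigma, e, confidence):
    """Per-sample n = 2 * Z^2 * sigma^2 / e^2, rounded up.

    Assumes a common standard deviation sigma and equal sample sizes.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(2 * (z * sigma / e) ** 2)

print(sample_size_two_means(sigma=105, e=10, confidence=0.95))  # GMAT: 848 per group
print(sample_size_two_means(sigma=100, e=20, confidence=0.99))  # TV sets: 332 per type
```

The factor of 2 is the only difference from the single-mean formula, which is why each sample must be twice as large as the sample needed to estimate one mean to the same precision.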
