Estimation

The document outlines the principles of statistical estimation, focusing on the distinction between point and interval estimates, and the properties of t-distribution. It emphasizes the importance of using sample statistics to estimate population parameters and the construction of confidence intervals to provide a range of values for these estimates. Key concepts include the characteristics of good estimators, the significance of confidence levels, and the application of statistical methods in inferential statistics.

Uploaded by

robelalaye53
Copyright
© All Rights Reserved

Statistical Estimation

Learning objectives:

• To describe the relationship between point estimates and interval estimates
• To calculate and interpret confidence intervals
• To describe the basic properties of the t-distribution

07/13/2025 2
Principles of Statistical Estimation

• Descriptive statistics help investigators describe and summarize data.
• Probability and sampling distribution concepts are needed to evaluate data using statistical methods.
• Without probability and sampling distribution theory, we could not make statements about populations without studying everyone in the population, clearly an undesirable and often impossible task.

Descriptive & Inferential Statistics

Descriptive statistics

• Consists of the collection, organization, classification, summarization, and presentation of data obtained from the sample.
• Used to describe the characteristics of the sample

Inferential statistics
• Consists of generalizing from samples to populations, performing estimations, testing hypotheses, and determining relationships among variables
• Used when we want to draw a conclusion from the data obtained from the sample
• Used to infer, estimate, or approximate the characteristics of the target population

• Inferential statistics are the statistical methods used to draw conclusions from a sample and make inferences to the entire population.
• The two primary methods for making inferences are
– Estimation and
– Hypothesis testing.
Purpose
– Make decisions about population characteristics

Statistical estimation
[Figure: a random sample is drawn from the population, with every member of the population having the same chance of being selected. Statistics computed from the random sample are used to estimate the population parameters.]

Parameter and statistic

Parameter = Descriptive measure of a population
Statistic = Descriptive measure of a sample
= Estimates the population parameter

• Statistical Estimation

• The goal of conducting surveys is to obtain information about a particular population.
• Once the sample has been selected and the information collected, there still remains the task of linking the information gathered from the sample back to the overall population.

• Estimation…

• Estimation is the process of determining a likely value for a variable in the survey population, based on information collected from the sample.
• Estimation is the use of sample statistics to estimate population parameters.

For example, a sample survey could be used to produce
any of the following statistics:
– estimates for the proportion of smokers among all people
aged 15 to 24 in the population;
– the mean level of a certain enzyme among healthy men.
• The objective of estimation is to determine the approximate value of a population parameter on the basis of a sample statistic.

Point and Interval Estimates

Estimate
– Point estimate: sample mean, sample proportion
– Interval estimate: confidence interval for mean, confidence interval for proportion

Point estimate is always within the interval estimate

Point estimate
• A single numerical value used to estimate the
corresponding population parameter
• A single value quoted as an estimate of a population
parameter is of little use unless it is accompanied by some
indication of its precision.
• The following slides describe various ways of enhancing
the value of point estimates

• Properties of good estimators
1. Unbiasedness: an unbiased estimator of a population parameter is an estimator whose expected value (the mean of the estimates obtained from repeated samples) is equal to that parameter.
2. Efficiency: among unbiased estimators, we want the one with the smallest variance.
3. Consistency: as the sample size increases, the variation of the estimator around the true population value decreases.

• An unbiased estimator is said to be consistent if
the difference between the estimator and the
parameter grows smaller as the sample size grows
larger
• If there are two unbiased estimators of a
parameter, the one whose variance is smaller is
said to be relatively efficient.
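
The three properties above can be illustrated with a small simulation. The sketch below is only an illustration under assumed conditions: the population is taken to be normal with a hypothetical mean of 50 and standard deviation of 10, and the sample mean is compared with the sample median as two estimators of the population mean. None of these numbers come from the slides.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

TRUE_MEAN = 50.0   # hypothetical population mean (assumption)
TRUE_SD = 10.0     # hypothetical population standard deviation (assumption)
N_SAMPLES = 2000   # number of repeated samples drawn for each sample size


def simulate(n):
    """Draw N_SAMPLES samples of size n and return the two lists of estimates."""
    means, medians = [], []
    for _ in range(N_SAMPLES):
        sample = [random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(n)]
        means.append(statistics.mean(sample))
        medians.append(statistics.median(sample))
    return means, medians


# results[n] = (estimates from the sample mean, estimates from the sample median)
results = {n: simulate(n) for n in (10, 100)}

for n, (means, medians) in results.items():
    print(f"n = {n}")
    print(f"  sample mean:   average = {statistics.mean(means):.2f}, "
          f"variance = {statistics.variance(means):.3f}")
    print(f"  sample median: average = {statistics.mean(medians):.2f}, "
          f"variance = {statistics.variance(medians):.3f}")
```

In a run like this, both estimators should average close to the assumed population mean (unbiasedness), the sample mean should show the smaller variance of the two (relative efficiency), and both variances should shrink as n grows from 10 to 100 (consistency).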

Point estimate

• How good is a point estimate ?


• There is no way of knowing how close the
point estimate is to the population mean.
• Therefore we need _________?

Interval estimation
• Usually, we only have a sample and don’t know the
entire population.
Example: Point estimate of 0.30 for population proportion
• It is not reasonable to assume that the population
proportion is exactly 0.30
• The probability of getting a sample statistic value that is
exactly equal to the corresponding population parameter
is usually quite small.
• Interval estimate …
• It may be reasonable to assume that 0.30 is close to the
population proportion
• We use a point estimate to obtain an interval estimate
• An interval or range of values is used to estimate the
parameter
• This estimate may or may not contain the value of the
parameter being estimated
• Interval estimate…
• Takes into consideration variation in sample statistics from sample to sample
• Provides a range of values
– Based on observations from the sample
• Gives information about closeness to the unknown population parameter
• Stated in terms of probability
– Never 100% sure

• Interval estimate
• To be more confident that the interval contains the true population mean, one must make the interval wider
• A confidence interval is a specific interval estimate of a parameter determined by using data from a sample and a specific confidence level for the estimate
• A narrow CI reflects a large sample size, low variability, or both.

• Confidence level of an interval estimate of a
parameter is the probability that the interval
estimate will contain the parameter
• Scientists usually accept a 5% chance
that the range will not include the true
population value
• The range or interval is called 95%
confidence interval

A wide interval provides little information.
For example, suppose we estimate with 95% confidence that an accountant’s average starting salary is between $15,000 and $100,000.

Contrast with: a 95% confidence interval estimate of starting salaries between $42,000 and $45,000.

The second estimate is much narrower, providing accounting students more precise information about starting salaries.

The width of the confidence interval estimate is a function of the confidence level, the population standard deviation, and the sample size.

A larger confidence level produces a wider confidence interval.
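
How the width depends on the confidence level can be sketched numerically. The snippet below assumes the usual z-based interval for a mean with known standard deviation, whose width is 2 × z × σ/√n. The values σ = 20 and n = 100 are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def ci_width(confidence, sigma, n):
    """Width of a z-based confidence interval for a mean (sigma known)."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # reliability coefficient
    return 2 * z * sigma / sqrt(n)


sigma, n = 20.0, 100  # hypothetical values, not taken from the slides
for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: width = {ci_width(level, sigma, n):.2f}")
```

With the same σ and n, the printed widths grow as the confidence level rises, which is the trade-off described above.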
Confidence Level
• Confidence level
– The confidence with which the interval will contain the unknown population parameter
• A percentage (less than 100%)
– Example: 90%, 95%, 99%
• Also written (1 − α) = 0.95
• Definition: we are 100(1 − α)% [e.g. 95%] confident that the computed interval contains the unknown population parameter.

A confidence interval is usually represented with a “plus/minus” ( ± ) sign:
– the “plus” side gives the upper confidence limit (UCL)
– the “minus” side gives the lower confidence limit (LCL)

Four commonly used confidence levels

• Central limit theorem
• Approximately 95% of the sample means fall within 1.96 standard errors of the population mean if the sample size is 30 or more, or if the variable is normally distributed and the population standard deviation is known when n is less than 30 (Elementary Statistics, 3rd edition).

CI for a population mean (normally distributed)
a) Large sample size (standard deviation known)

Cont…
Assumptions
– Population standard deviation (σ) is known
– Population is normally distributed
– If the population is not normal, use a large sample size (CLT)
• There are 3 elements to a CI:
1. Point estimate
2. SE of the point estimate
3. Reliability coefficient

Example
• Data on systolic blood pressure from 199 patients give a mean value of 125.8 mmHg. Let us assume that the standard deviation for this patient population is known to be 20 mmHg. Construct a 95 percent confidence interval for the population mean.
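
One way to work this example is sketched below, combining the three elements of a CI as point estimate ± (reliability coefficient) × (standard error), with the standard error taken as σ/√n. The variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

n = 199          # number of patients
x_bar = 125.8    # sample mean systolic blood pressure (mmHg)
sigma = 20.0     # population standard deviation, assumed known (mmHg)

z = NormalDist().inv_cdf(0.975)   # reliability coefficient for 95%, about 1.96
se = sigma / sqrt(n)              # standard error of the mean
lower = x_bar - z * se
upper = x_bar + z * se

print(f"SE = {se:.2f} mmHg")
print(f"95% CI = ({lower:.2f}, {upper:.2f}) mmHg")
```

Under these inputs the interval comes out at roughly 123.0 to 128.6 mmHg.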

07/13/2025 38
07/13/2025 39
07/13/2025 40
• When constructing CIs, it has been assumed that the standard deviation of the underlying population, σ, is known

• What if σ is not known?

• In this case, the standard deviation of the population can be replaced by the standard deviation (S) of the sample if the sample size is large enough (n ≥ 30).
• With a large sample size, we assume a normal distribution (CLT).
• Exercise
• A sample of 35 patients was found to be, on average, 17.2 minutes late for appointments, with an SD of 8 minutes. What is the 90% CI for µ?
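
A possible way to check an answer to this exercise is sketched below. It assumes the large-sample approach just described, with the sample SD standing in for σ and a z-value as the reliability coefficient. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n = 35         # sample size (n >= 30, so the large-sample method is used)
x_bar = 17.2   # mean lateness in minutes
s = 8.0        # sample standard deviation in minutes, used in place of sigma

z = NormalDist().inv_cdf(0.95)   # 90% confidence leaves 5% in each tail
se = s / sqrt(n)                 # estimated standard error of the mean
lower = x_bar - z * se
upper = x_bar + z * se

print(f"z = {z:.3f}, SE = {se:.3f}")
print(f"90% CI = ({lower:.2f}, {upper:.2f}) minutes")
```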

b) Confidence interval for the mean
(n < 30 and population standard deviation unknown)

Student’s t Distribution
• The t test is sometimes called “Student's t test” after the person who first studied the distribution of means from small samples, in work published in 1908.
• Student was really a statistician named William Sealy Gosset who worked for the Guinness Brewery; he was forced to use the pseudonym Student because of company policy prohibiting employees from publishing their work

t-distribution…
• The t-distribution is used when the sample size is less than 30, the population standard deviation is unknown, and the variable is normally or approximately normally distributed
• In many situations, the population standard deviation is not known and the sample size is less than 30; in such situations, the standard deviation from the sample can be used in place of the population standard deviation

t-distribution…
The t-distribution differs from the standard normal distribution (SND) in the following ways:
1. The variance is greater than one
2. It is based on the concept of degrees of freedom, which is related to sample size
3. As the sample size increases, the t-distribution approaches the SND
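
Points 1 and 3 can be seen from the variance of the t-distribution, which for more than 2 degrees of freedom equals df/(df − 2). This is a standard result that the slides do not state. The sketch below tabulates it for a few illustrative sample sizes, taking df = n − 1 as in the one-sample case.

```python
def t_variance(df):
    """Variance of Student's t-distribution; defined only for df > 2."""
    if df <= 2:
        raise ValueError("the variance is finite only for df > 2")
    return df / (df - 2)


# Illustrative sample sizes; df = n - 1 for a one-sample interval
for n in (5, 10, 30, 100, 1000):
    df = n - 1
    print(f"n = {n:4d}, df = {df:3d}, variance = {t_variance(df):.4f}")
```

The variance is always above 1 and moves toward 1, the variance of the SND, as the sample size grows.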
Example
• In a study of preeclampsia, Kaminski and Rechberger found the mean systolic blood pressure of 10 healthy, nonpregnant women to be 119 with a standard deviation of 2.1.
A. What is the estimated standard error of the mean?
B. Construct the 99% confidence interval for the mean of the population from which the 10 subjects may be presumed to be a random sample.
C. What is the precision of the estimate?
D. What assumptions are necessary for the validity of the confidence interval you constructed?

Solution
C. Precision = 3.250 × 0.664 = 2.16
D. The population is normally distributed, and the 10 subjects represent a random sample from this population
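
Parts A to C can be reproduced with a few lines of arithmetic. The sketch below takes the reliability coefficient t = 3.250 (99% confidence, 9 degrees of freedom) directly from the solution above rather than computing it. The variable names are illustrative.

```python
from math import sqrt

n = 10         # number of subjects
x_bar = 119.0  # sample mean systolic blood pressure
s = 2.1        # sample standard deviation
t_crit = 3.250 # t-value for 99% confidence with n - 1 = 9 df, as quoted in the solution

se = s / sqrt(n)           # A. estimated standard error of the mean
precision = t_crit * se    # C. precision (margin of error)
lower = x_bar - precision  # B. lower confidence limit
upper = x_bar + precision  # B. upper confidence limit

print(f"A. SE = {se:.3f}")
print(f"B. 99% CI = ({lower:.2f}, {upper:.2f})")
print(f"C. Precision = {precision:.2f}")
```

With these inputs the standard error is about 0.664 and the interval runs from about 116.84 to 121.16.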
2. CI for the difference between two population means (normally distributed)

• From each of the populations an independent random sample is drawn and, from the data of each, the sample means are computed
• Population 1 has mean μ1 and standard deviation σ1
• Population 2 has mean μ2 and standard deviation σ2
• The interval provides information that is helpful in deciding whether or not it is likely that the two population means are equal

A. Known variances (2 independent samples)
• When the population variances are known and both populations are normal, the reliability coefficient is a z-value

Cont…
• When the constructed interval does not include zero, we say that the interval provides evidence that the two population means are not equal.
• When the interval includes zero, we say that the population means may be equal

Example
A. A sample of 12 individuals with Down’s syndrome yielded a mean of X1 = 4.5 mg/100 ml, and 15 normal individuals had a mean value of X2 = 3.4 mg/100 ml. If it is reasonable to assume that the two populations of values are normally distributed with variances equal to 1 and 1.5, respectively, find the 95% CI for μ1 − μ2.
• Point estimate of μ1 − μ2: X1 − X2 = 4.5 − 3.4 = 1.1
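
The rest of the calculation can be sketched as follows, assuming the standard known-variance formula (X1 − X2) ± z × √(σ1²/n1 + σ2²/n2). The function name `two_mean_ci` is hypothetical and introduced only for this illustration.

```python
from math import sqrt
from statistics import NormalDist


def two_mean_ci(x1, x2, var1, var2, n1, n2, confidence=0.95):
    """z-based CI for mu1 - mu2 from two independent samples with known variances."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = sqrt(var1 / n1 + var2 / n2)   # standard error of the difference
    diff = x1 - x2                     # point estimate of mu1 - mu2
    return diff - z * se, diff + z * se


# Down's syndrome example: variances 1 and 1.5, sample sizes 12 and 15
lower, upper = two_mean_ci(4.5, 3.4, 1.0, 1.5, 12, 15)
print(f"95% CI for mu1 - mu2 = ({lower:.2f}, {upper:.2f})")
```

Under these inputs the interval is approximately 0.26 to 1.94, which does not include zero.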

Interpretation
• We are 95% confident that the true difference is somewhere between 0.26 and 1.94, because in repeated sampling 95% of the intervals constructed in this manner would include the difference between the true means
• Since the interval does not include zero, we conclude that the two population means are not equal

B. Unknown variances (large sample)
• The central limit theorem applies when sampling is from a non-normal population
• Use the sample standard deviation s to estimate σ, and
• the reliability coefficient is a z-value

Exercise for learners
• If 50 nonsmokers have a mean life of 76 years with an SD of 8 years, and 65 smokers have a mean life of 68 years with an SD of 9 years,
• What is the point estimate of the difference of the means?
• Find a 95% CI.

Given
Nonsmokers: n1 = 50, X1 = 76 years, SD1 = 8 years
Smokers: n2 = 65, X2 = 68 years, SD2 = 9 years

X1 − X2 = 76 − 68 = 8 years

• SE = √(64/50 + 81/65) = 1.59

• 95% CI for μ1 − μ2 = 8 ± 1.96 (1.59) = (4.88 to 11.12 years).

• Since the interval doesn’t include zero, we conclude that the two population means are not equal.

3. CIs for a single population proportion
Many questions of interest to the health worker are related to the population proportion
Ex: what proportion of some population has a certain disease?
Assumptions
– The normal approximation can be used
– n × p ≥ 5 and n × (1 − p) ≥ 5

• Is based on the three elements of a CI:
– Point estimate
– SE of point estimate
– Confidence coefficient

CI for single proportion…
• A sample is drawn from a population of interest and the sample proportion is computed.
• This sample proportion is used as the point estimator of the population proportion
• The CI is obtained by
• Estimator ± (reliability coefficient) × (standard error of the estimator)

Example
• A random sample of 100 people shows that 25
are left-handed. Construct a 95% CI for the true
proportion of left-handers.

Interpretation
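
The left-handedness example can be worked and read as in the sketch below. It assumes the usual normal-approximation interval, with the standard error of the sample proportion taken as √(p(1 − p)/n). The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n = 100          # sample size
successes = 25   # number of left-handed people in the sample
p_hat = successes / n

# Check the rule of thumb for the normal approximation
assert n * p_hat >= 5 and n * (1 - p_hat) >= 5

z = NormalDist().inv_cdf(0.975)       # reliability coefficient for 95%
se = sqrt(p_hat * (1 - p_hat) / n)    # standard error of the sample proportion
lower = p_hat - z * se
upper = p_hat + z * se

print(f"p = {p_hat:.2f}, SE = {se:.4f}")
print(f"95% CI = ({lower:.3f}, {upper:.3f})")
```

Under these inputs the interval is about 0.165 to 0.335, so one reading is that we are 95% confident the true proportion of left-handers lies between roughly 16.5% and 33.5%.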

CI for two Population Proportions

• The magnitude of the difference between two population proportions is of interest
• We may want to compare:
– Men and women, two age groups, or two socioeconomic groups with respect to the proportion possessing some characteristic of interest

CI for Two Population Proportions…

• SE of the difference = √[p1(1 − p1)/n1 + p2(1 − p2)/n2]

• The confidence interval for p1 – p2 is: (p1 − p2) ± (Zα/2) (SE)

• Example
• In a clinical trial for a new drug to treat hypertension,
N1 = 50 patients were randomly assigned to receive
the new drug, and N2 = 50 patients to receive a
placebo. 34 of the patients receiving the drug showed
improvement, while 15 of those receiving placebo
showed improvement.
• Compute a 95% CI estimate for the difference
between proportions improved.

• p1 = 34/50 = 0.68, p2 = 15/50 = 0.30
• The point estimate for the difference is:
= [0.68 − 0.30] = 0.38

• SE of the difference = √[(0.68)(0.32)/50 + (0.30)(0.70)/50] = 0.0925

• 95% CI
– Lower = (point estimate) − (Zα/2) (SE)
= 0.38 – (1.96)(0.0925) = 0.20
– Upper = (point estimate) + (Zα/2) (SE)
= 0.38 + (1.96)(0.0925) = 0.56
• 95% CI = (0.20, 0.56)
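
The same steps can be wrapped in a small reusable function. The sketch below assumes the unpooled standard error √(p1(1 − p1)/n1 + p2(1 − p2)/n2), which reproduces the SE of 0.0925 used above. The function name `two_proportion_ci` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_ci(x1, n1, x2, n2, confidence=0.95):
    """Normal-approximation CI for p1 - p2 from two independent samples."""
    p1, p2 = x1 / n1, x2 / n2
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)  # unpooled SE
    diff = p1 - p2
    return diff, se, diff - z * se, diff + z * se


# Hypertension trial: 34 of 50 improved on the drug, 15 of 50 on placebo
diff, se, lower, upper = two_proportion_ci(34, 50, 15, 50)
print(f"difference = {diff:.2f}, SE = {se:.4f}")
print(f"95% CI = ({lower:.2f}, {upper:.2f})")
```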
Exercise 1
1. Waiting times (in hours) at a particular hospital are believed to be approximately normally distributed with a variance of 2.25 hr².
a. A sample of 20 outpatients revealed a mean waiting time of 1.52 hours. Construct the 95% CI for the estimate of the population mean.
b. Suppose that the mean of 1.52 hours had resulted from a sample of 32 patients. Find the 95% CI.
c. What effect does larger sample size have on the CI?
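
For readers who want to check their work on Exercise 1, the sketch below is one possible approach. It assumes the z-based interval is appropriate because the population is taken to be normal with known variance, so σ = √2.25 = 1.5 hours. The function name `mean_ci` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def mean_ci(x_bar, sigma, n, confidence=0.95):
    """z-based CI for a population mean when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin


sigma = sqrt(2.25)  # known variance of 2.25 hr^2 gives sigma = 1.5 hr

ci_20 = mean_ci(1.52, sigma, 20)  # part a
ci_32 = mean_ci(1.52, sigma, 32)  # part b

for n, (lower, upper) in ((20, ci_20), (32, ci_32)):
    print(f"n = {n}: 95% CI = ({lower:.2f}, {upper:.2f}) hours, "
          f"width = {upper - lower:.2f}")
```

Comparing the two printed widths gives a numerical handle on part c.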
Exercise 2

2. In a simple random sample of 125 unemployed male high school dropouts between the ages of 16 and 21, inclusive, 88 stated that they were regular consumers of alcoholic beverages. Construct a 95% CI for the population proportion and interpret the finding.

Exercise 3

3. A study examined the difference in drug therapy adherence among subjects with depression who received usual care and those who received care in a collaborative care model (CCM).
• Of the 50 subjects receiving usual care, 24 adhered to the prescribed drug regimen, while 50 out of the 75 subjects in the CCM adhered to the drug regimen. Construct a 95% CI for the difference in adherence proportion for the population of subjects represented by these two samples.

Thank You
