Unit 6
STAT130
Unit 6: Comparing Population
Means
Chapter 10
Statistical Inferences Based on
Two Samples
Comparing Two Means
Suppose we have two independent populations
with means μ1 and μ2, respectively. We are
interested in making statistical inferences about
the difference in the population means, μ1 − μ2.
We take a simple random sample of n1 subjects
from the first population and an independent
simple random sample of n2 subjects from the
second population.
Examples:
Compare the average salaries of male and female
accountants in the UAE.
Compare the average number of customers arriving at two
branches of Abu Dhabi Islamic Bank.
Comparing Two Means: Assumptions
1. We have two independent simple
random samples.
2. Either
i. both populations are normally
distributed:
X1 ~ N(μ1, σ1²) and X2 ~ N(μ2, σ2²), or
ii. the populations are possibly nonnormal but
both sample sizes are large enough that
the central limit theorem applies.
3. The population standard deviations σ1 and σ2
are unknown.
Comparing Two Means
Generally, the true values of the population
variances σ1² and σ2² are unknown and are
estimated by the sample variances s1² and s2²,
respectively.
We also need to estimate the standard deviation of
the sampling distribution of the difference
between the sample means.
Two approaches:
1. If it can be assumed that σ1² = σ2² = σ², then
calculate the “pooled estimate” of σ².
2. If σ1² ≠ σ2², then use approximate methods.
Comparing Two Means
A point estimate of μ1 − μ2 is the difference in
the sample means, X̄1 − X̄2.
The sampling distribution of X̄1 − X̄2
depends on whether the population
variances σ1² and σ2² are equal:
Case 1: Equal variances (σ1² = σ2²)
Case 2: Unequal variances (σ1² ≠ σ2²)
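Case 1: Equal Variances (σ1² = σ2²)
The statistic
$$ t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} $$
follows a t distribution with degrees of freedom
(df) equal to n1 + n2 – 2, where the pooled
variance sp² is given by
$$ s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} $$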
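Case 2: Unequal Variances (σ1² ≠ σ2²)
When σ1 and σ2 are unknown and assumed
unequal, the statistic
$$ t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} $$
approximately follows a t distribution with degrees of freedom
(df) equal to (round down to the nearest integer)
$$ df = \frac{\left(s_1^2/n_1 + s_2^2/n_2\right)^2}{\frac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \frac{\left(s_2^2/n_2\right)^2}{n_2 - 1}} $$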
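The degrees-of-freedom formula for the unequal-variances case is tedious by hand. The following is a minimal Python sketch of it; the function name `welch_df` is our own (it is not a MegaStat feature), and the summary statistics passed in are illustrative values only.

```python
import math


def welch_df(s1, n1, s2, n2):
    """Approximate df for the unequal-variances t statistic, rounded down."""
    v1 = s1 ** 2 / n1  # s1^2 / n1
    v2 = s2 ** 2 / n2  # s2^2 / n2
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return math.floor(df)


# Illustrative inputs (assumed for this sketch): two sample standard
# deviations and sample sizes.
df = welch_df(s1=7.21, n1=14, s2=9.41, n2=12)
print(df)
```

The result always lies between min(n1, n2) − 1 and n1 + n2 − 2, so the unequal-variances test never uses more degrees of freedom than the pooled test.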
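Confidence Interval for μ1 − μ2
Case 1: The CI with σ’s unknown and equal is
$$ (\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} $$
Case 2: The CI with σ’s unknown and unequal is
$$ (\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} $$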
Example
Two different types of electrical cable insulation are
tested to determine the voltage level at which
failures tend to occur. A sample of each type is
selected and the summary statistics are:
Example
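Let μ1 = mean failure voltage for type A and
μ2 = mean failure voltage for type B.
95% CI for the Difference Between Two
Population Means:
$$ s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{13(7.21)^2 + 11(9.41)^2}{14 + 12 - 2} = 68.74 $$
$$ (\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = 15.54 \pm 2.064\sqrt{68.74\left(\frac{1}{14} + \frac{1}{12}\right)} $$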
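The interval can be evaluated numerically. This short Python sketch uses only the numbers shown on the slide (sample sizes, standard deviations, the difference 15.54 and the critical value 2.064); the variable names are our own.

```python
import math

n1, s1 = 14, 7.21   # type A: sample size and standard deviation
n2, s2 = 12, 9.41   # type B: sample size and standard deviation
diff = 15.54        # difference in sample means, as given on the slide
t_crit = 2.064      # t critical value used on the slide (24 df, 95% CI)

# Pooled variance, then the margin of error for the pooled-variance CI.
sp2 = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
margin = t_crit * math.sqrt(sp2 * (1 / n1 + 1 / n2))
lower, upper = diff - margin, diff + margin

print(round(sp2, 2), round(lower, 2), round(upper, 2))
```

Under these inputs the interval comes out to roughly (8.81, 22.27), which does not contain 0.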
2-Sample t-test: Hypotheses
We hypothesize that the difference between
the population means equals some specified
value D0 (=0 in most cases) and want to test
whether this value is reasonable or whether
the alternative is true.
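Case 1: Equal variances (pooled)
$$ t = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} $$
Case 2: Unequal variances
$$ t = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} $$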
2-Sample t-test: Test Statistic
If we have D0=0, then the test statistic is:
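Case 1: Equal variances (Pooled)
$$ t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} $$
Case 2: Unequal variances
$$ t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} $$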
Comparing 2 Means in MegaStat
For two-sample t-tests and confidence
intervals in MegaStat:
MegaStat → Hypothesis Tests →
Compare Two Independent Groups
Note: You need to check “Test for equality of
variances” to select the proper case when
comparing means.
Note: If a sample size is less than 30, we
have to test for normality using MegaStat.
Example (1)
A researcher was interested in comparing the amount
of time spent browsing the internet by women and by
men. Independent samples of 14 women and 17 men
were selected and each person was asked how many
hours he or she had spent browsing the internet
during the previous week. The summary statistics are
as follows:
          Sample size   Mean   Standard deviation
Men           17        16.9          4.7
Women         14        11.3          4.4
At the 5% significance level, do the data provide sufficient
evidence to conclude that the mean time for women is
less than the mean time for men? Assume that the
distributions of browsing time for men and women are
normal with equal variances.
Example (1)
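Let μ1 = mean time spent browsing the internet by
women and μ2 = mean time spent browsing
the internet by men.
1. H0: μ1 = μ2 vs. Ha: μ1 < μ2
2. Test Statistic (df = 14 + 17 − 2 = 29)
n1 = 14, x̄1 = 11.3, s1 = 4.4, n2 = 17, x̄2 = 16.9, s2 = 4.7
$$ s_p^2 = \frac{(14 - 1)(4.4)^2 + (17 - 1)(4.7)^2}{14 + 17 - 2} = 20.8662 $$
$$ t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{11.3 - 16.9}{\sqrt{20.8662\left(\frac{1}{14} + \frac{1}{17}\right)}} = -3.3968 $$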
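The pooled test statistic above can be reproduced with a few lines of Python. This is a sketch only: `pooled_t` is a name we made up, and the critical value 1.699 (lower-tail, 29 df, 5% level) is taken from a standard t table rather than from the slides.

```python
import math


def pooled_t(x1, s1, n1, x2, s2, n2, d0=0.0):
    """Return (pooled variance, pooled two-sample t statistic)."""
    sp2 = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return sp2, ((x1 - x2) - d0) / se


# Sample 1 = women, sample 2 = men, as defined in the example.
sp2, t = pooled_t(x1=11.3, s1=4.4, n1=14, x2=16.9, s2=4.7, n2=17)

T_CRIT = 1.699          # assumed t-table value for 29 df, alpha = 0.05
reject = t < -T_CRIT    # lower-tailed test: Ha is mu1 < mu2

print(round(sp2, 4), round(t, 4), reject)
```

The rejection-region check is an alternative to the p-value approach and leads to the same decision.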
Example (1)
3. P-value = P(T29 < −3.3968) = 0.0009985
4. Conclusion: Since the p-value < 0.05, we reject H0
and conclude that, on average, women spend less
time browsing the internet than men.
MegaStat output:
Example (2)
Do people who eat high-fiber cereal for breakfast
consume, on average, fewer calories for lunch
than people who do not eat high-fiber cereal for
breakfast?
A sample of 150 people was randomly drawn.
Each person was identified as a consumer or a
non-consumer of high-fiber cereal.
For each person the number of calories consumed
at lunch was recorded (Cereal data).
The claim to be tested is that the mean caloric
intake of consumers (µ1) is less than that of non-
consumers (µ2).
Example (2)
The hypotheses are:
H0: μ1 − μ2 = 0 (i.e. H0: μ1 = μ2)
Ha: μ1 − μ2 < 0 (i.e. Ha: μ1 < μ2)
First check the equality of variances by testing
H0: σ1² = σ2² vs. Ha: σ1² ≠ σ2².
Since the p-value = 0.0007 < 0.05, we reject H0.
Therefore, we run the two-sample t-test
assuming unequal variances.
Example (2)
The test statistic t=-2.09
P-value=0.0193<0.05
We reject H0 and conclude that people who eat
high-fiber cereal for breakfast consume, on
average, fewer calories for lunch than people who
do not eat high-fiber cereal for breakfast.
Exercises
1) A recent issue of Time magazine reported the following
information about the weekly workload of
workers in Germany and the United States:
           Sample Size   Mean   Standard Deviation
US             60         42            5
Germany        70         38            6
Exercise
2) A random sample of 36 tourists in the Grand
Bahamas showed that they spent an average of
$1,750 (in a week) with a standard deviation of
$125; and a sample of 25 tourists in New
Province showed that they spent an average of
$1,900 (in a week) with a standard deviation of
$130. At the 5% significance level, do the data
indicate that tourists who visited New Province
spent, on average, $100 more than those who
visited the Grand Bahamas? State any
assumptions you’ve made to do this problem.
Exercises
3) A firm is studying the delivery times of two
raw material suppliers. To this end, random
samples of delivery times for the two suppliers
were collected (Delivery times worksheet).
a) Do the data suggest that the mean delivery
time of supplier B is less than that of
supplier A? Use a 1% significance level.
b) Find a 99% confidence interval for the
difference between the mean delivery times
of suppliers A and B.
Wilcoxon Rank Sum Test
If the sample sizes n1 and n2 are not large (<30)
and the sampled populations are not normally
distributed, then we can use a nonparametric
method: the Wilcoxon Rank Sum test.
The test is also used when the data are ordinal.
Here, we assume that the two populations have
identical shapes and differ only in location, which
is measured by the median.
For the Wilcoxon Rank Sum Test in MegaStat:
MegaStat → Nonparametric Tests →
Wilcoxon – Mann/Whitney test
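To make the idea of a rank sum concrete, here is a minimal Python sketch: the two samples are pooled, ranked from smallest to largest (tied values share the average of their ranks), and the ranks are summed per sample. The data and function names are hypothetical, and MegaStat's output may report additional quantities not computed here.

```python
def average_ranks(values):
    """Rank values 1..n; tied values receive the average of their ranks."""
    ranks = []
    for v in values:
        less = sum(1 for u in values if u < v)
        equal = sum(1 for u in values if u == v)  # includes v itself
        ranks.append(less + (equal + 1) / 2)
    return ranks


def rank_sums(sample1, sample2):
    """Return the rank sums (T1, T2) of the two samples in the pooled ranking."""
    ranks = average_ranks(list(sample1) + list(sample2))
    t1 = sum(ranks[:len(sample1)])
    t2 = sum(ranks[len(sample1):])
    return t1, t2


# Hypothetical data, for illustration only.
a = [12, 15, 11, 18]
b = [14, 15, 20, 22, 25]
t1, t2 = rank_sums(a, b)
print(t1, t2)
```

A useful check is that T1 + T2 always equals n(n + 1)/2, where n = n1 + n2. If one sample's rank sum is much smaller or larger than expected under H0, the locations of the two populations likely differ.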
Example
The prices of a random sample of houses
with a basement and a random sample of
houses without a basement were recorded. Test
whether the mean prices of houses with and
without basements are equal. Use α = 0.05.
Let μ1: mean price of houses with a basement
μ2: mean price of houses without a basement
Hypotheses: H0: μ1 = μ2 versus Ha: μ1 ≠ μ2
Assumptions: n1 = 20 < 30 (i.e. small sample)
and n2 = 30, hence we must test the normality
assumption for the first population.
Example
Use the chi-square test of normality:
p-value = 0.0143 < 0.05, hence we reject the null hypothesis of normality. Thus
the two-sample t-test is not valid, and the Wilcoxon Rank Sum test is used instead.
Exercise
A consumer agency wants to compare the
caffeine content of two brands of coffee. Eight
jars of each brand are analyzed, and the
amount of caffeine found in each jar is recorded
as shown in the table (Caffeine data).
Brand I    82  77  85  73  84  79  81  82
Brand II   75  80  76  81  72  74  73  78
Using α = 0.10, can you conclude that the two
brands have different median caffeine contents
per jar?
Paired Difference Experiments
Previously, we drew random samples from two different
populations. Now, we have two different processes.
Draw one random sample of units and use those
units to obtain the results of each process.
For instance, use the same individuals for the
results from one process vs. the results from the
other process.
E.g., use the same individuals to compare “before”
and “after” treatments.
Using the same individuals eliminates any
differences between the individuals themselves, so we are
comparing only the results from the two processes.
Paired Difference Experiments
Let μd be the mean of the population of paired
differences:
μd = μ1 – μ2, where μ1 is the mean of population
1 and μ2 is the mean of population 2.
Let d̄ and sd be the mean and standard
deviation of a sample of n paired differences
that has been randomly selected from the
population.
d̄ is the mean of the differences between pairs
of values from both samples.
Paired t-test
1. Hypotheses
H0: μd = D0
Ha: μd > D0 or Ha: μd < D0 or Ha: μd ≠ D0
2. Test Statistic
$$ t = \frac{\bar{d} - D_0}{s_d/\sqrt{n}} $$
D0 = µ1 – µ2 (often D0 = 0) is the claimed or actual
difference between the population means.
The sampling distribution of this statistic is a t
distribution with (n – 1) degrees of freedom.
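As a quick illustration of the paired statistic, the sketch below computes the differences, their mean and standard deviation, and the resulting t value. The `before`/`after` numbers and the function name `paired_t` are hypothetical and are not taken from any example in these slides.

```python
import math
import statistics


def paired_t(first, second, d0=0.0):
    """Return (t statistic, degrees of freedom) for paired samples."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    d_bar = statistics.mean(diffs)    # mean of the paired differences
    s_d = statistics.stdev(diffs)     # sample standard deviation (n - 1 divisor)
    t = (d_bar - d0) / (s_d / math.sqrt(n))
    return t, n - 1


# Hypothetical paired measurements on the same four units.
before = [10, 12, 15, 20]
after = [8, 8, 9, 12]
t, df = paired_t(before, after)
print(round(t, 3), df)
```

Note that only the differences enter the calculation, which is why the paired test has n − 1 degrees of freedom rather than n1 + n2 − 2.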
Example: Programming languages
A computer scientist is investigating the
usefulness of two different design languages in
improving programming tasks. Ten expert
programmers, familiar with both languages, are
asked to code a standard function in both
languages, and the time (in minutes) is
recorded. Do the data that follow indicate any
difference in the mean coding times? Use
α = 0.01.
Example: Programming languages
Let μd = the mean difference in the coding
times (d = L1 − L2)
1. Hypotheses: H0: μd = 0, Ha: μd ≠ 0
2. Assumptions: n = 10 < 30, so test the normality of
the differences (d).
P-value = 0.0359 > 0.01, so the normality assumption
is met.
Test Statistic
t = 1.28
3. P-value = 0.2339
Example: Programming languages
4. Conclusion: P-value>0.01, so do not reject H0.
There is no significant difference in the mean
coding times of the two languages.
MegaStat Output:
Exercises
1) In an effort to increase production of an auto part,
the factory manager decides to play music in the
manufacturing area. Eight workers are selected,
and the number of items each produced for a
specific day is recorded. After one week of music,
the same workers are monitored again. The data
are given in the “music” worksheet. At α = 0.05, can
the manager conclude that the music has
increased production?
2) The profits of a random sample of banks during
1990 and 1991 were observed. Do the data
suggest, at the 10% significance level, that the
mean profits of banks in 1990 and 1991 differ?
Chapter 11
One-Way ANOVA
Introduction: ANOVA
Analysis of variance helps compare two or
more means.
The procedure works by analyzing the sample
variance.
The characteristic that differentiates the
treatments or populations from one another is
called the factor under study, and the different
treatments or populations are referred to as
the levels of the factor.
One-way ANOVA considers the effect of a single factor
on the response variable of interest.
One-Way ANOVA
One-way ANOVA focuses on a comparison of
more than two population or treatment means.
p = the number of populations (treatments) being
compared.
μi = the mean of population i or the true average
response when treatment i is applied.
Hypotheses: The hypotheses of interest are
H0: µ1 = µ2 = … = µp
Ha: At least two of the means differ.
ANOVA Notation
ni denotes the size of the sample randomly
selected for treatment i.
n denotes the total number of selected
observations (n = n1 + n2 + … + np).
xij is the jth value of the response variable using
treatment i.
x̄ is the overall (grand) mean.
x̄i is the ith treatment sample mean.
si is the ith treatment sample standard
deviation.
One-Way ANOVA
Compare the between-treatment variability to
the within-treatment variability
Between-treatment variability is the variability
of the sample means from sample to sample
Within-treatment variability is the variability of
the observed values within each
sample
Formulas for ANOVA: Test Statistic
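Treatment sum of squares (SST)
$$ SST = \sum_{i=1}^{p} n_i\left(\bar{x}_i - \bar{x}\right)^2 $$
Error sum of squares (SSE)
$$ SSE = \sum_{i=1}^{p}\sum_{j=1}^{n_i}\left(x_{ij} - \bar{x}_i\right)^2 $$
Total sum of squares (SSTO)
$$ SSTO = \sum_{i=1}^{p}\sum_{j=1}^{n_i}\left(x_{ij} - \bar{x}\right)^2 = SST + SSE $$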
Formulas for ANOVA: Test Statistic
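Mean Squares:
$$ MST = \frac{SST}{p - 1}, \qquad MSE = \frac{SSE}{n - p} $$
Test Statistic:
$$ F = \frac{MST}{MSE} $$
F has an F distribution with p – 1 and n – p degrees
of freedom.
P-value = P(F(p−1, n−p) > F), the area under the F distribution curve to the right of the observed F.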
ANOVA Table
Source       Degrees of Freedom   Sum of Squares   Mean Squares   F
Treatments   p − 1                SST              MST            MST/MSE
Error        n − p                SSE              MSE
Total        n − 1                SSTO
One-Way ANOVA Assumptions
Normality
The p populations of values of the response
variable all have normal distributions.
Constant variance
The p populations of values of the response
variable (associated with the p treatments) all
have the same variance.
Independence
The samples of experimental units are
randomly selected, independent samples.
Notes on Assumptions
One-way ANOVA is not very sensitive to
violations of the equal variances assumption
Especially when the samples are about the same size.
No test is available in MegaStat but the assumption
will be approximately met if:
(largest s)/(smallest s) <2
Normality is not crucial
ANOVA results are approximately valid for mound-
shaped distributions.
If the sample distributions are reasonably symmetric
and if there are no outliers, then ANOVA results are
valid for even small samples.
Example
A marketing specialist wishes to see whether there
is a difference in the average time a customer has to wait
in a checkout line in three large self-service
department stores. The times (in minutes) are
shown in the table. Is there a significant difference in the
mean waiting times of customers for each store
using α = 0.05?
Store A   Store B   Store C
   3         5         1
   2         8         3
   5         8         3
   5         9         4
   6         6         2
   3         2         7
   1         5         3
Example
The hypotheses
H0: μ1 = μ2 = μ3
H1: At least two means differ.
Assumptions:
The three samples are assumed to come from normal
populations; the normality-test p-values are 0.0995, 0.0495
and 0.210. The plot of store B data does not show
any outliers.
Largest s = 2.41 while the smallest s is 1.81, so
the ratio is less than 2.
The assumptions of normality and equal variances
are met.
The test statistic is F = 4.11
Example
The p-value = 0.0340 < 0.05
Reject H0. There is a significant difference in the
mean waiting times of customers across the three stores.
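The F statistic and p-value for this example can be checked directly from the waiting-time table. The sketch below follows the SST/SSE formulas from the earlier slides; the function name is our own, and the closed-form p-value used here is a shortcut that holds only when the numerator degrees of freedom equal 2 (that is, when exactly three groups are compared).

```python
import statistics

# Waiting times (minutes) from the example table.
stores = {
    "A": [3, 2, 5, 5, 6, 3, 1],
    "B": [5, 8, 8, 9, 6, 2, 5],
    "C": [1, 3, 3, 4, 2, 7, 3],
}


def one_way_anova(groups):
    """Return (F, df1, df2) for a one-way ANOVA on a list of samples."""
    all_values = [x for g in groups for x in g]
    n, p = len(all_values), len(groups)
    grand = statistics.mean(all_values)
    sst = sum(len(g) * (statistics.mean(g) - grand) ** 2 for g in groups)
    sse = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    mst, mse = sst / (p - 1), sse / (n - p)
    return mst / mse, p - 1, n - p


f_stat, df1, df2 = one_way_anova(list(stores.values()))

# Shortcut valid only for df1 == 2: P(F > f) = (1 + 2f/df2) ** (-df2/2).
p_value = (1 + 2 * f_stat / df2) ** (-df2 / 2)

print(round(f_stat, 2), df1, df2, round(p_value, 4))
```

For other numbers of groups the p-value would need an F table or statistical software, as with the MegaStat output.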
Multiple (Pairwise) Comparisons
If we do not reject H0 in an ANOVA, the analysis
is finished: there is no evidence of differences among the
means. If, however, we reject H0, then we want to
know which of the μi’s differ from each
other.
Any method for carrying out this further analysis
is called a multiple comparison procedure. There
are many such procedures in the statistical literature.
The naive way of doing this would be to construct
pooled t-tests for each pair of μi’s, but running many
separate tests inflates the overall chance of a type I error.
We will use the Tukey method.
The Tukey Multiple Comparisons
Tukey simultaneous 100(1 − α)% confidence
interval for μi – μj:
$$ (\bar{x}_i - \bar{x}_j) \pm \frac{q_\alpha}{\sqrt{2}}\sqrt{MSE\left(\frac{1}{n_i} + \frac{1}{n_j}\right)} $$
where qα is the upper α percentage point of the
studentized range for p and (n – p) from Table
A.9.
The Tukey method in MegaStat
The Tukey method compares all pairs of means
with H0: μi = μj (for all i ≠ j).
MegaStat calculates the t statistic as follows:
$$ t = \frac{\bar{x}_i - \bar{x}_j}{\sqrt{MSE\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}} $$
Stores B & C differ, while A & C and A & B do not differ significantly.
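The pairwise comparison for the store example can be sketched as follows. A Tukey interval excludes 0 exactly when |t| exceeds qα/√2, so the code compares each pairwise t to that threshold. The value qα = 3.61 (p = 3 means, 18 error df, α = 0.05) is an assumption taken from a standard studentized range table and should be checked against Table A.9; all variable names are our own.

```python
import math
import statistics
from itertools import combinations

# Waiting times (minutes) from the store example.
stores = {
    "A": [3, 2, 5, 5, 6, 3, 1],
    "B": [5, 8, 8, 9, 6, 2, 5],
    "C": [1, 3, 3, 4, 2, 7, 3],
}

n = sum(len(v) for v in stores.values())
p = len(stores)
# MSE = SSE / (n - p), pooling the within-store squared deviations.
mse = sum((x - statistics.mean(v)) ** 2
          for v in stores.values() for x in v) / (n - p)

Q_ALPHA = 3.61                      # assumed studentized range table value
threshold = Q_ALPHA / math.sqrt(2)  # |t| above this means the pair differs

results = {}
for a, b in combinations(stores, 2):
    se = math.sqrt(mse * (1 / len(stores[a]) + 1 / len(stores[b])))
    t = (statistics.mean(stores[a]) - statistics.mean(stores[b])) / se
    results[(a, b)] = (t, abs(t) > threshold)

for pair, (t, differs) in results.items():
    print(pair, round(t, 3), differs)
```

With these inputs only the B versus C comparison exceeds the threshold, in line with the conclusion stated above.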
Exercises
1) North American automobile manufacturers have
become more concerned with quality because of
foreign competition. One aspect of quality is the cost
of repairing damage caused by accidents. A
manufacturer is considering several new types of
bumpers. In order to test how well they react to low-
speed collisions, 40 bumpers of each of four different
types were installed on midsize cars, which were
then driven into a wall at 5 miles per hour. The cost
of repairing the damage in each case was assessed.
The relevant data are stored in the Bumpers worksheet.
a) Is there sufficient evidence to infer that the bumpers
differ in their reactions to low-speed collisions?
b) If differences exist, use Tukey’s method to determine
which bumpers differ.
Exercises
2) A consumer agency investigated the premiums
charged by four auto insurance companies. The
agency randomly selected five drivers insured by
each company who had similar driving records,
autos, and insurance coverage. The following table
gives the monthly premiums paid by the 20 drivers.
Can you conclude that the average auto insurance
premiums paid per month by all such drivers are
the same for all four companies? Use α = 0.05.
Company A   Company B   Company C   Company D
   $65         $48         $57         $62
    73          69          61          53
    54          88          89          45
    43          75          77          51
    70          72          69          44
Exercises
3) Part of an ANOVA table involving 8 groups for a
study is shown below.
Source       DF    SS     MS    F    P-value
Treatments         126
Error              240
Total        67
a) Complete all the missing values in the above table
and fill in the blanks.
b) Use α = 0.01 to determine if there is any
significant difference among the means of the
eight groups.
Kruskal-Wallis Test
If either the assumption of normality or
equality of variances in ANOVA is violated, then
the F-test is no longer valid and the Kruskal-
Wallis test should be used instead.
The test is also used with ordinal data.
For the Kruskal-Wallis Test in MegaStat:
MegaStat → Nonparametric Tests →
Kruskal-Wallis Test
Example
A consumer group wanted to compare the service
time at three fast-food restaurants, Al’s,
Eduardo’s, and Patel’s. Every Tuesday and
Wednesday for four weeks, three staff members of
the group were randomly assigned to these three
restaurants. Each staff member went to his or her
assigned restaurant and ordered a hamburger,
fries, and a Coke and then recorded the time that
elapsed from entering the restaurant until
receiving the food. The service times (in minutes)
for these eight days for the three restaurants are
stored in “service time” worksheet.
Example
At the 10% level of significance, can you
conclude that there is a difference in the mean
service times at these three restaurants?
Hypotheses:
H0: μ1 = μ2 = μ3 vs. Ha: not all means are equal
Assumptions:
Normality assumption: Using the chi-square test of
normality, the p-values are 0.0143, 0.3173 and
0.0143. This means that the distributions of
service time at Al’s and Patel’s are not normal.
ANOVA is not valid, so we use the Kruskal-Wallis test.
Example
Using Kruskal-Wallis test:
Test statistic: H = 5.419
p-value = 0.066 < 0.10
Conclusion: we reject H0. The data suggest that
there is a difference in the mean service times at
these three restaurants.
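The "service time" worksheet is not reproduced here, so the following Python sketch illustrates the Kruskal-Wallis H statistic on hypothetical data instead, using the usual rank-based formula H = 12/(n(n+1)) · Σ Ri²/ni − 3(n+1) without a tie correction. It then checks the p-value for the H = 5.419 reported above, using the chi-square approximation with p − 1 = 2 degrees of freedom, for which the upper-tail probability has the closed form exp(−H/2). Function names are our own.

```python
import math


def average_ranks(values):
    """Rank values 1..n; tied values receive the average of their ranks."""
    ranks = []
    for v in values:
        less = sum(1 for u in values if u < v)
        equal = sum(1 for u in values if u == v)  # includes v itself
        ranks.append(less + (equal + 1) / 2)
    return ranks


def kruskal_wallis_h(groups):
    """H statistic from pooled ranks (no tie correction applied)."""
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        r = sum(ranks[start:start + len(g)])  # rank sum of this group
        total += r ** 2 / len(g)
        start += len(g)
    return 12 / (n * (n + 1)) * total - 3 * (n + 1)


def chi2_upper_tail_df2(h):
    """P(chi-square with 2 df > h); valid only for three groups."""
    return math.exp(-h / 2)


# Hypothetical service times for three groups, for illustration only.
h = kruskal_wallis_h([[27, 31, 29], [35, 38, 33], [40, 42, 45]])
print(round(h, 3))

# p-value check for the H reported in the example.
print(round(chi2_upper_tail_df2(5.419), 4))
```

The second printed value is about 0.067, in line with the p-value reported in the example.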
Exercise
Manufacturers of luxury cars are very much
interested in knowing the age distribution of their
customers because then they can change these
models to attract younger buyers without losing
the older customers who have traditionally
favored such cars. The “Drivers” worksheet gives
the ages of 7, 8 and 9 randomly selected primary
drivers of these three makes of cars. At the 5%
level of significance, can you conclude that the
average age of drivers for each of these three
makes of cars is the same?