Lesson 9 ANOVA
INTRODUCTION:
OBJECTIVES:
One-Way Analysis of Variance
1. The populations from which the samples were obtained must be normally
or approximately normally distributed.
2. The samples must be independent of one another.
3. The variances of the populations must be equal.
Even though you are comparing three or more means in this use of the 𝐹
test, variances are used in the test instead of means.
With the 𝐹 test, two different estimates of the population variance are made.
The first estimate is called the between-group variance, and it involves
finding the variance of the means. The second estimate, the within-group
variance, is made by computing the variance using all the data and is not
affected by differences in the means. If there is no difference in the means, the
between-group variance estimate will be approximately equal to the within-group
variance estimate, and the 𝐹 test value will be approximately equal to 1. The null
hypothesis will not be rejected. However, when the means differ significantly, the
between-group variance will be much larger than the within-group variance;
the 𝐹 test value will be significantly greater than 1; and the null hypothesis will
be rejected. Since variances are compared, this procedure is called analysis of
variance (ANOVA).
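The two estimates can be written out explicitly. The block below is a sketch using the standard textbook definitions, which this section does not spell out. The symbols $n_i$, $\bar{X}_i$, and $s_i^2$ (the size, mean, and variance of sample $i$) are notation assumed here; $\bar{X}_{GM}$ is the grand mean of all values, $k$ is the number of groups, and $N$ is the total number of values, so that $\sum (n_i - 1) = N - k$.

```latex
s_B^2 = \frac{\sum_{i=1}^{k} n_i \left(\bar{X}_i - \bar{X}_{GM}\right)^2}{k - 1},
\qquad
s_W^2 = \frac{\sum_{i=1}^{k} (n_i - 1)\, s_i^2}{\sum_{i=1}^{k} (n_i - 1)},
\qquad
F = \frac{s_B^2}{s_W^2}
```

Only the numerator of $s_B^2$ depends on how far apart the sample means are, which is why a large $F$ points to a difference among the means.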
For a test of the difference among three or more means, the following
hypotheses should be used:
𝐻0 : 𝜇1 = 𝜇2 = . . . = 𝜇𝑘
𝐻1 : At least one mean is different from the others.
Solution:
𝑯𝟎 : 𝝁𝟏 = 𝝁𝟐 = 𝝁𝟑 (claim)
𝑑. 𝑓. 𝑁. = 𝑘 − 1 = 3 − 1 = 2
𝑑. 𝑓. 𝐷. = 𝑁 − 𝑘 = 15 − 3 = 12
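As a cross-check on the table lookup, the critical value for these degrees of freedom can be computed directly. The sketch below relies on a closed form for the right tail of the F distribution that holds only when d.f.N. = 2 (as here, where k = 3); for any other numerator degrees of freedom an F table or statistical software is needed. The function name `f_critical_dfn2` is hypothetical, and α = 0.05 is assumed.

```python
def f_critical_dfn2(alpha: float, dfd: int) -> float:
    """Right-tail critical value of the F distribution with d.f.N. = 2.

    Valid ONLY for numerator degrees of freedom equal to 2, where
    P(F > x) = (1 + 2x / dfd) ** (-dfd / 2) can be solved for x.
    """
    if not 0 < alpha < 1:
        raise ValueError("alpha must be strictly between 0 and 1")
    return (dfd / 2) * (alpha ** (-2 / dfd) - 1)


k, N, alpha = 3, 15, 0.05   # alpha = 0.05 is an assumption
dfn = k - 1                 # d.f.N. = 2
dfd = N - k                 # d.f.D. = 12
critical_value = f_critical_dfn2(alpha, dfd)
print(f"d.f.N. = {dfn}, d.f.D. = {dfd}, C.V. = {critical_value:.2f}")
```

The result agrees with the critical value of 3.89 used in Step 4.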
Step 3: Compute the test value, using the procedure outlined here.
a. Find the mean and variance of each sample (these values are shown
below the data).
b. Find the grand mean. The grand mean, denoted by 𝑋̅𝐺𝑀 , is the mean of all
values in the samples.
$$\bar{X}_{GM} = \frac{\sum X}{N} = \frac{10 + 12 + 9 + 15 + 13 + 6 + 8 + 3 + 0 + 2 + 5 + 9 + 12 + 8 + 4}{15} = \frac{116}{15} = 7.73$$
When the samples are equal in size, 𝑋̅𝐺𝑀 can also be found by summing the sample means (the 𝑋̅'s) and dividing by 𝑘, where 𝑘 is the number of groups.
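A short Python sketch can confirm that both routes give the same grand mean for this example. The split of the 15 values into three samples of five, taken in the order they are listed in the grand-mean sum, is an assumption, since the data table is not reproduced here; the variable names are illustrative.

```python
from statistics import mean

# Assumed grouping: the 15 values from the grand-mean sum, five per sample.
samples = [
    [10, 12, 9, 15, 13],
    [6, 8, 3, 0, 2],
    [5, 9, 12, 8, 4],
]

# Route 1: sum of all values divided by N.
all_values = [x for sample in samples for x in sample]
grand_mean = sum(all_values) / len(all_values)

# Route 2: sum of the sample means divided by k (equal sample sizes only).
sample_means = [mean(sample) for sample in samples]
mean_of_means = sum(sample_means) / len(samples)

print(f"grand mean        = {grand_mean:.2f}")
print(f"mean of the means = {mean_of_means:.2f}")
```

The shortcut works only because every sample has the same size; with unequal sizes the two numbers generally differ, and the sum over all values should be used.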
Note: This formula finds the variance among the means by using the sample
sizes as weights and considers the differences in the means.
$$F = \frac{s_B^2}{s_W^2} = \frac{80.07}{8.73} = 9.17$$
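The test value can be reproduced in a few lines of Python. This is a sketch rather than the lesson's own procedure: it uses the standard weighted formulas for the two variance estimates, and it assumes that the 15 values listed in the grand-mean calculation fall into three samples of five, in the order given.

```python
from statistics import mean, variance

# Assumed grouping of the example data: three samples of five values each.
samples = [
    [10, 12, 9, 15, 13],
    [6, 8, 3, 0, 2],
    [5, 9, 12, 8, 4],
]

k = len(samples)                                  # number of groups
N = sum(len(s) for s in samples)                  # total number of values
grand_mean = sum(sum(s) for s in samples) / N

# Between-group variance: variance of the sample means, weighted by sample size.
s2_between = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples) / (k - 1)

# Within-group variance: sample variances pooled by their degrees of freedom.
s2_within = sum((len(s) - 1) * variance(s) for s in samples) / (N - k)

f_value = s2_between / s2_within
print(f"between = {s2_between:.2f}, within = {s2_within:.2f}, F = {f_value:.2f}")
```

Under that grouping the computed test value agrees with F = 9.17.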
Step 4: Make the decision. The decision is to reject the null hypothesis, since
9.17 > 3.89.
Step 5: Summarize the results. There is enough evidence to reject the claim and
conclude that at least one mean is different from the others.
The numerator of the fraction obtained in step 3, part c, of the
computational procedure is called the sum of squares between groups,
denoted by 𝑆𝑆𝐵 . The numerator of the fraction obtained in step 3, part d, of the
computational procedure is called the sum of squares within groups, denoted
by 𝑆𝑆𝑊 . This statistic is also called the sum of squares for the error. 𝑆𝑆𝐵 is
divided by 𝑑. 𝑓. 𝑁. to obtain the between-group variance. 𝑆𝑆𝑊 is divided by 𝑁 − 𝑘
to obtain the within-group or error variance. These two variances are sometimes
called mean squares, denoted by 𝑀𝑆𝐵 and 𝑀𝑆𝑊 . These terms are used to
summarize the analysis of variance and are placed in a summary table, as shown
below.
In the table,
𝑆𝑆𝐵 = the sum of squares between groups
𝑘 = number of groups
$$MS_B = \frac{SS_B}{k - 1}$$
$$MS_W = \frac{SS_W}{N - k}$$
$$F = \frac{MS_B}{MS_W}$$
The totals are obtained by adding the corresponding columns. For Example 1,
the ANOVA summary table is shown below.
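A summary table of this form can also be generated from the raw data. The sketch below is illustrative: the function name `anova_summary` is hypothetical, and the grouping of the Example 1 values into three samples of five is assumed from the order in which they appear in the grand-mean calculation.

```python
from statistics import mean


def anova_summary(samples):
    """Return the rows of a one-way ANOVA summary table as nested dicts."""
    k = len(samples)
    N = sum(len(s) for s in samples)
    grand_mean = sum(sum(s) for s in samples) / N
    means = [mean(s) for s in samples]

    # Sum of squares between groups: weighted squared distances of the means.
    ssb = sum(len(s) * (m - grand_mean) ** 2 for s, m in zip(samples, means))
    # Sum of squares within groups: squared distances from each group's mean.
    ssw = sum((x - m) ** 2 for s, m in zip(samples, means) for x in s)

    msb = ssb / (k - 1)
    msw = ssw / (N - k)
    return {
        "Between": {"SS": ssb, "df": k - 1, "MS": msb, "F": msb / msw},
        "Within": {"SS": ssw, "df": N - k, "MS": msw},
        "Total": {"SS": ssb + ssw, "df": N - 1},
    }


# Assumed grouping of the Example 1 data.
example_1 = [[10, 12, 9, 15, 13], [6, 8, 3, 0, 2], [5, 9, 12, 8, 4]]
table = anova_summary(example_1)

print(f"{'Source':<10}{'SS':>10}{'d.f.':>6}{'MS':>10}{'F':>8}")
for source, row in table.items():
    ms = f"{row['MS']:.2f}" if "MS" in row else ""
    f = f"{row['F']:.2f}" if "F" in row else ""
    print(f"{source:<10}{row['SS']:>10.2f}{row['df']:>6}{ms:>10}{f:>8}")
```

The Total row is simply the column sum of the Between and Within rows, which is the check described in the text.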
Solution:
𝑯𝟎 : 𝝁𝟏 = 𝝁𝟐 = 𝝁𝟑 (claim)
Step 2: Find the critical value. Since 𝑘 = 3 and 𝑁 = 18, and 𝛼 = 0.05,
𝑑. 𝑓. 𝑁. = 𝑘 − 1 = 3 − 1 = 2
𝑑. 𝑓. 𝐷. = 𝑁 − 𝑘 = 18 − 3 = 15
Step 3: Compute the test value, using the procedure outlined here.
a. Find the mean and variance of each sample (these values are shown
below the data columns in the example).
b. Find the grand mean. The grand mean, denoted by 𝑋̅𝐺𝑀 , is the mean of all
values in the samples.
$$\bar{X}_{GM} = \frac{\sum X}{N} = \frac{7 + 14 + 32 + \cdots + 11}{18} = \frac{152}{18} = 8.44$$
Step 4: Make the decision. Since 5.05 > 3.68, the decision is to reject the null
hypothesis.
Step 5: Summarize the results. There is enough evidence to support the claim
that there is a difference among the means.
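The decision in Step 4 can also be checked with the P-value method. The sketch below takes the test value F = 5.05 and the degrees of freedom from this example and uses a closed form for the right-tail area that is valid only when d.f.N. = 2. The function name `f_right_tail_dfn2` is hypothetical; in general a table or statistical software would be used instead.

```python
def f_right_tail_dfn2(f_value: float, dfd: int) -> float:
    """P(F > f_value) for an F distribution with d.f.N. = 2 and d.f.D. = dfd.

    Valid ONLY when the numerator degrees of freedom equal 2.
    """
    if f_value < 0:
        raise ValueError("an F test value cannot be negative")
    return (1 + 2 * f_value / dfd) ** (-dfd / 2)


alpha = 0.05
k, N = 3, 18
dfd = N - k            # d.f.D. = 15
f_value = 5.05         # test value reported in Step 4

p_value = f_right_tail_dfn2(f_value, dfd)
decision = "reject" if p_value < alpha else "do not reject"
print(f"P-value = {p_value:.3f}; decision: {decision} the null hypothesis")
```

Because the P-value is smaller than α, this route leads to the same decision as comparing 5.05 with the critical value 3.68.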
The steps for computing the F test value for the ANOVA are summarized in this
Procedure Table.
When the null hypothesis is rejected in ANOVA, it only means that at least
one mean is different from the others. To locate the difference or differences
among the means, it is necessary to use other tests such as the Tukey or the
Scheffé test.
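As an illustration of such a follow-up, the sketch below applies the Scheffé test to the Example 1 data. It rests on assumptions not stated in this section: the usual Scheffé statistic F_S = (X̄_i − X̄_j)² / [s²_W (1/n_i + 1/n_j)], the Scheffé critical value F′ = (k − 1)(C.V.) with C.V. = 3.89 taken from Example 1, and the grouping of the Example 1 values into three samples of five in the order they are listed.

```python
from itertools import combinations
from statistics import mean, variance

# Assumed grouping of the Example 1 data: three samples of five values each.
samples = [
    [10, 12, 9, 15, 13],
    [6, 8, 3, 0, 2],
    [5, 9, 12, 8, 4],
]

k = len(samples)
N = sum(len(s) for s in samples)

# Within-group variance, pooled over the samples.
s2_within = sum((len(s) - 1) * variance(s) for s in samples) / (N - k)

critical_value = 3.89                  # C.V. used in Example 1
scheffe_cv = (k - 1) * critical_value  # F' = (k - 1)(C.V.)

results = {}
for (i, a), (j, b) in combinations(enumerate(samples, start=1), 2):
    f_s = (mean(a) - mean(b)) ** 2 / (s2_within * (1 / len(a) + 1 / len(b)))
    results[(i, j)] = f_s
    verdict = "significant" if f_s > scheffe_cv else "not significant"
    print(f"means {i} vs {j}: F_S = {f_s:.2f} (F' = {scheffe_cv:.2f}) -> {verdict}")
```

Under these assumptions only the first and second means differ significantly, which shows how a pairwise test locates the difference that the overall F test merely detects.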
Exercises: Assume that all variables are normally distributed, that the samples
are independent, and that the population variances are equal. Also, for each
exercise, perform the following steps.
2. Hybrid Vehicles. A study was done before the recent surge in gasoline
prices to compare the cost to drive 25 miles for different types of hybrid
vehicles. The cost of a gallon of gas at the time of the study was
approximately $2.50. Based on the information given below for different
models of hybrid cars, trucks, and SUVs, is there sufficient evidence to
conclude a difference in the mean cost to drive 25 miles? Use 𝛼 = 0.05.
3. Expenditures per Pupil. The per-pupil costs (in thousands of dollars) for
cyber charter school tuition for school districts in three areas of southwestern
Pennsylvania are shown. At 𝛼 = 0.05, is there a difference in the means? If
so, give a possible reason for the difference.
5. Ocean Water Temperatures. The National Oceanographic Data Center
lists water temperatures in degrees Fahrenheit for beaches all around the
country. Below are listed selected beach temperatures from the month of
February for various coastal areas of the United States. At the 0.05 level
of significance, is there sufficient evidence to conclude a difference in
mean temperatures?