Lesson 9 Anova

This document provides an overview of the Analysis of Variance (ANOVA) method, specifically focusing on the F test for comparing three or more means. It outlines the assumptions necessary for conducting ANOVA, the hypothesis testing process, and includes examples illustrating the computational procedures involved. The document also explains the significance of the F test value and how to interpret the results of the analysis.

Uploaded by

rgricarte
Copyright
© All Rights Reserved

Lesson 9

Test of Differences using ANOVA
(Analysis of Variance)

INTRODUCTION:

The 𝐹 test, used to compare two variances, can also be used to
compare three or more means. This technique is called analysis of
variance, or ANOVA. It is used to test claims involving three or more
means. For three groups, the 𝐹 test can only show whether a difference
exists among the three means. It cannot reveal where the difference
lies, that is, between means 1 and 2, 1 and 3, or 2 and 3. If the 𝐹 test
indicates that there is a difference among the means, other statistical
tests are used to find where the difference exists. The most commonly
used tests are the Scheffé test and the Tukey test, which are not
explained in this lesson.

OBJECTIVES:

At the end of this lesson, you should be able to:


1. Discuss the requirements in using analysis of variance; and
2. Perform hypothesis testing involving one-way analysis of variance.

One-Way Analysis of Variance

When an 𝐹 test is used to test a hypothesis concerning the means of
three or more populations, the technique is called analysis of variance
(commonly abbreviated as ANOVA). At first glance, you might think that to
compare the means of three or more samples, you can use the 𝑡 test, comparing
two means at a time. But there are several reasons why the 𝑡 test should not be
used this way. First, when you are comparing two means at a time, the rest of the means
under study are ignored. With the 𝐹 test, all the means are compared
simultaneously. Second, when you are comparing two means at a time and
making all pairwise comparisons, the probability of rejecting the null hypothesis
when it is true is increased, since the more 𝑡 tests that are conducted, the
greater is the likelihood of getting significant differences by chance alone. Third,
the more means there are to compare, the more 𝑡 tests are needed. For
example, for the comparison of 3 means two at a time, 3 𝑡 tests are required.
For the comparison of 5 means two at a time, 10 tests are required. And for the
comparison of 10 means two at a time, 45 tests are required.
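
The counts above follow from the number of ways to choose 2 means out of 𝑘. The short Python sketch below reproduces them and also gives a rough idea of how the chance of at least one false rejection grows. The function names are illustrative, and the error-rate formula assumes the 𝑡 tests are independent, which is a simplification not made in this lesson.

```python
from math import comb


def pairwise_test_count(k: int) -> int:
    """Number of two-at-a-time comparisons among k means: C(k, 2)."""
    return comb(k, 2)


def familywise_error_rate(k: int, alpha: float = 0.05) -> float:
    """Approximate chance of at least one false rejection over all pairs.

    Assumption (not from the lesson): the tests are treated as
    independent, which gives 1 - (1 - alpha) ** m for m tests.
    """
    m = pairwise_test_count(k)
    return 1 - (1 - alpha) ** m


for k in (3, 5, 10):
    print(k, pairwise_test_count(k), round(familywise_error_rate(k), 3))
```

Under that independence assumption, even three groups push the overall error rate well above the nominal 0.05, which is the second reason given above for preferring a single 𝐹 test.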

Assumptions for the 𝑭 Test for Comparing Three or More Means

1. The populations from which the samples were obtained must be normally
or approximately normally distributed.
2. The samples must be independent of one another.
3. The variances of the populations must be equal.
Even though you are comparing three or more means in this use of the 𝐹
test, variances are used in the test instead of means.

With the 𝐹 test, two different estimates of the population variance are made.
The first estimate is called the between-group variance, and it involves
finding the variance of the means. The second estimate, the within-group
variance, is made by computing the variance using all the data and is not
affected by differences in the means. If there is no difference in the means, the
between-group variance estimate will be approximately equal to the within-group
variance estimate, and the 𝐹 test value will be approximately equal to 1. The null
hypothesis will not be rejected. However, when the means differ significantly, the
between-group variance will be much larger than the within-group variance;
the 𝐹 test value will be significantly greater than 1; and the null hypothesis will
be rejected. Since variances are compared, this procedure is called analysis of
variance (ANOVA).

For a test of the difference among three or more means, the following
hypotheses should be used:

𝐻0 : 𝜇1 = 𝜇2 = . . . = 𝜇𝑘

𝐻1 : At least one mean is different from the others

As stated previously, a significant test value means that there is a high
probability that this difference in means is not due to chance, but it does not
indicate where the difference lies.

The degrees of freedom for this 𝐹 test are d.f.N. = 𝑘 − 1, where 𝑘 is the
number of groups, and d.f.D. = 𝑁 − 𝑘, where 𝑁 is the sum of the sample sizes
of the groups, 𝑁 = 𝑛₁ + 𝑛₂ + ... + 𝑛ₖ. The sample sizes need not be equal.
The 𝐹 test to compare means is always right-tailed.
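
As a small illustration of the degrees-of-freedom rule, the Python sketch below computes d.f.N. and d.f.D. from a list of sample sizes. The function name is hypothetical, and the unequal sizes in the second call are made-up values used only to show that the groups need not be the same size.

```python
def degrees_of_freedom(sample_sizes):
    """Return (d.f.N., d.f.D.) = (k - 1, N - k) for a one-way ANOVA."""
    k = len(sample_sizes)          # number of groups
    n_total = sum(sample_sizes)    # N = n1 + n2 + ... + nk
    return k - 1, n_total - k


print(degrees_of_freedom([5, 5, 5]))   # three groups of five observations
print(degrees_of_freedom([4, 6, 4]))   # hypothetical unequal sample sizes
```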

The following examples illustrate the computational procedure for the
ANOVA technique for comparing three or more means, and the steps are
summarized in the Procedure Table shown after the examples.

Example 9.1. Lowering Blood Pressure. A researcher wishes to try three
different techniques to lower the blood pressure of individuals
diagnosed with high blood pressure. The subjects are randomly
assigned to three groups; the first group takes medication, the second
group exercises, and the third group follows a special diet. After four
weeks, the reduction in each person’s blood pressure is recorded. At
𝛼 = 0.05, test the claim that there is no difference among the means.
The data are shown.

            Medication    Exercise      Diet
                10            6           5
                12            8           9
                 9            3          12
                15            0           8
                13            2           4
Mean        X̄₁ = 11.8     X̄₂ = 3.8     X̄₃ = 7.6
Variance    s₁² = 5.7     s₂² = 10.2    s₃² = 10.3

Solution:

Step 1: State the hypotheses and identify the claim.

𝐻0 : 𝜇1 = 𝜇2 = 𝜇3 (claim)

𝐻1 : At least one mean is different from the others

Step 2: Find the critical value. Since 𝑘 = 3 and 𝑁 = 15,

𝑑. 𝑓. 𝑁. = 𝑘 − 1 = 3 − 1 = 2

𝑑. 𝑓. 𝐷. = 𝑁 − 𝑘 = 15 − 3 = 12

The critical value is 3.89, obtained from the 𝐹 table with 𝛼 = 0.05.

Step 3: Compute the test value, using the procedure outlined here.

a. Find the mean and variance of each sample (these values are shown
below the data).
b. Find the grand mean. The grand mean, denoted by 𝑋̅𝐺𝑀 , is the mean of all
values in the samples.

$$\bar{X}_{GM} = \frac{\sum X}{N} = \frac{10 + 12 + 9 + 15 + 13 + 6 + 8 + 3 + 0 + 2 + 5 + 9 + 12 + 8 + 4}{15} = \frac{116}{15} = 7.73$$

When samples are equal in size, find 𝑋̅𝐺𝑀 by summing the 𝑋̅’s and dividing by 𝑘,
where 𝑘 is the number of groups.

c. Find the between-group variance, denoted by $s_B^2$.

$$s_B^2 = \frac{\sum n_i(\bar{X}_i - \bar{X}_{GM})^2}{k - 1}$$

$$s_B^2 = \frac{5(11.8 - 7.73)^2 + 5(3.8 - 7.73)^2 + 5(7.6 - 7.73)^2}{3 - 1} = \frac{160.13}{2} = 80.07$$

Note: This formula finds the variance among the means by using the sample
sizes as weights and considers the differences in the means.

d. Find the within-group variance, denoted by $s_W^2$.

$$s_W^2 = \frac{\sum (n_i - 1)s_i^2}{\sum (n_i - 1)}$$

$$s_W^2 = \frac{(5 - 1)(5.7) + (5 - 1)(10.2) + (5 - 1)(10.3)}{(5 - 1) + (5 - 1) + (5 - 1)} = \frac{104.80}{12} = 8.73$$

Note: This formula finds an overall variance by calculating a weighted average of
the individual variances. It does not involve using differences of the means.

e. Find the 𝐹 test value.

$$F = \frac{s_B^2}{s_W^2} = \frac{80.07}{8.73} = 9.17$$

Step 4: Make the decision. The decision is to reject the null hypothesis, since
9.17 > 3.89.

Step 5: Summarize the results. There is enough evidence to reject the claim and
conclude that at least one mean is different from the others.
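
The five steps of Example 9.1 can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the function name `one_way_anova` is illustrative, and the critical value 3.89 is simply copied from Step 2 rather than computed.

```python
from statistics import mean, variance


def one_way_anova(groups):
    """Return (between-group variance, within-group variance, F)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Step 3c: weighted squared distances of group means from the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Step 3d: weighted sum of the sample variances (n - 1 denominators)
    ss_within = sum((len(g) - 1) * variance(g) for g in groups)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between, ms_within, ms_between / ms_within


medication = [10, 12, 9, 15, 13]
exercise = [6, 8, 3, 0, 2]
diet = [5, 9, 12, 8, 4]

ms_b, ms_w, f_value = one_way_anova([medication, exercise, diet])
critical_value = 3.89  # taken from Step 2 (alpha = 0.05, d.f. 2 and 12)

print(round(ms_b, 2), round(ms_w, 2), round(f_value, 2))
print("reject H0" if f_value > critical_value else "do not reject H0")
```

The sketch uses the unrounded grand mean, so intermediate values can differ slightly from hand calculations that round at each step.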

The numerator of the fraction obtained in step 3, part c, of the
computational procedure is called the sum of squares between groups,
denoted by 𝑆𝑆𝐵 . The numerator of the fraction obtained in step 3, part d, of the
computational procedure is called the sum of squares within groups, denoted
by 𝑆𝑆𝑊 . This statistic is also called the sum of squares for the error. 𝑆𝑆𝐵 is
divided by 𝑑. 𝑓. 𝑁. to obtain the between-group variance. 𝑆𝑆𝑊 is divided by 𝑁 − 𝑘
to obtain the within-group or error variance. These two variances are sometimes
called mean squares, denoted by 𝑀𝑆𝐵 and 𝑀𝑆𝑊 . These terms are used to
summarize the analysis of variance and are placed in a summary table, as shown
below.

Analysis of Variance Summary Table

Source            Sum of Squares    d.f.     Mean Square    F
Between           𝑆𝑆𝐵               𝑘 − 1    𝑀𝑆𝐵            𝐹
Within (error)    𝑆𝑆𝑊               𝑁 − 𝑘    𝑀𝑆𝑊
Total

In the table,
𝑆𝑆𝐵 = the sum of squares between groups

𝑆𝑆𝑊 = the sum of squares within groups

𝑘 = number of groups

𝑁 = 𝑛1 + 𝑛2 + . . . +𝑛𝑘 = sum of sample sizes for groups

$$MS_B = \frac{SS_B}{k - 1}$$

$$MS_W = \frac{SS_W}{N - k}$$

$$F = \frac{MS_B}{MS_W}$$

The totals are obtained by adding the corresponding columns. For Example 9.1,
the ANOVA summary table is shown below.

Analysis of Variance Summary Table

Source            Sum of Squares    d.f.    Mean Square    F
Between               160.13          2        80.07       9.17
Within (error)        104.80         12         8.73
Total                 264.93         14
Most computer programs will print out an ANOVA summary table.
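
As a rough idea of what such a program does, the Python sketch below builds the summary table for the blood pressure data from the raw values. The names `anova_summary` and `fmt` are illustrative, not part of any statistics package. The total sum of squares is computed directly from the data, which gives a check that the Between and Within rows add up to the Total row.

```python
def anova_summary(groups):
    """Build the rows of a one-way ANOVA summary table from raw samples."""
    values = [x for g in groups for x in g]
    k, n_total = len(groups), len(values)
    grand_mean = sum(values) / n_total

    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    ss_total = sum((x - grand_mean) ** 2 for x in values)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return {
        "Between": (ss_between, k - 1, ms_between, ms_between / ms_within),
        "Within (error)": (ss_within, n_total - k, ms_within, None),
        "Total": (ss_total, n_total - 1, None, None),
    }


def fmt(value):
    """Blank for empty cells, two decimals for floats, plain text for ints."""
    if value is None:
        return ""
    return f"{value:.2f}" if isinstance(value, float) else str(value)


groups = [[10, 12, 9, 15, 13], [6, 8, 3, 0, 2], [5, 9, 12, 8, 4]]
table = anova_summary(groups)

print(f"{'Source':<16}{'SS':>10}{'d.f.':>6}{'MS':>10}{'F':>8}")
for source, (ss, df, ms, f) in table.items():
    print(f"{source:<16}{fmt(ss):>10}{fmt(df):>6}{fmt(ms):>10}{fmt(f):>8}")
```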

Example 9.2. Employees at Toll Road Interchanges. A state employee
wishes to see if there is a significant difference in the number of
employees at the interchanges of three state toll roads. The data are
shown. At 𝛼 = 0.05, can it be concluded that there is a significant
difference in the average number of employees at each interchange?

            Pennsylvania    Greensburg Bypass/         Beaver Valley
            Turnpike        Mon-Fayette Expressway     Expressway
                 7                  10                       1
                14                   1                      12
                32                   1                       1
                19                   0                       9
                10                  11                       1
                11                   1                      11
Mean        X̄₁ = 15.5       X̄₂ = 4.0                   X̄₃ = 5.8
Variance    s₁² = 81.9      s₂² = 25.6                 s₃² = 29.0

Solution:

Step 1: State the hypotheses and identify the claim.

𝐻0 : 𝜇1 = 𝜇2 = 𝜇3

𝐻1 : At least one mean is different from the others (claim)

Step 2: Find the critical value. Since 𝑘 = 3 and 𝑁 = 18, and 𝛼 = 0.05,

𝑑. 𝑓. 𝑁. = 𝑘 − 1 = 3 − 1 = 2

𝑑. 𝑓. 𝐷. = 𝑁 − 𝑘 = 18 − 3 = 15

The critical value is 3.68.

Step 3: Compute the test value, using the procedure outlined here.

a. Find the mean and variance of each sample (these values are shown
below the data columns in the example).
b. Find the grand mean. The grand mean, denoted by 𝑋̅𝐺𝑀 , is the mean of all
values in the samples.

$$\bar{X}_{GM} = \frac{\sum X}{N} = \frac{7 + 14 + 32 + \cdots + 11}{18} = \frac{152}{18} = 8.44$$

c. Find the between-group variance.

$$s_B^2 = \frac{\sum n_i(\bar{X}_i - \bar{X}_{GM})^2}{k - 1}$$

$$s_B^2 = \frac{6(15.5 - 8.44)^2 + 6(4 - 8.44)^2 + 6(5.8 - 8.44)^2}{3 - 1} = \frac{459.16}{2} = 229.58$$

d. Find the within-group variance.

$$s_W^2 = \frac{\sum (n_i - 1)s_i^2}{\sum (n_i - 1)}$$

$$s_W^2 = \frac{(6 - 1)(81.9) + (6 - 1)(25.6) + (6 - 1)(29.0)}{(6 - 1) + (6 - 1) + (6 - 1)} = \frac{682.50}{15} = 45.50$$

e. Find the 𝐹 test value.

$$F = \frac{s_B^2}{s_W^2} = \frac{229.58}{45.50} = 5.05$$

Step 4: Make the decision. Since 5.05 > 3.68, the decision is to reject the null
hypothesis.

Analysis of Variance Summary Table

Source            Sum of Squares    d.f.    Mean Square    F
Between               459.16          2       229.58       5.05
Within (error)        682.50         15        45.50
Total                1141.66         17

Step 5: Summarize the results. There is enough evidence to support the claim
that there is a difference among the means.
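
The hand calculation in the toll road example works from the sample sizes, means, and variances rather than from the raw data. The Python sketch below follows the same route. The function name is illustrative, and the grand mean is taken here as the weighted mean of the rounded group means, an assumption that differs slightly from the 8.44 obtained from the raw data but leaves the rounded results unchanged.

```python
def anova_from_summary(sizes, means, variances):
    """One-way ANOVA from per-group n, mean, and variance.

    Returns (SS_between, MS_between, SS_within, MS_within, F).
    """
    k = len(sizes)
    n_total = sum(sizes)
    # Assumption: grand mean rebuilt from the (rounded) group means
    grand_mean = sum(n * m for n, m in zip(sizes, means)) / n_total

    ss_between = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, means))
    ss_within = sum((n - 1) * v for n, v in zip(sizes, variances))

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ss_between, ms_between, ss_within, ms_within, ms_between / ms_within


# Summary statistics as printed below the data columns
sizes = [6, 6, 6]
means = [15.5, 4.0, 5.8]
variances = [81.9, 25.6, 29.0]

ss_b, ms_b, ss_w, ms_w, f_value = anova_from_summary(sizes, means, variances)
print(round(ss_b, 2), round(ms_b, 2), round(ss_w, 2), round(ms_w, 2), round(f_value, 2))
```

Working from rounded summary statistics is convenient by hand, but software that starts from the raw observations may report a slightly different 𝐹 value.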

The steps for computing the F test value for the ANOVA are summarized in this
Procedure Table.

When the null hypothesis is rejected in ANOVA, it only means that at least
one mean is different from the others. To locate the difference or differences
among the means, it is necessary to use other tests such as the Tukey or the
Scheffé test.

Exercises: Assume that all variables are normally distributed, that the samples
are independent, and that the population variances are equal. Also, for each
exercise, perform the following steps.

a. State the hypotheses and identify the claim.
b. Find the critical value.
c. Compute the test value.
d. Make the decision.
e. Summarize the results, and explain where the differences in the means
are.

1. Weight Gain of Athletes. A researcher wishes to see whether there is any
difference in the weight gains of athletes following one of three special diets.
Athletes are randomly assigned to three groups and placed on the diet for
6 weeks. The weight gains (in pounds) are shown here. At 𝛼 = 0.05, can the
researcher conclude that there is a difference in the diets?

Diet A Diet B Diet C


𝟑 10 8
𝟔 12 3
𝟕 11 2
𝟒 14 5
8
6

2. Hybrid Vehicles. A study was done before the recent surge in gasoline
prices to compare the cost to drive 25 miles for different types of hybrid
vehicles. The cost of a gallon of gas at the time of the study was
approximately $2.50. Based on the information given below for different
models of hybrid cars, trucks, and SUVs, is there sufficient evidence to
conclude a difference in the mean cost to drive 25 miles? Use 𝛼 = 0.05.

Hybrid cars Hybrid SUVs Hybrid trucks


𝟐. 𝟏𝟎 2.10 3.62
𝟐. 𝟕𝟎 2.42 3.43
𝟏. 𝟔𝟕 2.25
𝟏. 𝟔𝟕 2.10
𝟏. 𝟑𝟎 2.25

3. Expenditures per Pupil. The per-pupil costs (in thousands of dollars) for
cyber charter school tuition for school districts in three areas of southwestern
Pennsylvania are shown. At 𝛼 = 0.05, is there a difference in the means? If
so, give a possible reason for the difference.

4. Post-Secondary School Enrollments. A random sample of enrollments
from public institutions of higher learning (with enrollments under 10,000)
is shown. At 𝛼 = 0.10, test the claim that the mean enrollments are the
same in all parts of the country.

5. Ocean Water Temperatures. The National Oceanographic Data Center
lists water temperatures in degrees Fahrenheit for beaches all around the
country. Below are listed selected beach temperatures from the month of
February for various coastal areas of the United States. At the 0.05 level
of significance, is there sufficient evidence to conclude a difference in
mean temperatures?
