Analysis of Variance
Hope Sabao (PhD)
University of Lusaka
F Distribution
• To read the F distribution table, we need to know three
quantities: the degrees of freedom for the numerator, the
degrees of freedom for the denominator, and an area in the
right tail of an F distribution curve.
• Note that the F distribution table is read only for an area in
the right tail of the F distribution curve.
• Also note that the F distribution table has four parts. These
four parts give the F values for areas of .01, .025, .05, and
.10, respectively, in the right tail of the F distribution curve.
Tables for other right-tail areas can be constructed in the
same way.
Example
Find the F value for 8 degrees of freedom for the numerator, 14
degrees of freedom for the denominator, and .05 area in the right
tail of the F distribution curve.
Solution
• To find the required value of F , we use the portion of the F
distribution table that corresponds to .05 area in the right tail
of the F distribution curve.
• To find the required F value, we locate 8 in the column for
degrees of freedom for the numerator and 14 in the row
for degrees of freedom for the denominator.
• The entry where the column for 8 and the row for 14 intersect
gives the required F value. This value of F is 2.70.
• The F value obtained from this table for a test of hypothesis
is called the critical value of F.
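The table lookup can be cross-checked numerically. The following is a minimal sketch using only the Python standard library. The function names f_right_tail and f_critical are hypothetical, the upper search bound of 1000 is an arbitrary assumption, and the method (Simpson's rule on the incomplete beta integral, followed by bisection) assumes both degrees of freedom are at least 2.

```python
import math

def f_right_tail(x, df_num, df_den, steps=2000):
    """Area to the right of x under the F curve (assumes both df >= 2).

    Uses P(F > x) = I_z(df_den/2, df_num/2) with
    z = df_den / (df_den + df_num * x), where I_z is the regularized
    incomplete beta function, integrated here with Simpson's rule.
    """
    a, b = df_den / 2.0, df_num / 2.0
    z = df_den / (df_den + df_num * x)
    h = z / steps  # steps must be even for Simpson's rule

    def g(t):
        return t ** (a - 1) * (1 - t) ** (b - 1)

    total = g(0.0) + g(z)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(i * h)
    integral = total * h / 3
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return integral / math.exp(log_beta)

def f_critical(alpha, df_num, df_den, upper=1000.0):
    """F value with right-tail area alpha, found by bisection."""
    lo, hi = 0.0, upper  # 'upper' is an assumed search bound
    for _ in range(60):
        mid = (lo + hi) / 2
        if f_right_tail(mid, df_num, df_den) > alpha:
            lo = mid  # tail area still too large: move right
        else:
            hi = mid
    return (lo + hi) / 2

# Example from the text: 8 and 14 degrees of freedom, .05 in the right tail.
print(round(f_critical(0.05, 8, 14), 2))
```

The printed value should agree with the table entry of 2.70 after rounding to two decimals.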
Obtaining the F Value From the F distribution table
One-Way Analysis of Variance
Definition
ANOVA is a procedure that is used to test the null hypothesis that
the means of three or more populations are all equal.
Calculating the Value of the Test Statistic
The value of the test statistic F for a test of hypothesis using
ANOVA is given by the ratio of two variances, the variance
between samples (MSB) and the variance within samples (MSW).
Example
Fifteen fourth-grade students were randomly assigned to three
groups to experiment with three different methods of teaching
arithmetic. At the end of the semester, the same test was given to
all 15 students. The following table gives the scores of students in
the three groups.
Calculate the value of the test statistic F. Assume that all the
required assumptions of one-way analysis of variance hold true.
Solution
The within-samples sum of squares, denoted by SSW, is calculated
as
\[ SSW = \sum x^2 - \left( \frac{T_1^2}{n_1} + \frac{T_2^2}{n_2} + \frac{T_3^2}{n_3} + \cdots \right) \]
The following table lists the scores of the 15 students who were
taught arithmetic by each of the three methods, together with the
values of T1, T2, and T3 and of n1, n2, and n3.
• In the table above, T1 is obtained by adding the five scores of
the first sample. Thus T1 = 48 + 73 + 51 + 65 + 87 = 324.
• Similarly, the sums of the values in the second and third
samples give T2 = 369, and T3 = 388 respectively.
• Because there are five observations in each sample,
n1 = n2 = n3 = 5.
• The values of Σx and n are, respectively,
\[ \sum x = T_1 + T_2 + T_3 = 324 + 369 + 388 = 1081 \]
\[ n = n_1 + n_2 + n_3 = 5 + 5 + 5 = 15 \]
• To calculate Σx², we square all the scores included in all
three samples and then add them. Thus
\[ \sum x^2 = (48)^2 + (73)^2 + (51)^2 + (65)^2 + (87)^2 + (55)^2 + (85)^2 + (70)^2 + \cdots \]
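The totals above are easy to re-check in a few lines. This is only a sketch: the variable names are illustrative, the first sample's scores are the five listed in the text, and T2 and T3 are taken as reported rather than recomputed, since the full score table is not reproduced here.

```python
# Scores of the first sample, as listed in the text.
sample_1 = [48, 73, 51, 65, 87]
t1 = sum(sample_1)

# T2 and T3 are used as reported in the text (assumption: not recomputed).
t2, t3 = 369, 388
n1 = n2 = n3 = 5

sum_x = t1 + t2 + t3   # grand total of all scores
n = n1 + n2 + n3       # total number of observations

print(t1, sum_x, n)
```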
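The variance between samples (MSB) and the variance within
samples (MSW) are calculated using the following formulas, where
SSB is the between-samples sum of squares, k is the number of
samples, and n is the total number of observations.
\[ MSB = \frac{SSB}{k - 1} \qquad MSW = \frac{SSW}{n - k} \]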
The value of the test statistic F is given by the ratio of MSB to
MSW. Therefore,
\[ F = \frac{MSB}{MSW} = \frac{216.0667}{197.7333} = 1.09 \]
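The whole calculation can be reproduced from the summary quantities alone. The sketch below makes two assumptions that should be read as such: the between-samples sum of squares uses the standard shortcut formula SSB = (T1²/n1 + T2²/n2 + T3²/n3) − (Σx)²/n, and the value Σx² = 80709 is not stated in the text but is the value implied by the reported MSW of 197.7333. The function name is hypothetical.

```python
def one_way_anova_from_totals(totals, sizes, sum_sq):
    """Return (MSB, MSW, F) from sample totals, sample sizes, and sum of x^2."""
    k = len(totals)            # number of samples (treatments)
    n = sum(sizes)             # total number of observations
    grand_total = sum(totals)  # sum of all observations

    # Sum of T_i^2 / n_i, which appears in both SSB and SSW.
    between_term = sum(t * t / m for t, m in zip(totals, sizes))

    ssb = between_term - grand_total ** 2 / n  # assumed shortcut formula
    ssw = sum_sq - between_term                # SSW formula from the text

    msb = ssb / (k - 1)
    msw = ssw / (n - k)
    return msb, msw, msb / msw

# T1, T2, T3 and the sample sizes are from the text.
# sum_sq = 80709 is an assumption implied by the reported MSW.
msb, msw, f_value = one_way_anova_from_totals([324, 369, 388], [5, 5, 5], 80709)
print(round(msb, 4), round(msw, 4), round(f_value, 2))
```

Working from totals rather than raw scores mirrors the hand calculation, so each intermediate quantity can be compared with the values reported above.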
Substituting the values of the various quantities into our previous
table, we write the ANOVA table for our example as follows:
One-Way ANOVA Test
Example
Reconsider our previous example about the scores of 15
fourth-grade students who were randomly assigned to three groups
in order to experiment with three different methods of teaching
arithmetic. At a 1% significance level, can we reject the null
hypothesis that the mean arithmetic score of all fourth-grade
students taught by each of these three methods is the same?
Assume that all the assumptions required to apply the one-way
ANOVA procedure hold true.
Solution
To make a test about the equality of the means of three
populations, we follow our standard procedure with five steps.
Step 1. State the null and alternative hypotheses.
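Let µ1, µ2, and µ3 be the mean arithmetic scores of all fourth-grade
students who are taught, respectively, by Methods I, II, and III.
The null and alternative hypotheses are
H0: µ1 = µ2 = µ3 (the mean scores for the three methods are all equal)
H1: Not all three means are equal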
Step 3. Determine the rejection and nonrejection regions.
• The significance level is .01. Because a one-way ANOVA test
is always right-tailed, the area in the right tail of the F
distribution curve is 0.01, which is the rejection region in the
following figure.
• Next we need to know the degrees of freedom for the
numerator and the denominator. In our example, the students
were assigned to three different methods. As mentioned
earlier, these methods are called treatments. The number of
treatments is denoted by k, and the total number of
observations in all samples taken together is denoted by n.
The number of degrees of freedom for the numerator is k − 1,
and the number of degrees of freedom for the denominator is
n − k. In our example, there are 3 treatments (methods of
teaching) and 15 total observations (total number of
students) in all 3 samples. Thus,
Degrees of freedom for the numerator = k − 1 = 3 − 1 = 2
Degrees of freedom for the denominator = n − k = 15 − 3 = 12
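With these degrees of freedom, the critical value can also be obtained without a table. The sketch below relies on a special case: when the numerator has exactly 2 degrees of freedom, the right-tail area of the F distribution has the closed form P(F > x) = (1 + 2x/d)^(−d/2), where d is the denominator degrees of freedom. The function name is hypothetical, and the formula does not apply to other numerator degrees of freedom.

```python
def f_critical_df_num_2(alpha, df_den):
    """Critical F value for numerator df = 2 (closed form, this case only).

    Solves (1 + 2x/df_den) ** (-df_den / 2) = alpha for x.
    """
    return (df_den / 2.0) * (alpha ** (-2.0 / df_den) - 1.0)

k, n = 3, 15        # treatments and total observations, from the text
df_num = k - 1      # degrees of freedom for the numerator
df_den = n - k      # degrees of freedom for the denominator

critical = f_critical_df_num_2(0.01, df_den)
print(df_num, df_den, round(critical, 2))
```

The result should agree with the .01 table entry for 2 and 12 degrees of freedom, which is about 6.93.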
Step 4. Calculate the value of the test statistic.
We computed the value of the test statistic F for these data in our
previous example. This value is
F = 1.09