
Analysis of Variance

This document discusses analysis of variance (ANOVA) including F-distributions, assumptions of one-way ANOVA, and calculating test statistics. It provides examples of calculating values for degrees of freedom, sums of squares, variances between and within samples, and the test statistic F. The document contains detailed information and calculations related to performing one-way ANOVA.

Analysis of Variance

Hope Sabao(PhD)

University of Lusaka

28th April, 2021

F Distribution

• The shape of a particular F distribution curve depends on the
number of degrees of freedom.
• The F distribution has two numbers of degrees of freedom:
degrees of freedom for the numerator and degrees of freedom
for the denominator.
• These two numbers representing two types of degrees of
freedom are the parameters of the F distribution.
• Each combination of degrees of freedom for the numerator
and for the denominator gives a different F distribution curve.
• The units of an F distribution are denoted by F, which
assumes only nonnegative values.
• The F distribution is a continuous distribution.
• The shape of an F distribution curve is skewed to the right,
but the skewness decreases as the number of degrees of
freedom increases.

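For an F distribution, degrees of freedom for the numerator and
degrees of freedom for the denominator are usually written as
follows:

df = (numerator df, denominator df)

For example, df = (8, 14) denotes 8 degrees of freedom for the
numerator and 14 degrees of freedom for the denominator.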
• The following figure shows three F distribution curves for
three sets of degrees of freedom for the numerator and for the
denominator.
• In the figure, the first number gives the degrees of freedom
associated with the numerator, and the second number gives
the degrees of freedom associated with the denominator.
• We can observe from this figure that as the degrees of
freedom increase, the peak of the curve moves to the right;
that is, the skewness decreases.
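
The rightward shift of the peak can be checked numerically. The sketch below relies on the standard closed-form expression for the mode of an F distribution, (d1 − 2)/d1 · d2/(d2 + 2), which holds only when the numerator df d1 exceeds 2. The helper name `f_mode` and the three df pairs are illustrative assumptions, not the pairs plotted in the figure.

```python
def f_mode(df_num: float, df_den: float) -> float:
    """Location of the peak (mode) of an F distribution; requires df_num > 2."""
    if df_num <= 2:
        raise ValueError("the closed-form mode needs numerator df > 2")
    return (df_num - 2) / df_num * df_den / (df_den + 2)


# Illustrative (numerator df, denominator df) pairs, not taken from the figure.
pairs = [(4, 6), (8, 14), (20, 40)]
modes = [f_mode(d1, d2) for d1, d2 in pairs]

for (d1, d2), mode in zip(pairs, modes):
    print(f"df = ({d1}, {d2}): peak at F = {mode:.3f}")
```

For these pairs the peak moves from 0.375 to about 0.656 and then to about 0.857, approaching 1 as both degrees of freedom grow.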

• To read the F distribution table, we need to know three
quantities: the degrees of freedom for the numerator, the
degrees of freedom for the denominator, and an area in the
right tail of an F distribution curve.
• Note that the F distribution table is read only for an area in
the right tail of the F distribution curve.
• Also note that the F distribution table has four parts. These
four parts give the F values for areas of .01, .025, .05, and
.10, respectively, in the right tail of the F distribution curve.
We can make the F distribution table for other values in the
right tail.
Example
Find the F value for 8 degrees of freedom for the numerator, 14
degrees of freedom for the denominator, and .05 area in the right
tail of the F distribution curve.

Solution
• To find the required value of F , we use the portion of the F
distribution table that corresponds to .05 area in the right tail
of the F distribution curve.
• To find the required F value, we locate 8 in the top row, which
lists the degrees of freedom for the numerator, and 14 in the
first column, which lists the degrees of freedom for the
denominator.
• The entry where the column for 8 and the row for 14 intersect
gives the required F value. This value of F is 2.70.
• The F value obtained from this table for a test of hypothesis
is called the critical value of F.
Obtaining the F Value From the F distribution table
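
The table lookup can be cross-checked numerically. The following is a minimal sketch using only the Python standard library: it integrates the F density with Simpson's rule and finds the critical value by bisection. The function names `f_pdf`, `f_cdf`, and `f_critical` are hypothetical helpers, and the density is treated as zero at x = 0, which is only valid when the numerator df exceeds 2 (as it does here).

```python
import math


def f_pdf(x: float, d1: float, d2: float) -> float:
    """Density of the F distribution with (d1, d2) degrees of freedom.

    Assumes d1 > 2, so that the density is 0 at x = 0.
    """
    if x <= 0:
        return 0.0
    log_const = (math.lgamma((d1 + d2) / 2) - math.lgamma(d1 / 2)
                 - math.lgamma(d2 / 2) + (d1 / 2) * math.log(d1 / d2))
    return math.exp(log_const + (d1 / 2 - 1) * math.log(x)
                    - ((d1 + d2) / 2) * math.log(1 + d1 * x / d2))


def f_cdf(x: float, d1: float, d2: float, steps: int = 2000) -> float:
    """P(F <= x) by composite Simpson's rule on [0, x]; steps must be even."""
    h = x / steps
    total = f_pdf(0.0, d1, d2) + f_pdf(x, d1, d2)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f_pdf(i * h, d1, d2)
    return total * h / 3


def f_critical(alpha: float, d1: float, d2: float) -> float:
    """F value whose right-tail area equals alpha, found by bisection."""
    lo, hi = 0.0, 1.0
    while 1 - f_cdf(hi, d1, d2) > alpha:  # widen until the tail area drops below alpha
        hi *= 2
    for _ in range(40):
        mid = (lo + hi) / 2
        if 1 - f_cdf(mid, d1, d2) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


critical = f_critical(alpha=0.05, d1=8, d2=14)
print(f"Critical F for df = (8, 14), alpha = 0.05: {critical:.2f}")
```

The result agrees with the tabulated value of 2.70 to two decimal places. In practice a statistics library would replace the hand-written integration.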

One-Way Analysis of Variance

Definition
ANOVA is a procedure that is used to test the null hypothesis that
the means of three or more populations are all equal.

Assumptions of One-Way ANOVA


The following assumptions must hold true to use one-way ANOVA.
1. The populations from which the samples are drawn are
(approximately) normally distributed.
2. The populations from which the samples are drawn have the
same variance (or standard deviation).
3. The samples drawn from different populations are random and
independent.
• The ANOVA test is applied by calculating two estimates of
the variance, σ², of population distributions: the variance
between samples and the variance within samples. The
variance between samples is also called the mean square
between samples or MSB. The variance within samples is also
called the mean square within samples or MSW.
• The variance between samples, MSB, gives an estimate of σ²
based on the variation among the means of samples taken
from different populations.
• The variance within samples, MSW, gives an estimate of σ²
based on the variation within the data of different samples.
• The one-way ANOVA test is always right-tailed with the
rejection region in the right tail of the F distribution curve.

Calculating the Value of the Test Statistic
The value of the test statistic F for a test of hypothesis using
ANOVA is given by the ratio of two variances, the variance
between samples (MSB) and the variance within samples (MSW).

The test statistic F for a one-way ANOVA test

The value of the test statistic F for an ANOVA test is calculated as

$$F = \frac{\text{Variance between samples}}{\text{Variance within samples}} \quad \text{or} \quad F = \frac{\text{MSB}}{\text{MSW}}$$

Example
Fifteen fourth-grade students were randomly assigned to three
groups to experiment with three different methods of teaching
arithmetic. At the end of the semester, the same test was given to
all 15 students. The following table gives the scores of students in
the three groups.

Method I    Method II    Method III
   48          55           84
   73          85           68
   51          70           95
   65          69           74
   87          90           67

Calculate the value of the test statistic F. Assume that all the
required assumptions of one-way analysis of variance hold true.

Solution

In ANOVA terminology, the three methods used to teach
arithmetic are called treatments. The table contains data on the
scores of fourth-graders included in the three samples. Each
sample of students is taught by a different method. Let

$x$ = the score of a student
$k$ = the number of different samples
$n_i$ = the size of sample $i$
$T_i$ = the sum of the values in sample $i$
$n$ = the number of values in all samples $= n_1 + n_2 + n_3 + \cdots$
$\sum x$ = the sum of the values in all samples $= T_1 + T_2 + T_3 + \cdots$
$\sum x^2$ = the sum of the squares of the values in all samples

To calculate MSB and MSW, we first compute the
between-samples sum of squares, denoted by SSB, and the
within-samples sum of squares, denoted by SSW. The sum of SSB
and SSW is called the total sum of squares and is denoted by SST;
that is,
SST = SSB + SSW
The values of SSB and SSW are calculated using the following
formulas.

Between- and Within-Samples Sums of Squares


The between-samples sum of squares, denoted by SSB, is
calculated as

$$\text{SSB} = \left( \frac{T_1^2}{n_1} + \frac{T_2^2}{n_2} + \frac{T_3^2}{n_3} + \cdots \right) - \frac{\left( \sum x \right)^2}{n}$$

The within-samples sum of squares, denoted by SSW, is calculated
as

$$\text{SSW} = \sum x^2 - \left( \frac{T_1^2}{n_1} + \frac{T_2^2}{n_2} + \frac{T_3^2}{n_3} + \cdots \right)$$

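The following table lists the scores of 15 students who were taught
arithmetic by each of the three different methods, the values of T1,
T2, and T3, and the values of n1, n2, and n3.

Method I    Method II    Method III
   48          55           84
   73          85           68
   51          70           95
   65          69           74
   87          90           67
T1 = 324    T2 = 369     T3 = 388
n1 = 5      n2 = 5       n3 = 5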

• In the table above, $T_1$ is obtained by adding the five scores of
the first sample. Thus, $T_1 = 48 + 73 + 51 + 65 + 87 = 324$.
• Similarly, the sums of the values in the second and third
samples give $T_2 = 369$ and $T_3 = 388$, respectively.
• Because there are five observations in each sample,
$n_1 = n_2 = n_3 = 5$.
• The values of $\sum x$ and $n$ are, respectively,

$$\sum x = T_1 + T_2 + T_3 = 324 + 369 + 388 = 1081$$

$$n = n_1 + n_2 + n_3 = 5 + 5 + 5 = 15$$

• To calculate $\sum x^2$, we square all the scores included in all
three samples and then add them. Thus,

$$\begin{aligned}
\sum x^2 &= (48)^2 + (73)^2 + (51)^2 + (65)^2 + (87)^2 + (55)^2 + (85)^2 + (70)^2 \\
&\quad + (69)^2 + (90)^2 + (84)^2 + (68)^2 + (95)^2 + (74)^2 + (67)^2 = 80{,}709
\end{aligned}$$



Substituting all the values in the formulas for SSB and SSW, we
obtain the following values of SSB and SSW:

$$\text{SSB} = \left( \frac{(324)^2}{5} + \frac{(369)^2}{5} + \frac{(388)^2}{5} \right) - \frac{(1081)^2}{15} = 432.1333$$

$$\text{SSW} = 80{,}709 - \left( \frac{(324)^2}{5} + \frac{(369)^2}{5} + \frac{(388)^2}{5} \right) = 2372.8000$$

The value of SST is obtained by adding the values of SSB and
SSW. Thus,

$$\text{SST} = 432.1333 + 2372.8000 = 2804.9333$$
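
The sums of squares can be reproduced in a few lines of Python. This is a minimal sketch, not part of the original slides: the scores of sample 1 are listed explicitly in the text, while the grouping of the remaining ten scores into samples 2 and 3 is an assumption inferred from their order in the sum of squares and checked against T2 = 369 and T3 = 388.

```python
# Scores grouped by teaching method. Sample 1 is listed in the text; the
# split of the other ten scores into samples 2 and 3 is an assumption that
# matches the stated totals T2 = 369 and T3 = 388.
samples = [
    [48, 73, 51, 65, 87],
    [55, 85, 70, 69, 90],
    [84, 68, 95, 74, 67],
]

totals = [sum(s) for s in samples]               # T_i
sizes = [len(s) for s in samples]                # n_i
n = sum(sizes)                                   # total number of values
sum_x = sum(totals)                              # sum of all values
sum_x2 = sum(x * x for s in samples for x in s)  # sum of squared values

# Shared term: T_1^2/n_1 + T_2^2/n_2 + T_3^2/n_3
between_term = sum(t * t / m for t, m in zip(totals, sizes))

ssb = between_term - sum_x ** 2 / n   # between-samples sum of squares
ssw = sum_x2 - between_term           # within-samples sum of squares
sst = ssb + ssw                       # total sum of squares

print(f"SSB = {ssb:.4f}, SSW = {ssw:.4f}, SST = {sst:.4f}")
```

As a consistency check, SST computed this way equals Σx² − (Σx)²/n, because the shared term cancels when SSB and SSW are added.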

The variance between samples (MSB) and the variance within
samples (MSW) are calculated using the following formulas.

Calculating the Values of MSB and MSW


MSB and MSW are calculated as, respectively,

$$\text{MSB} = \frac{\text{SSB}}{k - 1} \quad \text{and} \quad \text{MSW} = \frac{\text{SSW}}{n - k}$$

where k − 1 and n − k are, respectively, the df for the numerator
and the df for the denominator for the F distribution. Remember,
k is the number of different samples.

Consequently, the variance between samples is

$$\text{MSB} = \frac{\text{SSB}}{k - 1} = \frac{432.1333}{3 - 1} = 216.0667$$

The variance within samples is

$$\text{MSW} = \frac{\text{SSW}}{n - k} = \frac{2372.8000}{15 - 3} = 197.7333$$

The value of the test statistic F is given by the ratio of MSB and
MSW. Therefore,

$$F = \frac{\text{MSB}}{\text{MSW}} = \frac{216.0667}{197.7333} = 1.09$$
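
The same arithmetic can be sketched as a small function. The name `f_statistic` is a hypothetical helper; the inputs are the SSB and SSW values computed in this example.

```python
def f_statistic(ssb: float, ssw: float, k: int, n: int) -> tuple[float, float, float]:
    """Return (MSB, MSW, F) for k samples containing n values in total."""
    df_num = k - 1         # degrees of freedom for the numerator
    df_den = n - k         # degrees of freedom for the denominator
    msb = ssb / df_num     # variance between samples
    msw = ssw / df_den     # variance within samples
    return msb, msw, msb / msw


# SSB and SSW as computed in the example: 3 samples, 15 scores in total.
msb, msw, f_value = f_statistic(ssb=432.1333, ssw=2372.8, k=3, n=15)
print(f"MSB = {msb:.4f}, MSW = {msw:.4f}, F = {f_value:.2f}")
```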

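For convenience, all these calculations are often recorded in a table
called the ANOVA table. The following table gives the general
form of an ANOVA table.

Source of    Degrees of    Sum of     Mean      Value of the
Variation    Freedom       Squares    Square    Test Statistic
Between      k − 1         SSB        MSB       F = MSB/MSW
Within       n − k         SSW        MSW
Total        n − 1         SST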

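Substituting the values of the various quantities into our previous
table, we write the ANOVA table for our example as follows:

Source of    Degrees of    Sum of       Mean        Value of the
Variation    Freedom       Squares      Square      Test Statistic
Between      2             432.1333     216.0667    F = 216.0667/197.7333 = 1.09
Within       12            2372.8000    197.7333
Total        14            2804.9333
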
One-Way ANOVA Test

• Now suppose we want to test the null hypothesis that the
mean scores are equal for all three groups of fourth-graders
taught by the three different methods of our previous example
against the alternative hypothesis that the mean scores of the
three groups are not all equal.
• Note that in a one-way ANOVA test, the null hypothesis is
that the means for all populations are equal.
• The alternative hypothesis is that not all population means
are equal.
• In other words, the alternative hypothesis states that at least
one of the population means is different from the others.
• The following example demonstrates how we use the one-way
ANOVA procedure to make such a test.

Example
Reconsider our previous example about the scores of 15
fourth-grade students who were randomly assigned to three groups
in order to experiment with three different methods of teaching
arithmetic. At a 1% significance level, can we reject the null
hypothesis that the mean arithmetic score of all fourth-grade
students taught by each of these three methods is the same?
Assume that all the assumptions required to apply the one-way
ANOVA procedure hold true.

Solution
To make a test about the equality of the means of three
populations, we follow our standard procedure with five steps.

Step 1. State the null and alternative hypotheses.
Let µ1, µ2, and µ3 be the mean arithmetic scores of all fourth-grade
students who are taught, respectively, by Methods I, II, and III.
The null and alternative hypotheses are

H0: µ1 = µ2 = µ3 (The mean scores of the three groups are all equal.)
H1: Not all three means are equal.

Note that the alternative hypothesis states that at least one
population mean is different from the other two.

Step 2. Select the distribution to use.
Because we are comparing the means for three normally distributed
populations and all of the assumptions required to apply the ANOVA
procedure are satisfied, we use the F distribution to make this test.

Step 3. Determine the rejection and nonrejection regions.
• The significance level is 0.01. Because a one-way ANOVA test
is always right-tailed, the area in the right tail of the F
distribution curve is 0.01, which is the rejection region in the
following figure.
• Next we need to know the degrees of freedom for the
numerator and the denominator. In our example, the students
were assigned to three different methods. As mentioned
earlier, these methods are called treatments. The number of
treatments is denoted by k. The total number of observations
in all samples taken together is denoted by n. Then, the
number of degrees of freedom for the numerator is equal to
k − 1 and the number of degrees of freedom for the
denominator is equal to n − k. In our example, there are 3
treatments (methods of teaching) and 15 total observations
(total number of students) in all 3 samples. Thus,
Degrees of freedom for the numerator = k − 1 = 3 − 1 = 2
Degrees of freedom for the denominator = n − k = 15 − 3 = 12

From the F distribution table, we find the critical value of F for 2
df for the numerator, 12 df for the denominator, and 0.01 area in
the right tail of the F distribution curve. This value of F is 6.93,
as shown in the following figure.

Thus, we will fail to reject H0 if the calculated value of the test
statistic F is less than 6.93, and we will reject H0 if it is 6.93 or
larger.
Critical value of F for df = (2, 12) and α = 0.01
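
This critical value can be verified without a table. When the numerator has exactly 2 degrees of freedom, the right-tail area of the F distribution has the closed form P(F > x) = (1 + 2x/d2)^(−d2/2), where d2 is the denominator df. The sketch below solves that expression for x; it applies only to this special case, and `f_critical_df2` is a hypothetical helper name.

```python
def f_critical_df2(alpha: float, df_den: float) -> float:
    """Critical F for numerator df = 2.

    Solves (1 + 2x/df_den) ** (-df_den / 2) = alpha for x.
    """
    return (df_den / 2) * (alpha ** (-2 / df_den) - 1)


k, n, alpha = 3, 15, 0.01
df_num, df_den = k - 1, n - k   # (2, 12) in this example
assert df_num == 2, "the closed form only covers numerator df = 2"

critical = f_critical_df2(alpha, df_den)
f_value = 1.09                  # test statistic from the earlier example
reject_h0 = f_value >= critical

print(f"critical F = {critical:.2f}, reject H0: {reject_h0}")
```

The computed value rounds to 6.93, matching the table, and the decision rule flags F = 1.09 as falling in the nonrejection region.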

Step 4. Calculate the value of the test statistic.
We computed the value of the test statistic F for these data in our
previous example. This value is

F = 1.09

Step 5. Make a decision.


Because the value of the test statistic F = 1.09 is less than the
critical value of F = 6.93, it falls in the nonrejection region. Hence,
we fail to reject the null hypothesis: the data do not provide
sufficient evidence that the means of the three populations differ.
In other words, the three different methods of teaching arithmetic
do not seem to affect the mean scores of students. The difference
in the three mean scores in the case of our three samples can be
attributed to sampling error.
