BST32202:
LINEAR
REGRESSION
5. Multiple
Comparisons
Lindizgani K. Ndovie,
2025
OUTLINE
● One way and two way ANOVA:
  ○ analysis of variance equation,
  ○ F-statistic,
  ○ multiple comparison procedures.
● Simple and multiple linear regression:
  ○ assumptions,
  ○ least squares estimation:
    ■ derivation of parameter estimates,
    ■ properties of least squares estimates,
  ○ correlation and regression,
  ○ inference of parameters
    ■ (test of significance of coefficients),
  ○ inference of predicted values using the model;
  ○ model adequacy:
    ■ normality assumption,
    ■ constant variance assumption,
    ■ collinearity,
  ○ influential observations:
    ■ leverage,
    ■ outliers;
  ○ polynomial regression;
  ○ transformation techniques.
● Relationship between ANOVA and the linear model:
  ○ one way ANOVA model,
  ○ two way ANOVA model
● Use of SPSS, STATA, R to perform ANOVA and linear regression.
INTRODUCTION
• The analysis of variance procedures that we have done so far showed whether
differences among several means are significant.
• However, if we reject the null hypothesis and accept the stated alternative—that the
means are not all equal—we still do not know which of the population means are
equal and which are different.
• Often it is of interest to make several (perhaps all possible) paired comparisons among
the treatments.
• Actually, a paired comparison may be viewed as a simple contrast, namely, a test of
• H0: µi − µj = 0,
• H1: µi − µj ≠ 0 for all i ≠ j.
• A contrast is a specific comparison between group means (e.g., comparing the first
and second groups).
• Making all possible paired comparisons among the means can be very beneficial when
particular complex contrasts are not known in advance.
INTRODUCTION
• Example 1: A researcher wishes to try three different techniques to lower the blood
pressure of individuals diagnosed with high blood pressure.
• The subjects are randomly assigned to three groups; the first group takes medication,
the second group exercises, and the third group follows a special diet.
• After four weeks, the reduction in each person’s blood pressure is recorded.
• At 𝛼 =0.05, test the claim that there is no difference among the means.
• Given k = 3 and N = 15, d.f.N. = k − 1 = 2 and d.f.D. = N − k = 12.
• The critical value for the analysis of variance is 3.89, obtained from the F table with 𝛼 = 0.05.
• The within-group variance is calculated to be $s_W^2 = 8.73$.
INTRODUCTION
• The model for this situation may be set up as follows.
• There are 5 observations taken from each of 3 populations with means μ1, μ2, μ3, respectively.
• We may wish to test
• H0: μ1 = μ2 = μ3 ,
• H1: At least two of the means are not equal
• Now suppose that we wish to test the following sets of hypotheses
• H0 : µ1 − µ2 = 0,
• H1 : µ1 − µ2 ≠ 0.
• H0 : µ1 − µ3 = 0,
• H1 : µ1 − µ3 ≠ 0.
• H0 : µ2 − µ3 = 0,
• H1 : µ2 − µ3 ≠ 0.
• The test is developed through use of an F, t, or confidence interval approach.
FISHER’S LEAST SIGNIFICANT DIFFERENCE METHOD
• One of the earliest strategies for comparing multiple groups is the so-called least
significant difference (LSD) method due to Sir Ronald Fisher.
• We assume equal variances, and use the data from all groups to estimate the assumed common variance when any two groups are compared with Student’s t.
• Under normality and homoscedasticity, this has the advantage of increasing the
degrees of freedom, which in turn can mean more power.
• Formula for LSD
$$T = \frac{\bar{X}_i - \bar{X}_j}{\sqrt{s_W^2\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}}$$
• where
• $\bar{X}_i$ and $\bar{X}_j$ are the means of the samples being compared,
• $n_i$ and $n_j$ are the respective sample sizes, and
• $s_W^2$ is the within-group variance.
FISHER’S LEAST SIGNIFICANT DIFFERENCE METHOD
• When the assumptions of normality and homoscedasticity are met, T has a Student’s t-
distribution with ν = N −k degrees of freedom, where k is the number of groups being
compared and N is the total number of observations in all k groups.
• So when comparing the ith group to the jth group, you reject the hypothesis of equal means if $|T| \ge t_{\alpha/2}(N - k)$.
FISHER’S LEAST SIGNIFICANT DIFFERENCE METHOD
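For $\bar{X}_1$ versus $\bar{X}_2$
$$T = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{s_W^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{11.8 - 3.8}{\sqrt{8.73\left(\frac{1}{5} + \frac{1}{5}\right)}} = \frac{8}{1.868689} = 4.28$$
For $\bar{X}_2$ versus $\bar{X}_3$
$$T = \frac{\bar{X}_2 - \bar{X}_3}{\sqrt{s_W^2\left(\frac{1}{n_2} + \frac{1}{n_3}\right)}} = \frac{3.8 - 7.6}{\sqrt{8.73\left(\frac{1}{5} + \frac{1}{5}\right)}} = \frac{-3.8}{1.868689} = -2.03$$
For $\bar{X}_1$ versus $\bar{X}_3$
$$T = \frac{\bar{X}_1 - \bar{X}_3}{\sqrt{s_W^2\left(\frac{1}{n_1} + \frac{1}{n_3}\right)}} = \frac{11.8 - 7.6}{\sqrt{8.73\left(\frac{1}{5} + \frac{1}{5}\right)}} = \frac{4.2}{1.868689} = 2.25$$
FISHER’S LSD METHOD
• We will reject the hypothesis of equal means if $|T| \ge t_{\alpha/2}(N - k)$.
• Our critical value will be $t_{0.025}(12) = \pm 2.179$.
• For $\bar{X}_1$ versus $\bar{X}_2$, T = 4.28 > 2.179, we reject H0.
• For $\bar{X}_2$ versus $\bar{X}_3$, −2.179 < T = −2.03 < 2.179, we fail to reject H0.
• For $\bar{X}_1$ versus $\bar{X}_3$, T = 2.25 > 2.179, we reject H0.
• Since the T values for $\bar{X}_1$ versus $\bar{X}_2$ and for $\bar{X}_1$ versus $\bar{X}_3$ exceed the critical value, 2.179, the significant differences are between medication and exercise and between medication and diet. Exercise and diet do not differ significantly.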
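The LSD calculations for Example 1 can be sketched in a few lines of Python. This is a minimal illustration, not a library routine: the group means, sample size, within-group variance, and tabulated critical value 2.179 are taken from the example, while the function name `lsd_statistic` and the dictionary layout are assumptions made here. Note that the medication versus diet difference is 11.8 − 7.6 = 4.2, which gives T ≈ 2.25.

```python
import math
from itertools import combinations

# Values from Example 1; the dictionary layout is an illustrative choice.
means = {"medication": 11.8, "exercise": 3.8, "diet": 7.6}
n = 5            # observations per group
s2_w = 8.73      # within-group variance
t_crit = 2.179   # t_{0.025}(12), read from a t table


def lsd_statistic(mean_i, mean_j, n_i, n_j, s2_within):
    """Fisher's LSD statistic T for one pair of group means."""
    se = math.sqrt(s2_within * (1 / n_i + 1 / n_j))
    return (mean_i - mean_j) / se


lsd_results = {}
for a, b in combinations(means, 2):
    t_val = lsd_statistic(means[a], means[b], n, n, s2_w)
    reject = abs(t_val) >= t_crit
    lsd_results[(a, b)] = (t_val, reject)
    print(f"{a} vs {b}: T = {t_val:.2f}, reject H0: {reject}")
```

The decision compares |T| with the critical value, so the sign of the difference does not matter.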
RELATIONSHIP BETWEEN T AND F
• In this example, we have used a pooled t-test.
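• We combined (pooled) the data from all three samples to get a more stable estimate of the variance.
• This gives the test more degrees of freedom (N − k = 12, rather than the 5 + 5 − 2 = 8 of a t-test that uses only the two groups being compared), which can make the test more powerful.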
• In addition, we have tested a contrast.
• In this case, we checked whether the difference between the means of two groups is
significant.
• If the T-value is squared, the result has exactly the same form as the value of F for a test on a contrast.
• This is because the two tests are closely related: the square of a t statistic with ν degrees of freedom is an F statistic with 1 and ν degrees of freedom.
• You will see this form being used in the Scheffé test.
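A quick numerical check of this relationship, using the medication and exercise groups from Example 1. This is only a sketch: the variable names are chosen here for illustration, and the F value computed is the contrast form (squared difference over the pooled variance term).

```python
import math

# Medication vs exercise from Example 1 (values from the text).
mean_1, mean_2 = 11.8, 3.8
n_1, n_2 = 5, 5
s2_w = 8.73  # within-group variance

# Common term s_W^2 * (1/n_1 + 1/n_2) used by both statistics.
pooled_term = s2_w * (1 / n_1 + 1 / n_2)

t_val = (mean_1 - mean_2) / math.sqrt(pooled_term)   # pooled t statistic
f_val = (mean_1 - mean_2) ** 2 / pooled_term          # F form for the contrast

print(f"T^2 = {t_val ** 2:.2f}, F = {f_val:.2f}")
```

Both quantities come out equal (up to floating-point rounding), which is why squaring T gives the statistic used in the Scheffé test.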
CONFIDENCE INTERVAL APPROACH (CI) TO A
PAIRED COMPARISON
• It is straightforward to solve the same problem of a paired comparison (or a contrast)
using a confidence interval (CI) approach.
• Clearly, if we compute a 100(1 − 𝛼)% confidence interval on µ1 − µ2, we have
$$\bar{X}_{1.} - \bar{X}_{2.} \pm t_{\alpha/2}\sqrt{s_W^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$
• where $t_{\alpha/2}$ is the upper 100(𝛼/2)% point of a t-distribution with N − k degrees of freedom (the degrees of freedom coming from $s_W^2$).
• Remember, a confidence interval will have the form:
estimate ± critical value × √(variance ÷ sample size)
• The test of the simple contrast μ1 − μ2 involves no more than observing whether or not
the confidence interval above covers zero.
• If the CI contains a zero, i.e. spans from negative to positive real numbers, we will fail
to reject the H0
• If the CI does not contain a zero, we will reject the H0
CI APPROACH TO A PAIRED COMPARISON
• Example 2: Find the 95% confidence interval for Example 1 and make a conclusion
• For $\bar{X}_1$ versus $\bar{X}_2$
$$(11.8 - 3.8) \pm 2.179\sqrt{8.73\left(\frac{1}{5} + \frac{1}{5}\right)} = 8 \pm 2.179 \times 1.868689 = 8 \pm 4.07$$
• The CI (3.93, 12.07) does not contain zero, so we reject H0; the contrast is significant.
• We find a significant difference between medication and exercise.
• For $\bar{X}_2$ versus $\bar{X}_3$
$$(3.8 - 7.6) \pm 2.179\sqrt{8.73\left(\frac{1}{5} + \frac{1}{5}\right)} = -3.8 \pm 2.179 \times 1.868689 = -3.8 \pm 4.07$$
• The CI (−7.87, 0.27) contains zero, so we fail to reject H0; the contrast is not significant.
• We do not find a significant difference between exercise and diet.
• For $\bar{X}_1$ versus $\bar{X}_3$
$$(11.8 - 7.6) \pm 2.179\sqrt{8.73\left(\frac{1}{5} + \frac{1}{5}\right)} = 4.2 \pm 2.179 \times 1.868689 = 4.2 \pm 4.07$$
• The CI (0.13, 8.27) does not contain zero, so we reject H0; the contrast is significant.
• We find a significant difference between medication and diet.
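The three intervals can be reproduced with a short sketch. The inputs are the values from Example 1 and the tabulated critical value 2.179; the helper name `pairwise_ci` is an assumption introduced for illustration. The medication versus diet difference is 11.8 − 7.6 = 4.2.

```python
import math
from itertools import combinations

# Values from Example 1; the dictionary layout is an illustrative choice.
means = {"medication": 11.8, "exercise": 3.8, "diet": 7.6}
n = 5
s2_w = 8.73      # within-group variance
t_crit = 2.179   # t_{0.025}(12), read from a t table


def pairwise_ci(mean_i, mean_j, n_i, n_j, s2_within, t_critical):
    """Confidence interval for mu_i - mu_j using the pooled variance."""
    diff = mean_i - mean_j
    margin = t_critical * math.sqrt(s2_within * (1 / n_i + 1 / n_j))
    return diff - margin, diff + margin


intervals = {}
for a, b in combinations(means, 2):
    lower, upper = pairwise_ci(means[a], means[b], n, n, s2_w, t_crit)
    covers_zero = lower <= 0 <= upper
    intervals[(a, b)] = (lower, upper, covers_zero)
    print(f"{a} - {b}: ({lower:.2f}, {upper:.2f}), covers zero: {covers_zero}")
```

An interval that covers zero corresponds to failing to reject H0 for that pair, so the CI approach and the T statistic always lead to the same decision.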
EXPERIMENT-WISE ERROR RATE
• Serious difficulties occur when the analyst attempts to make many or all possible
paired comparisons.
• For the case of k means, there will be, of course, r = k(k − 1)/2 possible paired
comparisons.
• Assuming independent comparisons, the experiment-wise error rate or family error
rate is the probability of false rejection of at least one of the hypotheses.
• It is given by $1 - (1 - \alpha)^r$, where 𝛼 is the selected probability of a type I error for a specific comparison.
• Clearly, this measure of experiment-wise type I error can be quite large.
• For example, in the case of 3 means there are r = 3 paired comparisons, so with 𝛼 = 0.05 the probability of falsely rejecting at least one of them is
• $1 - (1 - 0.05)^3 = 1 - (0.95)^3 \approx 0.143$,
• which is almost three times the per-comparison error rate of 0.05.
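To see how quickly the experiment-wise error rate grows with the number of groups, the formula can be evaluated for a few values of k. This is a small sketch under the same independence assumption stated above; the function name `familywise_error_rate` is chosen here for illustration.

```python
def familywise_error_rate(k, alpha=0.05):
    """Probability of at least one false rejection among all
    r = k(k - 1)/2 paired comparisons, assuming independence."""
    r = k * (k - 1) // 2
    return 1 - (1 - alpha) ** r


for k in (3, 4, 5, 6):
    r = k * (k - 1) // 2
    print(f"k = {k}: r = {r}, error rate = {familywise_error_rate(k):.3f}")
```

With five groups there are already ten comparisons, and the rate is roughly 0.40, which motivates procedures that control the family error rate directly.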
SCHEFFÉ’S TEST
• Scheffé’s method is used to compare all pairs of groups and is designed to control the familywise error rate, FWE (the probability of at least one Type I error).
• It assumes normality and that groups have equal variances.
• To conduct the Scheffé test, you must compare the means two at a time, using all
possible combinations of means.
• For example, if there are three means, the following comparisons must be done
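• $\bar{X}_1$ versus $\bar{X}_2$, $\bar{X}_1$ versus $\bar{X}_3$, $\bar{X}_2$ versus $\bar{X}_3$
• Formula for the Scheffé Test
$$F_S = \frac{\left(\bar{X}_i - \bar{X}_j\right)^2}{s_W^2\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}$$
• where
• $\bar{X}_i$ and $\bar{X}_j$ are the means of the samples being compared,
• $n_i$ and $n_j$ are the respective sample sizes, and
• $s_W^2$ is the within-group variance.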
SCHEFFÉ’S TEST
• To find the critical value $F'$ for the Scheffé test, multiply the critical value for the F test by k − 1:
• $F' = (k - 1) \times (\text{critical value})$
• There is a significant difference between the two means being compared when $F_S$ is greater than $F'$.
SCHEFFÉ’S TEST
• Example 3: Using the Scheffé test, test each pair of means in Example 1 to see whether a specific difference exists, at 𝛼 = 0.05.
• Remember: A researcher wishes to try three different techniques to lower the blood
pressure of individuals diagnosed with high blood pressure.
• The subjects are randomly assigned to three groups; the first group takes medication,
the second group exercises, and the third group follows a special diet.
• After four weeks, the reduction in each person’s blood pressure is recorded.
• Given k = 3 and N = 15, d.f.N. = k − 1 = 2 and d.f.D. = N − k = 12.
• The critical value for the analysis of variance is 3.89, obtained from the F table with 𝛼 = 0.05.
• The within-group variance is calculated to be $s_W^2 = 8.73$.
SCHEFFÉ’S TEST
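For $\bar{X}_1$ versus $\bar{X}_2$
$$F_S = \frac{\left(\bar{X}_1 - \bar{X}_2\right)^2}{s_W^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \frac{(11.8 - 3.8)^2}{8.73\left(\frac{1}{5} + \frac{1}{5}\right)} = 18.33$$
For $\bar{X}_2$ versus $\bar{X}_3$
$$F_S = \frac{\left(\bar{X}_2 - \bar{X}_3\right)^2}{s_W^2\left(\frac{1}{n_2} + \frac{1}{n_3}\right)} = \frac{(3.8 - 7.6)^2}{8.73\left(\frac{1}{5} + \frac{1}{5}\right)} = 4.14$$
For $\bar{X}_1$ versus $\bar{X}_3$
$$F_S = \frac{\left(\bar{X}_1 - \bar{X}_3\right)^2}{s_W^2\left(\frac{1}{n_1} + \frac{1}{n_3}\right)} = \frac{(11.8 - 7.6)^2}{8.73\left(\frac{1}{5} + \frac{1}{5}\right)} = 5.05$$
• Since the critical value for the F test at 𝛼 = 0.05, with d.f.N. = 2 and d.f.D. = 12, is 3.89, the critical value $F'$ is
$$F' = (k - 1) \times (\text{critical value}) = (3 - 1) \times 3.89 = 7.78$$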
SCHEFFÉ’S TEST
• For $\bar{X}_1$ versus $\bar{X}_2$, $F_S$ = 18.33 > 7.78, we reject H0.
• For $\bar{X}_2$ versus $\bar{X}_3$, $F_S$ = 4.14 < 7.78, we fail to reject H0.
• For $\bar{X}_1$ versus $\bar{X}_3$, $F_S$ = 5.05 < 7.78, we fail to reject H0.
• Since only the $F_S$ test value for $\bar{X}_1$ versus $\bar{X}_2$ is greater than the critical value, 7.78, the only significant difference is between $\bar{X}_1$ and $\bar{X}_2$, that is, between medication and exercise.
• On occasion, when the F test value from the analysis of variance is greater than its critical value, the Scheffé test may not show any significant differences in the pairs of means.
• This result occurs because the difference may actually lie in the average of two or
more means when compared with the other mean.
• The Scheffé test can be used to make these types of comparisons, but the technique is
beyond the scope of this course.
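The pairwise Scheffé calculations of Example 3 can be sketched as follows. The inputs (group means, n = 5, within-group variance 8.73, and the tabulated F critical value 3.89) come from the example; the function name `scheffe_statistic` and the data layout are assumptions made for illustration.

```python
from itertools import combinations

# Values from Example 1; the dictionary layout is an illustrative choice.
means = {"medication": 11.8, "exercise": 3.8, "diet": 7.6}
n = 5
s2_w = 8.73     # within-group variance
f_crit = 3.89   # F_{0.05}(2, 12), read from an F table

k = len(means)
f_prime = (k - 1) * f_crit  # Scheffe critical value F'


def scheffe_statistic(mean_i, mean_j, n_i, n_j, s2_within):
    """Scheffe test value F_S for one pair of group means."""
    return (mean_i - mean_j) ** 2 / (s2_within * (1 / n_i + 1 / n_j))


scheffe_results = {}
for a, b in combinations(means, 2):
    f_s = scheffe_statistic(means[a], means[b], n, n, s2_w)
    scheffe_results[(a, b)] = (f_s, f_s > f_prime)
    print(f"{a} vs {b}: F_S = {f_s:.2f}, F' = {f_prime:.2f}, "
          f"significant: {f_s > f_prime}")
```

Because the function takes separate sample sizes for the two groups, the same sketch applies when the groups differ in size.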
TUKEY TEST
• The Tukey test can also be used after the analysis of variance has been completed to
make pairwise comparisons between means when the groups have the same sample
size.
• The symbol for the test value in the Tukey test is q
• Formula for the Tukey Test
$$q = \frac{\bar{X}_i - \bar{X}_j}{\sqrt{s_W^2 / n}}$$
• where
• $\bar{X}_i$ and $\bar{X}_j$ are the means of the samples being compared,
• n is the size of the samples, and
• $s_W^2$ is the within-group variance.
• When the absolute value of q is greater than the critical value for the Tukey test, there
is a significant difference between the two means being compared.
• The procedures for finding q and the critical value for the Tukey test are shown in the
next example
TUKEY TEST
• Example 4: Using the Tukey test, test each pair of means in Example 1 to see whether a specific difference exists, at 𝛼 = 0.05.
• Remember: A researcher wishes to try three different techniques to lower the blood
pressure of individuals diagnosed with high blood pressure.
• The subjects are randomly assigned to three groups; the first group takes medication,
the second group exercises, and the third group follows a special diet.
• After four weeks, the reduction in each person’s blood pressure is recorded.
• Given k = 3 and N = 15, d.f.N. = k − 1 = 2 and d.f.D. = N − k = 12.
• The critical value for the analysis of variance is 3.89, obtained from the F table with 𝛼 = 0.05.
• The within-group variance is calculated to be $s_W^2 = 8.73$.
TUKEY TEST
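For $\bar{X}_1$ versus $\bar{X}_2$
$$q = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{s_W^2 / n}} = \frac{11.8 - 3.8}{\sqrt{8.73 / 5}} = \frac{8}{1.32} = 6.06$$
For $\bar{X}_1$ versus $\bar{X}_3$
$$q = \frac{\bar{X}_1 - \bar{X}_3}{\sqrt{s_W^2 / n}} = \frac{11.8 - 7.6}{\sqrt{8.73 / 5}} = \frac{4.2}{1.32} = 3.18$$
For $\bar{X}_2$ versus $\bar{X}_3$
$$q = \frac{\bar{X}_2 - \bar{X}_3}{\sqrt{s_W^2 / n}} = \frac{3.8 - 7.6}{\sqrt{8.73 / 5}} = \frac{-3.8}{1.32} = -2.88$$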
TUKEY TEST
• To find the critical value for the Tukey test, use the table for the Tukey test.
• The number of means k is found in the row at the top, and the degrees of freedom for $s_W^2$ are found in the left column (denoted by v).
• The degrees of freedom for $s_W^2$ are d.f.D. = N − k = 15 − 3 = 12.
• Since k = 3, d.f. = 12, and 𝛼 = 0.05, the critical value is 3.77.
• For $\bar{X}_1$ versus $\bar{X}_2$, q = 6.06 > 3.77, we reject H0.
• For $\bar{X}_2$ versus $\bar{X}_3$, |q| = 2.88 < 3.77, we fail to reject H0.
• For $\bar{X}_1$ versus $\bar{X}_3$, q = 3.18 < 3.77, we fail to reject H0.
• Hence, the only q value that is greater in absolute value than the critical value is the one for the difference between $\bar{X}_1$ and $\bar{X}_2$.
• The conclusion, then, is that there is a significant difference in means for medication
and exercise.
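The Tukey calculations of Example 4 can be sketched in the same way. The inputs and the tabulated critical value 3.77 come from the example; the function name `tukey_q` is an assumption made here. The sketch keeps the denominator unrounded, so the first value prints as 6.05 rather than the 6.06 obtained above with the denominator rounded to 1.32; the decisions are unaffected.

```python
import math
from itertools import combinations

# Values from Example 1; the dictionary layout is an illustrative choice.
means = {"medication": 11.8, "exercise": 3.8, "diet": 7.6}
n = 5            # common sample size (the formula used here needs equal n)
s2_w = 8.73      # within-group variance
q_crit = 3.77    # Tukey table value for k = 3, v = 12, alpha = 0.05


def tukey_q(mean_i, mean_j, n_per_group, s2_within):
    """Tukey test value q for one pair of group means (equal sample sizes)."""
    return (mean_i - mean_j) / math.sqrt(s2_within / n_per_group)


tukey_results = {}
for a, b in combinations(means, 2):
    q = tukey_q(means[a], means[b], n, s2_w)
    tukey_results[(a, b)] = (q, abs(q) > q_crit)
    print(f"{a} vs {b}: q = {q:.2f}, significant: {abs(q) > q_crit}")
```

As with the other procedures, the comparison uses the absolute value of q, so the order of the two groups does not change the conclusion.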
SUMMARY
• You might wonder why there are two different tests that can be used after the ANOVA.
• Actually, there are several other tests that can be used in addition to the ones discussed here.
• It is up to the researcher to select the most appropriate test.
• The Scheffé test is the most general, and it can be used when the samples are of
different sizes.
• Furthermore, the Scheffé test can be used to make comparisons such as the average
of 1 and 2 compared with 3.
• However, the Tukey test is more powerful than the Scheffé test for making pairwise
comparisons for the means.
• A rule of thumb for pairwise comparisons is to use the Tukey test when the samples
are equal in size and the Scheffé test when the samples differ in size.
TUTORIAL
• Exercise 1: The following set of data values
was obtained from a study of people’s
perceptions on whether the color of a
person’s clothing is related to how
intelligent the person looks.
• The subjects rated the person’s intelligence
on a scale of 1 to 10.
• Group 1 subjects were randomly shown
people with clothing in shades of blue and
gray.
• Group 2 subjects were randomly shown
people with clothing in shades of brown and
yellow.
• Group 3 subjects were randomly shown
people with clothing in shades of pink and
orange.
TUTORIAL
• Use the Tukey test to test all possible pairwise comparisons.
• Are there any contradictions in the results?
• Explain why separate t tests are not accepted in this situation.
• When would Tukey’s test be preferred over the Scheffé method? Explain
• Exercise 2: For five independent groups, assume that you plan to do all pairwise
comparisons of the means and you want FWE to be .05.
• Further assume that n1 = n2 = n3 = n4 = n5 = 4, $\bar{X}_1 = 15$, $\bar{X}_2 = 8$, $\bar{X}_3 = 6$, $\bar{X}_4 = 13$, $\bar{X}_5 = 7$, $s_1^2 = 4$, $s_2^2 = 9$, and $s_3^2 = s_4^2 = s_5^2 = 15$.
• Assuming the ANOVA F test rejects the null hypothesis, test which group means are different with
• Tukey Test
• Scheffé Test
• Fisher’s LSD Method