Subject Code
18MEO113T - Design of Experiments
Handled by
S. Murali, Ph.D.,
Associate Professor,
Department of Mechanical Engineering,
SRM IST, Kattankulathur.
Disclaimer
The content of this presentation is drawn from various sources and is used for educational
purposes only. Thanks to all the sources.
Unit 5
1: Introduction and uses of confounding
2: 2^3 factorial experiment with complete confounding
3: 2^3 factorial experiment with partial confounding
4: Confounding in the 2^n series and examples
5: Confounding of 3^n factorial and examples
6: ANOVA (One-way and two-way, higher-way ANOVA)
7: MANOVA and ANCOVA overview
8: Solving Case studies on ANOVA with statistics software
9: Regression Models and Regression Analysis
ANOVA
• Analysis of Variance (abbreviated as ANOVA)
• ANOVA was developed by Fisher in the early 1920s and was initially applied to
agricultural experiments. It is used extensively today for industrial experiments.
• The ANOVA technique enables us to examine the significance of
the difference among more than two sample means at the same time.
• Using this technique, one can infer whether the samples have been drawn
from populations with the same mean.
• ANOVA is a procedure for testing the difference among different data groups
for homogeneity.
• Variance is an important statistical measure described as the mean of the
squares of deviations taken from the mean of the given data series. It is a
frequently used measure of variation. The square of the standard deviation is
called the variance,
i.e., Variance = (Standard deviation)^2
• There may be variation between samples and also within sample items.
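As a quick numerical check of the relation between variance and standard deviation, here is a minimal Python sketch using the standard-library statistics module. The data values are made up for illustration and are not from the lecture.

```python
import statistics

# Illustrative data (assumed values, not from the lecture).
data = [4, 8, 6, 5, 7]

sd = statistics.pstdev(data)       # population standard deviation
var = statistics.pvariance(data)   # mean of squared deviations from the mean

# The variance equals the square of the standard deviation.
print(round(var, 6), round(sd ** 2, 6))
```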
• An ANOVA test is a way to find out if survey or experiment results are
significant.
• In other words, ANOVA helps us to figure out whether to reject the
null hypothesis in favour of the alternative hypothesis.
• Basically, we’re testing groups to see if they are different.
• Example:
– A group of psychiatric patients are trying three different therapies:
counselling, medication and biofeedback. We want to see if one therapy
is better than the others.
– A manufacturer has two different processes to make light bulbs. They
want to know if one process is better than the other.
– Students from different colleges take the same exam. You want to see if
one college outperforms the other.
– A manager wants to evaluate the performance of three (or more)
employees to see if anyone's performance differs from the others.
– A marketing executive wants to see if there is a difference in sales
productivity across the company's 5 regions.
– A teacher wants to see if there is a difference in students' performance when
three or more teaching approaches are used.
• ANOVA is of two main types:
– One-way ANOVA: only one factor is investigated
• One independent variable (with two or more levels)
– Two-way ANOVA: two factors are investigated at the same time
• Two independent variables (each can have multiple levels)
– Two-way ANOVA without replication
– Two-way ANOVA with replication
• Two way ANOVA without replication
– We are testing one set of individuals before and after they take a
medication to see if it works or not.
• Two way ANOVA with replication
– Two groups, and the members of those groups are doing more than one
thing.
– For example, two groups of patients from different hospitals trying two
different therapies.
• Three way ANOVA
– A three-way ANOVA tests which of three separate variables influence an outcome, and the
relationship between the three variables. It is also called a three-factor ANOVA.
• ANOVA Technique
– Obtain the mean of each sample
– Work out the mean of the sample means
– Calculate sum of squares for variance between the samples (or SS between)
– Obtain variance or mean square (MS) between samples
– Calculate sum of squares for variance within samples (or SS within)
– Obtain the variance or mean square (MS) within samples
– Find sum of squares of deviation for total variance
– Finally, find F-ratio.
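The steps above can be sketched in plain Python as follows. This is a minimal illustration, not a definitive implementation: the function name one_way_anova and the sample values are assumptions made for the example. The grand mean is used in place of the "mean of the sample means"; the two coincide when all samples have the same size.

```python
def one_way_anova(groups):
    """Return (ss_between, ss_within, ss_total, f_ratio) for a list of samples."""
    k = len(groups)                                # number of samples
    n_total = sum(len(g) for g in groups)          # total number of items
    means = [sum(g) / len(g) for g in groups]      # mean of each sample
    grand_mean = sum(sum(g) for g in groups) / n_total

    # SS between: squared distance of each sample mean from the grand mean,
    # weighted by the sample size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ms_between = ss_between / (k - 1)

    # SS within: squared deviations of items from their own sample mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    ms_within = ss_within / (n_total - k)

    ss_total = ss_between + ss_within
    return ss_between, ss_within, ss_total, ms_between / ms_within


# Illustrative samples (assumed values, not from the lecture).
samples = [[6, 7, 3, 8], [5, 5, 3, 7], [5, 4, 3, 4]]
ssb, ssw, sst, f_ratio = one_way_anova(samples)
print(ssb, ssw, sst, f_ratio)
```

The F-ratio returned here is then compared with the tabulated F value for (k - 1, N - k) degrees of freedom.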
One-way ANOVA
• A one-way ANOVA is used to compare the means of two or more
independent (unrelated) groups using the F-distribution.
• The null hypothesis for the test is that all the group means are
equal.
• Therefore, a significant result means that at least one group mean
differs from the others.
• The alternative hypothesis is that not all the means are equal.
Note
• The independent variable is the categorical variable that defines the
compared groups. E.g., instructional methods, grade level, or marital status.
• The dependent variable is the measured variable whose means are being
compared, e.g., level of job satisfaction or test anxiety.
Examples of when to use a one-way ANOVA
• Situation 1: You have a group of individuals randomly split into
smaller groups and completing different tasks. For example,
you might be studying the effects of tea on weight loss and form
three groups: green tea, black tea, and no tea.
• Situation 2: Similar to situation 1, but in this case, the individuals are split
into groups based on an attribute they possess. For example, you might be
studying the leg strength of people according to weight. You could split
participants into weight categories (obese, overweight and normal) and
measure their leg strength on a weight machine.
Limitation of the one-way ANOVA
• A one-way ANOVA will tell you that at least two groups were different from
each other, but it won’t tell you which groups were different. If your test
returns a significant F-statistic, you may need to run a post hoc test (like the
Least Significant Difference test) to tell you exactly which groups had a
difference in means.
Assumptions
1. Your dependent variable should be measured on an interval or ratio scale
(i.e., it is continuous).
Examples of variables that meet this criterion include revision time (measured in
hours), intelligence (measured using IQ score), exam performance (measured
from 0 to 100) and weight (measured in kg).
2. Your independent variable should consist of two or more categorical,
independent groups. Typically, a one-way ANOVA is used when you have three
or more categorical, independent groups, but it can be used for just two groups
(although an independent-samples t-test is more commonly used for two groups).
Examples of independent variables that meet this criterion include ethnicity (e.g., 3
groups: Indian, Chinese, Korean), physical activity level (e.g., 4 groups:
sedentary, low, moderate and high), profession (e.g., 5 groups: surgeon, doctor,
nurse, dentist, therapist), and so forth.
3. You should have independence of observations, meaning there is no
relationship between the observations in each group or between the groups
themselves.
This is an important assumption of the one-way ANOVA. If your study fails
this assumption, you will need to use another statistical test instead of the
one-way ANOVA (e.g., one suited to a repeated measures design).
4. There should be no significant outliers. Outliers are simply single data points
within your data that do not follow the usual pattern.
Example: In a study of 100 students’ IQ scores, where the mean score was 108
with only a small variation between students, one student had a score of 156,
which is very unusual.
The problem with outliers is that they can have a negative effect on the one-way
ANOVA, reducing the validity of your results.
5. Your dependent variable should be approximately normally distributed for
each category of the independent variable.
One-way ANOVA only requires approximately normal data because it is quite
“robust” to violations of normality, meaning that assumption can be a little
violated and still provide valid results. The Kolmogorov–Smirnov test and the
Shapiro–Wilk test are the most widely used methods to test the normality of the
data.
6. There needs to be homogeneity of variances.
One-way ANOVA Process
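• A one-way ANOVA uses the following null and alternative hypotheses:
– H0: all the population means are equal
– H1: at least one population mean is different from the others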
If the p-value is less than your chosen
significance level (e.g. 0.05), then you can
reject the null hypothesis and conclude that
at least one of the population means is
different from the others.
Note: If you reject the null hypothesis, this
indicates that at least one of the population
means is different from the others, but the
ANOVA table doesn’t specify which
population means are different. To
determine this, you need to perform post
hoc tests, also known as “multiple
comparisons” tests.
One-way ANOVA (Problem 1)
Suppose we want to know whether or not three different exam
preparation programs lead to different mean scores on a certain
exam. To test this, we recruit 30 students to participate in a study
and split them into three groups. The students in each group are
randomly assigned to use one of the three exam preparation
programs for the next three weeks to prepare for an exam. All
students take the same exam at the end of the three weeks.
Since there are millions of high school students around the country, it would be too time-consuming and
costly to go around to each student and let them use one of the exam prep programs. Instead, we might
select three random samples of 100 students from the population and allow each sample to use one of the
three test prep programs to prepare for the exam. Then, we could record the scores for each student once
they take the exam.
The exam scores for each group are shown below:
Determine if the mean exam score is different between the three
groups:
Step 1: Calculate the group means and the overall mean.
Step 2: Calculate SSR.
Between Groups
Step 3: Calculate SSE.
Within Groups
Step 4: Calculate SST.
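Squared deviation (Sq. dev.) of each score from the overall mean (85.8):
Group 1  Sq. dev.   Group 2  Sq. dev.   Group 3  Sq. dev.
85       0.64       91       27.04      79       46.24
86       0.04       92       38.44      78       60.84
88       4.84       93       51.84      88       4.84
75       116.64     85       0.64       94       67.24
78       60.84      87       1.44       92       38.44
94       67.24      84       3.24       85       0.64
98       148.84     82       14.44      83       7.84
79       46.24      88       4.84       85       0.64
71       219.04     95       84.64      82       14.44
80       33.64      96       104.04     81       23.04
Sum of squares: 698.0 (Group 1), 330.6 (Group 2), 264.2 (Group 3)
Total sum of squares (SST) = 698.0 + 330.6 + 264.2 = 1292.8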
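The Step 4 figures can be cross-checked with a short Python sketch. The scores are read from the Step 4 table, taking the first, third and fifth columns as Groups 1, 2 and 3 (an assumption about the table layout). The variable names are illustrative.

```python
# Exam scores read from the Step 4 table (assumed column order: Group 1, 2, 3).
groups = [
    [85, 86, 88, 75, 78, 94, 98, 79, 71, 80],
    [91, 92, 93, 85, 87, 84, 82, 88, 95, 96],
    [79, 78, 88, 94, 92, 85, 83, 85, 82, 81],
]

n_total = sum(len(g) for g in groups)
overall_mean = sum(sum(g) for g in groups) / n_total
group_means = [sum(g) / len(g) for g in groups]

# Squared deviation of every score from the overall mean, summed per group.
column_sums = [sum((x - overall_mean) ** 2 for x in g) for g in groups]
sst = sum(column_sums)

# SSR (between groups) and SSE (within groups) should add up to SST.
ssr = sum(len(g) * (m - overall_mean) ** 2 for g, m in zip(groups, group_means))
sse = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

print([round(s, 1) for s in column_sums], round(sst, 1))
print(round(ssr, 1), round(sse, 1))
```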
Step 5: Fill in the ANOVA table.
Step 6: Interpret the results.
F calculated = 2.358
F critical = 3.3541 (alpha = 0.05, df = 2 and 27)
If F calculated falls within the acceptance region (F calculated < F critical),
do not reject H0.
F critical (3.354) > F calculated (2.358) → fail to reject the null hypothesis
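The same decision can be reached through the p-value rule, as in this minimal Python sketch. SSR = 192.2 and SSE = 1100.6 are the sums of squares for this problem. The closed-form tail probability used here is an assumption that holds only when the between-groups degrees of freedom equal 2, as they do for three groups; for other designs a statistics package would be needed.

```python
# Sums of squares and design of Problem 1.
ssr, sse = 192.2, 1100.6
k, n_total = 3, 30                 # 3 groups, 30 students

df_between = k - 1                 # 2
df_within = n_total - k            # 27
f_calc = (ssr / df_between) / (sse / df_within)

f_crit = 3.354                     # F-table value for alpha = 0.05, df = (2, 27)
alpha = 0.05

# Upper-tail probability of the F-distribution; valid ONLY for df_between == 2.
p_value = (1 + 2 * f_calc / df_within) ** (-df_within / 2)

reject_by_f = f_calc > f_crit      # critical-value rule
reject_by_p = p_value < alpha      # p-value rule; agrees with the rule above
print(round(f_calc, 3), round(p_value, 4), reject_by_f, reject_by_p)
```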
F-table with alpha = 0.05 (95% confidence level)
One-way ANOVA (Problem 1 – Second Method)
The exam scores for each group are shown below:
Determine if the mean exam score is different between the three
groups:
Step 1: Calculate Grand Total
Step 2: Count number of observations
Step 3: Calculate Correction Factor (C)
Step 6: SS Within group (Error)
= SS total – SS between group
= 1292.8 - 192.2
SS within group (Error) = 1100.6
Step 7: Make F-distribution table
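The second method can be sketched in Python as follows. Steps 4 and 5 are not legible in this copy of the slides, so the shortcut formulas used below for SS total and SS between are the usual textbook ones and are an assumption about what those steps contained. The scores are the Problem 1 data.

```python
# Problem 1 exam scores (Group 1, Group 2, Group 3).
groups = [
    [85, 86, 88, 75, 78, 94, 98, 79, 71, 80],
    [91, 92, 93, 85, 87, 84, 82, 88, 95, 96],
    [79, 78, 88, 94, 92, 85, 83, 85, 82, 81],
]

grand_total = sum(sum(g) for g in groups)          # Step 1
n_total = sum(len(g) for g in groups)              # Step 2
correction = grand_total ** 2 / n_total            # Step 3: C = G^2 / N

# Assumed Step 4: SS total = sum of squared observations - C
ss_total = sum(x ** 2 for g in groups for x in g) - correction
# Assumed Step 5: SS between = sum of (group total)^2 / (group size) - C
ss_between = sum(sum(g) ** 2 / len(g) for g in groups) - correction
# Step 6: SS within (error) = SS total - SS between
ss_within = ss_total - ss_between

# Step 7: entries of the ANOVA (F) table
df_between, df_within = len(groups) - 1, n_total - len(groups)
ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_calc = ms_between / ms_within

print(round(correction, 1), round(ss_total, 1), round(ss_between, 1), round(ss_within, 1))
print(df_between, df_within, round(ms_between, 2), round(ms_within, 2), round(f_calc, 3))
```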
Step 8: Write recommendation or summary
F critical (3.354) > F calculated (2.358) → fail to reject the null hypothesis
One-way ANOVA (Problem 2)
The exam scores for each group are shown below:
Group 1 Group 2 Group 3
51 23 56
45 43 76
33 23 74
45 43 87
67 45 56
Determine if the mean exam score is different between the three
groups:
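F calculated = 9.747 (mean square between groups / mean square within groups)
F critical from the F-table = 3.885 (alpha = 0.05, df = 2 and 12)
If F calculated falls outside the acceptance region (F calculated > F critical),
reject H0. Here 9.747 > 3.885, so H0 is rejected.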
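Because H0 is rejected in Problem 2, a post hoc comparison is the natural next step. The sketch below recomputes F from the Problem 2 data and then applies the Least Significant Difference test mentioned earlier. The t-table value 2.179 (two-sided alpha = 0.05, 12 degrees of freedom) and all variable names are assumptions for illustration. This is one possible post hoc procedure, not necessarily the one intended in the lecture.

```python
import math
from itertools import combinations

# Problem 2 exam scores.
groups = {
    "Group 1": [51, 45, 33, 45, 67],
    "Group 2": [23, 43, 23, 43, 45],
    "Group 3": [56, 76, 74, 87, 56],
}

n_total = sum(len(g) for g in groups.values())
k = len(groups)
means = {name: sum(g) / len(g) for name, g in groups.items()}
grand_mean = sum(sum(g) for g in groups.values()) / n_total

ss_between = sum(len(g) * (means[name] - grand_mean) ** 2 for name, g in groups.items())
ss_within = sum((x - means[name]) ** 2 for name, g in groups.items() for x in g)
ms_within = ss_within / (n_total - k)
f_calc = (ss_between / (k - 1)) / ms_within

# Assumed t-table value: two-sided alpha = 0.05, N - k = 12 degrees of freedom.
t_crit = 2.179

# A pair of groups differs when |mean difference| exceeds the LSD.
significant_pairs = []
for a, b in combinations(groups, 2):
    lsd = t_crit * math.sqrt(ms_within * (1 / len(groups[a]) + 1 / len(groups[b])))
    if abs(means[a] - means[b]) > lsd:
        significant_pairs.append((a, b))

print(round(f_calc, 3), significant_pairs)
```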
One-way ANOVA (Problem 3)
A paper manufacturer makes grocery bags. They are interested in increasing
the tensile strength of their product. It is thought that strength is a function of the
hardwood concentration in the pulp. An investigation is carried out to compare
four levels of hardwood concentration: 5%, 10%, 15% and 20%. Six test
specimens are made at each level and all 24 specimens are then tested in
random order. The results are shown below:
Do all our groups come from
populations with the same mean?
F calculated = 19.605
F critical = 3.098
If F calculated falls outside the acceptance region (F calculated > F critical),
reject H0. Here 19.605 > 3.098, so H0 is rejected.
One-way ANOVA (Problem 4)
A pharmaceutical company conducts an experiment to test the effect of a new
cholesterol medication. The company selects 15 subjects randomly from a
larger population. Each subject is randomly assigned to one of three treatment
groups. Within each treatment group, subjects receive a different dose of the
new medication. In Group 1, subjects receive 0 mg/day; in Group 2, 50 mg/day;
and in Group 3, 100 mg/day. The treatment levels represent all the levels of
interest to the experimenter, so this experiment used a fixed-effects model to
select treatment levels for study. After 30 days, doctors measure the cholesterol
level of each subject. The results for all 15 subjects appear in the table below:
In conducting this experiment, the experimenter had two research questions:
Does dosage level have a significant effect on cholesterol level?
How strong is the effect of dosage level on cholesterol level?
To answer these questions, the experimenter intends to use one-way analysis
of variance.
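F calculated = 4.16
F critical = 3.885
If F calculated falls inside the acceptance region (F calculated < F critical),
do not reject H0. Here 4.16 > 3.885, so F calculated falls outside the region
and H0 is rejected.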
One-way ANOVA (Problem 5)
A trucking company wishes to test the average life of each of four brands of
tyres. The company uses all brands on randomly selected trucks. The records
showing the lives (in thousands of miles) are given below. Test the hypothesis that the
average life is the same for each brand of tyres at the 5% level of significance.
Brand 1 20 23 18 17 -
Brand 2 19 15 17 20 16
Brand 3 21 19 20 17 16
Brand 4 15 17 16 18 -
Remember, n is the number of observations
in each group. Here, Brand 1 and Brand 4 have
4 observations each, while Brand 2 and Brand 3 have
5 each. So, when substituting n to
calculate SSR, use n = 4 for Brands 1 and 4
and n = 5 for Brands 2 and 3.
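The effect of unequal group sizes can be seen in this minimal Python sketch of Problem 5, where each brand's contribution to SSR is weighted by its own n. The critical value 3.34 is an assumed F-table value for alpha = 0.05 with (3, 14) degrees of freedom. Variable names are illustrative.

```python
# Tyre lives in thousands of miles; Brands 1 and 4 have 4 observations, 2 and 3 have 5.
brands = {
    "Brand 1": [20, 23, 18, 17],
    "Brand 2": [19, 15, 17, 20, 16],
    "Brand 3": [21, 19, 20, 17, 16],
    "Brand 4": [15, 17, 16, 18],
}

n_total = sum(len(v) for v in brands.values())             # 18
k = len(brands)                                            # 4
grand_mean = sum(sum(v) for v in brands.values()) / n_total
means = {name: sum(v) / len(v) for name, v in brands.items()}

# SSR: each brand's squared mean deviation is weighted by ITS OWN n (4 or 5).
ssr = sum(len(v) * (means[name] - grand_mean) ** 2 for name, v in brands.items())
# SSE: squared deviations of observations from their own brand mean.
sse = sum((x - means[name]) ** 2 for name, v in brands.items() for x in v)

f_calc = (ssr / (k - 1)) / (sse / (n_total - k))
f_crit = 3.34          # assumed F-table value, alpha = 0.05, df = (3, 14)

print(round(ssr, 1), round(sse, 1), round(f_calc, 3), f_calc > f_crit)
```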
Two-way ANOVA
• The two-way ANOVA compares the mean differences between groups that
have been split on two independent variables (called factors).
• The primary purpose of a two-way ANOVA is to understand if there is an
interaction between the two independent variables on the dependent
variable.
• For example, you may want to determine whether there is an interaction
between physical activity level (IV) and gender (IV) on blood cholesterol
concentration (DV) in children.
Assumptions
• Assumption #1: Your dependent variable should be measured at the continuous level (i.e., they are
interval or ratio variables).
• Assumption #2: Your two independent variables should each consist of two or more categorical,
independent groups.
• Assumption #3: You should have independence of observations, which means that there is no
relationship between the observations in each group or between the groups themselves.
• Assumption #4: There should be no significant outliers. Outliers are data points within your data that
do not follow the usual pattern
• Assumption #5: Your dependent variable should be approximately normally distributed for each
combination of the groups of the two independent variables.
• Assumption #6: There needs to be homogeneity of variances for each combination of the groups of the
two independent variables.
Two-way ANOVA (with interaction) method 1
Two-way ANOVA (without interaction) method 2
Two-way ANOVA (Problem 1 )
An engineer is studying methods for improving the ability to detect
targets on a radar scope. Two factors she considers to be important
are the amount of background noise, or “ground clutter,” on the
scope and the “type of filter” placed over the screen. The response
variable is intensity level. From experience, the ground clutter
can be categorized into three levels, i.e., low, medium, and high, and
two filter types are available in the market.
Two factors: Ground Clutter type & Filter Type
Response variable: Intensity level
a = number of levels in factor 1 (clutter) = 3
b = number of levels in factor 2 (filter) = 2
n = number of replicates in each condition = 4
N = abn = 24
DF for clutter = a – 1 = 3 – 1 = 2
DF for filter = b – 1 = 2 – 1 = 1
DF for interaction = (a – 1)(b – 1) = 2
DF for error = ab(n – 1) = 3 * 2 * (4 – 1) = 18
DF for total = N – 1 = 24 – 1 = 23
Step 1: Calculate Row, Column, and Grand Total
Step 2: Count the total number of observations = N = 24
Step 3: Calculate Correction Factor (C) = (Grand Total^2) / N = 2288^2/24 = 218122.7
Step 8: SS Error = SS Total – SS Clutter – SS Filter – SS Interaction
= 1985.333 – 353.0833 – 937.5 – 81.25
SS Error = 613.5
Step 9: Make ANOVA Table
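The ANOVA table of Step 9 can be assembled from the sums of squares in Step 8 and the degrees of freedom stated for this problem, as in the following Python sketch. The dictionary layout and names are illustrative assumptions. The F values would still need to be compared with tabulated critical values.

```python
# Design of Problem 1: a = 3 clutter levels, b = 2 filter types, n = 4 replicates.
a, b, n = 3, 2, 4

# Sums of squares taken from Step 8.
ss = {"Clutter": 353.0833, "Filter": 937.5, "Interaction": 81.25, "Error": 613.5}

# Degrees of freedom for each source of variation.
df = {
    "Clutter": a - 1,
    "Filter": b - 1,
    "Interaction": (a - 1) * (b - 1),
    "Error": a * b * (n - 1),
}

# Mean square = SS / DF; each F-ratio is the effect MS divided by the error MS.
ms = {source: ss[source] / df[source] for source in ss}
f_ratio = {s: ms[s] / ms["Error"] for s in ("Clutter", "Filter", "Interaction")}

for source in ss:
    f_text = f"{f_ratio[source]:.3f}" if source in f_ratio else ""
    print(f"{source:<12}{ss[source]:>10.3f}{df[source]:>4}{ms[source]:>10.3f}  {f_text}")
```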
Two-way ANOVA (Problem 2)
Suppose you want to determine whether the brand of laundry detergent used
and the temperature affect the amount of dirt removed from your laundry. To
this end, you buy two detergents of different brands (“Super” and “Best”) and
choose three different temperature levels (“cold”, “warm” and “hot”). Then you
divide your laundry randomly into 6*r piles of equal size and assign r piles
to each combination of detergent (“Super” and “Best”) and temperature (“cold”, “warm” and “hot”).
In this example, we are interested in testing the following null hypotheses:
H0D: The amount of dirt removed does not depend on the type of detergent.
H0T: The amount of dirt removed does not depend on the temperature.
The example has two factors (detergent and temperature) at a = 2 (“Super”
and “Best”) and b = 3 (“cold”, “warm” and “hot”) levels. Thus, there are a*b = 2*3 = 6
different combinations of detergent and temperature. For each combination
there are r = 4 loads (r is called the number of replicates). This sums to
n = a*b*r = 2*3*4 = 24 loads in total.
Two-way ANOVA (Example problem)
The amounts Y(ijk) of dirt removed when washing sub-pile k (k = 1, 2, 3, 4) with
detergent i (i = 1, 2) at temperature j (j = 1, 2, 3) are recorded in the table below:
Solution:
Calculate the means
• We have calculated all the means: the detergent means (Md), the temperature means (Mt) and
the mean of every detergent-temperature combination.
• What remains is to calculate the sum of squares (SS) and degrees of freedom (df)
for temperature, detergent and the interaction between the two factors.
Step 1:
Calculation of SS (within) and df (within):
Step 2:
Calculation of SS (detergent), df (detergent) and MS (detergent)
Step 3:
Calculation of SS (temperature), df (temperature) and MS (temperature)
Step 4:
Calculation of SS (interaction), df (interaction) and MS (interaction)
Step 5: F-Test
Thank you
Reference:
Douglas C. Montgomery. Design and Analysis of Experiments. Eighth Edition.