Design of Experiments Updated

This document discusses research design and data collection. It covers statistical design of experiments including types and principles. It also discusses different types and sources of data, including how data is classified as numerical or categorical, discrete or continuous, nominal or ordinal, and primary or secondary based on its nature and source. Different methods of data collection like surveys and tools are also mentioned.

Uploaded by

maragathamani21
Copyright
© All Rights Reserved
UNIT II RESEARCH DESIGN AND DATA COLLECTION

• Statistical design of experiments: types and principles; data types & classification; data collection: methods and tools
Design of Experiments
Dr.S.Bhuvaneshwari
Definition
Design of experiments (DOE) is a systematic, efficient method that
enables scientists and engineers to study the relationship between
multiple input variables (aka factors) and key output variables (aka
responses). It is a structured approach for collecting data and making
inferences and discoveries.
Purpose of DOE
• To determine whether a factor, or a collection of factors, has an
effect on the response.
• To determine whether factors interact in their effect on the
response.
• To model the behavior of the response as a function of the factors.
• To optimize the response.
Why use DOE
• To drive knowledge of cause and effect between factors.
• To experiment with all factors at the same time.
• To run trials that span the potential experimental region for our factors.
• To understand the combined effect of the factors.
Traditional way of Experimentation
• Trial and Error Method
• One factor at a time
Example: Determine the optimal temperature and time settings that will maximize yield through experiments.
Trial and Error method
1. Conduct a trial at starting values for the two variables and record the yield.
2. Adjust one or both values based on our results.
3. Repeat Step 2 until we think we've found the best set of values.


Disadvantages of Trial and Error method
Inefficient, unstructured and ad hoc (worst if carried out without
subject matter knowledge).
Unlikely to find the optimum set of conditions across two or more
factors.
One factor at a time (OFAT) method
Change the value of one factor and measure the response, then repeat the process with another factor.

1. Start with temperature: find the temperature resulting in the highest yield, between 50 and 120 degrees.
1a. Run a total of eight trials. Each trial increases temperature by 10 degrees (i.e., 50, 60, 70 ... all the way to 120 degrees).
1b. Keep time fixed at 20 hours as a controlled variable.
1c. Measure yield for each batch.


Contd…
2. Run the second experiment by varying time, to find the optimal value of time (between 4 and 24 hours).
2a. Run a total of six trials. Each trial increases time by 4 hours (i.e., 4, 8, 12 ... up to 24 hours).
2b. Keep temperature fixed at 90 degrees as a controlled variable.
2c. Measure yield for each batch.
Contd…
3. After a total of 14 trials, we've identified that the max yield (86.7%) happens when:
• Temperature is at 90 degrees
• Time is at 12 hours
Disadvantages

We’re unlikely to find the optimum set of conditions across two or more factors.
Results of both the methods

Notice that neither method has trials conducted at a low temperature and time, or near the optimum conditions.
Shortcomings…
• Did not simultaneously change the settings of both factors.
• Did not conduct trials throughout the potential experimental region.

The result was a lack of understanding of the combined effect of the two variables on the response. The two factors did interact in their effect on the response!
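The OFAT trial plan described above can be written out as a short sketch to make the shortcoming concrete. The settings are the ones given in the slides; the variable names are only illustrative.

```python
# OFAT plan from the slides: vary temperature with time fixed,
# then vary time with temperature fixed.
temperature_trials = [(temp, 20) for temp in range(50, 121, 10)]  # 8 trials, time fixed at 20 h
time_trials = [(90, hours) for hours in range(4, 25, 4)]          # 6 trials, temperature fixed at 90

ofat_trials = temperature_trials + time_trials
print(len(ofat_trials))  # 14

# Trials in which neither factor sits at its fixed value,
# i.e. trials where both settings were moved together.
both_changed = [(t, h) for t, h in ofat_trials if t != 90 and h != 20]
print(both_changed)  # []
```

Every one of the 14 trials lies on one of two straight lines through the experimental region, so the plan carries no information about how the two factors act together.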
Full Factorial Method
Experiment with two factors, each factor with two values

These four trials form the corners of the design space:


Run all possible combinations of factor levels, in random order, to average out the effects of lurking variables.
(Optional) Replicate the entire design by running each treatment twice to estimate the experimental error.
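A minimal sketch of how such a run sheet could be generated is given here. The low and high values for temperature and time are hypothetical (the slides do not state them), and `full_factorial` is an illustrative helper, not part of any library.

```python
import itertools
import random

# Hypothetical low/high levels; the slides do not give the actual values.
levels = {
    "temperature": [60, 100],
    "time": [8, 20],
}

def full_factorial(levels, replicates=1, seed=None):
    """Return every factor-level combination, replicated, in random run order."""
    names = list(levels)
    combos = [dict(zip(names, values))
              for values in itertools.product(*levels.values())]
    # Copy each treatment once per replicate, then randomize the run order.
    runs = [dict(combo) for combo in combos for _ in range(replicates)]
    random.Random(seed).shuffle(runs)
    return runs

runs = full_factorial(levels, replicates=2, seed=1)
for order, run in enumerate(runs, start=1):
    print(order, run)
```

With two factors at two levels and two replicates this gives 2 x 2 x 2 = 8 runs; the seed is fixed only so that the run order can be reproduced.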
Contd….
Analysing the results enables us to build a statistical model that estimates the individual effects (Temperature & Time), and also their interaction.
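As a rough illustration of what estimating effects means for a two-factor, two-level design, the sketch below computes each effect as the difference between average responses. The four yield values are made-up numbers, and the coded levels (-1 for low, +1 for high) are a common convention assumed here, not something stated in the slides.

```python
# Coded levels: -1 = low, +1 = high. Yields are hypothetical.
runs = [
    # (temperature, time, yield)
    (-1, -1, 60.0),
    (+1, -1, 72.0),
    (-1, +1, 68.0),
    (+1, +1, 90.0),
]

def effect(runs, contrast):
    """Average response where the contrast is +1 minus the average where it is -1."""
    high = [y for temp, time, y in runs if contrast(temp, time) > 0]
    low = [y for temp, time, y in runs if contrast(temp, time) < 0]
    return sum(high) / len(high) - sum(low) / len(low)

temperature_effect = effect(runs, lambda temp, time: temp)
time_effect = effect(runs, lambda temp, time: time)
interaction_effect = effect(runs, lambda temp, time: temp * time)

print(temperature_effect, time_effect, interaction_effect)  # 17.0 13.0 5.0
```

A non-zero interaction effect means the gain from raising temperature depends on the time setting, which is the kind of information the OFAT plan could not provide.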
Advantages
• DOE requires fewer trials.
• DOE is more effective in finding the best settings to maximize yield.
• DOE enables us to derive a statistical model to predict results as a
function of the two factors and their combined effect.
A good experimental design is focussed on
• Increasing the efficiency of the design
• Reducing the experimental errors
Contd…..
Statistically designed experiments are economical
They allow one to measure the influence of one or several factors
on a response
They allow the estimation of the magnitude of experimental error
Experiments designed without adhering to statistical principles usually violate one or more of these desirable design goals.
Terminologies in DoE
• Block. Group of homogeneous experimental units.
• Confounding. One or more effects that cannot unambiguously be attributed to a single
factor or interaction.
• Covariate. An uncontrollable variable that influences the response but is unaffected by
any other experimental factors.
• Design (layout). Complete specification of experimental test runs, including blocking,
randomization, repeat tests, replication, and the assignment of factor–level combinations
to experimental units.
• Effect. Change in the average response between two factor–level combinations or
between two experimental conditions.
Contd…..
• Repeat tests. Two or more observations that have the same levels for all the factors.
• Replication. Repetition of an entire experiment or a portion of an experiment under
two or more sets of conditions.
• Response. Outcome or result of an experiment.
• Test run. Single combination of factor levels that yields an observation on the
response.
• Unit (item). Entity on which a measurement or an observation is made; sometimes
refers to the actual measurement or observation.
Contd…
• Experimental region (factor space). All possible factor–level combinations for which
experimentation is possible.
• Factor. A controllable experimental variable that is thought to influence the response.
• Homogeneous experimental units. Units that are as uniform as possible on all
characteristics that could affect the response.
• Interaction. Existence of joint factor effects in which the effect of each factor depends
on the levels of the other factors.
• Level. Specific value of a factor.
Basic Principles of DOE
Completely randomized design
Contd….
Sources of Variation
Example
Contd….
• Apply one way ANOVA
• There is no significant effect of different fertilizer doses on
the fresh weight of plants
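The one-way ANOVA step can be sketched in plain Python as follows. The fresh-weight values are hypothetical (the original data table did not survive extraction), and `one_way_anova` is an illustrative helper written for this example.

```python
def one_way_anova(groups):
    """Return the one-way ANOVA F statistic and its degrees of freedom."""
    observations = [y for group in groups for y in group]
    grand_mean = sum(observations) / len(observations)
    group_means = [sum(group) / len(group) for group in groups]

    # Variation between treatment means and variation within treatments.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((y - m) ** 2
                    for g, m in zip(groups, group_means) for y in g)

    df_between = len(groups) - 1
    df_within = len(observations) - len(groups)
    f_statistic = (ss_between / df_between) / (ss_within / df_within)
    return f_statistic, df_between, df_within

# Hypothetical fresh weights: one list per fertilizer dose.
fresh_weight = [
    [10.0, 12.0, 11.0],
    [11.0, 13.0, 12.0],
    [12.0, 11.0, 13.0],
]
f_statistic, df_between, df_within = one_way_anova(fresh_weight)
print(round(f_statistic, 3), df_between, df_within)  # 1.0 2 6
```

An F value near 1 is far below the tabulated 5% critical value for (2, 6) degrees of freedom (about 5.14), so these made-up data would lead to the same kind of conclusion as the slide: no significant effect of fertilizer dose.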
Randomized block design
Randomized Complete Block Design (RCBD)
Contd….
Result of ANOVA

• There is a highly significant effect of fertilizer on the fresh weight of plants
• There is a significant effect of river on the fresh weight of plants
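For a randomized block layout, the ANOVA splits the total variation into treatment, block and error parts. The sketch below assumes one observation per block and treatment; the numbers are hypothetical, with rows standing for blocks (for example, distance from the river) and columns for fertilizer treatments.

```python
def rcbd_anova(table):
    """Two-way ANOVA without replication.

    table[i][j] is the response in block i under treatment j.
    """
    n_blocks, n_treatments = len(table), len(table[0])
    grand_mean = sum(map(sum, table)) / (n_blocks * n_treatments)
    block_means = [sum(row) / n_treatments for row in table]
    treatment_means = [sum(col) / n_blocks for col in zip(*table)]

    ss_treatment = n_blocks * sum((m - grand_mean) ** 2 for m in treatment_means)
    ss_block = n_treatments * sum((m - grand_mean) ** 2 for m in block_means)
    # Residual left after removing block and treatment means.
    ss_error = sum(
        (table[i][j] - block_means[i] - treatment_means[j] + grand_mean) ** 2
        for i in range(n_blocks) for j in range(n_treatments)
    )

    df_treatment, df_block = n_treatments - 1, n_blocks - 1
    ms_error = ss_error / (df_treatment * df_block)
    return {
        "ss_treatment": ss_treatment,
        "ss_block": ss_block,
        "ss_error": ss_error,
        "f_treatment": (ss_treatment / df_treatment) / ms_error,
        "f_block": (ss_block / df_block) / ms_error,
    }

# Hypothetical fresh weights: 3 blocks (rows) x 3 fertilizer treatments (columns).
table = [
    [10.0, 12.0, 14.0],
    [11.0, 13.0, 15.0],
    [12.0, 14.0, 17.0],
]
result = rcbd_anova(table)
print({name: round(value, 3) for name, value in result.items()})
```

Because the block sum of squares is taken out of the error term, a treatment effect that would be masked in a completely randomized design can show up as significant here.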
Latin Square Design
• Handles two sources of variation that occur in a gradient
• Example: A field has river on one side and road on the other side
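One way to lay out such a field is sketched here: each treatment appears exactly once in every row (say, distance from the river) and once in every column (distance from the road). The treatment labels are hypothetical, and the cyclic construction followed by row and column shuffling is one simple way to randomize a Latin square, not necessarily the method used in the original example.

```python
import random

def latin_square(treatments, seed=None):
    """Build a randomized Latin square from a cyclic starting square."""
    n = len(treatments)
    rng = random.Random(seed)
    # Cyclic square: row r is the treatment list shifted by r positions.
    square = [[treatments[(r + c) % n] for c in range(n)] for r in range(n)]
    # Shuffling whole rows and whole columns keeps the Latin property.
    rng.shuffle(square)
    columns = list(range(n))
    rng.shuffle(columns)
    return [[row[c] for c in columns] for row in square]

treatments = ["A", "B", "C", "D"]  # hypothetical fertilizer treatments
layout = latin_square(treatments, seed=7)
for row in layout:
    print(" ".join(row))
```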
Results of ANOVA analysis

• There is a very highly significant effect of fertilizer on the fresh weight of plants
• The effect of road on the fresh weight of plants is non-significant
• There is a highly significant effect of river on the fresh weight of plants


Types of DOE
• Full factorial designs
• Fractional factorial designs (Screening designs)
• Response surface designs
• Mixture designs
• Taguchi array designs
• Split plot designs
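To illustrate the difference between the first two types in the list, the sketch below builds a full factorial for three two-level factors and a half-fraction of it. The factor names A, B, C and the generator C = A x B are assumptions chosen for the example; they are a standard textbook choice, not taken from the slides.

```python
import itertools

# Full factorial: all 2^3 = 8 combinations of three two-level factors (coded -1/+1).
full = list(itertools.product([-1, 1], repeat=3))

# Half-fraction 2^(3-1): choose A and B freely, then set C = A * B.
half = [(a, b, a * b) for a, b in itertools.product([-1, 1], repeat=2)]

print(len(full), len(half))  # 8 4
for run in half:
    print(run)
```

The half-fraction needs only four runs, but because C always equals A x B, the main effect of C cannot be separated from the A-by-B interaction. This is the confounding described in the terminology section, and it is why such designs are mainly used for screening.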
Data
Data is a set of values of qualitative or quantitative variables
Scientific investigations involve observations on variables
The observations made on these variables are obtained in the form of data
A variable is a quantity or characteristic which varies from person to person
Example: Weight of X individuals, denoted as N; N varies from person to person
Types of Data
Classified based on:
• Nature of variables
• Source of variables
Nature of Variables
• Numerical: Discrete, Continuous
• Categorical: Nominal, Ordinal
Numerical Variable: measurable or countable variables
Also known as quantitative variables
Ex: Weight of a class; Population data

Categorical Data: unmeasurable variables
Non-numerical data; Qualitative data
Ex: Colour of flower; Shape of leaves; Shape of seeds etc.
Contd….
Discrete Variable: discontinuous variables
Values of the variables are limited to whole numbers
There cannot be fractions
Ex: No. of petals or no. of human beings

Continuous Variable: variables that can take any value within a certain range
There are no gaps between the possible values
Ex: The value can be 10.5 or 11.8
Nominal Variable: has distinct levels that have no inherent ordering
Ex: Hair colour (White, Black, Brown); Gender (Men, Women)

Ordinal Variable: has levels that follow a distinct ordering
Ex: Vast improvement; moderate improvement;
Classification of variables based on source of variables
Data: Primary Data, Secondary Data

Primary data: Data originally collected in the process of investigation
• More accurate and uniform
• Under a supervision
• Time and labour consuming
Ex: Biological studies, Experimental Studies

Secondary data: Data collected from some other person or an industry
• Data published by primary investigator
• Less expensive and less time consuming
Ex: Population census data
Data Collection Methods and Tools
