Research Methodology and Biostatistics
PRESENTER: Dr. Abanish Chandra Ray
MODERATOR: Prof (Dr.) Mohanchandra Mandal
Department of Anaesthesiology
Conducting Research: Steps
Research question → Study Design → Data Collection → Data Analysis
MAKING A START ON YOUR PROJECT
CHOOSING A RESEARCH TOPIC
A research project often starts with an idea
that interests you, or a problem you have
noticed.
You should choose a topic of interest,
explore what has already been written on
the subject, what local research exists, in
what context this has been done, talk to
your supervisors and take time to identify
clear research questions and choose a
feasible and practical method for your study.
A clear research
question
A research question should be objective and
answerable using a research methodology.
Research questions can be :
quantitative, qualitative or a combination of both.
Quantitative research questions generate data that are
measures or values, which can be used for descriptive
and inferential statistics (such as ‘what are the causes
of anaemia in children presenting to SSKM Hospital?’).
Qualitative research generates broader understanding
of opinions or reasons, providing insight. It can help
explain the reasons for quantitative results.
Qualitative research questions may include perceptions
of patients, parents or healthcare workers (‘what do
adolescents with rheumatic heart disease understand
about their condition?’ or ‘what are the greatest
concerns of the parents of children with epilepsy?’).
Definition of terms and
metrics of measurement
It is possible to describe some studies using the mnemonic
PICOT, for population, intervention, comparator, outcome and
time. This applies to intervention studies, and is a way of
phrasing the primary question clearly. For example:
P: among children under 2 years of age with moderate-to-severe
pneumonia or bronchiolitis presenting to an emergency department,
I: does nebulised hypertonic saline, given in up to three doses
over 2 hours,
C: compared with standard care including antibiotics and oxygen,
O: result in a lower respiratory distress score and fewer
children requiring inpatient care
T: over the first 12 hours?
The FINER Framework for
creating a research question
F - Feasible
I - Interesting
N - Novel
E - Ethical
R - Relevant
What is study design?
⚫ The procedures and methods, predetermined
by an investigator, to be adhered to in
conducting a research project
⚫ Methods used to obtain valid data to answer
a research question
Determining study design: Key
question
⚫ Do you plan to stand apart to observe the
events taking place in the study subjects:
(Observational studies)
OR
⚫ Do you want to give an intervention and see
its effects on the following events:
(Interventional or Experimental studies)
Study design to employ
⚫Dependent on the hypothesis posed
⚫ Is your objective to observe, associate
factors, or show cause and effect?
⚫ Are exposure or outcome factors
common or rare?
⚫ Are your resources many or
constrained?
Types of Study
Designs
⚫Observational Designs
⚫Descriptive – Case report, case
series
⚫Analytic
⚫ Cross-sectional, Case-control,
Cohort
⚫Experimental Designs
⚫Quasi experimental
⚫ Non-randomized or non-controlled
trial
⚫True experimental
Quasi Experimental vs True Experimental
True experimental design
Example: half the patients in a mental health clinic are randomly
assigned to receive the new treatment. The other half—the control
group—receives the standard course of treatment for depression.
Every few months, patients fill out a sheet describing their
symptoms to see if the new treatment produces significantly better
(or worse) effects than the standard one.
Quasi-experimental design
Example: You discover that a few of the psychotherapists in the
clinic have decided to try out the new therapy, while others who
treat similar patients have chosen to stick with the normal
protocol.
You can use these pre-existing groups to study the symptom
progression of the patients treated with the new therapy versus
those receiving the standard course of treatment.
Although the groups were not randomly assigned, if you properly
account for any systematic differences between them, you can be
reasonably confident any differences must arise from the
treatment and not other confounding variables.
Descriptive
Studies
⚫No assignment of exposure or
risk factor
⚫Objective is to observe and record
⚫Record events or activities.
⚫Single event or case - Case
Report
⚫Several events or cases - Case
Series
Cross-Sectional Studies
⚫ Measurement of risk and outcome at the
same time.
[Diagram: risk factor and outcome measured at the same point in time]
Cross-Sectional Design
⚫ The only study capable of calculating prevalence
⚫ Proportion of the population with the outcome at any point in time
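As a minimal sketch of the prevalence calculation described above, the Python snippet below divides the number of subjects with the outcome by the number surveyed at one point in time. The function name and the survey counts are hypothetical and chosen only for illustration.

```python
def point_prevalence(cases_with_outcome, subjects_surveyed):
    """Proportion of surveyed subjects who have the outcome at the time of the survey."""
    if subjects_surveyed <= 0:
        raise ValueError("subjects_surveyed must be positive")
    return cases_with_outcome / subjects_surveyed


# Hypothetical cross-sectional survey: 120 of 1,500 children have the outcome.
prevalence = point_prevalence(120, 1500)
print(f"Point prevalence: {prevalence:.1%}")
```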
Cross-Sectional Studies
⚫ Most useful if exposure continues right up to
the time that the outcome is recognized
⚫ Suggests the possibility of certain risk
factors as the cause of a common disease
⚫ Can be used to initiate and evaluate
effective health services programmes
Cross-Sectional Studies
⚫Advantages
⚫ Cheap and quick studies
⚫ Data is frequently available through current
records or statistics
⚫ Ideal for generating new hypotheses about
the cause of a disease
Cross-Sectional
Studies
⚫ Disadvantages: The importance of the
relationship between the cause and the
effect cannot be determined
⚫ Temporal weakness:
⚫ Cannot determine if the cause preceded the
effect or the effect was responsible for the cause.
⚫ The rules of contributory cause cannot be
fulfilled.
Case-Control
Studies
⚫ Subjects are grouped according to the
presence or absence of the outcome
[Diagram: starting from disease status, look back for the risk factor (present? absent?)]
⚫ Reviews past histories of the
subjects for the occurrence of
suspected risk factors
Case-Control Studies
⚫ Case - Control studies have two main
purposes:
⚫ Descriptive
⚫ Describe the risk factor profile for an outcome
⚫ Analytic
⚫ Analyze associations between outcome and
risk factors
Case-Control Studies
⚫ Advantages
- Good initial explanatory studies
⚫ Investigators can explore multiple
risk factors simultaneously for one
outcome
- Efficient, relatively cheap, and
quick
⚫ Data available through chart
review
Case-Control Studies
⚫ Advantages
- Well suited for rare diseases
- Since the study begins with subjects who
already have the outcome, it is easier to
accumulate enough subjects for significant
results
- Tend to support (but not prove) causal
hypothesis by establishing associations
Case-Control
Studies
⚫ Disadvantages
- Data Quality
⚫ Data with inadequate detail, questionable
reliability, or use of different standards to
judge disease severity
- Recall bias
⚫ Subjects who have unpleasant experiences
may recall the past differently than control
subjects
Case-Control
Studies
⚫ Disadvantages
- Sampling bias
⚫ Sample usually not representative of all
subjects who could be included – clinical cases
are selective survivors
- Other
⚫ Capable of studying only one outcome at a
time
⚫ Cannot calculate prevalence or incidence
⚫ Subject to confounding factors
⚫ Cannot prove contributory cause
Cohort Studies
⚫ Subjects identified according to the
presence or absence of the risk factor /
exposure
⚫ Followed over time until the outcome
occurs or becomes evident
Cohort
Studies
⚫ Subjects with and without the suspected
risk factor are followed for the
development of the outcome
[Diagram: starting from risk factor status, follow forward for disease (develops? does not develop?)]
⚫ The frequency of the outcome is
compared between the two groups
Cohort Studies
⚫ Cohort studies have two main purposes:
⚫ Descriptive
⚫ Describe the incidence of outcome over time
⚫ Analytic
⚫ Analyze associations between risk factors and
outcome
Cohort Studies
⚫ Advantages
- More powerful design for defining incidence
-Powerful design for associating the
cause with the effect
⚫ Can suggest that the cause precedes the
effect-temporal association
⚫ Data can be collected in a comprehensive
and uniform fashion
Cohort
Studies
⚫ Advantages
- No recall bias
-Cohort designs can examine many
outcomes for potential risk factors under
investigation
Cohort
Studies
⚫ Disadvantages
-Expensive in time, money, and number of
patients necessary to demonstrate
significant differences between groups
-Less likely than retrospective studies to
uncover new risk factors
- Also subject to confounding
- Loss of valuable information due to patient
attrition
-Patients may change their behaviors or risk
factors after the initial grouping of subjects
resulting in misclassification and study error
Sampling Methods
There are two major categories of sampling methods:
⚫ Probability sampling methods, where all subjects in the
target population have an equal chance of being selected in the
sample
⚫ Non-probability sampling methods, where the sample
population is selected in a non-systematic process that
does not guarantee an equal chance for each subject in the
target population
Samples selected using probability sampling
methods are more representative of the target population.
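The contrast between the two categories can be sketched in Python. This is only an illustration under stated assumptions: the sampling frame of 200 patient IDs, the sample size, and the seed are hypothetical. Simple random sampling stands in for probability sampling, and convenience sampling stands in for non-probability sampling.

```python
import random

# Hypothetical sampling frame: 200 patient IDs in the target population.
target_population = list(range(1, 201))
sample_size = 20

# Probability sampling (simple random sampling): every subject has the
# same chance of selection. A fixed seed keeps the example reproducible.
rng = random.Random(2024)
random_sample = rng.sample(target_population, sample_size)

# Non-probability sampling (convenience sampling): take whoever comes first,
# so subjects later in the list have no chance of being selected.
convenience_sample = target_population[:sample_size]

print("Simple random sample:", sorted(random_sample))
print("Convenience sample:  ", convenience_sample)
```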
Clinical Trial
⚫ Experimental study
Unique features:
⚫ Intervention in the subjects for the
purposes of the
study
⚫ Randomization of subjects
⚫ Control group comparison
⚫ Placebo or treatment
Clinical Trial Design
[Diagram: randomized intervention leading to outcome/effect]
Clinical Trial
⚫ Randomization
⚫ Subjects are randomly assigned to
control or experimental group
⚫ Groups are similar in every way
except for the intervention under
study
⚫ Each subject has equal probability of being
placed in either group
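A minimal sketch of simple randomization, assuming two arms and an independent 50% chance for each subject, is shown below. The function name, the subject IDs, and the seed are hypothetical. Real trials typically use a concealed, pre-generated allocation schedule rather than an ad hoc script.

```python
import random


def randomize(subject_ids, seed=None):
    """Simple randomization: each subject independently has an equal chance of each arm."""
    rng = random.Random(seed)
    return {sid: rng.choice(["control", "experimental"]) for sid in subject_ids}


# Hypothetical trial with 1,000 enrolled subjects.
allocation = randomize(range(1, 1001), seed=7)
n_experimental = sum(1 for arm in allocation.values() if arm == "experimental")
print(f"Experimental: {n_experimental}, control: {len(allocation) - n_experimental}")
```

Simple randomization gives each subject an equal probability of either group, but it does not guarantee equal group sizes, which matters most in small trials.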
Clinical
Trials
⚫ Advantages
-Subject to the fewest methodological biases
of all study designs
-Most powerful study designed to show
contributory cause
Clinical
Trials
⚫ Disadvantages
-Is the most expensive study design in terms
of money and number of patients. The time required is
less compared with a cohort study.
⚫ Issues of patient attrition and compliance may
invalidate the results
⚫ Can be problematic for ethical reasons
⚫ Use of placebo
⚫ Harm outweighing benefits
⚫ Zero tolerance for some exposures
Study Designs: comparison
Study design      Primary objective   Outcome at entry in study   Exposure at entry in study
Cross-sectional   Disease burden      Yes                         Yes
Case-control      Association         Yes                         Yes
Cohort            Cause-effect        No                          Yes
Clinical trial    Cause-effect        No                          Yes
Study designs: relative strength
Strength   Study design
STRONG     Clinical trial
           Cohort study
           Case control study
           Cross-sectional
           Case series
WEAK       Case report
Choosing the specific
design
⚫ Study design is highly dependent on the
type of analysis (step 3)
⚫ Type of analysis is dependent on hypothesis
posed (step 2)
⚫ The hypothesis is dependent on the intent of
your research (step 1)
1. Research
question
2. Study Design
3. Data Collection
4. Data Analysis
5. Interpretation
1. Research Study
Intent
⚫ Know the problem
⚫ Determine what you want to conclude
⚫ Formulate the question
Examples of Intent
⚫ I intend to show that aspirin resistance is
associated with the severity of heart
disease
⚫ I will compare levels of aspirin
resistance among patients with differing
severity of heart disease
⚫ I intend to show that breast feeding is
protective against allergies developing
in infants
⚫ I will compare rates of allergies among
infants of women who breastfeed and those
who do not
2. Research
Hypothesis
⚫ Know the question you want to be answered
⚫ Restate the question in terms of H0 and Ha
⚫ Think about corresponding analysis
Examples of Hypotheses
⚫ Is aspirin resistance associated with heart
disease?
⚫ Aspirin resistance increases the risk of heart
attack
⚫ Is breast feeding associated with allergies?
⚫ Breast feeding decreases the risk of
allergies in babies
3. Statistical Plan of
Analysis
⚫ Correlation?
⚫ Comparison?
⚫ Association?
⚫ Difference?
Examples of Analysis
⚫ The level of aspirin resistance is compared
between those with heart attack and those
without
⚫ Differences in resistance scores between the
two groups: as these are continuous data,
Student's t-test is used for analysis
⚫ The rate of infant allergies is
compared between infants of women who
breastfeed and those who do not
⚫ Relative risk association as it is a
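The two analyses named on this slide can be sketched in plain Python. This is an illustrative sketch, not the presenter's analysis: the function names, the resistance scores, and the allergy counts are all hypothetical. The t statistic uses the pooled-variance (Student) form. To obtain a P value it would still need to be compared against a t distribution with n1 + n2 − 2 degrees of freedom, which is normally done with a statistics package.

```python
from math import sqrt
from statistics import mean, variance


def student_t_statistic(group_a, group_b):
    """Student's t statistic for two independent samples (pooled variance)."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    standard_error = sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error


def relative_risk(exposed_events, exposed_total, unexposed_events, unexposed_total):
    """Risk of the outcome in the exposed group divided by the risk in the unexposed group."""
    return (exposed_events / exposed_total) / (unexposed_events / unexposed_total)


# Hypothetical aspirin resistance scores (continuous data).
heart_attack = [62.0, 58.5, 70.1, 66.3, 61.8]
no_heart_attack = [51.2, 55.0, 49.7, 57.4, 53.1]
print("t statistic:", round(student_t_statistic(heart_attack, no_heart_attack), 2))

# Hypothetical counts: 10 of 100 breastfed infants and 20 of 100
# non-breastfed infants develop allergies.
print("Relative risk:", relative_risk(10, 100, 20, 100))
```

A relative risk below 1 would suggest that the exposure (here, breastfeeding) is associated with a lower risk of the outcome.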
Choose appropriate
design
⚫ Cross-sectional.
⚫ Case-control.
⚫ Cohort.
⚫ Clinical Trial.
Apply the best
design
⚫ Think about the measures to be used
⚫ Know the analysis required
⚫ Rethink desired conclusions
Decision to select appropriate
study design
⚫ The objectives of the study
⚫ The characteristics of the exposure and
disease
⚫ The current state of knowledge:
relationship
⚫ The research setting
⚫ The resources available
Assign exposure?
⚫ Yes: Experimental study. Random allocation?
   ⚫ Yes: RCT
   ⚫ No: Non-RCT
⚫ No: Observational study. Comparison group?
   ⚫ No: Descriptive study
   ⚫ Yes: Analytic study. Direction?
      ⚫ Exposure → Outcome: Cohort
      ⚫ Outcome → Exposure: Case-control
      ⚫ Exposure and outcome simultaneous: Cross-sectional
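The decision flowchart above can be written as a small function. This is a sketch only: the function name, the parameter names, and the three `direction` labels are hypothetical, introduced here to mirror the flowchart's questions.

```python
def classify_study_design(assigns_exposure, random_allocation=False,
                          comparison_group=False, direction=None):
    """Walk the decision tree: experimental vs observational, then sub-type.

    direction is one of "exposure_to_outcome", "outcome_to_exposure",
    or "simultaneous" and is only used for analytic observational studies.
    """
    if assigns_exposure:
        # Experimental branch: the only question left is random allocation.
        return "RCT" if random_allocation else "Non-RCT"
    if not comparison_group:
        # Observational study without a comparison group.
        return "Descriptive study"
    analytic_designs = {
        "exposure_to_outcome": "Cohort",
        "outcome_to_exposure": "Case-control",
        "simultaneous": "Cross-sectional",
    }
    if direction not in analytic_designs:
        raise ValueError("direction is required for analytic studies")
    return analytic_designs[direction]


print(classify_study_design(assigns_exposure=True, random_allocation=True))
print(classify_study_design(False, comparison_group=True,
                            direction="outcome_to_exposure"))
```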
Prerequisite of a good research
• At the outset, the primary objectives (descriptive/analytical) and
primary outcome measure (mean/proportion/rates) should be defined.
• Choose a primary outcome and lock it for the study. The primary outcome cannot be changed after
the start of the study because the sample size is determined on the basis of the primary outcome.
• The minimum difference that the investigator wants to detect between the groups is the effect size for
the sample size calculation.
• Hence, if the researcher changes the planned outcome after the start of the study, the reported P
value and inference become invalid.
Prerequisite of a good research
The sample size for any study depends on certain factors such as the acceptable ‘level
of significance’ (α, the threshold against which the P value is judged), Power (1 − β) of
the study, expected ‘clinically relevant’ effect size, underlying event rate in the population, etc.
Primarily, 3 factors govern an appropriate sample size:
• Level of significance (α),
• Power (1 − β) and
• the ‘Effect size’ (clinically relevant assumption).
Type I error, Type II error, and Power or (1- β)
Type I error - α (alpha) error: The chance that the researcher detects a
difference between the two groups when in reality no difference
exists. False-positive conclusion.
Type II error- β (beta) error: The chance of not detecting the difference
when in reality the difference exists. False-Negative conclusion
Power or (1- β): the complement of beta: The power represents the
chance of avoiding a false-negative conclusion; or, the chance of
detecting an effect if it really exists.
Type I error, Type II error-which is more important?
The level of acceptable Type I error (α) and Type II error (β) should
also be determined. The magnitude of Type I error is customarily
set lower (=the research is protected more rigidly) than Type II
error because the impact of a false positive error (Type I) is more
detrimental than that of false negative (Type II) error.
As the values of alpha and beta are lowered, the sample size
increases.
As the power is increased (beta is lowered), the sample size
increases.
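The effect of alpha and power on sample size can be checked numerically. The sketch below assumes the two-means formula n = 2(a + b)²σ²/(μ1 − μ2)² applied later in this presentation, with the same SD of 20 and difference of 15 used there. The function names are hypothetical. The multipliers a and b are taken from the standard normal distribution with `statistics.NormalDist` (Python 3.8+), assuming a two-sided test.

```python
from math import ceil
from statistics import NormalDist


def multiplier_alpha(alpha):
    """Two-sided standard normal multiplier for significance level alpha."""
    return NormalDist().inv_cdf(1 - alpha / 2)


def multiplier_power(power):
    """Standard normal multiplier for the desired power (1 - beta)."""
    return NormalDist().inv_cdf(power)


def n_per_group(alpha, power, sd, difference):
    """Subjects per group for comparing two means, rounded up."""
    a = multiplier_alpha(alpha)
    b = multiplier_power(power)
    return ceil(2 * (a + b) ** 2 * sd ** 2 / difference ** 2)


# Lowering alpha or raising power (lowering beta) increases n.
for alpha, power in [(0.05, 0.80), (0.01, 0.80), (0.05, 0.90)]:
    n = n_per_group(alpha, power, sd=20, difference=15)
    print(f"alpha = {alpha}, power = {power}: {n} per group")
```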
Type I and type II errors bear a reciprocal
relationship.
For a given sample size, both cannot be minimized at the same time. If
we seek to minimize type II error, the probability of type I error will go up
and vice versa.
The strategy is to strike an acceptable balance between the two, a
priori.
An ideal study that makes a researcher happy is one where the Power of
the study is high. (the study has a high chance of making a conclusion
with reasonable confidence).
Type I and type II errors- acceptable balance between the two
We must estimate an adequate sample (number of subjects) for the research so
that the study with a particular degree of certainty has acceptable ‘power’ (i.e., it
can avoid ‘type II error’) to support the null hypothesis. Hence, if no difference is
found between the groups, then, YES, this is a true finding.
                                      Researcher’s conclusion
Real situation                        Fail to reject null hypothesis            Reject null hypothesis
No difference exists (H0 is true)     Correct                                   Type I error (α), false positive error
Difference does exist (H0 is false)   Type II error (β), false negative error   Correct
Source: Hazra A. 2013. DOI: 10.5530/rjps.2013.4.1
Effect size
The effect size is the smallest difference that would be of clinical
importance; a clinically relevant assumption
The researcher determines or assumes the effect size from:
• Scientific knowledge and wisdom.
• Data available from previous publications on a related topic.
• The opinion of experts, if there is a paucity of literature on the topic.
‘Minimum clinically important difference’ is the smallest difference that
would be worth testing.
Sample size varies inversely with the square of the effect size.
P-value
P value: It is the ‘probability’ of observing an outcome at least as extreme as
the one observed, purely by chance, if the null hypothesis is true.
The concept was formally introduced by Karl Pearson in 1900 (Pearson’s
Chi-square test) and popularized by Ronald Aylmer Fisher (Fisher’s exact test).
A P-value of 0.05 means that, if there were truly no difference, 5 of 100 results
would show a difference at least as big as that observed, just by chance.
Fisher proposed the level P<0.05 (i.e., a 5 in 100 chance of results occurring
by chance) as the cut-off for statistical significance.
Historically, the originators concluded that an alpha of 0.05, or a 5 in 100
chance of being incorrect, was good enough.
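To make the cut-off concrete, the sketch below computes a two-sided P value, assuming the test statistic follows a standard normal distribution under the null hypothesis (a z-test, which is an assumption made here for simplicity). The function name and the example z values are hypothetical.

```python
from statistics import NormalDist


def two_sided_p_value(z):
    """Probability, under the null hypothesis, of a statistic at least as extreme as z."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


ALPHA = 0.05
for z in (1.0, 1.96, 2.58):  # hypothetical test statistics
    p = two_sided_p_value(z)
    verdict = "significant" if p < ALPHA else "not significant"
    print(f"z = {z:.2f}  P = {p:.4f}  {verdict} at alpha = {ALPHA}")
```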
Calculating the sample size by comparing two
means
A study to see the effect of phenylephrine on MAP as continuous variable after spinal
anaesthesia to counteract hypotension.
MAP as a continuous variable:
$$n = \frac{2\,(a + b)^2\,\sigma^2}{(\mu_1 - \mu_2)^2}$$
n = Sample size in each of the groups
μ1 = Population mean in treatment Group 1,
μ2 = Population mean in treatment Group 2
μ1−μ2 = The difference the investigator wishes to detect
σ = Population standard deviation (SD); σ² is the population variance
a = Conventional multiplier for alpha = 0.05,
b = Conventional multiplier for power = 0.80.
Value of a = 1.96, b = 0.842.
If a difference of 15 mmHg in MAP between the phenylephrine and the
placebo group is considered clinically significant (μ1 − μ2, the ‘effect size’), and it is
to be detected with 80% power and a significance level alpha of 0.05, then, taking the SD as 20 mmHg:
$n = 2 \times (1.96 + 0.842)^2 \times 20^2 / 15^2 = 27.9$
That means 28 subjects per group is the sample size.
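The worked calculation above can be reproduced with a few lines of Python. This is a minimal sketch: the function name is hypothetical, the multipliers 1.96 and 0.842 are the values given on the slide, and the alternative differences of 10 and 20 mmHg are added only to show how the result responds to the effect size.

```python
from math import ceil

A_ALPHA_005 = 1.96   # multiplier for alpha = 0.05 (value from the slide)
B_POWER_080 = 0.842  # multiplier for power = 0.80 (value from the slide)


def sample_size_two_means(sd, difference, a=A_ALPHA_005, b=B_POWER_080):
    """Subjects per group: n = 2 (a + b)^2 sd^2 / difference^2, rounded up."""
    n = 2 * (a + b) ** 2 * sd ** 2 / difference ** 2
    return ceil(n)


# Slide example: SD 20 mmHg, clinically relevant difference 15 mmHg.
print(sample_size_two_means(sd=20, difference=15), "subjects per group")

# Hypothetical alternative effect sizes with the same SD.
for difference in (10, 15, 20):
    print(difference, "mmHg ->", sample_size_two_means(20, difference), "per group")
```

Rounding up rather than to the nearest integer keeps the study at or above the planned power.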
Calculating the sample size by comparing two
proportions
A study to see the effect of phenylephrine on MAP as a binary variable after
spinal anaesthesia to counteract hypotension.
MAP as a binary outcome, below or above 60 mmHg (hypotension – yes/no):
n= The sample size in each of the groups
p1 = Proportion of subjects with hypotension in treatment Group 1
q1 = Proportion of subjects without hypotension in treatment Group 1 (1 − p1)
p2 = Proportion of subjects with hypotension in treatment Group 2
q2 = Proportion of subjects without hypotension in treatment Group 2 (1 − p2)
x= The difference the investigator wishes to detect
a= Conventional multiplier for alpha = 0.05
b= Conventional multiplier for power = 0.8