1. Statistical Methods for Continuous Variables: Part One
Comparison of two or more than two groups
Learning objectives
At the end of this session, participants should be able to:
• Define data analysis
• Become familiar with statistical methods for continuous variables
• Explain one way ANOVA
Introduction
Data Analysis
• Turning raw data into useful information
Importance of comparison:
• Evaluate magnitudes of problems
• Program evaluations
Introduction to comparison…
A) Comparison using t-test
• There are a number of different types of t-tests:
i) One sample t-test
– Test hypothesis about single population mean
One-Sample t-Test
• It is a statistical test used to determine whether the mean
of a sample differs significantly from a known or
hypothesized population mean.
Examples
1. A researcher wants to know if the average
haemoglobin level of study participants (sample) is
different from 11 g/dl.
df = n − 1
df are used to determine the critical t-value from the t-distribution table.
Steps to Perform a One-Sample t-Test
Step 6: Conclusion
• Interpret the results based on whether or not the Ho was rejected.
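As a minimal sketch of the calculation behind the one-sample t-test, the snippet below uses the standard formula t = (x̄ − μ0) / (s / √n) with df = n − 1. The haemoglobin values are hypothetical, invented only for illustration, and the function name `one_sample_t` is an assumption rather than anything used in the slides.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """Return (t, df) for testing H0: population mean == mu0."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)      # sample SD (n - 1 in the denominator)
    se = sd / math.sqrt(n)             # standard error of the mean
    return (mean - mu0) / se, n - 1


# Hypothetical haemoglobin values in g/dl (not from the course dataset)
hb = [10.2, 11.5, 9.8, 12.1, 10.9, 11.3, 10.4, 9.9, 11.0, 10.6]
t_stat, df = one_sample_t(hb, mu0=11.0)
print(f"t = {t_stat:.3f}, df = {df}")
```

The resulting |t| would then be compared with the critical t-value for the chosen α and df, as described in the steps above.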
SPSS…
SPSS output for the above data:
ii) Independent samples t-test
• Comparing the means of two independent groups to
determine if there is a statistically significant difference
between them.
E.g
o Comparing the average scores of students from two
different classes.
o Comparing the average weight loss after two different
diets/treatments.
o Comparing the average CD4 count after ART
between male and female.
Assumptions of the independent samples t-test
1. Independence of Observations
• Observations in one group should not influence observations in the
other group.
2. Normality
• The data in each group should be normally distributed.
3. Homogeneity of Variances (Equal Variances)
• The variances of the two groups should be equal.
4. Scale of Measurement
• The dependent variable should be continuous
5. Random sampling
• The data should be randomly sampled from the population
Null and Alternative Hypotheses of the independent samples t-test
• H₀: There is no significant difference between the means
of the two groups.
• H₀: μ₁ = μ₂
SPSS output for the above data:
Group Statistics (CD4 count at 36 months)
Sex      N     Mean     Std. Deviation   Std. Error Mean
Male     79    429.71   255.224          28.715
Female   131   480.34   184.128          16.087
Result of Independent t-test:
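As a rough cross-check, the t-statistic can be recomputed from the Group Statistics values (N, mean, standard deviation) alone. The sketch below computes both the pooled-variance t (equal variances assumed) and Welch's t (equal variances not assumed); the function name `t_from_summary` is hypothetical, and the printed values are only a sanity check, not a substitute for the SPSS output.

```python
import math


def t_from_summary(n1, m1, sd1, n2, m2, sd2):
    """Independent samples t-statistics computed from summary statistics."""
    diff = m1 - m2

    # Pooled-variance version (equal variances assumed), df = n1 + n2 - 2
    sp2 = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    t_pooled = diff / math.sqrt(sp2 * (1 / n1 + 1 / n2))

    # Welch version (equal variances not assumed), Welch-Satterthwaite df
    v1, v2 = sd1**2 / n1, sd2**2 / n2
    t_welch = diff / math.sqrt(v1 + v2)
    df_welch = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

    return {
        "t_pooled": t_pooled,
        "df_pooled": n1 + n2 - 2,
        "t_welch": t_welch,
        "df_welch": df_welch,
    }


# Values taken from the Group Statistics table: male vs female CD4 count
result = t_from_summary(79, 429.71, 255.224, 131, 480.34, 184.128)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Which of the two rows to read depends on the homogeneity of variances assumption listed earlier.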
Advantages:
• Controls extraneous factors
• Biological variability is eliminated
• Makes comparisons more precise
Example
• A researcher wants to know whether there is a difference
in the weights of individuals before and after a diet
program. The same individuals are weighed before and
after the program.
Assumptions for the Paired Sample t-Test
H1:μd≠0 (two-tailed)
OR
H1:μd>0 (one-tailed, positive difference)
OR
H1:μd<0 (one-tailed, negative difference)
The t-statistic in a paired sample t-test
The formula for the t-statistic:
$$t = \frac{\bar{d}}{s_d / \sqrt{n}}$$
Where:
• $\bar{d}$: The mean of the differences between
the paired observations.
• $s_d$: The standard deviation of the
differences between the paired
observations.
• $n$: The number of pairs.
The formula for degrees of freedom (df):
df = n − 1
df are used to determine the critical t-value from the t-distribution table.
Steps to calculate the Paired Sample t-Test
• State the hypotheses
• Check assumptions
• Calculate the differences between pairs
(subtract one observation from the other within each pair)
• Calculate the mean of the differences (d̄) and their standard deviation (s_d), then
compute the t-statistic
• Find the critical t-value using the df (df = n − 1) and the
significance level (α)
• Compare the t-statistic to the critical t-value
• Based on the comparison, interpret the result to decide
whether the difference is statistically significant or not
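The steps above can be sketched in a few lines of Python. The weights are hypothetical values invented for illustration, the function name `paired_t` is an assumption, and the critical value 2.365 is the two-tailed t-table entry for α = 0.05 with df = 7.

```python
import math
import statistics


def paired_t(before, after):
    """Return (t, df) for a paired sample t-test on after - before."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for b, a in zip(before, after)]  # difference within each pair
    n = len(diffs)
    d_bar = statistics.mean(diffs)                  # mean of the differences
    s_d = statistics.stdev(diffs)                   # SD of the differences
    return d_bar / (s_d / math.sqrt(n)), n - 1


# Hypothetical weights (kg) of 8 individuals before and after a diet program
before = [82.0, 90.5, 76.3, 88.1, 95.0, 79.4, 84.6, 91.2]
after = [80.1, 88.0, 76.0, 85.2, 92.3, 78.8, 82.0, 89.9]

t_stat, df = paired_t(before, after)
t_critical = 2.365  # two-tailed, alpha = 0.05, df = 7 (from a t-table)
print(f"t = {t_stat:.3f}, df = {df}")
print("reject H0" if abs(t_stat) > t_critical else "fail to reject H0")
```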
Example
• Using the SPSS dataset “Data for exercise2.sav”, test
the hypothesis that the mean CD4 count before and after
ART among HIV patients is not different at
α = 0.05. Calculate the effect size.
Solution:
• Check assumptions
• Analyze → Compare Means → Paired-Samples
T Test → move the two quantitative variables that
you are interested in comparing for each subject
into the box ‘Paired Variables’ → then click OK.
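The slides do not say which effect size is intended. One common choice for paired data, assumed here, is Cohen's d computed on the differences (mean difference divided by the standard deviation of the differences). The CD4 values below are hypothetical and the function name `cohens_d_paired` is an illustration only.

```python
import statistics


def cohens_d_paired(before, after):
    """Cohen's d for paired data: mean difference / SD of the differences."""
    diffs = [a - b for b, a in zip(before, after)]
    return statistics.mean(diffs) / statistics.stdev(diffs)


# Hypothetical CD4 counts at baseline and at follow-up for 6 patients
baseline = [150, 210, 95, 180, 230, 160]
followup = [420, 510, 300, 390, 610, 450]

d = cohens_d_paired(baseline, followup)
print(f"Cohen's d (paired) = {d:.2f}")
```

Under this definition the effect size equals the paired t-statistic divided by √n, so it can also be read off the SPSS output by hand.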
SPSS output for the above data:
Paired Samples Statistics
Pair 1                   Mean     N     Std. Deviation   Std. Error Mean
CD4 count at 36 months   461.30   210   214.483          14.801
CD4 count at Baseline    167.32   210   93.459           6.449
Conclusion: Paired t-test result
B) ANOVA (Analysis of variance)
• ANOVA is an inferential statistical technique used to
compare the mean of a numerical (continuous)
outcome variable across groups defined by an exposure
variable with two or more categories.
• Splits the total variance into component parts
(within and among groups).
Independent variable: nominal variable (experimental grouping)
Dependent variable: interval/ratio variable
Why ANOVA?
• When more than two groups are to be compared, using the
t-test is unreliable and tedious (nC2 pairwise comparisons).
$$ {}^{n}C_{k} = \binom{n}{k} = \frac{n!}{k!\,(n-k)!} $$
Suppose n = 6 and k = 2: how many combinations are possible?
In this case:
• n = 6 (total number of groups), e.g. A, B, C, D, E, F
• k = 2 (number of groups you’re choosing)
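A short sketch of this count, using Python's standard library, is given below. The last two lines are an added illustration of why many t-tests become unreliable: under the simplifying assumption that the tests are independent and each is run at α = 0.05, the chance of at least one false positive grows quickly with the number of comparisons.

```python
import math
from itertools import combinations

groups = ["A", "B", "C", "D", "E", "F"]

# Number of ways to choose 2 groups out of 6: 6! / (2! * 4!)
n_pairs = math.comb(len(groups), 2)
pairs = list(combinations(groups, 2))   # every pairwise comparison, listed once
print(n_pairs)                          # 15
print(pairs[:3])                        # [('A', 'B'), ('A', 'C'), ('A', 'D')]

# Assumption: independent tests, each at alpha = 0.05
fwer = 1 - (1 - 0.05) ** n_pairs
print(f"chance of at least one false positive: {fwer:.2f}")
```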
2. Check assumptions
9. Conclusion
• When we reject H0, we conclude that not all population
means are equal
• When we fail to reject H0, we conclude that the
population means are not significantly different from
each other
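Total:
$$SS = \sum (x - \bar{x})^2 = \sum x^2 - \frac{(\sum x)^2}{n} = 4651.8 - \frac{430.2^2}{41} = 137.85$$
Between groups:
$$SS = \sum_{i=1}^{K} \frac{(n_i \bar{x}_i)^2}{n_i} - \frac{(\sum x)^2}{n} = \frac{(16 \times 8.7125)^2}{16} + \frac{(10 \times 10.63)^2}{10} + \frac{(15 \times 12.3)^2}{15} - \frac{430.2^2}{41} = 99.89$$
df = K − 1 = 2
OR
Between groups:
$$SS = \sum_{i=1}^{K} n_i \bar{x}_i^2 - \frac{(\sum x)^2}{n} = 16 \times 8.7125^2 + 10 \times 10.63^2 + 15 \times 12.3^2 - \frac{430.2^2}{41} = 99.89$$
df = K − 1 = 2
Within groups:
$$SS = \sum_{i=1}^{K} (n_i - 1) S_i^2 = 15 \times 0.8445^2 + 9 \times 1.2841^2 + 14 \times 0.9419^2 = 37.96$$
df = n − K = 41 − 3 = 38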
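The sums of squares above can be reproduced from the group sizes, means, and standard deviations alone. The sketch below does so and also forms the F ratio as (between-groups mean square) / (within-groups mean square), which is the standard one-way ANOVA statistic; the function name `anova_from_summary` is an assumption made for this illustration.

```python
def anova_from_summary(groups):
    """One-way ANOVA from a list of (n_i, mean_i, sd_i) tuples."""
    n = sum(n_i for n_i, _, _ in groups)
    k = len(groups)
    total_sum = sum(n_i * m_i for n_i, m_i, _ in groups)   # sum of all x

    # Between groups: sum(n_i * mean_i^2) - (sum x)^2 / n
    ss_between = sum(n_i * m_i**2 for n_i, m_i, _ in groups) - total_sum**2 / n
    # Within groups: sum((n_i - 1) * S_i^2)
    ss_within = sum((n_i - 1) * s_i**2 for n_i, _, s_i in groups)

    df_between, df_within = k - 1, n - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_ratio


# Group sizes, means and SDs taken from the worked example above
groups = [(16, 8.7125, 0.8445), (10, 10.63, 1.2841), (15, 12.3, 0.9419)]
ss_b, ss_w, df_b, df_w, f_ratio = anova_from_summary(groups)
print(f"SS between = {ss_b:.2f} (df = {df_b})")
print(f"SS within  = {ss_w:.2f} (df = {df_w})")
print(f"F = {f_ratio:.2f}")
```

The between-groups and within-groups sums of squares add up to the total sum of squares (99.89 + 37.96 = 137.85), which is a useful check on hand calculations.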
Example...
One-way ANOVA with SPSS
SPSS outputs:
Details of descriptive statistics such as number of respondents, mean,
SD, minimum, maximum values in each group and the overall total
If this is < 0.05, equal variance is not assumed, so the
‘equal variance not assumed’ tests will be used rather
than the ANOVA table.
This is reported as P = 0.015.
There is a significant difference among the four types of WHO staging,
so it is appropriate to proceed to a post hoc (a posteriori) test.
Post Hoc Tests
Post hoc or Multiple Comparisons
• Are comparisons of group means made after data have
been collected.
• They do not assume any prior hypotheses.
• Most are pairwise comparisons, meaning they compare
all pairs of means, to determine if they are significantly
different.
Most frequently used ones are:
• Bonferroni’s t-method
• Tukey’s HSD (Honestly Significant Difference)
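To illustrate the idea behind Bonferroni's method, the sketch below divides the overall significance level by the number of pairwise comparisons, so that each comparison is tested at a stricter threshold. The function name `bonferroni_alpha` is hypothetical, and the example of four groups is chosen to match the four WHO stages in the SPSS output.

```python
import math


def bonferroni_alpha(k_groups, alpha=0.05):
    """Return (number of pairwise comparisons, per-comparison alpha)."""
    m = math.comb(k_groups, 2)   # all pairs of group means
    return m, alpha / m


m, adjusted = bonferroni_alpha(4)
print(f"{m} comparisons, each tested at alpha = {adjusted:.4f}")
```

Equivalently, each pairwise p-value can be multiplied by the number of comparisons and then compared with the original α.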
Conclusion
Application of ANOVA
• Applicable to quantitative variables
• Number of groups to be compared is more than two.
• More appropriate when randomized experimental design is
employed.
• But, can also be used for observational designs and
surveys.
• When ANOVA is significant and H₀ is rejected, it is
important to do a post hoc test (pairwise comparison).
• When assumptions required for ANOVA are not met, a non-
parametric test equivalent to ANOVA (Kruskal Wallis test) is
performed.
Corresponding non-parametric tests for each of the parametric tests
Parametric Test              Non-Parametric Test
One sample t-test            Wilcoxon Signed-Rank Test
Independent samples t-test   Mann-Whitney U Test (or Wilcoxon rank-sum test)
Paired sample t-test         Wilcoxon Signed-Rank Test (for paired data)
One way ANOVA                Kruskal-Wallis H Test