
Unit 545 Differences Between Two or More Groups Non Parametric With Answers

Uploaded by

z13612909240
Copyright
© © All Rights Reserved

Assignment for Unit 545

Does a new medicine reduce asthma, methods to improve prenatal care, and more …
Henk van der Kolk

09 January 2024

Goals of the practice questions and this assignment


In this assignment, you will learn how to check whether two or more groups are different
in a population, using sample statistics about these groups. This time, however, the samples
are small and the distribution of the variables is not close to normal. This implies that the
standard (and powerful) statistical methods for testing group differences cannot be used.
Instead, we will use two non-parametric alternatives: the Mann–Whitney–Wilcoxon test and
the Kruskal–Wallis test.

Selecting tests
Which test would you select?
1. Two groups of students were asked the question: “How likely is it that you will go on
holiday this year?” The three answering categories of this question were “not very
much”, “somewhat” and “very much”. Suppose you want to test the (null-)
hypothesis that in the population the two groups do not differ in the way they
respond to the question. Which variant of the linear model or non-parametric test
would you use for this?
The Mann–Whitney–Wilcoxon (MWW) test. As the dependent variable is ordinal, a non-parametric
test should be used. And since there are only two groups, the MWW test is to be preferred.
It tests whether the two groups come from the same distribution; if the distributions have
the same shape, this amounts to testing whether the medians in the groups are different.
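As an illustration of how such a test could be run in R, here is a minimal sketch. The answers below are made up for this example (they are not data from the assignment), and the group labels "A" and "B" are hypothetical. The three categories are coded 1 to 3 so that they can be ranked.

```r
# Hypothetical answers, made up for illustration only:
# 1 = "not very much", 2 = "somewhat", 3 = "very much"
holiday <- data.frame(
  group  = rep(c("A", "B"), each = 8),
  answer = c(1, 1, 2, 2, 2, 3, 1, 2,
             2, 3, 3, 2, 3, 3, 1, 3)
)

# With only three categories there are many ties, so an exact p-value
# cannot be computed; exact = FALSE asks for the normal approximation.
holiday_test <- wilcox.test(answer ~ group, data = holiday, exact = FALSE)
holiday_test
```

Only the order of the categories is used here, not the distance between them, which is why the test suits an ordinal variable.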
2. Three groups of 60 students each (3 x 60) were randomly assigned to three teaching methods.
The grade afterwards was used to test the difference between the teaching methods.
Which variant of the linear model or non-parametric test would you use for this?
Since there are three groups and the sample is rather big, a (Welch) ANOVA or a linear
model with a nominal independent variable (dummies), which is the same as a (non-Welch)
ANOVA, is to be preferred. Parametric tests are generally more powerful than their
non-parametric alternatives when their assumptions hold.
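The claim that a linear model with dummies is the same as a (non-Welch) ANOVA can be checked with a small sketch. The grades below are simulated under assumed group means (6.0, 6.5 and 7.0) and are not real data; the method labels "A", "B" and "C" are hypothetical.

```r
set.seed(1)  # arbitrary seed, only to make the sketch reproducible

# Simulated grades for 3 x 60 students (assumption: not real data)
teaching <- data.frame(
  method = factor(rep(c("A", "B", "C"), each = 60)),
  grade  = rnorm(180, mean = rep(c(6.0, 6.5, 7.0), each = 60), sd = 1)
)

# Linear model with a nominal independent variable (dummy coding)
lm_fit <- lm(grade ~ method, data = teaching)
f_lm   <- summary(lm_fit)$fstatistic["value"]

# Classical one-way ANOVA (equal variances assumed)
f_anova <- oneway.test(grade ~ method, data = teaching,
                       var.equal = TRUE)$statistic

# Welch ANOVA (no equal-variance assumption), for comparison
f_welch <- oneway.test(grade ~ method, data = teaching)$statistic

c(lm = unname(f_lm), anova = unname(f_anova), welch = unname(f_welch))
```

The first two F values should coincide; the Welch version will usually be close but not identical.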
3. Five relatively small groups of politicians have responded to the statement: “Global
warming is fake news”. The politicians were able to respond to this question on a 7-
point Likert scale (with 1 = “not at all agree” to 7 = “totally agree”). Suppose a
researcher wants to test the hypothesis that in the population the groups do not
differ in the way they respond to the question. The researcher has no problem in
treating the Likert scale as an interval-level (quantitative) variable instead of
ordinal. However, she finds that when she runs a linear model, the residuals do not
show normal distributions in each group. Even after trying several different ways of
transforming the dependent variable, the residuals are far removed from showing a
normal distribution.
What would you recommend the researcher? Here are the options:
A. Consider a Kruskal-Wallis test.
B. Remove data points that cause the non-normality and run the linear model again.
C. Collect more data until the residuals show a more normal distribution and run the linear
model again.
D. There might be errors in the data: check if the problem goes away if you change a few
values in the data matrix. Then run the linear model again.
Option A, the Kruskal-Wallis test. Removing data points is often arbitrary and reduces the
possibilities to generalize to the population of politicians (so no B). Collecting more data
requires more politicians, and why would we expect the distribution to change if we do that?
(so no C). Checking data is always good, but selectively changing values you do not like is
confirmation bias (so no D).
You will now do some of the tests covered in this unit.

Example 1: A clinical trial about an asthma medicine


Consider a clinical trial designed to investigate the effectiveness of a new drug to reduce
symptoms of asthma in children. A total of n = 10 participants are randomly assigned to
two groups to receive either the new drug or a placebo. Participants are asked to record the
number of episodes of shortness of breath over a 1 week period following receipt of the
assigned treatment.
4. Create a data frame called ‘data’ by using the following command lines:
In the data frame the C refers to the Control group, the T refers to the Treatment group and
the Asthma variable refers to the number of episodes of shortness of breath over a 1 week
period following receipt of the assigned treatment.
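The command lines themselves are missing from this copy of the assignment. The sketch below is a reconstruction, not the original commands: the values are an assumption, chosen because they reproduce the outputs shown further on (group means of 6.8 and 3.2 in the linear model, and W = 22 in the Mann–Whitney–Wilcoxon test).

```r
# ASSUMPTION: these values are not given in this copy of the assignment.
# They are a reconstruction that matches the outputs shown later on.
data <- data.frame(
  group  = rep(c("C", "T"), each = 5),
  asthma = c(7, 5, 6, 4, 12,   # C = control group
             3, 6, 4, 2, 1)    # T = treatment group
)

# Group means; the linear model further on reports 6.8 for C
# and 6.8 - 3.6 = 3.2 for T
group_means <- tapply(data$asthma, data$group, mean)
group_means
```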
5. Suppose you would have done this study with a larger group of participants (say 2
times 40 people), which test would you have used?
Since this is about differences between two groups, either a dummy variable in a linear
model OR a (Welch) independent samples t-test would be the best method. With the actual
data, however, the dependent variable is skewed (look at the histogram) and the number of
cases is small, so here you would opt for the (non-parametric) Mann–Whitney–Wilcoxon test.
6. If the group is small (like in this example) and the data are skewed, you would opt
for the (non parametric) Mann–Whitney–Wilcoxon test. Check the skewness of the
variable by creating a histogram. What do you conclude?
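library(tidyverse)  # provides %>% and ggplot()

data %>%
  ggplot(aes(x = asthma)) +
  geom_histogram()
# Optional: put the density on the y-axis and overlay a normal curve
#   geom_histogram(aes(y = after_stat(density))) +
#   stat_function(fun = dnorm,
#                 args = list(mean = mean(data$asthma),
#                             sd = sd(data$asthma)))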

We indeed have a skewed dependent variable with a small number of cases, so we opt for the
(non parametric) Mann–Whitney–Wilcoxon test.
7. Create a scatter (or jitter) plot with these data. What do you think: are the groups
really different?
data %>%
  ggplot(aes(x = group, y = asthma)) +
  geom_point()

The groups look different! However, the sample size is very small, so maybe this difference
is just the result of random sampling?
8. Check whether the groups are really different by using the Mann–Whitney–
Wilcoxon test. Check the RHelpdesk file for this when needed.
test <- wilcox.test(asthma ~ group,
                    data = data,
                    exact = FALSE)

test

##
## Wilcoxon rank sum test with continuity correction
##
## data: asthma by group
## W = 22, p-value = 0.05855
## alternative hypothesis: true location shift is not equal to 0

9. How do you interpret the outcome of the test?

Under the null hypothesis the groups come from the same distribution (that is, from the
same population, so there is no difference between the medians of both groups). Any
difference between the groups in the sample is then due to chance alone. The W statistic
and the associated table give a p-value. This p-value is the probability of finding a
difference at least as large as the one in our sample if the null hypothesis were true.
In this case, the p-value is low, but still above the cut-off point of 0.05, so we do NOT
reject the null hypothesis: we cannot exclude the possibility that there is actually no
difference between the two groups. This does not prove that the groups are the same; with
such a small sample a real difference can easily go undetected.
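To see where the W statistic comes from, it can be computed by hand from the ranks. This is only a sketch: the values below are a reconstruction (an assumption, since the data commands are missing from this copy) that is consistent with the W = 22 reported above.

```r
# ASSUMPTION: reconstructed values, consistent with W = 22 reported above
control   <- c(7, 5, 6, 4, 12)
treatment <- c(3, 6, 4, 2, 1)

# Rank all 10 observations together; tied values get the average rank
all_ranks <- rank(c(control, treatment))
rank_sum_control <- sum(all_ranks[seq_along(control)])

# R reports the rank sum of the first group minus its smallest possible value
n1 <- length(control)
w_by_hand <- rank_sum_control - n1 * (n1 + 1) / 2
w_by_hand

# The same statistic from the built-in test
w_builtin <- unname(wilcox.test(control, treatment, exact = FALSE)$statistic)
w_builtin
```

If the two groups were completely mixed, W would be close to half of its maximum (here 25 / 2 = 12.5); values far from that point towards a difference between the groups.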
10. Let us ignore some observations about the skewness of the data and do a simple
parametric test, assuming equal variances for the groups. What do you see?
test_2 <- data %>%
lm(asthma ~ group, .)

summary(test_2)

##
## Call:
## lm(formula = asthma ~ group, data = .)
##
## Residuals:
##    Min     1Q Median     3Q    Max
##  -2.80  -1.65  -0.50   0.65   5.20
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)    6.800      1.158   5.874 0.000372 ***
## groupT        -3.600      1.637  -2.199 0.059081 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.588 on 8 degrees of freedom
## Multiple R-squared: 0.3767, Adjusted R-squared: 0.2988
## F-statistic: 4.836 on 1 and 8 DF, p-value: 0.05908

The outcome is similar (p = 0.059): in this case, too, we do NOT reject the null hypothesis.
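The linear model with one dummy is equivalent to an independent samples t-test that assumes equal variances. A minimal sketch of that check follows; the values are again a reconstruction (an assumption) that matches the output above, and the Welch version is added only for comparison.

```r
# ASSUMPTION: reconstructed values that reproduce the lm output above
data <- data.frame(
  group  = rep(c("C", "T"), each = 5),
  asthma = c(7, 5, 6, 4, 12, 3, 6, 4, 2, 1)
)

# Equal variances assumed: same t and p-value as the groupT row of the lm
t_equal <- t.test(asthma ~ group, data = data, var.equal = TRUE)

# Welch version: does not assume equal variances
t_welch <- t.test(asthma ~ group, data = data)

c(equal_var = t_equal$p.value, welch = t_welch$p.value)
```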

Example 2: Prenatal care


A new approach to prenatal care is proposed for pregnant women living in a rural
community. The new program involves in-home visits during the course of pregnancy in
addition to the usual or regularly scheduled visits. A pilot randomized trial with 15
pregnant women is designed to evaluate whether women who participate in the program
deliver healthier babies than women receiving usual care.

The outcome is the APGAR score measured 5 minutes after birth.
APGAR scores range from 0 to 10 with scores of 7 or higher considered normal (healthy),
4-6 low and 0-3 critically low (see footnote 1).
The data of the study can be created using the following commands.
# Example again stolen from
# https://sphweb.bumc.bu.edu/otlt/mph-modules/bs/bs704_nonparametric/bs704_nonparametric4.html

# Create a data frame using the following commands
# (U = usual care, N = new program):
apgar <- c(8, 7, 6, 2, 5, 8, 7, 3, 9, 9, 7, 8, 10, 9, 6)
group <- c("U", "U", "U", "U", "U", "U", "U", "U", "N", "N", "N", "N",
           "N", "N", "N")
# data.frame() keeps apgar numeric, so no conversion is needed afterwards
data_2 <- data.frame(group, apgar)

11. Why would you opt for a (non parametric) Mann–Whitney–Wilcoxon test in this
context?
Because the samples are small, there are two groups and the APGAR scores are not normally
distributed. Check this by creating a histogram.
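A minimal sketch of such a histogram is given below. Base R is used here so that it runs without additional packages; the bin boundaries are a choice made for this sketch (one bin per possible score), not something prescribed by the assignment.

```r
apgar <- c(8, 7, 6, 2, 5, 8, 7, 3, 9, 9, 7, 8, 10, 9, 6)

# One bin per possible APGAR score (0 to 10)
apgar_hist <- hist(apgar,
                   breaks = seq(-0.5, 10.5, by = 1),
                   main = "APGAR scores, both groups",
                   xlab = "APGAR score")

# Number of babies per score, as a table
table(apgar)
```

Most scores pile up at the high end with a few low values, so the distribution has a tail to the left.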
12. Test whether the groups are really different.
test_4 <- wilcox.test(apgar ~ group,
                      data = data_2,
                      exact = FALSE)

test_4

##
## Wilcoxon rank sum test with continuity correction
##
## data: apgar by group
## W = 47.5, p-value = 0.02609
## alternative hypothesis: true location shift is not equal to 0
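The test output does not show in which direction the groups differ. A simple way to see this is to compare the medians per group, as in the sketch below (it assumes, from the description of the trial, that "U" stands for usual care and "N" for the new program).

```r
apgar <- c(8, 7, 6, 2, 5, 8, 7, 3, 9, 9, 7, 8, 10, 9, 6)
group <- c(rep("U", 8), rep("N", 7))  # assumed: U = usual care, N = new program

# Median APGAR score per group
group_medians <- tapply(apgar, group, median)
group_medians
```

In this sample the median is 9 in group N and 6.5 in group U.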

13. How do you interpret the outcome of the test?


Under the null hypothesis the groups come from the same distribution (that is, from the
same population, so there is no difference between the groups). Any difference between the
groups in the sample is then due to chance alone. The W statistic and the associated table
give a p-value. This p-value is the probability of finding a difference at least as large
as the one in our sample if the null hypothesis were true. In this case, the p-value is
low, even below the cut-off point of 0.05, so we reject the null hypothesis. This supports
the alternative hypothesis: it seems there really is a difference between the two groups.

Footnote 1: The APGAR score is based on 5 criteria: Appearance of the skin, Pulse rate,
Grimace (reflex or reaction to stimulation), Activity (or muscle tone), and Respiration.
Each of the 5 criteria is rated as 0 (very unhealthy), 1 or 2 (healthy) based on specific
clinical criteria. The APGAR score is the sum of the 5 component scores and ranges from 0
to 10. Infants with scores of 7 or higher are considered normal, 4-6 low and 0 to 3
critically low. Sometimes the APGAR score is measured repeatedly, for example at 1, 5 and
10 minutes after birth, and then analyzed.

Suppose now we do not have two but several groups. Again the sample is small and the
distribution of the dependent variable is not close to normal. So the standard (and
powerful) statistical methods for testing group differences may not be used.

Example 3: intrinsic motivation for health-related behavior


We want to compare intrinsic motivation for health-related behaviour across four age
groups of adults in the Netherlands: 21-30, 31-40, 41-50, and 51-60. We collected data
from 40 people: a random sample of 10 people from each of these age groups. We
compared the groups using a linear model and checked our assumptions. Look at the
output below.
library("tidyverse")

set.seed(19640304)
n <- 40
my_data <- c(1:n) %>% as.data.frame() %>% rename(RNR = ., .)
# 10 people in each of the 4 age groups
my_data$class <- rep(1:4, each = 10)

mu <- 100
sigma <- 0.65
my_data$int_mot <- rlnorm(n, mu, sigma)

my_data$int_mot <- scale(my_data$int_mot/max(my_data$int_mot))

my_data$int_mot <- round(my_data$int_mot, 1)

my_data %>%
  ggplot(aes(x = int_mot)) +
  geom_histogram(aes(y = after_stat(density))) +
  stat_function(fun = dnorm,
                args = list(mean = mean(my_data$int_mot),
                            sd = sd(my_data$int_mot)))

shapiro.test(my_data$int_mot)

##
## Shapiro-Wilk normality test
##
## data: my_data$int_mot
## W = 0.88437, p-value = 0.0006918

14. Would you recommend we use this linear model, or would you recommend a non-
parametric alternative? Why? And if you recommend a non-parametric alternative:
which test would you recommend?
There are clear deviations from normality in the plots. There are only ten participants per
group so we cannot assume that a linear model would be robust to these deviations from
normality. A non-parametric test is more appropriate. As we are comparing several groups,
the Kruskal-Wallis test for group comparisons is recommended.
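A minimal sketch of how the recommended test could be run on data generated in the same way is given below. It uses base R only. Two things are assumptions of this sketch: the age group labels are assigned as 10 consecutive people per group, and the rescaling and rounding steps are skipped because a rank-based test only uses the order of the values (rounding can introduce ties, so the result may differ slightly from a run on the rounded scores).

```r
set.seed(19640304)
n <- 40

# ASSUMPTION: 10 consecutive people per age group
age_group <- factor(rep(1:4, each = 10))

# Skewed (log-normal) scores, generated as in the example above
int_mot <- rlnorm(n, meanlog = 100, sdlog = 0.65)

# Kruskal-Wallis test: do the four groups come from the same distribution?
kw_motivation <- kruskal.test(int_mot ~ age_group)
kw_motivation
```

With four groups the test statistic is compared to a chi-squared distribution with 4 - 1 = 3 degrees of freedom.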

Example 4: exercise class and depressed mood
A total of 21 people were randomly assigned to participate in either an aerobics class, a
spinning class or a pilates class. After class they were asked to report how depressed they
felt at that moment. Depressed mood was measured using 1 item with 5 indicating high
depression and 1 indicating low depression. Check the output below.
set.seed(19640304)
n <- 21
my_data <- c(1:n) %>% as.data.frame() %>% rename(RNR = ., .)
# The 7 values are recycled over the 21 rows, so the group sizes are 9, 6 and 6
my_data$group <- rep(1:3, len = 7)
my_data$type <- "other"
my_data$type <- ifelse(my_data$group == 1, "aerobics", my_data$type)
my_data$type <- ifelse(my_data$group == 2, "spinning", my_data$type)
my_data$type <- ifelse(my_data$group == 3, "pilates", my_data$type)

mu <- 100
sigma <- 0.3
# Log-normal scores; the mean on the log scale increases with the group number
my_data$depress <- rlnorm(n, mu + 0.5*my_data$group, sigma)

# Rescale to the 1-5 range of the item
my_data$depress <- my_data$depress/max(my_data$depress)
my_data$depress <- round((my_data$depress)*4 + 1,1)

kw_test <- kruskal.test(depress ~ type,
                        data = my_data)

my_data %>%
  ggplot(aes(x = type, y = depress)) +
  geom_boxplot()

kw_test

15. What was tested here? Explain in your own words and give the null hypothesis in
words.
Explanation: It was tested whether there is a difference in the distribution of depressed
mood after participating in either a spinning class, an aerobics class or a pilates class.
Null hypothesis in words: the three distributions of depressed mood are equal, OR the
medians in all three groups are the same.
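To make clear what the Kruskal-Wallis test actually computes, the sketch below calculates the test statistic H by hand from the rank sums per group and compares it with the built-in function. The scores are made up for this illustration (they are not the simulated data of this example) and contain no ties, so no tie correction is needed.

```r
# Hypothetical scores without ties, made up for illustration only
scores <- c(1.2, 1.9, 2.4, 1.5,    # aerobics
            2.1, 3.0, 2.7, 3.3,    # spinning
            2.2, 3.8, 4.1, 3.5)    # pilates
type <- rep(c("aerobics", "spinning", "pilates"), each = 4)

N     <- length(scores)
ranks <- rank(scores)                  # rank all observations together
R     <- tapply(ranks, type, sum)      # rank sum per group
n_i   <- tapply(ranks, type, length)   # group sizes

# H = 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1)
h_by_hand <- 12 / (N * (N + 1)) * sum(R^2 / n_i) - 3 * (N + 1)

# The same statistic from the built-in test
h_builtin <- unname(kruskal.test(x = scores, g = factor(type))$statistic)

c(by_hand = h_by_hand, builtin = h_builtin)
```

If all groups had similar rank sums (relative to their size), H would be close to zero; the more the rank sums differ, the larger H becomes.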
16. Which group reported the most depressed mood?
groups <- my_data %>%
  group_by(type) %>%
  summarise(median = median(depress))

NOT <- ifelse(kw_test$p.value >= 0.05, "not ", "")

answer <- paste("Since the p-value of the Kruskal-Wallis test is ",
                kw_test$p.value, " we do ", NOT, "reject the null hypothesis of there ",
                "being no differences in the medians of the groups.", sep = "")

The boxplot already shows which group has the highest median; printing 'groups' gives the
median per type.


17. What would your conclusion be, regarding the relationship between exercise class
and depressed mood?
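Since the p-value of the Kruskal-Wallis test is 0.00038189289419122 we do reject the null
hypothesis of there being no differences in the medians of the groups. In other words,
depressed mood after class differs between at least two of the three exercise classes.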
<< END OF THE ASSIGNMENT>>
