Wilcoxon Rank-Sum Test
Presentation by:
Sahil Jain
IIIT-Delhi
» Generally used when the normality assumption for the
samples does not hold and the sample sizes are small
» Non-parametric statistical hypothesis test for
assessing whether one of two samples of
independent observations tends to have larger
values than the other
» Select a random sample from each of the
populations
» Let n1 and n2 be the numbers of observations
in the smaller and larger samples, respectively
» Arrange the combined n1 + n2 observations in
ascending order and assign the ranks 1, 2, …,
n1 + n2 to them
» In the case of ties, assign each tied
observation the mean of the ranks the tied
observations would otherwise occupy
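The ranking step, including the mean-rank rule for ties, can be sketched in a few lines of Python. This is only an illustration: the function name `average_ranks` and the two small samples are hypothetical and do not come from the slides.

```python
def average_ranks(values):
    """Return 1-based ranks for values; tied values share their mean rank."""
    # Indices of the observations, sorted by observation value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        # Mean of the ranks pos+1 .. end+1 that the tied block occupies.
        mean_rank = (pos + 1 + end + 1) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


# Hypothetical data: smaller sample first, then the larger sample.
sample_small = [3.1, 4.0, 2.2]
sample_large = [4.0, 1.5, 5.0, 2.9]
combined_ranks = average_ranks(sample_small + sample_large)
print(combined_ranks)
```

The two observations equal to 4.0 would occupy ranks 5 and 6, so each receives the mean rank 5.5.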
» Let w1 and w2 be the sums of the ranks of the smaller
and larger samples; then u1 = w1 − n1(n1 + 1)/2 and
u2 = w2 − n2(n2 + 1)/2
» Our decision is based on the value of the test statistic
U (the random variable whose observed value is u)
» For a one-tailed test: u1 or u2
» For a two-tailed test: u = min(u1, u2)
» The null hypothesis is rejected whenever the
appropriate statistic U1, U2 or U assumes a value less
than or equal to the desired critical value:
Test Statistic ≤ Critical Value
H0         H1         Compute
µ1 = µ2    µ1 < µ2    u1
           µ1 > µ2    u2
           µ1 ≠ µ2    u
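The decision rule in the table can be written as a small lookup, sketched below in Python. The function names, the labels "less", "greater" and "two-sided", and the numbers in the usage lines are all assumptions made for illustration; in practice the critical value is read from a table for the given n1, n2 and α.

```python
def select_statistic(u1, u2, alternative):
    """Pick the statistic to compare with the critical value.

    alternative: "less" (H1: mu1 < mu2), "greater" (H1: mu1 > mu2)
    or "two-sided" (H1: mu1 != mu2).
    """
    if alternative == "less":
        return u1
    if alternative == "greater":
        return u2
    if alternative == "two-sided":
        return min(u1, u2)
    raise ValueError(f"unknown alternative: {alternative!r}")


def reject_null(statistic, critical_value):
    """Reject H0 when the statistic is less than or equal to the critical value."""
    return statistic <= critical_value


# Placeholder numbers, not taken from any table or data set.
u1, u2, critical_value = 12, 44, 15
for alternative in ("less", "greater", "two-sided"):
    stat = select_statistic(u1, u2, alternative)
    print(alternative, stat, reject_null(stat, critical_value))
```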
Brand A 2.1 4.0 6.3 5.4 4.8 3.7 6.1 3.3
Brand B 4.1 0.6 3.1 2.5 4.0 6.2 1.6 2.2 1.9 5.4
The nicotine contents of two brands of cigarettes,
measured in mg, were found to be as given in the table.
Test the hypothesis, at the 0.05 level of significance, that
the median nicotine contents of the two brands are equal
against the alternative that they are unequal.
» H0 : µ1 = µ2
» H1 : µ1 ≠ µ2
» α = 0.05
» n1 = 8
» n2 = 10
» Critical Region: u ≤ 17 (From Table)
» Computation Steps:
Arranging observations in ascending
order and assigning ranks from 1 to 18.
DATA RANKS BRAND
0.6 1 B
1.6 2 B
1.9 3 B
2.1 4 A
2.2 5 B
2.5 6 B
3.1 7 B
3.3 8 A
3.7 9 A
4.0 10.5 A
4.0 10.5 B
4.1 12 B
4.8 13 A
5.4 14.5 A
5.4 14.5 B
6.1 16 A
6.2 17 B
6.3 18 A
• w1 = 4+8+9+10.5+13+14.5+16+18 = 93
• w2 = 1+2+3+5+6+7+10.5+12+14.5+17 = 78
• Therefore, u1 = 93-((8*9)/2) = 57
• u2 = 78-((10*11)/2) = 23
• Min(u1, u2) = 23 ( not ≤ 17)
Decision: Do not reject the null hypothesis H0;
conclude that there is no significant
difference in the median nicotine contents of the
two brands of cigarettes at the 0.05 significance
level.
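The arithmetic of this example can be checked with a short Python script. It is a sketch rather than a general implementation: the helper `mean_rank` is a hypothetical name, and the critical value 17 is simply copied from the slide instead of being looked up.

```python
brand_a = [2.1, 4.0, 6.3, 5.4, 4.8, 3.7, 6.1, 3.3]            # smaller sample, n1 = 8
brand_b = [4.1, 0.6, 3.1, 2.5, 4.0, 6.2, 1.6, 2.2, 1.9, 5.4]  # larger sample, n2 = 10

combined = sorted(brand_a + brand_b)


def mean_rank(x):
    """Rank of x in the combined sample; ties receive the mean of their ranks."""
    first = combined.index(x) + 1   # smallest rank occupied by x
    count = combined.count(x)       # number of observations tied at x
    return first + (count - 1) / 2


n1, n2 = len(brand_a), len(brand_b)
w1 = sum(mean_rank(x) for x in brand_a)   # rank sum of Brand A
w2 = sum(mean_rank(x) for x in brand_b)   # rank sum of Brand B
u1 = w1 - n1 * (n1 + 1) / 2
u2 = w2 - n2 * (n2 + 1) / 2
u = min(u1, u2)                           # two-tailed test statistic

critical_value = 17                       # value quoted on the slide
reject_h0 = u <= critical_value
print(w1, w2, u1, u2, u, reject_h0)       # 93.0 78.0 57.0 23.0 23.0 False
```

As a consistency check, w1 + w2 equals 18 · 19 / 2 = 171 and u1 + u2 equals n1 · n2 = 80.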
In a genetic inheritance study discussed by
Margolin [1988], samples of individuals from several
ethnic groups were taken. Blood samples
were collected from each individual and several variables
were measured.
Here, we want to test the hypothesis that the median of
blood samples for Native Americans is the same as that
for Caucasians against the alternative hypothesis that the
median of the blood samples for Native Americans is less
than that for Caucasians.
n1 =8
n2 =10