Lecture On Non-Parametric Statistics
A. Learning Outcomes
B. Course Contents
The following is a sample problem to demonstrate the application of the above formula:
Sample Problem:
Hypothetical data from a youth sexuality study were used to examine the difference in engagement
in premarital sex (PMS) between male and female adolescents aged 16 to 19. The following data
were obtained. Compute χ². Use α = .05:
Answer:
1. State the Problem: Is there a significant difference in the proportion of male and female
adolescents who engage in PMS?
2. State the Hypotheses:
H0: The proportion of male and female adolescents who have engaged in PMS is the same.
Ha: The proportion of male and female adolescents who have engaged in PMS is not the same.
LECTURE ON INFERENTIAL STATISTICS: NON-PARAMETRIC TESTS
Professor: Elmer G. De Jose, PhD, RPsy, RPm
6. Decision on H0: Reject H0, since the computed value of 104.337 is greater than the
critical value of 3.84.
7. State the Conclusion: There is a significant difference in the proportion of male and female
adolescents who engage in PMS. A cursory look at the data reveals that a higher proportion of
males (33.3%) than females (18.2%) have engaged in PMS.
2. McNemar Test
The McNemar Test, or McNemar Chi-square Test, is a statistical test used with paired
nominal dichotomous data. It determines whether a significant change in the proportion of
paired data has occurred. It tests the null hypothesis that pb = pc, where b and c are the two
discordant cells of the 2 × 2 table (the pairs whose response changed). The formula is:
χ² = (b − c)² / (b + c)
Sample:
To illustrate the use of this test, suppose we test the effectiveness of a new vaccine in
controlling Disease X. Below are data on individuals before and after the administration of the vaccine.
Do we have evidence that the vaccine is effective in controlling the spread of Disease X? Test
H0 at α = 0.01.
4. Calculate the McNemar Statistic: χ² = (80 − 100)² / (80 + 100) = (−20)² / 180 = 400 / 180 = 2.22
5. Decision on H0: Since χ² = 2.22 < χ²crit = 6.63, H0 is not rejected.
6. State the conclusion: The results do not provide sufficient evidence that the vaccine is
effective in controlling the spread of Disease X at the 0.01 level of significance (99% confidence level).
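The McNemar computation above can be checked with a short Python sketch. The function name `mcnemar_chi_square` is illustrative only, and the statistic is computed without a continuity correction, matching the worked arithmetic; the critical value 6.63 is the one quoted in the text.

```python
def mcnemar_chi_square(b, c):
    """McNemar chi-square (no continuity correction) from the two discordant counts."""
    if b + c == 0:
        raise ValueError("The test is undefined when there are no discordant pairs.")
    return (b - c) ** 2 / (b + c)


# Discordant counts taken from the worked example (80 and 100).
chi_sq = mcnemar_chi_square(80, 100)

# Critical value quoted in the text for df = 1 and alpha = .01.
critical_value = 6.63
reject_h0 = chi_sq > critical_value

print(round(chi_sq, 2), reject_h0)
```

Only the two discordant cells enter the statistic; the concordant pairs (those whose status did not change) carry no information about change and are ignored.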
3. Cochran Q Test
The Cochran Q Test is a nominal-level test of significance for k related samples. An
extension of the McNemar Test, it examines the differences among three or more related
sets of proportions of a dichotomous variable. It tests the null hypothesis that the proportion
of the dichotomous response is the same for all groups. The formula is:
1. State the Problem: Is there a significant difference in the effectiveness of the four VCTs?
2. Hypotheses: H0: The four VCTs are equally effective.
Ha: At least one of the VCTs is more effective than the others.
3. Determine the Critical Value: The df is equal to k − 1, or 4 − 1 = 3. Referring to Appendix C,
the critical value is equal to 7.82.
Rater   VCT 1   VCT 2   VCT 3   VCT 4   Total   Total²
7       0       1       0       1       2       4
8       0       1       0       1       2       4
9       0       0       0       1       1       1
10      1       1       0       0       2       4
11      0       0       0       1       1       1
12      1       1       1       1       4       16
G       3       9       6       7       25      65
G²      9       81      36      49
5. Decision on H0: Since χ² = 6.43 < χ²crit = 7.82, we do not reject H0.
6. State the conclusion: The data do not show a significant difference in the effectiveness of the
four platforms when it comes to features and interconnectivity.
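The computed value of 6.43 can be reproduced from the totals in the table. The sketch below assumes the usual form of the Cochran Q statistic, Q = (k − 1)[kΣG² − (ΣG)²] / (kΣL − ΣL²), where G is a column (treatment) total and L is a row (rater) total. This form and the function name `cochran_q` are assumptions made for illustration.

```python
def cochran_q(column_totals, sum_row_totals, sum_row_totals_squared):
    """Cochran Q from column totals G, the sum of row totals L, and the sum of L squared."""
    k = len(column_totals)
    sum_g = sum(column_totals)
    sum_g_squared = sum(g * g for g in column_totals)
    numerator = (k - 1) * (k * sum_g_squared - sum_g ** 2)
    denominator = k * sum_row_totals - sum_row_totals_squared
    return numerator / denominator


# Totals read from the G row and the last two columns of the table above.
q = cochran_q(column_totals=[3, 9, 6, 7], sum_row_totals=25, sum_row_totals_squared=65)

# Critical value quoted in the text for df = 3 and alpha = .05.
critical_value = 7.82
reject_h0 = q > critical_value

print(round(q, 2), reject_h0)
```

With these totals the numerator is 3 × (4 × 175 − 625) = 225 and the denominator is 4 × 25 − 65 = 35, which agrees with the decision step.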
4. Mann-Whitney U Test
The Mann-Whitney U Test, also known as the Wilcoxon Rank-Sum Test, is used with
ordinal-level data. It tests for a significant difference between the medians of two independent
populations. It is the alternative to the t test for independent samples when the parametric
assumptions are not met. The formula is:
U1 = n1 n2 + [n1 (n1 + 1) / 2] − R1   and   U2 = n1 n2 + [n2 (n2 + 1) / 2] − R2
Where: U = the lower value between U1 and U2
n1, n2 = number of observations in the first and second groups
R1 = sum of ranks of the first group
R2 = sum of ranks of the second group
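As a minimal sketch of how U1 and U2 are obtained, the Python code below ranks the pooled scores (ties share the mean rank) and applies the two formulas. The two small samples are hypothetical and are not the salary data of the sample problem; the function names are illustrative.

```python
def tied_ranks(values):
    """Rank values from lowest to highest; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + 1 + end + 1) / 2  # mean of the 1-based positions pos+1 .. end+1
        for j in range(pos, end + 1):
            ranks[order[j]] = mean_rank
        pos = end + 1
    return ranks


def mann_whitney_u(group1, group2):
    """Return (U, U1, U2), where U is the lower of U1 and U2."""
    n1, n2 = len(group1), len(group2)
    ranks = tied_ranks(list(group1) + list(group2))
    r1 = sum(ranks[:n1])  # sum of ranks of the first group
    r2 = sum(ranks[n1:])  # sum of ranks of the second group
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2
    return min(u1, u2), u1, u2


# Hypothetical scores, for illustration only.
group_a = [12, 15, 11, 18]
group_b = [14, 20, 22, 19, 25]
u, u1, u2 = mann_whitney_u(group_a, group_b)
print(u, u1, u2)
```

A useful arithmetic check is that U1 + U2 always equals n1 × n2.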
Sample:
The salaries of 15 male and 10 female managers in an engineering firm were compared. At
α = .05, test the H0 that there is no difference in salary between the sexes.
Answer:
1. State the Problem: Is there a significant difference in the salary of male and female
managers of an engineering firm?
2. Hypotheses: H0: There is no significant difference in the salary of male and female
managers.
Ha: There is a significant difference in the salary of male and female
managers.
3. Determine the Critical Value: Refer to Appendix D. The critical value is obtained at
the intersection of n1 = 15 and n2 = 10, which is equal to 44. We will reject the
null hypothesis if the computed value is smaller than the critical value (U < Ucrit).
The Wilcoxon Signed Rank test (T) is a non-parametric test used to compare two related,
paired, matched or repeated measurements (e.g. pre-test and post-test measurements of a single
sample). It is the counterpart of the t test for related samples when one of the parametric
requirements is not satisfied. The following case illustrates the use of this test:
Sample:
As part of a behavioral genetic research program, a psychologist investigated the
influence of heredity and environment on emotional intelligence. He recruited identical twins
raised in different home environments to participate in the study and compared their scores on a
standardized test. Compute the T statistic at α = .05.
Scores
Pair A B C D E F G H I J K L M N O
X 80 85 87 88 86 85 95 85 85 92 90 87 91 86 95
Y 85 85 88 86 80 90 90 90 89 89 90 89 88 93 90
Answer:
1. State the Problem: Is there a significant difference in emotional intelligence between
twins?
2. Hypotheses: H0: Twins who were raised in different home environments do not differ in
their level of emotional intelligence.
Ha: Twins who were raised in different home environments differ in their
level of emotional intelligence.
3. Determine the Critical Value: Refer to Appendix E. The critical value is obtained at
the intersection of N and α. Exclude from N any paired observations with a difference
of 0. In the given data set, Twins B and K have zero difference; hence N = 13.
The critical value is equal to 21. We will reject the null hypothesis if the computed value
is smaller than the critical value (T < Tcrit).
Case   X    Y    X-Y   |X-Y|   Rank   Tied Rank   (+) Signed Rank   (-) Signed Rank
A      80   85   -5    5       7      9                             -9
B      85   85    0    0       -      -
C      87   88   -1    1       1      1                             -1
D      88   86    2    2       2      2.5         +2.5
E      86   80    6    6       12     12          +12
F      85   90   -5    5       8      9                             -9
G      95   90    5    5       9      9           +9
H      85   90   -5    5       10     9                             -9
I      85   89   -4    4       6      6                             -6
J      92   89    3    3       4      4.5         +4.5
K      90   90    0    0       -      -
L      87   89   -2    2       3      2.5                           -2.5
M      91   88    3    3       5      4.5         +4.5
N      86   93   -7    7       13     13                            -13
O      95   90    5    5       11     9           +9
Sum                                               41.5              -49.5
Step 1: Get the difference between each pair of scores (X and Y).
Step 2: Get the absolute difference, then order the absolute differences from lowest to
highest, disregarding the 0 differences.
Step 3: Assign ranks to the absolute differences. In case of ties, apply the principle of
assigning tied ranks (each tied score receives the mean of the ranks it occupies).
Step 4: Return the sign of each rank. Get the sum of the positive ranks. Get the sum of
the negative ranks.
Step 5: The T statistic is equal to the lower value (in absolute terms) between the sum
of positive ranks and the sum of negative ranks.
5. Decision on H0: Since T = 41.5 > Tcrit = 21, we do not reject the null hypothesis.
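The five steps above can be sketched in Python using the twins' scores from the sample problem. The function names `tied_ranks` and `wilcoxon_t` are illustrative, and the critical value 21 is the one quoted from Appendix E in the text.

```python
def tied_ranks(values):
    """Rank values from lowest to highest; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + 1 + end + 1) / 2
        for j in range(pos, end + 1):
            ranks[order[j]] = mean_rank
        pos = end + 1
    return ranks


def wilcoxon_t(x, y):
    """Return (T, sum of positive ranks, sum of negative ranks, N without zero differences)."""
    diffs = [a - b for a, b in zip(x, y) if a != b]  # Step 1, dropping zero differences
    ranks = tied_ranks([abs(d) for d in diffs])      # Steps 2 and 3
    positive = sum(r for d, r in zip(diffs, ranks) if d > 0)  # Step 4
    negative = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return min(positive, negative), positive, negative, len(diffs)  # Step 5


# Scores of twin pairs A to O from the sample problem.
x = [80, 85, 87, 88, 86, 85, 95, 85, 85, 92, 90, 87, 91, 86, 95]
y = [85, 85, 88, 86, 80, 90, 90, 90, 89, 89, 90, 89, 88, 93, 90]

t_stat, positive_sum, negative_sum, n = wilcoxon_t(x, y)
reject_h0 = t_stat < 21  # critical value quoted in the text for N = 13
print(t_stat, positive_sum, negative_sum, n, reject_h0)
```

As a check, the two rank sums should add up to N(N + 1) / 2, which is 13 × 14 / 2 = 91 here.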
7. Kruskal-Wallis H Test
The Kruskal-Wallis H test, also known as one-way ANOVA on ranks, is used to test for a
significant difference among the groups (levels) of an independent variable. It is the counterpart
of the one-way ANOVA F test when one of the parametric assumptions is not met. It is also an
extension of the Mann-Whitney U test to more than two groups. The formula is:
To illustrate the computation of H statistic, we will adopt the same case used in the earlier
discussion of F test ANOVA, but this time with the presence of outliers.
Sample Problem:
Answer:
1. State the Problem: Is there a significant difference in the effectiveness of the three
methods of teaching?
2. State the Hypotheses:
H0: Median1 = Median2 = Median3: There is no significant difference in the
effectiveness of the three methods of teaching.
Ha: At least one median differs from the others: There is a significant difference in the
effectiveness of the three methods of teaching.
3. Determine the critical value: Refer to the Chi-square Table of Critical Values in
Appendix C. At α = .05 and degrees of freedom equals 2 (df = k – 1 or 3 – 1), the
critical value for rejecting H0 is 5.99.
4. Calculate the H statistic:
Step 1: Rank all cases across groups from lowest to highest. In case of ties, calculate
the mean rank and assign the value to the tied scores.
Step 2: Sum up the ranks per group.
Step 3: Apply the formula.
5. Decision on H0: Since H = 17.12 is greater than 5.99, the null hypothesis is rejected.
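The three steps can be sketched in Python as follows. The sketch assumes the standard form of the statistic without a tie correction, H = [12 / (N(N + 1))] Σ(Ri² / ni) − 3(N + 1), where N is the total number of cases, Ri the rank sum of group i, and ni its size. The scores are hypothetical and are not the data of the sample problem; the function names are illustrative.

```python
def tied_ranks(values):
    """Rank values from lowest to highest; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + 1 + end + 1) / 2
        for j in range(pos, end + 1):
            ranks[order[j]] = mean_rank
        pos = end + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H (no tie correction) for a list of independent groups."""
    pooled = [score for group in groups for score in group]
    ranks = tied_ranks(pooled)                 # Step 1: rank all cases across groups
    total_n = len(pooled)
    weighted_sum = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])  # Step 2: rank sum per group
        weighted_sum += rank_sum ** 2 / len(group)
        start += len(group)
    # Step 3: apply the formula.
    return 12 * weighted_sum / (total_n * (total_n + 1)) - 3 * (total_n + 1)


# Hypothetical scores for three groups, for illustration only.
h = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
reject_h0 = h > 5.99  # chi-square critical value quoted in the text for df = 2
print(round(h, 2), reject_h0)
```

In this illustration the rank sums are 6, 15 and 24, so H = (12 / 90)(12 + 75 + 192) − 30 = 7.2.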
8. Friedman ANOVA
The Friedman ANOVA is an ordinal test used for testing significant difference for multiple
(k > 2) related treatments or samples. It is an extension of the Wilcoxon Signed-Rank test.
It is also the counterpart of ANOVA for repeated treatment (interval/ratio data) and the
Cochran Q (nominal-dichotomous data). The formula is:
Fr = [12 / (n k (k + 1))] ΣRi² − 3n (k + 1)
Where: n = number of subjects
k = number of groups / treatments
Ri = sum of ranks for each dependent group (squared in the formula)
We will use the problem given earlier in the discussion of Cochran Q test, however, we
will modify the data.
Sample:
The school is deciding on which platform to use in conducting online classes. They asked
12 faculty members, who are familiar with and have used four types of available virtual classroom
tools (VCTs), to rate the features and interconnectivity. The ratings were then ranked from 1
(most recommended) to 4 (least recommended). Given the data below, can we say that at least
one platform is more preferred than the others? Test the null hypothesis that all platforms are
equally effective at the 0.05 level of significance.
Rater    1   2   3   4   5   6   7   8   9   10   11   12
VCT 1    4   4   4   3   4   3   4   4   4   2    4    4
VCT 2    1   1   3   2   1   1   1   1   2   1    2    1
VCT 3    2   2   1   1   2   2   3   3   3   3    3    2
VCT 4    3   3   2   4   3   4   2   2   1   4    1    3
Answer:
1. State the Problem: Is there a significant difference in the effectiveness of the four VCTs?
2. Hypotheses: H0: The four VCTs are equally effective.
Ha: At least one of the VCTs is more effective than the others.
3. Determine the Critical Value: The df is equal to k − 1, or 4 − 1 = 3. Referring to Appendix C,
the critical value is equal to 7.82.
Friedman ANOVA = [12 / ((12)(4)(4 + 1))] [(44)² + (17)² + (27)² + (32)²] − 3(12)(4 + 1)
= [12 / ((48)(5))] (1936 + 289 + 729 + 1024) − 36(5)
= (12 / 240)(3978) − 180 = 0.05(3978) − 180 = 198.9 − 180
= 18.9
7. Decision on H0: Since Fr = 18.9 is greater than 7.82, the null hypothesis is rejected.
8. Conclusion: There is a significant difference in the ratings of the four platforms. At least
one of the platforms has better features and interconnectivity.
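The Friedman computation can be reproduced from the rank table of the sample problem with a few lines of Python. This is a sketch that follows the arithmetic shown in the solution (no tie correction); the variable names are illustrative.

```python
# Ranks given by the 12 raters to each platform, copied from the sample problem.
ranks_by_vct = {
    "VCT 1": [4, 4, 4, 3, 4, 3, 4, 4, 4, 2, 4, 4],
    "VCT 2": [1, 1, 3, 2, 1, 1, 1, 1, 2, 1, 2, 1],
    "VCT 3": [2, 2, 1, 1, 2, 2, 3, 3, 3, 3, 3, 2],
    "VCT 4": [3, 3, 2, 4, 3, 4, 2, 2, 1, 4, 1, 3],
}

n = 12  # number of subjects (raters)
k = 4   # number of related treatments (platforms)

# Sum of ranks for each platform.
rank_sums = [sum(ranks) for ranks in ranks_by_vct.values()]

# Friedman statistic: 12 / (n k (k + 1)) times the sum of squared rank sums, minus 3 n (k + 1).
fr = 12 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3 * n * (k + 1)

reject_h0 = fr > 7.82  # critical value quoted in the text for df = 3, alpha = .05
print(rank_sums, round(fr, 1), reject_h0)
```

Because each rater assigns the ranks 1 to 4 exactly once, the rank sums must total n × k(k + 1) / 2 = 120, which is a quick check on the data entry.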
ACTIVITIES / ASSESSMENT
General Direction: For activities that involve solving problems, perform the steps in conducting
hypothesis testing, such as stating the problem and the hypotheses, and so on.
1. Community Mental Health Centers tend to see a variety of problems, but some centers seem
to see more of one kind of problem than the others. Out of the last 100 clients seen by each
center, a count has been made of those classed as having (1) social adjustment problems,
(2) problems with living, and (3) other problems. The data follow:
1.1 Run the appropriate analysis and interpret the results. Use α = .05.
1.2 Using the same data to demonstrate how chi-square varies as a function of sample size,
cut each cell entry in half and recompute chi-square. What does this say about
the role of sample size in hypothesis testing?
2. A psychologist operating a group home for delinquent adolescents needs to show that it is
successful at reducing delinquency. He samples ten adolescents living at home whom the
police have identified as having problems, ten similar adolescents living in foster homes, and
ten adolescents living in the group home. As an indicator variable he uses truancy (number
of days truant in the past semester), which is already obtained from school records. On the
basis of the following data, draw appropriate conclusions:
Group          Truancy
Natural Home   15  18  19  14   5   8  12  13   7
Foster Home    16  14  20  22  19   5  17  18  12
Group Home     10  13  14  11   7   3   4  18   2
(Note: This exercise was taken from Howell, 2011)
2.1 Assuming that the data are not normally distributed, test the hypothesis that there is no
difference in truancy among the three groups of adolescents. Use α = .05.
2.2 If the null hypothesis is rejected, conduct pairwise comparisons to determine which groups
differ significantly. Use α = .01. Make an inference about the group which has
the least truancy problem.
3. An exercise physiologist compared the perceived exertion of women between the ages of 20
and 25 who exercised with three different pieces of equipment – a stationary bicycle, a
treadmill, and free weights. Perceived exertion was rated on a scale of 1 to 50. Eighteen
women, randomly divided into three groups, used the exercise equipment for 30 minutes
before rating their perceived exertion. Assuming ordinal data, test if there were differences
among the three kinds of exercise. Use α = .05
Bicycle 33 38 31 44 41 28
Treadmill 42 35 40 46 37 39
Weights 24 30 26 34 36 22
(Note: This exercise was taken from Hinkle, et al, 2003)
4. Using the above data on Exercise No. 3, determine if there is significant difference in the
perceived exertion between the Treadmill Group and the Weights Group. Use α = .05
5. A group of sexually active, single male college students was questioned about whether they
always use condoms. They were asked the same question three months later after attending
workshops on “safer sex”. Determine whether the change in condom use after attending the
workshop was significant at the .05 level.
6. A psychologist looked into the perceived quality of life of three groups of 600 older adults
aged 65 years and above – the co-resident (residing with family members), those who live
alone, and those who live in home care facilities. Determine if there is a significant association
between quality of life and living arrangement. Test the null hypothesis at α = .05.
Participant                      A  B  C  D  E  F  G  H  I  J   K  L  M  N  O  P  Q  R  S  T
No music                         5  2  6  3  8  2  3  8  9  9   5  7  8  5  4  1  7  8  5  2
With piano instrumental music    6  6  6  7  9  6  6  7  8  10  8  8  5  8  9  7  5  7  7  8
7.1. Determine if there is a significant difference between the paired observations at α = .05.
7.2. Let us convert the scores into Pass or Fail (dichotomous data) so that those who scored
6 and above passed (coded as “1”) and those who scored 5 and below failed (coded as
“0”). Determine if scores have improved after the participants were exposed to the
independent variable (soft instrumental music). Use α = .05.
8. Suppose we add another level of the independent variable, this time "relaxing nature
sounds", and administer this to the same participants. The following scores were obtained:
Participant                      A  B  C  D  E  F  G  H  I  J   K  L  M  N   O   P  Q  R  S  T
With relaxing nature sounds      9  7  8  7  9  9  5  8  9  10  8  9  5  10  10  8  8  8  8  9
8.1. Test the null hypothesis (α = .05) that there is no significant difference in the recall of
nonsense syllables among the k dependent groups.
8.2. Convert the scores again into Pass or Fail (dichotomous data) so that those who scored
6 and above passed (coded as “1”) and those who scored 5 and below failed (coded as
“0”). Determine if scores differ among groups using α = .05.
Source: Howell, D. C. (2011). Fundamental statistics for the behavioral sciences (7th ed.).
Belmont, CA: Wadsworth.