
Non-Parametric Analysis - 20241029 - 033906 - 0000

Uploaded by

g.sychq
Non-Parametric Analysis
WHAT IS NON-PARAMETRIC ANALYSIS?
Non-parametric analysis is a type of statistical method used in
research when the data doesn't meet the assumptions required for
parametric tests, such as normal distribution or equal variances.
Unlike parametric methods, which rely on assumptions about the
population parameters (like the mean and variance), non-parametric
methods do not require the data to fit a specific distribution.
Non-parametric analysis is characterized by its flexibility, as it does not require
assumptions about the underlying data distribution. Often referred to as
"distribution-free," this method can be applied to data that does not follow a
normal distribution, making it suitable for a wide range of situations. It is
especially useful for ordinal (ranked) or nominal (categorical) data, as well as
data that is not normally distributed. Non-parametric methods are also robust in
the presence of outliers, meaning that extreme values have less impact on the
results. Additionally, these methods are effective for small sample sizes, where
the assumptions of parametric tests may not hold true. This versatility makes
non-parametric analysis a valuable tool in many fields of research.
Examples of non-parametric tests include:
• Chi-square test: Examines relationships between categorical variables.
• Mann-Whitney U test: Compares differences between two independent groups.
• Wilcoxon signed-rank test: Compares paired data or two related samples.
• Kruskal-Wallis test: Compares more than two independent groups.

Non-parametric methods are widely used in fields like social sciences,
biology, and medicine, where data often do not meet the assumptions
required for parametric analysis.
CHI-SQUARE TEST
The chi-squared test (χ² test) is a statistical
method used to determine if there is a
significant association between categorical
variables.
WHAT IS A CATEGORICAL VARIABLE?
These are variables whose values represent categories or
groups. Unlike numerical variables, their values do not
express a meaningful quantity.
Ex.
Gender
Type of pet
Color
Marital status
Smartphone preference
TYPES OF CHI-SQUARE TEST
• Chi-square Goodness of Fit Test
• Chi-square Test of Independence
CHI-SQUARE GOODNESS OF FIT TEST
You can use a chi-square goodness of fit test
when you have one categorical variable. It
allows you to test whether the frequency
distribution of the categorical variable is
significantly different from your expectations.
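EXAMPLE
A car manufacturer expects the order colours of their model
J to be distributed as follows:

[Table of expected proportions not recovered; the values are those stated in H0: 0.28, 0.25, 0.16, 0.31]

A random sample of 140 orders revealed the following:

[Table of observed counts not recovered]

Test at α = 0.05 to determine whether the observed colours differ
significantly from the manufacturer's expectation.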
H0: p1 = 0.28, p2 = 0.25, p3 = 0.16, p4 = 0.31
H1: at least one pi is not as specified in H0

df (degrees of freedom)
df = k - 1 = 4 - 1 = 3, α = 0.05
FIND THE CRITICAL VALUE
Rejection Region: χ² > 7.815
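The set-up above can be sketched in a few lines of Python. This is only an illustration: the variable names and the helper `chi2_upper_tail_df3` are assumptions introduced here, and the helper uses the closed-form upper-tail probability that holds for exactly 3 degrees of freedom. It computes the expected counts as n × p for each colour and checks that roughly 5% of the chi-square distribution lies beyond 7.815.

```python
import math

# Hypothesised proportions from H0 and the sample size from the example.
proportions = [0.28, 0.25, 0.16, 0.31]
n_orders = 140

# Expected count per colour category: E_i = n * p_i.
expected = [n_orders * p for p in proportions]

def chi2_upper_tail_df3(x):
    """P(X > x) for a chi-square variable with 3 degrees of freedom (closed form)."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

# Degrees of freedom: number of categories minus one.
df = len(proportions) - 1

# Upper-tail probability at the tabulated critical value; should be close to alpha = 0.05.
tail_at_critical = chi2_upper_tail_df3(7.815)

print([round(e, 1) for e in expected])  # [39.2, 35.0, 22.4, 43.4]
print(df, tail_at_critical)
```

The observed counts are not available in this copy, so the sketch stops at the expected counts and the rejection region.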
FORMULA OF CHI-SQUARE TEST

χ² = Σ (O − E)² / E

χ² = chi-square test statistic
Σ = summation operation ("take the sum of")
O = observed frequency
E = expected frequency
χ² = 1.6315 < 7.815
Fail to reject H0
Not statistically significant: there is not enough evidence to conclude that the
observed distribution differs from expectation.
CHI-SQUARE TEST OF INDEPENDENCE
A statistical test used to determine
whether two categorical variables
are related to or independent of each
other.
LIMITATIONS
Only Categorical Data: The chi-squared test can only be
used with categorical variables; it cannot be applied to
continuous data unless the data are first grouped into categories.
Large Sample Requirement: It requires a large sample size
for accurate results. Small samples may lead to inaccurate
conclusions; a common rule of thumb is an expected frequency
of at least 5 in each cell.
CHI-SQUARED TEST OF INDEPENDENCE
EXAMPLE SCENARIO
A school administrator is interested in understanding if students' preferred
study methods (group study or solo study) are associated with their
academic performance (pass or fail). To investigate this, they collect data
from 100 students, recording each student's study method and whether they
passed or failed their exams. The administrator wants to determine if there’s
a significant relationship between study method and exam performance.
QUESTION:
Is there a relationship between students' preferred study methods
(group study vs. solo study) and their performance level (pass vs.
fail)?
SOLUTION:

STEP 1: STATE THE HYPOTHESES
Null Hypothesis (H0): There is no significant relationship
between students' preferred study methods (group study vs.
solo study) and their performance level (pass vs. fail).
Alternative Hypothesis (H1): There is a significant
relationship between students' preferred study methods and
their performance level.
STEP 2: CREATE A CONTINGENCY TABLE FROM THE COLLECTED DATA
STUDY METHOD   PASSED   FAILED   TOTAL
GROUP STUDY    40       10       50
SOLO STUDY     30       20       50
TOTAL          70       30       100
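STEP 3: CALCULATE EXPECTED FREQUENCIES (E)
Use the formula:
E = (row total × column total) / grand total

EXPECTED FREQUENCY FOR GROUP STUDY - PASSED: E = (50 × 70) / 100 = 35
EXPECTED FREQUENCY FOR GROUP STUDY - FAILED: E = (50 × 30) / 100 = 15
EXPECTED FREQUENCY FOR SOLO STUDY - PASSED: E = (50 × 70) / 100 = 35
EXPECTED FREQUENCY FOR SOLO STUDY - FAILED: E = (50 × 30) / 100 = 15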
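The expected frequencies in Step 3 can be reproduced with a short Python sketch. It assumes the usual rule for a test of independence, E = (row total × column total) / grand total, and uses the observed counts from the contingency table in Step 2; the variable names are illustrative only.

```python
# Observed counts from the contingency table
# (rows: group study, solo study; columns: passed, failed).
observed = [[40, 10], [30, 20]]

# Marginal totals.
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count for each cell under independence.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

print(expected)  # [[35.0, 15.0], [35.0, 15.0]]
```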
STEP 4: CREATE THE EXPECTED FREQUENCY TABLE
STUDY METHOD   PASSED   FAILED   TOTAL
GROUP STUDY    35       15       50
SOLO STUDY     35       15       50
TOTAL          70       30       100
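STEP 5: CALCULATE THE CHI-SQUARED STATISTIC
Use the formula:
χ² = Σ (O − E)² / E

FOR GROUP STUDY - PASSED: (40 − 35)² / 35 ≈ 0.714
FOR GROUP STUDY - FAILED: (10 − 15)² / 15 ≈ 1.667
FOR SOLO STUDY - PASSED: (30 − 35)² / 35 ≈ 0.714
FOR SOLO STUDY - FAILED: (20 − 15)² / 15 ≈ 1.667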
STEP 6: SUM THE CHI-SQUARED VALUES
Now add all the chi-squared values together:
χ² ≈ 0.714 + 1.667 + 0.714 + 1.667 ≈ 4.762
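As a check on Steps 5 and 6, the per-cell contributions and their sum can be computed directly. This is a minimal sketch: it takes the observed counts from Step 2 and assumes expected counts of 35 and 15 per row, which follow from the row and column totals.

```python
# Observed counts (rows: group study, solo study; columns: passed, failed).
observed = [[40, 10], [30, 20]]
# Expected counts under independence, from (row total * column total) / grand total.
expected = [[35.0, 15.0], [35.0, 15.0]]

# Contribution of each cell: (O - E)^2 / E.
contributions = [
    [(o - e) ** 2 / e for o, e in zip(obs_row, exp_row)]
    for obs_row, exp_row in zip(observed, expected)
]

# The chi-squared statistic is the sum over all cells.
chi_squared = sum(sum(row) for row in contributions)

print([[round(c, 3) for c in row] for row in contributions])  # [[0.714, 1.667], [0.714, 1.667]]
print(round(chi_squared, 3))  # 4.762
```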
STEP 7: DETERMINE DEGREES OF FREEDOM
The degrees of freedom (df) for a chi-squared test of
independence is calculated as:
df = (r−1)×(c−1)
Where:
r = number of rows (2 study methods)
c = number of columns (2 performance levels)
SOLUTION:
df = (r−1)×(c−1)
df = (2−1)×(2−1)
df = 1×1
df = 1
STEP 8: FIND THE CRITICAL VALUE
Using a chi-squared distribution table and a significance level (α) of 0.05 with 1
degree of freedom, the critical value is approximately 3.841.
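If no table is at hand, the critical value can be approximated numerically. The sketch below is one possible approach rather than a standard routine: it assumes 1 degree of freedom, where the upper-tail probability of the chi-squared distribution equals erfc(sqrt(x / 2)), and finds the cut-off by bisection. The function names are hypothetical.

```python
import math

def chi2_upper_tail_df1(x):
    """P(X > x) for a chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2))

def critical_value_df1(alpha, lo=0.0, hi=50.0, tol=1e-9):
    """Find x with P(X > x) = alpha by bisection (the tail decreases as x grows)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if chi2_upper_tail_df1(mid) > alpha:
            lo = mid  # tail still too large, so the cut-off lies to the right
        else:
            hi = mid
    return (lo + hi) / 2

print(round(critical_value_df1(0.05), 3))  # 3.841
```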
STEP 9: MAKE A DECISION
Calculated χ²: 4.762
Critical Value: 3.841

Since 4.762 > 3.841, we reject the null hypothesis. This
suggests there is a significant association between students'
preferred study methods and their performance levels.
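The decision can also be expressed in code, together with an approximate p-value. This is a sketch under stated assumptions: the p-value formula erfc(sqrt(χ² / 2)) is valid only for 1 degree of freedom, and no continuity correction is applied, matching the hand calculation. Software that applies Yates' continuity correction to 2×2 tables by default will report a smaller statistic.

```python
import math

chi_squared = 4.762      # statistic from the worked example
critical_value = 3.841   # tabulated value for alpha = 0.05, df = 1
n_rows, n_cols = 2, 2    # study methods x performance levels

# Degrees of freedom for a test of independence.
df = (n_rows - 1) * (n_cols - 1)

# Upper-tail probability; this closed form holds only when df == 1.
p_value = math.erfc(math.sqrt(chi_squared / 2))

# Reject H0 when the statistic falls in the rejection region.
reject_null = chi_squared > critical_value

print(df, reject_null, round(p_value, 3))  # 1 True 0.029
```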
MANN-WHITNEY U TEST
Tests whether there is a difference between
two independent samples
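WHAT IS AN INDEPENDENT VARIABLE?
It is a variable that stands alone and isn't changed by
the other variables you are trying to measure. In the
Mann-Whitney U test it defines the two groups being compared,
and the samples are independent when no participant belongs
to both groups.

Ex.
Reaction time of male students vs. reaction time of female students
(the independent variable is gender)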
EXAMPLE OF MANN-WHITNEY U
• Is there a difference between the reaction time of male and female
students?
• The effectiveness of advertising for two rival products (Brand X and
Brand Y) was compared.
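To make the first example concrete, here is a minimal sketch of how the U statistic can be computed. The reaction times are hypothetical values invented for illustration, not data from the slides, and `mann_whitney_u` is an assumed helper name. It uses the pairwise-comparison definition: U for one group counts the pairs in which that group's value is larger, with ties counting one half.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b); U_a counts pairs with a > b, ties count one half."""
    u_a = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    # The two U values always add up to the number of pairs.
    u_b = len(sample_a) * len(sample_b) - u_a
    return u_a, u_b

# Hypothetical reaction times in milliseconds (not from the source).
male = [310, 325, 340, 355]
female = [300, 320, 330, 360, 370]

u_male, u_female = mann_whitney_u(male, female)
u_statistic = min(u_male, u_female)  # the smaller U is usually the one reported

print(u_male, u_female, u_statistic)  # 9.0 11.0 9.0
```

The resulting U is then compared with a critical value from a Mann-Whitney table, or with a normal approximation for larger samples.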
WILCOXON SIGNED RANK TEST
[Slide content for this section, including a worked example, consisted of images and was not recovered.]
KRUSKAL-WALLIS TEST
The Kruskal-Wallis test is used when the assumptions
for a one-way analysis of variance are not met. Since the
Kruskal-Wallis test is a non-parametric test, the data used
do not have to be normally distributed. The only
requirement is that the data be at least ordinal scale.
Key Characteristics:
Non-parametric:
It does not assume a normal distribution of the data, making it
suitable for non-normally distributed data.
Ordinal or Continuous Data:
It can be used with ordinal data or continuous data that has been
converted to ranks.
Independent Groups:
The test is used for comparing independent groups, meaning the
observations in each group are not related.
Examples for the Kruskal-Wallis test
For the Kruskal-Wallis test, of course, the same examples can be used as for the
single factor analysis of variance, but with the addition that the data need not be
normally distributed.

Medical example:
For a pharmaceutical company you want to test whether a drug XY has an influence
on body weight. For this purpose, the drug is administered to 20 test persons, 20 test
persons receive a placebo and 20 test persons receive no drug or placebo.
The calculation of the Kruskal-Wallis rank analysis of variance is
similar to that of the Mann-Whitney U test, which is the non-parametric
counterpart of the t-test for independent samples.

Let's say the null hypothesis is true and thus there is no difference
between the independent samples. Then high and low ranks are randomly
distributed across the samples and should be equally distributed across the
groups. Therefore, the probability that a rank is assigned to a group is the
same for all groups.
If there is no difference between the groups, the mean value of the ranks
should also be the same in all groups. The expected value of the ranks for
each group is then given by
E(R) = (N + 1) / 2
where N is the total number of observations across all groups.
Each sample has the same expected value of the ranks, which corresponds to the
expected value of the population. The variance of the ranks is also needed;
it can be calculated with the following formula:
σ² = (N² − 1) / 12
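Both quantities can be checked numerically. The sketch below assumes the standard results for ranks 1 to N without ties, namely an expected rank of (N + 1) / 2 and a rank variance of (N² − 1) / 12, and compares them with a direct calculation. The value N = 12 is hypothetical and chosen only for illustration.

```python
n_total = 12  # hypothetical total number of observations, no ties
ranks = list(range(1, n_total + 1))

# Direct calculation over the ranks 1..N.
mean_rank = sum(ranks) / n_total
var_rank = sum((r - mean_rank) ** 2 for r in ranks) / n_total

# Closed-form expressions for the same quantities.
expected_rank = (n_total + 1) / 2
rank_variance = (n_total ** 2 - 1) / 12

print(mean_rank, expected_rank)
print(var_rank, rank_variance)
```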
Calculation with example data:

Let's say you have measured the reaction time of three groups and you want to know if
there is a difference between them. To find out, you use the H-test (Kruskal-Wallis
test).

First we assign a rank to each person, then we calculate the rank sum and the mean rank
for each group.
With three groups there are 3 − 1 = 2 degrees of freedom, so at a
significance level of 5% the critical chi-square value is 5.991. This
critical value is greater than the calculated H value. Thus, the null
hypothesis is retained: there is no evidence of a difference in reaction
time among the three groups.
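The example data table is not available in this copy, so the following sketch uses hypothetical reaction times for three groups to show how an H value could be obtained. It assumes H = ((N − 1) / N) × Σ nᵢ (mean rankᵢ − E(R))² / σ², built from the expected rank (N + 1) / 2 and the rank variance (N² − 1) / 12, with no correction for ties. The function names are illustrative, and the p-value formula exp(−H / 2) holds only for 2 degrees of freedom.

```python
import math

def average_ranks(values):
    """Rank values from 1..N, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks

def kruskal_wallis_h(groups):
    """H from mean ranks, the expected rank and the rank variance (no tie correction)."""
    pooled = [value for group in groups for value in group]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    expected_rank = (n_total + 1) / 2
    rank_variance = (n_total ** 2 - 1) / 12
    weighted_sq_dev = 0.0
    start = 0
    for group in groups:
        group_ranks = ranks[start:start + len(group)]
        mean_rank = sum(group_ranks) / len(group)
        weighted_sq_dev += len(group) * (mean_rank - expected_rank) ** 2
        start += len(group)
    return (n_total - 1) / n_total * weighted_sq_dev / rank_variance

# Hypothetical reaction times for three groups (not from the source).
groups = [[34, 36, 41, 43], [44, 37, 45, 33], [35, 39, 42, 46]]

h_value = kruskal_wallis_h(groups)
df = len(groups) - 1
p_value = math.exp(-h_value / 2)  # chi-square upper tail, valid only for df == 2
retain_null = h_value < 5.991     # critical value for alpha = 0.05, df = 2

print(round(h_value, 3), df, retain_null)  # 0.5 2 True
```

With these made-up values H is far below 5.991, so the null hypothesis would be retained, which mirrors the conclusion of the slide example.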
A health coach wants to compare the effectiveness of two different diet
plans on weight loss among two independent groups of participants. The
weight loss data is ordinal (e.g., categorized as "low," "medium," "high").
Mann-Whitney U Test
A researcher investigates whether there is a relationship between
smoking status (smoker/non-smoker) and the occurrence of lung
disease (yes/no) among 300 participants.
Chi-Square Test
A researcher aims to compare the satisfaction levels of customers using
three different brands of fitness trackers. She collects satisfaction ratings
(on a scale from 1 to 10) from independent groups of customers for each
brand.
Kruskal-Wallis Test
A nutritionist wants to evaluate the effectiveness of a new meal plan on
cholesterol levels. She measures the cholesterol levels of 15 participants
before they start the meal plan and then again after six weeks of
following it.
Wilcoxon Signed-Rank Test
Thank you
