Publication Excerpt Dazong Richard Hosea

This study investigates the mathematical relationship and performance differences between Pearson's and Spearman's correlation coefficients through simulations of various data conditions. Key findings indicate that Pearson's is effective for linear, normally distributed data, while Spearman's is more robust to outliers and better for non-linear monotonic relationships. The research emphasizes the importance of selecting the appropriate correlation metric based on the data structure and research objectives.

Investigating the Relationship Between Pearson's and Spearman's Correlation Coefficients

By Dazong Richard Hosea

Department of Statistics, Federal University Kashere, Gombe State, Nigeria

1.2 Statement of the Problem

Many practitioners use Pearson and Spearman interchangeably without fully understanding
the consequences. This project aims to explore the mathematical relationship and
performance differences between these two measures under simulated conditions,
addressing when each should be preferred.

1.3 Objectives of the Study

- To mathematically derive Pearson’s and Spearman’s correlation coefficients.
- To simulate data under various conditions (normal, skewed, nonlinear) and compute both
coefficients.
- To compare their behaviors in terms of sensitivity to outliers, linearity, and monotonicity.

1.4 Research Questions

- How do Pearson’s and Spearman’s correlation coefficients relate mathematically?
- What are the differences in behavior between the two under simulated data?
- Under what conditions do their values significantly diverge?

1.5 Significance of the Study

This research will provide deeper insights into the appropriate use cases for each
correlation coefficient, guiding researchers and practitioners in selecting the most robust
statistical tool for their data type and research design.

1.6 Scope and Limitations

This study focuses on simulated datasets of 30 to 1000 observations drawn from different
distributions (normal, uniform, exponential). It does not involve real-world datasets and is
limited to bivariate correlation analysis.

1.7 Operational Definition of Terms

Pearson’s Correlation Coefficient: A measure of the strength and direction of the linear
relationship between two variables (Benesty et al., 2009). The coefficient can be computed
for any paired numeric data; normality of both variables is assumed mainly when testing its
significance.
Spearman’s Rank Correlation Coefficient: A non-parametric measure of correlation based
on the rank values of the variables instead of raw data, useful when the relationship is
monotonic but not necessarily linear (Sheskin, 2004).

Linear Relationship: A type of relationship that can be described by a straight line equation,
where changes in one variable predict changes in another with a constant ratio (Rodgers &
Nicewander, 1988).

Monotonic Relationship: A relationship that is consistently increasing or decreasing but not
necessarily at a constant rate (Lehmann, 2006).

1.8 Structure of the Study

This project is structured into five chapters. Chapter One introduces the study and outlines
the problem, objectives, and significance. Chapter Two reviews relevant literature on
correlation theory and past empirical findings. Chapter Three details the methodology,
including the mathematical derivation of Pearson and Spearman correlations and
simulation techniques. Chapter Four presents and analyzes the results. Chapter Five
concludes with findings and offers recommendations for future research (Creswell, 2014).

CHAPTER TWO

LITERATURE REVIEW

2.1 Theoretical Foundations of Correlation

Correlation is a statistical measure used to describe the strength and direction of a
relationship between two variables. The origins of correlation theory trace back to Francis
Galton, who introduced the concept of regression toward the mean (Galton, 1886). Karl
Pearson formalized this concept mathematically, resulting in the Pearson correlation
coefficient, which measures linear dependence between variables (Pearson, 1896). Later,
Spearman introduced a rank-based method that measures the degree of monotonic
association (Spearman, 1904).

2.2 Pearson’s Correlation Coefficient

Pearson’s correlation coefficient (r) is the covariance of the two variables divided by the
product of their standard deviations. It is most appropriate when the relationship is linear,
the data are approximately normally distributed, and the spread of one variable is roughly
constant across values of the other (homoscedasticity) (Rodgers & Nicewander, 1988).
Pearson's r ranges from -1 to +1, where +1 indicates a perfect positive linear relationship,
-1 a perfect negative one, and 0 no linear relationship (Benesty et al., 2009).

2.3 Spearman’s Rank Correlation Coefficient

Spearman’s rho is a non-parametric measure of correlation based on the ranked values of
the data. It does not assume normality or linearity and is more robust to outliers and
skewed distributions (Sheskin, 2004). This makes Spearman’s coefficient suitable for
ordinal data or data that fails to meet the assumptions required for Pearson’s r (Lehmann,
2006).
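
The rank-based definition can be illustrated with a short sketch. It assumes SciPy is available and uses made-up ordinal scores (the arrays below are illustrative, not data from this study). It checks that Spearman's rho, as returned by scipy.stats.spearmanr, matches Pearson's r applied to the average ranks of the data, which is how tied values are handled.

```python
import numpy as np
from scipy.stats import pearsonr, rankdata, spearmanr

# Illustrative ordinal scores with tied values (hypothetical, not study data).
x = np.array([1, 2, 2, 3, 4, 5, 5])
y = np.array([2, 1, 3, 3, 4, 4, 5])

# Spearman's rho computed directly by SciPy.
rho_scipy = spearmanr(x, y)[0]

# The same quantity obtained as Pearson's r on the ranks.
# rankdata assigns tied values the average of the ranks they span.
rho_ranks = pearsonr(rankdata(x), rankdata(y))[0]

print(f"spearmanr: {rho_scipy:.6f}   Pearson on ranks: {rho_ranks:.6f}")
```

Because only the ranks enter the calculation, the result depends on the ordering of the values and not on the distances between them, which is why the coefficient suits ordinal data.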

2.4 Comparative Empirical Studies

Several empirical studies have compared the performance of Pearson’s and Spearman’s
coefficients under different data conditions. Myers and Well (2003) found that in the
presence of outliers, Spearman’s rho maintains higher accuracy than Pearson’s r. Corder
and Foreman (2014) also demonstrated that Spearman’s correlation provides more reliable
results when analyzing non-linear but monotonic trends. However, when data satisfies the
assumptions of linearity and normality, Pearson’s r tends to offer greater statistical power
(Hauke & Kossowski, 2011).
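
The outlier effect described in these studies can be reproduced with a small constructed example. This is a sketch under stated assumptions: the ten data points and the single extreme value of 1000 are invented for illustration and do not come from any of the cited studies.

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

x = np.arange(1.0, 11.0)    # 1, 2, ..., 10
y_clean = x.copy()           # perfectly linear in x
y_outlier = x.copy()
y_outlier[-1] = 1000.0       # one extreme value; the rank order is unchanged

r_clean = pearsonr(x, y_clean)[0]
r_outlier = pearsonr(x, y_outlier)[0]      # drops to roughly 0.53
rho_clean = spearmanr(x, y_clean)[0]
rho_outlier = spearmanr(x, y_outlier)[0]   # ranks are identical, so rho stays at 1

print(f"Pearson : clean={r_clean:.3f}  with outlier={r_outlier:.3f}")
print(f"Spearman: clean={rho_clean:.3f}  with outlier={rho_outlier:.3f}")
```

The outlier here keeps the ordering of the points intact, so it is the most favourable case for Spearman's rho. An outlier that also disturbs the ordering would lower rho as well, though by a bounded amount.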

2.5 Gaps in the Literature

Although much research has compared Pearson and Spearman correlations, fewer studies
have examined their behavior under simulated conditions with systematically varied data
properties. There is also limited work integrating both theoretical derivations and empirical
simulations in a unified framework, which this study aims to address (Mukaka, 2012).

CHAPTER THREE

METHODOLOGY

3.1 Research Design

This study adopts a quantitative simulation-based research design to explore the
mathematical and empirical relationships between Pearson’s and Spearman’s correlation
coefficients. Synthetic datasets will be generated under controlled conditions (e.g., normal,
skewed, monotonic, and nonlinear) to assess the statistical behavior of each coefficient.

3.2 Population and Sample

The population in this simulation study consists of theoretical data structures designed to
represent various correlation patterns. Samples of varying sizes (n = 30, 100, 300, 1000)
will be drawn randomly from synthetic populations using NumPy-based random number
generators. Each synthetic dataset will undergo correlation analysis using both Pearson’s
and Spearman’s techniques.
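
A minimal sketch of this sampling step is given below. The sample sizes are those listed above; the seed, the population correlation of 0.6, and the choice of a bivariate normal population are assumptions made only for illustration and are not stated in the study.

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

rng = np.random.default_rng(seed=42)   # arbitrary seed, for reproducibility only
sample_sizes = [30, 100, 300, 1000]    # sizes named in Section 3.2
true_corr = 0.6                        # hypothetical population correlation
cov = [[1.0, true_corr], [true_corr, 1.0]]

results = {}
for n in sample_sizes:
    # Draw n pairs from a bivariate normal population with the given covariance.
    xy = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=n)
    x, y = xy[:, 0], xy[:, 1]
    results[n] = (pearsonr(x, y)[0], spearmanr(x, y)[0])

for n, (r, rho) in results.items():
    print(f"n={n:5d}  Pearson r={r:.3f}  Spearman rho={rho:.3f}")
```

Larger samples should give estimates that sit closer to the population value, which is one reason to compare the coefficients across several sample sizes.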

3.3 Sampling Techniques

Stratified random sampling is used to ensure balanced representation of correlation
patterns. The number of observations, n, in each stratum is computed using the
proportional allocation formula:

nₕ = (Nₕ / N) × n

Where:
nₕ = sample size for stratum h
Nₕ = population size of stratum h
N = total population size
n = total sample size
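
The allocation formula can be applied as in the following sketch. The stratum names and population sizes are hypothetical values chosen so that the arithmetic is easy to follow; the study does not report them.

```python
def proportional_allocation(stratum_sizes, n):
    """Return n_h = (N_h / N) * n for every stratum h."""
    total = sum(stratum_sizes.values())  # N, the total population size
    return {h: n * size / total for h, size in stratum_sizes.items()}

# Hypothetical strata (one per correlation pattern) and population sizes.
strata = {"linear": 5000, "monotonic": 3000, "non_monotonic": 2000}
allocation = proportional_allocation(strata, n=300)

print(allocation)  # {'linear': 150.0, 'monotonic': 90.0, 'non_monotonic': 60.0}
```

When the formula yields a non-integer value, the stratum sample sizes need to be rounded in a way that keeps their total equal to n.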

3.4 Method of Data Collection

Data were generated from known distributions: normal (μ = 0, σ = 1) and uniform.
Non-linear monotonic transformations (e.g., exponential and logarithmic) were then applied
to produce non-linear relationships. The simulation was carried out in Python using the
NumPy and SciPy libraries.
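
The effect of a monotonic transformation can be sketched as follows. The seed, the sample size of 500, and the noise scale of 0.3 are assumptions chosen for illustration. A strictly increasing transformation such as the exponential leaves the ranks unchanged, so Spearman's rho is the same before and after, while Pearson's r falls because the relationship is no longer linear.

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

rng = np.random.default_rng(seed=1)   # arbitrary seed
n = 500                               # hypothetical sample size

x = rng.normal(loc=0.0, scale=1.0, size=n)       # standard normal, as in Section 3.4
y = x + rng.normal(loc=0.0, scale=0.3, size=n)   # linear relationship plus noise
y_exp = np.exp(y)                                # monotonic, non-linear transformation

r_linear, rho_linear = pearsonr(x, y)[0], spearmanr(x, y)[0]
r_exp, rho_exp = pearsonr(x, y_exp)[0], spearmanr(x, y_exp)[0]

print(f"before transform: r={r_linear:.3f}  rho={rho_linear:.3f}")
print(f"after  transform: r={r_exp:.3f}  rho={rho_exp:.3f}")
```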

3.5 Mathematical Formulation

Pearson’s correlation coefficient (r) is given by:

r = Σ[(xᵢ - x̄)(yᵢ - ȳ)] / √[Σ(xᵢ - x̄)² * Σ(yᵢ - ȳ)²]
Where x̄ and ȳ are the means of x and y respectively. It measures linear dependence.
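
The formula translates directly into NumPy, as in this sketch. The function name pearson_r and the two small data sets are illustrative assumptions; the result is compared with np.corrcoef as a sanity check.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson's r from the deviations-from-the-mean formula."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    dy = y - y.mean()
    return float(np.sum(dx * dy) / np.sqrt(np.sum(dx**2) * np.sum(dy**2)))

# Illustrative data (not from the study).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_linear = 2.0 * x + 1.0                       # exact linear function of x
y_noisy = np.array([2.0, 1.0, 4.0, 3.0, 5.0])  # same trend, two pairs swapped

r_linear = pearson_r(x, y_linear)   # 1 up to rounding
r_noisy = pearson_r(x, y_noisy)     # 8 / sqrt(10 * 10) = 0.8
r_numpy = float(np.corrcoef(x, y_noisy)[0, 1])

print(r_linear, r_noisy, r_numpy)
```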

Spearman’s rank correlation coefficient (ρ) is given by:

ρ = 1 - [ (6 × Σdᵢ²) / (n(n² - 1)) ]
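Where dᵢ is the difference between the ranks of the ith observation on x and on y, and n is
the number of observations. This formula is exact only when there are no tied ranks; with
ties, ρ is obtained by applying Pearson’s formula to the average ranks.
This coefficient assesses monotonic relationships between ranked variables.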
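
The rank-difference formula can be checked numerically with the sketch below. The helper names ranks and spearman_rho and the five data pairs are illustrative assumptions, and the data contain no tied values. The result is compared with Pearson's r computed on the ranks, which should agree.

```python
import numpy as np

def ranks(v):
    """Ranks 1..n for a vector that contains no tied values."""
    order = np.argsort(v)
    r = np.empty(len(v), dtype=float)
    r[order] = np.arange(1, len(v) + 1)
    return r

def spearman_rho(x, y):
    """Spearman's rho from the rank-difference formula (no ties assumed)."""
    d = ranks(x) - ranks(y)
    n = len(d)
    return float(1.0 - 6.0 * np.sum(d**2) / (n * (n**2 - 1)))

# Illustrative data (not from the study).
x = np.array([3.1, 1.4, 5.9, 2.6, 4.2])   # ranks 3, 1, 5, 2, 4
y = np.array([2.0, 1.0, 4.5, 3.5, 3.0])   # ranks 2, 1, 5, 4, 3

rho = spearman_rho(x, y)   # sum of d^2 is 6, so rho = 1 - 36/120, about 0.7
rho_check = float(np.corrcoef(ranks(x), ranks(y))[0, 1])

print(rho, rho_check)
```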

3.6 Tools for Data Analysis

Data simulation and analysis were conducted using Python libraries such as NumPy,
pandas, matplotlib, and SciPy. These tools enabled efficient generation of random data and
accurate computation of correlation coefficients.

CHAPTER FOUR

DATA ANALYSIS AND RESULTS

4.1 Introduction

This chapter presents the results of the simulation and statistical analysis conducted to
explore the relationship between Pearson’s and Spearman’s correlation coefficients. The
synthetic datasets were analyzed under conditions of linear, monotonic, and nonlinear
associations. Both correlation coefficients were computed and compared across multiple
sample sizes and distributions.

4.2 Simulated Data Scenarios

Three types of data scenarios were generated to examine how Pearson and Spearman
respond to various relationships:
1. Linear relationship with normally distributed data
2. Non-linear monotonic relationship (exponential)
3. Non-monotonic relationship (sinusoidal)
Each scenario was repeated at sample sizes of 30, 100, and 300.
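
The three scenarios could be simulated along the following lines. This is a sketch, not the study's actual code: the seed, the coefficients (2.0 and 3.0), the noise scale of 0.05, and the function name simulate are all assumptions introduced for illustration.

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

rng = np.random.default_rng(seed=0)   # arbitrary seed

def simulate(scenario, n):
    """Return (Pearson r, Spearman rho) for one simulated dataset."""
    x = rng.normal(0.0, 1.0, size=n)
    noise = rng.normal(0.0, 0.05, size=n)   # hypothetical noise level
    if scenario == "linear":
        y = 2.0 * x + noise
    elif scenario == "exponential":          # non-linear but monotonic
        y = np.exp(2.0 * x) + noise
    elif scenario == "sinusoidal":           # non-monotonic
        y = np.sin(3.0 * x) + noise
    else:
        raise ValueError(f"unknown scenario: {scenario}")
    return pearsonr(x, y)[0], spearmanr(x, y)[0]

scenarios = ("linear", "exponential", "sinusoidal")
sizes = (30, 100, 300)
table = {(s, n): simulate(s, n) for s in scenarios for n in sizes}

for (s, n), (r, rho) in table.items():
    print(f"{s:12s} n={n:4d}  r={r:6.3f}  rho={rho:6.3f}")
```

In a sinusoidal scenario of this kind neither coefficient is expected to be large, since the relationship is neither linear nor monotonic.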

4.3 Comparative Results of Pearson vs. Spearman

As the table above shows, Pearson’s and Spearman’s coefficients take very similar values in
the linear-normal datasets, indicating that both metrics perform well under ideal assumptions.
For the exponential and sinusoidal datasets, however, Spearman’s coefficient often remains
higher, indicating greater robustness to non-linearity.

4.4 Graphical Representations

Graphs for each data scenario illustrate the observed relationships. Scatter plots for linear
data show a tightly clustered linear pattern, whereas exponential and sinusoidal data show
patterns where Pearson's measure underestimates the strength of association compared to
Spearman’s.

CHAPTER FIVE

SUMMARY, CONCLUSION AND RECOMMENDATIONS

5.1 Summary of Findings

This study investigated the relationship between Pearson’s and Spearman’s correlation
coefficients both mathematically and through simulation. Key findings are:
- Pearson’s coefficient is highly effective when the data follows a linear and normally
distributed pattern.
- Spearman’s coefficient is more robust to outliers and better captures non-linear
monotonic relationships.
- In simulated datasets with exponential and sinusoidal structures, Spearman's coefficient
demonstrated greater consistency than Pearson's.
- Both coefficients showed similar values under ideal conditions, but diverged significantly
in complex data distributions.

5.2 Conclusion
Pearson and Spearman correlation coefficients, while related, serve different analytical
purposes depending on the nature of the data. Pearson’s method is optimal for linear
associations under strict assumptions, whereas Spearman’s rank-based method provides a
more flexible tool for analyzing ordinal or non-normally distributed data. The study
confirms that the choice of correlation metric should be informed by the underlying data
structure and research objective.

5.3 Recommendations
