
Correlation

The document discusses correlation between variables and different types of correlation. It defines correlation and describes positive and negative correlation and linear and non-linear correlation. It also discusses different methods to measure correlation including scatter diagrams and Karl Pearson's coefficient of correlation.

Uploaded by

tiver53790
Copyright
© © All Rights Reserved

Meaning of Correlation

Correlation indicates the relationship between two variables of a series so that changes in the values of one variable are associated with changes in the values of the other variable.
In other words, when two variables vary simultaneously (either in the same or in opposite directions) and a change in the value of one variable is accompanied by a change in the other variable, then these two variables are said to be correlated. For example, the relationship between height and weight, income and expenditure, price and demand, etc.
TYPES OF CORRELATION

1. Positive and Negative Correlation
   Positive Correlation: when two variables move in the same direction.
   Negative Correlation: when two variables move in opposite directions.
2. Linear and Non-Linear Correlation
   Linear Correlation: when change in one variable tends to bear a constant ratio to the amount of change in the other variable.
   Non-Linear Correlation: when change in one variable does not bear a constant ratio to the amount of change in the other related variable.
3. Simple, Multiple and Partial Correlation
   Simple Correlation: when only two variables are studied.
   Multiple Correlation: when the relationship among three or more variables is studied.
   Partial Correlation: when the relationship between two variables is examined keeping other variables constant.
Degree of Correlation

Degree of Correlation                               Positive Correlation       Negative Correlation
Perfect Correlation                                 +1                         -1
High degree of Correlation                          Between +0.75 and +1       Between -0.75 and -1
Moderate degree of Correlation                      Between +0.25 and +0.75    Between -0.25 and -0.75
Low degree of Correlation                           Between 0 and +0.25        Between 0 and -0.25
No Correlation (Zero Correlation or Uncorrelated)   0                          0
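
The bands in the table can be read as a simple lookup. The following is a minimal Python sketch, not part of the original text. The function name `degree_of_correlation` is hypothetical, and the table does not say which band the boundary values 0.25 and 0.75 belong to, so placing them in the higher band is an assumption.

```python
def degree_of_correlation(r: float) -> str:
    """Label a correlation coefficient using the bands in the table.

    Assumption: the boundary values 0.25 and 0.75 are placed in the
    higher band, because the table leaves this open.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    if r == 0:
        return "No correlation"
    sign = "positive" if r > 0 else "negative"
    size = abs(r)
    if size == 1:
        return f"Perfect {sign} correlation"
    if size >= 0.75:
        return f"High degree of {sign} correlation"
    if size >= 0.25:
        return f"Moderate degree of {sign} correlation"
    return f"Low degree of {sign} correlation"


print(degree_of_correlation(0.3))    # Moderate degree of positive correlation
print(degree_of_correlation(-0.93))  # High degree of negative correlation
```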

10.6 METHODS OF MEASUREMENT OF CORRELATION


There are different methods for measuring correlation between two variables. Some of them are:
1. Scatter Diagram
2. Karl Pearson's Coefficient of Correlation
3. Spearman's Rank Correlation Coefficient
10.7 SCATTER DIAGRAM
Scatter diagram is a simple and attractive method of diagrammatic representation of a bivariate distribution to determine the nature of correlation between the variables.
The scatter diagram can be interpreted in the following ways:
1. Perfect Positive Correlation: If all the points of a scatter diagram fall on a straight line (known as "Line of Best Fit"*) with positive slope (see Fig. 10.3), then the correlation is said to be perfectly positive (r = +1).
*Line of Best Fit is a mathematical concept that correlates points scattered across a graph. It helps to define the relationship between the dots plotted in case of Scatter Diagram.
2. Perfect Negative Correlation: If all the points of a scatter diagram fall on a straight line with negative slope (see Fig. 10.4), then the correlation is said to be perfectly negative (r = -1).
3. Positive Correlation: When all the points of a scatter diagram cluster around a straight line going upwards from left to right, the correlation is positive (see Fig. 10.5 and 10.6).
Fig. 10.3: Perfect Positive Correlation (r = +1). Fig. 10.4: Perfect Negative Correlation (r = -1). Fig. 10.5: High Degree of Positive Correlation. Fig. 10.6: Low Degree of Positive Correlation.

4. Negative Correlation: When all the points of a scatter diagram cluster around a straight line with negative slope, the correlation is said to be negative (see Fig. 10.7 and 10.8).
5. No Correlation: If the points are scattered in a haphazard manner, then it is a case of zero or no correlation (see Fig. 10.9 and 10.10).
Fig. 10.7: High Degree of Negative Correlation. Fig. 10.8: Low Degree of Negative Correlation. Fig. 10.9 and Fig. 10.10: No Correlation.
Practicals on Scatter Diagram
Example 1. Make a scatter diagram for the following data and state the type of correlation between X and Y.
X    10    20    30    40    50
Y    70   140   210   280   350
Solution:
The scatter diagram is obtained by plotting the values of Series X on the X-axis and values of Series Y on the Y-axis. Plotting the values (10, 70), (20, 140), ..., (50, 350) on the graph paper, we get the scatter diagram (see Fig. 10.11).
It is obvious from the scatter diagram that there is perfect positive correlation between the values of Series X and Series Y.
Fig. 10.11: Scatter diagram with Series X on the X-axis and Series Y on the Y-axis.
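
The conclusion of Example 1 can also be checked without drawing the diagram: correlation is perfect when every point lies on one straight line, and positive when that line slopes upwards. The sketch below is an illustration only. The helper name `common_slope` is hypothetical, and it assumes integer data whose first two X values differ.

```python
from fractions import Fraction
from typing import Optional, Sequence


def common_slope(xs: Sequence[int], ys: Sequence[int]) -> Optional[Fraction]:
    """Return the slope if every point lies on one straight line, else None.

    Assumes integer inputs, at least two points, and xs[0] != xs[1].
    """
    slope = Fraction(ys[1] - ys[0], xs[1] - xs[0])
    for x, y in zip(xs, ys):
        # Each point must sit on the line through the first point.
        if y - ys[0] != slope * (x - xs[0]):
            return None
    return slope


X = [10, 20, 30, 40, 50]
Y = [70, 140, 210, 280, 350]
print(common_slope(X, Y))  # 7
```

A positive slope shared by all points corresponds to r = +1, a negative shared slope to r = -1, and `None` means the correlation is not perfect.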
Example 2. Draw a scatter diagram to represent the following values of X and Y variables. Comment on the type and degree of correlation.
X    15    20    25    27    30
Y     7    10    12    16    18
Solution:
Plot the values of variable X on the X-axis and variable Y on the Y-axis.
Fig. 10.12: Scatter diagram of the X and Y variables.
A glance at the above scatter diagram shows that there is an upward trend of the dots from the lower left-hand corner to the upper right-hand corner. It means there is positive correlation between the values of X and Y variables.
10.8 KARL PEARSON'S COEFFICIENT OF CORRELATION
Karl Pearson (1857-1936), a British statistician, was the first person to give a mathematical formula for measuring the degree of relationship between two variables in 1890.
Karl Pearson's coefficient of correlation is also known as 'Product Moment Correlation' or 'Simple Correlation Coefficient'.
It is the most popular and widely used method to calculate the correlation coefficient.
It is denoted by the symbol 'r' (r is a pure number, i.e., it has no unit).

According to Karl Pearson, "Coefficient of correlation is determined by dividing the sum of products of deviations from their respective means by the product of number of pairs and their standard deviations."

It means, Karl Pearson's Coefficient of Correlation (r) is calculated as:

$$r = \frac{\text{Sum of Products of Deviations from their respective means}}{\text{Number of Pairs} \times \text{Standard Deviations of Both Series}}$$

i.e.

$$r = \frac{\sum xy}{N \times \sigma_x \times \sigma_y}$$

Where:
N = Number of pairs of observations
x = Deviation of X series from mean, $(X - \bar{X})$
y = Deviation of Y series from mean, $(Y - \bar{Y})$
$\sigma_x$ = Standard deviation of X series, i.e., $\sqrt{\frac{\sum x^2}{N}}$
$\sigma_y$ = Standard deviation of Y series, i.e., $\sqrt{\frac{\sum y^2}{N}}$
r = Coefficient of Correlation
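
The definition translates almost directly into code. The following minimal Python sketch is an illustration, not the textbook's own procedure. The function name `pearson_r` is hypothetical, and the data are borrowed from Example 1 of the scatter diagram section.

```python
import math


def pearson_r(xs, ys):
    """r = sum(xy) / (N * sigma_x * sigma_y), with deviations taken from the means."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length with at least two pairs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dev_x = [x - mean_x for x in xs]  # x = X - mean of X
    dev_y = [y - mean_y for y in ys]  # y = Y - mean of Y
    sigma_x = math.sqrt(sum(d * d for d in dev_x) / n)
    sigma_y = math.sqrt(sum(d * d for d in dev_y) / n)
    if sigma_x == 0 or sigma_y == 0:
        raise ValueError("r is undefined when a series is constant")
    sum_xy = sum(p * q for p, q in zip(dev_x, dev_y))
    return sum_xy / (n * sigma_x * sigma_y)


X = [10, 20, 30, 40, 50]
Y = [70, 140, 210, 280, 350]
print(round(pearson_r(X, Y), 6))        # 1.0
print(round(pearson_r(X, Y[::-1]), 6))  # -1.0
```

The result is rounded before printing because square roots of non-square numbers introduce small floating-point errors.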
10.9 CALCULATION OF KARL PEARSON'S COEFFICIENT OF CORRELATION
While calculating the coefficient of correlation according to Karl Pearson's formula, we can use the following methods:
1. Actual Mean Method (Example 6)
2. Direct Method (Example 9)
3. Short-Cut Method/Assumed Mean Method/Indirect Method (Example 10)
4. Step Deviation Method (Example 12)
Actual Mean Method

Karl Pearson's method of computing correlation by the Actual Mean method involves the following steps:
Steps for Calculation:
Step 1. Calculate the means of the two series (X and Y), i.e., calculate X̄ and Ȳ.
Step 2. Take the deviations of X series from X̄ (mean of X) and denote the deviations by x.
Step 3. Square these deviations and obtain the total, i.e., Σx².
Step 4. Take the deviations of Y series from Ȳ (mean of Y) and denote the deviations by y.
Step 5. Square these deviations and obtain the total, i.e., Σy².
Step 6. Multiply the respective deviations of X and Y series and obtain their total, i.e., Σxy.
Step 7. Substitute the values of Σxy, Σx² and Σy² in the following formula:

$$r = \frac{\sum xy}{\sqrt{\sum x^2 \times \sum y^2}}$$

Example 6. Calculate the coefficient of correlation for the following data by the Actual Mean Method.
X    12    15    18    21    24    27    30
Y     6     8    10    12    14    16    18
Solution:

Calculation of Coefficient of Correlation (Actual Mean Method)

X      x = X - X̄    x²     Y      y = Y - Ȳ    y²     xy
12     -9           81      6     -6           36     54
15     -6           36      8     -4           16     24
18     -3            9     10     -2            4      6
21      0            0     12      0            0      0
24     +3            9     14     +2            4      6
27     +6           36     16     +4           16     24
30     +9           81     18     +6           36     54
ΣX = 147, Σx² = 252, ΣY = 84, Σy² = 112, Σxy = 168

$$\bar{X} = \frac{\sum X}{N} = \frac{147}{7} = 21 \qquad \bar{Y} = \frac{\sum Y}{N} = \frac{84}{7} = 12$$

Σxy = 168; Σx² = 252; Σy² = 112

$$r = \frac{\sum xy}{\sqrt{\sum x^2 \times \sum y^2}} = \frac{168}{\sqrt{252 \times 112}} = \frac{168}{\sqrt{28{,}224}} = \frac{168}{168} = 1$$

Ans. Coefficient of Correlation = 1. There is perfect positive correlation between the values of Series X and Series Y.
Note: Actual Mean method is a lengthy process because the actual means of both the series are to be calculated and then deviations are taken.
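
The working table of Example 6 can be rebuilt row by row. This is a minimal Python sketch under the assumption that the data are X = 12, 15, ..., 30 and Y = 6, 8, ..., 18, as also listed in Example 9. The function name `actual_mean_table` is hypothetical.

```python
import math

X = [12, 15, 18, 21, 24, 27, 30]
Y = [6, 8, 10, 12, 14, 16, 18]


def actual_mean_table(xs, ys):
    """Return one working row (X, x, x^2, Y, y, y^2, xy) per pair of observations."""
    n = len(xs)
    mean_x = sum(xs) / n  # Step 1: actual means
    mean_y = sum(ys) / n
    rows = []
    for x_val, y_val in zip(xs, ys):
        x = x_val - mean_x  # Step 2: deviation of X from its mean
        y = y_val - mean_y  # Step 4: deviation of Y from its mean
        rows.append((x_val, x, x * x, y_val, y, y * y, x * y))
    return rows


rows = actual_mean_table(X, Y)
sum_x2 = sum(row[2] for row in rows)  # Step 3
sum_y2 = sum(row[5] for row in rows)  # Step 5
sum_xy = sum(row[6] for row in rows)  # Step 6
r = sum_xy / math.sqrt(sum_x2 * sum_y2)  # Step 7
print(sum_x2, sum_y2, sum_xy, r)  # 252.0 112.0 168.0 1.0
```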
Direct Method
The coefficient of correlation can also be obtained without finding out the deviations from the actual means. The steps involved in the Direct Method are:
Steps for Calculation
Step 1. Calculate the sum of Series X and denote it by ΣX.
Step 2. Calculate the sum of Series Y and denote it by ΣY.
Step 3. Square the values of X series and obtain the total, i.e., ΣX².
Step 4. Square the values of Y series and obtain the total, i.e., ΣY².
Step 5. Multiply the values of series X and series Y and obtain their total, i.e., ΣXY.
Step 6. Substitute the values of ΣXY, ΣX, ΣY, ΣX² and ΣY² in the following formula:

$$r = \frac{N\sum XY - \sum X \cdot \sum Y}{\sqrt{N\sum X^2 - (\sum X)^2} \times \sqrt{N\sum Y^2 - (\sum Y)^2}}$$

Example 9. Calculate Product Moment Correlation (or Karl Pearson's Coefficient of Correlation) of the data given in Example 6 by the Direct Method.
Solution: Calculation of Coefficient of Correlation (Direct Method)

X      X²      Y      Y²      XY
12     144      6      36      72
15     225      8      64     120
18     324     10     100     180
21     441     12     144     252
24     576     14     196     336
27     729     16     256     432
30     900     18     324     540
ΣX = 147, ΣX² = 3,339, ΣY = 84, ΣY² = 1,120, ΣXY = 1,932

Here, ΣX = 147; ΣY = 84; ΣX² = 3,339; ΣY² = 1,120; ΣXY = 1,932 and N = 7

$$r = \frac{N\sum XY - \sum X \cdot \sum Y}{\sqrt{N\sum X^2 - (\sum X)^2} \times \sqrt{N\sum Y^2 - (\sum Y)^2}} = \frac{7 \times 1{,}932 - 147 \times 84}{\sqrt{7 \times 3{,}339 - (147)^2} \times \sqrt{7 \times 1{,}120 - (84)^2}}$$

$$= \frac{13{,}524 - 12{,}348}{\sqrt{23{,}373 - 21{,}609} \times \sqrt{7{,}840 - 7{,}056}} = \frac{1{,}176}{\sqrt{1{,}764} \times \sqrt{784}} = \frac{1{,}176}{42 \times 28} = \frac{1{,}176}{1{,}176} = 1$$

Ans. Coefficient of Correlation = 1. There is perfect positive correlation between the values of Series X and Series Y.
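
Because the Direct Method works on raw totals only, it needs no deviations at all. A minimal Python sketch of the calculation in Example 9 follows. The function name `pearson_direct` is hypothetical, and the sketch does not guard against a constant series, which would make the denominator zero.

```python
import math

X = [12, 15, 18, 21, 24, 27, 30]
Y = [6, 8, 10, 12, 14, 16, 18]


def pearson_direct(xs, ys):
    """Direct Method: r from the raw sums of X, Y, X^2, Y^2 and XY."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    numerator = n * sum_xy - sum_x * sum_y
    denominator = (math.sqrt(n * sum_x2 - sum_x ** 2)
                   * math.sqrt(n * sum_y2 - sum_y ** 2))
    return numerator / denominator


print(sum(x * x for x in X), sum(y * y for y in Y))  # 3339 1120
print(pearson_direct(X, Y))                          # 1.0
```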

Short-Cut Method (Assumed Mean Method)

The method based on actual means is quite lengthy and is convenient only when the mean values are whole numbers. But in practice, mean values are often fractions.
In order to avoid difficult calculations, it is better to use the short-cut method (assumed mean method). In this method, deviations are taken from an assumed mean and the following formula is used:

$$r = \frac{N\sum dx\,dy - \sum dx \times \sum dy}{\sqrt{N\sum dx^2 - (\sum dx)^2} \times \sqrt{N\sum dy^2 - (\sum dy)^2}}$$

Where:
N = Number of pairs of observations
Σdx = Sum of deviations of X values from assumed mean.
Σdy = Sum of deviations of Y values from assumed mean.
Σdx² = Sum of squared deviations of X values from assumed mean.
Σdy² = Sum of squared deviations of Y values from assumed mean.
Σdxdy = Sum of the products of deviations dx and dy.
Steps of Short-cut method
Step 1. Take the deviations of X series from the assumed mean, denote them by dx and obtain their total, i.e., Σdx.
Step 2. Square the deviations of X series and obtain the total, i.e., Σdx².
Step 3. Take the deviations of Y series from the assumed mean, denote them by dy and obtain their total, i.e., Σdy.
Step 4. Square the deviations of Y series and obtain the total, i.e., Σdy².
Step 5. Multiply dx with dy and obtain the total Σdxdy.
Step 6. Substitute the values of Σdx, Σdy, Σdx², Σdy² and Σdxdy in the following formula:

$$r = \frac{N\sum dx\,dy - \sum dx \times \sum dy}{\sqrt{N\sum dx^2 - (\sum dx)^2} \times \sqrt{N\sum dy^2 - (\sum dy)^2}}$$
Example 10. Calculate the coefficient of correlation of the data given in Example 6 by the Short-Cut Method.
Solution: Calculation of Coefficient of Correlation (Short-Cut Method)

X          dx = X - A (A = 18)    dx²     Y          dy = Y - A (A = 10)    dy²     dxdy
12         -6                      36      6         -4                      16      24
15         -3                       9      8         -2                       4       6
18 (A)      0                       0     10 (A)      0                       0       0
21         +3                       9     12         +2                       4       6
24         +6                      36     14         +4                      16      24
27         +9                      81     16         +6                      36      54
30         +12                    144     18         +8                      64      96
Σdx = 21, Σdx² = 315, Σdy = 14, Σdy² = 140, Σdxdy = 210

Σdxdy = 210; Σdx = 21; Σdy = 14; N = 7; Σdx² = 315; Σdy² = 140

$$r = \frac{N\sum dx\,dy - \sum dx \times \sum dy}{\sqrt{N\sum dx^2 - (\sum dx)^2} \times \sqrt{N\sum dy^2 - (\sum dy)^2}} = \frac{7 \times 210 - (21)(14)}{\sqrt{7 \times 315 - (21)^2} \times \sqrt{7 \times 140 - (14)^2}}$$

$$= \frac{1{,}470 - 294}{\sqrt{1{,}764} \times \sqrt{784}} = \frac{1{,}176}{42 \times 28} = \frac{1{,}176}{1{,}176} = 1$$

Ans. Coefficient of Correlation = 1. There is perfect positive correlation between the values of Series X and Series Y.
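
The choice of assumed mean does not change the answer, which a short experiment makes visible. The sketch below is illustrative only. The function name `pearson_assumed_mean` and the second pair of assumed means (24 and 14) are hypothetical choices, not taken from the text.

```python
import math

X = [12, 15, 18, 21, 24, 27, 30]
Y = [6, 8, 10, 12, 14, 16, 18]


def pearson_assumed_mean(xs, ys, a_x, a_y):
    """Short-Cut Method: deviations dx, dy are taken from assumed means a_x, a_y."""
    n = len(xs)
    dx = [x - a_x for x in xs]
    dy = [y - a_y for y in ys]
    sum_dx, sum_dy = sum(dx), sum(dy)
    sum_dx2 = sum(d * d for d in dx)
    sum_dy2 = sum(d * d for d in dy)
    sum_dxdy = sum(p * q for p, q in zip(dx, dy))
    numerator = n * sum_dxdy - sum_dx * sum_dy
    denominator = (math.sqrt(n * sum_dx2 - sum_dx ** 2)
                   * math.sqrt(n * sum_dy2 - sum_dy ** 2))
    return numerator / denominator


print(pearson_assumed_mean(X, Y, 18, 10))  # 1.0 (assumed means of Example 10)
print(pearson_assumed_mean(X, Y, 24, 14))  # 1.0 (a different, arbitrary choice)
```

With both assumed means set to 0, the deviations are the raw values and the formula coincides with the Direct Method.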
Step Deviation Method
Step Deviation Method is generally used when the actual mean is in fraction. Use of this method simplifies the method of calculating the coefficient of correlation. Under this method, deviations of X and Y are taken from assumed means and are divided by a common factor. The value of the correlation coefficient is not affected by change of origin and change of scale of X and Y.
Steps of Step Deviation Method
Step 1. Take the deviations of X series from the assumed mean and divide them by a common factor (C) to get step deviations (dx'). Find out their total to get Σdx'.
Step 2. Take the deviations of Y series from the assumed mean and divide them by a common factor to get step deviations (dy'). Obtain their total to get Σdy'.
Step 3. Square the step deviations of X series and obtain the total, i.e., Σdx'².
Step 4. Square the step deviations of Y series and find the total to get Σdy'².
Step 5. Multiply dx' with dy' and obtain the total Σdx'dy'.
Step 6. Substitute the values of Σdx', Σdy', Σdx'², Σdy'² and Σdx'dy' in the following formula:

$$r = \frac{N\sum dx'dy' - \sum dx' \times \sum dy'}{\sqrt{N\sum dx'^2 - (\sum dx')^2} \times \sqrt{N\sum dy'^2 - (\sum dy')^2}}$$

Where:
N = Number of pairs of observations.
Σdx' = Sum of step deviations of X values from assumed mean.
Σdy' = Sum of step deviations of Y values from assumed mean.
Σdx'² = Sum of squared step deviations of X values from assumed mean.
Σdy'² = Sum of squared step deviations of Y values from assumed mean.
Σdx'dy' = Sum of the products of step deviations dx' and dy'.
Example 12. Calculate the coefficient of correlation of the data given in Example 6 by the Step Deviation Method.
Solution: Calculation of Coefficient of Correlation (Step Deviation Method)

For X series: A = 18, C = 3, dx = X - A, dx' = dx/C. For Y series: A = 10, C = 2, dy = Y - A, dy' = dy/C.

X          dx      dx'     dx'²     Y          dy      dy'     dy'²     dx'dy'
12         -6      -2       4        6         -4      -2       4        4
15         -3      -1       1        8         -2      -1       1        1
18 (A)      0       0       0       10 (A)      0       0       0        0
21         +3      +1       1       12         +2      +1       1        1
24         +6      +2       4       14         +4      +2       4        4
27         +9      +3       9       16         +6      +3       9        9
30         +12     +4      16       18         +8      +4      16       16
Σdx' = 7, Σdx'² = 35, Σdy' = 7, Σdy'² = 35, Σdx'dy' = 35

Here, Σdx'dy' = 35; Σdx' = 7; Σdy' = 7; N = 7; Σdx'² = 35; Σdy'² = 35

$$r = \frac{N\sum dx'dy' - \sum dx' \times \sum dy'}{\sqrt{N\sum dx'^2 - (\sum dx')^2} \times \sqrt{N\sum dy'^2 - (\sum dy')^2}} = \frac{7 \times 35 - (7) \times (7)}{\sqrt{7 \times 35 - (7)^2} \times \sqrt{7 \times 35 - (7)^2}}$$

$$= \frac{245 - 49}{\sqrt{196} \times \sqrt{196}} = \frac{196}{196} = 1$$

Ans. Coefficient of Correlation = 1. There is perfect positive correlation between the values of Series X and Series Y.
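
The claim that r is unaffected by change of origin and scale can be tried directly. This is a minimal Python sketch. The function name `pearson_step_deviation` is hypothetical, and requiring positive common factors is an assumption made here, since dividing the two series by factors of opposite sign would flip the sign of r.

```python
import math

X = [12, 15, 18, 21, 24, 27, 30]
Y = [6, 8, 10, 12, 14, 16, 18]


def pearson_step_deviation(xs, ys, a_x, c_x, a_y, c_y):
    """Step Deviation Method: dx' = (X - a_x) / c_x and dy' = (Y - a_y) / c_y."""
    if c_x <= 0 or c_y <= 0:
        # Assumption of this sketch: common factors are positive.
        raise ValueError("common factors must be positive")
    n = len(xs)
    dx = [(x - a_x) / c_x for x in xs]
    dy = [(y - a_y) / c_y for y in ys]
    sum_dx, sum_dy = sum(dx), sum(dy)
    sum_dx2 = sum(d * d for d in dx)
    sum_dy2 = sum(d * d for d in dy)
    sum_dxdy = sum(p * q for p, q in zip(dx, dy))
    numerator = n * sum_dxdy - sum_dx * sum_dy
    denominator = (math.sqrt(n * sum_dx2 - sum_dx ** 2)
                   * math.sqrt(n * sum_dy2 - sum_dy ** 2))
    return numerator / denominator


print(pearson_step_deviation(X, Y, 18, 3, 10, 2))  # 1.0 (values of Example 12)
print(pearson_step_deviation(X, Y, 18, 1, 10, 1))  # 1.0 (no scaling)
```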
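SUMMARY OF KARL PEARSON'S COEFFICIENT OF CORRELATION
Example: Calculate the Coefficient of Correlation (r) from the following data by different methods.
X     2     4     6     8    10    12
Y     6    12    18    24    30    36

1st Method: Actual Mean Method

X      x = X - X̄    x²     Y      y = Y - Ȳ    y²      xy
2      -5           25      6     -15          225     75
4      -3            9     12      -9           81     27
6      -1            1     18      -3            9      3
8      +1            1     24      +3            9      3
10     +3            9     30      +9           81     27
12     +5           25     36     +15          225     75
ΣX = 42, Σx² = 70, ΣY = 126, Σy² = 630, Σxy = 210

$$\bar{X} = \frac{42}{6} = 7 \qquad \bar{Y} = \frac{126}{6} = 21$$

$$r = \frac{\sum xy}{\sqrt{\sum x^2 \times \sum y^2}} = \frac{210}{\sqrt{70 \times 630}} = \frac{210}{210} = 1$$

2nd Method: Direct Method

X      X²      Y      Y²       XY
2        4      6       36      12
4       16     12      144      48
6       36     18      324     108
8       64     24      576     192
10     100     30      900     300
12     144     36    1,296     432
ΣX = 42, ΣX² = 364, ΣY = 126, ΣY² = 3,276, ΣXY = 1,092

$$r = \frac{N\sum XY - \sum X \cdot \sum Y}{\sqrt{N\sum X^2 - (\sum X)^2} \times \sqrt{N\sum Y^2 - (\sum Y)^2}} = \frac{6 \times 1{,}092 - 42 \times 126}{\sqrt{6 \times 364 - (42)^2} \times \sqrt{6 \times 3{,}276 - (126)^2}}$$

$$= \frac{6{,}552 - 5{,}292}{\sqrt{2{,}184 - 1{,}764} \times \sqrt{19{,}656 - 15{,}876}} = \frac{1{,}260}{\sqrt{420 \times 3{,}780}} = \frac{1{,}260}{1{,}260} = 1$$

3rd Method: Short-Cut Method or Assumed Mean Method

X         dx = X - A (A = 8)    dx²     Y          dy = Y - A (A = 24)    dy²     dxdy
2         -6                     36      6         -18                    324     108
4         -4                     16     12         -12                    144      48
6         -2                      4     18          -6                     36      12
8 (A)      0                      0     24 (A)       0                      0       0
10        +2                      4     30          +6                     36      12
12        +4                     16     36         +12                    144      48
Σdx = -6, Σdx² = 76, Σdy = -18, Σdy² = 684, Σdxdy = 228

$$r = \frac{N\sum dx\,dy - \sum dx \times \sum dy}{\sqrt{N\sum dx^2 - (\sum dx)^2} \times \sqrt{N\sum dy^2 - (\sum dy)^2}} = \frac{(6 \times 228) - (-6 \times -18)}{\sqrt{6 \times 76 - (-6)^2} \times \sqrt{6 \times 684 - (-18)^2}}$$

$$= \frac{1{,}368 - 108}{\sqrt{456 - 36} \times \sqrt{4{,}104 - 324}} = \frac{1{,}260}{\sqrt{420 \times 3{,}780}} = \frac{1{,}260}{1{,}260} = 1$$

4th Method: Step Deviation Method

For X series: A = 8, C = 2. For Y series: A = 24, C = 6.

X         dx = X - A    dx' = dx/C    dx'²     Y          dy = Y - A    dy' = dy/C    dy'²     dx'dy'
2         -6            -3             9        6         -18           -3             9        9
4         -4            -2             4       12         -12           -2             4        4
6         -2            -1             1       18          -6           -1             1        1
8 (A)      0             0             0       24 (A)       0            0             0        0
10        +2            +1             1       30          +6           +1             1        1
12        +4            +2             4       36         +12           +2             4        4
Σdx' = -3, Σdx'² = 19, Σdy' = -3, Σdy'² = 19, Σdx'dy' = 19

$$r = \frac{N\sum dx'dy' - \sum dx' \times \sum dy'}{\sqrt{N\sum dx'^2 - (\sum dx')^2} \times \sqrt{N\sum dy'^2 - (\sum dy')^2}} = \frac{(6 \times 19) - (-3 \times -3)}{\sqrt{6 \times 19 - (-3)^2} \times \sqrt{6 \times 19 - (-3)^2}}$$

$$= \frac{114 - 9}{\sqrt{114 - 9} \times \sqrt{114 - 9}} = \frac{105}{\sqrt{105 \times 105}} = \frac{105}{105} = 1$$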
SUMMARY OF SPEARMAN'S RANK CORRELATION

1st Case: When Ranks are Given
Example 1. In a competition, two judges rank the 5 contestants as follows:
Judge 1    1    2    3    4    5
Judge 2    4    2    1    3    5
Calculate the coefficient of rank correlation.
Solution:

Ranks by Judge 1 (R₁)    Ranks by Judge 2 (R₂)    D = R₁ - R₂    D²
1                        4                        -3             9
2                        2                         0             0
3                        1                         2             4
4                        3                         1             1
5                        5                         0             0
ΣD² = 14

$$r_k = 1 - \frac{6\sum D^2}{N^3 - N} = 1 - \frac{6 \times 14}{(5)^3 - 5} = 1 - \frac{84}{120} = 0.3$$

2nd Case: When Ranks are NOT Given
Example 2. Calculate Spearman's Rank correlation coefficient from the following data:
X    87    22    33    75    37
Y    29    63    52    46    48
Solution: It is necessary to assign ranks. Assigning rank from the highest to the lowest.

X      Ranks (R₁)    Y      Ranks (R₂)    D = R₁ - R₂    D²
87     1             29     5             -4             16
22     5             63     1              4             16
33     4             52     2              2              4
75     2             46     4             -2              4
37     3             48     3              0              0
ΣD² = 40

$$r_k = 1 - \frac{6\sum D^2}{N^3 - N} = 1 - \frac{6 \times 40}{(5)^3 - 5} = 1 - \frac{240}{120} = -1$$
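
Assigning ranks and applying the rank formula is mechanical enough to script. The sketch below is illustrative and uses the X and Y values of Example 2. The function names `ranks_descending` and `spearman_from_ranks` are hypothetical, and the ranking helper assumes that no value is repeated.

```python
def ranks_descending(values):
    """Rank 1 goes to the highest value. Assumes no repeated values."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]


def spearman_from_ranks(r1, r2):
    """Rank correlation: 1 - 6 * sum(D^2) / (N^3 - N), where D = R1 - R2."""
    n = len(r1)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(r1, r2))
    return 1 - 6 * sum_d2 / (n ** 3 - n)


X = [87, 22, 33, 75, 37]
Y = [29, 63, 52, 46, 48]
rank_x = ranks_descending(X)
rank_y = ranks_descending(Y)
print(rank_x)                               # [1, 5, 4, 2, 3]
print(rank_y)                               # [5, 1, 2, 4, 3]
print(spearman_from_ranks(rank_x, rank_y))  # -1.0
```

When ranks are already given, as in the first case, `spearman_from_ranks` can be called on them directly.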

3rd Case: When Ranks are EQUAL or REPEATED

Example 3. Calculate Spearman's Rank correlation coefficient from the following data:
X    90    88    75    75    74    70    65    62
Y    18    25    34    34    34    42    38    47
Solution: It is necessary to assign ranks. Assigning rank from the highest to the lowest.

X      Ranks (R₁)    Y      Ranks (R₂)    D = R₁ - R₂    D²
90     1             18     8             -7             49
88     2             25     7             -5             25
75     3.5           34     5             -1.5            2.25
75     3.5           34     5             -1.5            2.25
74     5             34     5              0              0
70     6             42     2              4             16
65     7             38     3              4             16
62     8             47     1              7             49
ΣD² = 159.5

75 is repeated twice in series X and 34 is repeated thrice in series Y. Therefore, in X, m = 2 and in Y, m = 3.

$$r_k = 1 - \frac{6\left[\sum D^2 + \frac{1}{12}(m^3 - m) + \frac{1}{12}(m^3 - m)\right]}{N^3 - N}$$

$$= 1 - \frac{6\left[159.5 + \frac{1}{12}(2^3 - 2) + \frac{1}{12}(3^3 - 3)\right]}{8^3 - 8} = 1 - \frac{6(159.5 + 0.5 + 2)}{512 - 8} = 1 - \frac{6 \times 162}{504} = -0.93$$
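
The repeated-rank case adds two things to the calculation: tied values share the average of the rank positions they occupy, and ΣD² is increased by (m³ - m)/12 for every group of m tied values. The following minimal Python sketch applies this to Example 3. The function names `average_ranks`, `tie_correction` and `spearman_with_ties` are hypothetical.

```python
from collections import Counter


def average_ranks(values):
    """Rank 1 goes to the highest value; tied values share their average rank."""
    ordered = sorted(values, reverse=True)
    positions = {}
    for position, value in enumerate(ordered, start=1):
        positions.setdefault(value, []).append(position)
    return [sum(positions[v]) / len(positions[v]) for v in values]


def tie_correction(values):
    """Sum of (m^3 - m) / 12 over every value that occurs m > 1 times."""
    return sum((m ** 3 - m) / 12 for m in Counter(values).values() if m > 1)


def spearman_with_ties(xs, ys):
    n = len(xs)
    rank_x, rank_y = average_ranks(xs), average_ranks(ys)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    corrected = sum_d2 + tie_correction(xs) + tie_correction(ys)
    return 1 - 6 * corrected / (n ** 3 - n)


X = [90, 88, 75, 75, 74, 70, 65, 62]
Y = [18, 25, 34, 34, 34, 42, 38, 47]
print(average_ranks(X))                   # [1.0, 2.0, 3.5, 3.5, 5.0, 6.0, 7.0, 8.0]
print(tie_correction(X), tie_correction(Y))  # 0.5 2.0
print(round(spearman_with_ties(X, Y), 2))    # -0.93
```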
