3.0 Measures of Dispersion
Spread is the degree of scatter or variation of a variable about a central value (the mean). Let us
now consider two distributions representing the marks scored by two groups of 6 students each.
Group    1    2    3    4    5    6    Total
A       20   22   23   27   28   30    150
B       15   18   21   29   32   35    150
It is clear from the given data that the mean of each series is 25, so the two series do not
differ as far as their means are concerned.
However, if we examine the items of the two distributions, we find that the marks in group A
vary between 20 and 30, while in group B the variation is between 15 and 35. This certainly
indicates a difference between the distributions. The difference arises because the
divergence from the mean is different in the two groups.
In other words, the scatter or dispersion from the mean is less in group A than in group B. This
scatter or lack of uniformity in the series of items of a group is known as dispersion.
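As a quick check of the comparison above, the following minimal Python sketch computes the mean of each group and the deviation of each mark from that mean. The function and variable names are illustrative only and are not part of the original notes.

```python
# Marks of the two groups, taken from the table above.
group_a = [20, 22, 23, 27, 28, 30]
group_b = [15, 18, 21, 29, 32, 35]

def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)

def deviations(values):
    """Deviation of each value from the mean of the list."""
    m = mean(values)
    return [v - m for v in values]

mean_a, mean_b = mean(group_a), mean(group_b)
dev_a, dev_b = deviations(group_a), deviations(group_b)

print("means:", mean_a, mean_b)
print("deviations A:", dev_a)
print("deviations B:", dev_b)
```

Both means come out equal, while the deviations in group B are larger in magnitude than those in group A, which is the point the text is making.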
The following measures of dispersion are commonly used:
a) Range
b) Quartile deviation
c) Mean absolute deviation
d) Standard deviation
3.1 Range
It is the simplest measure of dispersion and is given by the difference between the largest and
smallest value of the items in a distribution.
It is a very crude measure of dispersion because it depends only on the two extreme values. For a
continuous (grouped) distribution, it is the upper limit of the highest class interval minus the
lower limit of the lowest class interval.
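The two definitions of the range can be sketched in Python as follows. The helper names `value_range` and `grouped_range` and the class intervals are assumptions made for illustration; only the marks of groups A and B come from the notes.

```python
def value_range(values):
    """Range of ungrouped data: largest value minus smallest value."""
    return max(values) - min(values)

def grouped_range(class_limits):
    """Range of grouped data given (lower, upper) limits of each class."""
    lowest_lower = min(lower for lower, _ in class_limits)
    highest_upper = max(upper for _, upper in class_limits)
    return highest_upper - lowest_lower

group_a = [20, 22, 23, 27, 28, 30]
group_b = [15, 18, 21, 29, 32, 35]
range_a = value_range(group_a)
range_b = value_range(group_b)

# Hypothetical class intervals, made up for illustration only.
classes = [(0, 10), (10, 20), (20, 30), (30, 40)]
range_grouped = grouped_range(classes)

print(range_a, range_b, range_grouped)
```

Because only the two extremes enter the calculation, a single unusual observation changes the range completely, which is why the text calls it a crude measure.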
3.2 Quartile deviation
Quartiles are the values of the items in an ordered series that divide it into four equal parts. The
difference between the upper quartile ($Q_3$) and the lower quartile ($Q_1$) gives the quartile range,
but
$$Q = \frac{Q_3 - Q_1}{2}$$
is called the quartile deviation.
3.4 Mean Absolute Deviation
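It is defined as the average of the deviations from the mean, mode or median, with all deviations
taken as positive.
Mathematically, it is expressed as follows:
$$\text{MAD} = \frac{\sum |X_i - \bar{X}|}{N}$$
or, when the values $X_i$ occur with frequencies $f_i$,
$$\text{MAD} = \frac{\sum f_i |X_i - \bar{X}|}{N}$$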
The mean absolute deviation (MAD) is a better measure of dispersion than the range
or the quartile deviation. Its disadvantage is that it forcibly removes the negative signs by
taking absolute values.
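A minimal Python sketch of the definition is given below. The function name and the optional `center` argument are assumptions for illustration; the argument lets the deviations be taken from the median instead of the mean, as the definition allows.

```python
from statistics import mean, median

def mean_absolute_deviation(values, center=None):
    """Average absolute deviation of values from `center`.

    If `center` is not given, the arithmetic mean is used.
    """
    if center is None:
        center = mean(values)
    return sum(abs(v - center) for v in values) / len(values)

# Marks of groups A and B from the start of this section.
group_a = [20, 22, 23, 27, 28, 30]
group_b = [15, 18, 21, 29, 32, 35]

mad_a = mean_absolute_deviation(group_a)
mad_b = mean_absolute_deviation(group_b)
mad_b_median = mean_absolute_deviation(group_b, center=median(group_b))

print(mad_a, mad_b, mad_b_median)
```

Group B has the larger MAD, in line with its wider scatter about the common mean of 25.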
Example 1:
Find the quartile deviation and the mean absolute deviation for the following data.
3, 6, 9, 10, 7, 12, 13, 15, 6, 5, 13
Solution: Sorted data: 3, 5, 6, 6, 7, 9, 10, 12, 13, 13, 15
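Recall that $Q_1 = 6$ and $Q_3 = 13$ from earlier calculations.
$$Q = \frac{Q_3 - Q_1}{2} = \frac{1}{2}(13 - 6) = 3.5$$
Arithmetic mean:
$$\bar{x} = \frac{\sum fx}{n} = \frac{3 + 5 + \dots + 15}{11} = 9$$
$$\text{MAD} = \frac{\sum |X_i - \bar{X}|}{n} = \frac{|3-9| + |5-9| + |6-9| + \dots + |15-9|}{11} = \frac{36}{11} = 3.2727$$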
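The figures in Example 1 can be checked with a short Python sketch. It assumes the quartiles are located at positions $(n+1)/4$ and $3(n+1)/4$ in the sorted data, which is what `statistics.quantiles` does with its default `"exclusive"` method; other quartile conventions can give different values.

```python
from statistics import quantiles

data = [3, 6, 9, 10, 7, 12, 13, 15, 6, 5, 13]

# quantiles() sorts the data itself; n=4 returns the three quartile cut points.
q1, _, q3 = quantiles(data, n=4)
quartile_deviation = (q3 - q1) / 2

# Mean absolute deviation about the arithmetic mean.
x_bar = sum(data) / len(data)
mad = sum(abs(x - x_bar) for x in data) / len(data)

print(q1, q3, quartile_deviation, x_bar, round(mad, 4))
```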
3.5 Variance and Standard Deviation
Ignoring the negative signs in order to compute the MAD is not the only way to deal with
deviations. We can instead square the deviations and then average them. The average of the squared
deviations from the mean is called the variance, denoted $s^2$, and it is given by
$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n}$$
and its standard deviation is
$$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n}}.$$
If $x_1, x_2, \dots, x_n$ occur with frequencies $f_1, f_2, \dots, f_n$, then
$$s^2 = \frac{\sum f_i (x_i - \bar{x})^2}{n}$$
and its corresponding standard deviation is
$$s = \sqrt{\frac{\sum f_i (x_i - \bar{x})^2}{n}},$$
where the divisor $n$ is now the total frequency $\sum f_i$.
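Variance for ungrouped data:
$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n}$$
and its standard deviation is
$$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n}}$$
Variance for grouped data:
$$s^2 = \frac{\sum f_i (x_i - \bar{x})^2}{n}$$
and its standard deviation is
$$s = \sqrt{\frac{\sum f_i (x_i - \bar{x})^2}{n}}$$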
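Other useful formulae
$$s^2 = \frac{\sum x_i^2}{n} - \bar{x}^2, \qquad s = \sqrt{\frac{\sum x_i^2}{n} - \bar{x}^2}$$
$$s^2 = \frac{\sum f_i x_i^2}{n} - \bar{x}^2, \qquad s = \sqrt{\frac{\sum f_i x_i^2}{n} - \bar{x}^2}$$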
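The agreement between the definition of the variance and the shortcut form $\sum x_i^2 / n - \bar{x}^2$ can be illustrated numerically. The sketch below uses the data of Example 1; the function names are illustrative and the divisor is $n$, as in the formulas of this section.

```python
from math import sqrt

def variance_definition(values):
    """Average of the squared deviations from the mean (divisor n)."""
    n = len(values)
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / n

def variance_shortcut(values):
    """Mean of the squares minus the square of the mean (divisor n)."""
    n = len(values)
    x_bar = sum(values) / n
    return sum(x * x for x in values) / n - x_bar ** 2

data = [3, 5, 6, 6, 7, 9, 10, 12, 13, 13, 15]
var_def = variance_definition(data)
var_short = variance_shortcut(data)
sd = sqrt(var_def)

print(var_def, var_short, sd)
```

The two results agree up to floating-point rounding. The shortcut form avoids computing each deviation separately, which is convenient for hand calculation.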
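If we define $d_i = x_i - A \Rightarrow x_i = d_i + A$, where $A$ is an assumed mean, then
$$x_i - \bar{x} = (d_i + A) - (A + \bar{d}) \quad \text{since } \bar{x} = A + \bar{d},$$
$$x_i - \bar{x} = d_i - \bar{d}$$
$$\Rightarrow \sum f_i (x_i - \bar{x})^2 = \sum f_i (d_i - \bar{d})^2 = \sum f_i d_i^2 - N\bar{d}^2$$
$$\Rightarrow s^2 = \frac{1}{N}\sum f_i d_i^2 - \bar{d}^2$$
If the class intervals have the same width $C$, then we can write
$$u_i = \frac{x_i - A}{C} = \frac{d_i}{C} \Rightarrow d_i = C u_i$$
so that
$$s^2 = \frac{1}{N}\sum f_i d_i^2 - \bar{d}^2 = \frac{1}{N}\sum f_i (C u_i)^2 - \left(\frac{1}{N}\sum f_i (C u_i)\right)^2$$
$$s^2 = C^2\,\frac{1}{N}\sum f_i u_i^2 - C^2\left(\frac{\sum f_i u_i}{N}\right)^2$$
$$s^2 = C^2\left[\frac{1}{N}\sum f_i u_i^2 - \left(\frac{\sum f_i u_i}{N}\right)^2\right]$$
Or
$$s^2 = C^2\left[\frac{1}{N}\sum f u^2 - \left(\frac{\sum f u}{N}\right)^2\right]$$
and its standard deviation is given by
$$s = C\sqrt{\frac{1}{N}\sum f u^2 - \left(\frac{\sum f u}{N}\right)^2}$$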
Example: Find the variance and standard deviation of the following frequency distribution of lengths.
Length (mm)   118-126   127-135   136-144   145-153   154-162   163-171   172-180
Frequency        3         5         9        12         5         4         2
Solution
Take the assumed mean A = 149 and the class width C = 9, so that u = (x - A)/C.
Mid pt (x)     f      fx       fx^2      u     fu    fu^2
122            3      366     44652     -3     -9     27
131            5      655     85805     -2    -10     20
140            9     1260    176400     -1     -9      9
149           12     1788    266412      0      0      0
158            5      790    124820      1      5      5
167            4      668    111556      2      8     16
176            2      352     61952      3      6     18
Total         40     5879    871597      0     -9     95
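Direct method, with $\bar{x} = \frac{\sum f_i x_i}{n} = \frac{5879}{40} = 146.975$:
$$s^2 = \frac{\sum f_i x_i^2}{n} - \bar{x}^2 = \frac{871597}{40} - 146.975^2 = 188.274, \qquad s = \sqrt{188.274} = 13.72\ \text{mm}$$
Coding method:
$$s^2 = C^2\left[\frac{1}{N}\sum f u^2 - \left(\frac{\sum f u}{N}\right)^2\right] = 81\left[\frac{95}{40} - \left(\frac{-9}{40}\right)^2\right] = 188.274375$$
$$s = \sqrt{188.274} = 13.72\ \text{mm}$$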
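The worked example can be reproduced with the Python sketch below, which computes the variance both directly and by the coding method. The assumed mean A = 149 and class width C = 9 are inferred from the table (u = 0 at the midpoint 149, consecutive midpoints 9 apart); the variable names are illustrative.

```python
from math import sqrt

midpoints = [122, 131, 140, 149, 158, 167, 176]
freqs = [3, 5, 9, 12, 5, 4, 2]
A, C = 149, 9  # assumed mean and class width, inferred from the table

N = sum(freqs)

# Direct method: s^2 = sum(f x^2)/N - mean^2
mean_x = sum(f * x for f, x in zip(freqs, midpoints)) / N
var_direct = sum(f * x * x for f, x in zip(freqs, midpoints)) / N - mean_x ** 2

# Coding method: u = (x - A)/C, s^2 = C^2 [sum(f u^2)/N - (sum(f u)/N)^2]
u = [(x - A) / C for x in midpoints]
sum_fu = sum(f * ui for f, ui in zip(freqs, u))
sum_fu2 = sum(f * ui * ui for f, ui in zip(freqs, u))
var_coded = C ** 2 * (sum_fu2 / N - (sum_fu / N) ** 2)

sd = sqrt(var_coded)
print(N, mean_x, sum_fu, sum_fu2, var_direct, var_coded, round(sd, 2))
```

Both methods give the same variance. The coding method works with small integers, which is its main advantage in hand calculation.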
3.6 Properties of measures of Spread
i) They are not affected by a change of origin. Adding or subtracting a
constant from every observation in a data set does not
affect any measure of spread. That is, new measure = old measure.
ii) They are affected by a change of scale. Multiplying every
observation in a data set by a constant scales all the
measures of spread by the same constant (in absolute value), except the
variance, which is scaled by the square of that constant.
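Both properties can be illustrated numerically. The sketch below uses the data of Example 1 with an arbitrary shift of 100 and an arbitrary scale factor of 4, both chosen only for illustration. `pstdev` and `pvariance` from Python's `statistics` module use the divisor $n$, as in this section.

```python
from statistics import pstdev, pvariance

data = [3, 6, 9, 10, 7, 12, 13, 15, 6, 5, 13]
shift, scale = 100, 4  # arbitrary constants for illustration

shifted = [x + shift for x in data]
scaled = [scale * x for x in data]

def value_range(values):
    """Largest value minus smallest value."""
    return max(values) - min(values)

# Change of origin: the measures of spread are unchanged.
print(value_range(data), value_range(shifted))
print(pstdev(data), pstdev(shifted))

# Change of scale: range and standard deviation scale by 4, variance by 16.
print(value_range(scaled) / value_range(data))
print(pstdev(scaled) / pstdev(data))
print(pvariance(scaled) / pvariance(data))
```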
3.7 Measures of Relative Dispersion
These measures are used in comparing spreads of two or more sets of observations. These
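measures are independent of the units of measurement. They are a sort of ratio and are called
coefficients.
$$\text{Coefficient of Quartile Deviation (CQD)} = \frac{Q_3 - Q_1}{Q_3 + Q_1} \times 100\%$$
$$\text{Coefficient of Mean Deviation (CMD)} = \frac{\text{MAD}}{\text{mean}} \times 100\% = \frac{\text{M.A.D}}{\bar{X}} \times 100\%$$
$$\text{Coefficient of Variation} = \frac{\text{standard deviation}}{\text{mean}} \times 100\% = \frac{\sigma}{\bar{X}} \times 100\%$$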
Coefficient of variation is the percentage ratio of standard deviation and the arithmetic mean.
Example 1: Below are the scores of two cricketers in 10 innings. Find which player is the more
"consistent scorer" by the indirect method.
A 204 68 150 30 70 95 60 76 24 19
B 99 190 130 94 80 89 69 85 65 40
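Solution: $\bar{x}_A = 79.6$, $S_A = 58.2$, $\bar{x}_B = 94.1$, $S_B = 41.1$ (these standard deviations correspond to the divisor $n - 1$).
Coefficient of variation for player A is
$$CV(X)_A = \frac{58.2}{79.6} \times 100\% = 73.153\%,$$
Coefficient of variation for player B is
$$CV(X)_B = \frac{41.1}{94.1} \times 100\% = 43.7028\%$$
(the percentages are obtained from the unrounded standard deviations).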
Conclusion
The coefficient of variation of A is greater than the coefficient of variation of B, and hence we conclude
that player B is the more consistent scorer.
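The comparison can be reproduced with the Python sketch below. It uses `statistics.stdev`, which divides by $n - 1$; this is the convention that reproduces the quoted values $S_A = 58.2$ and $S_B = 41.1$. With the divisor $n$ used elsewhere in this section the standard deviations would be somewhat smaller, although the conclusion is the same. The function name is illustrative.

```python
from statistics import mean, stdev

scores_a = [204, 68, 150, 30, 70, 95, 60, 76, 24, 19]
scores_b = [99, 190, 130, 94, 80, 89, 69, 85, 65, 40]

def coefficient_of_variation(values):
    """Standard deviation (divisor n - 1) as a percentage of the mean."""
    return stdev(values) / mean(values) * 100

cv_a = coefficient_of_variation(scores_a)
cv_b = coefficient_of_variation(scores_b)

# The player with the smaller coefficient of variation is the more consistent.
more_consistent = "A" if cv_a < cv_b else "B"

print(round(cv_a, 3), round(cv_b, 3), more_consistent)
```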