Measures of central tendency
Measures of dispersion
Measures of shape
Introduction
• The resulting “pictures” of frequency distributions illustrated trends
and patterns in the data.
• In most cases, however, we need more exact measures.
• In these cases, we can use single numbers called summary statistics to
describe characteristics of a data set.
• Two of these characteristics are particularly important to decision
makers: central tendency and dispersion.
Central Tendency
• Central tendency is the middle point of a distribution.
• Measures of central tendency are also called measures of location.
Dispersion
• Dispersion is the spread of the data in a distribution, that is, the extent to which the
observations are scattered.
• There are two other characteristics of data sets that provide useful
information: skewness and kurtosis.
• Skewness: Curves representing the data points in the data set may be either
symmetrical or skewed.
• Symmetrical curves are such that a vertical line drawn from the center of the
curve to the horizontal axis divides the area of the curve into two equal parts.
• Each part is the mirror image of the other.
• Kurtosis: When we measure the kurtosis of a distribution, we are measuring its
peakedness.
• For example, two curves A and B may differ only in that one is more peaked than the other.
• They have the same central location and dispersion, and both are symmetrical.
• Statisticians say that the two curves have different degrees of kurtosis.
Measures of central tendency
(a) Mathematical Averages :
(i) Arithmetic Mean or Mean
(ii) Geometric Mean
(iii) Harmonic Mean
(iv) Quadratic Mean
(b) Positional Averages:
(i) Median
(ii) Mode
Measures of Central Tendency
• Average – a measure of the center value or central tendency of a
distribution of values.
• Three types of average:
• Mode
• Median
• Mean
ARITHMETIC MEAN
• Direct Method: Under this method, the mean $\bar{X}$ is obtained by dividing the sum of
the observations by the number of observations $n$, i.e., $\bar{X} = \frac{\sum X}{n}$.
Example 1: The following figures relate to the monthly output of
cloth of a factory in a given year:
Calculating the Mean from Grouped Data
Example 2: The following is the frequency
distribution of age of 670 students of a school.
Compute the arithmetic mean of the data.
Example 3: Calculate arithmetic mean of the
following distribution
Solution
Class Interval    x     f     fx
0-10              5     3     15
10-20             15    8     120
20-30             25    12    300
30-40             35    15    525
40-50             45    18    810
50-60             55    16    880
60-70             65    11    715
70-80             75    5     375
Total                   88    3740
• Arithmetic mean: $\bar{X} = \frac{\sum fx}{\sum f} = \frac{3740}{88} = 42.5$
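The grouped-data calculation above can be checked with a few lines of R. This is only a minimal sketch: the vectors x and f are copied from the solution table, and the variable names (fx, grouped_mean) are chosen here for illustration.

```r
# Class midpoints (x) and class frequencies (f), copied from the solution table.
x <- c(5, 15, 25, 35, 45, 55, 65, 75)
f <- c(3, 8, 12, 15, 18, 16, 11, 5)

# The fx column: each midpoint multiplied by its frequency.
fx <- f * x

# Grouped arithmetic mean = sum of fx divided by sum of f.
grouped_mean <- sum(fx) / sum(f)

print(sum(f))        # total frequency: 88
print(sum(fx))       # total of the fx column: 3740
print(grouped_mean)  # 42.5
```

Using the class midpoint for x assumes that the observations in each class are spread evenly around the middle of the interval.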
Weighted Arithmetic Mean
• In the computation of simple arithmetic mean, equal importance is
given to all the items.
• But this may not be so in all situations. If all the items are not of equal
importance, then simple arithmetic mean will not be a good
representative of the given data.
• Hence, weighting of different items becomes necessary. The weights
are assigned to different items depending upon their importance,
i.e., more important items are assigned more weight.
For example
• Suppose we calculate the mean wage of the workers of a factory in which
a few managers earn much more than the other workers.
• The simple arithmetic mean, in such a situation, will give a higher
value that cannot be regarded as a representative wage for the group.
• In order that the mean wage gives a realistic picture of the distribution,
the wages of managers should be given less importance in its
computation.
• The computation of the weighted arithmetic mean is useful in many situations
where different items are of unequal importance,
e.g., the construction of index numbers and the computation of standardized
death and birth rates.
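Formulae for Weighted Arithmetic Mean
• If the values $X_1, \dots, X_n$ carry weights $w_1, \dots, w_n$, the weighted
arithmetic mean is $\bar{X}_w = \frac{\sum wX}{\sum w}$.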
Example: Ram purchased equity shares of a
company in 4 successive months, as given below.
Find the average price per share.
Solution
Month       Price of Share (X), in Rs.    No. of Shares (w)    XW
Dec 91      100                           200                  20000
Jan 92      150                           250                  37500
Feb 92      200                           280                  56000
March 92    125                           300                  37500
Total                                     1030                 151000
• Average price per share: $\frac{\sum XW}{\sum w} = \frac{151000}{1030} \approx$ Rs. 146.60
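The share-price table can be reproduced in R as a quick check. The sketch below computes the weighted mean twice, once from the column totals and once with R's built-in weighted.mean(); the variable names are illustrative only.

```r
# Price per share (X) and number of shares bought (w), copied from the table.
price  <- c(100, 150, 200, 125)
shares <- c(200, 250, 280, 300)

# Weighted mean from the definition: sum of XW divided by sum of w.
by_formula <- sum(price * shares) / sum(shares)

# The same quantity using R's built-in function.
by_builtin <- weighted.mean(price, shares)

# Simple (unweighted) mean of the four prices, for comparison.
simple_mean <- mean(price)

print(sum(price * shares))  # 151000
print(sum(shares))          # 1030
print(by_formula)           # about 146.60
print(simple_mean)          # 143.75
```

The simple mean ignores how many shares were bought at each price, which is why it differs from the average price actually paid per share.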
Example: From the following results of two
colleges A and B, find out which of the two is
better :
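Arithmetic Mean of combined series
• If two series of sizes $n_1$ and $n_2$ have means $\bar{X}_1$ and $\bar{X}_2$, the mean of
the combined series is $\bar{X} = \frac{n_1 \bar{X}_1 + n_2 \bar{X}_2}{n_1 + n_2}$.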
Example: There are 130 teachers and 100 non-teaching employees in a
college. The respective distributions of their monthly salaries are given
in the following table :
Solution
Median
• The median of a distribution is that value of the variate which divides it into
two equal parts.
• In terms of frequency curve, the ordinate drawn at median divides the
area under the curve into two equal parts.
• Median is a positional average because its value depends upon the
position of an item and not on its magnitude.
Determination of Median
• (a) When individual observations are given
Example: Find median of the following
observations :
• 20, 15, 25, 28, 18, 16, 30.
• 245, 230, 265, 236, 220, 250.
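Both sets of observations can be worked through in R. The sketch below first follows the manual procedure (sort, then pick the middle item, or average the two middle items when the count is even) and then compares with the built-in median(). The variable names are illustrative.

```r
# The two sets of observations from the example.
a <- c(20, 15, 25, 28, 18, 16, 30)       # n = 7 (odd)
b <- c(245, 230, 265, 236, 220, 250)     # n = 6 (even)

# Step 1: arrange the observations in order of magnitude.
a_sorted <- sort(a)
b_sorted <- sort(b)

# Step 2 (odd n): the median is the ((n + 1) / 2)th observation.
median_a <- a_sorted[(length(a) + 1) / 2]

# Step 2 (even n): the median is the mean of the (n / 2)th and (n / 2 + 1)th observations.
n_b <- length(b)
median_b <- mean(b_sorted[c(n_b / 2, n_b / 2 + 1)])

print(median_a)   # 20
print(median_b)   # 240.5

# The built-in function gives the same values.
print(median(a))
print(median(b))
```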
(b) When an ungrouped frequency distribution is given
• In this case, the data are already arranged in the order of magnitude.
• Here, cumulative frequency is computed and the median is
determined in a manner similar to that of individual observations.
• Variable (X)  :  10  11  12  13  14  15  16
• Frequency (f) :   8  15  25  20  12  10   5
Sol.
• X    :  10  11  12  13  14  15  16
• f    :   8  15  25  20  12  10   5
• c.f. :   8  23  48  68  80  90  95
• Here N = 95, which is odd. Thus, the median is the size of the [(95 + 1)/2]th,
i.e., the 48th, observation.
• From the table, the 48th observation is 12.
• ∴ Md = 12.
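The cumulative-frequency method used in this solution can be sketched in R as follows. The code assumes N is odd, as it is here, so that a single middle position exists; the variable names are illustrative.

```r
# Values and frequencies copied from the example.
x <- 10:16
f <- c(8, 15, 25, 20, 12, 10, 5)

# Cumulative frequencies (the c.f. row of the table).
cf <- cumsum(f)

# Total number of observations and the position of the median (N is odd here).
N <- sum(f)
pos <- (N + 1) / 2

# The median is the first value whose cumulative frequency reaches that position.
md <- x[which(cf >= pos)[1]]

print(cf)    # 8 23 48 68 80 90 95
print(pos)   # 48
print(md)    # 12

# Cross-check: expand the table into individual observations and use median().
print(median(rep(x, times = f)))
```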
MODE
• Mode is that value of the variate which occurs the maximum number of
times in a distribution and around which other items are densely
distributed.
• In the words of Croxton and Cowden, “The mode of a distribution is
the value at the point around which the items tend to be most heavily
concentrated. It may be regarded the most typical of a series of values.”
• Further, according to A.M. Tuttle, “Mode is the value which has the
greatest frequency density in its immediate neighborhood.”
Determination of Mode
• (a) When data are either in the form of individual observations or in
the form of an ungrouped frequency distribution
• Given individual observations, these are first transformed into an
ungrouped frequency distribution. The mode of an ungrouped frequency
distribution can be determined in two ways, as given below:
• (i) By inspection, or
• (ii) By the method of grouping
By inspection:
• Example 33: Compute the mode of the following data:
• 3, 4, 5, 10, 15, 3, 6, 7, 9, 12, 10, 16, 18,
20, 10, 9, 8, 19, 11, 14, 10, 13, 17, 9, 11
Solution: Writing this in the form of a frequency distribution, we get
• Values    :  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
• Frequency :  2  1  1  1  1  1  3  4  2  1  1  1  1  1  1  1  1  1
• ∴ Mode = 10
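The inspection method can be mirrored in R with table(), which builds the same frequency distribution. This is a minimal sketch; the variable names are illustrative, and if several values tied for the highest frequency, which.max() would report only the first of them.

```r
# The 25 observations from the example.
values <- c(3, 4, 5, 10, 15, 3, 6, 7, 9, 12, 10, 16, 18,
            20, 10, 9, 8, 19, 11, 14, 10, 13, 17, 9, 11)

# Frequency distribution: one count per distinct value, in increasing order.
freq <- table(values)

# The mode is the value with the largest count.
mode_value <- as.numeric(names(freq)[which.max(freq)])
mode_count <- as.integer(max(freq))

print(freq)
print(mode_value)   # 10
print(mode_count)   # 4
```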
Excel
• Mean: =AVERAGE(range)
• Median: =MEDIAN(range)
• Mode: =MODE(range)
What are the commands for the median, mode, standard deviation, and
variance?
mean()
✓ The basic syntax for calculating mean in R is:
mean(x, trim = 0, na.rm = FALSE, ...)
median()
✓ The basic syntax for calculating median in R is:
median(x, na.rm = FALSE)
Weighted mean:
weighted.mean(x, w, ...)
# S3 method for default
weighted.mean(x, w, ..., na.rm = FALSE)
Mode
✓ The mode is the value that has the highest number of occurrences in a set of data.
✓ Unlike mean and median, mode can have both numeric and character data.
✓ R does not have a standard in-built function to calculate mode.
✓ So, we create a user function to calculate the mode of a data set in R.
✓ This function takes the vector as input and gives the mode value as output.
# Create the function.
getmode <- function(v) {
  uniqv <- unique(v)
  uniqv[which.max(tabulate(match(v, uniqv)))]
}
# Create the vector with numbers.
v <- c(2,1,2,3,1,2,3,4,1,5,5,3,2,3)
# Calculate the mode using the user function.
result <- getmode(v)
print(result)
# Create the vector with characters.
charv <- c("o","it","the","it","it")
# Calculate the mode using the user function.
result <- getmode(charv)
print(result)
Comparing the Mean, Median, and Mode
• Four levels of data – nominal, ordinal, interval, ratio (Chapter 1)
• Mode – can be used with all four levels.
• Median – may be used with ordinal, interval, or ratio level.
• Mean – may be used with interval or ratio level.
Critical Thinking
• Mound-shaped data – values of mean, median and mode are nearly equal.
• Skewed-left data – mean < median < mode.
• Skewed-right data – mean > median > mode.
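The ordering of the three averages for skewed data can be illustrated with a small R sketch. The data below are made up for illustration and do not come from the slides; the helper mode_of() is a hypothetical name defined here, not a built-in R function. The ordering is a rule of thumb for typical unimodal data and can fail for unusual distributions.

```r
# Hypothetical helper: most frequent value of a numeric vector (first one if tied).
mode_of <- function(v) {
  freq <- table(v)
  as.numeric(names(freq)[which.max(freq)])
}

# Made-up right-skewed data: most values are small, with a long tail to the right.
right_skewed <- c(1, 2, 2, 2, 3, 3, 4, 5, 9, 15)

# Mirror image of the same data, which is therefore left-skewed.
left_skewed <- 16 - right_skewed

# Right skew: mean > median > mode.
print(c(mean = mean(right_skewed),
        median = median(right_skewed),
        mode = mode_of(right_skewed)))

# Left skew: mean < median < mode.
print(c(mean = mean(left_skewed),
        median = median(left_skewed),
        mode = mode_of(left_skewed)))
```

The few large values in the right tail pull the mean upward, while the median and mode stay near the bulk of the data.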