MEASURES OF CENTER
Mean, Mode, Median, Weighted Mean
Objectives
1. Compute the mean of a data set
2. Compute the median of a data set
3. Compare the properties of the mean and median
4. Find the mode of a data set
5. Approximate the mean, median, and mode using
grouped data
The Mean
The mean of a data set is a measure of center.
If we imagine each data value to be a weight, then
the mean is the point at which the data set
balances.
Notation
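A list of n numbers is denoted by \(x_1, x_2, x_3, \ldots, x_n\).
\(\sum x_i\) represents the sum of these numbers:
\[ \sum x_i = x_1 + x_2 + x_3 + \cdots + x_n \]
If \(x_1, x_2, x_3, \ldots, x_n\) is a sample, then the mean is
called the sample mean and is given by
\[ \bar{x} = \frac{\sum x_i}{n} \]
If \(x_1, x_2, x_3, \ldots, x_N\) is a population, then the mean is
called the population mean and is given by
\[ \mu = \frac{\sum x_i}{N} \]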
Example
During a semester, a student took five exams. The
population of exam scores is 78, 83, 92, 68, and 85.
Find the mean.
Solution:
\[ \mu = \frac{\sum x_i}{N} = \frac{78 + 83 + 92 + 68 + 85}{5} = \frac{406}{5} = 81.2 \]
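The computation in this example can be checked with a few lines of Python. This is a minimal sketch; the helper name `mean` is chosen here for illustration and is not part of the original slides.

```python
# Exam scores from the example above, treated as a population.
scores = [78, 83, 92, 68, 85]

def mean(values):
    """Return the arithmetic mean: the sum of the values divided by their count."""
    return sum(values) / len(values)

print(mean(scores))  # 81.2
```

The same function applies to a sample or a population; only the notation differs.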
Rounding
In the last example, the mean was rounded to
one more decimal place than the original data.
Generally, it is a good practice to round
the mean to one more decimal place
than the data that appear in the
original data set.
The Median
The median is another measure of center.
The median is a number that splits the data set in
half, so that half the data values are less than the
median and half of the data values are greater
than the median.
The procedure for computing the median differs,
depending on whether the number of
observations in the data set is even or odd.
Procedure for Finding the Median
Following is the procedure for finding the median of a data set:
Step 1: Arrange the data values in increasing order.
Step 2: Determine the depth (position or location)
of the median.
Step 3: Find the value of the median.
If n is odd, the median is the middle number.
If n is even, the median is the average of the two
middle numbers.
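The three steps above can be sketched in Python as follows. The function name `median` and the second, even-length list are illustrative assumptions; Python's standard `statistics.median` implements the same rule.

```python
def median(values):
    """Median by the three-step procedure: sort, locate the middle, read the value."""
    ordered = sorted(values)          # Step 1: arrange in increasing order
    n = len(ordered)
    mid = n // 2                      # Step 2: position of the middle (0-based index)
    if n % 2 == 1:                    # Step 3: n odd, take the middle number
        return ordered[mid]
    # Step 3: n even, average the two middle numbers
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([78, 83, 92, 68, 85]))  # 83   (odd n: the exam scores used earlier)
print(median([4, 1, 3, 2]))          # 2.5  (even n: an illustrative list)
```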
Example
During a semester, a student took five exams. The population
of exam scores is 78, 83, 92, 68, and 85. Find the median.
Solution:
Step 1: Arrange the data values in increasing order:
68 78 83 85 92
Step 2: Determine the number of data values.
There are 5 data values, so n is odd.
Step 3: Since n is odd, the median is the middle number.
68 78 83 85 92
The median is 83.
Example
One of the goals of medical research is to develop treatments
that reduce the time spent in recovery. Eight patients undergo
a new surgical procedure, and the number of days spent in
recovery for each is as follows.
20 15 12 27 13 19 13 21
Find the median.
Solution:
Step 1: Arrange the data values in increasing order:
12 13 13 15 19 20 21 27
Step 2: Determine the number of data values.
There are 8 data values, so n is even.
Step 3: Since n is even, the median is the average of the two
middle numbers. These numbers are 15 and 19. Therefore, the
median is
\[ \frac{15 + 19}{2} = 17 \]
Resistant
A statistic is resistant if its value is not affected
much by extreme values (large or small) in the data
set.
The median is resistant, but the mean is not.
Example
Five families, named Smith, Jones, Gonzales, Brown, and
Jackson live in an apartment building. Their annual incomes, in
dollars, are 25,000, 31,000, 34,000, 44,000 and 56,000. The
Smith family, whose income is 25,000, wins a million-dollar
lottery, so their income increases to 1,025,000. Find the mean
and median income both before and after the Smiths win the
lottery. Which measure of center is more influenced by the
large number, the mean or the median?
Solution
Before the lottery win, the mean and median are:
Mean = $38,000
Median = $34,000
After the lottery win, the mean and median are:
Mean = $238,000
Median = $44,000
The extreme value of $1,025,000 influences the mean quite a
lot, increasing it from $38,000 to $238,000.
In comparison, the median has been influenced much less,
increasing from $34,000 to $44,000.
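The figures in this solution can be reproduced with Python's standard `statistics` module. This is a small sketch; the variable names `before` and `after` are chosen here for illustration.

```python
from statistics import mean, median

# Annual incomes of the five families, before and after the Smiths' lottery win.
before = [25_000, 31_000, 34_000, 44_000, 56_000]
after = [1_025_000, 31_000, 34_000, 44_000, 56_000]

for label, incomes in (("Before", before), ("After", after)):
    # Format as whole dollars with thousands separators.
    print(f"{label}: mean = ${mean(incomes):,.0f}, median = ${median(incomes):,.0f}")

# How far each measure moved because of the single extreme value.
mean_shift = mean(after) - mean(before)
median_shift = median(after) - median(before)
print(f"Mean shift: ${mean_shift:,.0f}, median shift: ${median_shift:,.0f}")
```

One extreme value moves the mean by $200,000 but moves the median by only $10,000, which is what it means for the median to be resistant.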
Describing the Shape of a Data Set
The mean and median measure the center of a data
set in different ways. When a data set is symmetric,
the mean and median are equal.
When a data set is skewed to the right, there are large
values in the right tail. Because the median is resistant
while the mean is not, the mean is generally more
affected by these large values. Therefore, for a data set
that is skewed to the right, the mean is often
greater than the median.
Similarly, when a data set is skewed to the left, the
mean is often less than the median.
Skewed to the Right
Shape: Skewed to the Right
Relationship Between the Mean and Median: Mean is noticeably
greater than the median
Skewed to the Left
Shape: Skewed to the Left
Relationship Between the Mean and Median: Mean is noticeably
less than the median
Approximately Symmetric
Shape: Approximately Symmetric
Relationship Between the Mean and Median: Mean and median are
approximately the same
Mode
Another value that is sometimes classified as a
measure of center is the mode.
The mode of a data set is the value that appears
most frequently.
If two or more values are tied for the most
frequent, they are all considered to be modes.
If no value appears more than once, we say that
the data set has no mode.
Example
Ten students were asked how many siblings they
had. The results, arranged in order, were
0 1 1 1 1 2 2 3 3 6
Find the mode of this data set.
Solution:
The value that appears most frequently is 1.
Therefore the mode of this data set is 1.
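
The definition of the mode, including ties and the "no mode" case, can be sketched in Python as below. The function name `modes` and the choice to return an empty list when no value repeats are assumptions made for this illustration; the input is assumed to be non-empty.

```python
from collections import Counter

def modes(values):
    """Return all values tied for the highest frequency, or [] if nothing repeats."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []  # no value appears more than once, so there is no mode
    return [value for value, count in counts.items() if count == highest]

siblings = [0, 1, 1, 1, 1, 2, 2, 3, 3, 6]
print(modes(siblings))      # [1]
print(modes([1, 1, 2, 2]))  # [1, 2]  (two values tied, so both are modes)
print(modes([1, 2, 3]))     # []      (no mode)
```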
Measure of Center?
The mode is sometimes classified as a measure of
center. However, this isn’t really accurate. The
mode can be the largest value in a data set, or the
smallest, or anywhere in between.
Mode for Qualitative Data
The mean and median can only be computed for
quantitative data. The mode, on the other hand, can be
computed for quantitative data and qualitative data. For
qualitative data, the mode is the most frequently
appearing category.
Example:
Following is a list of the makes of all the cars rented by an
automobile rental company on a particular day. Which make of
car is the mode?
Honda      Toyota  Toyota  Honda      Ford
Chevrolet  Nissan  Ford    Chevrolet  Chevrolet
Honda      Dodge   Ford    Ford       Toyota
Chevrolet  Toyota  Toyota  Toyota     Nissan
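For qualitative data, finding the mode amounts to counting categories. The sketch below tallies the 20 rentals listed in this example with `collections.Counter`; the variable names are illustrative, and the order of the list does not affect the counts.

```python
from collections import Counter

# The 20 car makes from the example.
rentals = [
    "Honda", "Toyota", "Toyota", "Honda", "Ford",
    "Chevrolet", "Nissan", "Ford", "Chevrolet", "Chevrolet",
    "Honda", "Dodge", "Ford", "Ford", "Toyota",
    "Chevrolet", "Toyota", "Toyota", "Toyota", "Nissan",
]

counts = Counter(rentals)
make, count = counts.most_common(1)[0]  # most frequent category and its count
print(make, count)  # Toyota 6
```

Toyota appears 6 times, more than any other make, so Toyota is the mode.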
Weighted Mean
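When data values are assigned different weights,
we can compute a weighted mean. If the values
\(x_1, x_2, \ldots, x_n\) have weights \(w_1, w_2, \ldots, w_n\),
the weighted mean is
\[ \frac{\sum w_i x_i}{\sum w_i} \]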
Example
In her first semester of college, a student of the author took five
courses.
Her final grades along with the number of credits for each course
were A (3 credits), A (4 credits), B (3 credits), C (3 credits), and F (1
credit).
The grading system assigns quality points to letter grades as follows:
A = 4; B = 3; C = 2; D = 1; F = 0.
Compute her grade point average.
Solution:
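Use the numbers of credits as the weights: w = 3, 4, 3, 3, 1.
Replace the letter grades of A, A, B, C, and F with the corresponding quality
points: x = 4, 4, 3, 2, 0.
\[ \frac{\sum w_i x_i}{\sum w_i} = \frac{3(4) + 4(4) + 3(3) + 3(2) + 1(0)}{3 + 4 + 3 + 3 + 1} = \frac{43}{14} \approx 3.07 \]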
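The grade point average computed in this example can be reproduced with a short Python sketch. The function name `weighted_mean` and the variable names are chosen here for illustration.

```python
def weighted_mean(values, weights):
    """Sum of weight * value, divided by the sum of the weights."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

quality_points = [4, 4, 3, 2, 0]   # A, A, B, C, F
credit_hours = [3, 4, 3, 3, 1]     # weights

gpa = weighted_mean(quality_points, credit_hours)
print(round(gpa, 2))  # 3.07
```

With equal weights the function reduces to the ordinary mean, which is one way to sanity-check it.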
Approximating Measures of Center Using Grouped Data
Sometimes we don’t have access to the raw data
in a data set, but we are given a frequency
distribution. In these cases we can approximate
the mean, median and mode.
Approximating the Mean
Mean for sample data:
\[ \bar{x} \approx \frac{\sum f x}{\sum f} \]
where x is the class midpoint and f is the class frequency.
Following is the procedure for approximating the
mean using grouped data:
o Step 1: Compute the midpoint of each class. The
midpoint of a class is found by taking the average of the
lower class limit and the upper class limit.
o Step 2: For each class, multiply the class midpoint by
the class frequency.
o Step 3: Add the products (Midpoint)x(Frequency) over
all classes.
o Step 4: Divide the sum obtained in Step 3 by the sum of
the frequencies.
Example
The following table presents the number of text
messages sent via cell phone by a sample of 50
high school students. Approximate the mean
number of messages sent.
Number of Text Messages Sent   Frequency
0 – 49                         10
50 – 99                        5
100 – 149                      13
150 – 199                      11
200 – 249                      7
250 – 299                      4
Solution
Step 1: Compute the midpoint of each class.
Step 2: For each class, multiply the class midpoint by the class
frequency.
Step 3: Add the products (Midpoint)x(Frequency) over all classes.
Step 4: Divide the sum obtained in Step 3 by the sum of the
frequencies.
Number of Text    Frequency,   Class          fx
Messages Sent     f            Midpoint, x
0 – 49            10           24.5           245.0
50 – 99           5            74.5           372.5
100 – 149         13           124.5          1618.5
150 – 199         11           174.5          1919.5
200 – 249         7            224.5          1571.5
250 – 299         4            274.5          1098.0
Total             50                          6825.0
The approximate mean is
\[ \bar{x} \approx \frac{\sum f x}{\sum f} = \frac{6825}{50} = 136.5 \]
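The four steps of this solution can be sketched in Python. The representation of each class as a `(lower limit, upper limit, frequency)` tuple and the function name `grouped_mean` are assumptions made for this illustration; the midpoints are computed as the average of the two class limits, matching the table (24.5, 74.5, and so on).

```python
# Each class: (lower limit, upper limit, frequency), from the text message table.
classes = [
    (0, 49, 10),
    (50, 99, 5),
    (100, 149, 13),
    (150, 199, 11),
    (200, 249, 7),
    (250, 299, 4),
]

def grouped_mean(classes):
    """Approximate the mean from a frequency distribution."""
    # Steps 1-3: midpoint of each class, times its frequency, summed over classes.
    total_fx = sum(((lower + upper) / 2) * f for lower, upper, f in classes)
    # Step 4: divide by the sum of the frequencies.
    total_f = sum(f for _, _, f in classes)
    return total_fx / total_f

print(grouped_mean(classes))  # 136.5
```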
Approximating the Median
Following is the procedure for approximating the
median using grouped data:
o Step 1: Construct the cumulative frequency distribution.
o Step 2: Determine the class median: the first class whose
cumulative frequency is at least \(n/2\).
o Step 3: Find the median by using the following formula:
\[ \text{Median} \approx L_m + \frac{\frac{n}{2} - F}{f_m} \cdot i \]
where n = total frequency,
F = cumulative frequency before the class median,
i = class width,
\(f_m\) = frequency of the class median,
\(L_m\) = lower boundary of the class median.
Example
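Number of Text    Frequency,   Cumulative
Messages Sent     f            Frequency, F
0 – 49            10           10
50 – 99           5            15
100 – 149         13           28
150 – 199         11           39
200 – 249         7            46
250 – 299         4            50
Step 1: Construct cumulative frequency distribution
Step 2: Determine the class median. Here n = 50, so n/2 = 25.
The first class with cumulative frequency at least 25 is
100 – 149, so it is the class median.
Step 3: Find the median, using the cumulative frequency
before the class median (15), the frequency of the class
median (13), and the class width (50).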
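The median approximation for this table can be sketched in Python. Several things here are assumptions rather than statements from the slides: the function name `grouped_median`, the tuple layout of the classes, equal class widths, and in particular the convention that the lower boundary of a class is its lower limit minus half the gap between adjacent classes (99.5 for the class 100 – 149).

```python
# Each class: (lower limit, upper limit, frequency), from the text message table.
classes = [
    (0, 49, 10), (50, 99, 5), (100, 149, 13),
    (150, 199, 11), (200, 249, 7), (250, 299, 4),
]

def grouped_median(classes, gap=1):
    """Approximate the median as L + ((n/2 - F) / f) * i.

    Assumes equal class widths and that `gap` is the distance between the
    upper limit of one class and the lower limit of the next.
    """
    n = sum(f for _, _, f in classes)
    width = classes[1][0] - classes[0][0]      # class width i
    cumulative = 0                             # F: cumulative frequency so far
    for lower, _, f in classes:
        if cumulative + f >= n / 2:            # first class reaching n/2
            lower_boundary = lower - gap / 2   # assumed boundary convention
            return lower_boundary + (n / 2 - cumulative) / f * width
        cumulative += f

print(round(grouped_median(classes), 2))  # 137.96
```

Under these assumptions the approximation is 99.5 + (25 − 15)/13 × 50 ≈ 137.96. Using the lower limit 100 instead of the boundary 99.5 would shift the result by 0.5.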
Approximating the
Mode
The mode is the value that has the highest frequency in
a data set.
For grouped data, the class mode (or modal class) is
the class with the highest frequency.
To find the mode for grouped data, use the following
formula:
\[ \text{Mode} \approx L_{mo} + \frac{d_1}{d_1 + d_2} \cdot i \]
where i = class width,
\(d_1\) = the difference between the frequency of the class
mode and the frequency of the class before the class mode,
\(d_2\) = the difference between the frequency of the class
mode and the frequency of the class after the class mode,
\(L_{mo}\) = lower boundary of the class mode.
Example
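Number of Text    Frequency,
Messages Sent     f
0 – 49            10
50 – 99           5
100 – 149         13
150 – 199         11
200 – 249         7
250 – 299         4
Step 1: Determine class mode. The class with the highest
frequency (13) is 100 – 149.
Step 2: Find the mode. The frequency differences are
13 − 5 = 8 (class before) and 13 − 11 = 2 (class after).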
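The mode approximation for this table can be sketched in Python. The function name `grouped_mode`, the tuple layout, equal class widths, the boundary convention (lower limit minus half the gap between classes, giving 99.5 here), and treating a missing neighbouring class as having frequency 0 are all assumptions made for this illustration.

```python
# Each class: (lower limit, upper limit, frequency), from the text message table.
classes = [
    (0, 49, 10), (50, 99, 5), (100, 149, 13),
    (150, 199, 11), (200, 249, 7), (250, 299, 4),
]

def grouped_mode(classes, gap=1):
    """Approximate the mode as L + d1 / (d1 + d2) * i for the class mode."""
    freqs = [f for _, _, f in classes]
    k = freqs.index(max(freqs))                        # index of the class mode
    before = freqs[k - 1] if k > 0 else 0              # assumed 0 if no class before
    after = freqs[k + 1] if k < len(freqs) - 1 else 0  # assumed 0 if no class after
    d1 = freqs[k] - before
    d2 = freqs[k] - after
    width = classes[1][0] - classes[0][0]              # class width i
    lower_boundary = classes[k][0] - gap / 2           # assumed boundary convention
    return lower_boundary + d1 * width / (d1 + d2)

print(grouped_mode(classes))  # 139.5
```

Under these assumptions the approximation is 99.5 + 8/(8 + 2) × 50 = 139.5.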
Approximating the
Mode
We can also obtain the mode using a histogram.