PAPER – I
Data Analysis: Usage of Mean, Median, Mode, Graphs and Pie Charts
Data Analysis : Data Analysis is the process of inspecting,
cleansing, transforming and modelling data with the goal of
discovering useful information, drawing conclusions and supporting
decision-making. In the modern world, data is generated at an
unprecedented rate, and the ability to analyse this data effectively
is crucial across various fields such as business, health care,
education, social sciences and more.
Among the foundational tools of data analysis are measures of
central tendency (Mean, Median and Mode) as well as graphical
representations such as bar graphs, line graphs and pie charts. These
tools help summarize large data sets, reveal patterns and
communicate findings clearly and effectively.
Measures of Central Tendency
Measures of central tendency are statistical metrics that
describe the centre or typical value of a data set. They provide a
single value that represents the entire distribution, making it easier
to understand and compare data.
1. Mean (Arithmetic Average)
Definition : The mean is the sum of all data points divided by
the number of data points. It is the most commonly used
measure of central tendency.
Mean = (Sum of the terms) ÷ (Number of terms)
Example :
(1) Consider the data set representing the scores of 5
students in a test : 70, 75, 80, 85, 90
Mean = (70 + 75 + 80 + 85 + 90) ÷ 5 = 400 ÷ 5 = 80
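The same calculation can be sketched in Python. This is only an illustration added to the notes; the variable names are assumptions chosen for readability.

```python
from statistics import mean

# Test scores of the 5 students from the example
scores = [70, 75, 80, 85, 90]

# Mean = sum of the terms / number of terms
manual_mean = sum(scores) / len(scores)

# The standard library computes the same quantity
library_mean = mean(scores)

print(manual_mean, library_mean)
```

Both approaches give 80 for this data set.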
(2) Finding a Missing number using Mean
The mean of five numbers is 20. Four of the numbers are
18, 22, 19 and 24. Find the fifth number.
Let the fifth number be ‘x’
Mean = (sum of all numbers) ÷ 5
20 = (18 + 22 + 19 + 24 + x) ÷ 5
20 × 5 = 83 + x, so 100 = 83 + x
x = 100 – 83 = 17
The fifth number is 17
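The rearrangement used here (required total minus known total) can be sketched as a small helper. The function name `missing_value` is hypothetical and used only for illustration.

```python
def missing_value(known_values, target_mean):
    """Return the one missing number so that all values have target_mean."""
    total_count = len(known_values) + 1          # known values plus the missing one
    required_sum = target_mean * total_count     # mean x number of terms
    return required_sum - sum(known_values)


fifth = missing_value([18, 22, 19, 24], 20)
print(fifth)
```

For the numbers above, the required sum is 100 and the known sum is 83, so the helper returns 17.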
(3) The mean of 10 numbers is 15 and the mean of another
15 numbers is 20. Find the mean of all 25 numbers
combined.
Sum of first group = 10 × 15 = 150
Sum of second group = 15 × 20 = 300
Total sum = 150 + 300 = 450
Total numbers = 10 + 15 = 25
Mean of all numbers = 450 ÷ 25 = 18
The mean of all 25 numbers is 18.
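A combined mean is a weighted average of the group means, weighted by group size. The sketch below assumes each group is given as a (count, mean) pair; the name `combined_mean` is illustrative only.

```python
def combined_mean(groups):
    """groups: list of (count, mean) pairs, one pair per group."""
    total_sum = sum(count * group_mean for count, group_mean in groups)
    total_count = sum(count for count, _ in groups)
    return total_sum / total_count


overall = combined_mean([(10, 15), (15, 20)])
print(overall)
```

Note that the result (18) is closer to 20 than to 15 because the second group is larger.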
(4) Effect on the Mean when a number is added
The mean of 8 numbers is 12. If the number 20 is added to the
group, what is the new mean?
Sum of 8 numbers = 8 × 12 = 96
New sum = 96 + 20 = 116
New number of values = 8 + 1 = 9
New mean = 116 ÷ 9 ≈ 12.89
The new mean is approximately 12.89.
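The update can be done without knowing the individual numbers, because the old sum is recoverable from the old mean and count. A minimal sketch, assuming the added number is 20 as in the working above (the function name is hypothetical):

```python
def mean_after_adding(old_mean, old_count, new_value):
    """Mean of a group after one extra value joins it."""
    old_sum = old_mean * old_count               # recover the original total
    return (old_sum + new_value) / (old_count + 1)


new_mean = mean_after_adding(12, 8, 20)
print(round(new_mean, 2))
```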
(5) Mean from Frequency distribution
Find the mean of the following data :
Value (x) Frequency (f)
5 3
7 4
10 2
12 1
Calculate f × x for each value:
5 × 3 = 15
7 × 4 = 28
10 × 2 = 20
12 × 1 = 12
Sum of f × x = 15 + 28 + 20 + 12 = 75
Total frequency = 3 + 4 + 2 + 1 = 10
Mean = 75 ÷ 10 = 7.5
The mean is 7.5.
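The frequency-table method can be sketched as follows. Storing the table as a dictionary that maps each value to its frequency is an assumption made for this illustration.

```python
# Value (x) -> Frequency (f), taken from the table above
table = {5: 3, 7: 4, 10: 2, 12: 1}

# Sum of f * x over every row of the table
sum_fx = sum(x * f for x, f in table.items())

# Total frequency is the number of observations
total_f = sum(table.values())

freq_mean = sum_fx / total_f
print(sum_fx, total_f, freq_mean)
```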
2. Median
The median is the middle value when the data points are
arranged in ascending or descending order. If there is an even
number of data points, the median is the average of the two
middle values.
Examples
1) Ungrouped Data (odd number of observations)
Formula : Median = ((n + 1) ÷ 2)th term
Steps:
1. Arrange data in ascending order
2. Identify the middle term using the formula above
Example
For the data set (29, 33, 37, 38, 40, 41, 42) with n = 7:
Median term = ((7 + 1) ÷ 2)th term = 4th term = 38
2) Ungrouped Data (Even number of observations)
Formula :
Median = [(n ÷ 2)th term + ((n ÷ 2) + 1)th term] ÷ 2
Steps
1. Arrange data in ascending order
2. Average the two middle numbers.
Example
For the data set (73, 80, 85, 88, 91, 92, 94, 97) with n = 8
Median = (4th term + 5th term) ÷ 2 = (88 + 91) ÷ 2 = 89.5
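Both ungrouped cases (odd and even n) can be handled by one small function. This is a sketch written for these notes, not a reference implementation; Python lists are 0-based, so the "4th term" is index 3.

```python
def median(values):
    """Median of ungrouped data, for odd or even numbers of observations."""
    ordered = sorted(values)          # step 1: arrange in ascending order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # odd n: the ((n + 1) / 2)th term, which is index mid when 0-based
        return ordered[mid]
    # even n: average of the (n / 2)th and (n / 2 + 1)th terms
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_result = median([29, 33, 37, 38, 40, 41, 42])
even_result = median([73, 80, 85, 88, 91, 92, 94, 97])
print(odd_result, even_result)
```

The two calls reproduce the worked examples: 38 for the seven-value set and 89.5 for the eight-value set.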
3) Grouped Data (Frequency Distribution)
Formula:
Median = L + [((N ÷ 2) – cf) ÷ f] × h
Variables:
L = Lower limit of the median class
N = Total frequency
cf = Cumulative frequency of the class
before the median class
f = Frequency of the median class
h = Class width
Steps :
1. Identify the median class (the first class whose cumulative
frequency reaches or exceeds N/2)
2. Apply the formula
Example :
For the frequency distribution below:
Class Interval Frequency Cumulative Frequency
39.5 – 44.5 1 1
44.5 – 49.5 5 6
49.5 – 54.5 9 15
54.5 – 59.5 12 27
59.5 – 64.5 7 34
64.5 – 69.5 2 36
Here N = 36, so N/2 = 18.
The median class is 54.5 – 59.5 (cumulative frequency 27), so
L = 54.5, cf = 15, f = 12 and h = 5.
Median = 54.5 + [(18 – 15) ÷ 12] × 5 = 54.5 + 1.25 = 55.75
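The grouped-data procedure can be sketched in Python as below. The representation of each class as a (lower limit, upper limit, frequency) tuple and the name `grouped_median` are assumptions made for this illustration; the median class is taken as the first class whose cumulative frequency reaches or exceeds N/2.

```python
def grouped_median(classes):
    """classes: list of (lower, upper, frequency), sorted by lower limit."""
    total = sum(freq for _, _, freq in classes)      # N
    if total == 0:
        raise ValueError("distribution has no observations")
    half = total / 2                                 # N / 2
    cumulative = 0                                   # cf of the classes seen so far
    for lower, upper, freq in classes:
        if cumulative + freq >= half:
            # Median = L + ((N/2 - cf) / f) * h
            return lower + (half - cumulative) / freq * (upper - lower)
        cumulative += freq
    raise ValueError("median class not found")


distribution = [
    (39.5, 44.5, 1),
    (44.5, 49.5, 5),
    (49.5, 54.5, 9),
    (54.5, 59.5, 12),
    (59.5, 64.5, 7),
    (64.5, 69.5, 2),
]
grouped_result = grouped_median(distribution)
print(grouped_result)
```

For the table above the loop stops at the class 54.5 – 59.5 with cf = 15, matching the hand calculation of 55.75.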
4) If the data set is 70, 75, 80, 85, 90 (odd number of points),
the median is 80 (the middle value).
If the data set is 70, 75, 80, 85 (even number of points),
Median = (75 + 80) ÷ 2 = 77.5
5) The runs scored by 11 players in a cricket match
are : 7, 16, 121, 51, 101, 81, 1, 16, 9, 11, 16
Find the median of the data
Arrange the data in ascending order :
1, 7, 9, 11, 16, 16, 16, 51, 81, 101, 121
Since there are 11 observations (odd), the median is the
(11 + 1) ÷ 2 = 6th term. Therefore, the median of the given data is 16.
6) The weights of 8 students in kg are 54, 49, 51, 58, 61, 52,
54, 60. Find the median weight.
Arrange the data in ascending order:
49, 51, 52, 54, 54, 58, 60, 61
The middle two terms (4th and 5th) are 54 and 54.
Hence Median = (54 + 54) ÷ 2 = 108 ÷ 2 = 54
Therefore, the median weight is 54 kg.
3. Mode
Mode is the value that appears most frequently in a
data set. It is a measure of central tendency, indicating
the most common value. A data set can have one mode
(unimodal), more than one mode (bimodal or multimodal),
or no mode if no value repeats.
Formulas for Mode Calculations
1. Mode for ungrouped data
The mode is simply the value with the highest
frequency. No complex formula is needed. Just
identify the most frequent value.
Mode = Value with highest frequency
2. Mode for grouped data
When data is grouped into class intervals, mode is
estimated using the formula
Mode = L + [(fm – f1) ÷ ((fm – f1) + (fm – f2))] × h
Variables
L = Lower limit of the modal class (class with
highest frequency)
h = class width (Upper limit – lower limit)
fm = frequency of the modal class
f1 = frequency of the class preceding the modal
class
f2 = frequency of the class succeeding the
modal class
Examples
1) Mode of ungrouped data
Data : 2, 4, 5, 5, 67
Mode is 5 (it appears twice, more often than any other value)
2) Mode of ungrouped Data (car colours)
Colour Red Blue Silver Black Yellow
Count 10 12 20 15 11
Mode is Silver (the most-sold colour, with a count of 20)
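Finding the mode of ungrouped data amounts to counting occurrences. The sketch below uses `collections.Counter` from the standard library; the helper name `modes` is hypothetical, and it returns a list so that bimodal or multimodal data, or data with no repeated value, can be represented.

```python
from collections import Counter


def modes(values):
    """Return all most-frequent values; an empty list if no value repeats."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        return []                     # no value repeats, so there is no mode
    return [value for value, count in counts.items() if count == highest]


number_modes = modes([2, 4, 5, 5, 67])

# For categorical data given as counts, the mode is the category
# with the largest count, not the count itself.
car_counts = {"Red": 10, "Blue": 12, "Silver": 20, "Black": 15, "Yellow": 11}
modal_colour = max(car_counts, key=car_counts.get)

print(number_modes, modal_colour)
```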
3) Mode of grouped data
Age group   20 – 30   30 – 40   40 – 50   50 – 60
Frequency   30        55        44        25
Modal class: 30 – 40 (frequency 55)
L = 30, h = 10, fm = 55, f1 = 30, f2 = 44
Mode = 30 + [(55 – 30) ÷ ((55 – 30) + (55 – 44))] × 10
= 30 + [25 ÷ (25 + 11)] × 10
= 30 + 6.94 = 36.94
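The grouped-mode formula can be sketched as follows. The tuple layout (lower limit, upper limit, frequency) and the function name are assumptions for illustration. The sketch also assumes that a missing neighbouring class (when the modal class is first or last) counts as frequency 0, which the notes do not state.

```python
def grouped_mode(classes):
    """classes: list of (lower, upper, frequency), sorted by lower limit."""
    # Modal class = class with the highest frequency (first one if tied)
    index = max(range(len(classes)), key=lambda i: classes[i][2])
    lower, upper, fm = classes[index]
    # Assumption: a neighbour that does not exist is treated as frequency 0
    f1 = classes[index - 1][2] if index > 0 else 0
    f2 = classes[index + 1][2] if index < len(classes) - 1 else 0
    # Mode = L + ((fm - f1) / ((fm - f1) + (fm - f2))) * h
    return lower + (fm - f1) / ((fm - f1) + (fm - f2)) * (upper - lower)


ages = [(20, 30, 30), (30, 40, 55), (40, 50, 44), (50, 60, 25)]
mode_estimate = grouped_mode(ages)
print(round(mode_estimate, 2))
```

If the modal class and both neighbours have equal frequencies the denominator is zero, so a real implementation would need to handle that case separately.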
4) Find Mode using Empirical Relation
Given Mean = 45.5, Median = 43, find Mode using:
Mode = 3 × Median – 2 × Mean
Mode = 3 × 43 – 2 × 45.5 = 129 – 91 = 38
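The empirical relation is a one-line calculation. A minimal sketch (the function name is hypothetical); the relation is an approximation that is usually quoted for moderately skewed distributions, so the result is an estimate rather than an exact mode.

```python
def empirical_mode(mean_value, median_value):
    """Estimate the mode from Mode = 3 * Median - 2 * Mean."""
    return 3 * median_value - 2 * mean_value


estimated_mode = empirical_mode(45.5, 43)
print(estimated_mode)
```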
Graphs & Pie Charts
In data analysis, graphs and pie charts are
essential tools for visualizing data to reveal patterns,
trends and relationships.
Usage of Graphs
Line Graphs : Track changes over time and
compare multiple groups, Eg. sales rates of
different products over months.
Bar Charts : Compare quantities across categories,
useful for showing sales by region or product type.
Scatter Plots : Show relationship between two
variables and highlight outliers, such as customer
satisfaction versus response time.
Bubble Charts : Similar to scatter plots but add a
third variable via bubble size, useful for
multi-dimensional data like sales by month, location
and product category.
Waterfall Charts : Illustrate how an initial value
changes through intermediate positive and
negative values. Eg. Revenue changes due to
different departments.
Pie Charts
Advantages of Pie charts
Simple and easy to understand visually
Effective for showing proportional relationships
quickly
Useful for audiences unfamiliar with detailed
data analysis.
Design Best Practices
Limit the number of categories to avoid clutter
Order slices by size for clarity
Ensure slices sum to 100%
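Two of these practices (ordering slices by size and making sure they sum to 100%) can be checked numerically before any chart is drawn. The sketch below reuses the car-colour counts from the mode example; the function name `pie_slices` is hypothetical, and no plotting library is assumed.

```python
def pie_slices(counts):
    """Return (label, percentage, angle in degrees) tuples, largest slice first."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("cannot build a pie chart from an empty total")
    ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return [
        (label, 100 * value / total, 360 * value / total)
        for label, value in ordered
    ]


car_counts = {"Red": 10, "Blue": 12, "Silver": 20, "Black": 15, "Yellow": 11}
slices = pie_slices(car_counts)
for label, percent, angle in slices:
    print(f"{label}: {percent:.1f}% ({angle:.1f} degrees)")
```

Each slice's angle is its share of 360 degrees, so the percentages total 100 and the angles total 360 (up to floating-point rounding).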
Examples of Pie chart usage
Showing customer roles in a company Eg.
Individual contributors making up over 50% of
customers.
Company market share of products or brands
Eg. Web server market share like Apache
44.46%, Microsoft 30.1% etc.
Visualising monthly expenditure of a family or
types of houses people own.
Representing sales distribution by product
types or store locations
Displaying student performance percentages
in different subjects.
In summary, graphs like line and scatter
plots are excellent for showing trends and
relationships, while Pie charts are ideal for
illustrating parts of a whole in categorical data,
making both vital for effective data analysis and
communication.