Descriptive Statistics:
The first step towards statistical analysis
Descriptive Statistics
Descriptive statistics is a branch of statistics that focuses
on summarizing and presenting data in
a meaningful way.
It provides a set of tools and
techniques for organizing, analyzing,
and interpreting data to help people
understand and make sense of the
information at hand.
Types of Descriptive Statistics
1. Measures of Central Tendency (mean, median, mode)
2. Measures of Position (percentiles, deciles, quartiles,
Z-scores)
3. Measures of Variability (range, average deviation, variance,
standard deviation)
Measures of Central Tendency
Central tendency is defined as “the statistical measure that
identifies a single value as representative of an entire
distribution.”
It aims to provide an accurate description of the
entire data set: the single value that is most
typical or representative of the collected data.
Mean: The sum of all values divided by the number of values.
Median: The middle value when the data are ordered (the average of the two middle values when the count is even).
Mode: The value that appears most frequently.
Example of Measures of Central Tendency
Data Set: 35, 15, 22, 40, 25, 18, 28, 35
Find the Mean, Median, and Mode of this dataset.
Mean: (35 + 15 + 22 + 40 + 25 + 18 + 28 + 35) ÷ 8 = 27.25
Median: Ordered data: 15, 18, 22, 25, 28, 35, 35, 40
The two middle values are 25 and 28, so (25 + 28) ÷ 2 = 26.5
Mode: 35
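The worked example above can also be checked in code. The following is a minimal sketch using Python's standard `statistics` module; the variable names are chosen here for illustration only.

```python
import statistics

# Data set from the example above
scores = [35, 15, 22, 40, 25, 18, 28, 35]

mean = statistics.mean(scores)      # sum of all values / number of values
median = statistics.median(scores)  # average of the two middle values (even count)
mode = statistics.mode(scores)      # value that appears most frequently

print(mean, median, mode)  # 27.25 26.5 35
```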
Measures of Position
Measures of position are statistical tools that describe the
relative standing of data points within a data set.
They tell us how a specific data value compares to other
values in the set and help to understand the distribution of
the data.
Percentiles
Percentiles: Divide the data into 100 equal parts.
Each percentile indicates the value below which a certain
percentage of the data points fall.
Percentiles are widely used to rank and compare data points
within a dataset.
75th percentile indicates the value below which 75% of the
data falls.
Example: If the 75th percentile of the scores of 60 students is 82,
then 75% of the students (45 of them) scored below 82.
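The idea of "the percentage of data points below a value" can be sketched in a few lines of Python. The function name `percent_below` and the class data are hypothetical, made up only to mirror the 60-student example.

```python
def percent_below(values, cutoff):
    """Return the percentage of values strictly below the cutoff."""
    below = sum(1 for v in values if v < cutoff)
    return 100 * below / len(values)

# Hypothetical class of 60 scores: 45 below 82 and 15 at or above it
scores = [70] * 45 + [90] * 15

print(percent_below(scores, 82))  # 75.0
```

Here 82 sits at the 75th percentile because 45 of the 60 scores fall below it.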
Quartiles
Quartiles are a type of measure of position that divide a data set
into four equal parts, each containing 25% of the data.
• Q1 (First Quartile): The 25th percentile.
• Q2 (Second Quartile or Median): The 50th percentile.
• Q3 (Third Quartile): The 75th percentile.
The difference between the third and first quartile is
called the interquartile range (IQR), which measures the
spread of the middle 50% of the data.
Quartiles are useful for identifying outliers and
understanding the central tendency and variability
in a data set.
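As an illustration, the quartiles and the IQR can be computed with Python's standard `statistics.quantiles` function. This is a sketch that reuses the data set from the central tendency example. Quartile conventions differ between tools, so other software may report slightly different values for the same data.

```python
import statistics

# Data set reused from the central tendency example
scores = [35, 15, 22, 40, 25, 18, 28, 35]

# n=4 returns the three cut points Q1, Q2, Q3 (default method is "exclusive")
q1, q2, q3 = statistics.quantiles(scores, n=4)
iqr = q3 - q1  # spread of the middle 50% of the data

print(q1, q2, q3, iqr)  # 19.0 26.5 35.0 16.0
```

Note that Q2 equals the median (26.5) found earlier.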
Interquartile Range
The IQR measures the spread of the middle 50% of the
data, calculated as the difference between the third
quartile (Q3) and the first quartile (Q1).
The IQR is especially useful for identifying outliers
because it gives a more robust measure of spread than
the range, which can be distorted by extreme values.
Why use IQR?
The IQR is resistant to outliers because it only looks at the
middle 50% of the data, ignoring the lowest 25% and the
highest 25%.
It's commonly used in conjunction with box plots to
visually represent the spread and to detect outliers
(any data points that fall below Q1 - 1.5 × IQR or
above Q3 + 1.5 × IQR are considered outliers).
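The 1.5 × IQR rule can be sketched as follows. The function name `iqr_outliers` is hypothetical, and the value 95 is an extreme score added here for illustration; it is not part of the original data.

```python
import statistics

def iqr_outliers(values):
    """Return the values outside the fences Q1 - 1.5*IQR and Q3 + 1.5*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return [v for v in values if v < lower_fence or v > upper_fence]

# Earlier data set plus one hypothetical extreme score (95)
scores = [15, 18, 22, 25, 28, 35, 35, 40, 95]

print(iqr_outliers(scores))  # [95]
```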
Z-score
A Z-score (also called a standard score) measures how many
standard deviations a data point is from the mean of a data set.
It helps us determine the position of a value within a
distribution. A Z-score tells us whether the data point is
below or above the mean and by how much.
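In symbols, the Z-score is the data value minus the mean, divided by the standard deviation. A minimal Python sketch follows; the function name and the numbers are hypothetical and are not taken from any table in this presentation.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

# Hypothetical class with mean 70 and standard deviation 10
print(z_score(85, 70, 10))  # 1.5  (above the mean)
print(z_score(62, 70, 10))  # -0.8 (below the mean)
```

A positive Z-score means the value is above the mean, and a negative one means it is below.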
Example: The table below shows the test scores of Liza in
three subjects, together with the mean and the standard deviation
of the scores of the section where she belongs. In which subject
did she perform best?
Liza's score in Science is 1.29 standard deviations above the
mean score, her score in Mathematics is 1.25 standard deviations
below the mean, and her score in English is 0.96 standard
deviation above the mean score of the class she belongs to.
Science has the highest Z-score, so she performed best in Science.
Measures of Variability
Who among the students is the most consistent, that is,
the student whose scores are most tightly clustered?
Measures of variability (also called measures of dispersion)
describe how spread out or scattered the data points in a data
set are.
These measures help us understand the degree to
which the data points differ from each other and from
the central tendency (mean, median).
Range
The range is the difference between the highest and lowest
values in the data set.
Example: If the highest score in a class is 95 and
the lowest is 55, the range is 40.
Variance
Variance measures the average squared deviation of each
data point from the mean.
It gives a sense of how far the numbers are from the
mean, but in squared units.
Standard Deviation
Standard deviation is a measure of the amount of variation or
dispersion in a set of data points.
It tells us how much the individual data points tend to differ
from the mean (average) of the data set. In simple terms, it
indicates how spread out the data is.
Interpretation:
A low standard deviation means the data points are clustered close
to the mean, indicating less variability.
A high standard deviation means the data points are spread out,
indicating more variability.
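The contrast between a low and a high standard deviation can be seen with two small data sets that share the same mean. Both lists below are hypothetical and exist only for illustration; the sketch uses the population standard deviation from Python's `statistics` module.

```python
import statistics

clustered = [68, 70, 72, 70]  # hypothetical scores close to the mean
spread = [40, 70, 100, 70]    # hypothetical scores far from the mean

# Both sets have a mean of 70, but very different variability
sd_clustered = statistics.pstdev(clustered)
sd_spread = statistics.pstdev(spread)

print(round(sd_clustered, 2), round(sd_spread, 2))  # 1.41 21.21
```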
Rule of Thumb
As a rough guide, a standard deviation that is
less than 10% of the mean is often considered
small in relative terms.
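This rule of thumb amounts to comparing the ratio of the standard deviation to the mean against 0.10. The sketch below is one possible way to check it; the function names, the default threshold argument, and the data are assumptions made for illustration, and the ratio is only meaningful when the mean is positive.

```python
import statistics

def relative_sd(values):
    """Population standard deviation as a fraction of the mean."""
    return statistics.pstdev(values) / statistics.mean(values)

def is_small_spread(values, threshold=0.10):
    """True if the standard deviation is below the threshold share of the mean."""
    return relative_sd(values) < threshold

# Hypothetical score sets, both with mean 70
print(is_small_spread([68, 70, 72, 70]))   # True
print(is_small_spread([40, 70, 100, 70]))  # False
```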
Uses of Standard Deviation
Interpreting Teacher’s Performance
Example: The table below presents the test results of Grade 5 in Mathematics,
where the teacher taught his three classes using the same teaching
strategy. In which section was the teaching strategy most effective,
given that the three classes are homogeneously grouped?
The teaching strategy was most effective in Section C,
where it did the most to reduce the gap between
good and poor performers.
Excel Commands
Measures of Position
Percentile: =PERCENTILE.EXC(A1:A41, 0.7) (0.7 means the 70th percentile)
Percentile Rank: =PERCENTRANK(A1:A50, 75)
First Quartile (Q1) : =QUARTILE(A1:A50, 1)
Third Quartile (Q3) : =QUARTILE(A1:A50, 3)
Measures of Variability
Range : =MAX(A1:A50) - MIN(A1:A50)
Variance: =VAR.P(A1:A50) (or VAR.S(A1:A50) for sample variance)
Standard Deviation: =STDEV.P(A1:A50) (or STDEV.S(A1:A50) for sample)
Z-score: = (X - AVERAGE(A1:A50)) / STDEV.P(A1:A50)