Basics of Educational Statistics (Descriptive statistics)

EDUCATIONAL STATISTICS
PRESENTED BY DR. HINA JALAL
Descriptive Data Analysis
Measures of Central Tendency
Measures of Dispersion

LEARNING OBJECTIVES
After completing this unit, students will be able to:
1. Tell the basic purpose of measures of central tendency.
2. Define range and determine the range of given data.
3. Write down the formulas for determining quartiles.
4. Define mean or average deviation.
5. Determine variance and standard deviation.
6. Define normal curve.
7. Explain skewness and kurtosis.
7/18/2020

MEASURES OF CENTRAL TENDENCY
Measures of central tendency (also referred to as measures of center or
central location) allow us to summarize data with a single value: a
typical score among a group of scores. They give us an easy way to
describe a set of data with a single number, one that generally lies in
the middle of the dataset.
The three common measures are the mean, the median, and the mode.

MEAN / AVERAGE / ARITHMETIC MEAN
The mean (or average) is the most popular and well-known measure of central
tendency. It can be used with both discrete and continuous data, although it is
most often used with continuous data.
It is defined as the sum of all the observations divided by the number of
observations. It is denoted by X̄, so X̄ = ΣX / n, where n is the number of
observations.

EXAMPLE OF MEAN
Example:
5, 10, 12, 16, 8, 42, 25, 15, 10, 7
Solution: (5+10+12+16+8+42+25+15+10+7)/10 = 150/10
Mean = 15
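The calculation above can be checked with a few lines of Python. This is only a minimal sketch; the variable names are illustrative and not part of the original slides.

```python
# Scores from the example above.
scores = [5, 10, 12, 16, 8, 42, 25, 15, 10, 7]

# Mean = sum of all the observations / number of observations.
total = sum(scores)
n = len(scores)
mean = total / n

print(total, n, mean)  # 150 10 15.0
```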

MEAN OR AVERAGE
Qualities of a Good Average
An average that possesses all or most of the following
qualities is considered a good average.
It should be rigidly defined.
It should be easy to understand and easy to calculate.
It should be based on all the observations of the data.
It should be unaffected by extreme observations.
It should have sampling stability.

ADVANTAGES OF MEAN
The mean possesses most of the qualities of a good average.
It is rigidly defined and easy to understand.
It is easy to calculate.
It is based on all the observations of the data.
It has sampling stability.

DISADVANTAGES OF MEAN
It is highly affected by extreme values.
It cannot be accurately calculated for an open-end
frequency distribution.
It cannot be calculated accurately if any observation
is missing.
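The first disadvantage can be illustrated with the scores from the earlier mean example, where 42 is much larger than the rest. The sketch below simply recomputes the mean with and without that value; treating 42 as the "extreme value" is an assumption made here for illustration.

```python
# Scores from the earlier mean example; 42 is far above the other values.
scores = [5, 10, 12, 16, 8, 42, 25, 15, 10, 7]

# Same data with the extreme value left out (illustrative choice).
without_extreme = [x for x in scores if x != 42]

mean_all = sum(scores) / len(scores)
mean_without = sum(without_extreme) / len(without_extreme)

print(mean_all)      # 15.0
print(mean_without)  # 12.0
```

A single observation shifts the mean from 12 to 15.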

MEDIAN (X̃)
The median is the middle score for a set of data that has been arranged in
order of magnitude. The median is less affected by outliers and skewed
data than the mean. To calculate the median, the data must first be
arranged in order.
When We Apply Median
We apply the median in situations where direct measurement of the
variable is not possible, such as poverty, beauty, and intelligence.

EXAMPLE OF MEDIAN
Example: 12, 15, 10, 20, 18, 25, 45, 30, 26
First arrange the data in order of magnitude:
10, 12, 15, 18, 20, 25, 26, 30, 45
There are 9 observations, so the median is the 5th value.
So Median = 20
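The same steps (sort, then pick the middle value) can be sketched in Python. The standard-library function statistics.median is used as a cross-check; the variable names are illustrative.

```python
import statistics

# Data from the example above.
data = [12, 15, 10, 20, 18, 25, 45, 30, 26]

# Step 1: arrange the data in order of magnitude.
ordered = sorted(data)

# Step 2: with an odd number of observations, the median is the middle value.
n = len(ordered)
middle_value = ordered[n // 2]

# Cross-check with the standard library.
median = statistics.median(data)

print(ordered)               # [10, 12, 15, 18, 20, 25, 26, 30, 45]
print(middle_value, median)  # 20 20
```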

ADVANTAGES OF MEDIAN
It is easy to calculate and understand.
It is not affected by extreme values.
It can be computed even in an open-end frequency distribution.
It can be used for qualitative data.
It can be located graphically.

DISADVANTAGES OF MEDIAN
 It is not rigorously defined.
 It is not based on all the observations.
 It is not suitable for further algebraic treatment.

MODE
The mode is the most frequent value in a set of data. On a histogram
or bar chart it is represented by the highest bar. You can, therefore,
sometimes consider the mode as being the most popular option. For example:
• 12, 24, 15, 18, 30, 48, 20, 24: Mode = 24
• 12, 15, 15, 18, 30, 48, 24, 24, 15, 24: Mode = 15 and 24
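Both examples can be checked with statistics.multimode from the Python standard library (available from Python 3.8), which returns every value tied for the highest frequency. This is a minimal sketch; the variable names are illustrative.

```python
import statistics

first = [12, 24, 15, 18, 30, 48, 20, 24]
second = [12, 15, 15, 18, 30, 48, 24, 24, 15, 24]

# multimode lists all most-frequent values, in order of first appearance.
modes_first = statistics.multimode(first)
modes_second = statistics.multimode(second)

print(modes_first)   # [24]
print(modes_second)  # [15, 24]
```

The second data set has two modes because 15 and 24 each occur three times.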

APPLICATION OF MODE
When to Apply Mode
We apply the mode when we need to study problems such as the
average size of shoes, the average size of readymade
garments, and the average size of agricultural holdings. This
average is widely used in Biology and Meteorology.

ADVANTAGES OF MODE
It is easy to understand.
It is not affected by extreme values.
It can be computed even in open-end classes.
It can be useful in qualitative data.

DISADVANTAGES OF MODE
 It is not clearly defined.
 It is not suitable for further algebraic treatment.
 It is not based on all the observations.
 It may not exist in some cases.

SUMMARY OF WHEN TO USE THE MEAN, MEDIAN AND MODE

MEASURES OF DISPERSION
As the name suggests, a measure of dispersion shows the scattering of the data. It tells how much the
observations vary from one another and gives a clear idea about the distribution of the data. A measure of
dispersion shows the homogeneity or the heterogeneity of the distribution of the observations.
Classification of Measures of Dispersion
Measures of dispersion are categorized as:
(i) Absolute measures of dispersion:
These include measures that express the scattering of observations in terms of distances, i.e., the range and
the quartile deviation, and measures that express the variation in terms of the average of deviations of
observations, such as the mean deviation and the standard deviation.
(ii) Relative measures of dispersion:
We use a relative measure of dispersion to compare the distributions of two or more data sets and for unit-free
comparison. They are the coefficient of range, the coefficient of mean deviation, the coefficient of quartile
deviation, the coefficient of variation, and the coefficient of standard deviation.

A measure of central tendency does not tell us anything
about the spread of the data, because two sets of data may have
the same central tendency but vastly different
variability. Consider two data sets that have the same mean
but different variability.
(a) 10, 12, 11, 14, 13
(b) 10, 2, 18, 27, 3

These two data sets have the same mean, 12, but differ in their variation. There is more
variation in data (b) than in data (a). This illustrates the fact that a measure of
central tendency alone is not sufficient. We therefore need some additional
information about how the data are dispersed about the average.
This is what measuring dispersion provides. By dispersion we mean the degree to which
data tend to spread about an average value.
There are two types of measures of dispersion: absolute and relative.
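The comparison of data (a) and data (b) can be reproduced in Python. As a rough sketch, spread is measured here by the difference between the largest and smallest values; the helper names mean and spread are illustrative.

```python
data_a = [10, 12, 11, 14, 13]
data_b = [10, 2, 18, 27, 3]

def mean(values):
    """Sum of the observations divided by their number."""
    return sum(values) / len(values)

def spread(values):
    """Largest observation minus smallest observation."""
    return max(values) - min(values)

print(mean(data_a), mean(data_b))      # 12.0 12.0
print(spread(data_a), spread(data_b))  # 4 25
```

The means are identical, but the values in (b) are spread over a much wider interval.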

TYPES OF MEASURES OF DISPERSION
The following are the measures of dispersion.
The Range
The Semi-Interquartile Range or the Quartile Deviation
The Mean Deviation
The Variance and the Standard Deviation

RANGE
It is defined as the difference between the largest and smallest
observations in a set of data.
Range = R = Xm - X0
where Xm = the largest observation and X0 = the smallest observation.
The range is a very simple measure of variability and is only
concerned with the two most extreme observations. Its relative
measure, a co-efficient of dispersion, is known as the co-efficient of range.
Co-efficient of Range = (Xm - X0) / (Xm + X0)

EXAMPLE OF RANGE
Example:
Calculate the Range and Co-efficient of Range from the
following data: 15, 20, 18, 16, 30, 42, 12, 25
Solution:
Xm = 42, X0 = 12
R = Xm - X0 = 42 - 12 = 30
Co-efficient of Range = (Xm - X0) / (Xm + X0) = 30/54 ≈ 0.56
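The range and its co-efficient for this example can be computed as follows. This is a minimal sketch; the variable names are illustrative.

```python
data = [15, 20, 18, 16, 30, 42, 12, 25]

largest = max(data)   # Xm
smallest = min(data)  # X0

# Range = largest observation - smallest observation.
data_range = largest - smallest

# Co-efficient of Range = (Xm - X0) / (Xm + X0).
coefficient_of_range = (largest - smallest) / (largest + smallest)

print(largest, smallest, data_range)   # 42 12 30
print(round(coefficient_of_range, 2))  # 0.56
```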

STANDARD DEVIATION
 Standard deviation is the most commonly used and the most important
measure of variation.
 It determines whether the scores are generally near or far from the mean.
 In simple words, standard deviation tells how tightly all the scores are
clustered around the mean in a data set.
 When the scores are close to the mean, the standard deviation is small. A large
standard deviation tells us that the scores are spread apart.
 Standard deviation is simply the square root of variance.

VARIANCE
 Variance (σ²) in statistics is a measurement of the
spread between numbers in a data set. That is, it
measures how far each number in the set is from the
mean and therefore from every other number in the set.
 Variance measures how far a data set is spread out. It is
mathematically defined as the average of the squared
differences from the mean.
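The definition above (the average of the squared differences from the mean) can be sketched in Python using data set (a) from the dispersion slides. The sketch assumes the population form, dividing by the number of observations; a sample variance would divide by one less. The standard-library functions statistics.pvariance and statistics.pstdev serve as a cross-check.

```python
import math
import statistics

data = [10, 12, 11, 14, 13]  # data set (a) from the dispersion slides

n = len(data)
mean = sum(data) / n

# Variance: average of the squared differences from the mean
# (population form, an assumption: divide by n rather than n - 1).
variance = sum((x - mean) ** 2 for x in data) / n

# Standard deviation: square root of the variance.
std_dev = math.sqrt(variance)

print(mean, variance)     # 12.0 2.0
print(round(std_dev, 3))  # 1.414

# Cross-check against the standard library's population versions.
library_variance = statistics.pvariance(data)
library_std_dev = statistics.pstdev(data)
```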

NORMAL CURVE
One way of showing how data are distributed is to plot them in a graph.
If the data are distributed symmetrically about the mean, with most values
near the center, the graph takes the shape of a bell.
In statistics this curve is called a normal curve, and in the social sciences it is
called the bell curve.
Normal, or bell-curved, distributions of data may occur naturally in several
possible forms, with a number of possibilities for the standard deviation.

STANDARD NORMAL CURVE


SKEWNESS
Skewness tells us about the amount and direction of the
asymmetry of the data set.
It is a measure of symmetry, or more precisely, of the lack of symmetry.
A distribution or data set is symmetric if it looks the same to the
left and right of the central point.
If the bulk of the data is at the left, i.e. the peak is towards the left
and the right tail is longer, we say that the distribution
is skewed right or positively skewed.
If the bulk of the data is at the right and the left tail is longer, the
distribution is skewed left or negatively skewed.
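The slides do not give a formula for skewness. As an assumption for illustration, the sketch below uses the common moment coefficient of skewness (third central moment divided by the second central moment raised to the power 1.5); the function name moment_skewness is hypothetical.

```python
def moment_skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5 (assumed formula)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5

symmetric = [1, 2, 3, 4, 5]
# Scores from the mean example; 42 stretches the right tail.
right_tailed = [5, 10, 12, 16, 8, 42, 25, 15, 10, 7]

print(moment_skewness(symmetric))         # 0.0
print(moment_skewness(right_tailed) > 0)  # True
```

A value of zero indicates symmetry, and a positive value indicates a longer right tail.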

EXAMPLES OF SKEWNESS

KURTOSIS
 Kurtosis is a parameter that describes the shape of a distribution. It
is a measurement that tells us how peaked the graph of the data
is and how high the graph is around the mean.
 In other words, kurtosis measures the shape of
the distribution, i.e. the fatness of the tails; it focuses on how
values are arranged around the mean.
 A negative value means that too little data is in the tails, and a
positive value means that too much data is in the tails.

TYPES OF KURTOSIS
Kurtosis has three types: mesokurtic, platykurtic, and
leptokurtic.
If the distribution has a kurtosis of zero (measured as excess kurtosis,
i.e. relative to the normal curve), then the graph is
nearly normal. This nearly normal distribution is called
mesokurtic.
If the distribution has negative kurtosis, it is called platykurtic.
An example of a platykurtic distribution is a uniform distribution.
If the distribution has positive kurtosis, it is called leptokurtic.
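The sign of kurtosis can be explored with a short Python sketch. The slides give no formula, so this assumes the common moment definition of excess kurtosis (fourth central moment divided by the squared second central moment, minus 3). The function name excess_kurtosis and both data sets are made up for illustration.

```python
def excess_kurtosis(values):
    """Excess kurtosis: m4 / m2**2 - 3 (assumed moment formula)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m4 = sum((x - mean) ** 4 for x in values) / n  # fourth central moment
    return m4 / m2 ** 2 - 3

# Evenly spread values, similar to a uniform distribution (illustrative).
flat = list(range(1, 11))
# Most values at the center with two far-out values (illustrative).
peaked = [10] * 8 + [0, 20]

print(excess_kurtosis(flat) < 0)  # True  -> platykurtic
print(excess_kurtosis(peaked))    # 2.0   -> leptokurtic
```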

TYPES OF KURTOSIS

SELF ASSESSMENT ACTIVITY
Q. 1. Tell the basic purpose of measures of central tendency.
Q. 2. Define range and determine the range of given data.
Q. 3. Write down the formulas for determining quartiles.
Q. 4. Define mean or average deviation.
Q. 5. Determine variance and standard deviation.
Q. 6. Define normal curve.
Q. 7. Explain skewness and kurtosis.
