EDUCATIONAL STATISTICS
PRESENTED BY DR. HINA JALAL
Descriptive Data Analysis
Measures of Central Tendency
Measures of Dispersion
LEARNING OBJECTIVES
After completing this unit, students will be able to:
1. State the basic purpose of measures of central tendency.
2. Define range and determine the range of given data.
3. Write down the formulas for determining quartiles.
4. Define mean or average deviation.
5. Determine variance and standard deviation.
6. Define the normal curve.
7. Explain skewness and kurtosis.
7/18/2020
MEASURES OF CENTRAL TENDENCY
Measures of central tendency (also referred to as measures of center or
central location) allow us to summarize data with a single value: a
typical score among a group of scores. They give us an easy way to
describe a set of data with a single number that generally lies in the
middle of the dataset.
The three common measures are the mean, the median, and the mode.
MEAN / AVERAGE / ARITHMETIC MEAN
The mean (or average) is the most popular and best-known measure of central
tendency. It can be used with both discrete and continuous data, although it is
most often used with continuous data.
It is defined as the sum of all the observations divided by the number of
observations. It is denoted by X̄, so that X̄ = ΣX / n.
EXAMPLE OF MEAN
Example:
5, 10, 12, 16, 8, 42, 25, 15, 10, 7
Solution: Sum = 5+10+12+16+8+42+25+15+10+7 = 150, and n = 10
Mean = 150/10 = 15
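The calculation in the example of the mean can be checked with a short sketch. The variable names are illustrative and not part of the original slides; the data are the ten scores from the example.

```python
from statistics import mean

# The ten scores from the example of the mean.
scores = [5, 10, 12, 16, 8, 42, 25, 15, 10, 7]

total = sum(scores)        # sum of all the observations
n = len(scores)            # number of observations
manual_mean = total / n    # mean = sum / number of observations

print(total, n, manual_mean)  # 150 10 15.0

# Cross-check against the standard library.
assert manual_mean == mean(scores)
```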
MEAN OR AVERAGE
Qualities of a Good Average
An average that possesses all or most of the following
qualities is considered a good average.
It should be rigidly defined.
It should be easy to understand and easy to calculate.
It should be based on all the observations of the data.
It should be unaffected by extreme observations.
It should have sampling stability.
ADVANTAGES OF MEAN
It is rigidly defined and easy to understand.
It is easy to calculate.
It is based on all the observations of the data.
It has sampling stability.
DISADVANTAGES OF MEAN
It is highly affected by extreme values.
It cannot be accurately calculated for an open-ended
frequency distribution.
It cannot be calculated accurately if any observation
is missing.
MEDIAN (X̃)
The median is the middle score for a set of data that has been arranged in
order of magnitude. The median is less affected by outliers and skewed
data than the mean. To calculate the median, first arrange the data in order
and then take the middle value.
When to Apply the Median
We apply the median in situations where direct measurement of a
variable is not possible, such as poverty, beauty and intelligence.
EXAMPLE OF MEDIAN
Example: 12, 15, 10, 20, 18, 25, 45, 30, 26
First arrange the data in order:
10, 12, 15, 18, 20, 25, 26, 30, 45
There are 9 values, so the median is the 5th value.
So Median = 20
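The median example can be reproduced with a minimal sketch. The function name middle_value is hypothetical, and averaging the two middle values for an even number of observations is a common convention assumed here; the slides only show the odd case.

```python
from statistics import median

def middle_value(values):
    """Return the median: the middle value of the ordered data."""
    ordered = sorted(values)          # arrange in order of magnitude
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                    # odd count: a single middle value
        return ordered[mid]
    # Even count (assumed convention): average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2

data = [12, 15, 10, 20, 18, 25, 45, 30, 26]
print(sorted(data))        # [10, 12, 15, 18, 20, 25, 26, 30, 45]
print(middle_value(data))  # 20

# Cross-check against the standard library.
assert middle_value(data) == median(data)
```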
ADVANTAGES OF MEDIAN
It is easy to calculate and understand.
It is not affected by extreme values.
It can be computed even for an open-ended frequency
distribution.
It can be used for qualitative data.
It can be located graphically.
DISADVANTAGES OF MEDIAN
It is not rigorously defined.
It is not based on all the observations.
It is not suitable for further algebraic treatment.
MODE
The mode is the value that occurs most frequently in a set of data.
On a bar chart or histogram it corresponds to the highest bar. You can,
therefore, sometimes consider the mode as being the most popular
option. For example:
• 12, 24, 15, 18, 30, 48, 20, 24: Mode = 24
• 12, 15, 15, 18, 30, 48, 24, 24, 15, 24: Mode = 15 and 24 (each occurs three times)
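The two mode examples can be checked by counting how often each value occurs. This is a small sketch; the helper name modes is hypothetical and not from the slides.

```python
from collections import Counter

def modes(values):
    """Return every value that shares the highest frequency."""
    counts = Counter(values)              # value -> number of occurrences
    highest = max(counts.values())        # the largest frequency
    return sorted(v for v, c in counts.items() if c == highest)

print(modes([12, 24, 15, 18, 30, 48, 20, 24]))          # [24]
print(modes([12, 15, 15, 18, 30, 48, 24, 24, 15, 24]))  # [15, 24]
```

When no value repeats, this sketch returns every value, which a reader may prefer to treat as the case where no mode exists.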
APPLICATION OF MODE
When to Apply the Mode
We apply the mode when studying problems such as the average
size of shoes, the average size of ready-made garments, and the
average size of agricultural holdings. This average is widely used
in Biology and Meteorology.
ADVANTAGES OF MODE
It is easy to understand.
It is not affected by extreme values.
It can be computed even in open-end classes.
It can be useful in qualitative data.
DISADVANTAGES OF MODE
It is not clearly defined.
It is not suitable for further algebraic treatment.
It is not based on all the observations.
It may not exist in some cases.
SUMMARY OF WHEN TO USE THE MEAN, MEDIAN AND MODE
MEASURES OF DISPERSION
As the name suggests, a measure of dispersion shows the scatter of the data. It tells how much the
observations vary from one another and gives a clear idea of the distribution of the data. A measure of dispersion
shows the homogeneity or heterogeneity of the distribution of the observations.
Classification of Measures of Dispersion
Measures of dispersion are categorized as:
(i) Absolute measures of dispersion:
Measures that express the scattering of observations in terms of distances, i.e., the range and the quartile deviation.
Measures that express the variation in terms of the average of deviations of observations, such as the mean
deviation and the standard deviation.
(ii) Relative measures of dispersion:
We use a relative measure of dispersion to compare the distributions of two or more data sets and for unit-free
comparison. They are the coefficient of range, the coefficient of mean deviation, the coefficient of quartile
deviation, the coefficient of variation, and the coefficient of standard deviation.
MEASURES OF DISPERSION
A measure of central tendency does not tell us anything
about the spread of the data, because two sets of data may have
the same central tendency but differ greatly in
variability. Consider two data sets that have the same mean
but different variability.
(a) 10, 12, 11, 14, 13
(b) 10, 2, 18, 27, 3
MEASURES OF DISPERSION
These two data sets have the same mean, 12, but differ in their variation. There is more
variation in data (b) than in data (a). This illustrates the fact that a measure of
central tendency alone is not sufficient. We therefore need some additional
information about how the data are dispersed about the average.
This is what measuring dispersion provides. By dispersion we mean the degree to which
data tend to spread about an average value.
There are two types of measures of dispersion: absolute and relative.
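The comparison of the two data sets can be made concrete with a short sketch. It uses the five-value sets given earlier (10, 12, 11, 14, 13 and 10, 2, 18, 27, 3); the helper name summarize and the choice of the range as the spread measure are illustrative assumptions.

```python
def summarize(values):
    """Return (mean, range) for a list of numbers."""
    centre = sum(values) / len(values)   # arithmetic mean
    spread = max(values) - min(values)   # largest minus smallest value
    return centre, spread

data_a = [10, 12, 11, 14, 13]
data_b = [10, 2, 18, 27, 3]

print(summarize(data_a))  # (12.0, 4)
print(summarize(data_b))  # (12.0, 25)
```

Both sets share the mean 12, yet the second is spread over a much wider interval, which is the point the mean alone cannot show.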
TYPES OF MEASURES OF DISPERSION
The following are the measures of dispersion:
The Range
The Semi-Interquartile Range or the Quartile Deviation
The Mean Deviation
The Variance and the Standard Deviation
RANGE
It is defined as the difference between the largest and smallest
observations in a set of data.
Range = R = Xm - Xo
where Xm = the largest observation and Xo = the smallest observation.
The range is a very simple measure of variability and is concerned
only with the two most extreme observations. Its relative measure is
known as the coefficient of range (or coefficient of dispersion).
Co-efficient of Range = (Xm - Xo) / (Xm + Xo)
EXAMPLE OF RANGE
Example:
Calculate the Range and Co-efficient of Range from the
following data. 15, 20, 18, 16, 30, 42, 12, 25
Solution:
Xm = 42, Xo = 12
R = Xm - Xo = 42 - 12 = 30
Co-efficient of Range = (Xm - Xo) / (Xm + Xo) = 30 / 54 ≈ 0.56
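The range example can be worked through in code as follows. This is a minimal sketch with hypothetical function names; it assumes the largest and smallest values do not sum to zero, since the coefficient divides by that sum.

```python
def value_range(values):
    """Range: largest observation minus smallest observation."""
    return max(values) - min(values)

def coefficient_of_range(values):
    """Relative measure: (Xm - Xo) / (Xm + Xo)."""
    largest, smallest = max(values), min(values)
    return (largest - smallest) / (largest + smallest)

data = [15, 20, 18, 16, 30, 42, 12, 25]
print(value_range(data))                     # 30
print(round(coefficient_of_range(data), 4))  # 0.5556
```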
STANDARD DEVIATION
Standard deviation is the most commonly used and the most important
measure of variation.
It indicates whether the scores are generally near or far from the mean.
In simple words, standard deviation tells how tightly the scores are
clustered around the mean in a data set.
When the scores are close to the mean, the standard deviation is small; a large
standard deviation tells that the scores are spread apart.
Standard deviation is simply the square root of the variance.
VARIANCE
Variance (σ²) in statistics is a measurement of the
spread between numbers in a data set. That is, it
measures how far each number in the set is from the
mean and therefore from every other number in the set.
It is mathematically defined as the average of the squared
differences from the mean: σ² = Σ(X - X̄)² / n, where X̄ is the mean.
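The definition of variance as the average of the squared differences from the mean, and of standard deviation as its square root, can be sketched as below. The sketch divides by n (the population form, matching the wording above); dividing by n - 1 for a sample is a different convention not covered in these slides. The function name and the reuse of the two five-value data sets from the dispersion slides are illustrative choices.

```python
from math import sqrt
from statistics import pstdev, pvariance

def population_variance(values):
    """Average of the squared differences from the mean (divides by n)."""
    n = len(values)
    centre = sum(values) / n
    return sum((x - centre) ** 2 for x in values) / n

data_a = [10, 12, 11, 14, 13]
data_b = [10, 2, 18, 27, 3]

var_a = population_variance(data_a)
var_b = population_variance(data_b)
print(var_a)  # 2.0
print(var_b)  # 89.2

# Standard deviation is the square root of the variance.
sd_a, sd_b = sqrt(var_a), sqrt(var_b)
print(round(sd_a, 2), round(sd_b, 2))

# Cross-check against the standard library's population versions.
assert abs(var_b - pvariance(data_b)) < 1e-9
assert abs(sd_a - pstdev(data_a)) < 1e-9
```

Although both sets have mean 12, the variance of the second is far larger, which matches the earlier observation that it is more spread out.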
NORMAL CURVE
One way of showing how data are distributed is to plot them in a graph.
If the data are distributed symmetrically about the mean, with most values near the
center and fewer toward the extremes, the graph forms a bell-shaped curve.
In statistics this curve is called a normal curve and in the social sciences it is
called the bell curve.
A normal or bell-curved distribution of data may occur naturally in several
possible ways, with a number of possible values for the standard deviation.
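How the standard deviation changes the shape of a normal curve can be illustrated with Python's statistics.NormalDist. This is only a sketch: the mean of 0 and the standard deviations of 1 and 2 are arbitrary illustrative values, not figures from the slides.

```python
from statistics import NormalDist

# Two normal curves with the same mean but different standard deviations.
narrow = NormalDist(mu=0, sigma=1)
wide = NormalDist(mu=0, sigma=2)

for curve in (narrow, wide):
    peak = curve.pdf(curve.mean)  # height of the curve at its center
    # Share of the area lying within one standard deviation of the mean.
    within_one_sd = (curve.cdf(curve.mean + curve.stdev)
                     - curve.cdf(curve.mean - curve.stdev))
    print(curve.stdev, round(peak, 3), round(within_one_sd, 3))
```

The curve with the larger standard deviation is lower and wider, while the share of values within one standard deviation of the mean stays at roughly 68 percent for both.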
STANDARD NORMAL CURVE

SKEWNESS
Skewness tells us about the amount and direction of the
asymmetry of the data set.
It is a measure of symmetry. A distribution or data set is
symmetric if it looks the same to the left and right of
the central point.
If the bulk of the data is at the left, i.e. the peak is towards the left
and the right tail is longer, we say that the distribution
is skewed right or positively skewed.
If the bulk of the data is at the right and the left tail is longer,
the distribution is skewed left or negatively skewed.
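The direction of skewness can be checked numerically. The slides give no formula, so this sketch assumes one common choice, the moment coefficient of skewness (third central moment divided by the 1.5 power of the second); the function name and the small data sets are made up for illustration.

```python
def skewness(values):
    """Moment coefficient of skewness: m3 / m2 ** 1.5 (an assumed formula)."""
    n = len(values)
    centre = sum(values) / n
    m2 = sum((x - centre) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - centre) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5

right_tailed = [1, 2, 2, 3, 3, 3, 4, 10]      # bulk at the left, long right tail
symmetric = [1, 2, 3, 4, 5]                   # same shape on both sides
left_tailed = [-x for x in right_tailed]      # mirror image: long left tail

print(skewness(right_tailed) > 0)  # True: positively skewed
print(skewness(symmetric))         # 0.0: symmetric
print(skewness(left_tailed) < 0)   # True: negatively skewed
```

The sketch would fail with a division by zero if every value were identical, since such data has no spread to be skewed.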
EXAMPLES OF SKEWNESS
KURTOSIS
Kurtosis is a parameter that describes the shape of a distribution. It
is a measurement that tells us how peaked the graph of the data
is and how high the graph is around the mean.
In other words, kurtosis measures the shape of
the distribution, i.e. the fatness of the tails; it focuses on how
the values are arranged around the mean.
A negative value means that too little data is in the tails and a
positive value means that too much data is in the tails.
TYPES OF KURTOSIS
Kurtosis has three types: mesokurtic, platykurtic, and
leptokurtic.
If the distribution has a kurtosis of zero (measured relative to the normal curve), then the graph is
nearly normal. This nearly normal distribution is called
mesokurtic.
If the distribution has negative kurtosis, it is called platykurtic.
An example of a platykurtic distribution is a uniform distribution.
If the distribution has positive kurtosis, it is called leptokurtic.
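The three types can be told apart with a small sketch. It assumes the usual excess kurtosis formula (fourth central moment divided by the squared second central moment, minus 3, so that a normal curve scores zero). The function names, the tolerance of 0.1 used to call a value "nearly zero", and the sample data are all hypothetical.

```python
def excess_kurtosis(values):
    """Excess kurtosis: m4 / m2 ** 2 - 3 (zero for a normal curve)."""
    n = len(values)
    centre = sum(values) / n
    m2 = sum((x - centre) ** 2 for x in values) / n  # second central moment
    m4 = sum((x - centre) ** 4 for x in values) / n  # fourth central moment
    return m4 / m2 ** 2 - 3

def kurtosis_type(values, tolerance=0.1):
    """Classify by the sign of excess kurtosis; tolerance is an assumption."""
    k = excess_kurtosis(values)
    if abs(k) <= tolerance:
        return "mesokurtic"
    return "leptokurtic" if k > 0 else "platykurtic"

flat = list(range(1, 10))                      # evenly spread, like a uniform distribution
heavy_tails = [-5, 0, 0, 0, 0, 0, 0, 0, 0, 5]  # sharp peak with two far-out values

print(kurtosis_type(flat))         # platykurtic
print(kurtosis_type(heavy_tails))  # leptokurtic
```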
SELF ASSESSMENT ACTIVITY
Q. 1. State the basic purpose of measures of central tendency.
Q. 2. Define range and determine the range of given data.
Q. 3. Write down the formulas for determining quartiles.
Q. 4. Define mean or average deviation.
Q. 5. Determine variance and standard deviation.
Q. 6. Define the normal curve.
Q. 7. Explain skewness and kurtosis.

Basics of Educational Statistics (Descriptive statistics)
