Topic 3

Uploaded by

eddyyow


Topic 3: Descriptive Statistics

Topic outline
• Introduction
• Constructing a Frequency Distribution
• Measures of Central Tendency
• Measures of Variability

Objectives
• To construct a frequency distribution
• To calculate the mean, median, and mode for a population and a sample
• To calculate the range, variance, and standard deviation for a population and a sample

Why This Topic

To describe situations, draw conclusions, or make inferences about events, one must organise the data in some
meaningful way. The most convenient method of organising data is to construct a frequency distribution. After
organising the data, the researcher must present them so they can be understood by those who will benefit from
reading the study. The most useful method of presenting the data is by constructing statistical charts and graphs.
There are many different types of charts and graphs, and each one has a specific purpose. This lesson also covers the
statistical measures that can be used to summarise data: the mean, median, mode, range, variance, and standard
deviation.

Introduction

Statistics
Statistics is the mathematical science that deals with the collection, analysis, and presentation of data, which
can then be used as a basis for inference and induction.

Data
Values assigned to observations or measurements

Information
Data that are transformed into useful facts that can be used for a specific purpose, such as making a decision

The Two Main Types of Data

Data can be classified into two categories, namely qualitative and quantitative

© UNITAR International University Page 1 of 22

Classifying Data by Level of Measurement

Branches of Statistics

1. Descriptive statistics

• collecting, summarising, and displaying data

2. Inferential statistics

• making claims or conclusions about the data based on a sample

Population and Sample

1. Population

• represents all possible subjects that are of interest in a particular study

2. Sample

• refers to a portion of the population that is representative of the population from which it was selected

Parameter and Statistic

• Parameter – a described characteristic of a population

• Statistic – a described characteristic of a sample

Inferential Statistics

Making claims about a population by examining sample results

• Example:

Constructing a Frequency Distribution

A frequency distribution shows the number of data observations that fall into specific intervals.
• Graphically summarise information not readily observable by merely looking at data in a table.
• A class is a category (row) in a frequency distribution.

Example: Number of iPads sold per day

Discrete data are values based on observations that can be counted and are typically represented by whole
numbers.

• Represent something that has been counted.

• Take on whole numbers such as 0, 1, 2, 3.

Continuous data are values that can take on any real numbers, including numbers that contain decimal points.
• Usually measured rather than counted.
• Examples are weight, time, and distance.

Examples of Discrete data

• Number of children per family.
• Number of cars listed per insurance policy.
• Vacation days per month.

Examples of Continuous data

• Time required to read chapter 2.
• Thickness of paint applied to a car body.
• Voltage of batteries produced in August.

Relative frequency distributions display the proportion of observations of each class relative to the total number
of observations.
• Shows the fraction of observations in each class.
• Found by dividing each frequency by the total number of observations.
• The fractions in a relative frequency distribution add up to 1.00.

Example:

Cumulative Relative Frequency Distributions

A cumulative relative frequency distribution totals the proportion of observations that are less than or equal to
the class at which you are looking.

• Shows the accumulated proportion as values vary from low to high
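
As a rough illustration of the two definitions above, the sketch below turns a list of observations into relative and cumulative relative frequencies. The daily sales figures are made up for illustration (they are not the data from the iPad example), and the function name `relative_frequencies` is hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import accumulate

# Hypothetical data: units sold per day over 10 days (illustrative only).
daily_sales = [0, 1, 1, 2, 2, 2, 3, 3, 4, 1]


def relative_frequencies(values):
    """Return {class: proportion}, with classes ordered from low to high."""
    counts = Counter(values)
    total = len(values)
    return {value: Fraction(counts[value], total) for value in sorted(counts)}


rel = relative_frequencies(daily_sales)
# A running total of the proportions gives the cumulative relative frequency.
cum = dict(zip(rel, accumulate(rel.values())))

for value in rel:
    print(f"{value}: relative = {float(rel[value]):.2f}, cumulative = {float(cum[value]):.2f}")
```

Exact fractions are used so that the proportions add up to exactly 1, and the last cumulative value is therefore also exactly 1.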

Using a Histogram to Graph a Frequency Distribution

A histogram is a graph showing the number of observations in each class of a frequency distribution.

The Shape of Histograms

Constructing a Frequency Distribution Using Grouped Quantitative Data

Ideally, the number of classes in a frequency distribution should be between 4 and 20.
• Some data sets, particularly those with continuous data, require several values to be grouped together
in a single class.
• This grouping prevents having too many classes in the frequency distribution, which can make it difficult
to detect patterns.

Number of Classes
One method to determine the number of classes in a frequency distribution is the rule
$2^k \geq n$
where k = Number of classes
n = Number of data points

• Find the lowest value of k that satisfies the rule.

Suppose n = 50
$2^5 = 32 < 50$ (k = 5 is too small.)
$2^6 = 64 > 50$ (k = 6 is a good choice.)
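
The search for the lowest k in the rule above can be sketched in a few lines. The function name `number_of_classes` is hypothetical and only illustrates the rule.

```python
def number_of_classes(n):
    """Smallest k such that 2**k >= n."""
    if n < 1:
        raise ValueError("n must be a positive number of data points")
    k = 1
    while 2 ** k < n:
        k += 1
    return k


print(number_of_classes(50))  # 2**5 = 32 < 50 and 2**6 = 64 >= 50, so k = 6
```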

Class Width

Once k is known, the width of each class can be found.

• The width is the range of numbers to put into each class.

• Round this estimation to a useful whole number that makes the frequency distribution more readable.
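
The width formula itself is missing from these notes. A common textbook estimate, assumed here, divides the spread of the data (highest value minus lowest value) by the number of classes and then rounds up to a convenient whole number. The function name and the call durations below are hypothetical.

```python
import math


def estimated_class_width(values, k):
    """Assumed estimate: (max - min) / k, rounded up to a whole number."""
    raw_width = (max(values) - min(values)) / k
    return math.ceil(raw_width)


# Hypothetical call durations in minutes (illustrative only).
minutes = [3.2, 4.8, 5.1, 6.7, 7.4, 8.9, 9.3, 10.6, 11.2, 11.9]
print(estimated_class_width(minutes, k=3))  # (11.9 - 3.2) / 3 is about 2.9, rounded up to 3
```

Rounding up rather than to the nearest value is a design choice in this sketch: it guarantees that k classes of the chosen width span the whole data set.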

There is no one correct answer for the class width.
• The goal is to create a histogram to clearly and usefully show the pattern in the data.
• Often there is more than one acceptable way to accomplish this.

Class Boundaries

Class boundaries represent the minimum and maximum values for each class.
• Choose class boundaries that are easy to read.

☺🗹                                 ☹🗷
3 to less than 6 minutes           3.21 to less than 6.21 minutes
6 to less than 9 minutes     vs.   6.21 to less than 9.21 minutes
9 to less than 12 minutes          9.21 to less than 12.21 minutes

Class Frequencies

Find class frequencies by counting and recording the number of observations in each class.
• This is easier when the data are sorted.
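
The counting step can be sketched as follows, assuming equal-width classes of the form "a to less than b". The function name, its parameters, and the call durations are hypothetical.

```python
def class_frequencies(values, lower, width, k):
    """Count observations in k classes of the form 'a to less than a + width'."""
    counts = [0] * k
    for value in values:
        index = int((value - lower) // width)
        if not 0 <= index < k:
            raise ValueError(f"{value} falls outside the class boundaries")
        counts[index] += 1
    return counts


# Hypothetical call durations in minutes (illustrative only).
minutes = [3.2, 4.8, 5.1, 6.7, 7.4, 8.9, 9.3, 10.6, 11.2, 11.9]
counts = class_frequencies(minutes, lower=3, width=3, k=3)
for i, count in enumerate(counts):
    print(f"{3 + 3 * i} to less than {6 + 3 * i} minutes: {count}")
```

Because each value maps to exactly one class index, no observation can be counted twice, and the error raised for out-of-range values makes sure every observation is accounted for.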

Example:

Rules for Classes for Grouped Data

1. Equal-size classes - all classes in the frequency distribution must be of equal width.
2. Mutually exclusive classes - class boundaries cannot overlap.
3. Include all data values - make sure all data values are accounted for in the total row of the frequency
distribution.
4. Avoid empty classes - it is undesirable for a histogram to display a class so narrow that there are no
observations in it.
5. Avoid open-ended classes (if possible) - these violate the first rule of equal class sizes.

The Consequences of Too Few or Too Many Classes

Wide classes result in few class intervals:
• Can obscure important patterns
• Gives a “blocky” distribution graph
• Summarizes the data too much
• Tells us little about the true distribution shape

Too many narrow classes have consequences:

• Results in a “jagged” histogram
• Some classes may be empty
• Does not summarize the data enough

The Ogive
The ogive is a line graph that plots the cumulative relative frequency distribution.
It provides a simple representation of the frequencies that are less than or equal to a certain number.

Displaying Qualitative Data

Qualitative data are values that are categorical.

• Can be nominal or ordinal measurement level.
• Describe a characteristic, such as gender or level of education.

Frequency distributions help display qualitative data by indicating the number of occurrences of various
categories.

Bar Charts
Bar charts are a good tool for displaying qualitative data that have been organised into categories.

Vertical Bar Chart Horizontal bar chart

Pareto Charts
Pareto charts are bar charts that show the frequency of the categories that cause quality control problems.
Show quality problem categories in decreasing order:
• The most problematic categories are shown first

Pareto charts also plot the cumulative relative frequency as a line on the chart known as an ogive.

Pie Charts
Pie charts are another excellent tool for comparing proportions for categorical data.
Each segment of the pie represents the relative frequency of one category:
• All categories in the data set must be included in the pie.
• Use a pie chart to compare the relative sizes of all possible categories.
• Bar charts are more useful when you want to highlight the actual data values and when the classes
combined don’t form a whole.

Stem and Leaf Display

A stem and leaf display splits the data values into stems (the larger place values) and leaves (the smaller place
values). By listing all of the leaves to the right of each stem, we can graphically describe how the data are
distributed.

• All the original data points are visible on the display.

• Easy to construct by hand.
• Provides a histogram-like view of the distribution.

For this example, use the 10’s digit as the stem
Use the 1’s digit as the leaf

1. Sort the data from lowest to highest.

2. Determine the unique stem values.
7, 8, 9 are the different stem values in this example.
3. List the stems in a vertical column and then add the leaf values to the right of the appropriate stem, in
ascending order.

7|8 8 9 9 9
8|0 0 0 0 1 1 2 3 3 4 4 4 5 6 7 8
9|0 2 5

To get more detail the stems can be split in half

7(5) | 8 8 9 9 9
8(0) | 0 0 0 0 1 1 2 3 3 4 4 4
8(5) | 5 6 7 8
9(0) | 0 2
9(5) | 5

• The stem labeled 7(5) stores all the scores between 75 and 79.
• The stem 8(0) stores all the scores between 80 and 84.
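
The three construction steps can be sketched in code. The 24 scores below are read back from the display above; the function name `stem_and_leaf` is hypothetical, and the sketch assumes two-digit whole numbers.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return display lines using the tens digit as stem and the ones digit as leaf."""
    leaves = defaultdict(list)
    for value in sorted(values):
        leaves[value // 10].append(value % 10)
    return [f"{stem}|{' '.join(str(leaf) for leaf in leaves[stem])}" for stem in sorted(leaves)]


# The 24 scores recovered from the stem and leaf display in the notes.
scores = [78, 78, 79, 79, 79, 80, 80, 80, 80, 81, 81, 82, 83, 83, 84, 84,
          84, 85, 86, 87, 88, 90, 92, 95]
for line in stem_and_leaf(scores):
    print(line)
```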

Measures of Central Tendency

A measure of central tendency is a single value used to describe the center point of a data set.

The Mean
The mean, or average, is the most common measure of central tendency.
• Calculate the mean by adding all the values in a data set and then dividing the result by the number of
observations.

The formula for the Sample Mean:

$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$

The formula for the Population Mean:

$\mu = \frac{\sum_{i=1}^{N} x_i}{N}$

Example: suppose a sample of size n = 5 gives the following values:

6.2 7.1 4.8 9.0 3.3

The sample mean: $\bar{x} = \frac{6.2 + 7.1 + 4.8 + 9.0 + 3.3}{5} = \frac{30.4}{5} = 6.08$
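
The same calculation can be checked in a few lines, using the five sample values above. This is only a sketch; `manual_mean` is a name chosen for illustration.

```python
from statistics import mean

sample = [6.2, 7.1, 4.8, 9.0, 3.3]

# Definition: add all the values, then divide by the number of observations.
manual_mean = sum(sample) / len(sample)

print(round(manual_mean, 2))   # 30.4 / 5 = 6.08
print(round(mean(sample), 2))  # the standard library gives the same result
```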

Advantages and Disadvantages of Using the Mean to Summarise Data

Advantages:
• Simple to calculate.
• Summarises the data with a single value

Disadvantages:
• With only a summary value you lose information about the original data.

• Sample 1 with n = 3: 999, 1000, 1001 𝑥̅ = 1000
• Sample 2 with n = 3: 0, 1000, 2000 𝑥̅ = 1000
• Just knowing the mean does not help you know what the underlying data looks like.
• The value of the mean is sensitive to outliers (values that are much higher or lower than most of the data).

The Median
The median is the value in the data set for which half the observations are higher and half the observations are
lower.
• First arrange the data in ascending order.

Example with sample of size n = 7:

21 27 27 28 34 45 50

With n = 7, the median is the middle value, which is in the fourth position of the sorted data.
21 27 27 28 34 45 50

The median is not sensitive to outliers.

21 27 27 28 34 45 5000
• The median is still 28.

When there is an odd number of data values, the median is always the middle value in the sorted data set.

When there is an even number of data values, the median is halfway between the two middle values.
Example with a sample of size n = 6:

145 157 170 182 204 209
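
The three median examples above can be checked with Python's standard library. The result for the n = 6 sample is not stated in the notes; the value in the comment is computed from the data shown.

```python
from statistics import median

odd_sample = [21, 27, 27, 28, 34, 45, 50]
with_outlier = [21, 27, 27, 28, 34, 45, 5000]
even_sample = [145, 157, 170, 182, 204, 209]

print(median(odd_sample))    # 28, the fourth of seven sorted values
print(median(with_outlier))  # 28, unchanged by the outlier
print(median(even_sample))   # 176.0, halfway between 170 and 182
```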

The Mode
The mode is the value that appears most often in a data set.
• If no data value or category occurs more than once, then we say that the mode does not exist.
• More than one mode can exist if two or more values tie for the most frequent.

The mode is a particularly useful way to describe categorical data.

Example with numerical data:

• Number of children per family in a sample of 24 families:

0,0,0,0,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,4,5

Number of children    Frequency
0                     4
1                     5
2                     8
3                     4
4                     2
5                     1

The value that appears most often is 2 (occurs 8 times), so the mode = 2 children.
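
A short sketch of the same tally, using the 24 family sizes listed above:

```python
from collections import Counter
from statistics import multimode

children = [0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2,
            3, 3, 3, 3, 4, 4, 5]

frequency = Counter(children)
print(sorted(frequency.items()))  # [(0, 4), (1, 5), (2, 8), (3, 4), (4, 2), (5, 1)]
print(multimode(children))        # [2]; a tie would list every tied value
```

One difference in convention is worth noting: when every value occurs exactly once, `multimode` returns all of the values, whereas these notes say that the mode does not exist in that case.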

Example with categorical data:

• The car that appears most often is Toyota (which occurs 7 times), so the mode is the Toyota model.

Example:
Prices for 5 homes have been collected

House Prices:

$2,000,000
500,000
300,000
100,000
100,000

Sum 3,000,000
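
The notes give only the sum for this example. As a sketch, the three measures of central tendency can be computed from the five prices as follows; the values in the comments are calculated from the listed data rather than quoted from the notes.

```python
from statistics import mean, median, mode

house_prices = [2_000_000, 500_000, 300_000, 100_000, 100_000]

print(sum(house_prices))     # 3000000
print(mean(house_prices))    # 600000, pulled upward by the $2,000,000 home
print(median(house_prices))  # 300000, the middle of the five sorted prices
print(mode(house_prices))    # 100000, the only price that repeats
```

Four of the five homes cost less than the mean, which shows how a single outlier can make the mean a poor summary of a typical value.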

Which Measure of Central Tendency Should You Use?

The mean is generally used as it is relatively easy to determine and most widely understood by people with little
statistical training.

If outliers are present, the median is often used, since the median is not sensitive to outliers.
• For example, home prices for a region are often reported as a median, because the median is less affected by a few very high prices.

For categorical data, the mode is the only choice.

Measures of Variability

Measures of variability show how much spread is present in the data.

The Range
The range is the simplest measure of variation: the difference between the highest value and the lowest value in a data set.

Advantages:
• Easy to calculate and understand

Disadvantages:
• Only based on two numbers in the data set (ignores the way in which the data are distributed)
• Sensitive to outliers
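
A minimal sketch of the range, reusing the two seven-value samples from the median section to show the sensitivity to outliers. The function name `data_range` is hypothetical.

```python
def data_range(values):
    """Range = highest value minus lowest value."""
    return max(values) - min(values)


sample = [21, 27, 27, 28, 34, 45, 50]
with_outlier = [21, 27, 27, 28, 34, 45, 5000]

print(data_range(sample))        # 50 - 21 = 29
print(data_range(with_outlier))  # 5000 - 21 = 4979
```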

Example:

The Variance and Standard Deviation

The Standard Deviation
The standard deviation is the square root of the variance.
• Has the same units as the original data

Sample standard deviation formula:

Calculating the Sample Standard Deviation
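
The formulas and the worked example for this section are missing from these notes. The sketch below uses the standard definitions, $s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$ and $s = \sqrt{s^2}$, applied to the five sample values from the mean example; the variable names are chosen for illustration.

```python
import math
from statistics import stdev, variance

sample = [6.2, 7.1, 4.8, 9.0, 3.3]
n = len(sample)
x_bar = sum(sample) / n

# Sum of squared deviations from the mean, divided by n - 1.
sample_variance = sum((x - x_bar) ** 2 for x in sample) / (n - 1)
sample_sd = math.sqrt(sample_variance)

print(round(sample_variance, 3), round(sample_sd, 3))       # 4.737 2.176
print(round(variance(sample), 3), round(stdev(sample), 3))  # same values from the standard library
```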

Short-Cut Formulas for the Sample Variance and Standard Deviation
Equivalent, but easier for hand calculations:
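
The short-cut formulas are missing from these notes. The sketch below assumes the usual algebraically equivalent form, $s^2 = \frac{\sum x_i^2 - (\sum x_i)^2 / n}{n - 1}$, and applies it to the five sample values from the mean example.

```python
import math

sample = [6.2, 7.1, 4.8, 9.0, 3.3]
n = len(sample)

sum_x = sum(sample)                      # sum of the values: 30.4
sum_x_sq = sum(x ** 2 for x in sample)   # sum of the squared values: 203.78

# Short-cut form: no need to compute each deviation from the mean.
shortcut_variance = (sum_x_sq - sum_x ** 2 / n) / (n - 1)
shortcut_sd = math.sqrt(shortcut_variance)

print(round(shortcut_variance, 3), round(shortcut_sd, 3))  # 4.737 2.176
```

Only two running totals are needed, which is why this form suits hand calculation. In floating-point code it can lose precision when the values are large relative to their spread, so the definitional form is often the safer choice there.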

The Variance and Standard Deviation for a Population

Used when the data set represents an entire population rather than a sample from a population
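
The population formulas are also missing from these notes. The standard definition, $\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$, divides by N instead of n - 1. In the sketch below, the five values from the earlier mean example are treated as a complete population purely for illustration.

```python
from statistics import pstdev, pvariance

# Assumption for illustration: these five values are the entire population.
population = [6.2, 7.1, 4.8, 9.0, 3.3]
N = len(population)
mu = sum(population) / N

# Population variance divides by N rather than n - 1.
population_variance = sum((x - mu) ** 2 for x in population) / N

print(round(population_variance, 4))    # 18.948 / 5 = 3.7896
print(round(pvariance(population), 4))  # same value from the standard library
print(round(pstdev(population), 4))     # square root of the population variance
```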

Short-Cut Formulas for the Population Variance and Standard Deviation

Example calculation using short-cut formula:

The standard deviation is a common measure of consistency in business applications, such as quality control.
• The standard deviation measures the amount of variability around the mean.

The standard deviation is affected by the scale of the data.

• When sample means are very different, comparing standard deviations can be misleading.

The Coefficient of Variation

The coefficient of variation, CV, measures the standard deviation in terms of its percentage of the mean.

• A high CV indicates high variability relative to the size of the mean.

• A low CV indicates low variability relative to the size of the mean.

A smaller coefficient of variation indicates more consistency within a set of data values.
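
The CV formula is missing from these notes. The sketch below assumes the usual sample form, $CV = \frac{s}{\bar{x}} \times 100$, and applies it to the two three-value samples from the discussion of the mean. The function name is hypothetical.

```python
from statistics import mean, stdev


def coefficient_of_variation(sample):
    """Sample CV as a percentage: (standard deviation / mean) * 100."""
    return stdev(sample) / mean(sample) * 100


sample_1 = [999, 1000, 1001]
sample_2 = [0, 1000, 2000]

print(round(coefficient_of_variation(sample_1), 2))  # s = 1, mean = 1000, so CV = 0.1
print(round(coefficient_of_variation(sample_2), 2))  # s = 1000, mean = 1000, so CV = 100.0
```

Both samples have the same mean, so the much smaller CV of the first sample reflects its greater consistency.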

Example:

Working with Grouped Data
Suppose data has already been summarised by a frequency distribution.
• The individual data values are no longer shown.
• Only grouped data is available.

To estimate the average for the frequency distribution:

• Find the midpoint for each group. (The midpoint is the halfway point in each group.)
• Use the midpoint as a representative value for that group.

Example: The Mean of Grouped Data

An online merchant has collected the following grouped data for the number of web pages viewed
by a sample of its customers:

Number of pages     Frequency
1 to under 5        6
5 to under 9        12
9 to under 13       10
13 to under 17      4

The merchant would like to calculate the average number of viewed pages.

1. Find the midpoint of each class

Number of pages     Midpoint (mi)     Frequency
1 to under 5        3                 6
5 to under 9        7                 12
9 to under 13       11                10
13 to under 17      15                4

2. Calculate the mean

The average number of viewed pages is about 8.5.
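
The calculation behind this result can be sketched as follows. Each midpoint is taken as the halfway point between the class boundaries (3, 7, 11, and 15), and the mean is the frequency-weighted average of the midpoints. The variable names are chosen for illustration.

```python
# (lower bound, upper bound, frequency) for each class in the example.
classes = [(1, 5, 6), (5, 9, 12), (9, 13, 10), (13, 17, 4)]

n = sum(freq for _, _, freq in classes)

# Each midpoint stands in for every observation in its class.
weighted_sum = sum(((low + high) / 2) * freq for low, high, freq in classes)
grouped_mean = weighted_sum / n

print(n, weighted_sum, grouped_mean)  # 32 272.0 8.5
```

The result is an estimate, because the individual values inside each class are no longer known.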

The Variance and Standard Deviation of Grouped Data
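
The content of this section is missing from these notes. A common approach, assumed here, reuses the class midpoints: $s^2 \approx \frac{\sum f_i (m_i - \bar{x})^2}{n - 1}$, where $f_i$ is the class frequency and $m_i$ the class midpoint. The sketch applies it to the web page data from the grouped mean example.

```python
import math

# (lower bound, upper bound, frequency) for each class in the example.
classes = [(1, 5, 6), (5, 9, 12), (9, 13, 10), (13, 17, 4)]

n = sum(freq for _, _, freq in classes)
midpoints = [(low + high) / 2 for low, high, _ in classes]
freqs = [freq for _, _, freq in classes]

grouped_mean = sum(m * f for m, f in zip(midpoints, freqs)) / n

# Assumed formula: frequency-weighted squared deviations of the midpoints, divided by n - 1.
grouped_variance = sum(f * (m - grouped_mean) ** 2 for m, f in zip(midpoints, freqs)) / (n - 1)
grouped_sd = math.sqrt(grouped_variance)

print(round(grouped_variance, 2), round(grouped_sd, 2))  # 14.19 3.77
```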

- end of content –
