Unit 4: Data Management (10 hours)
Introduction
Statistics is very important, especially in academic endeavors such as research
writing. Data management is one of the processes involved in arriving at
accurate findings and conclusions.
This unit will broaden your understanding of Mathematics as it relates to
managing data. You are expected to apply methods for organizing and
analyzing large amounts of information and carry out a culminating
investigation that integrates statistical concepts and skills.
Moreover, this unit covers important statistical tools in data management. It
presents gathering and organizing data, representing data using graphs
and charts, interpreting organized data, measures of central tendency,
measures of dispersion and relative position, the normal distribution curve,
and linear correlation.
In this unit, you are expected to gain a practical, legal, and ethical understanding
of how to access, query, and manage data collections, using real-world
datasets, standard software packages, and data visualization techniques. You will
learn how to organize and analyze data collections to answer questions about
the world, and develop an appreciation of user needs surrounding
data systems.
Do your best in accomplishing the different tasks provided in this unit and
answer the questions honestly by considering your previous experiences and
prior knowledge.
Enjoy your learning!
Learning Outcomes
Upon the completion of this unit, you are expected to:
a. Use a variety of statistical tools to process and manage numerical data;
b. Calculate the measures of central tendency and measures of dispersion for a
set of discrete data;
c. Identify the location of data in a given set of observations;
d. Determine the relationship that exists between two quantitative variables;
e. Use the methods of linear regression and correlations to predict the value of
a variable given certain conditions; and
f. Advocate the use of statistical data in making important decisions.
Activating Prior Learning
Directions: Find out how much you already know about these topics. On a
sheet of paper, write the letter of the option that best answers the question.
1. For the set of data consisting of 8, 8, 9, 10, and 10, which of these is
TRUE?
A. Mean = Mode C. Mean = Median
B. Median = Mode D. Mean < Median
2. Ten people contributed 100, 200, 100, 300, 300, 200, 200, 150, 100,
and 100 pesos for a door prize. What is the median contribution?
A. 100 C. 175
B. 150 D. 200
3. Which of the following indicates how many standard deviations a data
point is from the mean?
A. Z-Score C. Quantiles
B. Box-and-Whisker Plot D. Skewness
4. Which of these is equivalent to the median of a distribution?
A. First Quartile C. Fifth Decile
B. Tenth Percentile D. Second Quartile
5. Which of the following statistical tests allows us to determine the
strength of association of two quantitative variables?
A. T-test C. Chi-square
B. Linear Correlation D. Regression Analysis
Topic 1: Data Gathering, Organization, Presentation and
Interpretation
Learning Objectives
Upon the completion of this topic, you are expected to:
a. summarize and present data using the different methods of data
presentation;
b. construct graphs and tables to present given data; and
c. interpret the data presented.
Presentation of Content
I. Data Gathering
Research is only valuable if you can share the data effectively. In this topic,
you will learn how to organize data and construct various charts and graphs to
represent them.
What is Data?
Data is a collection of information from facts, statistics, numbers,
characteristics, observations, and measurements that represent an idea. There
are two forms of data.
1. Quantitative data deals with the quantity (for example, the number of
whales at Sea World).
2. Qualitative data is another form of data that deals with the description
of things. It can be observed but not measured (such as the color of
your eyes).
What are the Levels of Measuring Data?
When grouped, data can be formed into a single variable. Variables in
quantitative analysis are usually classified by their level of measurement, as
indicated below.
1. Nominal data are categorical variables and have the lowest level of
measurement. Categorical means that the values are labels rather than
numerical quantities. Examples are civil status, ID number, religion, sex, etc.
When you are asked about your civil status, you will not answer 1, 2, 3,
etc. Rather, your answer would be single, married, widow, or
widower. These data (single, married, widow, widower) are called
categorical data.
Likewise, sex is either male or female, not 4 or 5; the category is either
female or male.
2. Ordinal data are categorical variables with order (e.g., level of
satisfaction, quality of life indices).
3. Interval data are quantitative variables with no true zero point (e.g.,
temperature in degrees Celsius, Intelligence Quotient).
4. Ratio data have the highest level of measurement and a true zero point
(e.g., weight of a child, number of vaccinations).
A. Methods of Gathering Data
There are different methods that you can use to collect data and they are the
following:
1. Direct method is data collection through the use of interviews. The
enumerator talks to the subject personally and gets the data through a
series of questions asked of the subject of the interview.
2. Indirect method is data collection through the use of questionnaires.
These questionnaires may be sent through postal or electronic mail.
3. Observation is data collection by watching the subject with the use of our
senses. For example, the MMDA reports every week on the number of
accidents happening along EDSA. To do this, MMDA personnel
count the number of accidents seen through their CCTV.
4. Experimentation is data collection through experiments, usually done in
laboratories and classrooms.
5. Registration is acquiring data from private and government agencies
such as the National Statistics Office, the Bangko Sentral ng
Pilipinas, the Department of Finance, etc.
II. Organization of Data
After data has been collected, it can be consolidated and summarized in tables.
When the variable of interest is qualitative, the statistical table is a list of the
categories being considered, along with a measure of how often each value
occurred.
The data can be summarized through the following ways:
A. The frequency or number of measurements in each category
B. The relative frequency, or proportion, of measurements in each
category
C. The percentage of measurements in each category
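The three summaries above can be computed directly from a list of raw observations. The following is a minimal sketch in Python, not a prescribed method of this unit; the function name summarize_categories and the sample civil-status responses are hypothetical and used only for illustration.

```python
from collections import Counter

def summarize_categories(observations):
    """Return {category: (frequency, relative frequency, percentage)}."""
    counts = Counter(observations)      # frequency of each category
    total = sum(counts.values())        # total number of measurements
    summary = {}
    for category, frequency in counts.items():
        relative = frequency / total    # proportion of the total
        summary[category] = (frequency, relative, relative * 100)
    return summary

# Hypothetical responses, for illustration only (not data from this unit).
responses = ["single", "married", "single", "widow", "married",
             "single", "single", "married", "single", "widower"]

for category, (f, rel, pct) in summarize_categories(responses).items():
    print(f"{category:<8} {f:>3} {rel:>6.2f} {pct:>6.1f}%")
```

The relative frequencies always add up to 1 and the percentages to 100, which is a quick way to check a table built by hand.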
III. Presentation of Data
Once the measurements are summarized in a statistical table, you can use
either graphs or charts to display the distribution of the data.
A. Ways of Presenting Data
These are the different ways of presenting data.
1. Textual Form: Data and information are presented in paragraph and
narrative form.
2. Tabular Form: Quantitative data are summarized in rows and columns.
3. Graphical Form: Data are presented in charts, graphs, or pictures.
Textual Form
Have you seen data presented in textual form? Below is an example.
Study revealed that Mathematics teachers always used chalkboard (4.62) and
textbooks (4.37); and they sometimes used geometric figures (3.29), graphs
(3.16), graphing board (3.12), pictures (3.02), flash cards (3.01), and
whiteboard (3.00). The respondents seldom used geometry board (2.19),
advance organizers (2.12), and realia (2.12). The overall weighted mean of
2.93 indicates that the Mathematics teachers sometimes used the given
traditional instructional materials in teaching mathematical concepts.
Tabular Form
We can present data using a stem and leaf plot or a frequency distribution table.
Stem and Leaf Plot
In the stem and leaf plot, data are displayed using the actual numerical values
of each data point.
Steps in constructing a stem and leaf plot:
A. Divide each measurement into two parts: stem and leaf.
B. List the stems in a column, with a vertical line to their right.
C. For each measurement, record the leaf portion in the same row as its
corresponding stem.
D. Order the leaves from the lowest to highest in each stem.
Example:
Daily sales of reams of bond paper of MARS Paper Company for forty days:
34 40 31 33 20 25 51 62
45 30 38 45 61 42 30 28
35 31 28 42 39 40 52 43
36 46 48 51 52 47 42 39
40 31 29 33 47 36 45 21
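The data are presented below using a stem and leaf plot, first with the leaves
in the order recorded and then with the leaves ordered from lowest to highest.
Leaves as recorded        Leaves in order
2 | 058891                2 | 015889
3 | 41308051969136        3 | 00111334566899
4 | 05522036872075        4 | 00022235556778
5 | 1212                  5 | 1122
6 | 21                    6 | 12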
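The construction steps above can also be sketched in Python. This is only an illustrative sketch: the function name stem_and_leaf is hypothetical, and it assumes two-digit whole numbers so that the tens digit is the stem and the units digit is the leaf. The data are the forty daily sales figures listed above.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Map each stem (tens digit) to its leaves (units digits), sorted."""
    plot = defaultdict(list)
    for value in values:
        stem, leaf = divmod(value, 10)   # e.g. 34 -> stem 3, leaf 4
        plot[stem].append(leaf)
    # Order the stems, and order the leaves within each stem (step D).
    return {stem: sorted(leaves) for stem, leaves in sorted(plot.items())}

sales = [34, 40, 31, 33, 20, 25, 51, 62,
         45, 30, 38, 45, 61, 42, 30, 28,
         35, 31, 28, 42, 39, 40, 52, 43,
         36, 46, 48, 51, 52, 47, 42, 39,
         40, 31, 29, 33, 47, 36, 45, 21]

for stem, leaves in stem_and_leaf(sales).items():
    print(f"{stem} | {''.join(str(leaf) for leaf in leaves)}")
```

Counting the leaves across all stems should give back the number of observations (forty here), which is a useful check that no value was dropped.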
Frequency Distribution Table
The frequency distribution is an arrangement of numerical data according to
size or magnitude, with corresponding frequencies and class mark.
How can we present data using frequency distribution?
Constructing the Frequency Distribution Table
Refer to the guidelines below in constructing the table.
1. Construct the stem and leaf plot of the set of data.
2. Determine the range of the data (the difference between the highest
and lowest figure).
3. Divide the range by the number of classes to determine the class
interval. To determine the number of classes, we can use the formula:
k = 1 + 3.3 log n
Where:
k = number of classes
n = number of values
4. The result is rounded off to the nearest whole number.
5. Start the first class with the lowest observation or a multiple of the
class interval. This is the lower limit of the first class. The highest
observation is the upper limit of the last class.
6. Determine the other lower limits by adding the class interval until we
reach the computed number of classes (k).
7. Write the upper limit of each class by subtracting 1 from the lower limit
of the next class.
8. Count the number of values that fall under each class.
Example:
Construct a frequency distribution from the sales volume of 50 medical sales
representatives.
723 735 720 765 779 788 745 757 819 767
767 755 781 800 812 796 753 728 740 753
770 793 786 775 760 801 793 786 794 781
738 744 757 769 752 735 746 769 777 766
750 771 730 745 783 779 805 788 768 760
Solution:
1. Construct a stem and leaf plot.
72 038
73 0558
74 04556
75 0233577
76 005677899
77 015799
78 1136688
79 3346
80 015
81 29
2. Compute the range (R).
R = 819 − 720 = 99
3. Find the number of classes (k).
k = 1 + 3.3 log n
k = 1 + 3.3 log (50)
k = 6.6 or 7
4. Compute the class interval (i).
Class Interval = 99/7 = 14.14 or 14
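The same three computations can be checked with a short Python sketch. It assumes that "log" in the formula is the common (base-10) logarithm, which is consistent with the value 6.6 obtained above; the function names sturges_classes and class_interval are hypothetical.

```python
import math

def sturges_classes(n):
    """Number of classes, k = 1 + 3.3 log10(n), rounded to the nearest whole number."""
    return round(1 + 3.3 * math.log10(n))

def class_interval(lowest, highest, k):
    """Range divided by the number of classes, rounded to the nearest whole number."""
    return round((highest - lowest) / k)

n, lowest, highest = 50, 720, 819
k = sturges_classes(n)
i = class_interval(lowest, highest, k)
print(f"R = {highest - lowest}, k = {k}, i = {i}")   # R = 99, k = 7, i = 14
```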
After computing the required values, we can now construct the frequency
distribution table.
Amount of Sales   Boundaries       Number of Sales   Relative Frequency
(Classes)                          (Frequency)       (Percentage)
720 - 733         719.5 - 733.5    4                 8%
734 - 747         733.5 - 747.5    8                 16%
748 - 761         747.5 - 761.5    9                 18%
762 - 775         761.5 - 775.5    10                20%
776 - 789         775.5 - 789.5    10                20%
790 - 803         789.5 - 803.5    6                 12%
804 - 819         803.5 - 819.5    3                 6%
Note: The lower boundary of a class is 0.5 unit below the lower limit
(smallest possible observation) of the class, and the upper boundary is 0.5
unit above the upper limit (largest possible observation) of the class. The
data can be summarized in the table by recording the number (frequency) and
the percentage (relative frequency) of observations in each category or class.
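The tallying of the table can be sketched in Python as follows. This is an illustrative sketch under the choices made in the example (first lower limit 720, class interval 14, seven classes, and the last class extended to the highest observation, 819); the function name frequency_table is hypothetical.

```python
def frequency_table(values, lower_limits, last_upper_limit):
    """Tally values into classes defined by lower limits and the last upper limit."""
    # Each upper limit is 1 less than the lower limit of the next class.
    upper_limits = [low - 1 for low in lower_limits[1:]] + [last_upper_limit]
    rows = []
    for low, high in zip(lower_limits, upper_limits):
        frequency = sum(1 for v in values if low <= v <= high)
        percentage = 100 * frequency / len(values)
        # Boundaries extend 0.5 unit beyond the class limits.
        rows.append((low, high, low - 0.5, high + 0.5, frequency, percentage))
    return rows

sales = [723, 735, 720, 765, 779, 788, 745, 757, 819, 767,
         767, 755, 781, 800, 812, 796, 753, 728, 740, 753,
         770, 793, 786, 775, 760, 801, 793, 786, 794, 781,
         738, 744, 757, 769, 752, 735, 746, 769, 777, 766,
         750, 771, 730, 745, 783, 779, 805, 788, 768, 760]

# Lower limits start at the lowest observation and step by the class interval.
lower_limits = [720 + 14 * j for j in range(7)]
table = frequency_table(sales, lower_limits, last_upper_limit=819)
for low, high, lb, ub, f, pct in table:
    print(f"{low} - {high}   {lb} - {ub}   {f:>2}   {pct:.0f}%")
```

The frequencies should add up to the number of observations (50) and the percentages to 100%.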
Graphical Form
We can present data using charts and graphs. For instance, a pie chart displays
how the total quantity is distributed among the categories, while a bar chart
uses the height of each bar to display the amount in a particular category.
Example:
Four thousand new students were admitted at a university in Metro Manila for
the school year, 2011-2012. The students were enrolled in the following
programs:
Program                      Number of students
Accounting                   320
Actuarial Science            440
Banking and Finance          720
Entrepreneurial Management   1,080
Economics                    800
Marketing                    400
Tourism                      240
Total                        4,000
How do we present these data using a pie chart and a bar graph?
Below are the calculations for the construction of the pie chart.
Program                      Frequency   Relative   Percent   Angle
Accounting                   320         .08        8%        28.8°
Actuarial Science            440         .11        11%       39.6°
Banking and Finance          720         .18        18%       64.8°
Entrepreneurial Management   1,080       .27        27%       97.2°
Economics                    800         .20        20%       72.0°
Marketing                    400         .10        10%       36.0°
Tourism                      240         .06        6%        21.6°
Total                        4,000       1.00       100%      360°
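Each angle in the table is the relative frequency of the program multiplied by 360 degrees. The Python sketch below reproduces the Relative, Percent, and Angle columns; the function name pie_chart_rows is hypothetical, and the program names are read from the enrollment table above.

```python
def pie_chart_rows(frequencies):
    """Return {label: (relative frequency, percent, angle in degrees)}."""
    total = sum(frequencies.values())
    rows = {}
    for label, frequency in frequencies.items():
        relative = frequency / total
        rows[label] = (relative, relative * 100, relative * 360)
    return rows

enrollment = {
    "Accounting": 320,
    "Actuarial Science": 440,
    "Banking and Finance": 720,
    "Entrepreneurial Management": 1080,
    "Economics": 800,
    "Marketing": 400,
    "Tourism": 240,
}

for label, (rel, pct, angle) in pie_chart_rows(enrollment).items():
    print(f"{label:<27} {rel:.2f} {pct:>4.0f}% {angle:>6.1f}")
```

The angles should add up to 360 degrees, just as the percentages add up to 100%.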
From the given calculations, this is how to present the data using a pie chart.
[Figure: pie chart titled "Program Preference of the New Students"; legend: Acc, As, Bf, Em, Eco, M, T]
This can also be represented by a solid diagram:
[Figure: solid diagram]
This is how to present the data using a bar graph.
[Figure: bar graph]
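To see how bar lengths follow from the frequencies, a rough text-only bar chart can be produced with a few lines of Python. This is only a sketch and not a replacement for a properly drawn graph; the function name text_bar_chart and the scale of one "#" per 40 students are assumptions made for illustration.

```python
def text_bar_chart(frequencies, unit=40):
    """Return one line per category; each '#' stands for `unit` students."""
    width = max(len(label) for label in frequencies)
    lines = []
    for label, frequency in frequencies.items():
        bar = "#" * round(frequency / unit)   # bar length is proportional to frequency
        lines.append(f"{label:<{width}} | {bar} {frequency}")
    return lines

enrollment = {
    "Accounting": 320,
    "Actuarial Science": 440,
    "Banking and Finance": 720,
    "Entrepreneurial Management": 1080,
    "Economics": 800,
    "Marketing": 400,
    "Tourism": 240,
}

for line in text_bar_chart(enrollment):
    print(line)
```

The longest bar belongs to Entrepreneurial Management and the shortest to Tourism, matching the largest and smallest frequencies in the table.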
Application
Activity 1
Below is a summary of color preference of 400 randomly selected car buyers
in Cebu:
BLACK RED BLUE GRAY WHITE
320 180 195 155 250
A. Construct a relative frequency (percentage) distribution.
B. Construct a pie chart to describe the data.
C. Construct a bar chart to describe the data.
Activity 2
Three hundred eighty students are grouped into four categories: W, X, Y, and
Z. The number of female and male students who fall in each category is shown
in the table:
Category Female Male Total
W 24 20 44
X 36 45 81
Y 74 60 134
Z 66 55 121
Total 200 180 380
1. Construct a pie chart and a bar graph to describe the data on female students.
2. Construct a pie chart and a bar chart to describe the data on male students.
Activity 3
The table below shows the number of items bought daily at the MAI Computer
Shop. Construct both the pie chart and the bar chart to describe the data.
Items Number of pieces
LED monitor 5
Desktop computer 7
Laptop computer 6
Printer 4
Fax machine 3
Total 25