chapter2

Chapter Two discusses methods of data collection and presentation, highlighting the distinction between primary and secondary data. It outlines various techniques for data measurement, such as focus groups and surveys, and emphasizes the importance of data presentation through tabular and graphical methods. The chapter also covers frequency distributions, their types, and guidelines for constructing grouped frequency distributions.

Uploaded by

abay kassie
Copyright
© All Rights Reserved
Chapter Two

Methods of data collection and presentation


Sources and types of data
There are two types of data: primary and secondary.
1. Primary Data
Data measured or collected by the investigator or the user directly from the source.
Two activities are involved:
• Planning and
• Measuring.

03/25/2025
i. Planning
• Identify the source and elements of the data.
• Decide whether to take a sample or a census.
• If sampling is preferred, decide on the sample size, selection method, etc.
• Decide on the measurement procedure.
• Set up the necessary organizational structure.

ii. Measuring: there are several options.
• Focus group
• Telephone interview
• Mail questionnaire
• Door-to-door survey
• Mall intercept
• New product registration
• Personal interview
• Experiment
These are some of the methods for collecting primary data.
2. Secondary Data
Data gathered or compiled from published and unpublished sources or files.
Example: hospital records, vital statistics and registers, etc.
When the source is secondary data, check that:
• The type and objective of the original data collection are known.
• The purpose for which the data were collected is compatible with the present problem.
• The nature and classification of the data are appropriate to our problem.
• There are no biases or misreporting in the published data.

Note: Data that are primary for one user may be secondary for another.
METHODS OF DATA PRESENTATION

• The presentation of data is broadly classified into the following two categories:
  • Tabular presentation
  • Diagrammatic and graphic presentation
• The process of arranging data into classes or categories according to similarities is technically called classification.
• Classification is a preliminary step that prepares the ground for proper presentation of data.

Definitions:
• Raw data: recorded information in its original collected form, whether counts or measurements.
• Frequency: the number of values in a specific class of the distribution.
• Frequency distribution: the organization of raw data in table form using classes and frequencies.
Example: student age distribution

Why Use Frequency Distributions?
The reasons for constructing a frequency distribution are as follows:
• To organize the data in a meaningful, intelligible way.
• To enable the reader to determine the nature or shape of the distribution.
• To facilitate computational procedures for measures of average and spread.
• To enable the researcher to draw charts and graphs for the presentation of data.
• To enable the reader to make comparisons between different data sets.
Types of Frequency Distribution
1. Categorical
2. Ungrouped
3. Grouped
1. Categorical frequency distribution
• Used for data that can be placed in specific categories, such as nominal or ordinal data, e.g. marital status.
Example: A social worker collected the following data on marital status for 25 persons (M = married, S = single, W = widowed, D = divorced). Construct a categorical frequency distribution.

M S D W D
S S M M M
W D S M M
W D D S S
S W W D D

Solution:
Class (1)   Tally (2)   Frequency (3)   Percent (4)
M           ///// /     6               24
S           ///// //    7               28
D           ///// //    7               28
W           /////       5               20
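The tally can be checked programmatically. The following is a minimal Python sketch, assuming the 25 responses in the example are read row by row; the variable names are illustrative and not part of the original slides.

```python
from collections import Counter

# The 25 marital-status codes from the example, row by row.
responses = (
    "M S D W D "
    "S S M M M "
    "W D S M M "
    "W D D S S "
    "S W W D D"
).split()

counts = Counter(responses)
total = len(responses)

# One row per class: (frequency, percent of the total).
table = {cls: (counts[cls], 100 * counts[cls] / total) for cls in "MSDW"}

for cls, (freq, pct) in table.items():
    print(f"{cls}  {freq:2d}  {pct:.0f}%")
```

The percent column is simply the frequency divided by the total number of observations, multiplied by 100.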

2. Ungrouped frequency distribution
- A table of all the values that could occur in the data, along with their corresponding frequencies.
Example: Consider the ages of 20 students who read in the library last night:
30, 41, 39, 41, 32, 29, 35, 31, 30, 36, 33, 36, 32, 42, 30, 35, 37, 32, 30, and 41.
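As a rough illustration, the ungrouped distribution for these ages could be built as follows. This sketch assumes ages are recorded in whole years, so that "all potential values" means every integer between the smallest and largest age; that reading is an assumption, not something the slides state.

```python
from collections import Counter

ages = [30, 41, 39, 41, 32, 29, 35, 31, 30, 36,
        33, 36, 32, 42, 30, 35, 37, 32, 30, 41]

counts = Counter(ages)

# List every whole-year age from the minimum to the maximum,
# including ages that never occur (frequency 0).
ungrouped = [(age, counts[age]) for age in range(min(ages), max(ages) + 1)]

for age, freq in ungrouped:
    print(age, freq)
```

Each value gets its own row, which is practical here only because the range of ages is small.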

3. Grouped frequency distribution
When the range of the data set is large, the data must be grouped into classes that are more than one unit in width.

Example: Marks of students in a class

89, 17, 21, 100, 11, 3, 90, 45, 41, 67, 87, 34, 69, 3, 39, 63, 41, 57, 53, 12, 79, 91, 42, 100, 62, 73, 1, 38, 56, 45, 25, 24, 35, 17, 21, 24, 37, 26, 46, 58, 30, 32, 13, 12, 38, 41, 43, 44, 27, 53, 27 …
We need to summarize the data to make it meaningful.
Definitions:
Grouped frequency distribution: a frequency distribution in which several values are grouped into one class.
Class limits: the values that separate one class in a grouped frequency distribution from another. The limits can actually appear in the data, and there are gaps between the upper limit of one class and the lower limit of the next.
Unit of measurement (U): the distance between two possible consecutive measurements. It is usually taken as 1, 0.1, 0.01, 0.001, ...
Class boundaries: the values that separate one class from another without gaps. The boundaries have one more decimal place than the raw data and therefore do not appear in the data.
Class width: the difference between the upper and lower class boundaries of any class. It is also the difference between the lower limits of any two consecutive classes, or the difference between any two consecutive class marks.
Class mark (midpoint): the average of the lower and upper class limits, or the average of the lower and upper class boundaries.
Cumulative frequency: the number of observations less than or equal to, or greater than or equal to, a specific value.
• Cumulative frequency above: the total frequency of all values greater than or equal to the lower class boundary of a given class.
• Cumulative frequency below: the total frequency of all values less than or equal to the upper class boundary of a given class.
• Cumulative frequency distribution (CFD): the tabular arrangement of class intervals together with their corresponding cumulative frequencies. It can be of the "more than" or "less than" type, depending on the type of cumulative frequency used.
• Relative frequency (rf): the frequency divided by the total frequency.
• Relative cumulative frequency (rcf): the cumulative frequency divided by the total frequency.
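To make these definitions concrete, here is a small Python sketch. The class frequencies are illustrative values chosen for the sketch, and the variable names are not from the slides.

```python
from itertools import accumulate

# Illustrative class frequencies, listed from the lowest class to the highest.
freqs = [2, 4, 6, 5, 3]
n = sum(freqs)

# "Less than" type: running total starting from the first class.
less_than_cf = list(accumulate(freqs))
# "More than" type: running total starting from the last class.
more_than_cf = list(accumulate(reversed(freqs)))[::-1]

# Relative versions divide by the total frequency.
rel_freq = [f / n for f in freqs]
rel_cum_freq = [cf / n for cf in less_than_cf]
```

The last "less than" value and the first "more than" value both equal the total frequency, which is a quick check on the arithmetic.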


Guidelines for classes
1. There should be between 5 and 20 classes.
2. The classes must be mutually exclusive. This means that no data value can fall into two different classes.
3. The classes must be all-inclusive or exhaustive. This means that all data values must be included.
4. The classes must be continuous. There are no gaps in a frequency distribution.
5. The classes must be equal in width. The exception is the first or last class: it is possible to have a "below ..." or "... and above" class. This is often used with ages.


Steps for constructing a grouped frequency distribution
1. Find the largest and smallest values.
2. Compute the range: R = Maximum − Minimum.
3. Select the number of classes desired, usually between 5 and 20, or use Sturges' rule $k = 1 + 3.322 \log_{10} n$, where k is the number of classes and n is the total number of observations.
4. Find the class width: $w = R/k$, rounded up.
5. Pick a suitable starting point less than or equal to the minimum value. This is the lower limit of the first class; keep adding the class width to it to find the remaining lower limits.
6. To find the upper limit of the first class, subtract U from the lower limit of the second class. Then keep adding the class width to this upper limit to find the rest of the upper limits.
7. Find the boundaries by subtracting U/2 from the lower limits and adding U/2 to the upper limits.
8. Tally the data.
9. Find the frequencies.
10. Find the cumulative frequencies.
11. If necessary, find the relative frequencies and/or relative cumulative frequencies.
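Steps 3 and 4 can be sketched in Python as follows. The function names are hypothetical, the inputs (20 observations with a range of 33) are illustrative, and rounding k to the nearest whole number and w up to a whole unit are assumptions that suit whole-number data (U = 1).

```python
import math

def sturges_classes(n: int) -> float:
    """Sturges' rule: k = 1 + 3.322 * log10(n), left unrounded."""
    return 1 + 3.322 * math.log10(n)

def class_width(data_range: float, k: int) -> int:
    """Class width w = R / k, rounded up to a whole unit (assumes U = 1)."""
    return math.ceil(data_range / k)

k_raw = sturges_classes(20)   # about 5.32
k = round(k_raw)              # nearest whole number of classes
w = class_width(33, k)        # 33 / 5 = 6.6, rounded up
```

Rounding the width up rather than down ensures that k classes of width w cover the whole range of the data.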
Example 2.3

Construct a frequency distribution for the following data.

11 29 6 33 14 31 22 27 19 20
18 17 22 38 23 21 26 34 39 27

Solution:
Step 1: Arrange the data in ascending order.
6 11 14 17 18 19 20 21 22 22 23 26 27 27 29 31 33 34 38 39
Step 2: Find the range: $R = \text{Max} - \text{Min} = 39 - 6 = 33$.
Step 3: Select the number of classes desired using Sturges' formula: $k = 1 + 3.322 \log_{10} n = 1 + 3.322 \log_{10}(20) = 5.32 \approx 5$ (rounding down).
Step 4: Find the class width: $w = R/k = 33/5 = 6.6 \approx 7$ (rounding up).

Step 5: Find the lower and the upper class limits.
Select the starting point; let it be the smallest observation, 6. Adding the class width (7) repeatedly gives the lower class limits.
• 6, 13, 20, 27, 34 are the lower class limits.
Find the upper class limits; e.g. with U = 1, the first upper class limit is 13 − U = 12.
• 12, 19, 26, 33, 40 are the upper class limits.
So, combining these, one can construct the following classes.

Class limits
6 – 12
13 – 19
20 – 26
27 – 33
34 – 40
Step 6: Find the class boundaries; e.g. with U = 1, the first class has lower boundary 6 − U/2 = 5.5 and upper boundary 12 + U/2 = 12.5.
• Then continue adding the class width (7) to both boundaries to obtain the remaining boundaries. Doing so gives the following classes.
Class boundary
5.5 – 12.5
12.5 – 19.5
19.5 – 26.5
26.5 – 33.5
33.5 – 40.5

Step 7: Find the frequencies.
The complete frequency distribution is given as follows (Lcf = less-than cumulative frequency, Mcf = more-than cumulative frequency).

Class limit  Class boundary  Class mark  f  Lcf  Mcf  rf    %rf  %rcf
6 – 12       5.5 – 12.5      9           2  2    20   0.10  10%  10%
13 – 19      12.5 – 19.5     16          4  6    18   0.20  20%  30%
20 – 26      19.5 – 26.5     23          6  12   14   0.30  30%  60%
27 – 33      26.5 – 33.5     30          5  17   8    0.25  25%  85%
34 – 40      33.5 – 40.5     37          3  20   3    0.15  15%  100%
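The construction in Example 2.3 can be reproduced with a short Python sketch. It assumes U = 1 because the data are whole numbers, starts the first class at the smallest observation as the example does, and uses illustrative names throughout.

```python
import math

data = [11, 29, 6, 33, 14, 31, 22, 27, 19, 20,
        18, 17, 22, 38, 23, 21, 26, 34, 39, 27]
U = 1  # unit of measurement; the data are whole numbers

k = round(1 + 3.322 * math.log10(len(data)))   # Sturges' rule
w = math.ceil((max(data) - min(data)) / k)     # class width, rounded up

rows = []
lower = min(data)  # start at the smallest observation
for _ in range(k):
    upper = lower + w - U  # upper limit = next lower limit minus U
    rows.append({
        "limits": (lower, upper),
        "boundaries": (lower - U / 2, upper + U / 2),
        "mark": (lower + upper) / 2,
        "f": sum(lower <= x <= upper for x in data),
    })
    lower += w

for row in rows:
    print(row)
```

The frequencies sum to 20, the number of observations, which confirms that the classes are exhaustive and mutually exclusive for this data set.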

Diagrammatic and Graphic presentation of data
These are techniques for presenting data in visual displays using geometric figures and pictures.
Importance:
• They have greater attraction.
• They facilitate comparison.
• They are easily understandable.
Diagrams are appropriate for presenting discrete data.
The three most commonly used diagrammatic presentations for discrete as well as qualitative data are:
1. Pie charts
2. Pictograms
3. Bar charts
1. Pie charts

Class   Frequency  Percent  Degree
Men     2500       25       90
Women   2000       20       72
Girls   4000       40       144
Boys    1500       15       54

[Figure: pie chart titled "CLASS" with sectors for Boys, Men, Girls, and Women]
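The Percent and Degree columns follow from each class's share of the total frequency: the percent is the share out of 100 and the angle is the share out of 360 degrees. A minimal Python sketch, with illustrative variable names:

```python
frequencies = {"Men": 2500, "Women": 2000, "Girls": 4000, "Boys": 1500}
total = sum(frequencies.values())

# Each sector as (percent of the total, angle in degrees out of 360).
sectors = {
    name: (100 * f / total, 360 * f / total)
    for name, f in frequencies.items()
}

for name, (pct, deg) in sectors.items():
    print(f"{name:6s} {pct:5.1f}% {deg:6.1f} degrees")
```

The angles add up to 360, just as the percentages add up to 100.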

2. Pictogram
In this diagram, we represent data by means of picture symbols. We decide on a suitable picture to represent a definite number of units in which the variable is measured.
Example: Draw a pictogram to represent the following population of a town.

Year        1989  1990  1991  1992
Population  2000  3000  5000  7000
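A text-only version of such a pictogram could be sketched as follows. The scale of one symbol per 1000 people and the `*` placeholder symbol are assumptions made for this sketch; the slides do not specify either.

```python
population = {1989: 2000, 1990: 3000, 1991: 5000, 1992: 7000}
UNITS_PER_SYMBOL = 1000  # assumed scale: one symbol stands for 1000 people
SYMBOL = "*"             # placeholder for a picture symbol

# Number of whole symbols for each year.
pictogram = {year: SYMBOL * (count // UNITS_PER_SYMBOL)
             for year, count in population.items()}

for year, symbols in pictogram.items():
    print(year, symbols)
```

With this scale every population is a whole number of symbols; a different scale might require partial symbols.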

3. Bar charts
• A set of bars (thick lines or narrow rectangles) representing some magnitude over time or space.
• They are useful for comparing aggregates over time or space.
• Bars can be drawn either vertically or horizontally.
• There are different types of bar charts. The most common are:
a. Simple bar chart
b. Component or subdivided bar chart
c. Multiple bar chart

Product  Sales ($) in 1957  Sales ($) in 1958  Sales ($) in 1959
A        12                 14                 18
B        24                 21                 18
C        24                 35                 54
a. Simple bar chart
[Figure: simple bar chart, "Sales by product in 1957"; x-axis: product (A, B, C); y-axis: sales in $ (0 to 30)]
b. Component bar chart
[Figure: component bar chart, "Sales by product 1957-1959"; x-axis: year of production (1957, 1958, 1959); y-axis: sales in $ (0 to 100); each bar is subdivided into Product A, Product B, and Product C]
c. Multiple bar chart
[Figure: multiple bar chart, "Sales by product 1957-1959"; x-axis: year of production (1957, 1958, 1959); y-axis: sales in $ (0 to 60); separate bars for Product A, Product B, and Product C in each year]
Graphical Presentation of data
• The histogram, frequency polygon, and cumulative frequency graph (ogive) are the most commonly applied graphical representations for continuous data.
a. Histogram

Interval              Frequency
10 but less than 20   3
20 but less than 30   6
30 but less than 40   5
40 but less than 50   4
50 but less than 60   2

[Figure: histogram of the table above; x-axis: temperature in degrees (0 to 70); there are no gaps between bars]
b. Frequency Polygon
– A line graph.
– The frequency is placed along the vertical axis and the class midpoints are placed along the horizontal axis.
– It is customary to add the next lower and next higher class intervals, each with a frequency of zero, to make it a complete polygon.

Example: Draw a frequency polygon for the above example.
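The points of the frequency polygon can be computed before drawing. This Python sketch assumes "the above example" refers to the histogram table (classes 10 to 60 in steps of 10); it only computes the points and leaves the plotting itself to any charting tool.

```python
# Histogram classes as (lower, upper) with their frequencies.
classes = [(10, 20), (20, 30), (30, 40), (40, 50), (50, 60)]
freqs = [3, 6, 5, 4, 2]

width = classes[0][1] - classes[0][0]
midpoints = [(lo + hi) / 2 for lo, hi in classes]

# Close the polygon with a zero-frequency class on each side.
xs = [midpoints[0] - width] + midpoints + [midpoints[-1] + width]
ys = [0] + freqs + [0]
points = list(zip(xs, ys))

print(points)
```

Joining these points in order with straight lines gives the polygon, which touches the horizontal axis at both ends.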

c. Ogive (cumulative frequency polygon)
• A graph showing the cumulative frequency (less-than or more-than type) plotted against the upper or lower class boundaries, respectively.
• Class boundaries are placed along the horizontal axis and the corresponding cumulative frequencies along the vertical axis. The points are joined by a freehand curve.
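The points for both ogives can be derived from the class boundaries and frequencies. The values in this Python sketch are illustrative (five classes of width 10), and adding an end point with cumulative frequency 0 to each curve is a common convention assumed here, not something the slides state.

```python
from itertools import accumulate

# Illustrative class boundaries and frequencies for five classes of width 10.
boundaries = [10, 20, 30, 40, 50, 60]
freqs = [3, 6, 5, 4, 2]
n = sum(freqs)

# Less-than ogive: total frequency below each boundary (0 at the first one).
less_than = [0] + list(accumulate(freqs))
# More-than ogive: total frequency at or above each boundary.
more_than = [n - cf for cf in less_than]

less_than_points = list(zip(boundaries, less_than))
more_than_points = list(zip(boundaries, more_than))
```

The less-than curve rises from 0 to the total frequency while the more-than curve falls from the total to 0, so the two curves cross once.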
[Figure: less-than ogive and more-than ogive on the same axes; x-axis: class boundaries (5.5, 11.5, 17.5, 23.5, 29.5, 35.5, 41.5); vertical axis marked up to 20]