STATISTICS
Data Collection and Summarization
TYPES OR METHODS OF COLLECTION OF
DATA
• There are two types of data -
• Primary data
• Secondary data
1. Primary data:
Data which are collected for the first time, for a specific
purpose are known as primary data.
2. Secondary data:
Data which have already been used in an investigation and were originally
collected by someone else are known as secondary
data.
EXAMPLE
• Let us take an example which can be easily understood
by everyone. In our country, a count of the population is
carried out every ten years, commonly known
as the census. For this purpose data are collected by the
Government of Pakistan. The data collected are known
as primary data. These data contain information
about the age, education, income etc. of persons.
Suppose some other investigators use these data for a
separate but related purpose; the data will then be known as
secondary data to them.
DIFFERENCE BETWEEN PRIMARY
AND SECONDARY DATA
PRIMARY DATA                              SECONDARY DATA

The data are collected for the first      The data have already been collected
time and are original in character.       earlier by some other persons for the
                                          same or a different purpose.

The data are in the form of raw           The data are in the form of a finished
material to which statistical methods     product, as statistical methods have
are applied for the purpose of            already been applied to them.
analysis.

The data are collected directly from      The data are collected from published
the people to whom the enquiry            and unpublished sources.
relates.

Data are primary to the institution       The same data are secondary for all
collecting them.                          others.
METHODS OF COLLECTING PRIMARY
DATA
1. Direct Personal Observation. In this method, the
investigator collects the data personally. The investigator has to go to
the spot to conduct the enquiry and has to meet the
persons concerned. This method is applicable when the
field of enquiry is small.
2. Indirect Oral Investigation. In this method data are
collected through indirect sources. Persons having
some knowledge of the enquiry are cross-examined so
that the desired information is collected. The evidence of
one person should not be relied upon; a number of views
should be taken to find out the real position.
METHODS OF COLLECTING PRIMARY
DATA
3. Schedules and Questionnaires. A list of questions
regarding the enquiry is prepared and printed. Data are
collected in either of the following ways: (a) by sending
the questionnaires to the informants concerned with a
request to answer the questions and return the
questionnaire; (b) by sending the questionnaires
through enumerators who help the informants.
4. Local Reports. This method does not involve formal
collection of data. Local agents or correspondents
are simply requested to supply the estimates required. This
method gives only approximate results, though at a
low cost.
SOURCES OF SECONDARY DATA
• Official publications of the central and provincial
governments and District Boards.
• Reports of committees and commissions.
• Publications of research institutions, universities and others.
• Economic and commercial journals.
• Publications of trade associations, chambers of
commerce, etc.
• Market reports and individual research work of statisticians.
• Secondary data are also available from the unpublished
records of government offices, chambers of commerce,
labor bureaus, etc.
Organizing and Presenting Data
• Data collected or compiled by the previously discussed
methods are usually in an unorganized, crude form known
as raw data.
• In this form the data are not fit for statistical analysis.
• To analyze and interpret data, we need to properly
arrange and organize them.
• Three techniques can be used for this purpose:
i. Classification
ii. Tabulation
iii. Graphic display
CLASSIFICATION
• It is the process of arranging data into different classes or
groups according to resemblances and similarities.
• Ideally, classification should be unambiguous, stable and
flexible.
“The process of arranging things in groups or classes
according to their resemblance and affinities”.
• Benefits of classification
• It clearly shows points of similarity and dissimilarity.
• It prepares the ground for comparisons and analysis by orderly
arrangement of data.
TYPES OF CLASSIFICATION
There are four types of classification depending upon the
nature of data:
i. Geographical, i.e., area-wise or region-wise.
ii. Chronological, i.e., with respect to the time of occurrence.
iii. Qualitative, e.g., males, females, literate, illiterate, etc.
iv. Quantitative, e.g., ages of persons vary and so do their
heights and weights.
TYPES OF CLASSIFICATION
Geographical classification:
This type of classification is based on geographical
or locational differences between various items in the data
like states, cities, regions, zones etc.
Chronological classification:
In this type of classification data are classified on the
basis of differences in time, e.g., sales of a firm in different
periods, population of Pakistan in various decades etc.
TYPES OF CLASSIFICATION
Classification according to attributes:
• Simple classification – when only one attribute is
present, e.g., classification of persons according to gender,
i.e., male or female.
• Manifold classification – when more than one
attribute is present simultaneously, e.g., classification of
persons by deafness and gender. The data are thus
divided into four classes:
(a) males who are deaf.
(b) males who are not deaf.
(c) females who are deaf.
(d) females who are not deaf.
TYPES OF CLASSIFICATION
The classification can be continued further by introducing another
attribute, say religion.
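
As a rough illustration of manifold classification, the four deafness-by-gender classes can be generated as every combination of the two attributes. The sketch below is in Python, and the six records in `persons` are made up purely for the example; they are not data from these notes.

```python
from collections import Counter
from itertools import product

# Hypothetical records (assumption): one (gender, is_deaf) pair per person.
persons = [
    ("male", True),
    ("male", False),
    ("female", False),
    ("female", True),
    ("male", False),
    ("female", False),
]

# Each combination of the two attributes defines one class: 2 x 2 = 4 classes.
classes = list(product(["male", "female"], [True, False]))

# Count how many persons fall in each class.
counts = Counter(persons)
for gender, deaf in classes:
    label = f"{gender}s who are {'deaf' if deaf else 'not deaf'}"
    print(label, counts[(gender, deaf)])
```

A further attribute with k categories would multiply the number of classes by k, which is why manifold tables grow quickly.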
Classification according to class-intervals:
This type arises when direct measurement of data is
possible.
Data relating to height, weight, production, etc., come under
this category.
For instance, persons weighing, say, 100-110 lbs. can
form one group, those weighing 110-120 lbs. another group, and so on.
In this way data are divided into different classes; each
such range is known as a Class Interval.
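
A minimal sketch of how a measurement might be assigned to its class interval. Since intervals written as 100-110 and 110-120 share the value 110, the sketch assumes that the lower limit is included and the upper limit excluded; the function name `class_interval` and its defaults are illustrative choices, not part of these notes.

```python
def class_interval(value, lowest=100, width=10):
    """Return the (lower, upper) limits of the class containing `value`.

    Assumption: the lower limit belongs to the class, the upper limit does not,
    so 110 falls in 110-120 rather than 100-110.
    """
    if value < lowest:
        raise ValueError("value lies below the first class interval")
    k = int((value - lowest) // width)  # index of the class, starting at 0
    lower = lowest + k * width
    return (lower, lower + width)


for w in [104, 110, 119.5]:
    lower, upper = class_interval(w)
    print(f"{w} lbs. falls in the class {lower}-{upper} lbs.")
```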
TYPES OF CLASSIFICATION
The number of items which fall in any class interval is known
as the Class Frequency.
In the class intervals mentioned above, the first figure in
each of them is the Lower Limit and the second figure is the Upper Limit.
The difference between the limits of a class interval is
known as the Magnitude of the Class Interval.
If the frequency given for each class interval is the
aggregate of its own and the preceding frequencies, such frequencies are known
as Cumulative Frequencies. The frequencies may be
cumulated from the top or from below.
• Class intervals should be of equal magnitude; otherwise they
would give a misleading impression.
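
The terms above can be sketched in a few lines of Python. The ten weights are hypothetical, and the classes are assumed to include their lower limit and exclude their upper limit.

```python
from itertools import accumulate

# Hypothetical weights in lbs. (assumption, for illustration only).
weights = [102, 108, 111, 115, 119, 121, 127, 133, 104, 118]

lowest, width, n_classes = 100, 10, 4   # classes 100-110, ..., 130-140
lower_limits = [lowest + i * width for i in range(n_classes)]
magnitude = width                       # upper limit minus lower limit

# Class frequency: number of items falling in each class interval.
frequencies = [
    sum(1 for w in weights if lo <= w < lo + width) for lo in lower_limits
]

# Cumulative frequencies, cumulated from the top and from below.
cum_from_top = list(accumulate(frequencies))
cum_from_below = list(accumulate(reversed(frequencies)))[::-1]

for lo, f, ct, cb in zip(lower_limits, frequencies, cum_from_top, cum_from_below):
    print(f"{lo}-{lo + width}: frequency {f}, from top {ct}, from below {cb}")
```

With these figures the last cumulative frequency from the top and the first from below both equal the total number of items, which is a quick check on the arithmetic.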
TABULATION
• Tabulation is a systematic and scientific presentation of
data in a suitable form for analysis and interpretation.
• It is the last stage in compilation and collection of data
and is a stepping stone to the analysis and interpretation.
“The intermediate process between the accumulation
of the data in whatever form they were obtained in
and final reasoned account of the results shown by
the statistics.”
TABULATION
A table broadly consists of five parts.
i. Number and title indicating the serial number of the
table and the subject matter of the table.
ii. Stub, i.e., the column containing the headings of the rows.
iii. Caption, i.e., the headings of the columns.
iv. Body, i.e., the figures entered in the table.
v. Foot-note, i.e., the source from which the data have been
obtained.
TABULATION
Thus a table should be arranged as follows:
[Figure: layout of a table, with the table number and title at the top]
TYPES OF TABULATION
• There are mainly two types of table: simple and complex.
• A simple table gives information regarding one or
more groups of independent questions, while a complex
table gives information about one or more interrelated
questions.
• A one-way table is one that answers one or more
independent questions. The following is an example of a
simple table to explain the point:
TABLE 1
Daily wages in Rs. obtained by 50 workers in a
factory.
Wages (Rs)    No. of workers
4-6           20
6-8           9
8-10          10
10-12         7
12-14         4
Total         50
The table shows the number of workers belonging to each
class-interval of wages. We can now easily say that there are
20 workers obtaining wages between 4 and 6 (the lowest
class) and 4 workers obtaining wages between 12
and 14 (the highest class). So the table reveals information
regarding only one characteristic of the data, i.e., wages of workers.
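
As a small sketch, Table 1 can be held as a list of class intervals with their frequencies, and the total row recomputed as a check. The variable names are illustrative; the figures are those of the table.

```python
# Table 1: (lower limit, upper limit) of each wage class and its frequency.
table1 = [
    ((4, 6), 20),
    ((6, 8), 9),
    ((8, 10), 10),
    ((10, 12), 7),
    ((12, 14), 4),
]

# The total row is the sum of the class frequencies.
total = sum(workers for _, workers in table1)

print(f"{'Wages (Rs)':<12}{'No. of workers':>15}")
for (lower, upper), workers in table1:
    label = f"{lower}-{upper}"
    print(f"{label:<12}{workers:>15}")
print(f"{'Total':<12}{total:>15}")
```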
TABLE 2
Daily wages in Rs. obtained by 50 workers, gender-wise.
Wages (Rs)    No. of workers
              Male    Female    Total
4-6           12      8         20
6-8           6       3         9
8-10          6       4         10
10-12         4       3         7
12-14         4       0         4
Total         32      18        50
The above table shows the wages obtained
by the workers and the gender-wise distribution of
the workers in question. This is called a two-way
table.
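
A two-way table carries totals in both directions. The sketch below recomputes the row and column totals of Table 2 from the male and female counts; the variable names are illustrative and the figures are taken from the table.

```python
wage_classes = ["4-6", "6-8", "8-10", "10-12", "12-14"]
male = [12, 6, 6, 4, 4]
female = [8, 3, 4, 3, 0]

# Row totals: workers in each wage class, both genders together.
row_totals = [m + f for m, f in zip(male, female)]

# Column totals: all male and all female workers, and the grand total.
male_total, female_total = sum(male), sum(female)
grand_total = sum(row_totals)

for wage, m, f, t in zip(wage_classes, male, female, row_totals):
    print(f"{wage:<8}{m:>6}{f:>8}{t:>7}")
print(f"{'Total':<8}{male_total:>6}{female_total:>8}{grand_total:>7}")
```

The row totals reproduce the single column of Table 1, which shows how the two-way table refines the one-way table by a second characteristic.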