UNIT 2
DATA COLLECTION AND REPRESENTATION
DATA
• Datum: singular; Data: plural
• A dictionary defines data as facts or figures
from which conclusions can be drawn.
• A collection of observations recorded
numerically is called data.
PRIMARY DATA
• PRIMARY DATA: Primary data are original data
which are collected for the first time by the
investigator for the purpose of a specific
statistical investigation. In the words of Wessel,
“Data originally collected in the process of
investigations are known as primary data.”
• Primary data are also called first-hand data.
• E.g.: data gathered through personal interviews,
telephone interviews, questionnaires and surveys
SECONDARY DATA
• Secondary data are those data that have
already been collected by others.
• According to [Link], “Secondary data are
those which are already in existence, and
which have been collected for some other
purpose than the answering of the questions
in hand.”
• E.g.: external sources such as magazines,
newspapers, research articles, the Internet, etc.
• The merit of secondary data is that they are
cheaper and easier to acquire than primary
data.
• The major demerit of secondary data is that their
reliability, accuracy and integrity are
uncertain.
METHODS OF COLLECTING PRIMARY DATA
1. OBSERVATION METHOD – It is commonly used
in studies related to behavioral science.
2. INTERVIEW METHOD –
a. Structured Interview
b. Unstructured Interview
c. Personal Interview
d. Telephonic Interview
3. QUESTIONNAIRE: A set of printed questions
4. POLLS: A poll consists of a single question, either
single choice or multiple choice. Polls are short in length.
METHODS OF COLLECTING SECONDARY DATA
• Official publications of bodies such as the Ministry of Finance,
statistical departments of the government, federal
bureaus, agricultural statistical boards, etc. Semi-official
sources include the State Bank, Boards of Economic Enquiry,
etc.
• Data published by Chambers of Commerce and trade
associations and boards.
• Articles in the newspaper, from journals and technical
publications.
• Balance sheets, profit and loss statements and sales records of
organizations
PRESENTATION OF DATA
• Presentation of data means displaying
data in an attractive and useful
manner so that it can be easily interpreted.
• The three main forms of presentation of data
are:
1. Textual presentation
2. Data tables
3. Diagrammatic presentation
Textual Presentation
• This kind of presentation is useful when we
want to supplement qualitative statements
with some data. It suits cases where the data are
not voluminous enough to need tables or
diagrams.
• E.g.: “The 2002 earthquake proved to be a
mass murderer of humans. As many as 10,000
citizens have been reported dead.”
Tabular Presentation
• A table facilitates representation of large amounts of
data in an attractive, easy to read and organized
manner. The data is organized in rows and columns.
This is one of the most widely used forms of
presentation of data since data tables are easy to
construct and read.
• According to D. Gregory, “Tabulation is the process of
condensing classified data in the form of a table so
that it may be more easily understood, and so that any
comparisons involved may be more readily made.”
Objectives of Tabulation
• To Simplify Complex Data
• To Facilitate Comparison
• To Economise Space
• To Depict Trend
• To Detect Error and Omissions in the Data
• To Facilitate Statistical Processing
Components of a Data table
• Table Number: Each table should have a specific table
number for ease of access and locating.
• Title: A table must have a title that is clear,
concise and self-explanatory.
• Headnote: A headnote supplements the title and gives
more information about the table, such as the unit of measurement.
• Stubs: These are the titles of the rows in a table. A
stub thus describes the data contained in a
particular row.
• Caption: A caption is the title of a column in the data
table. It is the column counterpart of a stub.
• Body or field: The body of a table is the content of a
table. Each item in a body is known as a ‘cell’.
• Footnotes: Footnotes are placed at the bottom of
the table to point out any omission or special characteristic of the
table.
• Source: When using data obtained from a secondary
source, this source has to be mentioned below the
footnote.
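The components listed above can be pictured as fields of a small record. The following is a minimal Python sketch, not a standard layout: the class name `DataTable`, its field names and the `render` method are illustrative assumptions. The cell values are the Town B percentages from the coffee example in this unit; the footnote and source strings are placeholders written for this sketch.

```python
from dataclasses import dataclass


@dataclass
class DataTable:
    """Hypothetical container holding the components of a data table."""
    number: int          # table number
    title: str           # title
    headnote: str        # extra information, e.g. the unit
    captions: list       # column titles
    stubs: list          # row titles
    body: list           # one list of cells per stub
    footnote: str = ""   # remark placed below the body
    source: str = ""     # origin of secondary data

    def render(self) -> str:
        # Pad every column to the longest stub or caption plus two spaces.
        width = max(len(s) for s in self.stubs + self.captions) + 2
        lines = [f"Table {self.number}: {self.title} ({self.headnote})"]
        lines.append("".ljust(width) + "".join(c.ljust(width) for c in self.captions))
        for stub, row in zip(self.stubs, self.body):
            cells = "".join(str(cell).ljust(width) for cell in row)
            lines.append(stub.ljust(width) + cells)
        if self.footnote:
            lines.append(f"Footnote: {self.footnote}")
        if self.source:
            lines.append(f"Source: {self.source}")
        return "\n".join(lines)


table = DataTable(
    number=1,
    title="Coffee Drinking Habits of Town B",
    headnote="in percentage",
    captions=["Males", "Females", "Total"],
    stubs=["Coffee drinkers", "Non coffee drinkers", "Total"],
    body=[[25, 15, 40], [30, 30, 60], [55, 45, 100]],
    footnote="Placeholder footnote for illustration.",
    source="Placeholder source for illustration.",
)
rendered = table.render()
print(rendered)
```

Each cell of the body sits at the meeting point of one stub and one caption, which is why the two are described as counterparts.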
DRAFT
EXAMPLE
• E.g. 1: In a sample study about coffee habits in two
towns, the following information was received:
Town A: Females were 40%, total coffee drinkers
were 45% and male non coffee drinkers were
20%.
Town B: Males were 55%, male non coffee drinkers
were 30% and female coffee drinkers were 15%.
Represent this data in tabular form.
Table 1: Coffee Drinking Habits of Towns A and B (in percentage)
                       TOWN A                                 TOWN B
ATTRIBUTE              MALES       FEMALES     TOTAL          MALES       FEMALES     TOTAL
COFFEE DRINKERS        60-20=40    40-35=5     45             55-30=25    15          40
NON COFFEE DRINKERS    20          55-20=35    100-45=55      30          45-15=30    100-40=60
TOTAL                  100-40=60   40          100            55          45          100
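The subtractions used to fill the table can be checked with a short Python sketch. The function `complete_town` and its parameter names are illustrative assumptions, not part of the example; it only relies on the fact that each row and column of a percentage table must add up to its total.

```python
def complete_town(males, male_non_drinkers, total_drinkers=None, female_drinkers=None):
    """Fill the sex-by-habit percentage table for one town.

    Exactly one of total_drinkers or female_drinkers must be given.
    Each returned tuple is (males, females, total).
    """
    females = 100 - males
    male_drinkers = males - male_non_drinkers
    if female_drinkers is None:
        # Town A case: the total of coffee drinkers is known instead.
        female_drinkers = total_drinkers - male_drinkers
    female_non_drinkers = females - female_drinkers
    return {
        "drinkers": (male_drinkers, female_drinkers, male_drinkers + female_drinkers),
        "non_drinkers": (male_non_drinkers, female_non_drinkers,
                         male_non_drinkers + female_non_drinkers),
        "total": (males, females, 100),
    }


# Town A: females 40%, so males are 100 - 40 = 60%.
town_a = complete_town(males=100 - 40, male_non_drinkers=20, total_drinkers=45)
# Town B: males 55%, female coffee drinkers 15%.
town_b = complete_town(males=55, male_non_drinkers=30, female_drinkers=15)

for name, town in (("Town A", town_a), ("Town B", town_b)):
    print(name)
    for attribute, cells in town.items():
        print(f"  {attribute:<13}{cells}")
```

Every row in the output can be compared cell by cell with Table 1.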
• E.g. 2: In 2000, out of a total of 1750 workers of a
factory, 1200 were members of a trade union. The
number of women employees was 200, of whom 175
did not belong to a trade union. In 2002, the number of
union workers increased to 1580, of whom 1290
were men. On the other hand, the number of non union
workers fell to 208, of whom 180 were men.
In 2004, there were 1800 employees who belonged
to a trade union and 50 who did not belong to a
trade union. Of all employees in 2004, 300 were women, of
whom only 8 did not belong to a trade union.
Present the above data in a suitable tabular
form.
• SOLUTION
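One way to derive every cell of the required table from the figures given in the problem is sketched below in Python. The helper `row` and the dictionary layout are illustrative choices made for this sketch; only the numbers come from the problem statement.

```python
def row(men, women):
    """Return one table row with its total."""
    return {"men": men, "women": women, "total": men + women}


table = {
    2000: {
        # 200 women, 175 of them non-members, so 25 women are members.
        "members": row(men=1200 - (200 - 175), women=200 - 175),
        # 1750 - 1200 = 550 non-members, 175 of them women.
        "non_members": row(men=(1750 - 1200) - 175, women=175),
    },
    2002: {
        "members": row(men=1290, women=1580 - 1290),
        "non_members": row(men=180, women=208 - 180),
    },
    2004: {
        # 300 women, 8 of them non-members, so 292 women are members.
        "members": row(men=1800 - (300 - 8), women=300 - 8),
        "non_members": row(men=50 - 8, women=8),
    },
}

# Add the column totals for each year.
for groups in table.values():
    groups["total"] = row(
        men=groups["members"]["men"] + groups["non_members"]["men"],
        women=groups["members"]["women"] + groups["non_members"]["women"],
    )

print(f"{'YEAR':<6}{'GROUP':<13}{'MEN':>6}{'WOMEN':>7}{'TOTAL':>7}")
for year, groups in table.items():
    for group, cells in groups.items():
        print(f"{year:<6}{group:<13}{cells['men']:>6}{cells['women']:>7}{cells['total']:>7}")
```

The totals computed for each year (1750, 1788 and 1850 workers) agree with the figures in the problem, which serves as a check on the derived cells.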
DIAGRAMMATIC REPRESENTATION OF DATA
• Presenting data in the form of diagrams is
called diagrammatic presentation of data.
• It makes the presentation of data attractive
and simple.
TYPES OF GRAPHICAL REPRESENTATION OF
DATA
• Line Graph
• Bar Diagram
• Histogram
• Pie Diagram
• Frequency Polygon
• Frequency Curve
• Ogive Curves