
Lecture 2 - 1

Uploaded by

tomshave28
Copyright © All Rights Reserved

LECTURE 2

Graphical Descriptive Techniques
Tallying Qualitative Data

Qualitative Data

• One variable: frequency distribution
• Two variables: cross-tabulation / contingency table
Tallying Qualitative Data

• A frequency distribution for qualitative data lists the frequency, amount, or percentage of items in each of a set of categories so that you can see differences between categories.

• A contingency table (cross-tabulation) lists the frequency, amount, or percentage of each combination of the values of two or more qualitative variables. It is used to study patterns and relationships that may exist between the responses of these variables.
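
As a rough illustration of the two tallies described above, here is a minimal Python sketch. The survey responses, the variable names, and the category labels are all hypothetical and are not taken from the lecture; only the standard library is used.

```python
from collections import Counter

# Hypothetical survey responses: (gender, preferred payment method).
# These values are made up purely for illustration.
responses = [
    ("Female", "Cash"), ("Male", "Card"), ("Female", "Card"),
    ("Male", "Cash"), ("Female", "Card"), ("Male", "Card"),
]

# One variable: frequency distribution of payment method.
payment_counts = Counter(payment for _, payment in responses)
n = len(responses)
payment_percent = {cat: 100 * c / n for cat, c in payment_counts.items()}

# Two variables: contingency table keyed by (gender, payment) combination.
cross_tab = Counter(responses)

for cat, count in sorted(payment_counts.items()):
    print(f"{cat:<6} {count:>3} {payment_percent[cat]:6.1f}%")
for (gender, payment), count in sorted(cross_tab.items()):
    print(f"{gender:<6} {payment:<6} {count}")
```

The first loop prints one row per category (a frequency distribution); the second prints one row per combination of the two variables (a contingency table in long form).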
Visualizing Qualitative Data

Qualitative Data

• One variable: bar chart and pie chart
• Two variables: clustered bar chart and stacked bar chart
Visualizing Qualitative Data

• A bar chart is often used to display frequencies.

• A pie chart is often used to show relative frequencies, in percentages.

• Clustered bar charts and stacked bar charts are the graphical presentations of a contingency table.

• These charts are also frequently used to present numerical amounts, rather than counts, associated with categories.
Tabulating the Values of a Quantitative Variable

• Data in raw form are usually not easy to use for decision making.

• The frequency distribution for quantitative data is a table that divides the data values into classes and shows the number of observed values (the frequency) that fall into each class.

• In general, a frequency distribution should have at least 5 but no more than 15 classes.
Guidelines for Constructing a Frequency Distribution

• Mutually exclusive: classes do not overlap, so that a data value can be placed in only one class.

• Exhaustive: the set of classes contains all the possible data values.

• If possible, classes should have equal width, class widths should be round numbers (e.g. 5, 10, 25, 50, 100), and open-ended classes (e.g. less than 10, 50 and above) should be avoided.
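
One way to see what "mutually exclusive" and "exhaustive" mean in practice is to assign values to classes with half-open intervals. The sketch below assumes classes of width 10 running from 10 to 100 (the layout used for the age example in this lecture); the function name `class_index` is a hypothetical helper, not something from the source.

```python
from bisect import bisect_right

# Class boundaries 10, 20, ..., 100 define nine classes:
# 10 - < 20, 20 - < 30, ..., 90 - < 100.
edges = list(range(10, 101, 10))

def class_index(value, edges):
    """Return the 0-based index of the single class that contains value.

    Each class is the half-open interval [edges[i], edges[i + 1]), so the
    classes cannot overlap (mutually exclusive). A value outside the
    overall range is rejected, which flags classes that are not exhaustive.
    """
    if not edges[0] <= value < edges[-1]:
        raise ValueError(f"{value} is not covered by any class")
    return bisect_right(edges, value) - 1

print(class_index(16, edges))  # first class, 10 - < 20
print(class_index(20, edges))  # boundary value goes to 20 - < 30 only
print(class_index(99, edges))  # last class, 90 - < 100
```

Because a boundary value such as 20 belongs to exactly one class, no observation can be counted twice.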
Guidelines for Constructing a Frequency Distribution

1) Determine the number of classes to use

Alternatively, we could use Sturges’ formula:

Number of classes, k = 1 + 3.3 log10(n)

where n is the number of observations and log10 is the base-10 logarithm.
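
As a quick check of the formula, the following sketch evaluates it for n = 200, the sample size of the age example used in this lecture. The function name is an assumption made for illustration, and rounding to the nearest whole number is one reasonable choice, not a rule stated in the slides.

```python
import math

def sturges_classes(n):
    """Return 1 + 3.3 * log10(n), the suggested number of classes."""
    return 1 + 3.3 * math.log10(n)

k_raw = sturges_classes(200)
k = round(k_raw)  # rounding to the nearest integer is an assumption here
print(f"{k_raw:.2f} -> {k} classes")
```

For n = 200 the formula gives about 8.59, which suggests roughly 9 classes, within the 5 to 15 guideline.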
Guidelines for Constructing a Frequency Distribution

2) Determine the class width

class width = (largest obs. − smallest obs.) / k

where k is the number of classes
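
The calculation can be sketched with the smallest and largest ages (16 and 99) from the bridge-player sample in Example 3.1 and k = 9 classes. Picking the smallest "round" width that is at least the raw width is an assumption about how to apply the round-number guideline; the candidate widths are the ones listed in the guidelines.

```python
smallest, largest = 16, 99   # extremes of the age data in Example 3.1
k = 9                        # number of classes

raw_width = (largest - smallest) / k

# Candidate round widths from the guidelines; choose the first one
# that is wide enough to cover the raw width (an assumed convention).
round_widths = (5, 10, 25, 50, 100)
class_width = next(w for w in round_widths if w >= raw_width)

print(f"raw width = {raw_width:.2f}, chosen class width = {class_width}")
```

A raw width of about 9.22 rounds up to 10, which matches the classes 10 - < 20, 20 - < 30, and so on.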

Refer to Example 3.1 (Keller)

To help determine whether efforts should be made to encourage younger bridge players to join ACBL, a random sample of 200 ACBL members was drawn, with each reporting his or her age.
Ordered Data
16 27 33 39 51 55 61 65 72 79
16 27 33 40 51 55 61 66 72 80
17 28 33 40 52 55 62 66 73 81
18 28 33 41 52 56 62 66 73 82
19 28 33 42 52 56 62 67 73 82
20 28 35 42 52 57 62 67 73 82
22 28 35 44 52 57 63 68 74 82
22 30 35 44 52 57 63 68 74 82
23 30 35 45 53 58 63 68 74 83
24 30 36 45 53 58 63 69 74 83
24 30 36 46 53 59 63 69 75 83
24 30 36 46 53 60 64 69 75 85
25 30 36 48 53 60 64 69 75 90
25 31 37 49 54 60 64 70 76 90
25 31 37 49 54 60 65 70 76 91
25 31 37 49 54 60 65 71 76 93
26 31 38 49 54 60 65 71 76 94
26 32 38 50 54 60 65 71 77 95
27 32 38 50 54 60 65 71 78 96
27 32 38 51 54 61 65 72 78 99
Frequency, Relative Frequency, Cumulative Frequency & Cumulative Relative Frequency

Ages          Freq.   RF (%)    CF   CRF (%)
10 - < 20         5      2.5     5       2.5
20 - < 30        22     11.0    27      13.5
30 - < 40        34     17.0    61      30.5
40 - < 50        16      8.0    77      38.5
50 - < 60        34     17.0   111      55.5
60 - < 70        42     21.0   153      76.5
70 - < 80        28     14.0   181      90.5
80 - < 90        11      5.5   192      96.0
90 - < 100        8      4.0   200     100.0
Total           200    100.0
Types of Frequencies

• Relative frequency lists the percentage of data values that fall within each class. It is useful when comparing two or more groups with different sample sizes.

• Cumulative frequency lists the number of observations with values up to the upper class boundary of each class, that is, the total frequency of that class and all lower classes.

• Cumulative relative frequency lists the percentage of data values up to the upper class boundary of each class.
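
The three derived columns can be recomputed from the class frequencies of the age data. This is a minimal sketch using only the standard library; the variable names are chosen for illustration.

```python
from itertools import accumulate

# Class frequencies for the age data (classes 10 - < 20 up to 90 - < 100).
freq = [5, 22, 34, 16, 34, 42, 28, 11, 8]
n = sum(freq)

rel_freq = [100 * f / n for f in freq]          # relative frequency, %
cum_freq = list(accumulate(freq))               # cumulative frequency
cum_rel_freq = [100 * c / n for c in cum_freq]  # cumulative relative freq., %

lower = 10
for f, rf, cf, crf in zip(freq, rel_freq, cum_freq, cum_rel_freq):
    print(f"{lower:>2} - < {lower + 10:<3} {f:>4} {rf:>6.1f} {cf:>4} {crf:>6.1f}")
    lower += 10
```

The last cumulative frequency equals the sample size, and the last cumulative relative frequency equals 100%, which is a useful consistency check on any such table.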
Visualizing Quantitative Data

Quantitative Data

• One variable: histogram
• Two variables: scatter diagram
• Time-series variable: line chart
Histogram

• The histogram describes a frequency distribution by using a series of adjacent rectangles, each of which has a height proportional to either the frequency or the relative frequency of the class it represents.

• The purpose of drawing a histogram is to determine the shape of the data distribution.
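
To make the idea of bars proportional to frequency concrete, here is a small text-only sketch that draws the age-data frequencies as rows of `#` characters. It is a rotated stand-in for a real histogram (bars run sideways rather than upward), and the function name is a hypothetical helper.

```python
# Upper class limits and frequencies for the age data.
upper_limits = [20, 30, 40, 50, 60, 70, 80, 90, 100]
freq = [5, 22, 34, 16, 34, 42, 28, 11, 8]

def text_histogram(limits, counts):
    """Return one line per class; bar length equals the class frequency."""
    lines = []
    for upper, count in zip(limits, counts):
        lines.append(f"< {upper:>3} | {'#' * count} {count}")
    return lines

rows = text_histogram(upper_limits, freq)
print("\n".join(rows))
```

Reading down the rows shows where the ages concentrate, which is the kind of shape information a histogram is meant to reveal.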
Shapes of Histograms

• Symmetric
• Symmetric and bell-shaped
• Positively skewed (skewed to the right)
• Negatively skewed (skewed to the left)
Excel Output: Histogram for the Age Data

[Figure: histogram of the age data. Vertical axis: Frequency (0 to 45); horizontal axis: Ages (class upper limits 20 to 100).]
Scatter Diagram

• A scatter diagram is used for numerical data consisting of paired observations taken from two numerical variables.

• One variable is measured on the vertical axis and the other variable is measured on the horizontal axis.

• A scatter diagram is used to examine the type (linear or nonlinear) and the direction (positive or negative) of the relationship between two numerical variables.
Type and Direction of Relationship

[Figure: four scatter diagram panels showing a positive linear relationship, a negative linear relationship, no relationship, and a nonlinear relationship.]

Example of Scatter Diagram
Line Chart

• Time-series data are graphed on a line chart, or time-series plot, which plots the value of the variable on the vertical axis against the time periods on the horizontal axis.

• A line chart is used to describe the trend of the time-series variable over time.

• Types of trend: increasing (upward), decreasing (downward), constant, fluctuating.
Example of Line Chart (Non-zero Y-axis)

[Figure: line chart of the FTSE Bursa Malaysia KLCI Index, 20/5/2010 – 31/5/2018. Vertical axis ranges from 1,200 to 2,000; horizontal axis: DATE.]
Sources of Data
• Primary sources
  - Data collected first-hand by the users of the data themselves, that is, gathered by the researcher
  - Real-time data
  - Data from surveys and observation
• Secondary sources
  - Data that have already been collected by someone else for another purpose but have some relevance to your current research needs
  - Past data
  - Data from publications (e.g. economic reports), books, and journals
Types Of Data
• Time-series data
  - Data for one entity/country over many periods of time, e.g. years or months
  - Example: data on the Gross Domestic Product of Malaysia, 2000-2010

• Cross-sectional data
  - Data for many entities/countries in one period
  - Example: data on the Gross Domestic Product of developed economies in 2005
Types Of Data

• Pooled data
  - Data for many entities/countries over many periods
  - Example: data on GDP for developed countries, 2005-2010
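
The distinction between the three data types comes down to how many entities and how many time periods a data set covers. The sketch below encodes that rule; the function name, the (entity, period) representation, and the placeholder country labels are all assumptions made for illustration, and no actual GDP figures are involved.

```python
def classify_dataset(observations):
    """Classify a list of (entity, period) pairs by its structure."""
    entities = {entity for entity, _ in observations}
    periods = {period for _, period in observations}
    if len(entities) == 1 and len(periods) > 1:
        return "time series"
    if len(entities) > 1 and len(periods) == 1:
        return "cross-sectional"
    if len(entities) > 1 and len(periods) > 1:
        return "pooled"
    return "undetermined"

# Placeholder labels standing in for a group of developed countries.
countries = ["Country A", "Country B", "Country C"]

malaysia_2000_2010 = [("Malaysia", year) for year in range(2000, 2011)]
developed_2005 = [(c, 2005) for c in countries]
developed_2005_2010 = [(c, y) for c in countries for y in range(2005, 2011)]

print(classify_dataset(malaysia_2000_2010))   # one entity, many periods
print(classify_dataset(developed_2005))       # many entities, one period
print(classify_dataset(developed_2005_2010))  # many entities, many periods
```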
