Dr. Kelly Page
Cardiff Business School
E: [email protected]
T: @drkellypage
T: @caseinsights
FB: [email protected]
Analytical Design - Quant
Week 6 (2)
(cc) Kelly Page
Lecture Objectives
Get an overview of the data analysis procedure;
Develop an understanding of the importance and nature of quality control checks;
Understand the data entry process and data entry alternatives;
Learn how data are tabulated;
Learn how to set up and interpret cross tabulations;
Comprehend the basic techniques of statistical analysis.
Data Analysis Overview
Validation & Editing -> Coding -> Data Entry -> Machine Cleaning of Data -> Tabulation & Statistical Analysis
Data Analysis - Overview
Step One:
Validation: Confirming that the interviews/surveys actually took place.
Editing: Checking that the questionnaires were completed correctly.
Step Two:
Coding: Grouping and assigning numeric codes to the question responses.
Step Three:
Data Entry: The process of converting data to an electronic form.
Scanning devices can be used to enter data, e.g., scanning the questionnaire into a database (such as with bubble sheets).
Step Four:
Clean the Data: Check for data entry errors or data entry inconsistencies.
Machine cleaning: a computerized check of the data.
Step Five:
Data tabulation and statistical analysis.
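As a rough illustration of Steps Two and Four, the sketch below codes text responses into numbers and then runs a simple machine-cleaning check for out-of-range codes. The codebook, the function names and the sample answers are all hypothetical and are not taken from the lecture.

```python
# Hypothetical codebook: numeric codes assigned to question responses (Step Two).
CODEBOOK = {"Yes": 1, "No": 2, "Don't know": 9}
VALID_CODES = set(CODEBOOK.values())


def code_response(answer):
    """Return the numeric code for a response, or None if it is not in the codebook."""
    return CODEBOOK.get(answer.strip())


def machine_clean(records):
    """Return (row number, value) pairs whose code is outside the valid set (Step Four)."""
    return [(row, value) for row, value in enumerate(records, start=1)
            if value not in VALID_CODES]


# Invented raw answers; note the stray trailing space in the second one.
raw_answers = ["Yes", "No ", "Don't know", "Yes"]
coded = [code_response(a) for a in raw_answers]

# Simulate a keying error during data entry: 3 is not a valid code.
entered = coded + [3]
errors = machine_clean(entered)

print(coded)
print(errors)
```

The check only flags impossible codes. It cannot detect a valid code that was keyed against the wrong respondent, which is why validation and editing come first.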
Variables & Values
Variables are measurable factors, attributes, properties or characteristics of an item, individual or system (what we measure).
Values are the results of measuring or observing a variable: the data scores (how we measure it, i.e., the response format).
Examples
Account Balance: a variable describing how much money you have in the bank.
203.45: the value of your Account Balance at 11:32 a.m. on Friday 13th February.
Broadcast Media: a variable that denotes the type of media channel that someone owns.
Sony TV: the specific value, according to which set they own.
Customer Satisfaction (CS): a variable that denotes how satisfied a customer was with their experience of X.
Positive: the value of customer satisfaction could be positive or negative, or high or low, depending on how we measured it.
Considerations
Research objectives
Type of data (e.g., Nominal, Ordinal, Interval, Ratio)
Sample size (e.g., min=100)
Sampling method (e.g., non-probability)
1. I want to describe the data!
Frequency Analysis: Univariate (one variable)
Count
Percentages
Missing
Cross-tabulations: Bivariate (two variables)
2 x count
2 x percentage
Descriptive Statistics: Univariate (one variable)
Mean, median, mode, kurtosis, standard deviation, skewness, and variance.
Graphical Presentation
How to visually display the descriptive profile of the data
a). Frequency Analysis
One Way Frequency Tables/Graphs
A table showing the number (n) or percentage (%) of respondents choosing each answer to a survey question.
[Figure: one-way frequency chart for "Did You Like the Movie?" (answers: No, Yes); chart values not recoverable]
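A one-way frequency table of this kind can be sketched in a few lines. This is a minimal illustration, not the tool used in the lecture. The responses are invented, and `frequency_table` is a hypothetical helper name.

```python
from collections import Counter


def frequency_table(responses):
    """Return ({answer: (count, percent of valid responses)}, missing count).

    Missing answers are represented as None and excluded from the percentages.
    """
    valid = [r for r in responses if r is not None]
    counts = Counter(valid)
    table = {}
    for answer, n in counts.items():
        table[answer] = (n, round(100 * n / len(valid), 1))
    missing = len(responses) - len(valid)
    return table, missing


# Invented answers to "Did You Like the Movie?"; None marks a missing response.
responses = ["Yes", "No", "Yes", None, "Yes", "No", "Yes", "Yes"]
table, missing = frequency_table(responses)

for answer, (n, pct) in sorted(table.items()):
    print(f"{answer:<5} n={n}  {pct}%")
print(f"Missing: {missing}")
```

Percentages here are based on valid responses only. Reporting them against all respondents, including missing ones, is an equally common choice and should be stated alongside the table.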
b). Cross Tabulations
Examination of the responses to one question relative to the responses to one or more other questions in a survey set.
Bi-variate cross-tabulation:
Cross-tabulation of two items: Business Category and Gender.
Multi-variate cross-tabulation:
Additional filtering criterion (Veteran Status), so three items are now being filtered.

Multi-variate example
Filters: Race/Ethnicity = (All); Are You a Veteran? = Yes; You Liked the Chamber's Services = (All)
Count of Respondent, Business Category by Gender
Business Category       Female   Male   Grand Total
Computers/Technology         1      3             4
Construction                 -      1             1
Manufacturing                5      -             5
Other                        3      2             5
Professional                 -      1             1
Grand Total                  9      7            16

Bi-variate example
Filters: Are You a Veteran? = (All); You Liked the Chamber's Services = (All); Race/Ethnicity = (All)
Count of Respondent, Business Category by Gender
Business Category       Female   Male   Grand Total
Computers/Technology         5      7            12
Construction                 2      4             6
General Services             -      1             1
Manufacturing               13      6            19
No Response                  1      4             5
Other                       15     11            26
Professional                 1      3             4
Retail                       4      4             8
Wholesale                    1      1             2
#N/A                         -      1             1
Grand Total                 42     42            84
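The cross-tabulation logic (count respondents by two items, then optionally filter on a third) can be sketched as follows. The respondent records, the field names and the `crosstab` helper are hypothetical and exist only to illustrate the idea.

```python
from collections import Counter


def crosstab(rows, row_key, col_key):
    """Count records by two items and add 'Grand Total' margins."""
    counts = Counter((r[row_key], r[col_key]) for r in rows)
    row_labels = sorted({r[row_key] for r in rows})
    col_labels = sorted({r[col_key] for r in rows})
    table = {}
    for rl in row_labels:
        cells = {cl: counts[(rl, cl)] for cl in col_labels}
        cells["Grand Total"] = sum(cells.values())
        table[rl] = cells
    totals = {cl: sum(table[rl][cl] for rl in row_labels) for cl in col_labels}
    totals["Grand Total"] = sum(totals.values())
    table["Grand Total"] = totals
    return table


# Invented respondents, not the survey data from the slides.
respondents = [
    {"category": "Construction", "gender": "Male", "veteran": True},
    {"category": "Other", "gender": "Female", "veteran": True},
    {"category": "Other", "gender": "Male", "veteran": False},
    {"category": "Other", "gender": "Female", "veteran": True},
    {"category": "Construction", "gender": "Male", "veteran": False},
]

# Bi-variate: Business Category by Gender.
bivariate = crosstab(respondents, "category", "gender")

# Multi-variate: the same table, filtered on a third item (veteran status).
veterans_only = crosstab([r for r in respondents if r["veteran"]],
                         "category", "gender")

print(bivariate)
print(veterans_only)
```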
c). Descriptive Statistics
Effective means of summarizing large sets of data. Key measures include: mean, median, mode, kurtosis, standard deviation, skewness, and variance.
Significant discrepancies between the Mean and Median should cause you to look further into the data.
Measures of Central Tendency!
Mean
Median
Mode
Measures of Dispersion!
Variance
Range
Standard Deviation
Skewness

Years in Business
Statistic              Value
Mean                    22.4
Standard Error           2.6
Median                  15.0
Mode                     5.0
Standard Deviation      23.1
Sample Variance        534.5
Kurtosis                 3.8
Skewness                 2.1
Range                   98.0
Minimum                  2.0
Maximum                100.0
Sum                   1770.5
Count                   79.0
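The measures listed above can be computed with Python's standard `statistics` module, as in the sketch below. The sample values are invented and do not reproduce the "Years in Business" figures. The skewness line uses the adjusted Fisher-Pearson sample formula, which is an assumption about which definition is wanted.

```python
import math
import statistics

# Hypothetical "years in business" values, invented for illustration only.
years = [2, 5, 5, 10, 15, 20, 100]

n = len(years)
mean = statistics.mean(years)
median = statistics.median(years)
mode = statistics.mode(years)
variance = statistics.variance(years)   # sample variance (n - 1 denominator)
std_dev = statistics.stdev(years)       # sample standard deviation
std_error = std_dev / math.sqrt(n)      # standard error of the mean
value_range = max(years) - min(years)

# Sample skewness (adjusted Fisher-Pearson form); positive means a long right tail.
skewness = n / ((n - 1) * (n - 2)) * sum(((x - mean) / std_dev) ** 3 for x in years)

print(f"mean={mean:.1f} median={median} mode={mode}")
print(f"variance={variance:.1f} std_dev={std_dev:.1f} std_error={std_error:.1f}")
print(f"range={value_range} skewness={skewness:.2f}")
```

In this invented sample a single large value pulls the mean well above the median, which is the kind of discrepancy that should prompt a closer look at the data.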
d). Graphical Representation
Line, Pie, and Bar Charts
Line Charts: Good for demonstrating linear relationships.
Pie Charts: Good for showing part-to-whole relationships among data points.
Bar Charts: Good for side-by-side relationships / comparisons.
[Figure: line, pie, and bar charts of "Did You Like the Movie?" (Yes / No) by Female, Male, and Grand Total; chart values not recoverable]
2. I want to test the differences between groups of people or things!
T-test
Differences between two groups (e.g., males and females)
Measure of difference: T-statistic
ANOVA
Differences between two or more groups (e.g., age groups)
Measure of difference: F-statistic
Need to look at the means or run an additional statistic to identify where the differences are!
Measures of Difference
T-statistic (t-test)
F-statistic (ANOVA)
Significance of Difference (p < 0.01)
E.g., T-test
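As a minimal sketch of how the T-statistic for two independent groups is obtained, the code below implements the pooled-variance (Student's) form, which assumes both groups have similar variances. The scores and the function name `pooled_t_test` are hypothetical. Turning the statistic into a p-value needs the t distribution, for example from a statistics package or a printed table, and is left out here.

```python
import math
import statistics


def pooled_t_test(group_a, group_b):
    """Return (t statistic, degrees of freedom) for two independent groups.

    Uses the pooled-variance form, i.e. assumes equal variances in both groups.
    """
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_error, df


# Invented satisfaction scores for two groups.
males = [4, 5, 6, 5]
females = [7, 8, 6, 7]

t_stat, df = pooled_t_test(males, females)
print(f"t = {t_stat:.3f}, df = {df}")
```

The sign of t only reflects which group was listed first. The size of |t| relative to the critical value for the given degrees of freedom determines significance.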
E.g., ANOVA
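The F-statistic in a one-way ANOVA is the ratio of between-group to within-group variability. The sketch below computes it directly from that definition. The three groups are invented and `one_way_anova` is a hypothetical helper name. As with the t-test, the p-value requires the F distribution and is not computed here.

```python
import statistics


def one_way_anova(groups):
    """Return (F statistic, between-groups df, within-groups df)."""
    all_values = [v for g in groups for v in g]
    grand_mean = statistics.mean(all_values)
    # Variability of the group means around the grand mean.
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Variability of each observation around its own group mean.
    ss_within = sum((v - statistics.mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Invented scores for three age groups.
age_groups = [[4, 5, 6], [6, 7, 8], [9, 10, 11]]

f_stat, df_between, df_within = one_way_anova(age_groups)
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
```

A large F only says that at least one group mean differs. Comparing the means or running a follow-up test is still needed to find where the difference lies.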
3. I want to test if a relationship exists between 2 or more variables
Correlation
2 x variables
If interval or ratio data = Pearson
If ordinal data = Spearman
Simple Regression Analysis
1 x independent variable
1 x dependent variable
Multiple Regression Analysis
Multiple x independent variables
1 x dependent variable
Measures of Association
Correlation coefficient (r)
Coefficient of determination (r²)
Strength of association (0-1)
Direction of association (+/-)
Significance of association (p < 0.01)
E.g., Correlation
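The two correlation coefficients named above can be sketched without any external library. Pearson's r is computed from deviations around the means, and Spearman's coefficient is computed here as Pearson's r on the ranks, with tied values sharing their average rank. The data and helper names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient for interval or ratio data."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1 upward; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation for ordinal data (Pearson's r on the ranks)."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented paired observations.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson(x, y)
rho = spearman(x, y)
print(f"Pearson r = {r:.4f}, Spearman rho = {rho:.4f}")
```

The sign of each coefficient gives the direction of the association and its absolute value (0 to 1) gives the strength.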
E.g., Simple Regression
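A simple regression with one independent and one dependent variable can be fitted by ordinary least squares, as in the sketch below. The data are invented and `simple_regression` is a hypothetical helper. It returns the slope, the intercept and r², the share of variance in the dependent variable explained by the independent variable.

```python
def simple_regression(x, y):
    """Fit y = intercept + slope * x by ordinary least squares.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Invented data: x is the independent variable, y the dependent variable.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

slope, intercept, r_squared = simple_regression(x, y)
print(f"y = {intercept:.2f} + {slope:.2f} * x, r squared = {r_squared:.2f}")
```

Multiple regression extends the same idea to several independent variables and is usually run in a statistics package rather than by hand.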
[Figure: scatterplots of Y against X illustrating the following patterns]
No Apparent Relationship Between X and Y
Perfect Negative Relationship Between X and Y
Perfect Positive Relationship Between X and Y
Parabolic Relationship Between X and Y
General Negative Relationship Between X and Y
General Positive Relationship Between X and Y
Negative Curvilinear Relationship Between X and Y
4. I want to group people OR objects
Cluster Analysis
Group people or objects based on differences between groups and similarities within groups (segmentation)
Factor Analysis
Group items into the most important dimensions related to a criterion (e.g., 11 items = 2 dimensions of satisfaction)
Perceptual Mapping
Visual representation of perceptions by groups (brand associations)
Conjoint Analysis
Value of people's rankings of important product attributes (consumer choice > price, quality, location)
Cluster Analysis
The general term for statistical procedures that classify objects or people into
some number of mutually exclusive and exhaustive groups on the basis of two
or more classification variables.
Cluster 1: Men
Cluster 2: Women
Cluster 3: People with Green Cars
E.g., Cluster Analysis
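The lecture does not name a specific clustering algorithm. As one common illustration, the sketch below uses k-means, which assigns each object to its nearest cluster centre and then recomputes the centres until the assignments stop changing. The points, the starting centres and the `kmeans` helper are all hypothetical. Fixed starting centres are used so the result is reproducible.

```python
import math


def kmeans(points, centroids, max_iter=100):
    """Assign each point to its nearest centroid, then update the centroids.

    Returns (labels, centroids). Each point gets exactly one label, so the
    resulting groups are mutually exclusive and exhaustive.
    """
    centroids = list(centroids)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the clusters have converged
        labels = new_labels
        for c in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centroids


# Invented respondents described by two classification variables.
points = [(1, 1), (1.5, 2), (1, 1.5), (8, 8), (9, 9), (8, 9.5)]

# Hypothetical starting centres: one point from each apparent group.
labels, centroids = kmeans(points, [points[0], points[3]])
print(labels)
print(centroids)
```

In practice the number of clusters and the starting centres are choices the analyst has to justify, and different choices can produce different segments.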
Factor Analysis
Procedure for grouping and simplifying data by reducing a large set of values/items to a smaller set of factors (dimensions of a variable) identified in the data.
Factor loadings: Correlation between factor scores and the original variables.
E.g., Cinema Attribute Importance (1)
E.g., Cinema Attribute Importance (2)
Perceptual Mapping
Procedure of producing visual representations of consumer perceptions of
products, brands, companies, or other objects / issues.
[Figure: perceptual map with axes Inexpensive to Expensive and Poorly Designed to Well Designed; plotted labels: Men, Women, Setting Markers]
Conjoint Analysis
Procedure used to quantify the value consumers associate with different levels of product/service attributes or features.
Analytical Design Key Points!
What type of data do you have?
Ratio, Interval, Ordinal, Nominal = Statistical power
What do you want to find out?
Describe how data is distributed
Group differences between two or more groups
Relationships between two or more variables
Who falls into which grouping
Customer preference criterion
Other Considerations:
How much missing data is there?
How big is the sample size?
How was the data collected: random or non-random?
The content of this work is of shared interest between the author, Kelly Page, and other parties who have contributed and/or provided support for the generation of the content detailed within.
This work is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 2.0 UK: England & Wales licence.
http://creativecommons.org/