Educational Research
Chapter 7
Correlational Research
Gay, Mills, and Airasian
Topics to Be Discussed
- Definition, purpose, and limitation of correlational research
- Correlation coefficients and their significance
- Process of conducting correlational research
- Relationship studies
- Prediction studies
Correlational Research
- Definition
  - Whether and to what degree variables are related
- Purpose
  - Determine relationships
  - Make predictions
- Limitation
  - Cannot indicate cause and effect
Objectives 1.1, 1.2, & 1.3
The Process
- Problem selection
  - Variables to be correlated are selected on the basis of some rationale
    - Math attitudes and math achievement
    - Teachers' sense of efficacy and their effectiveness
  - Increases the ability to meaningfully interpret results
  - Inefficiency and difficulty interpreting the results from a shotgun approach
Objective 2.1
The Process
- Participant and instrument selection
  - Minimum of 30 subjects
  - Instruments must be valid and reliable
    - Higher validity and reliability require smaller samples
    - Lower validity and reliability require larger samples
- Design and procedures
  - Collect data on two or more variables for each subject
- Data analysis
  - Compute the appropriate correlation coefficient
Objectives 2.2 & 2.3
Correlation Coefficients
- A correlation coefficient identifies the size and direction of a relationship
  - Size/magnitude
    - Ranges from 0.00 to 1.00
  - Direction
    - Positive or negative
Objectives 3.1, 3.2, & 3.3
Correlation Coefficients
- Interpreting the size of correlations
  - General rule
    - Less than .35 is a low correlation
    - Between .36 and .65 is a moderate correlation
    - Above .66 is a high correlation
  - Predictions
    - Between .60 and .70 are adequate for group predictions
    - Above .80 is adequate for individual predictions
Objective 3.5
Correlation Coefficients
- Interpreting the size of correlations (cont.)
  - Criterion-related validity
    - Above .60 for affective scales is adequate
    - Above .80 for tests is minimally acceptable
  - Inter-rater reliability
    - Above .90 is very good
    - Between .80 and .89 is acceptable
    - Between .70 and .79 is minimally acceptable
    - Below .70 is problematic
Objective 3.5
Correlation Coefficients
- Interpreting the direction of correlations
  - Positive
    - High scores on the predictor are associated with high scores on the criterion
    - Low scores on the predictor are associated with low scores on the criterion
  - Negative
    - High scores on the predictor are associated with low scores on the criterion
    - Low scores on the predictor are associated with high scores on the criterion
  - Positive or negative does not mean good or bad
Objective 3.3
Correlation Coefficients
- Interpreting the size and direction of correlations using the general rule
  - +.95 is a strong positive correlation
  - +.50 is a moderate positive correlation
  - +.20 is a low positive correlation
  - -.26 is a low negative correlation
  - -.49 is a moderate negative correlation
  - -.95 is a strong negative correlation
- Which of the correlations above is the strongest, the first or last?
Objective 3.3 & 3.5
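The general rule can be sketched as a small Python function. This is only an illustration: the function name is hypothetical, and because the rule leaves small gaps between its bands (for example between .35 and .36), the cut points at .35 and .65 are an assumption. The labels follow the general rule's wording, so the deck's "strong" appears as "high".

```python
def describe_correlation(r: float) -> str:
    """Label a correlation by size and direction using the general rule."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1 and +1")
    if r == 0:
        return "no relationship"
    size = abs(r)  # size ignores the sign
    if size <= 0.35:      # assumed cut point for "low"
        label = "low"
    elif size <= 0.65:    # assumed cut point for "moderate"
        label = "moderate"
    else:
        label = "high"
    direction = "positive" if r > 0 else "negative"
    return f"{label} {direction}"


for r in (0.95, 0.50, 0.20, -0.26, -0.49, -0.95):
    print(f"{r:+.2f}: {describe_correlation(r)}")
```

Because the size is taken from the absolute value, +.95 and -.95 receive the same size label and differ only in direction.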
Correlation Coefficients
- Scatterplots
  - Graphical presentations of correlations
  - Example of predicting from an attitude scale (EX 1) to an achievement test (EX 2)
    - Predictor variable (EX 1) is on the horizontal axis
    - Criterion variable (EX 2) is on the vertical axis
Objective 3.4
An Example of a Scatterplot
[Figure: scatterplot of ex2 (vertical axis) against ex1 (horizontal axis) with a fitted linear regression line, ex2 = 11.23 + 0.72 * ex1, R-Square = 0.66]
Objective 3.4
Correlation Coefficients
- Common variance
  - Definition
    - The extent to which variables vary in a systematic manner
    - Interpreted as the percentage of variance in the criterion variable explained by the predictor variable
  - Computation
    - The squared correlation coefficient, r^2
  - Examples
    - If r = .50 then r^2 = .25
      - 25% of the variance in the criterion can be explained by the predictor
    - If r = .70 then r^2 = .49
      - 49% of the variance in the criterion can be explained by the predictor
Objectives 3.6 & 3.7
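A minimal Python sketch of the common-variance computation, using the two example coefficients from the slide. The function name is hypothetical.

```python
def common_variance(r: float) -> float:
    """Proportion of criterion variance explained by the predictor (r squared)."""
    return r ** 2


for r in (0.50, 0.70):
    shared = common_variance(r)
    print(f"r = {r:.2f} -> r^2 = {shared:.2f} ({shared:.0%} of the variance)")
```

Squaring removes the sign, so a correlation of -.70 shares the same 49% of variance as +.70.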
Statistical Significance
- Statistical significance
  - Is the observed coefficient different from 0.00?
    - Does the correlation represent a true relationship?
    - Is the correlation only the result of chance?
- Determining statistical significance
  - Consult a table of the critical values of r
    - See Table A.2 in Appendix A
  - Three common levels of significance
    - .01 (1 chance out of 100)
    - .05 (5 chances out of 100)
    - .10 (10 chances out of 100)
Objectives 4.1 & 4.3
Statistical Significance
- Sample size and statistical significance
  - Small samples require higher correlations for significance
  - Large samples require lower correlations for significance
- Practical significance and statistical significance
  - Small correlation coefficients can be statistically significant even though they have little practical significance
    - +.20
      - Statistically significant at the .05 level if the sample is about 100
      - Little or no practical significance because it is very low and explains only 4% (r^2 = .04) of the variation in the criterion scores
    - -.30
      - Statistically significant at the .05 level if the sample is about 40
      - Little or no practical significance because it is low and explains only 9% (r^2 = .09) of the variation in the criterion scores
Objectives 4.2 & 4.4
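The link between sample size and significance can be illustrated in Python. This sketch does not use the table of critical values of r; it swaps in the Fisher z normal approximation, which is an assumption chosen so that only the standard library is needed. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def approx_p_value(r: float, n: int) -> float:
    """Approximate two-tailed p-value for H0: no correlation (Fisher z method)."""
    if n <= 3 or not -1.0 < r < 1.0:
        raise ValueError("need n > 3 and -1 < r < 1")
    z = math.atanh(r) * math.sqrt(n - 3)  # Fisher transformation, standardized
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


# The same coefficient, +.20, tested with a small and a large sample.
for n in (30, 100):
    print(f"r = +.20, n = {n}: p = {approx_p_value(0.20, n):.3f}")
```

With 100 subjects the approximate p-value falls below .05, while with 30 subjects the same coefficient does not reach significance.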
Relationship Studies
- General purpose
  - Gain insight into variables that are related to other variables relevant to educators
    - Achievement
    - Self-esteem
    - Self-concept
- Two specific purposes
  - Suggest subsequent interest in establishing cause and effect between variables found to be related
  - Control for variables related to the dependent variable in experimental studies
Objectives 5.1 & 5.2
Conducting Relationship Studies
- Identify a set of variables
  - Limit to those variables logically related to the criterion
  - Avoid the shotgun approach
    - Possibility of erroneous relationships
    - Issues related to determining statistical significance
- Identify a population and select a sample
- Identify appropriate instruments for measuring each variable
- Collect data for each instrument from each subject
- Compute the appropriate correlation coefficient
Objective 6.1
Types of Correlation Coefficients
- The type of correlation coefficient depends on the measurement level of the variables
  - Pearson r: continuous predictor and criterion variables
    - Math attitude and math achievement
  - Spearman rho: ranked or ordinal predictor and criterion variables
    - Rank in class and rank on a final exam
  - Phi coefficient: dichotomous predictor and criterion variables
    - Gender and pass/fail status on a high-stakes test
  - See Table 7.2
Objectives 7.1, 7.2, & 7.3
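As one illustration of a coefficient for ranked data, the Python sketch below computes Spearman rho with the rank-difference formula, rho = 1 - 6 * sum(d^2) / (n(n^2 - 1)). That formula is exact only when there are no tied ranks. The function name and the five pairs of ranks are hypothetical.

```python
def spearman_rho(ranks_x, ranks_y):
    """Spearman rho from two lists of untied ranks (rank-difference formula)."""
    n = len(ranks_x)
    if n != len(ranks_y) or n < 2:
        raise ValueError("need two rank lists of equal length (at least 2)")
    d_squared = sum((x - y) ** 2 for x, y in zip(ranks_x, ranks_y))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Hypothetical data: rank in class and rank on a final exam for five students.
rank_in_class = [1, 2, 3, 4, 5]
rank_on_exam = [2, 1, 3, 5, 4]
print(round(spearman_rho(rank_in_class, rank_on_exam), 2))  # 0.8
```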
Linear and Curvilinear Relationships
- Linear relationships
  - Plots of the scores on two variables are best described by a straight line
    - Math scores and science scores
    - Teacher efficacy and teacher effectiveness
- Curvilinear relationships
  - Plots of scores on two variables are best described by a curved function
    - Age and athletic ability
    - Anxiety and achievement
  - Estimated by the eta correlation
Objectives 8.1, 8.2, & 8.3
An Example of a Linear Relationship
[Figure: scatterplot of fp (vertical axis) against ex1 (horizontal axis) with a fitted linear regression line, fp = 0.39 + 0.01 * ex1, R-Square = 0.80]
Objective 8.4
An Example of a Curvilinear Relationship
[Figure: scatterplot of score (vertical axis) against study (horizontal axis) with an LLR smoother]
Objective 8.4
Factors that Influence Correlations
- Sample size
  - The larger the sample, the more stable the correlation coefficient and the higher the likelihood that a given correlation is statistically significant
- Analysis of subgroups
  - If the total sample consists of males and females, each gender represents a subgroup
  - Results can differ across subgroups, and those differences can be obscured when the data are analyzed only for the total sample
  - Reduces the size of the sample
  - Potentially reduces variation in the scores
Objective 9.1
Factors that Influence Correlations
- Variation
  - The greater the variation in scores, the higher the likelihood of a strong correlation
  - The lower the variation in scores, the higher the likelihood of a weak correlation
- Attenuation
  - Correlation coefficients are lower when the instruments being used have low reliability
  - A correction for attenuation is available
Objectives 9.2 & 9.3
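The slide does not give the correction for attenuation. One widely used form, assumed here, divides the observed correlation by the square root of the product of the two instruments' reliabilities. The Python sketch below uses hypothetical values and a hypothetical function name.

```python
import math


def correct_for_attenuation(r_xy: float, rel_x: float, rel_y: float) -> float:
    """Estimate the correlation that would be observed with perfectly reliable measures."""
    if not (0 < rel_x <= 1 and 0 < rel_y <= 1):
        raise ValueError("reliabilities must be greater than 0 and at most 1")
    return r_xy / math.sqrt(rel_x * rel_y)


# Hypothetical values: observed r = .42, reliabilities of .70 and .80.
corrected = correct_for_attenuation(0.42, 0.70, 0.80)
print(f"observed r = .42, corrected r = {corrected:.2f}")
```

The corrected value is never smaller in size than the observed one, which matches the point that low reliability lowers correlation coefficients.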
Prediction Studies
- Attempts to describe the predictive relationships between or among variables
  - The predictor variable is the variable from which the researcher is predicting
  - The criterion variable is the variable to which the researcher is predicting
Objectives 10.1 & 10.2
Prediction Studies
- Three purposes
  - Facilitates decisions about individuals to help a selection decision
  - Tests variables believed to be good predictors of a criterion
  - Determines the predictive validity of an instrument
Objective 11.1
Prediction Studies
- Single and multiple predictors
  - Linear regression: one predictor and one criterion
    - Y = a + bX
    - r^2
  - Multiple regression: more than one predictor and one criterion
    - Y = a + b1X1 + b2X2 + ... + biXi
    - R^2, or the coefficient of determination
Objective 11.4
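For the single-predictor case, Y = a + bX, the least-squares values of a and b and the resulting r^2 can be sketched in Python. The data and the function name are hypothetical and serve only to show the arithmetic.

```python
def fit_line(xs, ys):
    """Return intercept a, slope b, and r squared for the line Y = a + bX."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    b = sxy / sxx              # slope
    a = mean_y - b * mean_x    # intercept
    r_squared = sxy ** 2 / (sxx * syy)
    return a, b, r_squared


# Hypothetical predictor (X) and criterion (Y) scores for five subjects.
a, b, r_squared = fit_line([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(f"Y = {a:.2f} + {b:.2f}X, r^2 = {r_squared:.2f}")
```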
Conducting a Prediction Study
- Identify a set of variables
  - Limit to those variables logically related to the criterion
- Identify a population and select a sample
- Identify appropriate instruments for measuring each variable
  - Ensure appropriate levels of validity and reliability
- Collect data for each instrument from each subject
  - Typically data are collected at different points in time
- Compute the results
  - The multiple regression coefficient
  - The multiple regression equation (i.e., the prediction equation)
Conducting a Prediction Study
- Issues of concern
  - Shrinkage: the tendency of a prediction equation to become less accurate when used with a group other than the one on which the equation was originally developed
  - Cross-validation: validation of a prediction equation with another group of subjects to identify problematic variables
Objective 11.3
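Shrinkage and cross-validation can be illustrated with a small Python sketch: a prediction equation is fitted on one group and then applied unchanged to a second group. Both groups are hypothetical data and the function names are illustrative only.

```python
def fit_line(xs, ys):
    """Least-squares intercept a and slope b for Y = a + bX."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    return mean_y - b * mean_x, b


def variance_explained(a, b, xs, ys):
    """1 - SSE/SST when the equation Y = a + bX is applied to (xs, ys)."""
    mean_y = sum(ys) / len(ys)
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    sst = sum((y - mean_y) ** 2 for y in ys)
    return 1 - sse / sst


group_a = ([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])  # group used to develop the equation
group_b = ([1, 2, 3, 4, 5], [3, 3, 4, 6, 4])  # new group used for cross-validation

a, b = fit_line(*group_a)
fit_a = variance_explained(a, b, *group_a)
fit_b = variance_explained(a, b, *group_b)
print(f"development group: {fit_a:.2f}, cross-validation group: {fit_b:.2f}")
```

In this made-up case the equation explains less variance in the second group than in the group it was developed on, which is the pattern the term shrinkage describes.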
Conducting a Prediction Study
- Issues of concern (cont.)
  - Errors of measurement (e.g., low validity or reliability) diminish the accuracy of the prediction
  - Intervening variables can influence the predictive process if there is too much time between collecting the predictor and criterion variables
  - Criterion variables defined in general terms (e.g., teacher effectiveness, success in school) tend to have lower prediction accuracy than those defined very narrowly (e.g., overall GPA, test scores)
Objective 11.5
Differences between Types of Studies
- Correlational research is a general category that is usually discussed in terms of two variables
- Relationship studies develop insight into the relationships between several variables
  - The measurement of all variables occurs at about the same time
- Predictive studies involve the predictive relationships between or among variables
  - The predictor variables are collected long before the criterion variable
Objectives 11.2 & 11.3
Other Correlation Analyses
- Path analysis
  - Investigates the patterns of relationships among a number of variables
  - Results in a diagram that indicates the specific manner by which variables are related (i.e., paths) and the strength of those relationships
  - An extension of this analysis is structural equation modeling (SEM)
    - Clarifies the direct and indirect relationships among variables based on underlying theoretical constructs
    - More precise than path analysis
    - Often known as LISREL, for the first computer program used to conduct this analysis
Objective 13.1
Other Correlation Analyses
- Discriminant function analysis
  - Similar to multiple regression except that the criterion variable is categorical
  - Typically used to predict group membership
    - High or low anxiety
    - Achievers or non-achievers
Objective 13.2
Other Correlation Analyses
- Canonical correlation
  - An extension of multiple regression in which more than one predictor variable and more than one criterion variable are used
- Factor analysis
  - A correlational analysis used to take a large number of variables and group them into a smaller number of clusters of similar variables called factors
Objectives 13.3 & 13.4
A Checklist of Questions
- Was the correct correlation coefficient used?
- Are the validity and reliability of the instruments acceptable?
- Is there a restricted range of scores?
- How large is the sample?
Statistical Assessment of Relationships
- Data: Are the data quantitative or nominal?
  - Quantitative: Do you have more than two predictor variables?
    - No: Correlation Analysis: r
    - Yes: Regression Analysis: R
  - Nominal: Do you have more than two predictor variables?
    - No: Chi-Square Analysis: χ²
    - Yes: Log-Linear Analysis, Logistic Regression
The Correlation Coefficient for Association among Quantitative Variables
Scatterplot
A graph in which the x axis indicates the scores on the predictor variable and the y axis represents the scores on the outcome variable. A point is plotted for each individual at the intersection of their scores.
Regression Line
A line in which the squared distances of the points from the line are minimized (least squares method).
[Figure: scatterplot of College GPA (y axis) against High School GPA (x axis), both on a 1.0 to 4.0 scale]
Linear Relationships and Nonlinear Relationships
[Figure: five panels plotting Y against X, labeled Positive Linear, Negative Linear, Curvilinear, Curvilinear, and Independent]
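The Pearson Correlation Coefficient
Calculation

Participant   Esteem 1 (X)   Esteem 2 (Y)
1             4              4
2             4              3
3             3              2
4             2              2
5             2              1
Mean          3.0            2.4

Standard deviations (dividing by N - 1): s_esteem1 = sqrt(4/4) = 1.00, s_esteem2 = sqrt(5.2/4) = 1.14
z score for participant 1 on Esteem 1: (4 - 3)/1.00 = 1.00

\[ r = \frac{\sum Z_X Z_Y}{N - 1} \]

\[ r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\left[\sum (X - \bar{X})^2\right]\left[\sum (Y - \bar{Y})^2\right]}}
     = \frac{(4-3)(4-2.4) + \dots}{\sqrt{\left[(4-3)^2 + \dots\right]\left[(4-2.4)^2 + \dots\right]}} \]

Task 1: compute r

\[ r = \frac{\sum XY - \dfrac{(\sum X)(\sum Y)}{N}}{\sqrt{\left[\sum X^2 - \dfrac{(\sum X)^2}{N}\right]\left[\sum Y^2 - \dfrac{(\sum Y)^2}{N}\right]}} \]

where
\[ \sum XY = 4 \times 4 + 4 \times 3 + \dots, \quad \sum X = 4+4+3+2+2, \quad \sum Y = 4+3+2+2+1, \quad N = 5, \]
\[ \sum X^2 = 4 \times 4 + 4 \times 4 + 3 \times 3 + \dots, \quad \sum Y^2 = 4 \times 4 + 3 \times 3 + 2 \times 2 + \dots \]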
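Task 1 can be checked with a short Python sketch. It assumes the five score pairs read from the calculation slide, Esteem 1 = 4, 4, 3, 2, 2 and Esteem 2 = 4, 3, 2, 2, 1, and computes r with both the deviation-score formula and the computational (raw-score) formula. The function names are hypothetical.

```python
import math

esteem1 = [4, 4, 3, 2, 2]  # X, as read from the slide
esteem2 = [4, 3, 2, 2, 1]  # Y, as read from the slide


def pearson_r_deviation(xs, ys):
    """Pearson r from deviations around the two means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def pearson_r_computational(xs, ys):
    """Pearson r from raw sums: sum XY, sum X, sum Y, sum X^2, sum Y^2."""
    n = len(xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = sum_xy - sum_x * sum_y / n
    denominator = math.sqrt((sum_x2 - sum_x ** 2 / n) * (sum_y2 - sum_y ** 2 / n))
    return numerator / denominator


print(round(pearson_r_deviation(esteem1, esteem2), 3))      # 0.877
print(round(pearson_r_computational(esteem1, esteem2), 3))  # 0.877
```

Both formulas reduce to 4 / sqrt(4 x 5.2) for these data, so they agree.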
Interpretation of r
-1 ≤ r ≤ 1
If the relationship between X and Y is positive: 0 < r ≤ 1
If the relationship between X and Y is negative: -1 ≤ r < 0
If the p-value associated with r is < .05
  The variables X and Y are significantly correlated with each other.
  Positively: 0 < r ≤ 1; negatively: -1 ≤ r < 0
If the p-value associated with r is > .05
  There is no significant correlation between X and Y.
Reporting Correlations
r(Number of Participants) = Correlation Coefficient r, p < p value
As predicted by the research hypothesis, the variables of optimism and reported health behavior were significantly positively correlated in the sample, r(20) = .52, p < .01.
Limitation
1. Cases in which X and Y have a curvilinear relationship (r = 0)
2. Cases in which the range of the variables is restricted (restriction of range)
   Example: SAT scores and college GPA
3. Cases in which the data have outliers (|r| > .99)
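Two of these limitations can be reproduced with a few lines of Python. The data sets are hypothetical and were chosen to make the effects obvious; the helper function is a plain deviation-score Pearson r.

```python
import math


def pearson_r(xs, ys):
    """Pearson r from deviations around the two means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Curvilinear: Y is completely determined by X (Y = X squared), yet r is 0.
curve_x = [-2, -1, 0, 1, 2]
curve_y = [x ** 2 for x in curve_x]
r_curve = pearson_r(curve_x, curve_y)

# Outlier: four unrelated points, then the same points plus one extreme case.
r_without = pearson_r([1, 2, 1, 2], [1, 1, 2, 2])
r_with = pearson_r([1, 2, 1, 2, 10], [1, 1, 2, 2, 10])

print(f"curvilinear: r = {r_curve:.2f}")
print(f"without outlier: r = {r_without:.2f}, with outlier: r = {r_with:.2f}")
```

A single extreme case moves r from 0 to above .95 here, and a perfect curved relationship yields r = 0, so a scatterplot should be inspected before r is interpreted.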
Limitation (visual)
[Figure: three panels labeled Curvilinear, Small Range, and Outlier]
The Chi-square Statistic
for Association among nominal variables
                          Yes          No           Total
Northerner   observed     30 (.15)     70 (.35)     100 (.50)
             expected     45 (.225)    55 (.275)
Southerner   observed     60 (.30)     40 (.20)     100 (.50)
             expected     45 (.225)    55 (.275)
Total                     90 (.45)     110 (.55)    200 (1.00)
\[ f_e = \frac{\text{Row marginal} \times \text{Column marginal}}{N} \]

Task 2: computation of χ²

\[ \chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}
   = \frac{(30 - 45)^2}{45} + \frac{(70 - 55)^2}{55} + \frac{(60 - 45)^2}{45} + \frac{(40 - 55)^2}{55} \]
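Task 2 can be checked with a short Python sketch. It takes the observed counts from the contingency table, derives each expected frequency as row marginal x column marginal / N (45 and 55 for these data), and sums (fo - fe)^2 / fe over the four cells. The function name is hypothetical.

```python
def chi_square(observed):
    """Return expected frequencies, the chi-square statistic, and df for a table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # Expected frequency for each cell: row marginal x column marginal / N
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    statistic = sum(
        (fo - fe) ** 2 / fe
        for obs_row, exp_row in zip(observed, expected)
        for fo, fe in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return expected, statistic, df


# Rows: Northerner, Southerner. Columns: Yes, No.
observed = [[30, 70], [60, 40]]
expected, statistic, df = chi_square(observed)
print(expected)             # [[45.0, 55.0], [45.0, 55.0]]
print(round(statistic, 2))  # 18.18
print(df)                   # 1
```

With df = 1, the critical value at the .05 level is 3.841, so a statistic of about 18.18 indicates that the observed table differs significantly from expectation.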
Interpretation of χ²
Go to Table E in Appendix E.
Degrees of freedom (df):
(Levels of variable 1 - 1) × (Levels of variable 2 - 1)
Number of participants (N)
See the value at the intersection of alpha (p < .05) and df.
If χ² is greater than the value in Table E, the contingency table differs significantly from expectation.
If χ² is less than the value in Table E, the contingency table does not differ significantly from expectation.
Reporting Chi-Square Statistic
χ²(degrees of freedom (df), N = Number of Participants) = Chi-square value, p < p value
As predicted by the research hypothesis, the southerners were more likely to approve of a policeman striking an adult male citizen who was being questioned as a suspect in a murder case, χ²(1, N = 30) = 34.23, p < .01