PSYC2012 Module 12 Correlation and Regression

Lecture notes

Uploaded by

thea.eveml


Correlation evaluates the linear association between two “continuous” variables.
- Population notation:
o $\rho_{XY}$
- Sample-based estimate of $\rho_{XY}$:
o $r_{XY}$
- We are looking at the Pearson product-moment correlation coefficient.
- The correlation coefficient captures the straight-line relation between two variables.

Example:
- 12 participants were measured on spirituality and longevity.
o The Pearson product-moment correlation between spirituality and longevity was .59.
HOW TO CALCULATE:
- Sum of products:
o First step: how far and in what direction do scores vary from the mean on each variable?
 $X - \bar{X}$
 $Y - \bar{Y}$
o Population notation (the sample means $\bar{X}$ and $\bar{Y}$ estimate $\mu_X$ and $\mu_Y$):
 $X - \mu_X$
 $Y - \mu_Y$
- A positive correlation suggests that individuals who score high on X (above the X mean, $X > \bar{X}$) also score high on Y (above the Y mean, $Y > \bar{Y}$).
o So if $(X - \bar{X}) > 0$, we would expect $(Y - \bar{Y}) > 0$.
 Likewise, those below the X mean should also be below the Y mean.
o So for a positive correlation, if we multiply these deviation scores, the product should be positive.
 $(X - \bar{X})(Y - \bar{Y}) > 0$
- Sum of products:
o $SP_{XY} = \sum (X - \bar{X})(Y - \bar{Y})$
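The sum of products can be sketched in a few lines of Python. The scores below are hypothetical (they are not the spirituality and longevity data), and the function name `sum_of_products` is chosen only for illustration.

```python
# Hypothetical scores for 5 participants (illustrative only, not the lecture data).
x = [2.0, 4.0, 5.0, 7.0, 9.0]
y = [1.0, 3.0, 2.0, 6.0, 8.0]


def sum_of_products(xs, ys):
    """SP_XY: sum over cases of (X - mean of X) * (Y - mean of Y)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    return sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(xs, ys))


sp_xy = sum_of_products(x, y)
print(sp_xy)  # positive here, because high X scores tend to go with high Y scores
```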

Sum of products increases with N

- We don’t want this for a correlation coefficient. We want to factor in sample size, so that a data set with N=12 shows the same SIZE of association as the same data set with every point duplicated (N=24).
o Factoring this in gives the COVARIANCE.
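A small sketch of why the raw sum of products is not a good index of association: duplicating every data point doubles it, even though the pattern in the data is unchanged. The scores are hypothetical and the helper name is an assumption made for this example.

```python
def sum_of_products(xs, ys):
    """SP_XY: sum over cases of (X - mean of X) * (Y - mean of Y)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    return sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(xs, ys))


# Hypothetical scores (illustrative only).
x = [2.0, 4.0, 5.0, 7.0, 9.0]
y = [1.0, 3.0, 2.0, 6.0, 8.0]

sp_once = sum_of_products(x, y)           # original data set
sp_twice = sum_of_products(x + x, y + y)  # same data set with every point duplicated

# Dividing by the number of cases removes the dependence on sample size.
per_case_once = sp_once / len(x)
per_case_twice = sp_twice / len(x + x)
print(sp_once, sp_twice, per_case_once, per_case_twice)
```

The duplicated data set has twice the sum of products but the same per-case value, which is the motivation for dividing by sample size.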

Covariance:
- $\widehat{cov}_{XY} = \frac{SP_{XY}}{N-1}$
- This is the sample-based estimate of the population covariance ($cov_{XY} = \frac{SP_{XY}}{N}$).
o As sample size increases, $SP_{XY}$ increases but the population covariance remains the same.
o As sample size increases, the (sample) estimate of covariance gets closer to the population covariance.
- PROBLEM:
o Covariance changes with the scale of either variable (because $SP_{XY}$ changes when the scale changes).
 E.g., changing the scale from milliseconds to seconds shouldn’t change the association between that variable (time) and the other.
 Covariance does change, which is an issue.
o Solution:
 Standardise the covariance. When we do this, we get the correlation coefficient.
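The scale problem can be illustrated with a short sketch. The reaction times and error counts below are hypothetical, and `sample_covariance` is a name chosen for this example; it implements the $SP_{XY} / (N-1)$ estimate described above.

```python
def sample_covariance(xs, ys):
    """Sample-based covariance estimate: SP_XY / (N - 1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(xs, ys))
    return sp / (n - 1)


# Hypothetical data: reaction time in milliseconds and number of errors.
rt_ms = [420.0, 515.0, 610.0, 480.0, 700.0]
errors = [1.0, 3.0, 4.0, 2.0, 6.0]

# The same reaction times expressed in seconds.
rt_s = [t / 1000.0 for t in rt_ms]

cov_ms = sample_covariance(rt_ms, errors)
cov_s = sample_covariance(rt_s, errors)
print(cov_ms, cov_s)  # the two values differ by a factor of 1000
```

Nothing about the association has changed between the two versions, yet the covariance is 1000 times smaller in seconds than in milliseconds.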

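Correlation coefficient:
- Divide the covariance by the product of the (estimated) standard deviations of X and Y.
o Population:
 $\rho_{XY} = \frac{cov_{XY}}{\sigma_X \sigma_Y}$
o The sample-based estimate is:
 $r_{XY} = \frac{\widehat{cov}_{XY}}{\hat{\sigma}_X \hat{\sigma}_Y}$
- HOW TO CALCULATE:
o $r_{XY} = \frac{\widehat{cov}_{XY}}{\hat{\sigma}_X \hat{\sigma}_Y}$
 To find $\widehat{cov}_{XY}$: $\widehat{cov}_{XY} = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{N-1}$
 To find $\hat{\sigma}_X$: $\hat{\sigma}_X = \sqrt{\frac{\sum (X - \bar{X})^2}{N-1}}$
 To find $\hat{\sigma}_Y$: $\hat{\sigma}_Y = \sqrt{\frac{\sum (Y - \bar{Y})^2}{N-1}}$
o $r_{XY} = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \sum (Y - \bar{Y})^2}} = \frac{SP_{XY}}{\sqrt{SS_X SS_Y}}$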
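As a check on the two routes to $r_{XY}$ above, the sketch below computes the coefficient both ways on hypothetical scores (not the lecture data). Variable names are chosen for illustration.

```python
import math

# Hypothetical scores (illustrative only, not the lecture data).
x = [2.0, 4.0, 5.0, 7.0, 9.0]
y = [1.0, 3.0, 2.0, 6.0, 8.0]
n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

sp_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
ss_x = sum((xi - mean_x) ** 2 for xi in x)
ss_y = sum((yi - mean_y) ** 2 for yi in y)

# Route 1: standardise the estimated covariance.
cov_hat = sp_xy / (n - 1)
sd_x = math.sqrt(ss_x / (n - 1))
sd_y = math.sqrt(ss_y / (n - 1))
r_from_cov = cov_hat / (sd_x * sd_y)

# Route 2: the (N - 1) terms cancel, leaving SP / sqrt(SS_X * SS_Y).
r_from_ss = sp_xy / math.sqrt(ss_x * ss_y)

print(r_from_cov, r_from_ss)  # the two routes agree
```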
Properties of a correlation coefficient:
- Ranges from +1 to -1.
- Assumes a linear relation between X and Y.
- We must have enough variability on X and Y. With no variability, $SS_X$ or $SS_Y$ will be 0 and the correlation cannot be computed.
o Range restriction: if variance is restricted (by restricting the
range of possible values on one variable), its correlation with
another variable is likely to be reduced.
 E.g., an easy test where everyone is getting 100%.
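Range restriction can be illustrated with a constructed example. The data below are hypothetical and were built purely to show the effect; `pearson_r` is a helper name assumed for this sketch.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: SP_XY / sqrt(SS_X * SS_Y)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(xs, ys))
    ss_x = sum((xi - mean_x) ** 2 for xi in xs)
    ss_y = sum((yi - mean_y) ** 2 for yi in ys)
    return sp / math.sqrt(ss_x * ss_y)


# Hypothetical data: X runs from 1 to 10, Y is X plus a small alternating error.
x_full = [float(v) for v in range(1, 11)]
noise = [1.0, -1.0] * 5
y_full = [xi + ei for xi, ei in zip(x_full, noise)]

# Restrict the range of X to the middle of the scale (4 to 7).
pairs = [(xi, yi) for xi, yi in zip(x_full, y_full) if 4.0 <= xi <= 7.0]
x_restricted = [p[0] for p in pairs]
y_restricted = [p[1] for p in pairs]

r_full = pearson_r(x_full, y_full)
r_restricted = pearson_r(x_restricted, y_restricted)
print(r_full, r_restricted)  # the restricted-range correlation is smaller
```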

Making inferences about $\rho_{XY}$ from $r_{XY}$

- Correlation between spirituality and longevity:
o $r_{XY} = 0.5938$
o $\rho_{XY} = ?$
- $H_0$: $\rho_{XY} = 0$
- $H_a$: $\rho_{XY} \neq 0$
o We can attach a p-value to a correlation (given its sample size). Using software, we would find in this case that p=.042.
 Conclude: those who live longer tend to have significantly higher levels of spirituality (r=.59, N=12), p=.042.

The sampling distribution of $r_{XY}$ becomes increasingly skewed as $\rho_{XY}$ approaches the extremes (due to the boundaries at -1 and +1).

- How do we deal with this skew?

Fisher z transformation:
- Use a non-linear transformation to make the distribution of $r_{XY}$ normal.
- After transforming both r and $\rho$, we calculate a z statistic.
o $z = \frac{z'_r - z'_\rho}{\sqrt{\frac{1}{N-3}}}$
- Example:
o Our null hypothesis value: $\rho_{XY} = 0$
o Our sample correlation was $r_{XY} = 0.59$
 We need to find transformed values for both of these.
- $z = \frac{z'_r - z'_\rho}{\sqrt{\frac{1}{N-3}}}$
o $z'_r$ is the transformed value for the observed r
 $r_{XY} = .59 \rightarrow z'_r = 0.678$
o $z'_\rho$ is the transformed value for the hypothesised $\rho$
 $\rho_{XY} = 0 \rightarrow z'_\rho = 0$
o $\sqrt{\frac{1}{N-3}}$ is the standard error of $z'_r$
 (N=12)

o Then it is just a z-test.
- $z = \frac{z'_r - z'_\rho}{\sqrt{\frac{1}{N-3}}} = \frac{0.678 - 0}{\sqrt{\frac{1}{12-3}}} = \frac{0.678 - 0}{0.3333} = 2.03$
o This is an observed z value. How do we evaluate it?
 Go to z tables and find an associated p value.
 Use z tables: .05 two-tailed gives a critical z of ±1.96.
- Observed z is more extreme than critical z, so reject $H_0$.
o Conclusion:
 Those who live longer tend to have significantly higher levels of spirituality (r = .59, N = 12), z=2.03, p<.05.
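The worked example above can be reproduced with a short sketch. It assumes that the Fisher transformation is $z' = \frac{1}{2}\ln\frac{1+r}{1-r}$, which equals `math.atanh(r)`; the notes read the transformed value from a table rather than stating the formula. The function names are chosen for illustration.

```python
import math


def fisher_z(r):
    """Fisher z' transformation of a correlation (assumed form: atanh(r))."""
    return math.atanh(r)


def fisher_z_test(r, rho0, n):
    """z statistic and two-tailed p-value for H0: rho = rho0."""
    se = math.sqrt(1.0 / (n - 3))                # standard error of z'_r
    z = (fisher_z(r) - fisher_z(rho0)) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))       # two-tailed area under the standard normal
    return z, p


z, p = fisher_z_test(r=0.59, rho0=0.0, n=12)
print(round(z, 2), p)
```

The z statistic matches the hand calculation (2.03), and the two-tailed p-value falls below .05, in line with the conclusion above.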
Confidence intervals:
- $(1-\alpha) \times 100\%\ C.I. = z'_r \pm z_c \sqrt{\frac{1}{N-3}}$
o NOTE: This formula will produce confidence intervals on the Fisher’s z’ scale. YOU MUST REMEMBER to transform the upper and lower limits back to r.
- $z'_r$ is the transformed value for the observed r
o $r_{XY} = 0.59 \rightarrow z'_r = 0.678$.
- $z_c$ is the critical z for the desired level of confidence.
o $z_c = \pm 1.96$ for a 95% CI.
- $\sqrt{\frac{1}{N-3}}$ is the standard error of $z'_r$, which is 0.3333 for N=12.
o Example:
 95% CI = $0.678 \pm 1.96 \times 0.3333$ = $(0.025 < z'_\rho < 1.331)$.
 NEED TO TRANSFORM BACK TO r…
o Use the z’ table to transform the limits back to r.
 $(.025 < \rho_{XY} < .87)$.

Regression:
- We looked at correlation as a method of examining a bivariate
association (association between two variables).
o We can get more information by instead using a linear
regression.
- Simple linear regression:
o Coefficient of determination: correlation coefficient squared ($r^2$)
 $r^2$ refers to the proportion of variability in Y that can be accounted for (or predicted/explained) given knowledge about scores on X (or vice versa).
- Example:
o The correlation between pain interference and depression is .72. What proportion of variability in depression is accounted for by pain interference?
 r = .72 (.723 before rounding).
 $r^2 = (.723)^2 = .523$.
o This can now be interpreted as a percentage:
 About 52.3% of the variability in depression is accounted for by pain interference.
- Correlation is useful for providing a standardised estimate of the
linear relation between two continuous variables:
o BUT if we want to more specifically describe the linear relation
OR to make explicit predictions about Y using X, we use
regression analysis.
Simple linear regression RESTRICTIONS:
- 1. We only examine one independent variable (this is the simple).
- 2. We only examine straight-line relations (this is the linear).
o Use the general equation of a straight line:
 $Y = a + bX$
 Where a is the y ‘intercept’.
 b is the ‘slope’: the change in Y as X increases by 1.
The regression model:
- Incorporates errors in prediction ($e_i$) into the general equation for a straight line:
o $Y_i = a + bX_i + e_i$
 In the population
o $Y_i = \hat{a} + \hat{b}X_i + e_i$
 Is the sample-based model.
 i in this case indexes an individual: individual 1, 2, 3 etc.
- $Y_i$ is the actual score on the dependent variable for the ith person.
- $X_i$ is the actual score on the predictor (or independent variable) for the ith person.
- $e_i$ is the error when predicting scores on the dependent variable for the ith person.
o $\hat{Y}_i = \hat{a} + \hat{b}X_i$
 $\hat{Y}_i$ is the predicted dependent variable score for a person with a given score on the independent variable $X_i$
 $Y_i - \hat{Y}_i = e_i$
- Example:
o $\hat{Y}_i = \hat{a} + \hat{b}X_i$
o $\hat{a} = -6.222$ (intercept)
o $\hat{b} = 3.662$ (slope)
 $\hat{Y}_i = -6.222 + 3.662 X_i$
 Predicted depression score = -6.222 + 3.662 x pain interference score
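The fitted equation can be turned into a small prediction function. The intercept and slope come from the example above; the pain interference scores of 2 and 3 are hypothetical inputs chosen only to show how the slope works.

```python
INTERCEPT = -6.222  # a-hat from the example
SLOPE = 3.662       # b-hat from the example


def predict_depression(pain_interference):
    """Predicted depression score for a given pain interference score."""
    return INTERCEPT + SLOPE * pain_interference


# Hypothetical pain interference scores, one unit apart.
pred_at_2 = predict_depression(2.0)
pred_at_3 = predict_depression(3.0)

print(pred_at_2, pred_at_3, pred_at_3 - pred_at_2)
```

The difference between the two predictions equals the slope: a one-unit increase in pain interference raises the predicted depression score by 3.662 points.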
- Sample conclusion using simple linear regression results:
o It appears that higher pain interference is associated with
higher depression scores, such that depression score is
predicted to increase by 3.662 points for every additional pain
interference unit, b = 3.66, t(23) = 5.02, p <.001.
- Standardised regression coefficient:
o Usually denoted $\hat{\beta}$
 = predicted number of standard deviations change in Y for a 1 standard deviation increase in X.
 In simple linear regression, $\hat{\beta} = r_{XY}$
o Standardised regression equation:
 $\hat{z}_Y = \hat{\beta} \times z_X$
 E.g., if I am 1.5 standard deviations below the mean on pain interference (i.e., $z_X = -1.5$), what depression score am I predicted to have?
o $\hat{z}_{depression} = .723 \times -1.5$
 -1.08.
o i.e., predicted to be about 1.1 standard deviations below the mean depression score.
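A minimal sketch of the standardised prediction. It uses a standardised coefficient of 0.723, the value these notes report when comparing unstandardised and standardised interpretations; the function name is an assumption for illustration.

```python
BETA = 0.723  # standardised coefficient (equal to r in simple linear regression)


def predict_z_depression(z_pain_interference):
    """Predicted depression z-score from a pain interference z-score."""
    return BETA * z_pain_interference


z_pred = predict_z_depression(-1.5)
print(round(z_pred, 2))  # roughly 1.1 standard deviations below the mean
```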
- Unstandardised vs Standardised?
o Unstandardised: As pain interference increases by 1 point, depression is predicted to increase by 3.662 points.
o Standardised: As pain interference increases by 1 standard deviation, depression is predicted to increase by 0.723 standard deviations.
 Should we interpret unstandardised (b) or standardised ($\beta$)?
- NO good rule.
o Guidelines:
 When there’s more than one predictor in the model, $\beta$ for different predictors can be directly compared, but not b.
 If there is uncertainty about the meaning of a “unit increase” on the predictor, it is safer to interpret $\beta$.
 Otherwise, interpreting b is preferable, as it allows interpretations to be expressed in terms of the original units of the dependent variable and predictor.

Partitioning variance:
- We take the variance of a dependent variable and examine how much of it is explained and unexplained.
o Why do scores on our dependent variables vary? Usually because of the independent variables in our statistical model, plus whatever is left over (within-group variability/error).
 In regression we call this left-over variance residual variance.
- Sums of squares:
o SST = sums of squares total.
o SSP = sums of squares regression.
o SSR = sums of squares residual.
- The regression line is the ‘line of best fit’ because it minimises errors
in prediction – no other straight line through the scatterplot will
provide a smaller SSR.

Effect size:
- What proportion of variability in depression is accounted for by pain
interference?
o $R^2 = SSP/SST$
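The partition can be checked numerically. The sketch below fits a line to hypothetical scores using the standard least-squares formulas $\hat{b} = SP_{XY} / SS_X$ and $\hat{a} = \bar{Y} - \hat{b}\bar{X}$, which are assumptions here because the notes do not state how the coefficients are estimated. It then confirms that SST = SSP + SSR and that SSP/SST equals the squared correlation.

```python
# Hypothetical scores (illustrative only, not the pain interference data).
x = [2.0, 4.0, 5.0, 7.0, 9.0]
y = [1.0, 3.0, 2.0, 6.0, 8.0]
n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

sp_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
ss_x = sum((xi - mean_x) ** 2 for xi in x)

# Least-squares slope and intercept (standard formulas, assumed for this sketch).
b_hat = sp_xy / ss_x
a_hat = mean_y - b_hat * mean_x
predicted = [a_hat + b_hat * xi for xi in x]

sst = sum((yi - mean_y) ** 2 for yi in y)                  # total
ssp = sum((pi - mean_y) ** 2 for pi in predicted)          # regression (explained)
ssr = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))  # residual (unexplained)

r_squared = ssp / sst
r_xy_squared = sp_xy ** 2 / (ss_x * sst)  # squared Pearson correlation

print(sst, ssp + ssr, r_squared, r_xy_squared)
```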
