Lecture 5 - Correlation and Regression Analysis

The document discusses correlation analysis and linear regression. It defines correlation as a statistical technique that evaluates both the direction and strength of the linear relationship between two variables. A correlation can be positive or negative, or there may be no relationship at all. The value of the correlation coefficient r ranges from -1 to 1, with values farther from 0 indicating stronger relationships. Linear regression finds the linear relationship between an independent and a dependent variable by estimating the intercept β0 and slope β1 of the regression line. It takes the form Yi = β0 + β1Xi + Ui, where Ui is the random error term. Examples are provided to demonstrate calculating the correlation coefficient r and estimating the linear regression line.

Applied Statistics

Dr. Hossameldin Ahmed


Assistant Professor of Econometrics Applied Statistics

Lecture Six
Correlation Analysis

It is a statistical technique that evaluates both direction and strength of the linear relationship between two variables.
Correlation Analysis
Direction
Direct / Positive
• Income and Consumption
• Experience and Salary
• Price and quantity Supplied
• Trade and Economic Growth
• Unemployment and Crime
• Discount and Sales

Inverse / Negative
• Taxes and Net Income
• Prices and quantity demanded
• Dissatisfaction and Profits
• Education and Crime
• Political instability and FDI
• Turnover and job instability
Types of Relationships
[Figure: scatter plots of Y against X illustrating linear relationships (direct/positive and inverse/negative), curvilinear relationships, and no relationship.]
Correlation Analysis
Strength

Strong
• Income and Consumption
• Taxes and net income

Moderate
• Salary and Experience
• Employee satisfaction and Productivity

Weak
• Mood and Consumption
• Cultural factors and crime
Correlation Analysis
$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}}$$

Correlation coefficient (r)


−1 ≤ 𝑟 ≤ +1
The sign of r gives the direction of the association, while the absolute value of r gives the strength of the association.
Correlation Analysis

Example: r = 0.85
Comment: there is a strong positive linear relationship between income and consumption.
Loans (in million $) (Y) Deposits (in million $) (X)
245 1400
312 1600
279 1700
308 1875
199 1100
219 1550
405 2350
324 2450
319 1425
255 1700
Loans in million $ (y)   Deposits in million $ (x)   xy        x-square   y-square
245                      1400                        343000    1960000    60025
312                      1600                        499200    2560000    97344
279                      1700                        474300    2890000    77841
308                      1875                        577500    3515625    94864
199                      1100                        218900    1210000    39601
219                      1550                        339450    2402500    47961
405                      2350                        951750    5522500    164025
324                      2450                        793800    6002500    104976
319                      1425                        454575    2030625    101761
255                      1700                        433500    2890000    65025

Column sums: sum of y = 2865, sum of x = 17150, sum of xy = 5085975, sum of x-square = 30983750, sum of y-square = 853423

$$r = \frac{10 \times 5085975 - (17150 \times 2865)}{\sqrt{\left[10 \times 30983750 - 17150^2\right]\left[10 \times 853423 - 2865^2\right]}} = 0.76$$
Comment:
There is a strong positive linear relationship between deposits and loans (both measured in million $).
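
The calculation of r can be reproduced with a short script. The following is a minimal sketch in plain Python that applies the raw-sums formula to the ten loan and deposit observations from the example; the function name `pearson_r` and the variable names are illustrative choices, not part of the lecture.

```python
import math

# Data from the example (both variables in million $).
loans = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]               # y
deposits = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]  # x


def pearson_r(x, y):
    """Correlation coefficient computed from the raw-sums formula."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


r = pearson_r(deposits, loans)
print(round(r, 2))  # 0.76
```

Swapping the two arguments gives the same value, since the formula is symmetric in x and y.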
Linear Regression

It is a statistical technique that evaluates the impact of an independent variable on a dependent variable.
Linear Regression
Independent variable (X): This is the variable that causes, impacts or has an influence on the dependent variable.
Dependent variable (Y): This is the variable that is caused, impacted or influenced by the independent variable.

Uni-Directional Relationship (independent variable → dependent variable)
• Experience → Salary
• Income → Consumption
• Taxes → Net Income
• Family Size → Power consumption
• Discounts → Sales
Simple Linear Regression Model
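$$Y_i = \hat{\beta}_0 + \hat{\beta}_1 X_i + U_i$$
• $Y_i$: Dependent Variable
• $\hat{\beta}_0$: Constant
• $\hat{\beta}_1$: Regression Coefficient
• $X_i$: Independent Variable
• $U_i$: Random Error term
Linear component: $\hat{\beta}_0 + \hat{\beta}_1 X_i$
Random Error component: $U_i$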
Simple Linear Regression Model
Statistical Models and Mathematical Models?

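Why the error term?

What is the value of the error term?

$$Y_i = \hat{\beta}_0 + \hat{\beta}_1 X_i + U_i$$
[Figure: actual Y values and the estimated Y line plotted against X, with the deviations of actual Y from estimated Y marked +εi and −εi.]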
Linear Regression
Example:
$$\text{Salary}_i = \hat{\beta}_0 + \hat{\beta}_1\,\text{Experience}_i + U_i$$
Estimated model:
$$\widehat{\text{Salary}}_i = 3000 + 200\,\text{Experience}_i$$
Comment:

Constant = $\hat{\beta}_0$ = 3000
When x (experience) is equal to zero, the expected salary (y) is 3000 pounds on average.

Regression Coefficient = $\hat{\beta}_1$ = 200
When x (experience) increases by one unit (one year), y (salary) increases by 200 pounds on average.
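
The two interpretations can be checked by plugging values into the estimated model. This is a small illustrative sketch; the function name `predicted_salary` and the choice of 5 and 6 years of experience are assumptions made only for the demonstration.

```python
def predicted_salary(experience_years, intercept=3000, slope=200):
    """Estimated salary (pounds) from the fitted line: intercept + slope * experience."""
    return intercept + slope * experience_years


# Constant: the expected salary at zero years of experience.
print(predicted_salary(0))  # 3000

# Regression coefficient: the change in expected salary for one extra year.
print(predicted_salary(6) - predicted_salary(5))  # 200
```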
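Linear Regression

$$Y_i = \hat{\beta}_0 + \hat{\beta}_1 X_i + U_i$$
$$\hat{\beta}_1 = \frac{n\sum X_i Y_i - \sum X_i \sum Y_i}{n\sum X_i^2 - \left(\sum X_i\right)^2}, \qquad \hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}$$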
Loans in million $ (y)   Deposits in million $ (x)   xy        x-square   y-square
245                      1400                        343000    1960000    60025
312                      1600                        499200    2560000    97344
279                      1700                        474300    2890000    77841
308                      1875                        577500    3515625    94864
199                      1100                        218900    1210000    39601
219                      1550                        339450    2402500    47961
405                      2350                        951750    5522500    164025
324                      2450                        793800    6002500    104976
319                      1425                        454575    2030625    101761
255                      1700                        433500    2890000    65025

Column sums: sum of y = 2865, sum of x = 17150, sum of xy = 5085975, sum of x-square = 30983750, sum of y-square = 853423

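Linear Regression
$$\text{Loans}_i = \hat{\beta}_0 + \hat{\beta}_1\,\text{Deposits}_i + U_i$$

$$\bar{y} = \frac{\sum y}{n} = \frac{2865}{10} = 286.5, \qquad \bar{x} = \frac{\sum x}{n} = \frac{17150}{10} = 1715$$
$$\hat{\beta}_1 = \frac{n\sum X_i Y_i - \sum X_i \sum Y_i}{n\sum X_i^2 - \left(\sum X_i\right)^2} = \frac{10 \times 5085975 - 17150 \times 2865}{10 \times 30983750 - 17150^2} = 0.1098 \approx 0.11$$
$$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X} = 286.5 - 0.11 \times 1715 = 97.85$$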
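
The slope and intercept can also be computed directly from the data. The sketch below is a plain-Python illustration of the two formulas; the function name `fit_line` is an assumed name, not something defined in the lecture. Note that the script keeps the unrounded slope, so its intercept differs slightly from the 97.85 obtained above, which uses the slope rounded to 0.11.

```python
# Data from the example (both variables in million $).
loans = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]               # y
deposits = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]  # x


def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line using the raw-sums formulas."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    slope = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    intercept = sum_y / n - slope * (sum_x / n)
    return intercept, slope


b0, b1 = fit_line(deposits, loans)
print(round(b1, 4))  # 0.1098 (rounds to 0.11)
print(round(b0, 2))  # 98.25 with the unrounded slope

# Reproducing the hand calculation, which rounds the slope to 0.11 first:
print(round(286.5 - 0.11 * 1715, 2))  # 97.85
```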
Linear Regression
Estimated Model:
$$\widehat{\text{loans}} = 97.85 + 0.11\,\text{deposits}$$
Comment:
Regression Coefficient = $\hat{\beta}_1$ = 0.11
When deposits increase by one unit (one million $), loans increase on average by 0.11 million $.

Constant = $\hat{\beta}_0$ = 97.85
When X (deposits) is equal to zero, loans are 97.85 million $ on average.
Linear Regression
Estimated Model:
$$\widehat{\text{loans}} = 97.85 + 0.11\,\text{deposits}$$
What is the expected value of loans when deposits are 2500 (in million $)? Comment on the result.
$$\widehat{\text{loans}} = 97.85 + 0.11 \times 2500 = 372.85 \text{ million \$}$$
Comment: ??
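
As a hedged illustration of the prediction step, the sketch below evaluates the estimated model at deposits of 2500 and also checks where that value sits relative to the observed deposits. The names `predict_loans`, `INTERCEPT` and `SLOPE` are illustrative; the coefficients are the rounded values from the estimated model.

```python
# Rounded coefficients from the estimated model (loans and deposits in million $).
INTERCEPT = 97.85
SLOPE = 0.11

observed_deposits = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]


def predict_loans(deposits_value):
    """Expected loans (million $) for a given level of deposits (million $)."""
    return INTERCEPT + SLOPE * deposits_value


new_deposits = 2500
print(round(predict_loans(new_deposits), 2))  # 372.85

# True means the prediction is made outside the range of the observed data.
print(new_deposits > max(observed_deposits))  # True
```

One point that may be worth considering when commenting: 2500 lies slightly above the largest observed deposit value (2450), so the prediction extends the fitted line a little beyond the sample.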
Linear Regression
Homework
Check numerically that
$$\sum \hat{U}_i = 0$$
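
One possible way to carry out the numerical check is sketched below in plain Python, under the assumption that the estimated error for each observation is the actual value minus the fitted value. Variable names are illustrative. Because of floating-point arithmetic, the sum is compared against a small tolerance rather than against exactly zero.

```python
# Data from the loans and deposits example (both in million $).
loans = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]               # y
deposits = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]  # x

n = len(loans)
mean_x = sum(deposits) / n
mean_y = sum(loans) / n

# Least-squares slope and intercept from the raw-sums formulas.
sum_xy = sum(x * y for x, y in zip(deposits, loans))
sum_x2 = sum(x * x for x in deposits)
slope = (n * sum_xy - sum(deposits) * sum(loans)) / (n * sum_x2 - sum(deposits) ** 2)
intercept = mean_y - slope * mean_x

# Estimated error for each observation: actual y minus fitted y.
residuals = [y - (intercept + slope * x) for x, y in zip(deposits, loans)]
print(abs(sum(residuals)) < 1e-9)  # True

# The same holds for the rounded model, because its intercept was also
# computed as mean_y - slope * mean_x.
rounded = [y - (97.85 + 0.11 * x) for x, y in zip(deposits, loans)]
print(abs(sum(rounded)) < 1e-9)  # True
```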
