DR. APJ ABDUL KALAM TECHNICAL UNIVERSITY
Branch – PHARMACY
Biostatistics And Research Methodology (BP801T)
Lecture – 10
Regression (PART-2)
By
Dr. MANOJ KUMAR SHARMA
Associate Professor
I.T.S College of Pharmacy, MURADNAGAR
Contents
• Fitting the lines y = a + bx and x = a + by
• Multiple regression
• Standard error of regression
Regression Equations in Individual Series Using Normal Equations
This method is also called the Least Squares Method.
Under this method, regression equations can be calculated by
solving two normal equations:
For regression equation Y on X:
$$Y = a + bX$$
$$\sum Y = Na + b\sum X$$
$$\sum XY = a\sum X + b\sum X^2$$
For regression equation X on Y:
$$X = a + bY$$
$$\sum X = Na + b\sum Y$$
$$\sum XY = a\sum Y + b\sum Y^2$$
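The two normal equations form a 2×2 linear system in a and b, so they can be solved directly. The sketch below is one possible way to do this, using Cramer's rule; the function name `solve_normal_equations` and the data values are hypothetical, chosen only for illustration. Swapping the roles of X and Y gives the line X on Y.

```python
def solve_normal_equations(xs, ys):
    """Return (a, b) for the line ys = a + b*xs by solving
    sum(y)  = N*a      + b*sum(x)
    sum(xy) = a*sum(x) + b*sum(x^2)
    """
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    # Cramer's rule on the 2x2 system in the unknowns a and b.
    det = n * sum_x2 - sum_x * sum_x
    if det == 0:
        raise ValueError("all x values are equal; the line is not defined")
    a = (sum_y * sum_x2 - sum_x * sum_xy) / det
    b = (n * sum_xy - sum_x * sum_y) / det
    return a, b


# Hypothetical data, for illustration only.
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]

a_yx, b_yx = solve_normal_equations(X, Y)  # Y on X
a_xy, b_xy = solve_normal_equations(Y, X)  # X on Y: swap the roles

print(f"Y on X: Y = {a_yx:.2f} + {b_yx:.2f} X")
print(f"X on Y: X = {a_xy:.2f} + {b_xy:.2f} Y")
```

For these values the sums are ΣX = 15, ΣY = 20, ΣXY = 66, ΣX² = 55 and ΣY² = 86, which give Y = 2.2 + 0.6X and X = −1 + 1.0Y.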
Another method
$$b_{yx} = \frac{N\sum XY - \sum X \sum Y}{N\sum X^2 - \left(\sum X\right)^2} \quad \text{and} \quad a = \bar{Y} - b_{yx}\bar{X}$$
Here a is the Y-intercept:
it indicates the estimated value of Y when X = 0.
b is the slope of the line:
it indicates the change in Y for a
unit increase in X.
Example
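The original worked example is not reproduced here. As a stand-in, the following minimal sketch applies the direct formula for the slope of Y on X, followed by a = Ȳ − bX̄, to a small hypothetical data set; the function name `fit_line_shortcut` and the numbers are assumptions made for illustration.

```python
def fit_line_shortcut(xs, ys):
    """Fit Y = a + bX using the direct formula for b, then a = mean(Y) - b*mean(X)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = sum_y / n - b * (sum_x / n)
    return a, b


# Hypothetical data, for illustration only.
X = [2, 4, 6, 8, 10]
Y = [5, 7, 9, 8, 11]

a, b = fit_line_shortcut(X, Y)
print(f"Y = {a:.2f} + {b:.2f} X")
```

Here N = 5, ΣX = 30, ΣY = 40, ΣXY = 266 and ΣX² = 220, so b = (1330 − 1200) / (1100 − 900) = 0.65 and a = 8 − 0.65 × 6 = 4.1.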
Multiple Regression
INTRODUCTION
Multiple regression analysis is a powerful technique
used for predicting the unknown value of a variable
from the known values of two or more other variables,
which are also called predictors.
It is a method for studying the relationship between
a dependent variable and two or more independent
variables.
Purposes:
Prediction
Explanation
Theory building
The variable whose value is to be predicted is
known as the dependent variable.
The ones whose known values are used
for prediction are known as independent
(explanatory) variables.
Design requirement
One dependent variable (criterion)
Two or more independent variables
(predictor variables).
Sample size: ≥ 50 (at least 10 times as
many cases as independent variables)
General equation
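• In general, the multiple regression equation of Y on X1, X2, …, Xk is given
by:
$$Y = a + b_1 X_1 + b_2 X_2 + \dots + b_k X_k$$
where a is the intercept and b1, b2, …, bk are the regression coefficients.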
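As an illustration, the coefficients of a multiple regression can be obtained by least squares in the same way as for a single predictor, only with more normal equations. The sketch below assumes the usual linear form Y = a + b1·X1 + … + bk·Xk and solves the normal equations by Gauss-Jordan elimination in plain Python. The function name `fit_multiple_regression` and the data are hypothetical; in practice a statistical package would normally be used.

```python
def fit_multiple_regression(rows, ys):
    """rows: list of [x1, ..., xk] per case; returns [a, b1, ..., bk]."""
    # Design matrix with a leading 1 in each row for the intercept a.
    design = [[1.0] + [float(v) for v in row] for row in rows]
    p = len(design[0])

    # Normal equations (D^T D) coef = D^T y, stored as an augmented matrix.
    aug = []
    for i in range(p):
        row = [sum(d[i] * d[j] for d in design) for j in range(p)]
        row.append(sum(d[i] * y for d, y in zip(design, ys)))
        aug.append(row)

    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; no unique solution")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(p):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * c for v, c in zip(aug[r], aug[col])]

    return [aug[i][p] / aug[i][i] for i in range(p)]


# Hypothetical data generated exactly from Y = 1 + 2*X1 + 3*X2.
X_rows = [[1, 1], [2, 1], [3, 2], [4, 3], [5, 5]]
Y = [1 + 2 * x1 + 3 * x2 for x1, x2 in X_rows]

coef = fit_multiple_regression(X_rows, Y)
print([round(c, 4) for c in coef])
```

Because the data were generated without noise, the fitted coefficients recover a = 1, b1 = 2 and b2 = 3 up to rounding error.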
Simple vs. Multiple Regression
Simple regression:
• One dependent variable Y predicted from one independent variable X
• One regression coefficient
• r²: proportion of variation in dependent variable Y predictable from X
Multiple regression:
• One dependent variable Y predicted from a set of independent variables (X1, X2, …, Xk)
• One regression coefficient for each independent variable
• R²: proportion of variation in dependent variable Y predictable by the set of independent variables (X's)
Advantages
Once a multiple regression equation has been
constructed, one can check how good it is by
examining the coefficient of determination (R²).
R² always lies between 0 and 1.
Most statistical software reports it whenever a regression
procedure is run. The closer R² is to 1, the better
the model fits the data and the better its predictions.
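A minimal sketch of how R² can be computed from actual and estimated values, as one minus the ratio of unexplained variation to total variation. The function name `r_squared`, the data, and the fitted line Y = 2.2 + 0.6X are hypothetical and used only for illustration.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - (unexplained variation / total variation)."""
    mean_y = sum(actual) / len(actual)
    ss_res = sum((y - yc) ** 2 for y, yc in zip(actual, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in actual)
    return 1 - ss_res / ss_tot


# Hypothetical data and a hypothetical least-squares line Y = 2.2 + 0.6X.
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]
Yc = [2.2 + 0.6 * x for x in X]  # estimated values

r2 = r_squared(Y, Yc)
print(round(r2, 3))
```

For these values the unexplained variation is 2.4 and the total variation is 6, so R² = 1 − 2.4/6 = 0.6.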
Assumptions
The multiple regression technique does not test
whether the data are linear. On the contrary, it
proceeds by assuming that the relationship
between Y and each of the Xi's is linear.
Hence, as a rule, it is prudent to always look
at the scatter plots of (Y, Xi), i = 1, 2, …, k. If
any plot suggests nonlinearity, one may use
a suitable transformation to attain linearity.
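As a small illustration of a linearising transformation, the sketch below uses hypothetical data that follow an exponential curve. The correlation between X and Y is noticeably below 1, while the correlation between X and log(Y) is essentially 1, so a straight line fits the transformed data. The choice of the log transformation and the helper `pearson_r` are assumptions for this example; other data may call for a different transformation.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data following Y = 2 * exp(0.5 * X), which is not linear in X.
X = [1, 2, 3, 4, 5, 6]
Y = [2 * math.exp(0.5 * x) for x in X]

r_raw = pearson_r(X, Y)                         # before transformation
r_log = pearson_r(X, [math.log(y) for y in Y])  # after taking log(Y)

print(f"r(X, Y)     = {r_raw:.3f}")
print(f"r(X, log Y) = {r_log:.3f}")
```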
STANDARD ERROR OF ESTIMATE
OR REGRESSION
The standard error of estimate helps us to know
to what extent the estimates are
accurate.
It shows how close the values estimated
by the regression line are to the actual
values.
For two regression lines, there are two standard
errors of estimate:
Standard error of estimate of Y on X (Syx)
Standard error of estimate of X on Y (Sxy)
FORMULAE FOR SE (Y ON X)
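$$S_{yx} = \sqrt{\frac{\sum (Y - Y_c)^2}{N}}$$
where Y = actual values and Yc = estimated values.
$$S_{yx} = \sqrt{\frac{\sum Y^2 - a\sum Y - b\sum XY}{N}}$$
Here a and b are to be obtained from the normal equations.
$$S_{yx} = \sigma_y \sqrt{1 - r^2}$$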
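The three formulae for Syx are algebraically equivalent when a and b come from the least-squares fit and σy is computed with N in the denominator. The following sketch checks this numerically on a small hypothetical data set; all variable names and values are assumptions for illustration.

```python
import math

# Hypothetical data, for illustration only.
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]
n = len(X)

sum_x, sum_y = sum(X), sum(Y)
sum_xy = sum(x * y for x, y in zip(X, Y))
sum_x2 = sum(x * x for x in X)
sum_y2 = sum(y * y for y in Y)

# Least-squares line Y = a + bX (solution of the normal equations).
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
a = sum_y / n - b * sum_x / n

# Formula 1: from the deviations of actual values from estimated values.
s1 = math.sqrt(sum((y - (a + b * x)) ** 2 for x, y in zip(X, Y)) / n)

# Formula 2: from the sums and the constants a and b.
s2 = math.sqrt((sum_y2 - a * sum_y - b * sum_xy) / n)

# Formula 3: from sigma_y (population form, divisor N) and r.
sigma_y = math.sqrt(sum_y2 / n - (sum_y / n) ** 2)
r = (n * sum_xy - sum_x * sum_y) / math.sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)
s3 = sigma_y * math.sqrt(1 - r ** 2)

print(round(s1, 4), round(s2, 4), round(s3, 4))
```

All three values agree (each equals √0.48 for this data), which is a useful check when working such problems by hand.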
PRACTICE PROBLEMS – SE
Q1: Find the standard error of estimate if
σx = 4.4, σy = 2.2 and r = 0.8
Solution: For Y on X,
$$S_{yx} = \sigma_y \sqrt{1 - r^2} = 2.2\sqrt{1 - 0.8^2} = 2.2 \times 0.6 = 1.32$$
Ans: 1.32
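The same calculation can be checked in a few lines. The sketch below also evaluates the counterpart for X on Y, Sxy = σx√(1 − r²), which the question's σx allows; the helper name `standard_error` is hypothetical.

```python
import math


def standard_error(sigma, r):
    """Standard error of estimate from a standard deviation and r."""
    return sigma * math.sqrt(1 - r ** 2)


sigma_x, sigma_y, r = 4.4, 2.2, 0.8

s_yx = standard_error(sigma_y, r)  # Y on X
s_xy = standard_error(sigma_x, r)  # X on Y

print(round(s_yx, 2), round(s_xy, 2))
```

With √(1 − 0.64) = 0.6, this gives Syx = 2.2 × 0.6 = 1.32 and Sxy = 4.4 × 0.6 = 2.64.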
THANK YOU