Linear Regression Analysis - 4

The document discusses linear regression analysis, focusing on the simple linear regression model and the least squares method for estimating regression parameters. It provides an example using height and weight data to illustrate the identification of dependent and independent variables, the creation of a scatter diagram, and the fitting of a regression model. Additionally, it covers residuals, mean squared error, and the estimation of standard deviation of errors.

Uploaded by raisa.mim17

Linear Regression Analysis

Lecture 4
Simple Linear Regression Model
Least squares estimates of the regression parameters

β̂₁ = Σᵢ₌₁ⁿ (Xᵢ − X̄)(Yᵢ − Ȳ) / Σᵢ₌₁ⁿ (Xᵢ − X̄)²

β̂₀ = Ȳ − β̂₁ X̄
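As a minimal illustration (not part of the lecture), the two formulas translate directly into Python; the function name `least_squares` is our own:

```python
def least_squares(xs, ys):
    """Return (beta0_hat, beta1_hat) for the simple linear regression of ys on xs."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    # Slope: sum of cross-deviations over sum of squared x-deviations.
    beta1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) \
            / sum((x - x_bar) ** 2 for x in xs)
    # Intercept: the fitted line passes through the point of means.
    beta0 = y_bar - beta1 * x_bar
    return beta0, beta1

# Toy check: points lying exactly on y = 2 + 3x recover those coefficients.
b0, b1 = least_squares([0, 1, 2, 3], [2, 5, 8, 11])
print(b0, b1)  # → 2.0 3.0
```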
Example 1
Consider the following data on height (inches) and weight (lbs) of 10 individuals.
a. Identify the dependent and the independent variable.
b. Draw a scatter diagram for the given data. Interpret.
c. Fit a simple linear regression model using least squares method to estimate the
parameters. Show the calculation and Interpret your results.

Height 63 64 66 69 69 71 71 72 73 75
Weight 127 121 142 157 162 156 169 165 181 208
Solution
a. Height is the independent and Weight is the dependent variable.
b. [Scatter diagram: Weight (lbs) on the vertical axis against Height (inches) on the horizontal axis.] The points lie roughly along an upward-sloping straight line, indicating a positive linear relationship: taller individuals tend to weigh more.
c.
Ht (X)   Wt (Y)   X²      Y²       XY
63       127      3969    16129    8001
64       121      4096    14641    7744
66       142      4356    20164    9372
69       157      4761    24649    10833
69       162      4761    26244    11178
71       156      5041    24336    11076
71       169      5041    28561    11999
72       165      5184    27225    11880
73       181      5329    32761    13213
75       208      5625    43264    15600
Total    693      1588    48163    257974   110896
1 = 6.137581
0 = - 266.534

𝑌 = 0 + 1 𝑋= - 266.534 + 6.137581 X
Fitted line or Fitted linear regression model
• The fitted line, or the fitted linear regression model, is
Ŷ = β̂₀ + β̂₁ X
• The predicted values are
Ŷᵢ = β̂₀ + β̂₁ Xᵢ
Residuals
• The difference between the observed value and the fitted (or
predicted) value is called a residual.
• The i-th residual is defined as
eᵢ = Yᵢ − Ŷᵢ
Estimating residuals: from Example 1

Wt (Y)   Ŷ        eᵢ = Y − Ŷ
127      120.13    6.87
121      126.27   −5.27
142      138.55    3.45
157      156.96    0.04
162      156.96    5.04
156      169.23   −13.23
169      169.23   −0.23
165      175.37   −10.37
181      181.51   −0.51
208      193.78   14.22

Note that the residuals sum to approximately zero, as the least squares method guarantees.
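The residuals table can be reproduced with a few lines of Python (a sketch, using the rounded estimates quoted earlier):

```python
# Residuals e_i = Y_i - Yhat_i for Example 1, with the fitted line
# Yhat = -266.534 + 6.137581 X (estimates as quoted in the text).
heights = [63, 64, 66, 69, 69, 71, 71, 72, 73, 75]
weights = [127, 121, 142, 157, 162, 156, 169, 165, 181, 208]

beta0, beta1 = -266.534, 6.137581
fitted = [beta0 + beta1 * x for x in heights]
residuals = [y - yhat for y, yhat in zip(weights, fitted)]

for y, yhat, e in zip(weights, fitted, residuals):
    print(f"{y:4d}  {yhat:7.2f}  {e:7.2f}")

# Least squares forces the residuals to sum to (approximately) zero.
print(abs(sum(residuals)) < 0.05)
```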
Measures of Variation and Sum of Squares
Measures of Variation
Mean Squared Error
MSE = [1/(n − p)] × Σ(actual − forecast)²

• Where:
• n = number of items,
• p= number of parameters
• Σ = summation notation,
• Actual = original or observed y-value,
• Forecast = y-value from regression.
• General steps to calculate the MSE from a set of X and Y values:
• Find the regression line.
• Insert your X values into the fitted regression equation to find the predicted Y values (Ŷ).
• Subtract each predicted value from the observed Y to get the error.
• Square the errors.
• Add up the squared errors (the Σ in the formula is summation notation).
• Divide by n − p to find the mean squared error.
• Example Problem: Find the MSE for the following set of values:
(43,41), (44,45), (45,49), (46,47), (47,44).
• Step 1: Find the regression line. Suppose the fitted line is y = 9.2 + 0.8x.
• Step 2: Find the estimated Y values as Ŷ = 9.2 + 0.8X
• 9.2 + 0.8(43) = 43.6
• 9.2 + 0.8(44) = 44.4
• 9.2 + 0.8(45) = 45.2
• 9.2 + 0.8(46) = 46
• 9.2 + 0.8(47) = 46.8
Step 3: Find the errors (Y − Ŷ):
• 41 – 43.6 = -2.6
• 45 – 44.4 = 0.6
• 49 – 45.2 = 3.8
• 47 – 46 = 1
• 44 – 46.8 = -2.8
Step 4: Square the Errors:
• (−2.6)² = 6.76
• (0.6)² = 0.36
• (3.8)² = 14.44
• (1)² = 1
• (−2.8)² = 7.84
Step 5: Add all of the squared errors up: 6.76 + 0.36 + 14.44 + 1 +
7.84 = 30.4.

Step 6: Find the mean squared error by dividing by n − p = 5 − 2 = 3:

MSE = 30.4 / 3 ≈ 10.13.
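The six steps can be checked with a few lines of Python (an illustrative sketch, assuming the fitted line y = 9.2 + 0.8x given above):

```python
# MSE for the worked example: five points, line y = 9.2 + 0.8x,
# denominator n - p with n = 5 observations and p = 2 estimated parameters.
points = [(43, 41), (44, 45), (45, 49), (46, 47), (47, 44)]
b0, b1 = 9.2, 0.8

errors = [y - (b0 + b1 * x) for x, y in points]  # step 3
sse = sum(e ** 2 for e in errors)                # steps 4-5: 30.4
mse = sse / (len(points) - 2)                    # step 6: 30.4 / 3
print(round(sse, 2), round(mse, 2))              # → 30.4 10.13
```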
What does the Mean Squared Error Tell You?
The smaller the mean squared error, the closer the fitted line is to the data.
Depending on your data, it may be impossible to get a very small value for the mean squared error.
For example, the data above are scattered widely around the regression line, so an MSE of about 10.13 is as good as it gets here; the line y = 9.2 + 0.8x is, in fact, the least-squares line of best fit.
• The term mean squared error is sometimes used to refer to the unbiased estimate of the error variance: the residual sum of squares divided by its degrees of freedom.
• This differs from the MSE computed with the sample size n as the denominator.
• Here the denominator is the sample size reduced by the number of model parameters estimated from the same data: (n − p) for p parameters, or (n − p − 1) if an intercept is estimated in addition to the p predictors.
• Although the MSE with denominator n is not an unbiased estimator of the error variance, it is consistent, given the consistency of the predictor.
Estimate of σ
• We want to estimate σ, the standard deviation of the errors εᵢ.
• An intuitive estimate of σ would be the sample standard deviation of the errors:

σ̂ = √[ Σᵢ₌₁ⁿ (εᵢ − ε̄)² / (n − 1) ]

• However, this is not possible because the errors εᵢ depend on the unknown parameters. Instead, we estimate the parameters, approximate the errors with the residuals eᵢ, and compute the sample standard deviation of the eᵢ.
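This plug-in idea can be sketched in Python for Example 1: fit the line, form the residuals, and compute s = √(SSE/(n − 2)). (The conventional residual standard error in simple linear regression divides by n − 2, not n − 1, because two parameters are estimated; the code below is our illustration, not from the lecture.)

```python
import math

# Example 1 data (height in inches, weight in lbs).
heights = [63, 64, 66, 69, 69, 71, 71, 72, 73, 75]
weights = [127, 121, 142, 157, 162, 156, 169, 165, 181, 208]

n = len(heights)
x_bar, y_bar = sum(heights) / n, sum(weights) / n
s_xx = sum((x - x_bar) ** 2 for x in heights)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(heights, weights))
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Approximate the unobservable errors with the residuals.
residuals = [y - (b0 + b1 * x) for x, y in zip(heights, weights)]
sse = sum(e ** 2 for e in residuals)

# Residual standard error: the usual estimate of sigma (n - 2 degrees of freedom).
s = math.sqrt(sse / (n - 2))
print(round(s, 2))
```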
Standard error of estimates
