Lec 11

Uploaded by

Saad
Lecture 11: Introduction to Statistics by Dr. Javed Iqbal

Computation and formula Weiss p-649, Example 14.6, p-650

Ex/HW Weiss 14.59, 14.60 p-658

Scatter plot and comments on (1) direction, (2) strength and (3) linearity of the relationship
and (4) possible outliers

Estimation of intercept and slope and prediction equation

Interpretation of slope and intercept. Note that in many cases the intercept does not have a
meaningful interpretation because the x data do not contain values near x = 0.

More examples:
(1) Wage = 20,000 + 1,500 × Years of Education (wage in Rs.)
(2) Salary of CEO = 1,000,000 + 50,000 × Return on Equity (salary in Rs., ROE in %)
(3) GDP Growth of a country = 3 + 0.8 × Export Growth (both x and y in %)

Sometimes we are interested in estimating changes in percentage terms not in unit terms. We can
compute elasticity.

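\[ \text{Elasticity of } y \text{ wrt } x \text{ (at mean)} = \frac{\Delta y}{\Delta x} \times \frac{\bar{x}}{\bar{y}} = \text{Slope} \times \frac{\bar{x}}{\bar{y}} \]

e.g. for the Orion age-price data:

\[ \text{Elasticity of price wrt age} = \frac{\Delta y}{\Delta x} \times \frac{\bar{x}}{\bar{y}} = -20.26 \times \frac{5.272727}{88.636363} = -1.205 \]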

Thus, for an average car, a one percent increase in the age of the car is associated with a decrease
in price of 1.205% on average.
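
The elasticity calculation above can be checked with a few lines of R. This is only a sketch: the function name `elasticity_at_mean` is made up for illustration, and the slope and means are the Orion figures quoted in these notes.

```r
# Elasticity at the mean = slope * (x_bar / y_bar).
# `elasticity_at_mean` is a hypothetical helper, not part of any package.
elasticity_at_mean <- function(slope, x_bar, y_bar) {
  slope * x_bar / y_bar
}

# Orion age-price figures as quoted in the notes
e_orion <- elasticity_at_mean(slope = -20.26, x_bar = 5.272727, y_bar = 88.636363)
print(round(e_orion, 3))  # -1.205
```

The same helper can be reused for any fitted line once the slope and the two sample means are known.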

Prediction (interpolation/extrapolation): A regression model should be used for within-sample
prediction (interpolation) but not for prediction outside the sample range (extrapolation).

Computing residuals: \(e = y - \hat{y}\)
Comments on under- or overestimation
e.g. estimate the residual for a car with age = 4 years whose actual price is 103 (in $100)
\(\hat{y} = 195.47 - 20.26x\)
x = age of car in years, y = price (in $100)
\(\hat{y} = 195.47 - 20.26x = 195.47 - 20.26(4) = 114.43\) hundred dollars
\(e = y - \hat{y} = 103 - 114.43 = -11.43\) hundred dollars
Thus, the estimated model overestimates the price of this car.
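
As a quick check of the residual arithmetic, the steps can be sketched in R. The names `predict_price`, `y_hat` and `e` are illustrative assumptions; the coefficients are the Orion estimates given in the notes.

```r
# Orion prediction equation from the notes: y_hat = 195.47 - 20.26 * x
b0 <- 195.47
b1 <- -20.26

# Hypothetical helper: predicted price (in $100) for a car of a given age (years)
predict_price <- function(age) b0 + b1 * age

y_actual <- 103            # actual price of the 4-year-old car, in $100
y_hat <- predict_price(4)  # fitted value, about 114.43
e <- y_actual - y_hat      # residual, about -11.43

# A negative residual means the model overestimates the actual price
print(c(fitted = y_hat, residual = e))
```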

Standardized residuals = Residuals / SD of residuals

Standardized residuals above 3 (in absolute value) may indicate that the corresponding
observation is an outlier.
An outlier may be an influential observation if its exclusion creates drastic changes in the
regression slope and intercept estimates and other measures.

You do it: Weiss Ex/HW 14.60 p-658

a) Find the regression equation for the data points.

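Check that n = 9, \(\sum x = 20682\), \(\sum y = 3487.1\), \(\sum xy = 9254378\), \(\sum x^2 = 57414186\),
\(\sum y^2 = 1590653\), \(\bar{x} = 2298\), \(\bar{y} = 387.4556\)

\[ S_{xx} = \sum x^2 - \frac{(\sum x)^2}{n} = 57414186 - \frac{20682^2}{9} = 9886950 \]
\[ S_{xy} = \sum xy - \frac{(\sum x)(\sum y)}{n} = 9254378 - \frac{(20682)(3487.1)}{9} = 1241022.2 \]
\[ b_1 = \frac{S_{xy}}{S_{xx}} = 0.1255, \qquad b_0 = \bar{y} - b_1 \bar{x} = 99.008 \]

Estimated Regression Equation: \(\hat{y} = 99.008 + 0.1255x\)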

x = living area of house (square feet), y = selling price ($1000)
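
The computation in part a) can be reproduced from the summary sums alone. The following R sketch uses only the totals listed above; the variable names are chosen for illustration.

```r
# Summary sums for the house data, as listed in the notes
n      <- 9
sum_x  <- 20682
sum_y  <- 3487.1
sum_xy <- 9254378
sum_x2 <- 57414186

# Corrected sums of squares and cross-products
Sxx <- sum_x2 - sum_x^2 / n          # 9886950
Sxy <- sum_xy - sum_x * sum_y / n    # about 1241022.2

# Least-squares slope and intercept
b1 <- Sxy / Sxx                      # about 0.1255
b0 <- sum_y / n - b1 * sum_x / n     # about 99.008

print(round(c(Sxx = Sxx, Sxy = Sxy, b1 = b1, b0 = b0), 4))
```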

b) Graph the regression equation and the data points.

[Figure: scatter plot of y = House Price ($1000) against Living Area (Sq. feet), with the fitted line y = 99.008 + 0.1255 x]

c) Describe the apparent relationship between the two variables under consideration.
There appears to be a (i) positive, (ii) linear, (iii) moderately strong relationship, with (iv) no
outlier visible in the plot.

d) Interpret the slope of the regression line.
A one square foot increase in living area is associated with an increase in selling price of 0.1255
thousand dollars (i.e. $125.5) on average.
e) Identify the predictor and response variables. Predictor: Living area, Response: Selling price
f) Identify outliers and potential influential observations.

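Residuals and standardized residuals are:

Observation            1         2         3         4         5         6         7         8         9
Residual          45.742   179.069   -31.508  -140.983  -153.558     7.798    -6.108    43.743    56.242
St. Residuals      0.418     1.637    -0.288    -1.289    -1.404     0.071    -0.056     0.400     0.514

SD of residuals = 102.336
Note: the tabulated standardized residuals correspond to dividing each residual by about 109.4, i.e.
sqrt(SSE/(n - 2)), whereas 102.336 is sqrt(SSE/(n - 1)). The conclusion is the same with either divisor.
All standardized residuals are less than 3 in absolute value, so no observation is flagged as an outlier.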
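
The outlier check in part f) can be sketched in R from the tabulated residuals. The sketch computes two common scalings, the sample SD of the residuals (divisor n - 1) and the standard error of the estimate (divisor n - 2). Which one the table used is an inference from the numbers, not something stated in the source.

```r
# Residuals for the nine houses, copied from the table in the notes
res <- c(45.742, 179.069, -31.508, -140.983, -153.558,
         7.798, -6.108, 43.743, 56.242)
n <- length(res)

sd_res <- sd(res)                   # sample SD of residuals (divisor n - 1)
s_e <- sqrt(sum(res^2) / (n - 2))   # standard error of the estimate (divisor n - 2)

std_res <- res / s_e                # standardized residuals
flagged <- which(abs(std_res) > 3)  # rule of thumb from the notes: |value| > 3

print(round(sd_res, 3))
print(round(s_e, 3))
print(round(std_res, 3))
print(flagged)                      # empty: no observation is flagged
```

Dividing by `sd_res` instead gives slightly larger values, but still none above 3 in absolute value.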

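g) Predict the values of the response variable for the specified values of the predictor
variable, and interpret your results.

For a living area of 2600 sq. feet: \(\hat{y} = 99.008 + 0.1255(2600) = 425.308\) thousands of dollars,
i.e. the predicted selling price of such a house is about $425,308.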


(add these parts)
h) Find the residual corresponding to a house whose area is 3000 sq. feet and whose actual price is
600 (thousands of dollars).
\(\hat{y} = 99.008 + 0.1255(3000) = 475.508\) thousands of dollars
\(e = y - \hat{y} = 600 - 475.508 = 124.492\) thousands of dollars
The regression model underestimates the selling price of this house.
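
The prediction and residual steps for the house data can be sketched in R, using the estimated equation y-hat = 99.008 + 0.1255 x from part a). The helper name `predict_house_price` is hypothetical. Note that the fitted value includes both the intercept and the slope term.

```r
# Estimated coefficients from part a): price in $1000, area in sq. feet
b0 <- 99.008
b1 <- 0.1255

# Hypothetical helper: fitted selling price for a given living area
predict_house_price <- function(area) b0 + b1 * area

y_hat_2600 <- predict_house_price(2600)  # about 425.308
y_hat_3000 <- predict_house_price(3000)  # about 475.508

# Residual for a 3000 sq. feet house that actually sold for 600
e_3000 <- 600 - y_hat_3000               # about 124.492, model underestimates

print(c(y_hat_2600 = y_hat_2600, y_hat_3000 = y_hat_3000, e_3000 = e_3000))
```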

i) Find the elasticity of selling price wrt living area for an average house and interpret it.
\[ \text{Elasticity} = \text{slope} \times \frac{\bar{x}}{\bar{y}} = 0.1255 \times \frac{2298}{387.4556} = 0.74 \]

For an average house, a 1% increase in living area is associated with an increase in house price of
0.74%.

HW: Ex 7 Anderson pdf p-694


Also add other relevant parts, e.g. interpretation of the intercept and slope, computation and
interpretation of the elasticity at the mean, and computing the residual corresponding to part c).
Analysis in Excel
Go to Data>Data Analysis
[Note: if Data Analysis is not visible, install it as follows:
File > Options > Add-Ins > select (default) Analysis ToolPak > Go > select Analysis ToolPak > OK]
Go to Data Analysis > Regression > OK. Then input the data as per the given display.

The following output results:

Analysis in R
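orion <- read.csv(file.choose())  # choose the orion.csv data file
head(orion)                       # inspect the first few rows
# Regress price (in $100) on age (in years)
model1 <- lm(Price_hunder_dollar ~ Age_year, data = orion)
summary(model1)                   # coefficients, standard errors, R-squared
# Scatter plot of the data with the fitted regression line
plot(Price_hunder_dollar ~ Age_year, data = orion)
abline(model1)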
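
The other quantities discussed in this lecture (residuals, standardized residuals, within-sample prediction, elasticity at the mean) can also be obtained from a fitted `lm` object. Since orion.csv is not reproduced here, the sketch below uses R's built-in `cars` data (speed and stopping distance) purely as a stand-in. The variable names are assumptions, and the numbers it prints are not the Orion results.

```r
# Stand-in data: built-in `cars` (speed in mph, dist in ft). NOT the Orion data.
fit <- lm(dist ~ speed, data = cars)

b <- coef(fit)             # b[1] = intercept, b[2] = slope
res <- resid(fit)          # residuals y - y_hat
std_res <- res / sd(res)   # residuals / SD of residuals, as defined in the notes
possible_outliers <- which(abs(std_res) > 3)

# Predict only inside the observed range of x (interpolation)
x_new <- 15
stopifnot(x_new >= min(cars$speed), x_new <= max(cars$speed))
y_new <- predict(fit, newdata = data.frame(speed = x_new))

# Elasticity at the mean: slope * x_bar / y_bar
elasticity <- unname(b["speed"]) * mean(cars$speed) / mean(cars$dist)

print(b)
print(possible_outliers)
print(y_new)
print(elasticity)
```

For the Orion model the same calls would apply with `model1`, `Age_year` and `Price_hunder_dollar` in place of `fit`, `speed` and `dist`.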
