Lec 11
Javed Iqbal
Scatter plot and comments on (1) direction, (2) strength and (3) linearity of the relationship
and (4) possible outliers
Interpretation of slope and intercept. Note that in many cases the intercept does not have a
meaningful interpretation because the observed x values do not include values near x = 0.
More examples:
(1) Wage = 20,000 + 1,500 × Years of Education (wage in Rs.)
(2) Salary of CEO = 1,000,000 + 50,000 × Return on Equity (salary in Rs., ROE in %)
(3) GDP Growth of a country = 3 + 0.8 × Export Growth (both x and y in %)
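As a quick check of how the slope is read in example (1), the sketch below evaluates the wage equation at two education levels one year apart. The helper name `wage` and the values 12 and 13 are illustrative assumptions, not part of the lecture.

```r
# Example (1) from the notes: Wage = 20,000 + 1,500 * Years of Education (Rs.)
wage <- function(years) 20000 + 1500 * years   # hypothetical helper name

# Predicted wage at 12 and 13 years of education (illustrative values)
wage_12 <- wage(12)
wage_13 <- wage(13)

# The difference for one extra year of education equals the slope
slope_check <- wage_13 - wage_12
print(slope_check)
```

The same reading applies to examples (2) and (3): the slope is the change in predicted y for a one-unit change in x.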
Sometimes we are interested in estimating changes in percentage terms rather than in unit terms.
For this we can compute the elasticity.
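Elasticity of y wrt x (at the mean) = $\frac{\Delta y}{\Delta x} \times \frac{\bar{x}}{\bar{y}} = \text{Slope} \times \frac{\bar{x}}{\bar{y}}$
e.g. for the Orion age-price data:
Elasticity of price wrt age = $\frac{\Delta y}{\Delta x} \times \frac{\bar{x}}{\bar{y}} = -20.26 \times \frac{5.272727}{88.636363} = -1.205$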
Thus, for an average car, a one percent increase in the age of the car is associated with a
decrease in price of 1.205% on average.
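The elasticity arithmetic above can be reproduced with a minimal R sketch. It uses only the slope and the two means quoted in the notes; the variable names are assumptions chosen for illustration.

```r
# Values reported in the notes for the Orion age-price data
slope <- -20.26        # estimated slope (hundreds of dollars per year of age)
x_bar <- 5.272727      # mean age in years
y_bar <- 88.636363     # mean price in hundreds of dollars

# Elasticity at the mean = slope * (x_bar / y_bar)
elasticity <- slope * x_bar / y_bar
print(round(elasticity, 3))
```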
Computing residuals: $e = y - \hat{y}$
Comment on whether the model underestimates or overestimates the observed value.
e.g. estimate the residual for a car with age = 4 years whose actual price is 103 (in hundreds of dollars)
$\hat{y} = 195.47 - 20.26x$
x = age of car in years, y = price (in hundreds of dollars)
$\hat{y} = 195.47 - 20.26x = 195.47 - 20.26(4) = 114.43$ hundred dollars
$e = y - \hat{y} = 103 - 114.43 = -11.43$ hundred dollars
The residual is negative, so the estimated model overestimates the price of this car.
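The residual calculation for this car can be sketched in R as follows. The coefficients, age and actual price are the ones given in the notes; the variable names are illustrative assumptions.

```r
# Fitted line from the notes: y_hat = 195.47 - 20.26 * x
b0 <- 195.47    # intercept (hundreds of dollars)
b1 <- -20.26    # slope (hundreds of dollars per year of age)

age    <- 4     # age of the car in years
actual <- 103   # actual price in hundreds of dollars

# Predicted price and residual e = y - y_hat
predicted <- b0 + b1 * age
residual  <- actual - predicted

# A negative residual means the model overestimates, a positive one underestimates
direction <- if (residual < 0) "overestimates" else "underestimates"
print(c(predicted = predicted, residual = residual))
print(direction)
```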
Standardized residuals above 3 (in absolute value) may indicate that the corresponding
observation is an outlier.
An outlier may be an influential observation if its exclusion creates drastic changes in the
regression slope and intercept estimates and other measures.
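One way to apply these two checks in R is sketched below. Because the lecture data files are not reproduced here, the sketch uses R's built-in `cars` data purely as a stand-in (an assumption, not the lecture's data). Note that `rstandard()` also adjusts each residual for its leverage, so its values can differ slightly from the simple "residual divided by SD of residuals" calculation.

```r
# Built-in 'cars' data (stopping distance vs speed), used only as a stand-in
fit <- lm(dist ~ speed, data = cars)

# Simple standardization: residual divided by the residual standard deviation
simple_std <- resid(fit) / summary(fit)$sigma

# rstandard() additionally adjusts each residual for its leverage
std_res <- rstandard(fit)

# Observations flagged by the |standardized residual| > 3 rule
flagged <- which(abs(std_res) > 3)
print(flagged)

# Influence check: refit without the observation with the largest
# |standardized residual| and compare the intercept and slope
worst       <- which.max(abs(std_res))
fit_without <- lm(dist ~ speed, data = cars[-worst, ])
coef_change <- coef(fit_without) - coef(fit)
print(coef_change)
```

If dropping a single observation changes the slope or intercept drastically, that observation is a candidate influential observation; small changes suggest it is not.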
[Figure: scatter plot of selling price (vertical axis, 0 to 600) against living area (horizontal axis, 0 to 6000)]
c) Describe the apparent relationship between the two variables under consideration.
The relationship appears to be (i) positive, (ii) linear and (iii) moderately strong, with (iv) no
outlier visible in the plot.
SD of residuals = 102.336
All standardized residuals are less than 3 in absolute value, so no observation is flagged as an outlier.
g) Predict the values of the response variable for the specified values of the predictor
variable, and interpret your results
i) Find the elasticity of selling price wrt. living area for an average house and interpret it.
$\text{Elasticity} = \text{slope} \times \frac{\bar{x}}{\bar{y}} = 0.1255 \times \frac{2298}{387.4556} = 0.74$
For an average house, a 1% increase in living area is associated with an increase in house price
of about 0.74%.
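The percentage interpretation can be checked numerically with a short R sketch. The slope and means are taken from the notes. The intercept is not quoted in this part of the notes, so it is recovered here from the fact that a least-squares line passes through the point of means; that step and all variable names are assumptions made for illustration.

```r
# Values from the notes for the house price / living area regression
slope <- 0.1255      # estimated slope
x_bar <- 2298        # mean living area
y_bar <- 387.4556    # mean selling price

# A least-squares line passes through (x_bar, y_bar), which pins down the intercept
intercept <- y_bar - slope * x_bar
price_at  <- function(area) intercept + slope * area   # hypothetical helper

# Percentage change in predicted price when area rises 1% above the mean
pct_change <- 100 * (price_at(1.01 * x_bar) - price_at(x_bar)) / price_at(x_bar)

# Elasticity at the mean, computed directly from the formula
elasticity <- slope * x_bar / y_bar
print(round(c(pct_change = pct_change, elasticity = elasticity), 2))
```

Both numbers agree, which is why the elasticity at the mean can be read as the percentage change in predicted price for a 1% change in living area for an average house.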
Analysis in R
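orion <- read.csv(file.choose())  # choose the orion.csv data file interactively
head(orion)                       # inspect the first few rows
# fit the simple linear regression of price on age
model1 <- lm(Price_hunder_dollar ~ Age_year, data = orion)
summary(model1)                   # coefficients, residual standard error, R-squared
# scatter plot of price against age with the fitted regression line
plot(Price_hunder_dollar ~ Age_year, data = orion)
abline(model1)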