Chapter 2
12th Edition
Chapter 14
Copyright ©2012 Pearson Education, Inc. publishing as Prentice Hall Chap 14-1
Learning Objectives
The Multiple Regression
Model
DCOVA
Idea: Examine the linear relationship between
1 dependent (Y) & 2 or more independent variables (Xi)
$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \cdots + \beta_k X_{ki} + \varepsilon_i$
Multiple Regression Equation
DCOVA
The coefficients of the multiple regression model are
estimated using sample data
$\hat{Y}_i = b_0 + b_1 X_{1i} + b_2 X_{2i} + \cdots + b_k X_{ki}$
[Figure: fitted regression surface over the X1 and X2 axes, labeled "Slope for variable X1" and "Slope for variable X2"]
Example:
2 Independent Variables
DCOVA
A distributor of frozen dessert pies wants to
evaluate factors thought to influence demand
Dependent variable: Pie sales (units per week)
Independent variables: Price (in $)
Advertising ($100’s)
Pie Sales Example
DCOVA
Week  Pie Sales  Price ($)  Advertising ($100s)
1     350        5.50       3.3
2     460        7.50       3.3
3     350        8.00       3.0
4     430        8.00       4.5
5     350        6.80       3.0
6     380        7.50       4.0
7     430        4.50       3.0
8     470        6.40       3.7
9     450        7.00       3.5
10    490        5.00       4.0
11    340        7.20       3.5
12    300        7.90       3.2
13    440        5.90       4.0
14    450        5.00       3.5
15    300        7.00       2.7
Multiple regression equation:
Sales = b0 + b1 (Price) + b2 (Advertising)
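The coefficients b0, b1 and b2 for the pie sales data can be estimated by least squares. The following is a minimal sketch using only the Python standard library; the helper names `solve` and `fit_ols` are illustrative assumptions, not part of Excel, Minitab, or PHStat, and the normal-equations approach is one of several ways to compute the fit.

```python
# Pie sales data: (sales, price in $, advertising in $100s), weeks 1 to 15.
DATA = [
    (350, 5.50, 3.3), (460, 7.50, 3.3), (350, 8.00, 3.0), (430, 8.00, 4.5),
    (350, 6.80, 3.0), (380, 7.50, 4.0), (430, 4.50, 3.0), (470, 6.40, 3.7),
    (450, 7.00, 3.5), (490, 5.00, 4.0), (340, 7.20, 3.5), (300, 7.90, 3.2),
    (440, 5.90, 4.0), (450, 5.00, 3.5), (300, 7.00, 2.7),
]


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(rows):
    """Return [b0, b1, ..., bk] for rows of (y, x1, ..., xk) via the normal equations."""
    X = [[1.0] + list(r[1:]) for r in rows]  # design matrix with an intercept column
    y = [r[0] for r in rows]
    p = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(p)] for i in range(p)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)


b0, b1, b2 = fit_ols(DATA)
print(f"Sales = {b0:.3f} + ({b1:.3f})(Price) + ({b2:.3f})(Advertising)")
```

The printed equation can be compared with the equation reported in the Excel and Minitab output for the same data.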
Excel Multiple Regression Output
DCOVA
Regression Statistics
Multiple R 0.72213
R Square 0.52148
Adjusted R Square 0.44172
Standard Error 47.46341
Observations 15
Sales = 306.526 - 24.975(Price) + 74.131(Advertising)
ANOVA df SS MS F Significance F
Regression 2 29460.027 14730.013 6.53861 0.01201
Residual 12 27033.306 2252.776
Total 14 56493.333
Minitab Multiple Regression Output
DCOVA
Sales = 306.526 - 24.975(Price) + 74.131(Advertising)
Analysis of Variance
Source DF SS MS F P
Regression 2 29460 14730 6.54 0.012
Residual Error 12 27033 2253
Total 14 56493
The Multiple Regression Equation
DCOVA
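Predicted sales for Price = $5.50 and Advertising = 3.5 ($100s):
Sales = 306.526 - 24.975(5.50) + 74.131(3.5) = 428.62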
Predictions in Excel using PHStat
DCOVA
PHStat | regression | multiple regression …
Check the
“confidence and
prediction interval
estimates” box
Predictions in PHStat
(continued)
DCOVA
Input values:
New Obs  Price  Advertising
1        5.50   3.50
Predicted Ŷ value and interval estimates:
New Obs  Fit    SE Fit  95% CI          95% PI
1        428.6  17.2    (391.1, 466.1)  (318.6, 538.6)
95% CI: confidence interval for the mean value of Y, given these X values
95% PI: prediction interval for an individual Y value, given these X values
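Both intervals can be rebuilt from the reported Fit, SE Fit, and MSE. The sketch below assumes the usual forms, Fit ± t·(SE Fit) for the mean response and Fit ± t·sqrt(MSE + (SE Fit)²) for an individual response, and takes t = 2.1788 (12 d.f., 95%) as the critical value used elsewhere in this chapter. The function name `interval` is illustrative.

```python
import math

fit = 428.6      # predicted Y-hat from the output
se_fit = 17.2    # standard error of the fit from the output
mse = 2252.776   # mean square error from the ANOVA table
t_crit = 2.1788  # t critical value, 95% confidence, n - k - 1 = 12 d.f.


def interval(center, half_width):
    """Return (lower, upper) for center +/- half_width."""
    return (center - half_width, center + half_width)


# Confidence interval for the mean of Y at these X values.
ci = interval(fit, t_crit * se_fit)
# Prediction interval for one individual Y at these X values (adds the error variance).
pi = interval(fit, t_crit * math.sqrt(mse + se_fit ** 2))

print(f"95% CI: ({ci[0]:.1f}, {ci[1]:.1f})")
print(f"95% PI: ({pi[0]:.1f}, {pi[1]:.1f})")
```

The prediction interval is wider because it accounts for the scatter of individual weeks around the mean as well as the uncertainty in the fitted mean.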
Coefficient of
Multiple Determination
DCOVA
Reports the proportion of total variation in Y
explained by all X variables taken together
Multiple Coefficient of
Determination In Excel
DCOVA
Regression Statistics
Multiple R 0.72213
R Square 0.52148
Adjusted R Square 0.44172
Standard Error 47.46341
Observations 15
$r^2 = \frac{SSR}{SST} = \frac{29460.0}{56493.3} = .52148$
52.1% of the variation in pie sales is explained by the variation in price and advertising
ANOVA df SS MS F Significance F
Regression 2 29460.027 14730.013 6.53861 0.01201
Residual 12 27033.306 2252.776
Total 14 56493.333
Multiple Coefficient of
Determination In Minitab
DCOVA
Analysis of Variance
Source DF SS MS F P
Regression 2 29460 14730 6.54 0.012
Residual Error 12 27033 2253
Total 14 56493
52.1% of the variation in pie sales is explained by the variation in price and advertising
Adjusted r2
DCOVA
r2 never decreases when a new X variable is
added to the model
This can be a disadvantage when comparing
models
What is the net effect of adding a new variable?
We lose a degree of freedom when a new X
variable is added
Did the new X variable add enough explanatory power to offset the loss of one degree of freedom?
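The trade-off can be made concrete with the pie sales figures. The sketch below assumes the standard definition of adjusted r2, 1 - (1 - r2)(n - 1)/(n - k - 1), and uses SSR, SST, n and k from the ANOVA table; the function names are illustrative.

```python
def r_squared(ssr, sst):
    """Proportion of total variation in Y explained by the model."""
    return ssr / sst


def adjusted_r_squared(r2, n, k):
    """Penalize r2 for the k independent variables used with n observations."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


r2 = r_squared(ssr=29460.027, sst=56493.333)
adj_r2 = adjusted_r_squared(r2, n=15, k=2)
print(f"r2 = {r2:.5f}, adjusted r2 = {adj_r2:.5f}")
```

Adjusted r2 is smaller than r2 here because the factor (n - 1)/(n - k - 1) = 14/12 inflates the unexplained share before it is subtracted from 1.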
Adjusted r2 in Minitab
DCOVA
Is the Model Significant?
DCOVA
F Test for Overall Significance of the Model
Shows if there is a linear relationship between all
of the X variables considered together and Y
Use F-test statistic
Hypotheses:
H0: β1 = β2 = … = βk = 0 (no linear relationship)
H1: at least one βi ≠ 0 (at least one independent
variable affects Y)
F Test for Overall Significance
DCOVA
Test statistic:
$F_{STAT} = \frac{MSR}{MSE} = \frac{SSR / k}{SSE / (n - k - 1)}$
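As a sketch, the statistic can be computed from the sums of squares in the pie sales ANOVA table. The p-value helper below relies on a closed form that holds only when the numerator has exactly 2 degrees of freedom, which happens to be the case here (k = 2); it is an assumption of this example, not a general-purpose F distribution routine, and both function names are illustrative.

```python
def f_stat(ssr, sse, n, k):
    """Overall F statistic: MSR / MSE."""
    msr = ssr / k
    mse = sse / (n - k - 1)
    return msr / mse


def f_pvalue_two_numerator_df(f, df2):
    """Upper-tail p-value for F with 2 numerator d.f. and df2 denominator d.f.

    Special case only: P(F > f) = (1 + 2 f / df2) ** (-df2 / 2).
    """
    return (1 + 2 * f / df2) ** (-df2 / 2)


f = f_stat(ssr=29460.027, sse=27033.306, n=15, k=2)
p = f_pvalue_two_numerator_df(f, df2=12)
print(f"F = {f:.4f}, p-value = {p:.5f}")
```

For other numerator degrees of freedom a statistics library or an F table would be needed for the p-value.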
F Test for Overall Significance In
Excel
DCOVA
(continued)
Regression Statistics
Multiple R 0.72213
R Square 0.52148
Adjusted R Square 0.44172
Standard Error 47.46341
Observations 15
$F_{STAT} = \frac{MSR}{MSE} = \frac{14730.0}{2252.8} = 6.5386$
With 2 and 12 degrees of freedom
Significance F is the p-value for the F test
ANOVA df SS MS F Significance F
Regression 2 29460.027 14730.013 6.53861 0.01201
Residual 12 27033.306 2252.776
Total 14 56493.333
F Test for Overall Significance In
Minitab
DCOVA
Analysis of Variance
Source DF SS MS F P
Regression 2 29460 14730 6.54 0.012
Residual Error 12 27033 2253
Total 14 56493
F Test for Overall Significance
(continued)
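[Figure: two-variable model over the X1 and X2 axes, showing a sample observation Yi at (x1i, x2i), its fitted value Ŷi, and the residual ei = (Yi – Ŷi)]
The best fit equation is found by minimizing the sum of squared errors, $\sum e^2$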
Multiple Regression Assumptions
DCOVA
Errors (residuals) from the regression model:
ei = (Yi – Ŷi)
Assumptions:
The errors are normally distributed
Residual Plots Used
in Multiple Regression
DCOVA
These residual plots are used in multiple
regression:
Residuals vs. Ŷi
Residuals vs. X1i
Residuals vs. X2i
Residuals vs. time (if time series data)
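The residuals behind these plots can be tabulated directly. The sketch below uses the pie sales data and the rounded coefficients from the regression output, so its sum of squared residuals agrees with the ANOVA residual SS only up to rounding; the plotting itself is left to whatever charting tool is in use, and the names `fitted`, `residuals` and `sse` are illustrative.

```python
# Pie sales data: (sales, price in $, advertising in $100s), weeks 1 to 15.
DATA = [
    (350, 5.50, 3.3), (460, 7.50, 3.3), (350, 8.00, 3.0), (430, 8.00, 4.5),
    (350, 6.80, 3.0), (380, 7.50, 4.0), (430, 4.50, 3.0), (470, 6.40, 3.7),
    (450, 7.00, 3.5), (490, 5.00, 4.0), (340, 7.20, 3.5), (300, 7.90, 3.2),
    (440, 5.90, 4.0), (450, 5.00, 3.5), (300, 7.00, 2.7),
]

# Rounded coefficients from the regression output.
B0, B1, B2 = 306.526, -24.975, 74.131


def fitted(price, advertising):
    """Fitted value Y-hat for one week."""
    return B0 + B1 * price + B2 * advertising


# Residual e_i = Y_i - Y-hat_i for each week.
residuals = [y - fitted(price, adv) for y, price, adv in DATA]
sse = sum(e * e for e in residuals)

# Columns to plot: residual against Y-hat, X1 (price), X2 (advertising), and week.
for week, ((y, price, adv), e) in enumerate(zip(DATA, residuals), start=1):
    print(f"{week:2d}  fitted={fitted(price, adv):7.2f}  price={price:4.2f}  "
          f"adv={adv:3.1f}  residual={e:8.2f}")
print(f"Sum of squared residuals = {sse:.1f}")
```

A pattern in any of these columns when plotted (curvature, a funnel shape, or a trend over weeks) would suggest that a model assumption is violated.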
Are Individual Variables
Significant?
DCOVA
Use t tests of individual variable slopes
Shows if there is a linear relationship between the
variable Xj and Y holding constant the effects of
other X variables
Hypotheses:
H0: βj = 0 (no linear relationship)
H1: βj ≠ 0 (linear relationship does exist
between Xj and Y)
Are Individual Variables
Significant?
(continued)
DCOVA
H0: βj = 0 (no linear relationship)
H1: βj ≠ 0 (linear relationship does exist
between Xj and Y)
Test Statistic:
$t_{STAT} = \frac{b_j - 0}{S_{b_j}}$ (df = n – k – 1)
Are Individual Variables
Significant? Excel Output (continued)
DCOVA
Regression Statistics
Multiple R 0.72213
R Square 0.52148
Adjusted R Square 0.44172
Standard Error 47.46341
Observations 15
t Stat for Price is tSTAT = -2.306, with p-value .0398
t Stat for Advertising is tSTAT = 2.855, with p-value .0145
ANOVA df SS MS F Significance F
Regression 2 29460.027 14730.013 6.53861 0.01201
Residual 12 27033.306 2252.776
Total 14 56493.333
Are Individual Variables Significant?
Minitab Output
DCOVA
t Stat for Advertising is tSTAT = 2.85, with p-value .014
Analysis of Variance
Source DF SS MS F P
Regression 2 29460 14730 6.54 0.012
Residual Error 12 27033 2253
Total 14 56493
Inferences about the Slope:
t Test Example
DCOVA
From the Excel output:
H0: βj = 0
H1: βj ≠ 0
For Price tSTAT = -2.306, with p-value .0398
Confidence interval for the slope: $b_j \pm t_{\alpha/2} S_{b_j}$, where t has (n – k – 1) d.f.
Example: Form a 95% confidence interval for the effect of changes in price
(X1) on pie sales:
-24.975 ± (2.1788)(10.832)
So the interval is (-48.576 , -1.374)
(This interval does not contain zero, so price has a significant effect on sales)
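The t statistic and the interval for the price slope can be reproduced from three numbers in the example: the slope -24.975, its standard error 10.832, and the critical value 2.1788. The sketch below is a plain arithmetic check with illustrative variable names.

```python
b1 = -24.975     # estimated slope for Price
s_b1 = 10.832    # standard error of that slope
t_crit = 2.1788  # t critical value, alpha = .05 two-tailed, 12 d.f.

# Test statistic for H0: beta_1 = 0.
t_stat = (b1 - 0) / s_b1
reject_h0 = abs(t_stat) > t_crit

# 95% confidence interval for the population slope.
half_width = t_crit * s_b1
lower, upper = b1 - half_width, b1 + half_width
contains_zero = lower <= 0 <= upper

print(f"t = {t_stat:.3f}, reject H0: {reject_h0}")
print(f"95% CI: ({lower:.3f}, {upper:.3f}), contains zero: {contains_zero}")
```

The two-tailed test at level α and the (1 - α) confidence interval lead to the same decision: H0 is rejected exactly when the interval excludes zero.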
Confidence Interval Estimate
for the Slope
DCOVA
(continued)
Confidence interval for the population slope βj
Testing Portions of the
Multiple Regression Model
DCOVA
Contribution of a Single Independent Variable Xj
Testing Portions of the
Multiple Regression Model
(continued)
DCOVA
Contribution of a Single Independent Variable Xj,
assuming all other variables are already included
(consider here a 2-variable model):
SSR(X1 | X2) = SSR(all variables) – SSR(X2)
Testing Portions of Model:
Example
(continued)
α = .05, df = 1 and 12
F0.05 = 4.75
ANOVA (For X1 and X2)
            df  SS           MS
Regression  2   29460.02687  14730.01343
Residual    12  27033.30647  2252.775539
Total       14  56493.33333
ANOVA (For X2 only)
            df  SS
Regression  1   17484.22249
Residual    13  39009.11085
Total       14  56493.33333
Testing Portions of Model:
Example (continued)
DCOVA
$F_{STAT} = \frac{SSR(X_1 \mid X_2)}{MSE(all)} = \frac{29{,}460.03 - 17{,}484.22}{2252.78} = 5.316$
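The same calculation as a short sketch, using the regression sums of squares from the two ANOVA tables and the critical value F0.05 = 4.75 for 1 and 12 degrees of freedom. The function name `partial_f` is illustrative.

```python
def partial_f(ssr_full, ssr_reduced, mse_full):
    """Partial F statistic for adding one variable to the reduced model."""
    return (ssr_full - ssr_reduced) / mse_full


ssr_x1_x2 = 29460.027  # SSR with both Price (X1) and Advertising (X2)
ssr_x2 = 17484.222     # SSR with Advertising (X2) only
mse_all = 2252.776     # MSE of the model with both variables
f_crit = 4.75          # F critical value, alpha = .05, df = 1 and 12

f_partial = partial_f(ssr_x1_x2, ssr_x2, mse_all)
price_improves_model = f_partial > f_crit
print(f"Partial F = {f_partial:.3f}, exceeds {f_crit}: {price_improves_model}")
```

Because the statistic exceeds the critical value, adding Price significantly improves a model that already contains Advertising.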
Relationship Between Test
Statistics
DCOVA
The partial F test statistic developed in this section and
the t test statistic are both used to determine the
contribution of an independent variable to a multiple
regression model.
The hypothesis tests associated with these two
statistics always result in the same decision (that is, the
p-values are identical).
$t_{STAT}^2 = F_{STAT}$
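As a numerical check with the pie sales values for Price (the t statistic from the regression output and the partial F statistic from the example), the two agree up to rounding of the reported t statistic:

```latex
t_{STAT}^2 = (-2.306)^2 \approx 5.318
\qquad\text{and}\qquad
F_{STAT} = 5.316
```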
Coefficient of Partial
Determination for k variable model
DCOVA
$r^2_{Yj.(\text{all variables except } j)}$
Coefficient of Partial
Determination in Excel
DCOVA
Coefficients of Partial Determination can be
found using Excel:
Intermediate Calculations
SSR(X1,X2)  29460.02687
SST         56493.33333
SSR(X2)     17484.22249    SSR(X1 | X2)  11975.80438
SSR(X1)     11100.43803    SSR(X2 | X1)  18359.58884
Coefficients
r2 Y1.2     0.307000188
r2 Y2.1     0.404459524
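These coefficients can be rebuilt from the intermediate calculations. The sketch below assumes the standard definition, SSR(Xj | all others) / (SST - SSR(all) + SSR(Xj | all others)), which reproduces the Excel values above; the function name `partial_r2` is illustrative.

```python
def partial_r2(ssr_all, sst, ssr_j_given_others):
    """Coefficient of partial determination for one variable, holding the others constant."""
    return ssr_j_given_others / (sst - ssr_all + ssr_j_given_others)


ssr_all = 29460.02687  # SSR(X1, X2)
sst = 56493.33333
ssr_x1 = 11100.43803   # SSR(X1)
ssr_x2 = 17484.22249   # SSR(X2)

# Contribution of each variable given that the other is already in the model.
ssr_x1_given_x2 = ssr_all - ssr_x2
ssr_x2_given_x1 = ssr_all - ssr_x1

r2_y1_2 = partial_r2(ssr_all, sst, ssr_x1_given_x2)
r2_y2_1 = partial_r2(ssr_all, sst, ssr_x2_given_x1)
print(f"r2 Y1.2 = {r2_y1_2:.6f}, r2 Y2.1 = {r2_y2_1:.6f}")
```

Read this way, about 30.7% of the variation in sales left unexplained by Advertising is explained by Price, and about 40.4% of the variation left unexplained by Price is explained by Advertising.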
Using Dummy Variables
DCOVA
A dummy variable is a categorical independent
variable with two levels:
yes or no, on or off, male or female
coded as 0 or 1
Assumes the slopes associated with numerical
independent variables do not change with the
value for the categorical variable
If more than two levels, the number of dummy
variables needed is (number of levels - 1)
Dummy-Variable Example
(with 2 Levels)
DCOVA
$\hat{Y} = b_0 + b_1 X_1 + b_2 X_2$
Let:
Y = pie sales
X1 = price
X2 = holiday (X2 = 1 if a holiday occurred during the week)
(X2 = 0 if there was no holiday that week)
Dummy-Variable Example
(with 2 Levels) (continued)
DCOVA
Holiday: $\hat{Y} = b_0 + b_1 X_1 + b_2(1) = (b_0 + b_2) + b_1 X_1$
No Holiday: $\hat{Y} = b_0 + b_1 X_1 + b_2(0) = b_0 + b_1 X_1$
Different intercept, same slope
[Figure: Y (sales) against X1, two parallel lines: Holiday (X2 = 1) with intercept b0 + b2 and No Holiday (X2 = 0) with intercept b0]
If H0: β2 = 0 is rejected, then “Holiday” has a significant effect on pie sales
Dummy-Variable Models
(more than 2 Levels)
DCOVA
The number of dummy variables is one less
than the number of levels
Example:
Y = house price ; X1 = square feet
Dummy-Variable Models
(more than 2 Levels)
DCOVA
(continued)
Y = house price
X1 = square feet
X2 = 1 if ranch, 0 otherwise
X3 = 1 if split level, 0 otherwise
$\hat{Y} = b_0 + b_1 X_1 + b_2 X_2 + b_3 X_3$
Interpreting the Dummy Variable
Coefficients (with 3 Levels)
DCOVA
Consider the regression equation:
$\hat{Y} = 20.43 + 0.045 X_1 + 23.53 X_2 + 18.84 X_3$
For a colonial: X2 = X3 = 0
$\hat{Y} = 20.43 + 0.045 X_1$
For a ranch: X2 = 1; X3 = 0
$\hat{Y} = 20.43 + 0.045 X_1 + 23.53$
With the same square feet, a ranch will have an estimated average price of 23.53 thousand dollars more than a colonial.
For a split level: X2 = 0; X3 = 1
$\hat{Y} = 20.43 + 0.045 X_1 + 18.84$
With the same square feet, a split-level will have an estimated average price of 18.84 thousand dollars more than a colonial.
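The three-level coding can be sketched as a lookup from house style to the dummy pair (X2, X3), with colonial as the baseline level. The coefficients are those of the regression equation above, taken with positive signs (an assumption consistent with the "more than a colonial" interpretation), and prices are in thousands of dollars. The names `STYLE_DUMMIES` and `predict_price` and the 2,000 square feet used below are illustrative.

```python
# Two dummy variables encode three styles; colonial is the baseline (0, 0).
STYLE_DUMMIES = {
    "colonial": (0, 0),
    "ranch": (1, 0),
    "split level": (0, 1),
}


def predict_price(square_feet, style):
    """Estimated average house price in thousands of dollars."""
    x2, x3 = STYLE_DUMMIES[style]
    return 20.43 + 0.045 * square_feet + 23.53 * x2 + 18.84 * x3


size = 2000  # hypothetical house size in square feet
base = predict_price(size, "colonial")
for style in STYLE_DUMMIES:
    price = predict_price(size, style)
    print(f"{style:12s} {price:8.2f}  (difference from colonial: {price - base:6.2f})")
```

Whatever size is chosen, the differences from the colonial baseline stay at b2 and b3, because the model assumes the same slope for square feet in every style.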
Interaction Between
Independent Variables
DCOVA
Hypothesizes interaction between pairs of X
variables
Response to one X variable may vary at different
levels of another X variable
$\hat{Y} = b_0 + b_1 X_1 + b_2 X_2 + b_3 X_3 = b_0 + b_1 X_1 + b_2 X_2 + b_3 (X_1 X_2)$
Effect of Interaction
DCOVA
Given: $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_1 X_2 + \varepsilon$
Interaction Example
DCOVA
Suppose X2 is a dummy variable and the estimated
regression equation is Ŷ = 1 + 2X1 + 3X2 + 4X1X2
X2 = 1:
Ŷ = 1 + 2X1 + 3(1) + 4X1(1) = 4 + 6X1
X2 = 0:
Ŷ = 1 + 2X1 + 3(0) + 4X1(0) = 1 + 2X1
[Figure: both lines plotted as Y against X1, for X1 from 0 to 1.5 and Y from 0 to 12]
Slopes are different if the effect of X1 on Y depends on X2 value
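A small sketch of the same example shows how the interaction term changes the slope in X1. The function names `y_hat` and `slope_in_x1` are illustrative.

```python
def y_hat(x1, x2):
    """Estimated regression equation with an interaction term."""
    return 1 + 2 * x1 + 3 * x2 + 4 * x1 * x2


def slope_in_x1(x2):
    """Change in Y-hat per unit increase in X1, at a given value of the dummy X2."""
    return y_hat(1, x2) - y_hat(0, x2)


for x2 in (0, 1):
    print(f"X2 = {x2}: intercept = {y_hat(0, x2)}, slope in X1 = {slope_in_x1(x2)}")
```

With X2 = 0 the line is 1 + 2X1, and with X2 = 1 it is 4 + 6X1: the dummy shifts the intercept by b2 = 3 and the interaction coefficient b3 = 4 is added to the slope.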
Significance of Interaction Term
DCOVA
Can perform a partial F test for the contribution
of a variable to see if the addition of an
interaction term improves the model
Simultaneous Contribution of
Independent Variables
DCOVA
Use partial F test for the simultaneous
contribution of multiple variables to the model
Let m variables be an additional set of variables
added simultaneously
To test the hypothesis that the set of m variables
improves the model:
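A standard form of the partial F statistic for this test is sketched below. It is stated here as an assumption that extends the single-variable partial F test used earlier in the chapter, where m = 1; the numerator averages the added regression sum of squares over the m new variables.

```latex
F_{STAT} = \frac{\left[\, SSR(\text{all}) - SSR(\text{all except new set of } m \text{ variables}) \,\right] / m}{MSE(\text{all})}
```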
Logistic Regression
DCOVA
(continued)
$\text{Odds ratio} = \frac{\text{probability of success}}{1 - \text{probability of success}}$
Logistic Regression
(continued)
DCOVA
Logistic Regression Model:
$\ln(\text{odds ratio}) = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \cdots + \beta_k X_{ki} + \varepsilon_i$
Estimated Odds Ratio and
Probability of Success
DCOVA
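The conversion between probability, odds ratio, and log odds can be sketched as follows. It assumes the usual inverse relationship, estimated probability of success = estimated odds ratio / (1 + estimated odds ratio); the function names and the example probability 0.75 are illustrative and do not come from a fitted model.

```python
import math


def odds(p):
    """Odds ratio for a probability of success p (0 <= p < 1)."""
    return p / (1 - p)


def prob_from_log_odds(log_odds):
    """Estimated probability of success from ln(estimated odds ratio)."""
    estimated_odds = math.exp(log_odds)
    return estimated_odds / (1 + estimated_odds)


p = 0.75  # hypothetical probability of success
log_odds = math.log(odds(p))
print(f"odds = {odds(p):.2f}, ln(odds) = {log_odds:.4f}, "
      f"probability recovered = {prob_from_log_odds(log_odds):.2f}")
```

In a fitted logistic regression, `log_odds` would be the value of b0 + b1X1 + ... + bkXk at the X values of interest.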
Chapter Summary
Developed the multiple regression model
Tested the significance of the multiple regression
model
Discussed adjusted r2
Discussed using residual plots to check model
assumptions
Tested individual regression coefficients
Tested portions of the regression model
Used dummy variables
Evaluated interaction effects
Discussed logistic regression