Chapter 2
12th Edition
Chapter 14
Learning Objectives
In this chapter, you learn:
How to develop a multiple regression model
DCOVA
$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \cdots + \beta_k X_{ki} + \varepsilon_i$
Copyright ©2012 Pearson Education, Inc. publishing as Prentice Hall
Multiple Regression Equation
The coefficients of the multiple regression
model are estimated using sample data
Multiple Regression Equation (continued)
$\hat{Y} = b_0 + b_1 X_1 + b_2 X_2$
[Figure: fitted regression plane over the X1 and X2 axes, with the slope for variable X1 marked]
Example: 2 Independent Variables
A distributor of frozen dessert pies wants to
evaluate factors thought to influence demand
Advertising ($100’s)
Sales = b0 + b1(Price) + b2(Advertising)
R Square            0.52148
Adjusted R Square   0.44172
Standard Error      47.46341
Observations        15

Sales = 306.526 - 24.975(Price) + 74.131(Advertising)

ANOVA
              df   SS          MS          F         Significance F
Regression    2    29460.027   14730.013   6.53861   0.01201
Residual      12   27033.306   2252.776
Total         14   56493.333

              Coefficients   Standard Error   t Stat     P-value   Lower 95%   Upper 95%
Intercept     306.52619      114.25389        2.68285    0.01993   57.58835    555.46404
Price         -24.97509      10.83213         -2.30565   0.03979   -48.57626   -1.37392
Advertising   74.13096       25.96732         2.85478    0.01449   17.55303    130.70888
Analysis of Variance
Source DF SS MS F P
Regression 2 29460 14730 6.54 0.012
Residual Error 12 27033 2253
Total 14 56493
Check the “confidence and prediction interval estimates” box
Predictions in PHStat (continued)
[Screenshot callouts:]
Input values
Predicted Y value ($\hat{Y}$)
Confidence interval for the mean value of Y, given these X values
Minitab
New Obs   Price   Advertising
1         5.50    3.50
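The point prediction for this new observation can be reproduced by hand from the fitted equation. The sketch below is only an illustration: the function name `predict_sales` is hypothetical, and the coefficients are copied from the Excel output for the pie sales example.

```python
# Coefficients copied from the Excel regression output (pie sales example).
B0 = 306.52619            # intercept
B_PRICE = -24.97509       # slope for Price ($)
B_ADVERTISING = 74.13096  # slope for Advertising ($100's)


def predict_sales(price: float, advertising: float) -> float:
    """Point prediction from Sales = b0 + b1(Price) + b2(Advertising)."""
    return B0 + B_PRICE * price + B_ADVERTISING * advertising


# New observation entered in Minitab: Price = 5.50, Advertising = 3.50.
predicted_sales = predict_sales(5.50, 3.50)
print(f"Predicted sales: {predicted_sales:.2f}")
```

This gives only the point estimate (about 428.62 pies). The confidence and prediction intervals reported by PHStat or Minitab also need the original data, which the slides do not list.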
Coefficient of Multiple Determination
Reports the proportion of total
variation in Y explained by all X
variables taken together
$r^2 = \frac{SSR}{SST} = \frac{\text{regression sum of squares}}{\text{total sum of squares}}$
Multiple Coefficient of Determination in Excel
Regression Statistics
Multiple R          0.72213
R Square            0.52148
Adjusted R Square   0.44172
Standard Error      47.46341
Observations        15

$r^2 = \frac{SSR}{SST} = \frac{29460.0}{56493.3} = .52148$
52.1% of the variation in pie sales is explained by the variation in price and advertising
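As a quick check, r² can be recomputed from the two sums of squares in the ANOVA table. This is a minimal sketch; the variable names are illustrative, not from the slides.

```python
# Sums of squares copied from the ANOVA table (pie sales example).
SSR = 29460.027  # regression sum of squares
SST = 56493.333  # total sum of squares

r_squared = SSR / SST  # proportion of variation in Y explained by all X variables
print(f"r^2 = {r_squared:.5f}")
```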
Multiple Coefficient of Determination in Minitab

The regression equation is
Sales = 307 - 25.0 Price + 74.1 Advertising

Predictor   Coef     SE Coef   T       P
Constant    306.50   114.30    2.68    0.020
Price       -24.98   10.83     -2.31   0.040

Source           DF   SS      MS      F      P
Regression       2    29460   14730   6.54   0.012
Residual Error   12   27033   2253
Total            14   56493

$r^2 = \frac{SSR}{SST} = \frac{29460.0}{56493.3} = .52148$
Adjusted r²
r² never decreases when a new X variable is
added to the model
This can be a disadvantage when comparing
models
What is the net effect of adding a new
variable?
variable is added
Did the new X variable add enough
Adjusted r² (continued)
Shows the proportion of variation in Y explained by all X variables,
adjusted for the number of X variables used
$r^2_{adj} = 1 - \left[(1 - r^2)\left(\frac{n - 1}{n - k - 1}\right)\right]$
(where n = sample size, k = number of independent variables)
unimportant independent
variables
Smaller than r²
Useful in comparing among models
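The adjustment, r²adj = 1 − (1 − r²)(n − 1)/(n − k − 1), can be sketched in a few lines. The function name `adjusted_r_squared` is hypothetical; the inputs (r² = 0.52148, n = 15, k = 2) come from the pie sales output.

```python
def adjusted_r_squared(r_squared: float, n: int, k: int) -> float:
    """Adjusted r^2 for n observations and k independent variables."""
    if n - k - 1 <= 0:
        raise ValueError("need n > k + 1 observations")
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)


# Pie sales example: r^2 = 0.52148, n = 15 weeks, k = 2 independent variables.
adj_r_squared = adjusted_r_squared(0.52148, n=15, k=2)
print(f"adjusted r^2 = {adj_r_squared:.5f}")
```

Because (n − 1)/(n − k − 1) is greater than 1 whenever k ≥ 1, the adjusted value falls below r² whenever r² < 1. Here it comes out close to the 0.44172 in the Excel output.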
Adjusted r² in Excel
$r^2_{adj} = .44172$
Regression Statistics
Multiple R          0.72213
R Square            0.52148
Adjusted R Square   0.44172
Adjusted r² in Minitab
Hypotheses:
H0: β1 = β2 = … = βk = 0 (no linear relationship)
H1: at least one βi ≠ 0 (at least one independent variable affects Y)
$F_{STAT} = \frac{MSR}{MSE} = \frac{SSR / k}{SSE / (n - k - 1)}$
The regression equation is
Sales = 307 - 25.0 Price + 74.1 Advertising

Analysis of Variance
Source           DF   SS      MS      F      P
Regression       2    29460   14730   6.54   0.012
Residual Error   12   27033   2253
Total            14   56493

$F_{STAT} = \frac{MSR}{MSE} = \frac{14730.0}{2252.8} = 6.5386$
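The overall F test can be recomputed from the ANOVA sums of squares. The sketch below is an illustration, not the method Excel or Minitab uses internally. The p-value line relies on a closed form for the upper tail of the F distribution that holds only when the numerator degrees of freedom equal 2, which is the case here because k = 2. For other values of k a statistics library would be needed.

```python
# Sums of squares copied from the ANOVA table (pie sales example).
SSR = 29460.027  # regression sum of squares
SSE = 27033.306  # residual (error) sum of squares
n, k = 15, 2     # sample size and number of independent variables

df_regression = k
df_error = n - k - 1

MSR = SSR / df_regression
MSE = SSE / df_error
f_stat = MSR / MSE

# Upper-tail area of F(2, df_error). This closed form is valid ONLY for
# numerator df = 2; it is not a general-purpose p-value formula.
p_value = (1 + 2 * f_stat / df_error) ** (-df_error / 2)

print(f"F_STAT = {f_stat:.4f}, p-value = {p_value:.5f}")
```

At the .05 level the p-value is below α, so H0 is rejected: at least one of the two independent variables affects Y.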
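Residual = $e_i = (Y_i - \hat{Y}_i)$
$\hat{Y} = b_0 + b_1 X_1 + b_2 X_2$
[Figure: a sample observation Yi and its fitted value Ŷi on the regression plane, above the point (x1i, x2i)]
Residual plots used in multiple regression: Residuals vs. $\hat{Y}_i$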
Are Individual Variables Significant?
Test Statistic:
$t_{STAT} = \frac{b_j - 0}{S_{b_j}}$   (df = n – k – 1)
t Stat for Price is tSTAT = -2.306, with p-value .0398
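The t statistics and the 95% confidence limits in the Excel output can be reproduced from each coefficient and its standard error. This is a sketch under one stated assumption: the critical value 2.1788 (two-tailed, α = .05, 12 degrees of freedom) is taken from a standard t table and does not appear in the slides.

```python
# (coefficient b_j, standard error S_bj) pairs copied from the Excel output.
estimates = {
    "Price": (-24.97509, 10.83213),
    "Advertising": (74.13096, 25.96732),
}

# Assumption: two-tailed 5% critical value of t with n - k - 1 = 12 df,
# read from a standard t table (not given in the slides).
T_CRITICAL = 2.1788

results = {}
for name, (b_j, s_bj) in estimates.items():
    t_stat = (b_j - 0) / s_bj        # tests H0: beta_j = 0
    half_width = T_CRITICAL * s_bj   # half-width of the 95% interval
    results[name] = {
        "t_stat": t_stat,
        "ci_95": (b_j - half_width, b_j + half_width),
        "significant": abs(t_stat) > T_CRITICAL,
    }

for name, r in results.items():
    low, high = r["ci_95"]
    print(f"{name}: t = {r['t_stat']:.3f}, 95% CI = ({low:.2f}, {high:.2f})")
```

Neither interval contains 0, which agrees with both p-values being below .05.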
Regression Model (continued)
SSR(X1 | X2) = SSR(all variables) – SSR(X2)
Testing Portions of Model: Example (continued)
H0: X1 (price) does not improve the model with X2 (advertising) included
H1: X1 does improve model
α = .05, df = 1 and 12
F0.05 = 4.75
ANOVA (For X1 and X2)
              df   SS            MS
Regression    2    29460.02687   14730.01343
Residual      12   27033.30647   2252.775539
Total         14   56493.33333

[ANOVA table (For X2 only): df and SS columns]
$F_{STAT} = \frac{SSR(X_1 \mid X_2)}{MSE(\text{all})} = \frac{11975.80}{2252.78} = 5.316$
$t^2_{STAT} = F_{STAT}$
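The partial F test for price can be worked through numerically. In this sketch SSR(X2) = 17484.2229 is the value listed in the intermediate calculations for this example; the remaining figures come from the ANOVA table and the coefficient table. The variable names are illustrative.

```python
SSR_ALL = 29460.027   # SSR with both X1 (price) and X2 (advertising)
SSR_X2 = 17484.2229   # SSR with X2 (advertising) only
MSE_ALL = 2252.776    # MSE of the full model
F_CRITICAL = 4.75     # F(0.05) with 1 and 12 df, as stated for this test

# Contribution of X1 given that X2 is already in the model.
ssr_x1_given_x2 = SSR_ALL - SSR_X2

# One variable is being tested, so the numerator has 1 degree of freedom.
f_stat = ssr_x1_given_x2 / MSE_ALL
reject_h0 = f_stat > F_CRITICAL

# The partial F statistic for one variable equals the square of its t statistic.
t_stat_price = -2.30565
t_squared = t_stat_price ** 2

print(f"SSR(X1 | X2) = {ssr_x1_given_x2:.3f}")
print(f"F_STAT = {f_stat:.3f}, t^2 = {t_squared:.3f}, reject H0: {reject_h0}")
```

Since F_STAT exceeds 4.75, H0 is rejected: adding price improves the model that already contains advertising.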
Coefficient of Partial Determination for k variable model
$r^2_{Yj.(\text{all variables except } j)} = \frac{SSR(X_j \mid \text{all variables except } j)}{SST - SSR(\text{all variables}) + SSR(X_j \mid \text{all variables except } j)}$
Intermediate Calculations
SSR(X1, X2)   29460.027
SST           56493.3333
SSR(X2)       17484.2229     SSR(X1 | X2)   11975.80
SSR(X1)       11100.4383     SSR(X2 | X1)   18359.58

Coefficients
r²Y1.2   0.3070001
r²Y2.1   0.4044595
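Both coefficients of partial determination follow from the intermediate sums of squares, using r²Yj.(others) = SSR(Xj | others) / [SST − SSR(all) + SSR(Xj | others)]. A minimal sketch, with a hypothetical helper name `partial_r_squared`:

```python
def partial_r_squared(ssr_all: float, sst: float, ssr_j_given_others: float) -> float:
    """Coefficient of partial determination for one variable, holding the others constant."""
    return ssr_j_given_others / (sst - ssr_all + ssr_j_given_others)


# Intermediate calculations for the pie sales example.
SSR_ALL = 29460.027
SST = 56493.3333
SSR_X1 = 11100.4383   # price only
SSR_X2 = 17484.2229   # advertising only

ssr_x1_given_x2 = SSR_ALL - SSR_X2
ssr_x2_given_x1 = SSR_ALL - SSR_X1

r2_y1_2 = partial_r_squared(SSR_ALL, SST, ssr_x1_given_x2)
r2_y2_1 = partial_r_squared(SSR_ALL, SST, ssr_x2_given_x1)

print(f"r2_Y1.2 = {r2_y1_2:.4f}")
print(f"r2_Y2.1 = {r2_y2_1:.4f}")
```

Read this way, about 30.7% of the variation in sales is explained by price with advertising held constant, and about 40.4% by advertising with price held constant.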
coded as 0 or 1
Dummy-Variable Example (with 2 Levels)
$\hat{Y} = b_0 + b_1 X_1 + b_2 X_2$
Let:
Y = pie sales
X1 = price
X2 = holiday (X2 = 1 if a holiday occurred during the week;
X2 = 0 if there was no holiday that week)
Dummy-Variable Example (with 2 Levels) (continued)
Holiday: $\hat{Y} = b_0 + b_1 X_1 + b_2(1) = (b_0 + b_2) + b_1 X_1$
No Holiday: $\hat{Y} = b_0 + b_1 X_1 + b_2(0) = b_0 + b_1 X_1$
2 Levels)
Example: Sales = 300 - 30(Price) + 15(Holiday)
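To see what the dummy coefficient does, the example equation can be evaluated with and without a holiday at the same price. The function name and the price of 5.00 are hypothetical values chosen for illustration only.

```python
def predicted_pie_sales(price: float, holiday: int) -> float:
    """Example equation from the slide: Sales = 300 - 30(Price) + 15(Holiday)."""
    if holiday not in (0, 1):
        raise ValueError("holiday must be coded as 0 or 1")
    return 300 - 30 * price + 15 * holiday


price = 5.00  # hypothetical price, for illustration only
with_holiday = predicted_pie_sales(price, holiday=1)
without_holiday = predicted_pie_sales(price, holiday=0)

print(f"Holiday week:     {with_holiday:.1f}")
print(f"Non-holiday week: {without_holiday:.1f}")
print(f"Difference:       {with_holiday - without_holiday:.1f}")
```

The difference is 15 at any price: the dummy variable shifts the intercept and leaves the price slope unchanged.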
Dummy-Variable Models (more than 2 Levels)
The number of dummy variables is one less
than the number of levels
Example:
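One possible way to code a categorical variable with several levels, using one fewer dummy variable than levels, is sketched below. The level names "A", "B", "C" and the function `dummy_code` are hypothetical placeholders and are not taken from the slides.

```python
def dummy_code(value: str, levels: list[str]) -> list[int]:
    """Return len(levels) - 1 indicators; levels[0] is the baseline (all zeros)."""
    if value not in levels:
        raise ValueError(f"unknown level: {value!r}")
    # One 0/1 indicator per non-baseline level.
    return [1 if value == level else 0 for level in levels[1:]]


levels = ["A", "B", "C"]  # hypothetical category with 3 levels -> 2 dummies
for value in levels:
    print(value, dummy_code(value, levels))
```

The baseline level gets all zeros, so its effect is absorbed into the intercept. Each dummy coefficient then measures the difference from that baseline.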
Interaction Between Independent Variables
Hypothesizes interaction between pairs of X
variables
Response to one X variable may vary at different
$\hat{Y} = b_0 + b_1 X_1 + b_2 X_2 + b_3 X_3 = b_0 + b_1 X_1 + b_2 X_2 + b_3 (X_1 X_2)$
Effect of Interaction
$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_1 X_2 + \varepsilon$
Given: Y = 1 + 2X1 + 3X2 + 4X1X2
X2 = 1: Y = 1 + 2X1 + 3(1) + 4X1(1) = 4 + 6X1
X2 = 0: Y = 1 + 2X1 + 3(0) + 4X1(0) = 1 + 2X1
[Figure: the two lines plotted against X1 for X1 from 0 to 1.5]
Slopes are different if the effect of X1 on Y depends on X2 value
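The change in slope can be checked directly by evaluating Y = 1 + 2X1 + 3X2 + 4X1X2 at the two values of X2. A minimal sketch; the function names are illustrative.

```python
def y(x1: float, x2: float) -> float:
    """Example model with an interaction term: Y = 1 + 2*X1 + 3*X2 + 4*X1*X2."""
    return 1 + 2 * x1 + 3 * x2 + 4 * x1 * x2


def slope_in_x1(x2: float) -> float:
    """Change in Y for a one-unit increase in X1, holding X2 fixed."""
    return y(1, x2) - y(0, x2)


for x2 in (0, 1):
    print(f"X2 = {x2}: intercept = {y(0, x2)}, slope in X1 = {slope_in_x1(x2)}")
```

With X2 = 0 the line is 1 + 2X1 and with X2 = 1 it is 4 + 6X1, so the effect of X1 on Y depends on the value of X2.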
Significance of Interaction Term
Can perform a partial F test for the contribution of a variable
to see if the addition of an interaction term improves the model
added simultaneously
To test the hypothesis that the set of m variables
Logistic Regression
Used when the dependent variable Y is binary
(i.e., Y takes on only two values)
Examples
Logistic Regression (continued)
Logistic regression is based on the odds ratio,
which represents the probability of a
success compared with the probability
of failure
$\text{Odds ratio} = \frac{\text{probability of success}}{1 - \text{probability of success}}$
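The odds ratio, as defined on this slide, is a one-line calculation. The probabilities used below are hypothetical inputs chosen for illustration.

```python
def odds_ratio(p_success: float) -> float:
    """Probability of success divided by probability of failure."""
    if not 0 <= p_success < 1:
        raise ValueError("probability of success must be in [0, 1)")
    return p_success / (1 - p_success)


# Hypothetical probabilities, for illustration only.
for p in (0.25, 0.50, 0.75):
    print(f"P(success) = {p:.2f} -> odds ratio = {odds_ratio(p):.3f}")
```

A probability of 0.75 gives an odds ratio of 3: success is three times as likely as failure.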
Chapter Summary
Developed the multiple regression
model
Tested the significance of the multiple regression
model
Discussed adjusted r²