Chapter 9
Correlation and
Regression
Copyright © 2015, 2012, and 2009 Pearson Education, Inc. 1
Chapter Outline
• 9.1 Correlation
• 9.2 Linear Regression
• 9.3 Measures of Regression and Prediction Intervals
• 9.4 Multiple Regression
Section 9.1
Correlation
Section 9.1 Objectives
• An introduction to linear correlation, independent and
dependent variables, and the types of correlation
• How to find a correlation coefficient
• How to test a population correlation coefficient ρ
using a table
• How to perform a hypothesis test for a population
correlation coefficient ρ
• How to distinguish between correlation and causation
Correlation
Correlation
• A relationship between two variables.
• The data can be represented by ordered pairs (x, y)
▪ x is the independent (or explanatory) variable
▪ y is the dependent (or response) variable
Correlation
A scatter plot can be used to determine whether a
linear (straight line) correlation exists between two
variables.
Example:
x    1    2    3    4    5
y   –4   –2   –1    0    2
[Scatter plot of the five (x, y) pairs]
Types of Correlation
[Four scatter plots illustrating the types of correlation]
• Negative linear correlation: as x increases, y tends to decrease.
• Positive linear correlation: as x increases, y tends to increase.
• No correlation
• Nonlinear correlation
Example: Constructing a Scatter Plot
An economist wants to determine whether there is a linear relationship
between a country’s gross domestic product (GDP) and carbon dioxide (CO2)
emissions. The data are shown in the table. Display the data in a
scatter plot and determine whether there appears to be a positive or
negative linear correlation or no linear correlation. (Source: World
Bank and U.S. Energy Information Administration)
GDP (trillions of $), x   CO2 emissions (millions of metric tons), y
1.6                       428.2
3.6                       828.8
4.9                       1214.2
1.1                       444.6
0.9                       264.0
2.9                       415.3
2.7                       571.8
2.3                       454.9
1.6                       358.7
1.5                       573.5
Solution: Constructing a Scatter Plot
Appears to be a positive linear correlation. As the
gross domestic products increase, the carbon dioxide
emissions tend to increase.
Example: Constructing a Scatter Plot
Using Technology
Old Faithful, located in Yellowstone National Park, is the
world’s most famous geyser. The duration (in minutes) of several of
Old Faithful’s eruptions and the times (in minutes) until the next
eruption are shown in the table. Using a TI-83/84, display the data
in a scatter plot. Determine the type of correlation.
Duration, x   Time, y   Duration, x   Time, y
1.80          56        3.78          79
1.82          58        3.83          85
1.90          62        3.88          80
1.93          56        4.10          89
1.98          57        4.27          90
2.05          57        4.30          89
2.13          60        4.43          89
2.30          57        4.47          86
2.37          61        4.53          89
2.82          73        4.55          86
3.13          76        4.60          92
3.27          77        4.63          91
3.65          77
Solution: Constructing a Scatter Plot
Using Technology
From the scatter plot, it appears that the variables have a
positive linear correlation. As the durations of the
eruptions increase, the times until the next eruption tend
to increase.
Correlation Coefficient
Correlation coefficient
• A measure of the strength and the direction of a linear
relationship between two variables.
• The symbol r represents the sample correlation
coefficient.
• A formula for r is
$$
r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{n\sum x^2 - (\sum x)^2}\,\sqrt{n\sum y^2 - (\sum y)^2}}
$$
where n is the number of data pairs.
• The population correlation coefficient is represented
by ρ (rho).
Correlation Coefficient
• The range of the correlation coefficient is −1 to 1.
▪ If r = −1, there is a perfect negative correlation.
▪ If r is close to 0, there is no linear correlation.
▪ If r = 1, there is a perfect positive correlation.
Linear Correlation
[Four scatter plots illustrating the strength of linear correlation]
• r = −0.91: strong negative correlation
• r = 0.88: strong positive correlation
• r = 0.42: weak positive correlation
• r = 0.07: no correlation
Calculating a Correlation Coefficient
In Words                                   In Symbols
1. Find the sum of the x-values.           Σx
2. Find the sum of the y-values.           Σy
3. Multiply each x-value by its            Σxy
   corresponding y-value and find the sum.
Calculating a Correlation Coefficient
In Words                                   In Symbols
4. Square each x-value and find the sum.   Σx²
5. Square each y-value and find the sum.   Σy²
6. Use these five sums to calculate the correlation coefficient:
$$
r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{n\sum x^2 - (\sum x)^2}\,\sqrt{n\sum y^2 - (\sum y)^2}}
$$
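The six steps above can be sketched in Python. This is a minimal illustration rather than part of the original slides: the function name `correlation_coefficient` is an assumption, and the data are the GDP and CO2 values used in this section's examples.

```python
import math

def correlation_coefficient(xs, ys):
    """Sample correlation coefficient r, computed from the five sums."""
    n = len(xs)                                    # number of data pairs
    sum_x = sum(xs)                                # step 1: sum of x-values
    sum_y = sum(ys)                                # step 2: sum of y-values
    sum_xy = sum(x * y for x, y in zip(xs, ys))    # step 3: sum of products
    sum_x2 = sum(x * x for x in xs)                # step 4: sum of squared x-values
    sum_y2 = sum(y * y for y in ys)                # step 5: sum of squared y-values
    # Step 6: combine the five sums.
    numerator = n * sum_xy - sum_x * sum_y
    denominator = (math.sqrt(n * sum_x2 - sum_x ** 2)
                   * math.sqrt(n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# GDP (trillions of $) and CO2 emissions (millions of metric tons)
gdp = [1.6, 3.6, 4.9, 1.1, 0.9, 2.9, 2.7, 2.3, 1.6, 1.5]
co2 = [428.2, 828.8, 1214.2, 444.6, 264.0, 415.3, 571.8, 454.9, 358.7, 573.5]

r = correlation_coefficient(gdp, co2)
print(round(r, 3))
```

The printed value should agree with the r ≈ 0.882 obtained by hand in the worked solution.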
Example: Calculating the Correlation
Coefficient
Calculate the correlation coefficient for the gross
domestic products and carbon dioxide emissions data. What
can you conclude?
GDP (trillions of $), x   CO2 emissions (millions of metric tons), y
1.6                       428.2
3.6                       828.8
4.9                       1214.2
1.1                       444.6
0.9                       264.0
2.9                       415.3
2.7                       571.8
2.3                       454.9
1.6                       358.7
1.5                       573.5
Solution: Calculating the Correlation
Coefficient
x y xy x² y²
1.6 428.2 685.12 2.56 183,355.24
3.6 828.8 2983.68 12.96 686,909.44
4.9 1214.2 5949.58 24.01 1,474,281.64
1.1 444.6 489.06 1.21 197,669.16
0.9 264.0 237.6 0.81 69,696
2.9 415.3 1204.37 8.41 172,474.09
2.7 571.8 1543.86 7.29 326,955.24
2.3 454.9 1046.27 5.29 206,934.01
1.6 358.7 573.92 2.56 128,665.69
1.5 573.5 860.25 2.25 328,902.25
Σx = 23.1 Σy = 5554 Σxy = 15,573.71 Σx² = 67.35 Σy² = 3,775,842.76
Solution: Calculating the Correlation
Coefficient
Σx = 23.1 Σy = 5554 Σxy = 15,573.71 Σx² = 67.35 Σy² = 3,775,842.76
$$
\begin{aligned}
r &= \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{n\sum x^2 - (\sum x)^2}\,\sqrt{n\sum y^2 - (\sum y)^2}} \\
  &= \frac{10(15{,}573.71) - (23.1)(5554)}{\sqrt{10(67.35) - 23.1^2}\,\sqrt{10(3{,}775{,}842.76) - 5554^2}} \\
  &= \frac{27{,}439.7}{\sqrt{139.89}\,\sqrt{6{,}911{,}511.6}} \approx 0.882
\end{aligned}
$$
r ≈ 0.882 suggests a strong positive linear correlation. As
the gross domestic product increases, the carbon dioxide
emissions tend to increase.
Example: Using Technology to Find a
Correlation Coefficient
Use a technology tool to calculate
the correlation coefficient for the
Old Faithful data. What can you
conclude?
Duration, x   Time, y   Duration, x   Time, y
1.80          56        3.78          79
1.82          58        3.83          85
1.90          62        3.88          80
1.93          56        4.10          89
1.98          57        4.27          90
2.05          57        4.30          89
2.13          60        4.43          89
2.30          57        4.47          86
2.37          61        4.53          89
2.82          73        4.55          86
3.13          76        4.60          92
3.27          77        4.63          91
3.65          77
Solution: Using Technology to Find a
Correlation Coefficient
To calculate r on a TI-83/84, first enter the DiagnosticOn
command found in the Catalog menu, then use STAT > Calc.
r ≈ 0.979 suggests a strong positive correlation.
Using a Table to Test a Population
Correlation Coefficient ρ
• Once the sample correlation coefficient r has been
calculated, we need to determine whether there is
enough evidence to decide that the population
correlation coefficient ρ is significant at a specified
level of significance.
• Use Table 11 in Appendix B.
• If |r| is greater than the critical value, there is enough
evidence to decide that the correlation coefficient ρ is
significant.
Using a Table to Test a Population
Correlation Coefficient ρ
• Determine whether ρ is significant for five pairs of
data (n = 5) at a level of significance of α = 0.01.
[Excerpt from Table 11: rows give the number of pairs of data in the
sample, n; columns give the level of significance, α]
• If |r| > 0.959, the correlation is significant. Otherwise,
there is not enough evidence to conclude that the
correlation is significant.
Using a Table to Test a Population
Correlation Coefficient ρ
In Words                                   In Symbols
1. Determine the number of pairs of        Determine n.
   data in the sample.
2. Specify the level of significance.      Identify α.
3. Find the critical value.                Use Table 11 in Appendix B.
Using a Table to Test a Population
Correlation Coefficient ρ
In Words                                   In Symbols
4. Decide if the correlation is            If |r| > critical value, the correlation is
   significant.                            significant. Otherwise, there is not enough
                                           evidence to support that the correlation
                                           is significant.
5. Interpret the decision in the context
   of the original claim.
Example: Using a Table to Test a
Population Correlation Coefficient ρ
Using the Old Faithful data, you
used 25 pairs of data to find
r ≈ 0.979. Is the correlation
coefficient significant? Use
α = 0.05.
Duration, x   Time, y   Duration, x   Time, y
1.80          56        3.78          79
1.82          58        3.83          85
1.90          62        3.88          80
1.93          56        4.10          89
1.98          57        4.27          90
2.05          57        4.30          89
2.13          60        4.43          89
2.30          57        4.47          86
2.37          61        4.53          89
2.82          73        4.55          86
3.13          76        4.60          92
3.27          77        4.63          91
3.65          77
Solution: Using a Table to Test a
Population Correlation Coefficient ρ
• n = 25, α = 0.05
• |r| ≈ 0.979 > 0.396
• There is enough evidence
at the 5% level of
significance to conclude
that there is a significant
linear correlation between
the duration of Old
Faithful’s eruptions and the
time between eruptions.
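The table method reduces to a single comparison, which can be sketched as follows. The function name `is_significant` is an assumption, and the critical values passed in are the ones quoted in this section from Table 11; the code does not compute them.

```python
def is_significant(r, critical_value):
    """Table method: the correlation is significant when |r| exceeds the critical value."""
    return abs(r) > critical_value

# Old Faithful: n = 25, alpha = 0.05, critical value 0.396 (quoted above)
print(is_significant(0.979, 0.396))

# Hypothetical r = -0.90 with n = 5, alpha = 0.01, critical value 0.959 (quoted above)
print(is_significant(-0.90, 0.959))
```

Taking the absolute value means the same check works for negative correlations.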
Hypothesis Testing for a Population
Correlation Coefficient ρ
• A hypothesis test can also be used to determine
whether the sample correlation coefficient r provides
enough evidence to conclude that the population
correlation coefficient ρ is significant at a specified
level of significance.
• A hypothesis test can be one-tailed or two-tailed.
Hypothesis Testing for a Population
Correlation Coefficient ρ
• Left-tailed test
H0: ρ ≥ 0 (no significant negative correlation)
Ha: ρ < 0 (significant negative correlation)
• Right-tailed test
H0: ρ ≤ 0 (no significant positive correlation)
Ha: ρ > 0 (significant positive correlation)
• Two-tailed test
H0: ρ = 0 (no significant correlation)
Ha: ρ ≠ 0 (significant correlation)
The t-Test for the Correlation Coefficient
• Can be used to test whether the correlation between
two variables is significant.
• The test statistic is r.
• The standardized test statistic
$$
t = \frac{r}{\sigma_r} = \frac{r}{\sqrt{\dfrac{1 - r^2}{n - 2}}}
$$
follows a t-distribution with d.f. = n – 2.
• In this text, only two-tailed hypothesis tests for ρ are
considered.
Using the t-Test for ρ
In Words                                   In Symbols
1. State the null and alternative          State H0 and Ha.
   hypotheses.
2. Specify the level of significance.      Identify α.
3. Identify the degrees of freedom.        d.f. = n – 2
4. Determine the critical value(s) and     Use Table 5 in Appendix B.
   rejection region(s).
Using the t-Test for ρ
In Words                                   In Symbols
5. Find the standardized test statistic.   $t = \dfrac{r}{\sqrt{\dfrac{1 - r^2}{n - 2}}}$
6. Make a decision to reject or fail to    If t is in the rejection region, reject H0.
   reject the null hypothesis.             Otherwise, fail to reject H0.
7. Interpret the decision in the context
   of the original claim.
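Steps 3 to 6 of the two-tailed t-test can be sketched in Python. The function names are assumptions, and the critical value is supplied by the caller from a t-table; the value 2.306 used here is the standard two-tailed entry for d.f. = 8 and α = 0.05, which is not quoted in the slides.

```python
import math

def t_statistic(r, n):
    """Standardized test statistic t = r / sqrt((1 - r^2) / (n - 2))."""
    return r / math.sqrt((1 - r ** 2) / (n - 2))

def two_tailed_decision(r, n, critical_t):
    """Return (t, reject), where reject is True when t falls in a rejection region."""
    t = t_statistic(r, n)
    reject = abs(t) > critical_t   # rejection regions: t < -critical_t or t > critical_t
    return t, reject

# GDP and CO2 example: r = 0.882 from n = 10 pairs, so d.f. = 8.
# critical_t = 2.306 is an assumed table lookup (two-tailed, alpha = 0.05).
t, reject = two_tailed_decision(r=0.882, n=10, critical_t=2.306)
print(round(t, 3), reject)
```

Keeping the critical value as an argument avoids hard-coding a table and makes the lookup step explicit.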
Example: t-Test for a Correlation
Coefficient
Previously you calculated
r ≈ 0.882. Test the significance
of this correlation coefficient.
Use α = 0.05.
GDP (trillions of $), x   CO2 emissions (millions of metric tons), y
1.6                       428.2
3.6                       828.8
4.9                       1214.2
1.1                       444.6
0.9                       264.0
2.9                       415.3
2.7                       571.8
2.3                       454.9
1.6                       358.7
1.5                       573.5
Solution: t-Test for a Correlation
Coefficient
• H0: ρ = 0
• Ha: ρ ≠ 0
• α = 0.05
• d.f. = 10 – 2 = 8
• Rejection Region: t < −2.306 or t > 2.306
• Test Statistic:
$$
t = \frac{0.882}{\sqrt{\dfrac{1 - (0.882)^2}{10 - 2}}} \approx 5.294
$$
• Decision: Reject H0
At the 5% level of significance,
there is enough evidence to
conclude that there is a
significant linear correlation
between gross domestic products
and carbon dioxide emissions.
Correlation and Causation
• The fact that two variables are strongly correlated
does not in itself imply a cause-and-effect
relationship between the variables.
• If there is a significant correlation between two
variables, you should consider the following
possibilities.
1. Is there a direct cause-and-effect relationship
between the variables?
• Does x cause y?
Correlation and Causation
2. Is there a reverse cause-and-effect relationship
between the variables?
• Does y cause x?
3. Is it possible that the relationship between the
variables can be caused by a third variable or by a
combination of several other variables?
4. Is it possible that the relationship between two
variables may be a coincidence?
Section 9.1 Summary
• Introduced linear correlation, independent and
dependent variables, and the types of correlation
• Found a correlation coefficient
• Tested a population correlation coefficient ρ using a
table
• Performed a hypothesis test for a population
correlation coefficient ρ
• Distinguished between correlation and causation
Section 9.2
Linear Regression
Section 9.2 Objectives
• How to find the equation of a regression line
• How to predict y-values using a regression equation
Regression Lines (1 of 2)
• After verifying that the linear correlation between two
variables is significant, next we determine the equation of
the line that best models the data (regression line).
• Can be used to predict the value of y for a given value of
x.
Residuals
Residual
• The difference between the observed y-value and the
predicted y-value for a given x-value on the line.
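A residual is simply observed minus predicted, which a short Python sketch makes concrete. The function name `residuals` is an assumption; the line used is the regression equation ŷ = 196.152x + 102.289 given later in this section for the GDP and CO2 data.

```python
def residuals(xs, ys, slope, intercept):
    """Observed y minus predicted y for each data pair."""
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]

# GDP (trillions of $) and CO2 emissions (millions of metric tons)
gdp = [1.6, 3.6, 4.9, 1.1, 0.9, 2.9, 2.7, 2.3, 1.6, 1.5]
co2 = [428.2, 828.8, 1214.2, 444.6, 264.0, 415.3, 571.8, 454.9, 358.7, 573.5]

res = residuals(gdp, co2, slope=196.152, intercept=102.289)
for x, e in zip(gdp, res):
    print(f"x = {x}: residual = {e:.2f}")

# For a least-squares line the residuals sum to roughly zero
# (not exactly here, because the coefficients are rounded).
print(f"sum of residuals = {sum(res):.4f}")
```

Positive residuals correspond to points above the line and negative residuals to points below it.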
Regression Lines (2 of 2)
Regression line (line of best fit)
• The line for which the sum of the squares of the residuals
is a minimum.
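• The equation of a regression line for an independent
variable x and a dependent variable y is ŷ = mx + b, where ŷ is the
predicted y-value for a given x-value, m is the slope, and b is the
y-intercept.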
The Equation of a Regression Line
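For reference, the standard least-squares formulas for the slope m and the y-intercept b can be written with the same sums used for r in Section 9.1. This is a sketch of the usual textbook form, with x̄ and ȳ denoting the means of the x- and y-values, rather than a reproduction of the original slide.

```latex
\hat{y} = mx + b, \qquad
m = \frac{n\sum xy - (\sum x)(\sum y)}{n\sum x^2 - (\sum x)^2}, \qquad
b = \bar{y} - m\bar{x} = \frac{\sum y}{n} - m\,\frac{\sum x}{n}
```

A consequence of the formula for b is that the regression line always passes through the point (x̄, ȳ).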
Example: Finding the Equation of a
Regression Line (1 of 4)
Find the equation of the
regression line for the gross
domestic products and
carbon dioxide emissions
data.
GDP (trillions of $), x   CO2 emissions (millions of metric tons), y
1.6                       428.2
3.6                       828.8
4.9                       1214.2
1.1                       444.6
0.9                       264.0
2.9                       415.3
2.7                       571.8
2.3                       454.9
1.6                       358.7
1.5                       573.5
Example: Finding the Equation of a
Regression Line (2 of 4)
Solution
Recall from section 9.1:
x y xy x² y²
1.6 428.2 685.12 2.56 183,355.24
3.6 828.8 2983.68 12.96 686,909.44
4.9 1214.2 5949.58 24.01 1,474,281.64
1.1 444.6 489.06 1.21 197,669.16
0.9 264.0 237.6 0.81 69,696
2.9 415.3 1204.37 8.41 172,474.09
2.7 571.8 1543.86 7.29 326,955.24
2.3 454.9 1046.27 5.29 206,934.01
1.6 358.7 573.92 2.56 128,665.69
1.5 573.5 860.25 2.25 328,902.25
Σx = 23.1 Σy = 5554 Σxy = 15,573.71 Σx² = 67.35 Σy² = 3,775,842.76
Example: Finding the Equation of a
Regression Line (3 of 4)
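The slope and y-intercept can be computed from the sums in the table. The sketch below uses the standard least-squares formulas m = (nΣxy − ΣxΣy) / (nΣx² − (Σx)²) and b = Σy/n − m·Σx/n; the function name `regression_line` is an assumption.

```python
def regression_line(xs, ys):
    """Least-squares slope m and y-intercept b computed from the sums."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    b = sum_y / n - m * sum_x / n   # equivalent to y-bar minus m times x-bar
    return m, b

# GDP (trillions of $) and CO2 emissions (millions of metric tons)
gdp = [1.6, 3.6, 4.9, 1.1, 0.9, 2.9, 2.7, 2.3, 1.6, 1.5]
co2 = [428.2, 828.8, 1214.2, 444.6, 264.0, 415.3, 571.8, 454.9, 358.7, 573.5]

m, b = regression_line(gdp, co2)
print(f"y-hat = {m:.3f}x + {b:.3f}")
```

The result should match the equation ŷ = 196.152x + 102.289 used in the prediction example later in this section.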
Example: Finding the Equation of a
Regression Line (4 of 4)
• To sketch the regression line, use any two x-values within
the range of the data and calculate the corresponding y-
values from the regression line.
Example: Using Technology to Find
a Regression Equation (1 of 2)
Use a technology tool to find Duration Time, Duration Time,
the equation of the regression x y x y
line for the Old Faithful data. 1.80 56 3.78 79
1.82 58 3.83 85
1.90 62 3.88 80
1.93 56 4.10 89
1.98 57 4.27 90
2.05 57 4.30 89
2.13 60 4.43 89
2.30 57 4.47 86
2.37 61 4.53 89
2.82 73 4.55 86
3.13 76 4.60 92
3.27 77 4.63 91
3.65 77
Example: Using Technology to Find
a Regression Equation (2 of 2)
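As one possible technology tool, plain Python can stand in for the calculator here. The sketch below is an assumption about tooling, not the method shown on the original slide; it fits the Old Faithful data with the same least-squares formulas and also reports r.

```python
import math

# Old Faithful: eruption duration (min) and time until the next eruption (min)
durations = [1.80, 1.82, 1.90, 1.93, 1.98, 2.05, 2.13, 2.30, 2.37, 2.82, 3.13, 3.27, 3.65,
             3.78, 3.83, 3.88, 4.10, 4.27, 4.30, 4.43, 4.47, 4.53, 4.55, 4.60, 4.63]
times = [56, 58, 62, 56, 57, 57, 60, 57, 61, 73, 76, 77, 77,
         79, 85, 80, 89, 90, 89, 89, 86, 89, 86, 92, 91]

def fit_line(xs, ys):
    """Return (slope, intercept, r) for the least-squares line through the data."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    s_xy = n * sum_xy - sum_x * sum_y
    s_xx = n * sum_x2 - sum_x ** 2
    s_yy = n * sum_y2 - sum_y ** 2
    slope = s_xy / s_xx
    intercept = sum_y / n - slope * sum_x / n
    r = s_xy / (math.sqrt(s_xx) * math.sqrt(s_yy))
    return slope, intercept, r

slope, intercept, r = fit_line(durations, times)
print(f"y-hat = {slope:.3f}x + {intercept:.3f}, r = {r:.3f}")
```

The reported r can be compared with the r ≈ 0.979 found for these data in Section 9.1.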
Example: Predicting y-Values Using
Regression Equations (1 of 3)
The regression equation for the gross domestic products (in
trillions of dollars) and carbon dioxide emissions (in millions
of metric tons) data is ŷ = 196.152x + 102.289. Use this
equation to predict the expected carbon dioxide emissions for
the following gross domestic products. (Recall from section
9.1 that x and y have a significant linear correlation.)
1. 1.2 trillion dollars
2. 2.0 trillion dollars
3. 2.5 trillion dollars
Example: Predicting y-Values Using
Regression Equations (2 of 3)
Solution
ŷ = 196.152x + 102.289
1. 1.2 trillion dollars
ŷ =196.152(1.2) + 102.289 ≈ 337.671
When the gross domestic product is $1.2 trillion, the CO2 emissions
are about 337.671 million metric tons.
2. 2.0 trillion dollars
ŷ =196.152(2.0) + 102.289 ≈ 494.593
When the gross domestic product is $2.0 trillion, the CO2 emissions
are about 494.593 million metric tons.
Example: Predicting y-Values Using
Regression Equations (3 of 3)
3. 2.5 trillion dollars
ŷ =196.152(2.5) + 102.289 ≈ 592.669
When the gross domestic product is $2.5 trillion, the CO2
emissions are about 592.669 million metric tons.
Prediction values are meaningful only for x-values in (or close
to) the range of the data. The x-values in the original data set
range from 0.9 to 4.9. So, it would not be appropriate to use
the regression line to predict carbon dioxide emissions for
gross domestic products such as $0.2 trillion or $14.5 trillion.
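The three predictions and the range restriction can be sketched together. The function name `predict_co2` and the strict range check are assumptions; the slides allow x-values "in (or close to)" the data range, so the hard cutoff below is a simplification.

```python
def predict_co2(gdp_trillions, x_min=0.9, x_max=4.9):
    """Predict CO2 emissions (millions of metric tons) from GDP (trillions of $).

    Uses y-hat = 196.152x + 102.289 and refuses x-values outside the range
    of the original data (0.9 to 4.9), a stricter rule than the slides state.
    """
    if not x_min <= gdp_trillions <= x_max:
        raise ValueError(
            f"GDP of {gdp_trillions} is outside the data range [{x_min}, {x_max}]"
        )
    return 196.152 * gdp_trillions + 102.289

for gdp in (1.2, 2.0, 2.5):
    print(f"GDP ${gdp} trillion -> about {predict_co2(gdp):.3f} million metric tons")
```

Calling `predict_co2(14.5)` raises `ValueError`, which mirrors the warning against extrapolating far beyond the data.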
Section 9.2 Summary
• Found the equation of a regression line
• Predicted y-values using a regression equation
Section 9.3
Measures of Regression
and Prediction Intervals
Section 9.3 Objectives
• How to interpret the three types of variation about a
regression line
• How to find and interpret the coefficient of determination
Variation About a Regression Line (1 of 4)
• Three types of variation about a regression line
▪ Total variation
▪ Explained variation
▪ Unexplained variation
• To find the total variation, you must first calculate
▪ The total deviation
▪ The explained deviation
▪ The unexplained deviation
Variation About a Regression Line (2 of 4)
Variation About a Regression Line (3 of 4)
Variation About a Regression Line (4 of 4)
Unexplained variation
• The sum of the squares of the differences between the y-
value of each ordered pair and each corresponding
predicted y-value.
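The three variations can be computed directly from their usual definitions: total variation Σ(yᵢ − ȳ)², explained variation Σ(ŷᵢ − ȳ)², and unexplained variation Σ(yᵢ − ŷᵢ)². The sketch below assumes the function name `variations` and uses the GDP and CO2 data with the regression equation ŷ = 196.152x + 102.289 from Section 9.2.

```python
def variations(xs, ys, slope, intercept):
    """Return (total, explained, unexplained) variation about the regression line."""
    y_mean = sum(ys) / len(ys)
    predicted = [slope * x + intercept for x in xs]
    total = sum((y - y_mean) ** 2 for y in ys)                     # sum of (y - y-bar)^2
    explained = sum((p - y_mean) ** 2 for p in predicted)          # sum of (y-hat - y-bar)^2
    unexplained = sum((y - p) ** 2 for y, p in zip(ys, predicted)) # sum of (y - y-hat)^2
    return total, explained, unexplained

# GDP (trillions of $) and CO2 emissions (millions of metric tons)
gdp = [1.6, 3.6, 4.9, 1.1, 0.9, 2.9, 2.7, 2.3, 1.6, 1.5]
co2 = [428.2, 828.8, 1214.2, 444.6, 264.0, 415.3, 571.8, 454.9, 358.7, 573.5]

total, explained, unexplained = variations(gdp, co2, slope=196.152, intercept=102.289)
print(f"total       = {total:.2f}")
print(f"explained   = {explained:.2f}")
print(f"unexplained = {unexplained:.2f}")
print(f"explained + unexplained = {explained + unexplained:.2f}")
```

For the least-squares line, total variation equals explained plus unexplained variation; the small discrepancy in the printed values comes from the rounded coefficients.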
Coefficient of Determination
Coefficient of determination
• The ratio of the explained variation to the total variation.
• Denoted by r²
Example: Coefficient of
Determination
The correlation coefficient for the gross domestic products and
carbon dioxide emissions data is r ≈ 0.882. Find the coefficient of
determination. What does this tell you about the explained
variation of the data about the regression line? About the
unexplained variation?
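One way to work this example is to square the correlation coefficient and read the result as a percentage. The sketch below uses r ≈ 0.882, the value calculated in Section 9.1; the variable names are assumptions.

```python
r = 0.882                              # correlation coefficient from Section 9.1
r_squared = r ** 2                     # coefficient of determination
explained_pct = 100 * r_squared        # share of variation explained by the line
unexplained_pct = 100 - explained_pct  # share left unexplained

print(f"r^2 = {r_squared:.3f}")
print(f"explained: {explained_pct:.1f}%, unexplained: {unexplained_pct:.1f}%")
```

The explained percentage is the share of the variation in CO2 emissions accounted for by the relationship with GDP; the remainder is due to other factors or to chance.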
Section 9.3 Summary
• Interpreted the three types of variation about a regression
line
• Found and interpreted the coefficient of determination
Section 9.4
Multiple Regression
Section 9.4 Objectives
• Use technology to find a multiple regression
equation, the standard error of estimate and the
coefficient of determination
• Use a multiple regression equation to predict y-values
Multiple Regression Equation
• In many instances, a better prediction can be found
for a dependent (response) variable by using more
than one independent (explanatory) variable.
• For example, a more accurate prediction for the
carbon dioxide emissions discussed in previous
sections might be made by considering the number of
cars as well as the gross domestic product.
Multiple Regression Equation
Multiple regression equation
• ŷ = b + m1x1 + m2x2 + m3x3 + … + mkxk
• x1, x2, x3,…, xk are independent variables
• b is the y-intercept
• y is the dependent variable
* Because the mathematics associated with this concept is
complicated, technology is generally used to calculate the
multiple regression equation.
Example: Finding a Multiple Regression
Equation
A researcher wants to determine how employee salaries
at a certain company are related to the length of
employment, previous experience, and education. The
researcher selects eight employees from the company
and obtains the data shown on the next slide. Use
MINITAB to find a multiple regression equation that
models the data.
Example: Finding a Multiple Regression
Equation
Employee   Salary, y   Employment (yrs), x1   Experience (yrs), x2   Education (yrs), x3
A 57,310 10 2 16
B 57,380 5 6 16
C 54,135 3 1 12
D 56,985 6 5 14
E 58,715 8 8 16
F 60,620 20 0 12
G 59,200 8 4 18
H 60,320 14 6 17
Solution: Finding a Multiple Regression
Equation
• Enter the y-values in C1 and the x1-, x2-, and x3-
values in C2, C3 and C4 respectively.
• Select “Regression > Regression…” from the Stat
menu.
• Use the salaries as the response variable and the
remaining data as the predictors.
Solution: Finding a Multiple Regression
Equation
The regression equation is
ŷ = 49,764 + 364x1 + 228x2 + 267x3
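The slides use MINITAB for this fit. As a rough cross-check, the same least-squares coefficients can be obtained by solving the normal equations in plain Python. This is a swapped-in technique, not MINITAB's procedure, and the function names `solve_linear_system` and `fit_multiple_regression` are assumptions.

```python
def solve_linear_system(a, b):
    """Solve a·x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda i: abs(aug[i][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        # Eliminate this column from every other row.
        for row in range(n):
            if row != col:
                factor = aug[row][col]
                aug[row] = [v - factor * p for v, p in zip(aug[row], aug[col])]
    return [aug[i][n] for i in range(n)]

def fit_multiple_regression(rows, ys):
    """Return [b, m1, ..., mk] from the normal equations (X^T X) beta = X^T y."""
    design = [[1.0] + list(row) for row in rows]       # leading 1 for the intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve_linear_system(xtx, xty)

# Employees A to H: salary, then (employment, experience, education) in years
salaries = [57310, 57380, 54135, 56985, 58715, 60620, 59200, 60320]
predictors = [(10, 2, 16), (5, 6, 16), (3, 1, 12), (6, 5, 14),
              (8, 8, 16), (20, 0, 12), (8, 4, 18), (14, 6, 17)]

b, m1, m2, m3 = fit_multiple_regression(predictors, salaries)
print(f"y-hat = {b:.0f} + {m1:.0f}x1 + {m2:.0f}x2 + {m3:.0f}x3")
```

The printed coefficients can be compared with the MINITAB equation above; small differences in the last digit would come from rounding.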
Predicting y-Values
• After finding the equation of the multiple regression
line, you can use the equation to predict y-values over
the range of the data.
• To predict y-values, substitute the given value for
each independent variable into the equation, then
calculate ŷ.
Example: Predicting y-Values
Use the regression equation
ŷ = 49,764 + 364x1 + 228x2 + 267x3
to predict an employee’s salary given 12 years of
current employment, 5 years of experience, and 16
years of education.
Solution:
ŷ = 49,764 + 364(12) + 228(5) + 267(16)
= 59,544
The employee’s predicted salary is $59,544.
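The substitution above can be wrapped in a small function. The name `predict_salary` is an assumption; the coefficients are those of the regression equation ŷ = 49,764 + 364x1 + 228x2 + 267x3.

```python
def predict_salary(employment_years, experience_years, education_years):
    """Predicted salary from y-hat = 49,764 + 364*x1 + 228*x2 + 267*x3."""
    return (49764
            + 364 * employment_years
            + 228 * experience_years
            + 267 * education_years)

# 12 years of current employment, 5 years of experience, 16 years of education
print(predict_salary(12, 5, 16))  # 59544
```

As with simple regression, predictions are meaningful only for inputs within the range of the data used to fit the equation.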
Section 9.4 Summary
• Used technology to find a multiple regression
equation, the standard error of estimate and the
coefficient of determination
• Used a multiple regression equation to predict y-
values