Inference in the Normal Regression Model
Dr. Frank Wood
Remember
• Last class we derived the sampling variance of the estimator of the slope:
  σ²{b1} = σ² / Σ(Xi − X̄)²
• And we made the point that an estimate of σ²{b1} can be arrived at by substituting the MSE for the unknown error variance:
  s²{b1} = MSE / Σ(Xi − X̄)² = [SSE/(n − 2)] / Σ(Xi − X̄)²
Sampling Distribution of (b1 − β1)/s{b1}
• We determined that b1 is normally distributed, so (b1 − β1)/σ{b1} is a standard normal variable
• We don’t know σ{b1}, so it must be estimated from the data. We have already denoted its estimate s{b1}
• Using this estimate, it can be shown that
  (b1 − β1)/s{b1} ∼ t(n − 2), where s{b1} = √(s²{b1})
Where does this come from?
• We need to rely upon the following theorem
  – For the normal error regression model,
    SSE/σ² = Σ(Yi − Ŷi)²/σ² ∼ χ²(n − 2)
    and is independent of b0 and b1
• Intuitively this follows from the standard result for the sum of squared standard normal random variables
  – Here the two linear constraints imposed by the regression parameter estimation each reduce the number of degrees of freedom by one
Another useful fact: the t distribution
• Let z and χ²(ν) be independent random variables (standard normal N(0,1) and chi-square with ν degrees of freedom, respectively). We then define a t random variable as follows:
  t(ν) = z / √(χ²(ν)/ν)
This version of the t distribution has one
parameter, the degrees of freedom ν
Distribution of the studentized statistic
• To derive the distribution of this statistic using the provided theorems, first we do the following rewrite, in which the numerator is a standard normal N(0,1) variable:
  (b1 − β1)/s{b1} = [(b1 − β1)/σ{b1}] / [s{b1}/σ{b1}]
  with
  s{b1}/σ{b1} = √(s²{b1}/σ²{b1})
Studentized statistic cont.
• And note the following:
  s²{b1}/σ²{b1} = [MSE/Σ(Xi − X̄)²] / [σ²/Σ(Xi − X̄)²] = MSE/σ² = SSE/[σ²(n − 2)]
  where we know (by the given theorem) that the last term is a scaled χ² variable independent of b1 and b0:
  SSE/[σ²(n − 2)] ∼ χ²(n − 2)/(n − 2)
Studentized statistic final
• But by the given definition of the t distribution we have our result:
  (b1 − β1)/s{b1} ∼ z / √(χ²(n − 2)/(n − 2))
  because, putting everything together, we can see that
  (b1 − β1)/s{b1} ∼ t(n − 2)
Confidence Intervals and Hypothesis Tests
• Now that we know the sampling distribution of b1 (t with n − 2 degrees of freedom) we can construct confidence intervals and hypothesis tests easily
Confidence Interval for β1
• Since the “studentized” statistic follows a t distribution we can make the following probability statement:
  P(t(α/2; n − 2) ≤ (b1 − β1)/s{b1} ≤ t(1 − α/2; n − 2)) = 1 − α
[Figure: t distribution with ν = 10 — PDF, CDF, and inverse CDF (ICDF)]
Interval arising from picking α
• Note that by symmetry
t(α/2; n − 2) = −t(1 − α/2; n − 2)
• Rearranging terms and using this fact we
have
P (b1 − t(1 − α/2; n − 2)s{b1 } ≤ β1 ≤ b1 + t(1 − α/2; n − 2)s{b1 }) = 1 − α
• And now we can use a table to look up and
produce confidence intervals
Using tables for Computing Intervals
• The tables in the book (Table B.2 in the appendix) give t(1-α/2; ν) where
  – P{t(ν) ≤ t(1-α/2; ν)} = 1-α/2
• This provides the inverse CDF of the t distribution
• It can be arrived at computationally as well
  – Matlab: tinv(1-α/2, ν)
1-α confidence limits for β1
• The 1-α confidence limits for β1 are
  b1 ± t(1 − α/2; n − 2)s{b1}
• Note that this quantity can be used to calculate confidence intervals given n and α.
  – Fixing α can guide the choice of sample size if a particular confidence interval width is desired
  – Given a sample size, vice versa
• Also useful for hypothesis testing
Show demo.m
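A minimal Matlab sketch of this computation (in the spirit of demo.m, which is not reproduced here; it assumes the data are already loaded as column vectors X and Y, and uses tinv from the Statistics Toolbox):

  % 1-alpha confidence limits for beta_1
  n     = length(X);
  Xd    = X - mean(X);                           % deviations of X from its mean
  b1    = sum(Xd .* (Y - mean(Y))) / sum(Xd.^2); % least squares slope
  b0    = mean(Y) - b1*mean(X);                  % least squares intercept
  SSE   = sum((Y - (b0 + b1*X)).^2);             % error sum of squares
  MSE   = SSE / (n - 2);                         % estimate of sigma^2
  s_b1  = sqrt(MSE / sum(Xd.^2));                % estimated std. dev. of b1
  alpha = 0.05;
  tval  = tinv(1 - alpha/2, n - 2);              % inverse CDF of t(n-2)
  CI_b1 = [b1 - tval*s_b1, b1 + tval*s_b1]       % 1-alpha confidence limits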
Tests Concerning β1
• Example 1
– Two-sided test
• H0 : β1 = 0
• Ha : β1 ≠ 0
• Test statistic
  t* = (b1 − 0)/s{b1}
Tests Concerning β1
• We have an estimate of the sampling
distribution of b1 from the data.
• If the null hypothesis holds then the b1
estimate coming from the data should be
within the 95% confidence interval of the
sampling distribution centered at 0 (in this
case)
  t* = (b1 − 0)/s{b1}
Decision rules
if |t*| ≤ t(1 − α/2; n − 2), conclude H0
if |t*| > t(1 − α/2; n − 2), conclude Ha
• Absolute values make the test two-sided
Intuition
[Figure: t density with the 1-α confidence interval and the test statistic marked; x-axis (β̂ − β)/σ̂. The p-value is the value of α that moves the green line to the blue line.]
Calculating the p-value
• The p-value, or attained significance level, is
the smallest level of significance α for which
the observed data indicate that the null
hypothesis should be rejected.
• This can be looked up using the CDF of the
test statistic.
• In Matlab
– Two-sided p-value
• 2*(1-tcdf(|t*|,ν))
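A small continuation of the earlier Matlab sketch (reusing b1, s_b1, n, and alpha from above):

  tstar  = (b1 - 0) / s_b1;                        % test statistic under H0: beta1 = 0
  pval   = 2*(1 - tcdf(abs(tstar), n - 2));        % two-sided p-value
  reject = abs(tstar) > tinv(1 - alpha/2, n - 2)   % equivalent decision rule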
Inferences Concerning β0
• Largely, inference procedures regarding β0 can be performed in the same way as those for β1
• Remember the point estimator b0 for β0:
  b0 = Ȳ − b1X̄
Sampling distribution of b0
• The sampling distribution of b0 refers to the
different values of b0 that would be obtained
with repeated sampling when the levels of the
predictor variable X are held constant from
sample to sample.
• For the normal regression model the
sampling distribution of b0 is normal
Sampling distribution of b0
• When the error variance is known:
  E(b0) = β0
  σ²{b0} = σ²(1/n + X̄²/Σ(Xi − X̄)²)
• When the error variance is unknown:
  s²{b0} = MSE(1/n + X̄²/Σ(Xi − X̄)²)
Confidence interval for β0
• The 1-α confidence limits for β0 are obtained in the same manner as those for β1:
b0 ± t(1 − α/2; n − 2)s{b0 }
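Continuing the Matlab sketch (b0, MSE, Xd, n, and tval as computed earlier):

  s_b0  = sqrt(MSE * (1/n + mean(X)^2 / sum(Xd.^2)));  % estimated std. dev. of b0
  CI_b0 = [b0 - tval*s_b0, b0 + tval*s_b0]             % 1-alpha limits for beta_0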
Considerations on Inferences on β0 and β1
• Effects of departures from normality
– The estimators of β0 and β1 have the property of
asymptotic normality – their distributions
approach normality as the sample size increases
(under general conditions)
• Spacing of the X levels
– The variances of b0 and b1 (for a given n and σ)
depend strongly on the spacing of X
Sampling distribution of point estimator of mean response
• Let Xh be the level of X for which we would
like an estimate of the mean response
– Xh may be one of the observed X’s or any other value of the predictor within the scope of the model
• The mean response when X=Xh is denoted by
E{Yh}
• The point estimator of E{Yh} is
Ŷh = b0 + b1 Xh
We are interested in the sampling distribution
of this quantity
Sampling Distribution of Ŷh
• We have
Ŷh = b0 + b1 Xh
• Since this quantity is itself a linear combination of the Yi’s, its sampling distribution is itself normal.
• The mean of the sampling distribution is
E{Ŷh } = E{b0 } + E{b1 }Xh = β0 + β1 Xh
Biased or unbiased?
Sampling Distribution of Ŷh
• To derive the sampling distribution variance
of the mean response we first show that b1
and (1/n) ∑ Yi are uncorrelated and, hence,
for the normal error regression model
independent
• We start with the definitions
  Ȳ = (1/n) Σ Yi
  b1 = Σ ki Yi,  ki = (Xi − X̄)/Σ(Xi − X̄)²
Sampling Distribution of Ŷh
• We want to show that the mean of Y (Ȳ) and the estimate b1 are uncorrelated:
  Cov(Ȳ, b1) = σ²{Ȳ, b1} = 0
• To do this we need the following result (A.32)
  σ²{Σi ai Yi, Σi ci Yi} = Σi ai ci σ²{Yi}
when the Yi are independent
Sampling Distribution of Ŷh
• Using this fact we have
  σ²{(1/n) Σ Yi, Σ ki Yi} = Σ (1/n) ki σ²{Yi}   (from the appendix result)
  = (σ²/n) Σ ki
  = 0   since Σ ki = 0
So the mean of Y and b1 are uncorrelated
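This hinges on Σ ki = 0, which holds because Σ (Xi − X̄) = 0; a quick numerical check in Matlab with hypothetical X values:

  X = [1; 3; 4; 7; 10];                        % hypothetical predictor values
  k = (X - mean(X)) ./ sum((X - mean(X)).^2);  % the ki coefficients
  sum(k)                                       % zero up to floating point error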
Sampling Distribution of Ŷh
• This means that we can write down the variance
  σ²{Ŷh} = σ²{Ȳ + b1(Xh − X̄)}
  (using the alternative and equivalent form of the regression function)
• But we know that the mean of Y and b1 are uncorrelated, so
  σ²{Ŷh} = σ²{Ȳ} + σ²{b1}(Xh − X̄)²
Sampling Distribution of Ŷh
• We know (from last lecture)
  σ²{b1} = σ²/Σ(Xi − X̄)²
  s²{b1} = MSE/Σ(Xi − X̄)²
• And we can find
  σ²{Ȳ} = (1/n²) σ²{Σ Yi} = nσ²/n² = σ²/n
Sampling Distribution of Ŷh
• So, plugging in, we get
  σ²{Ŷh} = σ²/n + [σ²/Σ(Xi − X̄)²](Xh − X̄)²
• Or
  σ²{Ŷh} = σ²(1/n + (Xh − X̄)²/Σ(Xi − X̄)²)
Sampling Distribution of Ŷh
• Since we often won’t know σ², we can, as usual, plug in s² = SSE/(n − 2), our estimate of it, to get our estimate of this sampling distribution variance:
  s²{Ŷh} = s²(1/n + (Xh − X̄)²/Σ(Xi − X̄)²)
No surprise…
• The studentized point estimator for the output is distributed as a t distribution with n − 2 degrees of freedom:
  (Ŷh − E{Yh})/s{Ŷh} ∼ t(n − 2)
• This means that we can construct confidence
intervals in the same manner as before.
Confidence Intervals for E{Yh}
• The 1-α confidence intervals for E{Yh} are
Ŷh ± t(1 − α/2; n − 2)s{Ŷh }
• From this hypothesis tests can be constructed
as usual.
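In the running Matlab sketch, the interval at a hypothetical level Xh = 5 would be:

  Xh    = 5;                                   % hypothetical level of X
  Yh    = b0 + b1*Xh;                          % point estimate of E{Yh}
  s_Yh  = sqrt(MSE * (1/n + (Xh - mean(X))^2 / sum(Xd.^2)));
  CI_Yh = [Yh - tval*s_Yh, Yh + tval*s_Yh]     % 1-alpha limits for E{Yh}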
Comments
• The variance of the estimator for E{Yh} is
smallest near the mean of X. Designing
studies such that the mean of X is near Xh will
improve inference precision
• When Xh is zero the variance of the estimator for E{Yh} reduces to the variance of the estimator b0 for β0
Prediction interval for single new observation
• Essentially follows the sampling distribution
arguments for E{Yh}
• If all regression parameters are known then
the 1-α prediction interval for a new
observation Yh is
E{Yh } ± z(1 − α/2)σ
Prediction interval for single new observation
• If the regression parameters are unknown the 1-α
prediction interval for a new observation Yh is given
by the following theorem
Ŷh ± t(1 − α/2; n − 2)s{pred}
• This is very nearly the same as prediction for a known value of X, but it includes a correction for the additional variability arising from the fact that the new input location was not used in the original estimates of b1, b0, and s²
Prediction interval for single new observation
• The value of s²{pred} is given by
  s²{pred} = MSE(1 + 1/n + (Xh − X̄)²/Σ(Xi − X̄)²)
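Completing the Matlab sketch (same hypothetical Xh as before):

  s_pred = sqrt(MSE * (1 + 1/n + (Xh - mean(X))^2 / sum(Xd.^2)));
  PI     = [Yh - tval*s_pred, Yh + tval*s_pred]   % wider than the CI for E{Yh}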