Chapter 2

Chapter 2 of 'Statistical Foundations of Business Analytics' focuses on inference and hypothesis testing, detailing the process of testing hypotheses about parameters (β) using test statistics. It explains univariate and multivariate restrictions, the construction of decision rules to minimize errors, and the significance of p-values and confidence intervals. The chapter concludes by emphasizing the importance of assumptions such as exogeneity and homoscedasticity in making valid inferences from estimators.


Statistical Foundations of Business Analytics

Chapter 2: Inference and Hypothesis Testing

Tim Ederer

Mini 2, 2024
Tepper Business School
Inference Through Hypothesis Testing

As discussed in Chapter 1, we only observe β̂, but we ultimately care about β


• β̂ could be very far off from β!
• When making inference from an estimator we are bound to make errors

Need a systematic way to quantify risk of being wrong when we make claims about β

Framework we will use: hypothesis testing


• Posit a hypothesis about β, called H0
• Construct what we call a test statistic from β̂
• Determine rule for rejecting H0 that minimizes the probability of making a mistake

Hypothesis

There are several types of hypotheses about β that we can test

Univariate restrictions on β: H0 : βk = b
• Most common example: H0 : βk = 0
• Also called significance test

Multivariate restrictions on β: H0 : Rβ = b
• Equality of parameters: H0 : βk − βj = 0
• Joint significance: H0 : (βk , βj )′ = 0

Univariate Restrictions
Test of Univariate Restrictions

We want to test H0 : βk = b

How can we use β̂k to construct a decision rule such that


• We minimize the rate of type I errors: P(reject H0 |H0 is true)
• We minimize the rate of type II errors: P(not reject H0 |H0 is false)

We typically fix the rate of type I errors at α


• α is also called the level of a test
• The result of a test is often considered convincing if α ≤ 5%

Test of Univariate Restrictions: Procedure

If H0 is true then βk = b and we know the distribution of β̂k (for large n)

β̂k ∼ N (b, Var(β̂k ))

We just need to find a rejection region such that


• P(β̂k ∈ rejection region|H0 is true) = α = 5%
• P(β̂k ∉ rejection region | H0 is false) is small
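The logic of fixing the type I error rate can be checked by simulation (not from the slides): if β̂k is drawn from its distribution under H0 and we reject whenever β̂k falls more than z0.975 standard errors from b, we should reject about 5% of the time. A minimal Python sketch, using the slide's later example b = 2 and Var(β̂k) = 1; the sample size and seed are arbitrary choices:

```python
import random
from statistics import NormalDist

random.seed(0)
b, sd = 2.0, 1.0
z = NormalDist().inv_cdf(0.975)   # z_{0.975} ≈ 1.96 for alpha = 5%

# Draw beta_hat_k under H0: beta_hat_k ~ N(b, sd^2)
draws = [random.gauss(b, sd) for _ in range(100_000)]

# Two-sided rejection region: beta_hat_k further than z*sd from b
reject_rate = sum(abs(d - b) > z * sd for d in draws) / len(draws)
print(round(reject_rate, 3))      # close to 0.05 by construction
```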
Test of Univariate Restrictions: Visual Example

Assume b = 2 and Var(β̂k ) = 1; the distribution of β̂k under H0 then looks like this

Test of Univariate Linear Restrictions: Visual Example

For which values of β̂k should we reject H0 ?


• Imagine that β̂k = −0.5, should we reject H0 or is this just an unlucky draw?

Test of Univariate Restrictions: Visual Example

Proposition: reject H0 if β̂k is in the shaded red area


• P(β̂k ∈ red area) = 5%

Test of Univariate Restrictions: Visual Example
Why not this red area?
• We still have P(β̂k ∈ red area) = 5%

Problem: P(β̂k ∉ rejection region | H0 is false) is very high
• We would not reject even if β̂k = −1000
Test of Univariate Restrictions: t-statistic

More generally we can define what we call the t-statistic:

t̂ = (β̂k − b) / s.e.(β̂k ) = (β̂k − b) / √( σ̂² (X ′ X )⁻¹(k,k) )

Under H0 we have that t̂ ∼ N (0, 1) for large n


• For small n and under normal errors, t̂ ∼ tn−K
• Focus on the case with large n in this course

Rejection rule: t̂ > z1−α/2 or t̂ < zα/2


• zx is the x quantile of the standard normal distribution
• zα/2 = −z1−α/2 =⇒ rewrite rejection rule as |t̂| > z1−α/2
• You can verify that P(|t̂| > z1−α/2 |βk = b) = α under H0
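The decision rule above can be sketched in a few lines of Python; `t_test` is a hypothetical helper name, and the large-n normal approximation from the slide is assumed:

```python
from statistics import NormalDist

def t_test(beta_hat_k: float, b: float, se: float, alpha: float = 0.05):
    """Two-sided test of H0: beta_k = b using the large-n N(0,1) approximation."""
    t = (beta_hat_k - b) / se
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{1-alpha/2}
    return t, abs(t) > z                      # (t-hat, reject H0?)

print(t_test(0.5, 0.0, 0.3))   # |t| ≈ 1.67 < 1.96 -> do not reject
```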

Illustration

Example of Chapter 1: lwagei = β1 + β2 educi + εi


• β̂2 = 0.060 and s.e.(β̂2 ) = √( σ̂² (X ′ X )⁻¹(2,2) ) = 0.006

Question: can we reject H0 : β2 = 0 at α = 5% level?

Steps to follow
• Compute |t̂| = 0.060/0.006 = 10
• Compute z1−α/2 = z0.975 = 1.96
• Decision: |t̂| > z1−α/2 =⇒ we reject H0
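The three steps above translate directly to Python (a sketch using the slide's numbers; the quantile comes from the stdlib normal distribution):

```python
from statistics import NormalDist

beta_hat, se = 0.060, 0.006      # estimates from the Chapter 1 wage regression
t = abs(beta_hat / se)           # H0: beta_2 = 0, so b = 0
z = NormalDist().inv_cdf(0.975)  # z_{0.975} ≈ 1.96

print(t > z)                     # |t| ≈ 10 > 1.96 -> reject H0
```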

P-value of a Test

What is the lowest α such that we would still reject H0 ?


• Assume that |t̂| = 10 as in previous example
• α = 1% =⇒ z1−α/2 = z0.995 = 2.58 =⇒ we still reject H0
• Need to find α such that |t̂| = z1−α/2 (it is < 0.00000001%!!)

This is called the p-value


• If α < p-value we would not reject H0
• If α ≥ p-value we would reject H0
• The lower the p-value, the more confident we can be in rejecting H0
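Under the large-n normal approximation, the two-sided p-value is P(|Z| > |t̂|) for Z ∼ N(0, 1). A minimal sketch (`p_value` is a hypothetical helper name):

```python
from statistics import NormalDist

def p_value(t: float) -> float:
    """Two-sided p-value under the large-n N(0,1) approximation."""
    return 2 * (1 - NormalDist().cdf(abs(t)))

print(p_value(1.0))   # ~0.317 -> cannot reject at the 5% level
print(p_value(2.0))   # ~0.046 -> reject at the 5% level
```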

Confidence Intervals

What are the values of b such that we would not reject H0 : βk = b?


• Find values of b such that |t̂| < z1−α/2
• We do not reject if b ∈ [ β̂k − z1−α/2 × s.e.(β̂k ), β̂k + z1−α/2 × s.e.(β̂k ) ]

This is called a confidence interval


• With probability 1 − α (e.g. 95% for α = 5%), the confidence interval includes the true βk
• The smaller α is the larger the confidence interval is
• Ideally we want a narrow confidence interval for small α
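The interval above is straightforward to compute; a sketch (`conf_interval` is a hypothetical helper name), using the β̂k = 2, s.e. = 2 example that appears later in the chapter:

```python
from statistics import NormalDist

def conf_interval(beta_hat: float, se: float, alpha: float = 0.05):
    """(1 - alpha) confidence interval under the large-n normal approximation."""
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{1-alpha/2}
    return beta_hat - z * se, beta_hat + z * se

lo, hi = conf_interval(2.0, 2.0)
print(round(lo, 2), round(hi, 2))   # -1.92 5.92
```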

Remark

Very important: not rejecting H0 does not mean that H0 is true!


• Very common mistake: not rejecting H0 : βk = 0 =⇒ βk = 0
• The confidence interval for βk can include 0 but be very large!

Example: β̂k = 2 and s.e.(β̂k ) = 2
• Confidence interval for α = 5%: [−1.92, 5.92]
• Inference is very limited if confidence interval is large

Multivariate Restrictions
Test of Multivariate Restrictions

Consider a more general set of hypotheses: H0 : Rβ = b


• Assume l is the number of restrictions we want to test jointly
• R is a (l × K ) matrix and Rβ is a (l × 1) vector

Examples
• Test of joint significance: R = IK (the K × K identity matrix) and b = 0 =⇒ H0 : β = 0
• Test of equality of coefficients: R = (1 −1 0 . . . 0) and b = 0 =⇒ H0 : β1 = β2
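These two R matrices are easy to build explicitly; a numpy sketch with a hypothetical K = 4 and an illustrative β:

```python
import numpy as np

K = 4  # hypothetical number of parameters

# Joint significance of all K parameters: R = I_K, b = 0  (l = K restrictions)
R_joint = np.eye(K)
b_joint = np.zeros(K)

# Equality of the first two coefficients: R = (1, -1, 0, ..., 0), b = 0  (l = 1)
R_eq = np.zeros((1, K))
R_eq[0, 0], R_eq[0, 1] = 1.0, -1.0
b_eq = np.zeros(1)

beta = np.array([0.5, 0.5, 1.0, -2.0])
print(R_joint @ beta - b_joint)  # the full vector beta: zero iff beta = 0
print(R_eq @ beta - b_eq)        # [0.] -> beta_1 = beta_2 holds for this beta
```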

How do we build a test statistic for more than one restriction?

Wald Test

Wald test can accommodate many restrictions

Ŵ = (Rβ̂ − b)′ [ Var(Rβ̂ − b | X) ]⁻¹ (Rβ̂ − b)
  = (Rβ̂ − b)′ [ σ̂² R(X ′ X )⁻¹R ′ ]⁻¹ (Rβ̂ − b)

Under H0 we have that Ŵ ∼ χ²l for large n
• For small n and with normal errors use F̂ = Ŵ/l, where F̂ ∼ Fl,n−K (called the F-test statistic)
• Focus on cases with large n for this course

Rejection rule: Ŵ > χ²l,1−α
• χ²l,1−α is the 1 − α quantile of a chi-square distribution with l degrees of freedom
• You can verify that P(Ŵ > χ²l,1−α ) = α under H0
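The Wald statistic can be computed directly from the formula above. A sketch on simulated data (the design, sample size, and seed are arbitrary choices; 5.991 is the standard χ² critical value for l = 2 and α = 5%):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a linear model and test H0: beta_2 = beta_3 = 0 (l = 2 restrictions)
n, K = 500, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
beta_true = np.array([1.0, 0.0, 0.0])            # H0 is true here
y = X @ beta_true + rng.normal(size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)     # OLS
resid = y - X @ beta_hat
sigma2_hat = resid @ resid / (n - K)

R = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])                  # picks out beta_2, beta_3
b = np.zeros(2)
d = R @ beta_hat - b
V = sigma2_hat * R @ np.linalg.inv(X.T @ X) @ R.T
W = d @ np.linalg.solve(V, d)                    # Wald statistic

print(W > 5.991)   # 5.991 ≈ chi2 quantile, l = 2, alpha = 5%
```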
P-Value and Confidence Region

P-Value
• Lowest value of α such that you reject H0 : Rβ = b
• The p-value is the α such that Ŵ = χ²l,1−α

Confidence region for β


• Set of values b such that you do not reject H0 : β = b
• { b : Ŵ ≤ χ²K,1−α }

Using Tests for Model Selection

Tests of multivariate restriction can be used for model selection


• Assume we have the following model: yi = β0 + xi′ β1 + ϵi
• We want to test if our model is “useful”: H0 : β1 = 0

One can link the Wald test to measures of “goodness of fit”


• Share of variance of yi explained by explanatory variables xi is called R²
• R² = 1 − Σⁿi=1 (yi − β̂0 − xi′ β̂1 )² / Σⁿi=1 (yi − ȳ )² = 1 − RSS/TSS

• One can show that:

Ŵ = R² / [ (1 − R²)/(n − K ) ]
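The identity linking Ŵ and R² can be verified numerically for the test H0 : β1 = 0 (all slopes zero, intercept unrestricted). A sketch on simulated data (the design and seed are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)

n, K = 400, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
y = X @ np.array([0.5, 1.0, -1.0]) + rng.normal(size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)     # OLS
resid = y - X @ beta_hat
RSS, TSS = resid @ resid, ((y - y.mean()) ** 2).sum()
R2 = 1 - RSS / TSS

# Wald statistic for H0: all K-1 slopes are zero (l = K - 1 restrictions)
sigma2_hat = RSS / (n - K)
R = np.hstack([np.zeros((K - 1, 1)), np.eye(K - 1)])   # drops the intercept
d = R @ beta_hat
W = d @ np.linalg.solve(sigma2_hat * R @ np.linalg.inv(X.T @ X) @ R.T, d)

print(np.isclose(W, R2 / ((1 - R2) / (n - K))))        # True
```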
Recap

Summary
• We know how to test univariate and multivariate hypotheses about β
• We know how to construct confidence intervals/regions for β

We are now equipped to make inference about β from our estimator β̂!
• But all this relies on EXO, RANK, IID and HOMOSKEDASTICITY

Next
• Chapter 3: what should we do when HOMOSKEDASTICITY fails?
• Chapter 4: what should we do when EXO fails?

