
Econometrics II

Week 4
Heteroscedasticity
Chapter 8
“Heteroscedasticity is a systematic change in the spread of the
residuals over the range of measured values.”
Topics in Heteroscedasticity
Consequences of heteroscedasticity
Observed residuals
For each coefficient: calculating a robust variance using White standard errors, which can then be used for t-tests and confidence intervals
For hypothesis tests on multiple coefficients: a robust F-test (not shown) and a robust LM test (shown)
Testing for heteroscedasticity: F-statistic, Breusch-Pagan (BP) test (LM-based), White test, and modified White test
Dealing with heteroscedasticity
Violation of homoscedasticity
If the homoscedasticity assumption is violated, this creates a number of problems for inference. If Var(u|x) is not constant:
1. The usual t statistics and confidence intervals are invalid for OLS estimation, no matter how large the sample size is.
2. The usual large-sample (asymptotic) justification for OLS inference no longer applies, because the usual variance formula is wrong under heteroscedasticity.
3. We can show that there are estimators with smaller asymptotic variance (more efficient) than OLS.

So, for the usual asymptotic inference to be valid, homoscedasticity is important.

Back to residual plots
Think about the distribution of the residuals when regressing consumption on income: cone/fan-shaped residuals (spread growing with income) are the classic visual sign of heteroscedasticity.
Does heteroscedasticity affect:
Unbiasedness of regression coefficients
Consistency of regression coefficients
Calculation of R^2 and adjusted R^2 (HW question)
Variances (standard errors) of estimators
Confidence intervals and t-statistics of regression coefficients
F-statistics & LM statistic
BLUE-ness of OLS
Does heteroscedasticity affect:
Unbiasedness of regression coefficients → not affected
Consistency of regression coefficients → not affected
Calculation of R^2 and adjusted R^2 (HW question)
Variances (standard errors) of estimators → variances are biased
Confidence intervals and t-statistics → not valid; the t-statistics don't have a t-distribution, and a large sample can't help (because the SE is biased)
F-statistics & LM statistic → F-statistics are not F-distributed; the LM statistic doesn't have an asymptotic chi-square distribution
BLUE-ness of OLS → we can find other estimators that are more (asymptotically) efficient (have smaller variance)
Fortunately, there are solutions to adjust standard errors and the t, F, and LM statistics in the case of heteroscedasticity of unknown form.
Effect on Var(β̂j)
Simple linear regression, homoscedastic standard error:
Var(β̂1) = σ² / SSTx, where SSTx = Σi (xi − x̄)².
Simple linear regression, heteroscedasticity-robust standard error:
Var̂(β̂1) = Σi (xi − x̄)² ûi² / SSTx².
Multiple linear regression, heteroscedasticity-robust standard error:
Var̂(β̂j) = Σi r̂ij² ûi² / SSRj², where
r̂ij – residual from regressing xj on the other independent variables;
SSRj – sum of squared residuals from this regression;
ûi – residual from regressing y on all independent variables.
The square root of this is also known as the White standard error (White, 1980). We can use it to form a heteroscedasticity-robust t statistic.
Note that SSRj = SSTj(1 − Rj²); using the definition of R^2, verify this.
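As an illustration, here is a minimal sketch in Python (statsmodels) comparing the usual and White (heteroscedasticity-robust) standard errors; the data-generating process and variable names (x1, x2) are hypothetical:

```python
# Minimal sketch: usual vs. heteroscedasticity-robust (White) standard errors.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 500
x1 = rng.uniform(0, 10, n)
x2 = rng.normal(size=n)
u = rng.normal(size=n) * (0.5 + 0.3 * x1)   # error spread grows with x1 -> heteroscedastic
y = 1.0 + 2.0 * x1 - 1.0 * x2 + u

X = sm.add_constant(np.column_stack([x1, x2]))
ols = sm.OLS(y, X).fit()                    # usual (homoscedastic) standard errors
robust = sm.OLS(y, X).fit(cov_type="HC1")   # White standard errors (HC1 scaling)

print(ols.bse)     # usual SEs: misleading here
print(robust.bse)  # heteroscedasticity-robust SEs
```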
Let's observe an example. What conclusions can you make based on these results?
Robust standard errors can be larger or smaller than the usual ones (they tend to be larger).
But did we even know whether we needed robust SEs?
Why not use robust SEs always?
Under the homoscedasticity and normality assumptions, t statistics have an exact t distribution for any sample size. In small samples, the distribution of robust t statistics might not be very close to the t distribution.
Robust standard errors and robust t statistics are justified only as the sample size becomes large.
Heteroscedasticity-robust hypothesis testing
Not all regression packages can compute heteroscedasticity-robust F statistics, but any regression package allows you to compute a heteroscedasticity-robust LM statistic.
Let's explore how to do that:
LM test – homoscedastic data (review)
Once we enter the realm of asymptotic analysis, other test statistics can be used for hypothesis testing.
Heteroscedasticity-robust LM statistic (as in Wooldridge, Ch. 8): (1) obtain the residuals ũ from the restricted model; (2) regress each of the q excluded variables on all of the included independent variables and keep the residuals r̃1, …, r̃q; (3) regress 1 on the products r̃1ũ, …, r̃qũ, without an intercept; (4) LM = n − SSR1, where SSR1 is the sum of squared residuals from this last regression. Under H0, LM is asymptotically chi-square with q degrees of freedom.
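A minimal sketch of these steps in Python, with hypothetical variables (x2 and x3 are the regressors excluded under H0):

```python
# Minimal sketch of the heteroscedasticity-robust LM test described above.
import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(1)
n = 400
x1, x2, x3 = rng.normal(size=(3, n))
y = 1 + 0.5 * x1 + rng.normal(size=n) * (1 + 0.5 * np.abs(x1))  # H0 is true here

# Step 1: residuals from the restricted model (x2, x3 excluded).
u_t = sm.OLS(y, sm.add_constant(x1)).fit().resid
# Step 2: residuals from regressing each excluded variable on the included ones.
r2 = sm.OLS(x2, sm.add_constant(x1)).fit().resid
r3 = sm.OLS(x3, sm.add_constant(x1)).fit().resid
# Step 3: regress 1 on the products r_j * u_t, without an intercept; keep SSR1.
ssr1 = sm.OLS(np.ones(n), np.column_stack([r2 * u_t, r3 * u_t])).fit().ssr
# Step 4: LM = n - SSR1, asymptotically chi-square with q = 2 df under H0.
lm = n - ssr1
print(lm, stats.chi2.sf(lm, df=2))
```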
Detecting heteroscedasticity: testing
Testing for heteroscedasticity (MLR.1–4 satisfied)
Under MLR.1–4, the homoscedasticity assumption is H0: Var(u | x1, …, xk) = σ². Because E(u | x) = 0, this is equivalent to H0: E(u² | x1, …, xk) = σ².
This shows that, in order to test for a violation of the homoscedasticity assumption, we want to test whether u² is related (in expected value) to one or more of the explanatory variables.
How would you do that?
Testing for heteroscedasticity (F-statistic)
Estimate the auxiliary regression û² = δ0 + δ1 x1 + … + δk xk + error, where û are the OLS residuals.
Compute the F or LM statistic for the joint significance of x1, …, xk.
To test the joint significance of x1, …, xk, we can look at the overall F statistic (given by the regression output):
F = (R²û² / k) / ((1 − R²û²) / (n − k − 1)).
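A minimal sketch, reusing the `ols` fit and its design matrix from the earlier snippet:

```python
# Regress squared OLS residuals on the original regressors; the overall F
# statistic of this auxiliary regression tests their joint significance
# (H0: homoscedasticity).
import statsmodels.api as sm

aux = sm.OLS(ols.resid ** 2, ols.model.exog).fit()
print(aux.fvalue, aux.f_pvalue)
```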
Testing for heteroscedasticity (LM test): the Breusch-Pagan test
To test the joint significance of x1, …, xk, we can instead use the LM statistic:
LM = n · R²û²,
where R²û² comes from the auxiliary regression of û² on the regressors above. Under H0, LM follows an asymptotic chi-square distribution with k degrees of freedom.
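A minimal sketch using the Breusch-Pagan test built into statsmodels, reusing the `ols` fit from above:

```python
# Breusch-Pagan test: returns the LM statistic and its p-value, plus the
# F-statistic version of the same test.
from statsmodels.stats.diagnostic import het_breuschpagan

lm_stat, lm_pval, f_stat, f_pval = het_breuschpagan(ols.resid, ols.model.exog)
print(f"LM = {lm_stat:.2f} (p = {lm_pval:.4f}); F = {f_stat:.2f} (p = {f_pval:.4f})")
```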
Example
R²û² = 0.1601
F = (0.1601 / k) / ((1 − 0.1601) / (n − k − 1))
Result: reject the null of homoscedasticity → evidence of heteroscedasticity.
Implications? We can't trust the usual standard errors reported in the regression output.
Example (cont.): log-transformation?
Testing for heteroscedasticity: the White test
Asymptotics: the homoscedasticity assumption can be replaced with the weaker assumption that u² is uncorrelated with all independent variables, their squares, and their cross products.
This motivated a new test for heteroscedasticity developed by White (1980).
The White test for heteroscedasticity is the LM statistic (an F statistic can also be used) from regressing û² on the levels, squares, and cross products of the regressors; for example, with k = 3:
û² = δ0 + δ1x1 + δ2x2 + δ3x3 + δ4x1² + δ5x2² + δ6x3² + δ7x1x2 + δ8x1x3 + δ9x2x3 + error.
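A minimal sketch using the White test built into statsmodels (it constructs the levels, squares, and cross products internally), again reusing the `ols` fit:

```python
# White test for heteroscedasticity.
from statsmodels.stats.diagnostic import het_white

lm_stat, lm_pval, f_stat, f_pval = het_white(ols.resid, ols.model.exog)
print(f"White LM = {lm_stat:.2f} (p = {lm_pval:.4f})")
```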

Problem? Too many regressors in the auxiliary regression. So, let's modify the test.
Testing for heteroscedasticity: the “modified” White test
To make the auxiliary regression compact, let's use the fitted values ŷ (which are a function of the estimated coefficients and the x values).
Run the following regression:
û² = δ0 + δ1 ŷ + δ2 ŷ² + error
We can use the F or LM statistic for the null hypothesis H0: δ1 = 0, δ2 = 0.
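A minimal sketch of the modified test, computed by hand from the `ols` fit used above:

```python
# "Modified" White test: regress squared residuals on the fitted values and
# their squares; LM = n * R^2 is asymptotically chi-square(2) under H0.
import numpy as np
import statsmodels.api as sm
from scipy import stats

u2 = ols.resid ** 2
yhat = ols.fittedvalues
aux = sm.OLS(u2, sm.add_constant(np.column_stack([yhat, yhat ** 2]))).fit()
lm = len(u2) * aux.rsquared
print(lm, stats.chi2.sf(lm, df=2))
```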
Summary of (modified) White’s test
Dealing with heteroscedasticity
1. Heteroscedasticity-robust standard errors and the corresponding t- and F-statistics.
2. Weighted least squares (WLS) – if we know the form of the variance (as a function of the explanatory variables); more efficient than OLS, and leads to new t- and F-statistics.
3. Feasible generalized least squares (FGLS) – if the form of the variance is unknown.
Heteroscedasticity of known form (multiplicative constant)
The conditional variance of u depends on some function of the x variables: Var(u | x) = σ² h(x).
Let's consider the savings example: sav = β0 + β1 inc + u, with Var(u | inc) = σ² inc.
In this example h(x) = inc.
Any restrictions on h(x) you can think of? (We need h(x) > 0 for all possible values of the x's.)
Heteroscedasticity of known form (multiplicative constant)
The conditional variance of u depends on some function of the x variables: Var(u | x) = σ² h(x).
Let's divide both sides of the equation by √h(x) (i.e., multiply by 1/h(x)^0.5); in the savings example, this means dividing through by √inc.
What will be the conditional expected value of the resulting error term? What about its variance?
How can we use this information to deal with heteroscedasticity?
Heteroscedasticity of known form (multiplicative constant)
Dividing both sides of the equation by √h(x) gives the transformed error u/√h(x), with
E(u/√h(x) | x) = 0 and Var(u/√h(x) | x) = σ².
As a result of this transformation, the error term of the regression is no longer heteroscedastic!
What about the rest of the regression? We can run the following regression with the adjusted variables:
y/√h = β0 (1/√h) + β1 (x1/√h) + … + βk (xk/√h) + u/√h.
Result: the β̂'s estimated from this regression (by OLS) will have better efficiency properties than the original OLS estimates.
We interpret the coefficients (GLS estimators) as we would interpret the regression with the original variables (before transformation). Standard errors, t- and F-statistics are all valid.
Why do we call this the WEIGHTED least squares method?
WLS vs. OLS – talking about weighting
OLS minimizes the sum of squared residuals, where each observation has the same weight.
WLS minimizes the weighted sum of squared residuals, where the weight on observation i is 1/h(xi).
This is the same as minimizing the sum of squared residuals with the transformed variables (divided by √h(xi)).
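A minimal sketch of WLS with a known variance function, assuming Var(u | x) = σ²·x1 so that h(x) = x1; the data-generating process and names are hypothetical:

```python
# WLS with known h(x) = x1, so the WLS weights are 1/x1.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 300
x1 = rng.uniform(1, 10, n)
y = 2.0 + 0.7 * x1 + rng.normal(size=n) * np.sqrt(x1)  # Var(u|x1) proportional to x1

X = sm.add_constant(x1)
wls = sm.WLS(y, X, weights=1.0 / x1).fit()  # statsmodels expects weights = 1/h(x)
print(wls.params, wls.bse)
```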
What about R² from this regression?
Built-in WLS package: R² is not particularly useful, because we are measuring explained variation in the transformed variable (y/√h) instead of the original variable (y).
OLS regression with transformed variables: same issue; in addition, the R² of this regression is estimated from a model without an intercept (this affects the calculation of SST and could cause bias by forcing the regression to pass through (0, 0)).
The F-statistic is still fine.
Weights in special cases
Mostly the choice of h(x) is somewhat arbitrary, with the exception of data with a specific form.
Example: individual-level data from various firms vs. average data from each firm (mi employees per firm). If the individual-level equation is homoscedastic, the firm-average error has variance σ²/mi, so the natural WLS weights are mi.
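A minimal sketch of the firm-averages case, with a hypothetical data-generating process:

```python
# Firm-average data: with m_i employees per firm, Var(u_i) = sigma^2 / m_i,
# so the natural WLS weights are m_i.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n_firms = 120
m = rng.integers(5, 200, n_firms)                      # employees per firm
x_bar = rng.normal(size=n_firms)                       # firm-average regressor
y_bar = 1.0 + 0.4 * x_bar + rng.normal(size=n_firms) / np.sqrt(m)

wls_avg = sm.WLS(y_bar, sm.add_constant(x_bar), weights=m).fit()
print(wls_avg.params, wls_avg.bse)
```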



Heteroscedasticity of unknown form – feasible GLS
Knowing the exact form of h(x) is usually hard, so we try to estimate it; using ĥ(x) in place of h(x) results in the feasible GLS (FGLS) estimator.
How to model heteroscedasticity? We'll look at a quite flexible approach. Assume that:
Var(u | x) = σ² exp(δ0 + δ1x1 + δ2x2 + … + δkxk)
If the δ's were known, we would use the WLS approach.
When testing for heteroscedasticity, it's okay to assume that heteroscedasticity is a linear function of the independent variables (as we did for the BP test); however, we don't want to use linear models to correct the problem (predicted values of h(x) can be negative).
Heteroscedasticity of unknown form – feasible GLS
How to estimate the coefficients of h(x) = exp(δ0 + δ1x1 + … + δkxk)?
Transform it into a linear form by applying logs and use OLS. Regress:
log(û²) = α0 + δ1x1 + δ2x2 + … + δkxk + e
Get the fitted values ĝi from this regression; then ĥi = exp(ĝi) gives the weights (1/ĥi).
These fitted values are the part of log(u²) that is explained by our independent variables (which cause the heteroscedasticity).
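A minimal sketch of this FGLS procedure, reusing the `ols` fit (whose design matrix includes a constant) from the earlier snippets:

```python
# Feasible GLS with the exponential variance model.
import numpy as np
import statsmodels.api as sm

log_u2 = np.log(ols.resid ** 2)             # log of squared OLS residuals
aux = sm.OLS(log_u2, ols.model.exog).fit()  # regress log(u^2) on the x's
h_hat = np.exp(aux.fittedvalues)            # estimated variance function h_hat(x)
fgls = sm.WLS(ols.model.endog, ols.model.exog, weights=1.0 / h_hat).fit()
print(fgls.params, fgls.bse)
```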
Summary of the feasible GLS approach
One problem in the process
If h(x) were known, our estimators of β derived in the WLS procedure would be unbiased (and BLUE, as we would have dealt with the heteroscedasticity as well).
However, since we are using ĥ instead, it turns out that FGLS is no longer unbiased. But it is consistent and asymptotically more efficient than OLS.
Also, the test statistics (t and F) are reliable (at least in large samples).
Heteroscedasticity of unknown form – feasible GLS
Another way to estimate the coefficients of h(x): regress log(û²) on ŷ and ŷ², and get the fitted values. Plug them into the exponential function to get ĥ.
The fitted values from here will again be used to form the weights (1/ĥ).
This just changes step 3 in the FGLS procedure summarized above.
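A minimal sketch of this alternative, again reusing the `ols` fit from above:

```python
# Alternative FGLS step: model log(u^2) with the fitted values y_hat and
# y_hat^2 instead of the x's.
import numpy as np
import statsmodels.api as sm

log_u2 = np.log(ols.resid ** 2)
yhat = ols.fittedvalues
aux = sm.OLS(log_u2, sm.add_constant(np.column_stack([yhat, yhat ** 2]))).fit()
h_hat = np.exp(aux.fittedvalues)
fgls2 = sm.WLS(ols.model.endog, ols.model.exog, weights=1.0 / h_hat).fit()
print(fgls2.params, fgls2.bse)
```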
One more issue
Sometimes OLS and WLS estimates can be substantially different (so that our conclusions about the effect of x change).
If the sign of a significant coefficient changes, we should be suspicious (we can test whether the change is significant using a Hausman test, which we don't study at the moment).
It's possible that one of the other Gauss–Markov assumptions is violated (e.g., functional form misspecification).
What if our assumed heteroscedasticity function is wrong?
As a result, the WLS standard errors and test statistics are no longer valid (even in large samples). What to do?
Use heteroscedasticity-robust standard errors for WLS (valid even if the variance function is misspecified).
Some criticize that, when the variance function is misspecified, WLS is not necessarily more efficient than OLS. However, in the presence of heteroscedasticity, it is usually better to use a wrong form of heteroscedasticity than to ignore it altogether.
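A minimal sketch: robust standard errors for the WLS/FGLS fit, reusing `h_hat` and the `ols` fit from the snippets above:

```python
# Heteroscedasticity-robust SEs for WLS, guarding against a misspecified h_hat.
import statsmodels.api as sm

wls_robust = sm.WLS(ols.model.endog, ols.model.exog,
                    weights=1.0 / h_hat).fit(cov_type="HC1")
print(wls_robust.bse)  # valid in large samples even if h_hat is wrong
```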
Summary
Let’s look at literature
Rosopa, P. J., Schaffer, M. M., & Schroeder, A. N. (2013). Managing heteroscedasticity in general linear models. Psychological Methods, 18(3), 335.