
Econometrics I

Professor William Greene


Stern School of Business
Department of Economics



Econometrics I

Part 4 – Partial Regression and Correlation



Frisch-Waugh (1933) Theorem
The Most Powerful Theorem in Econometrics
Context: Model contains two sets of variables:
X = [ (1,time) : (other variables)]
= [X1 X2]
Regression model:
y = X1β1 + X2β2 + ε (population)
= X1b1 + X2b2 + e (sample)
Problem: Algebraic expression for the second set
of least squares coefficients, b2



Partitioned Solution
Method of solution (Why did F&W care? In 1933, matrix
computation was not trivial!)
Direct manipulation of normal equations produces

$$(X'X)b = X'y$$

$$X = [X_1, X_2] \quad\text{so}\quad X'X = \begin{bmatrix} X_1'X_1 & X_1'X_2 \\ X_2'X_1 & X_2'X_2 \end{bmatrix} \quad\text{and}\quad X'y = \begin{bmatrix} X_1'y \\ X_2'y \end{bmatrix}$$

$$(X'X)b = \begin{bmatrix} X_1'X_1 & X_1'X_2 \\ X_2'X_1 & X_2'X_2 \end{bmatrix}\begin{bmatrix} b_1 \\ b_2 \end{bmatrix} = \begin{bmatrix} X_1'y \\ X_2'y \end{bmatrix}$$

$$X_1'X_1 b_1 + X_1'X_2 b_2 = X_1'y$$

$$X_2'X_1 b_1 + X_2'X_2 b_2 = X_2'y \;\Longrightarrow\; X_2'X_2 b_2 = X_2'y - X_2'X_1 b_1 = X_2'(y - X_1 b_1)$$
Partitioned Solution
Direct manipulation of normal equations produces
$$b_2 = (X_2'X_2)^{-1}X_2'(y - X_1b_1)$$
What is this? Regression of (y - X1b1) on X2
If we knew b1, this is the solution for b2.
Important result (perhaps not fundamental). Note
the result if X2′X1 = 0.
Useful in theory: Probably
Likely in practice? Not at all.
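
To see the identity numerically, here is a minimal numpy sketch; the data are simulated and the names X1, X2, b1, b2 are illustrative, not from the lecture's dataset:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
X1 = np.column_stack([np.ones(n), np.arange(n)])   # constant and time trend
X2 = rng.normal(size=(n, 2))                       # the "other" variables
y = X1 @ [1.0, 0.5] + X2 @ [2.0, -1.0] + rng.normal(size=n)

# full least squares on X = [X1, X2]
X = np.hstack([X1, X2])
b = np.linalg.solve(X.T @ X, X.T @ y)
b1, b2 = b[:2], b[2:]

# second block of the normal equations: b2 = (X2'X2)^{-1} X2'(y - X1 b1)
b2_check = np.linalg.solve(X2.T @ X2, X2.T @ (y - X1 @ b1))
print(np.allclose(b2, b2_check))                   # True
```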



Partitioned Inverse
Use of the partitioned inverse result
produces a fundamental result: What is
the southeast element in the inverse of
the moment matrix?
$$\begin{bmatrix} X_1'X_1 & X_1'X_2 \\ X_2'X_1 & X_2'X_2 \end{bmatrix}^{-1}$$


Partitioned Inverse
The algebraic result is:
$$[\,\cdot\,]^{-1}_{(2,2)} = \{X_2'X_2 - X_2'X_1(X_1'X_1)^{-1}X_1'X_2\}^{-1} = [X_2'(I - X_1(X_1'X_1)^{-1}X_1')X_2]^{-1} = [X_2'M_1X_2]^{-1}$$
 Note the appearance of an “M” matrix.
How do we interpret this result?
 Note the implication for the case in which
X1 is a single variable. (Theorem, p. 34)
 Note the implication for the case in which
X1 is the constant term. (p. 35)
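
A quick numeric check of this block-inverse result, as a hedged numpy sketch with simulated data:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
X1 = np.column_stack([np.ones(n), np.arange(n)])
X2 = rng.normal(size=(n, 2))
X = np.hstack([X1, X2])

# residual-maker M1 = I - X1 (X1'X1)^{-1} X1'
M1 = np.eye(n) - X1 @ np.linalg.solve(X1.T @ X1, X1.T)

# the southeast (2,2) block of (X'X)^{-1} equals [X2' M1 X2]^{-1}
southeast = np.linalg.inv(X.T @ X)[2:, 2:]
print(np.allclose(southeast, np.linalg.inv(X2.T @ M1 @ X2)))  # True
```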



Frisch-Waugh Result
Continuing the algebraic manipulation:

$$b_2 = [X_2'M_1X_2]^{-1}[X_2'M_1y]$$

This is Frisch and Waugh's famous result - the "double residual regression."

How do we interpret this? A regression of residuals on residuals.

"We get the same result whether we (1) detrend the other variables by using the residuals from a regression of them on a constant and a time trend and use the detrended data in the regression or (2) just include a constant and a time trend in the regression and not detrend the data."

"Detrend the data" means compute the residuals from the regressions of the variables on a constant and a time trend.
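
The equivalence is easy to reproduce. A sketch in numpy, assuming simulated trending series rather than F&W's original data:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 60
t = np.arange(n, dtype=float)
X1 = np.column_stack([np.ones(n), t])                 # constant and time trend
X2 = rng.normal(size=(n, 2)) + 0.05 * t[:, None]      # trending regressors
y = 3.0 + 0.2 * t + X2 @ [1.5, -0.8] + rng.normal(size=n)

M1 = np.eye(n) - X1 @ np.linalg.solve(X1.T @ X1, X1.T)

# route (1): detrend y and X2, then regress residuals on residuals
ystar, X2star = M1 @ y, M1 @ X2
b2_doubleresid = np.linalg.solve(X2star.T @ X2star, X2star.T @ ystar)

# route (2): just include the constant and trend in the regression
X = np.hstack([X1, X2])
b2_full = np.linalg.solve(X.T @ X, X.T @ y)[2:]

print(np.allclose(b2_doubleresid, b2_full))           # True: identical coefficients
```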



Important Implications
 Isolating a single coefficient in a regression.
(Corollary 3.2.1, p. 34). The double residual
regression.
 Regression of residuals on residuals – ‘partialling’
out the effect of the other variables.
 It is not necessary to 'partial' the other Xs out of y because M1 is idempotent: X2′M1′M1y = X2′M1y. (This is a very useful result.)
 (Orthogonal regression) Suppose X1 and X2 are
orthogonal; X1′X2 = 0. What is M1X2?



Applying Frisch-Waugh
Using gasoline data from Notes 3.
X = [1, year, PG, Y], y = G as before.
Full least squares regression of y on X.



Detrending the Variables - PG

[Regression output.]


Regression of Detrended G on Detrended PG and Detrended Y

[Regression output.]


Partial Regression
Important terms in this context:
Partialing out the effect of X1.
Netting out the effect …

“Partial regression coefficients.”


To continue belaboring the point: Note the interpretation of partial
regression as “net of the effect of …”

Now, follow this through for the case in which X1 is just a constant term, a column of ones. What are the residuals in a regression on a constant? What is M1? Note that this produces the result that we can do linear regression on data in mean deviation form.

'Partial regression coefficients' are the same as 'multiple regression coefficients.' This follows from the Frisch-Waugh theorem.
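
For the constant-term case, a brief illustrative check (simulated data) that slopes computed from data in mean-deviation form match the multiple regression coefficients:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50
X2 = rng.normal(size=(n, 2))
y = 4.0 + X2 @ [1.0, 2.0] + rng.normal(size=n)

# regression with a constant term included
X = np.hstack([np.ones((n, 1)), X2])
slopes_full = np.linalg.solve(X.T @ X, X.T @ y)[1:]

# regression on data in mean-deviation form (M1 here just demeans)
X2d, yd = X2 - X2.mean(axis=0), y - y.mean()
slopes_dev = np.linalg.solve(X2d.T @ X2d, X2d.T @ yd)

print(np.allclose(slopes_full, slopes_dev))   # True
```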



Does Signature Explain (log)Price of Monet Paintings after Controlling for (log)Size?

[Scatter plot of log price against log area; ♦ = signed, ♦ = unsigned.]

Squared partial correlation of the signature variable with log price, computed from the t-ratio (10.112) and degrees of freedom (428); see 'A Useful Result' below:

$$.19299 = \frac{10.112^2}{10.112^2 + 428}$$


Partial Correlation
Working definition. Correlation between sets of residuals.
Some results on computation: Based on the M matrices.
Some important considerations:
Partial correlations and coefficients can have signs and
magnitudes that differ greatly from gross correlations and
simple regression coefficients.

Compare the simple (gross) correlation of G and PG with the partial correlation, net of the time effect. (Could you have predicted the negative partial correlation?)
CALC;list;Cor(g,pg)$
Result = .7696572

CALC;list;cor(gstar,pgstar)$
Result = -.6589938

[Scatter plot of PG against G.]
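
The CALC results above are from the gasoline data. As a stand-in, this numpy sketch simulates a trending price and consumption pair to reproduce the sign reversal; the series and coefficients are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 36
t = np.arange(n, dtype=float)
pg = 0.5 + 0.1 * t + rng.normal(scale=0.2, size=n)   # trending price
g = 80 + 1.0 * t - 5.0 * pg + rng.normal(size=n)     # demand falls in price

print(np.corrcoef(g, pg)[0, 1])      # gross correlation: positive (shared trend)

# partial out the constant and time trend, then correlate the residuals
X1 = np.column_stack([np.ones(n), t])
M1 = np.eye(n) - X1 @ np.linalg.solve(X1.T @ X1, X1.T)
gstar, pgstar = M1 @ g, M1 @ pg
print(np.corrcoef(gstar, pgstar)[0, 1])   # partial correlation: negative
```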





A Useful Result
Squared partial correlation of an x in X with y is

$$\frac{\text{squared } t\text{-ratio}}{\text{squared } t\text{-ratio} + \text{degrees of freedom}}$$

We will define the 't-ratio' and 'degrees of freedom' later. Note how it enters:

$$R^2_{Xz} = R^2_X + (1 - R^2_X)r_{yz}^{*2} \;\Longrightarrow\; r_{yz}^{*2} = \frac{R^2_{Xz} - R^2_X}{1 - R^2_X}$$
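
Both identities can be verified numerically. A sketch with simulated data, where z is a hypothetical regressor added to X:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 36
X = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])  # constant + 3 regressors
z = rng.normal(size=n)
y = X @ [1.0, 2.0, -1.0, 0.5] + 0.8 * z + rng.normal(size=n)

# t-ratio on z in the regression of y on [X, z]
W = np.column_stack([X, z])
K = W.shape[1]
b = np.linalg.solve(W.T @ W, W.T @ y)
e = y - W @ b
s2 = e @ e / (n - K)
se = np.sqrt(s2 * np.linalg.inv(W.T @ W)[-1, -1])
t = b[-1] / se

# partial correlation of y and z, net of X, via residuals
MX = np.eye(n) - X @ np.linalg.solve(X.T @ X, X.T)
r = np.corrcoef(MX @ y, MX @ z)[0, 1]

print(np.allclose(r**2, t**2 / (t**2 + (n - K))))   # True
```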



Adding Variables to a Model
What is the effect of adding PN, PD, and PS to the model?



Partial Correlation
Partial correlation is a difference in R²s. For PS in the example above,

R² without PS = .9861, R² with PS = .9907

(.9907 - .9861) / (1 - .9861) = .3309
3.92² / (3.92² + (36 - 5)) = .3314 (rounding)



THE Application of Frisch-Waugh
The Fixed Effects Model
A regression model with a dummy variable for
each individual in the sample, each observed Ti times.
$$y_i = X_i\beta + d_i\alpha_i + \varepsilon_i, \quad \text{for each individual}$$

$$\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{bmatrix} = \begin{bmatrix} X_1 & d_1 & 0 & \cdots & 0 \\ X_2 & 0 & d_2 & \cdots & 0 \\ \vdots & & & \ddots & \\ X_N & 0 & 0 & \cdots & d_N \end{bmatrix}\begin{bmatrix} \beta \\ \alpha \end{bmatrix} + \varepsilon = [X, D]\begin{bmatrix} \beta \\ \alpha \end{bmatrix} + \varepsilon = Z\delta + \varepsilon$$

The dummy-variable matrix D has N columns. N may be thousands, i.e., the regression has thousands of variables (coefficients).



Estimating the Fixed Effects Model
The FEM is a linear regression model but
with many independent variables
$$\begin{bmatrix} b \\ a \end{bmatrix} = \begin{bmatrix} X'X & X'D \\ D'X & D'D \end{bmatrix}^{-1}\begin{bmatrix} X'y \\ D'y \end{bmatrix}$$

Using the Frisch-Waugh theorem,

$$b = [X'M_DX]^{-1}[X'M_Dy]$$
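
A compact numpy sketch of this estimator on a simulated balanced panel; the demean helper plays the role of premultiplying by M_D, and the names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(5)
N, T, k = 200, 5, 2                      # 200 individuals, 5 periods, 2 regressors
g = np.repeat(np.arange(N), T)           # individual index for each row
alpha = rng.normal(size=N)               # true fixed effects
X = rng.normal(size=(N * T, k))
y = X @ [1.0, -0.5] + alpha[g] + rng.normal(size=N * T)

def demean(a):
    """Deviate each individual's observations from its own mean (M_D a)."""
    means = np.zeros((N,) + a.shape[1:])
    np.add.at(means, g, a)
    return a - (means / T)[g]

# Frisch-Waugh: b = [X'M_D X]^{-1} [X'M_D y]
Xw, yw = demean(X), demean(y)
b_within = np.linalg.solve(Xw.T @ Xw, Xw.T @ yw)

# brute force: regress y on X plus all N dummy variables
Z = np.hstack([X, np.eye(N)[g]])
b_lsdv = np.linalg.lstsq(Z, y, rcond=None)[0][:k]

print(np.allclose(b_within, b_lsdv))     # True
```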



Application – Health and Income
German Health Care Usage Data, 7,293 Individuals, Varying Numbers of Periods
Data downloaded from the Journal of Applied Econometrics Archive. This is an unbalanced panel with 7,293 individuals and 27,326 observations in all; the number of observations ranges from 1 to 7 per family (frequencies: 1=1525, 2=2158, 3=825, 4=926, 5=1051, 6=1000, 7=987). The dependent variable of interest is DOCVIS. Variables in the file are:
DOCVIS = number of visits to the doctor in the observation period
HHNINC = household nominal monthly net income in German marks / 10000 (4 observations with income=0 were dropped)
HHKIDS = children under age 16 in the household = 1; otherwise = 0
EDUC = years of schooling
AGE = age in years

We desire also to include a separate family effect (7293 of them) for each family. This
requires 7293 dummy variables in addition to the four regressors.



Fixed Effects Estimator (cont.)
$$M_D = \begin{bmatrix} M_D^1 & 0 & \cdots & 0 \\ 0 & M_D^2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & M_D^N \end{bmatrix} \quad \text{(the dummy variables are orthogonal)}$$

$$M_D^i = I_{T_i} - d_i(d_i'd_i)^{-1}d_i' = I_{T_i} - (1/T_i)d_id_i' = \begin{bmatrix} 1 - \tfrac{1}{T_i} & -\tfrac{1}{T_i} & \cdots & -\tfrac{1}{T_i} \\ -\tfrac{1}{T_i} & 1 - \tfrac{1}{T_i} & \cdots & -\tfrac{1}{T_i} \\ \vdots & & \ddots & \vdots \\ -\tfrac{1}{T_i} & -\tfrac{1}{T_i} & \cdots & 1 - \tfrac{1}{T_i} \end{bmatrix}$$


‘Within’ Transformations

$$X'M_DX = \sum_{i=1}^{N} X_i'M_D^iX_i, \qquad \{X_i'M_D^iX_i\}_{k,l} = \sum_{t=1}^{T_i}(x_{it,k}-\bar{x}_{i.,k})(x_{it,l}-\bar{x}_{i.,l})$$

$$X'M_Dy = \sum_{i=1}^{N} X_i'M_D^iy_i, \qquad \{X_i'M_D^iy_i\}_{k} = \sum_{t=1}^{T_i}(x_{it,k}-\bar{x}_{i.,k})(y_{it}-\bar{y}_{i.})$$



Least Squares Dummy Variable Estimator

 b is obtained by 'within' groups least squares (group mean deviations)
 Normal equations for a: $D'Xb + D'Da = D'y \Rightarrow a = (D'D)^{-1}D'(y - Xb)$

$$a_i = (1/T_i)\sum_{t=1}^{T_i}(y_{it} - x_{it}'b) = \bar{e}_i$$
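
A self-contained sketch (simulated panel; names are illustrative) confirming that the a_i recovered this way equal the group-mean residuals ȳ_i − x̄_i′b and track the true effects:

```python
import numpy as np

rng = np.random.default_rng(6)
N, T = 50, 4
g = np.repeat(np.arange(N), T)
alpha = rng.normal(size=N)                     # true individual effects
X = rng.normal(size=(N * T, 2))
y = X @ [1.0, -0.5] + alpha[g] + rng.normal(scale=0.1, size=N * T)

# group means and the 'within' (mean-deviation) estimate of b
xbar = np.array([X[g == i].mean(axis=0) for i in range(N)])
ybar = np.array([y[g == i].mean() for i in range(N)])
Xw, yw = X - xbar[g], y - ybar[g]
b = np.linalg.solve(Xw.T @ Xw, Xw.T @ yw)

# a_i = (1/T_i) sum_t (y_it - x_it'b) = ybar_i - xbar_i'b
a = ybar - xbar @ b
print(np.corrcoef(a, alpha)[0, 1])             # close to 1
```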



Fixed Effects Regression

[Regression output.]


Time Invariant Variable

f = a variable that is the same in every period (FEMALE)


$$f'M_Df = \sum_{i=1}^{N} f_i'M_D^if_i = \sum_{i=1}^{N}\sum_{t=1}^{T_i}(f_{it} - \bar{f}_{i.})^2$$

but $f_{it} = f_i$ for all t, so $f_{it} = \bar{f}_{i.}$ and $(f_{it} - \bar{f}_{i.}) = 0$.

This is a simple case of multicollinearity: the stacked column f is an exact linear combination of the columns of D, $f = D(f_1, \ldots, f_N)'$.
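
A tiny numeric illustration of the collinearity, using a simulated FEMALE-style column (names are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(7)
N, T = 10, 3
g = np.repeat(np.arange(N), T)
f_i = rng.integers(0, 2, size=N).astype(float)   # e.g., FEMALE: fixed per person
f = f_i[g]                                       # stacked, time-invariant column

D = np.eye(N)[g]                                 # individual dummy variables
MD = np.eye(N * T) - D @ np.linalg.solve(D.T @ D, D.T)

print(np.allclose(MD @ f, 0))    # True: group-mean deviations of f are all zero
print(np.allclose(D @ f_i, f))   # f is an exact linear combination of D's columns
```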



In the millennial edition of its World Health
Report, in 2000, the World Health
Organization published a study that
compared the successes of the health care
systems of 191 countries. The results
notoriously ranked the United States a
dismal 37th, between Costa Rica and
Slovenia. The study was widely
misrepresented, universally misunderstood
and was, in fact, unhelpful in understanding
the different outcomes across countries.
Nonetheless, the result remains
controversial a decade later, as policy
makers argue about why the world’s most
expensive health care system isn’t
the world’s best.
The COMPOSITE Index
Equation
$$\log COMP_i = \text{Maximum Attainable}_i - \text{Inefficiency}_i = \alpha + \beta_1 \log HealthExp_i + \beta_2 \log Educ_i + \beta_3 (\log Educ_i)^2 - u_i$$

$$i = 1, \ldots, 191 \text{ countries}$$



Estimated Model

[Regression output with the estimated coefficients β1, β2, β3 annotated.]


Implications of results: Increases in Health
Expenditure and increases in Education are both
associated with increases in health outcomes.
These are the policy levers in the analysis!



WHO Data

[Data display.]
