5 Taking Back Control
• Next week: no class but extra Yuan office hours on Wed 19 Oct: Business
School Student Vault 12 (cave room on lower ground)
• Stream 1: 16.30-17.30
• Stream 2: 17.30-18.30
2. Address confounding/endogeneity
$\text{Wage} = \beta_0 + \beta_1 \text{EDUC} + \epsilon$
$\epsilon = \beta_2 \text{FEMALE} + u$
$\beta_2 < 0$
• The negative effect of FEMALE on both WAGE and EDUC implies a positive correlation between $\epsilon$ and EDUC
• We get an upward bias when attempting to estimate $\beta_1$
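The upward bias can be illustrated with a small simulation. This is a minimal sketch on made-up data: the sample size, coefficient values and the helper `ols` are all assumptions chosen for illustration, not estimates from the lecture dataset. FEMALE lowers both education and wages, and the coefficient on EDUC is estimated with and without FEMALE as a control.

```python
import numpy as np

def ols(y, *regressors):
    """OLS coefficients (intercept first) of y on the given regressors."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical data-generating process (all numbers are invented).
female = rng.integers(0, 2, size=n).astype(float)
educ = 12.0 - 2.0 * female + rng.normal(0.0, 2.0, size=n)  # FEMALE lowers EDUC
wage = 2.0 + 0.5 * educ - 1.5 * female + rng.normal(0.0, 1.0, size=n)  # FEMALE lowers wage

coef_short = ols(wage, educ)          # FEMALE omitted, so it sits in the error term
coef_long = ols(wage, educ, female)   # FEMALE included as a control

print("true EDUC effect:         0.500")
print(f"EDUC coef without FEMALE: {coef_short[1]:.3f}")
print(f"EDUC coef with FEMALE:    {coef_long[1]:.3f}")
```

With these assumed numbers the estimate without FEMALE should come out above the true value of 0.5, while the estimate with FEMALE should be close to it.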
[Figure: Wage (£/h) against years of education, with men and women marked separately. Second panel, "True model in univariate case": Wage (£/h) against years of education, with separate lines for men and women.]
Imperial College Business School Imperial means Intelligent Business 9
Causality vs all else equal
[Diagram: EDUC → Wage; EDUC → EXPER (−); EXPER → Wage (+)]
The reason why EDUC and EXPER are correlated is likely a chain of causality from
schooling to experience (i.e. if you go to school longer you do not have as much time to gain job
experience; also note that EDUC is typically determined before EXPER, which supports the suggested
chain).
If you include EXPER as a separate explanatory variable, then your coefficient on EDUC will not
reflect this causal channel. This is good if you really want the all-else-equal effect of EDUC.
However, if you want the full causal effect of EDUC (e.g. you want to advise the government
on what an extra year of schooling does to wages) you get the wrong answer, as you are
pretending that you can have extra schooling without reducing people’s experience. So it
would be better to exclude EXPER.
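The difference between the two answers can be sketched in a simulation. The data below are invented, and the coefficient values and the helper `ols` are assumptions for illustration only. EDUC reduces EXPER, and EXPER raises wages, so the full causal effect of EDUC is the direct effect plus the indirect effect through EXPER.

```python
import numpy as np

def ols(y, *regressors):
    """OLS coefficients (intercept first) of y on the given regressors."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

rng = np.random.default_rng(1)
n = 10_000

# Hypothetical causal chain: EDUC -> EXPER (negative), EXPER -> wage (positive).
educ = rng.normal(12.0, 2.0, size=n)
exper = 30.0 - 1.0 * educ + rng.normal(0.0, 2.0, size=n)
wage = 1.0 + 0.8 * educ + 0.3 * exper + rng.normal(0.0, 1.0, size=n)

# Full causal effect of EDUC in this setup: 0.8 + 0.3 * (-1.0) = 0.5
coef_without_exper = ols(wage, educ)
# All-else-equal effect of EDUC (EXPER held fixed): 0.8
coef_with_exper = ols(wage, educ, exper)

print(f"EDUC coef excluding EXPER: {coef_without_exper[1]:.3f}")
print(f"EDUC coef including EXPER: {coef_with_exper[1]:.3f}")
```

Under these assumptions, excluding EXPER recovers roughly the full effect (0.5), while including it recovers roughly the all-else-equal effect (0.8). Which one is "right" depends on the question being asked.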
Directions of causality EDUC☜FEMALE
[Diagram: Female → EDUC (−); Female → Wage (−); EDUC → Wage]
[Diagram: EDUC = Schooling + Uni + Vocational; EDUC → Wage; EDUC ↔ EXPER; EXPER → Wage]
Suppose EDUC includes further education while working.
If the causality between the two explanatory variables goes both ways, we are in
trouble as far as finding the causal effect of EDUC is concerned (we are still fine for
finding the ceteris paribus effect). Either including or dropping the other explanatory
variable (here EXPER) will lead to a biased estimate. We have to use other methods, some of which we
shall discuss later in the module (e.g. Instrumental Variables).
• More control variables are not always better for identifying a causal effect
• Report regressions with and without the control and discuss the limitations of your analysis
• More research with other data or a better model (e.g. Instrumental Variables, which we
discuss later) might be needed.
• This might sometimes be beyond the scope of a study (e.g. in group coursework)
• Women tend to have less education than men (in this dataset)
[Panels: Unemployment 2004; Unemployment 2011]
Migration coefficient:
• Still not significant
• Value has become slightly lower
• We see that some of the age variables are highly correlated
• What matters is whether a lot of the variation of an x variable is accounted for by a linear combination of all
other x variables.
• We can examine this by looking at R² in regressions of the following kind:
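One way to run such auxiliary regressions is sketched below: each x variable is regressed on all the other x variables and the resulting R² is reported. The variable names (`age_young`, `age_old`, `migration`) and the simulated data are hypothetical stand-ins, not the lecture dataset, and the helpers `r_squared` and `auxiliary_r2` are illustrative names.

```python
import numpy as np

def r_squared(y, X):
    """R^2 from an OLS regression of y on the columns of X plus an intercept."""
    Z = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    fitted = Z @ coef
    return np.var(fitted) / np.var(y)

def auxiliary_r2(X):
    """For each column j, the R^2 from regressing column j on all other columns."""
    return np.array(
        [r_squared(X[:, j], np.delete(X, j, axis=1)) for j in range(X.shape[1])]
    )

rng = np.random.default_rng(2)
n = 500

# Hypothetical regressors: two strongly related age variables and an unrelated one.
age_young = rng.normal(0.0, 1.0, size=n)
age_old = -0.9 * age_young + 0.2 * rng.normal(0.0, 1.0, size=n)
migration = rng.normal(0.0, 1.0, size=n)

X = np.column_stack([age_young, age_old, migration])
aux_r2 = auxiliary_r2(X)

for name, r2 in zip(["age_young", "age_old", "migration"], aux_r2):
    print(f"{name:10s} auxiliary R^2 = {r2:.3f}")
```

An auxiliary R² close to 1 means that variable is nearly a linear combination of the others, so its coefficient will be estimated imprecisely.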
[Figure: y plotted for two series, yXhighepslow and yXlowepshigh]
$R^2 = \frac{VAR(\hat{Y})}{VAR(Y)}$
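The ratio can be computed directly. In the sketch below, the two simulated series are assumptions loosely named after the figure labels: one has a lot of variation in X and little noise, the other has little variation in X and a lot of noise. The slope is the same in both, and all numbers are invented.

```python
import numpy as np

def r_squared(y, x):
    """R^2 as VAR(fitted) / VAR(y) from a simple regression of y on x."""
    slope, intercept = np.polyfit(x, y, 1)  # coefficients, highest degree first
    fitted = intercept + slope * x
    return np.var(fitted) / np.var(y)

rng = np.random.default_rng(3)
n = 500

# Same true slope (0.5) in both cases; only the variances of X and epsilon differ.
x_high = rng.normal(0.0, 2.0, size=n)
y_x_high_eps_low = 8.0 + 0.5 * x_high + rng.normal(0.0, 0.5, size=n)

x_low = rng.normal(0.0, 0.5, size=n)
y_x_low_eps_high = 8.0 + 0.5 * x_low + rng.normal(0.0, 2.0, size=n)

r2_high = r_squared(y_x_high_eps_low, x_high)
r2_low = r_squared(y_x_low_eps_high, x_low)

print(f"R^2, high VAR(X) and low VAR(eps): {r2_high:.3f}")
print(f"R^2, low VAR(X) and high VAR(eps): {r2_low:.3f}")
```

The relationship between y and x is equally strong in both series, yet R² differs sharply, because R² depends on how much of the variation in Y is driven by X rather than by the error term.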
Some more details on F-tests
Unrestricted Model: $Y = \beta_{U1} X + \beta_{U2} AGE_2 + \beta_{U3} AGE_3 + \epsilon_U$
Restricted Model: $Y = \beta_{R1} X + 0 \times AGE_2 + 0 \times AGE_3 + \epsilon_R$
F distribution has two arguments
1. Number of restrictions
2. Degrees of freedom of the unrestricted model
[Figure: density of the F distribution against the F statistic, for Fden(2,50,x) and Fden(3,50,x)]
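The F statistic for comparing the two models can be sketched as follows. The data are simulated and the helper names `rss` and `f_statistic` are hypothetical. The models are written without an intercept to mirror the equations as shown, and the sample size of 53 is an assumption chosen so that the unrestricted model has 50 degrees of freedom, matching the plotted densities.

```python
import numpy as np

def rss(y, X):
    """Residual sum of squares from an OLS regression of y on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(resid @ resid)

def f_statistic(y, X_unrestricted, X_restricted):
    """F statistic, number of restrictions, and unrestricted degrees of freedom."""
    n, k = X_unrestricted.shape
    q = k - X_restricted.shape[1]        # number of restrictions
    df = n - k                           # degrees of freedom, unrestricted model
    rss_u = rss(y, X_unrestricted)
    rss_r = rss(y, X_restricted)
    f_stat = ((rss_r - rss_u) / q) / (rss_u / df)
    return f_stat, q, df

rng = np.random.default_rng(4)
n = 53  # assumed, so that df = 53 - 3 = 50

# Hypothetical data in which both age variables really do matter.
x = rng.normal(0.0, 1.0, size=n)
age2 = rng.normal(0.0, 1.0, size=n)
age3 = rng.normal(0.0, 1.0, size=n)
y = 1.0 * x + 1.0 * age2 + 1.0 * age3 + rng.normal(0.0, 1.0, size=n)

X_u = np.column_stack([x, age2, age3])   # unrestricted model
X_r = x.reshape(-1, 1)                   # restricted model: both age coefficients set to 0

f_stat, q, df = f_statistic(y, X_u, X_r)
print(f"F({q}, {df}) = {f_stat:.2f}")
```

The statistic is then compared with the F distribution with `q` and `df` degrees of freedom. A large value means the restrictions cost a lot of fit, which is evidence against them.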