Question 2 (Final Exam, 2016 S1)
We would like to build a predictive model for the amount of time that women sleep at night.
We have randomly selected 306 women and asked them to record the number of minutes
slept every night for a week and the number of hours that they engaged in paid work in
that week. We also asked about some of their personal characteristics, such as age, education,
and the number and ages of their children. The variables that we will use in the analysis are:
SLEEP   minutes of sleep at night, per week
WRK     minutes of paid work, per week
EDUC    years of education
AGE     age in years
KID     = 1 if any children under 3 years old, 0 otherwise
a. Based on preliminary analysis of the data, in particular based on the information
in the following figure, we have decided to drop the observation that reports 755
minutes of sleep in the survey week. Explain why you agree or disagree with this
decision. [2 marks]
We have estimated the following regression (the standard errors are reported below the
parameter estimates)
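$$
\widehat{SLEEP} = \underset{(366.51)}{4523.48} - \underset{(0.03)}{0.13}\,WRK - \underset{(17.62)}{43.00}\,AGE + \underset{(0.21)}{0.50}\,AGE^2 - \underset{(9.00)}{12.50}\,EDUC - \underset{(88.12)}{144.54}\,KID \tag{1}
$$
$$
n = 305, \quad R^2 = 0.104
$$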
b. Test the null hypothesis that, all else constant, having a child under the age of 3
has no effect on a mother's sleep against an alternative that makes sense in this
context. Perform the test at the 5% level of significance and decide whether you would
or would not drop KID from the regression. [3 marks]
c. Compute a 95% confidence interval for the difference between the mean sleep time
per week for two women with the same age and no young children, who work
exactly the same hours in a week, but one has 12 years of education and the other 16
years of education. [3 marks]
d. Explain the insights that the regression results provide about the effect of age on
sleep, all else equal. In particular, all else equal, at what age are women predicted
to sleep the least on average according to this estimated equation?
[3 marks]
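The arithmetic behind parts b to d can be sketched as follows. This is a minimal illustration rather than a model answer. It assumes the coefficients on WRK, AGE, EDUC and KID in equation (1) are negative (worth checking against the printed estimates), it assumes a one-sided alternative for KID, and it uses the standard normal as a large-sample approximation to the t distribution with 299 degrees of freedom. All variable names are hypothetical.

```python
from statistics import NormalDist

# Coefficient estimates and standard errors copied from equation (1).
# Assumption: the KID, EDUC and AGE coefficients are negative.
b_kid, se_kid = -144.54, 88.12
b_educ, se_educ = -12.50, 9.00
b_age, b_age2 = -43.00, 0.50

# Assumption: with n - k - 1 = 305 - 5 - 1 = 299 degrees of freedom, the
# standard normal is used in place of the t distribution.
z = NormalDist()

# Part b: H0: beta_KID = 0 against H1: beta_KID < 0 (one-sided, 5% level).
t_kid = b_kid / se_kid
crit_left = z.inv_cdf(0.05)
reject_kid = t_kid < crit_left

# Part c: difference in mean weekly sleep, 16 years minus 12 years of
# education, holding everything else fixed. Only beta_EDUC is involved.
years_gap = 16 - 12
diff = years_gap * b_educ
se_diff = years_gap * se_educ
crit_two_sided = z.inv_cdf(0.975)
ci = (diff - crit_two_sided * se_diff, diff + crit_two_sided * se_diff)

# Part d: the quadratic in AGE has its turning point where the derivative
# b_age + 2 * b_age2 * AGE equals zero; b_age2 > 0 makes it a minimum.
age_turn = -b_age / (2 * b_age2)

print(f"t statistic for KID: {t_kid:.3f}, critical value: {crit_left:.3f}")
print(f"reject H0 for KID: {reject_kid}")
print(f"95% CI for the education difference: ({ci[0]:.2f}, {ci[1]:.2f})")
print(f"age at which predicted sleep is lowest: {age_turn:.1f}")
```

The t statistic for KID falls just short of the one-sided critical value, so the decision is sensitive to the choice of alternative and to the approximation used for the critical value.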
Our research assistant has estimated a series of regressions, which are reported below
(after part (f)). In these regressions, YHAT and UHAT refer to the predicted values
and residuals of equation (1).
e. Using the appropriate equation or equations, test that EDUC and KID are jointly
insignificant for predicting SLEEP at the 5% level of significance. [3 marks]
f. Using the appropriate equation or equations, test for heteroskedasticity in the
errors of equation (1). Perform the test at the 5% level of significance, and based
on your conclusion, explain if the OLS estimators in (1) are unbiased and if the
test results and confidence intervals computed in previous parts are reliable. [3 marks]
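$$
\widehat{SLEEP} = \underset{(50.64)}{3519.43} - \underset{(0.03)}{0.13}\,WRK \tag{2}
$$
$$
n = 305, \quad R^2 = 0.078
$$
$$
\widehat{SLEEP} = \underset{(333.60)}{4206.17} - \underset{(0.03)}{0.13}\,WRK - \underset{(17.46)}{37.64}\,AGE + \underset{(0.21)}{0.47}\,AGE^2 \tag{3}
$$
$$
n = 305, \quad R^2 = 0.092
$$
$$
\widehat{UHAT} = \underset{(9048.98)}{-8462.416} + \underset{(5.45)}{5.10}\,YHAT - \underset{(0.001)}{0.001}\,YHAT^2 \tag{4}
$$
$$
n = 305, \quad R^2 = 0.003
$$
$$
\widehat{UHAT^2} = \underset{(6275058)}{-2590915} + \underset{(3779)}{1696}\,YHAT - \underset{(0.568)}{0.260}\,YHAT^2 \tag{5}
$$
$$
n = 305, \quad R^2 = 0.001
$$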
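The two tests in parts e and f can be sketched from the reported R-squared values alone. The sketch below assumes that equation (3) is the restricted model for part e and that equation (5), the regression of squared residuals on YHAT and its square, is the auxiliary regression for part f (an LM version of the test is used here; an F version is also possible). The helper name `f_crit_2_num_df` is hypothetical. It relies on the closed form P(F > x) = (1 + 2x/df2)^(-df2/2), which holds only when the numerator has 2 degrees of freedom.

```python
import math

n = 305
k_unrestricted = 5   # regressors in equation (1), excluding the intercept
q = 2                # restrictions: coefficients on EDUC and KID are zero
df_resid = n - k_unrestricted - 1

# Part e: R-squared form of the F statistic, equations (1) and (3).
r2_unrestricted = 0.104
r2_restricted = 0.092
f_stat = ((r2_unrestricted - r2_restricted) / q) / ((1 - r2_unrestricted) / df_resid)


def f_crit_2_num_df(alpha, df2):
    """Upper-alpha critical value of F(2, df2) from its closed-form tail."""
    return (df2 / 2) * (alpha ** (-2 / df2) - 1)


f_crit = f_crit_2_num_df(0.05, df_resid)
reject_joint = f_stat > f_crit

# Part f: LM statistic n * R-squared from equation (5), compared with a
# chi-square(2) critical value. For 2 degrees of freedom the tail
# probability is exp(-x / 2), so the critical value is -2 * ln(alpha).
r2_aux = 0.001
lm_stat = n * r2_aux
chi2_crit = -2 * math.log(0.05)
reject_homoskedasticity = lm_stat > chi2_crit

print(f"F = {f_stat:.3f}, critical value = {f_crit:.3f}, reject: {reject_joint}")
print(f"LM = {lm_stat:.3f}, critical value = {chi2_crit:.3f}, "
      f"reject: {reject_homoskedasticity}")
```

Neither statistic exceeds its critical value in this sketch, so under these assumptions the usual OLS standard errors, t tests and confidence intervals from the earlier parts are not called into question by the heteroskedasticity test.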
Question 2 (Final Exam, 2016 S2)
2.a. In the multiple regression model
$$
\underset{n \times 1}{y} = \underset{n \times (k+1)}{X}\;\underset{(k+1) \times 1}{\beta} + \underset{n \times 1}{u},
$$
state the assumptions necessary for the OLS estimator $\hat{\beta} = (X'X)^{-1}X'y$ to be an
unbiased estimator of $\beta$. Provide a proof of unbiasedness of $\hat{\beta}$ and indicate where
each of these assumptions is used in your proof.
(5 marks)
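A sketch of the kind of argument 2.a asks for is given below. The assumption labels A1 to A3 are illustrative and not taken from the exam. Some textbooks also list random sampling as a separate assumption.

```latex
% A1: the model is linear in parameters, y = X\beta + u
% A2: no perfect collinearity, so X'X is invertible
% A3: zero conditional mean, E(u \mid X) = 0
\begin{align*}
\hat{\beta} &= (X'X)^{-1}X'y
  && \text{(well defined under A2)} \\
 &= (X'X)^{-1}X'(X\beta + u)
  && \text{(substituting A1)} \\
 &= \beta + (X'X)^{-1}X'u \\
E(\hat{\beta} \mid X) &= \beta + (X'X)^{-1}X'\,E(u \mid X) = \beta
  && \text{(using A3)} \\
E(\hat{\beta}) &= E\big[E(\hat{\beta} \mid X)\big] = \beta
  && \text{(law of iterated expectations)}
\end{align*}
```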
2.b. We want to know if the separation of corporate management from corporate
ownership affects a firm's performance after controlling for its assets. We have data
on profits and assets for a randomly chosen cross section of 69 firms. 37 of these
firms are managed by their owners and the other 32 are managed by professional
managers who are not the owners of the firm. The summary statistics for profits
and assets are provided in the table below.
Sample statistic      PROFITS (million $)   ASSETS (million $)
Mean                  12.8                  277.2
Median                7.4                   168.4
Minimum               -3.8                  30.3
Maximum               131.0                 1953.2
Standard deviation    18.6                  345.8
The scatter plot of profits versus assets is shown in the figure below.
2.b.i) We run a simple regression of PROFITS on a constant and ASSETS. What does
the scatter plot suggest about (a) the sign of the slope coefficient? (b) the conditional
variance of the errors? What problems would conditional heteroskedasticity
cause for inference in this regression, and is it likely that a log-log formulation
would solve the heteroskedasticity problem in this application?
(5 marks)
2.b.ii) We have defined a dummy variable $MNO$, which is equal to 1 if the firm is managed
by a manager who is not the owner of the firm, and is equal to 0 otherwise. In
order to get the best linear unbiased estimators of all parameters in
$$
PROFITS = \beta_0 + \beta_1 ASSETS + \beta_2 MNO + u \tag{6}
$$
and comment on whether the separation of management from ownership affects a
firm's performance, we have estimated the following equation:
$$
\widehat{(W \cdot PROFITS)} = \underset{(1.70)}{2.49}\,W + \underset{(0.01)}{0.04}\,(W \cdot ASSETS) - \underset{(1.88)}{2.03}\,(W \cdot MNO)
$$
where $W = 1/\sqrt{ASSETS}$. Under what assumption about the conditional variance of $u$
in (6) would this estimated equation provide the best linear unbiased estimator of
the parameters $\beta_0$, $\beta_1$ and $\beta_2$? Given that assumption, explain why multiplying
equation (6) by $W$ produces a model that satisfies all requirements of the
Gauss-Markov Theorem.
(5 marks)
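One way to see why the weighting works is the short derivation below. It assumes that the error variance is proportional to assets, that is $\operatorname{Var}(u \mid ASSETS, MNO) = \sigma^2 ASSETS$, and that the errors have zero conditional mean. The proportional form is an assumption made for this sketch, chosen because it is the case in which $W = 1/\sqrt{ASSETS}$ is the appropriate weight.

```latex
\begin{align*}
W \cdot PROFITS &= \beta_0 W + \beta_1 (W \cdot ASSETS) + \beta_2 (W \cdot MNO) + W u \\
E(W u \mid ASSETS, MNO) &= W \, E(u \mid ASSETS, MNO) = 0 \\
\operatorname{Var}(W u \mid ASSETS, MNO) &= W^2 \, \sigma^2 ASSETS
  = \frac{\sigma^2 ASSETS}{ASSETS} = \sigma^2
\end{align*}
```

The transformed equation keeps the same parameters and is still linear in them, and its error $W u$ has zero conditional mean and constant variance, so OLS on the weighted variables meets the Gauss-Markov conditions under this assumption.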
2.b.iii) We want to make sure that the form of management affects neither the intercept
nor the slope of the conditional expectation of PROFITS as a function of
ASSETS. For that reason, we would like to test that $\beta_2 = \beta_3 = 0$ in
$$
PROFITS = \beta_0 + \beta_1 ASSETS + \beta_2 MNO + \beta_3 (MNO \times ASSETS) + u,
$$
against the alternative that at least one of them is not zero. From preliminary
analysis we have concluded that errors are heteroskedastic and weighting all variables
by $W = 1/\sqrt{ASSETS}$ solves the heteroskedasticity problem. Our research
assistant has provided us with the following estimation results:
$$
\widehat{PROFITS} = 5.19 + 0.03\,ASSETS \tag{7}
$$
$$
\hat{\sigma} = 16.11, \quad SSR = 17392
$$
$$
\widehat{(W \cdot PROFITS)} = 1.40\,W + 0.04\,(W \cdot ASSETS) \tag{8}
$$
$$
\hat{\sigma} = 0.66, \quad SSR = 27.95
$$
$$
\widehat{PROFITS} = 1.56 + 0.05\,ASSETS + 8.23\,MNO - 0.05\,(MNO \times ASSETS) \tag{9}
$$
$$
\hat{\sigma} = 13.17, \quad SSR = 11272
$$
$$
\widehat{(W \cdot PROFITS)} = 0.12\,W + 0.06\,(W \cdot ASSETS) + 2.50\,(W \cdot MNO) - 0.03\,(W \cdot MNO \cdot ASSETS) \tag{10}
$$
$$
\hat{\sigma} = 0.62, \quad SSR = 25.32
$$
Use the appropriate information to test $\beta_2 = \beta_3 = 0$ at the 1% level of significance.
Remember to state the null, the alternative, the test statistic and its distribution
under the null and the rejection rule, and state your conclusion about the effect
of separation of management from ownership on a firm's performance.
(5 marks)
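The F statistic for this test can be sketched from the reported sums of squared residuals. The sketch assumes that the weighted regressions are the relevant pair, with (8) as the restricted model and (10) as the unrestricted model, since the usual F test relies on homoskedastic errors. The helper name `f_crit_2_num_df` is hypothetical and uses the closed-form tail of the F distribution that is available only when the numerator has 2 degrees of freedom.

```python
n = 69               # firms in the sample
k_unrestricted = 3   # slope terms in the unrestricted model, excluding the intercept
q = 2                # restrictions: beta_2 = 0 and beta_3 = 0
df_resid = n - k_unrestricted - 1

ssr_restricted = 27.95     # weighted regression (8)
ssr_unrestricted = 25.32   # weighted regression (10)

# SSR form of the F statistic.
f_stat = ((ssr_restricted - ssr_unrestricted) / q) / (ssr_unrestricted / df_resid)


def f_crit_2_num_df(alpha, df2):
    """Upper-alpha critical value of F(2, df2).

    Uses P(F > x) = (1 + 2x/df2) ** (-df2/2), valid only for 2 numerator
    degrees of freedom.
    """
    return (df2 / 2) * (alpha ** (-2 / df2) - 1)


f_crit = f_crit_2_num_df(0.01, df_resid)
reject = f_stat > f_crit

print(f"F = {f_stat:.3f} with ({q}, {df_resid}) degrees of freedom")
print(f"1% critical value = {f_crit:.3f}, reject H0: {reject}")
```

Under these assumptions the statistic is below the 1% critical value, so the null that the form of management affects neither the intercept nor the slope would not be rejected at that level.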
Question 3 (Final Exam S2, 2017)
3.a. We would like to estimate the demand for beef. We have a time series sample of
150 seasonally adjusted quarterly observations on
$Q_{beef}$ = quantity of beef demanded in kg
$P_{beef}$ = price of beef in dollars per kg
$P_{chicken}$ = price of chicken in dollars per kg
$INC$ = disposable income in dollars.
We consider the demand model for beef written as:
$$
Q_{beef,t} = \beta_0 + \beta_1 P_{beef,t} + \beta_2 P_{chicken,t} + \beta_3 INC_t + \beta_4 t + u_t, \tag{11}
$$
where $t$ is a time trend such that $t = 1, 2, \ldots, 150$. If (11) is dynamically well
specified, what type of process would you expect the error term ($u_t$) to follow?
Provide the properties of this process.
(2 marks)
3.b. The researcher estimated (11) by OLS and obtained the corresponding residuals,
denoted by $\hat{u}_t$. He plotted the sample autocorrelation and partial autocorrelation
functions for $\hat{u}_t$, which are shown in Figure 1 below:
Figure 1
Lag    AC      PAC        Q-Stat   Prob
1      0.556   0.556      62.684   0.000
2      0.436   0.184      101.43   0.000
3      0.317   0.026      122.04   0.000
4      0.183   -0.07...   128.98   0.000
5      0.124   -0.00...   132.18   0.000
6      0.112   0.054      134.79   0.000
7      0.098   0.033      136.81   0.000
8      0.051   -0.04...   137.35   0.000
9      0.070   0.039      138.38   0.000
1...   0.063   0.020      139.22   0.000
(i) What does the information in Figure 1 suggest with regard to the behaviour
of the error term in (11)? Briefly explain.
(2 marks)
(ii) Set up a Breusch-Godfrey test that can be used for testing no serial correlation
in errors against the alternative of serial correlation of order 2. Clearly
state the steps involved, the null and alternative hypotheses of the test, the
statistic(s) of interest and corresponding distribution(s).
(3 marks)
(iii) The $R^2$ obtained from the regression estimated in 3.b.(ii) with $\hat{u}_t$ as dependent
variable was equal to 0.658. What conclusion would you draw with respect
to the behaviour of the error term? Briefly explain.
(2 marks)
(iv) What are the implications of the results drawn in 3.b.(iii) with regard to
linear regression modelling? What standard errors for the OLS estimates
would you recommend using in this instance? Briefly explain.
(3 marks)
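The calculation behind parts (ii) and (iii) can be sketched as follows. In the Breusch-Godfrey procedure the OLS residuals from (11) are regressed on the original regressors and on their own first two lags, and an LM statistic is built from the R-squared of that auxiliary regression. The sketch assumes the (n - q) R-squared form of the statistic (some texts use n R-squared instead) and a 5% significance level, which the question does not specify. Variable names are hypothetical.

```python
import math

n = 150          # quarterly observations
q = 2            # order of serial correlation under the alternative
r2_aux = 0.658   # R-squared of the auxiliary regression, from part (iii)

# LM statistic; q observations are lost to the lagged residuals.
lm_stat = (n - q) * r2_aux

# Under H0 (no serial correlation) the statistic is approximately
# chi-square with q = 2 degrees of freedom. For 2 degrees of freedom the
# tail probability is exp(-x / 2), so no statistics library is needed.
alpha = 0.05     # assumed significance level
chi2_crit = -2 * math.log(alpha)
p_value = math.exp(-lm_stat / 2)
reject_no_serial_correlation = lm_stat > chi2_crit

print(f"LM = {lm_stat:.3f}, critical value = {chi2_crit:.3f}")
print(f"p-value = {p_value:.3g}, reject H0: {reject_no_serial_correlation}")
```

The statistic is far above the critical value, which points to serially correlated errors. In that case the usual OLS standard errors are not valid, and heteroskedasticity and autocorrelation consistent (for example Newey-West) standard errors are a common recommendation.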
3.c. Our economist friend tells us that economic theory suggests the following
nonlinear demand function for beef at time $t$:
$$
Q_{beef,t} = e^{\beta_0 + \delta t + v_t}\, P_{beef,t}^{\beta_1}\, P_{chicken,t}^{\beta_2}\, INC_t^{\beta_3}, \tag{12}
$$
where variables $Q_{beef}$, $P_{beef}$, $P_{chicken}$, $INC$ and $t$ are defined above and $v_t$ is a
random error term. He tells us that as a result, the linear regression technique
would be incapable of providing estimates of the parameters of this demand function,
namely $\beta_1$, $\beta_2$ and $\beta_3$, and we need more sophisticated nonlinear estimation
techniques.
(i) Explain to him that he is incorrect by recommending an appropriate
transformation of (12).
(2 marks)
(ii) Using the transformed model, or otherwise, interpret the parameter $\beta_1$.
(1 mark)
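A sketch of the transformation that parts (i) and (ii) point towards is given below. It assumes all prices, income and quantity are strictly positive so that logarithms exist, and it writes the trend coefficient as $\delta$, a symbol chosen here for illustration.

```latex
\begin{align*}
\ln Q_{beef,t} &= \beta_0 + \delta t + \beta_1 \ln P_{beef,t}
  + \beta_2 \ln P_{chicken,t} + \beta_3 \ln INC_t + v_t \\
\beta_1 &= \frac{\partial \ln Q_{beef,t}}{\partial \ln P_{beef,t}}
\end{align*}
```

The transformed model is linear in the parameters, so it can be estimated by OLS using the logged variables and the trend. In this form $\beta_1$ is the own-price elasticity of demand: a 1% increase in the price of beef is associated with a change of approximately $\beta_1$% in the quantity demanded, holding the other variables fixed.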