EXERCISES AND CASES IN ECONOMETRICS

2020/2021

Amado Peiró
(Universitat de València)

STUDENT: ______________________________________________

1. With the information shown in the following table,

OBS. X Y
1 4 8
2 8 2
3 5 5
4 2 5

a) Specify a simple linear regression model where 𝑌 is the dependent variable and 𝑋 is the
independent variable.
b) Estimate the model with the normal equations, with the direct equations and with a computer.
c) Write down the estimated regression equation.
d) Interpret the estimate that accompanies the independent or explanatory variable.
e) Calculate the fitted or predicted values of the dependent variable.
f) Calculate the residuals.
g) Plot the points of the observations (diagram of dispersion or scatter plot), the fitted regression
line, the fitted values and the residuals.
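
As a sketch of the "direct equations" route in part b), the slope and intercept can be computed from deviations from the sample means. The function name `ols_simple` is an illustrative choice and not part of the exercise; the data are those of the table in exercise 1.

```python
def ols_simple(x, y):
    """Return (intercept, slope) of the least squares line y = b1 + b2*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sums of cross products and squares in deviations from the means
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b2 = s_xy / s_xx
    b1 = y_bar - b2 * x_bar
    return b1, b2


x = [4, 8, 5, 2]
y = [8, 2, 5, 5]

b1, b2 = ols_simple(x, y)
fitted = [b1 + b2 * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

print(f"intercept = {b1:.2f}, slope = {b2:.2f}")
print("fitted values:", [round(f, 2) for f in fitted])
print("residuals:    ", [round(r, 2) for r in residuals])
```

With these data the slope is -0.64 and the intercept 8.04, and the residuals add up to zero, which is a quick check on hand calculations with the normal equations.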

2. With the information shown in the following table,

OBS. X Y
1 4 2
2 8 8
3 5 7
4 2 3

a) Specify a simple linear regression model where 𝑌 is the dependent variable and 𝑋 is the
independent variable.
b) Estimate the model with the normal equations, with the direct equations and with a computer.
c) Write down the estimated regression equation.
d) Interpret the estimate that accompanies the independent or explanatory variable.
e) Calculate the fitted or predicted values of the dependent variable.
f) Calculate the residuals.
g) Plot the points of the observations (diagram of dispersion or scatter plot) and the fitted
regression line and point out the fitted values and the residuals.

3. Given the following regression models:

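$Y_i = \beta_1 + \beta_2 \cdot X_i + u_i$
$Y_i^* = \gamma_1 + \gamma_2 \cdot X_i + u_i^*$

where $Y_i^* = \lambda \cdot Y_i$, prove that the least squares estimators verify that:

$\hat{\gamma}_1 = \lambda \cdot \hat{\beta}_1$
$\hat{\gamma}_2 = \lambda \cdot \hat{\beta}_2$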

4. Given the following regression models:

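$Y_i = \beta_1 + \beta_2 \cdot X_i + u_i$
$Y_i = \gamma_1 + \gamma_2 \cdot X_i^* + u_i^*$

where $X_i^* = \lambda \cdot X_i$, prove that:

$\hat{\gamma}_2 = \dfrac{\hat{\beta}_2}{\lambda}$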

5. In a simple linear regression model, what is the effect of multiplying both the dependent and the
independent variables by the same constant?
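
The rescaling results of exercises 3 to 5 can be checked numerically before attempting the proofs. The sketch below is only such a check, not a proof: it reuses the data of exercise 2, and the constant `lam = 3.0` and the helper `ols_simple` are assumptions made for illustration.

```python
def ols_simple(x, y):
    """Return (intercept, slope) of the least squares line y = b1 + b2*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b2 = s_xy / s_xx
    return y_bar - b2 * x_bar, b2


x = [4, 8, 5, 2]
y = [2, 8, 7, 3]
lam = 3.0  # hypothetical constant

x_star = [lam * xi for xi in x]
y_star = [lam * yi for yi in y]

b1, b2 = ols_simple(x, y)                    # original model
g1_y, g2_y = ols_simple(x, y_star)           # exercise 3: only Y is rescaled
g1_x, g2_x = ols_simple(x_star, y)           # exercise 4: only X is rescaled
g1_xy, g2_xy = ols_simple(x_star, y_star)    # exercise 5: both are rescaled

print("original      :", round(b1, 4), round(b2, 4))
print("Y rescaled    :", round(g1_y, 4), round(g2_y, 4))
print("X rescaled    :", round(g1_x, 4), round(g2_x, 4))
print("both rescaled :", round(g1_xy, 4), round(g2_xy, 4))
```

Rescaling only Y multiplies both estimates by the constant, rescaling only X divides the slope by it and leaves the intercept unchanged, and rescaling both leaves the slope unchanged while the intercept is multiplied by the constant.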

6. A zoologist thinks that there is a linear relationship between the weights and the lengths of certain
mammals. To study this relationship, the following sample of twenty animals is available:

ANIMAL  WEIGHT  LENGTH    ANIMAL  WEIGHT  LENGTH
1       2,1     44        11      1,6     41
2       2,7     49        12      1,8     47
3       2,5     51        13      1,8     43
4       1,4     43        14      1,3     42
5       1,9     39        15      1,4     39
6       1,7     46        16      1,9     46
7       2,4     51        17      1,8     43
8       1,8     46        18      1,6     43
9       2,3     44        19      1,6     47
10      2,2     42        20      1,6     49

where WEIGHT is expressed in kilograms and LENGTH in centimeters.

a) Specify a simple linear regression model that relates the weight to the length of the animals.
b) Estimate this model with a computer.
c) Interpret the estimate of the slope.
d) How much will an animal weigh if it is 45 centimeters long?
e) If an animal is three centimeters longer than another, what difference in weight can be
expected?
f) Interpret the coefficient of determination.
g) If weight is expressed in grams and length in centimeters, what estimation is obtained?
h) If weight is expressed in kilograms and length in meters, what estimation is obtained?
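
A minimal sketch of how parts b) to h) could be checked with a computer, assuming plain Python rather than an econometrics package. The helper names `fit_line` and `predict` are illustrative; the data are those of the table above, with decimal commas written as decimal points.

```python
def fit_line(x, y):
    """Return (intercept, slope, R-squared) for the regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b2 = s_xy / s_xx
    b1 = y_bar - b2 * x_bar
    ssr = sum((yi - b1 - b2 * xi) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return b1, b2, 1 - ssr / sst


weight = [2.1, 2.7, 2.5, 1.4, 1.9, 1.7, 2.4, 1.8, 2.3, 2.2,
          1.6, 1.8, 1.8, 1.3, 1.4, 1.9, 1.8, 1.6, 1.6, 1.6]   # kilograms
length = [44, 49, 51, 43, 39, 46, 51, 46, 44, 42,
          41, 47, 43, 42, 39, 46, 43, 43, 47, 49]             # centimeters

b1, b2, r2 = fit_line(length, weight)


def predict(cm):
    """Expected weight in kilograms for an animal of the given length."""
    return b1 + b2 * cm


print("d) expected weight at 45 cm:", round(predict(45), 3))
print("e) expected difference for 3 cm:", round(predict(48) - predict(45), 3))
print("f) R-squared:", round(r2, 3))

# g) weight in grams, h) length in meters: only the units of the data change
_, b2_grams, _ = fit_line(length, [1000 * w for w in weight])
_, b2_meters, _ = fit_line([cm / 100 for cm in length], weight)
print("g) slope in grams per cm:", round(b2_grams, 3))
print("h) slope in kg per meter:", round(b2_meters, 3))
```

The last two lines illustrate that a change of units rescales the estimates without changing the fit.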

7. With the following sample data:

Year $Y_t$ $X_{2,t}$ $X_{3,t}$
2011 10 1 0
2012 25 3 -1
2013 32 4 0
2014 43 5 1
2015 58 7 -1
2016 62 8 0
2017 67 10 -1
2018 71 10 2

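estimate the model $Y_t = \beta_1 + \beta_2 X_{2,t} + \beta_3 X_{3,t} + u_t$ by least squares. To do so:

a) Build and find $\mathbf{X}$, $\mathbf{X}'\mathbf{X}$, $(\mathbf{X}'\mathbf{X})^{-1}$, $\mathbf{y}$, $\mathbf{X}'\mathbf{y}$, $\hat{\boldsymbol{\beta}}$, $\hat{\mathbf{y}}$ and $\hat{\mathbf{u}}$.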
b) Write down the estimated model.
c) Interpret the estimates.
d) Verify the descriptive properties.
e) Calculate the coefficient of determination and interpret its value.
f) Calculate the adjusted coefficient of determination.
g) Estimate the error variance.
h) Estimate now the model with a computer and verify your previous results.
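
For part h), one possible computer check is sketched below in plain Python, so that it runs without additional packages; in practice an econometrics package or a matrix library would normally be used. The helper names are assumptions made for this sketch, and the system $\mathbf{X}'\mathbf{X}\hat{\boldsymbol{\beta}} = \mathbf{X}'\mathbf{y}$ is solved directly instead of forming the inverse explicitly.

```python
def transpose(a):
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    """Matrix product of two matrices stored as lists of rows."""
    return [[sum(ai * bj for ai, bj in zip(row, col)) for col in zip(*b)]
            for row in a]


def solve(a, b):
    """Solve a z = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [bi] for row, bi in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


y = [10, 25, 32, 43, 58, 62, 67, 71]
x2 = [1, 3, 4, 5, 7, 8, 10, 10]
x3 = [0, -1, 0, 1, -1, 0, -1, 2]

X = [[1.0, a, b] for a, b in zip(x2, x3)]          # column of ones first
Xt = transpose(X)
XtX = matmul(Xt, X)
Xty = [sum(xi * yi for xi, yi in zip(col, y)) for col in Xt]

beta = solve(XtX, Xty)                             # least squares estimates
y_hat = [sum(b * xi for b, xi in zip(beta, row)) for row in X]
u_hat = [yi - fi for yi, fi in zip(y, y_hat)]

n, k = len(y), len(beta)
y_bar = sum(y) / n
ssr = sum(u ** 2 for u in u_hat)
sst = sum((yi - y_bar) ** 2 for yi in y)
r2 = 1 - ssr / sst
r2_adj = 1 - (1 - r2) * (n - 1) / (n - k)
sigma2_hat = ssr / (n - k)                         # estimate of the error variance

print("estimates:", [round(b, 4) for b in beta])
print("R2 =", round(r2, 4), " adjusted R2 =", round(r2_adj, 4))
print("sum of residuals:", round(sum(u_hat), 10))
```

The residuals should add up to zero and be orthogonal to each regressor, which are the descriptive properties asked for in part d).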

8. With the following information,

Year $Y_t$ $X_{2t}$ $X_{3t}$
2011 10 1 0
2012 8 3 -1
2013 7 4 0
2014 6 5 1
2015 13 7 -1
2016 6 8 0
2017 12 10 -1
2018 ? 10 2

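estimate the model $Y_t = \beta_1 + \beta_2 X_{2t} + \beta_3 X_{3t} + u_t$ by least squares. To do so:

a) Choose a value for $Y_{2018}$ between 4 and 20.
b) Build and find $\mathbf{X}$, $\mathbf{X}'\mathbf{X}$, $(\mathbf{X}'\mathbf{X})^{-1}$, $\mathbf{y}$, $\mathbf{X}'\mathbf{y}$, $\hat{\boldsymbol{\beta}}$, $\hat{\mathbf{y}}$ and $\hat{\mathbf{u}}$.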
c) Write down the estimated model.
d) Interpret the estimates.
e) Verify the descriptive properties.
f) Calculate the coefficient of determination and interpret its value.
g) Calculate the adjusted coefficient of determination.
h) Estimate the error variance.
i) Estimate now the model with a computer and verify your previous results.

9. The file Alim contains the following variables for a sample of 80 families:

FOOD: Annual family expenditure on food and non-alcoholic drinks purchased to be
consumed at home. It is expressed in euros per year.
INC: Annual disposable family income. It is expressed in euros per year.

a) Estimate the model $FOOD_i = \beta_1 + \beta_2 \cdot INC_i + u_i$ and interpret the estimate of $\beta_2$.
b) Estimate the model $\ln(FOOD_i) = \beta_1 + \beta_2 \cdot INC_i + u_i$ and interpret the estimate of $\beta_2$.
c) Estimate the model $FOOD_i = \beta_1 + \beta_2 \cdot \ln(INC_i) + u_i$ and interpret the estimate of $\beta_2$.
d) Estimate the model $\ln(FOOD_i) = \beta_1 + \beta_2 \cdot \ln(INC_i) + u_i$ and interpret the estimate of $\beta_2$.
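
The four functional forms above differ in how the slope is read. The sketch below summarises the usual interpretations as small functions; the function names and the numerical slopes in the examples are hypothetical and are not estimates from the file Alim.

```python
import math


def effect_lin_lin(beta, dx):
    """Y = b1 + beta*X: change in Y, in units of Y, when X changes by dx units."""
    return beta * dx


def effect_log_lin(beta, dx):
    """ln Y = b1 + beta*X: (approximate, exact) % change in Y when X changes by dx units."""
    approximate = 100 * beta * dx
    exact = 100 * (math.exp(beta * dx) - 1)
    return approximate, exact


def effect_lin_log(beta, pct_x):
    """Y = b1 + beta*ln X: approximate change in Y, in units of Y, when X changes by pct_x %."""
    return beta * pct_x / 100


def effect_log_log(beta, pct_x):
    """ln Y = b1 + beta*ln X: approximate % change in Y when X changes by pct_x % (elasticity)."""
    return beta * pct_x


# Hypothetical slopes, chosen only to illustrate the readings.
print(effect_lin_lin(0.1, 1000))     # units of Y for 1000 more units of X
print(effect_log_lin(0.05, 1))       # approximate and exact % change in Y
print(effect_lin_log(200.0, 1))      # units of Y for a 1% increase in X
print(effect_log_log(0.8, 10))       # % change in Y for a 10% increase in X
```

The approximate and exact percentage changes in the log-linear form are close only when the product of the slope and the change in X is small.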

10. The following estimation has been obtained from a sample formed by 88 dwellings in a certain
area:
𝑃 19,3 0,11 ⋅ 𝑠𝑓 15,2 ⋅ 𝑏𝑟

where 𝑃: House price in thousands of US$.
𝑠𝑓: Area of the house in square feet.
𝑏𝑟: Number of bedrooms in the house.

a) Holding the area of the house fixed, what is the estimated increase in price if a new bedroom
is added?
b) What is the estimated increase in price for a house if one bedroom is built and the area of the
house increases 200 square feet? Compare your response with that of a).
c) A house of 2,500 square feet and four bedrooms is offered for 270,000 dollars. What do you
think of the price?
d) Write down the estimated model in square meters (𝑠𝑞𝑟𝑚𝑡𝑠) as the unit of area and with the
price in thousands of euros (𝑃€ ).

NOTES: 1 square foot is approximately equal to 0,1 square meters.
Use the exchange rate of 1 euro = 1,10 US dollars.
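
For part d), the change of units can be sketched as a rescaling of the slope coefficients. The sketch assumes that both slopes enter the equation with a positive sign, as the wording of part a) suggests, and uses the two approximations given in the notes; the function names are illustrative. The intercept would be converted like the bedroom coefficient, since only the currency changes.

```python
USD_PER_EUR = 1.10     # exchange rate given in the notes
SQM_PER_SQFT = 0.1     # approximation given in the notes


def convert_area_coef(coef_usd_per_sqft):
    """Thousand US$ per square foot -> thousand euros per square meter."""
    return coef_usd_per_sqft / SQM_PER_SQFT / USD_PER_EUR


def convert_currency_coef(coef_usd):
    """Thousand US$ per unit -> thousand euros per unit (regressor unchanged)."""
    return coef_usd / USD_PER_EUR


coef_sf = 0.11    # coefficient on sf, taken as positive (assumption)
coef_br = 15.2    # coefficient on br, taken as positive (assumption)

coef_sqrmts = convert_area_coef(coef_sf)
coef_br_eur = convert_currency_coef(coef_br)

print("coefficient on sqrmts (thousand euros per square meter):", round(coef_sqrmts, 3))
print("coefficient on br (thousand euros per bedroom):", round(coef_br_eur, 3))
```

One square meter corresponds to about ten square feet, so the area coefficient is multiplied by ten before the currency conversion.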

11. Consider the file Mates to analyze the relationship between the age of teachers and their
students’ grades in mathematics. This file contains a sample of 1,000 observations of the following
variables:

AGE: age of the teacher in years.
MATHS: students’ scores in a specific test on a 0-10 scale.

a) Specify a simple regression model that relates the age of teachers and their students’ scores.
b) Estimate the model with a computer.
c) Interpret the slope estimator.
d) Now add the independent variable in quadratic form and estimate this new model. According
to this estimation, interpret the relationship between variables.
e) What grades can be expected for the students of teachers 30, 40, 50 and 60 years old?
f) What would be the ‘optimal’ age of teachers?
g) Discuss the overall results.
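
For parts e) and f), once the quadratic model has been estimated, expected scores and the turning point follow directly from the estimates. The sketch below uses hypothetical coefficient values chosen only for illustration; they are not estimates from the file Mates, and the function names are assumptions.

```python
def quadratic_prediction(b1, b2, b3, age):
    """Expected score from MATHS = b1 + b2*AGE + b3*AGE^2."""
    return b1 + b2 * age + b3 * age ** 2


def turning_point(b2, b3):
    """Age at which the fitted parabola peaks (b3 < 0) or bottoms out (b3 > 0)."""
    if b3 == 0:
        raise ValueError("no turning point when the quadratic coefficient is zero")
    return -b2 / (2 * b3)


# Hypothetical estimates, for illustration only.
b1, b2, b3 = 2.0, 0.20, -0.002

for age in (30, 40, 50, 60):
    print(age, round(quadratic_prediction(b1, b2, b3, age), 2))

best_age = turning_point(b2, b3)
print("turning point:", round(best_age, 1))
```

With a negative quadratic coefficient the turning point is a maximum, which is the sense in which an 'optimal' age can be discussed.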

12. Let us consider the following model:
𝑌 5 2,7 ∙ 𝑋 , 1,4 ∙ 𝑋 , 3,8 ∙ ln∙ 𝑋 , 6∙𝑋 , 0,5 ∙ 𝑋 , , with 𝑖 1, 2, … , 60.

𝑌 is expressed in euros, 𝑋 is expressed in kilograms, 𝑋 is expressed in centimeters and both 𝑋


and 𝑋 are expressed in euros. What change would you expect in 𝑌 when:

𝑋 increases by one kilogram …………………………………..

𝑋 increases by seven kilograms ……………………………..

𝑋 decreases by 3,1 kilograms …………………………………

𝑋 decreases by 600 grams ……………………………….……

𝑋 increases by one centimeter ……………………………….

𝑋 increases by two meters ……………………………………..

𝑋 decreases by 14 centimeters ………………………………

𝑋 increases by 2% …………………………………………………

𝑋 decreases by 7% ……………………………………………...…

The rate of growth in 𝑋 is 0,03 ……………………………….

The increase rate in 𝑋 is 0,03 …………………………………

The rate of change in 𝑋 is 0,03 ……………………………….

𝑋 increases 1 euro …………………………………………………

𝑋 decreases by 7 euros ………………………………………….

𝑋 is equal to 100 euros and decreases by 7 euros ……

13. Now, let us consider the following model:

ln 𝑌 5 2,7 ∙ 𝑋 , 1,4 ∙ 𝑋 , 3,8 ∙ ln 𝑋 , 6∙𝑋 , 0,5 ∙ 𝑋 , , with 𝑖 1, 2, … , 60.

𝑌 is expressed in euros, 𝑋 is expressed in kilograms, 𝑋 is expressed in centimeters and both 𝑋


and 𝑋 are expressed in euros. What change would you expect in 𝑌 when:

𝑋 increases by one kilogram …………………………………..

𝑋 increases by seven kilograms ……………………………..

𝑋 decreases by 3,1 kilograms …………………………………

𝑋 decreases by 600 grams ……………………………….……

𝑋 increases by one centimeter ……………………………….

𝑋 increases by two meters ……………………………………..

𝑋 decreases by 14 centimeters ………………………………

𝑋 increases by 2% …………………………………………………

𝑋 decreases by 7% ……………………………………………...…

The rate of growth in 𝑋 is 0,03 ……………………………….

The increase rate in 𝑋 is 0,03 …………………………………

The rate of change in 𝑋 is 0,03 ……………………………….

𝑋 increases 1 euro …………………………………………………

𝑋 decreases by 7 euros ………………………………………….

𝑋 is equal to 100 euros and decreases by 7 euros ……

14. With regard to the model $\ln Y_i = \beta_1 + \beta_2 X_{2i} + \beta_3 \ln X_{3i} + u_i$, argue the truth or falsity of
the statements below and, if necessary, propose alternative correct statements:

a) When variable 𝑋 increases by one unit, variable 𝑌 has a percentage growth rate equal to 𝛽 .
b) When variable 𝑋 increases by one unit, variable 𝑌 increases by 𝛽 units.

15. With regard to the following models,

𝑙𝑛 𝑌 3 0,17 ∙ 𝑋 , 1,4 ∙ 𝑙𝑛 𝑋 ,

𝑌 2 9,3 ∙ 𝑙𝑛 𝑋 , 3,5 ∙ 𝑋 ,

argue the truth or falsity of the statements below and, if necessary, propose alternative correct
statements:

a) When variable 𝑋 increases by one unit, Y has a growth rate equal to 17%.
b) When 𝑋 decreases by 9.3%, variable Y decreases by one unit.

16. Given the Cobb-Douglas production function, $Q = A \cdot L^{\alpha} \cdot K^{\beta} \cdot e^{u}$, where $Q$ designates
production, $L$ employment, and $K$ capital:

a) Linearize the model.
b) Interpret the parameters $\alpha$ and $\beta$.

17. Argue the truth or falsity of the following statements:

a) In a regression model, the residuals are equal to the random errors.
b) In a regression model, the coefficient of determination ($R^2$) is equal to the adjusted coefficient
of determination ($\bar{R}^2$).

18. With the file SELEC, and in accordance with the criteria of selection of models $R^2$, $\bar{R}^2$ and $AIC$,
select the best model with 𝑌 as regressand and 𝑋2, 𝑋3 and 𝑋4 as potential regressors. Now select
another model that has the logarithm of 𝑌 as regressand and the same potential regressors. Finally,
compare the two models selected earlier with any appropriate criterion.

19. With regard to exercise 7

a) What are the properties of the estimators?


b) Write down the variance-covariance matrix of these estimators.
c) Estimate the variance-covariance matrix of these estimators.
d) Under normality of errors, write down the distribution of the elements of the model.

20. With the data of the file Pau, the following model has been estimated:

$Y_i = \beta_1 + \beta_2 X_{2,i} + \beta_3 X_{3,i} + u_i$

and the following results have been obtained:


Dependent variable: Y
Method: least squares
Sample: 1 44
Included observations: 44
=========================================================
Variable    Coefficient   Std. Error   t-Statistic   Prob.
=========================================================
C            72.74922     67.59934      1.076182     0.2881
X2           2.088913     1.571701      1.329078     0.1912
X3          -8.125742     1.643958     -4.942791     0.0000
=========================================================
R-squared            0.409101    Mean dependent var      170.6647
Adjusted R-squared   0.380276    S.D. dependent var      134.6957
S.E. of regression   106.0359    Akaike info criterion   12.23118
Sum squared resid    460987.9    Schwarz criterion       12.35283
Log likelihood      -266.0859    Hannan-Quinn criter.    12.27629
F-statistic          14.19287    Durbin-Watson stat      1.559506
Prob(F-statistic)    0.000021

a) Test the null hypothesis that 𝛽 is equal to 4.


b) Test the hypothesis that 𝛽 is equal to –2.
c) Test the significance of 𝛽 .
d) Test the significance of 𝛽 .
e) Test the joint significance of the model.
f) Estimate the error variance.
g) Calculate the standard error of regression.
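
Several entries of the output above can be reproduced from the others, which is a useful check on the formulas needed in this exercise. The sketch below is written under the assumption that the reported Akaike criterion is the per-observation form computed from the log likelihood; the function name `t_stat` is illustrative. Critical values still have to be taken from the t and F tables, with 41 degrees of freedom for the t tests.

```python
import math

n, k = 44, 3                       # observations and estimated coefficients
coef = {"C": 72.74922, "X2": 2.088913, "X3": -8.125742}
se = {"C": 67.59934, "X2": 1.571701, "X3": 1.643958}
r2 = 0.409101
ssr = 460987.9
log_lik = -266.0859


def t_stat(name, null_value=0.0):
    """t statistic for H0: coefficient of `name` equals null_value."""
    return (coef[name] - null_value) / se[name]


t_x2 = t_stat("X2")                                   # significance of X2
t_x3 = t_stat("X3")                                   # significance of X3
f_stat = (r2 / (k - 1)) / ((1 - r2) / (n - k))        # joint significance
sigma2_hat = ssr / (n - k)                            # estimate of the error variance
se_regression = math.sqrt(sigma2_hat)                 # standard error of regression
r2_adj = 1 - (1 - r2) * (n - 1) / (n - k)
aic = (-2 * log_lik + 2 * k) / n                      # assumed per-observation form

print("t(X2) =", round(t_x2, 4), " t(X3) =", round(t_x3, 4))
print("F =", round(f_stat, 4))
print("error variance =", round(sigma2_hat, 2), " S.E. of regression =", round(se_regression, 4))
print("adjusted R2 =", round(r2_adj, 6), " AIC =", round(aic, 5))
```

A hypothesis with a value other than zero is handled by passing `null_value`, for example `t_stat("X3", -2.0)`.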

21. In the previous model, test the significance of the variables $X_2$ and $X_3$ and the joint significance
of the model using the P-values or marginal significance levels (Prob.).

22. With the data of the file Selec and just taking into account the significance of the variables,
select the best model with regressand 𝑌 and with possible regressors 𝑋2, 𝑋3, 𝑋4. Use first the
significance level of 5% and, subsequently, the significance level of 10%. Compare the results with
those obtained by applying the criteria of $R^2$, $\bar{R}^2$ and $AIC$.

23. With 34 observations, the following model has been estimated:

$Y_i = \beta_1 + \beta_2 X_{2,i} + \beta_3 X_{3,i} + \beta_4 X_{4,i} + u_i$

and the following estimation has been obtained:

𝑌 2,4 6,2 ⋅ 𝑋 , 5,8 ⋅ 𝑋 , 12,5 ⋅ 𝑋 ,


(1,1) (2,4) (1,2) (2,7)

where values in parentheses denote estimates of the standard deviations of the corresponding
estimates (standard errors).

a) Test the significance of the variable 𝑋 .


b) When 𝑋 increases by one unit, does 𝑌 decrease by 4 units?
c) Test if 𝛽 is equal to 2 against the possibility that it is less than 2.
d) Test if 𝛽 is equal to 10 against the possibility that it is greater than 10.

24. The following model:

$Y_i = \beta_1 + \beta_2 X_{2,i} + \beta_3 X_{3,i} + \beta_4 X_{4,i} + u_i$

has been estimated with 34 observations and the sum of squared residuals is 57.29.

a) Write down the restricted model to test 𝛽 𝛽 𝛽.


b) The sum of squared residuals of the restricted model that includes the above restriction is 69.11.
Test this restriction with significance levels of 5% and 1%.
c) Write down the restricted model with the following restrictions:
H :𝛽 𝛽 𝛽
𝛽 𝛽
d) The sum of squared residuals of the restricted model that includes the two previous
restrictions is 74.35. Test these restrictions with significance levels of 5% and 1%.
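
The tests in parts b) and d) compare restricted and unrestricted sums of squared residuals. A minimal sketch of the statistic follows, assuming one restriction in part b) and two in part d), as the wording of those parts indicates; the function name is illustrative, and the critical values of the F distribution with (q, 30) degrees of freedom must be taken from tables.

```python
def f_restrictions(ssr_restricted, ssr_unrestricted, q, n, k):
    """F statistic for q linear restrictions.

    n is the number of observations and k the number of coefficients
    in the unrestricted model.
    """
    numerator = (ssr_restricted - ssr_unrestricted) / q
    denominator = ssr_unrestricted / (n - k)
    return numerator / denominator


n, k = 34, 4          # observations and coefficients of the unrestricted model
ssr_u = 57.29

f_b = f_restrictions(69.11, ssr_u, q=1, n=n, k=k)   # part b): one restriction
f_d = f_restrictions(74.35, ssr_u, q=2, n=n, k=k)   # part d): two restrictions

print("b) F =", round(f_b, 4), "with (1, 30) degrees of freedom")
print("d) F =", round(f_d, 4), "with (2, 30) degrees of freedom")
```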

25. With the data of the file Dimecres, the model

$Y_i = \beta_1 + \beta_2 X_{2,i} + \beta_3 X_{3,i} + \beta_4 X_{4,i} + \beta_5 X_{5,i} + u_i$

has been estimated. Test the following hypotheses:

a) H : 2𝛽 𝛽 24
b) H : 5𝛽 4 6𝛽
6 3𝛽 𝛽
10 𝛽
c) H : 𝛽 𝛽 𝛽 0

Verify that the results obtained are the same as those obtained with direct computer tests.

26. With monthly data, the model $Y_t = \beta_1 + \beta_2 X_{2,t} + \beta_3 X_{3,t} + u_t$ has been estimated for different
sample periods, and the following sums of squared residuals have been obtained:

        1994:01 – 2003:12    2004:01 – 2009:12    1994:01 – 2009:12
SSR     366                  189                  562

With these sums, test the structural stability of the model in the periods 1994:01 – 2003:12 and
2004:01 – 2009:12.
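
A structural stability (Chow) test can be computed from the three sums of squared residuals alone. The sketch below assumes that the model has three coefficients and that every month of each period is in the sample, which gives 192 observations in total; the function name `chow_f` is illustrative.

```python
def chow_f(ssr_pooled, ssr_groups, n_total, k):
    """Chow F statistic and its degrees of freedom.

    ssr_groups lists the SSR of each separately estimated subsample and
    k is the number of coefficients in the model.
    """
    g = len(ssr_groups)
    ssr_unrestricted = sum(ssr_groups)
    df_num = (g - 1) * k
    df_den = n_total - g * k
    f = ((ssr_pooled - ssr_unrestricted) / df_num) / (ssr_unrestricted / df_den)
    return f, df_num, df_den


n_total = 16 * 12      # monthly observations from 1994:01 to 2009:12
f, df_num, df_den = chow_f(562, [366, 189], n_total, k=3)

print("F =", round(f, 4), "with", (df_num, df_den), "degrees of freedom")
```

The same function accepts more than two subsamples, which is the situation that arises when a sample is split into several groups.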

27. 35 students from school A and 31 students from school B take a common exam. To analyze
possible differences between the two schools, the following regressions have been estimated, where
the values in parentheses are standard deviations.

►Regression 1. Sample with all students:


𝑃 17,6 0,20 ∙ 𝐻 0,37 ∙ 𝐴
(5,5) (0,04) (0,13) $R^2 = 0,35$, $SSR = 10948,1$

►Regression 2. Sample formed exclusively by students from school A:


𝑃 17,1 0,18 ∙ 𝐻 0,44 ∙ 𝐴
(6,7) (0,06) (0,18) $R^2 = 0,41$, $SSR = 5301,3$

►Regression 3. Sample formed exclusively by students from school B:


𝑃 17,5 0,23 ∙ 𝐻 0,36 ∙ 𝐴
(9,8) (0,07) (0,19) $R^2 = 0,30$, $SSR = 5413,9$

where 𝑃: Points obtained in the exam
𝐻: Hours of study
𝐴: Hours of attendance at school

a) Is the model in Regression 1 significant?


b) Do the schools differ?
c) Could the possible differences between schools be analysed with a simple t-test for the equality
of the means of P in both schools?

28. The file Sleep75 (Jeffrey M. Wooldridge, Introductory Econometrics) includes the following
variables for a sample of 706 adults:

SLEEP: Sleep time in weekly minutes.
TOTWRK: Work time in weekly minutes.
EDUC: Years of education.
AGE: Age in years.

The estimation of the model $SLEEP_i = \beta_1 + \beta_2 \cdot TOTWRK_i + \beta_3 \cdot EDUC_i + \beta_4 \cdot AGE_i + u_i$ is
shown in Table 1 (Model 1), and in Table 2 (Model 2) the same model is estimated excluding the
regressors EDUC and AGE.

a) Analyze the relevance of Model 1.


b) Analyze the relevance of each of the variables in Model 1.
c) How much time will a person sleep every day if he or she is 20 years of age, has 14 years of
education and works 40 hours a week?
d) Does working more time imply sleeping less?
e) Does an additional hour of work imply sleeping fifteen minutes less?
f) Does an additional hour of work imply sleeping ten minutes less?
g) Do education and age together affect the sleep time?

TABLE 1 (MODEL 1)
========================================================================
Dependent variable: SLEEP
Sample: 1 706
=====================================================
Variable Coefficient Std. Error t-Statistic Prob.
=====================================================
C 3638.245 112.2751 32.40474 0.0000
TOTWRK -0.148373 0.016694 -8.888075 0.0000
EDUC -11.13381 5.884575 -1.892034 0.0589
AGE 2.199885 1.445717 1.521657 0.1285
======================================================
R-squared 0.113364 Mean dependent var 3266.356
Adjusted R-squared 0.109575 S.D. dependent var 444.4134
S.E. of regression 419.3589 Akaike info criterion 14.92098
Sum squared resid 123455057 Schwarz criterion 14.94681
Log likelihood -5263.106 F-statistic 29.91889
Durbin-Watson stat 1.942609 Prob(F-statistic) 0.000000
========================================================================

TABLE 2 (MODEL 2)
========================================================================
Dependent variable: SLEEP
Sample: 1 706
======================================================
Variable Coefficient Std. Error t-Statistic Prob.
======================================================
C 3586.377 38.91243 92.16534 0.0000
TOTWRK -0.150746 0.016740 -9.004992 0.0000
======================================================
R-squared 0.103287 Mean dependent var 3266.356
Adjusted R-squared 0.102014 S.D. dependent var 444.4134
S.E. of regression 421.1357 Akaike info criterion 14.92662
Sum squared resid 124858119 Schwarz criterion 14.93953
Log likelihood -5267.096 F-statistic 81.08987
Durbin-Watson stat 1.954559 Prob(F-statistic) 0.000000
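
The calculations behind parts c), e), f) and g) of exercise 28 can be sketched from the two tables. The sketch assumes that "an additional hour of work" and the minutes of sleep both refer to weekly amounts, so that fifteen minutes less per hour corresponds to a coefficient of -15/60; this reading is an assumption. Critical values must still be taken from the t and F tables.

```python
coef = {"C": 3638.245, "TOTWRK": -0.148373, "EDUC": -11.13381, "AGE": 2.199885}
se_totwrk = 0.016694
n, k = 706, 4
ssr_model1 = 123455057      # unrestricted model (Table 1)
ssr_model2 = 124858119      # restricted model without EDUC and AGE (Table 2)

# c) Prediction: work time has to be expressed in weekly minutes.
work_minutes = 40 * 60
weekly_sleep = (coef["C"] + coef["TOTWRK"] * work_minutes
                + coef["EDUC"] * 14 + coef["AGE"] * 20)
daily_sleep_hours = weekly_sleep / 7 / 60

# e), f) t statistics for H0: beta_TOTWRK = -15/60 and H0: beta_TOTWRK = -10/60
t_e = (coef["TOTWRK"] - (-15 / 60)) / se_totwrk
t_f = (coef["TOTWRK"] - (-10 / 60)) / se_totwrk

# g) Joint test of EDUC and AGE: q = 2 exclusion restrictions
q = 2
f_g = ((ssr_model2 - ssr_model1) / q) / (ssr_model1 / (n - k))

print("c) weekly sleep (minutes):", round(weekly_sleep, 1))
print("c) daily sleep (hours):", round(daily_sleep_hours, 2))
print("e) t =", round(t_e, 3), " f) t =", round(t_f, 3))
print("g) F =", round(f_g, 3), "with (2, 702) degrees of freedom")
```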

29. With a sample formed by 80 Italian, 110 German and 90 French people, an analysis of spending
on certain cultural products is carried out with the following regression model:

$CULT_i = \beta_1 + \beta_2 INC_i + \beta_3 UNI_i + u_i$


where CULT: Spending on certain cultural products in thousands of euros per year.
INC: Available income expressed in thousands of euros per year.
UNI: Dummy variable that takes value 1 if the person has university studies and 0
otherwise.

The estimation with the sample formed by all people is:


𝐶𝑈𝐿𝑇 0,84 0,037 ∙ 𝐼𝑁𝐶 0,43 ∙ 𝑈𝑁𝐼 , 𝑛 280, 𝑅 0,19, 𝑆𝑆𝑅 104,0
and the models for each country are:
Italy: 𝐶𝑈𝐿𝑇 0,93 0,037 ∙ 𝐼𝑁𝐶 0,27 ∙ 𝑈𝑁𝐼 , 𝑛 80, 𝑅 0,21, 𝑆𝑆𝑅 22,2
Germany: 𝐶𝑈𝐿𝑇 0,82 0,048 ∙ 𝐼𝑁𝐶 0,35 ∙ 𝑈𝑁𝐼 , 𝑛 110, 𝑅 0,20, 𝑆𝑆𝑅 48,0
France: 𝐶𝑈𝐿𝑇 0,67 0,031 ∙ 𝐼𝑁𝐶 0,59 ∙ 𝑈𝑁𝐼 , 𝑛 90, 𝑅 0,27, 𝑆𝑆𝑅 24,2

Are there international differences in the relation of spending on culture to income and education?

30. The file Salari contains information of the following variables for the workers of a certain
company:

S: Gross annual salary in euros.


W: Work experience in years.
G: Dummy variable that takes the value 1 for men and 0 for women.

a) Examine the data. How many men are there in the sample? How many women?
b) Estimate the following regression model:

$S_i = \beta_1 + \beta_2 W_i + \beta_3 G_i + u_i$

to explain the workers’ salary on the basis of their work experience and gender. Which is the
gender of reference?
c) Interpret the parameters $\beta_2$ and $\beta_3$, and their estimates.
d) According to the previous estimation, does work experience affect the salary?
e) According to the previous estimation, what increase can be expected in salary when work
experience increases by one year?
f) According to the previous estimation, is there gender discrimination?
g) According to the previous estimation, what proportion of salary variations are explained by
gender and experience?
h) Let us suppose now that value 1 in variable G denotes women and value 0 denotes men. How
do you interpret the estimator accompanying this variable?
i) In addition, the following variables are defined: M (with value 1 for men and 0 for women) and
W (with value 1 for women and 0 for men). Consider the following models:
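$S_i = \beta_1 + \beta_2 W_i + \beta_3 G_i + \beta_4 M_i + u_i$
$S_i = \beta_1 + \beta_2 W_i + \beta_3 G_i + \beta_4 W_i + u_i$
$S_i = \beta_1 + \beta_2 W_i + \beta_3 M_i + \beta_4 W_i + u_i$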
What will happen with these models?
j) Now estimate a model that allows us to analyze if experience is paid equally to men and women.
Analyze the results.
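
Part i) of exercise 30 concerns perfect multicollinearity among dummy variables. The sketch below illustrates the problem with a hypothetical sample of six workers, which is not the file Salari: when a constant and both gender dummies are included, the cross-product matrix of the regressors is singular and the normal equations have no unique solution. Exact rational arithmetic is used so that the zero determinant is not blurred by rounding; all names are illustrative.

```python
from fractions import Fraction


def determinant(m):
    """Determinant by Gaussian elimination with exact rational arithmetic."""
    a = [[Fraction(v) for v in row] for row in m]
    n = len(a)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)          # no pivot available: singular matrix
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det                  # a row swap changes the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            a[r] = [v - factor * p for v, p in zip(a[r], a[col])]
    return det


def cross_product(columns):
    """X'X for a design matrix given as a list of columns."""
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in columns]
            for ci in columns]


# Hypothetical sample of six workers.
men = [1, 0, 1, 1, 0, 0]
women = [1 - m for m in men]            # the two dummies always add up to one
experience = [2, 5, 7, 1, 10, 4]
const = [1] * len(men)

det_both = determinant(cross_product([const, experience, men, women]))
det_one = determinant(cross_product([const, experience, men]))

print("constant and both dummies:", det_both)
print("constant and one dummy:   ", det_one)
```

Dropping one of the dummies, or the constant, removes the exact linear dependence.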

31. The file Oci contains observations of the following variables for a sample of 1200 people:

L: Spending on leisure, entertainment and culture in thousands of euros per year.


I: Available income in thousands of euros per year.
P: Dummy variable with value 1 if the maximum level of studies attained is primary education
and 0 otherwise.
S: Dummy variable with value 1 if the maximum level of studies attained is secondary education
and 0 otherwise.
U: Dummy variable with value 1 if the maximum level of studies attained is university education
and 0 otherwise.

a) Build a theoretical model without interactions between explanatory variables that allows us to
explain the spending on leisure on the basis of the income available and the level of studies.
Take primary education (P) as the reference level of studies.
b) Interpret the parameters of the previous model.
c) Estimate the model.
d) Are there significant differences in spending between people with secondary education and
people with primary education?
e) Are there significant differences in spending between people with secondary education and
people with university education?

32. With the same file Oci as in the preceding exercise,

a) Build now a model with interactions between the income and the level of studies that allows us
to explain the spending on leisure on the basis of the income available and the level of studies.
Take primary education (P) as the reference level of studies.
b) Interpret the parameters of the previous model.
c) Estimate this model.
d) Are there significant differences in spending between people with secondary education and
people with primary education?
e) Consider the model estimated in c). Are there significant differences in spending between
people with secondary education and people with university education?
f) Build a double-entry table that shows spending for incomes equal to 10,000, 20,000 and 30,000
euros and for different levels of education.

33. The following models have been estimated with a sample formed by 25 workers:

A. 𝑆 11,7 0,84 ⋅ 𝐸 1,23 ⋅ 𝐺 0,12 ⋅ 𝐺 1,88 ⋅ 𝐹


(0,69) (0,054) (0,58) (0,57) (0,52)
Sum of squared residuals = 20,71, $R^2$ = 0,967

B. 𝑆 12,2 0,81 ⋅ 𝐸 1,72 ⋅ 𝐺


(0,71) (0,066) (0,70)
Sum of squared residuals = 36,92, $R^2$ = 0,942

where S: Gross annual salary in thousands of euros.


E: Work experience in years.
G: Dummy variable with value 1 if the worker is a man and 0 otherwise.
A: Dummy variable that takes the value 1 if the worker speaks German and 0 otherwise.
F: Dummy variable that takes the value 1 if the worker speaks French and 0 otherwise.

and the values in parentheses are standard errors.

a) Does the work experience affect the salary? (Consider Model A)


b) Does the knowledge of both foreign languages affect the salary?
c) What difference in wage can be expected between a man who does not speak French and a
woman who speaks French?
d) Build a model that allows us to analyze if the wage gap between men and women increases
with work experience and explain how it can be tested. (Consider Model B)

34. The following estimation has been obtained with quarterly data from 1981:01 to 2008:04:
===========================================================
Dependent variable: CONSUM
Method: least squares
Sample: 1981Q1 2008Q4
Included observations: 112
===========================================================
Coefficient Std. Error t-Statistic Prob.
===========================================================
C 4.147752 1.206761 3.437096 0.0008
RENDA 0.111547 0.037210 2.997803 0.0034
T2 -0.502171 0.517375 -0.970613 0.3339
T3 -1.727730 0.518511 -3.332101 0.0012
T4 1.468067 0.517362 2.837600 0.0054
===========================================================
R-squared: 0.318068 Mean dependent var: 7.349508
Adjusted R-squared: 0.292575 S.D. dependent var: 2.300232
S.E. of regression: 1.934692 Akaike info criterion 4.201390
Sum squared resid.: 400.5047 Log likelihood: -230.2778
F-statistic: 12.47676 Durbin-Watson stat: 1.918160
Prob > F: 0.000000
===========================================================

where CONSUM is consumption in thousands of euros, RENDA is income in thousands of euros,
and T2, T3 and T4 are dummy variables that take the value 1 in the second, third and fourth quarter,
respectively, and 0 in the other cases.

a) Is consumption different in the first and in the third quarter?


b) Is consumption in the fourth quarter higher than in the first one?
c) What test would you run to analyze for seasonality or differences in the four quarters?
d) A new variable T1, with value 1 in the first quarter and 0 otherwise, is included. What will
happen?

35. To analyze the wages of different health workers, the following model is specified:

$S_i = \beta_1 + \beta_2 E_i + \beta_3 G_i + \beta_4 N_i + \beta_5 G_i \cdot N_i + u_i$

where 𝑆: Annual salary in euros.


𝐸: Work experience in years.
G: Dummy variable with value 1 if the worker is a man and 0 otherwise.
𝑁: Dummy variable with value 1 if the worker is Spanish and 0 otherwise.

a) What salary can be expected for every possible combination of gender and nationality?
b) Interpret the meaning of 𝛽 .

Case 1. The file Casa1* contains the following variables for a sample of houses:

PD: House price in thousands of dollars


BR: Number of bedrooms
LSF: Size of lot in square feet
HSF: Size of house in square feet
CO: Dummy variable that takes value 1 if home is colonial style, 0 otherwise

Note: Your comments and comparisons are especially important.

1. Build three new variables: PE (price in thousands of euros), LSM (lot size in square metres),
and HSM (house size in square metres). Take 1 € = 1,10 $ and 1 square metre = 10,76 square
feet.
2. Obtain descriptive statistics for the variable PE.
3. Detect possible outliers in PE.
4. Plot a histogram for the variable PE.
5. Obtain descriptive statistics for the variables BR, LSM and HSM.
6. Obtain the matrix of correlations for PD, BR, LSF and HSF.
7. Obtain the matrix of correlations for PE, BR, LSM and HSM.

8. Test the null hypothesis that the mean price for the houses is equal to 300,000 euros.
9. Is the mean price of colonial houses equal to 300,000 dollars?
10. Is the mean price of colonial houses equal to the mean price of non-colonial houses?

11. Plot a scatter graph for PE (ordinate) and HSM (abscissa).


12. Estimate by OLS a SLRM with PE as regressand and HSM as regressor.
13. Plot the residuals.

14. Estimate by OLS a MLRM with PE as regressand and BR, HSM, LSM and CO as regressors.
15. Exclude all the regressors that are not significant and estimate a new model that includes only
significant regressors.
16. Estimate a SLRM with PE as regressand and BR as the only regressor. Why is BR significant
now?
17. Build a 95% confidence interval for the price of one square meter in the house.

*Data taken from: Introductory Econometrics: A Modern Approach, Jeffrey M. Wooldridge,
Cengage Learning. Collected from the real estate pages of the Boston Globe during 1990. These are
homes selling in the Boston, MA area.

Case 2. The file Casa2* contains data for the following variables in 506 communities:

price median housing price, $


crime crimes committed per capita
nox nitrous oxide, parts per 100 mill.
rooms avg number of rooms per house
dist weighted distance to 5 employment centers
radial accessibility index to radial highways
proptax property tax per $1000
stratio average student-teacher ratio
lowstat % of people 'lower status'

1. Analyse all the variables (descriptive statistics, graphs, outliers, …).


2. Estimate a MLRM and analyse the regression output, with special attention to the
significance and signs of the estimates.
3. Analyse the residuals.
4. Does crime affect house prices?
5. Does crime affect house prices positively?
6. Does crime affect house prices negatively?
7. Does an increase of one student per teacher increase house prices by 1000 dollars?
8. Does an increase of one student per teacher decrease house prices by 1000 dollars?
9. What is the effect of a one percentage point increase in property tax on house prices?
10. Comment on the relationship between price and lowstat.

*Source: D. Harrison and D.L. Rubinfeld (1978), "Hedonic Housing Prices and the Demand for Clean
Air," Journal of Environmental Economics and Management 5, 81-102. See also Introductory
Econometrics: A Modern Approach, Jeffrey M. Wooldridge, Cengage Learning.

Case 3. The file Monet* contains the following variables for a sample of 430 Monet paintings:

Price: Sale Price in $ (million),


Height: Height (inches),
Width: Width (inches),
Signed: Dummy variable: 1 if signed, 0 if not,
Picture: ID number (identifies repeat sales),
House: Code for auction house where sale took place.

1. Convert the variable Price into euros and Height and Width into centimeters.
2. Describe the prices of Monet paintings.
3. Plot a histogram for the price of the paintings.
4. Are signed paintings more expensive than non-signed ones?
5. Do you see any anomalous observation?
6. Estimate a SLRM that relates the price of the paintings with the size.
7. Are larger paintings more expensive than smaller ones?
8. Estimate a MLRM that relates the price of the paintings with: size, format (Height related to
Width), Signed and House.
9. Are signed paintings more expensive than non-signed ones?
10. Are Monet paintings from auction house 1 cheaper than those from auction house 2?
11. Your econometrics teacher saves 500 euros every month. How long does your teacher need to
save for a square Monet painting, with size equal to one square meter, signed, and auctioned
by auction house 2?
12. Test the null hypothesis that auction prices are inelastic against the alternative hypothesis that
they are elastic with respect to size.

*Data taken from Econometric Analysis, William H. Greene, Pearson.

Case 4. For a sample of 706 people, the file Son* contains the following variables:

AGE: age in years


BLACK: = 1 if black
EDUC: years of schooling
EARNS74: total earnings, 1974
GDHLTH: = 1 if in good or excellent health
MALE: = 1 if male
MARR: = 1 if married
PROT: = 1 if Protestant
SELFE: = 1 if self employed
SLEEP: mins sleep at night, per week
TOTWRK: mins worked per week
EXPER: age - educ - 6
YNGKID: = 1 if children < 3 present

With these variables you must build a regression model to determine sleep time (variable SLEEP).
Then, you should write a brief report where you present, explain and interpret your model.

Remarks:
Though you probably will run many regressions, the report should include one final model with
significant variables. Of course, you can use logs, quadratic elements or interaction terms, if it is
convenient.
It should be fully interpreted and commented on, and you should state the conclusions of your
work (basically, which variables are significant, which variables are not significant, and the
quantitative impact of significant variables). Also, point out any relevant feature you may detect.

*Source: J.E. Biddle and D.S. Hamermesh (1990), “Sleep and the Allocation of Time,” Journal of
Political Economy 98, 922-943. See also Introductory Econometrics: A Modern Approach, Jeffrey
M. Wooldridge, Cengage Learning.

Case 5. The file Gasofa* contains the following variables:

Year: Year, 1953-2004
GasExp: Total U.S. gasoline expenditure
Pop: U.S. total population in thousands
GasP: Price index for gasoline
Income: Per capita disposable income
Pnc: Price index for new cars
Puc: Price index for used cars
Ppt: Price index for public transportation
Pd: Aggregate price index for consumer durables
Pn: Aggregate price index for consumer nondurables
Ps: Aggregate price index for consumer services

1. Compute the multiple regression of per capita consumption of gasoline on per capita income, the
price of gasoline, all the other prices and a time trend. Report all results. Do the signs of the
estimates agree with your expectations?
2. Test the hypothesis that at least in regard to demand for gasoline, consumers do not differentiate
between changes in the prices of new and used cars.
3. Estimate the own price elasticity of demand, the income elasticity, and the cross price elasticity
with respect to changes in the price of public transportation. Do the computations at the 2004
point in the data.
4. Reestimate the regression in logarithms so that the coefficients are direct estimates of the
elasticities. (Do not use the log of the time trend.) How do your estimates compare with the
results in the previous question? Which specification do you prefer?
5. Compute the simple correlations of the price variables. Would you conclude that
multicollinearity is a “problem” for the regression 1 or 4?
6. Did the gasoline market change in 1973?
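
For question 3, a linear specification does not give elasticities directly: the elasticity at a point is the estimated slope multiplied by the ratio of the regressor to the regressand at that point. The following is a minimal sketch of that arithmetic. The function name and all numbers are hypothetical placeholders, not estimates obtained from the Gasofa file.

```python
def point_elasticity(slope, x, y):
    """Elasticity of y with respect to x at the point (x, y) in a linear model.

    In y = b1 + b2 * x + ..., the elasticity is (dy/dx) * (x / y) = b2 * x / y.
    """
    return slope * x / y


# Hypothetical values for illustration only (NOT results from the Gasofa data).
slope_price = -0.02        # assumed estimated coefficient of the gasoline price index
price_2004 = 120.0         # assumed value of the price index in 2004
consumption_2004 = 6.0     # assumed per capita consumption in 2004

own_price_elasticity = point_elasticity(slope_price, price_2004, consumption_2004)
print(round(own_price_elasticity, 4))
```

The same function can be reused for the income elasticity and the cross price elasticity by passing the corresponding slope and the 2004 value of that regressor.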

*Source: These data on the U.S. Gasoline Market (1953-2004) were compiled by Professor Chris
Bell, Department of Economics, University of North Carolina, Asheville. Sources: www.bea.gov
and www.bls.gov. This case has been taken from Econometric Analysis, William H. Greene,
Pearson.

Case 6. Data files in folder SOUS* contain information obtained from OECD’s Programme for the
International Assessment of Adult Competencies (PIAAC) (http://www.oecd.org/skills/piaac/).
These data files include the following variables for different countries (see Appendix):

COUNTRY: Country code
GENDER: Gender
EARN: Monthly earnings in local currency
EDUC: Years of education
EXPER: Years of paid work during lifetime
STATUS: Employment status
SPOUSE: Living with spouse or partner
BORNCOUNTRY: Born in country

1. Univariate data description: sample, frequencies, proportions, missing observations, measures
of central tendency, quantiles, measures of dispersion, outliers, tables, histograms, graphs, …
2. Do men earn more than women on average?
3. Do women who live without spouse or partner earn more than men who live with spouse or
partner?
4. Bivariate data description: scatter plots, covariances, correlations, …
5. Estimate by OLS a SLRM with EARN as regressand and EXPER as regressor.
6. Plot the residuals. Analyse the residuals.
7. Verify the descriptive or algebraic properties in the preceding regression.
8. Estimate by OLS a MLRM with EARN as regressand and EDUC and EXPER as regressors.
9. Interpret the estimates of EDUC and EXPER.
10. Examine the goodness-of-fit of the preceding model.
11. Estimate the same model but now with a log-lin specification:
$\ln \text{EARN} = \beta_1 + \beta_2\,\text{EDUC} + \beta_3\,\text{EXPER} + u$. Does education affect earnings? Is work
experience significant? Interpret the estimates.
12. Estimate the model $\ln \text{EARN} = \beta_1 + \beta_2\,\text{EDUC} + \beta_3\,\text{EXPER} + \beta_4\,\text{EXPER}^2 + u$.
13. What is the marginal effect of EXPER on EARN? Which work experience implies higher
earnings?
14. With regard to the last estimated model: Is the model jointly significant? Does education affect
earnings? Does one additional year of education increase monthly earnings by more than 5%?
Does work experience affect earnings?
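
Questions 13 and 14 can be approached with a little calculus on the estimated coefficients. The sketch below assumes the model of question 12 is quadratic in EXPER, so that the effect of one more year of experience on ln EARN is the EXPER coefficient plus twice the coefficient of squared EXPER times EXPER. The function names and coefficient values are hypothetical and are not estimates from the SOUS files.

```python
import math


def marginal_effect_exper(b_exper, b_exper2, exper):
    """Derivative of ln(EARN) with respect to EXPER: b_exper + 2 * b_exper2 * exper."""
    return b_exper + 2.0 * b_exper2 * exper


def turning_point(b_exper, b_exper2):
    """Experience level at which the marginal effect is zero (a maximum if b_exper2 < 0)."""
    return -b_exper / (2.0 * b_exper2)


def exact_pct_change(delta_log):
    """Exact percentage change in EARN implied by a change delta_log in ln(EARN)."""
    return 100.0 * (math.exp(delta_log) - 1.0)


# Hypothetical coefficients for illustration only.
b_exper = 0.04
b_exper2 = -0.0008

effect_at_10 = marginal_effect_exper(b_exper, b_exper2, 10.0)
peak_exper = turning_point(b_exper, b_exper2)
print(effect_at_10, peak_exper, exact_pct_change(effect_at_10))
```

With a negative coefficient on the squared term, earnings rise with experience up to the turning point and fall afterwards. Multiplying the marginal effect by the level of EARN gives the effect in currency units.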

To answer questions 15 – 20, consider (and estimate) the model:

$\text{EARN} = \beta_1 + \beta_2\,\text{EDUC} + \beta_3\,\text{EXPER} + \beta_4\,\text{GENDER} + \beta_5\,\text{SPOUSE} + \beta_6\,\text{BORNCOUNTRY} + u$

15. Does one more year of education convey an increase of 80,00 € in monthly earnings?
16. Does one additional year of education convey an increase of more than 80,00 € in monthly
earnings?
17. Does an increase of three years in education and of two years in work experience (both
increases together) convey an increase of 400,00 € in monthly earnings?
18. Are there differences in earnings between men and women?
19. Do people without spouse earn more than people with spouse?
20. Test for heteroscedasticity with White’s test. What are the consequences?

To answer questions 21 – 22, you should specify and estimate the appropriate model.

21. Is education paid equally to men and women?
22. Is work experience paid more to people born in the country?

APPENDIX
Variable Name  Value Label               Value  Value (SPSS)  Value Type
STATUS         Employed                  1      1             Valid
STATUS         Unemployed                2      2             Valid
STATUS         Out of the labour force   3      3             Valid
STATUS         Not known                 4      4             Valid
STATUS         Valid skip                .V     6             Missing
STATUS         Don't know                .D     7             Missing
STATUS         Refused                   .R     8             Missing
STATUS         Not stated or inferred    .N     9             Missing
EXPER          Valid skip                .V     96            Missing
EXPER          Don't know                .D     97            Missing
EXPER          Refused                   .R     98            Missing
EXPER          Not stated or inferred    .N     99            Missing
EARN           Valid skip                .V     999999999996  Missing
EARN           Not stated or inferred    .N     999999999999  Missing
GENDER         Male                      1      1             Valid
GENDER         Female                    0      0             Valid
GENDER         Not stated or inferred    .N     9             Missing
SPOUSE         Yes                       1      1             Valid
SPOUSE         No                        0      0             Valid
SPOUSE         Valid skip                .V     6             Missing
SPOUSE         Don't know                .D     7             Missing
SPOUSE         Refused                   .R     8             Missing
SPOUSE         Not stated or inferred    .N     9             Missing
BORNCOUNTRY    Yes                       1      1             Valid
BORNCOUNTRY    No                        0      0             Valid
BORNCOUNTRY    Valid skip                .V     6             Missing
BORNCOUNTRY    Don't know                .D     7             Missing
BORNCOUNTRY    Refused                   .R     8             Missing
BORNCOUNTRY    Not stated or inferred    .N     9             Missing
EDUC           Valid skip                .V     96            Missing
EDUC           Don't know                .D     97            Missing
EDUC           Refused                   .R     98            Missing
EDUC           Not stated or inferred    .N     99            Missing
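
Before estimating any model, the codes marked as Missing in the table above should not be treated as real values (an EDUC of 99 or an EARN of 999999999999 would distort every estimate). The following is a minimal sketch of that recoding. It assumes the data files store the numeric codes of the "Value (SPSS)" column; the function names and the sample rows are hypothetical.

```python
# Missing-value codes per variable, copied from the "Value (SPSS)" column of the Appendix.
MISSING_CODES = {
    "STATUS": {6, 7, 8, 9},
    "EXPER": {96, 97, 98, 99},
    "EARN": {999999999996, 999999999999},
    "GENDER": {9},
    "SPOUSE": {6, 7, 8, 9},
    "BORNCOUNTRY": {6, 7, 8, 9},
    "EDUC": {96, 97, 98, 99},
}


def clean_row(row):
    """Return a copy of row with coded missing values replaced by None."""
    return {
        name: (None if value in MISSING_CODES.get(name, ()) else value)
        for name, value in row.items()
    }


def complete_cases(rows, variables):
    """Keep only the rows in which every listed variable is observed."""
    cleaned = [clean_row(row) for row in rows]
    return [row for row in cleaned if all(row[v] is not None for v in variables)]


# Hypothetical rows for illustration only.
sample_rows = [
    {"EARN": 1500, "EDUC": 12, "EXPER": 10, "GENDER": 1},
    {"EARN": 999999999999, "EDUC": 16, "EXPER": 5, "GENDER": 0},
    {"EARN": 2100, "EDUC": 97, "EXPER": 20, "GENDER": 1},
]
usable = complete_cases(sample_rows, ["EARN", "EDUC", "EXPER"])
print(len(usable))
```

Dropping incomplete rows is only one possible treatment; the number of observations lost this way should be reported in the univariate description.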

*Thanks to José Pernias and Pedro Pérez for providing me with these data.

Case 6. (Short) Data files in folder SOUS* contain information from OECD’s Programme for the
International Assessment of Adult Competencies (PIAAC) (http://www.oecd.org/skills/piaac/).
These data files include the following variables for different countries (see Appendix):

COUNTRY: Country code
GENDER: Gender
EARN: Monthly earnings in local currency
EDUC: Years of education
EXPER: Years of paid work during lifetime
STATUS: Employment status
SPOUSE: Living with spouse or partner
BORNCOUNTRY: Born in country

Estimate the model:

$\ln \text{EARN} = \beta_1 + \beta_2\,\text{EDUC} + \beta_3\,\text{EXPER} + \beta_4\,\text{EXPER}^2 + \beta_5\,\text{GENDER} + u$

With regard to the preceding model, answer the following questions:

1. What is the effect of work experience on earnings?
2. Which work experience implies higher earnings?
3. Are there differences in earnings between men and women?

Estimate now the model:

$\text{EARN} = \beta_1 + \beta_2\,\text{EDUC} + \beta_3\,\text{EXPER} + \beta_4\,\text{GENDER} + \beta_5\,\text{SPOUSE} + \beta_6\,\text{BORNCOUNTRY} + u$

With regard to this model, answer the following questions:

4. Do people without spouse earn more than people with spouse?
5. Write down the null and alternative hypotheses to test if an increase of one year in education
and of two years in work experience (both increases taken together) convey an increase of
300,00 € in monthly earnings. Write down the restricted model and do the test.
6. According to the estimated model, and without doing any hypothesis test, does a man who was
not born in the country earn more than a woman who was born in the country?

Estimate now your own model that allows you to answer the following questions:

7. Is work experience paid equally to men and women?
8. Is work experience paid more to people born in the country?


*Thanks to José Pernias and Pedro Pérez for providing me with these data.

Simple linear regression model

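Theoretical model: $Y_i = \beta_1 + \beta_2 X_i + u_i, \quad i = 1, 2, \dots, n$

Estimated model: $\hat{Y}_i = \hat{\beta}_1 + \hat{\beta}_2 X_i, \quad i = 1, 2, \dots, n$

Normal equations:
$\sum Y_i = n \cdot \hat{\beta}_1 + \hat{\beta}_2 \cdot \sum X_i$
$\sum X_i \cdot Y_i = \hat{\beta}_1 \cdot \sum X_i + \hat{\beta}_2 \cdot \sum X_i^2$

Estimators:
$\hat{\beta}_2 = \dfrac{\sum (X_i - \bar{X}) \cdot (Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2}$
$\hat{\beta}_1 = \bar{Y} - \hat{\beta}_2 \cdot \bar{X}$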
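
As a check on the formulas above, the sketch below estimates the simple linear regression for the four observations of exercise 1 (X = 4, 8, 5, 2 and Y = 8, 2, 5, 5) in two ways: by solving the two normal equations, and by using the estimator formulas in deviations from the means. The names b1 (intercept) and b2 (slope) are labels chosen here for illustration.

```python
# Data from exercise 1.
X = [4, 8, 5, 2]
Y = [8, 2, 5, 5]
n = len(X)

sum_x = sum(X)
sum_y = sum(Y)
sum_xy = sum(x * y for x, y in zip(X, Y))
sum_x2 = sum(x * x for x in X)

# 1) Normal equations, solved as a 2x2 linear system with Cramer's rule:
#    sum_y  = n * b1     + b2 * sum_x
#    sum_xy = b1 * sum_x + b2 * sum_x2
det = n * sum_x2 - sum_x ** 2
b1_normal = (sum_y * sum_x2 - sum_x * sum_xy) / det
b2_normal = (n * sum_xy - sum_x * sum_y) / det

# 2) Direct formulas in deviations from the sample means.
x_bar = sum_x / n
y_bar = sum_y / n
b2_direct = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) / sum(
    (x - x_bar) ** 2 for x in X
)
b1_direct = y_bar - b2_direct * x_bar

# Fitted values and residuals of the estimated model.
fitted = [b1_direct + b2_direct * x for x in X]
residuals = [y - f for y, f in zip(Y, fitted)]

print(b1_normal, b2_normal)
print(b1_direct, b2_direct)
```

Both routes give the same estimates, and the residuals sum to zero, which is one of the algebraic properties of OLS when the model includes an intercept.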

Multiple linear regression model

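Theoretical model: $Y_i = \beta_1 + \beta_2 X_{2,i} + \beta_3 X_{3,i} + \cdots + \beta_k X_{k,i} + u_i, \quad i = 1, 2, \dots, n$

Theoretical model, observation by observation:
$Y_1 = \beta_1 + \beta_2 X_{2,1} + \beta_3 X_{3,1} + \cdots + \beta_k X_{k,1} + u_1$
$Y_2 = \beta_1 + \beta_2 X_{2,2} + \beta_3 X_{3,2} + \cdots + \beta_k X_{k,2} + u_2$
$\vdots$
$Y_n = \beta_1 + \beta_2 X_{2,n} + \beta_3 X_{3,n} + \cdots + \beta_k X_{k,n} + u_n$

Estimated model: $\hat{Y}_i = \hat{\beta}_1 + \hat{\beta}_2 X_{2,i} + \hat{\beta}_3 X_{3,i} + \cdots + \hat{\beta}_k X_{k,i}, \quad i = 1, 2, \dots, n$

Estimated model, observation by observation:
$\hat{Y}_1 = \hat{\beta}_1 + \hat{\beta}_2 X_{2,1} + \hat{\beta}_3 X_{3,1} + \cdots + \hat{\beta}_k X_{k,1}$
$\hat{Y}_2 = \hat{\beta}_1 + \hat{\beta}_2 X_{2,2} + \hat{\beta}_3 X_{3,2} + \cdots + \hat{\beta}_k X_{k,2}$
$\vdots$
$\hat{Y}_n = \hat{\beta}_1 + \hat{\beta}_2 X_{2,n} + \hat{\beta}_3 X_{3,n} + \cdots + \hat{\beta}_k X_{k,n}$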

Matrix notation

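Regressand vector: $\mathbf{y}$; vector of fitted values: $\hat{\mathbf{y}}$
Regressor matrix: $\mathbf{X}$
Vector of parameters: $\boldsymbol{\beta}$; vector of estimators: $\hat{\boldsymbol{\beta}}$
Vector of errors: $\mathbf{u}$; vector of residuals: $\hat{\mathbf{u}}$

$$\mathbf{y} = \begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{pmatrix}, \quad
\mathbf{X} = \begin{pmatrix} 1 & X_{2,1} & X_{3,1} & \cdots & X_{k,1} \\ 1 & X_{2,2} & X_{3,2} & \cdots & X_{k,2} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & X_{2,n} & X_{3,n} & \cdots & X_{k,n} \end{pmatrix}, \quad
\boldsymbol{\beta} = \begin{pmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_k \end{pmatrix}, \quad
\mathbf{u} = \begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{pmatrix}$$

$$\hat{\mathbf{y}} = \begin{pmatrix} \hat{Y}_1 \\ \hat{Y}_2 \\ \vdots \\ \hat{Y}_n \end{pmatrix}, \quad
\hat{\boldsymbol{\beta}} = \begin{pmatrix} \hat{\beta}_1 \\ \hat{\beta}_2 \\ \vdots \\ \hat{\beta}_k \end{pmatrix}, \quad
\hat{\mathbf{u}} = \begin{pmatrix} \hat{u}_1 \\ \hat{u}_2 \\ \vdots \\ \hat{u}_n \end{pmatrix}$$

Theoretical model: $\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{u}$
Estimated model: $\hat{\mathbf{y}} = \mathbf{X}\hat{\boldsymbol{\beta}}$
Residuals: $\hat{\mathbf{u}} = \mathbf{y} - \hat{\mathbf{y}}$

Distribution of the vector of errors: $\mathbf{u} \Rightarrow N(\mathbf{0}, \sigma^2 \mathbf{I})$, by hypothesis.
Distribution of the regressand vector: $\mathbf{y} \Rightarrow N(\mathbf{X}\boldsymbol{\beta}, \sigma^2 \mathbf{I})$
Distribution of the vector of estimators: $\hat{\boldsymbol{\beta}} \Rightarrow N(\boldsymbol{\beta}, \sigma^2 (\mathbf{X}'\mathbf{X})^{-1})$
Distribution of the vector of residuals: $\hat{\mathbf{u}} \Rightarrow N(\mathbf{0}, \sigma^2 \mathbf{M})$, $\quad \mathbf{M} = \mathbf{I} - \mathbf{X}(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'$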
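
The matrix expressions above translate almost literally into array code. The sketch below, which assumes NumPy is installed, applies them to the four observations of exercise 1 with a regressor matrix made of a column of ones and X. The estimator of the error variance, SSR divided by n minus k, is the standard unbiased one and is an addition made here for the illustration; the variable names are likewise illustrative.

```python
import numpy as np

# Regressor matrix (column of ones plus X) and regressand vector from exercise 1.
X = np.array([[1.0, 4.0], [1.0, 8.0], [1.0, 5.0], [1.0, 2.0]])
y = np.array([8.0, 2.0, 5.0, 5.0])
n, k = X.shape

# Vector of estimators: (X'X)^{-1} X'y
XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y

# Fitted values, and residuals through the matrix M = I - X (X'X)^{-1} X'
y_hat = X @ beta_hat
M = np.eye(n) - X @ XtX_inv @ X.T
u_hat = M @ y

# Sum of squared residuals and estimated covariance matrix of the estimators.
ssr = float(u_hat @ u_hat)
sigma2_hat = ssr / (n - k)
cov_beta_hat = sigma2_hat * XtX_inv

print(beta_hat)
print(ssr, sigma2_hat)
```

For larger problems, np.linalg.lstsq or np.linalg.solve is numerically preferable to forming the inverse explicitly; the inverse is kept here only because it mirrors the formulas.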

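Hypothesis tests

$$\frac{(\hat{\mathbf{u}}_R'\hat{\mathbf{u}}_R - \hat{\mathbf{u}}_U'\hat{\mathbf{u}}_U)/r}{\hat{\mathbf{u}}_U'\hat{\mathbf{u}}_U/(n-k)} = \frac{(SSR_R - SSR_U)/r}{SSR_U/(n-k)} \Rightarrow F_{r,\,n-k}$$

where the subscripts $R$ and $U$ denote the restricted and the unrestricted model, and $r$ is the number of restrictions.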

Test of a single parameter:

$$\frac{\hat{\beta}_j - \beta_j^*}{DT(\hat{\beta}_j)} \Rightarrow t_{n-k}$$

Test of joint significance of the model:

$$\frac{R^2/(k-1)}{(1-R^2)/(n-k)} \Rightarrow F_{k-1,\,n-k}$$
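
The joint significance statistic is a special case of the general F statistic for restrictions: the restricted model contains only the intercept, so its sum of squared residuals equals the total sum of squares. The sketch below checks this equivalence with the sums of squares implied by the exercise 1 data (total sum of squares 18, sum of squared residuals 10.32, n = 4, k = 2). The function names are illustrative.

```python
def f_joint_significance(r2, n, k):
    """F statistic for H0: all slope parameters are zero, computed from R squared."""
    return (r2 / (k - 1)) / ((1.0 - r2) / (n - k))


def f_restrictions(ssr_restricted, ssr_unrestricted, r, n, k):
    """General F statistic for r linear restrictions in a model with k parameters."""
    return ((ssr_restricted - ssr_unrestricted) / r) / (ssr_unrestricted / (n - k))


# Values implied by the exercise 1 data.
sst = 18.0      # total sum of squares of Y
ssr = 10.32     # sum of squared residuals of the estimated model
n, k = 4, 2

r2 = 1.0 - ssr / sst
f_from_r2 = f_joint_significance(r2, n, k)
f_from_ssr = f_restrictions(sst, ssr, k - 1, n, k)
print(f_from_r2, f_from_ssr)
```

The statistic is then compared with the critical value of the F distribution with k − 1 and n − k degrees of freedom, taken from tables or from statistical software.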

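Test of structural stability:

$$\frac{\left(\hat{\mathbf{u}}'\hat{\mathbf{u}} - (\hat{\mathbf{u}}_1'\hat{\mathbf{u}}_1 + \hat{\mathbf{u}}_2'\hat{\mathbf{u}}_2)\right)/k}{(\hat{\mathbf{u}}_1'\hat{\mathbf{u}}_1 + \hat{\mathbf{u}}_2'\hat{\mathbf{u}}_2)/(n-2k)} = \frac{\left(SSR - (SSR_1 + SSR_2)\right)/k}{(SSR_1 + SSR_2)/(n-2k)} \Rightarrow F_{k,\,n-2k}$$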
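
The structural stability statistic only needs three sums of squared residuals: one from the regression on the whole sample and one from each subsample. A minimal sketch follows; the function name and the numbers are hypothetical and serve only to show the arithmetic (n = 52 matches the number of years in Case 5, while the sums of squares and k are invented).

```python
def chow_f(ssr_pooled, ssr_1, ssr_2, n, k):
    """Structural stability F statistic.

    ssr_pooled: sum of squared residuals of the regression on the whole sample
    ssr_1, ssr_2: sums of squared residuals of the two subsample regressions
    n: total number of observations, k: number of parameters in each regression
    """
    ssr_split = ssr_1 + ssr_2
    return ((ssr_pooled - ssr_split) / k) / (ssr_split / (n - 2 * k))


# Hypothetical sums of squared residuals for illustration only.
f_stat = chow_f(ssr_pooled=120.0, ssr_1=40.0, ssr_2=50.0, n=52, k=3)
print(round(f_stat, 4))
```

A large value relative to the critical value of the F distribution with k and n − 2k degrees of freedom leads to rejecting the hypothesis that the parameters are the same in both subsamples.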

Obs. Yt Xt
1
2
3
4
5
Sums
Means

Obs. Yt Xt
1
2
3
4
5
Sums
Means

Obs. Yt Xt
1
2
3
4
5
6
7
8
9
10
Sums
Means
