
Econ3005: Solution for Applied Econometrics, Spring 2024

Homework #3
Do not copy and paste the answers from your classmates. Two identical homework
submissions will be treated as cheating. Do not copy and paste the entire output of your
statistical package; report only the relevant part of the output. Please also submit your
R script for the empirical part. Please put all your work in one single file and upload it via Moodle.

Part 1 Multiple Choice (24 points, 3 each)

Please choose the answer(s) that you think is(are) appropriate.

(Non-graded exercise) A nonlinear function

a. makes little sense, because variables in the real world are related linearly.

b. can be adequately described by a straight line between the dependent variable

and one of the explanatory variables.

c. is a concept that only applies to the case of a single or two explanatory variables

since you cannot draw a line in four dimensions.

d. is a function with a slope that is not constant.

Answer: d

(Non-graded exercise) The binary variable interaction regression

a. can only be applied when there are two binary variables, but not three or more.

b. is the same as testing for differences in means.

c. cannot be used with logarithmic regression functions because ln(0) is not defined.

d. allows the effect of changing one of the binary independent variables to depend on the value of the other binary variable.

Answer: d

(Non-graded exercise) The interpretation of the slope coefficient in the model Yi = β0 + β1 ln(Xi) + ui is as follows:

a. 1% change in X is associated with a β1 % change in Y.

b. 1% change in X is associated with a change in Y of 0.01β1 .

c. change in X by one unit is associated with a 100β1 % change in Y.

d. change in X by one unit is associated with a β1 change in Y.

Answer: b
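
The reasoning behind answer b can be checked numerically. The following is a minimal sketch: the slope value `beta1 = 5.0` and the helper name `change_in_y` are hypothetical and chosen only for illustration. In the model Y = β0 + β1 ln(X) + u, a 1% increase in X changes Y by β1 ln(1.01), which is very close to 0.01 β1.

```python
import math


def change_in_y(beta1: float, pct_change_in_x: float) -> float:
    """Exact change in Y when X rises by pct_change_in_x percent
    in the linear-log model Y = b0 + b1 * ln(X) + u."""
    return beta1 * math.log(1.0 + pct_change_in_x / 100.0)


beta1 = 5.0  # hypothetical slope, not taken from the homework
exact = change_in_y(beta1, 1.0)   # beta1 * ln(1.01)
approx = 0.01 * beta1             # the rule of thumb in answer b

print(f"exact change:  {exact:.5f}")
print(f"approximation: {approx:.5f}")
```

The two numbers differ only slightly for a 1% change; the approximation gets worse for large percentage changes in X.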

1.1 In the regression model Yi = β0 + β1 Xi + β2 Di + β3 (Xi × Di ) + ui , where X is

a continuous variable and D is a binary variable, to test that the two regressions are

identical, you must use the

a. t-statistic separately for β2 = 0, β3 = 0.

b. F-statistic for the joint hypothesis that β0 = 0, β1 = 0.
c. t-statistic separately for β3 = 0.
d. F-statistic for the joint hypothesis that β2 = 0, β3 = 0 .

Answer: d

1.2 To test whether or not the population regression function is linear rather than

a polynomial of order r,

a. check whether the regression R² for the polynomial regression is higher than that of the linear regression.

b. compare the TSS from both regressions.

c. look at the pattern of the coefficients: if they change from positive to negative to positive, etc., then the polynomial regression should be used.

d. use the test of (r-1) restrictions using the F-statistic.

Answer: d

1.3 The major flaw of the linear probability model is that

a. the actuals can only be 0 and 1, but the predicted are almost always different from that.

b. the regression R² cannot be used as a measure of fit.

c. people do not always make clear-cut decisions.

d. the predicted values can lie above 1 and below 0.

Answer: d

1.4 The following tools from multiple regression analysis carry over in a meaningful

manner to the probit model, with the exception of the

a. F-statistic.

b. significance test using the t-statistic.

c. 95% confidence interval using 1.96 times the standard error.

d. regression R².

Answer: d

1.5 When estimating probit and logit models,

a. the t-statistic should still be used for testing a single restriction.

b. you cannot have binary variables as explanatory variables as well.

c. F-statistics should not be used, since the models are nonlinear.

d. it is no longer true that R̄² < R².

Answer: a

1.6 In the binary dependent variable model, a predicted value of 0.6 means that

a. the most likely value the dependent variable will take on is 60 percent.

b. given the values for the explanatory variables, there is a 60 percent probability

that the dependent variable will equal one.

c. the model makes little sense, since the dependent variable can only be 0 or 1.

d. given the values for the explanatory variables, there is a 40 percent probability

that the dependent variable will equal one.

Answer: b

1.7 In the expression Pr(Y = 1 | X) = Φ(β0 + β1 X),

a. (β0 + β1 X) plays the role of z in the cumulative standard normal distribution function.

b. β1 cannot be negative since probabilities have to lie between 0 and 1.

c. β0 cannot be negative since probabilities have to lie between 0 and 1.

d. min(β0 + β1 X) > 0 since probabilities have to lie between 0 and 1.

Answer: a

1.8 The following problems could be analyzed using probit and logit estimation with

the exception of whether or not

a. a college student decides to study abroad for one semester.

b. being a female has an effect on earnings.

c. a college student will attend a certain college after being accepted.

d. applicants will default on a loan.

Answer: b

Part 2 Short Questions (29 points in total)

Note: for each sub-question, the answer should not be longer than 7 lines.

(Non-graded exercise) Dr. Qin would like to analyze the return to education and the gender gap. The equation below shows the regression result using the 2005 Current Population Survey. LnEarnings refers to the logarithm of monthly earnings; educ refers to years of education; DFemme is a dummy variable equal to 1 if the individual is female; exper is working experience, measured in years; Midwest, South and West are dummy variables indicating the region of residence, with Northeast as the omitted region. Interpret the major results (discuss the estimates for all variables and also address the question that Dr. Qin wants to analyze).

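$$
\begin{aligned}
\widehat{LnEarnings} ={}& \underset{(0.018)}{1.215} + \underset{(0.0011)}{0.0899}\, educ - \underset{(0.022)}{0.521}\, DFemme + \underset{(0.0016)}{0.0180}\,(DFemme \times educ) \\
&+ \underset{(0.0008)}{0.0232}\, exper - \underset{(0.000018)}{0.000368}\, exper^2 - \underset{(0.006)}{0.058}\, Midwest - \underset{(0.006)}{0.0078}\, South - \underset{(0.006)}{0.030}\, West
\end{aligned}
$$

$$
n = 57{,}863, \qquad \bar{R}^2 = 0.242
$$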

Answer: For males, one more year of education is associated with approximately 9% higher earnings, and the estimate is statistically significant at the 1% level. For females, the return to education is slightly higher, approximately 11% (0.0899 + 0.018). Since the binary variable for females is interacted with years of education, the gender gap depends on the number of years of education. For the typical high school graduate (12 years of education), the gender gap is approximately 30% (−0.521 + 0.018 × 12 = −0.305), while for the typical college graduate (16 years of education) it narrows to approximately 23% (−0.521 + 0.018 × 16 = −0.233). Experience enters with an inverted U-shape, which is to be expected given the shape of age-earnings profiles and the fact that potential experience depends on the age of the individual. The marginal value of each additional year of experience declines until it eventually becomes negative. Northeast is the omitted region, and all other regions have lower (log) earnings, ranging from 0.8% lower in the South to 5.8% lower in the Midwest. All coefficients are statistically significant except the one on South, whose t-statistic is only about −1.3 (−0.0078/0.006).

(15 points) 2.1 Sports economics typically looks at winning percentages of sports teams as one of various outputs, and estimates production functions by analyzing the relationship between the winning percentage and inputs. In Major League Baseball (MLB), the determinants of winning are quality pitching and batting. You have data on all 30 MLB teams for the 1999 season. Pitching quality is approximated by Team Earned Run Average (teamera), and hitting quality by On Base Plus Slugging Percentage (ops).

Your regression output is:

$$
\widehat{Winpct} = \underset{(0.08)}{-0.19} - \underset{(0.008)}{0.099}\, teamera + \underset{(0.126)}{1.49}\, ops, \qquad R^2 = 0.92
$$

(a) (5 points) Interpret the regression. Are the results statistically significant and important?

Answer: Lowering the team ERA by one raises the winning percentage by roughly ten percentage points. Increasing the OPS by 0.1 raises the winning percentage by approximately 15 percentage points. The regression explains 92 percent of the variation in winning percentages. Both slope coefficients are statistically significant (t-statistics of about −12.4 and 11.8), and given the small differences in winning percentage across teams, they are also important.

(b) (8 points) There are two leagues in MLB, the American League (AL) and the National League (NL). One major difference is that the pitcher in the AL does not have to bat. Instead there is a designated hitter in the hitting line-up. You are concerned that, as a result, pitching and hitting have a different effect in the AL than in the NL. To test this hypothesis, you allow the AL regression to have a different intercept and different slopes from the NL regression. You therefore create a binary variable for the American League (DAL) and estimate the following specification:

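$$
\begin{aligned}
\widehat{Winpct} ={}& \underset{(0.12)}{-0.29} + \underset{(0.24)}{0.10}\, DAL - \underset{(0.008)}{0.100}\, teamera + \underset{(0.018)}{0.008}\,(DAL \times teamera) \\
&+ \underset{(0.163)}{1.622}\, ops - \underset{(0.160)}{0.187}\,(DAL \times ops), \qquad R^2 = 0.92
\end{aligned}
$$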

How should you interpret the winning percentage for the AL and the NL? Can you tell whether pitching and hitting have different effects in the AL and the NL? If so, by how much?

Answer: For the AL, lowering the team ERA by one raises the winning percentage by 9.2 percentage points (−0.100 + 0.008 = −0.092), while the figure for the NL is 10 percentage points. Increasing the OPS by 0.1 raises the winning percentage by 14.35 percentage points for the AL (0.1622 − 0.0187 = 0.1435) but by about 16 percentage points for the NL. However, the coefficient estimates on both interaction terms are not statistically significant, so it is difficult to conclude that pitching and hitting have different effects in the AL and the NL.
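
The league-specific slopes and the t-statistics of the interaction terms follow mechanically from the reported estimates. This is a minimal sketch using only the coefficients and standard errors printed in the specification above; the variable names are illustrative.

```python
# Estimates and standard errors copied from the interacted regression.
B_ERA, B_ERA_AL, SE_ERA_AL = -0.100, 0.008, 0.018
B_OPS, B_OPS_AL, SE_OPS_AL = 1.622, -0.187, 0.160

# NL is the base group; AL slopes add the interaction coefficients.
era_slope_nl = B_ERA
era_slope_al = B_ERA + B_ERA_AL
ops_slope_nl = B_OPS
ops_slope_al = B_OPS + B_OPS_AL

# Effect of a 0.1 increase in OPS on the winning percentage.
ops_effect_nl = 0.1 * ops_slope_nl
ops_effect_al = 0.1 * ops_slope_al

# t-statistics of the two slope interactions.
t_era_interaction = B_ERA_AL / SE_ERA_AL
t_ops_interaction = B_OPS_AL / SE_OPS_AL

print(f"ERA slope  NL: {era_slope_nl:.3f}  AL: {era_slope_al:.3f}")
print(f"0.1 OPS    NL: {ops_effect_nl:.4f}  AL: {ops_effect_al:.4f}")
print(f"t (DAL x teamera): {t_era_interaction:.2f}")
print(f"t (DAL x ops):     {t_ops_interaction:.2f}")
```

Both t-statistics are well inside ±1.96, which is why the answer does not treat the league differences as established.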

(2 points) (c) You remember that sequentially testing the significance of slope coefficients is not the same as testing for their significance simultaneously. Hence you ask your regression package to calculate the F-statistic that all three coefficients involving the binary variable for the AL are zero. Your regression package gives a value of 0.35. Looking at the critical value from the F-table, can you reject the null hypothesis at the 1% level? Should you worry about the small sample size?

Answer: The critical value of the F-statistic is 3.78 at the 1% level, and hence you cannot reject the null hypothesis that all three coefficients are zero. However, the sample size is small (30 is much smaller than 100), so the F-statistic is not really distributed as $F_{3,\infty}$ and, as a result, inference is problematic here.

2.2 A study analyzed the probability of Major League Baseball (MLB) players to survive for another season, or, in other words, to play one more season. The researchers had a sample of 4,728 hitters and 3,803 pitchers for the years 1901-1999. All explanatory variables are standardized. The probit estimation yielded the results shown in the table:

Regression                  (1) Hitters   (2) Pitchers
Regression model            probit        probit
constant                    2.010         1.625
                            (0.030)       (0.031)
number of seasons played    -0.058        -0.031
                            (0.004)       (0.005)
performance                 0.794         0.677
                            (0.025)       (0.026)
average performance         0.022         0.100
                            (0.033)       (0.036)

where the limited dependent variable takes on a value of one if the player had one more season (a minimum of 50 at bats or 25 innings pitched), number of seasons played is measured in years, performance is the batting average for hitters and the earned run average for pitchers, and average performance refers to performance over the career. (Note that all variables are standardized, so that the mean is zero and the variance is 1.)

(4 points) (a) Interpret the two probit equations and calculate survival probabilities

for hitters and pitchers at the sample mean. Why are these so high?

Answer: Note that all variables are standardized, so that the mean is zero. This

results in a survival probability of 0.978 (Φ(2.01) = 0.9778) for hitters and 0.948

(Φ(1.63) = 0.9484) for pitchers. These results are so high because there is a high

probability, in general, for a player to return the following season.

(4 points) (b) Calculate the change in the survival probability for a player who has a very bad year by performing two standard deviations below the average (assume also that this player has been in the majors for many years so that his average performance is hardly affected). How does this change the survival probability when compared to the answer in (a)?

Answer: Since the variables are standardized, this implies a change of two for the performance variable. The survival probability for hitters falls to 0.66 (Φ(2.01 − 2 × 0.794) = Φ(0.42) = 0.6628), and for pitchers to 0.61 (Φ(1.625 − 2 × 0.677) = Φ(0.27) = 0.6064). Compared with (a), this is a drop of roughly 0.3 for both groups (from 0.978 to 0.66 for hitters and from 0.948 to 0.61 for pitchers).
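
The probabilities in (a) and (b) can be reproduced without a statistical table. The sketch below builds the standard normal CDF from `math.erf` in the Python standard library; the helper name `phi` and the variable names are chosen here for illustration, and the coefficients are copied from the probit table.

```python
import math


def phi(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Probit constants and performance slopes from the table.
CONST_HITTERS, PERF_HITTERS = 2.010, 0.794
CONST_PITCHERS, PERF_PITCHERS = 1.625, 0.677

# (a) All regressors are standardized, so at the sample mean only the constant remains.
p_hitters_mean = phi(CONST_HITTERS)
p_pitchers_mean = phi(CONST_PITCHERS)

# (b) Performance two standard deviations below the mean, other regressors at zero.
p_hitters_bad = phi(CONST_HITTERS - 2 * PERF_HITTERS)
p_pitchers_bad = phi(CONST_PITCHERS - 2 * PERF_PITCHERS)

print(f"hitters:  {p_hitters_mean:.3f} -> {p_hitters_bad:.3f}")
print(f"pitchers: {p_pitchers_mean:.3f} -> {p_pitchers_bad:.3f}")
```

Small differences from the figures in the answer come from rounding the index to two decimals before looking it up in a table.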
(6 points) (c) Since the results seem similar, the researcher could consider combining the two samples. Explain in some detail how this could be done and how you could test the hypothesis that the coefficients are the same.

Answer: After combining the samples for hitters and pitchers, you would allow for a different intercept and different slopes by introducing a binary variable for pitchers, with hitters as the default. This binary variable would be introduced by itself and in interaction with each of the above variables, thereby allowing all coefficients to differ. You could then conduct an F-test for the joint hypothesis that all coefficients involving the binary variable are zero. If the hypothesis cannot be rejected, then there is no evidence of a difference between the coefficients for hitters and pitchers.
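
One way to write down the pooled specification described in this answer is sketched below. The symbols are assumptions introduced here for illustration, not notation from the homework: $D_i$ is a binary variable equal to 1 for pitchers, $S_i$ is the number of seasons played, $P_i$ is performance, and $A_i$ is average performance.

```latex
\Pr(Y_i = 1 \mid S_i, P_i, A_i, D_i)
  = \Phi\bigl(\beta_0 + \beta_1 S_i + \beta_2 P_i + \beta_3 A_i
      + \gamma_0 D_i + \gamma_1 (D_i \times S_i)
      + \gamma_2 (D_i \times P_i) + \gamma_3 (D_i \times A_i)\bigr)

H_0:\ \gamma_0 = \gamma_1 = \gamma_2 = \gamma_3 = 0
\qquad \text{vs.} \qquad
H_1:\ \text{at least one } \gamma_j \neq 0
```

Under this parameterization the $\beta$ coefficients describe hitters, and each pitcher coefficient equals the corresponding $\beta$ plus its $\gamma$, so the joint test has four restrictions.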

Part 3 Long Questions (47 points in total)

Note: for each sub-question, the answer should not be longer than 10 lines.

(32 points) 3.1 Use the data set CollegeDistance.dta and read the description file CollegeDistance_DataDescription.pdf to answer the following questions.

(3 points) (a) Run a regression of ed on dist, female, black, hispanic, dadcoll, momcoll, tuition and report your result. Interpret the coefficient for tuition. Does it make sense?

(3 points) (b) Run a regression of ln(ed) on dist, female, black, hispanic, dadcoll, momcoll, ln(tuition) and report your result. Interpret the coefficient for ln(tuition). Does it make sense? (Note: ln(ed) is the natural logarithm of ed, and ln(tuition) is the natural logarithm of tuition.)
(6 points) (c) Suppose we are interested in the causal effect of tuition on years of education completed. Considering the available variables in the data, which variables might cause omitted variable bias? Justify your answer by both economic logic and empirical evidence (i.e. regressions or tests).

(4 points) (d) After adding the possible omitted variables from (c), what does the coefficient of tuition or ln(tuition) mean?

(6 points) (e) Now we are interested in the effect of dist and parents' education on years of education completed. Generate a dummy variable for those whose fathers are not college graduates (named dadnoncoll) and a dummy variable for those whose mothers are not college graduates (named momnoncoll). Run a regression of ed on dist, female, black, hispanic, dadnoncoll, momnoncoll, tuition and report your result. Interpret the coefficients for dist, dadnoncoll and momnoncoll.

(4 points) (f) Does the effect of dist on ed depend on the father's and the mother's education background? Use regression(s) and test(s) to justify your discussion.

(6 points) (g) Now we are interested in the effect of ethnic group on years of education completed. Based on the regression in (a), how should this effect be interpreted? Does this effect depend on parents' education background? If so, how? Justify your answer by regression(s)/test(s).

Answer: The regression results are reported in Table 1.

Table 1

              Model 1      Model 2      Model 3      Model 4      Model 5      Model 6
(Intercept)   13.530 ***   2.608 ***    2.585 ***    15.189 ***   13.238 ***   13.518 ***
              (0.117)      (0.004)      (0.006)      (0.136)      (0.133)      (0.119)
dist          -0.047 ***   -0.003 ***   -0.003 ***   -0.047 ***   -0.049 ***   -0.047 ***
              (0.013)      (0.001)      (0.001)      (0.013)      (0.015)      (0.013)
female        0.043        0.003        0.005        0.043        0.066        0.045
              (0.056)      (0.004)      (0.004)      (0.056)      (0.056)      (0.056)
black         -0.371 ***   -0.025 ***   -0.019 ***   -0.371 ***   -0.288 ***   -0.356 ***
              (0.069)      (0.005)      (0.005)      (0.069)      (0.070)      (0.075)
hispanic      -0.012       0.001        0.006        -0.012       0.058        0.013
              (0.085)      (0.006)      (0.006)      (0.085)      (0.086)      (0.094)
dadcoll       0.992 ***    0.071 ***    0.061 ***                 0.759 ***    1.018 ***
              (0.080)      (0.006)      (0.006)                   (0.100)      (0.094)
momcoll       0.667 ***    0.047 ***    0.042 ***                 0.667 ***    0.670 ***
              (0.091)      (0.006)      (0.006)                   (0.118)      (0.109)
tuition       0.149                                  0.149        0.124        0.154
              (0.103)                                (0.103)      (0.103)      (0.103)
lntuition                  0.013 **     0.012 **
                           (0.006)      (0.006)
incomehi                                0.029 ***                 0.407 ***
                                        (0.005)                   (0.069)
ownhome                                 0.018 ***                 0.248 ***
                                        (0.005)                   (0.072)
dadnoncoll                                           -0.992 ***
                                                     (0.080)
momnoncoll                                           -0.667 ***
                                                     (0.091)
ddedu                                                             0.069 *
                                                                  (0.036)
dmedu                                                             -0.051
                                                                  (0.052)
bdedu                                                                          -0.147
                                                                               (0.248)
bmedu                                                                          0.047
                                                                               (0.241)
hdedu                                                                          -0.070
                                                                               (0.237)
hmedu                                                                          -0.134
                                                                               (0.313)
N             3796         3796         3796         3796         3796         3796
F statistic                                                       1.9949       0.1854
Pr(>F)                                                            0.1362       0.9461
R2            0.108        0.109        0.121        0.108        0.121        0.109
Standard errors are heteroskedasticity robust. *** p < 0.01; ** p < 0.05; * p < 0.1.

(a) The result is reported in column (1). The coefficient of tuition is 0.149, suggesting that, holding other variables constant, when the average state 4-year college tuition is $1000 higher, years of education completed are 0.149 year higher. This can make sense in reality: high average tuition often indicates that a state has many excellent universities, so people there may prefer more years of education. However, the estimate is not statistically significant, and we cannot reject that the effect is actually zero.

(b) The result is reported in column (2). The coefficient of lntuition is 0.013. Holding other variables constant, when the average state 4-year college tuition increases by 1%, years of education completed increase by 0.013%. The estimate is statistically significant at the 5% level.

(c) Two variables in the data set proxy for the income level: incomehi and ownhome. Adding these two variables to the regression, we re-estimate it and report the result in column (3). The coefficients of both variables are positive and statistically significant, while the coefficient of lntuition becomes smaller (0.012 instead of 0.013) and its significance is also somewhat weaker. This suggests that, without the two income-related variables, the estimate is biased upward. The economic logic is that tuition fees in rich regions are higher, while households in rich regions tend to invest more in their children's education.

(d) In column (3), the coefficient 0.012 means that, holding the other variables (including the income proxies) constant, when the average state 4-year college tuition increases by 1%, years of education completed increase by 0.012%. The estimate is statistically significant at the 5% level.

(e) The result is reported in column (4). The coefficient of dist is -0.047, which suggests that if the individual lives 10 miles closer to a 4-year college, his/her years of education completed will be 0.047 year higher. For those whose fathers are not college graduates, years of education completed are 0.99 year lower; for those whose mothers are not college graduates, years of education completed are 0.67 year lower. All these estimates are statistically significant at the 1% level.
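
The link between columns (1) and (4) can be verified with simple arithmetic. Because dadnoncoll = 1 − dadcoll and momnoncoll = 1 − momcoll, recoding the dummies only flips the sign of their coefficients and moves the two college premiums into the intercept. The sketch below uses only numbers from Table 1; the variable names are illustrative.

```python
# Column (1): ed regressed on dadcoll and momcoll (plus the other controls).
INTERCEPT_M1 = 13.530
B_DADCOLL = 0.992
B_MOMCOLL = 0.667

# Substituting dadcoll = 1 - dadnoncoll and momcoll = 1 - momnoncoll:
#   b_d * dadcoll = b_d - b_d * dadnoncoll   (and likewise for the mother)
implied_intercept_m4 = INTERCEPT_M1 + B_DADCOLL + B_MOMCOLL
implied_b_dadnoncoll = -B_DADCOLL
implied_b_momnoncoll = -B_MOMCOLL

print(f"implied intercept in column (4): {implied_intercept_m4:.3f}")
print(f"implied dadnoncoll coefficient:  {implied_b_dadnoncoll:.3f}")
print(f"implied momnoncoll coefficient:  {implied_b_momnoncoll:.3f}")
```

The implied values match the reported column (4) estimates (15.189, -0.992 and -0.667), and all other coefficients and the R² are unchanged, as expected for a pure recoding.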

(f) In column (5), I add the interaction terms between dist and dadcoll (ddedu) and between dist and momcoll (dmedu) to the regression model. (It is totally fine to choose any model as the baseline to which these interaction terms are added.) As the result shows, only the interaction term between dist and dadcoll is statistically significant, and only at the 10% level, which suggests that the impact of distance to a 4-year college might depend on the father's education but not on the mother's education. To investigate further, I conduct an F-test of whether the effect of distance to a 4-year college depends on either parent's education, i.e., whether the coefficients of both interaction terms are jointly equal to zero. The p-value is 0.1362, so I cannot reject the null hypothesis that the effect of dist does not depend on parents' education.

(g) By the result of (a), the coefficient of black is -0.371, which means that, holding other factors the same, if the individual is black, years of education completed are 0.371 year lower. The coefficient of hispanic (-0.012) is not statistically significant. To test whether the effect of ethnic group depends on parents' education, I add four interaction terms to the regression in (a) (you can also use the regression from b, c or d): bmedu (black × momcoll), bdedu (black × dadcoll), hdedu (hispanic × dadcoll) and hmedu (hispanic × momcoll). None of the four coefficient estimates is statistically significant. Then, I conduct a joint hypothesis test of whether the four coefficients are jointly equal to zero. The F-statistic is 0.185 and the p-value is 0.946 (as reported in the bottom panel of Table 1), which suggests that we cannot reject the hypothesis that the effect of ethnic group does not depend on parents' education.

3.2 We study health insurance, health status, age, and employment using a random sample of more than 8000 workers in the United States surveyed in 1996. Please download the data set insurance.dta from Moodle to finish the question. Here is the description of the related variables:

insured  : health insurance binary variable, 1 = insured
healthy  : self-reported health status binary variable, 1 = healthy
selfemp  : employment binary variable, 1 = self-employed
age      : age in years
age2     : age^2
familysz : family size
male     : sex binary variable, 1 = male
married  : married binary variable, 1 = married
deg_nd   : education binary variable, 1 = no degree
deg_ged  : education binary variable, 1 = GED (high school equivalent)
deg_hs   : education binary variable, 1 = high school
deg_ba   : education binary variable, 1 = bachelor
deg_ma   : education binary variable, 1 = masters
deg_phd  : education binary variable, 1 = Ph.D.
deg_oth  : education binary variable, 1 = other
race_bl  : race binary variable, 1 = Black
race_wht : race binary variable, 1 = White
race_ot  : race binary variable, 1 = other than Black or White

For the following questions, please use observations from those who report their

health status as healthy only.

Answer: All regression results are presented in the following table

                   Model 1       Model 2       Model 3       Model 4
(Intercept)        0.3391 ***    -0.5272 ***   -0.3157 ***   0.4527 ***
                   (0.0577)      (0.1968)      (0.1125)      (0.0299)
selfemp1           -0.1795 ***   -1.2452 ***   -0.7091 ***   -0.2822 ***
                   (0.0144)      (0.0858)      (0.0492)      (0.0319)
age                0.0097 ***    0.0260 ***    0.0151 ***    0.0036 ***
                   (0.0028)      (0.0032)      (0.0018)      (0.0004)
age2               -0.0001 **
                   (0.0000)
familysz           -0.0183 ***   -0.1041 ***   -0.0595 ***   -0.0183 ***
                   (0.0033)      (0.0208)      (0.0121)      (0.0033)
male               -0.0399 ***   -0.3069 ***   -0.1646 ***   -0.0395 ***
                   (0.0082)      (0.0634)      (0.0355)      (0.0082)
married            0.1441 ***    1.0212 ***    0.5754 ***    0.1348 ***
                   (0.0104)      (0.0731)      (0.0409)      (0.0104)
deg_ged            0.1470 ***    0.6801 ***    0.4106 ***    0.1485 ***
                   (0.0288)      (0.1488)      (0.0877)      (0.0287)
deg_hs             0.2444 ***    1.2873 ***    0.7625 ***    0.2461 ***
                   (0.0169)      (0.0835)      (0.0493)      (0.0168)
deg_ba             0.3072 ***    1.8568 ***    1.0765 ***    0.3123 ***
                   (0.0178)      (0.1126)      (0.0631)      (0.0177)
deg_ma             0.3256 ***    2.2679 ***    1.2812 ***    0.3282 ***
                   (0.0195)      (0.2030)      (0.1029)      (0.0194)
deg_phd            0.3548 ***    2.5232 ***    1.4108 ***    0.3553 ***
                   (0.0270)      (0.3858)      (0.1929)      (0.0272)
deg_oth            0.2819 ***    1.5989 ***    0.9232 ***    0.2853 ***
                   (0.0207)      (0.1456)      (0.0810)      (0.0206)
race_wht1          0.0306 **     0.2192 **
                   (0.0137)      (0.0919)
race_ot1           -0.0248       -0.1751
                   (0.0271)      (0.1784)
reg_ne             -0.0130       -0.1226       -0.0641       -0.0114
                   (0.0116)      (0.1020)      (0.0562)      (0.0116)
reg_so             -0.0449 ***   -0.3857 ***   -0.2093 ***   -0.0443 ***
                   (0.0105)      (0.0879)      (0.0486)      (0.0105)
reg_we             -0.0556 ***   -0.4365 ***   -0.2396 ***   -0.0541 ***
                   (0.0120)      (0.0932)      (0.0520)      (0.0120)
race_wht                                       0.1287 **     0.0292 **
                                               (0.0526)      (0.0137)
race_ot                                        -0.1077       -0.0256
                                               (0.1021)      (0.0271)
selfemp1:married                                             0.1384 ***
                                                             (0.0354)
N                  8173          8173          8173          8173
R2                 0.1484                                    0.1503
AIC                6727.8681     6868.5232     6865.1613     6709.6267
BIC                6861.0314     6987.6692     6984.3073     6842.7899
Pseudo R2                        0.2164        0.2170
Standard errors are heteroskedasticity robust. *** p < 0.01; ** p < 0.05; * p < 0.1.

(4 points) (a) Estimate a linear probability model with insured as the dependent variable and the following regressors: selfemp, age, age2, deg_ged, deg_hs, deg_ba, deg_ma, deg_phd, deg_oth, race_wht, race_ot, reg_ne, reg_so, reg_we, male, married. How does health insurance status vary with age? Is there a nonlinear relationship between the probability of being insured and age?

Answer: The coefficient on the linear term age is positive while the coefficient on the quadratic term age2 is negative. The probability of being insured rises with age, but the effect of an additional year of age declines with age, so it is greater for young people than for old people. There is a nonlinear relationship between the probability of being insured and age, since the coefficient on the quadratic term age2 is statistically significant at the 5% level.
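
The declining effect of age can be illustrated with the Model 1 estimates. This is a rough sketch only: the age2 coefficient is reported as -0.0001 after rounding to four decimals, so the numbers below are indicative rather than exact, and the function name is chosen here for illustration.

```python
# Model 1 (linear probability model) estimates from the regression table.
B_AGE = 0.0097
B_AGE2 = -0.0001  # rounded in the table, so results are approximate


def marginal_effect_of_age(age: float) -> float:
    """Derivative of the predicted probability with respect to age:
    d/d(age) [b1 * age + b2 * age^2] = b1 + 2 * b2 * age."""
    return B_AGE + 2 * B_AGE2 * age


me_at_25 = marginal_effect_of_age(25)
me_at_45 = marginal_effect_of_age(45)

# Age at which the marginal effect reaches zero under these rounded estimates.
turning_point = -B_AGE / (2 * B_AGE2)

print(f"marginal effect at 25: {me_at_25:.4f}")
print(f"marginal effect at 45: {me_at_45:.4f}")
print(f"turning point: about age {turning_point:.1f}")
```

The marginal effect is positive over most of the working-age range but shrinks steadily with age, which is the pattern the answer describes.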

(4 points) (b) Now drop the variable age2 and estimate a logit model using the remaining regressors. How does health insurance status vary with age in this model? Are the self-employed less likely to have health insurance than wage earners? How does a white individual differ from a black individual in terms of having insurance? (Note: race_bl is omitted from the regression.)

Answer: The marginal effects are estimated in the following table.

[Table: marginal effects from the logit model]

From the marginal effect calculation, if the individual is one year older, the probability of having health insurance is higher by 0.3 percentage points. Compared with wage earners, and holding other factors the same, self-employed individuals are 19.8 percentage points less likely to have health insurance. Compared with a black individual, and holding other factors the same, a white individual is 3 percentage points more likely to have health insurance. In the regression table, the estimates for age and self-employment are statistically significant at the 1% level, and the estimate for white at the 5% level.
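
The step from a logit coefficient to a marginal effect can be sketched as follows. For a continuous regressor in a logit model, the marginal effect at a given point is the coefficient times p(1 − p), where p is the predicted probability at that point. The value `p_assumed = 0.87` below is an assumed illustrative probability, not a figure from the homework, and the source's marginal effects may have been computed differently (for example, averaged over the sample).

```python
def logit_marginal_effect(beta: float, p: float) -> float:
    """Marginal effect of a continuous regressor in a logit model,
    evaluated at predicted probability p: beta * p * (1 - p)."""
    return beta * p * (1.0 - p)


BETA_AGE = 0.0260   # logit coefficient on age from Model 2
p_assumed = 0.87    # assumed predicted probability, for illustration only

me_age = logit_marginal_effect(BETA_AGE, p_assumed)
print(f"marginal effect of one more year of age: {me_age:.4f}")
```

With a predicted probability in this range, the logit coefficient of 0.026 translates into a marginal effect of about 0.003, the same order of magnitude as the 0.3 reported in the answer.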

(4 points) (c) Estimate a probit model using the same regressors as in (b). In terms of having health insurance, how does a white individual aged 25 differ when he/she is self-employed? How about a white individual aged 35 who is self-employed? Is the effect of self-employment on insurance different for married workers than for unmarried workers?

Answer: The regression outcome is reported in the third column of the regression table, and the following tables present the marginal effects based on the probit model.

[Tables: marginal effects from the probit model]

For white individuals aged 25, the probability of having health insurance is 22 percentage points lower if they are self-employed than if they are wage earners. For white individuals aged 35, it is 20 percentage points lower. For married individuals, the probability of having health insurance is 17.3 percentage points lower if they are self-employed; for unmarried individuals it is 23.7 percentage points lower. Therefore, the effect of self-employment on insurance differs by 6.4 percentage points between married and unmarried individuals.

(3 points) (d) Use a linear probability model to answer the question: Is the effect of self-employment on insurance different for married workers than for unmarried workers? Is your answer consistent with the answer in (c)?

Answer: The result is reported in the fourth column of the regression table. The estimate for the interaction term between self-employment and marital status is 0.1384 (13.8 percentage points), statistically significant at the 1% level. It suggests that the effect of self-employment on insurance differs by 13.8 percentage points between married and unmarried individuals, which is much larger than the 6.4 percentage points that (c) suggests. The direction is the same as in (c): the negative effect of self-employment is smaller for married workers.
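
The comparison between (c) and (d) can be laid out explicitly. The sketch below uses the Model 4 coefficients from the regression table and the probit marginal effects quoted in (c); the variable names are illustrative.

```python
# Model 4 (linear probability model with interaction) from the regression table.
B_SELFEMP = -0.2822           # effect of self-employment for unmarried workers
B_SELFEMP_MARRIED = 0.1384    # selfemp x married interaction

lpm_effect_unmarried = B_SELFEMP
lpm_effect_married = B_SELFEMP + B_SELFEMP_MARRIED
lpm_gap = lpm_effect_married - lpm_effect_unmarried  # equals the interaction

# Probit marginal effects quoted in the answer to (c).
probit_effect_married = -0.173
probit_effect_unmarried = -0.237
probit_gap = probit_effect_married - probit_effect_unmarried

print(f"LPM:    married {lpm_effect_married:.4f}, unmarried {lpm_effect_unmarried:.4f}, gap {lpm_gap:.4f}")
print(f"probit: married {probit_effect_married:.3f}, unmarried {probit_effect_unmarried:.3f}, gap {probit_gap:.3f}")
```

Both models agree on the sign of the gap, but the linear probability model attributes roughly twice as large a difference to marital status as the probit marginal effects do.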

52 pages
(eBook PDF) Introduction to Research in Education 9th Edition download
100% (4)
(eBook PDF) Introduction to Research in Education 9th Edition download
50 pages
Canada Revenue Agency ATIP Response
No ratings yet
Canada Revenue Agency ATIP Response
87 pages
Scientific computing with case studies 1st Edition Dianne P. O'Leary download pdf
100% (4)
Scientific computing with case studies 1st Edition Dianne P. O'Leary download pdf
85 pages
Consensus Learning: A Novel Decentralised Ensemble Learning Paradigm
No ratings yet
Consensus Learning: A Novel Decentralised Ensemble Learning Paradigm
41 pages
Global Clinical Engieering 15-31-PB VOL 4 N3-2022
No ratings yet
Global Clinical Engieering 15-31-PB VOL 4 N3-2022
57 pages
(PPT) - Topic 4
No ratings yet
(PPT) - Topic 4
24 pages
OPSCMProject AnirudhaChakraborty
No ratings yet
OPSCMProject AnirudhaChakraborty
5 pages
Science Orientation
No ratings yet
Science Orientation
16 pages
9 Science EM
No ratings yet
9 Science EM
62 pages
Sound Chapter Experiments
No ratings yet
Sound Chapter Experiments
5 pages
Institute of Dentistry, CMH Lahore Medical College Curriculum & Study Guide First Year BDS Deaniod@cmhlahore - Edu.pk
No ratings yet
Institute of Dentistry, CMH Lahore Medical College Curriculum & Study Guide First Year BDS Deaniod@cmhlahore - Edu.pk
160 pages
Datasheet Eaf4-2-Eam4 c13 HBK
No ratings yet
Datasheet Eaf4-2-Eam4 c13 HBK
1 page
Open 1 4
No ratings yet
Open 1 4
16 pages
Organizational Leadership
No ratings yet
Organizational Leadership
9 pages
Grade 7 First Quarter Exam
No ratings yet
Grade 7 First Quarter Exam
5 pages
Dong 2021 - Uptake of Microplastics by Carrots in Presence of As (III)
No ratings yet
Dong 2021 - Uptake of Microplastics by Carrots in Presence of As (III)
12 pages
Anubhav verma CV
No ratings yet
Anubhav verma CV
1 page
De HSG Huyen Di Linh - 22 - 23
No ratings yet
De HSG Huyen Di Linh - 22 - 23
9 pages
Worksheet - Molar Mass - Answers
No ratings yet
Worksheet - Molar Mass - Answers
1 page
RPS Report CH17D009
No ratings yet
RPS Report CH17D009
30 pages
Management Consulting Guide 2020
No ratings yet
Management Consulting Guide 2020
4 pages
A Foucauldian Study of Power Gender and
No ratings yet
A Foucauldian Study of Power Gender and
13 pages
Installation/Service Manual: Slow Exhaust Autoclave & Rapid Exhaust Autoclave
No ratings yet
Installation/Service Manual: Slow Exhaust Autoclave & Rapid Exhaust Autoclave
35 pages
(Communications and Culture) Janet Wolff (Auth.) - The Social Production of Art-Macmillan Education UK (1981)
100% (1)
(Communications and Culture) Janet Wolff (Auth.) - The Social Production of Art-Macmillan Education UK (1981)
204 pages
E&E1
No ratings yet
E&E1
3 pages

hw3 Spring2024 Solution

Uploaded by

hw3 Spring2024 Solution

Uploaded by

Econ3005: Solution for Applied Econometrics, Spring 2024

Part 1 Multiple Choice (24 points, 3 each)

Please choose the answer(s) that you think is(are) appropriate.

(Non-graded exercises) A nonlinear function

b. can be adequately described by a straight line between the dependent variable

and one of the explanatory variables.

since you cannot draw a line in four dimensions.

d. is a function with a slope that is not constant.

(Non-graded exercises) The binary variable interaction regression

b. is the same as testing for differences in means.

c. cannot be used with logarithmic regression functions because ln(0) is not defined.

on the value of the other binary variable.

(Non-graded exercises) The interpretation of the slope coefficient in the model

a. 1% change in X is associated with a β1 % change in Y.

b. 1% change in X is associated with a change in Y of 0.01β1 .

d. change in X by one unit is associated with a β1 change in Y.

1.1 In the regression model Yi = β0 + β1 Xi + β2 Di + β3 (Xi × Di ) + ui , where X is

identical, you must use the

a. t-statistic separately for β2 = 0, β3 = 0.

the linear regression.

b. compare the TSS from both regressions.

to positive, etc., then the polynomial regression should be used.

d. use the test of (r-1) restrictions using the F-statistic.

1.3 The major flaw of the linear probability model is that

b. the regression R² cannot be used as a measure of fit.

c. people do not always make clear-cut decisions.

d. the predicted values can lie above 1 and below 0.
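
Option d can be illustrated numerically. The following is a minimal sketch only: the data are made up for illustration (they are not from the homework), and the helper name fit_ols is hypothetical. It fits a linear probability model by OLS on a binary outcome and lists the fitted values that fall outside [0, 1].

```python
# Hypothetical data (not from the homework): binary outcome y, one regressor x.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [0, 0, 0, 0, 1, 1, 1, 1]


def fit_ols(x, y):
    """Return (intercept, slope) of the OLS regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


# Linear probability model: the fitted value is read as Pr(Y = 1 | X).
b0, b1 = fit_ols(x, y)
fitted = [b0 + b1 * xi for xi in x]

# Collect fitted "probabilities" that are below 0 or above 1.
outside = [(xi, round(p, 3)) for xi, p in zip(x, fitted) if p < 0 or p > 1]
print(outside)
```

In this made-up sample the fitted values at the smallest and largest x fall outside the unit interval, which is the flaw the question refers to.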

manner to the probit model, with the exception of the

b. signicance test using the t-statistic.

c. 95% condence interval using 1.96 times the standard error.

1.5 When estimating probit and logit models,

a. the t-statistic should still be used for testing a single restriction.

b. you cannot have binary variables as explanatory variables as well.

c. F-statistics should not be used, since the models are nonlinear.

d. it is no longer true that the R̄² < R²

that the dependent variable will equal one.

that the dependent variable will equal one.

1.7 In the expression Pr(Y = 1|X) = Φ(β0 + β1 X),

a. (β0 + β1 X) plays the role of z in the cumulative standard normal distribution

b. β1 cannot be negative since probabilities have to lie between 0 and 1.

d. min(β0 + β1 X) > 0 since probabilities have to lie between 0 and 1.
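
The role of Φ in question 1.7 can be checked with a short sketch. The coefficient values below are hypothetical and chosen only to show that a negative β1 still yields probabilities strictly between 0 and 1, because Φ maps any real index z = β0 + β1 X into that interval.

```python
import math


def std_normal_cdf(z):
    """Cumulative standard normal distribution function, Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probit_prob(b0, b1, x):
    """Pr(Y = 1 | X = x) = Phi(b0 + b1 * x) in a probit model."""
    return std_normal_cdf(b0 + b1 * x)


# Hypothetical coefficients for illustration only; note that b1 is negative.
b0, b1 = 0.5, -1.2
xs = [-3, -2, -1, 0, 1, 2, 3]
probs = [probit_prob(b0, b1, x) for x in xs]

for x, p in zip(xs, probs):
    print(f"x = {x:>2}: index z = {b0 + b1 * x:+.1f}, Pr(Y = 1 | x) = {p:.4f}")
```

The index β0 + β1 X is negative for some x and positive for others, yet every probability stays inside (0, 1) and falls as x rises, which is why options b and d do not hold.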

the exception of whether or not

a. a college student decides to study abroad for one semester.

b. being a female has an effect on earnings.

c. a college student will attend a certain college after being accepted.

d. applicants will default on a loan.

also address the question that Dr. Qin wants to analyze.

(12 years of education), the gender gap is approximately 30% (-0.521 + 0.018*12 = -0.305),

23% (-0.521 + 0.018*16 = -0.233). The potential experience variable enters in an inverted U-shape,

Your regression output is:

Winpct = −0.19 − 0.099 × teamera + 1.49 × ops, R² = 0.92
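
As a rough sketch of how the estimated equation above is used, the snippet below evaluates it at one pair of inputs. The input values (a team ERA of 4.0 and an OPS of 0.75) are hypothetical and are not taken from the homework data; the function name predicted_winpct is likewise an illustrative choice.

```python
def predicted_winpct(teamera, ops):
    """Fitted winning percentage from the reported regression."""
    return -0.19 - 0.099 * teamera + 1.49 * ops


# Hypothetical inputs for illustration only (not from the homework data).
teamera, ops = 4.0, 0.75
base = predicted_winpct(teamera, ops)

# Holding ops fixed, a one-unit increase in team ERA changes the fitted
# winning percentage by the teamera coefficient.
era_effect = predicted_winpct(teamera + 1.0, ops) - base

print(f"fitted Winpct: {base:.4f}")
print(f"effect of +1 ERA: {era_effect:.3f}")
```

Because the model is linear, the change from a one-unit increase in teamera equals its coefficient regardless of the starting values.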

of approximately 15 percent. The regression explains 92 percent of the variation in

small dierences in winning percentage, they are also important.

W inpct = −0.29 + 0.10 × DAL − 0.100 × teamera + 0.008 × (DAL × teamera)

eect of pitching and hitting between Al and NL.

1% level? Should you worry about the small sample size?

really distributed as F3,∞, and, as a result, inference is problematic here.

shown in the table:

more season (a minimum of 50 at bats or 25 innings pitched), number of seasons played

probability, in general, for a player to return the following season.

the answer in (a)?

test the hypothesis that the coefficients are the same.

between the coefficients for hitters and pitchers.

Part 3 Long Questions (47 points in total)

CollegeDistance_DataDescription.pdf to answer the following questions.

and empirical evidence (i.e. regressions or tests).

coecient of tuition or ln(tuition) mean?

mothers are not college graduates (named as momnoncoll ). Run a regression of ed on

(4 points) (f ) Whether the eect of dist on ed depends on dad's education and

Answer: The regression results are reported in Table 1.

signicant and we cannot reject that the eect is actually zero.

the years of education completed is increased by 0.013%. The estimate is statistically

regions tend to invest more in children's education.

0.012%. The estimate is statistically significant at the 5% level.

these estimates are statistically significant at the 1% level.

not depend on parents' education.

effect of ethnic groups does not depend on parents' education.

health status as healthy only.

quadratic term age2 is statistically significant at the 5% level.

Answer: The marginal effects are estimated in the following table.

insurance is 3% higher. All these estimates are statistically significant at the 1% level.

at 35 if he/she is self-employed? Is the effect of self-employment on insurance different

individuals. Therefore, there is a 6.4% difference in the effect of self-employment on

13.8%, statistically significant at the 1% level. It suggests that the difference in the effect of