Old Final Exams for Econometrics BE

The document contains old final exams for Econometrics for International Economics and Business at Rijksuniversiteit Groningen, specifically from the 2014-2015 academic year. It includes instructions for the exam format, multiple-choice questions, and open questions, along with answers provided for each question. The exam tests knowledge on regression models, statistical significance, and econometric concepts.


all old final exams

Econometrics for BE (Rijksuniversiteit Groningen)


Name: _________________________________________

Midterm Exam
Student Number: _____________

Final Exam

Econometrics for International Economics and Business 2014-2015
Tuesday, April 7, 9:00-12:00

Instructions:
1. Please answer all 20 Multiple Choice Questions in Part I, choosing the most
appropriate choice. Use the computer sheet provided and follow the instructions
provided on the computer sheet.
2. Answer the three Open Questions in this exam booklet. You can earn 15 points for
the open questions. Each open question is worth 5 points.
3. Please refer to the computer output to answer the Open Questions.
4. The grade is an equally weighted average of the grades of Parts I and II:
- Your score for Part I is 1 + 9*(C − 5)/(20 − 5), where C is the number of correctly
answered multiple-choice questions.
- Your score for Part II is 1 + 9*D/15, where D is the number of points earned.
5. You are required to submit all materials after completing this examination.
6. You are not allowed to use a graphical calculator but only a single line calculator. The
types allowed are Casio fx-82ES (PLUS) or the Casio fx-82MS as in Mathematics and
Data Analysis from your first year.
7. Please do not write on the tables and formula sheets as we would like to re-use
them (save trees!). The tables may not include critical statistics for all degrees of
freedom. Choose the closest degree of freedom or infinity where it applies.
8. You are not allowed to visit the toilet during this exam.
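
As an illustration only, the grading rule in instruction 4 can be sketched in Python. The function names are hypothetical, and the formulas are applied exactly as written, without any adjustment for C below 5, which the instructions do not address.

```python
def part_one_score(correct: int) -> float:
    """Score for Part I: 1 + 9*(C - 5)/(20 - 5), C = correct multiple-choice answers."""
    return 1 + 9 * (correct - 5) / (20 - 5)


def part_two_score(points: float) -> float:
    """Score for Part II: 1 + 9*D/15, D = points earned on the open questions."""
    return 1 + 9 * points / 15


def final_grade(correct: int, points: float) -> float:
    """Equally weighted average of the Part I and Part II scores."""
    return (part_one_score(correct) + part_two_score(points)) / 2


# Example: 14 correct multiple-choice answers and 9 open-question points.
print(final_grade(14, 9))
```

Under this rule, 5 correct answers out of 20 give the minimum Part I score of 1, which corresponds to the expected result of pure guessing on four-option questions.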

Score on Open Question 1 | Score on Open Question 2 | Score on Open Question 3 | Total Score Open Questions


Part I Multiple Choice Questions

1. Consider the following multiple regression model: yi = β1 + β2 xi2 + β3 xi3 + ei. Which of the
following statements about the variance of the least squares estimators is NOT correct?

A. The variance of the least squares estimator of β2 can be reduced by increasing the
sample size.
B. The variance of the least squares estimator of β2 is smaller if the correlation between
x2 and x3 is smaller.
C. The variance of the least squares estimator of β2 is larger if the variance of the errors
is larger.
D. The variance of the least squares estimator is larger if there is more variation in x2
around its mean.

Answer: D

Questions 2 to 4 use the following estimation output:

Suppose someone is interested in the relationship between UK real consumption growth
(ΔCt), real income growth (ΔYt) and the growth in real investment (ΔIt), and proceeds to
estimate the following regression:

ΔCt = β1 + β2 ΔYt + β3 ΔIt + et. Model (1)

(Note that C, Y, and I are in log levels). The estimates, produced via the least squares
estimation method, are presented as follows:

ΔCt = 0.1810 + 1.8959 ΔYt − 0.0704 ΔIt
(t)  (1.8959)  (8.4198)     (−3.7017)

The sample size is N = 92; the sum of squared residuals is Σt êt² = 39.3601; and the standard
deviation of the dependent variable is sΔC = 0.8861. The t-statistics are reported in the brackets.

2. To test the overall significance of the regression Model (1), one uses the F-test. Which of
the following sets of hypotheses is correct?

A. H0: β1 = 0, β2 = 0, β3 = 0 vs H1: β1 ≠ 0, β2 ≠ 0, β3 ≠ 0.
B. H0: β1 = 0, β2 = 0, β3 = 0 vs H1: at least one of the βk is non-zero for k = 1, 2, 3.
C. H0: β2 = 0, β3 = 0 vs H1: at least one of the βk is non-zero for k = 2, 3.
D. H0: β2 = 0, β3 = 0 vs H1: β2 ≠ 0, β3 ≠ 0.

Answer: C


3. Continuing with the F-test for the overall significance of the regression Model (1), which of
the following gives the correct F-test statistic?

A. 36.28
B. 25.00
C. 24.46
D. 75.00

Answer: A
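
One way to arrive at this answer, sketched in Python as an illustration: the total sum of squares is recovered from the reported standard deviation of the dependent variable as SST = (N − 1) × s², and the F-statistic compares the explained variation per restriction with the residual variance. The variable names are hypothetical; the numbers come from the estimation output above.

```python
n_obs = 92          # sample size N
n_params = 3        # beta1, beta2, beta3
sse = 39.3601       # sum of squared residuals
sd_dep = 0.8861     # standard deviation of the dependent variable

# Total sum of squares from the sample standard deviation.
sst = (n_obs - 1) * sd_dep ** 2

# Overall significance test: J = K - 1 slope restrictions.
n_restrictions = n_params - 1
f_stat = ((sst - sse) / n_restrictions) / (sse / (n_obs - n_params))

r_squared = 1 - sse / sst
print(round(f_stat, 2), round(r_squared, 3))
```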

4. Using a 5% level of significance, the critical value for the F-statistic with the
corresponding degrees of freedom is Fc = 3.099. Which of the following is the correct
conclusion of the F-test for the overall significance of the regression Model (1)?

A. Reject H0: β1 = 0, β2 = 0, β3 = 0 because the computed F-statistic is greater than Fc.
B. Reject H0: β2 = 0, β3 = 0 because the computed F-statistic is greater than Fc.
C. Cannot reject H0: β2 = 0, β3 = 0 because the computed F-statistic is greater than Fc.
D. Cannot reject H0: β1 = 0, β2 = 0, β3 = 0 because the computed F-statistic is greater
than Fc.

Answer: B

5. Which of the following statements about R² and adjusted R² is correct?

A. An increase in the values of R² and adjusted R² does not always mean that the
additional variable included in a regression model is statistically significant.
B. The value of adjusted R² can increase or decrease when an additional variable is
included in a regression model.
C. Both statement A and statement B are correct.
D. Statement B is correct but statement A is incorrect.

Answer: C

6. The precision of the estimates for the intercept and slope increases if

A. the error variance is smaller.
B. the number of observations increases.
C. the variance of the independent variables around their means increases.
D. All of the above.

Answer: D


7. Which of the following conditions is NOT necessary for OLS estimators to be BLUE?

A. The independent variable takes at least two values.


B. The variance of the error term is constant.
C. The expected value of the error term is 0.
D. The error term is normally distributed.

Answer: D

8. If the PDF (probability density function) has a peak at zero, then this distribution CAN
NOT be
A. a t distribution.
B. a standard normal distribution.
C. an F distribution.
D. Any of the above.

Answer: C

9. Consider the model yi = β1 + β2xi + ei. If the 95% confidence interval of β2 is [-0.005, 1.995]
and you do hypothesis testing at two significance levels: (1) H0: β2 = 0 vs H1: β2 ≠ 0 at a
5% significance level, and (2) H0: β2 = 0 vs H1: β2 ≠ 0 at a 1% significance level, what is
your conclusion?

A. Reject H0 in both (1) and (2).


B. Reject H0 in (1) but not in (2).
C. Do not reject H0 in both (1) and (2).
D. Reject H0 in (2) but not in (1).

Answer: C

10. What distribution does the sum of the squares of m independently distributed
standardized normal random variables follow?
A. A t distribution with m degrees of freedom.
B. A normal distribution.
C. A chi-squared distribution with m degrees of freedom.
D. An F(1,m) distribution.

Answer: C

11. Using White standard errors with least squares estimation when heteroskedasticity is
present implies that

A. least squares estimators are BLUE and test statistics are correct.
B. least squares estimators are not BLUE but test statistics are correct.
C. least squares estimators are BLUE but test statistics are incorrect.


D. least squares estimators are not BLUE and test statistics are incorrect.

Answer: B

12. Suppose you are implementing a Lagrange Multiplier test for heteroskedasticity. In order
to calculate the test statistic for this test, you estimate a regression model in which the
dependent variable is

A. the regression error term.


B. the square of the dependent variable.
C. the least squares residuals.
D. the square of the least squares residuals.

Answer: D

13. Consider the population regression model Yi = β0 + β1Xi + β2Di + β3(Xi × Di) + ei, where
Xi is a continuous variable and Di is a (0, 1) dummy variable. In this model, β2

A. is the difference in the means in Yi between the two categories.


B. indicates the difference in the intercepts of the two regressions.
C. is usually positive.
D. indicates the difference in the slopes of the two regressions.

Answer: B

14. You estimate a model in which you examine the relationship between the dependent
variable which is the natural logarithm (ln) of earnings (Earn is weekly earnings in
euros), and independent variables which are Age and a dummy for whether the individual
is female or not (Female =1 for women, 0 otherwise). The estimation results (with
standard errors se between brackets) are:

ln(Earn) = 5.44 + 0.015 × Age − 0.421 × Female, R² = 0.17, SSE = 0.75
(se)      (0.08)  (0.002)       (0.036)

From the estimated equation we can conclude that

A. weekly earnings increase by approximately 0.015% for every additional year in an
individual’s life.
B. weekly earnings increase by approximately 1.5% for every additional year in an
individual’s life.
C. weekly earnings increase by approximately 1.5 Euros per week for every additional
year in an individual’s life.
D. weekly earnings increase by approximately 0.015 Euros per week for every
additional year in an individual’s life.

Answer: B
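
The reasoning behind this answer can be sketched in Python as an illustration. In a log-linear model, a one-unit change in a regressor changes the dependent variable by approximately 100 × b percent; the exact figure is 100 × (exp(b) − 1). The variable names are hypothetical.

```python
import math

b_age = 0.015  # estimated coefficient on Age in the ln(Earn) equation

# Rule-of-thumb interpretation: 100 * b percent per additional year.
approx_pct = 100 * b_age

# Exact percentage change implied by the log-linear form.
exact_pct = 100 * (math.exp(b_age) - 1)

print(approx_pct, round(exact_pct, 2))
```

For a coefficient this small the approximation and the exact value are nearly identical, which is why the answer is stated as "approximately 1.5%".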

15. In the context of a standard linear regression model, you are testing a restricted vs. an
unrestricted model and have imposed 1 restriction to obtain the restricted model. The
value of SSE (the sum of squared least squares errors) for the restricted model is
2683.411 while the value of SSE for the unrestricted model is 1532.084. There are 75
observations in the dataset and 3 independent variables in the unrestricted model. The
F-statistic in this case is

A. 21.258.
B. 53.355.
C. 54.106.
D. 21.278.

Answer: B
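
As a sketch of the calculation, assuming the unrestricted model also contains an intercept (so K = 3 + 1 = 4 parameters), the F-statistic is F = [(SSE_R − SSE_U)/J] / [SSE_U/(N − K)]. The variable names are hypothetical.

```python
sse_restricted = 2683.411
sse_unrestricted = 1532.084
n_obs = 75
n_params = 3 + 1        # three regressors plus an intercept (assumed)
n_restrictions = 1

f_stat = ((sse_restricted - sse_unrestricted) / n_restrictions) / (
    sse_unrestricted / (n_obs - n_params)
)
print(round(f_stat, 3))
```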

16. Which statement about serial correlation is correct?

A. Serial correlation leads to biased estimators.


B. One of the assumptions of the linear regression time series model is that the errors are
uncorrelated.
C. Applying HAC or Newey-West robust standard errors makes the estimators BLUE.
D. All of the above.

Answer: B

17. We estimate a Phillips curve with inflation (INF) related to unemployment (U) and oil
price changes (POIL). The test equation for the serial correlation Lagrange multiplier
(LM) test (with standard errors se between brackets) is:

êt = −8.83×10⁻⁶ − 0.001U + 0.003POIL + 0.875 êt−1
(se)  (0.001)    (0.010)   (0.002)     (0.040)

The number of observations T = 159, R² = 0.759. Use this information and the statistical
tables. Is there serial correlation (test at a 5% significance level)?

A. Yes, the test statistic is smaller than the critical Chi-square (χ²) value.
B. Yes, the test statistic is bigger than the critical Chi-square (χ²) value.
C. No, the test statistic is smaller than the critical Chi-square (χ²) value.
D. No, the test statistic is bigger than the critical Chi-square (χ²) value.

Answer: B
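
A minimal sketch of the test in Python, assuming the usual form of the LM statistic, LM = T × R², which under the null of no serial correlation follows a χ² distribution with one degree of freedom (one lagged residual in the test equation). The 5% critical value of χ²(1) is 3.841. The variable names are hypothetical.

```python
n_obs = 159            # T, as reported
r_squared = 0.759      # R-squared of the auxiliary regression
chi2_crit_5pct = 3.841 # 5% critical value of chi-square with 1 degree of freedom

lm_stat = n_obs * r_squared
serial_correlation = lm_stat > chi2_crit_5pct

print(round(lm_stat, 3), serial_correlation)
```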


18. Applying GLS when errors follow an AR(1) model to deal with serial correlation

A. makes the estimators BLUE.


B. only calculates correct standard errors, and keeps estimators unchanged.
C. makes the estimators unbiased.
D. All of the above.

Answer: A

19. What is a characteristic of a non-stationary series?

A. The mean depends on the initial value.


B. Series return to a long-run mean.
C. The variance is time dependent and converges to zero over time.
D. In finite samples, the sample autocorrelations die out quickly.

Answer: A

20. Which of the following could be used as a test for autocorrelation up to the third order?

A. The Durbin-Watson test.


B. The White test.
C. The RESET test.
D. The Breusch-Godfrey Lagrange multiplier (LM) test.

Answer: D


Part II Open Questions

Question 1

Consider the following model for salary determination:

Wagei = β1 + β2 Experi + β3 Marriedi + β4 Experi × Marriedi + ei,

where Wage is salary in terms of thousands of dollars, Married is an indicator variable
indicating whether the employee is married (1 for married workers; 0 for unmarried workers),
and Exper is the number of years of experience the person has. The regression output is in
Table 1.1.

a) Construct a 99% confidence interval for the average salary of an unmarried person with
no working experience. (1 point)

b) Perform a t-test of whether the marginal effect of a one-year increase in experience on
salary is lower for married people than for unmarried people. Indicate the degrees of
freedom. (2 points)

c) What is the marginal effect of experience on salary for unmarried persons? Interpret this.
For a married person who has 10 years of experience, what is the marginal effect of a
one-year increase in experience on his salary? (2 points)

Table 1.1. Least squares regression results

. reg wage exper married exper_married

Source   |         SS    df          MS      Number of obs =   1000
---------+-----------------------------      F(3, 996)     =   7.99
Model    |  3865.3471     3  1288.44903      Prob > F      = 0.0000
Residual | 160700.081   996  161.345463      R-squared     = 0.0235
---------+-----------------------------      Adj R-squared = 0.0205
Total    | 164565.428   999  164.730158      Root MSE      = 12.702

wage          |     Coef.  Std. Err.      t  P>|t|   [95% Conf. Interval]
--------------+----------------------------------------------------------
exper         |  .0892401   .0441992   2.02  0.044   .0025059   .1759743
married       |  4.457092   1.860519   2.40  0.017   .8061048   8.108079
exper_married | -.0464704   .0635807  -0.73  0.465  -.1712379   .0782972
_cons         |  16.42872   1.221768  13.45  0.000   14.03118   18.82625


Answers:

a) It is 16.429 ± t(0.995, ∞) × 1.22 = 16.429 ± 2.576 × 1.22 = 16.429 ± 3.143. (1 point)

b) H0: β4 = 0 vs. H1: β4 < 0. We can read the t-statistic from the output or use (b4 − 0)/se(b4)
to get t-statistic = −0.73. The degrees of freedom are 996. The critical t-statistic is tc =
t(0.05, ∞) = −1.645. Since t > tc, we do not reject H0.
(0.5 point for the correct hypothesis, 1 point for correct critical t-statistic, 0.5 point for
right conclusion.)

c) For an unmarried person it is β2. For an unmarried person, salary increases by 89.24
dollars for every additional year of experience. (1 point for complete answer, 0.5 for
just stating that it is β2).

For the married person (with 10 years of experience) it is b2 + b4 = 0.0892 − 0.0465 ≈ 0.0428
thousand dollars, or about 42.8 dollars. (1 point). Note that the number of years of
experience does not matter here.
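
The three answers to Question 1 can be reproduced with a short Python sketch, given here as an illustration only. It uses the unrounded coefficients and standard errors from Table 1.1, so the confidence interval half-width differs slightly from the answer key, which rounds the standard error to 1.22. The critical value 2.576 is the standard normal value used for t(0.995, ∞); the variable names are hypothetical.

```python
# Estimates from Table 1.1 (wage measured in thousands of dollars).
b_cons, se_cons = 16.42872, 1.221768
b_exper = 0.0892401
b_inter, se_inter = -0.0464704, 0.0635807

# a) 99% confidence interval for the intercept (unmarried, no experience).
t_crit_99 = 2.576
half_width = t_crit_99 * se_cons
ci_99 = (b_cons - half_width, b_cons + half_width)

# b) One-sided t-test of H0: beta4 = 0 against H1: beta4 < 0 at the 5% level.
t_inter = (b_inter - 0) / se_inter
reject_h0 = t_inter < -1.645

# c) Marginal effect of one more year of experience, in dollars.
effect_unmarried = 1000 * b_exper
effect_married = 1000 * (b_exper + b_inter)

print(ci_99, round(t_inter, 2), reject_h0)
print(round(effect_unmarried, 2), round(effect_married, 2))
```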

Question 2

Using a data set on hourly wages (Wage), education (Educ) and experience (Exper), you
estimate the following regression:

ln(Wagei) = β1 + β2 Educi + β3 Experi + β4 Experi² + β5 Experi × Educi + β6 Marriedi + ei

where Married is a dummy variable (1 for married workers; 0 for unmarried workers).

a) Using a 1% significance level, conduct a formal hypothesis test to test the null
hypothesis that the hourly wages of unmarried workers and married workers are the same
vs. the alternative hypothesis that the hourly wage of married workers is higher than that of
unmarried workers (state the null and alternative hypotheses, test statistic and critical value
and explain your conclusion). See Table 2.1 for the least squares estimation results. (2
points)

b) A colleague of yours says that it is not correct to estimate this model with both married
and unmarried workers together in the same sample. She says that the data should be
split into two subsamples, one for married workers and one for unmarried workers. She
says that the model should be estimated separately for these two subsamples. What could
be the reasoning behind this suggestion? (No calculations required!) (1 point)

c) Your colleague suggests that the error variance in wages may be different for married vs.
unmarried workers. Using computer output (Table 2.2), apply the Goldfeld-Quandt test
to find out whether the error variance in wage is different for married vs. unmarried
workers. Use a 10% significance level. Formulate the hypothesis. Also state in your


answer the critical F-value. What do you conclude? (2 points)

Note that the critical values of the F-distribution are:
F(0.95,576,414) = 1.163, F(0.90,576,414) = 1.125, and
F(0.05,414,576) = 0.860, F(0.10,414,576) = 0.889.

Table 2.1. Least squares regression results

. reg lnwage [Link] [Link]#[Link] [Link]##[Link], vsquish

Source   |         SS    df          MS      Number of obs =   1000
---------+-----------------------------      F(5, 994)     =  64.73
Model    | 82.7225202     5   16.544504      Prob > F      = 0.0000
Residual | 254.058211   994  .255591762      R-squared     = 0.2456
---------+-----------------------------      Adj R-squared = 0.2418
Total    | 336.780731   999  .337117849      Root MSE      = .50556

lnwage        |     Coef.  Std. Err.      t  P>|t|   [95% Conf. Interval]
--------------+----------------------------------------------------------
[Link]        |  .0402895   .0337911   1.19  0.233  -.0260206   .1065996
[Link]#[Link] | -.0006933   .0000897  -7.73  0.000  -.0008694  -.0005173
educ          |  .1261199   .0147433   8.55  0.000   .0971883   .1550516
exper         |  .0613731   .0096289   6.37  0.000   .0424777   .0802684
[Link]#[Link] | -.0013091   .0004949  -2.65  0.008  -.0022803   -.000338
_cons         |  .5410612   .2268944   2.38  0.017   .0958141   .9863082


Table 2.2. Least squares regression results for unmarried (a) and married workers (b)

(a)
-> married = 0

Source   |         SS    df          MS      Number of obs =    419
---------+-----------------------------      F(4, 414)     =  39.32
Model    | 33.4771306     4  8.36928266      Prob > F      = 0.0000
Residual | 88.1199708   414  .212850171      R-squared     = 0.2753
---------+-----------------------------      Adj R-squared = 0.2683
Total    | 121.597101   418  .290902157      Root MSE      = .46136

lnwage        |     Coef.  Std. Err.      t  P>|t|   [95% Conf. Interval]
--------------+----------------------------------------------------------
[Link]#[Link] | -.0007014   .0001193  -5.88  0.000  -.0009358  -.0004669
educ          |   .151292   .0194232   7.79  0.000   .1131117   .1894723
exper         |   .072836   .0127057   5.73  0.000   .0478602   .0978117
[Link]#[Link] | -.0021448   .0006538  -3.28  0.001    -.00343  -.0008596
_cons         |  .1974878   .2944715   0.67  0.503   -.381358   .7763335

(b)
-> married = 1

Source   |         SS    df          MS      Number of obs =    581
---------+-----------------------------      F(4, 576)     =  38.48
Model    |  44.111123     4  11.0277807      Prob > F      = 0.0000
Residual | 165.072748   576  .286584632      R-squared     = 0.2109
---------+-----------------------------      Adj R-squared = 0.2054
Total    | 209.183871   580  .360661847      Root MSE      = .53534

lnwage        |     Coef.  Std. Err.      t  P>|t|   [95% Conf. Interval]
--------------+----------------------------------------------------------
[Link]#[Link] | -.0007088   .0001379  -5.14  0.000  -.0009796   -.000438
educ          |  .1008275   .0221957   4.54  0.000   .0572331   .1444219
exper         |  .0506938   .0149271   3.40  0.001   .0213756   .0800119
[Link]#[Link] |  -.000462   .0007478  -0.62  0.537  -.0019308   .0010068
_cons         |  .9196962   .3557963   2.58  0.010   .2208799   1.618512

Answers:

Answers:

a) The null and alternative hypotheses are H0: β6 = 0 vs. H1: β6 > 0. The test statistic from
the computer output is t = 0.04029/0.03379 = 1.192. The degrees of freedom associated
with the test are 994. In the t table we are given, the value of t(0.99, 50) is 2.403, and the
value of t(0.99, ∞) is 2.326. Since the test statistic is lower than both of these critical
values we conclude that we cannot reject the null hypothesis that the hourly wages of
married and unmarried workers are the same. (0.5 point for the null and alternative
hypotheses, 1 point for test statistic and the critical value and 0.5 point for the correct
conclusion.)
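
The test in part a) can be sketched in Python as an illustration, using the unrounded coefficient and standard error of the married dummy from Table 2.1 and the critical value t(0.99, ∞) = 2.326 quoted in the answer. The variable names are hypothetical.

```python
b_married = 0.0402895    # coefficient on the married dummy (Table 2.1)
se_married = 0.0337911   # its standard error
t_crit_1pct = 2.326      # one-sided 1% critical value, t(0.99, infinity)

# H0: beta6 = 0 against H1: beta6 > 0 (right-tailed test).
t_stat = (b_married - 0) / se_married
reject_h0 = t_stat > t_crit_1pct

print(round(t_stat, 3), reject_h0)
```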

b) Your colleague may be thinking that the error structure is very different for married vs.
unmarried workers. Even if all coefficients are different but the variances are the same, it
is possible to include a dummy for married/unmarried and include interactions of this
dummy with all the remaining RHS variables and estimate them together. So your
colleague suspects that there could be heteroskedasticity; that the variance in the error
term may be different for the two sub-samples. (1 point)

c) Denoting the married and unmarried groups by M and U, the null and alternative
hypotheses are
H0: σ²M = σ²U vs. H1: σ²M ≠ σ²U
The value of the F statistic is
F = σ̂²M / σ̂²U

The computer output gives us SSE divided by the degrees of freedom for both models
(under the column MS, alternatively you can take the square of Root MSE), therefore F
= 0.2866/0.2129 = 1.346 (the alternative would be to calculate the reciprocal, with unmarried
in the numerator and married in the denominator; this would result in an F statistic of 0.743).

Since F = 1.346 > FU = F(0.95,576,414) = 1.163 (alternatively, F = 0.743 < FL =
F(0.05,414,576) = 0.860) we reject the null hypothesis that the error variances are the same
for married vs. unmarried individuals. (0.5 point for the null and alternative hypotheses,
1 point for test statistic and the critical value and 0.5 point for the correct conclusion.)
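
The Goldfeld-Quandt calculation can be sketched in Python as an illustration. The residual mean squares come from Table 2.2 and the critical values from the note in the question; the variable names are hypothetical.

```python
# Residual mean squares (SSE / degrees of freedom) from Table 2.2.
var_married = 0.286584632     # 576 degrees of freedom
var_unmarried = 0.212850171   # 414 degrees of freedom

# Two-sided test at the 10% level: 5% in each tail.
f_upper = 1.163   # F(0.95, 576, 414)
f_lower = 0.860   # F(0.05, 414, 576), used with the reciprocal statistic

f_stat = var_married / var_unmarried
f_reciprocal = var_unmarried / var_married

# Both formulations lead to the same decision.
reject_h0 = f_stat > f_upper
reject_h0_reciprocal = f_reciprocal < f_lower

print(round(f_stat, 3), round(f_reciprocal, 3), reject_h0, reject_h0_reciprocal)
```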

Question 3

In this question we use monthly data of individual expenditure in purchasing new cars in
America during 1975 – 1991 to estimate the following model:

PCECARSt = ȕ1 + ȕ2 PCDPYt + ȕ3CPINEWt + ȕ4POPt + et,

where:

PCECARS : Individual expenditure on purchasing new cars (billion USD).
POP : The US population (million people).
PCDPY : Average personal income (thousand USD).
CPINEW : Consumer price index for new cars.


a) Tables 3.1 and 3.2 show the results of Augmented Dickey-Fuller tests. Is PCECARS
stationary? Is D(PCECARS) stationary? Is PCECARS an I(0) series or an I(1) series? (2
points)
b) The researcher shows that the variables are cointegrated and decides to run a regression
in levels. Is that decision warranted? (1 point)
c) Table 3.3 shows the least squares estimation results, and Table 3.4 shows the least
squares estimation results with robust variances and covariances. Do least squares
estimators overstate or understate precision if serial correlation is neglected? (2 points)

Table 3.1. Augmented Dickey-Fuller test in levels

. dfuller PCECARS, trend lags(0)

Dickey-Fuller test for unit root                Number of obs = 201

                 ---------- Interpolated Dickey-Fuller ----------
          Test       1% Critical    5% Critical    10% Critical
        Statistic       Value          Value           Value
Z(t)     -1.190         -4.006         -3.437          -3.137

MacKinnon approximate p-value for Z(t) = 0.9125

Table 3.2. Augmented Dickey-Fuller test in first differences

. dfuller [Link], lags(0)

Dickey-Fuller test for unit root                Number of obs = 200

                 ---------- Interpolated Dickey-Fuller ----------
          Test       1% Critical    5% Critical    10% Critical
        Statistic       Value          Value           Value
Z(t)    -14.991         -3.477         -2.883          -2.573

MacKinnon approximate p-value for Z(t) = 0.0000


Table 3.3. Least squares estimation results


. reg PCECARS PCDPY CPINEW POP

Source   |         SS    df          MS      Number of obs =     202
---------+-----------------------------      F(3, 198)     = 4105.70
Model    | 184440.469     3  61480.1563      Prob > F      =  0.0000
Residual | 2964.92259   198  14.9743565      R-squared     =  0.9842
---------+-----------------------------      Adj R-squared =  0.9839
Total    | 187405.392   201  932.365132      Root MSE      =  3.8697

PCECARS |     Coef.  Std. Err.       t  P>|t|   [95% Conf. Interval]
--------+-----------------------------------------------------------
PCDPY   |   23.6611   1.256354   18.83  0.000   21.18355   26.13865
CPINEW  | -2.018039   .1091943  -18.48  0.000  -2.233372  -1.802706
POP     |  4.422575   .2365113   18.70  0.000   3.956171    4.88898
_cons   | -967.5567   38.06453  -25.42  0.000  -1042.621  -892.4928

Table 3.4. HAC (Newey-West) estimation results


. newey PCECARS PCDPY CPINEW POP, lag(1)

Regression with Newey-West standard errors      Number of obs =     202
maximum lag: 1                                  F(3, 198)     = 3910.06
                                                Prob > F      =  0.0000

        |            Newey-West
PCECARS |     Coef.  Std. Err.       t  P>|t|   [95% Conf. Interval]
--------+-----------------------------------------------------------
PCDPY   |   23.6611   1.487126   15.91  0.000   20.72846   26.59373
CPINEW  | -2.018039   .1315501  -15.34  0.000  -2.277458   -1.75862
POP     |  4.422575   .2806464   15.76  0.000   3.869136   4.976015
_cons   | -967.5567   44.30198  -21.84  0.000  -1054.921  -880.1924

Answers:
a) The null of a unit root is not rejected for the test in levels (Table 3.1). The null of a unit
root is rejected for the test in first differences (Table 3.2): PCECARS is not stationary,
D(PCECARS) is stationary. So, PCECARS is I(1). (2 x 0.5 point for stationarity and 1
point for the order of integration.)
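
The decision rule behind this answer can be sketched in Python as an illustration. The Dickey-Fuller test is left-tailed: the unit-root null is rejected when the test statistic lies below the critical value. The function name is hypothetical; the numbers are the 5% critical values from Tables 3.1 and 3.2.

```python
def rejects_unit_root(test_stat: float, critical_value: float) -> bool:
    """Left-tailed Dickey-Fuller decision: reject H0 (unit root) if stat < critical value."""
    return test_stat < critical_value


# Table 3.1: PCECARS in levels (with trend), 5% critical value.
stationary_in_levels = rejects_unit_root(-1.190, -3.437)

# Table 3.2: first difference of PCECARS, 5% critical value.
stationary_in_differences = rejects_unit_root(-14.991, -2.883)

# Non-stationary in levels but stationary after differencing once: I(1).
order_of_integration = 0 if stationary_in_levels else (1 if stationary_in_differences else None)

print(stationary_in_levels, stationary_in_differences, order_of_integration)
```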

b) Yes. Cointegration indicates that there exists a long-run relationship between the variables,
so a regression in levels is allowed. It can also serve as the first stage of an error correction
model (ECM). (1 point)

c) Comparing Table 3.3 with Table 3.4 shows that the HAC standard errors are bigger than the
conventional least squares standard errors: LS overstates precision. (2 points)
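
The comparison can be made explicit with a short Python sketch, given as an illustration. It takes the standard errors from Tables 3.3 and 3.4 and computes the ratio of the Newey-West standard error to the conventional one for each coefficient; the dictionary names are hypothetical.

```python
# Conventional least squares standard errors (Table 3.3).
ols_se = {"PCDPY": 1.256354, "CPINEW": 0.1091943, "POP": 0.2365113, "_cons": 38.06453}

# Newey-West (HAC) standard errors (Table 3.4).
hac_se = {"PCDPY": 1.487126, "CPINEW": 0.1315501, "POP": 0.2806464, "_cons": 44.30198}

# A ratio above 1 means the conventional standard error is too small.
ratios = {name: hac_se[name] / ols_se[name] for name in ols_se}
ls_overstates_precision = all(ratio > 1 for ratio in ratios.values())

for name, ratio in ratios.items():
    print(name, round(ratio, 2))
print(ls_overstates_precision)
```

The coefficient estimates are identical in the two tables; only the standard errors, and hence the t-statistics and confidence intervals, change.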


Name: _________________________________________

Midterm Exam
Student Number: _____________

Final Exam Answers

Econometrics for Business Economics, Economics and International Economics and Business 2015-2016
Instructions:
1. Please answer all 20 Multiple Choice Questions in Part I by selecting the most
appropriate answer. Use the computer sheet provided and follow the instructions
provided on the computer sheet.
2. Answer all three Open Questions in this exam booklet. You can earn 15 points for the
open questions. Each open question is worth 5 points.
3. Please refer to the computer output to answer the Open Questions.
4. The grade is an equally weighted average of the grades of Parts I and II:
- Your score for Part I is 1 + 9 × (C − 5)/(20 − 5), where C is the number of correctly
answered multiple-choice questions.
- Your score for Part II is 1 + 9 × D/15, where D is the number of points earned.
5. You are required to submit all materials after completing this examination.
6. You are not allowed to use a graphical calculator but only a single line calculator. The
types allowed are Casio fx-82ES (PLUS) or the Casio fx-82MS as in Mathematics and
Data Analysis from your first year.
7. Please do not write on the tables and formula sheets as we would like to re-use
them (save trees!). The tables may not include critical statistics for all degrees of
freedom. Choose the closest degree of freedom where it applies.
8. You are not allowed to visit the toilet during this exam.

Score on Open Question 1 | Score on Open Question 2 | Score on Open Question 3 | Total Score Open Questions


Part I Multiple Choice Questions

1) Consider the following regression model yi = β1 + β2xi + ei. Let b1 and b2 denote the least
squares estimators of β1 and β2, respectively. Assume that b1 and b2 both are normally
distributed. Let the linear combination of the true parameters be γ = aβ1 + cβ2. What
is the probability distribution of γ̂ = a b1 + c b2, where a and c are some known
constants?
a) It follows a standard normal distribution with mean 0 and unit variance.
b) It follows a t-distribution with N − 2 degrees of freedom, where N is the number of
observations.
c) It follows a normal distribution with mean 0 and variance σ², where σ² is the variance
of the error.
d) It follows a normal distribution with mean γ and variance var(γ̂).

Answer: d)

2) Use the information from Question 1. Suppose we would like to test the null hypothesis
H0: γ = 2 against the alternative hypothesis H1: γ ≠ 2, where γ = aβ1 + cβ2, and
the number of observations N = 25. Which of the following is the correct test statistic?
a) (γ̂ − 2) / √(V̂ar(γ̂)) ~ N(0,1).
b) (γ̂ − 2) / √(σ̂²) ~ N(0,1); where σ̂² is the estimator of the variance of the error.
c) (γ̂ − 2) / √(V̂ar(γ̂)) ~ t(25).
d) None of the above.

Answer: d)

Questions 3) to 5) use the following estimation output:

Suppose someone is interested in the relationship between variable y and variables x1 and x2,
and proceeds to estimate the following multiple regression model:
yi = β1 + β2xi1 + β3xi2 + ei
The estimates, produced via the least squares estimation method, are presented as follows:

yi = 0.022 + 0.383xi1 + 0.613xi2
    (0.01)   (0.152)    (0.235)

The sample size is N = 32. The standard errors of the estimated parameters are reported in the
brackets.


3) Suppose we are interested in testing the null hypothesis H0: β2 = 0.45 against the
alternative hypothesis H1: β2 ≠ 0.45 at the 5% significance level. Which one of the
following is the correct test result?
a) Reject H0 as the computed t-statistic is less than t(0.025,29) = −2.045.
b) Reject H0 as the computed t-statistic is less than t(0.025,32) = −2.037.
c) Do not reject H0 as the computed t-statistic is between t(0.025,29) = −2.045 and
t(0.975,29) = 2.045.
d) Do not reject H0 as the computed t-statistic is between t(0.025,32) = −2.037 and
t(0.975,32) = 2.037.

Answer: c)
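
A minimal Python sketch of this test, for illustration: the t-statistic is (b2 − 0.45)/se(b2) with N − K = 32 − 3 = 29 degrees of freedom, and the two-sided test at the 5% level uses the critical values ±2.045 given in the answer options. The variable names are hypothetical.

```python
b2, se_b2 = 0.383, 0.152   # estimate and standard error of beta2
beta2_null = 0.45          # hypothesised value under H0
n_obs, n_params = 32, 3

degrees_of_freedom = n_obs - n_params
t_crit = 2.045             # t(0.975, 29), from the answer options

t_stat = (b2 - beta2_null) / se_b2
reject_h0 = abs(t_stat) > t_crit

print(degrees_of_freedom, round(t_stat, 3), reject_h0)
```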

4) Let t denote the computed test statistic for β2, and let t(k) denote a t-distributed random
variable with k degrees of freedom. Which of the following is the correct way to compute
the p-value for the hypothesis stated in Question 3)?
a) 1 − P[t(k) ≥ t].
b) P[t ≥ t(k)] + P[t ≤ −t(k)].
c) P[t(k) ≥ t].
d) None of the above.

Answer: b)

5) Suppose we would like to test if the expected value of y given x1 = 50 and x2 = 20 is less
than 40. Which of the following statements is correct?
a) The null hypothesis is H0: β1 + 50β2 + 20β3 − 40 = 0.
b) The null hypothesis is H0: 50β2 + 20β3 − 40 = 0.
c) The null hypothesis is H0: 50β2 + 20β3 − 40 < 0.
d) None of the above.

Answer: a)

6) In the presence of heteroskedasticity,


a) the least squares estimator is biased.
b) the least squares estimator has the smallest variance.
c) the conventional standard errors are correct.
d) None of the above.
Answer: d)

7) In the presence of heteroskedasticity, the GLS estimator


a) affects the coefficient estimates but not the standard errors.
b) arrives at the best linear unbiased estimator by transforming the model into one with
homoskedastic errors when the correct transformation is used.
c) All of the above.
d) None of the above.

Answer: b)


8) In the presence of heteroskedasticity,


a) the Gauss–Markov theorem does not apply, meaning that the least squares estimator is
not the Best Linear Unbiased Estimator (BLUE).
b) the variance of the least squares estimator is not the lowest of all other linear unbiased
estimators.
c) All of the above.
d) None of the above.

Answer: c)

9) Which of the following is the correct definition of collinearity?

a) The variance of the errors is not constant.
b) The mean of the errors is non-zero.
c) There exists a linear relationship between the explanatory variables.
d) One of the relevant explanatory variables is not included in the model, causing an
estimation bias.

Answer: c)

10) The following graph plots the value of x against the absolute value of residuals.

a) The graph suggests the presence of heteroskedasticity because the residuals are
positive.
b) The graph suggests the presence of homoskedasticity since the variance of the errors
is constant across the sample.
c) The graph suggests the presence of heteroskedasticity because of the trend in x.
d) None of the above.

Answer: b)


11) What would be the consequences for the least squares estimator if serial correlation is
present in a regression model but ignored?
a) It will be biased.
b) It will be inconsistent.
c) It will have the wrong standard error.
d) All of the above.

Answer: c)

12) Suppose that you wish to test for autocorrelation using an approach based on an auxiliary
regression. Which one of the following auxiliary regressions would be most appropriate?
a) et² = α1 + α2xt + ρet−1 + vt.
b) et² = α1 + α2xt1 + α3xt2 + α4xt1xt2 + α5xt1² + α6xt2² + vt.
c) et = α1 + α2xt + ρet−1 + vt.
d) et = α1 + α2xt1 + α3xt2 + α4xt1xt2 + α5xt1² + α6xt2² + vt.

Answer: c)

13) An incorrect and possibly spurious regression can be identified by the following rule-of-
thumb:
a) An R2 around 0 and a Durbin-Watson statistic around 2.
b) An R2 around 1 and a Durbin-Watson statistic around 2.
c) An R2 around 1 and a Durbin-Watson statistic around 0.
d) An R2 around 0 and a Durbin-Watson statistic around 0.

Answer: c)
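
The rule of thumb in question 13 relies on the Durbin-Watson statistic, DW = Σ(e_t - e_{t-1})² / Σ e_t², which is close to 2 for uncorrelated residuals and close to 0 under strong positive autocorrelation. A minimal Python sketch follows; the residual series are made up for illustration and do not come from the exam.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: squared first differences over squared residuals."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

# Hypothetical residual series, for illustration only.
persistent = [1.0, 1.0, 1.0, 1.0]      # residuals never move: extreme positive autocorrelation
alternating = [1.0, -1.0, 1.0, -1.0]   # residuals flip sign each period: negative autocorrelation

dw_persistent = durbin_watson(persistent)    # numerator is zero
dw_alternating = durbin_watson(alternating)  # well above 2
```

A value near 0 combined with a very high R2 is the warning sign the question refers to.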

14) A random walk series $y_t = y_{t-1} + v_t$, where $v_t \sim N(0,1)$,


a) is stationary.
b) is non-stationary.
c) has a linear deterministic trend.
d) is integrated of order zero.

Answer: b)

15) The series $y_t = 1 + 2t + v_t$ with $t = 1, 2, \dots, 100$ and $v_t \sim N(0,1)$


a) has a positive linear trend.
b) has no trend.
c) has a negative linear trend.
d) has no linear trend.

Answer: a)


Use the following information to answer the following five questions:

You estimate a model in which the dependent variable is LN(PRICE/1000), and where:
PRICE is the selling price of the home in dollars,
BEDS and BATHS are the number of bedrooms and bathrooms, respectively,
AGE is the age of the house in years at the time of the sale,
POOL is a dummy variable that is 1 if the house has a pool and 0 otherwise,
LGELOT (large lot) is a dummy variable that is 1 if the house is on a lot of land that is larger
than 0.5 acres (large lot) and 0 otherwise (regular lot).
LIVAREA is the living area of the home (in hundreds of square feet),
LGELOT x LIVAREA is the interaction term between LGELOT and LIVAREA.

Two models were estimated and the estimation results with standard errors in brackets are
given below.
Variable              Model 1      Model 2
LIVAREA               0.0539       0.0589
                      (0.0017)     (0.0019)
BEDS                  -0.0382      -0.0480
                      (0.0114)     (0.0113)
BATHS                 -0.0103      -0.0201
                      (0.0165)     (0.0164)
LGELOT                0.2531       0.6134
                      (0.0255)     (0.0632)
AGE                   -0.0013      -0.0016
                      (0.0005)     (0.0005)
POOL                  0.0787       0.0853
                      (0.0231)     (0.0228)
LGELOT x LIVAREA                   -0.0161
                                   (0.0026)
INTERCEPT             3.986        3.9649
                      (0.0373)     (0.0370)

16) Based on Model 1, the following statement is approximately correct:


a) The difference between the expected selling price of a house with a pool and without
a pool is $78.7.
b) The difference between the expected selling price of a house with a pool and without
a pool is $108.18.
c) The percentage difference between the expected selling price of a house with a pool
and without a pool is 7.87%.
d) The percentage difference between the expected selling price of a house with a pool
and without a pool is 8.53%.

Answer c)
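
Because the dependent variable is in logs, the POOL coefficient is only approximately a percentage difference. The Python sketch below compares the usual approximation 100·b with the exact figure 100·(exp(b) - 1), using the Model 1 coefficient from the table; the function names are illustrative.

```python
import math

def approx_pct_effect(coef):
    """Approximate percentage effect of a dummy in a log-linear model."""
    return 100.0 * coef

def exact_pct_effect(coef):
    """Exact percentage effect implied by a log-linear model."""
    return 100.0 * (math.exp(coef) - 1.0)

pool_coef = 0.0787  # POOL coefficient in Model 1

approx = approx_pct_effect(pool_coef)  # the 7.87% of answer c)
exact = exact_pct_effect(pool_coef)    # slightly larger than the approximation
```

For a coefficient this small the two figures are close, which is why the question asks only for an approximately correct statement.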


17) Based on Model 1, the following statement is approximately correct:


a) Increasing the living area of a house by 100 square feet increases the expected selling
price of the house by 0.0539%.
b) Increasing the living area of a house by 100 square feet increases the expected selling
price of the house by 5.39%.
c) Increasing the living area of a house by 100 square feet increases the expected selling
price of the house by $5.39.
d) Increasing the living area of a house by 100 square feet increases the expected selling
price of the house by $53.90.

Answer b)

18) Based on Model 2, the following statement is most accurate:


a) The interaction term between living area and large lot implies that the effect of living
area on expected selling price depends on whether or not the house is on a large lot.
b) The interaction term between living area and large lot can be included as long as there
is no collinearity in the model.
c) The interpretation of the interaction term between living area and lot size depends on
the magnitude of the logarithm of expected selling price.
d) The interaction term between living area and large lot allows for a non-linear effect of
living area on expected selling price.

Answer a)

19) Based on Model 2, the following statement is approximately correct:


a) The effect of an increase in living area of 100 square ft on the expected selling price
of a house is 5.39%.
b) The effect of an increase in the living area of 100 square ft on the expected selling
price of a house on a large lot is 5.89%.
c) The effect of an increase in the living area of 100 square ft on the expected selling
price of a house on a regular lot is 5.89%.
d) The effect of an increase in the living area of 100 square ft on the expected selling
price of a house on a large lot is 5.97%.

Answer c)
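
With the interaction term, the marginal effect of LIVAREA in Model 2 is b_LIVAREA + b_interaction × LGELOT. A short Python sketch under that reading, using the Model 2 coefficients from the table (the function name is an assumption made for illustration):

```python
def livarea_effect_pct(lgelot, b_livarea=0.0589, b_interaction=-0.0161):
    """Approximate % change in price per extra 100 sq ft, by lot type (Model 2)."""
    return 100.0 * (b_livarea + b_interaction * lgelot)

regular_lot = livarea_effect_pct(0)  # interaction term drops out
large_lot = livarea_effect_pct(1)    # interaction coefficient is added
```

The effect is 5.89% on a regular lot and about 4.28% on a large lot, so options b) and d) do not match the estimates.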

20) You are asked to test the joint significance of all the qualitative factors in Model 1 using
an F test. The number of restrictions is
a) 2.
b) 1.
c) 3.
d) Depends on the size of the sample.

Answer a)


Part II Open Questions

Question 1.
Consider the following model for selling price of houses:
PRICE = β1 + β2 TRADITIONAL + β3 FIREPLACE + β4 TRADITIONAL x FIREPLACE + e,
where PRICE is in US dollars, TRADITIONAL is an indicator variable indicating whether
the house is of traditional style (TRADITIONAL = 1) or not (TRADITIONAL = 0), and
FIREPLACE is an indicator variable indicating whether the house has a fireplace
(FIREPLACE = 1) or not (FIREPLACE = 0) .
Use the estimation output in Table 1.1 to answer this question.

(a) What does the constant term of 109415.2 mean in the regression output? (1 point)
It means that the average selling price of a non-traditional style house without fireplace is
109415.2 dollars.
(1 point for the correct answer.)

(b) A real estate agent who sells non-traditional houses claims that installing a fireplace
will increase the selling price while you think it will make no difference. Test the
claim of the real estate agent at a 5% significance level giving all steps. Indicate the
degree of freedom of this test. (2 points)
H0: β3 = 0 vs. H1: β3 > 0.
We can read the t-statistic from the output or use (b3 - 0)/se(b3) to get a t-statistic of 8.81. The
degrees of freedom are 1080 - 4 = 1076.
The critical t-statistic is tc = t(0.95, 1076) = 1.645. Since t > tc, we reject H0 and find support for
the claim of the real estate agent that installing a fireplace increases the selling price.
(0.5 point for the correct hypothesis, 0.5 point for correct critical t-statistic, 1 point for right
conclusion.)

(c) Calculate the expected average price of traditional style houses in the sample with a
fireplace and without a fireplace. Formulate a null hypothesis and an alternative
hypothesis to test if the price of a traditional style house with a fireplace is higher.
Write down the expression for the test statistic. You do not need to calculate the test
statistic. (2 points)
The expected average price of the traditional style houses with a fireplace is
b1 + b2 + b3 + b4 = 172,332.6 dollars and the expected average price of the traditional style
houses without a fireplace is b1 + b2 = 112,976.6 dollars.
To test if the price of a traditional style house with a fireplace is higher: H0: β3 + β4 = 0 vs. H1: β3 + β4 > 0,
we calculate the test statistic as: t = (b3 + b4) / [var(b3) + var(b4) + 2 x cov(b3, b4)]^0.5
(0.5 point for expected average price with fireplace, 0.5 point for expected average price
without a fireplace, 0.5 for correct hypotheses, 0.5 for correct test statistic.)
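
The test statistic for a sum of two coefficients needs the covariance between the estimates as well as their variances. The Python sketch below computes the fireplace premium for traditional houses from the two averages in the answer. It then evaluates the t-statistic with hypothetical values for the individual coefficients, variances and covariance, since Table 1.1 is not reproduced here.

```python
import math

def t_stat_sum(b3, b4, var_b3, var_b4, cov_b3_b4):
    """t-statistic for H0: beta3 + beta4 = 0."""
    se_sum = math.sqrt(var_b3 + var_b4 + 2.0 * cov_b3_b4)
    return (b3 + b4) / se_sum

# From the answer above: b3 + b4 is the difference between the two averages.
premium = 172332.6 - 112976.6

# Hypothetical inputs (NOT from Table 1.1), chosen so that b3 + b4 equals the premium.
b3, b4 = 40000.0, 19356.0
var_b3, var_b4, cov_b3_b4 = 2.0e7, 5.0e7, -1.5e7

t = t_stat_sum(b3, b4, var_b3, var_b4, cov_b3_b4)
```

A negative covariance, as assumed here, makes the standard error of the sum smaller than it would be if the covariance term were ignored.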


Question 2.
Using state level data from the United States, you estimate a model in which the dependent
variable is the percentage of votes for a certain candidate A (VOTESA) as a function of
percentage of votes for president (PRTYSTRA), an indicator variable for whether A is a
democrat or not (DEMOCA = 1 for a democrat, and 0 otherwise), the logarithm of the
expenditures of the party of candidate A (LEXPENDA = LOG(EXPENDA)) and the (natural)
logarithm of the expenditures of the party of candidate B (the opposing candidate) denoted
LEXPENDB = LOG(EXPENDB). Use the estimation output in Table 2.1 to answer this
question.
(a) Interpret the estimated coefficient of DEMOCA. Is the effect of this variable
statistically different from zero at the 2% significance level? (1 point)
The interpretation of the coefficient of DEMOCA is that a democratic candidate gets 3.793
percentage points more of the vote than a non-democratic candidate, holding the other
variables constant. The value of the test statistic is t = 2.70. To be conservative, we take
the critical value of the t statistic at a 2% significance level to be t(0.99,50), which is 2.403.
Since 2.70 > 2.403, we can conclude that the effect of this variable is statistically different
from zero at the 2% significance level.
(0.5 point for interpretation, 0.5 point for statistical significance)

(b) You suspect heteroskedasticity and conduct a White test (see Table 2.2 in the
estimation output). State the null and alternative hypothesis of the White test. What is
your conclusion based on the output using a significance level of 1%? (2 points)
The White test for heteroskedasticity tests H0: variance of error term is constant
(homoskedasticity) vs. H1: variance of error term is not constant (heteroskedasticity).
Based on the p-value associated with the chi square statistic, we can reject the null hypothesis
of homoskedasticity.
(1 point for the hypotheses, and 1 point for the conclusion.)

(c) In the output file (Table 2.3) alternative estimation results are provided based on a
weighted least squares method. What problem is being addressed by this method and
what effect does this method have on the estimators? Compare the estimation results
in Table 2.3 with the ones from Table 2.1. (2 points)
Weighted least squares corrects for heteroskedasticity. If the weights are correctly specified,
the estimators are BLUE. The estimation results in Table 2.3 show that coefficient estimates
and standard errors are different from the ones in Table 2.1.
(1 point for explanation of method, 1 point for comparison of estimates and standard errors.)


Question 3.
In this question we use a dataset on US monthly data on individual expenditures on
purchasing new cars, and other variables measured in 1975–1991. The variables are as
follows:
PCECARS: Individual expenditures in purchasing new cars (billion USD).
POP: The US population (million people).
PCDPY: Average personal income (thousand USD).
CPINEW: Consumer price index for new cars.
(a) After estimating the regression equation
PCECARSt = β1 + β2 PCDPYt + β3 CPINEWt + β4 POPt + et
(the regression results are in Table 3.1), you run the Breusch-Godfrey Serial Correlation
LM Test (the results are in Table 3.2). How many degrees of freedom does the χ² test
have? What is the null hypothesis of the Breusch-Godfrey Serial Correlation LM Test? (1
point)
Table 3.2 tests for first-order serial correlation: The χ²-test has one degree of freedom. The
null hypothesis is no serial correlation.
(0.5 point for correct degrees of freedom, 0.5 point for correct statement of null hypothesis.)
(b) Use a 5% significance level. What is your conclusion from Table 3.2? Is there serial
correlation? If yes, what are the implications? (2 points)

The critical value of the chi squared distribution with one degree of freedom (95th percentile)
is 3.841. Since the test statistic 130.615 is larger than 3.841, the null of no serial correlation is
clearly rejected. This can also be rejected on the basis of the p-value (0.00 < 0.05).
This implies that standard errors and test statistics are not correct, and most likely least
squares overstates precision.
(1 point for conclusion, 1 point for implications.)
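
The decision in part (b) can be checked numerically. For one degree of freedom, a chi-squared variable is the square of a standard normal, so P(X > x) = erfc(sqrt(x/2)). The Python sketch below uses only that identity and the test statistic quoted in the answer; the function name is illustrative.

```python
import math

def chi2_sf_df1(x):
    """Upper-tail probability of a chi-squared distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

p_at_critical = chi2_sf_df1(3.841)  # roughly 0.05, confirming the 5% critical value
p_value = chi2_sf_df1(130.615)      # p-value of the Breusch-Godfrey statistic
reject_null = p_value < 0.05
```

The p-value is so small that it is reported as 0.00 in the output, in line with the conclusion above.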
(c) Tables 3.3 and 3.4 show the results of Augmented Dickey-Fuller tests on PCDPY and
D(PCDPY). Is PCDPY stationary? Is D(PCDPY) stationary? Is PCDPY an I(0) series or an
I(1) series? (1 point)
The null of a unit root is not rejected for the test in levels (Table 3.3). The null of a unit root
is rejected for the test in first differences (Table 3.4): PCDPY is not stationary, D(PCDPY) is
stationary. So, PCDPY is I(1).
(0.5 point for stationarity of PCDPY and D(PCDPY), 0.5 point for the order of integration.)

(d) Suppose that the results of the Augmented Dickey-Fuller tests for all other variables in
the model in part (a) are the same as for PCDPY in part (c). What does this imply for the
specification of the model in part (a)? Is it allowed to specify the model in levels? (1 point)
The ADF tests suggest that the model should be specified in first differences, or in levels if
all I(1) variables are cointegrated.
(0.5 point for first differenced form, and 0.5 point for referring to cointegration.)
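
To illustrate why first differencing helps with I(1) variables, the Python sketch below builds a random walk from a made-up list of shocks and shows that the first difference returns exactly those shocks. The numbers are hypothetical and chosen as integers so the result is exact.

```python
def first_difference(series):
    """Return the series of changes y_t - y_{t-1}."""
    return [series[t] - series[t - 1] for t in range(1, len(series))]

# Hypothetical shocks v_t, for illustration only.
shocks = [1, -2, 3, 0, -1]

# Random walk y_t = y_{t-1} + v_t, starting from y_0 = 0.
y = [0]
for v in shocks:
    y.append(y[-1] + v)

dy = first_difference(y)  # recovers the shocks
```

The level series accumulates every past shock, while the differenced series contains only the current one, which is why D(PCDPY) can be stationary when PCDPY is not.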


Number on the list:

Name:

Student Number: Econ/IE&B/BE:

Final Exam
Econometrics for Business Economics [EBB061A05],
Economics [EBB814A05] and International Economics and
Business [EBB070A05] 2018-2019

Instructions:

1. Please answer all 20 Multiple Choice Questions in Part I by selecting the most appro-
priate answer. Use the computer sheet provided and follow the instructions provided
on the computer sheet.
2. Answer all three Open Questions in this exam booklet. You can earn 20 points for the
open questions.
3. Please refer to the computer output to answer the Open Questions.
4. You are required to submit all materials after completing this examination.
5. You are not allowed to use a graphical calculator but only a single line calculator. The
types allowed are Casio fx-82ES (PLUS) or the Casio fx-82MS as in Mathematics and
Data Analysis from your first year.
6. Please do not write on the tables and formula sheets as we would like to re-use them.
The tables may not include critical statistics for all degrees of freedom. Choose the
most appropriate degrees of freedom.
7. You are not allowed to visit the toilet during this exam.

Score on Open    Score on Open    Score on Open    Total Score on Open
Question 1       Question 2       Question 3       Questions


Part I Multiple Choice Questions


[Relevant for Questions 1-3]: Consider the following model studying whether campaign
expenditures affect election outcomes:

voteA = β0 + β1 lexpendA + β2 lexpendB + β3 prtystrA + u

The corresponding regression output is given by:

$\widehat{voteA} = 45.07 + 6.08\, lexpendA - 6.61\, lexpendB + 0.15\, prtystrA$

where voteA is the percentage of the vote received by Candidate A, lexpendA and lexpendB
are the logarithms of campaign expenditures by Candidates A and B, and prtystrA is a measure
of the party strength for Candidate A.

Question 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 point
The interpretation of the coefficient for β1 is as follows:
A. ΔvoteA ≈ (β1/100)(%ΔexpendA)
B. %ΔvoteA ≈ (β1/100)(%ΔexpendA)
C. ΔvoteA ≈ (β1 · 100)(%ΔexpendA)
D. ΔvoteA ≈ (β1/100)(ΔexpendA)

Question 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 point
The estimates imply the following:
A. A 10 unit ceteris paribus increase in spending by candidate A increases the
predicted share of the vote going to A by about 0.608 percentage points.
B. A 1% ceteris paribus increase in spending by candidate A increases the pre-
dicted share of the vote going to A by about 6.08 percentage points.
C. A 10% ceteris paribus increase in spending by candidate A increases the pre-
dicted share of the vote going to A by about 0.608 percent.
D. A 10% ceteris paribus increase in spending by candidate A increases the
predicted share of the vote going to A by about 0.608 percentage points.
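
In a level-log model the change in the dependent variable is approximately (β1/100) × %Δx. The Python sketch below applies this to the estimated coefficient 6.08 and compares it with the exact change β1 × ln(1 + %Δx/100); the function names are illustrative.

```python
import math

def approx_change(beta, pct_change_x):
    """Approximate change in y (in units of y) for a given % change in x."""
    return (beta / 100.0) * pct_change_x

def exact_change(beta, pct_change_x):
    """Exact change in y implied by y = ... + beta * log(x)."""
    return beta * math.log(1.0 + pct_change_x / 100.0)

beta_lexpendA = 6.08  # estimated coefficient on lexpendA

approx = approx_change(beta_lexpendA, 10.0)  # about 0.608 percentage points
exact = exact_change(beta_lexpendA, 10.0)    # slightly smaller
```

Since voteA is itself a percentage, the change is measured in percentage points of the vote share, not in percent.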

Question 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 point
Suppose we want to test the null hypothesis H0 : β1 + β2 = 0. Which of the following
statements is true?
A. An equivalent null hypothesis to the one given above would be H0 : β1 = β2 .
B. If the null is true, then a z% increase in expenditure by A and a z% increase in
expenditure by B leaves voteA unchanged.
C. We would need the standard error of β̂1 + β̂2 to test the hypothesis.
D. All of the above.


Question 4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 point
A researcher estimates the following regression on a sample of students:

$\widehat{GPA}_i = 6.5 + 0.1\, female_i + 0.01\, Groningen_i - 0.03\, female_i \times Groningen_i$,

where GPAi is the grade point average, femalei is a dummy for female students and
Groningeni is a dummy for students living in Groningen. The average difference be-
tween the GPA of female students living in Groningen and male students living in
Groningen is
A. 0.07.
B. 0.01.
C. 0.03.
D. 0.02.
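
The difference asked for in question 4 can be checked by evaluating the fitted equation for the two groups. The Python sketch below assumes the interaction coefficient enters with a negative sign (-0.03), which is the reading that matches one of the answer options; the function name is illustrative.

```python
def predicted_gpa(female, groningen, b0=6.5, b_female=0.1,
                  b_groningen=0.01, b_interaction=-0.03):
    """Fitted GPA; the negative interaction sign is an assumption."""
    return (b0 + b_female * female + b_groningen * groningen
            + b_interaction * female * groningen)

# Female vs. male students, both living in Groningen.
gap = predicted_gpa(1, 1) - predicted_gpa(0, 1)
```

The constant and the Groningen coefficient cancel, so the gap equals the female coefficient plus the interaction coefficient.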

Question 5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 point
A researcher is interested in estimating the impact of study time on passing rates and
estimates the following linear probability model

$\widehat{pass}_i = 0.6 + 0.02\, studytime_i$,

where passi is a dummy variable, which takes a value of 1 if the student passes the
exam, and zero otherwise. Studytime is the number of hours spent studying per week.
Which of the following statements is correct?
A. An extra hour of studying per week is associated with a 2 percent increase in
the probability of passing the course.
B. An extra hour of studying per week is associated with a 0.02 percentage point
increase in the probability of passing the course.
C. An extra hour of studying per week is associated with a 0.02 percent increase
in the probability of passing the course.
D. An extra hour of studying per week is associated with a 2 percentage point
increase in the probability of passing the course.
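
The distinction between percent and percentage points in question 5 can be made concrete with the fitted linear probability model. In the Python sketch below, the study times of 10, 11 and 25 hours are arbitrary illustrative values.

```python
def pass_probability(studytime, b0=0.6, b1=0.02):
    """Fitted probability of passing from the linear probability model."""
    return b0 + b1 * studytime

delta = pass_probability(11) - pass_probability(10)

# Change in percentage points: the same at every level of study time.
delta_percentage_points = 100.0 * delta

# Relative change in percent: depends on the starting probability.
relative_change_pct = 100.0 * delta / pass_probability(10)

# A known limitation: fitted "probabilities" can exceed 1.
out_of_range = pass_probability(25)
```

The coefficient gives the change in percentage points; the relative change in percent varies with the baseline and is not what the coefficient measures.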

Question 6 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 point
What is not a potential source of endogeneity?
A. Measurement error in a dependent variable.
B. Reverse causality.
C. Unobserved heterogeneity.
D. Selection bias.

Question 7 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
What is the consequence of having measurement error in one of the independent vari-
ables?
A. Attenuation bias.
B. Reverse causality.
C. Model misspecification.
D. Overly precise standard errors.


Question 8 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 point
The problem of unobserved heterogeneity stems from
A. having a small sample with too little variation.
B. having independent variables that affect the outcome variable that are not
observable by the researcher.
C. having heterogeneous variances of the error terms.
D. having heterogeneous coefficients in the linear regression model.
Question 9 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 point
A researcher is interested in the effect of having breakfast on the test day (breakfast)
on students' exam scores. Therefore the researcher decides to run a randomized experiment,
and randomly assigns students into two groups: treatment group and control
group. The researcher asks students in the treatment group to have breakfast on the
exam day, and the researcher also asks students in the control group not to have breakfast
on the exam day. When the researcher collects data on the randomized experiment,
she realizes that some students in the control group actually had breakfast. When the
researcher compares the outcomes of students who had breakfast to those who did not
have breakfast,
A. the estimate captures the causal effect of having breakfast on the exam results.
B. the estimate is likely to suffer from selection bias.
C. the estimate suffers from attenuation bias stemming from measurement error.
D. the estimate will be close to zero, since students were randomly sorted into
treatment and control groups.
Question 10 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 point
In the distributed lag model, the coefficient on the contemporaneous value of the re-
gressor is called the
A. impact propensity.
B. dynamic propensity.
C. cumulative propensity.
D. autoregressive propensity.
Question 11 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 point
The long-run propensity
A. is the coefficient on $X_{t-r}$ in the standard formulation of the distributed lag
model.
B. is the sum of all individual propensities.
C. is the difference between the coefficient on $X_{t-1}$ and $X_{t-r}$.
D. is the product of all individual propensities.
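
Questions 10 and 11 concern the impact propensity and the long-run propensity of a distributed lag model y_t = a + d0·x_t + d1·x_{t-1} + ... + dr·x_{t-r} + u_t. A minimal Python sketch with hypothetical lag coefficients (not taken from the exam):

```python
def impact_propensity(lag_coefs):
    """Coefficient on the contemporaneous regressor x_t."""
    return lag_coefs[0]

def long_run_propensity(lag_coefs):
    """Sum of the coefficients on x_t, x_{t-1}, ..., x_{t-r}."""
    return sum(lag_coefs)

# Hypothetical coefficients d0, d1, d2, for illustration only.
lag_coefs = [0.5, 0.3, 0.1]

impact = impact_propensity(lag_coefs)
long_run = long_run_propensity(lag_coefs)
```

The long-run propensity measures the eventual change in y after a permanent one-unit increase in x.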
Question 12 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 point
$\mathrm{cov}(u_{it}, u_{is} \mid X_{it}, X_{is}) = 0$ for periods $t \neq s$ means that
A. there is no cross-correlation between units.
B. conditional on the errors, the regressors are uncorrelated over time.
C. there is no perfect multicollinearity in the errors.
D. conditional on the regressors, the errors are uncorrelated over time.


Question 13 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 point
In the Fixed Effects regression model excluding the intercept, using (n - 1) binary
firm-indicator variables for a sample of n firms, the coefficient of the binary variable for firm
i indicates
A. the difference in fixed effects between the i-th and the omitted firm.
B. the response in the dependent variable to a percentage change in the binary
variable.
C. will be either 0 or 1.
D. the level of the fixed effect of the i-th firm.

Question 14 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 point
The most important advantage of using panel data over cross sectional data on firms is
that it
A. allows you to analyze behaviour across time but not across firms.
B. allows you to control for some types of observable variables that are constant
over time.
C. allows you to study long-run trends.
D. allows you to control for some types of omitted variables without actually
observing them.

Question 15 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 point
Sample selection bias could occur
A. If the availability of the data is influenced by a selection process that is related
to the value of the independent variables.
B. If the choice between two samples is made by the researcher.
C. Because of the fact that we do not observe the entire population.
D. If the availability of the data is influenced by a selection process that is
related to the value of the dependent variable.

Question 16 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 point
The difference between an unbalanced and a balanced panel is that
A. the impact of different regressors are roughly the same for balanced but not
for unbalanced panels.
B. the magnitude of the intercept is meaningful only in balanced panels but not
in unbalanced panels.
C. you cannot have both fixed time effects and fixed unit effects regressions.
D. an unbalanced panel contains missing observations for at least one time
period or one unit.


Question 17 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 point
Consider estimating the effect of beer tax on the traffic fatality rate using data from the
United States, using time and state fixed effects for the Northeast Region (Maine, Vermont,
New Hampshire, Massachusetts, Connecticut and Rhode Island) for the period
1991-2001. If Beer Tax was the only explanatory variable, how many coefficients would
you need to estimate, excluding the constant?
A. 7.
B. 16.
C. 17.
D. 18.
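
The count in question 17 can be reproduced with simple bookkeeping. The Python sketch below assumes the usual dummy-variable setup with a constant, in which one state dummy and one year dummy are dropped to avoid perfect collinearity; the function name is illustrative.

```python
def n_coefficients(n_units, n_periods, n_regressors):
    """Slope coefficients plus unit and time dummies, excluding the constant."""
    return n_regressors + (n_units - 1) + (n_periods - 1)

states = ["Maine", "Vermont", "New Hampshire",
          "Massachusetts", "Connecticut", "Rhode Island"]
years = range(1991, 2002)  # 1991 to 2001 inclusive

k = n_coefficients(len(states), len(years), n_regressors=1)
```

With 6 states and 11 years this gives 1 + 5 + 10 coefficients.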

Question 18 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 point
A static model is postulated when:
A. a change in the independent variable at time t is believed to have an effect on
the dependent variable at period t + 1.
B. a change in the lagged independent variable is believed to have an effect on
the dependent variable for time t .
C. a change in the independent variable at time t does not have any effect on the
dependent variable.
D. a change in the independent variable at time t is believed to have a
contemporaneous effect on the dependent variable.

Question 19 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 point
The sample size for a time series data set is the number of:
A. variables being measured.
B. time periods over which we observe the variables of interest less the number
of variables being measured.
C. time periods over which we observe the variables of interest plus the number
of variables being measured.
D. time periods over which we observe the variables of interest.

Question 20 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 point
A variable is not suitable to be an instrumental variable if:
A. it is correlated with the endogenous variable.
B. conditional on the endogenous variable, it is correlated with the outcome.
C. it is uncorrelated with the error term.
D. it does not have a direct effect on the outcome.


Part II Open Question [20 Points]

Reply to the sub-points of each question by using exclusively the space within boxes. The points
assigned to each sub-question are reported between the brackets.

Question 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Based on Census data from the United States the following model was estimated to
examine the influence of socio-economic variables on earnings of men, where wage is
monthly earnings, educ is years of schooling, exper is the overall years of experience
in the labor market, tenure is the years at the current employer, and married, black, south
and urban are all dummy variables defined in the usual way.

Figure 1: OLS estimates

(a) (2 points) Conduct a t test to test the null hypothesis that the coefficient of educ is
equal to 0.05 against the alternative that it is greater than 0.05 at a 1% significance
level. Explain your answer giving all steps.

Solution: The t-statistic for the one-sided test is
$t = (\hat{\beta}_k - 0.05)/se(\hat{\beta}_k) = (0.0654 - 0.05)/0.0063 = 2.44$.
The associated critical value is $t_{(0.01, \infty)} = 2.326$.
Since 2.44 > 2.326, we reject the null hypothesis: the coefficient on educ is
statistically larger than 0.05 at the 1% significance level.
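
The steps of this one-sided test can be sketched in Python using the estimate, standard error and critical value quoted in the solution; the function name is illustrative.

```python
def t_statistic(estimate, hypothesized, std_error):
    """t-statistic for H0: beta = hypothesized."""
    return (estimate - hypothesized) / std_error

t = t_statistic(estimate=0.0654, hypothesized=0.05, std_error=0.0063)

critical_value = 2.326  # 1% one-sided critical value, large-sample (normal) approximation
reject_null = t > critical_value
```

Because the alternative is one-sided (greater than 0.05), only large positive values of t lead to rejection.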


(b) (2 points) Which of the classical assumptions is likely to be violated in this equa-
tion? Argue why a ’proxy’ variable IQ which measures the IQ of the individual
would alleviate the problem. Is the coefficient of educ likely to increase or decline,
if IQ is added to the regression?

Solution: The zero conditional mean assumption is violated: education is correlated
with the error term.
Classical example: ability is an unobserved variable in the error term, which
is positively correlated with education.
If IQ is added as a proxy for ability, part of the ability effect is no longer attributed
to education, so the coefficient of educ is likely to decline.

(c) (4 points) Dr. Strangelove suggests to employ a 2SLS method. He proposes to use
the number of siblings as an instrument for education. Briefly explain the 2SLS
method, state the conditions a variable should satisfy to be an appropriate instru-
mental variable and argue whether they are met with the proposed instrument.

Solution: In 2SLS you use an (exogenous) instrument to predict the (endogenous)
independent variable x (first stage). The predicted value from this equation is
then used in the second stage to explain the dependent variable y. Conditions:

1. The instrument needs to be meaningful for explaining x, i.e., be correlated
with the endogenous x variable (relevance).

2. The instrument must be uncorrelated with any other determinant of the
dependent variable y (exogeneity).

Applied to the number of siblings: it is likely that the number of siblings is
significantly negatively associated with the years of own education (more siblings =
less education). However, exogeneity of the number of siblings is less
clear. Fertility is clearly endogenous and there are many factors that influence
the number of children in a family (socio-economic status, culture), which also
might be correlated with wages (via intergenerational transmission).
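
The two stages described in this solution can be sketched in Python for the simplest case of one endogenous regressor and one instrument. The toy data below (z for siblings, x for education, y for log wage) are entirely hypothetical; the sketch only shows that running the two stages by hand gives the same slope as the ratio cov(z, y)/cov(z, x).

```python
def mean(values):
    return sum(values) / len(values)

def cov(a, b):
    """Sample covariance (divisor n; the divisor cancels in the ratios below)."""
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)

def ols(y, x):
    """Intercept and slope of a simple regression of y on x."""
    slope = cov(x, y) / cov(x, x)
    return mean(y) - slope * mean(x), slope

# Hypothetical toy data, for illustration only.
z = [0, 1, 2, 3, 4, 5]                       # instrument (number of siblings)
x = [12.0, 11.5, 11.0, 9.5, 10.0, 8.0]       # endogenous regressor (education)
y = [6.90, 6.80, 6.85, 6.60, 6.70, 6.40]     # outcome (log wage)

# First stage: regress x on z and form fitted values.
a1, b1 = ols(x, z)
x_hat = [a1 + b1 * zi for zi in z]

# Second stage: regress y on the fitted values.
_, beta_2sls = ols(y, x_hat)

# Equivalent instrumental-variable ratio.
beta_iv = cov(z, y) / cov(z, x)
```

Note that standard errors from a manually run second stage are not valid; dedicated 2SLS routines correct them.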

(d) (1 point) Holding other factors fixed, what is the approximate difference in monthly
salary between blacks and nonblacks? Is the difference statistically significant?

Solution: The coefficient on black implies that, at given levels of the other explanatory
variables, black men earn about 18.8% less than nonblack men. The
t statistic is about -4.95, so the difference is highly statistically significant.

(e) (1 point) Describe R2 , and explain what it means in the context of this question.

Solution: R-squared is a measure of how well the model can account for the
variation of the dependent variable. A value of 0.253 means that 25.3% of the
variation of log(wage) around its mean can be explained by the model.


Question 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Consider the following extension to the model outlined in Question 1, where exper2
and tenure2 are the squares of exper and tenure, respectively.

Figure 2: OLS estimates on log wages with squared terms

(a) (2 points) The R2 is now higher than before. Would you therefore argue that the
two additional variables should be included in the regression? Explain why or
why not.

Solution: R-squared never decreases when additional variables are added.
But one can see that the adjusted R-squared is actually lower, i.e., the penalty
for adding these variables was higher than the additional explanatory power.
Hence, from this point of view, the variables do not add to explaining the variation
in the dependent variable.
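
The penalty mentioned in this solution comes from the formula adjusted R² = 1 - (1 - R²)(n - 1)/(n - k - 1). In the Python sketch below, the sample size, the number of regressors and the R² of the extended model are hypothetical values chosen only to show that R² can rise while adjusted R² falls.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared for n observations and k regressors (excluding the constant)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

n = 100  # hypothetical sample size

# Baseline model: R-squared as in Question 1, with an assumed 7 regressors.
r2_base, k_base = 0.253, 7
# Extended model: two extra regressors and a slightly higher (hypothetical) R-squared.
r2_ext, k_ext = 0.255, 9

adj_base = adjusted_r2(r2_base, n, k_base)
adj_ext = adjusted_r2(r2_ext, n, k_ext)
```

Here the small gain in fit does not compensate for the two degrees of freedom lost, so the adjusted R² declines.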


(b) (3 points) You now run two tests: one on tenure2 and exper2, and one on exper
and exper2.

Explain what these tests are doing and analyze the results shown. Compare the
results with the P > |t| value (see table 2) of the single coefficients.

Solution: These are two F-tests of joint significance. In each test, H0 is that
both coefficients are zero. For tenure2 and exper2, H0 cannot be rejected: the
p-value is 22.6%, implying that the quadratic terms are jointly insignificant even
at the 20% level. However, the test on exper and exper2 shows that the two
experience coefficients are jointly significant (i.e., jointly different from zero),
although neither exper nor exper2 is individually significant.


Question 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Consider the following extension to the model outlined in Question 1, where an
interaction term is included: blacksingle is the product of black and single,
where single is a dummy variable which is 1 when the person is single.

Figure 3: OLS Estimation with Log Wage as dependent variable including interaction term

(a) (3 points) What is the estimated wage differential between single blacks and mar-
ried non-blacks?

Solution: Single blacks means we have single = 1 and black = 1, implying that
we need to add the three coefficients: -0.0614 - 0.1794 - 0.1889 ≈ -0.43.
Salaries for single blacks are, on average, about 43% lower than for married
non-blacks, holding all other factors constant.
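
The 43% figure reads the sum of log-wage coefficients directly as a percentage. A minimal sketch of that arithmetic next to the exact percentage implied by a log difference is given here. It assumes all three coefficients enter with a negative sign, which is consistent with the stated result of 43% lower; the variable names are illustrative only.

```python
import math

# Coefficients as quoted in the solution; the negative signs are an assumption
# consistent with the stated "43% lower" result.
coefficients = [-0.0614, -0.1794, -0.1889]

log_diff = sum(coefficients)                 # difference in log(wage)
approx_pct = 100 * log_diff                  # usual approximation: 100 * log difference
exact_pct = 100 * (math.exp(log_diff) - 1)   # exact percentage change implied by the log difference

print(f"log difference:       {log_diff:.4f}")
print(f"approximate % change: {approx_pct:.1f}")
print(f"exact % change:       {exact_pct:.1f}")
```

For a log difference of this size the approximation overstates the gap somewhat: the exact figure is roughly 35% rather than 43%.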

(b) (2 points) Can you say whether the wage differential between single blacks and
married non-blacks is statistically significant? Argue.

Solution: The interaction term itself is not statistically significant. However,
since the other two coefficients are highly significant, it is likely that the three
coefficients are jointly significant.
We could test this with an F-test, which indeed reveals a high joint significance.


Number on the list:

Name:

Student Number: Econ/IE&B/BE:

Resit
Econometrics for Business Economics [EBB061A05],
Economics [EBB814A05] and International Economics and
Business [EBB070A05] 2018-2019

Instructions:

1. Please answer all 20 Multiple Choice Questions in Part I by selecting the most appro-
priate answer. Use the computer sheet provided and follow the instructions provided
on the computer sheet.
2. Answer all three Open Questions in this exam booklet. You can earn 20 points for the
open questions.
3. Please refer to the computer output to answer the Open Questions.
4. You are required to submit all materials after completing this examination.
5. You are not allowed to use a graphical calculator but only a single line calculator. The
types allowed are Casio fx-82ES (PLUS) or the Casio fx-82MS as in Mathematics and
Data Analysis from your first year.
6. Please do not write on the tables and formula sheets as we would like to re-use them.
The tables may not include critical statistics for all degrees of freedom. Choose the
most appropriate degrees of freedom.
7. You are not allowed to visit the toilet during this exam.


Part I Multiple Choice Questions


Question 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 point
If the error term is correlated with any of the independent variables, then the OLS
estimators are:
A. biased and inconsistent.
B. biased and consistent.
C. unbiased and inconsistent.
D. unbiased and consistent.

Question 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 point
When an estimator is consistent,
A. the coefficient estimates will be as close to their true values as possible for
small and large samples.
B. on average, the estimated coefficient values will equal the true values.
C. the least squares estimator is unbiased and no other unbiased estimator has a
smaller variance.
D. the estimates will converge to the true values as the sample size increases.

Question 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 point
Assume that you have the following estimated model:

\[ \widehat{\log q_i} = 2.25 - 0.7 \log p_i + 0.02\, inc_i, \]

where $p_i$ is the price, $q_i$ is the quantity demanded of a certain good, and $inc_i$ is
income in thousands of dollars.
The interpretation of the coefficient of log p in the above equation is:
A. If the price increases by 1%, the demanded quantity will be 0.7% lower on
average, ceteris paribus.
B. If the price increases by 1%, the demanded quantity will be 0.007% lower on
average, ceteris paribus.
C. If the price increases by 1%, the demanded quantity will be 70% lower on
average, ceteris paribus.
D. None of the above.


Question 4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 point
Assume that you have the following estimated model:

\[ \widehat{\log q_i} = 2.25 - 0.7 \log p_i + 0.02\, inc_i, \]

where $p_i$ is the price, $q_i$ is the quantity demanded of a certain good, and $inc_i$ is
income in thousands of dollars. The interpretation of the coefficient of $inc_i$ in the
above equation is:
A. If the disposable income increases by a thousand dollars, the demanded
quantity will be 2% higher on average, ceteris paribus.
B. If the disposable income increases by a thousand dollars, the demanded
quantity will be 0.02% higher on average, ceteris paribus.
C. If the disposable income increases by a thousand dollars, the demanded
quantity will be 0.0002% higher on average, ceteris paribus.
D. None of the above.
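
The two coefficients in this demand equation follow different reading rules: a log-log coefficient is an elasticity, while a log-level coefficient is multiplied by 100 to give a percentage effect per unit. A minimal sketch of both calculations, assuming the price coefficient is -0.7 (the sign is not legible in the printed equation, and the variable names are illustrative):

```python
import math

b_log_price = -0.7   # coefficient on log(p); negative sign is an assumption
b_income = 0.02      # coefficient on inc (income in thousands of dollars)

# Log-log: a 1% price increase changes quantity by about b_log_price percent.
approx_price_effect = b_log_price * 1.0
exact_price_effect = 100 * (1.01 ** b_log_price - 1)

# Log-level: one extra unit of inc changes quantity by about 100 * b_income percent.
approx_income_effect = 100 * b_income
exact_income_effect = 100 * (math.exp(b_income) - 1)

print(f"1% higher price:  approx {approx_price_effect:.2f}%, exact {exact_price_effect:.3f}%")
print(f"$1000 more income: approx {approx_income_effect:.2f}%, exact {exact_income_effect:.3f}%")
```

For changes this small the approximate and exact figures are very close, which is why the approximate reading is the standard one.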

Question 5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 point
Consider the least squares estimator for the standard error of the slope coefficient.
Which of the following statements are true?

I. The standard error will be positively related to the residual variance.
II. The standard error will be negatively related to the dispersion of the observations
on the explanatory variable around their mean value.
III. The standard error will be negatively related to the sample size.
IV. The standard error is still a valid estimator of the standard deviation of the slope
coefficient in the presence of heteroskedasticity.
A. (I) and (III).
B. (II) and (IV).
C. (I), (II), and (III).
D. (I), (II), (III), and (IV).
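
Statements I to III can be checked against the usual simple-regression formula $se(\hat\beta_1) = \sqrt{\hat\sigma^2 / \sum_i (x_i - \bar{x})^2}$, which is derived under homoskedasticity. The sketch below is only an illustration with made-up numbers; the function name `slope_se` and the data are hypothetical.

```python
import math

def slope_se(x, resid_var):
    """Standard error of the OLS slope in a simple regression,
    using the formula that assumes homoskedastic errors."""
    x_bar = sum(x) / len(x)
    sst_x = sum((xi - x_bar) ** 2 for xi in x)   # dispersion of x around its mean
    return math.sqrt(resid_var / sst_x)

x_base = [1, 2, 3, 4, 5]              # hypothetical regressor values
x_wide = [2 * v for v in x_base]      # same sample size, more dispersion
x_long = x_base * 4                   # same dispersion per observation, larger sample

se_base = slope_se(x_base, 1.0)
se_noisy = slope_se(x_base, 4.0)      # larger residual variance
se_wide = slope_se(x_wide, 1.0)
se_long = slope_se(x_long, 1.0)

print(se_base, se_noisy, se_wide, se_long)
```

The standard error rises with the residual variance and falls with both the spread of x and the number of observations.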

Question 6 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 point
Consider the following linear regression model:

\[ y_i = \beta_0 + \beta_1 x_i + u_i. \]

The error term $u_i$ may contain
A. omitted variables.
B. measurement error.
C. non-linearities.
D. all of the above.


Question 7 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 point
A violation of the homoskedasticity assumption occurs if
A. the variance of the error term is individual-specific.
B. there is correlation of the error term between observations.
C. the variance of the error term depends on one of the covariates.
D. any of the above conditions holds.

Question 8 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 point
Consider the following linear regression model:

\[ y_i = \beta_0 + \beta_1 x_i + u_i. \]

If the covariance between y and x is positive, then the sign of the OLS estimate for $\beta_1$
A. is negative.
B. is positive.
C. is zero.
D. depends on other factors as well.

Question 9 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 point
Consider the following model for average monthly rainfall measured in millimeters
(mm):

\[ \widehat{rainfall}_i = 100 - 0.9\, temperature_i, \]

where $temperature_i$ denotes the monthly average temperature measured in degrees
Celsius. What is the effect of a 10 degree Fahrenheit increase in temperature on rainfall?
The relationship between Fahrenheit (F) and Celsius (C) is F = 1.8C + 32.
A. A 10 degree Fahrenheit increase in temperature is associated with 5 mm less rainfall.
B. A 10 degree Fahrenheit increase in temperature is associated with 16.2 mm less
rainfall.
C. A 9 degree Fahrenheit increase in temperature is associated with 5 mm less rainfall.
D. A 10 degree Fahrenheit increase in temperature is associated with 32 mm less rainfall.
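
Rescaling a regressor rescales its coefficient by the inverse factor, and the additive constant in the conversion only shifts the intercept. A minimal sketch of the unit change, assuming the slope on temperature is -0.9 mm per degree Celsius (the minus sign is not legible in the printed equation):

```python
slope_per_celsius = -0.9        # mm of rainfall per degree Celsius; sign is an assumption
fahrenheit_per_celsius = 1.8    # from F = 1.8C + 32; the +32 does not affect changes

# One degree Fahrenheit is 1/1.8 of a degree Celsius.
slope_per_fahrenheit = slope_per_celsius / fahrenheit_per_celsius

effect_of_10_fahrenheit = 10 * slope_per_fahrenheit
print(f"slope per degree Fahrenheit: {slope_per_fahrenheit:.3f} mm")
print(f"effect of +10 degrees Fahrenheit: {effect_of_10_fahrenheit:.1f} mm")
```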


Question 10 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 point
A researcher is interested in estimating the causal effect of class size ($\text{Class size}_i$) on
students’ GPA ($GPA_i$). Therefore, the researcher specifies the following model:

\[ GPA_i = \beta_0 + \beta_1\, \text{Class size}_i + u_i. \]

The researcher knows that schools sort students of higher academic ability in smaller
classes. The OLS estimate of β1 is likely
A. to underestimate the true causal effect of class size.
B. to overestimate the true causal effect of class size.
C. to capture the true causal effect of class size.
D. to be too imprecise to capture the true causal effect of class size.

Question 11 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 point
A researcher compares the standard error of the OLS estimates of β1 of the following
two models:

\[ y_i = \beta_0 + \beta_1 x_{1i} + u_i, \quad \text{(Short)} \]
\[ y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + u_i. \quad \text{(Long)} \]

The corresponding estimates are $\hat\beta_1^{Short}$ and $\hat\beta_1^{Long}$. Which of the following statements is
true?
A. $Var(\hat\beta_1^{Short}) < Var(\hat\beta_1^{Long})$.
B. $Var(\hat\beta_1^{Short}) = Var(\hat\beta_1^{Long})$.
C. The relation between $Var(\hat\beta_1^{Short})$ and $Var(\hat\beta_1^{Long})$ depends on the covariance
between $x_{1i}$ and $x_{2i}$.
D. If $\beta_2 = 0$, then $Var(\hat\beta_1^{Short}) = Var(\hat\beta_1^{Long})$.

Question 12 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 point
A researcher estimates the following regression on a population of workers for the
cohorts born between 1950 and 1960:

\[ wage_i = \beta_0 + \beta_1\, age_i + \beta_2\, year_i + u_i. \]

Age is measured in years, and the variable $year_i$ measures the year when the wage was
reported. What is the average difference between the wages of workers born in 1953
and 1954, holding age constant?
A. β2 .
B. β1 .
C. β1 + β2 .
D. 0.


Question 13 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 point
A researcher is interested in the returns to schooling, i.e., the effect of an extra year
of schooling on log wages. The researcher obtains the estimate of 0.12 with a 95%
confidence interval of [0.02, 0.22] on a sample of 1,000 workers. Which of the following
statements is true?
A. At a 10 percent significance level, we cannot reject the null that the returns to
schooling is zero.
B. At a 1 percent significance level, we reject the null that the returns to schooling
is zero.
C. At a 5 percent significance level, we reject the null that the returns to school-
ing is zero.
D. At a 5 percent significance level, we reject the null that the returns to schooling
is 0.10.
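
A 95% confidence interval carries enough information to run the tests in the four options. The sketch below backs out the standard error from the interval and compares the t-statistic with standard normal critical values. Using normal critical values is an assumption that seems reasonable for a sample of 1,000 workers; the variable names are illustrative.

```python
estimate = 0.12
ci_low, ci_high = 0.02, 0.22
z_95 = 1.96   # two-sided 5% critical value (normal approximation)

# Half-width of a 95% interval is z_95 * standard error.
std_err = (ci_high - ci_low) / (2 * z_95)

# Test H0: returns to schooling = 0 at several significance levels.
t_zero = estimate / std_err
critical_values = {0.10: 1.645, 0.05: 1.960, 0.01: 2.576}
rejects_zero = {alpha: abs(t_zero) > c for alpha, c in critical_values.items()}

# Test H0: returns to schooling = 0.10 at the 5% level.
t_point_one = (estimate - 0.10) / std_err
rejects_point_one = abs(t_point_one) > critical_values[0.05]

print(f"standard error: {std_err:.4f}, t for H0 = 0: {t_zero:.2f}")
print(rejects_zero, rejects_point_one)
```

Equivalently, any hypothesized value inside the 95% interval cannot be rejected at the 5% level, and any value outside it can.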

Question 14 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 point
A researcher estimates the following model

\[ wage_i = \beta_0 + \beta_1\, age_i + \beta_2\, age_i^2 + \beta_3\, female_i + u_i, \]

and is interested in testing the null hypothesis $\beta_1 = \beta_2 = 0$. What is the restricted model
that the researcher has to estimate to perform an F-test?
A. $wage_i = \beta_0 + u_i$.
B. $wage_i = \beta_0 + \beta_1\, age_i + \beta_3\, female_i + u_i$.
C. $wage_i = \beta_0 + \beta_3\, female_i + u_i$.
D. $wage_i = \beta_0 + \beta_1\, age_i + \beta_2\, age_i^2 + \beta_3\, female_i + u_i$.

Question 15 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 point
A researcher is interested in testing whether $x_{1i}$ and $x_{2i}$ are jointly significant in the
following linear regression using an F-test:

\[ y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + u_i. \]

Which of the following statements is true?


A. The intercept in the unrestricted model is higher than in the restricted model.
B. The number of restrictions of the test is 3.
C. The $R^2$ of the restricted model is zero.
D. The test can be performed using a t-test.


Question 16 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 point
A researcher estimated the following regression:


\[ \widehat{\log wage_i} = 8 + 0.2\, age_i - 0.01\, female_i \times age_i, \]

where $female_i$ is a dummy variable for being a female worker. What is the difference
between the average wages of male and female workers at the age of 22?
A. 1 percent.
B. 20 percent.
C. 19 percent.
D. 22 percent.
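
With an interaction between a dummy and a continuous variable, the group difference depends on the value of the continuous variable. A minimal sketch that evaluates the fitted equation for both groups at age 22, assuming the interaction coefficient is -0.01 (the operator before 0.01 is not legible in the printed equation, and the function name is hypothetical):

```python
def predicted_log_wage(age, female):
    """Fitted log wage; the minus sign on the interaction term is an assumption."""
    return 8 + 0.2 * age - 0.01 * female * age

age = 22
log_gap = predicted_log_wage(age, female=0) - predicted_log_wage(age, female=1)

# In a log-wage model, 100 * (log difference) approximates the percentage difference.
approx_pct_gap = 100 * log_gap
print(f"log-wage gap at age {age}: {log_gap:.2f} (about {approx_pct_gap:.0f} percent)")
```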

Question 17 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 point
A researcher is interested in testing whether the effect of experience on wages is
different between men and women. Therefore, the researcher decides to estimate a model
on the full sample, as well as two separate models on the subsamples of men and women.
Which of the following tests can be used to test whether there is a gender
difference in the effect of experience on wages?
A. Hausman test.
B. Breusch-Pagan test.
C. Chow test.
D. White test.

Question 18 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 point
A weak instrument
A. severely biases the IV estimate.
B. always yields better estimates than OLS.
C. cannot be used to estimate the first-stage regression.
D. is weakly correlated with the error term.

Question 19 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 point
A researcher is interested in estimating the supply function of housing and therefore
collects data on the number of houses and average prices in each Dutch municipality. The
researcher specifies the following supply function:

\[ \text{number of houses}_i = \beta_0 + \beta_1\, price_i + u_i. \]

The researcher obtains a negative estimate of $\beta_1$. Which of the following statements is
true?
A. Since the estimate is negative, the researcher actually estimates the demand
function.
B. The estimate captures the well-known fact that supply is downward sloping
in housing markets.
C. The estimate is negative, because it suffers from simultaneous equation
bias.
D. Due to measurement error, the estimate is downward biased, and thus it is
negative.

Question 20 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 point
A natural experiment
A. usually takes place in the field, where the researcher can randomly assign
treatment.
B. studies the effects of natural disasters, such as earthquakes or hurricanes.
C. tests the effect of treatments in a laboratory environment.
D. studies policy reform where some people were affected and others were
not.


Part II Open Question [15 Points]

Reply to the sub-points of the question by using exclusively the space within boxes. The points
assigned to each sub-question are reported between [square brackets].
Consider data on 32,000 married black or married Hispanic women. The data contain
information about the children (kidcount is the number, samesex is an indicator for
having two children with the same sex, multi2nd is an indicator for having twins at the
second birth), earnings (labinc is labor income, hours is weekly hours worked), and
socio-economic characteristics (educ is years of schooling, age is age of the individual,
agefstm is the age at first birth, black and hispan are dummy variables).

(a) You derive the following descriptive statistics for hours and educ. Calculate the
estimates $\hat\beta_0$ and $\hat\beta_1$ in the equation $\widehat{hours} = \hat\beta_0 + \hat\beta_1\, educ$. [2 Points]

Figure 1: Descriptive statistics for hours and educ

Answer:
\[ \hat\beta_1 = \frac{cov(educ, hours)}{var(educ)} = \frac{11.7669}{10.92432} = 1.077 \]
\[ \hat\beta_0 = \bar{y} - \hat\beta_1 \bar{x} = 21.22011 - 1.077 \cdot 11.00534 = 9.366 \]
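
The same calculation can be sketched in a few lines, using the summary statistics quoted in the answer (the descriptive-statistics table itself is not reproduced here, so the inputs are taken from the answer text and the variable names are illustrative). The intercept is computed from the unrounded slope.

```python
cov_educ_hours = 11.7669    # sample covariance of educ and hours
var_educ = 10.92432         # sample variance of educ
mean_hours = 21.22011       # sample mean of the dependent variable
mean_educ = 11.00534        # sample mean of the regressor

# Simple-regression OLS formulas.
beta1_hat = cov_educ_hours / var_educ
beta0_hat = mean_hours - beta1_hat * mean_educ

print(f"beta1_hat = {beta1_hat:.3f}")
print(f"beta0_hat = {beta0_hat:.3f}")
```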


Now consider the following regression from an OLS.

Figure 2: OLS estimates on hours


(b) Compute the $R^2$ for this regression. Interpret your result in words. [3 Points]

Answer:
\[ R^2 = \frac{1021907.1}{12111869.6} = 0.084 \]
R-squared is a measure of how well the model can account for the variation of the
dependent variable. A value of 0.084 means that 8.4% of the variation of hours
around its mean can be explained by the model.
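
As a quick check of this ratio, the sketch below treats the numerator as the explained (model) sum of squares and the denominator as the total sum of squares. That labelling is an assumption, since the regression output is not reproduced here. It also shows the equivalent form based on the residual sum of squares.

```python
explained_ss = 1021907.1    # assumed: model (explained) sum of squares
total_ss = 12111869.6       # assumed: total sum of squares

residual_ss = total_ss - explained_ss

r_squared = explained_ss / total_ss
r_squared_alt = 1 - residual_ss / total_ss   # same quantity, written via the residuals

print(f"R-squared = {r_squared:.3f}")
```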

(c) Write down the t-statistic to test whether the absolute value of the coefficient for
kidscount is larger than 3.0 at the α = 0.1 significance level and evaluate the test. State
the most appropriate critical value that you use. [3 Points]

Answer:
The test statistic for the one-sided t-test is given by
\[ t = \frac{\hat\beta_k - 3.0}{se(\hat\beta_k)} = \frac{3.129 - 3.0}{0.1198} = 1.077 \]
The associated critical value is $t_{0.1,\infty} = 1.282$.
Since 1.077 < 1.282, the null hypothesis that the coefficient on kidscount equals 3.0
cannot be rejected in favour of the alternative that it is larger than 3.0 at the
10% significance level.
Sandor: Is the sign-definition clear here? Should I take the negative value (see
commented-out version)?
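
The test can be sketched as follows, using the coefficient in absolute value as in the answer and a large-sample one-sided 10% critical value. The variable names are illustrative.

```python
coef_abs = 3.129        # coefficient on kidscount in absolute value, as used in the answer
std_err = 0.1198
hypothesized = 3.0
critical_10pct_one_sided = 1.282   # large-sample (normal) critical value

t_stat = (coef_abs - hypothesized) / std_err
reject_null = t_stat > critical_10pct_one_sided

print(f"t = {t_stat:.3f}, reject H0 at 10%: {reject_null}")
```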


(d) Compute a 90%-confidence interval for the coefficient of black. According to your
result, is the coefficient statistically different from zero?[2 Points]

Answer:
Confidence interval:

\[ P\left(\hat\beta_j - t_{0.05,\infty}\, se(\hat\beta_j) \le \beta_j \le \hat\beta_j + t_{0.05,\infty}\, se(\hat\beta_j)\right) = 1 - \alpha \]

\[ P(1.27 - 1.645 \cdot 1.34 \le \beta_1 \le 1.27 + 1.645 \cdot 1.34) = 0.9 \]
\[ P(-0.941 \le \beta_1 \le 3.47) = 0.9 \]

Note: since zero lies within the confidence interval, we cannot reject the null
that the coefficient is zero at the 10% significance level.
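
A minimal sketch of the interval calculation, using the rounded coefficient and standard error printed in the answer. With these rounded inputs the bounds come out at roughly -0.93 and 3.47; the slightly different lower bound in the answer presumably reflects unrounded inputs.

```python
coef_black = 1.27       # rounded coefficient on black, from the answer
std_err = 1.34          # rounded standard error, from the answer
critical_90 = 1.645     # two-sided 90% critical value (large sample)

half_width = critical_90 * std_err
ci_lower = coef_black - half_width
ci_upper = coef_black + half_width

# The coefficient is significant at the 10% level only if zero lies outside the interval.
zero_inside = ci_lower <= 0 <= ci_upper
print(f"90% CI: [{ci_lower:.3f}, {ci_upper:.3f}], zero inside: {zero_inside}")
```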

(e) Which of the classical assumptions is likely to be violated in this equation? Argue by
describing a potential example for endogeneity for the problem at hand.[3 Points]

Answer:
The zero conditional mean assumption is violated: kidscount is correlated with the
error term.
Here, endogeneity of kidscount might arise because women with fewer kids have
a higher preference for working in general, whereas women with more kids (controlling
for other factors) have a stronger preference for staying at home and working less.
Not accounting for this endogeneity in the estimation might bias the coefficient
for kidscount upwards in absolute terms (exaggerating the effect of children on
work), because this preference is picked up by the coefficient.


The following tables present results from a 2SLS regression. Note that multi2nd is a
dummy variable indicating whether the second birth is a twin birth.

Figure 3: First-stage 2SLS


Figure 4: 2SLS estimates on hours

(f) Briefly explain the 2SLS method (2 sentences are enough). Then, state the conditions
needed for an instrument and argue whether they are met with the proposed instru-
ment (use the results depicted in the tables above, if necessary). [4 Points]

Answer:
In 2SLS you use an (exogenous) instrument to predict the (endogenous) independent
variable x (first stage). The predicted value from this equation is then used in
the second stage to explain the dependent variable y.
Conditions:

• The instrument needs to be meaningful for explaining x, i.e., be correlated
with the endogenous x variable (relevance).

• Looking at the coefficient of multi2nd in the first stage confirms relevance:
the coefficient is meaningful (having twins at the second birth increases the
number of kids by 0.8) and it is significant: a t-statistic of 14.95 is larger
than the Stock-Yogo rule of thumb of √10 ≈ 3.2.

• The instrument must be uncorrelated with any other determinant of the
dependent variable y (exogeneity).

• There is no general test for this, so we must argue: it should be very
difficult to influence the likelihood of having twins (other than through
artificial insemination, which results in a higher probability of twins), hence
the instrument should be uncorrelated with the error term.
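
The two stages described above can be sketched on simulated data. Everything in this block is hypothetical: the data-generating process, the coefficient values and the helper names are chosen only for illustration and are not taken from the exam data. An unobserved preference raises x and lowers y, so OLS overstates the negative effect, while the instrument z shifts x and is unrelated to the preference.

```python
import random

def mean(v):
    return sum(v) / len(v)

def cov(a, b):
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (len(a) - 1)

def simple_ols(x, y):
    """Intercept and slope of a regression of y on a single regressor x."""
    slope = cov(x, y) / cov(x, x)
    return mean(y) - slope * mean(x), slope

random.seed(0)
n = 20000
true_effect = -3.0

z = [random.gauss(0, 1) for _ in range(n)]      # instrument: relevant and exogenous by construction
pref = [random.gauss(0, 1) for _ in range(n)]   # unobserved preference (ends up in the error term)
x = [1 + 0.8 * zi + 0.5 * pi + random.gauss(0, 1) for zi, pi in zip(z, pref)]
y = [2 + true_effect * xi - 2 * pi + random.gauss(0, 1) for xi, pi in zip(x, pref)]

# OLS ignores the endogeneity of x.
_, beta_ols = simple_ols(x, y)

# First stage: regress x on z and form fitted values.
a0, a1 = simple_ols(z, x)
x_hat = [a0 + a1 * zi for zi in z]

# Second stage: regress y on the fitted values.
_, beta_2sls = simple_ols(x_hat, y)

# With one instrument, 2SLS equals the ratio of covariances.
beta_iv_ratio = cov(z, y) / cov(z, x)

print(f"OLS: {beta_ols:.3f}  2SLS: {beta_2sls:.3f}  true effect: {true_effect}")
```

In this simulated setting the 2SLS estimate lands close to the true effect while OLS is more negative. The sketch only reproduces the point estimate: standard errors from a manual second stage are not valid, which is why dedicated 2SLS routines are used in practice.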

(g) Compare the coefficients of kidcount in the OLS and the 2SLS regressions. Comment
on the difference. [3 Points]


Answer:

• The coefficient on kidcount is smaller in absolute value in the 2SLS regression than
in the OLS regression, implying that the negative effect of the number of children on
hours worked is reduced when accounting for endogeneity via the IV approach.

• This is in line with the reasoning of unobserved heterogeneity, i.e., some
women having a stronger preference for work and others having a stronger preference
for children.

• The IV approach accounts for this difference, and the coefficient is more
likely to be the causal effect of the number of kids on hours worked.

• The OLS estimate cannot be interpreted as being a causal effect.
