Fixed Effects and DiD Methods Explained
Otto-von-Guericke-Universität Magdeburg
Outline
1. Introduction
2. Fixed Effects Model
3. Difference-in-Differences
4. Synthetic Control Method
Readings
1. Introduction
A Brief Review of Panel Data and Fixed-Effects Models
Example: Estimating the Union Wage Premium
where:
- Note that if:
  1. M has an effect on Y, and
  2. M and D are correlated,
  ⇒ we face classical omitted variable bias (see Topic 3).
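The two conditions can be made concrete with the textbook omitted-variable-bias formula. The block below is a sketch in assumed notation: the coefficient names \rho, \gamma and \pi and the error terms are illustrative and do not come from the slides.

```latex
\begin{align*}
\text{Long regression:}\quad & Y_{it} = \alpha + \rho D_{it} + \gamma M_i + \varepsilon_{it} \\
\text{Auxiliary regression:}\quad & M_i = \pi_0 + \pi D_{it} + u_{it} \\
\text{Short regression (M omitted):}\quad & \operatorname{plim}\,\hat{\rho}_{\text{short}} = \rho + \gamma\,\pi
\end{align*}
```

The bias term \gamma\pi vanishes only if M has no effect on Y (\gamma = 0) or if M and D are uncorrelated (\pi = 0), which mirrors the two conditions listed above.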
2. Fixed Effects Model
Which FE Transformation to Use?
Estimation of FE Models in Stata
Interpretation of FE Models
Identifying Assumption
\[ E[Y_{0it} \mid M_i, X_{it}, D_{it} = 1] = E[Y_{0it} \mid M_i, X_{it}, D_{it} = 0] = E[Y_{0it} \mid M_i, X_{it}] \]
\[ E[Y_{1it} \mid M_i, X_{it}, D_{it} = 1] = E[Y_{1it} \mid M_i, X_{it}, D_{it} = 0] = E[Y_{1it} \mid M_i, X_{it}] \]
Example: Estimating the Union Wage Premium (ctd.)
- As a consequence, the FE estimator:
  - discards variation across cross-sectional units (between variation).
  - does not allow us to estimate the coefficients of time-invariant regressors (gender, education, ...).
  - does not solve the problem of time-varying omitted variables.
  - may be more susceptible to measurement error, because the regressors are demeaned.
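The first two points can be seen in a minimal sketch of the within (demeaning) transformation. The panel below is entirely hypothetical: four units, three periods, a time-invariant unobservable M_i that is correlated with treatment D_it, and a true treatment effect of 2.0. The helper names `ols_slope` and `demean_by_unit` are illustrative, not part of any library.

```python
# Within (fixed-effects) transformation on a tiny hypothetical panel.
# All numbers are made up for illustration; the true effect of D is 2.0.

def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x (with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def demean_by_unit(ids, values):
    """Subtract each unit's own time average from its observations."""
    totals, counts = {}, {}
    for i, v in zip(ids, values):
        totals[i] = totals.get(i, 0.0) + v
        counts[i] = counts.get(i, 0) + 1
    return [v - totals[i] / counts[i] for i, v in zip(ids, values)]

TRUE_EFFECT = 2.0
# (unit id, unobserved time-invariant M_i, treatment path D_it over 3 periods)
units = [
    (1, 0.0, [0, 0, 0]),
    (2, 1.0, [0, 0, 1]),
    (3, 3.0, [0, 1, 1]),
    (4, 5.0, [1, 1, 1]),
]

ids, d, y = [], [], []
for uid, m, path in units:
    for d_it in path:
        ids.append(uid)
        d.append(float(d_it))
        y.append(1.0 + TRUE_EFFECT * d_it + m)  # no noise, so FE is exact

pooled = ols_slope(d, y)                    # biased upward: M_i is omitted
within = ols_slope(demean_by_unit(ids, d),  # FE: M_i is differenced out
                   demean_by_unit(ids, y))
print(f"pooled OLS slope: {pooled:.3f}, FE slope: {within:.3f}")
```

Units 1 and 4 never change treatment status, so after demeaning they contribute nothing to the FE slope. This is the sense in which between variation is discarded and time-invariant regressors cannot be estimated.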
Caveats of the FE Estimator: Measurement Error
- If OLS estimates exceed FE estimates, this may, but need not, indicate selection bias: stronger attenuation from measurement error in the FE estimates produces the same pattern.
- The time effects $\lambda_t$ shift the intercept over time and affect all micro-units uniformly.
- Examples:
  - business cycle movements
  - common trend in wages
  - ...
Examples of Special Panels (Households, Twins)
3. Difference-in-Differences
In a Nutshell
- Idea:
  - observed changes over time for the control group provide the counterfactual trend for the treatment group.
- Identifying assumption:
  - outcomes in the treatment and control groups move in parallel in the absence of treatment (common trend assumption).
Note: Even if pre-trends are the same, one still has to worry about other policies changing at the same time ("contemporaneous shocks").
- Main steps:
  - collect data on treated and non-treated units before and after treatment.
  - compute the before-after change in the outcome separately for treated and non-treated units (equivalently, the treated-control gap separately before and after treatment).
  - treatment effect = difference between these two differences.
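The three steps can be sketched in a few lines of Python. The records below are hypothetical and serve only to show the mechanics; the function names `cell_mean` and `did_estimate` are illustrative.

```python
# Difference-in-differences from group means, following the three steps above.
# The records are hypothetical and only illustrate the mechanics.
from statistics import mean

# Step 1: data on treated and control units, before and after treatment.
# Each record is (group, period, outcome).
records = [
    ("treated", "pre", 10.0), ("treated", "pre", 12.0),
    ("treated", "post", 15.0), ("treated", "post", 17.0),
    ("control", "pre", 8.0), ("control", "pre", 10.0),
    ("control", "post", 9.0), ("control", "post", 11.0),
]

def cell_mean(rows, group, period):
    """Average outcome in one group-by-period cell."""
    return mean(y for g, p, y in rows if g == group and p == period)

def did_estimate(rows):
    # Step 2: before-after change within each group.
    change_treated = cell_mean(rows, "treated", "post") - cell_mean(rows, "treated", "pre")
    change_control = cell_mean(rows, "control", "post") - cell_mean(rows, "control", "pre")
    # Step 3: difference between the two differences.
    return change_treated - change_control

print(did_estimate(records))
```

With these made-up numbers the treated group changes by 5 and the control group by 1, so the estimate is 4.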
Example: The Causes of Cholera
- In the early 19th century, it was widely believed that cholera was caused by miasma (an airborne disease).
- However, physician John Snow, who was to become the father of modern epidemiology, correctly believed that it was waterborne.
- In his study on cholera, Snow (1855) [3] pioneered the DiD idea:
  - Snow studied a neighborhood in Northwest Soho, London, where 2 private companies supplied households with water (Lambeth; Southwark & Vauxhall).
  - until 1852: both companies drew water from a downstream stretch of the River Thames that was contaminated with sewage from London.
  - in 1852: Lambeth moved its waterworks upriver to an area relatively free of sewage.
[3] Snow, J. (1855), On the Mode of Communication of Cholera, London: John Churchill, New Burlington Street.
[Figure: map. This map was used by Dr. Snow to describe the "grand experiment" of 1854 comparing cholera mortality among persons consuming downstream contaminated water (Southwark and Vauxhall Company: blue, but faded to green on the map) versus upstream cleaner water of the Thames (Lambeth Company: red). The overlapping area (purple, but faded to gray-red on the map) is where John Snow analyzed the results of the natural experiment. Source: Map 2 in John Snow's book.]
⇒ Calculate:
Pre-experiment difference: 150 (treated) − 125 (control) = 25
Post-experiment difference: 10 (treated) − 150 (control) = −140
Difference-in-differences: −140 (post) − 25 (pre) = −165
- Idea:
  - the pre-experiment difference captures the normal difference
  - the post-experiment difference captures the normal difference plus the causal effect!
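Written out as a short derivation, the logic of the calculation is as follows. The symbols N (normal difference between the groups) and \tau (causal effect) are introduced here for illustration only and assume that N would have stayed constant without the change in water supply.

```latex
\begin{align*}
\text{Pre-experiment difference:}\quad  & N = 150 - 125 = 25 \\
\text{Post-experiment difference:}\quad & N + \tau = 10 - 150 = -140 \\
\text{Difference-in-differences:}\quad  & \tau = (N + \tau) - N = -140 - 25 = -165
\end{align*}
```

Subtracting the pre-experiment difference removes N, so only the causal effect remains.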
Example: Monetary Intervention and the Great Depression
[4] Richardson, G. and W. Troost (2009), "Monetary Intervention Mitigated Banking Panics during the Great Depression: Quasi-Experimental Evidence from a Federal Reserve District Border, 1929-1933", Journal of Political Economy, 117(6), 1031-1073.
- The border between the 6th and 8th districts runs through the state of Mississippi.
- These naive comparisons (what's wrong with them?) are inputs into the difference-in-differences (DiD) estimator.
⇒ Difference-in-differences:
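In symbols, the estimator combines the naive comparisons in one of two equivalent ways. This is a sketch in assumed notation: Y_{d,\text{pre}} and Y_{d,\text{post}} denote the number of banks in district d before and after the policy difference, with the 6th district as the treatment group.

```latex
\begin{align*}
\delta_{DD}
  &= \left(Y_{6,\text{post}} - Y_{6,\text{pre}}\right) - \left(Y_{8,\text{post}} - Y_{8,\text{pre}}\right) \\
  &= \left(Y_{6,\text{post}} - Y_{8,\text{post}}\right) - \left(Y_{6,\text{pre}} - Y_{8,\text{pre}}\right)
\end{align*}
```

The first line compares changes over time across districts; the second compares the cross-district gap before and after. Both orderings give the same number.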
- Absent any policy differences, the number of banks in the 8th and 6th districts would have evolved in parallel.
- Under this assumption, the trend in the 8th district is informative about what would have happened in the 6th district without easy money.
- This assumption cannot be tested (why?).
- But we can check its plausibility by comparing pre-treatment trends between treatment and control.
[Figure 5.1: Bank failures in the Sixth and Eighth Federal Reserve Districts. Y-axis: number of banks in business. Series: Eighth District, Sixth District, Sixth District counterfactual. Annotation: treatment effect.]
Note: DiD attributes any difference in trends between treatment and control that occurs at the time of the intervention to that intervention.
[Figure 5.2: Trends in bank failures in the Sixth and Eighth Federal Reserve Districts. Y-axis: number of banks in business. Series: Eighth District, Sixth District.]
[Figure 5.3: Trends in bank failures in the Sixth and Eighth Federal Reserve Districts, and the Sixth District's DD counterfactual. Series: Sixth District, Sixth District counterfactual.]
- Advantages:
  1. easy to calculate standard errors
  2. we can add additional control variables:
     - to increase precision by reducing residual variance
     - to control for observable group-specific trends
\[ Y_{it} = \alpha + \beta\, TREAT_i + \gamma\, POST_t + \delta_{rDD}\, (TREAT_i \times POST_t) + e_{it} \]
where:
- $Y_{it}$: outcome of interest for unit i at time t.
- $TREAT_i$: dummy for (units in the) treatment group.
- $POST_t$: dummy for observations from the post-treatment period.
- $TREAT_i \times POST_t$: interaction term, indicates observations in the treatment group from the post-treatment period.
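\[ Y_{dt} = \alpha + \beta\, TREAT_d + \gamma\, POST_t + \delta_{rDD}\, (TREAT_d \times POST_t) + e_{dt} \]
where:
- $Y_{dt}$: number of banks in district d at time t.
- $TREAT_d$: dummy for the 6th district.
- $POST_t$: dummy for observations from 1931 onwards.
- $TREAT_d \times POST_t$: interaction term, indicates observations for the 6th district from 1931 onwards.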
Interpretation of the terms:
- $Y_{dt}$: number of banks in district d at time t.
- $TREAT_d$: controls for time-constant differences between the 6th and 8th districts.
- $POST_t$: controls for changes in the post-treatment period common to the control and treatment groups.
- $TREAT_d \times POST_t$: treatment dummy; $\delta_{rDD}$ is the coefficient of interest.
Table 5.1: Wholesale firm failures and sales in 1929 and 1933
Panel A. Number of wholesale firms
                                               1929    1933    Difference (1933–1929)
Sixth Federal Reserve District (Atlanta)        783     641     −142
Eighth Federal Reserve District (St. Louis)     930     607     −323
Difference (Sixth–Eighth)                      −147      34      181
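The bottom-right entry of the table is what a regression of the outcome on a constant, a treatment-group dummy, a post-period dummy and their interaction would return as the interaction coefficient. The sketch below checks this with the four cell counts from Panel A, assuming NumPy is available; the coefficient names `alpha`, `beta`, `gamma` and `delta` are illustrative.

```python
import numpy as np

# Cell counts from Table 5.1, Panel A (number of wholesale firms).
cells = [
    # treat, post, firms
    (0, 0, 930.0),  # Eighth District, 1929
    (0, 1, 607.0),  # Eighth District, 1933
    (1, 0, 783.0),  # Sixth District, 1929
    (1, 1, 641.0),  # Sixth District, 1933
]
# Columns of X: constant, TREAT (Sixth District), POST (1933), TREAT x POST.
X = np.array([[1.0, t, p, t * p] for t, p, _ in cells])
y = np.array([firms for _, _, firms in cells])

# Four observations and four parameters: the saturated regression fits exactly,
# so solving the linear system is the same as running OLS.
alpha, beta, gamma, delta = np.linalg.solve(X, y)
print(f"alpha={alpha:.0f}, beta={beta:.0f}, gamma={gamma:.0f}, delta={delta:.0f}")
```

The constant reproduces the Eighth District's 1929 level, the TREAT and POST coefficients reproduce the 1929 gap (−147) and the Eighth District's change (−323), and the interaction coefficient equals the DiD estimate of 181.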
Example: MLDA and Death Rates of Young Adults
- In a first step, we could study two states, say Alabama and Arkansas (2-state DiD).
\[ Y_{st} = \alpha + \beta\, TREAT_s + \gamma\, POST_t + \delta_{rDD}\, (TREAT_s \times POST_t) + e_{st} \]
where:
- $Y_{st}$: death rate of 18-20-year-olds in state s at time t.
- $TREAT_s$: dummy for Alabama.
- $POST_t$: dummy for observations from 1975 onwards.
- $TREAT_s \times POST_t$: interaction term, indicates observations for Alabama from 1975 onwards (what does $\delta_{rDD}$ measure?).
- There are many such MLDA experiments in the data, with various states switching to lower MLDAs over time.
- This multi-state DiD regression uses panel data for 14 years and 51 states (⇒ 14 × 51 = 714 observations).
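\[ Y_{st} = \alpha + \delta_{rDD}\, LEGAL_{st} + \sum_{k} \beta_k\, STATE_{ks} + \sum_{j} \gamma_j\, YEAR_{jt} + e_{st} \]
where:
- $Y_{st}$: death rate of 18-20-year-olds in state s at time t.
- $STATE_{ks}$: dummies for each state (state fixed effects).
- $YEAR_{jt}$: dummies for each year (year fixed effects).
- $LEGAL_{st}$: treatment variable.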
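A regression with state and year dummies can be sketched on a small made-up panel, assuming NumPy is available. Everything below is hypothetical: four states, five years, staggered switches, and a 0/1 version of the treatment variable. Outcomes are generated without noise so that the coefficient on the treatment variable is recovered exactly.

```python
import numpy as np

# Hypothetical panel: 4 states, 5 years, staggered switches of LEGAL from 0 to 1.
# Outcomes are built without noise so the regression recovers TRUE_EFFECT exactly.
TRUE_EFFECT = 7.5
state_effects = [50.0, 60.0, 55.0, 70.0]
year_effects = [0.0, -1.0, -2.5, -3.0, -4.0]
switch_year = [2, 3, None, 1]  # index of first treated year; None = never treated

rows, y = [], []
for s, a_s in enumerate(state_effects):
    for t, l_t in enumerate(year_effects):
        legal = 1.0 if switch_year[s] is not None and t >= switch_year[s] else 0.0
        # One dummy per state and per year, dropping the first of each as reference.
        state_dummies = [1.0 if s == k else 0.0 for k in range(1, len(state_effects))]
        year_dummies = [1.0 if t == j else 0.0 for j in range(1, len(year_effects))]
        rows.append([1.0, legal] + state_dummies + year_dummies)
        y.append(a_s + l_t + TRUE_EFFECT * legal)

X = np.array(rows)
coef, *_ = np.linalg.lstsq(X, np.array(y), rcond=None)
delta_hat = coef[1]  # coefficient on LEGAL
print(round(float(delta_hat), 6))
```

The state dummies absorb permanent level differences between states and the year dummies absorb shocks common to all states, so the treatment coefficient is identified only from within-state changes in LEGAL that deviate from the common year pattern.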
Treatment variable:
- Usually a dummy indicating whether the unit/state was exposed to treatment in a specific year.
- In our example, a continuous measure of the proportion of 18-20-year-olds allowed to drink in state s and year t.
- The variable captures that the MLDA varies from 18 to 21.
Results:
[Table 5.2: Regression DD estimates of MLDA effects on death rates.]
- Card and Krueger (1994) [5] analyze the effect of a rise in the New Jersey state minimum wage from $4.25 to $5.05 on April 1, 1992, on employment in fast-food restaurants:
  Research question: "Did the minimum wage increase harm employment in New Jersey?"
  Treatment group: fast-food restaurants in the US state of New Jersey (NJ).
  Control group: fast-food restaurants in the eastern part of neighbouring Pennsylvania (PA), where the minimum wage remained constant at $4.25.
Recall: The quality of the comparison group determines the quality of the evaluation!
[5] Card, D. and A.B. Krueger (1994), "Minimum Wages and Employment: A Case Study of the Fast-Food Industry in New Jersey and Pennsylvania", American Economic Review, 84(4), 772-793.
Example: Min. Wages and Employment
[Figure omitted; source: Wikipedia.]
- Data:
  Survey of about 400 fast-food stores in NJ and PA before and after the minimum wage increase:
  - before: February 1992
  - after: November 1992
- Findings:
  Variable                     PA        NJ        Difference (NJ − PA)
  FTE employment before        23.33     20.44     −2.89
                               (1.35)    (0.51)    (1.44)
  FTE employment after         21.17     21.03     −0.14
                               (0.94)    (0.52)    (1.07)
  Change in FTE employment     −2.16     0.59      2.76
                               (0.94)    (0.54)    (1.33)
  FTE = full-time equivalent; standard errors in parentheses.
  [Source: adapted from Card and Krueger (1994), Table 3.]
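The Difference column for the first two rows can be reproduced from the state means and their standard errors. The sketch below assumes the NJ and PA samples are independent, so that the variance of a difference is the sum of the two variances; the helper name `diff_with_se` is illustrative.

```python
from math import sqrt

# Means and standard errors as reported in the table above.
pa_before, pa_before_se = 23.33, 1.35
nj_before, nj_before_se = 20.44, 0.51
pa_after, pa_after_se = 21.17, 0.94
nj_after, nj_after_se = 21.03, 0.52

def diff_with_se(mean_a, se_a, mean_b, se_b):
    """Difference a - b and its standard error, assuming independent samples."""
    return mean_a - mean_b, sqrt(se_a ** 2 + se_b ** 2)

gap_before, gap_before_se = diff_with_se(nj_before, nj_before_se, pa_before, pa_before_se)
gap_after, gap_after_se = diff_with_se(nj_after, nj_after_se, pa_after, pa_after_se)
did = gap_after - gap_before  # DiD point estimate from the rounded cell means

print(f"NJ - PA before: {gap_before:.2f} ({gap_before_se:.2f})")
print(f"NJ - PA after:  {gap_after:.2f} ({gap_after_se:.2f})")
print(f"DiD from rounded means: {did:.2f}")
```

The rounded means give a DiD of 2.75, while the table reports 2.76; the small gap presumably reflects rounding of the cell means. The standard error of the DiD itself is not computed here, because the same stores appear in both waves and the independence assumption would not be appropriate across periods.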
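- Regression DiD:
\[ Y_{it} = \beta_0 + \beta_1\, NJ_i + \beta_2\, POST_t + \beta_3\, (NJ_i \times POST_t) + \varepsilon_{it} \]
where $NJ_i$ is a dummy for restaurants in New Jersey, $POST_t$ is a dummy for the November 1992 survey wave, and $\beta_3$ is the DiD coefficient.
- Card and Krueger (1994) also add further control variables $X_{it}$ to the baseline regression; $\beta_3$, however, remains positive and significant.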
[Source: Card and Krueger (2000), p. 1406. As in the original Card and Krueger survey, the administrative data show a slight
decline in employment from February to November 1992 in PA, and little change in NJ over the same period. However, the
data also reveal substantial year-to-year employment variation in other periods. These swings often seem to differ
substantially in the two states. In particular, while employment levels in NJ and PA were similar at the end of 1991,
employment in PA fell relative to employment in NJ over the next 3 years (especially in the 14-county group), mostly before
the 1996 increase in the federal minimum wage (to $4.75, which affected PA, but not NJ). So PA may not provide a very
good measure of counterfactual employment rates in NJ in the absence of a minimum wage change.]
Sensitivity Analysis
Standard Errors in DiD Strategies
[7] Bertrand, M., E. Duflo and S. Mullainathan (2004), "How Much Should We Trust Differences-In-Differences Estimates?", The Quarterly Journal of Economics, 119(1), 249-275.
- Many papers using a DiD strategy use data from many years (not only 1 pre and 1 post period):
  - the variables of interest in many of these setups vary only at the group level (say, state), and outcome variables are often serially correlated.
  ⇒ Bertrand, Duflo, and Mullainathan (2004) point out that, as a consequence, conventional standard errors often severely understate the standard deviation of the estimators.
- Bertrand et al. (2004) propose the following solutions:
  1. Block bootstrapping the standard errors (if you analyze states, the block should be the state, and you would sample whole states with replacement).
  2. Clustering standard errors at the group level (in Stata, one would simply add the option cl(state) to the regression command when analyzing state-level variation).
  3. Aggregating the data into one pre and one post period. Taken literally, this works only if there is a single treatment date. With staggered treatment dates, one should adopt the following procedure:
     1. regress $Y_{st}$ on state FE, year FE, and relevant covariates.
     2. obtain the residuals for the treatment states only and divide them into 2 groups: pre- and post-treatment.
     3. regress the two groups of residuals on a post dummy.
Note: Correct treatment of standard errors sometimes leaves a very small number of groups; in Card and Krueger (1994), for example, there are only 2 groups.
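The first solution can be sketched as follows. This is a minimal illustration on a hypothetical six-state panel with one common treatment date, not the exact procedure of Bertrand et al. (2004): each state's series is first collapsed into a pre and a post average (in the spirit of the third solution), and whole states are then resampled with replacement. Draws that happen to contain only treated or only control states are redrawn, which is a simplification made here for brevity.

```python
import random
from statistics import mean, stdev

# Hypothetical state-level panel: state -> (treated?, outcomes over 6 years).
# Years 0-2 are pre-treatment, years 3-5 post-treatment. Numbers are made up.
panel = {
    "A": (True,  [10.0, 11.0, 10.5, 14.0, 14.5, 15.0]),
    "B": (True,  [20.0, 20.5, 21.0, 23.0, 24.5, 24.0]),
    "C": (True,  [15.0, 15.5, 15.0, 19.0, 18.5, 19.5]),
    "D": (False, [12.0, 12.5, 13.0, 13.5, 13.0, 14.0]),
    "E": (False, [18.0, 18.0, 18.5, 19.0, 19.5, 19.0]),
    "F": (False, [9.0, 9.5, 10.0, 10.0, 10.5, 11.0]),
}
N_PRE = 3

def did(states):
    """DiD of mean pre/post changes; `states` may contain repeated draws."""
    def avg_change(flag):
        changes = [mean(panel[s][1][N_PRE:]) - mean(panel[s][1][:N_PRE])
                   for s in states if panel[s][0] == flag]
        return mean(changes)
    return avg_change(True) - avg_change(False)

def block_bootstrap_se(n_draws=500, seed=0):
    """Standard deviation of the DiD estimate across state-level resamples."""
    rng = random.Random(seed)
    names = list(panel)
    draws = []
    while len(draws) < n_draws:
        sample = rng.choices(names, k=len(names))  # whole states, with replacement
        if {panel[s][0] for s in sample} == {True, False}:  # need both groups
            draws.append(did(sample))
    return stdev(draws)

print(f"DiD = {did(list(panel)):.3f}, block-bootstrap SE = {block_bootstrap_se():.3f}")
```

Because a state's entire time series moves together in each resample, serial correlation within states is preserved. With only six states the resulting standard error is of course very noisy, which echoes the note above about small numbers of groups.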
Example: Fukushima and House Prices
- Similarly, you can add lags to analyze whether the treatment effect changes over time after the treatment.
where:
- treatment occurs in year τ = 0
- regression includes q leads
- regression includes m lags
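A regression with q leads and m lags could be written along the following lines. This is a sketch in assumed notation, not necessarily the specification used on the slides: \alpha_i and \lambda_t are unit and year effects, TREAT_i marks treated units, \mathbf{1}[t = \tau] is an indicator for event year \tau, and years outside the window from -q to m serve as the reference period.

```latex
\[
Y_{it} = \alpha_i + \lambda_t
  + \sum_{\tau=-q}^{-1} \delta_{\tau}\,\bigl(TREAT_i \times \mathbf{1}[t = \tau]\bigr)
  + \sum_{\tau=0}^{m} \delta_{\tau}\,\bigl(TREAT_i \times \mathbf{1}[t = \tau]\bigr)
  + \varepsilon_{it}
\]
```

Under common trends the lead coefficients \delta_{-q}, \dots, \delta_{-1} should be close to zero, while the lag coefficients \delta_{0}, \dots, \delta_{m} trace out how the treatment effect evolves after treatment.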
[8] Bauer, T.K., S. Braun and M. Kvasnicka (2017), "Nuclear Power Plant Closures and Local Housing Values: Evidence from Fukushima and the German Housing Market", Journal of Urban Economics, 99, 94-106.
- Key assumption:
  - Prices of houses in the treatment and control groups would have followed the same time trend in the absence of the Fukushima accident.
[Source: Bauer et al. (2017), p. 102.]
4. Synthetic Control Method
Basic idea:
- a combination of units (often) provides a better comparison for the unit exposed to the intervention than any single unit alone.
[10] Abadie, A. and J. Gardeazabal (2003), "The Economic Costs of Conflict: A Case Study of the Basque Country", American Economic Review, 93(1), 113-132.
Example: Terror and Growth
- The weights are chosen so that the synthetic Basque Country most closely resembles the actual Basque Country before terrorism.
- Let:
  - $X_1$ be a (K × 1) vector of K pre-terrorism economic growth predictors in the Basque Country (i.e. the values in the previous table: investment ratio, population density, ...).
  - $X_0$ be a (K × J) matrix which contains the values of the same variables for the J possible control regions.
  - $V$ be a diagonal matrix with nonnegative components reflecting the relative importance of the different growth predictors.
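With these definitions, the choice of weights can be stated as a constrained minimization problem, following the standard formulation of the synthetic control method. The (J × 1) weight vector W with elements w_j is notation added here for illustration.

```latex
\[
W^{*} = \arg\min_{W}\; (X_1 - X_0 W)'\, V\, (X_1 - X_0 W)
\quad \text{subject to} \quad
w_j \ge 0 \;\; (j = 1, \dots, J), \qquad \sum_{j=1}^{J} w_j = 1 .
\]
```

The constraints make the synthetic region a weighted average of the control regions, so it never extrapolates beyond the range of the observed units. The diagonal entries of V determine how heavily a mismatch in each growth predictor is penalized.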
Readings
*** [MHE] Angrist, J.D., and J.S. Pischke (2009), Mostly Harmless Econometrics: An Empiricist's Companion, Princeton University Press. [Chapter 5]
*** [MM] Angrist, J.D., and J.S. Pischke (2014), Mastering 'Metrics: The Path from Cause to Effect, Princeton University Press. [Chapter 5]