Evaluating Randomized Management Experiments

The lecture discusses policy and management evaluation methods, focusing on randomized experiments, fixed effects, and panel data. It highlights the importance of random assignment in eliminating selection bias and presents a case study of Ctrip's experiment on working from home, which showed increased productivity. The lecture also covers the use of fixed effects in regression analysis to control for omitted variable bias and the structure of panel data.


Policy and Management Evaluation Methods

Lecture 7

Prof. Dr. Fabian Kosse


Chair of Data Science in Business & Economics

1
From last week:
▪ Because randomly assigned treatment and control groups come from
the same underlying population, they are, in expectation, the same in
every way, including their expected $Y_{0i}$
▪ I.e. $E[Y_{0i} \mid D_i = 0]$ and $E[Y_{0i} \mid D_i = 1]$ are the same
if treatment $D_i$ is randomly assigned. Therefore, with a constant
treatment effect $\kappa$ (so that $Y_{1i} = Y_{0i} + \kappa$):
$$E[Y_{1i} \mid D_i = 1] - E[Y_{0i} \mid D_i = 0]$$
$$= E[Y_{0i} + \kappa \mid D_i = 1] - E[Y_{0i} \mid D_i = 0]$$
$$= \kappa + E[Y_{0i} \mid D_i = 1] - E[Y_{0i} \mid D_i = 0]$$
$$= \kappa$$

▪ The last step follows from random assignment
▪ Hence, random assignment eliminates selection bias
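
The argument can be illustrated with a small simulation. This is only a sketch under assumed values: the treatment effect `KAPPA`, the sample size `N` and the distribution of $Y_{0i}$ are invented for illustration and do not come from the lecture. It compares the difference in means under random assignment with the difference in means when individuals with high $Y_{0i}$ select into treatment.

```python
import random
import statistics

random.seed(0)
KAPPA = 2.0      # assumed constant treatment effect (illustrative)
N = 20_000       # assumed sample size (illustrative)

# Potential outcome without treatment, Y0, for each individual
y0 = [random.gauss(10.0, 3.0) for _ in range(N)]

def diff_in_means(treated):
    """Mean observed outcome of the treated minus that of the untreated."""
    y_treat = [y + KAPPA for y, d in zip(y0, treated) if d]
    y_ctrl = [y for y, d in zip(y0, treated) if not d]
    return statistics.mean(y_treat) - statistics.mean(y_ctrl)

# (a) Random assignment: a coin flip decides who is treated
random_d = [random.random() < 0.5 for _ in range(N)]
# (b) Self-selection: individuals with a high Y0 choose treatment
selected_d = [y > 10.0 for y in y0]

est_random = diff_in_means(random_d)
est_selected = diff_in_means(selected_d)

print(f"true effect:       {KAPPA:.2f}")
print(f"random assignment: {est_random:.2f}")
print(f"self-selection:    {est_selected:.2f}")
```

Under random assignment the estimate should be close to the assumed effect; under self-selection it also picks up the difference in $Y_{0i}$ between the two groups, which is the selection bias term.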

2
The first 5 to appear in Dirndl or Lederhosen will
receive a free drink

3
Lecture Outline

▪ Another example of randomized experiments in firms
▪ Fixed effects
▪ Panel data

4
Randomized Experiments

▪ Randomized experiments are a powerful tool to understand
causal effects and very useful (but probably underused?) in
firms
▪ Arguments against RCTs
  ▪ Ethical issues?
  ▪ Implementable?
▪ Experiments inform firms about:
  ▪ Customer behaviour (see tutorial: eBay)
  ▪ Management practices (this week)
  ▪ …
6
Working From Home

▪ Working from home is becoming more important:

Source: Bloom et al. (2015)

7
Selection Bias if we used Observational Data

▪ Why can’t we simply compare outcomes (e.g. productivity,
promotion prospects, …) of individuals who work from home
(WFH) to those who do not?
▪ Those who selected into working from home may be more or
less productive to start with: classical selection bias
▪ Ctrip (a leading travel agency in China) decided to run an
experiment to understand the causal effect of WFH
▪ Teamed up with economists at Stanford University (Nick Bloom,
James Liang, John Roberts, Zhichun Jenny Ying) to run the
experiment

8
Ctrip
▪ Ctrip is a leading travel agency in China

▪ Worth about $5 billion at the time of the experiment

9
Ctrip – English Website

10
Experimental Design
▪ Shanghai call centre workers were asked whether they wanted to
change their work arrangements from
  ▪ 5 days a week in the office, to
  ▪ 4 days at home and 1 day in the office
▪ 994 workers were asked whether they wanted to work from home; 503
volunteered for the experiment
▪ Why not simply compare the 503 individuals to the rest?
▪ Restricting the sample: among the volunteers, only those who had
worked at least 6 months with the company, had broadband internet, and
had an independent workspace at home were allowed to participate (249
individuals)
▪ The 249 individuals were randomly allocated to a treatment group
(WFH) and a control group (continued to work in the office)

11
Working From Home

12
Who Volunteers To WFH?

13
Productivity Over Time

14
How to Evaluate the Experiment?

▪ The simplest way to evaluate the experiment would be to
compare productivity differences between the treatment and
control group during the experiment:

$$\text{Productivity}_i = \beta_0 + \beta_1 \text{Treatment}_i + \varepsilon_i$$
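
With a single binary regressor, the OLS slope in this regression equals the difference in mean productivity between the treatment and the control group, and the intercept equals the control group mean. The sketch below checks this on simulated data; the data and the effect size of 0.2 are hypothetical and are not the Ctrip data.

```python
import random
import statistics

random.seed(1)

# Hypothetical data: 1 = treatment (WFH), 0 = control (office)
treatment = [random.randint(0, 1) for _ in range(500)]
productivity = [0.2 * d + random.gauss(0.0, 1.0) for d in treatment]

def ols_simple(x, y):
    """OLS intercept and slope for y = b0 + b1 * x + e (one regressor)."""
    x_bar, y_bar = statistics.mean(x), statistics.mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1

beta0, beta1 = ols_simple(treatment, productivity)

mean_treat = statistics.mean([p for p, d in zip(productivity, treatment) if d == 1])
mean_ctrl = statistics.mean([p for p, d in zip(productivity, treatment) if d == 0])

print(f"OLS slope beta1:        {beta1:.4f}")
print(f"difference in means:    {mean_treat - mean_ctrl:.4f}")
print(f"OLS intercept beta0:    {beta0:.4f}")
print(f"control group mean:     {mean_ctrl:.4f}")
```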

15
Regression Results

16
What Did the Firm Do After Seeing the Results?

▪ The experimental results indicate that productivity increased by
about 18% of a standard deviation in the treatment group
▪ Moreover, the control group did not do worse than call centre
workers in another location (this rules out that the effect is driven
by reduced motivation in the control group)
▪ WFH caused higher productivity and lower costs for the
company (because office space is expensive in Shanghai)
→ After seeing the results, the company rolled out voluntary WFH to
the whole company

17
Lecture Outline

▪ Another example of randomized experiments in firms
▪ Fixed effects
▪ Panel data

18
Fixed Effects

▪ A while ago we introduced indicator/dummy variables
▪ Including indicator/dummy variables as control variables can be
useful because they (like all other control variables):
  ▪ may reduce omitted variable bias
  ▪ may reduce standard errors (because they reduce residual variation)
▪ A large set of indicator variables is sometimes called fixed
effects (FE)
▪ Let’s look at an example using the World Management Survey
data (Bloom & Van Reenen) and estimate the regression with and
without country FE
▪ Relation: management quality and sales
19
Regression without Fixed Effects
Without FE
      Source |       SS           df       MS      Number of obs   =       732
-------------+----------------------------------   F(1, 730)       =     39.98
       Model |  76.4763017         1  76.4763017   Prob > F        =    0.0000
    Residual |   1396.4909       730  1.91300124   R-squared       =    0.0519
-------------+----------------------------------   Adj R-squared   =    0.0506
       Total |   1472.9672       731  2.01500302   Root MSE        =    1.3831

------------------------------------------------------------------------------
          ls |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
 amanagement |   .4287287   .0678073     6.32   0.000      .295608    .5618494
       _cons |   10.59861   .2237934    47.36   0.000     10.15925    11.03796
------------------------------------------------------------------------------

▪ Why could country FEs address omitted variable bias?

20
Regression with Country Fixed Effects
With FE
Omitted category: US (base category)
------------------------------------------------------------------------------
          ls |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
 amanagement |   .3520594   .0659144     5.34   0.000     .2226542    .4814647
    ccfrance |  -.8659832   .1391211    -6.22   0.000     -1.13911   -.5928561
   ccgermany |  -.0933134   .1320019    -0.71   0.480    -.3524639    .1658371
        ccuk |  -.8167841   .1346197    -6.07   0.000    -1.081074   -.5524943
       _cons |   11.19305   .2321523    48.21   0.000     10.73728    11.64882
------------------------------------------------------------------------------

21
Fixed Effects

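▪ The coefficient on management quality falls from about 0.43 to
about 0.35 after including the country FEs, suggesting that the
regression without FE suffered from omitted variable bias
▪ The standard error of the management coefficient is also slightly
lower (0.066 instead of 0.068)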

22
Lecture Outline

▪ Another example of randomized experiments in firms
▪ Fixed effects
▪ Panel data (& FEs)

23
Panel Data
▪ Panel or longitudinal datasets contain observations that vary along
two dimensions:
  ▪ i: cross-sectional level, e.g. individuals, firms, countries (total: N)
  ▪ t: time level (total: T)
▪ The individuals (or firms, countries) are observed for multiple time
periods
▪ One distinguishes between different types of panels:
  ▪ Balanced: all individuals are observed in all time periods → number
  of observations: N × T
  ▪ Unbalanced: (some) individuals are only observed in some time periods
▪ The concern with unbalanced panels is that observations may not be
missing at random (the same concern arises if the panel is artificially
balanced by dropping all individuals with incomplete observations)
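
Whether a panel is balanced can be checked directly from the list of (individual, period) pairs. The following is a minimal sketch; the toy data and the helper name `is_balanced` are made up for illustration.

```python
from collections import defaultdict

# Hypothetical long-format panel: one (individual, year) pair per observation
observations = [
    ("A", 2001), ("A", 2002), ("A", 2003),
    ("B", 2001), ("B", 2002), ("B", 2003),
    ("C", 2001), ("C", 2003),          # C is not observed in 2002
]

def is_balanced(obs):
    """True if every individual is observed in every time period."""
    periods_by_unit = defaultdict(set)
    for unit, period in obs:
        periods_by_unit[unit].add(period)
    all_periods = set().union(*periods_by_unit.values())
    return all(p == all_periods for p in periods_by_unit.values())

n_units = len({unit for unit, _ in observations})        # N = 3
n_periods = len({period for _, period in observations})  # T = 3

print(is_balanced(observations))      # False: 8 observations, but N x T = 9
print(is_balanced(observations[:6]))  # True: only A and B, 2 x 3 = 6 observations
```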

24
Panel Data – Model
▪ The inclusion of fixed effects allows us to control for all
unobserved factors that vary across cross-sectional units (e.g.
individuals) but do not vary over time

▪ Think of the following model:

$$y_{it} = \beta_0 + \beta_1 x_{it} + c_i + \varepsilon_{it}$$

▪ You have data that vary across individuals ($i$) and over time ($t$)

▪ You are interested in estimating $\beta_1$

▪ $\varepsilon_{it}$ is an error term that satisfies the “Gauss-Markov assumptions”

▪ $c_i$ is a component that affects the outcome and varies across
individuals but not over time (→ estimation uses only
within-individual variation over time)

25
Panel Data – Example

▪ Suppose you wanted to estimate the effect of being married on
wages (why should that have an effect?)

$$\ln(\text{wage})_{it} = \beta_0 + \beta_1 \text{Married}_{it} + \gamma_t + c_i + \varepsilon_{it}$$

▪ In the example we also control for time FE ($\gamma_t$), which vary
over time but not across individuals, because over time individuals
grow older, earn more and also get married
▪ Do we expect the estimate of $\beta_1$ to be upward or downward
biased if we do not control for $c_i$?
▪ OVB formula: $E(\hat{\beta}_1) = \beta_1 + \beta_2 \frac{Cov(x_1, x_2)}{Var(x_1)}$,
where $x_1$ is the included regressor and $x_2$ is the omitted variable
with coefficient $\beta_2$
▪ E.g. $x_2$ = health status!

26
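
The OVB formula can be checked numerically. In the sketch below all numbers are assumptions chosen for illustration (`BETA1`, `BETA2`, and the 0.6 loading that makes `x1` and `x2` positively correlated). The outcome is generated without noise so that the formula holds exactly in the sample, with the population covariance and variance replaced by their sample counterparts.

```python
import random
import statistics

random.seed(2)
BETA1, BETA2 = 0.1, 0.5   # assumed true coefficients (illustrative)

n = 1000
# x2: omitted variable (e.g. a health index); x1: included regressor,
# constructed to be positively correlated with x2
x2 = [random.gauss(0.0, 1.0) for _ in range(n)]
x1 = [0.6 * h + random.gauss(0.0, 1.0) for h in x2]
# Noise-free outcome, so the OVB formula holds exactly in the sample
y = [BETA1 * a + BETA2 * b for a, b in zip(x1, x2)]

def cov(u, v):
    """Sample covariance of u and v."""
    u_bar, v_bar = statistics.mean(u), statistics.mean(v)
    return sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v)) / (len(u) - 1)

# Slope from the "short" regression of y on x1 only (x2 omitted)
short_slope = cov(x1, y) / cov(x1, x1)
# Prediction of the OVB formula
ovb_prediction = BETA1 + BETA2 * cov(x1, x2) / cov(x1, x1)

print(f"short regression slope: {short_slope:.4f}")
print(f"OVB formula:            {ovb_prediction:.4f}")
print(f"true beta1:             {BETA1:.4f}")
```

Because both $\beta_2$ and the covariance are positive in this constructed example, the short regression overstates $\beta_1$.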
1) Estimating FE with Dummies

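▪ Conceptually the easiest way of estimating the FE model
would be to include one dummy variable for each individual in
the model:
$$y_{it} = \beta_0 + \beta_1 x_{it} + \delta_1 \mathbf{1}[\text{Indiv. } 1] + \delta_2 \mathbf{1}[\text{Indiv. } 2] + \dots + \delta_N \mathbf{1}[\text{Indiv. } N] + \varepsilon_{it}$$

▪ Using the summation sign notation:

$$y_{it} = \beta_0 + \beta_1 x_{it} + \sum_{j=1}^{N} \delta_j \mathbf{1}[\text{Indiv. } j] + \varepsilon_{it}$$

▪ The dummy coefficients $\delta_j$ take the role of $c_i$ (in practice
one dummy is left out as the base category to avoid perfect
collinearity with the constant)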

27
1) Estimating FE with Dummies - Example

▪ We use data from the National Longitudinal Survey of Youth
1979 (NLSY79), which has surveyed 12,686 individuals since 1979
▪ We want to understand the effect of being married on income
▪ First estimate the following model:
$$\ln(\text{wage})_{it} = \beta_0 + \beta_1 \text{Married}_{it} + \gamma_t + u_{it}$$

28
Regression Results – Only Time FE

      Source |       SS           df       MS      Number of obs   =    63,332
-------------+----------------------------------   F(19, 63312)    =   1538.85
       Model |   34758.029        19  1829.36995   Prob > F        =    0.0000
    Residual |  75264.4744    63,312  1.18878687   R-squared       =    0.3159
-------------+----------------------------------   Adj R-squared   =    0.3157
       Total |  110022.503    63,331  1.73726143   Root MSE        =    1.0903

------------------------------------------------------------------------------
   ln_income |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     married |    .198981     .00924    21.53   0.000     .1808706    .2170915
     year_d1 |   -.203491   .0853035    -2.39   0.017     -.370686   -.0362959
     year_d2 |  -.0381279   .0846518    -0.45   0.652    -.2040455    .1277896
     year_d3 |   .0654794    .084215     0.78   0.437    -.0995821    .2305409
     year_d4 |    .230483   .0839114     2.75   0.006     .0660165    .3949496
     year_d5 |   .3385567   .0837646     4.04   0.000      .174378    .5027355

▪ What does the regression coefficient on married mean?

29
Regression Results – Now Include Individual FE

------------------------------------------------------------------------------
   ln_income |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     married |   .1108063   .0098053    11.30   0.000     .0915879    .1300247
     year_d1 |  -2.523614   .0246953  -102.19   0.000    -2.572017   -2.475211
     year_d2 |  -2.306119   .0230091  -100.23   0.000    -2.351217   -2.261021
     year_d3 |  -2.187204   .0218349  -100.17   0.000    -2.230001   -2.144408
     year_d4 |  -1.985332   .0209646   -94.70   0.000    -2.026423   -1.944241
     year_d5 |  -1.861583   .0205182   -90.73   0.000    -1.901799   -1.821367

▪ Note: Individual FEs not shown in the table
▪ Was our intuition of the omitted variable bias correct?

30
2) Within Estimator
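▪ A second (and the most common) approach to estimating such
models is the so-called within estimator
▪ Model:
$$y_{it} = \beta_0 + \beta_1 x_{it} + c_i + \varepsilon_{it} \quad (1)$$

▪ Step 1: Average the estimating equation over $t = 1, \dots, T$:

$$\bar{y}_i = \beta_0 + \beta_1 \bar{x}_i + \bar{c}_i + \bar{\varepsilon}_i \quad (2)$$
where $\bar{y}_i = \frac{1}{T}\sum_{t=1}^{T} y_{it}$, $\bar{x}_i = \frac{1}{T}\sum_{t=1}^{T} x_{it}$, $\bar{c}_i = c_i$, $\bar{\varepsilon}_i = \frac{1}{T}\sum_{t=1}^{T} \varepsilon_{it}$

▪ Step 2: Subtract equation (2) from (1) to get:

$$y_{it} - \bar{y}_i = \beta_1 (x_{it} - \bar{x}_i) + (c_i - \bar{c}_i) + (\varepsilon_{it} - \bar{\varepsilon}_i)$$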

31
2) Within Estimator

▪ Step 2: Subtract equation (2) from (1) to get:

$$y_{it} - \bar{y}_i = \beta_1 (x_{it} - \bar{x}_i) + (c_i - \bar{c}_i) + (\varepsilon_{it} - \bar{\varepsilon}_i)$$

▪ This can be rewritten as:

$$\tilde{y}_{it} = \beta_1 \tilde{x}_{it} + \tilde{\varepsilon}_{it}$$
where $\tilde{y}_{it} = y_{it} - \bar{y}_i$, $\tilde{x}_{it} = x_{it} - \bar{x}_i$, $\tilde{\varepsilon}_{it} = \varepsilon_{it} - \bar{\varepsilon}_i$, and $(c_i - \bar{c}_i) = 0$

▪ Step 3: Run a regression of $\tilde{y}_{it}$ on $\tilde{x}_{it}$ using OLS

▪ In practice, you do not have to manually carry out these steps
because the statistical software will do them for you
▪ This regression is computationally easier to estimate than the
one with lots of FE
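
The three steps can be sketched on simulated data. Everything in this example is an assumption made for illustration (the true slope `BETA1`, the panel dimensions, and the way $x_{it}$ is correlated with $c_i$). It shows that pooled OLS, which ignores $c_i$, is biased in this setup, while the regression on demeaned data comes close to the assumed slope.

```python
import random
import statistics

random.seed(3)
BETA1 = 0.5            # assumed true slope (illustrative)
N, T = 200, 5          # assumed panel dimensions (illustrative)

# Simulated balanced panel in which x_it is correlated with c_i
rows = []              # each row: (individual id, x_it, y_it)
for i in range(N):
    c_i = random.gauss(0.0, 2.0)
    for _ in range(T):
        x = c_i + random.gauss(0.0, 1.0)
        y = 1.0 + BETA1 * x + c_i + random.gauss(0.0, 1.0)
        rows.append((i, x, y))

def slope(x, y):
    """OLS slope of y on x (regression with a constant)."""
    x_bar, y_bar = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    return sxy / sum((a - x_bar) ** 2 for a in x)

# Pooled OLS ignores c_i
pooled = slope([r[1] for r in rows], [r[2] for r in rows])

# Steps 1 and 2: subtract each individual's time average
x_tilde, y_tilde = [], []
for i in range(N):
    xs = [r[1] for r in rows if r[0] == i]
    ys = [r[2] for r in rows if r[0] == i]
    x_bar_i, y_bar_i = statistics.mean(xs), statistics.mean(ys)
    x_tilde += [x - x_bar_i for x in xs]
    y_tilde += [y - y_bar_i for y in ys]

# Step 3: OLS of demeaned y on demeaned x
within = slope(x_tilde, y_tilde)

print(f"true beta1: {BETA1:.3f}")
print(f"pooled OLS: {pooled:.3f}")
print(f"within:     {within:.3f}")
```

Note that this sketch only reproduces the coefficient; standard errors from a hand-made regression on demeaned data would still need a degrees-of-freedom correction, which statistical software applies automatically.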
32
Regression Results – Within Estimator

------------------------------------------------------------------------------
   ln_income |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     married |   .1108063   .0098053    11.30   0.000     .0915879    .1300247
     year_d1 |  -2.523614   .0246953  -102.19   0.000    -2.572017   -2.475211
     year_d2 |  -2.306119   .0230091  -100.23   0.000    -2.351217   -2.261021
     year_d3 |  -2.187204   .0218349  -100.17   0.000    -2.230001   -2.144408
     year_d4 |  -1.985332   .0209646   -94.70   0.000    -2.026423   -1.944241
     year_d5 |  -1.861583   .0205182   -90.73   0.000    -1.901799   -1.821367

▪ The within estimator and the model that includes all FEs as dummies
always give the same coefficient estimates

▪ Estimating the model with a lot of fixed effects as dummies, however,
is computationally very demanding
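
The equivalence of the two approaches can be verified on a tiny simulated panel. This is a sketch under assumptions: the data are invented, the helper functions `solve` and `ols` are written here only for this illustration, and the dummy regression is run with one dummy per individual and no constant (equivalent to a constant plus N − 1 dummies).

```python
import random

random.seed(4)
N, T = 4, 3   # small hypothetical panel so the dummy regression stays tiny

ids, x, y = [], [], []
for i in range(N):
    c_i = random.gauss(0.0, 1.0)
    for _ in range(T):
        x_it = c_i + random.gauss(0.0, 1.0)
        ids.append(i)
        x.append(x_it)
        y.append(0.5 * x_it + c_i + random.gauss(0.0, 1.0))

def solve(a, b):
    """Solve the linear system a z = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[k][n] / m[k][k] for k in range(n)]

def ols(columns, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    xtx = [[sum(p * q for p, q in zip(c1, c2)) for c2 in columns] for c1 in columns]
    xty = [sum(p * q for p, q in zip(c, y)) for c in columns]
    return solve(xtx, xty)

# 1) Dummy-variable regression: x plus one dummy per individual (no constant)
dummies = [[1.0 if j == i else 0.0 for j in ids] for i in range(N)]
beta_dummies = ols([x] + dummies, y)[0]

# 2) Within estimator: demean x and y by individual, OLS without constant
def demean(v):
    means = {i: sum(a for a, j in zip(v, ids) if j == i) / T for i in range(N)}
    return [a - means[j] for a, j in zip(v, ids)]

beta_within = ols([demean(x)], demean(y))[0]

print(f"dummy regression: {beta_dummies:.6f}")
print(f"within estimator: {beta_within:.6f}")
```

The dummy regression has to solve a system with N + 1 unknowns, whereas the within regression has a single unknown, which is why the latter scales much better when N is large.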

33
3) First Differences

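▪ Yet another alternative for estimating fixed effects models is
to use first differences
▪ Model:
$$y_{it} = \beta_0 + \beta_1 x_{it} + c_i + \varepsilon_{it} \quad (1)$$

▪ Step 1: Lagging the model by one period gives:

$$y_{i,t-1} = \beta_0 + \beta_1 x_{i,t-1} + c_i + \varepsilon_{i,t-1} \quad (2)$$

▪ Step 2: Subtract equation (2) from (1) to get:

$$y_{it} - y_{i,t-1} = \beta_1 (x_{it} - x_{i,t-1}) + (c_i - c_i) + (\varepsilon_{it} - \varepsilon_{i,t-1})$$

▪ This can be rewritten as:

$$\Delta y_{it} = \beta_1 \Delta x_{it} + \Delta \varepsilon_{it}$$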
34
3) First Differences

▪ The first-differences estimator is the pooled OLS regression of
$\Delta y_{it}$ on $\Delta x_{it}$
▪ First differencing eliminates $c_i$
▪ When we take first differences we lose the first time period for
each cross-sectional unit: we now have $T - 1$ periods instead of $T$
▪ Under the “Gauss-Markov assumptions” it is less efficient than the
within estimator (under some violations, e.g. strongly serially
correlated errors, it is more efficient)
▪ With just two time periods the first-differences estimator is the
same as the within estimator
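
The last point can be checked numerically. The sketch below uses simulated data with two periods per individual; the sample size, the true slope of 0.5 and the data-generating process are assumptions for illustration, and both regressions are run without a constant, as in the differenced and demeaned equations.

```python
import random

random.seed(5)
N = 300   # assumed number of individuals, each observed in periods 1 and 2

x1, x2, y1, y2 = [], [], [], []
for _ in range(N):
    c_i = random.gauss(0.0, 2.0)
    a = c_i + random.gauss(0.0, 1.0)   # x in period 1
    b = c_i + random.gauss(0.0, 1.0)   # x in period 2
    x1.append(a)
    x2.append(b)
    y1.append(0.5 * a + c_i + random.gauss(0.0, 1.0))
    y2.append(0.5 * b + c_i + random.gauss(0.0, 1.0))

def slope_no_const(x, y):
    """OLS slope of y on x without a constant."""
    return sum(p * q for p, q in zip(x, y)) / sum(p * p for p in x)

# First differences: regress (y2 - y1) on (x2 - x1)
dx = [b - a for a, b in zip(x1, x2)]
dy = [b - a for a, b in zip(y1, y2)]
beta_fd = slope_no_const(dx, dy)

# Within estimator: subtract individual means, then stack both periods
x_bar = [(a + b) / 2 for a, b in zip(x1, x2)]
y_bar = [(a + b) / 2 for a, b in zip(y1, y2)]
x_tilde = [a - m for a, m in zip(x1, x_bar)] + [b - m for b, m in zip(x2, x_bar)]
y_tilde = [a - m for a, m in zip(y1, y_bar)] + [b - m for b, m in zip(y2, y_bar)]
beta_within = slope_no_const(x_tilde, y_tilde)

print(f"first differences: {beta_fd:.6f}")
print(f"within estimator:  {beta_within:.6f}")
```

With $T = 2$ the demeaned values are just plus and minus half of the first difference, so the two slopes coincide exactly; with more periods they generally differ.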

35
Variables That do Not Vary Over Time
▪ Once you include individual FE in the model you cannot estimate
effects of variables that do not vary over time (e.g. gender)
▪ Model without individual FE:
------------------------------------------------------------------------------
   ln_income |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     married |   .2881853   .0062954    45.78   0.000     .2758465    .3005241
      female |  -.4373824   .0059733   -73.22   0.000    -.4490899   -.4256749
     year_d1 |  -.0487806   .0597394    -0.82   0.414    -.1658687    .0683074
     year_d2 |   .1698219   .0594252     2.86   0.004     .0533496    .2862941

▪ Model with FE:
------------------------------------------------------------------------------
   ln_income |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     married |    .128708   .0069228    18.59   0.000     .1151396    .1422765
      female |          0  (omitted)
     year_d1 |  -2.621585   .0170619  -153.65   0.000    -2.655026   -2.588144
     year_d2 |  -2.376223   .0160948  -147.64   0.000    -2.407768   -2.344677
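
Why the software omits `female` can be seen by applying the within transformation to a variable that is constant over time: after subtracting the individual mean, nothing is left to estimate from. A minimal sketch with a made-up toy panel (the helper name `demean_by_individual` is hypothetical):

```python
# Hypothetical panel: 3 individuals, each observed in 2 periods
ids     = [1, 1, 2, 2, 3, 3]
female  = [1, 1, 0, 0, 1, 1]     # does not vary over time
married = [0, 1, 0, 0, 1, 1]     # varies over time for individual 1 only

def demean_by_individual(values, ids):
    """Subtract each individual's time average (the within transformation)."""
    totals, counts = {}, {}
    for v, i in zip(values, ids):
        totals[i] = totals.get(i, 0) + v
        counts[i] = counts.get(i, 0) + 1
    return [v - totals[i] / counts[i] for v, i in zip(values, ids)]

female_tilde = demean_by_individual(female, ids)
married_tilde = demean_by_individual(married, ids)

print(female_tilde)    # [0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
print(married_tilde)   # [-0.5, 0.5, 0.0, 0.0, 0.0, 0.0]
```

The same toy data also show that the coefficient on `married` is identified only from individuals whose marital status changes during the observation period.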
36
References

▪ Bloom, N., Liang, J., Roberts, J. & Ying, Z. J. (2015). Does Working
from Home Work? Evidence from a Chinese Experiment. The
Quarterly Journal of Economics, 130(1), 165-218.

37
