
Understanding Impact Evaluation Methods

The document provides a comprehensive overview of impact evaluation (IE), defining its purpose, methodology, and importance in assessing the effects of interventions. It distinguishes between different types of evaluations, such as prospective and retrospective, and discusses various methodologies like randomization, regression discontinuity, and propensity score matching. Additionally, it highlights the significance of IE in accountability, learning, and advocacy for effective development interventions.

Uploaded by

planerpop

7/19/2024

IMPACT EVALUATION

Organized by Mr. Abdul Kilima

Definition: Impact

• Measure of the tangible and intangible effects (consequences) of one thing's or entity's
action or influence upon another.

• Impact is any effect of the service or of an event or initiative on an individual or group.
The impact can be positive or negative and may be intended or unintended.

• Measuring impact is about identifying and evaluating change.

Impact vs Impact evaluation

• Systematically and empirically identifying the effects resulting from an intervention, be they
intended or unintended, direct or indirect. Impacts are usually understood to occur later
than, and as a result of, intermediate outcomes.

• Impact evaluation goes beyond considering what agencies are doing to what happens as a
result of these activities, and the extent to which these interventions are indeed making a
difference in the lives of people and the conditions in which they live.

An impact evaluation measures whether these results were accomplished.

Impact Evaluation (cont.)


Impact evaluations are empirical studies that quantify the causal link between interventions
and outcomes of interest. This is far different from traditional process evaluations that are
concerned with characterizing how a project was implemented.

IEs are based on analysis of what happened with the intervention compared with an
empirically estimated scenario of what would have happened in the absence of the
intervention (the “counterfactual”).

Impacts = Outcomes - What would have happened anyway

This difference between the observed outcomes and the counterfactual outcomes is the
measure of impact, i.e., the difference that can be attributed to the intervention

IE is unique in that it is data driven and attempts to minimize unverifiable assumptions when
attributing effects. A core concept is that identified impacts are assessed not only in
magnitude, but also in terms of statistical significance.
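
The relation "Impacts = Outcomes - What would have happened anyway" can be illustrated with a minimal Python sketch. It assumes a comparison group stands in for the counterfactual; the data values and the function name estimate_impact are hypothetical. It reports both the magnitude of the impact and a t-statistic as a rough indication of statistical significance.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical outcome values, made up for illustration only.
with_intervention = [12.0, 14.0, 13.0, 15.0, 16.0]  # observed outcomes
counterfactual = [10.0, 11.0, 12.0, 10.0, 12.0]     # comparison group standing in for the counterfactual


def estimate_impact(observed, comparison):
    """Return (impact, t_statistic) for a difference in mean outcomes."""
    impact = mean(observed) - mean(comparison)
    # Welch standard error: does not assume equal variances across the two groups.
    se = sqrt(variance(observed) / len(observed)
              + variance(comparison) / len(comparison))
    return impact, impact / se


impact, t_stat = estimate_impact(with_intervention, counterfactual)
print(f"impact = {impact:.2f}, t = {t_stat:.2f}")
```

With these made-up numbers the estimated impact is 3.00 with a t-statistic of about 3.59. How credible such an estimate is depends entirely on how well the comparison group approximates the counterfactual.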

Types of Impact Evaluation

Impact evaluations can be divided into two categories: prospective and retrospective.

• Prospective evaluations are developed at the same time as the program is being designed
and are built into program implementation. Baseline data are collected prior to program
implementation for both treatment and comparison groups.

• Retrospective impact evaluations assess program impact after the program has been
implemented, generating treatment and comparison groups ex-post.

• In general, prospective impact evaluations are more likely to produce strong and credible
evaluation results.

Purpose of IE

IE, like other forms of evaluation, has two principal purposes.
• The first is accountability, so as to ensure that development actions actually lead to
development outcomes.
• The second is learning, so as to offer an evidence base for selecting and designing
development interventions that are likely to be effective in fostering outcomes of interest.

Other purposes are:
• Advocacy: impact evaluation can be used to advocate for particular programs and policies
which are supported by evidence
• Allocation: to reduce or stop funding for those found to not be effective
• Validating the Theory of Change

When to use Impact Evaluation?

Evaluate impact when project is:
• Innovative
• Replicable/scalable
• Strategically relevant for reducing poverty
• Evaluation will fill knowledge gap
• Substantial policy impact
• Use evaluation within a program to test alternatives and improve programs

Why is IE important?

• Demonstrate success (to donors, ourselves, the public)
• Learn to understand how our efforts impact on local communities in order to improve the
effectiveness of our interventions
• Be accountable to the people (stakeholders)
• Use the findings from impact evaluation to advocate for changes in behavior, attitudes, policy
and legislation at all levels

Challenges of IE

• Timing of an evaluation
• Counterfactual
• Coordination between program managers and evaluators
• The size of the sample
• Cost analysis
• Scale up of intervention

IE terms

Attrition
Either the drop out of participants from the treatment group during the intervention, or failure
to collect data from a unit in subsequent rounds of a panel data survey. Either form of
attrition can result in biased impact estimates.

Baseline survey and baseline data
A survey to collect data prior to the start of the intervention. Baseline data are necessary to
conduct double difference analysis, and should be collected from both treatment and
comparison groups.

Bias
The extent to which the estimate of impact differs from the true value as a result of problems
in the evaluation or sample design.

Causal Inference
The process of drawing a conclusion about a causal connection based on the conditions of the
occurrence of an effect (effect of “P” on “Y”).

Counterfactual
What would have happened in the absence of that program/policy/project?

Confounding
Mix up (something) with something else.

Control
Not getting the program/project/policy/intervention.

Effect Size
The size of the relationship between two variables (particularly between program variables and
outcomes).

Treatment
Getting the program/project/policy/intervention.

More IE terms

External Validity
The extent to which the results of the impact evaluation apply to another time or place.

Hypothesis
A specific statement regarding the relationship between two variables.

Internal Validity
The validity of the evaluation design, i.e. whether it adequately handles issues such as sample
selection (to minimize selection bias).

Intervention
The project, program or policy which is the subject of the impact evaluation.

Power calculation
A calculation of the sample required for the impact evaluation, which depends on the minimum
effect size and required level of confidence.

Role of M&E in IE

Impact evaluations are typically external, carried out in whole or in part by an independent
expert from outside an agency. Nevertheless, M&E has a critical role to play, such as:
1. Identifying when and under what circumstances it would be possible and appropriate to
undertake an impact evaluation.
2. Contributing essential data to conduct an impact evaluation, such as baseline data of
various forms and information about the nature of the intervention.
3. Contributing necessary information to interpret and apply findings from impact
evaluation. This includes information about context, and data like the quality of
implementation, needed to understand why given changes have or have not come about
and what we can do to make our efforts even more effective in the future.
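
The power calculation entry in the glossary above can be illustrated with a short Python sketch. It uses the common normal-approximation formula for comparing two group means. It assumes equal group sizes, a known outcome standard deviation and a two-sided test; the function name and default values are illustrative choices, not taken from the slides.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(min_effect, sd, alpha=0.05, power=0.80):
    """Units needed in EACH group to detect `min_effect` in a difference of means.

    Assumptions (not from the slides): equal group sizes, common outcome
    standard deviation `sd`, two-sided test at significance level `alpha`.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value of the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    n = 2 * ((z_alpha + z_power) * sd / min_effect) ** 2
    return ceil(n)


# Detecting an effect of half a standard deviation.
print(sample_size_per_group(min_effect=0.5, sd=1.0))
```

Halving the minimum effect size roughly quadruples the required sample, which is why the minimum detectable effect has to be agreed before data collection is budgeted.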

How to build impact evaluation into M&E thinking and practices

Theory of Change
• A theory of change is a model that explains how an intervention is expected to lead to
intended or observed impacts

• Setting up a Theory of Change is like making a roadmap that outlines the steps by which you
plan to achieve your goal.

• It helps you define whether your work is contributing towards achieving the impact you
envision, and if there is another way that you need to consider as well.

• It also allows you to spot potential risks in your plan by sharing the underlying assumptions
in each step.

• In large organizations, when there may be several projects running simultaneously, the
Theory of Change helps to map these different projects first and then consider how they link
and relate to each other.

• This tool can also aid in aligning team members to the larger end goal, and help them
understand their role in achieving it

The essence of theory of change: linking activities to intended outcomes

“I am cutting rocks” vs. “I am building a temple”

How to set up theory of change

Randomization

• Randomization means randomly assigning people to control and treatment groups.

• Randomized designs can take many forms. Here the focus is on a straightforward two-group
approach in order to clarify the key principles.
• The key point is that randomization ensures the two groups are, on average, statistically
equivalent in all respects at the point they are randomized. Subsequent to randomization, the
treatment group is exposed to the intervention which is the focus of the evaluation and whose
impacts or effects are to be measured.
• Depending on the policy question of central concern, the control group can be assigned to receive
no treatment at all, or the treatment group can be compared to a group exposed to some other
treatment of interest (may be conceived as representing treatment as usual), or there can be
multiple treatment groups alongside a control group.
• Randomization is widely regarded as the “gold standard” in evaluating the effects of interventions.
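
A minimal Python sketch of the two-group random assignment described above. The function name randomize, the unit identifiers and the fixed seed are illustrative assumptions; a real evaluation would usually also stratify and document the procedure.

```python
import random


def randomize(unit_ids, seed=0):
    """Randomly split units into equally sized treatment and control groups."""
    rng = random.Random(seed)  # fixed seed so the assignment can be reproduced and audited
    shuffled = list(unit_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}


# Ten hypothetical households identified by the numbers 1 to 10.
groups = randomize(range(1, 11))
print(groups)
```

Because every unit has the same chance of landing in either group, any remaining difference between the groups before the intervention is due to chance alone.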

RCT (cont.)

• When to randomize?
o Over supply: # eligible > available resources
o Innovation: need rigorous evidence about a program’s effectiveness

• Advantages
o Ethical, quantitative and transparent selection rules
o Produces the best possible counterfactual and it’s intuitive/easy to communicate

• Disadvantages:
o Political constraints
o Internal validity (exogeneity): people might not comply with the assignment (selective
non-compliance)
o External validity (generalizability): usually run controlled experiment on a pilot, small
scale. Difficult to extrapolate the results to a larger population.
o Does not always solve problem of spillovers

Counterfactual: randomized-out group

Before the program (Balance Test)

Example: Randomized Assignment

Variable                                        Control   Treatment   Est. T
Consumption (monthly $ per capita)              233.47    233.4       -0.39
Head of Household Age (years)                   42.3      41.6        1.2
Age of Household Head’s Spouse (years)          36.8      36.8        -0.38
Household Head Education (years)                2.8       2.9         -2.16*
Education of Household Head’s Spouse (years)    2.6       2.7         -0.006
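
A balance test of the kind shown in the table above can be sketched in Python as follows. The baseline values are hypothetical, not the figures from the table. The sign convention (treatment minus control) and the |t| > 1.96 cut-off are assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical baseline data: covariate -> (control values, treatment values).
baseline = {
    "age_years": ([40, 42, 44, 41, 43], [41, 43, 42, 40, 44]),
    "education_years": ([2, 3, 2, 3, 2], [5, 6, 5, 6, 5]),
}


def balance_t(control, treatment):
    """t-statistic for the difference in baseline means (treatment minus control)."""
    diff = mean(treatment) - mean(control)
    se = sqrt(variance(control) / len(control)
              + variance(treatment) / len(treatment))
    return diff / se


t_stats = {name: balance_t(c, t) for name, (c, t) in baseline.items()}
# Flag covariates whose difference exceeds a conventional large-sample threshold.
imbalanced = [name for name, t in t_stats.items() if abs(t) > 1.96]
print(t_stats, imbalanced)
```

Here age is balanced and education is flagged. After a genuine randomization an occasional flagged covariate is expected by chance; many flagged covariates suggest the assignment was not really random.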

Randomized Promotion

• So what if everyone participates? Or no one participates at all? Can we compare
participants and non-participants?

• Solution: Offer a program to a randomized subgroup

• Necessary conditions
1. Offered/promoted group and non-offered/not promoted group are comparable
2. Offered/promoted group has a higher participation rate in the program
3. The offer/promotion does not affect results directly

Instrumental Variables

IV is used when there is a problem of endogeneity, which implies you cannot identify the
causal impact of “P” on “Y”.

Find “something” that is highly correlated with treatment assignment but is not
correlated with unobserved characteristics affecting outcomes.

Instrument Z and treatment indicator P:
Cov(Z, P) ≠ 0

And the exclusion restriction:
Cov(Z, ε) = 0

In summary, instrument Z affects selection into the program but is not correlated with
factors affecting the outcomes.

In estimating IV we use two-stage least squares (2SLS). The effect of the instrument on the
outcome is the ITT; dividing it by the effect of the instrument on participation gives the
impact on those whose participation the instrument changes (a LATE).

TREATMENT should be something that human beings cannot control
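
Randomized promotion can serve as the instrument Z for participation P. With a single binary instrument and no other covariates, the IV estimate reduces to a simple ratio, often called the Wald estimator. The Python sketch below illustrates it; the records and the function name wald_estimate are hypothetical.

```python
from statistics import mean

# Hypothetical records: (promoted Z, participated P, outcome Y). Values are made up.
records = [
    (1, 1, 12.0), (1, 1, 14.0), (1, 1, 13.0), (1, 0, 10.0),
    (0, 1, 13.0), (0, 0, 10.0), (0, 0, 11.0), (0, 0, 10.0),
]


def wald_estimate(records):
    """Return (itt, take_up_difference, iv_estimate) for a binary instrument."""
    promoted = [r for r in records if r[0] == 1]
    not_promoted = [r for r in records if r[0] == 0]
    # Reduced form: effect of being promoted on the outcome (intention to treat).
    itt = mean(r[2] for r in promoted) - mean(r[2] for r in not_promoted)
    # First stage: effect of being promoted on participation, i.e. Cov(Z, P) != 0.
    take_up = mean(r[1] for r in promoted) - mean(r[1] for r in not_promoted)
    if take_up == 0:
        raise ValueError("promotion did not change participation; instrument is irrelevant")
    return itt, take_up, itt / take_up


itt, take_up, iv_estimate = wald_estimate(records)
print(itt, take_up, iv_estimate)
```

With these made-up numbers promotion raises the outcome by 1.25 and participation by 0.5, so the estimated impact among those induced to participate is 2.5. The exclusion restriction cannot be checked in code; it has to be argued from how the promotion was designed.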



IV example

Regression Discontinuity (RDD)

Treatment and comparison groups are assigned based on a cut-off score on a quantitative variable.

We have a continuous eligibility index with a defined cut-off


• Households with a score ≤ cutoff are eligible
• Households with a score > cutoff are not eligible
• Or vice-versa

Intuitive explanation is:


• Units just above the cut-off point are very similar to units just below it – good comparison

RDD example
CASE: Effect of fertilizer program on Agriculture production

GOAL: Improve agriculture production (rice yields) for small-scale farmers

ELIGIBILITY:
• Farms with a score (Ha) of land ≤50 are small
• Farms with a score (Ha) of land >50 are not small

INTERVENTION: Small farmers receive subsidies to purchase fertilizer



RDD post intervention

RDD gives an unbiased estimate of the treatment effect, but produces a local estimate:
o Effect of the program around the cut-off point/discontinuity.
o This is not always generalizable.

DIFFERENCE IN DIFFERENCES (DID)

o Combines a before-and-after comparison with a (with-and-without) participant-nonparticipant
comparison.
o Measures changes in outcomes over time of program participants relative to the changes in outcomes
of non-participants.
o Counterfactual: Change in outcomes of individuals who didn’t participate in the program but on
whom data were collected.
o Identification comes from believing that the change in outcomes of individuals who didn’t participate in
the program represents what would have happened to the treatment group in the absence of the
program.

• We calculate the difference in the outcome (Y) between the before and after situations for the
treatment group (B-A).
• We calculate the difference in the outcome (Y) between the before and after situations for the
comparison group (D-C).
• Then we calculate the difference between the difference in outcomes for the treatment group
(B-A) and the difference for the comparison group (D-C), or:
• Diff-in-Diff = (B-A) - (D-C).
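
The Diff-in-Diff formula can be worked through with a small Python sketch. It reads A and B as the treatment group's mean outcome before and after the program, and C and D as the comparison group's mean outcome before and after. The numbers are hypothetical.

```python
def diff_in_diff(a, b, c, d):
    """Diff-in-Diff = (B - A) - (D - C).

    a, b: treatment group mean outcome before and after the program
    c, d: comparison group mean outcome before and after the program
    """
    return (b - a) - (d - c)


# Hypothetical mean outcomes, made up for illustration.
effect = diff_in_diff(a=60.0, b=74.0, c=58.0, d=64.0)
print(effect)  # (74 - 60) - (64 - 58) = 8.0
```

The treatment group improved by 14 and the comparison group by 6, so 8 is attributed to the program. This holds only if both groups would have followed the same trend without it.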

Diff-in-Diff method



Propensity Score Matching

o The propensity score is the estimated probability of being in the treatment group given the
observable characteristics from a regression model of participation.
o Matching uses large data sets and heavy statistical techniques to construct the best
possible artificial comparison group for a given treatment group.

o PSM creates a comparison group from untreated observations by matching treatment
observations to one or more observations from the untreated sample, based on observable
characteristics.

o Perfect matching would require matching each individual or unit in the treatment group with a
person or unit in the comparison group that is identical on all relevant observable
characteristics, such as age, education, religion, occupation, wealth, attitudes to risk, and so on.
Clearly, this is not possible. But nor is it necessary.

o In propensity score matching, matching is not on every single characteristic but on a single
number: the propensity score.

o The propensity score is a conditional probability. More specifically, it is the likelihood of a
person taking part in the intervention given their observable characteristics. This
probability is obtained from the “participation equation”: a probit or logit regression in
which the dependent variable is dichotomous, taking the value of 1 for those who took
part in the intervention, and 0 if they did not.

PSM (cont.)

Advantages:
• Does not require randomization, nor baseline (pre-intervention data)

Disadvantages:
• Strong identification assumptions
• In many cases, may make interpretation of results very difficult
• Requires very good quality data: need to control for all factors that influence
program placement
• Requires significantly large sample size to generate comparison group
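
The matching step of PSM can be sketched in Python as follows. The sketch assumes the propensity scores have already been estimated from a logit or probit participation equation, and uses one-to-one nearest-neighbour matching with replacement. The units, scores and function name are hypothetical.

```python
from statistics import mean

# Hypothetical units: (propensity score, outcome). Scores are assumed to come
# from a previously estimated logit/probit participation equation.
treated = [(0.80, 15.0), (0.60, 12.0), (0.40, 11.0)]
untreated = [(0.78, 12.0), (0.62, 10.0), (0.35, 10.0), (0.10, 7.0)]


def att_nearest_neighbor(treated, untreated):
    """Average treated-minus-matched outcome gap (an estimate of the ATT)."""
    gaps = []
    for score, outcome in treated:
        # Pick the untreated unit whose propensity score is closest (with replacement).
        match = min(untreated, key=lambda unit: abs(unit[0] - score))
        gaps.append(outcome - match[1])
    return mean(gaps)


att = att_nearest_neighbor(treated, untreated)
print(att)
```

The untreated unit with score 0.10 is never used because no treated unit resembles it. Practical applications typically add a common-support check and a maximum allowed score distance, which this sketch leaves out.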

Uses of Different Design

Design                      When to use
Randomization               ► Whenever possible
                            ► When an intervention will not be universally implemented
Random Promotion            ► When an intervention is universally implemented
Regression Discontinuity    ► If an intervention is assigned based on rank
Diff-in-diff                ► If two groups are growing at similar rates
Matching                    ► When other methods are not possible
                            ► Matching at baseline can be very useful

Propensity score matching is used when a group of subjects receive a treatment and we'd like to
compare their outcomes with the outcomes of a control group. Examples include estimating the
effects of a training program on job performance or the effects of a government program targeted
at helping particular schools.

DIFFERENT IMPACT MEASURES

Impact evaluation can give different measures of program effects, depending on the
population for which the estimate is generated. This is important to understand, as not all
methodologies can estimate all measures.

• Average treatment effect (ATE): the average impact of participation in the program on the entire
eligible population.

• Intention to treat effect (ITT): the average impact of exposure to the program, e.g., on all those
living in a program area eligible to take part in the program.

• Average treatment effect on the treated (ATT): the average impact on those who actually take
part in (choose to comply or adopt) the intervention.

• Average treatment effect on the untreated (ATU): the average potential impact on those not
taking part in the treatment were they treated. This is a relevant measure for understanding
the potential effects of program expansion.

• Local average treatment effect (LATE): the average impact on a subgroup of the beneficiary
population, usually those at the threshold for eligibility. Some impact evaluation designs yield a
LATE rather than an ATE.

IE FIELD WORK ORGANIZATION

Field procedures and interview skills:

Introductions
• Say your name
• Ask the respondent's name
• Do consent procedures

Appearance
• Be clean; dress professionally and decently
• Do not chew gum
• Smile and make respondents comfortable, especially children

During Interview
• Speak loudly and clearly
• Make eye contact
• Be patient, let the respondent explain
• No smoking or eating during the interview
• Avoid leading respondents
• Explain difficult terms or words
• Do not answer the phone (make sure it is in silent or vibration mode)

IE Field organization (cont.)

Ethical procedures:

Adhering to ethical principles is important in research. Consider the following during the
evaluation:
• Be clear and concise in the introduction so that respondents are fully informed.
• Obtain the consent of respondents before the interview.
• Obtain the child's assent and the consent of at least one parent or caregiver before
collecting survey information from children.
• Do not harass respondents to respond. Encourage them to participate in a polite, positive,
non-threatening way.
• Be honest with respondents regarding all aspects of the project.
• Be aware of cultural and social differences relating to the topic of the project and wording
of questions.
• Ensure privacy and confidentiality of respondents’ information.

Quality assurance and field organization

• Each team will have a team lead
• Survey data will be uploaded to the server
• GPS coordinates will be tracked to see data collection points
• Daily checks will be done
• Notes by the note taker will be used to check on transcription
• Audio will be kept for future retrieval

Questions & Answers

Thank You
