Chapter - 9

Chapter 9 of ECO 345 discusses functional form misspecification in econometric models, highlighting the consequences of estimating a linear model when the true relationship is nonlinear. It introduces the RESET Test for detecting misspecification and explains the use of proxy variables to address omitted variable bias. The chapter also covers issues related to measurement error, missing data, nonrandom samples, outliers, and introduces Least Absolute Deviations (LAD) as a robust alternative to Ordinary Least Squares (OLS) estimation.

ECO 345 - Applied Econometrics II

Chapter 9

Muhammad Salman Khalid

School of Economics & Social Sciences

March 24, 2026

Salman (IBA) Chapter 9 1 / 30

Functional Form Misspecification


So far we have assumed that our model is correctly specified, i.e. that the functional form is correct.
What if the true model is nonlinear in some variables, but we estimated a linear model?
For example, the true model is

wage = β0 + β1 educ + β2 exper + β3 exper² + µ

but we estimated:

wage = β0 + β1 educ + β2 exper + µ

This is a case of functional form misspecification, a type of omitted variable bias.
The omitted variable is exper², which is correlated with exper, so the OLS estimates from the linear model are biased and inconsistent.
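The bias from omitting exper² can be illustrated with a small simulation. The sketch below is a simplified version of the wage example: educ is dropped so that the slope has a closed form, and all parameter values (B0, B2, B3, the range of exper) are hypothetical choices, not figures from the course. It compares the slope from the misspecified linear model with the value implied by the omitted variable bias formula, β2 + β3 · Cov(exper, exper²) / Var(exper).

```python
import random

random.seed(0)

# Hypothetical parameter values, chosen only for illustration.
B0, B2, B3 = 5.0, 0.30, -0.005
n = 5000

exper = [random.uniform(0, 40) for _ in range(n)]
wage = [B0 + B2 * x + B3 * x * x + random.gauss(0, 1) for x in exper]


def mean(v):
    return sum(v) / len(v)


def cov(a, b):
    """Sample covariance (divides by n; the scaling cancels in the ratios)."""
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)


# Misspecified model: regress wage on exper only (exper^2 is omitted).
slope_linear = cov(exper, wage) / cov(exper, exper)

# Omitted variable bias formula: B2 + B3 * Cov(exper, exper^2) / Var(exper).
exper_sq = [x * x for x in exper]
delta = cov(exper, exper_sq) / cov(exper, exper)
slope_predicted = B2 + B3 * delta

print(f"true beta2             : {B2:.3f}")
print(f"slope in linear model  : {slope_linear:.3f}")
print(f"OVB formula prediction : {slope_predicted:.3f}")
```

In this setup the linear-model slope should land close to the formula's prediction and well away from the true β2, which is the sense in which the estimate is biased.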
Consequences of Functional Form Misspecification

The OLS estimators will be biased and inconsistent.

The standard errors, t-statistics and F-statistics will all be invalid.
The R² and adjusted R² will be misleading.
Therefore, it is important to test whether the functional form of the
model is correctly specified.
We have already learned one way to check for this - adding squared
and interaction terms and testing their joint significance.
However, there is a more systematic test for functional form
misspecification - the RESET Test.

RESET Test (Regression Specification Error Test)

The RESET test was proposed by Ramsey (1969).

The idea is that if the model is misspecified, then nonlinear functions of the fitted values ŷ should be significant when added to the model.
Why ŷ? Because ŷ is a function of all the independent variables, so ŷ² and ŷ³ capture various nonlinear combinations of the independent variables.
The RESET test adds ŷ² and ŷ³ to the original model and tests their joint significance.

Steps to Conduct RESET Test
1 Run the original model and obtain the fitted values ŷ:

y = β0 + β1 x1 + β2 x2 + ... + βk xk + µ

2 Run the expanded model including ŷ² and ŷ³:

y = β0 + β1 x1 + ... + βk xk + δ1 ŷ² + δ2 ŷ³ + γ

3 Test the joint significance of ŷ² and ŷ³:

H0: δ1 = δ2 = 0    HA: Not(H0)

4 Use F-test with q = 2 restrictions:

$$F = \frac{(R^2_{new} - R^2_{old})/2}{(1 - R^2_{new})/(n - k - 3)} \sim F_{2,\,n-k-3}$$

5 If we reject H0, the functional form is misspecified.
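The steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not a production routine: the helper names (ols, fitted, r_squared, reset_f) are hypothetical, OLS is computed by solving the normal equations directly, and the data are simulated from a quadratic model so that the linear specification is wrong by construction.

```python
import random


def ols(X, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for c in range(k):  # Gauss-Jordan elimination with partial pivoting
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]


def fitted(X, b):
    return [sum(xi * bi for xi, bi in zip(row, b)) for row in X]


def r_squared(y, yhat):
    ybar = sum(y) / len(y)
    ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, yhat))
    sst = sum((yi - ybar) ** 2 for yi in y)
    return 1 - ssr / sst


def reset_f(x_rows, y):
    """RESET F statistic; x_rows holds the k regressors without the constant."""
    n, k = len(y), len(x_rows[0])
    X_old = [[1.0] + list(r) for r in x_rows]
    yhat = fitted(X_old, ols(X_old, y))                  # step 1
    r2_old = r_squared(y, yhat)
    X_new = [row + [f ** 2, f ** 3] for row, f in zip(X_old, yhat)]
    r2_new = r_squared(y, fitted(X_new, ols(X_new, y)))  # step 2
    # Steps 3-4: F statistic with q = 2 and n - k - 3 degrees of freedom.
    return ((r2_new - r2_old) / 2) / ((1 - r2_new) / (n - k - 3))


random.seed(1)
n = 1000
x = [random.uniform(0, 3) for _ in range(n)]
# Simulated truth is quadratic in x, so a model linear in x is misspecified.
y = [1 + 0.5 * xi + 0.8 * xi ** 2 + random.gauss(0, 1) for xi in x]

f_stat = reset_f([[xi] for xi in x], y)
print(f"RESET F statistic: {f_stat:.1f}")
```

The statistic is compared with a critical value from the F distribution with 2 and n − k − 3 degrees of freedom (roughly 3.0 at the 5% level for large samples); with data generated this way it should be far above that.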

Limitations of RESET Test

The RESET test tells us whether the model is misspecified but NOT
how it is misspecified.
If we reject H0 , we know something is wrong with the functional
form, but we do not know which variable needs transformation.
We must use our economic intuition and knowledge of the data to
determine the correct specification.
Additionally, the RESET test can sometimes reject the null even when
the functional form is correct (e.g., due to heteroskedasticity).
It is best to use RESET test in combination with other diagnostic
tools.

Testing Against Non-Nested Alternatives
Sometimes we have two competing models that are non-nested -
neither is a special case of the other.
For example:

Model 1 : y = β0 + β1 x1 + β2 x2 + µ

Model 2 : y = β0 + β1 log (x1 ) + β2 log (x2 ) + µ

We cannot use the standard F-test because neither model is a
restricted version of the other.
One approach is to create a comprehensive model that includes
both sets of variables:

y = β0 + β1 x1 + β2 x2 + β3 log (x1 ) + β4 log (x2 ) + µ

Then test H0: β1 = β2 = 0 (failing to reject favors Model 2) and H0: β3 = β4 = 0 (failing to reject favors Model 1).
If both are rejected or neither is rejected, the test is inconclusive.
Using Proxy Variables for Unobserved Explanatory Variables

The Problem of Omitted Variables

One of the most serious problems in econometrics is the omission of a relevant variable that is correlated with the included variables.
For example, consider the wage equation:

wage = β0 + β1 educ + β2 exper + β3 ability + µ

We cannot observe ability directly.

If we omit ability, and it is correlated with educ, then β̂1 will be biased.
One solution is to use a proxy variable for the unobserved variable.

What is a Proxy Variable?

A proxy variable is an observable variable that is related to the unobservable variable we want to control for.
For ability, common proxies include IQ score, standardized test scores, or GPA.
Let x3* be the unobserved variable and x3 be the proxy.
For x3 to be a good proxy for x3*, we need:

x3* = δ0 + δ1 x3 + v

where E(v | x1, x2, x3) = 0.
This means that after controlling for x3, the unobserved variable x3* should be uncorrelated with x1 and x2.

Using Proxy Variables in Practice

When we include the proxy variable in our model:

y = β0 + β1 x1 + β2 x2 + β3 x3 + γ

The coefficient β3 on the proxy does not have a direct interpretation as the effect of x3*.
However, if the proxy conditions hold, the OLS estimators of β1 and β2 will be consistent for the parameters of interest.
The key benefit of using a proxy is to reduce the omitted variable bias
on the coefficients of the other explanatory variables.
It is important to note that a bad proxy (weakly related to x3∗ ) can
make things worse rather than better.
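A simulation can make the benefit of a good proxy concrete. The sketch below is illustrative only: the data are simulated, every number is hypothetical, and ability is generated so that the proxy condition holds exactly (ability = δ0 + δ1·IQ + v, with v unrelated to educ and IQ). It compares the coefficient on educ when ability is omitted with the coefficient when IQ is added as a proxy, using the closed-form two-regressor OLS slope.

```python
import random

random.seed(2)
n = 5000


def mean(v):
    return sum(v) / len(v)


def cov(a, b):
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)


# Simulated data (all numbers are hypothetical).
iq = [random.gauss(100, 15) for _ in range(n)]
# Proxy condition: ability = d0 + d1 * IQ + v, with v independent of educ, IQ.
ability = [0.1 * (q - 100) + random.gauss(0, 1) for q in iq]
educ = [12 + 0.1 * (q - 100) + random.gauss(0, 1) for q in iq]
# True return to education in this simulation is 0.5.
wage = [1 + 0.5 * e + 1.0 * a + random.gauss(0, 1)
        for e, a in zip(educ, ability)]

# (a) Omit ability entirely: simple regression of wage on educ.
b_educ_omitted = cov(educ, wage) / cov(educ, educ)

# (b) Include IQ as a proxy: closed-form OLS slope on educ with two regressors.
s11, s22, s12 = cov(educ, educ), cov(iq, iq), cov(educ, iq)
s1y, s2y = cov(educ, wage), cov(iq, wage)
b_educ_proxy = (s1y * s22 - s2y * s12) / (s11 * s22 - s12 ** 2)

print(f"true coefficient on educ : 0.500")
print(f"ability omitted          : {b_educ_omitted:.3f}")
print(f"IQ included as proxy     : {b_educ_proxy:.3f}")
```

When the proxy condition fails (for example, if v is correlated with educ), the second estimate would generally remain biased, which is the situation the slide warns about.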

Lagged Dependent Variable as Proxy

A very useful proxy strategy is to include the lagged dependent variable (y_{t−1}) as an explanatory variable.
For example:

crime_t = β0 + β1 unem_t + β2 crime_{t−1} + µ_t

Why is this useful?

crime_{t−1} captures all the unobserved factors from the past that affect crime in the current period.
It serves as a proxy for historical and institutional factors that are difficult to measure.
However, using a lagged dependent variable requires that crime_{t−1} is uncorrelated with µ_t (no serial correlation in errors).
Go over Example 9.3 in the book.

Properties of OLS Under Measurement Error

Measurement Error

In practice, the data we use may not accurately measure the true
values of the variables.
The difference between the observed value and the true value is called
measurement error.
We will consider two cases:
1 Measurement error in the dependent variable (y).
2 Measurement error in an explanatory variable (x).
The consequences of measurement error depend critically on which
variable is measured with error.

Measurement Error in the Dependent Variable

Let y* be the true value and y be the observed value.

The measurement error is defined as:

e0 = y − y*

The true model is:

y* = β0 + β1 x1 + ... + βk xk + µ

Since y = y* + e0, the estimated model becomes:

y = β0 + β1 x1 + ... + βk xk + (µ + e0)

The new error term is µ + e0.

Measurement Error in the Dependent Variable

If e0 is uncorrelated with the explanatory variables (x1, x2, ..., xk):

1 The OLS estimators remain unbiased and consistent.
2 However, the variance of the error term increases:
Var (µ + e0 ) > Var (µ).
3 This leads to larger standard errors and less precise estimates.
If e0 is correlated with some xj , then the OLS estimators will be
biased.
In most practical cases, we assume e0 is uncorrelated with the
explanatory variables.
Therefore, measurement error in y is generally less problematic than
measurement error in x.
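The benign case can be checked with a short simulation. The sketch below assumes a single regressor and hypothetical parameter values, with the measurement error e0 drawn independently of x. It fits the same regression once with the true y* and once with the mismeasured y, and reports the slope and the estimated error variance in each case.

```python
import random

random.seed(3)
n = 20000


def slope_and_resid_var(x, y):
    """Simple-regression slope and residual variance (SSR / (n - 2))."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    b1 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    b0 = my - b1 * mx
    rv = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y)) / (len(x) - 2)
    return b1, rv


x = [random.gauss(0, 1) for _ in range(n)]
# True model: y* = 1 + 2x + mu (hypothetical values).
y_true = [1 + 2 * xi + random.gauss(0, 1) for xi in x]
# Observed y = y* + e0, where e0 is independent of x.
y_obs = [yi + random.gauss(0, 2) for yi in y_true]

b_true, var_true = slope_and_resid_var(x, y_true)
b_obs, var_obs = slope_and_resid_var(x, y_obs)

print(f"slope using y* : {b_true:.3f}   residual variance: {var_true:.2f}")
print(f"slope using y  : {b_obs:.3f}   residual variance: {var_obs:.2f}")
```

Both slopes should be close to the true value of 2, while the residual variance with the mismeasured y is larger, reflecting Var(µ + e0) > Var(µ).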

Measurement Error in an Explanatory Variable

This case is much more serious.

Let x1* be the true value and x1 be the observed value.
The measurement error is:

e1 = x1 − x1*

The true model is:

y = β0 + β1 x1* + µ

Substituting x1* = x1 − e1:

y = β0 + β1 x1 + (µ − β1 e1)

The new error term is µ − β1 e1.

Classical Errors-in-Variables (CEV) Assumption
Under the classical errors-in-variables (CEV) assumption, the measurement error is uncorrelated with the true value of the explanatory variable:

Cov(x1*, e1) = 0

However, even under CEV (and with µ uncorrelated with x1* and e1):

Cov(x1, µ − β1 e1) = −β1 Cov(x1, e1) ≠ 0 whenever β1 ≠ 0

Wait! Isn't Cov(x1, e1) = 0 under CEV?

No. The CEV assumption says Cov(x1*, e1) = 0, not Cov(x1, e1) = 0.
Since x1 = x1* + e1:

Cov(x1, e1) = Cov(x1* + e1, e1) = Var(e1) > 0

Therefore, x1 is correlated with the composite error, and OLS is biased and inconsistent.
Attenuation Bias

Under the CEV assumption, it can be shown that:

$$\operatorname{plim}(\hat{\beta}_1) = \beta_1 \, \frac{\sigma^2_{x_1^*}}{\sigma^2_{x_1^*} + \sigma^2_{e_1}}$$

Since $\sigma^2_{x_1^*} / (\sigma^2_{x_1^*} + \sigma^2_{e_1}) < 1$, the OLS estimate is biased toward zero.
This is known as attenuation bias or bias toward zero.
The larger the measurement error variance ($\sigma^2_{e_1}$), the greater the attenuation bias.
This means that measurement error in an explanatory variable makes
it harder to find a significant effect - the estimates are understated
in magnitude.
Go over Example 9.4 in the book.
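Attenuation bias is easy to see in simulated data. The sketch below is a minimal illustration with hypothetical values (BETA1, SD_XSTAR, SD_E1): it generates y from the true regressor x1*, then regresses y on the mismeasured x1 = x1* + e1 and compares the slope with the probability limit implied by the CEV formula.

```python
import random

random.seed(4)
n = 20000
BETA1 = 2.0
SD_XSTAR, SD_E1 = 1.0, 1.0  # hypothetical standard deviations


def slope(x, y):
    """OLS slope from a simple regression of y on x (with intercept)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx


x_star = [random.gauss(0, SD_XSTAR) for _ in range(n)]
y = [1 + BETA1 * xs + random.gauss(0, 1) for xs in x_star]
# CEV: the measurement error e1 is independent of the true value x1*.
x_obs = [xs + random.gauss(0, SD_E1) for xs in x_star]

b_clean = slope(x_star, y)   # regression on the true regressor
b_noisy = slope(x_obs, y)    # regression on the mismeasured regressor
plim_noisy = BETA1 * SD_XSTAR ** 2 / (SD_XSTAR ** 2 + SD_E1 ** 2)

print(f"slope with x1* (true regressor) : {b_clean:.3f}")
print(f"slope with x1 (mismeasured)     : {b_noisy:.3f}")
print(f"plim implied by CEV formula     : {plim_noisy:.3f}")
```

With equal variances for x1* and e1, the formula implies that the slope is cut roughly in half; raising SD_E1 should shrink it further toward zero.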

Measurement Error in Multiple Regression

In a multiple regression, measurement error in x1 can affect the estimates of all coefficients, not just β1.
However, if the mismeasured variable is uncorrelated with the other explanatory variables, then only β̂1 is affected.
In practice, reducing measurement error through better data
collection is the best solution.
Another solution is using Instrumental Variables (IV) estimation,
which we will cover later in the course.

Missing Data, Nonrandom Samples, and Outliers

Missing Data

In practice, datasets often have missing observations for some variables.
If data is missing completely at random (MCAR), then dropping
the observations with missing values does not bias OLS estimates.
We simply have a smaller sample size and therefore less precise
estimates.
However, if data is not missing at random, then dropping
observations can lead to bias.
For example, if high-income individuals are less likely to report their
income, then the sample is not random with respect to income.
In such cases, special techniques like imputation or sample selection
corrections (Heckman correction) are needed.

Nonrandom Samples and Sample Selection Bias

Sample selection bias occurs when the sample is not representative of the population.
Types of nonrandom sampling:
1 Exogenous sample selection: Selection based on the independent
variable.
Example: Sampling only college graduates to study wage determinants.
OLS is still unbiased but less efficient.
2 Endogenous sample selection: Selection based on the dependent
variable.
Example: Studying wage determinants using only employed individuals.
OLS can be biased because the selection is related to the outcome.
Endogenous sample selection is much more problematic and requires
correction methods.
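The contrast between the two types of selection can be sketched with simulated data. The example below is an assumption-laden illustration, not the course's own example: it uses one regressor, hypothetical parameter values, and simple cutoff rules (keep observations with x > 0 for exogenous selection, keep observations with y > 1 for endogenous selection).

```python
import random

random.seed(5)
n = 20000


def slope(x, y):
    """OLS slope from a simple regression of y on x (with intercept)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx


# Population model (hypothetical values): y = 1 + 2x + mu.
x = [random.gauss(0, 1) for _ in range(n)]
y = [1 + 2 * xi + random.gauss(0, 1) for xi in x]
data = list(zip(x, y))

exog = [(xi, yi) for xi, yi in data if xi > 0]    # selection based on x
endog = [(xi, yi) for xi, yi in data if yi > 1]   # selection based on y

b_full = slope(x, y)
b_exog = slope(*zip(*exog))
b_endog = slope(*zip(*endog))

print(f"full sample           : {b_full:.3f}")
print(f"selected on x (x > 0) : {b_exog:.3f}")
print(f"selected on y (y > 1) : {b_endog:.3f}")
```

Selecting on x leaves the slope close to its true value of 2 with fewer observations, whereas selecting on y pulls the slope toward zero because the retained observations have errors that are correlated with x.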

Outliers and Influential Observations

An outlier is an observation that is far from the rest of the data.

OLS estimates can be very sensitive to outliers because OLS
minimizes the sum of squared residuals - squaring gives
disproportionate weight to large residuals.
How to detect outliers?
1 Scatter plots of y against each x.
2 Examining the residuals - observations with very large residuals (in
absolute value) may be outliers.
3 Studentized residuals: If |e_i^stud| > 3, observation i may be an outlier.
It is important to understand why an observation is an outlier before
removing it.
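The studentized-residual rule can be sketched for a simple regression. The example below is illustrative: the data are simulated, one gross outlier is planted at index 0, and the residuals are externally studentized (each residual is scaled by an error variance estimated without that observation), which is one common definition and is assumed here rather than taken from the slides.

```python
import math
import random

random.seed(6)
n = 50
x = [random.uniform(0, 10) for _ in range(n)]
y = [1 + 2 * xi + random.gauss(0, 1) for xi in x]
y[0] += 15  # plant one gross outlier at index 0 (illustrative)

# Fit the simple regression y = b0 + b1 * x by OLS.
mx, my = sum(x) / n, sum(y) / n
sxx = sum((xi - mx) ** 2 for xi in x)
b1 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
b0 = my - b1 * mx

resid = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
lev = [1 / n + (xi - mx) ** 2 / sxx for xi in x]  # leverage h_i
ssr = sum(e ** 2 for e in resid)
p = 2  # estimated parameters: intercept and slope

stud = []
for e, h in zip(resid, lev):
    # Error variance estimated with observation i left out (no refit needed).
    s2_loo = (ssr - e ** 2 / (1 - h)) / (n - p - 1)
    stud.append(e / math.sqrt(s2_loo * (1 - h)))

flagged = [i for i, t in enumerate(stud) if abs(t) > 3]
print("observations with |studentized residual| > 3:", flagged)
```

The planted observation should be flagged; any other flagged index would deserve the same scrutiny before a decision to drop it.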

Dealing with Outliers

Should we remove outliers?

If the outlier is due to a data entry error, it should be corrected or
removed.
If the outlier is a legitimate observation, removing it is not advisable.
Best practices for handling outliers:
1 Report the OLS results with and without the suspected outlier(s).
2 If results change substantially, investigate the outlier further.
3 Consider using robust estimation methods that are less sensitive to
outliers.
One such method is the Least Absolute Deviations (LAD)
estimator.

Least Absolute Deviations (LAD) Estimation

Least Absolute Deviations (LAD)

OLS minimizes the sum of squared residuals:

$$\min_{\beta} \sum_{i=1}^{n} \hat{\mu}_i^2$$

LAD minimizes the sum of absolute residuals:

$$\min_{\beta} \sum_{i=1}^{n} |\hat{\mu}_i|$$

Since LAD does not square the residuals, it gives less weight to
extreme observations.
LAD is also known as Median Regression because it estimates the
conditional median (instead of the conditional mean).
Therefore, LAD is more robust to outliers than OLS.
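The difference in robustness can be illustrated with a small experiment. The sketch below is not an exact LAD solver: it approximates the LAD fit by iteratively reweighted least squares (IRLS), a swapped-in technique chosen because it needs only the standard library, whereas exact LAD is usually computed by linear programming. The function names (wls, lad_irls), the contamination scheme, and all parameter values are hypothetical.

```python
import random

random.seed(7)
n = 200
x = sorted(random.uniform(0, 10) for _ in range(n))
y = [1 + 2 * xi + random.gauss(0, 1) for xi in x]
for i in range(n - 10, n):  # contaminate the 10 largest-x observations
    y[i] += 50


def wls(x, y, w):
    """Weighted least squares for y = b0 + b1 * x; returns (b0, b1)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    b1 = sxy / sxx
    return my - b1 * mx, b1


def lad_irls(x, y, iters=100, eps=1e-6):
    """Approximate LAD fit via iteratively reweighted least squares."""
    b0, b1 = wls(x, y, [1.0] * len(x))  # start from the OLS fit
    for _ in range(iters):
        # Weights 1/|residual| turn squared loss into (approximately) absolute loss.
        w = [1 / max(abs(yi - b0 - b1 * xi), eps) for xi, yi in zip(x, y)]
        b0, b1 = wls(x, y, w)
    return b0, b1


ols_b0, ols_b1 = wls(x, y, [1.0] * n)  # equal weights give OLS
lad_b0, lad_b1 = lad_irls(x, y)

print(f"true slope            : 2.000")
print(f"OLS slope             : {ols_b1:.3f}")
print(f"LAD slope (IRLS appr.): {lad_b1:.3f}")
```

Because the contaminated points sit at the high end of x, they drag the OLS slope upward, while the approximate LAD slope should stay much closer to 2.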

LAD vs OLS

When should we prefer LAD over OLS?

1 When the data has heavy-tailed distributions (many outliers).
2 When we are interested in the median rather than the mean.
3 When the conditional distribution of y is skewed.
When should we prefer OLS?
1 When the errors are normally distributed (OLS is efficient under
normality).
2 When we are interested in the conditional mean.
3 When the data does not have significant outliers.
In practice, comparing OLS and LAD results can be informative about
the influence of outliers.

THANK YOU
The measure of intelligence is the ability to change.
