Lecture 7
Defining Econometrics, Economic Models, and Econometric Models
Introduction to Econometrics
BSc Eco 2023, Spring 2025
Instructor: Sunaina Dhingra
Email-id: sunaina@[Link]
Lecture Date: 24th February
Revision of Estimation and Interpretation of OLS Estimators
The Simple Regression Model
• Definition of the simple regression model
• “Explains variable y in terms of variable x”
The Simple Regression Model
• Interpretation of the simple linear regression model
• Explains how y varies with changes in x
• The simple linear regression model is rarely applicable in practice
but its discussion is useful for pedagogical reasons.
The Simple Regression Model
• Example: Soybean yield and fertilizer
• Example: A simple wage equation
The Simple Regression Model
• When is there a causal interpretation?
• Conditional mean independence assumption
• Example: wage equation
The Simple Regression Model
• Population regression function (PRF)
• The conditional mean independence assumption implies that
• This means that the average value of the dependent variable
can be expressed as a linear function of the explanatory variable.
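The equations on this slide did not survive extraction. As a hedged reconstruction, assuming the standard textbook notation $y = \beta_0 + \beta_1 x + u$ (the symbols are an assumption, not taken from the slide), the step from conditional mean independence to the PRF can be written as:

```latex
\begin{align*}
E(u \mid x) &= 0 && \text{(conditional mean independence)} \\
E(y \mid x) &= E(\beta_0 + \beta_1 x + u \mid x) \\
            &= \beta_0 + \beta_1 x + E(u \mid x) \\
            &= \beta_0 + \beta_1 x && \text{(population regression function)}
\end{align*}
```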
The Simple Regression Model
• Deriving the ordinary least squares estimates
• In order to estimate the regression model one needs data
• A random sample of n observations
The Simple Regression Model
• Deriving the ordinary least squares (OLS) estimators
• Defining regression residuals
• Minimize the sum of the squared regression residuals
• OLS estimators
The Simple Regression Model
• OLS fits a regression line through the data points as well as possible
Estimation of OLS parameters
• The Wage1 dataset is used to estimate wage
PRF: $\text{wage}_i = \beta_0 + \beta_1\,\text{education}_i + \mu_i$ -------(1)
• The model is estimated using a sample from the population
SRF: $\text{wage}_i = \hat{\beta}_0 + \hat{\beta}_1\,\text{education}_i + \hat{\mu}_i$ -------(2)
$\text{wage}_i = \widehat{\text{wage}}_i + \hat{\mu}_i$
• $\widehat{\text{wage}}_i$ is the estimated conditional mean value of $\text{wage}_i$
• The parameters of this regression model are estimated using the Ordinary Least Squares (OLS) method
Estimation of OLS parameters (contd.)
$\text{wage}_i = \hat{\beta}_0 + \hat{\beta}_1\,\text{education}_i + \hat{\mu}_i$ -------(2)
$\hat{\mu}_i = \text{wage}_i - \hat{\beta}_0 - \hat{\beta}_1\,\text{education}_i$ -------(3)
• As the sum of disturbances equals zero, the mean also equals zero
$E(\mu) = \bar{\mu} = 0$ -------(4)
• Because positive and negative residuals cancel out in a simple sum (their average is zero), we square the residuals, sum them, and then minimize that sum
$\min \sum_{i=1}^{n} \hat{\mu}_i^2 = \min \sum_{i=1}^{n} (\text{wage}_i - \hat{\beta}_0 - \hat{\beta}_1\,\text{education}_i)^2$ -------(5)
• The estimators $\hat{\beta}_0$ and $\hat{\beta}_1$ are obtained by minimizing the sum of squared residuals.
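Estimation of OLS parameters (contd.)
$\min \sum_{i=1}^{n} \hat{\mu}_i^2 = \min \sum_{i=1}^{n} (\text{wage}_i - \hat{\beta}_0 - \hat{\beta}_1\,\text{education}_i)^2$ -------(5)
• Take the first-order condition (FOC) of equation 5 with respect to $\hat{\beta}_0$ and set it equal to 0
$-2 \sum_{i=1}^{n} (\text{wage}_i - \hat{\beta}_0 - \hat{\beta}_1\,\text{education}_i) = 0$ -------(6)
• Take the FOC of equation 5 with respect to $\hat{\beta}_1$ and set it equal to 0
$-2 \sum_{i=1}^{n} \text{education}_i\,(\text{wage}_i - \hat{\beta}_0 - \hat{\beta}_1\,\text{education}_i) = 0$ -------(7)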
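Estimation of OLS parameters (contd.)
• Solving 6 and 7 simultaneously, we get the OLS estimators $\hat{\beta}_0$ and $\hat{\beta}_1$
$-2 \sum_{i=1}^{n} (\text{wage}_i - \hat{\beta}_0 - \hat{\beta}_1\,\text{education}_i) = 0$ -------(6)
$-2 \sum_{i=1}^{n} \text{education}_i\,(\text{wage}_i - \hat{\beta}_0 - \hat{\beta}_1\,\text{education}_i) = 0$ -------(7)
$\hat{\beta}_0 = \overline{\text{wage}} - \hat{\beta}_1\,\overline{\text{education}}$ -------(8)
$\hat{\beta}_1 = \dfrac{\sum_{i=1}^{n} (\text{education}_i - \overline{\text{education}})(\text{wage}_i - \overline{\text{wage}})}{\sum_{i=1}^{n} (\text{education}_i - \overline{\text{education}})^2}$ -------(9)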
• These estimators are also referred to as least-squares estimators
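The step from the first-order conditions (6) and (7) to the estimators (8) and (9) can be sketched as follows. The shorthand $y_i = \text{wage}_i$ and $x_i = \text{education}_i$ is introduced here only to keep the lines short; it is not the slide's notation.

```latex
\begin{align*}
\text{From (6):}\quad & \sum_{i=1}^{n} y_i - n\hat{\beta}_0 - \hat{\beta}_1 \sum_{i=1}^{n} x_i = 0
  \;\Rightarrow\; \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} \\
\text{Substitute into (7):}\quad & \sum_{i=1}^{n} x_i \left[ (y_i - \bar{y}) - \hat{\beta}_1 (x_i - \bar{x}) \right] = 0 \\
\Rightarrow\quad & \hat{\beta}_1 = \frac{\sum_{i=1}^{n} x_i (y_i - \bar{y})}{\sum_{i=1}^{n} x_i (x_i - \bar{x})} \\
\text{Since}\quad & \sum_{i=1}^{n} x_i (y_i - \bar{y}) = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})
  \quad\text{and}\quad \sum_{i=1}^{n} x_i (x_i - \bar{x}) = \sum_{i=1}^{n} (x_i - \bar{x})^2, \\
& \hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}
\end{align*}
```

The two identities in the last step hold because $\sum_{i=1}^{n} (y_i - \bar{y}) = 0$ and $\sum_{i=1}^{n} (x_i - \bar{x}) = 0$.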
Fitted Values and Residuals
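• Thus, we can write the OLS estimators for any y and x as
$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$ -------(10)
$\hat{\beta}_1 = \dfrac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$ -------(11)
• Equation 11 can be rewritten as
$\hat{\beta}_1 = \dfrac{Cov(x, y)}{\sigma_x^2}$ -------(12)
where $Cov(x, y)$ is the sample covariance between x and y and $\sigma_x^2$ is the sample variance of x
• Covariance and slope have the same sign. Thus, the sign of the covariance determines the expected direction in which x affects y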
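As a minimal sketch of equations 10 to 12, the snippet below computes the OLS estimates from the deviation-from-mean formulas and then checks that the slope equals the sample covariance divided by the sample variance. The data are hypothetical toy numbers chosen for illustration; they are not the Wage1 dataset.

```python
from statistics import mean

# Hypothetical toy data (NOT from the lecture): x = years of education, y = wage.
x = [8, 10, 12, 12, 14, 16, 18]
y = [4.0, 5.5, 6.0, 7.5, 8.0, 10.5, 12.0]
n = len(x)
x_bar, y_bar = mean(x), mean(y)

# Equations 10 and 11: slope and intercept from deviations around the means.
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
beta1_hat = s_xy / s_xx
beta0_hat = y_bar - beta1_hat * x_bar

# Equation 12: the same slope as sample covariance over sample variance.
cov_xy = s_xy / (n - 1)
var_x = s_xx / (n - 1)
slope_from_cov = cov_xy / var_x

print(f"beta0_hat = {beta0_hat:.4f}, beta1_hat = {beta1_hat:.4f}")
print(f"Cov(x, y) / Var(x) = {slope_from_cov:.4f}")
```

The (n - 1) divisors cancel in the ratio, which is why the covariance and the slope always share the same sign.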
Fitted Values and Residuals (contd.)
• Predicted y: For any given value of $x_i$, using the estimated $\hat{\beta}_0$ and $\hat{\beta}_1$ values we get
$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$ -----(13)
$y_i = \hat{y}_i + \hat{u}_i$ -----(14)
• The fitted regression line is called the line of best fit
• The OLS residual associated with each observation i, $\hat{u}_i$, is
$\hat{u}_i = y_i - \hat{y}_i$ -----(15)
• If $\hat{u}_i$ is positive, the line underpredicts $y_i$, and if $\hat{u}_i$ is negative, the line overpredicts $y_i$
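Equations 13 to 15 can be illustrated with a short sketch. It fits the line on hypothetical toy data (not the lecture's dataset), computes each fitted value and residual, and labels whether the line underpredicts or overpredicts that observation.

```python
from statistics import mean

# Hypothetical toy data (NOT from the lecture).
x = [8, 10, 12, 12, 14, 16, 18]
y = [4.0, 5.5, 6.0, 7.5, 8.0, 10.5, 12.0]

x_bar, y_bar = mean(x), mean(y)
beta1_hat = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
beta0_hat = y_bar - beta1_hat * x_bar

# Equation 13: fitted values; equation 15: residuals.
fitted = [beta0_hat + beta1_hat * xi for xi in x]
residuals = [yi - yhat for yi, yhat in zip(y, fitted)]

for xi, yi, yhat, uhat in zip(x, y, fitted, residuals):
    # Positive residual: actual y lies above the line, so the line underpredicts.
    verdict = "underpredicts" if uhat > 0 else "overpredicts"
    print(f"x={xi:>2}  y={yi:5.2f}  yhat={yhat:6.3f}  uhat={uhat:7.3f}  line {verdict}")
```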
The Simple Regression Model
• Example of a simple regression
• CEO salary and return on equity
• Fitted regression
• Causal interpretation?
The Simple Regression Model
• Example of a simple regression
• Wage and education
• Fitted regression
• Causal interpretation?
The Simple Regression Model
• Example of a simple regression
• Voting outcomes and campaign expenditures (two parties)
• Fitted regression
• Causal interpretation?
STATA Results
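• The Wage1 dataset is used to estimate the wage-education model
$\widehat{\text{wage}}_i = -0.90 + 0.54\,\text{education}_i$
STATA Result 1: Estimation of OLS Regression Line
. reg wage education

      Source |       SS           df       MS      Number of obs   =       526
-------------+----------------------------------   F(1, 524)       =    103.36
       Model |  1179.73204         1  1179.73204   Prob > F        =    0.0000
    Residual |  5980.68225       524  11.4135158   R-squared       =    0.1648
-------------+----------------------------------   Adj R-squared   =    0.1632
       Total |  7160.41429       525  13.6388844   Root MSE        =    3.3784

------------------------------------------------------------------------------
        wage | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
   education |   .5413593    .053248    10.17   0.000     .4367534    .6459651
       _cons |  -.9048516   .6849678    -1.32   0.187    -2.250472    .4407687
------------------------------------------------------------------------------
Figure 1: Line of best fit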
Source: Author’s estimation using Wage1 dataset in STATA, refer Do file for command
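To make the Stata output easier to read, the sketch below recomputes a few of its entries from the reported coefficients and standard errors and evaluates the fitted line at two education levels. The numbers are copied from the output; the function name `predicted_wage` and the chosen education levels (12 and 16 years) are illustrative assumptions.

```python
# Coefficients and standard errors copied from the Stata output.
b_cons, se_cons = -0.9048516, 0.6849678
b_educ, se_educ = 0.5413593, 0.053248

# Each t statistic is the coefficient divided by its standard error.
t_educ = b_educ / se_educ
t_cons = b_cons / se_cons

def predicted_wage(education_years):
    """Fitted wage from the estimated line (hypothetical helper name)."""
    return b_cons + b_educ * education_years

wage_12 = predicted_wage(12)
wage_16 = predicted_wage(16)

print(f"t(education) = {t_educ:.2f}, t(_cons) = {t_cons:.2f}")
print(f"t(education)^2 = {t_educ ** 2:.2f}  (compare with the reported F statistic)")
print(f"predicted wage at 12 years = {wage_12:.4f}")
print(f"predicted wage at 16 years = {wage_16:.4f}")
print(f"difference for 4 extra years = {wage_16 - wage_12:.4f}")
```

In a regression with a single explanatory variable, the F statistic equals the square of the slope's t statistic, which is why the two reported values agree up to rounding.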
Algebraic Properties: Numerical & Statistical
• Algebraic properties are classified into numerical properties and statistical properties
• Numerical Properties:
• Numerical properties always hold because they follow directly from the first-order conditions of the Ordinary Least Squares minimization problem, which is solved using differential calculus
• If the technique is used correctly, the β0 and β1 estimates will satisfy the
numerical properties regardless of how the data were generated
• These properties are true for any sample of data.
Numerical Property:1
• Numerical Property 1: The sum and the sample average of the OLS residuals are zero
• The equation below follows from the first-order condition with respect to $\hat{\beta}_0$
$\sum_{i=1}^{n} \hat{u}_i = 0$ -------(1)
• The first-order condition with respect to $\hat{\beta}_0$ is
$-2 \sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) = 0$ -------(2)
$\bar{\hat{u}} = \dfrac{\sum_{i=1}^{n} \hat{u}_i}{n} = 0$ if equation 1 holds true
Numerical Property:2
• Numerical Property 2: The sample covariance between the regressor and the OLS residuals is zero
$Cov(\hat{\mu}, x) = E[(\hat{\mu} - E(\hat{\mu}))(x - E(x))]$ -------(3)
$= E[\hat{\mu}\,(x - E(x))]$ as $E(\hat{\mu}) = 0$
$Cov(\hat{\mu}, x) = E[\hat{\mu}\,x]$ -----(4)
• If the covariance is zero, then
$\sum_{i=1}^{n} \hat{\mu}_i x_i = 0$ -----(5)
• Equation 5 is the first-order condition with respect to $\hat{\beta}_1$, which holds for any x
$-2 \sum_{i=1}^{n} x_i\,(y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) = 0$ -------(6)
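Numerical properties 1 and 2 can be checked on any sample. The sketch below uses hypothetical toy data (not the Wage1 dataset) and confirms that the residuals sum to zero and are uncorrelated with x in the sample, up to floating-point error.

```python
from statistics import mean

# Hypothetical toy data (NOT from the lecture).
x = [8, 10, 12, 12, 14, 16, 18]
y = [4.0, 5.5, 6.0, 7.5, 8.0, 10.5, 12.0]

x_bar, y_bar = mean(x), mean(y)
beta1_hat = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
beta0_hat = y_bar - beta1_hat * x_bar
residuals = [yi - beta0_hat - beta1_hat * xi for xi, yi in zip(x, y)]

# Property 1: the residuals sum (and therefore average) to zero.
sum_u = sum(residuals)

# Property 2: sum of x_i * u_i is zero, so the sample covariance is zero.
sum_xu = sum(xi * ui for xi, ui in zip(x, residuals))

print(f"sum of residuals     = {sum_u:.2e}")
print(f"sum of x * residuals = {sum_xu:.2e}")
```

Both sums are zero only up to rounding error, so the printed values are tiny numbers rather than exact zeros.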
Numerical Property 2: Example
• The Wage1 dataset is used to estimate the wage
Source: Author’s estimation using Wage1 dataset in STATA, refer Do file for command
• $Cov(\hat{\mu}, \text{education}) = 0$
Numerical Property: 3
• The point $(\bar{x}, \bar{y})$ is always on the OLS regression line
$\widehat{\text{wage}} = \hat{\beta}_0 + \hat{\beta}_1\,\text{education}$ ------(7)
• In the equation above, if we plug in the mean of education, $\overline{\text{education}}$, for education, then the predicted value of wage will be its mean, i.e., $\overline{\text{wage}}$
• These properties can be used to write each $y_i$ as its fitted value plus its residual, as given below
$y_i = \hat{y}_i + \hat{u}_i$ ------(8)
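Numerical property 3 can be verified in a few lines. The sketch below, again on hypothetical toy data rather than the lecture's dataset, evaluates the fitted line at the sample mean of x and compares the result with the sample mean of y.

```python
from statistics import mean

# Hypothetical toy data (NOT from the lecture).
x = [8, 10, 12, 12, 14, 16, 18]
y = [4.0, 5.5, 6.0, 7.5, 8.0, 10.5, 12.0]

x_bar, y_bar = mean(x), mean(y)
beta1_hat = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
beta0_hat = y_bar - beta1_hat * x_bar

# Property 3: plugging x_bar into the fitted line returns y_bar.
prediction_at_mean = beta0_hat + beta1_hat * x_bar

print(f"y_bar                   = {y_bar:.6f}")
print(f"fitted value at x = x_bar = {prediction_at_mean:.6f}")
```

This follows directly from the intercept formula: the intercept is defined so that the line passes through the point of means.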
The Simple Regression Model
• This table presents fitted values and residuals for 15 CEOs (salary, salaryhat, and uhat are in thousands of dollars).
• For example, the 12th CEO's predicted salary is $526,023 higher than their actual salary.
• By contrast, the 5th CEO's predicted salary is $149,493 lower than their actual salary.

obsno   roe    salary   salaryhat   uhat
1       14.1   1095     1224.058    -129.058
2       10.9   1001     1164.854    -163.854
3       23.5   1122     1397.969    -275.969
4        5.9    578     1072.348    -494.348
5       13.8   1368     1218.508     149.493
6       20.0   1145     1333.215    -188.215
7       16.4   1078     1266.611    -188.611
8       16.3   1094     1264.761    -170.761
9       10.5   1237     1157.454      79.546
10      26.3    833     1449.773    -616.773
11      25.9    567     1442.372    -875.372
12      26.8    933     1459.023    -526.023
13      14.8   1339     1237.009     101.991
14      22.3    937     1375.768    -438.768
15      56.3   2011     2004.808       6.192
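As a quick check of how the table is read, the sketch below recomputes the residual uhat = salary - salaryhat for the two CEOs discussed in the bullets, using the salary and salaryhat values from rows 5 and 12 of the table. The dictionary name `ceo_rows` is an illustrative assumption.

```python
# (salary, salaryhat) for observations 5 and 12, copied from the table.
# Values are in thousands of dollars.
ceo_rows = {
    5: (1368, 1218.508),
    12: (933, 1459.023),
}

# Residual = actual salary minus fitted salary.
ceo_residuals = {obs: salary - salaryhat
                 for obs, (salary, salaryhat) in ceo_rows.items()}

for obs, uhat in ceo_residuals.items():
    direction = "below" if uhat > 0 else "above"
    print(f"CEO {obs}: uhat = {uhat:.3f}, so the prediction is "
          f"{abs(uhat) * 1000:,.0f} dollars {direction} the actual salary")
```

The recomputed residual for CEO 5 can differ from the tabulated value in the last decimal place because the table's entries are themselves rounded.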
Measure of Goodness of Fit ($R^2$)
Introduction
• Goodness of fit measures how well the independent variables explain the dependent
variable y
• From numerical property 1: the average of the residuals is zero because the sum of the residuals is zero
$\bar{\hat{\mu}} = \dfrac{\sum_{i=1}^{n} \hat{\mu}_i}{n} = 0$ if $\sum_{i=1}^{n} \hat{\mu}_i = 0$ -------(1)
• Actual $y_i$ consists of a fitted value and a residual
$y_i = \hat{y}_i + \hat{\mu}_i$ -------(2)
• From eq. 2, the sample average of the fitted values can be written as
$\bar{\hat{y}} = \bar{y}$ --------(3)
(summing eq. 2 over i, dividing by n, and plugging in eq. 1)
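Introduction (contd.)
• The covariance between the residuals $\hat{\mu}_i$ and $x_i$ is 0
$Cov(\hat{\mu}_i, x_i) = E[\hat{\mu}_i x_i] = 0$ as $\sum_{i=1}^{n} \hat{\mu}_i x_i = 0$ -----(4)
• The covariance between the fitted values and the residuals is 0
$Cov(\hat{y}_i, \hat{\mu}_i) = E[\hat{\mu}_i \hat{y}_i] = E[\hat{\mu}_i (\hat{\beta}_0 + \hat{\beta}_1 x_i)] = 0$ -------(5)
since $\sum_{i=1}^{n} \hat{\mu}_i (\hat{\beta}_0 + \hat{\beta}_1 x_i) = \hat{\beta}_0 \sum_{i=1}^{n} \hat{\mu}_i + \hat{\beta}_1 \sum_{i=1}^{n} \hat{\mu}_i x_i = 0$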
Variation in a regression model
• Total sum of squares (SST): Measures total sample variation in $y_i$
Total variation: $SST = \sum_{i=1}^{n} (y_i - \bar{y})^2$ -------(6)
• Explained sum of squares (SSE): Measures sample variation in $\hat{y}_i$
Explained variation: $SSE = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2$ -------(7)
• Residual sum of squares (SSR): Measures sample variation in $\hat{u}_i$
Unexplained variation: $SSR = \sum_{i=1}^{n} \hat{u}_i^2$ -------(8)
Variation in a regression model (contd.)
• The total variation in y can be written as the sum of explained and unexplained variation
SST = SSE + SSR---------(9)
• Dividing the equation above throughout by SST
1= SSE/SST + SSR/SST --------(10)
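The decomposition SST = SSE + SSR can be confirmed numerically. The sketch below computes the three sums of squares on hypothetical toy data (not the Wage1 dataset) and checks that the explained and unexplained shares add up to one.

```python
from statistics import mean

# Hypothetical toy data (NOT from the lecture).
x = [8, 10, 12, 12, 14, 16, 18]
y = [4.0, 5.5, 6.0, 7.5, 8.0, 10.5, 12.0]

x_bar, y_bar = mean(x), mean(y)
beta1_hat = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
beta0_hat = y_bar - beta1_hat * x_bar
fitted = [beta0_hat + beta1_hat * xi for xi in x]

sst = sum((yi - y_bar) ** 2 for yi in y)                  # total variation
sse = sum((yhat - y_bar) ** 2 for yhat in fitted)         # explained variation
ssr = sum((yi - yhat) ** 2 for yi, yhat in zip(y, fitted))  # unexplained variation

print(f"SST = {sst:.4f}")
print(f"SSE + SSR = {sse + ssr:.4f}")
print(f"SSE/SST + SSR/SST = {sse / sst + ssr / sst:.4f}")
```

The identity relies on the fitted values and residuals being uncorrelated in the sample, so it holds for OLS with an intercept but not for an arbitrary line.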
Coefficient of determination ($R^2$)
• It is the ratio of the explained variation (SSE) to the total variation (SST)
• Fraction of the sample variation in y that is explained by x
$R^2 = \dfrac{SSE}{SST}$ -------(11)
• The value of $R^2$ is always between zero and one, because SSE can be no greater than SST
$R^2 = 1 - \dfrac{SSR}{SST}$ -------(12)
• If the regression model fits well, then $SSR/SST$ is nearly zero and so $R^2$ is close to one
• If the regression model fits badly, then $SSR/SST$ is nearly one and so $R^2$ is close to zero
STATA Results
• The Wage1 dataset is used to estimate wage
STATA Result 1: Regression Result
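. reg wage education

      Source |       SS           df       MS      Number of obs   =       526
-------------+----------------------------------   F(1, 524)       =    103.36
       Model |  1179.73204         1  1179.73204   Prob > F        =    0.0000
    Residual |  5980.68225       524  11.4135158   R-squared       =    0.1648
-------------+----------------------------------   Adj R-squared   =    0.1632
       Total |  7160.41429       525  13.6388844   Root MSE        =    3.3784

------------------------------------------------------------------------------
        wage | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
   education |   .5413593    .053248    10.17   0.000     .4367534    .6459651
       _cons |  -.9048516   .6849678    -1.32   0.187    -2.250472    .4407687
------------------------------------------------------------------------------
$R^2 = 0.1648$, so 100 × 0.1648 = 16.48% of the sample variation in wage is explained by education
Unexplained: 83.52%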
Source: Author’s estimation using Wage1 dataset in STATA, refer Do file for command
• A low $R^2$ value does not mean that the OLS regression equation is useless
• Using R-squared as the main gauge of success for an econometric analysis can lead to
trouble
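The goodness-of-fit entries in the Stata output can be reproduced from its sums of squares. In the sketch below the sums of squares and the number of observations are copied from the output; the adjusted R-squared and Root MSE formulas are the standard degrees-of-freedom versions and are stated here as an assumption about how Stata computes them, since the slides do not show them.

```python
import math

# Sums of squares and sample size copied from the Stata output.
ss_model = 1179.73204   # explained sum of squares (SSE in the slides)
ss_resid = 5980.68225   # residual sum of squares (SSR in the slides)
ss_total = 7160.41429   # total sum of squares (SST)
n_obs = 526
k = 1                   # number of explanatory variables

r2 = ss_model / ss_total
r2_alt = 1 - ss_resid / ss_total

# Assumed standard formulas (not shown on the slides).
adj_r2 = 1 - (1 - r2) * (n_obs - 1) / (n_obs - k - 1)
root_mse = math.sqrt(ss_resid / (n_obs - k - 1))

print(f"R-squared          = {r2:.4f}")
print(f"1 - SSR/SST        = {r2_alt:.4f}")
print(f"explained share    = {100 * r2:.2f}%")
print(f"unexplained share  = {100 * (1 - r2):.2f}%")
print(f"Adj R-squared      = {adj_r2:.4f}")
print(f"Root MSE           = {root_mse:.4f}")
```

The explained share comes from the R-squared row of the output, not from the adjusted R-squared row.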
The Simple Regression Model
• Goodness of fit
• How well does an explanatory variable explain the dependent variable?
• Measures of variation:
The Simple Regression Model
• Decomposition of total variation
• Goodness-of-fit measure (R-squared)
The Simple Regression Model
• CEO Salary and return on equity
• Voting outcomes and campaign expenditures
• Caution: A high R-squared does not necessarily mean that the
regression has a causal interpretation!
The Simple Regression Model
• Expected values and variances of the OLS estimators
• The estimated regression coefficients are random variables
because they are calculated from a random sample
• The question is what the estimators will estimate on average and
how large will their variability be in repeated samples
Assumptions and Unbiasedness
Property of OLS Estimators
SLR.1: Linear in parameters
Source: Wooldridge, Chapter 2
• The model must be linear in the parameters, i.e., each β enters with power 1, but it need not be linear in the variables y and x
SLR.2: Random Sampling
Source: Wooldridge, Chapter 2
• As the primary goal is to draw conclusions about the population, the sample that we use must be drawn at random from the population
• If the sample is not drawn at random, it may not be representative of the
population and the conclusions that we draw from it may be biased
The Simple Regression Model
• Discussion of random sampling: Wage and education
• The population consists, for example, of all workers of country A
• In the population, there is a linear relationship between wages and years
of education.
• Draw completely randomly a worker from the population
• The wage and the years of education of the worker drawn are random
because one does not know beforehand which worker is drawn.
• Throw that worker back into the population and repeat the random draw n
times.
• The wages and years of education of the sampled workers are used to
estimate the linear relationship between wages and education.
SLR.3: Sample Variation in the explanatory variable
Source: Wooldridge, Chapter 2
• This assumption requires that x varies in the sample
• Assumption SLR.3 fails if the sample standard deviation of $x_i$ is zero; otherwise, it holds.
• If it is zero, the denominator of $\hat{\beta}_1$ is zero, so neither $\hat{\beta}_1$ nor $\hat{\beta}_0$ can be computed
Source: Author’s estimation using Wage1 dataset in STATA, refer Do file for command
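A small sketch shows why SLR.3 is needed. The function below (the name `ols_simple` is an illustrative assumption) refuses to estimate the line when x has no sample variation, because the denominator of the slope formula is then zero. The data are hypothetical.

```python
from statistics import mean

def ols_simple(x, y):
    """Return (beta0_hat, beta1_hat); raise if x has no sample variation."""
    x_bar, y_bar = mean(x), mean(y)
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    if s_xx == 0:
        # SLR.3 fails: the slope formula would divide by zero.
        raise ValueError("x has no sample variation; slope and intercept are not defined")
    beta1_hat = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
    return y_bar - beta1_hat * x_bar, beta1_hat

# Hypothetical sample in which everyone has 12 years of education.
education_constant = [12, 12, 12, 12]
wages = [5.0, 6.5, 4.0, 7.0]

try:
    ols_simple(education_constant, wages)
    slr3_failed = False
except ValueError as err:
    slr3_failed = True
    print(f"Estimation not possible: {err}")

# With some variation in x, the same function works.
b0, b1 = ols_simple([10, 12, 14, 16], wages)
print(f"With variation in x: beta0_hat = {b0:.3f}, beta1_hat = {b1:.3f}")
```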
SLR.4: Zero Conditional Mean
Source: Wooldridge, Chapter 2
• It states that the average value of the disturbance, conditional on x, equals zero for all
possible values of the explanatory variable.
• It also implies the cloud of data centers on a straight line at every possible value of x
Statistical Property 1: Unbiasedness
• Bias means our estimator’s expected value does not equal the true value in the population
• In some ways, biased estimators aren't even "correct on average."
• The estimators are unbiased if
$E(\hat{\beta}) = \beta$ -------(1)
• Unbiasedness does not imply that $\hat{\beta}$ equals its population value in every possible sample
• It only means that, on average across repeated samples, $\hat{\beta}$ is neither systematically too large nor systematically too small relative to the population value
Theorem1: Unbiasedness of OLS Estimates
Source: Wooldridge, Chapter 2
• Under assumptions SLR.1 through SLR.4, the OLS estimators $\hat{\beta}_0$ and $\hat{\beta}_1$ are unbiased, i.e., on average they are equal to the population $\beta_0$ and $\beta_1$.
• In the real world, simple linear regression estimators will most commonly be biased
• Theorem 1 holds only if $E(u|x) = 0$, i.e., SLR.4 is true
Interpretation of unbiasedness
• The estimated coefficients may be smaller or larger, depending on the sample that is the result of a random draw.
• However, on average, they will be equal to the values that characterize the true relationship between y and x in the
population.
• “On average” means if sampling was repeated, i.e. if drawing the random sample and doing the estimation was
repeated many times.
• In a given sample, estimates may differ considerably from true values.
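The "repeated sampling" interpretation can be illustrated with a small Monte Carlo sketch. All settings below (true parameters, sample size, number of replications, error distribution, seed) are hypothetical choices for illustration; the disturbance is drawn independently of x, so SLR.4 holds by construction.

```python
import random
from statistics import mean, stdev

def ols_slope(x, y):
    """OLS slope from the deviation-from-mean formula."""
    x_bar, y_bar = mean(x), mean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    return s_xy / s_xx

rng = random.Random(0)          # fixed seed so the run is reproducible
TRUE_B0, TRUE_B1 = 1.0, 0.5     # hypothetical population parameters
N, REPS = 50, 2000              # sample size and number of repeated samples

slopes = []
for _ in range(REPS):
    x = [rng.uniform(8, 18) for _ in range(N)]
    u = [rng.gauss(0, 2) for _ in range(N)]   # drawn independently of x
    y = [TRUE_B0 + TRUE_B1 * xi + ui for xi, ui in zip(x, u)]
    slopes.append(ols_slope(x, y))

print(f"true slope                 = {TRUE_B1}")
print(f"average estimate           = {mean(slopes):.4f}")
print(f"std. dev. across samples   = {stdev(slopes):.4f}")
print(f"smallest / largest estimate = {min(slopes):.3f} / {max(slopes):.3f}")
```

Individual estimates scatter noticeably around the true slope, while their average across the repeated samples lands close to it.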
Failure of SLR.4 may lead to biased Estimates
• Usually SLR.4 fails to be true due to:
❑ Reverse causality
❑ Wrong functional form
❑ Expected values of y conditional on x do not really fall on a straight line
❑ Error in measurement of x variable
❑ Omitted variables
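One of the causes listed above, omitted variables, can be sketched with simulated data. Everything here is a hypothetical setup: an unobserved variable called `ability` raises both education and wage, and the regression of wage on education alone leaves it in the disturbance, so E(u|x) = 0 fails. The parameter values are illustrative assumptions, not estimates from any dataset.

```python
import random
from statistics import mean

def ols_slope(x, y):
    """OLS slope from the deviation-from-mean formula."""
    x_bar, y_bar = mean(x), mean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    return s_xy / s_xx

rng = random.Random(1)   # fixed seed so the run is reproducible
N = 20000
TRUE_EDUC_EFFECT = 0.5   # hypothetical true effect of education on wage
ABILITY_EFFECT = 1.0     # hypothetical effect of the omitted variable

ability = [rng.gauss(0, 1) for _ in range(N)]
# Education is correlated with ability: Cov = 2, Var(education) = 4 + 1 = 5.
education = [12 + 2 * a + rng.gauss(0, 1) for a in ability]
wage = [1 + TRUE_EDUC_EFFECT * e + ABILITY_EFFECT * a + rng.gauss(0, 1)
        for e, a in zip(education, ability)]

# Regress wage on education only; ability is left in the disturbance.
slope_short = ols_slope(education, wage)

# Textbook omitted-variable bias for this setup: 1.0 * Cov / Var = 2 / 5 = 0.4.
expected_bias = ABILITY_EFFECT * 2 / 5

print(f"true education effect          = {TRUE_EDUC_EFFECT}")
print(f"estimate omitting ability      = {slope_short:.3f}")
print(f"true effect + theoretical bias = {TRUE_EDUC_EFFECT + expected_bias:.3f}")
```

Even with a very large sample the estimate stays far from the true effect, which is what distinguishes bias from sampling variability.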