Understanding Multiple Regression Analysis

The document discusses multiple regression analysis, focusing on the population multiple regression model, interpretation of coefficients, and the OLS estimator. It also covers measures of fit such as R-squared and adjusted R-squared, as well as the implications of multicollinearity and omitted variable bias. An example using test scores illustrates the differences between simple and multiple regression models, highlighting the importance of including relevant variables for accurate predictions.

Uploaded by

Baran Alp Özdemir

Multiple Regression

(Stock & Watson Ch. 6)

Outline
1. Multiple regression and OLS
2. Measures of fit
3. Least squares assumptions for the MR model
4. Sampling distribution of the OLS estimator
5. Multicollinearity
6. Omitted variable bias

The Population Multiple Regression Model
(S&W Section 6.2)
Consider the case of two regressors:
$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + u_i, \quad i = 1, \dots, n$
• $Y$ is the dependent variable.
• $X_1$, $X_2$ are the two independent variables (regressors).
• $(Y_i, X_{1i}, X_{2i})$ denote the $i$th observation on $Y$, $X_1$, and $X_2$.
• $\beta_0$ = unknown population intercept.
• $\beta_1$ = slope of $X_1$ (effect on $Y$ of a change in $X_1$, holding $X_2$ constant)
• $\beta_2$ = slope of $X_2$ (effect on $Y$ of a change in $X_2$, holding $X_1$ constant)
• $u_i$ = the regression error (omitted factors).

Interpretation of coefficients in multiple regression
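$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + u_i, \quad i = 1, \dots, n$
Consider changing $X_1$ by $\Delta X_1$ while holding $X_2$ constant.

Population regression line before the change:
$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2$
Population regression line after the change:
$Y + \Delta Y = \beta_0 + \beta_1 (X_1 + \Delta X_1) + \beta_2 X_2$
Here $\Delta Y$ is the change in $Y$ in response to the change in $X_1$. Subtracting the "before" line from the "after" line:
After: $Y + \Delta Y = \beta_0 + \beta_1 (X_1 + \Delta X_1) + \beta_2 X_2$
Before: $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2$
Difference: $\Delta Y = \beta_1 \Delta X_1$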
Holding $X_2$ constant, the change in $Y$ in response to the change in $X_1$ is:
$\Delta Y = \beta_1 \Delta X_1$.
Similarly, holding $X_1$ constant, we would get:
$\Delta Y = \beta_2 \Delta X_2$.
Interpretation of coefficients in multiple regression:
$\beta_1 = \dfrac{\Delta Y}{\Delta X_1}$, holding $X_2$ constant

$\beta_2 = \dfrac{\Delta Y}{\Delta X_2}$, holding $X_1$ constant

$\beta_0$ = expected value of $Y$ when $X_1 = X_2 = 0$.

If you are comfortable with math, use "$\partial$" in place of $\Delta$.
The OLS Estimator in Multiple Regression
(S&W Section 6.3)

With two regressors, the Least Squares problem is:

$$\min_{b_0, b_1, b_2} \sum_{i=1}^{n} \left[ Y_i - (b_0 + b_1 X_{1i} + b_2 X_{2i}) \right]^2$$

• This minimization problem is solved using calculus. The 3 first-order conditions (F.O.C.s) yield a system of 3 linear equations in 3 unknowns. This system has a unique solution as long as Least Squares Assumption 4 (LS A4) holds (discussed below).
• The values that solve the 3-equation system are the OLS estimators.
• The OLS estimator minimizes the average squared difference
between the actual values of Yi and the prediction (predicted value)
based on the estimated line.

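Least squares problem:
$$\min_{b_0, b_1, b_2} \sum_{i=1}^{n} \left[ Y_i - (b_0 + b_1 X_{1i} + b_2 X_{2i}) \right]^2$$
F.O.C.s:
• with respect to (w.r.t.) $b_0$:
$\sum_i Y_i - n b_0 - b_1 \sum_i X_{1i} - b_2 \sum_i X_{2i} = 0$
• w.r.t. $b_1$:
$\sum_i Y_i X_{1i} - b_0 \sum_i X_{1i} - b_1 \sum_i X_{1i}^2 - b_2 \sum_i X_{1i} X_{2i} = 0$
• w.r.t. $b_2$:
$\sum_i Y_i X_{2i} - b_0 \sum_i X_{2i} - b_1 \sum_i X_{1i} X_{2i} - b_2 \sum_i X_{2i}^2 = 0$
The solution yields the OLS estimators of $\beta_0$, $\beta_1$ and $\beta_2$. We denote these by putting hats (^) above the unknown parameters: $\hat\beta_0$, $\hat\beta_1$, $\hat\beta_2$.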
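The three first-order conditions above form a linear system that can be solved numerically. The following is a minimal sketch, not the method Stata uses internally: it builds the 3×3 system from sums of the data and solves it by Gaussian elimination. The function names `solve_linear_system` and `ols_two_regressors`, the synthetic data, and the "true" coefficients (1, 2, -3) are all assumptions made for illustration.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            # No unique solution, e.g. perfectly collinear regressors.
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def ols_two_regressors(y, x1, x2):
    """Return (b0, b1, b2) solving the three first-order conditions."""
    n = len(y)
    s1, s2 = sum(x1), sum(x2)
    s11 = sum(v * v for v in x1)
    s22 = sum(v * v for v in x2)
    s12 = sum(u * v for u, v in zip(x1, x2))
    sy = sum(y)
    sy1 = sum(u * v for u, v in zip(y, x1))
    sy2 = sum(u * v for u, v in zip(y, x2))
    # Rows correspond to the F.O.C.s w.r.t. b0, b1 and b2, in that order.
    a = [[n, s1, s2], [s1, s11, s12], [s2, s12, s22]]
    return tuple(solve_linear_system(a, [sy, sy1, sy2]))


# Synthetic data (hypothetical): Y = 1 + 2*X1 - 3*X2 + small noise.
random.seed(0)
n_obs = 200
x1 = [random.gauss(0, 1) for _ in range(n_obs)]
x2 = [0.5 * v + random.gauss(0, 1) for v in x1]  # correlated with x1
y = [1.0 + 2.0 * u - 3.0 * v + random.gauss(0, 0.1) for u, v in zip(x1, x2)]

b0, b1, b2 = ols_two_regressors(y, x1, x2)
residuals = [yi - (b0 + b1 * u + b2 * v) for yi, u, v in zip(y, x1, x2)]
print("estimates:", b0, b1, b2)
print("sum of residuals:", sum(residuals))
```

The estimates should land close to the coefficients used to generate the data, and the residuals sum to (numerically) zero, which is the first F.O.C. restated. If `x2` were an exact linear function of `x1`, the system would be singular and the sketch would raise an error instead of returning estimates.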
Multiple regression in Stata (heteroscedasticity robust se’s)
reg testscr str pctel, robust

Regression with robust standard errors                 Number of obs =     420
                                                       F(  2,   417) =  223.82
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.4264
                                                       Root MSE      =  14.464
------------------------------------------------------------------------------
             |               Robust
     testscr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         str |  -1.101296   .4328472    -2.54   0.011     -1.95213   -.2504616
       pctel |  -.6497768   .0310318   -20.94   0.000     -.710775   -.5887786
       _cons |   686.0322   8.728224    78.60   0.000     668.8754     703.189
------------------------------------------------------------------------------

$\widehat{TestScore} = 686.0 - 1.10 \times STR - 0.65 \times PctEL, \quad R^2 = 0.43$
Robust standard errors: (8.73) for the intercept, (0.43) for STR, (0.031) for PctEL.
Test Score Example: Interpretation of Coefficient Estimates
−1.10 (slope of STR): increasing the student-teacher ratio by one student per teacher lowers predicted test scores by 1.10 points on average, holding the percentage of English learners (PctEL) constant.

−0.65 (slope of PctEL): a 1 percentage point increase in the percentage of English learners lowers predicted test scores by 0.65 points on average, holding class size (STR) constant.

686.0 (intercept): the predicted test score is 686 points for a class with no students (STR = 0) and no English learners (PctEL = 0). This is an extrapolation far outside the data, so the intercept has no practical interpretation here.
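To make the "holding constant" interpretation concrete, here is a small sketch that plugs numbers into the estimated regression line. The coefficients are the rounded estimates reported above. The district values (STR = 20, PctEL = 10) and the function name `predicted_test_score` are hypothetical, chosen only for illustration.

```python
# Rounded coefficient estimates from the multiple regression above.
INTERCEPT = 686.0
B_STR = -1.10
B_PCTEL = -0.65


def predicted_test_score(str_ratio, pct_el):
    """Predicted test score from the estimated regression line."""
    return INTERCEPT + B_STR * str_ratio + B_PCTEL * pct_el


# Hypothetical district: 20 students per teacher, 10% English learners.
before = predicted_test_score(20, 10)
# Cut class size by 2 students per teacher, holding PctEL at 10.
after = predicted_test_score(18, 10)
change = after - before

print("predicted score before:", before)
print("predicted score after: ", after)
print("change:", change)  # equals B_STR * (-2)
```

Because PctEL is the same in both calls, the change depends only on the STR slope: it equals −1.10 × (−2), whatever value PctEL is held at.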
Test Score Example: SR vs. MR
Regression of TestScore on STR yields:
(Simple) $\widehat{TestScore} = 698.9 - 2.28 \times STR$, $R^2 = 0.051$.

Now include percent English Learners in the district (PctEL):

(Multiple) $\widehat{TestScore} = 686.0 - 1.10 \times STR - 0.65 \times PctEL$, $R^2 = 0.43$.

Remark: The estimated slope of STR changed noticeably, from −2.28 to −1.10. This is strong evidence of OVB (Omitted Variable Bias) in the simple regression.
Why?
1) STR and PctEL are correlated.
2) PctEL is a determinant of TestScore.
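The link between the two STR slopes can be written out using the standard in-sample OLS algebra for an omitted regressor. In the sketch below, $\hat\delta_1$ is a symbol introduced here (it does not appear in the slides): it denotes the slope from an auxiliary regression of PctEL on STR. The value of $\hat\delta_1$ is not reported in the slides; it is only what the rounded coefficients imply.

```latex
% Simple-regression slope = multiple-regression slope + (coefficient on the
% omitted variable) x (slope of the omitted variable on the included one)
\hat\beta_1^{\text{simple}} = \hat\beta_1^{\text{multiple}} + \hat\beta_2^{\text{multiple}} \, \hat\delta_1

% Plugging in the rounded estimates:
-2.28 = -1.10 + (-0.65)\,\hat\delta_1
\quad\Longrightarrow\quad
\hat\delta_1 = \frac{-2.28 + 1.10}{-0.65} \approx 1.82
```

Read this way, the two conditions listed under "Why?" map onto the two factors of the extra term: $\hat\delta_1 \neq 0$ is condition 1 (STR and PctEL are correlated) and $\hat\beta_2 \neq 0$ is condition 2 (PctEL is a determinant of TestScore). If either factor were zero, the two slopes would coincide.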
Measures of Fit for Multiple Regression
(S&W Section 6.4)
All measures exploit the breakdown of Y into the part that can be
predicted by X’s and the part that cannot be predicted from the X’s:
$Y_i = \hat Y_i + \hat u_i$
Actual Y = predicted Y + residual
We proceed as in the case of simple regression:
$$\sum_i (Y_i - \bar Y)^2 = \sum_i (\hat Y_i - \bar Y)^2 + \sum_i \hat u_i^2 + 2 \underbrace{\sum_i (\hat Y_i - \bar Y)\,\hat u_i}_{=\,0} = \sum_i (\hat Y_i - \bar Y)^2 + \sum_i \hat u_i^2 .$$
Total Sum of Squares = Explained Sum of Squares + Sum of Squared Residuals (Unexplained sum of squares)
TSS = ESS + SSR
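The decomposition relies on the cross term $\sum_i (\hat Y_i - \bar Y)\hat u_i$ being zero. A short sketch of why, for the two-regressor case, using only the three first-order conditions of the least squares problem evaluated at the OLS estimates:

```latex
% The F.O.C.s, evaluated at the OLS estimates, say that the residuals satisfy
\sum_i \hat u_i = 0, \qquad \sum_i X_{1i}\,\hat u_i = 0, \qquad \sum_i X_{2i}\,\hat u_i = 0 .

% Therefore, with \hat Y_i = \hat\beta_0 + \hat\beta_1 X_{1i} + \hat\beta_2 X_{2i},
\sum_i (\hat Y_i - \bar Y)\,\hat u_i
  = \hat\beta_0 \sum_i \hat u_i
  + \hat\beta_1 \sum_i X_{1i}\,\hat u_i
  + \hat\beta_2 \sum_i X_{2i}\,\hat u_i
  - \bar Y \sum_i \hat u_i
  = 0 .
```

The argument needs the regression to include an intercept, since that is where $\sum_i \hat u_i = 0$ comes from.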
Measures of fit:
SER = Standard Error of the Regression
(std. deviation of $\hat u_i$ with d.f. correction.)
RMSE = Root Mean Squared Error
(std. deviation of $\hat u_i$ without d.f. correction.)
$R^2$ = fraction of variance of Y explained by the X’s
$\bar R^2$ = adjusted R-squared.
(Modified $R^2$ with a d.f. correction that adjusts for use of additional regressors.)
SER and RMSE
Same idea as in simple regression: SER and the RMSE are measures
of the spread of the Y’s around the regression line:

$$SER = \sqrt{\frac{SSR}{n - k - 1}} = \sqrt{\frac{1}{n - k - 1} \sum_{i=1}^{n} \hat u_i^2}$$

$$RMSE = \sqrt{\frac{SSR}{n}} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \hat u_i^2}$$

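Recall that Stata reports this measure under the label "Root MSE". Note that Stata computes its Root MSE with the d.f. correction (dividing by n − k − 1), so the reported value corresponds to the SER as defined above.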

$R^2$ and $\bar R^2$ (adjusted $R^2$)

The $R^2$ is the fraction of the variance explained, the same definition as in regression with a single regressor:
$$R^2 = \frac{ESS}{TSS} = 1 - \frac{SSR}{TSS},$$
where $ESS = \sum_{i=1}^{n} (\hat Y_i - \bar Y)^2$, $SSR = \sum_{i=1}^{n} \hat u_i^2$, $TSS = \sum_{i=1}^{n} (Y_i - \bar Y)^2$.

Stata always reports $R^2$. When the robust option is not used, it also reports the adjusted $R^2$, TSS, ESS, SSR and the associated degrees of freedom; see the Stata output that follows.
Multiple regression in Stata (homoscedasticity-only se’s)

. reg testscr str pctel

      Source |       SS       df       MS              Number of obs =     420
-------------+------------------------------           F(  2,   417) =  155.01
       Model |  64864.3011     2  32432.1506           Prob > F      =  0.0000   “ESS”
    Residual |  87245.2925   417  209.221325           R-squared     =  0.4264   “SSR”
-------------+------------------------------           Adj R-squared =  0.4237
       Total |  152109.594   419  363.030056           Root MSE      =  14.464   “TSS”

------------------------------------------------------------------------------
     testscr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         str |  -1.101296   .3802783    -2.90   0.004    -1.848797   -.3537945
       pctel |  -.6497768   .0393425   -16.52   0.000    -.7271112   -.5724423
       _cons |   686.0322   7.411312    92.57   0.000     671.4641    700.6004
------------------------------------------------------------------------------
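The fit measures can be recomputed by hand from the sums of squares in the Stata output above. The sketch below is only a consistency check on the printed numbers. The variable names are chosen here for illustration, and the values are copied from the Model ("ESS"), Residual ("SSR") and Total ("TSS") rows.

```python
import math

# Sums of squares copied from the Stata output above.
ESS = 64864.3011   # Model
SSR = 87245.2925   # Residual
TSS = 152109.594   # Total
n = 420            # number of observations
k = 2              # number of regressors (str, pctel)

# TSS = ESS + SSR, up to rounding in the printed table.
gap = ESS + SSR - TSS

# R-squared as the explained fraction of the total sum of squares.
r2 = ESS / TSS

# SER uses the d.f. correction (n - k - 1); RMSE divides by n.
ser = math.sqrt(SSR / (n - k - 1))
rmse = math.sqrt(SSR / n)

print("ESS + SSR - TSS:", gap)
print("R-squared:", round(r2, 4))
print("SER  (divide by n - k - 1):", round(ser, 3))
print("RMSE (divide by n):        ", round(rmse, 3))
```

With these inputs, the SER (the version with the d.f. correction) is the one that reproduces the "Root MSE" figure printed by Stata. The version that divides by n comes out slightly smaller.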

$R^2$ and $\bar R^2$, cont’d.
• $R^2$ never decreases when you add another regressor, and in practice it almost always increases (why?)
• This may be seen as a disadvantage when we are assessing “fit.”

$\bar R^2$ (“adjusted $R^2$”) corrects this problem by “penalizing” the fit measure for including additional regressors. While $R^2$ does not decrease when you add another regressor, $\bar R^2$ does not necessarily increase.
Adjusted $R^2$: $\bar R^2 = 1 - \left( \dfrac{n - 1}{n - k - 1} \right) \dfrac{SSR}{TSS}$; k = no. of regressors.
Compare with: $R^2 = 1 - \dfrac{SSR}{TSS}$
Note that $\bar R^2 < R^2$; however, if n is large the two will be very close.
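As a sketch of how the d.f. correction behaves, the function below evaluates the adjusted $R^2$ formula directly. The name `adjusted_r2` is hypothetical. SSR and TSS are taken from the homoskedasticity-only Stata output earlier in these slides. The small-sample case (n = 30) is invented purely to show the effect of n and does not correspond to any real data.

```python
def adjusted_r2(ssr, tss, n, k):
    """Adjusted R-squared: 1 - ((n - 1) / (n - k - 1)) * SSR / TSS."""
    return 1.0 - (n - 1) / (n - k - 1) * ssr / tss


SSR = 87245.2925   # Residual sum of squares from the Stata output
TSS = 152109.594   # Total sum of squares from the Stata output

r2 = 1.0 - SSR / TSS
adj_large_n = adjusted_r2(SSR, TSS, n=420, k=2)

# Hypothetical: the same SSR/TSS ratio, but with only 30 observations.
adj_small_n = adjusted_r2(SSR, TSS, n=30, k=2)

print("R-squared:             ", round(r2, 4))
print("adjusted R2 (n = 420): ", round(adj_large_n, 4))
print("adjusted R2 (n = 30):  ", round(adj_small_n, 4))
```

With n = 420 and k = 2 the penalty factor (n − 1)/(n − k − 1) is 419/417, very close to 1, which is why the two measures nearly coincide for the California data. With a much smaller n the same factor moves further from 1 and the gap widens.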
Test Score Example: California school data
(Simple) $\widehat{TestScore} = 698.9 - 2.28 \times STR$,
$R^2 = 0.0512$, RMSE = 18.6
(Multiple) $\widehat{TestScore} = 686.0 - 1.10 \times STR - 0.65 \times PctEL$,
$R^2 = 0.426$, $\bar R^2 = 0.424$, RMSE = 14.5

Question: Compare the $R^2$ values. What does this tell you about the fit of the multiple regression compared with the simple regression?

Question: Why are the $R^2$ and the $\bar R^2$ so close in the multiple regression?
