
Understanding Multiple Regression Analysis

The document discusses multiple regression analysis, focusing on the population multiple regression model, interpretation of coefficients, and the OLS estimator. It also covers measures of fit such as R-squared and adjusted R-squared, as well as the implications of multicollinearity and omitted variable bias. An example using test scores illustrates the differences between simple and multiple regression models, highlighting the importance of including relevant variables for accurate predictions.

Multiple Regression

(Stock & Watson Ch. 6)

Outline
1. Multiple regression and OLS
2. Measures of fit
3. Least squares assumptions for the MR model
4. Sampling distribution of the OLS estimator
5. Multicollinearity
6. Omitted variable bias

The Population Multiple Regression Model
(S&W Section 6.2)
Consider the case of two regressors:
$$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + u_i, \quad i = 1, \dots, n$$
• $Y$ is the dependent variable.
• $X_1$, $X_2$ are the two independent variables (regressors).
• $(Y_i, X_{1i}, X_{2i})$ denotes the $i$th observation on $Y$, $X_1$, and $X_2$.
• $\beta_0$ = unknown population intercept.
• $\beta_1$ = slope of $X_1$ (effect on $Y$ of a change in $X_1$, holding $X_2$ constant).
• $\beta_2$ = slope of $X_2$ (effect on $Y$ of a change in $X_2$, holding $X_1$ constant).
• $u_i$ = the regression error (omitted factors).

Interpretation of coefficients in multiple regression
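$$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + u_i, \quad i = 1, \dots, n$$
Consider changing $X_1$ by $\Delta X_1$ while holding $X_2$ constant.

Population regression line before the change:
$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2$$
Population regression line after the change:
$$Y + \Delta Y = \beta_0 + \beta_1 (X_1 + \Delta X_1) + \beta_2 X_2$$
Here $\Delta Y$ is the change in $Y$ in response to the change in $X_1$. Subtracting the "before" line from the "after" line:
After: $Y + \Delta Y = \beta_0 + \beta_1 (X_1 + \Delta X_1) + \beta_2 X_2$
Before: $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2$
Difference: $\Delta Y = \beta_1 \Delta X_1$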
Holding $X_2$ constant, the change in $Y$ in response to the change in $X_1$ is:
$$\Delta Y = \beta_1 \Delta X_1.$$
Similarly, holding $X_1$ constant, we would get:
$$\Delta Y = \beta_2 \Delta X_2.$$
Interpretation of coefficients in multiple regression:
$$\beta_1 = \frac{\Delta Y}{\Delta X_1}, \text{ holding } X_2 \text{ constant}$$
$$\beta_2 = \frac{\Delta Y}{\Delta X_2}, \text{ holding } X_1 \text{ constant}$$
$\beta_0$ = expected value of $Y$ when $X_1 = X_2 = 0$.

If you are comfortable with math, use “$\partial$” in place of $\Delta$.

The OLS Estimator in Multiple Regression
(S&W Section 6.3)

With two regressors, the least squares problem is:

$$\min_{b_0, b_1, b_2} \sum_{i=1}^{n} \left[ Y_i - (b_0 + b_1 X_{1i} + b_2 X_{2i}) \right]^2$$

• This minimization problem is solved using calculus. The three first-order conditions (F.O.C.s) yield a system of three linear equations in three unknowns. This system has a unique solution as long as least squares assumption 4 (LS A4, no perfect multicollinearity) holds (discussed below).
• The values that solve the three-equation system are the OLS estimators.
• The OLS estimator minimizes the average squared difference between the actual values of $Y_i$ and the predicted values based on the estimated line.

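Least squares problem:
$$\min_{b_0, b_1, b_2} \sum_{i=1}^{n} \left[ Y_i - (b_0 + b_1 X_{1i} + b_2 X_{2i}) \right]^2$$
F.O.C.s:
• with respect to (w.r.t.) $b_0$:
$$\sum_i Y_i - n b_0 - b_1 \sum_i X_{1i} - b_2 \sum_i X_{2i} = 0$$
• w.r.t. $b_1$:
$$\sum_i Y_i X_{1i} - b_0 \sum_i X_{1i} - b_1 \sum_i X_{1i}^2 - b_2 \sum_i X_{1i} X_{2i} = 0$$
• w.r.t. $b_2$:
$$\sum_i Y_i X_{2i} - b_0 \sum_i X_{2i} - b_1 \sum_i X_{1i} X_{2i} - b_2 \sum_i X_{2i}^2 = 0$$
Solving this system yields the OLS estimators of $\beta_0$, $\beta_1$ and $\beta_2$. We denote these by putting hats (^) above the unknown parameters: $\hat{\beta}_0$, $\hat{\beta}_1$, $\hat{\beta}_2$.
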
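As a rough illustration of how the three F.O.C.s pin down the estimates, the sketch below builds the 3-equation system from sums of the data and solves it by Gauss-Jordan elimination in plain Python. The function names (`solve_3x3`, `ols_two_regressors`) and the small noise-free data set are hypothetical, made up so that the known coefficients (1, 2, 3) should be recovered; this is not a description of how Stata computes OLS internally.

```python
def solve_3x3(a, b):
    """Solve the 3x3 linear system a x = b by Gauss-Jordan elimination."""
    # Augmented matrix [a | b]
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        # Partial pivoting: use the row with the largest entry in this column
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            # A zero pivot means no unique solution (perfect multicollinearity)
            raise ValueError("system is singular: perfect multicollinearity")
        m[col], m[pivot] = m[pivot], m[col]
        # Eliminate this column from every other row
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]


def ols_two_regressors(y, x1, x2):
    """Return (b0_hat, b1_hat, b2_hat) by solving the three F.O.C.s."""
    n = len(y)
    s1, s2 = sum(x1), sum(x2)
    s11 = sum(v * v for v in x1)
    s22 = sum(v * v for v in x2)
    s12 = sum(u * v for u, v in zip(x1, x2))
    # Each row is one F.O.C. rearranged as (coefficients) . (b0, b1, b2) = rhs
    a = [[n, s1, s2],
         [s1, s11, s12],
         [s2, s12, s22]]
    rhs = [sum(y),
           sum(u * v for u, v in zip(y, x1)),
           sum(u * v for u, v in zip(y, x2))]
    return solve_3x3(a, rhs)


# Hypothetical noise-free data generated from Y = 1 + 2*X1 + 3*X2
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0]
y = [1.0 + 2.0 * u + 3.0 * v for u, v in zip(x1, x2)]

b0_hat, b1_hat, b2_hat = ols_two_regressors(y, x1, x2)
print(b0_hat, b1_hat, b2_hat)
```

If `x2` were an exact linear function of `x1` (for example `x2 = [2 * v for v in x1]`), the pivot check would raise an error, which is the numerical face of the uniqueness condition mentioned above.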
Multiple regression in Stata (heteroscedasticity robust se’s)
reg testscr str pctel, robust

Regression with robust standard errors                 Number of obs =     420
                                                       F(  2,   417) =  223.82
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.4264
                                                       Root MSE      =  14.464
------------------------------------------------------------------------------
             |               Robust
     testscr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         str |  -1.101296   .4328472    -2.54   0.011     -1.95213   -.2504616
       pctel |  -.6497768   .0310318   -20.94   0.000     -.710775   -.5887786
       _cons |   686.0322   8.728224    78.60   0.000     668.8754     703.189
------------------------------------------------------------------------------

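$$\widehat{TestScore} = \underset{(8.73)}{686.0} - \underset{(0.43)}{1.10} \times STR - \underset{(0.031)}{0.65} \times PctEL, \qquad R^2 = 0.43$$
(Heteroskedasticity-robust standard errors in parentheses.)
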
Test Score Example: Interpretation of Coefficient Estimates
-1.10 (slope of STR): A one-student increase in the student-teacher ratio (STR) decreases test scores by 1.10 points on average, holding the percentage of English learners (PctEL) constant.

-0.65 (slope of PctEL): A one-percentage-point increase in the percentage of English learners decreases test scores by 0.65 points on average, holding class size (STR) constant.

686.0 (intercept): The predicted test score is 686.0 points when the student-teacher ratio is zero (STR = 0) and the percentage of English learners is zero (PctEL = 0).
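The "holding constant" reading can be checked with simple arithmetic on the estimated line. The sketch below uses the rounded coefficient estimates from the regression above; the function name `predict_test_score` and the district values (STR = 20, PctEL = 10) are hypothetical, picked only for illustration.

```python
# Rounded OLS estimates from the multiple regression above
B0_HAT = 686.0    # intercept
B_STR = -1.10     # slope of STR
B_PCTEL = -0.65   # slope of PctEL


def predict_test_score(str_ratio, pct_el):
    """Predicted test score from the estimated regression line."""
    return B0_HAT + B_STR * str_ratio + B_PCTEL * pct_el


# Hypothetical district: 20 students per teacher, 10% English learners
before = predict_test_score(20.0, 10.0)
# Cut STR by 2 while holding PctEL constant at 10
after = predict_test_score(18.0, 10.0)

# Predicted change equals slope * change in STR = (-1.10) * (-2) = 2.2 points
change = after - before
print(before, after, change)
```

Because PctEL is the same in both calls, its term cancels in the difference and only the STR slope matters.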

Test Score Example: SR vs. MR
Regression of TestScore on STR yields:
(Simple) $\widehat{TestScore} = 698.9 - 2.28 \times STR$, $R^2 = 0.051$.

Now include percent English learners in the district (PctEL):
(Multiple) $\widehat{TestScore} = 686.0 - 1.10 \times STR - 0.65 \times PctEL$, $R^2 = 0.43$.

Remark: The estimated slope of STR changed noticeably, from -2.28 to -1.10. This is strong evidence of OVB (omitted variable bias) in the simple regression.
Why?
1) STR and PctEL are correlated.
2) PctEL is a determinant of TestScore.
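The two conditions above can be seen at work in a small numerical sketch. All numbers here are made up (they are not the California data): Y depends negatively on both regressors, and X2 is positively correlated with X1, so leaving X2 out should make the X1 slope look more negative than it is. The helper names `mean` and `simple_slope` are hypothetical.

```python
def mean(v):
    return sum(v) / len(v)


def simple_slope(y, x):
    """OLS slope from a simple regression of y on x (with intercept)."""
    xb, yb = mean(x), mean(y)
    sxy = sum((a - xb) * (b - yb) for a, b in zip(x, y))
    sxx = sum((a - xb) ** 2 for a in x)
    return sxy / sxx


# Hypothetical noise-free data: Y = 10 - 1.0*X1 - 0.5*X2
BETA1, BETA2 = -1.0, -0.5
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0]   # correlated with x1 (condition 1)
y = [10.0 + BETA1 * a + BETA2 * b for a, b in zip(x1, x2)]  # x2 matters (condition 2)

# Slope on X1 when X2 is omitted
short_slope = simple_slope(y, x1)

# Slope from regressing the omitted variable X2 on X1
delta = simple_slope(x2, x1)

# In this noise-free setup the short slope equals BETA1 + BETA2 * delta
implied = BETA1 + BETA2 * delta
print(short_slope, delta, implied)
```

If either condition fails (`delta` is zero, or `BETA2` is zero), the extra term vanishes and the simple-regression slope coincides with `BETA1`.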

Measures of Fit for Multiple Regression
(S&W Section 6.4)
All measures exploit the breakdown of Y into the part that can be predicted from the X’s and the part that cannot be predicted from the X’s:
$$Y_i = \hat{Y}_i + \hat{u}_i$$
Actual Y = predicted Y + residual
We proceed as in the case of simple regression:
$$\sum_i (Y_i - \bar{Y})^2 = \sum_i (\hat{Y}_i - \bar{Y})^2 + \sum_i \hat{u}_i^2 + 2 \underbrace{\sum_i (\hat{Y}_i - \bar{Y}) \hat{u}_i}_{=\,0} = \sum_i (\hat{Y}_i - \bar{Y})^2 + \sum_i \hat{u}_i^2.$$
Total Sum of Squares = Explained Sum of Squares + Sum of Squared Residuals (unexplained sum of squares)
TSS = ESS + SSR
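One way to see why the cross term vanishes is to rewrite the three F.O.C.s of the least squares problem in terms of the residuals. The sketch below assumes the two-regressor model with an intercept, with $\hat{\beta}_0, \hat{\beta}_1, \hat{\beta}_2$ denoting the OLS estimates and $\hat{u}_i = Y_i - \hat{Y}_i$.

```latex
% F.O.C.s evaluated at the OLS estimates, written in terms of residuals
\sum_i \hat{u}_i = 0, \qquad
\sum_i X_{1i}\,\hat{u}_i = 0, \qquad
\sum_i X_{2i}\,\hat{u}_i = 0

% Cross term, using \hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_{1i} + \hat{\beta}_2 X_{2i}
\sum_i (\hat{Y}_i - \bar{Y})\,\hat{u}_i
  = (\hat{\beta}_0 - \bar{Y}) \sum_i \hat{u}_i
  + \hat{\beta}_1 \sum_i X_{1i}\,\hat{u}_i
  + \hat{\beta}_2 \sum_i X_{2i}\,\hat{u}_i
  = 0
```

Each of the three sums on the right is zero by one of the F.O.C.s, so the decomposition relies on the regression including an intercept and being estimated by OLS.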

Measures of fit:
SER = Standard Error of the Regression
(std. deviation of $\hat{u}_i$ with d.f. correction.)
RMSE = Root Mean Squared Error
(std. deviation of $\hat{u}_i$ without d.f. correction.)
$R^2$ = fraction of variance of Y explained by the X’s
$\bar{R}^2$ = adjusted R-squared.
(Modified $R^2$ with a d.f. correction that adjusts for use of additional regressors.)

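SER and RMSE
Same idea as in simple regression: SER and the RMSE are measures of the spread of the Y’s around the regression line:

$$SER = \sqrt{\frac{SSR}{n - k - 1}} = \sqrt{\frac{1}{n - k - 1} \sum_{i=1}^{n} \hat{u}_i^2}$$

$$RMSE = \sqrt{\frac{SSR}{n}} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \hat{u}_i^2}$$

Here $k$ is the number of regressors. Note that the "Root MSE" in Stata output divides SSR by $n - k - 1$, so it corresponds to the SER as defined here; with large $n$ the two measures are nearly identical.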
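The two formulas can be compared with a quick calculation. The sketch below plugs in the residual sum of squares, sample size, and number of regressors that Stata reports for the test score regression with homoskedasticity-only standard errors (SSR = 87245.2925, n = 420, k = 2); the variable names are just for illustration.

```python
import math

# Values taken from the Stata output for: reg testscr str pctel
SSR = 87245.2925   # residual sum of squares
N = 420            # number of observations
K = 2              # number of regressors (str, pctel)

# With the degrees-of-freedom correction
ser = math.sqrt(SSR / (N - K - 1))

# Without the degrees-of-freedom correction
rmse = math.sqrt(SSR / N)

print(round(ser, 3), round(rmse, 3))
```

The d.f.-corrected value comes out at about 14.464, matching the "Root MSE" line in the Stata output, while dividing by n gives a slightly smaller number (about 14.413). The gap is small because n = 420 is large relative to k = 2.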

$R^2$ and $\bar{R}^2$ (adjusted $R^2$)

The $R^2$ is the fraction of the variance explained, the same definition as in regression with a single regressor:
$$R^2 = \frac{ESS}{TSS} = 1 - \frac{SSR}{TSS},$$
where $ESS = \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2$, $SSR = \sum_{i=1}^{n} \hat{u}_i^2$, $TSS = \sum_{i=1}^{n} (Y_i - \bar{Y})^2$.

Stata always reports $R^2$. When the robust option is not used, it also reports adjusted $R^2$, TSS, ESS, SSR and the associated degrees of freedom (see next page).

Multiple regression in Stata (homoscedasticity-only se’s)

. reg testscr str pctel

        Source |       SS       df       MS              Number of obs =     420
---------------+------------------------------           F(  2,   417) =  155.01
“ESS”    Model |  64864.3011     2  32432.1506           Prob > F      =  0.0000
“SSR” Residual |  87245.2925   417  209.221325           R-squared     =  0.4264
---------------+------------------------------           Adj R-squared =  0.4237
“TSS”    Total |  152109.594   419  363.030056           Root MSE      =  14.464

------------------------------------------------------------------------------
     testscr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         str |  -1.101296   .3802783    -2.90   0.004    -1.848797   -.3537945
       pctel |  -.6497768   .0393425   -16.52   0.000    -.7271112   -.5724423
       _cons |   686.0322   7.411312    92.57   0.000     671.4641    700.6004
------------------------------------------------------------------------------

$R^2$ and $\bar{R}^2$, cont’d.
• $R^2$ never decreases when you add another regressor, and it increases unless the estimated coefficient on the added regressor is exactly zero (why?)
• This may be seen as a disadvantage when we are assessing “fit.”

$\bar{R}^2$ (“adjusted $R^2$”) corrects this problem by “penalizing” the fit measure for including additional regressors. While $R^2$ cannot fall when you add another regressor, $\bar{R}^2$ does not necessarily increase.
$$\text{Adjusted } R^2: \quad \bar{R}^2 = 1 - \left( \frac{n - 1}{n - k - 1} \right) \frac{SSR}{TSS}; \quad k = \text{no. of regressors.}$$
$$R^2 = 1 - \frac{SSR}{TSS}$$
Note that $\bar{R}^2 < R^2$; however, if $n$ is large the two will be very close.
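Both formulas can be checked against the sums of squares that Stata reports for the test score regression with homoskedasticity-only standard errors. The sketch below is only a plug-in calculation; the variable names are illustrative.

```python
# Sums of squares from the Stata ANOVA table for: reg testscr str pctel
ESS = 64864.3011    # "Model" sum of squares
SSR = 87245.2925    # "Residual" sum of squares
TSS = 152109.594    # "Total" sum of squares
N = 420             # number of observations
K = 2               # number of regressors

# R-squared, computed two equivalent ways
r2 = 1 - SSR / TSS
r2_from_ess = ESS / TSS

# Adjusted R-squared: the factor (n - 1) / (n - k - 1) inflates SSR/TSS slightly
penalty = (N - 1) / (N - K - 1)
adj_r2 = 1 - penalty * SSR / TSS

print(round(r2, 4), round(adj_r2, 4), round(penalty, 4))
```

The results round to 0.4264 and 0.4237, in line with the "R-squared" and "Adj R-squared" lines of the Stata output. The penalty factor is 419/417, barely above 1, which is why the two measures are so close in this sample.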

Test Score Example: California school data
(Simple) $\widehat{TestScore} = 698.9 - 2.28 \times STR$,
$R^2 = 0.0512$, RMSE = 18.6
(Multiple) $\widehat{TestScore} = 686.0 - 1.10 \times STR - 0.65 \times PctEL$,
$R^2 = 0.426$, $\bar{R}^2 = 0.424$, RMSE = 14.5

Question: Compare the $R^2$ values. What does this tell you about the fit of the multiple regression compared with the simple regression?

Question: Why are the $R^2$ and the $\bar{R}^2$ so close in the multiple regression?
