Linear Regression with One Regressor

This document discusses linear regression with one regressor. Key points include:
1. Ordinary least squares (OLS) is used to estimate the population slope and intercept by drawing the line that best fits the data.
2. The OLS estimator chooses the estimated slope ($\hat{\beta}_1$) and intercept ($\hat{\beta}_0$) that minimize the sum of squared residuals.
3. Measures of fit such as $R^2$ and the standard error of the regression (SER) evaluate how well the regression line fits the data.
4. In large samples, the sampling distribution of the estimated slope $\hat{\beta}_1$ is approximately normal, and its variance decreases as the variance of the regressor increases.

Uploaded by

Ali Mohsin
Copyright
© All Rights Reserved

Chapter 4

Linear Regression with One Regressor


Linear Regression with One Regressor
(SW Chapter 4)

Linear regression allows us to make inference about population slope coefficients. How do we best characterize the following relationship?

Income Data 2008 Census

Our objectives in linear regression are, in general, the same as for Chapter 3:

Estimation:
How should we draw a line through the data to estimate the (population) slope? (Answer: ordinary least squares.)
What are the advantages and disadvantages of OLS?
Hypothesis testing:
How do we test for a relationship in the population?
Confidence intervals:
How do we construct a set of numbers that best characterizes that relationship?

Some Notation and Terminology (SW Section 4.1)

The Population Linear Regression Model: general notation

Linearity
We assume that X and Y have a linear relationship.

Two components of Y
Remember, Y is a random variable that is composed of two parts:
1. Systematic component: $E(Y_i \mid X_i) = \beta_0 + \beta_1 X_i$
This is the mean of Y, given X.
2. Random component: $u_i = Y_i - E(Y_i \mid X_i)$
This is the error component.
Where does randomness come from?
a) Omitted variables
b) Measurement error
c) Inherent randomness of behavior

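Adding the two components back together gives the population regression model. The short derivation below uses only the two definitions above; no additional assumption is introduced.

```latex
\begin{align*}
Y_i &= E(Y_i \mid X_i) + u_i \\
    &= \beta_0 + \beta_1 X_i + u_i, \qquad i = 1, \ldots, n, \\[4pt]
E(u_i \mid X_i) &= E(Y_i \mid X_i) - E(Y_i \mid X_i) = 0 .
\end{align*}
```

The last line shows that, when the error is defined this way, it has conditional mean zero by construction.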
The Ordinary Least Squares Estimator
(SW Section 4.2)

The OLS estimator solves: $\min_{b_0, b_1} \sum_{i=1}^{n} [Y_i - (b_0 + b_1 X_i)]^2$
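The minimization problem has a well-known closed-form solution: the slope is the sample covariance of X and Y divided by the sample variance of X, and the intercept makes the line pass through the sample means. The sketch below is a minimal illustration, assuming NumPy is available; the function names `ols_fit` and `ssr` and the data are made up for this example and do not come from the slides.

```python
import numpy as np

def ols_fit(x, y):
    """Return (b0, b1) that minimize sum((y - (b0 + b1 * x)) ** 2)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_bar, y_bar = x.mean(), y.mean()
    # Slope: sum of cross-deviations over sum of squared deviations of x
    b1 = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
    # Intercept: the fitted line passes through (x_bar, y_bar)
    b0 = y_bar - b1 * x_bar
    return b0, b1

def ssr(x, y, b0, b1):
    """Sum of squared residuals for a candidate line b0 + b1 * x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.sum((y - (b0 + b1 * x)) ** 2))

# Made-up illustrative data (not from the slides)
x = np.array([8.0, 10.0, 12.0, 14.0, 16.0, 18.0])
y = np.array([15.0, 22.0, 30.0, 41.0, 55.0, 64.0])

b0_hat, b1_hat = ols_fit(x, y)
print(f"intercept = {b0_hat:.4f}, slope = {b1_hat:.4f}")
print(f"SSR at the OLS line:      {ssr(x, y, b0_hat, b1_hat):.4f}")
print(f"SSR at a perturbed slope: {ssr(x, y, b0_hat, b1_hat + 0.1):.4f}")
```

Moving either coefficient away from the OLS values can only increase the sum of squared residuals, which is what the last two printed lines illustrate.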

Interpretation of the estimated slope
and intercept

Predicted values & residuals:
One individual in the data set is a 47-year-old white male who is married with 2 children.
He received 18 years of education and earns $100,000 in wage and salary income.
Label him as observation 1 and use a 1 subscript.

$\hat{Y}_1 = -42.92417 + 6.08968 \times 18 \approx 66.69025$ thousand dollars

$\hat{u}_1 = Y_1 - \hat{Y}_1 = 100 - 66.69025 = 33.30975$ thousand dollars
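The arithmetic for observation 1 can be checked directly. This is a small sketch using the coefficients as displayed on the slide; the variable names are illustrative, and because the displayed coefficients are rounded, the result matches the slide's figures only to about three decimal places.

```python
# Coefficients as displayed on the slide (income measured in thousands of dollars)
b0_hat = -42.92417  # estimated intercept
b1_hat = 6.08968    # estimated slope: thousands of dollars per year of education

education_years = 18
actual_income = 100.0  # $100,000 expressed in thousands of dollars

# Predicted value: the point on the fitted line at X = 18
predicted_income = b0_hat + b1_hat * education_years
# Residual: actual value minus predicted value
residual = actual_income - predicted_income

print(f"Predicted income: {predicted_income:.3f} thousand dollars")
print(f"Residual:         {residual:.3f} thousand dollars")
```

A positive residual means this individual earns more than the regression line predicts for someone with 18 years of education.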

OLS regression: SAS output

Baseball Application: Team Wins v. Team Salary (2003)

$\widehat{Wins} = 65.13 + 0.20 \times Payroll$

Estimated slope: $\hat{\beta}_1 = 0.20$
Estimated intercept: $\hat{\beta}_0 = 65.13$

Interpretation of parameter estimates

$\widehat{Wins} = 65.13 + 0.20 \times Payroll$
Teams that spend $5M more on total team payroll have, on average, 1 more win.
The intercept (taken literally) means that, according to the estimated line, teams with zero team payroll have a predicted win total of 65.13 wins.
This interpretation of the intercept makes no sense: it extrapolates the line outside the range of the data. Here, the intercept is not economically meaningful.

Predicted values and residuals: Atlanta Braves

$\widehat{Wins} = 65.13 + 0.20 \times Payroll$

The Atlanta Braves spent $103.91M and won 101 games.
Predicted value: $\hat{Y}_i = 85.61$. (With the rounded coefficients shown, 65.13 + 0.20 × 103.91 = 85.91; the reported 85.61 presumably comes from the unrounded estimates.)
Residual: $\hat{u}_i = 101 - 85.61 = 15.39$

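The slope interpretation and the Braves calculation can be reproduced in a few lines. This is a minimal sketch: `predict_wins` is a hypothetical helper, and it uses the rounded coefficients 65.13 and 0.20 with payroll in millions of dollars. With those rounded values the prediction comes out near 85.91 (residual near 15.09) rather than the 85.61 and 15.39 on the slide, a gap that is plausibly due to rounding of the slope.

```python
def predict_wins(payroll_millions, b0=65.13, b1=0.20):
    """Predicted wins from the fitted line, using the rounded slide coefficients."""
    return b0 + b1 * payroll_millions

# Slope interpretation: effect of an extra $5M of payroll on predicted wins
extra_wins_per_5m = predict_wins(5.0) - predict_wins(0.0)

# Atlanta Braves, 2003: payroll of $103.91M and 101 actual wins
braves_payroll = 103.91
braves_wins = 101

predicted = predict_wins(braves_payroll)
residual = braves_wins - predicted

print(f"Extra predicted wins per $5M: {extra_wins_per_5m:.2f}")
print(f"Predicted wins: {predicted:.2f}")
print(f"Residual:       {residual:.2f}")
```

Either way the residual is positive: the Braves won more games than the line predicts for a team with their payroll.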
OLS Regression: SAS Output

Measures of Fit
(Section 4.3)

Another way to calculate $R^2$
The sum of squared residuals (SSR) is the part of the total sum of squares that remains after we remove the part of the variation that is explained by the model.

$R^2 = \frac{ESS}{TSS} = 1 - \frac{SSR}{TSS}$

$SSR = \sum_{i=1}^{n} \hat{u}_i^2$

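The two expressions for $R^2$ agree because, for an OLS fit with an intercept, TSS = ESS + SSR. The following sketch checks this numerically; it assumes NumPy, and the function name `r_squared_two_ways` and the data are made up for illustration.

```python
import numpy as np

def r_squared_two_ways(x, y):
    """Return (ESS/TSS, 1 - SSR/TSS) for an OLS fit of y on x with an intercept."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    b1, b0 = np.polyfit(x, y, 1)           # polyfit returns [slope, intercept]
    y_hat = b0 + b1 * x                     # predicted values
    u_hat = y - y_hat                       # residuals
    tss = np.sum((y - y.mean()) ** 2)       # total sum of squares
    ess = np.sum((y_hat - y.mean()) ** 2)   # explained sum of squares
    ssr = np.sum(u_hat ** 2)                # sum of squared residuals
    return ess / tss, 1.0 - ssr / tss

# Made-up illustrative data (not from the slides)
x = [40.0, 55.0, 60.0, 75.0, 90.0, 110.0, 150.0]
y = [68.0, 83.0, 71.0, 86.0, 77.0, 95.0, 101.0]

r2_from_ess, r2_from_ssr = r_squared_two_ways(x, y)
print(f"ESS/TSS     = {r2_from_ess:.6f}")
print(f"1 - SSR/TSS = {r2_from_ssr:.6f}")
```

The two printed values should coincide up to floating-point error.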
The Standard Error of the
Regression (SER)
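A minimal sketch of the SER, assuming the standard textbook definition $SER = \sqrt{SSR/(n-2)}$, where dividing by n − 2 is a degrees-of-freedom correction for the two estimated coefficients. The function name and the four data points are made up for illustration; the SER is measured in the units of Y.

```python
import numpy as np

def standard_error_of_regression(x, y):
    """SER = sqrt(SSR / (n - 2)) for an OLS fit of y on x with an intercept."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    x_dev = x - x.mean()
    b1 = np.sum(x_dev * (y - y.mean())) / np.sum(x_dev ** 2)
    b0 = y.mean() - b1 * x.mean()
    u_hat = y - (b0 + b1 * x)   # OLS residuals
    ssr = np.sum(u_hat ** 2)    # sum of squared residuals
    return float(np.sqrt(ssr / (n - 2)))

# Made-up data: the OLS line is 0.5 + 0.8 * x, and SSR = 1.8
x = [1.0, 2.0, 3.0, 4.0]
y = [1.0, 3.0, 2.0, 4.0]

ser = standard_error_of_regression(x, y)
print(f"SER = {ser:.4f}")  # sqrt(1.8 / 2) = sqrt(0.9)
```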

Example of the $R^2$ and the SER

OLS Regression: SAS Output

Payroll explains only a small fraction of the variation in team wins. Does this make it unimportant? Can small-market teams compete?

The Least Squares Assumptions

(SW Section 4.4)

The Least Squares Assumptions

LSA #1: $E(u \mid X = x) = 0$.

LSA #2: $(X_i, Y_i)$, $i = 1, \ldots, n$, are i.i.d.

LSA #3: Large outliers are rare.

OLS can be sensitive to an outlier:

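The sensitivity to outliers is easy to see in a small simulation. The sketch below assumes NumPy and uses simulated data with a true slope of 0.5; the single outlier at (30, -40) is an arbitrary choice made for illustration, not a value from the slides.

```python
import numpy as np

def ols_slope(x, y):
    """OLS slope of y on x (regression with an intercept)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_dev = x - x.mean()
    return float(np.sum(x_dev * (y - y.mean())) / np.sum(x_dev ** 2))

# Simulated data: true line is 2 + 0.5 * x, with standard normal errors
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 10.0, size=50)
y = 2.0 + 0.5 * x + rng.normal(0.0, 1.0, size=50)

slope_clean = ols_slope(x, y)

# Append one observation far from the rest of the data
x_with_outlier = np.append(x, 30.0)
y_with_outlier = np.append(y, -40.0)
slope_with_outlier = ols_slope(x_with_outlier, y_with_outlier)

print(f"Slope without the outlier: {slope_clean:.3f}")
print(f"Slope with one outlier:    {slope_with_outlier:.3f}")
```

Because OLS squares the residuals, one extreme point out of 51 is enough to flip the sign of the estimated slope here, which is why LSA #3 requires large outliers to be rare.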
The Sampling Distribution of the
OLS Estimator (SW Section 4.5)

Probability Framework for Linear Regression

The Sampling Distribution of $\hat{\beta}_1$

What is the sampling distribution of $\hat{\beta}_1$?

The sampling distribution of $\hat{\beta}_1$:

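For reference, the standard large-sample result can be written compactly. This is a sketch that assumes the three least squares assumptions hold and that n is large; $\mu_X$ and $\sigma^2_{\hat{\beta}_1}$ are notation introduced here, and the second line adds the further assumption that the error is homoskedastic.

```latex
\begin{align*}
\hat{\beta}_1 &\overset{\text{approx.}}{\sim} N\!\left(\beta_1,\ \sigma^2_{\hat{\beta}_1}\right),
\qquad
\sigma^2_{\hat{\beta}_1} = \frac{1}{n}\,
\frac{\operatorname{var}\!\left[(X_i - \mu_X)\,u_i\right]}{\left[\operatorname{var}(X_i)\right]^2}, \\[6pt]
\sigma^2_{\hat{\beta}_1} &= \frac{\sigma_u^2}{n\,\sigma_X^2}
\qquad \text{if } \operatorname{var}(u_i \mid X_i) = \sigma_u^2 .
\end{align*}
```

In both forms the variance of $\hat{\beta}_1$ shrinks as n grows and as the variance of X grows.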
The larger the variance of X, the smaller the variance of $\hat{\beta}_1$
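A small Monte Carlo experiment illustrates the point. The sketch below assumes NumPy; the function name `simulate_slope_variance`, the true coefficients, the sample size, and the two standard deviations of X are all arbitrary choices made for illustration. It draws many samples, estimates the slope in each, and compares the variance of the estimates when X is tightly clustered with the variance when X is spread out.

```python
import numpy as np

def simulate_slope_variance(x_sd, n=100, reps=2000, beta0=1.0, beta1=2.0,
                            u_sd=1.0, seed=0):
    """Variance of the OLS slope across `reps` simulated samples of size `n`."""
    rng = np.random.default_rng(seed)
    slopes = np.empty(reps)
    for r in range(reps):
        x = rng.normal(0.0, x_sd, size=n)   # regressor with standard deviation x_sd
        u = rng.normal(0.0, u_sd, size=n)   # error term, independent of x
        y = beta0 + beta1 * x + u
        x_dev = x - x.mean()
        slopes[r] = np.sum(x_dev * (y - y.mean())) / np.sum(x_dev ** 2)
    return float(slopes.var(ddof=1))

var_narrow = simulate_slope_variance(x_sd=1.0)  # X tightly clustered
var_wide = simulate_slope_variance(x_sd=3.0)    # X spread out

print(f"Variance of slope estimates, sd(X) = 1: {var_narrow:.5f}")
print(f"Variance of slope estimates, sd(X) = 3: {var_wide:.5f}")
print(f"Ratio: {var_narrow / var_wide:.2f}")
```

Tripling the standard deviation of X multiplies its variance by nine, so the variance of the slope estimates should fall by a factor of roughly nine. Intuitively, a wider spread in X gives the data more leverage for pinning down the slope.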
