Simple Linear Regression

The document provides an overview of simple linear regression, focusing on the relationship between a dependent variable Y and an independent variable X, expressed as Y = β0 + β1X + u. It discusses the estimation of coefficients using Ordinary Least Squares (OLS), the interpretation of parameters, and the importance of goodness of fit through R-squared. Additionally, it covers hypothesis testing for the slope coefficient and the statistical significance of the regression results.

Uploaded by

mhw2xpxpn7

Simple linear regression

Basic idea

I What is regression? Regression is concerned with describing and evaluating the relationship between a given variable Y and one or more other variables X.

Y = f(X)

I Y is called the dependent variable or explained variable; X is called the independent variable, explanatory variable, or regressor.
I Linear regression: the relation is linear (in the coefficients)

Y = β0 + β1 X + u

I Simple linear regression: assume that Y depends on only one X variable
Simple linear regression

Y = β0 + β1 X + u

I Coefficients (parameters): β0 (intercept), β1 (slope)

I Interpretation of β0 and β1.
I We focus on β1: β1 = dY/dX, or dY = β1 dX. When X increases by 1 unit, Y increases by β1 units (holding u fixed).
I Error term u.
I We have data on Y and X , not u. Goal is to use the data to
estimate β0 and β1 .
I Example: Y = height of child, X = height of dad,
u = nutrition, life style.... (what should be the sign of β1 ?)
Linear functional form: too simple?

Y = β0 + β1 X + u

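I Quadratic form: Y = β0 + β1 X² + u
I ln Y = β0 + β1 X + u (interpretation)

$$\frac{d\ln Y}{dX} = \beta_1, \quad \text{but} \quad \frac{d\ln Y}{dX} = \frac{dY/Y}{dX}, \quad \text{so} \quad \frac{dY/Y}{dX} = \beta_1, \quad \text{so} \quad \frac{dY}{Y} = \beta_1\, dX$$

I Interpretation of β1: when X increases by 1 unit, Y increases by approximately 100 × β1 percent (e.g., if β1 = 0.01, Y will increase by about 1%).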
Linear functional form: too simple? (cont.)

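I ln Y = β0 + β1 ln X + u (interpretation)

$$\frac{d\ln Y}{d\ln X} = \beta_1, \quad \text{but} \quad \frac{d\ln Y}{d\ln X} = \frac{dY/Y}{dX/X}, \quad \text{so} \quad \frac{dY/Y}{dX/X} = \beta_1, \quad \text{so} \quad \frac{dY}{Y} = \beta_1 \frac{dX}{X}$$

I Interpretation of β1: when X increases by 1%, Y increases by approximately β1%.
I The model is equivalent to Y = A X^{β1} e^{u}, where β0 = ln A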
Example: NHL Hockey player salaries

I Begin with a simple example of salary determination for NHL hockey players.
I Restrict analysis to forwards (i.e., exclude defensemen and goalies)
I What is the economic return to scoring a goal?
I Consider a simple economic model:

Salary_i = β0 + β1 Goals_i + ε_i
Salary Data
I Data on hockey salaries and past performance from the
NHLPA website.
I Salaries in 2000-2001, with performance covering the
1999-2000 season.
I 418 players who played at least 20 games.
Ln(Salary) versus goals
I It turns out that it’s better to specify the following model:
ln(Salary_i) = β0 + β1 Goals_i + ε_i
I The slope coefficient gives the % increase in salary associated
with an additional goal (approximately).
Plot of regression results
Simple Regression Results

Estimated Effect of Performance on Player Salary
(standard errors in parentheses)

                 Salary ($US)     Ln(Salary)
                 Mean = 1.4M      Mean = 13.78
                 (1)              (2)
Intercept        0.108            −0.712
                 (0.105)          (0.048)
Goals            0.101            0.053
                 (0.007)          (0.003)
R-squared        0.36             0.43
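
The "approximately" in the percentage interpretation of the log-salary slope can be checked numerically. The sketch below takes the Goals coefficient 0.053 from column (2) and compares the approximation 100 × β1 with the exact change 100 × (exp(β1) − 1). The function names are illustrative and not part of the slides.

```python
import math

def pct_effect_approx(beta1: float) -> float:
    """Approximate % change in Y per one-unit increase in X (log-level model)."""
    return 100.0 * beta1

def pct_effect_exact(beta1: float) -> float:
    """Exact % change in Y per one-unit increase in X (log-level model)."""
    return 100.0 * (math.exp(beta1) - 1.0)

beta1 = 0.053  # Goals coefficient from the ln(Salary) column
approx = pct_effect_approx(beta1)
exact = pct_effect_exact(beta1)
print(f"approx: {approx:.2f}%  exact: {exact:.2f}%")
```

For a slope this small the two numbers are close; the gap widens as |β1| grows.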
Linear model

I The following regression function is linear in parameters:

Yi = β0 + β1 Xi + ui

I With assumptions about u, we can work out the conditional expectation:

E [Yi |Xi ] = β0 + β1 Xi + E [ui |Xi ]

I If we assume: E [ui |Xi ] = 0 (error is noise, unpredictable by x)

E [Yi |Xi ] = β0 + β1 Xi
The population regression line
Population regression line, sample data points and the
associated error terms
Estimation of coefficients by Ordinary Least Squares (OLS)

I Consider a random sample {(X1 , Y1 ), . . . , (Xn , Yn )}

I Define an estimator (predictor) of Yi by Ŷi = β̂0 + β̂1 Xi .
I The residual (prediction error) is

ûi = Yi − Ŷi = Yi − β̂0 − β̂1 Xi

I The sum of squared residuals:

$$\sum_{i=1}^{n} \hat{u}_i^2 = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2$$

I The least squares estimator finds β̂0 , β̂1 such that the sum of
squared residuals is minimized
The objective function for a regression

I Define the Residual Sum of Squares (SSR)

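$$\begin{aligned}
\mathrm{SSR} &= \sum_{i=1}^{n} \hat{u}_i^2 \\
&= \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 \\
&= \sum_{i=1}^{n} (Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i)^2
\end{aligned}$$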
Choose estimators to minimize SSR

I Simple calculus yields the first order conditions

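$$\begin{aligned}
\frac{\partial \sum_{i=1}^{n} \hat{u}_i^2}{\partial \hat{\beta}_0} &= -2 \sum_{i=1}^{n} (Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i) = -2 \sum_{i=1}^{n} \hat{u}_i \\
\frac{\partial \sum_{i=1}^{n} \hat{u}_i^2}{\partial \hat{\beta}_1} &= -2 \sum_{i=1}^{n} X_i (Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i) = -2 \sum_{i=1}^{n} X_i \hat{u}_i
\end{aligned}$$

Setting each derivative equal to zero gives the two first order conditions.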
Solving for the intercept

I The first equation implies:

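$$\begin{aligned}
\sum_{i=1}^{n} \hat{u}_i = 0 &\Rightarrow \sum_{i=1}^{n} (Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i) = 0 \\
&\Rightarrow \sum_{i=1}^{n} Y_i = n\hat{\beta}_0 + \hat{\beta}_1 \sum_{i=1}^{n} X_i \\
&\Rightarrow \hat{\beta}_0 = \frac{\sum_{i=1}^{n} Y_i}{n} - \hat{\beta}_1 \frac{\sum_{i=1}^{n} X_i}{n} \\
&\Rightarrow \hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}
\end{aligned}$$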
Solving for the slope

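$$\begin{aligned}
\sum_{i=1}^{n} X_i \hat{u}_i = 0 &\Rightarrow \sum_{i=1}^{n} X_i (Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i) = 0 \\
&\Rightarrow \sum_{i=1}^{n} X_i Y_i = \hat{\beta}_0 \sum_{i=1}^{n} X_i + \hat{\beta}_1 \sum_{i=1}^{n} X_i^2
\end{aligned}$$

I And substituting in the expression for β̂0 = Ȳ − β̂1 X̄

$$\begin{aligned}
\sum_{i=1}^{n} X_i Y_i &= (\bar{Y} - \hat{\beta}_1 \bar{X}) \sum_{i=1}^{n} X_i + \hat{\beta}_1 \sum_{i=1}^{n} X_i^2 \\
&= \bar{Y} \sum_{i=1}^{n} X_i - \hat{\beta}_1 \bar{X} \sum_{i=1}^{n} X_i + \hat{\beta}_1 \sum_{i=1}^{n} X_i^2 \\
&= n\bar{Y}\bar{X} - \hat{\beta}_1 n \bar{X}^2 + \hat{\beta}_1 \sum_{i=1}^{n} X_i^2
\end{aligned}$$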
The OLS formula for the slope

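$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} X_i Y_i - n\bar{X}\bar{Y}}{\sum_{i=1}^{n} X_i^2 - n\bar{X}^2} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2} = \frac{\sum_{i=1}^{n} x_i y_i}{\sum_{i=1}^{n} x_i^2}$$

where xi = Xi − X̄ and yi = Yi − Ȳ (deviations from means). So

$$\sum_{i=1}^{n} x_i y_i = \sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})$$

and similarly

$$\sum_{i=1}^{n} x_i^2 = \sum_{i=1}^{n} (X_i - \bar{X})^2$$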
In summary, the OLS estimators

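$$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}$$

$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} x_i y_i}{\sum_{i=1}^{n} x_i^2} = \frac{\widehat{\mathrm{cov}}(X, Y)}{\widehat{\mathrm{var}}(X)}$$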
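
The two summary formulas translate directly into code. The following is a minimal sketch in plain Python; the function name `ols_fit` and the five-point sample are hypothetical and are not taken from the slides.

```python
def ols_fit(xs, ys):
    """Return (beta0_hat, beta1_hat) using the deviation-from-means formulas."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Numerator: sum of x_i * y_i in deviation form
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    # Denominator: sum of x_i^2 in deviation form
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("X has no variation, so the slope is not identified")
    beta1_hat = sxy / sxx
    beta0_hat = y_bar - beta1_hat * x_bar
    return beta0_hat, beta1_hat

# Hypothetical toy sample (not from the slides)
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.0, 4.0, 5.0, 4.0, 5.0]
b0, b1 = ols_fit(X, Y)
print(b0, b1)
```

For this sample the deviation sums are Σxᵢyᵢ = 6 and Σxᵢ² = 10, so the slope is 0.6 and the intercept is 4 − 0.6 × 3 = 2.2.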
Properties of OLS estimators

I Three important properties:

I The residuals sum to 0: $\sum_{i=1}^{n} \hat{u}_i = 0$ (from FOC 1)
I Xi and ûi are orthogonal: $\sum_{i=1}^{n} X_i \hat{u}_i = 0$ (from FOC 2)
I The fitted regression line passes through the means (X̄, Ȳ):

Ȳ = β̂0 + β̂1 X̄
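
These three properties are easy to confirm numerically on any sample. The sketch below uses a hypothetical five-point data set (not from the slides) and checks each property up to floating-point rounding.

```python
# Hypothetical toy sample (not from the slides)
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(X)
x_bar = sum(X) / n
y_bar = sum(Y) / n

# OLS estimates from the deviation-from-means formulas
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))
sxx = sum((x - x_bar) ** 2 for x in X)
beta1_hat = sxy / sxx
beta0_hat = y_bar - beta1_hat * x_bar

# Residuals u_hat_i = Y_i - beta0_hat - beta1_hat * X_i
residuals = [y - (beta0_hat + beta1_hat * x) for x, y in zip(X, Y)]

sum_resid = sum(residuals)                               # property 1: should be ~0
sum_x_resid = sum(x * u for x, u in zip(X, residuals))   # property 2: should be ~0
line_at_mean = beta0_hat + beta1_hat * x_bar             # property 3: should equal y_bar

print(sum_resid, sum_x_resid, line_at_mean, y_bar)
```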
OLS Figure
Deviations from mean

I Yi can be expressed in terms of Ŷi and ûi :

Yi = Ŷi + ûi
= β̂0 + β̂1 Xi + ûi

I Yi − Ȳ = β̂0 + β̂1 Xi + ûi − (β̂0 + β̂1 X̄ ) = β̂1 (Xi − X̄ ) + ûi

I Let yi = Yi − Ȳ and xi = Xi − X̄

yi = β̂1 xi + ûi
Goodness of fit and R²

I How well does the OLS regression line fit the data?
I The following concepts will help us quantify the fit:
I Total Sum of Squares: $\mathrm{SST} = \sum_{i=1}^{n} (Y_i - \bar{Y})^2$
I Residual (unexplained) Sum of Squares: $\mathrm{SSR} = \sum_{i=1}^{n} \hat{u}_i^2$
I Explained Sum of Squares: $\mathrm{SSE} = \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2$

I It holds that SST = SSE + SSR

Goodness of fit and R² (cont.)

I R² quantifies the fit of the OLS regression line to the data.

I R² is the proportion of total variation that is explained by the regression.

$$R^2 \equiv \frac{\mathrm{SSE}}{\mathrm{SST}} = 1 - \frac{\mathrm{SSR}}{\mathrm{SST}}$$
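
The decomposition SST = SSE + SSR and both expressions for R² can be verified on a small sample. This is a sketch under the assumption of a hypothetical five-point data set; the function name `sums_of_squares` is illustrative only.

```python
def sums_of_squares(xs, ys):
    """Fit OLS and return (SST, SSE, SSR) as defined in the slides."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    beta1_hat = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
                 / sum((x - x_bar) ** 2 for x in xs))
    beta0_hat = y_bar - beta1_hat * x_bar
    fitted = [beta0_hat + beta1_hat * x for x in xs]
    sst = sum((y - y_bar) ** 2 for y in ys)                 # total
    sse = sum((f - y_bar) ** 2 for f in fitted)             # explained
    ssr = sum((y - f) ** 2 for y, f in zip(ys, fitted))     # residual
    return sst, sse, ssr

# Hypothetical toy sample (not from the slides)
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.0, 4.0, 5.0, 4.0, 5.0]

sst, sse, ssr = sums_of_squares(X, Y)
r2_explained = sse / sst
r2_residual = 1.0 - ssr / sst
print(sst, sse, ssr, r2_explained, r2_residual)
```

Here SST = 6, SSE = 3.6 and SSR = 2.4, so both formulas give R² = 0.6.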
Sampling properties of the OLS estimator

I The OLS estimator is unbiased (given a random sample and E[ui|Xi] = 0):

E[β̂j | X] = βj, for j = 0, 1
Sampling properties of the OLS estimator (cont.)

I Two more assumptions for the variance

I Pairwise uncorrelated errors (conditional on X):

cov(ui, uj | X) = 0 for i ≠ j

this is guaranteed if we have a random sample

I Conditional homoskedasticity:

var(ui | X) = σ²
Conditional Homoskedasticity
Heteroskedasticity
Sampling properties of the OLS estimator (cont.)

I The variance of β̂1 is given by:

$$\mathrm{var}(\hat{\beta}_1 \mid X) = \frac{\sigma^2}{\sum_{i=1}^{n} x_i^2} = \frac{\sigma^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}$$
The sampling distribution of the OLS estimator
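I Assume that u is distributed Normally

ui ∼ N(0, σ²)

I This implies that

Yi = β0 + β1 Xi + ui ∼ N(β0 + β1 Xi, σ²)

I We have the result that (conditional on X)

$$\hat{\beta}_1 \sim N\left(\beta_1, \frac{\sigma^2}{\sum_{i=1}^{n} x_i^2}\right)$$

I And, as usual, we can standardize the variable:

$$z = \frac{\hat{\beta}_1 - \beta_1}{\sigma / \sqrt{\sum_{i=1}^{n} x_i^2}} \sim N(0, 1)$$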
The t-statistic

I Of course, we do not know the true variance σ², so we must employ an estimator

$$s^2 = \frac{\sum_{i=1}^{n} \hat{u}_i^2}{n - 2}$$

I And the new “z” is actually a “t-statistic”

$$t = \frac{\hat{\beta}_1 - \beta_1}{s / \sqrt{\sum_{i=1}^{n} x_i^2}} \sim t_{n-2}$$

I The term in the denominator is known as the standard error of β̂1: $\mathrm{se}(\hat{\beta}_1) = s / \sqrt{\sum_{i=1}^{n} x_i^2}$
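
The estimator s², the standard error of the slope, and the t-statistic can be computed in a few lines. The sketch below is illustrative: `slope_inference` is a hypothetical helper, the data are a made-up five-point sample, and the t-statistic is formed for a hypothesized slope value passed in by the caller.

```python
import math

def slope_inference(xs, ys, beta1_null=0.0):
    """Return (beta1_hat, s2, se_beta1, t_stat) for a simple OLS regression."""
    n = len(xs)
    if n <= 2:
        raise ValueError("need n > 2 so that n - 2 degrees of freedom is positive")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    beta1_hat = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    beta0_hat = y_bar - beta1_hat * x_bar
    ssr = sum((y - beta0_hat - beta1_hat * x) ** 2 for x, y in zip(xs, ys))
    s2 = ssr / (n - 2)                   # estimator of sigma^2
    se_beta1 = math.sqrt(s2 / sxx)       # s / sqrt(sum of x_i^2)
    t_stat = (beta1_hat - beta1_null) / se_beta1
    return beta1_hat, s2, se_beta1, t_stat

# Hypothetical toy sample (not from the slides)
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.0, 4.0, 5.0, 4.0, 5.0]
beta1_hat, s2, se_beta1, t_stat = slope_inference(X, Y)
print(beta1_hat, s2, se_beta1, t_stat)
```

With SSR = 2.4 and n = 5, s² = 0.8 and se(β̂1) = sqrt(0.8 / 10), so under these assumptions the t-statistic for a zero slope is about 2.12 with 3 degrees of freedom.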
Large sample properties of β̂1 (when n is large)

I OLS estimator is consistent:

plim(β̂1 ) = β1

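I If we do not assume Normal errors, then by the Central Limit Theorem the large-sample distribution of β̂1 is approximately normal:

$$\hat{\beta}_1 \overset{a}{\sim} N\left(\beta_1, \frac{\sigma^2}{\sum_{i=1}^{n} x_i^2}\right)$$

I Or in terms of our typical t-statistic:

$$t = \frac{\hat{\beta}_1 - \beta_1}{s / \sqrt{\sum_{i=1}^{n} x_i^2}} \overset{a}{\sim} N(0, 1)$$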
What affects precision of the slope estimate?

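$$\mathrm{se}(\hat{\beta}_1) = \frac{s}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2}}$$

I Sample size n: a larger n adds terms to the sum in the denominator, which lowers the standard error
I Variance of the independent variable: more spread in X lowers the standard error
I Standard error of the regression, s: noisier errors raise the standard error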
Difference Between Two Graphs?

In which of the two graphs will β̂1 be a more precise estimate of β1?

Testing the Slope

Yi = β0 + β1 Xi + ui

Set of Statistical Hypotheses

        Two-sided        One-sided        One-sided
H0:     β1 = β10         β1 = β10         β1 = β10
H1:     β1 ≠ β10         β1 > β10         β1 < β10

I β10 is the hypothesized value of β1; it can be any number and does not have to be 0.

I Test statistic: $t = \dfrac{\hat{\beta}_1 - \beta_{10}}{\mathrm{se}(\hat{\beta}_1)}$
I t ∼ t distribution with n − 2 degrees of freedom, or t ∼ N(0, 1) if n is large.
Statistical Significance

Most common hypothesis test: test of statistical significance for the slope coefficient:

H0: β1 = 0
H1: β1 ≠ 0
(or H1: β1 > 0, or H1: β1 < 0)

t-statistic:

$$t = \frac{\hat{\beta}_1}{\mathrm{se}(\hat{\beta}_1)}$$

Can use rejection region approach or p-value approach.

Rejection region approach

I Step 1: compute the t-statistic

I Step 2: determine the sampling distribution of the statistic
under H0
I When n is small (say n < 25), t ∼ t_{n−2}
I When n is large, can use either the t distribution or the normal distribution: i.e., t ∼ t_{n−2} or t ∼ N(0, 1)
I Step 3: pick an α, say α = 5% or 1% (what’s the implication
for different values of α?)
I Step 4: according to the sampling distribution chosen in Step
2, find the critical value associated with α. This can be done
by looking up the statistical tables. The value also depends on
the nature of H1 (two-sided or one-sided).
Rejection region approach (cont.)

I Step 5: use the critical value to partition the sample space of the statistic into an acceptance region and a rejection region. This also depends on the nature of H1 (two-sided or one-sided).
I Step 6: conclusion. If the value of the statistic computed in Step 1 falls in the rejection region, reject the null, meaning β1 is statistically significant at level α; if the statistic falls in the acceptance region, do not reject the null, meaning β1 is not statistically significant at level α.
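
The six steps can be sketched for the NHL log-salary regression, where the slides report a Goals coefficient of 0.053 with standard error 0.003 and n = 418. Because n is large, the sketch assumes the N(0, 1) approximation and obtains the critical value from Python's `statistics.NormalDist`; with a small sample the critical value would instead come from a t table with n − 2 degrees of freedom. The function name is illustrative.

```python
from statistics import NormalDist

def two_sided_rejection_test(beta1_hat, se, beta1_null, alpha):
    """Return (t_stat, critical_value, reject) using the N(0, 1) approximation."""
    t_stat = (beta1_hat - beta1_null) / se                 # Step 1
    critical_value = NormalDist().inv_cdf(1 - alpha / 2)   # Steps 2 to 4
    reject = abs(t_stat) > critical_value                  # Steps 5 and 6
    return t_stat, critical_value, reject

# Values from column (2) of the NHL results table; H0: beta1 = 0, alpha = 5%
t_stat, critical_value, reject = two_sided_rejection_test(0.053, 0.003, 0.0, 0.05)
print(f"t = {t_stat:.2f}, critical value = {critical_value:.3f}, reject H0: {reject}")
```

The t-statistic of roughly 17.7 lies far outside the acceptance region (−1.96, 1.96), so the null of a zero slope is rejected at the 5% level.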
p-value approach

I Recall that the p-value is the tail probability of the test statistic value given that the null hypothesis is true
I The idea of the p-value approach is to compare the p-value of the statistic with α.
p-value approach (cont.)

I Step 1: pick an α
I Step 2: determine the sampling distribution of the statistic (t distribution or normal distribution)
I Step 3: compute the t-statistic, then determine its p-value based on the sampling distribution chosen in Step 2. The p-value is the probability, under H0, of obtaining a result at least as extreme as the one observed. This can be done with the help of the statistical tables. Note that the meaning of "extreme" depends on the nature of H1 (one-sided or two-sided).
I Step 4: compare the p-value obtained in Step 3 with α. If p-value < α, reject the null, meaning β1 is statistically significant at level α.
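
The p-value steps can be sketched as follows, assuming the large-sample N(0, 1) distribution for the statistic. The helper `p_value` and the t-statistic of 2.1 are hypothetical; for a small sample the tail probability would have to come from the t distribution with n − 2 degrees of freedom instead.

```python
from statistics import NormalDist

def p_value(t_stat, alternative="two-sided"):
    """Tail probability of t_stat under H0, using the N(0, 1) approximation."""
    z = NormalDist()
    if alternative == "two-sided":      # H1: beta1 != beta10
        return 2.0 * (1.0 - z.cdf(abs(t_stat)))
    if alternative == "greater":        # H1: beta1 > beta10
        return 1.0 - z.cdf(t_stat)
    if alternative == "less":           # H1: beta1 < beta10
        return z.cdf(t_stat)
    raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")

alpha = 0.05
t_stat = 2.1  # hypothetical value of the statistic
p_two_sided = p_value(t_stat, "two-sided")
p_upper = p_value(t_stat, "greater")
print(p_two_sided, p_upper, p_two_sided < alpha)
```

The one-sided p-value is half the two-sided one here, which illustrates why the form of H1 has to be fixed before the comparison with α.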
Statistical vs Economic Significance

I Statistically significant: the degree of correlation between the explanatory and dependent variables is unlikely to be observed due to mere chance. The statistical significance of a variable is entirely determined by the size of the t-statistic. Statistical significance improves as sample size increases. Why?
I Statistical significance does not automatically imply that you have found something important.
I Economically significant (or practically significant): an effect large enough in magnitude that decision makers would consider it important. Economic importance is related to the magnitude and sign of β̂1 and indicates whether the explanatory variable has a meaningful and plausible influence on the dependent variable.
Statistical vs Economic Significance: an example

Consider a hypothetical regression of Y = size of the workforce of a city (millions of people) on X = the city’s annual government spending (hundreds of millions of yuan)

Y = β0 + β1 X + u

We have found that β̂1 = 0.00001 and se(β̂1) = 0.000001

I Is β̂1 statistically significant?
I Consider an average workforce of 2 million people. An increase in government spending of 100 million yuan is associated with an average increase in the workforce of 10 people, or only 0.0005%! Is this result economically significant?
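
The arithmetic behind this example is worth making explicit. The short sketch below reproduces the t-statistic and the implied change in the workforce from the numbers given above; the variable names are illustrative.

```python
beta1_hat = 0.00001        # millions of people per 100 million yuan
se_beta1 = 0.000001
avg_workforce = 2_000_000  # people

# Statistical significance: t-statistic for H0: beta1 = 0
t_stat = beta1_hat / se_beta1

# Economic significance: effect of one extra unit of X (100 million yuan)
extra_people = beta1_hat * 1_000_000             # convert millions of people to people
pct_of_workforce = 100 * extra_people / avg_workforce

print(f"t = {t_stat:.1f}, extra people = {extra_people:.0f}, "
      f"share of workforce = {pct_of_workforce:.4f}%")
```

A t-statistic of about 10 is highly significant in the statistical sense, yet the implied effect is only about 10 people, or 0.0005% of the workforce.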
Confidence Interval Estimator of β1

I A point estimator is almost never equal to the true value; since we don’t know the true value, we never know by how much we are wrong
I As a rule, we would like to have more confidence about our estimate of the parameter than the point estimate gives us.
I Instead of a point estimate, we turn to an interval estimator: a range of values around the point estimate
I Our goal is to find a range of values around the point estimate in such a way that there is a large probability that this range would include the true population parameter
I The probability in this case is called a “confidence level”, and the range is called a “confidence interval”
Confidence Interval Estimator of β1 (cont.)

I Denote the confidence level by 1 − α

I Objective is to find an interval (lower,upper) based on β̂1 ,
such that P(lower < β1 < upper ) = 1 − α
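I Since the sampling distribution of the t-statistic is

$$t = \frac{\hat{\beta}_1 - \beta_1}{\mathrm{se}(\hat{\beta}_1)} \sim t_{n-2}$$

$$\Rightarrow P\left(-t_{\alpha/2,\,n-2} < \frac{\hat{\beta}_1 - \beta_1}{\mathrm{se}(\hat{\beta}_1)} < t_{\alpha/2,\,n-2}\right) = 1 - \alpha$$

$$\Rightarrow P\left(\hat{\beta}_1 - t_{\alpha/2,\,n-2}\,\mathrm{se}(\hat{\beta}_1) < \beta_1 < \hat{\beta}_1 + t_{\alpha/2,\,n-2}\,\mathrm{se}(\hat{\beta}_1)\right) = 1 - \alpha$$

⇒ The 100 × (1 − α)% CI is therefore

$$\left(\hat{\beta}_1 - t_{\alpha/2,\,n-2}\,\mathrm{se}(\hat{\beta}_1),\ \hat{\beta}_1 + t_{\alpha/2,\,n-2}\,\mathrm{se}(\hat{\beta}_1)\right)$$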
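
As an illustration, the interval can be computed for the Goals coefficient in the NHL log-salary regression (0.053 with standard error 0.003, n = 418). The sketch assumes the large-sample normal approximation, so the critical value is taken from `statistics.NormalDist` rather than from a t table; the function name is hypothetical.

```python
from statistics import NormalDist

def confidence_interval(beta1_hat, se, critical_value):
    """Return (lower, upper) = beta1_hat -/+ critical_value * se."""
    half_width = critical_value * se
    return beta1_hat - half_width, beta1_hat + half_width

alpha = 0.05
# Large-n assumption: use the N(0, 1) quantile in place of t_{alpha/2, n-2}
critical_value = NormalDist().inv_cdf(1 - alpha / 2)

lower, upper = confidence_interval(0.053, 0.003, critical_value)
print(f"95% CI for beta1: ({lower:.4f}, {upper:.4f})")
```

Under these assumptions the 95% interval is roughly (0.047, 0.059), which excludes zero, in line with the significance test on the same coefficient.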
Point Prediction of Y
I Point Prediction: we can use our estimated model to predict
value of Y for any given value of X.
I For instance, let’s take an estimated model of Y and X:

Ŷ = 23.6 + 4.1 ∗ X

I What is the predicted value of Y for X = 12? Easy:

Ŷ = 23.6 + 4.1 ∗ 12 = 72.8

I But we know predictions are never exactly accurate because of the error term u
I The true model is:

Yi = β0 + β1 Xi + ui

and X is not the only factor that affects Y.

Prediction interval of Y

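I Prediction interval: for a given X = x0, an interval that contains Y0 with 1 − α confidence.

$$\hat{Y}_0 \pm t_{\alpha/2,\,n-2}\; s \sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}}$$

where s is the standard error of the regression (the estimate of the standard deviation of ui)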

I What happens to the width of the prediction interval if sample
size goes to infinity?
I What happens as x0 moves further from X̄ ?
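
The second question can be explored numerically. The sketch below implements the prediction interval formula for a hypothetical five-point sample; the function name is illustrative, and the critical value 3.182 is the standard table value of t for α/2 = 0.025 with 3 degrees of freedom, passed in by hand because the Python standard library has no t distribution.

```python
import math

def prediction_interval(xs, ys, x0, t_crit):
    """Return (lower, upper) of the prediction interval for Y at X = x0."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    beta1_hat = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    beta0_hat = y_bar - beta1_hat * x_bar
    ssr = sum((y - beta0_hat - beta1_hat * x) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(ssr / (n - 2))        # standard error of the regression
    y0_hat = beta0_hat + beta1_hat * x0
    half_width = t_crit * s * math.sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / sxx)
    return y0_hat - half_width, y0_hat + half_width

# Hypothetical toy sample (not from the slides); n = 5, so n - 2 = 3
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.0, 4.0, 5.0, 4.0, 5.0]
T_CRIT = 3.182  # t table value for alpha/2 = 0.025 and 3 degrees of freedom

lo_mean, hi_mean = prediction_interval(X, Y, 3.0, T_CRIT)  # x0 at the sample mean
lo_far, hi_far = prediction_interval(X, Y, 6.0, T_CRIT)    # x0 outside the sample
print(hi_mean - lo_mean, hi_far - lo_far)
```

The interval is narrowest at x0 = X̄ and widens as x0 moves away, because the (x0 − X̄)² term grows. As n goes to infinity the 1/n and (x0 − X̄)² terms vanish, but the leading 1 under the square root does not, so the width does not shrink to zero.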
Caution in Interpreting Regression Results

A common mistake when using regression analysis is to state that x causes y to happen.
I A model with a high R 2 does not automatically mean that a
change in x causes y to change in a very predictable way. It
could be just the opposite, that y causes x to change. A high
correlation goes both ways.
I It could also be that both y and x are changing in response to
a third variable that we don’t know about.
I When forming a simple regression based on a population
economic model, one has to assume that x causes y and be
able to justify this assumption on economic grounds.
Example: Inflation and Money supply
What is the relationship between the inflation rate (Infl) and money supply (M2)? Consider the regression

yt = β0 + β1 xt + ut

where:
yt = ln(Infl_t) − ln(Infl_{t−1})
xt = ln(M2_t) − ln(M2_{t−1})

Results
(standard errors in parentheses)

intercept    0.0065
             (0.0010)
x            −0.5525
             (0.0519)
R²           0.5625
Application of simple linear regression: hedging

I A hedge is an investment position intended to offset the risk of potential losses/gains that may be incurred by an investment
I To hedge a risk is to hold assets with opposite payout characteristics in response to a change in the terminal spot price
I Example: hedging a stock price (index) using futures
Hedging application (cont.)

I Suppose I own a stock and would like to sell it in the future. At the moment, how can I limit/minimize the risk of variation in the stock price?
I One way is to enter into a short futures contract. That is, I hedge a long position in the stock using a short position in futures contracts.
Hedging application (cont.)

I The hedge ratio is the ratio of the futures position to the underlying asset position
I A perfect hedge requires that futures and underlying spot asset price changes are perfectly correlated (a perfect hedge does NOT always exist).
I For an imperfect hedge, set the hedge ratio to minimize the variance of the hedging error
Hedging application (cont.)
I Let St represent the spot stock price and Ft represent the futures price.
I Let ∆St represent the change in St and ∆Ft represent the change in Ft
I Suppose the hedge ratio is β; then the hedging error is

et = ∆St − β∆Ft

I The optimal hedging ratio β∗ is the one that minimizes the variance of et. We have

$$\beta^* = \frac{\mathrm{cov}(\Delta S_t, \Delta F_t)}{\mathrm{var}(\Delta F_t)} \quad \text{(why?)}$$

I So β∗ may be obtained from the estimated slope coefficient β̂1 in the regression

∆St = β0 + β1 ∆Ft + ut. (why?)
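
Both "why?" questions can be probed with a small simulation. The sketch below generates hypothetical spot and futures price changes (the 0.9 loading, the noise levels, and the seed are arbitrary assumptions, not data from the slides), computes the sample covariance over variance ratio, which is also the OLS slope, and checks that moving the hedge ratio away from it raises the variance of the hedging error.

```python
import random

def min_variance_hedge_ratio(d_spot, d_fut):
    """Sample cov(dS, dF) / var(dF), identical to the OLS slope of dS on dF."""
    n = len(d_spot)
    s_bar = sum(d_spot) / n
    f_bar = sum(d_fut) / n
    cov_sf = sum((s - s_bar) * (f - f_bar) for s, f in zip(d_spot, d_fut)) / (n - 1)
    var_f = sum((f - f_bar) ** 2 for f in d_fut) / (n - 1)
    return cov_sf / var_f

def hedging_error_variance(d_spot, d_fut, beta):
    """Sample variance of e_t = dS_t - beta * dF_t."""
    errors = [s - beta * f for s, f in zip(d_spot, d_fut)]
    e_bar = sum(errors) / len(errors)
    return sum((e - e_bar) ** 2 for e in errors) / (len(errors) - 1)

# Hypothetical simulated price changes (assumed data-generating process)
random.seed(0)
d_fut = [random.gauss(0.0, 1.0) for _ in range(500)]
d_spot = [0.9 * f + random.gauss(0.0, 0.3) for f in d_fut]

beta_star = min_variance_hedge_ratio(d_spot, d_fut)
var_at_star = hedging_error_variance(d_spot, d_fut, beta_star)
var_below = hedging_error_variance(d_spot, d_fut, beta_star - 0.1)
var_above = hedging_error_variance(d_spot, d_fut, beta_star + 0.1)
print(beta_star, var_at_star, var_below, var_above)
```

The variance of et is a quadratic function of β, var(∆St) − 2β cov(∆St, ∆Ft) + β² var(∆Ft), and setting its derivative to zero gives β∗ = cov/var. That is the same ratio as the OLS slope formula, which is why the regression recovers the hedge ratio.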
