Econometrics Inference: OLS & Hypothesis Testing


ECON2280 Introductory Econometrics

Topic 7 Inference

Yiming Cao

The University of Hong Kong

March 17, 2025

ECON2280: Introductory Econometrics Topic 7: Inference Mar 17, 2025 1 / 87


Lecture Plan

Sampling Distribution of OLS Estimators

Hypothesis Testing of a Single Parameter

Confidence Intervals

Hypothesis Testing of Linear Combination of Parameters

Hypothesis Testing of Multiple Linear Restrictions



Sampling Distribution Motivation

Motivation

Linear Regression Model:

y = β0 + β1 x1 + β2 x2 + · · · + βk xk + u

Estimation:
ŷ = β̂0 + β̂1 x1 + β̂2 x2 + · · · + β̂k xk
What can we learn about the true value of βj from the sample estimate β̂j ?
Can we determine the true value of βj ?
Can we tell which values of βj are more likely than others?
What determines this “likelihood”?


Inferring Population Parameters from Sample Estimates

Hypothesis testing: Determining whether a particular value of βj is likely to be the true value of βj .

Confidence intervals: Constructing a range of values that is likely to contain the true value of βj .


Intuition of Hypothesis Testing

Suppose we have a random variable x that follows a distribution f (x; θ), where θ is
a parameter of the distribution.
We observe a random sample of x and estimate a θ̂, and we have a hypothesis
about the true value of θ.
We know that under any value θ, θ̂ is a random variable and has its own
distribution.
We can calculate the probability of observing the value θ̂ under the hypothesized
value of θ.
If the probability of observing the value θ̂ is very low under a particular value of θ,
we may “reject” that value of θ.
That is, it is unlikely that we would observe this value of θ̂ if the hypothesized value of θ were true.

[Figure: Intuition of Hypothesis Testing]


Intuition of Confidence Intervals

We estimate a coefficient β̂j from a sample.

We know that β̂j is a random variable, and is not guaranteed to be equal to the
true value of βj .

However, we can construct a range of values that are likely to contain the true
value of βj with a certain probability.

This range is called a confidence interval.


What Do We Need to Know?

Both inference methods rely on the sampling distribution of β̂j

Under Assumptions MLR.1–MLR.4:

$$E(\hat{\beta}_j) = \beta_j$$

Under Assumptions MLR.1–MLR.5:

$$Var(\hat{\beta}_j) = \frac{\sigma^2}{TSS_j (1 - R_j^2)}$$

Are these sufficient?



Sampling Distribution Normality

Sampling Distribution of β̂1

Recall that in the simple linear regression model, we have:

$$\hat{\beta}_1 = \beta_1 + \frac{1}{TSS_x} \sum_{i=1}^{n} d_i u_i$$

where di = xi − x̄

When the values of x are given (i.e., treated as non-random), β̂1 is a linear
combination of the errors in the sample {ui : i = 1, 2, · · · , n}

The distribution of β̂1 is determined by the distribution of each ui .
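
The decomposition above can be checked numerically. The following is a minimal sketch on simulated data; the population values (beta0 = 1, beta1 = 2), the sample size, and the N(0, 1) errors are illustrative assumptions, not values from the lecture.

```python
import random

random.seed(0)

# Hypothetical population parameters and sample size (illustrative only).
beta0, beta1, n = 1.0, 2.0, 50

x = [random.uniform(0, 10) for _ in range(n)]
u = [random.gauss(0, 1) for _ in range(n)]           # errors, drawn here as N(0, 1)
y = [beta0 + beta1 * xi + ui for xi, ui in zip(x, u)]

x_bar = sum(x) / n
y_bar = sum(y) / n
d = [xi - x_bar for xi in x]                         # d_i = x_i - x_bar
tss_x = sum(di ** 2 for di in d)                     # TSS_x = sum of d_i^2

# OLS slope computed the usual way.
beta1_hat = sum(di * (yi - y_bar) for di, yi in zip(d, y)) / tss_x

# The same slope written as beta1 plus a linear combination of the errors.
beta1_decomp = beta1 + sum(di * ui for di, ui in zip(d, u)) / tss_x

print(beta1_hat, beta1_decomp)
```

The two printed numbers agree up to floating-point error, which is why the distribution of the errors pins down the distribution of the estimator once the x values are held fixed.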


Sampling Distribution of β̂j


Similarly, in the multiple linear regression model:

$$\hat{\beta}_j = \beta_j + \frac{1}{RSS_j} \sum_{i=1}^{n} \hat{\gamma}_{ij} u_i$$

(Remember why?)

When the values of (x1 , · · · , xk ) are given (i.e., treated as non-random), β̂j is a
linear combination of the errors in the sample {ui : i = 1, 2, · · · , n}

The distribution of β̂j is determined by the distribution of ui .

What do we know about the distribution of ui ?


Assumption MLR.4: E (ui ) = 0
Assumption MLR.5: Var(ui ) = σ 2

Assumption MLR.6: Normality


The population error u is independent of the explanatory variables (x1 , · · · , xk ) and
is normally distributed with zero mean and variance σ 2 :

u ∼ N(0, σ 2 )

This assumption is much stronger than and implies Assumptions MLR.4 and
MLR.5. (Why?)
Assumptions MLR.1–MLR.6 together are called the classical linear model
(CLM) assumptions, and models that satisfy these assumptions are called
classical linear regression models (CLM).
The CLM assumptions can be summarized as:

y |x ∼ N(β0 + β1 x1 + β2 x2 + · · · + βk xk , σ 2 ) i.i.d

[Figure: Assumption MLR.6: Normality]


Is This Assumption Reasonable?


One common justification: u is the sum of many different unobserved factors
affecting y . The distribution of u is approximately normal by the central limit
theorem (CLT).
This is a weak argument, as the normal approximation may be poor.

The normality assumption can be violated in many ways:


Models with non-negative y are never normal (e.g., wage, price)
Models with discrete y are never normal (e.g., years of schooling)

Practically, it’s good enough if it’s “close to” normal.


Sometimes taking the log of y can help (e.g., log(price))

With a large enough sample size, the sampling distribution of the OLS estimators β̂j is approximately normal by the CLT, even if u itself is not normal (Textbook Ch5).

Sampling Distribution of β̂j


Recall that β̂j is a linear combination of the errors {ui : i = 1, 2, · · · , n}.
Under Assumption MLR.6, the errors are independent, identically distributed (i.i.d.)
N(0,σ 2 ) random variables.
A linear combination of i.i.d. normal random variables is also normally distributed.
(See Textbook Math Refresher B)
Therefore, β̂j is also normally distributed. What are its mean and variance?

β̂j ∼ N(βj , Var(β̂j ))

Standardization:

$$\frac{\hat{\beta}_j - \beta_j}{sd(\hat{\beta}_j)} \sim N(0, 1)$$


Sampling Distribution of β̂j

We still don’t know what sd(β̂j ) is! (Why don’t we?)

But we have introduced a good estimate of it:

$$se(\hat{\beta}_j) = \sqrt{\widehat{Var}(\hat{\beta}_j)} = \frac{\hat{\sigma}}{\sqrt{TSS_j (1 - R_j^2)}}$$

So the standardized estimator is:

$$\frac{\hat{\beta}_j - \beta_j}{se(\hat{\beta}_j)}$$

But what is the distribution? Is it still normal?


Refresher on Distributions
Let Zi , i = 1, · · · , n, be independent random variables, each distributed as standard
normal. Then the random variable

$$X = \sum_{i=1}^{n} Z_i^2$$

follows a chi-square distribution with n degrees of freedom: $X \sim \chi^2_n$


If a random variable Z follows a standard normal distribution, X follows a
chi-square distribution with n degrees of freedom, and Z and X are independent, then the random variable

$$T = \frac{Z}{\sqrt{X / n}}$$

follows a t-distribution with n degrees of freedom: $T \sim t_n$
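
Both constructions can be illustrated by simulation. The sketch below builds chi-square and t draws directly from standard normal draws; the choice of 5 degrees of freedom, the number of replications, and the cutoff 1.96 are illustrative assumptions.

```python
import math
import random

random.seed(1)

def draw_chi2(df):
    """Sum of df squared independent standard normal draws."""
    return sum(random.gauss(0, 1) ** 2 for _ in range(df))

def draw_t(df):
    """Z / sqrt(X / df), with Z standard normal and X chi-square(df), drawn independently."""
    z = random.gauss(0, 1)
    return z / math.sqrt(draw_chi2(df) / df)

df, reps = 5, 20000
chi2_mean = sum(draw_chi2(df) for _ in range(reps)) / reps
t_tail = sum(abs(draw_t(df)) > 1.96 for _ in range(reps)) / reps
z_tail = sum(abs(random.gauss(0, 1)) > 1.96 for _ in range(reps)) / reps

print(f"mean of chi-square({df}) draws: {chi2_mean:.2f}")
print(f"share of |T| > 1.96: {t_tail:.3f}")
print(f"share of |Z| > 1.96: {z_tail:.3f}")
```

The simulated chi-square mean should be close to its degrees of freedom, and the share of t draws beyond 1.96 should be visibly larger than the share of standard normal draws, reflecting the heavier tails of the t-distribution at small degrees of freedom.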


[Figure: Chi-Square Distribution]

[Figure: t-distribution]


t-distribution and Normal Distribution

The t-distribution is similar to the standard normal distribution, but with heavier
tails.

When the degrees of freedom are large (usually df > 120), the t-distribution is very
close to the standard normal distribution.


Sampling Distribution of β̂j


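We can rewrite the standardized estimator as:

$$\frac{\hat{\beta}_j - \beta_j}{se(\hat{\beta}_j)} = \frac{(\hat{\beta}_j - \beta_j)/sd(\hat{\beta}_j)}{se(\hat{\beta}_j)/sd(\hat{\beta}_j)} = \frac{(\hat{\beta}_j - \beta_j)/sd(\hat{\beta}_j)}{\sqrt{\hat{\sigma}^2/\sigma^2}}$$

Numerator: $(\hat{\beta}_j - \beta_j)/sd(\hat{\beta}_j) \sim N(0, 1)$

Denominator: $\hat{\sigma}^2 = \sum_{i=1}^{n} \hat{u}_i^2 / (n - k - 1)$, where each $\hat{u}_i$ is normally distributed with mean zero. Heuristically, treating $\hat{u}_i / \sigma$ as standard normal,

$$(n - k - 1)\hat{\sigma}^2/\sigma^2 = \sum_{i=1}^{n} (\hat{u}_i/\sigma)^2 \sim \chi^2_{n-k-1}$$

The degrees of freedom are n − k − 1 rather than n because the OLS residuals satisfy k + 1 linear restrictions (the first-order conditions).

Thus, the standardized estimator:

$$t_{\hat{\beta}_j} = \frac{\hat{\beta}_j - \beta_j}{se(\hat{\beta}_j)} \sim t_{n-k-1}$$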



Hypothesis Testing General Procedure

The General Procedure of a t-test

Form the null hypothesis H0 : βj = a (usually a = 0)

Calculate the t-statistic under H0 :

$$t_{\hat{\beta}_j} = \frac{\hat{\beta}_j - a}{se(\hat{\beta}_j)}$$

Use the t-statistic to determine whether to reject H0


Reject H0 if it’s unlikely to observe the value of β̂j under H0

Compute the p-value (the probability, under H0 , of observing a t-statistic at least as extreme as the one obtained from the sample)


The null hypothesis (H0)


Usually, we are interested in whether a particular explanatory variable xj has an
effect on the dependent variable y .
The null hypothesis is:
H0 : βj = 0
That is, after x1 , · · · , xj−1 , xj+1 , · · · , xk have been accounted for, xj has no effect
on the expected value of y .
Can we state the opposite (xj has a partial effect on y ) as the null hypothesis?
The corresponding alternative hypothesis is:

H1 : βj ≠ 0


Example: The Effect of Education on Earnings

Consider the wage equation:

log(wage) = β0 + β1 educ + β2 exper + β3 tenure + u

Suppose we want to test whether education has an effect on wage.

What’s the null hypothesis? What’s the meaning of this hypothesis?

What’s the alternative hypothesis?


Computing the t-statistic


The t-statistic under the null hypothesis H0 : βj = 0:

$$t_{\hat{\beta}_j} = \frac{\hat{\beta}_j - 0}{se(\hat{\beta}_j)} = \frac{\hat{\beta}_j}{se(\hat{\beta}_j)}$$

Does tβ̂j = 0 when H0 is true?


tβ̂j measures how far away β̂j is from 0 in terms of standard errors.
If |tβ̂j | is large, β̂j is far away from 0 relative to its standard error, which is unlikely if H0 is true.
This provides evidence against H0 .
Does a large tβ̂j mean β̂j is large?
tβ̂j and β̂j always have the same sign (Why?).
But how large is large?

Testing the Hypothesis

We reject the null hypothesis if the probability of observing the value of β̂j under
H0 is too small.

One popular choice of the cutoff probability is 5%.

This is called the significance level (α) of the test.

If we rejected H0 , what is the probability that we made a mistake (Type I error)?


The Critical Values

We can use the t-distribution to find the critical value c > 0 of the t-statistic at
a given significance level α (i.e., the value that cuts off the most extreme 100·α% of the
t-distribution).

The critical values c depend on:


The significance level α
The degrees of freedom of the t-distribution: n − k − 1
The type of test: one-sided or two-sided


One-sided vs. Two-sided Test

One-sided test (right-tailed): We only reject the most extreme values of β̂j that are
too large.

One-sided test (left-tailed): We only reject the most extreme values of β̂j that are
too small.

Two-sided test: We reject the most extreme values of β̂j that are either too large
or too small.



Hypothesis Testing One-sided Test

One-sided Test (Right-tailed)


We use the right-tailed test to test whether the parameter is positive.
The alternative hypothesis is thus:

H1 : βj > 0

The critical value c is the value such that:

Pr(tβ̂j > c) = α

The rejection rule is to reject H0 in favor of H1 if

tβ̂j > c

With n − k − 1 = 28 degrees of freedom, c = 1.701
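
The meaning of the critical value can be illustrated with a small Monte Carlo experiment. This is a sketch under stated assumptions: a simple regression with n = 30 (so 28 degrees of freedom), N(0, 1) errors, a true slope of exactly zero, and c = 1.701 taken as the 5% right-tailed critical value. The helper `slope_t_stat` is a hypothetical name introduced for this illustration.

```python
import math
import random

random.seed(2)

def slope_t_stat(x, y, beta1_null):
    """t-statistic for the slope of a simple regression of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    d = [xi - x_bar for xi in x]
    tss_x = sum(di ** 2 for di in d)
    b1 = sum(di * (yi - y_bar) for di, yi in zip(d, y)) / tss_x
    b0 = y_bar - b1 * x_bar
    rss = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    sigma_hat = math.sqrt(rss / (n - 2))       # df = n - k - 1 with k = 1
    se_b1 = sigma_hat / math.sqrt(tss_x)
    return (b1 - beta1_null) / se_b1

n, reps, c = 30, 10000, 1.701                  # df = 28
x = [random.uniform(0, 10) for _ in range(n)]  # regressors held fixed across samples

rejections = 0
for _ in range(reps):
    # H0 is true by construction: the slope is exactly 0.
    y = [1.0 + 0.0 * xi + random.gauss(0, 1) for xi in x]
    if slope_t_stat(x, y, 0.0) > c:
        rejections += 1

rejection_rate = rejections / reps
print(f"share of samples with t > {c}: {rejection_rate:.3f}")
```

When H0 is true, the share of samples in which the t-statistic exceeds c should be close to the significance level, which is exactly the Type I error rate the test is designed to have.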


[Figure: One-sided Test (Right-tailed)]


One-sided Test (Right-tailed)


In a right-tailed test, the null hypothesis is essentially:

H0 : βj ≤ 0

We nevertheless calculate the t-statistic tβ̂j under H0 : βj = 0


The hypothesis H0 : βj = 0 is the hardest to reject among all H0 : βj ≤ 0.
If you can reject H0 : βj = 0, you can always reject H0 : βj < 0

Do we reject H0 in the following cases (with c = 1.701)?


tβ̂j = 2? tβ̂j = 0.5? tβ̂j = −5?

The same procedure can be used with other significance levels (e.g., 1% and 10%):
E.g., with n − k − 1 = 21 degrees of freedom, c0.1 = 1.323, c0.01 = 2.518

Example: Hourly Wage Equation

Suppose we want to test whether experience has a positive effect on wage


Test H0 : βexper = 0 against H1 : βexper > 0
t-statistic: texper = .0041/.0017 ≈ 2.41
Degrees of freedom: n − k − 1 = 526 − 3 − 1 = 522
Given critical values c0.05 = 1.645, c0.01 = 2.326
Conclusion: The effect of experience on hourly wage is statistically greater than
zero at the 5% (and even at the 1%) significance level.
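
The arithmetic of this example can be reproduced in a few lines. This is a minimal sketch; the function names `t_statistic` and `reject_right_tailed` are hypothetical helpers introduced here, and the inputs are the estimate, standard error, and critical values quoted in the example.

```python
def t_statistic(beta_hat, se, null_value=0.0):
    """t = (estimate - hypothesized value) / standard error."""
    return (beta_hat - null_value) / se

def reject_right_tailed(t, c):
    """Right-tailed rule: reject H0 in favor of H1: beta > null value if t > c."""
    return t > c

# Numbers from the hourly wage example above.
t_exper = t_statistic(0.0041, 0.0017)
print(round(t_exper, 2))                        # 2.41
print(reject_right_tailed(t_exper, 1.645))      # True (5% level)
print(reject_right_tailed(t_exper, 2.326))      # True (1% level)
```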

One-sided Test (Left-tailed)


We use the left-tailed test to test whether the parameter is negative.
The alternative hypothesis is thus:

H1 : βj < 0

The critical value c is the value such that:

Pr(tβ̂j < −c) = α

The rejection rule is to reject H0 in favor of H1 if

tβ̂j < −c

With n − k − 1 = 18 degrees of freedom, c = 1.734


[Figure: One-sided Test (Left-tailed)]


Example: Student Performance and School Size


Suppose we want to test whether smaller school size leads to better student performance

Test H0 : βenroll = 0 against H1 : βenroll < 0


t-statistic: tenroll = − 0.00020/0.00022 ≈ −0.91
df = n − k − 1 = 408 − 3 − 1 = 404
Critical values are −c0.05 = −1.65, −c0.15 = −1.04
We cannot reject the null hypothesis that there is no effect of school size on
student performance at the 5% significance level (not even at the 15% significance
level).
What about the effects of totalcomp and staff ?

Example: Student Performance and School Size


Suppose we re-estimate the model with the log of the independent variables:

Test H0 : βlog(enroll) = 0 against H1 : βlog(enroll) < 0


t-statistic: tlog(enroll) = − 1.29/0.69 ≈ −1.87
Critical values are −c0.05 = −1.65, −c0.01 = −2.33
Conclusion: The null hypothesis can be rejected at the 5% significance level, but
not at the 1% significance level.
How large is the effect? A 10% increase in school size is associated with a
0.129 percentage point decrease in the passing rate.
Hypothesis Testing Two-sided Test

Two-sided Test
Most commonly, we are interested in whether the parameter is different from zero.

The alternative hypothesis is:

H1 : βj ≠ 0

The critical values c are the values such that:

Pr(|tβ̂j | > c) = α

The rejection rule is to reject H0 in favor of H1 if

|tβ̂j | > c

With n − k − 1 = 25 degrees of freedom, c = 2.06


[Figure: Two-sided Test]


Example: Determinants of College GPA

Critical values are c0.01 = 2.58, c0.05 = 1.96, c0.10 = 1.645

Why are they different from the one-sided test?
t_hsGPA = 4.38 > c0.01
t_ACT = 1.36 < c0.10
t_skipped = −3.19 < −c0.01



Hypothesis Testing Other test

Testing Other Hypothesis about βj


So far, we’ve been testing the most common hypothesis, H0 : βj = 0.
However, we can also test other hypotheses about βj :

H0 : βj = aj

What is the t-statistic under this hypothesis?

$$t = \frac{\hat{\beta}_j - a_j}{se(\hat{\beta}_j)} \sim t_{n-k-1}$$

Suppose we rejected the hypothesis against H1 : βj > aj . What does it mean?
“β̂j is statistically greater than aj ” at the appropriate significance level.

Example: Campus Crime and Enrollment


Suppose we are interested in whether crime increases by one percent if enrollment
is increased by one percent.

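Hypothesis: H0 : βlog(enroll) = 1, H1 : βlog(enroll) ≠ 1

t-statistic:

$$t_{\log(enroll)} = \frac{1.27 - 1}{0.11} \approx 2.45 > c_{0.05}$$

The hypothesis is rejected at the 5% significance level. Against this two-sided alternative it is not rejected at the 1% level, since 2.45 is below the large-sample critical value of 2.576.

But why do we hypothesize H0 : βlog(enroll) = 1?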



Example: Campus Crime and Enrollment


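The relationship between enrollment and crime is modeled as:

$$crime = \exp(\beta_0) \cdot enroll^{\beta} \cdot \exp(u)$$

Taking logs gives log(crime) = β0 + β log(enroll) + u, so β is the elasticity of crime with respect to enrollment; β = 1 means that crime is proportional to enrollment.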


Example: Housing Prices and Air Pollution

Suppose we are interested in whether the elasticity of price with respect to pollution (nox) is −1:

Hypothesis: H0 : βnox = −1, H1 : βnox ≠ −1


t-statistic:

$$t_{nox} = \frac{-0.954 - (-1)}{0.117} \approx 0.393$$



Hypothesis Testing p-value

Problems with Significance Levels

One needs to determine the significance level α before conducting the test.

The choice of α is somewhat arbitrary.

Information is lost when we only report whether the null hypothesis is rejected or
not at a given significance level.

We draw different conclusions for t-statistics of 1.95 and 1.97. But are they really
different?

An alternative strategy is to report the p-value of the test.


P-values
The p-value is the probability, under the null hypothesis H0 , of observing a t-statistic
at least as extreme as the one computed from the sample. For a two-sided test:

p = Pr (|T | > |t|)

It is also the smallest significance level at which we would still reject H0 .


If the p-value is 0.051, we would reject H0 at the 5.1% significance level, but not at
the 5% significance level.
A small p-value is evidence against the null hypothesis because one would reject
the null hypothesis even at small significance levels.
A large p-value is evidence in favor of the null hypothesis because one would not
reject the null hypothesis even at large significance levels.
P-values are more informative than tests at fixed significance levels.
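
A two-sided p-value can be computed directly once a reference distribution is chosen. The sketch below assumes the degrees of freedom are large enough that the standard normal can stand in for the t-distribution; the function name `two_sided_p_value_normal` is a hypothetical helper, and for small samples a t-distribution CDF from a statistics library would be needed instead.

```python
from statistics import NormalDist

def two_sided_p_value_normal(t):
    """Two-sided p-value Pr(|T| > |t|), using N(0, 1) in place of the t-distribution."""
    return 2 * (1 - NormalDist().cdf(abs(t)))

for t in (1.95, 1.97):
    p = two_sided_p_value_normal(t)
    print(f"t = {t}: p = {p:.4f}, reject at 5%: {p < 0.05}")
```

The two t-statistics fall on opposite sides of the 5% cutoff even though their p-values are nearly identical, which is the information a fixed-level reject/not-reject report throws away.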
[Figure: Computing the p-value]


Hypothesis Testing Remarks

Remark 1: Two-sided Test vs. One-sided Tests

We generally favor the two-sided test over the one-sided ones. Why?
It does not assume the direction of the effect
It is more conservative (larger critical value)
It prevents the formation of a hypothesis after observing the estimate from the data


Remark 2: “Not Rejecting” ≠ “Accepting” H0

What does “cannot reject H0 ” mean?

Does it mean that βj = 0?

No. We only fail to reject H0 , not accept it.

The observed value of β̂j is not inconsistent with H0 , but it does not prove H0 .

“Absence of evidence is not evidence of absence.”


Remark 3: “Statistically Significant” Variables

If a regression coefficient is different from zero in a two-sided test, the corresponding variable is said to be statistically significant.

If the number of df is large enough so that the normal approximation applies, we can use the following rules of thumb:
|tβ̂j | > 1.645: “statistically significant at 10% level”
|tβ̂j | > 1.96: “statistically significant at 5% level”
|tβ̂j | > 2.576: “statistically significant at 1% level”
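
These rules of thumb translate directly into a small lookup. This is a minimal sketch that assumes the large-df normal approximation stated above; `significance_label` is a hypothetical helper name, and the inputs are the t-statistics from the college GPA example earlier in these slides.

```python
def significance_label(t):
    """Large-df rule of thumb for a two-sided test of H0: beta_j = 0."""
    abs_t = abs(t)
    if abs_t > 2.576:
        return "significant at the 1% level"
    if abs_t > 1.96:
        return "significant at the 5% level"
    if abs_t > 1.645:
        return "significant at the 10% level"
    return "not significant at the 10% level"

# t-statistics from the college GPA example.
for name, t in [("hsGPA", 4.38), ("ACT", 1.36), ("skipped", -3.19)]:
    print(name, "->", significance_label(t))
```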


Remark 4: “Statistical” vs. “Economic” Significance

“Statistically significant” means that the effect of the variable is statistically different from zero.

Does it mean that the effect is large in magnitude?

No. tβ̂j can be large even if β̂j is small. How?

se(β̂j ) can be small.

When is se(β̂j ) small?


Example: Participation Rates in 401(k) Plans

The t-statistic for the total number of firm employees (totemp) is:
ttotemp = − 0.00013/0.00004 = −3.25
This effect is statistically significant at the 1% level.
How large is this effect economically? A 10,000 increase in the number of
employees is associated with a 1.3 percentage point decrease in the participation
rate.
So firm size does affect the participation rate, but the effect is not practically very
large.

Practical Guidelines on “Statistical” vs. “Economic” Significance

If a variable is statistically significant, discuss the magnitude of the coefficient to
get an idea of its economic or practical importance.
The fact that a coefficient is statistically significant does not necessarily mean it is
economically or practically significant!
If a variable is statistically and economically important but has the “wrong” sign,
the regression model might be misspecified.
If a variable is statistically insignificant at the usual levels (10%, 5%, or 1%), one
may think of dropping it from the regression.
Why? We cannot distinguish the effect from zero.
If the sample size is small, effects might be imprecisely estimated so that the case
for dropping insignificant variables is less strong.
CI


The Confidence Interval


We have defined the critical values c (for two-sided tests) such that:

Pr[|tβ̂j | > c] = α ⇒ Pr[−c < tβ̂j < c] = 1 − α

Since $t_{\hat{\beta}_j} = \frac{\hat{\beta}_j - \beta_j}{se(\hat{\beta}_j)}$, we have:

Pr[β̂j − c · se(β̂j ) < βj < β̂j + c · se(β̂j )] = 1 − α


That is, under repeated sampling, in 100(1 − α)% of the samples (e.g., 95% when α = 5%), the true value of
βj lies in the interval:
β̂j ± c · se(β̂j )
This interval is called the confidence interval (CI) of βj
With α = 5%, the confidence interval is the 95% confidence interval.

Interpretation of the Confidence Interval

Can we say that the probability that βj lies in the interval is 95%? No. Why not?
The bounds of the interval are random, but the true value of βj is fixed.
In repeated samples, the interval that is constructed will cover the true value of βj
in 95% of the cases.
However, for a given sample and the estimated interval from it, βj is either in the
interval or not.
We hope that this is one of the 95% of the samples where the interval covers βj ,
but there is no guarantee.


Confidence Intervals from Repeated Sampling

The interval covers the true value of β in 95% of the cases.
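
The repeated-sampling interpretation can be reproduced by simulation. The sketch below assumes a simple regression with n = 30, N(0, 1) errors, a hypothetical true slope of 0.5, and c = 2.048 (the 97.5th percentile of the t-distribution with 28 degrees of freedom); `slope_and_se` is a helper name introduced for this illustration.

```python
import math
import random

random.seed(3)

def slope_and_se(x, y):
    """OLS slope and its standard error in a simple regression of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    d = [xi - x_bar for xi in x]
    tss_x = sum(di ** 2 for di in d)
    b1 = sum(di * (yi - y_bar) for di, yi in zip(d, y)) / tss_x
    b0 = y_bar - b1 * x_bar
    rss = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    se_b1 = math.sqrt(rss / (n - 2)) / math.sqrt(tss_x)
    return b1, se_b1

beta1_true, n, reps, c = 0.5, 30, 4000, 2.048
x = [random.uniform(0, 10) for _ in range(n)]   # regressors held fixed across samples

covered = 0
for _ in range(reps):
    y = [1.0 + beta1_true * xi + random.gauss(0, 1) for xi in x]
    b1, se = slope_and_se(x, y)
    if b1 - c * se < beta1_true < b1 + c * se:   # does this interval cover the truth?
        covered += 1

coverage = covered / reps
print(f"share of intervals covering the true slope: {coverage:.3f}")
```

Each simulated sample yields a different interval, while the true slope never moves; the share of intervals that cover it should be close to 95%.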



Constructing the Confidence Interval

Confidence Intervals:
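$$\Pr[\underline{\beta}_j < \beta_j < \overline{\beta}_j] = 1 - \alpha$$

where $\underline{\beta}_j \equiv \hat{\beta}_j - c \cdot se(\hat{\beta}_j)$ and $\overline{\beta}_j \equiv \hat{\beta}_j + c \cdot se(\hat{\beta}_j)$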

What do we need to compute the confidence interval?


The estimated β̂j and its standard error se(β̂j )
The critical value c (depends on the significance level α and the degrees of freedom)

For a 95% confidence interval, the critical values −c and c are at the 2.5% and
97.5% percentiles of the t-distribution. (Why?)
With large sample sizes (n − k − 1 > 120), the critical values are approximately 1.96.


Confidence Interval and Hypothesis Testing

We can use the confidence interval to test the null hypothesis H0 : βj = a.

If the confidence interval does not contain a, we would reject H0 at the α significance level (why?).

If the confidence interval contains a, we would not reject H0 at the α significance level.


Example: Model of R&D Expenditures

What is the 95% confidence interval for the effect of sales on rd?
CI = 1.084 ± 2.045 × 0.060 = (0.961, 1.21)
What is the 95% confidence interval for the effect of profits on rd?
CI = 0.0217 ± 2.045 × 0.0128 = (−0.0045, 0.0479)
Do we reject the null hypotheses?
How do we interpret the confidence intervals?
Can we say that the true effect of sales on rd is between 0.961 and 1.21 with
95% probability?
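
The interval for the sales coefficient, and its use as a two-sided test, can be sketched as follows. The helper names `confidence_interval` and `reject_by_ci` are hypothetical; the inputs (estimate 1.084, standard error 0.060, critical value 2.045) are the ones quoted in the example.

```python
def confidence_interval(beta_hat, se, c):
    """Return (lower, upper) = beta_hat -/+ c * se."""
    return beta_hat - c * se, beta_hat + c * se

def reject_by_ci(interval, a):
    """Reject H0: beta_j = a against a two-sided alternative if a lies outside the interval."""
    lower, upper = interval
    return not (lower < a < upper)

# Sales coefficient from the R&D example.
ci_sales = confidence_interval(1.084, 0.060, 2.045)
print(tuple(round(b, 3) for b in ci_sales))   # (0.961, 1.207)
print(reject_by_ci(ci_sales, 0))              # True: 0 is outside the interval
print(reject_by_ci(ci_sales, 1))              # False: 1 is inside the interval
```

The same interval therefore rejects a zero effect at the 5% level but does not reject a coefficient of one.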
Linear Combination of Parameters


Testing the Relationship between Parameters


In some cases, we are interested in testing the relationship between two or more
parameters.
For example, we might want to test whether the effect of X1 on Y is the same as
the effect of X2 on Y .
Or we might want to test whether the sum of the effects of X1 and X2 on Y is
equal to 1.
In these cases, we are interested in testing a linear combination of the
parameters.
Let’s consider the simple case:

H0 : β1 = β2 or H0 : β1 − β2 = 0


Testing H0 : β1 − β2 = 0
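What is the t-statistic?

$$t = \frac{\hat{\beta}_1 - \hat{\beta}_2}{se(\hat{\beta}_1 - \hat{\beta}_2)}$$

But what is se(β̂1 − β̂2 )?

From textbook Math Refresher B:

$$Var(\hat{\beta}_1 - \hat{\beta}_2) = Var(\hat{\beta}_1) + Var(\hat{\beta}_2) - 2\,Cov(\hat{\beta}_1, \hat{\beta}_2)$$

$$\Rightarrow se(\hat{\beta}_1 - \hat{\beta}_2) = \sqrt{[se(\hat{\beta}_1)]^2 + [se(\hat{\beta}_2)]^2 - 2 s_{12}}$$

where s12 is an estimate of Cov(β̂1 , β̂2 )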


How do we estimate s12 ?
This is too complicated and usually unavailable in regression output.
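
If the covariance estimate were available, the formula would be simple to apply. The sketch below uses purely hypothetical inputs, chosen only to show how the covariance term changes the result; `se_of_difference` is a helper name introduced here.

```python
import math

def se_of_difference(se1, se2, s12):
    """se(b1 - b2) = sqrt(se1^2 + se2^2 - 2 * s12), with s12 the estimated covariance."""
    return math.sqrt(se1 ** 2 + se2 ** 2 - 2 * s12)

# Hypothetical standard errors and covariances (not from any regression in these slides).
print(se_of_difference(0.3, 0.4, 0.0))    # zero covariance
print(se_of_difference(0.3, 0.4, 0.08))   # positive covariance gives a smaller se
```

Ignoring a positive covariance would overstate the standard error of the difference, and ignoring a negative one would understate it, which is why the two individual standard errors alone are not enough.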

An Alternative Strategy

We can define θ1 = β1 − β2 , so that β1 = θ1 + β2

So the model can be rewritten as:

y = β0 + (θ1 + β2 )X1 + β2 X2 + u
= β0 + θ1 X1 + β2 (X1 + X2 ) + u

We can then estimate the modified model and test H0 : θ1 = 0.
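
The reparameterization can be verified on simulated data: the coefficient on X1 in the modified model equals the difference of the two coefficients in the original model. This is a sketch under stated assumptions; the data-generating values are hypothetical, and `ols` is a small helper written for this illustration that solves the normal equations rather than calling a regression library.

```python
import random

random.seed(4)

def ols(columns, y):
    """OLS coefficients (intercept first) from the normal equations X'X b = X'y."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    xtx = [[sum(row[r] * row[c] for row in X) for c in range(p)] for r in range(p)]
    xty = [sum(row[r] * yi for row, yi in zip(X, y)) for r in range(p)]
    A = [xtx[r] + [xty[r]] for r in range(p)]        # augmented matrix [X'X | X'y]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[r][p] / A[r][r] for r in range(p)]

n = 200
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [random.gauss(0, 1) for _ in range(n)]
y = [1.0 + 0.8 * a + 0.5 * b + random.gauss(0, 1) for a, b in zip(x1, x2)]

b0, b1, b2 = ols([x1, x2], y)                        # original model: y on X1, X2
x_sum = [a + b for a, b in zip(x1, x2)]
g0, theta1, g2 = ols([x1, x_sum], y)                 # modified model: y on X1, (X1 + X2)

print(b1 - b2, theta1)
```

Because θ̂1 is now an ordinary regression coefficient, standard regression output reports its standard error directly, and no covariance estimate has to be computed by hand.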


Example: Education Return at Two-Year vs. Four-Year Colleges

Suppose we want to test whether the return to education is smaller at two-year
than at four-year colleges.

H0 : β1 − β2 = 0, H1 : β1 − β2 < 0
But we cannot compute the t-statistic directly given the regression output.
Alternatively, define θ1 = β1 − β2 :


Example: Education Return at Two-Year vs. Four-Year Colleges

Generate a new variable totcoll = jc + univ and estimate:

t-statistic: tθ̂1 = −0.0102/0.0069 ≈ −1.48

95% confidence interval (c = 1.96): −0.0102 ± 1.96 × 0.0069 = (−0.0237, 0.0033)

p-value: Pr(tθ̂1 < −1.48) = 0.070

Conclusion?
F-test


Multiple/Joint Hypothesis Testing

So far, we have been testing a single hypothesis with:


One parameter: H0 : βj = a (usually a = 0)
A linear combination of parameters: e.g., H0 : β1 − β2 = 0

But we might be interested in testing multiple hypotheses at the same time.


For example, whether a group of variables has no effect on the dependent variable.


Example: Performance and Salary of Baseball Players


Suppose we want to test whether the performance measures of baseball players are
related to their salary.

Null hypothesis: Once years in the league and games per year have been controlled
for, the statistics measuring performance have no effect on salary.
H0 : β3 = 0, β4 = 0, β5 = 0
H1 : H0 is not true

Example: Performance and Salary of Baseball Players


Estimation Results:

Are β̂3 , β̂4 , and β̂5 statistically significant at the 5% level?


If none of them is individually significant, can we conclude that H0 holds, i.e., that the performance measures have no effect on
salary? Why not?
Each single t-test assumes that other factors are held constant. But the joint effect
of these variables does not have such restriction.
How do we test the joint hypothesis?

Joint Hypothesis Testing


Idea: To test the joint hypothesis that a group of parameters are jointly equal to
zero, we can test whether the inclusion of these parameters significantly improves
the fit of the model.
What is the measure of model fit? $R^2 = 1 - \frac{RSS}{TSS}$

Can we simply compare the R 2 of estimates with and without the group of
parameters?
No, because R 2 never decreases when more variables are added.

We need to determine whether the increase in R 2 (or the decrease in RSS) is statistically significant.

How do we do it?

Restricted and Unrestricted Models


Consider the model:
y = β0 + β1 x1 + · · · + βk xk + u
This is called the unrestricted model.
Suppose we want to test the joint hypothesis that q of these variables have zero
coefficients
H0 : βk−q+1 = · · · = βk = 0
We can have another model without the group of variables hypothesized to be zero:

y = β0 + β1 x1 + · · · + βk−q xk−q + u

This is called the restricted model.


Why “restricted”? Because we have imposed the restrictions under H0 on the
parameters.

The F-statistic
We define the F-statistic as:
F = [(RSSr − RSSur)/q] / [RSSur/(n − k − 1)]

where r and ur denote the restricted and unrestricted models, respectively.


RSSr is the RSS from the restricted model, and RSSur is the RSS from the
unrestricted model. Which one is larger?
The F-statistic can be seen as measuring the relative increase in RSS when moving
from the unrestricted to the restricted model.
q = dfr − dfur is the number of restrictions imposed under H0 . Why?
The F-statistic is always non-negative. Why?
What is the distribution of the F-statistic?

The F-distribution

Under H0, the F-statistic follows an F-distribution with q and n − k − 1 degrees of
freedom:

F = [(RSSr − RSSur)/q] / [RSSur/(n − k − 1)] ∼ F(q, n − k − 1)

An F-distributed random variable can be viewed as the ratio of two independent
χ²-distributed random variables, each divided by its degrees of freedom:

F = (X1/k1)/(X2/k2) ∼ F(k1, k2)
where X1 ∼ χ²(k1) and X2 ∼ χ²(k2)

What is a χ²-distribution? The distribution of the sum of k squared independent
standard normal random variables, written χ²(k).
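
The construction above can be checked by simulation. The following is a minimal sketch using only the Python standard library; the degrees of freedom (3 and 20), the seed, and the number of draws are illustrative assumptions, not values from the lecture. It builds each χ² draw as a sum of squared standard normals and compares the simulated mean of the ratio with the known mean of an F(k1, k2) variable, k2/(k2 − 2).

```python
import random

def chi_square_draw(k, rng):
    """One chi-square(k) draw: the sum of k squared independent N(0, 1) draws."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))

def f_draw(k1, k2, rng):
    """One draw of (X1/k1)/(X2/k2) with independent chi-square X1 and X2."""
    x1 = chi_square_draw(k1, rng)
    x2 = chi_square_draw(k2, rng)
    return (x1 / k1) / (x2 / k2)

rng = random.Random(0)   # fixed seed (arbitrary choice) for reproducibility
k1, k2 = 3, 20           # illustrative degrees of freedom (assumption)
draws = [f_draw(k1, k2, rng) for _ in range(20000)]

sample_mean = sum(draws) / len(draws)
theoretical_mean = k2 / (k2 - 2)   # mean of F(k1, k2), valid for k2 > 2

print("simulated mean:  ", round(sample_mean, 3))
print("theoretical mean:", round(theoretical_mean, 3))
print("smallest draw:   ", min(draws))   # never negative: ratio of sums of squares
```

Every draw is a ratio of sums of squares, which is one way to see why an F-distributed variable cannot be negative.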

[Figure: The F-distribution]


The F-test
We use the F-statistic to conduct the F-test:
F = [(RSSr − RSSur)/q] / [RSSur/(n − k − 1)]

The critical value c at significance level α is determined by

Pr[F > c] = α under H0

Degrees of freedom: q and n − k − 1

Decision rule: Reject H0 if F > c
If H0 : βk−q+1 = · · · = βk = 0 is rejected, we say that the group of variables
xk−q+1 , · · · , xk are jointly significant at the appropriate significance level.
We can also compute the p-value of the F-statistic: p-value = Pr[ℱ > F], where
ℱ ∼ F(q, n − k − 1) and F is the observed value of the statistic.

[Figure: The F-test]


Example: Performance and Salary of Baseball Players


Estimation of the unrestricted model:

Estimation of the restricted model:

q = 3, n − k − 1 = 347
What is the F-statistic? F = [(198.311 − 183.186)/3] / [183.186/347] ≈ 9.55
Conclusion given c0.01 = 3.78?
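
The calculation can be reproduced in a few lines. This is a sketch rather than course code: the helper name `f_statistic` is a hypothetical choice, while the RSS values, q, the degrees of freedom, and the 1% critical value are the ones quoted on this slide.

```python
def f_statistic(rss_r, rss_ur, q, df_ur):
    """F = [(RSS_r - RSS_ur)/q] / [RSS_ur/df_ur], with df_ur = n - k - 1."""
    if q <= 0 or df_ur <= 0:
        raise ValueError("q and df_ur must be positive")
    return ((rss_r - rss_ur) / q) / (rss_ur / df_ur)

# Numbers from the baseball example on this slide.
f_baseball = f_statistic(rss_r=198.311, rss_ur=183.186, q=3, df_ur=347)

critical_1pct = 3.78                 # c at the 1% level, as given on the slide
reject = f_baseball > critical_1pct  # decision rule: reject H0 if F > c

print(f"F = {f_baseball:.2f}, reject H0 at the 1% level: {reject}")
```

Since F exceeds the critical value, H0 is rejected at the 1% level: the performance measures are jointly significant.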

t-test and F-test


Suppose we have a single hypothesis: H0 : βj = 0, H1 : βj ≠ 0, and we conducted a
t-test (two-sided).
Can we conduct an F-test for the same hypothesis?
What is the restricted model? What is q?
So which test should we use? Actually, the two tests are equivalent. Why?
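The square of a t-distributed random variable follows an F-distribution:
if t ∼ t(n − k − 1), then t² ∼ F(1, n − k − 1).

Why?
Recall that T = Z/√(X/k) and F = (X1/k1)/(X2/k2),
where Z ∼ N(0, 1), X ∼ χ²(k), X1 ∼ χ²(k1), and X2 ∼ χ²(k2).
Then T² = (Z²/1)/(X/k), and Z² ∼ χ²(1), so T² ∼ F(1, k).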
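
The equivalence also holds for the computed statistics, not only for the distributions. A minimal numerical check in a simple regression with one regressor, using made-up data (the x and y values below are hypothetical): the restricted model under H0 : β1 = 0 is y = β0 + u, so q = 1 and RSSr equals TSS.

```python
import math

def simple_ols(x, y):
    """OLS of y on a constant and x; returns slope, its standard error, RSS_r, RSS_ur."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    # Unrestricted model: y = b0 + b1*x + u
    rss_ur = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    # Restricted model under H0 (b1 = 0): y = b0 + u, fitted value is the mean of y
    rss_r = sum((yi - y_bar) ** 2 for yi in y)
    se_b1 = math.sqrt(rss_ur / (n - 2) / sxx)
    return b1, se_b1, rss_r, rss_ur

# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.2, 1.9, 3.2, 3.8, 5.1, 5.8]

b1, se_b1, rss_r, rss_ur = simple_ols(x, y)
n = len(x)
t_stat = b1 / se_b1
f_stat = ((rss_r - rss_ur) / 1) / (rss_ur / (n - 2))   # q = 1, df = n - k - 1 = n - 2

print("t squared:", t_stat ** 2)
print("F:        ", f_stat)
```

The two printed numbers agree up to floating-point error, and rss_r is at least as large as rss_ur, as it must be when restrictions are imposed.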

Individual and Joint Significance


Suppose we are interested in the effects of q of the variables at some significance
level.

If the group of variables xk−q+1 , · · · , xk is jointly significant, does it mean that
each of the variables is individually significant? At least one of the variables is
significant?
No, as we have seen in the baseball example, none of the performance measures is
significant individually, but they are jointly significant.
Why might this happen? Multicollinearity!

If xk−q+1 is significant, does it mean that the group of variables xk−q+1 , · · · , xk
must be jointly significant?
No, the joint test could fail to detect the significance of xk−q+1 if the other
variables are not significant.

The R 2 Form of the F-statistic


Given that RSSr = TSS(1 − R²r) and RSSur = TSS(1 − R²ur), we can also express
the F-statistic in terms of R²s:

F = [(R²ur − R²r)/q] / [(1 − R²ur)/(n − k − 1)] = [(R²ur − R²r)/q] / [(1 − R²ur)/dfur]

In the baseball example:

R²ur = 0.6278, R²r = 0.5971, q = 3, n − k − 1 = 347
F = [(0.6278 − 0.5971)/3] / [(1 − 0.6278)/347] ≈ 9.54
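
A short sketch of why the two forms agree: TSS cancels from numerator and denominator. The function name `f_statistic_r2` and the TSS value of 100 are hypothetical; the R² values, q, and degrees of freedom are those of the baseball example. The small gap between 9.54 here and 9.55 from the RSS form comes from rounding in the reported R²s.

```python
import math

def f_statistic_r2(r2_ur, r2_r, q, df_ur):
    """R-squared form: F = [(R2_ur - R2_r)/q] / [(1 - R2_ur)/df_ur]."""
    return ((r2_ur - r2_r) / q) / ((1.0 - r2_ur) / df_ur)

r2_ur, r2_r, q, df_ur = 0.6278, 0.5971, 3, 347   # baseball example
f_from_r2 = f_statistic_r2(r2_ur, r2_r, q, df_ur)

# Rebuild the RSS form from the same R-squared values with an arbitrary common TSS.
tss = 100.0                      # hypothetical value; any positive TSS gives the same F
rss_r = tss * (1.0 - r2_r)
rss_ur = tss * (1.0 - r2_ur)
f_from_rss = ((rss_r - rss_ur) / q) / (rss_ur / df_ur)

print(f"F (R-squared form) = {f_from_r2:.2f}")
print("forms agree:", math.isclose(f_from_r2, f_from_rss, rel_tol=1e-9))
```

The agreement depends on both models sharing the same dependent variable, and therefore the same TSS.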

The Overall Significance of a Regression


One special case is to test whether none of the variables have an effect on the
dependent variable.
Null hypothesis:
H0 : β1 = · · · = βk = 0

What is the restricted model?


y = β0 + u

q = k, and R²r = 0
What is the F-statistic?
F = [R²/k] / [(1 − R²)/(n − k − 1)] ∼ F(k, n − k − 1)

The General Linear Restrictions

So far, we have been testing whether a group of variables all have zero coefficients.

These are called tests of exclusion restrictions because we are excluding the
variables from the model.

We might also be interested in testing joint hypotheses where the coefficients are
not necessarily zero.

For example: H0 : β1 = 1, β2 = β3 = β4 = 0


Example: House Price Assessments


Suppose we’re interested in whether the assessed value of a house is rational:
A 1% change in the assessed value should be associated with a 1% change in the
selling price.
The assessment already takes into account the key variables that determine the
selling price, so these variables should have no additional effect once the
assessed value is controlled for.

H0 : β1 = 1, β2 = β3 = β4 = 0

Testing the General Linear Restrictions


What is the unrestricted model?

y = β0 + β1 x1 + β2 x2 + β3 x3 + β4 x4 + u

What is the restricted model?

y = β0 + x1 + u (But how can we estimate it?)


⇒ y − x1 = β0 + u
How do we calculate the F-statistic? F = [(RSSr − RSSur)/q] / [RSSur/(n − k − 1)]
Can we calculate it as F = [(R²ur − R²r)/q] / [(1 − R²ur)/(n − k − 1)]?
Why not?
The dependent variable differs between the two regressions (y − x1 versus y), so
TSS is no longer the same in the restricted and unrestricted models.
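
A small numerical illustration of this point, using made-up data (the y and x1 values are hypothetical). The restricted regression has y − x1 on the left-hand side, so its TSS is computed from a different variable than the TSS of the unrestricted regression, and R²s from the two regressions cannot be combined.

```python
def total_sum_of_squares(values):
    """TSS: the sum of squared deviations from the sample mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

# Hypothetical data, for illustration only.
y = [2.1, 2.9, 4.2, 4.8, 6.1]
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]

# Unrestricted regression: the dependent variable is y.
tss_unrestricted = total_sum_of_squares(y)

# Restricted regression (beta1 = 1 imposed): the dependent variable is y - x1.
y_minus_x1 = [yi - xi for yi, xi in zip(y, x1)]
tss_restricted = total_sum_of_squares(y_minus_x1)

# With only an intercept on the right-hand side, the OLS fitted value is the
# sample mean of y - x1, so RSS_r coincides with the TSS of y - x1.
rss_r = tss_restricted

print("TSS of y:     ", tss_unrestricted)
print("TSS of y - x1:", tss_restricted)
print("RSS_r:        ", rss_r)
```

RSSr obtained this way is still measured in the same units as RSSur, which is why the RSS form of the F-statistic remains valid.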

Example: House Price Assessments

Unrestricted regression:

log(price) = β0 + β1 log(assess) + β2 log (lotsize) + β3 log (sqrft) + β4 bdrms + u

Restricted regression:

log(price) − log(assess) = β0 + u

With 88 observations, the estimation results are RSSr = 1.880 and RSSur = 1.822
q = 4, n − k − 1 = 83
F = [(RSSr − RSSur)/q] / [RSSur/(n − k − 1)] = [(1.880 − 1.822)/4] / [1.822/(88 − 4 − 1)] ≈ 0.661
F ∼ F(4, 83) ⇒ c0.05 = 2.50 ⇒ H0 cannot be rejected at the 5% level


Reporting the Regression Results

In research papers, regression results are usually reported in a table. We need to
include:
The estimated coefficients
Standard errors
Number of observations
R²
We can report multiple regression results as different columns in the same table.


A Typical Regression Table


Next Lecture: Functional Forms

Textbook Chapter 6
