ECON2280 Introductory Econometrics
Topic 7 Inference
Yiming Cao
The University of Hong Kong
March 17, 2025
ECON2280: Introductory Econometrics Topic 7: Inference Mar 17, 2025 1 / 87
Lecture Plan
Sampling Distribution of OLS Estimators
Hypothesis Testing of a Single Parameter
Confidence Intervals
Hypothesis Testing of Linear Combination of Parameters
Hypothesis Testing of Multiple Linear Restrictions
Sampling Distribution Motivation
Motivation
Linear Regression Model:
y = β0 + β1 x1 + β2 x2 + · · · + βk xk + u
Estimation:
ŷ = β̂0 + β̂1 x1 + β̂2 x2 + · · · + β̂k xk
What can we learn about the true value of βj from the sample estimate β̂j ?
Can we determine the true value of βj ?
Can we tell which values of βj are more likely than others?
What determines this “likelihood”?
Inferring Population Parameters from Sample Estimates
Hypothesis testing: Determining whether a particular value of βj is likely to be
the true value of βj .
Confidence intervals: Constructing a range of values that are likely to contain
the true value of βj .
Intuition of Hypothesis Testing
Suppose we have a random variable x that follows a distribution f (x; θ), where θ is
a parameter of the distribution.
We observe a random sample of x and estimate a θ̂, and we have a hypothesis
about the true value of θ.
We know that under any value θ, θ̂ is a random variable and has its own
distribution.
We can calculate the probability of observing the value θ̂ under the hypothesized
value of θ.
If the probability of observing the value θ̂ is very low under a particular value of θ,
we may “reject” that value of θ.
That is, we would be unlikely to observe this value of θ̂ if the hypothesized value of θ were true.
Intuition of Confidence Intervals
We estimate a coefficient β̂j from a sample.
We know that β̂j is a random variable, and is not guaranteed to be equal to the
true value of βj .
However, we can construct a range of values that are likely to contain the true
value of βj with a certain probability.
This range is called a confidence interval.
What Do We Need to Know?
Both inference methods rely on the sampling distribution of β̂j
Under Assumptions MLR.1–MLR.4:
$$E(\hat{\beta}_j) = \beta_j$$
Under Assumptions MLR.1–MLR.5:
$$\mathrm{Var}(\hat{\beta}_j) = \frac{\sigma^2}{TSS_j (1 - R_j^2)}$$
Are these sufficient?
Sampling Distribution Normality
Sampling Distribution of β̂1
Recall that in the simple linear regression model, we have:
$$\hat{\beta}_1 = \beta_1 + \frac{1}{TSS_x} \sum_{i=1}^{n} d_i u_i$$
where di = xi − x̄
When the values of x are given (i.e., treated as non-random), β̂1 is a linear
combination of the errors in the sample {ui : i = 1, 2, · · · , n}
The distribution of β̂1 is determined by the distribution of each ui .
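The decomposition above can be checked numerically. The following is a minimal sketch, not part of the original slides: the data-generating values (β0 = 1, β1 = 2, normal errors with standard deviation 1.5), the seed, and all variable names are assumptions chosen only for illustration.

```python
import random

random.seed(0)  # assumed seed, only for reproducibility

beta0, beta1 = 1.0, 2.0                    # assumed "true" parameters
x = [float(i) for i in range(1, 21)]       # regressor values, treated as fixed
u = [random.gauss(0.0, 1.5) for _ in x]    # assumed normal errors
y = [beta0 + beta1 * xi + ui for xi, ui in zip(x, u)]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
d = [xi - x_bar for xi in x]               # d_i = x_i - x_bar
tss_x = sum(di ** 2 for di in d)

# OLS slope computed directly from the data
beta1_hat = sum(di * (yi - y_bar) for di, yi in zip(d, y)) / tss_x

# The same slope written as beta1 plus a weighted sum of the errors
beta1_decomp = beta1 + sum(di * ui for di, ui in zip(d, u)) / tss_x

print(beta1_hat, beta1_decomp)
```

The two printed numbers agree up to floating-point error, which is the sense in which β̂1 inherits its randomness entirely from the errors once x is held fixed.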
Sampling Distribution of β̂j
Similarly, in the multiple linear regression model:
$$\hat{\beta}_j = \beta_j + \frac{1}{RSS_j} \sum_{i=1}^{n} \hat{\gamma}_{ij} u_i$$ (Remember why?)
When the values of (x1 , · · · , xk ) are given (i.e., treated as non-random), β̂j is a
linear combination of the errors in the sample {ui : i = 1, 2, · · · , n}
The distribution of β̂j is determined by the distribution of ui .
What do we know about the distribution of ui ?
Assumption MLR.4: E (ui ) = 0
Assumption MLR.5: Var(ui ) = σ 2
Assumption MLR.6: Normality
The population error u is independent of the explanatory variables (x1 , · · · , xk ) and
is normally distributed with zero mean and variance σ²:
$$u \sim N(0, \sigma^2)$$
This assumption is much stronger than and implies Assumptions MLR.4 and
MLR.5. (Why?)
Assumptions MLR.1–MLR.6 together are called the classical linear model
(CLM) assumptions, and models that satisfy these assumptions are called
classical linear regression models (CLM).
The CLM assumptions can be summarized as:
$$y \mid x \sim N(\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k, \sigma^2) \quad \text{i.i.d.}$$
Is This Assumption Reasonable?
One common justification: u is the sum of many different unobserved factors
affecting y . The distribution of u is approximately normal by the central limit
theorem (CLT).
This is a weak argument, as the normal approximation may be poor.
The normality assumption can be violated in many ways:
Models with non-negative y are never normal (e.g., wage, price)
Models with discrete y are never normal (e.g., years of schooling)
Practically, it’s good enough if it’s “close to” normal.
Sometimes taking the log of y can help (e.g., log(price))
With a large enough sample size, the sampling distribution of β̂j is approximately
normal by the CLT, even if u is not normal (Textbook Ch. 5).
Sampling Distribution of β̂j
Recall that β̂j is a linear combination of the errors {ui : i = 1, 2, · · · , n}.
Under Assumption MLR.6, the errors are independent, identically distributed (i.i.d.)
N(0, σ²) random variables.
A linear combination of i.i.d. normal random variables is also normally distributed.
(See Textbook Math Refresher B)
Therefore, β̂j is also normally distributed. What are its mean and variance?
$$\hat{\beta}_j \sim N(\beta_j, \mathrm{Var}(\hat{\beta}_j))$$
Standardization:
$$\frac{\hat{\beta}_j - \beta_j}{\mathrm{sd}(\hat{\beta}_j)} \sim N(0, 1)$$
We still don’t know what sd(β̂j ) is! (Why don’t we?)
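But we have introduced a good estimate of it:
$$\mathrm{se}(\hat{\beta}_j) = \sqrt{\widehat{\mathrm{Var}}(\hat{\beta}_j)} = \frac{\hat{\sigma}}{\sqrt{TSS_j (1 - R_j^2)}}$$
So the standardized estimator is:
$$\frac{\hat{\beta}_j - \beta_j}{\mathrm{se}(\hat{\beta}_j)}$$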
But what is the distribution? Is it still normal?
Refresher on Distributions
Let Zi , i = 1, · · · , n, be independent random variables, each distributed as standard
normal. Then the random variable
$$X = \sum_{i=1}^{n} Z_i^2$$
follows a chi-square distribution with n degrees of freedom: $X \sim \chi^2_n$
If a random variable Z follows a standard normal distribution, X follows a
chi-square distribution with n degrees of freedom, and Z and X are independent,
then the random variable
$$T = \frac{Z}{\sqrt{X/n}}$$
follows a t-distribution with n degrees of freedom: $T \sim t_n$
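The two constructions above can be illustrated by simulation. This is a rough sketch under stated assumptions: the degrees of freedom (5), the number of replications, the seed, and the helper names `chi_square_draw` and `t_draw` are all made up for the example. It uses the known fact that a chi-square variable with n degrees of freedom has mean n.

```python
import math
import random

random.seed(1)  # assumed seed, only for reproducibility

def chi_square_draw(df):
    """One draw built as the sum of df squared independent standard normals."""
    return sum(random.gauss(0.0, 1.0) ** 2 for _ in range(df))

def t_draw(df):
    """One draw built as Z / sqrt(X / df), with Z and X drawn independently."""
    z = random.gauss(0.0, 1.0)
    return z / math.sqrt(chi_square_draw(df) / df)

df = 5          # assumed degrees of freedom
reps = 20000    # assumed number of simulated draws

chi2_draws = [chi_square_draw(df) for _ in range(reps)]
t_draws = [t_draw(df) for _ in range(reps)]

# The chi-square mean should be close to df
chi2_mean = sum(chi2_draws) / reps

# Share of t draws beyond the normal 5% two-sided cutoff of 1.96
tail_share = sum(abs(t) > 1.96 for t in t_draws) / reps

print(chi2_mean, tail_share)
```

With only 5 degrees of freedom the simulated share of |T| > 1.96 comes out well above 5%, which is the "heavier tails" property of the t-distribution.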
Chi-Square Distribution
t-distribution
t-distribution and Normal Distribution
The t-distribution is similar to the standard normal distribution, but with heavier
tails.
When the degrees of freedom are large (usually df > 120), the t-distribution is very
close to the standard normal distribution.
Sampling Distribution of β̂j
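We can rewrite the standardized estimator as:
$$\frac{\hat{\beta}_j - \beta_j}{\mathrm{se}(\hat{\beta}_j)} = \frac{(\hat{\beta}_j - \beta_j)/\mathrm{sd}(\hat{\beta}_j)}{\mathrm{se}(\hat{\beta}_j)/\mathrm{sd}(\hat{\beta}_j)} = \frac{(\hat{\beta}_j - \beta_j)/\mathrm{sd}(\hat{\beta}_j)}{\sqrt{\hat{\sigma}^2/\sigma^2}}$$
Numerator: $(\hat{\beta}_j - \beta_j)/\mathrm{sd}(\hat{\beta}_j) \sim N(0, 1)$
Denominator: $\hat{\sigma}^2 = \sum_{i=1}^{n} \hat{u}_i^2/(n - k - 1)$, treating each $\hat{u}_i$ as $N(0, \sigma^2)$, so $\hat{u}_i/\sigma \sim N(0, 1)$
$$(n - k - 1)\hat{\sigma}^2/\sigma^2 = (n - k - 1) \frac{\sum_{i=1}^{n} (\hat{u}_i/\sigma)^2}{n - k - 1} = \sum_{i=1}^{n} (\hat{u}_i/\sigma)^2 \sim \chi^2_{n-k-1}$$
(The residuals satisfy k + 1 linear restrictions from the OLS first-order conditions, so only n − k − 1 of them are free. This is why the degrees of freedom are n − k − 1 rather than n.)
Thus, the standardized estimator:
$$t_{\hat{\beta}_j} = \frac{\hat{\beta}_j - \beta_j}{\mathrm{se}(\hat{\beta}_j)} \sim t_{n-k-1}$$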
Hypothesis Testing
Hypothesis Testing General Procedure
The General Procedure of a t-test
Form the null hypothesis H0 : βj = a (usually a = 0)
Calculate the t-statistic under H0 :
$$t_{\hat{\beta}_j} = \frac{\hat{\beta}_j - a}{\mathrm{se}(\hat{\beta}_j)}$$
Use the t-statistic to determine whether to reject H0
Reject H0 if it’s unlikely to observe the value of β̂j under H0
Compute the p-value (the probability, under H0 , of observing a t-statistic at least as extreme as the one obtained)
The null hypothesis (H0)
Usually, we are interested in whether a particular explanatory variable xj has an
effect on the dependent variable y .
The null hypothesis is:
H0 : βj = 0
That is, after x1 , · · · , xj−1 , xj+1 , · · · , xk have been accounted for, xj has no effect
on the expected value of y .
Can we state the opposite (xj has a partial effect on y ) as the null hypothesis?
The corresponding alternative hypothesis is:
H1 : βj ≠ 0
Example: The Effect of Education on Earnings
Consider the wage equation:
log(wage) = β0 + β1 educ + β2 exper + β3 tenure + u
Suppose we want to test whether education has an effect on wage.
What’s the null hypothesis? What’s the meaning of this hypothesis?
What’s the alternative hypothesis?
Computing the t-statistic
The t-statistic under the null hypothesis H0 : βj = 0:
$$t_{\hat{\beta}_j} = \frac{\hat{\beta}_j - 0}{\mathrm{se}(\hat{\beta}_j)} = \frac{\hat{\beta}_j}{\mathrm{se}(\hat{\beta}_j)}$$
Does tβ̂j = 0 when H0 is true?
tβ̂j measures how far away β̂j is from 0 in terms of standard errors.
If |tβ̂j | is large, β̂j is far away from 0, which is unlikely if H0 is true.
This provides evidence against H0 .
Does a large tβ̂j mean β̂j is large?
tβ̂j and β̂j always have the same sign (Why?).
But how large is large?
Testing the Hypothesis
We reject the null hypothesis if the probability of observing the value of β̂j under
H0 is too small.
One popular choice of the cutoff probability is 5%.
This is called the significance level (α) of the test.
If we rejected H0 , what is the probability that we made a mistake (Type I error)?
The Critical Values
We can use the t-distribution to find the critical value c > 0 of the t-statistic at
a given significance level α (i.e., the value that cuts off the most extreme 100α% of the
t-distribution).
The critical values c depend on:
The significance level α
The degrees of freedom of the t-distribution: n − k − 1
The type of test: one-sided or two-sided
One-sided vs. Two-sided Test
One-sided test (right-tailed): We reject H0 only when β̂j (and hence the t-statistic) is
too large.
One-sided test (left-tailed): We reject H0 only when β̂j is too small (too negative).
Two-sided test: We reject H0 when β̂j is either too large or too small.
Hypothesis Testing One-sided Test
One-sided Test (Right-tailed)
We use the right-tailed test to test whether the parameter is positive.
The alternative hypothesis is thus:
H1 : βj > 0
The critical value c is the value such that:
Pr(tβ̂j > c) = α
The rejection rule is to reject H0 in favor of H1 if
tβ̂j > c
For example, with α = 5% and n − k − 1 = 28 degrees of freedom, c = 1.701
In a right-tailed test, the null hypothesis is essentially:
H0 : βj ≤ 0
We nevertheless calculate the t-statistic tβ̂j under H0 : βj = 0
The hypothesis H0 : βj = 0 is the hardest to reject among all H0 : βj ≤ 0.
If you can reject H0 : βj = 0, you can also reject any hypothesized value βj < 0
Do we reject H0 in the following cases (with c = 1.701)?
tβ̂j = 2? tβ̂j = 0.5? tβ̂j = −5?
The same procedure can be used with other significance levels (e.g., 1% and 10%):
E.g., with n − k − 1 = 21 degrees of freedom, c0.1 = 1.323, c0.01 = 2.518
Example: Hourly Wage Equation
Suppose we want to test whether experience has a positive effect on wage
Test H0 : βexper = 0 against H1 : βexper > 0
t-statistic: texper = .0041/.0017 ≈ 2.41
Degrees of freedom: n − k − 1 = 526 − 3 − 1 = 522
Given critical values c0.05 = 1.645, c0.01 = 2.326
Conclusion: The effect of experience on hourly wage is statistically greater than
zero at the 5% (and even at the 1%) significance level.
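The arithmetic of this right-tailed test can be reproduced in a few lines. This is only a sketch: the coefficient, standard error, and critical values are the ones quoted on the slide, while the function names `t_statistic` and `reject_right_tailed` are hypothetical helpers introduced for illustration.

```python
def t_statistic(beta_hat, se, hypothesized=0.0):
    """t-statistic for H0: beta_j = hypothesized."""
    return (beta_hat - hypothesized) / se

def reject_right_tailed(t, critical_value):
    """Right-tailed rule: reject H0 in favor of beta_j > 0 when t > c."""
    return t > critical_value

# Numbers quoted on the slide for the experience coefficient
t_exper = t_statistic(0.0041, 0.0017)

reject_5 = reject_right_tailed(t_exper, 1.645)  # c at the 5% level
reject_1 = reject_right_tailed(t_exper, 2.326)  # c at the 1% level

print(round(t_exper, 2), reject_5, reject_1)
```

Both comparisons come out in favor of rejection, matching the conclusion stated above.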
One-sided Test (Left-tailed)
We use the left-tailed test to test whether the parameter is negative.
The alternative hypothesis is thus:
H1 : βj < 0
The critical value c is the value such that:
Pr(tβ̂j < −c) = α
The rejection rule is to reject H0 in favor of H1 if
tβ̂j < −c
For example, with α = 5% and n − k − 1 = 18 degrees of freedom, c = 1.734
Example: Student Performance and School Size
Suppose we want to test whether a smaller school size leads to better student performance
Test H0 : βenroll = 0 against H1 : βenroll < 0
t-statistic: tenroll = − 0.00020/0.00022 ≈ −0.91
df = n − k − 1 = 408 − 3 − 1 = 404
Critical values are −c0.05 = −1.65, −c0.15 = −1.04
We cannot reject the null hypothesis that there is no effect of school size on
student performance at the 5% significance level (not even at the 15% significance
level).
What about the effects of totalcomp and staff ?
Suppose we re-estimate the model with the log of the independent variables:
Test H0 : βlog(enroll) = 0 against H1 : βlog(enroll) < 0
t-statistic: tlog(enroll) = − 1.29/0.69 ≈ −1.87
Critical values are −c0.05 = −1.65, −c0.01 = −2.33
Conclusion: The null hypothesis can be rejected at the 5% significance level, but
not at the 1% significance level.
How large is the effect? A 10% increase in school size is associated with a
0.129 percentage point decrease in the passing rate.
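The 0.129 figure uses the usual approximation that a change in log(x) is close to the proportional change in x. A small sketch comparing the approximation with the exact log change is given below; the coefficient −1.29 comes from the slide, and the variable names are assumptions made for the example.

```python
import math

beta_log_enroll = -1.29   # coefficient on log(enroll), from the slide
pct_increase = 0.10       # a 10% increase in enrollment

# Approximation: change in log(enroll) is roughly 0.10
approx_change = beta_log_enroll * pct_increase

# Exact change in log(enroll) when enroll rises by 10%
exact_change = beta_log_enroll * math.log(1 + pct_increase)

print(round(approx_change, 3), round(exact_change, 3))
```

The two numbers are close for a 10% change; the gap widens for larger percentage changes, where the exact log difference is the safer choice.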
Hypothesis Testing Two-sided Test
Two-sided Test
Most commonly, we are interested in whether the parameter is different from zero.
The alternative hypothesis is:
H1 : βj ≠ 0
The critical values c are the values such that:
Pr(|tβ̂j | > c) = α
The rejection rule is to reject H0 in favor of H1 if
|tβ̂j | > c
For example, with α = 5% and n − k − 1 = 25 degrees of freedom, c = 2.06
Example: Determinants of College GPA
Critical values are c0.01 = 2.58, c0.05 = 1.96, c0.1 = 1.645
Why are they different from the one-sided test?
t_hsGPA = 4.38 > c0.01
t_ACT = 1.36 < c0.10
t_skipped = −3.19 < −c0.01
Hypothesis Testing Other test
Testing Other Hypotheses about βj
So far, we’ve been testing the most common hypothesis, H0 : βj = 0.
However, we can also test other hypotheses about βj :
H0 : βj = aj
What is the t-statistic under this hypothesis?
$$t = \frac{\hat{\beta}_j - a_j}{\mathrm{se}(\hat{\beta}_j)} \sim t_{n-k-1}$$
Suppose we rejected the hypothesis against H1 : βj > aj . What does it mean?
“βj is statistically greater than aj ” at the appropriate significance level.
Example: Campus Crime and Enrollment
Suppose we are interested in whether crime increases by one percent if enrollment
is increased by one percent.
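Hypothesis: H0 : βlog(enroll) = 1, H1 : βlog(enroll) ≠ 1
t-statistic: $t_{\log(enroll)} = \frac{1.27 - 1}{0.11} \approx 2.45 > c_{0.05}$
The hypothesis is rejected at the 5% significance level. Using the normal approximation, 2.45 is below the two-sided 1% critical value (2.576), so H0 is not rejected at the 1% level against H1 : βlog(enroll) ≠ 1; against the one-sided alternative βlog(enroll) > 1 (critical value 2.326) it is.
But why do we hypothesize H0 : βlog(enroll) = 1?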
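The only change relative to testing against zero is the value subtracted in the numerator. A minimal sketch with the slide's numbers follows; the helper name `t_statistic` and the use of the normal-approximation cutoff 1.96 are assumptions made for illustration.

```python
def t_statistic(beta_hat, se, hypothesized):
    """t-statistic for H0: beta_j = hypothesized."""
    return (beta_hat - hypothesized) / se

# Slide values: estimate 1.27 with standard error 0.11
t_against_one = t_statistic(1.27, 0.11, hypothesized=1.0)
t_against_zero = t_statistic(1.27, 0.11, hypothesized=0.0)

c_two_sided_5 = 1.96  # assumed large-sample 5% two-sided critical value
reject_at_5 = abs(t_against_one) > c_two_sided_5

print(round(t_against_one, 2), round(t_against_zero, 2), reject_at_5)
```

The statistic against zero is much larger than the one against one, which shows why the default t-statistic reported by software does not answer the question of whether the elasticity equals one.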
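The relationship between enrollment and crime is modeled as:
$$crime = \exp(\beta_0) \cdot enroll^{\beta_1} \cdot \exp(u)$$
Taking logs gives log(crime) = β0 + β1 log(enroll) + u, so β1 (the coefficient on log(enroll)) is the elasticity of crime with respect to enrollment, and β1 = 1 means that crime rises in proportion to enrollment.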
Example: Housing Prices and Air Pollution
Suppose we are interested in whether the elasticity of price with respect to
pollution (nox) is −1:
Hypothesis: H0 : βnox = −1, H1 : βnox ≠ −1
t-statistic: $t_{nox} = \frac{-0.954 - (-1)}{0.117} \approx 0.393$
Hypothesis Testing p-value
Problems with Significance Levels
One needs to determine the significance level α before conducting the test.
The choice of α is somewhat arbitrary.
Information is lost when we only report whether the null hypothesis is rejected or
not at a given significance level.
We draw different conclusions for t-statistics of 1.95 and 1.97, but are they really
different?
An alternative strategy is to report the p-value of the test.
P-values
The p-value is the probability, under the null hypothesis H0 , of observing a t-statistic
at least as extreme as the one actually obtained:
p = Pr (|T | > |t|)
It is also the smallest significance level at which we would still reject H0 .
If the p-value is 0.051, we would reject H0 at the 5.1% significance level, but not at
the 5% significance level.
A small p-value is evidence against the null hypothesis because one would reject
the null hypothesis even at small significance levels.
A large p-value provides little evidence against the null hypothesis because one would not
reject the null hypothesis even at large significance levels.
P-values are more informative than tests at fixed significance levels.
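A two-sided p-value can be sketched with the standard library when the degrees of freedom are large enough for the normal approximation. The function name `two_sided_p_value` is a hypothetical helper, and replacing the t-distribution by the standard normal is an assumption that only holds for large df.

```python
from statistics import NormalDist

def two_sided_p_value(t):
    """Pr(|Z| > |t|) for a standard normal Z (large-df approximation)."""
    return 2 * (1 - NormalDist().cdf(abs(t)))

# The two t-statistics contrasted earlier in this section
p_195 = two_sided_p_value(1.95)
p_197 = two_sided_p_value(1.97)

print(round(p_195, 4), round(p_197, 4))
```

The two p-values sit on either side of 0.05 but differ only slightly, which is the sense in which reporting the p-value carries more information than a reject / do-not-reject verdict at a fixed level.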
Computing the p-value
Hypothesis Testing Remarks
Remark 1: Two-sided Test vs. One-sided Tests
We generally favor the two-sided test over the one-sided ones. Why?
It does not assume the direction of the effect
It is more conservative (larger critical value)
It prevents the formation of a hypothesis after observing the estimate from the data
Remark 2: “Not Rejecting” ≠ “Accepting” H0
What does “cannot reject H0 ” mean?
Does it mean that βj = 0?
No. We only fail to reject H0 , not accept it.
The observed value of β̂j is not inconsistent with H0 , but it does not prove H0 .
“Absence of evidence is not evidence of absence.”
Remark 3: “Statistically Significant” Variables
If a regression coefficient is statistically different from zero in a two-sided test, the
corresponding variable is said to be statistically significant.
If the number of df is large enough so that the normal approximation applies, we
can use the following rules of thumb:
|tβ̂j | > 1.645: “statistically significant at 10% level”
|tβ̂j | > 1.96: “statistically significant at 5% level”
|tβ̂j | > 2.576: “statistically significant at 1% level”
Remark 4: “Statistical” vs. “Economic” Significance
“Statistically significant” means that the effect of the variable is statistically
different from zero.
Does it mean that the effect is large in magnitude?
No. tβ̂j can be large even if β̂j is small. How?
se(β̂j ) can be small.
When is se(β̂j ) small?
Example: Participation Rates in 401(k) Plans
The t-statistic for the total number of firm employees (totemp) is:
ttotemp = − 0.00013/0.00004 = −3.25
This effect is statistically significant at the 1% level.
How large is this effect economically? A 10,000 increase in the number of
employees is associated with a 1.3 percentage point decrease in the participation
rate.
So firm size does affect the participation rate, but the effect is not practically very
large.
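The contrast between statistical and economic significance in this example comes down to two separate calculations, sketched below with the slide's numbers. The variable names are assumptions made for the example.

```python
beta_totemp = -0.00013   # coefficient on totemp, from the slide
se_totemp = 0.00004      # its standard error, from the slide

# Statistical significance: how many standard errors from zero?
t_totemp = beta_totemp / se_totemp

# Economic significance: implied change in the participation rate
# (in percentage points) for 10,000 additional employees
effect_10000 = beta_totemp * 10_000

print(round(t_totemp, 2), round(effect_10000, 2))
```

A large |t| here reflects a small standard error, while the implied effect of a very large change in firm size remains modest.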
Practical Guidelines on “Statistical” vs. “Economic” Significance
If a variable is statistically significant, discuss the magnitude of the coefficient to
get an idea of its economic or practical importance.
The fact that a coefficient is statistically significant does not necessarily mean it is
economically or practically significant!
If a variable is statistically and economically important but has the “wrong” sign,
the regression model might be misspecified.
If a variable is statistically insignificant at the usual levels (10%, 5%, or 1%), one
may think of dropping it from the regression.
Why? We cannot distinguish the effect from zero.
If the sample size is small, effects might be imprecisely estimated so that the case
for dropping insignificant variables is less strong.
CI
The Confidence Interval
We have defined the critical values c (for two-sided tests) such that:
Pr[|tβ̂j | > c] = α ⇒ Pr[−c < tβ̂j < c] = 1 − α
Since $t_{\hat{\beta}_j} = \frac{\hat{\beta}_j - \beta_j}{\mathrm{se}(\hat{\beta}_j)}$, we have:
Pr[β̂j − c · se(β̂j ) < βj < β̂j + c · se(β̂j )] = 1 − α
That is, under repeated sampling, in 100(1 − α)% of the samples the true value of
βj lies in the interval:
β̂j ± c · se(β̂j )
This interval is called the confidence interval (CI) of βj
With α = 5%, the confidence interval is the 95% confidence interval.
Interpretation of the Confidence Interval
Can we say that the probability that βj lies in the interval is 95%? No. Why not?
The bounds of the interval are random, but the true value of βj is fixed.
In repeated samples, the interval that is constructed will cover the true value of βj
in 95% of the cases.
However, for a given sample and the estimated interval from it, βj is either in the
interval or not.
We hope that this is one of the 95% of the samples where the interval covers βj ,
but there is no guarantee.
Confidence Intervals from Repeated Sampling
The interval covers the true value of β in 95% of the cases.
Constructing the Confidence Interval
Confidence Intervals:
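$$\Pr[\underline{\beta}_j < \beta_j < \overline{\beta}_j] = 1 - \alpha$$
where $\underline{\beta}_j \equiv \hat{\beta}_j - c \cdot \mathrm{se}(\hat{\beta}_j)$ and $\overline{\beta}_j \equiv \hat{\beta}_j + c \cdot \mathrm{se}(\hat{\beta}_j)$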
What do we need to compute the confidence interval?
The estimated β̂j and its standard error se(β̂j )
The critical value c (depends on the significance level α and the degrees of freedom)
For a 95% confidence interval, the critical values −c and c are the 2.5th and
97.5th percentiles of the t-distribution. (Why?)
With large sample sizes (n − k − 1 > 120), c is approximately 1.96.
Confidence Interval and Hypothesis Testing
We can use the confidence interval to test the null hypothesis H0 : βj = a.
If the confidence interval does not contain a, we would reject H0 at the α
significance level (why?).
If the confidence interval contains a, we would not reject H0 at the α significance
level.
Example: Model of R&D Expenditures
What is the 95% confidence interval for the effect of sales on rd?
CI = 1.084 ± 2.045 × 0.060 = (0.961, 1.207)
What is the 95% confidence interval for the effect of profits on rd?
CI = 0.0217 ± 2.045 × 0.0128 = (−0.0045, 0.0479)
Do we reject the null hypotheses?
How do we interpret the confidence intervals?
Can we say that the true effect of sales on rd is between 0.961 and 1.207 with
95% probability?
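The interval for the sales coefficient, and its use as a two-sided test, can be sketched as follows. The estimate, standard error, and critical value are the slide's; the helper names `confidence_interval` and `rejects` are hypothetical and introduced only for this illustration.

```python
def confidence_interval(beta_hat, se, c):
    """Return (lower, upper) = beta_hat -/+ c * se."""
    half_width = c * se
    return beta_hat - half_width, beta_hat + half_width

def rejects(ci, a):
    """Two-sided test of H0: beta_j = a, rejecting when a lies outside the CI."""
    lower, upper = ci
    return not (lower < a < upper)

# Sales coefficient from the slide: 1.084, se 0.060, c = 2.045
ci_sales = confidence_interval(1.084, 0.060, 2.045)

print(tuple(round(b, 3) for b in ci_sales))
print(rejects(ci_sales, 0.0), rejects(ci_sales, 1.0))
```

Because the interval excludes 0 but includes 1, the same interval rejects a zero effect at the 5% level while failing to reject a unit elasticity.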
Linear Combination of Parameters
Testing the Relationship between Parameters
In some cases, we are interested in testing the relationship between two or more
parameters.
For example, we might want to test whether the effect of X1 on Y is the same as
the effect of X2 on Y .
Or we might want to test whether the sum of the effects of X1 and X2 on Y is
equal to 1.
In these cases, we are interested in testing a linear combination of the
parameters.
Let’s consider the simple case:
H0 : β1 = β2 or, equivalently, H0 : β1 − β2 = 0
Testing H0 : β1 − β2 = 0
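What is the t-statistic?
$$t = \frac{\hat{\beta}_1 - \hat{\beta}_2}{\mathrm{se}(\hat{\beta}_1 - \hat{\beta}_2)}$$
But what is se(β̂1 − β̂2 )?
From textbook Math Refresher B:
$$\mathrm{Var}(\hat{\beta}_1 - \hat{\beta}_2) = \mathrm{Var}(\hat{\beta}_1) + \mathrm{Var}(\hat{\beta}_2) - 2\,\mathrm{Cov}(\hat{\beta}_1, \hat{\beta}_2)$$
$$\Rightarrow \mathrm{se}(\hat{\beta}_1 - \hat{\beta}_2) = \sqrt{[\mathrm{se}(\hat{\beta}_1)]^2 + [\mathrm{se}(\hat{\beta}_2)]^2 - 2 s_{12}}$$
where s12 is an estimate of Cov(β̂1 , β̂2 )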
How do we obtain s12 ?
It comes from the estimated covariance matrix of the OLS estimators, which is usually not shown in standard regression output, so this route is inconvenient.
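To see why the covariance term matters, the formula can be evaluated for made-up inputs. Everything numeric below is hypothetical (the standard errors 0.3 and 0.4 and the covariances 0 and 0.045 are not taken from any regression), and `se_of_difference` is an illustrative helper name.

```python
import math

def se_of_difference(se1, se2, cov12):
    """Standard error of (b1 - b2) given each se and the estimated covariance."""
    return math.sqrt(se1 ** 2 + se2 ** 2 - 2 * cov12)

# Hypothetical inputs, for illustration only
se_uncorrelated = se_of_difference(0.3, 0.4, 0.0)
se_positive_cov = se_of_difference(0.3, 0.4, 0.045)

print(round(se_uncorrelated, 3), round(se_positive_cov, 3))
```

Ignoring a positive covariance overstates the standard error of the difference, so the two individual standard errors alone are not enough to compute the t-statistic.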
An Alternative Strategy
We can define θ1 = β1 − β2 , so that β1 = θ1 + β2
So the model can be rewritten as:
y = β0 + (θ1 + β2 )X1 + β2 X2 + u
= β0 + θ1 X1 + β2 (X1 + X2 ) + u
We can then estimate the modified model and test H0 : θ1 = 0.
Example: Education Return at Two-Year vs. Four-Year Colleges
Suppose we want to test whether the return to education is smaller at two-year
than at four-year colleges.
H0 : β1 − β2 = 0, H1 : β1 − β2 < 0
But we cannot compute the t-statistic directly given the regression output.
Alternatively, define θ1 = β1 − β2 :
Generate a new variable totcoll = jc + univ and estimate:
t-statistic: tθ̂1 = −0.0102/0.0069 ≈ −1.48
95% confidence interval (c = 1.96): −0.0102 ± 1.96 × 0.0069 = (−0.0237, 0.0033)
p-value: Pr(tθ̂1 < −1.48) = 0.070
Conclusion?
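The t-statistic and the one-sided p-value can be reproduced approximately with the standard library. This sketch takes θ̂1 = −0.0102 and se = 0.0069 as used in the confidence interval line above, and it assumes the normal approximation to the t-distribution, which is reasonable only because the sample is large.

```python
from statistics import NormalDist

theta_hat = -0.0102   # estimate of theta_1 = beta_1 - beta_2
se_theta = 0.0069     # its standard error

t_theta = theta_hat / se_theta

# Left-tailed p-value for H1: theta_1 < 0, using the normal approximation
p_left = NormalDist().cdf(t_theta)

print(round(t_theta, 2), round(p_left, 3))
```

The p-value lies between 5% and 10%, so the evidence against H0 is present at the 10% level but not at the 5% level for this one-sided alternative.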
F-test
Multiple/Joint Hypothesis Testing
So far, we have been testing a single hypothesis with:
One parameter: H0 : βj = a (usually a = 0)
A linear combination of parameters: e.g., H0 : β1 − β2 = 0
But we might be interested in testing multiple hypotheses at the same time.
For example, whether a group of variables has no effect on the dependent variable.
Example: Performance and Salary of Baseball Players
Suppose we want to test whether the performance measures of baseball players are
related to their salary.
Null hypothesis: Once years in the league and games per year have been controlled
for, the statistics measuring performance have no effect on salary.
H0 : β3 = 0, β4 = 0, β5 = 0
H1 : H0 is not true
Estimation Results:
Are β̂3 , β̂4 , and β̂5 statistically significant at the 5% level?
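If none of them is individually significant, can we accept H0 and conclude that the
performance measures have no effect on salary? Why not?
Each single t-test asks whether one variable matters while the other variables stay in
the model. The joint hypothesis asks whether the whole group can be dropped at once,
which individual t-tests cannot answer (for example, when the variables are highly
correlated with each other).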
How do we test the joint hypothesis?
Joint Hypothesis Testing
Idea: To test the joint hypothesis that a group of parameters are jointly equal to
zero, we can test whether the inclusion of these parameters significantly improves
the fit of the model.
What is the measure of model fit? $R^2 = 1 - \frac{RSS}{TSS}$
Can we simply compare the R² of estimates with and without the group of
parameters?
No, because R² never decreases when more regressors are added.
We need to determine whether the increase in R² (or the decrease in RSS) is
statistically significant.
How do we do it?
Restricted and Unrestricted Models
Consider the model:
y = β0 + β1 x1 + · · · + βk xk + u
This is called the unrestricted model.
Suppose we want to test the joint hypothesis that q of these variables have zero
coefficients
H0 : βk−q+1 = · · · = βk = 0
We can have another model without the group of variables hypothesized to be zero:
y = β0 + β1 x1 + · · · + βk−q xk−q + u
This is called the restricted model.
Why “restricted”? Because we have imposed the restrictions under H0 on the
parameters.
The F-statistic
We define the F-statistic as:
$$F = \frac{(RSS_r - RSS_{ur})/q}{RSS_{ur}/(n - k - 1)}$$
where r and ur denote the restricted and unrestricted models, respectively.
RSSr is the RSS from the restricted model, and RSSur is the RSS from the
unrestricted model. Which one is larger?
The F-statistic can be seen as measuring the relative increase in RSS when moving
from the unrestricted to the restricted model.
q = dfr − dfur is the number of restrictions imposed under H0 . Why?
The F-statistic is always non-negative. Why?
What is the distribution of the F-statistic?
The F-distribution
Under H0 and the CLM assumptions, the F-statistic follows an F-distribution with q and
n − k − 1 degrees of freedom:
$$F = \frac{(RSS_r - RSS_{ur})/q}{RSS_{ur}/(n - k - 1)} \sim F(q, n - k - 1)$$
An F-distributed random variable can be viewed as the ratio of two independent χ²
random variables, each divided by its degrees of freedom:
$$F = \frac{X_1/k_1}{X_2/k_2} \sim F_{k_1, k_2}$$
where $X_1 \sim \chi^2_{k_1}$ and $X_2 \sim \chi^2_{k_2}$
What is a χ²-distribution? The distribution of a sum of squared independent standard normal random variables.
The F-test
We use the F-statistic to conduct the F-test:
\[ F = \frac{(RSS_r - RSS_{ur})/q}{RSS_{ur}/(n - k - 1)} \]
The critical value c is determined by
\[ \Pr[F_{q, n-k-1} > c] = \alpha \]
Degrees of freedom: $q$ and $n - k - 1$
Decision rule: Reject $H_0$ if $F > c$
If $H_0: \beta_{k-q+1} = \cdots = \beta_k = 0$ is rejected, we say that the group of variables
$x_{k-q+1}, \dots, x_k$ is jointly significant at the chosen significance level.
We can also compute the p-value of the F-statistic: p-value = $\Pr[F_{q, n-k-1} > F]$,
where $F$ is the value of the statistic computed from the sample.
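The tail probabilities behind the critical value and the p-value can be approximated by simulation. The sketch below is not the lecture's method: it draws from an F-distribution using its definition as a ratio of independent $\chi^2$ variables, each divided by its degrees of freedom, and then counts how often a draw exceeds a given value. The function names, the seed, and the number of draws are illustrative assumptions. The degrees of freedom (3 and 347) and the values 3.78 and 9.55 are taken from the baseball example in these slides.

```python
import random


def simulate_f_draws(q, df, n_draws, seed=0):
    """Simulate draws from F(q, df) as a ratio of scaled chi-square variables."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        # A chi-square variable with k degrees of freedom is a
        # Gamma(shape=k/2, scale=2) variable.
        x1 = rng.gammavariate(q / 2.0, 2.0)
        x2 = rng.gammavariate(df / 2.0, 2.0)
        draws.append((x1 / q) / (x2 / df))
    return draws


def tail_probability(draws, value):
    """Share of simulated draws above `value`, an estimate of Pr[F > value]."""
    return sum(d > value for d in draws) / len(draws)


draws = simulate_f_draws(q=3, df=347, n_draws=100_000)

# Tail probability at the 1% critical value quoted in the slides.
p_at_critical = tail_probability(draws, 3.78)
# Simulated p-value for the F-statistic of the baseball example.
p_at_observed = tail_probability(draws, 9.55)

print(p_at_critical, p_at_observed)
```

The first number should be close to 0.01, and the second should be far below any conventional significance level. In practice one would look up exact values in an F table or statistical software; the simulation only illustrates what those numbers mean.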
Example: Performance and Salary of Baseball Players
Estimation of the unrestricted model:
Estimation of the restricted model:
$q = 3$, $n - k - 1 = 347$
What is the F-statistic?
\[ F = \frac{(198.311 - 183.186)/3}{183.186/347} \approx 9.55 \]
Conclusion given $c_{0.01} = 3.78$?
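The arithmetic and the decision rule for this example can be checked with a few lines of Python. This is a minimal sketch: the function name `f_stat_from_rss` is an illustrative choice, and the inputs are the RSS values, degrees of freedom, and 1% critical value quoted on this slide.

```python
def f_stat_from_rss(rss_r, rss_ur, q, df_ur):
    """F-statistic for q restrictions, computed from restricted and unrestricted RSS."""
    if rss_ur <= 0 or q <= 0 or df_ur <= 0:
        raise ValueError("rss_ur, q and df_ur must be positive")
    return ((rss_r - rss_ur) / q) / (rss_ur / df_ur)


# Values from the baseball example.
f_obs = f_stat_from_rss(rss_r=198.311, rss_ur=183.186, q=3, df_ur=347)
critical_value_1pct = 3.78  # c_{0.01}, as given on the slide

reject_h0 = f_obs > critical_value_1pct
print(round(f_obs, 2), reject_h0)  # 9.55 True
```

Because the statistic exceeds the critical value, $H_0$ is rejected at the 1% level, so the three excluded variables are jointly significant.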
t-test and F-test
Suppose we have a single hypothesis: $H_0: \beta_j = 0$, $H_1: \beta_j \neq 0$, and we conduct a
two-sided t-test.
Can we conduct an F-test for the same hypothesis?
What is the restricted model? What is q?
So which test should we use? Actually, the two tests are equivalent. Why?
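The square of the t-statistic follows an F-distribution:
\[ t^2_{n-k-1} = F_{1, n-k-1} \]
Why?
Recall that
\[ T = \frac{Z}{\sqrt{X/k}} \quad \text{and} \quad F = \frac{X_1/k_1}{X_2/k_2} \]
where $Z \sim N(0, 1)$, $X \sim \chi^2_k$, $X_1 \sim \chi^2_{k_1}$, and $X_2 \sim \chi^2_{k_2}$.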
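The "Why?" can be answered by squaring $T$. The short derivation below assumes, as in the definition of the t-distribution, that $Z$ and $X$ are independent.

```latex
T^2 = \frac{Z^2}{X/k} = \frac{Z^2/1}{X/k},
\qquad Z^2 \sim \chi^2_1, \quad X \sim \chi^2_k
\quad\Longrightarrow\quad
T^2 \sim F_{1, k}
```

Setting the degrees of freedom of $X$ to $n - k - 1$ gives the result above. For a single restriction the F-statistic equals the squared t-statistic, so the F-test and the two-sided t-test always reach the same decision.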
Individual and Joint Significance
Suppose we are interested in the effects of q of the variables at some significance
level.
If the group of variables $x_{k-q+1}, \dots, x_k$ is jointly significant, does it mean that
each of the variables is individually significant? Or that at least one of them is
individually significant?
No. As we have seen in the baseball example, none of the performance measures is
individually significant, but they are jointly significant.
Why might this happen? Multicollinearity!
If $x_{k-q+1}$ is individually significant, does it mean that the group of variables
$x_{k-q+1}, \dots, x_k$ must be jointly significant?
No, the joint test could fail to detect the significance of $x_{k-q+1}$ if the other
variables are not significant.
The R 2 Form of the F-statistic
Given that $RSS_r = TSS(1 - R_r^2)$ and $RSS_{ur} = TSS(1 - R_{ur}^2)$, we can also express
the F-statistic in terms of $R^2$s:
\[ F = \frac{(R_{ur}^2 - R_r^2)/q}{(1 - R_{ur}^2)/(n - k - 1)} = \frac{(R_{ur}^2 - R_r^2)/q}{(1 - R_{ur}^2)/df_{ur}} \]
In the baseball example:
$R_{ur}^2 = 0.6278$, $R_r^2 = 0.5971$, $q = 3$, $n - k - 1 = 347$
\[ F = \frac{(0.6278 - 0.5971)/3}{(1 - 0.6278)/347} \approx 9.54 \]
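The same calculation as a minimal Python sketch. The function name `f_stat_from_r2` is an illustrative choice, and the inputs are the $R^2$ values and degrees of freedom quoted for the baseball example.

```python
def f_stat_from_r2(r2_ur, r2_r, q, df_ur):
    """R-squared form of the F-statistic for q exclusion restrictions."""
    if not (0.0 <= r2_r <= r2_ur < 1.0):
        raise ValueError("expected 0 <= r2_r <= r2_ur < 1")
    return ((r2_ur - r2_r) / q) / ((1.0 - r2_ur) / df_ur)


# Values from the baseball example.
f_r2 = f_stat_from_r2(r2_ur=0.6278, r2_r=0.5971, q=3, df_ur=347)
print(round(f_r2, 2))  # 9.54
```

The RSS form gave 9.55 for the same example. The two forms are algebraically identical, so the small gap comes only from rounding in the reported $R^2$ and RSS values.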
The Overall Significance of a Regression
One special case is to test whether none of the variables have an effect on the
dependent variable.
Null hypothesis:
H0 : β1 = · · · = βk = 0
What is the restricted model?
y = β0 + u
$q = k$, and $R_r^2 = 0$
What is the F-statistic?
\[ F = \frac{R^2/k}{(1 - R^2)/(n - k - 1)} \sim F_{k, n-k-1} \]
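To see why $R_r^2 = 0$ here, note that the restricted model contains only an intercept, whose OLS estimate is the sample mean $\bar{y}$. Substituting into the $R^2$ form of the F-statistic then gives the expression above:

```latex
RSS_r = \sum_{i=1}^{n} (y_i - \bar{y})^2 = TSS
\quad\Longrightarrow\quad
R_r^2 = 1 - \frac{RSS_r}{TSS} = 0

F = \frac{(R^2 - 0)/k}{(1 - R^2)/(n - k - 1)}
  = \frac{R^2/k}{(1 - R^2)/(n - k - 1)}
```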
The General Linear Restrictions
So far, we have been testing whether a group of variables all have zero coefficients.
These are called tests of exclusion restrictions because we are excluding the
variables from the model.
We might also be interested in testing joint hypotheses where the coefficients are
not necessarily zero.
For example: H0 : β1 = 1, β2 = β3 = β4 = 0
Example: House Price Assessments
Suppose we're interested in whether the assessed value of a house is rational:
a 1% change in the assessed value should be associated with a 1% change in the
selling price.
The assessment should already take into account the key variables that determine the
selling price, so these variables should have no additional effect once the assessed
value is controlled for.
H0 : β1 = 1, β2 = β3 = β4 = 0
Testing the General Linear Restrictions
What is the unrestricted model?
y = β0 + β1 x1 + β2 x2 + β3 x3 + β4 x4 + u
What is the restricted model?
y = β0 + x1 + u (But how can we estimate it?)
⇒ y − x1 = β0 + u
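How do we calculate the F-statistic?
\[ F = \frac{(RSS_r - RSS_{ur})/q}{RSS_{ur}/(n - k - 1)} \]
Can we calculate it as
\[ F = \frac{(R_{ur}^2 - R_r^2)/q}{(1 - R_{ur}^2)/(n - k - 1)} \, ? \]
Why not?
The dependent variable differs between the two models ($y - x_1$ versus $y$), so TSS is
no longer the same in the restricted and unrestricted models.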
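A small sketch of how the restricted model can be estimated and why the two TSS values differ. The data below are hypothetical toy numbers chosen for illustration, not values from the lecture, and the function names are likewise illustrative. With only an intercept on the right-hand side, the OLS estimate of $\beta_0$ is the sample mean of $y - x_1$.

```python
def sum_of_squared_deviations(values):
    """Sum of squared deviations of `values` from their sample mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def restricted_rss(y, x1):
    """RSS of the restricted model y - x1 = beta0 + u.

    The intercept-only fit is the mean of the transformed dependent
    variable, so the residuals are its deviations from that mean.
    """
    transformed = [yi - xi for yi, xi in zip(y, x1)]
    return sum_of_squared_deviations(transformed)


# Hypothetical toy data, for illustration only.
y = [2.0, 3.5, 3.0, 5.0]
x1 = [1.5, 3.0, 3.5, 4.5]

rss_r = restricted_rss(y, x1)
tss_ur = sum_of_squared_deviations(y)  # TSS when the dependent variable is y
tss_r = sum_of_squared_deviations([yi - xi for yi, xi in zip(y, x1)])  # TSS for y - x1

print(rss_r, tss_ur, tss_r)
```

In this toy data the TSS of $y - x_1$ is 0.75 while the TSS of $y$ is 4.6875, so the TSS terms no longer cancel and the $R^2$ form cannot be used. The RSS form remains valid because both RSS values measure residuals in the same units of $y$.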
Example: House Price Assessments
Unrestricted regression:
log(price) = β0 + β1 log(assess) + β2 log (lotsize) + β3 log (sqrft) + β4 bdrms + u
Restricted regression:
log(price) − log(assess) = β0 + u
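With 88 observations, the estimation results are $RSS_r = 1.880$ and $RSS_{ur} = 1.822$
$q = 4$, $n - k - 1 = 83$
\[ F = \frac{(RSS_r - RSS_{ur})/q}{RSS_{ur}/(n - k - 1)} = \frac{(1.880 - 1.822)/4}{1.822/(88 - 4 - 1)} \approx 0.661 \]
$F \sim F_{4,83}$ ⇒ $c_{0.05} = 2.50$ ⇒ since $0.661 < 2.50$, $H_0$ cannot be rejected at the 5% level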
Reporting the Regression Results
In research papers, regression results are usually reported in a table. We need to
include:
The estimated coefficients
Standard errors
Number of observations
$R^2$
We can report multiple regression results as different columns in the same table.
A Typical Regression Table
Next Lecture: Functional Forms
Textbook Chapter 6