Remedial Measures for Multicollinearity

The document discusses multicollinearity in regression analysis. It defines multicollinearity as a high correlation between two or more independent variables. This can cause standard errors of estimates to be higher and make parameter estimates imprecise. The document outlines various ways of detecting and dealing with multicollinearity, such as using variance inflation factors and removing or transforming correlated variables.

Uploaded by

Dushyant Mudgal
Copyright © All Rights Reserved
  • Presentation on Multicollinearity
  • The Nature of Multicollinearity
  • Causes of Multicollinearity
  • Consequences of Multicollinearity
  • Detection of Multicollinearity
  • Remedial Measures
  • Conclusion

PRESENTATION ON MULTICOLLINEARITY

DUSHYANT
2018MBA011
MULTICOLLINEARITY

• Multicollinearity occurs when two or more independent variables in a regression model are highly correlated with each other.

• The standard error of an OLS parameter estimate will be higher if the corresponding independent variable is more highly correlated with the other independent variables in the model.
THE NATURE OF MULTICOLLINEARITY
The term multicollinearity originally meant the existence of a “perfect,” or exact, linear relationship among some or all explanatory variables of a regression model. For the k-variable regression involving explanatory variables X1, X2, . . . , Xk (where X1 = 1 for all observations to allow for the intercept term), an exact linear relationship is said to exist if the following condition is satisfied:
λ1X1 + λ2X2 + · · · + λkXk = 0
where λ1, λ2, . . . , λk are constants such that not all of them are zero simultaneously.
Today, however, the term multicollinearity is used to include the case where the X variables are intercorrelated but not perfectly so, as follows:
λ1X1 + λ2X2 + · · · + λkXk + vi = 0
where vi is a stochastic error term.
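As a minimal sketch of the two conditions above, the Python snippet below builds a made-up data matrix in which X3 = 2·X2 exactly, and a second one in which the same relationship holds only up to a small error term. The helper names (gram_matrix, det3) and all numbers are illustrative assumptions, not part of the original slides. With an exact relationship the determinant of X'X is zero, so the OLS normal equations have no unique solution; with the near relationship the determinant is nonzero and the coefficients can still be estimated.

```python
def gram_matrix(X):
    """Return X'X for a data matrix given as a list of rows."""
    k = len(X[0])
    return [[sum(row[i] * row[j] for row in X) for j in range(k)]
            for i in range(k)]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


# Hypothetical data: X1 is the intercept column, X2 is an arbitrary regressor.
x2 = [1.0, 2.0, 3.0, 4.0, 5.0]

# Exact relationship: 0*X1 + 2*X2 - 1*X3 = 0 for every observation.
X_exact = [[1.0, v, 2.0 * v] for v in x2]

# Near relationship: 2*X2 - X3 + v_i = 0, with v_i a small made-up error term.
errors = [0.1, -0.2, 0.05, 0.15, -0.1]
X_near = [[1.0, v, 2.0 * v + e] for v, e in zip(x2, errors)]

det_exact = det3(gram_matrix(X_exact))
det_near = det3(gram_matrix(X_near))

print("det(X'X), exact relationship:", det_exact)
print("det(X'X), near relationship: ", det_near)
```

The closer the second determinant gets to zero, the less precisely the individual coefficients are determined.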
PERFECT MULTICOLLINEARITY

• Perfect multicollinearity occurs when there is a perfect linear correlation between two or more independent variables.

• It also occurs when an independent variable takes a constant value in all observations, because that variable is then an exact multiple of the intercept term.
CAUSES OF MULTICOLLINEARITY

• Data collection method employed
• Constraints on the model or in the population being sampled
• Statistical model specification
• An overdetermined model
CONSEQUENCES OF MULTICOLLINEARITY

In cases of near or high multicollinearity, one is likely to encounter the following consequences:
1. The OLS estimators have large variances and covariances, making precise estimation difficult.
2. The confidence intervals tend to be much wider, leading to the acceptance of the “zero null hypothesis” (i.e., the true population coefficient is zero) more readily.
2. WIDER CONFIDENCE INTERVALS
Because of the large standard errors, the confidence intervals for the relevant population parameters tend to be wider. Therefore, in cases of high multicollinearity, the sample data may be compatible with a diverse set of hypotheses. Hence, the probability of accepting a false hypothesis (i.e., a type II error) increases.
3. The t ratio of one or more coefficients tends to be statistically insignificant.

• In cases of high collinearity, as the estimated standard errors increase, the t values decrease. Therefore, in such cases, one will increasingly accept the null hypothesis that the relevant true population value is zero.
4. Even though the t ratios are insignificant, R² can be very high.

• Since R² is very high, the F test rejects the joint null hypothesis H0: β1 = β2 = · · · = βk = 0, indicating a significant relationship between the variables. But since the individual t ratios are small, we fail to reject the individual hypotheses H0: βj = 0.

• Therefore, there is a contradictory conclusion in the presence of multicollinearity.
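The consequences listed above can be quantified in the simplest case. For a model with an intercept and two explanatory variables X2 and X3, a standard textbook result (not stated in the slides) is var(β̂2) = σ² / (Σx2i² (1 − r23²)), where x2i are deviations from the mean and r23 is the sample correlation between the two regressors. The sketch below, with a hypothetical helper named se_inflation, tabulates how much the standard error grows relative to the uncorrelated case. For a given coefficient estimate, the confidence interval widens and the t ratio shrinks by the same factor.

```python
import math


def se_inflation(r):
    """Factor by which the standard error of a slope grows, relative to r = 0,
    in a two-regressor model whose regressors have sample correlation r."""
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return 1.0 / math.sqrt(1.0 - r * r)


# Illustrative correlation levels (assumed values, chosen only for the table).
print("   r   SE and CI-width factor   t-ratio factor")
for r in (0.0, 0.5, 0.9, 0.99):
    factor = se_inflation(r)
    print(f"{r:4.2f}   {factor:22.2f}   {1.0 / factor:14.2f}")
```

As r approaches 1 the factor grows without bound, which is the perfect multicollinearity case in which the coefficients cannot be estimated at all.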
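DETECTION OF MULTICOLLINEARITY
VARIANCE INFLATION FACTORS

• Variance inflation factors (VIFs) are very useful in determining if multicollinearity is present:

$\mathrm{VIF}_j = C_{jj} = (1 - R_j^2)^{-1}$

where $R_j^2$ is the coefficient of determination from regressing the j-th regressor on the remaining regressors, and $C_{jj}$ is the j-th diagonal element of the inverse of the regressors' correlation matrix.

• VIFs above 5 to 10 are considered significant. The regressors that have high VIFs probably have poorly estimated regression coefficients.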
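A minimal sketch of the VIF calculation follows, using only the Python standard library. It takes the route VIF_j = C_jj, reading the VIFs off the diagonal of the inverse correlation matrix. The function names (correlation_matrix, invert, vifs) and the data are made up for illustration; in practice a statistics package would normally be used instead.

```python
import math


def correlation_matrix(columns):
    """Pearson correlation matrix of a list of equal-length columns."""
    n = len(columns[0])
    centered = []
    for col in columns:
        mean = sum(col) / n
        centered.append([v - mean for v in col])
    norms = [math.sqrt(sum(v * v for v in c)) for c in centered]
    k = len(columns)
    return [[sum(a * b for a, b in zip(centered[i], centered[j]))
             / (norms[i] * norms[j]) for j in range(k)] for i in range(k)]


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    k = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(k)]
           for i, row in enumerate(matrix)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: perfect multicollinearity")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(k):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]


def vifs(columns):
    """VIF_j = C_jj, the j-th diagonal element of the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]


# Made-up regressors: x2 is almost a copy of x1, x3 is only loosely related.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
x2 = [1.1, 1.9, 3.2, 3.8, 5.1, 6.2, 6.8, 8.1]
x3 = [5.0, 1.0, 4.0, 2.0, 8.0, 3.0, 7.0, 6.0]

values = vifs([x1, x2, x3])
for name, v in zip(("x1", "x2", "x3"), values):
    flag = "high" if v > 10 else "ok"
    print(f"VIF({name}) = {v:8.2f}  [{flag}]")
```

Here x1 and x2 come out far above the 5 to 10 range while x3 stays low, pointing to the x1/x2 pair as the source of the problem.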
DETECTION OF MULTICOLLINEARITY
• Multicollinearity cannot be tested; only the degree of multicollinearity can be detected.

• Multicollinearity is a question of degree and not of kind. The meaningful distinction is not between the presence and the absence of multicollinearity, but between its various degrees.

• Multicollinearity is a feature of the sample and not of the population. Therefore, we do not “test for multicollinearity” but we measure its degree in any particular sample.
A. THE FARRAR-GLAUBER TEST
• Computation of F-ratios to test the location of the multicollinearity.

• Computation of t-ratios to test the pattern of the multicollinearity.

• Computation of a chi-square statistic to test for the presence of multicollinearity in a function with several explanatory variables.
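The chi-square step of the Farrar test (usually cited as the Farrar-Glauber test) can be sketched as follows. The statistic used here is the commonly quoted form χ² = −[n − 1 − (2k + 5)/6] · ln|R|, with k(k − 1)/2 degrees of freedom, where R is the correlation matrix of the k explanatory variables and n is the sample size. The slides do not give the formula, so treat this as an assumption. The function names and data are hypothetical.

```python
import math


def correlation_matrix(columns):
    """Pearson correlation matrix of a list of equal-length columns."""
    n = len(columns[0])
    centered = []
    for col in columns:
        mean = sum(col) / n
        centered.append([v - mean for v in col])
    norms = [math.sqrt(sum(v * v for v in c)) for c in centered]
    k = len(columns)
    return [[sum(a * b for a, b in zip(centered[i], centered[j]))
             / (norms[i] * norms[j]) for j in range(k)] for i in range(k)]


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [list(row) for row in matrix]
    k = len(m)
    det = 1.0
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, k):
            f = m[r][col] / m[col][col]
            m[r] = [a - f * b for a, b in zip(m[r], m[col])]
    return det


def farrar_glauber_chi2(columns):
    """Return (chi-square statistic, degrees of freedom) for the regressors."""
    n, k = len(columns[0]), len(columns)
    det_r = determinant(correlation_matrix(columns))
    if det_r <= 0.0:
        raise ValueError("|R| is not positive: perfect multicollinearity")
    statistic = -(n - 1 - (2 * k + 5) / 6.0) * math.log(det_r)
    return statistic, k * (k - 1) // 2


# Made-up regressors: x2 is almost a copy of x1.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
x2 = [1.1, 1.9, 3.2, 3.8, 5.1, 6.2, 6.8, 8.1]
x3 = [5.0, 1.0, 4.0, 2.0, 8.0, 3.0, 7.0, 6.0]

stat, df = farrar_glauber_chi2([x1, x2, x3])
CRITICAL_5PCT_DF3 = 7.815  # chi-square 5% critical value for 3 degrees of freedom
print(f"chi-square = {stat:.2f} with {df} degrees of freedom")
print("high degree of multicollinearity" if stat > CRITICAL_5PCT_DF3
      else "no strong evidence of multicollinearity")
```

A determinant of R close to 1 means nearly orthogonal regressors and a small statistic; a determinant close to 0 gives a large statistic.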
B. BUNCH ANALYSIS
a. Coefficient of determination: in the presence of multicollinearity, R² is high.
b. In the presence of multicollinearity in the data, the partial correlation coefficients (such as r12) are also high.
c. High standard errors of the parameters indicate the existence of multicollinearity.
REMEDIAL MEASURES
• Do nothing.
– If you are only interested in prediction, multicollinearity is not an issue.
– The t-statistics may be deflated but still significant, in which case multicollinearity is not a serious problem.
– The cure is often worse than the disease.
• Drop one or more of the multicollinear variables.
– In an effort to avoid specification bias, a researcher can introduce multicollinearity; in that case it may be appropriate to drop a variable.
• Transform the multicollinear variables.
– Form a linear combination of the multicollinear variables.
– Transform the equation into first differences or logs.
• Increase the sample size.
– The issue of micronumerosity.
– Micronumerosity is the problem of the number of observations (n) not exceeding the number of parameters (k). The symptoms are similar to those of multicollinearity (lack of variation in the independent variables).
• A solution to each problem is to increase the sample size. This will solve the problem of micronumerosity but not necessarily the problem of multicollinearity.
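To illustrate the first-difference transformation listed among the remedial measures, the sketch below uses two made-up series that both trend upward. Their levels are highly correlated mainly because of the shared trend, while their period-to-period changes are much less so. The data and function names are assumptions for illustration only, and differencing is not guaranteed to reduce collinearity in every data set.

```python
import math


def correlation(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def first_differences(series):
    """Return the changes x_t - x_(t-1); the result is one element shorter."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


# Hypothetical trending regressors (for example, two yearly economic series).
x1 = [10.0, 12.0, 13.0, 15.0, 18.0, 20.0, 21.0, 24.0]
x2 = [5.0, 6.0, 8.0, 9.0, 10.0, 13.0, 14.0, 15.0]

corr_levels = correlation(x1, x2)
corr_diffs = correlation(first_differences(x1), first_differences(x2))

print(f"correlation in levels:            {corr_levels:.3f}")
print(f"correlation in first differences: {corr_diffs:.3f}")
```

The transformation costs one observation and changes the interpretation of the coefficients, which is one reason the cure can be worse than the disease.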
CONCLUSION
• Multicollinearity is a statistical phenomenon in which there exists a perfect or near-perfect linear relationship between the predictor variables.
• When there is a perfect or near-perfect relationship between the predictor variables, it is difficult to come up with reliable estimates of their individual coefficients.
• The presence of multicollinearity can cause serious problems with the estimation of β and its interpretation.
• When multicollinearity is present in the data, the ordinary least squares estimates are imprecise.
• If the goal is to understand how the various X variables impact Y, then multicollinearity is a big problem. Thus, it is essential to detect and address multicollinearity before relying on the parameters estimated from the fitted regression model.
• Detection of multicollinearity can be done by examining the correlation matrix or by using VIFs.
• Remedial measures help to reduce the problem of multicollinearity.
