Multicollinearity
Introduction and Overview
• We have been discussing violations of the Classical Assumptions and remedies for those violations
• This lecture addresses multicollinearity; we have already discussed serial correlation and heteroskedasticity
• For each of these three problems, we will attempt to answer the
following questions:
1. What is the nature of the problem?
2. What are the consequences of the problem?
3. How is the problem diagnosed?
4. What remedies for the problem are available?
Perfect Multicollinearity
• Perfect multicollinearity violates the Classical Assumption which specifies that no explanatory variable is a perfect linear function of any other explanatory variable(s)
• The word perfect in this context implies that the variation in one explanatory variable
can be completely explained by movements in another explanatory variable
– A special case is that of a dominant variable: an explanatory variable is definitionally
related to the dependent variable
• An example would be (Notice: no error term!):
X1i = α0 + α1X2i
where the αs are constants and the Xs are independent variables in:
Yi = β0 + β1X1i + β2X2i + εi
• The following figure illustrates this case
[Figure: perfect multicollinearity, with X1 an exact linear function of X2]
Perfect Multicollinearity (cont.)
• What happens to the estimation of an econometric equation where there is
perfect multicollinearity?
– OLS is incapable of generating estimates of the regression coefficients
– most OLS computer programs will print out an error message in such a situation
• What is going on?
• Essentially, perfect multicollinearity ruins our ability to estimate the coefficients
because the perfectly collinear variables cannot be distinguished from each
other:
• You cannot “hold all the other independent variables in the equation constant” if
every time one variable changes, another changes in an identical manner!
• Solution: one of the collinear variables must be dropped (they are essentially
identical, anyway)
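To see the failure concretely, here is a minimal Python sketch (made-up data, not from the lecture's workfile) showing that when X1 is an exact linear function of X2, the X'X matrix is singular, so the OLS normal equations have no unique solution:

```python
import numpy as np

rng = np.random.default_rng(42)
x2 = rng.normal(size=50)
x1 = 3.0 + 2.0 * x2                 # X1i = a0 + a1*X2i  (no error term!)

# Regressor matrix [constant, X1, X2]
X = np.column_stack([np.ones(50), x1, x2])
XtX = X.T @ X

print(np.linalg.matrix_rank(XtX))   # 2, not 3: one column is redundant
print(np.linalg.cond(XtX))          # enormous condition number: X'X is (numerically) singular
```

Because X'X cannot be inverted, no unique coefficient estimates exist, which is exactly why regression software reports an error in this situation.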
Imperfect Multicollinearity
• Imperfect multicollinearity occurs when two (or
more) explanatory variables are imperfectly
linearly related, as in:
X1i = α0 + α1X2i + ui
• Compare this equation with the previous one: notice that it now includes a stochastic error term, ui, so X1 is no longer an exact linear function of X2
• This case is illustrated in the following figure
[Figure: imperfect multicollinearity, with X1 and X2 strongly but not perfectly related]
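As a quick numerical contrast (again with made-up data): once a stochastic error ui enters the relationship, the two regressors are highly but not perfectly correlated, and OLS can run:

```python
import numpy as np

rng = np.random.default_rng(0)
x2 = rng.normal(size=50)
u = rng.normal(scale=0.3, size=50)  # stochastic error term
x1 = 3.0 + 2.0 * x2 + u             # X1i = a0 + a1*X2i + ui

print(np.corrcoef(x1, x2)[0, 1])    # high (around 0.99) but strictly below 1
```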
The Consequences of Multicollinearity
There are five major consequences of multicollinearity:
1. Estimates will remain unbiased
2. The variances and standard errors of the estimates
will increase:
a. It becomes harder to distinguish the effect of one variable from the effect of another, so we are much more likely to make large errors in estimating the βs than without multicollinearity
b. As a result, the estimated coefficients, although still unbiased, now come from distributions with much larger variances and, therefore, larger standard errors (this point is illustrated in the following figure and in the simulation sketch after it)
[Figure: severe multicollinearity increases the variances of the β̂s]
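The variance inflation can be reproduced in a small simulation. The sketch below (hypothetical data; statsmodels assumed available) estimates the same true model twice, once with independent regressors and once with a highly collinear pair, and compares the slope standard errors:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 100

def slope_ses(x1, x2):
    """Fit y = 1 + 2*x1 + 3*x2 + e and return the SEs of the two slopes."""
    y = 1.0 + 2.0 * x1 + 3.0 * x2 + rng.normal(size=n)
    X = sm.add_constant(np.column_stack([x1, x2]))
    return sm.OLS(y, X).fit().bse[1:]

x2 = rng.normal(size=n)
se_indep = slope_ses(rng.normal(size=n), x2)                   # uncorrelated regressors
se_collin = slope_ses(x2 + rng.normal(scale=0.1, size=n), x2)  # nearly identical regressors

print(se_indep)    # small standard errors
print(se_collin)   # much larger standard errors for the same true model
```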
The Consequences of Multicollinearity (cont.)
3. The computed t-scores will fall:
a. Recall that the t-statistic is the estimated coefficient divided by its standard error: t = β̂ / SE(β̂)
b. Multicollinearity can make a significant variable appear insignificant by inflating its standard error: if the SE goes up, the t-statistic goes down, and the p-value goes up
4. Estimates will become very sensitive to changes in specification (see the sketch after this list):
a. The addition or deletion of an explanatory variable or of a few observations will often cause major changes in the values of the β̂s when significant multicollinearity exists
b. For example, if you drop a variable, even one that appears to be statistically insignificant, the coefficients of the remaining variables in the equation sometimes will change dramatically
c. This is again because, with multicollinearity, it is much harder to distinguish the effect of one variable from the effect of another
5. The overall fit of the equation, and the estimation of the coefficients of any non-multicollinear variables, will be largely unaffected
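A minimal sketch of consequences 3 and 4 together (hypothetical data): with two nearly identical regressors the t-scores collapse, and dropping one of them changes the remaining coefficient dramatically:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 100
x2 = rng.normal(size=n)
x1 = x2 + rng.normal(scale=0.05, size=n)   # nearly identical to x2
y = 1.0 + 2.0 * x1 + 3.0 * x2 + rng.normal(size=n)

full = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2]))).fit()
print(full.tvalues[1:])   # small t-scores on x1 and x2 despite a strong overall fit

reduced = sm.OLS(y, sm.add_constant(x2)).fit()
print(reduced.params[1])  # slope on x2 jumps to about 5, absorbing x1's effect
```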
EViews: The Detection of Multicollinearity
• File name: Multicollinearity and Regression
• First, realize that some multicollinearity exists in every equation: all variables are correlated to some degree (even if completely at random)
• So it’s really a question of how much multicollinearity exists in
an equation, rather than whether any multicollinearity exists
• Correlation test
[EViews screenshots: estimating the regression]
• Only the coefficient of X5 is significant.
• We suspect a problem of multicollinearity.
• We should therefore conduct a correlation test.
[EViews screenshots: correlation matrix of the explanatory variables]
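The same correlation test can be run outside EViews. Here is a minimal pandas sketch on made-up data (the names X3, X4, X5 mirror the lecture's workfile; everything else is hypothetical, with X4 engineered to resemble the observed ~0.95 correlation):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(3)
n = 100
x3 = rng.normal(size=n)
df = pd.DataFrame({
    "X3": x3,
    "X4": 2.0 * x3 + rng.normal(scale=0.63, size=n),  # roughly 0.95 correlation with X3
    "X5": rng.normal(size=n),
})

print(df.corr())  # scan the off-diagonal entries for values near +/-1
```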
EViews: The Removal of Multicollinearity
• File name: Multicollinearity and Regression
• Since the correlation coefficient between X3 and X4 is 0.953609 (95.3609%), there is a problem of multicollinearity.
• We drop whichever of the two variables has the higher p-value.
• A higher p-value means a lower t-statistic and hence a lower level of significance.
[EViews screenshot: regression output showing the coefficient p-values]
• So we drop X4, as its p-value (0.3010) is higher than the p-value (0.0648) of X5.
[EViews screenshots: the re-estimated regression after dropping X4]
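A sketch of the removal step using the statsmodels formula API (hypothetical data continuing the setup above; the true model deliberately excludes X4, so dropping it is the correct specification):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(4)
n = 100
x3 = rng.normal(size=n)
df = pd.DataFrame({
    "X3": x3,
    "X4": 2.0 * x3 + rng.normal(scale=0.63, size=n),  # collinear with X3
    "X5": rng.normal(size=n),
})
df["Y"] = 1.0 + 2.0 * df["X3"] + 3.0 * df["X5"] + rng.normal(size=n)

full = smf.ols("Y ~ X3 + X4 + X5", data=df).fit()
reduced = smf.ols("Y ~ X3 + X5", data=df).fit()

print(full.pvalues[["X3", "X4"]])          # both inflated by the collinearity
print(reduced.bse["X3"] < full.bse["X3"])  # True: X3's SE falls once X4 is dropped
```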
Remedies for Multicollinearity
There are essentially three remedies for multicollinearity:
1. Do nothing:
a. Multicollinearity will not necessarily reduce the t-
scores enough to make them statistically insignificant
and/or change the estimated coefficients to make
them differ from expectations
b. The deletion of a multicollinear variable that belongs in an equation will cause specification bias
2. Drop a redundant variable:
a. Viable strategy when two variables measure
essentially the same thing
b. Always use theory as the basis for this decision!
Remedies for Multicollinearity (cont.)
3. Increase the sample size:
a. This is frequently impossible, but it is a useful alternative to consider when feasible
b. The idea is that the larger sample normally will
reduce the variance of the estimated coefficients,
diminishing the impact of the multicollinearity
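A small simulation of remedy 3 (hypothetical data, statsmodels assumed): the same collinear design estimated at two sample sizes, with the larger sample yielding a noticeably smaller standard error:

```python
import numpy as np
import statsmodels.api as sm

def slope_se(n, seed=0):
    """SE of the x1 slope in a deliberately collinear design of size n."""
    rng = np.random.default_rng(seed)
    x2 = rng.normal(size=n)
    x1 = x2 + rng.normal(scale=0.1, size=n)    # highly collinear pair
    y = 1.0 + 2.0 * x1 + 3.0 * x2 + rng.normal(size=n)
    X = sm.add_constant(np.column_stack([x1, x2]))
    return sm.OLS(y, X).fit().bse[1]

print(slope_se(50))     # large SE in the small sample
print(slope_se(5000))   # the SE shrinks roughly in proportion to 1/sqrt(n)
```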