Multicollinearity
(Reading, Wooldridge: Ch 3, Dougherty: Ch 6, Gujarati: Ch 4)
1. Perfect multicollinearity
Consider the usual model,
$y_i = \alpha + \beta_1 x_{1i} + \beta_2 x_{2i} + \dots + \beta_k x_{ki} + \varepsilon_i, \quad i = 1, 2, \dots, n$ (1)
Perfect multicollinearity arises when one of the x variables, for example $x_{ji}$, is an exact linear
function of some other x variables in the model. Consider the case where $x_{1i} = 2 + 3x_{2i}$.
Equation (1) cannot then be estimated by OLS, because the variables $x_{1i}$ and $x_{2i}$ cannot be
distinguished: it is impossible to estimate $\beta_1 = \partial y_i / \partial x_{1i}$ (holding all else
constant), as $x_{2i}$ must change whenever $x_{1i}$ changes.
Perfect multicollinearity should never arise; when it does, it usually reflects a mistake on the
part of the researcher. The solution is to drop a variable that is perfectly related to the other
variables.
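The case $x_{1i} = 2 + 3x_{2i}$ can be illustrated numerically. The following is a minimal sketch, assuming NumPy and simulated data; the sample size, the seed and the coefficient vectors `b_a` and `b_b` are hypothetical choices made only for illustration. It checks that the regressor matrix has fewer independent columns than regressors, and that two different coefficient vectors give identical fitted values.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50

x2 = rng.normal(size=n)
x1 = 2 + 3 * x2                      # exact linear function of x2 and the constant
X = np.column_stack([np.ones(n), x1, x2])

# Three columns, but only two of them are linearly independent.
rank = np.linalg.matrix_rank(X)

# Two different (hypothetical) coefficient vectors (alpha, beta_1, beta_2) ...
b_a = np.array([1.0, 0.5, 2.0])
c = 0.3
b_b = b_a + c * np.array([2.0, -1.0, 3.0])

# ... give the same fitted values, because 2 - x1 + 3*x2 = 0 for every observation.
same_fit = np.allclose(X @ b_a, X @ b_b)

print(rank, X.shape[1], same_fit)
```

Since the data cannot tell `b_a` from `b_b`, OLS has no unique solution; dropping either `x1` or `x2` restores a full-rank regressor matrix.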
Examples of perfect multicollinearity might be:
$\ln(M_i) = \alpha + \beta_1 \ln(GDP_i) + \beta_2 Inf_i + \beta_3 r_i + \beta_4 rr_i + \varepsilon_i$ (1)
where M = Real money, GDP = Real gross domestic product, Inf = Inflation rate, r = Nominal
interest rate, rr = Real interest rate. However, as $rr_i = r_i - Inf_i$, we have perfect
multicollinearity and we cannot estimate all the parameters of the model. The solution would
be to include any two of the three variables Inf, r, rr, and estimate, for example:
$\ln(M_i) = \alpha + \beta_1 \ln(GDP_i) + \beta_2 Inf_i + \beta_3 r_i + \varepsilon_i$.
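To see why not all the parameters can be estimated, it may help to substitute the identity $rr_i = r_i - Inf_i$ into the money demand equation. The derivation below is a sketch; the symbols $\alpha$, $\beta_1, \dots, \beta_4$ and $\varepsilon_i$ are assumed notation for the intercept, slopes and error term.

```latex
\begin{align*}
\ln(M_i) &= \alpha + \beta_1 \ln(GDP_i) + \beta_2 Inf_i + \beta_3 r_i + \beta_4 rr_i + \varepsilon_i \\
         &= \alpha + \beta_1 \ln(GDP_i) + \beta_2 Inf_i + \beta_3 r_i + \beta_4 (r_i - Inf_i) + \varepsilon_i \\
         &= \alpha + \beta_1 \ln(GDP_i) + (\beta_2 - \beta_4) Inf_i + (\beta_3 + \beta_4) r_i + \varepsilon_i
\end{align*}
```

Only the combinations $\beta_2 - \beta_4$ and $\beta_3 + \beta_4$ appear in the final line, so the data can pin down those two quantities but not $\beta_2$, $\beta_3$ and $\beta_4$ separately.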
Alternatively, returning to Handout 3 (on Dummy variables), consider the problem with
estimating:
$\ln(w_i) = \alpha + \beta_1 School_i + \beta_2 Female_i + \beta_3 Male_i + \varepsilon_i$ (2)
Econometrics 1 (Term 1: Handout 11)
Here we can see immediately that the Male variable equals the variable for the Intercept (a
column of ones) minus the Female variable. Trying to estimate the money demand equation (1) or
the wage equation (2) would be equivalent to trying to solve the system of equations:
$z_1 + z_2 = 6$
$2z_1 + 2z_2 = 12$
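The second equation of this system is simply twice the first, so it carries no extra information and the two unknowns cannot be pinned down. The sketch below, assuming NumPy, checks this and then repeats the check for the dummy variable case; the `school` and `female` values are made-up data used only for illustration.

```python
import numpy as np

# The system z1 + z2 = 6, 2*z1 + 2*z2 = 12 in matrix form A z = b.
A = np.array([[1.0, 1.0],
              [2.0, 2.0]])
b = np.array([6.0, 12.0])

rank_A = np.linalg.matrix_rank(A)    # the second row is twice the first

# Any pair (z1, 6 - z1) solves both equations, so there is no unique answer.
candidates = [np.array([z1, 6.0 - z1]) for z1 in (0.0, 2.5, 6.0)]
all_solve = all(np.allclose(A @ z, b) for z in candidates)

# Dummy variable case with made-up data: Male = Intercept - Female.
school = np.array([10.0, 12.0, 16.0, 11.0, 14.0, 18.0])
female = np.array([1.0, 0.0, 1.0, 1.0, 0.0, 0.0])
male = 1.0 - female
X = np.column_stack([np.ones(len(school)), school, female, male])
rank_X = np.linalg.matrix_rank(X)    # fewer independent columns than regressors

print(rank_A, all_solve, rank_X, X.shape[1])
```

Dropping either the Male or the Female column removes the redundancy, in the same way that one equation of the system can be discarded without losing information.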
2. Imperfect multicollinearity
As perfect multicollinearity should only occur in error, when we talk about multicollinearity
we are generally referring to severe imperfect multicollinearity. Imperfect multicollinearity
occurs when there is a very high correlation between two or more explanatory variables
(although this correlation is not unity as this would be perfect multicollinearity).
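The difference from the perfect case can be sketched with simulated data. The example below assumes NumPy; the seed, the sample size and the noise scale of 0.05 are hypothetical choices. It builds a regressor that is highly but not perfectly correlated with another, confirms that the regressor matrix still has full rank (so OLS can be computed), and compares the relevant diagonal entry of $(X'X)^{-1}$ with that from a benchmark in which the second regressor is unrelated to the first.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200

x1 = rng.normal(size=n)
x2_collinear = x1 + 0.05 * rng.normal(size=n)   # nearly, but not exactly, x1
x2_unrelated = rng.normal(size=n)               # benchmark: unrelated to x1


def slope_variance_factor(x1, x2):
    """Diagonal entry of (X'X)^{-1} for the coefficient on x1.

    Under the classical assumptions, the OLS variance of that coefficient
    is this factor multiplied by the error variance.
    """
    X = np.column_stack([np.ones(len(x1)), x1, x2])
    return np.linalg.inv(X.T @ X)[1, 1]


corr = np.corrcoef(x1, x2_collinear)[0, 1]
X_collinear = np.column_stack([np.ones(n), x1, x2_collinear])
rank = np.linalg.matrix_rank(X_collinear)

factor_collinear = slope_variance_factor(x1, x2_collinear)
factor_unrelated = slope_variance_factor(x1, x2_unrelated)
ratio = factor_collinear / factor_unrelated

print(corr, rank, ratio)
```

The correlation is close to but below one and the rank is full, so the model can be estimated; the cost shows up in `ratio`, which measures how much larger the variance of the coefficient on `x1` is than in the benchmark.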
be reflected by the parameter estimates of any model being very sensitive to small changes
in the specification of the model. The overall $R^2$ of an equation is largely unaffected by
multicollinearity.
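Both points can be illustrated with a small simulation. This is a sketch under stated assumptions: it uses NumPy, and the data-generating process (true coefficients of 1, a noise scale of 0.05 linking `x2` to `x1`, the seed) is hypothetical. The model is fitted once with both regressors and once with `x2` dropped, and the coefficients and $R^2$ are compared.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200

x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)             # highly correlated with x1
y = 1.0 + 1.0 * x1 + 1.0 * x2 + rng.normal(size=n)


def ols(X, y):
    """Return the OLS coefficients and the R-squared of the fit."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return coef, r2


ones = np.ones(n)
coef_full, r2_full = ols(np.column_stack([ones, x1, x2]), y)
coef_drop, r2_drop = ols(np.column_stack([ones, x1]), y)

print("with x2:   ", coef_full, r2_full)
print("without x2:", coef_drop, r2_drop)
```

When `x2` is dropped, the coefficient on `x1` absorbs the effect of both variables and moves towards 2, while the $R^2$ of the two fits remains almost the same.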
[Figure: GDP and GDP_1 (index, 2001=100) plotted against Time, 1981 Q1 to 2003 Q1]
[Figure: DGDP and DGDP_1 (2001=100) plotted against Time, 1981 Q1 to 2003 Q1]