Sei sulla pagina 1di 25

PRESENTATION

ON
MULTICOLINEARITY

DUSHYANT
2018MBA011
MULTICOLLINEARITY

• Multicollinearity occurs when two or more


independent variables in a regression model
are highly correlated to each other.

• Standard error of the OLS parameter estimate


will be higher if the corresponding independent
variable is more highly correlated to the other
independent variables in the model.
THE NATURE OF MULTICOLLINEARITY
Multicollinearity originally it meant the existence of a “perfect,” or
exact, linear relationship among some or all explanatory variables of a
regression model. For the k-variable regression involving explanatory
variable X1, X2, . . . , Xk (where X1 = 1 for all observations to allow for the
intercept term), an exact linear relationship is said to exist if the
following condition is satisfied:
λ1X1 + λ2X2 +· · ·+λkXk = 0
where λ1, λ2, . . . , λk are constants such that not all of them are
zero
simultaneously.
Today, however, the term multicollinearity is used to include the case
where the X variables are intercorrelated but not perfectly so, as
follows:
λ1X1 + λ2X2 +· · ·+λ2Xk + vi = 0
where vi is a stochastic error term.
PERFECT MULTICOLLINEARITY

• Perfect multicollinearity occurs when there is a


perfect linear correlation between two or more
independent variables.

• When independent variable takes a constant


value in all observations.
CAUSES OF
MULTICOLLINEARITY
CAUSES OF MULTICOLLINEARITY

Constraints
on the Statistical
Data An over
model or in model
collection
method
the determine
population specificati
employed d model
being on
sampled.
CONSEQUENCES OF MULTICOLLINEARITY
CONSEQUENCES OF MULTICOLLINEARITY

In cases of near or high multicollinearity, one is


likely to encounter the following consequences:
1. the OLS estimators have large variances and
covariance, making precise estimation difficult.
2. The confidence intervals tend to be much wider,
leading to the acceptance of the “zero null
hypothesis” (i.e., the true population coefficient is
zero) more readily.
2. WIDER CONFIDENCE INTERVALS
Because of the large standard errors, the
confidence intervals for relevant
population parameters tend the to be larger.
Therefore, in cases of high multicollinearity,
the sample data may be compatible with a
diverse set of hypotheses. Hence, the
probability of accepting a false hypothesis
(i.e., type II error) increases.
3) t ratio of one or more coefficients tends to be
statistically insignificant.

• In cases of high collinearity, as the estimated


standard errors increased, the t values
decreased..
Therefore, in such cases, one will increasingly
accept the null hypothesis that the relevant
true population value is zero
4) Even the t ratio is insignificant, R2 can be very high.

 Since R2 is very high, we reject the


hypothesis i.e H0: β1 =β2 = Βk = 0; due to the
null
significant relationship between the variables.
But since t ratios are small again we reject H0.

 Therefore there is a contradictory


conclusion in the presence of multicollinearity.
DETECTION OF
MULTICOLLINEARITY
VARIANCE INFLATION FACTORS

• VARIANCE INFLATION FACTORS ARE VERY USEFUL IN


DETERMINING IF MULTICOLLINEARITY IS PRESENT.

VIF j  C jj  (1  R 2j ) 1
• VIFS > 5 TO 10 ARE CONSIDERED SIGNIFICANT. THE
REGRESSORS THAT HAVE HIGH VIFS PROBABLY HAVE
POORLY ESTIMATED REGRESSION COEFFICIENTS
DETECTION OF
MULTICOLLINEARITY
• Multicollinearity cannot be tested; only the degree of
multicollinearity can be detected.

• Multicollinearity is a question of degree and not of kind.


The meaningful distinction is not between the presence
and the absence of multicollinearity, but between its
various degrees.

• Multicollinearity is a feature of the sample and not of the


population. Therefore, we do not “test for
multicollinearity” but we measure its degree in any
particular sample.
A. THE FARRAR
TEST
• Computation of F-ratio to test location of
the multicollinearity.

• Computation of t-ratio to test pattern of


the multicollinearity.

• Computation of chi-square to test the presence of


multicollinearity in a function with several
explanatory variables.
B. BUNCH ANALYSIS
a. Coefficient of determination, R2 , in the
presence of multicollinearity, R2 is high.
b. In the presence of muticollinearity in the
data, the partial correlation coefficients, r12
also high.
c. The high standard error of the parameters
shows the existence of multicollinearity.
REMEDIAL MEASURES
REMEDIAL MEASURES
• Do nothing
– If you are only interested in prediction,
multicollinearity is not an issue.
– t-stats may be deflated, but still
significant, hence multicollinearity is
not significant.
– The cure is often worse than the disease.
• Drop one or more of the
multicollinear variables.
– In an effort to avoid specification bias a
researcher can introduce
multicollinearity, hence it would be
appropriate to drop a variable.
• Transform the multicollinear variables.
– Form a linear combination of the multicollinear
variables.
– Transform the equation into first differences or
logs.

• Increase the sample size.


– The issue of micronumerosity.
– Micronumerosity is the problem of (n) not
exceeding (k). The symptoms are similar to the
issue of multicollinearity (lack of variation in the
independent variables).

• A solution to each problem is to increase


the sample size. This will solve the problem
of micronumerosity but not necessarily the
problem of multicollinearity.
CONCLUSIO
N
CONCLUSION
• Multicollinearity is a statistical phenomenon in
which there exists a perfect or exact relationship
between the predictor variables.
• When there is a perfect or exact relationship
between the predictor variables, it is difficult to
come up with reliable estimates of their individual
coefficients.
• The presence of multicollinearity can cause
serious problems with the estimation of β and the
interpretation.
• When multicollinearity is present in the data,
ordinary least square estimators are
imprecisely estimated.
• If goal is to understand how the various X
variables impact Y, then multicollinearity is a
big problem. Thus, it is very essential to detect
and solve the issue of multicollinearity before
estimating the parameter based on fitted
regression model.
• Detection of multicollinearity can be done by
examining the correlation matrix or by using
VIF.
• Remedial measures help to solve the problem
of multicollinearity.