Proxy variables
Instrumental variables
STATA
Endogeneity
Gabriel V. Montes-Rojas
Here, the main problem is that educ and exper are not
exogenous, i.e. they are endogenous:
$\mathrm{Cov}(educ, v) \neq 0$, $\mathrm{Cov}(exper, v) \neq 0$.
Proxy variables
$\Rightarrow y = (\beta_0 + \beta_3\delta_0) + (\beta_1 + \beta_3\delta_1)\,educ + (\beta_2 + \beta_3\delta_2)\,exper + \beta_3\delta_3\,IQ + u + \beta_3 v_3$
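The expression above is the last line of a substitution whose earlier steps are not shown. A sketch of those steps, assuming the usual proxy-variable setup: an unobserved variable (called abil here, a name that is an assumption and does not appear above) enters the equation for y with coefficient $\beta_3$, and is linked to the proxy IQ through a linear equation with error $v_3$.

```latex
\begin{align*}
y    &= \beta_0 + \beta_1\, educ + \beta_2\, exper + \beta_3\, abil + u \\
abil &= \delta_0 + \delta_1\, educ + \delta_2\, exper + \delta_3\, IQ + v_3 \\
y    &= \beta_0 + \beta_1\, educ + \beta_2\, exper
        + \beta_3\,(\delta_0 + \delta_1\, educ + \delta_2\, exper + \delta_3\, IQ + v_3) + u
\end{align*}
```

Collecting terms gives the expression above. Under these assumptions, the coefficient on educ in a regression of y on educ, exper and IQ is $\beta_1 + \beta_3\delta_1$, which equals $\beta_1$ only when $\beta_3\delta_1 = 0$.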
Instrumental variables
$y = \beta_0 + \beta_1 x + u$
where $\mathrm{Cov}(x, u) \neq 0$ (i.e. $x$ is endogenous)
A good instrumental variable (say $z$) satisfies these two conditions:
1. It is not correlated with the error term: $\mathrm{Cov}(z, u) = 0$
2. It is correlated with the endogenous variable: $\mathrm{Cov}(x, z) \neq 0$
$\beta_1 = \dfrac{\mathrm{Cov}(z, y)}{\mathrm{Cov}(z, x)}$
Why?
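One way to see it, using only the model and the two conditions on the instrument stated above: take the covariance of both sides of the model with z.

```latex
\begin{align*}
\mathrm{Cov}(z, y) &= \mathrm{Cov}(z,\ \beta_0 + \beta_1 x + u) \\
                   &= \beta_1\,\mathrm{Cov}(z, x) + \mathrm{Cov}(z, u) \\
                   &= \beta_1\,\mathrm{Cov}(z, x)
                      && \text{since } \mathrm{Cov}(z, u) = 0 \\
\Rightarrow\ \beta_1 &= \frac{\mathrm{Cov}(z, y)}{\mathrm{Cov}(z, x)}
                      && \text{well defined because } \mathrm{Cov}(z, x) \neq 0
\end{align*}
```

Replacing the population covariances with sample covariances gives the IV estimator of $\beta_1$.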
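$x = \gamma_0 + \gamma_1 z + v$ (1)
Fitted values from the OLS estimates of (1): $\hat{x} = \hat{\gamma}_0 + \hat{\gamma}_1 z$ (2)
$y = \beta_0 + \beta_1 \hat{x} + u$ (3)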
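The two stages in equations (1) and (3) can be tried out by hand. The following is a minimal sketch on simulated data, not a recommended workflow: the sample size, seed, coefficient values and the variable names e and xhat are all assumptions made for illustration.

```stata
* Simulated data (illustrative assumptions): the true slope on x is 2
clear
set seed 12345
set obs 5000
generate z = rnormal()
generate e = rnormal()
* e enters both u and x, so Cov(x, u) is not zero and x is endogenous
generate u = e + rnormal()
generate x = 1 + z + e + rnormal()
generate y = 1 + 2*x + u

* OLS for comparison: inconsistent here because x is endogenous
regress y x

* Stage 1, equation (1): regress x on z and keep the fitted values
regress x z
predict xhat, xb

* Stage 2, equation (3): regress y on the fitted values
regress y xhat

* The same point estimate in one step, with valid standard errors
ivregress 2sls y (x = z)
```

The slope from the manual second stage coincides with the ivregress estimate, but the standard errors printed by the manual second stage are not valid, which is one reason to prefer the one-step command.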
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + u$
where $\mathrm{Cov}(x_1, u) \neq 0$ (i.e. $x_1$ is endogenous) and $\mathrm{Cov}(x_2, u) = 0$.
A good instrumental variable (say $z$) satisfies these two conditions:
1. It is not correlated with the error term: $\mathrm{Cov}(z, u) = 0$
2. It is correlated with the endogenous variable: $\mathrm{Cov}(x_1, z) \neq 0$
The 2SLS estimator is less efficient (i.e. it has a larger variance) than OLS
when the explanatory variables are exogenous.
Therefore, it is important to test for endogeneity first, in order to
avoid using an IV estimator that is:
1. more computationally intensive (two stages are more work than one)
2. less efficient
$y_1 = \beta_0 + \beta_1 y_2 + \beta_2 z_1 + \beta_3 z_2 + u$
where $y_2$ is (possibly) endogenous; $z_3$ and $z_4$ are IVs
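In order to test for endogeneity:
I. Regress $y_2$ on all the exogenous variables ($z_1$, $z_2$, $z_3$, $z_4$) and save the residuals $\hat{v}_2$.
II. $y_1 = \beta_0 + \beta_1 y_2 + \beta_2 z_1 + \beta_3 z_2 + \delta_1 \hat{v}_2 + error$
III. Test for the significance of $\hat{v}_2$ in the latter model. If we reject
$H_0 : \delta_1 = 0$, then there is evidence that $u$ and $v_2$ are correlated,
and therefore that $y_2$ is endogenous.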
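The test above can be run with three ordinary commands: save the residuals from a regression of y2 on all exogenous variables and instruments, add them to the structural equation, and test their coefficient. This is a minimal sketch on simulated data in which y2 is endogenous by construction; the sample size, seed, coefficient values and the names e and v2hat are assumptions made for illustration.

```stata
* Simulated data (illustrative assumptions)
clear
set seed 12345
set obs 5000
generate z1 = rnormal()
generate z2 = rnormal()
generate z3 = rnormal()
generate z4 = rnormal()
generate e  = rnormal()
* e enters both u and y2, which makes y2 endogenous
generate u  = e + rnormal()
generate y2 = 1 + z1 + z2 + z3 + z4 + e + rnormal()
generate y1 = 1 + 2*y2 + z1 + z2 + u

* Step I: regress y2 on all exogenous variables and IVs, save the residuals
regress y2 z1 z2 z3 z4
predict v2hat, residuals

* Step II: add the residuals to the structural equation
regress y1 y2 z1 z2 v2hat

* Step III: test H0: delta1 = 0 (coefficient on v2hat)
test v2hat
```

With this design the test should reject H0 clearly, since y2 was generated to be correlated with u.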
A test of overidentifying restrictions tells you whether the instruments are
uncorrelated with the error term, an essential condition for the validity of the
IVs.
Requirement: you need more IVs than endogenous variables.
In the model above, we can run 2SLS with $z_3$ as the only IV;
compute $\hat{u}_3 = y_1 - \hat{\beta}_0 - \hat{\beta}_1 y_2 - \hat{\beta}_2 z_1 - \hat{\beta}_3 z_2$; and then estimate
the regression $\hat{u}_3 = \delta_0 + \delta_1 z_4 + error$ and test the
significance of $z_4$.
This is a valid test of $z_4$ as an IV, BUT it has to
assume that $z_3$ is a valid IV.
1. Estimate the full 2SLS model with all IVs and obtain the residuals
$\hat{u}$.
2. Regress $\hat{u}$ on ALL exogenous variables (i.e. the exogenous
explanatory variables and the IVs).
3. Consider the F-test of overall significance of this regression. $H_0$ can be
interpreted as exogeneity of all variables in the model. If
you reject $H_0$, then one (or more) of your IVs is not exogenous.
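The three steps above can be sketched in Stata as follows, again on simulated data in which both instruments are valid by construction. The sample size, seed, coefficient values and the names e and uhat are assumptions made for illustration. The last two lines run Stata's built-in overidentification tests for comparison; these use different statistics from the F-test in step 3, so the numbers need not match.

```stata
* Simulated data (illustrative assumptions): z3 and z4 are valid IVs
clear
set seed 12345
set obs 5000
generate z1 = rnormal()
generate z2 = rnormal()
generate z3 = rnormal()
generate z4 = rnormal()
generate e  = rnormal()
generate u  = e + rnormal()
generate y2 = 1 + z1 + z2 + z3 + z4 + e + rnormal()
generate y1 = 1 + 2*y2 + z1 + z2 + u

* Step 1: 2SLS with all IVs, save the residuals
ivregress 2sls y1 z1 z2 (y2 = z3 z4)
predict uhat, residuals

* Step 2: regress the residuals on all exogenous variables and IVs
* Step 3: read the overall F statistic and Prob > F in the output header
regress uhat z1 z2 z3 z4

* Built-in alternative: re-estimate, then request the overidentification tests
quietly ivregress 2sls y1 z1 z2 (y2 = z3 z4)
estat overid
```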
How to do it in STATA?
Assume that x1 is endogenous and x2 is exogenous. Moreover,
assume that you have two instruments available: z1 and z2.
ivregress 2sls y (x1=z1 z2) x2 (instrumental variables estimation)
ivregress 2sls y (x1=z1 z2) x2, first (same estimation, also showing the first-stage regression results)
estat firststage (test for the significance of the instruments; rule of thumb: F > 10)
estat overid (test for the validity of the instruments; requires more instruments than endogenous variables)
estat endogenous (test whether x1, the variable treated as endogenous, is in fact exogenous)
reg x1 z1 z2 x2 (first-stage regression run by hand; x2 is included because the first stage contains every exogenous regressor)
test z1 z2 (test for the joint significance of the instruments; rule of thumb: F > 10)
How to do it in STATA?
http://fmwww.bc.edu/gstat/examples/wooldridge/wooldridge15.html