
Panel Data Econometrics

Neil Foster-McGregor
(neil.foster@univie.ac.at)
Introduction

Course Outline
Review of Basic Econometrics
o Linear Regression Model
o Systems of Regression Equations
o Estimation Methods (OLS, 2SLS, 3SLS)
o Hypothesis Testing and Inference
Linear Panel Data
o Pooled Regression
o Fixed Effects Regression
o Random Effects Regression
o First Difference Model
o Model Choice and Misspecification Tests
o IV and GMM Estimation
Dynamic Panel Regression Models
Topics in Discrete Choice Analysis with Panel Data
o Binary Models
o Multinomial and Ordered Response Models
o Count Data
Multi-equation models for panel data
Time-Series Panel Data
o Unit Root Testing with Panel Data
o Cointegration Analysis with Panel Data
Assessment
Assessment will be based on a final exam (40%), two take-home computer-based exercises (25%) and a group project (35%).

Office Hours
Email me for an appointment (neil.foster@univie.ac.at).

Reading List
J. M. Wooldridge, Econometric Analysis of Cross Section and Panel Data, Second Edition, The MIT Press, 2010.
A. Colin Cameron and Pravin K. Trivedi, Microeconometrics: Methods and Applications, Cambridge University Press, 2005.
A. Colin Cameron and Pravin K. Trivedi, Microeconometrics Using Stata, Revised Edition, Stata Press, 2010.
M. Arellano, Panel Data Econometrics, Oxford University Press, 2003.
B. H. Baltagi, Econometric Analysis of Panel Data, Fourth Edition, John Wiley, New York, 2008.
C. Hsiao, Analysis of Panel Data, 2nd ed., Cambridge University Press, 2003.

Introduction
i. What is Econometrics?
ii. Types of Data
iii. Linear Regression Model (Ordinary Least Squares)
iv. Hypothesis Testing
v. Dummy Variables
vi. Testing the OLS Assumptions (Heteroscedasticity, Autocorrelation, Normality of Residuals)
vii. Endogeneity
viii. Systems of Regression Equations
ix. Additional Estimation Methods (2SLS, GMM, 3SLS, MLE)
x. Pooled Data and Difference-in-Difference Analysis
xi. Introduction to Stata


What is Econometrics?
Usually involves determining whether a change in one variable, w, causes a change in another variable, y
o For example, does another year of education cause an increase in wages?

Estimating economic relationships
o Cost function
o Demand and Supply
o Production Function

Testing Hypotheses
o Are economies of scale constant, decreasing or increasing?
o Is the demand function elastic or inelastic?

Forecasting
o Economic growth
o Unemployment rate
o Population growth

Causal Relationships
Finding that two variables are correlated doesn't usually imply causality
Econometric methods allow one to effectively hold other factors constant - the ceteris paribus assumption
Involves estimating the expected value of y conditional on w and c - E(y|w, c) - where c is a set of control variables that we hold fixed when looking at the effect of w on y
In the case of continuous explanatory variables our interest is in the partial effect of w on E(y|w, c), i.e. ∂E(y|w, c)/∂w
For discrete explanatory variables we are interested in E(y|w, c) evaluated at different values of w (holding the elements of c fixed)
If we know what the elements of c are and are able to collect data on them, then estimating causal relationships is straightforward
In reality, data may not be available or may be measured with error


Ingredients of an Empirical Study
Model
o Single equation model
o Simultaneous equation model
Data
o Cross-section
o Time-series
o Panel
Estimation
Robustness
Hypothesis Testing
o Nested and non-nested models
o Individual and joint hypothesis testing
Interpretation
o Meaning
o Implications for economic theory
o Implications for policy


Types of Data
Cross-Section - collects data on many individuals (e.g. firms, countries, people, …) at a single point in time
Time-Series - data collected on a single individual at different points in time
Panel Data - usually combines cross-section and time-series data and refers to the pooling of observations on a cross-section of households, countries, firms, etc. over several time periods
o Cross-section units can be individuals, households, plants, firms, municipalities, states or countries
o Repeat observations are usually time periods (e.g. five-year intervals, annual, quarters, weeks, days, etc.) or units within clusters (e.g. siblings within a family, firms within an industry, workers within a firm, etc.)
An important characteristic of panel data is that we cannot assume that the observations are independently distributed across time
o e.g. unobserved factors that affect a person's wage in 1990 will also affect that person's wage in 1991
Independently pooled cross-section data are obtained by sampling randomly from a large population at different points in time.
o Such data consist of independently sampled observations, which rules out correlation in the error terms across different observations

Micro panels - achieved by surveying a number of households, individuals or firms over time. Usually collected over a short time period T
Examples:
Panel Study of Income Dynamics
National Longitudinal Survey
World Bank's Living Standards Measurement Study
European Community Household Panel
Macro panels - involve data on a number of countries over time. Usually collected over a longer time period (20-60 years)
Examples:
Penn World Tables
World Development Indicators
Direction of Trade Statistics
Micro and Macro panels require different econometric techniques
o Asymptotics for micro panels need to be for large N and small T, while they need to be for large N and large T for macro panels
o With long time series, macro panels need to deal with nonstationarity (i.e. unit roots, structural breaks, cointegration)
o Need to deal with cross-country dependence, which is not usually a problem in micro panels where random sampling is used
The Single-Equation Linear Regression Model
The linear regression model can be written as:
y_i = β0 + β1 x_i1 + … + βk x_ik + e_i

where y, x_1, …, x_k are observable random scalars, e is the unobservable random disturbance (error) and β0, β1, …, βk are the parameters that we would like to estimate
Crucial assumptions for OLS estimation are:
o E(e_i) = 0
o Var(e_i) = σ²
o Cov(x_ij, e_i) = 0

In matrix form we have:
Y = Xβ + ε
With:
o E(ε) = 0
o Var(ε) = σ²I

This standard model is fairly general since the x's may include nonlinear functions of the underlying variables, such as logarithms, squares, reciprocals, interactions, etc.
A model nonlinear in parameters may be more applicable when the range of y is restricted (e.g. the dependent variable is binary, non-negative, etc.)

Examples:
ln(wage_i) = β0 + β1 educ_i + β2 exper_i + β3 exper_i² + β4 married_i + e_i
ln(output_jt) = β0 + β1 ln(lab_jt) + β2 ln(cap_jt) + β3 spill_jt + β4 quality_j + e_jt

Main Assumptions of the Linear Regression Model
A1. The independent variables and the error term are uncorrelated, E(X'ε) = 0 - the 'Orthogonality Condition'
A2. None of the independent variables is a linear combination of the other independent variables. The rank of X is equal to K + 1. Otherwise we have (perfect) multicollinearity
A3. The error term ε has a mean of 0 and a constant variance σ², i.e. E(ε) = 0 and Cov(ε) = σ²I
o The property of a constant variance is called homoscedasticity. A non-constant variance corresponds to heteroscedasticity
o The error term is uncorrelated over time, i.e. E[e_t e_s] = 0 for t ≠ s. If this assumption is not fulfilled, autocorrelation is prevalent
A4. The error terms are normally distributed, ε ~ N(0, σ²I) (only necessary for testing)

Ordinary Least Squares Regression
Least Squares Criterion:
S(β) = (Y − Xβ)'(Y − Xβ) = Y'Y − 2Y'Xβ + β'X'Xβ  →  min over β

Normal Equations
∂S/∂β = −2X'Y + 2(X'X)β = 0   →   (X'X)β = X'Y

Solutions
β̂ = (X'X)⁻¹X'Y
and the estimated covariance matrix is:
Var(β̂) = σ̂²(X'X)⁻¹

The OLS estimator is the Best Linear Unbiased Estimator (BLUE), as proven by the Gauss-Markov Theorem.

Properties of Estimators
Bias
o An estimator is unbiased if E(β̂) = β
Efficiency
o Assume β̂ and β̃ are two unbiased estimators. If Var(β̂) < Var(β̃), then β̂ is more efficient
Consistency
o An estimator β̂_n is called a consistent estimator of β if for n → ∞ the estimator β̂_n converges to its true value

Notation
Predicted values of Y are given by Ŷ = Xβ̂, i.e. the matrix of the independent variables multiplied by the vector of the estimated coefficients
The estimated residuals are given by ε̂ = Y − Ŷ, i.e. the difference between the actual and predicted values of the dependent variable
The estimate for σ² can be expressed as σ̂² = ε̂'ε̂ / (n − (K+1))
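As a minimal sketch (an illustration, not part of the original notes), the formulas above can be checked in Stata by computing β̂ = (X'X)⁻¹X'Y and σ̂² by hand in Mata on simulated data and comparing with regress; all variable names here are made up for the example.

* Simulated data (illustrative)
clear
set obs 200
set seed 12345
gen x1 = rnormal()
gen x2 = rnormal()
gen y  = 1 + 0.5*x1 - 0.3*x2 + rnormal()

* OLS by hand using the matrix formulas above
mata:
    y = st_data(., "y")
    X = st_data(., ("x1", "x2")), J(st_nobs(), 1, 1)   // append a constant column
    b = invsym(X'X) * X'y                              // beta-hat = (X'X)^-1 X'Y
    e = y - X*b                                        // residuals
    s2 = (e'e) / (st_nobs() - cols(X))                 // sigma2-hat = e'e / (n - (K+1))
    V = s2 * invsym(X'X)                               // Var(beta-hat)
    b, sqrt(diagonal(V))                               // coefficients and standard errors
end

* The same estimates from the built-in command
regress y x1 x2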
What if the assumptions do not hold?
A1. We have an endogeneity problem and the OLS estimator is inconsistent
A2. If the rank of X is lower than K + 1, the estimator does not exist. In the case of multicollinearity, i.e. some independent variables are highly correlated, there are numerical problems
A3. In both cases, i.e. autocorrelation or heteroscedasticity, the OLS estimator is inefficient, but consistent
o In the case of positive autocorrelation the estimated standard errors are too small. If additionally there are lagged dependent variables on the right-hand side, the OLS estimates are biased and inconsistent
o It is possible to correct for incorrectly estimated standard errors
A4. The test statistics are invalid. However, if the sample size is large enough, the central limit theorem allows us to again assume an (approximately) normal distribution

Goodness of Fit
Consider the following identity:
y_i − ȳ = (y_i − ŷ_i) + (ŷ_i − ȳ)
Squaring both sides and summing over observations gives:
Σ(y_i − ȳ)² = Σ(y_i − ŷ_i)² + Σ(ŷ_i − ȳ)² + 2Σ(y_i − ŷ_i)(ŷ_i − ȳ)
The last of these terms is identically equal to zero (by the OLS normal equations the residuals sum to zero and are orthogonal to the fitted values), meaning that:
Σ(y_i − ȳ)² = Σ(y_i − ŷ_i)² + Σ(ŷ_i − ȳ)²
SST = SSE + SSR
Total sum of squares = Explained sum of squares + Residual sum of squares
R² = SSE/SST = 1 − SSR/SST

0 ≤ R² ≤ 1, where 1 corresponds to a perfect fit and 0 to no fit at all. But this is only true when a constant has been included. If not, the R² can be negative.
R² increases whenever a new explanatory variable is added.
adj. R² = R̄² = 1 − [(n − 1)/(n − (K+1))] (1 − R²)
Hypothesis Testing
Individual hypotheses: the t-test
Under the assumptions of the Gauss-Markov Theorem, it is true that
(β̂_k − β_k) / std.err.(β̂_k)  ~  t_{n−(K+1)}

Hypotheses:
H0: β_k = b_k
H1: β_k ≠ b_k

Test Statistic:
t_{n−(K+1)} = (β̂_k − b_k) / std.err.(β̂_k)

Decision:
If |t_{n−(K+1)}| ≥ t_crit then H0 can be rejected
If |t_{n−(K+1)}| < t_crit then H0 cannot be rejected
t_crit depends on the chosen level of significance
Commonly used levels of significance: 90%, 95%, 99%

Joint Hypothesis Testing: The F test
Used when wanting to test hypotheses on multiple parameters.
Consider the model: ln(wage_i) = β0 + β1 educ_i + β2 exper_i + β3 exper_i² + β4 married_i + e_i
Assume that we want to test the hypothesis that, after controlling for the individual being married, a person's education and experience have no impact on wages
Hypotheses:
H0: β1 = β2 = β3 = 0
H1: H0 is not true

Test Statistic:
F_{r, n−(K+1)} = [(RSSR − USSR)/r] / [USSR/(n − (K+1))]
where RSSR and USSR equal the sum of squared residuals from the restricted and unrestricted models respectively, r is the number of restrictions, n the number of observations and K + 1 the number of regressors.

Decision:
If F_{r, n−(K+1)} ≥ F_crit then H0 can be rejected
If F_{r, n−(K+1)} < F_crit then H0 cannot be rejected
F_crit depends upon the chosen level of significance
Commonly used levels of significance: 90%, 95%, 99%
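A minimal Stata sketch of this wage example (a dataset containing lwage, educ, exper and married is assumed; this is an illustration, not part of the notes): the regress output reports the individual t-tests, and test computes the joint F test of H0: β1 = β2 = β3 = 0.

* Assumes a dataset containing lwage, educ, exper and married (illustrative)
gen expersq = exper^2
regress lwage educ exper expersq married

* Individual t-tests and p-values are reported in the regress output.
* Joint F test that education and experience have no effect on wages:
test educ exper expersq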

Joint Hypothesis Testing: The Lagrange Multiplier test
An additional/alternative to the F test is the Lagrange Multiplier (LM) test, which proceeds as follows:
Stage 1: Estimate the restricted model and obtain the residuals, ẽ_i
Stage 2: Regress these residuals on a constant and all explanatory variables
The test statistic is given by: LM = N R_ẽ², where R_ẽ² is the R-squared from the second stage regression
This statistic is distributed as a χ² with r degrees of freedom (where r is the number of restrictions being tested)
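A minimal sketch of the LM test by hand for the wage example above (illustrative; the chi2tail call assumes r = 3 restrictions):

* Stage 1: restricted model (educ, exper and expersq excluded under H0)
regress lwage married
predict double uhat, residuals

* Stage 2: regress the restricted residuals on all explanatory variables
regress uhat educ exper expersq married

* LM = N * R-squared, chi-squared with r = 3 degrees of freedom
display "LM = " e(N)*e(r2) "   p-value = " chi2tail(3, e(N)*e(r2))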


The Chow Test
A Chow test is an F-test for structural breaks in the coefficients. For example, in a macro model we can test for the influence of the oil price shock at the beginning of the 1970s. Or with cross-section data we can test whether the coefficient for the return to education in a wage regression is equal across two groups of individuals.

The restricted model:
y_i = β0 + β1 x_i1 + β2 x_i2 + e_i,   i = 1, …, n

The unrestricted model:
y_i = β10 + β11 x_i1 + β12 x_i2 + e_i1,   i = 1, …, N1
y_i = β20 + β21 x_i1 + β22 x_i2 + e_i2,   i = 1, …, N2

Null Hypothesis:
H0: β10 = β20, β11 = β21, β12 = β22

Step 1: Regression with structural break
- Regression with sample 1, N1 observations and K + 1 = 3 parameters. Obtain the residual sum of squares (SSR(1)).
- Regression with sample 2, N2 observations and K + 1 = 3 parameters. Obtain the residual sum of squares (SSR(2)).
- USSR = SSR(1) + SSR(2)
Step 2: Regression without structural break
- Regression with the whole sample i = 1, …, (N1 + N2) and K + 1 = 3 parameters. Obtain the residual sum of squares (RSSR).
Step 3: Construction of an F-test
F_{r, n−(K+1)} = [(RSSR − USSR)/r] / [USSR/(n − (K+1))]
with RSSR and USSR equal to the sum of squared residuals of the restricted and the unrestricted model, r the number of restrictions, n the number of observations and K + 1 the number of regressors.


Chow test:
F_{K+1, N1+N2−2(K+1)} = [(RSSR − USSR)/(K + 1)] / [USSR/(N1 + N2 − 2(K + 1))]
with USSR = SSR(1) + SSR(2) and RSSR the sum of squared residuals of the restricted model, K + 1 the number of restrictions, N1 + N2 the number of observations and 2(K + 1) the number of regressors in the unrestricted model.
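A minimal Stata sketch of the same test via a group dummy and interaction terms, which under homoscedasticity gives the same F statistic as the SSR formula above (the names y, x1, x2 and the 0/1 group indicator g are illustrative):

* Unrestricted model: intercept and slopes allowed to differ across the two groups
gen gx1 = g*x1
gen gx2 = g*x2
regress y x1 x2 g gx1 gx2

* Chow test: H0 of no structural break = all group-specific terms jointly zero
test g gx1 gx2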







Dummy Variables
A dummy variable is equal to one if an event or individual meets a certain criterion and zero otherwise. For econometric implementation there are two possibilities: (a) intercept and (b) slope dummy variables. In the first case there is only a shift in the estimated intercept, in the other case there is also a shift in the estimated slope.
Intercept dummy variables:
y_i = μ + e_i,        for i in group 1
y_i = μ + δ + e_i,    for i in group 2
These two equations can be summarized into one equation:
y_i = μ + δ D_i + e_i
where D_i is equal to one if the individual is in group 2 and zero if they are in group 1.
Slope dummy variables:
y_i = μ + β2 x_i + e_i,         for i in group 1
y_i = μ + (β2 + δ) x_i + e_i,   for i in group 2
These two equations can be summarized into one equation:
y_i = μ + β2 x_i + δ (D_i x_i) + e_i

Testing the OLS Assumptions
Heteroscedasticity
Assumptions of the OLS Estimator:
o The error term e has a mean of 0, conditional on the x's
o The variance of the unobservable error term, conditional on the x's, is constant, i.e. Var(e|x) = σ², as is the unconditional error variance, i.e. Var(e) = σ², given the assumption above
The property of a constant variance is called homoscedasticity
If the variance is not constant we have heteroscedasticity.
What if this assumption is not met?
o The OLS estimator remains unbiased and consistent
o The OLS estimator is no longer BLUE, and in particular it is no longer asymptotically efficient
o OLS standard errors are no longer valid for constructing confidence intervals and t-statistics
It is possible to correct for the wrongly estimated standard errors.


Testing for Heteroscedasticity
Consider the following linear model:
y_i = β0 + β1 x_i1 + … + βk x_ik + e_i
o The null hypothesis is of homoscedasticity, i.e. H0: Var(e|x1, x2, …, xk) = σ²
o and since we are assuming that e has a zero conditional expectation we have H0: E(e²|x1, x2, …, xk) = E(e²) = σ²
o This equation indicates that in order to test for a violation of homoscedasticity we need to test whether e² is related to one or more of the explanatory variables.

Breusch-Pagan Test
o A simple approach is to consider the following:
e² = δ0 + δ1 x1 + … + δk xk + v
o where v is an error term with mean zero (given the x's)
o The null of no heteroscedasticity is then:
H0: δ1 = δ2 = … = δk = 0
o Since we don't have data on the true errors we use an estimate for the errors, i.e. the residuals:
ê² = δ0 + δ1 x1 + … + δk xk + v
o Two test statistics can then be constructed:
F = [R²_ê² / k] / [(1 − R²_ê²) / (n − k − 1)]
LM = n R²_ê²
o The LM version of this test is usually called the Breusch-Pagan test for heteroscedasticity
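A minimal Stata sketch (illustrative model and variable names; the rhs and iid options of estat hettest give the LM = n R² version based on the regressors):

regress y x1 x2 x3

* Breusch-Pagan LM test using the right-hand-side variables
estat hettest, rhs iid

* Or by hand: regress the squared residuals on the x's and use LM = n*R2
predict double ehat, residuals
gen double ehat2 = ehat^2
regress ehat2 x1 x2 x3
display "LM = " e(N)*e(r2) "   p-value = " chi2tail(3, e(N)*e(r2))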

White's Heteroscedasticity Test
o White's test is a test of the null hypothesis of no heteroskedasticity against heteroskedasticity of some unknown general form:
H0: σ_i² = σ² for all i
o This is a very general test. The null hypothesis conditions on homoscedasticity, a linear model and independence of the regressors.
o If the null hypothesis can be rejected, one of these three conditions may not be met. It is not, however, clear which one.
o The test statistic is computed by an auxiliary regression, where we regress the squared residuals on all possible cross products of the regressors.
o For example, suppose we estimated the following regression:
y_i = β0 + β1 x_i1 + β2 x_i2 + β3 x_i3 + e_i
o The test statistic is then based on the auxiliary regression:
ê_i² = β0 + β1 x_i1 + β2 x_i2 + β3 x_i3 + β11 x_i1² + β22 x_i2² + β33 x_i3² + β12 x_i1 x_i2 + β13 x_i1 x_i3 + β23 x_i2 x_i3 + v_i

o The statistic n R²_ê² gives the value of White's test statistic. The exact finite sample distribution of the F-statistic under the null hypothesis is not known, but White's test statistic is asymptotically distributed as a χ² with degrees of freedom equal to the number of slope coefficients (excluding the constant) in the test regression.
o White also describes this approach as a general test for model misspecification, since the null hypothesis underlying the test assumes that the errors are both homoscedastic and independent of the regressors, and that the linear specification of the model is correct.
o Failure of any one of these conditions could lead to a significant test statistic. Conversely, a non-significant test statistic implies that none of the three conditions is violated.
o Possible reasons for heteroscedasticity:
- Misspecification of the model
- Missing variables
- Incorrect functional form
o With many right-hand side variables in the regression, the number of possible cross product terms becomes very large, so that it may not be practical to include all of them. It is possible therefore to run the model with no cross terms.
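A minimal Stata sketch (illustrative names); the white option of estat imtest runs the auxiliary regression with levels, squares and cross products automatically:

regress y x1 x2 x3
* White's general test for heteroscedasticity
estat imtest, white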


Correcting for Heteroscedasticity - Weighted Least Squares
Suppose that we have heteroscedasticity of known form, and that there is a series w whose values are proportional to the reciprocals of the error standard deviations.
We can use weighted least squares, with weight series w, to correct for the heteroscedasticity.
o Example:
y_i = β0 + β1 x_i1 + β2 x_i2 + β3 x_i3 + e_i,   with Var(e_i) = c x_i3²
Dividing through by x_i3:
y_i/x_i3 = β0 (1/x_i3) + β1 (x_i1/x_i3) + β2 (x_i2/x_i3) + β3 + e_i/x_i3

Estimation is done by running a regression using the weighted dependent and weighted independent variables to minimize the sum of squared residuals
S(β) = Σ_i w_i² (y_i − x_i β)²
with respect to the k-dimensional vector of parameters β.
In matrix notation, let W be a diagonal matrix containing the scaled w's along the diagonal and zeros elsewhere, and let y and X be the usual matrices associated with the left and right-hand side variables. The weighted least squares estimator is
β̂_WLS = (X'W'WX)⁻¹X'W'Wy
and the estimated covariance matrix is
Var(β̂_WLS) = σ̂²(X'W'WX)⁻¹
The variance of the errors in the transformed regression will be constant.
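A minimal Stata sketch of this example (illustrative names, and x3 is assumed to be positive), under the assumption Var(e_i) = c x3²; analytic weights proportional to 1/variance reproduce the WLS point estimates, and the second block shows the transformed regression by hand:

* WLS via analytic weights proportional to 1/Var(e_i) = 1/x3^2
regress y x1 x2 x3 [aweight = 1/(x3^2)]

* Equivalent transformed regression: divide everything by x3
gen double ystar  = y/x3
gen double x0star = 1/x3
gen double x1star = x1/x3
gen double x2star = x2/x3
regress ystar x0star x1star x2star
* Here the coefficient on x0star estimates beta0 and _cons estimates beta3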
Feasible Generalised Least Squares
More generally, the exact form of the heteroscedasticity is not obvious.
In many cases we can model the heteroscedasticity using the parameters from the regression model.
o First step: Estimate y_i = β0 + β1 x_i1 + β2 x_i2 + β3 x_i3 + e_i with OLS and generate the squared residuals ê_i².
o Second step: Run an OLS regression with ê_i² as the dependent variable and x_1, x_2, x_3, their squares and cross terms as the independent variables, and generate the square root of the predicted values of this regression.
o Third step: Re-estimate the regression of the first step with weighted least squares, using the square root of the predicted values of the regression of the second step as the weights.


Heteroscedasticity Consistent Covariances (White)
When the form of heteroscedasticity is not known, it may not be possible to obtain efficient estimates of the parameters using weighted least squares.
White (1980) has derived a heteroscedasticity consistent covariance matrix estimator which provides correct estimates of the coefficient covariances in the presence of heteroscedasticity of unknown form. The White covariance matrix is given by:
Var_W(β̂) = [n/(n − (K+1))] (X'X)⁻¹ (Σ_{i=1}^{n} ê_i² x_i x_i') (X'X)⁻¹
where n is the number of observations, K + 1 is the number of regressors, and ê_i is the least squares residual.
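A one-line Stata sketch (illustrative names): vce(robust) replaces the OLS covariance matrix with the White estimator while leaving the point estimates unchanged.

* OLS point estimates with White heteroscedasticity-robust standard errors
regress y x1 x2 x3, vce(robust)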



Autocorrelation
Assumptions of least squares estimation:
o The error term e has a mean of 0 and a constant variance σ², i.e. E(e) = 0 and Cov(e) = σ²I
o The error term is uncorrelated over time, i.e. E[e_t e_s] = 0 for t ≠ s. If this assumption is not fulfilled, autocorrelation is prevalent

What if the error term is correlated over time?
o The OLS estimator is inefficient, but consistent
o In the case of positive autocorrelation the estimated standard deviations are too small, and therefore the t-values are too large
o In the case of negative autocorrelation the estimated standard deviations are too large, and therefore the t-values are too small
o If there are additionally lagged dependent variables on the right-hand side, the OLS estimates are biased and inconsistent.


Testing for first-order autocorrelation
Autocorrelation denotes the correlation between a time series y_t and its own lagged values y_{t−s}:
y_t = ρ y_{t−1} + u_t
or consider the error term in a linear model:
y_t = x_t β + e_t
e_t = ρ e_{t−1} + u_t


Durbin-Watson Test
The null hypothesis of the Durbin-Watson test is no first order autocorrelation, ρ = 0. The alternative hypothesis is of first order autocorrelation.
The test statistic:
DW = Σ_{t=2}^{T} (ê_t − ê_{t−1})² / Σ_{t=1}^{T} ê_t²
For T → ∞ the test statistic DW → 2 − 2ρ̂.
If there is no serial correlation, the DW statistic will be around 2.
The DW statistic will fall below 2 if there is positive serial correlation (in the worst case, it will be near zero).
If there is negative correlation, the statistic will lie somewhere between 2 and 4.
The critical values are tabulated. For each DW test statistic there are two critical values, d_L and d_U, depending on the number of parameters and the number of observations.
To reach a decision (testing against positive autocorrelation):
o Do not reject the null hypothesis if DW > d_U.
o Reject the null hypothesis if DW < d_L.
o No decision if d_L ≤ DW ≤ d_U.
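A minimal Stata sketch (illustrative names; a time variable called year is assumed):

tsset year
regress y x1 x2
estat dwatson        // Durbin-Watson d statistic
estat durbinalt      // Durbin's alternative test, valid with lagged dependent variables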

Testing for higher order autocorrelation
The autocorrelation coefficient at lag l:
r_l = Σ_{t=l+1}^{T} û_t û_{t−l} / Σ_{t=1}^{T} û_t²,   l = 1, …, L
with L the highest lag.
The correlogram is the series of the autocorrelation coefficients. The 95% confidence interval shows the significance of each autocorrelation coefficient. If one of the coefficients is significant, then the tested time series is not white noise.
Ljung-Box Q-statistics:
Q_L = T(T + 2) Σ_{j=1}^{L} r_j² / (T − j)
with r_j = Corr(y_t, y_{t−j}) and L the highest lag. The test statistic Q_L is distributed as a χ² (with L degrees of freedom) under the null hypothesis of white noise.
If there is no serial correlation in the residuals, the autocorrelations and partial autocorrelations at all lags should be nearly zero, and all Q-statistics should be insignificant with large p-values.
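A minimal Stata sketch (illustrative names; the data must be tsset):

regress y x1 x2
predict double ehat, residuals
corrgram ehat, lags(12)     // autocorrelations, partial autocorrelations and Ljung-Box Q
wntestq ehat, lags(12)      // portmanteau (Ljung-Box) white-noise test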

Cochrane-Orcutt Procedure
Iterative procedure to correct for autocorrelation using OLS estimation.
y_t = x_t β + ε_t   with   ε_t = ρ ε_{t−1} + u_t

"Generalized differencing" transformation:
y_t = x_t β + ε_t
y_{t−1} = x_{t−1} β + ε_{t−1}   (multiply by ρ)
y_t − ρ y_{t−1} = (x_t − ρ x_{t−1}) β + ε_t − ρ ε_{t−1}
y_t − ρ y_{t−1} = (x_t − ρ x_{t−1}) β + u_t
with u_t ~ N(0, σ²)
OLS is applicable

The following steps are necessary:
o Estimation of the original model y_t = x_t β + ε_t
o Generate ε̂_t = y_t − x_t β̂ and estimate ε̂_t = ρ ε̂_{t−1} to obtain ρ̂.
o Transformation: (y_t − ρ̂ y_{t−1}) = (x_t − ρ̂ x_{t−1}) β + ε_t*
o Generate new residuals.
o Repeat until |ρ̂_n − ρ̂_{n−1}| < δ for some small δ > 0
Note: This procedure does not necessarily find the global minimum. It is possible to end up at a local one.
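A minimal Stata sketch (illustrative names; a tsset time variable is assumed):

tsset year
* Cochrane-Orcutt iterative estimation (drops the first observation)
prais y x1 x2, corc
* Prais-Winsten (the default of prais) keeps the first, transformed, observation
prais y x1 x2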


Hildreth-Lu procedure
A set of grid values for ρ is specified.
o For example: ρ_j = 0, 0.1, …, 1.0
The following steps are then necessary:
Estimate y_t = x_t β + ε_t with ε_t = ρ_j ε_{t−1} + u_t for each ρ_j = 0, 0.1, …, 1.0
o The estimation with the lowest sum of squared residuals (SSR) gives ρ̂_0.
o Find a new set of grid values in the neighbourhood of ρ̂_0 to get ρ̂_1.
o Generate new estimates.
o Repeat until |ρ̂_n − ρ̂_{n−1}| < δ
Note: It is also possible to find a local, and not the global, minimum.


HAC Covariances (Newey-West)
The White covariance matrix described above assumes that the residuals of the estimated equation are serially uncorrelated.
Newey and West (1987b) have proposed a more general covariance matrix estimator that is consistent in the presence of both heteroscedasticity and autocorrelation of unknown form.
The Newey-West estimator is given by
Var_NW(β̂) = [T/(T − (K+1))] (X'X)⁻¹ Ω̂ (X'X)⁻¹
where
Ω̂ = [T/(T − (K+1))] { Σ_{t=1}^{T} ê_t² x_t x_t'  +  Σ_{v=1}^{q} [1 − v/(q + 1)] Σ_{t=v+1}^{T} (x_t ê_t ê_{t−v} x_{t−v}' + x_{t−v} ê_{t−v} ê_t x_t') }
and q, the truncation lag, is a parameter representing the number of autocorrelations used in evaluating the dynamics of the OLS residuals ê.
Note that using the White heteroscedasticity consistent or the Newey-West HAC consistent covariance estimates does not change the point estimates of the parameters, only the estimated standard errors.
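A minimal Stata sketch (illustrative names; a tsset time variable and a truncation lag of 4 are assumed):

tsset year
* OLS point estimates with Newey-West HAC standard errors, truncation lag q = 4
newey y x1 x2, lag(4)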

Normality of Residuals
Non-normality of the residuals results in the test statistics being invalid.
o If the sample size is large enough, the central limit theorem lets us again assume an (approximately) normal distribution
Can test for this by looking at a histogram of the residuals
A more formal test is the Jarque-Bera test

Note: "gra res" is an old Stata command. In Stata 12 the command would be: hist res, normal bin(50)

The residuals show signs of right skewness (residuals bunched to the left) and kurtosis (leptokurtic - peak higher than expected for a normal distribution)

Jarque-Bera Test
A goodness-of-fit test of whether the sample data have a skewness and kurtosis matching a normal distribution
The null hypothesis is of zero skewness and zero excess kurtosis, meaning that the null is of a normal distribution
The test statistic is:
JB = (n/6) [S² + (1/4)(K − 3)²]
where n is the number of observations, S is the sample skewness and K is the sample kurtosis
The test statistic is distributed as Chi-squared with two degrees of freedom

Construct Jarque-Bera test
. jb = (n/6)*(S^2 + ((K − 3)^2)/4), which for the residuals above gives 328.9
The statistic has a χ² distribution with 2 degrees of freedom (one for skewness, one for kurtosis)
From statistical tables, the critical value at the 95% level for 2 degrees of freedom is 5.99
We therefore reject the null hypothesis that the residuals are normally distributed
Such results suggest that an alternative functional form should be considered to try and make the residuals normal. Otherwise, the t-statistics may be invalid.
This test is only valid asymptotically, and so relies on a large sample size
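A minimal Stata sketch that computes the statistic by hand from the regression residuals (illustrative names):

regress y x1 x2
predict double res, residuals
quietly summarize res, detail
scalar S  = r(skewness)
scalar K  = r(kurtosis)
scalar n  = r(N)
scalar jb = (n/6)*(S^2 + ((K - 3)^2)/4)
display "JB = " jb "   p-value = " chi2tail(2, jb)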

Specification Tests
Ramsey's RESET test (Ramsey, 1969) detects functional form misspecification
o In particular, it tests for neglected nonlinearities in models
If the original linear model y = β0 + β1 x1 + … + βk xk + e satisfies the assumption that E(e|x1, x2, …, xk) = 0, then no nonlinear functions of the independent variables should be significant when added to the regression equation
The RESET test tests whether nonlinear combinations of the estimated values help explain the dependent variable
If nonlinear combinations of the explanatory variables have any power in explaining the dependent variable, then the model is mis-specified.
It tests whether ŷ², ŷ³, …, ŷ^j (powers of the fitted values xβ̂) have any power in explaining y
In practice the squared and cubed fitted values from the original model are often included for this test
The model to estimate therefore is
y = β0 + β1 x1 + … + βk xk + δ1 ŷ² + δ2 ŷ³ + error
The null hypothesis is of no misspecification, i.e.
H0: δ1 = δ2 = 0
which can be tested using a standard F-test
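A minimal Stata sketch (illustrative names); estat ovtest is the built-in RESET test (it uses powers 2 to 4 of the fitted values), and the block below reproduces the two-power version described above by hand:

regress y x1 x2 x3
estat ovtest                 // Ramsey RESET using powers of the fitted values

* By hand, with squared and cubed fitted values only:
predict double yhat, xb
gen double yhat2 = yhat^2
gen double yhat3 = yhat^3
regress y x1 x2 x3 yhat2 yhat3
test yhat2 yhat3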
Endogeneity
An explanatory variable, x_j, is said to be endogenous if it is correlated with the error term, e.
Endogeneity usually arises in one of three ways:
Omitted variables:
o Variables that we would like to control for, but cannot because of data availability
o Omitted variables form part of the error term, and if the explanatory variable is correlated with the omitted variable we have an endogeneity problem
o Often arises due to self-selection; e.g. an individual's decision on years of schooling is likely to be correlated with unobserved ability
Simultaneity
o Arises when at least one explanatory variable is determined simultaneously with the dependent variable, y
o In such cases there is generally a correlation between the simultaneously determined explanatory variable and the error term
o A common response to such problems is to use simultaneous equation models

Measurement error
Where we only observe an imperfect measure of our variable of interest
The measurement error forms part of the error term
Endogeneity will arise if there is a correlation between the imperfect explanatory variable and the composite error term
Consider the following regression model:
y = β0 + β1 x1* + β2 x2 + e
x1* is not observed, but x1 is an observed measurement of x1*: x1 = x1* + u, where u represents the measurement error
Classical Errors-in-Variables assumption: cov(x1*, u) = 0
If this holds then x1 and u must be correlated:
cov(x1, u) = E(x1 u) = E(x1* u) + E(u²) = 0 + σ_u² = σ_u²
Rewriting the regression model we have:
y = β0 + β1 (x1 − u) + β2 x2 + e
y = β0 + β1 x1 + β2 x2 + (e − β1 u)
There will therefore be a correlation between x1 and the composite error:
cov(x1, e − β1 u) = −β1 cov(x1, u) = −β1 σ_u²
Omitted Variable Bias
Excluding a relevant variable from the regression model can lead to biased coefficients.
Model with one independent variable:
y_i = α0 + α1 x_i1 + ε_i
Model with two independent variables:
y_i = β0 + β1 x_i1 + β2 x_i2 + e_i   (1)
If the model with two independent variables is the true model, the error term of the model with one independent variable can be expressed as follows:
ε_i = β2 x_i2 + e_i
This violates the orthogonality assumption and we have an endogeneity problem.
Consider the auxiliary regression equation:
x_i2 = δ0 + δ1 x_i1 + v_i

Substitute into (1):
y_i = β0 + β1 x_i1 + β2 (δ0 + δ1 x_i1 + v_i) + e_i = (β0 + β2 δ0) + (β1 + β2 δ1) x_i1 + (β2 v_i + e_i)
⟹ α1 = β1 + δ1 β2
Omitted variable bias:
α1 − β1 = δ1 β2
If additional explanatory variables are included in the model (e.g. x3), the coefficients on these variables will also in general be biased (except when x1 and x3 are uncorrelated).


Ordinary Least Squares Solutions to Omitted Variables - Proxy Variables
Omitted variable bias can be eliminated/mitigated if there is a proxy variable available for the omitted variable, x2
There are two requirements for a proxy variable (which we will call z):
o The proxy should be redundant (ignorable) in the structural equation (i.e. z is irrelevant for explaining y once x1 and x2 have been controlled for)
o The correlation between the omitted variable and the other explanatory variables should be zero once we partial out z
Example: Using IQ as a proxy for ability in wage regressions

Additional Estimators: Two-Stage Least Squares
Assumption of the OLS Model
A1. The independent variables and the error term are uncorrelated, E(X'e) = 0 - the 'Orthogonality Condition'
When this assumption fails to hold we have a so-called endogeneity problem
The OLS estimator is biased and inconsistent
Instrumental Variables estimation may be used to obtain consistent parameter estimates in such cases
Endogeneity usually arises in one of three ways:
o Omitted variables:
Variables that we would like to control for, but cannot because of data availability
Often arises due to self-selection; e.g. an individual's decision on years of schooling is likely to be correlated with unobserved ability
o Measurement error:
Where we only observe an imperfect measure of our variable of interest
The measurement error forms part of the error term
o Simultaneity:
Arises when at least one explanatory variable is determined simultaneously with the dependent variable, y
In such cases there is generally a correlation between the simultaneously determined explanatory variable and the error term

What is an Instrumental Variable?

An instrumental variable is a variable, z, that has the properties that changes in z are associated with changes in x, but are not correlated with e.
The instrument z will be correlated with y, but the only source of such correlation is the indirect path of z being correlated with x, which in turn determines y.

In order for a variable, z, to serve as a valid instrument for x, the following must be true:
o The instrument must be exogenous
That is, Cov(z, e) = 0
o The instrument must be correlated with the endogenous explanatory variable x
That is, Cov(z, x) ≠ 0

The IV estimator of β is:
β̂_IV = (N⁻¹ Σ_{i=1}^{N} z_i'x_i)⁻¹ (N⁻¹ Σ_{i=1}^{N} z_i'y_i) = (Z'X)⁻¹Z'Y

The Validity of Instruments
We have to use common sense and economic theory to decide if it makes sense to assume Cov(z, e) = 0
o We can't directly test this condition because we don't have an unbiased estimator for e
o The OLS estimator of e is presumed biased, and the IV estimator of e depends on the validity of the Cov(z, e) = 0 condition
We can test whether Cov(z, x) ≠ 0
o By testing whether π1 = 0 in the regression: x = π0 + π1 z + v
o This is referred to as the first-stage regression


Instrumental Variables: Intuition
An instrumental variable, Z, is uncorrelated with the disturbance e but is correlated with X (e.g., proximity to college might be correlated with schooling but not with wage residuals)
With this new variable, the IV estimator should capture only the effects on Y of shifts in X induced by Z, whereas the OLS estimator captures not only the direct effect of X on Y, but also the effect of the included measurement error and/or endogeneity
IV is not as efficient as OLS (especially if Z is only weakly correlated with X, i.e. when we have so-called "weak instruments") and only has large sample properties (consistency)
o IV results in biased coefficients. The bias can be large in the case of weak instruments (see below)


The IV Estimator with a Single Regressor and a Single Instrument
Y_i = β0 + β1 X_i + e_i

Loosely, IV regression breaks X into two parts: a part that might be correlated with e, and a part that is not.
o By isolating the part that is not correlated with e, it is possible to estimate β1.
This is done using an instrumental variable, Z_i, which is uncorrelated with e_i.
The instrumental variable detects movements in X_i that are uncorrelated with e_i, and uses these to estimate β1.
The method used to do this is called Two-Stage Least Squares and is implemented as follows:
Stage 1: First isolate the part of X that is uncorrelated with e by regressing X on Z using OLS:
X_i = π0 + π1 Z_i + v_i   (1)
o Because Z_i is uncorrelated with e_i, π0 + π1 Z_i is uncorrelated with e_i. We don't know π0 or π1, but we have estimated them, so…
o Compute the predicted values of X_i, X̂_i, where X̂_i = π̂0 + π̂1 Z_i, i = 1, …, n
Stage 2: Replace X_i by X̂_i in the regression of interest, regressing Y on X̂_i using OLS:
Y_i = β0 + β1 X̂_i + e_i   (2)
o Because X̂_i is uncorrelated with e_i in large samples, assumption A1 holds

o Thus β1 can be estimated by OLS using regression (2)
This argument relies on large samples (so that π0 and π1 are well estimated using regression (1))
The resulting estimator is called the "Two Stage Least Squares" (2SLS) estimator, β̂1_2SLS.
β̂1_2SLS is a consistent estimator of β1.



Inference using Two Stage Least Squares

Statistical inference proceeds in the usual way.

Note on standard errors:
o The OLS standard errors from the second stage regression aren't right - they don't take into account the estimation in the first stage (X̂_i is estimated).
o Instead, use a single specialized command that computes the 2SLS estimator and the correct SEs.
o As usual, use heteroskedasticity-robust SEs
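A minimal Stata sketch (all names illustrative): ivregress computes the 2SLS estimates with correct standard errors in one step; the first option reports the first-stage regression and estat firststage the instrument-strength diagnostics.

* Y: dependent variable, X1: endogenous regressor,
* W1 W2: included exogenous regressors, Z1 Z2: excluded instruments
ivregress 2sls Y W1 W2 (X1 = Z1 Z2), first vce(robust)
estat firststage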


Two Stage Least Squares - Finding the Best Instrument
o If we have more than one potential instrument, say Z2 and Z3, then we could use either Z2 or Z3 as an instrument (the model is said to be over-identified)
o The best instrument however is a linear combination of all of the exogenous variables:
X_i = π0 + π1 Z_2i + π2 Z_3i + v_i
o We can obtain the predicted values, X̂_i, by regressing X_i on Z2 and Z3, which would be the first stage regression
o We could then once again replace X_i by X̂_i in the original regression equation.

o Once again the standard errors from doing 2SLS by hand are incorrect (statistical packages usually correct for this however).

The General IV Regression Model
The general IV regression model:
Y_i = β0 + β1 X_1i + … + βk X_ki + β_{k+1} W_1i + … + β_{k+r} W_ri + e_i
Y_i is the dependent variable
o X_1i, …, X_ki are the endogenous regressors (potentially correlated with e_i)
o W_1i, …, W_ri are the included exogenous variables or included exogenous regressors (uncorrelated with e_i)
o β0, β1, …, β_{k+r} are the unknown regression coefficients
o Z_1i, …, Z_mi are the m instrumental variables (the excluded exogenous variables)

We need to introduce some new concepts and to extend some old concepts to the general IV regression model:
Terminology: identification and overidentification
2SLS with included exogenous variables
o One endogenous regressor
o Multiple endogenous regressors
Assumptions that underlie the normal sampling distribution of 2SLS
o Instrument validity (relevance and exogeneity)
o General IV regression assumptions


Identification
In general, a parameter is said to be identified if different values of the parameter would produce different distributions of the data.
In IV regression, whether the coefficients are identified depends on the relation between the number of instruments (m) and the number of endogenous regressors (k)
Intuitively, if there are fewer instruments than endogenous regressors, we can't estimate β1, …, βk
o For example, suppose k = 1 but m = 0 (we have no instruments)!

The coefficients β1, …, βk are said to be:
o Exactly identified if m = k.
There are just enough instruments to estimate β1, …, βk.
o Overidentified if m > k.
There are more than enough instruments to estimate β1, …, βk.
If so, you can test whether the instruments are valid (a test of the "overidentifying restrictions")
o Underidentified if m < k.
There are too few instruments to estimate β1, …, βk.
If so, you need to get more instruments!

General IV regression: 2SLS with one Endogenous Regressor
The regression model takes the form:
Y_i = β0 + β1 X_1i + β2 W_1i + … + β_{1+r} W_ri + e_i

o Instruments: Z_1i, …, Z_mi
o First stage
- Regress X_1 on all the exogenous regressors: regress X_1 on W_1, …, W_r and Z_1, …, Z_m using OLS
- Compute the predicted values X̂_1i, i = 1, …, n
o Second stage
- Regress Y on X̂_1 and W_1, …, W_r using OLS
- The coefficients from this second stage regression are the 2SLS estimators, but the standard errors are again wrong

General IV regression: 2SLS with Multiple Endogenous Regressors
Y_i = β0 + β1 X_1i + … + βk X_ki + β_{k+1} W_1i + … + β_{k+r} W_ri + e_i

Instruments: Z_1i, …, Z_mi
Now there are k first stage regressions:
o Regress X_1 on all the exogenous regressors: regress X_1 on W_1, …, W_r and Z_1, …, Z_m using OLS
o Compute the predicted values X̂_1i, i = 1, …, n
o Regress X_2 on all the exogenous regressors: regress X_2 on W_1, …, W_r and Z_1, …, Z_m using OLS
o Compute the predicted values X̂_2i, i = 1, …, n
o Repeat for all X's, obtaining X̂_1, …, X̂_k

Second stage
o Regress Y on X̂_1, …, X̂_k and W_1, …, W_r using OLS
o The coefficients from this second stage regression are the 2SLS estimators (but the standard errors are wrong)

A "Valid" Set of Instruments in the General Case
The set of instruments must be relevant and exogenous:

1. Instrument relevance:
o General case, multiple X's: Suppose the second stage regression could be run using the predicted values from the population first stage regression. Then: there is no perfect multicollinearity in this (infeasible) second stage regression
o Special case of one X: At least one instrument must enter the population counterpart of the first stage regression.
2. Instrument exogeneity:
o All the instruments are uncorrelated with the error term: cov(Z_1i, e_i) = 0, …, cov(Z_mi, e_i) = 0

Where Do Valid Instruments Come From?
Valid instruments are (1) relevant and (2) exogenous
One general way to find instruments is to look for exogenous variation - variation that is "as if" randomly assigned in a randomized experiment - that affects X.
o Rainfall shifts the supply curve for butter but not the demand curve; rainfall is "as if" randomly assigned
o Sales taxes shift the supply curve for cigarettes but not the demand curve; sales taxes are "as if" randomly assigned


"Weak Instruments"
If Cov(Z, X) is weak, IV no longer has such desirable asymptotic properties
o IV estimates are not unbiased, and the bias tends to be larger when instruments are weak (even with very large datasets)
o Weak instruments tend to bias the results towards the OLS estimates
o Adding more and more instruments to improve asymptotic efficiency does not solve the problem
Recommendation: always test the 'strength' of your instrument(s) by reporting the F-test on the instruments in the first stage regression



Specification Tests
Testing for Endogeneity - Wu-Hausman Test
Since OLS is preferred to IV (or 2SLS) if we do not have an endogeneity problem, we'd like to be able to test for endogeneity
o If we do not have endogeneity, both OLS and IV are consistent, but IV is inefficient
The idea of the Hausman test is to see if the estimates from OLS and IV are different
o If all explanatory variables are exogenous, differences in the OLS and 2SLS results should be due to sampling error only
An auxiliary regression is the easiest way to do this test

Consider the following regression:
Y_i = β0 + β1 X_1i + β2 W_1i + β3 W_2i + e_i
o with Z_1 and Z_2 as additional exogenous variables (i.e. additional instruments)
If X_1 is uncorrelated with e we should estimate this equation by OLS
Hausman (1978) suggested comparing the OLS and 2SLS estimates and determining whether the differences are significant. If they differ significantly, we conclude that X_1 is an endogenous variable.
This can be achieved by estimating the first stage regression:
X_1i = α0 + α1 Z_1i + α2 Z_2i + α3 W_1i + α4 W_2i + v_i
Since each instrument is uncorrelated with e_i, X_1i is uncorrelated with e_i only if v_i is uncorrelated with e_i.
To test this, we run the following regression using OLS:
Y_i = β0 + β1 X_1i + β2 W_1i + β3 W_2i + γ1 v̂_i + error
and test whether γ1 = 0 using a standard t-test. If we reject the null hypothesis we conclude that X_1 is endogenous, since v_i and e_i will be correlated.

Extending this to the case of more than one endogenous regressor is straightforward:
o Estimate each of the first stage regressions using OLS, and save the residuals (v̂)
o Include each of the first-stage residuals in the structural equation
o Use an F-test to test the joint hypothesis that the coefficients on all first-stage residuals are zero, i.e. γ1 = γ2 = … = 0
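A minimal Stata sketch of both versions of the test (illustrative names, continuing the example above):

ivregress 2sls Y W1 W2 (X1 = Z1 Z2)
estat endogenous            // Durbin and Wu-Hausman tests of H0: X1 is exogenous

* The regression-based version by hand:
regress X1 Z1 Z2 W1 W2      // first stage
predict double vhat, residuals
regress Y X1 W1 W2 vhat     // structural equation plus first-stage residual
test vhat                   // rejection => X1 is endogenous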

Testing Overidentifying Restrictions
An IV must satisfy two conditions:
(1) relevance: Cov(z, x) ≠ 0
(2) exogeneity: Cov(z, e) = 0
We cannot test (2) because it involves a correlation between the IV and an unobserved error.
If we have more than one instrument, however, we can effectively test whether some of them are uncorrelated with the structural error
o Hausman (1978) suggested comparing the 2SLS estimator using all instruments to 2SLS using a subset that just identifies the equation of interest
If all instruments are valid, the estimates should differ only as a result of sampling error
Consider the above example:
Y_i = β0 + β1 X_1i + β2 W_1i + β3 W_2i + e_i
o with Z_1 and Z_2 as additional exogenous variables (i.e. additional instruments)
o Estimate this equation by IV using only Z_1 as an instrument, and compute the residuals, ê_i.
o We can now test whether Z_2 and ê_i are correlated. If they are, Z_2 is not a valid instrument.
o This tells us nothing about whether Z_1 and e_i are correlated (in fact, for this test to be relevant we have to assume that they are not)
o If, however, the two instruments are chosen using the same logic (e.g. mother's and father's education levels), finding that Z_2 and ê_i are correlated casts doubt on the use of Z_1 as an instrument.
o Note: if we have a single instrument then there are no overidentifying restrictions and we cannot use this test; if we have two IVs for X_1 we have one overidentifying restriction; if we have three we have two overidentifying restrictions, and so on.

General Method for Testing Overidentifying Restrictions (The Sargan test)
Estimate the structural equation by 2SLS and obtain the residuals, ê_i.
Regress ê_i on all exogenous variables. Obtain the R-squared, R_1².
Under the null hypothesis that all IVs are uncorrelated with e, nR_1² ~ χ²_q, where q is the number of instrumental variables from outside the model minus the total number of endogenous explanatory variables.
If the test statistic exceeds the critical value we reject the null hypothesis and conclude that at least some of the IVs are not exogenous

If we reject the null hypothesis, our logic for choosing IVs must be re-examined
o The test does not tell us which of the IVs fail, however
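A minimal Stata sketch (illustrative names; with two instruments and one endogenous regressor there is q = 1 overidentifying restriction):

ivregress 2sls Y W1 W2 (X1 = Z1 Z2)
estat overid                // Sargan and Basmann tests of the overidentifying restrictions

* The nR^2 version by hand:
predict double e2sls, residuals
regress e2sls Z1 Z2 W1 W2
display "Sargan nR2 = " e(N)*e(r2) "   p-value = " chi2tail(1, e(N)*e(r2))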






Examples
We want to estimate the returns to exogenous changes in schooling, estimating a regression of the form:
ln(wage) = β0 + β1 School + e
Such a specification may suffer from a bias due to omitted ability
Part of the co-variation in schooling and wages is because both are affected by the ability of the person
Most datasets lack measures of individual ability, so a regression of earnings on schooling has an error that includes unobserved ability and is hence correlated with the regressor schooling


Example 1
Card (1995), amongst others, proposed proximity to college or university as an instrumental variable
Card (1995) used wage and education data for a sample of men in 1976 to estimate the return to education
He used a dummy variable for whether someone grew up near a four-year college as an IV for education
This instrument is likely to satisfy the second condition (since people whose homes are a long way from a college or university are less likely to attend)
This instrument is also likely to satisfy the first condition (though it may be argued that people who live a long way from a college or university are more likely to be in low-wage labour markets - need to control for this!)
In a log(wage) equation he included other standard controls: experience, a black dummy, dummy variables for living in an SMSA (standard metropolitan statistical area) and living in the South, a full set of regional dummies, and an SMSA dummy for where the man was living in 1966.
He finds that the coefficient from the IV regression is nearly twice as large as the OLS estimate, while the standard errors from the IV estimator are also much larger than those using OLS - the price we pay to get a consistent estimator of the returns to education.



Example 2
Angrist and Krueger (1991) proposed quarter of birth as an instrumental variable
This satisfies the first condition, since we would not expect there to be an effect of quarter of birth on earnings if the regression includes age in years
The second condition may also be satisfied, since birth date determines age of first entry into school in the USA, which in turn may affect years of schooling since laws often specify a minimum school leaving age
o e.g. if children are required to enter school in the September of the year in which they turn six and December the 31st is the cut-off date, then children born in the first quarter will be about 6 and ¾ when they enter school, while those born in the fourth quarter will be about 5 and ¾. So children have different lengths of schooling due to their birth dates.
o Thus, individuals born earlier in the year reach the minimum school leaving age (e.g. the 16th or 17th birthday) at a lower grade than people born later in the year.
o Therefore, those who want to drop out as soon as legally possible can leave school with less education.
Wage effects of that part of the variation in school years are not due to the impact of omitted ability.
Quarter of birth can thus be used as an instrument for schooling in the above equation.
But the correlation between quarter of birth and education is fairly weak (the F-statistic from the first stage regression is sometimes less than two), i.e. a weak instruments problem.


Systems of Regression Equations
Seemingly Unrelated Regressions
Consider a model with G linear equations:
y_1 = x_1 β_1 + e_1
y_2 = x_2 β_2 + e_2
…
y_G = x_G β_G + e_G
Often, x_g is the same for all g, but this is not necessary
Such a system is called a seemingly unrelated regression (SUR) model (Zellner, 1962)
o Seemingly unrelated since each equation has its own vector β_g
Correlation across the errors in the different equations can provide links that can be exploited, however

Assumptions concerning how the unobservables (e_g) are related to the explanatory variables (x_1, x_2, …, x_G) are crucial for determining the appropriate estimator
Often it is assumed that E(e_g|x_1, x_2, …, x_G) = 0
o Implying that e_g is uncorrelated with the explanatory variables in all equations
o If this assumption holds and if the x_g are not the same across g, then any explanatory variables excluded from equation g are assumed to have no effect on expected y_g once x_g has been controlled for, i.e.
E(y_g|x_1, x_2, …, x_G) = E(y_g|x_g) = x_g β_g

The SUR model can be expressed as y_i = X_i β + ε_i, with y_i = (y_i1, y_i2, …, y_iG)', ε_i = (e_i1, e_i2, …, e_iG)',
X_i the block-diagonal matrix with x_i1, x_i2, …, x_iG along the diagonal and zeros elsewhere, and β = (β_1', β_2', …, β_G')'.
The key orthogonality condition for consistent estimation of β is: E(X_i'ε_i) = 0 (Assumption 1)
o This assumption doesn't require x_ih and e_ig to be uncorrelated when h ≠ g
o A stronger assumption is E(ε_i|X_i) = 0, which requires that x_ih and e_ig be uncorrelated for every h and g (both h = g and h ≠ g)
Under Assumption 1 the vector β satisfies:
E(X_i'(y_i − X_i β)) = 0   (1)
To be able to estimate β we need to assume that it is the only K × 1 vector that satisfies (1):
⟹ E(X_i'X_i) is non-singular (has rank K) (Assumption 2)

The System OLS estimator of β is then:
β̂_SOLS = (N⁻¹ Σ_{i=1}^{N} X_i'X_i)⁻¹ (N⁻¹ Σ_{i=1}^{N} X_i'y_i)
with Σ_i X_i'X_i block-diagonal with blocks Σ_i x_ig'x_ig, g = 1, …, G, and Σ_i X_i'y_i = (Σ_i x_i1'y_i1, …, Σ_i x_iG'y_iG)'.
This SOLS estimator can be written as β̂ = (β̂_1', β̂_2', …, β̂_G')', where each β̂_g is the single-equation OLS estimator for the gth equation
o In other words, system OLS estimation of a SUR model is equivalent to OLS equation by equation
System OLS is consistent under fairly weak assumptions (and inference can be made robust using a robust variance matrix estimator)
o Using this variance estimator, testing cross-equation restrictions is straightforward using the Wald test

Strengthening Assumption 1 and adding assumptions on the conditional variance matrix of ε_i allows us to do better than OLS by using GLS
E(X_i ⊗ ε_i) = 0 (Assumption 3)
o A sufficient condition for Assumption 3 is E(ε_i|X_i) = 0
The second moment matrix of ε_i is critical for GLS estimation of systems of equations

Define the G × G positive semi-definite matrix Ω as:
Ω ≡ E(ε_i ε_i')
In applications where the dependent variables satisfy an adding-up constraint across equations - e.g. expenditure shares - an equation must be dropped to ensure that Ω is non-singular
After defining Ω it is possible to state a weaker version of Assumption 3:
E(X_i'Ω⁻¹ε_i) = 0
In place of Assumption 2 we have:
Ω is positive definite and E(X_i'Ω⁻¹X_i) is non-singular (Assumption 4)


The usual motivation for GLS is to transform a system of equations where the error has a nonscalar variance-covariance matrix into a system where the error vector has a scalar variance-covariance matrix
This is achieved by multiplying our regression system by Ω^(−1/2):
Ω^(−1/2) y_i = (Ω^(−1/2) X_i) β + Ω^(−1/2) ε_i   or   y_i* = X_i* β + ε_i*   (2)
from which it is clear that E(ε_i* ε_i*') = I_G
We can then estimate equation (2) using system OLS, obtaining the system GLS estimator β*:
β* = (Σ_{i=1}^{N} X_i*'X_i*)⁻¹ (Σ_{i=1}^{N} X_i*'y_i*) = (Σ_{i=1}^{N} X_i'Ω⁻¹X_i)⁻¹ (Σ_{i=1}^{N} X_i'Ω⁻¹y_i)

Obtaining the GLS estimator requires knowing Ω up to scale
o i.e. we must be able to write Ω = σ²C, where C is a known G × G positive definite matrix and σ² is allowed to be an unknown constant
As a result, the more normal approach is to replace Ω with a consistent estimator and estimate β by Feasible GLS (FGLS)
o A natural candidate is Ω̂ = N⁻¹ Σ_{i=1}^{N} ε̂_i ε̂_i', where the ε̂_i are the SOLS residuals
o FGLS then involves replacing Ω with Ω̂, giving the FGLS estimator:
β̂_FGLS = (Σ_{i=1}^{N} X_i'Ω̂⁻¹X_i)⁻¹ (Σ_{i=1}^{N} X_i'Ω̂⁻¹y_i)


Why use FGLS?
FGLS is computationally more difficult and less robust.
Under an additional assumption, FGLS is more efficient than SOLS (and other estimators). This assumption is called system homoscedasticity:
E(X_i'Ω⁻¹ε_i ε_i'Ω⁻¹X_i) = E(X_i'Ω⁻¹X_i), where Ω ≡ E(ε_i ε_i')   (Assumption 5)
Estimating OLS equation by equation doesn't allow one to test cross-equation restrictions (though an alternative variance matrix can be constructed that allows for such testing)
Testing multiple restrictions using FGLS is straightforward using a Wald test (or associated F-test)

Estimate the model using FGLS with and without the restrictions on β, using the same estimator of Ω (usually that from the unrestricted SOLS)
Let ε̃_i denote the residuals from the constrained FGLS using variance matrix Ω̂
Under the null hypothesis and Assumptions 3-5 we have:
Σ_{i=1}^{N} ε̃_i'Ω̂⁻¹ε̃_i − Σ_{i=1}^{N} ε̂_i'Ω̂⁻¹ε̂_i  ~  χ²_Q
where Q is the number of restrictions imposed on β
This is simply the difference between the transformed sums of squared residuals from the restricted and unrestricted models
An alternative F-statistic can also be used:
F = [(Σ_{i=1}^{N} ε̃_i'Ω̂⁻¹ε̃_i − Σ_{i=1}^{N} ε̂_i'Ω̂⁻¹ε̂_i)/Q] / [Σ_{i=1}^{N} ε̂_i'Ω̂⁻¹ε̂_i / (NG − K)]

Note: if the same regressors appear in all equations, OLS and FGLS are equivalent
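A minimal Stata sketch of FGLS estimation of a two-equation SUR system and a cross-equation Wald test (equation and variable names illustrative):

* Each equation has its own regressors
sureg (y1 x1 x2) (y2 x2 x3)

* Wald test of the cross-equation restriction that x2 has the same coefficient in both equations
test [y1]x2 = [y2]x2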

Simultaneous Equation Models (and the 2SLS Estimator)
Simultaneity arises when one or more of the explanatory variables is jointly determined with the dependent variable, usually through an equilibrium relationship
Simultaneous Equation Models (SEMs) differ from previous models because in each model there are two or more dependent variables rather than just one
SEMs involve a linear set of equations that jointly determines a set of G outcomes, where endogenous variables may appear on the right hand side with exogenous variables
The classic example of a SEM is a demand and supply system, which jointly determines prices and quantities
The methods used to estimate SEMs are applicable more broadly, for example to problems of measurement error and omitted variables

Example 1: The classic example of an SEM is a supply and demand equation for some commodity (e.g. coffee) or input to production (e.g. labour)
Consider a simple market supply function:
q_s = α1 p + β1 z1 + e1
where q is quantity or output, p is price and z1 is some observed variable affecting supply of the commodity (e.g. weather)
The error term, e1, contains other factors that affect supply
The equation is an example of a structural equation, i.e. it is derivable from economic theory and has a causal interpretation
The coefficient α1 measures how supply of the product changes when the price changes. If price and quantity are measured in logs, the coefficient gives the price elasticity of supply

Plotting the supply function, we plot output as a function of price, holding z1 and e1 fixed.
o Changes in either of these two factors lead to shifts in the supply curve; the difference being that z1 is observed, while e1 is not
The crucial assumption that we make for OLS is that the independent variables are independent of the error term
In this case, this assumption does not hold
o Assuming that the demand curve is downward sloping (or vertical), a shift in the supply curve produces a change in both price and quantity. Thus the error term is correlated with price
The important thing to remember is that supply and demand interact to jointly determine the market price of a good and the amount of it that is sold (i.e. we only observe the equilibrium values of p and q).

An econometric model that explains market price and quantity should therefore consist of two equations, one for supply and one for demand:
Supply:  q_s = α1 p + β1 z1 + e1
Demand:  q_d = α2 p + β2 z2 + e2
where q_d is the quantity demanded and z2 is an observed variable affecting the demand for the commodity (e.g. income)
In this model the variables p and q are called endogenous variables because their values are determined within the system we have created
The variables z1 and z2 have values that are given to us, and which are determined outside this system.
o As such, these are exogenous variables. It is assumed that the variables z1 and z2 are uncorrelated with the demand and supply errors, e1 and e2
The error terms in the supply and demand equations are assumed to have the usual properties; i.e. they have a constant mean and variance, and are independently distributed
Example 2: Labour Supply-Wage Offer system for married women.
In equilibrium, we can write the system as:
h = γ₁ w + δ₁₁ exper + δ₁₂ exper² + δ₁₃ othinc + δ₁₄ kids + e₁
w = γ₂ h + δ₂₁ exper + δ₂₂ exper² + δ₂₃ educ + e₂
So other sources of income and the number of children affect labour supply but not the wage offer, and education affects the wage offer but not labour supply
In this system, h and w are endogenous. Traditional SEM analysis would take everything else as exogenous
The nonlinearity in exper (i.e. exper²) requires no special treatment.


Example 3: City Crime Rates and the Size of the Police Force
The idea of an underlying counterfactual is critical to sensible applications of SEMs.
o It makes sense to think of a demand function in isolation, and similarly with a supply function.
o They are brought together as a way of determining the observed data:
crime = γ₁ police + δ₁₁ age + δ₁₂ unem + δ₁₃ wage + e₁
police = γ₂ crime + δ₂₁ age + δ₂₂ unem + δ₂₃ wage + δ₂₄ election + e₂
These two equations form a legitimate SEM: each equation stands on its own.
o In effect, they describe two different sides of a "market."
o They come together as a system under assumptions about how the observed outcomes, (crime, police), are determined.

An important point to remember when using SEMs is that each equation in the model should have a ceteris paribus, causal interpretation
In Example 1, for instance, the two equations describe entirely different relationships:
o The supply equation describes the behaviour of firms
o The demand equation is a behavioural relationship for consumers
Each equation therefore has a ceteris paribus interpretation and stands on its own
They become linked in the econometric analysis only because the observed price and quantity are determined by the intersection of supply and demand.
Just because two variables are determined simultaneously doesn't mean that a SEM is suitable.
Example 4: Joint Determination of Family Retirement Saving and Housing Expenditure:
This is a poor application of SEMs. Suppose the population is all families in a particular country:
retirement = γ₁ housing + δ₁₁ inc + δ₁₂ educ + δ₁₃ age + e₁
housing = γ₂ retirement + δ₂₁ inc + δ₂₂ educ + δ₂₃ age + e₂
Neither of these equations stands on its own. What would it mean, in the first equation, to study the effect of changing income on retirement saving holding housing expenditure fixed?
o If income increases, families will generally change the optimal mix of housing and retirement expenditures. Likewise, the second equation makes it seem as though we want to know the impact of a change in income, education or age on housing expenditure holding retirement spending constant.
Even if one wants to model the joint determination of y₁ and y₂ in this way, the parameters are not interesting. There is no interesting counterfactual.

Two-Equation System
General two-equation structural system (in the population):
y₁ = γ₁ y₂ + z₁δ₁ + e₁
y₂ = γ₂ y₁ + z₂δ₂ + e₂
where z₁ is 1 × L₁ and z₂ is 1 × L₂. Let z be 1 × L and contain all (non-redundant) exogenous variables:
E(z′e₁) = E(z′e₂) = 0
where in almost all applications z₁ and z₂ (and therefore z) include unity (i.e. a constant term).
o We act as if that is true here, so that the structural errors e₁ and e₂ have zero means.
γ₁, δ₁, γ₂ and δ₂ are the structural parameters.
If a variable is exogenous in any equation, it is exogenous in all equations.


Reduced Form Equations
Although we do not need them to study identification, we can obtain reduced forms for y₁ and y₂ if:
γ₁γ₂ ≠ 1
Generally, a reduced form expresses an endogenous variable as a function of the exogenous variables and unobserved errors.
In this case, substitute the second equation into the first and solve for y₁:
y₁ = γ₁(γ₂ y₁ + z₂δ₂ + e₂) + z₁δ₁ + e₁
y₁ = γ₁γ₂ y₁ + z₁δ₁ + z₂(γ₁δ₂) + e₁ + γ₁e₂
Therefore, if γ₁γ₂ ≠ 1,
y₁ = (1 − γ₁γ₂)⁻¹ (z₁δ₁ + z₂γ₁δ₂ + e₁ + γ₁e₂) ≡ zπ₁ + v₁
where π₁ is the L × 1 vector of reduced form parameters and v₁ = (1 − γ₁γ₂)⁻¹(e₁ + γ₁e₂) is a reduced form error.

We can do the same for y₂, so we have:
y₁ = zπ₁ + v₁
y₂ = zπ₂ + v₂
Both reduced form errors satisfy:
E(z′v₁) = E(z′v₂) = 0
which means π₁ and π₂ can be consistently estimated by OLS on a random sample (provided E(z′z) is non-singular).

Simultaneity Bias in OLS
Consider the following example:
y₁ = α₁ y₂ + β₁ z₁ + e₁   (1)
y₂ = α₂ y₁ + β₂ z₂ + e₂   (2)
To show that y₂ is generally correlated with e₁ we can solve for y₂ in terms of the exogenous variables and the error terms. Replacing y₁ in (2) with the expression in (1) gives
(1 − α₂α₁) y₂ = α₂β₁ z₁ + β₂ z₂ + α₂ e₁ + e₂   (3)
Assuming that α₂α₁ ≠ 1 we can divide (3) by (1 − α₂α₁) to obtain
y₂ = π₂₁ z₁ + π₂₂ z₂ + u₂   (4)
where π₂₁ = α₂β₁/(1 − α₂α₁); π₂₂ = β₂/(1 − α₂α₁) and u₂ = (α₂e₁ + e₂)/(1 − α₂α₁)
Equation (4) expresses y₂ in terms of the exogenous variables and the error terms and is called the reduced form equation for y₂.
The parameters π₂₁ and π₂₂ are non-linear functions of the structural parameters, and are termed the reduced form parameters.

The reduced form error, u₂, is a linear function of the structural error terms, e₁ and e₂.
Since the z's are uncorrelated with the e's, u₂ is also uncorrelated with the z's; hence the reduced form parameters in (4) can be estimated by OLS.
- The reduced form equations can be important for economic analysis. These equations relate the equilibrium values of the endogenous variables to the exogenous variables.
Equation (4) also tells us that estimation of equation (1) by OLS will result in biased and inconsistent estimates of α₁ and β₁.
- In equation (1) the issue is whether y₂ and e₁ are correlated (z₁ and e₁ are uncorrelated by assumption)
- From (4) we see that y₂ and e₁ are correlated if and only if u₂ and e₁ are correlated
- Since u₂ is a linear combination of e₁ and e₂, it is generally correlated with e₁
When y₂ is correlated with e₁ because of simultaneity, we say that OLS suffers from simultaneity bias
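A small simulation makes the bias concrete. This is an illustrative sketch, not part of the notes: the structural parameters α₁ = 0.5, α₂ = −0.8, β₁ = β₂ = 1 are arbitrary, the data are generated from the reduced form implied by (3)-(4), and OLS on equation (1) is compared with IV using z₂ as the instrument for y₂:

* simulate a two-equation SEM and compare OLS with IV (illustrative parameter values)
clear
set seed 12345
set obs 5000
gen z1 = rnormal()
gen z2 = rnormal()
gen e1 = rnormal()
gen e2 = rnormal()
scalar a1 = 0.5
scalar a2 = -0.8
* reduced form for y2 from equation (3), then y1 from equation (1)
gen y2 = (a2*z1 + z2 + a2*e1 + e2) / (1 - a2*a1)
gen y1 = a1*y2 + z1 + e1
regress y1 y2 z1                 // OLS: inconsistent for a1 = 0.5
ivregress 2sls y1 z1 (y2 = z2)   // IV with instrument z2: consistent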

When are the Structural Parameters Identified?
Identification in the two-equation case is straightforward. Consider identification of the first structural equation
y₁ = γ₁ y₂ + z₁δ₁ + e₁
y₂ = zπ₂ + v₂
Because y₂ is the only endogenous explanatory variable, we need at least one instrument for it
o That means we must have something in z, with a nonzero coefficient in the reduced form equation, that is not also in z₁.
o But y₂ = γ₂ y₁ + z₂δ₂ + e₂, and so π₂ has a nonzero coefficient on something not in z₁ if and only if there is at least one element of z₂ that is not also in z₁ and that has a nonzero coefficient in δ₂.

The Instrumental Variables Solution
As we saw last time, the IV solution of two-stage least squares can be used to solve the problem of endogenous explanatory variables.
This is also true for SEMs – the major difference being that, because we specify a structural equation for each endogenous variable, we can immediately see whether sufficient IVs are available to estimate either equation.
Consider the following example:
q^s = α₁ p + β₁ z₁ + e₁
q^d = α₂ p + e₂
Here we can think of the coffee market as an example, with q being say per capita coffee consumption, p being the average price per jar and z₁ being something like the weather (in Brazil!) that affects supply. It is assumed that z₁ is exogenous to both the supply and demand equations

The first question to be addressed is: given a random sample on q, p and z₁, which of the above equations can be estimated, i.e. which is an identified equation?
It turns out that the demand equation is identified, but the supply equation is not
o This is indicated by our rules for instruments
o We can use z₁ as an instrument for price in the demand equation (see the sketch below)
o Because z₁ appears in the supply equation, however, we cannot use it as an instrument in that equation
o In order to estimate the supply equation we would need an observed exogenous variable that shifts the demand curve
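A minimal Stata sketch of estimating the identified demand equation by 2SLS, assuming hypothetical variable names q (log per capita consumption), p (log price) and weather (the supply shifter z₁):

* demand equation: weather shifts supply and so serves as an instrument for price
ivregress 2sls q (p = weather), vce(robust)

The supply equation cannot be estimated this way: weather already appears in it, and no observed demand shifter is available to serve as an instrument.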

Considering the more general two-equation model:
y₁ = α₁ y₂ + β₁ z₁ + e₁
y₂ = α₂ y₁ + β₂ z₂ + e₂
where y₁ and y₂ are the endogenous variables, e₁ and e₂ are the structural error terms, and z₁ and z₂ now denote sets of exogenous regressors, k₁ and k₂ of them, that appear in the first and second equation respectively, i.e. z₁ = (z₁₁, z₁₂, …, z₁ₖ₁) and z₂ = (z₂₁, z₂₂, …, z₂ₖ₂).
In many cases z₁ and z₂ will overlap
The assumption that z₁ and z₂ contain different exogenous variables means that we impose exclusion restrictions on the model, i.e. we assume that certain exogenous regressors do not appear in the first equation and others are absent from the second. This allows us to distinguish between the two structural equations.

The Order and Rank Condition
The order condition states that the number of exogenous variables not appearing in the first equation must be at least as large as the number of included right-hand-side endogenous variables. We can see this by counting to check whether we have enough potential instrumental variables.
The rank condition requires more: at least one of the exogenous regressors excluded from the first equation must have a non-zero population coefficient in the second equation. This can be tested using a t or F test.
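A quick first-stage check of the rank condition in Stata (a sketch with hypothetical names: y2 is the included endogenous regressor, z1 the included exogenous regressor, and z2-z4 the excluded exogenous variables):

* rank condition: the excluded exogenous variables must matter in the reduced form for y2
regress y2 z1 z2 z3 z4
test z2 z3 z4

* the same information is reported by the first-stage diagnostics after 2SLS
ivregress 2sls y1 z1 (y2 = z2 z3 z4)
estat firststage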

Example 1
y₁ = γ₁₂ y₂ + δ₁₁ z₁ + δ₁₂ z₂ + δ₁₃ z₃ + δ₁₄ z₄ + e₁
y₂ = γ₂₁ y₁ + δ₂₁ z₁ + δ₂₂ z₂ + e₂
The first equation fails the order condition, and is unidentified. The second equation satisfies the order condition, and will satisfy the rank condition if at least one of δ₁₃ ≠ 0 or δ₁₄ ≠ 0 holds.
If δ₁₃ and δ₁₄ are both different from zero, there is one overidentifying restriction in equation two.
But if, say, δ₁₃ = 0, the second equation is only just identified because there is only one instrument for y₁: z₄.

Example 2
Consider the system
h = γ₁ w + δ₁₁ exper + δ₁₂ exper² + δ₁₃ othinc + δ₁₄ kids + e₁
w = γ₂ h + δ₂₁ exper + δ₂₂ exper² + δ₂₃ educ + e₂
The labour supply function is identified if and only if δ₂₃ ≠ 0.
The wage offer function is identified if and only if at least one of δ₁₃ and δ₁₄ is different from zero.
Note: our imposing exclusion restrictions means that it must be the case that educ is legitimately excluded from the supply equation and othinc and kids are properly excluded from the wage offer equation.
Note: in general, identification of any particular equation of an SEM depends on the structure of the other equations in the SEM.

Estimation
Once we have determined that an equation is identified, we can estimate it by 2SLS
o The instruments consist of the exogenous variables appearing in either equation
Tests for endogeneity, overidentifying restrictions and so on proceed as for the standard IV estimator
It turns out that, when a system with two or more equations is correctly specified and certain additional assumptions hold, system estimation methods (e.g. Three-Stage Least Squares, 3SLS) are generally more efficient than estimating each equation by 2SLS

Systems with More Than Two Equations
SEMs can consist of more than two equations
Studying the general identification of these models is not straightforward
Once an equation has been shown to be identified, it can be estimated by 2SLS
Consider the following three-equation system
y₁ = α₁₂ y₂ + α₁₃ y₃ + β₁₁ z₁ + e₁   (1)
y₂ = α₂₁ y₁ + β₂₁ z₁ + β₂₂ z₂ + β₂₃ z₃ + e₂   (2)
y₃ = α₃₂ y₂ + β₃₁ z₁ + β₃₂ z₂ + β₃₃ z₃ + β₃₄ z₄ + e₃   (3)
It is more difficult to show that an equation in a SEM with more than two equations is identified
It is clear however that (3) is not identified, since all exogenous regressors are included in the equation, leaving no instruments for y₂ – in the terms of last week, an unidentified equation
Equation (1) on the other hand looks promising; we have three exogenous regressors excluded from the regression, z₂, z₃ and z₄, and only two endogenous regressors, y₂ and y₃ – this equation is therefore potentially overidentified
In general, an equation in any SEM satisfies the order condition for identification if the number of exogenous variables excluded from the equation is at least as large as the number of endogenous regressors
o As such, the order condition in (2) is also satisfied since we have one excluded exogenous regressor, z₄, and one endogenous regressor, y₁ – the equation is potentially exactly identified
Identification of an equation depends on the parameters (which we can never know for sure) in the other equations, however
o For example, if β₃₄ = 0 in (3) then (2) is not identified, as z₄ is useless as an instrument for y₁



Labour Supply Example
Using the mroz.dta dataset, consider the following SEM:
hours = γ₁ lwage + δ₁₁ educ + δ₁₂ nwifeinc + δ₁₃ age + δ₁₄ kidslt6 + δ₁₅ kidsge6 + e₁
lwage = γ₂ hours + δ₂₁ educ + δ₂₂ exper + δ₂₃ expersq + e₂
Instruments:
o exper and expersq are exogenous variables excluded from the hours worked equation
o nwifeinc, age, kidslt6 and kidsge6 are exogenous variables excluded from the wage equation



regress hours lwage educ nwifeinc age kidslt6 kidsge6


Source | SS df MS Number of obs = 428
-------------+------------------------------ F( 6, 421) = 5.04
Model | 17228385.4 6 2871397.56 Prob > F = 0.0001
Residual | 240082635 421 570267.54 R-squared = 0.0670
-------------+------------------------------ Adj R-squared = 0.0537
Total | 257311020 427 602601.92 Root MSE = 755.16

hours | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
lwage | -17.40781 54.21544 -0.32 0.748 -123.9745 89.15886
educ | -14.44486 17.96793 -0.80 0.422 -49.76289 20.87317
nwifeinc | -4.245807 3.655815 -1.16 0.246 -11.43173 2.940117
age | -7.729976 5.52945 -1.40 0.163 -18.59874 3.138792
kidslt6 | -342.5048 100.0059 -3.42 0.001 -539.078 -145.9317
kidsge6 | -115.0205 30.82925 -3.73 0.000 -175.619 -54.42208
_cons | 2114.697 340.1307 6.22 0.000 1446.131 2783.263


Note: OLS gives essentially a zero slope for the labour supply function

Use TSLS with instruments exper and expersq for lwage

ivregress 2sls hours educ nwifeinc age kidslt6 kidsge6 (lwage = exper expersq)

Instrumental variables (2SLS) regression Number of obs = 428
Wald chi2(6) = 20.80
Prob > chi2 = 0.0020
R-squared = .
Root MSE = 1291.2

hours | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
lwage | 1544.818 476.7912 3.24 0.001 610.3248 2479.312
educ | -177.449 57.66517 -3.08 0.002 -290.4706 -64.42731
nwifeinc | -9.24912 6.427897 -1.44 0.150 -21.84757 3.349328
age | -10.78409 9.498705 -1.14 0.256 -29.40121 7.833032
kidslt6 | -210.8339 175.4811 -1.20 0.230 -554.7705 133.1028
kidsge6 | -47.55707 56.45049 -0.84 0.400 -158.198 63.08385
_cons | 2432.198 589.293 4.13 0.000 1277.205 3587.191

Instrumented: lwage
Instruments: educ nwifeinc age kidslt6 kidsge6 exper expersq

Note: Now we have a very strong upward slope
Do we believe the exclusion restrictions excluding exper and expersq in the
labour supply function?

The heteroscedasticity-robust standard error is substantially larger on lwage, and so
we should use robust inference:

ivregress 2sls hours educ nwifeinc age kidslt6 kidsge6 (lwage = exper expersq),
vce(robust)

Instrumental variables (2SLS) regression Number of obs = 428
Wald chi2(6) = 15.41
Prob > chi2 = 0.0173
R-squared = .
Root MSE = 1291.2

| Robust
hours | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
lwage | 1544.818 598.8004 2.58 0.010 371.1914 2718.446
educ | -177.449 66.84514 -2.65 0.008 -308.463 -46.43491
nwifeinc | -9.24912 5.231389 -1.77 0.077 -19.50245 1.004215
age | -10.78409 10.57756 -1.02 0.308 -31.51573 9.947555
kidslt6 | -210.8339 203.9118 -1.03 0.301 -610.4936 188.8259
kidsge6 | -47.55707 56.47944 -0.84 0.400 -158.2547 63.14059
_cons | 2432.198 611.223 3.98 0.000 1234.223 3630.173
Instrumented: lwage
Instruments: educ nwifeinc age kidslt6 kidsge6 exper expersq


Use 3SLS on the system (though 3SLS does not allow inference robust to heteroscedasticity)

reg3 (hours lwage educ nwifeinc age kidslt6 kidsge6) (lwage hours educ exper expersq)

Three-stage least-squares regression
----------------------------------------------------------------------
Equation Obs Parms RMSE "R-sq" chi2 P
----------------------------------------------------------------------
hours 428 6 1368.362 -2.1145 34.54 0.0000
lwage 428 4 .6892585 0.0895 79.87 0.0000
----------------------------------------------------------------------

| Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
hours |
lwage | 1676.933 431.1689 3.89 0.000 831.8576 2522.009
educ | -205.0267 51.8473 -3.95 0.000 -306.6455 -103.4078
nwifeinc | .3678949 3.451518 0.11 0.915 -6.396955 7.132745
age | -12.28121 8.261529 -1.49 0.137 -28.47351 3.911095
kidslt6 | -200.5672 134.2685 -1.49 0.135 -463.7286 62.59415
kidsge6 | -48.63986 35.95136 -1.35 0.176 -119.1032 21.82352
_cons | 2504.799 535.8919 4.67 0.000 1454.47 3555.128
-------------+----------------------------------------------------------------
lwage |
hours | .000201 .0002109 0.95 0.340 -.0002123 .0006143
educ | .1129699 .0151452 7.46 0.000 .0832858 .1426539
exper | .0208906 .0142782 1.46 0.143 -.0070942 .0488753
expersq | -.0002943 .0002614 -1.13 0.260 -.0008066 .000218
_cons | -.7051105 .3045904 -2.31 0.021 -1.302097 -.1081242
Endogenous variables: hours lwage
Exogenous variables: educ nwifeinc age kidslt6 kidsge6 exper expersq

In the system, hourly wage does not appear to depend on hours

The key difference between 3SLS and TSLS estimation of the labour supply equation is
that the former maintains the exclusion restrictions in the wage offer, while TSLS
uses an unrestricted reduced form for lwage.

What if we estimate labour demand instead of the wage offer?

ivregress 2sls hours educ exper expersq (lwage = nwifeinc age kidslt6 kidsge6)
Instrumental variables (2SLS) regression Number of obs = 428
Wald chi2(4) = 26.50
Prob > chi2 = 0.0000
R-squared = .
Root MSE = 1020.8

hours | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
lwage | 1000.536 800.6996 1.25 0.211 -568.8069 2569.878
educ | -130.1077 88.75291 -1.47 0.143 -304.0602 43.84485
exper | 13.88496 38.92312 0.36 0.721 -62.40295 90.17287
expersq | -.0257314 .8858452 -0.03 0.977 -1.761956 1.710493
_cons | 1584.152 517.0086 3.06 0.002 570.8341 2597.471
Instrumented: lwage
Instruments: educ exper expersq nwifeinc age kidslt6 kidsge6


The above shows that it matters for estimation whether we specify a wage offer or a
labour demand function.

The estimated labour demand function gives nonsense!

We can see the problem by looking at the reduced form for lwage. The reduced form
does not depend on the excluded exogenous variables in the labour demand function.

reg lwage educ exper expersq nwifeinc age kidslt6 kidsge6

Source | SS df MS Number of obs = 428
-------------+------------------------------ F( 7, 420) = 11.78
Model | 36.6476854 7 5.23538363 Prob > F = 0.0000
Residual | 186.679766 420 .444475633 R-squared = 0.1641
-------------+------------------------------ Adj R-squared = 0.1502
Total | 223.327451 427 .523015108 Root MSE = .66669

lwage | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
educ | .0998844 .0150975 6.62 0.000 .0702084 .1295604
exper | .0407097 .0133723 3.04 0.002 .0144249 .0669946
expersq | -.0007473 .0004018 -1.86 0.064 -.0015371 .0000424
nwifeinc | .0056942 .0033195 1.72 0.087 -.0008307 .0122192
age | -.0035204 .0054145 -0.65 0.516 -.0141633 .0071225
kidslt6 | -.0558726 .0886034 -0.63 0.529 -.230034 .1182889
kidsge6 | -.0176485 .027891 -0.63 0.527 -.0724718 .0371749
_cons | -.3579973 .3182963 -1.12 0.261 -.9836496 .267655

test nwifeinc age kidslt6 kidsge6

( 1) nwifeinc = 0
( 2) age = 0
( 3) kidslt6 = 0
( 4) kidsge6 = 0

F( 4, 420) = 0.91
Prob > F = 0.4555
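The joint F-test above confirms that the excluded exogenous variables are essentially irrelevant in the reduced form for lwage, so there is no exogenous variation with which to identify the labour demand function. The same diagnostics can be obtained directly after the 2SLS regression (a sketch):

ivregress 2sls hours educ exper expersq (lwage = nwifeinc age kidslt6 kidsge6)
estat firststage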




System Estimation by IV
SUR can be used when the explanatory variables satisfy certain exogeneity conditions
When this is not possible, IV methods can be used
System IV estimation is based on the principle of the generalised method of moments (GMM)
We have seen that we can estimate simultaneous equations by 2SLS when we have enough instruments
There can, however, be efficiency gains from estimating SEMs jointly using a system estimator

General Linear System of Equations
Consider a general linear model of the form:
yᵢ = Xᵢβ + uᵢ
where yᵢ is a G × 1 vector, Xᵢ is a G × K matrix and uᵢ is the G × 1 vector of errors
The following orthogonality condition is the basis for estimating β:
E(Zᵢ′uᵢ) = 0, where Zᵢ is a G × L matrix of observable instruments (SIV Assumption 1)

As we know from above, this assumption is not enough for identification of β
We also require the rank condition:
rank E(Zᵢ′Xᵢ) = K (SIV Assumption 2)
Since E(Zᵢ′Xᵢ) is an L × K matrix, SIV Assumption 2 requires the columns of this matrix to be linearly independent
Necessary for the rank condition is the order condition: L ≥ K

Example
A G-equation system for the population can be written as:
y₁ = x₁β₁ + e₁
y₂ = x₂β₂ + e₂
⋮
y_G = x_G β_G + e_G
The only difference from the SUR model is that x_g can contain both exogenous and endogenous variables
We assume that we have a set of instrumental variables for each equation, a 1 × L_g vector z_g, that are exogenous in the sense that:
E(z_g′e_g) = 0, g = 1, 2, …, G
Usually the same instruments, which consist of all exogenous variables appearing anywhere in the system, are valid for every equation, so that z_g = w, g = 1, 2, …, G

We can then stack the equations, writing:
yᵢ = (y₁, y₂, …, y_G)′,  uᵢ = (e₁, e₂, …, e_G)′,  β = (β₁′, β₂′, …, β_G′)′

Xᵢ = [ x₁   0    …   0   ]
     [ 0    x₂   …   0   ]
     [ …              …  ]
     [ 0    0    …   x_G ]

with K = K₁ + K₂ + … + K_G being the total number of parameters in the system
The matrix of instruments has the same block-diagonal structure:

Zᵢ = [ z₁   0    …   0   ]
     [ 0    z₂   …   0   ]
     [ …              …  ]
     [ 0    0    …   z_G ]

which has dimension G × L, where L = L₁ + L₂ + … + L_G
For each i, Zᵢ′uᵢ = (z₁′e₁, z₂′e₂, …, z_G′e_G)′
E(Zᵢ′uᵢ) = 0 therefore reproduces the orthogonality conditions for all G equations



We can also write:

E(Zᵢ′Xᵢ) = [ E(z₁′x₁)    0          …   0          ]
           [ 0           E(z₂′x₂)   …   0          ]
           [ …                          …          ]
           [ 0           0          …   E(z_G′x_G) ]

where E(z_g′x_g) is L_g × K_g
SIV Assumption 2 requires that this matrix has full column rank, where the number of columns is K = K₁ + K₂ + … + K_G
This will hold if and only if rank E(z_g′x_g) = K_g, g = 1, 2, …, G
o i.e. if and only if each block along the diagonal has full column rank
This is the rank condition required for estimating each equation by 2SLS

Generalised Method of Moments Estimation
Under SIV Assumptions 1 and 2, β is the unique K × 1 vector solving the linear set of population moment conditions:
E[Zᵢ′(yᵢ − Xᵢβ)] = 0
If b is any other K × 1 vector then E[Zᵢ′(yᵢ − Xᵢb)] ≠ 0
Since sample averages are consistent estimators of population moments, this suggests choosing the estimator β̂ to solve:
N⁻¹ Σᵢ Zᵢ′(yᵢ − Xᵢβ̂) = 0   (1)
which is a set of L linear equations in the K unknowns in β̂
If we have L = K then we have exactly enough IVs for the explanatory variables in the system
If the K × K matrix Σᵢ Zᵢ′Xᵢ is nonsingular, we can solve for β̂ as:
β̂ = (N⁻¹ Σᵢ Zᵢ′Xᵢ)⁻¹ (N⁻¹ Σᵢ Zᵢ′yᵢ)   (2)
This system IV estimator is consistent under SIV Assumptions 1 and 2

When L > K and we have more instruments than we need for identification, choosing β̂ is more complicated
Equation (1) will generally not have a solution
Instead, we choose β̂ to make the vector in equation (1) as "small" as possible in the sample

One option is to minimise the squared Euclidean length of the vector in (1), which (dropping the 1/N) implies choosing β̂ to make the following as small as possible:
[Σᵢ Zᵢ′(yᵢ − Xᵢβ̂)]′ [Σᵢ Zᵢ′(yᵢ − Xᵢβ̂)]
o This rarely produces the best estimator, however
A more general approach is to use a weighting matrix, Ŵ, an L × L symmetric, positive semidefinite matrix, in the quadratic form
A GMM estimator of β is a vector β̂ that solves the problem:
min_b [Σᵢ Zᵢ′(yᵢ − Xᵢb)]′ Ŵ [Σᵢ Zᵢ′(yᵢ − Xᵢb)]
Since this objective is a quadratic function of b, the solution has a closed form, which can be written as:
β̂ = (X′ZŴZ′X)⁻¹ X′ZŴZ′Y   (3)
GMM produces a consistent estimator of β that is asymptotically normally distributed, with asymptotic variance:
Avar √N(β̂ − β) = (C′WC)⁻¹ C′WΛWC (C′WC)⁻¹   (4)
where C ≡ E(Zᵢ′Xᵢ) and:
Λ ≡ E(Zᵢ′uᵢuᵢ′Zᵢ) = Var(Zᵢ′uᵢ)   (5)

System 2SLS Estimator
One possible choice for Ŵ is:
Ŵ = (N⁻¹ Σᵢ Zᵢ′Zᵢ)⁻¹
which is a consistent estimator of [E(Zᵢ′Zᵢ)]⁻¹
Using this gives the following estimator:
β̂ = [X′Z(Z′Z)⁻¹Z′X]⁻¹ X′Z(Z′Z)⁻¹Z′Y
which is similar to the single-equation 2SLS estimator, and is called the system 2SLS estimator
It amounts to estimating the first equation by 2SLS using instruments z₁, the second by 2SLS using instruments z₂, and so on
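In Stata, system 2SLS estimates for the labour supply system above can be obtained by asking reg3 for 2SLS rather than its default 3SLS (a sketch; when every equation uses the full set of exogenous variables as instruments, equation-by-equation ivregress 2sls gives the same point estimates):

reg3 (hours lwage educ nwifeinc age kidslt6 kidsge6) (lwage hours educ exper expersq), 2sls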

Optimal Weighting Matrix
A GMM estimator exists for any positive definite weighting matrix, so which one should we use?
We are trying to make the asymptotic variance in equation (4) as small as possible (in the matrix sense)
It can be shown that an optimal weighting matrix is W = Λ⁻¹, where Λ is defined by (5)
This produces a GMM estimator that is the most efficient of all GMM estimators of the form given by (3)
Provided we can consistently estimate Λ, we can obtain the asymptotically efficient GMM estimator
This suggests the following procedure to estimate the system by GMM:
1. Let β̌ be an initial consistent estimator of β. This is usually the system 2SLS estimator
2. Obtain the G × 1 residual vectors: ǔᵢ = yᵢ − Xᵢβ̌, i = 1, 2, …, N
3. A generally consistent estimator of Λ is: Λ̂ = N⁻¹ Σᵢ Zᵢ′ǔᵢǔᵢ′Zᵢ
4. Choose Ŵ = Λ̂⁻¹ = (N⁻¹ Σᵢ Zᵢ′ǔᵢǔᵢ′Zᵢ)⁻¹ and use this matrix to obtain the asymptotically optimal GMM estimator
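For a single equation, Stata's ivregress gmm follows exactly this two-step logic: an initial 2SLS step, then an efficient weighting matrix built from the first-step residuals. A sketch for the labour supply equation used earlier:

* two-step efficient GMM with a heteroscedasticity-robust weighting matrix
ivregress gmm hours educ nwifeinc age kidslt6 kidsge6 (lwage = exper expersq), wmatrix(robust)

* Hansen's J test of the overidentifying restrictions
estat overid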

GMM Estimator – 3SLS
3SLS is a popular method of estimating simultaneous equation models
The 3SLS estimator is a GMM estimator that uses a particular weighting matrix
Let ǔᵢ = yᵢ − Xᵢβ̌ be the residuals from an initial estimation (usually system 2SLS)

Define the G × G matrix: Ω̂ ≡ N⁻¹ Σᵢ ǔᵢǔᵢ′
The weighting matrix used by 3SLS is:
Ŵ = (N⁻¹ Σᵢ Zᵢ′Ω̂Zᵢ)⁻¹ = [Z′(I_N ⊗ Ω̂)Z/N]⁻¹
and the resulting 3SLS estimator is:
β̂ = {X′Z[Z′(I_N ⊗ Ω̂)Z]⁻¹Z′X}⁻¹ X′Z[Z′(I_N ⊗ Ω̂)Z]⁻¹Z′Y
The 3SLS estimator is generally less efficient than GMM using the optimal weighting matrix
So why use it?
o It has a tradition in econometrics
o It may have better finite sample properties

Traditional 3SLS Estimator
The original 3SLS estimator was developed by Zellner and Theil (1962)
The gth of the G equations for the ith individual in a linear SEM is:
y_g = z_g′γ_g + Y_g′β_g + e_g   (1)
with z_g a vector of exogenous regressors and Y_g a vector containing a subset of the dependent variables from the other G − 1 equations (and which is thus endogenous)
The structural model for the ith individual can also be written as:
yᵢ′B + zᵢ′Γ = uᵢ′
Solving for the reduced form gives:
yᵢ′ = −zᵢ′ΓB⁻¹ + uᵢ′B⁻¹
yᵢ′ = zᵢ′Π + vᵢ′
where Π = −ΓB⁻¹ is the matrix of reduced form parameters
The reduced form can be consistently estimated by OLS
The parameters of the structural model can be consistently estimated by 2SLS

The 3SLS approach proceeds as follows:
1. Estimate the reduced form parameters, Π, by OLS
2. Obtain the 2SLS estimates by estimating (1), replacing Y_g with its predictions Ŷ_g from stage 1
3. Compute the 3SLS estimates using the GLS estimator:
θ̂_3SLS = [X̂′(Σ̂⁻¹ ⊗ I_N)X̂]⁻¹ X̂′(Σ̂⁻¹ ⊗ I_N)y
where X̂ collects the predictions Ŷ_g together with the z_g, and Σ̂ = N⁻¹ Σᵢ ûᵢûᵢ′ (with ûᵢ the residuals from the 2SLS stage)

In the case where GMM uses exactly the same instruments in each equation, these two 3SLS estimators are equivalent
Otherwise they may differ, but both yield consistent estimates (under SIV Assumption 1)
Maximum Likelihood Estimation
Provides a means of choosing an asymptotically efficient estimator for a parameter or set of parameters
o Particularly useful as a method for estimating non-linear models
Focusses on the fact that different populations generate different samples; any one sample being scrutinised is more likely to have come from some populations than from others
o For example, if we were sampling coin tosses and a sample mean of 0.5 were obtained, the most likely population from which the sample was drawn would be a population with mean 0.5
The maximum likelihood estimator of a parameter θ is defined as the value θ̂ that would most likely have generated the observed sample y₁, y₂, …, y_N
Maximum likelihood estimation involves a search over alternative parameter values to find the one that would most likely have generated the sample

The probability density function (pdf) for a random variable y, conditioned on a set of parameters θ, is denoted f(y|θ)
This identifies the data generating process that underlies an observed sample of data, and at the same time provides a mathematical description of the data that the process will produce
The joint density of n i.i.d. observations from this process is the product of the individual densities:
f(y₁, …, y_n|θ) = ∏ᵢ₌₁ⁿ f(yᵢ|θ) = L(θ|y)
The joint density is the likelihood function, defined as a function of the unknown parameter vector θ, where y is used to indicate the collection of sample data.

Example: Poisson Distribution
Consider a random sample of the following 10 observations from a Poisson distribution: 5, 0, 1, 1, 0, 3, 2, 3, 4, 1.
The density for each observation is:
f(yᵢ|θ) = exp(−θ) θ^(yᵢ) / yᵢ!
Since the observations are independent, their joint density, which is the likelihood for this sample, is:
f(y₁, …, y₁₀|θ) = ∏ᵢ₌₁¹⁰ f(yᵢ|θ) = exp(−10θ) θ^(Σᵢ yᵢ) / ∏ᵢ yᵢ! = exp(−10θ) θ²⁰ / 207,360
This gives the probability of observing this particular sample, assuming that a Poisson distribution with an as yet unknown parameter θ generated the data
What value of θ would make this sample most probable?
[We could plot this function for different values of θ and see which gives the largest value of the likelihood]
Let y = (y₁, …, y_n)
o Assume these observations are from the distribution function F with density function f and parameters θ.
o If F is the normal distribution, then θ = (μ, σ²), with expectation μ and variance σ².
o Consider two possible choices for θ, say θ₁ and θ₂:
o the probability of observing the sample y is equal to L(θ₁, y) if θ₁ is the true value;
o the probability of observing the sample y is equal to L(θ₂, y) if θ₂ is the true value;
o if L(θ₁, y) > L(θ₂, y), then we decide that θ₁ is better than θ₂.
As such, we search for the value of θ that maximises the likelihood function.
The generalisation of this idea is the maximum likelihood principle
o Among all the possible values that θ can assume, the one it is most likely to have assumed is the one that maximises the likelihood function given by L(θ, y) = f(y, θ).


Definition
o Let L(θ; y₁, …, y_n) = f(y₁, …, y_n; θ) be the likelihood function of the sample y and let Θ be the set of all possible values of θ.
o If the statistic θ̂ maximises the likelihood function L(θ, y) over θ ∈ Θ, then θ̂ is said to be the maximum likelihood estimator (MLE) of θ.
The maximum likelihood estimator selects the density function from the family f(y, θ) that makes the actual realisation most probable.
Assume the likelihood function is differentiable with respect to θ. The first-order condition for maximising the likelihood function sets
o the total derivative dL/dθ = 0, if θ consists of one parameter;
o or the partial derivatives ∂L/∂θ = 0, if θ consists of several parameters.
Because the logarithmic transformation is monotonic,
dL/dθ = 0 if and only if d ln(L)/dθ = 0.
The log-likelihood (ln L) is more convenient, because
o the logarithm of a product is equal to the sum of the logarithms, and
o the derivative of a product is usually more complicated than the derivative of a sum.

Properties
Notation: θ – true value; θ̂, θ̂_n, θ̃ – estimators
Unbiasedness: an estimator is unbiased if E(θ̂) = θ.
Efficiency: assume θ̂ and θ̃ are two unbiased estimators. If Var(θ̂) < Var(θ̃), then θ̂ is more efficient.
Consistency: an estimator θ̂_n is a consistent estimator of θ if, as n → ∞, P(θ − δ < θ̂_n < θ + δ) → 1 for all δ > 0. Intuitively this means that as n → ∞ the estimator θ̂_n converges to the true value θ.
Asymptotic efficiency: an estimator is asymptotically efficient if it is consistent and asymptotically normally distributed and, additionally, its asymptotic covariance matrix is not larger than the covariance matrix of any other consistent and asymptotically normally distributed estimator.
Linear regression model: y = Xβ + e
β̂_OLS = (X′X)⁻¹X′y is BLUE – the "best linear unbiased estimator"
The maximum likelihood estimator is consistent, asymptotically normally distributed and asymptotically efficient.

Example 1: Poisson Distribution Continued
Maximise the log-likelihood for the Poisson example:
ln L(θ|y) = −nθ + ln θ · Σᵢ₌₁ⁿ yᵢ − Σᵢ₌₁ⁿ ln(yᵢ!)
∂ ln L(θ|y)/∂θ = −n + (1/θ) Σᵢ₌₁ⁿ yᵢ = 0  ⟹  θ̂_ML = ȳ = (1/n) Σᵢ yᵢ
For the assumed sample of observations:
ln L(θ|y) = −10θ + 20 ln θ − 12.242
∂ ln L(θ|y)/∂θ = −10 + 20/θ = 0  ⟹  θ̂_ML = 2
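A quick check of this result in Stata (a sketch: the ten observations are typed in, mlexp maximises the same log-likelihood with θ parameterised as exp(lntheta) to keep it positive, and poisson with only a constant gives the identical answer on the log scale):

clear
input y
5
0
1
1
0
3
2
3
4
1
end

* maximise sum_i [ -theta + y_i*ln(theta) - ln(y_i!) ] with theta = exp(lntheta)
mlexp (-exp({lntheta}) + y*{lntheta} - lnfactorial(y))

* equivalently, a Poisson regression on a constant only; exp(_cons) is the sample mean
poisson y
display exp(_b[_cons])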

Example 2: Normal Distribution
Let yᵢ ~ N(μ, σ²). Find the MLE of μ and σ², i.e. θ = (μ, σ²).
Density function of the normal distribution: f(yᵢ) = (1/√(2πσ²)) exp{−(yᵢ − μ)²/(2σ²)}
Likelihood and log-likelihood function for θ = (μ, σ²):
L(θ, y) = ∏ᵢ₌₁ⁿ f(yᵢ, θ) = (2πσ²)^(−n/2) exp{ −(1/2) Σᵢ₌₁ⁿ [(yᵢ − μ)/σ]² }
ln L(θ, y) = −(n/2) ln(σ²) − (n/2) ln(2π) − (1/2) Σᵢ₌₁ⁿ (yᵢ − μ)²/σ²
Derivative with respect to μ:
∂ ln L/∂μ = (1/σ²) Σᵢ₌₁ⁿ (yᵢ − μ) = 0
Maximum likelihood estimator → μ̂ = (1/n) Σᵢ₌₁ⁿ yᵢ = ȳ
Derivative with respect to σ²:
∂ ln L/∂σ² = −n/(2σ²) + (1/(2σ⁴)) Σᵢ₌₁ⁿ (yᵢ − μ)² = 0
Maximum likelihood estimator → σ̂² = (1/n) Σᵢ₌₁ⁿ (yᵢ − ȳ)², with ȳ = (1/n) Σᵢ yᵢ

Remark: as the result for σ̂² shows, the maximum likelihood estimator need not be unbiased.

Example 3: Linear regression model
yᵢ = Σₖ₌₁ᴷ x_ik βₖ + eᵢ, or y = Xβ + e
Let eᵢ ~ N(0, σ²)
o This implies that each yᵢ is normally distributed with mean xᵢβ and variance σ²
Find the MLE of β and σ², i.e. θ = (β, σ²).
Note that eᵢ = yᵢ − Σₖ₌₁ᴷ x_ik βₖ, with e = (e₁, …, e_n)′, y = (y₁, …, y_n)′ and X = (x₁, …, x_K), where xₖ = (x_k1, …, x_kn)′
Density function of the normal distribution: f(eᵢ) = (1/(σ√(2π))) exp[−eᵢ²/(2σ²)]
Likelihood function for θ = (β, σ²):
L(θ, e) = ∏ᵢ₌₁ⁿ f(eᵢ, θ) = (2πσ²)^(−n/2) exp{ −(1/(2σ²)) Σᵢ₌₁ⁿ (yᵢ − Σₖ x_ik βₖ)² }
Log-likelihood function:
ln L(θ, e) = −(n/2) ln(2π) − (n/2) ln(σ²) − (1/(2σ²)) (y − Xβ)′(y − Xβ)
Derivative with respect to β:
∂ ln L(θ, e)/∂β = (1/σ²) X′(y − Xβ) = 0
Maximum likelihood estimator → β̂ = (X′X)⁻¹X′y
Derivative with respect to σ²:
∂ ln L(θ, e)/∂σ² = −n/(2σ²) + (1/(2σ⁴)) (y − Xβ)′(y − Xβ) = 0
Maximum likelihood estimator → σ̂² = (1/n)(y − Xβ̂)′(y − Xβ̂) = (1/n) ê′ê
Remark: the unbiased estimator divides by the degrees of freedom instead: s² = ê′ê/(n − K)
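A quick illustration of the n versus n − K distinction after any linear regression in Stata (a sketch using Stata's bundled auto data):

* ML vs unbiased estimates of sigma^2 after OLS
sysuse auto, clear
regress price weight length
display "ML estimate of sigma^2 (divides by n):         " e(rss)/e(N)
display "Unbiased estimate of sigma^2 (divides by n-K): " e(rss)/e(df_r)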

Estimation of the variance-covariance matrix
To calculate standard errors, we need the variance-covariance matrix.
Using the least squares method, the variance-covariance matrix of the linear regression model y = Xβ + e is equal to σ²(X′X)⁻¹, with σ² replaced by σ̂².
The asymptotic variance-covariance matrix of the MLE is a matrix of parameters that have to be estimated, i.e. a function of the true value θ, which has to be estimated.
Estimator 1:
o Hessian matrix: [Ĥ(θ̂)]⁻¹ = −[ ∂² ln L(θ̂) / ∂θ∂θ′ ]⁻¹
Estimator 2:
o BHHH estimator: [B̂(θ̂)]⁻¹ = [ Σᵢ₌₁ⁿ gᵢgᵢ′ ]⁻¹ = [Ĝ′Ĝ]⁻¹, with gᵢ the gradient evaluated at θ̂, gᵢ = ∂ ln f(xᵢ, θ̂)/∂θ, and Ĝ = [g₁, g₂, …, g_n]′.

Once we have an estimator of the covariance matrix, t-tests and F-tests can be carried out in the normal manner
An alternative to the F-test is the likelihood ratio test:
LR = −2[ln(L_R) − ln(L_U)] ~ χ²(r)
with ln(L_R) and ln(L_U) the log-likelihoods of the restricted and unrestricted models respectively, and r the number of restrictions.
Other tests: Wald test or Lagrange multiplier test
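A minimal likelihood ratio test in Stata, sketched for a Poisson model with hypothetical regressors x1 and x2 (the same pattern works after any ML-estimated model):

poisson y x1 x2
estimates store unrestricted
poisson y x1
estimates store restricted
lrtest unrestricted restricted    // LR statistic, chi2(1) here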

Regression Hints
Do not always attempt to maximise R², adjusted R² or some other goodness-of-fit measure.
o We might end up including factors in the x's that should not be held fixed.
o It is possible to obtain a convincing estimate of a causal effect with a low R².
o Include covariates that help predict the outcome if they are uncorrelated (in the population) with the covariate(s) of interest.
Be careful when using models nonlinear in the explanatory variables, especially with interactions.
o Coefficients on level terms may become essentially meaningless. Example:
contribs = β₀ + β₁ match + β₂ income + β₃ female + β₄ match·income + β₅ match·female + e
o The coefficient on match measures the sensitivity of contributions to the match rate for a male worker with zero income.
Centring can make coefficients more interesting (see the sketch below):
contribs = β₀ + β₁ match + β₂ income + β₃ female + β₄ match·(income − mean(income)) + β₅ match·female + e
o The coefficient on match now measures the effect of the match rate for a male worker at the average income level.
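A sketch of the centring trick in Stata, assuming variables contribs, match, income and a female dummy as in the example above:

* centre income at its sample mean before interacting
summarize income, meanonly
generate inc_c = income - r(mean)
regress contribs c.match c.inc_c i.female c.match#c.inc_c c.match#i.female

The main effect of match now reports the effect of the match rate at the average income level (for a male worker), rather than at income = 0.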

Pooled Data
Pooling Independent Cross-Sections across Time
o Since a random sample is drawn at each time period, pooling the resulting random samples gives us an independently pooled cross-section
o As such, we can use standard OLS methods
o An advantage of pooling is the increase in sample size, thereby giving more precise estimates and test statistics with greater power
o Pooling is only useful in this regard if the relationship between the dependent variable and at least some of the independent variables remains constant over time
o To reflect the fact that the population may have different distributions in different time periods, the intercept is usually allowed to differ across time periods (this can be accomplished by including year dummies)
o The coefficients on the year dummies may be of interest (e.g. after controlling for other factors, has the pattern of fertility changed over time?)
o Year dummies can also be interacted with other explanatory variables to see if the effect of that variable has changed over time

Testing for Structural Change across Time
o Consider a pooled dataset of two time periods t₁ and t₂ (see the sketch below)
o Interact each variable with a year dummy for t₂
o Test for the joint significance of the year dummy and all of the interaction terms
o Since the intercept in a regression model often changes over time, this Chow test can detect such changes. It is usually more interesting to allow for an intercept difference and then test whether certain slope coefficients change over time
o This can be extended to more than two time periods
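A Chow-type test of structural change across two pooled cross-sections in Stata (a sketch with hypothetical names: d2 is the dummy for period t₂ and x1, x2 are the regressors):

regress y i.d2 c.x1 c.x2 i.d2#c.x1 i.d2#c.x2

* joint test of the intercept shift and all slope changes
testparm i.d2 i.d2#c.x1 i.d2#c.x2

* often more interesting: keep the intercept shift, test only the slope changes
testparm i.d2#c.x1 i.d2#c.x2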


Policy Analysis with Pooled Cross-Sections
Difference-in-Differences Estimation
o Methodology
Examine the effect of some sort of treatment by comparing the treatment group after treatment both to the treatment group before treatment and to some other control group.
Standard case: outcomes are observed for two groups for two time periods. One of the groups is exposed to a treatment in the second period but not in the first period. The second group is not exposed to the treatment during either period. The structure can apply to repeated cross-sections or panel data.
Usually related to a so-called natural (or quasi-) experiment, when some exogenous event – often a change in government policy – changes the environment in which individuals, families, firms or cities operate.

Example
o A state offers a tax break to firms providing employees with health insurance. To estimate the impact of the bill on the percentage of firms offering health insurance we could use data on a state that didn't implement such a law as a control group. It is not correct just to compare pre- and post-law changes in the percentage of firms offering health insurance, i.e.
y = β₀ + δ₀ d2 + e   (1)
where d2 is a dummy for period two.
o Here the coefficient estimate δ̂₀ gives an estimate of the difference in the percentage of firms offering health insurance between periods one and two
o The coefficient doesn't necessarily provide a (causal) estimate of the impact of the tax break however, since there could be a trend towards more employers offering health insurance over time

With repeated cross-sections, let A be the control group and B the treatment group. Write
y = β₀ + β₁ dB + δ₀ d2 + δ₁ d2·dB + e   (2)
where:
y is the outcome of interest (e.g. the percentage of firms offering health insurance in each state)
dB captures possible differences between the treatment and control groups prior to the policy change (e.g. State A versus State B)
d2 captures aggregate factors that would cause changes in y over time even in the absence of a policy change, i.e. for both states (e.g. time dummies)
The coefficient of interest is δ₁, which gives an estimate of the change in health insurance take-up for firms in State B, and which is called the difference-in-differences estimator.

                 State A      State B
Year 1           a            b
Year 2           c            d

Coefficient      Calculation
β₀               a
β₁               b − a
δ₀               c − a
δ₁               (d − b) − (c − a)

The difference-in-differences (DD) estimator can be written as:
δ̂₁ = (ȳ_B,2 − ȳ_B,1) − (ȳ_A,2 − ȳ_A,1)   (3)
In other words, δ̂₁ represents the difference in the changes over time.
Assuming that both states have the same health insurance trends over time, we have now controlled for a possible national time trend, and can now identify the true impact of the tax deductibility on employers offering insurance (see the sketch below).
Inference based on moderate sample sizes in each of the four groups is straightforward, and is easily made robust to different group/time-period variances in a regression framework.
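A minimal regression implementation of equation (2) in Stata, assuming a treatment-group dummy dB and a second-period dummy d2 (hypothetical names matching the notation above); the coefficient on the interaction is the DD estimate:

* difference-in-differences via group and period dummies and their interaction
regress y i.dB##i.d2, vce(robust)
lincom 1.dB#1.d2        // the DD estimate, delta_1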

We can refine the definition of treatment and control groups.
o Example: a change in state health care policy aimed at the elderly. We could use data only on people in the state with the policy change, both before and after the change, with the control group being people aged 55 to 65 (say) and the treatment group being people over 65
o This DD analysis assumes that the paths of health outcomes for the younger and older groups would not be systematically different in the absence of the intervention
o Alternatively, use the over-65 population from another state as an additional control
Let dE be a dummy equal to one for someone over 65:
y = β₀ + β₁ dB + β₂ dE + β₃ dB·dE + δ₀ d2 + δ₁ d2·dB + δ₂ d2·dE + δ₃ d2·dE·dB + e   (4)
The OLS estimate δ̂₃ is
δ̂₃ = [(ȳ_B,E,2 − ȳ_B,E,1) − (ȳ_B,N,2 − ȳ_B,N,1)] − [(ȳ_A,E,2 − ȳ_A,E,1) − (ȳ_A,N,2 − ȳ_A,N,1)]   (5)
where the A subscript denotes the state not implementing the policy and the N subscript the non-elderly. This is the difference-in-difference-in-differences (DDD) estimate.

We can add covariates to either the DD or DDD analysis to control for compositional changes.
We can use multiple time periods and groups.
This methodology has a number of applications, particularly when the data arise from a natural experiment (or quasi-experiment)
o This occurs when some exogenous event – often a change in government policy – changes the environment in which individuals, families, firms or cities operate
A natural experiment always has a control group, which is not affected by the policy change, and a treatment group thought to be affected by the policy change
Unlike in a true experiment, the control and treatment groups in natural experiments arise from the particular policy change and are not randomly assigned

Let C be the control group and T the treatment group, let dT equal one for those in the treatment group and zero otherwise, and let d2 denote a dummy for the second (post-policy-change) time period. The equation of interest is:
y = β₀ + δ₀ d2 + β₁ dT + δ₁ d2·dT + other factors
where δ₁ measures the effect of the policy
Without other factors in the regression, δ̂₁ will be the difference-in-differences estimator:
δ̂₁ = (ȳ_2,T − ȳ_2,C) − (ȳ_1,T − ȳ_1,C)
where the bar denotes the average

                        Before        After                  After − Before
Control                 β₀            β₀ + δ₀                δ₀
Treatment               β₀ + β₁       β₀ + δ₀ + β₁ + δ₁      δ₀ + δ₁
Treatment − Control     β₁            β₁ + δ₁                δ₁

The parameter δ₁ – sometimes called the average treatment effect – can be estimated in two ways:
1. Compute the differences in averages between the treatment and control groups in each time period, and then difference the results over time
2. Compute the change in averages over time for each of the treatment and control groups, and then difference these changes, i.e. write
δ̂₁ = (ȳ_2,T − ȳ_1,T) − (ȳ_2,C − ȳ_1,C)
When explanatory variables are added to the regression, the OLS estimate of δ₁ no longer has a simple form, but its interpretation is similar
