
Simultaneous Equation Models

As discussed last week, one important form of endogeneity is simultaneity. This arises when one or more of the explanatory
variables is jointly determined with the dependent variable, usually through an equilibrium mechanism.
Simultaneous Equations Models (SEMs) differ from those considered previously because in each model there are two or
more dependent variables rather than just one.
Simultaneous equations models also differ from most of the econometric models we have considered so far because they consist
of a set of equations.
The least squares estimation procedure is not appropriate in these models and we must develop new ways to obtain reliable
estimates of economic parameters.
The usual method for estimating SEMs is the instrumental variables method, discussed last week.

Example
The classic example of an SEM is a supply and demand equation for some commodity (e.g. coffee) or input to production (e.g.
labour).
Consider a simple market supply function:
$q_s = \alpha_1 p + \beta_1 z_1 + u_1$

where $q_s$ is quantity or output, $p$ is price and $z_1$ is some observed variable affecting supply of the commodity (e.g. weather). The
error term, $u_1$, contains other factors that affect supply.
The equation is an example of a structural equation, i.e. it is derivable from economic theory and has a causal interpretation.
The coefficient $\alpha_1$ measures how supply of the product changes when the price changes. If price and quantity are measured in
logs, the coefficient gives the price elasticity of supply.
Plotting the supply function, we plot output as a function of price, holding $z_1$ and $u_1$ fixed. Changes in either of these two
factors lead to shifts in the supply curve; the difference being that $z_1$ is observed, while $u_1$ is not.
The crucial assumption that we make for OLS is that the explanatory variables are uncorrelated with the error term.
In this case, this assumption does not hold. Provided the demand curve is not horizontal (perfectly elastic), a shift in
the supply curve caused by a change in $u_1$ changes the equilibrium price (and, unless demand is vertical, the quantity as well).
Thus the error term is correlated with price.
In addition, the fact that $p$ is random means that on the right-hand side of the supply and demand equations we have an
explanatory variable that is random. This is contrary to the assumption of fixed explanatory variables that we usually make in
regression model analysis.
The important thing to remember is that supply and demand interact to jointly determine the market price of a good and the
amount of it that is sold.
An econometric model that explains market price and quantity should therefore consist of two equations, one for supply and one
for demand.

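Demand: $q_d = \alpha_2 p + \beta_2 z_2 + u_2$    (1)

Supply: $q_s = \alpha_1 p + \beta_1 z_1 + u_1$    (2)

where $q_d$ is the quantity demanded and $z_2$ is an observed variable affecting the demand for the commodity (e.g. income).
In equilibrium, $q_d = q_s = q$.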

In this model the variables p and q are called endogenous variables because their values are determined within the system we
have created.
The variables $z_1$ and $z_2$ have values that are given to us, and which are determined outside this system. As such, these are
exogenous variables.
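
The joint determination of price and quantity can be illustrated with a small simulation. This is only a sketch: the notation (supply $q = \alpha_1 p + \beta_1 z_1 + u_1$, demand $q = \alpha_2 p + \beta_2 z_2 + u_2$), the parameter values and the standard normal draws are all assumptions made for illustration, not taken from any data. The code solves the two equations for the market-clearing price and shows that the price is correlated with both structural errors, while the exogenous variable $z_1$ is not.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical structural parameters (illustrative values only)
alpha1, beta1 = 1.0, 1.0    # supply: q = alpha1*p + beta1*z1 + u1
alpha2, beta2 = -1.0, 1.0   # demand: q = alpha2*p + beta2*z2 + u2

# Exogenous variables and structural errors, drawn independently
z1 = rng.normal(size=n)
z2 = rng.normal(size=n)
u1 = rng.normal(size=n)
u2 = rng.normal(size=n)

# Market clearing: set supply equal to demand and solve for price
p = (beta2 * z2 - beta1 * z1 + u2 - u1) / (alpha1 - alpha2)
q = alpha1 * p + beta1 * z1 + u1   # quantity read off the supply curve

# Price is endogenous: it moves with both structural errors
corr_p_u1 = np.corrcoef(p, u1)[0, 1]
corr_p_u2 = np.corrcoef(p, u2)[0, 1]
# The exogenous supply shifter is (by construction) unrelated to u1
corr_z1_u1 = np.corrcoef(z1, u1)[0, 1]

print(f"corr(p, u1)  = {corr_p_u1:.3f}")
print(f"corr(p, u2)  = {corr_p_u2:.3f}")
print(f"corr(z1, u1) = {corr_z1_u1:.3f}")
```

In this design a positive supply shock lowers the equilibrium price and a positive demand shock raises it, so the first correlation is negative and the second positive.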

The error terms in the supply and demand equations are assumed to have the usual properties; i.e. they have a constant mean and
variance, and are independently distributed.

A Bad Example
An important point to remember when using SEMs is that each equation in the model should have a ceteris paribus, causal
interpretation.
In the above example, the two equations describe entirely different relationships.
- The supply equation describes the behaviour of firms
- The demand equation is a behavioural relationship for consumers
Each equation therefore has a ceteris paribus interpretation and stands on its own.
They become linked in the econometric analysis only because the observed price and quantity are determined by the intersection
of supply and demand.
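Consider the following example:
$\mathit{housing} = \alpha_1 \mathit{saving} + \beta_{10} + \beta_{11} \mathit{inc} + \beta_{12} \mathit{educ} + \beta_{13} \mathit{age} + u_1$

$\mathit{saving} = \alpha_2 \mathit{housing} + \beta_{20} + \beta_{21} \mathit{inc} + \beta_{22} \mathit{educ} + \beta_{23} \mathit{age} + u_2$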

Neither of these equations has a sensible ceteris paribus interpretation because housing and saving are chosen by the same
individual.
- If income increases, a person will generally change the optimal mix of housing expenditures and saving. The first
equation however, makes it seem as though we want to know the impact of a change in income, education or age on
housing expenditure, holding saving constant.
Just because two variables are determined simultaneously does not mean that an SEM is suitable.

Simultaneity Bias in OLS


Consider the following example:
$y_1 = \alpha_1 y_2 + \beta_1 z_1 + u_1$    (3)

$y_2 = \alpha_2 y_1 + \beta_2 z_2 + u_2$    (4)

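To show that $y_2$ is generally correlated with $u_1$, we can solve for $y_2$ in terms of the exogenous variables and the error terms.
Replacing $y_1$ in (4) with the expression in (3) gives,

$(1 - \alpha_2 \alpha_1) y_2 = \alpha_2 \beta_1 z_1 + \beta_2 z_2 + \alpha_2 u_1 + u_2$    (5)

Assuming that $\alpha_2 \alpha_1 \neq 1$, we can divide (5) by $(1 - \alpha_2 \alpha_1)$ to obtain,

$y_2 = \pi_{21} z_1 + \pi_{22} z_2 + v_2$    (6)

where $\pi_{21} = \alpha_2 \beta_1 / (1 - \alpha_2 \alpha_1)$; $\pi_{22} = \beta_2 / (1 - \alpha_2 \alpha_1)$ and $v_2 = (\alpha_2 u_1 + u_2) / (1 - \alpha_2 \alpha_1)$.
Equation (6) expresses $y_2$ in terms of the exogenous variables and the error terms and is called the reduced form equation for
$y_2$.
The parameters $\pi_{21}$ and $\pi_{22}$ are non-linear functions of the structural parameters, and are termed the reduced form parameters.
The reduced form error, $v_2$, is a linear function of the structural error terms, $u_1$ and $u_2$.
Since $u_1$ and $u_2$ are uncorrelated with $z_1$ and $z_2$, $v_2$ is also uncorrelated with $z_1$ and $z_2$, hence the reduced form parameters in (6) can be
estimated by OLS.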
- The reduced form equations can be important for economic analysis. These equations relate the equilibrium values of
the endogenous variables to the exogenous variables.
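
The claim that the reduced form can be estimated by OLS can be checked numerically. The sketch below assumes the two-equation system $y_1 = \alpha_1 y_2 + \beta_1 z_1 + u_1$ and $y_2 = \alpha_2 y_1 + \beta_2 z_2 + u_2$, with made-up parameter values and independent standard normal draws for the exogenous variables and errors. It generates $y_2$ from the solved system, regresses it on $z_1$ and $z_2$, and compares the estimates with the reduced form parameters implied by the structural ones.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Hypothetical structural parameters (illustrative values only)
alpha1, beta1 = 0.5, 1.0    # y1 = alpha1*y2 + beta1*z1 + u1
alpha2, beta2 = -0.5, 1.0   # y2 = alpha2*y1 + beta2*z2 + u2

# Exogenous variables and structural errors, all independent N(0, 1)
z1, z2, u1, u2 = rng.normal(size=(4, n))

# Solve the system for y2 (requires alpha2*alpha1 != 1), then get y1
denom = 1.0 - alpha2 * alpha1
y2 = (alpha2 * beta1 * z1 + beta2 * z2 + alpha2 * u1 + u2) / denom
y1 = alpha1 * y2 + beta1 * z1 + u1

# OLS of y2 on the exogenous variables (no intercept: all means are zero)
Z = np.column_stack([z1, z2])
pi_hat, *_ = np.linalg.lstsq(Z, y2, rcond=None)

# Reduced form parameters implied by the structural parameters
pi_true = np.array([alpha2 * beta1 / denom, beta2 / denom])

print("OLS estimates of reduced form:", pi_hat)
print("implied by structural params: ", pi_true)
```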

Equation (6) also tells us that estimation of equation (3) by OLS will result in biased and inconsistent estimates of $\alpha_1$ and $\beta_1$.
- In equation (3) the issue is whether $y_2$ and $u_1$ are correlated ($z_1$ and $u_1$ are by assumption uncorrelated)
- From (6) we see that $y_2$ and $u_1$ are correlated if and only if $v_2$ and $u_1$ are correlated
- Since $v_2$ is a linear combination of $u_1$ and $u_2$, it is generally correlated with $u_1$

When $y_2$ is correlated with $u_1$ because of simultaneity, we say that OLS suffers from simultaneity bias.
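
The source of the bias can be made explicit by working out the covariance between the endogenous regressor and the error. The derivation below is a sketch in assumed notation: $y_2 = \pi_{21} z_1 + \pi_{22} z_2 + v_2$ is the reduced form (6), with $v_2 = (\alpha_2 u_1 + u_2)/(1 - \alpha_2 \alpha_1)$, and $\alpha_1$, $\alpha_2$ are the coefficients on the endogenous regressors in (3) and (4). It also relies on two added assumptions that are not stated above: $u_1$ and $u_2$ are uncorrelated with each other, and $\sigma_1^2$ denotes the variance of $u_1$.

```latex
\begin{aligned}
\mathrm{Cov}(y_2, u_1)
  &= \mathrm{Cov}(\pi_{21} z_1 + \pi_{22} z_2 + v_2,\; u_1) \\
  &= \mathrm{Cov}(v_2, u_1)
     && \text{(the } z \text{ are uncorrelated with } u_1\text{)} \\
  &= \frac{\alpha_2 \,\mathrm{Var}(u_1) + \mathrm{Cov}(u_2, u_1)}{1 - \alpha_2 \alpha_1} \\
  &= \frac{\alpha_2 \,\sigma_1^2}{1 - \alpha_2 \alpha_1}
     && \text{(using } \mathrm{Cov}(u_2, u_1) = 0\text{)}
\end{aligned}
```

Under these assumptions the covariance vanishes only when $\alpha_2 = 0$, that is, when $y_1$ does not appear in equation (4) and there is no simultaneity.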

The Instrumental Variables Solution


As we saw last time, the IV solution of two-stage least squares can be used to solve the problem of endogenous explanatory
variables.
This is also true for SEMs, the major difference being that, because we specify a structural equation for each endogenous
variable, we can immediately see whether sufficient IVs are available to estimate either equation.
Consider the following example:
Supply: $q = \alpha_1 p + \beta_1 z_1 + u_1$

Demand: $q = \alpha_2 p + u_2$

Here we can think of the coffee market as an example, with $q$ being, say, per capita coffee consumption, $p$ being the average price
per jar and $z_1$ being something like the weather (in Brazil!) that affects supply. It is assumed that $z_1$ is exogenous to both the
supply and demand equations.
The first question to be addressed is: given a random sample on $q$, $p$ and $z_1$, which of the above equations can be estimated, i.e.
which is an identified equation?
It turns out that the demand equation is identified, but the supply equation is not.
- This follows from our rules for instruments
- We can use $z_1$ as an instrument for price in the demand equation
- Because $z_1$ appears in the supply equation, however, we cannot use it as an instrument in this equation
- In order to estimate the supply equation we would need an observed exogenous variable that shifts the demand curve
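
The identification argument above can be illustrated with simulated data. This is a minimal sketch, assuming a supply equation $q = \alpha_1 p + \beta_1 z_1 + u_1$ and a demand equation $q = \alpha_2 p + u_2$ with made-up parameter values and independent standard normal draws. It compares the OLS slope from regressing quantity on price with the simple IV estimator that uses the supply shifter $z_1$ as an instrument for price in the demand equation.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000

# Hypothetical parameter values, chosen only for illustration
alpha1, beta1 = 1.0, 1.0   # supply: q = alpha1*p + beta1*z1 + u1
alpha2 = -1.0              # demand: q = alpha2*p + u2

# Supply shifter and structural errors, all independent N(0, 1)
z1, u1, u2 = rng.normal(size=(3, n))

# Equilibrium price and quantity (supply = demand)
p = (u2 - u1 - beta1 * z1) / (alpha1 - alpha2)
q = alpha2 * p + u2

# OLS slope of q on p: contaminated by the correlation of p with u2
ols_slope = np.cov(p, q)[0, 1] / np.var(p, ddof=1)

# IV slope: z1 shifts supply, so it traces out the demand curve
iv_slope = np.cov(z1, q)[0, 1] / np.cov(z1, p)[0, 1]

print(f"true demand slope: {alpha2:.3f}")
print(f"OLS estimate:      {ols_slope:.3f}")
print(f"IV estimate:       {iv_slope:.3f}")
```

No analogous estimator is available for the supply slope in this design, because nothing observed shifts the demand curve.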

Considering the more general two-equation model:

$y_1 = \alpha_1 y_2 + z_1 \beta_1 + u_1$

$y_2 = \alpha_2 y_1 + z_2 \beta_2 + u_2$

where $y_1$ and $y_2$ are the endogenous variables, $u_1$ and $u_2$ are the structural error terms, and $z_1$ and $z_2$ now denote sets of
$k_1$ and $k_2$ exogenous regressors that appear in the first and second regression respectively, i.e. $z_1 = (z_{11}, z_{12}, \dots, z_{1k_1})$ and
$z_2 = (z_{21}, z_{22}, \dots, z_{2k_2})$.
In many cases $z_1$ and $z_2$ will overlap.
The assumption that $z_1$ and $z_2$ contain different exogenous variables means that we impose exclusion restrictions on the model,
i.e. we assume that certain exogenous regressors do not appear in the first equation and others are absent from the second. This
allows us to distinguish between the two structural equations.
The Rank Condition for Identification of a Structural Equation
- The first equation in a two-equation SEM is identified if, and only if, the second equation contains at least one
exogenous variable (with a nonzero coefficient) that is excluded from the first equation.
- The order condition for identifying the first equation states that at least one exogenous variable is excluded from this
regression. This is simple to check once both equations have been specified.
- The rank condition requires more: at least one of the exogenous regressors excluded from the first equation must have
a non-zero population coefficient in the second equation. This can be tested using a t or F test.
- Identification of the second equation is the mirror image of the above.

Estimation
Once we have determined that an equation is identified, we can estimate it by TSLS.
- The instruments consist of the exogenous variables appearing in either equation.
Tests for endogeneity, overidentifying restrictions and so on proceed as before.
It turns out that, when any system with two or more equations is correctly specified and certain additional assumptions hold,
system estimation methods (e.g. three-stage least squares) are generally more efficient than estimating each equation by TSLS.
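
A minimal sketch of the TSLS procedure for one structural equation is given below. It assumes the simple system $y_1 = \alpha_1 y_2 + \beta_1 z_1 + u_1$, $y_2 = \alpha_2 y_1 + \beta_2 z_2 + u_2$ with made-up parameter values and simulated data. The helper name `tsls` is hypothetical, and the code returns point estimates only (no standard errors). The first equation is identified here because $z_2$ is excluded from it and has a non-zero coefficient in the second equation.

```python
import numpy as np


def tsls(y, X, Z):
    """Two-stage least squares point estimates (hypothetical helper).

    y: dependent variable, X: regressors of the structural equation,
    Z: instruments (all exogenous variables in the system).
    """
    # First stage: fitted values from regressing each column of X on Z
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
    # Second stage: regress y on the first-stage fitted values
    return np.linalg.lstsq(X_hat, y, rcond=None)[0]


rng = np.random.default_rng(3)
n = 200_000

# Hypothetical structural parameters (illustrative values only)
alpha1, beta1 = 0.5, 1.0    # y1 = alpha1*y2 + beta1*z1 + u1
alpha2, beta2 = -0.5, 1.0   # y2 = alpha2*y1 + beta2*z2 + u2

z1, z2, u1, u2 = rng.normal(size=(4, n))

# Generate the endogenous variables from the solved system
denom = 1.0 - alpha2 * alpha1
y2 = (alpha2 * beta1 * z1 + beta2 * z2 + alpha2 * u1 + u2) / denom
y1 = alpha1 * y2 + beta1 * z1 + u1

X = np.column_stack([y2, z1])   # regressors in the first equation
Z = np.column_stack([z1, z2])   # exogenous variables from either equation

b_ols = np.linalg.lstsq(X, y1, rcond=None)[0]
b_tsls = tsls(y1, X, Z)

print("true (alpha1, beta1):", (alpha1, beta1))
print("OLS estimates:       ", b_ols)
print("TSLS estimates:      ", b_tsls)
```

No intercept is included because every simulated variable has mean zero; with real data a column of ones would normally be added to both `X` and `Z`.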

Systems with More Than Two Equations


SEMs can consist of more than two equations.
Studying the general identification of these models is not straightforward.
Once an equation has been shown to be identified, it can be estimated by TSLS.
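Consider the following three-equation system:
$y_1 = \alpha_{12} y_2 + \alpha_{13} y_3 + \beta_{11} z_1 + u_1$    (7)

$y_2 = \alpha_{21} y_1 + \beta_{21} z_1 + \beta_{22} z_2 + \beta_{23} z_3 + u_2$    (8)

$y_3 = \alpha_{32} y_2 + \beta_{31} z_1 + \beta_{32} z_2 + \beta_{33} z_3 + \beta_{34} z_4 + u_3$    (9)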

It is difficult to show that an equation in an SEM with more than two equations is identified.
It is clear, however, that (9) is not identified, since all exogenous regressors are included in the equation, leaving no instruments
for $y_2$; in last week's terms, we have an unidentified equation.
Equation (7), on the other hand, looks promising; we have three exogenous regressors excluded from the regression, $z_2$, $z_3$ and
$z_4$, and only two endogenous regressors, $y_2$ and $y_3$; by the order condition this equation is therefore overidentified.
In general, an equation in any SEM satisfies the order condition for identification if the number of excluded exogenous variables
from the equation is at least as large as the number of endogenous regressors.
- As such, the order condition in (8) is also satisfied since we have one excluded exogenous regressor, $z_4$, and one
endogenous regressor, $y_1$; the equation is exactly identified.
Identification of an equation, however, depends on the parameters (which we can never know for sure) in the other equations.
- For example, if $\beta_{34} = 0$ in (9) then (8) is not identified, as $z_4$ is useless as an instrument for $y_1$.
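
The counting behind the order condition can be written as a short check. In this sketch the variable names (`y1` to `y3` for the endogenous variables, `z1` to `z4` for the exogenous ones) and the function `order_status` are assumptions made for illustration; the contents of each equation follow the description of (7) to (9) above. The order condition is only necessary for identification, so a pass here does not establish the rank condition.

```python
# All exogenous variables in the system (assumed names)
ALL_EXOG = {"z1", "z2", "z3", "z4"}

# Right-hand-side variables of each equation, as described in the text
SYSTEM = {
    "(7)": {"endog": {"y2", "y3"}, "exog": {"z1"}},
    "(8)": {"endog": {"y1"}, "exog": {"z1", "z2", "z3"}},
    "(9)": {"endog": {"y2"}, "exog": {"z1", "z2", "z3", "z4"}},
}


def order_status(endog, exog, all_exog=ALL_EXOG):
    """Classify an equation by the order condition (hypothetical helper)."""
    n_excluded = len(all_exog - exog)   # candidate instruments
    n_endog = len(endog)                # endogenous regressors
    if n_excluded < n_endog:
        return "order condition fails"
    if n_excluded == n_endog:
        return "exactly identified"
    return "overidentified"


results = {name: order_status(eq["endog"], eq["exog"])
           for name, eq in SYSTEM.items()}

for name, status in results.items():
    print(name, status)
```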
