
Chapter 17

Pooling Time-Series and Cross-Sectional Data


When investigating the behavior of economic units such as households, firms, or even
nations, we often have observations on a number of such units for a number of time
periods.
The problem is how to specify a statistical model that will capture individual
differences in behavior so that we may combine or pool all the data (information) for
estimation and inference purposes.
17.1 An Economic Model
Investment demand is the purchase of durable goods by both households and firms.

Slide 17.1
Undergraduate Econometrics,2nd Edition-Chapter 17

Focusing on $V_{it}$ (the value of a firm's stock) and $K_{it}$ (capital stock) as explanatory
variables, an economic model for describing gross firm investment for the i-th firm
in the t-th time period may be expressed as

$$INV_{it} = f(V_{it}, K_{it}) \tag{17.1.1}$$

Let $y_{it} = INV_{it}$ denote values for the dependent variable, and let $x_{2it} = V_{it}$ and $x_{3it} = K_{it}$ denote
values for the explanatory variables.
A very flexible linear statistical model that corresponds to (17.1.1) is

$$y_{it} = \beta_{1it} + \beta_{2it} x_{2it} + \beta_{3it} x_{3it} + e_{it} \tag{17.1.2}$$


17.2 Seemingly Unrelated Regressions


The simplification of (17.1.2) that yields what is called the seemingly unrelated
regressions (SUR) model is
$$\beta_{1it} = \beta_{1i} \qquad \beta_{2it} = \beta_{2i} \qquad \beta_{3it} = \beta_{3i} \tag{17.2.1}$$

The parameters of the investment function differ across firms (note that the "i"
subscript remains) but are constant across time.
This assumption means that the model in (17.1.2) becomes
$$y_{it} = \beta_{1i} + \beta_{2i} x_{2it} + \beta_{3i} x_{3it} + e_{it} \tag{17.2.2}$$

The data we use to illustrate the SUR model are 20 time-series observations on
investment, stock market value and capital stock for two firms, General Electric (G)
and Westinghouse (W).
In terms of the subscripts in (17.2.2), i = G and W, and t = 1,2,...,20.

17.2.1 Estimating Separate Equations


Corresponding to the model in (17.2.2), we can specify two regression models, one for
General Electric and one for Westinghouse.
$$INV_{Gt} = \beta_{1G} + \beta_{2G} V_{Gt} + \beta_{3G} K_{Gt} + e_{Gt} \tag{17.2.3a}$$

$$INV_{Wt} = \beta_{1W} + \beta_{2W} V_{Wt} + \beta_{3W} K_{Wt} + e_{Wt} \tag{17.2.3b}$$

For the moment we make the usual least squares assumptions about the errors. That is,
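$$E(e_{Gt}) = 0 \qquad \mathrm{var}(e_{Gt}) = \sigma_G^2 \qquad \mathrm{cov}(e_{Gt}, e_{Gs}) = 0 \quad (t \ne s) \tag{17.2.4a}$$

$$E(e_{Wt}) = 0 \qquad \mathrm{var}(e_{Wt}) = \sigma_W^2 \qquad \mathrm{cov}(e_{Wt}, e_{Ws}) = 0 \quad (t \ne s) \tag{17.2.4b}$$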

Note that the two functions do have different error variances $\sigma_G^2$ and $\sigma_W^2$.
One way of linking the two equations uses a dummy variable to give the model
$$INV_t = \beta_{1G} + \delta_1 D_t + \beta_{2G} V_t + \delta_2 D_t V_t + \beta_{3G} K_t + \delta_3 D_t K_t + e_t \tag{17.2.5}$$


Dt is a dummy variable equal to 1 for the Westinghouse observations and 0 for the
General Electric observations.
(17.2.5) is just another way of writing (17.2.3).
They are identical specifications with the following relationships between the
parameters
$$\beta_{1W} = \beta_{1G} + \delta_1 \qquad \beta_{2W} = \beta_{2G} + \delta_2 \qquad \beta_{3W} = \beta_{3G} + \delta_3$$

What happens if we apply least squares to (17.2.5) utilizing all 40 observations?


The estimates of the $\beta$'s turn out to be exactly the same as those obtained by estimating the two equations separately.
The standard errors from the two procedures will be different.
If we estimate the pooled dummy variable model by least squares, we are implicitly
assuming that the error variance for et is constant over all 40 observations.
What happens, then, if we recognize the existence of heteroskedasticity ($\sigma_G^2 \ne \sigma_W^2$) and
apply generalized least squares to the pooled dummy-variable model? In this case all
the results, both coefficient estimates and standard errors, will be exactly the same as
those obtained from separate least squares estimation of the two equations.
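
The equivalence between the pooled dummy-variable model and the two separate regressions can be checked numerically. The sketch below is only an illustration: it uses synthetic data (not the actual General Electric and Westinghouse series), the helper name `ols` is hypothetical, and NumPy is assumed to be available.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 20

# Synthetic stand-ins for the two firms' data (NOT the real GE/Westinghouse series).
V_g, K_g = rng.uniform(1000, 3000, T), rng.uniform(100, 900, T)
V_w, K_w = rng.uniform(200, 1200, T), rng.uniform(10, 150, T)
inv_g = -10 + 0.03 * V_g + 0.15 * K_g + rng.normal(0, 25, T)
inv_w = -1 + 0.05 * V_w + 0.09 * K_w + rng.normal(0, 10, T)

def ols(X, y):
    """Least squares coefficients (hypothetical helper)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Separate regressions, one per firm, as in (17.2.3a) and (17.2.3b).
b_g = ols(np.column_stack([np.ones(T), V_g, K_g]), inv_g)
b_w = ols(np.column_stack([np.ones(T), V_w, K_w]), inv_w)

# Pooled dummy-variable regression as in (17.2.5): D = 0 for GE, 1 for Westinghouse.
D = np.r_[np.zeros(T), np.ones(T)]
V, K, inv = np.r_[V_g, V_w], np.r_[K_g, K_w], np.r_[inv_g, inv_w]
X_pooled = np.column_stack([np.ones(2 * T), D, V, D * V, K, D * K])
b1g, d1, b2g, d2, b3g, d3 = ols(X_pooled, inv)

# The GE coefficients come out directly; the Westinghouse ones are base + shift.
same_for_ge = np.allclose([b1g, b2g, b3g], b_g)
same_for_w = np.allclose([b1g + d1, b2g + d2, b3g + d3], b_w)
print(same_for_ge, same_for_w)
```

Only the coefficient estimates are compared here. The standard errors differ unless the two error variances are allowed to differ, which is the point made in the text.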
17.2.2 Joint Estimation of the Equations
An assumption that lets us utilize a joint estimation procedure that is better than
separate least squares estimation is
$$\mathrm{cov}(e_{Gt}, e_{Wt}) = \sigma_{GW} \tag{17.2.6}$$

This assumption says that the error terms in the two equations, at the same point in
time, are correlated. This kind of correlation is often called contemporaneous
correlation.
To understand why eGt and eWt might be correlated, recall that these errors contain the
influence on investment of factors that have been omitted from the equations.


Since the two firms are similar in many respects, it is likely that the effects of the
omitted factors on investment by General Electric will be similar to their effect on
investment by Westinghouse.
If so, then eGt and eWt will be capturing similar effects and will be correlated.
Adding the contemporaneous correlation assumption (17.2.6) has the effect of
introducing additional information that is not included when we carry out separate least
squares estimation of the two equations.
This information cannot be utilized when the equations are estimated separately.
However, it can be utilized to produce better estimates when the equations are jointly
estimated.
The GLS transformation is too complicated to present here, but it is automatically
carried out by your computer software, usually using some kind of "seemingly
unrelated regression" command.


The steps that your software follows are: (i) Estimate the equations separately using
least squares. (ii) Use the least squares residuals from step (i) to estimate $\sigma_G^2$, $\sigma_W^2$ and
$\sigma_{GW}$. (iii) Use the estimates from step (ii) to estimate the two equations jointly within a
generalized least squares framework.
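
The three steps can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the routine any particular package uses: the function names `ols` and `sur_fgls` are hypothetical, the data are synthetic, and the covariance estimates use the divisor T.

```python
import numpy as np

def ols(X, y):
    """Least squares coefficients and residuals for one equation."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return b, y - X @ b

def sur_fgls(X_g, y_g, X_w, y_w):
    """Two-equation SUR via feasible GLS (hypothetical helper)."""
    T = len(y_g)
    # Step (i): separate least squares on each equation.
    _, e_g = ols(X_g, y_g)
    _, e_w = ols(X_w, y_w)
    # Step (ii): estimate the 2x2 error covariance matrix, divisor T.
    E = np.column_stack([e_g, e_w])
    sigma = E.T @ E / T
    # Step (iii): GLS on the stacked system; error covariance is Sigma kron I_T.
    k_g, k_w = X_g.shape[1], X_w.shape[1]
    X = np.block([[X_g, np.zeros((T, k_w))],
                  [np.zeros((T, k_g)), X_w]])
    y = np.concatenate([y_g, y_w])
    omega_inv = np.kron(np.linalg.inv(sigma), np.eye(T))
    xt_oi = X.T @ omega_inv
    cov = np.linalg.inv(xt_oi @ X)
    beta = cov @ (xt_oi @ y)
    return beta, np.sqrt(np.diag(cov)), sigma

# Synthetic two-firm example with contemporaneously correlated errors.
rng = np.random.default_rng(0)
T = 20
X_g = np.column_stack([np.ones(T), rng.normal(size=(T, 2))])
X_w = np.column_stack([np.ones(T), rng.normal(size=(T, 2))])
errs = rng.multivariate_normal([0, 0], [[1.0, 0.7], [0.7, 1.0]], size=T)
y_g = X_g @ np.array([1.0, 0.5, 0.2]) + errs[:, 0]
y_w = X_w @ np.array([2.0, 0.3, 0.1]) + errs[:, 1]

beta, se, sigma = sur_fgls(X_g, y_g, X_w, y_w)
print(beta, se)
```

The first three entries of `beta` belong to the first equation and the last three to the second. In practice a packaged "seemingly unrelated regression" command would be used instead of hand-written matrix algebra.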


Table 17.1 Least Squares and Seemingly Unrelated Regression Estimates for Two Investment Functions
(standard errors in parentheses)

                  General Electric            Westinghouse
Variable          LS           SUR            LS           SUR
constant          -9.9956      -27.719        -0.509       -1.252
                  (31.374)     (27.033)       (8.015)      (6.956)
V                 0.0265       0.0383         0.0529       0.0576
                  (0.0156)     (0.0133)       (0.0157)     (0.0134)
K                 0.1517       0.1390         0.0924       0.0640
                  (0.0257)     (0.0230)       (0.0561)     (0.0489)


Since the SUR technique utilizes the information on the correlation between the error
terms, it is more precise than least squares. This fact is supported by the lower standard
errors of the SUR estimates.
Equations which exhibit contemporaneous correlation were called "seemingly
unrelated" by Arnold Zellner; the equations seem to be unrelated, but the additional
information provided by the correlation between the equation errors means that joint
generalized least squares estimation is better than single-equation least squares.
17.2.3 Separate or Joint Estimation

There are two situations where separate least squares estimation is just as good as the
SUR technique.
The first of these cases is where the errors are not correlated. If the errors are not
correlated, there is nothing linking the two equations, and separate estimation cannot
be improved upon.

The second situation is less obvious. Indeed, some advanced algebra is needed to prove
that least squares and SUR give identical estimates when the same explanatory
variables appear in each equation.
If the explanatory variables in each equation are different, then a test to see if the
correlation between the errors is significantly different from zero is of interest. If a null
hypothesis of zero correlation is not rejected, then there is no evidence to suggest that
SUR will improve on separate least squares estimation. To carry out such a test we
compute the squared correlation
$$r_{GW}^2 = \frac{\hat\sigma_{GW}^2}{\hat\sigma_G^2 \, \hat\sigma_W^2} = \frac{(176.45)^2}{(660.83)(88.662)} = (0.729)^2 = 0.53139 \tag{R17.1}$$

The variance estimates $\hat\sigma_G^2$ and $\hat\sigma_W^2$ are the usual ones from separate least squares
estimation, except that $T = 20$ rather than $T - K = 17$ has, for large-sample
approximation reasons, been used as the divisor in the formulas.
The estimated covariance is computed from

$$\hat\sigma_{GW} = \frac{1}{T} \sum_{t=1}^{20} \hat e_{Gt} \, \hat e_{Wt}$$

To check the statistical significance of $r_{GW}^2$, we test the null hypothesis $H_0: \sigma_{GW} = 0$.
If $\sigma_{GW} = 0$, then $\lambda = T r_{GW}^2$ is a test statistic that is distributed as a $\chi^2_{(1)}$ random variable
in large samples.
The 5% critical value of a $\chi^2$ distribution with one degree of freedom is 3.84. The
value of the test statistic from our data is $\lambda = 10.628$. Hence we reject the null
hypothesis of no correlation between $e_{Gt}$ and $e_{Wt}$.
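
The arithmetic behind this test is easy to reproduce. The sketch below plugs in the estimates reported in (R17.1); the function name `correlation_test` is hypothetical, and the p-value uses the fact that for a chi-square variable with one degree of freedom, P(X > x) = erfc(sqrt(x/2)).

```python
import math

def correlation_test(T, sigma_gw, sigma_g2, sigma_w2):
    """Squared residual correlation, test statistic T * r^2, and chi-square(1) p-value."""
    r2 = sigma_gw ** 2 / (sigma_g2 * sigma_w2)
    stat = T * r2
    # Upper-tail probability of a chi-square(1) variable.
    p_value = math.erfc(math.sqrt(stat / 2))
    return r2, stat, p_value

# Estimates reported in the text: covariance 176.45, variances 660.83 and 88.662.
r2, stat, p_value = correlation_test(20, 176.45, 660.83, 88.662)

CRITICAL_5PCT = 3.84  # 5% critical value for chi-square(1), as quoted in the text
reject = stat > CRITICAL_5PCT
print(round(r2, 5), round(stat, 3), reject)
```

Rounded to the precision used in the text, this gives a squared correlation of 0.53139 and a statistic of 10.628, so the null hypothesis of zero correlation is rejected.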
If we are testing for the existence of correlated errors for more than two equations, the
relevant test statistic is equal to T times the sum of squares of all the correlations; the
probability distribution under $H_0$ is a $\chi^2$ distribution with degrees of freedom equal to
the number of correlations.
For example, with 3 equations, denoted by subscripts "1", "2" and "3", the null
hypothesis is

$$H_0: \sigma_{12} = \sigma_{13} = \sigma_{23} = 0$$

and the $\chi^2_{(3)}$ test statistic is

$$\lambda = T \left( r_{12}^2 + r_{13}^2 + r_{23}^2 \right)$$

17.2.4 Testing Cross-Equation Restrictions

Suppose we are interested in whether the equations for Westinghouse and General
Electric have identical coefficients. That is, we are interested in testing
$$H_0: \beta_{1G} = \beta_{1W}, \quad \beta_{2G} = \beta_{2W}, \quad \beta_{3G} = \beta_{3W} \tag{17.2.7}$$

against the alternative that at least one pair of coefficients is not equal.
It is possible to test hypotheses such as (17.2.7) when the more general error
assumptions of the SUR model are relevant.
Most computer software will perform an F-test and/or a $\chi^2$-test.
In the context of SUR equations both tests are large-sample approximate tests. The
F-statistic has J numerator and (MT − K) denominator degrees of freedom, where M is the
number of equations, K is the total number of coefficients in the whole system and J is
the number of restrictions. Here M = 2, T = 20, K = 6 and J = 3, so MT − K = 34.
The $\chi^2$-statistic has J degrees of freedom.
For our particular example, at a 5% significance level, we find that $F = 3.01 > F_c = 2.88$
and $\chi^2 = 10.31 > \chi^2_c = 7.81$. Thus, both tests reject the null hypothesis of equal
coefficients.


17.3 A Dummy Variable Specification


Return to (17.1.2), which is

$$y_{it} = \beta_{1it} + \beta_{2it} x_{2it} + \beta_{3it} x_{3it} + e_{it} \tag{17.3.1}$$

Both the dummy variable model to be described in this section and the error
components model considered in the next section assume that
$$\beta_{1it} = \beta_{1i} \qquad \beta_{2it} = \beta_2 \qquad \beta_{3it} = \beta_3 \tag{17.3.2}$$

This model of parameter variation specifies that only the intercept parameter varies,
not the response parameters;
and the intercept varies only across firms and not over time.
Also, we will make the assumption that the errors $e_{it}$ are independent and distributed
$N(0, \sigma_e^2)$ for all individuals and in all time periods.


Given this assumption, and (17.3.2), it follows that all behavioral differences between
individual firms and over time are captured by the intercept. The resulting statistical
model is
$$y_{it} = \beta_{1i} + \beta_2 x_{2it} + \beta_3 x_{3it} + e_{it} \tag{17.3.3}$$

The dummy variable model treats each $\beta_{1i}$ as a fixed unknown parameter. We make
inferences only about the firms on which we have data.
The error components model views the firms on which we have data as a random
sample from a larger population of firms. The intercepts are treated as random
drawings from the population distribution of firm intercepts.
The example we use for introducing the dummy variable and error components
frameworks is the same investment behavior example that we used for the section on
SUR.


However, instead of using only 2 firms, we extend our data set to include 10 firms.
The data comprise T = 20 time-series observations on each of N = 10 firms.

17.3.1 The Model
To introduce the dummy variable version of (17.3.3), we define dummy variables
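$$D_{1i} = \begin{cases} 1 & i = 1 \\ 0 & \text{otherwise} \end{cases}, \qquad D_{2i} = \begin{cases} 1 & i = 2 \\ 0 & \text{otherwise} \end{cases}, \qquad D_{3i} = \begin{cases} 1 & i = 3 \\ 0 & \text{otherwise} \end{cases}, \quad \text{etc.}$$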

Under these definitions (17.3.3) becomes

$$y_{it} = \beta_{11} D_{1i} + \beta_{12} D_{2i} + \cdots + \beta_{1,10} D_{10i} + \beta_2 x_{2it} + \beta_3 x_{3it} + e_{it} \tag{17.3.4}$$


Given that the error terms $e_{it}$ are independent and $N(0, \sigma_e^2)$ for all observations, the best
linear unbiased estimator of the parameters in (17.3.4) is the least squares estimator.
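
A least squares fit of the dummy variable model can be sketched as follows. The example is illustrative only: the data are synthetic (not the 10-firm investment data), the function name `lsdv` is hypothetical, and NumPy is assumed to be available.

```python
import numpy as np

def lsdv(y, X, firm_ids):
    """Least squares with one dummy per firm and no overall constant.

    Returns the firm intercepts, the common slope coefficients, and the
    sum of squared residuals.
    """
    firms = np.unique(firm_ids)
    # One 0/1 column per firm, matching D_1i, ..., D_Ni in (17.3.4).
    D = (firm_ids[:, None] == firms[None, :]).astype(float)
    Z = np.hstack([D, X])
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ coef
    return coef[:len(firms)], coef[len(firms):], float(resid @ resid)

# Synthetic panel: N = 10 firms, T = 20 periods, two explanatory variables.
rng = np.random.default_rng(2)
N, T = 10, 20
firm_ids = np.repeat(np.arange(N), T)
X = rng.normal(size=(N * T, 2))
true_intercepts = rng.normal(0, 50, N)
y = true_intercepts[firm_ids] + X @ np.array([0.11, 0.31]) + rng.normal(0, 1, N * T)

intercepts, slopes, sse = lsdv(y, X, firm_ids)
print(intercepts.round(2), slopes.round(4))
```

The design matrix has N + 2 = 12 columns, which is where K = 12 in the F-test degrees of freedom (NT − K) comes from.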


Table 17.2 Dummy Variable Results

Variable    Parameter Estimate    Standard Error    t-Statistic
D1               -69.14               49.68            -1.39
D2               100.86               24.91             4.05
D3              -235.12               24.42            -9.63
D4               -27.63               14.07            -1.96
D5              -115.32               14.16            -8.14
D6               -23.07               12.66            -1.82
D7               -66.68               12.84            -5.19
D8               -57.36               13.99            -4.10
D9               -87.28               12.89            -6.77
D10               -6.55               11.82            -0.55
x2                 0.1098              0.0119           9.26
x3                 0.3106              0.0174          17.88

The firm intercepts vary considerably, and some of them have large t-values,
suggesting that the assumption of differing intercepts for different firms is appropriate.
To confirm this fact we can test the following hypothesis.
$$H_0: \beta_{11} = \beta_{12} = \cdots = \beta_{1N} \qquad H_1: \text{the } \beta_{1i} \text{ are not all equal} \tag{17.3.5}$$

These $N - 1 = 9$ joint null hypotheses may be tested using the usual F-test statistic.

$$F = \frac{(SSE_R - SSE_U)/J}{SSE_U/(NT - K)} = \frac{(1749127 - 522855)/9}{522855/(200 - 12)} = 48.99 \tag{R17.2}$$


If the null hypothesis is true, then $F \sim F_{(9,188)}$. The value of the test statistic
F = 48.99 yields a p-value of less than 0.0001; we reject the null hypothesis that the
intercept parameters for all firms are equal.
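
The F-value can be verified directly from the two sums of squared errors reported in (R17.2). The function name `f_statistic` below is a hypothetical helper; the numbers are those given in the text.

```python
def f_statistic(sse_restricted, sse_unrestricted, num_restrictions, df_denominator):
    """F-statistic for testing J linear restrictions."""
    numerator = (sse_restricted - sse_unrestricted) / num_restrictions
    denominator = sse_unrestricted / df_denominator
    return numerator / denominator

N, T, K = 10, 20, 12   # firms, time periods, parameters in the unrestricted model
J = N - 1              # number of equality restrictions on the intercepts

F = f_statistic(1749127, 522855, J, N * T - K)
print(round(F, 2))
```

The restricted model is the one with a single common intercept; the unrestricted model is the dummy variable model (17.3.4).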

17.4 An Error Components Model

In an error components framework we continue to model differences in firm
investment behavior by permitting each firm to have a different intercept parameter.
However, we assume the intercepts are random variables; this alternative model is
useful if the individual firms (or cross-sectional units) appearing in the sample are
randomly chosen and taken to be "representative" of a larger population of firms.
Returning to equation (17.3.3),

$$y_{it} = \beta_{1i} + \beta_2 x_{2it} + \beta_3 x_{3it} + e_{it} \tag{17.4.1}$$

we take $\beta_{1i}$ to be random and modeled as

$$\beta_{1i} = \bar\beta_1 + \mu_i, \qquad i = 1, \ldots, N \tag{17.4.2}$$

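$\bar\beta_1$ is an unknown parameter that represents the population mean intercept.
$\mu_i$ is an unobservable random error that accounts for individual differences in firm
behavior.
We assume that the $\mu_i$ are independent of each other and of $e_{it}$, and that

$$E(\mu_i) = 0 \qquad \mathrm{var}(\mu_i) = \sigma_\mu^2$$

Consequently, $E(\beta_{1i}) = \bar\beta_1$ and $\mathrm{var}(\beta_{1i}) = \sigma_\mu^2$.
Substituting (17.4.2) into (17.4.1) yields

$$\begin{aligned}
y_{it} &= (\bar\beta_1 + \mu_i) + \beta_2 x_{2it} + \beta_3 x_{3it} + e_{it} \\
       &= \bar\beta_1 + \beta_2 x_{2it} + \beta_3 x_{3it} + (e_{it} + \mu_i) \\
       &= \bar\beta_1 + \beta_2 x_{2it} + \beta_3 x_{3it} + v_{it}
\end{aligned} \tag{17.4.3}$$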

where $v_{it} = e_{it} + \mu_i$.
The phrase "error components" comes from the fact that the error term $v_{it} = e_{it} + \mu_i$
consists of two components: the overall error $e_{it}$ and the individual-specific error $\mu_i$.
The error $\mu_i$ reflects individual differences, and varies across individuals, but is
constant across time.


The choice of estimation technique depends on the properties of the new error vit. It can
be shown that
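$$E(v_{it}) = 0 \quad \text{($v_{it}$ has zero mean)} \tag{17.4.4a}$$

$$\mathrm{var}(v_{it}) = \sigma_\mu^2 + \sigma_e^2 \quad \text{($v_{it}$ is homoskedastic)} \tag{17.4.4b}$$

$$\mathrm{cov}(v_{it}, v_{is}) = \sigma_\mu^2 \ (t \ne s) \quad \text{(the errors from the same firm in different time periods are correlated)} \tag{17.4.4c}$$

$$\mathrm{cov}(v_{it}, v_{js}) = 0 \ (i \ne j) \quad \text{(errors from different firms are always uncorrelated)} \tag{17.4.4d}$$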
The nonzero correlation in (17.4.4c) means that least squares is not the optimal
technique.
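
The covariance structure in (17.4.4b) and (17.4.4c) can be made concrete by writing out the error covariance matrix for a single firm. The sketch below uses made-up variance components (4.0 and 1.0 are illustrative assumptions, not estimates from the investment data), and the function names are hypothetical.

```python
def firm_error_covariance(T, sigma_mu2, sigma_e2):
    """T x T covariance matrix of (v_i1, ..., v_iT) for one firm.

    Diagonal entries are sigma_mu2 + sigma_e2, as in (17.4.4b);
    off-diagonal entries are sigma_mu2, as in (17.4.4c).
    """
    return [[sigma_mu2 + (sigma_e2 if t == s else 0.0) for s in range(T)]
            for t in range(T)]

def within_firm_correlation(sigma_mu2, sigma_e2):
    """Correlation between two errors of the same firm in different periods."""
    return sigma_mu2 / (sigma_mu2 + sigma_e2)

# Illustrative (assumed) variance components.
V = firm_error_covariance(3, sigma_mu2=4.0, sigma_e2=1.0)
rho = within_firm_correlation(4.0, 1.0)

for row in V:
    print(row)
print(rho)
```

The larger the individual-specific variance is relative to the overall error variance, the closer this correlation is to one and the further the errors are from satisfying the usual least squares assumptions.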

The generalized least squares estimator, which uses a transformed model with an
appropriately transformed error term, is a better estimator. Also, it yields standard
errors that are appropriate for interval estimation and hypothesis testing.
These tasks are performed automatically using appropriate econometric software. If we
do so for the investment function that utilizes the 20 time-series observations on 10
firms, we obtain the following generalized least squares estimated equation
$$\hat{y}_{it} = \underset{(28.875)}{-57.873} + \underset{(0.0105)}{0.1095}\, x_{2it} + \underset{(0.0172)}{0.3087}\, x_{3it} \tag{R17.3}$$

In this case the response parameters for the value and capital stock variables, and their
standard errors, are virtually identical to those obtained from the dummy variable
model. It makes little difference which model is specified.
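
For readers curious about the transformed model behind the generalized least squares estimator, one standard way to write it is as a partial demeaning: each variable has a fraction theta of its firm mean subtracted, with theta = 1 − sqrt(sigma_e² / (T·sigma_mu² + sigma_e²)), and least squares is then applied. The sketch below assumes the variance components are known and the panel is balanced; software estimates the variance components first. The function name `random_effects_gls` and the data are hypothetical.

```python
import numpy as np

def random_effects_gls(y, X, firm_ids, sigma_mu2, sigma_e2):
    """GLS for the error components model via partial demeaning.

    X must already contain a column of ones; a balanced panel and known
    variance components are assumed.
    """
    _, idx = np.unique(firm_ids, return_inverse=True)
    T = np.bincount(idx)[0]  # observations per firm (balanced panel)
    theta = 1.0 - np.sqrt(sigma_e2 / (T * sigma_mu2 + sigma_e2))
    # Firm means of y and of each column of X, repeated for every observation.
    y_bar = (np.bincount(idx, weights=y) / T)[idx]
    X_bar = np.column_stack(
        [np.bincount(idx, weights=X[:, j]) / T for j in range(X.shape[1])]
    )[idx]
    beta, *_ = np.linalg.lstsq(X - theta * X_bar, y - theta * y_bar, rcond=None)
    return beta, theta

# Synthetic panel: 10 firms, 20 periods, random firm-specific intercept shifts.
rng = np.random.default_rng(3)
N, T = 10, 20
firm_ids = np.repeat(np.arange(N), T)
X = np.column_stack([np.ones(N * T), rng.normal(size=(N * T, 2))])
mu = rng.normal(0, 2.0, N)
y = X @ np.array([5.0, 0.11, 0.31]) + mu[firm_ids] + rng.normal(0, 1.0, N * T)

beta, theta = random_effects_gls(y, X, firm_ids, sigma_mu2=4.0, sigma_e2=1.0)
print(beta.round(4), round(float(theta), 4))
```

When the individual-specific variance is zero, theta is zero and the estimator reduces to ordinary least squares on the pooled data. As that variance grows, theta approaches one and the slope estimates approach those of the dummy variable model, which is consistent with the near-identical results reported above.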
