Panel Data Econometrics Kenya

COLLABORATIVE MASTERS PROGRAMME
IN ECONOMICS FOR ANGLOPHONE AFRICA

(CMAP)
JOINT FACILITY FOR ELECTIVES 2008
TOPICS IN MICROECONOMETRICS
DR. MOSES SICHEI*
Date: 8th September, 2008
LECTURE 10: INTRODUCTION TO PANEL ECONOMETRICS I
* Research Department, Central Bank of Kenya. Contact details: Tel. 254 20 2860000 Ext. 3248; Mobile: +254 723383505; Email: sichei@yahoo.co.uk or Sicheimm@centralbank.go.ke
Objectives:
The main objective of the lecture is to provide motivation for panel data models. Specifically, the lecture covers the following:
• The place of panel data in the context of other data types
• Types of panel data
• Advantages of panel data
• Limitations of panel data
• Overview of panel data models
Key words
• Balanced panel data
• Cross-section oriented panel data
• Dynamic panel
• Macro-panel data
• Micro-panel data
• Nonstationary panel data
• Panel data
• Pooled data
• Pseudo-panel
• Rotating panels
• Seemingly unrelated regression model
• Spatial panel data
• Static panel data
• Stationary panel data
• Synthetic panel data
• Unbalanced panel data
1. INTRODUCTION AND MOTIVATION
1.1 Types of Data
1.1.1 Cross-section data
• Values of one or more variables are collected for several sample units/economic entities
at the same point in time.
• In other words it is a snapshot at a point in time
• Examples
o Poverty rates in different countries in Africa at a particular point in time

o Econometrics marks for the 2008 CMAP group
o Household survey data for Uganda
• Cross-section models are predominantly equilibrium models that generally do not shed
light on intertemporal dependence of events
• They fail to resolve fundamental issues about the sources of persistent behaviour
• For instance what’s the main cause of high non-performing loans in country A?
1.1.2 Time series data
• Observe the values of one or more variables over time e.g. GDP, money supply for
several years.
• They shed light on intertemporal dependence of events
• E.g. autoregressive distributed lag models, error correction models etc.
1.1.3 Panel data (cross-section and time series)
• Cross-section repeatedly sampled over time but where the same economic agent has been
followed throughout the period of the sample.
• An example is the average marks for the CMAP econometrics course for each university
over the period 2001-2008.
Period | University of Nairobi | University of Dar Es Salaam | University of Ghana | University of Addis Ababa | University of Malawi | University of Zimbabwe | University of Botswana
2001
2002
2003
2004
2005
2006
2007
2008
• In other words, panel data combine cross-section data (a “picture” or “snapshot” across space) with time series data (a “path” or “movie” through time).
Other terms used:
• Pooled data (pooling of time series and cross-section observations)

• Combination of time series and cross-section data
• Longitudinal data (Study over time of a variable or group of subjects).
• Event history (study of the movement over time of subjects through successive states
or conditions)
• Cohort analysis (e.g. following the career path of the first CMAP graduates)
⇒ All these terminologies essentially connote movements over time of cross-sectional units.
⇒Thus panel data is used as a generic term to include one or more of these situations.
⇒Regressions based on such data are called panel data regression models
Examples
• Gravity model of trade, where you observe trade figures for different countries/products
over time
• Investment model, where your cross-sections are the firms observed over time
• Studies dealing with a panel of commercial banks
• Etc.
1.2 Structures of Panel Data
(a) Cross-section oriented panel data. The number of cross-sections (N) is greater than the time dimension (T), e.g. a study covering 24 banks over 10 years. This is the original form of panel data.
(b) Time-series oriented panel data. The time dimension (T) is greater than the number of cross-sections (N), e.g. a study of the demand for 4 different oil products covering a period of, say, 10 years. This is quite common in macroeconomics.
(c) Balanced panel data. This is panel data where there are no missing observations for any cross-section.
(d) Unbalanced panel data. This is the case where the cross-sections do not have the same number of observations. In other words, some cross-sections have missing data. For example, when studying Ghana’s trade with a number of countries in Africa including South Africa, there would be no export figures before 1994 due to the sanctions imposed on South Africa.
(e) Rotating panels. This is a case where, in order to keep the same number of economic agents in a survey, the fraction of economic agents that drops from the sample in the second period is replaced by an equal number of similar economic agents that are freshly surveyed. This is a necessity in survey panels where the same economic agent (say a household) may not want to be interviewed again and again.
(f) Pseudo-panels/synthetic panels. This is panel data that is close to a genuine panel data structure. For some countries, genuine panel data may not exist. Instead, the researcher may find annual household surveys based on a large random sample of the population. For instance, in Kenya there are household surveys for 1993, 1994, 1997 and the recent KIHBS 2006. For these repeated cross-section surveys, it may be impossible to track the same household over time as required in a genuine panel. In pseudo-panels, cohorts are tracked instead (e.g. males born between 1970 and 1980). For large samples, successive surveys will generate random samples of members of each cohort. We can then estimate economic relations based on cohort means rather than individual observations.
(g) Spatial Panels. This is panel data dealing with space. For instance cross-section
of countries, regions, states. These aggregate units are likely to exhibit cross-
sectional correlation that has to be dealt with using special methods (spatial
econometrics)
(h) Limited dependent/nonlinear panel data. This is panel data where the dependent variable is not completely continuous: binary (logit/probit models), hierarchical (nested logit models), ordinal (ordered logit/probit), categorical (multinomial logit/probit), count (Poisson and negative binomial), truncated (truncated regression), censored (tobit), or subject to sample selection (Heckit model).
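The structural labels in (a) to (d) can be read directly off the list of (unit, period) pairs in a data set. The following is a minimal sketch in Python, not part of the original notes; the function name `describe_panel` and the bank identifiers are hypothetical and used only for illustration.

```python
from collections import defaultdict


def describe_panel(observations):
    """Summarise the structure of a panel given (unit, period) pairs."""
    periods_by_unit = defaultdict(set)
    for unit, period in observations:
        periods_by_unit[unit].add(period)

    all_periods = set().union(*periods_by_unit.values())
    n_units = len(periods_by_unit)
    n_periods = len(all_periods)

    # Balanced: every unit is observed in every period that appears in the data.
    balanced = all(p == all_periods for p in periods_by_unit.values())

    if n_units > n_periods:
        orientation = "cross-section oriented (N > T)"
    elif n_periods > n_units:
        orientation = "time-series oriented (T > N)"
    else:
        orientation = "N = T"

    return {"N": n_units, "T": n_periods,
            "balanced": balanced, "orientation": orientation}


# Hypothetical example: 3 banks over 2 years, with one bank-year missing.
obs = [("bank_a", 2001), ("bank_a", 2002),
       ("bank_b", 2001), ("bank_b", 2002),
       ("bank_c", 2002)]
summary = describe_panel(obs)
print(summary)
```

Because `bank_c` has no 2001 observation, the sketch reports an unbalanced, cross-section oriented panel.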
1.3 Types of Panel Data Models

(a) Static panel data models vs dynamic panel data models. A static panel data model has no lagged dependent variable on the right-hand side (RHS); a dynamic panel data model includes one.
(b) Stationary panel data models vs nonstationary panel data models. A stationary panel data model contains stationary variables (i.e. I(0) variables), as opposed to nonstationary variables.
1.4 Why Panel Data? (Baltagi, 2005, Chapter 1)
1. Control for heterogeneity among economic agents.

• Time series and cross-section studies that do not control for the heterogeneity run the
risk of obtaining biased results.
2. More informative data, more variability, less collinearity amongst variables, more degrees of freedom and more efficiency.
• Time series studies are faced with multicollinearity in most cases. For instance, in studying the demand for beer in Kenya using time series, there is likely to be high collinearity between price and income in aggregate time series data. This is less likely with a panel across the 8 provinces in Kenya, since the cross-section dimension adds a lot of variability, adding more informative data on price and income. The idea is that the variation in the data can be decomposed into variation between the 8 provinces of different sizes and characteristics and variation within each province over time.
• More degrees of freedom, since the sample has N × T observations.
• Increased precision in estimation (more efficiency). With additional, more informative data, we can produce more reliable parameter estimates.
3. Panel data are better able to study the dynamics of adjustment. Panel data are better suited for studying the duration of economic states like unemployment and poverty and, if such panels are long enough, they can shed some light on the speed of adjustment to economic policy changes.
• E.g. the effects of free primary education on poverty.
• Questions such as determining whether families’ experiences of poverty,
unemployment and dependency ratios are transitory or chronic necessitate the use of
panels. By studying the repeated cross-section of observations, panel data are better
suited to study the dynamics of change.
• Estimation of intertemporal relations, life cycles and inter-generational models
4. Panel data are better able to identify and measure the effects that are simply not
detectable in pure cross-section and pure time series data
• Does union membership in Kenya increase or decrease wages? We need to observe a
worker moving from union to nonunion jobs or vice versa. Holding the individual’s
characteristics constant, we will be better equipped to determine whether union
membership affects wage and by how much.
5. Panel data models allow us to construct and test more complicated behavioural models than purely cross-section or time series data, e.g. technical efficiency is better studied with panels.
6. Panel data are usually gathered on micro units like individuals, firms, households,
countries etc. Many variables can be more accurately measured at the micro level
and biases resulting from aggregation over firms or individuals are eliminated.
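The decomposition mentioned under point 2 (variation between units plus variation within each unit over time) can be checked numerically. Below is a minimal sketch in Python; the function `decompose_variation` is a hypothetical helper and the price figures are made up purely for illustration, not real data.

```python
def decompose_variation(panel):
    """Split total variation of a variable into between and within parts.

    `panel` maps each unit to its list of observations over time.
    """
    all_values = [v for values in panel.values() for v in values]
    grand_mean = sum(all_values) / len(all_values)

    total = sum((v - grand_mean) ** 2 for v in all_values)
    between = 0.0
    within = 0.0
    for values in panel.values():
        unit_mean = sum(values) / len(values)
        # Between: distance of the unit mean from the grand mean, once per observation.
        between += len(values) * (unit_mean - grand_mean) ** 2
        # Within: movement of each observation around its own unit mean.
        within += sum((v - unit_mean) ** 2 for v in values)
    return total, between, within


# Made-up beer prices for three provinces over four years (illustrative only).
prices = {
    "Nairobi": [10.0, 11.0, 12.0, 13.0],
    "Coast": [8.0, 8.5, 9.0, 9.5],
    "Rift Valley": [6.0, 7.0, 6.5, 7.5],
}
total, between, within = decompose_variation(prices)
print(total, between, within)
```

The total sum of squares equals the between part plus the within part. A pure time series for one province would only ever see the within part.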
1.5 Limitations of Panel Data Analysis
1. Design and data collection problems

• Coverage (incomplete account of the population of interest)
• Non-response (due to lack of cooperation of the respondent, e.g. fear about how the results will be used (tax?), or because of interviewer error)
• Recall (respondents not remembering correctly)
• Frequency of interviews
• Time-in-sample bias: observed when a significantly different level for a characteristic occurs in the first interview than in later interviews, when ideally one would expect the same level
2. Distortions of measurement error
Measurement errors may arise because of
• Faulty responses due to unclear questions, memory errors, deliberate distortions (e.g. prestige bias), inappropriate informants, or misrecording of responses
3. Selectivity problems
These include;
• Self-selectivity: for instance, people choose not to work because the reservation wage is higher than the offered wage. In this case we only observe the characteristics of these individuals but not their wage. Panel data does not solve such a problem.
• Non-response/partial response: occurs mainly at the initial wave of the panel due to refusal to participate, nobody at home, untraced sample unit, etc. This causes a loss in efficiency as well as serious identification problems for the population parameters.
• Attrition-Respondents may die, or move or find that the cost of responding is too
high. In order to counter the effects of attrition, rotating panels are sometimes used,
where a fixed percentage of the respondents are replaced in every wave to replenish
the sample.
4. Very short time series dimension
• Sometimes data for the problem at hand has a short time span for each individual
(micropanels).
• This means that asymptotic arguments rely crucially on the number of individuals tending to infinity, since the time dimension is too short to support asymptotics in T.
5. Cross-section dependence
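• Macropanels on countries or regions may lead to misleading inference if cross-section dependence is not accounted for.
• Most analyses assume independence across cross-sections.
• When there is dependence between cross-sections, estimation and inference become more complicated.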
• More on this when we handle panel unit roots and panel cointegration
2. OVERVIEW OF PANEL DATA MODELS
• Panel data model notation differs from a regular time series or cross-section regression in that it has a double subscript on its variables: $y_{it}$, $x_{it}$.
2.1 Panel Data Models

• A very general linear model for panel data permits the intercept and slope coefficients to vary over both individual and time:
$$y_{it} = \alpha_{it} + x_{it}'\beta_{it} + \varepsilon_{it}, \quad i = 1,\dots,N, \quad t = 1,\dots,T \qquad (1)$$
• Here $y_{it}$ is a scalar dependent variable, $x_{it}$ is a $k \times 1$ vector of independent variables, $\varepsilon_{it}$ is a scalar disturbance term, $i$ indexes individuals (firms, countries, etc.) and $t$ indexes time.
• This model is too general and not estimable, as there are more parameters to estimate than observations.
• More restrictions need to be placed on the extent to which $\alpha_{it}$ and $\beta_{it}$ can vary with $i$ and $t$, and on the behaviour of the error term $\varepsilon_{it}$.
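A quick count shows why the general model (1) cannot be estimated. The following is a sketch that assumes nothing beyond the definitions above (k regressors, N individuals, T periods):

```latex
\underbrace{NT}_{\text{intercepts } \alpha_{it}}
+ \underbrace{NT \times k}_{\text{slopes } \beta_{it}}
= NT(k+1) \;>\; NT \quad \text{for any } k \geq 1
```

Every observation would carry its own k+1 coefficients, so the NT data points can never pin them down. Each model that follows can be read as a way of cutting this count.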
2.2 The Pooled Data Model

• This is the most restrictive model. It specifies constant coefficients, which is the usual assumption in cross-section analysis:
$$y_{it} = \alpha + x_{it}'\beta + \varepsilon_{it} \qquad (2)$$
where $i$ denotes households, individuals, firms, countries, etc. and $t$ denotes time.
• We assume that errors are homoscedastic and serially independent both within and between individuals (cross-sections):
$$\mathrm{Var}(\varepsilon_{it}) = \sigma^2$$
$$\mathrm{Cor}(\varepsilon_{it}, \varepsilon_{js}) = 0 \quad \text{when } i \neq j \text{ and/or } t \neq s$$
• The marginal effects $\beta$ of the $k \times 1$ vector of time-varying characteristics $x_{it}$ are taken to be common across $i$ and $t$, although this assumption can itself be tested.
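• If the model is correctly specified and regressors are uncorrelated with the error term, pooled OLS will produce consistent and efficient estimates of the parameters.
• This is the pooled least squares model:
$$\hat{\beta} = \left[\frac{1}{NT}\sum_{i=1}^{N}\sum_{t=1}^{T}\tilde{x}_{it}\tilde{x}_{it}'\right]^{-1}\left[\frac{1}{NT}\sum_{i=1}^{N}\sum_{t=1}^{T}\tilde{x}_{it}\tilde{y}_{it}\right]$$
where $\bar{x} = \frac{1}{NT}\sum_{i=1}^{N}\sum_{t=1}^{T}x_{it}$, $\bar{y} = \frac{1}{NT}\sum_{i=1}^{N}\sum_{t=1}^{T}y_{it}$, $\tilde{x}_{it} = x_{it} - \bar{x}$, $\tilde{y}_{it} = y_{it} - \bar{y}$ (3)
$$\hat{\alpha} = \bar{y} - \bar{x}'\hat{\beta}$$
With a single regressor, $\hat{\beta}$ reduces to the ratio $\sum\sum \tilde{x}_{it}\tilde{y}_{it} \big/ \sum\sum \tilde{x}_{it}^2$.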
• This formulation does not distinguish between two different individuals and the same
individual at two different points in time
• This feature undermines the accuracy of the approach when differences do exist between
cross-sectional units.
• Nonetheless, the increase in the sample by pooling data across time generates an
improvement in efficiency relative to a single cross-section.
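[Figure: a single fitted line $y_{it} = \alpha + x_{it}'\beta + \varepsilon_{it}$, with $y_{it}$ on the vertical axis and $x_{it}$ on the horizontal axis.]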
• Here we do not use any panel information. The data are treated as if there was only one
single index.
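The pooled estimator in equation (3) can be sketched in a few lines for the single-regressor case. This is an illustrative Python sketch, not the author's code; the function name `pooled_ols` and the two-unit data set (constructed so that y = 1 + 2x exactly) are assumptions made for the example.

```python
def pooled_ols(y, x):
    """Pooled OLS for a single regressor, following equation (3).

    `y` and `x` map each unit to a list of observations over time.
    All NT observations are stacked and treated as one sample.
    """
    ys, xs = [], []
    for unit in y:
        ys.extend(y[unit])
        xs.extend(x[unit])

    n = len(ys)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n

    # Sums of cross-products and squares of deviations from the overall means.
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(xs, ys))
    sxx = sum((a - x_bar) ** 2 for a in xs)

    beta = sxy / sxx
    alpha = y_bar - beta * x_bar
    return alpha, beta


# Hypothetical data generated exactly from y = 1 + 2x for both units.
x = {"A": [1.0, 2.0, 3.0], "B": [2.0, 4.0, 6.0]}
y = {"A": [3.0, 5.0, 7.0], "B": [5.0, 9.0, 13.0]}
alpha_hat, beta_hat = pooled_ols(y, x)
print(alpha_hat, beta_hat)
```

Note that the unit labels play no role once the data are stacked, which is exactly the sense in which no panel information is used.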
2.3 Traditional Panel Data Model
• In this case the constant term, $\alpha_i$, varies from individual to individual:
$$y_{it} = \alpha_i + x_{it}'\beta + \varepsilon_{it} \qquad (4)$$
• We assume that errors are homoscedastic and serially independent both within and between individuals (cross-sections):
$$\mathrm{Var}(\varepsilon_{it}) = \sigma^2$$
$$\mathrm{Cor}(\varepsilon_{it}, \varepsilon_{js}) = 0 \quad \text{when } i \neq j \text{ and/or } t \neq s$$
• This is what we refer to in panel parlance as individual (unobserved) heterogeneity.

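• The slopes are the same for all individuals, i.e. $\beta_i = \beta$ for all $i$.
• Its graphical form is as follows:
[Figure: three parallel lines $y_{it} = \alpha_j + x_{it}'\beta + \varepsilon_{it}$, $j = 1, 2, 3$, with different intercepts $\alpha_1$, $\alpha_2$, $\alpha_3$ and a common slope, $\tan(\theta_1) = \tan(\theta_2) = \tan(\theta_3) = \beta$; $y_{it}$ on the vertical axis, $x_{it}$ on the horizontal axis.]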
2.4 Traditional Seemingly Unrelated Regression (SUR) Model

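• The constant terms, $\alpha_i$, and slope coefficients, $\beta_i$, vary from individual to individual:
$$y_{it} = \alpha_i + x_{it}'\beta_i + \varepsilon_{it}$$
[Figure: three lines $y_{it} = \alpha_j + x_{it}'\beta_j + \varepsilon_{it}$, $j = 1, 2, 3$, with different intercepts $\alpha_1$, $\alpha_2$, $\alpha_3$ and different slopes, $\tan(\theta_1) > \tan(\theta_2) > \tan(\theta_3)$; $y_{it}$ on the vertical axis, $x_{it}$ on the horizontal axis.]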
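• In the SUR model, the error terms are assumed to be contemporaneously correlated and heteroscedastic between individuals:
$$\mathrm{Var}(\varepsilon_{it}) = \sigma_i^2$$
$$\mathrm{Cov}(\varepsilon_{it}, \varepsilon_{jt}) = \sigma_{ij} \quad \text{(contemporaneous, i.e. same-period, correlation)}$$
$$\mathrm{Cov}(\varepsilon_{it}, \varepsilon_{js}) = 0 \quad \text{when } t \neq s$$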
2.5 Which model is appropriate for my Data?

• A large number of independent individuals observed for a few time periods (N >> T). This is common in cross-sectional panels.
o It is not possible to estimate different individual slopes, $\beta_i$, for all the exogenous variables. The panel data model is the most appropriate.
• Medium-length time series for relatively few individuals (say countries, firms, sectors, banks, etc.), so that T > N.
o In this case the SUR model may be appropriate.
o Efficient SUR estimation is mainly used when $T \geq N$.
o Equation-by-equation OLS is used if $K \leq T \leq N$.
In terms of restrictiveness, the models are ordered as follows:

Pooled (most restrictive) < Panel < SUR (least restrictive)
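One way to see this ordering is to count the mean-equation coefficients each model estimates. The sketch below is illustrative only: `count_parameters` is a hypothetical helper, it ignores the variance and covariance parameters of the error term, and the values N = 12 and K = 1 are chosen arbitrarily.

```python
def count_parameters(n_units, n_regressors):
    """Number of intercept and slope coefficients in each model."""
    return {
        # Pooled: one common intercept and one common slope vector.
        "pooled": 1 + n_regressors,
        # Panel: N individual intercepts, one common slope vector.
        "panel": n_units + n_regressors,
        # SUR: an intercept and a slope vector for every individual.
        "sur": n_units * (1 + n_regressors),
    }


counts = count_parameters(n_units=12, n_regressors=1)
print(counts)
```

The SUR count grows with N times (K+1), which is why it needs a reasonably long time dimension for each individual.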
References
Baltagi, B.H. (2005), Econometric Analysis of Panel Data, 3rd Edition, John Wiley. Chapter 1.
Cameron, A.C. and Trivedi, P.K. (2006), Microeconometrics: Methods and Applications, Cambridge University Press. Chapter 21.
Hsiao, C. (1986), Analysis of Panel Data, Cambridge University Press.
Greene, W.H. (2003), Econometric Analysis, Second Edition, Macmillan. Chapter 13.
Wooldridge, J.M. (2002), Econometric Analysis of Cross-Section and Panel Data, MIT Press: Cambridge, Massachusetts. Chapter 10.
LECTURE 11: ONE-WAY ERROR COMPONENT MODELS
Objectives:
The main objective of the lecture is to understand some basic panel data models in the context of one-way error components.
Key words
• Balanced panel data
• Between estimator
• Cross-section oriented panel data
• Dynamic panel
• Fixed effects model(FEM)
• Generalised least squares
• Least squares dummy variable (LSDV)
• Panel data
• Pooled data
• One-way error components model
• Random effects model (REM)
• Within estimation
1. INTRODUCTION
1.1 One-Way Error Component Model

• This model allows cross-section heterogeneity in the error term.
• From the traditional panel data model in Lecture 10:
$$y_{it} = \alpha_i + x_{it}'\beta + \varepsilon_{it} \qquad (1)$$
• The error term in equation (1) is decomposed into:
$$\varepsilon_{it} = \mu_i + \nu_{it} \qquad (2)$$
• Here $\mu_i$ denotes the unobservable individual-specific effect and $\nu_{it}$ denotes the idiosyncratic errors or idiosyncratic disturbances, which change across time and cross-section.
• $\mu_i$ is time invariant (the same in every period) and accounts for any individual-specific effect that is not included in the regression.
1.1.1 Meaning of the unobservable individual specific effects µ i
• These refer to unobservable individual specific effects which are not included in the
equation because of :
o We do not know exactly how to specify them explicitly
o We know but have no data
• We simply want to acknowledge their existence
• For instance in a production function utilizing data on firms across time, µ i refers to the
unobservable entrepreneurial or management skills of the firm executives
The other terminologies given to µ i are;
• Latent variable
• Unobserved heterogeneity
• Substituting equation (2) into (1) yields the following one-way error component model:
$$y_{it} = \alpha + \mu_i + x_{it}'\beta + \nu_{it} \qquad (3)$$
• Note here that
$$\alpha = \frac{1}{N}\sum_{i=1}^{N}\alpha_i, \qquad \mu_i = \alpha_i - \alpha$$
$\alpha$ is the average individual effect while $\mu_i$ is the individual deviation from the average (recall the reference class in the multinomial logit model).
2. FIXED EFFECTS MODEL (FEM)

• This is appropriate when differences between individual economic agents may
reasonably be viewed as parametric shifts in the regression function itself.
• Suppose we have a simple linear panel regression model of the form:
$$y_{it} = \alpha + \mu_i + x_{it}'\beta + \nu_{it} \qquad (4)$$
• The $\alpha + \mu_i$ are possibly correlated with the regressors $x_{it}$.
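• The following figure shows how the FEM handles the heterogeneity issue.
[Figure: $y_{it}$ against $x_{it}$ for two groups. Group 1 lies along $E[y_{it} \mid x_{it}] = \alpha + \mu_1 + \beta x_{it}$ and Group 2 along $E[y_{it} \mid x_{it}] = \alpha + \mu_2 + \beta x_{it}$, with intercepts $\alpha + \mu_1$ and $\alpha + \mu_2$. The pooled line $E[y_{it} \mid x_{it}] = \alpha + \beta x_{it}$ has a biased slope when the fixed effects are ignored.]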
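The bias illustrated in the figure can be reproduced with a tiny constructed data set. The sketch below is in Python and is only an illustration: the two groups, their intercepts and the true slope of 1 are made-up values chosen so that the group effects are correlated with the regressor.

```python
def slope(xs, ys):
    """Simple OLS slope of ys on xs (with an intercept)."""
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(xs, ys))
    sxx = sum((a - x_bar) ** 2 for a in xs)
    return sxy / sxx


def demean(values):
    """Subtract the group mean from each observation."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


# Made-up data: the group with small x has the large intercept (alpha + mu_i).
x = {"group1": [1.0, 2.0, 3.0], "group2": [6.0, 7.0, 8.0]}
intercepts = {"group1": 10.0, "group2": 0.0}
true_beta = 1.0
y = {g: [intercepts[g] + true_beta * v for v in x[g]] for g in x}

# Pooled regression: ignores which group an observation belongs to.
pooled_x = [v for g in x for v in x[g]]
pooled_y = [v for g in x for v in y[g]]
pooled_slope = slope(pooled_x, pooled_y)

# Within regression: uses deviations from each group's own mean.
within_x = [v for g in x for v in demean(x[g])]
within_y = [v for g in x for v in demean(y[g])]
within_slope = slope(within_x, within_y)

print(pooled_slope, within_slope)
```

In this constructed case the pooled slope even has the wrong sign, while the within slope recovers the common slope of each group line.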
2.1 ESTIMATION OF FIXED EFFECTS MODEL

• The challenge of estimation is the presence of the N individual-specific effects, whose number increases as $N \to \infty$.
• Nonetheless, there are several methods that can be applied:
o Least squares dummy variables (LSDV), which is direct OLS with an indicator (dummy) variable for each of the N fixed effects
o OLS in the within estimation context
o Generalized least squares in the within model context
o Maximum likelihood estimation conditional on the individual means $\bar{y}_i$, $i = 1, 2, \dots, N$
o OLS in first differences
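As an illustration of the last method in the list, differencing consecutive periods removes both $\alpha$ and $\mu_i$, leaving only the slope to estimate. The following Python sketch assumes a single regressor; `first_difference_slope` is a hypothetical helper and the firm data are constructed (noise-free, true slope 0.5) for the example.

```python
def first_difference_slope(x, y):
    """OLS through the origin on first differences, single regressor.

    Differencing y_it - y_i,t-1 removes alpha and mu_i, because they
    do not change over time.
    """
    dx, dy = [], []
    for unit in x:
        dx.extend(b - a for a, b in zip(x[unit], x[unit][1:]))
        dy.extend(b - a for a, b in zip(y[unit], y[unit][1:]))
    return sum(a * b for a, b in zip(dx, dy)) / sum(a * a for a in dx)


# Constructed data: different firm intercepts (5 and -3), common slope 0.5.
x = {"firm_a": [1.0, 3.0, 4.0], "firm_b": [2.0, 2.5, 6.0]}
y = {
    "firm_a": [5.0 + 0.5 * v for v in x["firm_a"]],
    "firm_b": [-3.0 + 0.5 * v for v in x["firm_b"]],
}
fd_slope = first_difference_slope(x, y)
print(fd_slope)
```

The firm intercepts never enter the calculation, which is the point of the transformation.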
3. LEAST SQUARES DUMMY VARIABLE (LSDV) ESTIMATION
• This approach assumes that any difference across economic agents can be captured by
shifts in the intercepts of a standard OLS regression.
$$y_{it} = \alpha + \mu_i + x_{it}'\beta + \nu_{it}$$
• We estimate an LSDV model by first defining a series of individual-specific dummy variables.
• In principle one simply estimates the OLS regression of $y_{it}$ on $x_{it}$ and a set of N−1 indicator variables $d_{1,it}, d_{2,it}, \dots, d_{N-1,it}$.
• The resulting estimator of $\beta$ turns out to equal the within estimator (a regression on deviations from individual means).
• This is a special case of the Frisch-Waugh-Lovell theorem. You have been using this theorem already: running a regression through the origin after subtracting the means produces the same slope coefficients as running it with an intercept.
• The theorem was introduced by Frisch and Waugh (1933), and then reintroduced by Lovell (1963).
• Read pages 62-75 of Econometric Theory and Methods by Davidson and MacKinnon (2004) for more details of the theorem.
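The equality between the LSDV and within estimators can be verified numerically. The sketch below is plain Python with a small hand-written linear solver, so it needs no external library; the helper names (`solve`, `ols`, `demean_by_unit`) and the nine made-up observations are assumptions for illustration only.

```python
def solve(a, b):
    """Solve the linear system a z = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(rows, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def demean_by_unit(values, units):
    """Subtract each unit's own time mean from its observations."""
    totals, counts = {}, {}
    for u, v in zip(units, values):
        totals[u] = totals.get(u, 0.0) + v
        counts[u] = counts.get(u, 0) + 1
    return [v - totals[u] / counts[u] for u, v in zip(units, values)]


# Made-up panel: 3 units (A, B, C) observed for 3 periods each.
units = ["A", "A", "A", "B", "B", "B", "C", "C", "C"]
x = [1.0, 2.0, 4.0, 2.0, 3.0, 5.0, 5.0, 7.0, 6.0]
y = [3.0, 4.0, 9.0, 1.0, 2.0, 5.0, 10.0, 13.0, 12.0]

# LSDV: intercept, N-1 dummies (unit C is the reference class), then x.
design = [[1.0, float(u == "A"), float(u == "B"), xi] for u, xi in zip(units, x)]
beta_lsdv = ols(design, y)[-1]

# Within: regression through the origin on unit-demeaned data.
x_w = demean_by_unit(x, units)
y_w = demean_by_unit(y, units)
beta_within = sum(a * b for a, b in zip(x_w, y_w)) / sum(a * a for a in x_w)

print(beta_lsdv, beta_within)
```

The two slope estimates agree up to floating-point error, as the Frisch-Waugh-Lovell theorem implies.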
Digression: Different ways of stacking data

• Suppose we are studying private consumption in 12 African countries over the period 1998-2003.
• We have data on real consumption and income for the different African countries.
Dependent variable: consumption
(i) stacked vertically to create NT × 1 vector
Country Period Consumption
Botswana 1998 7180.3
Burkina Faso 1998 1283027.4
Burundi 1998 462066.7
Burundi 1999 492055.5
Burundi 2000 465738.0
Burundi 2001 469289.9
Burundi 2002 491701.3
Burundi 2003 486195.2
Kenya 1998 596883.1
Kenya 1999 594332.1
Kenya 2000 609862.0
Kenya 2001 629103.7
Kenya 2002 650968.4
Kenya 2003 680065.0
Madagascar 1998 21830.2
Mauritius 1998 69552.9
Morocco 1998 240.3
Morocco 1999 233.4
Morocco 2000 243.0
Morocco 2001 256.4
Morocco 2002 256.1
Morocco 2003 261.7
Nigeria 1998 3307.9
Nigeria 1999 2255.7
Nigeria 2000 2446.5
Nigeria 2001 3068.0
Nigeria 2002 3665.8
Nigeria 2003 3424.9
Rwanda 1998 588.0
Rwanda 1999 595.6
Rwanda 2000 641.9
Rwanda 2001 676.1
Rwanda 2002 740.4
Rwanda 2003 769.9
Sierra Leone 1998 1180237.6
South Africa 1998 516925.9
Tanzania 1998 5610.4
• This is how data for Stata, SAS and PcGive should be organized.
• Data should be organized this way for EViews if you would like to use dynamic panel methods.
(ii) Horizontal to create NT × N matrix
Country Period rcons_Bots rcons_BurkF rcons_Bur rcons_Ken rcons_Madag rcons_Maurit rcons_Mor rcons_Nig rcons_Rwa rcons_SierL rcons_rsa rcons_Tan
Botswana 1998 7180.3 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
Botswana 1999 7533.5 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
Botswana 2000 7841.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
Botswana 2001 7919.2 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
Botswana 2002 8085.2 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
Botswana 2003 8222.9 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
Burkina Faso 1998 0.0 1283027.4 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
Burkina Faso 1999 0.0 1297642.2 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
Burkina Faso 2000 0.0 1306400.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
Burkina Faso 2001 0.0 1411715.4 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
Burkina Faso 2002 0.0 1513999.2 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
Burkina Faso 2003 0.0 1626124.7 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
Burundi 1998 0.0 0.0 462066.7 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
Burundi 1999 0.0 0.0 492055.5 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
Burundi 2000 0.0 0.0 465738.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
Burundi 2001 0.0 0.0 469289.9 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
Burundi 2002 0.0 0.0 491701.3 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
Burundi 2003 0.0 0.0 486195.2 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
Kenya 1998 0.0 0.0 0.0 596883.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
Kenya 1999 0.0 0.0 0.0 594332.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
Kenya 2000 0.0 0.0 0.0 609862.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
Kenya 2001 0.0 0.0 0.0 629103.7 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
Kenya 2002 0.0 0.0 0.0 650968.4 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
Kenya 2003 0.0 0.0 0.0 680065.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
Madagascar 1998 0.0 0.0 0.0 0.0 21830.2 0.0 0.0 0.0 0.0 0.0 0.0 0.0
Madagascar 1999 0.0 0.0 0.0 0.0 22441.8 0.0 0.0 0.0 0.0 0.0 0.0 0.0
Madagascar 2000 0.0 0.0 0.0 0.0 22483.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
Madagascar 2001 0.0 0.0 0.0 0.0 22443.8 0.0 0.0 0.0 0.0 0.0 0.0 0.0
Madagascar 2002 0.0 0.0 0.0 0.0 21150.2 0.0 0.0 0.0 0.0 0.0 0.0 0.0
Madagascar 2003 0.0 0.0 0.0 0.0 22985.4 0.0 0.0 0.0 0.0 0.0 0.0 0.0
Mauritius 1998 0.0 0.0 0.0 0.0 0.0 69552.9 0.0 0.0 0.0 0.0 0.0 0.0
Mauritius 1999 0.0 0.0 0.0 0.0 0.0 71594.9 0.0 0.0 0.0 0.0 0.0 0.0
Mauritius 2000 0.0 0.0 0.0 0.0 0.0 73939.3 0.0 0.0 0.0 0.0 0.0 0.0
Mauritius 2001 0.0 0.0 0.0 0.0 0.0 76048.7 0.0 0.0 0.0 0.0 0.0 0.0
Mauritius 2002 0.0 0.0 0.0 0.0 0.0 78570.9 0.0 0.0 0.0 0.0 0.0 0.0
Mauritius 2003 0.0 0.0 0.0 0.0 0.0 82602.2 0.0 0.0 0.0 0.0 0.0 0.0
Morocco 1998 0.0 0.0 0.0 0.0 0.0 0.0 240.3 0.0 0.0 0.0 0.0 0.0
Morocco 1999 0.0 0.0 0.0 0.0 0.0 0.0 233.4 0.0 0.0 0.0 0.0 0.0
Morocco 2000 0.0 0.0 0.0 0.0 0.0 0.0 243.0 0.0 0.0 0.0 0.0 0.0
Morocco 2001 0.0 0.0 0.0 0.0 0.0 0.0 256.4 0.0 0.0 0.0 0.0 0.0
Morocco 2002 0.0 0.0 0.0 0.0 0.0 0.0 256.1 0.0 0.0 0.0 0.0 0.0
Morocco 2003 0.0 0.0 0.0 0.0 0.0 0.0 261.7 0.0 0.0 0.0 0.0 0.0
Nigeria 1998 0.0 0.0 0.0 0.0 0.0 0.0 0.0 3307.9 0.0 0.0 0.0 0.0
Nigeria 1999 0.0 0.0 0.0 0.0 0.0 0.0 0.0 2255.7 0.0 0.0 0.0 0.0
Nigeria 2000 0.0 0.0 0.0 0.0 0.0 0.0 0.0 2446.5 0.0 0.0 0.0 0.0
Nigeria 2001 0.0 0.0 0.0 0.0 0.0 0.0 0.0 3068.0 0.0 0.0 0.0 0.0
Nigeria 2002 0.0 0.0 0.0 0.0 0.0 0.0 0.0 3665.8 0.0 0.0 0.0 0.0
Nigeria 2003 0.0 0.0 0.0 0.0 0.0 0.0 0.0 3424.9 0.0 0.0 0.0 0.0
Rwanda 1998 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 588.0 0.0 0.0 0.0
Rwanda 1999 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 595.6 0.0 0.0 0.0
Rwanda 2000 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 641.9 0.0 0.0 0.0
Rwanda 2001 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 676.1 0.0 0.0 0.0
Rwanda 2002 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 740.4 0.0 0.0 0.0
Rwanda 2003 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 769.9 0.0 0.0 0.0
Sierra Leone 1998 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1180237.6 0.0 0.0
Sierra Leone 1999 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1032168.8 0.0 0.0
Sierra Leone 2000 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1142680.0 0.0 0.0
Sierra Leone 2001 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1369830.5 0.0 0.0
Sierra Leone 2002 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1547871.3 0.0 0.0
Sierra Leone 2003 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1613277.6 0.0 0.0
South Africa 1998 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 516925.9 0.0
South Africa 1999 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 531213.0 0.0
South Africa 2000 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 556652.0 0.0
South Africa 2001 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 579316.4 0.0
South Africa 2002 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 598804.9 0.0
South Africa 2003 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 614082.8 0.0
Tanzania 1998 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 5610.4
Tanzania 1999 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 6003.2
Tanzania 2000 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 6069.6
Tanzania 2001 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 6579.9
Tanzania 2002 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 7064.1
Tanzania 2003 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 7974.2
The following stacking is used by EViews when you are not interested in using dynamic panel methods, or when your version of EViews does not support them (e.g. EViews 3.1).
Period rcons_Bots rcons_BurkF rcons_Bur rcons_Ken rcons_Madag rcons_Maurit rcons_Mor rcons_Nig rcons_Rwa rcons_SierL rcons_rsa rcons_Tan
1990 5587.5 944244.1 652316.1 534498.2 18388.6 48387.5 203.9 2003.8 680.7 1351781.6 416324.7 4179.1
1991 6361.7 922686.9 616476.2 521072.4 19200.5 50061.0 230.1 2538.2 639.0 1538679.5 417907.8 4188.8
1992 6517.6 954058.8 686615.5 513099.7 18385.7 52803.7 219.0 3192.2 646.7 1359474.4 419952.9 4391.4
1993 5989.5 978916.2 712774.9 414525.6 19496.6 55480.3 208.9 2700.9 624.1 1485785.2 430586.1 4471.0
1994 6344.1 878232.0 733701.2 382161.6 19770.5 58196.4 229.5 2221.0 546.1 1403960.5 447390.1 4490.5
1995 6361.3 998737.2 548712.2 485436.4 19840.0 60635.0 221.2 2857.5 440.4 1445824.7 473595.4 4585.6
1996 6397.7 1103507.1 411135.4 496801.0 19766.0 63251.7 242.1 3391.5 494.8 1470327.2 494634.1 4684.2
1997 6633.4 1112742.4 409956.4 562446.2 21313.3 65509.1 230.5 3222.4 579.4 1322968.4 510869.8 5115.2
1998 7180.3 1283027.4 462066.7 596883.1 21830.2 69552.9 240.3 3307.9 588.0 1180237.6 516925.9 5610.4
1999 7533.5 1297642.2 492055.5 594332.1 22441.8 71594.9 233.4 2255.7 595.6 1032168.8 531213.0 6003.2
2000 7841.1 1306400.0 465738.0 609862.0 22483.0 73939.3 243.0 2446.5 641.9 1142680.0 556652.0 6069.6
2001 7919.2 1411715.4 469289.9 629103.7 22443.8 76048.7 256.4 3068.0 676.1 1369830.5 579316.4 6579.9
2002 8085.2 1513999.2 491701.3 650968.4 21150.2 78570.9 256.1 3665.8 740.4 1547871.3 598804.9 7064.1
2003 8222.9 1626124.7 486195.2 680065.0 22985.4 82602.2 261.7 3424.9 769.9 1613277.6 614082.8 7974.2
Note that if the above data is treated as a matrix, you can simply
stack it vertically by using the vectorisation algebra (available as
option in matrix algebra in most software like stata and eviews)
$$A = \begin{bmatrix} 1 & 3 \\ 4 & 7 \end{bmatrix}, \qquad \mathrm{Vec}(A) = \begin{bmatrix} 1 \\ 4 \\ 3 \\ 7 \end{bmatrix}$$
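The vectorisation step is simply column stacking. As a minimal sketch in Python (the function name `vec` and the variable names are hypothetical), applied first to the matrix A above and then to the first two columns and the 1998-1999 rows of the consumption table:

```python
def vec(matrix):
    """Stack the columns of a matrix (given as a list of rows) into one list."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    return [matrix[r][c] for c in range(n_cols) for r in range(n_rows)]


A = [[1, 3],
     [4, 7]]
print(vec(A))  # [1, 4, 3, 7]

# Wide layout: rows are periods, columns are countries.
periods = [1998, 1999]
countries = ["Botswana", "Burkina Faso"]
wide = [[7180.3, 1283027.4],
        [7533.5, 1297642.2]]

# Long layout: one (country, period, value) row per observation.
stacked = vec(wide)
long_rows = [
    (country, period, stacked[i * len(periods) + j])
    for i, country in enumerate(countries)
    for j, period in enumerate(periods)
]
for row in long_rows:
    print(row)
```

Stacking by column keeps all periods of one country together, which matches the vertically stacked layout shown in (i).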
Independent variable: income
• The same stacking as the dependent variable can be done depending on the software
Individual specific dummy variables for LSDV model: Creates NT × N matrix
Country Period Bots BurkF Bur Ken Madag Maurit Mor Nig Rwa SierL rsa Tan
Botswana 1998 1 0 0 0 0 0 0 0 0 0 0 0
Botswana 1999 1 0 0 0 0 0 0 0 0 0 0 0
Botswana 2000 1 0 0 0 0 0 0 0 0 0 0 0
Botswana 2001 1 0 0 0 0 0 0 0 0 0 0 0
Botswana 2002 1 0 0 0 0 0 0 0 0 0 0 0
Botswana 2003 1 0 0 0 0 0 0 0 0 0 0 0
Burkina Faso 1998 0 1 0 0 0 0 0 0 0 0 0 0
Burkina Faso 1999 0 1 0 0 0 0 0 0 0 0 0 0
Burkina Faso 2000 0 1 0 0 0 0 0 0 0 0 0 0
Burkina Faso 2001 0 1 0 0 0 0 0 0 0 0 0 0
Burkina Faso 2002 0 1 0 0 0 0 0 0 0 0 0 0
Burkina Faso 2003 0 1 0 0 0 0 0 0 0 0 0 0
Burundi 1998 0 0 1 0 0 0 0 0 0 0 0 0
Burundi 1999 0 0 1 0 0 0 0 0 0 0 0 0
Burundi 2000 0 0 1 0 0 0 0 0 0 0 0 0
Burundi 2001 0 0 1 0 0 0 0 0 0 0 0 0
Burundi 2002 0 0 1 0 0 0 0 0 0 0 0 0
Burundi 2003 0 0 1 0 0 0 0 0 0 0 0 0
Kenya 1998 0 0 0 1 0 0 0 0 0 0 0 0
Kenya 1999 0 0 0 1 0 0 0 0 0 0 0 0
Kenya 2000 0 0 0 1 0 0 0 0 0 0 0 0
Kenya 2001 0 0 0 1 0 0 0 0 0 0 0 0
Kenya 2002 0 0 0 1 0 0 0 0 0 0 0 0
Kenya 2003 0 0 0 1 0 0 0 0 0 0 0 0
Madagascar 1998 0 0 0 0 1 0 0 0 0 0 0 0
Madagascar 1999 0 0 0 0 1 0 0 0 0 0 0 0
Madagascar 2000 0 0 0 0 1 0 0 0 0 0 0 0
Madagascar 2001 0 0 0 0 1 0 0 0 0 0 0 0
Madagascar 2002 0 0 0 0 1 0 0 0 0 0 0 0
Madagascar 2003 0 0 0 0 1 0 0 0 0 0 0 0
Mauritius 1998 0 0 0 0 0 1 0 0 0 0 0 0
Mauritius 1999 0 0 0 0 0 1 0 0 0 0 0 0
Mauritius 2000 0 0 0 0 0 1 0 0 0 0 0 0
Mauritius 2001 0 0 0 0 0 1 0 0 0 0 0 0
Mauritius 2002 0 0 0 0 0 1 0 0 0 0 0 0
Mauritius 2003 0 0 0 0 0 1 0 0 0 0 0 0
Morocco 1998 0 0 0 0 0 0 1 0 0 0 0 0
Morocco 1999 0 0 0 0 0 0 1 0 0 0 0 0
Morocco 2000 0 0 0 0 0 0 1 0 0 0 0 0
Morocco 2001 0 0 0 0 0 0 1 0 0 0 0 0
Morocco 2002 0 0 0 0 0 0 1 0 0 0 0 0
Morocco 2003 0 0 0 0 0 0 1 0 0 0 0 0
Nigeria 1998 0 0 0 0 0 0 0 1 0 0 0 0
Nigeria 1999 0 0 0 0 0 0 0 1 0 0 0 0
Nigeria 2000 0 0 0 0 0 0 0 1 0 0 0 0
Nigeria 2001 0 0 0 0 0 0 0 1 0 0 0 0
Nigeria 2002 0 0 0 0 0 0 0 1 0 0 0 0
Nigeria 2003 0 0 0 0 0 0 0 1 0 0 0 0
Rwanda 1998 0 0 0 0 0 0 0 0 1 0 0 0
Rwanda 1999 0 0 0 0 0 0 0 0 1 0 0 0
Rwanda 2000 0 0 0 0 0 0 0 0 1 0 0 0
Rwanda 2001 0 0 0 0 0 0 0 0 1 0 0 0
Rwanda 2002 0 0 0 0 0 0 0 0 1 0 0 0
Rwanda 2003 0 0 0 0 0 0 0 0 1 0 0 0
Sierra Leone 1998 0 0 0 0 0 0 0 0 0 1 0 0
Sierra Leone 1999 0 0 0 0 0 0 0 0 0 1 0 0
Sierra Leone 2000 0 0 0 0 0 0 0 0 0 1 0 0
Sierra Leone 2001 0 0 0 0 0 0 0 0 0 1 0 0
Sierra Leone 2002 0 0 0 0 0 0 0 0 0 1 0 0
Sierra Leone 2003 0 0 0 0 0 0 0 0 0 1 0 0
South Africa 1998 0 0 0 0 0 0 0 0 0 0 1 0
South Africa 1999 0 0 0 0 0 0 0 0 0 0 1 0
South Africa 2000 0 0 0 0 0 0 0 0 0 0 1 0
South Africa 2001 0 0 0 0 0 0 0 0 0 0 1 0
South Africa 2002 0 0 0 0 0 0 0 0 0 0 1 0
South Africa 2003 0 0 0 0 0 0 0 0 0 0 1 0
Tanzania 1998 0 0 0 0 0 0 0 0 0 0 0 0
Tanzania 1999 0 0 0 0 0 0 0 0 0 0 0 0
Tanzania 2000 0 0 0 0 0 0 0 0 0 0 0 0
Tanzania 2001 0 0 0 0 0 0 0 0 0 0 0 0
Tanzania 2002 0 0 0 0 0 0 0 0 0 0 0 0
Tanzania 2003 0 0 0 0 0 0 0 0 0 0 0 0
• Note the fact that the last cross-section is not coded with 1 to avoid the problem of
dummy variable trap (i.e. perfect multi-collinearity)
Use of Kronecker product

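Recall that
$$A \otimes B = \begin{bmatrix} 1 & 2 \\ 3 & 1 \end{bmatrix} \otimes \begin{bmatrix} 0 & 3 \\ 2 & 1 \end{bmatrix} = \begin{bmatrix} 1\cdot 0 & 1\cdot 3 & 2\cdot 0 & 2\cdot 3 \\ 1\cdot 2 & 1\cdot 1 & 2\cdot 2 & 2\cdot 1 \\ 3\cdot 0 & 3\cdot 3 & 1\cdot 0 & 1\cdot 3 \\ 3\cdot 2 & 3\cdot 1 & 1\cdot 2 & 1\cdot 1 \end{bmatrix} = \begin{bmatrix} 0 & 3 & 0 & 6 \\ 2 & 1 & 4 & 2 \\ 0 & 9 & 0 & 3 \\ 6 & 3 & 2 & 1 \end{bmatrix}$$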
The dummy variables can be represented using the identity matrix of order N = 12 and a T × 1 vector of ones (T = 6):
$$
I_N = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix}_{12 \times 12},
\qquad
J_T = \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \\ 1 \\ 1 \end{bmatrix}
$$
The reproduction of the dummy variables above amounts to

I N ⊗ JT = Z µ
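The Kronecker construction above can be checked numerically. The following is a minimal sketch using NumPy's `np.kron`; the variable names are illustrative, and the dummy matrix here carries a full set of N dummies (so it would be used without a separate intercept).

```python
import numpy as np

# Worked example from the text: each element of A scales a full copy of B.
A = np.array([[1, 2], [3, 1]])
B = np.array([[0, 3], [2, 1]])
AB = np.kron(A, B)

# Dummy-variable matrix for a balanced panel: N = 12 countries, T = 6 years.
N, T = 12, 6
I_N = np.eye(N, dtype=int)
J_T = np.ones((T, 1), dtype=int)
Z_mu = np.kron(I_N, J_T)  # shape (N*T, N); column i flags the rows of country i

print(AB)
print(Z_mu.shape)
```

Each row of `Z_mu` contains exactly one 1, and each column contains T ones, which mirrors the block pattern of the dummy table.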
Our model can then be written as
$$
y = \begin{bmatrix} Z_\mu & x \end{bmatrix} \begin{bmatrix} \alpha \\ \beta \end{bmatrix} + \nu
$$
The Kronecker product $(I_N \otimes J_T)$ is an NT × N block diagonal matrix and x is the NT × K matrix of nonconstant regressors
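• An OLS estimator of this model yields the LSDV estimator
$$
\begin{bmatrix} \hat{\alpha}_{LSDV} \\ \hat{\beta}_{LSDV} \end{bmatrix}
=
\begin{bmatrix} (I_N \otimes J_T)'(I_N \otimes J_T) & (I_N \otimes J_T)'x \\ x'(I_N \otimes J_T) & x'x \end{bmatrix}^{-1}
\begin{bmatrix} (I_N \otimes J_T)'y \\ x'y \end{bmatrix}
=
\begin{bmatrix} T I_N & T\bar{x} \\ T\bar{x}' & x'x \end{bmatrix}^{-1}
\begin{bmatrix} T\bar{y} \\ x'y \end{bmatrix}
$$
where $\bar{x}$ and $\bar{y}$ stack the cross-section means $\bar{x}_{i\bullet}'$ and $\bar{y}_{i\bullet}$.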
• The LSDV model can easily be estimated using OLS over the full panel to yield the LSDV estimators.
• This model is appealing
• But for short panels the problem is that it estimates too many (incidental) parameters that may not be of intrinsic value, K + 1 + (N − 1) in total:
o K parameters for the original X-regressors;
o 1 parameter for the intercept;
o N − 1 parameters for the cross-section fixed effects (the omitted cross-section, i.e. the reference class Tanzania, is captured by the common intercept).
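As an illustration of the LSDV regression, the sketch below runs OLS on the stacked matrix [Z_µ x] for simulated data. The panel dimensions, the true slope of 0.7 and the data-generating process are assumptions made for the example only; they are not the consumption data used in the lecture. A full set of N dummies is used with no separate intercept, which is equivalent to an intercept plus N − 1 dummies.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T, beta_true = 12, 6, 0.7            # hypothetical panel dimensions and slope
mu = rng.normal(0.0, 1.0, size=N)       # hypothetical country effects
# Data are stacked by country: rows i*T ... i*T+T-1 belong to country i.
x = rng.normal(10.0, 2.0, size=N * T) + np.repeat(mu, T)  # regressor correlated with mu
y = np.repeat(mu, T) + beta_true * x + rng.normal(0.0, 0.1, size=N * T)

Z_mu = np.kron(np.eye(N), np.ones((T, 1)))    # N country dummies, no separate intercept
W = np.column_stack([Z_mu, x])                # [Z_mu  x]
coef, *_ = np.linalg.lstsq(W, y, rcond=None)  # OLS on dummies plus regressor
alpha_lsdv, beta_lsdv = coef[:N], coef[N]
print(beta_lsdv)
```

The regression estimates N + K coefficients here, which is why the approach becomes costly when N is large.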
Problems with LSDV Model

1. There are too many incidental/nuisance parameters, since the number of µi grows as N increases. The usual proof of consistency for an estimator does not hold for the LSDV model.
2. Inverting a matrix of dimension (K+1)+(N-1) may be impossible if N is very large. Even when it is possible, it can be inaccurate.
Eviews panel results of LSDV model

Dependent Variable: LN_RCONS?
Method: Pooled Least Squares
Date: 09/08/08 Time: 21:11
Sample: 1990 2003
Included observations: 14
Cross-sections included: 12
Total pool (unbalanced) observations: 163
Variable Coefficient Std. Error t-Statistic Prob.
LN_RGDP? 0.683031 0.050445 13.54021 0.0000

_BOTS--C 2.042464 0.502558 4.064134 0.0001
_BURKF--C 4.224775 0.718861 5.877042 0.0000
_BUR--C 4.730530 0.630225 7.506101 0.0000
_KEN--C 3.976501 0.681912 5.831401 0.0000
_MADAG--C 3.065602 0.508071 6.033802 0.0000
_MAURIT--C 3.215030 0.580446 5.538895 0.0000
_MOR--C 1.471438 0.295430 4.980671 0.0000
_NIG--C 2.076701 0.434629 4.778103 0.0000
_RWA--C 2.025546 0.325644 6.220130 0.0000
_SIERL--C 4.366440 0.721471 6.052135 0.0000
_RSA--C 3.813676 0.687518 5.547019 0.0000
_TAN--C 2.569156 0.444138 5.784585 0.0000
R-squared 0.998770 Mean dependent var 10.40442

Adjusted R-squared 0.998671 S.D. dependent var 2.939862
S.E. of regression 0.107162 Akaike info criterion -1.552558
Sum squared resid 1.722550 Schwarz criterion -1.305817
Log likelihood 139.5335 F-statistic 10147.81
Durbin-Watson stat 0.961912 Prob(F-statistic) 0.000000
Stata is less convenient than Eviews for estimating the LSDV model directly; Eviews does a good job.
We need a trick to deal with these problems.
4. WITHIN/Q ESTIMATOR
• Using the “WITHIN” estimation we can still assume individual effects, although we no
longer directly estimate them.
• We demean the data so as to “wipe out” the incidental parameters (individual effects) and estimate β only.
• This means subtracting the mean for each cross-section from each observation.
• Demeaning the data will not change the estimates for β. (Think of the econometric
exercise of “running a regression line through the origin”.)
• In order to wipe out the individual effects, we pre-multiply the model by a matrix Q:
$$ Qy = Qx\beta + Q\nu $$
where
$$ Q = I_{NT} - P, \qquad P = Z_\mu (Z_\mu' Z_\mu)^{-1} Z_\mu' $$
• P is a projection matrix that averages across time for each individual cross-section
• Consequently, pre-multiplying the regression by Q yields deviations from the means within each cross-section:
$$ \tilde{y} = Qy \quad \text{and} \quad \tilde{x} = Qx $$
The OLS estimator is
$$ \tilde{\beta} = (x'Qx)^{-1} x'Qy, \qquad \operatorname{var}(\tilde{\beta}) = \sigma_\nu^2 (x'Qx)^{-1} $$
Let’s look at the same stuff using common parlance
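The mean model is
$$ \bar{y}_{i\bullet} = \alpha + \mu_i + \bar{x}_{i\bullet}'\beta + \bar{\nu}_{i\bullet} $$
Demeaning the model
$$
\begin{aligned}
y_{it} - \bar{y}_{i\bullet} &= (\alpha + \mu_i + x_{it}'\beta + \nu_{it}) - (\alpha + \mu_i + \bar{x}_{i\bullet}'\beta + \bar{\nu}_{i\bullet}) \\
&= (\alpha - \alpha) + (\mu_i - \mu_i) + (x_{it} - \bar{x}_{i\bullet})'\beta + (\nu_{it} - \bar{\nu}_{i\bullet}) \\
&= (x_{it} - \bar{x}_{i\bullet})'\beta + (\nu_{it} - \bar{\nu}_{i\bullet})
\end{aligned}
$$
Where
$$
\bar{y}_{i\bullet} = \frac{1}{T}\sum_{t=1}^{T} y_{it}, \qquad
\bar{x}_{i\bullet} = \frac{1}{T}\sum_{t=1}^{T} x_{it}, \qquad
\bar{\nu}_{i\bullet} = \frac{1}{T}\sum_{t=1}^{T} \nu_{it}
$$
Notice that we have wiped out the individual effect coefficients since
$$ (\alpha - \alpha) = 0, \qquad (\mu_i - \mu_i) = 0 $$
Using OLS yields the within estimator
$$
\hat{\beta}_W = \left[ \sum_{i=1}^{N}\sum_{t=1}^{T} (x_{it} - \bar{x}_{i\bullet})(x_{it} - \bar{x}_{i\bullet})' \right]^{-1}
\sum_{i=1}^{N}\sum_{t=1}^{T} (x_{it} - \bar{x}_{i\bullet})(y_{it} - \bar{y}_{i\bullet})
$$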
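The within transformation can be illustrated numerically. The sketch below, on simulated data with an assumed slope of 0.7, builds Q = I − P from the dummy matrix, checks that Qx equals the deviations from cross-section means, and compares the within slope with the slope from the dummy-variable regression. All names and the data-generating process are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
N, T = 12, 6                                   # hypothetical balanced panel
mu = np.repeat(rng.normal(size=N), T)          # individual effects, stacked by cross-section
x = rng.normal(size=N * T) + mu                # regressor correlated with the effects
y = 2.0 + mu + 0.7 * x + rng.normal(scale=0.1, size=N * T)

# Q = I_NT - P, with P = Z (Z'Z)^{-1} Z' averaging over time within each cross-section
Z = np.kron(np.eye(N), np.ones((T, 1)))
P = Z @ np.linalg.inv(Z.T @ Z) @ Z.T
Q = np.eye(N * T) - P

x_til, y_til = Q @ x, Q @ y                    # deviations from cross-section means
beta_within = (x_til @ y_til) / (x_til @ x_til)

# The same slope from the dummy-variable (LSDV) regression
W = np.column_stack([Z, x])
beta_lsdv = np.linalg.lstsq(W, y, rcond=None)[0][-1]
print(beta_within, beta_lsdv)
```

In practice one would demean group by group rather than form the NT × NT matrix Q, which is built here only to mirror the matrix notation.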
• There are no incidental parameters and the errors still satisfy the usual assumptions.
• We can therefore use OLS on the above equation to obtain consistent estimates.
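• Averaging across all observations yields
$$ \bar{y}_{\bullet\bullet} = \alpha + \bar{x}_{\bullet\bullet}'\beta + \bar{\nu}_{\bullet\bullet} $$
• Individual effects can be solved (not estimated) with the assumption
$$ \sum_{i=1}^{N} \mu_i = 0 $$
to avoid the dummy variable trap or perfect multicollinearity, and solving:
$$
\tilde{\alpha} = \bar{y}_{\bullet\bullet} - \bar{x}_{\bullet\bullet}'\tilde{\beta}, \qquad
\tilde{\mu}_i = \bar{y}_{i\bullet} - \tilde{\alpha} - \bar{x}_{i\bullet}'\tilde{\beta}
$$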
• In other words, we can use First Order Conditions to derive individual effects.
• Note that the total individual effect is the sum of the common constant and the
constructed individual component.
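The recovery of the individual effects described above can be sketched as follows, for a single regressor and simulated data (the dimensions and parameter values are assumptions for illustration). The recovered effects sum to zero by construction in a balanced panel.

```python
import numpy as np

rng = np.random.default_rng(2)
N, T = 12, 6                                   # hypothetical balanced panel
mu_true = rng.normal(size=N)
x = rng.normal(size=(N, T)) + mu_true[:, None]
y = 3.0 + mu_true[:, None] + 0.7 * x + rng.normal(scale=0.1, size=(N, T))

# Within estimate of the slope
xd = x - x.mean(axis=1, keepdims=True)
yd = y - y.mean(axis=1, keepdims=True)
beta = (xd * yd).sum() / (xd ** 2).sum()

# Solve for the common constant and the individual components
alpha = y.mean() - beta * x.mean()
mu_hat = y.mean(axis=1) - alpha - beta * x.mean(axis=1)
print(alpha, mu_hat.sum())
```

The total effect for cross-section i is then `alpha + mu_hat[i]`, matching the note that it is the sum of the common constant and the constructed individual component.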
Country Period Consumption (A) Mean consumption (B) Demeaned consumption (A-B)
Botswana 1998 7180.3 7797.021896 -616.7
Botswana 1999 7533.5 7797.021896 -263.6
Botswana 2000 7841.1 7797.021896 44.1
Botswana 2001 7919.2 7797.021896 122.2
Botswana 2002 8085.2 7797.021896 288.2
Botswana 2003 8222.9 7797.021896 425.9
Burkina Faso 1998 1283027.4 1406484.816 -123457.4
Burkina Faso 1999 1297642.2 1406484.816 -108842.6
Burkina Faso 2000 1306400.0 1406484.816 -100084.8
Burkina Faso 2001 1411715.4 1406484.816 5230.6
Burkina Faso 2002 1513999.2 1406484.816 107514.4
Burkina Faso 2003 1626124.7 1406484.816 219639.9
Burundi 1998 462066.7 477841.1 -15774.4
Burundi 1999 492055.5 477841.1 14214.4
Burundi 2000 465738.0 477841.1 -12103.1
Burundi 2001 469289.9 477841.1 -8551.2
Burundi 2002 491701.3 477841.1 13860.2
Burundi 2003 486195.2 477841.1 8354.1
Kenya 1998 596883.1 626869.0426 -29986.0
Kenya 1999 594332.1 626869.0426 -32537.0
Kenya 2000 609862.0 626869.0426 -17007.0
Kenya 2001 629103.7 626869.0426 2234.7
Kenya 2002 650968.4 626869.0426 24099.3
Kenya 2003 680065.0 626869.0426 53196.0
Madagascar 1998 21830.2 22222.41045 -392.2
Madagascar 1999 22441.8 22222.41045 219.4
Madagascar 2000 22483.0 22222.41045 260.6
Madagascar 2001 22443.8 22222.41045 221.4
Madagascar 2002 21150.2 22222.41045 -1072.2
Madagascar 2003 22985.4 22222.41045 763.0
Mauritius 1998 69552.9 75384.83324 -5831.9
Mauritius 1999 71594.9 75384.83324 -3789.9
Mauritius 2000 73939.3 75384.83324 -1445.5
Mauritius 2001 76048.7 75384.83324 663.9
Mauritius 2002 78570.9 75384.83324 3186.1
Mauritius 2003 82602.2 75384.83324 7217.4
Morocco 1998 240.3 248.4780447 -8.1
Morocco 1999 233.4 248.4780447 -15.0
Morocco 2000 243.0 248.4780447 -5.5
Morocco 2001 256.4 248.4780447 7.9
Morocco 2002 256.1 248.4780447 7.6
Morocco 2003 261.7 248.4780447 13.2
Nigeria 1998 3307.9 3028.129497 279.7
Nigeria 1999 2255.7 3028.129497 -772.4
Nigeria 2000 2446.5 3028.129497 -581.6
Nigeria 2001 3068.0 3028.129497 39.9
Nigeria 2002 3665.8 3028.129497 637.6
Nigeria 2003 3424.9 3028.129497 396.7
Rwanda 1998 588.0 668.6450459 -80.7
Rwanda 1999 595.6 668.6450459 -73.1
Rwanda 2000 641.9 668.6450459 -26.7
Rwanda 2001 676.1 668.6450459 7.5
Rwanda 2002 740.4 668.6450459 71.8
Rwanda 2003 769.9 668.6450459 101.2
Sierra Leone 1998 1180237.6 1254557.641 -74320.1
Sierra Leone 1999 1032168.8 1254557.641 -222388.8
Sierra Leone 2000 1142680.0 1254557.641 -111877.6
Sierra Leone 2001 1369830.5 1254557.641 115272.9
Sierra Leone 2002 1547871.3 1254557.641 293313.6
Sierra Leone 2003 1613277.6 1254557.641 358720.0
South Africa 1998 516925.9 566165.8325 -49239.9
South Africa 1999 531213.0 566165.8325 -34952.8
South Africa 2000 556652.0 566165.8325 -9513.8
South Africa 2001 579316.4 566165.8325 13150.5
South Africa 2002 598804.9 566165.8325 32639.1
South Africa 2003 614082.8 566165.8325 47917.0
Tanzania 1998 5610.4 6550.223129 -939.8
Tanzania 1999 6003.2 6550.223129 -547.0
Tanzania 2000 6069.6 6550.223129 -480.6
Tanzania 2001 6579.9 6550.223129 29.7
Tanzania 2002 7064.1 6550.223129 513.9
Tanzania 2003 7974.2 6550.223129 1423.9
Within estimation results from Eviews
Date: 09/08/08 Time: 21:16
Sample: 1990 2003
C 3.082438 0.540824 5.699521 0.0000

LN_RGDP? 0.683031 0.050445 13.54021 0.0000
Fixed Effects (Cross)
_BOTS--C -1.039974
_BURKF--C 1.142337
_BUR--C 1.648092
_KEN--C 0.894063
_MADAG--C -0.016836
_MAURIT--C 0.132592
_MOR--C -1.611000
_NIG--C -1.005736
_RWA--C -1.056892
_SIERL--C 1.284002
_RSA--C 0.731238
_TAN--C -0.513282
Effects Specification
Cross-section fixed (dummy variables)

• The fixed effects have not been computed but simply recovered
• This can be seen from the lack of standard errors as opposed to
the LSDV results
• The interpretation of the country-specific fixed effects is as
follows
• For countries with positive values, there are some unobservable factors which tend to enhance consumption
• For countries with negative country-specific fixed effects, there are unobservable characteristics that hinder consumption
Look at stata within results
xtreg ln_cons ln_rgdp, fe i(country)
Fixed-effects (within) regression Number of obs = 154

Group variable (i): country Number of groups = 11
R-sq: within = 0.5644 Obs per group: min = 14

between = 0.9903 avg = 14.0
overall = 0.9888 max = 14
F(1,142) = 183.96
corr(u_i, Xb) = 0.9591 Prob > F = 0.0000
------------------------------------------------------------------------------
ln_cons | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
ln_rgdp | .6576598 .0484889 13.56 0.000 .5618063 .7535133
_cons | 3.255495 .5148909 6.32 0.000 2.237653 4.273337
-------------+----------------------------------------------------------------
sigma_u | 1.0877784
sigma_e | .10204792
rho | .99127587 (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(10, 142) = 127.58 Prob > F = 0.0000
Properties of the WITHIN Estimators
• The slope coefficients are consistent if N or T become large

• The fixed effects are only consistent if T is large
• The number of degrees of freedom must be adjusted. The correct degrees of freedom are NT−N−K. Please note that the usual OLS programs (software) not designed for panel data, when run on the demeaned data, assume that the degrees of freedom are NT−K, which is wrong! Their standard errors, test statistics and p-values must be corrected as follows. Let’s use the following notation: $k_u = NT - K$ (unadjusted degrees of freedom) and $k_a = k_u - N$ (adjusted degrees of freedom). Then
$$
se_a(\hat{\beta}) = \sqrt{\frac{k_u}{k_a}}\; se_u(\hat{\beta}), \qquad
t_a(\hat{\beta}) = \sqrt{\frac{k_a}{k_u}}\; t_u(\hat{\beta})
$$
where $t_a$ is distributed $t_{k_a}$ under the null hypothesis.
Note that “a” denotes adjusted and “u” denotes unadjusted
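The correction above is a simple rescaling. A minimal sketch, with a hypothetical function name and made-up input values (a standard error of 0.05 and a t-statistic of 14 from a generic OLS routine on demeaned data with N = 12, T = 6, K = 1):

```python
import math

def adjust_for_fixed_effects(se_u, t_u, N, T, K):
    """Rescale a standard error and t-statistic from unadjusted to adjusted degrees of freedom."""
    k_u = N * T - K          # degrees of freedom assumed by a generic OLS routine
    k_a = k_u - N            # degrees of freedom after accounting for N individual effects
    se_a = math.sqrt(k_u / k_a) * se_u
    t_a = math.sqrt(k_a / k_u) * t_u
    return se_a, t_a, k_a

se_a, t_a, k_a = adjust_for_fixed_effects(se_u=0.05, t_u=14.0, N=12, T=6, K=1)
print(se_a, t_a, k_a)
```

The adjusted standard error is larger and the adjusted t-statistic smaller, while their product (the coefficient estimate) is unchanged.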

• The parameter estimates from the LSDV are the same as from the WITHIN regression model.
• Please note that this is not a general result, since incidental parameters do cause inconsistencies in many applied models.
Important disadvantage with the WITHIN method:

• Demeaning the data means that time-invariant X-regressors, which are typically dummy variables, cannot be used.
• For example sex, race, religion, etc.
• Thus we would be able to say nothing about the relationship between the dependent variable and the time-invariant characteristics using this estimator.
5. RANDOM EFFECTS MODEL (REM)
• The benefit of the REM approach is that you concede variation across the cross-sections, but don’t estimate “N-1” of these variations.
• However, in this approach you introduce a more complicated variance structure and OLS is no longer appropriate.
• This method is best suited to “random” draws from a large population, e.g. household surveys which claim to be “representative”. (N is usually quite large.)
• The problems of too many parameters with the LSDV model and of “sweeping away” the time-invariant regressors can be avoided if µi are assumed random, i.e. drawn from a given distribution:
$$ \mu_i \sim IID(0, \sigma_\mu^2), \qquad \nu_{it} \sim IID(0, \sigma_\nu^2) $$
and µi are independent of νit.
• In other words we are assuming that the individual effects have an

empirical distribution function.
[Figure: histogram of the estimated individual effects µ̂ (vertical axis: frequency)]
Which has certain characteristics.
$$
\alpha = \text{average } \mu = \frac{1}{N}\sum_{i=1}^{N} \mu_i, \qquad
\sigma_\mu^2 = \text{variance of } \mu
$$
• We can use these definitions to write the panel data model in form
of REM
y it = α + x it′ β + (µ i − α ) + ν it
The new error term is uit = (µi − α ) + ν it
We can then rewrite the REM model as;

y it = α + x it′ β + u it
This is almost like the pooled model, except for the following;
• The constant term can be interpreted as the average individual
effects
• The error term has a special complicated form
• We can estimate the REM model using OLS to obtain estimates of α and β.
• These estimates will only be consistent if the following conditions hold;
o $E(u_{it}) = E(\mu_i - \alpha) + E(\nu_{it}) = 0$
o $Cov(u_{it}, x_{it}) = Cov(\mu_i, x_{it}) + Cov(\nu_{it}, x_{it}) = 0$, i.e. no correlation between individual effects and regressors
5.1 Efficiency in the REM
For OLS on the REM to be efficient, two conditions must be fulfilled;
• Homoscedasticity: $Var(u_{it}) = \sigma_\mu^2 + \sigma_\nu^2$ for all i and t. Here we assume that $\mu_i$ and $\nu_{it}$ are independent
• Serial independence in the error term, $u_{it} = (\mu_i - \alpha) + \nu_{it}$:
$$
Cov(u_{it}, u_{js}) =
\begin{cases}
\sigma_\mu^2 + \sigma_\nu^2 & \text{if } i = j \text{ and } s = t \text{ (same cross-section, same year)} \\
0 & \text{if } i \neq j \text{ and } s = t \text{ (if individuals are independent)} \\
\sigma_\mu^2 \neq 0 & \text{if } i = j \text{ and } s \neq t \text{ (same cross-section, different year)}
\end{cases}
$$
The last case violates the serial independence assumption.
OLS is thus inefficient in a REM and yields incorrect standard errors and tests.
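The covariance cases above imply a block diagonal error covariance matrix. The sketch below builds it for a small panel, using the stacked form σ_µ²(I_N ⊗ J_T J_T′) + σ_ν² I_NT; this matrix notation and the variance values are assumptions for illustration and do not appear in the lecture.

```python
import numpy as np

N, T = 3, 4
sigma_mu2, sigma_nu2 = 2.0, 0.5   # hypothetical variance components

# One T x T block per cross-section: sigma_mu^2 everywhere, plus sigma_nu^2 on the diagonal
block = sigma_mu2 * np.ones((T, T)) + sigma_nu2 * np.eye(T)
Omega = np.kron(np.eye(N), block)  # observations stacked by cross-section

print(Omega[0, 0], Omega[0, 1], Omega[0, T])
```

The nonzero off-diagonal entries within each block are what make pooled OLS inefficient and its usual standard errors incorrect.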
5.2 FGLS ESTIMATOR

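$$ y_{it} = \alpha + x_{it}'\beta + u_{it} $$
• The FGLS estimator for the REM can be implemented by an OLS regression of the transformed equation as follows
1. Define $\hat{\theta} = 1 - \dfrac{\sigma_\nu}{\sigma_1}$ where $\sigma_1^2 = T\sigma_\mu^2 + \sigma_\nu^2$
2. Calculate “pseudo within differences”
$$ y_{it}^* = y_{it} - \hat{\theta}\,\bar{y}_{i\bullet}, \qquad x_{it}^* = x_{it} - \hat{\theta}\,\bar{x}_{i\bullet} $$
3. Perform an OLS regression
$$ y_{it}^* = \alpha^* + \beta x_{it}^* + u_{it}^* $$
where $\alpha^* = (1 - \hat{\theta})\alpha$ and $u_{it}^* = (1 - \hat{\theta})(\mu_i - \alpha) + \nu_{it} - \hat{\theta}\,\bar{\nu}_{i\bullet}$
4. The REM estimate of β is given by
$$
\hat{\beta}_{re} = \frac{\sum_{i=1}^{N}\sum_{t=1}^{T} (x_{it}^* - \bar{x}^*_{\bullet\bullet})(y_{it}^* - \bar{y}^*_{\bullet\bullet})}{\sum_{i=1}^{N}\sum_{t=1}^{T} (x_{it}^* - \bar{x}^*_{\bullet\bullet})^2}
$$
where $\bar{x}^*_{\bullet\bullet}$ and $\bar{y}^*_{\bullet\bullet}$ are the overall means of the transformed data.
The estimator of the intercept can be shown to equal
$$ \hat{\alpha}_{re} = \bar{y}_{\bullet\bullet} - \hat{\beta}_{re}\,\bar{x}_{\bullet\bullet} $$
(Greene, 2003)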
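The transformation steps above can be sketched for a single regressor as follows. The variance components are treated as known here, which is an assumption made only to isolate the transformation; the function name and the simulated data are hypothetical. Regressing on a transformed constant (1 − θ) returns the intercept α directly.

```python
import numpy as np

def quasi_demeaned_ols(y, x, sigma_mu2, sigma_nu2):
    """Quasi-demean (N, T) arrays with theta; return theta, intercept and slope from OLS."""
    N, T = y.shape
    theta = 1.0 - np.sqrt(sigma_nu2 / (T * sigma_mu2 + sigma_nu2))
    y_star = (y - theta * y.mean(axis=1, keepdims=True)).ravel()
    x_star = (x - theta * x.mean(axis=1, keepdims=True)).ravel()
    X = np.column_stack([np.full(N * T, 1.0 - theta), x_star])  # transformed constant
    alpha_hat, beta_hat = np.linalg.lstsq(X, y_star, rcond=None)[0]
    return theta, alpha_hat, beta_hat

rng = np.random.default_rng(3)
N, T = 200, 5
mu = rng.normal(scale=1.0, size=(N, 1))         # effects drawn independently of x
x = rng.normal(size=(N, T))
y = 1.0 + 0.7 * x + mu + rng.normal(scale=0.5, size=(N, T))
theta, alpha_re, beta_re = quasi_demeaned_ols(y, x, sigma_mu2=1.0, sigma_nu2=0.25)
print(theta, alpha_re, beta_re)
```

Setting `sigma_mu2=0.0` gives θ = 0, so the data are left untransformed and the result is pooled OLS.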
5.2.1 The Crucial Problem: We do not know θ

• The unfortunate thing is that we do not know $\sigma_\mu^2$ and $\sigma_\nu^2$
• If the errors $u_{it}$, $\nu_{it}$ and $\mu_i$ were known, we could estimate the variances easily as follows;
1. $\hat{\sigma}_1^2 = \dfrac{T}{N}\sum_{i=1}^{N} \bar{u}_{i\bullet}^2$
2. $\hat{\sigma}_\nu^2 = \dfrac{1}{N(T-1)}\sum_{i=1}^{N}\sum_{t=1}^{T} (u_{it} - \bar{u}_{i\bullet})^2 = \dfrac{1}{N(T-1)}\sum_{i=1}^{N}\sum_{t=1}^{T} (\nu_{it} - \bar{\nu}_{i\bullet})^2$
3. $\hat{\sigma}_\mu^2 = \dfrac{1}{N-1}\sum_{i=1}^{N} (\mu_i - \bar{\mu})^2$
• Since $u_{it}$, $\nu_{it}$ and $\mu_i$ are unknown, there are a number of suggestions on how they can be estimated.
• These methods use various residuals instead of the unknown errors.
Possible Residuals
1. $\hat{u}_{ols}$ = REM residuals from the pooled regression $y_{it} = \alpha + x_{it}'\beta + u_{it}$; number of observations is NT.
2. $\hat{u}_b$ = REM residuals from the between regression $\bar{y}_{i\bullet} = \alpha + \bar{x}_{i\bullet}'\beta + \bar{u}_{i\bullet}$; number of observations is N.
3. $\hat{\nu}_w$ = FE residuals from the within regression $\tilde{y}_{it} = \tilde{x}_{it}'\beta + \tilde{\nu}_{it}$; number of observations is NT.
4. $\hat{u}_w$ = REM residuals from the within regression. These can be computed as $\hat{\nu}_w + (\hat{\mu}_w - \bar{\hat{\mu}}_w)$.
5. $\hat{u}_{re}$ = REM residuals from the regression $y_{it}^* = \alpha^* + \beta x_{it}^* + u_{it}^*$; number of observations is NT.
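1. Wallace and Hussain (1969)
Use $\hat{u}_{ols}$ (unbiased and consistent but not efficient) instead of u in the formulas for $\hat{\sigma}_1^2$ and $\hat{\sigma}_\nu^2$
2. Swamy and Arora (1972)
Use $\hat{u}_b$ in the formula for $\hat{\sigma}_1^2$ and $\hat{\nu}_w$ in the formula for $\hat{\sigma}_\nu^2$. This is the approach used in the Eviews econometric software when you estimate a REM. It is also the default in Stata when you use the re option.
3. Amemiya (1971)
Use $\hat{u}_w$ in the formula for $\hat{\sigma}_1^2$ and $\hat{\nu}_w$ in the formula for $\hat{\sigma}_\nu^2$
4. Nerlove (1971)
Use $\hat{\mu}_w$ in the formula for $\hat{\sigma}_\mu^2$ and $\hat{\nu}_w$ in the formula for $\hat{\sigma}_\nu^2$
5. Wansbeek-Kapteyn (1989) for incomplete panels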
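As a rough illustration of the Swamy and Arora idea (between residuals for σ₁², within residuals for σ_ν²), the sketch below handles one regressor in a balanced panel. The degrees-of-freedom corrections N(T − 1) − K and N − K − 1 are the commonly used ones and are an assumption here, since the formulas in these notes divide by N(T − 1) and N. This is not a reproduction of the Eviews or Stata implementation.

```python
import numpy as np

def swamy_arora(y, x):
    """Variance components for one regressor from within and between residuals."""
    N, T = y.shape
    K = 1
    # Within regression: residuals estimate sigma_nu^2
    xd = x - x.mean(axis=1, keepdims=True)
    yd = y - y.mean(axis=1, keepdims=True)
    b_w = (xd * yd).sum() / (xd ** 2).sum()
    sigma_nu2 = ((yd - b_w * xd) ** 2).sum() / (N * (T - 1) - K)
    # Between regression on group means: residual variance estimates sigma_1^2 / T
    xb, yb = x.mean(axis=1), y.mean(axis=1)
    Xb = np.column_stack([np.ones(N), xb])
    coef = np.linalg.lstsq(Xb, yb, rcond=None)[0]
    sigma_1_2 = T * ((yb - Xb @ coef) ** 2).sum() / (N - K - 1)
    sigma_mu2 = (sigma_1_2 - sigma_nu2) / T
    return sigma_nu2, sigma_1_2, sigma_mu2

rng = np.random.default_rng(4)
N, T = 2000, 5                                   # hypothetical large-N panel
mu = rng.normal(scale=1.0, size=(N, 1))          # true sigma_mu^2 = 1
x = rng.normal(size=(N, T))
y = 1.0 + 0.7 * x + mu + rng.normal(scale=0.5, size=(N, T))  # true sigma_nu^2 = 0.25
s_nu2, s_1_2, s_mu2 = swamy_arora(y, x)
print(s_nu2, s_1_2, s_mu2)
```

Nothing in this calculation prevents `sigma_mu2` from turning out negative in a small sample, which is the situation the comments on these methods refer to.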
Some Comments
• There is not much difference between the methods when the REM specification is correct
• Only Nerlove (1971) guarantees that σ µ2 > 0 . Many users of the
other methods set θ =1 (fixed effects model) if a negative value of
σ µ2 is found.
• There are no general rules as to which method to use. The most common is Swamy-Arora (also used in Eviews)
• The REM estimates are more efficient when REM specification is
correct. They are inconsistent when the model is incorrect
• It is important to test which model is correct
xtreg ln_cons ln_rgdp, re i(country)
Random-effects GLS regression Number of obs = 154


between = 0.9903 avg = 14.0
Random effects u_i ~ Gaussian Wald chi2(1) = 879.56

corr(u_i, X) = 0 (assumed) Prob > chi2 = 0.0000
------------------------------------------------------------------------------
ln_cons | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
ln_rgdp | .8916337 .0300645 29.66 0.000 .8327083 .950559
_cons | .7713074 .3363147 2.29 0.022 .1121428 1.430472
-------------+----------------------------------------------------------------
sigma_u | .31717419
sigma_e | .10204792
------------------------------------------------------------------------------
5.3 BETWEEN ESTIMATOR
• The between estimator uses just the cross-sectional variation.

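• For instance, for a model $y_{it} = \alpha_i + x_{it}'\beta + \nu_{it}$
• We could average over all the years to yield $\bar{y}_i = \alpha_i + \bar{x}_i'\beta + \bar{\nu}_i$
• This can be rewritten as a between model
$$ \bar{y}_i = \alpha + \bar{x}_i'\beta + (\alpha_i - \alpha + \bar{\nu}_i) $$
where $\bar{y}_i = T^{-1}\sum_{t=1}^{T} y_{it}$, $\bar{\nu}_i = T^{-1}\sum_{t=1}^{T} \nu_{it}$ and $\bar{x}_i = T^{-1}\sum_{t=1}^{T} x_{it}$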
• The between estimator is the OLS estimator for regression of

yi on time averaged regressors
• The concern is the difference between different individuals
(i.e. “between estimator”) and is the analogue of cross-section
regression which is a special case T=1
• For instance for our consumption example, the data for
consumption will be as follows
Country Mean real consumption
Botswana 6926.8
Burkina Faso 1166573.8
Burundi 545623.9
Kenya 547946.8
Madagascar 20678.3
Mauritius 64759.6
Morocco 234.0
Nigeria 2878.3
Rwanda 618.8
Sierra Leone 1376062.0
South Africa 500589.7
Tanzania 5386.2
Do the same for gdp
• The between estimator is consistent if the regressors $\bar{x}_i$ are independent of the composite error $(\alpha_i - \alpha + \bar{\nu}_i)$
• This will be the case for the constant-coefficients model and the REM.
• For the fixed effects model, the between estimator is inconsistent, as $\alpha_i$ is assumed to be correlated with $x_{it}$ and hence with $\bar{x}_i$
Between regression (regression on group means) Number of obs = 154


between = 0.9903 avg = 14.0
F(1,9) = 920.77
sd(u_i + avg(e_i.))= .3183446 Prob > F = 0.0000
------------------------------------------------------------------------------
-------------+----------------------------------------------------------------
ln_rgdp | .9996306 .0329431 30.34 0.000 .9251081 1.074153
_cons | -.375336 .3627005 -1.03 0.328 -1.195821 .4451495
------------------------------------------------------------------------------
5.3.1 Comments
• The REM estimator $\hat{\beta}_{re}$ of the slope parameters converges to the within estimator as T → ∞, since then θ → 1
• We can show that the GLS estimator is a weighted average of the within and between estimators
$$ \hat{\beta}_{re} = w_1\hat{\beta}_{w} + w_2\hat{\beta}_{be} $$
where the weights $w_1$ and $w_2$ sum to one
5.4 FEM vs. REM MODEL

• The choice between FEM and REM has generated a hot debate in the panel econometrics literature
(1) Traditional criterion: Is µi viewed as a random variable or as a parameter to be estimated?
µi is a “random effect” when the individual effects are randomly distributed across the cross-sections
µi is a “fixed effect” when it is treated as a parameter to be estimated for each cross-section observation i.
(2) Modern panel data econometrics-The key issue here is whether µi is

correlated with the regressors or not.
µi is random effect-When there is zero correlation between the
observed explanatory variables and the µi i.e. Cov( X it , µ i ) = 0
µi is fixed effect-when there is correlation between the observed
explanatory variables and µi .
In other words, we allow for arbitrary correlation between the
unobserved effects µi and the observed explanatory variables
5.4.1 Individual Specific Variables (Time-invariant regressors)
• Many times when we conduct research we have exogenous variables that vary between individuals but do not vary over time within a given individual (e.g. gender, race, language, nationality etc.).
• These are called time-invariant regressors
Which is the best model to deal with such exogenous variables?
To understand this, let’s denote an individual-specific variable as $s_i$
In a FEM, we will write it as follows;
$$ y_{it} = \alpha + \mu_i + \lambda s_i + x_{it}'\beta + \nu_{it} $$
Let’s look at the two FEM approaches: LSDV and WITHIN estimation
a) LSDV: The LSDV will not be able to estimate λ because of perfect multicollinearity between the individual-specific effects, $\mu_i$, and the individual-specific variables $s_i$. The reason is that $s_i$ is constant over time for each individual, so it is an exact linear combination of the individual dummy variables
b) WITHIN: Under this approach the term $(\alpha + \mu_i + \lambda s_i)$ does not vary over time and will thus be removed by the within transformation (demeaning process):
$$ y_{it} - \bar{y}_{i\bullet} = (x_{it} - \bar{x}_{i\bullet})'\beta + (\nu_{it} - \bar{\nu}_{i\bullet}) $$
This means that the parameters of the individual specific variables
cannot be estimated using the FEM. This means that we cannot
distinguish between observed and unobserved heterogeneity, a feature
that may be important for policy.

In REM we will write the model as follows
y it = α + λ s i + x it′ β + u it
In this case λ can easily be estimated, although not when using the
Nerlove method. It is however, important to note that for the REM to
be appropriate, the observed heterogeneity, si , must be independent of
the unobserved heterogeneity, µi .
The REM thus offers an added advantage by allowing us to estimate
parameters for time-invariant regressors which may be of policy
relevance.
Note however, that in the context of the REM, we cannot
interpret the coefficients for the unobserved heterogeneity!!.
Look at eviews output

Method: Pooled EGLS (Cross-section random effects)
Date: 09/08/08 Time: 22:56
Sample: 1990 2003
Swamy and Arora estimator of component variances
C 1.019244 0.371701 2.742110 0.0068

LN_RGDP? 0.879092 0.032618 26.95085 0.0000
Random Effects (Cross)
_BOTS--C -0.922566
_BURKF--C 0.411866
_BUR--C 1.256634
_KEN--C 0.307806
_MADAG--C 0.074454
_MAURIT--C -0.057190
_MOR--C -0.687420
_NIG--C -0.625203
_RWA--C -0.253274
_SIERL--C 0.542767
_RSA--C 0.124030
_TAN--C -0.171903
S.D. Rho
Cross-section random 0.419148 0.9386

Idiosyncratic random 0.107162 0.0614
Weighted Statistics

S.E. of regression 0.115749 Sum squared resid 2.157038
F-statistic 698.1075 Durbin-Watson stat 0.954621
Prob(F-statistic) 0.000000
Unweighted Statistics

Sum squared resid 50.10865 Durbin-Watson stat 0.041094
We cannot interpret the estimated random effects as coefficients because they are randomly distributed across the cross-sections
Question:
Suppose you have a one-way error components model of the form
Prove that the sum of µ i is equal to zero.
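Solution
The question is equivalent to showing that $\sum_{i=1}^{N} \mu_i = 0$
Use the fact that $\mu_i = \alpha_i - \alpha$, where α is the average of the $\alpha_i$:
$$
\sum_{i=1}^{N} \mu_i = \sum_{i=1}^{N} (\alpha_i - \alpha) = \sum_{i=1}^{N} \alpha_i - \sum_{i=1}^{N} \alpha = N\alpha - N\alpha = 0
$$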
References

Cameron, A.C. and Trivedi, P.K. (2006), Microeconometrics: Methods and Applications, Cambridge University Press. Chapter 21

1986
Greene, W.H. (2003), Econometric Analysis, Second Edition, Macmillan. Chapter 13
Wooldridge, J.M. (2002), Econometric Analysis of Cross-Section and Panel Data, MIT Press: Cambridge, Massachusetts. Chapter 10
Stata 9 reference on Longitudinal/panel data
(CMAP)
DR. MOSES SICHEI*
LECTURE 12: HYPOTHESIS TESTING AND TWO-WAY
ERROR COMPONENT MODELS
Objectives:
The main objective of the lecture is to understand some basic panel data
models in the context of one-way error components.
Key words
• Breusch-Pagan LM test
• Chow test
• Hausman test
• Generalised least squares
• Least squares dummy variable (LSDV)
• Panel data
• Pooled data
• Two-way error components model
• Random effects model (REM)
• Within estimation
1. INTRODUCTION
• Hypothesis testing is central to statistical inference in

econometrics.
• In econometrics, we distinguish 3 types of tests;
o Parameter tests: do the parameters have specified values, e.g. is a parameter significant?
o Specification tests: is the model correct, e.g. pooled, REM, FEM or SUR?
o Misspecification tests: are any of the statistical assumptions violated, e.g. homoscedasticity or serial independence?
1.1 Parameter tests

• The principles used in ordinary regression can be applied in panel
data models
• The usual way to test the parameters is to use the t-test (one
parameter) or F-test for several parameters.
2. SPECIFICATION TESTS
• We have considered the FEM, REM, SUR and pooled model in lectures 10 and 11.
Pooled: $y_{it} = \alpha + x_{it}'\beta + \varepsilon_{it}$
FEM: $\tilde{y}_{it} = \tilde{x}_{it}'\beta + \tilde{\varepsilon}_{it}$, i.e. transformation with Q (demeaning)
REM: $y_{it}^* = \alpha^* + x_{it}^{*\prime}\beta + u_{it}^*$, i.e. transformation with $\hat{\theta} = 1 - \dfrac{\sigma_\nu}{\sqrt{T\sigma_\mu^2 + \sigma_\nu^2}}$
SUR: $y_{it} = \alpha_i + x_{it}'\beta_i + \varepsilon_{it}$, i.e. allowing for contemporaneous correlation in the error terms
• The error terms in these models satisfy the standard OLS assumptions if the respective model is correct
The specification tests are shown below;
• Pooled against REM: Breusch-Pagan LM test
• Pooled against FEM: Chow test
• REM against FEM: Hausman test
2.1 The Chow Test of the Pooled Model against the FEM
H0 : the pooled (restricted) model is correct
H A : The FE (unrestricted) model is correct
• The URSS is calculated using the residuals from the within regression model, i.e. $\tilde{y}_{it} = \tilde{x}_{it}'\beta + \tilde{\varepsilon}_{it}$.
• The number of parameters is N+K
• The RRSS is computed from the pooled regression model $y_{it} = \alpha + x_{it}'\beta + \varepsilon_{it}$
• The number of parameters is K+1
• The number of observations in both models is NT
The Chow sum of squares test is thus;
$$
F = \frac{(RRSS - URSS)/(N-1)}{URSS/(NT - N - K)} \overset{H_0}{\sim} F_{(N-1),\,(NT-N-K)}
$$
• This test is called the Chow test because of its similarity to the well-known Chow test for parameter stability.
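The F statistic above can be computed directly from the two residual sums of squares. The following is a minimal sketch for one regressor on simulated data with strong individual effects; the function name and the data-generating process are assumptions for illustration. The p-value would come from the F distribution with the returned degrees of freedom.

```python
import numpy as np

def chow_f(y, x):
    """F statistic for pooled OLS (restricted) against fixed effects (unrestricted), one regressor."""
    N, T = y.shape
    K = 1
    # Restricted model: pooled OLS with a common intercept
    X = np.column_stack([np.ones(N * T), x.ravel()])
    resid_r = y.ravel() - X @ np.linalg.lstsq(X, y.ravel(), rcond=None)[0]
    rrss = resid_r @ resid_r
    # Unrestricted model: within regression
    xd = x - x.mean(axis=1, keepdims=True)
    yd = y - y.mean(axis=1, keepdims=True)
    b_w = (xd * yd).sum() / (xd ** 2).sum()
    urss = ((yd - b_w * xd) ** 2).sum()
    df1, df2 = N - 1, N * T - N - K
    return ((rrss - urss) / df1) / (urss / df2), df1, df2

rng = np.random.default_rng(5)
N, T = 12, 6                                    # hypothetical balanced panel
mu = rng.normal(size=(N, 1))
x = rng.normal(size=(N, T)) + mu
y = 2.0 + mu + 0.7 * x + rng.normal(scale=0.1, size=(N, T))
F, df1, df2 = chow_f(y, x)
print(F, df1, df2)
```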
2.2 Pooled or REM?: Breusch-Pagan (1980) LM Test

• Breusch and Pagan (1980) derived Lagrange multiplier tests for
the presence of individual specific random effects against the null
hypothesis assumption of iid errors
• The REM reduces to the pooled model if the variance of the individual effects becomes zero, i.e. $\sigma_\mu^2 \to 0$ so that $\hat{\theta} = 1 - \dfrac{\sigma_\nu}{\sqrt{T\sigma_\mu^2 + \sigma_\nu^2}} = 0$
• The LM test uses this notion and tests the following hypothesis;
$H_0: \sigma_\mu^2 = 0$
$H_A: \sigma_\mu^2 > 0$
• The Breusch-Pagan LM statistic is calculated using the OLS residuals $\hat{\nu}_{it}$ from the pooled model
$$
LM = \frac{NT}{2(T-1)} \left[ \frac{T^2 \sum_{i=1}^{N} \bar{\hat{\nu}}_{i\bullet}^{\,2}}{\sum_{i=1}^{N}\sum_{t=1}^{T} \hat{\nu}_{it}^2} - 1 \right]^2 \sim \chi^2_1 \quad \text{under } H_0
$$
• Unfortunately, the B-P test is a two-sided test against the alternative $H_A: \sigma_\mu^2 \neq 0$, despite the fact that we know that the variance cannot be negative
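For a balanced panel, the LM statistic above needs only the pooled OLS residuals arranged as an N × T array. A minimal sketch, with a hypothetical function name and simulated data; note that T² times the squared residual mean of a cross-section equals the squared sum of its residuals.

```python
import numpy as np

def breusch_pagan_lm(resid):
    """LM statistic from an (N, T) array of pooled OLS residuals (balanced panel)."""
    N, T = resid.shape
    # (sum_t e_it)^2 summed over i equals T^2 * sum_i (mean_i)^2
    ratio = (resid.sum(axis=1) ** 2).sum() / (resid ** 2).sum()
    return N * T / (2.0 * (T - 1)) * (ratio - 1.0) ** 2

rng = np.random.default_rng(6)
N, T = 50, 5                                     # hypothetical balanced panel
mu = rng.normal(size=(N, 1))
x = rng.normal(size=(N, T))
y = 1.0 + 0.7 * x + mu + rng.normal(scale=0.5, size=(N, T))

X = np.column_stack([np.ones(N * T), x.ravel()])
coef = np.linalg.lstsq(X, y.ravel(), rcond=None)[0]
resid = (y.ravel() - X @ coef).reshape(N, T)
lm_stat = breusch_pagan_lm(resid)
print(lm_stat)   # compare with the 5% critical value of chi-square(1), about 3.84
```

This sketch does not cover the unbalanced-panel modification that Stata's `xttest0` applies.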
xttest0
Breusch and Pagan Lagrangian multiplier test for random effects:
ln_cons[country,t] = Xb + u[country] + e[country,t]
Estimated results:
| Var sd = sqrt(Var)
---------+-----------------------------
ln_cons | 8.644141 2.940092
e | .0104138 .1020479
u | .1005995 .3171742
Test: Var(u) = 0
chi2(1) = 731.09
Prob > chi2 = 0.0000
• The above BP LM test is based on Breusch and Pagan (1980), as modified by Baltagi and Li (1990) to deal with unbalanced/incomplete panels
• We reject the null of $H_0: \sigma_\mu^2 = 0$, implying that the REM is better than a pooled model
• It is important to note that LM tests generally have low power. Experiments have shown that it is better to use the Chow test for FE against pooled even if we suspect that RE is the correct alternative model.
Eviews does not perform the Breusch-Pagan test directly
2.3 Testing the Joint Validity of Fixed Effects

$H_0: \mu_1 = \mu_2 = \dots = \mu_{N-1} = 0$ (no individual effects; same intercept for all cross sections)
$H_A$: at least one $\mu_i \neq 0$
• We test the null hypothesis of no individual effects with an applied Chow or F-test, combining the residual sums of squares for the regression both with constraints (under the null) and without (under the alternative):
RRSS: from OLS on the pooled model (constant intercept)
URSS: from the LSDV model or within estimation
$$
F = \frac{(RRSS - URSS)/(N-1)}{URSS/(NT - N - K)} \overset{H_0}{\sim} F_{(N-1),\,(NT-N-K)}
$$
If N is “large enough”, one can use “within” estimation instead of LSDV for the residual sum of squares.
Eviews 5 test
Redundant Fixed Effects Tests
Pool: POOL1
Test cross-section fixed effects
Effects Test Statistic d.f. Prob.
Cross-section F 193.077198 (11,150) 0.0000

Cross-section Chi-square 443.130831 11 0.0000
We reject the null of H 0 : µ1 = µ 2 = ... = µ N −1 = 0
Stata produces the test of fixed effects as default output from the fixed
effects command as follows
. xtreg ln_cons ln_rgdp,fe i(country)
Fixed-effects (within) regression Number of obs = 154


between = 0.9903 avg = 14.0
F(1,142) = 183.96
corr(u_i, Xb) = 0.9591 Prob > F = 0.0000
------------------------------------------------------------------------------
-------------+----------------------------------------------------------------
ln_rgdp | .6576598 .0484889 13.56 0.000 .5618063 .7535133
_cons | 3.255495 .5148909 6.32 0.000 2.237653 4.273337
-------------+----------------------------------------------------------------
sigma_u | 1.0877784
sigma_e | .10204792
------------------------------------------------------------------------------
F test that all u_i=0: F(10, 142) = 127.58 Prob > F = 0.0000
• We reject the null that fixed effects are redundant just as we found
using eviews software
2.4 The Hausman Specification Test for the FE against the REM
• The Hausman test is a general test procedure, which is used when
we want to test the validity of an assumption that is necessary for
efficient estimation.
• In the panel framework the test checks for the following;
$H_0: Cor(\mu_i, x_{it}) = 0$ i.e. REM is correct
$H_A: Cor(\mu_i, x_{it}) \neq 0$ i.e. FEM is correct
The Hausman test is calculated in a matrix form or omitted variable form.
• We begin by assuming that the true model is the random effects model, with individual effects uncorrelated with the regressors and the error term
• The estimator $\hat{\beta}_{re}$ is then fully efficient
• The Hausman test statistic is
$$
H = (\hat{\beta}_{re} - \hat{\beta}_w)' \left[ \hat{V}(\hat{\beta}_w) - \hat{V}(\hat{\beta}_{re}) \right]^{-1} (\hat{\beta}_{re} - \hat{\beta}_w)
$$
$H_0: \hat{\beta}_{re} - \hat{\beta}_w = 0$ i.e. the correct model is a REM
$H_A: \hat{\beta}_{re} - \hat{\beta}_w \neq 0$ i.e. the correct model is a FEM
The test statistic is asymptotically chi-square distributed with K degrees of freedom under the null hypothesis
Correlated Random Effects - Hausman Test
Pool: POOL1
Test cross-section random effects
Chi-Sq.
Test Summary Statistic Chi-Sq. d.f. Prob.
Cross-section random 25.960427 1 0.0000
Cross-section random effects test comparisons:
Variable Fixed Random Var(Diff.) Prob.
LN_RGDP? 0.683031 0.879092 0.001481 0.0000
We reject the null that the FE and RE estimates are the same, which favours the FEM
In Stata, we have to run the FEM and REM, store the estimates and then use the hausman command (see table 4.3 in Baltagi, 2005:71)
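The Hausman statistic can be reproduced approximately from the Eviews output reported in these notes: the fixed and random effects slopes (0.683031 and 0.879092) and their standard errors (0.050445 and 0.032618). The function below is a small sketch of the matrix formula, written so that it also accepts scalars; its name is hypothetical.

```python
import numpy as np

def hausman(b_re, b_w, V_re, V_w):
    """Hausman statistic (b_re - b_w)' [V_w - V_re]^{-1} (b_re - b_w)."""
    d = np.atleast_1d(b_re - b_w)
    V = np.atleast_2d(V_w - V_re)
    return float(d @ np.linalg.solve(V, d))

# Slopes and standard errors taken from the Eviews FE and RE output above
b_w, se_w = 0.683031, 0.050445
b_re, se_re = 0.879092, 0.032618
H = hausman(b_re, b_w, se_re ** 2, se_w ** 2)
print(H)   # close to the reported 25.96; small differences come from rounding
```

The difference in variances, `se_w**2 - se_re**2`, is about 0.001481, which matches the Var(Diff.) column in the Eviews table.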
2.5 Testing for heteroscedasticity

• Estimating a model with heteroscedastic errors under the assumption of homoscedasticity will yield consistent estimates, but they will not be efficient.
• The standard errors will be biased, and one should compute robust standard errors correcting for the possible presence of heteroscedasticity.
• From σ i2 = σ µ2 + σν2
• Which component on the right hand side is heteroscedastic?

• There are two different situations that can arise;
o Variance of the individual effects µi is heteroscedastic as
suggested by Mazodier and Trognon (1978) in the case of
REM
o Variance of ν it is heteroscedastic as suggested by Baltagi and
Griffin (1988)
• There is quite a lot of details regarding this in Baltagi 2005 chapter
5
• Testing for heteroscedasticity in the one-way error component model is
complicated, especially when $\mu_i$ is heteroscedastic
o Verbon (1980) derived an LM test for the null of
homoscedasticity against the heteroscedastic alternative
$\mu_i \sim (0, \sigma_{\mu_i}^2)$ and $\nu_{it} \sim (0, \sigma_{it}^2)$
o Li and Stengos (1994) suggested a modified Breusch-Pagan
test for the significance of the random individual effects, $H_0: \sigma_\mu^2 = 0$,
which is robust to heteroscedasticity of unknown form in the
remainder error term
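• A simple test can be done as follows:
$H_0: \sigma_i^2 = \sigma^2$ for all i, i.e. homoscedasticity
$H_A: \sigma_i^2 \neq \sigma^2$ for at least one i, i.e. heteroscedasticity
where
$$\hat\sigma^2 = \frac{1}{NT}(e'e)$$
is computed from the pooled RSS and
$$\hat\sigma_i^2 = \frac{1}{T}(e_i'e_i)$$
is computed from the cross-section-specific residual vectors
The LM test is as follows:
$$LM = \frac{T}{2}\sum_{i=1}^{N}\left[\frac{\hat\sigma_i^2}{\hat\sigma^2} - 1\right]^2 \sim \chi^2_{(N-1)}$$
This is a one-tail test
Notice that this is an imperfect test, since $\hat\sigma_i^2$ estimates the sum $\sigma_\mu^2 + \sigma_\nu^2$ and so cannot show which component is heteroscedastic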
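The LM statistic above is simple enough to compute directly from pooled OLS residuals. The following is a minimal sketch, assuming a balanced panel whose residuals are stored as one list per cross-section; the function name `groupwise_lm` and the toy residuals are hypothetical.

```python
def groupwise_lm(residuals):
    """LM statistic for groupwise heteroscedasticity in a balanced panel.

    residuals: list of N lists, each with the T pooled-OLS residuals of
    one cross-section. Returns (LM statistic, chi-square d.f.).
    """
    n = len(residuals)
    t = len(residuals[0])
    # Pooled variance: e'e / (N*T)
    pooled_var = sum(e * e for row in residuals for e in row) / (n * t)
    total = 0.0
    for row in residuals:
        unit_var = sum(e * e for e in row) / t  # e_i'e_i / T
        total += (unit_var / pooled_var - 1.0) ** 2
    return 0.5 * t * total, n - 1

# Toy residuals: the first unit is clearly more volatile than the second
print(groupwise_lm([[2.0, -2.0], [0.0, 0.0]]))
```

The statistic is compared with the upper tail of a chi-square distribution with N-1 degrees of freedom.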
2.6 Testing for serial correlation

• Ignoring serial correlation when it is present results in consistent
but inefficient estimates of the regression coefficients and biased
standard errors
• There are two types of serial correlation here
o Effects-driven serial correlation: the classical one-way error
component model, $\varepsilon_{it} = \mu_i + \nu_{it}$, assumes that the only
correlation over time is due to the presence of the same
individual across the panel.
The correlation between $\varepsilon_{it}$ and $\varepsilon_{js}$ is
$$\rho = \mathrm{cor}(\varepsilon_{it}, \varepsilon_{js}) = \begin{cases} 1 & \text{for } i = j,\ t = s \\ \dfrac{\sigma_\mu^2}{\sigma_\mu^2 + \sigma_\nu^2} & \text{for } i = j,\ t \neq s \\ 0 & \text{for } i \neq j \end{cases}$$
It is the same no matter how far s is from t. This is
quite restrictive for economic relationships like
investment, where an unobserved shock this period will
affect the behavioural relationship for at least the next
few periods
o Traditional time/distance-driven serial correlation, which is
ignored above
• There are different ways to model this serial correlation
o AR processes, e.g. AR(1): $\nu_{it} = \rho\nu_{it-1} + \eta_{it}$; AR(2):
$\nu_{it} = \rho_1\nu_{it-1} + \rho_2\nu_{it-2} + \eta_{it}$
o Moving average processes, e.g. MA(1): $\nu_{it} = \eta_{it} + \lambda\eta_{it-1}$
• One could compute the first-order within-individual autocorrelation
coefficient from the residuals of the within regression
$$r = \frac{\sum_{i=1}^{N}\sum_{t=2}^{T}\hat\varepsilon_{it}\hat\varepsilon_{it-1}}{\sum_{i=1}^{N}\sum_{t=2}^{T}\hat\varepsilon_{it}^2}$$
• The simplest test is the LM test due to Breusch and Godfrey
$$LM = \sqrt{\frac{NT^2}{T-1}}\; r \sim N(0,1) \text{ under } H_0$$
• We can also use the Durbin-Watson statistic in panel data
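The autocorrelation coefficient r and the LM statistic can be computed from within residuals in a few lines. This is a sketch under the assumption of a balanced panel stored as one residual list per individual; the function name `within_autocorr_lm` and the toy residuals are illustrative.

```python
import math

def within_autocorr_lm(residuals):
    """First-order within-individual autocorrelation r and the LM statistic.

    residuals: list of N lists with the T within-regression residuals
    of each individual (balanced panel).
    """
    n = len(residuals)
    t = len(residuals[0])
    # Both sums run over t = 2, ..., T, as in the formula for r
    num = sum(row[s] * row[s - 1] for row in residuals for s in range(1, t))
    den = sum(row[s] ** 2 for row in residuals for s in range(1, t))
    r = num / den
    lm = math.sqrt(n * t * t / (t - 1)) * r
    return r, lm

# Toy residuals with perfectly alternating signs, so r = -1
print(within_autocorr_lm([[1.0, -1.0, 1.0, -1.0]]))
```

Under the null of no serial correlation the LM value is compared with standard normal critical values.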
2.7 Robust standard errors

• If we discover (or suspect) heteroscedasticity or serial correlation,
we must decide what to do
• One approach is to model the variances and/or correlations.
For instance, we could estimate the regression assuming that the
disturbance term is first-order autoregressive.
• This can be done in Stata using the xtregar command
$$y_{it} = \alpha + x_{it}'\beta + \mu_i + \varepsilon_{it}$$
where $\varepsilon_{it} = \rho\varepsilon_{it-1} + \eta_{it}$
• An alternative approach is to accept the usual coefficient estimates, but to
calculate robust standard errors for them
• If we only suspect heteroscedasticity, we can use White's
robust standard errors
• If we suspect heteroscedasticity and/or within-individual
autocorrelation, we can use Arellano's (1987) robust standard
errors
• White's method is standard in most econometric software
• Arellano's method is available in Stata
3. TWO-WAY ERROR COMPONENT MODEL

• In a one-way error component model, we assume that there is
unobserved individual heterogeneity but that the model is
homogeneous over time.
• Under the two-way error component model, the error term is
decomposed into individual, time and remainder components
• $\varepsilon_{it} = \alpha + \mu_i + \lambda_t + \nu_{it}$
• where $\mu_i$ denotes the unobserved individual effect discussed
earlier, $\lambda_t$ denotes the unobserved time effect and $\nu_{it}$ is the
remainder stochastic disturbance term
• Note that $\lambda_t$ is individual-invariant and accounts for any time-specific
effect that is not included in the regression
• Note also that $\mu_i$ is time-invariant
[Figure: $y_{it}$ plotted against $x_{it}$ for individuals i = 1, 2, 3 in periods t = 1, 2, with the relation $y_{it} = \alpha_i + x_{it}'\beta + \varepsilon_{it}$]
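• Our model now becomes
$$y_{it} = \alpha + \mu_i + \lambda_t + x_{it}'\beta + \nu_{it}$$
where $\sum_{i=1}^{N}\mu_i = 0$ and $\sum_{t=1}^{T}\lambda_t = 0$
The individual/time effects can be defined as follows:
$$\alpha_{it} = \alpha + \mu_i + \lambda_t$$
$$\alpha = \alpha_{\bullet\bullet} \equiv \frac{1}{NT}\sum_{i=1}^{N}\sum_{t=1}^{T}\alpha_{it}$$
i.e. the average effect
$$\alpha + \mu_i = \alpha_{i\bullet} \equiv \frac{1}{T}\sum_{t=1}^{T}\alpha_{it}$$
i.e. the individual effect. This is time-invariant.
$$\alpha + \lambda_t = \alpha_{\bullet t} \equiv \frac{1}{N}\sum_{i=1}^{N}\alpha_{it}$$
i.e. the time effect. This is cross-section invariant.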
• Note that some software reports the individual effects as $\alpha_{i\bullet}$ while
other software reports them as $\mu_i$. Notice also that $\alpha_{it} - \alpha_{i\bullet} - \alpha_{\bullet t} + \alpha_{\bullet\bullet} = 0$
• In a two-way error component model both the individual and time
effects can be fixed or random, e.g.
o $\{\mu_F, \lambda_F\}$, i.e. both are fixed effects
o $\{\mu_R, \lambda_F\}$ or $\{\mu_F, \lambda_R\}$, i.e. a mixed FE/RE model
o $\{\mu_R, \lambda_R\}$, i.e. both are random effects
3.1. FULLY FIXED EFFECTS MODEL ( {µ F , λF } )
1. ESTIMATING FIXED EFFECTS
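• We extend the one-way model assumptions: $\mu_i$ and $\lambda_t$ are fixed
parameters to be estimated and $\nu_{it} \sim IID(0, \sigma_\nu^2)$
LSDV
Estimation with LSDV requires the estimation of {(N-1)+(T-1)}
dummies. The table below shows the period dummies for 1998-2003 (2003 is the base period, so its column is always zero).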
Country Period 1998 1999 2000 2001 2002 2003
Botswana 1998 1 0 0 0 0 0
Botswana 1999 0 1 0 0 0 0
Botswana 2000 0 0 1 0 0 0
Botswana 2001 0 0 0 1 0 0
Botswana 2002 0 0 0 0 1 0
Botswana 2003 0 0 0 0 0 0
Burkina Faso 1998 1 0 0 0 0 0
Burkina Faso 1999 0 1 0 0 0 0
Burkina Faso 2000 0 0 1 0 0 0
Burkina Faso 2001 0 0 0 1 0 0
Burkina Faso 2002 0 0 0 0 1 0
Burkina Faso 2003 0 0 0 0 0 0
Burundi 1998 1 0 0 0 0 0
Burundi 1999 0 1 0 0 0 0
Burundi 2000 0 0 1 0 0 0
Burundi 2001 0 0 0 1 0 0
Burundi 2002 0 0 0 0 1 0
Burundi 2003 0 0 0 0 0 0
Kenya 1998 1 0 0 0 0 0
Kenya 1999 0 1 0 0 0 0
Kenya 2000 0 0 1 0 0 0
Kenya 2001 0 0 0 1 0 0
Kenya 2002 0 0 0 0 1 0
Kenya 2003 0 0 0 0 0 0
Madagascar 1998 1 0 0 0 0 0
Madagascar 1999 0 1 0 0 0 0
Madagascar 2000 0 0 1 0 0 0
Madagascar 2001 0 0 0 1 0 0
Madagascar 2002 0 0 0 0 1 0
Madagascar 2003 0 0 0 0 0 0
Mauritius 1998 1 0 0 0 0 0
Mauritius 1999 0 1 0 0 0 0
Mauritius 2000 0 0 1 0 0 0
Mauritius 2001 0 0 0 1 0 0
Mauritius 2002 0 0 0 0 1 0
Mauritius 2003 0 0 0 0 0 0
Morocco 1998 1 0 0 0 0 0
Morocco 1999 0 1 0 0 0 0
Morocco 2000 0 0 1 0 0 0
Morocco 2001 0 0 0 1 0 0
Morocco 2002 0 0 0 0 1 0
Morocco 2003 0 0 0 0 0 0
Nigeria 1998 1 0 0 0 0 0
Nigeria 1999 0 1 0 0 0 0
Nigeria 2000 0 0 1 0 0 0
Nigeria 2001 0 0 0 1 0 0
Nigeria 2002 0 0 0 0 1 0
Nigeria 2003 0 0 0 0 0 0
Rwanda 1998 1 0 0 0 0 0
Rwanda 1999 0 1 0 0 0 0
Rwanda 2000 0 0 1 0 0 0
Rwanda 2001 0 0 0 1 0 0
Rwanda 2002 0 0 0 0 1 0
Rwanda 2003 0 0 0 0 0 0
Sierra Leone 1998 1 0 0 0 0 0
Sierra Leone 1999 0 1 0 0 0 0
Sierra Leone 2000 0 0 1 0 0 0
Sierra Leone 2001 0 0 0 1 0 0
Sierra Leone 2002 0 0 0 0 1 0
Sierra Leone 2003 0 0 0 0 0 0
South Africa 1998 1 0 0 0 0 0
South Africa 1999 0 1 0 0 0 0
South Africa 2000 0 0 1 0 0 0
South Africa 2001 0 0 0 1 0 0
South Africa 2002 0 0 0 0 1 0
South Africa 2003 0 0 0 0 0 0
Tanzania 1998 1 0 0 0 0 0
Tanzania 1999 0 1 0 0 0 0
Tanzania 2000 0 0 1 0 0 0
Tanzania 2001 0 0 0 1 0 0
Tanzania 2002 0 0 0 0 1 0
Tanzania 2003 0 0 0 0 0 0
This can introduce a rather severe loss of degrees of freedom!
WITHIN
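• Once again, to avoid this problem we can perform the “WITHIN”
transformation (similar to the one-way model).
• Now, however, we must demean across both dimensions.
$$\tilde y_{it} = y_{it} - \bar y_{i\bullet} - \bar y_{\bullet t} + \bar y_{\bullet\bullet}$$
$$\tilde x_{it} = x_{it} - \bar x_{i\bullet} - \bar x_{\bullet t} + \bar x_{\bullet\bullet}$$
$$\tilde\nu_{it} = \nu_{it} - \bar\nu_{i\bullet} - \bar\nu_{\bullet t} + \bar\nu_{\bullet\bullet}$$
with
$$\bar y_{\bullet\bullet} = \frac{\sum_{i=1}^{N}\sum_{t=1}^{T} y_{it}}{NT}, \quad \bar x_{\bullet\bullet} = \frac{\sum_{i=1}^{N}\sum_{t=1}^{T} x_{it}}{NT}, \quad \bar\nu_{\bullet\bullet} = \frac{\sum_{i=1}^{N}\sum_{t=1}^{T}\nu_{it}}{NT}$$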
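• The two-way within model can thus be written as
$$\tilde y_{it} = \tilde x_{it}'\beta + \tilde\nu_{it}$$
• We now need two constraints to identify the individual and time
effects:
$$\sum_i \mu_i = 0, \qquad \sum_t \lambda_t = 0$$
We can recover the average, individual and time effects as follows:
$$\tilde\alpha = \bar y_{\bullet\bullet} - \hat\beta\bar x_{\bullet\bullet}$$
$$\tilde\mu_i = (\bar y_{i\bullet} - \bar y_{\bullet\bullet}) - \hat\beta(\bar x_{i\bullet} - \bar x_{\bullet\bullet})$$
$$\tilde\lambda_t = (\bar y_{\bullet t} - \bar y_{\bullet\bullet}) - \hat\beta(\bar x_{\bullet t} - \bar x_{\bullet\bullet})$$
(Without the $\bar y_{\bullet\bullet}$ and $\bar x_{\bullet\bullet}$ terms these expressions give $\alpha + \mu_i$ and $\alpha + \lambda_t$, which is the form some software reports.)
Consistency
• $\tilde\alpha$ and $\hat\beta$ are consistent as either N or T tends to infinity
• $\tilde\mu_i$ is only T-consistent
• $\tilde\lambda_t$ is only N-consistent
• The two-way error components within transformation removes
both observed and unobserved heterogeneity in the individual
and time dimensions, so the coefficients of time-invariant or individual-invariant regressors cannot be estimated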
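The double demeaning can be illustrated on a tiny balanced panel with one regressor. The sketch below is not EViews' or Stata's implementation; the function name `two_way_within_beta` and the data are made up, with y built as 2x plus additive individual and time effects and no noise, so the within estimate should recover a slope of about 2.

```python
def two_way_within_beta(x, y):
    """Two-way within estimate of beta for one regressor (balanced panel).

    x, y: lists of N rows, each holding T observations.
    """
    n, t = len(x), len(x[0])

    def demean(z):
        row = [sum(r) / t for r in z]                                  # z_i.
        col = [sum(z[i][s] for i in range(n)) / n for s in range(t)]   # z_.t
        grand = sum(row) / n                                           # z_..
        return [[z[i][s] - row[i] - col[s] + grand for s in range(t)]
                for i in range(n)]

    xt, yt = demean(x), demean(y)
    num = sum(xt[i][s] * yt[i][s] for i in range(n) for s in range(t))
    den = sum(xt[i][s] ** 2 for i in range(n) for s in range(t))
    return num / den

# Illustrative data: y = 2x + mu_i + lambda_t
x = [[1.0, 2.0, 4.0], [2.0, 5.0, 3.0], [7.0, 1.0, 2.0]]
mu = [1.0, -2.0, 3.0]
lam = [0.5, -1.0, 2.0]
y = [[2.0 * x[i][s] + mu[i] + lam[s] for s in range(3)] for i in range(3)]
print(two_way_within_beta(x, y))
```

No dummy variables are needed: the individual and time effects drop out because they are additive in i and t.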
Two-way error components FEM in Eviews

Date: 09/09/08 Time: 15:39
Sample: 1990 2003
C 5.063740 0.633160 7.997570 0.0000

LN_RGDP? 0.498205 0.059060 8.435613 0.0000
Fixed Effects (Cross)
_BOTS--C -1.244337
_BURKF--C 1.731393
_BUR--C 1.943289
_KEN--C 1.347627
_MADAG--C -0.200967
_MAURIT--C 0.214007
_MOR--C -2.576373
_NIG--C -1.459456
_RWA--C -1.911089
_SIERL--C 1.882629
_RSA--C 1.205361
_TAN--C -0.932085
Fixed Effects (Period)
1990--C -0.003013
1991--C 0.023721
1992--C 0.031312
1993--C 0.025417
1994--C 0.019385
1995--C 0.000829
1996--C 0.024241
1997--C 0.055336
1998--C 0.082757
1999--C 0.057133
2000--C 0.064770
2001--C 0.120360
2002--C 0.171388
2003--C 0.186043
Cross-section fixed (dummy variables)
Period fixed (dummy variables)

3.2 FULLY RANDOM EFFECTS MODEL ( {µ R , λR } )

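• This model can be written as
$$y_{it} = \alpha + x_{it}'\beta + \mu_i + \lambda_t + \nu_{it}$$
• where $\mu_i$, $\lambda_t$ and $\nu_{it}$ are independent of each other and of $x_{it}$
• OLS will be consistent but inefficient
• The efficient estimator is FGLS, obtained by regressing $y_{it}^*$ on $x_{it}^*$,
where
$$y_{it}^* = y_{it} - \theta_1\bar y_{i\bullet} - \theta_2\bar y_{\bullet t} + \theta_3\bar y_{\bullet\bullet}$$
$$x_{it}^* = x_{it} - \theta_1\bar x_{i\bullet} - \theta_2\bar x_{\bullet t} + \theta_3\bar x_{\bullet\bullet}$$
$$\nu_{it}^* = \nu_{it} - \theta_1\bar\nu_{i\bullet} - \theta_2\bar\nu_{\bullet t} + \theta_3\bar\nu_{\bullet\bullet}$$
and
$$\theta_1 = 1 - \frac{\sigma_\nu}{\sigma_1} \quad \text{where } \sigma_1^2 = T\sigma_\mu^2 + \sigma_\nu^2$$
$$\theta_2 = 1 - \frac{\sigma_\nu}{\sigma_2} \quad \text{where } \sigma_2^2 = N\sigma_\lambda^2 + \sigma_\nu^2$$
$$\theta_3 = \theta_1 + \theta_2 + \frac{\sigma_\nu}{\sigma_3} - 1 \quad \text{where } \sigma_3^2 = \sigma_1^2 + \sigma_2^2 - \sigma_\nu^2$$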
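Given values for the three variance components, the quasi-demeaning weights follow mechanically from the formulas above. A small sketch, where the function name `two_way_thetas` and the variance values are purely illustrative:

```python
import math

def two_way_thetas(var_mu, var_lambda, var_nu, n, t):
    """Quasi-demeaning weights for the two-way random effects model."""
    s_nu = math.sqrt(var_nu)
    s1 = math.sqrt(t * var_mu + var_nu)                    # sigma_1
    s2 = math.sqrt(n * var_lambda + var_nu)                # sigma_2
    s3 = math.sqrt(t * var_mu + n * var_lambda + var_nu)   # sigma_3
    theta1 = 1 - s_nu / s1
    theta2 = 1 - s_nu / s2
    theta3 = theta1 + theta2 + s_nu / s3 - 1
    return theta1, theta2, theta3

# Hypothetical variance components for a panel with N = 12, T = 9
print(two_way_thetas(var_mu=0.5, var_lambda=0.0, var_nu=0.1, n=12, t=9))
```

When the period variance is zero, as in the hypothetical call, the second and third weights are zero and the transformation reduces to the one-way quasi-demeaning. When all effect variances are zero all three weights are zero and FGLS reduces to pooled OLS.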
3.2.1 Problem of estimating θ
• The problem is that we do not know θ
• Similar alternatives to those in the one-way error component model can be
applied
o Wallace and Hussain, which uses OLS residuals
o Amemiya, which uses within estimation residuals
o Swamy-Arora, which uses between-individual,
between-time and within residuals

Method: Pooled EGLS (Two-way random effects)
Date: 09/09/08 Time: 15:44
Sample (adjusted): 1990 1998
Included observations: 9 after adjustments
Total pool (balanced) observations: 108
C 0.986764 0.489575 2.015554 0.0464

LN_RGDP? 0.881646 0.043623 20.21071 0.0000
Random Effects
(Cross)
_BOTS--C -0.877294
_BURKF--C 0.408867
_BUR--C 1.259177
_KEN--C 0.253819
_MADAG--C 0.091398
_MAURIT--C -0.042582
_MOR--C -0.654678
_NIG--C -0.610147
_RWA--C -0.200834
_SIERL--C 0.494613
_RSA--C 0.091849
_TAN--C -0.214189
Random Effects
(Period)
1990--C 0.000000
1991--C 0.000000
1992--C 0.000000
1993--C 0.000000
1994--C 0.000000
1995--C 0.000000
1996--C 0.000000
1997--C 0.000000
1998--C 0.000000
S.D. Rho
Period random 0.000000 0.0000
Weighted Statistics


3.3 MIXED FE/RE MODEL ( {µ F , λR }) and {µ R , λF }

• The estimation can be done using either the within or the RE approach
• There are different transformations depending on whether we have
$\{\mu_F, \lambda_R\}$ or $\{\mu_R, \lambda_F\}$
• There are also different methods for computing θ in the REM
• EViews can estimate this model easily

Method: Pooled EGLS (Cross-section random effects)
Date: 09/09/08 Time: 15:46
Sample (adjusted): 1990 1998
Included observations: 9 after adjustments
Total pool (balanced) observations: 108
C 1.002894 0.421706 2.378184 0.0193

LN_RGDP? 0.880140 0.039372 22.35460 0.0000
Random Effects
(Cross)
_BOTS--C -0.878631
_BURKF--C 0.413947
_BUR--C 1.261821
_KEN--C 0.257952
_MADAG--C 0.090330
_MAURIT--C -0.041596
_MOR--C -0.662075
_NIG--C -0.613345
_RWA--C -0.207404
_SIERL--C 0.500086
_RSA--C 0.096118
_TAN--C -0.217204
Fixed Effects (Period)
1990--C -0.010309
1991--C 0.002957
1992--C -0.001522
1993--C 0.003697
1994--C 0.011615
1995--C -0.035977
1996--C -0.016217
1997--C 0.016454
1998--C 0.029302
S.D. Rho

Period fixed (dummy variables)
Weighted Statistics


3.4. TESTING FOR FIXED EFFECTS
• We can adapt an F-test to test the null hypothesis of one
common intercept across time and cross-sections against the
alternative of an intercept for each year and cross-section.
$H_0: \mu_1 = \mu_2 = \dots = \mu_{N-1} = 0$ and $\lambda_1 = \lambda_2 = \dots = \lambda_{T-1} = 0$
$H_A$: not all equal to 0.
RRSS is taken from pooled OLS (without dummies)
URSS is taken from the WITHIN regression
$$F_1 = \frac{(RRSS - URSS)/(N + T - 2)}{URSS/[(N-1)(T-1) - K]} \overset{H_0}{\sim} F_{(N+T-2),\,(N-1)(T-1)-K}$$
• Note: as seen previously, the same approach can also test just the individual effects or
just the time effects.
Redundant Fixed Effects Tests

Pool: POOL1
Test cross-section fixed effects
Effects Test Statistic d.f. Prob.
Cross-section F 167.055245 (11,95) 0.0000
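The joint F statistic only needs the two residual sums of squares and the panel dimensions. The sketch below uses a hypothetical function name and made-up RRSS and URSS values; it is not a reproduction of the EViews output above, which tests the cross-section effects only.

```python
def joint_effects_f(rrss, urss, n, t, k):
    """F statistic for the joint significance of individual and time effects.

    rrss: residual sum of squares from pooled OLS (restricted model)
    urss: residual sum of squares from the two-way within regression
    k:    number of slope coefficients
    """
    df1 = n + t - 2
    df2 = (n - 1) * (t - 1) - k
    f = ((rrss - urss) / df1) / (urss / df2)
    return f, df1, df2

# Made-up sums of squares for N = 12 countries, T = 9 years, K = 1 regressor
print(joint_effects_f(rrss=200.0, urss=100.0, n=12, t=9, k=1))
```

The value is compared with the critical value of the F distribution with (df1, df2) degrees of freedom.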
References
Arellano, M.(1987), “Computing Robust Standard Errors for Within
Group Estimators”, Oxford Bulletin of Economics and Statistics,
Vol.49(4), 431-434.
Baltagi, B.H. and Li, Q. (1990), “A Lagrange Multiplier Test for Error
Components with Incomplete Panels”, Econometric Reviews, Vol.9(1),
103-107.

John Wiley Chapter 3-5
Breusch, T. and Pagan, A. (1980), “The Lagrange Multiplier Test and its
Applications to Model Specifications in Econometrics”. Review of
Economic Studies, Vol.47, 239-253.


1986

Chapter 13

LECTURE 13: INTRODUCTION TO DYNAMIC PANEL DATA
AND NONSTATIONARY PANELS
Objectives:
The main objective of the lecture is to understand dynamic panels and
nonstationary panels.
Key words
• Arellano and Bond
• Arellano and Bover
• Dynamic panel
• GMM
• Kao test of cointegration
• Nickell bias
• Panel cointegration
• Panel unit roots
• Pedroni test of cointegration
• Nonstationary panels
• Orthogonal forward deviations
• Spurious regression
1. DYNAMIC PANEL DATA
• Many economic relationships are dynamic in nature
• One key advantage of panel data is that they allow the researcher to better understand
the dynamics of adjustment
• Examples
o Dynamic demand for energy
o Dynamic wage equation
o Dynamic model of employment
• Dynamic relationships are characterized by the presence of a lagged dependent variable
among the regressors
$$y_{it} = \delta y_{it-1} + x_{it}'\beta + u_{it}$$
• Assume a one-way error component model
• $u_{it} = \mu_i + v_{it}$
• The model then becomes $y_{it} = \mu_i + \delta y_{it-1} + x_{it}'\beta + v_{it}$
where $\mu_i \sim iid(0, \sigma_\mu^2)$ and $v_{it} \sim iid(0, \sigma_v^2)$
• The dynamic model above is characterized by two sources of persistence over time
o Autocorrelation due to the presence of the lagged dependent variable among the
regressors
o Individual effects characterizing the heterogeneity among the individuals
1.1 Problems introduced by the lagged dependent variable

• Since $y_{it}$ is a function of $\mu_i$, $y_{it-1}$ is also a function of $\mu_i$. This can be seen from the
fact that
$$y_{it-1} = \mu_i + \delta y_{it-2} + x_{it-1}'\beta + v_{it-1}$$
• It implies that $y_{it-1}$ is correlated with the error term $u_{it} = \mu_i + v_{it}$ in the equation
$$y_{it} = \delta y_{it-1} + x_{it}'\beta + u_{it}$$
• This renders the OLS estimator biased and inconsistent even if the $v_{it}$ are not serially correlated
1.2 Nickell (1981) bias in FE estimator

• The within estimator runs a regression of the form
$$y_{it} - \bar y_{i\bullet} = (\mu_i - \mu_i) + \delta(y_{it-1} - \bar y_{i\bullet,-1}) + (x_{it} - \bar x_{i\bullet})'\beta + (v_{it} - \bar v_{i\bullet})$$
• BUT: $(y_{it-1} - \bar y_{i\bullet,-1})$ is correlated with $(v_{it} - \bar v_{i\bullet})$ even if the $v_{it}$ are not serially correlated.
• This is because $y_{it-1}$ is correlated with $\bar v_{i\bullet}$ by construction, since the average $\bar v_{i\bullet}$ contains $v_{it-1}$
• Indeed, the within estimator has a bias of order $O(T^{-1})$ and its consistency
depends on T being large
• Consequently, only if $T \to \infty$ will the within estimator of δ and β be consistent for
the dynamic error component model.
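A small Monte Carlo experiment makes the Nickell bias visible. The sketch below is an illustration under assumed settings (no x regressors, δ = 0.5, standard normal $\mu_i$ and $v_{it}$, 500 individuals, a fixed seed); the function name `simulate_within_delta` is hypothetical.

```python
import random

def simulate_within_delta(n=500, t=6, delta=0.5, seed=0):
    """Within (FE) estimate of delta from simulated data
    y_it = mu_i + delta * y_it-1 + v_it, using t observations per individual."""
    rng = random.Random(seed)
    num = den = 0.0
    for _ in range(n):
        mu = rng.gauss(0.0, 1.0)
        y = mu / (1.0 - delta)          # start at the individual's long-run mean
        path = []
        for _ in range(t + 20):         # 20 extra periods as burn-in
            y = mu + delta * y + rng.gauss(0.0, 1.0)
            path.append(y)
        path = path[-(t + 1):]          # keep t + 1 levels, i.e. t (y, lagged y) pairs
        cur, lag = path[1:], path[:-1]
        cur_bar, lag_bar = sum(cur) / t, sum(lag) / t
        # Demeaning by individual means removes mu_i
        num += sum((c - cur_bar) * (l - lag_bar) for c, l in zip(cur, lag))
        den += sum((l - lag_bar) ** 2 for l in lag)
    return num / den

short_panel = simulate_within_delta(t=6)
long_panel = simulate_within_delta(t=50)
print(short_panel, long_panel)
```

With the short panel the estimate should fall well below the true value of 0.5, while the long panel should be much closer, in line with a bias of order 1/T.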
1.3 Bias in Random effects model
• The random effects GLS estimator will also be biased in a dynamic panel data model
• The problem is that the GLS method entails quasi-demeaning of the variables using θ
$$y_{it} - \theta\bar y_{i\bullet} = \delta(y_{it-1} - \theta\bar y_{i\bullet,-1}) + (x_{it} - \theta\bar x_{i\bullet})'\beta + (u_{it} - \theta\bar u_{i\bullet})$$
• But $(y_{it-1} - \theta\bar y_{i\bullet,-1})$ will be correlated with $(u_{it} - \theta\bar u_{i\bullet})$
2. ANDERSON AND HSIAO (1981) ESTIMATOR-AH

• Anderson and Hsiao (1981) suggested the following procedure:
o Difference the model to get rid of the $\mu_i$
$$\Delta y_{it} = \delta\Delta y_{it-1} + \Delta x_{it}'\beta + \Delta v_{it}$$
o Use $\Delta y_{it-2} = y_{it-2} - y_{it-3}$ or simply $y_{it-2}$ as an instrument for $\Delta y_{it-1} = y_{it-1} - y_{it-2}$
• These instruments will not be correlated with $\Delta v_{it} = v_{it} - v_{it-1}$ provided the $v_{it}$ are not
serially correlated
Limitations of this method
• This method leads to consistent but not efficient estimates of the parameters because it does
not make use of all the available moment conditions
• It also does not take into account the differenced structure of the residual disturbance
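For a model without x regressors, the Anderson and Hsiao estimator with the level $y_{it-2}$ as instrument is a simple ratio of sums. The sketch below assumes one list of levels per individual; the function name `anderson_hsiao_delta` and the toy series are illustrative only.

```python
def anderson_hsiao_delta(panel):
    """Simple IV estimate of delta in dy_it = delta * dy_it-1 + dv_it,
    using the level y_it-2 as the instrument for dy_it-1.

    panel: list of per-individual series [y_i1, ..., y_iT] with T >= 3.
    """
    num = den = 0.0
    for y in panel:
        for s in range(2, len(y)):
            dy = y[s] - y[s - 1]            # differenced dependent variable
            dy_lag = y[s - 1] - y[s - 2]    # endogenous differenced regressor
            z = y[s - 2]                    # instrument in levels
            num += z * dy
            den += z * dy_lag
    return num / den

# Toy series for a single individual, chosen so the arithmetic is easy to follow
print(anderson_hsiao_delta([[1.0, 2.0, 4.0, 7.0]]))
```

In the toy call the numerator is 1×2 + 2×3 = 8 and the denominator is 1×1 + 2×2 = 5. Real applications would pool many individuals, and the estimate can be noisy when the instrument is weak.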
2. ARELLANO AND BOND (1991) METHOD

• Arellano and Bond (1991) proposed a generalized method of moments (GMM)
procedure that is more efficient than the Anderson and Hsiao (1981) method
• They argue that additional instruments can be obtained in a dynamic model if one
utilizes the orthogonality conditions that exist between lagged values of $y_{it}$ and the
disturbance term $v_{it}$
Illustration of additional instruments
• From the model $y_{it} = \mu_i + \delta y_{it-1} + v_{it}$
• Difference the model to eliminate the individual effects, as suggested by AH
• $y_{it} - y_{it-1} = \delta(y_{it-1} - y_{it-2}) + (v_{it} - v_{it-1})$
At t = 3 we can find a valid instrument
• $y_{i3} - y_{i2} = \delta(y_{i2} - y_{i1}) + (v_{i3} - v_{i2})$
• A valid instrument here is $y_{i1}$ because it is highly correlated with $(y_{i2} - y_{i1})$ but not
correlated with $(v_{i3} - v_{i2})$
At t = 4 we can find a further valid instrument
• $y_{i4} - y_{i3} = \delta(y_{i3} - y_{i2}) + (v_{i4} - v_{i3})$
• A valid instrument here is $y_{i2}$ (in addition to $y_{i1}$) because it is highly correlated with $(y_{i3} - y_{i2})$ but not
correlated with $(v_{i4} - v_{i3})$
• We can continue the process up to period T, where the valid instruments form the vector
$(y_{i1}, y_{i2}, \dots, y_{iT-2})$
Equations                              Instruments
Δy_i3 = δΔy_i2 + Δv_i3                 y_i1
Δy_i4 = δΔy_i3 + Δv_i4                 y_i1, y_i2
.                                      .
.                                      .
Δy_iT = δΔy_iT−1 + Δv_iT               y_i1, y_i2, ..., y_iT−2
• The dependent variables for the regression are the left-hand sides of the equations column above
• We can stack these in a vector $y_i^* = (\Delta y_{i3}, \Delta y_{i4}, \dots, \Delta y_{iT})'$
• The instruments are the variables in the instruments column above, arranged in the
block-diagonal matrix below
$$W_i^D = \begin{bmatrix} (y_{i1}) & 0 & 0 & \cdots & 0 \\ 0 & (y_{i1}, y_{i2}) & 0 & \cdots & 0 \\ 0 & 0 & (y_{i1}, y_{i2}, y_{i3}) & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & (y_{i1}, y_{i2}, \dots, y_{iT-2}) \end{bmatrix}$$
• $W_i^D$ refers to the instruments for the first-difference-based Arellano and Bond
estimator
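The block-diagonal layout of the instrument matrix is easy to get wrong by hand, so a short construction may help. This sketch builds the matrix for one individual from the levels $y_{i1}, \dots, y_{iT}$ in the model without x regressors; the function name `ab_instrument_matrix` is hypothetical.

```python
def ab_instrument_matrix(y):
    """Block-diagonal instrument matrix W_i^D for one individual.

    y: levels [y_i1, ..., y_iT]. Row j holds the instruments
    (y_i1, ..., y_i,t-2) for the differenced equation at t = j + 3.
    """
    t = len(y)
    n_cols = (t - 2) * (t - 1) // 2     # 1 + 2 + ... + (T - 2) instruments
    rows, start = [], 0
    for k in range(1, t - 1):           # k = number of instruments in this row
        row = [0.0] * n_cols
        row[start:start + k] = y[:k]
        rows.append(row)
        start += k
    return rows

for row in ab_instrument_matrix([10.0, 20.0, 30.0, 40.0, 50.0]):
    print(row)
```

With T = 5 the matrix has three rows (t = 3, 4, 5) and six columns, which shows how quickly the instrument count grows with T.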
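• The two-step GMM estimator is
$$\hat\delta_2 = \left[(\Delta y_{-1})' W \hat V_N^{-1} W' (\Delta y_{-1})\right]^{-1} \left[(\Delta y_{-1})' W \hat V_N^{-1} W' (\Delta y)\right]$$
• where $V_N = \sum_{i=1}^{N} W_i'(\Delta v_i)(\Delta v_i)' W_i$, with $\Delta v_i$ replaced by the differenced residuals from a first-step estimate to obtain $\hat V_N$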
• If we have x regressors, the matrix of instruments changes to
$$W_i^D = \begin{bmatrix} (y_{i1}, x_{i1}', x_{i2}') & 0 & 0 & \cdots & 0 \\ 0 & (y_{i1}, y_{i2}, x_{i1}', x_{i2}') & 0 & \cdots & 0 \\ 0 & 0 & (y_{i1}, y_{i2}, y_{i3}, x_{i1}', x_{i2}') & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & (y_{i1}, y_{i2}, \dots, y_{iT-2}, x_{i1}', \dots, x_{iT-2}') \end{bmatrix}$$
• Additionally, Arellano and Bond suggested Sargan's test of overidentifying
restrictions, given by
$$m = \Delta\hat v' W \left[\sum_{i=1}^{N} W_i'(\Delta\hat v_i)(\Delta\hat v_i)' W_i\right]^{-1} W'(\Delta\hat v) \sim \chi^2(p - K - 1)$$
where p refers to the number of columns of W and $\Delta\hat v$ denotes the residuals from a two-step
estimation
3. ARELLANO AND BOVER (1995) SYSTEMS ESTIMATOR

• Arellano and Bover (1995) and Blundell and Bond (1998) unified the GMM framework
for looking at IV estimators for dynamic panel data models
• This method eliminates the individual effects by computing orthogonal forward deviations
• Orthogonal deviations, as proposed by Arellano (1988) and Arellano and Bover (1995),
express each observation as the deviation from the average of the future observations in
the sample for the same individual, and weight each deviation to standardize the variance
as follows:
$$x_{it}^* = \left(x_{it} - \frac{x_{i(t+1)} + x_{i(t+2)} + \dots + x_{iT}}{T - t}\right)\left(\frac{T - t}{T - t + 1}\right)^{1/2} \quad \text{for } t = 1, \dots, T - 1$$
• Look at the C3 matrix given in question 8.4 in Baltagi (2005:162)
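The forward orthogonal deviations transformation can be written directly from the formula for $x_{it}^*$. The following is a minimal sketch for a single individual's series; the function name `forward_orthogonal_deviations` is an assumption for illustration.

```python
import math

def forward_orthogonal_deviations(x):
    """Forward orthogonal deviations of one individual's series.

    x: [x_i1, ..., x_iT]. Returns the T - 1 transformed values for
    t = 1, ..., T - 1.
    """
    t_total = len(x)
    out = []
    for idx in range(t_total - 1):
        remaining = t_total - (idx + 1)               # T - t future observations
        future_mean = sum(x[idx + 1:]) / remaining
        weight = math.sqrt(remaining / (remaining + 1))
        out.append((x[idx] - future_mean) * weight)
    return out

print(forward_orthogonal_deviations([1.0, 2.0, 3.0]))
print(forward_orthogonal_deviations([5.0, 5.0, 5.0, 5.0]))
```

A constant series maps to zeros, which is why the transformation removes the time-invariant individual effect; the last observation is lost because it has no future values.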
• The systems estimator uses the same instruments as Arellano and Bond
(1991) for the differenced equations
• But adds the following for the equations in levels
Equations                              Instruments
y_i3 = δy_i2 + µ_i + v_i3              Δy_i2
y_i4 = δy_i3 + µ_i + v_i4              Δy_i3
.                                      .
.                                      .
y_iT = δy_iT−1 + µ_i + v_iT            Δy_iT−1
• The full matrix of instruments entails a stacked matrix of the form
$$W_i = \begin{bmatrix} W_i^D & 0 & 0 & \cdots & 0 \\ 0 & \Delta y_{i2} & 0 & \cdots & 0 \\ 0 & 0 & \Delta y_{i3} & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & \Delta y_{iT-1} \end{bmatrix}$$
where $W_i^D$ holds the instruments for the differenced estimator
• Further details are given in Baltagi (2005)
4. NON-STATIONARY PANEL DATA MODELS
• With the growing use of cross-country data to study a number of macroeconomic topics,
the focus of panel data econometrics has shifted towards studying the asymptotics of
macroeconomic-oriented panel data rather than microeconomic-oriented panel data.
4.1 Problems with nonstationary data in general

o Low power of time series tests (unit root and cointegration).
o Nonstandard limiting distributions of time series tests.
o Spurious regression problem (t-statistics diverge in misspecified regressions of
two I(1) variables).
• Panel data can help solve the problems faced in time series data, but at the cost of
introducing a new issue: how homogeneous is the panel?
• A large T dimension
o allows estimation of heterogeneous panels (heterogeneous regression coefficients)
o allows investigation of nonstationarity, spurious regression and cointegration
“The aim of the econometrics of non-stationary panel data is to combine the best of both worlds:
the method of dealing with non-stationary data from the time series and the increased data from
the cross-section.”
• View cross-sections as repeated draws, or paths, from the same distribution.
• Certain panel statistics (estimators) converge in distribution to normally distributed random
variables
Some distinctive results for panel data in a non-stationary world
• Many test statistics and estimators of interest have a normal limiting distribution, e.g. the IPS
test. This is in contrast to the non-stationary time series literature, where the limiting
distributions are complicated functionals of Brownian/Wiener processes
• Using panel data one can avoid the problem of spurious regression.
• Unlike in the time series spurious regression literature, panel data spurious
regression estimates give a consistent estimate of the true value of the parameter as both N
and T tend to ∞
• The reason is that the panel estimator averages across individuals, and the
information in the independent cross-section data in the panel leads to a stronger overall
signal than in pure time series
• Note, however, that letting both N → ∞ and T → ∞ introduces some other
complications in the asymptotic analysis (Phillips and Moon, 2000, on multi-indexed
processes)
2. TESTING FOR UNIT ROOTS IN PANEL DATA
• Testing for unit roots in time series is now common practice among applied researchers.
• However, testing for panel unit roots is quite recent, and many research papers and theses
applying panel data still disregard this crucial step.
Unit root tests in time series analysis:
• They are based on the coefficient of the AR(1) representation (stationarity requires it to be less than one in absolute value)
• ADF/PP consider the null hypothesis of non-stationarity; KPSS considers the null hypothesis
of stationarity.
• A process with a unit root has an infinite memory and can be considered the sum of all past
random shocks.
• The tests are notorious for low power and non-standard limiting distributions.
• Panel unit root tests are similar but not identical to unit root tests carried out in time
series analysis.
• All panel unit root tests begin with the following
$$y_{i,t} = \delta_i y_{i,t-1} + \lambda x_{it} + \varepsilon_{it}$$
• The $x_{it}$ represent the exogenous variables in the model
• If $|\delta_i| < 1$, $y_{i,t}$ is weakly (trend) stationary and we shall be dealing with stationary data
• On the other hand, if $\delta_i = 1$ then $y_{i,t}$ contains a unit root
• We can simplify this further by subtracting $y_{i,t-1}$ from both sides so that
$$\Delta y_{i,t} = (\delta_i - 1) y_{i,t-1} + \lambda x_{it} + \varepsilon_{it}$$
• Let $\rho_i = (\delta_i - 1)$
• Our ADF-type model is
$$\Delta y_{i,t} = \rho_i y_{i,t-1} + \lambda x_{it} + \sum_{j=1}^{p_i}\theta_{ij}\Delta y_{i,t-j} + \varepsilon_{i,t}$$
• For purposes of testing, there are two natural assumptions
(1) Assume that the persistence parameters are common across the cross-sections, so that
$\rho_i = \rho$
Examples of panel unit root methods that apply this approach are:
o Levin, Lin and Chu (LLC)
o Breitung
o Hadri
(2) Assume that the $\rho_i$ vary across cross-sections
o Im, Pesaran and Shin (IPS)
o Fisher-ADF
o Fisher-PP tests
3. TESTS WITH COMMON UNIT ROOTS
• The Levin, Lin and Chu (LLC), Breitung and Hadri tests all assume that there is a common unit
root process, so that $\rho_i$ is identical across cross-sections
• LLC and Breitung assume a null of a unit root, while the Hadri test uses a null of no unit
root (just like KPSS in time series)
• LLC and Breitung both consider the following basic ADF specification
$$\Delta y_{i,t} = \rho y_{i,t-1} + \lambda x_{it} + \sum_{j=1}^{p_i}\theta_{ij}\Delta y_{i,t-j} + \varepsilon_{i,t}$$
• LLC and Breitung allow the lag order for the difference terms, $p_i$, to vary across cross-sections
• The null and alternative hypotheses are
$H_0: \rho = 0$, i.e. there is a unit root
$H_A: \rho < 0$, i.e. there is no unit root
3.1 LEVIN, LIN AND CHU
• This method attempts to derive estimates of ρ from proxies for $\Delta y_{it}$ and $y_{it}$ that are
standardized and free of autocorrelations and deterministic components
• The LLC test requires specification of the number of lags used in each cross-section ADF
regression, $p_i$, as well as kernel choices
3.2 BREITUNG
• This method differs from LLC in two distinct ways
o Only the autoregressive component (and not the exogenous components) is
removed when constructing the standardized proxies
o The proxies are transformed and detrended
• The Breitung method requires only a specification of the number of lags used in each
cross-section ADF regression and of the exogenous regressors
• No kernel computations are required
3.3 HADRI
• This is similar to the KPSS unit root test
• It has a null of no unit root in any of the series of the panel
• Like KPSS, Hadri is based on the residuals from the individual OLS regressions of
$y_{it}$ on a constant, or on a constant and a trend
• The Hadri test requires only the specification of the form of the OLS regression
4. TESTS WITH INDIVIDUAL UNIT ROOT PROCESSES
• The Im, Pesaran and Shin test and the Fisher-ADF and Fisher-PP tests allow for individual unit root processes,
so that $\rho_i$ may vary across cross-sections
• The tests are characterized by the combining of individual unit root tests to derive a panel-specific
result
4.1 IM, PESARAN AND SHIN (IPS)

• IPS begin by specifying a separate ADF regression for each cross-section
$$\Delta y_{i,t} = \rho_i y_{i,t-1} + \lambda x_{it} + \sum_{j=1}^{p_i}\theta_{ij}\Delta y_{i,t-j} + \varepsilon_{i,t}$$
$H_0: \rho_i = 0$ for all i, i.e. there is a unit root
$H_A: \rho_i < 0$ for some i, i.e. there is no unit root in at least some cross-sections
• After estimating the separate ADF regressions, we take the average of the t-statistics for
the $\rho_i$ from the individual ADF regressions, $t_{iT_i}(p_i)$
• The average is
$$\bar t_{NT} = \frac{\sum_{i=1}^{N} t_{iT_i}(p_i)}{N}$$
• IPS show that a properly standardized $\bar t_{NT}$ has an asymptotic standard normal distribution
$$W_{\bar t_{NT}} = \frac{\sqrt{N}\left(\bar t_{NT} - N^{-1}\sum_{i=1}^{N} E(t_{iT}(p_i))\right)}{\sqrt{N^{-1}\sum_{i=1}^{N} \mathrm{var}(t_{iT}(p_i))}} \sim N(0,1)$$
• The expressions for the expected mean and variance of the ADF regression t-statistics are
provided by IPS for various values of T and p and differing test equation assumptions
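Once the individual ADF t-statistics are available, the standardization is a one-line calculation. In the sketch below the function name `ips_w_statistic` is hypothetical, and the t-statistics, means and variances in the call are made-up numbers rather than the tabulated IPS moments, which must be looked up for the relevant T and lag order.

```python
import math

def ips_w_statistic(t_stats, expected, variances):
    """Standardized t-bar statistic.

    t_stats:   individual ADF t-statistics, one per cross-section
    expected:  E(t_iT(p_i)) for each cross-section (from the IPS tables)
    variances: var(t_iT(p_i)) for each cross-section (from the IPS tables)
    """
    n = len(t_stats)
    t_bar = sum(t_stats) / n
    mean_e = sum(expected) / n
    mean_v = sum(variances) / n
    return math.sqrt(n) * (t_bar - mean_e) / math.sqrt(mean_v)

# Made-up inputs for a panel of four cross-sections
w = ips_w_statistic(t_stats=[-2.1, -1.8, -2.6, -1.9],
                    expected=[-1.5, -1.5, -1.5, -1.5],
                    variances=[0.8, 0.8, 0.8, 0.8])
print(w)
```

Large negative values of the statistic, judged against the left tail of the standard normal, lead to rejection of the unit root null.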
5. PANEL COINTEGRATION
• Panel cointegration tests can be motivated by the search for more powerful tests than
those obtained by applying individual time-series tests
• Time series tests are known to have low power, especially for short T
• Residual-based tests
o Residual-based DF and ADF tests (Kao tests, 1999)
o Residual-based LM test (McCoskey and Kao, 1998)
o Pedroni tests
• Likelihood-based tests
• Combined individual tests
5.1 PEDRONI (ENGLE-GRANGER BASED) PANEL COINTEGRATION TEST
• The Engle-Granger (1987) cointegration test is based on an examination of the residuals of a
spurious regression performed using I(1) variables
• If the variables are cointegrated, the residuals should be I(0).
• If the variables are not cointegrated, then the residuals will be I(1)
• Pedroni (1999, 2004) and Kao (1999) extend the Engle-Granger framework to tests
involving panel data
• Pedroni proposes several tests for cointegration that allow for heterogeneous intercepts
and trend coefficients across cross-sections
• Consider the regression
$$y_{it} = \alpha_i + \delta_i t + \beta_{1i}x_{1it} + \beta_{2i}x_{2it} + \dots + \beta_{Mi}x_{Mit} + e_{it}$$
• $t = 1, 2, \dots, T$; $i = 1, 2, \dots, N$; $m = 1, \dots, M$
• y and x are assumed to be integrated of order 1
• The parameters $\alpha_i$ and $\delta_i$ are individual and trend effects, which may be set to zero if
desired
• Under the null hypothesis that there is no cointegration, the residuals will be I(1)
• The general approach is to obtain the residuals from the above regression and then test
whether they are I(1), just as in the EG two-step procedure
• We run an auxiliary regression of the form
$$\Delta e_{i,t} = \rho_i e_{i,t-1} + \sum_{j=1}^{p_i}\varphi_{ij}\Delta e_{i,t-j} + v_{i,t}$$
• Pedroni's panel cointegration statistic $\Xi_{NT}$ is constructed from the residuals of the
auxiliary regression
• Pedroni shows that the standardized statistic is asymptotically normally distributed
$$\frac{\Xi_{NT} - \mu\sqrt{N}}{\sqrt{v}} \sim N(0,1)$$
where µ and v are Monte Carlo generated adjustment terms
5.2 KAO (ENGLE-GRANGER BASED) PANEL COINTEGRATION TEST

• The Kao test follows the same basic approach as the Pedroni tests, but specifies cross-section-specific
intercepts and homogeneous coefficients on the first-stage regressors
• In the bivariate case described by Kao (1999) we have the following
$$y_{it} = \alpha_i + \beta x_{it} + e_{it}$$
for
$$y_{it} = y_{it-1} + u_{it}$$
$$x_{it} = x_{it-1} + \varepsilon_{it}$$
• $t = 1, 2, \dots, T$; $i = 1, 2, \dots, N$
• The general approach is to obtain the residuals from the above regression and then test
whether they are I(1), just as in the EG two-step procedure
• Kao runs an auxiliary regression of the form
$$e_{i,t} = \rho e_{i,t-1} + v_{it}$$
• In order to test the null of no cointegration, the null can be written as $H_0: \rho = 1$
• The t-value is
$$t_\rho = \frac{(\hat\rho - 1)\sqrt{\sum_{i=1}^{N}\sum_{t=2}^{T}\hat e_{it-1}^2}}{s_e}$$
• Under the null of no cointegration, Kao presents several statistics:
$DF_\rho$, $DF_t$, $DF_\rho^*$, $DF_t^*$ and ADF
• It can be shown that the asymptotic distributions of $DF_\rho$, $DF_t$, $DF_\rho^*$, $DF_t^*$ and ADF
converge to the standard normal by sequential limit theory
5.3 COMBINED INDIVIDUAL TESTS (FISHER/JOHANSEN)
• Fisher (1932) derives a combined test that uses the results of individual independent
tests
• Maddala and Wu (1999) use Fisher's result to propose an alternative approach to testing
for cointegration in panel data, combining tests from individual cross-sections to
obtain a test for the full panel
• If $\pi_i$ is the p-value from an individual cointegration test for cross-section i, then under
the null hypothesis for the panel
$$-2\sum_{i=1}^{N}\log(\pi_i) \to \chi^2(2N)$$
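The Fisher combination is easy to compute because a chi-square distribution with an even number of degrees of freedom has a closed-form upper tail. The sketch below relies on that identity and on independence of the individual tests; the function name `fisher_combined` and the p-values in the call are illustrative.

```python
import math

def fisher_combined(p_values):
    """Fisher statistic -2 * sum(log(p_i)) and its chi-square(2N) p-value."""
    n = len(p_values)
    stat = -2.0 * sum(math.log(p) for p in p_values)
    half = stat / 2.0
    # Upper tail of chi-square with 2N d.f.: exp(-x/2) * sum_{k<N} (x/2)^k / k!
    tail = math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(n))
    return stat, tail

# Two hypothetical cross-sections, each borderline significant on its own
stat, p_panel = fisher_combined([0.05, 0.05])
print(stat, p_panel)
```

Two individually borderline results combine into stronger evidence against the panel null, which is the motivation for pooling the individual tests.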
References
Anderson, T.W. and Hsiao, C. (1981), “Estimation of Dynamic Models
with Error Components”, Journal of the American Statistical Association,
Vol.76, 598-606.
Arellano, M. and Bond, S. (1991), “Some Tests of Specification for Panel
Data: Monte Carlo Evidence and an Application to Employment
Equations”, Review of Economic Studies, Vol.58, 277-297.
Arellano, M. and Bover, O. (1995), “Another Look at the Instrumental
Variable Estimation of Error-Components Models”, Journal of
Econometrics, Vol.68, 29-51.
Blundell, R.W. and Bond, S.R. (1998), “Initial Conditions and Moment
Restrictions in Dynamic Panel Data Models”, Journal of Econometrics,
Vol.87, 115-143.
Breitung, J. (2000), “The Local Power of Some Unit Root Tests for
Panel Data”, in B. Baltagi (ed.), Advances in Econometrics, Vol.15:
Nonstationary Panels, Panel Cointegration, and Dynamic Panels,
Amsterdam: JAI Press, 161-178.

1986
Fisher, R.A. (1932), Statistical Methods for Research Workers, 4th
Edition, Edinburgh: Oliver & Boyd.

Chapter 13
Hadri, K. (2000), “Testing for Stationarity in Heterogeneous Panel Data”,
Econometrics Journal, Vol.3, 148-161.
Im, K.S., Pesaran, M.H. and Shin, Y. (2003), “Testing for Unit Roots in
Heterogeneous Panels”, Journal of Econometrics, Vol.115, 53-74.
Kao, C. (1999), “Spurious Regression and Residual-Based Tests for
Cointegration in Panel Data”, Journal of Econometrics, Vol.90, 1-44.
Levin, A., Lin, C.F. and Chu, C. (2002), “Unit Root Tests in Panel Data:
Asymptotic and Finite Sample Properties”, Journal of Econometrics,
Vol.108, 1-24.
Maddala, G.S. and Wu, S. (1999), “A Comparative Study of Unit Root
Tests with Panel Data and a New Simple Test”, Oxford Bulletin of
Economics and Statistics, Vol.61, 631-652.
Nickell, S. (1981), “Biases in Dynamic Models with Fixed Effects”,
Econometrica, Vol.49, 1417-1426.
Pedroni, P. (1999), “Critical Values for Cointegration Tests in
Heterogeneous Panels with Multiple Regressors”, Oxford Bulletin of
Economics and Statistics, Vol.61, 653-670.
Pedroni, P. (2004), “Panel Cointegration: Asymptotic and Finite Sample
Properties of Pooled Time Series Tests with an Application to the PPP
Hypothesis”, Econometric Theory, Vol.20, 597-625.
Practice quizzes
(1) Explain how nonstationary panel data analysis addresses
the problems of low power of tests, nonstandard limiting
distributions and spurious regression.
(2) What do you understand by the Nickell (1981) bias?
14 Introduction to Panel Data Models
14.1 Introduction
If the same units of observation in a cross-sectional sample are surveyed two or
more times, the resulting observations are described as forming a panel or
longitudinal data set. The National Longitudinal Survey of Youth that has provided
data for many of the examples and exercises in this text is such a data set. The
NLSY started with a baseline survey in 1979 and the same individuals have been
reinterviewed many times since, annually until 1994 and biennially since then.
However the unit of observation of a panel data set need not be individuals. It
may be households, or enterprises, or geographical areas, or indeed any set of
entities that retain their identities over time.
Because panel data have both cross-sectional and time series dimensions, the
application of regression models to them is more complex than it is
for simple cross-sectional data sets. Nevertheless, panel data sets are increasingly
being used in applied work and the aim of this chapter is to provide a brief
introduction. For comprehensive treatments see Hsiao (2003), Baltagi (2001),
and Wooldridge (2002).
There are several reasons for the increasing interest in panel data sets. An
important one is that their use may offer a solution to the problem of bias caused
by unobserved heterogeneity, a common problem in the fitting of models with
cross-sectional data sets. This will be discussed in the next section.
A second reason is that it may be possible to exploit panel data sets to reveal
dynamics that are difficult to detect with cross-sectional data. For example, if
one has cross-sectional data on a number of adults, it will be found that some
are employed, some are unemployed, and the rest are economically inactive. For
policy purposes, one would like to distinguish between frictional unemployment
and long-term unemployment. Frictional unemployment is inevitable in a
changing economy, but long-term unemployment can indicate a social problem
that needs to be addressed. To design an effective policy to counter long-term
unemployment, one needs to know the characteristics of those affected or at risk.
In principle the necessary information might be captured with a cross-sectional
survey using retrospective questions about past employment status, but in practice
DOUGH: “CHAP14” — 2006/8/29 — 17:05 — PAGE 408 — #1

Introduction to Econometrics 409
the scope for this is often very limited. The further back in the past one goes, the
worse are the problems of a lack of records and fallible memories, and the greater
becomes the problem of measurement error. Panel studies avoid this problem in
that the need for recall is limited to the time interval since the previous interview,
often no more than a year.
A third attraction of panel data sets is that they often have very large numbers
of observations. If there are n units of observation and if the survey is undertaken
in T time periods, there are potentially nT observations consisting of time series
of length T on n parallel units. In the case of the NLSY, there were just over
6,000 individuals in the core sample. The survey has been conducted 19 times as
of 2004, generating over 100,000 observations. Further, because it is expensive
to establish and maintain them, such panel data sets tend to be well designed
and rich in content.
A panel is described as balanced if there is an observation for every unit of
observation for every time period, and as unbalanced if some observations are
missing. The discussion that follows applies equally to both types. However, if
one is using an unbalanced panel, one needs to take note of the possibility that
the causes of missing observations are endogenous to the model. Equally, if a
balanced panel has been created artificially by eliminating all units of observation
with missing observations, the resulting data set may not be representative of its
population.
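The distinction between balanced and unbalanced panels can be checked mechanically. The following is a minimal sketch, assuming the data are available as (unit, period) pairs; the function name `panel_balance` and the toy observations are hypothetical and are not taken from the NLSY.

```python
from collections import defaultdict

def panel_balance(observations):
    """Summarise a panel given (unit_id, period) pairs, one per available observation."""
    periods_by_unit = defaultdict(set)
    for unit, period in observations:
        periods_by_unit[unit].add(period)
    all_periods = set().union(*periods_by_unit.values())
    n, T = len(periods_by_unit), len(all_periods)
    n_obs = sum(len(p) for p in periods_by_unit.values())
    # Units that are missing at least one of the periods seen anywhere in the panel
    incomplete = sorted(u for u, p in periods_by_unit.items() if p != all_periods)
    return {"n": n, "T": T, "n_obs": n_obs,
            "balanced": n_obs == n * T, "incomplete_units": incomplete}

# Hypothetical toy panel: unit 2 was not observed in 1990
obs = [(1, 1988), (1, 1990), (1, 1992),
       (2, 1988), (2, 1992),
       (3, 1988), (3, 1990), (3, 1992)]
report = panel_balance(obs)
print(report)
```

Dropping unit 2 would produce a balanced panel of six observations, which is exactly the kind of artificial balancing that may make the sample unrepresentative.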
Example of the use of a panel data set to investigate dynamics

In many studies of the determinants of earnings it has been found that married
men earn significantly more than single men. One explanation is that marriage
entails financial responsibilities—in particular, the rearing of children—that may
encourage men to work harder or seek better paying jobs. Another is that certain
unobserved qualities that are valued by employers are also valued by potential
spouses and hence are conducive to getting married, and that the dummy variable
for being married is acting as a proxy for these qualities. Other explanations
have been proposed, but we will restrict attention to these two. With cross-
sectional data it is difficult to discriminate between them. However, with panel
data one can find out whether there is an uplift at the time of marriage or soon
after, as would be predicted by the increased productivity hypothesis, or whether
married men tend to earn more even before marriage, as would be predicted by
the unobserved heterogeneity hypothesis.
In 1988 there were 1,538 NLSY males working 30 or more hours a week,
not also in school, with no missing data. The respondents were divided into
three categories: the 904 who were already married in 1988 (dummy variable
MARRIED = 1); a further 212 who were single in 1988 but who married within
the next four years (dummy variable SOONMARR = 1); and the remaining 422
who were single in 1988 and still single four years later (the omitted category).
Divorced respondents were excluded from the sample. The following earnings
function was fitted (standard errors in parentheses):
\[ \widehat{LGEARN} = \underset{(0.028)}{0.163}\,MARRIED + \underset{(0.037)}{0.096}\,SOONMARR + \text{constant} + \text{controls}, \qquad R^2 = 0.27. \tag{14.1} \]
The controls included years of schooling, ASVABC score, years of tenure with
the current employer and its square, years of work experience and its square,
age and its square, and dummy variables for ethnicity, region of residence, and
living in an urban area.
The regression indicates that those who were married in 1988 earned 16.3
percent more than the reference category (strictly speaking, 17.7 percent, if the
proportional increase is calculated properly as $e^{0.163} - 1$) and that the effect
is highly significant. However, it is the coefficient of SOONMARR that is of
greater interest here. Under the null hypothesis that the marital effect is dynamic
and marriage encourages men to earn more, the coefficient of SOONMARR
should be zero. The men in this category were still single as of 1988. The t
statistic of the coefficient is 2.60 and so the coefficient is significantly different
from zero at the 1 percent level, leading us to reject the null hypothesis at that
level.
However, if the alternative hypothesis is true, the coefficient of SOONMARR
should be equal to that of MARRIED, but it is lower. To test whether it is
significantly lower, the easiest method is to change the reference category to those
who were married by 1988 and to introduce a new dummy variable SINGLE
that is equal to 1 if the respondent was single in 1988 and still single four years
later. The omitted category is now those who were already married by 1988.
The fitted regression is (standard errors in parentheses)
\[ \widehat{LGEARN} = -\underset{(0.028)}{0.163}\,SINGLE - \underset{(0.034)}{0.066}\,SOONMARR + \text{constant} + \text{controls}, \qquad R^2 = 0.27. \tag{14.2} \]
The coefficient of SOONMARR now estimates the difference between the
coefficients of those married by 1988 and those married within the next four years,
and if the second hypothesis is true, it should be equal to zero. The t statistic is
−1.93, so we (just) do not reject the second hypothesis at the 5 percent level.
The evidence seems to provide greater support for the second hypothesis, but it is
possible that neither hypothesis is correct on its own and the truth might reside
in some compromise.
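The arithmetic behind these two regressions can be reproduced from the reported coefficients and standard errors. The sketch below recomputes the proportional premium and the two t statistics from the rounded figures in (14.1) and (14.2). The two-sided critical values are standard large-sample normal values and are an addition for reference, not part of the text.

```python
import math

# Rounded estimates and standard errors as reported in (14.1) and (14.2)
b_married, se_married = 0.163, 0.028
b_soon, se_soon = 0.096, 0.037
b_soon_vs_married, se_soon_vs_married = -0.066, 0.034

# Proportional earnings premium implied by a coefficient in a log earnings function
premium = math.exp(b_married) - 1

# t statistics from the rounded figures; the text reports 2.60 and -1.93,
# presumably computed from unrounded estimates
t_soon = b_soon / se_soon
t_diff = b_soon_vs_married / se_soon_vs_married

# Two-sided large-sample (normal) critical values, added here for reference
critical = {"5%": 1.960, "1%": 2.576, "0.1%": 3.291}

print(f"premium = {premium:.3f}")
print(f"t(SOONMARR vs single)  = {t_soon:.2f}")
print(f"t(SOONMARR vs married) = {t_diff:.2f}")
for level, value in critical.items():
    print(level, abs(t_soon) > value, abs(t_diff) > value)
```

On these figures the SOONMARR coefficient in (14.1) clears the 1 percent critical value but not the 0.1 percent one, and the coefficient in (14.2) falls just short of the 5 percent value.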
In the foregoing example, we used data only from the 1988 and 1992 rounds
of the NLSY. In most applications using panel data it is normal to exploit the
data from all the rounds, if only to maximize the number of observations in the
sample. A standard specification is

\[ Y_{it} = \beta_1 + \sum_{j=2}^{k} \beta_j X_{jit} + \sum_{p=1}^{s} \gamma_p Z_{pi} + \delta t + \varepsilon_{it} \tag{14.3} \]
where Y is the dependent variable, the Xj are observed explanatory variables,
and the Zp are unobserved explanatory variables. The index i refers to the unit
of observation, t refers to the time period, and j and p are used to differentiate
between different observed and unobserved explanatory variables. εit is a
disturbance term assumed to satisfy the usual regression model conditions. A trend
term t has been introduced to allow for a shift of the intercept over time. If the
implicit assumption of a constant rate of change seems too strong, the trend can
be replaced by a set of dummy variables, one for each time period except the
reference period.
The Xj variables are usually the variables of interest, while the Zp variables
are responsible for unobserved heterogeneity and as such constitute a nuisance
component of the model. The following discussion will be confined to the (quite
common) special case where it is reasonable to assume that the unobserved
heterogeneity is unchanging and accordingly the Zp variables do not need a
time subscript. Because the Zp variables are unobserved, there is no means of
obtaining information about the $\sum_{p=1}^{s} \gamma_p Z_{pi}$ component of the model and it is
convenient to rewrite (14.3) as

\[ Y_{it} = \beta_1 + \sum_{j=2}^{k} \beta_j X_{jit} + \alpha_i + \delta t + \varepsilon_{it} \tag{14.4} \]
where

\[ \alpha_i = \sum_{p=1}^{s} \gamma_p Z_{pi}. \tag{14.5} \]
αi , known as the unobserved effect, represents the joint impact of the Zpi on
Yi . Henceforward it will be convenient to refer to the unit of observation as an
individual, and to the αi as the individual-specific unobserved effect, but it should
be borne in mind that the individual in question may actually be a household or
an enterprise, etc. If αi is correlated with any of the Xj variables, the regression
estimates from a regression of Y on the Xj variables will be subject to unobserved
heterogeneity bias. Even if the unobserved effect is not correlated with any of the
explanatory variables, its presence will in general cause OLS to yield inefficient
estimates and invalid standard errors. We will now consider ways of overcoming
these problems.
First, however, note that if the Xj controls are so comprehensive that they
capture all the relevant characteristics of the individual, there will be no relevant
unobserved characteristics. In that case the αi term may be dropped and a pooled
OLS regression may be used to fit the model, treating all the observations for all
of the time periods as a single sample.
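A small simulation may help to show what is at stake when the unobserved effect is correlated with a regressor. The sketch below uses a hypothetical data-generating process (one X variable, no trend, true slope 0.5, with X equal to the unobserved effect plus noise). None of these values come from the text. With these settings the pooled OLS slope should settle near 1.0 rather than 0.5.

```python
import random

random.seed(0)
n, T, beta = 2000, 5, 0.5   # hypothetical panel dimensions and true slope

xs, ys = [], []
for i in range(n):
    alpha = random.gauss(0, 1)              # unobserved effect for individual i
    for t in range(T):
        x = alpha + random.gauss(0, 1)      # X is correlated with alpha by construction
        y = 1.0 + beta * x + alpha + random.gauss(0, 1)
        xs.append(x)
        ys.append(y)

def ols_slope(x, y):
    """Slope coefficient of a simple regression of y on x with an intercept."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

pooled_slope = ols_slope(xs, ys)
print(f"true slope = {beta}, pooled OLS slope = {pooled_slope:.3f}")
```

The upward bias equals the covariance of X with the unobserved effect divided by the variance of X, which is 0.5 under the assumed variances.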
14.2 Fixed effects regressions

The two main approaches to the fitting of models using panel data are known
as fixed effects regressions, discussed in this section, and random effects regres-
sions, discussed in the next. Three versions of the fixed effects approach will
be described. In the first two, the model is manipulated in such a way that the
unobserved effect is eliminated.
Within-groups fixed effects

In the first version, the mean values of the variables in the observations on a
given individual are calculated and subtracted from the data for that individual.
In view of (14.4), one may write

\[ \bar{Y}_i = \beta_1 + \sum_{j=2}^{k} \beta_j \bar{X}_{ij} + \delta \bar{t} + \alpha_i + \bar{\varepsilon}_i. \tag{14.6} \]
Subtracting this from (14.4), one obtains

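\[ Y_{it} - \bar{Y}_i = \sum_{j=2}^{k} \beta_j \left( X_{ijt} - \bar{X}_{ij} \right) + \delta (t - \bar{t}) + \varepsilon_{it} - \bar{\varepsilon}_i \tag{14.7} \]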
and the unobserved effect disappears. This is known as the within-groups
regression model because it is explaining the variations about the mean of the
dependent variable in terms of the variations about the means of the explanatory
variables for the group of observations relating to a given individual. The
possibility of tackling unobserved heterogeneity bias in this way is a major
attraction of panel data for researchers.
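The within-groups transformation can be illustrated with a short simulation. This is a minimal sketch under assumed conditions: a hypothetical data-generating process with a single X variable, no trend term, a true slope of 0.5, and an unobserved effect that is correlated with X. After subtracting individual means, the slope estimate should lie close to 0.5.

```python
import random

random.seed(1)
n, T, beta = 2000, 5, 0.5   # hypothetical panel dimensions and true slope

num = den = 0.0
for i in range(n):
    alpha = random.gauss(0, 1)                          # unobserved effect
    x = [alpha + random.gauss(0, 1) for _ in range(T)]  # X correlated with alpha
    y = [1.0 + beta * xt + alpha + random.gauss(0, 1) for xt in x]
    x_bar, y_bar = sum(x) / T, sum(y) / T               # individual means
    for xt, yt in zip(x, y):
        # Deviations from the individual means eliminate alpha and the intercept
        num += (xt - x_bar) * (yt - y_bar)
        den += (xt - x_bar) ** 2

within_slope = num / den
print(f"true slope = {beta}, within-groups slope = {within_slope:.3f}")
```

Because the demeaned variables have zero mean within each individual, the regression through the origin used here is the OLS fit of the transformed model.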
However, there are some prices to pay. First, the intercept β1 and any X
variable that remains constant for each individual will drop out of the model.
The elimination of the intercept may not matter, but the loss of the unchanging
explanatory variables may be frustrating. Suppose, for example, that one is fitting
an earnings function to data for a sample of individuals who have completed their
schooling, and that the schooling variable for individual i in period t is $S_{it}$. If the
education of the individual is complete by the time of the first time period, $S_{it}$
will be the same for all t for that individual and $S_{it} = \bar{S}_i$ for all t. Hence $(S_{it} - \bar{S}_i)$
is zero for all time periods. If all individuals have completed their schooling by
the first time period, $(S_{it} - \bar{S}_i)$ will be zero for all i and t. One cannot include a variable
whose values are all zero in a regression model. Thus if the object of the exercise
were to obtain an estimate of the returns to schooling untainted by unobserved
heterogeneity bias, one ends up with no estimate at all.
A second problem is the potential impact of the disturbance term. We saw in
Chapter 3 that the precision of OLS estimates depends on the mean square
deviations of the explanatory variables being large in comparison with the variance
of the disturbance term. The analysis was in the context of the simple regression
model, but it generalizes to multiple regression. The variation in $(X_j - \bar{X}_j)$ may
well be much smaller than the variation in $X_j$. If this is the case, the impact of
the disturbance term may be relatively large, giving rise to imprecise estimates.
The situation is aggravated in the case of measurement error, since this will lead
to bias, and the bias is the greater, the smaller the variation in the explanatory
variable in comparison with the variance of the measurement error.
A third problem is that we lose a substantial number of degrees of freedom
in the model when we manipulate the model to eliminate the unobserved effect:
we lose one degree of freedom for every individual in the sample. If the panel is
balanced, with nT observations in all, it may seem that there would be nT − k
degrees of freedom. However, in manipulating the model, the number of degrees
of freedom is reduced by n, for reasons that will be explained later in this section.
Hence the true number of degrees of freedom will be n(T −1)−k. If T is small, the
impact can be large. (Regression applications with a fixed effects regression facility will
automatically make the adjustment to the degrees of freedom when implementing
the within-groups method.)
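As a purely illustrative calculation, take hypothetical values n = 1000, T = 3, and k = 5 (these numbers are not from the text). The apparent and true degrees of freedom are then

```latex
nT - k = 1000 \times 3 - 5 = 2995,
\qquad
n(T - 1) - k = 1000 \times 2 - 5 = 1995 .
```

With only three time periods, one third of the observations are effectively used up in eliminating the unobserved effects.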
First differences fixed effects

In a second version of the fixed effects approach, the first differences regression
model, the unobserved effect is eliminated by subtracting the observation for the
previous time period from the observation for the current time period, for all
time periods. For individual i in time period t the model may be written

\[ Y_{it} = \beta_1 + \sum_{j=2}^{k} \beta_j X_{ijt} + \delta t + \alpha_i + \varepsilon_{it}. \tag{14.8} \]
For the previous time period, the relationship is

\[ Y_{it-1} = \beta_1 + \sum_{j=2}^{k} \beta_j X_{ijt-1} + \delta (t - 1) + \alpha_i + \varepsilon_{it-1}. \tag{14.9} \]
Subtracting (14.9) from (14.8), one obtains

\[ \Delta Y_{it} = \sum_{j=2}^{k} \beta_j \Delta X_{ijt} + \delta + \varepsilon_{it} - \varepsilon_{it-1} \tag{14.10} \]
and again the unobserved heterogeneity has disappeared. However, the other
problems remain. In particular, the intercept and any X variable that remains
fixed for each individual will disappear from the model and n degrees of
freedom are lost because the first observation for each individual is not defined. In
addition, this type of differencing gives rise to autocorrelation if εit satisfies the
regression model conditions. The error term for ΔYit is (εit − εit−1). That for the
previous observation is (εit−1 − εit−2). Thus the two error terms both have a
component εit−1 with opposite signs and negative moving average autocorrelation
has been induced. However, if εit is subject to autocorrelation:
\[ \varepsilon_{it} = \rho \varepsilon_{it-1} + v_{it} \tag{14.11} \]
where vit is a well behaved innovation, the moving average disturbance term is
equal to vit − (1 − ρ)εit−1. If the autocorrelation is severe, the (1 − ρ)εit−1
component could be small and so the first differences estimator could be preferable
to the within-groups estimator.
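The induced moving average autocorrelation is easy to see numerically. The sketch below draws a long series of independent disturbances, takes first differences, and computes the correlation between successive differenced terms. It treats the data as one long series rather than a panel, which is a simplification made only for illustration. Since the covariance of successive differences is minus the disturbance variance and the variance of each difference is twice that variance, the correlation should be close to −0.5.

```python
import math
import random

random.seed(2)
N = 200_000
eps = [random.gauss(0, 1) for _ in range(N)]       # disturbances satisfying the usual conditions
d = [eps[t] - eps[t - 1] for t in range(1, N)]     # first-differenced disturbances

def corr(a, b):
    """Sample correlation coefficient of two equal-length sequences."""
    m = len(a)
    ma, mb = sum(a) / m, sum(b) / m
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / math.sqrt(saa * sbb)

# Correlation between each differenced term and the one before it
rho1 = corr(d[1:], d[:-1])
print(f"first-order autocorrelation of differenced disturbances = {rho1:.3f}")
```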
Least squares dummy variable fixed effects

In the third version of the fixed effects approach, known as the least squares
dummy variable (LSDV) regression model, the unobserved effect is brought
explicitly into the model. If we define a set of dummy variables Ai , where Ai is
equal to 1 in the case of an observation relating to individual i and 0 otherwise,
the model can be rewritten

\[ Y_{it} = \sum_{j=2}^{k} \beta_j X_{ijt} + \delta t + \sum_{i=1}^{n} \alpha_i A_i + \varepsilon_{it}. \tag{14.12} \]
Formally, the unobserved effect is now being treated as the coefficient of the
individual-specific dummy variable, the αi Ai term representing a fixed effect on
the dependent variable Yi for individual i (this accounts for the name given to
the fixed effects approach). Having re-specified the model in this way, it can be
fitted using OLS.
Note that if we include a dummy variable for every individual in the sample as
well as an intercept, we will fall into the dummy variable trap described in Section
5.2. To avoid this, we could define one individual to be the reference category,
so that β1 is its intercept, and then treat the αi as the shifts in the intercept
for the other individuals. However, the choice of reference category is often
arbitrary and accordingly the interpretation of the αi in such a specification is not
particularly illuminating. Alternatively, we can drop the β1 intercept and define
dummy variables for all of the individuals, as has been done in (14.12). The αi
now become the intercepts for each of the individuals. Note that, in common with
the first two versions of the fixed effects approach, the LSDV method requires
panel data. With cross-sectional data, one would be defining a dummy variable
for every observation, exhausting the degrees of freedom. The dummy variables
on their own would give a perfect but meaningless fit.
Table 14.1 Individual-specific dummy variables and an unchanging X variable

Individual  Time period  A1  A2  A3  A4  Xj
1           1            1   0   0   0   c1
1           2            1   0   0   0   c1
1           3            1   0   0   0   c1
2           1            0   1   0   0   c2
2           2            0   1   0   0   c2
2           3            0   1   0   0   c2
3           1            0   0   1   0   c3
3           2            0   0   1   0   c3
3           3            0   0   1   0   c3
4           1            0   0   0   1   c4
4           2            0   0   0   1   c4
4           3            0   0   0   1   c4

If there are a large number of individuals, using the LSDV method directly
is not a practical proposition, given the need for a large number of dummy
variables. However, it can be shown mathematically that the method is identical
to the within-groups method. The only apparent difference is in the number of
degrees of freedom. It is easy to see from (14.12) that there are nT −k−n degrees
of freedom if the panel is balanced. In the within-groups approach, it seemed at
first that there were nT − k. However, n degrees of freedom are consumed in the
manipulation that eliminates the αi .
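The equivalence of the LSDV and within-groups estimators can be checked numerically. The following is a minimal sketch using NumPy on simulated data. The data-generating process (one X variable, no trend, 50 individuals, 4 periods) is hypothetical and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, T, beta = 50, 4, 0.5                      # hypothetical dimensions and true slope
ids = np.repeat(np.arange(n), T)             # individual index for each observation
alpha = rng.normal(size=n)                   # unobserved effects
x = alpha[ids] + rng.normal(size=n * T)      # X correlated with the unobserved effect
y = beta * x + alpha[ids] + rng.normal(size=n * T)

# LSDV: regress y on x and a full set of individual dummies, with no intercept
A = (ids[:, None] == np.arange(n)[None, :]).astype(float)
coef, *_ = np.linalg.lstsq(np.column_stack([x, A]), y, rcond=None)
lsdv_slope = coef[0]

# Within-groups: subtract individual means, then regress through the origin
x_bar = np.bincount(ids, weights=x) / T
y_bar = np.bincount(ids, weights=y) / T
xd, yd = x - x_bar[ids], y - y_bar[ids]
within_slope = (xd @ yd) / (xd @ xd)

print(lsdv_slope, within_slope)
```

The two slope estimates agree to numerical precision, while the LSDV regression has had to estimate 50 additional dummy variable coefficients.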
Given that it is equivalent to the within-groups approach, the LSDV method is
subject to the same problems. In particular, we are unable to estimate coefficients
for the X variables that are fixed for each individual. Suppose that Xij is equal
to ci for all the observations for individual i. Then

\[ X_j = \sum_{i=1}^{n} c_i A_i. \tag{14.13} \]
To see this, suppose that there are four individuals and three time periods, as
in Table 14.1, and consider the observations for the first individual. Xj is equal
to c1 for each observation. A1 is equal to 1. All the other A dummies are equal
to 0. Hence both sides of the equation are equal to c1 . Similarly, both sides of
the equation are equal to c2 for the observations for individual 2, and similarly
for individuals 3 and 4.
Thus there is an exact linear relationship linking Xj with the dummy variables
and the model is subject to exact multicollinearity. Accordingly Xj cannot be
included in the regression specification.
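The exact linear relationship can be verified for the layout of Table 14.1. In the sketch below the values assigned to c1, ..., c4 are hypothetical. The rank of the regressor matrix does not increase when the unchanging X variable is added to the dummies, which is the numerical symptom of exact multicollinearity.

```python
import numpy as np

n, T = 4, 3                                   # four individuals, three time periods
ids = np.repeat(np.arange(n), T)
A = (ids[:, None] == np.arange(n)).astype(float)   # dummy variables A1..A4, 12 rows

c = np.array([2.0, 5.0, 1.0, 7.0])            # hypothetical values of c1..c4
xj = c[ids]                                   # X variable that is fixed for each individual

# Xj is an exact linear combination of the dummies, as in (14.13)
is_exact = bool(np.allclose(xj, A @ c))

rank_dummies = int(np.linalg.matrix_rank(A))
rank_with_xj = int(np.linalg.matrix_rank(np.column_stack([A, xj])))
print(is_exact, rank_dummies, rank_with_xj)
```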
Example
To illustrate the use of a fixed effects model, we return to the example in Section
14.1 and use all the available data from 1980 to 1996, 20,343 observations in all.
Table 14.2 shows the extra hourly earnings of married men and of men who are
single but married within the next four years. The controls (not shown) are the
same as in Section 14.1.

Table 14.2 Earnings premium for married and soon-to-be married men, NLSY 1980–96

                                     OLS       Fixed effects        Random effects
                                     (1)       (2)       (3)        (4)       (5)
Married                              0.184     0.106     –          0.134     –
                                    (0.007)   (0.012)              (0.010)
Single, married within 4 years       0.096     0.045    −0.061      0.060    −0.075
                                    (0.009)   (0.010)   (0.008)    (0.009)   (0.007)
Single, not married within 4 years   –         –        −0.106      –        −0.134
                                                        (0.012)              (0.010)
R2                                   0.358     0.268     0.268      0.346     0.346
DWH test                             –         –         –          205.8     205.8
n                                    20,343    20,343    20,343     20,343    20,343

The first column gives the estimates obtained by simply
pooling the observations and using OLS with robust standard errors. The second
column gives the fixed effects estimates, using the within-groups method, with
single men as the reference category. The third gives the fixed effects estimates
with married men as the reference category. The fourth and fifth give the random
effects estimates, discussed in the next section.
The OLS estimates are very similar to those in the wage equation for 1988
discussed in Section 14.1. The fixed effects estimates are considerably lower,
suggesting that the OLS estimates were inflated by unobserved heterogeneity.
Nevertheless, the pattern is the same. Soon-to-be-married men earn significantly
more than single men who stay single. However, if we fit the specification
corresponding to equation (14.2), shown in the third column, we find that soon-to-be
married men earn significantly less than married men. Hence both hypotheses
relating to the marriage premium appear to be partly true.
14.3 Random effects regressions

As we saw in the previous section, when the variables of interest are constant
for each individual, a fixed effects regression is not an effective tool because
such variables cannot be included. In this section we will consider an
alternative approach, known as a random effects regression, that may, subject to two
conditions, provide a solution to this problem.
The first condition is that it is possible to treat each of the unobserved Zp
variables as being drawn randomly from a given distribution. This may well
be the case if the individual observations constitute a random sample from a
given population as, for example, with the NLSY where the respondents were
randomly drawn from the US population aged 14 to 21 in 1979. If this is the
case, the αi may be treated as random variables (hence the name of this approach)
drawn from a given distribution and we may rewrite the model as

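\[ \begin{aligned} Y_{it} &= \beta_1 + \sum_{j=2}^{k} \beta_j X_{jit} + \alpha_i + \delta t + \varepsilon_{it} \\ &= \beta_1 + \sum_{j=2}^{k} \beta_j X_{jit} + \delta t + u_{it} \end{aligned} \tag{14.14} \]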
where
\[ u_{it} = \alpha_i + \varepsilon_{it}. \tag{14.15} \]
We have thus dealt with the unobserved effect by subsuming it into the
disturbance term.
The second condition is that the Zp variables are distributed independently
of all of the Xj variables. If this is not the case, α, and hence u, will not be
uncorrelated with the Xj variables and the random effects estimation will be
biased and inconsistent. We would have to use fixed effects estimation instead,
even if the first condition seems to be satisfied.
If the two conditions are satisfied, we may use (14.14) as our regression
specification, but there is a complication. uit will be subject to a special form
of autocorrelation and we will have to use an estimation technique that takes
account of it.
First, we will check the other regression model conditions relating to the
disturbance term. Given our assumption that εit satisfies the usual regression model
conditions, we can see that uit satisfies the condition that its expectation be zero,
since
\[ E(u_{it}) = E(\alpha_i + \varepsilon_{it}) = E(\alpha_i) + E(\varepsilon_{it}) = 0 \quad \text{for all } i \text{ and } t. \tag{14.16} \]
Here we are assuming without loss of generality that E(αi ) = 0, any nonzero
component being absorbed by the intercept, β1 . uit will also satisfy the condition
that it should have constant variance, since
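\[ \sigma^2_{u_{it}} = \sigma^2_{\alpha_i + \varepsilon_{it}} = \sigma^2_{\alpha} + \sigma^2_{\varepsilon} + 2\sigma_{\alpha\varepsilon} = \sigma^2_{\alpha} + \sigma^2_{\varepsilon} \quad \text{for all } i \text{ and } t. \tag{14.17} \]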
The σαε term is zero on the assumption that αi is distributed independently
of εit. uit will also satisfy the regression model condition that it be distributed
independently of the values of Xj , since both αi and εit are assumed to satisfy
this condition.
However, there is a problem with the regression model condition that the
value of uit in any observation be generated independently of its value in all
other observations. For all the observations relating to a given individual, αi will
have the same value, reflecting the unchanging unobserved characteristics of the
individual. This is illustrated in Table 14.3 for the case where there are four
individuals and three time periods.
Table 14.3 Example of disturbance term values in a random effects model

Individual  Time period  u
1           1            α1 + ε11
1           2            α1 + ε12
1           3            α1 + ε13
2           1            α2 + ε21
2           2            α2 + ε22
2           3            α2 + ε23
3           1            α3 + ε31
3           2            α3 + ε32
3           3            α3 + ε33
4           1            α4 + ε41
4           2            α4 + ε42
4           3            α4 + ε43
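Since the disturbance terms for individual i have a common component αi,
they are correlated. For individual i in period t, the disturbance term is (αi + εit).
For the same individual in any other period t′ it is (αi + εit′). The population
covariance between them is
\[ \sigma_{u_{it},u_{it'}} = \sigma_{(\alpha_i + \varepsilon_{it}),(\alpha_i + \varepsilon_{it'})} = \sigma_{\alpha_i,\alpha_i} + \sigma_{\alpha_i,\varepsilon_{it'}} + \sigma_{\varepsilon_{it},\alpha_i} + \sigma_{\varepsilon_{it},\varepsilon_{it'}} = \sigma^2_{\alpha}. \tag{14.18} \]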
For observations relating to different individuals the problem does not arise
because then the α components will be different and generated independently.
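The covariance result can be checked by simulation. The sketch below generates composite disturbances for two periods per individual, with hypothetical standard deviations of 1.5 for the unobserved effect and 1.0 for the purely random component. The sample covariance within individuals should be close to 2.25, the variance of the unobserved effect, while the covariance across different individuals should be close to zero.

```python
import random

random.seed(3)
n = 100_000
sigma_alpha, sigma_eps = 1.5, 1.0    # hypothetical standard deviations

u1, u2 = [], []
for _ in range(n):
    a = random.gauss(0, sigma_alpha)                 # common component for the individual
    u1.append(a + random.gauss(0, sigma_eps))        # disturbance in period 1
    u2.append(a + random.gauss(0, sigma_eps))        # disturbance in period 2

def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    m = len(a)
    ma, mb = sum(a) / m, sum(b) / m
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (m - 1)

cov_within = cov(u1, u2)                  # same individual, different periods
var_u = cov(u1, u1)                       # should be near sigma_alpha^2 + sigma_eps^2
cov_between = cov(u1, u2[1:] + u2[:1])    # pairs each individual with a different one
print(f"{cov_within:.3f} {var_u:.3f} {cov_between:.3f}")
```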
We have encountered a problem of the violation of this regression model
condition once before, in the case of autocorrelated disturbance terms in a time
series model. As in that case, OLS remains unbiased and consistent, but it is
inefficient and the OLS standard errors are computed wrongly.
The solution then was to transform the model so that the transformed
disturbance term satisfied the regression model condition, and a similar procedure is
adopted in the present case. However, while the transformation in the case of
autocorrelation was very straightforward, in the present case it is more complex.
Known as feasible generalized least squares, its description requires the use of
linear algebra and is therefore beyond the scope of this text. It yields consistent
estimates of the coefficients and therefore depends on n being sufficiently large.
For small n its properties are unknown.
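Although the text does not derive the transformation, a commonly cited form for a balanced panel may help to fix ideas. It subtracts a fraction θ of the individual mean from each observation, where θ = 1 − sqrt(σε² / (σε² + Tσα²)). The sketch below is an illustration under stated assumptions and not the procedure of any particular regression application: the variances are treated as known (in feasible GLS they must be estimated), the panel is balanced, and all numerical values are hypothetical. It checks that the transformed disturbances are approximately uncorrelated within individuals, with variance close to σε².

```python
import math
import random

random.seed(4)
n, T = 100_000, 3
sigma_alpha, sigma_eps = 1.5, 1.0     # hypothetical, treated as known

# Fraction of the individual mean to subtract (balanced panel, known variances)
theta = 1 - math.sqrt(sigma_eps**2 / (sigma_eps**2 + T * sigma_alpha**2))

first, second = [], []
for _ in range(n):
    a = random.gauss(0, sigma_alpha)
    u = [a + random.gauss(0, sigma_eps) for _ in range(T)]   # composite disturbances
    u_bar = sum(u) / T
    u_star = [ut - theta * u_bar for ut in u]                # quasi-demeaned disturbances
    first.append(u_star[0])
    second.append(u_star[1])

def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    m = len(a)
    ma, mb = sum(a) / m, sum(b) / m
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (m - 1)

cov_star = cov(first, second)     # within-individual covariance after transformation
var_star = cov(first, first)      # variance after transformation
print(f"theta = {theta:.3f}, cov = {cov_star:.3f}, var = {var_star:.3f}")
```

Applying the same transformation to Y and to each X variable and then using OLS gives the random effects estimates under these assumptions. When θ is close to 1 the transformation approaches the within-groups transformation, and when θ is 0 it reduces to pooled OLS.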
Assessing the appropriateness of fixed effects and random effects estimation
When should you use fixed effects estimation rather than random effects
estimation, or vice versa? In principle, random effects is more attractive because
observed characteristics that remain constant for each individual are retained
in the regression model. In fixed effects estimation, they have to be dropped.
Also, with random effects estimation we do not lose n degrees of freedom, as is
the case with fixed effects.
However, if either of the preconditions for using random effects is violated,
we should use fixed effects instead. One precondition is that the observations
can be described as being drawn randomly from a given population. This is a
reasonable assumption in the case of the NLSY because it was designed to be a
random sample. By contrast, it would not be a reasonable assumption if the units
of observation in the panel data set were countries and the sample consisted of
those countries that are members of the Organization for Economic Cooperation
and Development (OECD). These countries certainly cannot be considered to
represent a random sample of the 200-odd sovereign states in the world.
The other precondition is that the unobserved effect be distributed
independently of the Xj variables. How can we tell if this is the case? The standard
procedure is yet another implementation of the Durbin–Wu–Hausman test used
to help us choose between OLS and IV estimation in models where there is
suspected measurement error (Section 8.5) or simultaneous equations endogeneity
(Section 9.3). The null hypothesis is that the αi are distributed independently of
the Xj . If this is correct, both random effects and fixed effects are consistent,
but fixed effects will be inefficient because, looking at it in its LSDV form, it
involves estimating an unnecessary set of dummy variable coefficients. If the null
hypothesis is false, the random effects estimates will be subject to unobserved
heterogeneity bias and will therefore differ systematically from the fixed effects
estimates.
As in its other applications, the DWH test determines whether the estimates of
the coefficients, taken as a group, are significantly different in the two regressions.
If any variables are dropped in the fixed effects regression, they are excluded from
the test. Under the null hypothesis the test statistic has a chi-squared distribution.
In principle this should have degrees of freedom equal to the number of slope
coefficients being compared, but for technical reasons that require matrix algebra
for an explanation, the actual number may be lower. A regression application
that implements the test, such as Stata, should determine the actual number of
degrees of freedom.
Example
The fixed effects estimates, using the within-groups approach, of the
coefficients of married men and soon-to-be married men in Table 14.2 are 0.106 and
0.045, respectively. The corresponding random effects estimates are
considerably higher, 0.134 and 0.060, inviting the suspicion that they may be inflated by
unobserved heterogeneity. The DWH test involves the comparison of 13
coefficients (those of MARRIED, SOONMARR, and 11 controls). Stata reports that
there are in fact only 12 degrees of freedom. The test statistic is 205.8. With
12 degrees of freedom the critical value of chi-squared at the 0.1 percent level is
32.9, so we definitely conclude that we should be using fixed effects estimation.
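To give a feel for how such a statistic is built, the sketch below applies the single-coefficient form of the test to the MARRIED coefficient alone, using the rounded estimates and standard errors in Table 14.2. This is only a rough illustration. The statistic of 205.8 reported in the text is a joint test over all the compared coefficients and cannot be reproduced from the table, and the rounding of the standard errors makes the single-coefficient figure approximate. The critical value of 10.83 for one degree of freedom at the 0.1 percent level is a standard tabulated value added here for reference.

```python
# Rounded estimates and standard errors for MARRIED from Table 14.2
b_fe, se_fe = 0.106, 0.012     # fixed effects (consistent under both hypotheses)
b_re, se_re = 0.134, 0.010     # random effects (efficient under the null hypothesis)

# Single-coefficient form: squared difference divided by the difference in variances
h_married = (b_fe - b_re) ** 2 / (se_fe ** 2 - se_re ** 2)

chi2_crit_1df_01pct = 10.83    # chi-squared critical value, 1 d.f., 0.1 percent level
print(f"statistic for MARRIED alone = {h_married:.2f}")
print("reject null of independence:", h_married > chi2_crit_1df_01pct)
```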
Our findings are the same as in the simpler example in Section 14.1. They
confirm that married men earn more than single men. Part of the differential
appears to be attributable to the characteristics of married men, since men who
are soon-to-marry but still single also enjoy an earnings premium. However, part
of the marriage premium appears to be attributable to the effect of marriage itself,
since married men earn significantly more than those who are soon-to-marry but
still single.
Random effects or OLS?

Suppose that the DWH test indicates that we can use random effects rather
than fixed effects. We should then consider whether there are any unobserved
effects at all. It is just possible that the model has been so well specified that the
disturbance term
uit = αi + εit (14.19)
consists of only the purely random component εit and there is no individual-
specific αi term. In this situation we should use pooled OLS, with two advantages.
There is a gain in efficiency because we are not attempting to allow for non-
existent within-groups autocorrelation, and we will be able to take advantage of
the finite-sample properties of OLS, instead of having to rely on the asymptotic
properties of random effects.
Various tests have been developed to detect the presence of random effects.
The most common, implemented in some regression applications, is the Breusch–
Pagan Lagrange multiplier test, the test statistic having a chi-squared distribution
with one degree of freedom under the null hypothesis of no random effects. In
the case of the marriage effect example the statistic is very high indeed, 20,007,
but in this case it is meaningless because we are not able to use random effects
estimation.
A note on the random effects and fixed effects terminology

It is generally agreed that random effects/fixed effects terminology can be mis-
leading, but that it is too late to change it now. It is natural to think that random
effects estimation should be used when the unobserved effect can be character-
ized as being drawn randomly from a given population and that fixed effects
should be used when the unobserved effect is considered to be non-random. The
second part of that statement is correct. However, the first part is correct only
if the unobserved effect is distributed independently of the Xj variables. If it is
not, fixed effects should be used instead to avoid the problem of unobserved het-
erogeneity bias. Figure 14.1 summarizes the decision-making process for fitting
a model with panel data.

Figure 14.1 Choice of regression model for panel data (decision tree):
Can the observations be described as being a random sample from a given population?
  No: Use fixed effects.
  Yes: Perform both fixed effects and random effects regressions. Does a DWH test indicate significant differences in the coefficients?
    Yes: Use fixed effects.
    No: Provisionally choose random effects. Does a test indicate the presence of random effects?
      Yes: Use random effects.
      No: Use pooled OLS.
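The decision process in Figure 14.1 can be written as a small function. This is only a sketch of the flowchart logic; the function and argument names are hypothetical, and the test outcomes are assumed to have been obtained elsewhere.

```python
def choose_panel_estimator(random_sample, dwh_rejects=False, re_test_rejects=False):
    """Follow the branches of Figure 14.1.

    random_sample   : can the observations be described as a random sample
                      from a given population?
    dwh_rejects     : does the DWH test indicate significant differences
                      between the fixed and random effects coefficients?
    re_test_rejects : does a test (e.g. Breusch-Pagan LM) indicate the
                      presence of random effects?
    """
    if not random_sample:
        return "fixed effects"
    if dwh_rejects:
        return "fixed effects"
    if re_test_rejects:
        return "random effects"
    return "pooled OLS"


# The marriage effect example: the DWH test rejects, so fixed effects is chosen.
print(choose_panel_estimator(random_sample=True, dwh_rejects=True))
```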
Key terms
balanced panel
Durbin–Wu–Hausman test
first differences regression
fixed effects
least squares dummy variable (LSDV) regression
longitudinal data set
panel data set
pooled OLS regression
random effects
unbalanced panel
unobserved effect
within-groups regression
Exercises
14.1 Download the OECD2000 data set from the website. See Appendix B for details. The
data set contains 32 variables:
ID This is the country identification, with 1 = Australia, 2 = Austria,
3 = Belgium, 4 = Canada, 5 = Denmark, 6 = Finland, 7 = France,
8 = Germany, 9 = Greece, 10 = Iceland, 11 = Ireland, 12 = Italy, 13 =
Japan, 14 = Korea, 15 = Luxembourg, 16 = Mexico, 17 = Netherlands,
18 = New Zealand, 19 = Norway, 20 = Portugal, 21 = Spain,
22 = Sweden, 23 = Switzerland, 24 = Turkey, 25 = United Kingdom,

26 = United States. Four countries that have recently joined the OECD,
the Czech Republic, Hungary, Poland, and Slovakia, are excluded because
their data do not go back far enough.
ID01–26 These are individual country dummy variables. For example, ID09 is the
dummy variable for Greece.
E Average annual percentage rate of growth of employment for country i
during time period t.
G Average annual percentage rate of growth of GDP for country i during
time period t.
TIME There are three time periods, denoted 1, 2, and 3. They refer to average
annual data for 1971–80, 1981–90, and 1991–2000.
TIME2 Dummy variable defined to be equal to 1 when TIME = 2, 0 otherwise.
TIME3 Dummy variable defined to be equal to 1 when TIME = 3, 0 otherwise.
Perform a pooled OLS regression of E on G. Regress E on G, TIME2, and TIME3.

Perform appropriate statistical tests and give an interpretation of the regression
results.
14.2 Using the OECD2000 data set, perform a (within-groups) fixed effects regression
of E on G, TIME2, and TIME3. Perform appropriate statistical tests, give an
interpretation of the regression coefficients, and comment on R2 .
14.3 Perform the corresponding LSDV regression, using OLS to regress E on G, TIME2,
TIME3, and the country dummy variables (a) dropping the intercept, and (b) drop-
ping one of the dummy variables. Perform appropriate statistical tests and give an
interpretation of the coefficients in each case. Explain why either the intercept or one
of the dummy variables must be dropped.
14.4 Perform a test for fixed effects in the OECD2000 regression by evaluating the
explanatory power of the country dummy variables as a group.
14.5 Download the NLSY2000 data set from the website. See Appendix B for details. This
contains the variables found in the EAEF data sets for the years 1980–94, 1996,
1998, and 2000 (there were no surveys in 1995, 1997, or 1999). Assuming that a
random effects model is appropriate, investigate the apparent impact of unobserved
heterogeneity on estimates of the coefficient of schooling by fitting the same earnings
function, first using pooled OLS, then using random effects.
14.6 The UNION variable in the NLSY2000 data set is defined to be equal to 1 if the
respondent was a member of a union in the year in question and 0 otherwise.
Assuming that a random effects model is appropriate, add UNION to the earnings
function specification and fit it using pooled OLS and random effects.
14.7 Using the NLSY2000 data set, perform a fixed effects regression of the earnings
function specification used in Exercise 14.5 and compare the estimated coefficients
with those obtained using OLS and random effects. Perform a Durbin–Wu–Hausman
test to discriminate between random effects and fixed effects.
14.8 Using the NLSY2000 data set, perform a fixed effects regression of the earnings
function specification used in Exercise 14.6 and compare the estimated coefficients
with those obtained using OLS and random effects. Perform a Durbin–Wu–Hausman
test to discriminate between random effects and fixed effects.
14.9 The within-groups version of the fixed effects regression model involved subtracting
the group mean relationship
$$\bar{Y}_i = \beta_1 + \sum_{j=2}^{k} \beta_j \bar{X}_{ij} + \delta \bar{t} + \alpha_i + \bar{\varepsilon}_i$$
from the original specification in order to eliminate the individual-specific effect $\alpha_i$.

Regressions using the group mean relationship are described as between effects regres-
sions. Explain why the between effects model is in general inappropriate for estimating
the parameters of a model using panel data. (Consider the two cases where the αi are
correlated and uncorrelated with the Xj controls.)

ECON 5350 Class Notes
Chapter 13. Panel Data
1 Introduction
Panel (a.k.a. longitudinal or time series-cross sectional) data are observed both across cross-sectional units and over
time. The advantages of panel data are that they
1. increase the number of observations,
2. increase the precision of parameter estimates and
3. allow one to separate effects that may be impossible to distinguish with only cross sectional or only time series data
(e.g., technological progress versus economies of scale).
Three famous U.S. panel data sets are the
• Panel Study of Income Dynamics (PSID),
• National Longitudinal Survey (NLS) and
• Current Population Survey (CPS).
For example, the PSID has followed 6,000 families and 15,000 individuals since 1968, asking questions related
to income, job changes, marital status, and other socioeconomic and demographic characteristics.
Although there are clear advantages of panel data, there are also some complications. Below, I present
an introduction to estimation with panel data. Begin with the following model
$$y_{it} = \alpha_i + x_{it}'\beta + \epsilon_{it} \qquad (1)$$
where $i = 1, \ldots, n$ and $t = 1, \ldots, T$. Of course, when $\alpha_i = \alpha$ for $i = 1, \ldots, n$, then the data can simply
be pooled together, the model written in standard form $Y = X\beta + \epsilon$, and estimated with standard linear
techniques.
2 Fixed-Effects (FE) Model
In the fixed-effects model, we treat αi as a group-specific constant term to be estimated with the other
parameters. Now write the model as
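$$Y = D\alpha + X\beta + \epsilon \qquad (2)$$
where
$$Y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}_{nT \times 1}, \quad
D = \begin{bmatrix} i_T & 0 & \cdots & 0 \\ 0 & i_T & & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & i_T \end{bmatrix}_{nT \times n}, \quad
X = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1k} \\ x_{21} & x_{22} & & x_{2k} \\ \vdots & \vdots & & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{nk} \end{bmatrix}_{nT \times k}$$
and
$$\alpha = \begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_n \end{bmatrix}_{n \times 1}, \quad
\beta = \begin{bmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_k \end{bmatrix}_{k \times 1}.$$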
This is commonly referred to as the Least Squares Dummy Variable (LSDV) model. In theory, there are
no problems in estimating (2): provided the standard assumptions hold, the Gauss-Markov theorem applies
and we obtain unbiased and efficient estimates. There may be computational problems, however. Notice
that the stacked coefficient vector is of length (n + k). Therefore, the standard least squares formula requires
inverting a matrix of size (n + k) × (n + k). Since many panel data sets have n > 1000, this can lead to
numerical errors.
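A minimal sketch of the LSDV regression in equation (2), using NumPy on simulated data. The dummy matrix D is built as a Kronecker product of an identity matrix with a T × 1 column of ones; the panel dimensions, parameter values and variable names are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, T, k = 4, 5, 2                          # small hypothetical panel
alpha = np.array([1.0, -2.0, 0.5, 3.0])    # hypothetical group constants
beta = np.array([2.0, -1.0])               # hypothetical slopes

X = rng.normal(size=(n * T, k))            # rows stacked group by group
D = np.kron(np.eye(n), np.ones((T, 1)))    # nT x n matrix of group dummies
Y = D @ alpha + X @ beta + 0.1 * rng.normal(size=n * T)

# OLS on [D, X]: the stacked coefficient vector has length n + k.
Z = np.hstack([D, X])
coef, *_ = np.linalg.lstsq(Z, Y, rcond=None)
a_lsdv, b_lsdv = coef[:n], coef[n:]
print(a_lsdv.round(2), b_lsdv.round(2))
```

With n in the thousands, Z has thousands of columns, which is the computational burden the text describes.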
2.1 Partitioned Regression
A partitioned regression provides a simple solution to the above problem. Recall that
$$b = (X' M_D X)^{-1} (X' M_D Y)$$
where
$$M_D = I - D(D'D)^{-1}D' = \begin{bmatrix} I_T - \frac{1}{T} i_T i_T' & 0 & \cdots & 0 \\ 0 & I_T - \frac{1}{T} i_T i_T' & & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & I_T - \frac{1}{T} i_T i_T' \end{bmatrix}$$
is a symmetric, idempotent "residual-maker" matrix for the regression on the dummy variables D. Since
premultiplication by $M_D$ subtracts from each observation its group average over $t = 1, \ldots, T$, the
partitioned regression is equivalent to regressing
$$Y_* = M_D Y = Y - \bar{Y} = \begin{bmatrix} y_1 - \bar{y}_1 \\ y_2 - \bar{y}_2 \\ \vdots \\ y_n - \bar{y}_n \end{bmatrix}_{nT \times 1}$$
on
$$X_* = M_D X = X - \bar{X} = \begin{bmatrix} x_{11} - \bar{x}_{11} & x_{12} - \bar{x}_{12} & \cdots & x_{1k} - \bar{x}_{1k} \\ x_{21} - \bar{x}_{21} & x_{22} - \bar{x}_{22} & & x_{2k} - \bar{x}_{2k} \\ \vdots & \vdots & & \vdots \\ x_{n1} - \bar{x}_{n1} & x_{n2} - \bar{x}_{n2} & \cdots & x_{nk} - \bar{x}_{nk} \end{bmatrix}_{nT \times k}.$$
This only requires inversion of a (k × k) matrix. The group-specific constant terms can then be recovered
according to
$$a_i = \bar{y}_i - b'\bar{x}_i$$
for i = 1, ..., n. The partitioned regression approach is basically a two-stage estimation procedure:
• Step #1. Transform the data by subtracting group means.
• Step #2. Run OLS on the transformed data.
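The two steps above can be sketched in NumPy as follows. The function name and the simulated data are illustrative assumptions, and a balanced panel stacked group by group is assumed. The final lines cross-check the result against a brute-force LSDV regression on the same data.

```python
import numpy as np

def within_estimator(Y, X, n, T):
    """Two-step fixed-effects estimator for a balanced panel.

    Y has shape (n*T,), X has shape (n*T, k); rows i*T ... (i+1)*T - 1
    belong to group i.
    """
    k = X.shape[1]
    Yg = Y.reshape(n, T)
    Xg = X.reshape(n, T, k)
    ybar = Yg.mean(axis=1)                # group means of y, shape (n,)
    xbar = Xg.mean(axis=1)                # group means of x, shape (n, k)
    # Step 1: subtract group means.
    Y_star = (Yg - ybar[:, None]).reshape(n * T)
    X_star = (Xg - xbar[:, None, :]).reshape(n * T, k)
    # Step 2: OLS on the transformed data (only a k x k system is solved).
    b = np.linalg.solve(X_star.T @ X_star, X_star.T @ Y_star)
    a = ybar - xbar @ b                   # recover the group constants
    return b, a

# Hypothetical data for illustration.
rng = np.random.default_rng(0)
n, T, k = 6, 4, 2
X = rng.normal(size=(n * T, k))
alpha = rng.normal(size=n)
Y = np.repeat(alpha, T) + X @ np.array([1.0, -0.5]) + 0.1 * rng.normal(size=n * T)
b, a = within_estimator(Y, X, n, T)

# Cross-check against the LSDV regression on [D, X].
D = np.kron(np.eye(n), np.ones((T, 1)))
coef = np.linalg.lstsq(np.hstack([D, X]), Y, rcond=None)[0]
print(np.allclose(coef[n:], b), np.allclose(coef[:n], a))
```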
2.2 Variance Estimation
The variance estimator for b is as expected
$$\widehat{\mathrm{var}}(b) = s^2 (X' M_D X)^{-1}$$
where the appropriate estimator for $\sigma_\epsilon^2$ is
$$s^2 = \frac{e'e}{nT - n - k} = \frac{\sum_{i=1}^{n} \sum_{t=1}^{T} (y_{it} - a_i - x_{it}'b)^2}{nT - n - k}.$$
Keep in mind that if you use the partitioned-matrix approach, standard econometric programs may incor-
rectly use the degrees of freedom correction nT − k when nT − n − k is appropriate. Finally, the variance
estimator for the ai is
$$\widehat{\mathrm{var}}(a_i) = \frac{s^2}{T} + \bar{x}_i' \{ s^2 (X' M_D X)^{-1} \} \bar{x}_i.$$
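The sketch below computes these variance estimators from the demeaned data, applying the nT − n − k degrees of freedom correction explicitly. It is an illustration under the assumption of a balanced panel stacked group by group; the function name and the simulated data are hypothetical.

```python
import numpy as np

def fe_variances(Y, X, n, T):
    """Within estimates with var(b) and var(a_i), using nT - n - k."""
    k = X.shape[1]
    Yg, Xg = Y.reshape(n, T), X.reshape(n, T, k)
    ybar, xbar = Yg.mean(axis=1), Xg.mean(axis=1)
    Y_star = (Yg - ybar[:, None]).reshape(n * T)
    X_star = (Xg - xbar[:, None, :]).reshape(n * T, k)
    XtX_inv = np.linalg.inv(X_star.T @ X_star)    # (X' M_D X)^{-1}
    b = XtX_inv @ X_star.T @ Y_star
    e = Y_star - X_star @ b                       # equals the LSDV residuals
    s2 = e @ e / (n * T - n - k)                  # not nT - k
    var_b = s2 * XtX_inv
    # s2/T + xbar_i' var(b) xbar_i, computed for every group at once
    var_a = s2 / T + np.einsum("ik,kl,il->i", xbar, var_b, xbar)
    return b, s2, var_b, var_a

# Hypothetical data for illustration.
rng = np.random.default_rng(3)
n, T, k = 5, 6, 2
X = rng.normal(size=(n * T, k))
Y = np.repeat(rng.normal(size=n), T) + X @ np.array([0.5, 1.0]) + rng.normal(size=n * T)
b, s2, var_b, var_a = fe_variances(Y, X, n, T)
print(np.sqrt(np.diag(var_b)))   # standard errors of b
```

If a program reports standard errors from the demeaned regression using nT − k, they can be rescaled by the square root of (nT − k)/(nT − n − k).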
2.3 Testing for Group-Specific Effects
The standard F test can be used to test whether the pooled or fixed-effects model is more appropriate. The
null hypothesis is
$$H_0 : \alpha_1 = \alpha_2 = \cdots = \alpha_n.$$
The F statistic is
$$F = \frac{(R^2_{LSDV} - R^2_{pooled})/(n - 1)}{(1 - R^2_{LSDV})/(nT - n - k)} \sim F(n - 1, nT - n - k)$$
which could alternatively be written in sum-of-squared-errors form.
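A short sketch of the F statistic in R-squared form. The function name is an assumption, and the R-squared values and panel dimensions in the example are hypothetical numbers chosen only to show the arithmetic.

```python
def fixed_effects_f_stat(r2_lsdv, r2_pooled, n, T, k):
    """F statistic for H0: alpha_1 = ... = alpha_n (balanced panel)."""
    df1 = n - 1
    df2 = n * T - n - k
    f = ((r2_lsdv - r2_pooled) / df1) / ((1.0 - r2_lsdv) / df2)
    return f, df1, df2


# Hypothetical values: n = 10 groups, T = 5 periods, k = 2 regressors.
f, df1, df2 = fixed_effects_f_stat(r2_lsdv=0.80, r2_pooled=0.50, n=10, T=5, k=2)
print(f, df1, df2)   # (0.30 / 9) / (0.20 / 38), with 9 and 38 degrees of freedom
```

The statistic is then compared with the critical value of the F distribution with (df1, df2) degrees of freedom.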
3 Random-Effects (RE) Model
An alternative approach is to treat αi in equation (1) as a random draw from a distribution rather than
being nonstochastic. The advantages are
• The RE model has fewer parameters to estimate.
• The RE model allows for additional explanatory variables that have equal value for all observations
within a group (e.g., education level of parents, number of siblings, etc.).
The disadvantages of the RE approach are that
• if the unobserved group-specific effects are correlated with the explanatory variables, then the estimates
will be biased and inconsistent, and
• the estimator is a bit more complicated.
3.1 Basic Framework
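From equation (1), we will let $\alpha_i = \alpha + \mu_i$ so that the model is
$$y_{it} = \alpha + x_{it}'\beta + (\mu_i + \epsilon_{it})$$
where the following assumptions are made
• $E(\epsilon_{it}) = 0, \ \forall i, t$
• $E(\mu_i) = 0, \ \forall i$
• $E(\epsilon_{it}^2) = \sigma_\epsilon^2, \ \forall i, t$
• $E(\mu_i^2) = \sigma_\mu^2, \ \forall i$
• $E(\epsilon_{it} \mu_j) = 0, \ \forall i, t, j$
• $E(\epsilon_{it} \epsilon_{js}) = 0, \ \forall s \neq t$ or $i \neq j$
• $E(\mu_i \mu_j) = 0, \ \forall i \neq j$.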
The RE model can be written in matrix form as
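$$Y = X\beta + \eta$$
where $\eta \sim N(0, \Omega)$. Given the assumptions above, the $(nT \times nT)$ variance-covariance matrix $\Omega$ has the
following structure
$$\Omega = (I_n \otimes \Sigma_T) = \begin{bmatrix} \Sigma_T & 0 & \cdots & 0 \\ 0 & \Sigma_T & & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \Sigma_T \end{bmatrix}$$
where
$$\Sigma_T = \begin{bmatrix} \sigma_\epsilon^2 + \sigma_\mu^2 & \sigma_\mu^2 & \cdots & \sigma_\mu^2 \\ \sigma_\mu^2 & \sigma_\epsilon^2 + \sigma_\mu^2 & \cdots & \sigma_\mu^2 \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_\mu^2 & \sigma_\mu^2 & \cdots & \sigma_\epsilon^2 + \sigma_\mu^2 \end{bmatrix}_{T \times T}.$$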
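To make the block-diagonal structure concrete, the following sketch builds the T × T block and the full nT × nT covariance matrix with a Kronecker product. The function name and the variance values are illustrative assumptions.

```python
import numpy as np

def re_covariance(n, T, sigma2_eps, sigma2_mu):
    """Return (Sigma_T, Omega) for the random-effects error structure."""
    i_T = np.ones((T, 1))
    # sigma_eps^2 on the diagonal plus sigma_mu^2 in every cell
    Sigma_T = sigma2_eps * np.eye(T) + sigma2_mu * (i_T @ i_T.T)
    Omega = np.kron(np.eye(n), Sigma_T)   # I_n (x) Sigma_T
    return Sigma_T, Omega


Sigma_T, Omega = re_covariance(n=2, T=3, sigma2_eps=1.0, sigma2_mu=0.5)
print(Sigma_T)
print(Omega.shape)
```

Errors of the same group are correlated through the shared group effect, while errors of different groups are uncorrelated, which is why the off-diagonal blocks are zero.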
3.2 Estimation
Although OLS will produce consistent estimates, because $\Omega$ is not diagonal, the OLS estimates will be
inefficient. Therefore, GLS is the efficient estimator. Recall that the GLS estimator is
$$\hat{\beta}_{GLS} = (X_*' X_*)^{-1} (X_*' Y_*) = ((PX)'(PX))^{-1} ((PX)'(PY)) = (X' \Omega^{-1} X)^{-1} (X' \Omega^{-1} Y). \qquad (3)$$
For the RE model,
$$P = \Omega^{-1/2} = [I_n \otimes \Sigma_T]^{-1/2}$$
where
$$\Sigma_T^{-1/2} = \frac{1}{\sigma_\epsilon} \left[ I_T - \frac{\theta}{T} i_T i_T' \right]$$
and
$$\theta = 1 - \frac{\sigma_\epsilon}{\sqrt{\sigma_\epsilon^2 + T \sigma_\mu^2}}.$$
Therefore, the GLS estimator can be calculated by running a regression of the pseudo-deviations
$$Y_* = \begin{bmatrix} y_1 - \theta \bar{y}_1 \\ y_2 - \theta \bar{y}_2 \\ \vdots \\ y_n - \theta \bar{y}_n \end{bmatrix}$$
on the similarly transformed $X_*$.
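The sketch below computes θ for known variance components, checks numerically that the stated form of the inverse square root of the T × T block whitens that block, and applies the pseudo-deviation transformation. Function names and numerical values are illustrative assumptions; a balanced panel stacked group by group is assumed.

```python
import numpy as np

def theta(sigma2_eps, sigma2_mu, T):
    """Transformation parameter of the random-effects GLS estimator."""
    return 1.0 - np.sqrt(sigma2_eps) / np.sqrt(sigma2_eps + T * sigma2_mu)

def quasi_demean(V, n, T, th):
    """Subtract th times the group mean from each row of V."""
    V = np.asarray(V, dtype=float)
    Vg = V.reshape(n, T, -1)
    out = Vg - th * Vg.mean(axis=1, keepdims=True)
    return out.reshape(V.shape)


T = 4
s2_eps, s2_mu = 1.0, 0.75          # hypothetical variance components
th = theta(s2_eps, s2_mu, T)       # 1 - 1/sqrt(1 + 4 * 0.75) = 0.5

# Check: P Sigma_T P' should equal the identity matrix.
Sigma_T = s2_eps * np.eye(T) + s2_mu * np.ones((T, T))
P = (np.eye(T) - th / T * np.ones((T, T))) / np.sqrt(s2_eps)
print(np.allclose(P @ Sigma_T @ P.T, np.eye(T)))

# Pseudo-deviations for a toy series with n = 2 groups.
y_star = quasi_demean(np.arange(8.0), n=2, T=T, th=th)
print(y_star)
```

When X contains a column of ones, the same transformation turns it into a column equal to 1 − θ, so the intercept is handled automatically. Setting θ = 0 gives pooled OLS and θ = 1 gives the within transformation.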
3.3 Feasible Estimation
To make (3) operational, all that is left is to estimate $\sigma_\epsilon^2$ and $\sigma_\mu^2$. We will do this sequentially: first we
estimate $\sigma_\epsilon^2$ and then use that to estimate $\sigma_\mu^2$.
3.3.1 Estimation of $\sigma_\epsilon^2$
We begin by using the "within-groups" information given by the difference between
$$y_{it} = \alpha + \beta' x_{it} + (\mu_i + \epsilon_{it}) \qquad (4)$$
and
$$\bar{y}_i = \alpha + \beta' \bar{x}_i + (\mu_i + \bar{\epsilon}_i). \qquad (5)$$
This produces
$$y_{it} - \bar{y}_i = \beta' (x_{it} - \bar{x}_i) + (\epsilon_{it} - \bar{\epsilon}_i). \qquad (6)$$
Now that the unobserved group-specific random effects are gone, we estimate (6) using the LSDV estimator
and use the residuals to get the following estimate of $\sigma_\epsilon^2$:
$$\hat{\sigma}_\epsilon^2 = \frac{\sum_{i=1}^{n} \sum_{t=1}^{T} (e_{it} - \bar{e}_i)^2}{nT - n - k}.$$
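3.3.2 Estimation of $\sigma_\mu^2$
Now we use the "between-groups" information to estimate $\sigma_\mu^2$. Consider (5) again
$$\bar{\epsilon}_i + \mu_i = \bar{y}_i - \alpha - \beta' \bar{x}_i.$$
The variance of the composite error in (5) is
$$\sigma_{**}^2 = \frac{\sigma_\epsilon^2}{T} + \sigma_\mu^2.$$
Therefore, we can estimate $\sigma_\mu^2$ using
$$\hat{\sigma}_\mu^2 = \frac{e_{**}' e_{**}}{n - k} - \frac{\hat{\sigma}_\epsilon^2}{T},$$
where $e_{**}$ denotes the residuals from the regression of the group means in (5).
Finally, insert the estimates $\hat{\sigma}_\epsilon^2$ and $\hat{\sigma}_\mu^2$ into P and calculate $\hat{\beta}_{GLS}$.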
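The sequence just described (within residuals, then between residuals, then the pseudo-deviation regression) can be sketched as below. This is an illustration rather than a reference implementation. It assumes a balanced panel stacked group by group, counts the constant among the estimated coefficients of the between regression when forming its degrees of freedom, and truncates a negative variance estimate at zero, which is a choice made here and not something the text specifies. All names are hypothetical.

```python
import numpy as np

def random_effects_fgls(Y, X, n, T):
    """Feasible GLS for the RE model; X excludes the constant."""
    k = X.shape[1]
    Yg, Xg = Y.reshape(n, T), X.reshape(n, T, k)
    ybar, xbar = Yg.mean(axis=1), Xg.mean(axis=1)

    # Within regression (6): its residuals give sigma_eps^2.
    Yw = (Yg - ybar[:, None]).reshape(n * T)
    Xw = (Xg - xbar[:, None, :]).reshape(n * T, k)
    bw = np.linalg.lstsq(Xw, Yw, rcond=None)[0]
    ew = Yw - Xw @ bw
    s2_eps = ew @ ew / (n * T - n - k)

    # Between regression (5) on group means, with a constant.
    Zb = np.column_stack([np.ones(n), xbar])
    cb = np.linalg.lstsq(Zb, ybar, rcond=None)[0]
    eb = ybar - Zb @ cb
    s2_between = eb @ eb / (n - Zb.shape[1])
    s2_mu = max(s2_between - s2_eps / T, 0.0)   # truncation is an assumption

    # Pseudo-deviation regression, constant included in Z.
    th = 1.0 - np.sqrt(s2_eps / (s2_eps + T * s2_mu))
    Z = np.column_stack([np.ones(n * T), X])
    Zg = Z.reshape(n, T, k + 1)
    Z_star = (Zg - th * Zg.mean(axis=1, keepdims=True)).reshape(n * T, k + 1)
    Y_star = (Yg - th * ybar[:, None]).reshape(n * T)
    coef = np.linalg.lstsq(Z_star, Y_star, rcond=None)[0]
    return coef[0], coef[1:], s2_eps, s2_mu, th

# Simulated data with hypothetical true values alpha = 2, beta = 1.5,
# sigma_mu^2 = sigma_eps^2 = 1, and regressors independent of mu_i.
rng = np.random.default_rng(42)
n, T = 500, 5
mu = rng.normal(size=n)
X = rng.normal(size=(n * T, 1))
Y = 2.0 + 1.5 * X[:, 0] + np.repeat(mu, T) + rng.normal(size=n * T)
a_hat, b_hat, s2_eps, s2_mu, th = random_effects_fgls(Y, X, n, T)
print(a_hat, b_hat, s2_eps, s2_mu, th)
```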
4 Choosing Between Fixed and Random Effects Models
A frequent question with panel data is which model to use — fixed or random effects. The answer boils down
to whether the unobserved group-specific effects are correlated with the explanatory variables or not. If
they are, then the RE model will produce inconsistent estimates. If they are not, then the RE model may
be preferable. There are two methods for choosing between RE and FE models.
4.1 Think Through the Problem
Consider the following two problems where the RE model is almost certainly inappropriate.
1. Returns to Schooling. Labor economists use panel datasets to explain individual wages as a function of
years of schooling, as well as other socioeconomic and demographic characteristics. Individuals almost
certainly have unobserved innate abilities that are likely to be correlated with observable explanatory
variables such as years of schooling, marital status, type of employment, etc.
2. Economic Growth and R&D Spending. Consider a regression of GDP per capita on a number of
different country-specific variables such as research and development (R&D) spending, saving rates,
population growth rates, schooling, and capital-labor ratios. There are likely to be unobserved,
country-specific cultural differences that influence economic growth and, at the same time, are corre-
lated with the explanatory variables such as saving rates, population growth rates, etc.
4.2 Hausman Specification Test
The motivation behind the Hausman test is that under the null hypothesis of no correlation (i.e., $H_0 : \mathrm{corr}(X_{it}, \mu_i) = 0$),
both the FE and RE estimators are consistent but only the RE estimator is efficient.
Under the alternative, while the FE estimator is consistent, the RE estimator is not. The statistic is
$$W = (b_{LSDV} - \hat{\beta}_{GLS})' [\mathrm{var}(b) - \mathrm{var}(\hat{\beta}_{GLS})]^{-1} (b_{LSDV} - \hat{\beta}_{GLS}) \overset{asy}{\sim} \chi^2(k - 1).$$
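Given the two sets of slope estimates and their estimated covariance matrices, the statistic is a single quadratic form. The sketch below uses made-up numbers purely to show the calculation; the function name is an assumption. Only coefficients estimated by both methods should be included, so the RE intercept is excluded.

```python
import numpy as np

def hausman_statistic(b_fe, b_re, var_fe, var_re):
    """Quadratic form W from the FE and RE slope estimates."""
    d = np.asarray(b_fe, dtype=float) - np.asarray(b_re, dtype=float)
    V = np.asarray(var_fe, dtype=float) - np.asarray(var_re, dtype=float)
    return float(d @ np.linalg.solve(V, d))


# Hypothetical estimates for two slope coefficients.
W = hausman_statistic(
    b_fe=[1.0, 0.5],
    b_re=[0.9, 0.6],
    var_fe=np.diag([0.02, 0.03]),
    var_re=np.diag([0.01, 0.01]),
)
print(W)   # 0.1**2 / 0.01 + 0.1**2 / 0.02
```

In finite samples the difference of the two covariance matrices is not guaranteed to be positive definite, in which case the statistic can come out negative and should be interpreted with care.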
5 Heteroscedasticity and Autocorrelation
In general, there are two ways to handle nonspherical disturbances — robust estimation of the asymptotic
variance-covariance matrix (e.g., White’s estimator or the Newey-West estimator) or respecification of the
error structure and application of generalized least squares. I will present only the latter (see Greene
section 13.7 for a discussion of robust estimation). Note that LIMDEP has canned routines for handling
heteroscedasticity and autocorrelation in panel-data models.
5.1 Heteroscedasticity in the FE Model
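The most straightforward way to handle heteroscedasticity in the FE model is to begin by calculating an
estimate of $\sigma_{\epsilon,i}^2$ using the LSDV residuals
$$\hat{\sigma}_{\epsilon,i}^2 = \frac{1}{T} \sum_{t=1}^{T} e_{i,t}^2. \qquad (7)$$
Feasible GLS estimates are then calculated by
$$\hat{\beta}_{FGLS} = (X' \hat{\Omega}^{-1} X)^{-1} (X' \hat{\Omega}^{-1} Y)$$
where
$$\hat{\Omega} = \begin{bmatrix} \hat{\sigma}_{\epsilon,1}^2 I_T & 0 & \cdots & 0 \\ 0 & \hat{\sigma}_{\epsilon,2}^2 I_T & & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \hat{\sigma}_{\epsilon,n}^2 I_T \end{bmatrix}_{nT \times nT}.$$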
5.2 Heteroscedasticity in the RE Model
Begin by considering the composite error term $\mu_i + \epsilon_{it}$. Although it makes sense to allow $E(\mu_i^2) = \sigma_{\mu,i}^2$,
we will only have one observation for each i on $\mu_i$. Therefore, the estimate of $\sigma_{\mu,i}^2$ would have to be $\hat{\mu}_i^2$,
which is probably not desirable. If instead we let $E(\epsilon_{it}^2) = \sigma_{\epsilon,i}^2$, then all we have to do is adjust our
transformation parameter as follows:
$$\theta_i = 1 - \frac{\sigma_{\epsilon,i}}{\sqrt{\sigma_{\epsilon,i}^2 + T \sigma_\mu^2}}.$$
A similar result holds for unbalanced panels where heteroscedasticity is introduced by varying group sizes.
The only remaining task is to estimate $\sigma_{\epsilon,i}^2$ and $\sigma_\mu^2$. Greene suggests using the LSDV residuals to estimate
the $\sigma_{\epsilon,i}^2$ since the model has been purged of the $\mu_i$. This estimate is given in equation (7). The variance of the
group-specific effect, $\sigma_\mu^2$, can then be estimated by
$$\hat{\sigma}_\mu^2 = \frac{1}{n} \sum_{i=1}^{n} (\hat{\sigma}_{OLS,i}^2 - \hat{\sigma}_{\epsilon,i}^2)$$
where the $\hat{\sigma}_{OLS,i}^2$ are comparable to the $\hat{\sigma}_{\epsilon,i}^2$, but estimated using the pooled OLS residuals. The RE estimator then
proceeds as described earlier.
5.3 Autocorrelation in the FE Model
Autocorrelation in the FE model is fairly simple to handle (although care must be taken that the data are stacked in the
correct manner). Begin with AR(1) errors
$$\epsilon_{it} = \rho_i \epsilon_{i,t-1} + \nu_{it} \qquad (8)$$
where some researchers choose to set $\rho_i = \rho$ for all $i = 1, \ldots, n$. Either way, the $\rho_i$ can be estimated with
the LSDV residuals, the data pseudo-differenced, and feasible GLS applied.
5.4 Autocorrelation in the RE Model
Autocorrelation in the RE model is only slightly more complex. Of course, in the RE model there is
always going to be autocorrelation in the composite error term $\mu_i + \epsilon_{it}$ because $\mu_i$ does not vary over time.
Therefore, it only makes sense to specify autocorrelation in $\epsilon_{it}$, as is done in equation (8). The LSDV
residuals can be used to get an estimate of $\rho$ (or $\rho_i$ if desired) and Cochrane-Orcutt can then be applied.
The transformed model will take the form
$$y_{it} - \rho_i y_{i,t-1} = \alpha(1 - \rho_i) + \beta'(x_{it} - \rho_i x_{i,t-1}) + \mu_i(1 - \rho_i) + \nu_{it}$$
for $t = 2, \ldots, T$.
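A small sketch of the two mechanical steps involved: estimating a common ρ from residuals and pseudo-differencing a series. Keeping each group's residuals in its own list ensures that a lagged product never pairs the last observation of one group with the first of the next, which is the stacking issue mentioned in Section 5.3. The function names and toy numbers are illustrative assumptions.

```python
def estimate_rho(residuals):
    """Pooled AR(1) coefficient from within-group lagged residual products.

    residuals: list of lists, one list of LSDV residuals per group.
    """
    num = sum(row[t] * row[t - 1] for row in residuals for t in range(1, len(row)))
    den = sum(row[t - 1] ** 2 for row in residuals for t in range(1, len(row)))
    return num / den


def pseudo_difference(series, rho):
    """Return v_t - rho * v_{t-1} for t = 2, ..., T within one group."""
    return [series[t] - rho * series[t - 1] for t in range(1, len(series))]


# Toy residuals that follow e_t = 0.5 * e_{t-1} exactly within each group.
toy = [[1.0, 0.5, 0.25], [-2.0, -1.0, -0.5]]
rho_hat = estimate_rho(toy)
print(rho_hat)
print(pseudo_difference(toy[0], rho_hat))
```

Each group loses its first observation in the transformation, so the transformed panel has T − 1 periods.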
6 Other Types of Panel-Data Models
6.1 Unbalanced Panels
Up to this point, we have implicitly assumed that each cross section is observed for T periods. However,
because of sample attrition and new entry, it is common not to observe each cross section for all periods.
In this case, the total number of observations will not be $nT$, but rather $\sum_{i=1}^{n} T_i$. Below, I describe how to
modify the FE and RE models for an unbalanced panel; these modifications are already programmed into most econometric
software packages.
6.1.1 FE Model
The FE model works with an unbalanced panel with no real change, provided the dummy variables are
updated to no longer have the same number of ones in each column.
6.1.2 RE Model
The RE model is only slightly more complicated. With an unbalanced panel, now define
$$\theta_i = 1 - \frac{\sigma_\epsilon}{\sqrt{\sigma_\epsilon^2 + T_i \sigma_\mu^2}}$$
where $i = 1, \ldots, n$. The data are then transformed according to
$$Y_* = \begin{bmatrix} y_1 - \theta_1 \bar{y}_1 \\ y_2 - \theta_2 \bar{y}_2 \\ \vdots \\ y_n - \theta_n \bar{y}_n \end{bmatrix}$$
and similarly for X. Of course, if $T_i = T$, then this collapses back to the standard RE estimator.
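A brief sketch of the group-specific transformation for an unbalanced panel, with each group stored as its own list so that group sizes may differ. The function names and the variance values are illustrative assumptions.

```python
import math

def theta_unbalanced(group_sizes, sigma2_eps, sigma2_mu):
    """theta_i for each group, given its number of observations T_i."""
    return [
        1.0 - math.sqrt(sigma2_eps) / math.sqrt(sigma2_eps + Ti * sigma2_mu)
        for Ti in group_sizes
    ]


def quasi_demean_group(values, th):
    """Subtract th times the group mean from one group's observations."""
    mean = sum(values) / len(values)
    return [v - th * mean for v in values]


# Hypothetical groups observed for 3, 8 and 15 periods, with both
# variance components set to 1.
thetas = theta_unbalanced([3, 8, 15], sigma2_eps=1.0, sigma2_mu=1.0)
print(thetas)                                   # 1 - 1/2, 1 - 1/3, 1 - 1/4
print(quasi_demean_group([1.0, 2.0, 3.0], thetas[0]))
```

Groups observed for more periods get a θ closer to one, so their data are transformed in a way that is closer to the within (fixed effects) transformation.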
6.2 Time-Specific Effects
It is possible, using either the FE or RE approach, to incorporate time-specific effects:
$$y_{it} = \alpha_i + \beta' x_{it} + \gamma_t + \epsilon_{it}.$$
Using the FE approach, $\gamma_t$ can be estimated by incorporating dummy variables for T − 1 of the time periods.
The RE approach, which treats $\gamma_t$ as a random variable, is more complicated.
6.3 Dynamic Models
Sometimes it is desirable to allow for dynamic effects in the model
$$y_{it} = \alpha_i + \gamma y_{i,t-1} + x_{it}'\beta + \epsilon_{it}.$$
The problem with estimating such a model, either of the FE or RE nature, is that after transforming the
model to get rid of the group-specific effect (e.g., first-differencing the model), the right-hand side variables
are correlated with the error term. The solution to this problem involves using values of $y_{i,t-s}$ for $s > 1$ as
instrumental variables.
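One common way to apply this idea is an Anderson–Hsiao type estimator: first-difference the model and use the level of y lagged two periods as the instrument for the lagged difference. The simulation below is a minimal sketch of that idea for a model with no x variables; the design values (n, T, γ, the variances, the burn-in length) are hypothetical, and this is not presented as the method used in the notes.

```python
import numpy as np

rng = np.random.default_rng(1)
n, T, gamma = 5000, 8, 0.5               # hypothetical design
burn = 20                                # periods discarded as burn-in
alpha = 0.5 * rng.normal(size=n)         # group-specific effects

y = np.zeros((n, T + burn))
y[:, 0] = alpha / (1 - gamma) + rng.normal(size=n)
for t in range(1, T + burn):
    y[:, t] = alpha + gamma * y[:, t - 1] + rng.normal(size=n)
y = y[:, burn:]                          # keep the last T periods

dy = y[:, 2:] - y[:, 1:-1]               # y_t - y_{t-1}
dy_lag = y[:, 1:-1] - y[:, :-2]          # y_{t-1} - y_{t-2}
z = y[:, :-2]                            # instrument: y_{t-2}

# OLS on first differences is inconsistent because dy_lag contains
# eps_{t-1}, which also appears in the differenced error.
gamma_ols = np.sum(dy_lag * dy) / np.sum(dy_lag ** 2)
# Simple IV estimator using y_{t-2}, which is uncorrelated with the
# differenced error when eps is serially uncorrelated.
gamma_iv = np.sum(z * dy) / np.sum(z * dy_lag)
print(gamma_ols, gamma_iv)
```

In this design the first-difference OLS estimate is far from the true value of γ, while the instrumental variables estimate should lie close to it.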
7 Gauss Application
Consider the following panel data model taken from Wooldridge (2002)
$$Murders_{it} = \alpha_i + \beta_1 Executions_{it} + \beta_2 Unemp_{it} + \epsilon_{it}$$
where $Murders_{it}$ is the number of murders in state i in year t per 10,000 people; $Executions_{it}$ is the
total number of executions for the current and prior two years; $Unemp_{it}$ is the current unemployment rate;
$i = 1, \ldots, 50$; and $t = 1987, 1990, 1993$. See Gauss example 13.1 for further details.

