
Quantitative Economics 2 (2011), 173–210

Numerically stable and accurate stochastic simulation approaches for solving dynamic economic models

Kenneth L. Judd
Hoover Institution, Stanford University and NBER

Lilia Maliar
University of Alicante and Hoover Institution, Stanford University

Serguei Maliar
University of Alicante and Hoover Institution, Stanford University
We develop numerically stable and accurate stochastic simulation approaches for solving dynamic economic models. First, instead of standard least-squares approximation methods, we examine a variety of alternatives, including least-squares methods using singular value decomposition and Tikhonov regularization, least-absolute deviations methods, and the principal component regression method, all of which are numerically stable and can handle ill-conditioned problems. Second, instead of conventional Monte Carlo integration, we use accurate quadrature and monomial integration. We test our generalized stochastic simulation algorithm (GSSA) in three applications: the standard representative-agent neoclassical growth model, a model with rare disasters, and a multicountry model with hundreds of state variables. GSSA is simple to program, and MATLAB codes are provided.

Keywords. Stochastic simulation, generalized stochastic simulation algorithm, parameterized expectations algorithm, least absolute deviations, linear programming, regularization.

JEL classification. C63, C68.
Kenneth L. Judd: kennethjudd@mac.com
Lilia Maliar: maliarl@stanford.edu
Serguei Maliar: maliars@stanford.edu

We thank a co-editor and two anonymous referees for very useful comments that led to a substantial improvement of the paper. Lilia Maliar and Serguei Maliar acknowledge support from the Hoover Institution at Stanford University, the Ivie, the Generalitat Valenciana under Grants BEST/2010/142 and BEST/2010/141, respectively, the Ministerio de Ciencia e Innovación de España and FEDER funds under project SEJ-2007-62656, and under the programs José Castillejo JC2008-224 and Salvador Madariaga PR2008-190, respectively.

Copyright © 2011 Kenneth L. Judd, Lilia Maliar, and Serguei Maliar. Licensed under the Creative Commons Attribution-NonCommercial License 3.0. Available at http://www.qeconomics.org.
DOI: 10.3982/QE14

1. Introduction

Dynamic stochastic economic models do not generally admit closed-form solutions and must be studied with numerical methods.^1 Most methods for solving such models fall
into three broad classes: projection methods, which approximate solutions on a prespecified domain using deterministic integration; perturbation methods, which find solutions locally using Taylor expansions of optimality conditions; and stochastic simulation methods, which compute solutions on a set of simulated points using Monte Carlo integration. All three classes of methods have their relative advantages and drawbacks, and the optimal choice of a method depends on the application. Projection methods are accurate and fast when applied to models with few state variables; however, their cost increases rapidly as the number of state variables increases. Perturbation methods are practical to use in high-dimensional applications, but the range of their accuracy is uncertain.^2 Stochastic simulation algorithms are simple to program, although they are generally less accurate than projection methods and often numerically unstable.^3 In the present paper, we focus on the stochastic simulation class.^4 We specifically develop a generalized stochastic simulation algorithm (GSSA) that combines advantages of all three classes, namely, it is accurate, numerically stable, tractable in high-dimensional applications, and simple to program.
The key message of the present paper is that a stochastic simulation approach is attractive for solving economic models because it computes solutions only in the part of the state space which is visited in equilibrium: the ergodic set. In Figure 1, we plot the ergodic set of capital and productivity level for a representative-agent growth model with a closed-form solution (for a detailed description of this model, see Section 2.1). The ergodic set takes the form of an oval, and most of the rectangular area that sits outside of the oval's boundaries is never visited. In the two-dimensional case, a circle inscribed within a square occupies about 79% of the area of the square, and an oval inscribed in this way occupies an even smaller area. Thus, the ergodic set is at least 21% smaller than the square. In general, the ratio of the volume of a $J$-dimensional hypersphere of diameter 1 to the volume of a $J$-dimensional hypercube of width 1 is

$$V_J = \begin{cases} \dfrac{(\pi/2)^{(J-1)/2}}{1\cdot 3\cdots J} & \text{for } J = 1, 3, 5, \ldots, \\[1ex] \dfrac{(\pi/2)^{J/2}}{2\cdot 4\cdots J} & \text{for } J = 2, 4, 6, \ldots. \end{cases} \qquad (1)$$

The ratio $V_J$ declines very rapidly with the dimensionality of the state space. For example, for dimensions 3, 4, 5, 10, and 30, this ratio is 0.52, 0.31, 0.16, $3\times 10^{-3}$, and $2\times 10^{-14}$, respectively.
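The ratio (1) is straightforward to evaluate numerically. The following MATLAB fragment is ours, for illustration only (the helper name volume_ratio is hypothetical):

```matlab
% Ratio (1) of the volume of a J-dimensional hypersphere of diameter 1
% to the volume of a J-dimensional hypercube of width 1.
function V = volume_ratio(J)
    if mod(J,2) == 1
        V = (pi/2)^((J-1)/2)/prod(1:2:J);   % odd J: denominator 1*3*...*J
    else
        V = (pi/2)^(J/2)/prod(2:2:J);       % even J: denominator 2*4*...*J
    end
end
```

For J = 3, 4, 5, 10, and 30, it returns approximately 0.52, 0.31, 0.16, $3\times 10^{-3}$, and $2\times 10^{-14}$, matching the figures above.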
1. For reviews of such methods, see Taylor and Uhlig (1990), Rust (1996), Gaspar and Judd (1997), Judd (1998), Marimon and Scott (1999), Santos (1999), Christiano and Fisher (2000), Miranda and Fackler (2002), Aruoba, Fernandez-Villaverde, and Rubio-Ramírez (2006), Heer and Maussner (2008), Den Haan (2010), and Kollmann, Maliar, Malin, and Pichler (2011).
2. See Judd and Guu (1993), Gaspar and Judd (1997), and Kollmann et al. (2011) for accuracy assessments of perturbation methods.
3. See Judd (1992) and Christiano and Fisher (2000) for a discussion.
4. Stochastic simulations are widely used in economics and other fields; see Asmussen and Glynn (2007) for an up-to-date review of such methods. In the macroeconomic literature, stochastic simulation methods have been used to approximate an economy's path (Fair and Taylor (1983)), a conditional expectation function in the Euler equation (Marcet (1988)), a value function (Maliar and Maliar (2005)), an equilibrium interest rate (Aiyagari (1994)), and an aggregate law of motion of a heterogeneous-agent economy (Krusell and Smith (1998)), as well as to make inferences about the parameters of economic models (Smith (1993), among others).
Figure 1. The ergodic set in the model with a closed-form solution.
The advantage of focusing on the ergodic set is twofold. First, when computing a solution on an ergodic set that has the shape of a hypersphere, we face just a fraction of the cost we would have faced on a hypercube grid, which is used in conventional projection methods. The higher is the dimensionality of a problem, the larger is the reduction in cost. Second, when fitting a polynomial on the ergodic set, we focus on the relevant domain and can get a better fit inside the relevant domain than conventional projection methods, which face a trade-off between the fit inside and outside the relevant domain.^5

However, to fully benefit from the advantages of a stochastic simulation approach, we must first stabilize the stochastic simulation procedure. The main reason for the numerical instability of this procedure is that polynomial terms constructed on simulated series are highly correlated with one another even under low-degree polynomial approximations. Under the usual least-squares methods, the multicollinearity problem leads to a failure of the approximation (regression) step.

To achieve numerical stability, we build GSSA on approximation methods that are designed to handle ill-conditioned problems. In the context of a linear regression model, we examine a variety of such methods including least-squares (LS) methods using singular value decomposition (SVD) and Tikhonov regularization, the principal component regression method, and least-absolute deviations (LAD) linear-programming methods (in particular, we present primal and dual LAD regularization methods). In addition, we explore how the numerical stability is affected by other factors such as a normalization of variables, the choice of policy function to parameterize (capital versus marginal-utility policy functions), and the choice of basis functions (ordinary versus Hermite polynomials). Our stabilization strategies are remarkably successful: our approximation methods deliver polynomial approximations up to degree 5 (at least), while the ordinary least-squares method fails to go beyond the second-degree polynomial in the studied examples.

5. The importance of this effect can be seen from the results of the January 2010 special Journal of Economic Dynamics and Control issue on numerical methods for solving Krusell and Smith's (1998) model. An Euler equation method based on the Krusell–Smith type of simulation by Maliar, Maliar, and Valli (2010) delivered a more accurate aggregate law of motion than does any other method participating in the comparison analysis, including projection methods; see Table 15 in Den Haan (2010).
We next focus on accuracy. We show that if Monte Carlo integration is used to approximate conditional expectations, the accuracy of solutions is dominated by sampling errors from a finite simulation. The sampling errors decrease with the simulation length, but the rate of convergence is low, and high accuracy levels are impractical. For example, in a representative-agent model, Monte Carlo integration leads to accuracy levels (measured by the size of unit-free Euler equation errors on a stochastic simulation) of order $10^{-4}$–$10^{-5}$ under the simulation length of 10,000. The highest accuracy is attained under second- or third-degree polynomials. Thus, even though our stabilization strategies enable us to compute a high-degree polynomial approximation, there is no point in doing so with Monte Carlo integration.

To increase the accuracy of solutions, we replace the Monte Carlo integration method with more accurate deterministic integration methods, namely, the Gauss–Hermite quadrature and monomial methods. Such methods are unrelated to the estimated density function and do not suffer from sampling errors. In the representative-agent case, GSSA based on Gauss–Hermite quadrature integration delivers accuracy levels of order $10^{-9}$–$10^{-10}$, which are comparable to those attained by projection methods. Thus, under accurate deterministic integration, high-degree polynomials do help increase the accuracy of solutions.

Given that GSSA allows for a variety of approximation and integration techniques, we can choose a combination of the techniques that takes into account a trade-off between numerical stability, accuracy, and speed for a given application. Some tendencies from our experiments are as follows. LAD methods are generally more expensive than LS methods; however, they deliver smaller mean absolute errors. In small- and moderate-scale problems, the LS method using SVD is more stable than the method using Tikhonov regularization, although the situation reverses in large-scale problems (SVD becomes costly and numerically unstable). Gauss–Hermite quadrature (product) integration rules are very accurate; however, they are practical only with few exogenous random variables (shocks). Monomial (nonproduct) integration rules deliver comparable accuracy and are feasible with many exogenous random variables. Surprisingly, a quadrature integration method with just one integration node is also sufficiently accurate in our examples; in particular, it is more accurate than a Monte Carlo integration method with thousands of integration nodes.

We advocate versions of GSSA that use deterministic integration methods. Such versions of GSSA construct a solution domain using stochastic simulations but compute integrals using methods that are unrelated to simulations; these preferred versions of GSSA, therefore, lie between pure stochastic simulation and pure projection algorithms. Importantly, GSSA keeps the prominent feature of stochastic simulation methods, namely, their tractability in high-dimensional applications. To illustrate this feature, we solve a version of the neoclassical growth model with $N$ heterogeneous countries (the state space is composed of $2N$ variables). For small-scale economies, $N = 6$, 4, and 2, GSSA computes polynomial approximations up to degrees 3, 4, and 5 with maximum absolute errors of 0.001%, 0.0006%, and 0.0002%, respectively. For medium-scale economies, $N = 8$, 10, and 20, GSSA computes second-degree polynomial approximations with maximum absolute errors of 0.01%, which is comparable to the highest accuracy levels attained in the related literature; see Kollmann et al. (2011). Finally, for large-scale economies, $N = 100$ and 200, GSSA computes first-degree polynomial approximations with maximum absolute approximation errors of 0.1%. The running time of GSSA depends on the cost of the integration and approximation methods. Our cheapest setup delivers a second-degree polynomial solution to a 20-country model in about 18 minutes using MATLAB and a standard desktop computer.
We present GSSA in the context of examples in which all variables can be expressed analytically in terms of the capital policy function, but GSSA can be applied in far more general contexts. In more complicated models (e.g., with valued leisure), intratemporal choices, such as labor supply, are not analytically related to capital policy functions. One way to proceed under GSSA is to approximate intratemporal-choice policy functions as we do with capital; however, this may reduce accuracy and numerical stability. Maliar, Maliar, and Judd (2011) described two intratemporal-choice approaches, precomputation and iteration-on-allocation, that make it possible to find intratemporal choices both accurately and quickly; these approaches are fully compatible with GSSA. Furthermore, GSSA can be applied for solving models with occasionally binding borrowing constraints; see, for example, Marcet and Lorenzoni (1999), Christiano and Fisher (2000), and Maliar, Maliar, and Valli (2010). Finally, the approximation and integration methods described in the paper can be useful in the context of other solution methods, for example, the simulation-based dynamic programming method of Maliar and Maliar (2005).

GSSA is simple to program, and MATLAB codes are provided.^6 Not only can the codes solve the studied examples, but they can be easily adapted to other problems in which the reader may be interested. In particular, the codes include generic routines that implement numerically stable LS and LAD methods, construct multidimensional polynomials, and perform multidimensional Gauss–Hermite quadrature and monomial integration methods. The codes also contain a test suite for evaluating the accuracy of solutions.

6. The codes are available at http://www.stanford.edu/~maliars.
The rest of the paper is organized as follows: In Section 2, we describe GSSA using an example of a representative-agent neoclassical growth model. In Section 3, we discuss the reasons for numerical instability of stochastic simulation methods. In Section 4, we elaborate on strategies for enhancing the numerical stability. In Section 5, we compare Monte Carlo and deterministic integration methods. In Section 6, we present the results of numerical experiments. In Section 7, we conclude. The Appendices are available in a supplementary file on the journal website, http://qeconomics.org/supp/14/supplement.pdf.
2. Generalized stochastic simulation algorithm

We describe GSSA using an example of the standard representative-agent neoclassical stochastic growth model. However, the techniques described in the paper are not specific to this model and can be directly applied to other economic models, including those with many state and control variables. In Section 6, we show how to apply GSSA for solving models with rare disasters and models with multiple countries.
2.1 The model

The agent solves the intertemporal utility-maximization problem

$$\max_{\{k_{t+1}, c_t\}_{t=0,\ldots,\infty}} E_0\sum_{t=0}^{\infty}\beta^t u(c_t) \qquad (2)$$

$$\text{s.t.}\quad c_t + k_{t+1} = (1-\delta)k_t + a_t f(k_t), \qquad (3)$$

$$\ln a_{t+1} = \rho\ln a_t + \varepsilon_{t+1}, \quad \varepsilon_{t+1}\sim N(0,\sigma^2), \qquad (4)$$

where the initial condition $(k_0, a_0)$ is given exogenously. Here, $E_t$ is the expectation operator conditional on information at time $t$; $c_t$, $k_t$, and $a_t$ are, respectively, consumption, capital, and the productivity level; $\beta\in(0,1)$ is the discount factor; $\delta\in(0,1]$ is the depreciation rate of capital; $\rho\in(-1,1)$ is the autocorrelation coefficient; and $\sigma\geq 0$ is the standard deviation. The utility and production functions, $u$ and $f$, respectively, are strictly increasing, continuously differentiable, and concave. The solution to (2)–(4) is represented by stochastic processes $\{c_t, k_{t+1}\}_{t=0,\ldots,\infty}$ which are measurable with respect to $\{a_t\}_{t=0,\ldots,\infty}$. At each time $t$, the solution to (2)–(4) satisfies the Euler equation

$$u'(c_t) = \beta E_t\big[u'(c_{t+1})\big(1-\delta+a_{t+1}f'(k_{t+1})\big)\big], \qquad (5)$$

where $u'$ and $f'$ are the first derivatives of the utility and production functions, respectively. In a recursive (Markov) equilibrium, decisions of period $t$ are functions of the current state $(k_t, a_t)$. Our objective is to find policy functions for capital, $k_{t+1} = K(k_t, a_t)$, and consumption, $c_t = C(k_t, a_t)$, that satisfy (3)–(5).
2.2 The GSSA algorithm

To solve the model (2)–(4), we approximate the capital policy function $k_{t+1} = K(k_t, a_t)$. We choose some flexible functional form $\Psi(k_t, a_t; b)$ and search for a vector of coefficients $b$ such that

$$K(k_t, a_t) \approx \Psi(k_t, a_t; b) \qquad (6)$$

for some set of points $(k_t, a_t)$ in the state space. We rewrite the Euler equation (5) in the equivalent form

$$k_{t+1} = E_t\left[\beta\frac{u'(c_{t+1})}{u'(c_t)}\big(1-\delta+a_{t+1}f'(k_{t+1})\big)k_{t+1}\right]. \qquad (7)$$

The condition (7) holds because $u'(c_t)\neq 0$ and because $k_{t+1}$ is $t$-measurable.^7 We now have expressed $k_{t+1}$ in two ways: as a choice implied by the policy function $k_{t+1} = K(k_t, a_t)$ and as a conditional expectation of a time $t+1$ random variable on the right side of (7). This construction gives us a way to express the capital policy function as a fixed point: substituting $K(k_t, a_t)$ into the right side of (7) and computing the conditional expectation should give us $k_{t+1} = K(k_t, a_t)$ for all $(k_t, a_t)$ in the relevant area of the state space.

7. In a similar way, one can use the Euler equation (5) to express other $t$-measurable variables, for example, $\ln(k_{t+1})$, $c_t$, and $u'(c_t)$.

GSSA finds a solution by iterating on the fixed-point construction (7) via stochastic simulation. To be specific, we guess a capital policy function (6), simulate a time-series solution, compute the conditional expectation in each simulated point, and use the simulated data to update the guess along iterations until a fixed point is found. The formal description of GSSA is as follows:
Stage 1

Initialization:
Choose an initial guess $b^{(1)}$.
Choose the initial state $(k_0, a_0)$ for simulations.
Choose a simulation length $T$, draw a sequence of productivity shocks $\{\varepsilon_t\}_{t=1,\ldots,T}$, and compute $\{a_t\}_{t=1,\ldots,T}$ as defined in (4).

Step 1. At iteration $p$, use $b^{(p)}$ to simulate the model $T$ periods forward:

$$k_{t+1} = \Psi\big(k_t, a_t; b^{(p)}\big),$$
$$c_t = (1-\delta)k_t + a_t f(k_t) - k_{t+1}.$$

Step 2. For $t = 0,\ldots,T-1$, define $y_t$ to be an approximation of the conditional expectation in (7) using $J$ integration nodes and weights, $\{\varepsilon_{t+1,j}\}_{j=1,\ldots,J}$ and $\{\omega_{t,j}\}_{j=1,\ldots,J}$, respectively:

$$y_t = \sum_{j=1}^{J}\omega_{t,j}\left[\beta\frac{u'(c_{t+1,j})}{u'(c_t)}\big(1-\delta+a_{t+1,j}f'(k_{t+1})\big)k_{t+1}\right], \qquad (8)$$

where $c_{t+1,j}$, the value of $c_{t+1}$ if the innovation in productivity is $\varepsilon_{t+1,j}$, is defined for $j = 1,\ldots,J$ by

$$a_{t+1,j} \equiv a_t^{\rho}\exp(\varepsilon_{t+1,j}),$$
$$k_{t+2,j} \equiv \Psi\big(\Psi\big(k_t, a_t; b^{(p)}\big),\ a_t^{\rho}\exp(\varepsilon_{t+1,j});\ b^{(p)}\big),$$
$$c_{t+1,j} \equiv (1-\delta)k_{t+1} + a_{t+1,j}f(k_{t+1}) - k_{t+2,j}.$$

Step 3. Find $\hat{b}$ that minimizes the errors $\varepsilon_t$ in the regression equation

$$y_t = \Psi(k_t, a_t; b) + \varepsilon_t \qquad (9)$$

according to some norm.

Step 4. Check for convergence and end Stage 1 if

$$\frac{1}{T}\sum_{t=1}^{T}\left|\frac{k_{t+1}^{(p)} - k_{t+1}^{(p-1)}}{k_{t+1}^{(p)}}\right| < \varpi, \qquad (10)$$

where $\{k_{t+1}^{(p)}\}_{t=1,\ldots,T}$ and $\{k_{t+1}^{(p-1)}\}_{t=1,\ldots,T}$ are the capital series obtained on iterations $p$ and $p-1$, respectively.

Step 5. Compute $b^{(p+1)}$ for iteration $p+1$ using fixed-point iteration

$$b^{(p+1)} = (1-\xi)b^{(p)} + \xi\hat{b}, \qquad (11)$$

where $\xi\in(0,1]$ is a damping parameter. Go to Step 1.
Stage 2. The purpose of Stage 2 is to subject the candidate solution from Stage 1 to an independent and stringent test. Construct a new set of $T_{\mathrm{test}}$ points $\{k_\tau, a_\tau\}_{\tau=0,\ldots,T_{\mathrm{test}}}$ for testing the accuracy of the solution obtained in Stage 1 (this can be a set of simulation points constructed with a new random draw or some deterministic set of points). Rewrite the Euler equation (5) at $(k_\tau, a_\tau)$ in a unit-free form:

$$\mathcal{E}(k_\tau, a_\tau) \equiv E_\tau\left[\beta\frac{u'(c_{\tau+1})}{u'(c_\tau)}\big(1-\delta+a_{\tau+1}f'(k_{\tau+1})\big)\right] - 1. \qquad (12)$$

For each point $(k_\tau, a_\tau)$, compute $\mathcal{E}(k_\tau, a_\tau)$ by using a high-quality integration method to evaluate the conditional expectation in (12). We measure the quality of a candidate solution by computing various norms, such as the mean, variance, and/or supremum, of the errors (12). If the economic significance of these errors is small, we accept the candidate $\hat{b}$. Otherwise, we tighten up Stage 1 by using a more flexible approximating function, and/or increasing the simulation length, and/or improving the method used for computing conditional expectations, and/or choosing a more demanding norm when computing $\hat{b}$ in Step 3.^8

8. For the models considered in the paper, errors in the Euler equation are the only source of approximation errors. In general, we need to check approximation errors in all optimality conditions, the solutions to which are evaluated numerically.
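To make Stage 1 concrete, the fragment below is a minimal MATLAB sketch of the loop in Steps 1–5; it is ours, not the authors' posted code. It assumes a CRRA utility function with $u'(c) = c^{-\gamma}$, a Cobb–Douglas production function $f(k) = k^{\alpha}$, the one-node Monte Carlo approximation of (8) (the MC(1) method of Section 5.1), a first-degree ordinary polynomial $\Psi$, and OLS in Step 3; the parameter values are illustrative and not tuned, so convergence may require adjusting the damping parameter.

```matlab
% Minimal GSSA Stage 1 sketch: one-node Monte Carlo integration,
% degree-1 polynomial Psi(k,a;b) = b1 + b2*k + b3*a, OLS in Step 3.
alpha = 0.36; beta = 0.99; delta = 0.025; gam = 1;    % illustrative parameters
rho = 0.95; sigma = 0.01; T = 10000; xi = 0.1; varpi = 1e-9;
a = ones(T+1,1);
for t = 1:T, a(t+1) = a(t)^rho*exp(sigma*randn); end  % productivity process (4)
b = [0; 0.95; 0.05];                                  % initial guess b^(1)
k = ones(T+1,1); dk = inf;
while dk > varpi
    k_old = k;
    for t = 1:T                                       % Step 1: simulate series
        k(t+1) = [1 k(t) a(t)]*b;
    end
    c = (1-delta)*k(1:T) + a(1:T).*k(1:T).^alpha - k(2:T+1);
    y = beta*(c(2:T)./c(1:T-1)).^(-gam) ...           % Step 2: MC(1) approx. of (8)
        .*(1-delta+alpha*a(2:T).*k(2:T).^(alpha-1)).*k(2:T);
    X = [ones(T-1,1) k(1:T-1) a(1:T-1)];              % regressors Psi_i(k_t,a_t)
    bhat = X\y;                                       % Step 3: OLS for (9)
    dk = mean(abs((k - k_old)./k));                   % Step 4: criterion (10)
    b = (1-xi)*b + xi*bhat;                           % Step 5: damping (11)
end
```

Under the deterministic integration methods of Section 5, only the line computing $y$ changes: the single realization is replaced by the weighted sum (8) over quadrature or monomial nodes.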
2.3 Discussion

GSSA relies on generalized notions of integration and approximation. First, in Step 2, the formula (8) represents both Monte Carlo integration methods and deterministic integration methods such as the Gauss–Hermite quadrature and monomial methods. The choice of integration method is critical for the accuracy of GSSA and is analyzed in Section 5. Second, explanatory variables in the regression equation (9) are often highly collinear, which presents challenges to approximation methods. GSSA uses methods that are suitable for dealing with collinear data, namely, the least-squares methods using SVD and Tikhonov regularization, least-absolute deviations methods, and the principal component regression method. The choice of approximation method is critical for numerical stability of GSSA and is studied in Section 4.

GSSA is compatible with any functional form for $\Psi$ that is suitable for approximating policy functions. In this paper, we examine $\Psi$ of the form

$$\Psi(k_t, a_t; b) = \sum_{i=0}^{n} b_i\Psi_i(k_t, a_t) \qquad (13)$$
for a set of basis functions $\{\Psi_i \mid i = 0,\ldots,n\}$, where $b \equiv (b_0, b_1, \ldots, b_n)^\top \in \mathbb{R}^{n+1}$. In Appendix A, we examine cases where the coefficients $b$ enter $\Psi$ in a nonlinear manner, and we describe nonlinear approximation methods suitable for dealing with collinear data.

The specification (13) implies that in Step 3, the regression equation is linear,

$$y = Ab + \varepsilon, \qquad (14)$$

where $y \equiv (y_0, y_1, \ldots, y_{T-1})^\top \in \mathbb{R}^T$; $A \equiv [1_T, x_1, \ldots, x_n] \in \mathbb{R}^{T\times(n+1)}$ with $1_T$ being a $T\times 1$ vector whose entries are equal to 1 and $x_{t,i} \equiv \Psi_i(k_t, a_t)$ for $i = 1,\ldots,n$; and $\varepsilon \equiv (\varepsilon_0, \varepsilon_1, \ldots, \varepsilon_{T-1})^\top \in \mathbb{R}^T$. (Note that $1_T$ in $A$ means that $\Psi_0(k_t, a_t) = 1$ for all $t$.) The choice of a family of basis functions used to construct $A$ can affect the numerical stability of GSSA. In Section 4.5.1, we consider families of ordinary and Hermite polynomials.^9

The fixed-point iteration method in Step 5 is a simple derivative-free method for finding a fixed point and is commonly used in the related literature. The advantage of this method is that its cost does not considerably increase with the dimensionality of the problem. The shortcoming is that its convergence is not guaranteed. One typically needs to set the damping parameter $\xi$ in (11) at a value much less than 1 to attain convergence (this, however, slows down the speed of convergence). We were always able to find a value for $\xi$ that gave us convergence.^10

Finally, our convergence criterion (10) looks at the difference between the time series from two iterations. We do not focus on changes in $b$ since we are interested in the function $K(k_t, a_t)$ and not in its representation in some basis. The regression coefficients $b$ have no economic meaning. The criterion (10) focuses on the economic differences implied by different vectors $b$.
2.4 Relation to the literature

GSSA builds on the past literature on solving rational expectations models but uses a different combination of familiar tools. GSSA differs from conventional deterministic-grid methods in the choice of a solution domain: we solve the model on a relatively small ergodic set instead of some, generally much larger, prespecified domains used, for example, in parameterized expectations approaches (PEA) of Wright and Williams (1984) and Miranda and Helmberger (1988) and projection algorithms of Judd (1992), Christiano and Fisher (2000), and Krueger and Kubler (2004).^11 An ergodic-set domain makes GSSA tractable in high-dimensional applications; see condition (1).^12

To construct the ergodic set realized in equilibrium, GSSA uses stochastic simulation. This approach is taken in Marcet's (1988) simulation-based version of PEA used in, for example, Den Haan and Marcet (1990), Marcet and Lorenzoni (1999), and Maliar and Maliar (2003a). We differ from this literature in the following respects: We incorporate accurate deterministic integration methods, while the above literature uses a Monte Carlo integration method, whose accuracy is limited. Furthermore, we rely on a variety of numerically stable approximation methods, while the simulation-based version of PEA relies on standard least-squares methods, which are numerically unstable in the given context.^13 In addition, GSSA differs from the literature in the use of a linear regression model that can be estimated with simple and reliable approximation methods.^14 Unlike previous simulation-based methods, GSSA delivers high-degree polynomial approximations and attains accuracy comparable to the best accuracy attained in the literature.

9. GSSA can also use nonpolynomial families of functions. Examples of nonpolynomial basis functions are trigonometric functions, step functions, neural networks, and combinations of polynomials with functions from other families.
10. Other iterative schemes for finding fixed-point coefficients are time iteration and quasi-Newton methods; see Judd (1998, pp. 553–558 and 103–119, respectively). Time iteration can be more stable than fixed-point iteration; however, it requires solving costly nonlinear equations to find future values of variables. Quasi-Newton methods can be faster and can help achieve convergence if fixed-point iteration does not converge. A stable version of a quasi-Newton method for a stochastic simulation approach requires a good initial condition and the use of line-search methods. Since derivatives are evaluated via simulation, an explosive or implosive simulated series can make a Jacobian matrix ill-conditioned and lead to nonconvergence; we had this problem in some of our experiments.
3. Ill-conditioned LS problems

In this section, we discuss the stability issues that arise when standard least-squares (LS) methods are used in the regression equation (14). The LS approach to the regression equation (14) solves the problem

$$\min_b \|y - Ab\|_2^2 = \min_b (y - Ab)^\top(y - Ab), \qquad (15)$$

where $\|\cdot\|_2$ denotes the $L_2$ vector norm. The solution to (15) is

$$\hat{b} = (A^\top A)^{-1}A^\top y. \qquad (16)$$

The LS problem (15) is often ill-conditioned when $A$ is generated by stochastic simulation. The degree of ill-conditioning is measured by the condition number of the matrix $A^\top A$, denoted by $K(A^\top A)$. Let us order the eigenvalues $\lambda_i$, $i = 1,\ldots,n$, of $A^\top A$ by their magnitude, $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n \geq 0$. The condition number of $A^\top A$ is equal to the ratio of its largest eigenvalue, $\lambda_1$, to its smallest eigenvalue, $\lambda_n$, that is, $K(A^\top A) \equiv \lambda_1/\lambda_n$. The eigenvalues of $A^\top A$ are defined by the standard eigenvalue decomposition $A^\top A = V\Lambda V^\top$, where $\Lambda \in \mathbb{R}^{n\times n}$ is a diagonal matrix of eigenvalues of $A^\top A$ and $V \in \mathbb{R}^{n\times n}$ is an orthogonal matrix of eigenvectors of $A^\top A$. A large condition number implies that $A^\top A$ is close to being singular and not invertible, and tells us that any linear operation, such as (16), is very sensitive to perturbation and numerical errors (such as round-off errors).

11. Krueger and Kubler's (2004) method relies on a nonproduct Smolyak grid constructed in a multidimensional hypercube. This construction reduces the number of grid points inside the hypercube domain but not the size of the domain itself. Other methods using prespecified nonproduct grids are Malin, Krueger, and Kubler (2011) and Pichler (2011).
12. Judd, Maliar, and Maliar (2010) and Maliar, Maliar, and Judd (2011) developed a projection method that operates on the ergodic set. The grid surrounding the ergodic set is constructed using clustering methods.
13. Concerning the simulation-based PEA, Den Haan and Marcet (1990) reported that, even for a low (second-degree) polynomial, cross terms are highly correlated with the other terms and must be removed from the regression. See Judd (1992) and Christiano and Fisher (2000) for a discussion.
14. The simulation-based PEA literature employs an exponentiated polynomial specification $\Psi(k_t, a_t; b) = \exp(b_0 + b_1\ln k_t + b_2\ln a_t + \cdots)$. The resulting nonlinear regression model is estimated with nonlinear least-squares (NLLS) methods. The use of NLLS methods is an additional source of numerical problems, because such methods typically need a good initial guess, may deliver multiple minima, and on many occasions fail to converge. Moreover, nonlinear optimization is costly because it requires computing Jacobian and Hessian matrices; see Christiano and Fisher (2000) for a discussion.
Two causes of ill-conditioning are multicollinearity and poor scaling of the variables that constitute $A$. Multicollinearity occurs when the variables that form $A$ are significantly correlated. The following example illustrates the effects of multicollinearity on the LS solution (we analyze the sensitivity to changes in $y$, but the results are similar for the sensitivity to changes in $A$).

Example 1. Let $A = \begin{pmatrix} 1+\phi & 1 \\ 1 & 1+\phi \end{pmatrix}$ with $\phi \neq 0$. Then $K(A^\top A) = (1 + 2/\phi)^2$. Let $y = (0,0)^\top$. Thus, the ordinary least-squares (OLS) solution (16) is $(\hat{b}_1, \hat{b}_2) = (0,0)$. Suppose $y$ is perturbed by a small amount, that is, $y = (\varepsilon_1, \varepsilon_2)^\top$. Then the OLS solution is

$$\hat{b}_1 = \frac{\varepsilon_1(1+\phi) - \varepsilon_2}{\phi(2+\phi)} \quad\text{and}\quad \hat{b}_2 = \frac{\varepsilon_2(1+\phi) - \varepsilon_1}{\phi(2+\phi)}. \qquad (17)$$

Sensitivity of $\hat{b}_1$ and $\hat{b}_2$ to perturbation in $y$ is proportional to $1/\phi$ (it increases with $K(A^\top A)$).

The scaling problem arises when the columns (the variables) of $A$ have significantly different means and variances (due to differential scaling among either the state variables, $k_t$ and $a_t$, or their functions, for example, $k_t$ and $k_t^5$). A column with only very small entries is treated as if it were a column of zeros. The next example illustrates the effect of the scaling problem.

Example 2. Let $A = \begin{pmatrix} 1 & 0 \\ 0 & \phi \end{pmatrix}$ with $\phi \neq 0$. Then $K(A^\top A) = 1/\phi^2$. Let $y = (0,0)^\top$. Thus, the OLS solution (16) is $(\hat{b}_1, \hat{b}_2) = (0,0)$. Suppose $y$ is perturbed by a small amount, that is, $y = (\varepsilon_1, \varepsilon_2)^\top$. The OLS solution is

$$\hat{b}_1 = \varepsilon_1 \quad\text{and}\quad \hat{b}_2 = \frac{\varepsilon_2}{\phi}. \qquad (18)$$

Sensitivity of $\hat{b}_2$ to perturbation in $y$ is proportional to $1/\phi$ (it increases with $K(A^\top A)$).

A comparison of Examples 1 and 2 shows that multicollinearity and poor scaling magnify the impact of perturbations on the OLS solution. Each iteration of a stochastic simulation algorithm produces changes in simulated data (perturbations). In the presence of ill-conditioning, these changes together with numerical errors may induce large and erratic jumps in the regression coefficients and failures to converge.
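These effects are easy to reproduce. The fragment below is an illustration of ours (not from the paper's code): it builds a matrix $A$ of ordinary polynomial terms of a persistent simulated series and reports $K(A^\top A)$, which is typically enormous even at degree 5.

```matlab
% Near-multicollinearity of ordinary polynomial terms constructed
% on simulated data: the condition number K(A'A) is typically huge.
rng(0); T = 10000; rho = 0.95; sigma = 0.01;
lna = zeros(T,1);
for t = 1:T-1, lna(t+1) = rho*lna(t) + sigma*randn; end
a = exp(lna);                            % simulated productivity series
A = [ones(T,1) a a.^2 a.^3 a.^4 a.^5];   % degree-5 ordinary polynomial terms
disp(cond(A'*A))                         % condition number K(A'A)
```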
4. Enhancing numerical stability

We need to make choices of approximation methods that ensure numerical stability of GSSA. We face two challenges: first, we must solve the approximation step for any given set of simulation data; second, we must attain the convergence of the iterations over $b$. The stability of the iterations over $b$ depends on the sensitivity of the regression coefficients to the data (each iteration of GSSA produces different time series; if the coefficients are too sensitive, this results in large changes in successive values of $b$ and in nonconvergence). In this section, we present approximation methods that can handle collinear data, namely, a LS method using a singular value decomposition (SVD) and a least-absolute deviations (LAD) method. Furthermore, we describe regularization methods that not only can deal with ill-conditioned data, but can also dampen movements in $b$ by effectively penalizing large values of the regression coefficients. Such methods are a LS method using Tikhonov regularization, LAD regularization methods, and the principal component regression method. We finally analyze other factors that can affect the numerical stability of GSSA, namely, data normalization, the choice of a family of basis functions, and the choice of policy functions to parameterize.
4.1 Normalizing the variables

Data normalization addresses the scaling issues highlighted in Example 2. Also, our regularization methods require the use of normalized data. We center and scale both the response variable $y$ and the explanatory variables of $A$ to have a zero mean and unit standard deviation. We then estimate a regression model without an intercept to obtain the vector of coefficients $(\hat{b}_1^+, \ldots, \hat{b}_n^+)$. We finally restore the coefficients $\hat{b}_1, \ldots, \hat{b}_n$ and the intercept $\hat{b}_0$ in the original (unnormalized) regression model according to $\hat{b}_i = (\sigma_y/\sigma_{x_i})\hat{b}_i^+$, $i = 1,\ldots,n$, and $\hat{b}_0 = \bar{y} - \sum_{i=1}^{n}\hat{b}_i\bar{x}_i$, where $\bar{y}$ and $\bar{x}_i$ are the sample means, and $\sigma_y$ and $\sigma_{x_i}$ are the sample standard deviations of the original unnormalized variables $y$ and $x_i$, respectively.^15

15. To maintain a simple system of notation, we do not introduce separate notation for normalized and unnormalized variables. Instead, we remember that when the regression model is estimated with normalized variables, we have $b \in \mathbb{R}^n$, and when it is estimated with unnormalized variables, we have $b \in \mathbb{R}^{n+1}$.
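A minimal MATLAB sketch of this normalization is given below (ours; the backslash solver merely stands in for any of the estimators discussed in this section):

```matlab
% Center and scale y and the columns of A, estimate without an
% intercept, then restore the unnormalized coefficients (Section 4.1).
function [b0, b] = normalized_regression(A, y)   % A: T-by-n, no intercept column
    xbar = mean(A); sx = std(A);
    ybar = mean(y); sy = std(y);
    bplus = ((A - xbar)./sx)\((y - ybar)/sy);    % no-intercept regression
    b  = (sy./sx').*bplus;                       % slopes b_1,...,b_n
    b0 = ybar - xbar*b;                          % intercept b_0
end
```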
4.2 LS approaches

In this section, we present two LS approaches that are more numerically stable than the standard OLS approach. The first approach, called LS using SVD (LS-SVD), uses a singular value decomposition (SVD) of $A$. The second approach, called regularized LS using Tikhonov regularization (RLS-Tikhonov), imposes penalties based on the size of the regression coefficients. In essence, the LS-SVD approach finds a solution to the original ill-conditioned LS problem, while the RLS-Tikhonov approach modifies (regularizes) the original ill-conditioned LS problem into a less ill-conditioned problem.
4.2.1 LS-SVD. We can use the SVD of $A$ to rewrite the OLS solution (16) in a way that does not require an explicit computation of $(A^\top A)^{-1}$. For a matrix $A \in \mathbb{R}^{T\times n}$ with $T > n$, an SVD decomposition is

$$A = USV^\top, \qquad (19)$$

where $U \in \mathbb{R}^{T\times n}$ and $V \in \mathbb{R}^{n\times n}$ are orthogonal matrices, and $S \in \mathbb{R}^{n\times n}$ is a diagonal matrix with diagonal entries $s_1 \geq s_2 \geq \cdots \geq s_n \geq 0$, known as singular values of $A$.^16 The condition number of $A$ is its largest singular value divided by its smallest singular value, $K(A) = s_1/s_n$. The singular values of $A$ are related to the eigenvalues of $A^\top A$ by $s_i = \sqrt{\lambda_i}$; see, for example, Golub and Van Loan (1996, p. 448). This implies that $K(A) = K(S) = \sqrt{K(A^\top A)}$. The OLS estimator $\hat{b} = (A^\top A)^{-1}A^\top y$ in terms of the SVD (19) is

$$\hat{b} = VS^{-1}U^\top y. \qquad (20)$$

With an infinite-precision computer, the OLS formula (16) and the LS-SVD formula (20) give identical estimates of $b$. With a finite-precision computer, the standard OLS estimator cannot be computed reliably if $A^\top A$ is ill-conditioned. However, it is still possible that $A$ and $S$ are sufficiently well-conditioned so that the LS-SVD estimator can be computed successfully.^17

16. For a description of methods for computing the SVD of a matrix, see, for example, Golub and Van Loan (1996, pp. 448–460). Routines that compute the SVD are readily available in modern programming languages.
17. Another decomposition of $A$ that leads to a numerically stable LS approach is a QR factorization; see, for example, Davidson and MacKinnon (1993, pp. 30–31) and Golub and Van Loan (1996, p. 239).
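In MATLAB, the LS-SVD estimator (20) takes two lines (our sketch):

```matlab
% LS-SVD: compute (20) without forming (A'A)^{-1}.
function b = ls_svd(A, y)
    [U, S, V] = svd(A, 0);    % thin SVD (19): A = U*S*V'
    b = V*(S\(U'*y));         % b = V * S^{-1} * U' * y
end
```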
4.2.2 RLS-Tikhonov. A regularization method replaces an ill-conditioned problem with a well-conditioned problem that gives a similar answer. Tikhonov regularization is commonly used to solve ill-conditioned problems. In statistics, this method is known as ridge regression and it is classified as a shrinkage method because it shrinks the norm of the estimated coefficient vector relative to the nonregularized solution. Formally, Tikhonov regularization imposes an $L_2$ penalty on the magnitude of the regression-coefficient vector; that is, for a regularization parameter $\eta \geq 0$, the vector $b(\eta)$ solves

$$\min_b \|y - Ab\|_2^2 + \eta\|b\|_2^2 = \min_b (y - Ab)^\top(y - Ab) + \eta b^\top b, \qquad (21)$$

where $y \in \mathbb{R}^T$ and $A \in \mathbb{R}^{T\times n}$ are centered and scaled, and $b \in \mathbb{R}^n$. The parameter $\eta$ controls the amount by which the regression coefficients shrink; larger values of $\eta$ lead to greater shrinkage.

Note that the scale of an explanatory variable affects the size of the regression coefficient on this variable and, hence, it affects how much this coefficient is penalized. Normalizing all explanatory variables $x_i$ to zero mean and unit standard deviation allows us to use the same penalty for all coefficients. Furthermore, centering the response variable $y$ leads to a no-intercept regression model and, thus, allows us to impose a penalty
on the coefficients $b_1, \ldots, b_n$ without distorting the intercept $b_0$ (the latter is recovered after all other coefficients are computed; see Section 4.1).

Finding the first-order condition of (21) with respect to $b$ gives us the estimator

$$\hat{b}(\eta) = (A^\top A + \eta I_n)^{-1}A^\top y, \qquad (22)$$

where $I_n$ is an identity matrix of order $n$. Note that Tikhonov regularization adds a positive constant multiple of the identity matrix to $A^\top A$ prior to inversion. Thus, if $A^\top A$ is nearly singular, the matrix $A^\top A + \eta I_n$ is less singular, reducing problems in computing $\hat{b}(\eta)$. Note that $\hat{b}(\eta)$ is a biased estimator of $b$. As $\eta$ increases, the bias of $\hat{b}(\eta)$ increases and its variance decreases. Hoerl and Kennard (1970) showed that there exists a value of $\eta$ such that

$$E\big[(\hat{b}(\eta) - b)^\top(\hat{b}(\eta) - b)\big] < E\big[(\hat{b} - b)^\top(\hat{b} - b)\big],$$

that is, the mean squared error (equal to the sum of the variance and the squared bias) of the Tikhonov-regularization estimator, $\hat{b}(\eta)$, is smaller than that of the OLS estimator, $\hat{b}$. Two main approaches to finding an appropriate value of the regularization parameter in statistics are ridge trace and cross-validation. The ridge-trace approach relies on a stability criterion: we observe a plot showing how $\hat{b}(\eta)$ changes with $\eta$ (the ridge trace) and select the smallest value of $\eta$ for which $\hat{b}(\eta)$ is stable. The cross-validation approach focuses on a statistical-fit criterion. We split the data into two parts, fix some $\eta$, compute an estimate $\hat{b}(\eta)$ using one part of the data, and evaluate the fit of the regression (i.e., validate the regression model) using the other part of the data. We then iterate on $\eta$ to maximize the fit. For a detailed discussion of the ridge-trace and cross-validation approaches used in statistics, see, for example, Brown (1993, pp. 62–71).

The problem of finding an appropriate value of $\eta$ for GSSA differs from that in statistics in two respects: First, in Stage 1, our data are not fixed and not exogenous to the regularization process: on each iteration, simulated series are recomputed using a policy function that was obtained in the previous iteration under some value of the regularization parameter. Second, our criteria of stability and accuracy differ from those in statistics. Namely, our criterion of stability is the convergence of the fixed-point iteration in Stage 1, and our criterion of fit is the accuracy of the converged solution measured by the size of the Euler equation errors in Stage 2. In Section 6.1, we discuss how we chose the regularization parameter for the RLS-Tikhonov method (as well as for other regularization methods presented below) in the context of GSSA.
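A direct implementation of the estimator (22) on centered and scaled data is immediate (our sketch):

```matlab
% RLS-Tikhonov: ridge estimator (22); eta >= 0 is the regularization
% parameter, chosen in GSSA by the criteria described above.
function b = rls_tikhonov(A, y, eta)
    n = size(A, 2);
    b = (A'*A + eta*eye(n))\(A'*y);
end
```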
4.3 LAD approaches

LAD, or $L_1$, regression methods use linear programming to minimize the sum of absolute deviations. LAD methods do not depend on $(A^\top A)^{-1}$ and avoid the ill-conditioning problems of LS methods. Section 4.3.1 develops primal and dual formulations of the LAD problem, and Section 4.3.2 proposes regularized versions of both. Section 4.3.3 discusses the advantages and drawbacks of the LAD approaches.
4.3.1 LAD. The basic LAD method solves the optimization problem

$$\min_b \|y - Ab\|_1 = \min_b 1_T^\top|y - Ab|, \qquad (23)$$

where $\|\cdot\|_1$ denotes the $L_1$ vector norm and $|\cdot|$ denotes the absolute value.^18 Without a loss of generality, we assume that $A$ and $y$ are centered and scaled.

There is no explicit solution to the LAD problem (23), but this problem is equivalent to the linear-programming problem

$$\min_{g,b}\ 1_T^\top g \qquad (24)$$
$$\text{s.t.}\quad -g \leq y - Ab \leq g, \qquad (25)$$

where $g \in \mathbb{R}^T$. The problem has $n + T$ unknowns. Although this formulation of the LAD problem is intuitive, it is not the most suitable for a numerical analysis.

LAD: Primal problem (LAD-PP). Charnes, Cooper, and Ferguson (1955) showed that a linear LAD problem can be transformed into a canonical linear programming form. They expressed the deviation for each observation as a difference between two nonnegative variables $\upsilon_t^+$ and $\upsilon_t^-$, as in

$$y_t - \sum_{i=0}^{n} b_i x_{t,i} = \upsilon_t^+ - \upsilon_t^-, \qquad (26)$$

where $x_{t,i}$ is the $t$th element of the vector $x_i$. The variables $\upsilon_t^+$ and $\upsilon_t^-$ represent the magnitude of the deviations above and below the fitted line $\hat{y}_t = A_t\hat{b}$, respectively. The sum $\upsilon_t^+ + \upsilon_t^-$ is the absolute deviation between the fit $\hat{y}_t$ and the observation $y_t$. Thus, the LAD problem is to minimize the total sum of absolute deviations subject to the system of equations (26). In vector notation, this problem is

$$\min_{\upsilon^+,\upsilon^-,b}\ 1_T^\top\upsilon^+ + 1_T^\top\upsilon^- \qquad (27)$$
$$\text{s.t.}\quad \upsilon^+ - \upsilon^- + Ab = y, \qquad (28)$$
$$\upsilon^+ \geq 0, \quad \upsilon^- \geq 0, \qquad (29)$$

where $\upsilon^+, \upsilon^- \in \mathbb{R}^T$. This is called the primal problem. A noteworthy property of its solution is that $\upsilon_t^+$ and $\upsilon_t^-$ cannot both be strictly positive at a solution; if they were, we could reduce both $\upsilon_t^+$ and $\upsilon_t^-$ by the same quantity and reduce the value of the objective function without affecting the constraint (28). The advantage of (27)–(29) compared to (24) and (25) is that the only inequality constraints in the former problem are the variable bounds (29), a feature that often helps make linear programming algorithms more efficient.

18. LAD regression is a particular case of the quantile regressions introduced by Koenker and Bassett (1978). The central idea behind quantile regressions is the assignation of differing weights to positive versus negative residuals, $y - Ab$. A $\theta$th regression quantile, $\theta \in (0,1)$, is defined as a solution to the problem of minimizing a weighted sum of residuals, where $\theta$ is a weight on positive residuals. The LAD estimator is the regression median, that is, the regression quantile for $\theta = 1/2$.
LAD: Dual problem (LAD-DP). Linear programming tells us that every primal problem can be converted into a dual problem.^19 The dual problem corresponding to (27)–(29) is

$$\max_q\ y^\top q \qquad (30)$$
$$\text{s.t.}\quad A^\top q = 0, \qquad (31)$$
$$-1_T \leq q \leq 1_T, \qquad (32)$$

where $q \in \mathbb{R}^T$ is a vector of unknowns. Wagner (1959) argued that if the number of observations $T$ is sizable (i.e., $T \gg n$), the dual problem (30)–(32) is computationally less cumbersome than the primal problem (27)–(29). Indeed, the dual problem contains only $n$ equality restrictions, while the primal problem contains $T$ equality restrictions, and the number of lower and upper bounds on unknowns is equal to $2T$ in both problems. The elements of the vector $\hat{b}$, which is what we want to compute, are equal to the Lagrange multipliers associated with the equality restrictions given in (31).

19. See Ferris, Mangasarian, and Wright (2007) for duality theory and examples.
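For illustration, the primal problem (27)–(29) maps directly into a canonical linear-programming solver. The sketch below is ours (not the authors' code); it uses MATLAB's linprog from the Optimization Toolbox, with the unknowns stacked as $(\upsilon^+, \upsilon^-, b)$:

```matlab
% LAD-PP: solve (27)-(29); unknowns z = [ups_plus; ups_minus; b].
function b = lad_pp(A, y)
    [T, n] = size(A);
    f   = [ones(2*T,1); zeros(n,1)];     % objective 1'ups+ + 1'ups-
    Aeq = [eye(T), -eye(T), A];          % equality restriction (28)
    lb  = [zeros(2*T,1); -inf(n,1)];     % bounds (29); b is free
    z   = linprog(f, [], [], Aeq, y, lb, []);
    b   = z(2*T+1:end);
end
```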
4.3.2 Regularized LAD (RLAD). We next modify the original LAD problem (23) to incorporate an $L_1$ penalty on the coefficient vector $b$. We refer to the resulting problem as a regularized LAD (RLAD). Like Tikhonov regularization, our RLAD problem shrinks the values of the coefficients toward zero. Introducing an $L_1$ penalty in place of the $L_2$ penalty from Tikhonov regularization allows us to have the benefits of biasing coefficients to zero but to do so with linear programming. Formally, for a given regularization parameter $\eta \geq 0$, the RLAD problem attempts to find the vector $b(\eta)$ that solves

$$\min_b \|y - Ab\|_1 + \eta\|b\|_1 = \min_b 1_T^\top|y - Ab| + \eta 1_n^\top|b|, \qquad (33)$$

where $y \in \mathbb{R}^T$ and $A \in \mathbb{R}^{T\times n}$ are centered and scaled, and $b \in \mathbb{R}^n$. As in the case of Tikhonov regularization, centering and scaling of $A$ and $y$ in the RLAD problem (33) allow us to use the same penalty parameter for all explanatory variables and to avoid penalizing an intercept. Below, we develop a linear programming formulation of the RLAD problem in which an absolute value term $|b_i|$ is replaced with a difference between two nonnegative variables. Our approach is parallel to the one we used to construct the primal problem (27)–(29) and differs from the approach used in statistics.^20

RLAD: Primal problem (RLAD-PP). To cast the RLAD problem (33) into a canonical linear programming form, we represent the coefficients of the vector $b$ as $b_i = \varphi_i^+ - \varphi_i^-$, with $\varphi_i^+ \geq 0$, $\varphi_i^- \geq 0$ for $i = 1,\ldots,n$. The regularization is done by adding to the objective a penalty linear in each $\varphi_i^+$ and $\varphi_i^-$. The resulting regularized version of the primal problem (27)–(29) is

$$\min_{\upsilon^+,\upsilon^-,\varphi^+,\varphi^-}\ 1_T^\top\upsilon^+ + 1_T^\top\upsilon^- + \eta 1_n^\top\varphi^+ + \eta 1_n^\top\varphi^- \qquad (34)$$
$$\text{s.t.}\quad \upsilon^+ - \upsilon^- + A\varphi^+ - A\varphi^- = y, \qquad (35)$$
$$\upsilon^+ \geq 0, \quad \upsilon^- \geq 0, \qquad (36)$$
$$\varphi^+ \geq 0, \quad \varphi^- \geq 0, \qquad (37)$$

where $\varphi^+, \varphi^- \in \mathbb{R}^n$ are vectors that define $b(\eta)$. The above problem has $2T + 2n$ unknowns, as well as $T$ equality restrictions (35) and $2T + 2n$ lower bounds (36) and (37).

20. Wang, Gordon, and Zhu (2006) constructed a RLAD problem in which $|b_i|$ is represented as $\operatorname{sign}(b_i)b_i$.
RLAD: Dual problem (RLAD-DP). The dual problem corresponding to the RLAD-PP (34)–(37) is

$$\max_q\ y^\top q \qquad (38)$$
$$\text{s.t.}\quad A^\top q \leq \eta 1_n, \qquad (39)$$
$$-A^\top q \leq \eta 1_n, \qquad (40)$$
$$-1_T \leq q \leq 1_T, \qquad (41)$$

where $q \in \mathbb{R}^T$ is a vector of unknowns. Here, $2n$ linear inequality restrictions are imposed by (39) and (40), and $2T$ lower and upper bounds on the $T$ unknown components of $q$ are given in (41). By solving the dual problem, we obtain the coefficients of the vectors $\varphi^+$ and $\varphi^-$ as the Lagrange multipliers associated with (39) and (40), respectively; we can then restore the RLAD estimator using $\hat{b}(\eta) = \hat{\varphi}^+ - \hat{\varphi}^-$.
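In the same spirit as the LAD-PP sketch above, the RLAD primal problem (34)–(37) can be passed to linprog (our illustration, not the authors' code):

```matlab
% RLAD-PP: solve (34)-(37); unknowns z = [ups+; ups-; phi+; phi-]
% and the estimator is b(eta) = phi_plus - phi_minus.
function b = rlad_pp(A, y, eta)
    [T, n] = size(A);
    f   = [ones(2*T,1); eta*ones(2*n,1)];   % objective (34)
    Aeq = [eye(T), -eye(T), A, -A];         % equality restriction (35)
    lb  = zeros(2*T+2*n, 1);                % lower bounds (36)-(37)
    z   = linprog(f, [], [], Aeq, y, lb, []);
    b   = z(2*T+1:2*T+n) - z(2*T+n+1:end);
end
```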
4.3.3 Advantages and drawbacks of LAD approaches. LAD approaches are more robust to outliers than LS approaches because they minimize errors without squaring them and, thus, place comparatively less weight on distant observations than LS approaches do. LAD approaches have two advantages compared to LS approaches. First, the statistical literature suggests that LAD estimators are preferable if regression disturbances are nonnormal, asymmetric, or heavy-tailed; see Narula and Wellington (1982) and Dielman (2005) for surveys. Second, LAD methods can easily accommodate additional linear restrictions on the regression coefficients, for example, restrictions that impose monotonicity of policy functions. In contrast, adding such constraints for LS methods changes an unconstrained convex minimization problem into a linearly constrained convex minimization problem and substantially increases the computational difficulty.

LAD approaches have two drawbacks compared to the LS approaches. First, a LAD estimator does not depend smoothly on the data; since it corresponds to the median, the minimal sum of absolute deviations is not differentiable in the data. Moreover, a LAD regression line may not even be continuous in the data: a change in the data could cause a solution switch from one vertex of the feasible set of coefficients to another vertex. This jump will cause a discontinuous change in the regression line, which in turn will produce a discontinuous change in the simulated path. These jumps might create problems in solving for a fixed point. Second, LAD approaches require solving linear-programming problems, whereas LS approaches use only linear algebra operations. Therefore, LAD approaches tend to be more costly than LS approaches.
4.4 Principal component (truncated SVD) method

In this section, we describe a principal component method that reduces the multicollinearity in the data to a target level. Let $A \in \mathbb{R}^{T\times n}$ be a matrix of centered and scaled explanatory variables, and consider the SVD of $A$ defined in (19). Let us make a linear transformation of $A$ using $Z \equiv AV$, where $Z \in \mathbb{R}^{T\times n}$ and $V \in \mathbb{R}^{n\times n}$ is the matrix of singular vectors of $A$ defined by (19). The vectors $z_1, \ldots, z_n$ are called principal components of $A$. They are orthogonal, $z_{i'}^\top z_i = 0$ for any $i' \neq i$, and their norms are related to the singular values $s_i$ by $z_i^\top z_i = s_i^2$. Principal components have two noteworthy properties. First, the sample mean of each principal component $z_i$ is equal to zero, since it is given by a linear combination of centered variables $x_1, \ldots, x_n$, each of which has a zero mean; second, the variance of each principal component is equal to $s_i^2/T$, because we have $z_i^\top z_i = s_i^2$.

Since the SVD method orders the singular values from the largest, the first principal component $z_1$ has the largest sample variance among all the principal components, while the last principal component $z_n$ has the smallest sample variance. In particular, if $z_i$ has a zero variance (equivalently, a zero singular value, $s_i = 0$), then all entries of $z_i$ are equal to zero, $z_i = (0, \ldots, 0)^\top$, which implies that the variables $x_1, \ldots, x_n$ that constitute this particular principal component are linearly dependent. Therefore, we can reduce the degrees of ill-conditioning of $A$ to some target level by excluding low-variance principal components that correspond to small singular values.

To formalize the above idea, let $\kappa$ represent the largest condition number of $A$ that we are willing to tolerate. Let us compute the ratios of the largest singular value to all other singular values, $\frac{s_1}{s_2}, \ldots, \frac{s_1}{s_n}$. (Recall that the last ratio is the actual condition number of the matrix $A$; $K(A) = K(S) = \frac{s_1}{s_n}$.) Let $Z^r \equiv (z_1, \ldots, z_r) \in \mathbb{R}^{T\times r}$ be the first $r$ principal components for which $\frac{s_1}{s_i} \leq \kappa$, and let us remove the last $n - r$ principal components for which $\frac{s_1}{s_i} > \kappa$. By construction, the matrix $Z^r$ has a condition number which is smaller than or equal to $\kappa$.

Let us consider the regression equation (14) and let us approximate $Ab$ using $Z^r$ such that $Ab = AVV^\top b \approx AV^r(V^r)^\top b = Z^r\vartheta^r$, where $V^r \equiv (v_1, \ldots, v_r) \in \mathbb{R}^{n\times r}$ contains the first $r$ singular vectors of $A$ and $\vartheta^r \equiv (V^r)^\top b \in \mathbb{R}^r$. The resulting regression equation is

$$y = Z^r\vartheta^r + \varepsilon, \qquad (42)$$

where $y$ is centered and scaled. The coefficients $\vartheta^r$ can be estimated by any of the methods described in Sections 4.2 and 4.3. For example, we can compute the OLS estimator (16). Once we compute $\hat{\vartheta}^r$, we can recover the coefficients $\hat{b} = V^r\hat{\vartheta}^r \in \mathbb{R}^n$.
We can remove collinear components of the data using a truncated SVD method instead of the principal component method. Let the matrix $A^r \in \mathbb{R}^{T\times n}$ be defined by a truncated SVD of $A$, such that $A^r \equiv U^rS^r(V^r)^\top$, where $U^r \in \mathbb{R}^{T\times r}$ and $V^r \in \mathbb{R}^{n\times r}$ are the first $r$ columns of $U$ and $V$, respectively, and $S^r \in \mathbb{R}^{r\times r}$ is a diagonal matrix whose entries are the $r$ largest singular values of $A$. As follows from the theorem of Eckart and Young (1936), $A^r$ is the closest rank $r$ approximation of $A \in \mathbb{R}^{T\times n}$. In terms of $A^r$, the regression equation is $y = A^rb(r) + \varepsilon$. Using the definition of $A^r$, we can write $A^rb(r) = A^rV^r(V^r)^\top b(r) = A^rV^r\vartheta^r = U^rS^r\vartheta^r$, where $\vartheta^r \equiv (V^r)^\top b(r) \in \mathbb{R}^r$. Again, we can estimate the resulting regression model $y = U^rS^r\vartheta^r + \varepsilon$ with any of the methods described in Sections 4.2 and 4.3 and recover $\hat{b}(r) = V^r\hat{\vartheta}^r \in \mathbb{R}^n$. In particular, we can find $\hat{\vartheta}^r$ using the OLS method and arrive at

$$\hat{b}(r) = V^r(S^r)^{-1}(U^r)^\top y. \qquad (43)$$

We call the estimator (43) regularized LS using truncated SVD (RLS-TSVD). If $r = n$, then RLS-TSVD coincides with LS-SVD described in Section 4.2.1.^21 The principal component and truncated SVD methods are related through $Z^r = A^rV^r$.

We make two remarks. First, the principal component regression (42) is well suited to the shrinkage type of regularization methods without additional scaling: the lower is the variance of a principal component, the larger is the corresponding regression coefficient and the more heavily such a coefficient is penalized by a regularization method. Second, we should be careful removing low-variance principal components, since they may contain important pieces of information.^22 To rule out only the case of extremely collinear variables, a safe strategy is to set $\kappa$ to a very large number, for example, to $10^{14}$ on a machine with 16 digits of precision.

21. A possible alternative to the truncated SVD is a truncated QR factorization method with pivoting of columns; see Eldén (2007, pp. 72–74). The latter method is used in MATLAB to construct a powerful backslash operator for solving linear systems of equations.
22. Hadi and Ling (1998) constructed an artificial regression example with four principal components, for which the removal of the lowest variance principal component reduces the explanatory power of the regression dramatically: $R^2$ drops from 1.00 to 0.00.
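A compact implementation of the RLS-TSVD estimator (43), with the tolerated condition number $\kappa$ as input, is given below (our sketch):

```matlab
% RLS-TSVD: estimator (43); keep the r components with s_1/s_i <= kappa.
function b = rls_tsvd(A, y, kappa)
    [U, S, V] = svd(A, 0);                   % thin SVD (19)
    s = diag(S);
    r = sum(s(1)./s <= kappa);               % number of components kept
    b = V(:,1:r)*((U(:,1:r)'*y)./s(1:r));    % b(r) = V^r (S^r)^{-1} (U^r)' y
end
```

Setting kappa very large (e.g., 1e14) reproduces the safe strategy described above; setting it to infinity recovers LS-SVD.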
4.5 Other factors that affect numerical stability

We complement our discussion by analyzing two other factors that can affect the numerical stability of GSSA: the choice of a family of basis functions and the choice of policy functions to parameterize.

4.5.1 Choosing a family of basis functions. We restrict attention to polynomial families of basis functions in (13). Let us first consider an ordinary polynomial family, $G_n(x) = x^n$, $n = 0, 1, \ldots$. The basis functions of this family look very similar (namely, $G_2(x) = x^2$ looks similar to $G_4(x) = x^4$, and $G_3(x) = x^3$ looks similar to $G_5(x) = x^5$); see Figure 2(a). As a result, the explanatory variables in the regression equation are likely to be strongly correlated (i.e., the LS problem is ill-conditioned) and estimation methods (e.g., OLS) may fail because they cannot distinguish between similarly shaped basis functions. In contrast, for families of orthogonal polynomials (e.g., Hermite, Chebyshev, Legendre), basis functions have very different shapes and, hence, the multicollinearity problem is likely to manifest itself to a smaller degree, if at all.^23 In this paper, we consider the case of Hermite polynomials. Such polynomials can be defined with a simple recursive formula: $H_0(x) = 1$, $H_1(x) = x$, and $H_{n+1}(x) = xH_n(x) - nH_{n-1}(x)$. For example, for $n = 1, \ldots, 5$, this formula yields $H_0(x) = 1$, $H_1(x) = x$, $H_2(x) = x^2 - 1$, $H_3(x) = x^3 - 3x$, $H_4(x) = x^4 - 6x^2 + 3$, and $H_5(x) = x^5 - 10x^3 + 15x$. These basis functions look different; see Figure 2(b).

Figure 2. (a) Ordinary polynomials. (b) Hermite polynomials.

23. This useful feature of orthogonal polynomials was emphasized by Judd (1992) in the context of projection methods.
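The recursion above is easy to vectorize; the following MATLAB fragment (ours) evaluates the basis $[H_0(x), \ldots, H_n(x)]$ at a vector of points:

```matlab
% Hermite basis via the recursion H_{m+1}(x) = x*H_m(x) - m*H_{m-1}(x).
function H = hermite_basis(x, n)    % x: column vector; H: [H_0 ... H_n]
    H = ones(length(x), n+1);       % H_0(x) = 1
    if n >= 1, H(:,2) = x; end      % H_1(x) = x
    for m = 1:n-1
        H(:,m+2) = x.*H(:,m+1) - m*H(:,m);
    end
end
```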
Two points are in order. First, Hermite polynomials are orthogonal under the Gaussian density function, but are not orthogonal under the ergodic measure of our simulations. Still, Hermite polynomials are far less correlated than ordinary polynomials, which may suffice to avoid ill-conditioning. Second, even though using Hermite polynomials helps us avoid ill-conditioning in one variable, it does not help us to deal with multicollinearity across variables. For example, if $k_t$ and $a_t$ happen to be perfectly correlated, certain Hermite polynomial terms for $k_t$ and $a_t$, like $H_2(k_t) = k_t^2 - 1$ and $H_2(a_t) = a_t^2 - 1$, are also perfectly correlated and, hence, $A$ is singular. Thus, we may still need regression methods that are able to treat ill-conditioned problems.^24

24. Christiano and Fisher (2000) found that multicollinearity can plague the regression step even with orthogonal (Chebyshev) polynomials as basis functions.
4.5.2 Choosing policy functions to approximate. The numerical stability of the approximation step is a necessary but not sufficient condition for the numerical stability of GSSA. It might happen that the fixed-point iteration in (11) does not converge along iterations even if the policy function is successfully approximated on each iteration. The fixed-point iteration procedure (even with damping) is sensitive to the nature of nonlinearity of solutions. There exist many logically equivalent ways to parameterize solutions, with some parameterizations working better than others. A slight change in the nonlinearity of solutions due to variations in the model's parameters might shift the balance between different parameterizations; see Judd (1998, p. 557) for an example. Switching to a different policy function to approximate can possibly help stabilize the fixed-point iteration. Instead of the capital policy function (6), we can approximate the policy function for marginal utility on the left side of the Euler equation (5), $u'(c_t) = \Psi_u(k_t, a_t; b_u)$. This parameterization is common in the literature using Marcet's (1988) simulation-based PEA (although the parameterization of capital policy functions is also used to solve models with multiple Euler equations; see, for example, Den Haan (1990)).
5. Increasing accuracy of integration

In Sections 5.1 and 5.2, we describe the Monte Carlo and deterministic integration methods, respectively. We argue that accuracy of integration plays a determinant role in the accuracy of GSSA solutions.
5.1 Monte Carlo integration

A one-node Monte Carlo integration method approximates an integral with the next-period's realization of the integrand; we call it MC(1). Setting ε_{t+1,1} ≡ ε_{t+1} and ω_{t,1} = 1 transforms (8) into

    y_t = β [u'(c_{t+1}) / u'(c_t)] [1 - δ + a_{t+1} f'(k_{t+1})] k_{t+1}.    (44)

This integration method is used in Marcet's (1988) simulation-based version of PEA.
A J-node Monte Carlo integration method, denoted by MC(J), draws J shocks, {ε_{t+1,j}}_{j=1,...,J} (which are unrelated to ε_{t+1}, the shock along the simulated path), and computes y_t in (8) by assigning equal weights to all draws, that is, ω_{t,j} = 1/J for all j and t.
An integration error is given by ϑ_t ≡ y_t - E_t[·], where E_t[·] denotes the exact value of the conditional expectation in (7).^25 The OLS estimator (16) yields b̂ = b + [A'A]^{-1} A'ϑ, where ϑ ≡ (ϑ_1, ..., ϑ_T)' ∈ R^T. Assuming that ϑ_t is independent and identically distributed (i.i.d.) with zero mean and constant variance σ_ϑ^2, we have the standard √T version of the central limit theorem. For the conventional one-node Monte Carlo integration method, MC(1), the asymptotic distribution of the OLS estimator is given by √T (b̂ - b) → N(0, σ_ϑ^2 [A'A]^{-1}), and the convergence rate of the OLS estimator is √T. Similarly, the convergence rate for MC(J) is √(TJ). To decrease errors by an order of magnitude, we must increase either the simulation length T or the number of draws J by 2 orders of magnitude or do some combination of the two.
Since the convergence of Monte Carlo integration is slow, high accuracy is theoretically possible but impractical. In a typical real business cycle model, variables fluctuate by several percent and so does the variable y_t given by (44). If a unit-free integration error |(y_t - E_t[·]) / E_t[·]| is on average 10^{-2} (i.e., 1%), then a regression model with T = 10,000 observations has errors of order 10^{-2}/√T = 10^{-4}. To reduce errors to order 10^{-5}, we would need to increase the simulation length to T = 1,000,000. Thus, the cost of accuracy improvements is prohibitive.^26
^25 Other types of approximation errors are discussed in Judd, Maliar, and Maliar (2011).

^26 In a working-paper version of the present paper, Judd, Maliar, and Maliar (2009) developed a variant of GSSA based on the one-node Monte Carlo integration method. This variant of GSSA is included in the comparison analysis of Kollmann et al. (2011).

5.2 One-dimensional quadrature integration

Deterministic integration methods are unrelated to simulations. In our model with one normally distributed exogenous random variable, we can approximate a one-dimensional integral using Gauss–Hermite quadrature.
A J-node Gauss–Hermite quadrature method, denoted by Q(J), computes y_t in (8) using J deterministic integration nodes and weights. For example, a two-node Gauss–Hermite quadrature method, Q(2), uses nodes ε_{t+1,1} = -σ, ε_{t+1,2} = σ and weights ω_{t,1} = ω_{t,2} = 1/2; a three-node Gauss–Hermite quadrature method, Q(3), uses nodes ε_{t+1,1} = 0, ε_{t+1,2} = σ√3, ε_{t+1,3} = -σ√3 and weights ω_{t,1} = 2/3, ω_{t,2} = ω_{t,3} = 1/6. A special case of the Gauss–Hermite quadrature method is a one-node rule, Q(1), which uses a zero node, ε_{t+1,1} = 0, and a unit weight, ω_{t,1} = 1. Integration errors under Gauss–Hermite quadrature integration can be assessed using the Gauss–Hermite quadrature formula; see, for example, Judd (1998, p. 261). For a function that is smooth and has little curvature, the integration error decreases rapidly with the number of integration nodes J. In particular, Gauss–Hermite quadrature integration is exact for functions that are linear in the exogenous random variable.
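As an illustration, the following MATLAB sketch (our own; the parameter values are illustrative) constructs the Q(2) and Q(3) nodes and weights for ε ~ N(0, σ^2) and compares them with Monte Carlo integration on a test function with a known expectation, E[exp(ε)] = exp(σ^2/2):

    % Gauss-Hermite nodes and weights for eps ~ N(0, sigma^2), J = 2, 3.
    sigma  = 0.01;
    eps_Q2 = [-sigma; sigma];                    w_Q2 = [1/2; 1/2];
    eps_Q3 = [0; sigma*sqrt(3); -sigma*sqrt(3)]; w_Q3 = [2/3; 1/6; 1/6];
    g      = @(e) exp(e);                        % E[g(eps)] = exp(sigma^2/2)
    E_Q2   = w_Q2' * g(eps_Q2);                  % two-node quadrature
    E_Q3   = w_Q3' * g(eps_Q3);                  % three-node quadrature
    E_MC   = mean(g(sigma * randn(1000, 1)));    % MC(1000), far noisier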
5.3 Multidimensional quadrature and monomial integration

We now discuss deterministic integration methods suitable for models with multiple exogenous random variables (in Section 6.6, we extend our baseline model to include multiple countries hit by idiosyncratic shocks). In this section, we just provide illustrative examples; a detailed description of such methods is given in Appendix B.

With a small number of normally distributed exogenous random variables, we can approximate multidimensional integrals with a Gauss–Hermite product rule, which constructs multidimensional nodes as a tensor product of one-dimensional nodes. Below, we illustrate an extension of the two-node quadrature rule to the multidimensional case by way of example.

Example 3. Let ε^h_{t+1} ~ N(0, σ^2), h = 1, 2, 3, be uncorrelated random variables. A two-node Gauss–Hermite product rule Q(2) (obtained from the two-node Gauss–Hermite rule) has 2^3 nodes, which are

                 j=1   j=2   j=3   j=4   j=5   j=6   j=7   j=8
    ε^1_{t+1,j}   σ     σ     σ     σ    -σ    -σ    -σ    -σ
    ε^2_{t+1,j}   σ     σ    -σ    -σ     σ     σ    -σ    -σ
    ε^3_{t+1,j}   σ    -σ     σ    -σ     σ    -σ     σ    -σ

where the weights of all nodes are equal, ω_{t,j} = 1/8 for all j.
Under a J-node Gauss–Hermite product rule, the number of nodes grows exponentially with the number of exogenous random variables N. Even if there are just two nodes for each random variable, the total number of nodes is prohibitively large for large N; for example, if N = 100, we have 2^N ≈ 10^30 nodes. This makes product rules impractical.
With a large number of exogenous random variables, a feasible alternative to prod-
uct rules is monomial rules. Monomial rules construct multidimensional integration
nodes directly in a multidimensional space. Typically, the number of nodes under
monomial rules grows polynomially with the number of exogenous random variables. In Appendix B, we present a description of two monomial rules, denoted by M1 and M2, which have 2N and 2N^2 + 1 nodes, respectively. In particular, M1 constructs nodes by considering consecutive deviations of each random variable from its expected value, holding the other random variables fixed to their expected values. We illustrate this construction using the setup of Example 3.
Example 4. Let ε^h_{t+1} ~ N(0, σ^2), h = 1, 2, 3, be uncorrelated random variables. A monomial nonproduct rule M1 has 2 · 3 nodes, which are

                 j=1     j=2     j=3     j=4     j=5     j=6
    ε^1_{t+1,j}  σ√3    -σ√3     0       0       0       0
    ε^2_{t+1,j}   0       0     σ√3    -σ√3     0       0
    ε^3_{t+1,j}   0       0      0       0     σ√3    -σ√3

where the weights of all nodes are equal, ω_{t,j} = 1/6 for all j.
Since the cost of M1 increases with N only linearly, this rule is feasible for approxi-
mation of integrals with very large dimensionality. For example, with N =100, the total
number of nodes is only 2N =200.
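Both constructions are simple to code. The MATLAB sketch below (ours, for the uncorrelated case of Examples 3 and 4) builds the Q(2) product nodes and the M1 nodes for N = 3 shocks:

    % Multidimensional integration nodes for N shocks eps^h ~ N(0, sigma^2).
    N = 3;  sigma = 0.01;
    % (i) Gauss-Hermite product rule Q(2): tensor product of the two
    % one-dimensional nodes -sigma and sigma; 2^N nodes, weights 1/2^N.
    [g1, g2, g3] = ndgrid([-sigma, sigma]);
    nodes_Q2 = [g1(:), g2(:), g3(:)];            % 8-by-3 matrix of nodes
    w_Q2     = ones(2^N, 1) / 2^N;
    % (ii) Monomial rule M1: one +/- deviation of size sigma*sqrt(N) per
    % shock; 2N nodes with equal weights 1/(2N).
    nodes_M1 = sigma * sqrt(N) * [eye(N); -eye(N)];
    w_M1     = ones(2 * N, 1) / (2 * N);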
The one-node Gauss–Hermite quadrature rule, Q(1), plays a special role in our analysis. This rule is even cheaper than the monomial rules discussed above, since there is just one node for any number of exogenous random variables. Typically, there is a trade-off between the accuracy and cost of integration methods: having more nodes leads to a more accurate approximation of integrals but is also more costly. In our numerical experiments, the Gauss–Hermite quadrature rules and monomial rules lead to virtually the same accuracy, with the exception of the one-node Gauss–Hermite rule, which produces slightly less accurate solutions. Overall, the accuracy levels attained by GSSA under deterministic integration methods are orders of magnitude higher than those attained under the Monte Carlo method.^27
6. Numerical experiments

In this section, we discuss the implementation details of GSSA and describe the results of our numerical experiments. We first solve the representative-agent model of Section 2.1. Then we solve two more challenging applications: a model with rare disasters and a model with multiple countries.
6.1 Implementation details

Model's parameters  We assume a constant relative risk aversion utility function, u(c_t) = (c_t^{1-γ} - 1)/(1 - γ), with a risk-aversion coefficient γ ∈ (0, ∞), and a Cobb–Douglas production function, f(k_t) = k_t^α, with a capital share α = 0.36. The discount factor is β = 0.99, and the parameters in (4) are ρ = 0.95 and σ = 0.01. The parameters γ and δ vary across experiments.

^27 Quasi-Monte Carlo integration methods based on low-discrepancy sequences of shocks may also give more accurate solutions than Monte Carlo integration methods; see Geweke (1996) for a review.
Algorithm's parameters  The convergence parameter v in the convergence criterion (10) must be chosen by taking into account a trade-off between accuracy and speed in a given application (a too strict criterion wastes computer time, while a too loose criterion reduces accuracy). In our experiments, we find it convenient to adjust v to the degree of the approximating polynomial n and to the damping parameter ξ in (11) by v = 10^{-4-n} ξ. The former adjustment allows us to roughly match the accuracy levels attainable under different polynomial degrees in our examples. The latter adjustment ensures that different values of ξ imply roughly the same degree of convergence in the time-series solution (note that the smaller is ξ, the smaller is the difference between the series k^{(p)}_{t+1} and k^{(p-1)}_{t+1}; in particular, if ξ = 0, the series do not change from one iteration to another). In most experiments, we use ξ = 0.1, which means that v decreases from 10^{-6} to 10^{-10} when n increases from 1 to 5. To start iterations, we use an arbitrary guess, k_{t+1} = 0.95 k_t + 0.05 k̄ a_t, where k̄ is the steady-state capital. To compute a polynomial solution of degree n = 1, we start iterations from a fixed low-accuracy solution; to compute a solution of degree n ≥ 2, we start from the solution of degree n - 1. The initial condition is the steady state, (k_0, a_0) = (k̄, 1).
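A stylized MATLAB sketch of one damped update is given below (the previous coefficient vector b, the new regression estimate b_hat, and the capital series k_new and k_old from two successive iterations are assumed to be given; the unit-free distance below is one natural implementation of criterion (10), not necessarily the paper's exact code):

    % Damped fixed-point updating of the coefficient vector b.
    xi  = 0.1;                        % damping parameter in (11)
    n   = 3;                          % degree of the approximating polynomial
    tol = 10^(-4 - n) * xi;           % convergence parameter, as in the text
    b   = (1 - xi) * b + xi * b_hat;  % partial updating of the coefficients
    % Unit-free distance between capital series from successive iterations:
    converged = mean(abs(1 - k_new ./ k_old)) < tol;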
Regularization parameters  For RLS-Tikhonov, RLAD-PP, and RLAD-DP, it is convenient to normalize the regularization parameter by the simulation length T and the number of the regression coefficients n. For RLS-Tikhonov, this implies an equivalent representation of the LS problem (21):

    min_b (1/T)(y - Ab)'(y - Ab) + (η/n) b'b,

where η reflects a trade-off between the average squared error, (1/T)(y - Ab)'(y - Ab), and the average squared coefficient, (1/n) b'b. Since η is constructed to be invariant to changes in T and n, the same numerical value of η often works well for experiments with different T and n (and thus, different polynomial degrees). For the RLAD problem (33), we have

    min_b (1/T) 1'_T |y - Ab| + (η/n) 1'_n |b|.
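In closed form, the normalized RLS-Tikhonov estimator solves the first-order condition of the problem above. A short MATLAB sketch (our own variable names; A is the T-by-n matrix of basis functions and y is the response vector) is

    % Normalized Tikhonov (ridge) regression: solves
    % (A'A/T + (eta/n)*eye(n)) * b = A'y/T.
    [T, n] = size(A);
    eta = 1e-7;                              % regularization parameter
    b = (A' * A / T + (eta / n) * eye(n)) \ (A' * y / T);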
To select appropriate values of the regularization parameters for our regularization methods, we use the approach that combines the ideas of ridge trace and cross-validation, as described in Section 4.2.2. We specifically search for a value of the regularization parameter that ensures both the numerical stability (convergence) of fixed-point iteration in Stage 1 and the high accuracy of solutions in Stage 2. In our experiments, we typically use the smallest degree of regularization that ensures numerical stability of fixed-point iteration; we find that this choice also leads to accurate solutions.^28
Results reported, hardware, and software  For each experiment, we report the value of a regularization parameter (if applicable) and the time necessary for computing a solution, as well as unit-free Euler equation errors (12) on a stochastic simulation of T_test = 10,200 observations (we discard the first 200 observations to eliminate the effect of initial conditions); see Juillard and Villemot (2011) for a discussion of other accuracy measures. To compute conditional expectations in the test, we use a highly accurate integration method, Q(10). We run the experiments on an ASUS desktop computer with an Intel Core 2 Quad CPU Q9400 (2.66 GHz). Our programs are written in MATLAB, version 7.6.0.324 (R2008a). To solve the linear-programming problems, we use the routine linprog under the option of an interior-point method.^29 To increase the speed of computations in MATLAB, we use vectorization (e.g., we approximate conditional expectation in all simulated points at once rather than point by point and compute all policy functions at once rather than one by one).

^28 We tried to automate a search of the regularization parameter by targeting some accuracy criterion in Stage 2. The outcome of the search was sensitive to a realization of shocks and an accuracy criterion (e.g., mean squared error, mean absolute error, maximum error). In the studied models, accuracy improvements were small, while costs increased substantially. We did not pursue this approach.
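For instance, a second-degree policy function can be evaluated at all simulated points with one matrix product. The stylized sketch below assumes T-by-1 simulated series k and a and a conformable coefficient vector b (the names are ours, not the paper's code):

    % Vectorized evaluation of the capital policy function at all points.
    X      = [ones(size(k)), k, a, k.^2, k.*a, a.^2];  % T-by-6 basis matrix
    k_next = X * b;                                    % all T values at once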
6.2 Testing numerical stability

We consider a version of the representative-agent model under γ = 1 and δ = 1. This model admits a closed-form solution, k_{t+1} = αβ a_t k_t^α. To compute conditional expectations, we use the one-node Monte Carlo integration method (44). A peculiar feature of this model is that the integrand of the conditional expectation in the Euler equation (7) is equal to k_{t+1} for all possible realizations of a_{t+1}. Since the integrand does not have a forward-looking component, the choice of integration method has little impact on accuracy. We can therefore concentrate on the issue of numerical stability of GSSA.

We consider four nonregularization methods (OLS, LS-SVD, LAD-PP, and LAD-DP) and four corresponding regularization methods (RLS-Tikhonov, RLS-TSVD, RLAD-PP, and RLAD-DP). The RLS-TSVD method is also a representative of the principal component approach. We use both unnormalized and normalized data, and we consider both ordinary and Hermite polynomials. We use a relatively short simulation length of T = 3000 because the primal-problem formulations LAD-PP and RLAD-PP proved to be costly in terms of time and memory. In particular, when T exceeded 3000, our computer ran out of memory. The results are shown in Table 1.

Our stabilization techniques proved to be remarkably successful in the examples considered. When the OLS method is used with unnormalized data and ordinary polynomials, we cannot go beyond the second-degree polynomial approximation. Normalization of variables alone allows us to compute degree 3 polynomial solutions. LS-SVD and LAD with unnormalized data deliver the fourth-degree polynomial solutions. All regularization methods successfully computed degree 5 polynomial approximations. Hermite polynomials ensure numerical stability under any approximation method (all methods considered lead to nearly identical results). The solutions are very accurate, with mean errors of order 10^{-9}.

For the regularization methods, we compare the results under two degrees of regularization. When the degree of regularization is low, the regularization methods deliver accuracy levels that are comparable or superior to those of the corresponding nonregularization methods.

^29 A possible alternative to the interior-point method is a simplex method. Our experiments indicated that the simplex method, incorporated in MATLAB, was slower than the interior-point method; occasionally, it was also unable to find an initial guess. See Portnoy and Koenker (1997) for a comparison of interior-point and simplex-based algorithms.
Table 1. Stability of GSSA in the representative-agent model with a closed-form solution: the role of approximation method, data normalization, and polynomial family.^a

Each column group reports E_mean, E_max, and CPU for: (1) ordinary polynomials, unnormalized data; (2) ordinary polynomials, normalized data; (3) ordinary polynomials, smaller regularization; (4) ordinary polynomials, larger regularization; (5) Hermite polynomials, unnormalized data.

OLS (columns 1, 2, 5); RLS-Tikhonov with η = 10^{-10} (column 3) and η = 10^{-7} (column 4)
1st   -3.52 -2.45 0.8     -3.52 -2.45 1      -3.52 -2.45 1      -3.52 -2.45 1      -3.52 -2.45 1
2nd   -5.46 -4.17 3.1     -5.46 -4.17 3      -5.46 -4.17 3      -5.46 -4.16 3      -5.46 -4.17 4
3rd   -                   -6.84 -5.36 5      -6.84 -5.36 5      -5.85 -4.51 4      -6.84 -5.36 6
4th   -                   -                  -6.97 -5.63 8      -6.12 -4.74 7      -7.97 -6.35 8
5th   -                   -                  -                  -6.22 -4.75 11     -9.09 -7.29 10

LS-SVD (columns 1, 2, 5); RLS-TSVD with κ = 10^{8} (column 3) and κ = 10^{6} (column 4)
1st   -3.52 -2.45 0.9     -3.52 -2.45 1      -3.52 -2.45 1      -3.52 -2.45 1      -3.52 -2.45 1
2nd   -5.46 -4.17 3.1     -5.46 -4.17 3      -5.46 -4.17 3      -5.46 -4.17 3      -5.46 -4.17 4
3rd   -6.84 -5.36 4.6     -6.84 -5.36 5      -6.84 -5.36 5      -6.84 -5.36 5      -6.84 -5.36 6
4th   -7.98 -6.37 6.1     -7.97 -6.35 6      -7.97 -6.35 6      -7.20 -5.46 6      -7.97 -6.35 8
5th   -                   -9.12 -7.43 10     -9.08 -7.25 8      -7.64 -5.97 9      -9.08 -7.25 9

LAD-PP (columns 1, 2, 5); RLAD-PP with η = 10^{-6} (column 3) and η = 10^{-4} (column 4)
1st   -3.57 -2.43 28.6    -3.52 -2.45 16     -3.52 -2.45 15     -3.52 -2.45 15     -3.57 -2.43 30
2nd   -5.56 -4.11 246.5   -5.55 -4.12 92     -5.55 -4.12 127    -5.55 -4.11 100    -5.56 -4.11 243
3rd   -6.98 -5.26 386.8   -6.97 -5.25 245    -6.98 -5.25 321    -6.93 -5.22 263    -6.98 -5.26 379
4th   -7.62 -5.58 558.8   -8.16 -6.11 383    -8.17 -6.13 530    -6.75 -5.06 349    -8.16 -6.13 512
5th   -                   -9.10 -7.02 560    -8.17 -6.15 706    -6.64 -4.97 936    -9.09 -7.05 670

LAD-DP (columns 1, 2, 5); RLAD-DP with η = 10^{-6} (column 3) and η = 10^{-4} (column 4)
1st   -3.57 -2.43 3.1     -3.52 -2.45 9      -3.52 -2.45 3      -3.52 -2.45 3      -3.57 -2.43 3
2nd   -5.56 -4.11 9.3     -5.55 -4.12 34     -5.55 -4.12 10     -5.55 -4.12 11     -5.56 -4.11 9
3rd   -6.98 -5.26 13.2    -6.97 -5.25 55     -6.98 -5.25 19     -6.93 -5.22 25     -6.98 -5.25 13
4th   -                   -8.14 -6.12 74     -8.17 -6.13 45     -6.75 -5.06 30     -8.15 -6.18 18
5th   -                   -                  -8.17 -6.15 71     -6.64 -4.97 62     -9.26 -7.04 21

^a E_mean and E_max are, respectively, the average and maximum absolute unit-free Euler equation errors (in log10 units) on a stochastic simulation of 10,000 observations; CPU is the time necessary for computing a solution (in seconds); η is the regularization parameter in RLS-Tikhonov, RLAD-PP, and RLAD-DP; κ is the regularization parameter in RLS-TSVD; a dash indicates that no solution was computed for that case. In all experiments, we use the one-node Monte Carlo integration method MC(1), simulation length T = 3000, and damping parameter ξ = 0.1.
However, an excessively large degree of regularization reduces accuracy because the regression coefficients are excessively biased. Finally, under any degree of regularization, RLS-Tikhonov leads to visibly less accurate solutions than the other LS regularization method, RLS-TSVD. This happens because RLS-Tikhonov and RLS-TSVD work with different objects: the former works with a very ill-conditioned matrix A'A, while the latter works with a better conditioned matrix S.^30
6.3 Testing accuracy

We study a version of the model with γ = 1 and δ = 0.02. With partial depreciation of capital, the integrand of the conditional expectation in the Euler equation (7) does depend on a_{t+1}, and the choice of integration method plays a critical role in the accuracy of solutions. In all the experiments, we use ordinary polynomials and RLS-TSVD with κ = 10^{7}. This choice ensures numerical stability, allowing us to concentrate on the accuracy of integration.
We first assess the performance of GSSA based on the Monte Carlo method, MC(J), with J = 1 and J = 2000. (Recall that MC(1) uses one random draw, and MC(2000) uses a simple average of 2000 random draws to approximate an integral in each simulated point.) We consider four different simulation lengths, T ∈ {100, 1000, 10,000, 100,000}. The results are provided in Table 2.

The performance of the Monte Carlo method is poor. Under MC(1), GSSA can deliver high-degree polynomial approximations only if T is sufficiently large (if T is small, Monte Carlo integration is so inaccurate that simulated series either explode or implode). A 10 times increase in the simulation length (e.g., from T = 10,000 to T = 100,000) decreases errors by about a factor of 3. This is consistent with a √T rate of convergence of MC(1); see Section 5.1. Increasing the number of nodes J from 1 to 2000 augments accuracy by about √J and helps restore numerical stability. The most accurate solution is obtained under the polynomial of degree 3 and corresponds to a combination of T and J with the largest number of random draws (i.e., T = 10,000 and J = 2000). Overall, high-degree polynomials do not necessarily lead to more accurate solutions than low-degree polynomials because accuracy is dominated by large errors produced by Monte Carlo integration. Thus, even though our stabilization techniques enable us to compute polynomial approximations of 5 degrees, there is no gain in going beyond the third-degree polynomial if Monte Carlo integration is used.
We next consider the Gauss–Hermite quadrature method Q(J) with J = 1, 2, 10. The results change dramatically: all the studied cases become numerically stable and the accuracy of solutions increases by orders of magnitude. Q(J) is very accurate even with just two nodes: increasing the number of nodes from J = 2 to J = 10 does not visibly reduce the approximation errors in the table. The highest accuracy is attained with the degree 5 polynomials, T = 100,000, and the most accurate integration method, Q(10).

^30 Alternatively, we can apply a Tikhonov-type regularization directly to S by adding ηI_n, that is, b̂(η) = V(S + ηI_n)^{-1}U'y. This version of Tikhonov regularization produces solutions that are at least as accurate as those produced by LS-SVD. However, in some applications, such as large-scale economies, computing the SVD can be costly or infeasible, and the standard Tikhonov regularization based on A'A can still be useful.
Table 2. Accuracy of GSSA in the representative-agent model with partial depreciation of capital: the role of integration method (Monte Carlo versus Gauss–Hermite quadrature methods).^a

Each column group reports E_mean, E_max, and CPU for: MC(1), MC(2000), Q(1), Q(2), Q(10).

T = 100
1st   -3.54 -2.80 0.2     -4.35 -3.40 56      -4.36 -3.37 3      -4.35 -3.36 1      -4.35 -3.36 2
2nd   -                   -4.07 -3.06 112     -6.05 -4.93 4      -6.13 -4.90 3      -6.13 -4.90 5
3rd   -                   -3.81 -2.52 200     -6.32 -5.85 5      -7.47 -5.94 4      -7.47 -5.94 6
4th   -                   -                   -6.24 -5.25 6      -6.84 -5.26 6      -6.84 -5.26 8
5th   -                   -                   -6.04 -4.73 7      -6.22 -4.72 10     -6.22 -4.72 11

T = 1000
1st   -4.02 -3.21 0.4     -4.40 -3.47 425     -4.34 -3.48 3      -4.36 -3.47 3      -4.36 -3.47 3
2nd   -3.71 -2.73 6       -5.52 -4.65 644     -6.06 -4.95 7      -6.16 -4.95 6      -6.16 -4.95 7
3rd   -                   -5.33 -4.23 873     -6.32 -5.92 9      -7.57 -6.21 9      -7.57 -6.21 10
4th   -                   -5.22 -3.81 1383    -6.31 -6.20 10     -8.92 -7.30 11     -8.92 -7.30 13
5th   -                   -5.22 -3.80 1730    -6.32 -6.20 12     -8.53 -6.68 13     -8.53 -6.68 15

T = 10,000
1st   -4.26 -3.37 1       -4.40 -3.48 1236    -4.35 -3.37 15     -4.36 -3.37 16     -4.36 -3.37 20
2nd   -4.42 -3.69 11      -6.04 -4.93 1711    -5.99 -4.94 32     -6.13 -4.92 27     -6.13 -4.92 34
3rd   -4.32 -3.37 25      -6.15 -5.07 2198    -6.32 -5.90 45     -7.48 -6.01 35     -7.48 -6.01 44
4th   -4.31 -2.98 47      -6.08 -4.71 3337    -6.32 -6.18 53     -8.72 -7.10 44     -8.72 -7.10 54
5th   -4.23 -3.30 80      -6.07 -4.70 4551    -6.32 -6.18 62     -8.91 -7.26 51     -8.91 -7.26 63

T = 100,000
1st   -4.39 -3.40 4                           -4.36 -3.40 117    -4.37 -3.39 113    -4.37 -3.39 142
2nd   -4.87 -3.96 79                          -6.03 -4.94 281    -6.16 -4.94 188    -6.16 -4.94 238
3rd   -4.86 -3.60 184     Ran out of memory   -6.32 -5.93 387    -7.52 -6.04 260    -7.52 -6.04 328
4th   -4.72 -3.43 341                         -6.32 -6.19 470    -8.78 -7.18 335    -8.78 -7.18 421
5th   -4.71 -3.44 623                         -6.32 -6.19 548    -8.98 -7.35 406    -8.98 -7.35 508

^a E_mean and E_max are, respectively, the average and maximum absolute unit-free Euler equation errors (in log10 units) on a stochastic simulation of 10,000 observations; CPU is the time necessary for computing a solution (in seconds); T is the simulation length; MC(J) and Q(J) denote J-node Monte Carlo and Gauss–Hermite quadrature integration methods, respectively; a dash indicates that no solution was computed for that case. In all experiments, we use RLS-TSVD with κ = 10^{7}, the ordinary polynomial family, and damping parameter ξ = 0.1.
The mean absolute error is around 10^{-9} and is nearly 3 orders of magnitude lower than that attained under Monte Carlo integration. Thus, high-degree polynomials do help increase the accuracy of solutions if integration is accurate.

Note that even the least accurate solution obtained under the Gauss–Hermite quadrature method with T = 100 and J = 1 is still more accurate than the most accurate solution obtained under the Monte Carlo method with T = 10,000 and J = 2000. The simulation length T plays a less important role in the accuracy and numerical stability of GSSA under Q(J) than under MC(J) because Q(J) uses simulated points only to construct the domain, while MC(J) uses such points both to construct the domain and to evaluate integrals. To decrease errors from 10^{-5} to 10^{-9} under the Monte Carlo method MC(1), we would need to increase the simulation length from T = 10^{4} to T = 10^{12}.
6.4 Sensitivity of GSSA to the risk-aversion coefficient

We test GSSA in the model with very low and very high degrees of risk aversion, γ = 0.1 and γ = 10. We restrict attention to three regularization methods, RLS-Tikhonov, RLS-TSVD, and RLAD-DP (in the limit, these methods include the nonregularization methods OLS, LS-SVD, and LAD-DP, respectively). We omit RLAD-PP because of its high cost. In all experiments, we use T = 10,000 and an accurate integration method, Q(10) (however, we found that Q(2) leads to virtually the same accuracy). The results are presented in Table 3.

Under γ = 0.1, GSSA is stable even under large values of the damping parameter such as ξ = 0.5. In contrast, under γ = 10, GSSA becomes unstable because fixed-point iteration is fragile. One way to enhance numerical stability is to set the damping parameter to a very small value; for example, ξ = 0.01 ensures stability under both ordinary and Hermite polynomials. Another way to do so is to choose a different policy function to approximate; see the discussion in Section 4.5.2. We find that using a marginal-utility policy function (instead of the capital policy function) ensures the stability of GSSA under large values of ξ such as ξ = 0.5.

Overall, the accuracy of solutions is higher under γ = 0.1 than under γ = 10. However, even in the latter case, our solutions are very accurate: we attain mean errors of order 10^{-8}. The accuracy levels attained under the capital and marginal-utility policy functions are similar. RLAD-DP and RLS-TSVD deliver more accurate solutions than RLS-Tikhonov. As for the cost, RLAD-DP is more expensive than the other methods. Finally, the convergence to a fixed point is considerably faster under the capital policy function than under the marginal-utility policy function.
6.5 Model with rare disasters

We investigate how the performance of GSSA depends on specific assumptions about uncertainty. We assume that, in addition to standard normally distributed shocks, the productivity level is subject to large negative low-probability shocks (rare disasters). We modify (4) as ln a_{t+1} = ρ ln a_t + (ε_{t+1} + ζ_{t+1}), where ε_{t+1} ~ N(0, σ^2) and ζ_{t+1} takes values -ζ and 0 with probabilities p and 1 - p, respectively, with ζ > 0. We assume that ζ = 0.10 and p = 0.02; that is, a 10% drop in the productivity level occurs with a probability of 2%.
Table 3. Sensitivity of GSSA to the risk-aversion coefficient in the representative-agent model.^a

Each column group reports E_mean, E_max, and CPU for: RLS-Tikhonov, RLS-TSVD, RLAD-DP.

γ = 0.1 (capital policy function, ordinary polynomials, damping ξ = 0.5); η = 10^{-7}, κ = 10^{7}, η = 10^{-7}
1st   -4.95 -3.91 8       -4.95 -3.91 8       -4.95 -3.90 19
2nd   -6.57 -5.32 14      -6.57 -5.32 15      -6.61 -5.31 45
3rd   -6.47 -5.14 22      -7.93 -6.32 22      -7.99 -6.28 154
4th   -6.94 -5.52 35      -9.06 -7.42 30      -9.08 -7.37 317
5th   -6.98 -5.50 86      -8.92 -7.16 39      -9.57 -7.46 885

γ = 10 (capital policy function, ordinary polynomials, damping ξ = 0.01); η = 10^{-7}, κ = 10^{7}, η = 10^{-7}
1st   -2.87 -1.76 92      -2.87 -1.76 89      -2.87 -1.77 194
2nd   -4.25 -2.95 214     -4.25 -2.95 218     -4.24 -2.91 757
3rd   -5.37 -3.99 374     -5.36 -3.96 332     -5.35 -3.89 1799
4th   -5.60 -3.93 681     -6.36 -4.83 448     -6.34 -4.77 4278
5th   -                   -7.13 -5.63 580     -7.25 -5.49 7107

γ = 10 (capital policy function, Hermite polynomials, damping ξ = 0.01); η = 10^{-7}, κ = 10^{7}, η = 10^{-7}
1st   -2.87 -1.76 102     -2.87 -1.76 109     -2.87 -1.77 217
2nd   -4.25 -2.95 318     -4.25 -2.95 332     -4.24 -2.91 809
3rd   -5.36 -3.96 517     -5.36 -3.96 503     -5.35 -3.89 1859
4th   -6.36 -4.83 693     -6.36 -4.83 710     -6.34 -4.77 4267
5th   -7.31 -5.60 921     -7.30 -5.61 926     -

γ = 10 (marginal-utility policy function, ordinary polynomials, damping ξ = 0.5); η = 10^{-10}, κ = 10^{7}, η = 10^{-7}
1st   -2.84 -2.79 256     -2.84 -2.79 442     -2.72 -2.68 1206
2nd   -3.67 -3.58 645     -3.67 -3.58 1017    -3.55 -3.50 2596
3rd   -4.06 -4.06 1120    -4.06 -4.06 1674    -5.38 -5.37 4331
4th   -4.63 -4.61 1568    -4.81 -4.75 2274    -5.21 -5.18 9093
5th   -                   -5.41 -5.30 3102    -7.75 -6.57 18,596

^a E_mean and E_max are, respectively, the average and maximum absolute unit-free Euler equation errors (in log10 units) on a stochastic simulation of 10,000 observations; CPU is the time necessary for computing a solution (in seconds); γ is the coefficient of risk aversion; η is the regularization parameter in RLS-Tikhonov and RLAD-DP; κ is the regularization parameter in RLS-TSVD; a dash indicates that no solution was computed for that case. In all experiments, we use the simulation length T = 10,000 and the 10-node Gauss–Hermite quadrature integration method, Q(10).
These values are in line with the estimates obtained in the recent literature on rare disasters; see Barro (2009).
We solve the model with γ = 1 using three regularization methods (RLS-Tikhonov, RLS-TSVD, and RLAD-DP). We consider both ordinary and Hermite polynomials. We implement a quadrature integration method with 2J nodes and weights. The first J nodes are the usual Gauss–Hermite nodes, {ε_{t+1,j}}_{j=1,...,J}, and the remaining J nodes correspond to a rare disaster, {ε_{t+1,j} - ζ}_{j=1,...,J}; the weights assigned to the former J nodes and the latter J nodes are adjusted to the probability of a rare disaster by {(1 - p)ω_{t,j}}_{j=1,...,J} and {pω_{t,j}}_{j=1,...,J}, respectively. We use J = 10 and T = 10,000.
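A minimal MATLAB sketch of this mixture construction (our own; the two-node rule Q(2) stands in for the J-node Gauss–Hermite part) is

    % 2J nodes and weights for the shock eps + zeta in the disaster model:
    % eps ~ N(0, sigma^2); zeta = 0 with prob. 1-p, -zeta_bar with prob. p.
    sigma = 0.01;  p = 0.02;  zeta_bar = 0.10;
    eps_j = [-sigma; sigma];   w_j = [1/2; 1/2];   % Gauss-Hermite part, J = 2
    nodes   = [eps_j;          eps_j - zeta_bar];  % 2J nodes in total
    weights = [(1 - p) * w_j;  p * w_j];           % weights sum to 1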
In all cases, GSSA is successful in finding solutions; see Table 4. Overall, the errors are larger than in the case of the standard shocks because the ergodic set is larger, and solutions must be approximated and tested on a larger domain; compare Tables 2 and 4.
Table 4. The model with rare disasters (10% negative productivity shocks occur with probability 0.01).^a

Each column group reports E_mean, E_max, and CPU for: RLS-Tikhonov, RLS-TSVD, RLAD-DP.

Ordinary polynomials; η = 10^{-6}, κ = 10^{8}, η = 10^{-6}
1st   -3.97 -2.87 50      -3.97 -2.87 40      -3.98 -2.80 81
2nd   -5.47 -4.09 79      -5.47 -4.09 67      -5.61 -4.08 152
3rd   -6.63 -4.70 110     -6.64 -4.71 97      -6.81 -4.67 257
4th   -7.67 -5.89 134     -7.67 -5.83 118     -7.88 -5.50 642
5th   -8.16 -6.30 158     -8.66 -6.54 143     -8.86 -6.12 1193

Hermite polynomials; η = 10^{-6}, κ = 10^{8}, η = 10^{-6}
1st   -3.97 -2.87 49      -3.97 -2.87 49      -3.98 -2.80 88
2nd   -5.47 -4.09 77      -5.47 -4.09 77      -5.61 -4.08 164
3rd   -6.64 -4.71 108     -6.64 -4.71 108     -6.81 -4.67 266
4th   -7.67 -5.83 131     -7.67 -5.83 131     -7.88 -5.51 516
5th   -8.66 -6.54 156     -8.66 -6.54 156     -8.87 -6.42 1013

^a E_mean and E_max are, respectively, the average and maximum absolute unit-free Euler equation errors (in log10 units) on a stochastic simulation of 10,000 observations; CPU is the time necessary for computing a solution (in seconds); η is the regularization parameter in RLS-Tikhonov and RLAD-DP; κ is the regularization parameter in RLS-TSVD. In all experiments, we use simulation length T = 10,000, the 10-node Gauss–Hermite quadrature integration method, Q(10), and damping parameter ξ = 0.1.
The accuracy levels are still high: the mean absolute errors are of order 10^{-8}. We perform further sensitivity experiments and find that GSSA is numerically stable and delivers accurate solutions for a wide range of the model's parameters.
6.6 Multicountry model

We demonstrate the tractability of GSSA in high-dimensional problems. For this, we extend the representative-agent model (2)–(4) to include multiple countries. Each country h ∈ {1, ..., N} is characterized by capital k^h_t and productivity level a^h_t (i.e., the state space contains 2N state variables). The productivity level of a country is affected by both country-specific and worldwide shocks. The world economy is governed by a planner who maximizes a weighted sum of utility functions of the countries' representative consumers. We represent the planner's solution with N capital policy functions and compute their approximations,

    k^h_{t+1} = K^h({k^h_t, a^h_t}_{h=1,...,N}) ≈ Ψ^h({k^h_t, a^h_t}_{h=1,...,N}; b^h),   h = 1, ..., N,   (45)

where Ψ^h and b^h are, respectively, an approximating function and a vector of the approximation parameters of country h. A formal description of the multicountry model and implementation details of GSSA are provided in Appendix C. The results are shown in Table 5.
We first compute solutions using GSSA with the one-node Monte Carlo method MC(1). We use RLS-Tikhonov with η = 10^{-5} and T = 10,000. The performance of Monte Carlo integration is again poor.
Table 5. Accuracy and cost of GSSA in the multicountry model: the effect of dimensionality.^a

Each column group reports E_mean, E_max, and CPU for: (1) RLS-Tikh., MC(1), T = 10,000, η = 10^{-5}; (2) RLS-TSVD, M2/M1, T = 1000, κ = 10^{7}; (3) RLAD-DP, M1, T = 1000, η = 10^{-5}; (4) RLS-Tikh., Q(1), T = 1000, η = 10^{-5}. "Coef." is the number of polynomial coefficients.

N      Degree  Coef.
N=2    1st       5    -4.70 -3.13 251      -4.65 -2.99 37       -4.67 -3.01 127      -4.64 -2.99 28
       2nd      15    -4.82 -3.11 1155     -6.01 -3.99 407      -6.06 -4.02 680      -5.79 -4.04 167
       3rd      35    -4.59 -2.42 3418     -7.09 -4.83 621      -7.10 -4.83 1881     -5.50 -3.55 385
       4th      70    -4.57 -2.53 9418     -7.99 -5.63 978      -8.12 -5.60 14,550   -5.73 -3.81 897
       5th     126    -4.53 -2.38 24,330   -8.00 -5.50 2087     -8.17 -5.65 48,061   -5.76 -3.85 2463
N=4    1st       9    -4.59 -3.06 280      -4.72 -3.17 102      -4.73 -3.17 290      -4.71 -3.18 36
       2nd      45    -4.46 -2.87 1425     -6.05 -4.15 1272     -6.08 -4.16 3912     -5.67 -4.22 189
       3rd     165    -4.29 -2.52 11,566   -7.06 -4.89 5518     -7.00 -4.89 95,385   -5.64 -4.03 1092
       4th     495    -4.20 -2.31 58,102   -7.46 -5.23 37,422   -                    -5.64 -4.39 4858
N=6    1st      13    -4.58 -3.12 301      -4.71 -3.08 225      -4.72 -3.10 562      -4.69 -3.09 41
       2nd      91    -4.30 -2.73 1695     -6.06 -4.21 2988     -5.94 -4.11 13,691   -5.62 -4.26 224
       3rd     455    -4.04 -2.29 30,585   -6.96 -4.88 65,663   -                    -5.55 -3.87 3219
N=8    1st      17    -4.56 -3.14 314      -4.73 -3.08 430      -4.74 -3.07 996      -4.72 -3.09 42
       2nd     153    -4.19 -2.63 1938     -6.06 -4.20 5841     -6.08 -4.20 78,928   -5.59 -4.29 278
N=10   1st      21    -4.54 -3.15 341      -4.73 -3.08 773      -4.74 -3.08 1609     -4.72 -3.09 44
       2nd     231    -4.07 -2.59 2391     -6.05 -4.20 10,494   -5.97 -4.14 183,046  -5.56 -4.32 292
N=20   1st      41    -4.55 -3.12 390      -4.77 -2.93 344      -4.79 -2.94 9727     -4.75 -2.93 56
       2nd     861    -3.88 -2.36 7589     -5.48 -3.99 6585     -                    -5.40 -3.94 1079
N=100  1st     201    -4.17 -2.77 1135     -4.63 -3.04 13,846   -                    -4.64 -3.05 225
N=200  1st     401    -3.97 -2.56 2232     -4.60 -3.10 105,121  -                    -4.59 -3.10 1008

^a E_mean and E_max are, respectively, the average and maximum absolute unit-free Euler equation errors (in log10 units) on a stochastic simulation of 10,000 observations; CPU is the time necessary for computing a solution (in seconds); N is the number of countries; "Coef." is the number of polynomial coefficients in the policy function of one country; T is the simulation length; a dash indicates that no solution was computed for that case. In all experiments, we use ordinary polynomials and normalized data. For RLS-TSVD, we use M2 for N < 20 and M1 for N = 20, 100, and 200. For N = 200 in the experiments "RLS-TSVD, M2/M1" and "RLS-Tikh., Q(1)", we use T = 2000.
The highest accuracy is achieved under the first-degree polynomials. This is because polynomials of higher degrees have too many regression coefficients to identify for a given sample size T. Moreover, when N increases, so does the number of coefficients, and the accuracy decreases even further. For example, going from N = 2 to N = 20 increases the size of the approximation errors by about a factor of 10 under the second-degree polynomial. Longer simulations increase the accuracy but at a high cost.

We next compute solutions using GSSA with the deterministic integration methods. Since such methods do not require long simulations for accurate integration, we use a relatively short simulation length of T = 1000 (except for the case of N = 200, in which we use T = 2000 for enhancing numerical stability). We start with accurate but expensive integration methods (namely, we use the monomial rule M2 with 2N^2 + 1 nodes for 2 ≤ N ≤ 10, and we use the monomial rule M1 with 2N nodes for N > 10). The approximation method was RLS-TSVD (with κ = 10^{7}). For small-scale economies, N = 2, 4, and 6, GSSA computes the polynomial approximations up to degrees 5, 4, and 3, respectively, with maximum absolute errors of 10^{-5.5}, 10^{-5.2}, and 10^{-4.9}, respectively. For medium-scale economies, N = 8, 10, and 20, GSSA computes the second-degree polynomial approximations with a maximum absolute error of 10^{-4}. Finally, for large-scale economies, N = 100 and 200, GSSA computes the first-degree polynomial approximations with a maximum absolute error of 10^{-2.9}.
We then compute solutions using RLAD-DP (with η = 10^{-5}) combined with M1. We obtain accuracy levels that are similar to those delivered by our previous combination of RLS-TSVD and M2. We observe that RLAD-DP is more costly than the LS methods but is still practical in medium-scale applications. It is possible to increase the efficiency of LAD methods by using techniques developed in the recent literature.^31

We finally compute solutions using GSSA with a cheap one-node quadrature method, Q(1), and RLS-Tikhonov (with η = 10^{-5}). For polynomials of degrees larger than 2, the accuracy of solutions is limited. For the first- and second-degree polynomials, the accuracy is similar to that under more expensive integration methods, but the cost is reduced by an order of magnitude or more. In particular, when N increases from 2 to 20, the running time increases only from 3 to 18 minutes. Overall, RLS-Tikhonov is more stable in large-scale problems than RLS-TSVD (because the SVD becomes costly and numerically unstable).

The accuracy of GSSA solutions is comparable to the highest accuracy attained in the comparison analysis of Kollmann et al. (2011). GSSA fits a polynomial on a relevant domain (the ergodic set) and, as a result, can get a better fit on the relevant domain than methods fitting polynomials on other domains.^32 A choice of domain is especially important for accuracy under relatively rigid low-degree polynomials. In particular, linear solutions produced by GSSA are far more accurate than the first- and second-order perturbation methods of Kollmann, Kim, and Kim (2011), which in a similar model produce approximation errors of 10^{-1.7} and 10^{-2.66}, respectively; see Table 2 in a web appendix of the comparison analysis of Kollmann et al. (2011), http://www.sciencedirect.com/science/journal/01651889.^33 The cost of GSSA depends on the integration and approximation methods and the degree of the approximating polynomial, as well as the simulation length. There is a trade-off between accuracy and speed, and cheap versions of GSSA are tractable in problems with very high dimensionality. Finally, GSSA is highly parallelizable.^34

^31 Tits, Absil, and Woessner (2006) proposed a constraint-reduction scheme that can drastically reduce the computational cost per iteration of linear-programming methods.

^32 An advantage of focusing on the ergodic set is illustrated by Judd, Maliar, and Maliar (2010) in the context of a cluster grid algorithm. In a model with only two state variables, solutions computed on the ergodic set are up to 10 times more accurate than those computed on the rectangular grid containing the ergodic set.

^33 Maliar, Maliar, and Villemot (2011) implement a perturbation-based method which is comparable in accuracy to global solution methods. This is a hybrid method that computes some policy functions locally (using perturbation) and computes the remaining policy functions globally (using analytical formulas and numerical solvers).

^34 For example, Creel (2008) developed a parallel computing toolbox which reduces the cost of a simulation-based PEA, studied in Maliar and Maliar (2003b), by running simulations on a cluster of computers.
7. Conclusion

Methods that operate on an ergodic set have two potential advantages compared to methods that operate on domains that are exogenous to the models. The first advantage is in terms of cost: ergodic-set methods compute solutions only in a relevant domain (the ergodic set realized in equilibrium), while exogenous-domain methods compute solutions both inside and outside the relevant domain, and spend time computing solutions in unnecessary points. The second advantage is in terms of accuracy: ergodic-set methods fit a polynomial in a relevant domain, while exogenous-domain methods fit the polynomial in generally larger domains, and face a trade-off between the fit (accuracy) inside and outside the relevant domain.

Stochastic simulation algorithms in the previous literature (based on standard LS approximation methods and Monte Carlo integration methods) did not benefit from the above advantages. Their performance was severely handicapped by two problems: numerical instability (because of multicollinearity) and large integration errors (because of low accuracy of Monte Carlo integration). GSSA fixes both of these problems: First, GSSA relies on approximation methods that can handle ill-conditioned problems; this allows us to stabilize stochastic simulation and to compute high-degree polynomial approximations. Second, GSSA uses a generalized notion of integration that includes both Monte Carlo and deterministic (quadrature and monomial) integration methods; this allows us to compute integrals very accurately. GSSA has shown great performance in the examples considered. It extends the speed–accuracy frontier attained in the related literature, it is tractable for problems with high dimensionality, and it is very simple to program. GSSA appears to be a promising method for many economic applications.
References

Aiyagari, R. (1994), "Uninsured idiosyncratic risk and aggregate saving." Quarterly Journal of Economics, 109, 659–684. [174]

Aruoba, S., J. Fernandez-Villaverde, and J. Rubio-Ramírez (2006), "Comparing solution methods for dynamic equilibrium economies." Journal of Economic Dynamics and Control, 30, 2477–2508. [174]

Asmussen, S. and P. Glynn (2007), Stochastic Simulation: Algorithms and Analysis. Springer, New York. [174]

Barro, R. (2009), "Rare disasters, asset prices, and welfare costs." American Economic Review, 99, 243–264. [202]

Brown, P. (1993), Measurement, Regression, and Calibration. Clarendon Press, Oxford. [186]

Charnes, A., W. Cooper, and R. Ferguson (1955), "Optimal estimation of executive compensation by linear programming." Management Science, 1, 138–151. [187]

Christiano, L. and D. Fisher (2000), "Algorithms for solving dynamic models with occasionally binding constraints." Journal of Economic Dynamics and Control, 24, 1179–1232. [173, 174, 177, 182, 192]

Creel, M. (2008), "Using parallelization to solve a macroeconomic model: A parallel parameterized expectations algorithm." Computational Economics, 32, 343–352. [206]

Davidson, R. and J. MacKinnon (1993), Estimation and Inference in Econometrics. Oxford University Press, New York. [185]

Den Haan, W. (1990), "The optimal inflation path in a Sidrauski-type model with uncertainty." Journal of Monetary Economics, 25, 389–409. [192]

Den Haan, W. (2010), "Comparison of solutions to the incomplete markets model with aggregate uncertainty." Journal of Economic Dynamics and Control, 34, 4–27. [174, 175]

Den Haan, W. and A. Marcet (1990), "Solving the stochastic growth model by parameterized expectations." Journal of Business and Economic Statistics, 8, 31–34. [182]

Dielman, T. (2005), "Least absolute value: Recent contributions." Journal of Statistical Computation and Simulation, 75, 263–286. [189]

Eckart, C. and G. Young (1936), "The approximation of one matrix by another of lower rank." Psychometrika, 1, 211–218. [190]

Eldén, L. (2007), Matrix Methods in Data Mining and Pattern Recognition. SIAM, Philadelphia. [191]

Fair, R. and J. Taylor (1983), "Solution and maximum likelihood estimation of dynamic nonlinear rational expectation models." Econometrica, 51, 1169–1185. [174]

Ferris, M., O. Mangasarian, and S. Wright (2007), Linear Programming With MATLAB. MPS-SIAM Series on Optimization. Society for Industrial and Applied Mathematics, Philadelphia, Pennsylvania. [188]

Gaspar, J. and K. Judd (1997), "Solving large scale rational expectations models." Macroeconomic Dynamics, 1, 45–75. [173, 174]

Geweke, J. (1996), "Monte Carlo simulation and numerical integration." In Handbook of Computational Economics (H. Amman, D. Kendrick, and J. Rust, eds.), 733–800, Elsevier Science, Amsterdam. [195]

Golub, G. and C. Van Loan (1996), Matrix Computations. The Johns Hopkins University Press, Baltimore, Maryland. [185]

Hadi, A. and R. Ling (1998), "Some cautionary notes on the use of principal components regression." American Statistician, 52, 15–19. [191]

Heer, B. and A. Maussner (2008), "Computation of business cycle models: A comparison of numerical methods." Macroeconomic Dynamics, 12, 641–663. [174]

Hoerl, A. and R. Kennard (1970), "Ridge regression: Biased estimation for nonorthogonal problems." Technometrics, 12, 69–82. [186]

Judd, K. (1992), "Projection methods for solving aggregate growth models." Journal of Economic Theory, 58, 410–452. [174, 181, 182, 191]

Judd, K. (1998), Numerical Methods in Economics. MIT Press, Cambridge, Massachusetts. [173, 181, 192, 194]

Judd, K. and S. Guu (1993), "Perturbation solution methods for economic growth models." In Economic and Financial Modeling With Mathematica (H. Varian, ed.), 80–103, Springer-Verlag, Santa Clara, California. [174]

Judd, K., L. Maliar, and S. Maliar (2009), "Numerically stable stochastic simulation methods for solving dynamic economic models." Working Paper 15296, NBER. [193]

Judd, K., L. Maliar, and S. Maliar (2010), "A cluster-grid projection method: Solving problems with high dimensionality." Working Paper 15965, NBER. [182, 205]

Judd, K., L. Maliar, and S. Maliar (2011), "One-node quadrature beats Monte Carlo: A generalized stochastic simulation algorithm." Working Paper 16708, NBER. [193]

Juillard, M. and S. Villemot (2011), "Multi-country real business cycle models: Accuracy tests and testing bench." Journal of Economic Dynamics and Control, 35, 178–185. [197]

Koenker, R. and G. Bassett (1978), "Regression quantiles." Econometrica, 46, 33–50. [187]

Kollmann, R., S. Kim, and J. Kim (2011), "Solving the multi-country real business cycle model using a perturbation method." Journal of Economic Dynamics and Control, 35, 203–206. [205]

Kollmann, R., S. Maliar, B. Malin, and P. Pichler (2011), "Comparison of solutions to the multi-country real business cycle model." Journal of Economic Dynamics and Control, 35, 186–202. [174, 177, 193, 205, 206]

Krueger, D. and F. Kubler (2004), "Computing equilibrium in OLG models with production." Journal of Economic Dynamics and Control, 28, 1411–1436. [182]

Krusell, P. and A. Smith (1998), "Income and wealth heterogeneity in the macroeconomy." Journal of Political Economy, 106, 868–896. [174, 175]

Maliar, L. and S. Maliar (2003a), "The representative consumer in the neoclassical growth model with idiosyncratic shocks." Review of Economic Dynamics, 6, 362–380. [182]

Maliar, L. and S. Maliar (2003b), "Parameterized expectations algorithm and the moving bounds." Journal of Business and Economic Statistics, 21, 88–92. [206]

Maliar, L. and S. Maliar (2005), "Solving nonlinear stochastic growth models: Iterating on value function by simulations." Economics Letters, 87, 135–140. [174, 177]

Maliar, L., S. Maliar, and F. Valli (2010), "Solving the incomplete markets model with aggregate uncertainty using the Krusell–Smith algorithm." Journal of Economic Dynamics and Control, 34, 42–49. [175, 177]

Maliar, L., S. Maliar, and S. Villemot (2011), "Taking perturbation to the accuracy frontier: A hybrid of local and global solutions." Dynare, Working Paper 6. [206]

Maliar, S., L. Maliar, and K. Judd (2011), "Solving the multi-country real business cycle model using ergodic set methods." Journal of Economic Dynamics and Control, 35, 207–228. [177, 182]

Malin, B., D. Krueger, and F. Kubler (2011), "Solving the multi-country real business cycle model using a Smolyak-collocation method." Journal of Economic Dynamics and Control, 35, 229–239. [182]

Marcet, A. (1988), "Solving non-linear models by parameterizing expectations." Unpublished manuscript, Carnegie Mellon University, Graduate School of Industrial Administration. [174, 182, 192, 193]

Marcet, A. and G. Lorenzoni (1999), "The parameterized expectation approach: Some practical issues." In Computational Methods for Study of Dynamic Economies (R. Marimon and A. Scott, eds.), 143–171, Oxford University Press, New York. [177, 182]

Marimon, R. and A. Scott (1999), Computational Methods for Study of Dynamic Economies. Oxford University Press, New York. [173]

Miranda, M. and P. Fackler (2002), Applied Computational Economics and Finance. MIT Press, Cambridge, Massachusetts. [173]

Miranda, M. and P. Helmberger (1988), "The effects of commodity price stabilization programs." American Economic Review, 78, 46–58. [181]

Narula, S. and J. Wellington (1982), "The minimum sum of absolute errors regression: A state of the art survey." International Statistical Review, 50, 317–326. [189]

Pichler, P. (2011), "Solving the multi-country real business cycle model using a monomial rule Galerkin method." Journal of Economic Dynamics and Control, 35, 240–251. [174, 182]

Portnoy, S. and R. Koenker (1997), "The Gaussian hare and the Laplacian tortoise: Computability of squared error versus absolute-error estimators." Statistical Science, 12, 279–296. [197]

Rust, J. (1996), "Numerical dynamic programming in economics." In Handbook of Computational Economics (H. Amman, D. Kendrick, and J. Rust, eds.), 619–722, Elsevier Science, Amsterdam. [173]

Santos, M. (1999), "Numerical solution of dynamic economic models." In Handbook of Macroeconomics (J. Taylor and M. Woodford, eds.), 312–382, Elsevier Science, Amsterdam. [173]

Smith, A. (1993), "Estimating nonlinear time-series models using simulated vector autoregressions." Journal of Applied Econometrics, 8, S63–S84. [174]

Taylor, J. and H. Uhlig (1990), "Solving nonlinear stochastic growth models: A comparison of alternative solution methods." Journal of Business and Economic Statistics, 8, 1–17. [173]

Tits, A., P. Absil, and W. Woessner (2006), "Constraint reduction for linear programs with many inequality constraints." SIAM Journal on Optimization, 17, 119–146. [205]

Wagner, H. (1959), "Linear programming techniques for regression analysis." American Statistical Association Journal, 54, 206–212. [188]

Wang, L., M. Gordon, and J. Zhu (2006), "Regularized least absolute deviations regression and an efficient algorithm for parameter tuning." In Proceedings of the Sixth International Conference on Data Mining, 690–700, IEEE Computer Society, Los Alamitos, California. [188]

Wright, B. and J. Williams (1984), "The welfare effects of the introduction of storage." Quarterly Journal of Economics, 99, 169–192. [181]
Submitted August, 2009. Final version accepted January, 2011.