approaches to economics
Jason Smith∗
Abstract
In the natural sciences, complex non-linear systems composed of large num-
bers of smaller subunits provide an opportunity to apply the tools of statistical
mechanics and information theory. The principle of maximum entropy can
usually provide shortcuts in the treatment of these complex systems. However,
there is an impasse to straightforward application to social and economic
systems: the lack of well-defined constraints for Lagrange multipliers. This is
typically treated in economics by introducing marginal utility as a Lagrange
multiplier.
Jumping off from Gary Becker’s 1962 paper "Irrational Behavior and Eco-
nomic Theory" — a maximum entropy argument in disguise — we introduce
Peter Fielitz and Guenter Borchardt’s concept of "information equilibrium"
presented in arXiv:0905.0610v4 [physics.gen-ph] as a means of applying
maximum entropy methods even in cases where well-defined constraints
such as energy conservation required to define Lagrange multipliers and
partition functions are not obvious (i.e. economics). From these initial steps
we are able to motivate a well-defined constraint in terms of growth rates
and develop a formalism for ensembles of markets described by information
equilibrium conditions. We apply information equilibrium to a description
of the US unemployment rate and connect it to search and matching theory and
to empirical regularities such as Okun’s Law. This represents a step toward Lee
Smolin’s call for a "statistical economics" analogous to statistical mechanics
in arXiv:0902.4274 [q-fin.GN].
∗ Associate Technical Fellow, The Boeing Company. P. O. Box 3707, Seattle, Washington 98124.
Email: jason.r.smith4@boeing.com.
1 Introduction
In 1962, University of Chicago economist Gary Becker published a paper titled
"Irrational Behavior and Economic Theory". The original purpose of Becker (1962)
was to immunize economics against attacks on the idealized rationality typically
assumed in models. The paper briefly sparked a debate between Becker and Israel
Kirzner¹ about the role of rationality in economic theory.
Becker’s main argument was that ideal rationality was not critical to microe-
conomic theory because random agents can be used to reproduce some important
theorems. Consider the opportunity set (state space) given a budget constraint
for two goods. An agent may select any point inside the budget constraint. In
order to find which point the agents select, economists typically introduce a utility
function for the agents (one good may produce more utility than the other) and
then solve for the maximum utility on the opportunity set. As the price changes for
one good (meaning more or less of that good can be bought given the same budget
constraint), the utility maximizing point on the opportunity set moves. The effect
of these price changes selects a different point on the opportunity set, tracing out a
demand curve.
Instead of the agents selecting a point through utility maximization, Becker
assumed every point in the opportunity set was equally likely — that agents selected
points in the opportunity set at random. In this case, the average is at the “center
of mass” of the region inside the budget constraint. However, Becker showed that
changing the price of one of the goods still produced a demand curve just like in
the utility maximization case thereby demonstrating microeconomics emerging
from random behavior.
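Becker's random-agent argument can be illustrated numerically. The following is a minimal sketch rather than Becker's own construction: it assumes two goods, a fixed budget, and agents who choose points uniformly at random inside the triangular opportunity set. The function name `mean_demand` and all of its parameters are hypothetical choices made for this illustration.

```python
import random

def mean_demand(price_1, price_2=1.0, budget=1.0, n_agents=20000, seed=0):
    """Average quantities (x1, x2) chosen by agents who pick points uniformly
    at random inside the budget set price_1*x1 + price_2*x2 <= budget."""
    rng = random.Random(seed)
    sum_1 = sum_2 = 0.0
    for _ in range(n_agents):
        # Uniform point in the unit triangle {u, v >= 0, u + v <= 1}:
        # draw in the unit square and reflect points above the diagonal.
        u, v = rng.random(), rng.random()
        if u + v > 1.0:
            u, v = 1.0 - u, 1.0 - v
        # Rescale the budget shares (u, v) to quantities of each good.
        sum_1 += u * budget / price_1
        sum_2 += v * budget / price_2
    return sum_1 / n_agents, sum_2 / n_agents

if __name__ == "__main__":
    for price in (0.5, 1.0, 2.0, 4.0):
        x1, _ = mean_demand(price)
        print(f"price {price:4.1f} -> average quantity of good 1: {x1:.3f}")
```

Under these assumptions the average sits near the centroid of the triangle, so the average quantity of good 1 is close to budget / (3 × price_1) and falls as its price rises, which is the downward-sloping demand curve obtained without any utility maximization.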
There are a few key points here:
• Becker is using the principle of indifference and therefore is presenting a
maximum entropy argument. Without prior information, there is no rea-
son to expect any point in the opportunity set to be more likely than any
other. Each point is equally likely (equivalent points should be assigned
equal probabilities). The generalization of this principle is the principle of
maximum entropy: given prior information, the probability distribution that
best represents the current state of knowledge is the one with maximum
entropy. The present paper presents a mathematical framework for applying
the principle of maximum entropy and information theory.
• Becker (1962) adds the assumption that the average must saturate the budget
constraint in order to more completely reproduce the traditional microeco-
nomic argument. However as the number of goods increases, the dimension
1 This exchange seemed to end abruptly and became largely forgotten as documented by Lagueux
(2010).
• There is no real requirement that the behavior be truly random — it just must
result in a maximum entropy distribution. For example, the behavior could
be so complex as to appear random (e.g. chaotic dynamics or algorithmic
randomness), or it could be deterministic with a random distribution of initial
conditions (e.g. molecules in a gas). The key requirement is that the behavior
is uncoordinated — agents do not preferentially select the same specific
point in the state space. Coordinated actions (spontaneous falls in entropy)
are a possible mechanism for market failures (e.g. recessions, “bubbles”)
following from human behavior (“groupthink”, panic, etc).
Jaynes (1991) represents an early attempt at applying maximum entropy and in-
formation theory to economics, and many papers have invented thermodynamic
approaches. An exhaustive survey is beyond the scope of the present paper. How-
ever, we will proceed from Becker’s proto-maximum entropy argument to a more
explicit approach via Fielitz and Borchardt (2014).
2 Information equilibrium
The maximum entropy approach typically requires the definition of constraints
(such as conservation laws), and Lagrange multipliers (such as temperature) are
introduced to maintain them in optimization problems (entropy maximization,
energy minimization). In economics, however, few true constraints exist. Even bud-
get constraints are not necessarily binding when one considers economic growth,
lending, asset valuation, and the creation of money.
Economics does in fact employ Lagrange multipliers in optimization problems.
Whereas temperature is the concept introduced in thermodynamics as the Lagrange
as in Shannon (1948) where the sum is taken over all the states $p_i$ (and $\sum_i p_i = 1$).
Also note that $p \log p = 0$ for $p = 0$. The Shannon entropy is additive such that
the information entropy of $n$ events (draws) of the random variable $X$ is given by
$I(nX) = n H(X)$.
How does this relate to economics? In the traditional Walrasian definition of
economic equilibrium where supply meets demand with no excess of either, the
distributions of supply P(s) and demand P(d) are in information equilibrium. The
information required to specify the spatial, temporal probability distribution of
supply must be equal to the information required to specify probability distribution
of demand as any difference in information would represent an excess supply
or demand. Note that this is not as strict as the realized distribution of supply
and demand being equal (i.e. after the random variable ‘events’ drawn from the
distribution). It implies that the realized distributions are only equal on average.
The distribution of a large sample of random events drawn from these probability
distributions will approximately coincide; we can think of coinciding supply events
and demand events as market transaction events. So in economics, we could say that
the information entropy of $n_d$ draws from the distribution $P(d)$ of the demand random
variable $d$ is equal to the information entropy of $n_s$ draws from the distribution
$P(s)$ of the supply random variable $s$:
$$I(d) = I(s) \tag{2.2}$$
$$n_d H(d) = n_s H(s) \tag{2.3}$$
and call it information equilibrium. The market can be seen as a system for
equalizing the distributions of supply and demand (so that everywhere there is
some demand, there is some supply on average at least in an ideal market). Let
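us take the distributions $P$ to be uniform distributions (over $i = 1 \ldots \sigma$ symbols) so
that²:
$$H(X) = -\sum_{i=1}^{\sigma} p_i \log p_i = -\sum_{i=1}^{\sigma} \frac{1}{\sigma} \log \frac{1}{\sigma} = -\frac{\sigma}{\sigma} \log \frac{1}{\sigma} = \log \sigma$$
$$n H(X) = n \log \sigma$$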
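The uniform-distribution result and the information equilibrium condition $n_d H(d) = n_s H(s)$ can be checked with a few lines of code. This is a minimal sketch; the function names and the symbol counts `sigma_d` and `sigma_s` are hypothetical and serve only this illustration.

```python
import math

def shannon_entropy(probabilities):
    """H(X) = -sum_i p_i log p_i in nats, with p log p taken as 0 at p = 0."""
    return -sum(p * math.log(p) for p in probabilities if p > 0.0)

def uniform(sigma):
    """Uniform distribution over sigma symbols."""
    return [1.0 / sigma] * sigma

def supply_draws_in_equilibrium(n_d, sigma_d, sigma_s):
    """Number of supply draws n_s that satisfies n_d H(d) = n_s H(s)
    when both distributions are uniform."""
    return n_d * shannon_entropy(uniform(sigma_d)) / shannon_entropy(uniform(sigma_s))

if __name__ == "__main__":
    print(shannon_entropy(uniform(8)), math.log(8))
    print(supply_draws_in_equilibrium(100, sigma_d=4, sigma_s=16))
```

For uniform distributions the condition reduces to $n_d \log \sigma_d = n_s \log \sigma_s$, so a demand distribution over 4 symbols drawn 100 times carries the same information as a supply distribution over 16 symbols drawn 50 times.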
2.1 Non-ideal information transfer 2 INFORMATION EQUILIBRIUM
[Diagram: I(d) → communication channel → I(s), with I(s) ≤ I(d).]
Figure 2.1: A diagram of the communication channel described by the information transfer
framework. We are agnostic about the properties of the communication channel (e.g. noise level or
transmission mechanism). The abstract price represents a measure of information flow through this
channel rather than a receiver or transmitter.
to the real data, the supply distribution is analogous to the generated model,
and the abstract price is analogous to the discriminator. The discriminator in
GANs minimizes the information difference (via the KL-divergence) between the
distribution of real data and the generative model of that distribution. However in
contrast to economics, the real data is usually taken as a fixed set of training data
while the demand distribution is subject to change.
I(d) ≥ I(s)
since you cannot receive more information than is transmitted. We call the case
where information is lost non-ideal information transfer. Following our derivation
of Eq. 2.5, our differential equation becomes a differential inequality
$$p \equiv \frac{dD}{dS} \leq k \frac{D}{S} \tag{2.6}$$
4 See Rao (2017) for an example of a study of how information flows through a trading network.
2.2 Solutions to the equations 2 INFORMATION EQUILIBRIUM
Use of Gronwall’s inequality⁵ tells us that the solutions to the differential equation
2.5 now become bounds on the solutions to Eq. 2.6 in the case of non-ideal
information transfer. For example, the information equilibrium price (the ideal
price) now becomes an upper bound on the observed price in the case of non-ideal
information transfer.
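The bounding behavior can be illustrated with a small numerical experiment. The sketch below assumes that Eq. 2.5 is the equality form of Eq. 2.6, $dD/dS = k D/S$, and models non-ideal transfer with a constant `efficiency` factor between 0 and 1. That factor is a hypothetical device for this example only; it is one simple way to satisfy the inequality, not something specified in the text.

```python
def integrate_demand(k, efficiency=1.0, s0=1.0, d0=1.0, s_max=4.0, steps=4000):
    """Euler integration of dD/dS = efficiency * k * D / S from s0 to s_max.
    Returns a list of (S, D, price) tuples, where price = dD/dS."""
    ds = (s_max - s0) / steps
    s, d = s0, d0
    path = []
    for _ in range(steps + 1):
        price = efficiency * k * d / s
        path.append((s, d, price))
        d += price * ds   # advance demand along the current slope
        s += ds
    return path

if __name__ == "__main__":
    ideal = integrate_demand(k=1.5)
    non_ideal = integrate_demand(k=1.5, efficiency=0.7)
    # The ideal solution should track D = D0 * (S / S0)**k.
    print(ideal[-1][1], 4.0 ** 1.5)
    # The ideal price should bound the non-ideal price at every step.
    print(all(n[2] <= i[2] for i, n in zip(ideal, non_ideal)))
```

With `efficiency=1.0` the integration follows the power law $D \sim S^k$, and for any smaller efficiency both demand and price stay at or below the ideal path.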
3 DYNAMIC INFORMATION EQUILIBRIUM
$$A \sim e^{at} \tag{3.1}$$
$$B \sim e^{bt} \tag{3.2}$$
$$\frac{d}{dt} \log p = \frac{d}{dt} \log \frac{A}{B} \approx (k - 1)\, b = a - b \tag{3.3}$$
where the solution $A \sim B^k$ to Eq. 2.5 requires $a = kb$. Since the right hand side of
Eq. 3.3 is a constant, we can identify cases of information equilibrium empirically
by observing lines of constant slope on a logarithmic graph of time series of ratios
of process variables A/B in information equilibrium (or the relevant abstract price
p).
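The constant-slope signature can be verified directly for exponential process variables. The following sketch builds $A = e^{at}$ and $B = e^{bt}$ with $a = kb$ and fits a least-squares line to $\log(A/B)$. The function name and the sample values of `k` and `b` are hypothetical.

```python
import math

def log_ratio_slope(k, b, t_start=0.0, t_end=10.0, n=101):
    """Least-squares slope of log(A/B) against t for A = exp(a t) and
    B = exp(b t) with a = k * b."""
    a = k * b
    ts = [t_start + (t_end - t_start) * i / (n - 1) for i in range(n)]
    ys = [math.log(math.exp(a * t) / math.exp(b * t)) for t in ts]
    t_mean = sum(ts) / n
    y_mean = sum(ys) / n
    cov = sum((t - t_mean) * (y - y_mean) for t, y in zip(ts, ys))
    var = sum((t - t_mean) ** 2 for t in ts)
    return cov / var

if __name__ == "__main__":
    k, b = 0.5, 0.02
    print(log_ratio_slope(k, b), (k - 1.0) * b)
```

The fitted slope matches $(k - 1)\,b$, so a ratio of two process variables in information equilibrium should appear as a straight line on a log-linear graph.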
One pair of process variables of interest are the unemployment level U and the
size of the labor force L, the ratio of which is the unemployment rate u ≡ U/L.
If we plot US unemployment data UNRATE from FRED (2017) on a log-linear
graph as we do in Fig. 3.1, we can observe lines of approximately constant slope
between recessions. This constant logarithmic slope⁶ α is the dynamic information
equilibrium of Eq. 3.3
$$\frac{d}{dt} \log u \approx (k - 1)\, \lambda \equiv \alpha \tag{3.4}$$
where λ is the growth rate of L. Similar graphs can be obtained for the ratios of
observables from the US Job Openings and Labor Turnover Survey (JOLTS). Let
us consider the seasonally adjusted JOLTS hires level (JTSHIL), job openings level
(JTSJOL), and the unemployment level (UNEMPLOY) and use the variables H, V
6 In cases where a ratio $x$ is constrained to be e.g. between 0 and 100% (for example the
unemployment rate will never rise above 100%) we should actually consider the variable
$$\log \frac{x}{1 - x} \sim \log x + O(x)$$
which reduces to $\log x$ for ratios away from 100%.
Figure 3.1: The US unemployment rate from FRED (2017) plotted on log-linear axes. Approxi-
mately constant rates of decline (constant α of Eq. 3.4) between recessions shown with lines.
3.1 Shocks
While the preceding model is adequate for these macroeconomic observables
outside of a recession (i.e. “equilibrium”), a complete description of the data
requires the addition of non-equilibrium shocks corresponding to recessions in the
case of labor market measures. If we subtract the log-linear dynamic equilibrium
from the unemployment rate data, we obtain a series of discrete steps at recessions.
We will model these steps using logistic functions (and therefore approximately
Gaussian shocks to the slope α in Eq. 3.4). The resulting model for n shocks is:
$$\log u(t) = \alpha (t - t_0) + c_0 + \sum_{i=1}^{n} \frac{a_i}{1 + \exp\left(\frac{t - t_i}{b_i}\right)} \tag{3.8}$$
Python and Mathematica implementations are available at GitHub via Smith (2017).
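For orientation, Eq. 3.8 can be written as a short function. This is a minimal sketch and not the implementation referenced in Smith (2017). The function name, the tuple layout of `shocks`, and the parameter values in the usage lines are hypothetical, and the clamp on the exponent is an added safeguard against floating-point overflow.

```python
import math

def log_unemployment(t, alpha, t0, c0, shocks):
    """Log-linear trend plus a sum of logistic steps, following Eq. 3.8.
    `shocks` is a list of (a_i, b_i, t_i) tuples."""
    value = alpha * (t - t0) + c0
    for a_i, b_i, t_i in shocks:
        # Clamp the exponent so math.exp cannot overflow for narrow shocks.
        z = max(-700.0, min(700.0, (t - t_i) / b_i))
        value += a_i / (1.0 + math.exp(z))
    return value

if __name__ == "__main__":
    # Hypothetical parameters: one shock of size 0.6 centered at t = 8.
    shocks = [(0.6, -0.3, 8.0)]
    for t in (0.0, 8.0, 16.0):
        print(t, log_unemployment(t, alpha=-0.084, t0=0.0, c0=2.0, shocks=shocks))
```

Each logistic term contributes half of its amplitude $a_i$ at $t = t_i$, and the sign of $b_i$ sets whether the step switches on or off as time increases.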
The effect of this minimization is to place the most points of log u(t) in the fewest
diagonal bins of slope α. This approach was chosen over e.g. direct nonlinear
regression of the function Eq. 3.8 in order to make the determination of the slope
α robust to different sizes and shapes of the shocks (parameterized by $a_i$ and
$b_i$), while simultaneously using as much of the data as possible⁷. The entropy
functional evaluated at points α ∈ [0, 0.2] in steps of 0.001 is shown in Fig. 3.2
and the minimum at α ≃ 0.084 y⁻¹ is indicated with a vertical line. This number
represents an approximate relative fall of 8% per year (e.g. an unemployment rate
of 10% would fall ∼ 0.8 percentage points to 9.2% over the course of a year).
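The entropy minimization can be sketched on synthetic data. The code below is an illustration under stated assumptions, not the procedure used to produce Fig. 3.2: it treats α as the rate of decline of log u, so the detrended series is log u + αt, and it bins that series at a fixed width and scans α over [0, 0.2] in steps of 0.001. The bin width, the function names, and the synthetic series (a decline of 0.084 per year plus one sharp logistic step) are all hypothetical.

```python
import math
from collections import Counter

def histogram_entropy(values, bin_width):
    """Shannon entropy (nats) of the values binned at a fixed bin width."""
    counts = Counter(math.floor(v / bin_width) for v in values)
    n = len(values)
    return -sum(c / n * math.log(c / n) for c in counts.values())

def best_slope(times, log_u, bin_width=0.05, alpha_max=0.2, step=0.001):
    """Scan alpha over [0, alpha_max] and return the value whose detrended
    series log_u + alpha * t has the lowest histogram entropy.
    Assumed convention: alpha is the rate of decline of log u."""
    best_alpha, best_entropy = None, float("inf")
    for i in range(int(round(alpha_max / step)) + 1):
        alpha = i * step
        detrended = [y + alpha * t for t, y in zip(times, log_u)]
        entropy = histogram_entropy(detrended, bin_width)
        if entropy < best_entropy:
            best_alpha, best_entropy = alpha, entropy
    return best_alpha

# Synthetic monthly series over 20 years: decline of 0.084 per year
# plus one sharp logistic shock of size 0.6 centered at t = 8.
times = [m / 12.0 for m in range(240)]
log_u = [2.02 - 0.084 * t + 0.6 / (1.0 + math.exp(-(t - 8.0) / 0.05))
         for t in times]

if __name__ == "__main__":
    print(best_slope(times, log_u))
```

When the trial slope matches the underlying rate of decline, the detrended points collapse onto a few horizontal levels separated by the shock and occupy few bins, so the scan should select a slope close to the 0.084 used to generate the series. As a check on the arithmetic in the text, exp(−0.084) ≈ 0.919, so a rate of 10% falls to roughly 9.2% after one year.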
In order to fit the shocks of Eq. 3.8, the data is then transformed by subtracting
the log-linear slope α ≃ 0.084 and fit to logistic functions (the parameters are
provided in Appendix A). This fit is shown in Fig. 3.3 along with the minimum
entropy histogram corresponding to α ≃ 0.084. We can transform back to the
unemployment rate domain by adding back the log-linear slope and taking the
exponential as shown in Fig. 3.4.
The same method may be applied to JOLTS data resulting in models of e.g.
the job openings rate (V /L) and unemployment rate (U/L) in Fig. 3.5 that may be
7 For narrow shocks, entropy minimization on α is independent of the shock parameters $a_i$, $b_i$,
and $t_i$.
[Figure: relative information entropy [-] plotted against logarithmic slope α from 0.00 to 0.20.]
Figure 3.2: Entropy functional Eq. 3.9 evaluated at points α ∈ [0, 0.2] in steps of 0.001. The
function shown is the information entropy relative to a uniform distribution over the same domain.
The minimum at α ≃ 0.084 is indicated with a vertical line.
[Figure: two panels. Left: log u − α(y − y0) against Time [y], 2000 to 2015. Right: log u − α(y − y0) against Number of data points.]
Figure 3.3: Left: Logarithm of the unemployment rate data (in percent) with the dynamic equilibrium
removed (yellow). The logistic function fit is shown in blue. Right: The minimum entropy
histogram of the unemployment rate data for α ≃ 0.084. The peaks of the histogram align with the
steps of the logistic functions.
4 ENSEMBLES AND MACROECONOMICS
[Figure: horizontal axis Year, 2000 to 2015.]
Figure 3.4: Model of US unemployment rate from 1995 to 2016 obtained from transforming the
model fit of Fig. 3.3 back to the unemployment rate domain. Data is blue, model is in red with 90%
confidence intervals shown as a light red band. A forecast conditional on the absence of a recession
shock is shown through 2019.
combined to illustrate a Beveridge curve in Fig. 3.6. The locations of the shocks for
the different labor market measures in the JOLTS data do not coincide and some
measures show signs of the shock statistically significantly earlier than others. In
particular, the center of the recession shock to JOLTS hires data precedes the shock
to vacancies, quits, and the unemployment rate in Fig. 3.7. This is speculative
since detailed JOLTS data is only available from FRED (2017) for one complete
recession (2008-9).
$$A_i \sim B^{k_i}$$
where $k_i$ is the information transfer index for the information equilibrium relationship
between $A_i$ and $B$. We will refer to the $k_i$ as the “k-state” for each market. Now an
exponentially growing economy with growing factor of production $B \sim e^{\gamma t}$ with
growth rate γ composed of several markets would quickly become dominated by a
[Figure: two panels, each plotted against Year, 2005 to 2020.]
Figure 3.5: Left: JOLTS job openings rate (JOR) and dynamic equilibrium model fit with two
shocks. Right: Unemployment rate (UNR) data over the same domain and dynamic equilibrium
model fit.
[Figure: job openings rate [%] against unemployment rate [%], with the years 2001, 2008, and 2014 marked.]
Figure 3.6: The Beveridge curve resulting from combining the JOR and UNR dynamic equilibrium
models on a single graph. The gray lines indicate the paths the data would follow in the absence of
recession shocks. Since the timing and amplitude of recession shocks are not equal, these shocks
cause the Beveridge curve to move from one equilibrium to another (red line).
[Figure: shock timing for each JTS observable (UNR, QUR, JOR, HIR) on a monthly axis spanning 2008 and 2009.]
Figure 3.7: The center of the shocks ($t_i$) including their width (given by the parameter $b_i$) for
JOLTS hires rate (HIR), job openings rate (JOR), quit rate (QUR), and unemployment rate (UNR).
The shock to HIR precedes the shock to UNR by several months.
4.1 Partition function 4 ENSEMBLES AND MACROECONOMICS
macroeconomic growth rate and choose a distribution of growth k-states consistent with it.
9 The approach here is analogous with thermodynamics where one maximum entropy equilibrium
of the macrostate is one with well-defined energy. Here, the growth rate is analogous to energy.
There is a deep connection between thermodynamics and information theory so there should be no
surprise that an analogy arises.
10 In thermodynamics, temperature $T$ plays the role of Lagrange multiplier $\beta^{-1} = kT$. This also
represents the most obvious point of departure from Smith and Foley (2008), who use price as a
Lagrange multiplier with the constraint defined in terms of a market “offer” aggregated across
agents.
which is simply the average of the initial values of the output for each market.
Therefore we can take the definition $\beta \equiv \log(1 + b)$ to be a consistent definition
of the Lagrange multiplier¹¹ for the case of a macro-observable growth rate $\langle k \rangle \gamma$.
Note that these choices for the Lagrange multiplier and partition function are based
on the information equilibrium treatment of Section 2.
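The definition of the partition function Eq. 4.2 also lets us derive a macroeconomic
information equilibrium relationship. We have already computed $\langle A \rangle$, so let
us compute
$$\begin{aligned}
\frac{d\langle A \rangle}{dB} &= \frac{d}{dB} \frac{1}{Z} \sum_i^n A_{0,i} \equiv \bar{A}_0 \frac{d}{dB} \frac{1}{Z} \\
&= -\frac{\bar{A}_0}{Z^2} \frac{dZ}{dB}
\end{aligned} \tag{4.5}$$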
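Now let us compute
$$\begin{aligned}
\langle k \rangle &= \frac{1}{Z} \sum_i^n k_i e^{-\beta k_i} \\
&= \frac{1}{Z} \sum_i^n k_i e^{-k_i \log(B/B_0)} \\
&= -\frac{1}{Z} \sum_i^n \frac{d}{d \log(B/B_0)} e^{-k_i \log(B/B_0)} \\
&= -\frac{B}{Z} \frac{dZ}{dB}
\end{aligned} \tag{4.6}$$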
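Combining Eqs. 4.4, 4.5, and 4.6 we find
$$\frac{d\langle A \rangle}{dB} = \langle k \rangle \frac{\langle A \rangle}{B} \tag{4.7}$$
which is formally similar to Eq. 2.5. The difference lies in the fact that $\langle k \rangle$
may change over time. However, if $\langle k \rangle$ changes slowly, then the solution to the
differential equation 4.7 can be approximated by
$$\langle A \rangle \sim B^{\langle k \rangle} \tag{4.8}$$
$$\langle p \rangle = \frac{d\langle A \rangle}{dB} \sim \langle k \rangle B^{\langle k \rangle - 1} \tag{4.9}$$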
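The ensemble relationships can be checked numerically with finite differences. The sketch below assumes the partition function has the form $Z = \sum_i e^{-\beta k_i}$ with $\beta = \log(B/B_0)$, which is inferred from the first two lines of the derivation of Eq. 4.6, and takes $\langle A \rangle = (1/Z) \sum_i A_{0,i}$ as in Eq. 4.5. The function names and the sample k-states and initial values are hypothetical.

```python
import math

def partition_function(B, ks, B0=1.0):
    """Z = sum_i exp(-beta * k_i) with beta = log(B / B0) (assumed form)."""
    beta = math.log(B / B0)
    return sum(math.exp(-beta * k) for k in ks)

def mean_k(B, ks, B0=1.0):
    """Ensemble average <k> = (1/Z) sum_i k_i exp(-beta * k_i)."""
    beta = math.log(B / B0)
    weighted = sum(k * math.exp(-beta * k) for k in ks)
    return weighted / partition_function(B, ks, B0)

def mean_A(B, ks, A0s, B0=1.0):
    """Ensemble average <A> = (1/Z) sum_i A_{0,i}."""
    return sum(A0s) / partition_function(B, ks, B0)

def derivative(f, x, h=1e-6):
    """Central finite-difference approximation to df/dx."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

if __name__ == "__main__":
    ks = [0.5, 1.0, 1.5, 2.0]      # hypothetical k-states
    A0s = [1.0, 2.0, 3.0, 4.0]     # hypothetical initial outputs
    B = 3.0
    lhs = derivative(lambda x: mean_A(x, ks, A0s), B)
    rhs = mean_k(B, ks) * mean_A(B, ks, A0s) / B
    print(lhs, rhs)
```

Under these assumptions the two printed values agree to within the finite-difference error, which is the content of Eq. 4.7. The same helper can be used to confirm that $-(B/Z)\,dZ/dB$ reproduces `mean_k`, as in Eq. 4.6.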
11 This definition implies that an economy with more of a given factor of production (i.e. a larger
economy) is analogous to a colder thermodynamic system since β ∼ 1/T in thermodynamics.
Another way to think of it is that a given unit of a factor of production is more and more likely to
be found contributing to a low growth market as an economy grows simply because there are more
ways to construct an economy of a given growth rate from a large number of low growth markets
than a few high growth ones. This could potentially form a basis for understanding the “secular
stagnation” of Summers (2013) or the “Great Stagnation” of Cowen (2011). It is possible slower
growth may just be the most likely path an economy follows that becomes almost certain for large
economies.
4.2 Okun’s Law 4 ENSEMBLES AND MACROECONOMICS
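based on the macro observables $\langle k^{(1)} \rangle$ and $\langle k^{(2)} \rangle$ being well-defined¹². Therefore
we can obtain the Cobb-Douglas form for slowly changing $\langle k^{(i)} \rangle$:
$$\langle A \rangle \sim \left( \frac{B^{(1)}}{B_0^{(1)}} \right)^{\langle k^{(1)} \rangle} \left( \frac{B^{(2)}}{B_0^{(2)}} \right)^{\langle k^{(2)} \rangle}$$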
Finally, we can extend these results for ensembles of markets to the case of dynamic
information equilibrium discussed in Section 3 in the case of slowly varying $\langle k \rangle$ so
that for the unemployment rate:
$$\frac{d}{dt} \log \frac{\langle U \rangle}{L} \approx (\langle k \rangle - 1)\, \lambda \equiv \alpha$$
where λ is the growth rate of the labor force. We can see that α measured through
the entropy minimization procedure of Section 3.1 could potentially change slowly
as the labor force grows. The fact that assuming constant α remains an empirically
accurate model over the course of 20 years in Fig. 3.4 would allow us to put
constraints on the rate of change of α. In the present paper, we simply note that a
fixed dynamic equilibrium is a very good approximation.
5 SUMMARY AND CONCLUSION
[Figure: vertical axis CPI inflation [%, all items].]
Figure 4.1: The model of US inflation using N = NGDP and total hours worked H is shown in
blue. Inflation data (CPI all items) is in green.
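rearranging, we have
$$H = \langle k \rangle \frac{\langle N \rangle}{\langle P \rangle}$$
Now $N/P$ is real output (RGDP), and taking a logarithmic time derivative of both
sides yields (for $\langle k \rangle$ approximately constant)
$$\frac{d}{dt} \log H = \frac{d}{dt} \log \mathrm{RGDP}$$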
which is a form of Okun’s law (falls in real output are correlated with falls in total
hours worked). This works fairly well empirically (using data for the US from
FRED) as shown in Fig. 4.1. Another consequence of slowly varying $\langle k \rangle$ is that,
as shown in Smith (2015), Eq. 4.10 would lead to an aggregate demand/aggregate
supply (in this case, labor) diagrammatic macro model in the short run where
aggregate demand $\langle N \rangle$ or aggregate (labor) supply $H$ vary slowly. In the long run,
$\langle N \rangle \sim H^{\langle k \rangle}$, reproducing the long run aggregate supply curve.
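The step from the rearranged relationship to the Okun's law form can be spelled out. The following is a short worked derivation under the stated assumption that the ensemble average of k is approximately constant; it identifies the ratio of aggregate nominal output to the price level with RGDP, as in the text.

```latex
\begin{align}
H &= \langle k \rangle \frac{\langle N \rangle}{\langle P \rangle} \\
\log H &= \log \langle k \rangle + \log \frac{\langle N \rangle}{\langle P \rangle} \\
\frac{d}{dt} \log H &= \frac{d}{dt} \log \langle k \rangle + \frac{d}{dt} \log \mathrm{RGDP}
  \approx \frac{d}{dt} \log \mathrm{RGDP}
\end{align}
```

The first term on the right vanishes only to the extent that the ensemble average changes slowly compared with hours worked and real output, so the relation should be read as an approximation.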
via Eq. 4.7. This framework is not empty formalism, but leads to empirically
accurate models of unemployment and inflation as well as multiple avenues for
future research and new understanding of well-established results like Okun’s
Law and matching theory. Many of the results are critically dependent on the
information transfer index $k$ or its ensemble average $\langle k \rangle$. As shown in Zorick
and Smith (2016), the information transfer index is directly related to Lyapunov
exponents in dynamical systems. Here, this parameter characterizes the relative
information content of state spaces and opportunity sets, and therefore we might
view economic theory as the study of the state spaces available to agents rather
than the agents themselves and their decisions. Much like how Becker showed
agent rationality is not a critical assumption, we have shown that maximum entropy
arguments agnostic about the details of agent behavior can be used to describe
empirical data.
Acknowledgment
We would like to thank David Glasner for bringing Becker (1962) to our atten-
tion. We would also like to thank Peter Fielitz and Guenter Borchardt for useful
discussions.
A Appendix
The model fit for Fig. 3.4 has three shocks (n = 3) with two positive (i.e. increasing
the unemployment rate) and one negative (i.e. decreasing the unemployment
rate). By convention, the direction of the shock is dictated by the sign of $b_i$ as
Mathematica’s NonlinearModelFit function was found to be more stable for that
choice when selecting Method→Automatic. Standard errors are given for the
shock fit parameters. The model is given by Eq. 3.8 rewritten here for convenience.
$$\log u(t) = \alpha (t - t_0) + c_0 + \sum_{i=1}^{n} \frac{a_i}{1 + \exp\left(\frac{t - t_i}{b_i}\right)} \tag{A.1}$$
References
Becker, Gary S. (1962). Irrational Behavior and Economic Theory. Journal of
Political Economy Vol. 70, No. 1 (Feb 1962), pp. 1-13
Chen, M. Keith and Lakshminarayanan, Venkat and Santos, Laurie (2005). The
Evolution of Our Preferences: Evidence from Capuchin Monkey Trading Behav-
ior (June 2005). Cowles Foundation Discussion Paper No. 1524. Available at
SSRN: http://ssrn.com/abstract=675503
Cockshott, Paul, Cottrell, Allin F., Michaelson, Gregory John, Wright, Ian P. and
Yakovenko, Victor (2011). Classical Econophysics.
Cowen, Tyler (2011). The Great Stagnation: How America Ate All the Low-Hanging
Fruit of Modern History, Got Sick, and Will (Eventually) Feel Better.
Dutton.
Fielitz, Peter and Guenter Borchardt (2014). A general concept of natural
information equilibrium: from the ideal gas law to the K-Trumpler effect.
arXiv:0905.0610v4 [physics.gen-ph].
Hayek, Friedrich (1945). The Use of Knowledge in Society. The American Eco-
nomic Review. XXXV, No. 4. pp. 519-30 (1945).
Petrongolo, Barbara and Pissarides, Christopher (2001). Looking into the Black
Box: A Survey of the Matching Function. Journal of Economic Literature XXXIX
(June 2001) 390-431.
Smith, Eric and Foley, Duncan K. (2008). Classical thermodynamics and economic
general equilibrium theory. Journal of Economic Dynamics & Control 32 (2008)
7-65.
Williams, Michael A., P. Pinto, Brijesh, and Park, David (2015). Global evidence
on the distribution of firm growth rates. Physica A 432, 15 August 2015, Pages
102-107.