
Maximum entropy and information theory approaches to economics
Jason Smith∗

Abstract
In the natural sciences, complex non-linear systems composed of large num-
bers of smaller subunits provide an opportunity to apply the tools of statistical
mechanics and information theory. The principle of maximum entropy can
usually provide shortcuts in the treatment of these complex systems. However,
there is an impasse to straightforward application to social and economic
systems: the lack of well-defined constraints for Lagrange multipliers. This is
typically treated in economics by introducing marginal utility as a Lagrange
multiplier.
Jumping off from Gary Becker’s 1962 paper "Irrational Behavior and Eco-
nomic Theory" — a maximum entropy argument in disguise — we introduce
Peter Fielitz and Guenter Borchardt’s concept of "information equilibrium"
presented in arXiv:0905.0610v4 [physics.gen-ph] as a means of applying
maximum entropy methods even in cases where well-defined constraints
such as energy conservation required to define Lagrange multipliers and
partition functions are not obvious (i.e. economics). From these initial steps
we are able to motivate a well-defined constraint in terms of growth rates
and develop a formalism for ensembles of markets described by information
equilibrium conditions. We apply information equilibrium to a description
of the US unemployment rate, connect it to search and matching theory, and
empirical regularities such as Okun’s Law. This represents a step toward Lee
Smolin’s call for a "statistical economics" analogous to statistical mechanics
in arXiv:0902.4274 [q-fin.GN].

Keywords: Information theory, macroeconomics, microeconomics


Journal of Economic Literature Classification: C00, E10, E30, E40.

∗ Associate Technical Fellow, The Boeing Company. P. O. Box 3707, Seattle, Washington 98124.
Email: jason.r.smith4@boeing.com.


1 Introduction
In 1962, University of Chicago economist Gary Becker published a paper titled
"Irrational Behavior and Economic Theory". The original purpose of Becker (1962)
was to immunize economics against attacks on the idealized rationality typically
assumed in models. The paper briefly sparked a debate between Becker and Israel
Kirzner¹ about the role of rationality in economic theory.
Becker’s main argument was that ideal rationality was not critical to microe-
conomic theory because random agents can be used to reproduce some important
theorems. Consider the opportunity set (state space) given a budget constraint
for two goods. An agent may select any point inside the budget constraint. In
order to find which point the agents select, economists typically introduce a utility
function for the agents (one good may produce more utility than the other) and
then solve for the maximum utility on the opportunity set. As the price changes for
one good (meaning more or less of that good can be bought given the same budget
constraint), the utility maximizing point on the opportunity set moves. The effect
of these price changes selects a different point on the opportunity set, tracing out a
demand curve.
Instead of the agents selecting a point through utility maximization, Becker
assumed every point in the opportunity set was equally likely — that agents selected
points in the opportunity set at random. In this case, the average is at the “center
of mass” of the region inside the budget constraint. However, Becker showed that
changing the price of one of the goods still produced a demand curve just like in
the utility maximization case thereby demonstrating microeconomics emerging
from random behavior.
There are a few key points here:
• Becker is using the principle of indifference and therefore is presenting a
maximum entropy argument. Without prior information, there is no rea-
son to expect any point in the opportunity set to be more likely than any
other. Each point is equally likely (equivalent points should be assigned
equal probabilities). The generalization of this principle is the principle of
maximum entropy: given prior information, the probability distribution that
best represents the current state of knowledge is the one with maximum
entropy. The present paper presents a mathematical framework for applying
the principle of maximum entropy and information theory.
• Becker (1962) adds the assumption that the average must saturate the budget
constraint in order to more completely reproduce the traditional microeconomic
argument. However, as the number of goods increases, the dimension
of the opportunity set increases. For a large number of dimensions, the
“center of mass” of the opportunity set approaches the budget constraint
hyperplane. Therefore, instead of assuming saturation, one can simply assume
a large number of goods, reducing the required assumptions about agents and
making it truly a maximum entropy argument.

¹ This exchange seemed to end abruptly and became largely forgotten, as documented by Lagueux (2010).

• There is no real requirement that the behavior be truly random — it just must
result in a maximum entropy distribution. For example, the behavior could
be so complex as to appear random (e.g. chaotic dynamics or algorithmic
randomness), or it could be deterministic with a random distribution of initial
conditions (e.g. molecules in a gas). The key requirement is that the behavior
is uncoordinated — agents do not preferentially select the same specific
point in the state space. Coordinated actions (spontaneous falls in entropy)
are a possible mechanism for market failures (e.g. recessions, “bubbles”)
following from human behavior (“groupthink”, panic, etc).

• Experiments where the traditional microeconomics of supply and demand
appear to arise spontaneously are not very surprising because they do not
depend strongly on the underlying agents. From Vernon Smith’s multiple
experiments with students at the University of Arizona to the experiments
in Chen et al (2005) using capuchin monkeys at Yale, most agents capable
of exploring the opportunity set (state space) will manifest some rational
microeconomic behavior.
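
Becker's random-agent argument in the points above can be illustrated with a small Monte Carlo sketch. It is only an illustration under stated assumptions, not a reconstruction of Becker's own calculation: the helper `sample_budget_set`, the budget of 100, the prices, and the sample sizes are all hypothetical choices, and NumPy is assumed to be available. Agents are drawn uniformly from the region inside the budget constraint, for which the average demand for good i is M/((n + 1) p_i) with budget M, n goods, and price p_i, so the average spent share of the budget is n/(n + 1).

```python
import numpy as np

def sample_budget_set(prices, budget, n_agents, rng):
    """Draw consumption bundles uniformly from {x >= 0, prices . x <= budget}."""
    prices = np.asarray(prices, dtype=float)
    n_goods = prices.size
    # Dirichlet(1, ..., 1) with n_goods + 1 components is uniform on the simplex;
    # dropping the last ("unspent") component gives uniform spending shares.
    shares = rng.dirichlet(np.ones(n_goods + 1), size=n_agents)[:, :n_goods]
    return shares * budget / prices

rng = np.random.default_rng(0)
budget = 100.0  # hypothetical budget M

# Two goods: raise the price of good 1 and record the average demand for it.
mean_demand = []
for p1 in (1.0, 2.0, 4.0):
    bundles = sample_budget_set([p1, 1.0], budget, 200_000, rng)
    mean_demand.append(bundles[:, 0].mean())

# Many goods: record the average share of the budget that is spent.
spent_share = {}
for n_goods in (2, 10, 100):
    prices = np.ones(n_goods)
    bundles = sample_budget_set(prices, budget, 20_000, rng)
    spent_share[n_goods] = (bundles @ prices).mean() / budget
```

The list `mean_demand` falls as the price rises (a demand curve without utility maximization), and `spent_share` approaches 1 as the number of goods grows, which is the "center of mass" approaching the budget hyperplane.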

Jaynes (1991) represents an early attempt at applying maximum entropy and in-
formation theory to economics, and many papers have invented thermodynamic
approaches. An exhaustive survey is beyond the scope of the present paper. How-
ever, we will proceed from Becker’s proto-maximum entropy argument to a more
explicit approach via Fielitz and Borchardt (2014).

2 Information equilibrium
The maximum entropy approach typically requires the definition of constraints
(such as conservation laws), and Lagrange multipliers (such as temperature) are
introduced to maintain them in optimization problems (entropy maximization,
energy minimization). In economics, however, few true constraints exist. Even bud-
get constraints are not necessarily binding when one considers economic growth,
lending, asset valuation, and the creation of money.
Economics does in fact employ Lagrange multipliers in optimization problems.
Whereas temperature is the concept introduced in thermodynamics as the Lagrange
multiplier, the Lagrange multiplier in economics is marginal utility (of consumption,
income, etc. depending on the problem). We will take a different approach.
In order to address the issue of constraints (originally for physical systems),
Fielitz and Borchardt (2014) developed a formalism to look at how far one could
take maximum entropy arguments in the absence of constraints based on infor-
mation theory, deriving some simple yet general relationships between process
variables that hold under the condition of information equilibrium. Smith (2015)
later applied these results to economic systems. Let us review the basic results.
Information equilibrium posits that the information content of two random processes d
and s (eventually demand and supply below) is equal, i.e.
$$ I(d) = I(s) \tag{2.1} $$
The Shannon information entropy of a random variable X with discrete state
probabilities $p_i$ is given by
$$ H(X) = -\sum_i p_i \log p_i $$
as in Shannon (1948), where the sum is taken over all the states i (and $\sum_i p_i = 1$).
Also note that p log p = 0 for p = 0. The Shannon entropy is additive, such that
the information entropy of n events (draws) of the random variable X is given by
I(nX) = n H(X).
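
As a minimal sketch of these definitions (the function names `shannon_entropy` and `information` and the example distributions are our own illustrative choices, not part of the original formalism), the entropy and its additivity over n independent draws can be computed directly:

```python
import math

def shannon_entropy(probs):
    """H(X) = -sum_i p_i log p_i in nats, with the convention 0 log 0 = 0."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return -sum(p * math.log(p) for p in probs if p > 0)

def information(n_draws, probs):
    """I(nX) = n H(X) for n independent draws of X."""
    return n_draws * shannon_entropy(probs)

sigma = 8                                      # hypothetical number of states
uniform = [1.0 / sigma] * sigma
h_uniform = shannon_entropy(uniform)           # equals log(sigma) for a uniform distribution
h_skewed = shannon_entropy([0.5, 0.5, 0.0])    # the zero-probability state contributes nothing
i_ten = information(10, uniform)               # ten draws carry ten times H(X)
```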
How does this relate to economics? In the traditional Walrasian definition of
economic equilibrium where supply meets demand with no excess of either, the
distributions of supply P(s) and demand P(d) are in information equilibrium. The
information required to specify the spatial, temporal probability distribution of
supply must be equal to the information required to specify probability distribution
of demand as any difference in information would represent an excess supply
or demand. Note that this is not as strict as the realized distribution of supply
and demand being equal (i.e. after the random variable ‘events’ drawn from the
distribution). It implies that the realized distributions are only equal on average.
The distribution of a large sample of random events drawn from these probability
distributions will approximately coincide; we can think of coinciding supply events
and demand events as market transaction events. So in economics, we could say that
the information entropy of $n_d$ draws from the distribution P(d) of the demand random
variable d is equal to the information entropy of $n_s$ draws from the distribution
P(s) of the supply random variable s:
$$ I(d) = I(s) \tag{2.2} $$
$$ n_d H(d) = n_s H(s) \tag{2.3} $$
and call it information equilibrium. The market can be seen as a system for
equalizing the distributions of supply and demand (so that everywhere there is
some demand, there is some supply, on average at least, in an ideal market). Let
us take the distributions P to be uniform distributions (over i = 1...σ symbols) so
that²:
$$ H(X) = -\sum_{i=1}^{\sigma} p_i \log p_i = -\sum_{i=1}^{\sigma} \frac{1}{\sigma} \log \frac{1}{\sigma} = -\frac{\sigma}{\sigma} \log \frac{1}{\sigma} = \log \sigma $$
The information in n such events (a string of n symbols from an alphabet of size σ
with uniformly distributed symbols) is just
$$ n H(X) = n \log \sigma $$
so that Eq. 2.3 becomes
$$ n_d \log \sigma_d = n_s \log \sigma_s \tag{2.4} $$
Let us take $n_d, n_s \gg 1$ and define $n_d \equiv D/dD$ (in an abuse of notation where dD is
an infinitesimal unit of demand) and $n_s \equiv S/dS$ so we can write
$$ \frac{D}{dD} \log \sigma_d = \frac{S}{dS} \log \sigma_s $$
or
$$ \frac{dD}{dS} = k \frac{D}{S} \tag{2.5} $$
where we call $k \equiv \log \sigma_d / \log \sigma_s$ the information transfer index, which we will
generally take to be empirically measured³. The differential equation 2.5 defines the
information equilibrium condition (a market), for which we will use the shorthand
notation $p : D \rightleftarrows S$. Additionally, the left hand side is the exchange rate for an
infinitesimal unit of demand for an infinitesimal unit of supply; it represents an
abstract price $p \equiv dD/dS$.
Interestingly, a less general form of Eq. 2.5 with k = 1 was written down by
Irving Fisher (1892) in his thesis, before he continued on to introduce utility,
and credited to the original marginalist arguments introduced by William Jevons
and Alfred Marshall. The picture of two distributions matching, leading to the
information equilibrium condition Eq. 2.5, is also remarkably similar to Generative
Adversarial Networks (GANs), described in Goodfellow et al (2014) and used to train
neural networks in machine learning. The demand distribution is analogous
to the real data, the supply distribution is analogous to the generated model,
and the abstract price is analogous to the discriminator. The discriminator in
GANs minimizes the information difference (via the KL-divergence) between the
distribution of real data and the generative model of that distribution. However, in
contrast to economics, the real data is usually taken as a fixed set of training data
while the demand distribution is subject to change.

² The Shannon (1948) definition of information entropy reduces to the Hartley (1928) definition
for a uniform distribution.
³ Taking k to be an empirical parameter allows us to view Eq. 2.5 as the leading term in an
effective theory defined by a scale invariance D → αD, S → βS.

[Figure: communication channel carrying information from I(d) to I(s), with I(s) ≤ I(d)]
Figure 2.1: A diagram of the communication channel described by the information transfer
framework. We are agnostic about the properties of the communication channel (e.g. noise level or
transmission mechanism). The abstract price represents a measure of information flow through this
channel rather than a receiver or transmitter.

2.1 Non-ideal information transfer


One interpretation of equation 2.5 and information equilibrium is as a communi-
cation channel per Shannon (1948) where we interpret the demand distribution
as the information source distribution (distribution of transmitted messages) and
supply distribution as the information destination distribution (distribution of re-
ceived messages). The diagram looks like Fig. 2.1. Note that this differs from the
communication network analogy of Hayek (1945) where prices act to aggregate in-
formation. In this picture, price changes simply indicate information flow between
source (transmitter) and destination (receiver). We will remain agnostic about the
underlying process of how the information flows through the channel⁴, only noting
that it must in order for supply and demand distributions to be in equilibrium.
If the demand is the source of information about e.g. the allocation (distribution)
of goods and services, then we can assert

I(d) ≥ I(s)

since you cannot receive more information than is transmitted. We call the case
where information is lost non-ideal information transfer. Following our derivation
of Eq. 2.5, our differential equation becomes a differential inequality
$$ p \equiv \frac{dD}{dS} \leq k \frac{D}{S} \tag{2.6} $$
Use of Gronwall’s inequality⁵ tells us that the solutions to the differential equation
2.5 now become bounds on the solutions to Eq. 2.6 in the case of non-ideal
information transfer. For example, the information equilibrium price (the ideal
price) now becomes an upper bound on the observed price in the case of non-ideal
information transfer.

⁴ See Rao (2017) for an example of a study of how information flows through a trading network.

2.2 Solutions to the equations


So what are the solutions to the differential equations 2.5 and 2.6? The general
solution (in the case that most closely corresponds to what economists call general
equilibrium, where supply and demand adjust together) is
$$ \frac{D}{D_0} \geq \left( \frac{S}{S_0} \right)^{k} \tag{2.7} $$
$$ p \leq k \frac{D_0}{S_0} \left( \frac{S}{S_0} \right)^{k-1} \tag{2.8} $$
where $D_0$ and $S_0$ are constants. Equality is obtained for information equilibrium,
Eq. 2.5. If we assume that either S or D adjusts to changes faster than the other
(i.e. $D \approx D_0$ a constant or analogously $S \approx S_0$) for small changes $\Delta D \equiv D - D_0$
or $\Delta S \equiv S - S_0$, conditions that most closely correspond to what economists call
partial equilibrium, we obtain supply and demand diagrams as presented in Smith
(2015).
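
As a numerical check of the equality case, the sketch below integrates Eq. 2.5 with a classical fourth-order Runge-Kutta step and compares the result against the closed forms of Eqs. 2.7 and 2.8 taken with equality. The function name `integrate_information_equilibrium` and all parameter values are hypothetical, chosen only for illustration.

```python
def integrate_information_equilibrium(k, d0, s0, s_end, steps=10_000):
    """Integrate dD/dS = k D / S from (s0, d0) to s_end with classical RK4."""
    def f(s, d):
        return k * d / s
    h = (s_end - s0) / steps
    s, d = s0, d0
    for _ in range(steps):
        k1 = f(s, d)
        k2 = f(s + h / 2, d + h * k1 / 2)
        k3 = f(s + h / 2, d + h * k2 / 2)
        k4 = f(s + h, d + h * k3)
        d += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        s += h
    return d

k, d0, s0, s_end = 1.5, 2.0, 1.0, 3.0                  # hypothetical values
d_numeric = integrate_information_equilibrium(k, d0, s0, s_end)
d_closed = d0 * (s_end / s0) ** k                      # Eq. 2.7 with equality
p_closed = k * (d0 / s0) * (s_end / s0) ** (k - 1)     # Eq. 2.8 with equality
p_from_ratio = k * d_numeric / s_end                   # Eq. 2.5: p = k D / S
```

The numerical and closed-form values of D and p agree to within the integration error, which is the sense in which the power law solves the information equilibrium condition.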
We can also extend Eq. 2.5 to the case of multiple information destinations
$D \rightleftarrows S_i$ such that
$$ \frac{\partial D}{\partial S_i} = k_i \frac{D}{S_i} $$
Solving this system of differential equations for a single information source
$D = D(S_i)$, we obtain:
$$ \frac{D}{D_0} = \prod_i \left( \frac{S_i}{S_{0,i}} \right)^{k_i} \tag{2.9} $$
This allows us to view the information equilibrium condition as a single factor
of production, and the straightforward generalization to multiple information
destinations as multiple factors of production. For two factors of production
A and B (with information transfer indices a and b), we obtain the traditional
Cobb-Douglas form
$$ \frac{D}{D_0} = \left( \frac{A}{A_0} \right)^{a} \left( \frac{B}{B_0} \right)^{b} $$
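
A quick way to see that the Cobb-Douglas form satisfies the information equilibrium condition for each factor separately is a finite-difference check. The exponents 0.3 and 0.7, the reference values, and the helper names below are hypothetical and serve only as an example.

```python
def cobb_douglas(a_input, b_input, a=0.3, b=0.7, d0=1.0, a0=1.0, b0=1.0):
    """D = D0 (A/A0)^a (B/B0)^b, the two-factor case of Eq. 2.9."""
    return d0 * (a_input / a0) ** a * (b_input / b0) ** b

def partial(f, x, y, wrt, h=1e-6):
    """Central finite-difference estimate of a partial derivative of f(x, y)."""
    if wrt == 0:
        return (f(x + h, y) - f(x - h, y)) / (2 * h)
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

A, B = 2.0, 5.0                                   # hypothetical factor levels
D = cobb_douglas(A, B)
# Each pair compares dD/dS_i (left) with k_i D / S_i (right).
lhs_a, rhs_a = partial(cobb_douglas, A, B, wrt=0), 0.3 * D / A
lhs_b, rhs_b = partial(cobb_douglas, A, B, wrt=1), 0.7 * D / B
```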
5 Gronwall (1919)


3 Dynamic information equilibrium


Section 2 reviewed the main mathematical results of Smith (2015). In this section,
we add an explicit time dependence and use the information equilibrium condition
2.5 and Cobb-Douglas forms 2.9 to develop a model we will call dynamic informa-
tion equilibrium. This model directly connects to search and matching theory and
allows us to build an empirically accurate description of unemployment.
If we look at the information equilibrium condition $p : A \rightleftarrows B$ with information
transfer index k and assume as ansätze that both A and B are exponentially growing, i.e.
$$ A \sim e^{at} \tag{3.1} $$
$$ B \sim e^{bt} \tag{3.2} $$
then the logarithmic derivative (growth rate) of the ratio A/B is
$$ \frac{d}{dt} \log p = \frac{d}{dt} \log \frac{A}{B} \approx (k - 1)\, b = a - b \tag{3.3} $$
where the solution $A \sim B^{k}$ to Eq. 2.5 requires a = kb. Since the right hand side of
Eq. 3.3 is a constant, we can identify cases of information equilibrium empirically
by observing lines of constant slope on a logarithmic graph of time series of ratios
of process variables A/B in information equilibrium (or the relevant abstract price
p).
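
The constant-slope signature of Eq. 3.3 can be checked on synthetic data. In this sketch the values of k and b, the prefactor 3.0, and the time grid are hypothetical, and NumPy is assumed to be available; the series are constructed to be in information equilibrium exactly, so the fitted slope should match (k − 1) b.

```python
import numpy as np

k, b = 0.8, 0.02            # hypothetical transfer index and growth rate of B
t = np.linspace(0.0, 50.0, 201)
B = np.exp(b * t)
A = 3.0 * B ** k            # information equilibrium solution A ~ B^k, so a = k b

# Slope of log(A/B) against time, estimated by least squares.
slope, intercept = np.polyfit(t, np.log(A / B), 1)
expected = (k - 1) * b      # right hand side of Eq. 3.3
```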
One pair of process variables of interest is the unemployment level U and the
size of the labor force L, the ratio of which is the unemployment rate u ≡ U/L.
If we plot US unemployment data UNRATE from FRED (2017) on a log-linear
graph as we do in Fig. 3.1, we can observe lines of approximately constant slope
between recessions. This constant logarithmic slope⁶ α is the dynamic information
equilibrium of Eq. 3.3
$$ \frac{d}{dt} \log u \approx (k - 1)\, \lambda \equiv \alpha \tag{3.4} $$
where λ is the growth rate of L. Similar graphs can be obtained for the ratios of
observables from the US Job Openings and Labor Turnover Survey (JOLTS). Let
us consider the seasonally adjusted JOLTS hires level (JTSHIL), job openings level
(JTSJOL), and the unemployment level (UNEMPLOY) and use the variables H, V
(for vacancies), and U respectively. If we take these variables to be in information
equilibrium $H \rightleftarrows U$ and $H \rightleftarrows V$ with information transfer indices $k_u$ and $k_v$, we
obtain a Cobb-Douglas form
$$ H(U, V) = a\, U^{k_u} V^{k_v} \tag{3.5} $$

[Figure: unemployment rate [%] versus year, 1960 to 2020, log-linear axes]
Figure 3.1: The US unemployment rate from FRED (2017) plotted on log-linear axes. Approximately
constant rates of decline (constant α of Eq. 3.4) between recessions shown with lines.

⁶ In cases where a ratio x is constrained to be e.g. between 0 and 100% (for example, the
unemployment rate will never rise above 100%) we should actually consider the variable
$$ \log \frac{x}{1 - x} \sim \log x + o(x) $$
which reduces to log x for ratios away from 100%.

This Cobb-Douglas form is a common ansatz for the matching function in search
and matching theory; see Petrongolo and Pissarides (2001) for a review. We
also obtain the dynamic information equilibria (if we assume the variables to be
exponentially growing with rates $r_u$ and $r_v$)
$$ \frac{d}{dt} \log \frac{H}{U} \approx (k_u - 1)\, r_u \equiv \alpha_u \tag{3.6} $$
$$ \frac{d}{dt} \log \frac{H}{V} \approx (k_v - 1)\, r_v \equiv \alpha_v \tag{3.7} $$
Since empirically $\alpha_u$ and $\alpha_v$ have opposite signs (H/V slopes downward and H/U
slopes upward), and using the fact that
$$ \frac{d}{dt} \log \frac{V}{U} = \alpha_u - \alpha_v = \frac{d}{dt} \log \frac{V/L}{U/L} = \text{constant} $$
we can show that these two dynamic equilibria combine to create a Beveridge
curve. Were it not for shocks discussed in Section 3.1, unemployment and openings
would be constrained to follow a hyperbola in (U/L, V/L) space. These shocks
cause the time series to jump from one hyperbola to another.

3.1 Shocks
While the preceding model is adequate for these macroeconomic observables
outside of a recession (i.e. “equilibrium”), a complete description of the data
requires the addition of non-equilibrium shocks corresponding to recessions in the
case of labor market measures. If we subtract the log-linear dynamic equilibrium
from the unemployment rate data, we obtain a series of discrete steps at recessions.
We will model these steps using logistic functions (and therefore approximately
Gaussian shocks to the slope α in Eq. 3.4). The resulting model for n shocks is:
$$ \log u(t) = \alpha (t - t_0) + c_0 + \sum_{i=1}^{n} \frac{a_i}{1 + \exp\left( \frac{t - t_i}{b_i} \right)} \tag{3.8} $$
We will use an entropy minimization algorithm to find the dynamic equilibrium
constant α, minimizing an entropy functional H[X]
$$ \min_{\alpha} H[\log u(t) - \alpha (t - t_0)] \tag{3.9} $$

Python and Mathematica implementations are available at GitHub via Smith (2017).
The effect of this minimization is to place the most points of log u(t) in the fewest
diagonal bins of slope α. This approach was chosen over e.g. direct nonlinear
regression of the function Eq. 3.8 in order to make the determination of the slope
α robust to different sizes and shapes of the shocks (parameterized by $a_i$ and
$b_i$), while simultaneously using as much of the data as possible⁷. The entropy
functional evaluated at points α ∈ [0, 0.2] in steps of 0.001 is shown in Fig. 3.2
and the minimum at α ≃ 0.084 y⁻¹ is indicated with a vertical line. This number
represents an approximate relative fall of 8% per year (e.g. an unemployment rate
of 10% would fall ∼ 0.8 percentage points to 9.2% over the course of a year).
In order to fit the shocks of Eq. 3.8, the data is then transformed by subtracting
the log-linear slope α ≃ 0.084 and fit to logistic functions (the parameters are
provided in Appendix A). This fit is shown in Fig. 3.3 along with the minimum
entropy histogram corresponding to α ≃ 0.084. We can transform back to the
unemployment rate domain by adding back the log-linear slope and taking the
exponential as shown in Fig. 3.4.
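
The procedure can be sketched on synthetic data as follows. This is a minimal illustration under our own assumptions and is not the implementation referenced via Smith (2017): the helper names, the use of a fixed 30-bin histogram as the entropy functional, the signed-slope convention (a decline is a negative slope, so the scan runs over [−0.2, 0]), and all shock parameters are hypothetical. NumPy is assumed to be available.

```python
import numpy as np

def shock_model(t, alpha, c0, shocks, t0=0.0):
    """Eq. 3.8: log u = alpha (t - t0) + c0 + sum_i a_i / (1 + exp((t - t_i) / b_i))."""
    log_u = alpha * (t - t0) + c0
    for a_i, b_i, t_i in shocks:
        log_u = log_u + a_i / (1.0 + np.exp((t - t_i) / b_i))
    return log_u

def histogram_entropy(x, n_bins=30):
    """Shannon entropy (nats) of the binned values of x."""
    counts, _ = np.histogram(x, bins=n_bins)
    p = counts[counts > 0] / counts.sum()
    return -np.sum(p * np.log(p))

def minimum_entropy_slope(t, log_u, slopes, n_bins=30):
    """Scan candidate slopes; return the one minimizing the entropy of the residual."""
    entropies = np.array([histogram_entropy(log_u - s * t, n_bins) for s in slopes])
    return slopes[np.argmin(entropies)], entropies

# Synthetic monthly data (hypothetical parameters): a steady decline of 0.084
# per year interrupted by two upward steps; b_i < 0 makes each step rise in time.
t = np.arange(0.0, 20.0, 1.0 / 12.0)
true_slope = -0.084
log_u = shock_model(t, true_slope, c0=2.0,
                    shocks=[(0.5, -0.3, 8.0), (0.4, -0.3, 15.0)])

slopes = np.arange(-0.2, 0.0005, 0.001)   # candidate slopes in steps of 0.001
best_slope, entropies = minimum_entropy_slope(t, log_u, slopes)
```

At the correct slope the residual collapses onto a few discrete levels (one per inter-shock period), which occupy few histogram bins and therefore give low entropy; at a wrong slope the residual smears across many bins. Nothing about the shock sizes or widths enters the scan, which is the robustness property motivating the approach.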
The same method may be applied to JOLTS data resulting in models of e.g.
the job openings rate (V/L) and unemployment rate (U/L) in Fig. 3.5 that may be
combined to illustrate a Beveridge curve in Fig. 3.6. The locations of the shocks for
the different labor market measures in the JOLTS data do not coincide and some
measures show signs of the shock statistically significantly earlier than others. In
particular, the center of the recession shock to JOLTS hires data precedes the shock
to vacancies, quits, and the unemployment rate in Fig. 3.7. This is speculative
since detailed JOLTS data is only available from FRED (2017) for one complete
recession (2008-9).

⁷ For narrow shocks, entropy minimization on α is independent of the shock parameters $a_i$, $b_i$,
and $t_i$.

[Figure: relative information entropy versus logarithmic slope α, for α from 0.00 to 0.20]
Figure 3.2: Entropy functional Eq. 3.9 evaluated at points α ∈ [0, 0.2] in steps of 0.001. The
function shown is the information entropy relative to a uniform distribution over the same domain.
The minimum at α ≃ 0.084 is indicated with a vertical line.

[Figure: left panel, log u − α(y − y0) versus time [y], 2000 to 2015; right panel, log u − α(y − y0) versus number of data points]
Figure 3.3: Left: Logarithm of the unemployment rate data (in percent) with the dynamic equilibrium
removed (yellow). The logistic function fit is shown in blue. Right: The minimum entropy
histogram of the unemployment rate data for α ≃ 0.084. The peaks of the histogram align with the
steps of the logistic functions.

[Figure: unemployment rate [%] versus year, 2000 to 2015]
Figure 3.4: Model of US unemployment rate from 1995 to 2016 obtained from transforming the
model fit of Fig. 3.3 back to the unemployment rate domain. Data is blue, model is in red with 90%
confidence intervals shown as a light red band. A forecast conditional on the absence of a recession
shock is shown through 2019.

[Figure: left panel, openings rate [%] versus year, 2005 to 2020; right panel, unemployment rate [%] versus year, 2005 to 2020]
Figure 3.5: Left: JOLTS job openings rate (JOR) and dynamic equilibrium model fit with two
shocks. Right: Unemployment rate (UNR) data over the same domain and dynamic equilibrium
model fit.

[Figure: job openings rate [%] versus unemployment rate [%], with points labeled 2001, 2008, and 2014]
Figure 3.6: The Beveridge curve resulting from combining the JOR and UNR dynamic equilibrium
models on a single graph. The gray lines indicate the paths the data would follow in the absence of
recession shocks. Since the timing and amplitude of recession shocks are not equal, these shocks
cause the Beveridge curve to move from one equilibrium to another (red line).

[Figure: JOLTS observables HIR, JOR, QUR, and UNR versus month, 2008 to 2009]
Figure 3.7: The center of the shocks ($t_i$) including their width (given by the parameter $b_i$) for
JOLTS hires rate (HIR), job openings rate (JOR), quit rate (QUR), and unemployment rate (UNR).
The shock to HIR precedes the shock to UNR by several months.

4 Ensembles and macroeconomics

We intend to build a framework for a “statistical economics” as discussed in
Smolin (2009). To this end, we consider an ensemble of information equilibrium
relationships (markets) $A_i \rightleftarrows B$. Initially we take there to be only a single factor
of production B for multiple different outputs $A_i$. This is not a limitation and the
approach here generalizes to multiple factors of production. This yields a series of
functions
$$ A_i \sim B^{k_i} $$
where $k_i$ is the information transfer index for the information equilibrium
relationship $A_i \rightleftarrows B$. We will refer to the $k_i$ as the “k-state” for each market. Now an
exponentially growing economy with growing factor of production $B \sim e^{\gamma t}$ with
growth rate γ composed of several markets would quickly become dominated by a
single market $A_{max}$ if all the markets remained in a single k-state
$$ k_{max} \equiv \max_i k_i $$
$$ \log A_{max} \sim k_{max} \log B \quad \text{for } t \gg k_{max}^{-1} \tag{4.1} $$
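
The dominance of the largest k-state can be seen in a few lines. The k-states, the growth rate, and the function name `market_shares` below are hypothetical; the sketch simply evaluates the output shares for markets held in fixed k-states while B grows exponentially.

```python
import math

def market_shares(k_states, gamma, t):
    """Output shares A_i / sum_j A_j for A_i = B^{k_i} with B = exp(gamma t)."""
    log_b = gamma * t
    log_a = [k * log_b for k in k_states]
    peak = max(log_a)                      # subtract the peak to avoid overflow
    weights = [math.exp(x - peak) for x in log_a]
    total = sum(weights)
    return [w / total for w in weights]

k_states = [0.5, 1.0, 1.5, 2.0]            # hypothetical fixed k-states
gamma = 0.03                               # hypothetical growth rate of B
share_start = market_shares(k_states, gamma, t=0.0)[-1]    # all markets equal at t = 0
share_late = market_shares(k_states, gamma, t=500.0)[-1]   # largest k-state dominates
```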
This implies that markets (or firms, industries, etc) must change their k-state
over time and we should really think of an economy as a collection of markets
in a distribution of k-states. The stability of these k-state distributions can be
directly related to the statistical equilibrium approaches of e.g. Scharfenaker
and Semieniuk (2015), Williams et al (2015), or Cockshott et al (2011). We
imagine a macroeconomic equilibrium as an ensemble of markets or firms with a
stable distribution of growth rates, but the growth rate of each individual market
or firm is not necessarily stable and possibly rapidly changing. Non-equilibrium
processes such as recessions could be considered shocks to these stable distributions
changing them to over-represent low or negative growth states. If this distribution
is a maximum entropy distribution (as presented in Section 4.1), a macroeconomy
would experience an “entropic force” to return to the maximum entropy equilibrium.
A quasi-stable macroeconomic equilibrium with well-defined⁸ growth rate is then
(weakly) emergent from random markets (or firms).

⁸ Well-defined in this context means observable or measurable. For a small number of molecules,
the thermodynamic temperature is not well-defined in this sense. In our case, we observe a
macroeconomic growth rate and choose a distribution of growth k-states consistent with it.

4.1 Partition function

Let us assume macroeconomic growth is a well-defined concept such that the
economy grows like
$$ A_1 + A_2 + \cdots + A_n \sim e^{\langle k \rangle \gamma t} $$
where $\langle k \rangle$ is an ensemble average. The maximum entropy distribution⁹ with
constrained (i.e. macro observable) ensemble average growth rate $\langle k \rangle$ is fixed via a
partition function
$$ Z(\beta) = \sum_{i}^{n} e^{-\beta k_i} \tag{4.2} $$
where β is a Lagrange multiplier¹⁰. Now note that
$$ \begin{aligned} \left( \frac{B}{B_0} \right)^{k_i} &= \exp\left( k_i \log \frac{B}{B_0} \right) \\ &= \exp\left( k_i \log\left( 1 + \frac{B - B_0}{B_0} \right) \right) \\ &\equiv \exp\left( k_i \log(1 + b) \right) \end{aligned} \tag{4.3} $$
where $b \equiv (B - B_0)/B_0$. If we then take $\beta \equiv \log(1 + b)$, the expected value of the
average output becomes
$$ \begin{aligned} \langle A \rangle &= \frac{1}{Z} \sum_{i}^{n} A_{0,i} \left( \frac{B}{B_0} \right)^{k_i} e^{-\beta k_i} \\ &= \frac{1}{Z} \sum_{i}^{n} A_{0,i} \end{aligned} \tag{4.4} $$
If $B = B_0$, then b = 0, β = 0, and Z = n so that Eq. 4.4 becomes
$$ \langle A \rangle = \frac{1}{n} \sum_{i}^{n} A_{0,i} $$
which is simply the average of the initial values of the output for each market.
Therefore we can take the definition $\beta \equiv \log(1 + b)$ to be a consistent definition
of the Lagrange multiplier¹¹ for the case of a macro-observable growth rate $\langle k \rangle \gamma$.
Note that these choices for the Lagrange multiplier and partition function are based
on the information equilibrium treatment of Section 2.

⁹ The approach here is analogous with thermodynamics where one maximum entropy equilibrium
of the macrostate is one with well-defined energy. Here, the growth rate is analogous to energy.
There is a deep connection between thermodynamics and information theory so there should be no
surprise that an analogy arises.
¹⁰ In thermodynamics, temperature T plays the role of Lagrange multiplier $\beta^{-1} = kT$. This also
represents the most obvious point of departure from Smith and Foley (2008), who use price as a
Lagrange multiplier with the constraint defined in terms of a market “offer” aggregated across
agents.
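
The cancellation in Eq. 4.4 and the limit Z = n at B = B0 can be verified numerically. In the sketch below the k-states, the initial outputs, and the function names are hypothetical values chosen for illustration only.

```python
import math

def partition_function(k_states, beta):
    """Z(beta) = sum_i exp(-beta k_i), Eq. 4.2."""
    return sum(math.exp(-beta * k) for k in k_states)

def ensemble_output(k_states, a0, b, b0):
    """<A> from Eq. 4.4 with beta = log(1 + (B - B0)/B0) = log(B/B0)."""
    beta = math.log(b / b0)
    z = partition_function(k_states, beta)
    # (B/B0)^{k_i} exp(-beta k_i) equals 1 term by term, so only A_{0,i} survives.
    total = sum(a * (b / b0) ** k * math.exp(-beta * k)
                for a, k in zip(a0, k_states))
    return total / z, z

k_states = [0.5, 1.0, 1.5]                 # hypothetical k-states
a0 = [2.0, 3.0, 7.0]                       # hypothetical initial outputs A_{0,i}
avg_at_b0, z_at_b0 = ensemble_output(k_states, a0, b=1.0, b0=1.0)
avg_grown, z_grown = ensemble_output(k_states, a0, b=2.0, b0=1.0)
```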
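The definition of the partition function Eq. 4.2 also lets us derive a macroeconomic
information equilibrium relationship. We have already computed $\langle A \rangle$, so let
us compute
$$ \begin{aligned} \frac{d\langle A \rangle}{dB} &= \frac{d}{dB} \frac{1}{Z} \sum_{i}^{n} A_{0,i} \equiv \bar{A}_0 \frac{d}{dB} \frac{1}{Z} \\ &= -\frac{\bar{A}_0}{Z^2} \frac{dZ}{dB} \end{aligned} \tag{4.5} $$
Now let us compute
$$ \begin{aligned} \langle k \rangle &= \frac{1}{Z} \sum_{i}^{n} k_i e^{-\beta k_i} \\ &= \frac{1}{Z} \sum_{i}^{n} k_i e^{-k_i \log(B/B_0)} \\ &= -\frac{1}{Z} \sum_{i}^{n} \frac{d}{d \log(B/B_0)} e^{-k_i \log(B/B_0)} \\ &= -\frac{B}{Z} \frac{dZ}{dB} \end{aligned} \tag{4.6} $$
Combining Eqs. 4.4, 4.5, and 4.6 we find
$$ \frac{d\langle A \rangle}{dB} = \langle k \rangle \frac{\langle A \rangle}{B} \tag{4.7} $$
which is formally similar to Eq. 2.5. The difference lies in the fact that $\langle k \rangle$
may change over time. However, if $\langle k \rangle$ changes slowly, then the solution to the
differential equation 4.7 can be approximated by
$$ \langle A \rangle \sim B^{\langle k \rangle} \tag{4.8} $$
$$ \langle p \rangle = \frac{d\langle A \rangle}{dB} \sim \langle k \rangle B^{\langle k \rangle - 1} \tag{4.9} $$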
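
Equations 4.6 and 4.7 can be checked with finite differences. The sketch below uses hypothetical k-states and a hypothetical total of initial outputs; the function names are ours.

```python
import math

K_STATES = [0.5, 1.0, 1.5]                 # hypothetical k-states
A0_TOTAL = 12.0                            # hypothetical sum of initial outputs A_{0,i}
B0 = 1.0

def z(b):
    """Partition function Z with beta = log(B/B0)."""
    return sum(math.exp(-k * math.log(b / B0)) for k in K_STATES)

def mean_output(b):
    """<A> = (1/Z) sum_i A_{0,i}, Eq. 4.4."""
    return A0_TOTAL / z(b)

def mean_k(b):
    """<k> = (1/Z) sum_i k_i exp(-beta k_i), first line of Eq. 4.6."""
    return sum(k * math.exp(-k * math.log(b / B0)) for k in K_STATES) / z(b)

def derivative(f, x, h=1e-6):
    """Central finite-difference derivative."""
    return (f(x + h) - f(x - h)) / (2 * h)

b = 2.5
k_from_z = -b / z(b) * derivative(z, b)    # last line of Eq. 4.6
lhs = derivative(mean_output, b)           # d<A>/dB
rhs = mean_k(b) * mean_output(b) / b       # <k> <A> / B, Eq. 4.7
```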
¹¹ This definition implies that an economy with more of a given factor of production (i.e. a larger
economy) is analogous to a colder thermodynamic system since β ∼ 1/T in thermodynamics.
Another way to think of it is that a given unit of a factor of production is more and more likely to
be found contributing to a low growth market as an economy grows, simply because there are more
ways to construct an economy of a given growth rate from a large number of low growth markets
than from a few high growth ones. This could potentially form a basis for understanding the “secular
stagnation” of Summers (2013) or the “Great Stagnation” of Cowen (2011). It is possible that slower
growth is simply the most likely path for an economy to follow, one that becomes almost certain for
large economies.
The generalization to multiple factors of production is straightforward via partition
functions with multiple Lagrange multipliers, and for e.g. two factors of production
the partition function becomes
$$ Z(\beta^{(1)}, \beta^{(2)}) = \sum_i e^{-k_i^{(1)} \beta^{(1)} - k_i^{(2)} \beta^{(2)}} $$
based on the macro observables $\langle k^{(1)} \rangle$ and $\langle k^{(2)} \rangle$ being well-defined¹². Therefore
we can obtain the Cobb-Douglas form for slowly changing $\langle k^{(i)} \rangle$:
$$ \langle A \rangle \sim \left( \frac{B^{(1)}}{B_0^{(1)}} \right)^{\langle k^{(1)} \rangle} \left( \frac{B^{(2)}}{B_0^{(2)}} \right)^{\langle k^{(2)} \rangle} $$
Finally, we can extend these results for ensembles of markets to the case of dynamic
information equilibrium discussed in Section 3 in the case of slowly varying $\langle k \rangle$ so
that for the unemployment rate:
$$ \frac{d}{dt} \log \frac{\langle U \rangle}{L} \approx (\langle k \rangle - 1)\, \lambda \equiv \alpha $$
where λ is the growth rate of the labor force. We can see that α measured through
the entropy minimization procedure of Section 3.1 could potentially change slowly
as the labor force grows. The fact that assuming constant α remains an empirically
accurate model over the course of 20 years in Fig. 3.4 would allow us to put
constraints on the rate of change of α. In the present paper, we simply note that a
fixed dynamic equilibrium is a very good approximation.
4.2 Okun’s Law


Let us apply Eq. 4.7 to a real-world macroeconomic observable. One stylized fact
of macroeconomics is Okun’s Law. Okun (1962) presents a relationship between
changes real output and changes in unemployment. We will show that there is
a fairly empirically accurate form that follows from an ensemble information
equilibrium relationship P : N  H where P is the price level (Consumer Price
Index, CPI), N is nominal output (Nominal Gross Domestic Product, NGDP), and
the factor of production H is hours worked via e.g. FRED (2017) series CPIAUCSL,
GDP, and HOANBS respectively. The information equilibrium relationship gives
us the equation (if the abstract price is the consumer price index CPI)
$$
\langle P \rangle = \frac{d \langle N \rangle}{dH} = \langle k \rangle \frac{\langle N \rangle}{H} \tag{4.10}
$$
[12] A prime example is the exponents of the production function in Solow (1956).


[Figure 4.1 plot: CPI inflation (%, all items) versus year, 1960 to 2010.]

Figure 4.1: The model of US inflation using N = NGDP and total hours worked H is shown in
blue. Inflation data (CPI all items) is in green.

Rearranging, we have
$$
H = \langle k \rangle \frac{\langle N \rangle}{\langle P \rangle}
$$
Now N/P is real output (RGDP), and taking a logarithmic time derivative of both
sides yields (for $\langle k \rangle$ approximately constant)
$$
\frac{d}{dt} \log H = \frac{d}{dt} \log \mathrm{RGDP}
$$
which is a form of Okun’s law (falls in real output are correlated with falls in total
hours worked). This works fairly well empirically (using data for the US from
FRED) as shown in Fig. 4.1. Another consequence of slowly varying $\langle k \rangle$ is that,
as shown in Smith (2015), Eq. 4.10 would lead to an aggregate demand/aggregate
supply (in this case, labor) diagrammatic macro model in the short run where
aggregate demand $\langle N \rangle$ or aggregate (labor) supply H varies slowly. In the long run,
$\langle N \rangle \sim H^{\langle k \rangle}$, reproducing the long run aggregate supply curve.
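As a sketch of the intermediate steps, assuming only Eq. 4.10 and treating the ensemble average of k as a slowly varying function of time, the logarithmic derivative and the long run limit can be written out as follows. No symbols beyond those of Eq. 4.10 are introduced.

```latex
\begin{align}
\log H &= \log \langle k \rangle + \log \frac{\langle N \rangle}{\langle P \rangle} \\
\frac{d}{dt} \log H &= \frac{d}{dt} \log \langle k \rangle + \frac{d}{dt} \log \mathrm{RGDP} \\
&\approx \frac{d}{dt} \log \mathrm{RGDP}
\quad \text{if } \left| \frac{d}{dt} \log \langle k \rangle \right| \ll \left| \frac{d}{dt} \log \mathrm{RGDP} \right| \\
\frac{d \langle N \rangle}{\langle N \rangle} &= \langle k \rangle \frac{dH}{H}
\;\Rightarrow\; \log \langle N \rangle = \langle k \rangle \log H + \text{const}
\;\Rightarrow\; \langle N \rangle \sim H^{\langle k \rangle}
\quad (\langle k \rangle \text{ constant})
\end{align}
```

The Okun’s law form corresponds to neglecting the first term on the right-hand side of the second line, and the last line is the direct integration of Eq. 4.10 at fixed ensemble average.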

5 Summary and conclusion


Starting from the insight of Becker (1962) that rational agent behavior is not
required to produce traditional economic theory and the insight of Hayek (1945)
that markets are about moving information, we have constructed an information-
theoretic maximum entropy approach to micro- and macro-economics. We even
discover that micro and macro have a useful self-similarity at different scales
via Eq. 4.7. This framework is not empty formalism, but leads to empirically
accurate models of unemployment and inflation as well as multiple avenues for
future research and new understanding of well-established results like Okun’s
Law and matching theory. Many of the results are critically dependent on the
information transfer index k or its ensemble average $\langle k \rangle$. As shown in Zorick
and Smith (2016), the information transfer index is directly related to Lyapunov
exponents in dynamical systems. Here, this parameter characterizes the relative
information content of state spaces and opportunity sets, and therefore we might
view economic theory as the study of the state spaces available to agents rather
than the agents themselves and their decisions. Much like how Becker showed
agent rationality is not a critical assumption, we have shown that maximum entropy
arguments agnostic about the details of agent behavior can be used to describe
empirical data.

Acknowledgment
We would like to thank David Glasner for bringing Becker (1962) to our attention.
We would also like to thank Peter Fielitz and Guenter Borchardt for useful
discussions.

A Appendix
The model fit for Fig. 3.4 has three shocks (n = 3): two positive (i.e. increasing
the unemployment rate) and one negative (i.e. decreasing the unemployment
rate). By convention, the direction of the shock is dictated by the sign of $b_i$, as
Mathematica’s NonlinearModelFit function was found to be more stable for that
choice when selecting Method→Automatic. Standard errors are given for the
shock fit parameters. The model is given by Eq. 3.8, rewritten here for convenience.
$$
\log u(t) = \alpha (t - t_0) + c_0 + \sum_{i=1}^{n} \frac{a_i}{1 + \exp\left( \frac{t - t_i}{b_i} \right)} \tag{A.1}
$$


$\alpha = 0.084\ \mathrm{y}^{-1}$ (from entropy minimization)
$t_0 = 1995.6$ (fixed)
$c_0 = 1.07 \pm 0.01$
$a_1 = 0.71 \pm 0.01$
$b_1 = 0.42 \pm 0.03$ y
$t_1 = 2001.71 \pm 0.03$ y
$a_2 = 1.07 \pm 0.01$
$b_2 = 0.39 \pm 0.02$ y
$t_2 = 2008.79 \pm 0.02$ y
$a_3 = 0.23 \pm 0.03$
$b_3 = -0.49 \pm 0.15$ y
$t_3 = 2014.41 \pm 0.18$ y

References
Becker, Gary S. (1962). Irrational Behavior and Economic Theory. Journal of
Political Economy Vol. 70, No. 1 (Feb 1962), pp. 1-13

Chen, M. Keith and Lakshminarayanan, Venkat and Santos, Laurie (2005). The
Evolution of Our Preferences: Evidence from Capuchin Monkey Trading Behavior
(June 2005). Cowles Foundation Discussion Paper No. 1524. Available at
SSRN: http://ssrn.com/abstract=675503

Cockshott, Paul, Cottrell, Allin F., Michaelson, Gregory John, Wright, Ian P. and
Yakovenko, Victor (2011). Classical Econophysics.

Cowen, Tyler (2011). The Great Stagnation: How America Ate All the Low-Hanging
Fruit of Modern History, Got Sick, and Will (Eventually) Feel Better. Dutton.

Fielitz, Peter and Borchardt, Guenter (2014). A general concept of natural
information equilibrium: from the ideal gas law to the K-Trumpler effect.
arXiv:0905.0610v4 [physics.gen-ph].

Fisher, Irving (1892). Mathematical Investigations in the Theory of Value and
Prices.

Federal Reserve Economic Data (FRED 2017). US Bureau of Economic Analysis,
US Bureau of Labor Statistics, retrieved from FRED, Federal Reserve Bank of
St. Louis https://research.stlouisfed.org/fred2/ December 2017.


Goodfellow, Ian J., Pouget-Abadie, Jean, Mirza, Mehdi, Xu, Bing, Warde-Farley,
David, Ozair, Sherjil, Courville, Aaron and Bengio, Yoshua (2014). Generative
Adversarial Networks. arXiv:1406.2661 [stat.ML].

Gronwall, Thomas H. (1919). Note on the derivatives with respect to a parameter
of the solutions of a system of differential equations. Ann. of Math., 20 (2):
292-296.

Hartley, R.V.L. (1928). Transmission of Information. Bell System Technical
Journal, Volume 7, Number 3, pp. 535-563.

Hayek, Friedrich (1945). The Use of Knowledge in Society. The American Economic
Review. XXXV, No. 4. pp. 519-30.

Jaynes, E. T. (1991). How should we use entropy in economics? Unpublished.
Available at http://bayes.wustl.edu/etj/articles/entropy.in.economics.pdf
(retrieved September 2015)

Lagueux, Maurice (2010). Rationality and Explanation in Economics. p. 152ff

Okun, Arthur M. (1962). Potential GNP, its measurement and significance

Petrongolo, Barbara and Pissarides, Christopher (2001). Looking into the Black
Box: A Survey of the Matching Function. Journal of Economic Literature XXXIX
(June 2001) 390-431.

Rao, Anup (2017). A Theory of Market Efficiency. arXiv:1702.03290 [q-fin.EC]

Shannon, Claude E. (1948). A Mathematical Theory of Communication. Bell
System Technical Journal 27 (3): 379-423.

Scharfenaker, E. and Semieniuk, G. (2015). A Statistical Equilibrium Approach to
the Distribution of Profit Rates. Schwartz Center for Economic Policy Analysis
and Department of Economics, The New School for Social Research, Working
Paper Series 2015-5.

Smith, Jason (2015). Information equilibrium as an economic principle.
arXiv:1510.02435 [q-fin.EC]

Smith, Jason (2017). https://github.com/infotranecon/ retrieved December 2017.

Smith, Eric and Foley, Duncan K. (2008). Classical thermodynamics and economic
general equilibrium theory. Journal of Economic Dynamics & Control 32 (2008)
7-65.


Smolin, Lee (2009). Time and symmetry in models of economic markets.
arXiv:0902.4274v1 [q-fin.GN].

Solow, Robert M. (1956). A contribution to the theory of economic growth.
Quarterly Journal of Economics. Oxford Journals. 70 (1): 65-94.

Summers, Lawrence (2013). IMF Fourteenth Annual Research Conference in
Honor of Stanley Fischer.

Williams, Michael A., Pinto, Brijesh P., and Park, David (2015). Global evidence
on the distribution of firm growth rates. Physica A 432, 15 August 2015, Pages
102-107.

Zorick, Todd and Smith, Jason (2016). Generalized Information Equilibrium
Approaches to EEG Sleep Stage Discrimination. Computational and Mathematical
Methods in Medicine, vol. 2016, Article ID 6450126, (2016).
doi:10.1155/2016/6450126
