to entropy maximizing:
a pedagogic guide
by Martyn L. Senior
Up to the end of section IV, 4 the arguments put forward are illustrated
where possible with numerical examples. Indeed, it is a feature of sections II
and III that the evolving argument is coordinated by reference to illus-
trations which utilize a common hypothetical data base. The author’s
experience, both as student and teacher, suggests that readers should check
their comprehension by working through these examples for themselves.
At the beginning of section IV, 5 readers will detect a discrete rise in the
complexity of the argument as we turn to the mathematics of entropy
maximization, which will inevitably place a severe strain on O-level
mathematical skills. Indeed, this section seems particularly formidable and
the temptation will be to ignore it. This would be an unfortunate reaction,
for without it the mathematically uninitiated reader is likely to appreciate
entropy maximization only at a simplified numerical level, and may well
be incapable of making the connection between numerical examples and
the gravity-like forms that the mathematics produces. Undoubtedly sec-
tion IV, 5 demands great perseverance from such readers, but they will find
the mathematics is set out in greater detail and more fully than is usual
elsewhere. Consequently, it should not be excessively difficult to progress
from one step to another in the mathematical argument.
(Figure 1c).
Figure 1c illustrates the point that Newton’s squared distance term, $d_{ij}^{2}$, is a
particular case of the negative power function, $d_{ij}^{-\beta}$; that is, $\beta$ equals two in
Newton’s law. The empirically determined parameter, $\beta$, reflects the
responsiveness of the amount of interaction to changes in spatial separation,
and has a value typically less than two in social contexts.
2 The physical analogy issue
The gravity model is geography’s foremost example of a physical analogy;
that is, a law established for a physical system which is used to elucidate
conception of the gravitational law which treated the earth and moon as
single particles; that is, dimensionless objects with mass. Although no
physical object conforms to this definition, some are less particulate than
others (French, 1971) and the size and heterogeneous composition of
planets makes them least amenable to accurate treatment as particles.
Indeed Newton faced time-consuming mathematical difficulties in aggre-
gating from the inter-particle level of attraction to the inter-planetary one
(Bronowski, 1973, 233). However, he deduced, for spherical objects at
least, that the gravitational law applied universally at micro and aggregate
levels, and that the mass of an object could be considered to be concentrated
at its centre.
The analogous aggregation problems raised by geographical uses of the
gravity model, concerning the relationship between individual and aggre-
gate spatial interaction behaviour, have been barely addressed, apparently
because geographers have pursued only the aggregate analogy. Yet such
aggregation issues pervade strategic planning studies which, by definition,
deal with large areas of space containing numerous individuals. To be
workable, strategic planning models must treat persons and space aggre-
gatively, but how do such models relate to the individual location be-
haviour which they supposedly subsume and reflect in aggregate? This
4 The internal consistency question
Since the early 1960s gravity models have been widely used in transport
planning, but transport analysts have had to rectify intrinsic deficiencies in
their formulation to make them sensible forecasting devices. By compari-
son geographers have been slow to appreciate these problems.
To understand the internal consistency question, consider a simple
journey-to-work example where commuters make trips from two resi-
dence zones (i=1, 2) to three employment zones (j=1, 2, 3). The gravity
model predicts the number of commuter trips in the two-by-three $T_{ij}$
matrix (Table 1). Further predictions are obtained by:
i) summing all trip values to give a prediction of the system-wide trip
total, $\sum_i \sum_j T_{ij}$;
ii) summing the trips in each row of the matrix to give a prediction of
trip ends, $\sum_j T_{ij}$, originating in i zones;
iii) summing the trips in each column to give a prediction of trip ends,
$\sum_i T_{ij}$, attracted to j zones.
The internal consistency question asks whether these latter row and column
sums should logically equal the known values of the mass terms, $M_i$ and $M_j$
respectively. The answer depends on how the model user interprets his
interaction problem and defines the mass terms.
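The three kinds of sum described above can be sketched numerically. The following is a minimal Python sketch; the two-by-three matrix values are illustrative assumptions and are not those of Table 1.

```python
# Hypothetical 2 x 3 trip matrix T[i][j] (illustrative values, not Table 1):
# rows are residence zones i = 1, 2; columns are employment zones j = 1, 2, 3.
T = [
    [30.0, 20.0, 10.0],
    [15.0, 15.0, 10.0],
]

# i) system-wide trip total: sum over i and j of T_ij
total_trips = sum(sum(row) for row in T)

# ii) trip ends originating in each residence zone i: sum over j of T_ij
row_sums = [sum(row) for row in T]

# iii) trip ends attracted to each employment zone j: sum over i of T_ij
col_sums = [sum(col) for col in zip(*T)]

print(total_trips, row_sums, col_sums)
```

Whether `row_sums` and `col_sums` ought to equal the mass terms is exactly the internal consistency question.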
Most early model applications (Olsson, 1965) treated the mass terms as
proxy variables for the amount of interaction likely to emanate from origin
zones and be attracted to destination zones. For our journey-to-work
of industrial plus commercial land use. These measures stand proxy for the
number of commuter trips likely to be generated in residential zones and
attracted to employment zones. Consequently, we would expect the
predictions of zonal trip origins and attractions given by the row and
column sums of the trip matrix to be proportional to values of $M_i$ and $M_j$,
but certainly not equal to them:
$$\sum_j T_{ij} \propto M_i, \qquad \sum_i T_{ij} \propto M_j \tag{3}$$
often;
ii) trip distribution: locational choice of trip destination;
iii) modal split: choice of mode of transport;
iv) assignment: choice of route through a transport network.
The gravity model is used for trip distribution, but is preceded by trip
generation and attraction models providing independent estimates of zonal
trip origins and attractions which subsequently become the mass terms of
the gravity model. Hence, the definitions of the row and column sums of
the predicted trip matrix coincide exactly with the definitions of the $O_i$ and
$D_j$ mass terms respectively, and the logical expectation is that their values
should be exactly consistent; that is:
$$\sum_j T_{ij} = O_i \tag{4}$$
Suppose we use these observed trip origins and attractions as mass terms in a
gravity model of the form:
$$T_{ij} = G\, O_i D_j d_{ij}^{-\beta} \tag{7}$$
and seek a predicted trip matrix, $T_{ij}$, matching the observed trips, $T_{ij}^{\text{obs}}$, as
closely as possible. Geographers have conventionally tackled this problem
by converting equation (7) into a logarithmically linear form:
$$\log\left(\frac{T_{ij}}{O_i D_j}\right) = \log G - \beta \log d_{ij}, \tag{8}$$
and then have used least-squares regression analysis to calibrate (that is, find
best-fitting values of) the intercept, log G, and regression coefficient, $\beta$, by
treating
$$\log\left(\frac{T_{ij}}{O_i D_j}\right) \quad \text{and} \quad \log d_{ij}$$
as the dependent and independent variables respectively (Figure 2).
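The log-linear calibration just described can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the function name `calibrate_loglinear` and all data values (`O`, `D`, `d`, `T_obs`) are hypothetical and do not reproduce Table 2 or Table 3, and natural logarithms are used (the choice of base does not affect the fitted $\beta$).

```python
import math

def calibrate_loglinear(T_obs, O, D, d):
    """Least-squares fit of log(T_ij / (O_i D_j)) = log G - beta * log d_ij."""
    x, y = [], []
    for i in range(len(O)):
        for j in range(len(D)):
            x.append(math.log(d[i][j]))                      # independent variable
            y.append(math.log(T_obs[i][j] / (O[i] * D[j])))  # dependent variable
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    slope = sxy / sxx                      # the slope estimates -beta
    intercept = y_mean - slope * x_mean    # the intercept estimates log G
    return math.exp(intercept), -slope

# Hypothetical data: two residence zones, three employment zones.
O = [60.0, 40.0]
D = [50.0, 30.0, 20.0]
d = [[1.0, 2.0, 4.0], [3.0, 1.0, 2.0]]
T_obs = [[35.0, 15.0, 10.0], [15.0, 15.0, 10.0]]

G, beta = calibrate_loglinear(T_obs, O, D, d)
T_pred = [[G * O[i] * D[j] * d[i][j] ** (-beta) for j in range(3)] for i in range(2)]
predicted_total = sum(sum(row) for row in T_pred)
print(G, beta, predicted_total)
```

Nothing in the regression forces `predicted_total`, or the row and column sums of `T_pred`, to match the observed totals.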
Table 3 presents the results from a regression analysis applied to our
numerical example. Clearly the log-linear gravity model violates conditions
(4) and (5) as neither the row nor column quantities of the $T_{ij}$ matrix
sum respectively to the known $O_i$ and $D_j$ values. Furthermore, the model
violates the less restrictive condition that systemwide predicted and
observed trip totals are equal, namely:
$$\sum_i \sum_j T_{ij} = N = \sum_i \sum_j T_{ij}^{\text{obs}}. \tag{9}$$
One might defend the model by suggesting that its predictions of trip
are to be forecast.
5 The forecasting dilemma
The multiplicative structure of the Newtonian gravity model endows it
its trip predictions do not satisfy trip origin, trip attraction and systemwide
trip constraints, equations (4), (5) and (9) respectively. By imposing, either
singly or in combination, these three sets of constraints a family of intern-
ally consistent gravity models can be constructed which are appropriate to
sundry spatial interaction situations. By adopting alternative locational
interpretations of the journey-to-work (Wilson, 1970a), members of this
family can be illustrated using our previous numerical example (Table 2).
Mathematically all that is required is substitution of one equation into
another.
$$T_{ij} = K O_i D_j d_{ij}^{-\beta}, \tag{10}$$
which is structurally identical to the Newtonian model of equation (7).
However, K is not an independent parameter like G, but a balancing factor
calculated to ensure that constraint (9) is met. Simply substituting for $T_{ij}$ in
this constraint using the right-hand side of equation (10), and then
rearranging terms:
$$\sum_i \sum_j T_{ij} = N = \sum_i \sum_j K O_i D_j d_{ij}^{-\beta} \tag{11}$$
$$K = \frac{N}{\sum_i \sum_j O_i D_j d_{ij}^{-\beta}} \tag{12}$$
$$T_{ij} = N \left[ \frac{O_i D_j d_{ij}^{-\beta}}{\sum_i \sum_j O_i D_j d_{ij}^{-\beta}} \right] \tag{13}$$
over all zone pairs they sum to one and thus ensure that predicted commuter
interactions sum to N.
Equations (10) and (12) have been used to recalculate predicted interactions
for our numerical example (Table 5). In principle $\beta$ should be
recalibrated to reoptimize the fit between these new $T_{ij}$ values and the
observed trips, $T_{ij}^{\text{obs}}$, of Table 2, but for comparative purposes the previous
$\beta = 1$ value is retained. The calculated K value guarantees that the $T_{ij}$
interactions sum to the given number of commuters, N, and this will
always be the case no matter what value $\beta$ takes. The row and column sums
of the interaction matrix record respectively the unconstrained zonal
choices of, or demand for, residences and jobs by the N commuters, and
thus need not equal the associated $O_i$ and $D_j$ values.
Table 5 Results from the total interaction constrained gravity model,
equations (10) and (12)
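The calculation of the balancing factor K can be sketched as follows. This is a minimal Python sketch of equations (10) and (12); the function name and the data values are hypothetical and are not those of Table 2 or Table 5.

```python
def total_constrained_model(O, D, d, N, beta):
    """Equations (10) and (12): T_ij = K O_i D_j d_ij^(-beta), with K chosen so trips sum to N."""
    weights = [
        [O[i] * D[j] * d[i][j] ** (-beta) for j in range(len(D))]
        for i in range(len(O))
    ]
    K = N / sum(sum(row) for row in weights)          # equation (12)
    T = [[K * w for w in row] for row in weights]     # equation (10)
    return K, T

# Hypothetical data: two residence zones, three employment zones, N = 100 commuters.
O = [60.0, 40.0]
D = [50.0, 30.0, 20.0]
d = [[1.0, 2.0, 4.0], [3.0, 1.0, 2.0]]

K, T = total_constrained_model(O, D, d, N=100.0, beta=1.0)
total = sum(sum(row) for row in T)
row_sums = [sum(row) for row in T]
print(K, total, row_sums)
```

In this sketch `total` equals N for any value of `beta`, whereas `row_sums` is free to differ from `O`.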
should sum. These conditions are embodied in constraint (4), and can be
satisfied only by replacing the single systemwide balancing factor, K, with
residence-zone-dependent factors, $A_i$:
$$T_{ij} = A_i O_i D_j d_{ij}^{-\beta}. \tag{14}$$
As before, substitution for $T_{ij}$ in constraint (4) using equation (14), and
rearrangement of terms, gives an expression for calculating the new
balancing factors:
$$\sum_j T_{ij} = O_i = \sum_j A_i O_i D_j d_{ij}^{-\beta} \tag{15}$$
$$A_i = \frac{1}{\sum_j D_j d_{ij}^{-\beta}} \tag{16}$$
The constrained commuter flows, $T_{ij}$, leaving each residence zone are
destined for workplace zones, whose available job opportunities, reflected
in the $D_j$ measures, attract but do not constrain the job choices or demand
of the commuting population.
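A minimal Python sketch of the production constrained calculation, equations (14) and (16), is given here. The function name and the data values are hypothetical and do not reproduce the paper's numerical example.

```python
def production_constrained_model(O, D, d, beta):
    """Equations (14) and (16): T_ij = A_i O_i D_j d_ij^(-beta), A_i = 1 / sum_j D_j d_ij^(-beta)."""
    cols = range(len(D))
    T = []
    for i in range(len(O)):
        A_i = 1.0 / sum(D[j] * d[i][j] ** (-beta) for j in cols)       # equation (16)
        T.append([A_i * O[i] * D[j] * d[i][j] ** (-beta) for j in cols])  # equation (14)
    return T

# Hypothetical data: two residence zones, three employment zones.
O = [60.0, 40.0]
D = [50.0, 30.0, 20.0]
d = [[1.0, 2.0, 4.0], [3.0, 1.0, 2.0]]

T = production_constrained_model(O, D, d, beta=1.0)
row_sums = [sum(row) for row in T]          # reproduce O_i by construction
col_sums = [sum(col) for col in zip(*T)]    # unconstrained job demands
print(row_sums, col_sums)
```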
Table 6 presents the results of using equations (14) and (16) to predict
interactions for our numerical example. Clearly the row sums of the
interaction matrix record commuter numbers leaving each residence zone
and accord with the given $O_i$ values, whereas the column sums are estimates
of their zonal job choices or demands, which are not restricted to match the
workplace attraction values, $D_j$.
Table 6 Results from a production constrained gravity model, equations (14)
and (16): employment choice interpretation
The logic of this employment choice model can be reversed by assuming
that we know instead the number of commuters, Dj, already employed in
each workplace zone j, and that journey-to-work interactions result from
their residential location choices, i. Commuter interactions now originate
in each workplace zone, j, and their consistency with the given commuter
numbers, $D_j$, is required. Equation (5) replaces equation (4) as the
appropriate constraint, and is satisfied by the calculation of
employment-zone-dependent balancing factors, $B_j$, which replace $A_i$ in
equation (14) to give:
$$T_{ij} = B_j O_i D_j d_{ij}^{-\beta}. \tag{17}$$
The reader may easily check that substitution of equation (17) into
constraint (5) leads to the following expression for $B_j$:
$$B_j = \frac{1}{\sum_i O_i d_{ij}^{-\beta}} \tag{18}$$
The $O_i$ terms now reflect housing opportunities which attract but do not
constrain commuter interactions terminating in each residence zone.
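The reversed logic can be sketched in the same way. The following minimal Python sketch implements equations (17) and (18) with hypothetical data; the function name is an assumption made for illustration.

```python
def residential_choice_model(O, D, d, beta):
    """Equations (17) and (18): T_ij = B_j O_i D_j d_ij^(-beta), B_j = 1 / sum_i O_i d_ij^(-beta)."""
    rows, cols = range(len(O)), range(len(D))
    B = [1.0 / sum(O[i] * d[i][j] ** (-beta) for i in rows) for j in cols]  # equation (18)
    return [[B[j] * O[i] * D[j] * d[i][j] ** (-beta) for j in cols] for i in rows]

# Hypothetical data: O_i are housing opportunities, D_j are commuters employed in zone j.
O = [60.0, 40.0]
D = [50.0, 30.0, 20.0]
d = [[1.0, 2.0, 4.0], [3.0, 1.0, 2.0]]

T = residential_choice_model(O, D, d, beta=1.0)
col_sums = [sum(col) for col in zip(*T)]    # reproduce D_j by construction
row_sums = [sum(row) for row in T]          # unconstrained residential demands
print(col_sums, row_sums)
```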
Table 7 Results from a production constrained gravity model, equations (17)
and (18): residential choice interpretation
Table 7 presents the numerical results from this model. The column sums
now record commuter numbers leaving each employment zone and match
the given $D_j$ values, while the row sums predict commuters’ residential
choices or demands, which need not coincide with the housing supply
values $O_i$.
may now envisage either residentially fixed commuters, $O_i$, choosing jobs
in zones j such that their zonal demands adjust exactly to job supply, $D_j$; or
commuters already employed in zones j, $D_j$, selecting residences in zones i
such that their zonal demands match exactly housing supply, $O_i$. Such
situations imply the existence of not only trip origin or production
constraints, but trip destination or attraction ones too. Hence constraints (4)
and (5) operate simultaneously and are satisfied by calculating both sets of
zone-dependent balancing factors, $A_i$ and $B_j$. The appropriate model thus
becomes:
$$T_{ij} = A_i B_j O_i D_j d_{ij}^{-\beta}. \tag{19}$$
Equation (19) is used to substitute for $T_{ij}$ in both constraints (4) and (5),
giving respectively:
$$\sum_j T_{ij} = O_i = \sum_j A_i B_j O_i D_j d_{ij}^{-\beta} \tag{20}$$
$$\sum_i T_{ij} = D_j = \sum_i A_i B_j O_i D_j d_{ij}^{-\beta}. \tag{21}$$
Rearranging terms gives:
$$A_i = \frac{1}{\sum_j B_j D_j d_{ij}^{-\beta}} \tag{22}$$
$$B_j = \frac{1}{\sum_i A_i O_i d_{ij}^{-\beta}} \tag{23}$$
As both $A_i$ and $B_j$ are unknown, their solution from these two equations
appears to pose a problem because each $A_i$ value is dependent on all $B_j$
values and vice versa. Fortunately, either all $A_i$ or all $B_j$ values may be set
initially to any arbitrary value, and equations (22) and (23) solved
iteratively (Figure 3). Such a procedure is guaranteed to converge to unique
values of the products, $A_i B_j$, no matter what arbitrary starting value is chosen
(Evans, 1970). Separately neither $A_i$ nor $B_j$ values are unique because the
Figure 3 Iterative calculation scheme for the balancing factors of the production-
attraction constrained model
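The iterative scheme can be sketched in Python as follows. This is a minimal sketch, assuming the balancing factors take the usual form $A_i = 1/\sum_j B_j D_j d_{ij}^{-\beta}$ and $B_j = 1/\sum_i A_i O_i d_{ij}^{-\beta}$; the function name, the stopping rule (`tol`, `max_iter`) and the data are hypothetical, and the data are chosen so that origins and destinations have the same total.

```python
def doubly_constrained_model(O, D, d, beta, b_start=1.0, tol=1e-10, max_iter=10000):
    """Alternate between the A_i and B_j expressions until both sets settle."""
    rows, cols = range(len(O)), range(len(D))
    f = [[d[i][j] ** (-beta) for j in cols] for i in rows]
    A = [1.0 for _ in rows]
    B = [b_start for _ in cols]          # arbitrary starting value
    for _ in range(max_iter):
        A_new = [1.0 / sum(B[j] * D[j] * f[i][j] for j in cols) for i in rows]
        B_new = [1.0 / sum(A_new[i] * O[i] * f[i][j] for i in rows) for j in cols]
        change = max(abs(new - old) for new, old in zip(A_new + B_new, A + B))
        A, B = A_new, B_new
        if change < tol:
            break
    T = [[A[i] * B[j] * O[i] * D[j] * f[i][j] for j in cols] for i in rows]
    return A, B, T

# Hypothetical data with sum(O) == sum(D) == 100.
O = [60.0, 40.0]
D = [50.0, 30.0, 20.0]
d = [[1.0, 2.0, 4.0], [3.0, 1.0, 2.0]]

A, B, T = doubly_constrained_model(O, D, d, beta=1.0)
print([sum(row) for row in T], [sum(col) for col in zip(*T)])
```

Rerunning with a different `b_start` changes the separate `A` and `B` values but, in this sketch, leaves the products and hence `T` unchanged.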
ments, for example the physicist’s gaseous system containing around 10&dquo;
particles, or the geographer’s journey-to-work system comprising over $10^6$
commuters for larger conurbations. In general it is very difficult, or
impossible, as yet to understand such systems in toto by analysing the
characteristics and behaviour of each individual element, and then
combining the individual analyses to obtain an aggregate picture. Manageability
often dictates that only aggregate representations of system behaviour are
feasible. The gravity models considered so far achieve this aggregate
representation by totally ignoring individual behaviour. Entropy
maximizing methods, however, offer one means of relating such aggregate
constructions to the underlying variety of individual behaviour.
To understand entropy maximizing problems we specify three possible
levels of aggregation, or ’states’ (Wilson, 1970b), at which we could
describe our journey-to-work system for a city partitioned into zones,
denoted i for residential and j for employment activity.
i) At the micro-level each commuter, $t_{ij}^{x}$, and his characteristics, such as
travel cost incurred, $c_{ij}^{x}$, are recorded and identified by residential and
employment zone (Figure 4a). The label x (varying between 1 and m)
distinguishes between individual commuters residing and working in
the same (i, j) zone pair.
ii) At an intermediate or meso-level of aggregation we cease to identify
particular individuals by the label x; only the number of commuters
residing in i and working in j are recorded (Figure 4b):
$$T_{ij} = \sum_{x=1}^{m} t_{ij}^{x} \tag{24}$$
$$c_{ij} = \frac{\sum_{x=1}^{m} t_{ij}^{x} c_{ij}^{x}}{T_{ij}} \tag{25}$$
wishes to take into account. Constraints (4), (5) and (9) specify
required consistency between $T_{ij}$ and $O_i$, $D_j$, and N respectively;
additionally a systemwide cost constraint may be imposed:
$$\sum_i \sum_j T_{ij} c_{ij} = C \tag{26}$$
The analyst admits his usual ignorance of, and therefore uncertainty about,
3 A numerical illustration of the methodology
$$\sum_{i=1}^{7} T_{i1} = D_1 = N = 5 \tag{27}$$
$$\sum_{i=1}^{7} T_{i1} c_{i1} = C = 6. \tag{28}$$
The latter constraint involves knowledge of the travel costs, $c_{i1}$, which are
zero within a zone and unity between contiguous zones. Initially all seven
zones possess an equal number of residential opportunities, $O_i$, which do
not restrict the residential choices of our five workers. Consequently, these
constant $O_i$ values have no importance in finding a solution to our present
problem. Table 9 summarizes the information content of this hypothetical
situation.
Having fully specified our problem, the first part of its solution involves
identifying all possible sets of $T_{i1}$ values consistent with the information in
constraints (27) and (28). By inspection we allocate five persons among
seven zones and find 10 feasible sets of $T_{i1}$ values (Table 10), all other sets
being excluded by the constraints imposed. But which of the feasible sets do
we choose as being most probable given our lack of further information?
The answer is that set which maximizes our entropy or uncertainty
about the residential choices that could be made by our five workers
identified individually. We define entropy as the number of ways individual
choices at a micro-level could happen for each set of $T_{i1}$ values. This is
the combinatorial problem of finding how many ways there are of selecting $T_{i1}$
individuals from a total of N, for which the formula is:
$${}^{N}W_{T_{i1}} = \frac{N!}{(N - T_{i1})!\, T_{i1}!} \tag{29}$$
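where ! denotes factorial. Thus for our first set of interactions, $T_{11} = 4$ and
$T_{71} = 1$, we find first the number of ways of selecting four out of N = 5
persons to reside in zone 1:
$${}^{N}W_{T_{11}} = \frac{N!}{(N - T_{11})!\, T_{11}!} = \frac{5!}{1!\, 4!} = \frac{5 \times 4 \times 3 \times 2 \times 1}{1 \times (4 \times 3 \times 2 \times 1)} = 5. \tag{30}$$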
together gives us 5 × 1 = 5 possible ways our five workers could make
residential choices consistent with the first set of $T_{i1}$ values. These five
combinations are illustrated explicitly in Table 11 by attaching the
inconspicuous names of Berry, Chorley, Gould, Haggett and Wilson to our five
individuals. Formally, the product of these combinatorial expressions gives
a more convenient formula for calculating entropy, $W(T_{i1})$, as:
$$W(T_{i1}) = \frac{N!}{(N - T_{11})!\, T_{11}!} \cdot \frac{(N - T_{11})!}{(N - T_{11} - T_{71})!\, T_{71}!} = \frac{N!}{T_{11}!\, T_{71}!}, \tag{32}$$
because the $(N - T_{11})!$ terms cancel and $(N - T_{11} - T_{71})!$ equals unity.
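The equivalence between the staged count and the more convenient formula can be checked directly. The following minimal Python sketch uses the five names given in the text; the variable names are assumptions made for illustration.

```python
from itertools import combinations
from math import comb, factorial

workers = ["Berry", "Chorley", "Gould", "Haggett", "Wilson"]
N = len(workers)
T11, T71 = 4, 1   # four workers reside in zone 1, one in zone 7

# Staged count: choose T11 of N for zone 1, then T71 of the remainder for zone 7.
ways_zone1 = comb(N, T11)
ways_zone7 = comb(N - T11, T71)
staged = ways_zone1 * ways_zone7

# Convenient formula: N! / (T11! * T71!)
direct = factorial(N) // (factorial(T11) * factorial(T71))

# Explicit listing of who lives where.
assignments = []
for zone1 in combinations(workers, T11):
    zone7 = tuple(w for w in workers if w not in zone1)
    assignments.append((zone1, zone7))

print(staged, direct, len(assignments))
```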
Such entropy values for all ten sets of $T_{i1}$ values are presented in Table 10,
and it can be seen that the sixth set has the maximum value. This set of
interactions permits the maximum variety of residential choices at the
micro-level, hence the analyst is most uncertain about such individual
behaviour in this case. Looked at another way, however, this is also the
most probable interaction set, because it has the greatest number of these
micro-level possibilities (60 out of 210), which are each accorded an equal
probability of occurrence owing to the lack of further information.
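The search over feasible sets can be sketched by brute force. The following minimal Python sketch rests on one loudly stated assumption: the seven zones are taken to lie in a chain, so that the travel cost from zone i to zone 1 is i - 1. This layout is not given in the surviving text, although it is consistent with the first feasible set ($T_{11} = 4$, $T_{71} = 1$, C = 6) and with the counts quoted above.

```python
from itertools import product
from math import factorial

N = 5                            # workers, constraint (27)
C = 6                            # total travel cost, constraint (28)
cost = [0, 1, 2, 3, 4, 5, 6]     # ASSUMED c_i1 for zones 1..7 (chain of zones)

def entropy(T):
    """Number of micro-level arrangements: N! / product of T_i1!."""
    w = factorial(sum(T))
    for t in T:
        w //= factorial(t)
    return w

# Enumerate every allocation of workers to zones and keep those meeting both constraints.
feasible = [
    T for T in product(range(N + 1), repeat=len(cost))
    if sum(T) == N and sum(t * c for t, c in zip(T, cost)) == C
]

entropies = {T: entropy(T) for T in feasible}
most_probable = max(entropies, key=entropies.get)
print(len(feasible), sum(entropies.values()), most_probable, entropies[most_probable])
```

Under the assumed costs the enumeration finds ten feasible sets with 210 micro-level possibilities in all, of which the maximum-entropy set accounts for 60.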
Table 10 Feasible sets of interactions, Til, and their entropy values
Entropy values calculated from equation (42) with appropriate zone labels.
say, for example, two houses per zone (Figure 5a). In the absence of all
other attractive or detractive factors, such as travel costs, this implies that
the probabilities, $p_i$, of commuters being attracted to residence zones
would be proportional to relative zone size and thus equal for all zones:
$$p_i = \frac{O_i}{\sum_{i=1}^{7} O_i} = \frac{2}{14} = \frac{1}{7} \quad \text{for } i = 1 \text{ to } 7. \tag{33}$$
$$p_{ij} = p_i p_j = \frac{O_i D_j}{\sum_i O_i \sum_j D_j} \tag{35}$$
Thus, for our previous numerical example the interaction probabilities, $p_{i1}$,
between the single employment zone and the seven residential zones are
constant:
$$p_{i1} = \frac{2 \times 5}{14 \times 5} = \frac{1}{7} \quad \text{for } i = 1 \text{ to } 7. \tag{36}$$
probabilities and it was for this reason that we could ignore the influence of
housing opportunities in section IV, 3.
However, if we change the zoning system so that all jobs are still in zone
1, but with the 14 houses in the system distributed unequally across seven
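zones as in Figure 5b, we change the probabilities, $p_i$, as follows:
$$p_i = \frac{O_i}{\sum_i O_i} = \begin{cases} \dfrac{2}{14} = \dfrac{1}{7} & \text{for } i = 1 \text{ and } 3 \\[4pt] \dfrac{1}{14} & \text{for } i = 2, 4 \text{ and } 6 \\[4pt] \dfrac{3}{14} & \text{for } i = 5 \\[4pt] \dfrac{4}{14} = \dfrac{2}{7} & \text{for } i = 7 \end{cases} \tag{37}$$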
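These zone-size probabilities can be reproduced with exact fractions. The following is a minimal Python sketch; the list of housing counts is read from the cases above, with the value for zone 5 inferred from the requirement that the 14 houses sum correctly, so it should be treated as an assumption.

```python
from fractions import Fraction

# Houses per residence zone, zones 1..7 (zone 5 value inferred so the total is 14).
O = [2, 1, 2, 1, 3, 1, 4]
total_houses = sum(O)

# p_i = O_i / sum of O_i, kept as exact fractions.
p = [Fraction(o, total_houses) for o in O]

for zone, prob in enumerate(p, start=1):
    print(zone, prob)
```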
associated with $5^4$ (= 625) ways the same four persons may occupy five jobs.
Combining equations (30) and (39), we have an expression for the number
of ways of selecting four out of five commuters to work in five jobs in zone
1 and live in two houses in zone 1:
$$\frac{N!}{(N - T_{11})!\, T_{11}!} (O_1 D_1)^{T_{11}} = 5 \times 10^4. \tag{40}$$
Similarly for $T_{71} = 1$ person and $O_7 = 4$ houses (see Figure 5b) we multiply
equation (31) by the appropriate form of formula (38):
$$\frac{(N - T_{11})!}{(N - T_{11} - T_{71})!\, T_{71}!} (O_7 D_1)^{T_{71}} = 1 \times (4 \times 5)^1 = 20. \tag{41}$$
The product of equations (40) and (41) gives the total number of
arrangements of five persons between the two zones and within the
available jobs and houses, associated with the meso-level interactions
$T_{11} = 4$ and $T_{71} = 1$:
$$W_Z(T_{i1}) = \frac{N!}{T_{11}!\, T_{71}!} (O_1 D_1)^{T_{11}} (O_7 D_1)^{T_{71}} = 5 \times 10^4 \times 20 = 10^6. \tag{42}$$
where $\prod_{ij}$ instructs us to multiply together all $T_{ij}!$ values. The
zone-size-dependent entropy formula, exemplified in equation (42), becomes:
$$W_Z(T_{ij}) = \frac{N!}{\prod_{ij} T_{ij}!} \prod_{ij} (O_i D_j)^{T_{ij}} \tag{44}$$
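The zone-size-dependent count can be evaluated directly for the worked example. The following minimal Python sketch assumes the formula takes the form $N!/\prod T_{ij}!$ multiplied by $\prod (O_i D_j)^{T_{ij}}$; the function name is hypothetical, and the housing counts for zones other than 1 and 7 are assumptions that do not affect the result because no one lives there in this interaction set.

```python
from math import factorial

def zone_size_entropy(T, O, D):
    """N! / prod(T_ij!) * prod((O_i * D_j) ** T_ij) for an interaction matrix T[i][j]."""
    N = sum(sum(row) for row in T)
    multinomial = factorial(N)
    weight = 1
    for i, row in enumerate(T):
        for j, t in enumerate(row):
            multinomial //= factorial(t)       # exact: multinomial coefficients are integers
            weight *= (O[i] * D[j]) ** t
    return multinomial * weight

# First interaction set of the example: four workers in zone 1, one in zone 7.
T = [[4], [0], [0], [0], [0], [0], [1]]
O = [2, 1, 2, 1, 3, 1, 4]    # houses per zone; only O_1 = 2 and O_7 = 4 matter here
D = [5]                      # five jobs, all in zone 1

W_Z = zone_size_entropy(T, O, D)
print(W_Z)
```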
$$\ln W_Z(T_{ij}) = N \ln N - N - \sum_i \sum_j \left[ T_{ij} \ln\left(\frac{T_{ij}}{O_i D_j}\right) - T_{ij} \right] \tag{51}$$
$$\ln\left(\frac{T_{ij}}{O_i D_j}\right) = [\ln T_{ij} - \ln(O_i D_j)]. \tag{52}$$
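The $N \ln N - N$ term in the logarithmic entropy expression is what Stirling's approximation gives for $\ln N!$. The following minimal Python sketch compares the two; the function names are assumptions made for illustration, and `math.lgamma(n + 1)` is used as the exact value of $\ln n!$.

```python
import math

def ln_factorial_exact(n):
    """ln(n!) via the log-gamma function: lgamma(n + 1) = ln(n!)."""
    return math.lgamma(n + 1)

def ln_factorial_stirling(n):
    """Stirling's approximation in the form used in the text: n ln n - n."""
    return n * math.log(n) - n

def relative_error(n):
    exact = ln_factorial_exact(n)
    return abs(exact - ln_factorial_stirling(n)) / exact

for n in (5, 100, 10 ** 6):
    print(n, ln_factorial_exact(n), ln_factorial_stirling(n), relative_error(n))
```

The approximation is poor for a handful of workers but becomes very accurate at the scale of a real journey-to-work system.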
$$C - \sum_i \sum_j T_{ij} c_{ij} = 0 \quad (\text{multiplier } \beta) \tag{26'}$$
The second step is to form a Lagrangian expression, denoted $\mathcal{L}(T_{ij}, \lambda_i, \gamma_j, \beta)$,
where the terms in brackets indicate that the value of the Lagrangian,
$\mathcal{L}$, depends on the values of the interactions, $T_{ij}$, and of the newly
introduced Lagrange multipliers. This Lagrangian function is constructed
by taking the right-hand side of the entropy equation (50) and adding to it
the left-hand sides of the above constraints multiplied by their associated
Lagrange multipliers, to produce:
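$$\mathcal{L} = N \ln N - N - \sum_i \sum_j [T_{ij} \ln T_{ij} - T_{ij}] + \sum_i \lambda_i \Big(O_i - \sum_j T_{ij}\Big) + \sum_j \gamma_j \Big(D_j - \sum_i T_{ij}\Big) + \beta \Big(C - \sum_i \sum_j T_{ij} c_{ij}\Big) \tag{53}$$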
(54)
and which are known as the first-order conditions. Partial derivatives are
calculated with respect to one variable at a time, holding constant all other
variables on which the value of $\mathcal{L}$ depends. One important proviso is that
the differentiation of a Lagrangian function may produce partial
derivatives which are non-zero, whereas traditional unconstrained maximization
sets these derivatives to zero. In our particular case, we can assume that all
the partial derivatives in equations (54) will be zero. As constraints (4), (5)
and (26) are written as strict equalities, setting these partial derivatives to
zero will reproduce these constraints as necessary conditions of the
maximization process. Furthermore, it turns out that putting $\partial\mathcal{L}/\partial T_{ij}$ equal to zero
is permissible for positive $T_{ij}$ values, which are usually found in the models
we are considering.
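$$\frac{\partial \lambda_i}{\partial \lambda_i} = 1 \cdot \lambda_i^{0} = 1, \tag{56}$$
as any number raised to a zero power is unity. Second, anything multiplied
by $\lambda_i$ is multiplied by the derivative of $\lambda_i$; hence
$$\frac{\partial \mathcal{L}}{\partial \lambda_i} = O_i - \sum_j T_{ij} = 0. \tag{57}$$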
$$\frac{\partial \mathcal{L}}{\partial \beta} = C - \sum_i \sum_j T_{ij} c_{ij} = 0. \tag{59}$$
These three conditions clearly reproduce our original constraints. Once
they are satisfied, maximization of the Lagrangian function (53) implies the
maximization of our original entropy expression (50).
Differentiation of the Lagrangian with respect to Tij is a somewhat more
cumbersome operation. We can immediately dispose of the $(N \ln N - N)$
term as it involves no $T_{ij}$ variables, and proceed to the most difficult part of
the differentiation involving:
$$-\sum_i \sum_j [T_{ij} \ln T_{ij} - T_{ij}]. \tag{60}$$
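For the first $T_{ij}$ term we use the differentiation rules already exemplified for
$\lambda_i$; so:
$$\frac{\partial T_{ij}}{\partial T_{ij}} (\ln T_{ij}) = 1 \cdot T_{ij}^{0} (\ln T_{ij}) = \ln T_{ij}, \tag{61}$$
that is the derivative of $T_{ij}$ is multiplied by the $\ln T_{ij}$ term. In similar vein we
can differentiate $\ln T_{ij}$ and multiply its derivative by $T_{ij}$:
$$(T_{ij}) \frac{\partial \ln T_{ij}}{\partial T_{ij}} = (T_{ij}) \frac{1}{T_{ij}} = 1. \tag{62}$$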
The differentiation rule for products states that we add (61) and (62), so:
$$\frac{\partial (T_{ij} \ln T_{ij})}{\partial T_{ij}} = \ln T_{ij} + 1.$$
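$$\frac{\partial\, \lambda_i \big(O_i - \sum_j T_{ij}\big)}{\partial T_{ij}} = -\lambda_i \cdot 1 \cdot T_{ij}^{0} = -\lambda_i \tag{66}$$
$$\frac{\partial\, \gamma_j \big(D_j - \sum_i T_{ij}\big)}{\partial T_{ij}} = -\gamma_j \cdot 1 \cdot T_{ij}^{0} = -\gamma_j \tag{67}$$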
$$\frac{\partial\, \beta \big(C - \sum_i \sum_j T_{ij} c_{ij}\big)}{\partial T_{ij}} = -\beta c_{ij} \tag{68}$$
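The term-by-term differentiation can be checked numerically. The following minimal Python sketch assumes the Lagrangian is built as the text describes: the entropy expression $N \ln N - N - \sum_i \sum_j [T_{ij} \ln T_{ij} - T_{ij}]$ plus each constraint multiplied by its multiplier. Under that assumption the pieces combine to $-\ln T_{ij} - \lambda_i - \gamma_j - \beta c_{ij}$, which is compared with a central finite difference. All names and numerical values are hypothetical, and the trial values need not satisfy the constraints for the derivative check to hold.

```python
import math

def lagrangian(T, N, O, D, C, c, lam, gam, beta):
    rows, cols = range(len(O)), range(len(D))
    entropy = N * math.log(N) - N - sum(t * math.log(t) - t for row in T for t in row)
    origin = sum(lam[i] * (O[i] - sum(T[i])) for i in rows)
    destination = sum(gam[j] * (D[j] - sum(T[i][j] for i in rows)) for j in cols)
    cost = beta * (C - sum(T[i][j] * c[i][j] for i in rows for j in cols))
    return entropy + origin + destination + cost

def analytic_partial(T, c, lam, gam, beta, i, j):
    """Sum of the term-by-term results: -ln T_ij - lambda_i - gamma_j - beta c_ij."""
    return -math.log(T[i][j]) - lam[i] - gam[j] - beta * c[i][j]

def numeric_partial(T, i, j, h=1e-6, **params):
    up = [row[:] for row in T]
    down = [row[:] for row in T]
    up[i][j] += h
    down[i][j] -= h
    return (lagrangian(up, **params) - lagrangian(down, **params)) / (2 * h)

# Hypothetical 2 x 2 system with arbitrary positive trial values.
T = [[3.0, 2.0], [1.5, 4.0]]
params = dict(N=10.5, O=[5.0, 5.5], D=[4.5, 6.0], C=16.0,
              c=[[1.0, 2.0], [2.0, 1.0]], lam=[0.3, -0.2], gam=[0.1, 0.4], beta=0.5)

for i in range(2):
    for j in range(2):
        a = analytic_partial(T, params["c"], params["lam"], params["gam"], params["beta"], i, j)
        n = numeric_partial(T, i, j, **params)
        print(i, j, a, n)
```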
$$p_{ij} = \frac{T_{ij}}{N} \tag{75}$$
this measure can be defined as:
$$-\sum_i \sum_j p_{ij} \ln p_{ij} \tag{76}$$
to zone-size-dependent entropy measure of
equation (51).
equal).
Acknowledgements
It is a pleasure to acknowledge the help and advice of Huw Williams (Leeds
University) and of Tony Gatrell and Barrie Gleave (both of Salford
University) during the preparation of this paper. As usual, they are
absolved of any responsibility for the final product. Excellent cartographic
and typing assistance was willingly supplied by Christine Minister, Gustav
Dobrzynski and Barbara Senior.
VII References
Olsson, G. 1965: Distance and human interaction: a review and bibliography.
Philadelphia: Regional Science Research Institute.
O’Sullivan, P. and Ralston, B. 1974: Forecasting intercity commodity transport
in the USA. Regional Studies 8, 191-5.
Semple, R. K. and Golledge, R. G. 1970: An analysis of entropy changes in a
settlement pattern over time. Economic Geography 46, 157-60.
Senior, M. L. 1979: Entropy maximization: a pedagogic note on Gould’s pedagogic
review. University of Salford, Department of Geography, forthcoming
discussion paper.
Shannon, C. and Weaver, W. 1949: The mathematical theory of communication.
Urbana: University of Illinois Press.
Sheppard, E. S. 1976: Entropy, theory construction and spatial analysis. Environ-
ment and Planning A 8, 741-52.