Senior 1979

From gravity modelling
to entropy maximizing:
a pedagogic guide
by Martyn L. Senior
I Dispelling the entropy maximizing mystique

Within geography the entropy maximizing methodology has apparently
acquired a mystique not yet completely dispelled by various pedagogic
reviews (Gould, 1972; Crowther, 1974; Cesario, 1975; Webber, 1977;
Haggett et al., 1977). At least, geographers have a rather ambivalent
attitude towards the methodology. Since it has been closely associated with
modelling spatial interactions, geographers have rightly felt obliged to
understand it. Nevertheless, many feel uncomfortable with a methodology
which has had a longer association with transport planning and regional
science than geography, which demands mathematical skills rather than the
more familiar statistical ones, and which involves a concept, entropy,
seemingly so difficult to define and interpret unambiguously. All this serves
to generate the impression that entropy maximization is more complex
than it really is (Gould, 1972).
Because the methodology has not been readily absorbed into geography,
the Newtonian gravity model has had a prolonged lease of life despite its
serious shortcomings, which are discussed in section II. Four main deficien-
cies are identified. First, the model is based on a physical law and lacks an
independent socio-geographical justification. Second, it is a wholly aggregate
model, saying nothing of the way group interactions relate to individual
interaction behaviour. Third, it is incapable of predicting interactions
which are consistent with known constraints on the number of trips
leaving and/or terminating at each zone. Fourth, its forecasts tend to
exaggerate changes in the amounts of spatial interaction as opportunities
for that interaction change.
Geographers have been largely unaware of the pragmatic adaptations to
the Newtonian model made by transport analysts, who required a sensible
trip forecasting tool for practical planning applications. By incorporating
certain balancing terms in the model, they were able to make its trip
predictions consistent with given trip-making information. In section III it
is shown that by varying this prior information a family of gravity models
may be specified, which have systematic and intuitively appropriate varia-
tions in structure. Such models, however, are still dependent on the
Newtonian analogy and still fail to address the aggregation problem.
In section IV the focus turns to entropy maximizing. At this stage we
dispense with the Newtonian gravity model altogether, and consider how
entropy maximizing methods build models from scratch. Using an urban
journey-to-work example, it is argued that the analyst typically lacks
information about numerous individual commuters, but has reasonable
access to associated aggregate data on numbers of jobs in certain districts of
the city, on the number of commuters residing in each zone, and on
average travel costs. What entropy maximization enables him to do is to
estimate amounts of zone-to-zone interactions which are consistent with
this aggregate information, but which are also as unbiased as possible about
the admittedly unknown details of where individual commuters live and
work. Thus maximizing entropy in this context means maximizing the
analyst’s uncertainty, or minimizing his bias, about these individual com-
muter interactions. As a measure of uncertainty entropy may be defined as
a combinatorial statistic giving, for a set of zone-to-zone interactions, the
number of ways individual commuters could choose to live and work in
the set of zones. In maximizing entropy the aim is to identify the set of
interzonal interactions which maximizes this number and yet is consistent
with available aggregate information. In section IV, 4 we briefly consider
recent suggestions that zone size has an important effect on this entropy
measure.
Up to the end of section IV, 4 the arguments put forward are illustrated
where possible with numerical examples. Indeed, it is a feature of sections II
and III that the evolving argument is coordinated by reference to illus-
trations which utilize a common hypothetical data base. The author’s
experience, both as student and teacher, suggests that readers should check
their comprehension by working through these examples for themselves.
At the beginning of section IV, 5 readers will detect a discrete rise in the
complexity of the argument as we turn to the mathematics of entropy
maximization, which will inevitably place a severe strain on O-level
mathematical skills. Indeed, this section seems particularly formidable and
the temptation will be to ignore it. This would be an unfortunate reaction,
for without it the mathematically uninitiated reader is likely to appreciate
entropy maximization only at a simplified numerical level, and may well
be incapable of making the connection between numerical examples and
the gravity-like forms that the mathematics produces. Undoubtedly sec-
tion IV, 5 demands great perseverance from such readers, but they will find
the mathematics is set out in greater detail and more fully than is usual
elsewhere. Consequently, it should not be excessively difficult to progress
from one step to another in the mathematical argument.
In section V an attempt is made to unravel the confusion surrounding the
entropy concept. Our view of entropy as measuring an analyst’s uncer-
tainty in situations of limited information is reiterated, and it is pointed out
that this use of entropy is common to statistical mechanics and information
theory, and is intimately related to Bayesian methods of statistical inference.
A separate use of entropy as a statistic which is descriptive of spatial
patterns is identified. Finally, the vague and spurious application of
entropy as an undefined indicator of ‘disorder’, or lack of structure, in
geographical systems is distinguished as a distinct misuse of the concept.
The paper is concluded by pointing readers in the direction of further
substantive issues concerning the potentialities and limitations of the
entropy maximizing methodology.
II A critique of the Newtonian gravity model

1 Structure and interpretation
The social gravity model is a direct analogy with Newton’s physical law
that the magnitude, Tij, of the attractive force between two entities, i and j,
is directly proportional to the product of their masses, Mi and Vj, and
inversely proportional to the squared distance, $d_{ij}^{2}$, between them. Incorporating
the gravitational constant, G, this law is formally stated as:
$$T_{ij} = G M_i V_j d_{ij}^{-2} \qquad (1)$$
where $d_{ij}^{-2}$ is a convenient shorthand for $1/d_{ij}^{2}$.
This model has been applied to various spatial interaction situations
where: i and j are locational or zonal entities (e.g. cities, regions, neighbourhoods);
Tij records the amount of interzonal interaction (e.g. number of
trips); Mi and Vj are appropriate measures of zone size (e.g. population)
reflecting zonal propensities to emit and attract interactions; dij measures
spatial separation; and G is an empirically determined constant whose role
will be clarified shortly.
The model suggests that the amount of interaction, Tij:
i) increases linearly as G increases (Figure 1a);
ii) increases nonlinearly with simultaneous increases in Mi and Vj (Figure
1b illustrates the particular case of Mi = Vj);
iii) decreases, but progressively less rapidly, with increasing distance
(Figure 1c).
Figure 1c illustrates the point that Newton’s squared distance term, $d_{ij}^{-2}$, is a
particular case of the negative power function, $d_{ij}^{-\beta}$; that is, β equals two in
Newton’s law. The empirically determined parameter, β, reflects the
responsiveness of the amount of interaction to changes in spatial separation,
and has a value typically less than two in social contexts.
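The three properties of Tij listed above can be checked numerically. What follows is a minimal Python sketch rather than anything taken from the original argument: the function name gravity_flow and every input value are illustrative assumptions.

```python
def gravity_flow(G, M_i, V_j, d_ij, beta=2.0):
    """Equation (1) with a general exponent: T_ij = G * M_i * V_j * d_ij ** (-beta)."""
    return G * M_i * V_j * d_ij ** (-beta)


# Illustrative inputs only (assumed; not read from Figure 1).
base = gravity_flow(G=1.0, M_i=10.0, V_j=10.0, d_ij=2.0)

# i) doubling G doubles the interaction
double_G = gravity_flow(G=2.0, M_i=10.0, V_j=10.0, d_ij=2.0)

# ii) doubling both masses together quadruples the interaction
double_masses = gravity_flow(G=1.0, M_i=20.0, V_j=20.0, d_ij=2.0)

# iii) with beta = 2, doubling the distance cuts the interaction to a quarter
double_distance = gravity_flow(G=1.0, M_i=10.0, V_j=10.0, d_ij=4.0)

print(base, double_G, double_masses, double_distance)
```

Passing a smaller beta (say 1.0) to the same function flattens the distance decay, which is the sense in which social applications typically use an exponent below Newton's value of two.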
2 The physical analogy issue
The gravity model is geography’s foremost example of a physical analogy;
that is, a law established for a physical system which is used to elucidate
apparently parallel features of social systems without independent justification.
Such analogies may lead to new insights but also to possibly misleading
conclusions, so neither uncritical acceptance nor premature rejection of
them is desirable.

Figure 1 Graphical decomposition of the Newtonian gravity model

Newton’s model has prompted numerous attempts to discern gravity-
like phenomena in human spatial behaviour, and in turn these have
produced modifications to the Newtonian model (e.g. to the squared
distance term) to improve its descriptive ability in geographical contexts
(Taylor, 1975). However, such modifications do not alter the model’s
physical origins. The gravity law may apparently ‘fit’ geographic phenomena,
but if it is to be anything more than a purely descriptive device,
and indeed if it is to acquire theoretical and predictive status, it must be
derived in relation to the socio-spatial contexts in which these geographic
phenomena are produced. As Wilson remarks ‘... though a part of the
study may proceed through analogy, in the end the analogy must be
thrown away’ (1969, 159).
3 The aggregation problem

The aggregation problem involves relating consistently the behaviour of
single ’individuals’ to the collective behaviour of a ’group’ comprising
these individuals. In physics the individuals might be single gas particles and
the group the whole gaseous system; in geography the individuals might be
single commuters and the group the total city labour force.
This problem is typically ignored in geographical applications of the
gravity model, which treat it solely as an aggregate formulation referring
to groups of decision-makers and zonal quantities of space. Indeed Olsson
claims ‘... that neither the social nor the physical gravity concepts can be
used for studying anything other than groups or masses; consequently,
these concepts cannot be applied to movements made by single individuals
or molecules’ (1965, 4). This statement is contradicted by Newton’s initial
conception of the gravitational law which treated the earth and moon as
single particles; that is, dimensionless objects with mass. Although no
physical object conforms to this definition, some are less particulate than
others (French, 1971) and the size and heterogeneous composition of
planets makes them least amenable to accurate treatment as particles.
Indeed Newton faced time-consuming mathematical difficulties in aggre-
gating from the inter-particle level of attraction to the inter-planetary one
(Bronowski, 1973, 233). However, he deduced, for spherical objects at
least, that the gravitational law applied universally at micro and aggregate
levels, and that the mass of an object could be considered to be concentrated
at its centre.
The analogous aggregation problems raised by geographical uses of the
gravity model, concerning the relationship between individual and aggre-
gate spatial interaction behaviour, have been barely addressed, apparently
because geographers have pursued only the aggregate analogy. Yet such
aggregation issues pervade strategic planning studies which, by definition,
deal with large areas of space containing numerous individuals. To be
workable, strategic planning models must treat persons and space aggregatively,
but how do such models relate to the individual location behaviour
which they supposedly subsume and reflect in aggregate? This
question still poses a major research challenge in geography. Entropy
maximizing methods offer a workable, if not ideal, answer to it.
4 The internal consistency question
Since the early 1960s gravity models have been widely used in transport
planning, but transport analysts have had to rectify intrinsic deficiencies in
their formulation to make them sensible forecasting devices. By compari-
son geographers have been slow to appreciate these problems.
To understand the internal consistency question, consider a simple
journey-to-work example where commuters make trips from two resi-
dence zones (i=1, 2) to three employment zones (j=1, 2, 3). The gravity
model predicts the number of commuter trips in the two-by-three Tij
matrix (Table 1). Further predictions are obtained by:
i) summing all trip values to give a prediction of the system-wide trip
total, $\sum_i \sum_j T_{ij}$;
ii) summing the trips in each row of the matrix to give a prediction of
trip ends, $\sum_j T_{ij}$, originating in i zones;
iii) summing the trips in each column to give a prediction of trip ends,
$\sum_i T_{ij}$, attracted to j zones.
The internal consistency question asks whether these latter row and column
sums should logically equal the known values of the mass terms, Mi and Vj
respectively. The answer depends on how the model user interprets his
interaction problem and defines the mass terms.
Table 1 A commuter trip matrix, Tij, with summations of trip elements

Most early model applications (Olsson, 1965) treated the mass terms as
proxy variables for the amount of interaction likely to emanate from origin
zones and be attracted to destination zones. For our journey-to-work
example Mi might be defined as zonal resident population and Vj as acreage
of industrial plus commercial land use. These measures stand proxy for the
number of commuter trips likely to be generated in residential zones and
attracted to employment zones. Consequently, we would expect the
predictions of zonal trip origins and attractions given by the row and
column sums of the trip matrix to be proportional to values of Mi and Vj,
but certainly not equal to them:
$$\sum_j T_{ij} \propto M_i; \quad \sum_j T_{ij} \neq M_i \qquad (2)$$
$$\sum_i T_{ij} \propto V_j; \quad \sum_i T_{ij} \neq V_j \qquad (3)$$
Equality signs would be absurd because Tij, Mi and Vj are measured in
different units.
However, in transport models commonly one or both mass terms are
defined precisely as numbers of trip origins, Oi, and/or trip attractions, Dj.
This arises because the transport analyst divides trip making into four
components:
i) trip generation and attraction: the decision to make a trip and how
often;
ii) trip distribution: locational choice of trip destination;
iii) modal split: choice of mode of transport;
iv) assignment: choice of route through a transport network.
The gravity model is used for trip distribution, but is preceded by trip
generation and attraction models providing independent estimates of zonal
trip origins and attractions which subsequently become the mass terms of
the gravity model. Hence, the definitions of the row and column sums of
the predicted trip matrix coincide exactly with the definitions of the Oi and
Dj mass terms respectively, and the logical expectation is that their values
should be exactly consistent; that is:
$$\sum_j T_{ij} = O_i \qquad (4)$$
$$\sum_i T_{ij} = D_j. \qquad (5)$$
Unfortunately the Newtonian gravity model of equation (1), with OiDj
replacing MiVj, cannot satisfy either of conditions (4) or (5).
As a numerical illustration, imagine that we have observed distance and
trip matrices, dij and $T_{ij}^{(\mathrm{obs})}$, at one point in time for our journey-to-work
example (Table 2). Summing each row and column of the trip matrix
produces the observed number of trip origins, Oi, and attractions, Dj,
respectively. Summing all trips in the matrix gives the observed system-wide
trip total, defined as N for convenience; so:
$$\sum_i \sum_j T_{ij}^{(\mathrm{obs})} = N = \sum_i O_i = \sum_j D_j. \qquad (6)$$
Table 2 Known distance and trip matrices
Suppose we use these observed trip origins and attractions as mass terms in a
gravity model of the form:
$$T_{ij} = G O_i D_j d_{ij}^{-\beta} \qquad (7)$$
and seek a predicted trip matrix, Tij, matching the observed trips, $T_{ij}^{(\mathrm{obs})}$, as
closely as possible. Geographers have conventionally tackled this problem
by converting equation (7) into a logarithmically linear form:
$$\log\left(\frac{T_{ij}}{O_i D_j}\right) = \log G - \beta \log d_{ij}, \qquad (8)$$
and then have used least-squares regression analysis to calibrate (that is, find
best-fitting values of) the intercept, log G, and regression coefficient, β, by
treating $\log\left(T_{ij}/(O_i D_j)\right)$ and $\log d_{ij}$
as the dependent and independent variables respectively (Figure 2).
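The calibration of equation (8) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a reproduction of the paper's calculation: the matrices below are hypothetical stand-ins (the values of Table 2 are not used), and calibrate_log_linear is a name introduced here. The fit is ordinary least squares on a single regressor, written out by hand so that no external library is needed.

```python
import math


def calibrate_log_linear(T_obs, O, D, d):
    """Fit equation (8), y = log G - beta * x, by ordinary least squares.

    y = log(T_ij / (O_i * D_j)) and x = log(d_ij); returns (G, beta).
    """
    xs, ys = [], []
    for i in range(len(O)):
        for j in range(len(D)):
            xs.append(math.log(d[i][j]))
            ys.append(math.log(T_obs[i][j] / (O[i] * D[j])))
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return math.exp(intercept), -slope  # G = exp(intercept), beta = -slope


# Hypothetical observations: two residence zones, three employment zones.
d = [[1.0, 2.0, 3.0], [2.0, 1.0, 2.0]]          # assumed distance matrix
T_obs = [[60.0, 30.0, 10.0], [90.0, 60.0, 50.0]]  # assumed observed trips
O = [sum(row) for row in T_obs]                   # trip origins: row sums
D = [sum(T_obs[i][j] for i in range(2)) for j in range(3)]  # attractions: column sums

G_hat, beta_hat = calibrate_log_linear(T_obs, O, D, d)
print(G_hat, beta_hat)
```

Note that the slope of the fitted line is minus β, so the function negates it before returning.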
Table 3 presents the results from a regression analysis applied to our
numerical example. Clearly the log-linear gravity model violates condi-
tions (4) and (5) as neither the row nor column quantities of the Tij matrix
sum respectively to the known Oi and Dj values. Furthermore, the model
violates the less restrictive condition that systemwide predicted and
observed trip totals are equal, namely:
$$\sum_i \sum_j T_{ij} = N = \sum_i \sum_j T_{ij}^{(\mathrm{obs})}. \qquad (9)$$
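The violation of conditions (4), (5) and (9) is easy to reproduce. The sketch below assumes hypothetical inputs and an assumed pair of calibrated values for G and β (none of them are the figures of Tables 2 and 3); newtonian_trips is an illustrative name.

```python
def newtonian_trips(G, O, D, d, beta):
    """Equation (7): T_ij = G * O_i * D_j * d_ij ** (-beta)."""
    return [[G * O[i] * D[j] * d[i][j] ** (-beta) for j in range(len(D))]
            for i in range(len(O))]


# Hypothetical inputs (assumed for illustration only).
O = [100.0, 200.0]                       # known trip origins
D = [150.0, 90.0, 60.0]                  # known trip attractions
d = [[1.0, 2.0, 3.0], [2.0, 1.0, 2.0]]   # distance matrix
G, beta = 0.0045, 1.0                    # assumed calibrated parameters

T = newtonian_trips(G, O, D, d, beta)
row_sums = [sum(row) for row in T]
col_sums = [sum(T[i][j] for i in range(len(O))) for j in range(len(D))]
total = sum(row_sums)
N = sum(O)

print("row sums", row_sums, "against O", O)      # condition (4) fails
print("column sums", col_sums, "against D", D)   # condition (5) fails
print("total", total, "against N", N)            # condition (9) fails
```

Nothing in equation (7) ties the predicted margins to Oi, Dj or N, so agreement would occur only by coincidence.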
One might defend the model by suggesting that its predictions of trip
origins and attractions, as well as trip interactions, make the transport
planner’s trip generation and attraction models redundant. However, this
would demand too much of the gravity model’s predictive ability in
replicating reality at satisfactory levels of accuracy. In any event the internal
consistency question has further repercussions when future trip interactions
are to be forecast.

Figure 2 Graphical form of the log-linear gravity model, equation (8)

Table 3 Results from the log-linear gravity model, equation (8): an illustration of internal inconsistency

5 The forecasting dilemma
The multiplicative structure of the Newtonian gravity model endows it
with forecasting properties which are often inappropriate for geographical
situations. Returning to our numerical example, assume that independent
forecasts suggest a doubling of employed persons in each residence zone
and a commensurate doubling of job opportunities in each employment
zone, which we can reasonably take to imply a doubling of trip origins, Oi,
and attractions, Dj. Intuitively we would expect predicted trip interactions
to double, too. However, using the already calibrated log-linear gravity
model, with β and G held constant, to forecast the impact of doubling Oi
and Dj, we obtain the nonsensical result that all trips quadruple (compare
Tables 4 and 3).
Table 4 Predictions of the log-linear gravity model when
trip origins and attractions are doubled: an illustration of
forecasting problems

This is a particularly blatant illustration of the forecasting problem; more
realistic examples are evident from the literature (Traffic Research Corporation,
1969; O’Sullivan and Ralston, 1974). Generally, we should be
aware that the gravity model may well exaggerate changes in the amounts
of interaction, particularly when both mass values increase or decrease
simultaneously.
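The quadrupling follows directly from the product OiDj in equation (7): doubling both terms multiplies every Tij by four when G and β are held fixed. A minimal Python sketch, using hypothetical inputs and assumed parameter values rather than those of Tables 3 and 4:

```python
def newtonian_trips(G, O, D, d, beta):
    """Equation (7): T_ij = G * O_i * D_j * d_ij ** (-beta)."""
    return [[G * O[i] * D[j] * d[i][j] ** (-beta) for j in range(len(D))]
            for i in range(len(O))]


# Hypothetical inputs and assumed calibrated parameters (illustration only).
O = [100.0, 200.0]
D = [150.0, 90.0, 60.0]
d = [[1.0, 2.0, 3.0], [2.0, 1.0, 2.0]]
G, beta = 0.0045, 1.0

base = newtonian_trips(G, O, D, d, beta)

# Forecast: every O_i and D_j doubles, G and beta unchanged.
forecast = newtonian_trips(G, [2 * o for o in O], [2 * x for x in D], d, beta)

ratios = [[forecast[i][j] / base[i][j] for j in range(len(D))]
          for i in range(len(O))]
print(ratios)  # every ratio is 4, not the intuitively expected 2
```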
III Heuristic derivation of a family of improved gravity models

With the advent of strategic land use/transportation studies, transport
analysts effected the necessary modifications to the Newtonian model to
overcome the internal consistency and forecasting problems. Effectively
they provided a heuristic derivation of improved gravity models which
were later to be more convincingly justified using entropy maximizing
methods.
To appreciate these improved models recall that the Newtonian gravity
model, calibrated by linear regression methods, is totally unconstrained as
its trip predictions do not satisfy trip origin, trip attraction and systemwide
trip constraints, equations (4), (5) and (9) respectively. By imposing, either
singly or in combination, these three sets of constraints a family of intern-
ally consistent gravity models can be constructed which are appropriate to
sundry spatial interaction situations. By adopting alternative locational
interpretations of the journey-to-work (Wilson, 1970a), members of this
family can be illustrated using our previous numerical example (Table 2).
Mathematically all that is required is substitution of one equation into
another.
1 The total interaction constrained model

Our first problem is to estimate the number of commuters, Tij, travelling
between each i-j zone pair, when our only knowledge of tripmakers is their
total number in the system, N. In this context journey-to-work interactions
are formed by the N commuters simultaneously choosing residential
locations, i, and job locations, j. The mass terms, Oi and Dj, indicative of
housing and job opportunities respectively, attract but do not constrain
these locational choices. Consequently only constraint equation (9) applies
and an appropriate model is:
$$T_{ij} = K O_i D_j d_{ij}^{-\beta}, \qquad (10)$$
which is structurally identical to the Newtonian model of equation (7).
However, K is not an independent parameter like G, but a balancing factor
calculated to ensure that constraint (9) is met. Simply substituting for Tij in
this constraint using the right-hand side of equation (10), and then rearranging
terms:
$$\sum_i \sum_j T_{ij} = N = \sum_i \sum_j K O_i D_j d_{ij}^{-\beta} \qquad (11)$$
$$K = \frac{N}{\sum_i \sum_j O_i D_j d_{ij}^{-\beta}} \qquad (12)$$
produces an expression for K in terms of total tripmakers, the mass
variables and distance function.
Our new model of equation (10) may then be rewritten as:
$$T_{ij} = N \left[ \frac{O_i D_j d_{ij}^{-\beta}}{\sum_i \sum_j O_i D_j d_{ij}^{-\beta}} \right] \qquad (13)$$
by using equation (12) to substitute for K. This model is then seen to be
intrinsically nonlinear; conversion to a linear form, like equation (8), is
impossible because the $d_{ij}^{-\beta}$ terms in the denominator are contained inextricably
in the $\sum_i \sum_j$ summation. The expressions in square brackets in
equation (13) are probabilities of a commuter living in i and working in j;
over all zone pairs they sum to one and thus ensure that predicted commuter
interactions sum to N.
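Equations (10) and (12) translate almost line for line into code. The following Python sketch uses hypothetical inputs (not the paper's numerical example) and an illustrative function name, total_constrained; it shows that the predicted trips sum to N whatever value β takes.

```python
def total_constrained(N, O, D, d, beta):
    """Total interaction constrained model, equations (10) and (12)."""
    weights = [[O[i] * D[j] * d[i][j] ** (-beta) for j in range(len(D))]
               for i in range(len(O))]
    K = N / sum(sum(row) for row in weights)        # balancing factor, equation (12)
    T = [[K * w for w in row] for row in weights]   # equation (10)
    return K, T


# Hypothetical inputs (assumed for illustration only).
O = [100.0, 200.0]                       # housing opportunities
D = [150.0, 90.0, 60.0]                  # job opportunities
d = [[1.0, 2.0, 3.0], [2.0, 1.0, 2.0]]   # distance matrix
N = 300.0                                # total commuters in the system

for beta in (0.5, 1.0, 2.0):
    K, T = total_constrained(N, O, D, d, beta)
    print(beta, K, sum(sum(row) for row in T))  # the total always equals N
```

Dividing each Tij by N gives the bracketed probabilities of equation (13), which sum to one over all zone pairs.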
Equations (10) and (12) have been used to recalculate predicted interactions
for our numerical example (Table 5). In principle β should be
recalibrated to reoptimize the fit between these new Tij values and the
observed trips, $T_{ij}^{(\mathrm{obs})}$, of Table 2, but for comparative purposes the previous
β = 1 value is retained. The calculated K value guarantees that the Tij
interactions sum to the given number of commuters, N, and this will
always be the case no matter what value β takes. The row and column sums
of the interaction matrix record respectively the unconstrained zonal
choices of, or demand for, residences and jobs by the N commuters, and
thus need not equal the associated Oi and Dj values.

Table 5 Results from the total interaction constrained
gravity model, equations (10) and (12)

2 Production constrained gravity models

In the total interaction constrained model commuters are deemed to be
choosing job and residential locations, rather than choosing either job
locations with respect to previously determined residential choices or
residences with reference to already decided workplace locations. As the
realism of this ‘simultaneous choice’ interpretation seems questionable,
assume that we now know the number of commuters, Oi, resident in each i
zone and that journey-to-work interactions are formed by their job location
choices, j. A trip origin connotation is now associated with the Oi
values, to which all predicted commuter interactions leaving each i zone
should sum. These conditions are embodied in constraint (4), and can be
satisfied only by replacing the single systemwide balancing factor, K, with
residence-zone-dependent factors, Ai:
$$T_{ij} = A_i O_i D_j d_{ij}^{-\beta}. \qquad (14)$$
As before, substitution for Tij in constraint (4) using equation (14), and
rearrangement of terms, gives an expression for calculating the new balancing
factors:
$$\sum_j T_{ij} = O_i = \sum_j A_i O_i D_j d_{ij}^{-\beta} \qquad (15)$$
$$A_i = \frac{1}{\sum_j D_j d_{ij}^{-\beta}} \qquad (16)$$
The constrained commuter flows, Tij, leaving each residence zone are
destined for workplace zones, whose available job opportunities, reflected
in the Dj measures, attract but do not constrain the job choices or demand
of the commuting population.
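A short Python sketch of equations (14) and (16) follows. The inputs are hypothetical (not those of the paper's example) and production_constrained is a name chosen here for illustration. Each row of the predicted matrix sums to the given Oi, while the column sums are left free.

```python
def production_constrained(O, D, d, beta):
    """Production constrained model, equations (14) and (16)."""
    T = []
    for i, O_i in enumerate(O):
        # Equation (16): one balancing factor per residence zone
        A_i = 1.0 / sum(D[j] * d[i][j] ** (-beta) for j in range(len(D)))
        # Equation (14)
        T.append([A_i * O_i * D[j] * d[i][j] ** (-beta) for j in range(len(D))])
    return T


# Hypothetical inputs (assumed for illustration only).
O = [100.0, 200.0]                       # commuters resident in each zone i
D = [150.0, 90.0, 60.0]                  # job opportunities in each zone j
d = [[1.0, 2.0, 3.0], [2.0, 1.0, 2.0]]   # distance matrix
beta = 1.0

T = production_constrained(O, D, d, beta)
row_sums = [sum(row) for row in T]
col_sums = [sum(T[i][j] for i in range(len(O))) for j in range(len(D))]

print(row_sums)  # reproduces O, as constraint (4) requires
print(col_sums)  # job demand by zone; not forced to equal D
```

Swapping the roles of rows and columns (a factor per employment zone, computed from the Oi) gives the reverse, residential choice version of the same model.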
Table 6 presents the results of using equations (14) and (16) to predict
interactions for our numerical example. Clearly the row sums of the
interaction matrix record commuter numbers leaving each residence zone
and accord with the given Oi values, whereas the column sums are estimates
of their zonal job choices or demands, which are not restricted to match the
workplace attraction values, Dj.

Table 6 Results from a production constrained gravity
model, equations (14) and (16): employment choice
interpretation

The logic of this employment choice model can be reversed by assuming
that we know instead the number of commuters, Dj, already employed in
each workplace zone j, and that journey-to-work interactions result from
their residential location choices, i. Commuter interactions now originate
in each workplace zone, j, and their consistency with the given commuter
numbers, Dj, is required. Equation (5) replaces equation (4) as the appropriate
constraint, and is satisfied by the calculation of employment-zone-dependent
balancing factors, Bj, which replace Ai in equation (14) to give:
$$T_{ij} = B_j O_i D_j d_{ij}^{-\beta}. \qquad (17)$$
The reader may easily check that substitution of equation (17) into constraint
(5) leads to the following expression for Bj:
$$B_j = \frac{1}{\sum_i O_i d_{ij}^{-\beta}} \qquad (18)$$
The Oi terms now reflect housing opportunities which attract but do not
constrain commuter interactions terminating in each residence zone.
Table 7 presents the numerical results from this model. The column sums
now record commuter numbers leaving each employment zone and match
the given Dj values, while the row sums predict commuters’ residential
choices or demands, which need not coincide with the housing supply
values Oi.

Table 7 Results from a production constrained gravity
model, equations (17) and (18): residential choice
interpretation

3 Production-attraction constrained gravity models

The essential features of the production constrained models are constraints
applied at the origin (or fixed location) ends of commuter interactions, but
no restrictions at the destination (or locational choice) ends. However, we
may now envisage either residentially fixed commuters, Oi, choosing jobs
in zones j such that their zonal demands adjust exactly to job supply, Dj; or
commuters already employed in zones j, Dj, selecting residences in zones i
such that their zonal demands match exactly housing supply, Oi. Such
situations imply the existence of not only trip origin or production constraints,
but trip destination or attraction ones too. Hence constraints (4)
and (5) operate simultaneously and are satisfied by calculating both sets of
zone-dependent balancing factors, Ai and Bj. The appropriate model thus
becomes:
$$T_{ij} = A_i B_j O_i D_j d_{ij}^{-\beta}. \qquad (19)$$
Equation (19) is used to substitute for Tij in both constraints (4) and (5),
giving respectively:
$$\sum_j T_{ij} = O_i = \sum_j A_i B_j O_i D_j d_{ij}^{-\beta} \qquad (20)$$
$$\sum_i T_{ij} = D_j = \sum_i A_i B_j O_i D_j d_{ij}^{-\beta}. \qquad (21)$$
By rearranging terms we have revised expressions for Ai and Bj:
$$A_i = \frac{1}{\sum_j B_j D_j d_{ij}^{-\beta}} \qquad (22)$$
$$B_j = \frac{1}{\sum_i A_i O_i d_{ij}^{-\beta}} \qquad (23)$$
Figure 3 Iterative calculation scheme for the balancing factors of the production-attraction
constrained model

As both Ai and Bj are unknown, their solution from these two equations
appears to pose a problem because each Ai value is dependent on all Bj
values and vice versa. Fortunately, either all Ai or all Bj values may be set
initially to any arbitrary value, and equations (22) and (23) solved iteratively
(Figure 3). Such a procedure is guaranteed to converge to unique
values of the products, AiBj, no matter what arbitrary starting value is chosen
(Evans, 1970). Separately neither Ai nor Bj values are unique because the
products, AiBj, are unchanged if Ai is multiplied, and Bj divided simultaneously,
by any constant.
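The iterative solution of equations (22) and (23) can be sketched as follows. This is an illustrative Python version under assumptions of our own: the inputs are hypothetical, the function name doubly_constrained and the fixed iteration count are choices made here, and the starting value is taken to be positive. The trip origins and attractions must have the same total for both constraints to be attainable.

```python
def doubly_constrained(O, D, d, beta, start=1.0, iterations=100):
    """Production-attraction constrained model, equations (19), (22), (23)."""
    n, m = len(O), len(D)
    f = [[d[i][j] ** (-beta) for j in range(m)] for i in range(n)]
    A = [start] * n  # arbitrary positive starting value for every A_i
    for _ in range(iterations):
        # Equation (23): B_j from the current A_i values
        B = [1.0 / sum(A[i] * O[i] * f[i][j] for i in range(n)) for j in range(m)]
        # Equation (22): A_i from the updated B_j values
        A = [1.0 / sum(B[j] * D[j] * f[i][j] for j in range(m)) for i in range(n)]
    # Equation (19)
    T = [[A[i] * B[j] * O[i] * D[j] * f[i][j] for j in range(m)] for i in range(n)]
    return A, B, T


# Hypothetical inputs (assumed for illustration only); sum(O) equals sum(D).
O = [100.0, 200.0]
D = [150.0, 90.0, 60.0]
d = [[1.0, 2.0, 3.0], [2.0, 1.0, 2.0]]

A, B, T = doubly_constrained(O, D, d, beta=1.0)
row_sums = [sum(row) for row in T]
col_sums = [sum(T[i][j] for i in range(len(O))) for j in range(len(D))]
print(row_sums, col_sums)

# A different starting value changes A and B separately but not the trips.
A_alt, B_alt, T_alt = doubly_constrained(O, D, d, beta=1.0, start=7.0)
```

In practice one would stop when successive Ai and Bj values differ by less than a chosen tolerance instead of running a fixed number of iterations.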
Table 8 presents the numerical results using this model. Four sequences
of Ai and Bj values were calculated. Clearly, discrepancies between successive
values decrease allowing termination of the iterative process at a
desired level of accuracy. Both the row and column sums of the interaction
matrix, whose interpretation depends on whether we are dealing with
commuters choosing residences or jobs, are then ensured at least to approximate
the given Oi and Dj values.

Table 8 Results from the production-attraction constrained
gravity model, equations (19), (22) and (23)
IV The entropy maximizing approach to model building

1 Unresolved problems
In section III the Newtonian gravity model was ’patched-up’ to satisfy
increasingly stringent constraints which simultaneously disposed of its
forecasting problem. Yet the improved models so generated still derive
from the physical analogy. It remains to derive these models quite indepen-
dently of Newton’s gravity model. Furthermore our heuristic derivation
of improved models did not address the aggregation issue. To resolve both
problems we appeal to entropy maximizing methods.
2 The entropy maximizing context: levels of aggregation and information
Entropy maximizing methods allow us to generate models relating the
aggregate properties of any system containing numerous individual elements,
for example the physicist’s gaseous system containing around $10^{23}$
particles, or the geographer’s journey-to-work system comprising over $10^{6}$
commuters for larger conurbations. In general it is very difficult, or
impossible, as yet to understand such systems in toto by analysing the
characteristics and behaviour of each individual element, and then combining
the individual analyses to obtain an aggregate picture. Manageability
often dictates that only aggregate representations of system behaviour are
feasible. The gravity models considered so far achieve this aggregate
representation by totally ignoring individual behaviour. Entropy maximizing
methods, however, offer one means of relating such aggregate
constructions to the underlying variety of individual behaviour.
To understand entropy maximizing problems we specify three possible
levels of aggregation, or ’states’ (Wilson, 1970b), at which we could
describe our journey-to-work system for a city partitioned into zones,
denoted i for residential and j for employment activity.
i) At the micro-level each commuter, $t_{ij}^{x}$, and his characteristics, such as
travel cost incurred, $c_{ij}^{x}$, are recorded and identified by residential and
employment zone (Figure 4a). The label x (varying between 1 and m)
distinguishes between individual commuters residing and working in
the same (i, j) zone pair.
ii) At an intermediate or meso-level of aggregation we cease to identify
particular individuals by the label x; only the number of commuters
residing in i and working in j are recorded (Figure 4b):
$$T_{ij} = \sum_{x=1}^{m} t_{ij}^{x} \qquad (24)$$
Similarly individual travel costs are replaced by a single interzonal
cost, cij, between zone centroids, reflecting average travel costs for
each Tij commuter group as a whole:
$$c_{ij} = \frac{\sum_{x=1}^{m} t_{ij}^{x} c_{ij}^{x}}{T_{ij}} \qquad (25)$$
iii) At higher or macro-levels of aggregation we record only zonal and
city-wide features of the system (Figure 4c), such as: the number of
employed residents, trip ends or houses in each i zone, Oi; the number
of workers, trip ends or jobs in each j zone, Dj; the systemwide total of
commuters or trips, N; and the total travel expenditure of all commuters,
C.
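The three levels of description can be made concrete with a small data sketch. In the Python below, the micro-level records are invented for illustration and the function name aggregate is our own; each record stands for one commuter, so every $t_{ij}^{x}$ is taken to equal one. The function builds the meso-level quantities of equations (24) and (25) and then the macro-level totals Oi, Dj, N and C.

```python
from collections import defaultdict

# Hypothetical micro-level records: (residence zone i, workplace zone j, travel cost).
micro = [
    (1, 1, 0.5), (1, 1, 1.5), (1, 2, 2.0),
    (2, 1, 3.0), (2, 2, 1.0), (2, 2, 1.0),
]


def aggregate(records):
    """Collapse micro-level records to meso- and macro-level descriptions."""
    T = defaultdict(int)           # meso: commuters per (i, j) pair, equation (24)
    cost_sum = defaultdict(float)
    for i, j, cost in records:
        T[(i, j)] += 1
        cost_sum[(i, j)] += cost
    # meso: average cost per (i, j) pair, equation (25)
    c_bar = {pair: cost_sum[pair] / T[pair] for pair in T}

    O = defaultdict(int)           # macro: employed residents per zone i
    D = defaultdict(int)           # macro: workers per zone j
    for (i, j), trips in T.items():
        O[i] += trips
        D[j] += trips
    N = sum(T.values())                          # macro: total commuters
    C = sum(T[pair] * c_bar[pair] for pair in T)  # macro: total travel expenditure
    return dict(T), c_bar, dict(O), dict(D), N, C


T, c_bar, O, D, N, C = aggregate(micro)
print(T, c_bar, O, D, N, C)
```

Many different sets of micro-level records collapse to the same meso-level matrix, and many meso-level matrices share the same macro-level totals.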
Entropy maximizing methods then permit the analyst to estimate the
most probable, meso-level, journey-to-work interactions, Tij, which are:
i) fully consistent with any macro-level information and assumptions he
wishes to take into account. Constraints (4), (5) and (9) specify
required consistency between Tij and Oi, Dj, and N respectively;
additionally a systemwide cost constraint may be imposed:
$$\sum_i \sum_j T_{ij} c_{ij} = C \qquad (26)$$
ii) maximally noncommittal or unbiased about micro-level behaviour
for which information, like $t_{ij}^{x}$ and $c_{ij}^{x}$, is typically unavailable or
incomplete because of system size.

Figure 4 Describing a journey-to-work system at three levels of aggregation

The analyst admits his usual ignorance of, and therefore uncertainty about,
micro-level events. Maximizing entropy thus means maximizing the analyst’s
uncertainty, or minimizing his bias, about the journey-to-work
system specified at the micro-level, but this maximization is subject to
constraints expressing the analyst’s known information at higher aggrega-
tion levels. Now it may seem curious to the reader that an analyst should
seek to maximize his uncertainty, but he is only maximizing uncertainty
about what he does not know, while at the same time using what he does
know as constraints on this maximization process.
3 A numerical illustration of the methodology
As an example, assume that journey-to-work patterns result from the
residential choices of persons with known employment locations. A linear
city is divided into seven zones: five persons work in zone j = 1 and each
may choose accommodation in any one of the seven zones, thus giving rise
to the trip pattern Ti1 (i = 1 to 7). Our problem is to find the most probable
values of Ti1, that is, how many persons choose residences in each zone,
which are consistent with given information. One obvious piece of
information is that the number of persons travelling home from zone 1, D1, is
five, which is also the total number of persons or daily home trips in the
system, N, as we assume only one employment zone. Assume also that total
daily travel expenditure for such journeys is six cost units. Both these items
of information constrain possible values of Ti1:
$$\sum_{i=1}^{7} T_{i1} = D_1 = N = 5 \qquad (27)$$
$$\sum_{i=1}^{7} T_{i1} c_{i1} = C = 6. \qquad (28)$$
The latter constraint involves knowledge of the travel costs, ci1, which are
zero within a zone and unity between contiguous zones (so that, in this
linear city, a trip between zone 1 and zone i costs i − 1 units). Initially all seven
zones possess an equal number of residential opportunities, Oi, which do
not restrict the residential choices of our five workers. Consequently, these
constant Oi values have no importance in finding a solution to our present
problem. Table 9 summarizes the information content of this hypothetical
situation.
[Table 9: Summary of information for the simple journey-to-work example]
Having fully specified our problem, the first part of its solution involves
identifying all possible sets of Ti1 values consistent with the information in
constraints (27) and (28). By inspection we allocate five persons among
seven zones and find 10 feasible sets of Ti1 values (Table 10), all other sets
being excluded by the constraints imposed. But which of the feasible sets do
we choose as being most probable given our lack of further information?
The answer is that set which maximizes our entropy or uncertainty
about the residential choices that could be made by our five workers
identified individually. We define entropy as the number of ways
individual choices at a micro-level could happen for each set of Ti1 values. This is
the combinatorial problem of finding the number of ways of selecting Ti1
individuals from a total of N, for which the formula is:
$${}^{N}W_{T_{i1}} = \frac{N!}{(N - T_{i1})!\, T_{i1}!} \qquad (29)$$
where ! denotes factorial. Thus for our first set of interactions, T11 = 4 and
T71 = 1, we find first the number of ways of selecting four out of N = 5
persons to reside in zone 1:
$${}^{N}W_{T_{11}} = \frac{N!}{(N - T_{11})!\, T_{11}!} = \frac{5!}{(5-4)!\, 4!} = \frac{5 \times 4 \times 3 \times 2 \times 1}{1 \times 4 \times 3 \times 2 \times 1} = 5. \qquad (30)$$

Then we are left with choosing one person out of the remaining (N − T11)
persons to live in zone 7:
$${}^{(N-T_{11})}W_{T_{71}} = \frac{(N - T_{11})!}{(N - T_{11} - T_{71})!\, T_{71}!} = \frac{(5-4)!}{(5-4-1)!\, 1!} = \frac{1!}{0!\, 1!} = \frac{1}{1} = 1. \qquad (31)$$
Observe that zero factorial is unity. Multiplying these two expressions
together gives us 5 × 1 = 5 possible ways our five workers could make
residential choices consistent with the first set of Ti1 values. These five
combinations are illustrated explicitly in Table 11 by attaching the
inconspicuous names of Berry, Chorley, Gould, Haggett and Wilson to our five
individuals. Formally, the product of these combinatorial expressions gives
a more convenient formula for calculating entropy, W(Ti1), as:
$$W(T_{i1}) = \frac{N!}{(N - T_{11})!\, T_{11}!} \cdot \frac{(N - T_{11})!}{(N - T_{11} - T_{71})!\, T_{71}!} = \frac{N!}{T_{11}!\, T_{71}!} \qquad (32)$$
because the (N − T11)! terms cancel and (N − T11 − T71)! equals unity.
Such entropy values for all ten sets of Ti1 values are presented in Table 10,
and it can be seen that the sixth set has the maximum value. This set of
interactions permits the maximum variety of residential choices at the
micro-level, hence the analyst is most uncertain about such individual
behaviour in this case. Looked at another way, however, this is also the
most probable interaction set, because it has the greatest number of these
micro-level possibilities (60 out of 210), which are each accorded an equal
probability of occurrence owing to the lack of further information.
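The enumeration "by inspection" can also be checked mechanically. The following is a minimal sketch, assuming travel costs ci1 = i − 1 (zero within zone 1 and one unit per zone boundary crossed in the linear city); the variable and function names are illustrative and do not come from the paper. It lists every allocation of five workers to seven zones that satisfies constraints (27) and (28) and evaluates the generalized form of equation (32) for each.

```python
from itertools import combinations_with_replacement
from math import factorial
from collections import Counter

N, C = 5, 6                               # five workers, six cost units in total
cost = {i: i - 1 for i in range(1, 8)}    # assumed c_i1: 0 within zone 1, +1 per zone crossed

def entropy(counts):
    """Equation (32) generalized: N! divided by the product of the T_i1! values."""
    w = factorial(sum(counts.values()))
    for t in counts.values():
        w //= factorial(t)
    return w

# Each feasible set is a multiset of five residence zones whose costs sum to C.
feasible = []
for zones in combinations_with_replacement(range(1, 8), N):
    if sum(cost[z] for z in zones) == C:
        feasible.append(Counter(zones))

values = [entropy(s) for s in feasible]
best = feasible[values.index(max(values))]
print(len(feasible), sum(values), max(values), dict(best))
```

Under these assumptions the sketch finds 10 feasible sets with 210 micro-level arrangements in total, and the largest entropy value is 60, in agreement with the figures quoted in the text. The order in which the sets are generated is an artefact of the enumeration and need not match the ordering of Table 10.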
Table 10 Feasible sets of interactions, Ti1, and their entropy values
Entropy values calculated from equation (32) with appropriate zone labels.
* indicates maximum entropy value.
With zone size modifications (see section IV, 4):
Entropy values calculated from equation (42) with appropriate zone labels.

Table 11 An illustration of combinatorial possibilities
The five ways residential location choices may happen for the 1st
interaction set:

4 Zone size modifications to the numerical illustration

In its various guises the type of entropy measure used in the previous
illustration is the one popularized by Wilson (1970b). More recently some
commentators have suggested that this measure should be expanded to
account explicitly for the influence of different zone sizes on the amounts of
interzonal interaction (Batty, 1974; Fisk and Brown, 1975; Williams,
1977a). Before we consider an appropriately modified entropy formula, let
us demonstrate this zone size effect.
By imposing different zone systems to partition our city we can divide
up the job and housing opportunities therein in numerous different ways.
In the previous section we assumed a rare ability to zone our city into seven
zones each possessing a constant number of residential opportunities, Oi,
say, for example, two houses per zone (Figure 5a). In the absence of all
other attractive or detractive factors, such as travel costs, this implies that
the probabilities, pi, of commuters being attracted to residence zones,
would be proportional to relative zone size and thus equal for all zones:
$$p_i = \frac{O_i}{\sum_{i=1}^{7} O_i} = \frac{2}{14} = \frac{1}{7} \quad \text{for } i = 1 \text{ to } 7. \qquad (33)$$
[Figure 5: Alternative zonal categorizations of spatially distributed opportunities]
Conversely a concentrated distribution of employment was assumed, with
a system total of five jobs in zone 1 and none elsewhere. Consequently in
this case the probability, pj, of commuters working in zone 1 would be one
and elsewhere zero:
$$p_j = \frac{D_j}{\sum_{j=1}^{7} D_j} = \begin{cases} \dfrac{5}{5} = 1 & \text{for } j = 1 \\ \dfrac{0}{5} = 0 & \text{for } j = 2 \text{ to } 7 \end{cases} \qquad (34)$$
Possessing only this zone size information our best estimates of the
probabilities of commuter interactions, Pij, are the products of these two
independent zonal probabilities:
$$P_{ij} = p_i p_j = \frac{O_i D_j}{\sum_i O_i \sum_j D_j} \qquad (35)$$
Thus, for our previous numerical example the interaction probabilities, Pi1,
between the single employment zone and the seven residential zones are
constant:
$$P_{i1} = \frac{2}{14} \cdot \frac{5}{5} = \frac{1}{7} \quad \text{for } i = 1 \text{ to } 7. \qquad (36)$$
Hence this zone system does not have a differential impact on interaction
probabilities and it was for this reason that we could ignore the influence of
housing opportunities in section IV, 3.
However, if we change the zoning system so that all jobs are still in zone
1, but with the 14 houses in the system distributed unequally across seven
zones as in Figure 5b, we change the probabilities, pi, as follows:
$$p_i = \frac{O_i}{\sum_{i=1}^{7} O_i} = \begin{cases} \dfrac{2}{14} = \dfrac{1}{7} & \text{for } i = 1 \text{ and } 3 \\ \dfrac{1}{14} & \text{for } i = 2, 4 \text{ and } 6 \\ \dfrac{3}{14} & \text{for } i = 5 \\ \dfrac{4}{14} = \dfrac{2}{7} & \text{for } i = 7 \end{cases} \qquad (37)$$
The interaction probabilities, Pi1, are likewise changed to these values. It
should be stressed that these changes in interaction propensities are due
solely to zoning changes. Moreover, as empirically feasible zones usually
vary in size, accounting for their differential impact on interzonal
interactions is vital.
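The zone size effect is easy to reproduce numerically. The sketch below is illustrative only: the function name `zone_probabilities` is hypothetical, and the unequal housing distribution is taken as 2, 1, 2, 1, 3, 1 and 4 houses in zones 1 to 7, which is how equation (37) is read here. Exact fractions are used so that the results can be compared directly with equations (33) to (37).

```python
from fractions import Fraction

def zone_probabilities(opportunities):
    """Share of the system total held by each zone, as in equations (33), (34) and (37)."""
    total = sum(opportunities)
    return [Fraction(o, total) for o in opportunities]

D = [5, 0, 0, 0, 0, 0, 0]            # all five jobs in zone 1
O_equal = [2] * 7                    # Figure 5a: two houses per zone
O_unequal = [2, 1, 2, 1, 3, 1, 4]    # Figure 5b, as read from equation (37)

p_j = zone_probabilities(D)
for O in (O_equal, O_unequal):
    p_i = zone_probabilities(O)
    # Equation (35): interaction probability is the product of the zonal probabilities.
    P_i1 = [p * p_j[0] for p in p_i]
    print([str(p) for p in P_i1])
```

Because all jobs are in zone 1, the probability of working there is one and the interaction probabilities simply reproduce the housing shares; only the change of zoning system alters them.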
Whereas the combinatorial definition of entropy in the previous section
measured the number of ways individual commuters could arrange
themselves between workplace and residence zone pairs alone, the existence of
zonal opportunities permits a further number of arrangements of
commuters within houses and jobs, which can be calculated from the
expression:
$$(O_i D_j)^{T_{ij}}. \qquad (38)$$
Clearly arrangements increase in number as zonal opportunities increase.
To illustrate the point, recall that for the first set of interactions, Ti1, we
found five ways of selecting T11 = 4 persons out of N = 5 to live in zone 1
(equation (30) and Table 11). For each of these five arrangements we can
find additionally the number of ways four persons can be accommodated in
D1 = 5 jobs and O1 = 2 houses (see Figure 5b) using:
$$(O_1 D_1)^{T_{11}} = (2 \times 5)^4 = 2^4 \times 5^4 = 10^4. \qquad (39)$$
This indicates that there are 2^4 (= 16) ways four persons can be arranged in
two houses (an example is given in Table 12), and each of these can be
associated with 5^4 (= 625) ways the same four persons may occupy five jobs.
Combining equations (30) and (39), we have an expression for the number
of ways of selecting four out of five commuters to work in five jobs in zone
1 and live in two houses in zone 1:
$$\frac{N!}{(N - T_{11})!\, T_{11}!} (O_1 D_1)^{T_{11}} = 5 \times 10^4. \qquad (40)$$
Similarly for T71 = 1 person and O7 = 4 houses (see Figure 5b) we multiply
equation (31) by the appropriate form of formula (38):
$$\frac{(N - T_{11})!}{(N - T_{11} - T_{71})!\, T_{71}!} (O_7 D_1)^{T_{71}} = 1 \times (4 \times 5)^1 = 20. \qquad (41)$$

Table 12 An illustration of the impact of zonal opportunity
size on the number of ways individual choices could be made
For the 1st interaction set choose one of the five ways T11 = 4
individuals may live in zone 1 (Table 11), and enumerate all
possible ways they may occupy O1 = 2 houses.

The product of equations (40) and (41) gives the total number of
arrangements of five persons between the two zones and within the
available jobs and houses, associated with the meso-level interactions
T11 = 4 and T71 = 1:
$$W_z(T_{i1}) = \frac{N!}{T_{11}!\, T_{71}!} (O_1 D_1)^{T_{11}} (O_7 D_1)^{T_{71}} = 5 \times 10^4 \times 20 = 10^6. \qquad (42)$$
Expression (42) is thus the zone-size-dependent form of the previous
entropy equation (32). Using appropriate versions of equation (42), revised
entropy values, Wz(Ti1), are calculated for our ten sets of Ti1 interactions
using the Oi quantities of Figure 5b (Table 10). Observe now that the
fourth set of interactions has the maximum entropy value.
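The revised entropy values can be recomputed in the same way as the unmodified ones. This sketch rests on two assumptions that are not stated as such in the text: travel costs ci1 = i − 1, and housing counts of 2, 1, 2, 1, 3, 1 and 4 in zones 1 to 7 as read from equation (37). The function names `w` and `w_z` are illustrative.

```python
from itertools import combinations_with_replacement
from math import factorial, prod
from collections import Counter

N, C, D1 = 5, 6, 5
cost = {i: i - 1 for i in range(1, 8)}            # assumed c_i1 = i - 1
O = {1: 2, 2: 1, 3: 2, 4: 1, 5: 3, 6: 1, 7: 4}    # houses per zone, read from equation (37)

def w(counts):
    """Equation (32) generalized: N! divided by the product of the T_i1! values."""
    return factorial(N) // prod(factorial(t) for t in counts.values())

def w_z(counts):
    """Equation (42) generalized: w times the product of (O_i * D_1) ** T_i1."""
    return w(counts) * prod((O[i] * D1) ** t for i, t in counts.items())

feasible = [Counter(z) for z in combinations_with_replacement(range(1, 8), N)
            if sum(cost[i] for i in z) == C]

first = Counter({1: 4, 7: 1})          # the first interaction set discussed in the text
best = max(feasible, key=w_z)
print(w_z(first), dict(best), w_z(best))
```

For the first interaction set the sketch reproduces the value of 10^6 given in equation (42). Under the stated assumptions the maximizing set places three workers in zone 1 and one each in zones 3 and 5; which row of Table 10 this corresponds to depends on the ordering of that table, which the sketch does not attempt to reproduce.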
5 The mathematics of entropy maximization
While highly simplified numerical examples are useful for understanding
the inner workings of entropy maximizing problems, a mathematical
treatment is inevitable for realistic situations.
a Deriving a mathematically convenient definition of entropy: Entropy
maximizing problems involve constrained maximization techniques, with
entropy as the quantity to be maximized. Our two specific entropy
formulae, used earlier for numerical illustrations, are generalized for
mathematical treatment by replacing specific zone labels with i and j. The
Wilsonian entropy measure, exemplified in equation (32), becomes:
$$W(T_{ij}) = \frac{N!}{\prod_{ij} T_{ij}!} \qquad (43)$$
where $\prod_{ij}$ instructs us to multiply together all Tij! values. The
zone-size-dependent entropy formula, exemplified in equation (42), becomes:
$$W_z(T_{ij}) = \frac{N!}{\prod_{ij} T_{ij}!} \prod_{ij} (O_i D_j)^{T_{ij}} \qquad (44)$$
In such forms both entropy measures prove awkward to maximize;
fortunately, the same answers are achieved by maximizing their natural
logarithmic (ln) forms. At this juncture the reader should recall certain rules
for converting to logarithms:
i) multiplication, Π, of terms becomes the addition, Σ, of their
logarithmic values;
ii) division of terms becomes the subtraction of their logarithms; and
iii) terms raised to a power, for example $(O_i D_j)^{T_{ij}}$, have their logarithmic
values multiplied by the value of the power, as $T_{ij} \ln(O_i D_j)$.
Applying these rules equations (43) and (44) become respectively:
$$\ln W(T_{ij}) = \ln N! - \sum_i \sum_j \ln T_{ij}! \qquad (45)$$
$$\ln W_z(T_{ij}) = \ln N! - \sum_i \sum_j \ln T_{ij}! + \sum_i \sum_j T_{ij} \ln(O_i D_j). \qquad (46)$$
It is now a simple matter to remove the awkward factorial terms using
Stirling’s approximation:
$$\ln N! \approx N \ln N - N \qquad (47)$$
$$\ln T_{ij}! \approx T_{ij} \ln T_{ij} - T_{ij}. \qquad (48)$$
These allow equations (45) and (46) to be rewritten as:
$$\ln W(T_{ij}) = N \ln N - N - \sum_i \sum_j [T_{ij} \ln T_{ij} - T_{ij}] \qquad (49)$$
$$\ln W_z(T_{ij}) = N \ln N - N - \sum_i \sum_j [T_{ij} \ln T_{ij} - T_{ij}] + \sum_i \sum_j T_{ij} \ln(O_i D_j). \qquad (50)$$
In our subsequent mathematics we shall prefer the more general definition
of entropy in equation (50). Usually this will be written more economically
as:
$$\ln W_z(T_{ij}) = N \ln N - N - \sum_i \sum_j \left[ T_{ij} \ln \frac{T_{ij}}{O_i D_j} - T_{ij} \right] \qquad (51)$$
derived by invoking the subtraction-division rule for logarithms:
$$-\ln \left[ \frac{T_{ij}}{O_i D_j} \right] = -[\ln T_{ij} - \ln(O_i D_j)]. \qquad (52)$$
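How good is Stirling's approximation in equations (47) and (48)? The short sketch below compares it with the exact value of ln n!, obtained from the standard library's log-gamma function (ln n! = lgamma(n + 1)). The function names are illustrative.

```python
from math import lgamma, log

def ln_factorial(n):
    """Exact ln(n!) via the log-gamma function: ln(n!) = lgamma(n + 1)."""
    return lgamma(n + 1)

def stirling(n):
    """Stirling's approximation in the form used in equations (47) and (48)."""
    return n * log(n) - n

for n in (5, 50, 500, 5000):
    exact, approx = ln_factorial(n), stirling(n)
    # Relative error shrinks as n grows.
    print(n, round(exact, 2), round(approx, 2), round((exact - approx) / exact, 4))
```

The relative error falls steadily as n increases, which is why the approximation suits realistic systems with large numbers of trips; for quantities as small as those in the five-worker illustration it is poor, and the exact factorials should be used there.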
b The maximization process: Our problem is to find the maximum value of
entropy, ln Wz(Tij), as redefined in equation (50). In mathematics a
maximization process like this involves differential calculus, and given a set
of rules for differentiation (see Wilson and Kirkby, 1975, 133-44) this
should be a straightforward exercise. This would be so in this case if the
maximization process was free to consider all possible sets of Tij values and
to select the one giving maximum entropy. However, we saw in section
IV, 3 that not all sets of Tij values satisfied the constraints imposed.
Consequently standard differential calculus cannot be applied without
transforming our constrained maximization problem into an equivalent
unconstrained one.
First, however, we must specify appropriate constraints which depend
on available information, on the interpretation of the problem being
investigated and on assumptions made. So far we have met four constraints,
namely constraints (4), (5), (9) and (26), which are very common but
certainly not exhaustive of all possibilities. In section III different models of
the journey-to-work were associated with various applications of con-
straints (9), (4) and (5). Let us now derive the production-attraction
constrained model via entropy maximization, in which case we apply
constraints (4), (5) and (26).
Conversion of such a constrained entropy maximization problem into
an unconstrained one is achieved by using the Lagrangian method (Wilson
and Kirkby, 1975, 278-91). Here we shall not illustrate how and why this
method works, but merely detail its application. However, readers are
advised to be wary of Gould’s (1972) rather ambiguous and misleading
illustration (Senior, 1979).
The Lagrangian method first requires all constraints to be rearranged so
as to equal zero, and with each one is associated a new variable termed a
Lagrange multiplier. So constraints (4), (5) and (26) may be reformulated
as:
$$O_i - \sum_j T_{ij} = 0 \quad (\text{multipliers } \lambda_i) \qquad (4')$$
$$D_j - \sum_i T_{ij} = 0 \quad (\text{multipliers } \gamma_j) \qquad (5')$$
$$C - \sum_i \sum_j T_{ij} c_{ij} = 0 \quad (\text{multiplier } \beta) \qquad (26')$$
The second step is to form a Lagrangian expression, denoted ℒ(Tij, λi, γj,
β), where the terms in brackets indicate that the value of the Lagrangian,
ℒ, depends on the values of the interactions, Tij, and of the newly
introduced Lagrange multipliers. This Lagrangian function is constructed
by taking the right-hand side of the entropy equation (50) and adding to it
the left-hand sides of the above constraints multiplied by their associated
Lagrange multipliers, to produce:
$$\mathcal{L} = N \ln N - N - \sum_i \sum_j [T_{ij} \ln T_{ij} - T_{ij}] + \sum_i \sum_j T_{ij} \ln(O_i D_j) + \sum_i \lambda_i \Big( O_i - \sum_j T_{ij} \Big) + \sum_j \gamma_j \Big( D_j - \sum_i T_{ij} \Big) + \beta \Big( C - \sum_i \sum_j T_{ij} c_{ij} \Big). \qquad (53)$$
We now have an unconstrained Lagrangian function equivalent to the
constrained entropy maximization problem with which we started.
With an important proviso, we may now revert to the traditional
unconstrained maximization process using differential calculus, which
involves the calculation of the partial derivatives (signified by ∂) of the
Lagrangian:
$$\frac{\partial \mathcal{L}}{\partial T_{ij}} = 0; \quad \frac{\partial \mathcal{L}}{\partial \lambda_i} = 0; \quad \frac{\partial \mathcal{L}}{\partial \gamma_j} = 0; \quad \frac{\partial \mathcal{L}}{\partial \beta} = 0 \qquad (54)$$
and which are known as the first-order conditions. Partial derivatives are
calculated with respect to one variable at a time, holding constant all other
variables on which the value of ℒ depends. Our important proviso is that
the differentiation of a Lagrangian function may produce partial
derivatives which are non-zero, whereas traditional unconstrained maximization
sets these derivatives to zero. In our particular case, we can assume that all
the partial derivatives in equations (54) will be zero. As constraints (4), (5)
and (26) are written as strict equalities, setting these partial derivatives to
zero will reproduce these constraints as necessary conditions of the
maximization process. Furthermore, it turns out that putting ∂ℒ/∂Tij equal to zero
is permissible for positive Tij values, which are usually found in the models
we are considering.
Turning now to the actual mathematics of finding these partial
derivatives, differentiation with respect to the Lagrange multipliers is very
simple. For example, taking the case of ∂ℒ/∂λi, we can ignore all terms in
the Lagrangian equation (53) which do not involve λi, leaving:
$$\sum_i \lambda_i \Big( O_i - \sum_j T_{ij} \Big). \qquad (55)$$
Then we apply simple rules of differentiation. First, λi can be considered to
be raised to the power one; in differentiating, the rule is to reduce the
power by one and multiply by the original power so:
$$\frac{\partial \lambda_i}{\partial \lambda_i} = 1 \cdot \lambda_i^{0} = 1, \qquad (56)$$
as any number raised to a zero power is unity. Second, anything multiplied
by λi is multiplied by the derivative of λi; hence
$$\frac{\partial \mathcal{L}}{\partial \lambda_i} = O_i - \sum_j T_{ij} = 0. \qquad (57)$$
Applying identical arguments for the other Lagrange multipliers we have:
$$\frac{\partial \mathcal{L}}{\partial \gamma_j} = D_j - \sum_i T_{ij} = 0 \qquad (58)$$
$$\frac{\partial \mathcal{L}}{\partial \beta} = C - \sum_i \sum_j T_{ij} c_{ij} = 0. \qquad (59)$$
These three conditions clearly reproduce our original constraints. Once
they are satisfied, maximization of the Lagrangian function (53) implies the
maximization of our original entropy expression (50).
Differentiation of the Lagrangian with respect to Tij is a somewhat more
cumbersome operation. We can immediately dispose of the (N ln N − N)
term as it involves no Tij variables, and proceed to the most difficult part of
the differentiation involving:
$$-\sum_i \sum_j [T_{ij} \ln T_{ij} - T_{ij}]. \qquad (60)$$
For the first Tij term we use the differentiation rules already exemplified for
λi so:
$$\frac{\partial T_{ij}}{\partial T_{ij}} (\ln T_{ij}) = 1 \cdot T_{ij}^{0} (\ln T_{ij}) = \ln T_{ij}, \qquad (61)$$
that is the derivative of Tij is multiplied by the ln Tij term. In similar vein we
can differentiate ln Tij and multiply its derivative by Tij:
$$T_{ij} \frac{\partial \ln T_{ij}}{\partial T_{ij}} = T_{ij} \cdot \frac{1}{T_{ij}} = 1. \qquad (62)$$
The differentiation rule for products states that we add (61) and (62), so:
$$\frac{\partial (T_{ij} \ln T_{ij})}{\partial T_{ij}} = \ln T_{ij} + 1. \qquad (63)$$
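The remaining differentiation provides further illustrations of rules already
used:
$$\frac{\partial (-T_{ij})}{\partial T_{ij}} = -1 \cdot T_{ij}^{0} = -1 \qquad (64)$$
$$\frac{\partial [T_{ij} \ln(O_i D_j)]}{\partial T_{ij}} = 1 \cdot T_{ij}^{0} \ln(O_i D_j) = \ln(O_i D_j) \qquad (65)$$
$$\frac{\partial \big[ \lambda_i \big( O_i - \sum_j T_{ij} \big) \big]}{\partial T_{ij}} = -\lambda_i \cdot 1 \cdot T_{ij}^{0} = -\lambda_i \qquad (66)$$
$$\frac{\partial \big[ \gamma_j \big( D_j - \sum_i T_{ij} \big) \big]}{\partial T_{ij}} = -\gamma_j \cdot 1 \cdot T_{ij}^{0} = -\gamma_j \qquad (67)$$
$$\frac{\partial \big[ \beta \big( C - \sum_i \sum_j T_{ij} c_{ij} \big) \big]}{\partial T_{ij}} = -\beta c_{ij} \cdot 1 \cdot T_{ij}^{0} = -\beta c_{ij}. \qquad (68)$$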
Combining all these pieces of the differentiation according to the signs in
the Lagrangian we have:
$$\frac{\partial \mathcal{L}}{\partial T_{ij}} = -[\ln T_{ij} + 1 - 1] + \ln(O_i D_j) - \lambda_i - \gamma_j - \beta c_{ij} = 0. \qquad (69)$$
Equations (69), (57), (58) and (59) are the necessary or first-order
conditions which can be used to obtain expressions for the Tij interactions and, if
necessary, the Lagrange multipliers. Simply rearranging terms in equation
(69) gives:
$$\ln T_{ij} = \ln(O_i D_j) - \lambda_i - \gamma_j - \beta c_{ij}. \qquad (70)$$
The exponential function, which is the inverse of the natural logarithmic
one (Wilson and Kirkby, 1975, 70), is used to convert equation (70) to:
$$T_{ij} = \exp(-\lambda_i) \exp(-\gamma_j)\, O_i D_j \exp(-\beta c_{ij}), \qquad (71)$$
and by the simple expedient of relabelling terms involving zonal Lagrange
multipliers:
$$\exp(-\lambda_i) = A_i \qquad (72)$$
$$\exp(-\gamma_j) = B_j, \qquad (73)$$
we arrive at the familiar production-attraction constrained gravity model:
$$T_{ij} = A_i B_j O_i D_j \exp(-\beta c_{ij}), \qquad (74)$$
but with a negative exponential rather than a negative power impedance
function (cf. equation (19)). The usual substitution for Tij in the constraint
conditions will give familiar expressions for Ai and Bj, from which values
for λi and γj may be found.
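Substituting equation (74) into constraints (4) and (5) gives Ai = 1 / Σj Bj Dj exp(−βcij) and Bj = 1 / Σi Ai Oi exp(−βcij), each defined in terms of the other, so in practice they are usually found by iteration. The sketch below is a minimal illustration of that iteration, not a procedure taken from this paper: the function name, the three-zone data and the value of β are all invented, and β is simply fixed in advance rather than solved for so as to satisfy the cost constraint (26).

```python
from math import exp

def doubly_constrained(O, D, c, beta, iterations=200):
    """Iterate the balancing factors of equation (74) until the constraints hold."""
    n, m = len(O), len(D)
    f = [[exp(-beta * c[i][j]) for j in range(m)] for i in range(n)]
    A, B = [1.0] * n, [1.0] * m
    for _ in range(iterations):
        # A_i = 1 / sum_j B_j D_j exp(-beta c_ij), from substituting (74) into (4)
        A = [1.0 / sum(B[j] * D[j] * f[i][j] for j in range(m)) for i in range(n)]
        # B_j = 1 / sum_i A_i O_i exp(-beta c_ij), from substituting (74) into (5)
        B = [1.0 / sum(A[i] * O[i] * f[i][j] for i in range(n)) for j in range(m)]
    return [[A[i] * B[j] * O[i] * D[j] * f[i][j] for j in range(m)] for i in range(n)]

# Invented three-zone example; the totals of O and D must agree.
O = [40.0, 35.0, 25.0]
D = [50.0, 30.0, 20.0]
c = [[1.0, 2.0, 3.0], [2.0, 1.0, 2.0], [3.0, 2.0, 1.0]]
T = doubly_constrained(O, D, c, beta=0.5)

for row in T:
    print([round(t, 2) for t in row])
```

After the iteration the row sums of T match the Oi values and the column sums match the Dj values to within rounding error. The multipliers can then be recovered from equations (72) and (73) as λi = −ln Ai and γj = −ln Bj.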
It should be stressed that this derivation of the ’gravity’ model is in no
way dependent on the Newtonian or any physical analogy. In particular,
the Ai and Bj balancing factors, which in section III were attached arbitrarily
and pragmatically to the Newtonian model to make it satisfy constraints,
now emerge as outputs of the entropy maximizing process, related closely
to Lagrange multipliers.
The reader should check his understanding of the mathematics by
deriving other members of the gravity model family using appropriate
combinations of constraints (4), (5), (9) and (26). Note, however, that if the
Wilsonian entropy measure of equation (49) is used in place of the zone-
size-dependent version of equation (50), it is impossible to generate com-
pletely the total interaction and production constrained models without
justifying the use of additional constraints.
V Entropy: distinguishing different uses and interpretations of the concept
Considerable confusion has arisen about the meaning of entropy, because

the concept has been used in a number of different ways which are
nevertheless distantly related (Batty, 1974; Cesario, 1975; Chapman, 1977).
To dispel this confusion major uses and misuses of the concept should be
carefully distinguished (cf. Sheppard, 1976).
1 Subjective uses of entropy as a measure of uncertainty

So far in this paper entropy has been used subjectively as a measure of the
analyst’s uncertainty about micro-level features of a system, and not as some
objective property of that system itself. The interpretation of entropy as
measuring numbers of micro-level arrangements in sections IV, 2 to IV, 4
comes from a branch of physics known as statistical mechanics, which
addresses the problems of analysing large systems of particles. Such an
interpretation has considerable pedagogic virtues, but may prompt the
criticism that we are borrowing yet another physical or mechanical
analogy.
In fact, this is not the case, for as the title of Wilson’s original entropy
paper (1967) implies we are borrowing a more general method of statistical
inference, where entropy is used to generate models or hypotheses in
situations of incomplete information. Indeed, Jaynes (1957) has argued
cogently that entropy maximization in statistical mechanics is but a specific
instance of the general inferential procedures of information theory as
formulated by Shannon (Shannon and Weaver, 1949). Shannon derived a
unique and unambiguous measure of uncertainty, S, represented by a
probability distribution (Wilson, 1970b, Appendix 1). For our examples,
by defining interaction probabilities, pij, as:
$$p_{ij} = \frac{T_{ij}}{N} \qquad (75)$$
this measure can be defined as:
$$S = -\sum_i \sum_j p_{ij} \ln p_{ij}. \qquad (76)$$
Formally this is virtually identical to the statistical mechanics combinatorial
definition of entropy, at least in the logarithmic form of equation (49).
Hence, in information theory, S is known as entropy too. This view of
entropy, although essentially equivalent to the statistical mechanics
approach, proves more convenient and rigorous. For example, we do not
require Stirling’s approximation to produce mathematically convenient
entropy measures (section IV, 5a); nor do we need to talk in terms of
micro-level arrangements and attribute an equal probability of occurrence
to them as was done at the end of section IV, 3.
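The formal link between equations (49) and (76) can be checked numerically: since the Tij sum to N, the right-hand side of equation (49) reduces to N ln N − ΣΣ Tij ln Tij, which is exactly N times S. The sketch below uses invented interaction counts and treats empty cells as contributing nothing (the usual convention that 0 ln 0 = 0, adopted here as an assumption); the function names are illustrative.

```python
from math import log

def stirling_ln_w(T):
    """Equation (49): N ln N - N - sum(T ln T - T), skipping empty cells."""
    N = sum(T)
    return N * log(N) - N - sum(t * log(t) - t for t in T if t > 0)

def shannon(T):
    """Equation (76) with p = T / N from equation (75)."""
    N = sum(T)
    return -sum((t / N) * log(t / N) for t in T if t > 0)

T = [120, 60, 15, 5]          # invented interaction counts, flattened over (i, j)
N = sum(T)
print(stirling_ln_w(T), N * shannon(T))
```

The two printed values agree to within floating-point rounding, so maximizing S and maximizing the Stirling form of ln W select the same interaction pattern.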
These subjective uses of entropy to construct hypotheses in situations of
limited information are closely akin to the Bayesian approach to statistical
reasoning. Indeed, the interaction probabilities, Pij, due solely to zone sizes,
equation (35), may be treated as prior probabilities in Bayesian statistics
(Batty and March, 1976; Batty, 1978). So we could redefine our
information theory entropy term as:
$$S = -\sum_i \sum_j p_{ij} \ln \frac{p_{ij}}{P_{ij}} \qquad (77)$$
which is very similar to the zone-size-dependent entropy measure of
equation (51).
2 Entropy as a descriptive statistic

Entropy has also been used objectively as a descriptive statistic for summar-
izing certain characteristics of geographic phenomena. Examples include
settlement pattern analysis (Medvedkov, 1967; Semple and Golledge,
1970) and measurement of spatial concentration (Chapman, 1973; Hodge
and Gatrell, 1976). Such applications make use of the fact that the entropy
statistic, as defined in equation (76), is zero when the probabilities are
‘concentrated’ (one probability is unity, the rest are zero), but is at
maximum value when the probabilities are evenly spread (all probabilities
equal).
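These two limiting properties of the entropy statistic are easily demonstrated. The sketch below evaluates equation (76) for three invented probability distributions over four categories; the function name is illustrative, and zero probabilities are taken to contribute nothing to the sum.

```python
from math import log

def entropy(p):
    """Equation (76) for a single list of probabilities; 0 ln 0 is taken as 0."""
    return sum(-x * log(x) for x in p if x > 0)

concentrated = [1.0, 0.0, 0.0, 0.0]   # one probability is unity, the rest are zero
uneven = [0.7, 0.1, 0.1, 0.1]
uniform = [0.25] * 4                  # all probabilities equal

for name, p in [("concentrated", concentrated), ("uneven", uneven), ("uniform", uniform)]:
    print(name, round(entropy(p), 4))
```

The concentrated distribution gives zero, the uniform one gives the maximum possible for four categories (ln 4), and the uneven one falls in between.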
3 Misuse of the thermodynamic entropy analogy

Unfortunately the best-known interpretation of entropy, due ultimately to
Clausius, is embodied in the second law of thermodynamics, where the
principle of increasing entropy reflects the fact that a system ’consumes
more energy than it renders’ (Bronowski, 1973, 347). Boltzmann
reinterpreted this physical entropy concept as a measure of increasing disorder at
the micro-level, arguing that the atoms, or ’individuals’, of a physical system
tend from more organized, and less probable, arrangements to more
disorganized, and more probable, ones (see Cesario, 1975).
Some geographers, notably Berry (1964; 1967), seized on such notions to
suggest that maximum entropy systems lack spatial order, and are thereby
less worthy of study than highly organized systems such as central place
hierarchies. Wilson (1970b, chapter 7) has condemned such arguments as
being highly misleading, because neither the entropy measure nor the
aggregation levels at which the system is described are stated explicitly. As
Wilson stresses, the entropy concept should be used only in relation to
explicitly definable probability distributions.
VI Gravity RIP: entropy rules OK?
Hopefully this exposé of the Newtonian gravity model’s weaknesses is
sufficient to lay it to rest once and for all. Now, by way of conclusion, it is
appropriate to offer a brief critique of its successor.
On the credit side, the entropy maximizing methodology provides a
general model-building procedure, capable of generating internally consis-
tent models of increasing variety and complexity (e.g. Wilson, 1971; 1974;
Eastin, 1975). Additionally, the use of constrained maximization tech-
niques means that entropy maximizing problems are examples of nonlinear
mathematical programs. It is significant that the balancing factors are
related to Lagrange multipliers, equations (72) and (73), because the latter
are dual variables (Wilson and Senior, 1974; Williams, 1976). Furthermore,
Evans (1973) has shown that the well known transportation problem of
linear programming is a limiting case (β → ∞) of the production-attraction
constrained model, equation (74).
On the other hand the methodology has attracted controversy. Refer-
ence has already been made to suggestions that the methodology be
generalized by incorporating recent ideas from information theory and

Bayesian statistics (March and Batty, 1975; Batty and March, 1976), but the
major criticism is the behavioural one. Although the methodology
addresses the aggregation problem it does so in a fundamentally non-
behavioural way, because the analyst is maximizing his uncertainty about
the behaviour of individual decision-makers. Admittedly maximum
entropy implies maximum variety of micro-behaviour but, as Williams
(1977b) argues, it is not within the spirit of the approach to offer causal
explanations of such behaviour. To use micro-behavioural theories, such as
utility maximization, to interpret and regenerate the form of entropy-
derived models (Neuberger, 1971) is to go beyond entropy maximizing
territory.
University of Salford
Acknowledgements
It is a pleasure to acknowledge the help and advice of Huw Williams (Leeds
University) and of Tony Gatrell and Barrie Gleave (both of Salford
University) during the preparation of this paper. As usual, they are
absolved of any responsibility for the final product. Excellent cartographic
and typing assistance was willingly supplied by Christine Minister, Gustav
Dobrzynski and Barbara Senior.
VII References
Batty, M. 1974: Spatial entropy. Geographical Analysis 6, 1-31.

1978: Reilly’s challenge: new laws of retail gravitation which define systems of
central places. Environment and Planning A 10, 185-219.
Batty, M. and March, L. 1976: The method of residues in urban modelling.
Environment and Planning A 8, 189-214.
Berry, B. J. L. 1964: Cities as systems within systems of cities. Papers of the
Regional Science Association 13, 147-63.
1967: Geography of market centres and retail distribution. Englewood Cliffs, New
Jersey: Prentice-Hall.
Bronowski, J. 1973: The ascent of man. London: British Broadcasting Cor-
poration.
Cesario, F. J. 1975: A primer on entropy modelling. Journal of the American
Institute of Planners 41, 40-8.
Chapman, G. P. 1973: The spatial organization of the population of the United
States and England. Economic Geography 49, 325-43.
1977: Human and environmental systems: a geographer’s appraisal. London: Aca-
demic Press.
Crowther, D. 1974: Entropy: a theoretical approach to urban model building. In
Perraton, J. and Baxter, R., editors, Models, evaluations and information systems
for planners, Lancaster: MTP Construction.
Eastin, R. V. 1975: Entropy maximization and inferred ideal weights in public
facility location. Environment and Planning A 7, 191-8.
Evans, A. W. 1970: Some properties of trip distribution methods. Transportation
Research 4, 19-36.
Evans, S. P. 1973: A relationship between the gravity model for trip distribution
and the transportation problem of linear programming. Transportation
Research 7, 39-61.
Fisk, C. and Brown, G. R. 1975: The role of model parameters in trip distribu-
tion models. Transportation Research 9, 143-8.
French, A. P. 1971: Newtonian mechanics. London: Thomas Nelson.
Gould, P. 1972: Pedagogic review. Annals of the Association of American
Geographers 62, 689-700.
Haggett, P., Cliff, A. D. and Frey, A. E. 1977: Locational analysis in human
geography, second edition. London: Edward Arnold.
Hodge, D. and Gatrell, A. 1976: Spatial constraint and the location of urban
public facilities. Environment and Planning A 8, 215-30.
Jaynes, E. T. 1957: Information theory and statistical mechanics. Physical Review
106, 620-30.
March, L. and Batty, M. 1975: Generalized measures of information, Bayes’
likelihood ratio and Jaynes’ formalism. Environment and Planning B 2, 99-
105.
Medvedkov, Y. 1967: The concept of entropy in settlement pattern analysis.
Papers of the Regional Science Association 18, 165-8.
Neuberger, H. L. I. 1971: User benefit in the evaluation of transport and land use
plans. Journal of Transport Economics and Policy 5, 52-75.
Olsson, G. 1965: Distance and human interaction: a review and bibliography.
Philadelphia: Regional Science Research Institute.
O’Sullivan, P. and Ralston, B. 1974: Forecasting intercity commodity transport
in the USA. Regional Studies 8, 191-5.
Semple, R. K. and Golledge, R. G. 1970: An analysis of entropy changes in a
settlement pattern over time. Economic Geography 46, 157-60.
Senior, M. L. 1979: Entropy maximization: a pedagogic note on Gould’s pedagogic
review. University of Salford, Department of Geography, forthcoming
discussion paper.
Shannon, C. and Weaver, W. 1949: The mathematical theory of communication.
Urbana: University of Illinois Press.
Sheppard, E. S. 1976: Entropy, theory construction and spatial analysis. Environ-
ment and Planning A 8, 741-52.
Taylor, P. J. 1975: Distance decay models in spatial interactions. Concepts and
Techniques in Modern Geography 2, Norwich: Geo Abstracts.
Traffic Research Corporation Ltd 1969: The West Yorkshire Transportation
Study. Leeds.
Webber, M. J. 1977: Pedagogy again: what is entropy? Annals of the Association of
American Geographers 67, 254-66.
Williams, H. C. W. L. 1976: Travel demand models, duality relations and user
benefit analysis. Journal of Regional Science 16, 147-66.
1977a: Some notes on the role and significance of ‘weights’ in spatial interaction
models and related programs. University of Leeds, School of Geography,
unpublished note, 4 pp.
1977b: On the formation of travel demand models and economic evaluation
measures of user benefit. Environment and Planning A 9, 285-344.
Wilson, A. G. 1967: A statistical theory of spatial distribution models.
Transportation Research 1, 253-69.
1969: Notes on some concepts in social physics. Papers of the Regional Science
Association 22, 159-93.
1970a: Disaggregating elementary residential location models. Papers of the
Regional Science Association 24, 103-25.
1970b: Entropy in urban and regional modelling. London: Pion.
1971: A family of spatial interaction models, and associated developments.
Environment and Planning 3, 1-32.
1974: Urban and regional models in geography and planning. Chichester: John
Wiley.
Wilson, A. G. and Kirkby, M. J. 1975: Mathematics for geographers and planners.
Oxford: Clarendon Press.
Wilson, A. G. and Senior, M. L. 1974: Some relationships between entropy
maximizing models, linear programming models, and their duals. Journal of
Regional Science 14, 207-15.

