AMM April 2014

Periodicity Domains and the Transit of Venus
Author(s): Andrew J. Simoson

Source: The American Mathematical Monthly, Vol. 121, No. 4 (April), pp. 283-298
Published by: Mathematical Association of America
Stable URL: http://www.jstor.org/stable/10.4169/amer.math.monthly.121.04.283 .
Abstract. A transit of Venus occurs when it passes directly between the Earth and the Sun. A
straightforward linear algebraic model for the orbits of Earth and Venus, essentially using one
parameter, namely, the relative angular velocity $\omega$ of Venus, is powerful enough to generate
respectable transit year predictions. We generalize, allowing $\omega$ to vary; uncover an algebraic
analog for predicting transits; and show that time cycles for transits are what they are because
each $\omega$ is sufficiently close to a suitably simple rational number, which for Venus is
$\frac{13}{8}$, and which in turn induces a modulo 8 shuffling of successive transit years by a
factor of 3.
1. INTRODUCTION. At least once each year, Venus passes between the Earth and the Sun. Because
the orbital planes of Earth and Venus intersect one another at an angle $\phi$, only rarely does
it come directly between the Earth and the Sun. On these occasions, the profile of Venus (a
transit of Venus across the Sun) can be viewed from Earth. The last transit was in June 2012 and
the next one will be in December 2117. Ascertaining the periodicity of the transits is a delicate
problem. In particular, relative to Earth's angular frequency of one rotation per year, Venus
makes $\omega_0 \approx 1.62555$ rotations per year. From this value, how can we deduce the
105-year transit lapse between, say, 2012 and 2117? And in general, as we allow angular velocity
$\omega$ to vary, how does the time lapse between transits change? The answer is surprisingly
chaotic. Before showing this, we first give some transit history.
Figure 1. A Venus transit viewed against a spire of the Taj Mahal, June 2012, courtesy of AP Photo/Kevin
Frayer
2. A LITTLE HISTORY. In 1629, Johannes Kepler predicted a 1631 transit of Venus and estimated
the period between transits as 120 years. The first recorded transit observation was in 1639 by
Jeremiah Horrocks and William Crabtree. In 1663, James Gregory realized that careful observations
of these transits would enable the scientific community to determine the distance a of one
astronomical unit (AU), the distance between the Sun and the Earth, in miles. Up until Kepler's
day, the reigning guess for a had been about five million miles; Kepler, after studying the
geocentric parallax of Mars, bumped the value of a to at least 15 million miles. With the advent
of the telescope, the guesses improved. In 1716, Edmund Halley predicted that a was 14,000
semi-diameters of the Earth or about 56 million miles, and championed Gregory's plan to test his
guess [4].
http://dx.doi.org/10.4169/amer.math.monthly.121.04.283
MSC: Primary 11A07; 70F15
Figure 2. William Crabtree observing a transit; mural at Manchester Town Hall by Ford Madox Brown
(1821-1893)
But for Halley, the next transit for Venus was 45 years in the future. Therefore he charged the
astronomers of two generations hence to do what he could not. As a recent biographer of these
events has written, even on his death-bed whilst holding a glass of wine in his hand, Halley
said, "I wish that many observations of this phenomenon might be taken by different persons at
separate places" [11, p. xxiv]. Astronomers of the eighteenth century had two chances to observe,
June 1761 and 1769. Many of the colorful adventures of these astronomers as they answered
Halley's call are chronicled in [10] and [11]. As reviewed recently in detail by Teets [9], James
Short analyzed transit data from sites as far afield as South Africa and northern Finland, and
published his conclusions in the December 1761 issue of the Philosophical Transactions of the
Royal Society that a was 93,726,000 miles.
The standard reference for transit dates is Jean Meeus's tables, spanning 6000 years [5].
Espenak [3], who compiled NASA's website on transits, names Meeus's work "an indispensable
reference for anyone wishing to do transit calculations." Danloux-Dumesnil [2] calls Meeus's
original tables [6] "une belle étude." Much of Meeus's number crunching is based on "the modern
planetary theory VSOP87 of the Bureau des Longitudes of Paris" [5, p. 1]. Against this standard,
we contrast our results.
3. THE MODEL. We assume that the orbits of Earth E and Venus V are circles, with periods of
$\tau_e \approx 365.26$ days and $\tau_v \approx 224.70$ days, respectively. By Kepler's third
law of planetary motion, with time t in years and distance in astronomical units (AU),
$a^3 = \tau^2$, where a is the semi-major axis of a planet's elliptical orbit and $\tau$ is its
period. Thus, Venus is $\kappa \approx 0.723$ AU from the Sun S.
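The distance quoted above can be recomputed directly from the two periods. The following minimal Python sketch does so; the variable names are ours and purely illustrative, and the ratio of the periods is included because it is the relative angular velocity of Venus used throughout.

```python
# Orbital periods in days, as assumed in the model.
TAU_E = 365.26   # Earth
TAU_V = 224.70   # Venus

# Venus's period measured in Earth years.
tau_v_years = TAU_V / TAU_E

# Kepler's third law, a**3 = tau**2, with tau in years and a in AU.
a_venus = tau_v_years ** (2 / 3)

# Rotations of Venus per Earth year (Earth makes one rotation per year).
omega0 = TAU_E / TAU_V

print(round(a_venus, 3), round(omega0, 5))
```

Under these assumptions the semi-major axis comes out close to 0.723 AU and the angular velocity close to 1.62555 rotations per year.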
We further assume that E's orbit lies in the xy-plane with S at the origin O and that V's orbit
lies in a plane through O inclined at angle $\phi \approx 3.39^\circ$ to the xy-plane. We call
the line of intersection of these orbital planes the nexus line or, according to Meeus [5], the
line of nodes. The nexus line in Figure 3 is labeled BC. A nexus point or node for Venus (F and G
in the figure) or for Earth (B and C in the figure) is where the orbit of V or E pierces the
orbital plane of E or V, respectively. Transits will only occur when E and V are both near B and
F, respectively, or both near C and G. The former transit is called a fall transit because in
modern times E is at B in early December; it is also called, according to Meeus, an ascending
transit, because as V's profile moves across S from left to right its trajectory rises. The
latter transit is called a spring transit because E is at C in early June; it is also called a
descending transit, because the corresponding trajectory decreases. The positions of E and V at
any time t are given respectively by E(t) and V(t):
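$$
E(t) = \begin{pmatrix} \cos(2\pi t) \\ \sin(2\pi t) \\ 0 \end{pmatrix}
\quad\text{and}\quad
V(t) = \kappa \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\phi & \sin\phi \\ 0 & -\sin\phi & \cos\phi \end{pmatrix}
\begin{pmatrix} \cos(2\pi\omega t) \\ \sin(2\pi\omega t) \\ 0 \end{pmatrix}, \tag{1}
$$
where $\omega$ is the relative angular velocity of V with respect to E and $\kappa \approx 0.723$
is the radius of V's orbit in AU. For simplicity, we initially position V and E at their spring
nexus points. The value of $\omega$ for the actual V and E is
$\omega_0 = \tau_e/\tau_v \approx 1.62555$. The $3 \times 3$ matrix in (1) corresponds with a
clockwise rotation by $\phi$ about the x-axis, so as to be consistent with a descending (spring)
transit occurring near nodes (nexus points) C and G, where C = (1, 0, 0).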
A line parametrized by u from E through V at time t is
$$
P(u, t) = \bigl(V(t) - E(t)\bigr)u + E(t). \tag{2}
$$
To find the projection of V's shadow on S as viewed from E(t) (an ideal geocentric point in
space at E's center), we imagine that S resides within a rotating plane or screen S(t) ever
perpendicular to E(t). Figure 3 shows the two orbital planes and V's projection on the screen as
viewed from E. The plane S(t) of S can be written as
$$
X \cdot E(t) = 0 \tag{3}
$$
where X is a general point (x, y, z) on the screen. When E and V are on opposite sides of the
screen at time t, which happens if and only if $E(t) \cdot V(t) < 0$, we take the projection
point of V onto the screen as that screen point between the planets.
Figure 3. The screen of the Sun
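We combine (2) and (3) so as to find the point X(t) where the line intersects the plane. That
is, the equations P(u, t) = X and (3) are the following system of four equations with four
unknowns x, y, z, u, as well as the time variable t:
$$
\begin{cases}
x = \bigl(\kappa\cos(2\pi\omega t) - \cos(2\pi t)\bigr)u + \cos(2\pi t) \\
y = \bigl(\kappa\cos\phi\,\sin(2\pi\omega t) - \sin(2\pi t)\bigr)u + \sin(2\pi t) \\
z = -\kappa\sin\phi\,\sin(2\pi\omega t)\,u \\
0 = x\cos(2\pi t) + y\sin(2\pi t).
\end{cases} \tag{4}
$$
Writing (4) as a matrix equation gives $A\tilde{X}(t) = \tilde{E}(t)$, where
$$
A = \begin{pmatrix}
1 & 0 & 0 & \cos(2\pi t) - \kappa\cos(2\pi\omega t) \\
0 & 1 & 0 & \sin(2\pi t) - \kappa\cos\phi\,\sin(2\pi\omega t) \\
0 & 0 & 1 & \kappa\sin\phi\,\sin(2\pi\omega t) \\
\cos(2\pi t) & \sin(2\pi t) & 0 & 0
\end{pmatrix} \tag{5}
$$
with $\tilde{X}(t)$ and $\tilde{E}(t)$ being the respective vectors (x, y, z, u) and
$(\cos(2\pi t), \sin(2\pi t), 0, 0)$. For this transformation,
$$
\begin{aligned}
\det(A) &= -1 + \kappa\bigl(\cos(2\pi t)\cos(2\pi\omega t) + \cos\phi\,\sin(2\pi t)\sin(2\pi\omega t)\bigr) \\
&= -1 + \frac{\kappa}{2}\bigl((1 + \cos\phi)\cos(2\pi(\omega - 1)t) + (1 - \cos\phi)\cos(2\pi(\omega + 1)t)\bigr) \\
&\le -1 + \frac{\kappa}{2}\bigl(|1 + \cos\phi| + |1 - \cos\phi|\bigr) = -1 + \kappa < 0.
\end{aligned}
$$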
Because the determinant of A is never zero, $\tilde{X}(t) = A^{-1}\tilde{E}(t)$. Since it would
be convenient to see these points of intersection on a stationary screen rather than the dynamic
plane S(t), we clockwise rotate the first two components of $\tilde{X}(t)$ about the z-axis by
$2\pi t$ radians. The result of such a transformation is a set of points whose first three
components trace V's projection onto the screen of S. Finally, since the first component of such
points will always be 0, and we are uninterested in u, we project this set of points so as to
obtain their second and third components as ordered pairs, which we index as
$W(t) = (W_1(t), W_2(t))$,
$$
W(t) = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix}
\begin{pmatrix}
\cos(2\pi t) & \sin(2\pi t) & 0 & 0 \\
-\sin(2\pi t) & \cos(2\pi t) & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix} A^{-1}\tilde{E}(t). \tag{6}
$$
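The construction of the shadow point can be checked numerically. The sketch below is one possible rendering in plain Python, not the author's code: it builds the $4 \times 4$ matrix of the linear system, solves it by Gaussian elimination, and then applies the rotation and projection. The names `solve` and `shadow` are ours, and the constants (orbital radius 0.723 AU, inclination 3.39 degrees, periods 365.26 and 224.70 days) are the values quoted in the text.

```python
import math

KAPPA = 0.723                  # radius of Venus's orbit in AU
PHI = math.radians(3.39)       # inclination between the two orbital planes
OMEGA0 = 365.26 / 224.70       # relative angular velocity of Venus

def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):                      # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def shadow(t, omega=OMEGA0, kappa=KAPPA, phi=PHI):
    """Return (W1, W2), Venus's projection on the screen of the Sun at time t in years."""
    a, b = 2 * math.pi * t, 2 * math.pi * omega * t
    matrix = [
        [1.0, 0.0, 0.0, math.cos(a) - kappa * math.cos(b)],
        [0.0, 1.0, 0.0, math.sin(a) - kappa * math.cos(phi) * math.sin(b)],
        [0.0, 0.0, 1.0, kappa * math.sin(phi) * math.sin(b)],
        [math.cos(a), math.sin(a), 0.0, 0.0],
    ]
    x, y, z, _u = solve(matrix, [math.cos(a), math.sin(a), 0.0, 0.0])
    # Clockwise rotation by 2*pi*t about the z-axis; keep components two and three.
    w1 = -x * math.sin(a) + y * math.cos(a)
    return w1, z

print(shadow(0.0))
print(shadow(227.0))
```

At t = 0 both planets sit on the nexus line, so the shadow falls at the center of the screen; at t = 227 it lies roughly 0.02 AU from the center.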
Figure 4. Trajectories of V's shadow on the screen of S (distances in AU): (a) a wide screen;
(b) zooming in near the Sun, with arcs $T_{113.5}$, $T_{117.5}$, and $T_{121.5}$
Figure 4(a) shows the path of V's projection on the screen over 1.5 years. In our model, spring
transits only occur near integer years, n, and fall transits only occur near half-years,
$n + \frac{1}{2}$. Figure 4(b) is a close-up of the screen near S over a period of about ten
years, displaying three arcs of V's projection. The arc labeled $T_{113.5}$ corresponds with a
fall transit near t = 113.5 years. The arc $T_{117.5}$ corresponds with V and E being on opposite
sides of S near t = 117.5; as such, we display the disk of S in front of this arc. The arc
$T_{121.5}$ misses the disk of S.
4. CONDITIONS FOR A TRANSIT TO OCCUR. In order to find how far from its nexus V may wander and
yet be part of a transit across S, we project the disk of S through V out to E's orbit, forming
a cone as illustrated in Figure 5(a), which displays the situation where the base of the
truncated cone is tangent to E's orbit.
Figure 5. Maximum separation from the nexus for a transit: (a) a cone of possible shadows;
(b) a linear approximation of orbits
Let $\rho$ be the radius of this base with center point D. To approximate where this extreme
position for V occurs, we linearize the orbits of V and E, and imagine that they proceed along
lines perpendicular to the nexus line BC, as illustrated in Figure 5(b). In this figure, we take
the distance SB as 1 AU. The distances SV and SD are $k\kappa$ and $k$, where k is a
marginally-larger-than-1 deformation factor due to linearization. With $s \approx 0.00465$ AU as
the radius of S, from similar triangles, we see that
$$
\frac{s}{k\kappa} = \frac{\rho}{k(1 - \kappa)}, \tag{7}
$$
which gives $\rho \approx 0.00178$ AU. Furthermore,
$$
\sin\phi = \frac{\rho}{h} \quad\text{and}\quad \tan\theta = h, \tag{8}
$$
where $\phi$ is the angle between the two orbital planes, $\theta$ is the angle between the nexus
line and the line between S and V, and h is distance BD. By (7) and (8),
$$
\theta = \tan^{-1}\!\left(\frac{s(1 - \kappa)}{\kappa\sin\phi}\right)
\approx \frac{s(1 - \kappa)}{\kappa\phi} \approx 0.0301, \tag{9}
$$
since the arguments of the inverse tangent and sine are so small. Thus, in order to be part of a
transit, V may wander no further than about $\kappa\theta \approx 0.0218$ AU from the nexus.
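By (9), the lapse of time $L_v$ for V to travel this far from its nexus is
$$
L_v \approx \frac{s(1 - \kappa)}{2\pi\omega_0\kappa\phi} \approx 1.08 \text{ days}. \tag{10}
$$
The corresponding maximal time $L_e$ that E may stray from its nexus points and yet take part in
a transit is
$$
L_e = \frac{\theta}{2\pi} \approx 42 \text{ hours} < 2 \text{ days}. \tag{11}
$$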
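The chain of estimates in this section is short enough to recompute. The Python sketch below follows the same steps (similar triangles, then the two small angles, then the two time lapses); the variable names are ours, and the constants are the values quoted in the text.

```python
import math

S_RADIUS = 0.00465             # radius of the Sun in AU
KAPPA = 0.723                  # radius of Venus's orbit in AU
PHI = math.radians(3.39)       # inclination between the orbital planes
OMEGA0 = 365.26 / 224.70       # Venus's rotations per Earth year
DAYS_PER_YEAR = 365.26

# Radius of the cone's base, from the similar triangles of Figure 5(b).
rho = S_RADIUS * (1 - KAPPA) / KAPPA

# Distance BD and the angle at S, exactly and in small-angle form.
h = rho / math.sin(PHI)
theta = math.atan(h)
theta_small = S_RADIUS * (1 - KAPPA) / (KAPPA * PHI)

# Largest excursion of Venus from its nexus, in AU.
excursion = KAPPA * theta

# Time for Venus and for Earth to sweep the angle theta.
lapse_v_days = theta / (2 * math.pi * OMEGA0) * DAYS_PER_YEAR
lapse_e_hours = theta / (2 * math.pi) * DAYS_PER_YEAR * 24

print(rho, theta, excursion, lapse_v_days, lapse_e_hours)
```

With these inputs the base radius is about 0.00178 AU, the angle about 0.0301 radians, and the two lapses about 1.08 days and 42 hours.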
Figure 6. Speed, $\|W'(t)\|$, of V's shadow across the screen of S (speed in AU/yr against time
t in years)
Since the speed at which a transit is traced across S is bounded by 10.34 AU/year as indicated
by the graph of $\|W'(t)\|$ in Figure 6, then
$$
\|W'(t)\| < 10.34 \text{ AU/year} \approx 0.0284 \text{ AU/day} \tag{12}
$$
for all t. Let $t_0$ be a medial transit time, a time of a spring transit near integer time n or
of a fall transit near half-year time $n + \frac{1}{2}$ where $W_1(t_0) = 0$. Since the time
between $t_0$ and either n or $n + \frac{1}{2}$ must be at most about 42 hours by (11), then the
most that $\|W(n)\|$ or $\|W(n + \frac{1}{2})\|$ can differ from $\|W(t_0)\|$ is approximately
$$
(0.0284 \text{ AU/day})(42 \text{ hours}) \approx 0.0496 \text{ AU}
$$
by (12). Since $|W_2(t_0)| < s$, then 0.05 AU is about the most that $\|W(n)\|$ or
$\|W(n + \frac{1}{2})\|$ can be. Therefore, our litmus test to determine if integer year n or
half-year $n + \frac{1}{2}$ is a promising one for a transit is for V and E to be on the same
side of S and for
$$
\|W(n)\| < 0.05 \quad\text{or}\quad \left\|W\!\left(n + \tfrac{1}{2}\right)\right\| < 0.05. \tag{13}
$$
Applying (13) to the integers 0 to 2000 with $\omega = \omega_0$, we find the promising years of
Table 1.
Table 1. Years at which the spring and fall transits occur
0        113.5          227                (340.5, 348.5)  (454, 462)  575.5   689     802.5  916
1029.5   (1143, 1151)   (1256.5, 1264.5)   1378            1491.5      1605    1718.5  1832   (1945.5, 1953.5)
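The litmus test lends itself to a direct scan. The Python sketch below is our own illustration, not the author's program: it eliminates the unknowns of the line-and-screen system by hand (the last equation gives u = 1/(1 - E(t)·V(t))), evaluates the shadow at every integer and half-integer year from 0 to 2000, and keeps the times at which the planets are on the same side of the Sun and the shadow lies within 0.05 AU of the center. The function names are hypothetical, and the constants are those quoted in the text.

```python
import math

KAPPA = 0.723                  # radius of Venus's orbit in AU
PHI = math.radians(3.39)       # inclination between the orbital planes
OMEGA0 = 365.26 / 224.70       # relative angular velocity of Venus

def shadow_and_side(t, omega=OMEGA0):
    """Return ((W1, W2), same_side) for time t in years."""
    a, b = 2 * math.pi * t, 2 * math.pi * omega * t
    e = (math.cos(a), math.sin(a), 0.0)
    v = (KAPPA * math.cos(b),
         KAPPA * math.cos(PHI) * math.sin(b),
         -KAPPA * math.sin(PHI) * math.sin(b))
    dot = e[0] * v[0] + e[1] * v[1]
    u = 1.0 / (1.0 - dot)                      # parameter where the line meets the screen
    x, y, z = (e[i] + u * (v[i] - e[i]) for i in range(3))
    w1 = -x * math.sin(a) + y * math.cos(a)    # rotate onto the stationary screen
    return (w1, z), dot > 0

def promising(t, bound=0.05):
    """Litmus test: same side of the Sun and shadow within `bound` AU of the center."""
    (w1, w2), same_side = shadow_and_side(t)
    return same_side and math.hypot(w1, w2) < bound

candidates = [n / 2 for n in range(0, 4001)]   # 0, 0.5, 1, ..., 2000
promising_years = [t for t in candidates if promising(t)]
print(promising_years)
```

The resulting list can be compared entry by entry with Table 1; small differences near the 0.05 AU threshold are possible, since this sketch uses rounded constants.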
Double-checking the dates in Table 1 by graphing the arc W(t) against the disk of S verifies
that each of the years or half-years corresponds with a spring or fall transit, respectively,
and that these are the only transits during this 2000-year period in our model. As can be seen,
the familiar differences 8, 105.5, and 113.5 between successive transit times appear, which is
good news for our model. The entries in the table eight years apart have been grouped as ordered
pairs; their associated transits are called twins or doubles. For example, spring transits occur
in our model in both year 454 and year 462. For a twin transit, we say the transit member whose
path across S comes closer to S's center is the dominant transit of the two. If a transit has no
twin, it is a singleton transit. As can be seen in Figure 7 of the twin transit $T_{454}$ and
$T_{462}$, $T_{462}$ is the dominant member. $T_{227}$ is a singleton. In section 7, we show how
to modify our model to simulate actual transit dates.
Figure 7. A twin pair of descending spring transits, $T_{454}$ and $T_{462}$ (the dominant twin)
Meanwhile, in looking for a pattern with respect to the data of Table 1, the reader may notice
that W(0), W(802.5), and W(1605) are all almost (0, 0). Have we stumbled across a characteristic
time period for which the data repeats? To answer, we define the practical period $\tilde{T}$ of
this data as $\tilde{T} = 1605$ years and argue that a more natural period exists for three
reasons.
• It is unclear how $\tilde{T}$ is related to $\omega_0$.
• It is unclear how a period of $\tilde{T}$ explains the time lapse between successive transits.
• Since the time lapse between twin transits is 8 years, it seems likely that $\tilde{T}$ should
  somehow be related to 8, but how?
In the next section, we find a natural period and demonstrate that the practical and natural
periods are related.
5. RECOGNIZING THE PATTERN. To find a more natural transit period, we focus on spring transits
for a season; from Table 1, we drop the fall transit dates, and are left with Table 2.

Table 2. Spring transits
j                     0   1     2            3     4     5              6      7      8
transit year $n_j$    0   227   (454, 462)   689   916   (1143, 1151)   1378   1605   1832
$n_j$ mod 8           0   3     6            1     4     7              2      5      0
3j mod 8              0   3     6            1     4     7              2      5      0

When we refer to the spring transit year $n_j$ from the table, where $j \ge 0$, we mean term-j
in row 2 or the dominant transit year if the term is a twin. For example, $n_2 = 462$, as
evidenced by Figure 7. Observe that the first eight spring transits comprise a complete residue
set modulo 8. Furthermore, $n_j \bmod 8$ just happens to be $3j \bmod 8$, which suggests that the
relative motion of the planets induces a linear shuffling of the transit year residues modulo 8.
We thus refer to 3 as a shuffling factor.
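The residue pattern in Table 2 is easy to confirm mechanically. The short Python sketch below stores each table entry as a tuple (twins share an entry, and since they are eight years apart they share a residue) and compares every year's residue with 3j mod 8; the variable names are ours.

```python
# Spring transit years from Table 2; twin transits are grouped in one tuple.
spring_transits = [
    (0,), (227,), (454, 462), (689,), (916,),
    (1143, 1151), (1378,), (1605,), (1832,),
]

# For each position j, collect the residues modulo 8 of the years in that entry.
residues = [sorted({year % 8 for year in entry}) for entry in spring_transits]

# The shuffled residues predicted by the factor 3.
predicted = [(3 * j) % 8 for j in range(len(spring_transits))]

pattern_holds = all(res == [pred] for res, pred in zip(residues, predicted))
print(predicted, pattern_holds)
```

The first eight predicted residues, 0, 3, 6, 1, 4, 7, 2, 5, run through every residue class modulo 8 exactly once, which is possible because 3 and 8 are coprime.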
To help understand this 8-fold dynamic, observe that every eight years E and V pass each other
not far from where they had passed each other eight years before, with V a bit further ahead of
E each time. We say that the arc given by W(n years ± 1 week) is rung-n in a ladder of arcs. As
the years go by, these rungs step monotonically upward (or downward) to a climax before
reversing their progression, with rung-8n being more or less either above or below rung-8(n + 1)
for all integers n. Near the spring transit years, neighboring rungs are separated by a distance
somewhat more than the radius of S, as illustrated in Figures 4(b), 7, and 8; the dots in
Figure 8 represent V's projection at t = −16, −8, 0, 8, 16 years. With p = 8, the approximate
distance d(p) between neighboring rungs near transit years, taken as the distance between W(p)
and its projection onto $W(0^+)$, where we take $0^+$ as one hour, is
$$
d(p) = \left\| W(p) - \frac{W(p) \cdot W(0^+)}{W(0^+) \cdot W(0^+)}\, W(0^+) \right\|
\approx 0.00672 \text{ AU}. \tag{14}
$$
Since s < d(p) < 2s, then a sequence of at most two successive rungs may cross the face of S,
whereas if a rung crosses near the center of S, then only one rung in that succession of rungs
may correspond to a transit.
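The rung spacing can be estimated numerically by projecting the shadow at year 8 onto the direction of the arc through the origin. The following Python sketch is an illustration under the model's stated constants; `shadow` is a closed-form version of the projection (the screen equation gives u = 1/(1 - E·V)), `rung_gap` is our name for d(p), and one hour is taken as 1/(365.26·24) years.

```python
import math

KAPPA = 0.723                  # radius of Venus's orbit in AU
PHI = math.radians(3.39)       # inclination between the orbital planes
OMEGA0 = 365.26 / 224.70       # relative angular velocity of Venus
ONE_HOUR = 1 / (365.26 * 24)   # the time 0+ in years
SUN_RADIUS = 0.00465           # radius of the Sun in AU

def shadow(t):
    """Venus's projection (W1, W2) on the screen of the Sun at time t in years."""
    a, b = 2 * math.pi * t, 2 * math.pi * OMEGA0 * t
    ex, ey = math.cos(a), math.sin(a)
    vx = KAPPA * math.cos(b)
    vy = KAPPA * math.cos(PHI) * math.sin(b)
    vz = -KAPPA * math.sin(PHI) * math.sin(b)
    u = 1.0 / (1.0 - (ex * vx + ey * vy))
    x, y, z = ex + u * (vx - ex), ey + u * (vy - ey), u * vz
    return -x * math.sin(a) + y * math.cos(a), z

def rung_gap(p):
    """Distance from W(p) to the line through the origin in the direction W(0+)."""
    wp, w0 = shadow(p), shadow(ONE_HOUR)
    scale = (wp[0] * w0[0] + wp[1] * w0[1]) / (w0[0] ** 2 + w0[1] ** 2)
    return math.hypot(wp[0] - scale * w0[0], wp[1] - scale * w0[1])

gap = rung_gap(8)
print(gap, SUN_RADIUS < gap < 2 * SUN_RADIUS)
```

The gap comes out near 0.0067 AU, between one and two solar radii, which is why at most two successive rungs can cross the Sun's face.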
Figure 8. V's projection as given by W(t) near t = −16, −8, 0, 8, 16
When we extend the data as given in Table 2, the data seems to sort itself. That is, plotting
$\{(n, W_1(n))\}_{n \ge 0}$ corresponding to the times when E is at its spring nexus point shows
a hodge-podge of dots across 100 years in Figure 9(a). Yet, when we look at a longer period of
time, the trend is clear. Figure 9(b) displays the data across 2000 years. It appears as if V's
projection when sampled at E's spring nexus point lies on one of eight branches through the
data, which appear to be uniformly spaced translates of one another.
Figure 9. Horizontal component (in AU) of V's projection at E's spring nexus over time (in
years): (a) a hodge-podge of dots; (b) a better perspective
By (5) and (6), finding the periodicity present within $\{(n, W_1(n))\}_{n \ge 0}$ is equivalent
to finding the periodicity present within $D(\omega) = \{(n, \sin(2\pi\omega n))\}_{n \ge 0}$,
as n ranges over integer values. Figure 10 shows that when restricted to the years 8n where n is
an integer (and when adjacent points are connected by line segments) both curves display the
same periodicity for $\omega = \omega_0$. The curves appear to have a root near $t \approx 917$,
but no spring transit occurs at either 912 = 8(114) or 920 = 8(115) years, because in our model
V and E are on opposite sides of S at both times. However, near the next root $t \approx 1834$,
a transit occurs at n = 1832 = 8(229) years, but not at 1840, because V's projection falls just
outside S's disk in that year.
Figure 10. Paths through $W_1(t)$ and $\sin(2\pi\omega t)$ when t = 8n years,
$\omega = \omega_0$ (AU against years)
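Can we find curves $y_j = \sin(\alpha(t - j\delta))$, where $\alpha$ and $\delta$ are real
numbers and j is an integer, $0 \le j \le 7$, which characterize $D(\omega_0)$? That is, we seek
a period T, with $T = \frac{2\pi}{\alpha}$ and $\delta = \frac{T}{8}$, for which T is near 1834
and where $y_j$ passes through all points on branch-j of $D(\omega_0)$. Observe that the values
of $\sin(2\pi(\omega)8n)$ and $\sin(2\pi(\omega - \frac{m}{8})8n)$ agree for all integers m. In
particular, for the integer m for which $\frac{m}{8}$ is nearest $\omega_0$, namely, m = 13, we
see that defining $\alpha$ and T so that
$$
\frac{1}{T} = \frac{\alpha}{2\pi} = \omega_0 - \frac{13}{8}
\approx \frac{365.26}{224.70} - \frac{13}{8} \approx 0.000545171 \tag{15}
$$
indeed gives the natural period of $D(\omega_0)$ as
$$
T = \frac{2\pi}{\alpha} \approx \frac{2\pi}{0.00342541} \approx 1834.29 \text{ years}, \tag{16}
$$
which means that
$$
\delta = \frac{T}{p} = \frac{T}{8} \approx \frac{1834.29}{8} \approx 229.286 \text{ years}. \tag{17}
$$
When we divide the practical period $\tilde{T} = 1605$ years by 7,
$$
\frac{\tilde{T}}{7} \approx 229.286 \approx \delta.
$$
That is,
$$
T \approx \frac{8}{7}\tilde{T}.
$$
Hence, the practical period of 1605 just happens to be a lucky seven integer multiple of the
phase shift $\delta$ in the branches of the natural period.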
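The numbers in (15) through (17) follow from a few lines of arithmetic. The Python sketch below recomputes them from the two orbital periods; the variable names are ours.

```python
import math

OMEGA0 = 365.26 / 224.70   # relative angular velocity of Venus
p, q = 8, 13               # the nearby fraction q/p = 13/8

inv_T = OMEGA0 - q / p     # reciprocal of the natural period, as in (15)
alpha = 2 * math.pi * inv_T
T = 1 / inv_T              # natural period in years, as in (16)
delta = T / p              # spacing of the eight branches, as in (17)
practical = 7 * delta      # seven branch spacings

print(inv_T, alpha, T, delta, practical)
```

With periods of 365.26 and 224.70 days, seven branch spacings come to 1605 years, matching the practical period read off from Table 1.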
To verify the fourth row of Table 2, that 3 is the shuffling factor, observe by Figure 9(b) that
$n_j$ is the first component in that point belonging to branch-j of $D(\omega)$, which is
nearest the first nonnegative root of $y = \sin(\alpha(t - j\delta))$, with $0 \le j \le 7$.
Hence, for a given j, we wish to find the residue $r_j$ of $n_j$ modulo p, where
$0 \le j \le p - 1$ and $0 \le r_j \le p - 1$, so that
$$
\sin(2\pi\omega t) = \sin(\alpha(t - j\delta)) \tag{18}
$$
for all times $t = pn + r_j$, for all integers n, with p = 8. By the pigeonhole principle, since
there are eight branches and eight primitive residues, $r_j$ is unique for each j. Furthermore,
by the affine nature of the arguments of sine in (18), it is sufficient to show that (18) has a
solution for j = 1, which means that we must solve
$$
\sin\bigl(2\pi\omega(pn + r)\bigr) = \sin\bigl(\alpha(pn + r - \delta)\bigr) \tag{19}
$$
for r, where $r = r_1$ and p = 8. By (15) through (17), (19) becomes
$$
\sin\left(\alpha(8n + r) + 26\pi n + \frac{(13r)(2\pi)}{8}\right)
= \sin\left(\alpha(8n + r) - \frac{2\pi}{8}\right).
$$
Therefore, solving
$$
13r \equiv -1 \bmod 8 \tag{20}
$$
gives the unique solution r = 3 for (19).
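The congruence can be solved by a brute-force search, and the identity it comes from can be spot-checked numerically. In the Python sketch below, `shuffling_factor` is our own helper name; the loop then compares the sine of the planetary angle with the sine along the first branch at years of the form 8n + r.

```python
import math

def shuffling_factor(q, p):
    """Return r in 0..p-1 with q*r congruent to -1 modulo p, or None if none exists."""
    for r in range(p):
        if (q * r + 1) % p == 0:
            return r
    return None

OMEGA0 = 365.26 / 224.70
p, q = 8, 13
T = 1 / (OMEGA0 - q / p)       # natural period
alpha = 2 * math.pi / T
delta = T / p

r = shuffling_factor(q, p)

# Largest mismatch between sin(2*pi*omega*t) and the branch j = 1 at t = 8n + r.
worst = max(
    abs(math.sin(2 * math.pi * OMEGA0 * t) - math.sin(alpha * (t - delta)))
    for t in (p * n + r for n in range(250))
)
print(r, worst)
```

The helper returns None when q and p share a common factor, in which case no shuffling factor exists for that fraction.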
Furthermore, generalizing the above argument demonstrates that the shuffling factor r in (19)
remains at r = 3 for all $\omega \ne \frac{13}{8}$, for which
$$
\left|\omega - \frac{13}{8}\right| < \frac{1}{32} = \frac{1}{4p},
$$
a range of angular velocities called the periodicity domain of $\frac{13}{8}$. By an interval
punctured by x, we mean a disconnected set of real numbers J whose union with {x} is an
interval. Thus, the periodicity domain of $\frac{13}{8}$ is an interval punctured by
$\frac{13}{8}$. The reason for excluding $\frac{13}{8}$ from its periodicity domain is that its
corresponding $\alpha$ and $\delta$ would be 0 and $\infty$, respectively.
To account for arbitrary relative positions of E and V in their orbits about S, we imagine that
at time t = 0, V is $\lambda$ years ahead of its last rendezvous with its spring nexus, while E
is at its spring nexus. Each of the branches characterizing V's projection undergoes a phase
shift $\beta$, where $\sin(2\pi\omega(8n + \lambda))$ must equal $\sin(\alpha(8n + \beta))$; by
(15), one way for this to occur is when $(2\pi\lambda)(\frac{13}{8}) = \alpha\beta$, which means
that
$$
\beta = \frac{q\lambda T}{p} = \frac{13\lambda T}{8},
$$
where p = 8 and q = 13. Therefore, we have an algorithm for characterizing all spring singleton
transits and all dominant members of spring twin transits, where $\lambda$ is an orbital phase
angle shift between V and E, p = 8 is the apparent periodicity of $D(\omega)$, r = 3 is the
shuffling factor among the year residues modulo p as given by (20), and $\frac{q}{p}$ is the
rational number close to $\omega$ as given by (15).
The Transit Rule. Let k, n, and j be integers, $0 \le j < p$. A spring transit occurs at integer
year m near time $(k - \lambda\frac{q}{p})T + j\delta$ if and only if $m = pn + (jr \bmod p)$
and m is no further from $(k - \lambda\frac{q}{p})T + j\delta$ than either m − p or m + p. If
either m − p or m + p is a transit year as well, then m is the dominant member of the twin.
To ascertain whether m ± p is also a spring transit, simply utilize the decision rule (13).
Example 1. To illustrate the transit rule, let $\lambda = 0$, k = 3, and j = 5. Since
3j mod 8 = 7, we want to find the transit year m = 8n + 7 closest to
$kT + j\delta \approx 6649.3$. Then m = 8(830) + 7 = 6647, while m + 8 = 8(831) + 7 = 6655. That
is, year 6647 is a singleton transit, while year 6655 is a near-miss, as shown in Figure 11(a).
Figure 11. Checking the transit algorithm: (a) spring transit near $3T + 5\delta$, with arcs
$T_{6647}$ and $T_{6655}$; (b) spring transit near $(2 - \frac{13}{80})T + 6\delta$, with arcs
$T_{4746}$ and $T_{4754}$
Example 2. This time, let $\lambda = 0.1$, k = 2, and j = 6. Since 3j mod 8 = 2, we want to find
the transit year m = 8n + 2 closest to $(2 - 0.1(\frac{13}{8}))T + 6\delta \approx 4746.2$. Then
m = 8(593) + 2 = 4746, while m + 8 = 8(594) + 2 = 4754. That is, year 4746 is a singleton
transit, while year 4754 is far from being a transit, as shown in Figure 11(b).
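The two examples can be reproduced with a small function. The Python sketch below is one way to encode the rule for the dominant (or singleton) year only; it does not decide whether a twin exists, which still requires the decision rule (13). The function name and the parameter `lam` (the orbital phase shift) are our own labels.

```python
OMEGA0 = 365.26 / 224.70       # relative angular velocity of Venus
P, Q_NUM, R = 8, 13, 3         # periodicity, numerator of 13/8, shuffling factor
T = 1 / (OMEGA0 - Q_NUM / P)   # natural period in years
DELTA = T / P                  # spacing between branches

def dominant_spring_transit(k, j, lam=0.0):
    """Return (m, target): the year m = P*n + (j*R mod P) nearest the target time."""
    target = (k - lam * Q_NUM / P) * T + j * DELTA
    residue = (j * R) % P
    n = round((target - residue) / P)   # nearest member of the residue class
    return P * n + residue, target

year1, target1 = dominant_spring_transit(k=3, j=5)
year2, target2 = dominant_spring_transit(k=2, j=6, lam=0.1)
print(year1, round(target1, 1))
print(year2, round(target2, 1))
```

Rounding to the nearest member of the residue class is what enforces the requirement that m be no further from the target than m − p or m + p.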
As for fall transits, a similar rule applies, except that the eight branches through the data
corresponding to time $n + \frac{1}{2}$ are
$$
y_j = \sin\left(\alpha\left(t - \left(j + \frac{1}{2}\right)\delta + \beta\right)\right).
$$
6. VARYING VENUS'S ANGULAR VELOCITY. The key behind the transit rule is recognizing that
$D(\omega_0)$ consists of eight components or branches. Thus we say that the periodicity of
$D(\omega)$ is the integer p if $D(\omega)$ appears to fall into p branches. To formalize what
is meant by appears, for each positive integer $\nu$, we define $N(\nu)$ as the maximal integer
n for which $\{\sin(2\pi\omega\nu j)\}_{j=0}^{n}$ is monotonic. Intuitively, $N(\nu)$ counts the
number of rungs from a transit to a climax. We further define the periodicity quotient
$Q(\omega, \nu)$ as
$$
Q(\omega, \nu) = \left\lfloor \frac{N(\nu)}{\nu} \right\rfloor,
$$
which gives a measure of normalization among the values of $N(\nu)$. We say that the apparent
periodicity of $D(\omega)$ is p, if $Q(\omega, p)$ appears to approach the maximum of
$\{Q(\omega, \nu) \mid \nu \in \mathbb{Z}^+\}$.
Table 3. The periodicity of $D(\omega_0)$ appears to be 8.
$\nu$   $Q(\omega_0, \nu)$      $\nu$   $Q(\omega_0, \nu)$      $\nu$   $Q(\omega_0, \nu)$
 1       1                      11       0                      21       0
 2       0                      12       0                      22       0
 3       1                      13       0                      23       0
 4       0                      14       0                      24       0
 5       0                      15       0                      25       0
 6       0                      16       1                      26       0
 7       0                      17       0                      27       0
 8       7                      18       0                      28       0
 9       0                      19       0                      29       0
10       0                      20       0                      30       0
The first few values of $Q(\omega_0, \nu)$ are given in Table 3. When extending this table as
far as a typical CAS allows, it appears as if $Q(\omega_0, \nu) = 0$ for all $\nu > 16$. From
such evidence, and since the maximum quotient among this range is 7 and corresponds to
$\nu = 8$, we conclude that $D(\omega_0)$ has apparent periodicity 8.
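A periodicity quotient of this kind can be computed by counting how long the sampled sine sequence stays monotonic. The Python sketch below is a tentative reading rather than the author's procedure: it assumes the run is counted from j = 0, that monotonic means strictly monotonic, and that the quotient is the integer part of the run length divided by $\nu$. Under those assumptions it reproduces the dominant value 7 at $\nu = 8$, although the small entries of Table 3 may depend on counting conventions that this sketch does not try to match. The function names are ours.

```python
import math

OMEGA0 = 365.26 / 224.70   # relative angular velocity of Venus

def monotone_run(omega, nu, limit=1_000_000):
    """Largest n for which sin(2*pi*omega*nu*j), j = 0..n, is strictly monotonic."""
    def term(j):
        return math.sin(2 * math.pi * omega * nu * j)
    current = term(1)
    if current == term(0):
        return 0
    rising = current > term(0)
    n = 1
    while n < limit:
        following = term(n + 1)
        # Stop at the first repeat or the first change of direction.
        if following == current or (following > current) != rising:
            break
        current = following
        n += 1
    return n

def periodicity_quotient(omega, nu):
    """Integer part of the monotone run length divided by nu (an assumed reading of Q)."""
    return monotone_run(omega, nu) // nu

quotients = {nu: periodicity_quotient(OMEGA0, nu) for nu in range(1, 31)}
best = max(quotients, key=quotients.get)
print(best, quotients[best])
```

The same function can be pointed at other angular velocities to explore how the quotient behaves near other simple fractions.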
Let $\omega$ be a number between 0 and 0.5. Observe that $Q(\omega, \nu) = Q(n + \omega, \nu)$
for all integers n. Because sine is an odd function, $Q(\omega, \nu) = Q(1 - \omega, \nu)$.
Therefore, the only $\omega$ values for which we need to evaluate $Q(\omega, \nu)$ are those in
the range $0 \le \omega \le \frac{1}{2}$, or, equivalently, the range $1.5 \le \omega \le 2$,
the reference interval containing $\omega_0$. Armed with the use of the measure Q we ask, how
far may we perturb $\omega$ from $\omega_0$ and yet have apparent periodicity remain invariant?
Figure 12. The range of 8-fold apparent periodicity: $Q(\omega, 8)$ for $\omega$ between 1.615
and 1.635
If $Q(\omega, \nu) \ge 3$, we say that $D(\omega)$ displays significant apparent periodicity
$\nu$. From Figure 12, we see that $D(\omega)$ displays significant apparent periodicity 8 on
the interval (1.6237, 1.6263) punctured by $\omega = \frac{13}{8}$. Plots of D(1.6237) and
D(1.6263) are much like Figure 13(a), in which an 8-fold periodicity is less pronounced than in
Figure 9(b). As $\omega$ approaches $\frac{13}{8} = 1.625$, $Q(\omega, 8)$ goes to $\infty$. For
example, Q(1.6251, 8) = 39; this strong apparent periodicity 8 is illustrated in the graph of
D(1.6251) in Figure 13(b). Of course, when $\omega = 13/8$, the eight branches collapse into
five parallel lines corresponding to the sine values 0, ±1, ±√2/2, which means that
$Q(\frac{13}{8}, 8) = 0$. We therefore say that the domain of significant periodicity for 13/8
is an interval of angular velocities $\omega$ punctured by $\frac{13}{8}$, for which
$Q(\omega, 8)$ is at least 3.
Figure 13. A weak and a strong apparent periodicity 8: (a) D(1.6263); (b) D(1.6251)
What about periodicity domains for other values, such as $\omega = \frac{5}{3}$, $\frac{7}{4}$,
$\frac{12}{7}$, $\frac{17}{10}$, and $\frac{19}{11}$, as shown in Figure 14? It should come as
no surprise that for $\omega$ values taken within the significant periodicity domains of these
numbers, $D(\omega)$ will exhibit apparent periodicity of 3, 4, 7, 10, and 11, respectively. For
example, the data set D(1.714) in Figure 15(a) shows apparent periodicity 7, and is well within
the significant periodicity domain of $\frac{12}{7} \approx 1.71429$.
Figure 14. Domains of periodicity for 13/8, 17/10, 19/11, 12/7, 7/4, and 5/3
Figure 15. Apparent periodicities 7 and 9: (a) D(1.714); (b) $\{(n, W_1(n))\}_{n \ge 0}$ where
$\omega = \frac{11\sqrt{2}}{10}$
The next example is an application of the transit rule corresponding to an apparent periodicity
other than 8.
Example 3. Let $\omega = \frac{11\sqrt{2}}{10} \approx 1.55563$. The plot of $D(\omega)$,
Figure 15(b), shows that its apparent periodicity is p = 9. Since $\frac{14}{9}$ is that
fraction of integers with denominator 9 nearest $\omega$, the analog of (15) is
$$
\omega - \frac{14}{9} = \frac{\alpha}{2\pi} = \frac{1}{T},
$$
which gives $T \approx 12{,}600.3$ years. Solving (19) gives the shuffling factor r = 7 rather
than 3. Now let $\lambda = 0$, k = 0, and j = 5, which means that we are looking for a transit
year with residue $jr \bmod 9 \equiv 8$ near time $5\delta = 5T/9 \approx 7000.17$. Thus,
m = (777)(9) + 8 = 7001 is a transit year. With this new value of $\omega$, V has receded from
S, so the distance d(9) between the rungs has changed to $d(9) \approx 0.0014$ by (14), which
means that we have more than twin transits; in fact we have septuplets, as shown in
Figure 16(a).
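The parameters of this example can be derived from the angular velocity and the apparent periodicity alone. The Python sketch below is illustrative: `rule_parameters` is our own helper, it takes the periodicity p as given rather than detecting it, and it assumes the numerator q is coprime to p so that a shuffling factor exists.

```python
import math

def rule_parameters(omega, p):
    """Return (q, T, r) for the fraction q/p nearest omega; assumes gcd(q, p) = 1."""
    q = round(omega * p)                  # numerator of the nearest fraction q/p
    period = 1 / (omega - q / p)          # natural period, the analog of (15)
    r = next(c for c in range(p) if (q * c + 1) % p == 0)   # q*r = -1 (mod p)
    return q, period, r

omega = 11 * math.sqrt(2) / 10
p = 9
q, T, r = rule_parameters(omega, p)

j = 5
residue = (j * r) % p                     # residue class of the transit year
target = j * T / p                        # time 5*delta on branch j
m = p * round((target - residue) / p) + residue
print(q, r, round(T, 1), residue, m)
```

The small difference between this target time and the 7000.17 quoted in the example comes from rounding the period; it does not affect the transit year.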
Figure 16. Transits with $\omega$ other than $\omega_0$: (a) a transit family of septuplets,
$T_{6974}$ through $T_{7028}$, for $\omega = \frac{11\sqrt{2}}{10}$; (b) hunting for a phase
angle, comparing the actual June 2012 transit path (crossing S's boundary at Y and Z) with the
linear model approximation of the June 2012 transit
7. A REALITY CHECK. How does our model contrast with reality?
A phenomenon omitted thus far from our transit model is the tendency of objects to rotate,
including the orbital planes of V and E, a feature called precession. The values $\tau_e$ and
$\tau_v$ used to define $\omega_0$ are the periods of the two planets with respect to the
background of the fixed stars. To adapt our model appropriately, we must incorporate slightly
different periods, namely, the time it takes for a planet to return to its aphelion. Since E
precesses faster than V, as time goes on the nexus line rotates and hence spring and fall
transits occur later in the year. Because precession rates are tiny compared to $\omega_0$, we
arbitrarily take $\omega_0 \approx 1.625550000$. Meeus [5, p. 13] predicts that "an almost
exactly central transit will take place on 11 July 5900," a transit through S's center. Thus
from 2012 to 5900, the spring transit has now become a summer transit, having slipped forward by
about 35 days during a lapse of 3888 years, which means that the change in the relative orbital
speeds of V and E with respect to the nexus line is
$\Delta\omega \approx \frac{35\,\omega_0}{3888\,\tau_e} \approx 0.0000397559$, which means that
we might try the new angular velocity $\omega_1 = \omega_0 - \Delta\omega \approx 1.625510244$.
Next, we need a phase shift to start our model. From [5, p. 48], the transit of 6 June 2012
crossed S's boundary at $Y \approx 39.45^\circ$ and at $Z \approx 291.4^\circ$ measured
counterclockwise from the top of S, shown as a dotted line in Figure 16(b). Adjusting (1) and
(5) so that the trigonometric arguments $2\pi\omega t$ are replaced by
$2\pi\omega(t + \lambda)$, where $\lambda$ is an indeterminate phase shift, and using a search
method to find $\lambda$ by dynamically plotting W(t − 2012) near t = 2012, yields the
solid-line transit in Figure 16(b), suggesting that $\lambda \approx 0.00102$ is a good match.
The two transit lines are non-parallel because E's and V's actual orbits have positive
eccentricity. When we apply (13) in this adjusted model for the years from 700 to 3000 AD, we
find the promising spring transit Gregorian year possibilities of Table 4. The years marked with
an asterisk indicate a match between our results and Meeus's. Not bad for a linear model. But
can we do better?
Table 4. The linear model versus Meeus's model
This linear model:  (781, 789*)    (1024, 1032*)  1275*         1518*         (1761*, 1769*)
                    (2004*, 2012*) 2255*          2498*         2741*         (2984*, 2992)
Meeus's model:      (789, 797)     (1032, 1040)   (1275, 1283)  (1518, 1526)  (1761, 1769)
                    (2004, 2012)   (2247, 2255)   (2490, 2498)  (2733, 2741)  (2976, 2984)
To do so, we work backward through the transit rule and find a magic angular velocity. Since
$\omega_1$ is within the periodicity domain of $\frac{13}{8}$, the corresponding shuffling
factor is r = 3. We make use of a second unusual spring transit year, 183 BC, whose
corresponding transit Meeus describes as "almost central." The difference between 5900 AD and
183 BC is 6083 years. Identify t = 0 with year 5900. Thus, year 183 BC is referenced by
t = −6083 = 8(−761) + 5, which means that $5 \equiv 3j \bmod 8$, whose solution is j = 7. Using
the angular velocity $\omega_1$ with (15), the associated period is $T_1 \approx 1959.85$. We
then solve $kT_1 + \frac{7T_1}{8} = -6083$, getting $k \approx -3.98$. Next, reset k as k = −4,
and solve $(k + \frac{7}{8})T_2 = -6083$, getting $T_2 = \frac{48664}{25}$. By (15),
$$
\omega_2 = \frac{1}{T_2} + \frac{13}{8} = \frac{25}{48664} + \frac{13}{8} = \frac{9888}{6083}
\approx 1.6255137267795495644.
$$
When we generate transits by the transit rule using angular velocity $\omega_2$ across the
years 2000 BC to 4000 AD, we get an exact match with actual spring transits from Meeus's
results.
Table 5. Spring transit years, generated by the transit rule
1884 BC  1641 BC  1398 BC  1155 BC  912 BC  669 BC  426 BC  183 BC  60    303
546      789      1032     1275     1518    1761    2004    2247    2490  2733
2984*    3227     3470     3713     3956    4199    4442    4685    4928  5171

As can be seen, the difference between successive entries in Table 5 is 243 years, except when
passing from 2733 to 2984, the year marked with an asterisk. The match between these two
approaches with respect to the recessive partner in twin transits is less spectacular.
8. SOME PARTING OBSERVATIONS AND QUESTIONS. What we have shown is that the cycle of transits is the way it is because V's angular velocity $\omega_0$ is enmeshed within the periodicity domain of $\frac{13}{8}$. This in turn induces a modulo 8 shuffling of successive transit years by a factor of 3, a phenomenon reflected in the 6000-year standard tables of transits generated by Meeus [5], provided we partition transits into two families: spring transits and fall transits, and discard one of the years from each twin transit.
With respect to the notion of periodicity domains, some natural questions arise. Does every $D(\omega)$ have a well-defined apparent periodicity? For a challenge, try $\omega = \sqrt{3}$. What happens when $\omega$ wanders into overlapping periodicity domains? The reality is a war-torn fractal-like dominance landscape foreshadowed in part by Figure 14. As a simple example, $\frac{35}{52} \approx 0.673077$ exerts its 52-ness dominance over its immediate neighbors. Yet, it is well within the dominance of $\frac{2}{3}$; an examination of $D(\frac{35}{52})$ shows a clear 3-fold periodicity, and the periodicity quotient $Q(\frac{35}{52}, 3) = 4$ supports this result. However, $Q(\frac{35}{52} - 0.000001, 52) = 92$ and a plot of its corresponding data set suggests periodicity 52.
With respect to permanence, in the life cycle of S, S slowly loses mass and swells to giant status and so the orbits of the planets recede from S, which means that the transit cycle for V may change dramatically. The rational numbers with small integer denominator near $\frac{13}{8}$ in increasing order are
\[ \left\{ \frac{3}{2}, \frac{11}{7}, \frac{8}{5}, \frac{29}{18}, \frac{21}{13}, \frac{13}{8}, \frac{31}{19}, \frac{18}{11}, \frac{23}{14}, \frac{28}{17}, \frac{33}{20}, \frac{5}{3}, \frac{7}{4} \right\}. \]
A billion or two years from now, the natural periodicity of the Venus transit may change from 8 to 13 or 19. Hopefully, humans will yet be here to see.
For an application of the ideas of this paper to the phases of the Moon, see [8]. Just as the transit of Venus involves the periodicity domain of $\frac{13}{8}$, so too the phases of the Moon involve the periodicity domain of another fraction, this time $\frac{235}{19}$.
ACKNOWLEDGMENT. Thanks to Osmo Pekonen for asking me to write a review [7] of [11] which in turn
sparked this project.
REFERENCES
1. G. K. Chesterton, Heretics. Reprint of the 1905 edition, Books for Libraries Press, Freeport, NY, 1970.
2. M. Danlous-Dumesnils, Périodicité des passages de Vénus, L'Astronomie 91 (1977) 117–127.
3. F. Espenak, Six millenium catalog of Venus transits, NASA, 2013, available at http://eclipse.gsfc.nasa.gov/transit/catalog/VenusCatalog.html.
4. E. Halley, A new method of determining the parallax of the Sun, or his distance from the Earth, in The Abridged Transactions of the Royal Society 6 (1809) 243–249.
5. J. Meeus, Transits. William-Bell Press, Richmond, VA, 1989.
6. J. Meeus, The transits of Venus, 3000 BC to AD 3000, Journal of the British Astronomical Association 68 (1958) 98–108.
7. A. Simoson, A review of [11], Math. Intel. 35 (2013) 84–85.
8. A. Simoson, Bilbo and the last moon of autumn, to appear in Math Horizons.
9. D. A. Teets, Transits of Venus and the astronomical unit, Math. Mag. 76 (2003) 225–348.
10. H. Woolf, The Transits of Venus: A Study of Eighteenth Century Science. Princeton University Press, Princeton, NJ, 1959.
11. A. Wulf, Chasing Venus: the Race to Measure the Heavens. Alfred Knopf Press, New York, 2012.
ANDREW J. SIMOSON is a long-time professor of mathematics at King University. Recently he stumbled
upon a pertinent Chesterton quote, "Men take thought and ponder rationalistically touching remote things –
things that only theoretically matter, such as the transit of Venus" [1, p. 141].
King University, 1350 King College Road, Bristol, TN 37620
ajsimoso@king.edu
A Drug-Induced Random Walk
Author(s): Daniel J. Velleman
Abstract. The label on a bottle of pills says "Take one half pill daily." A natural way to proceed
is as follows: Every day, remove a pill from the bottle at random. If it is a whole pill, break
it in half, take one half, and return the other half to the bottle; if it is a half pill, take it. We
analyze the history of such a pill bottle.
1. INTRODUCTION. A few years ago our cat Natasha (see Figure 1) began losing
weight. We took her to the vet, who did some tests and determined that she had a thy-
roid condition. He gave us a bottle of pills and told us to give her half a pill every day.
Figure 1. Natasha
The next day we shook a pill out of the bottle, broke it in half, gave her half of the
pill, and put the other half back in the bottle. We repeated that procedure for several
more days. Eventually, a day came when the pill we shook out of the bottle was one of
the half pills we had put back in on one of the previous days. Of course, we just gave
her the half pill that day. We continued to follow this procedure until the bottle was
empty, and then we started on a new bottle.
The pills solved Natasha's medical problem; she regained the weight she had lost,
and she's doing fine now. But they created an interesting mathematical problem. The
state of the pill bottle on any day can be described by a pair of numbers (w, h), where
w is the number of whole pills in the bottle and h is the number of half pills. We
will assume that every day a pill is removed from the bottle at random, with each pill
being equally likely to be chosen. When a whole pill is removed, it is cut in half and
half of it is returned to the bottle; when a half pill is removed, nothing is returned
to the bottle. Thus, if the state of the pill bottle on a particular day is $(w, h)$, then with probability $w/(w + h)$ the state on the next day will be $(w - 1, h + 1)$, and with probability $h/(w + h)$ it will be $(w, h - 1)$. This means that the state of the pill bottle executes a random walk in the plane, starting at the point $(w, h) = (n, 0)$, where $n$ is the initial number of pills in the bottle, and ending at $(0, 0)$. Since the bottle contains $2n$ doses of medicine, the walk takes $2n$ steps.
MSC: Primary 60G50, Secondary 65L05
For example, Figure 2 shows a computer simulation of a pill-bottle walk starting
with n = 20 pills. On the first three days, whole pills are removed from the bottle, and
the state of the bottle goes from (20, 0) to (19, 1), (18, 2), and (17, 3). The next day, a
half pill is removed, and the state goes to (17, 2). And the walk continues for 36 more
steps until it ends at (0, 0).
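A walk like the one in Figure 2 can be simulated in a few lines. The following is a minimal sketch of the transition rule described above, not the program used to produce the figures; the function name `pill_walk` and the fixed seed are illustrative choices.

```python
import random

def pill_walk(n, rng=None):
    """Simulate one pill-bottle walk and return the list of states (w, h)."""
    rng = rng or random.Random()
    w, h = n, 0
    states = [(w, h)]
    while w + h > 0:
        # Every pill in the bottle, whole or half, is equally likely to be drawn.
        if rng.random() < w / (w + h):
            w, h = w - 1, h + 1   # whole pill: one half is returned to the bottle
        else:
            h -= 1                # half pill: nothing is returned
        states.append((w, h))
    return states

walk = pill_walk(20, random.Random(0))
print(len(walk) - 1)   # number of steps taken
print(walk[:5])        # first few states of this particular run
```

Each step uses up exactly one dose, so the number of steps is always $2n$ regardless of the random choices.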
Figure 2. A pill-bottle walk with n = 20 (horizontal axis: w; vertical axis: h)
Figure 3 shows simulated walks with n = 100, n = 1000, and n = 10000. It ap-
pears that although the walks are random, the overall shapes of the walks are similar,
with the shape becoming smoother as n increases. Notice that the scales of the three
walks in Figure 3 are different; the first starts at (100, 0), the second at (1000, 0), and
the third at (10000, 0). It is only when they are drawn the same size that they look
similar. This suggests that we should rescale the walks to a uniform size, indepen-
dent of n. We will therefore switch to a new coordinate system. If we let x = w/n
and y = h/n, then x represents the fraction of the original n pills that are still whole,
and y represents the fraction that have become half pills. Notice that these fractions
may add up to less than 1, since some fraction of the pills may have been used up
completely.
Figure 3. Walks with n = 100 (top left), n = 1000 (bottom), and n = 10000 (top right); horizontal axis w, vertical axis h
Using the coordinates $(x, y)$ to represent the state of the pill bottle, we get a random walk that starts at $(1, 0)$, ends at $(0, 0)$, and stays in the triangle $x + y \le 1$, $x \ge 0$, $y \ge 0$. When the state is $(x, y)$, it changes as follows:
with probability $\frac{x}{x+y}$, the state changes to $\left(x - \frac{1}{n},\, y + \frac{1}{n}\right)$;
with probability $\frac{y}{x+y}$, the state changes to $\left(x,\, y - \frac{1}{n}\right)$.
We will call such a walk an n-walk. Increasing n does not make the walk larger, but it
makes the steps smaller. Figure 3 suggests that as n increases, the walk approaches a
smooth curve. What is this curve?
The limit curve we seek is an example of a scaling limit of a discrete process.
Perhaps the best-known example of a scaling limit is Brownian motion, which can also
be thought of as the scaling limit of a random walk. For more on Brownian motion and
scaling limits, see [5].
We first give an intuitive argument that suggests a possible answer to our question. We will find it helpful to introduce a third variable $t$, standing for time. We set $t = 0$ at the beginning of the walk, and to keep the scales of the variables comparable we will assume that $t$ increases by $1/n$ for each step of the walk. Since the walk consists of $2n$ steps, this means that $t$ will run from 0 to 2. We think of the limit curve as being given by parametric equations
\[ x = f_x(t), \quad y = f_y(t), \quad 0 \le t \le 2, \]
or, in vector notation,
\[ (x, y) = (f_x(t), f_y(t)) = \mathbf{f}(t), \quad 0 \le t \le 2. \]
When the state of an n-walk is $(x, y)$, the displacement to the next state is either the vector $(-1/n, 1/n)$, with probability $x/(x + y)$, or $(0, -1/n)$, with probability $y/(x + y)$. Thus, the expected value of the displacement is
\[ \frac{x}{x + y}\left(-\frac{1}{n}, \frac{1}{n}\right) + \frac{y}{x + y}\left(0, -\frac{1}{n}\right) = \frac{1}{n}\left(-\frac{x}{x + y}, \frac{x - y}{x + y}\right). \]
Since $t$ increases by $1/n$ during the step, this suggests that the parametric form of the limit curve might be a solution to the system of differential equations
\[ \frac{dx}{dt} = -\frac{x}{x + y}, \qquad \frac{dy}{dt} = \frac{x - y}{x + y}. \tag{1} \]
To solve this system of equations, we first note that
\[ \frac{dy}{dx} = \frac{dy/dt}{dx/dt} = \frac{x - y}{-x} = -1 + \frac{y}{x}. \]
We will let you check that the curve $y = -x \ln x$ satisfies this equation for $0 < x \le 1$ and passes through the point $(1, 0)$. The graph of this curve is shown in Figure 4, and the similarity to the walks in Figure 3 is striking. Notice that although $\ln 0$ is undefined, $\lim_{x \to 0^+}(-x \ln x) = 0$. From now on we consider $0 \ln 0$ to be equal to 0, so that the curve $y = -x \ln x$ includes the point $(0, 0)$.
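For readers who want the check written out, here is one way to do it, using only the product rule and the equation $dy/dx = -1 + y/x$:

\[
y = -x \ln x \;\Longrightarrow\; \frac{dy}{dx} = -\ln x - 1,
\qquad
-1 + \frac{y}{x} = -1 + \frac{-x \ln x}{x} = -1 - \ln x .
\]

The two expressions agree for $0 < x \le 1$, and at $x = 1$ we get $y = -1 \cdot \ln 1 = 0$, so the curve passes through $(1, 0)$.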
Figure 4. The graph of $y = -x \ln x$
Substituting $y = -x \ln x$ in the first equation in (1), we get
\[ \frac{dx}{dt} = -\frac{x}{x - x \ln x} = \frac{1}{\ln x - 1}. \]
Separation of variables gives
\[ t = \int (\ln x - 1)\, dx = x \ln x - 2x + C. \]
Since $x = 1$ when $t = 0$, we must have $C = 2$, and therefore
\[ t = x \ln x - 2x + 2. \tag{2} \]
Let $g(x) = x \ln x - 2x + 2$ for $0 \le x \le 1$. (Notice that by our convention that $0 \ln 0 = 0$, we have $g(0) = 2$.) Then $g$ maps $[0, 1]$ onto $[0, 2]$ and is strictly decreasing, so it has an inverse. We define $f_x$ to be the inverse of $g$, which is a strictly decreasing function mapping $[0, 2]$ to $[0, 1]$. Thus, if $0 \le t \le 2$ and $x = f_x(t)$, then $x$ and $t$ satisfy equation (2).$^1$
Using $y = -x \ln x$, we can rewrite equation (2) as $t = -y - 2x + 2$, or equivalently $y = 2 - 2x - t$. We therefore define
\[ f_y(t) = 2 - 2 f_x(t) - t. \tag{3} \]
We leave it to you to verify that the equation
\[ (x, y) = (f_x(t), f_y(t)) = \mathbf{f}(t), \quad 0 \le t \le 2 \tag{4} \]
parametrizes the curve $y = -x \ln x$ shown in Figure 4, and it satisfies the differential equations (1) for $0 \le t < 2$, where we interpret the derivatives at $t = 0$ as one-sided derivatives. (At $t = 2$, we have $x = y = 0$, and therefore the right-hand sides of the equations in (1) are undefined.) The graphs of $f_x$ and $f_y$ are shown in Figure 5.
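Since $f_x$ is defined only as an inverse function, it may help to see how its values can be computed. The sketch below inverts $g(x) = x \ln x - 2x + 2$ by bisection, which works because $g$ is strictly decreasing on $[0, 1]$. The tolerance and the function names are illustrative assumptions, not part of the paper.

```python
import math

def g(x):
    """t as a function of x along the limit curve (with 0 ln 0 taken as 0)."""
    return 2.0 if x == 0 else x * math.log(x) - 2 * x + 2

def f_x(t, tol=1e-12):
    """Invert g on [0, 1] by bisection; valid for 0 <= t <= 2."""
    lo, hi = 0.0, 1.0              # g(lo) = 2 >= t >= 0 = g(hi)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(mid) > t:
            lo = mid               # g is decreasing, so the root lies to the right
        else:
            hi = mid
    return (lo + hi) / 2

def f_y(t):
    """Second coordinate of the limit curve: f_y(t) = 2 - 2 f_x(t) - t."""
    return 2 - 2 * f_x(t) - t

for t in (0.0, 0.5, 1.0, 1.5, 2.0):
    print(t, f_x(t), f_y(t))
```

Bisection is slower than Newton's method but needs no derivative and cannot leave the interval $[0, 1]$, which keeps the sketch short and robust near $x = 0$.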
It turns out that an n-walk does, indeed, approach the curve (4) as $n$ approaches $\infty$,
but the sense in which this is true must be stated carefully. Our main theorem is the
following.
$^1$ Using the Lambert W function $W_{-1}$ (see [1]), we can express $f_x(t)$ explicitly by the equation
\[ f_x(t) = \frac{t - 2}{W_{-1}\left((t - 2)/e^2\right)}. \]
However, we will not have any use for this expression.
Figure 5. The graphs of $x = f_x(t)$ (left) and $y = f_y(t)$ (right)
Theorem 1. Suppose that $\varepsilon > 0$. Let the points on an n-walk be $p_0 = (1, 0), p_1, \ldots, p_{2n} = (0, 0)$, and for $0 \le i \le 2n$ let $t_i = i/n$. Then the probability that for every $i$, $\|p_i - \mathbf{f}(t_i)\| < \varepsilon$ approaches 1 as $n \to \infty$. In other words, the n-walk converges uniformly in probability to the limit curve.
Two notable features of the limit curve are that the tangent line at $(1, 0)$ has slope $-1$, and the tangent line at the origin is vertical. The first feature makes intuitive sense: early in the walk, almost all of the pills in the bottle are whole pills, so it is likely that several whole pills will be removed before the first half pill is removed. For example, in the walk in Figure 2, three whole pills were removed before the first half pill was removed. When these initial whole pills are removed, the walk will move along the line $y = 1 - x$, which is the tangent line at $(1, 0)$. The second feature seems more surprising: it appears that near the end of the walk, almost all of the pills are half pills, and the walk ends by moving along the line $x = 0$ toward the origin. This suggests two questions.
Question 1. For a bottle of n pills, what is the expected number of whole pills that are
removed from the bottle before the first half pill is removed?
Question 2. For a bottle of n pills, what is the expected number of half pills that are
removed from the bottle after the last whole pill is removed?
Versions of Question 1 have appeared in the literature before (see, for example, [3, 4, 6, 8]). In the case $n = 365$, it is equivalent to the following version of the birthday problem: If people are chosen at random, one by one, what is the expected number of people with distinct birthdays who will be chosen before the first person who has the same birthday as a previously chosen person? We will give an elementary derivation of the answer to Question 1. In our next theorem, we express the answer in terms of the incomplete gamma function, which is defined as follows:
\[ \Gamma(a, x) = \int_x^\infty t^{a-1} e^{-t}\, dt. \]
Theorem 2. For a bottle of n pills, the expected number of whole pills that are removed from the bottle before the first half pill is removed is
\[ \frac{e^n}{n^{n-1}}\, \Gamma(n, n). \]
As $n \to \infty$, this expected value is asymptotic to $\sqrt{\pi n / 2}$.
The answer to Question 2 was found by Richard Stong.
Theorem 3 (Stong). For a bottle of n pills, the expected number of half pills that are removed from the bottle after the last whole pill is removed is the nth harmonic number,
\[ H_n = 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n}. \]
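Theorem 3 can be checked exactly for small bottles by conditioning on the first pill drawn. The sketch below is a brute-force verification under that recursion, not the proof given later; the function names are illustrative, and agreement for small n is of course only evidence, not a derivation.

```python
from fractions import Fraction
from functools import lru_cache

@lru_cache(maxsize=None)
def expected_tail(w, h):
    """Expected number of half pills removed after the last whole pill,
    starting from a bottle with w whole pills and h half pills."""
    if w == 0:
        return Fraction(h)          # only half pills remain, and all are removed
    total = w + h
    value = Fraction(w, total) * expected_tail(w - 1, h + 1)
    if h > 0:
        value += Fraction(h, total) * expected_tail(w, h - 1)
    return value

def harmonic(n):
    """The nth harmonic number as an exact fraction."""
    return sum(Fraction(1, k) for k in range(1, n + 1))

for n in range(1, 9):
    print(n, expected_tail(n, 0), harmonic(n))
```

Exact fractions make the comparison an equality test rather than a floating-point approximation.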
For example, for a bottle of 100 pills, the expected number of whole pills before the first half pill is
\[ \frac{e^{100}}{100^{99}}\, \Gamma(100, 100) \approx 12.21, \]
and the asymptotic approximation in Theorem 2 is
\[ \sqrt{\frac{100\pi}{2}} \approx 12.53. \]
The expected number of half pills after the last whole pill is
\[ H_{100} \approx 5.19. \]
The rest of this paper is devoted to the proofs of Theorems 1–3. We prove Theorem 1
in Section 3, and Theorems 2 and 3 in Section 4. We consider variations on these
theorems in Section 5.
2. BACKGROUND FOR PROOF OF THEOREM 1. In preparation for the proof of Theorem 1, we simplify the problem by eliminating one variable. According to definition (3), $f_y(t) = 2 - 2 f_x(t) - t$, so
\[ \mathbf{f}(t) = (f_x(t),\, 2 - 2 f_x(t) - t) = f_x(t)(1, -2) + (0, 2 - t). \]
A similar equation holds for the points on any n-walk. Suppose that after $i$ steps, the n-walk is at the point $p_i = (x_i, y_i)$, and let $t_i = i/n$. This means that there are $w_i = n x_i$ whole pills and $h_i = n y_i$ half pills in the bottle. These pills are enough for $2w_i + h_i$ doses of medicine. Since there were $2n$ doses in the bottle originally, and $i$ of those doses have been used up, there must be $2n - i$ doses left. Therefore, $2w_i + h_i = 2n - i$, or equivalently, $h_i = 2n - 2w_i - i$. Dividing through by $n$, we find that
\[ y_i = 2 - 2x_i - t_i, \tag{5} \]
and therefore
\[ p_i = (x_i,\, 2 - 2x_i - t_i) = x_i(1, -2) + (0, 2 - t_i). \]
It follows that
\[ \|p_i - \mathbf{f}(t_i)\| = \|(x_i - f_x(t_i))(1, -2)\| = |x_i - f_x(t_i)|\sqrt{5}. \]
Thus, to ensure that $p_i$ is close to $\mathbf{f}(t_i)$, it will suffice to ensure that $x_i$ is close to $f_x(t_i)$; we can ignore the y-coordinates of $p_i$ and $\mathbf{f}(t_i)$. In other words, to prove Theorem 1 it will suffice to prove the following lemma.
Lemma 4. Suppose that $\varepsilon > 0$. Let the x-coordinates of the points on an n-walk be $x_0 = 1, x_1, \ldots, x_{2n} = 0$, and for $0 \le i \le 2n$ let $t_i = i/n$. Then the probability that for every $i$, $|x_i - f_x(t_i)| < \varepsilon$ approaches 1 as $n \to \infty$.
In fact, using equations (3) and (5), we can completely eliminate the variable $y$ from the problem. We can describe the x-coordinates of the points on an n-walk by saying that $x_{i+1}$ is equal to either $x_i - 1/n$ or $x_i$, with the first possibility occurring with probability
\[ \frac{x_i}{x_i + y_i} = \frac{x_i}{x_i + 2 - 2x_i - t_i} = \frac{x_i}{2 - x_i - t_i}. \tag{6} \]
Similarly, if $x = f_x(t)$ and $y = f_y(t)$, then for $0 \le t < 2$,
\[ f_x'(t) = \frac{dx}{dt} = -\frac{x}{x + y} = -\frac{x}{2 - x - t} = -\frac{f_x(t)}{2 - f_x(t) - t}. \tag{7} \]
Thus, we can work entirely with the points $(t_i, x_i)$ and the curve $x = f_x(t)$, both of which lie in the $tx$-plane.
The idea behind our proof of Lemma 4 is straightforward. Let m be a large positive
integer, and let n be an integer much larger than m. Now consider an n-walk, and
break the 2n steps of the walk into m large blocks of steps. We view the n-walk in the
t x-plane, ignoring the y-coordinates. The individual steps of the n-walk are random
and unpredictable, but the net change in x that results from a large block of steps is
more predictable: by the law of large numbers, this net change is likely to be close to
its expected value. It will follow that if a block of steps starts at a point $(t, x)$, then the net result of this block of steps is likely to be a small displacement in the $tx$-plane whose slope is close to $-x/(2 - x - t)$. Since $x = f_x(t)$ is a solution to the differential equation $dx/dt = -x/(2 - x - t)$, this means that the steps of the n-walk should stay close to the graph of $f_x$.
This proof sketch suggests that our proof will involve ideas related to Euler's method. Recall that Euler's method is a numerical method for solving a differential equation of the form $f'(t) = F(t, f(t))$ for $a \le t \le b$, with an initial condition $f(a) = x_0$. Here the function $F$ and the numbers $a$, $b$, and $x_0$ are given, and we want to compute values of $f$. To apply Euler's method, we choose a positive integer $n$ and a positive step size $h \le (b - a)/n$, let $t_j = a + jh$ for $0 \le j \le n$, and then define $x_j$ recursively by the equation
\[ x_{j+1} = x_j + hF(t_j, x_j), \quad 0 \le j < n. \]
Thus, the displacement from $(t_j, x_j)$ to $(t_{j+1}, x_{j+1})$ has slope $F(t_j, x_j)$. If $h$ is small and $F$ is sufficiently well-behaved, then the points $(t_j, x_j)$ will be close to the graph of $f$.
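As an illustration, ordinary Euler's method can be applied to the differential equation $dx/dt = -x/(2 - x - t)$ with $x(0) = 1$. The sketch below is not part of the proof: the endpoint $b = 1.5$ and the step count are arbitrary illustrative choices, and the accuracy check uses the implicit relation $t = x \ln x - 2x + 2$ for the exact solution.

```python
import math

def F(t, x):
    """Right-hand side of dx/dt = -x / (2 - x - t)."""
    return -x / (2 - x - t)

def euler(F, a, b, x0, n):
    """Euler's method with n steps of size h = (b - a) / n."""
    h = (b - a) / n
    x = x0
    points = [(a, x)]
    for j in range(n):
        x = x + h * F(a + j * h, x)      # x_{j+1} = x_j + h F(t_j, x_j)
        points.append((a + (j + 1) * h, x))
    return points

points = euler(F, 0.0, 1.5, 1.0, 10000)
t_end, x_end = points[-1]

# On the exact solution, x ln x - 2x + 2 - t vanishes identically.
residual = x_end * math.log(x_end) - 2 * x_end + 2 - t_end
print(t_end, x_end, residual)
```

Stopping short of $t = 2$ avoids the point where the right-hand side of the differential equation is undefined.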
We will need to modify Euler's method slightly, because according to our proof sketch for Lemma 4, the slope of the displacement caused by a block of steps in the n-walk starting at $(t, x)$ is likely to be close to $-x/(2 - x - t)$, but not exactly equal to it. We will therefore need a version of Euler's method in which the slope of the displacement at step $j$ is only approximately equal to $F(t_j, x_j)$.
To make this precise, suppose that $a < b$, $g_1$ and $g_2$ are functions from $[a, b]$ to $\mathbb{R}$, and for all $t \in [a, b]$, $g_1(t) < g_2(t)$. Let
\[ D = \{(t, x) \in \mathbb{R}^2 : a \le t \le b \text{ and } g_1(t) \le x \le g_2(t)\}. \]
Now suppose that $F : D \to \mathbb{R}$ and $f : [a, b] \to \mathbb{R}$, and for all $t \in [a, b]$, $(t, f(t)) \in D$ and
\[ f'(t) = F(t, f(t)), \]
where we interpret $f'(t)$ as a one-sided derivative when $t = a$ or $t = b$. Let $x_0 = f(a)$. We want to use a version of Euler's method to locate points $(t_j, x_j)$ near the graph of $f$. As before, we will use a positive step size $h \le (b - a)/n$, so for $0 \le j \le n$ we let $t_j = a + jh$. We will assume that for $0 \le j < n$, the slope of the displacement from $(t_j, x_j)$ to $(t_{j+1}, x_{j+1})$ deviates from $F(t_j, x_j)$ by some amount $\delta_j$. Thus, we recursively define
\[ x_{j+1} = x_j + h(F(t_j, x_j) + \delta_j). \]
To ensure that this formula is defined, we assume that for every $j$, $g_1(t_j) \le x_j \le g_2(t_j)$, so that $(t_j, x_j) \in D$.
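Lemma 5. In the modified Euler's method described above, assume that for $0 \le j < n$,
\[ |\delta_j| \le \delta. \]
We also assume that $\partial F/\partial x$ and $f''$ are defined and bounded. Thus, we assume that there are positive constants $C_1$ and $C_2$ such that for all $(t, x) \in D$,
\[ \left|\frac{\partial F}{\partial x}(t, x)\right| \le C_1, \qquad |f''(t)| \le C_2. \]
Then for $0 \le j \le n$,
\[ |x_j - f(t_j)| \le \left(\frac{hC_2}{2C_1} + \frac{\delta}{C_1}\right)\left((1 + C_1 h)^j - 1\right). \tag{8} \]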
Proof. We proceed by induction on $j$. Clearly, inequality (8) holds when $j = 0$, since both sides are 0. Now suppose that the inequality holds for some $j < n$. By Taylor's theorem, we can write
\[ f(t_{j+1}) = f(t_j) + h f'(t_j) + \frac{h^2}{2} f''(c_j) \]
for some number $c_j$ between $t_j$ and $t_{j+1}$. And by the mean value theorem, we have
\[ F(t_j, x_j) = F(t_j, f(t_j)) + \frac{\partial F}{\partial x}(t_j, d_j)(x_j - f(t_j)) = f'(t_j) + \frac{\partial F}{\partial x}(t_j, d_j)(x_j - f(t_j)) \]
for some $d_j$ between $x_j$ and $f(t_j)$. Thus,
\begin{align*}
x_{j+1} - f(t_{j+1}) &= x_j + h(F(t_j, x_j) + \delta_j) - f(t_{j+1}) \\
&= x_j + h\left(f'(t_j) + \frac{\partial F}{\partial x}(t_j, d_j)(x_j - f(t_j)) + \delta_j\right) - \left(f(t_j) + h f'(t_j) + \frac{h^2}{2} f''(c_j)\right) \\
&= (x_j - f(t_j))\left(1 + h\,\frac{\partial F}{\partial x}(t_j, d_j)\right) + h\delta_j - \frac{h^2}{2} f''(c_j).
\end{align*}
Next, we take absolute values and apply the bounds given in the statement of the lemma:
\[ |x_{j+1} - f(t_{j+1})| \le |x_j - f(t_j)|(1 + C_1 h) + h\delta + \frac{C_2 h^2}{2}. \]
Finally, we apply the inductive hypothesis to conclude that
\begin{align*}
|x_{j+1} - f(t_{j+1})| &\le \left(\frac{hC_2}{2C_1} + \frac{\delta}{C_1}\right)\left((1 + C_1 h)^j - 1\right)(1 + C_1 h) + h\delta + \frac{C_2 h^2}{2} \\
&= \left(\frac{hC_2}{2C_1} + \frac{\delta}{C_1}\right)\left((1 + C_1 h)^{j+1} - 1\right),
\end{align*}
as required.
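The last equality in the induction step is a short piece of algebra that is easy to miss. One way to see it, writing $A = \frac{hC_2}{2C_1} + \frac{\delta}{C_1}$ as a shorthand introduced here only for this check, is to note that $A \cdot C_1 h = \frac{C_2 h^2}{2} + h\delta$, so that

\[
A\left((1 + C_1 h)^j - 1\right)(1 + C_1 h) + h\delta + \frac{C_2 h^2}{2}
= A(1 + C_1 h)^{j+1} - A(1 + C_1 h) + A\,C_1 h
= A\left((1 + C_1 h)^{j+1} - 1\right).
\]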
3. PROOF OF THEOREM 1. To complete the proof of Theorem 1, we return to our proof sketch for Lemma 4. Unfortunately, nailing down the details of this proof sketch is not easy. Nevertheless, in this section we show that, with some care, a proof based on these ideas can be carried out.
Fix $\varepsilon > 0$. We will refer to the region $f_x(t) - \varepsilon < x < f_x(t) + \varepsilon$ in the $tx$-plane as the $\varepsilon$-corridor. To prove Lemma 4, we must show that for large $n$, an n-walk is likely to stay entirely inside the $\varepsilon$-corridor. We first determine simple bounds on any n-walk. At step $i$ of the walk, by (5) we have
\[ x_i \ge 0, \qquad 2 - 2x_i - t_i = y_i \ge 0, \]
and therefore
\[ 0 \le x_i \le \frac{2 - t_i}{2}. \tag{9} \]
Similar bounds apply to the graph of $f_x$: for $0 \le t \le 2$,
\[ 0 \le f_x(t) \le 1, \qquad 2 - 2 f_x(t) - t = f_y(t) = -f_x(t) \ln(f_x(t)) \ge 0, \]
so
\[ 0 \le f_x(t) \le \frac{2 - t}{2}. \tag{10} \]
These simple bounds already imply that the end of the n-walk stays inside the $\varepsilon$-corridor: if $t_i > 2 - 2\varepsilon$, then
\[ 0 \le x_i,\ f_x(t_i) \le \frac{2 - t_i}{2} < \varepsilon, \]
and therefore
\[ |x_i - f_x(t_i)| < \varepsilon. \]
Thus, we only need to worry about $t_i$ in the interval $[0, 2 - 2\varepsilon]$. In particular, if $\varepsilon > 1$, then there is nothing more to prove, so we can assume now that $\varepsilon \le 1$. By stopping short of $t = 2$, we avoid having to deal with the point $(t, x, y) = (2, 0, 0)$ on the limit curve, where the right-hand sides of the equations in (1) are undefined.
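We will find it convenient to go a bit beyond $t = 2 - 2\varepsilon$, so we define
\[ D = \left\{(t, x) \in \mathbb{R}^2 : 0 \le t \le 2 - \varepsilon \text{ and } 0 \le x \le \frac{2 - t}{2}\right\}, \]
and for $(t, x) \in D$ we let
\[ F(t, x) = -\frac{x}{2 - x - t}. \]
Notice that for $(t, x) \in D$,
\[ 2 - x - t \ge 2 - \frac{2 - t}{2} - t = \frac{2 - t}{2} > 0, \tag{11} \]
so $F(t, x)$ is defined.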
By (9) and (10), any n-walk and the curve $x = f_x(t)$ both stay in the region $D$ up to time $t = 2 - \varepsilon$, and by (7), if $0 \le t \le 2 - \varepsilon$, then $f_x'(t) = F(t, f_x(t))$. Thus, it makes sense to apply Lemma 5 to the functions $F$ and $f_x$ on the region $D$. In preparation for this, we make some observations about these functions. We first note that by (11) and the definition of $D$, for $(t, x) \in D$ we have
\[ 2 - x - t \ge \frac{2 - t}{2} \ge x \ge 0. \]
Since $F(t, x) = -x/(2 - x - t)$, it follows that
\[ -1 \le F(t, x) \le 0, \tag{12} \]
and therefore
\[ |f_x'(t)| = |F(t, f_x(t))| \le 1. \tag{13} \]
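Next, we compute
\[ \frac{\partial F}{\partial x}(t, x) = -\frac{2 - t}{(2 - x - t)^2}, \qquad f_x''(t) = \frac{f_x(t)^2}{(2 - f_x(t) - t)^3} = \frac{(F(t, f_x(t)))^2}{2 - f_x(t) - t}. \]
Thus, if $(t, x) \in D$, then by (11),
\[ \left|\frac{\partial F}{\partial x}(t, x)\right| = \frac{2 - t}{(2 - x - t)^2} \le \frac{2 - t}{((2 - t)/2)^2} = \frac{4}{2 - t} \le \frac{4}{\varepsilon}. \]
Similarly, if $0 \le t \le 2 - \varepsilon$, then
\[ |f_x''(t)| = \frac{(F(t, f_x(t)))^2}{2 - f_x(t) - t} \le \frac{1}{2 - f_x(t) - t} \le \frac{1}{(2 - t)/2} = \frac{2}{2 - t} \le \frac{2}{\varepsilon}. \]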
We can therefore use $C_1 = 4/\varepsilon$ and $C_2 = 2/\varepsilon$ in Lemma 5. For reasons that will become clear later, the value we will use for $\delta$ in Lemma 5 is
\[ \delta = \frac{\varepsilon C_1}{6(e^{2C_1} - 1)}. \tag{14} \]
Since the function $F(t, x)$ is uniformly continuous on $D$, we can choose some $\gamma > 0$ such that for any two points $(t_1, x_1), (t_2, x_2) \in D$,
\[ \text{if } |t_1 - t_2| < \gamma \text{ and } |x_1 - x_2| < \gamma, \text{ then } |F(t_1, x_1) - F(t_2, x_2)| < \frac{\delta}{4}. \tag{15} \]
We now choose a positive integer $m$ large enough that
\[ \frac{2}{m} < \frac{\varepsilon}{3}, \qquad \frac{2}{m} < \gamma, \qquad \frac{e^{2C_1} - 1}{2m} < \frac{\varepsilon}{6}. \tag{16} \]
Again, the reason for this choice will become clear later.
Consider an n-walk for any $n \ge m^2$. As in the statement of Lemma 4, let the x-coordinates of the points on the walk be $x_0 = 1, x_1, \ldots, x_{2n} = 0$, and for $0 \le i \le 2n$ let $t_i = i/n$. We now divide $2n$ by $m$, getting a quotient $q$ and remainder $r$. In other words,
\[ 2n = mq + r \]
and $0 \le r < m$. Notice that since $n \ge m^2$, we have $q \ge 2m$. We think of the walk as consisting of $m$ blocks of steps, with each block containing $q$ steps, followed by $r$ extra steps at the end. For $0 \le j \le m$, let $(T_j, X_j)$ be the position of the walk after $j$ blocks of steps have been traversed. Thus, $T_j = t_{jq} = jq/n$ and $X_j = x_{jq}$.
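A small numerical instance may make the block decomposition concrete. The sketch below uses hypothetical sizes $n = 137$ and $m = 10$, chosen only so that $n \ge m^2$; the function name is likewise illustrative.

```python
def block_decomposition(n, m):
    """Split the 2n steps of an n-walk into m blocks of q steps plus r extras."""
    q, r = divmod(2 * n, m)
    block_times = [j * q / n for j in range(m + 1)]   # T_j = jq/n
    return q, r, block_times

n, m = 137, 10              # hypothetical sizes with n >= m**2
q, r, T = block_decomposition(n, m)
print(q, r)                 # quotient and remainder of 2n divided by m
print(T[1] - T[0], 2 / m)   # block length in time, compared with 2/m
print(T[-1])                # time reached after the m full blocks
```

In this instance the $m$ full blocks stop slightly before $t = 2$, and the remaining $r$ steps cover the rest of the walk.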
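Let $h = q/n$, so that for $0 \le j < m$,
\[ T_{j+1} - T_j = h, \]
and note that since $x$ either remains fixed or decreases by $1/n$ in each step of the walk,
\[ 0 \le X_j - X_{j+1} \le \frac{q}{n} = h. \]
Applying (16), we see that
\[ h = \frac{2q}{2n} = \frac{2q}{mq + r} \le \frac{2q}{mq} = \frac{2}{m} < \frac{\varepsilon}{3}, \]
so
\[ |T_{j+1} - T_j| \le \frac{2}{m} < \frac{\varepsilon}{3}, \qquad |X_{j+1} - X_j| \le \frac{2}{m} < \frac{\varepsilon}{3}. \tag{17} \]
In other words, in the course of a single block of steps, $x$ and $t$ change by less than $\varepsilon/3$.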
For $0 \le j < m$, let
\[ \delta_j = \frac{X_{j+1} - X_j}{h} - F(T_j, X_j). \]
Rearranging this definition, this means that
\[ X_{j+1} = X_j + h(F(T_j, X_j) + \delta_j). \]
Of course, this is the recurrence in our modified version of Euler's method.
We would now like to apply Lemma 5, but we have no guarantee that $\delta$ will be a bound on the numbers $|\delta_j|$. However, we can show that if $\delta$ is such a bound, then the walk stays in the $\varepsilon$-corridor:
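Claim. Suppose that for all $j < m$, if $T_j \le 2 - 2\varepsilon$, then $|\delta_j| \le \delta$. Then the n-walk stays inside the $\varepsilon$-corridor.
Proof of Claim. Notice that since $q \ge 2m$ and $2/m < \varepsilon/3$,
\[ T_m = t_{mq} = \frac{mq}{n} = \frac{2mq}{2n} = \frac{2mq}{mq + r} > \frac{2mq}{m(q + 1)} = 2 - \frac{2}{q + 1} > 2 - \frac{2}{2m} > 2 - \frac{\varepsilon}{6} > 2 - 2\varepsilon. \]
Thus, we can let $k$ be the least index such that $T_k > 2 - 2\varepsilon$. Then for all $j < k$, $T_j \le 2 - 2\varepsilon$, and therefore, by assumption, $|\delta_j| \le \delta$. And since $T_{k-1} \le 2 - 2\varepsilon$, by (17) we have
\[ T_k < T_{k-1} + \frac{\varepsilon}{3} \le 2 - 2\varepsilon + \frac{\varepsilon}{3} < 2 - \varepsilon. \]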
We can therefore apply Lemma 5 to the points $(T_j, X_j)$ for $0 \le j \le k$ and the functions $F$ and $f_x$ on the region $D$ to conclude that for all such $j$,
\[ |X_j - f_x(T_j)| \le \left(\frac{hC_2}{2C_1} + \frac{\delta}{C_1}\right)\left((1 + C_1 h)^j - 1\right). \]
Since $j \le k \le m$ and $h \le 2/m$,
\[ (1 + C_1 h)^j \le \left(1 + \frac{2C_1}{m}\right)^m < e^{2C_1}, \]
where the last inequality is well known (see, for example, inequality 4.5.13 in [7]). Therefore,
\[ |X_j - f_x(T_j)| < \left(\frac{(2/m)(2/\varepsilon)}{2(4/\varepsilon)} + \frac{\delta}{C_1}\right)(e^{2C_1} - 1) = \frac{e^{2C_1} - 1}{2m} + \frac{\delta(e^{2C_1} - 1)}{C_1}. \]
By (16) and (14), the last two fractions are both at most $\varepsilon/6$. Thus, we have shown that
\[ |X_j - f_x(T_j)| < \frac{\varepsilon}{3}. \tag{18} \]
This implies that all of the points $(T_j, X_j)$ for $0 \le j \le k$ are in the $\varepsilon$-corridor.
Since $T_k > 2 - 2\varepsilon$, as we observed after (10), all points on the n-walk beyond $(T_k, X_k)$ are also in the $\varepsilon$-corridor. We still need to worry about points on the n-walk in the interiors of the first $k$ blocks. If $(t, x)$ is such a point, then $(t, x)$ occurs between $(T_j, X_j)$ and $(T_{j+1}, X_{j+1})$, for some $j < k$. To see that $(t, x)$ is in the $\varepsilon$-corridor, we compute
\[ |x - f_x(t)| \le |x - X_j| + |X_j - f_x(T_j)| + |f_x(T_j) - f_x(t)|. \]
We now bound each of the terms on the right-hand side. We already know, by (17) and (18), that $|x - X_j| \le |X_{j+1} - X_j| < \varepsilon/3$ and $|X_j - f_x(T_j)| < \varepsilon/3$. For the third term we apply the mean value theorem:
\[ f_x(T_j) - f_x(t) = f_x'(c)(T_j - t), \]
for some $c$ between $t$ and $T_j$. By (13) and (17), we conclude that
\[ |f_x(T_j) - f_x(t)| = |f_x'(c)| \cdot |T_j - t| \le |f_x'(c)| \cdot |T_{j+1} - T_j| < 1 \cdot \frac{\varepsilon}{3} = \frac{\varepsilon}{3}. \]
Putting it all together, we get
\[ |x - f_x(t)| \le |x - X_j| + |X_j - f_x(T_j)| + |f_x(T_j) - f_x(t)| < \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon, \]
so the point $(t, x)$ is in the $\varepsilon$-corridor. We have now shown that all points on the walk are in the $\varepsilon$-corridor, which completes the proof of the claim.
The claim shows that if an n-walk goes outside of the $\varepsilon$-corridor, then there must be some $j < m$ such that $T_j \le 2 - 2\varepsilon$ and $|\delta_j| > \delta$. To complete the proof, we will show that this is unlikely to happen.
Partition $\{(t, x) \in D : t \le 2 - 2\varepsilon\}$ into finitely many disjoint regions $R_1, R_2, \ldots, R_K$, each with diameter less than $\gamma$. By (12) and (15), for each $k$ with $1 \le k \le K$ we can choose a number $r_k$ such that $-1 \le r_k \le 0$ and for every $(t, x) \in R_k$,
\[ |F(t, x) - r_k| < \frac{\delta}{4}. \tag{19} \]
For example, we can take $r_k$ to be $F(t, x)$ for some particular $(t, x) \in R_k$. Notice that the regions $R_k$ and numbers $r_k$ do not depend on $n$; as $n \to \infty$, $R_k$ and $r_k$ will remain fixed.
We will write $\Pr_n(E)$ to denote the probability that an event $E$ occurs when an n-walk takes place. The claim implies that the probability that an n-walk will leave the $\varepsilon$-corridor is at most
\[ \sum_{j=0}^{m-1} \sum_{k=1}^{K} p_{j,k}(n), \]
where
\[ p_{j,k}(n) = \Pr\nolimits_n\left((T_j, X_j) \in R_k \text{ and } |\delta_j| > \delta\right). \]
Thus, it will suffice to show that for each $j$ and $k$, $\lim_{n \to \infty} p_{j,k}(n) = 0$.
Fix $j$ and $k$ with $0 \le j < m$ and $1 \le k \le K$. The value of $\delta_j$ is determined by the block of steps taken by the n-walk in going from $(T_j, X_j)$ to $(T_{j+1}, X_{j+1})$. The points on this part of the walk are $(t_{jq+i}, x_{jq+i})$ for $0 \le i \le q$. We will refer to the step from $(t_{jq+i}, x_{jq+i})$ to $(t_{jq+i+1}, x_{jq+i+1})$ as step $i$ of this block of the n-walk. Notice that there are $q$ steps in the block, and since $q$ is the quotient when $2n$ is divided by $m$ and $m$ is fixed, $q \to \infty$ when $n \to \infty$.
Let $a$ be the number of steps in the block in which $x$ decreases by $1/n$. In the remaining $q - a$ steps, the value of $x$ does not change, so $X_j - X_{j+1} = a/n$. Therefore, by definition,
\[ \delta_j = \frac{X_{j+1} - X_j}{h} - F(T_j, X_j) = \frac{-a/n}{q/n} - F(T_j, X_j) = -\frac{a}{q} - F(T_j, X_j). \]
Although the value of $p_{j,k}(n)$ does not depend on the precise method by which the steps in this block of the walk are chosen, it will be helpful to specify a method. We will assume that for $0 \le i < q$, random numbers $s_i$ are chosen, independently and uniformly in $[0, 1]$, and then in step $i$, $x$ decreases by $1/n$ if
\[ s_i < \frac{x_{jq+i}}{2 - x_{jq+i} - t_{jq+i}} = -F(t_{jq+i}, x_{jq+i}), \]
and $x$ is unchanged otherwise. Of course, according to equation (6), this procedure generates the correct probabilities for the steps of the walk.
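The same step rule can be used to simulate the x-coordinates of an entire n-walk and compare them with $f_x$. The sketch below is an illustration of Lemma 4 on one seeded run, not evidence of any rate of convergence. It tracks the integer count $w = n x_i$, for which the threshold $x_i/(2 - x_i - t_i)$ equals $w/(2n - w - i)$, and it computes $f_x$ by bisection on $t = x \ln x - 2x + 2$; the value $n = 5000$ and the helper names are assumptions.

```python
import math
import random

def g(x):
    """t as a function of x along the limit curve (0 ln 0 taken as 0)."""
    return 2.0 if x == 0 else x * math.log(x) - 2 * x + 2

def f_x(t, tol=1e-10):
    """Invert the strictly decreasing function g on [0, 1] by bisection."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(mid) > t:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def x_walk(n, rng):
    """x-coordinates x_0, ..., x_{2n} of an n-walk, driven by uniform s_i."""
    w = n                               # whole pills remaining, so x_i = w / n
    xs = [1.0]
    for i in range(2 * n):
        s = rng.random()
        if s < w / (2 * n - w - i):     # same as s < x_i / (2 - x_i - t_i)
            w -= 1
        xs.append(w / n)
    return xs

n = 5000
xs = x_walk(n, random.Random(1))
max_dev = max(abs(x - f_x(i / n)) for i, x in enumerate(xs))
print(max_dev)
```

Working with the integer $w$ avoids accumulating floating-point error in $x_i$ over the $2n$ steps.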
Suppose that $(T_j, X_j) \in R_k$. Then by (19), $|F(T_j, X_j) - r_k| < \delta/4$, or in other words
\[ r_k - \frac{\delta}{4} < F(T_j, X_j) < r_k + \frac{\delta}{4}. \tag{20} \]
Also, for $0 \le i < q$, by (17) and (16), $|t_{jq+i} - T_j| \le 2/m$, $|x_{jq+i} - X_j| \le 2/m$, $2/m < \varepsilon/3$, and $2/m < \gamma$. Since $t_{jq+i} \le T_j + 2/m < 2 - 2\varepsilon + \varepsilon/3 < 2 - \varepsilon$, we have $(t_{jq+i}, x_{jq+i}) \in D$, and therefore, by (15), $|F(t_{jq+i}, x_{jq+i}) - F(T_j, X_j)| < \delta/4$. Combining this with $|F(T_j, X_j) - r_k| < \delta/4$, we conclude that $|F(t_{jq+i}, x_{jq+i}) - r_k| < \delta/2$, or in other words
\[ r_k - \frac{\delta}{2} < F(t_{jq+i}, x_{jq+i}) < r_k + \frac{\delta}{2}. \]
Recall that step $i$ is determined by how $s_i$ compares to $-F(t_{jq+i}, x_{jq+i})$. We can now draw the conclusion that if $(T_j, X_j) \in R_k$, then:
(a) if $s_i \le -r_k - \frac{\delta}{2}$, then at step $i$, $x$ decreases by $\frac{1}{n}$;
(b) if $s_i \ge -r_k + \frac{\delta}{2}$, then at step $i$, $x$ remains unchanged.
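We are now ready to show that $\lim_{n \to \infty} p_{j,k}(n) = 0$. By definition,
\[ p_{j,k}(n) = \Pr\nolimits_n\left((T_j, X_j) \in R_k \text{ and } \delta_j > \delta\right) + \Pr\nolimits_n\left((T_j, X_j) \in R_k \text{ and } \delta_j < -\delta\right). \]
We will show that both of the probabilities on the right-hand side approach 0 as $n \to \infty$.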
For the first, suppose that $(T_j, X_j) \in R_k$ and $\delta_j > \delta$. Since $\delta_j = -a/q - F(T_j, X_j)$, by (20) this implies that
\[ \frac{a}{q} < -F(T_j, X_j) - \delta < -r_k - \frac{3\delta}{4}. \]
Now let $a'$ be the number of values of $i$ for which $s_i \le -r_k - \delta/2$. By conclusion (a) above, $a' \le a$, and therefore
\[ 0 \le \frac{a'}{q} \le \frac{a}{q} < -r_k - \frac{3\delta}{4} < -r_k - \frac{\delta}{2} < 1. \]
This is very unlikely to happen. To see why, notice first that for $0 \le i < q$, since $s_i$ is chosen uniformly in $[0, 1]$ and $0 < -r_k - \delta/2 < 1$, the probability that $s_i \le -r_k - \delta/2$ is $-r_k - \delta/2$. And since the $s_i$ are chosen independently, this means that $a'/q$, which is the fraction of values of $i$ for which $s_i \le -r_k - \delta/2$, should be close to $-r_k - \delta/2$. More precisely, by the law of large numbers (see [2, Section VI.4, p. 152]), for any $\eta > 0$, the probability that $|a'/q - (-r_k - \delta/2)| > \eta$ must approach 0 as $q \to \infty$. And since $q \to \infty$ as $n \to \infty$, taking $\eta = \delta/4$ we can conclude that
\[ \lim_{n \to \infty} \Pr\nolimits_n\left(\frac{a'}{q} < -r_k - \frac{3\delta}{4}\right) = 0. \]
It follows that
\[ \lim_{n \to \infty} \Pr\nolimits_n\left((T_j, X_j) \in R_k \text{ and } \delta_j > \delta\right) = 0. \]
The second probability is similar. If $(T_j, X_j) \in R_k$ and $\delta_j < -\delta$, then
\[ \frac{a}{q} > -F(T_j, X_j) + \delta > -r_k + \frac{3\delta}{4}. \]
Now let $a''$ be the number of values of $i$ for which $s_i < -r_k + \delta/2$. This time we use fact (b) above to conclude that $a'' \ge a$, so
\[ 1 \ge \frac{a''}{q} \ge \frac{a}{q} > -r_k + \frac{3\delta}{4} > -r_k + \frac{\delta}{2} > 0. \]
Once again, the law of large numbers says that the probability of this event goes to 0 as $n \to \infty$, which completes the proof of Lemma 4 and, therefore, Theorem 1.
4. PROOFS OF THEOREMS 2 AND 3. To prove Theorem 2, fix $n > 0$, and let
$A$ denote the number of whole pills removed from the bottle before the first half pill.
Of course, the first pill removed from the bottle must be a whole pill, and there are $n$
whole pills altogether, so $1 \le A \le n$.
For $1 \le k \le n$, let $X_k = 1$ if the first $k$ pills removed from the bottle are all whole
pills, and $X_k = 0$ otherwise. Then we have $A = X_1 + X_2 + \cdots + X_n$, and therefore
\[ E(A) = E(X_1 + X_2 + \cdots + X_n) = E(X_1) + E(X_2) + \cdots + E(X_n). \]
The probability that the first pill removed is a whole pill is 1. Once the first whole
pill has been removed, the bottle contains $n - 1$ whole pills and 1 half pill, so the
probability that the second pill is also a whole pill is $(n - 1)/n$. Similarly, if the first
two pills are whole pills, then the probability that the third pill is a whole pill is
$(n - 2)/n$. Continuing in this way, we see that for $1 \le k \le n$,
\[ E(X_k) = \Pr(X_k = 1) = 1 \cdot \frac{n-1}{n} \cdot \frac{n-2}{n} \cdots \frac{n-k+1}{n} = \frac{n!}{n^k (n-k)!}. \]
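Thus,
\[ E(A) = \sum_{k=1}^{n} E(X_k) = \sum_{k=1}^{n} \frac{n!}{n^k (n-k)!}. \]
Reindexing by $j = n - k$, we get
\[ E(A) = \sum_{k=1}^{n} \frac{n!}{n^k (n-k)!} = \sum_{j=0}^{n-1} \frac{n!}{n^{n-j}\, j!} = \frac{n!}{n^n} \sum_{j=0}^{n-1} \frac{n^j}{j!}. \tag{21} \]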
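Formula (21) can be checked against the definition of $A$ for small $n$. The sketch below is only an illustration: the function names are hypothetical, and exact rational arithmetic is used so that the two computations can be compared without rounding.

```python
from fractions import Fraction
from math import factorial


def expected_run_formula(n):
    """E(A) from formula (21): (n!/n^n) * sum_{j=0}^{n-1} n^j / j!."""
    partial_sum = sum(Fraction(n**j, factorial(j)) for j in range(n))
    return Fraction(factorial(n), n**n) * partial_sum


def expected_run_direct(n):
    """E(A) as the sum over k of Pr(the first k pills are all whole)."""
    expectation = Fraction(0)
    prob_all_whole = Fraction(1)  # Pr(X_1 = 1) = 1
    for k in range(1, n + 1):
        expectation += prob_all_whole
        # After k whole pills the bottle holds n-k whole pills and k half pills.
        prob_all_whole *= Fraction(n - k, n)
    return expectation


for n in range(1, 9):
    print(n, expected_run_formula(n), expected_run_direct(n))
```

For instance, with $n = 2$ both functions return $3/2$: the first pill is always whole and the second is whole with probability $1/2$.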
To relate this formula to the incomplete gamma function, we first evaluate the
integral in the definition of the incomplete gamma function. Applying integration by parts
$k$ times leads to the formula in the following lemma.
Lemma 6. For every integer $k \ge 0$,
\[ \int t^k e^{-t}\, dt = -\frac{k!}{e^t} \sum_{j=0}^{k} \frac{t^j}{j!} + C. \]
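Using this lemma, we find that
\[ \Gamma(n, n) = \int_n^{\infty} t^{n-1} e^{-t}\, dt = \lim_{N\to\infty} \left[ -\frac{(n-1)!}{e^t} \sum_{j=0}^{n-1} \frac{t^j}{j!} \right]_n^N = \frac{(n-1)!}{e^n} \sum_{j=0}^{n-1} \frac{n^j}{j!}. \tag{22} \]
Thus,
\[ \sum_{j=0}^{n-1} \frac{n^j}{j!} = \frac{e^n}{(n-1)!}\, \Gamma(n, n). \]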
Substituting into (21), we get
\[ E(A) = \frac{n!}{n^n} \sum_{j=0}^{n-1} \frac{n^j}{j!} = \frac{n!}{n^n} \cdot \frac{e^n}{(n-1)!}\, \Gamma(n, n) = \frac{e^n}{n^{n-1}}\, \Gamma(n, n). \]
This proves the first statement in Theorem 2.
To prove the second statement, about the asymptotic value as $n \to \infty$, we need the
following fact.
Lemma 7.
\[ \lim_{n\to\infty} \frac{\Gamma(n, n)}{(n-1)!} = \frac{1}{2}. \]
Proof. According to inequality 8.10.13 of [7],
\[ \frac{\Gamma(n, n)}{(n-1)!} < \frac{1}{2} < \frac{\Gamma(n+1, n)}{n!}. \tag{23} \]
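By Lemma 6 and equation (22),
\[ \Gamma(n+1, n) = \int_n^{\infty} t^n e^{-t}\, dt = \frac{n!}{e^n} \sum_{j=0}^{n} \frac{n^j}{j!} = n \cdot \frac{(n-1)!}{e^n} \sum_{j=0}^{n-1} \frac{n^j}{j!} + \frac{n^n}{e^n} = n\,\Gamma(n, n) + \frac{n^n}{e^n}. \]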
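Substituting into the second half of inequality (23), we get
\[ \frac{1}{2} < \frac{\Gamma(n, n)}{(n-1)!} + \frac{n^n}{e^n n!}, \]
and therefore
\[ \frac{1}{2} - \frac{n^n \sqrt{2\pi n}}{e^n n!} \cdot \frac{1}{\sqrt{2\pi n}} < \frac{\Gamma(n, n)}{(n-1)!} < \frac{1}{2}. \]
By Stirling's formula, $\lim_{n\to\infty} n^n \sqrt{2\pi n}/(e^n n!) = 1$, and the lemma now follows by
the squeeze theorem.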
This lemma allows us to determine the asymptotic rate of growth of the expected
value of $A$. The expected length of the initial run of whole pills can be rewritten in the
form
\[ E(A) = \frac{e^n}{n^{n-1}}\, \Gamma(n, n) = \sqrt{2\pi n} \cdot \frac{e^n n!}{n^n \sqrt{2\pi n}} \cdot \frac{\Gamma(n, n)}{(n-1)!} \sim \sqrt{2\pi n} \cdot 1 \cdot \frac{1}{2} = \sqrt{\frac{\pi n}{2}}, \]
which completes the proof of Theorem 2.
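The asymptotic statement can be illustrated numerically. The following is a minimal sketch, assuming floating-point accuracy is sufficient; the function name is hypothetical. It evaluates $E(A)$ as the sum of the probabilities $\Pr(X_k = 1)$ and compares it with $\sqrt{\pi n/2}$.

```python
import math


def expected_initial_run(n):
    """E(A) = sum over k of Pr(first k pills are whole), in floating point."""
    expectation = 0.0
    prob_all_whole = 1.0
    for k in range(1, n + 1):
        expectation += prob_all_whole
        prob_all_whole *= (n - k) / n  # chance that pill k+1 is also whole
    return expectation


for n in (10, 100, 1000, 10000):
    exact = expected_initial_run(n)
    approx = math.sqrt(math.pi * n / 2)
    print(n, round(exact, 4), round(approx, 4), round(exact / approx, 4))
```

The printed ratio should drift toward 1 as $n$ grows, which is all that the asymptotic equivalence claims; it says nothing about the size of the difference.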
Finally, we give Stong's proof of Theorem 3. For $1 \le k \le n$, consider the $k$th whole
pill that is removed from the bottle. This pill is cut in half, and half of it is returned
to the bottle; we will refer to this half pill as the $k$th half pill. Let $X_k = 1$ if the $k$th
half pill is removed from the bottle after the last whole pill is removed, and $X_k = 0$
otherwise. Then the expected value we seek is
\[ E(X_1 + X_2 + \cdots + X_n) = E(X_1) + E(X_2) + \cdots + E(X_n). \]
After the $k$th half pill has been returned to the bottle, there are $n - k$ whole pills
still in the bottle, and we have $X_k = 1$ if and only if among the set of pills consisting
of these $n - k$ remaining whole pills and the $k$th half pill, the half pill is the last one to
be removed from the bottle. Since each pill in this set is equally likely to be chosen at
each step, we have
\[ E(X_k) = \Pr\nolimits_n(X_k = 1) = \frac{1}{n - k + 1}. \]
Therefore the expected number of half pills removed from the bottle after the last
whole pill is
\[ E(X_1) + E(X_2) + \cdots + E(X_n) = \frac{1}{n} + \frac{1}{n-1} + \cdots + 1 = H_n. \]
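The value $H_n$ can be confirmed for small $n$ without simulation by following the bottle through all of its states. The sketch below is an illustration, not part of the proof; the function name and the state encoding (number of whole pills, number of half pills) are assumptions made here.

```python
from fractions import Fraction
from functools import lru_cache


def expected_trailing_halves(n):
    """Expected number of half pills left when the last whole pill is removed,
    assuming every pill in the bottle is equally likely to be chosen."""

    @lru_cache(maxsize=None)
    def expect(whole, half):
        if whole == 0:
            return Fraction(half)  # only half pills remain; all are removed later
        total = whole + half
        # A whole pill is chosen: it is cut, and one half goes back in the bottle.
        value = Fraction(whole, total) * expect(whole - 1, half + 1)
        if half > 0:
            # A half pill is chosen and simply removed.
            value += Fraction(half, total) * expect(whole, half - 1)
        return value

    return expect(n, 0)


def harmonic(n):
    return sum(Fraction(1, k) for k in range(1, n + 1))


for n in range(1, 7):
    print(n, expected_trailing_halves(n), harmonic(n))
```

For $n = 3$ both columns show $11/6$.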
5. VARIATIONS. In all of our calculations, we have assumed that when a pill is
removed from the bottle, all pills in the bottle are equally likely to be chosen. But
since the whole pills are twice as big as the half pills, another natural assumption
would be that whole pills are twice as likely to be chosen as half pills. In this section
we summarize the results of redoing our calculations with this alternative assumption,
leaving the details to the reader.
If whole pills are twice as likely to be chosen as half pills, then the differential
equations (1) must be replaced by
\[ \frac{dx}{dt} = -\frac{2x}{2x + y}, \qquad \frac{dy}{dt} = \frac{2x - y}{2x + y}. \]
The solution to this system of equations that passes through the point $(1, 0)$ is
\[ y = 2(\sqrt{x} - x), \qquad x = \frac{(2 - t)^2}{4}, \qquad y = \frac{t(2 - t)}{2}. \]
Once again, the random walk converges uniformly in probability to this curve as
$n \to \infty$.
Surprisingly, in this case the expected number of whole pills removed before the
first half pill turns out to be exactly the same as the expected number of half pills
removed after the last whole pill. Calculations similar to those in the last section show
that both expected values are
\[ \frac{2^{2n}}{\binom{2n}{n}} - 1. \]
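Both expected values can be recomputed exactly for small $n$ under the alternative assumption. This is a hedged sketch: the function names are hypothetical, and the only modeling assumption is the one stated in the text, namely that each whole pill carries weight 2 and each half pill weight 1 when a pill is drawn.

```python
from fractions import Fraction
from functools import lru_cache
from math import comb


def leading_whole_run(n):
    """Expected number of whole pills removed before the first half pill."""
    expectation, prob_all_whole = Fraction(0), Fraction(1)
    for k in range(1, n + 1):
        expectation += prob_all_whole
        # After k whole pills: n-k whole pills (weight 2) and k half pills (weight 1).
        prob_all_whole *= Fraction(2 * (n - k), 2 * (n - k) + k)
    return expectation


def trailing_halves(n):
    """Expected number of half pills removed after the last whole pill."""

    @lru_cache(maxsize=None)
    def expect(whole, half):
        if whole == 0:
            return Fraction(half)
        total = 2 * whole + half
        value = Fraction(2 * whole, total) * expect(whole - 1, half + 1)
        if half > 0:
            value += Fraction(half, total) * expect(whole, half - 1)
        return value

    return expect(n, 0)


def closed_form(n):
    return Fraction(4**n, comb(2 * n, n)) - 1


for n in range(1, 7):
    print(n, leading_whole_run(n), trailing_halves(n), closed_form(n))
```

For $n = 3$ all three columns equal $11/5$.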
There is a simple explanation for why these two expected values are equal. The
explanation is based on an alternative procedure we could follow to decide which pill
to remove from the bottle each day. First, number the pills in a full bottle from 1 to $n$.
Then make a deck of $2n$ cards numbered from 1 to $n$, with each number appearing on
two cards, and shuffle the deck. Every day, deal a card from the top of the deck, and if
the card has the number $k$ on it, then remove pill number $k$ from the bottle. As usual,
if the pill is whole, then cut it in half and return half to the bottle.
On any day, if pill number $k$ is still whole, then there will be two cards numbered $k$
in the deck; if half of pill number $k$ has already been taken, then there will be only one
card numbered $k$ in the deck; and if pill number $k$ has been used up completely, then
there will be no cards numbered $k$ left in the deck. It follows that whole pills will be
twice as likely to be chosen as half pills, as required.
If we follow this procedure, then the number of whole pills removed from the bottle
before the first half pill is removed will be the same as the number of distinct cards
dealt from the top of the deck before the first duplicate card. Similarly, we could
determine how many half pills will be removed from the bottle after the last whole pill by
dealing cards from the bottom of the deck and counting the number of distinct cards
dealt before the first duplicate. It should now be clear by symmetry that the expected
values of these two numbers are equal. Indeed, the problem of computing this
common expected value is equivalent to the third question addressed in [9], and the answer
follows from Theorem 5 of [9].
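The card-deck description lends itself to a brute-force check for very small decks. The sketch below enumerates every ordering of the $2n$ cards (feasible only for small $n$) and averages the number of distinct cards seen before the first duplicate, dealing once from the top and once from the bottom. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def distinct_before_duplicate(cards):
    """Count the distinct cards dealt before the first repeated number."""
    seen = set()
    for card in cards:
        if card in seen:
            break
        seen.add(card)
    return len(seen)


def deck_averages(n):
    """Average count from the top and from the bottom over all (2n)! orderings."""
    deck = [k for k in range(1, n + 1) for _ in range(2)]
    top_total = bottom_total = orderings = 0
    for order in permutations(deck):
        top_total += distinct_before_duplicate(order)
        bottom_total += distinct_before_duplicate(reversed(order))
        orderings += 1
    return Fraction(top_total, orderings), Fraction(bottom_total, orderings)


print(deck_averages(2))
print(deck_averages(3))
```

Because `permutations` treats the two cards with the same number as distinguishable, every shuffle of the physical deck is counted equally often, so the averages are the expected values for a uniformly shuffled deck.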
ACKNOWLEDGMENTS. I would like to thank Richard Stong, Greg Warrington, Rob Benedetto, Tanya
Leise, Amy Wagaman, and the anonymous referees for helpful conversations and suggestions. Natasha would
like to thank Dr. Michael Katz, D.V.M.
REFERENCES
1. R. M. Corless, G. H. Gonnet, D. E. G. Hare, D. J. Jeffrey, D. E. Knuth, On the Lambert W function, Adv.
Comput. Math. 5 (1996) 329-359.
2. W. Feller, An Introduction to Probability Theory and its Applications. Vol. I. Third edition, Wiley, New
York, 1968.
3. L. Holst, On birthday, collectors', occupancy and other classical urn problems, Int. Stat. Review 54 (1986)
15-27.
4. M. S. Klamkin, D. J. Newman, Extensions of the birthday surprise, J. Comb. Theory 3 (1967) 279-282.
5. G. F. Lawler, V. Limic, Random Walk: A Modern Introduction. Cambridge Studies in Advanced
Mathematics. Vol. 123. Cambridge University Press, Cambridge, 2010.
6. B. McCabe, Matching balls drawn from an urn, Problem E 2263, Solutions by B. C. Arnold and R. J.
Dickson, Amer. Math. Monthly 78 (1971) 1022-1024.
7. National Institute of Standards and Technology, Digital Library of Mathematical Functions, March 23,
2012, available at http://dlmf.nist.gov/.
8. P. N. Rathie, P. Zörnig, On the birthday problem: Some generalizations and applications, Int. J. Math.
Math. Sci. 2003 (2003) 3827-3840.
9. D. J. Velleman, G. S. Warrington, What to expect in a game of memory, Amer. Math. Monthly 120 (2013)
787-805.
DANIEL J. VELLEMAN received his B.A. from Dartmouth College in 1976 and his Ph.D. from the
University of Wisconsin-Madison in 1980. He taught at the University of Texas before joining the faculty of Amherst
College in 1983. He was the editor of the American Mathematical Monthly from 2007 to 2011. In his spare
time he enjoys singing, bicycling, and playing volleyball.
Department of Mathematics, Amherst College, Amherst, MA 01002
djvelleman@amherst.edu
Analytical Solution for the Generalized Fermat-Torricelli Problem
Author(s): Alexei Yu. Uteshev
Abstract. We present an explicit analytical solution for the problem of minimization of the
function
\[ F(x, y) = \sum_{j=1}^{3} m_j \sqrt{(x - x_j)^2 + (y - y_j)^2}, \]
i.e., we find the coordinates of the stationary point and the corresponding critical value as
functions of $\{m_j, x_j, y_j\}_{j=1}^{3}$. In addition, we also discuss the inverse problem of finding such
values for $m_1$, $m_2$, and $m_3$ for which the corresponding function $F$ possesses a prescribed
position of stationary point.
1. INTRODUCTION. Consider the following problem. Given the coordinates of
three noncollinear points $P_1 = (x_1, y_1)$, $P_2 = (x_2, y_2)$, and $P_3 = (x_3, y_3)$ in the plane,
find the coordinates of the point $P_* = (x_*, y_*)$ that gives a solution to the optimization
problem
\[ \min_{(x,y)\in\mathbb{R}^2} F(x, y) \quad \text{for} \quad F(x, y) = \sum_{j=1}^{3} m_j \sqrt{(x - x_j)^2 + (y - y_j)^2}. \tag{1} \]
Here $m_1$, $m_2$, and $m_3$ are assumed to be real positive numbers and will be subsequently
referred to as weights.
The stated problem, in its particular case of equal weights $m_1 = m_2 = m_3 = 1$, has
been known since 1643 as the (classical) Fermat-Torricelli problem. It has a unique
solution that coincides either with one of the points $P_1$, $P_2$, $P_3$ or with the so-called
Fermat or Fermat-Torricelli point [2, 4] of the triangle $P_1P_2P_3$; this point makes an
angle of $2\pi/3$ with any two vertices of the triangle.
Generalization of the problem to the case of unequal weights has been investigated
since the 19th century. This generalization is known under different names: the Steiner
problem, the Weber problem, the problem of railway junction ((Germ.) Problem des
Knotenpunktes) [3, 8], the three factory problem [6]. The last two names were inspired
by a facility location problem such as the following. Let the cities $P_1$, $P_2$, and $P_3$ be
the sources of iron ore, coal, and water, respectively. To produce one ton of steel, the
steel works needs $m_1$ tons of iron, $m_2$ tons of coal, and $m_3$ tons of water. Assuming
that the freight charge for a ton-kilometer is independent of the nature of the cargo,
find the optimal position for the steel works connected with $P_1$, $P_2$, and $P_3$ via straight
roads so as to minimize the transportation costs.
In the rest of the paper, this problem will be referred to as the generalized
Fermat-Torricelli problem. Existence and uniqueness of its solution is guaranteed by the
following result [4].
MSC: Primary 51N20
Theorem 1. Denote by $\alpha_1$, $\alpha_2$, and $\alpha_3$ the corner angles of the triangle $P_1P_2P_3$. If the
conditions
\[ \begin{cases} m_1^2 < m_2^2 + m_3^2 + 2 m_2 m_3 \cos\alpha_1, \\ m_2^2 < m_1^2 + m_3^2 + 2 m_1 m_3 \cos\alpha_2, \\ m_3^2 < m_1^2 + m_2^2 + 2 m_1 m_2 \cos\alpha_3 \end{cases} \tag{2} \]
are fulfilled, then there exists a unique solution $P_* = (x_*, y_*) \in \mathbb{R}^2$ for the generalized
Fermat-Torricelli problem lying inside the triangle $P_1P_2P_3$. This point is a stationary
point for the function $F(x, y)$, i.e., a real solution of the system
\[ \sum_{j=1}^{3} \frac{m_j (x - x_j)}{\sqrt{(x - x_j)^2 + (y - y_j)^2}} = 0, \qquad \sum_{j=1}^{3} \frac{m_j (y - y_j)}{\sqrt{(x - x_j)^2 + (y - y_j)^2}} = 0. \tag{3} \]
If any of the conditions (2) are violated, then $F(x, y)$ attains its minimum value at the
corresponding vertex of the triangle.
Let us overview some approaches for finding the point $P_*$. Historically, the first
approach is geometrical: The point is found as the intersection point of a special
construction of lines or circles. For the equal weighted case, Torricelli proved that the
circles circumscribing the equilateral triangles constructed on the sides of and outside
the triangle $P_1P_2P_3$ intersect at the point $P_*$; for an alternative Simpson construction
of $P_*$, see [5]. For the general, i.e., unequal weighted case, see [3, 8].
The second approach is based on the mechanical model (sometimes incorrectly
called Pólya's mechanical model): A horizontal board is drilled with holes at the points
$P_1$, $P_2$, and $P_3$ (or at the vertices of a triangle similar to $P_1P_2P_3$). Three strings are tied
together in a knot at one end, the loose ends are passed through the holes, and are
attached to physical weights proportional to $m_1$, $m_2$, and $m_3$, respectively, below the
board. The equilibrium position of the knot yields the solution [3].
The third approach, based on the gradient descent method, originated in the paper
[11]; further developments and comments can be found in [7, 9].
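To make the third approach concrete, here is a minimal sketch of the fixed-point iteration usually called the Weiszfeld algorithm. The function name, the starting point (the weighted centroid), and the fixed iteration count are choices made for this illustration and are not taken from [11]. Each step replaces the current point by the average of the $P_j$ weighted by $m_j$ divided by the current distance to $P_j$.

```python
import math


def weiszfeld(points, weights, iterations=2000):
    """Approximate the minimizer of sum_j m_j * |P - P_j| by fixed-point iteration."""
    total = sum(weights)
    # Start from the weighted centroid of the given points.
    x = sum(m * px for m, (px, _) in zip(weights, points)) / total
    y = sum(m * py for m, (_, py) in zip(weights, points)) / total
    for _ in range(iterations):
        num_x = num_y = denom = 0.0
        for m, (px, py) in zip(weights, points):
            dist = math.hypot(x - px, y - py)
            if dist == 0.0:
                return x, y  # the iterate landed on a vertex; stop here
            num_x += m * px / dist
            num_y += m * py / dist
            denom += m / dist
        x, y = num_x / denom, num_y / denom
    return x, y


print(weiszfeld([(2, 6), (1, 1), (5, 1)], [3, 5, 4]))
```

With the data of test 2 from Table 1, the iterates approach $(751/485, 647/485) \approx (1.5484, 1.3340)$. The sketch does not handle the case in which the minimum sits at a vertex.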
The present paper is devoted to the fourth approach, the analytical one. We look
for explicit expressions for the coordinates of the stationary point $P_*$ as functions of
$\{m_j, x_j, y_j\}_{j=1}^{3}$. Although the existence of such a solution by radicals, i.e., in a finite
number of operations like standard arithmetic ones and extraction of (positive integer)
roots, is not questioned in any review article on the problem, we failed to find in the
literature the constructive and universal version of an algorithm even for the classical,
i.e., equal weighted, case.
2. ALGEBRA.
Theorem 2. Under the conditions (2), the coordinates of the stationary point $(x_*, y_*)$
of the function $F(x, y)$ are as follows:
\[ x_* = \frac{K_1 K_2 K_3}{4 |S| \sigma d} \left( \frac{x_1}{K_1} + \frac{x_2}{K_2} + \frac{x_3}{K_3} \right), \qquad y_* = \frac{K_1 K_2 K_3}{4 |S| \sigma d} \left( \frac{y_1}{K_1} + \frac{y_2}{K_2} + \frac{y_3}{K_3} \right) \tag{4} \]
with
\[ F(x_*, y_*) = \min_{(x,y)\in\mathbb{R}^2} F(x, y) = \sqrt{d}. \]
Here
\[ d = \frac{1}{2\sigma} \left( m_1^2 K_1 + m_2^2 K_2 + m_3^2 K_3 \right), \tag{5} \]
or alternatively,
\[ d = 2 |S| \sigma + \frac{1}{2} \left[ m_1^2 (r_{12}^2 + r_{13}^2 - r_{23}^2) + m_2^2 (r_{23}^2 + r_{12}^2 - r_{13}^2) + m_3^2 (r_{13}^2 + r_{23}^2 - r_{12}^2) \right], \tag{6} \]
\[ r_{j\ell} = |P_j P_\ell| = \sqrt{(x_j - x_\ell)^2 + (y_j - y_\ell)^2} \quad \text{for } \{j, \ell\} \subset \{1, 2, 3\}, \]
\[ S = x_1 y_2 + x_2 y_3 + x_3 y_1 - x_1 y_3 - x_3 y_2 - x_2 y_1, \tag{7} \]
\[ \sigma = \frac{1}{2} \sqrt{-m_1^4 - m_2^4 - m_3^4 + 2 m_1^2 m_2^2 + 2 m_1^2 m_3^2 + 2 m_2^2 m_3^2}, \tag{8} \]
and
\[ \begin{cases} K_1 = (r_{12}^2 + r_{13}^2 - r_{23}^2)\,\sigma + (m_2^2 + m_3^2 - m_1^2)\,|S|, \\ K_2 = (r_{23}^2 + r_{12}^2 - r_{13}^2)\,\sigma + (m_1^2 + m_3^2 - m_2^2)\,|S|, \\ K_3 = (r_{13}^2 + r_{23}^2 - r_{12}^2)\,\sigma + (m_1^2 + m_2^2 - m_3^2)\,|S|. \end{cases} \tag{9} \]
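The formulas of Theorem 2 translate directly into a few lines of code. The sketch below is an illustration under the assumption that the conditions (2) hold, so that the minimum is an interior point and no $K_j$ vanishes; the function name is hypothetical and no check of (2) is performed.

```python
import math


def fermat_torricelli(points, weights):
    """Return ((x*, y*), sqrt(d)) from formulas (4)-(9); assumes conditions (2) hold."""
    (x1, y1), (x2, y2), (x3, y3) = points
    m1, m2, m3 = weights
    # Squared side lengths r_12^2, r_13^2, r_23^2.
    q12 = (x1 - x2) ** 2 + (y1 - y2) ** 2
    q13 = (x1 - x3) ** 2 + (y1 - y3) ** 2
    q23 = (x2 - x3) ** 2 + (y2 - y3) ** 2
    area2 = abs(x1 * y2 + x2 * y3 + x3 * y1 - x1 * y3 - x3 * y2 - x2 * y1)  # |S|, (7)
    sigma = 0.5 * math.sqrt(
        -m1**4 - m2**4 - m3**4
        + 2 * m1**2 * m2**2 + 2 * m1**2 * m3**2 + 2 * m2**2 * m3**2
    )  # (8)
    k1 = (q12 + q13 - q23) * sigma + (m2**2 + m3**2 - m1**2) * area2  # (9)
    k2 = (q23 + q12 - q13) * sigma + (m1**2 + m3**2 - m2**2) * area2
    k3 = (q13 + q23 - q12) * sigma + (m1**2 + m2**2 - m3**2) * area2
    d = (m1**2 * k1 + m2**2 * k2 + m3**2 * k3) / (2 * sigma)  # (5)
    scale = k1 * k2 * k3 / (4 * area2 * sigma * d)
    x_star = scale * (x1 / k1 + x2 / k2 + x3 / k3)  # (4)
    y_star = scale * (y1 / k1 + y2 / k2 + y3 / k3)
    return (x_star, y_star), math.sqrt(d)


print(fermat_torricelli([(2, 6), (1, 1), (5, 1)], [3, 5, 4]))
```

For the triangle $(2,6)$, $(1,1)$, $(5,1)$ with weights $3, 5, 4$ one finds $|S| = 20$, $\sigma = 12$, $K_1 = 1168$, $K_2 = 96$, $K_3 = 648$, and $d = 970$, so the output agrees with test 2 of Table 1.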
Proof. First, we establish the validity of the equality
\[ K_1 K_2 + K_1 K_3 + K_2 K_3 = 4 |S| \sigma d, \tag{10} \]
and the dual equality
\[ r_{23}^2 K_1 + r_{13}^2 K_2 + r_{12}^2 K_3 = 2 |S| d \tag{11} \]
for (5). Second, let us deduce the following relationships
\[ \sqrt{(x_* - x_j)^2 + (y_* - y_j)^2} = \frac{m_j K_j}{2 \sigma \sqrt{d}} \quad \text{for } j \in \{1, 2, 3\}. \tag{12} \]
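Here is the proof for the case $j = 1$:
\[ \begin{aligned}
(x_* - x_1)^2 + (y_* - y_1)^2
&\overset{(10)}{=} \left( \frac{K_1 K_2 K_3}{4 |S| \sigma d} \right)^2 \left[ \left( \frac{x_2}{K_2} + \frac{x_3}{K_3} - \frac{x_1}{K_2} - \frac{x_1}{K_3} \right)^2 + \left( \frac{y_2}{K_2} + \frac{y_3}{K_3} - \frac{y_1}{K_2} - \frac{y_1}{K_3} \right)^2 \right] \\
&= \left( \frac{K_1 K_2 K_3}{4 |S| \sigma d} \right)^2 \left[ \frac{(x_2 - x_1)^2 + (y_2 - y_1)^2}{K_2^2} + \frac{(x_3 - x_1)^2 + (y_3 - y_1)^2}{K_3^2} + 2\, \frac{(x_2 - x_1)(x_3 - x_1) + (y_2 - y_1)(y_3 - y_1)}{K_2 K_3} \right] \\
&= \left( \frac{K_1 K_2 K_3}{4 |S| \sigma d} \right)^2 \left[ \frac{r_{12}^2}{K_2^2} + \frac{r_{13}^2}{K_3^2} + 2\, \frac{\tfrac{1}{2} (r_{12}^2 + r_{13}^2 - r_{23}^2)}{K_2 K_3} \right] \\
&= \frac{K_1^2}{(4 |S| \sigma d)^2} \left[ r_{12}^2 K_3^2 + r_{13}^2 K_2^2 + (r_{12}^2 + r_{13}^2 - r_{23}^2) K_2 K_3 \right] \\
&= \frac{K_1^2}{(4 |S| \sigma d)^2} \left[ (r_{12}^2 K_3 + r_{13}^2 K_2)(K_2 + K_3) - r_{23}^2 K_2 K_3 \right] \\
&\overset{(11)}{=} \frac{K_1^2}{(4 |S| \sigma d)^2} \left[ (2 |S| d - r_{23}^2 K_1)(K_2 + K_3) - r_{23}^2 K_2 K_3 \right] \\
&= \frac{K_1^2}{(4 |S| \sigma d)^2} \left[ 2 |S| d (K_2 + K_3) - r_{23}^2 (K_1 K_2 + K_1 K_3 + K_2 K_3) \right] \\
&\overset{(10)}{=} \frac{K_1^2}{(4 |S| \sigma d)^2} \left[ 2 |S| d (K_2 + K_3) - 4 r_{23}^2 |S| \sigma d \right] \\
&= \frac{2 |S| d K_1^2}{(4 |S| \sigma d)^2} \left[ K_2 + K_3 - 2 r_{23}^2 \sigma \right]
\overset{(9)}{=} \frac{K_1^2}{8 |S| \sigma^2 d} \left[ 2 m_1^2 |S| \right] = \frac{m_1^2 K_1^2}{4 \sigma^2 d}.
\end{aligned} \]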
Similar arguments hold for $j \in \{2, 3\}$ in (12). To complete the proof of these equalities,
it should be additionally verified that the values $K_1$, $K_2$, and $K_3$ are nonnegative. This
will be done in the next section.
To prove the first statement of the theorem, we will utilize the following alternative
representation for $x_*$ and $y_*$:
\[ x_* \overset{(10)}{=} \frac{1}{\dfrac{1}{K_1} + \dfrac{1}{K_2} + \dfrac{1}{K_3}} \left( \frac{x_1}{K_1} + \frac{x_2}{K_2} + \frac{x_3}{K_3} \right), \quad \text{and} \quad y_* \overset{(10)}{=} \frac{1}{\dfrac{1}{K_1} + \dfrac{1}{K_2} + \dfrac{1}{K_3}} \left( \frac{y_1}{K_1} + \frac{y_2}{K_2} + \frac{y_3}{K_3} \right). \tag{13} \]
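We substitute (4) into the left-hand side of the first equation of (3). The resulting
expression can be reduced with the aid of (12), up to the nonzero factor $2\sigma\sqrt{d}$, to
\[ \frac{x_* - x_1}{K_1} + \frac{x_* - x_2}{K_2} + \frac{x_* - x_3}{K_3} = x_* \left( \frac{1}{K_1} + \frac{1}{K_2} + \frac{1}{K_3} \right) - \left( \frac{x_1}{K_1} + \frac{x_2}{K_2} + \frac{x_3}{K_3} \right) \overset{(13)}{=} 0. \]
Similar arguments are valid for the second equation from (3). Finally, we compute
$F(x_*, y_*)$:
\[ F(x_*, y_*) = \sum_{j=1}^{3} m_j \sqrt{(x_* - x_j)^2 + (y_* - y_j)^2} \overset{(12)}{=} \sum_{j=1}^{3} \frac{m_j^2 K_j}{2 \sigma \sqrt{d}} \overset{(5)}{=} \frac{2 \sigma d}{2 \sigma \sqrt{d}} = \sqrt{d}. \]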
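Some test values are provided in Table 1.
\[ \begin{array}{c|ccc|ccc|l|l}
 & P_1 & P_2 & P_3 & m_1 & m_2 & m_3 & P_* & \sqrt{d} \\ \hline
1. & (2, 6) & (1, 1) & (5, 1) & 2 & 3 & 4 & \left( \frac{4103 + 1833\sqrt{15}}{2866},\ \frac{29523 - 4481\sqrt{15}}{8598} \right) \approx (3.9086, 1.4152) & 2\sqrt{79 + 15\sqrt{15}} \approx 23.4174 \\
2. & (2, 6) & (1, 1) & (5, 1) & 3 & 5 & 4 & \left( \frac{751}{485},\ \frac{647}{485} \right) \approx (1.5484, 1.3340) & \sqrt{970} \approx 31.1448 \\
3. & (0, 0) & (2, 0) & (-\sqrt{2}, \sqrt{2}) & 3/2 & 2 & 2 & \left( 1 - \frac{1}{\sqrt{2}} - \frac{3}{\sqrt{110}},\ \frac{1}{\sqrt{2}} - \frac{3}{\sqrt{55}} - \frac{3}{\sqrt{110}} \right) \approx (0.0068, 0.0165) & \sqrt{32 + \frac{23}{\sqrt{2}} + 3\sqrt{\frac{55}{2}}} \approx 7.9997
\end{array} \]
Table 1.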
3. GEOMETRY. Let us give an interpretation for some constants that appeared in
Theorem 2. First, on rewriting (7) in determinantal form
\[ S = \begin{vmatrix} 1 & 1 & 1 \\ x_1 & x_2 & x_3 \\ y_1 & y_2 & y_3 \end{vmatrix}, \]
we recognize that $|S| = 2 S_{\triangle P_1P_2P_3}$, where $S_{\triangle P_1P_2P_3}$ stands for the area of triangle
$P_1P_2P_3$. As for the constant (8), factorization of the radicand on the right-hand side
leads to the form
\[ \sigma = 2 \left[ \frac{m_1 + m_2 + m_3}{2} \left( \frac{m_1 + m_2 + m_3}{2} - m_1 \right) \left( \frac{m_1 + m_2 + m_3}{2} - m_2 \right) \left( \frac{m_1 + m_2 + m_3}{2} - m_3 \right) \right]^{1/2}, \]
which can be treated as the Heron formula for twice the area of a triangle formed by
the triple of weights $m_1$, $m_2$, and $m_3$. Under the restrictions (2), such a triangle exists.
Construct this triangle and denote its angles by $\beta_1$, $\beta_2$, and $\beta_3$ (with $\beta_j$ opposite
the side $m_j$), as shown in Figure 1.
[Figure 1. Two triangles generated by the problem]
The first formula from (9) can thus be represented with the aid of the law of cosines
as
\[ K_1 = |S| \sigma \left[ \frac{r_{12}^2 + r_{13}^2 - r_{23}^2}{|S|} + \frac{m_2^2 + m_3^2 - m_1^2}{\sigma} \right] = |S| \sigma \left[ \frac{2 r_{12} r_{13} \cos\alpha_1}{|S|} + \frac{2 m_2 m_3 \cos\beta_1}{\sigma} \right] = 2 |S| \sigma (\cot\alpha_1 + \cot\beta_1). \]
Rewriting the first condition from (2) in the form $\cos\alpha_1 + \cos\beta_1 > 0$, we can
conclude that $\cot\alpha_1 + \cot\beta_1 > 0$ and, thus, $K_1 > 0$. In a similar way, the expressions for
$K_2$ and $K_3$ can be deduced, and we can establish that, under the restrictions (2), they
are both positive. This completes the proof of Theorem 2.
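The passage from the cosine inequality to the cotangent inequality can be spelled out as follows. This short derivation is an added gloss rather than part of the original argument; it assumes only that $\alpha_1$ and $\beta_1$ are angles of nondegenerate triangles, so that both lie in $(0, \pi)$, and it uses $|S| = r_{12} r_{13} \sin\alpha_1$ and $\sigma = m_2 m_3 \sin\beta_1$ for the last equality in the display for $K_1$.

```latex
\begin{aligned}
\cos\alpha_1 + \cos\beta_1 > 0
  &\iff \cos\alpha_1 > \cos(\pi - \beta_1)
  \iff \alpha_1 < \pi - \beta_1
  && \text{(cosine is decreasing on } (0,\pi)\text{)}, \\
\cot\alpha_1 + \cot\beta_1
  &= \frac{\sin(\alpha_1 + \beta_1)}{\sin\alpha_1 \, \sin\beta_1} > 0
  && \text{since } 0 < \alpha_1 + \beta_1 < \pi .
\end{aligned}
```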
Remark 1. We set the dual generalized Fermat-Torricelli problem. Let the triangle
be composed of the sides with the lengths equal to $m_1$, $m_2$, and $m_3$; let the weights
$r_{12}$, $r_{23}$, and $r_{13}$ be placed in its vertices, as shown in Figure 2.
[Figure 2. Dual problem]
The minimum value for the objective function will be the same as in the direct
problem, since (6) is equivalent to
\[ 2 |S| \sigma + \frac{1}{2} \left[ r_{12}^2 (m_1^2 + m_2^2 - m_3^2) + r_{13}^2 (m_1^2 + m_3^2 - m_2^2) + r_{23}^2 (m_2^2 + m_3^2 - m_1^2) \right]. \]
4. CLASSICAL FERMAT-TORRICELLI PROBLEM. Consider now the equal
weighted case $m_1 = m_2 = m_3 = 1$.
Theorem 3. Let all the angles of the triangle $P_1P_2P_3$ be less than $2\pi/3$, or,
equivalently,
\[ \begin{aligned} r_{12}^2 + r_{13}^2 + r_{12} r_{13} - r_{23}^2 &> 0, \\ r_{23}^2 + r_{12}^2 + r_{12} r_{23} - r_{13}^2 &> 0, \\ r_{13}^2 + r_{23}^2 + r_{13} r_{23} - r_{12}^2 &> 0. \end{aligned} \]
The coordinates of the Fermat-Torricelli point for this triangle are as follows:
\[ x_* = \frac{k_1 k_2 k_3}{2\sqrt{3}\, |S| d} \left( \frac{x_1}{k_1} + \frac{x_2}{k_2} + \frac{x_3}{k_3} \right), \qquad y_* = \frac{k_1 k_2 k_3}{2\sqrt{3}\, |S| d} \left( \frac{y_1}{k_1} + \frac{y_2}{k_2} + \frac{y_3}{k_3} \right), \tag{14} \]
with the corresponding minimum value of the objective function
\[ F(x_*, y_*) = \min_{(x,y)\in\mathbb{R}^2} \sum_{j=1}^{3} \sqrt{(x - x_j)^2 + (y - y_j)^2} = \sqrt{d}. \]
Here,
\[ d = \frac{1}{\sqrt{3}} (k_1 + k_2 + k_3) = \frac{r_{12}^2 + r_{13}^2 + r_{23}^2}{2} + \sqrt{3}\, |S| \tag{15} \]
and
\[ \begin{aligned} k_1 &= \frac{\sqrt{3}}{2} (r_{12}^2 + r_{13}^2 - r_{23}^2) + |S|, \\ k_2 &= \frac{\sqrt{3}}{2} (r_{23}^2 + r_{12}^2 - r_{13}^2) + |S|, \\ k_3 &= \frac{\sqrt{3}}{2} (r_{13}^2 + r_{23}^2 - r_{12}^2) + |S|, \end{aligned} \]
with the rest of the parameters coinciding with those from Theorem 2.
It turns out that the right-hand sides of the expressions (14), being represented as
rational fractions with respect to $\{x_j, y_j\}_{j=1}^{3}$, can be reduced further to the form where
denominators become area free.
Corollary. Under conditions of Theorem 3, the coordinates of the Fermat-Torricelli
point are as follows:
\[ x_* = \frac{1}{2\sqrt{3}\, d} \left[ (x_1 + x_2 + x_3) |S| + \sqrt{3} \left( x_1 r_{23}^2 + x_2 r_{13}^2 + x_3 r_{12}^2 \right) + 3 \operatorname{sgn}(S) \begin{vmatrix} 1 & 1 & 1 \\ y_1 & y_2 & y_3 \\ x_2 x_3 + y_2 y_3 & x_1 x_3 + y_1 y_3 & x_1 x_2 + y_1 y_2 \end{vmatrix} \right], \tag{16} \]
\[ y_* = \frac{1}{2\sqrt{3}\, d} \left[ (y_1 + y_2 + y_3) |S| + \sqrt{3} \left( y_1 r_{23}^2 + y_2 r_{13}^2 + y_3 r_{12}^2 \right) - 3 \operatorname{sgn}(S) \begin{vmatrix} 1 & 1 & 1 \\ x_1 & x_2 & x_3 \\ x_2 x_3 + y_2 y_3 & x_1 x_3 + y_1 y_3 & x_1 x_2 + y_1 y_2 \end{vmatrix} \right]. \tag{17} \]
Remark 2. The result of the last corollary can be extended to the generalized
Fermat-Torricelli problem. Numerators and denominators in the right-hand sides of the
formulas (4) can be reduced by the common factor $|S|$. We do not present the resulting
expressions here, since they are inelegantly cumbersome.
Remark 3. One of the referees of the present paper suggested that the author provide
some motivation or insight of how he found the explicit expressions in Theorem 2.
Frankly speaking, the historical development of the investigation went in the direction
opposite to what has been presented up to this point. First, the formulas (16)-(17)
were obtained as the solution of a linear system of equations arising from the feature
of the Fermat-Torricelli point to make an angle of $2\pi/3$ with any two vertices of the
triangle. Next, in a similar way, the formulas mentioned in Remark 2 were obtained for
the generalized Fermat-Torricelli problem, i.e., for the coordinates $x_*$, $y_*$. Although
these formulas looked awful, they permitted us to deduce the explicit expression (6)
for the value of minimal distance. Moreover, we noticed the appearance of this value
in the expressions for denominators of the formulas for $x_*$ and $y_*$. Next, we intended to
perform an additional verification of the obtained results via direct substitution into the
equations (3). At this moment, the following lucky guess came to mind: the radicand of
\[ \sqrt{(x_* - x_j)^2 + (y_* - y_j)^2} \]
should be a perfect square! The only remaining trick was to discover the values (9).
5. INVERSE PROBLEM. Given the coordinates of the point $P_* = (x_*, y_*)$, we wish
to find the values for the weights $m_1$, $m_2$, and $m_3$ with the aim for the corresponding
objective function (1) to possess a minimum point precisely at $P_*$.
Theorem 4. Let the vertices of the triangle $P_1P_2P_3$ be counted counterclockwise. Then
for the choice
\[ m_1 = |P_* P_1| \begin{vmatrix} 1 & 1 & 1 \\ x_* & x_2 & x_3 \\ y_* & y_2 & y_3 \end{vmatrix}, \quad m_2 = |P_* P_2| \begin{vmatrix} 1 & 1 & 1 \\ x_1 & x_* & x_3 \\ y_1 & y_* & y_3 \end{vmatrix}, \quad \text{and} \quad m_3 = |P_* P_3| \begin{vmatrix} 1 & 1 & 1 \\ x_1 & x_2 & x_* \\ y_1 & y_2 & y_* \end{vmatrix} \tag{18} \]
the function
\[ F(x, y) = \sum_{j=1}^{3} m_j \sqrt{(x - x_j)^2 + (y - y_j)^2} \]
has its stationary point at $P_*$. Provided that the latter is chosen inside the triangle
$P_1P_2P_3$, the values (18) are all positive, and
\[ F(x_*, y_*) = \min_{(x,y)\in\mathbb{R}^2} F(x, y) = \begin{vmatrix} 1 & 1 & 1 & 1 \\ x_* & x_1 & x_2 & x_3 \\ y_* & y_1 & y_2 & y_3 \\ x_*^2 + y_*^2 & x_1^2 + y_1^2 & x_2^2 + y_2^2 & x_3^2 + y_3^2 \end{vmatrix}. \tag{19} \]
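Formulas (18) are easy to try out numerically. The following sketch (hypothetical function names, floating-point arithmetic) computes the three weights for a prescribed point and checks that their ratio reproduces the weights of test 2 in Table 1, whose stationary point is $(751/485, 647/485)$.

```python
import math


def det3(a, b, c):
    """Determinant of the 3x3 matrix with rows a, b, c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))


def inverse_weights(p_star, p1, p2, p3):
    """Weights (18) for which p_star is a stationary point of F.
    Assumes the vertices p1, p2, p3 are listed counterclockwise."""
    xs, ys = p_star
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    d1 = det3((1, 1, 1), (xs, x2, x3), (ys, y2, y3))
    d2 = det3((1, 1, 1), (x1, xs, x3), (y1, ys, y3))
    d3 = det3((1, 1, 1), (x1, x2, xs), (y1, y2, ys))
    return (math.hypot(xs - x1, ys - y1) * d1,
            math.hypot(xs - x2, ys - y2) * d2,
            math.hypot(xs - x3, ys - y3) * d3)


m1, m2, m3 = inverse_weights((751 / 485, 647 / 485), (2, 6), (1, 1), (5, 1))
print(m1, m2, m3, m2 / m1, m3 / m1)
```

The printed ratios should be close to $5/3$ and $4/3$, that is, $m_1 : m_2 : m_3 = 3 : 5 : 4$; only the ratio is determined, not the common scale.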
Proof. Substitute $x = x_*$, $y = y_*$ and the values (18) into the left-hand side of the first
equation from (3) as follows:
\[ (x_* - x_1) \begin{vmatrix} 1 & 1 & 1 \\ x_* & x_2 & x_3 \\ y_* & y_2 & y_3 \end{vmatrix} + (x_* - x_2) \begin{vmatrix} 1 & 1 & 1 \\ x_1 & x_* & x_3 \\ y_1 & y_* & y_3 \end{vmatrix} + (x_* - x_3) \begin{vmatrix} 1 & 1 & 1 \\ x_1 & x_2 & x_* \\ y_1 & y_2 & y_* \end{vmatrix}. \tag{20} \]
Represent this combination of the third-order determinants in the form of the
fourth-order determinant, namely
\[ \begin{vmatrix} 1 & 1 & 1 & 1 \\ x_* & x_1 & x_2 & x_3 \\ y_* & y_1 & y_2 & y_3 \\ 0 & x_* - x_1 & x_* - x_2 & x_* - x_3 \end{vmatrix} \]
(expansion by its last row coincides with (20)). Now add the second row to the last to
obtain the following:
\[ \begin{vmatrix} 1 & 1 & 1 & 1 \\ x_* & x_1 & x_2 & x_3 \\ y_* & y_1 & y_2 & y_3 \\ x_* & x_* & x_* & x_* \end{vmatrix}. \]
In this determinant, the first row is proportional to the last one; therefore, the
determinant equals just zero. The second equality from (3) can be verified in a similar manner.
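Let us evaluate $F(x_*, y_*)$:
\[ \begin{aligned} F(x_*, y_*) ={}& \left[ (x_* - x_1)^2 + (y_* - y_1)^2 \right] \begin{vmatrix} 1 & 1 & 1 \\ x_* & x_2 & x_3 \\ y_* & y_2 & y_3 \end{vmatrix} + \left[ (x_* - x_2)^2 + (y_* - y_2)^2 \right] \begin{vmatrix} 1 & 1 & 1 \\ x_1 & x_* & x_3 \\ y_1 & y_* & y_3 \end{vmatrix} \\ &+ \left[ (x_* - x_3)^2 + (y_* - y_3)^2 \right] \begin{vmatrix} 1 & 1 & 1 \\ x_1 & x_2 & x_* \\ y_1 & y_2 & y_* \end{vmatrix}. \end{aligned} \]
To prove the equality (19), let us split it into the $x$-part and the $y$-part. First, keep the
$x$-terms in brackets of the previous formula:
\[ (x_* - x_1)^2 \begin{vmatrix} 1 & 1 & 1 \\ x_* & x_2 & x_3 \\ y_* & y_2 & y_3 \end{vmatrix} + (x_* - x_2)^2 \begin{vmatrix} 1 & 1 & 1 \\ x_1 & x_* & x_3 \\ y_1 & y_* & y_3 \end{vmatrix} + (x_* - x_3)^2 \begin{vmatrix} 1 & 1 & 1 \\ x_1 & x_2 & x_* \\ y_1 & y_2 & y_* \end{vmatrix}. \]
Similar to the proof of the first part of the theorem, represent this linear combination
as the determinant of the fourth order:
\[ \begin{vmatrix} 1 & 1 & 1 & 1 \\ x_* & x_1 & x_2 & x_3 \\ y_* & y_1 & y_2 & y_3 \\ 0 & (x_* - x_1)^2 & (x_* - x_2)^2 & (x_* - x_3)^2 \end{vmatrix}. \]
Multiply the first row by $(-x_*^2)$, the second one by $2x_*$ and add the obtained rows to
the last one:
\[ \begin{vmatrix} 1 & 1 & 1 & 1 \\ x_* & x_1 & x_2 & x_3 \\ y_* & y_1 & y_2 & y_3 \\ x_*^2 & x_1^2 & x_2^2 & x_3^2 \end{vmatrix}. \tag{21} \]
The $y$-part of the equality (19) can be proven in exactly the same manner with the
resulting determinant differing from (21) only in its last row. The linear property of
determinant with respect to its rows completes the proof of (19).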
Remark 4. The solution of the inverse problem is determined up to a common
positive multiplier, i.e., the solution triple $(m_1, m_2, m_3)$ is defined by the value of the ratio
$m_1 : m_2 : m_3$. (In the language of the facility location problem mentioned in the
Introduction, this statement is equivalent to the fact that the optimal position of the steel
works is independent of the currency of the state.) Up to this remark, the solution of
the inverse problem is unique. We have proven this statement via direct computations
starting from formulas (4).
Example 1. Let $P_1 = (2, 6)$, $P_2 = (1, 1)$, $P_3 = (5, 1)$, and
\[ P_* = \left( \frac{1}{2866} \left( 4103 + 1833\sqrt{15} \right),\ \frac{1}{8598} \left( 29523 - 4481\sqrt{15} \right) \right). \]
Find the values for the weights $m_1$, $m_2$, and $m_3$ from Theorem 4.
Solution. Formulas (18) give:
\[ m_1 = \frac{2 (20925 - 4481\sqrt{15})}{18481401} \sqrt{316380606 + 35999826\sqrt{15}}, \]
\[ m_2 = \frac{2 (15105 - 2342\sqrt{15})}{6160467} \sqrt{75400161 - 9169767\sqrt{15}}, \]
and
\[ m_3 = \frac{8 (-1185 + 15988\sqrt{15})}{18481401} \sqrt{8335761 - 2050623\sqrt{15}}, \]
with
\[ F(x_*, y_*) = \frac{1}{4299} \left( -333980 + 193436\sqrt{15} \right). \]
Now, compare the obtained result with the one represented in test 1 from Section 2.
According to Remark 4, we might expect that
\[ m_1 : m_2 : m_3 = 2 : 3 : 4. \]
We leave the verification of this fact as an exercise for the inquisitive reader.
The next example originated from the question posed by one of the referees of the
present paper: What will happen to the result of Theorem 4 if we take $P_* = P_j$?
Example 2. Show how to choose the values for the weights $m_1$, $m_2$, and $m_3$ in order for
the point $P_*$ to coincide with the given point on a side of the triangle from Example 1.
Solution. If we take $P_* = P_2$, the formulas (18) give zero values for all the weights;
however, the weights of these zeros are different. To explain this casuistry, take
$P_* = P_2 + (\varepsilon, \varepsilon)$ for the infinitely small $\varepsilon > 0$. For this case, formulas (18) give:
\[ \begin{aligned} m_1(\varepsilon) &= 4\varepsilon \sqrt{26 - 12\varepsilon + 2\varepsilon^2} = 4\sqrt{26}\, \varepsilon + o(\varepsilon), \\ m_2(\varepsilon) &= 4\sqrt{2}\, \varepsilon (5 - 2\varepsilon), \\ m_3(\varepsilon) &= 4\varepsilon \sqrt{16 - 8\varepsilon + 2\varepsilon^2} = 16\varepsilon + o(\varepsilon). \end{aligned} \]
The weight $m_2(\varepsilon)$ dominates over $m_1(\varepsilon)$ and $m_3(\varepsilon)$ when $\varepsilon \to +0$. As a matter of
fact, the true values of these weights do not influence the position of the point $P_*$; the
latter depends only on the value of the ratio $m_1(\varepsilon) : m_2(\varepsilon) : m_3(\varepsilon)$. Thus, the choice
$m_1 = 4\sqrt{26}$, $m_2 = 20\sqrt{2}$, $m_3 = 16$ provides us with $P_* = P_2$.
Let us now manipulate the weights with the aim of extruding the point $P_*$ to an
internal point of the side $P_2P_3$. This manipulation is not trivial, as in the previous case.
First, we utilize formulas (18) and then simplify the obtained result with the aid of
formulas (4). Finally, the variable weights
\[ m_1(\varepsilon) = t\varepsilon, \quad m_2(\varepsilon) = 1 + \varepsilon, \quad m_3(\varepsilon) = 1 - \varepsilon \]
with a fixed $t > \sqrt{104}$, provide the following asymptotics as $\varepsilon \to +0$:
\[ P_* \to \left( 2 - \frac{10}{\sqrt{t^2 - 4}},\ 1 \right). \]
Thus, the two essential weights $m_2(\varepsilon)$ and $m_3(\varepsilon)$ guarantee delivery of $P_*$ to the
side $P_2P_3$, while the negligible weight $m_1(\varepsilon)$ ensures the fine-tuning of this delivery
to the particular point within the open line segment $P_2P_\perp$. Here $P_\perp = (2, 1)$ is the foot
of the altitude of the triangle $P_1P_2P_3$ through the point $P_1$.
Let us discuss the geometrical meaning of the constants from Theorem 4. The value
$m_1$ equals twice the product of the distance $|P_1P_*|$ by the area of the triangle $P_*P_2P_3$.
The first statement of the theorem is equivalent to the geometrical equality
\[ \overrightarrow{P_*P_1}\, S_{\triangle P_*P_2P_3} + \overrightarrow{P_*P_2}\, S_{\triangle P_*P_3P_1} + \overrightarrow{P_*P_3}\, S_{\triangle P_*P_1P_2} = \overrightarrow{O}. \]
Finally, the constant (19) is connected with
\[ h = -\frac{1}{S} \begin{vmatrix} 1 & 1 & 1 & 1 \\ x_* & x_1 & x_2 & x_3 \\ y_* & y_1 & y_2 & y_3 \\ x_*^2 + y_*^2 & x_1^2 + y_1^2 & x_2^2 + y_2^2 & x_3^2 + y_3^2 \end{vmatrix}, \]
which is known [10, pp. 251-252] as the power of the point $P_*$ with respect to the
circle through the points $P_1$, $P_2$, and $P_3$ (the circumscribed circle of the triangle). If
we denote the circumcenter of the triangle $P_1P_2P_3$ by $C$, then
\[ h = |CP_*|^2 - |CP_j|^2 \quad \text{for } j \in \{1, 2, 3\}, \tag{22} \]
and, provided that $P_*$ lies inside this triangle, the value $h$ is negative.
Results of the present section can evidently be extended to the case of three (and
more) dimensions.
Theorem 5. Let the points $\{P_j = (x_j, y_j, z_j)\}_{j=1}^{4}$ be noncoplanar, and be counted in
such a manner that the value of the determinant
\[ V = \begin{vmatrix} 1 & 1 & 1 & 1 \\ x_1 & x_2 & x_3 & x_4 \\ y_1 & y_2 & y_3 & y_4 \\ z_1 & z_2 & z_3 & z_4 \end{vmatrix} \tag{23} \]
is positive. Then for the choice
\[ \left\{ m_j = |P_* P_j|\, V_j \right\}_{j=1}^{4}, \tag{24} \]
where $V_j$ equals the determinant obtained on replacing the $j$th column of (23) by the
column $[1, x_*, y_*, z_*]^{\top}$ (here $^{\top}$ denotes transposition), the function
\[ F(x, y, z) = \sum_{j=1}^{4} m_j \sqrt{(x - x_j)^2 + (y - y_j)^2 + (z - z_j)^2} \]
has its stationary point at $P_* = (x_*, y_*, z_*)$. If $P_*$ lies inside the tetrahedron $P_1P_2P_3P_4$,
then the values (24) are all positive, and
\[ F(x_*, y_*, z_*) = \min_{(x,y,z)\in\mathbb{R}^3} F(x, y, z) = - \begin{vmatrix} 1 & 1 & 1 & 1 & 1 \\ x_* & x_1 & x_2 & x_3 & x_4 \\ y_* & y_1 & y_2 & y_3 & y_4 \\ z_* & z_1 & z_2 & z_3 & z_4 \\ x_*^2 + y_*^2 + z_*^2 & x_1^2 + y_1^2 + z_1^2 & x_2^2 + y_2^2 + z_2^2 & x_3^2 + y_3^2 + z_3^2 & x_4^2 + y_4^2 + z_4^2 \end{vmatrix}. \tag{25} \]
Geometrical meanings of the values appearing in the last theorem are similar to their
counterparts from Theorem 4. For instance, the value (23) equals six times the volume
of tetrahedron $P_1P_2P_3P_4$, while the value (25) divided by $V$ is known [10, p. 255] as
the power of the point $P_*$ with respect to a sphere circumscribed to that tetrahedron; it
is equivalent to (22), where $C$ this time stands for the circumcenter of the tetrahedron
while $j \in \{1, 2, 3, 4\}$.
6. CONCLUSIONS. An analytical solution for the generalized Fermat-Torricelli
problem and its inversion is presented. The three-point case is completely solved using
extended radicals: In addition to elementary and extraction of roots operations, the
sign function is utilized in the formulas. The treatment of the multidimensional $n > 3$
point case requires further investigation, although some theoretical results like [1] give
little reason to hope for a nice, e.g., extended radicals, solution.
7. APPENDIX. We prove here the equalities (10) and (11). We have
\[ \begin{aligned}
K_1 K_2 + K_1 K_3 + K_2 K_3 &= \frac{1}{2} \left[ K_1 (K_2 + K_3) + K_2 (K_1 + K_3) + K_3 (K_1 + K_2) \right] \\
&\overset{(9)}{=} K_1 (r_{23}^2 \sigma + m_1^2 |S|) + K_2 (r_{13}^2 \sigma + m_2^2 |S|) + K_3 (r_{12}^2 \sigma + m_3^2 |S|) \\
&= \sigma^2 \left[ (r_{12}^2 + r_{13}^2 - r_{23}^2) r_{23}^2 + (r_{23}^2 + r_{12}^2 - r_{13}^2) r_{13}^2 + (r_{13}^2 + r_{23}^2 - r_{12}^2) r_{12}^2 \right] \\
&\quad + S^2 \left[ m_1^2 (m_2^2 + m_3^2 - m_1^2) + m_2^2 (m_1^2 + m_3^2 - m_2^2) + m_3^2 (m_1^2 + m_2^2 - m_3^2) \right] \\
&\quad + |S| \sigma \left[ m_1^2 (r_{12}^2 + r_{13}^2 - r_{23}^2) + m_2^2 (r_{12}^2 + r_{23}^2 - r_{13}^2) + m_3^2 (r_{13}^2 + r_{23}^2 - r_{12}^2) \right. \\
&\qquad\quad \left. {} + r_{23}^2 (m_2^2 + m_3^2 - m_1^2) + r_{13}^2 (m_1^2 + m_3^2 - m_2^2) + r_{12}^2 (m_1^2 + m_2^2 - m_3^2) \right] \\
&= 4 \sigma^2 S^2 + 4 \sigma^2 S^2 + 2 |S| \sigma \left[ m_1^2 (r_{12}^2 + r_{13}^2 - r_{23}^2) + m_2^2 (r_{12}^2 + r_{23}^2 - r_{13}^2) + m_3^2 (r_{13}^2 + r_{23}^2 - r_{12}^2) \right].
\end{aligned} \]
Here we have utilized (8) and the equality
\[ 4 S^2 = (r_{12}^2 + r_{13}^2 - r_{23}^2) r_{23}^2 + (r_{13}^2 + r_{23}^2 - r_{12}^2) r_{12}^2 + (r_{23}^2 + r_{12}^2 - r_{13}^2) r_{13}^2, \tag{26} \]
which can be verified either directly or with the aid of the Heron formula for the area
of a triangle (see Section 3). Reference to the definition (6) of the constant $d$ completes
the proof of (10).
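We now deduce formula (11):
\[ \begin{aligned}
r_{23}^2 K_1 + r_{13}^2 K_2 + r_{12}^2 K_3
&= \sigma \left[ (r_{12}^2 + r_{13}^2 - r_{23}^2) r_{23}^2 + (r_{13}^2 + r_{23}^2 - r_{12}^2) r_{12}^2 + (r_{23}^2 + r_{12}^2 - r_{13}^2) r_{13}^2 \right] \\
&\quad + |S| \left[ r_{23}^2 (m_2^2 + m_3^2 - m_1^2) + r_{13}^2 (m_1^2 + m_3^2 - m_2^2) + r_{12}^2 (m_1^2 + m_2^2 - m_3^2) \right] \\
&\overset{(26)}{=} 4 \sigma S^2 + |S| \left[ m_1^2 (r_{12}^2 + r_{13}^2 - r_{23}^2) + m_2^2 (r_{23}^2 + r_{12}^2 - r_{13}^2) + m_3^2 (r_{13}^2 + r_{23}^2 - r_{12}^2) \right] \\
&\overset{(6)}{=} 2 |S| d.
\end{aligned} \]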
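The identities (10), (11), and (26) can be checked in exact integer arithmetic on the data of test 2 from Table 1, where $\sigma$ happens to be an integer. This is only a spot check on one triangle, written as a sketch with hypothetical variable names.

```python
from math import isqrt

# Data of test 2 in Table 1: P1 = (2, 6), P2 = (1, 1), P3 = (5, 1), weights 3, 5, 4.
(x1, y1), (x2, y2), (x3, y3) = (2, 6), (1, 1), (5, 1)
m1, m2, m3 = 3, 5, 4

q12 = (x1 - x2) ** 2 + (y1 - y2) ** 2  # r_12^2
q13 = (x1 - x3) ** 2 + (y1 - y3) ** 2  # r_13^2
q23 = (x2 - x3) ** 2 + (y2 - y3) ** 2  # r_23^2
S = abs(x1 * y2 + x2 * y3 + x3 * y1 - x1 * y3 - x3 * y2 - x2 * y1)

radicand = (-m1**4 - m2**4 - m3**4
            + 2 * m1**2 * m2**2 + 2 * m1**2 * m3**2 + 2 * m2**2 * m3**2)
root = isqrt(radicand)
assert root * root == radicand and root % 2 == 0  # sigma is an integer here
sigma = root // 2

K1 = (q12 + q13 - q23) * sigma + (m2**2 + m3**2 - m1**2) * S
K2 = (q23 + q12 - q13) * sigma + (m1**2 + m3**2 - m2**2) * S
K3 = (q13 + q23 - q12) * sigma + (m1**2 + m2**2 - m3**2) * S

# d from (6); the bracket is even for these data, so integer division is exact.
bracket = (m1**2 * (q12 + q13 - q23) + m2**2 * (q23 + q12 - q13)
           + m3**2 * (q13 + q23 - q12))
d = 2 * S * sigma + bracket // 2

lhs10, rhs10 = K1 * K2 + K1 * K3 + K2 * K3, 4 * S * sigma * d
lhs11, rhs11 = q23 * K1 + q13 * K2 + q12 * K3, 2 * S * d
lhs26 = 4 * S**2
rhs26 = ((q12 + q13 - q23) * q23 + (q13 + q23 - q12) * q12
         + (q23 + q12 - q13) * q13)
print(S, sigma, (K1, K2, K3), d)
print(lhs10 == rhs10, lhs11 == rhs11, lhs26 == rhs26)
```

Here $|S| = 20$, $\sigma = 12$, $(K_1, K_2, K_3) = (1168, 96, 648)$, and $d = 970$, so both sides of (10) equal 931200 and both sides of (11) equal 38800.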
ACKNOWLEDGMENTS. The author thanks the referees and the editor for valuable suggestions that helped
to improve the quality of the paper.
REFERENCES
1. C. Bajaj, The algebraic degree of geometric optimization problems, Discrete Comput. Geom. 3 (1988)
177-191.
2. R. Courant, H. Robbins, What is Mathematics? Oxford University Press, London, 1941.
3. F. Dingeldey, Sammlung von Aufgaben zur Anwendung der Differenzial- und Integralrechnung. Erster
Teil. Aufgaben zur Anwendung der Differenzialrechnung. Teubner, Leipzig, 1910.
4. Encyclopedia of Mathematics contributors, Fermat-Torricelli problem. Encyclopedia of Mathematics,
available at http://www.encyclopediaofmath.org/index.php?title=Fermat-Torricelli_problem&oldid=22419
5. D. Gonzalez Martinez, The Fermat point, available at
http://jwilson.coe.uga.edu/EMAT6680Fa10/Gonzalez/Assignment6/THEFERMATPOINT.htm
6. I. Greenberg, R. A. Robertello, The three factory problem, Math. Mag. 38 (1965) 67-72.
7. H. W. Kuhn, A note on Fermat's problem, Math. Program. 4 (1973) 98-107.
8. W. Launhardt, Kommercielle Tracirung der Verkehrswege, Z. f. Architekten u. Ingenieur-Vereinis im
Königreich Hannover 18 (1872) 516-534.
9. L. M. Ostresh, On the convergence of a class of iterative methods for solving the Weber location problem,
Oper. Res. 26 (1978) 597-609.
10. J. V. Uspensky, Theory of Equations. McGraw-Hill, New York, 1948.
11. E. Weiszfeld, Sur le point pour lequel la somme des distances de n points donnés est minimum. Tohoku
Math. J. 43 (1937) 355-386.
ALEXEI UTESHEV received his Ph.D. from the Leningrad (St. Petersburg) State University in 1988. His
mathematical interests lie in computational algebra and geometry; he also carries on personal educational
on-line resources in these areas. He is also interested in history and enjoys cross-country skiing.
Faculty of Applied Mathematics, St. Petersburg State University
Universitetskij pr. 35, 198504, Petrodvorets, St. Petersburg, Russia
alexeiuteshev@gmail.com
A One-Sentence Line-of-Sight Proof of the Extreme Value Theorem
The maximum value of a continuous real-valued function $f$ on $[a, b]$ is attained
at its largest lookout point. We call $x$ in $[a, b]$ a lookout point if, whenever
$t$ lies in $[a, x)$, we have $f(t) \le f(x)$. The set $L$ of lookout points is closed.
Indeed, let $x_n \to x$, with $x_n$ in $L$. If $t$ is in $[a, x)$, then eventually $t$ lies in $[a, x_n)$,
so $f(t) \le f(x_n)$. By continuity, $f(t) \le f(x)$, as desired. We use the fact that
a closed, bounded, and nonempty set has a maximum and a minimum. Thus,
$\max(L)$ exists.
Extreme Value Theorem. If $f$ is a real-valued continuous function on $[a, b]$
then $f$ has a maximum value on $[a, b]$. In other words, for some $c$ in $[a, b]$, no
value attained by $f$ exceeds $f(c)$.
Proof. Letting $L = \{x \text{ in } [a, b] \text{ such that } t \text{ in } [a, x) \text{ implies } f(t) \le f(x)\}$ and
$c = \max(L)$, it suffices to show that, given $k > f(c)$, the closed, bounded set
$S_k = \{t \text{ in } [a, b] \text{ such that } f(t) \ge k\}$ is empty, and this follows since, if some $d$
satisfies $f(d) \ge k$, then $d > c$, whence $d$ is not in $L$, so there exists a $t < d$ for
which $f(t) > f(d) \ge k$, proving that $S_k$ has no minimum.
Submitted by Samuel J. Ferguson, University of Iowa
MSC: Primary 26A03 Secondary: 26A15
Author(s): Samuel J. Ferguson
Source: The American Mathematical Monthly, Vol. 121, No. 4 (April), p. 331
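The lookout-point idea has a simple finite analog that can be run directly. The sketch below is an illustration only, not part of the proof: it replaces the interval $[a, b]$ by a finite list of sample values, treats an index as a lookout point when no earlier sample exceeds it, and confirms that the largest lookout index attains the maximum. The function name and the sample data are assumptions.

```python
def lookout_indices(values):
    """Indices i with values[t] <= values[i] for every t < i (discrete lookout points)."""
    result = []
    running_max = float("-inf")
    for i, v in enumerate(values):
        if v >= running_max:      # nothing seen so far is larger
            result.append(i)
            running_max = v
    return result


samples = [1, 3, 2, 3, 0]
lookouts = lookout_indices(samples)
largest = lookouts[-1]            # discrete counterpart of c = max(L)
print(lookouts, samples[largest] == max(samples))
```

Index 0 is always a lookout point, mirroring the fact that $a$ belongs to $L$ vacuously, so the list is never empty for nonempty input.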
On the Proof of the Existence of Undominated Strategies in Normal Form Games
Martin Kovár and Alena Chernikava
Abstract. In the game theory literature, there are two versions of the proof of the well-known
fact that in a normal form game of n persons with compact spaces of strategies and continuous
utility functions, the sets of undominated strategies are nonempty. The older one, stated
in the first edition of the well-known book by Hervé Moulin, depends on certain, relatively
nontrivial results from measure theory, metric topology, and mathematical analysis. The proof
is valid only for metrizable topological spaces. The second, revised edition of the same book
contains a simplified proof, which is, however, incorrect. The author implicitly assumes that
any linearly ordered set contains a cofinal subsequence, which is certainly not true. In this
paper we correct, simplify, and generalize the second proof of Moulin by its reformulation in
terms of topological convergence of nets. This modified technique also yields a slightly better
result than is stated in the original. The assertion now holds for almost compact spaces. The
argument used is elementary and easily understandable to non-experts.
MSC: Primary 91A10, Secondary 54D30
1. INTRODUCTION. In a normal form game, assume that the set of all strategies
of a player is compact and its associated utility function is continuous. In this paper,
we present a slightly improved modification of the well-known result, which ensures
the existence of an undominated strategy. Moreover, our result has a new and simpler
proof.
The standard and best-known version of the proof is in the first edition of the
comprehensive textbook on game theory by Hervé Moulin [8]. It depends on
a combination of relatively nontrivial results from measure theory, metric topology,
and mathematical analysis. In the second, revised edition [9] of the same book, there
is a newer, simplified proof using some topological arguments together with Zorn's
Lemma. Unfortunately, Moulin's second proof is incorrect, since it implicitly uses
the invalid argument that every chain (that is, a linearly ordered set) contains a cofinal
subsequence. The first uncountable ordinal $\omega_1$ is a proper counterexample, which
demonstrates that in general this is not true. The mistake itself is not very critical for
game theory, since in metric spaces, for which the classical results are usually formulated,
the topology is first countable and hence sequences are still sufficient to fully
describe the topology by means of convergence. Nevertheless, this fact,
noticed by the second author during her study of [9], constitutes an opportunity
for a revision of Moulin's original proof using somewhat finer and slightly more
general topological arguments.
We will correct, simplify, and somewhat strengthen the proof of Moulin's result so that
it applies under slightly more general topological assumptions than are stated in its
original version. Although the central notion that we use to express the phenomena
of convergence (nets) lies rather aside from the mainstream of current general
topology, it has an important advantage. It is understood also by non-topologists because
of its similarity to the widespread and well-known convergence theory of
sequences in metric spaces. We also keep the simplified formulation of Moulin's theorem
as in [8] and exclude the part regarding the prudent strategies, which is new in
[9], but irrelevant with respect to the discussed correctness of Moulin's proof. A short
description of our modification of the proof now follows. To make the sketch clearer,
we use compactness instead of almost compactness (in contrast to the full version
presented in Section 3). We modify the relation of dominance to induce a preorder
on the strategy set of the ith player. The identity map, restricted to any chain (meaning
now a linearly or totally preordered subset) forms a net in a compact topological space,
which, hence, has a cluster point. The continuous utility function maps the net as well
as its cluster point to $\mathbb{R}$, equipped with the Euclidean topology. Thus the cluster point
is an upper bound of the chain, and Zorn's Lemma finally completes the proof.
2. DEFINITIONS AND NOTATIONS. Before demonstrating the complete proof,
we will need to recapitulate some necessary notions from game theory and topology.
Recall that an n-person game $G$ in a normal or strategic form is denoted by the 2n-tuple
$G = (X_1, X_2, \dots, X_n, u_1, u_2, \dots, u_n)$, where for each $i \in \{1, 2, \dots, n\}$, $X_i$ is a
nonempty set of strategies of the ith player and $u_i : \prod_{j=1}^{n} X_j \to \mathbb{R}$ is his real-valued
utility, or pay-off function. Let $i \in \{1, 2, \dots, n\}$ and let $x_i, y_i \in X_i$ be some strategies
of the ith player. We say that the strategy $y_i$ dominates the strategy $x_i$, if the following
conditions hold.
(1) For any selection of strategies $s_k \in X_k$, where $k \in \{1, 2, \dots, n\}$, $k \ne i$,
$$u_i(s_1, s_2, \dots, s_{i-1}, x_i, s_{i+1}, \dots, s_n) \le u_i(s_1, s_2, \dots, s_{i-1}, y_i, s_{i+1}, \dots, s_n).$$
(2) For each $k \in \{1, 2, \dots, n\}$, $k \ne i$, there exists some strategy $t_k \in X_k$ such that
$$u_i(t_1, t_2, \dots, t_{i-1}, x_i, t_{i+1}, \dots, t_n) < u_i(t_1, t_2, \dots, t_{i-1}, y_i, t_{i+1}, \dots, t_n).$$
The strategy $x_i \in X_i$ of the ith player is said to be undominated if there is no strategy
$y_i \in X_i$ that dominates $x_i$. It should be noted that this kind of dominance is sometimes
referred to as a weak dominance, in opposition to strict dominance, which differs from
the above-defined notion at the condition (1) by the strict form $<$ of the inequality. Two
strategies $x_i, y_i \in X_i$ are called equivalent, if for any selection of strategies $s_k \in X_k$,
where $k \in \{1, 2, \dots, n\}$, $k \ne i$, it follows that
$$u_i(s_1, s_2, \dots, s_{i-1}, x_i, s_{i+1}, \dots, s_n) = u_i(s_1, s_2, \dots, s_{i-1}, y_i, s_{i+1}, \dots, s_n).$$
(For more detail, see, for example, [2, 11].)
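For finite strategy sets, conditions (1) and (2) can be checked by brute force. The following sketch is a hypothetical illustration: the function names, the two-player game, and its payoff table are all assumptions, and the paper itself treats general (possibly infinite) topological strategy spaces.

```python
from itertools import product


def dominates(u_i, i, y, x, strategy_sets):
    """True if strategy y of player i dominates x in the sense of conditions (1) and (2)."""
    others = [s for k, s in enumerate(strategy_sets) if k != i]
    never_worse = True        # condition (1): u_i(.., x, ..) <= u_i(.., y, ..) always
    somewhere_better = False  # condition (2): strict inequality for some profile
    for rest in product(*others):
        with_x = rest[:i] + (x,) + rest[i:]
        with_y = rest[:i] + (y,) + rest[i:]
        if u_i(with_x) > u_i(with_y):
            never_worse = False
        if u_i(with_x) < u_i(with_y):
            somewhere_better = True
    return never_worse and somewhere_better


def undominated(u_i, i, strategy_sets):
    """Strategies of player i that no other strategy of player i dominates."""
    own = strategy_sets[i]
    return [x for x in own
            if not any(dominates(u_i, i, y, x, strategy_sets) for y in own)]


# Hypothetical payoff table for player 0 in a 3 x 2 game.
sets = [["T", "M", "B"], ["L", "R"]]
payoff = {("T", "L"): 1, ("T", "R"): 1,
          ("M", "L"): 1, ("M", "R"): 0,
          ("B", "L"): 0, ("B", "R"): 0}
print(undominated(payoff.get, 0, sets))
```

In this toy game T dominates both M and B, so T is the only undominated strategy of player 0.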
A binary relation $\le$ on a set is called a preorder, if it is reflexive and transitive (and
not necessarily antisymmetric). Let $A$ be a nonempty set and $\le$ be a preorder on $A$
such that for every $x, y \in A$ there exists $z \in A$ with $x \le z$ and $y \le z$. Then we say
that $(A, \le)$ is a directed set.
Let $X$ be a topological space. A net in $X$ is an arbitrary mapping from a directed set
to the space $X$. A family $\Phi$ of nonempty sets is called a filter base if any intersection
of two sets belonging to $\Phi$ contains a subset from $\Phi$. We say that $p \in X$ is a $\theta$-cluster
point of a filter base $\Phi$ in $X$, if for every closed neighborhood $H$ of $p$ and every
$F \in \Phi$, the intersection $H \cap F$ is nonempty. Similarly, $p$ is a $\theta$-cluster point of a net
$\varphi(A, \ge)$, if for each closed neighborhood $H$ of $p$ and for each $a \in A$, there exists
$b \in A$, $b \ge a$, such that $\varphi(b) \in H$. Taking the $\varphi$-images of the principal upper sets
$\uparrow a = \{b \mid b \in A,\ b \ge a\}$, we can easily convert the net $\varphi(A, \ge)$ into a filter base, while
the corresponding convergence and $\theta$-convergence notions will remain preserved.
A topological space $X$ is said to be compact, if every filter base (or equivalently,
every net) in $X$ has a cluster point. For more detail and other equivalent characterizations
of compactness, especially in terms of open covers, we refer the reader to [3]. We
also remark that in a modern approach to compactness, motivated by the growing interest
of theoretical computer scientists in topology, the Hausdorff separation axiom is
no longer assumed as a part of the definition of compactness (see, for example, [15]).
A topological space is called almost compact [1] if every open filter base in $X$ has a
cluster point. It is clear that every compact topological space is almost compact (but
not vice versa).
The real line $\mathbb{R}$ we consider is a topological space equipped with the natural,
Euclidean topology, generated by all open intervals.
3. MAIN RESULTS. We will start with the following simple example. It illustrates
some of the limitations of Moulin's classical result. The undominated strategies may
exist even if the spaces of strategies are not compact.
Example 3.1. Consider a normal form game of two players with the same sets of
strategies $X_1 = X_2 = [0, 1) \times \{0\} \cup \{1\} \times \{0, 1, \dots\}$. Let the corresponding utility
functions of the players be
$$u_1 = \frac{x_1}{x_1 + x_2} - f(y_2), \quad \text{and} \quad u_2 = \frac{x_2}{x_1 + x_2} - g(y_1),$$
where $f, g$ are arbitrary real-valued functions defined on $\{0\} \cup \mathbb{N}$. It is easy to see that
the pairs $(1, n) \in X_i$, where $n \in \{0, 1, \dots\}$ and $i = 1, 2$, are equivalent, maximal and
undominated strategies of the ith player. However, although the utility functions $u_i$
are continuous, the topology of $X_i$ induced from the real plane is not compact. For
instance, the sequence $\{(1, n) \mid n = 0, 1, 2, \dots\}$ has no cluster point in $X_i$. Hence, the
existence of undominated strategies of the ith player is not a consequence of Moulin's
theorem.
There are many possible interpretations of the previous example, but probably one
of the most important is that there could be a duopolistic competition over market
share with patent wars. The first component $x_i$ of the strategy $(x_i, y_i)$ of the ith player
may represent the market share, while the second component $y_i$ can be interpreted as
obstructions extracting the profit of the player's opponent, in particular, litigation over
patent rights.
For our main theorem and also for a better understanding of some other aspects of
the previous example, we will need the following lemma. The contents of the lemma
are already known: it is essentially contained in (but rather split between) the book
[1] and the paper [14]. Also useful are comments in [5]. We present the result here
with a proof, in order to repeat and concentrate some ideas of these resources in one
place for the reader's convenience.
Lemma 3.1. Let $(X, \tau)$ be a topological space. The following conditions are equivalent:
(i) $(X, \tau)$ is almost compact,
(ii) every filter base in $X$ has a $\theta$-cluster point,
(iii) every net in $X$ has a $\theta$-cluster point,
(iv) every open cover of $X$ has a finite subfamily whose union is dense in $X$.
Proof. Suppose (i), and let $\Phi$ be a filter base in $X$. The family $\Omega = \{U \mid U \in \tau$, there
exists $F \in \Phi$ with $F \subseteq U\}$ is an open filter base, and so it has a cluster point, say
$p \in X$. Let $H$ be a closed neighborhood of $p$. We will show that $H \cap F \ne \emptyset$ for every
$F \in \Phi$. Suppose conversely that $F \subseteq X \setminus H$ for some $F \in \Phi$. Then $X \setminus H \in \Omega$ and
so $p \in \operatorname{cl}(X \setminus H)$. But this is not possible, since $p \in \operatorname{int} H$ and $(\operatorname{int} H) \cap (X \setminus H) = \emptyset$.
Hence, (ii) follows.
Consider (ii) and take a net $\varphi(A, \ge)$ in $X$. The family $\Phi = \{\varphi(\uparrow a) \mid a \in A\}$ is a
filter base with a $\theta$-cluster point, say $p \in X$. Let $H$ be a closed neighborhood of $p$ and
let $a \in A$. Then $H \cap \varphi(\uparrow a) \ne \emptyset$, so there is some $b \in A$, $b \ge a$, with $\varphi(b) \in H$. It
means that $p$ is a $\theta$-cluster point of $\varphi(A, \ge)$ and (iii) holds.
Assume (iii), and take an open cover $\Omega$ of $X$. Let $\Omega_F$ be the family of all finite
unions of elements of $\Omega$. The family $\Omega_F$ is directed by the set inclusion $\subseteq$. Suppose that
for every $U \in \Omega_F$, the set $X \setminus \operatorname{cl} U$ is nonempty, so it contains some element $\varphi(U)$.
The net $\varphi(\Omega_F, \supseteq)$ has a $\theta$-cluster point, say $p \in X$. Since $\Omega_F$ is also a cover, there
is some $V \in \Omega_F$ containing $p$. By the definition of the $\theta$-cluster point, there exists
$W \in \Omega_F$, $W \supseteq V$, such that $\varphi(W) \in \operatorname{cl} V$. But it also holds that $\varphi(W) \in X \setminus \operatorname{cl} W$, so
$\emptyset \ne (X \setminus \operatorname{cl} W) \cap V \subseteq (X \setminus W) \cap W$, which is not possible. Then some element of
$\Omega_F$ must be dense in $X$.
Finally, suppose (iv). Let $\Phi$ be an open filter base in $X$ with no cluster point. Then
$\bigcap \{\operatorname{cl} U \mid U \in \Phi\} = \emptyset$, so $\Omega = \{X \setminus \operatorname{cl} U \mid U \in \Phi\}$ is an open cover of $X$; and since
$\Phi$ is a filter base, it is directed by the inclusion. By (iv), there exists $U \in \Phi$, such that
$X = \operatorname{cl}(X \setminus \operatorname{cl} U)$. Since $X \setminus U$ is a closed set containing $(X \setminus \operatorname{cl} U)$, it also contains
its closure, and so $X \setminus U = X$. But this is not possible according to the fact that a filter
base contains only nonempty elements. Therefore, $\Phi$ has a cluster point and (i) now
follows.
From the previous lemma also follows the well-known fact that for regular spaces,
the compactness and almost compactness coincide. On the other hand, there exists
a Hausdorff almost compact space that is not compact, as the reader may check
in [1]. Hausdorff almost compact spaces are also known as H-closed spaces (also in
[1], or [14]).
Theorem 3.1. Let $G = (X_1, X_2, \dots, X_n, u_1, u_2, \dots, u_n)$ be a normal form game of n
players. Suppose that for some $i \in \{1, 2, \dots, n\}$, $X_i$ is almost compact and the utility
function $u_i$ is a continuous, real-valued function of the argument $x_i \in X_i$. Then the ith
player has an undominated strategy.
Proof. For two strategies $x_i, y_i \in X_i$ we put $x_i \preceq y_i$ (sometimes we will write this
relation as $y_i \succeq x_i$) if they satisfy the condition (1) of the definition of dominance in
Section 2. It is easy to see that $\preceq$ is a preorder on $X_i$. Let $L \subseteq X_i$ be an arbitrary
linearly preordered subset of $X_i$ (that is, for every $a, b \in L$, it holds $a \preceq b$ or $b \preceq a$).
Let $l$ be the identity mapping on $X_i$, restricted to $L$. Then $l$ is a net in an almost
compact topological space $X_i$, so $l$ has a $\theta$-cluster point $p \in X_i$. By the definition, it
means that for every closed neighborhood $H$ of $p$ and every $t \in L$, there exists some
$s \in L$, $s \succeq t$, with $l(s) \in H$.
Now, suppose that the strategies $s_k \in X_k$ of the other players, $k \ne i$, are arbitrarily
chosen, but fixed in this paragraph. We denote $u_i(x_i) = u_i(s_1, s_2, \dots, s_{i-1}, x_i, s_{i+1}, \dots, s_n)$.
Suppose, for a moment, that there exists some $t \in L$ with $u_i(p) < u_i(t)$.
Take $c \in \mathbb{R}$ such that $u_i(p) < c < u_i(t)$. Because of continuity of $u_i$,
$H = u_i^{-1}((-\infty, c])$ is a closed set in $X_i$ whose interior contains $p$. Since $p$ is a $\theta$-cluster
point of $l$, there exists $s \in L$, with $s \succeq t$, such that $s = l(s) \in H$. But
then $u_i(s) \in (-\infty, c]$, which is not possible, because the relation $s \succeq t$ means that
$c < u_i(t) \le u_i(s)$. Consequently, $p$ is an upper bound of $L$. By Zorn's Lemma, there
is a maximal element $m$ in the preordered set $(X_i, \preceq)$. This completes the proof, since
the strategy, maximal with respect to $\preceq$, cannot be dominated.
Note that Zorn's Lemma is usually formulated for partially ordered sets. Using preordered
sets, its appropriate formulation can be found in [7]. Hence, the maximality
of $m$, which is claimed in our proof, is maximality up to the equivalence of strategies.
It means that there may exist another strategy $m' \in X_i$, different from $m$ (and also
maximal), on which the utility function $u_i$ has the same values.
Since every compact topological space is almost compact, the classical version of
Moulin's theorem now follows as a corollary. The reader can also compare Theorem 3.1
with other interesting results and techniques known from the game theory
literature. For instance, H. Salonen in [12] replaced the continuity of the utility function
by its upper semi-continuity. He essentially used a characterization of compactness
by the centered collections of sets (in other words, having the finite intersection
property, [10]), or filters and filter bases, which are topologically equivalent to nets. A
similar technique was also used in [11] for iteratively undominated strategies with the
continuous utility function.
Now, let us check the advantage of Theorem 3.1 over its original, classical version.
Example 3.2. Consider the game already described in Example 3.1. Let us define another
topology on $X_i$, where $i = 1, 2$, by the local base of a general point $(x, y) \in X_i$:
(i) the point $(0, 0)$ has neighborhoods of the form $[0, \varepsilon) \times \{0\}$, $0 < \varepsilon < 1$,
(ii) for every $x \in (0, 1)$, the point $(x, 0)$ has neighborhoods of the form
$(x - \varepsilon, x + \varepsilon) \times \{0\}$, where $0 < \varepsilon < \min\{x, 1 - x\}$,
(iii) for every $n = 0, 1, \dots$, the point $(1, n)$ has neighborhoods having the form
$(1 - \varepsilon, 1) \times \{0\} \cup \{(1, n)\}$, where $0 < \varepsilon < 1$.
The new topology on $X_i$ is now similar to the Euclidean topology on the unit segment
$[0, 1]$, but with one important difference: the right end point of the segment is
present infinitely many times. The space $X_i$ is $T_1$, but certainly non-Hausdorff and
non-compact. Indeed, denoting $Y_n = [0, 1) \times \{0\} \cup \{(1, n)\}$, the family $\{Y_n \mid n = 0, 1, \dots\}$
is an open cover of $X_i$, having no finite subcover. However, we can show that the
new topology is almost compact. Let $\Omega$ be an open cover of $X_i$. The subspace
$Y_0 = [0, 1] \times \{0\} \subseteq X_i$ is compact since it is homeomorphic to the unit segment
$[0, 1]$, so there exists a finite subfamily $\{U_1, U_2, \dots, U_k\} \subseteq \Omega$ with $Y_0 \subseteq \bigcup_{j=1}^{k} U_j$.
Then there is $r \in \{1, 2, \dots, k\}$ such that $(1, 0) \in U_r$. But for every $n = 1, 2, \dots$, it
follows that $(1, n) \in \operatorname{cl} U_r$, so the closures of $\{U_1, U_2, \dots, U_k\}$ cover $X_i$. By condition
(iv) of Lemma 3.1, $X_i$ is almost compact. The utility functions $u_i$ are continuous
functions of the argument $(x_i, y_i)$, since they are continuous on the open subspaces
$Y_n = [0, 1) \times \{0\} \cup \{(1, n)\}$ of $X_i$, $n = 0, 1, \dots$, homeomorphic to $[0, 1]$. Hence, the
existence of the undominated strategies now follows from Theorem 3.1. Note that
spaces similar to $X_i$ are also known as examples of non-Hausdorff manifolds with some
motivation in sheaf theory and mathematical physics (see, for example, [6] or [4]).
ACKNOWLEDGMENTS. The authors are very grateful to both anonymous referees for many valuable suggestions
and comments, in particular related to the game-theoretical part of the content of our paper, and to the
editor for his assistance with preparation of the final form of the manuscript. The authors are also thankful to
Professor V. A. Gorelik from the Dorodnitsyn Computing Center of the Russian Academy of Sciences for his
advice on finding appropriate game-theoretical literature at the initial stage of their work.
This work is supported by a specific research grant FEKT-S-11-2/921 of the Faculty of Electrical Engineering
and Communication, Brno University of Technology.
REFERENCES
1. Á. Császár, General Topology. Akadémiai Kiadó, Budapest, 1978.
2. D. Fudenberg, J. Tirole, Game Theory. MIT Press, Cambridge, 1991.
3. R. Engelking, General Topology. Heldermann Verlag, Berlin, 1989.
4. M. Heller, L. Pysiak, W. Sasin, Geometry of non-Hausdorff spaces and its significance for physics, J. Math. Phys. 52 (2011) 1–7, available at http://dx.doi.org/10.1063/1.3574352.
5. D. S. Janković, θ-regular spaces, Internat. J. Math. Sci. 8 (1985) 615–619, available at http://dx.doi.org/10.1155/S0161171285000667.
6. S. L. Kent, R. A. Mimna, J. K. Tartir, A note on topological properties of non-Hausdorff manifolds, Internat. J. Math. Sci. (2009) 1–4, available at http://dx.doi.org/10.1155/2009/891785.
7. R. E. Meggison, An Introduction to Banach Space Theory. Springer-Verlag, Berlin, 1998.
8. H. Moulin, Théorie des Jeux pour l'Économie et la Politique. Hermann Paris, Collection Méthodes, Paris, 1981.
9. H. Moulin, Game Theory for the Social Sciences. Second and revised edition. New York University Press, New York, 1986.
10. J. Nagata, Modern General Topology. North-Holland, Amsterdam, 1974.
11. K. Ritzberger, Foundations of Non-Cooperative Game Theory. Oxford University Press, Oxford, 2002.
12. H. Salonen, On the existence of undominated Nash equilibria in normal form games, Games and Economic Behavior 14 (1996) 208–219, available at http://dx.doi.org/10.1006/game.1996.0049.
13. W. J. Thron, Topological Structures. Holt, Rinehart and Winston, New York, 1966.
14. N. V. Veličko, H-closed topological spaces, Mat. Sb. 70(112) (1966) 98–102 (Russian).
15. S. Vickers, Topology Via Logic. Cambridge University Press, Cambridge, 1989.
MARTIN KOVÁR is an associate professor of mathematics at Brno University of Technology (Brno, Czech
Republic). He obtained his Ph.D. (1994) from Masaryk University in Brno and his habilitation degree (2006)
from Charles University in Prague. He started to study physics at Masaryk University in 1985. Even as a
student, he was fascinated by the elegance and beauty of classical topological results. His growing interest in
topology finally led to a change in his area of specialization. His main fields of interest are general and applied
topology motivated by problems from computer science and physics. However, theoretical and mathematical
physics still remain in the extended range of his scientific interests. To date, he has published approximately
30 research articles. He strongly believes that topology is a fascinating mathematical discipline of the future,
with an excellent, but so far underused, potential for many applications, including computer science, physics,
and modern technologies. He also believes that science is fun and that the liberty of scientific research is one
of the greatest values and achievements of humanity and should be carefully protected.
Department of Mathematics, Faculty of Electrical Engineering and Communication,
Brno University of Technology, Technická 8, Brno, 616 00, Czech Republic
kovar@feec.vutbr.cz
ALENA CHERNIKAVA received her master's degree (2009) at the State University of Minsk in Belarus.
She excelled at her studies, and in 2010 she began Ph.D. study at Brno University of Technology in the Czech
Republic. Her principal fields of interest are applied topology, formal concept analysis, and game theory. She
is an author or co-author of several research papers.
Department of Mathematics, Faculty of Electrical Engineering and Communication,
Brno University of Technology, Technická 8, Brno, 616 00, Czech Republic
xcerni07@stud.feec.vutbr.cz
An Asymptotic Formula for $(1 + 1/x)^x$ Based on the Partition Function
Chao-Ping Chen and Junesang Choi
Abstract. We present a method to produce estimations of the natural logarithmic constant
e, accurate to as many decimal places as we desire. The method is based on an asymptotic
formula for $(1 + 1/x)^x$, which uses the partition function.
In contrast with the continuing fascination with finding as many digits as possible of
the decimal approximation of $\pi$, few mathematicians seem interested in computing
the base e of the natural logarithms to a comparable precision (see [2, 6]). Joost Bürgi
seems to have formulated the first approximation to e around 1620, obtaining three-decimal-place
accuracy (see [3, p. 31], [5], and [6, pp. 26–27]).
One classical computation of e depends upon the well-known limit
$$e = \lim_{x \to \infty} \left(1 + \frac{1}{x}\right)^{x}. \tag{1}$$
Indeed, it is easy to see that the function $f(x) := (1 + 1/x)^x$ increases and is bounded
above by 3 on the interval $[1, \infty)$; thus, a larger value of $x$ gives a more accurate
approximation to e. For example, $f(10^5) = 2.71826823\ldots$ approximates e to four
decimal places.
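The slow convergence of the limit (1) is easy to observe numerically. The sketch below is illustrative only; it evaluates $f(x) = (1 + 1/x)^x$ through `exp` and `log1p` (a choice made here to limit floating-point rounding, not something taken from the note) and prints the gap to e for a few powers of ten.

```python
import math


def f(x):
    """(1 + 1/x)**x, computed as exp(x * log(1 + 1/x)) to limit rounding error."""
    return math.exp(x * math.log1p(1.0 / x))


for power in range(1, 7):
    x = 10 ** power
    print(f"x = 10^{power}:  f(x) = {f(x):.10f}   e - f(x) = {math.e - f(x):.3e}")
```

Each tenfold increase in $x$ gains roughly one further correct digit, which is why $x = 10^5$ still yields only four correct decimal places.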
Another classical computation of e uses Isaac Newton's first version (1669) of what
is now known as the Maclaurin series expansion for $e^x$ [2]:
$$e^{x} = \sum_{j=0}^{\infty} \frac{x^{j}}{j!} = 1 + x + \frac{x^{2}}{2!} + \frac{x^{3}}{3!} + \cdots \quad \text{for all } x \in \mathbb{R}. \tag{2}$$
Setting $x = 1$ in (2) and choosing a large value of $n$, we obtain the partial sum
$$\sum_{j=0}^{n} \frac{1}{j!} = 1 + 1 + \frac{1}{2!} + \frac{1}{3!} + \cdots + \frac{1}{n!},$$
which gives a simple, direct approximation to e that is the best way of calculating e to
high accuracy [1, 2]. Present numerical values of e are derived using either optimized
versions of this Maclaurin series (2) or the continued-fraction expansion approach
initiated by Euler [2].
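For comparison with the limit (1), the partial sums of (2) at $x = 1$ converge very quickly. The following sketch, with a function name chosen here for illustration, accumulates the partial sum exactly with rational arithmetic and reports the remaining error.

```python
import math
from fractions import Fraction


def partial_sum(n):
    """Exact value of sum_{j=0}^{n} 1/j! as a Fraction."""
    total = Fraction(0)
    term = Fraction(1)            # 1/0!
    for j in range(n + 1):
        total += term
        term /= (j + 1)           # turn 1/j! into 1/(j+1)!
    return total


for n in (5, 10, 15):
    print(n, float(partial_sum(n)), math.e - float(partial_sum(n)))
```

Ten terms already agree with e to about seven decimal places, whereas the limit (1) needs $x$ of order $10^7$ for comparable accuracy.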
A further classical approach to approximating e uses the Maclaurin series expansion
of $\ln(1 + x)$:
$$\ln(1 + x) = \sum_{j=1}^{\infty} \frac{(-1)^{j-1}}{j}\, x^{j} \quad \text{for } -1 < x \le 1. \tag{3}$$
MSC: Primary 05A17, Secondary 11P81
The only example of this alternative approach (that the authors of [2] have found in the
literature) is given by replacing $x$ with $1/x$ in (3) and then multiplying the resulting
series by $x$ to get
$$x \ln\left(1 + \frac{1}{x}\right) = 1 - \frac{1}{2x} + \frac{1}{3x^{2}} - \frac{1}{4x^{3}} + \frac{1}{5x^{4}} - \frac{1}{6x^{5}} + \frac{1}{7x^{6}} - \cdots \tag{4}$$
for $x < -1$ or $x \ge 1$. By exponentiating each side of (4) and collecting the same powers
of $1/x$ with the help of the Maclaurin series (2) for $e^x$, we find an approximation to
e that has been known by mathematicians and bankers alike since the early seventeenth
century (see [2, Eq. (4)]). For $x < -1$ or $x \ge 1$, we obtain
$$\left(1 + \frac{1}{x}\right)^{x} = e\left(1 - \frac{1}{2x} + \frac{11}{24x^{2}} - \frac{7}{16x^{3}} + \frac{2447}{5760x^{4}} - \frac{959}{2304x^{5}} + \frac{238043}{580608x^{6}} - \cdots\right). \tag{5}$$
Setting, for example, $x = 100{,}000$ in the left-hand side of (5) yields an approximation
to e that is accurate to four decimal places.
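A quick numerical check of (5) is to compare the left-hand side with truncations of the right-hand side. The sketch below is illustrative: the coefficient list is copied from (5), while the helper names and the test point $x = 10$ are choices made here.

```python
import math
from fractions import Fraction

# Coefficients of 1/x^j in (5) for j = 0, ..., 6.
COEFFS = [Fraction(1), Fraction(-1, 2), Fraction(11, 24), Fraction(-7, 16),
          Fraction(2447, 5760), Fraction(-959, 2304), Fraction(238043, 580608)]


def lhs(x):
    """(1 + 1/x)**x via exp/log1p."""
    return math.exp(x * math.log1p(1.0 / x))


def truncated_rhs(x, terms):
    """e times the first `terms` terms of the bracket in (5)."""
    return math.e * sum(float(a) / x ** j for j, a in enumerate(COEFFS[:terms]))


x = 10
for terms in range(1, len(COEFFS) + 1):
    print(terms, abs(lhs(x) - truncated_rhs(x, terms)))
```

At $x = 10$ each additional term shrinks the discrepancy by roughly a factor of ten, as expected from a series in powers of $1/x$.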
Motivated by this technique, Knox and Brothers [5] (see also Brothers and Knox
[2]) present an interesting and useful method that yields a new and more accurate
approximation to e by combining two good approximations. We choose to demonstrate
one of their many results here (see [2] or [5]). Adding approximation (5) and
the approximation obtained by replacing $x$ by $-x$ in (5), and multiplying the resulting
identity by 1/2, they obtain the following better approximation to e than that given
by (5):
$$\frac{1}{2}\left[\left(1 + \frac{1}{x}\right)^{x} + \left(1 - \frac{1}{x}\right)^{-x}\right] = e\left(1 + \frac{11}{24x^{2}} + \frac{2447}{5760x^{4}} + \frac{238043}{580608x^{6}} + \cdots\right). \tag{6}$$
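The gain from averaging in (6) shows up immediately in floating-point arithmetic. This is a minimal sketch; the function names and the choice $x = 1000$ are assumptions made for illustration.

```python
import math


def one_sided(x):
    """Left-hand side of (5)."""
    return (1 + 1 / x) ** x


def averaged(x):
    """Left-hand side of (6): mean of the approximations at x and at -x."""
    return 0.5 * ((1 + 1 / x) ** x + (1 - 1 / x) ** (-x))


x = 1000
print("one-sided error:", abs(one_sided(x) - math.e))
print("averaged error: ", abs(averaged(x) - math.e))
```

The odd powers of $1/x$ cancel in the average, so the leading error drops from about $e/(2x)$ to about $11e/(24x^2)$.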
Even though we can obtain as many coefficients as we please in the right-hand side
of (5) by using Mathematica, here we aim at giving a formula for determining these
coefficients. Our formula is based mainly on the partition function (see, e.g., [7, 9]).
For our later use, we introduce the following set of partitions of an integer
$n \in \mathbb{N} = \mathbb{N}_0 \setminus \{0\} := \{1, 2, 3, \dots\}$:
$$A_n := \left\{ (k_1, k_2, \dots, k_n) \in \mathbb{N}_0^{\,n} : k_1 + 2k_2 + \cdots + nk_n = n \right\}. \tag{7}$$
In number theory, the partition function $p(n)$ represents the number of possible partitions
of $n \in \mathbb{N}$ (i.e., the number of distinct ways of representing $n$ as a sum of natural
numbers, regardless of order). By convention, $p(0) = 1$ and $p(n) = 0$ for $n$ a negative
integer. For more information on the partition function $p(n)$, please refer to [7] and the
references therein. The first several values of the partition function $p(n)$ are (starting
with $p(0) = 1$, see [9]):
1, 1, 2, 3, 5, 7, 11, 15, 22, 30, 42, . . . .
It is easy to see that the cardinality of the set $A_n$ is equal to the partition function $p(n)$.
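The correspondence between $A_n$ and $p(n)$ can be confirmed by direct enumeration for small $n$. The sketch below uses a deliberately naive brute-force search; the function name is an assumption, and the method is only practical for small $n$.

```python
from itertools import product


def partition_set(n):
    """All (k_1, ..., k_n) of non-negative integers with k_1 + 2*k_2 + ... + n*k_n = n."""
    # k_m can be at most n // m, which bounds the search space.
    ranges = [range(n // m + 1) for m in range(1, n + 1)]
    return [ks for ks in product(*ranges)
            if sum(m * k for m, k in enumerate(ks, start=1)) == n]


print([len(partition_set(n)) for n in range(1, 11)])
print(sorted(partition_set(3)))
```

Here $k_m$ counts how many times the part $m$ occurs, so each tuple in $A_n$ encodes exactly one partition of $n$.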
Now we are ready to present a formula that determines the coefficients $a_j$ in (8) with
the help of the partition function, as asserted by the following theorem.
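Theorem. The following approximation formula holds true:
$$\left(1 + \frac{1}{x}\right)^{x} = e \sum_{j=0}^{\infty} \frac{a_j}{x^{j}} \quad \text{as } x \to \infty, \tag{8}$$
where the coefficients $a_j$ ($j \in \mathbb{N}$) are given by
$$a_0 := 1 \quad \text{and} \quad a_j = (-1)^{j} \sum_{(k_1, k_2, \dots, k_j) \in A_j} \frac{1}{k_1!\, k_2! \cdots k_j!} \left(\frac{1}{2}\right)^{k_1} \left(\frac{1}{3}\right)^{k_2} \cdots \left(\frac{1}{j+1}\right)^{k_j}, \tag{9}$$
where the $A_j$ (for $j \in \mathbb{N}$) are given in (7).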
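Formula (9) translates directly into a short exact computation. The sketch below assumes the reading of (9) in which the sign $(-1)^j$ multiplies a sum of positive terms $\prod_m (1/(m+1))^{k_m}/k_m!$, which is the reading that reproduces the coefficients in (5); the function names are illustrative.

```python
from fractions import Fraction
from math import factorial


def partition_set(n):
    """Yield the tuples (k_1, ..., k_n) with k_1 + 2*k_2 + ... + n*k_n = n."""
    def rec(m, remaining):
        if m > n:
            if remaining == 0:
                yield ()
            return
        for k in range(remaining // m + 1):
            for tail in rec(m + 1, remaining - m * k):
                yield (k,) + tail
    yield from rec(1, n)


def a(j):
    """Coefficient a_j of 1/x^j, computed from the partition set A_j as in (9)."""
    if j == 0:
        return Fraction(1)
    total = Fraction(0)
    for ks in partition_set(j):
        term = Fraction(1)
        for m, k in enumerate(ks, start=1):
            term *= Fraction(1, m + 1) ** k / factorial(k)
        total += term
    return (-1) ** j * total


print([str(a(j)) for j in range(7)])
```

Exact rational arithmetic avoids any rounding, so the output can be compared term by term with the fractions displayed in (5).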
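Proof. In view of the Maclaurin series (3) of $\ln(1 + x)$, we can let
$$x \ln\left(1 + \frac{1}{x}\right) = 1 + \ln\left(1 + \sum_{j=1}^{q} \frac{a_j}{x^{j}}\right) + O(x^{-q-1}) \quad \text{for } x \to \infty \text{ and } q \in \mathbb{N},$$
where $a_1, \dots, a_q$ are real numbers to be determined. From the fundamental theorem
of algebra, we see that there exist unique complex numbers $x_1, \dots, x_q$ such that
$$1 + \frac{a_1}{x} + \cdots + \frac{a_q}{x^{q}} = \left(1 + \frac{x_1}{x}\right) \cdots \left(1 + \frac{x_q}{x}\right). \tag{10}$$
By using the following series expansion:
$$\ln\left(1 + \frac{z}{x}\right) = \sum_{j=1}^{q} \frac{(-1)^{j-1} z^{j}}{j\, x^{j}} + O(x^{-q-1}) \quad \text{for } |z| < |x| \text{ and } x \to \infty,$$
we obtain
$$\ln\left(1 + \frac{a_1}{x} + \cdots + \frac{a_q}{x^{q}}\right) = \sum_{j=1}^{q} \frac{(-1)^{j-1} S_j}{j\, x^{j}} + O(x^{-q-1}) \quad \text{for } x \to \infty, \tag{11}$$
where
$$S_j = x_1^{j} + \cdots + x_q^{j} \quad \text{for } j = 1, \dots, q.$$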
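Replacing $x$ by $1/x$ in (3) and multiplying the resulting equation by $x$, we get
$$x \ln\left(1 + \frac{1}{x}\right) = 1 - \sum_{j=1}^{q} \frac{(-1)^{j-1}}{(j+1)\, x^{j}} + O(x^{-q-1}) \quad \text{for } x \to \infty. \tag{12}$$
We then find from (11) and (12) that
$$S_j = -\frac{j}{j+1} \quad \text{for } j = 1, \dots, q, \tag{13}$$
that is,
$$\begin{cases} x_1 + \cdots + x_q = -\dfrac{1}{2}, \\[4pt] x_1^{2} + \cdots + x_q^{2} = -\dfrac{2}{3}, \\ \qquad \vdots \\ x_1^{q} + \cdots + x_q^{q} = -\dfrac{q}{q+1}. \end{cases} \tag{14}$$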
Let
$$P_q(x) = x^{q} + b_1 x^{q-1} + \cdots + b_{q-1} x + b_q$$
be a polynomial with zeros $x_1, \dots, x_q$ satisfying the system of equations (14). So we have
$$P_q(x) = (x - x_1) \cdots (x - x_q). \tag{15}$$
The Newton formulas (see, e.g., [4] and references therein) give the connection between the coefficients $b_j$ and the power sums $S_j$:
$$S_j + S_{j-1} b_1 + S_{j-2} b_2 + \cdots + S_1 b_{j-1} + j\,b_j = 0 \quad \text{for } j = 1, \dots, q.$$
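It is known [4] that $b_j$ can be expressed in terms of $S_j$:
$$b_j = \sum_{(k_1,k_2,\dots,k_j)\in A_j} \frac{(-1)^{k_1+k_2+\cdots+k_j}}{k_1!\,k_2!\cdots k_j!} \left(\frac{S_1}{1}\right)^{k_1} \left(\frac{S_2}{2}\right)^{k_2} \cdots \left(\frac{S_j}{j}\right)^{k_j}, \tag{16}$$
where the $A_j$ (for $j \in \mathbb{N}$) are given in (7).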
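From (15), we obtain
$$(-1)^{q} x^{-q} P_q(-x) = \left(1+\frac{x_1}{x}\right) \cdots \left(1+\frac{x_q}{x}\right).$$
We thus have
$$1 + \frac{(-1) b_1}{x} + \frac{(-1)^{2} b_2}{x^{2}} + \cdots + \frac{(-1)^{q-1} b_{q-1}}{x^{q-1}} + \frac{(-1)^{q} b_q}{x^{q}} = \left(1+\frac{x_1}{x}\right) \cdots \left(1+\frac{x_q}{x}\right). \tag{17}$$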
We see from (10) and (17) that the coefficients $a_j$ are given by
$$a_j = (-1)^{j} b_j = (-1)^{j} \sum_{(k_1,k_2,\dots,k_j)\in A_j} \frac{(-1)^{k_1+k_2+\cdots+k_j}}{k_1!\,k_2!\cdots k_j!} \left(\frac{S_1}{1}\right)^{k_1} \left(\frac{S_2}{2}\right)^{k_2} \cdots \left(\frac{S_j}{j}\right)^{k_j}, \tag{18}$$
where the $S_j$ are given in (13). Finally, substituting the expression (13) into (18) yields (9). This completes the proof.
Remark. Here we give explicit numerical values of the first few terms $a_j$ by using the partition set (7) and the formula (9). This shows how easily we can determine the $a_j$ in (9). Obviously,
$$a_1 = -\sum_{k_1 = 1} \frac{1}{k_1!} \left(\frac{1}{2}\right)^{k_1} = -\frac{1}{2}.$$
For $k_1 + 2k_2 = 2$, since $p(2) = 2$, the partition set $A_2$ in (7) is seen to have two elements:
$$A_2 = \{(0, 1), (2, 0)\}.$$
From (9), we have
$$a_2 = \sum_{(k_1,k_2)\in A_2} \frac{1}{k_1!\,k_2!} \left(\frac{1}{2}\right)^{k_1} \left(\frac{1}{3}\right)^{k_2} = \frac{11}{24}.$$
For $k_1 + 2k_2 + 3k_3 = 3$, since $p(3) = 3$, as above, the partition set $A_3$ in (7) contains three elements:
$$A_3 = \{(0, 0, 1), (1, 1, 0), (3, 0, 0)\}.$$
We then find from (9) that
$$a_3 = -\sum_{(k_1,k_2,k_3)\in A_3} \frac{1}{k_1!\,k_2!\,k_3!} \left(\frac{1}{2}\right)^{k_1} \left(\frac{1}{3}\right)^{k_2} \left(\frac{1}{4}\right)^{k_3} = -\frac{7}{16}.$$
Likewise, the partition sets $A_4$ and $A_5$ have $5 = p(4)$ and $7 = p(5)$ elements, respectively, and so
$$A_4 = \{(0, 0, 0, 1), (1, 0, 1, 0), (0, 2, 0, 0), (2, 1, 0, 0), (4, 0, 0, 0)\}$$
and
$$A_5 = \{(0, 0, 0, 0, 1), (1, 0, 0, 1, 0), (0, 1, 1, 0, 0), (2, 0, 1, 0, 0), (1, 2, 0, 0, 0), (3, 1, 0, 0, 0), (5, 0, 0, 0, 0)\},$$
which yields
$$a_4 = \frac{2447}{5760} \quad \text{and} \quad a_5 = -\frac{959}{2304}.$$
We note that the explicit numerical values of $a_j$ (for $j = 1, 2, 3, 4, 5$) here correspond with the coefficients of $1/x^{j}$ (for $j = 1, 2, 3, 4, 5$) in (5), respectively.
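The computation in this remark can be reproduced mechanically. The following is a minimal sketch in Python, assuming exact rational arithmetic via the standard `fractions` module; the function names `partition_set` and `coefficient` are illustrative and do not come from the note. It enumerates the partition set $A_j$ of (7) and evaluates formula (9).

```python
from fractions import Fraction
from math import factorial


def partition_set(n):
    """Yield every tuple (k_1, ..., k_n) of nonnegative integers
    with k_1 + 2*k_2 + ... + n*k_n = n, i.e. the set A_n of (7)."""
    def rec(i, remaining, ks):
        if i > n:
            if remaining == 0:
                yield tuple(ks)
            return
        # k_i may be anything from 0 up to remaining // i
        for k in range(remaining // i + 1):
            yield from rec(i + 1, remaining - i * k, ks + [k])

    yield from rec(1, n, [])


def coefficient(j):
    """Evaluate a_j from formula (9) as an exact fraction."""
    if j == 0:
        return Fraction(1)
    total = Fraction(0)
    for ks in partition_set(j):
        term = Fraction(1)
        for i, k in enumerate(ks, start=1):
            # factor (1/(i+1))^{k_i} / k_i!
            term *= Fraction(1, i + 1) ** k / factorial(k)
        total += term
    return (-1) ** j * total


if __name__ == "__main__":
    for j in range(6):
        print(j, coefficient(j))
```

The number of tuples produced by `partition_set(n)` equals $p(n)$, which gives a quick sanity check on the enumeration.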
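By using (8), we find that
$$\frac{1}{2}\left[\left(1+\frac{1}{x}\right)^{x} + \left(1-\frac{1}{x}\right)^{-x}\right] = e \sum_{j=0}^{\infty} \frac{\left(1+(-1)^{j}\right) a_j}{2x^{j}} \quad \text{for } x \to \infty, \tag{19}$$
where the $a_j$ (for $j \in \mathbb{N}_0$) are given in (9).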
ACKNOWLEDGMENTS. Thanks to the Editor, Professor Scott Chapman, for his several enduring encour-
agements to improve the exposition of this note and to the anonymous referees for their constructive comments.
Thanks also to Professor Jack R. Quine and Professor Bettye Anne Case of Florida State University for their
help in improving the exposition of this note. This research was supported by the Basic Science Research Pro-
gram through the National Research Foundation of Korea funded by the Ministry of Education, Science and
Technology of the Republic of Korea (2012-0002957).
REFERENCES
1. G. Arfken, Mathematical Methods for Physicists. Third edition. Academic Press, New York, 1985.
2. H. J. Brothers, J. A. Knox, New closed-form approximations to the logarithmic constant e, Math. Intelligencer 20 (1998) 25–29.
3. H. T. Davis, Tables of the Mathematical Functions. Vol. I, The Principia Press of Trinity University, San Antonio, Texas, 1963.
4. H. W. Gould, The Girard-Waring power sum formulas for symmetric functions and Fibonacci sequences, Fibonacci Quart. 37 (1999) 135–140.
5. J. A. Knox, H. J. Brothers, Novel series-based approximations to e, College Math. J. 30 (1999) 269–275.
6. E. Maor, e: The Story of a Number. Princeton University Press, Princeton, New Jersey, 1994.
7. Wikipedia contributors, Partition (number theory), Wikipedia, The Free Encyclopedia, available at http://en.wikipedia.org/wiki/Partition_function_(number_theory)#Partition_function.
8. J. Sondow, E. W. Weisstein, e. From MathWorld, A Wolfram Web Resource, available at http://mathworld.wolfram.com/e.html.
9. N. J. A. Sloane, a(n) = number of partitions of n (the partition numbers). Maintained by The OEIS Foundation, available at http://oeis.org/A000041.
CHAO-PING CHEN received his Bachelor of Science degree from Henan Normal University (China) in
1986 and his Master of Science Degrees from Southwest Jiaotong University (China) in 1995. He currently
teaches at Henan Polytechnic University (Jiaozuo) in China.
School of Mathematics and Informatics, Henan Polytechnic University, Jiaozuo City 454003,
Henan Province, China
chenchaoping@sohu.com
JUNESANG CHOI received his B.A. from Gyeongsang National University (Republic of Korea) in 1981 and
his Ph.D. from the Florida State University in 1991. He currently teaches at Dongguk University (Gyeongju)
in the Republic of Korea. See http://wwwk.dongguk.ac.kr/~junesang/.
Department of Mathematics, Dongguk University, Gyeongju 780-714, Republic of Korea
junesang@mail.dongguk.ac.kr
NOTES
Edited by Sergei Tabachnikov
Stirling's Approximation for Central Extended Binomial Coefficients
Steffen Eger
Abstract. We derive asymptotic formulas for central extended binomial coefficients, which are generalizations of binomial coefficients, using the distribution of the sum of independent discrete uniform random variables with the Central Limit Theorem and a local limit variant.
1. STIRLING'S FORMULA AND CENTRAL BINOMIAL COEFFICIENTS. For a nonnegative integer $k$, Stirling's formula,
$$k! \sim \sqrt{2\pi k}\left(\frac{k}{e}\right)^{k},$$
where $e$ is Euler's number, yields an approximation of the central binomial coefficient $\binom{k}{k/2}$ using $\binom{k}{m} = \frac{k!}{m!\,(k-m)!}$ as
$$\binom{k}{k/2} \sim \frac{2^{k+1}}{\sqrt{2\pi k}},$$
where we write $a_k \sim b_k$ as shorthand for $\lim_{k\to\infty} \frac{a_k}{b_k} = 1$. In our current note, we derive asymptotic formulas for central extended binomial, or polynomial, coefficients (cf. [2, 3, 7]). These coefficients appear in the extended binomial triangles (which we also call $(l + 1)$-nomial, polynomial, or multinomial triangles [8]), which are generalizations of binomial, or Pascal, triangles, where entries in row $k$ are defined as coefficients of the polynomial $(1 + x + x^{2} + \cdots + x^{l})^{k}$ for $l \ge 0$. Our derivation is not based upon asymptotics of factorials, but upon the limiting distribution of the sum of discrete uniform random variables.$^{1}$
2. EXTENDED BINOMIAL TRIANGLES. As a generalization of binomial triangles, $(l + 1)$-nomial triangles, for $l \ge 0$, are defined in the following way. Starting with a 1 in row zero, construct an entry in row $k$, for $k \ge 1$, by adding the overlying $(l + 1)$ entries in row $(k - 1)$ (some of these entries are taken as zero if not defined); thereby, row $k$ has $(kl + 1)$ entries. For example, the binomial ($l = 1$), trinomial ($l = 2$), and quadrinomial triangles ($l = 3$) start as follows,
MSC: Primary 11B65, Secondary 11N37; 60G50
$^{1}$ Throughout, we assume that all fractional values such as $x = \frac{kl}{2}$ are integral when used in the context of extended binomial coefficients. If this is not the case, then replace respective quantities with their floor, $\lfloor x \rfloor$, the largest integer less than or equal to $x$.
Binomial triangle (l = 1):
1
1 1
1 2 1
1 3 3 1
Trinomial triangle (l = 2):
1
1 1 1
1 2 3 2 1
1 3 6 7 6 3 1
Quadrinomial triangle (l = 3):
1
1 1 1 1
1 2 3 4 3 2 1
1 3 6 10 12 12 10 6 3 1
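The construction rule of this section (each entry is the sum of the $(l + 1)$ overlying entries in the previous row) can be sketched as follows. This is an illustrative Python sketch, not code from the note; the function name `extended_binomial_row` is an assumption made for the example.

```python
def extended_binomial_row(k, l):
    """Return row k of the (l+1)-nomial triangle as a list of k*l + 1 integers.

    Row 0 is [1]; each further row spreads every entry of the previous row
    over the l + 1 positions below it, which is the same as summing the
    l + 1 overlying entries (missing entries count as zero).
    """
    row = [1]
    for _ in range(k):
        new_row = [0] * (len(row) + l)
        for i, value in enumerate(row):
            for shift in range(l + 1):
                new_row[i + shift] += value
        row = new_row
    return row


if __name__ == "__main__":
    for l in (1, 2, 3):
        for k in range(4):
            print(l, k, extended_binomial_row(k, l))
```

Spreading each entry over $l + 1$ positions is equivalent to multiplying the row polynomial by $1 + x + \cdots + x^{l}$, so the rows are the coefficient lists of $(1 + x + \cdots + x^{l})^{k}$.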
In the $(l + 1)$-nomial triangle, the $n$th entry, for $0 \le n \le kl$, in row $k$, which we denote by $\binom{k}{n}_{l+1}$, has the following interpretation. It is the coefficient of $x^{n}$ in the expansion of
$$(1 + x + x^{2} + \cdots + x^{l})^{k} = \sum_{n=0}^{kl} \binom{k}{n}_{l+1} x^{n}. \tag{1}$$
It has been shown that $\binom{k}{n}_{l+1}$ denotes the number of restricted integer compositions (for a definition, see, e.g., [9] and many others) of the nonnegative integer $n$ with $k$ parts $\pi_1, \dots, \pi_k$, each from the set $\{0, 1, \dots, l\}$ (cf. [5]), and allows the following representation,
$$\binom{k}{n}_{l+1} = \sum_{\substack{k_0 \ge 0, \dots, k_l \ge 0 \\ k_0 + \cdots + k_l = k \\ 0\cdot k_0 + 1\cdot k_1 + \cdots + l\cdot k_l = n}} \binom{k}{k_0, \dots, k_l}, \tag{2}$$
where $\binom{k}{k_0,\dots,k_l}$ is a multinomial coefficient, defined as $\frac{k!}{k_0! \cdots k_l!}$, for nonnegative integers $k_0, \dots, k_l$. We can verify representation (2) by noting that for real numbers $x_0, \dots, x_l$, the multinomial theorem (cf. [15]) states that
$$(x_0 + x_1 + \cdots + x_l)^{k} = \sum_{\substack{k_0 \ge 0, \dots, k_l \ge 0 \\ k_0 + \cdots + k_l = k}} \binom{k}{k_0, \dots, k_l} x_0^{k_0} \cdots x_l^{k_l}.$$
Thus, setting $x_i = x^{i}$ for $i = 0, \dots, l$,
$$(1 + x + x^{2} + \cdots + x^{l})^{k} = \sum_{\substack{k_0 \ge 0, \dots, k_l \ge 0 \\ k_0 + \cdots + k_l = k}} \binom{k}{k_0, \dots, k_l} x^{0\cdot k_0 + \cdots + l\cdot k_l}, \tag{3}$$
so that comparing coefficients of the right-hand sides of (1) and (3) leads to (2).
3. GENERALIZED STIRLING'S APPROXIMATION. Our strategy for deriving approximation formulas for central extended binomial coefficients is as follows. First, we determine the asymptotic distribution of the sum of discrete uniform variables, which we easily find to be a normal distribution by the Central Limit Theorem (CLT). Then, we determine the exact distribution, which turns out to yield the normalized extended binomial coefficients $\binom{k}{n}_{l+1}$. By relating the density of the asymptotic distribution to the density of the exact distribution (e.g., via a local limit argument), we obtain an extended binomial analogue of Stirling's approximation to central binomial coefficients.
3.1. Step 1: Asymptotic distribution of the sum of discrete uniform variables. Let $k$ be a positive integer and let $l$ be a nonnegative integer. Let $X_j$, for $j = 1, \dots, k$, be identically and independently distributed random draws from the discrete uniform distribution on the set $\{0, \dots, l\}$, and let $S_k$ be their sum,
$$S_k = \sum_{j=1}^{k} X_j.$$
By standard moments of the uniform distribution, the mean and variance of each $X_j$ are given by
$$\mu = E[X_j] = \frac{l}{2}, \quad \text{and} \quad \sigma^{2} = \mathrm{Var}[X_j] = \frac{(l+1)^{2} - 1}{12}.$$
Hence, by independent and identical distribution of $X_1, \dots, X_k$, and application of the CLT, the random variable $\sqrt{k}\left(\frac{S_k}{k} - \mu\right)$ converges, as $k \to \infty$, in distribution to a normally $N(0, \sigma^{2})$ distributed random variable. Recall that convergence in distribution precisely means that the cumulative distribution function of $\sqrt{k}\left(\frac{S_k}{k} - \mu\right)$ converges pointwise to the cumulative distribution function of the $N(0, \sigma^{2})$ distribution.
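The two moments quoted in Step 1 can be checked directly from the definition. The sketch below is illustrative only (the function name `uniform_moments` is an assumption); it computes the mean and variance of the discrete uniform distribution on $\{0, \dots, l\}$ exactly and compares them with $l/2$ and $((l+1)^{2} - 1)/12$.

```python
from fractions import Fraction


def uniform_moments(l):
    """Exact mean and variance of the discrete uniform distribution on {0, ..., l}."""
    p = Fraction(1, l + 1)  # each value has probability 1/(l+1)
    mean = sum(p * v for v in range(l + 1))
    variance = sum(p * (v - mean) ** 2 for v in range(l + 1))
    return mean, variance


if __name__ == "__main__":
    for l in range(1, 6):
        mean, variance = uniform_moments(l)
        print(l, mean, variance, Fraction((l + 1) ** 2 - 1, 12))
```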
3.2. Step 2: Exact distribution of the sum of discrete uniform random variables. We now determine exactly the probability that $S_k$ takes on the integer value $n$, for $0 \le n \le kl$. To do so, we consider isomorphic copies $\tilde{X}_j$ of $X_j$, which are independently and identically multinomially distributed with probabilities $p_0 = \cdots = p_l = \frac{1}{l+1}$ of types 0 to $l$. Each $\tilde{X}_j = (A_0, \dots, A_l)$ is vector-valued, with $P[\tilde{X}_j = (a_0, \dots, a_l)] = \frac{1}{l+1}$ for nonnegative integers $a_s$, with $a_0 + \cdots + a_l = 1$, where $A_s$ denotes the number of times an event of type $s$, for $s = 0, \dots, l$, occurs. Then, the sum
$$\tilde{S}_k = \tilde{X}_1 + \cdots + \tilde{X}_k$$
has the interpretation of representing the event of drawing with replacement $k$ balls of $(l + 1)$ different types from a bag, where the probability of drawing type $s = 0, \dots, l$ is $\frac{1}{l+1}$. Thus, by the standard interpretation of the multinomial distribution, $\tilde{S}_k$ has density
$$P[\tilde{S}_k = (a_0, \dots, a_l)] = P[A_0 = a_0, \dots, A_l = a_l] = \binom{k}{a_0, \dots, a_l} \left(\frac{1}{l+1}\right)^{k},$$
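where $a_0 + \cdots + a_l = k$ for nonnegative integers $a_0, \dots, a_l$. Then, if $\tilde{S}_k = (a_0, \dots, a_l)$, $S_k$, the variable corresponding to $\tilde{S}_k$, represents the integer $0 \cdot a_0 + \cdots + l \cdot a_l$. Thus, for $n$ such that $0 \le n \le kl$,
$$\begin{aligned} P[S_k = n] &= \sum_{\substack{a_0 \ge 0, \dots, a_l \ge 0 \\ a_0 + \cdots + a_l = k \\ 0\cdot a_0 + \cdots + l\cdot a_l = n}} P[\tilde{S}_k = (a_0, \dots, a_l)] \\ &= \left(\frac{1}{l+1}\right)^{k} \sum_{\substack{a_0 \ge 0, \dots, a_l \ge 0 \\ a_0 + \cdots + a_l = k \\ 0\cdot a_0 + \cdots + l\cdot a_l = n}} \binom{k}{a_0, \dots, a_l} = \left(\frac{1}{l+1}\right)^{k} \binom{k}{n}_{l+1}, \end{aligned}$$
using representation (2).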
An arguably more straightforward derivation of the exact distribution of $S_k$, making use of probability generating functions (pgfs), can be given by noting that the pgf $G_{X_j}(x) = \sum_{n \ge 0} P[X_j = n]\,x^{n}$ of each $X_j$ is given by
$$G_{X_j}(x) = \frac{1}{l+1} \sum_{n=0}^{l} x^{n}.$$
Whence, by independence of $X_1, \dots, X_k$, the pgf of $S_k$ is given as
$$G_{S_k}(x) = G_{X_1}(x) \cdots G_{X_k}(x) = \left(\frac{1}{l+1}\right)^{k} \left(\sum_{n=0}^{l} x^{n}\right)^{k} = \left(\frac{1}{l+1}\right)^{k} \sum_{n=0}^{kl} \binom{k}{n}_{l+1} x^{n}.$$
Thus,
$$P[S_k = n] = \frac{G_{S_k}^{(n)}(0)}{n!} = \left(\frac{1}{l+1}\right)^{k} \frac{n!}{n!} \binom{k}{n}_{l+1} = \left(\frac{1}{l+1}\right)^{k} \binom{k}{n}_{l+1},$$
where, by $G_X^{(n)}(0)$, we denote the $n$th derivative of $G_X$, evaluated at zero.
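The identity $P[S_k = n] = (l+1)^{-k} \binom{k}{n}_{l+1}$ can be checked for small parameters. The following Python sketch is an illustration under stated assumptions (the names `sum_distribution` and `count_compositions` are hypothetical): it builds the exact distribution of $S_k$ by repeated convolution, which mirrors multiplying the pgfs, and compares it with a brute-force count of restricted compositions.

```python
from fractions import Fraction
from itertools import product


def sum_distribution(k, l):
    """Exact probabilities P[S_k = n], n = 0..k*l, for S_k a sum of k
    independent uniform draws from {0, ..., l} (repeated convolution)."""
    p = Fraction(1, l + 1)
    pmf = [Fraction(1)]  # distribution of the empty sum
    for _ in range(k):
        new_pmf = [Fraction(0)] * (len(pmf) + l)
        for s, prob in enumerate(pmf):
            for x in range(l + 1):
                new_pmf[s + x] += prob * p
        pmf = new_pmf
    return pmf


def count_compositions(k, l, n):
    """Brute-force count of tuples in {0, ..., l}^k whose entries sum to n."""
    return sum(1 for parts in product(range(l + 1), repeat=k) if sum(parts) == n)


if __name__ == "__main__":
    k, l = 3, 2
    pmf = sum_distribution(k, l)
    # Multiplying by (l+1)^k recovers row k of the (l+1)-nomial triangle.
    print([prob * (l + 1) ** k for prob in pmf])
```

The brute-force count is exponential in $k$ and is only meant as an independent check for small cases.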
3.3. Step 3: Local limit theorem. To derive an asymptotic formula for $\binom{k}{n}_{l+1}$, we would like to make use of the results derived in Steps 1 and 2 above. Ideally, we would like to equate the probability density function of the asymptotic normal distribution of $S_k$ with the exact distribution. However, as mentioned, convergence in distribution, as assured by the CLT, only guarantees pointwise convergence of cumulative distribution functions. In contrast, local limit theorems describe how the probability density function of a sum of random variables approaches the normal density function. For integer-valued random variables (also called lattice or arithmetical distributions), Gnedenko and Kolmogorov [10] provide the following result.
Theorem 3.1 (see [10, p. 233]). If $X_1, X_2, \dots$ are independent lattice random variables with identical distribution with finite mean $\mu$ and variance $\sigma^{2}$, such that the greatest common divisor of the differences of all the values of $X_j$ taken with positive probability is 1, then
$$\sigma\sqrt{k}\, P[S_k = n] - \frac{1}{\sqrt{2\pi}}\, e^{-\frac{(n - k\mu)^{2}}{2\sigma^{2} k}} \to 0$$
uniformly in $n$ as $k \to \infty$.
Since in our situation, the set of values of each $X_j$ taken with positive probability is $\{0, 1, \dots, l\}$, the greatest common divisor of the differences is clearly 1. Thus, all assumptions of Theorem 3.1 are satisfied in our case and, hence, also, the consequences hold. Therefore, the following approximation is suggested for large $k$:
$$\sigma\sqrt{k}\, P[S_k = n] \approx \frac{1}{\sqrt{2\pi}}\, e^{-\frac{(n - k\mu)^{2}}{2\sigma^{2} k}}. \tag{4}$$
For $n = k\mu = kl/2$, the argument to the exponential function is zero, and thus
$$\sigma\sqrt{k}\, P[S_k = kl/2] \approx \frac{1}{\sqrt{2\pi}}, \quad \text{or equivalently,} \quad P[S_k = kl/2] \approx \frac{1}{\sqrt{2\pi\sigma^{2} k}}.$$
Using the exact form for $P[S_k = n]$ from Step 2 above, we hence have, bringing the normalizing term $(l + 1)^{k}$ to the right-hand side,
$$\binom{k}{\frac{kl}{2}}_{l+1} \sim \frac{(l+1)^{k}}{\sqrt{2\pi k\, \frac{(l+1)^{2} - 1}{12}}}. \tag{5}$$
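For example, for $l = 1$ (Pascal's case), $l = 2$, $l = 3$, and $l = 4$, we therefore have the approximations
$$\binom{k}{\frac{k}{2}}_{2} \sim \frac{2^{k+1}}{\sqrt{2\pi k}}, \quad \binom{k}{k}_{3} \sim \frac{3^{k}}{\sqrt{\frac{4}{3}\pi k}}, \quad \binom{k}{\frac{3}{2}k}_{4} \sim \frac{4^{k}}{\sqrt{\frac{5}{2}\pi k}}, \quad \text{and} \quad \binom{k}{2k}_{5} \sim \frac{5^{k}}{2\sqrt{\pi k}}.$$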
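To see how good approximation (5) is in practice, one can compare it with the exact central coefficient. The sketch below is illustrative and not part of the note; `central_coefficient` and `stirling_estimate` are assumed names, and $k$ is taken even so that $kl/2$ is an integer.

```python
from math import pi, sqrt


def central_coefficient(k, l):
    """Exact central entry of row k of the (l+1)-nomial triangle (k*l assumed even)."""
    row = [1]
    for _ in range(k):
        new_row = [0] * (len(row) + l)
        for i, value in enumerate(row):
            for shift in range(l + 1):
                new_row[i + shift] += value
        row = new_row
    return row[(k * l) // 2]


def stirling_estimate(k, l):
    """Right-hand side of (5): (l+1)^k / sqrt(2*pi*k*sigma^2)."""
    variance = ((l + 1) ** 2 - 1) / 12
    return (l + 1) ** k / sqrt(2 * pi * k * variance)


if __name__ == "__main__":
    for l in (1, 2, 3, 4):
        for k in (10, 50, 200):
            ratio = central_coefficient(k, l) / stirling_estimate(k, l)
            print(f"l={l} k={k} exact/estimate={ratio:.6f}")
```

The ratio of exact value to estimate should drift toward 1 as $k$ grows, consistent with the relation $\sim$ in (5).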
In Figure 1, we show for $l = 4$ the distributions $P[S_k = n]$ for $k = 5, 10, 20$, and their respective normal approximations. There, we can see the local limit theorem at work: the exact density function apparently approaches, pointwise, the normal density function.
Figure 1. Distributions $P[S_k = n]$ for $k = 5, 10, 20$ for $l = 4$ fixed, and normal approximations.
4. DISCUSSION. Although extended binomial coefficients, together with their connection to the sum of discrete uniform random variables, go back at least to De Moivre's Doctrine of Chances [4] and to Euler's [6] analytical study of the coefficients of polynomial (1), the mathematics community has apparently more or less ignored their systematic study, except for a few recent publications such as [1, 2, 5, 7, 8]. Next, using the CLT (or a local limit variant) to deduce asymptotics of mathematical objects has been suggested, for example, by Walsh [14], who derives Stirling's formula for factorials by equating the distribution of the sum of Poisson distributed random variables with the normal density. Finally, the asymptotics of both the central binomial ($l = 1$) as well as the central trinomial coefficients ($l = 2$) seem to be known (e.g., [7, 13]), while the general formula (5) is, to the best of our knowledge, novel. However, Ratsaby [12] derives our general result (4), as an estimate of the number of restricted integer compositions, by application of Cauchy's coefficient formula to the polynomial (1) and computation of the resulting integral by Laplace's method for evaluation of integrals. A historical perspective of local versus central limit theorem is provided by McDonald [11].
ACKNOWLEDGMENT. The author would like to thank the anonymous reviewers for helpful comments.
REFERENCES
1. R. C. Bollinger, C. L. Burchard, Lucas's theorem and some related results for extended Pascal triangles, Amer. Math. Monthly 97 no. 3 (1990) 198–204.
2. C. C. S. Caiado, P. N. Rathie, Polynomial coefficients and distribution of the sum of discrete uniform variables, in Eighth Annual Conference of the Society of Special Functions and their Applications. Edited by A. M. Mathai, M. A. Pathan, K. K. Jose, and J. Jacob, Pala, India, 2007.
3. L. Comtet, Advanced Combinatorics: The Art of Finite and Infinite Expansions. D. Reidel Publishing Company, Dordrecht, 1974.
4. A. De Moivre, The Doctrine of Chances: Or, A Method of Calculating the Probabilities of Events in Play. Reprint of the third (1756) edition. Chelsea, New York, 1967.
5. S. Eger, Restricted weighted integer compositions and extended binomial coefficients, J. Integer Seq., 16 (2013).
6. L. Euler, De evolutione potestatis polynomialis cuiuscunque $(1 + x + x^{2} + \cdots)^{n}$. Nova Acta Academiae Scientarum Imperialis Petropolitinae 12 (1801), available at http://math.dartmouth.edu/~euler/.
7. N.-E. Fahssi, The polynomial triangles revisited (2012), available at http://arxiv.org/abs/1202.0228.
8. D. C. Fielder, C. O. Alford, Pascal's triangle: Top gun or just one of the gang?, in Applications of Fibonacci Numbers. Edited by G. E. Bergum, A. N. Philippou, A. F. Horadam, Kluwer, Dordrecht, 1991.
9. P. Flajolet, R. Sedgewick, Analytic Combinatorics. Cambridge University Press, Cambridge, 2009.
10. B. V. Gnedenko, A. N. Kolmogorov, Limit Distributions for Sums of Independent Random Variables. Second edition. Addison-Wesley, Cambridge, MA, 1968.
11. D. R. McDonald, The local limit theorem: A historical perspective, JIRSS 4 (2005) 73–86.
12. J. Ratsaby, Estimate of the number of restricted integer-partitions, Appl. Anal. Discrete Math 2 (2008) 222–233.
13. The On-Line Encyclopedia of Integer Sequences, available at http://oeis.org, 2012, Sequence A002426.
14. D. P. Walsh, Equating Poisson and normal probability functions to derive Stirling's formula, Amer. Statist. 49 (1995) 270–271.
15. E. Weisstein, Multinomial Series. From MathWorld, A Wolfram Web Resource, available at http://mathworld.wolfram.com/MultinomialSeries.html.
Economics Department, Goethe University Frankfurt am Main, Germany
eger.steffen@gmail.com
A New Proof of Stirling's Formula
Thorsten Neuschel
Abstract. A new, simple proof of Stirling's formula via the partial fraction expansion for the tangent function is presented.
1. INTRODUCTION. Various proofs for Stirling's formula
$$n! \sim n^{n} e^{-n} \sqrt{2\pi n}, \quad \text{as } n \to \infty, \tag{1.1}$$
have been established in the literature since the days of de Moivre and Stirling in 1730 (for a historical exposition see, e.g., [1]). Many of these proofs show that the limit
$$\lim_{n\to\infty} \frac{n!}{n^{n} e^{-n} \sqrt{n}}$$
exists (for instance, via the Euler–Maclaurin formula) in order to identify this limit by using the asymptotical behavior of the Wallis product, which is the crucial step. We will show that this last, quite wily, step can be replaced by a simple straightforward computation of the limit using only the partial fraction expansion for the tangent function
$$\pi \tan \pi x = \sum_{\nu=0}^{\infty} \frac{2x}{(\nu + \frac{1}{2})^{2} - x^{2}}. \tag{1.2}$$
This expansion was probably found by Euler by the time Stirling determined his proof via Wallis' formula, see, e.g., [6, p. 327]. For some alternative elementary proofs of Stirling's formula see, e.g., [1, 2, 4, 5, 7].
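As a plausibility check on the partial fraction expansion (1.2), one can compare truncated sums with $\pi \tan \pi x$ numerically. The following Python sketch is illustrative only (`tangent_partial_sum` is an assumed name); since the terms decay like $2x/\nu^{2}$, the truncation error after $N$ terms is roughly $2x/N$, so many terms are needed for a few digits.

```python
from math import pi, tan


def tangent_partial_sum(x, terms):
    """Sum of the first `terms` summands 2x / ((nu + 1/2)^2 - x^2) of (1.2)."""
    return sum(2 * x / ((nu + 0.5) ** 2 - x ** 2) for nu in range(terms))


if __name__ == "__main__":
    for x in (0.1, 0.25, 0.4):
        approx = tangent_partial_sum(x, 200_000)
        print(x, approx, pi * tan(pi * x))
```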
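MSC: Primary 41A60
2. PROOF. An application of the well-known Euler–Maclaurin formula in its simplest form (see, e.g., [8, p. 37, (6.21)]) yields
$$\log n! = n \log n - n + 1 + \log \sqrt{n} + \int_{0}^{n-1} \frac{x - [x] - \frac{1}{2}}{1 + x}\, dx.$$
In order to prove (1.1), it is sufficient to show
$$\int_{0}^{\infty} \frac{x - [x] - \frac{1}{2}}{1 + x}\, dx = \log \sqrt{2\pi} - 1. \tag{2.1}$$
To prove this, we will show directly the identity
$$\int_{0}^{\infty} \frac{x - [x] - \frac{1}{2}}{1 + x}\, dx = \int_{0}^{1/2} \left\{ \frac{8x^{2}}{1 - 4x^{2}} - \pi x \tan \pi x \right\} dx, \tag{2.2}$$
where the integral on the right-hand side can be evaluated by elementary calculus. We start our computation with
$$\begin{aligned} \int_{0}^{\infty} \frac{x - [x] - \frac{1}{2}}{1 + x}\, dx &= \sum_{\nu=0}^{\infty} \left\{ \int_{\nu}^{\nu+1/2} \frac{x - \nu - \frac{1}{2}}{1 + x}\, dx + \int_{\nu+1/2}^{\nu+1} \frac{x - \nu - \frac{1}{2}}{1 + x}\, dx \right\} \\ &= \sum_{\nu=0}^{\infty} \left\{ \int_{0}^{1/2} \frac{x - \frac{1}{2}}{1 + \nu + x}\, dx + \int_{0}^{1/2} \frac{x}{\frac{3}{2} + \nu + x}\, dx \right\}. \end{aligned}$$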
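By an easy change of variables, we observe that
$$\int_{0}^{1/2} \frac{x - \frac{1}{2}}{1 + \nu + x}\, dx = -\int_{0}^{1/2} \frac{x}{\frac{3}{2} + \nu - x}\, dx,$$
so that we obtain
$$\begin{aligned} \int_{0}^{\infty} \frac{x - [x] - \frac{1}{2}}{1 + x}\, dx &= \sum_{\nu=0}^{\infty} \int_{0}^{1/2} \left\{ \frac{x}{\frac{3}{2} + \nu + x} - \frac{x}{\frac{3}{2} + \nu - x} \right\} dx \\ &= -\sum_{\nu=0}^{\infty} \int_{0}^{1/2} \frac{2x^{2}}{(\nu + \frac{3}{2})^{2} - x^{2}}\, dx \\ &= -\int_{0}^{1/2} \sum_{\nu=1}^{\infty} \frac{2x^{2}}{(\nu + \frac{1}{2})^{2} - x^{2}}\, dx, \end{aligned} \tag{2.3}$$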
where the interchange of summation and integration is allowed, due to the uniform convergence of the series in (2.3) on the interval $[0, \frac{1}{2}]$. Applying (1.2), we immediately obtain (2.2). At this point of the proof, we have reduced the problem of determining the constant in Stirling's formula to a simple matter of elementary calculus, as the resulting integral in (2.2) can be evaluated easily. For convenience, we will give some details. For example, using the decomposition
$$\frac{8x^{2}}{1 - 4x^{2}} = \frac{1}{1 + 2x} + \frac{1}{1 - 2x} - 2,$$
it can be rewritten as
$$\int_{0}^{1/2} \left\{ \frac{8x^{2}}{1 - 4x^{2}} - \pi x \tan \pi x \right\} dx = \log \sqrt{2} - 1 + \int_{0}^{1/2} \left\{ \frac{1}{1 - 2x} - \pi x \tan \pi x \right\} dx.$$
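Now, by a standard argument involving integration by parts, we can observe for $0 < \varepsilon < 1/2$ that
$$\begin{aligned} \int_{0}^{\varepsilon} \left\{ \frac{1}{1 - 2x} - \pi x \tan \pi x \right\} dx &= -\frac{1}{2} \log(1 - 2\varepsilon) - \int_{0}^{\varepsilon} \pi x \tan \pi x\, dx \\ &= \left(\varepsilon - \frac{1}{2}\right) \log \cos(\pi\varepsilon) + \frac{1}{2} \log \frac{\cos(\pi\varepsilon)}{1 - 2\varepsilon} - \int_{0}^{\varepsilon} \log \cos(\pi x)\, dx. \end{aligned}$$
Letting $\varepsilon$ tend to $1/2$, we immediately obtain
$$\int_{0}^{1/2} \left\{ \frac{1}{1 - 2x} - \pi x \tan \pi x \right\} dx = \log \sqrt{\pi} - \log \sqrt{2} - \int_{0}^{1/2} \log \cos(\pi x)\, dx.$$
The remaining integral on the right-hand side can be easily evaluated to $-\log \sqrt{2}$ as shown, e.g., in [2], [3]. This computation relies on the fact that its value, say $c$, remains unchanged if $\cos(\pi x)$ is replaced by $\sin(\pi x)$, so that we have (using the double angle formula)
$$\int_{0}^{1/2} \log \sin(2\pi x)\, dx = \log \sqrt{2} + 2 \int_{0}^{1/2} \log \sin(\pi x)\, dx.$$
As both integrals in the last equation coincide, we obtain $c = -\log \sqrt{2}$, which completes the proof of (1.1).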
REFERENCES
1. P. Diaconis, D. Freedman, An elementary proof of Stirling's formula, Amer. Math. Monthly 93 (1986) 123–125.
2. W. Feller, A direct proof of Stirling's formula, Amer. Math. Monthly 74 (1967) 1223–1225.
3. W. Feller, Correction to "A direct proof of Stirling's formula", Amer. Math. Monthly 75 (1968) 518.
4. R. Michel, On Stirling's formula, Amer. Math. Monthly 109 (2002) 388–390.
5. J. Patin, A very short proof of Stirling's formula, Amer. Math. Monthly 96 (1989) 41–42.
6. R. Remmert, Theory of Complex Functions. Springer, New York, 1991.
7. H. Robbins, A remark on Stirling's formula, Amer. Math. Monthly 62 (1955) 26–29.
8. R. Wong, Asymptotic Approximation of Integrals. Society for Industrial and Applied Mathematics, Philadelphia, PA, 2001.
Department of Mathematics, University of Trier, D-54286 Trier, Germany
neuschel@uni-trier.de
By 1914, the MONTHLY had outgrown its financial arrangements, and it
was Slaught who turned to the American Mathematical Society to adopt the
MONTHLY as an official journal. But American mathematics was growing as
fast as the MONTHLY, and the Society was already plagued by factional disputes
between the Eastern establishment (the Ivy league schools) and the Midwest
(led by Chicago). Slaught's request became a controversy. Should an organization
dedicated to promoting mathematical research support a journal like the
MONTHLY? Many, especially in the East (led by Osgood), thought it should not,
and the AMS voted narrowly to give the MONTHLY a pat on the back rather than
money.
A Century of Mathematics:
Through the Eyes of the Monthly,
Edited by John Ewing.
Mathematical Association of America,
Washington, DC, 1994, p. 4.
Zeta(2) Once Again
Ralph M. Krause
Abstract. This note provides a strikingly efficient evaluation of zeta(2).
An article in the January, 2012, MONTHLY [1] proved, in a manner that might have appealed to Euler, his famous result that $\zeta(2) = \sum_{1}^{\infty} 1/k^{2} = \pi^{2}/6$. The argument there suggested the following, which makes the same claim and resembles the sixth in [2].
The Taylor series for $\log(1 + z)$ converges on the unit circle $z = e^{i\theta}$ for $z \ne -1$. Thus
$$\log(1 + z) = \sum_{1}^{\infty} (-1)^{k-1} z^{k}/k, \tag{1}$$
and
$$\log(1 + z^{-1}) = \sum_{1}^{\infty} (-1)^{k-1} z^{-k}/k. \tag{2}$$
To be convinced that equation (1) holds on the circle of convergence ($z = -1$ excepted) and not merely inside it, we argue thus. For $z$ in the first quadrant, $|1/(1 + z) - \sum_{0}^{N-1} (-z)^{k}| = |z^{N}/(1 + z)| \le |z^{N}|$. Integrating $1/(1 + z) - \sum_{0}^{N-1} (-z)^{k}$ and $|z^{N}|$ along a radius from $z = 0$ to $z = e^{i\theta}$ shows that $|\log(1 + z) - \sum_{1}^{N} (-1)^{k-1} z^{k}/k| < 1/(N + 1)$, establishing (1) on the portion of the unit circle lying in the first quadrant. (This is all we need below, although the preceding argument may be modified easily to use ever larger bounds than $1/(N + 1)$ and prove convergence, though not uniform convergence, on the entire unit circle with the exception of the point $z = -1$.) (2) then follows, as the conjugate of (1).
Subtracting (2) from (1), still for $z = e^{i\theta}$ and $z \ne -1$,
$$i\theta = \log(z) = \sum_{1}^{\infty} (-1)^{k-1} [z^{k} - z^{-k}]/k = 2i \sum_{1}^{\infty} (-1)^{k-1} \sin(k\theta)/k. \tag{3}$$
Nowadays, one might verify that this is a Fourier series and let Parseval's formula finish the job. Although Euler was perhaps a century too early for Fourier analysis, he would have been willing to integrate the first and last expressions in (3) from 0 to $\pi/2$ after dividing by $i$. Making free use of the familiar observation that the even terms comprise 1/4 of the sum of the series of reciprocal squares, we arrive at
$$\frac{\pi^{2}}{8} = \frac{3}{4} \sum_{1}^{\infty} \frac{1}{k^{2}}.$$
MSC: Primary 11M06
The brevity here legitimately ignores the fact that the sums in (1)–(3) converge only conditionally. What is essential is that they converge uniformly on the interval of integration $[0, \pi/2]$, and this they do by the estimate made in paragraph 2 above. Thus
$$\left| \int_{0}^{\pi/2} \left\{ \theta - 2 \sum_{1}^{N} (-1)^{k-1} \sin(k\theta)/k \right\} d\theta \right| < (\pi/2)\, 2/(N + 1).$$
The result now follows.
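The two numerical facts used in this note are easy to probe by machine. The sketch below is illustrative only (`sine_series` and `odd_reciprocal_squares` are assumed names): it checks that the partial sums of (3) stay within $2/(N + 1)$ of $\theta$ on $[0, \pi/2]$, and that the odd reciprocal squares, which make up $3/4$ of $\zeta(2)$, add up to approximately $\pi^{2}/8$.

```python
from math import pi, sin


def sine_series(theta, n_terms):
    """Partial sum 2 * sum_{k=1}^{N} (-1)^(k-1) sin(k*theta)/k from (3)."""
    return 2 * sum((-1) ** (k - 1) * sin(k * theta) / k
                   for k in range(1, n_terms + 1))


def odd_reciprocal_squares(n_terms):
    """Partial sum of 1/(2k-1)^2, which tends to (3/4) * zeta(2)."""
    return sum(1 / (2 * k - 1) ** 2 for k in range(1, n_terms + 1))


if __name__ == "__main__":
    n = 1000
    for theta in (0.3, 1.0, pi / 2):
        print(theta, sine_series(theta, n), 2 / (n + 1))
    print(odd_reciprocal_squares(100_000), pi ** 2 / 8)
    print(4 / 3 * odd_reciprocal_squares(100_000), pi ** 2 / 6)
```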
REFERENCES
1. D. Kalman, M. McKinzie, Another way to sum a series: Generating functions, Euler and the dilog function, Amer. Math. Monthly 119 (2012) 42–51.
2. R. Chapman, Evaluating $\zeta(2)$, available at http://www.uam.es/personal_pdi/ciencias/cillerue/Curso/zeta2.pdf.
3208 44th Street, N.W., Washington, DC 20016-3527
rmkrause@verizon.net
100 Years Ago in The American Mathematical Monthly
Edited by Vadim Ponomarenko
In recent years several German professors of mathematics have called public
attention to the fact that the number of mathematical students at the various
German universities is larger than the probable number of mathematical po-
sitions in the German schools. According to the Jahresbericht der Deutschen
Mathematiker-Vereinigung, 22 (1913), page 369, the number of mathematical
students is much smaller during the current year than it has been during recent
years. The number of women students of mathematics is, however, still on the
increase in the German universities.
All copies of the MONTHLY for January, 1913, have been exhausted. The
demand for sample copies was so great for this particular number that the supply
was entirely inadequate. Any one who may know of extra copies not belonging
to sets will confer a great favor by informing the MANAGING EDITOR.
Excerpted from "Notes and News," 21 (1914) 136–138.
Polynomials $(x^{3} - n)(x^{2} + 3)$ Solvable Modulo Any Integer
Andrea M. Hyde, Paul D. Lee, and Blair K. Spearman
Abstract. We give an infinite family of polynomials that are solvable modulo $m$ for every integer $m > 1$, yet have no roots in the rational numbers. Such polynomials are called intersective. Our classification uses only techniques available in an undergraduate course in number theory.
MSC: Primary 11R09
1. INTRODUCTION. Let $f(x)$ be a monic polynomial with integer coefficients. We are interested in those polynomials $f(x)$ that have no root in the rational numbers $\mathbb{Q}$, but do have a root modulo $m$ for all positive integers $m$. Polynomials of this type are called intersective (see Sonn [6, 7]). These polynomials provide counterexamples to the local-global principle. Further information on the local-global principle is available in the book by Gouvêa [3, pp. 75–83]. It is known that $f(x)$ cannot be irreducible over $\mathbb{Q}$ since in this case there exist prime numbers $p$ for which
$$f(x) \equiv 0 \pmod{p}$$
is insolvable (see Brandl, Bubboloni, and Hupp [2]). Consequently, an intersective polynomial requires at least two irreducible factors over $\mathbb{Q}$. Berend and Bilu prove a theorem that can be used to establish the intersective property for a given polynomial [1]. They provide the example
$$f(x) = (x^{3} - 19)(x^{2} + x + 1).$$
The verification of the intersective property, that is, of confirming that
$$f(x) \equiv 0 \pmod{m}$$
is solvable for every $m > 1$, may proceed by showing that for each prime $p$ and positive integer $j$, the congruence
$$f(x) \equiv 0 \pmod{p^{j}}$$
is solvable. General solvability then follows from the Chinese Remainder Theorem. For a given prime $p$, one of the factors of $f(x)$ is proven to be solvable mod $p^{j}$ for all positive integers $j$. In this note, we propose to use techniques from an undergraduate number theory course to investigate polynomials of the form
$$f(x) = (x^{3} - n)(x^{2} + x + 1),$$
or equivalently
$$f(x) = (x^{3} - n)(x^{2} + 3),$$
classifying those that are intersective. The equivalence of these two families is due to the easily established fact that for a given prime $p$, the congruence
$$x^{2} + x + 1 \equiv 0 \pmod{p^{j}}$$
is solvable for all positive integers $j$ if and only if the congruence
$$x^{2} + 3 \equiv 0 \pmod{p^{j}}$$
is solvable for all positive integers $j$. Berend and Bilu [1] state that these polynomials have the least possible degree to be intersective. The explanation of this involves Galois theory and algebraic number theory. Avoiding these more advanced theories, we make use of results from an undergraduate number theory course, particularly Hensel's Lemma and a refined version of Hensel's Lemma. Our method not only establishes the intersective property for the polynomials we study, but also enables a characterization of them, showing that they form an infinite set. Before we begin, we impose two simple restrictions on the value of $n$ in our polynomials $(x^{3} - n)(x^{2} + 3)$, the validity of which can be easily established by the reader. We assume that $n$ is a positive integer and that $n$ is cubefree. Our definition of intersective implies that $n \ne 1$. Our main theorem is the following.
Theorem. Let $n$ be a cubefree positive integer, not equal to 1. Then
$$f(x) = (x^{3} - n)(x^{2} + 3)$$
is intersective if and only if the prime factors of $n$ are of the form $3k + 1$ and $n \equiv 1 \pmod{9}$.
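The statement of the theorem can be explored by brute force for small moduli. The following Python sketch is an illustration, not a proof and not the authors' method; `has_root_mod` and `satisfies_criterion` are assumed names, and the criterion check assumes its argument is a cubefree integer greater than 1.

```python
def has_root_mod(n, m):
    """True if (x^3 - n)(x^2 + 3) has a root modulo m (exhaustive search)."""
    return any(((x ** 3 - n) * (x ** 2 + 3)) % m == 0 for x in range(m))


def satisfies_criterion(n):
    """Theorem's condition: every prime factor of n is of the form 3k + 1
    and n is congruent to 1 modulo 9 (n assumed cubefree and > 1)."""
    if n % 9 != 1:
        return False
    remaining, p = n, 2
    while p * p <= remaining:
        while remaining % p == 0:
            if p % 3 != 1:
                return False
            remaining //= p
        p += 1
    # whatever is left is 1 or a single prime factor
    return remaining == 1 or remaining % 3 == 1


if __name__ == "__main__":
    for n in (2, 10, 19, 37):
        first_failure = next((m for m in range(2, 200) if not has_root_mod(n, m)), None)
        print(n, satisfies_criterion(n), first_failure)
```

For example, $n = 19$ satisfies the criterion and the search finds a root for every modulus tried, whereas $n = 2$ fails already modulo 9.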
Before giving the proof of this theorem, we state the theory that we require. All of it is available in basic number theory textbooks such as Niven, Zuckerman, and Montgomery [4]. We begin with Hensel's Lemma.
Hensel's Lemma (see [4, p. 87]). Suppose that $f(x)$ is a polynomial with integral coefficients. If $f(a) \equiv 0 \pmod{p^{j}}$ and $f'(a) \not\equiv 0 \pmod{p}$, then there is a unique $t \pmod{p}$ such that $f(a + t p^{j}) \equiv 0 \pmod{p^{j+1}}$.
If the condition $f'(a) \not\equiv 0 \pmod{p}$ holds, then the root $a$ is called nonsingular. By repeated application of Hensel's Lemma, a nonsingular root $a$ of $f(x) \equiv 0 \pmod{p}$ may be lifted to a root modulo $p^{j}$, for $j = 2, 3, \dots$. We also require a refined version of Hensel's Lemma which, in the case of a singular root, enables us to lift our solutions modulo arbitrarily high prime powers.
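The repeated lifting of a nonsingular root can be sketched as follows. This Python sketch is illustrative only; the helper names `evaluate`, `derivative`, and `hensel_lift` are assumptions, polynomials are given as coefficient lists from the highest degree down, and the modular inverse uses the built-in `pow` with exponent `-1` (available in Python 3.8 and later).

```python
def evaluate(coeffs, x):
    """Evaluate a polynomial (coefficients from highest degree to constant) by Horner's rule."""
    result = 0
    for c in coeffs:
        result = result * x + c
    return result


def derivative(coeffs):
    """Coefficient list of the formal derivative."""
    degree = len(coeffs) - 1
    return [c * (degree - i) for i, c in enumerate(coeffs[:-1])]


def hensel_lift(coeffs, a, p, target_exponent):
    """Lift a nonsingular root a of f modulo p to a root modulo p**target_exponent."""
    deriv = derivative(coeffs)
    if evaluate(coeffs, a) % p != 0 or evaluate(deriv, a) % p == 0:
        raise ValueError("a must be a nonsingular root modulo p")
    modulus = p
    for _ in range(target_exponent - 1):
        # f(a + t*modulus) = f(a) + t*modulus*f'(a) modulo modulus*p,
        # so t must satisfy t*f'(a) = -(f(a)/modulus) modulo p.
        inverse = pow(evaluate(deriv, a), -1, p)
        t = (-(evaluate(coeffs, a) // modulus) * inverse) % p
        a = (a + t * modulus) % (modulus * p)
        modulus *= p
    return a


if __name__ == "__main__":
    # x^2 + 3 has the nonsingular root 2 modulo 7.
    root = hensel_lift([1, 0, 3], 2, 7, 4)
    print(root, (root ** 2 + 3) % 7 ** 4)
```

The singular case (where $p$ divides $f'(a)$) is not handled by this sketch; that is the situation the refined lemma addresses.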
Refined Hensel's Lemma (see [4, p. 89]). Let $f(x)$ be a polynomial with integral coefficients. Suppose that $f(a) \equiv 0 \pmod{p^{j}}$, that $p^{\tau} \,\|\, f'(a)$, and that $j \ge 2\tau + 1$. If $b \equiv a \pmod{p^{j-\tau}}$, then $f(b) \equiv f(a) \pmod{p^{j}}$ and $p^{\tau} \,\|\, f'(b)$. Moreover, there is a unique $t \pmod{p}$ such that $f(a + t p^{j-\tau}) \equiv 0 \pmod{p^{j+1}}$.
As noted in [4, p. 89], since the hypotheses of the theorem apply with $a$ replaced by $a + t p^{j-\tau}$ and $\pmod{p^{j}}$ replaced by $\pmod{p^{j+1}}$ but with $\tau$ unchanged, the lifting may be repeated and continues indefinitely. This means that if the polynomial congruence is solvable to a sufficiently high power of $p$ (as defined in the Lemma), then it can be solved to all powers of $p$. We require one more lemma involving power congruences.
Lemma (see [4, p. 101]). If $p$ is a prime and $\gcd(a, p) = 1$, then the congruence $x^{t} \equiv a \pmod{p}$ has $d = \gcd(t, p - 1)$ solutions if $a^{(p-1)/d} \equiv 1 \pmod{p}$, and no solutions otherwise.
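The solution count in this lemma can be compared with an exhaustive search for small primes. The sketch below is illustrative (`count_power_solutions` and `predicted_count` are assumed names) and simply tabulates both sides.

```python
from math import gcd


def count_power_solutions(t, a, p):
    """Number of x in {0, ..., p-1} with x^t congruent to a modulo p (brute force)."""
    return sum(1 for x in range(p) if (pow(x, t, p) - a) % p == 0)


def predicted_count(t, a, p):
    """Count predicted by the lemma, assuming p is prime and gcd(a, p) = 1."""
    d = gcd(t, p - 1)
    return d if pow(a, (p - 1) // d, p) == 1 else 0


if __name__ == "__main__":
    p, t = 7, 3
    for a in range(1, p):
        print(a, count_power_solutions(t, a, p), predicted_count(t, a, p))
```

With $t = 3$ the prediction is $d = 1$ whenever $p \equiv 2 \pmod{3}$, which is the situation used in Case 2 of the proof.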
We are now ready to prove our theorem.
Proof. Assume that the prime factors of $n$ are of the form $3k + 1$ and $n \equiv 1 \pmod{9}$. Let $p$ be a prime.
Case 1. Suppose that $p \equiv 1 \pmod{3}$. For these primes, $-3$ is a quadratic residue mod $p$ (see [5, p. 440, Exercise 3]) so that the congruence $x^{2} + 3 \equiv 0 \pmod{p}$ is solvable for some integer $u$, which is clearly not divisible by $p$. Since the derivative of $x^{2} + 3$ evaluated at $u$ equals $2u$, which is again not divisible by $p$, we may apply Lemma 1 to conclude that $x^{2} + 3 \equiv 0 \pmod{p^{j}}$ is solvable for all positive integers $j$.
Case 2. Suppose next that $p \equiv 2 \pmod{3}$. Clearly, $p \nmid n$, since n contains only prime
factors of the form 3k + 1. The factor $x^2 + 3$ is insolvable mod p, since $-3$ is a
quadratic nonresidue (mod p). Applying Lemma 3 (the power congruence lemma) to the factor
$x^3 - n$, with t = 3, a = n, and noting that d = (3, p − 1) = 1, we see that
$x^3 - n \equiv 0 \pmod{p}$ is solvable if $n^{p-1} \equiv 1 \pmod{p}$, which is true by Fermat's Little
Theorem. Thus, $x^3 - n \equiv 0 \pmod{p}$ is solvable for some integer u. Since $p \nmid n$, we see
that $p \nmid u$. We may apply Lemma 1 by observing that $p \nmid 3u^2$, the derivative of $x^3 - n$
evaluated at u, and conclude that $x^3 - n \equiv 0 \pmod{p^j}$ is solvable for all positive
integers j.
Case 3. Finally, suppose that p = 3. The factor $x^2 + 3$ has one solution mod 3 (namely
x = 0), but has no solutions mod $3^2$. Therefore, we consider the factor $x^3 - n$. Since
$n \equiv 1 \pmod{9}$, we have $n \equiv 1, 10, 19 \pmod{3^3}$ so that $x^3 \equiv n \pmod{3^3}$ is solvable.
For solutions, we may choose $x \equiv 1, 4, 7 \pmod{3^3}$, respectively. At the same time,
the derivative of $x^3 - n$ evaluated at 1, 4, and 7 is exactly divisible by 3. Applying
Lemma 2 (the Refined Hensel's Lemma) with j = 3, $\tau = 1$, and recalling the remark after
Lemma 2, we conclude that $x^3 - n \equiv 0 \pmod{3^j}$ is solvable for all positive integers j.
Hence, by the Chinese Remainder Theorem, $f(x) \equiv 0 \pmod{m}$ is solvable for all positive
integers m. This establishes the intersective property.
Conversely, suppose that $(x^3 - n)(x^2 + 3)$ is intersective with n cubefree and not
equal to 1. Let p be a prime.
Case 1. First, suppose that $p \equiv 2 \pmod{3}$. For such a prime, $x^2 + 3 \equiv 0 \pmod{p}$
is insolvable, since $-3$ is a quadratic nonresidue mod p as noted earlier. Therefore,
we must have $x^3 - n \equiv 0 \pmod{p^j}$ solvable for all positive integers j. It is easy to
show that if p divides n, then p must also divide any solution x. We then see that
the congruence $x^3 \equiv n \pmod{p^3}$ is only solvable if n is divisible by $p^3$, but this is a
contradiction since n is cubefree. Hence, $p \nmid n$ and n has no prime factors of the form
3k + 2.
Case 2. Suppose now that p = 3. Since $x^2 + 3 \equiv 0 \pmod{9}$ is insolvable, we must
have $x^3 - n \equiv 0 \pmod{3^j}$ solvable for all positive integers j. By the same arguments
as in Case 1, we require $3 \nmid n$, for otherwise $x^3 - n \equiv 0 \pmod{3^3}$ is insolvable as n is
cubefree. Therefore, 3 is not a prime factor of n, and thus the prime factors of n must
have the form 3k + 1.
To complete the classification part of our proof, we study the solvability of
$x^3 - n \equiv 0 \pmod{3^2}$. The nonzero cubes modulo 9 are congruent to 1 and 8. Combining this
with the previously established fact that the prime factors of the positive integer n are
of the form 3k + 1 (so that $n \equiv 1 \pmod{3}$), we deduce that $n \equiv 1 \pmod{9}$, as required.
We close by giving some examples of intersective polynomials obtained from our
theorem.
Factorization of n      Intersective polynomial
$37$                    $(x^3 - 37)(x^2 + 3)$
$7 \cdot 13$            $(x^3 - 91)(x^2 + 3)$
$163$                   $(x^3 - 163)(x^2 + 3)$
$19 \cdot 37$           $(x^3 - 703)(x^2 + 3)$
$7^2 \cdot 61$          $(x^3 - 2989)(x^2 + 3)$
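The examples in the table can be checked experimentally. The sketch below searches, for each modulus m up to a chosen bound, for a root of $(x^3 - n)(x^2 + 3)$ modulo m. It is a finite sanity check only, not a proof of the intersective property; the function names and the bound are assumptions made for this illustration.

```python
def has_root_mod(n, m):
    """True if (x**3 - n)*(x**2 + 3) ≡ 0 (mod m) has a solution x."""
    return any(((x ** 3 - n) * (x * x + 3)) % m == 0 for x in range(m))


def first_failure(n, bound):
    """Smallest modulus m in [2, bound] with no root, or None if every m has one."""
    for m in range(2, bound + 1):
        if not has_root_mod(n, m):
            return m
    return None


table_values = [37, 91, 163, 703, 2989]
results = {n: first_failure(n, 150) for n in table_values}
```

For the five tabulated values of n no failing modulus appears in this range, whereas a value such as n = 2, which violates the hypotheses of the theorem, already fails for a small modulus.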
REFERENCES
1. D. Berend, Y. Bilu, Polynomials with roots modulo every integer, Proc. Amer. Math. Soc. 124 (1996)
1663–1671.
2. R. Brandl, D. Bubboloni, I. Hupp, Polynomials with roots mod p for all primes p, J. Group Theory 4
(2001) 233–239.
3. F. Q. Gouvêa, P-adic Numbers, An Introduction. Springer Verlag, New York, 1993.
4. I. Niven, H. S. Zuckerman, H. L. Montgomery, An Introduction to the Theory of Numbers. Fifth edition.
John Wiley and Sons, New York, 1995.
5. K. H. Rosen, Elementary Number Theory and its Applications. Sixth edition. Addison-Wesley, Reading,
MA, 2011.
6. J. Sonn, Polynomials with roots in $\mathbb{Q}_p$ for all p, Proc. Amer. Math. Soc. 136 (2008) 1955–1960.
7. J. Sonn, Two remarks on the inverse Galois problem for intersective polynomials, J. Théor. Nombres
Bordeaux 21 (2009) 437–439.
Department of Mathematics and Statistics, University of British Columbia's Okanagan campus, Kelowna, BC,
Canada, V1V 1V7
andrea.hyde@alumni.ubc.ca
paul.lee@ubc.ca
blair.spearman@ubc.ca
Macaulay Expansion
Author(s): B. Sury
Abstract. Given natural numbers n and r, the greedy algorithm enables us to obtain an
expansion of the integer n as a sum of binomial coefficients in the form
$\binom{a_r}{r} + \binom{a_{r-1}}{r-1} + \cdots + \binom{a_1}{1}$.
We give an alternate interpretation of this expansion, which also proves its uniqueness in
an interesting manner.
The 1996 Iranian mathematical olympiad competition contained the following problem.
For natural numbers n and r, there is a unique expansion
$$n = \binom{a_r}{r} + \binom{a_{r-1}}{r-1} + \cdots + \binom{a_1}{1}$$
with each $a_i$ an integer and $a_r > a_{r-1} > \cdots > a_1 \ge 0$.
The existence is fairly easy to prove using the greedy algorithm. This expansion is
sometimes known as the Macaulay expansion. However, the following alternate interpretation
does not seem to be well known; it gives uniqueness in an interesting manner.
In what follows, the following well-known convention is used: the binomial coefficient
$\binom{n}{r}$ is equated to 0 if n < r.
For each natural number r, denote by $S_r$ the set of all r-digit numbers in some base
b whose digits are in strictly decreasing order of size. Evidently, $S_r$ is nonempty if and
only if $b \ge r$; in this case, $S_r$ has $\binom{b}{r}$ elements. Let us now write the elements of $S_r$ in
increasing order.
For instance, in base 10, the first few of the 120 members of $S_3$ are:
(2, 1, 0), (3, 1, 0), (3, 2, 0), (3, 2, 1), (4, 1, 0), (4, 2, 0), (4, 2, 1), (4, 3, 0), . . . .
We will prove the following.
Theorem. Given any positive integer n, and any base b such that $\binom{b}{r} > n$, the (n + 1)-th
member of $S_r$ is $(a_r, \ldots, a_2, a_1)$, where
$$n = \binom{a_r}{r} + \binom{a_{r-1}}{r-1} + \cdots + \binom{a_1}{1}.$$
In particular, for each n, the Diophantine equation
$$\binom{a_r}{r} + \binom{a_{r-1}}{r-1} + \cdots + \binom{a_1}{1} = n$$
has a unique solution in integers $a_r > a_{r-1} > \cdots > a_1 \ge 0$.
Here are a couple of examples to illustrate the theorem.
(i) Let r = 3 and n = 12. We may take any base b so that $\binom{b}{3} > 12$. For example,
b = 6 is allowed because $\binom{6}{3} = 20$. Among the 20 members in $S_3$, the 13th
member is (5, 2, 1). Note that $\binom{5}{3} + \binom{2}{2} + \binom{1}{1} = 12$.
MSC: Primary 05A10
(ii) Let r = 3, n = 74. We may take b = 10 as $\binom{10}{3} = 120$. The 75th member of $S_3$
is (8, 6, 3). Note that $\binom{8}{3} + \binom{6}{2} + \binom{3}{1} = 74$.
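Both examples, and the statement of the theorem for small parameters, can be checked by listing $S_r$ explicitly. The sketch below assumes Python 3.8 or later for `math.comb` (which returns 0 when the lower index exceeds the upper one, matching the convention above); the function names are hypothetical.

```python
from itertools import combinations
from math import comb


def members_of_S(r, b):
    """All r-digit strings in base b with strictly decreasing digits, in increasing order."""
    return sorted(tuple(sorted(c, reverse=True)) for c in combinations(range(b), r))


def macaulay_value(digits):
    """Value C(a_r, r) + ... + C(a_1, 1) for digits = (a_r, ..., a_1)."""
    r = len(digits)
    return sum(comb(a, r - i) for i, a in enumerate(digits))


S3_base10 = members_of_S(3, 10)
S3_base6 = members_of_S(3, 6)
```

With 0-based indexing, the (n + 1)-th member is `S[n]`, so the theorem says that `macaulay_value(S[n]) == n` for every position n.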
Proof of theorem. First of all, we notice that the number of members in $S_r$ that
have first digit < m equals $\binom{m}{r}$; this is because we are choosing r numbers from
{0, 1, . . . , m − 1} and arranging them in decreasing order. Now, suppose the (n + 1)th
member of $S_r$ is
$$(a_r, a_{r-1}, \ldots, a_1).$$
The number of members of $S_r$ with first digit < $a_r$ is $\binom{a_r}{r}$. The number of members
of $S_r$, whose first digit is $a_r$ and which occur before the above member, is the number
of members of $S_{r-1}$ occurring prior to $(a_{r-1}, \ldots, a_1)$. Inductively, it is clear that this
equals
$$\binom{a_{r-1}}{r-1} + \cdots + \binom{a_2}{2} + \binom{a_1}{1}.$$
Therefore, the number of members of $S_r$ occurring prior to the (n + 1)th member
above (which must be n) is
$$\binom{a_r}{r} + \binom{a_{r-1}}{r-1} + \cdots + \binom{a_1}{1}.$$
This proves our result.
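The greedy algorithm mentioned at the outset can be sketched as follows: at each stage k = r, r − 1, . . . , 1 one picks the largest a with $\binom{a}{k}$ not exceeding the remaining value. The function name is hypothetical, and `math.comb` assumes Python 3.8 or later.

```python
from math import comb


def macaulay_expansion(n, r):
    """Return (a_r, ..., a_1) with n = C(a_r, r) + ... + C(a_1, 1), chosen greedily."""
    digits = []
    for k in range(r, 0, -1):
        a = k - 1                      # C(k-1, k) = 0, so this choice is always admissible
        while comb(a + 1, k) <= n:     # enlarge a while the binomial still fits
            a += 1
        digits.append(a)
        n -= comb(a, k)
    return tuple(digits)
```

For the two examples above this gives (5, 2, 1) for n = 12 and (8, 6, 3) for n = 74, both with r = 3.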
Remark. We may proceed in a slightly different direction, if we do not use the first
observation in the proof. For any k, we can obtain by induction that the number of
elements in $S_k$ starting with some a is $\binom{a}{k-1}$. Indeed, to prove this by induction, we use
the identity
$$\binom{n}{r} = \sum_{m=1}^{n-1} \binom{m}{r-1},$$
which is itself seen by induction on n.
ACKNOWLEDGMENTS. We are indebted to the referee for a number of constructive suggestions. In par-
ticular, she/he drew attention to a simple way to count something for which we gave a roundabout argument
as remarked above. The referee's suggestions to add some illuminating examples and to make the uniqueness
argument transparent are well appreciated.
Stat-Math Unit, Indian Statistical Institute, 8th Mile Mysore Road, Bangalore 560059, India
sury@isibang.ac.in
Evaluating Lebesgue Integrals Efficiently with the FTC
Author(s): J. J. Koliha
Abstract. This note addresses evaluation of Lebesgue integrals on the real line using the Fun-
damental Theorem of Calculus, without having to verify that the primitive is absolutely con-
tinuous.
The Fundamental Theorem of Calculus (FTC) provides an efficient method for the
evaluation of Lebesgue integrals on real intervals, but only if we can find an absolutely
continuous primitive (antiderivative) to the integrand. However, checking absolute
continuity can be quite difficult. In this note, we give examples of evaluation of
integrals that require only continuity of the primitive. Here is a version of Lebesgue's
FTC extended to a possibly unbounded interval.
Lebesgue's FTC. Let $-\infty \le a < b \le \infty$. Let $F : (a, b) \to \mathbb{C}$ be absolutely continuous
on (a, b) and let $F' = f$ almost everywhere on (a, b), where $f : (a, b) \to \mathbb{C}$ is
Lebesgue integrable on (a, b). If the one-sided limits F(a+) and F(b−) exist, then
$$\int_a^b f(t)\,dt = F(b-) - F(a+).$$
It may seem that with the absolute continuity of F, the hypothesis that f is
Lebesgue integrable is redundant. Alas, no: The notorious function
$$F(t) := \operatorname{Si}(t) = \int_0^t \frac{\sin x}{x}\,dx, \quad t > 0,$$
shows the error of our ways [2, Example 14.17]. The absolute continuity of F on
$(0, \infty)$ follows from the Mean Value Theorem; F(0+) = 0 is clear and $F(\infty-) = \pi/2$
is well known. Yet the derivative $F'(x) = f(x) = (\sin x)/x$ is not Lebesgue integrable as
$$\lim_{t \to \infty} \int_0^t \left|\frac{\sin x}{x}\right| dx = \infty.$$
It is well known that on a compact interval, the integrability of f is indeed redundant
(see, for instance, [2, Theorem 14.7]).
The problem with application of Lebesgue's FTC can be seen in this situation.
Suppose we know that $F'(x) = f(x)$ everywhere in [a, b] and that f is Lebesgue
integrable on [a, b]. Then we have a paradoxical situation of not being able to use
Lebesgue's FTC, since we do not know whether F is absolutely continuous. If we write
$G(x) = \int_a^x f(t)\,dt$, we know that G is absolutely continuous, and $(F - G)'(x) = 0$
almost everywhere. However, we cannot conclude that F − G is constant.
MSC: Primary 26A42
In order to overcome this problem, we need to look at a different type of FTC,
one which is usually proved by methods outside the theory of Lebesgue integration.
A proof that stays strictly within the realm of the Lebesgue theory was given by the
author in this MONTHLY [1]. We recall three versions of this theorem, whose proofs
can be found in [1] and [2, Chapter 14].
Theorem 1 (see [1]). Let $-\infty \le a < b \le \infty$. Let $F : (a, b) \to \mathbb{C}$ be such that
$F'(x) = f(x)$ for all $x \in (a, b)$, where $f : (a, b) \to \mathbb{C}$ is Lebesgue integrable on (a, b). If the
one-sided limits F(a+) and F(b−) exist, then
$$\int_a^b f(t)\,dt = F(b-) - F(a+).$$
Even if we tighten the hypotheses to assume that F has a derivative on a compact
interval [a, b] (with one-sided derivatives at the end points), the integrability of f
cannot be dropped due to a possible blowout of the positive and negative oscillation
of f. To see this, define
$$F(t) = t^2 \cos^2 \frac{1}{t^2} \ \text{ if } t \ne 0 \quad\text{and}\quad F(0) = 0,$$
and
$$f(t) = F'(t) \ \text{ if } t \ne 0 \quad\text{and}\quad f(0) = 0.$$
But f is not Lebesgue integrable on [0, 1], since $\int_\varepsilon^1 |f(t)|\,dt \to \infty$ as $\varepsilon \to 0+$. (See
[2, Example 14.15] for details.)
Theorem 2 (see [1]). Let $-\infty \le a < b \le \infty$. Let $F : (a, b) \to \mathbb{C}$ be continuous on
(a, b) and let $F'(x) = f(x)$ nearly everywhere on (a, b), where $f : (a, b) \to \mathbb{C}$ is
Lebesgue integrable on (a, b). If the one-sided limits F(a+) and F(b−) exist, then
$$\int_a^b f(t)\,dt = F(b-) - F(a+).$$
The expression "nearly everywhere" means everywhere except for a countable set.
If F is continuous on (a, b), $F' = f$ nearly everywhere on (a, b), and the one-sided
limits F(b−) and F(a+) exist, then we say that f is Newton integrable on (a, b), and
define its Newton integral by
$$(N)\int_a^b f(t)\,dt := F(b-) - F(a+).$$
Theorem 1 enables us to calculate the integral $\int_0^1 t^{-1/2}\,dt$ by observing that
$F(t) = 2t^{1/2}$ is a primitive for the integrand $f(t) = t^{-1/2}$ everywhere in (0, 1), that F(0+) = 0
and F(1−) = 2, but we have to know that f is Lebesgue integrable on (0, 1). For this
we can use, for instance, the Monotone Convergence Theorem applied to the truncations
$f_n = \min(f, n)$ of f. But this does not seem to be the most efficient way to do
it; we would like to conclude the integrability of f directly from the existence of the
Newton integral. For this we need to consider absolute Newton integrability. We say
that a function $f : (a, b) \to \mathbb{C}$ is absolutely Newton integrable if the Newton integral
exists for both f and |f| (where |f| is real valued and nonnegative). Here is the desired
theorem.
Theorem 3 (see [1]). Let $-\infty \le a < b \le \infty$. Let $f : (a, b) \to \mathbb{C}$ be absolutely Newton
integrable on (a, b). Then f is Lebesgue integrable on (a, b), and
$$\int_a^b f(t)\,dt = (N)\int_a^b f(t)\,dt.$$
The readers can hone their skills by evaluating the following integrals using Theorems
1, 2, or 3.
Example 1. Evaluate Lebesgue integrals efficiently:
$$\text{(i)}\ \int_0^1 \left(\frac{1}{t^{3/4}} + i \log t\right) dt, \qquad \text{(ii)}\ \int_1^2 \frac{t - 3i}{t + 2i}\,dt, \qquad \text{and (iii)}\ \int_0^\infty \frac{dt}{(2t + i)^3}.$$
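As a numerical cross-check of the primitive-based approach for integral (ii), one can compare $F(2) - F(1)$ with a simple quadrature. The sketch below is an illustration only: the decomposition $(t - 3i)/(t + 2i) = 1 - 5i/(t + 2i)$ and the primitive $F(t) = t - 5i \log(t + 2i)$ (principal branch) are worked out here and are not taken from the note, and the midpoint rule with the chosen step count is an arbitrary choice.

```python
import cmath


def integrand(t):
    return (t - 3j) / (t + 2j)


def primitive(t):
    # (t - 3i)/(t + 2i) = 1 - 5i/(t + 2i).  For real t the point t + 2i lies in the
    # upper half-plane, so the principal logarithm is differentiable along [1, 2].
    return t - 5j * cmath.log(t + 2j)


def midpoint_rule(f, a, b, steps=20000):
    """Composite midpoint rule for a complex-valued integrand on [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (k + 0.5) * h) for k in range(steps))


exact = primitive(2.0) - primitive(1.0)
approx = midpoint_rule(integrand, 1.0, 2.0)
```

The two values agree to well within the quadrature error, as the FTC predicts for this continuous integrand.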
So far, the substantial power hidden in Theorems 2 and 3 has not been fully utilized,
namely the fact that the derivative of the continuous function F may exist only nearly
everywhere. We illustrate this in the following examples.
Example 2. Let $f : (0, 1) \to \mathbb{C}$ be the function defined by f(t) = 0 if t is rational,
and $f(t) = \log t + i\,t^{-4/5}$ otherwise. Let $\chi$ be the characteristic function of
$(0, 1) \setminus \mathbb{Q}$. Then $F_1(t) = t \log t - t$ and $F_2(t) = 5t^{1/5}$ are generalized primitives to
$f_1(t) = \chi(t) \log t$ and $f_2(t) = \chi(t)\,t^{-4/5}$ on (0, 1), respectively. Further, $F_1(1-) = -1$,
$F_1(0+) = 0$, $F_2(1-) = 5$, and $F_2(0+) = 0$. Both $f_1$ and $f_2$ are absolutely Newton
integrable as they do not change sign on (0, 1). By Theorem 3, f is Lebesgue integrable with
$\int_0^1 f = \int_0^1 f_1 + i \int_0^1 f_2 = -1 + 5i$. (Note that by splitting the real and
imaginary parts of f, we avoided the need for finding a generalized primitive for
$|f(t)| = \chi(t)(\log^2 t + t^{-8/5})^{1/2}$. This is not always the most efficient maneuver; see
Example 1 (iii).)
Example 3. Let f be defined on the interval (0, 1) by
$$f(x) = \frac{1}{\sqrt{(n+2)\{(n+1)x - n\}}} \quad \text{if } x_n < x \le x_{n+1},\ n = 0, 1, 2, \ldots,$$
where $x_n = n/(n+1)$, n = 0, 1, 2, . . . . First sketch a graph of f; it reveals infinitely
many vertical asymptotes at the points $x_n$, n = 0, 1, 2, . . . , neatly clustering near x = 1.
On each interval $(x_n, x_{n+1})$, a primitive to f is
$$F(x) = \frac{2}{(n+1)\sqrt{n+2}}\sqrt{(n+1)x - n} + c_n, \quad x \in (x_n, x_{n+1}).$$
The constants of integration $c_n$ must be chosen wisely to make F continuous on (0, 1).
From $F(x_n-) = F(x_n+)$, we obtain $c_n = 2/[n(n+1)] + c_{n-1}$. Choosing $c_0 = 0$, we get
$$c_n = 2\sum_{k=1}^{n} \frac{1}{k(k+1)} = 2\sum_{k=1}^{n}\left(\frac{1}{k} - \frac{1}{k+1}\right) = 2\left(1 - \frac{1}{n+1}\right) = \frac{2n}{n+1},$$
n = 0, 1, 2, . . . . Setting $F(x_n) = F(x_n-) = F(x_n+)$ for n = 1, 2, 3, . . . , we make
F continuous on (0, 1), but the derivatives $F'(x_n)$ fail to exist for n = 1, 2, 3, . . . .
As the integrand is nonnegative, its Newton integrability implies absolute Newton
integrability. Clearly, F(0+) = 0. Further, F is increasing on (0, 1) being continuous
there and having a positive derivative nearly everywhere in (0, 1) (see
[2, Theorem B25]). Also, F is bounded on (0, 1) as on each interval $(x_n, x_{n+1}]$
we have $F(x_{n+1}) = 2x_{n+1} \le 2$. Hence, the limit F(1−) exists and is equal to
$\lim_{n \to \infty} F(x_{n+1}) = 2$. Thus, $\int_0^1 f(x)\,dx = F(1-) - F(0+) = 2$. We note that f
is not improperly Riemann integrable.
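The bookkeeping in Example 3 can be verified with exact rational arithmetic. The sketch below relies on one computation made here rather than in the text: the increment of F across $(x_n, x_{n+1}]$ equals $2/((n+1)(n+2))$, which is the recursion $c_{n+1} - c_n$ shifted by one index. The function names are hypothetical.

```python
from fractions import Fraction


def piece(n):
    """Exact increment F(x_{n+1}) - F(x_n+) of the primitive over (x_n, x_{n+1}]."""
    return Fraction(2, (n + 1) * (n + 2))


def F_at_breakpoint(n):
    """F(x_{n+1}), obtained by summing the increments over the intervals 0, ..., n."""
    return sum(piece(k) for k in range(n + 1))
```

The partial sums telescope to $2(n+1)/(n+2) = 2x_{n+1}$, which increases to 2, in agreement with the value of the integral.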
Example 4. A striking example of a Lebesgue integrable function that is not improperly
Riemann integrable and that has a vertical asymptote at each rational point of the
interval [0, 1] is given by Richardson in [3, Example 5.44]:
$$f(x) = \sum_{k=1}^{\infty} 2^{-k}\,|x - q_k|^{-1/2},$$
where $(q_k)$ is a sequence containing all rational numbers in [0, 1]. Write
$f_k(x) = 2^{-k}|x - q_k|^{-1/2}$ for $x \in [0, 1] \setminus \{q_k\}$, k = 1, 2, . . . . Then $f_k$ is absolutely Newton
integrable with a generalized primitive $F_k(x) = 2^{-k+1}\operatorname{sgn}(x - q_k)\,|x - q_k|^{1/2}$ in [0, 1],
and the integral $(N)\int_0^1 f_k = F_k(1-) - F_k(0+) = 2^{-k+1}\big((1 - q_k)^{1/2} + q_k^{1/2}\big)$. By
Theorem 3, this is also the Lebesgue integral of $f_k$. We have
$$\alpha := \sum_{k=1}^{\infty} \int_0^1 |f_k(t)|\,dt = \sum_{k=1}^{\infty} 2^{-k+1}\big((1 - q_k)^{1/2} + q_k^{1/2}\big) < \infty.$$
By the term-by-term integration of series [2, Theorem 13.35], $f = \sum_k f_k$ converges
almost everywhere in [0, 1], is Lebesgue integrable, and
$$\int_0^1 f(t)\,dt = \alpha.$$
ACKNOWLEDGMENT. I would like to thank the referees for their comments, which led to improved pre-
sentation of this note.
REFERENCES
1. J. J. Koliha, A fundamental theorem of calculus for Lebesgue integration, Amer. Math. Monthly 113 (2006)
551–555.
2. J. J. Koliha, Metrics, Norms and Integrals: An Introduction to Contemporary Analysis. World Scientific
Publishing, Singapore, 2008.
3. L. F. Richardson, Measure and Integration: A Concise Introduction to Real Analysis. John Wiley, New
York, 2009.
The University of Melbourne, Melbourne VIC 3010, Australia
koliha@unimelb.edu.au
Problems and Solutions
Edited by Gerald A. Edgar, Doug Hensley, Douglas B. West
with the collaboration of Itshak Borosh, Paul Bracken, Ezra A. Brown, Randall
Dougherty, Tamás Erdélyi, Zachary Franco, Christian Friesen, Ira M. Gessel, László
Lipták, Frederick W. Luttmann, Vania Mascioni, Frank B. Miles, Richard Pfiefer,
Dave Renfro, Cecil C. Rousseau, Leonard Smiley, Kenneth Stolarsky, Richard Stong,
Walter Stromquist, Daniel Ullman, Charles Vanden Eynden, Sam Vandervelde, and
Fuzhen Zhang.
Proposed problems and solutions should be sent in duplicate to the MONTHLY
problems address on the back of the title page. Proposed problems should never
be under submission concurrently to more than one journal. Submitted solutions
should arrive before August 31, 2014. Additional information, such as generalizations
and references, is welcome. The problem number and the solver's name
and address should appear on each solution. An asterisk (*) after the number of
a problem or a part of a problem indicates that no solution is currently available.
PROBLEMS
11768. Proposed by Ovidiu Furdui, Technical University of Cluj-Napoca, Cluj-Napoca,
Romania. Let f be a bounded continuous function mapping $[0, \infty)$ to itself.
Find
$$\lim_{n \to \infty} n\left(\sqrt[n]{\int_0^\infty f^{n+1}(x)\,e^{-x}\,dx} - \sqrt[n]{\int_0^\infty f^{n}(x)\,e^{-x}\,dx}\right).$$
11769. Proposed by Pál Péter Dályay, Szeged, Hungary. Let $a_1, \ldots, a_n$ and $b_1, \ldots, b_n$
be positive real numbers. Show that
$$\left(\sum_{j=1}^{n} \frac{a_j}{b_j}\right)^{2} - 2\sum_{j,k=1}^{n} \frac{a_j a_k}{(b_j + b_k)^2} \le 2\left(\sum_{j,k=1}^{n} \frac{a_j a_k}{b_j + b_k} \sum_{l,m=1}^{n} \frac{a_l a_m}{(b_l + b_m)^3}\right)^{1/2}.$$
11770. Proposed by Spiros P. Andriopoulos, Third High School of Amaliada, Eleia,
Greece. Prove, for real numbers a, b, x, y with a > b > 1 and x > y > 1, that
$$\frac{a^x - b^y}{x - y} > \left(\frac{a + b}{2}\right)^{(x+y)/2} \log\left(\frac{a + b}{2}\right).$$
11771. Proposed by D. M. Bătineţu-Giurgiu, "Matei Basarab" National College,
Bucharest, Romania, and Neculai Stanciu, "George Emil Palade" School, Buzău,
Romania. Let $n!! = \prod_{i=0}^{\lfloor (n-1)/2 \rfloor} (n - 2i)$. Find
$$\lim_{n \to \infty} \left(\sqrt[n]{(2n-1)!!}\left(\tan\frac{\pi \sqrt[n+1]{(n+1)!}}{4\sqrt[n]{n!}} - 1\right)\right).$$
11772. Proposed by Mircea Merca, University of Craiova, Craiova, Romania. Let n
be a positive integer. Prove that the number of integer partitions of 2n +1 that do not
contain 1 as a part is less than or equal to the number of integer partitions of 2n that
contain at least one odd part.
11773. Proposed by Moubinool Omarjee, Lycée Henri IV, Paris, France. Given a positive
real number $a_0$, let $a_{n+1} = \exp\left(-\sum_{k=0}^{n} a_k\right)$ for $n \ge 0$. For which values of b does
$\sum_{n=0}^{\infty} (a_n)^b$ converge?
11774. Proposed by Yunus Tuncbilek, Ataturk High School of Science, Istanbul, Turkey
and Danny Lee, Herkimer Senior High School, NY, NY. Let $\omega$ be the circumscribed
circle of triangle ABC. The A-mixtilinear incircle of ABC and $\omega$ is the circle that is
internally tangent to $\omega$, AB, and AC, and similarly for B and C. Let $A'$, $P_B$, and $P_C$ be
the points on $\omega$, AB, and AC, respectively, at which the A-mixtilinear incircle touches.
Define $B'$ and $C'$ in the same manner that $A'$ was defined. (See figure.)
[Figure: triangle ABC with its circumscribed circle (center O) and the A-mixtilinear incircle
(center $O_A$), showing the points $A'$, $B'$, $P_B$, and $P_C$.]
Prove that triangles $C'P_BB$ and $CP_CB'$ are similar.
SOLUTIONS
The Lenstra Constant of a Ring
11628 [2012, 162]. Proposed by Jeffrey C. Lagarias and Michael E. Zieve, University
of Michigan, Ann Arbor, MI. Define the Lenstra constant L(R) of a commutative ring
R to be the size of the largest subset A of R such that a − b is a unit (invertible
element) in R for any distinct elements $a, b \in A$. Show that for each positive integer
N, the Lenstra constant of the ring $\mathbb{Z}(1/N)$ is the least prime that does not divide N.
Solution by Mark D. Meyerson, United States Naval Academy, Annapolis, MD. The
elements of $\mathbb{Z}(1/N)$ are the numbers of the form $k/N^r$ with $k, r \in \mathbb{Z}$. Let $p_1^{e_1} \cdots p_m^{e_m}$
be the prime factorization of N; each $e_i$ is a positive integer. The units in $\mathbb{Z}(1/N)$ are
numbers of the form $\pm p_1^{d_1} \cdots p_m^{d_m}$ with each $d_i \in \mathbb{Z}$. Let p be the least prime that does
not divide N. The set {1, . . . , p} has the property that any difference of two distinct
elements is a unit, since any prime factor of such a difference is a prime factor of N.
Hence, $L(\mathbb{Z}(1/N)) \ge p$.
Now, let L be a subset of $\mathbb{Z}(1/N)$ such that any nonzero difference is a unit, and
suppose that |L| > p. By deleting extra elements, we may assume |L| = p + 1. If
we multiply the p + 1 elements of L by a sufficiently high power of N to make all
the elements integers, the nonzero differences will still be units. However, by the
pigeonhole principle, two of the p + 1 elements are congruent mod p. Their difference
is a multiple of p and hence is not a unit. It follows that $L(\mathbb{Z}(1/N)) \le p$. The two
inequalities prove that p is the Lenstra constant of this ring.
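The lower-bound half of the argument can be illustrated computationally for integer sets. The sketch below uses the criterion from the solution that a nonzero integer is a unit of $\mathbb{Z}(1/N)$ exactly when all of its prime factors divide N; the function names are hypothetical, and trial division is used only because the numbers involved are tiny.

```python
from itertools import combinations


def prime_factors(k):
    """Set of prime factors of a nonzero integer k (trial division)."""
    k, out, d = abs(k), set(), 2
    while d * d <= k:
        while k % d == 0:
            out.add(d)
            k //= d
        d += 1
    if k > 1:
        out.add(k)
    return out


def is_unit(k, N):
    """A nonzero integer k is a unit of Z(1/N) iff every prime factor of k divides N."""
    return k != 0 and all(N % q == 0 for q in prime_factors(k))


def least_prime_not_dividing(N):
    p = 2
    while prime_factors(p) != {p} or N % p == 0:
        p += 1
    return p


def differences_are_units(A, N):
    return all(is_unit(a - b, N) for a, b in combinations(A, 2))
```

For N = 6 the least prime not dividing N is 5; the set {1, . . . , 5} has all differences invertible, while {1, . . . , 6} does not, since 6 − 1 = 5 is not a unit.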
Also solved by P. Budney, N. Caro (Brazil), R. Chapman (U. K.), W. Chengyuan (Singapore), P. P. Dályay
(Hungary), S. Dey (India), D. Fleischman, O. Geupel (Germany), Y. J. Ionin, B. Karaivanov, J. H. Lindsey II,
O. Lossers (Netherlands), A. Magidin, G. Martin (Canada), M. A. Prasad (India), F. Richman, J. Riegsecker,
K. Schilling, J. H. Smith, J. H. Steelman, R. Stong, M. Tetiva (Romania), Colgate University Problem Solving
Group, NSA Problems Group, TCDmath Problems Group (Ireland), Texas State University Problem Solving
Group, University of Louisiana at Lafayette Math Club, and the proposers.
Rotatable Quasigroups
11631 [2012, 247–248]. Proposed by Pál Péter Dályay, Szeged, Hungary. A quasigroup
$(Q, *)$ is a set Q together with a binary operation $*$ such that for each $a, b \in Q$
there exist unique x and unique y (which may be equal) such that $a * x = b$ and $y * a = b$.
The Cayley table of a finite quasigroup is its "times table". A quasigroup has property
P if each row of the table is a rotation of the first row.
Find all positive integers n for which there exists a quasigroup $(\{1, \ldots, n\}, *)$ with
property P in which all elements are idempotent. (For instance, the Cayley table below
defines a binary operation on {1, . . . , 5} with property P in which each element is
idempotent.)
* 1 2 3 4 5
1 1 5 4 3 2
2 3 2 1 5 4
3 5 4 3 2 1
4 2 1 5 4 3
5 4 3 2 1 5
Solution by Fred Richman, Florida Atlantic University, Boca Raton, FL. Such quasigroups
exist if and only if n is odd. Cayley tables are just Latin squares; idempotence
requires diagonal 1, . . . , n in order. The table is then determined by its first row and
property P. The problem is thus to find a permutation of 1, . . . , n as the first row so
that the entries in the first column are distinct, since property P then completes a Latin
square for the table.
We calculate the first entry in row $1 * k$. This row is a rotation of row 1, and it
must have $1 * k$ in column $1 * k$. Also row 1 has $1 * k$ in column k, so row 1 is rotated
leftward by $k - (1 * k)$ positions to become row $1 * k$. Thus, the first entry in row $1 * k$
is $1 * [k - (1 * k) + 1]$. For these values to be distinct, the values $k - (1 * k)$ must be
distinct modulo n.
When n is odd, 2 is invertible (modulo n). Setting $1 * k \equiv 2 - k$ as in the proposer's
example yields $k - (1 * k) \equiv 2(k - 1)$, and these elements are distinct (modulo n).
When n is even, the values $k - (1 * k)$ cannot be distinct (modulo n) because
$\sum_{i=1}^{n} i = (n + 1)n/2 \equiv n/2 \pmod{n}$ and $\sum_{k=1}^{n} (k - (1 * k)) = 0$.
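The construction for odd n can be made explicit. Starting from the first row $1 * k \equiv 2 - k$ and rotating so that the diagonal is idempotent gives $i * j \equiv 2i - j \pmod{n}$; this closed form is derived here for illustration and reproduces the table in the problem statement when n = 5. The function names below are hypothetical.

```python
def rotatable_idempotent_table(n):
    """Cayley table on {1, ..., n} with i*j ≡ 2i - j (mod n); intended for odd n."""
    return [[(2 * i - j - 1) % n + 1 for j in range(1, n + 1)]
            for i in range(1, n + 1)]


def is_rotation(row, first):
    return any(row == first[s:] + first[:s] for s in range(len(first)))


def has_all_properties(table):
    """Latin square, idempotent diagonal, and every row a rotation of the first row."""
    n = len(table)
    symbols = set(range(1, n + 1))
    latin = (all(set(r) == symbols for r in table) and
             all({table[i][j] for i in range(n)} == symbols for j in range(n)))
    idempotent = all(table[i][i] == i + 1 for i in range(n))
    rotations = all(is_rotation(r, table[0]) for r in table)
    return latin and idempotent and rotations
```

For even n the same formula fails to give a Latin square, consistent with the parity obstruction in the solution.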
Editorial comment. When n is odd, one can require even more: There are many
idempotent commutative quasigroups on $\mathbb{Z}_n$, such as by putting (i + j)/2 in position
(i, j), using the uniqueness of the multiplicative inverse of 2. This construction for
n = 2k + 1 is used in the Bose construction of a Steiner triple system on 6k + 3
elements (R. C. Bose, On the construction of balanced incomplete block designs, Ann.
Eugenics 9 (1939), 353–399).
Also solved by D. Beckwith, R. Chapman (U. K.), S. M. Gagola Jr., O. Geupel (Germany), A. Habil (Syria),
E. A. Herman, Y. J. Ionin, B. Karaivanov, J. H. Lindsey II, J. M. Lockhart, O. P. Lossers (Netherlands),
C. R. Pranesachar (India), R. E. Prather, J. H. Steelman, R. Stong, J. Wojdylo, Colgate University Problem
Solving Group, GCHQ Problem Solving Group (U. K.), TCDmath Problem Group (Ireland), and the proposer.
A Harmonic Identity
11633 [2012, 248]. Proposed by Anthony Sofo, Victoria University, Melbourne, Australia.
For real a, let $H_n^{(a)} = \sum_{j=1}^{n} j^{-a}$. Show that for integers a, b, and n with $a \ge 1$,
$b \ge 0$, and $n \ge 1$,
$$\sum_{k=1}^{n} \frac{k\big(H_k^2 + H_k^{(2)}\big) + 2(k + b)^a H_k^{(1)} H_{k+b-1}^{(a)}}{k(k + b)^a} = H_{n+b}^{(a)}\big(H_n^2 + H_n^{(2)}\big).$$
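Solution by Subhadip Dey, Bangalore City, Karnataka, India. As in the problem, we
use the notation $H_n = H_n^{(1)}$ and $H_0^{(a)} = 0$. Using the identities
$$H_n^2 + H_n^{(2)} = 2\sum_{j=1}^{n}\sum_{i=1}^{j} \frac{1}{ij} = 2\sum_{j=1}^{n} \frac{H_j}{j} \quad\text{and}\quad \sum_{j=1}^{n}\sum_{i=1}^{j} a_i b_j = \sum_{i=1}^{n}\sum_{j=i}^{n} a_i b_j,$$
the first term on the left side of the identity becomes
$$\sum_{k=1}^{n} \frac{H_k^2 + H_k^{(2)}}{(k + b)^a} = 2\sum_{k=1}^{n}\sum_{j=1}^{k} \frac{H_j}{j(k + b)^a} = 2\sum_{j=1}^{n} \frac{H_j}{j} \sum_{k=j}^{n} \frac{1}{(k + b)^a}.$$
Therefore, we compute
$$\begin{aligned}
\sum_{k=1}^{n} \frac{H_k^2 + H_k^{(2)}}{(k + b)^a} + 2\sum_{k=1}^{n} \frac{H_k H_{k+b-1}^{(a)}}{k}
&= 2\sum_{j=1}^{n} \frac{H_j}{j} \sum_{k=j}^{n} \frac{1}{(k + b)^a} + 2\sum_{j=1}^{n} \frac{H_j}{j} H_{j+b-1}^{(a)} \\
&= 2\sum_{j=1}^{n} \frac{H_j}{j}\left(\sum_{k=j}^{n} \frac{1}{(k + b)^a} + H_{j+b-1}^{(a)}\right) \\
&= 2\sum_{j=1}^{n} \frac{H_j}{j} H_{n+b}^{(a)} = H_{n+b}^{(a)}\big(H_n^2 + H_n^{(2)}\big).
\end{aligned}$$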
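The identity lends itself to an exact check with rational arithmetic for small parameters. The sketch below evaluates both sides with Python's `fractions.Fraction`; the function names `H`, `lhs`, and `rhs` are chosen for this illustration.

```python
from fractions import Fraction


def H(n, a=1):
    """Generalized harmonic number H_n^{(a)}; the empty sum gives H_0^{(a)} = 0."""
    return sum(Fraction(1, j ** a) for j in range(1, n + 1))


def lhs(a, b, n):
    total = Fraction(0)
    for k in range(1, n + 1):
        numerator = (k * (H(k) ** 2 + H(k, 2))
                     + 2 * (k + b) ** a * H(k) * H(k + b - 1, a))
        total += numerator / (k * (k + b) ** a)
    return total


def rhs(a, b, n):
    return H(n + b, a) * (H(n) ** 2 + H(n, 2))
```

For example, with a = 1, b = 0, n = 1 both sides equal 2.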
Editorial comment. Several solvers noted that the identity is valid for all real a. E. A.
Herman generalized it to
$$\sum_{k=1}^{n} \frac{k\big(H_k^p + H_k^{(p)}\big) + Z_{k,p}\,(k + b)^a H_k H_{k+b-1}^{(a)}}{k(k + b)^a} = H_{n+b}^{(a)}\big(H_n^p + H_n^{(p)}\big),$$
where p is a positive even integer and
$$Z_{k,p} = \sum_{j=1}^{p-1} \binom{p}{j} H_k^{\,j-1}\left(-\frac{1}{k}\right)^{p-1-j}.$$
Also solved by P. Bracken, R. Chapman (U. K.), P. P. Dályay (Hungary), E. S. Eyeson, O. Geupel (Germany),
E. A. Herman, B. Karaivanov, O. Kouba (Syria), O. P. Lossers (The Netherlands), M. Omarjee (France),
C. R. Pranesachar (India), M. A. Prasad (India), J. H. Steelman, R. Stong, R. Tauraso (Italy), GCHQ Problem
Solving Group (U. K.), and the proposer.
A Fractional Integral
11637 [2012, 344]. Proposed by Ovidiu Furdui, Technical University of Cluj-Napoca,
Cluj, Romania. Let $m \ge 1$ be an integer. Let $\{u\} = u - \lfloor u \rfloor$; the quantity
{u} is called the fractional part of u. Prove that
$$\int_0^1 \left\{\frac{1}{x}\right\}^m x^m\,dx = 1 - \frac{1}{m + 1}\sum_{k=1}^{m} \zeta(k + 1).$$
(Here $\zeta$ denotes the Riemann zeta function.)
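Solution by Patrick J. Fitzsimmons, San Diego, CA. First note that
$\left\{\frac{1}{x}\right\} = \frac{1}{x} - n$ if $\frac{1}{n+1} < x \le \frac{1}{n}$. From this it follows that
$$\int_0^1 \left\{\frac{1}{x}\right\}^m x^m\,dx = \sum_{n=1}^{\infty} \int_{1/(n+1)}^{1/n} (1 - nx)^m\,dx = \sum_{n=1}^{\infty} \left.\frac{-(1 - nx)^{m+1}}{n(m + 1)}\right|_{1/(n+1)}^{1/n} = \frac{1}{m + 1}\sum_{n=1}^{\infty} \frac{1}{n}\left(\frac{1}{n + 1}\right)^{m+1}.$$
On the other hand, with $Z = \sum_{k=1}^{m} \zeta(k + 1)$, we have
$$\begin{aligned}
Z &= \sum_{k=1}^{m}\sum_{n=1}^{\infty} \frac{1}{n^{k+1}} = \sum_{n=1}^{\infty}\sum_{k=1}^{m} \frac{1}{n^{k+1}} = m + \sum_{n=2}^{\infty} \frac{\frac{1}{n^2} - \frac{1}{n^{m+2}}}{1 - \frac{1}{n}} \\
&= m + \sum_{n=2}^{\infty} \frac{1}{n(n - 1)}\left(1 - \frac{1}{n^m}\right) = m + \sum_{n=2}^{\infty} \frac{1}{n(n - 1)} - \sum_{n=2}^{\infty} \frac{1}{(n - 1)n^{m+1}} \\
&= m + 1 - \sum_{n=1}^{\infty} \frac{1}{n(n + 1)^{m+1}}.
\end{aligned}$$
Thus both sides of the stated identity equal $\frac{1}{m+1}\sum_{n=1}^{\infty} \frac{1}{n(n+1)^{m+1}}$.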
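The equality of the two expressions can be checked numerically. In the sketch below, `zeta` is a hypothetical helper that approximates $\zeta(s)$ by a partial sum plus the first two Euler-Maclaurin tail terms, which is adequate for integer $s \ge 2$; the truncation lengths are arbitrary choices made for this illustration.

```python
def zeta(s, N=10_000):
    """Approximate zeta(s) for s >= 2: partial sum plus an Euler-Maclaurin tail estimate."""
    partial = sum(n ** -s for n in range(1, N + 1))
    return partial + N ** (1 - s) / (s - 1) - N ** -s / 2


def closed_form(m):
    """1 - (1/(m+1)) * sum_{k=1}^{m} zeta(k+1)."""
    return 1 - sum(zeta(k + 1) for k in range(1, m + 1)) / (m + 1)


def series_form(m, N=200_000):
    """(1/(m+1)) * sum_{n>=1} 1/(n (n+1)^(m+1)), truncated after N terms."""
    return sum(1 / (n * (n + 1) ** (m + 1)) for n in range(1, N + 1)) / (m + 1)
```

For m = 1 both functions return approximately $1 - \pi^2/12$.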
Editorial comment. A similar problem appeared as Problem 1845, Math. Mag., 84
(April 2011), 155–156, and as Problem 11206, this MONTHLY 114 (2007), 928–929.
Eugene A. Herman showed for a > m 1 that
_
1
0
_
1
x
_
m
x
a
dx =
1
a m +1

1
m +1
m
k=1
(k +1 m +1)
_
m+1
m+1k
_
_
a+1
m+1k
_.
Also solved by T. Amdeberhan, P. J. Anderson (Canada), M. Bataille (France), D. Beckwith, K. N. Boyadzhiev,
M. A. Carlton, N. Caro (Brazil), R. Chapman (U. K.), M. W. Coffey, C. Curtis, P. P. Dályay (Hungary),
E. S. Eyeson, D. Fleischman, O. Geupel (Germany), M. L. Glasser, M. Goldenberg & M. Kaplan, D. Gove,
G. C. Greubel, J.-P. Grivaux (France), J. A. Grzesik, E. A. Herman, E. Hysnelaj (Australia) & E. Bojaxhiu
(Germany), W. Janous (Austria), B. Karaivanov, D. R. Kim (Korea), O. Kouba (Syria), H. Kwong, J. B. Little,
O. P. Lossers (Netherlands), I. Mező (Hungary), U. Milutinović (Slovenia), J. Minkus, R. Nandan, M. Omarjee
(France), P. Perfetti (Italy), T. Perrson & M. P. Sundqvist (Sweden), C. R. Pranesachar (India), M. A. Prasad
(India), R. Pratt, V. Sah, J. Schlosberg, N. C. Singer, A. Stenger, R. Stong, R. Tauraso (Italy), D. B. Tyler,
J. Vinuesa (Spain), T. Viteam (Uruguay), M. Vowe (Switzerland), A. Witkowski (Poland), J. Zacharias, GCHQ
Problem Solving Group (U. K.), Missouri State University Problem Solving Group, NSA Problems Group,
TCDmath Problem Group (Ireland), and the proposer.
Independent Triples in a Discrete Probability Space
11643 [2012, 426]. Proposed by Eugen J. Ionascu, Columbus State University, Columbus,
GA. Let r be a real number with 0 < r < 1, and define a discrete probability
measure P on $\mathbb{N}$ by $P(k) = (1 - r)r^{k-1}$ for $k \ge 1$. Show that there are uncountably
many triples $(A_1, A_2, A_3)$ of subsets of $\mathbb{N}$ that are mutually independent, that is,
$P(A_i \cap A_j) = P(A_i)P(A_j)$ for $i \ne j$ and $P(A_1 \cap A_2 \cap A_3) = P(A_1)P(A_2)P(A_3)$.
Solution by Oliver Geupel, Brühl, NRW, Germany. Let $A_1 = \bigcup_{m \ge 0} \{4m + 1, 4m + 2\}$
and $A_2 = \bigcup_{m \ge 0} \{4m + 1, 4m + 3\}$. For any set B of nonnegative integers, let
$A_3 = \bigcup_{m \in B} \{4m + 1, 4m + 2, 4m + 3, 4m + 4\}$. Since B is arbitrary, there are uncountably
many such triples.
We show that the events $A_1, A_2, A_3$ are mutually independent. We have
$$P(A_1) = (1 - r)\sum_{m=0}^{\infty} (r^{4m} + r^{4m+1}) = (1 - r)\frac{1 + r}{1 - r^4} = \frac{1}{1 + r^2},$$
$$P(A_2) = (1 - r)\sum_{m=0}^{\infty} (r^{4m} + r^{4m+2}) = (1 - r)\frac{1 + r^2}{1 - r^4} = \frac{1}{1 + r}, \quad\text{and}$$
$$P(A_3) = (1 - r)\sum_{m \in B} (r^{4m} + r^{4m+1} + r^{4m+2} + r^{4m+3}) = (1 - r^4)\sum_{m \in B} r^{4m}.$$
Furthermore,
$$P(A_1 \cap A_2) = (1 - r)\sum_{m=0}^{\infty} r^{4m} = \frac{1 - r}{1 - r^4} = P(A_1)P(A_2),$$
$$P(A_1 \cap A_3) = (1 - r)\sum_{m \in B} (r^{4m} + r^{4m+1}) = (1 - r^2)\sum_{m \in B} r^{4m} = P(A_1)P(A_3),$$
$$P(A_2 \cap A_3) = (1 - r)\sum_{m \in B} (r^{4m} + r^{4m+2}) = P(A_2)P(A_3),$$
and
$$P(A_1 \cap A_2 \cap A_3) = (1 - r)\sum_{m \in B} r^{4m} = P(A_1)P(A_2)P(A_3).$$
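The independence relations can be spot-checked numerically. Since the sets are infinite, the sketch below truncates $\mathbb{N}$ to the first `blocks` groups of four integers; with r = 1/2 and 60 blocks the neglected mass is far below floating-point precision. The particular choices of r, B, and `blocks`, and the function names, are assumptions made for this illustration.

```python
def prob(S, r):
    """P(S) for a finite set S of positive integers, with P(k) = (1 - r) r^(k-1)."""
    return sum((1 - r) * r ** (k - 1) for k in S)


def build_sets(B, blocks):
    """Truncated versions of A1, A2, A3; B must be a subset of range(blocks)."""
    A1 = {4 * m + c for m in range(blocks) for c in (1, 2)}
    A2 = {4 * m + c for m in range(blocks) for c in (1, 3)}
    A3 = {4 * m + c for m in B for c in (1, 2, 3, 4)}
    return A1, A2, A3


r = 0.5
A1, A2, A3 = build_sets({0, 2, 5}, blocks=60)
```

With these sets, `prob(A1, r)` is approximately $1/(1 + r^2) = 0.8$ and the pairwise and triple product relations hold up to rounding.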
Editorial comment. Many solvers noted that there are trivial solutions, such as
$A_1 = A_2 = \mathbb{N}$ and $A_3$ arbitrary. The solution presented here demonstrates that the sets can
be required to be nontrivial.
Solved also by M. Carlton, J. H. Lindsey II, M. D. Meyerson, M. Rajeswari (India), K. Schilling, R. Stong,
GCHQ Problem Solving Group (U. K.), and the proposer.
Factorable Polynomials
11645 [2012, 427]. Proposed by Christopher J. Hillar, University of California, Berkeley,
CA, Lionel Levine, Cornell University, Ithaca, NY, and Darren Rhea, University
of California, San Francisco, CA. Determine all positive integers n such that the polynomial
g in two variables given by $g(x, y) = 1 + y^2 \sum_{k=1}^{n} x^{2k} + y^4 x^{2n+2}$ factors in
$\mathbb{C}[x, y]$.
Solution by O. P. Lossers, Eindhoven University of Technology, Eindhoven, The
Netherlands. For n = 1, g has $x^2y^2 - \omega$ as a factor, where $\omega$ is a primitive cube
root of unity in $\mathbb{C}$. For n = 2, g has $x^2y^2 + 1$ as a factor. We claim that g does not
factor in $\mathbb{C}[x, y]$ when $n \ge 3$. Equivalently, we claim that h does not factor in $\mathbb{C}[x, y]$
when $n \ge 3$, where $h(x, y) = y^4 + y^2 \sum_{k=1}^{n} x^{2k} + x^{2n+2}$.
First, we note that h is a polynomial in $y^2$ over $\mathbb{C}[x]$, so if h has a linear factor,
necessarily of the form y + a(x), then y − a(x) is another linear factor and so $y^2 - a(x)^2$
is a quadratic factor of h.
If our claim is false, then h factors as a product of two quadratic polynomials in y
over $\mathbb{C}[x]$, and such a factorization has the form
$$h(x, y) = (y^2 + a(x)y + \beta x^r)(y^2 - a(x)y + \beta^{-1} x^s),$$
where r and s are nonnegative integers such that r + s = 2n + 2 and $a(x) \in \mathbb{C}[x]$.
Inspecting the coefficient of y shows that $\beta x^r a(x) = \beta^{-1} x^s a(x)$. Now, a(x) = 0 is
impossible if $n \ge 3$, as the expression for h(x, y) would not have enough terms. Therefore,
r = s = n + 1 and $\beta = \pm 1$.
Let $\sigma$ be the polynomial given by $\sigma(x) = \sum_{k=1}^{n} x^{2k}$. From the coefficient of $y^2$
in h, we see that $\sigma = \pm 2x^{n+1} - a^2(x)$, with a of degree n. Writing a = ib and
$b = c(x^2) + x\,d(x^2)$ gives
$$\sigma = \pm 2x^{n+1} + c^2(x^2) + 2x\,c(x^2)d(x^2) + x^2 d^2(x^2). \qquad (1)$$
If n is even, then equating odd parts in (1) gives $0 = \pm 2x^{n+1} + 2x\,c(x^2)d(x^2)$, whence
c and d must be monomials. But then the left side of (1) has n terms while the right
side has just two. So n is odd, say n = 2m + 1.
In this case, from (1) it follows that cd = 0, and since $b(x) = c(x^2) + x\,d(x^2)$ has
degree 2m + 1, it must be c that is 0 so that b can have odd degree. Writing $z = x^2$,
we thus have
$$\sum_{k=1}^{2m+1} z^k = z\,d^2(z) \pm 2z^{m+1}, \qquad (2)$$
where d has the form $d = 1 + \sum_{j=1}^{m-1} d_j z^j + \varepsilon z^m$ with $\varepsilon \in \{-1, 1\}$. The first m terms
of d now coincide with those of $(1 - z)^{-1/2}$, so $d_j = (-1)^j \binom{-1/2}{j}$ for $0 \le j \le m - 1$.
A similar calculation for the last m coefficients of d shows that $d_{m-j} = \varepsilon d_j$ for
$0 \le j \le m - 1$. But that gives contradictory values for $d_1$ when $m \ge 1$, so there is no
factorization if $n \ge 3$, as claimed.
Also solved by G. Apostolopoulos (Greece), R. Chapman (U. K.), P. P. Dalyay (Hungary), D. Fleischman,
O. Geupel (Germany), M. Goldenberg & M. Kaplan, E. A. Herman, B. Karaivanov, O. Kouba (Syria),
J. H. Lindsey II, A. Magidin, M. A. Prasad (India), N. Singer, R. Stong, E. Verriest, and the proposers.
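The two small cases named at the start of the solution to Problem 11645 can be checked mechanically. The sketch below is only an illustration, not part of the published solution: it stores a polynomial in $x, y$ as a dictionary from exponent pairs to coefficients, and the helper names (`poly_mul`, `g`, `nearly_equal`) are hypothetical. The cofactor $x^4y^2 + 1$ for $n = 2$ is not stated in the solution; it is assumed here and confirmed by the multiplication. For $n = 1$ the check uses $t^2 + t + 1 = (t - \omega)(t - \bar\omega)$ with $t = x^2y^2$.

```python
import cmath

def poly_mul(p, q):
    """Multiply sparse polynomials stored as {(deg_x, deg_y): coefficient}."""
    out = {}
    for (i1, j1), c1 in p.items():
        for (i2, j2), c2 in q.items():
            key = (i1 + i2, j1 + j2)
            out[key] = out.get(key, 0) + c1 * c2
    return {k: v for k, v in out.items() if v != 0}

def g(n):
    """g(x, y) = 1 + y^2 * sum_{k=1}^{n} x^(2k) + y^4 * x^(2n+2)."""
    poly = {(0, 0): 1, (2 * n + 2, 4): 1}
    for k in range(1, n + 1):
        poly[(2 * k, 2)] = 1
    return poly

def nearly_equal(p, q, tol=1e-12):
    """Compare two sparse polynomials coefficient by coefficient."""
    return all(abs(p.get(k, 0) - q.get(k, 0)) < tol for k in set(p) | set(q))

# n = 2: (x^2 y^2 + 1)(x^4 y^2 + 1), exact integer arithmetic.
n2_ok = poly_mul({(2, 2): 1, (0, 0): 1}, {(4, 2): 1, (0, 0): 1}) == g(2)

# n = 1: (x^2 y^2 - w)(x^2 y^2 - conj(w)) with w a primitive cube root of unity.
omega = cmath.exp(2j * cmath.pi / 3)
product = poly_mul({(2, 2): 1, (0, 0): -omega},
                   {(2, 2): 1, (0, 0): -omega.conjugate()})
n1_ok = nearly_equal(product, g(1))

print(n2_ok, n1_ok)
```

The $n = 2$ comparison is exact; the $n = 1$ comparison uses floating-point complex numbers, so it is tested only up to a small tolerance.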
A Geometric Inequality
11646 [2012, 427]. Proposed by Pál Péter Dályay, Szeged, Hungary. Let $ABC$ be an
acute triangle, and let $A_1$, $B_1$, $C_1$ be the intersection points of the angle bisectors from
$A$, $B$, $C$ to the respective opposite sides. Let $R$ and $r$ be the circumradius and the
inradius of $ABC$, and let $R_A$, $R_B$, $R_C$ be the circumradii of the triangles $AC_1B_1$, $BA_1C_1$,
and $CA_1B_1$, respectively. Let $H$ be the orthocenter of $ABC$, and let $d_a$, $d_b$, $d_c$ be the
distances from $H$ to sides $BC$, $CA$, and $AB$, respectively. Show that
$$2r(R_A + R_B + R_C) \ge R(d_a + d_b + d_c).$$
Solution by Peter Nüesch, Switzerland. Our solution uses Problem 11552 (this MONTHLY,
October 2012, pp. 702–703). We write $a$, $b$, $c$ for the lengths of the sides of $ABC$, $s$ for
the semi-perimeter, and $\alpha$, $\beta$, $\gamma$ for the measures of the angles. From the definitions,
we have
$$AB_1 = \frac{bc}{c + a}, \qquad AC_1 = \frac{bc}{a + b}, \qquad B_1C_1 = a_1 = 2R_A \sin\alpha.$$
Using $a = 2R \sin\alpha$, we get $R_A = R a_1/a$. Thus,
$$R_A + R_B + R_C = R\left(\frac{a_1}{a} + \frac{b_1}{b} + \frac{c_1}{c}\right) \ge R\left(1 + \frac{r}{R}\right) = R + r,$$
where the inequality is Problem 11552. From $d_a = 2R \cos\beta \cos\gamma$, we have
$$d_a + d_b + d_c = 2R(\cos\beta\cos\gamma + \cos\gamma\cos\alpha + \cos\alpha\cos\beta) = \frac{r^2 + s^2 - 4R^2}{2R}.$$
Note that $(r^2 + s^2 - 4R^2)/2R \le (2r(R + r))/R$, since this is a rearrangement of a
Blundon inequality, $s^2 \le 4R^2 + 4Rr + 3r^2$. (This follows from
$s^2 \le 2R^2 + 10Rr - r^2 + 2(R - 2r)\sqrt{R(R - 2r)}$, found in W. J. Blundon, Inequalities
associated with the triangle, Canad. Math. Bull. 8 (1965) 615–626.)
This proves $2r(R_A + R_B + R_C) \ge 2r(R + r) \ge R(d_a + d_b + d_c)$.
Also solved by B. Karaivanov, J. Zacharias, and the proposer.
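As a numerical sanity check of the chain in Problem 11646, the three quantities can be evaluated for a sample acute triangle. The sketch below is illustrative only: the function name `inequality_chain` and the test triangle with sides 5, 6, 7 are choices made here, not taken from the solution. It uses the bisector lengths quoted above, the law of cosines for $a_1$, $b_1$, $c_1$, and $d_a = 2R\cos\beta\cos\gamma$.

```python
import math

def inequality_chain(a, b, c):
    """Return (2r(R_A+R_B+R_C), 2r(R+r), R(d_a+d_b+d_c)) for side lengths a, b, c."""
    s = (a + b + c) / 2
    area = math.sqrt(s * (s - a) * (s - b) * (s - c))  # Heron's formula
    r = area / s                   # inradius
    R = a * b * c / (4 * area)     # circumradius
    cos_a = (b * b + c * c - a * a) / (2 * b * c)
    cos_b = (c * c + a * a - b * b) / (2 * c * a)
    cos_c = (a * a + b * b - c * c) / (2 * a * b)

    def small_circumradius(p, q, cos_t):
        # Triangle with sides p and q enclosing angle t: third side by the
        # law of cosines, circumradius by the law of sines.
        third = math.sqrt(p * p + q * q - 2 * p * q * cos_t)
        return third / (2 * math.sqrt(1 - cos_t * cos_t))

    R_A = small_circumradius(b * c / (c + a), b * c / (a + b), cos_a)
    R_B = small_circumradius(c * a / (a + b), c * a / (b + c), cos_b)
    R_C = small_circumradius(a * b / (b + c), a * b / (c + a), cos_c)
    d_sum = 2 * R * (cos_b * cos_c + cos_c * cos_a + cos_a * cos_b)
    return 2 * r * (R_A + R_B + R_C), 2 * r * (R + r), R * d_sum

lhs, mid, rhs = inequality_chain(5, 6, 7)
print(lhs >= mid >= rhs)
```

For an equilateral triangle all three quantities coincide, which matches the equality cases of Problem 11552 and of Blundon's inequality.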
A Subset That Is Not Closed
11648 [2012, 427]. Proposed by Moubinool Omarjee, Paris, France. Let $E$ be the set
of all continuous, differentiable functions from $(0, 1]$ into $\mathbb{R}$ such that
$\int_0^1 t^{-1/2} f^2(t)\,dt$ converges. Let $F$ be the set of all $f$ in $E$ such that
$\int_0^1 t^{-3/2} f^2(t)\,dt$ and $\int_0^1 t^{1/2} f'(t)^2\,dt$ converge. Equip $E$ with the distance
$$d(f, g) = \left(\int_0^1 t^{-1/2} (f - g)^2(t)\,dt\right)^{1/2}$$
to make it a metric space. Is $F$ a closed subset of $E$?
Solution by O. P. Lossers, Eindhoven University of Technology, Eindhoven, The
Netherlands. No, $F$ is not closed. Consider $f(t) = t^{1/4}$, so that $f \in E$ but $f \notin F$.
Let $\varphi$ be a differentiable function with minimum 0 and maximum 1, and such that
$\varphi(t) = 0$ for $0 < t < 1$ and $\varphi(t) = 1$ for $t > 2$. Define $\varphi_\varepsilon(t) = \varphi(t/\varepsilon)$ for $\varepsilon > 0$.
Note that $\varphi_\varepsilon f \in F$. Now
$$d(f, \varphi_\varepsilon f)^2 = \int_0^1 t^{-1/2} \bigl(1 - \varphi(t/\varepsilon)\bigr)^2 f(t)^2\,dt \le \int_0^{2\varepsilon} t^{-1/2} f(t)^2\,dt,$$
which goes to 0 as $\varepsilon$ goes to 0. Hence, $f$ is in the closure of $F$.
Editorial comment. If the problem statement had said "continuously differentiable"
and not just "continuous, differentiable," then the above argument would in fact show
that F is dense in E.
Also solved by P. P. Dályay (Hungary), O. Kouba (Syria), J. H. Lindsey II, R. Stong, and GCHQ Problem
Solving Group (U. K.).
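The integrals behind the solution to Problem 11648 can be written out for the specific function $f(t) = t^{1/4}$. The computation below is a worked sketch that takes the weights to be $t^{-1/2}$ in the definition of $E$ and of the distance, and $t^{-3/2}$ in the first condition defining $F$; the symbol $\varepsilon$ for the scaling parameter is likewise an assumption of this sketch.

```latex
\begin{align*}
\int_0^1 t^{-1/2} f(t)^2 \, dt &= \int_0^1 t^{-1/2}\, t^{1/2} \, dt = 1 < \infty
  && \text{so } f \in E, \\
\int_0^1 t^{-3/2} f(t)^2 \, dt &= \int_0^1 t^{-1} \, dt = \infty
  && \text{so } f \notin F, \\
\int_0^{2\varepsilon} t^{-1/2} f(t)^2 \, dt &= \int_0^{2\varepsilon} 1 \, dt = 2\varepsilon \to 0
  && \text{as } \varepsilon \to 0.
\end{align*}
```

The last line is the bound on the squared distance between $f$ and its truncations, so under these assumptions the distance is at most $\sqrt{2\varepsilon}$.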
Review
Encounters with Chaos and Fractals . 2nd edition. By Denny Gulick. Chapman and Hall/CRC
Press, Boca Raton, 2012, xvi + 371 pp., ISBN 978-1-58488-517-7, $79.95.
Review by: Jeffrey Nunemacher
The American Mathematical Monthly, Vol. 121, No. 4 (April), pp. 373-376
REVIEWS
Edited by Jeffrey Nunemacher
Mathematics and Computer Science, Ohio Wesleyan University, Delaware, OH 43015
Encounters with Chaos and Fractals, 2nd edition. By Denny Gulick. Chapman and Hall/CRC
Press, Boca Raton, 2012, xvi + 371 pp., ISBN 978-1-58488-517-7, $79.95.
Reviewed by Jeffrey Nunemacher
How can we convince undergraduates that mathematics is as modern and vibrant as
physics or biology in these days of the Higgs boson and genome sequencing? Cer-
tainly ordinary calculus, although it is intellectually rich, does not do the trick. Since
most promising mathematics and science students see it first in high school, the level
of excitement that I still remember from seeing it presented relatively rigorously in
college many years ago is simply not present today. My candidate for a teachable con-
temporary mathematical topic that can attract modern students is chaotic dynamical
systems, or to give the subject a more enticing name, "chaos and fractals." I have taught
courses on this subject at a variety of levels from freshman honors to senior capstone.
And the text that I have enjoyed using the most (at least for a lower-level version) is the
Gulick book, which has recently appeared in a second edition. The new edition offers
more material on fractals (three chapters rather than one) and gives expanded coverage
of background material and attention to modern algorithms. This second edition is the
subject of the current review.
The subject of chaos was invented around the turn of the twentieth century by
Poincaré (but named much later by Yorke). He showed that a deterministic system of
second-order differential equations modeling a particular three-body solar system can
have solutions that display sensitive dependence on initial conditions. Thus, some tra-
jectories simply cannot be predicted with any degree of accuracy over the long term.
But the subject did not really take off until the development of software for experi-
mentation and graphics. Once these tools were available and applications to subjects
like weather prediction and chemical reactions were discovered, there was incentive to
find the correct mathematical framework and to build an appropriate theory. Some
chaotic trajectories display fractal behavior, so this modern geometric concept occurs
naturally in the study of chaotic systems. Fractals also occur as the limit sets
of simple discrete dynamical systems. Take, for instance, the iterated function system
(IFS) defined by the three affine mappings of the plane: $T_1(v) = \tfrac{1}{2}v$,
$T_2(v) = \tfrac{1}{2}v + (1/2, 0)$, $T_3(v) = \tfrac{1}{2}v + (1/4, \sqrt{3}/4)$. If we start with the origin and iterate
this IFS many times, the limit set is the famous Sierpiński gasket, and a good computer
image is obtained by using the tenth iterate.
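The tenth iterate mentioned above can be sketched in a few lines. This is a minimal illustration, assuming that "iterate" means applying all three half-scale maps to every point of the current set (the deterministic reading of an IFS) and that the third shift is $(1/4, \sqrt{3}/4)$; the names `SHIFTS`, `apply_ifs`, and `iterate_ifs` are hypothetical and are not taken from Gulick's book.

```python
import math

# Translation parts of the three maps T_i(v) = v/2 + shift_i.
SHIFTS = [(0.0, 0.0), (0.5, 0.0), (0.25, math.sqrt(3) / 4)]

def apply_ifs(points):
    """Apply all three affine maps to every point and collect the images."""
    return {(0.5 * x + dx, 0.5 * y + dy)
            for (x, y) in points
            for (dx, dy) in SHIFTS}

def iterate_ifs(n, start=(0.0, 0.0)):
    """Return the n-th iterate of the IFS applied to the single point `start`."""
    points = {start}
    for _ in range(n):
        points = apply_ifs(points)
    return points

gasket = iterate_ifs(10)
print(len(gasket))  # number of points available for plotting
```

Plotting the points of `gasket` with any scatter-plot tool gives the familiar triangular picture; each further iterate triples the number of points, which is why a modest depth already looks dense on screen.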
It is possible to teach much of this material to motivated students who have a background
of only first-semester calculus. Of course, the more mathematics a student
knows the better, but a course can be taught with this very minimal prerequisite. En-
countered during the course will be some topics from sophomore courses including
iteration (discrete mathematics), matrices (linear algebra), the qualitative study of so-
lutions of differential equations, and algorithms employing pseudorandom numbers
(programming and statistics). But it can be argued that seeing interesting topics from
these courses as they arise naturally is the best possible motivation for then taking
the standard courses. Ideas from topology and real analysis, both on the line and in
abstract spaces, also come up naturally as the course proceeds.
The subjects of chaos and fractals have been part of the undergraduate mathematical
landscape ever since Devaney's first edition of his attractive book [3] in 1985. A
special issue of the College Mathematics Journal (Volume 22, No. 1, January 1991)
was devoted to this new topic and discusses how it might fit into the undergraduate
curriculum. Let me list particular aspects of the subject that I find particularly well
emphasized in the Gulick book.
1. There are fundamental simple, yet fecund, examples to explore and generalize,
e.g., the quadratic mapping in one variable, the Sierpiński gasket, Smale's Horseshoe
mapping in the plane, the Lorenz system of differential equations (which
provided the first example of chaos in a "real" situation), and the Mandelbrot set in the
complex plane. Most students and some professors do not appreciate how crucial
examples are for the development of a mathematical subject. Since most of
the mathematics that we teach is quite old, motivating examples are often treated
very briefly in the rush to get to theorems. The examples in chaos and fractals
are rich and somewhat complicated, and it is natural to linger over them. Thus
the subject is a good corrective to standard courses. Gulick does a good job analyzing
these examples, starting with easy mathematics but getting to some depth.
2. A variety of tools and theory are useful in exploring these examples, e.g., deriva-
tives and Jacobians, Lyapunov exponents, symbolic dynamics, conjugacy, bifur-
cation theory.
3. It is nontrivial to arrive at the best definitions on which to base the relevant theory,
e.g., strong and weak chaos for function iteration, the Hausdorff metric on
the space of compact sets in the plane, and various versions of dimension. Recall
the Bourbaki point of view that definitions in mathematics should be carefully
constructed (and perhaps difficult) in order to make the theorems easy.
4. There are surprising fundamental theorems, e.g., Sharkovsky's Theorem about
the occurrence of periods in one dimension based on a particular total ordering
of the natural numbers, the Stable and Unstable Manifold Theorem, the connect-
edness of the Mandelbrot set.
5. It is natural to explore the phenomena of both chaos and fractals using computa-
tional resources, e.g., to draw bifurcation diagrams, to approximate fractal sets.
Chaos is a wonderful subject for exploratory mathematics.
6. Finally, examples from a wide variety of applied areas are available to show the
relevance of this subject, e.g., chaotic pendula in mechanics or fractal coastlines
in geography.
While there are many excellent texts about these subjects at various levels, I have
not found a better book than Gulick's for a serious course at the honors freshman or
sophomore level. There are very elementary books that concentrate on intuitive under-
standing and visual images, and many others that require a greater depth of mathemat-
ical background. One requirement, which to me is important, is that the course (and
thus the text) should treat both discrete and continuous dynamical systems. I feel that
the richness and applications of the subject can only be seen by studying both types.
Excellent books that fail to satisfy this criterion (and which also are too advanced for
beginning students) are [1], [7], and [10]. A standard book on fractals and the algo-
rithms to produce them on a computer is [2]. Typically, for my course I use a main
text and then also a good expository book of broader scope. In the past I have selected
popular books by Gleick [4], Peterson [6], Ruelle [8], or Stewart [9]. Of these, the one
I've enjoyed using the most is [9]. I've also required each student to do an independent
project, which can be experimental, computational, or mathematical.
Next, I will briefly discuss the contents of Gulick's book. The topics are
particularly well chosen for the not particularly advanced but still seriously mathematical
undergraduate course that I envision. The book begins with two chapters on
discrete one-dimensional iteration, the first devoted to simple examples, fixed and periodic
points, and bifurcation, and the second focusing on chaotic behavior. Some of
the examples are explored in some detail; for example, the study of the logistic family
$Q_\mu(x) = \mu x(1 - x)$ requires ten pages. The two most common bifurcations, namely
the period-doubling bifurcation and the tangent bifurcation, are studied and explored in
examples and problems. The Li-Yorke Theorem, which asserts that if f is continuous
on a closed interval J and maps J into itself, then if f has a period-3 point it also has
points of all other periods, is proven in detail, while its generalization by Sharkovsky
is simply stated and discussed. By the way, the Li-Yorke result first appeared in this
MONTHLY in 1975 [5] and is one of the early papers that made the subject of chaos
popular. The tools needed in one dimension are the single-variable derivative and a
computational system to explore examples of iteration. Chapter 3 generalizes these
ideas to two dimensions using simple matrix theory and the Jacobian, and explores
two classic examples of chaotic behavior: the Hénon quadratic mapping and Smale's
Horseshoe.
Chapter 4 moves from the discrete setting to continuous dynamical systems, which
are defined in terms of first-order differential equations. It generalizes the basic concepts
to this setting and explores the pendulum system and the Lorenz system as two
examples. No experience in solving differential equations is necessary. The basic idea
of a differential equation defining a flow, together with some of the basic properties
of the flow, is developed. Continuous dynamical systems require more machinery and
sophistication to develop (which is mostly not done in this book). However, the most
important applications of chaos to reality lie in this realm. There are also some philosophical
points to make about the modeling process. For example, since chaos is a
mathematical construct, it can apply to a given mathematical model of reality but never
to physical reality itself. Thus no phenomenon can ever be chaotic in the mathematical
sense.
The last three chapters of the book concentrate on fractals. Chapter 6 introduces
the basic idea of a fractal and discusses self-similarity and various kinds of fractal dimension.
It also presents some basic examples, such as the Cantor set, the Sierpiński
gasket, and the Hénon attractor. Chapter 7 discusses Barnsley's Iterated Function Systems
using metric spaces and shows how they can be used to generate fractals on a
computer. This chapter includes several elegant and useful results, such as the completeness
of the collection of compact sets in the plane under the Hausdorff metric.
Finally, Chapter 8 studies fractals in the complex plane and introduces Julia sets and
the Mandelbrot set. The second edition of the book offers enhanced coverage of fractals
beyond what was presented in the first edition.
An appendix in the book presents MATLAB functions to allow the study of iteration
empirically and to generate on the computer the classical images associated with chaos
and fractals. For instance, there is MATLAB code to produce the bifurcation diagram
of the logistic mapping $Q_\mu(x)$ as $\mu$ varies over an interval, to draw the Hénon attractor,
and to display Julia sets and the Mandelbrot set. It is a good choice to offer these
experimental tools in a commonly available package, since the operation of the code
can be understood with minimal effort. However, minor errors in some of the code
will cause problems for beginning users of MATLAB. For example, in Program 4 to
produce a bifurcation diagram, I found four separate errors: The increments are 0.01,
not 0.001 as promised (and 0.001 is necessary to obtain a good picture); a closing
parenthesis is needed for the axis command; "hold on" should replace "holdon"; and
"m" should replace "n" as an argument for the function "Qm". The author intends to
correct the errors on an Errata page. It is important for inexperienced users to be able to
use the code, since the ability to experiment with examples is one of the most attractive
features of this area of mathematics.
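To make the bifurcation-diagram experiment concrete, here is a minimal sketch in Python rather than MATLAB. It is not the book's Program 4: the function names, the parameter range 2.5 to 4.0, the starting point 0.3, and the transient and sample counts are all assumptions chosen for illustration. It does use the parameter increment 0.001 that the review says is needed for a good picture.

```python
def logistic(mu, x):
    """One step of the logistic map Q_mu(x) = mu * x * (1 - x)."""
    return mu * x * (1.0 - x)

def attractor_samples(mu, x0=0.3, n_transient=500, n_keep=50):
    """Iterate from x0, discard the transient, and return the next n_keep values."""
    x = x0
    for _ in range(n_transient):
        x = logistic(mu, x)
    samples = []
    for _ in range(n_keep):
        x = logistic(mu, x)
        samples.append(x)
    return samples

def bifurcation_data(mu_min=2.5, mu_max=4.0, step=0.001):
    """Return a list of (mu, samples) pairs on an evenly spaced grid of mu values."""
    n_steps = round((mu_max - mu_min) / step)
    # Integer stepping avoids accumulating floating-point error in mu.
    return [(mu_min + i * step, attractor_samples(mu_min + i * step))
            for i in range(n_steps + 1)]

data = bifurcation_data()
print(len(data))  # number of parameter values sampled
```

Plotting every sample against its parameter value, one dot per pair, produces the diagram; with a step of 0.01 the same code yields only a tenth as many columns, which is why the coarser picture looks sparse.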
I also found a few small errors in the text and exercises and some imprecise state-
ments, such as Lemma 1 on page 227, which is stated for all increasing continuous
functions but applies only to Cantor-like ones. Also, there is a loose statement on
page 214 that asserts concepts pertinent to two-dimensional differential equations apply
equally well in dimension three. The Poincaré-Bendixson Theorem, which is used
in the book, is a rather stark counterexample to this assertion.
Despite these minor errors, I feel that this book is the best text available for a
midlevel undergraduate course on chaos and fractals. The choice of topics, readable
prose, and level of presentation make it a very attractive book.
REFERENCES
1. K. T. Alligood, T. D. Sauer, J. A. Yorke, Chaos: An Introduction to Dynamical Systems. Springer-Verlag,
New York, 1996.
2. M. F. Barnsley, Fractals Everywhere. Second edition, Academic Press, Boston, MA, 1993.
3. R. L. Devaney, An Introduction to Chaotic Dynamical Systems. Second edition, Addison-Wesley,
Reading, MA, 1989.
4. J. Gleick, Chaos: The Making of a New Science. Viking, New York, 1988.
5. T. Y. Li, J. A. Yorke, Period three implies chaos, Amer. Math. Monthly 82 (1975) 985–992.
6. I. Peterson, Newton's Clock: Chaos in the Solar System. W. H. Freeman, New York, 1995.
7. R. C. Robinson, An Introduction to Dynamical Systems. Prentice Hall, Englewood Cliffs, NJ, 2004.
8. D. Ruelle, Chance and Chaos. Princeton University Press, Princeton, NJ, 1993.
9. I. Stewart, Does God Play Dice? The New Mathematics of Chaos. Second edition, Blackwell, Malden,
MA, 2002.
10. S. H. Strogatz, Nonlinear Dynamics and Chaos With Applications to Physics, Biology, Chemistry, and
Engineering. Addison-Wesley, Reading, MA, 1994.
Ohio Wesleyan University, Delaware, OH 43015
jlnunema@owu.edu
Illustrated Special Relativity through Its Paradoxes:
A Fusion of Linear Algebra, Graphics, and Reality
By John dePillis and José Wudka
Spectrum Series
The text illustrates and resolves several apparent
paradoxes of Special Relativity including the twin
paradox and train-and-tunnel paradox. Assuming
a minimum of technical prerequisites the authors
introduce inertial frames and use them to explain
a variety of phenomena: the nature of simultaneity,
the proper way to add velocities, and why faster-
than-light travel is impossible. Most of these
explanations are contained in the resolution of
apparent paradoxes, including some lesser-known
ones: the pea-shooter paradox, the bug-and-rivet
paradox, and the accommodating universe paradox.
The explanation of time and length contraction is
especially clear and illuminating.
The roots of Einstein's work in Maxwell's lead the authors to devote several
chapters to an exposition of Maxwell's equations. The authors establish that
those equations predict a frame-independent speed for the propagation of
electromagnetic radiation, a speed that equals that of light. Several chapters are
devoted to experiments of Rømer, Fizeau, and de Sitter to measure the
speed of light and the Michelson-Morley experiment abolishing the aether.
Throughout, the exposition is thorough, but not overly technical, and often
illustrated by cartoons. The volume might be suitable for a one-semester general-
education introduction to Special Relativity. It is especially well-suited to self-study
by interested laypersons or use as a supplement to a more traditional text.
eISBN 978-1-61444-517-3
2013, 478 pp.
Catalog Code: ISR
PDF Price: $33.00
MATHEMATICAL ASSOCIATION OF AMERICA
New in the
MAA eBooks Store
Illustrations and animations by John dePillis
To order, visit www.maa.org/ebooks/ISR.
1529 Eighteenth St., NW Washington, DC 20036
Recently Released from the MAA
Distilling Ideas: An Introduction to Mathematical Thinking
By Brian P. Katz and Michael Starbird
MAA Textbooks
Mathematics is not a spectator sport: successful students of mathematics
grapple with ideas for themselves. Distilling Ideas presents a carefully
designed sequence of exercises and theorem statements that challenge
students to create proofs and concepts. As students meet these challenges,
they discover strategies of proofs and strategies of thinking beyond
mathematics. In other words, Distilling Ideas helps its users to develop the
skills, attitudes, and habits of mind of a mathematician and to enjoy the process of distilling and
exploring ideas.
Catalog Code: DIMT ISBN: 978-1-93951-203-1
List Price: $54.00 171 pp., Paperbound, 2013
MAA Member: $45.00
To order, visit maa-store.hostedbywebstore.com or call 800-331-1622.
THE AMERICAN MATHEMATICAL
MONTHLY
VOLUME 121, NO. 4 APRIL 2014
283 Periodicity Domains and the Transit of Venus
Andrew J. Simoson
299 A Drug-Induced Random Walk
Daniel J. Velleman
318 Analytical Solution for the Generalized Fermat–Torricelli Problem
Alexei Yu. Uteshev
332 On the Proof of the Existence of Undominated Strategies in Normal Form Games
Martin Kovár and Alena Chernikava
338 An Asymptotic Formula for $(1 + 1/x)^x$ Based on the Partition Function
Chao-Ping Chen and Junesang Choi
NOTES
344 Stirling's Approximation for Central Extended Binomial Coefficients
Steffen Eger
350 A New Proof of Stirling's Formula
Thorsten Neuschel
353 Zeta(2) Once Again
Ralph M. Krause
355 Polynomials $(x^3 - n)(x^2 + 3)$ Solvable Modulo Any Integer
Andrea M. Hyde, Paul D. Lee, and Blair K. Spearman
359 Macaulay Expansion
B. Sury
361 Evaluating Lebesgue Integrals Efficiently with the FTC
J. J. Koliha
365 PROBLEMS AND SOLUTIONS
REVIEWS
373 Encounters with Chaos and Fractals
By Denny Gulick
Jeffrey Nunemacher
MATHBITS
331, A One-Sentence Line-of-Sight Proof of the Extreme Value Theorem
An Official Publication of the Mathematical Association of America
Latest in the MAA Notes Series
Applications of Mathematics in Economics
Warren Page, Editor
Applications of Mathematics in Economics
presents an overview of the (qualitative and
graphical) methods and perspectives of
economists. Its objectives are not intended
to teach economics, but rather to give math-
ematicians a sense of what mathematics is
used at the undergraduate level in various
parts of economics, and to provide students
with the opportunities to apply their math-
ematics in relevant economics contexts.
The volume's applications span a broad range of mathematical topics and levels of
sophistication. Each article consists of self-contained, stand-alone, expository
sections whose problems illustrate what mathematics is used, and how, in
that subdiscipline of economics. The problems are intended to be richer
and more informative about economics than the economics exercises in
most mathematics texts. Since each section is self-contained, instructors
can readily use the economics background and worked-out solutions to
tailor (simplify or embellish) a section's problems to their students' needs.
Overall, the volume's 47 sections contain more than 100 multipart problems.
Thus, instructors have ample material to select for classroom uses,
homework assignments, and enrichment activities.
eISBN: 9781614443179
Print ISBN: 9780883851920
ebook: $24.00
Print on demand (paperbound): $40.00
To order go to www.maa.org/ebooks/NTE82
THE AMERICAN MATHEMATICAL
MONTHLY
Volume 121, No. 4 April 2014
EDITOR
Scott T. Chapman
Sam Houston State University
NOTES EDITOR BOOK REVIEW EDITOR
Sergei Tabachnikov Jeffrey Nunemacher
Pennsylvania State University Ohio Wesleyan University
PROBLEM SECTION EDITORS
Douglas B. West Gerald Edgar Doug Hensley
University of Illinois Ohio State University Texas A&M University
ASSOCIATE EDITORS
William Adkins
Louisiana State University
David Aldous
University of California, Berkeley
Elizabeth Allman
University of Alaska, Fairbanks
Jonathan M. Borwein
University of Newcastle
Jason Boynton
North Dakota State University
Edward B. Burger
Southwestern University
Minerva Cordero-Epperson
University of Texas, Arlington
Allan Donsig
University of Nebraska, Lincoln
Michael Dorff
Brigham Young University
Daniela Ferrero
Texas State University
Luis David Garcia-Puente
Sam Houston State University
Sidney Graham
Central Michigan University
Tara Holm
Cornell University
Roger A. Horn
University of Utah
Lea Jenkins
Clemson University
Daniel Krashen
University of Georgia
Ulrich Krause
Universität Bremen
Jeffrey Lawson
Western Carolina University
C. Dwight Lahr
Dartmouth College
Susan Loepp
Williams College
Irina Mitrea
Temple University
Bruce P. Palka
National Science Foundation
Vadim Ponomarenko
San Diego State University
Catherine A. Roberts
College of the Holy Cross
Rachel Roberts
Washington University, St. Louis
Ivelisse M. Rubio
Universidad de Puerto Rico, Rio Piedras
Adriana Salerno
Bates College
Edward Scheinerman
Johns Hopkins University
Anne Shepler
University of North Texas
Susan G. Staples
Texas Christian University
Dennis Stowe
Idaho State University
Daniel Ullman
George Washington University
Daniel Velleman
Amherst College
EDITORIAL ASSISTANT
Bonnie K. Ponce
NOTICE TO AUTHORS
The MONTHLY publishes articles, as well as notes and
other features, about mathematics and the profes-
sion. Its readers span a broad spectrum of math-
ematical interests, and include professional mathe-
maticians as well as students of mathematics at all
collegiate levels. Authors are invited to submit arti-
cles and notes that bring interesting mathematical
ideas to a wide audience of MONTHLY readers.
The MONTHLY's readers expect a high standard of exposition;
they expect articles to inform, stimulate,
challenge, enlighten, and even entertain. MONTHLY
articles are meant to be read, enjoyed, and discussed,
rather than just archived. Articles may be
expositions of old or new results, historical or biographical
essays, speculations or definitive treatments,
broad developments, or explorations of a
single application. Novelty and generality are far
less important than clarity of exposition and broad
appeal. Appropriate figures, diagrams, and photographs
are encouraged.
Notes are short, sharply focused, and possibly infor-
mal. They are often gems that provide a new proof
of an old theorem, a novel presentation of a familiar
theme, or a lively discussion of a single issue.
Submission of articles, notes, and filler pieces is required
via the MONTHLY's Editorial Manager System.
Initial submissions in pdf or LaTeX form can be sent
to the Editor Scott Chapman at
http://www.editorialmanager.com/monthly
The Editorial Manager System will cue the author
for all required information concerning the paper.
Questions concerning submission of papers can
be addressed to the Editor at monthly@shsu.edu.
Authors who use LaTeX can find our article/note template
at http://www.shsu.edu/~bks006/Monthly.html.
This template requires the style file maa-monthly.sty,
which can also be downloaded from the
same webpage. A formatting document for MONTHLY
references can be found at
http://www.shsu.edu/~bks006/FormattingReferences.pdf.
Follow the link to Electronic Publications Information for
authors at http://www.maa.org/pubs/monthly.html
for information about figures and files, as well
as general editorial guidelines.
Letters to the Editor on any topic are invited.
Comments, criticisms, and suggestions for mak-
ing the MONTHLY more lively, entertaining, and
informative can be forwarded to the Editor at
monthly@shsu.edu.
The online MONTHLY archive at www.jstor.org is a
valuable resource for both authors and readers; it
may be searched online in a variety of ways for any
specified keyword(s). MAA members whose institutions
do not provide JSTOR access may obtain indi-
vidual access for a modest annual fee; call 800-331-
1622.
See the MONTHLY section of MAA Online for current
information such as contents of issues and descrip-
tive summaries of forthcoming articles:
http://www.maa.org/
Proposed problems or solutions should be sent to:
DOUG HENSLEY, MONTHLY Problems
Department of Mathematics
Texas A&M University
3368 TAMU
College Station, TX 77843-3368.
In lieu of duplicate hardcopy, authors may submit
pdfs to monthlyproblems@math.tamu.edu.
Advertising correspondence should be sent to:
MAA Advertising
1529 Eighteenth St. NW
Washington DC 20036.
Phone: (877) 622-2373,
E-mail: tmarmor@maa.org.
Further advertising information can be found online
at www.maa.org.
Change of address, missing issue inquiries, and
other subscription correspondence can be sent to:
MAA Service Center, maahq@maa.org.
All of these are at the address:
The Mathematical Association of America
1529 Eighteenth Street, N.W.
Washington, DC 20036.
Recent copies of the MONTHLY are available for pur-
chase through the MAA Service Center:
maahq@maa.org, 1-800-331-1622.
Microfilm Editions are available at: University Microfilms
International, Serial Bid coordinator, 300 North
Zeeb Road, Ann Arbor, MI 48106.
The AMERICAN MATHEMATICAL MONTHLY (ISSN
0002-9890) is published monthly except bimonthly
June-July and August-September by the Mathe-
matical Association of America at 1529 Eighteenth
Street, N.W., Washington, DC 20036 and Lancaster,
PA, and copyrighted by the Mathematical Asso-
ciation of America (Incorporated), 2014, including
rights to this journal issue as a whole and, except
where otherwise noted, rights to each individual
contribution. Permission to make copies of individ-
ual articles, in paper or electronic form, including
posting on personal and class web pages, for educational
and scientific use is granted without fee
provided that copies are not made or distributed for
profit or commercial advantage and that copies bear
the following copyright notice: [Copyright the Mathematical
Association of America 2014. All rights reserved.]
Abstracting, with credit, is permitted. To
copy otherwise, or to republish, requires specific
permission of the MAA's Director of Publications and
possibly a fee. Periodicals postage paid at Washington,
DC, and additional mailing offices. Postmaster:
Send address changes to the American Mathemati-
cal Monthly, Membership/Subscription Department,
MAA, 1529 Eighteenth Street, N.W., Washington, DC,
20036-1385.
