to the xy-plane.
We call the line between these orbital planes the nexus line or, according to Meeus [5],
284 c THE MATHEMATICAL ASSOCIATION OF AMERICA [Monthly 121
This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:28:30 PM
All use subject to JSTOR Terms and Conditions
the line of nodes. The nexus line in Figure 3 is labeled BC. A nexus point or node (for Venus, F and G in the figure; for Earth, B and C in the figure) is where the orbit of V or E pierces the orbital plane of E or V, respectively. Transits will only occur when E and V are both near B and F, respectively, or both near C and G. The former transit is called a fall transit because in modern times E is at B in early December; it is also called, according to Meeus, an ascending transit, because as V's profile moves across S from left to right its trajectory rises. The latter transit is called a spring transit because E is at C in early June; it is also called a descending transit, because the corresponding trajectory descends. The positions of E and V at any time t are given respectively by E(t) and V(t):
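\[
E(t) = \begin{pmatrix} \cos(2\pi t) \\ \sin(2\pi t) \\ 0 \end{pmatrix}
\quad\text{and}\quad
V(t) = \rho \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\epsilon & \sin\epsilon \\ 0 & -\sin\epsilon & \cos\epsilon \end{pmatrix}
\begin{pmatrix} \cos(2\pi\omega t) \\ \sin(2\pi\omega t) \\ 0 \end{pmatrix}, \tag{1}
\]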
where $\omega$ is the relative angular velocity of V with respect to E. For simplicity, we initially position V and E at their spring nexus points. The value of $\omega$ for the actual V and E is $\omega_0 = \tau_e/\tau_v \approx 1.62555$. The $3 \times 3$ matrix in (1) corresponds with a clockwise rotation by $\epsilon$ about the x-axis, so as to be consistent with a descending (spring) transit occurring near nodes (nexus points) C and G, where C = (1, 0, 0).
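The two position functions in (1) can be sketched in a few lines of Python. This is only an illustration: the numerical values chosen for the orbital radius and the inclination of V (`RHO`, `EPS`) are our own assumptions and are not stated in this passage, while the ratio 365.26/224.70 is the one used in (15).

```python
import math

# Assumed illustrative constants (not stated in this passage).
RHO = 0.7233                 # radius of V's orbit in AU
EPS = math.radians(3.39)     # inclination of V's orbital plane
OMEGA_0 = 365.26 / 224.70    # ratio of the orbital periods, about 1.62555

def earth(t):
    """E(t): a unit circle in the xy-plane, with t in years."""
    a = 2 * math.pi * t
    return (math.cos(a), math.sin(a), 0.0)

def venus(t, omega=OMEGA_0, rho=RHO, eps=EPS):
    """V(t): a circle of radius rho, rotated clockwise by eps about the x-axis."""
    a = 2 * math.pi * omega * t
    x, y = math.cos(a), math.sin(a)
    # Apply the rotation matrix of (1) to (x, y, 0), then scale by rho.
    return (rho * x, rho * math.cos(eps) * y, -rho * math.sin(eps) * y)
```

With these conventions both planets start at their spring nexus points on the x-axis, and V dips below the xy-plane just after t = 0, as a descending transit requires.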
A line parametrized by u from E through V at time t is
\[
P(u, t) = \bigl(V(t) - E(t)\bigr)u + E(t). \tag{2}
\]
To find the projection of V's shadow on S as viewed from E(t), an ideal geocentric point in space at E's center, we imagine that S resides within a rotating plane or screen S(t) ever perpendicular to E(t). Figure 3 shows the two orbital planes and V's projection on the screen as viewed from E. The plane S(t) of S can be written as
\[
X \cdot E(t) = 0, \tag{3}
\]
where X is a general point (x, y, z) on the screen. When E and V are on opposite sides of the screen at time t, which happens if and only if $E(t) \cdot V(t) < 0$, we take the projection point of V onto the screen as that screen point between the planets.
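As a minimal sketch of how (2) and (3) combine, substituting the line into the plane equation gives a single linear equation for u, which the following Python function solves directly. The sample positions at the bottom are hypothetical and serve only to exercise the function.

```python
def dot(a, b):
    """Euclidean dot product of two 3-vectors."""
    return sum(p * q for p, q in zip(a, b))

def project_onto_screen(e, v):
    """Return (X, u): the point where the line from e through v meets the
    plane through the origin perpendicular to e, and its parameter u."""
    # Substituting P(u) = (v - e)u + e into X . e = 0 gives
    # (v.e - e.e) u + e.e = 0.
    ee = dot(e, e)
    u = ee / (ee - dot(v, e))
    x = tuple((vi - ei) * u + ei for vi, ei in zip(v, e))
    return x, u

# Hypothetical sample positions: e on the unit circle, v nearby and inside.
e = (1.0, 0.0, 0.0)
v = (0.72, 0.01, -0.002)
X, u = project_onto_screen(e, v)
```

Solving for u this way avoids inverting a matrix; it is an equivalent route to the same intersection point, not necessarily the computation used to draw the figures.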
Figure 3. The screen of the Sun (labels: E's orbit, V's orbit, E(t), V(t), the screen, the axis between the planetary planes, V's shadow, the Sun, the nexus points B and C for E, and the nexus points F and G for V)
April 2014] PERIODICITY DOMAINS AND THE TRANSIT OF VENUS 285
We combine (2) and (3) so as to find the point X(t) where the line intersects the plane. That is, the equations P(u, t) = X and (3) are the following system of four equations with four unknowns x, y, z, u, as well as the time variable t:
\[
\begin{cases}
x = \bigl(\rho\cos(2\pi\omega t) - \cos(2\pi t)\bigr)u + \cos(2\pi t) \\
y = \bigl(\rho\cos\epsilon\,\sin(2\pi\omega t) - \sin(2\pi t)\bigr)u + \sin(2\pi t) \\
z = -\rho\sin\epsilon\,\sin(2\pi\omega t)\,u \\
0 = x\cos(2\pi t) + y\sin(2\pi t).
\end{cases} \tag{4}
\]
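Writing (4) as a matrix equation gives $A\,\widetilde{X}(t) = \widetilde{E}(t)$, where
\[
A = \begin{pmatrix}
1 & 0 & 0 & \cos(2\pi t) - \rho\cos(2\pi\omega t) \\
0 & 1 & 0 & \sin(2\pi t) - \rho\cos\epsilon\,\sin(2\pi\omega t) \\
0 & 0 & 1 & \rho\sin\epsilon\,\sin(2\pi\omega t) \\
\cos(2\pi t) & \sin(2\pi t) & 0 & 0
\end{pmatrix} \tag{5}
\]
with $\widetilde{X}(t)$ and $\widetilde{E}(t)$ being the respective vectors (x, y, z, u) and $(\cos(2\pi t), \sin(2\pi t), 0, 0)$. For this transformation,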
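\[
\begin{aligned}
\det(A) &= -1 + \rho\bigl(\cos(2\pi t)\cos(2\pi\omega t) + \cos\epsilon\,\sin(2\pi t)\sin(2\pi\omega t)\bigr) \\
&= -1 + \frac{\rho}{2}\bigl((1 + \cos\epsilon)\cos(2\pi(\omega - 1)t) + (1 - \cos\epsilon)\cos(2\pi(\omega + 1)t)\bigr) \\
&\le -1 + \frac{\rho}{2}\bigl(|1 + \cos\epsilon| + |1 - \cos\epsilon|\bigr) = -1 + \rho < 0.
\end{aligned}
\]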
Because the determinant of A is never zero,
\[
\widetilde{X}(t) = A^{-1}\widetilde{E}(t). \tag{6}
\]
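The determinant computation can be checked numerically. The sketch below builds the matrix of (5), expands its determinant by cofactors, and compares the result with the product-to-sum form and with the bound; the sample parameter values passed in the usage line are our own assumptions.

```python
import math

def build_a(t, omega, rho, eps):
    """The 4x4 matrix A of (5) at time t."""
    a, b = 2 * math.pi * t, 2 * math.pi * omega * t
    return [
        [1, 0, 0, math.cos(a) - rho * math.cos(b)],
        [0, 1, 0, math.sin(a) - rho * math.cos(eps) * math.sin(b)],
        [0, 0, 1, rho * math.sin(eps) * math.sin(b)],
        [math.cos(a), math.sin(a), 0, 0],
    ]

def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def det_closed_form(t, omega, rho, eps):
    """The second line of the determinant computation."""
    a, b = 2 * math.pi * t, 2 * math.pi * omega * t
    return -1 + rho / 2 * ((1 + math.cos(eps)) * math.cos(b - a)
                           + (1 - math.cos(eps)) * math.cos(b + a))

# Assumed sample values: omega = 1.62555, rho = 0.7233, eps = 3.39 degrees.
sample = det(build_a(0.13, 1.62555, 0.7233, math.radians(3.39)))
```

Since the closed form never exceeds $-1 + \rho$, the matrix is invertible whenever $\rho < 1$, that is, whenever V orbits inside E.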
Figure 4. Trajectories of V's shadow on the screen of S (distances in AU): (a) a wide screen; (b) zooming in near the Sun, showing the arcs $T_{113.5}$, $T_{117.5}$, and $T_{121.5}$
Figure 4(a) shows the path of V's projection on the screen over 1.5 years. In our model, spring transits only occur near integer years, n, and fall transits only occur near half-years, $n + \frac{1}{2}$. Figure 4(b) is a close-up of the screen near S over a period of about ten years, displaying three arcs of V's projection. The arc labeled $T_{113.5}$ corresponds with a fall transit near t = 113.5 years. The arc $T_{117.5}$ corresponds with V and E being on opposite sides of S near t = 117.5; as such, we display the disk of S in front of this arc. The arc $T_{121.5}$ misses the disk of S.
4. CONDITIONS FOR A TRANSIT TO OCCUR. In order to find how far from its nexus V may wander and yet be part of a transit across S, we project the disk of S through V out to E's orbit, forming a cone as illustrated in Figure 5(a), which displays the situation where the base of the truncated cone is tangent to E's orbit.
Figure 5. (Two-panel diagram; labels: E's orbit, V's orbit, the plane of V's orbit, the disk of the Sun, the base of the truncated cone, the points B, C, D, S, V, and the height h)
k
1
k(1
s(1 )
0.0301, (9)
since the arguments of the inverse tangent and sine are so small. Thus, in order to be
part of a transit, V may wander no further than about 0.0218 AU from the nexus.
By (9), the lapse of time $L_v$ for V to travel this far from its nexus is
L
v
s(1 )
2
0
W
_
n +
1
2
_
It is unclear how $\widehat{T}$ is related to $\omega_0$. Since the time lapse between twin transits is 8 years, it seems likely that $\widehat{T}$ should somehow be related to 8, but how?
In the next section, we find a natural period and demonstrate that the practical and natural periods are related.

5. RECOGNIZING THE PATTERN. To find a more natural transit period, we focus on spring transits for a season; from Table 1, we drop the fall transit dates, and are left with Table 2. When we refer to the spring transit year $n_j$ from the table, where $j \ge 0$, we mean term-j in row 2 or the dominant transit year if the term is a twin. For example, $n_2 = 462$, as evidenced by Figure 7. Observe that the first eight spring transits comprise a complete residue set modulo 8. Furthermore, $n_j \bmod 8$ just happens to be $3j \bmod 8$, which suggests that the relative motion of the planets induces a linear shuffling of the transit year residues modulo 8. We thus refer to 3 as a shuffling factor.

Table 2. Spring transits
j                   0    1    2            3    4    5              6     7     8
transit year n_j    0    227  (454, 462)   689  916  (1143, 1151)   1378  1605  1832
n_j mod 8           0    3    6            1    4    7              2     5     0
3j mod 8            0    3    6            1    4    7              2     5     0
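The residue pattern of Table 2 is easy to confirm by machine. In the short check below, each twin is represented by its later member; that choice is our assumption for the second twin, but it does not affect the residues, because the two members of a twin differ by 8.

```python
# Spring transit years n_j from Table 2, one representative per twin.
years = [0, 227, 462, 689, 916, 1151, 1378, 1605, 1832]

residues = [n % 8 for n in years]                       # third row of the table
shuffled = [(3 * j) % 8 for j in range(len(years))]     # fourth row of the table

# The first eight residues should exhaust 0..7 exactly once.
is_complete_residue_set = sorted(residues[:8]) == list(range(8))
```

Both rows agree entry by entry, and the first eight years do form a complete residue set modulo 8.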
To help understand this 8-fold dynamic, observe that every eight years both E and V pass each other not far from where they had passed each other eight years before, with V a bit further ahead of E each time. We say that the arc given by W(n years $\pm$ 1 week) is rung-n in a ladder of arcs. As the years go by, these rungs step monotonically upward (or downward) to a climax before reversing their progression, with rung-8n being more or less either above or below rung-8(n + 1) for all integers n. Near the spring transit years, neighboring rungs are separated by a distance somewhat more than the radius of S, as illustrated in Figures 4(b), 7, and 8; the dots in Figure 8 represent V's projection at t = −16, −8, 0, 8, 16 years. With p = 8, the approximate distance d(p) between neighboring rungs near transit years is the distance between W(p) and its projection onto $W(0^+)$, where we take $0^+$ as one hour:
\[
d(p) = \left\| W(p) - \frac{W(p) \cdot W(0^+)}{W(0^+) \cdot W(0^+)}\,W(0^+) \right\|
\]
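and $\delta = \frac{T}{8}$, for which T is near 1834 and where $y_j$ passes through all points on branch-j of $D(\omega_0)$. Observe that the values of $\sin(2\pi(\omega)8n)$ and $\sin(2\pi(\omega - \frac{m}{8})8n)$ agree for all integers m. In particular, for the integer m for which $\frac{m}{8}$ is nearest $\omega$, namely, m = 13, we see that defining $\alpha$ and T so that
\[
\frac{1}{T} = \frac{\alpha}{2\pi} = \omega - \frac{13}{8} \approx \frac{365.26}{224.70} - \frac{13}{8} \approx 0.000545171 \tag{15}
\]
indeed gives the natural period of $D(\omega_0)$ as
\[
T = \frac{2\pi}{\alpha} \approx \frac{2\pi}{0.00342541} \approx 1834.29 \text{ years}, \tag{16}
\]
which means that
\[
\delta = \frac{T}{p} = \frac{T}{8} \approx \frac{1834.29}{8} \approx 229.286 \text{ years}. \tag{17}
\]
When we divide the practical period $\widehat{T} = 1605$ years by 7,
\[
\frac{\widehat{T}}{7} \approx 229.286 \approx \delta.
\]
That is,
\[
T \approx \frac{8}{7}\,\widehat{T}.
\]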
Hence, the practical period of 1605 just happens to be a lucky seven integer multiple
of the phase shift in the branches of the natural period.
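The arithmetic of (15) through (17) can be reproduced directly from the two orbital periods. In this sketch the variable names (`alpha` for the angular frequency $2\pi/T$, `delta` for the phase shift T/8 between branches) are our own labels.

```python
import math

omega_0 = 365.26 / 224.70          # ratio of the orbital periods, as in (15)
inv_T = omega_0 - 13 / 8           # (15): about 0.000545171
T = 1 / inv_T                      # (16): natural period, about 1834.29 years
alpha = 2 * math.pi * inv_T        # angular frequency, about 0.00342541
delta = T / 8                      # (17): phase shift, about 229.286 years

practical_period = 1605            # years, from the transit table
seventh = practical_period / 7     # also about 229.286 years
```

With these two period values the agreement is in fact exact up to rounding, since $224.70/(365.26 - 1.625 \cdot 224.70) = 224.70/0.1225 = 12840/7$, so that $T/8 = 1605/7$.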
To verify the fourth row of Table 2, that 3 is the shuffling factor, observe by Figure 9(b) that $n_j$ is the first component in that point belonging to branch-j of $D(\omega)$, which is nearest the first nonnegative root of $y = \sin(\alpha(t - j\delta))$, with $0 \le j \le 7$. Hence, for a given j, we wish to find the residue $r_j$ of $n_j$ modulo p, where $0 \le j \le p - 1$ and $0 \le r_j \le p - 1$, so that
\[
\sin(2\pi\omega t) = \sin(\alpha(t - j\delta)) \tag{18}
\]
for all times $t = pn + r_j$, for all integers n, with p = 8. By the pigeonhole principle, since there are eight branches and eight primitive residues, $r_j$ is unique for each j. Furthermore, by the affine nature of the arguments of sine in (18), it is sufficient to show that (18) has a solution for j = 1, which means that we must solve
\[
\sin\bigl(2\pi\omega(pn + r)\bigr) = \sin\bigl(\alpha(pn + r - \delta)\bigr) \tag{19}
\]
for r, where $r = r_1$ and p = 8. By (15) through (17), (19) becomes
\[
\sin\Bigl(\alpha(8n + r) + 26\pi n + \frac{(13r)(2\pi)}{8}\Bigr) = \sin\Bigl(\alpha(8n + r) - \frac{2\pi}{8}\Bigr).
\]
Therefore, solving
\[
13r \equiv -1 \pmod 8 \tag{20}
\]
gives the unique solution r = 3 for (19).
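For small moduli the congruence can simply be solved by exhaustion. The helper below, whose name is our own, searches for the residue r with $qr \equiv -1 \pmod p$; the minus sign is forced by r = 3, since $13 \cdot 3 = 39 \equiv 7 \equiv -1 \pmod 8$.

```python
def shuffling_factor(q, p):
    """Smallest r in 0..p-1 with q*r congruent to -1 modulo p, or None."""
    for r in range(p):
        if (q * r + 1) % p == 0:
            return r
    return None

r_venus = shuffling_factor(13, 8)   # the case q/p = 13/8 treated above
```

The same helper applied to the fraction 14/9 returns 7, the shuffling factor quoted for periodicity 9.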
Furthermore, generalizing the above argument demonstrates that the shuffling factor r in (19) remains at r = 3 for all $\omega \ne \frac{13}{8}$ for which
\[
\left|\omega - \frac{13}{8}\right| < \frac{1}{32} = \frac{1}{4p},
\]
a range of angular velocities called the periodicity domain of $\frac{13}{8}$. By an interval punctured by x, we mean a disconnected set of real numbers J whose union with {x} is an interval. Thus, the periodicity domain of $\frac{13}{8}$ is an interval punctured by $\frac{13}{8}$. The reason for excluding $\frac{13}{8}$ from its periodicity domain is that its corresponding $\alpha$ and T would be 0 and $\infty$, respectively.
To account for arbitrary relative positions of E and V in their orbits about S, we imagine that at time t = 0, V is $\phi$ years ahead of its last rendezvous with its spring nexus, while E is at its spring nexus. Each of the branches characterizing V's projection undergoes a phase shift $\psi$, where $\sin(2\pi\omega(8n + \phi))$ must equal $\sin(\alpha(8n + \psi))$; by (15), one way for this to occur is when $(2\pi)(\frac{13}{8})\phi = \alpha\psi$, which means that
\[
\psi = \frac{qT}{p}\,\phi = \frac{13T}{8}\,\phi,
\]
where p = 8 and q = 13. Therefore, we have an algorithm for characterizing all spring singleton transits and all dominant members of spring twin transits, where $\phi$ is an orbital phase angle shift between V and E, p = 8 is the apparent periodicity of $D(\omega)$, r = 3 is the shuffling factor among the year residues modulo p as given by (20), and $\frac{q}{p}$ is the rational number close to $\omega$ as given by (15).
The Transit Rule. Let k, n, and j be integers, $0 \le j < p$. A spring transit occurs at integer year m near time $(k - \phi\frac{q}{p})T + j\delta$ if and only if $m = pn + (jr \bmod p)$ and m is no further from $(k - \phi\frac{q}{p})T + j\delta$ than either m − p or m + p. If either m − p or m + p is a transit year as well, then m is the dominant member of the twin.
To ascertain whether $m \pm p$ is also a spring transit, simply utilize the decision rule (13).
Example 1. To illustrate the transit rule, let $\phi = 0$, k = 3, and j = 5. Since 3j mod 8 = 7, we want to find the transit year m = 8n + 7 closest to $kT + j\delta \approx 6649.3$. Then m = 8(830) + 7 = 6647, while m + 8 = 8(831) + 7 = 6655. That is, year 6647 is a singleton transit, while year 6655 is a near-miss, as shown in Figure 11(a).

Figure 11. Checking the transit algorithm: (a) spring transit near $3T + 5\delta$, with arcs $T_{6647}$ and $T_{6655}$; (b) spring transit near $(2 - \frac{13}{80})T + 6\delta$, with arcs $T_{4746}$ and $T_{4754}$

Example 2. This time, let $\phi = 0.1$, k = 2, and j = 6. Since 3j mod 8 = 2, we want to find the transit year m = 8n + 2 closest to $(2 - 0.1(\frac{13}{8}))T + 6\delta \approx 4746.2$. Then m = 8(593) + 2 = 4746, while m + 8 = 8(594) + 2 = 4754. That is, year 4746 is a singleton transit, while year 4754 is far from being a transit, as shown in Figure 11(b).
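The bookkeeping in the two examples can be sketched as a small function. It only selects the integer year with the required residue that lies nearest the target time; whether a neighboring year is also a transit still depends on the decision rule (13), which is not reproduced here. The function and parameter names are our own.

```python
def candidate_year(k, j, phase, T, p=8, q=13, r=3):
    """Return (m, target): the year m congruent to j*r modulo p that is
    nearest the target time (k - phase*q/p)*T + j*T/p."""
    target = (k - phase * q / p) * T + j * T / p
    residue = (j * r) % p
    n = round((target - residue) / p)
    return p * n + residue, target

T = 1 / (365.26 / 224.70 - 13 / 8)               # natural period from (15), (16)
year_1, target_1 = candidate_year(3, 5, 0.0, T)  # Example 1
year_2, target_2 = candidate_year(2, 6, 0.1, T)  # Example 2
```

Because consecutive candidates are p years apart, rounding $(\text{target} - \text{residue})/p$ to the nearest integer picks the candidate that is no further from the target than either of its neighbors.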
As for fall transits, a similar rule applies, except that the eight branches through the data corresponding to time $n + \frac{1}{2}$ are
\[
y_j = \sin\Bigl(\alpha\Bigl(t - \Bigl(j + \frac{1}{2}\Bigr)\delta + \psi\Bigr)\Bigr).
\]
6. VARYING VENUS'S ANGULAR VELOCITY. The key behind the transit rule is recognizing that $D(\omega_0)$ consists of eight components or branches. Thus we say that the periodicity of $D(\omega)$ is the integer p if $D(\omega)$ appears to fall into p branches. To formalize what is meant by appears, for each positive integer $\nu$, we define $N(\nu)$ as the maximal integer n for which $\{\sin(2\pi\omega\nu j)\}_{j=0}^{n}$ is monotonic. Intuitively, $N(\nu)$ counts the number of rungs from a transit to a climax. We further define the periodicity quotient $Q(\omega, \nu)$ as
\[
Q(\omega, \nu) = \left\lfloor \frac{N(\nu)}{\nu} \right\rfloor,
\]
which gives a measure of normalization among the values of $N(\nu)$. We say that the apparent periodicity of $D(\omega)$ is p, if $Q(\omega, p)$ appears to approach the maximum of $\{Q(\omega, \nu) \mid \nu \in \mathbb{Z}^+\}$.
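The rung count and the periodicity quotient are straightforward to compute. The sketch below makes two assumptions of ours: "monotonic" is read as non-strictly increasing or decreasing, and the quotient is taken as the integer part of $N/\nu$. The cap `max_n` only guards against sequences that never turn.

```python
import math

def rung_count(omega, nu, max_n=100_000):
    """N(nu): the largest n (capped at max_n) for which the terms
    sin(2*pi*omega*nu*j), j = 0..n, form a monotonic sequence."""
    prev = 0.0        # the j = 0 term
    direction = 0     # +1 rising, -1 falling, 0 not yet determined
    for j in range(1, max_n + 1):
        cur = math.sin(2 * math.pi * omega * nu * j)
        step = (cur > prev) - (cur < prev)
        if direction == 0:
            direction = step
        elif step == -direction:
            return j - 1          # term j breaks monotonicity
        prev = cur
    return max_n

def periodicity_quotient(omega, nu):
    """Q(omega, nu), read here as the integer part of N(nu)/nu."""
    return rung_count(omega, nu) // nu

q8 = periodicity_quotient(365.26 / 224.70, 8)
```

Under these assumptions the quotient for Venus at 8 comes out as 7, and the quotient for 35/52 at 3 comes out as 4, in line with the values quoted in the text.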
Table 3. The periodicity of $D(\omega_0)$ appears to be 8.
ν    Q(ω₀, ν)     ν    Q(ω₀, ν)     ν    Q(ω₀, ν)
1    1            11   0            21   0
2    0            12   0            22   0
3    1            13   0            23   0
4    0            14   0            24   0
5    0            15   0            25   0
6    0            16   1            26   0
7    0            17   0            27   0
8    7            18   0            28   0
9    0            19   0            29   0
10   0            20   0            30   0
The first few values of $Q(\omega_0, \nu)$ are given in Table 3, with the nonzero periodicity quotients in boldface. When extending this table indefinitely as far as a typical CAS allows, it appears as if $Q(\omega_0, \nu) = 0$ for all $\nu > 16$. From such evidence, and since the maximum quotient among this range is 7 and corresponds to $\nu = 8$, we conclude that $D(\omega_0)$ has apparent periodicity 8.
Let $\omega$ be a number between 0 and 0.5. Observe that $Q(\omega, \nu) = Q(n + \omega, \nu)$ for all integers n. Because sine is an odd function, $Q(\omega, \nu) = Q(1 - \omega, \nu)$. Therefore, the only values for which we need to evaluate $Q(\omega, \nu)$ are those in the range $0 \le \omega \le \frac{1}{2}$, or, equivalently, the range $1.5 \le \omega \le 2$, the reference interval containing $\omega_0$. Armed with the use of the measure Q we ask, how far may we perturb $\omega$ from $\omega_0$ and yet have apparent periodicity remain invariant?
Figure 15. Apparent periodicities 7 and 9
The next example is an application of the transit rule corresponding to an apparent periodicity other than 8.
Example 3. Let $\omega = \frac{11\sqrt{2}}{10} \approx 1.55563$. The plot of $D(\omega)$, Figure 15(b), shows that its apparent periodicity is p = 9. Since $\frac{14}{9}$ is that fraction of integers with denominator 9 nearest $\omega$, the analog of (15) is
\[
\omega - \frac{14}{9} = \frac{\alpha}{2\pi} = \frac{1}{T},
\]
which gives $T \approx 12{,}600.3$ years. Solving (19) gives the shuffling factor r = 7 rather than 3. Now let $\phi = 0$, k = 0, and j = 5, which means that we are looking for a transit year with residue jr mod 9 = 8 near time $5\delta = 5T/9 \approx 7000.17$. Thus, m = (777)(9) + 8 = 7001 is a transit year. With this new value of $\omega$, V has receded from S, so the distance d(9) between the rungs has changed to $d(9) \approx 0.0014$ by (14), which means that we have more than twin transits; in fact we have septuplets, as shown in Figure 16(a).

Figure 16. Transits with $\omega$ other than $\omega_0$: (a) a transit family of septuplets, $\omega = \frac{11\sqrt{2}}{10}$, with arcs $T_{6974}$, $T_{6983}$, $T_{6992}$, $T_{7001}$, $T_{7010}$, $T_{7019}$, and $T_{7028}$; (b) hunting for a phase angle, comparing the actual June 2012 transit path, which meets the boundary of S at Y and Z, with the linear model approximation of the June 2012 transit
7. A REALITY CHECK. How does our model contrast with reality?
A phenomenon omitted thus far from our transit model is the tendency of objects to rotate, including the orbital planes of V and E, a feature called precession. The values $\tau_e$ and $\tau_v$ used to define $\omega_0$ are the periods of the two planets with respect to the background of the fixed stars. To adapt our model appropriately, we must incorporate slightly different periods, namely, the time it takes for a planet to return to its aphelion. Since E precesses faster than V, as time goes on the nexus line rotates and hence spring and fall transits occur later in the year. Because precession rates are tiny compared to $\omega_0$, we arbitrarily take $\omega_0 \approx 1.625550000$. Meeus [5, p. 13] predicts that an almost exactly central transit, that is, a transit through S's center, will take place on 11 July 5900. Thus from 2012 to 5900, the spring transit has now become a summer transit, having slipped forward by about 35 days during a lapse of 3888 years, which means that the change in the relative orbital speeds of V and E with respect to the nexus line is about
\[
\frac{35\,\omega_0}{3888\,\tau_e} \approx 0.0000397559,
\]
which means that we might try the new angular velocity $\omega_1 = \omega_0 - 0.0000397559 \approx 1.625510244$.
Next, we need a phase shift to start our model. From [5, p. 48], the transit of 6 June 2012 crossed S's boundary at $Y \approx 39.45^\circ$ and at $Z \approx 291.4^\circ$ measured counterclockwise from the top of S, shown as a dotted line in Figure 16(b). Adjusting (1) and (5) so that the trigonometric arguments $2\pi\omega t$ are replaced by $2\pi\omega(t + \phi)$, where $\phi$ is an indeterminate phase shift, and using a search method to find $\phi$ by dynamically plotting W(t − 2012) near t = 2012, yields the solid-line transit in Figure 16(b), suggesting that $\phi \approx 0.00102$ is a good match. The reason that the two transit lines are non-parallel is that E's and V's actual orbits have positive eccentricity. When we apply (13) in this adjusted model for the years from 700 to 3000 AD, we find the promising spring transit Gregorian year possibilities of Table 4. The underlined years indicate a match between our results and Meeus's. Not bad for a linear model. But can we do better?
Table 4. The linear model versus Meeus's model
This linear model:   (781, 789)    (1024, 1032)  1275          1518          (1761, 1769)
                     (2004, 2012)  2255          2498          2741          (2984, 2992)
Meeus's model:       (789, 797)    (1032, 1040)  (1275, 1283)  (1518, 1526)  (1761, 1769)
                     (2004, 2012)  (2247, 2255)  (2490, 2498)  (2733, 2741)  (2976, 2984)
To do so, we work backward through the transit rule and find a magic angular velocity. Since $\omega_1$ is within the periodicity domain of $\frac{13}{8}$, the corresponding shuffling factor is r = 3. We make use of a second unusual spring transit year, 183 BC, whose corresponding transit Meeus describes as almost central. The difference between 5900 AD and 183 BC is 6083 years. Identify t = 0 with year 5900. Thus, year 183 BC is referenced by t = −6083 = 8(−761) + 5, which means that $5 \equiv 3j \bmod 8$, whose solution is j = 7. Using the angular velocity $\omega_1$ with (15), the associated period is $T_1 \approx 1959.85$. We then solve $kT_1 + \frac{7T_1}{8} = -6083$, getting $k \approx -3.98$. Next, reset k as k = −4, and solve $(k + \frac{7}{8})T_2 = -6083$, getting $T_2 = \frac{48664}{25}$. By (15),
\[
\omega_2 = \frac{1}{T_2} + \frac{13}{8} = \frac{25}{48664} + \frac{13}{8} = \frac{9888}{6083} \approx 1.6255137267795495644.
\]
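The exact value of this angular velocity can be confirmed with rational arithmetic. The sketch below follows the steps just described, reading the time of the 183 BC transit as t = −6083 and the reset branch index as k = −4; the variable names are our own.

```python
from fractions import Fraction

t_183bc = -6083                     # 183 BC, with t = 0 identified with year 5900
residue = t_183bc % 8               # Python's % returns the residue 5 here
j = next(j for j in range(8) if (3 * j) % 8 == residue)

k = -4                              # the reset value of k
T2 = Fraction(t_183bc) / (k + Fraction(j, 8))
omega_2 = 1 / T2 + Fraction(13, 8)  # relation (15) solved for the angular velocity
```

Keeping the computation in `Fraction` avoids the cancellation that would otherwise occur when adding a term of size 0.0005 to 1.625 in floating point.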
When we generate transits by the transit rule using angular velocity $\omega_2$ across the years 2000 BC to 4000 AD, we get an exact match with actual spring transits from Meeus's results.
Table 5. Spring transit years, generated by the transit rule
1884 BC 1641 BC 1398 BC 1155 BC 912 BC 669 BC 426 BC 183 BC 60 303
546 789 1032 1275 1518 1761 2004 2247 2490 2733
2984
3. What happens when $\omega$ wanders into overlapping periodicity domains? The reality is a war-torn fractal-like dominance landscape foreshadowed in part by Figure 14. As a simple example, $\frac{35}{52} \approx 0.673077$ exerts its 52-ness dominance over its immediate neighbors. Yet, it is well within the dominance of $\frac{2}{3}$; an examination of $D(\frac{35}{52})$ shows a clear 3-fold periodicity, and the periodicity quotient $Q(\frac{35}{52}, 3) = 4$ supports this result. However, $Q(\frac{35}{52} \pm 0.000001, 52) = 92$ and a plot of its corresponding data set suggests periodicity 52.
With respect to permanence, in the life cycle of S, S slowly loses mass and swells to giant status and so the orbits of the planets recede from S, which means that the transit cycle for V may change dramatically. The rational numbers with small integer denominator near $\frac{13}{8}$ in increasing order are
\[
\left\{ \frac{3}{2}, \frac{11}{7}, \frac{8}{5}, \frac{29}{18}, \frac{21}{13}, \frac{13}{8}, \frac{31}{19}, \frac{18}{11}, \frac{23}{14}, \frac{28}{17}, \frac{33}{20}, \frac{5}{3}, \frac{7}{4} \right\}.
\]
A billion or two years from now, the natural periodicity of the Venus transit may change from 8 to 13 or 19. Hopefully, humans will yet be here to see.
For an application of the ideas of this paper to the phases of the Moon, see [8]. Just as the transit of Venus involves the periodicity domain of $\frac{13}{8}$, so too the phases of the Moon involve the periodicity domain of another fraction, this time $\frac{235}{19}$.
ACKNOWLEDGMENT. Thanks to Osmo Pekonen for asking me to write a review [7] of [11] which in turn
sparked this project.
REFERENCES
1. G. K. Chesterton, Heretics. Reprint of the 1905 edition, Books for Libraries Press, Freeport, NY, 1970.
2. M. Danlous-Dumesnils, Périodicité des passages de Vénus, L'Astronomie 91 (1977) 117–127.
3. F. Espenak, Six millennium catalog of Venus transits, NASA, 2013, available at http://eclipse.gsfc.nasa.gov/transit/catalog/VenusCatalog.html.
4. E. Halley, A new method of determining the parallax of the Sun, or his distance from the Earth, in The Abridged Transactions of the Royal Society 6 (1809) 243–249.
5. J. Meeus, Transits. William-Bell Press, Richmond, VA, 1989.
6. J. Meeus, The transits of Venus, 3000 BC to AD 3000, Journal of the British Astronomical Association 68 (1958) 98–108.
7. A. Simoson, A review of [11], Math. Intel. 35 (2013) 84–85.
8. A. Simoson, Bilbo and the last moon of autumn, to appear in Math Horizons.
9. D. A. Teets, Transits of Venus and the astronomical unit, Math. Mag. 76 (2003) 225–348.
10. H. Woolf, The Transits of Venus: A Study of Eighteenth Century Science. Princeton University Press, Princeton, NJ, 1959.
11. A. Wulf, Chasing Venus: the Race to Measure the Heavens. Alfred Knopf Press, New York, 2012.
ANDREW J. SIMOSON is a long-time professor of mathematics at King University. Recently he stumbled upon a pertinent Chesterton quote, "Men take thought and ponder rationalistically touching remote things – things that only theoretically matter, such as the transit of Venus" [1, p. 141].
King University, 1350 King College Road, Bristol, TN 37620
ajsimoso@king.edu
A Drug-Induced Random Walk
Author(s): Daniel J. Velleman
Source: The American Mathematical Monthly, Vol. 121, No. 4 (April), pp. 299-317
Published by: Mathematical Association of America
Stable URL: http://www.jstor.org/stable/10.4169/amer.math.monthly.121.04.299 .
Accessed: 30/03/2014 17:28
A Drug-Induced Random Walk
Daniel J. Velleman
Abstract. The label on a bottle of pills says "Take one half pill daily." A natural way to proceed is as follows: Every day, remove a pill from the bottle at random. If it is a whole pill, break it in half, take one half, and return the other half to the bottle; if it is a half pill, take it. We analyze the history of such a pill bottle.
1. INTRODUCTION. A few years ago our cat Natasha (see Figure 1) began losing weight. We took her to the vet, who did some tests and determined that she had a thyroid condition. He gave us a bottle of pills and told us to give her half a pill every day.
Figure 1. Natasha
The next day we shook a pill out of the bottle, broke it in half, gave her half of the
pill, and put the other half back in the bottle. We repeated that procedure for several
more days. Eventually, a day came when the pill we shook out of the bottle was one of
the half pills we had put back in on one of the previous days. Of course, we just gave
her the half pill that day. We continued to follow this procedure until the bottle was
empty, and then we started on a new bottle.
http://dx.doi.org/10.4169/amer.math.monthly.121.04.299
MSC: Primary 60G50, Secondary 65L05
April 2014] A DRUG-INDUCED RANDOM WALK 299
The pills solved Natasha's medical problem; she regained the weight she had lost, and she's doing fine now. But they created an interesting mathematical problem. The state of the pill bottle on any day can be described by a pair of numbers (w, h), where w is the number of whole pills in the bottle and h is the number of half pills. We will assume that every day a pill is removed from the bottle at random, with each pill being equally likely to be chosen. When a whole pill is removed, it is cut in half and half of it is returned to the bottle; when a half pill is removed, nothing is returned to the bottle. Thus, if the state of the pill bottle on a particular day is (w, h), then with probability w/(w + h) the state on the next day will be (w − 1, h + 1), and with probability h/(w + h) it will be (w, h − 1). This means that the state of the pill bottle executes a random walk in the plane, starting at the point (w, h) = (n, 0), where n is the initial number of pills in the bottle, and ending at (0, 0). Since the bottle contains 2n doses of medicine, the walk takes 2n steps.
For example, Figure 2 shows a computer simulation of a pill-bottle walk starting with n = 20 pills. On the first three days, whole pills are removed from the bottle, and the state of the bottle goes from (20, 0) to (19, 1), (18, 2), and (17, 3). The next day, a half pill is removed, and the state goes to (17, 2). And the walk continues for 36 more steps until it ends at (0, 0).
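A simulation of this kind takes only a few lines. The following Python sketch is one possible implementation, not the program used to produce the figures; the function name and the optional `rng` argument are our own choices.

```python
import random

def pill_walk(n, rng=None):
    """Simulate one pill-bottle walk and return the list of states (w, h),
    starting at (n, 0) and ending at (0, 0)."""
    rng = rng or random.Random()
    w, h = n, 0
    path = [(w, h)]
    while w + h > 0:
        # Each pill in the bottle is equally likely to be drawn.
        if rng.random() < w / (w + h):
            w, h = w - 1, h + 1   # a whole pill: half of it goes back
        else:
            h -= 1                # a half pill: nothing goes back
        path.append((w, h))
    return path

walk = pill_walk(20, random.Random(1))
```

Every walk returned by `pill_walk(n)` has exactly 2n + 1 states, because each step consumes one of the 2n doses.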
Figure 2. A pill-bottle walk with n = 20 (horizontal axis w, vertical axis h)
Figure 3 shows simulated walks with n = 100, n = 1000, and n = 10000. It appears that although the walks are random, the overall shapes of the walks are similar, with the shape becoming smoother as n increases. Notice that the scales of the three walks in Figure 3 are different; the first starts at (100, 0), the second at (1000, 0), and the third at (10000, 0). It is only when they are drawn the same size that they look similar. This suggests that we should rescale the walks to a uniform size, independent of n. We will therefore switch to a new coordinate system. If we let x = w/n and y = h/n, then x represents the fraction of the original n pills that are still whole, and y represents the fraction that have become half pills. Notice that these fractions may add up to less than 1, since some fraction of the pills may have been used up completely.
Figure 3. Walks with n = 100 (top left), n = 1000 (bottom), and n = 10000 (top right); horizontal axis w, vertical axis h
Using the coordinates (x, y) to represent the state of the pill bottle, we get a random walk that starts at (1, 0), ends at (0, 0), and stays in the triangle $x + y \le 1$, $x \ge 0$, $y \ge 0$. When the state is (x, y), it changes as follows:
with probability $\frac{x}{x+y}$, the state changes to $\left(x - \frac{1}{n},\, y + \frac{1}{n}\right)$;
with probability $\frac{y}{x+y}$, the state changes to $\left(x,\, y - \frac{1}{n}\right)$.
We will call such a walk an n-walk. Increasing n does not make the walk larger, but it
makes the steps smaller. Figure 3 suggests that as n increases, the walk approaches a
smooth curve. What is this curve?
The limit curve we seek is an example of a scaling limit of a discrete process.
Perhaps the best-known example of a scaling limit is Brownian motion, which can also
be thought of as the scaling limit of a random walk. For more on Brownian motion and
scaling limits, see [5].
We first give an intuitive argument that suggests a possible answer to our question. We will find it helpful to introduce a third variable t, standing for time. We set t = 0 at the beginning of the walk, and to keep the scales of the variables comparable we will assume that t increases by 1/n for each step of the walk. Since the walk consists of 2n steps, this means that t will run from 0 to 2. We think of the limit curve as being given by parametric equations
\[
x = f_x(t), \quad y = f_y(t), \quad 0 \le t \le 2,
\]
or, in vector notation,
\[
(x, y) = (f_x(t), f_y(t)) = \mathbf{f}(t), \quad 0 \le t \le 2.
\]
When the state of an n-walk is (x, y), the displacement to the next state is either the vector (−1/n, 1/n), with probability x/(x + y), or (0, −1/n), with probability y/(x + y). Thus, the expected value of the displacement is
\[
\frac{x}{x+y}\left(-\frac{1}{n}, \frac{1}{n}\right) + \frac{y}{x+y}\left(0, -\frac{1}{n}\right) = \frac{1}{n}\left(\frac{-x}{x+y}, \frac{x-y}{x+y}\right).
\]
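The expected displacement can be checked with exact rational arithmetic. In this sketch, whose function name is our own, the state is given as integer pill counts (w, h), so that x = w/n and y = h/n.

```python
from fractions import Fraction

def expected_step(w, h, n):
    """Exact expected displacement of the rescaled state (w/n, h/n)."""
    total = w + h
    p_whole = Fraction(w, total)     # probability of drawing a whole pill
    p_half = Fraction(h, total)      # probability of drawing a half pill
    step = Fraction(1, n)
    dx = p_whole * (-step)                    # x moves only on a whole pill
    dy = p_whole * step + p_half * (-step)    # y goes up or down by 1/n
    return dx, dy

dx, dy = expected_step(17, 3, 20)
```

For the state (17, 3) with n = 20, this gives (−17/400, 7/200), which matches $\frac{1}{n}\left(\frac{-x}{x+y}, \frac{x-y}{x+y}\right)$ with x = 17/20 and y = 3/20.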
Since t increases by 1/n during the step, this suggests that the parametric form of the limit curve might be a solution to the system of differential equations
\[
\frac{dx}{dt} = \frac{-x}{x+y}, \qquad \frac{dy}{dt} = \frac{x-y}{x+y}. \tag{1}
\]
To solve this system of equations, we first note that
\[
\frac{dy}{dx} = \frac{dy/dt}{dx/dt} = \frac{x-y}{-x} = -1 + \frac{y}{x}.
\]
We will let you check that the curve $y = -x\ln x$ satisfies this equation for $0 < x \le 1$ and passes through the point (1, 0). The graph of this curve is shown in Figure 4, and the similarity to the walks in Figure 3 is striking. Notice that although ln 0 is undefined, $\lim_{x \to 0^+}(-x\ln x) = 0$. From now on we consider 0 ln 0 to be equal to 0, so that the curve $y = -x\ln x$ includes the point (0, 0).
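For readers who prefer a numerical sanity check to a hand calculation, the sketch below compares a central-difference estimate of the slope of $y = -x\ln x$ with the right-hand side $-1 + y/x$ at a few sample points. The step size `h` and the sample points are arbitrary choices of ours.

```python
import math

def curve(x):
    """y = -x ln x, with the convention 0 ln 0 = 0."""
    return -x * math.log(x) if x > 0 else 0.0

def numeric_slope(x, h=1e-6):
    """Central-difference estimate of dy/dx along the curve."""
    return (curve(x + h) - curve(x - h)) / (2 * h)

def rhs(x):
    """Right-hand side of the equation dy/dx = -1 + y/x."""
    return -1 + curve(x) / x

gaps = [abs(numeric_slope(x) - rhs(x)) for x in (0.1, 0.3, 0.5, 0.9)]
```

The exact computation is just as short: the derivative of $-x\ln x$ is $-\ln x - 1$, and $-1 + y/x = -1 - \ln x$.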
Figure 4. The graph of $y = -x\ln x$
Substituting $y = -x\ln x$ in the first equation in (1), we get
\[
\frac{dx}{dt} = \frac{-x}{x - x\ln x} = \frac{1}{\ln x - 1}.
\]
Separation of variables gives
\[
t = \int (\ln x - 1)\,dx = x\ln x - 2x + C.
\]
Since x = 1 when t = 0, we must have C = 2, and therefore
\[
t = x\ln x - 2x + 2. \tag{2}
\]
Let $g(x) = x\ln x - 2x + 2$ for $0 \le x \le 1$. (Notice that by our convention that 0 ln 0 = 0, we have g(0) = 2.) Then g maps [0, 1] onto [0, 2] and is strictly decreasing, so it has an inverse. We define $f_x$ to be the inverse of g, which is a strictly decreasing function mapping [0, 2] to [0, 1]. Thus, if $0 \le t \le 2$ and $x = f_x(t)$, then x and t satisfy equation (2).¹ Using $y = -x\ln x$, we can rewrite equation (2) as $t = -y - 2x + 2$, or equivalently $y = 2 - 2x - t$. We therefore define
\[
f_y(t) = 2 - 2f_x(t) - t. \tag{3}
\]
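Since g is strictly decreasing on [0, 1], its inverse can be evaluated by bisection. The following sketch does so; the tolerance and the function names are our own choices, and any other root finder would serve equally well.

```python
import math

def g(x):
    """g(x) = x ln x - 2x + 2 on [0, 1], with the convention 0 ln 0 = 0."""
    return x * math.log(x) - 2 * x + 2 if x > 0 else 2.0

def f_x(t, tol=1e-12):
    """Inverse of g on [0, 2], found by bisection."""
    lo, hi = 0.0, 1.0          # g(lo) = 2 and g(hi) = 0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(mid) > t:
            lo = mid           # g is decreasing, so the root lies to the right
        else:
            hi = mid
    return (lo + hi) / 2

def f_y(t):
    """Second coordinate of the limit curve, equation (3)."""
    return 2 - 2 * f_x(t) - t

x_mid, y_mid = f_x(1.0), f_y(1.0)
```

A quick consistency check is that the point `(x_mid, y_mid)` lies on the curve $y = -x\ln x$ up to the bisection tolerance.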
We leave it to you to verify that the equation
\[
(x, y) = (f_x(t), f_y(t)) = \mathbf{f}(t), \quad 0 \le t \le 2 \tag{4}
\]
parametrizes the curve $y = -x\ln x$ shown in Figure 4, and it satisfies the differential equations (1) for $0 \le t < 2$, where we interpret the derivatives at t = 0 as one-sided derivatives. (At t = 2, we have x = y = 0, and therefore the right-hand sides of the equations in (1) are undefined.) The graphs of $f_x$ and $f_y$ are shown in Figure 5.
It turns out that an n-walk does, indeed, approach the curve (4) as n approaches $\infty$, but the sense in which this is true must be stated carefully. Our main theorem is the following.
¹ Using the Lambert W function $W_{-1}$ (see [1]), we can express $f_x(t)$ explicitly by the equation
\[
f_x(t) = \frac{t - 2}{W_{-1}\bigl((t - 2)/e^2\bigr)}.
\]
However, we will not have any use for this expression.
Figure 5. The graphs of $x = f_x(t)$ (left) and $y = f_y(t)$ (right)
Theorem 1. Suppose that $\varepsilon > 0$. Let the points on an n-walk be $p_0 = (1, 0), p_1,
\ldots, p_{2n} = (0, 0)$, and for $0 \le i \le 2n$ let $t_i = i/n$. Then the probability that for every
$i$, $\|p_i - \mathbf{f}(t_i)\| < \varepsilon$ approaches 1 as $n \to \infty$. In other words, the n-walk converges
uniformly in probability to the limit curve.
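Theorem 1 can be explored empirically. The sketch below is an illustration only, under the model used throughout (every pill in the bottle equally likely to be drawn): it simulates one n-walk and reports the largest distance $\|p_i - \mathbf{f}(t_i)\|$ along it. The helper names, the bisection inverse of $g(x) = x \ln x - 2x + 2$, and the sample sizes are assumptions made for the example.

```python
import math
import random

def limit_x(t: float, iters: int = 50) -> float:
    """f_x(t): inverse of g(x) = x ln x - 2x + 2 on [0, 1], by bisection."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid * math.log(mid) - 2.0 * mid + 2.0 > t:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def max_deviation(n: int, rng: random.Random) -> float:
    """Simulate one n-walk; return max_i ||p_i - f(t_i)|| over all 2n steps."""
    whole, half = n, 0
    worst = 0.0
    for i in range(1, 2 * n + 1):
        # Each pill in the bottle is equally likely to be drawn.
        if rng.random() * (whole + half) < whole:
            whole -= 1          # a whole pill is cut; half of it goes back
            half += 1
        else:
            half -= 1           # a half pill is consumed
        t = i / n
        fx = limit_x(t)
        fy = 2.0 - 2.0 * fx - t  # definition (3)
        worst = max(worst, math.hypot(whole / n - fx, half / n - fy))
    return worst

if __name__ == "__main__":
    rng = random.Random(0)
    for n in (100, 1000, 5000):
        print(n, max_deviation(n, rng))
```

A single run per bottle size is only suggestive; the theorem is a statement about probabilities, so repeated runs would be needed to estimate how often the walk leaves a given corridor.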
Two notable features of the limit curve are that the tangent line at (1, 0) has slope
$-1$, and the tangent line at the origin is vertical. The first feature makes intuitive sense:
early in the walk, almost all of the pills in the bottle are whole pills, so it is likely that
several whole pills will be removed before the first half pill is removed. For example,
in the walk in Figure 2, three whole pills were removed before the first half pill was
removed. When these initial whole pills are removed, the walk will move along the
line $y = 1 - x$, which is the tangent line at (1, 0). The second feature seems more
surprising: it appears that near the end of the walk, almost all of the pills are half pills,
and the walk ends by moving along the line $x = 0$ toward the origin. This suggests
two questions.
Question 1. For a bottle of n pills, what is the expected number of whole pills that are
removed from the bottle before the first half pill is removed?
Question 2. For a bottle of n pills, what is the expected number of half pills that are
removed from the bottle after the last whole pill is removed?
Versions of Question 1 have appeared in the literature before (see, for example,
[3, 4, 6, 8]). In the case n = 365, it is equivalent to the following version of the birthday
problem: If people are chosen at random, one by one, what is the expected number of
people with distinct birthdays who will be chosen before the first person who has the
same birthday as a previously chosen person? We will give an elementary derivation
of the answer to Question 1. In our next theorem, we express the answer in terms of
the incomplete gamma function, which is defined as follows:
\[ \Gamma(a, x) = \int_x^{\infty} t^{a-1} e^{-t}\,dt. \]
Theorem 2. For a bottle of n pills, the expected number of whole pills that are removed
from the bottle before the first half pill is removed is
\[ \frac{e^{n}}{n^{n-1}}\,\Gamma(n, n). \]
As $n \to \infty$, this expected value is asymptotic to
\[ \sqrt{\frac{\pi n}{2}}. \]
The answer to Question 2 was found by Richard Stong.
Theorem 3 (Stong). For a bottle of n pills, the expected number of half pills that
are removed from the bottle after the last whole pill is removed is the nth harmonic
number,
\[ H_n = 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n}. \]
For example, for a bottle of 100 pills, the expected number of whole pills before the
first half pill is
\[ \frac{e^{100}}{100^{99}}\,\Gamma(100, 100) \approx 12.21, \]
and the asymptotic approximation in Theorem 2 is
\[ \sqrt{\frac{100\pi}{2}} \approx 12.53. \]
The expected number of half pills after the last whole pill is
\[ H_{100} \approx 5.19. \]
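These three values can be reproduced with a few lines of code. The sketch below avoids the incomplete gamma function and instead evaluates the equivalent finite sum $\sum_{k=1}^{n} n!/(n^k (n-k)!)$, which Section 4 derives as equation (21); the function names are illustrative.

```python
import math

def expected_whole_before_first_half(n: int) -> float:
    """Sum over k of P(first k pills removed are all whole) = n!/(n^k (n-k)!)."""
    total, term = 0.0, 1.0
    for k in range(1, n + 1):
        term *= (n - k + 1) / n   # running product 1 * (n-1)/n * ... * (n-k+1)/n
        total += term
    return total

def asymptotic_value(n: int) -> float:
    """Asymptotic approximation sqrt(pi n / 2) from Theorem 2."""
    return math.sqrt(math.pi * n / 2.0)

def harmonic(n: int) -> float:
    """nth harmonic number H_n from Theorem 3."""
    return sum(1.0 / k for k in range(1, n + 1))

if __name__ == "__main__":
    n = 100
    print(expected_whole_before_first_half(n), asymptotic_value(n), harmonic(n))
```

Accumulating the running product keeps every intermediate value between 0 and 1, which avoids the overflow that computing $n!$ and $n^k$ separately in floating point would cause for large n.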
The rest of this paper is devoted to the proofs of Theorems 1–3. We prove Theorem 1
in Section 3, and Theorems 2 and 3 in Section 4. We consider variations on these
theorems in Section 5.
2. BACKGROUND FOR PROOF OF THEOREM 1. In preparation for the proof
of Theorem 1, we simplify the problem by eliminating one variable. According to
definition (3), $f_y(t) = 2 - 2 f_x(t) - t$, so
\[ \mathbf{f}(t) = (f_x(t),\; 2 - 2 f_x(t) - t) = f_x(t)\,(1, -2) + (0,\, 2 - t). \]
A similar equation holds for the points on any n-walk. Suppose that after i steps,
the n-walk is at the point $p_i = (x_i, y_i)$, and let $t_i = i/n$. This means that there are
$w_i = n x_i$ whole pills and $h_i = n y_i$ half pills in the bottle. These pills are enough
for $2w_i + h_i$ doses of medicine. Since there were $2n$ doses in the bottle originally,
and $i$ of those doses have been used up, there must be $2n - i$ doses left. Therefore,
$2w_i + h_i = 2n - i$, or equivalently, $h_i = 2n - 2w_i - i$. Dividing through by n, we
find that
\[ y_i = 2 - 2x_i - t_i, \tag{5} \]
and therefore
\[ p_i = (x_i,\; 2 - 2x_i - t_i) = x_i\,(1, -2) + (0,\, 2 - t_i). \]
It follows that
\[ \|p_i - \mathbf{f}(t_i)\| = \|(x_i - f_x(t_i))\,(1, -2)\| = |x_i - f_x(t_i)|\,\sqrt{5}. \]
Thus, to ensure that $p_i$ is close to $\mathbf{f}(t_i)$, it will suffice to ensure that $x_i$ is close to $f_x(t_i)$;
we can ignore the y-coordinates of $p_i$ and $\mathbf{f}(t_i)$. In other words, to prove Theorem 1 it
will suffice to prove the following lemma.
Lemma 4. Suppose that $\varepsilon > 0$. Let the x-coordinates of the points on an n-walk be
$x_0 = 1, x_1, \ldots, x_{2n} = 0$, and for $0 \le i \le 2n$ let $t_i = i/n$. Then the probability that for
every $i$, $|x_i - f_x(t_i)| < \varepsilon$ approaches 1 as $n \to \infty$.
In fact, using equations (3) and (5), we can completely eliminate the variable y
from the problem. We can describe the x-coordinates of the points on an n-walk by
saying that $x_{i+1}$ is equal to either $x_i - 1/n$ or $x_i$, with the first possibility occurring
with probability
\[ \frac{x_i}{x_i + y_i} = \frac{x_i}{x_i + 2 - 2x_i - t_i} = \frac{x_i}{2 - x_i - t_i}. \tag{6} \]
Similarly, if $x = f_x(t)$ and $y = f_y(t)$, then for $0 \le t < 2$,
\[ f_x'(t) = \frac{dx}{dt} = \frac{-x}{x + y} = \frac{-x}{2 - x - t} = \frac{-f_x(t)}{2 - f_x(t) - t}. \tag{7} \]
Thus, we can work entirely with the points $(t_i, x_i)$ and the curve $x = f_x(t)$, both of
which lie in the tx-plane.
The idea behind our proof of Lemma 4 is straightforward. Let m be a large positive
integer, and let n be an integer much larger than m. Now consider an n-walk, and
break the 2n steps of the walk into m large blocks of steps. We view the n-walk in the
tx-plane, ignoring the y-coordinates. The individual steps of the n-walk are random
and unpredictable, but the net change in x that results from a large block of steps is
more predictable: by the law of large numbers, this net change is likely to be close to
its expected value. It will follow that if a block of steps starts at a point (t, x), then
the net result of this block of steps is likely to be a small displacement in the tx-plane
whose slope is close to $-x/(2 - x - t)$. Since $x = f_x(t)$ is a solution to the differential
equation $dx/dt = -x/(2 - x - t)$, this means that the steps of the n-walk should stay
close to the graph of $f_x$.
This proof sketch suggests that our proof will involve ideas related to Euler's
method. Recall that Euler's method is a numerical method for solving a differential
equation of the form $f'(t) = F(t, f(t))$ for $a \le t \le b$, with an initial condition
$f(a) = x_0$. Here the function F and the numbers a, b, and $x_0$ are given, and we want
to compute values of f. To apply Euler's method, we choose a positive integer n and
a positive step size $h \le (b - a)/n$, let $t_j = a + jh$ for $0 \le j \le n$, and then define $x_j$
recursively by the equation
\[ x_{j+1} = x_j + h\,F(t_j, x_j), \qquad 0 \le j < n. \]
Thus, the displacement from $(t_j, x_j)$ to $(t_{j+1}, x_{j+1})$ has slope $F(t_j, x_j)$. If h is small
and F is sufficiently well-behaved, then the points $(t_j, x_j)$ will be close to the graph
of f.
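As a concrete illustration of the recurrence, the sketch below applies plain Euler steps to the equation of interest here, $dx/dt = -x/(2 - x - t)$ with $x(0) = 1$, and checks the endpoint against the implicit solution $t = x \ln x - 2x + 2$ from equation (2). The endpoint $b = 1.5$ and the step counts are arbitrary choices for the example, not values used in the paper.

```python
import math

def F(t: float, x: float) -> float:
    """Right-hand side of dx/dt = -x / (2 - x - t)."""
    return -x / (2.0 - x - t)

def euler(n: int, b: float = 1.5) -> float:
    """Take n Euler steps of size h = b/n from (t, x) = (0, 1); return x at t = b."""
    h = b / n
    t, x = 0.0, 1.0
    for _ in range(n):
        x += h * F(t, x)   # x_{j+1} = x_j + h F(t_j, x_j)
        t += h
    return x

def g(x: float) -> float:
    """Equation (2): the exact solution satisfies g(x) = t."""
    return x * math.log(x) - 2.0 * x + 2.0

if __name__ == "__main__":
    for n in (20, 200, 2000):
        x_end = euler(n)
        print(n, x_end, abs(g(x_end) - 1.5))
```

Checking $g(x_n)$ against $b$ sidesteps the need to invert $g$; the printed residual should shrink as the step size decreases.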
We will need to modify Euler's method slightly, because according to our proof
sketch for Lemma 4, the slope of the displacement caused by a block of steps in the
n-walk starting at (t, x) is likely to be close to $-x/(2 - x - t)$, but not exactly equal
to it. We will therefore need a version of Euler's method in which the slope of the
displacement at step j is only approximately equal to $F(t_j, x_j)$.
To make this precise, suppose that $a < b$, $g_1$ and $g_2$ are functions from $[a, b]$ to $\mathbb{R}$,
and for all $t \in [a, b]$, $g_1(t) < g_2(t)$. Let
\[ D = \{(t, x) \in \mathbb{R}^2 : a \le t \le b \text{ and } g_1(t) \le x \le g_2(t)\}. \]
Now suppose that $F : D \to \mathbb{R}$ and $f : [a, b] \to \mathbb{R}$, and for all $t \in [a, b]$, $(t, f(t)) \in D$
and
\[ f'(t) = F(t, f(t)), \]
where we interpret $f'(t)$ as a one-sided derivative when $t = a$ or $t = b$. Let $x_0 = f(a)$.
We want to use a version of Euler's method to locate points $(t_j, x_j)$ near the graph of f.
As before, we will use a positive step size $h \le (b - a)/n$, so for $0 \le j \le n$ we let
$t_j = a + jh$. We will assume that for $0 \le j < n$, the slope of the displacement from $(t_j, x_j)$
to $(t_{j+1}, x_{j+1})$ deviates from $F(t_j, x_j)$ by some amount $\delta_j$. Thus, we recursively define
\[ x_{j+1} = x_j + h\,(F(t_j, x_j) + \delta_j). \]
To ensure that this formula is defined, we assume that for every j, $g_1(t_j) \le x_j \le g_2(t_j)$,
so that $(t_j, x_j) \in D$.
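Lemma 5. In the modified Euler's method described above, assume that for $0 \le j < n$,
\[ |\delta_j| \le \delta. \]
We also assume that $\partial F/\partial x$ and $f''$ are defined and bounded. Thus, we assume that
there are positive constants $C_1$ and $C_2$ such that for all $(t, x) \in D$,
\[ \left|\frac{\partial F}{\partial x}(t, x)\right| \le C_1, \qquad |f''(t)| \le C_2. \]
Then for $0 \le j \le n$,
\[ |x_j - f(t_j)| \le \left(\frac{h C_2}{2 C_1} + \frac{\delta}{C_1}\right)\left((1 + C_1 h)^{j} - 1\right). \tag{8} \]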
Proof. We proceed by induction on j. Clearly, inequality (8) holds when $j = 0$, since
both sides are 0. Now suppose that the inequality holds for some $j < n$. By Taylor's
theorem, we can write
\[ f(t_{j+1}) = f(t_j) + h f'(t_j) + \frac{h^2}{2} f''(c_j) \]
for some number $c_j$ between $t_j$ and $t_{j+1}$. And by the mean value theorem, we have
\[ F(t_j, x_j) = F(t_j, f(t_j)) + \frac{\partial F}{\partial x}(t_j, d_j)\,(x_j - f(t_j)) = f'(t_j) + \frac{\partial F}{\partial x}(t_j, d_j)\,(x_j - f(t_j)) \]
for some $d_j$ between $x_j$ and $f(t_j)$. Thus,
\begin{align*}
x_{j+1} - f(t_{j+1}) &= x_j + h\,(F(t_j, x_j) + \delta_j) - f(t_{j+1}) \\
&= x_j + h\left(f'(t_j) + \frac{\partial F}{\partial x}(t_j, d_j)\,(x_j - f(t_j)) + \delta_j\right) - \left(f(t_j) + h f'(t_j) + \frac{h^2}{2} f''(c_j)\right) \\
&= (x_j - f(t_j))\left(1 + h\,\frac{\partial F}{\partial x}(t_j, d_j)\right) + h\delta_j - \frac{h^2}{2} f''(c_j).
\end{align*}
Next, we take absolute values and apply the bounds given in the statement of the
lemma:
\[ |x_{j+1} - f(t_{j+1})| \le |x_j - f(t_j)|\,(1 + C_1 h) + h\delta + \frac{C_2 h^2}{2}. \]
Finally, we apply the inductive hypothesis to conclude that
\begin{align*}
|x_{j+1} - f(t_{j+1})| &\le \left(\frac{h C_2}{2 C_1} + \frac{\delta}{C_1}\right)\left((1 + C_1 h)^{j} - 1\right)(1 + C_1 h) + h\delta + \frac{C_2 h^2}{2} \\
&= \left(\frac{h C_2}{2 C_1} + \frac{\delta}{C_1}\right)\left((1 + C_1 h)^{j+1} - 1\right),
\end{align*}
as required.
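Inequality (8) can be checked numerically on a toy problem. The sketch below does not use the paper's $F$; as a stand-in it takes $F(t, x) = -x$ on $[0, 1]$, whose exact solution is $f(t) = e^{-t}$, so that $C_1 = 1$ and $C_2 = 1$ are valid constants. The perturbations $\delta_j$ are drawn uniformly from $[-\delta, \delta]$; all of these choices are assumptions made for illustration.

```python
import math
import random

def check_bound(n: int = 200, delta: float = 1e-3, seed: int = 0):
    """Run the perturbed Euler recurrence and compare each error with bound (8).

    Returns a list of (error, bound) pairs for j = 0, ..., n.
    """
    rng = random.Random(seed)
    a, b = 0.0, 1.0
    h = (b - a) / n
    c1, c2 = 1.0, 1.0                  # |dF/dx| = 1 and |f''(t)| = e^{-t} <= 1
    x = math.exp(-a)                   # x_0 = f(a)
    pairs = []
    for j in range(n + 1):
        t = a + j * h
        error = abs(x - math.exp(-t))
        bound = (h * c2 / (2 * c1) + delta / c1) * ((1 + c1 * h) ** j - 1)
        pairs.append((error, bound))
        delta_j = rng.uniform(-delta, delta)   # slope deviation, |delta_j| <= delta
        x = x + h * (-x + delta_j)             # x_{j+1} = x_j + h (F(t_j, x_j) + delta_j)
    return pairs

if __name__ == "__main__":
    pairs = check_bound()
    print(all(err <= bnd + 1e-12 for err, bnd in pairs))
    print(pairs[-1])
```

The small tolerance in the comparison only absorbs floating-point rounding; the lemma itself guarantees the inequality in exact arithmetic.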
3. PROOF OF THEOREM 1. To complete the proof of Theorem 1, we return to our
proof sketch for Lemma 4. Unfortunately, nailing down the details of this proof sketch
is not easy. Nevertheless, in this section we show that, with some care, a proof based
on these ideas can be carried out.
Fix $\varepsilon > 0$. We will refer to the region $f_x(t) - \varepsilon < x < f_x(t) + \varepsilon$ in the tx-plane as
the $\varepsilon$-corridor. To prove Lemma 4, we must show that for large n, an n-walk is likely
to stay entirely inside the $\varepsilon$-corridor. We first determine simple bounds on any n-walk.
At step i of the walk, by (5) we have
\[ x_i \ge 0, \qquad 2 - 2x_i - t_i = y_i \ge 0, \]
and therefore
\[ 0 \le x_i \le \frac{2 - t_i}{2}. \tag{9} \]
Similar bounds apply to the graph of $f_x$: for $0 \le t \le 2$,
\[ 0 \le f_x(t) \le 1, \qquad 2 - 2 f_x(t) - t = f_y(t) = -f_x(t) \ln(f_x(t)) \ge 0, \]
so
\[ 0 \le f_x(t) \le \frac{2 - t}{2}. \tag{10} \]
These simple bounds already imply that the end of the n-walk stays inside the
$\varepsilon$-corridor: if $t_i > 2 - 2\varepsilon$, then
\[ 0 \le x_i,\ f_x(t_i) \le \frac{2 - t_i}{2} < \varepsilon, \]
and therefore
\[ |x_i - f_x(t_i)| < \varepsilon. \]
Thus, we only need to worry about $t_i$ in the interval $[0, 2 - 2\varepsilon]$. In particular, if $\varepsilon > 1$,
then there is nothing more to prove, so we can assume now that $\varepsilon \le 1$. By stopping
short of $t = 2$, we avoid having to deal with the point $(t, x, y) = (2, 0, 0)$ on the limit
curve, where the right-hand sides of the equations in (1) are undefined.
We will find it convenient to go a bit beyond $t = 2 - 2\varepsilon$, so we define
\[ D = \left\{(t, x) \in \mathbb{R}^2 : 0 \le t \le 2 - \varepsilon \text{ and } 0 \le x \le \frac{2 - t}{2}\right\}, \]
and for $(t, x) \in D$ we let
\[ F(t, x) = \frac{-x}{2 - x - t}. \]
Notice that for $(t, x) \in D$,
\[ 2 - x - t \ge 2 - \frac{2 - t}{2} - t = \frac{2 - t}{2} > 0, \tag{11} \]
so $F(t, x)$ is defined.
By (9) and (10), any n-walk and the curve $x = f_x(t)$ both stay in the region D up to
time $t = 2 - \varepsilon$, and by (7), if $0 \le t \le 2 - \varepsilon$, then $f_x'(t) = F(t, f_x(t))$. Thus, it makes
sense to apply Lemma 5 to the functions F and $f_x$ on the region D. In preparation for
this, we make some observations about these functions. We first note that by (11) and
the definition of D, for $(t, x) \in D$ we have
\[ 2 - x - t \ge \frac{2 - t}{2} \ge x \ge 0. \]
Since $F(t, x) = -x/(2 - x - t)$, it follows that
\[ -1 \le F(t, x) \le 0, \tag{12} \]
and therefore
\[ |f_x'(t)| = |F(t, f_x(t))| \le 1. \tag{13} \]
Next, we compute
\[ \frac{\partial F}{\partial x}(t, x) = \frac{-(2 - t)}{(2 - x - t)^2}, \qquad f_x''(t) = \frac{f_x(t)^2}{(2 - f_x(t) - t)^3} = \frac{(F(t, f_x(t)))^2}{2 - f_x(t) - t}. \]
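Thus, if $(t, x) \in D$, then by (11),
\[ \left|\frac{\partial F}{\partial x}(t, x)\right| = \frac{2 - t}{(2 - x - t)^2} \le \frac{2 - t}{((2 - t)/2)^2} = \frac{4}{2 - t} \le \frac{4}{\varepsilon}. \]
Similarly, if $0 \le t \le 2 - \varepsilon$, then
\[ |f_x''(t)| = \frac{(F(t, f_x(t)))^2}{2 - f_x(t) - t} \le \frac{1}{2 - f_x(t) - t} \le \frac{1}{(2 - t)/2} = \frac{2}{2 - t} \le \frac{2}{\varepsilon}. \]
We can therefore use $C_1 = 4/\varepsilon$ and $C_2 = 2/\varepsilon$ in Lemma 5. For reasons that will become
clear later, the value we will use for $\delta$ in Lemma 5 is
\[ \delta = \frac{C_1\,\varepsilon}{6\,(e^{2C_1} - 1)}. \tag{14} \]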
Since the function $F(t, x)$ is uniformly continuous on D, we can choose some $\eta > 0$
such that for any two points $(t_1, x_1), (t_2, x_2) \in D$,
\[ \text{if } |t_1 - t_2| < \eta \text{ and } |x_1 - x_2| < \eta, \text{ then } |F(t_1, x_1) - F(t_2, x_2)| < \frac{\delta}{4}. \tag{15} \]
We now choose a positive integer m large enough that
\[ \frac{2}{m} < \frac{\varepsilon}{3}, \qquad \frac{2}{m} < \eta, \qquad \frac{e^{2C_1} - 1}{2m} < \frac{\varepsilon}{6}. \tag{16} \]
Again, the reason for this choice will become clear later.
Consider an n-walk for any $n \ge m^2$. As in the statement of Lemma 4, let the
x-coordinates of the points on the walk be $x_0 = 1, x_1, \ldots, x_{2n} = 0$, and for $0 \le i \le 2n$ let
$t_i = i/n$. We now divide $2n$ by $m$, getting a quotient $q$ and remainder $r$. In other words,
\[ 2n = mq + r \]
and $0 \le r < m$. Notice that since $n \ge m^2$, we have $q \ge 2m$. We think of the walk
as consisting of m blocks of steps, with each block containing q steps, followed by r
extra steps at the end. For $0 \le j \le m$, let $(T_j, X_j)$ be the position of the walk after j
blocks of steps have been traversed. Thus, $T_j = t_{jq} = jq/n$ and $X_j = x_{jq}$.
Let $h = q/n$, so that for $0 \le j < m$,
\[ T_{j+1} - T_j = h, \]
and note that since x either remains fixed or decreases by $1/n$ in each step of the walk,
\[ 0 \le X_j - X_{j+1} \le \frac{q}{n} = h. \]
Applying (16), we see that
\[ h = \frac{2q}{2n} = \frac{2q}{mq + r} \le \frac{2q}{mq} = \frac{2}{m} < \frac{\varepsilon}{3}, \]
so
\[ |T_{j+1} - T_j| \le \frac{2}{m} < \frac{\varepsilon}{3}, \qquad |X_{j+1} - X_j| \le \frac{2}{m} < \frac{\varepsilon}{3}. \tag{17} \]
In other words, in the course of a single block of steps, x and t change by less than
$\varepsilon/3$.
For $0 \le j < m$, let
\[ \delta_j = \frac{X_{j+1} - X_j}{h} - F(T_j, X_j). \]
Rearranging this definition, this means that
\[ X_{j+1} = X_j + h\,(F(T_j, X_j) + \delta_j). \]
Of course, this is the recurrence in our modified version of Euler's method.
We would now like to apply Lemma 5, but we have no guarantee that $\delta$ will be a
bound on the numbers $|\delta_j|$. However, we can show that if $\delta$ is such a bound, then the
walk stays in the $\varepsilon$-corridor:
Claim. Suppose that for all $j < m$, if $T_j \le 2 - 2\varepsilon$, then $|\delta_j| \le \delta$. Then the n-walk
stays inside the $\varepsilon$-corridor.
Proof of Claim. Notice that since $q \ge 2m$ and $2/m < \varepsilon/3$,
\[ T_m = t_{mq} = \frac{mq}{n} = \frac{2mq}{2n} = \frac{2mq}{mq + r} > \frac{2mq}{m(q + 1)} = 2 - \frac{2}{q + 1} > 2 - \frac{2}{2m} > 2 - \frac{\varepsilon}{6} > 2 - 2\varepsilon. \]
Thus, we can let k be the least index such that $T_k > 2 - 2\varepsilon$. Then for all $j < k$,
$T_j \le 2 - 2\varepsilon$, and therefore, by assumption, $|\delta_j| \le \delta$. And since $T_{k-1} \le 2 - 2\varepsilon$, by (17)
we have
\[ T_k < T_{k-1} + \frac{\varepsilon}{3} \le 2 - 2\varepsilon + \frac{\varepsilon}{3} < 2 - \varepsilon. \]
We can therefore apply Lemma 5 to the points $(T_j, X_j)$ for $0 \le j \le k$ and the functions
F and $f_x$ on the region D to conclude that for all such j,
\[ |X_j - f_x(T_j)| \le \left(\frac{h C_2}{2 C_1} + \frac{\delta}{C_1}\right)\left((1 + C_1 h)^{j} - 1\right). \]
Since $j \le k \le m$ and $h \le 2/m$,
\[ (1 + C_1 h)^{j} \le \left(1 + \frac{2 C_1}{m}\right)^{m} < e^{2C_1}, \]
where the last inequality is well known (see, for example, inequality 4.5.13 in [7]).
Therefore,
\[ |X_j - f_x(T_j)| < \left(\frac{(2/m)(2/\varepsilon)}{2(4/\varepsilon)} + \frac{\delta}{C_1}\right)(e^{2C_1} - 1) = \frac{e^{2C_1} - 1}{2m} + \frac{\delta\,(e^{2C_1} - 1)}{C_1}. \]
By (16) and (14), the last two fractions are both at most $\varepsilon/6$. Thus, we have shown that
\[ |X_j - f_x(T_j)| < \frac{\varepsilon}{3}. \tag{18} \]
This implies that all of the points $(T_j, X_j)$ for $0 \le j \le k$ are in the $\varepsilon$-corridor.
Since $T_k > 2 - 2\varepsilon$, as we observed after (10), all points on the n-walk beyond
$(T_k, X_k)$ are also in the $\varepsilon$-corridor. We still need to worry about points on the n-walk
in the interiors of the first k blocks. If (t, x) is such a point, then (t, x) occurs between
$(T_j, X_j)$ and $(T_{j+1}, X_{j+1})$, for some $j < k$. To see that (t, x) is in the $\varepsilon$-corridor, we
compute
\[ |x - f_x(t)| \le |x - X_j| + |X_j - f_x(T_j)| + |f_x(T_j) - f_x(t)|. \]
We now bound each of the terms on the right-hand side. We already know, by (17) and
(18), that $|x - X_j| \le |X_{j+1} - X_j| < \varepsilon/3$ and $|X_j - f_x(T_j)| < \varepsilon/3$. For the third term
we apply the mean value theorem:
\[ f_x(T_j) - f_x(t) = f_x'(c)\,(T_j - t), \]
for some c between t and $T_j$. By (13) and (17), we conclude that
\[ |f_x(T_j) - f_x(t)| = |f_x'(c)| \cdot |T_j - t| \le |f_x'(c)| \cdot |T_{j+1} - T_j| < 1 \cdot \frac{\varepsilon}{3} = \frac{\varepsilon}{3}. \]
Putting it all together, we get
\[ |x - f_x(t)| \le |x - X_j| + |X_j - f_x(T_j)| + |f_x(T_j) - f_x(t)| < \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon, \]
so the point (t, x) is in the $\varepsilon$-corridor. We have now shown that all points on the walk
are in the $\varepsilon$-corridor, which completes the proof of the claim.
The claim shows that if an n-walk goes outside of the $\varepsilon$-corridor, then there must be
some $j < m$ such that $T_j \le 2 - 2\varepsilon$ and $|\delta_j| > \delta$. To complete the proof, we will show
that this is unlikely to happen.
Partition $\{(t, x) \in D : t \le 2 - 2\varepsilon\}$ into finitely many disjoint regions $R_1, R_2, \ldots,
R_K$, each with diameter less than $\eta$. By (12) and (15), for each k with $1 \le k \le K$ we
can choose a number $r_k$ such that $-1 \le r_k \le 0$ and for every $(t, x) \in R_k$,
\[ |F(t, x) - r_k| < \frac{\delta}{4}. \tag{19} \]
For example, we can take $r_k$ to be $F(t, x)$ for some particular $(t, x) \in R_k$. Notice that
the regions $R_k$ and numbers $r_k$ do not depend on n; as $n \to \infty$, $R_k$ and $r_k$ will remain
fixed.
We will write $\Pr_n(E)$ to denote the probability that an event E occurs when an
n-walk takes place. The claim implies that the probability that an n-walk will leave the
$\varepsilon$-corridor is at most
\[ \sum_{j=0}^{m-1} \sum_{k=1}^{K} p_{j,k}(n), \]
where
\[ p_{j,k}(n) = \Pr\nolimits_n\bigl((T_j, X_j) \in R_k \text{ and } |\delta_j| > \delta\bigr). \]
Thus, it will suffice to show that for each j and k, $\lim_{n \to \infty} p_{j,k}(n) = 0$.
Fix j and k with $0 \le j < m$ and $1 \le k \le K$. The value of $\delta_j$ is determined by the
block of steps taken by the n-walk in going from $(T_j, X_j)$ to $(T_{j+1}, X_{j+1})$. The points
on this part of the walk are $(t_{jq+i}, x_{jq+i})$ for $0 \le i \le q$. We will refer to the step from
$(t_{jq+i}, x_{jq+i})$ to $(t_{jq+i+1}, x_{jq+i+1})$ as step i of this block of the n-walk. Notice that there
are q steps in the block, and since q is the quotient when $2n$ is divided by m and m is
fixed, $q \to \infty$ when $n \to \infty$.
Let a be the number of steps in the block in which x decreases by $1/n$. In the
remaining $q - a$ steps, the value of x does not change, so $X_j - X_{j+1} = a/n$. Therefore,
by definition,
\[ \delta_j = \frac{X_{j+1} - X_j}{h} - F(T_j, X_j) = \frac{-a/n}{q/n} - F(T_j, X_j) = -\frac{a}{q} - F(T_j, X_j). \]
Although the value of $p_{j,k}(n)$ does not depend on the precise method by which the
steps in this block of the walk are chosen, it will be helpful to specify a method. We
will assume that for $0 \le i < q$, random numbers $s_i$ are chosen, independently and
uniformly in $[0, 1]$, and then in step i, x decreases by $1/n$ if
\[ s_i < \frac{x_{jq+i}}{2 - x_{jq+i} - t_{jq+i}} = -F(t_{jq+i}, x_{jq+i}), \]
and x is unchanged otherwise. Of course, according to equation (6), this procedure
generates the correct probabilities for the steps of the walk.
Suppose that $(T_j, X_j) \in R_k$. Then by (19), $|F(T_j, X_j) - r_k| < \delta/4$, or in other
words
\[ r_k - \frac{\delta}{4} < F(T_j, X_j) < r_k + \frac{\delta}{4}. \tag{20} \]
Also, for $0 \le i < q$, by (17) and (16), $|t_{jq+i} - T_j| \le 2/m$, $|x_{jq+i} - X_j| \le 2/m$,
$2/m < \varepsilon/3$, and $2/m < \eta$. Since $t_{jq+i} \le T_j + 2/m < 2 - 2\varepsilon + \varepsilon/3 < 2 - \varepsilon$, we have
$(t_{jq+i}, x_{jq+i}) \in D$, and therefore, by (15), $|F(t_{jq+i}, x_{jq+i}) - F(T_j, X_j)| < \delta/4$.
Combining this with $|F(T_j, X_j) - r_k| < \delta/4$, we conclude that $|F(t_{jq+i}, x_{jq+i}) - r_k| < \delta/2$,
or in other words
\[ r_k - \frac{\delta}{2} < F(t_{jq+i}, x_{jq+i}) < r_k + \frac{\delta}{2}. \]
Recall that step i is determined by how $s_i$ compares to $-F(t_{jq+i}, x_{jq+i})$. We can
now draw the conclusion that if $(T_j, X_j) \in R_k$, then:
(a) if $s_i \le -r_k - \delta/2$, then at step i, x decreases by $1/n$;
(b) if $s_i \ge -r_k + \delta/2$, then at step i, x remains unchanged.
We are now ready to show that $\lim_{n \to \infty} p_{j,k}(n) = 0$. By definition,
\[ p_{j,k}(n) = \Pr\nolimits_n\bigl((T_j, X_j) \in R_k \text{ and } \delta_j > \delta\bigr) + \Pr\nolimits_n\bigl((T_j, X_j) \in R_k \text{ and } \delta_j < -\delta\bigr). \]
We will show that both of the probabilities on the right-hand side approach 0 as
$n \to \infty$.
For the first, suppose that $(T_j, X_j) \in R_k$ and $\delta_j > \delta$. Since $\delta_j = -a/q - F(T_j, X_j)$,
by (20) this implies that
\[ \frac{a}{q} < -F(T_j, X_j) - \delta < -r_k - \frac{3\delta}{4}. \]
Now let $a'$ be the number of values of i, $0 \le i < q$, for which $s_i \le -r_k - \delta/2$. By (a),
$a' \le a$, and therefore
\[ 0 \le \frac{a'}{q} \le \frac{a}{q} < -r_k - \frac{3\delta}{4} < -r_k - \frac{\delta}{2} < 1. \]
This is very unlikely to happen. To see why, notice first that for $0 \le i < q$, since $s_i$ is
chosen uniformly in $[0, 1]$ and $0 < -r_k - \delta/2 < 1$, the probability that $s_i \le -r_k - \delta/2$
is $-r_k - \delta/2$. And since the $s_i$ are chosen independently, this means that $a'/q$, which
is the fraction of values of i for which $s_i \le -r_k - \delta/2$, should be close to $-r_k - \delta/2$.
More precisely, by the law of large numbers (see [2, Section VI.4, p. 152]), for any
$\gamma > 0$, the probability that $|a'/q - (-r_k - \delta/2)| > \gamma$ must approach 0 as $q \to \infty$.
And since $q \to \infty$ as $n \to \infty$, taking $\gamma = \delta/4$ we can conclude that
\[ \lim_{n \to \infty} \Pr\nolimits_n\left(\frac{a'}{q} < -r_k - \frac{3\delta}{4}\right) = 0. \]
It follows that
\[ \lim_{n \to \infty} \Pr\nolimits_n\bigl((T_j, X_j) \in R_k \text{ and } \delta_j > \delta\bigr) = 0. \]
The second probability is similar. If $(T_j, X_j) \in R_k$ and $\delta_j < -\delta$, then
\[ \frac{a}{q} > -F(T_j, X_j) + \delta > -r_k + \frac{3\delta}{4}. \]
Now let $a''$ be the number of values of i, $0 \le i < q$, for which $s_i < -r_k + \delta/2$. By (b),
$a'' \ge a$, so
\[ 1 \ge \frac{a''}{q} \ge \frac{a}{q} > -r_k + \frac{3\delta}{4} > -r_k + \frac{\delta}{2} > 0. \]
Once again, the law of large numbers says that the probability of this event goes to 0
as $n \to \infty$, which completes the proof of Lemma 4 and, therefore, Theorem 1.
4. PROOFS OF THEOREMS 2 AND 3. To prove Theorem 2, fix $n > 0$, and let
A denote the number of whole pills removed from the bottle before the first half pill.
Of course, the first pill removed from the bottle must be a whole pill, and there are n
whole pills altogether, so $1 \le A \le n$.
For $1 \le k \le n$, let $X_k = 1$ if the first k pills removed from the bottle are all whole
pills, and $X_k = 0$ otherwise. Then we have $A = X_1 + X_2 + \cdots + X_n$, and therefore
\[ E(A) = E(X_1 + X_2 + \cdots + X_n) = E(X_1) + E(X_2) + \cdots + E(X_n). \]
The probability that the first pill removed is a whole pill is 1. Once the first whole
pill has been removed, the bottle contains $n - 1$ whole pills and 1 half pill, so the
probability that the second pill is also a whole pill is $(n - 1)/n$. Similarly, if the first
two pills are whole pills, then the probability that the third pill is a whole pill is
$(n - 2)/n$. Continuing in this way, we see that for $1 \le k \le n$,
\[ E(X_k) = \Pr(X_k = 1) = 1 \cdot \frac{n - 1}{n} \cdot \frac{n - 2}{n} \cdots \frac{n - k + 1}{n} = \frac{n!}{n^{k}\,(n - k)!}. \]
Thus,
\[ E(A) = \sum_{k=1}^{n} E(X_k) = \sum_{k=1}^{n} \frac{n!}{n^{k}\,(n - k)!}. \]
Reindexing by $j = n - k$, we get
\[ E(A) = \sum_{k=1}^{n} \frac{n!}{n^{k}\,(n - k)!} = \sum_{j=0}^{n-1} \frac{n!}{n^{n-j}\, j!} = \frac{n!}{n^{n}} \sum_{j=0}^{n-1} \frac{n^{j}}{j!}. \tag{21} \]
To relate this formula to the incomplete gamma function, we first evaluate the integral
in the definition of the incomplete gamma function. Applying integration by parts
k times leads to the formula in the following lemma.
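Lemma 6. For every integer $k \ge 0$,
\[ \int t^{k} e^{-t}\,dt = -\frac{k!}{e^{t}} \sum_{j=0}^{k} \frac{t^{j}}{j!} + C. \]
Using this lemma, we find that
\[ \Gamma(n, n) = \int_n^{\infty} t^{n-1} e^{-t}\,dt = \lim_{N \to \infty} \left[-\frac{(n - 1)!}{e^{t}} \sum_{j=0}^{n-1} \frac{t^{j}}{j!}\right]_{n}^{N} = \frac{(n - 1)!}{e^{n}} \sum_{j=0}^{n-1} \frac{n^{j}}{j!}. \tag{22} \]
Thus,
\[ \sum_{j=0}^{n-1} \frac{n^{j}}{j!} = \frac{e^{n}}{(n - 1)!}\,\Gamma(n, n). \]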
Substituting into (21), we get
\[ E(A) = \frac{n!}{n^{n}} \sum_{j=0}^{n-1} \frac{n^{j}}{j!} = \frac{n!}{n^{n}} \cdot \frac{e^{n}}{(n - 1)!}\,\Gamma(n, n) = \frac{e^{n}}{n^{n-1}}\,\Gamma(n, n). \]
This proves the first statement in Theorem 2.
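To prove the second statement, about the asymptotic value as $n \to \infty$, we need the
following fact.
Lemma 7.
\[ \lim_{n \to \infty} \frac{\Gamma(n, n)}{(n - 1)!} = \frac{1}{2}. \]
Proof. According to inequality 8.10.13 of [7],
\[ \frac{\Gamma(n, n)}{(n - 1)!} < \frac{1}{2} < \frac{\Gamma(n + 1, n)}{n!}. \tag{23} \]
By Lemma 6 and equation (22),
\[ \Gamma(n + 1, n) = \int_n^{\infty} t^{n} e^{-t}\,dt = \frac{n!}{e^{n}} \sum_{j=0}^{n} \frac{n^{j}}{j!} = n \cdot \frac{(n - 1)!}{e^{n}} \sum_{j=0}^{n-1} \frac{n^{j}}{j!} + \frac{n^{n}}{e^{n}} = n\,\Gamma(n, n) + \frac{n^{n}}{e^{n}}. \]
Substituting into the second half of inequality (23), we get
\[ \frac{1}{2} < \frac{\Gamma(n, n)}{(n - 1)!} + \frac{n^{n}}{e^{n}\, n!}, \]
and therefore
\[ \frac{1}{2} - \frac{n^{n} \sqrt{2\pi n}}{e^{n}\, n!} \cdot \frac{1}{\sqrt{2\pi n}} < \frac{\Gamma(n, n)}{(n - 1)!} < \frac{1}{2}. \]
By Stirling's formula, $\lim_{n \to \infty} n^{n} \sqrt{2\pi n}/(e^{n}\, n!) = 1$, and the lemma now follows by
the squeeze theorem.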
This lemma allows us to determine the asymptotic rate of growth of the expected
value of A. The expected length of the initial run of whole pills can be rewritten in the
form
\[ E(A) = \frac{e^{n}}{n^{n-1}}\,\Gamma(n, n) = \sqrt{2\pi n} \cdot \frac{e^{n}\, n!}{n^{n} \sqrt{2\pi n}} \cdot \frac{\Gamma(n, n)}{(n - 1)!} \sim \sqrt{2\pi n} \cdot 1 \cdot \frac{1}{2} = \sqrt{\frac{\pi n}{2}}, \]
which completes the proof of Theorem 2.
Finally, we give Stong's proof of Theorem 3. For $1 \le k \le n$, consider the kth whole
pill that is removed from the bottle. This pill is cut in half, and half of it is returned
to the bottle; we will refer to this half pill as the kth half pill. Let $X_k = 1$ if the kth
half pill is removed from the bottle after the last whole pill is removed, and $X_k = 0$
otherwise. Then the expected value we seek is
\[ E(X_1 + X_2 + \cdots + X_n) = E(X_1) + E(X_2) + \cdots + E(X_n). \]
After the kth half pill has been returned to the bottle, there are $n - k$ whole pills
still in the bottle, and we have $X_k = 1$ if and only if among the set of pills consisting
of these $n - k$ remaining whole pills and the kth half pill, the half pill is the last one to
be removed from the bottle. Since each pill in this set is equally likely to be chosen at
each step, we have
\[ E(X_k) = \Pr\nolimits_n(X_k = 1) = \frac{1}{n - k + 1}. \]
Therefore the expected number of half pills removed from the bottle after the last
whole pill is
\[ E(X_1) + E(X_2) + \cdots + E(X_n) = \frac{1}{n} + \frac{1}{n - 1} + \cdots + 1 = H_n. \]
5. VARIATIONS. In all of our calculations, we have assumed that when a pill is
removed from the bottle, all pills in the bottle are equally likely to be chosen. But
since the whole pills are twice as big as the half pills, another natural assumption
would be that whole pills are twice as likely to be chosen as half pills. In this section
we summarize the results of redoing our calculations with this alternative assumption,
leaving the details to the reader.
If whole pills are twice as likely to be chosen as half pills, then the differential
equations (1) must be replaced by
\[ \frac{dx}{dt} = \frac{-2x}{2x + y}, \qquad \frac{dy}{dt} = \frac{2x - y}{2x + y}. \]
The solution to this system of equations that passes through the point (1, 0) is
\[ y = 2(\sqrt{x} - x), \qquad x = \frac{(2 - t)^2}{4}, \qquad y = \frac{t\,(2 - t)}{2}. \]
Once again, the random walk converges uniformly in probability to this curve as
$n \to \infty$.
Surprisingly, in this case the expected number of whole pills removed before the
first half pill turns out to be exactly the same as the expected number of half pills
removed after the last whole pill. Calculations similar to those in the last section show
that both expected values are
\[ \frac{2^{2n}}{\binom{2n}{n}} - 1. \]
There is a simple explanation for why these two expected values are equal. The
explanation is based on an alternative procedure we could follow to decide which pill
to remove from the bottle each day. First, number the pills in a full bottle from 1 to n.
Then make a deck of 2n cards numbered from 1 to n, with each number appearing on
two cards, and shuffle the deck. Every day, deal a card from the top of the deck, and if
the card has the number k on it, then remove pill number k from the bottle. As usual,
if the pill is whole, then cut it in half and return half to the bottle.
On any day, if pill number k is still whole, then there will be two cards numbered k
in the deck; if half of pill number k has already been taken, then there will be only one
card numbered k in the deck; and if pill number k has been used up completely, then
there will be no cards numbered k left in the deck. It follows that whole pills will be
twice as likely to be chosen as half pills, as required.
If we follow this procedure, then the number of whole pills removed from the bottle
before the first half pill is removed will be the same as the number of distinct cards
dealt from the top of the deck before the first duplicate card. Similarly, we could
determine how many half pills will be removed from the bottle after the last whole pill by
dealing cards from the bottom of the deck and counting the number of distinct cards
dealt before the first duplicate. It should now be clear by symmetry that the expected
values of these two numbers are equal. Indeed, the problem of computing this common
expected value is equivalent to the third question addressed in [9], and the answer
follows from Theorem 5 of [9].
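The card-deck description lends itself to a short experiment. The sketch below, offered as an illustration with hypothetical function names, does two things: it computes the expected initial run of whole pills exactly, by multiplying the step-by-step probabilities under the 2-to-1 weighting, and compares it with $2^{2n}/\binom{2n}{n} - 1$; and it shuffles decks to estimate the counts from the top and from the bottom.

```python
import random
from fractions import Fraction
from math import comb

def exact_expected_whole_first(n: int) -> Fraction:
    """E[A] when a whole pill is twice as likely to be drawn as a half pill."""
    total, prob = Fraction(0), Fraction(1)
    for k in range(1, n + 1):
        whole, half = n - k + 1, k - 1      # bottle contents before the kth removal
        prob *= Fraction(2 * whole, 2 * whole + half)
        total += prob                       # P(first k removals are all whole)
    return total

def closed_form(n: int) -> Fraction:
    """The value 2^(2n) / C(2n, n) - 1 stated in the text."""
    return Fraction(4 ** n, comb(2 * n, n)) - 1

def distinct_before_duplicate(cards) -> int:
    """Number of distinct cards dealt before the first repeated card."""
    seen = set()
    for card in cards:
        if card in seen:
            break
        seen.add(card)
    return len(seen)

def simulate(n: int, trials: int, rng: random.Random):
    """Average the top-of-deck and bottom-of-deck counts over shuffled decks."""
    deck = [k for k in range(1, n + 1) for _ in range(2)]
    top = bottom = 0
    for _ in range(trials):
        rng.shuffle(deck)
        top += distinct_before_duplicate(deck)
        bottom += distinct_before_duplicate(reversed(deck))
    return top / trials, bottom / trials

if __name__ == "__main__":
    print(exact_expected_whole_first(5), closed_form(5))
    print(simulate(5, 20000, random.Random(1)))
```

Reading the deck in reverse is what models dealing from the bottom; since a reversed uniform shuffle is again a uniform shuffle, the two averages estimate the same quantity.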
ACKNOWLEDGMENTS. I would like to thank Richard Stong, Greg Warrington, Rob Benedetto, Tanya
Leise, Amy Wagaman, and the anonymous referees for helpful conversations and suggestions. Natasha would
like to thank Dr. Michael Katz, D.V.M.
REFERENCES
1. R. M. Corless, G. H. Gonnet, D. E. G. Hare, D. J. Jeffrey, D. E. Knuth, On the Lambert W function, Adv.
Comput. Math. 5 (1996) 329–359.
2. W. Feller, An Introduction to Probability Theory and its Applications. Vol. I. Third edition, Wiley, New
York, 1968.
3. L. Holst, On birthday, collectors', occupancy and other classical urn problems, Int. Stat. Review 54 (1986)
15–27.
4. M. S. Klamkin, D. J. Newman, Extensions of the birthday surprise, J. Comb. Theory 3 (1967) 279–282.
5. G. F. Lawler, V. Limic, Random Walk: A Modern Introduction. Cambridge Studies in Advanced
Mathematics. Vol. 123. Cambridge University Press, Cambridge, 2010.
6. B. McCabe, Matching balls drawn from an urn, Problem E 2263, Solutions by B. C. Arnold and R. J. Dickson,
Amer. Math. Monthly 78 (1971) 1022–1024.
7. National Institute of Standards and Technology, Digital Library of Mathematical Functions, March 23,
2012, available at http://dlmf.nist.gov/.
8. P. N. Rathie, P. Zörnig, On the birthday problem: Some generalizations and applications, Int. J. Math.
Math. Sci. 2003 (2003) 3827–3840.
9. D. J. Velleman, G. S. Warrington, What to expect in a game of memory, Amer. Math. Monthly 120 (2013)
787–805.
DANIEL J. VELLEMAN received his B.A. from Dartmouth College in 1976 and his Ph.D. from the University
of Wisconsin–Madison in 1980. He taught at the University of Texas before joining the faculty of Amherst
College in 1983. He was the editor of the American Mathematical Monthly from 2007 to 2011. In his spare
time he enjoys singing, bicycling, and playing volleyball.
Department of Mathematics, Amherst College, Amherst, MA 01002
djvelleman@amherst.edu
Analytical Solution for the Generalized Fermat–Torricelli Problem
Author(s): Alexei Yu. Uteshev
Source: The American Mathematical Monthly, Vol. 121, No. 4 (April), pp. 318-331
Published by: Mathematical Association of America
Stable URL: http://www.jstor.org/stable/10.4169/amer.math.monthly.121.04.318 .
Accessed: 30/03/2014 17:29
Analytical Solution for the Generalized Fermat-Torricelli Problem
Alexei Yu. Uteshev
Abstract. We present an explicit analytical solution for the problem of minimization of the function
$$F(x,y)=\sum_{j=1}^{3} m_j\sqrt{(x-x_j)^2+(y-y_j)^2},$$
i.e., we find the coordinates of the stationary point and the corresponding critical value as functions of $\{m_j,x_j,y_j\}_{j=1}^{3}$. In addition, we also discuss the inverse problem of finding such values for $m_1$, $m_2$, and $m_3$ for which the corresponding function $F$ possesses a prescribed position of the stationary point.
1. INTRODUCTION. Consider the following problem. Given the coordinates of three noncollinear points $P_1=(x_1,y_1)$, $P_2=(x_2,y_2)$, and $P_3=(x_3,y_3)$ in the plane, find the coordinates of the point $P_*=(x_*,y_*)$ that minimizes the function
$$F(x,y)=\sum_{j=1}^{3} m_j\sqrt{(x-x_j)^2+(y-y_j)^2}. \tag{1}$$
Here $m_1$, $m_2$, and $m_3$ are assumed to be real positive numbers and will be subsequently referred to as weights.
The stated problem, in its particular case of equal weights $m_1=m_2=m_3=1$, has been known since 1643 as the (classical) Fermat-Torricelli problem. It has a unique solution that coincides either with one of the points $P_1$, $P_2$, $P_3$ or with the so-called Fermat or Fermat-Torricelli point [2, 4] of the triangle $P_1P_2P_3$; this point makes an angle of $2\pi/3$ with any two vertices of the triangle.
Generalization of the problem to the case of unequal weights has been investigated since the 19th century. This generalization is known under different names: the Steiner problem, the Weber problem, the problem of railway junction ((Germ.) Problem des Knotenpunktes) [3, 8], the three factory problem [6]. The last two names were inspired by a facility location problem such as the following. Let the cities $P_1$, $P_2$, and $P_3$ be the sources of iron ore, coal, and water, respectively. To produce one ton of steel, the steel works needs $m_1$ tons of iron, $m_2$ tons of coal, and $m_3$ tons of water. Assuming that the freight charge for a ton-kilometer is independent of the nature of the cargo, find the optimal position for the steel works connected with $P_1$, $P_2$, and $P_3$ via straight roads so as to minimize the transportation costs.
In the rest of the paper, this problem will be referred to as the generalized Fermat-Torricelli problem. Existence and uniqueness of its solution is guaranteed by the following result [4].
http://dx.doi.org/10.4169/amer.math.monthly.121.04.318
MSC: Primary 51N20
Theorem 1. Denote by $\alpha_1$, $\alpha_2$, and $\alpha_3$ the corner angles of the triangle $P_1P_2P_3$. If the conditions
$$\begin{cases} m_1^2 < m_2^2+m_3^2+2m_2m_3\cos\alpha_1,\\ m_2^2 < m_1^2+m_3^2+2m_1m_3\cos\alpha_2,\\ m_3^2 < m_1^2+m_2^2+2m_1m_2\cos\alpha_3 \end{cases} \tag{2}$$
are fulfilled, then there exists a unique solution $P_*=(x_*,y_*)\in\mathbb{R}^2$ for the generalized Fermat-Torricelli problem lying inside the triangle $P_1P_2P_3$. This point is a stationary point for the function $F(x,y)$, i.e., a real solution of the system
$$\sum_{j=1}^{3}\frac{m_j(x-x_j)}{\sqrt{(x-x_j)^2+(y-y_j)^2}}=0,\qquad \sum_{j=1}^{3}\frac{m_j(y-y_j)}{\sqrt{(x-x_j)^2+(y-y_j)^2}}=0. \tag{3}$$
If any of the conditions (2) are violated, then $F(x,y)$ attains its minimum value at the corresponding vertex of the triangle.
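As an illustration of how conditions (2) might be tested numerically, here is a minimal Python sketch. The function name `vertex_conditions` and the sample weights are assumptions made for this example only; the vertex angles are computed from dot products.

```python
import math

def vertex_conditions(points, weights):
    """Return the truth values of the three inequalities (2)."""
    (p1, p2, p3), (m1, m2, m3) = points, weights

    def angle(a, b, c):
        # Interior angle of the triangle at vertex a.
        ux, uy = b[0] - a[0], b[1] - a[1]
        vx, vy = c[0] - a[0], c[1] - a[1]
        cos_a = (ux * vx + uy * vy) / (math.hypot(ux, uy) * math.hypot(vx, vy))
        return math.acos(max(-1.0, min(1.0, cos_a)))

    a1, a2, a3 = angle(p1, p2, p3), angle(p2, p1, p3), angle(p3, p1, p2)
    return (
        m1**2 < m2**2 + m3**2 + 2 * m2 * m3 * math.cos(a1),
        m2**2 < m1**2 + m3**2 + 2 * m1 * m3 * math.cos(a2),
        m3**2 < m1**2 + m2**2 + 2 * m1 * m2 * math.cos(a3),
    )

triangle = [(2, 6), (1, 1), (5, 1)]
print(vertex_conditions(triangle, (3, 5, 4)))   # all inequalities hold
print(vertex_conditions(triangle, (10, 1, 1)))  # the first inequality fails
```

When an inequality fails, Theorem 1 places the minimum at the corresponding vertex, so a caller would branch on the returned tuple before looking for an interior stationary point.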
Let us overview some approaches for finding the point $P_*$, see [5]. For the general, i.e., unequal weighted case, see [3, 8].
The second approach is based on the mechanical model (sometimes incorrectly called Pólya's mechanical model): A horizontal board is drilled with holes at the points $P_1$, $P_2$, and $P_3$ (or at the vertices of a triangle similar to $P_1P_2P_3$). Three strings are tied together in a knot at one end, the loose ends are passed through the holes, and are attached to physical weights proportional to $m_1$, $m_2$, and $m_3$, respectively, below the board. The equilibrium position of the knot yields the solution [3].
The third approach, based on the gradient descent method, originated in the paper [11]; further developments and comments can be found in [7, 9].
The present paper is devoted to the fourth approach, the analytical one. We look for explicit expressions for the coordinates of the stationary point $P_*$ as functions of $\{m_j,x_j,y_j\}_{j=1}^{3}$. Although the existence of such a solution by radicals, i.e., in a finite number of operations like standard arithmetic ones and extraction of (positive integer) roots, is not questioned in any review article on the problem, we failed to find in the literature the constructive and universal version of an algorithm even for the classical, i.e., equal weighted, case.
2. ALGEBRA.
Theorem 2. Under the conditions (2), the coordinates of the stationary point $(x_*,y_*)$ of the function $F(x,y)$ are as follows:
$$x_*=\frac{K_1K_2K_3}{4|S|\sigma d}\left(\frac{x_1}{K_1}+\frac{x_2}{K_2}+\frac{x_3}{K_3}\right),\qquad y_*=\frac{K_1K_2K_3}{4|S|\sigma d}\left(\frac{y_1}{K_1}+\frac{y_2}{K_2}+\frac{y_3}{K_3}\right) \tag{4}$$
with
$$F(x_*,y_*)=\min_{(x,y)\in\mathbb{R}^2}F(x,y)=\sqrt{d}.$$
Here
$$d=\frac{1}{2\sigma}\left(m_1^2K_1+m_2^2K_2+m_3^2K_3\right), \text{ or alternatively,} \tag{5}$$
$$d=2|S|\sigma+\frac{1}{2}\left[m_1^2(r_{12}^2+r_{13}^2-r_{23}^2)+m_2^2(r_{23}^2+r_{12}^2-r_{13}^2)+m_3^2(r_{13}^2+r_{23}^2-r_{12}^2)\right], \tag{6}$$
$$r_{j\ell}=|P_jP_\ell|=\sqrt{(x_j-x_\ell)^2+(y_j-y_\ell)^2}\quad\text{for } \{j,\ell\}\subset\{1,2,3\},$$
$$S=x_1y_2+x_2y_3+x_3y_1-x_1y_3-x_3y_2-x_2y_1, \tag{7}$$
$$\sigma=\frac{1}{2}\sqrt{-m_1^4-m_2^4-m_3^4+2m_1^2m_2^2+2m_1^2m_3^2+2m_2^2m_3^2}, \tag{8}$$
and
$$\begin{cases} K_1=(r_{12}^2+r_{13}^2-r_{23}^2)\sigma+(m_2^2+m_3^2-m_1^2)|S|,\\ K_2=(r_{23}^2+r_{12}^2-r_{13}^2)\sigma+(m_1^2+m_3^2-m_2^2)|S|,\\ K_3=(r_{13}^2+r_{23}^2-r_{12}^2)\sigma+(m_1^2+m_2^2-m_3^2)|S|. \end{cases} \tag{9}$$
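The formulas (4)-(9) translate almost line by line into code. The following Python sketch is only an illustration: the function name `fermat_torricelli` is hypothetical, the conditions (2) are assumed to hold and are not checked, and the sample data are those of the second test row of Table 1.

```python
import math

def fermat_torricelli(points, weights):
    """Evaluate formulas (4)-(9); assumes the conditions (2) are fulfilled."""
    (x1, y1), (x2, y2), (x3, y3) = points
    m1, m2, m3 = weights
    # Squared side lengths r_{jl}^2.
    r12 = (x1 - x2) ** 2 + (y1 - y2) ** 2
    r13 = (x1 - x3) ** 2 + (y1 - y3) ** 2
    r23 = (x2 - x3) ** 2 + (y2 - y3) ** 2
    # |S| from (7) and sigma from (8).
    S = abs(x1 * y2 + x2 * y3 + x3 * y1 - x1 * y3 - x3 * y2 - x2 * y1)
    sigma = 0.5 * math.sqrt(-m1**4 - m2**4 - m3**4
                            + 2 * m1**2 * m2**2 + 2 * m1**2 * m3**2 + 2 * m2**2 * m3**2)
    # K_1, K_2, K_3 from (9).
    K1 = (r12 + r13 - r23) * sigma + (m2**2 + m3**2 - m1**2) * S
    K2 = (r23 + r12 - r13) * sigma + (m1**2 + m3**2 - m2**2) * S
    K3 = (r13 + r23 - r12) * sigma + (m1**2 + m2**2 - m3**2) * S
    # d from (5) and the coordinates from (4).
    d = (m1**2 * K1 + m2**2 * K2 + m3**2 * K3) / (2 * sigma)
    scale = K1 * K2 * K3 / (4 * S * sigma * d)
    x = scale * (x1 / K1 + x2 / K2 + x3 / K3)
    y = scale * (y1 / K1 + y2 / K2 + y3 / K3)
    return x, y, d

x, y, d = fermat_torricelli([(2, 6), (1, 1), (5, 1)], (3, 5, 4))
print(x, y, math.sqrt(d))  # Table 1 lists (751/485, 647/485) and sqrt(970)
```

Floating-point arithmetic is used here for brevity; a computer algebra system would be needed to recover the exact radical expressions.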
Proof. First, we establish the validity of the equality
$$K_1K_2+K_1K_3+K_2K_3=4|S|\sigma d, \tag{10}$$
and the dual equality
$$r_{23}^2K_1+r_{13}^2K_2+r_{12}^2K_3=2|S|d \tag{11}$$
for (5). Second, let us deduce the following relationships
$$\sqrt{(x_*-x_j)^2+(y_*-y_j)^2}=\frac{m_jK_j}{2\sigma\sqrt{d}}\quad\text{for } j\in\{1,2,3\}. \tag{12}$$
Here is the proof for the case $j=1$:
$$\begin{aligned}
(x_*-x_1)^2+(y_*-y_1)^2
&\overset{(10)}{=}\left(\frac{K_1K_2K_3}{4|S|\sigma d}\right)^2\left[\left(\frac{x_2}{K_2}+\frac{x_3}{K_3}-\frac{x_1}{K_2}-\frac{x_1}{K_3}\right)^2+\left(\frac{y_2}{K_2}+\frac{y_3}{K_3}-\frac{y_1}{K_2}-\frac{y_1}{K_3}\right)^2\right]\\
&=\left(\frac{K_1K_2K_3}{4|S|\sigma d}\right)^2\left[\frac{(x_2-x_1)^2+(y_2-y_1)^2}{K_2^2}+\frac{(x_3-x_1)^2+(y_3-y_1)^2}{K_3^2}+2\,\frac{(x_2-x_1)(x_3-x_1)+(y_2-y_1)(y_3-y_1)}{K_2K_3}\right]\\
&=\left(\frac{K_1K_2K_3}{4|S|\sigma d}\right)^2\left[\frac{r_{12}^2}{K_2^2}+\frac{r_{13}^2}{K_3^2}+2\,\frac{\tfrac12(r_{12}^2+r_{13}^2-r_{23}^2)}{K_2K_3}\right]\\
&=\frac{K_1^2}{(4|S|\sigma d)^2}\left[r_{12}^2K_3^2+r_{13}^2K_2^2+(r_{12}^2+r_{13}^2-r_{23}^2)K_2K_3\right]\\
&=\frac{K_1^2}{(4|S|\sigma d)^2}\left[(r_{12}^2K_3+r_{13}^2K_2)(K_2+K_3)-r_{23}^2K_2K_3\right]\\
&\overset{(11)}{=}\frac{K_1^2}{(4|S|\sigma d)^2}\left[(2|S|d-r_{23}^2K_1)(K_2+K_3)-r_{23}^2K_2K_3\right]\\
&=\frac{K_1^2}{(4|S|\sigma d)^2}\left[2|S|d(K_2+K_3)-r_{23}^2(K_1K_2+K_1K_3+K_2K_3)\right]\\
&\overset{(10)}{=}\frac{K_1^2}{(4|S|\sigma d)^2}\left[2|S|d(K_2+K_3)-4r_{23}^2|S|\sigma d\right]\\
&=\frac{2|S|dK_1^2}{(4|S|\sigma d)^2}\left[K_2+K_3-2r_{23}^2\sigma\right]\\
&\overset{(9)}{=}\frac{K_1^2}{8|S|\sigma^2d}\left[2m_1^2|S|\right]=\frac{m_1^2K_1^2}{4\sigma^2d}.
\end{aligned}$$
Similar arguments hold for $j\in\{2,3\}$ in (12). To complete the proof of these equalities, it should be additionally verified that the values $K_1$, $K_2$, and $K_3$ are nonnegative. This will be done in the next section.
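The identities (10) and (11) are easy to spot-check numerically. The sketch below is an illustration only, not part of the proof; it uses the vertices $(2,6)$, $(1,1)$, $(5,1)$ with weights $2, 3, 4$, and all variable names are chosen for this example.

```python
import math

# Vertices and weights chosen for the check (an assumption of this example).
(x1, y1), (x2, y2), (x3, y3) = (2, 6), (1, 1), (5, 1)
m1, m2, m3 = 2, 3, 4

r12 = (x1 - x2) ** 2 + (y1 - y2) ** 2  # squared distances r_{jl}^2
r13 = (x1 - x3) ** 2 + (y1 - y3) ** 2
r23 = (x2 - x3) ** 2 + (y2 - y3) ** 2
S = abs(x1 * y2 + x2 * y3 + x3 * y1 - x1 * y3 - x3 * y2 - x2 * y1)  # |S| from (7)
sigma = 0.5 * math.sqrt(-m1**4 - m2**4 - m3**4
                        + 2 * m1**2 * m2**2 + 2 * m1**2 * m3**2 + 2 * m2**2 * m3**2)
K1 = (r12 + r13 - r23) * sigma + (m2**2 + m3**2 - m1**2) * S  # (9)
K2 = (r23 + r12 - r13) * sigma + (m1**2 + m3**2 - m2**2) * S
K3 = (r13 + r23 - r12) * sigma + (m1**2 + m2**2 - m3**2) * S
d = (m1**2 * K1 + m2**2 * K2 + m3**2 * K3) / (2 * sigma)      # (5)

lhs10, rhs10 = K1 * K2 + K1 * K3 + K2 * K3, 4 * S * sigma * d
lhs11, rhs11 = r23 * K1 + r13 * K2 + r12 * K3, 2 * S * d
print(math.isclose(lhs10, rhs10), math.isclose(lhs11, rhs11))
```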
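To prove the first statement of the theorem, we will utilize the following alternative representation for $x_*$ and $y_*$:
$$x_*\overset{(10)}{=}\frac{1}{\dfrac{1}{K_1}+\dfrac{1}{K_2}+\dfrac{1}{K_3}}\left(\frac{x_1}{K_1}+\frac{x_2}{K_2}+\frac{x_3}{K_3}\right),\quad\text{and}\quad y_*\overset{(10)}{=}\frac{1}{\dfrac{1}{K_1}+\dfrac{1}{K_2}+\dfrac{1}{K_3}}\left(\frac{y_1}{K_1}+\frac{y_2}{K_2}+\frac{y_3}{K_3}\right). \tag{13}$$
We substitute (4) into the left-hand side of the first equation of (3). The resulting expression can be reduced with the aid of (12) to
$$\frac{x_*-x_1}{K_1}+\frac{x_*-x_2}{K_2}+\frac{x_*-x_3}{K_3}=x_*\left(\frac{1}{K_1}+\frac{1}{K_2}+\frac{1}{K_3}\right)-\left(\frac{x_1}{K_1}+\frac{x_2}{K_2}+\frac{x_3}{K_3}\right)\overset{(13)}{=}0.$$
Similar arguments are valid for the second equation from (3). Finally, we compute $F(x_*,y_*)$:
$$F(x_*,y_*)=\sum_{j=1}^{3}m_j\sqrt{(x_*-x_j)^2+(y_*-y_j)^2}\overset{(12)}{=}\sum_{j=1}^{3}\frac{m_j^2K_j}{2\sigma\sqrt{d}}\overset{(5)}{=}\frac{2\sigma d}{2\sigma\sqrt{d}}=\sqrt{d}.$$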
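Some test values are provided in Table 1.

| No. | $P_1$ | $P_2$ | $P_3$ | $m_1$ | $m_2$ | $m_3$ | $P_*$ | $\sqrt{d}$ |
|---|---|---|---|---|---|---|---|---|
| 1 | $(2,6)$ | $(1,1)$ | $(5,1)$ | $2$ | $3$ | $4$ | $\left(\frac{4103+1833\sqrt{15}}{2866},\ \frac{29523-4481\sqrt{15}}{8598}\right)\approx(3.9086,\ 1.4152)$ | $2\sqrt{79+15\sqrt{15}}\approx 23.4174$ |
| 2 | $(2,6)$ | $(1,1)$ | $(5,1)$ | $3$ | $5$ | $4$ | $\left(\frac{751}{485},\ \frac{647}{485}\right)\approx(1.5484,\ 1.3340)$ | $\sqrt{970}\approx 31.1448$ |
| 3 | $(0,0)$ | $(2,0)$ | $(-\sqrt{2},\sqrt{2})$ | $3/2$ | $2$ | $2$ | $\left(1-\frac{1}{\sqrt{2}}-\frac{3}{\sqrt{110}},\ \frac{1}{\sqrt{2}}-\frac{3}{\sqrt{55}}-\frac{3}{\sqrt{110}}\right)\approx(0.0068,\ 0.0165)$ | $\sqrt{32+\frac{23}{\sqrt{2}}+3\sqrt{\frac{55}{2}}}\approx 7.9997$ |

Table 1.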
3. GEOMETRY. Let us give an interpretation for some constants that appeared in Theorem 2. First, on rewriting (7) in determinantal form
$$S=\begin{vmatrix}1&1&1\\x_1&x_2&x_3\\y_1&y_2&y_3\end{vmatrix},$$
we recognize that $|S|=2S_{P_1P_2P_3}$, where $S_{P_1P_2P_3}$ stands for the area of triangle $P_1P_2P_3$. As for the constant (8), factorization of the radicand on the right-hand side leads to the form
$$\sigma=2\left[\frac{m_1+m_2+m_3}{2}\left(\frac{m_1+m_2+m_3}{2}-m_1\right)\left(\frac{m_1+m_2+m_3}{2}-m_2\right)\left(\frac{m_1+m_2+m_3}{2}-m_3\right)\right]^{1/2},$$
which can be treated as the Heron formula for twice the area of a triangle formed by the triple of weights $m_1$, $m_2$, and $m_3$. Under the restrictions (2), such a triangle exists. Construct this triangle and denote its angles, as shown in Figure 1.
[Figure 1. Two triangles generated by the problem: the triangle $P_1P_2P_3$ with angles $\alpha_1,\alpha_2,\alpha_3$, and the triangle with sides $m_1,m_2,m_3$ and angles $\beta_1,\beta_2,\beta_3$, where $\beta_j$ lies opposite the side $m_j$.]
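The identification of the constant (8) with twice the Heron area can be checked numerically. The helper names in this short Python sketch are hypothetical and serve only the illustration.

```python
import math

def sigma_from_8(m1, m2, m3):
    """The constant sigma as written in formula (8)."""
    return 0.5 * math.sqrt(-m1**4 - m2**4 - m3**4
                           + 2 * m1**2 * m2**2 + 2 * m1**2 * m3**2 + 2 * m2**2 * m3**2)

def twice_heron_area(m1, m2, m3):
    """Twice the area of the triangle with side lengths m1, m2, m3 (Heron)."""
    s = (m1 + m2 + m3) / 2
    return 2 * math.sqrt(s * (s - m1) * (s - m2) * (s - m3))

for sides in [(3, 5, 4), (2, 3, 4), (1.5, 2, 2)]:
    print(sides, sigma_from_8(*sides), twice_heron_area(*sides))
```

For the sides $3, 4, 5$ both expressions give $12$, twice the area of the right triangle with legs $3$ and $4$.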
The first formula from (9) can thus be represented with the aid of the law of cosines as
$$K_1=|S|\sigma\left[\frac{r_{12}^2+r_{13}^2-r_{23}^2}{|S|}+\frac{m_2^2+m_3^2-m_1^2}{\sigma}\right]=|S|\sigma\left[\frac{2r_{12}r_{13}\cos\alpha_1}{|S|}+\frac{2m_2m_3\cos\beta_1}{\sigma}\right]=2|S|\sigma(\cot\alpha_1+\cot\beta_1),$$
where $\beta_1$ denotes the angle opposite the side $m_1$ in the triangle formed by the weights. Rewriting the first condition from (2) in the form $\cos\alpha_1+\cos\beta_1>0$, we can conclude that $\cot\alpha_1+\cot\beta_1>0$ and, thus, $K_1>0$. In a similar way, the expressions for $K_2$ and $K_3$ can be deduced, and we can establish that, under the restrictions (2), they are both positive. This completes the proof of Theorem 2.
Remark 1. We set the dual generalized Fermat-Torricelli problem. Let the triangle be composed of the sides with the lengths equal to $m_1$, $m_2$, and $m_3$; let the weights $r_{12}$, $r_{23}$, and $r_{13}$ be placed in its vertices, as shown in Figure 2.
[Figure 2. Dual problem]
The minimum value for the objective function will be the same as in the direct problem, since (6) is equivalent to
$$2|S|\sigma+\frac{1}{2}\left[r_{12}^2(m_1^2+m_2^2-m_3^2)+r_{13}^2(m_1^2+m_3^2-m_2^2)+r_{23}^2(m_2^2+m_3^2-m_1^2)\right].$$
4. CLASSICAL FERMAT-TORRICELLI PROBLEM. Consider now the equal weighted case $m_1=m_2=m_3=1$.
Theorem 3. Let all the angles of the triangle $P_1P_2P_3$ be less than $2\pi/3$, or, equivalently,
$$r_{12}^2+r_{13}^2+r_{12}r_{13}-r_{23}^2>0,\quad r_{23}^2+r_{12}^2+r_{12}r_{23}-r_{13}^2>0,\quad r_{13}^2+r_{23}^2+r_{13}r_{23}-r_{12}^2>0.$$
The coordinates of the Fermat-Torricelli point for this triangle are as follows:
$$x_*=\frac{k_1k_2k_3}{2\sqrt{3}|S|d}\left(\frac{x_1}{k_1}+\frac{x_2}{k_2}+\frac{x_3}{k_3}\right),\qquad y_*=\frac{k_1k_2k_3}{2\sqrt{3}|S|d}\left(\frac{y_1}{k_1}+\frac{y_2}{k_2}+\frac{y_3}{k_3}\right), \tag{14}$$
with the corresponding minimum value of the objective function
$$F(x_*,y_*)=\min_{(x,y)\in\mathbb{R}^2}\sum_{j=1}^{3}\sqrt{(x-x_j)^2+(y-y_j)^2}=\sqrt{d}.$$
Here,
$$d=\frac{1}{\sqrt{3}}(k_1+k_2+k_3)=\frac{r_{12}^2+r_{13}^2+r_{23}^2}{2}+\sqrt{3}\,|S| \tag{15}$$
and
$$k_1=\frac{\sqrt{3}}{2}(r_{12}^2+r_{13}^2-r_{23}^2)+|S|,\quad k_2=\frac{\sqrt{3}}{2}(r_{23}^2+r_{12}^2-r_{13}^2)+|S|,\quad k_3=\frac{\sqrt{3}}{2}(r_{13}^2+r_{23}^2-r_{12}^2)+|S|,$$
with the rest of the parameters coinciding with those from Theorem 2.
It turns out that the right-hand sides of the expressions (14), being represented as rational fractions with respect to $\{x_j,y_j\}_{j=1}^{3}$, can be reduced further to the form where denominators become area free.
Corollary. Under conditions of Theorem 3, the coordinates of the Fermat-Torricelli point are as follows:
$$x_*=\frac{1}{2\sqrt{3}\,d}\left[(x_1+x_2+x_3)|S|+\sqrt{3}\left(x_1r_{23}^2+x_2r_{13}^2+x_3r_{12}^2\right)+3\,\mathrm{sgn}(S)\begin{vmatrix}1&1&1\\y_1&y_2&y_3\\x_2x_3+y_2y_3&x_1x_3+y_1y_3&x_1x_2+y_1y_2\end{vmatrix}\right], \tag{16}$$
$$y_*=\frac{1}{2\sqrt{3}\,d}\left[(y_1+y_2+y_3)|S|+\sqrt{3}\left(y_1r_{23}^2+y_2r_{13}^2+y_3r_{12}^2\right)-3\,\mathrm{sgn}(S)\begin{vmatrix}1&1&1\\x_1&x_2&x_3\\x_2x_3+y_2y_3&x_1x_3+y_1y_3&x_1x_2+y_1y_2\end{vmatrix}\right]. \tag{17}$$
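For the equal weighted case, formulas (14)-(15) of Theorem 3 can be evaluated directly. The sketch below is an illustration under the assumption that all angles of the triangle are below $2\pi/3$ (this is not checked); the names `classical_fermat_point` and `unit_vector_sum` are hypothetical. The second helper evaluates the left-hand sides of (3) with unit weights, which should vanish at the computed point.

```python
import math

def classical_fermat_point(p1, p2, p3):
    """Formulas (14)-(15); assumes every angle of the triangle is below 2*pi/3."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    r12 = (x1 - x2) ** 2 + (y1 - y2) ** 2  # squared side lengths
    r13 = (x1 - x3) ** 2 + (y1 - y3) ** 2
    r23 = (x2 - x3) ** 2 + (y2 - y3) ** 2
    S = abs(x1 * y2 + x2 * y3 + x3 * y1 - x1 * y3 - x3 * y2 - x2 * y1)
    h = math.sqrt(3) / 2
    k1 = h * (r12 + r13 - r23) + S
    k2 = h * (r23 + r12 - r13) + S
    k3 = h * (r13 + r23 - r12) + S
    d = (k1 + k2 + k3) / math.sqrt(3)                     # (15)
    scale = k1 * k2 * k3 / (2 * math.sqrt(3) * S * d)     # prefactor in (14)
    x = scale * (x1 / k1 + x2 / k2 + x3 / k3)
    y = scale * (y1 / k1 + y2 / k2 + y3 / k3)
    return x, y, d

def unit_vector_sum(p, vertices):
    """Left-hand sides of (3) for unit weights, evaluated at the point p."""
    sx = sum((p[0] - vx) / math.dist(p, (vx, vy)) for vx, vy in vertices)
    sy = sum((p[1] - vy) / math.dist(p, (vx, vy)) for vx, vy in vertices)
    return sx, sy

tri = [(2, 6), (1, 1), (5, 1)]
x, y, d = classical_fermat_point(*tri)
print((x, y), math.sqrt(d), unit_vector_sum((x, y), tri))
```

For the right isosceles triangle with vertices $(0,0)$, $(1,0)$, $(0,1)$ the same function returns $x_*=y_*=(3-\sqrt{3})/6$, a value that can also be obtained by elementary symmetry arguments.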
Remark 2. The result of the last corollary can be extended to the generalized Fermat-Torricelli problem. Numerators and denominators in the right-hand sides of the formulas (4) can be reduced by the common factor $|S|$. We do not present the resulting expressions here, since they are inelegantly cumbersome.
Remark 3. One of the referees of the present paper suggested that the author provide some motivation or insight of how he found the explicit expressions in Theorem 2. Frankly speaking, the historical development of the investigation went in the direction opposite to what has been presented up to this point. First, the formulas (16)-(17) were obtained as the solution of a linear system of equations arising from the feature of the Fermat-Torricelli point to make an angle of $2\pi/3$ with any two vertices of the triangle. Next, in a similar way, the formulas mentioned in Remark 2 were obtained for the generalized Fermat-Torricelli problem, i.e., for the coordinates $x_*$, $y_*$. Although these formulas looked awful, they permitted us to deduce the explicit expression (6) for the value of minimal distance. Moreover, we noticed the appearance of this value in the expressions for denominators of the formulas for $x_*$ and $y_*$. Next, we intended to perform an additional verification of the obtained results via direct substitution into the equations (3). At this moment, the following lucky guess came to mind: the radicand of
$$\sqrt{(x_*-x_j)^2+(y_*-y_j)^2}$$
should be a perfect square! The only remaining trick was to discover the values (9).
5. INVERSE PROBLEM. Given the coordinates of the point $P_*=(x_*,y_*)$, we wish to find the values for the weights $m_1$, $m_2$, and $m_3$ with the aim for the corresponding objective function (1) to possess a minimum point precisely at $P_*$.
Theorem 4. Let the vertices of the triangle $P_1P_2P_3$ be counted counterclockwise. Then for the choice
$$m_1=|P_*P_1|\begin{vmatrix}1&1&1\\x_*&x_2&x_3\\y_*&y_2&y_3\end{vmatrix},\quad m_2=|P_*P_2|\begin{vmatrix}1&1&1\\x_1&x_*&x_3\\y_1&y_*&y_3\end{vmatrix},\quad\text{and}\quad m_3=|P_*P_3|\begin{vmatrix}1&1&1\\x_1&x_2&x_*\\y_1&y_2&y_*\end{vmatrix} \tag{18}$$
the function
$$F(x,y)=\sum_{j=1}^{3}m_j\sqrt{(x-x_j)^2+(y-y_j)^2}$$
has its stationary point at $P_*=(x_*,y_*)$. If $P_*$ lies inside the triangle $P_1P_2P_3$, then
$$F(x_*,y_*)=\min_{(x,y)\in\mathbb{R}^2}F(x,y)=\begin{vmatrix}1&1&1&1\\x_*&x_1&x_2&x_3\\y_*&y_1&y_2&y_3\\x_*^2+y_*^2&x_1^2+y_1^2&x_2^2+y_2^2&x_3^2+y_3^2\end{vmatrix}. \tag{19}$$
Proof. Substitute $x=x_*$, $y=y_*$ and the values (18) into the left-hand side of the first equation from (3) as follows:
$$(x_*-x_1)\begin{vmatrix}1&1&1\\x_*&x_2&x_3\\y_*&y_2&y_3\end{vmatrix}+(x_*-x_2)\begin{vmatrix}1&1&1\\x_1&x_*&x_3\\y_1&y_*&y_3\end{vmatrix}+(x_*-x_3)\begin{vmatrix}1&1&1\\x_1&x_2&x_*\\y_1&y_2&y_*\end{vmatrix}. \tag{20}$$
Represent this combination of the third-order determinants in the form of the fourth-order determinant, namely
$$\begin{vmatrix}1&1&1&1\\x_*&x_1&x_2&x_3\\y_*&y_1&y_2&y_3\\0&x_*-x_1&x_*-x_2&x_*-x_3\end{vmatrix}$$
(expansion by its last row coincides with (20)). Now add the second row to the last to obtain the following:
$$\begin{vmatrix}1&1&1&1\\x_*&x_1&x_2&x_3\\y_*&y_1&y_2&y_3\\x_*&x_*&x_*&x_*\end{vmatrix}.$$
In this determinant, the first row is proportional to the last one; therefore, the determinant equals just zero. The second equality from (3) can be verified in a similar manner.
Let us evaluate $F(x_*,y_*)$:
$$\begin{aligned}F(x_*,y_*)={}&\left[(x_*-x_1)^2+(y_*-y_1)^2\right]\begin{vmatrix}1&1&1\\x_*&x_2&x_3\\y_*&y_2&y_3\end{vmatrix}+\left[(x_*-x_2)^2+(y_*-y_2)^2\right]\begin{vmatrix}1&1&1\\x_1&x_*&x_3\\y_1&y_*&y_3\end{vmatrix}\\&+\left[(x_*-x_3)^2+(y_*-y_3)^2\right]\begin{vmatrix}1&1&1\\x_1&x_2&x_*\\y_1&y_2&y_*\end{vmatrix}.\end{aligned}$$
To prove the equality (19), let us split it into the $x$-part and the $y$-part. First, keep the $x$-terms in brackets of the previous formula:
$$(x_*-x_1)^2\begin{vmatrix}1&1&1\\x_*&x_2&x_3\\y_*&y_2&y_3\end{vmatrix}+(x_*-x_2)^2\begin{vmatrix}1&1&1\\x_1&x_*&x_3\\y_1&y_*&y_3\end{vmatrix}+(x_*-x_3)^2\begin{vmatrix}1&1&1\\x_1&x_2&x_*\\y_1&y_2&y_*\end{vmatrix}.$$
Similar to the proof of the first part of the theorem, represent this linear combination as the determinant of the fourth order:
$$\begin{vmatrix}1&1&1&1\\x_*&x_1&x_2&x_3\\y_*&y_1&y_2&y_3\\0&(x_*-x_1)^2&(x_*-x_2)^2&(x_*-x_3)^2\end{vmatrix}.$$
Multiply the first row by $-x_*^2$ and the second row by $2x_*$, and add both to the last row to obtain
$$\begin{vmatrix}1&1&1&1\\x_*&x_1&x_2&x_3\\y_*&y_1&y_2&y_3\\x_*^2&x_1^2&x_2^2&x_3^2\end{vmatrix}. \tag{21}$$
The $y$-part of the equality (19) can be proven in exactly the same manner with the resulting determinant differing from (21) only in its last row. The linear property of the determinant with respect to its rows completes the proof of (19).
Remark 4. The solution of the inverse problem is determined up to a common positive multiplier, i.e., the solution triple $(m_1,m_2,m_3)$ is defined by the value of the ratio $m_1:m_2:m_3$. (In the language of the facility location problem mentioned in the Introduction, this statement is equivalent to the fact that the optimal position of the steel works is independent of the currency of the state.) Up to this remark, the solution of the inverse problem is unique. We have proven this statement via direct computations starting from formulas (4).
Example 1. Let $P_1=(2,6)$, $P_2=(1,1)$, $P_3=(5,1)$, and
$$P_*=\left(\frac{1}{2866}\left(4103+1833\sqrt{15}\right),\ \frac{1}{8598}\left(29523-4481\sqrt{15}\right)\right).$$
Find the values for the weights $m_1$, $m_2$, and $m_3$ from Theorem 4.
Solution. Formulas (18) give:
$$m_1=\frac{2(20925-4481\sqrt{15})}{18481401}\sqrt{316380606+35999826\sqrt{15}},$$
$$m_2=\frac{2(15105-2342\sqrt{15})}{6160467}\sqrt{75400161-9169767\sqrt{15}},$$
and
$$m_3=\frac{8(-1185+15988\sqrt{15})}{18481401}\sqrt{8335761-2050623\sqrt{15}},$$
with
$$F(x_*,y_*)=\frac{1}{4299}\left(-333980+193436\sqrt{15}\right).$$
Now, compare the obtained result with the one represented in test 1 from Section 2. According to Remark 4, we might expect that
$$m_1:m_2:m_3=2:3:4.$$
We leave the verification of this fact as an exercise for the inquisitive reader.
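A numerical version of this exercise can be sketched in a few lines of Python. The helper names `det3` and `inverse_weights` are hypothetical; the code evaluates formulas (18) in floating point for the data of Example 1 and prints the weights divided by $2$, $3$, $4$, which should coincide if the expected ratio holds.

```python
import math

def det3(a, b, c):
    """Determinant |1 1 1; a_x b_x c_x; a_y b_y c_y|, twice the signed area of abc."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])

def inverse_weights(p_star, p1, p2, p3):
    """Weights from formulas (18); the vertices are assumed counterclockwise."""
    m1 = math.dist(p_star, p1) * det3(p_star, p2, p3)
    m2 = math.dist(p_star, p2) * det3(p1, p_star, p3)
    m3 = math.dist(p_star, p3) * det3(p1, p2, p_star)
    return m1, m2, m3

s15 = math.sqrt(15)
p1, p2, p3 = (2, 6), (1, 1), (5, 1)
p_star = ((4103 + 1833 * s15) / 2866, (29523 - 4481 * s15) / 8598)

m1, m2, m3 = inverse_weights(p_star, p1, p2, p3)
print(m1 / 2, m2 / 3, m3 / 4)  # three equal numbers if m1 : m2 : m3 = 2 : 3 : 4

# Value of the objective function at P_*, to compare with the closed form above.
F = sum(m * math.dist(p_star, p) for m, p in zip((m1, m2, m3), (p1, p2, p3)))
print(F, (-333980 + 193436 * s15) / 4299)
```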
The next example originated from the question posed by one of the referees of the present paper: What will happen to the result of Theorem 4 if we take $P_*=P_j$?
Example 2. Show how to choose the values for the weights $m_1$, $m_2$, and $m_3$ in order for the point $P_*$ to coincide with the given point on a side of the triangle from Example 1.
Solution. If we take $P_*=P_2$, the formulas (18) give zero values for all the weights; however, the weights of these zeros are different. To explain this casuistry, take $P_*=P_2+(\varepsilon,\varepsilon)$ for the infinitely small $\varepsilon>0$. For this case, formulas (18) give:
$$m_1(\varepsilon)=4\varepsilon\sqrt{26-12\varepsilon+2\varepsilon^2}=4\sqrt{26}\,\varepsilon+o(\varepsilon),$$
$$m_2(\varepsilon)=4\sqrt{2}\,\varepsilon(5-2\varepsilon),$$
$$m_3(\varepsilon)=4\varepsilon\sqrt{16-8\varepsilon+2\varepsilon^2}=16\varepsilon+o(\varepsilon).$$
The weight $m_2(\varepsilon)$ dominates over $m_1(\varepsilon)$ and $m_3(\varepsilon)$ when $\varepsilon\to+0$. As a matter of fact, the true values of these weights do not influence the position of the point $P_*$; the latter depends only on the value of the ratio $m_1(\varepsilon):m_2(\varepsilon):m_3(\varepsilon)$. Thus, the choice $m_1=4\sqrt{26}$, $m_2=20\sqrt{2}$, $m_3=16$ provides us with $P_*=P_2$.
Let us now manipulate the weights with the aim of extruding the point $P_*$ to an internal point of the side $P_2P_3$. This manipulation is not trivial, as in the previous case. First, we utilize formulas (18) and then simplify the obtained result with the aid of formulas (4). Finally, the variable weights
m
1
() = t , m
2
() = 1 +, m
3
() = 1
with a xed t >
_
2
10
t
2
4
, 1
_
.
Thus, the two essential weights m
2
() and m
3
() guarantee delivery of P
to the
side P
2
P
3
, while the negligible weight m
1
() ensures the ne-tuning of this delivery
to the particular point within the open line segment P
2
P
. Here P
1
equals twice the product of the distance | P
1
P
P
2
P
3
.
The first statement of the theorem is equivalent to the geometrical equality
$$\overrightarrow{P_*P_1}\,S_{P_*P_2P_3}+\overrightarrow{P_*P_2}\,S_{P_*P_3P_1}+\overrightarrow{P_*P_3}\,S_{P_*P_1P_2}=\overrightarrow{O}.$$
Finally, the constant (19) is connected with
$$h=-\frac{1}{S}\begin{vmatrix}1&1&1&1\\x_*&x_1&x_2&x_3\\y_*&y_1&y_2&y_3\\x_*^2+y_*^2&x_1^2+y_1^2&x_2^2+y_2^2&x_3^2+y_3^2\end{vmatrix},$$
which is known [10, pp. 251-252] as the power of the point $P_*$
|
2
|CP
j
|
2
for j {1, 2, 3}, (22)
and, provided that P
$$V=\begin{vmatrix}1&1&1&1\\x_1&x_2&x_3&x_4\\y_1&y_2&y_3&y_4\\z_1&z_2&z_3&z_4\end{vmatrix} \tag{23}$$
is positive. Then for the choice
$$\left\{m_j=|P_*P_j|\,V_j\right\}_{j=1}^{4}, \tag{24}$$
where $V_j$ equals the determinant obtained on replacing the $j$th column of (23) by the column $[1,x_*,y_*,z_*]^{\top}$ (here $^{\top}$ denotes transposition), the function
$$F(x,y,z)=\sum_{j=1}^{4}m_j\sqrt{(x-x_j)^2+(y-y_j)^2+(z-z_j)^2}$$
has its stationary point at $P_*=(x_*,y_*,z_*)$. If $P_*$ lies inside the tetrahedron $P_1P_2P_3P_4$, then
$$F(x_*,y_*,z_*)=\min_{(x,y,z)\in\mathbb{R}^3}F(x,y,z)=-\begin{vmatrix}1&1&1&1&1\\x_*&x_1&x_2&x_3&x_4\\y_*&y_1&y_2&y_3&y_4\\z_*&z_1&z_2&z_3&z_4\\x_*^2+y_*^2+z_*^2&x_1^2+y_1^2+z_1^2&x_2^2+y_2^2+z_2^2&x_3^2+y_3^2+z_3^2&x_4^2+y_4^2+z_4^2\end{vmatrix}. \tag{25}$$
Geometrical meanings of the values appearing in the last theorem are similar to their counterparts from Theorem 4. For instance, the value (23) equals six times the volume of tetrahedron $P_1P_2P_3P_4$, while the value (25) divided by $V$ is known [10, p. 255] as the power of the point $P_*$
H and (
H) (X \ H) = .
Hence, (ii) follows.
Consider (ii) and take a net (A, ) in X. The family = {(a)| a A} is a
lter base with a -cluster point, say p X. Let H be a closed neighborhood of p and
let a A. Then H (a) = , so there is some b A, b a, with (b) H. It
means that p is a -cluster point of (A, ) and (iii) holds.
Assume (iii), and take an open cover of X. Let
F
be the family of all nite
unions of elements of . The family
F
is directed by the set inclusion. Suppose that
for every U
F
, the set X \ cl U is nonempty, so it contains some element (U).
The net (
F
, ) has a -cluster point, say p X. Since
F
is also a cover, there
is some V
F
containing p. By the denition of the -cluster point, there exists
W
F
, W V, such that (W) cl V. But it also holds that (W) X \ cl W, so
= (X \ cl W) V (X \ W) W, which is not possible. Then some element of
F
must be dense in X.
Finally, suppose (iv). Let be an open lter base in X with no cluster point. Then
i
(x
i
) = u
i
(s
1
, s
2
. . . , s
i 1
, x
i
,
s
i +1
, . . . , s
n
). Suppose, for a moment, that there exist some t L with u
i
( p) <
u
i
(t ). Take c R such that u
i
( p) < c < u
i
(t ). Because of continuity of u
i
, H =
u
1
i
((, c]) is a closed set in X
i
whose interior contains p. Since p is a -
cluster point of l, there exists s L, with s t , such that s = l(s) H. But
April 2014] UNDOMINATED STRATEGIES IN NORMAL FORM GAMES 335
then u
i
(s) (, c], which is not possible, because the relation s t means that
c < u
i
(t ) u
i
(s). Consequently, p is an upper bound of L. By Zorns Lemma, there
is a maximal element m in the preordered set (X
i
, ). This completes the proof, since
the strategy, maximal with respect to , cannot be dominated.
Note that Zorn's Lemma is usually formulated for partially ordered sets. Using preordered sets, its appropriate formulation can be found in [7]. Hence, the maximality of $m$, which is claimed in our proof, is maximality up to the equivalence of strategies. It means that there may exist another strategy $m'\in X_i$, different from $m$ (and also maximal), on which the utility function $u_i$ has the same values.
Since every compact topological space is almost compact, the classical version of Moulin's theorem now follows as a corollary. The reader can also compare Theorem 3.1 with other interesting results and techniques known from the game theory literature. For instance, H. Salonen in [12] replaced the continuity of the utility function by its upper semi-continuity. He essentially used a characterization of compactness by the centered collections of sets (in other words, having the finite intersection property, [10]), or filters and filter bases, which are topologically equivalent to nets. A similar technique was also used in [11] for iteratively undominated strategies with the continuous utility function.
Now, let us check the advantage of Theorem 3.1 over its original, classical version.
Example 3.2. Consider the game already described in Example 3.1. Let us define another topology on $X_i$, where $i=1,2$, by the local base of a general point $(x,y)\in X_i$:
(i) the point $(0,0)$ has neighborhoods of the form $[0,\varepsilon)\times\{0\}$, $0<\varepsilon<1$,
(ii) for every $x\in(0,1)$, the point $(x,0)$ has neighborhoods of the form $(x-\varepsilon,x+\varepsilon)\times\{0\}$, where $0<\varepsilon<\min\{x,1-x\}$,
(iii) for every $n=0,1,\ldots$, the point $(1,n)$ has neighborhoods having the form $(1-\varepsilon,1)\times\{0\}\cup\{(1,n)\}$, where $0<\varepsilon<1$.
The new topology on $X_i$ is now similar to the Euclidean topology on the unit segment $[0,1]$, but with one important difference: the right end point of the segment is present infinitely many times. The space $X_i$ is $T_1$, but certainly non-Hausdorff and noncompact. Indeed, denoting $Y_n=[0,1)\times\{0\}\cup\{(1,n)\}$, the family $\{Y_n \mid n=0,1,\ldots\}$ is an open cover of $X_i$, having no finite subcover. However, we can show that the new topology is almost compact. Take an open cover of $X_i$. The subspace $Y_0=[0,1]\times\{0\}\subseteq X_i$ is compact since it is homeomorphic to the unit segment $[0,1]$, so there exists a finite subfamily $\{U_1,U_2,\ldots,U_k\}$ of this cover with $Y_0\subseteq\bigcup_{j=1}^{k}U_j$. Then there is $r\in\{1,2,\ldots,k\}$ such that $(1,0)\in U_r$. But for every $n=1,2,\ldots$, it follows that $(1,n)\in\operatorname{cl}U_r$, so the closures of $\{U_1,U_2,\ldots,U_k\}$ cover $X_i$. By condition (iv) of Lemma 3.1, $X_i$ is almost compact. The utility functions $u_i$ are continuous functions of the argument $(x_i,y_i)$, since they are continuous on the open subspaces $Y_n=[0,1)\times\{0\}\cup\{(1,n)\}$ of $X_i$, $n=0,1,\ldots$, homeomorphic to $[0,1]$. Hence, the existence of the undominated strategies now follows from Theorem 3.1. Note that similar spaces as $X_i$ are also known as examples of non-Hausdorff manifolds with some motivation in sheaf theory and mathematical physics (see, for example, [6] or [4]).
ACKNOWLEDGMENTS. The authors are very grateful to both anonymous referees for many valuable suggestions and comments, in particular related to the game-theoretical part of the content of our paper, and to the editor for his assistance with preparation of the final form of the manuscript. The authors are also thankful to Professor V. A. Gorelik from the Dorodnitsyn Computing Center of the Russian Academy of Sciences for his advice on finding an appropriate game-theoretical literature at the initial stage of their work. This work is supported by a specific research grant FEKT-S-11-2/921 of the Faculty of Electrical Engineering and Communication, Brno University of Technology.
REFERENCES
1. Á. Császár, General Topology. Akadémiai Kiadó, Budapest, 1978.
2. D. Fudenberg, J. Tirole, Game Theory. MIT Press, Cambridge, 1991.
3. R. Engelking, General Topology. Heldermann Verlag, Berlin, 1989.
4. M. Heller, L. Pysiak, W. Sasin, Geometry of non-Hausdorff spaces and its significance for physics, J. Math. Phys. 52 (2011) 1-7, available at http://dx.doi.org/10.1063/1.3574352.
5. D. S. Janković, θ-regular spaces, Internat. J. Math. Sci. 8 (1985) 615-619, available at http://dx.doi.org/10.1155/S0161171285000667.
6. S. L. Kent, R. A. Mimna, J. K. Tartir, A note on topological properties of non-Hausdorff manifolds, Internat. J. Math. Sci. (2009) 1-4, available at http://dx.doi.org/10.1155/2009/891785.
7. R. E. Megginson, An Introduction to Banach Space Theory. Springer-Verlag, Berlin, 1998.
8. H. Moulin, Théorie des Jeux pour l'Économie et la Politique. Hermann Paris, Collection Méthodes, Paris, 1981.
9. H. Moulin, Game Theory for the Social Sciences. Second and revised edition. New York University Press, New York, 1986.
10. J. Nagata, Modern General Topology. North-Holland, Amsterdam, 1974.
11. K. Ritzberger, Foundations of Non-Cooperative Game Theory. Oxford University Press, Oxford, 2002.
12. H. Salonen, On the existence of undominated Nash equilibria in normal form games, Games and Economic Behavior 14 (1996) 208-219, available at http://dx.doi.org/10.1006/game.1996.0049.
13. W. J. Thron, Topological Structures. Holt, Rinehart and Winston, New York, 1966.
14. N. V. Veličko, H-closed topological spaces, Mat. Sb. 70(112) (1966) 98-102 (Russian).
15. S. Vickers, Topology Via Logic. Cambridge University Press, Cambridge, 1989.
MARTIN KOV
$$e^x=\sum_{j=0}^{\infty}\frac{x^j}{j!}=1+x+\frac{x^2}{2!}+\frac{x^3}{3!}+\cdots\quad\text{for all } x\in\mathbb{R}. \tag{2}$$
Setting $x=1$ in (2) and choosing a large value of $n$, we obtain the partial sum
$$\sum_{j=0}^{n}\frac{1}{j!}=1+1+\frac{1}{2!}+\frac{1}{3!}+\cdots+\frac{1}{n!},$$
which gives a simple, direct approximation to $e$ that is the best way of calculating $e$ to high accuracy [1, 2]. Present numerical values of $e$ are derived using either optimized versions of this Maclaurin series (2) or the continued-fraction expansion approach initiated by Euler [2].
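To see how quickly these partial sums approach $e$, one can evaluate them in exact rational arithmetic. The following Python sketch is purely illustrative; the function name `partial_sum` is chosen for this example.

```python
from fractions import Fraction
from math import e, factorial

def partial_sum(n):
    """Exact value of the partial sum 1/0! + 1/1! + ... + 1/n!."""
    return sum(Fraction(1, factorial(j)) for j in range(n + 1))

for n in (5, 10, 15):
    approx = float(partial_sum(n))
    print(n, approx, abs(e - approx))
```

The printed errors shrink roughly like $1/(n+1)!$, which is what the remainder of the series (2) suggests.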
A further classical approach to approximating $e$ uses the Maclaurin series expansion of $\ln(1+x)$:
$$\ln(1+x)=\sum_{j=1}^{\infty}\frac{(-1)^{j-1}}{j}x^j\quad\text{for } -1<x\le 1. \tag{3}$$
http://dx.doi.org/10.4169/amer.math.monthly.121.04.338
MSC: Primary 05A17, Secondary 11P81
The only example of this alternative approach (that the authors of [2] have found in the literature) is given by replacing $x$ with $1/x$ in (3) and then multiplying the resulting series by $x$ to get
$$x\ln\left(1+\frac{1}{x}\right)=1-\frac{1}{2x}+\frac{1}{3x^2}-\frac{1}{4x^3}+\frac{1}{5x^4}-\frac{1}{6x^5}+\frac{1}{7x^6}-\cdots \tag{4}$$
for $x<-1$ or $x\ge 1$. By exponentiating each side of (4) and collecting the same powers of $1/x$ with the help of the Maclaurin series (2) for $e^x$, we find an approximation to $e$ that has been known by mathematicians and bankers alike since the early seventeenth century (see [2, Eq. (4)]). For $x<-1$ or $x\ge 1$, we obtain
$$\left(1+\frac{1}{x}\right)^x=e\left(1-\frac{1}{2x}+\frac{11}{24x^2}-\frac{7}{16x^3}+\frac{2447}{5760x^4}-\frac{959}{2304x^5}+\frac{238043}{580608x^6}-\cdots\right). \tag{5}$$
Setting, for example, $x=100{,}000$ in the left-hand side of (5) yields an approximation to $e$ that is accurate to four decimal places.
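The claim about $x=100{,}000$ and the quality of the truncated expansion (5) can be examined with a short floating-point experiment. This is only a sketch: the names `compound` and `series_5` are hypothetical, and the series is cut after the $1/x^6$ term.

```python
import math

# Coefficients of 1, 1/x, ..., 1/x^6 on the right-hand side of (5).
COEFFS = [1, -1 / 2, 11 / 24, -7 / 16, 2447 / 5760, -959 / 2304, 238043 / 580608]

def compound(x):
    """Left-hand side of (5)."""
    return (1 + 1 / x) ** x

def series_5(x):
    """Right-hand side of (5), truncated after the 1/x^6 term."""
    return math.e * sum(c / x**j for j, c in enumerate(COEFFS))

print(compound(100_000), math.e - compound(100_000))
print(compound(10), series_5(10), abs(compound(10) - series_5(10)))
```

Already at $x=10$ the truncated series reproduces the left-hand side far better than the left-hand side reproduces $e$, which illustrates why the coefficients are of interest.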
Motivated by this technique, Knox and Brothers [5] (see also Brothers and Knox [2]) present an interesting and useful method that yields a new and more accurate approximation to $e$ by combining two good approximations. We choose to demonstrate one of their many results here (see [2] or [5]). Adding approximation (5) and the approximation obtained by replacing $x$ by $-x$ in (5), and multiplying the resulting identity by $1/2$, they obtain the following better approximation to $e$ than that given by (5):
$$\frac{1}{2}\left[\left(1+\frac{1}{x}\right)^{x}+\left(1-\frac{1}{x}\right)^{-x}\right]=e\left(1+\frac{11}{24x^2}+\frac{2447}{5760x^4}+\frac{238043}{580608x^6}+\cdots\right). \tag{6}$$
Even though we can obtain as many coefficients as we please in the right-hand side of (5) by using Mathematica, here we aim at giving a formula for determining these coefficients. Our formula is based mainly on the partition function (see, e.g., [7, 9]).
For our later use, we introduce the following set of partitions of an integer $n\in\mathbb{N}=\mathbb{N}_0\setminus\{0\}:=\{1,2,3,\ldots\}$:
$$A_n:=\left\{(k_1,k_2,\ldots,k_n)\in\mathbb{N}_0^{\,n} : k_1+2k_2+\cdots+nk_n=n\right\}. \tag{7}$$
In number theory, the partition function $p(n)$ represents the number of possible partitions of $n\in\mathbb{N}$ (i.e., the number of distinct ways of representing $n$ as a sum of natural numbers, regardless of order). By convention, $p(0)=1$ and $p(n)=0$ for $n$ a negative integer. For more information on the partition function $p(n)$, please refer to [7] and the references therein. The first several values of the partition function $p(n)$ are (starting with $p(0)=1$, see [9]):
$$1, 1, 2, 3, 5, 7, 11, 15, 22, 30, 42, \ldots.$$
It is easy to see that the cardinality of the set $A_n$ is equal to the partition function $p(n)$. Now we are ready to present a formula that determines the coefficients $a_j$ in (8), with the help of the partition function, as asserted by the following theorem.
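The statement that $|A_n|=p(n)$ can be checked for small $n$ by brute-force enumeration. In the Python sketch below, the generator name `multiplicity_vectors` is hypothetical; it lists the tuples $(k_1,\ldots,k_n)$ of definition (7) by choosing the multiplicity of each part size in turn.

```python
def multiplicity_vectors(n):
    """Yield all (k_1, ..., k_n) of nonnegative integers with k_1 + 2*k_2 + ... + n*k_n = n."""
    def extend(part, remaining, prefix):
        if part > n:
            # All multiplicities chosen; keep the tuple only if n is fully used up.
            if remaining == 0:
                yield tuple(prefix)
            return
        for k in range(remaining // part + 1):
            yield from extend(part + 1, remaining - k * part, prefix + [k])
    yield from extend(1, n, [])

print(list(multiplicity_vectors(3)))
print([sum(1 for _ in multiplicity_vectors(n)) for n in range(11)])
```

The second line of output can be compared with the values of $p(n)$ listed above; for $n=0$ the only element is the empty tuple, in agreement with the convention $p(0)=1$.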
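Theorem. The following approximation formula holds true:
\[
\left(1 + \frac{1}{x}\right)^{x} = e \sum_{j=0}^{\infty} \frac{a_j}{x^{j}} \quad \text{as } x \to \infty, \tag{8}
\]
where the coefficients $a_j$ ($j \in \mathbb{N}$) are given by $a_0 := 1$ and
\[
a_j = (-1)^{j} \sum_{(k_1, k_2, \ldots, k_j) \in A_j} \frac{1}{k_1!\, k_2! \cdots k_j!} \left(\frac{1}{2}\right)^{k_1} \left(\frac{1}{3}\right)^{k_2} \cdots \left(\frac{1}{j+1}\right)^{k_j}, \tag{9}
\]
where the $A_j$ (for $j \in \mathbb{N}$) are given in (7).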
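Proof. In view of the Maclaurin series (2) of $\ln(1 + x)$, we can let
\[
x \ln\left(1 + \frac{1}{x}\right) = 1 + \ln\left(1 + \sum_{j=1}^{q} \frac{a_j}{x^{j}}\right) + O(x^{-q-1}) \quad \text{for } x \to \infty \text{ and } q \in \mathbb{N},
\]
where $a_1, \ldots, a_q$ are real numbers to be determined. From the fundamental theorem of algebra, we see that there exist unique complex numbers $x_1, \ldots, x_q$ such that
\[
1 + \frac{a_1}{x} + \cdots + \frac{a_q}{x^{q}} = \left(1 + \frac{x_1}{x}\right) \cdots \left(1 + \frac{x_q}{x}\right). \tag{10}
\]
By using the following series expansion:
\[
\ln\left(1 + \frac{z}{x}\right) = \sum_{j=1}^{q} \frac{(-1)^{j-1} z^{j}}{j\, x^{j}} + O(x^{-q-1}) \quad \text{for } |z| < |x| \text{ and } x \to \infty,
\]
we obtain
\[
\ln\left(1 + \frac{a_1}{x} + \cdots + \frac{a_q}{x^{q}}\right) = \sum_{j=1}^{q} \frac{(-1)^{j-1} S_j}{j\, x^{j}} + O(x^{-q-1}) \quad \text{for } x \to \infty, \tag{11}
\]
where
\[
S_j = x_1^{j} + \cdots + x_q^{j} \quad \text{for } j = 1, \ldots, q.
\]
Replacing $x$ by $\frac{1}{x}$ in (3) and multiplying the resulting equation by $x$, we get
\[
x \ln\left(1 + \frac{1}{x}\right) = 1 - \sum_{j=1}^{q} \frac{(-1)^{j-1}}{(j+1)\, x^{j}} + O(x^{-q-1}) \quad \text{for } x \to \infty. \tag{12}
\]
We then find from (11) and (12) that
\[
S_j = -\frac{j}{j+1} \quad \text{for } j = 1, \ldots, q, \tag{13}
\]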
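that is,
\[
\begin{cases}
x_1 + \cdots + x_q = -\dfrac{1}{2}, \\[4pt]
x_1^{2} + \cdots + x_q^{2} = -\dfrac{2}{3}, \\[4pt]
\qquad\vdots \\[4pt]
x_1^{q} + \cdots + x_q^{q} = -\dfrac{q}{q+1}.
\end{cases} \tag{14}
\]
Let
\[
P_q(x) = x^{q} + b_1 x^{q-1} + \cdots + b_{q-1} x + b_q
\]
be a polynomial with zeros $x_1, \ldots, x_q$ satisfying the system of equations (14). So we have
\[
P_q(x) = (x - x_1) \cdots (x - x_q). \tag{15}
\]
The Newton formulas (see, e.g., [4] and references therein) give the connection between the coefficients $b_j$ and the power sums $S_j$:
\[
S_j + S_{j-1} b_1 + S_{j-2} b_2 + \cdots + S_1 b_{j-1} + j\, b_j = 0 \quad \text{for } j = 1, \ldots, q.
\]
It is known [4] that $b_j$ can be expressed in terms of $S_j$:
\[
b_j = \sum_{(k_1, k_2, \ldots, k_j) \in A_j} \frac{(-1)^{k_1 + k_2 + \cdots + k_j}}{k_1!\, k_2! \cdots k_j!} \left(\frac{S_1}{1}\right)^{k_1} \left(\frac{S_2}{2}\right)^{k_2} \cdots \left(\frac{S_j}{j}\right)^{k_j}, \tag{16}
\]
where the $A_j$ (for $j \in \mathbb{N}$) are given in (7).
From (15), we obtain
\[
(-1)^{q} x^{-q} P_q(-x) = \left(1 + \frac{x_1}{x}\right) \cdots \left(1 + \frac{x_q}{x}\right).
\]
We thus have
\[
1 + \frac{(-1) b_1}{x} + \frac{(-1)^{2} b_2}{x^{2}} + \cdots + \frac{(-1)^{q-1} b_{q-1}}{x^{q-1}} + \frac{(-1)^{q} b_q}{x^{q}} = \left(1 + \frac{x_1}{x}\right) \cdots \left(1 + \frac{x_q}{x}\right). \tag{17}
\]
We see from (10) and (17) that the coefficients $a_j$ are given by
\[
a_j = (-1)^{j} b_j = (-1)^{j} \sum_{(k_1, k_2, \ldots, k_j) \in A_j} \frac{(-1)^{k_1 + k_2 + \cdots + k_j}}{k_1!\, k_2! \cdots k_j!} \left(\frac{S_1}{1}\right)^{k_1} \left(\frac{S_2}{2}\right)^{k_2} \cdots \left(\frac{S_j}{j}\right)^{k_j}, \tag{18}
\]
where the $S_j$ are given in (13). Finally, substituting the expression (13) into (18) yields (9), since each factor $(-1)^{k_i} (S_i / i)^{k_i}$ equals $(1/(i+1))^{k_i}$. This completes the proof.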
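The proof also suggests a recursive way to obtain the coefficients: with $S_j = -j/(j+1)$ from (13), the Newton formulas determine $b_1, b_2, \ldots$ one at a time, and $a_j = (-1)^{j} b_j$. The following sketch, which is ours and not part of the original note, carries this out in exact rational arithmetic; the function name `coefficients_via_newton` is a hypothetical choice.

```python
from fractions import Fraction


def coefficients_via_newton(q):
    """Return [a_0, ..., a_q] from S_j = -j/(j+1) and Newton's formulas.

    Hypothetical helper: solves
        S_j + S_{j-1} b_1 + ... + S_1 b_{j-1} + j b_j = 0
    for b_j, then sets a_j = (-1)^j b_j.
    """
    S = [None] + [Fraction(-j, j + 1) for j in range(1, q + 1)]
    b = [Fraction(1)]  # b_0 = 1 is the leading coefficient of P_q
    for j in range(1, q + 1):
        # Sum of S_{j-i} * b_i over i = 0, ..., j-1 (with b_0 = 1).
        total = sum(S[j - i] * b[i] for i in range(j))
        b.append(-total / j)
    return [(-1) ** j * b[j] for j in range(q + 1)]


for j, a_j in enumerate(coefficients_via_newton(5)):
    print(j, a_j)
```

The recursion needs only $O(q^{2})$ rational operations, whereas the closed form (9) sums over all $p(j)$ elements of $A_j$.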
Remark. Here we give explicit numerical values of the first few terms $a_j$ by using the partition set (7) and the formula (9). This shows how easily we can determine the $a_j$ in (9). Obviously,
\[
a_1 = -\sum_{k_1 = 1} \frac{1}{k_1!} \left(\frac{1}{2}\right)^{k_1} = -\frac{1}{2}.
\]
For $k_1 + 2k_2 = 2$, since $p(2) = 2$, the partition set $A_2$ in (7) is seen to have two elements:
\[
A_2 = \{(0, 1), (2, 0)\}.
\]
From (9), we have
\[
a_2 = \sum_{(k_1, k_2) \in A_2} \frac{1}{k_1!\, k_2!} \left(\frac{1}{2}\right)^{k_1} \left(\frac{1}{3}\right)^{k_2} = \frac{11}{24}.
\]
For $k_1 + 2k_2 + 3k_3 = 3$, since $p(3) = 3$, as above, the partition set $A_3$ in (7) contains three elements:
\[
A_3 = \{(0, 0, 1), (1, 1, 0), (3, 0, 0)\}.
\]
We then find from (9) that
\[
a_3 = -\sum_{(k_1, k_2, k_3) \in A_3} \frac{1}{k_1!\, k_2!\, k_3!} \left(\frac{1}{2}\right)^{k_1} \left(\frac{1}{3}\right)^{k_2} \left(\frac{1}{4}\right)^{k_3} = -\frac{7}{16}.
\]
Likewise, the partition sets $A_4$ and $A_5$ have $5 = p(4)$ and $7 = p(5)$ elements, respectively, and so
\[
A_4 = \{(0, 0, 0, 1), (1, 0, 1, 0), (0, 2, 0, 0), (2, 1, 0, 0), (4, 0, 0, 0)\}
\]
and
\[
A_5 = \{(0, 0, 0, 0, 1), (1, 0, 0, 1, 0), (0, 1, 1, 0, 0), (2, 0, 1, 0, 0), (1, 2, 0, 0, 0), (3, 1, 0, 0, 0), (5, 0, 0, 0, 0)\},
\]
which yields
\[
a_4 = \frac{2447}{5760} \quad \text{and} \quad a_5 = -\frac{959}{2304}.
\]
We note that the explicit numerical values of $a_j$ (for $j = 1, 2, 3, 4, 5$) here correspond with the coefficients of $1/x^{j}$ (for $j = 1, 2, 3, 4, 5$) in (5), respectively.
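The computations in this Remark can be reproduced mechanically. The sketch below is our own illustration; the names `partitions_as_multiplicities` and `a` are hypothetical. It evaluates formula (9) with exact fractions by summing over the elements of $A_j$.

```python
from fractions import Fraction
from math import factorial


def partitions_as_multiplicities(j, part=1):
    """Yield the tuples (k_part, ..., k_j) with part*k_part + ... + j*k_j == target.

    Called with part=1 it yields exactly the elements of A_j in (7).
    """
    def rec(part, remaining):
        if part > j:
            if remaining == 0:
                yield ()
            return
        for k in range(remaining // part + 1):
            for rest in rec(part + 1, remaining - part * k):
                yield (k,) + rest

    yield from rec(part, j)


def a(j):
    """Coefficient a_j of formula (9), as an exact Fraction."""
    if j == 0:
        return Fraction(1)
    total = Fraction(0)
    for ks in partitions_as_multiplicities(j):
        term = Fraction(1)
        for i, k in enumerate(ks, start=1):
            # Factor (1/(i+1))^{k_i} / k_i! contributed by the part i.
            term *= Fraction(1, i + 1) ** k / factorial(k)
        total += term
    return (-1) ** j * total


print([a(j) for j in range(6)])
```

The output can be compared term by term with the values of $a_1, \ldots, a_5$ given in the Remark.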
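By using (8), we find that
\[
\frac{1}{2}\left[\left(1 + \frac{1}{x}\right)^{x} + \left(1 - \frac{1}{x}\right)^{-x}\right] = e \sum_{j=0}^{\infty} \left(1 + (-1)^{j}\right) \frac{a_j}{2x^{j}} \quad \text{for } x \to \infty, \tag{19}
\]
where the $a_j$ (for $j \in \mathbb{N}_0$) are given in (9).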
ACKNOWLEDGMENTS. Thanks to the Editor, Professor Scott Chapman, for his several enduring encour-
agements to improve the exposition of this note and to the anonymous referees for their constructive comments.
Thanks also to Professor Jack R. Quine and Professor Bettye Anne Case of Florida State University for their
help in improving the exposition of this note. This research was supported by the Basic Science Research Pro-
gram through the National Research Foundation of Korea funded by the Ministry of Education, Science and
Technology of the Republic of Korea (2012-0002957).
REFERENCES
1. G. Arfken, Mathematical Methods for Physicists. Third edition. Academic Press, New York, 1985.
2. H. J. Brothers, J. A. Knox, New closed-form approximations to the logarithmic constant e, Math. Intelligencer 20 (1998) 25–29.
3. H. T. Davis, Tables of the Mathematical Functions. Vol. I, The Principia Press of Trinity University, San Antonio, Texas, 1963.
4. H. W. Gould, The Girard-Waring power sum formulas for symmetric functions and Fibonacci sequences, Fibonacci Quart. 37 (1999) 135–140.
5. J. A. Knox, H. J. Brothers, Novel series-based approximations to e, College Math. J. 30 (1999) 269–275.
6. E. Maor, e: The Story of a Number. Princeton University Press, Princeton, New Jersey, 1994.
7. Wikipedia contributors, Partition (number theory), Wikipedia, The Free Encyclopedia, available at http://en.wikipedia.org/wiki/Partition_function_(number_theory)#Partition_function.
8. J. Sondow, E. W. Weisstein, e. From MathWorld, A Wolfram Web Resource, available at http://mathworld.wolfram.com/e.html.
9. N. J. A. Sloane, a(n) = number of partitions of n (the partition numbers). Maintained by The OEIS Foundation, available at http://oeis.org/A000041.
CHAO-PING CHEN received his Bachelor of Science degree from Henan Normal University (China) in
1986 and his Master of Science Degrees from Southwest Jiaotong University (China) in 1995. He currently
teaches at Henan Polytechnic University (Jiaozuo) in China.
School of Mathematics and Informatics, Henan Polytechnic University, Jiaozuo City 454003,
Henan Province, China
chenchaoping@sohu.com
JUNESANG CHOI received his B.A. from Gyeongsang National University (Republic of Korea) in 1981 and
his Ph.D. from the Florida State University in 1991. He currently teaches at Dongguk University (Gyeongju)
in the Republic of Korea. See http://wwwk.dongguk.ac.kr/~junesang/.
Department of Mathematics, Dongguk University, Gyeongju 780-714, Republic of Korea
junesang@mail.dongguk.ac.kr
Stirling's Approximation for Central Extended Binomial Coefficients
Author(s): Steffen Eger
Source: The American Mathematical Monthly, Vol. 121, No. 4 (April), pp. 344-349
Published by: Mathematical Association of America
Stable URL: http://www.jstor.org/stable/10.4169/amer.math.monthly.121.04.344 .
NOTES
Edited by Sergei Tabachnikov
Stirling's Approximation for
Central Extended Binomial Coefficients
Steffen Eger
Abstract. We derive asymptotic formulas for central extended binomial coefficients, which are generalizations of binomial coefficients, using the distribution of the sum of independent discrete uniform random variables with the Central Limit Theorem and a local limit variant.
1. STIRLING'S FORMULA AND CENTRAL BINOMIAL COEFFICIENTS. For a nonnegative integer $k$, Stirling's formula,
\[
k! \sim \sqrt{2\pi k} \left(\frac{k}{e}\right)^{k},
\]
where $e$ is Euler's number, yields an approximation of the central binomial coefficient $\binom{k}{k/2}$ using $\binom{k}{m} = \frac{k!}{m!\,(k-m)!}$ as
\[
\binom{k}{k/2} \sim \frac{2^{k+1}}{\sqrt{2\pi k}},
\]
where we write $a_k \sim b_k$ as short-hand for $\lim_{k \to \infty} \frac{a_k}{b_k} = 1$. In our current note, we derive asymptotic formulas for central extended binomial, or polynomial, coefficients (cf. [2, 3, 7]). These coefficients appear in the extended binomial triangles (which we also call $(l+1)$-nomial, polynomial, or multinomial triangles [8]), which are generalizations of binomial, or Pascal, triangles, where entries in row $k$ are defined as coefficients of the polynomial $(1 + x + x^{2} + \cdots + x^{l})^{k}$ for $l \ge 0$. Our derivation is not based upon asymptotics of factorials, but upon the limiting distribution of the sum of discrete uniform random variables.¹
¹ Throughout, we assume that all fractional values such as $x = \frac{kl}{2}$ are integral when used in the context of extended binomial coefficients. If this is not the case, then replace respective quantities with their floor, $\lfloor x \rfloor$, the largest integer less than or equal to $x$.
http://dx.doi.org/10.4169/amer.math.monthly.121.04.344
MSC: Primary 11B65, Secondary 11N37; 60G50
2. EXTENDED BINOMIAL TRIANGLES. In generalization to binomial triangles, $(l+1)$-nomial triangles, for $l \ge 0$, are defined in the following way. Starting with a 1 in row zero, construct an entry in row $k$, for $k \ge 1$, by adding the overlying $(l+1)$ entries in row $(k-1)$ (some of these entries are taken as zero if not defined); thereby, row $k$ has $(kl+1)$ entries. For example, the binomial ($l = 1$), trinomial ($l = 2$), and quadrinomial triangles ($l = 3$) start as follows,
Binomial ($l = 1$):
1
1 1
1 2 1
1 3 3 1
Trinomial ($l = 2$):
1
1 1 1
1 2 3 2 1
1 3 6 7 6 3 1
Quadrinomial ($l = 3$):
1
1 1 1 1
1 2 3 4 3 2 1
1 3 6 10 12 12 10 6 3 1
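The construction rule for the triangles can be turned into a few lines of code. The following is a sketch of our own (the function name `extended_triangle` is hypothetical): each new entry is the sum of the $(l+1)$ overlying entries of the previous row, with missing entries counted as zero.

```python
def extended_triangle(l, rows):
    """Return the first `rows` rows of the (l+1)-nomial triangle."""
    row = [1]
    triangle = [row]
    for _ in range(rows - 1):
        # Pad with l zeros on each side so that every window of
        # length l + 1 is defined; undefined entries count as zero.
        padded = [0] * l + row + [0] * l
        row = [sum(padded[i:i + l + 1]) for i in range(len(row) + l)]
        triangle.append(row)
    return triangle


for l in (1, 2, 3):
    for r in extended_triangle(l, 4):
        print(*r)
    print()
```

Row $k$ produced this way has $kl + 1$ entries, in agreement with the description above.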
In the $(l+1)$-nomial triangle, the $n$th entry, for $0 \le n \le kl$ in row $k$, which we denote by $\binom{k}{n}_{l+1}$, has the following interpretation. It is the coefficient of $x^{n}$ in the expansion of
\[
(1 + x + x^{2} + \cdots + x^{l})^{k} = \sum_{n=0}^{kl} \binom{k}{n}_{l+1} x^{n}. \tag{1}
\]
It has been shown that $\binom{k}{n}_{l+1}$ denotes the number of restricted integer compositions (for a definition, see, e.g., [9] and many others) of the nonnegative integer $n$ with $k$ parts $\pi_1, \ldots, \pi_k$, each from the set $\{0, 1, \ldots, l\}$ (cf. [5]), and allows the following representation,
\[
\binom{k}{n}_{l+1} = \sum_{\substack{k_0 \ge 0, \ldots, k_l \ge 0 \\ k_0 + \cdots + k_l = k \\ 0 \cdot k_0 + 1 \cdot k_1 + \cdots + l \cdot k_l = n}} \binom{k}{k_0, \ldots, k_l}, \tag{2}
\]
where $\binom{k}{k_0, \ldots, k_l}$ is a multinomial coefficient, defined as $\frac{k!}{k_0! \cdots k_l!}$, for nonnegative integers $k_0, \ldots, k_l$. We can verify representation (2) by noting that for real numbers $x_0, \ldots, x_l$, the multinomial theorem (cf. [15]) states that
\[
(x_0 + x_1 + \cdots + x_l)^{k} = \sum_{\substack{k_0 \ge 0, \ldots, k_l \ge 0 \\ k_0 + \cdots + k_l = k}} \binom{k}{k_0, \ldots, k_l} x_0^{k_0} \cdots x_l^{k_l}.
\]
Thus, setting $x_i = x^{i}$ for $i = 0, \ldots, l$,
\[
(1 + x + x^{2} + \cdots + x^{l})^{k} = \sum_{\substack{k_0 \ge 0, \ldots, k_l \ge 0 \\ k_0 + \cdots + k_l = k}} \binom{k}{k_0, \ldots, k_l} x^{0 \cdot k_0 + \cdots + l \cdot k_l}, \tag{3}
\]
so that comparing coefficients of the right-hand sides of (1) and (3) leads to (2).
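Representation (2) can be checked numerically for small parameters. In the sketch below, which is our own and uses hypothetical function names, one function expands the polynomial in (1) by repeated multiplication and the other sums multinomial coefficients over the index set of (2) by brute force.

```python
from itertools import product
from math import factorial


def coeff_by_expansion(k, n, l):
    """Coefficient of x^n in (1 + x + ... + x^l)^k, by repeated multiplication."""
    poly = [1]
    for _ in range(k):
        new = [0] * (len(poly) + l)
        for i, c in enumerate(poly):
            for s in range(l + 1):
                new[i + s] += c
        poly = new
    return poly[n] if 0 <= n < len(poly) else 0


def coeff_by_multinomials(k, n, l):
    """Right-hand side of (2), summed by brute force over (k_0, ..., k_l)."""
    total = 0
    for ks in product(range(k + 1), repeat=l + 1):
        if sum(ks) == k and sum(s * c for s, c in enumerate(ks)) == n:
            m = factorial(k)
            for c in ks:
                m //= factorial(c)  # stays an integer at every step
            total += m
    return total


print(coeff_by_expansion(3, 3, 2), coeff_by_multinomials(3, 3, 2))
```

The brute-force sum visits $(k+1)^{l+1}$ tuples, so it is only meant for small $k$ and $l$.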
3. GENERALIZED STIRLING'S APPROXIMATION. Our strategy for deriving approximation formulas for central extended binomial coefficients is as follows. First, we determine the asymptotic distribution of the sum of discrete uniform variables, which we easily find to be a normal distribution by the Central Limit Theorem (CLT). Then, we determine the exact distribution, which turns out to yield the normalized extended binomial coefficients $\binom{k}{n}_{l+1}$. By relating the density of the asymptotic distribution to the density of the exact distribution (e.g., via a local limit argument), we obtain an extended binomial analogue of Stirling's approximation to central binomial coefficients.
3.1. Step 1: Asymptotic distribution of the sum of discrete uniform variables. Let $k$ be a positive integer and let $l$ be a nonnegative integer. Let $X_j$, for $j = 1, \ldots, k$, be identically and independently distributed random draws from the discrete uniform distribution on the set $\{0, \ldots, l\}$, and let $S_k$ be their sum,
\[
S_k = \sum_{j=1}^{k} X_j.
\]
Obviously, by standard moments of the uniform distribution, the mean and variance of each $X_j$ are given by
\[
\mu = E[X_j] = \frac{l}{2}, \quad \text{and} \quad \sigma^{2} = \operatorname{Var}[X_j] = \frac{(l+1)^{2} - 1}{12}.
\]
Hence, by independent and identical distribution of $X_1, \ldots, X_k$, and application of the CLT, the random variable $\sqrt{k}\left(\frac{S_k}{k} - \mu\right)$ converges, as $k \to \infty$, in distribution to a normally $N(0, \sigma^{2})$ distributed random variable. Recall that convergence in distribution precisely means that the cumulative distribution function of $\sqrt{k}\left(\frac{S_k}{k} - \mu\right)$ converges pointwise to the cumulative distribution function of the $N(0, \sigma^{2})$ distribution.
3.2. Step 2: Exact distribution of the sum of discrete uniform random variables. We now determine exactly the probability that $S_k$ takes on the integer value $n$, for $0 \le n \le kl$. To do so, we consider isomorphic copies $\tilde{X}_j$ of $X_j$, which are independently and identically multinomially distributed with probabilities $p_0 = \cdots = p_l = \frac{1}{l+1}$ of types 0 to $l$. Each $\tilde{X}_j = (A_0, \ldots, A_l)$ is vector-valued, with $P[\tilde{X}_j = (a_0, \ldots, a_l)] = \frac{1}{l+1}$ for nonnegative integers $a_s$, with $a_0 + \cdots + a_l = 1$, where $A_s$ denotes the number of times an event of type $s$, for $s = 0, \ldots, l$, occurs. Then, the sum $\tilde{S}_k = \tilde{X}_1 + \cdots + \tilde{X}_k$ has the interpretation of representing the event of drawing with replacement $k$ balls of $(l+1)$ different types from a bag, where the probability of drawing type $s = 0, \ldots, l$ is $\frac{1}{l+1}$. Thus, by the standard interpretation of the multinomial distribution, $\tilde{S}_k$ has density
\[
P[\tilde{S}_k = (a_0, \ldots, a_l)] = P[A_0 = a_0, \ldots, A_l = a_l] = \binom{k}{a_0, \ldots, a_l} \left(\frac{1}{l+1}\right)^{k},
\]
where $a_0 + \cdots + a_l = k$ for nonnegative integers $a_0, \ldots, a_l$. Then, if $\tilde{S}_k = (a_0, \ldots, a_l)$, $S_k$, the variable corresponding to $\tilde{S}_k$, represents the integer $0 \cdot a_0 + \cdots + l \cdot a_l$. Thus, for $n$ such that $0 \le n \le kl$,
\[
P[S_k = n] = \sum_{\substack{a_0 \ge 0, \ldots, a_l \ge 0 \\ a_0 + \cdots + a_l = k \\ 0 \cdot a_0 + \cdots + l \cdot a_l = n}} P[\tilde{S}_k = (a_0, \ldots, a_l)]
= \left(\frac{1}{l+1}\right)^{k} \sum_{\substack{a_0 \ge 0, \ldots, a_l \ge 0 \\ a_0 + \cdots + a_l = k \\ 0 \cdot a_0 + \cdots + l \cdot a_l = n}} \binom{k}{a_0, \ldots, a_l}
= \left(\frac{1}{l+1}\right)^{k} \binom{k}{n}_{l+1},
\]
using representation (2).
An arguably more straightforward derivation of the exact distribution of $S_k$, making use of probability generating functions (pgfs), can be given by noting that the pgf $G_{X_j}(x) = \sum_{n \ge 0} P[X_j = n]\, x^{n}$ of each $X_j$ is given by
\[
G_{X_j}(x) = \frac{1}{l+1} \sum_{n=0}^{l} x^{n}.
\]
Whence, the pgf of $S_k$ is given as, by independence of $X_1, \ldots, X_k$,
\[
G_{S_k}(x) = G_{X_1}(x) \cdots G_{X_k}(x) = \left(\frac{1}{l+1}\right)^{k} \left(\sum_{n=0}^{l} x^{n}\right)^{k} = \left(\frac{1}{l+1}\right)^{k} \sum_{n=0}^{kl} \binom{k}{n}_{l+1} x^{n}.
\]
Thus,
\[
P[S_k = n] = \frac{G_{S_k}^{(n)}(0)}{n!} = \left(\frac{1}{l+1}\right)^{k} \frac{n!}{n!} \binom{k}{n}_{l+1} = \left(\frac{1}{l+1}\right)^{k} \binom{k}{n}_{l+1},
\]
where, by $G_X^{(n)}(0)$, we denote the $n$th derivative of $G_X$, evaluated at zero.
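Multiplying pgfs corresponds to convolving the distributions, which gives a direct way to tabulate $P[S_k = n]$. The sketch below is our own illustration (the function name `exact_distribution` is hypothetical); it uses exact fractions and reports the mean and variance of $S_k$, which should equal $k\mu$ and $k\sigma^{2}$ from Step 1.

```python
from fractions import Fraction


def exact_distribution(k, l):
    """Return [P[S_k = 0], ..., P[S_k = kl]] by k-fold convolution
    of the uniform distribution on {0, ..., l}."""
    p = Fraction(1, l + 1)
    dist = [Fraction(1)]  # distribution of the empty sum
    for _ in range(k):
        new = [Fraction(0)] * (len(dist) + l)
        for n, q in enumerate(dist):
            for s in range(l + 1):
                new[n + s] += q * p
        dist = new
    return dist


k, l = 3, 2
dist = exact_distribution(k, l)
mean = sum(n * q for n, q in enumerate(dist))
variance = sum((n - mean) ** 2 * q for n, q in enumerate(dist))
print(dist, mean, variance)
```

For $k = 3$ and $l = 2$ the numerators are the trinomial row 1, 3, 6, 7, 6, 3, 1 over the common denominator $3^{3} = 27$.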
3.3. Step 3: Local limit theorem. To derive an asymptotic formula for $\binom{k}{n}_{l+1}$, we would like to make use of the results derived in Steps 1 and 2 above. Ideally, we would like to equate the probability density function of the asymptotic normal distribution of $S_k$ with the exact distribution. However, as mentioned, convergence in distribution, as assured by the CLT, only guarantees pointwise convergence of cumulative distribution functions. In contrast, local limit theorems describe how the probability density function of a sum of random variables approaches the normal density function. For integer-valued random variables (also called lattice or arithmetical distributions), Gnedenko and Kolmogorov [10] provide the following result.
Theorem 3.1 (see [10, p. 233]). If $X_1, X_2, \ldots$ are independent lattice random variables with identical distribution with finite mean $\mu$ and variance $\sigma^{2}$, such that the greatest common divisor of the differences of all the values of $X_j$ taken with positive probability is 1, then
\[
\sqrt{k}\, P[S_k = n] - \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(n - k\mu)^{2}}{2\sigma^{2} k}} \to 0
\]
uniformly in $n$ as $k \to \infty$.
Since in our situation, the set of values of each $X_j$ taken with positive probability is $\{0, 1, \ldots, l\}$, the greatest common divisor of the differences is clearly 1. Thus, all assumptions of Theorem 3.1 are satisfied in our case and, hence, also, the consequences hold. Therefore, the following approximation is suggested for large $k$:
\[
\sqrt{k}\, P[S_k = n] \approx \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(n - k\mu)^{2}}{2\sigma^{2} k}}. \tag{4}
\]
For $n = k\mu = kl/2$, the argument to the exponential function is zero, and thus
\[
\sqrt{k}\, P[S_k = kl/2] \approx \frac{1}{\sigma\sqrt{2\pi}}, \quad \text{or equivalently,} \quad P[S_k = kl/2] \approx \frac{1}{\sqrt{2\pi\sigma^{2} k}}.
\]
Using the exact form for $P[S_k = n]$ from Step 2 above, we hence have, bringing the normalizing term $(l+1)^{k}$ to the right-hand side,
\[
\binom{k}{\frac{kl}{2}}_{l+1} \sim \frac{(l+1)^{k}}{\sqrt{2\pi k\, \frac{(l+1)^{2} - 1}{12}}}. \tag{5}
\]
For example, for $l = 1$, Pascal's case, $l = 2$, $l = 3$, and $l = 4$, we therefore have the approximations
\[
\binom{k}{\frac{k}{2}}_{2} \sim \frac{2^{k+1}}{\sqrt{2\pi k}}, \quad
\binom{k}{k}_{3} \sim \frac{3^{k}}{\sqrt{\frac{4}{3}\pi k}}, \quad
\binom{k}{\frac{3}{2}k}_{4} \sim \frac{4^{k}}{\sqrt{\frac{5}{2}\pi k}}, \quad \text{and} \quad
\binom{k}{2k}_{5} \sim \frac{5^{k}}{2\sqrt{\pi k}}.
\]
In Figure 1, we show for $l = 4$ the distributions $P[S_k = n]$ for $k = 5, 10, 20$, and their respective normal approximations. There, we can see the local limit theorem at work: The exact density function apparently approaches, pointwise, the normal density function.
Figure 1. Distributions $P[S_k = n]$ for $k = 5, 10, 20$ for $l = 4$ fixed, and normal approximations.
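To get a feeling for the quality of approximation (5), one can compare it with the exact central coefficient for moderate $k$. The following sketch is ours, with hypothetical function names; it computes the exact coefficient by polynomial multiplication and prints the ratio of exact value to approximation, which is expected to approach 1 as $k$ grows.

```python
from math import pi, sqrt


def central_coefficient(k, l):
    """Exact central (l+1)-nomial coefficient: the coefficient of
    x^(floor(kl/2)) in (1 + x + ... + x^l)^k."""
    poly = [1]
    for _ in range(k):
        new = [0] * (len(poly) + l)
        for i, c in enumerate(poly):
            for s in range(l + 1):
                new[i + s] += c
        poly = new
    return poly[(k * l) // 2]


def approximation_5(k, l):
    """Right-hand side of (5), evaluated in floating point."""
    sigma2 = ((l + 1) ** 2 - 1) / 12
    return (l + 1) ** k / sqrt(2 * pi * k * sigma2)


for l in (1, 2, 3, 4):
    for k in (10, 50, 100):
        ratio = central_coefficient(k, l) / approximation_5(k, l)
        print(f"l={l} k={k:3d} exact/approx={ratio:.5f}")
```

Even values of $k$ are used here so that $kl/2$ is an integer for every $l$, in line with the footnote on fractional values.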
4. DISCUSSION. Although extended binomial coefficients, together with their connection to the sum of discrete uniform random variables, go back at least to De Moivre's Doctrine of Chances [4] and to Euler's [6] analytical study of the coefficients of polynomial (1), the mathematics community has apparently more or less ignored their systematic study, except for a few recent publications such as [1, 2, 5, 7, 8]. Next, using the CLT (or a local limit variant) to deduce asymptotics of mathematical objects has been suggested, for example, by Walsh [14], who derives Stirling's formula for factorials by equating the distribution of the sum of Poisson distributed random variables with the normal density. Finally, the asymptotics of both the central binomial ($l = 1$) as well as the central trinomial coefficients ($l = 2$) seem to be known (e.g. [7, 13]), while the general formula (5) is, to the best of our knowledge, novel. However, Ratsaby [12] derives our general result (4), as an estimate of the number of restricted integer compositions, by application of Cauchy's coefficient formula to the polynomial (1) and computation of the resulting integral by Laplace's method for evaluation of integrals. A historical perspective of local versus central limit theorem is provided by McDonald [11].
ACKNOWLEDGMENT. The author would like to thank the anonymous reviewers for helpful comments.
REFERENCES
1. R. C. Bollinger, C. L. Burchard, Lucas's theorem and some related results for extended Pascal triangles, Amer. Math. Monthly 97 no. 3 (1990) 198–204.
2. C. C. S. Caiado, P. N. Rathie, Polynomial coefficients and distribution of the sum of discrete uniform variables, in Eighth Annual Conference of the Society of Special Functions and their Applications. Edited by A. M. Mathai, M. A. Pathan, K. K. Jose, and J. Jacob, Pala, India, 2007.
3. L. Comtet, Advanced Combinatorics: The Art of Finite and Infinite Expansions. D. Reidel Publishing Company, Dordrecht, 1974.
4. A. De Moivre, The Doctrine of Chances: Or, A Method of Calculating the Probabilities of Events in Play. Reprint of the third (1756) edition. Chelsea, New York, 1967.
5. S. Eger, Restricted weighted integer compositions and extended binomial coefficients, J. Integer Seq., 16 (2013).
6. L. Euler, De evolutione potestatis polynomialis cuiuscunque $(1 + x + x^{2} + \cdots)^{n}$. Nova Acta Academiae Scientarum Imperialis Petropolitinae 12 (1801), available at http://math.dartmouth.edu/~euler/.
7. N.-E. Fahssi, The polynomial triangles revisited (2012), available at http://arxiv.org/abs/1202.0228.
8. D. C. Fielder, C. O. Alford, Pascal's triangle: Top gun or just one of the gang?, in Applications of Fibonacci Numbers. Edited by G. E. Bergum, A. N. Philippou, A. F. Horadam, Kluwer, Dordrecht, 1991.
9. P. Flajolet, R. Sedgewick, Analytic Combinatorics. Cambridge University Press, Cambridge, 2009.
10. B. V. Gnedenko, A. N. Kolmogorov, Limit Distributions for Sums of Independent Random Variables. Second edition. Addison-Wesley, Cambridge, MA, 1968.
11. D. R. McDonald, The local limit theorem: A historical perspective, JIRSS 4 (2005) 73–86.
12. J. Ratsaby, Estimate of the number of restricted integer-partitions, Appl. Anal. Discrete Math 2 (2008) 222–233.
13. The On-Line Encyclopedia of Integer Sequences, available at http://oeis.org, 2012, Sequence A002426.
14. D. P. Walsh, Equating Poisson and normal probability functions to derive Stirling's formula, Amer. Statist. 49 (1995) 270–271.
15. E. Weisstein, Multinomial Series. From MathWorld, A Wolfram Web Resource, available at http://mathworld.wolfram.com/MultinomialSeries.html.
Economics Department, Goethe University Frankfurt am Main, Germany
eger.steffen@gmail.com
A New Proof of Stirling's Formula
Author(s): Thorsten Neuschel
Source: The American Mathematical Monthly, Vol. 121, No. 4 (April), pp. 350-352
Published by: Mathematical Association of America
Stable URL: http://www.jstor.org/stable/10.4169/amer.math.monthly.121.04.350 .
A New Proof of Stirling's Formula
Thorsten Neuschel
Abstract. A new, simple proof of Stirling's formula via the partial fraction expansion for the tangent function is presented.
1. INTRODUCTION. Various proofs for Stirling's formula
\[
n! \sim n^{n} e^{-n} \sqrt{2\pi n}, \quad \text{as } n \to \infty, \tag{1.1}
\]
have been established in the literature since the days of de Moivre and Stirling in 1730 (for a historical exposition see, e.g., [1]). Many of these proofs show that the limit
\[
\lim_{n \to \infty} \frac{n!}{n^{n} e^{-n} \sqrt{n}}
\]
exists (for instance, via the Euler–Maclaurin formula) in order to identify this limit by using the asymptotical behavior of the Wallis product, which is the crucial step. We will show that this last, quite wily, step can be replaced by a simple straightforward computation of the limit using only the partial fraction expansion for the tangent function
\[
\pi \tan \pi x = \sum_{\nu=0}^{\infty} \frac{2x}{\left(\nu + \frac{1}{2}\right)^{2} - x^{2}}. \tag{1.2}
\]
This expansion was probably found by Euler by the time Stirling determined his proof via Wallis' formula, see, e.g., [6, p. 327]. For some alternative elementary proofs of Stirling's formula see, e.g., [1, 2, 4, 5, 7].
2. PROOF. An application of the well-known Euler–Maclaurin formula in its simplest form (see, e.g., [8, p. 37, (6.21)]) yields
\[
\log n! = n \log n - n + 1 + \log\sqrt{n} + \int_{0}^{n-1} \frac{x - [x] - \frac{1}{2}}{1 + x}\, dx.
\]
In order to prove (1.1), it is sufficient to show
\[
\int_{0}^{\infty} \frac{x - [x] - \frac{1}{2}}{1 + x}\, dx = \log\sqrt{2\pi} - 1. \tag{2.1}
\]
To prove this, we will show directly the identity
\[
\int_{0}^{\infty} \frac{x - [x] - \frac{1}{2}}{1 + x}\, dx = \int_{0}^{1/2} \left(\frac{8x^{2}}{1 - 4x^{2}} - \pi x \tan \pi x\right) dx, \tag{2.2}
\]
http://dx.doi.org/10.4169/amer.math.monthly.121.04.350
MSC: Primary 41A60
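where the integral on the right-hand side can be evaluated by elementary calculus. We start our computation with
\[
\int_{0}^{\infty} \frac{x - [x] - \frac{1}{2}}{1 + x}\, dx
= \sum_{\nu=0}^{\infty} \left\{ \int_{\nu}^{\nu + 1/2} \frac{x - \nu - \frac{1}{2}}{1 + x}\, dx + \int_{\nu + 1/2}^{\nu + 1} \frac{x - \nu - \frac{1}{2}}{1 + x}\, dx \right\}
= \sum_{\nu=0}^{\infty} \left\{ \int_{0}^{1/2} \frac{x - \frac{1}{2}}{1 + \nu + x}\, dx + \int_{0}^{1/2} \frac{x}{\frac{3}{2} + \nu + x}\, dx \right\}.
\]
By an easy change of variables, we observe that
\[
\int_{0}^{1/2} \frac{x - \frac{1}{2}}{1 + \nu + x}\, dx = -\int_{0}^{1/2} \frac{x}{\frac{3}{2} + \nu - x}\, dx,
\]
so that we obtain
\[
\int_{0}^{\infty} \frac{x - [x] - \frac{1}{2}}{1 + x}\, dx
= \sum_{\nu=0}^{\infty} \int_{0}^{1/2} \left( \frac{x}{\frac{3}{2} + \nu + x} - \frac{x}{\frac{3}{2} + \nu - x} \right) dx
= -\sum_{\nu=0}^{\infty} \int_{0}^{1/2} \frac{2x^{2}}{\left(\nu + \frac{3}{2}\right)^{2} - x^{2}}\, dx
= -\int_{0}^{1/2} \sum_{\nu=1}^{\infty} \frac{2x^{2}}{\left(\nu + \frac{1}{2}\right)^{2} - x^{2}}\, dx, \tag{2.3}
\]
where the interchange of summation and integration is allowed, due to the uniform convergence of the series in (2.3) on the interval $[0, \frac{1}{2}]$. Applying (1.2), we immediately obtain (2.2). At this point of the proof, we have reduced the problem of determining the constant in Stirling's formula to a simple matter of elementary calculus as the resulting integral in (2.2) can be evaluated easily. For convenience, we will give some details. For example, using the decomposition
\[
\frac{8x^{2}}{1 - 4x^{2}} = \frac{1}{1 + 2x} + \frac{1}{1 - 2x} - 2
\]
it can be rewritten as
\[
\int_{0}^{1/2} \left( \frac{8x^{2}}{1 - 4x^{2}} - \pi x \tan \pi x \right) dx = \log\sqrt{2} - 1 + \int_{0}^{1/2} \left( \frac{1}{1 - 2x} - \pi x \tan \pi x \right) dx.
\]
Now, by a standard argument involving integration by parts, we can observe for $0 < \epsilon < 1/2$ that
\[
\int_{0}^{\epsilon} \left( \frac{1}{1 - 2x} - \pi x \tan \pi x \right) dx
= -\frac{1}{2} \log(1 - 2\epsilon) - \int_{0}^{\epsilon} \pi x \tan \pi x\, dx
= \left(\epsilon - \frac{1}{2}\right) \log\cos(\pi\epsilon) + \frac{1}{2} \log\frac{\cos(\pi\epsilon)}{1 - 2\epsilon} - \int_{0}^{\epsilon} \log\cos(\pi x)\, dx.
\]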
Letting $\epsilon$ tend to 1/2, we immediately obtain
\[
\int_{0}^{1/2} \left( \frac{1}{1 - 2x} - \pi x \tan \pi x \right) dx = \log\sqrt{\pi} - \log\sqrt{2} - \int_{0}^{1/2} \log\cos(\pi x)\, dx.
\]
The remaining integral on the right-hand side can be easily evaluated to $-\log\sqrt{2}$ as shown, e.g., in [2], [3]. This computation relies on the fact that its value, say $c$, remains unchanged if $\cos(\pi x)$ is replaced by $\sin(\pi x)$ so that we have (using the double angle formula)
\[
\int_{0}^{1/2} \log\sin(2\pi x)\, dx = \log\sqrt{2} + 2 \int_{0}^{1/2} \log\sin(\pi x)\, dx.
\]
As both integrals in the last equation coincide, we obtain $c = -\log\sqrt{2}$, which completes the proof of (1.1).
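Two of the numerical facts used in the proof can be checked by machine. The sketch below is our own illustration and plays no role in the argument; the function names are hypothetical. It estimates the constant $\log\sqrt{2\pi}$ from $\log n!$ (using the standard library function `math.lgamma`, since $\log n! = \mathrm{lgamma}(n+1)$) and approximates the integral $c$ of $\log\cos(\pi x)$ over $[0, 1/2]$ with a midpoint rule.

```python
from math import cos, lgamma, log, pi, sqrt


def stirling_constant_estimate(n):
    """log n! - (n log n - n + (1/2) log n); tends to log sqrt(2*pi)."""
    return lgamma(n + 1) - (n * log(n) - n + 0.5 * log(n))


def log_cos_integral(steps=200_000):
    """Midpoint-rule approximation of the integral of log cos(pi x)
    over [0, 1/2]. Midpoints avoid the singularity at x = 1/2."""
    h = 0.5 / steps
    return h * sum(log(cos(pi * (i + 0.5) * h)) for i in range(steps))


print(stirling_constant_estimate(10**6), log(sqrt(2 * pi)))
print(log_cos_integral(), -log(sqrt(2)))
```

The midpoint rule is a deliberately simple choice; the logarithmic singularity at $x = 1/2$ is integrable, so the approximation still converges, only more slowly than for a smooth integrand.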
REFERENCES
1. P. Diaconis, D. Freedman, An elementary proof of Stirling's formula, Amer. Math. Monthly 93 (1986) 123–125.
2. W. Feller, A direct proof of Stirling's formula, Amer. Math. Monthly 74 (1967) 1223–1225.
3. W. Feller, Correction to "A direct proof of Stirling's formula", Amer. Math. Monthly 75 (1968) 518.
4. R. Michel, On Stirling's formula, Amer. Math. Monthly 109 (2002) 388–390.
5. J. Patin, A very short proof of Stirling's formula, Amer. Math. Monthly 96 (1989) 41–42.
6. R. Remmert, Theory of Complex Functions. Springer, New York, 1991.
7. H. Robbins, A remark on Stirling's formula, Amer. Math. Monthly 62 (1955) 26–29.
8. R. Wong, Asymptotic Approximation of Integrals. Society for Industrial and Applied Mathematics, Philadelphia, PA, 2001.
Department of Mathematics, University of Trier, D-54286 Trier, Germany
neuschel@uni-trier.de
By 1914, the MONTHLY had outgrown its financial arrangements, and it was Slaught who turned to the American Mathematical Society to adopt the MONTHLY as an official journal. But American mathematics was growing as fast as the MONTHLY, and the Society was already plagued by factional disputes between the Eastern establishment (the Ivy league schools) and the Midwest (led by Chicago). Slaught's request became a controversy. Should an organization dedicated to promoting mathematical research support a journal like the MONTHLY? Many, especially in the East (led by Osgood), thought it should not, and the AMS voted narrowly to give the MONTHLY a pat on the back rather than money.
A Century of Mathematics:
Through the Eyes of the Monthly,
Edited by John Ewing.
Mathematical Association of America,
Washington, DC, 1994, p. 4.
Zeta(2) Once Again
Author(s): Ralph M. Krause
Source: The American Mathematical Monthly, Vol. 121, No. 4 (April), pp. 353-354
Published by: Mathematical Association of America
Stable URL: http://www.jstor.org/stable/10.4169/amer.math.monthly.121.04.353 .
Zeta(2) Once Again
Ralph M. Krause
Abstract. This note provides a strikingly efficient evaluation of zeta(2).
An article in the January, 2012, MONTHLY [1] proved, in a manner that might have appealed to Euler, his famous result that $\zeta(2) = \sum_{1}^{\infty} 1/k^{2} = \pi^{2}/6$. The argument there suggested the following, which makes the same claim and resembles the sixth in [2].
The Taylor series for $\log(1 + z)$ converges on the unit circle $z = e^{i\theta}$ for $z \ne -1$. Thus
\[
\log(1 + z) = \sum_{1}^{\infty} (-1)^{k-1} z^{k}/k, \tag{1}
\]
and
\[
\log(1 + z^{-1}) = \sum_{1}^{\infty} (-1)^{k-1} z^{-k}/k. \tag{2}
\]
To be convinced that equation (1) holds on the circle of convergence ($z = -1$ excepted) and not merely inside it, we argue thus. For $z$ in the first quadrant, $\left|1/(1+z) - \sum_{0}^{N-1} (-z)^{k}\right| = |z^{N}/(1+z)| \le |z^{N}|$. Integrating $1/(1+z) - \sum_{0}^{N-1} (-z)^{k}$ and $|z^{N}|$ along a radius from $z = 0$ to $z = e^{i\theta}$ shows that $\left|\log(1+z) - \sum_{1}^{N} (-1)^{k-1} z^{k}/k\right| < 1/(N+1)$, establishing (1) on the portion of the unit circle lying in the first quadrant. (This is all we need below, although the preceding argument may be modified easily to use ever larger bounds than $1/(N+1)$ and prove convergence, though not uniform convergence, on the entire unit circle with the exception of the point $z = -1$.) (2) then follows, as the conjugate of (1).
Subtracting (2) from (1), still for $z = e^{i\theta}$ and $z \ne -1$,
\[
i\theta = \log(z) = \sum_{1}^{\infty} (-1)^{k-1} \left[z^{k} - z^{-k}\right]/k = 2i \sum_{1}^{\infty} (-1)^{k-1} \sin(k\theta)/k. \tag{3}
\]
Nowadays, one might verify that this is a Fourier series and let Parseval's formula finish the job. Although Euler was perhaps a century too early for Fourier analysis, he would have been willing to integrate the first and last expressions in (3) from 0 to $\pi/2$ after dividing by $i$. Making free use of the familiar observation that the even terms comprise 1/4 of the sum of the series of reciprocal squares, we arrive at
\[
\frac{\pi^{2}}{8} = \frac{3}{4} \sum_{1}^{\infty} \frac{1}{k^{2}}.
\]
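Both (3) and the termwise integration can be explored numerically. The sketch below is our own illustration with hypothetical function names. The first function evaluates a partial sum of the sine series in (3). The second sums the termwise integrals over $[0, \pi/2]$, using that $\sin(k\theta)/k$ integrates to $(1 - \cos(k\pi/2))/k^{2}$ there.

```python
from math import cos, pi, sin


def sine_series(theta, N):
    """Partial sum 2 * sum_{k=1}^{N} (-1)^(k-1) sin(k theta) / k of (3)."""
    return 2 * sum((-1) ** (k - 1) * sin(k * theta) / k for k in range(1, N + 1))


def integrated_series(N):
    """Termwise integral of the partial sum of (3) over [0, pi/2]."""
    return 2 * sum(
        (-1) ** (k - 1) * (1 - cos(k * pi / 2)) / k ** 2 for k in range(1, N + 1)
    )


theta, N = 1.0, 2000  # theta = 1 radian lies in the first quadrant
print(sine_series(theta, N), theta)
print(integrated_series(20000), pi ** 2 / 8)
```

By the estimate in the text, the first partial sum differs from $\theta$ by less than $2/(N+1)$ for $\theta$ in the first quadrant, and the second quantity approaches $\pi^{2}/8$.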
http://dx.doi.org/10.4169/amer.math.monthly.121.04.353
MSC: Primary 11M06
The brevity here legitimately ignores the fact that the sums in (1)–(3) converge only conditionally. What is essential is that they converge uniformly on the interval of integration $[0, \pi/2]$, and this they do by the estimate made in paragraph 2 above. Thus
/2
0
2
N
1
(1)
k1
sin(k)/k
[…] $f'(a)$, and that $j \ge 2\tau + 1$. If $b \equiv a \pmod{p^{j-\tau}}$, then $f(b) \equiv f(a) \pmod{p^{j}}$ and $p^{\tau} \,\|\, f'(b)$. Moreover, there is a unique $t \pmod{p}$ such that $f(a + t p^{j-\tau}) \equiv 0 \pmod{p^{j+1}}$.
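The lifting step can be illustrated with a brute-force search over $t$. This is a sketch under assumptions: the exponent written `tau` below is taken to be the exact power of $p$ dividing $f'(a)$, as in the standard form of Hensel's lemma, and the polynomial $x^2 - 17$ with $p = 2$ is a hypothetical example, not one from the text.

```python
def lift_once(f, a, p, j, tau):
    """Return every t in range(p) with f(a + t*p**(j - tau)) ≡ 0 (mod p**(j + 1))."""
    step = p ** (j - tau)
    modulus = p ** (j + 1)
    return [t for t in range(p) if f(a + t * step) % modulus == 0]

def f(x):
    # Hypothetical example polynomial: f(x) = x^2 - 17, so f'(x) = 2x.
    return x * x - 17

# Assumed starting data: f(1) = -16 ≡ 0 (mod 2^4), 2^1 exactly divides f'(1) = 2,
# and j = 4 >= 2*tau + 1 = 3.
a, p, j, tau = 1, 2, 4, 1
for _ in range(3):
    ts = lift_once(f, a, p, j, tau)
    assert len(ts) == 1          # the lift is unique modulo p
    a += ts[0] * p ** (j - tau)
    j += 1
    print(f"root {a} modulo {p}^{j}")
```

Each pass finds exactly one admissible $t$, and the same `tau` is reused at every level, which mirrors the remark that the lifting may be repeated.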
As noted in [4, p. 89], since the hypotheses of the theorem apply with $a$ replaced by $a + t p^{j-\tau}$ and $\pmod{p^{j}}$ replaced by $\pmod{p^{j+1}}$ but with $\tau$ unchanged, the lifting may be repeated and continues indefinitely. This means that if the polynomial congruence is solvable to a sufficiently high power of $p$ (as defined in the Lemma), then it can be solved to all powers of $p$. We require one more lemma involving power congruences.
Lemma (see [4, p. 101). If p is a prime and gcd(a, p) = 1, then the congruence x
t
[…] $\binom{a_{r-1}}{r-1} + \cdots + \binom{a_1}{1}$. We give an alternate interpretation of this expansion, which also proves its uniqueness in an interesting manner.
The 1996 Iranian mathematical olympiad competition contained the following problem. For natural numbers $n$ and $r$, there is a unique expansion
\[ n = \binom{a_r}{r} + \binom{a_{r-1}}{r-1} + \cdots + \binom{a_1}{1} \]
with each $a_i$ an integer and $a_r > a_{r-1} > \cdots > a_1 \ge 0$.
The existence is fairly easy to prove using the greedy algorithm. This expansion is sometimes known as the Macaulay expansion. However, the following alternate interpretation does not seem to be well known; it gives uniqueness in an interesting manner. In what follows, the following well-known convention is used: the binomial coefficient $\binom{n}{r}$ is equated to $0$ if $n < r$.
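The greedy existence argument can be sketched in a few lines. This is an illustration only: the function name is hypothetical, and the code assumes the convention just stated, which `math.comb` happens to follow (it returns 0 when the lower index exceeds the upper one).

```python
from math import comb

def macaulay_expansion(n: int, r: int) -> list[int]:
    """Greedy expansion n = C(a_r, r) + ... + C(a_1, 1); returns [a_r, ..., a_1]."""
    upper = []
    remaining = n
    for i in range(r, 0, -1):
        a = i - 1                       # comb(i - 1, i) == 0 is always admissible
        while comb(a + 1, i) <= remaining:
            a += 1                      # take the largest a with C(a, i) <= remaining
        upper.append(a)
        remaining -= comb(a, i)
    return upper

print(macaulay_expansion(12, 3))
print(macaulay_expansion(74, 3))
```

Choosing the largest admissible $a_i$ at each step forces the next upper index to be strictly smaller, since the remainder is then less than $\binom{a_i+1}{i} - \binom{a_i}{i} = \binom{a_i}{i-1}$.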
For each natural number $r$, denote by $S_r$ the set of all $r$-digit numbers in some base $b$ whose digits are in strictly decreasing order of size. Evidently, $S_r$ is nonempty if and only if $b \ge r$; in this case, $S_r$ has $\binom{b}{r}$
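[…] $\binom{b}{r}$ […] $\binom{a_r}{r} + \binom{a_{r-1}}{r-1} + \cdots + \binom{a_1}{1}$. In particular, for each $n$, the Diophantine equation $\binom{a_r}{r} + \binom{a_{r-1}}{r-1} + \cdots + \binom{a_1}{1}$ […] $\binom{b}{3}$ […] $\binom{5}{3} + \binom{2}{2} + \binom{1}{1} = 12$.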
http://dx.doi.org/10.4169/amer.math.monthly.121.04.359
MSC: Primary 05A10
(ii) Let $r = 3$, $n = 74$. We may take $b = 10$ as $\binom{10}{3}$ […] $\binom{8}{3} + \binom{6}{2} + \binom{3}{1} = 74$.
Proof of theorem. First of all, we notice that the number of members in $S_r$ that have first digit $< m$ equals $\binom{m}{r}$ […]
\[ \binom{a_r}{r} + \binom{a_{r-1}}{r-1} + \cdots + \binom{a_2}{2} + \binom{a_1}{1}. \]
Therefore, the number of members of $S_r$ occurring prior to the $(n+1)$th member above (which must be $n$) is
\[ \binom{a_r}{r} + \binom{a_{r-1}}{r-1} + \cdots + \binom{a_1}{1}. \]
This proves our result.
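The counting interpretation can be tested by brute force for a small base. The sketch below assumes that the digits of the $(n+1)$th member of $S_r$, listed in increasing order, are the upper indices $a_r, \ldots, a_1$ of the expansion of $n$; the helper name is illustrative.

```python
from itertools import combinations
from math import comb

def decreasing_digit_strings(b: int, r: int) -> list[tuple[int, ...]]:
    """All r-digit strings in base b with strictly decreasing digits, in increasing order."""
    return sorted(tuple(sorted(c, reverse=True)) for c in combinations(range(b), r))

members = decreasing_digit_strings(10, 3)
for n, digits in enumerate(members):
    r = len(digits)
    # digits = (a_r, ..., a_1); the digit at position i carries lower index r - i
    assert sum(comb(a, r - i) for i, a in enumerate(digits)) == n

print(len(members), members[12], members[74])
```

For base 10 and $r = 3$ the list has $\binom{10}{3} = 120$ members, and the entries at positions 12 and 74 (counting from 0) carry the digits of the expansions of 12 and 74.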
Remark. We may proceed in a slightly different direction, if we do not use the first observation in the proof. For any $k$, we can obtain by induction that the number of elements in $S_k$ starting with some $a$ is $\binom{a}{k-1}$ […]
\[ \binom{n}{r} = \sum_{m=1}^{n-1} \binom{m}{r-1}, \]
which is itself seen by induction on $n$.
ACKNOWLEDGMENTS. We are indebted to the referee for a number of constructive suggestions. In particular, she/he drew attention to a simple way to count something for which we gave a roundabout argument as remarked above. The referee's suggestions to add some illuminating examples and to make the uniqueness argument transparent are well appreciated.
Stat-Math Unit, Indian Statistical Institute, 8th Mile Mysore Road, Bangalore 560059, India
sury@isibang.ac.in
Evaluating Lebesgue Integrals Efficiently with the FTC
Author(s): J. J. Koliha
Source: The American Mathematical Monthly, Vol. 121, No. 4 (April), pp. 361-364
Published by: Mathematical Association of America
Stable URL: http://www.jstor.org/stable/10.4169/amer.math.monthly.121.04.361 .
Abstract. This note addresses evaluation of Lebesgue integrals on the real line using the Fundamental Theorem of Calculus, without having to verify that the primitive is absolutely continuous.
The Fundamental Theorem of Calculus (FTC) provides an efficient method for the evaluation of Lebesgue integrals on real intervals, but only if we can find an absolutely continuous primitive (antiderivative) to the integrand. However, checking absolute continuity can be quite difficult. In this note, we give examples of evaluation of integrals that require only continuity of the primitive. Here is a version of Lebesgue's FTC extended to a possibly unbounded interval.
Lebesgue's FTC. Let $-\infty \le a < b \le \infty$. Let $F : (a, b) \to \mathbb{C}$ be absolutely continuous on $(a, b)$ and let $F'$ […]
\[ \int_a^b f(t)\,dt = F(b-) - F(a+). \]
It may seem that with the absolute continuity of $F$, the hypothesis that $f$ is Lebesgue integrable is redundant. Alas, no: The notorious function
\[ F(t) := \operatorname{Si}(t) = \int_0^t \frac{\sin x}{x}\,dx, \quad t > 0, \]
shows the error of our ways [2, Example 14.17]. The absolute continuity of $F$ on $(0, \infty)$ follows from the Mean Value Theorem; $F(0+) = 0$ is clear and $F(\infty-) = \pi/2$ is well known. Yet the derivative $F'$ […]
t
0
sin x
x
dx = .
It is well known that on a compact interval, the integrability of f is indeed redundant
(see, for instance, [2, Theorem 14.7]).
The problem with application of Lebesgue's FTC can be seen in this situation. Suppose we know that $F'$ […] $(x) = 0$ almost everywhere. However, we cannot conclude that $F - G$ is constant.
http://dx.doi.org/10.4169/amer.math.monthly.121.04.361
MSC: Primary 26A42
In order to overcome this problem, we need to look at a different type of FTC,
one which is usually proved by methods outside the theory of Lebesgue integration.
A proof that stays strictly within the realm of the Lebesgue theory was given by the
author in this MONTHLY [1]. We recall three versions of this theorem, whose proofs
can be found in [1] and [2, Chapter 14].
Theorem 1 (see [1]). Let $-\infty \le a < b \le \infty$. Let $F : (a, b) \to \mathbb{C}$ be such that $F'(x) = f(x)$ for all $x \in (a, b)$, where $f : (a, b) \to \mathbb{C}$ is Lebesgue integrable on $(a, b)$. If the one-sided limits $F(a+)$ and $F(b-)$ exist, then
\[ \int_a^b f(t)\,dt = F(b-) - F(a+). \]
Even if we tighten the hypotheses to assume that $F$ has a derivative on a compact interval $[a, b]$ (with one-sided derivatives at the end points), the integrability of $f$ cannot be dropped due to a possible blowout of the positive and negative oscillation of $f$. To see this, define
\[ F(t) = t^2 \cos^2 \frac{1}{t^2} \ \text{ if } t \ne 0 \quad \text{and} \quad F(0) = 0, \]
and
\[ f(t) = F'(t) \ \text{ if } t \ne 0 \quad \text{and} \quad f(0) = 0. \]
But $f$ is not Lebesgue integrable on $[0, 1]$, since $\int_{\varepsilon}^{1} |f(t)|\,dt \to \infty$ as $\varepsilon \to 0+$. (See [2, Example 14.15] for details.)
Theorem 2 (see [1]). Let $-\infty \le a < b \le \infty$. Let $F : (a, b) \to \mathbb{C}$ be continuous on $(a, b)$ and let $F'$ […]
\[ \int_a^b f(t)\,dt = F(b-) - F(a+). \]
The expression "nearly everywhere" means everywhere except for a countable set. If $F$ is continuous on $(a, b)$, $F'$ […]
\[ \int_a^b f(t)\,dt := F(b-) - F(a+). \]
Theorem 1 enables us to calculate the integral $\int_0^1 t^{-1/2}\,dt$ by observing that $F(t) = 2t^{1/2}$ is a primitive for the integrand $f(t) = t^{-1/2}$ everywhere in $(0, 1)$, that $F(0+) = 0$ and $F(1-) = 2$, but we have to know that $f$ is Lebesgue integrable on $(0, 1)$. For this we can use, for instance, the Monotone Convergence Theorem applied to the truncations $f_n = \min(f, n)$ of $f$. But this does not seem to be the most efficient way to do it; we would like to conclude the integrability of $f$ directly from the existence of the Newton integral. For this we need to consider absolute Newton integrability. We say that a function $f : (a, b) \to \mathbb{C}$ is absolutely Newton integrable if the Newton integral exists for both $f$ and $|f|$ (where $|f|$ is real valued and nonnegative). Here is the desired theorem.
Theorem 3 (see [1]). Let $-\infty \le a < b \le \infty$. Let $f : (a, b) \to \mathbb{C}$ be absolutely Newton integrable on $(a, b)$. Then $f$ is Lebesgue integrable on $(a, b)$, and
\[ \int_a^b f(t)\,dt = (N)\int_a^b f(t)\,dt. \]
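The two routes to $\int_0^1 t^{-1/2}\,dt$ mentioned before Theorem 3 can be compared numerically. This is a rough sketch, not part of the note: it uses a plain midpoint rule (an assumption of convenience) for the truncations $f_n = \min(f, n)$, whose integrals work out to $2 - 1/n$, and evaluates the primitive $F(t) = 2t^{1/2}$ at the endpoints.

```python
def midpoint_integral(g, a: float, b: float, cells: int = 100_000) -> float:
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / cells
    return h * sum(g(a + (i + 0.5) * h) for i in range(cells))

def f(t: float) -> float:
    return t ** -0.5

def F(t: float) -> float:
    return 2.0 * t ** 0.5

def truncated_integral(n: int) -> float:
    """Integral over (0, 1) of the truncation f_n = min(f, n)."""
    return midpoint_integral(lambda t: min(f(t), n), 0.0, 1.0)

# F extends continuously to both endpoints, so F(1-) - F(0+) = F(1) - F(0).
newton_value = F(1.0) - F(0.0)

for n in (1, 10, 100):
    print(n, truncated_integral(n), 2.0 - 1.0 / n)
print("F(1-) - F(0+) =", newton_value)
```

The truncated integrals increase toward the value $2$ obtained directly from the primitive, which is the monotone convergence argument in miniature.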
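The readers can hone their skills by evaluating the following integrals using Theorems 1, 2, or 3.
Example 1. Evaluate Lebesgue integrals efficiently:
\[ \text{(i)} \ \int_0^1 \left( \frac{1}{t^{3/4}} + i \log t \right) dt, \qquad \text{(ii)} \ \int_1^2 \frac{t - 3i}{t + 2i}\,dt, \qquad \text{and (iii)} \ \int_0^{\infty} \frac{dt}{(2t + i)^3}. \]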
So far, the substantial power hidden in Theorems 2 and 3 has not been fully utilized,
namely the fact that the derivative of the continuous function F may exist only nearly
everywhere. We illustrate this in the following examples.
Example 2. Let $f : (0, 1) \to \mathbb{C}$ be the function defined by $f(t) = 0$ if $t$ is rational, and $f(t) = \log t + i\,t^{-4/5}$ otherwise. Let $\chi$ be the characteristic function of $(0, 1) \setminus \mathbb{Q}$. Then $F_1(t) = t \log t - t$ and $F_2(t) = 5t^{1/5}$ are generalized primitives to $f_1(t) = \chi(t) \log t$ and $f_2(t) = \chi(t)\,t^{-4/5}$ on $(0, 1)$, respectively. Further, $F_1(1-) = -1$, $F_1(0+) = 0$, $F_2(1-) = 5$, and $F_2(0+) = 0$. Both $f_1$ and $f_2$ are absolutely Newton integrable as they do not change sign on $(0, 1)$. By Theorem 3, $f$ is Lebesgue integrable with
\[ \int_0^1 f = \int_0^1 f_1 + i \int_0^1 f_2 = -1 + 5i. \]
(Note that by splitting the real and imaginary parts of $f$, we avoided the need for finding a generalized primitive for $|f(t)| = \chi(t)(\log^2 t + t^{-8/5})^{1/2}$. This is not always the most efficient maneuver; see Example 1 (iii).)
Example 3. Let $f$ be defined on the interval $(0, 1)$ by
\[ f(x) = \frac{1}{\sqrt{(n+2)\{(n+1)x - n\}}} \quad \text{if } x_n < x \le x_{n+1}, \ n = 0, 1, 2, \ldots \]
where $x_n = n/(n+1)$, $n = 0, 1, 2, \ldots$. First sketch a graph of $f$; it reveals infinitely many vertical asymptotes at the points $x_n$, $n = 0, 1, 2, \ldots$, neatly clustering near $x = 1$. On each interval $(x_n, x_{n+1})$, a primitive to $f$ is
\[ F(x) = \frac{2}{(n+1)\sqrt{n+2}} \sqrt{(n+1)x - n} + c_n, \quad x \in (x_n, x_{n+1}). \]
The constants of integration $c_n$ must be chosen wisely to make $F$ continuous on $(0, 1)$. From $F(x_n-) = F(x_n+)$, we obtain $c_n = 2/[n(n+1)] + c_{n-1}$. Choosing $c_0 = 0$, we get
\[ c_n = 2 \sum_{k=1}^{n} \frac{1}{k(k+1)} = 2 \sum_{k=1}^{n} \left( \frac{1}{k} - \frac{1}{k+1} \right) = 2 \left( 1 - \frac{1}{n+1} \right) = \frac{2n}{n+1}, \]
$n = 0, 1, 2, \ldots$. Setting $F(x_n) = F(x_n-) = F(x_n+)$ for $n = 1, 2, 3, \ldots$, we make $F$ continuous on $(0, 1)$, but the derivatives $F'(x_n)$ fail to exist for $n = 1, 2, 3, \ldots$. As the integrand is nonnegative, its Newton integrability implies absolute Newton integrability. Clearly, $F(0+) = 0$. Further, $F$ is increasing on $(0, 1)$ being continuous there and having a positive derivative nearly everywhere in $(0, 1)$ (see [2, Theorem B25]). Also, $F$ is bounded on $(0, 1)$ as on each interval $(x_n, x_{n+1}]$ we have $F(x_{n+1}) = 2x_{n+1} \le 2$. Hence, the limit $F(1-)$ exists and is equal to $\lim_{n \to \infty} F(x_{n+1}) = 2$. Thus, $\int_0^1 f(x)\,dx = F(1-) - F(0+) = 2$. We note that $f$ is not improperly Riemann integrable.
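The bookkeeping for the constants $c_n$ in Example 3 can be verified mechanically. The sketch below assumes the piecewise primitive $2\sqrt{(n+1)x - n}\,/\,((n+1)\sqrt{n+2}) + c_n$ on $(x_n, x_{n+1}]$; the helper names are illustrative.

```python
from fractions import Fraction
from math import sqrt

def c(n: int) -> Fraction:
    """c_n from the recursion c_n = 2/(n(n+1)) + c_{n-1} with c_0 = 0, in exact arithmetic."""
    total = Fraction(0)
    for k in range(1, n + 1):
        total += Fraction(2, k * (k + 1))
    return total

def F_piece(n: int, x: float) -> float:
    """Value of the primitive on the n-th piece (x_n, x_{n+1}], where x_n = n/(n+1)."""
    return 2.0 * sqrt((n + 1) * x - n) / ((n + 1) * sqrt(n + 2)) + float(c(n))

for n in range(5):
    x_next = (n + 1) / (n + 2)
    # Right end of piece n versus the closed form 2 * x_{n+1}
    print(n, c(n), F_piece(n, x_next), 2 * x_next)
```

The printed values illustrate that $c_n = 2n/(n+1)$ and that $F(x_{n+1}) = 2x_{n+1}$, so the values at the breakpoints increase toward $2$.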
Example 4. A striking example of a Lebesgue integrable function that is not improperly Riemann integrable and that has a vertical asymptote at each rational point of the interval $[0, 1]$ is given by Richardson in [3, Example 5.44]:
\[ f(x) = \sum_{k=1}^{\infty} 2^{-k} |x - q_k|^{-1/2}, \]
where $(q_k)$ is a sequence containing all rational numbers in $[0, 1]$. Write $f_k(x) = 2^{-k} |x - q_k|^{-1/2}$ for $x \in [0, 1] \setminus \{q_k\}$, $k = 1, 2, \ldots$. Then $f_k$ is absolutely Newton integrable with a generalized primitive $F_k(x) = 2^{-k+1} \operatorname{sgn}(x - q_k) |x - q_k|^{1/2}$ in $[0, 1]$, and the integral
\[ (N)\int_0^1 f_k = F_k(1-) - F_k(0+) = 2^{-k+1} \left( (1 - q_k)^{1/2} + q_k^{1/2} \right). \]
By Theorem 3, this is also the Lebesgue integral of $f_k$. We have
\[ \sigma := \sum_{k=1}^{\infty} \int_0^1 |f_k(t)|\,dt = \sum_{k=1}^{\infty} 2^{-k+1} \left( (1 - q_k)^{1/2} + q_k^{1/2} \right) < \infty. \]
By the term-by-term integration of series [2, Theorem 13.35], $f = \sum_k f_k$ converges almost everywhere in $[0, 1]$, is Lebesgue integrable, and $\int_0^1 f(t)\,dt = \sigma$.
ACKNOWLEDGMENT. I would like to thank the referees for their comments, which led to improved pre-
sentation of this note.
REFERENCES
1. J. J. Koliha, A fundamental theorem of calculus for Lebesgue integration, Amer. Math. Monthly 113 (2006) 551–555.
2. J. J. Koliha, Metrics, Norms and Integrals: An Introduction to Contemporary Analysis. World Scientific Publishing, Singapore, 2008.
3. L. F. Richardson, Measure and Integration: A Concise Introduction to Real Analysis. John Wiley, New York, 2009.
The University of Melbourne, Melbourne VIC 3010, Australia
koliha@unimelb.edu.au
Problems and Solutions
Source: The American Mathematical Monthly, Vol. 121, No. 4 (April), pp. 365-372
Published by: Mathematical Association of America
Stable URL: http://www.jstor.org/stable/10.4169/amer.math.monthly.121.04.365 .
PROBLEMS AND SOLUTIONS
Edited by Gerald A. Edgar, Doug Hensley, Douglas B. West
with the collaboration of Itshak Borosh, Paul Bracken, Ezra A. Brown, Randall Dougherty, Tamás Erdélyi, Zachary Franco, Christian Friesen, Ira M. Gessel, László Lipták, Frederick W. Luttmann, Vania Mascioni, Frank B. Miles, Richard Pfiefer, Dave Renfro, Cecil C. Rousseau, Leonard Smiley, Kenneth Stolarsky, Richard Stong, Walter Stromquist, Daniel Ullman, Charles Vanden Eynden, Sam Vandervelde, and Fuzhen Zhang.
Proposed problems and solutions should be sent in duplicate to the MONTHLY
problems address on the back of the title page. Proposed problems should never
be under submission concurrently to more than one journal. Submitted solutions
should arrive before August 31, 2014. Additional information, such as generalizations and references, is welcome. The problem number and the solver's name and address should appear on each solution. An asterisk (*) after the number of
a problem or a part of a problem indicates that no solution is currently available.
PROBLEMS
11768. Proposed by Ovidiu Furdui, Technical University of Cluj-Napoca, Cluj-Napoca, Romania. Let $f$ be a bounded continuous function mapping $[0, \infty)$ to itself. Find
\[ \lim_{n \to \infty} n \left( \sqrt[n]{\int_0^{\infty} f^{n+1}(x) e^{-x}\,dx} - \sqrt[n]{\int_0^{\infty} f^{n}(x) e^{-x}\,dx} \right). \]
11769. Proposed by Pál Péter Dályay, Szeged, Hungary. Let $a_1, \ldots, a_n$ and $b_1, \ldots, b_n$ be positive real numbers. Show that
\[ \left( \sum_{j=1}^{n} \frac{a_j}{b_j} \right)^2 - 2 \sum_{j,k=1}^{n} \frac{a_j a_k}{(b_j + b_k)^2} \le 2 \left( \sum_{j,k=1}^{n} \frac{a_j a_k}{b_j + b_k} \sum_{l,m=1}^{n} \frac{a_l a_m}{(b_l + b_m)^3} \right)^{1/2}. \]
11770. Proposed by Spiros P. Andriopoulos, Third High School of Amaliada, Eleia, Greece. Prove, for real numbers $a$, $b$, $x$, $y$ with $a > b > 1$ and $x > y > 1$, that
\[ \frac{a^x - b^y}{x - y} > \left( \frac{a + b}{2} \right)^{(x+y)/2} \log \left( \frac{a + b}{2} \right). \]
11771. Proposed by D. M. Bătineţu-Giurgiu, Matei Basarab National College, Bucharest, Romania, and Neculai Stanciu, George Emil Palade School, Buzău, Romania. Let $n!! = \prod_{i=0}^{\lfloor (n-1)/2 \rfloor} (n - 2i)$. Find
\[ \lim_{n \to \infty} \left( \sqrt[n]{(2n-1)!!} \left( \tan \frac{\pi \sqrt[n+1]{(n+1)!}}{4 \sqrt[n]{n!}} - 1 \right) \right). \]
http://dx.doi.org/10.4169/amer.math.monthly.121.04.365
11772. Proposed by Mircea Merca, University of Craiova, Craiova, Romania. Let n
be a positive integer. Prove that the number of integer partitions of 2n +1 that do not
contain 1 as a part is less than or equal to the number of integer partitions of 2n that
contain at least one odd part.
11773. Proposed by Moubinool Omarjee, Lycée Henri IV, Paris, France. Given a positive real number $a_0$, let $a_{n+1} = \exp\left( -\sum_{k=0}^{n} a_k \right)$ for $n \ge 0$. For which values of $b$ does $\sum_{n=0}^{\infty} (a_n)^b$ converge?
11774. Proposed by Yunus Tuncbilek, Ataturk High School of Science, Istanbul, Turkey
and Danny Lee, Herkimer Senior High School, NY, NY. Let be the circumscribed
circle of triangle ABC. The A-mixtilinear incircle of ABC and is the circle that is
internally tangent to , AB, and AC, and similarly for B and C. Let A
, P
B
, and P
C
be
the points on , AB, and AC, respectively, at which the A-mixtilinear incircle touches.
Dene B
and C
O
O
A
B
P
B
B and CP
C
B
are similar.
SOLUTIONS
The Lenstra Constant of a Ring
11628 [2012, 162]. Proposed by Jeffrey C. Lagarias and Michael E. Zieve, University of Michigan, Ann Arbor, MI. Define the Lenstra constant $L(R)$ of a commutative ring $R$ to be the size of the largest subset $A$ of $R$ such that $a - b$ is a unit (invertible element) in $R$ for any distinct elements $a, b \in A$. Show that for each positive integer $N$, the Lenstra constant of the ring $\mathbb{Z}(1/N)$ is the least prime that does not divide $N$.
Solution by Mark D. Meyerson, United States Naval Academy, Annapolis, MD. The elements of $\mathbb{Z}(1/N)$ are the numbers of the form $k/N^r$ with $k, r \in \mathbb{Z}$. Let $p_1^{e_1} \cdots p_m^{e_m}$ be the prime factorization of $N$; each $e_i$ is a positive integer. The units in $\mathbb{Z}(1/N)$ are numbers of the form $\pm p_1^{d_1} \cdots p_m^{d_m}$ with each $d_i \in \mathbb{Z}$. Let $p$ be the least prime that does not divide $N$. The set $\{1, \ldots, p\}$ has the property that any difference of two distinct elements is a unit, since any prime factor of such a difference is a prime factor of $N$. Hence, $L(\mathbb{Z}(1/N)) \ge p$.
Now, let $L$ be a subset of $\mathbb{Z}(1/N)$ such that any nonzero difference is a unit, and suppose that $|L| > p$. By deleting extra elements, we may assume $|L| = p + 1$. If we multiply the $p + 1$ elements of $L$ by a sufficiently high power of $N$ to make all the elements integers, the nonzero differences will still be units. However, by the pigeonhole principle, two of the $p + 1$ elements are congruent mod $p$. Their difference is a multiple of $p$ and hence is not a unit. It follows that $L(\mathbb{Z}(1/N)) \le p$. The two inequalities prove that $p$ is the Lenstra constant of this ring.
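The lower-bound half of the argument is easy to test for small $N$. The sketch below checks only that all differences in $\{1, \ldots, p\}$ are units; it assumes the characterization used in the solution (a nonzero integer is a unit exactly when all of its prime factors divide $N$), and the helper names are illustrative. It does not replace the pigeonhole argument for the upper bound.

```python
from math import gcd, isqrt

def least_prime_not_dividing(N: int) -> int:
    """Smallest prime p with p not dividing N."""
    p = 2
    while True:
        is_prime = all(p % q for q in range(2, isqrt(p) + 1))
        if is_prime and N % p != 0:
            return p
        p += 1

def is_unit(d: int, N: int) -> bool:
    """True if the integer d is a unit after inverting N, i.e. every prime factor of d divides N."""
    d = abs(d)
    if d == 0:
        return False
    g = gcd(d, N)
    while g > 1:            # strip from d every prime it shares with N
        d //= g
        g = gcd(d, N)
    return d == 1

def differences_are_units(N: int) -> bool:
    p = least_prime_not_dividing(N)
    return all(is_unit(a - b, N) for a in range(1, p + 1) for b in range(1, a))

for N in (1, 2, 6, 12, 30, 210):
    print(N, least_prime_not_dividing(N), differences_are_units(N))
```

Extending the set to $\{1, \ldots, p+1\}$ introduces the difference $p$, which `is_unit` rejects, in line with the upper bound.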
Also solved by P. Budney, N. Caro (Brazil), R. Chapman (U. K.), W. Chengyuan (Singapore), P. P. D alyay
(Hungary), S. Dey (India), D. Fleischman, O. Geupel (Germany), Y. J. Ionin, B. Karaivanov, J. H. Lindsey II,
O. Lossers (Netherlands), A. Magidin, G. Martin (Canada), M. A. Prasad (India), F. Richman, J. Riegsecker,
K. Schilling, J. H. Smith, J. H. Steelman, R. Stong, M. Tetiva (Romania), Colgate University Problem Solving
Group, NSA Problems Group, TCDmath Problems Group (Ireland), Texas State University Problem Solving
Group, University of Louisiana at Lafayette Math Club, and the proposers.
Rotatable Quasigroups
11631 [2012, 247–248]. Proposed by Pál Péter Dályay, Szeged, Hungary. A quasigroup $(Q, *)$ is a set $Q$ together with a binary operation $*$ such that for each $a, b \in Q$ there exist unique $x$ and unique $y$ (which may be equal) such that $a * x = b$ and $y * a = b$. The Cayley table of a finite quasigroup is its "times table". A quasigroup has property P if each row of the table is a rotation of the first row.
Find all positive integers $n$ for which there exists a quasigroup $(\{1, \ldots, n\}, *)$ with property P in which all elements are idempotent. (For instance, the Cayley table below defines a binary operation on $\{1, \ldots, 5\}$ with property P in which each element is idempotent.)
* 1 2 3 4 5
1 1 5 4 3 2
2 3 2 1 5 4
3 5 4 3 2 1
4 2 1 5 4 3
5 4 3 2 1 5
Solution by Fred Richman, Florida Atlantic University, Boca Raton, FL. Such quasigroups exist if and only if $n$ is odd. Cayley tables are just Latin squares; idempotence requires diagonal $1, \ldots, n$ in order. The table is then determined by its first row and property P. The problem is thus to find a permutation of $1, \ldots, n$ as the first row so that the entries in the first column are distinct, since property P then completes a Latin square for the table.
We calculate the first entry in row $1 * k$. This row is a rotation of row 1, and it must have $1 * k$ in column $1 * k$. Also row 1 has $1 * k$ in column $k$, so row 1 is rotated leftward by $k - (1 * k)$ positions to become row $1 * k$. Thus, the first entry in row $1 * k$ is $1 * [k - (1 * k) + 1]$. For these values to be distinct, the values $k - (1 * k)$ must be distinct modulo $n$.
When $n$ is odd, 2 is invertible (modulo $n$). Setting $1 * k \equiv 2 - k$ as in the proposer's example yields $k - (1 * k) \equiv 2(k - 1)$, and these elements are distinct (modulo $n$).
When $n$ is even, the values $k - (1 * k)$ cannot be distinct (modulo $n$) because $\sum_{i=1}^{n} i = (n+1)n/2 \equiv n/2 \pmod{n}$ and $\sum_{k=1}^{n} (k - (1 * k)) = 0$.
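The construction can be carried out by machine for small $n$. This is a sketch with illustrative function names: it builds the table from a first row by rotating it so that each row $m$ has $m$ on the diagonal, uses the first row $1 * k \equiv 2 - k \pmod{n}$ for odd $n$, and, as an assumption of convenience, searches all first rows by brute force for small even $n$.

```python
from itertools import permutations

def table_from_first_row(first):
    """Rows are rotations of `first`, chosen so that row m has m in column m (idempotence)."""
    n = len(first)
    table = []
    for m in range(1, n + 1):
        k = first.index(m)       # 0-based column of m in row 1
        shift = k - (m - 1)      # leftward rotation that moves m onto the diagonal
        table.append([first[(c + shift) % n] for c in range(n)])
    return table

def is_latin(table):
    n = len(table)
    full = set(range(1, n + 1))
    rows_ok = all(set(row) == full for row in table)
    cols_ok = all({table[i][j] for i in range(n)} == full for j in range(n))
    return rows_ok and cols_ok

def proposed_first_row(n):
    """First row with 1 * k ≡ 2 - k (mod n), residues written as 1..n."""
    return [((1 - k) % n) + 1 for k in range(1, n + 1)]

def exists_by_brute_force(n):
    """Try every first row starting with 1 (needed for 1 * 1 = 1)."""
    return any(is_latin(table_from_first_row([1, *rest]))
               for rest in permutations(range(2, n + 1)))

for row in table_from_first_row(proposed_first_row(5)):
    print(row)
print([n for n in range(2, 7) if exists_by_brute_force(n)])
```

For $n = 5$ the output reproduces the Cayley table in the problem statement, and the brute-force search over $n = 2, \ldots, 6$ finds tables only for odd $n$.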
Editorial comment. When $n$ is odd, one can require even more: There are many idempotent commutative quasigroups on $\mathbb{Z}_n$, such as by putting $(i + j)/2$ in position $(i, j)$, using the uniqueness of the multiplicative inverse of 2. This construction for $n = 2k + 1$ is used in the Bose construction of a Steiner triple system on $6k + 3$ elements (R. C. Bose, On the construction of balanced incomplete block designs, Ann. Eugenics 9 (1939), 353–399).
Also solved by D. Beckwith, R. Chapman (U. K.), S. M. Gagola Jr., O. Geupel (Germany), A. Habil (Syria),
E. A. Herman, Y. J. Ionin, B. Karaivanov, J. H. Lindsey II, J. M. Lockhart, O. P. Lossers (Netherlands),
C. R. Pranesachar (India), R. E. Prather, J. H. Steelman, R. Stong, J. Wojdylo, Colgate University Problem
Solving Group, GCHQ Problem Solving Group (U. K.), TCDmath Problem Group (Ireland), and the proposer.
A Harmonic Identity
11633 [2012, 248]. Proposed by Anthony Sofo, Victoria University, Melbourne, Australia. For real $a$, let $H_n^{(a)} = \sum_{j=1}^{n} j^{-a}$. Show that for integers $a$, $b$, and $n$ with $a \ge 1$, $b \ge 0$, and $n \ge 1$,
\[ \sum_{k=1}^{n} \frac{k\left(H_k^2 + H_k^{(2)}\right) + 2(k+b)^a H_k^{(1)} H_{k+b-1}^{(a)}}{k(k+b)^a} = H_{n+b}^{(a)} \left( H_n^2 + H_n^{(2)} \right). \]
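Solution by Subhadip Dey, Bangalore City, Karnataka, India. As in the problem, we use the notation $H_n = H_n^{(1)}$ and $H_0^{(a)} = 0$. Using the identities
\[ H_n^2 + H_n^{(2)} = 2 \sum_{j=1}^{n} \sum_{i=1}^{j} \frac{1}{ij} = 2 \sum_{j=1}^{n} \frac{H_j}{j} \quad \text{and} \quad \sum_{j=1}^{n} \sum_{i=1}^{j} a_i b_j = \sum_{i=1}^{n} \sum_{j=i}^{n} a_i b_j, \]
the first term on the left side of the identity becomes
\[ \sum_{k=1}^{n} \frac{H_k^2 + H_k^{(2)}}{(k+b)^a} = 2 \sum_{k=1}^{n} \sum_{j=1}^{k} \frac{H_j}{j(k+b)^a} = 2 \sum_{j=1}^{n} \frac{H_j}{j} \sum_{k=j}^{n} \frac{1}{(k+b)^a}. \]
Therefore, we compute
\[ \sum_{k=1}^{n} \frac{H_k^2 + H_k^{(2)}}{(k+b)^a} + 2 \sum_{k=1}^{n} \frac{H_k H_{k+b-1}^{(a)}}{k} = 2 \sum_{j=1}^{n} \frac{H_j}{j} \sum_{k=j}^{n} \frac{1}{(k+b)^a} + 2 \sum_{j=1}^{n} \frac{H_j}{j} H_{j+b-1}^{(a)} \]
\[ = 2 \sum_{j=1}^{n} \frac{H_j}{j} \left( \sum_{k=j}^{n} \frac{1}{(k+b)^a} + H_{j+b-1}^{(a)} \right) = 2 \sum_{j=1}^{n} \frac{H_j}{j} H_{n+b}^{(a)} = H_{n+b}^{(a)} \left( H_n^2 + H_n^{(2)} \right). \]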
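The identity lends itself to an exact check with rational arithmetic. The following sketch evaluates both sides for a small grid of parameters; the function names are illustrative, and `H(n, a)` stands for $H_n^{(a)}$ with the convention $H_0^{(a)} = 0$.

```python
from fractions import Fraction

def H(n: int, a: int = 1) -> Fraction:
    """Generalized harmonic number H_n^(a) = sum_{j=1}^n j^(-a); H(0, a) == 0."""
    return sum((Fraction(1, j**a) for j in range(1, n + 1)), Fraction(0))

def lhs(n: int, a: int, b: int) -> Fraction:
    total = Fraction(0)
    for k in range(1, n + 1):
        numerator = k * (H(k)**2 + H(k, 2)) + 2 * (k + b)**a * H(k) * H(k + b - 1, a)
        total += numerator / (k * (k + b)**a)
    return total

def rhs(n: int, a: int, b: int) -> Fraction:
    return H(n + b, a) * (H(n)**2 + H(n, 2))

print(lhs(1, 2, 1), rhs(1, 2, 1))
print(all(lhs(n, a, b) == rhs(n, a, b)
          for a in range(1, 4) for b in range(0, 4) for n in range(1, 9)))
```

Because the arithmetic is exact, equality of the two sides on this grid is a genuine (if finite) confirmation rather than a floating-point coincidence.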
Editorial comment. Several solvers noted that the identity is valid for all real $a$. E. A. Herman generalized it to
\[ \sum_{k=1}^{n} \frac{k\left(H_k^p + H_k^{(p)}\right) + Z_{k,p} (k+b)^a H_k H_{k+b-1}^{(a)}}{k(k+b)^a} = H_{n+b}^{(a)} \left( H_n^p + H_n^{(p)} \right), \]
where $p$ is a positive even integer and $Z_{k,p} = \sum_{j=1}^{p-1} \binom{p}{j} H_k^{j-1} \left( -\frac{1}{k} \right)^{p-1-j}$.
Also solved by P. Bracken, R. Chapman (U. K.), P. P. D alyay (Hungary), E. S. Eyeson, O. Geupel (Ger-
many), E. A. Herman, B. Karaivanov, O. Kouba (Syria), O. P. Lossers (The Netherlands), M. Omarjee (France),
C. R. Pranesachar (India), M. A. Prasad (India), J. H. Steelman, R. Stong, R. Tauraso (Italy), GCHQ Problem
Solving Group (U. K.), and the proposer.
A Fractional Integral
11637 [2012, 344]. Proposed by Ovidiu Furdui, Technical University of Cluj-Napoca, Cluj, Romania. Let $m \ge 1$ be a nonnegative integer. Let $\{u\} = u - \lfloor u \rfloor$; the quantity $\{u\}$ is called the fractional part of $u$. Prove that
\[ \int_0^1 \left\{ \frac{1}{x} \right\}^m x^m\,dx = 1 - \frac{1}{m+1} \sum_{k=1}^{m} \zeta(k+1). \]
(Here $\zeta$ denotes the Riemann zeta function.)
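Solution by Patrick J. Fitzsimmons, San Diego, CA. First note that $\left\{ \frac{1}{x} \right\} = \frac{1}{x} - n$ if $\frac{1}{n+1} < x \le \frac{1}{n}$. From this it follows that
\[ \int_0^1 \left\{ \frac{1}{x} \right\}^m x^m\,dx = \sum_{n=1}^{\infty} \int_{1/(n+1)}^{1/n} (1 - nx)^m\,dx = \sum_{n=1}^{\infty} \left[ -\frac{(1 - nx)^{m+1}}{n(m+1)} \right]_{1/(n+1)}^{1/n} = \frac{1}{m+1} \sum_{n=1}^{\infty} \frac{1}{n} \left( \frac{1}{n+1} \right)^{m+1}. \]
On the other hand, with $Z = \sum_{k=1}^{m} \zeta(k+1)$, we have
\[ Z = \sum_{k=1}^{m} \sum_{n=1}^{\infty} \frac{1}{n^{k+1}} = \sum_{n=1}^{\infty} \sum_{k=1}^{m} \frac{1}{n^{k+1}} = m + \sum_{n=2}^{\infty} \frac{\frac{1}{n^2} - \frac{1}{n^{m+2}}}{1 - \frac{1}{n}} = m + \sum_{n=2}^{\infty} \frac{1}{n(n-1)} \left( 1 - \frac{1}{n^m} \right) \]
\[ = m + \sum_{n=2}^{\infty} \frac{1}{n(n-1)} - \sum_{n=2}^{\infty} \frac{1}{(n-1) n^{m+1}} = m + 1 - \sum_{n=1}^{\infty} \frac{1}{n(n+1)^{m+1}}. \]
Thus both sides of the stated identity equal $\frac{1}{m+1} \sum_{n=1}^{\infty} \frac{1}{n(n+1)^{m+1}}$.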
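A numerical comparison of the three expressions is a useful sanity check. The sketch below is approximate by design: it truncates the infinite series at a fixed number of terms and uses a midpoint rule for the integral, both of which are assumptions of convenience, so agreement is expected only to a few decimal places.

```python
def series_side(m: int, terms: int = 200_000) -> float:
    """(1/(m+1)) * sum_{n>=1} 1 / (n (n+1)^(m+1)), truncated."""
    return sum(1.0 / (n * (n + 1) ** (m + 1)) for n in range(1, terms + 1)) / (m + 1)

def zeta_side(m: int, terms: int = 200_000) -> float:
    """1 - (1/(m+1)) * sum_{k=1}^m zeta(k+1), with each zeta value truncated."""
    zetas = sum(sum(1.0 / n ** (k + 1) for n in range(1, terms + 1))
                for k in range(1, m + 1))
    return 1.0 - zetas / (m + 1)

def integral_side(m: int, cells: int = 200_000) -> float:
    """Midpoint-rule estimate of the integral of {1/x}^m x^m over (0, 1)."""
    h = 1.0 / cells
    total = 0.0
    for i in range(cells):
        x = (i + 0.5) * h
        total += (((1.0 / x) % 1.0) * x) ** m
    return h * total

for m in (1, 2, 3):
    print(m, integral_side(m), series_side(m), zeta_side(m))
```

For $m = 1$ all three values are close to $1 - \zeta(2)/2 \approx 0.1775$.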
Editorial comment. A similar problem appeared as Problem 1845, Math. Mag., 84
(April 2011), 155156, and as Problem 11206, this MONTHLY 114 (2007), 928929.
Eugene A. Herman showed for a > m 1 that
_
1
0
_
1
x
_
m
x
a
dx =
1
a m +1
1
m +1
m
k=1
(k +1 m +1)
_
m+1
m+1k
_
_
a+1
m+1k
_.
Also solved by T. Amdeberhan, P. J. Anderson (Canada), M. Bataille (France), D. Beckwith, K. N. Boyadzhiev,
M. A. Carlton, N. Caro (Brazil), R. Chapman (U. K.), M. W. Coffey, C. Curtis, P. P. D alyay (Hungary),
E. S. Eyeson, D. Fleischman, O. Geupel (Germany), M. L. Glasser, M. Goldenberg & M. Kaplan, D. Gove,
G. C. Greubel, J.-P. Grivaux (France), J. A. Grzesik, E. A. Herman, E. Hysnelaj (Australia) & E. Bojaxhiu
(Germany), W. Janous (Austria), B. Karaivanov, D. R. Kim (Korea), O. Kouba (Syria), H. Kwong, J. B. Little,
O. P. Lossers (Netherlands), I. Mez o (Hungary), U. Milutinovi c (Slovenia), J. Minkus, R. Nandan, M. Omarjee
(France), P. Perfetti (Italy) T. Perrson & M. P. Sundqvist (Sweden), C. R. Pranesachar (India), M. A. Prasad
(India), R. Pratt, V. Sah, J. Schlosberg, N. C. Singer, A. Stenger, R. Stong, R. Tauraso (Italy), D. B. Tyler,
J. Vinuesa (Spain), T. Viteam (Uruguay), M. Vowe (Switzerland), A. Witkowski (Poland), J. Zacharias, GCHQ
Problem Solving Group (U. K.), Missouri State University Problem Solving Group, NSA Problems Group,
TCDmath Problem Group (Ireland), and the proposer.
Independent Triples in a Discrete Probability Space
11643 [2012, 426]. Proposed by Eugen J. Ionascu, Columbus State University, Columbus, GA. Let $r$ be a real number with $0 < r < 1$, and define a discrete probability measure $P$ on $\mathbb{N}$ by $P(k) = (1 - r) r^{k-1}$ for $k \ge 1$. Show that there are uncountably many triples $(A_1, A_2, A_3)$ of subsets of $\mathbb{N}$ that are mutually independent, that is, $P(A_i \cap A_j) = P(A_i) P(A_j)$ for $i \ne j$ and $P(A_1 \cap A_2 \cap A_3) = P(A_1) P(A_2) P(A_3)$.
Solution by Oliver Geupel, Brühl, NRW, Germany. Let $A_1 = \bigcup_{m \ge 0} \{4m+1, 4m+2\}$ and $A_2 = \bigcup_{m \ge 0} \{4m+1, 4m+3\}$. For any set $B$ of nonnegative integers, let $A_3 = \bigcup_{m \in B} \{4m+1, 4m+2, 4m+3, 4m+4\}$. Since $B$ is arbitrary, there are uncountably many such triples.
We show that the events $A_1$, $A_2$, $A_3$ are mutually independent. We have
\[ P(A_1) = (1 - r) \sum_{m=0}^{\infty} (r^{4m} + r^{4m+1}) = (1 - r) \frac{1 + r}{1 - r^4} = \frac{1}{1 + r^2}, \]
\[ P(A_2) = (1 - r) \sum_{m=0}^{\infty} (r^{4m} + r^{4m+2}) = (1 - r) \frac{1 + r^2}{1 - r^4} = \frac{1}{1 + r}, \quad \text{and} \]
\[ P(A_3) = (1 - r) \sum_{m \in B} (r^{4m} + r^{4m+1} + r^{4m+2} + r^{4m+3}) = (1 - r^4) \sum_{m \in B} r^{4m}. \]
Furthermore,
\[ P(A_1 \cap A_2) = (1 - r) \sum_{m=0}^{\infty} r^{4m} = \frac{1 - r}{1 - r^4} = P(A_1) P(A_2), \]
\[ P(A_1 \cap A_3) = (1 - r) \sum_{m \in B} (r^{4m} + r^{4m+1}) = (1 - r^2) \sum_{m \in B} r^{4m} = P(A_1) P(A_3), \]
\[ P(A_2 \cap A_3) = (1 - r) \sum_{m \in B} (r^{4m} + r^{4m+2}) = P(A_2) P(A_3), \]
and
\[ P(A_1 \cap A_2 \cap A_3) = (1 - r) \sum_{m \in B} r^{4m} = P(A_1) P(A_2) P(A_3). \]
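The independence relations can be confirmed numerically for a particular choice of $r$ and $B$. This sketch truncates $\mathbb{N}$ at a large cutoff (an approximation whose error is a negligible geometric tail) and uses a hypothetical finite set $B = \{0, 2, 5\}$; the names are illustrative.

```python
def prob(event, r: float, cutoff: int = 400) -> float:
    """P(event) under P(k) = (1 - r) r^(k-1), with the sum truncated at `cutoff`."""
    return sum((1 - r) * r ** (k - 1) for k in range(1, cutoff + 1) if event(k))

def make_events(B):
    in_A1 = lambda k: k % 4 in (1, 2)        # union of {4m+1, 4m+2}
    in_A2 = lambda k: k % 4 in (1, 3)        # union of {4m+1, 4m+3}
    in_A3 = lambda k: (k - 1) // 4 in B      # whole blocks {4m+1, ..., 4m+4} with m in B
    return in_A1, in_A2, in_A3

r = 0.5
A1, A2, A3 = make_events({0, 2, 5})
p1, p2, p3 = prob(A1, r), prob(A2, r), prob(A3, r)

print(p1, 1 / (1 + r**2))
print(prob(lambda k: A1(k) and A2(k), r), p1 * p2)
print(prob(lambda k: A1(k) and A3(k), r), p1 * p3)
print(prob(lambda k: A2(k) and A3(k), r), p2 * p3)
print(prob(lambda k: A1(k) and A2(k) and A3(k), r), p1 * p2 * p3)
```

Each printed pair should agree to machine precision, whatever finite set is substituted for `B`.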
Editorial comment. Many solvers noted that there are trivial solutions, such as $A_1 = A_2 = \mathbb{N}$ and $A_3$ arbitrary. The solution presented here demonstrates that the sets can be required to be nontrivial.
Solved also by M. Carlton, J. H. Lindsey II, M. D. Meyerson, M. Rajeswari (India), K. Schilling, R. Stong,
GCHQ Problem Solving Group (U. K.), and the proposer.
Factorable Polynomials
11645 [2012, 427]. Proposed by Christopher J. Hillar, University of California, Berkeley, CA, Lionel Levine, Cornell University, Ithaca, NY, and Darren Rhea, University of California, San Francisco, CA. Determine all positive integers $n$ such that the polynomial $g$ in two variables given by $g(x, y) = 1 + y^2 \sum_{k=1}^{n} x^{2k} + y^4 x^{2n+2}$ factors in $\mathbb{C}[x, y]$.
Solution by O. P. Lossers, Eindhoven University of Technology, Eindhoven, The Netherlands. For $n = 1$, $g$ has $x^2 y^2 - \omega$ as a factor, where $\omega$ is a primitive cube root of unity in $\mathbb{C}$. For $n = 2$, $g$ has $x^2 y^2 + 1$ as a factor. We claim that $g$ does not factor in $\mathbb{C}[x, y]$ when $n \ge 3$. Equivalently, we claim that $h$ does not factor in $\mathbb{C}[x, y]$ when $n \ge 3$, where $h(x, y) = y^4 + y^2 \sum_{k=1}^{n} x^{2k} + x^{2n+2}$.
First, we note that $h$ is a polynomial in $y^2$ over $\mathbb{C}[x]$, so if $h$ has a linear factor, necessarily of the form $y + a(x)$, then $y - a(x)$ is another linear factor and so $y^2 - a(x)^2$ is a quadratic factor of $h$.
If our claim is false, then $h$ factors as a product of two quadratic polynomials in $y$ over $\mathbb{C}[x]$, and such a factorization has the form
\[ h(x, y) = \left( y^2 + a(x) y + \varepsilon x^r \right) \left( y^2 - a(x) y + \varepsilon^{-1} x^s \right), \]
where $r$ and $s$ are nonnegative integers such that $r + s = 2n + 2$ and $a(x) \in \mathbb{C}[x]$. Inspecting the coefficient of $y$ shows that $\varepsilon x^r a(x) = \varepsilon^{-1} x^s a(x)$. Now, $a(x) = 0$ is impossible if $n \ge 3$, as the expression for $h(x, y)$ would not have enough terms. Therefore, $r = s = n + 1$ and $\varepsilon = \pm 1$.
Let $\phi$ be the polynomial given by $\phi(x) = \sum_{k=1}^{n} x^{2k}$. From the coefficient of $y^2$ in $h$, we see that $\phi = 2\varepsilon x^{n+1} - a^2(x)$, with $a$ of degree $n$. Writing $a = ib$ and $b = c(x^2) + x d(x^2)$ gives
\[ \phi = 2\varepsilon x^{n+1} + c^2(x^2) + 2x c(x^2) d(x^2) + x^2 d^2(x^2). \tag{1} \]
If $n$ is even, then equating odd parts in (1) gives $0 = 2\varepsilon x^{n+1} + 2x c(x^2) d(x^2)$, whence $c$ and $d$ must be monomials. But then the left side of (1) has $n$ terms while the right side has just two. So $n$ is odd, say $n = 2m + 1$.
In this case, from (1) it follows that $cd = 0$, and since $b(x) = c(x^2) + x d(x^2)$ has degree $2m + 1$, it must be $c$ that is 0 so that $b$ can have odd degree. Writing $z = x^2$, we thus have
\[ \sum_{k=1}^{2m+1} z^k = z d^2(z) + 2\varepsilon z^{m+1}, \tag{2} \]
where $d$ has the form $d = 1 + \sum_{j=1}^{m-1} d_j z^j + \delta z^m$ with $\delta \in \{1, -1\}$. The first $m$ terms of $d$ now coincide with those of $(1 - z)^{-1/2}$, so $d_j = (-1)^j \binom{-1/2}{j}$ for $0 \le j \le m - 1$. A similar calculation for the last $m$ coefficients of $d$ shows that $d_{m-j} = \delta d_j$ for $0 \le j \le m - 1$. But that gives contradictory values for $d_1$ when $m \ge 1$, so there is no factorization if $n > 2$, as claimed.
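The two small cases can be spot-checked numerically. The sketch below evaluates $g$ at a hypothetical sample point $x$ and at values of $y$ chosen so that $x^2 y^2$ equals a primitive cube root of unity (for $n = 1$) or $-1$ (for $n = 2$); vanishing there is what one expects if the corresponding quadratic in $xy$ divides $g$. It is an illustration, not a factorization algorithm.

```python
import cmath

def g(x: complex, y: complex, n: int) -> complex:
    """g(x, y) = 1 + y^2 * sum_{k=1}^n x^(2k) + y^4 * x^(2n+2)."""
    return 1 + y**2 * sum(x ** (2 * k) for k in range(1, n + 1)) + y**4 * x ** (2 * n + 2)

omega = cmath.exp(2j * cmath.pi / 3)   # a primitive cube root of unity
x = 0.7 + 0.4j                          # arbitrary nonzero sample point

y_cube = cmath.sqrt(omega) / x          # makes x^2 y^2 = omega
y_minus_one = 1j / x                    # makes x^2 y^2 = -1
y_other = cmath.sqrt(-omega) / x        # makes x^2 y^2 = -omega

print(abs(g(x, y_cube, 1)))       # close to 0
print(abs(g(x, y_minus_one, 2)))  # close to 0
print(abs(g(x, y_other, 1)))      # close to 2, so g does not vanish there
```

With $u = x^2 y^2$, the case $n = 1$ reads $g = 1 + u + u^2$, which vanishes exactly at the primitive cube roots of unity; at $u = -\omega$ its value is $-2\omega$.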
Also solved by G. Apostolopoulos (Greece), R. Chapman (U. K.), P. P. Dalyay (Hungary), D. Fleischman,
O. Geupel (Germany), M. Goldenberg & M. Kaplan, E. A. Herman, B. Karaivanov, O. Kouba (Syria),
J. H. Lindsey II, A. Magidin, M. A. Prasad (India), N. Singer, R. Stong, E. Verriest, and the proposers.
A Geometric Inequality
11646 [2012, 427]. Proposed by Pál Péter Dályay, Szeged, Hungary. Let $ABC$ be an acute triangle, and let $A_1$, $B_1$, $C_1$ be the intersection points of the angle bisectors from $A$, $B$, $C$ to the respective opposite sides. Let $R$ and $r$ be the circumradius and the inradius of $ABC$, and let $R_A$, $R_B$, $R_C$ be the circumradii of the triangles $AC_1B_1$, $BA_1C_1$, and $CA_1B_1$, respectively. Let $H$ be the orthocenter of $ABC$, and let $d_a$, $d_b$, $d_c$ be the distances from $H$ to sides $BC$, $CA$, and $AB$, respectively. Show that
\[ 2r(R_A + R_B + R_C) \ge R(d_a + d_b + d_c). \]
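Numerical sanity check. The following Python sketch is not part of the published problem or solution; it only evaluates both sides of the inequality for a triangle given by coordinates. The function names (`circumradius`, `divide`, `both_sides`) and the sample triangles are illustrative assumptions. The feet of the bisectors are located with the angle bisector theorem, and the distances from the orthocenter are computed from the identity $d_a = 2R\cos\beta\cos\gamma$, which holds for acute triangles.

```python
import math

def circumradius(p, q, t):
    """Circumradius of triangle pqt: product of side lengths over 4 * area."""
    area = abs((q[0] - p[0]) * (t[1] - p[1]) - (t[0] - p[0]) * (q[1] - p[1])) / 2
    return math.dist(p, q) * math.dist(q, t) * math.dist(t, p) / (4 * area)

def divide(p, q, wp, wq):
    """Weighted average (wp*p + wq*q) / (wp + wq) of two points."""
    return ((wp * p[0] + wq * q[0]) / (wp + wq),
            (wp * p[1] + wq * q[1]) / (wp + wq))

def both_sides(A, B, C):
    """Return (2r(R_A+R_B+R_C), R(d_a+d_b+d_c)) for an acute triangle ABC."""
    a, b, c = math.dist(B, C), math.dist(C, A), math.dist(A, B)
    s = (a + b + c) / 2
    R = circumradius(A, B, C)
    r = (a * b * c / (4 * R)) / s          # inradius = area / semi-perimeter

    # Feet of the internal angle bisectors (angle bisector theorem).
    A1 = divide(B, C, b, c)                # BA1 : A1C = c : b
    B1 = divide(A, C, a, c)                # AB1 = bc / (c + a)
    C1 = divide(A, B, a, b)                # AC1 = bc / (a + b)

    R_A = circumradius(A, C1, B1)
    R_B = circumradius(B, A1, C1)
    R_C = circumradius(C, A1, B1)

    # Cosines of the angles from the law of cosines.
    cos_a = (b * b + c * c - a * a) / (2 * b * c)
    cos_b = (c * c + a * a - b * b) / (2 * c * a)
    cos_c = (a * a + b * b - c * c) / (2 * a * b)
    d_sum = 2 * R * (cos_b * cos_c + cos_c * cos_a + cos_a * cos_b)

    return 2 * r * (R_A + R_B + R_C), R * d_sum

if __name__ == "__main__":
    print(both_sides((0, 0), (1, 0), (0.5, math.sqrt(3) / 2)))  # equilateral
    print(both_sides((0, 0), (4, 0), (1, 3)))                   # acute, scalene
```

For the equilateral triangle the two sides coincide, so equality can occur; a check of this kind supports the statement but of course proves nothing.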
Solution by Peter Nüesch, Switzerland. Our solution uses Problem 11552 (this MONTHLY, October 2012, pp. 702–703). We write $a$, $b$, $c$ for the lengths of the sides of $ABC$, $s$ for the semi-perimeter, and $\alpha$, $\beta$, $\gamma$ for the measures of the angles. From the definitions, we have
\[ AB_1 = \frac{bc}{c + a}, \qquad AC_1 = \frac{bc}{a + b}, \qquad B_1C_1 = a_1 = 2R_A \sin\alpha. \]
Using $a = 2R\sin\alpha$, we get $R_A = Ra_1/a$. Thus,
\[ R_A + R_B + R_C = R\left(\frac{a_1}{a} + \frac{b_1}{b} + \frac{c_1}{c}\right) \ge R\left(1 + \frac{r}{R}\right) = R + r, \]
where the inequality is Problem 11552. From $d_a = 2R\cos\beta\cos\gamma$, we have
\[ d_a + d_b + d_c = 2R(\cos\beta\cos\gamma + \cos\gamma\cos\alpha + \cos\alpha\cos\beta) = \frac{r^2 + s^2 - 4R^2}{2R}. \]
Note that $(r^2 + s^2 - 4R^2)/2R \le (2r(R + r))/R$, since this is a rearrangement of a Blundon inequality, $s^2 \le 4R^2 + 4Rr + 3r^2$. (This follows from $s^2 \le 2R^2 + 10Rr - r^2 + 2(R - 2r)$
(t ) = (t /) for > 0.
Note that
f F. Now
d( f,
f )
2
=
_
1
0
t
1/2
_
1 (t /)
_
2
f (t )
2
dt
_
2
0
t
1/2
f (t )
2
dt,
which goes to 0 as goes to 0. Hence, f is in the closure of F.
Editorial comment. If the problem statement had said "continuously differentiable"
and not just "continuous, differentiable," then the above argument would in fact show
that F is dense in E.
Also solved by P. P. Dályay (Hungary), O. Kouba (Syria), J. H. Lindsey II, R. Stong, and GCHQ Problem
Solving Group (U. K.).
Review
Encounters with Chaos and Fractals. 2nd edition. By Denny Gulick. Chapman and Hall/CRC
Press, Boca Raton, 2012, xvi + 371 pp., ISBN 978-1-58488-517-7, $79.95.
Review by: Jeffrey Nunemacher
The American Mathematical Monthly, Vol. 121, No. 4 (April), pp. 373-376
Published by: Mathematical Association of America
Stable URL: http://www.jstor.org/stable/10.4169/amer.math.monthly.121.04.373 .
REVIEWS
Edited by Jeffrey Nunemacher
Mathematics and Computer Science, Ohio Wesleyan University, Delaware, OH 43015
Encounters with Chaos and Fractals, 2nd edition. By Denny Gulick. Chapman and Hall/CRC
Press, Boca Raton, 2012, xvi + 371 pp., ISBN 978-1-58488-517-7, $79.95.
Reviewed by Jeffrey Nunemacher
How can we convince undergraduates that mathematics is as modern and vibrant as
physics or biology in these days of the Higgs boson and genome sequencing? Cer-
tainly ordinary calculus, although it is intellectually rich, does not do the trick. Since
most promising mathematics and science students see it first in high school, the level
of excitement that I still remember from seeing it presented relatively rigorously in
college many years ago is simply not present today. My candidate for a teachable con-
temporary mathematical topic that can attract modern students is chaotic dynamical
systems, or to give the subject a more enticing name, chaos and fractals. I have taught
courses on this subject at a variety of levels from freshman honors to senior capstone.
And the text that I have enjoyed using the most (at least for a lower-level version) is the
Gulick book, which has recently appeared in a second edition. The new edition offers
more material on fractals (three chapters rather than one) and gives expanded coverage
of background material and attention to modern algorithms. This second edition is the
subject of the current review.
The subject of chaos was invented around the turn of the twentieth century by
Poincaré (but named much later by Yorke). He showed that a deterministic system of
second-order differential equations modeling a particular three-body solar system can
have solutions that display sensitive dependence on initial conditions. Thus, some tra-
jectories simply cannot be predicted with any degree of accuracy over the long term.
But the subject did not really take off until the development of software for experi-
mentation and graphics. Once these tools were available and applications to subjects
like weather prediction and chemical reactions were discovered, there was incentive to
find the correct mathematical framework and to build an appropriate theory. Some
chaotic trajectories display fractal behavior, so this modern geometric concept oc-
curs naturally in the study of chaotic systems. Fractals also occur as the limit sets
of simple discrete dynamical systems. Take, for instance, the iterated function system
(IFS) defined by the three affine mappings of the plane: $T_1(v) = \tfrac{1}{2}v$, $T_2(v) = \tfrac{1}{2}v + (1/2, 0)$, $T_3(v) = \tfrac{1}{2}v + (1/4,$
$Q_\mu(x) = \mu x(1 - x)$ requires ten pages. The two most common bifurcations, namely
the period-doubling bifurcation and the tangent bifurcation, are studied and explored in
examples and problems. The Li-Yorke Theorem, which asserts that if f is continuous
on a closed interval J and maps J into itself, then if f has a period-3 point it also has
points of all other periods, is proven in detail, while its generalization by Sharkovsky
is simply stated and discussed. By the way, the Li-Yorke result first appeared in this
MONTHLY in 1975 [5] and is one of the early papers that made the subject of chaos
popular. The tools needed in one dimension are the single-variable derivative and a
computational system to explore examples of iteration. Chapter 3 generalizes these
ideas to two dimensions using simple matrix theory and the Jacobian, and explores
two classic examples of chaotic behavior: the Hénon quadratic mapping and Smale's
Horseshoe.
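To make the period-3 hypothesis of the Li-Yorke Theorem concrete, here is a minimal Python sketch (not taken from the book, whose appendix uses MATLAB). It assumes the logistic family in the form $\mu x(1 - x)$ and the parameter value $\mu = 3.83$, which we choose because it lies in the well-known period-3 window; the function names `logistic` and `orbit_tail` are hypothetical. After discarding a transient, the orbit of $x_0 = 0.5$ settles on an attracting cycle of period 3.

```python
def logistic(mu, x):
    """One step of the logistic map x -> mu * x * (1 - x)."""
    return mu * x * (1.0 - x)

def orbit_tail(mu, x0=0.5, transient=10_000, keep=6):
    """Iterate the map, discard `transient` steps, and return the next `keep` values."""
    x = x0
    for _ in range(transient):
        x = logistic(mu, x)
    tail = []
    for _ in range(keep):
        tail.append(x)
        x = logistic(mu, x)
    return tail

if __name__ == "__main__":
    tail = orbit_tail(3.83)   # mu = 3.83 is an assumed, illustrative parameter
    print(tail)               # the values repeat with period 3
```

The theorem then guarantees points of every other period for the same map, although those cycles are repelling and do not show up in a simple forward iteration like this one.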
Chapter 4 moves from the discrete setting to continuous dynamical systems, which
are defined in terms of first-order differential equations. It generalizes the basic
concepts to this setting and explores the pendulum system and the Lorenz system as two
examples. No experience in solving differential equations is necessary. The basic idea
of a differential equation defining a flow, together with some of the basic properties
of the flow, is developed. Continuous dynamical systems require more machinery and
sophistication to develop (which is mostly not done in this book). However, the most
important applications of chaos to reality lie in this realm. There are also some philo-
sophical points to make about the modeling process. For example, since chaos is a
mathematical construct, it can apply to a given mathematical model of reality but never
to physical reality itself. Thus no phenomenon can ever be chaotic in the mathematical
sense.
The last three chapters of the book concentrate on fractals. Chapter 6 introduces
the basic idea of a fractal and discusses self-similarity and various kinds of fractal
dimension. It also presents some basic examples, such as the Cantor set, the Sierpiński
gasket, and the Hénon attractor. Chapter 7 discusses Barnsley's Iterated Function
Systems using metric spaces and shows how they can be used to generate fractals on a
computer. This chapter includes several elegant and useful results, such as the com-
pleteness of the collection of compact sets in the plane under the Hausdorff metric.
Finally, Chapter 8 studies fractals in the complex plane and introduces Julia sets and
the Mandelbrot set. The second edition of the book offers enhanced coverage of
fractals beyond what was presented in the first edition.
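As an illustration of how an iterated function system can generate a fractal on a computer, the following Python sketch runs the random-iteration ("chaos game") algorithm for a Sierpiński gasket. It is our own example, not code from the book: the three maps are assumed to be $v \mapsto (v + p)/2$ for the vertices $p$ of an equilateral triangle with base from $(0, 0)$ to $(1, 0)$, and the names `VERTICES` and `chaos_game` are hypothetical.

```python
import random

# Assumed vertices of the gasket: an equilateral triangle with unit base.
VERTICES = [(0.0, 0.0), (1.0, 0.0), (0.5, 3 ** 0.5 / 2)]

def chaos_game(n_points, seed=0):
    """Return n_points produced by applying randomly chosen maps v -> (v + p) / 2."""
    rng = random.Random(seed)
    x, y = 0.0, 0.0            # start at a vertex, which lies on the attractor
    points = []
    for _ in range(n_points):
        px, py = rng.choice(VERTICES)
        x, y = (x + px) / 2, (y + py) / 2
        points.append((x, y))
    return points

if __name__ == "__main__":
    pts = chaos_game(20_000)
    print(pts[:3])             # plotting all points (e.g. as a scatter plot) shows the gasket
```

Each map is a contraction with ratio 1/2, so the iterates stay inside the triangle and accumulate on the attractor of the system.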
An appendix in the book presents MATLAB functions to allow the study of iteration
empirically and to generate on the computer the classical images associated with chaos
and fractals. For instance, there is MATLAB code to produce the bifurcation diagram
of the logistic mapping Q