Documenti di Didattica
Documenti di Professioni
Documenti di Cultura
Ross G. Pinsky
Problems from
the Discrete to
the Continuous
Probability, Number Theory, Graph
Theory, and Combinatorics
Universitext
Universitext
Series Editors:
Sheldon Axler
San Francisco State University
Vincenzo Capasso
Università degli Studi di Milano
Carles Casacuberta
Universitat de Barcelona
Angus MacIntyre
Queen Mary University of London
Kenneth Ribet
University of California, Berkeley
Claude Sabbah
CNRS, École Polytechnique, Paris
Endre Süli
University of Oxford
Wojbor A. Woyczynski
Case Western Reserve University, Cleveland, OH
Universitext is a series of textbooks that presents material from a wide variety of mathematical
disciplines at master’s level and beyond. The books, often well class-tested by their author,
may have an informal, personal even experimental approach to their subject matter. Some of
the most successful and established books in the series have evolved through several editions,
always following the evolution of teaching curricula, to very polished texts.
Thus as research topics trickle down into graduate-level teaching, first textbooks written for
new, cutting-edge courses may make their way into Universitext.
123
Ross G. Pinsky
Department of Mathematics
Technion-Israel Institute of Technology
Haifa, Israel
—Hermann Hankel
—Ernst Kummer
To Jeanette
and to
E. A. P.
Y. U. P.
L. A. T-P.
M. D. P.
Preface
ix
x Preface
of interest to mathematicians whose fields of expertise are away from the subjects
treated herein. In light of the primary intended audience, the level of detail in proofs
is a bit greater than what one sometimes finds in graduate mathematics texts.
I conclude with some brief comments on the novelty or lack thereof in the
various chapters. A bit more information in this vein may be found in the chapter
notes at the end of each chapter. Chapter 1 follows a standard approach to the
problem it solves. The same is true for Chap. 2 except for the probabilistic proof
of Theorem 2.1, which I haven’t seen in the literature. The packing problem
result in Chap. 3 seems to be new, and the proof almost certainly is. My approach
to the arcsine laws in Chap. 4 is somewhat different than the standard one; it
exploits generating functions to the hilt and is almost completely combinatorial.
The traditional method of proof is considerably more probabilistic. The proofs of
the results in Chap. 5 on the distribution of cycles in random permutations are
almost exclusively combinatorial, through the method of generating functions. In
particular, the proof of Theorem 5.2 makes quite sophisticated use of this technique.
In the setting of weighted permutations, it seems that the method of proof offered
here cannot be found elsewhere. The number theoretic topics in Chaps. 6–8 are
developed in a standard fashion, although the route has been streamlined a bit to
provide a rapid approach to the primary goal, namely, the proof of the Hardy–
Ramanujan theorem. In Chap. 9, the proof concerning the number of cliques in a
random graph is more or less standard. The result on tampering detection constitutes
material with a new twist and the methods are rather probabilistic; a little additional
probabilistic background and sophistication on the part of the reader would be useful
here. The results from Ramsey theory are presented in a standard way. Chapter 10,
which deals with the phase transition concerning the giant component in a sparse
random graph, is the most demanding technically. The reader with a modicum of
probabilistic sophistication will be at quite an advantage here. It appears to me that
a complete proof of the main results in this chapter, with all the details, is not to be
found in the literature.
xi
xii Contents
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
A Note on Notation
xiii
Chapter 1
Partitions with Restricted Summands
or “the Money Changing Problem”
Imagine a country with coins of denominations 5 cents, 13 cents, and 27 cents. How
many ways can you make change for $51,419.48? That is, how many solutions
.b1 ; b2 ; b3 / are there to the equation 5b1 C 13b2 C 27b3 D 5;141;948, with
the restriction that b1 ; b2 ; b3 be nonnegative integers? This is a specific case of
the following general problem. Fix m distinct, positive integers faj gm j D1 . Count the
number of solutions .b1 ; : : : ; bm / with integral entries to the equation
b1 a1 C b2 a2 C C bm am D n; bj 0; j D 1; : : : ; m: (1.1)
X
k
xi D n and x1 x2 xk 1:
iD1
1 p 2n
Pn p e 3 ; as n ! 1:
4n 3
Now consider partitions of n where we restrict the values of the summands xi above
to the set faj gm
j D1 . Denote the number of such restricted partitions by Pn .faj gj D1 /.
m
Does there exist a solution to (1.1) for every sufficiently large integer n? And
if so, can one evaluate asymptotically the number of such solutions for large n?
Without posing any restrictions on faj gm j D1 , the answer to the first question is
negative. For example, if m D 3 and a1 D 5; a2 D 10; a3 D 30, then clearly
there is no solution to (1.1) if n − 5. Indeed, it is clear that a necessary condition
for the existence of a solution for all large n is that faj gm j D1 are relatively prime:
gcd.a1 ; : : : ; am / D 1. This is the time to recall a well-known result concerning
solutions .b1 ; : : : ; bm / with (not necessarily nonnegative) integral entries to the
equation b1 a1 C b2 a2 C C bm am D n. A fundamental theorem in algebra/number
theory states that there exists an integral solution to this equation for all n 2 Z if
and only if gcd.a1 ; : : : ; am / D 1. This result has an elegant group theoretical proof.
We will prove that for all large n, (1.1) has a solution .b1 ; : : : ; bm / with integral
entries if and only if gcd.a1 ; : : : ; am / D 1, and we will give a precise asymptotic
estimate for the number of such solutions for large n.
Theorem 1.1. Let m 2 and let faj gm j D1 be distinct, positive integers. Assume
that the greatest common divisor of faj gm j D1 is 1: gcd.a1 ; : : : ; am / D 1. Then for all
sufficiently large n, there exists at least one integral solution to (1.1). Furthermore,
the number Pn .faj gm j D1 / of such solutions satisfies
nm1
Pn .faj gm
j D1 / Q ; as n ! 1: (1.2)
.m 1/Š mj D1 aj
Remark. In particular, we see (not surprisingly) that for fixed m and sufficiently
large n, the smaller the faj gm
j D1 are, the more solutions there are. We also see
.1/ .2/
that given m1 and faj gm m2
j D1 , and given m2 and faj gj D1 , with m2 > m1 , then
1
for sufficiently large n there will be more solutions for the latter set of parameters.
Proof. We will prove the asymptotic estimate in (1.2), from which the first statement
of the theorem will also follow. Let hn denote the number of solutions to (1.1). (For
the proof, the notation hn will be a lot more convenient than Pn .faj gmj D1 /.) Thus,
we need to show that (1.2) holds with hn in place of Pn .faj gmj D1 /. We define the
generating function of fhn g1
nD1 :
1
X
H.x/ D hn x n : (1.3)
nD1
m
A simple, rough estimate shows that hn Qmn , from which it follows that the
j D1 aj
power series on the right hand side of (1.3) converges for jxj < 1. See Exercise 1.1.
It turns out that we can exhibit H explicitly. We demonstrate this for the case m D 2,
from which the general case will become clear.
For k D 1; 2, we have
1
D 1 C x ak C x 2ak C x 3ak C ;
1 x ak
1
D 1Cx a1
Cx 2a1
Cx 3a1
C 1Cx a2
Cx 2a2
Cx 3a2
C D
.1x a1 /.1x a2 /
1 C x a1 C x 2a1 C x 3a1 C C x a2 C x a1 Ca2 C x 2a1 Ca2 C x 3a1 Ca2 C C
2a2
x C x a1 C2a2 C x 2a1 C2a2 C x 3a1 C2a2 C C (1.4)
A little thought now reveals that on the right hand side of (1.4), the number of
times the term x n appears is the number of integral solutions .b1 ; b2 / to (1.1) with
m D 2; that is, hn is the coefficient of x n on the right hand side of (1.4). So H.x/ D
1
.1x a1 /.1x a2 /
. Clearly, the same argument works for all m; thus we conclude that
1
H.x/ D ; jxj < 1: (1.5)
.1 x a1 /.1 x a2 / .1 x am /
We now begin an analysis of H , as given in its closed form in (1.5), which will
lead us to the asymptotic behavior as n ! 1 of the coefficients hn in its power
series representation in (1.3). Consider the polynomial
p.x/ D .1 x a1 /.1 x a2 / .1 x am /:
2 ij
1
For each k, the roots of 1 x ak are the ak th roots of unity: fe ak gjakD0 . Clearly 1 is a
root of p.x/ of multiplicity m. Because of the assumption that gcd.a1 ; : : : ; am / D 1,
it follows that every other root of p.x/ is of multiplicity less than m—that is, there is
2 ijk
no complex number r that can be written in the form r D e ak , simultaneously for
k D 1; : : : ; m, where 1 jk < ak . Indeed, if r can be written in the above form
for all k, then it follows that ajkk is independent of k. In particular, ak D jkja1 1 , for
k D 2; : : : ; m. Since 1 j1 < a1 , it follows that there is at least one prime factor
of a1 which is a factor of all of the ak , k D 2; : : : ; m, and this contradicts the
assumption that gcd.a1 ; : : : ; am / D 1.
Denote the distinct roots of p.x/ by 1; r2 ; : : : ; rl , and note from above that
jrj j D 1, for all j . Let mk denote the multiplicity of the root rk , for k D 2; : : : ; l.
Also, note that p.0/ D 1. Then we can write
x m2 x
.1 x a1 /.1 x a2 / .1 x am / D .1 x/m .1 / .1 /ml ; (1.6)
r2 rl
where 1 mj < m; for j D 2; : : : ; l:
In light of (1.5) and (1.6), we can write the generating function H.x/ in the form
1
H.x/ D : (1.7)
.1 x/m .1 x m2
r2
/ .1 x ml
rl
/
4 1 Partitions with Restricted Summands
By the method of partial fractions, we can rewrite H from (1.7) in the form
A11 A12 A1m
H.x/ D C C C C
.1 x/m .1 x/m1 .1 x/
A A2m2 A Alml
21 l1
x m2 C C C C x ml C C : (1.8)
.1 r2 / .1 r2 /
x
.1 rl / .1 rxl /
For positive integers k, the function F .x/ D .1 x/k has the power series
expansion
1
!
X nCk1 n
k
.1 x/ D x :
nD0
k1
.n/ nCk1
To prove this, just verify that F nŠ.0/ D k1
. Thus, the first term on the right
hand side of (1.8) can be expanded as
1
!
A11 X nCm1 n
D A11 x : (1.9)
.1 x/m nD0
m1
.n C m 1/.n C m 2/ .n C 1/ nm1
A11 A11 as n ! 1:
.m 1/Š .m 1/Š
Every other term on the right hand side of (1.8) is of the form .1Ax /k where 1
r
k < m and jrj D 1. By the same argument as above, the coefficient of x n in the
k1
n .k1/Š as n ! 1 (substitute r for x in the
expansion for .1Ax /k is asymptotic to r An x
r
appropriate series expansion). Thus, each of these terms is on a smaller order than
the coefficient of x n in (1.9). We thereby conclude that the coefficient of x n in H.x/
nm1
is asymptotic to A11 .m1/Š as n ! 1. By (1.3), this gives
nm1
hn A11 ; as n ! 1: (1.10)
.m 1/Š
It remains to evaluate the constant A11 . From (1.8), it follows that
A11 1
H.x/ CO ; as x ! 1:
.1 x/ m .1 x/ m1
Thus,
.1 x/m Y m
x1
.1 x/m H.x/ D D : (1.12)
.1 x a1 /.1 x a2 / .1 x am / j D1
x aj 1
1
lim .1 x/m H.x/ D Qm : (1.13)
x!1 j D1 aj
From (1.11) and (1.13) we obtain A11 D Qm 1 , and thus from (1.10) we conclude
j D1 aj
nm1
that hn Q
.m1/Š m
.
j D1 aj
Exercise 1.5. Let Cnf1;2g denote the number of compositions of n with summands
restricted to the integers 1 and 2, that is, compositions .x1 ; ; xk / of n with the
restriction that xi 2 f1; 2g, for all i . The series
X 1
1
F .x/ WD D .x C x 2 /n (1.14)
1 x x2 nD0
p
converges absolutely for jxj < 51
2
since jx C x 2 j jxj C jxj2 < 1 if jxj <
p
51
2
:
(a) Similar to the argument leading from (1.3) to (1.5), argue that Cnf1;2g is the
coefficient of x n in the
Ppower series expansion of F .
1
(b) Show that F .x/ D nD0 fn x n
, where ffn g1
nD0 is the Fibonacci sequence—
see (B.2) in Appendix B. (Hint: One has .x C x 2 /F .x/ D F .x/ 1.)
(c) Conclude from (a) and (b) that Cnf1;2g is the nth Fibonacci number; thus,
from (B.10) in Appendix B,
p p
1 1C 5 n 1 5 n
Cnf1;2g Dp . / . / :
5 2 2
Chapter Notes
P
then must use the Euler product formula to show that this is equal to . 1 1 1
nD1 n2 / .
We will first give the number theoretic proof and then give the heuristic and the
rigorous probabilistic proofs.
The number theoretic ideas we develop along the way to our first proof of
Theorem 2.1 will bring us close to proving another result, which we now describe.
Every positive integer n 2 can be factored uniquely as n D p1k1 pm km
, where
m 1, fpj gj D1 are distinct primes, and kj 2 N, for j 2 Œm. If in this factorization,
m
one has kj D 1, for all j 2 Œm, then we say that n is square-free. Thus, an integer
n 2 is square-free if and only if it is of the form n D p1 pm , where m 1 and
fpj gmj D1 are distinct primes. The integer 1 is also called square-free. There are 61
square-free positive integers that are no greater than 100:
1,2,3,5,6,7,10,11,13,14,15,17,19,21,22,23,26,29,30,31,33,34,35,37,38,39,41,42,43,
46,47,51,53,55,57,58,59,61,62,65,66,67,69,70,71,73,74,77,78,79,82,83,85,86,
87,89,91,93,94,95,97.
Let Cn D fk W 1 k n; k is square-freeg. If limn!1 jCnn j exists, we call
this limit the asymptotic density of square-free numbers. After giving the number
theoretic proof of Theorem 2.1, we will prove the following theorem.
Theorem 2.2. The asymptotic density of square-free integers is 6
2
0:6079.
For the number theoretic proof of Theorem 2.1, the first alternative suggested
above in the second paragraph of this chapter will be more convenient. In fact, once
we have chosen the two distinct integers, it will be convenient to order them by size;
thus, we may consider the set Bn of all possible (and equally likely) outcomes to be
Then the probability qn that the two selected integers are relatively prime is
jAn j 2jAn j
qn D D : (2.1)
jBn j n.n 1/
and
1
X b.n/
g.x/ D : (2.3)
nD1
nx
X1 1
1 X n X .a b/.n/
D x
a.d /b. / D : (2.4)
n d nx
nD1 d jn nD1
If the series on the right hand side of (2.2) and (2.3) are in fact absolutely convergent,
then the series on the right P1hand side of (2.4) is alsoPabsolutely convergent. In
a.d / P1 b.k/ 1 .ab/.n/
such case, the equality d D1 d x kD1 k x D nD1 nx
is a rigorous
statement in mathematical analysis.
An arithmetic function a is called multiplicative if a.nm/ D a.n/a.m/ whenever
gcd.n; m/ D 1. It follows that if a 6
0 is multiplicative, then a.1/ D 1. If a 6
0 is
multiplicative, then it is completely determined by its values on the prime powers;
Q kj
indeed, if n D m j D1 pj is the factorization of n into a product of distinct prime
Q kj Qm kj
powers, then a.n/ D a. m j D1 pj / D j D1 a.pj /.
It is trivial to verify that is multiplicative. For the first proposition below, the
following lemma will be useful.
P
Lemma 2.1. The arithmetic function d jn .d / is multiplicative.
Proof. Let n and m be positive integers satisfying gcd.n; m/ D 1. We have
10 2 Relatively Prime Pairs and Square-Free Numbers
X X X X X
.d1 / .d2 / D .d1 /.d2 / D .d1 d2 / D .d /;
d1 jn d2 jm d1 jn;d2 jm d1 jn;d2 jm d jnm
where the second equality follows from the fact that is multiplicative and the fact
that if gcd.n; m/ D 1, d1 jn and d2 jm, then gcd.d1 ; d2 / D 1, while the final equality
follows from the fact that if gcd.n; m/ D 1 and d jnm, then d can be written as
d D d1 d2 for a unique pair d1 ; d2 satisfying d1 jn and d2 jm. (The reader should
verify these facts.)
We introduce three more arithmetic functions that will be used in the sequel:
(
1; if n D 1I
1.n/ D 1; for all nI i.n/ D n; for all nI e.n/ D
0; otherwise:
P
Note that a e D a, for all a, and that .a 1/.n/ D d jn a.d /. A key result we
need is the Möbius inversion formula.
Proposition 2.1. Let a be an arithmetic function. Define b D a 1. Then a D b .
Remark. Written out explicitly, the theorem asserts that if
X
b.n/ WD a.d /;
d jn
P
then a.n/ D d jn b.d /. dn /.
Proof. To prove the proposition, it suffices to prove that
1 D e: (2.5)
Indeed, using this along with the easily verified associativity of the convolution, we
have
b D .a 1/ D a .1 / D a e D a:
P
By Lemma 2.1, the function d jn .d / is multiplicative. Clearly, the function e
P D 1 and e.p / D 0, for any prime p and any
k
is multiplicative. Obviously, e.1/
positive integer k. We have d j1 .d / D .1/ D 1. Thus, since a nonzero,
2 Relatively Prime Pairs and Square-Free Numbers 11
That is, .n/ counts the number of positive integers less than or equal to n which
are relatively prime to n. For our calculation of limn!1 qn , we will use a result that
is a corollary of the following proposition.
Proposition 2.2. 1 D i ; that is,
X
.d / D n:
d jn
From Proposition 2.2 and the Möbius inversion formula, the following corollary
is immediate.
Corollary 2.1. i D ; that is,
X n
.n/ D .d / :
d
d jn
Remark. For the proofs of Theorems 2.1 and 2.2, we do not need Proposition 2.2,
but only Corollary 2.1. In Exercise 2.1, the reader is guided through a direct proof of
the corollary. The proof also will reveal why the seemingly strange Möbius function
has such nice properties.
Proof of Proposition 2.2. Let d jn. It is easy to see that .d / is equal to the number
of k 2 Œn satisfying gcd.k; n/ D dn . Indeed, k 2 Œn satisfies gcd.k; n/ D dn if and
only if k D j. dn /, for some j 2 Œd satisfying gcd.d; j / D 1. (The reader should
verify this.) Also, clearly, every k 2 Œn satisfies gcd.k; n/ D dn for some d jn. The
proposition follows from these facts.
Remark. For an alternative proof of Proposition 2.2, exactly in the spirit of
Lemma 2.1 and the proof of (2.5), see Exercise 2.2.
We are now in a position to prove Theorem 2.1.
Number Theoretic Proof of Theorem 2.1. For each k 2, there are .k/ integers j
satisfying 1 j < k and gcd.j; k/ D 1. Thus,
X
n
jAn j D jf.j; k/ W 1 j < k n; gcd.j; k/ D 1gj D .k/:
kD2
12 2 Relatively Prime Pairs and Square-Free Numbers
Q
where pjn indicates that the product is over all primes that divide n; see
Exercise 2.3. However, this formula is of no help whatsoever for analyzing the above
sum.
P
We will use Corollary 2.1 to analyze nkD1 .k/. From Corollary 2.1, we have
X
n X
n X
n X
k
.k/ D . i /.k/ D .d / D
d
kD1 kD1 kD1 d jk
X
n X X
n X
d 0 .d / D .d / d 0:
kD1 dd 0 Dk d D1 d 0 dn
Pm
Since j D1 j D 12 m.m C 1/, we have
X
n X
n X 1X
n
n n
.k/ D .d / d0 D .d /Œ .Œ C 1/: (2.8)
2 d d
kD1 d D1 d 0 dn d D1
We have Œ dn .Œ dn C 1/ n n
.
d d
C 1/ D . dn /2 C dn , and Œ dn .Œ dn C 1/ . dn 1/ dn D
. dn /2 dn ; thus,
n n n n n n
. /2 Œ .Œ C 1/ . /2 C :
d d d d d d
Substituting this two-sided inequality in (2.8), we obtain
Now
X
n
.d / X
n
1 X
n
1
j j D1C 1 C log n; (2.10)
d d d
d D1 d D1 d D2
Rn
since the final sum is a lower Riemann sum for 1 x1 dx. From (2.9) and (2.10), we
obtain
Pn 1
kD2 .k/ 1 X .d /
lim D : (2.11)
n!1 n.n 1/ 2 d2
d D1
P
It remains to evaluate 1 .d /
d D1 d 2 . On the face of it, from the definition of ,
it would seem very difficult to evaluate this explicitly. However, Möbius inversion
saves the day. Consider (2.2)–(2.4) with a D 1 and b D and with x D 2. With
these choices, the right hand sides of (2.2) and (2.3) are absolutely convergent.
By (2.5), we have 1 D e; that is, a b D e. Therefore, we conclude from
(2.2)–(2.4) that
1
! 1
!
X 1 X .d /
D 1: (2.12)
d2 d2
d D1 d D1
X1
1 2
2
D : (2.13)
nD1
n 6
6
lim qn D ;
n!1 2
completing the proof of the theorem.
Remark. If a is an arithmetic function and f is a nondecreasing function, we
say that the function f is the average order of the arithmetic function a if
1 Pn
n kD1 a.k/ D f .n/ C o.f .n//. Of course this doesn’t uniquely define f ; we
usually choose a particular such f which has a simple form. From (2.11) and (2.14),
it follows that the average order of the Euler -function is 3n2 .
14 2 Relatively Prime Pairs and Square-Free Numbers
Thus, letting
An D fj 2 Œn W j is square-freeg;
we have
X
n
jAn j D 2 .j /: (2.16)
j D1
jAn j 6
lim D 2: (2.17)
n!1 n
We need the following lemma.
Lemma 2.2.
X
2 .n/ D .k/:
k 2 jn
P
Proof. Let ƒ.n/ WD k 2 jn .k/. If n is square-free, then the only integer k that
satisfies k jn is k D 1. Thus, since .1/ D 1, we have ƒ.n/ D 1. On the other
2
where the last equality follows from (2.5). The lemma now follows from (2.15).
Using Lemma 2.2, we have
X
n X
n X
2 .j / D .k/: (2.18)
j D1 j D1 k 2 jj
2 Relatively Prime Pairs and Square-Free Numbers 15
If k 2 > n, then .k/ will not appear on the right hand side of (2.18). If k 2 n,
then .k/ will appear on the right hand side of (2.18) Œ kn2 times, namely, when
j D k 2 ; 2k 2 ; : : : ; Œ kn2 k 2 . Thus, we have
X
n X
n X X n X n
2 .j / D .k/ D Œ 2 .k/ D Œ 2 .k/ D
j D1 j D1 k 2 jj
k k
k n
2 1
kŒn 2
X .k/ X n n
n C Œ .k/: (2.19)
1
k2 1
k2 k2
kŒn 2 kŒn 2
Since each summand in the second term on the right hand side of (2.19) is bounded
in absolute value by 1, we have
X n n 1
j Œ 2 2 .k/j n 2 : (2.20)
1
k k
kŒn 2
Using this with (2.14) gives (2.17) and completes the proof of the theorem.
We now give a heuristic probabilistic proof and a rigorous probabilistic proof of
Theorem 2.1. In the heuristic proof, we put quotation marks around the steps that
are not rigorous.
Heuristic Probabilistic Proof of Theorem 2.1. Let fpk g1 kD1 be an enumeration of
the primes. In the spirit described in the first paragraph of the chapter, if we
pick a positive integer “at random,” then the “probability” of it being divisible by
the prime number pk is p1k . (Of course, this is true also with pk replaced by an
arbitrary positive integer.) If we pick two positive integers “independently,” then the
“probability” that they are both divisible by pk is p1k p1k D p12 , by “independence.”
k
So the “probability” that at least one of them is not divisible by pk is 1 p12 . The
k
“probability” that a “randomly” selected positive integer is divisible by the two
distinct primes, pj and pk , is pj1pk D p1j p1k . (The reader should check that this
“holds” more generally if pj and pk are replaced by an arbitrary pair of relatively
prime positive integers, but not otherwise.) Thus, the events of being divisible by pj
and being divisible by pk are “independent.” Now two “randomly” selected positive
integers are relatively prime if and only if, for every k, at least one of the integers
is not divisible by pk . But since the “probability” that at least one of them is not
divisible by pk is 1 p12 , and since being divisible by a prime pj and being divisible
k
16 2 Relatively Prime Pairs and Square-Free Numbers
by a different prime pk are “independent” events, the “probability” that the two
Q integers1 are such that, for every k, at least one of them
“randomly” selected positive
is not divisible by pk is 1
kD1 .1 p 2 /. Thus, this should be the “probability” that
k
two “randomly” selected positive integers are relatively prime.
Rigorous Probabilistic Proof of Theorem 2.1. For the probabilistic proof, the sec-
ond alternative suggested in the second paragraph of the chapter will be more
convenient. Thus, we choose an integer from Œn uniformly at random and then
choose a second integer from Œn uniformly at random. Let n D Œn. The
appropriate probability space on which to analyze the model described above is the
space .n n ; Pn /, where the probability measure Pn on n n is the uniform
measure; that is, Pn .A/ D jAj
n2
, for any A n n . The point .i; j / 2 n n
indicates that the integer i was chosen the first time and the integer j was chosen
the second time. Let Cn denote the event that the two selected integers are relatively
prime; that is,
Then the probability qn that the two selected integers are relatively prime is
jCn j
qn D Pn .Cn / D :
n2
Let fpk g1 kD1 denote the prime numbers arranged in increasing order. (Any
enumeration of the primes would do, but for the proof it is more convenient to
choose the increasing enumeration.) For each k 2 N, let BnIk
1
denote the event that
2
the first integer chosen is divisible by pk and let BnIk denote the event that the
second integer chosen is divisible by pk . That is,
1
BnIk D f.i; j / 2 n n W pk ji g; BnIk
2
D f.i; j / 2 n n W pk jj g:
Note of course that the above sets are empty if pk > n. The event BnIk 1
\ BnIk
2
D
f.i; j / 2 n n W pk ji and pk jj g is the event that both selected integers have
pk as a factor. There are Œ pnk integers in n that are divisible by pk , namely,
pk ; 2pk ; ; Œ pnk pk . Thus, there are Œ pnk 2 pairs .i; j / 2 n n for which both
coordinates are divisible by pk ; therefore,
Œ pnk 2
1
Pn .BnIk \ BnIk
2
/D : (2.21)
n2
Note that [1kD1 .BnIk \ BnIk / D [kD1 .BnIk \ BnIk / is the event that the two
1 2 n 1 2
selected integers have at least one common prime factor. (The equality above
1 2
follows from the fact that BnIk and BnIk are clearly empty for k > n.) Consequently,
Cn can be expressed as
2 Relatively Prime Pairs and Square-Free Numbers 17
c
Cn D [nkD1 .BnIk
1
\ BnIk
2
/ D \nkD1 .BnIk
1
\ BnIk
2 c
/ ;
\nkD1 .BnIk
1
\ BnIk
2 c
/ D \R
kD1 .BnIk \ BnIk / [kDRC1 .BnIk \ BnIk /
1 2 c n 1 2
n
Pn \RkD1 .BnIk \ BnIk / Pn [kDRC1 .BnIk \ BnIk /
1 2 c 1 2
Pn \nkD1 .BnIk
1
\ BnIk
2 c
/ Pn \R
kD1 .BnIk \ BnIk / :
1 2 c
(2.23)
Using the sub-additivity property of probability measures for the first inequality
below, and using (2.21) for the equality below, we have
X
n X
n Œ pnk 2 1
X
1 1
Pn [nkDRC1 .BnIk
1 2
\ BnIk / Pn BnIk 2
\ BnIk / D :
n2 p2
kDRC1 kDRC1 kDRC1 k
(2.24)
Up until now, we have made no assumption on n. Q Now assume that pk jn, for
k D 1; ; R; that is, assume that n is a multiple of R
kD1 pk . Denote the set of
such n by DR ; that is,
DR D fn 2 N W pk jn for k D 1; ; Rg:
1
Recall that the event BnIk \ BnIk
2
is the event that both selected integers are divisible
by k. We claim that if n 2 DR , then the events fBnIk 1
\ BnIk
2
gR
kD1 are independent.
That is, for any subset I f1; 2; ; Rg, one has
Y
Pn \k2I .BnIk
1
\ BnIk
2
/ D 1
Pn .BnIk \ BnIk
2
/; if n 2 DR : (2.25)
k2I
The proof of (2.25) is a straightforward counting exercise and is left as Exercise 2.4.
If events fAk gR
kD1 are independent, then the complementary events fAk gkD1 are also
c R
YR
1
Pn \R
kD1 .B 1
nIk \ B 2 c
nIk / D Pn .BnIk \ BnIk
2 c
/ ; if n 2 DR : (2.26)
kD1
18 2 Relatively Prime Pairs and Square-Free Numbers
1 Œ pn 2
By (2.21) we have Pn .BnIk \ BnIk
2 c
/ D 1 Pn .BnIk
1
\ BnIk
2
/ D 1 k
n2
, for any
n. Thus, from the definition of DR , we have
1 1
Pn .BnIk \ BnIk
2 c
/ D 1 2 ; if n 2 DR : (2.27)
pk
Y
R 1
X YR
1 1 1
.1 / P n .C n / .1 2 /; for R 2 N and n 2 DR : (2.28)
kD1
pk2 kDRC1 pk2 kD1
p k
QRWe now use 0 (2.28) to obtain an estimate on Pn .Cn / for general n. Let n
kD1 pk . Let n denote the largest integer in DR which is smaller or equal to n,
and let n00 denote the smallest integer
Q in DR which is larger or equal to n. Since DR
is the set of positive multiples of RkD1 pk , we obviously have
Y
R Y
R
n0 > n pk and n00 < n C pk : (2.29)
kD1 kD1
For any n, note that n2 Pn .Cn / D jCn j is the number of pairs .i; j / 2 n n that
are relatively prime. Obviously, the number of such pairs is increasing in n. Thus
.n0 /2 Pn0 .Cn0 / n2 Pn .Cn / .n00 /2 Pn00 .Cn00 /, or equivalently,
n0 2 n00
. / Pn0 .Cn0 / Pn .Cn / . /2 Pn00 .Cn00 /: (2.30)
n n
QR 1 QR
n kD1 pk 2
Y
R
1 X 1 nC kD1 pk 2
Y
R
1
. / .1 2
/ < Pn .Cn / < . / .1 /:
n
kD1
pk p2
kDRC1 k
n
kD1
pk2
(2.31)
Letting n ! 1 in (2.31), we obtain
Y
R X1 YR
1 1 1
.1 2
/ 2
lim inf P n .C n / lim sup P n .C n / .1 2 /:
kD1
pk p
kDRC1 k
n!1 n!1
kD1
pk
(2.32)
Now (2.32) holds for arbitrary R; thus letting R ! 1, we conclude that
1
Y 1
lim Pn .Cn / D .1 /: (2.33)
n!1
kD1
pk2
2 Relatively Prime Pairs and Square-Free Numbers 19
X1
1 1
Q1 D ; r > 1I (2.34)
kD1 .1
1 nr
pkr
/ nD1
see Exercise 2.5. From (2.33), (2.34), and (2.13), we conclude that
1 6
lim qn D lim Pn .Cn / D P1 1
D :
n!1 n!1
nD1 n2
2
Exercise 2.1. Give a direct proof of Corollary 2.1. (Hint: The Euler -function
.n/ counts the number of positive integers that are less than or equal to n and
relatively prime to n. We employ the sieve method, which from the point of view
of set theory is the method of inclusion–exclusion. Start with a list of all n integers
between 1 and n as potential members of the set of the .n/ relatively prime integers
to n. Let fpj gm n
j D1 be the prime divisors of n. For any such pj , the pj numbers
pj ; 2pj ; : : : ; pnj pj are not relatively prime to n. So we should strike these numbers
from our list. When we do this for each j , the remaining numbers on the list are
those numbers that are relatively prime to n, and the size of theP list is .n/. Now
we haven’t necessarily reduced the size of our list to N1 WD n m n
j D1 pj , because
some of the numbers we have deleted might be multiples of two different primes,
pi and pj , in which case they were subtracted above twice. Thus we need to add
back to N1 all of the pinpj multiples of pi pj , for i ¤ j . That is, we now have
P
N2 WD N1 C i¤j pinpj . Continue in this vein.
Exercise 2.2. This exercise presents an alternative proof to Proposition 2.2:
P
(a) Show that the arithmetic function d jn .d / is multiplicative. Use the fact that
is multiplicative—see
P Exercise 2.3.
(b) Show that d jn .d / D n, when n is a prime power.
(c) Conclude that Proposition 2.2 holds.
Exercise 2.3. The Chinese remainder theorem states that if n and m are relatively
prime positive integers, and a 2 Œn and b 2 Œm, then there exists a unique c 2 Œnm
such that c D a mod n and c D b mod m. (For a proof, see [27].) Use this to prove
that the Euler -function is multiplicative. Then use the fact that is multiplicative
to prove (2.7).
Exercise 2.4. Prove (2.25).
Exercise 2.5. Prove the Euler product formula (2.34). (Hint: Let N` denote the set
of positive integers all of whose prime factors are in the set fpk g`kD1 . Using the fact
that
X1
1 1
D rm ;
1 pr
1 p
mD0 k
k
20 2 Relatively Prime Pairs and Square-Free Numbers
P
for all k 2 N, first show that 11 1 11 1 D 1
n2N2 nr , and then show that
p1r p2r
Q` P
kD1 1 1 D n2N` nr , for any ` 2 N.)
1 1
pkr
Exercise 2.6. Using Theorem 2.1, prove the following result: Let 2 d 2 N.
Choose two integers uniformly at random from Œn. As n ! 1, the asymptotic
probability that their greatest common divisor is d is d 26 2 .
Exercise 2.7. Give a probabilistic proof of Theorem 2.2.
Chapter Notes
It seems that Theorem 2.1 was first proven by E. Cesàro in 1881. A good source for
the results in this chapter is Nathanson’s book [27]. See also the more advanced
treatment of Tenenbaum [33], which contains many interesting and nontrivial
exercises. The heuristic probabilistic proof of Theorem 2.1 is well known and
can be found readily, including via a Google-search. I am unaware of a rigorous
probabilistic proof in the literature.
Chapter 3
A One-Dimensional Probabilistic Packing
Problem
X
k1 Z 1 X
k1 j
EMnIk 1 s
lim D k exp.2 / exp.2 / ds WD pk : (3.1)
n!1 n j D1
j 0 j D1
j
MnIk
Furthermore, n
satisfies the weak law of large numbers; that is, for all > 0,
MnIk
lim P .j pk j / D 0: (3.2)
n!1 n
and that
2
EMnIk D L.k/
n D pk n C o.n /; as n ! 1:
2 2 2
(3.4)
This method of proof is known as the second moment method. It is clear that (3.1)
follows from (3.3). An application of Chebyshev’s inequality shows that (3.2)
follows from (3.3) and (3.4). To see this, note that if Z is a random variable with
expected value EZ and variance
2 .Z/, then Chebyshev’s inequality states that
2 .Z/
P .jZ EZj ı/ ; for any ı > 0:
ı2
MnIk
Also,
2 .Z/ D EZ 2 .EZ/2 . We apply Chebyshev’s inequality with Z D n
.
Using (3.3) and (3.4), we have
.k/
Hn
EZ D D pk C o.1/; as n ! 1; (3.5)
n
3 Probabilistic Packing Problem 23
and
.k/ .k/
Ln .Hn /2
2 .Z/ D 2
D pk2 C o.1/ .pk C o.1//2 D o.1/; as n ! 1:
n n2
.k/
MnIk Hn
lim P .j j ı/ D 0; for all ı > 0: (3.6)
n!1 n n
We now show that (3.2) follows from (3.3) and (3.6). Fix > 0. We have
.k/ .k/ .k/ .k/
MnIk MnIk Hn Hn MnIk Hn Hn
j pk j D j C pk j j jCj pk j:
n n n n n n n
.k/
For sufficiently large n , one has from (3.3) that j Hnn pk j 2 , for n n . Thus,
.k/
MnIk MnIk
for n n , a necessary condition for j n
pk j is that j n
Hn
n
j 2 .
Consequently,
.k/
MnIk MnIk Hn
P .j pk j / P .j j /; for n n :
n n n 2
Now (3.2) follows from this and (3.6).
Our proofs of (3.3) and (3.4) will follow similar lines. Before commencing with
the proof of (3.3), we trace its general architecture. Only the first step of the proof
involves probability. In this step, we employ probabilistic reasoning to produce a
.k/ .k/ .k/ .k/
recursion equation that gives Hn in terms of H0 ; H1 ; : : : ; Hnk . In this form,
.k/
the equation is not useful because as n ! 1, it gives Hn in terms of a growing
.k/ Pn .k/
number of its predecessors. However, defining Sn D j D0 Hj , and using the
.k/
abovementioned recursion equation, we find that Sn is given in terms of only
two of its predecessors. We then construct the generating function g.t / whose
coefficients are fSn g1
.k/ .k/
nD0 . Using the recursion equation for Sn , we show that g
solves a linear, first order differential equation. We solve this differential equation
to obtain an explicit formula for g.t /. This explicit formula reveals that g possesses
.k/
a singularity at t D 1. Exploiting this singularity allows us to evaluate limn!1 Sn
n2
,
.k/ .k/
and then a simple observation allows us to obtain limn!1 Hnn from Sn
limn!1 n2 .
We now commence with the proof of (3.3). Note that if we start with n < k
molecules, then none of them will get bonded. Thus,
24 3 Probabilistic Packing Problem
.k/
We now derive a recursion relation for Hn . The method we use is called first step
analysis. We begin with a line of n k unbonded molecules, and in the first step,
one of the nearest neighbor k-tuples is chosen at random and its k molecules are
bonded. In order from left to right, denote the original n k C 1 nearest neighbor
k-tuples by fBj gnkC1
j D1 . If Bj was chosen in the first step, then the original row now
contains a row of j 1 unbonded molecules to the left of the bonded k-tuple Bj
and a row of n C 1 j k unbonded molecules to the right of Bj . To complete the
random bonding process, we choose random k-tuples from these two sub-rows until
there are no more k-tuples to choose from. This gives us the following formula for
the conditional expectation of MnIk given that Bj was selected first: for n k,
.k/ .k/
E.MnIk jBj selected first/DkCE.Mj 1Ik CMnC1j kIk /DkCHj 1 CHnC1j k :
(3.8)
Of course, for each j 2 Œn k C 1, the probability that Bj was chosen first is
1
nkC1
. Thus, we obtain the formula
X
nkC1
EMnIk D Hn.k/ D P .Bj selected first/E.MnIk jBj selected first/ D
j D1
1 X
nkC1
.k/ .k/
.k C Hj 1 C HnC1j k /; n k:
n k C 1 j D1
2 X
nk
.k/
Hn.k/ DkC H ; n k: (3.9)
n k C 1 j D0 j
.k/
The above recursion equation is not useful directly because it gives Hn in terms
of n k C 1 of its predecessors; we want a recursion equation that expresses a given
term in terms of a fixed finite number of its predecessors. To that end, we define
X
n
.k/
Sn.k/ D Hj : (3.10)
j D0
2 .k/
Hn.k/ D k C S ; n k: (3.11)
n k C 1 nk
3 Probabilistic Packing Problem 25
and
.k/ 2 .k/
Sn.k/ Sn1 D k C S ; n k: (3.13)
n k C 1 nk
.k/
This recursion equation has the potential to be useful since it gives Sn in terms of
.k/ .k/
only two of its predecessors—Sn1 and Snk . Of course, we have paid a price—we
.k/ .k/
are now working with Sn instead of Hn ; but this will be dealt with easily. For
.k/ .k/ .k/
convenience, we drop the superscript k from Sn ; Hn , and Ln for the rest of the
chapter, except in the statement of propositions. We rewrite (3.13) as
We now define the generating function for fSn g1 nD0 and use (3.14) to derive a
linear, first-order differential equation that is satisfied by this generating function.
The generating function g.t / is defined by
1
X 1
X
g.t / D Sn t n D Sn t n ; (3.15)
nD0 nDk
where the second equality follows from (3.12). From the definitions, it follows that
Hn n, and thus Sn 12 n.n C 1/. Consequently, the sum on the right hand side
of (3.15) converges for jt j < 1, with the convergence being uniform for jt j , for
any 2 .0; 1/. It follows then that
1
X
g 0 .t / D nSn t n1 ; jt j < 1: (3.16)
nDk
Multiply equation (3.14) by t n and group the terms in the following way:
Now summing the equation over all n k, and appealing to (3.15), (3.16),
and (3.12), we obtain the differential equation
tg 0 .t / .k 1/g.t / D t 2 g 0 .t / .k 2/tg.t /
1
X 1
X
C 2t k g.t / C k t nt n1 k.k 1/ t n: (3.17)
nDk nDk
26 3 Probabilistic Packing Problem
P1 P1 tk
Since nDk nt n1 is the derivative of nDk tn D 1t
, it follows that
P1 tk 0 .1t/kt k1 Ct k
nDk nt
n1
D D
. 1t / .
Using these facts and doing some algebra,
.1t/2
which leads to many cancelations, we obtain
1
X 1
X ktk
kt nt n1 k.k 1/ tn D : (3.18)
.1 t /2
nDk nDk
.k 1/ .k 2/t C 2t k k t k1
g 0 .t / D g.t / C ; 0 < t < 1: (3.19)
t .1 t / .1 t /3
.k 1/ .k 2/t C 2t k k t k1
a.t / D ; b.t / D : (3.20)
t .1 t / .1 t /3
Rt
Z t Rs
g.t /e a.r/ dr
D g. / C b.s/e a.r/ dr
ds; t 2 . ; 1/;
which we rewrite as
Rt
Z t Rt
g.t / D g. /e a.r/ dr
C b.s/e s a.r/ dr
ds; t 2 . ; 1/: (3.21)
k 12
Since limt!0 t a.t / D k 1, there exists a t0 > 0 such that a.t / t
, for
0 < t t0 . Thus, for < t0 , one has
Rt Rt 1
t0
0 0 k 2 1
e a.r/ dr
e r dr
D . /k 2 :
3 Probabilistic Packing Problem 27
.k 1/ .k 2/r k1 1
D C :
r.1 r/ r 1r
We also have
r k1 1
D .1 C r C C r k2 /:
.1 r/ 1r
k1 3
a.r/ D C 2.1 C r C C r k2 /:
r 1r
We then obtain
Z t X
k1 j
t
a.r/ dr D .k 1/ log t 3 log.1 t / 2 ;
j D1
j
and thus
Rt Pk1 t j Pk1 s j
e s a.r/ dr
D t k1 .1 t /3 e 2 j D1 j s 1k .1 s/3 e 2 j D1 j : (3.23)
Substituting this in (3.22) and recalling the definition of b from (3.20), we obtain
Z
t k1 2 Pjk1 tj
t Pk1 sj
g.t / D e D1 j
ke 2 j D1 j
ds: (3.24)
.1 t / 3
0
Proposition 3.1.
.k/
Hn
lim D`
n!1 n
if and only if
.k/
Sn `
lim D :
n!1 n2 2
Proof. The proof is immediate from (3.11).
And we have the following proposition which connects the limiting behavior of
Sn with the singularity in g at t D 1.
Proposition 3.2. If
.k/
Sn
lim D L;
n!1 n2
then
X
n0 1
X X
n0 1
X
Sn t n C.L / n.n1/t n g.t / Sn t n C.LC / n.n1/t n :
nD0 nDn0 C1 nD0 nDn0 C1
(3.25)
Now
1
X 1
X 1 00 2t 2
n.n 1/t n D t 2 . t n /00 D t 2 . / D ;
nD0 nD0
1t .1 t /3
so
1
X X n
2t 2 0
n.n 1/t Dn
n.n 1/t n :
nDn0 C1
.1 t / 3
nD0
sides, we have
As already noted, from the definitions, we have Hl l and Sl 12 l.l C 1/. Thus,
there exists a C > 0 such that
ˇ .2k 5/n C 3 k ˇ
ˇ ˇSn1 C and
n2 .n 1/2 .n k C 1/ n2
2 C
.HnkC1 C C Hn1 / 2 : (3.27)
n2 .n k C 1/ n
This shows that the right hand side of (3.26) is O. n12 / and thus so is the left hand
P Sn
side. Consequently, the telescopic series 1nD2 n2 .n1/2 is convergent. Since
Sn1
Sn X Sj n
Sj 1
D ;
n2 j D2
j 2 .j 1/2
lim .1 t /3 g.t / D `:
t!1
30 3 Probabilistic Packing Problem
Pk1 Z 1 Pk1
1 sj
lim .1 t /3 g.t / D ke 2 j D1 j
e2 j D1 j
ds D pk :
t!1 0
where the last term comes from the fact that the independence gives
2 X
nk
4k X
nk
2 X
nk
Ln D k 2 C Lj C Hj C Hj Hnkj ;
n k C 1 j D0 n k C 1 j D0 n k C 1 j D0
for k n: (3.29)
X
n
Rn D Lj :
j D0
Rn D 0; for n D 0; : : : ; k 1: (3.30)
2 4k 2 X
nk
Rn DRn1 Ck 2 C Rnk C Snk C Hj Hnkj ; n k:
nkC1 nkC1 nkC1
j D0
(3.31)
3 Probabilistic Packing Problem 31
Proposition 3.4.
1 X .k/ .k/
nk
pk2
lim H j H nkj D :
n!1 n3 6
j D0
Proof. Let > 0. Since limn!1 Hnn D pk , we can find an n such that .pk /n
Hn .pk C /n, for n > n . Thus
X X
.pk /2 j.n k j / C Hj Hnkj
n <j <nn k 0j n ;nn kj nk
X
nk
Hj Hnkj
j D0
X X
.pk C /2 j.n k j / C Hj Hnkj :
n <j <nn k 0j n ;nn kj nk
(3.32)
Since Hj j , for all j , we have
X
Hj Hnkj 2.n C 1/n n: (3.33)
0j n ;nn kj nk
(There are 2.n C 1/ summands on the left hand side of (3.33),Pand each summand,
Hj Hnkj , is less than or equal to n n.) Using the identity nj D1 j 2 D 16 n.n C
1/.2n C 1/, we have
X X X
j.n k j / D .n k/ j j2 D
1j <nn k 1j <nn k 1j <nn k
1
.n k/.n n k 1/.n n k/
2
1 1
.nn k1/.nn k/ 2.nn k 1/ C 1 D n3 C o.n3 /; as n ! 1:
6 6
(3.34)
Of course,
X X 1
j.n k j / n j nn .n C 1/: (3.35)
1j n 1j n
2
1 X 1 X
nk nk
1 1
.pk /2 lim inf 3 Hj Hnkj lim sup 3 Hj Hnkj .pk C /2 ;
6 n!1 n j D0 n!1 n
j D0
6
nkC3 2
Rn D Rn1 C k 2 .LnkC1 C C Ln1 /C
nkC1 nkC1
4k 2 X
nk
Snk C Hj Hnkj :
nkC1 n k C 1 j D0
nkC3 Wn p2
Rn D Rn1 C Wn ; where Wn satisfies lim 2 D k : (3.36)
nkC1 n!1 n 3
In Exercise 3.1 the reader is asked to show that if for some n0 , the positive sequence
fRO n g1 O
nDn satisfies Rn
0
nkC3 O
Rn1 C cn2 (RO n nkC3 RO n1 C cn2 ), then
nkC1 nkC1
RO n RO n
lim supn!1 n3
c (lim infn!1 n3
c). Using this with (3.36), we conclude
that
Rn p2
lim
3
D k: (3.37)
n!1 n 3
Writing (3.31) in the form
2 4k 2 X
nk
Ln D k 2 C Rnk C Snk C Hj Hnkj ; n k;
nkC1 nkC1 n k C 1 j D0
(3.38)
2
dividing both sides of this equation by n , and using (3.37), Proposition 3.4, and the
fact that Sn is on the order n2 , we conclude that
Ln p2 p2
lim 2
D 2 k C 2 k D pk2 :
n!1 n 3 6
This gives (3.4) and completes the proof of Theorem 3.1.
Exercise 3.1. Show that if for some n0 , the positive sequence fRO n g1
nDn0 satisfies
O
RO n nkC1 RO n1 C cn (RO n nkC1 RO n1 C cn ), then lim supn!1 Rn3n c
nkC3 2 nkC3 2
RO n
(lim infn!1 n3
c).
Exercise 3.2. Any molecule that remains unbonded at the end of the nearest neigh-
bor k-tuple bonding process occurs in a maximal row of j unbonded molecules,
for some j 2 Œk 1. In the limit as n ! 1, what fraction of molecules ends up
in a maximal row of j unbounded molecules? Let’s denote these fractions by qkIj ,
P
j 2 Œk 1. Of course k1
j D1 qkIj D 1 pk .
3 Probabilistic Packing Problem 33
(a) Let k 3 and fix j 2 Œk 1. Consider the following bonding process:
implement the bonding of nearest neighbor k-tuples as described in the chapter.
When this process terminates, bond all the unbonded molecules that occur in
a maximal row of j unbonded molecules, but leave untouched all unbonded
molecules that occur in a maximal row of i unbonded molecules, for some
i ¤ j . Let MnIk;j denote the number of bonded molecules at the end of
.k;j / .k;j / P .k;j /
the process, and let Hn D EMnIk;j . Let Sn D niD0 Hi . Convince
.k;j / 1
yourself that fHn gnD0 satisfies the recursion equation (3.9) and that it
.k;j /
satisfies the boundary condition (3.7) with one change, namely Hj D j,
D 0. Thus, fSn g1
.k;j / .k;j /
instead of Hj nD0 satisfies the recursion equation (3.13),
and in place of the boundary condition (3.12), it satisfies the boundary condition
.k;j / .k;j /
Sn D 0, n D 0; : : : ; j 1; Sn D j , n D j; : : : ; k 1.
P1
the generating function for fSn g1
.k;j / n .k;j /
(b) Let gj .t / D nD0 S n t denote nD0 .
0
Show that gj solves the differential equation gj .t / D a.t /gj .t / C bj .t /, where
a is as in (3.20) and
with b as in (3.20).
(c) In particular, note that bk1 D b; therefore, gk1 satisfies the same differential
equation satisfied by g. Thus, (3.21) holds for gk1 ; that is,
Rt
Z t Rt
gk1 .t / D gk1 . /e a.r/ dr
C b.s/e s a.r/ dr
ds; t 2 . ; 1/:
1 2
qkIk1 e ; as k ! 1:
k1
34 3 Probabilistic Packing Problem
Rt
Z t Rt
gj .t / D gj . /e a.r/ dr
C bj .s/e s a.r/ dr
ds; t 2 . ; 1/: (3.39)
On the other hand, since bj appears instead of bk1 D b, show that one also has
Z t Rt
lim bj .s/e s a.r/ dr
ds D 1:
!0
Rt
You are Rinvited to show that the appropriate terms in gj . /e a.r/ dr and
Rt t
s a.r/ dr ds cancel each other out and to obtain a finite limiting
bj .s/e
expression as ! 0 on the right hand side of (3.39). This limiting expression
is then also gj .t /. One then has limt!1 .1 t /3 gj .t / D pk C qkIj , which gives
an explicit formula for qkIj . The above analysis gets more involved the smaller
j is. Try it first for j D k 2.
Chapter Notes
The calculation of (3.1) in the case k D 2 goes back to an article by the Nobel Prize
winning chemist Flory in 1939 [21]. The problem was rediscovered by Page, who
obtained the asymptotic behavior for the mean and variance in the case k D 2 [28].
The method used there does not generalize to k > 2. Theorem 3.1 seems to be new.
A continuous space version of this problem was considered by Rényi [31].
Chapter 4
The Arcsine Laws for the One-Dimensional
Simple Symmetric Random Walk
The simple, symmetric random walk fSn g1 nD0 on Z starts at step n D 0 at 0 2 Z and
at each successive step jumps one unit to the right or left, each with probability 12 .
The random walk is called “simple” because the sizes of its jumps are restricted to
the set f1; 1g. One way to realize this random walk is as follows. Let fXn g1 nD1
be an infinite sequence of independent, identically distributed random variables
distributed according to the Bernoulli distribution with parameter 12 ; that is, P .Xj D
P
1/ D P .Xj D 1/ D 12 . Now define S0 D 0 and Sn D nj D1 Xj , n 1.
We begin with a fundamental fact about the simple, symmetric random walk
on Z.
Proposition 4.1.
Remark 1. A moment’s thought shows that (4.1) is equivalent to the statement that
the random walk is recurrent; that is, with probability one, fSn g1
nD0 visits every site
in Z infinitely often.
Remark 2. One can consider a simple, symmetric random walk fSn g1 nD0 on Z , the
d
d -dimensional lattice—at each step it jumps in one of the 2d directions with prob-
1
ability 2d . Again, the random walk is called recurrent if with probability one every
site is visited infinitely often. It is called transient if P .limn!1 jSn j D 1/ D 1. In
1923, G. Polya proved the quite surprising result that this random walk is recurrent
in two dimensions but transient in three or more dimensions. For a proof of this, see,
for example, [15].
Proof. By Remark 1 above, to prove the proposition, it suffices to prove that with
probability one, the random walk visits every site in Z infinitely often. Let p denote
the probability that the random walk fSn g1 nD0 ever returns to its starting point 0.
We will show that p D 1. Let N0 denote the number of times the random walk is
at 0 after time n D 0. Then of course, P .N0 D 0/ D 1 p. Now let’s calculate
P .N0 D 1/. In order to have N0 D 1, the random walk must return to 0 and then
never return to 0 again. The probability of returning to 0 is p. If the random walk
returns to 0, it continues independently of everything that has already transpired.
Thus, conditioned on returning to 0, the probability that the random walk does not
return to 0 again is 1 p. So P .N0 D 1/ D p.1 p/. Continuing with this line of
reasoning, we obtain
P .N0 D n/ D p n .1 p/; n D 0; 1; : : : :
(The term by term differentiation above is permitted because for any p0 < 1, the
series is uniformly absolutely convergent over p 2 Œ0; p0 .) Of course, if p D 1,
then EN0 D 1. Thus, the formula for EN0 in (4.2) also holds if p D 1.
We now calculate EN0 in a different way. Let 1fSn D0g denote the indicator
random variable that is equal to 1 if Sn D 0 and is equal to 0 otherwise. Then
N0 , the number of times the random walk returns to 0, can be represented as
1
X
N0 D 1fSn D0g :
nD1
thus,
2n
P .S2n D 0/ D
n
: (4.4)
22n
p
Using Stirling’s formula, namely, nŠ nn e n 2 n as n ! 1, we have
2n p
.2n/Š .2n/2n e 2n 4 n 1
n
D 2n 2n D p ; as n ! 1: (4.5)
22n 2
.nŠ/ 22n n e .2 n/2 2n n
P
Since 1 nD1 n D 1, it follows from (4.3)–(4.5) that EN0 D 1. In light of (4.2),
p1
we conclude that p D 1.
We have shown that with probability one, the random walk returns to 0.
Upon returning to 0, the random walk continues independently of everything that
transpired previously; thus, in fact, with probability one, the random walk visits 0
infinitely often. From this, it is easy to show that in fact with probability one the
random walk visits every site infinitely often. We leave this as Exercise 4.1.
Define
The random time T0 is called the first return time to 0. By Proposition 4.1, it follows
that P .T0 < 1/ D 1. However, perhaps surprisingly, one has ET0 D 1; the reader
is guided through a proof of this in Exercise 4.2. This result suggests that there is
quite some tendency for the random walk to take a long time to return to 0. In this
chapter we present two results which give vivid expression to this phenomenon.
The arcsine distribution will figure prominently in the results of this chapter. The
distribution function for this distribution is defined by
2 p
Farcsin .x/ D arcsin x; 0 x 1:
0
The corresponding density function farcsin .x/ D Farcsin .x/ is given by
1 1
farcsin .x/ D p ; 0 < x < 1:
x.1 x/
.2n/
which is the last return time to 0 up to step 2n. By parity considerations, L0 can
take on only even values.
38 4 Arcsine Laws for Random Walk
Theorem 4.1.
2k 2n2k
.2n/
P .L0 D 2k/ D k nk
; k D f0; 1; : : : ; ng: (4.6)
22n
Furthermore,
L0
.2n/
2 p
lim P . x/ D arcsin x; 0 x 1: (4.7)
n!1 2n
Remark. This theorem highlights the tendency of the random walk to take a long
time to return to 0. Indeed, since the density farcsin .x/ blows up at x D 0; 1, it
follows from (4.7) that for large n the most likely epochs k for the last visit to 0 up
to time 2n are those satisfying k D o.n/ or k D 2n o.n/, that is, those q epochs
at the very beginning or at the very end of the trajectory. Since 2
arcsin 1
2
D 12 ,
1
from (4.7) it also follows that for large n, there is a probability of about that a 2
random walk trajectory of 2n steps will never return to 0 during the second half of
its life.
C
Our second theorem concerns the random variable O2n , which should be thought
of as the number of steps k 2 Œ2n at which the random walk is positive
(or nonnegative). Of course, the number of steps between 1 and 2n that the random
walk is positive is usually not equal to the number of steps that it is nonnegative.
In order to obtain an exact result in closed form for all n, we need to work in a
symmetric setting. Therefore, if the random walk is equal to 0 at some step 2k, we
classify that step as “positive” if the previous step was positive and “negative” if the
previous step was negative. That is,
C
O2n D jfk 2 Œ2n W Sk > 0 or Sk D 0 and Sk1 > 0gj:
C OC
We call O2n the occupation time of the positive half line up to time 2n. Then 2n2n
gives the fraction of steps among the first 2n steps that the random walk is in the
C
positive half line. Note that by parity considerations, O2n can only take on even
values.
Theorem 4.2.
2k 2n2k
C
P .O2n D 2k/ D k nk
; k D f0; 1; : : : ; ng: (4.8)
22n
Furthermore,
C p
O2n 2
lim P . x/ D arcsin x; 0 x 1: (4.9)
n!1 2n
4 Arcsine Laws for Random Walk 39
0
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
−1
−2
−3
−4
Remark 1. Since the density farcsin .x/ takes on its minimum at x D 12 , and since
it blows up at x D 0; 1, it follows that for large n the most likely percentages
of time that a random walk trajectory is nonnegative are around 0 % and 100 %,
while the least likely percentage is around 50 %! To put it in a different way, if two
players bet a dollar each on a succession of fair coin flips, then after a long time it
is overwhelmingly more likely that one of the players was leading almost the whole
time than that each player was leading about half the time. This result even more
vividly highlights the tendency of the random walk to take a long time to return to 0.
0
Remark 2. Let O2n D fk 2 Œn W S2k D 0g denote the number of visits to 0 of the
O0
random walk up to step Œ2n. It is not hard to show that the random variable 2n2n ,
denoting the fraction of steps up to 2n at which the random walk is at 0, converges
to 0 in probability; that is,
0
O2n
lim P . > / D 0; for all > 0: (4.10)
n!1 2n
We leave this as Exercise 4.3. In light of this, it follows that (4.9) would also hold if
C
we had defined O2n in an asymmetric fashion as the number of steps up to Œ2n for
which the random walk is nonnegative: jfk 2 Œ2n W Sk 0gj.
Our approach to proving the above two theorems will be completely combi-
natorial rather than probabilistic. Generating functions will play a seminal role.
A random walk path of length m is a path fxj gmj D0 which satisfies
x0 D 0I
(4.11)
xj xj 1 D ˙1; j 2 ŒmI
See Fig. 4.1. Since a random walk path has two choices at each step, there are 2m
random walk paths of length m. The probability that the simple, symmetric random
walk behaves in a certain way up until time m is simply the number of random walk
paths that behave in that certain way divided by 2m .
40 4 Arcsine Laws for Random Walk
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
Our basic combinatorial object upon which our results will be developed is the
Dyck path. A Dyck path of length 2n is a nonnegative random walk path fxj g2n j D0 of
length 2n which returns to 0 at step 2n; that is, in addition to satisfying (4.11) with
m D 2n, it also satisfies the following conditions:
xj 0; j 2 Œ2nI
(4.12)
x2n D 0:
See Fig. 4.2. We use generating functions to determine the number of Dyck paths.
Let dn denote the number of Dyck paths of length 2n. We also define d0 D 1.
Proposition 4.2. The number of Dyck paths of length 2n is given by
!
1 2n
dn D ; n 1:
nC1 n
1 2n
Remark. The number Cn WD nC1 n
is known as the nth Catalan number.
Proof. We derive a recursion formula for fdn g1 nD0 . A primitive Dyck path of length
2k is a Dyck path fxj g2k
j D0 of length 2k which satisfies xj > 0 for j D 1; : : : ; 2k1.
Let vk denote the number of primitive Dyck paths of length 2k. Every Dyck path
of length 2n returns to 0 for the first time at 2k, for some k 2 Œn. Consider a Dyck
path of length 2n that returns to 0 for the first time at 2k. The part of the path from
time 0 to time 2k is a primitive Dyck path of length 2k, and the part of the path from
time 2k to 2n is an arbitrary Dyck path of length 2n 2k. (In Fig. 4.2, the Dyck
path of length 16 is composed of an initial primitive Dyck path of length 6, followed
by a Dyck path of length 10.) This reasoning yields the recurrence relation
X
n
dn D vk dnk ; n 1: (4.13)
kD1
vk D dk1 ; k 1: (4.14)
4 Arcsine Laws for Random Walk 41
X
n
dn D dk1 dnk : (4.15)
kD1
Let
1
X
D.x/ D dn x n (4.16)
nD0
1 Y
n1
1 1 2n .2n 2/Š
..1 4x/ 2 /.n/ jxD0 D 2n .2j 1/ D n1 D
nŠ nŠ j D1 nŠ2 .n 1/Š
!
2 2n 1
; for n 2I
2n 1 n
42 4 Arcsine Laws for Random Walk
p
thus, the Taylor series for 1 4x is given by
1
!
p X 2 2n 1 n
1 4x D 1 2x x : (4.19)
nD2
2n 1 n
2nC1 2n
The coefficient of x nC1 in (4.19) is 2nC1
2
nC1
D nC1
2
n
. Using this along
with (4.18) and (4.19), we conclude that
1
!
X 1 2n n 1
D.x/ D x ; jxj < : (4.20)
nD0
nC1 n 4
1 2n
From (4.20) and (4.16) it follows that dn D nC1 n
.
The proof of the proposition gives us the following corollary.
Corollary 4.1. The generating function for the sequence fdn g1
nD0 , which counts
Dyck paths, is given by
p
1 1 4x 1
D.x/ D ; jxj < :
2x 4
Let wn denote the number of nonnegative random walk paths of length 2n. The
difference between such a path and a Dyck path is that for such a path there is no
requirement that it return to 0 at time 2n. We also define w0 D 1. We now calculate
fwn g1 1
nD0 by deriving a recursion formula which involves fdn gnD0 .
Proof of Proposition 4.3. Of course every nonnegative random walk path of length
2n C 2, when restricted to its first 2n steps, constitutes a nonnegative random walk
path of length 2n. A nonnegative random walk path of length 2n which does not
return to 0 at time 2n, that is, which is not a Dyck path, can be extended in four
different ways to create a nonnegative random walk path of length 2n C 2. On the
4 Arcsine Laws for Random Walk 43
other hand, a nonnegative random walk path of length 2n which is a Dyck path can
only be extended in two different ways to create a nonnegative random walk path of
length 2n C 2. Thus, we have the relation
Let
1
X
W .x/ D wn x n
nD0
be the generating function for fwn g1nD0 . As with the power series defining D.x/,
it is clear that the power series defining W .x/ converges for jxj < 14 . Multiply
equation (4.22) by x n and sum over n from 0 to 1. On the left side we obtain
P 1
nD0 wnC1 x D x .W .x/1/, and on the right hand side we obtain 4W .x/2D.x/.
n 1
1 2xD.x/
W .x/ D : (4.23)
1 4x
1 1
W .x/ D p ; jxj < :
1 4x 4
1
!
X 2n n
W .x/ D x ;
nD0
n
2n
and we conclude that wn D n
.
Armed with Propositions 4.2 and 4.3, we can give a quick proof of (4.6).
Proof of Theorem 4.1. By the remark after Proposition 4.3, it follows that (4.6)
holds for k D n. So we now assume that k 2 f0; 1; : : : ; n 1g. Given a random
walk path, fxj glj D0 , we define the negative of the path to be the path fxj glj D0 .
.2n/
If a random walk path of length 2n satisfies L0 D 2k, then its first 2k steps
constitute a random walk path that returns to 0 at time 2k, and its last 2n 2k
44 4 Arcsine Laws for Random Walk
steps constitute either a random walk path that is strictly positive or thenegative
of such a path. As noted in the remark after Proposition 4.3, there are 2k k
random
walk paths of length 2k that return to 0 at time 2k. How many strictly positive
random walk paths of length 2n 2k are there? Let fxj g2n2k j D0 be such a path. Then
x1 D 1, and by parity considerations, x2n2k 2. Consider now the part of the
path from time 1 to time 2n 2k. If we relabel and subtract one, yj D xj C1 1,
j D 0; 1 : : : ; 2n 2k 1, then we obtain a nonnegative random walk path of length
2n 2k 1. By defining y2n2k D y2n2k1 ˙ 1, we can extend this path in two
ways to get a nonnegative random walk path of length 2n 2k. This reasoning
shows that there is a two-to-one correspondence between nonnegative random walk
paths of length 2n 2k and strictly positive
random walk paths of length 2n 2k.
We know that there are wnk D 2n2k nk
nonnegative random walk paths of length
2n 2k; thus, we conclude thatthe number
of strictly positive random walk paths
of length 2n 2k is equal to 12 2n2k
nk
. We conclude from the above analysis that
.2n/
the number of random walk paths of length 2n that satisfy L0 D 2k is equal to
2k 2n2k
k nk
, from which (4.6) follows.
We now consider (4.7). In Exercise 4.4 the reader is asked to apply Stirling’s
formula and show that for any > 0,
2k 2n2k
1 1
k nk
p ; uniformly over n k .1 /n; as n ! 1:
22n k.n k/
(4.24)
Using (4.24) and (4.6), we have for 0 < a < b < 1
2k 2n2k
L
.2n/ X
Œnb
X
Œnb
1 1
P .a < 0 b/ D k nk
p D
2n 22n k.n k/
kDŒnaC1 kDŒnaC1
1 X
Œnb
1 1
q ; as n ! 1: (4.25)
k
.1 k n
/
kDŒnaC1 n n
But the last term on the right hand side of (4.25) is a Riemann sum for
R
1 b p 1
a x.1x/
dx. Thus, letting n ! 1 in (4.25) gives
Z p
L 1
.2n/ b
1 2 2 p
lim P .a < 0 b/ D p dx D arcsin b arcsin a;
n!1 2n a x.1 x/
for 0 < a < b < 1;
Proof of Theorem 4.2. We need to prove (4.8). Of course, (4.9) follows from (4.8)
C
just like (4.7) followed from (4.6). Recalling the symmetric definition of O2n , for
the purpose of this proof, we will refer to S2k as “positive” if either S2k > 0 or
S2k D 0 and S2k1 > 0. Let cn;k denote the number of random walk paths of length
2n which are positive at exactly 2k steps. Since there are 22n random walk paths of
length 2n, in order to prove (4.8), we need to prove that
! !
2k 2n 2k
cn;k D ; k D 0; 1; : : : ; n: (4.26)
k nk
By Proposition 4.3, we have cn;n D 2n n
, and by symmetry, cn;0 D 2n n
; thus, (4.26)
holds for k D 0; n.
C
Consider now k 2 Œn 1. A random walk path that satisfies O2n D 2k
must return to 0 before step 2n. Consider the first return to 0. If the path was
positive before the first return to 0, then the first return to 0 must occur at step 2j , for
some j 2 Œk (for otherwise, the path would be positive for more than 2k steps). If
the path was negative before the first return to 0, then the first return to 0 must occur
at step 2j , for some j 2 Œn k (for otherwise the path would be positive for fewer
than 2k steps). In light of these facts, and recalling that vj D dj 1 is the number of
primitive Dyck paths of length 2j , it follows that for j 2 Œk, the number of random
walk paths of length 2n which start out positive, return to 0 for the first time at step
2j , and are positive for exactly 2k steps is equal to dj 1 cnj;kj , Similarly, for
j 2 Œn k, the number of random walk paths of length 2n which start out negative,
return to 0 for the first time at step 2j , and are positive for exactly 2k steps is equal
to dj 1 cnj;k . Thus, we obtain the recursion relation
X
k X
nk
cn;k D dj 1 cnj;kj C dj 1 cnj;k ; k 2 Œn 1: (4.27)
j D1 j D1
Let en WD 2nn
, n 0. As follows from the remark after Proposition 4.3, for
n 1, en is the number of random walk paths of length 2n that are equal to 0 at
step 2n. We derive a recursion formula for fen g1 nD0 . A random walk path of length
2n which is equal to 0 at step 2n must return to 0 for the first time at step 2k, for
some k 2 Œn. The number of random walk paths of length 2n which are equal to 0
at time 2n and which return to 0 for the first time at step 2k is equal to 2vk enk D
2dk1 enk . Consequently, we obtain the recursion formula
X
n
en D 2dk1 enk : (4.28)
kD1
We can now prove (4.26) by considering (4.27) and (4.28) and applying induction.
To prove (4.26) we need to show that for all n 1,
When n D 1, (4.29) clearly holds. We now assume that (4.29) holds for all n n0
and prove that it also holds for n D n0 C 1. When n D n0 C 1 and k D 0 or
k D n0 C 1, we already know that (4.29) holds. So we need to show that (4.29)
holds for n D n0 C 1 and k 2 Œn0 . Using (4.27) for the first equality, using the
inductive assumption for the second equality, and using (4.28) for the third equality,
we have
X
k C1k
n0X
cn0 C1;k D dj 1 cn0 C1j;kj C dj 1 cn0 C1j;k D
j D1 j D1
X
k C1k
n0X
dj 1 ekj en0 C1k C dj 1 ek en0 C1kj D
j D1 j D1
1 1
ek en0 C1k C en0 C1k ek D ek en0 C1k ; (4.30)
2 2
which proves that (4.29) holds for n D n0 C 1 and completes the proof of
Theorem 4.2.
Exercise 4.1. This exercise completes the proof of Proposition 4.1. We proved that
with probability one, the simple, symmetric random walk on Z visits 0 infinitely
often.
(a) For fixed x 2 Z, use the fact that with probability one the random walk visits
0 infinitely often to show that with probability one the random walk visits x
infinitely often. (Hint: Every time the process returns to 0, it has probability
. 12 /jxj of moving directly to x in jxj steps.)
(b) Show that with probability one the random walk visits every x 2 Z infinitely
often.
Exercise 4.2. In this exercise, you will prove that ET0 D 1, where T0 is the first
return time to 0. We can consider the random walk starting from any j 2 Z, rather
than just from 0. When we start the random walk from j , denote the corresponding
probabilities and expectations by Pj and Ej . Fix n 1 and consider starting the
random walk from some j 2 f0; 1; : : : ; ng. Let T0;n denote the first nonnegative
time that the random walk is at 0 or n.
(a) Define g.j / D Ej T0;n . By analyzing what happens on the first step, show that
g solves the difference equation g.j / D 1 C 12 g.j C 1/ C 12 g.j 1/, for
j D 1; : : : ; n 1. Note that one has the boundary conditions g.0/ D g.n/ D 0.
(b) Use (a) to show that Ej T0;n D j.n j /. (Hint: Write the difference equation
in the form g.j C 1/ g.j / D g.j / g.j 1/ 2.)
(c) In particular, (b) gives E1 T0;n D n 1. From this, conclude that ET0 D 1.
O0
Exercise 4.3. Prove (4.10): limn!1 P . 2n2n > / D 0; for all > 0. (Hint:
P2n
Represent O2n0 0
by O2n D j D1 1fSj D0g , where 1fSj D0g is as in the proof of
4 Arcsine Laws for Random Walk 47
0
O2n
Proposition 4.1. From this representation, show that limn!1 E 2n
D 0. Conclude
from this that (4.10) holds.)
Exercise 4.4. Use Stirling’s formula to prove (4.24). That is, show that for any
; ı > 0, there exists an n ;ı such that if n n ;ı , then
2k 2n2k
p
1ı k nk
k.n k/ 1 C ı;
22n
Chapter Notes
The arcsine law in Theorem 4.2 was first proven by P. Lévy in 1939 in the context
of Brownian motion, which is a continuous time and continuous path version of the
simple, symmetric random walk. The proof of Theorem 4.2 is due to K.L. Chung and
W. Feller. One can find a proof in volume 1 of Feller’s classic text in probability [19].
48 4 Arcsine Laws for Random Walk
One can also find there a proof of Theorem 4.1. Our proofs of these theorems are
a little different from Feller’s proofs. As expected, the proofs in Feller’s book have
a probabilistic flavor. We have taken a more combinatorial/counting approach via
generating functions. Proposition 4.3 and Corollary 4.2 can be derived alternatively
via the “reflection principle”; see [19]. For a nice little book on random walks from
the point of view of electrical networks, see Doyle and Snell [15]; for a treatise on
random walks, see the book by Spitzer [32].
Chapter 5
The Distribution of Cycles in Random
Permutations
In this chapter we study the limiting behavior of the total number of cycles and of
the number of cycles of fixed length in random permutations of Œn as n ! 1. This
class of problems springs from a classical question in probability called the envelope
matching problem. You have n letters and n addressed envelopes. If you randomly
place one letter in each envelope, what is the asymptotic probability as n ! 1 that
no letter is in its correct envelope?
Let Sn denote the set of permutations of Œn. Of course, Sn is a group, but the
group structure will not be relevant for our purposes. For us, a permutation
2
Sn is simply a 1-1 map of Œn onto Œn. The notation
j will be used to denote
the image of j 2 Œn under this map. We have jSn j D nŠ. Let PnU denote the
uniform probability measure on Sn . That is, PnU .A/ D jAjnŠ
, for any subset A Sn .
If
j D j , then j is called a fixed point for the permutation
. Let Dn Sn
denote the set of permutations that do not fix any points; that is,
2 Dn if
j ¤ j ,
for all j 2 Œn. Such permutations are called derangements. The classical envelope
matching problem then asks for limn!1 PnU .Dn /.
The standard way to solve the envelope matching problem is by the method of
inclusion–exclusion. Define Gi D f
2 Sn W
i D i g. (We suppress the dependence
of Gi on n since n is fixed in this discussion.) Then the complement Dnc of Dn is
given by Dnc D [niD1 Gi , and the inclusion–exclusion principle states that
X
n X
P .[niD1 Gi / D P .Gi / P .Gi \ Gj /C
iD1 1i<j n
X
P .Gi \ Gj \ Gk / C .1/n1 P .\niD1 Gi /:
1i<j <kn
(See Exercise A.2 in Appendix A.) Each of the probabilities above can be computed
readily. After some calculations one finds that P .Dn / D 1 P .[niD1 Gi / D 1 1 C
1
2Š
3Š1 C C .1/n nŠ1 ; thus, limn!1 P .Dn / D e 1 .
.n/
Here is an elegant, alternative proof using generating functions. Let dk denote
the number of permutations in Sn that fix exactly k points. We need to calculate
.n/
d0
limn!1 nŠ
. Clearly,
X
n
.n/
dk D nŠ; (5.1)
kD0
or equivalently
X
n
d0
.nk/
D 1: (5.2)
kŠ.n k/Š
kD0
P
If one multiplies the absolutely convergent
P1 power series 1 n
nD0 an x by the abso-
n
lutely convergent
P power series b
nD0 P n x , one gets the absolutely convergent
power series 1 c
nD0 n x n
, where c n D n
a b
kD0 k nk . Thus, it follows from (5.2)
that
1 1 1
X x n X d0
.n/
X
xn D x n ; jxj < 1;
nD0
nŠ nD0
nŠ nD0
or
1
X .n/
d e x
0
xn D ; jxj < 1: (5.3)
nD0
nŠ 1x
.n/
d0
Thus nŠ
is the coefficient of x n in
e x x2 x3
D .1 x C C /.1 C x C x 2 C x 3 C /;
1x 2Š 3Š
.n/
d0
and this is easily seen to give nŠ
D11C 1
2Š
1
3Š
C C .1/n nŠ1 .
5 Cycles in Random Permutations 51
In order to begin our study of the behavior of the number of cycles and of the
number of cycles of fixed length in random permutations, we recall some basic facts
and notation concerning cycles of permutations. Consider the permutation
2 S4
given in two-line form by 12 24 31 43 . This means that
1 D 2;
2 D 4, etc. Since
1 goes to 2, 2 goes to 4, 4 goes to 3, and 3 goes back to 1, we call
cyclic and
denote this by writing
D .1 2 4 3/. (We could also just as well write it as .4 3 1 2/,
for example.) Recall that every permutation can be decomposed into a product of
disjoint cycles. For example, consider
2 S8 given by 13 22 35 48 56 67 71 84 . Under
,
1 goes to 3, 3 goes to 5, 5 goes to 6, 6 goes to 7, and 7 goes back to 1, closing a
cycle. Now 2 goes to 2, which makes a cycle unto itself, and finally, 4 goes to 8
and 8 goes back to 4. Therefore, we write
D .1 3 5 6 7/.2/.4 8/ or, alternatively,
D .1 3 5 6 7/.4 8/; in the latter form, the convention is that every number that does
not appear at all forms a cycle unto itself. Note that
has one cycle of length 5, one
cycle of length 2, and one cycle of length 1.
.n/
For
2 Sn and j 2 Œn, let Cj .
/ denote the number of cycles of length j in
X
n
.n/
jCj .
/ D n:
j D1
X
n
.n/
N .n/ .
/ D Cj .
/
j D1
where
X .n/ .
/
Kn . / D N
2Sn
.n/ WD . C 1/ . C n 1/; n 1:
nŠ Y aj 1
n
.n/ .n/
Pn./ .C1 D a1 ; C2 D a2 ; : : : ; Cn.n/ D an / D . / :
.n/ j D1 j aj Š
N .n/
! in probabilityI
log n
N .n/
lim Pn./ .j j / D 0:
n!1 log n
5 Cycles in Random Permutations 53
j
P .Z D j / D e ; for j D 0; 1; : : : :
jŠ
j
The j discrete random variables fXi giD1 are called independent if P .X1 D
Qj j
x1 ; : : : ; Xj D xj / D iD1 P .Xi D xi /, for all choices of fxi giD1 R. In the
sequel, Z will denote a random variable distributed according to Pois./, and it
j j
will always be assumed that fZi giD1 are independent for distinct fi giD1 .
We will prove a weak convergence result for small cycles.
./
Theorem 5.2. Let 2 .0; 1/. Let j be a positive integer. Under the measure Pn ,
.n/ .n/ .n/
the distribution of the random vector .C1 ; C2 ; : : : ; Cj / converges weakly to the
distribution of .Z ; Z ; : : : ; Z /. That is,
2 j
Y
j
. i /mi
e i
.n/ .n/ .n/
lim Pn./ .C1 D m 1 ; C2 D m 2 ; : : : ; Cj D mj / D ;
n!1
iD1
mi Š
mi 0; i D 1; : : : ; j: (5.4)
Y
j
. ki /mi
e ki
.n/ .n/ .n/
lim P ./ .Ck1 D m1 ; Ck2 D m2 ; : : : ; Ckj D mj / D ;
n!1 n mi Š
iD1
mi 0; i D 1; : : : ; j: (5.5)
.n/
In particular, for any fixed j , the distribution of Cj converges weakly to the
Pois. j / distribution. Actually, (5.5) can be deduced directly from (5.4); see
Exercises 5.2 and 5.3.
Our proofs of these two theorems will be very combinatorial, through the method
of generating functions. The use of purely probabilistic reasoning will be rather
minimal.
For the proofs of the two theorems, we will need to evaluate the normalizing
constant Kn . /. Of course, this is trivial in the case of the uniform measure, that
is, the case D 1. Let s.n; k/ denote the number of permutations in Sn that have
exactly k cycles. From the definition of Kn . /, we have
X
n
Kn . / D s.n; k/ k : (5.6)
kD1
54 5 Cycles in Random Permutations
Proposition 5.2.
Kn . / D .n/ :
Remark. The numbers s.n; k/ are called unsigned Stirling numbers of the first kind.
Proposition 5.2 and (5.6) show that they arise as the coefficients of the polynomials
qn . / WD .n/ D . C 1/ . C n 1/.
Proof. There are .n 1/Š permutations in Sn that contain only one cycle and one
permutation in Sn that contains n cycles:
Note that (5.7) and (5.8) uniquely determine s.n; k/ for all n 1 and all k 2 Œn.
To create a permutation
0 2 SnC1 , we can start with a permutation
2 Sn
and then take the number n C 1 and either insert it into one of the existing cycles
of
or let it stand alone as a cycle of its own. If we insert n C 1 into one of the
existing cycles, then
0 will have k cycles if and only if
has k cycles. There are n
possible locations in which one can place the number nC1 and preserve the number
of cycles. (The reader should verify this.) Thus, from each permutation in Sn with
k cycles, we can construct n permutations in SnC1 with k cycles. If, on the other
hand, we let n C 1 stand alone in its own cycle, then
0 will have k cycles if and
only if
has k 1 cycles. Thus, from each permutation in Sn with k 1 cycles, we
can construct one permutation in SnC1 with k cycles. Now (5.8) is the mathematical
expression of this verbal description.
Let cn;k denote the coefficient of k in qn . / D . C 1/ . C n 1/. Clearly
cn;1 D .n 1/Š and cn;n D 1, for n 1. Writing qnC1 . / D qn . /. C n/, one sees
that cnC1;k D ncn;k C cn;k1 , for n 2; 2 k n. Thus, cn;k satisfies the same
recursion relation (5.8) and the same boundary condition (5.7) as does s.n; k/. We
conclude that cn;k D s.n; k/. The proposition follows from this along with (5.6).
./
In light of Proposition 5.2, from now on, we write the probability measure Pn
in the form
.n/
N .
/
Pn./ .f
g/ D :
.n/
We now set the stage to prove Theorem 5.1. The probability generating function
PX .s/ of a random variable X taking nonnegative integral values is defined by
1
X
PX .s/ D Es X D s i P .X D i /; jsj 1:
iD0
5 Cycles in Random Permutations 55
X
n
PN .n/ .sI / D s i Pn./ .N .n/ D i /:
iD1
i s.n; i /
Pn./ .N .n/ D i / D :
.n/
Using this with (5.6) and Proposition 5.2 gives
X
n
i s.n; i / .s /.n/ s.s C 1/ .s C n 1/
PN .n/ .sI / D si D D D
iD1
.n/ .n/ . C 1/ . C n 1/
Y
n
i 1
. sC /: (5.9)
iD1
Ci 1 Ci 1
Pn
X.Ci 1/1
Y
n
PZn; .s/ D Es Zn; D Es i D1
D Es X.Ci 1/1 D
iD1
Y
n
i 1
. sC /: (5.10)
iD1
Ci 1 Ci 1
For the third equality above we have used the fact that the expected value of a
product of independent random variables is equal to the product of their expected
values. From (5.9), (5.10), and the uniqueness of the probability generating function,
we obtain the following proposition.
./
Proposition 5.3. Under Pn , the distribution of N .n/ is equal to the distribution of
P n
iD1 X.Ci1/1 , where fX.Ci1/1 giD1 are independent random variables, and
n
X
n Z n1
1 1
Var.Zn; / C dx D C log.n 1/: (5.13)
iD2
i 1 1 x
Using (5.12) for the last inequality below, we have for sufficiently large n
Zn;
P .j j / D P .jZn; log nj log n/ D
log n
P .j Zn; EZn; C EZn; log n/j log n/
P .jZn; EZn; j log n jEZn; log n/j/
1
P .jZn; EZn; j log n/: (5.14)
2
5 Cycles in Random Permutations 57
Applying Chebyshev’s inequality to the last term in (5.14), it follows from (5.13)
and (5.14) that for sufficiently large n,
Zn; C log.n 1/
P .j j / 1 2
: (5.15)
log n 4
log2 n
nŠ
cn .a1 ; : : : ; an / D Qn ai
:
iD1 i ai Š
nŠ Y . i /mi X Y
j n
. i /ai
:
.n/ iD1 mi Š Pj P iDj C1
ai Š
i D1imi C niDj C1 iai Dn
aj C1 0;:::;an 0
58 5 Cycles in Random Permutations
The sum on the right hand side above is a real mess; however, a sophisticated
application of generating functions in conjunction with the lemma will allow us
to evaluate the right hand side of (5.16) indirectly.
Proof of Lemma 5.1. First we separate out a1 numbers for 1 cycles, 2a2 numbers
for 2 cycles,: : :, .n 1/an1 numbers for .n 1/ cycles, and finally the last nan
numbers for n cycles. The number of ways of doing this is
! ! ! !
n n a1 n a1 2a2 n a1 .n 1/an1
D
a1 2a2 3a3 nan
nŠ
:
a1 Š.2a2 /Š .nan /Š
The a1 numbers selected for 1 cycles need no further differentiation. The 2a2
numbers selected for 2 cycles must be separated out into a2 pairs. Of course the
order of the pairs is irrelevant, so the number of ways of doing this is
! ! ! !
1 2a2 2a2 2 4 2 .2a2 /Š
D :
a2 Š 2 2 2 2 a2 Š.2Š/a2
The 3a3 numbers selected for 3 cycles must be separated out into a3 triplets, and then
each such triplet must be ordered in a cycle. The number of ways of separating the
3a3 numbers into triplets is
! ! ! !
1 3a2 3a3 3 6 3 .3a3 /Š
D :
a3 Š 3 3 3 3 a3 Š.3Š/a3
Each such triplet can be ordered into a cycle in .31/Š ways.a Thus, we conclude that
3 .3a3 /Š
the 3a3 numbers can be arranged into a3 3 cycles in ..31/Š/
a3 Š.3Š/a3
ways. Continuing
like this, we obtain
We now turn to generating functions. Consider an infinite dimensional vector
x D .x1 ; x2 ; : : :/, and for any positive integer n, define x .n/ D .x1 ; : : : ; xn /. For
a D .a1 ; : : : ; an /, let x a D .x .n/ /a WD x1a1 xnan . Let T .
/ denote the cycle type
of
2 Sn . Define the cycle index of Sn , n 1, by
5 Cycles in Random Permutations 59
1 X T .
/ 1 X
n .x/ D n .x .n/ / D x D cn .a/x a :
nŠ
2G nŠ Pn
i D1 iai Dn
a1 0;:::;an 0
We also define 0 .x/ D 1. We now consider (formally for the moment) the
generating function for n . x/:
1
X
G ./ .x; t / D n . x/t n ; x D .x1 ; x2 ; : : :/:
nD0
Using Lemma 5.1, we can obtain a very nice representation for G ./ , as well as a
domain on which its defining series converges. Let jjxjj1 WD supn1 jxn j.
Proposition 5.4.
1
X xi t i
G ./ .x; t / D exp. /; for jt j < 1; jjxjj1 < 1:
iD1
i
Proof. Consider t 2 Œ0; 1/ and x with xj 0 for all j , and jjxjj1 < 1. Using
Lemma 5.1 and the definition of n .x/, we have
1
X X cn .a/. x/a t n
G ./ .x; t / D D
Pn nŠ
j D1 jaj Dn
nD0
a1 0;:::;an 0
1
X X nŠ . x1 /a1 . xn /an t n
Qn ai
D
Pn iD1 i ai Š nŠ
j D1 jaj Dn
nD0
a1 0;:::;an 0
1
X X Y
n i
X 1 xi t i ai
Y 1
Y
. xi t /ai . / xi t i
i
D i
D e i D
Pn ai Š a1 0;a2 0;::: iD1
ai Š
j D1 jaj Dn
nD0 iD1 iD1
a1 0;:::;an 0
1
X xi t i
exp. /: (5.17)
iD1
i
The right hand side above converges for t and x in the range specified at the
beginning of the proof. Since all of the summands in sight are nonnegative, it follows
that the series defining G ./ is convergent in this range. For t and x in the range
specified in the statement of the theorem, the above calculation shows that there is
absolute convergence and hence convergence.
We now exploit the formula for G ./ .x; t / in Proposition 5.4 in a clever way.
Recall that
60 5 Cycles in Random Permutations
1 i
X t
log.1 t / D : (5.18)
iD1
i
X1 X1
1
b i t i
D i t i ; jt j < 1:
.1 t / iD0 iD0
P1
P1 >i 1, also assume that iD0 jbi j < 1. If 2 .0; 1/, also assume that
If
iD0 s jbi j < 1, for some s > 1. Then
X 1
nŠ
lim .n/
n D bi :
n!1
iD0
1
Proof. Since . .1t/ /
.n/
jtD0 D . C 1/ . C n 1/ D .n/ , the Taylor expansion
1
for .1t/
is given by
X .n/ 1
1
D t n; (5.20)
.1 t / nD0
nŠ
where Pfor convenience we have defined .0/ D 1. Thus, the Taylor expansion for
1 1 i
.1t/ iD0 bi t is given by
X1 X1
1
b i t i
D dn t n ;
.1 t / iD0 nD0
Pn .ni /
where dn D iD0
bi .ni/Š . Therefore, by the assumption in the lemma, we have
X
n
.ni/
n D bi :
iD0
.n i /Š
5 Cycles in Random Permutations 61
If DP1, then kŠ D .k/ , for all k. Consequently the above equation reduces to
n
n D iD0 bi , and thus the statement of the lemma holds. When ¤ 1, then using
the additional assumptions on fbi g1iD0 , we can show that
1
nŠ X X
n
.ni/
lim b i D bi ; (5.21)
n!1 .n/
iD0
.n i /Š iD0
which finishes the proof of the lemma. The reader is guided through a proof of (5.21)
in Exercise 5.5.
We can now give the proof of Theorem 5.2.
Proof of Theorem 5.2. From (5.19) and the original definition of G ./ .x; t /, we have
Xj
X1
1 .xi 1/t i
exp. / D n . x .j /I1 /t n : (5.22)
.1 t / iD1
i nD0
Xj
X1
.xi 1/t i
n D n . x .j /I1
/ and exp. /D bi t i : (5.23)
iD1
i iD0
P1In order to be able to apply the lemma for all > 0, we need to show that
N 1
iD0 s jbi j < 1, for some s > 1. Define fbi giD0 by
i
Xj
X1
jxi 1jt i
exp. /D bNi t i : (5.24)
iD1
i iD0
Since all of the coefficients in the sum in the exponent on the left hand side of (5.24)
are nonnegative, we have bNi jbi j 0, for all i . The reader is asked to prove this
in Exercise 5.6. The function on the left hand side of (5.24) is real analytic for all
t 2 R (and complex analytic for all complex t ); consequently, its power series on
the right handPside converges for all t 2 R. From this and the nonnegativity of bNi , it
1 iN N
P1thati iD0 s bi < 1, for all s 0, and then, since jbi j bi , we conclude
follows
that iD0 s jbi j < 1, for all s 0.
By definition, from (5.23), we have
1
X Xj
xi 1
bi D exp. /: (5.25)
iD0 iD1
i
62 5 Cycles in Random Permutations
Consider now
nŠ nŠ 1 X
n D .n/ n . x .j /I1 / D .n/ cn .a/. x .j /I1 /a : (5.26)
.n/ Pn
i D1iai Dn
a1 0;:::;an 0
For any given j -vector .m1 ; : : : ; mj / with nonnegative integral entries, the coeffi-
m
cient of x1m1 x2m2 xj j in (5.26) is
1 X Pj Pn
mi C i Dj C1 ai
i D1 cn .m1 ; : : : ; mj ; aj C1 ; : : : ; an /:
.n/ Pj Pn
i D1imi C i Dj C1 iai Dn
aj C1 0;:::;an 0
Xj
xi 1 X m
exp. /D pm1 ;:::;mj ./ x1m1 xj j : (5.29)
i
iD1 m 1 0;:::;mj 0
m
On the one hand, (5.29) shows that the coefficient of x1m1 xj j in the Taylor
Pj
expansion about x D 0 of the function exp. iD1 xi i1 / is pm1 ;:::;mj . /. On the
other hand, by Taylor’s formula, this coefficient is equal to
Pj xi 1
1 @m1 CCmj exp. iD1 /
mj
i
jxD0 D
m1 Š mj Š @m
x1 @xj
1
1 Xj
1 Y mi
j
Yj mi
. /
exp. / . / D e i i : (5.30)
m1 Š mj Š iD1
i iD1 i iD1
mi Š
5 Cycles in Random Permutations 63
Y
j
. i /mi
e i
.n/ .n/ .n/
lim Pn./ .C1 D m 1 ; C2 D m 2 ; : : : ; Cj D mj / D ;
n!1
iD1
mi Š
X j
Y . i /ri
e i
./ .n/ .n/ .n/
lim Pn .C1 m1 ; C2 m2 ; : : : ; Cj mj /D ;
n!1 ri Š
0r1 m1 ;:::;0rj mj iD1
mi 0; i D 1; : : : ; j: (5.31)
Exercise 5.3. In this exercise you will show directly that (5.5) follows from (5.4).
(a) Fix an integer j 2. Use (5.31) to show that for any > 0, there exists an N
such that if n N and m N , then
.n/
Pn./ .Ci > m; for some i 2 Œj / < : (5.32)
(b) From (5.31) and (5.32), deduce that (5.31) also holds if some of the mi are equal
to 1.
(c) Prove that (5.5) follows from (5.4).
Exercise 5.4. This exercise gives an alternative probabilistic proof of Proposi-
tion 5.3. A uniformly random ( D 1) permutation
2 Sn can be constructed in
the following manner via its cycles. We begin with the number 1. Now we randomly
choose a number from Œn. If we chose j , then we declare that
1 D j . This is the
first stage of the construction. If j ¤ 1, then we randomly choose a number from
Œn fj g. If we chose k, then we declare that
j D k. This is the second stage of
the construction. If k ¤ 1, then we randomly choose a number from Œn fj; kg.
We continue like this until we finally choose 1, which closes the cycle. For example,
if after k we chose 1, then the permutation
would contain the cycle .1j k/. Once
we close a cycle, we begin again, starting with the smallest number that has not yet
been used. We continue like this for n stages, at which point the permutation
has
been defined completely.
(a) The above construction has n stages. Show that the probability of completing a
1
cycle on the j th stage is nC1j . Thus, letting
64 5 Cycles in Random Permutations
(
.n/ 1; if a cycle was completed at stage j I
Xj D
0; otherwise;
.n/
it follows that Xj Ber. nC1j
1
/.
.n/
(b) Argue that fXj gnj D1 are independent.
P .n/
(c) Show that the number of cycles N .n/ can be represented as N .n/ D nj D1 Xj ,
thereby proving Proposition 5.3 in the case D 1.
(d) Let 2 .0; 1/. Amend the above construction as follows. At any stage j ,
close the cycle with probability nCj , and choose any other particular number
1
that has not yet been used with probability nCj . Show that this construction
./
yields a permutation distributed according to Pn , and use the above reasoning
to prove Proposition 5.3 for all > 0.
P
Exercise 5.5. (a) Show that if 1 iD0 jbi j < 1 and the triangular array fcn;i W i D
0; 1; : : : ; nI n DP0; 1; : : :g is bounded
P and satisfies limn!1 cn;ni D 1, for all
i , then limn!1 niD1 bi cn;ni D 1 iD1 bi . Then use this to prove (5.21) in the
case that > 1.
nŠ .ni / .0/
(b) Show that if 2 .0; 1/, then .ni/Š .n/
ni
n
, if i < n. Also, nŠ0Š .n/
n,
where we recall, P .0/
D 1.
(c) Show that if 1 i
iD0 jbi js < 1, where s > 1, then jbi j s , for all large i .
i
nŠ Pn .ni /
(d) For 2 .0; 1/, prove (5.21) as follows. Break the sum .n/ iD0 bi .ni/Š into
three parts—from i D 0 to i D N , from i D N C 1 to i D Œ n2 , and from
i D Œ n2 C 1 to i D n. Use the reasoning in the proof of (a) to show that by
choosing N sufficiently P large, the limit as n !P 1 of the first part can be made
arbitrarily close to 1 iD0 ib . Use the fact that 1
iD0 jbi j < 1 to show that by
choosing N sufficiently large, the lim supn!1 of the second part can be made
arbitrarily small. Use (b) and (c) to show that the limit as n ! 1 of the third
part is 0.
Exercise 5.6. Prove that bNi jbi j, where fbi g1 N 1
iD0 and fbi giD0 are defined in (5.23)
and (5.24).
Exercise 5.7. Make a small change in the proof of Theorem 5.2 to show that (5.5)
holds.
.1/ .1/
Exercise 5.8. Consider the uniform probability measure Pn on Sn and let En
.1/
denote the expectation under Pn . Let Xn D Xn .
/ be the random variable denoting
the number of nearest neighbor pairs in the permutation
2 Sn , and let Yn D Yn .
/
be the random variable denoting the number of nearest neighbor triples in
2 Sn .
(A nearest neighbor pair for
is a pair k; k C 1, with k 2 Œn 1, such that
i D k
and
iC1 D k C 1, for some i 2 Œn 1, and a nearest neighbor triple is a triple
.k; k C 1; k C 2/ with k 2 Œn 2 such that
i D k,
iC1 D k C 1 and
iC2 D k C 2,
for some i 2 Œn 2.)
5 Cycles in Random Permutations 65
.1/
(a) Show that En Xn D 1, for all n. (Hint: Represent Xn as the sum of indicator
random variables fIk gn1
kD1 , where Ik .
/ is equal to 1 if k; k C 1 is a nearest
neighbor pair in
and is equal to 0 otherwise.) It can be shown that the
distribution of Xn converges weakly to the Pois.1/ distribution as n ! 1;
see [17].
.1/ .1/
(b) Show that limn!1 En Yn D 0 and conclude that limn!1 Pn .Yn D 0/ D 1.
Chapter Notes
Let .n/ denote the number of primes that are no larger than n; that is,
X
.n/ D 1;
pn
where here and elsewhere in this chapter and the next two, the letter p in a
summation denotes a prime. Euclid proved that there are infinitely many primes:
limn!1 .n/ D 1. The asymptotic density of the primes is 0; that is,
.n/
lim D 0:
n!1 n
The prime number theorem gives the leading order asymptotic behavior of .n/. It
states that
.n/ log n
lim D 1:
n!1 n
This landmark result was proved in 1896 independently by J. Hadamard and by C.J.
de la Vallée Poussin. Their proofs used contour integration and Cauchy’s theorem
from analytic function theory. A so-called “elementary” proof, that is, a proof that
does not use analytic function theory, was given by P. Erdős and A. Selberg in 1949.
Although their proof uses only elementary methods, it is certainly more involved
than the proofs of Hadamard and de la Vallée Poussin. We will not prove the prime
number theorem in this book. In this chapter we prove a precursor of the prime
number theorem, due to Chebyshev in 1850. Chebyshev was the first to prove that
.n/ grows on the order logn n . Chebyshev’s methods were ingenious but entirely
elementary. Given the truly elementary nature of his approach, it is quite impressive
how close his result is to the prime number theorem. Here is Chebyshev’s result.
Chebyshev’s result is not the type of result we are emphasizing in this book, since
it is not an exact asymptotic result but rather only an estimate. We have included the
result because we will need it to prove Mertens’ theorems in Chap. 7, and one of
Mertens’ theorems will be used to prove the Hardy–Ramanujan theorem in Chap. 8.
Define Chebyshev’s -function by
X
.n/ D log p: (6.1)
pn
We will give an exceedingly simple proof of the following result, which links the
asymptotic behavior of to that of .
Proposition 6.1.
(i) lim infn!1 .n/
n
D lim infn!1 .n/nlog n ;
(ii) lim supn!1 n D lim supn!1 .n/nlog n .
.n/
where the last inequality comes from the trivial fact that .y/ y. Dividing this by
n and letting n ! 1, and using the fact that 2 .0; 1/ is arbitrary, we obtain
thus,
!
2m C 1
22m : (6.6)
m
That is, in the sum above, a term log p appears for every prime p and integer k 1
for which p k n. So, for example, .14/ D 3 log 2 C 2 log 3 C log 5 C log 7 C
log 11 C log 13. Of course, .n/ .n/. We show now that and have the same
asymptotic behavior.
Proposition 6.2.
(i) lim infn!1 .n/
n
D lim infn!1 .n/
n
;
.n/ .n/
(ii) lim supn!1 n D lim supn!1 n
.
Proof. Since .n/ .n/, we have
log n
Œ log 2
X X X
.n/ .n/ D log p D log p D
p k n;k2 kD2 1
pn k
log n
Œ log 2
X 1 log n 1
.Œn k / .Œn 2 /: (6.10)
log 2
kD2
P
Now trivially, .k/ D pk log p k log k. Using this with (6.10) gives
1
.log n/2 n 2
.n/ .n/ : (6.11)
2 log 2
Remark. The bound obtained in (6.11) can be improved by replacing the trivial
bound on , namely, .k/ k log k, by the bound obtained from Theorem 6.2.
We will carry out a lower-bound analysis of . This will be somewhat more
involved than the upper bound analysis for but still entirely elementary. For n 2 N
and p a prime, let vp .n/ denote the largest exponent k such that p k jn. One calls
vp .n/ the p-adic value of n. It follows from the definition of vp that any positive
integer n can be written as
Y
nD p vp .n/ : (6.13)
p
In Exercise 6.1 the reader is asked to prove the following simple formula:
X
n
vp .nŠ/ D vp .m/: (6.15)
mD1
X
n X
n X 1
X X
vp .nŠ/ D vp .m/ D 1D 1: (6.16)
mD1 mD1 1k<1;p k jm kD1 1mn;p k jm
Theorem 6.3.
.n/
log 2:
lim inf
n n!1
Proof. Consider the binomial coefficient 2n
n
D .2n/Š
.nŠ/2
. Using (6.13) we have
!
2n .2n/Š Y Y
D 2
D p vp ..2n/Š/2vp .nŠ/ D p vp ..2n/Š/2vp .nŠ/ ; (6.17)
n .nŠ/ p p2n
where the final equality comes from the fact that neither .2n/Š nor nŠ has a prime
factor larger than 2n. From Proposition 6.3, we have
1
X 2n n
vp ..2n/Š/ 2vp .nŠ/ D 2 k : (6.18)
pk p
kD1
2n log 2n
Of course, pk
D n
pk
D 0 if p k > 2n, that is, if k >log p
. Thus,
log 2n
in the summation over k above, we may replace the upper limit 1 by log p .
Furthermore, it is easy to verify that Œ2x 2Œx is equal to either 0 or 1, for all
real numbers x. From these two facts we obtain from (6.18) the estimate
log 2n
0 vp ..2n/Š/ 2vp .nŠ/ : (6.19)
log p
22n Y Œ log 2n
p log p
2n p2n
or, equivalently,
X log 2n
2n log 2 log 2n log p: (6.22)
p2n
log p
P
Recalling from (6.8) that .2n/ D pk 2n;k1 log p, it follows that the summand
log p appears in .2n/ one time for each k 1 that satisfies p k 2n; that is, the
2n
summand log p appears Œ log
log p
times. Thus, the right hand side of (6.22) is equal to
.2n/, giving the inequality
.n/
lim inf log 2;
n!1 n
which completes the proof of the theorem.
We can now prove Chebyshev’s theorem in one line.
Proof of Theorem 6.1. The upper bound follows from Theorem 6.2 and part (ii)
of Proposition 6.1, while the lower bound follows from Theorem 6.3, part (i) of
Proposition 6.2, and part (i) of Proposition 6.1.
Exercise 6.1. Prove (6.14): vp .mn/ D vp .m/ C vp .n/; m; n 2 N.
Exercise 6.2. Prove that 2n
n
D maxk2Œ2n 2n
k
.
Exercise 6.3. Bertrand’s postulate states that for each positive integer n, there
exists a prime in the interval .n; 2n/. This result was first proven by Chebyshev.
Use the upper and lower bounds obtained in this chapter for Chebyshev’s -function
to prove the following weak form of Bertrand’s postulate: For every > 0, there
exists an n0 . / such that for every n n0 . / there exists a prime in the interval
.n; .2 C /n/.
74 6 Chebyshev’s Theorem
Chapter Notes
Chebyshev also proved that if limn!1 .n/nlog n exists, then this limit must be equal
to 1. For a proof, see Tenenbaums’ book [33]. Late in his life, in a letter, Gauss
recollected that in the early 1790s, when he was 15 or 16, he conjectured the prime
number theorem; however, he never published the conjecture. The theorem was
conjectured by Dirichlet in 1838. For some references for further reading, see the
notes at the end of Chap. 8.
Chapter 7
Mertens’ Theorems on the Asymptotic
Behavior of the Primes
Mertens’ second theorem will play a key role in the proof of the Hardy–
Ramanujan theorem in Chap. 8. For our proof of Mertens’ second theorem, we will
need a result known as Mertens’ first theorem.
Theorem 7.2.
X log p
D log n C O.1/; as n ! 1:
pn
p
p
We note that (7.1) follows from Stirling’s formula: nŠ nn e n 2 n. However, we
certainly don’t need such a precise estimate of nŠ to obtain (7.1). We give a quick
direct proof of (7.1). Consider an integer m 2 and x 2 Œm 1; m. Integrating the
inequality log.m 1/ log x log m over x 2 Œm 1; m gives
Z m
log.m 1/ log x dx log m;
m1
which we rewrite as
Z m
0 log m log x dx log m log.m 1/:
m1
Summing this inequality from m D 2 to m D n, and noting that the resulting series
on the right hand side is telescopic, we obtain
Z n
0 log nŠ log x dx log n: (7.2)
1
Rn
An integration by parts shows that 1 log x dx D n log n n C 1. Substituting this
in (7.2) gives
Thus, we have
Y Y P1 n
kD1 Œ p k
nŠ D p vp .nŠ/ D p ;
pn pn
and
XX 1 X n XX 1
n n
log nŠ D Œ k log p D Œ log p C Œ k log p: (7.3)
pn
p pn
p pn
p
kD1 kD2
7 Mertens’ Theorems 77
We now analyze the two terms on the right hand of (7.3), beginning with the
second term. We have
X1 X1 1
n 1 p2 n
Œ k n Dn D :
p pk 1 p1 p.p 1/
kD2 kD2
Thus, we obtain
XX 1 X log p
n
Œ k log p n C n; (7.4)
pn
p pn
p.p 1/
kD2
Recalling that Theorem 6.2 gives .n/ .log 4/n, we can estimate the second term
on the right hand side of (7.5) by
X n n X
0 . Œ / log p log p D .n/ .log 4/n: (7.6)
pn
p p pn
P log p
Comparing (7.1) with (7.7) allows us to conclude that pn p
D log n C O.1/,
completing the proof of Mertens’ first theorem.
In order to use Mertens’ first theorem to prove his second theorem, we need to
introduce Abel summation, a tool that is used extensively in number theory. Abel
summation is a discrete version of integration by parts. It appears in a variety of
guises, the following of which is the most suitable in the present context.
Proposition 7.1 (Abel Summation). Let j0 ; n 2 Z with j0 < n. Let a W Œj0 ; n \
PŒt
Z ! R, and let A W Œj0 ; n ! R be defined by A.t / D kDj0 a.k/. Let f W
Œj0 ; n ! R be continuously differentiable. Then
X Z n
a.r/f .r/ D A.n/f .n/ A.j0 /f .j0 / A.t /f 0 .t / dt: (7.8)
j0 <rn j0
78 7 Mertens’ Theorems
Remark. Since A.j0 / D a.j0 /, we could also write the above formula in the more
compact form
X Z n
a.r/f .r/ D A.n/f .n/ A.t /f 0 .t / dt: (7.9)
j0 rn j0
The form in the proposition of course mimics the standard integration by parts
formula.
Proof. Since A is constant between integers, we have
Z n1 Z
X X
n1
n
0
rC1
0
A.t /f .t / dt D A.t /f .t / dt D A.r/ f .r C 1/ f .r/ :
j0 rDj0 r rDj0
(7.10)
Substituting for A in the last term on the right hand side, and interchanging the order
of the resulting summation, we obtain
X
n1 n1 X
X r
A.r/ f .r C 1/ f .r/ D a.k/ f .r C 1/ f .r/ D
rDj0 rDj0 kDj0
X
n1 X
n1
X
n1
a.k/ f .r C 1/ f .r/ D a.k/ f .n/ f .k/ D
kDj0 rDk kDj0
X
n1
A.n 1/f .n/ a.k/f .k/: (7.11)
kDj0
X
n1 X
n
A.n/f .n/ A.j0 /f .j0 / A.n 1/f .n/ C a.k/f .k/ D a.k/f .k/;
kDj0 kDj0 C1
(7.12)
and let
1
f .t / D ; t > 1:
log t
We use Abel summation in the form (7.9) with j0 D 2. By Mertens’ first theorem,
we have
X
Œt
X log p
A.t / D a.k/ D D log t C O.1/; as t ! 1: (7.13)
p
kD2 pŒt
We have
Z n
1
dt D log log t jn2 D log log n log log 2;
2 t log t
R
and since 1
t.log t/2
dt D log1 t , we have
Z 1
1
dt < 1:
2 t .log t /2
Exercise 7.1. (a) Use Mertens’ first theorem and Abel summation to prove that
X log2 p 1
D log2 n C O.log n/:
pn
p 2
P 2 P
(Hint: Write pn logp p D 1rn a.r/ log r, where a.r/ is as in the proof of
Mertens’ second theorem.)
(b) Use induction and the result in (a) to prove that
X logk p 1
D logk n C O.logk1 n/;
pn
p k
Chapter Notes
The two theorems in this chapter were proven by F. Mertens in 1874. For some
references for further reading, see the notes at the end of Chap. 8.
Chapter 8
The Hardy–Ramanujan Theorem
on the Number of Distinct Prime Divisors
Let !.n/ denote the number of distinct prime divisors of n; that is,
X
!.n/ D 1:
pjn
Remark. From the proof of the theorem, it is very easy to infer that the statement of
the theorem is equivalent to the following statement: For every ı > 0,
1
jfn 2 ŒN W j!.n/ log log N j .log log N / 2 Cı gj
lim D 1:
N !1 N
While the statement of the theorem is probably more aesthetically pleasing than
this latter statement, the latter statement is more practical. Thus, for example, take
ı D :1. Then for sufficiently large n, a very high percentage of the positive integers
n
up to the astronomical number N D e e will have between nn:6 and nCn:6 distinct
prime factors. Let n D 109 . We leave it to the interested reader to estimate the
O.1/ terms appearing in the proofs of Mertens’ theorems, and to keep track of how
they appear in the proof of the Hardy–Ramanujan theorem below, and to conclude
109
that over ninety percent of the positive integers up to N D e e have between
109 .109 /:6 and 109 C .109 /:6 distinct prime factors. That is, over ninety percent
109
of the positive integers up to e e have between 109 251; 188 and 109 C 251; 188
distinct prime factors.
Our proof of the Hardy–Ramanujan theorem will have a probabilistic flavor. For
any positive integer N , let PN denote the uniform probability measure on ŒN ; that
is, PN .fj g/ D N1 , for j 2 ŒN . Then we may think of the distinct prime divisor
function ! D !.n/ as a random variable on the space ŒN with the probability
measure PN . For the sequel, note that when we write PN .! 2 A/, where A ŒN ,
what we mean is
1 X
N
EN ! D !.n/: (8.2)
N nD1
1 X 2
N
EN ! 2 D ! .n/: (8.3)
N nD1
VarN .!/
PN .j! EN !j / ; for > 0: (8.5)
2
8 Hardy–Ramanujan Theorem 83
In order to implement this, we need to calculate EN ! and VarN .!/ or, equivalently,
EN ! and EN ! 2 . The next two theorems give the asymptotic behavior as N ! 1
of EN ! and of EN ! 2 . The proofs of these two theorems will use Mertens’ second
theorem.
Theorem 8.2.
Remark. Recall the definition of the average order of an arithmetic function, given
in the remark following the number-theoretic proof of Theorem 2.1. Theorem 8.2
shows that the average order of !, the function counting the number of distinct
prime divisors, is given by the function log log n.
Proof. From the definition of the divisor function we have
X
N X
N X X X X N
!.n/ D 1D 1D Œ D
nD1 nD1 pjn pN pjn;nN pN
p
X 1 X N N
N . Œ /: (8.6)
pN
p pN p p
(We could use Chebyshev’s theorem (Theorem 6.1) to get the better bound O. logNN /
on the right hand side above, but that wouldn’t improve the order of the final bound
we obtain for EN !.) Mertens’ second theorem (Theorem 7.1) gives
X 1
D log log N C O.1/; as N ! 1: (8.8)
pN
p
X
N
!.n/ D N log log N C O.N /; as N ! 1;
nD1
Theorem 8.3.
Remark. To prove the Hardy–Ramanujan theorem, we only need the upper bound
Proof. We have
X X X X X X
! 2 .n/ D . 1/2 D . 1/. 1/ D 1C 1D 1 C !.n/: (8.11)
pjn p1 jn p2 jn p1 p2 jn pjn p1 p2 jn
p1 ¤p2 p1 ¤p2
Thus,
X
N X
N X X
N
! 2 .n/ D 1C !.n/: (8.12)
nD1 nD1 p1 p2 jn nD1
p1 ¤p2
The second term on the right hand side of (8.12) can be estimated by Theorem 8.2,
giving
X
N
!.n/ D NEN ! D N log log N C O.N /; as N ! 1: (8.13)
nD1
To estimate the first term on the right hand side of (8.12), we write
X
N X X X X N
1D 1D D
nD1 p1 p2 jn p1 p2 N nN p p N
p1 p2
1 2
p1 ¤p2 p1 ¤p2 p1 p2 jn p1 ¤p2
X 1 X N N
N : (8.14)
p1 p2 N
p1 p2 p p N p1 p2 p1 p2
1 2
p1 ¤p2 p1 ¤p2
Using Mertens’ second theorem for the second inequality below, we bound from
above the summation in the first term on the right hand side of (8.14) by
X 1 X 1 2
. /2 log log N C O.1/ ; as N ! 1: (8.16)
p1 p2 N
p1 p2 pN
p
p1 ¤p2
where the first equality follows from Theorem 8.2. For an alternative proof, see
Exercise 8.1.
We now use Chebyshev’s inequality along with the estimates in Theorems 8.2
and 8.3 to prove the Hardy–Ramanujan theorem.
Proof of Theorem 8.1. From Theorems 8.2 and 8.3 we have
2
VarN .!/DEN ! 2 .EN !/2 D.log log N /2 CO.log log N / log log N C O.1/ D
O.log log N /; as N ! 1: (8.17)
Thus,
1
lim PN j! log log N RN j .log log N / 2 Cı D 1: (8.19)
N !1
Translating (8.19) back to the notation in the statement of the theorem, we have for
every ı > 0
1
jfn 2 ŒN W j!.n/ log log N RN j .log log N / 2 Cı gj
lim D 1: (8.20)
N !1 N
86 8 Hardy–Ramanujan Theorem
The main difference between (8.20) and the statement of the Hardy–Ramanujan
theorem is that log log N appears in (8.20) and log log n appears in (8.1). Because
log log x is such a slowly varying function, this difference is not very significant.
The remainder of the proof consists of showing that if (8.20) holds for all ı > 0,
then (8.1) also holds for all ı > 0.
Fix an arbitrary ı > 0. Using the fact that (8.20) holds with ı replaced by 2ı , we
will show that (8.1) holds for ı. This will then complete the proof of the theorem.
The term RN in (8.20) may vary with N , but it is bounded in absolute value, say
1
by M . For N 2 n N , we have
1
log log N log log n log log N log log N 2 D log 2: (8.21)
Therefore, writing !.n/ log log n D .!.n/ log log N RN / C .log log N
log log n/ C RN , the triangle inequality and (8.21) give
1
j!.n/log log nj j!.n/log log N RN jClog 2CM; for N 2 n N: (8.22)
ı
Using (8.20) with ı replaced by 2
, along with (8.22) and the fact that
1
limN !1 NN 2
D 0, we have
1 1
jfn 2 ŒN W j!.n/ log log nj .log log N / 2 C 2 ı C log 2 C M gj
lim D 1:
N !1 N
(8.23)
1 1 1
By (8.21), it follows that .log log n/ 2 Cı .log log N log 2/ 2 Cı , for N 2 n N .
Clearly, we have
1 1 1
.log log N log 2/ 2 Cı .log log N / 2 C 2 ı C log 2 C M; for sufficiently large N:
Thus,
1 1 1 1
.log log n/ 2 Cı .log log N / 2 C 2 ı C log 2 C M; for N 2 n N and sufficiently large N:
(8.24)
1
From (8.23), (8.24), and the fact that limN !1 N
N
2
D 0, we conclude that
1
jfn 2 ŒN W j!.n/ log log nj .log log n/ 2 Cı gj
lim D 1:
N !1 N
Exercise 8.1. Prove the lower bound
P
by using (8.12)–(8.15) and an inequality that begins with 1
p1 p2 N p p
1 2
p1 ¤p2
P p 1
p1 ;p2 N p1 p2 .
p1 ¤p2
Exercise 8.2. Let .n/ denote the number of prime divisors of n,Qcounted with
prime factorization of n is given by n D m
repetitions. Thus, if the P ki
iD1 pi , then
m
!.n/ D m, but .n/ D iD1 ki . Use the method of proof in Theorem 8.2 to prove
that
1 X
N
EN D .n/ D log log N C O.1/; as N ! 1:
N nD1
Exercise 8.3. Let d.n/ denote the number of divisors of n. Thus, d.12/ D 6
because the divisors of 12 are 1,2,3,4,6,12. Show that
1X
n
d.j / D log n C O.1/:
n j D1
This shows that the average order of the divisor function is the function log n. Recall
from the remark after Theorem 8.2 that the average order of !.n/, the function
counting the number
P of distinct
P prime divisors,
P Pis the function
P logP log n. (Hint: We
have d.k/ D mjk 1, so nkD1 d.k/ D k2Œn mjk 1 D m2Œn k2ŒnWmjk 1.)
Chapter Notes
The theorem of G. H. Hardy and S. Ramanujan was proved in 1917. The proof we
give is along the lines of the 1934 proof of P. Turán, which is much simpler than the
original proof. For more on multiplicative number theory and primes, the subject
of the material in Chaps. 6–8, the reader is referred to Nathanson’s book [27] and
to the more advanced treatment of Tenenbaum in [33]. In [27] one can find a proof
of the prime number theorem by “elementary” methods. For very accessible books
on analytic number theory and a proof of the prime number theorem using analytic
function theory, see, for example, Apostol’s book [5] or Jameson’s book [25]. For
a somewhat more advanced treatment, see the book of Montgomery and Vaughan
[26]. One can also find a proof of the prime number theorem using analytic function
theory, as well as a whole trove of sophisticated material, in [33].
Chapter 9
The Largest Clique in a Random Graph
and Applications to Tampering Detection
and Ramsey Theory
A finite graph G is a pair .V; E/, where V is a finite set of vertices and E is a
subset of V .2/ , the set of unordered pairs of elements of V . The elements of E
are called edges. (This is what graph theorists call a simple graph. That is, there
are no loops—edges connecting a vertex to itself—and there are no multiple edges,
more than one edge connecting the same pair of vertices.) If x; y 2 V and the pair
fx; yg 2 E, then we say that an edge joins the vertices x and y;otherwise, we say
that there is no edge joining x and y. If jV j D n, then jV .2/ j D n2 D 12 n.n1/. The
size of the graph is the number of vertices it contains, that is, jV j. We will identify
the vertex set V of a graph of size n with Œn. The graph G D .V; E/ with jV j D n
and E D V .2/ is called the complete graph of size n and is henceforth denoted
by Kn . This graph has n vertices and an edge connects every one of the 12 n.n 1/
pairs of vertices. See Fig. 9.1.
For a graph G D .V; E/ of size n, a clique of size k 2 Œn is a complete subgraph
.2/
K of G of size k; that is, K D .VK ; EK /, where VK V; jVK j D k and EK D VK .
See Fig. 9.2.
Consider the vertex set V D Œn. Now construct the edge set E Œn.2/ in the
following random fashion. Let p 2 .0; 1/. For each pair fx; yg 2 Œn.2/ , toss a coin
with probability p of heads and 1p of tails. If heads occurs, include the pair fx; yg
in E, and if tails occurs, do not include it in E. Do this independently for every
pair fx; yg 2 Œn.2/ . Denote the resulting random edge set by En .p/. The resulting
random graph is sometimes called an Erdős–Rényi graph; it will be denoted by
Gn .p/ D .Œn; En .p//. In this chapter, the generic notation P for probability and E
for expectation will be used throughout.
To get a feeling for how many edges one expects to see in the random graph,
attach to each of the N WD 12 n.n 1/ potential edges a random variable which is
equal to 1 if the edge exists in the random set of edges En .p/ and is equal to 0
if the edge does not exist in En .p/. Denote these random variables by fWm gN mD1 .
The random variables are distributed according to the Bernoulli distribution with
10
3
6
9
5
2 7 8
1
4
Fig. 9.2 A graph with 10 vertices and 13 edges. The largest clique is the one of size 4, formed by
the vertices f4; 5; 6; 7g
inequality gives
1C Np.1 p/
P .jSN Npj N 2 / :
N 1C
1C
Consequently, for any > 0, one has limN !1 P .jSN Npj N 2 / D 0. Thus,
for any > 0 and large n (depending on ), with high probability the Erdős–Rényi
graph Gn .p/ will have 12 n2 p C O.n1C / edges.
The main question we address in this chapter is this: how large is the largest
complete subgraph, that is, the largest clique, in Gn .p/, as n ! 1? We study this
question in Sect. 9.2. In Sect. 9.3 we apply the results of Sect. 9.2 to a problem in
tampering detection. In Sect. 9.4, we discuss Ramsey theory for cliques in graphs
and use random graphs to give a bound on the size of a fundamental deterministic
quantity.
9.2 The Size of the Largest Clique 91
Let Ln;p be the random variable denoting the size of the largest clique in Gn .p/. Let
.2/
log 1 n WD log 1 log 1 n.
p p p
Theorem 9.1. Let Ln;p denote the size of the largest clique in the Erdős–Rényi
graph Gn .p/. Then
(
.2/ 0; if c < 2I
lim P Ln;p 2 log 1 n c log 1 n D
n!1 p p 1; if c > 2:
.2/
ii. If kn 2 log 1 n c log 1 n, for some c > 2, then Nn;p .kn / converges to 1 in
p p
probability; that is,
of Theorem 9.1. The number of cliques of size kn in the complete graph Kn
Proof
is knn ; denote these cliques by fKjn W j D 1; : : : ; knn g. Let IKjn be the indicator
random variable defined to be equal to 1 or 0, according to whether the clique Kjn is
or is not contained in the random graph Gn .p/. Then we can represent the random
variable Nn;p .kn /, denoting the number of cliques of size kn in the random graph
Gn .p/, as
.knn /
X
Nn;p .kn / D IKjn : (9.1)
j D1
Let P .Kjn / denote the probability that the clique Kjn is contained in Gn .p/; that is,
the probability that the edges of the clique Kjn are all contained in the random edge
set En .p/ of Gn .p/. Since each clique Kjn contains k2n edges, we have
kn
P .Kjn / D p . 2 / :
The expected value EIKjn of IKjn is given by EIKjn D P .Kjn /. Thus, the expected
value of Nn;p .kn / is given by
.knn / !
X n kn
ENn;p .kn / D EIKjn D p. 2 /: (9.2)
j D1
kn
.2/
lim P .Ln;p 2 log 1 n c log 1 n/ D 0: (9.3)
n!1 p p
We have
where the equality follows from the fact that a clique of size l contains sub-cliques
of size j for all j 2 Œl 1. Thus, to prove (9.3) it suffices to prove that
.2/
lim ENn;p .2 log 1 n cn log 1 n/ D 0; (9.4)
n!1 p p
where 0 cn c < 2, for all n. (We have written cn instead of c in (9.4) because
we need the argument of Nn;p to be an integer.) This approach to proving (9.3) is
known as the first moment method.
9.2 The Size of the Largest Clique 93
n.n 1/ .n kn C 1/
lim D 1;
n!1 nkn
or, equivalently,
n 1
kX
j
lim log.1 / D 0: (9.5)
n!1
j D1
n
Letting f .x/ D log.1 x/, and applying Taylor’s remainder theorem in the form
f .x/ D f .0/ C f 0 .x .x//x, for x > 0, where x .x/ 2 .0; x/, we have
1
0 log.1 x/ 2x; 0 x :
2
n 1
kX kXn 1
j j .kn 1/kn
0 log.1 /2 D :
j D1
n j D1
n n
1
Letting n ! 1 in the above equation, and using the assumption that kn D o.n 2 /,
we obtain (9.5).
.2/
We can now prove (9.4). Let kn D 2 log 1 n cn log 1 n, where 0 cn c < 2,
p p
for all n. Stirling’s formula gives
p
kn Š knkn e kn 2kn ; as n ! 1:
and thus
!
n kn
log 1 ENn;p .kn / D log 1 p. 2 /
p p kn
1 1 1
kn log 1 n kn2 C kn kn log 1 kn C kn log 1 e log 1 2kn ; as n ! 1:
p 2 2 p p 2 p
(9.6)
Note that
.2/
cn log 1 n
.2/
log 1 kn D log 1 .2 log 1 n cn log 1 n/ D log 1 .log 1 n/ 2 D
p
p p p p p p log 1 n
p
.2/
cn log 1 n
.2/ .2/
log 1 n C log 1 2 D log 1 n C O.1/; as n ! 1:
p
(9.7)
p p log 1 n p
p
1 .2/
kn log 1 n kn2 kn log 1 kn D .2 log 1 n cn log 1 n/ log 1 n
p 2 p p p p
.2/
.cn 2/.log 1 n/ log 1 n C O log 1 n : (9.8)
p p p
Since 12 kn C kn log 1 e 12 log 1 2kn D O.log 1 n/, it follows from (9.6), (9.8), and
p p p
the fact that 0 cn c < 2 that
.2/
lim log 1 ENn;p .2 log 1 n cn log 1 n/ D 1:
n!1 p p p
.2/
lim P .Ln;p 2 log 1 n c log 1 n/ D 1: (9.9)
n!1 p p
The analysis in the above paragraph shows that if cn c > 2, for all n, then
.2/
lim ENn;p .2 log 1 n cn log 1 n/ D 1: (9.10)
n!1 p p
The first moment method used above exploits the fact that (9.4) implies (9.3).
Now (9.10) does not imply (9.9). To prove (9.9), we employ the second moment
9.2 The Size of the Largest Clique 95
method. (This method was also used in Chap. 3 and Chap. 8.) The variance of
Nn;p .kn / is given by
2 2
Var Nn;p .kn / D E Nn;p .kn / ENn;p .kn / D ENn;p
2
.kn / ENn;p .kn / :
(9.11)
.2/
Our goal now is to show that if kn D 2 log 1 n cn log 1 n with cn c > 2, for all
p p
n, then
2
Var Nn;p .kn / D o ENn;p .kn / ; as n ! 1: (9.12)
Nn;p .kn /
lim P .j 1j < / D 1; for all > 0: (9.14)
n!1 ENn;p .kn /
In particular then, (9.9) follows from (9.15). Thus, the proof of the theorem will be
complete when we prove (9.12), or, in light of (9.11), when we prove that
2 2
2
ENn;p .kn / D ENn;p .kn / C o .ENn;p .kn / ; as n ! 1: (9.16)
We relabel the cliques fKjn W j D 1; : : : ; knn g, of size kn in Kn according to
the vertices that are contained in each clique. Thus, we write Kin1 ;i2 ;:::;ikn to denote
the clique whose vertices are i1 ; i2 ; : : : ; ikn . The representation for Nn;p .kn / in (9.1)
becomes
X
Nn;p .kn / D IKin ;i ;:::;i : (9.17)
1 2 kn
1i1 <i2 <<ikn n
Note that the random variable IKin ;i ;:::;i IKln ;l ;:::;l is equal to 1 if the edges of the
1 2 kn 1 2 kn
two cliques Kin1 ;i2 ;:::;ikn and Kln1 ;l2 ;:::;lk are all contained in Gn .p/ and is equal to 0
n
otherwise. Thus,
where P .Kin1 ;i2 ;:::;ikn [ Kln1 ;l2 ;:::;lk / is the probability that the edges of Kin1 ;i2 ;:::;ikn and
n
Kln1 ;l2 ;:::;lk are all contained in the random edge set En .p/ of Gn .p/. Consequently,
n
we have
X
ENn;p2
.kn / D EIKin ;i ;:::;i IKln ;l ;:::;l D
1 2 kn 1 2 kn
1i1 <i2 <<ikn n
1l1 <l2 <<lkn n
X
P .Kin1 ;i2 ;:::;ikn [ Kln1 ;l2 ;:::;lkn /: (9.18)
1i1 <i2 <<ikn n
1l1 <l2 <<lkn n
over all kn -tuples 1 l1 < l2 < < lkn n is independent of the particular
choice of kn -tuple i1 ; i2 ; : : : ; ikn . (The reader should
verify
this.) For convenience,
we select the kn -tuple 1; 2; : : : ; kn . Since there are knn different kn -tuples, we have
!
n X
2
ENn;p .kn / D n
P .K1;2;:::;kn
[ Kln1 ;l2 ;:::;lkn /: (9.19)
kn
1l1 <l2 <<lkn n
Let
n
denote the number of vertices shared by the cliques K1;2;:::;k and Kln1 ;l2 ;:::;lk . Each
kn n n
of these two cliques has 2 edges. Since the cliques share J vertices, the number
n
of edges in K1;2;:::;k [ Kln1 ;l2 ;:::;lk is equal to 2 k2n J2 , if J 2, and is equal to
n n
2 knn , if J D 0 or J D 1. Thus,
( kn J
p 2. 2 /. 2 / ; if J D J.l1 ; l2 ; : : : ; lkn / 2I
n
P .K1;2;:::;k [ Kln1 ;l2 :::;lkn / D kn (9.20)
p 2. 2 / ; if J D J.l1 ; l2 ; : : : ; lk / 1:
n
n
Keep in mind that our aim is to prove (9.16). We will do this by showing that
2
the first term on the right hand side of (9.21) is equal to o .ENn;p .kn / and
2
that
the second term on the right hand side of (9.21) is equal to ENn;p .kn / C
2
o .ENn;p .kn / .
In order to analyze the two terms on the right hand side of (9.21), we need to
count the number of kn -tuples l1 ; l2 ; : : : ; lkn for which J.l1 ; l2 ; : : : ; lkn / D j , for
j D 0; 1; : : : ; kn . Denote this number by #.j /. In order that J.l1 ; l2 ; : : : ; lkn / D j ,
we need to choose j of the vertices of l1 ; l2 ; : : : ; lkn from the set Œkn and the other
kn j vertices of l1 ; l2 ; : : : ; lkn from the set Œn Œkn . Thus,
! !
kn n kn
#.j / D ; j D 0; 1; : : : ; kn : (9.22)
j kn j
! ! ! h nkn i
n h n kn n kn i kn 2
nkn
kn
C k n kn 1
C kn p 2. 2 / D ENn;p .kn / n ;
kn kn kn 1 kn
(9.23)
kn
where (9.2) was used for the final equality. By Lemma 9.1, knn nkn Š , and applying
n .nkn /kn kn
Lemma 9.1 with n replaced by nkn , we have nk kn
kn Š D nkn Š .1 knn /kn
nk n 1 n kn 1
kn Š
, since kn D o.n 2 /. Of course then also nk
kn 1
.knn 1/Š . Thus,
nkn nkn nk n kn 1
kn
C kn kn 1 kn Š
C kn .knn 1/Š kn2
n D1C : (9.24)
nk n n
kn kn Š
From (9.23) and (9.24), we conclude that the secondterm on the right hand side
2 2
of (9.21) is equal to ENn;p .kn / C o .ENn;p .kn / .
Now we consider the first term on the right hand side of (9.21). Of course,
nkn kn j j kn
kn j
.knn j /Š and kjn kjnŠ . Also, by Lemma 9.1, knn nkn Š . Using these
2 2 kn
estimates and (9.22), and recalling from (9.2) that ENn;p .kn / D knn p 2. 2 / ,
we can estimate the first term on the right hand side of (9.21) by
98 9 The Largest Clique in a Random Graph and Applications
! ! k ! !
n X n X n
kn n kn 2.kn /.j /
2.k2n /.J2 /
p D p 2 2 D
kn kn j D2 j kn j
1l1 <l2 <<lkn n
J.l1 ;l2 ;:::;lkn /2
kn nkn
2 X
kn
k j j 2 X
kn
nkn j kn
j
j
p . 2 / ENn;p .kn / n p . 2 /
j
ENn;p .kn / n n
j D2 kn j D2
.kn j /Šj Š kn
2 Xkn
nkn j kn kn Š .j /
j
2 Xkn
kn
2j
j.j 1/
ENn;p .kn / p 2 EN
n;p .k n / p 2 :
j D2
.kn j /Šj Šn kn
j D2
j
n jŠ
(9.25)
j j
p
By Stirling’s formula, j Š j e 2j , as j ! 1, and thus there exists a
constant C > 0 such that
Using (9.26) for the first inequality below and (9.27) for the second inequality below,
for sufficiently large n the summation in the last term on the right hand side of (9.25)
can be estimated by
kn
ekn2 j
kn
X 1 X 1 X ekn j
kn 2j
kn j.j 1/
2
p 1
j D2
nj j Š C j D2 j np 2j
C j D2 np kn21
p p
1 X pekn j
1
1 n2 pekn
D ; if n WD < 1: (9.28)
C j D2 np 2 kn
C 1 n kn
np 2
.2/
Using the fact that kn D 2 log 1 n cn log 1 n with cn c > 2, we now show
p p
that
p kn
lim n D pe lim kn
D 0: (9.29)
n!1 n!1
np 2
kn kn .2/ kn
log 1 kn
D log 1 kn log 1 n log 1 p D log 1 n C O.1/ log 1 n C D
p
np 2
p p 2 p p p 2
.2/ cn .2/ cn .2/
log 1 n C O.1/ log 1 n C log 1 n log 1 n D .1 / log 1 n C O.1/;
p p p 2 p 2 p
as n ! 1:
10
3 6
9
5
2 7 8
1
4
Fig. 9.3 The graph from Fig. 9.2 of size n D 10 has been tampered with by adding to it the clique
of size kn D 3 formed by the vertices {3,6,10}
100 9 The Largest Clique in a Random Graph and Applications
In Exercise 9.1, the reader is asked to show that the distance DTV .;
/ can be
written in two other fashions:
1X
DTV .;
/ D max..A/
.A// D j.x/
.x/j: (9.31)
A 2 x2
It is easy to see that DTV .;
/ takes on values in Œ0; 1, vanishes if and only if
D
, and equals 1 if and only if and
are mutually singular. We recall that
two probability measures and
are called mutually singular if there exists a subset
A such that .A/ D
.A/ D 1 (and then of course .A/ D
.A/ D 0).
Consider now a -valued random variable X (defined on some probability space
.S; P /). The random variable X induces a probability measure X on , namely
for any subset A , we define X .A/ D P .X 2 A/. This probability measure is
called the distribution of X . Given two random variables X; Y taking values in ,
we define the total variation distance between them by
We now apply the above concepts to the random graph. The original random
graph Gn .p/ has as its edge set En .p/, whereas the tampered random graph
GntamIkn .p/ has the augmented edge set EntamIkn .p/. Each of the random variables
.2/
En .p/ and EntamIkn .p/ takes values in the space P.Œn.2/ / WD 2Œn , the set of all
.2/
subsets of Œn . (Given a set A, the set of all subsets of A is sometimes denoted by
2A ; it is known as the power set of A.) We define the tamper detection problem as
follows.
Definition.
i. If
lim DTV En .p/; EntamIkn .p/ D 0;
n!1
lim inf DTV .En .p/; EntamIkn .p// > 0 and lim sup DTV .En .p/; EntamIkn .p// < 1;
n!1 n!1
.2/
If kn 2 log 1 n c log 1 n; for some c > 2; then for all > 0;
p p
.k / (9.32)
Nn;pn
lim P .j .k /
1j < / D 1:
n!1 ENn;pn
This result was actually proved in the course of the proof of Theorem 9.1—it appears
as (9.14).
Let n denote the distribution of the random variable En .p/ and let nItam denote
the distribution of the random variable EntamIkn .p/. Let fKjn W j D 1; : : : ; knn g
102 9 The Largest Clique in a Random Graph and Applications
denote the knn cliques of size kn in the complete graph Kn . Recall that P.Œn.2/ /
denotes the set of subsets of Œn.2/ ; thus, a point ! 2 P.Œn.2/ / is a subset of
Œn.2/ , while a subset A P.Œn.2/ / is a collection of subsets of Œn.2/ . Denote by
Anj P.Œn.2/ / the subset of P.Œn.2/ / consisting of all those subsets of Œn.2/ which
contain all of the k2n edges of the clique Kj . Let An D [kj nD1 Anj P.Œn.2/ / denote
the set of all those subsets of Œn.2/ which possess at least one clique of size kn .
The tampered graph is obtained by choosing at random one of the knn cliques of
size kn in Kn and adding all of its edges to the original random edge set En .p/. That
is, one of the Kjn ; j D 1; : : : ; knn is chosen at random, and its edges are adjoined to
En .p/ to form EntamIkn .p/. Of course then, by construction, the tampered edge set
EntamIkn .p/ must possess a clique of size kn ; thus,
.2/
We first prove part (i) of the theorem. Let kn 2 log 1 n c log 1 n, for some
p p
c < 2. By Corollary 9.1 (or Theorem 9.1), the probability of there being at least one
clique of size kn in En .p/ converges to 0 as n ! 1; thus,
lim n .An / D 0:
n!1
Consequently,
DTV .En .p/; EntamIkn .p// D DTV .n ; nItam / D max jn .A/ nItam .A/j
AP.Œn.2/ /
.knn /
1 X
nItam .A/ D n n .AjAnj /; for A P.Œn.2/ /: (9.34)
kn j D1
.k /
Consequently, from the definition of Nn;pn and the definition of fAnj W j D
1; : : : ; knn g, we have
.knn /
X
n .f!g \ Anj / D n .f!g/Nn;p
.kn /
.!/; ! 2 P.Œn.2/ /: (9.35)
j D1
kn kn
Note that n .Anj / D p . 2 / , for all j . Recall from (9.2) that ENn;pn D knn p . 2 / .
.k /
.knn / .knn /
1 X 1 X n .f!g \ Anj /
nItam .f!g/ D n n .f!gjAj / D n
n
D
k j D1 k j D1
n .Anj /
n n
.k / .k /
n .f!g/Nn;pn .!/ Nn;pn .!/
n .kn / D .k /
n .f!g/: (9.36)
k
p 2 ENn;pn
n
Equation (9.36) shows that the probability measure nItam is the tilted probability
.k /
measure of n , tilted by the random variable Nn;pn .
For > 0, let
.k /
Nn;pn .!/
B n D f! 2 P.Œn.2/ / W j .k /
1j < g:
ENn;pn
.2/
Since kn 2 log 1 n c log 1 n, for some c > 2, it follows from the law of large
p p
numbers in (9.32) that
lim n .B n / D 1: (9.37)
n!1
X Nn;p
.kn /
.!/
jnItam .B n / n .B n /j D j n .f!g/ .k /
1 j<
!2B n ENn;pn
X
n .f!g/ D n .B n / ; (9.38)
!2B n
104 9 The Largest Clique in a Random Graph and Applications
where the first inequality follows from the definition of B n . From (9.37) and (9.38),
it follows that
Now let A P.Œn.2/ / be arbitrary. Note that (9.38) holds also with B n replaced
by A \ B n ; so jnItam .A \ B n / n .A \ B n /j < . Let .B n /c D P.Œn.2/ / B n
denote the complement of B n . Then we have
From (9.40) and the definition of the total variation distance, it follows that
DTV En .p/; EntamIkn .p/ D DTV n ; nItam D
max jn .A/ nItam .A/j < C n ..B n /c / C nItam ..B n /c /: (9.41)
AP.Œn.2/ /
From (9.37), (9.39), (9.41), and the fact that > 0 is arbitrary, we conclude that
lim DTV En .p/; EntamIkn .p/ D 0: (9.42)
n!1
Remark. The final two paragraphs of the proof can be replaced by a shorter argu-
ment using L2 -convergence and the Cauchy–Schwarz inequality. See Exercise 9.2.
Consider the complete graph Kn . For each edge in Kn , choose either blue or red,
and color the edge with that color. We call this a 2-coloring of Kn . For 2 k n,
one can ask whether there exists a monochromatic clique of size k, that is, a clique
with all of its edges blue or with all of its edges red. For k D 2, obviously there
exists such a monochromatic clique, for all n 2. The fundamental theorem of
Ramsey theory states the following:
For each integer k 3, there exists an integer R.k/ > k such that if n R.k/,
then every 2-coloring of Kn will necessarily have a monochromatic clique of size
k, while if k n < R.k/, then it is possible to find a 2-coloring of Kn with no
monochromatic clique of size k.
9.4 Ramsey Theory 105
Note that this result is purely deterministic—it says that no matter how we arrange
the coloring of Kn , there must be a monochromatic clique of size k, if n R.k/.
The exact computation of the Ramsey numbers R.k/ is notoriously hard. One has
R.3/ D 6 and R.4/ D 18, but the exact value of R.5/ is unknown! See Fig. 9.4.
Remark. It is known that 43 R.5/ 49. The complete graph K43 has 12 43
42 D 903 edges. There are 2903 different two-colorings of K43 and 43
5
D 962; 598
different cliques of size 5.
We will prove the above fundamental result by providing upper and lower bounds
on R.k/. A nice, elementary combinatorial argument yields the following result.
Theorem 9.3.
Remark. The above estimate is not far from the best known asymptotic upper bound
for R.k/. In particular, it is not known if R.k/ c k , for large k and some c < 4.
For the best known upper bound, see [12].
Proof. Let k 3. Consider an arbitrary coloring of the complete graph K4k1 of
size 4k1 D 22k2 . Define x1 D 1 and S0 D K4k1 . Since x1 shares an edge with
22k2 1 vertices, there must be a set of vertices S1 of size at least 22k3 such
that every edge from x1 to a vertex in S1 is the same color. This is the so-called
pigeonhole principle. Let x2 denote the vertex in S1 with the lowest number. By the
same reasoning, since x2 shares an edge with all the other vertices in S1 , of which
there are at least 22k3 1, there must be a set S2 S1 of size at least 22k4 such
that every edge from x2 to a vertex in S2 has the same color. Continuing like this,
we obtain a sequence x1 ; : : : ; x2k2 of vertices and a decreasing, nested sequence of
sets of vertices fSj g2k3
j D0 such that xj 2 Sj 1 , j 2 Œ2k 2. By the construction,
it follows that for each i , the color of the edge joining xi to xj is the same for all
˚ 2k3
j > i . Now look at the 2k 3 edges fxi ; xiC1 g iD1 . Obviously, we can choose
at least k 1 of these edges to be all the same color. Find such a set of edges and
denote the set of vertices in these edges by S . Note that jS j k. Because the color
106 9 The Largest Clique in a Random Graph and Applications
joining xi to xj is the same for all j > i , it follows in fact that the color of the edge
joining any two vertices in S is the same. We have thus exhibited a monochromatic
clique of size at least k.
Despite the fact that the Ramsey number R.k/ is a quantity associated with a
purely deterministic result, one can give a very short and ingenious probabilistic
proof of a lower bound for R.k/.
Theorem 9.4. R.k/ > k, for all k 3, and
1 k
R.k/ 1 C o.1/ k2 2 ; as k ! 1: (9.44)
e
p
Remark. The best known lower bound is just 2 times the above estimate; see [2].
Thus, a real chasm lies between the best known upper bound and the best known
lower bound!
Proof. Consider a random two-coloring of the graph Kn , where each edge is colored
red or blue with equal probability, and independently of what occurs at other edges.
Let W be a clique in Kn of size k, with 3 k n. Let IW be the indicator random
variable, which is equal to 1 if W is monochromatic, and equal to 0 otherwise.
Since there are k2 edges in W , the probability that W is all blue (or all red) is
k k
. 1 /.2/ ; consequently, the probability that W is monochromatic is 21.2/ . Of course,
2
k
also equal to 21.2/ .
the expected value EIW of IW isP
For 3 k n, let Xk D jW jDk IW . The random variable Xk counts the
number of monochromatic cliques of size k in Kn . We have
!
X n 1.k/
EXk D EIW D 2 2 :
k
jW jDk
k
In particular, choosing n D k C 1, one obtains R.k/ > k C 1 2.k C 1/2.2/ , and
it is easy to check that the right hand side is greater than or equal to k, for all k 3.
In Exercise 9.3 the reader is asked to show that
!
n 1.k/ 1 k
max n 2 2 D 1 C o.1/ k2 2 ; as k ! 1: (9.45)
kn<1 k e
Remark. The strategy used to prove Theorem 9.4 is known as the probabilistic
method. It was pioneered by P. Erdős. He used the method in a slightly different
p
way from above and obtained a lower bound on R.k/ with an extra factor of 2 in
the denominator on the right hand side of (9.44).
Exercise 9.1. Show that the total variation distance DTV .;
/ defined in (9.30)
satisfies (9.31).
Exercise 9.2. This exercise presents an alternative approach in place of the final
two paragraphs of the proof of part (ii) of Theorem q9.2. Recall that the Cauchy–
Pm Pm 2 Pm 2
Schwarz inequality states that j iD1 ai bi j . iD1 ai /. iD1 bi /, where
fai gm
iD1 ; fb gm
i iD1 are real numbers and m is a positive integer.
a. Use (9.36) and the Cauchy–Schwarz inequality to show that for any A
P.Œn.2/ /, one has
v
u
p u X Nn;p
.kn /
.!/ 2
jnItam .A/ n .A/j .A/ t .kn /
1 n .!/
!2A ENn;p
v
u X
u Nn;p
.kn /
.!/ 2
t 1 n .!/: (9.46)
.k /
!2P.Œn.2/ /
ENn;pn
b. The expression on the right hand side of (9.46) is called the L2 -norm with respect
.k /
Nn;pn .!/
to the measure n of the function .k / 1, which is defined on the domain
ENn;pn
.k /
Nn;pn
P.Œn.2/ /. We denote this norm by jj .k / 1jj2In . Use (9.16) (where the
ENn;pn
.k /
notation Nn;p .kn / instead of Nn;pn is used), which holds for kn as in part (ii)
of Theorem 9.2, to prove that
.k /
Nn;pn
lim jj .k /
1jj2In D 0: (9.47)
n!1 ENn;pn
k
21. 2 / x k
Exercise 9.3. Show that (9.45) holds. (Hint: Let f1;k .x/ D x kŠ
and
1.k2 /
2 .xk/k
f2;k .x/ D x . Show that maxkx<1 f1;k .x/ maxkx<1 f2;k .x/
kŠ
.nk/k k
as k ! 1. Since kn nkŠ , it then follows that maxkn<1 n
n 1.k/ kŠ
k
2 2 maxkx<1 f1;k .x/, as k ! 1. To obtain the asymptotic behavior
of maxkx<1 f1;k .x/, you will need Stirling’s formula.)
Exercise 9.4. Figure 9.4 shows that the Ramsey number R.3/ satisfies R.3/ 5.
Prove that R.3/ D 6.
Chapter Notes
For a wide scope of results concerning graphs, deterministic and random, see
Bollobás’ books [9] and [10].
For a paper that considers tampering detection, see [29]. In particular, one
finds there two examples that show that the intuition for Theorem 9.2, discussed
in the remark following the theorem, can fail. It should be noted that the word
“detection” must be understood here in a very theoretical way, as there are no known
algorithms for detecting this clique in a reasonable amount of time, namely an
amount of time which grows no more than polynomially in the number of vertices n.
The construction of such algorithms is known in the theoretical computer science
literature as the “planted clique” problem. See, for example, the paper of Alon et al.
1
[3], where for p D 12 it is shown that a planted clique of order n 2 can be detected in
polynomial time. (This order for the clique is of course far, far larger than the order
log n for the cliques discussed in this chapter.)
The proof of the existence of the Ramsey number R.k/ goes back to F. Ramsey
in 1930. The nice little book by Alon and Spencer [2] is devoted entirely to the
probabilistic method in combinatorics. The book by Graham et al. [22] is devoted
entirely to Ramsey theory.
Chapter 10
The Phase Transition Concerning the Giant
Component in a Sparse Random Graph:
A Theorem of Erdős and Rényi
Let Gn .pn / D .Œn; En .pn // denote the Erdős–Rényi graph of size n which was
introduced in Chap. 9. As in Chap. 9, the generic notation P for probability and
E for expectation will be used in this chapter. Note that whereas in Chap. 9 the
edge probability p was fixed independent of the graph size, in this chapter the
edge probability pn will vary with n. A subset A Œn of the vertex set Œn is
called connected if for every x; y 2 A, there exists a path between x and y along
edges in En .pn /. The vertex set Œn is of course equal to the disjoint union of its
lg
connected components. Let Cn be the random variable denoting the size of the
largest connected component in the random graph Gn .pn /. It turns out that the
size of the largest connected component undergoes a striking phase transition as
the edge probability passes from nc with c < 1 to nc with c > 1. In this chapter we
will prove the following two theorems.
Theorem 10.1. Let pn D nc , with c < 1. Then there exists a D .c/ such that the
lg
size Cn of the largest connected component of Gn .pn / satisfies
Theorem 10.2. Let pn D nc , with c > 1. Then there exists a unique solution ˇ D
ˇ.c/ 2 .0; 1/ to the equation 1 e cx x D 0. For any > 0, the size Cn of the
lg
R.G. Pinsky, Problems from the Discrete to the Continuous, Universitext, 109
DOI 10.1007/978-3-319-07965-3__10, © Springer International Publishing Switzerland 2014
110 10 Giant Component in a Sparse Random Graph
extinction. This will be used for one part of the proof of Theorem 10.2, which
is presented in Sect. 10.6. The proof of Theorem 10.2 requires considerably more
technical work over and above that which is required for the proof of Theorem 10.1.
Let x 2 Œn be a vertex of the random graph. All the random quantities that we define
below depend on x and n, but we suppress this dependence in the notation. We
construct an algorithm that produces the connected component to which x belongs.
We begin by calling x “alive” and calling all of the other vertices in Œn “neutral.”
We define Y0 D 1, to indicate that at the beginning there is one vertex that is alive.
Each of the neutral vertices y is now observed. If there is an edge connecting x to y,
that is, if fx; yg 2 En .pn /, then y is declared alive; if not, then y remains neutral.
After every such y has been checked, we declare x to be “dead.” We define Y1 to
be the new number of vertices that are alive. We also say that at time t D 1 there
is one dead vertex. This ends the first step of the algorithm. We continue like this.
If at the end of step t there are Yt > 0 vertices that are alive (and t dead vertices),
we begin step t C 1 by selecting one of the alive vertices (it doesn’t matter which
one) and call it z. Each of the currently neutral vertices y is now observed. If there is
an edge connecting z and y, then y is declared alive; if not, then y remains neutral.
After every such y has been checked, we declare z to be “dead.” We define YtC1 to
be the new number of vertices that are alive, and we say that at time t C 1 there are
t C 1 dead vertices. The process stops at the end of the step T for which YT D 0.
It follows that at the end of step T , there are T dead vertices. A little thought shows
that these dead vertices form the connected component to which x belongs. Thus
T is the size of the connected component to which x belongs. (The reader should
verify this.) See Fig. 10.1. Of course, T is a random variable since it depends on the
random edge configuration En .pn /.
For 1 t T , define Zt to be the number of neutral vertices that are declared
alive at step t . Then from the description of the algorithm, we have for 1 t T ,
Yt D Yt1 C Zt 1: (10.3)
Assuming that t T , at the end of step t 1, there are t 1 dead vertices and
Yt1 > 0 alive vertices. Thus there are n t Yt1 C 1 neutral vertices. A key
feature of the above algorithm is that no pair of vertices is ever checked twice.
Consequently, for every pair of vertices that is checked, the probability of there
being an edge between them is equal to pn , independently of what occurred when
checking other pairs of vertices. Thus, since Zt counts how many of the n t
Yt1 C 1 neutral vertices have a common edge with the alive vertex z that has been
selected for implementing step t , and since the probability of there being an edge
112 10 Giant Component in a Sparse Random Graph
x =1
y = 2 : neutral → alive
y = 3 : neutral → neutral
y = 4 : neutral → alive
y = 5 : neutral → neutral
y = 6 : neutral → neutral
x = 1 is declared dead 5
Take alive vertex z = 2 2
y = 3 : neutral → neutral 1
y = 5 : neutral → alive
y = 6 : neutral → neutral 4
x = 2 is declared dead
3
Take alive vertex z = 4
y = 3 : neutral → neutral
y = 6 : neutral → neutral
z = 4 is declared dead
6
Take alive vertex z = 5
y = 3 : neutral → neutral
y = 6 : neutral → neutral
z = 5 is declared dead
There are no more alive vertices, so algorithm ends
Dead sites : {1, 2, 4, 5} = the connected component containing x = 1
Of course, Yt1 , which appears in the size parameter of the binomial distribution
above, is itself a random variable. The meaning of (10.4) is that conditioned on
knowing that Yt1 D y, then Zt Bin.nt y C1; pn /. Since no pair of vertices is
ever checked twice, and since from (10.3), Yt1 only depends on fZs gt1
sD1 , it follows
that given the value of Yt1 , and given that T t , the random variable Zt and the
random variables fZs gt1
sD1 are conditionally independent; that is, for all m 1 and
all t 2,
As noted, (10.3) and (10.4) hold only up to time T ; however it will be convenient
to define Yt and Zt recursively from (10.3) and (10.4) for all integers 0 t n.
(Thus, e.g., if T D t0 , then we have Yt0 D 0 (as well as Zt0 D 0), and thus
Zt0 C1 Bin.n t0 ; pn / and Yt0 C1 D Zt0 C1 1.) In particular, for t > T , Yt
can take on negative values. For 1 t T , note that the number Nt of neutral
vertices at the end of step t is given by Nt D n t Yt . We use this equation to
define N0 , namely, N0 D n1, indicating that there are n1 neutral vertices before
the first step begins. We now use this equation to extend Nt also to all 0 t n.
We have the following key lemma.
10.3 Large Deviations 113
Lemma 10.1.
Yt 1 C t Bin.n 1; 1 .1 pn /t /; t 0:
Nt Bin.n 1; .1 pn /t /: (10.6)
We prove (10.6) by induction. Clearly (10.6) holds for t D 0. Now assume that for
some t 1,
Nt Bin.Nt1 ; 1 pn /: (10.9)
By the inductive hypothesis (10.7), Nt1 is the number of heads in n1 independent
coin flips, where on each flip the probability of heads is .1 pn /t1 . Then (10.9)
states that Nt is the number of “successes” in n 1 independent trials, where each
trial consists of first tossing a coin with probability .1 pn /t1 of heads and then
tossing a second coin with probability 1 pn of heads, and a “success” is defined as
obtaining heads on both flips. This description of Nt is the description of a random
variable distributed according to Bin.n 1; .1 pn /t /. For an alternative derivation
that (10.9) and (10.7) imply (10.6), using generating functions, see Exercise 10.4.
We present two propositions which are known as large deviations estimates. The
first proposition will be used in the proof of Theorem 10.1 and the second one will
be used in the proof of Theorem 10.2.
Proposition 10.1. Let c 2 .0; 1/. For n 2 ZC and t > 0 with tcn 1, let Sn;t
Bin.n; tcn /. Then there exists a D .c/ > 0, independent of n and t , such that
P .Sn;t t / e t :
114 10 Giant Component in a Sparse Random Graph
X
n Y
n
E exp.Sn;t / D E exp. Bj / D E exp.Bj / D .E exp.B1 //n D
j D1 j D1
tc tc
.1 C e /n : (10.11)
n n
any > 0
P .Sn;t t / e t e tc.e
1/
D exp . ce C c/t : (10.12)
0 1 0
.0 ; / D 0 log C .1 0 / log ; 0 < ; 0 < 1:
1
Remark. The function .0 ; / is a relative entropy. For more about this, see the
notes at the end of the chapter.
Proof. The following three facts show that (ii) follows from (i): SOn WD n Sn is
distributed according to the distribution Bin.n; 1 /, P .Sn 0 n/ D P .SOn
.1 0 /n/ and .1 0 ; 1 / D .0 ; /. So it suffices to show that (i) holds and
that .0 ; / > 0, if ¤ 0 .
Let 0 > . For any > 0, we have
X
n Y
n
E exp.Sn / D E exp. Bj / D E exp.Bj / D .E exp.B1 //n D
j D1 j D1
In this section, and also in Sect. 10.6, we will use tacitly the following facts, which
are left to the reader in Exercise 10.6:
116 10 Giant Component in a Sparse Random Graph
(The inequality above is not an equality because we have continued the definition
of Yt past the time T .) Let YNt be a random variable distributed according to the
distribution Bin.n 1; tcn /. By Taylor’s remainder formula, .1 x/t 1 tx,
for x 0 and t a positive integer. Thus, tcn 1 .1 nc /t , and consequently
P .YOt t / P .YNt t /. Thus, we have
We have proven that the probability that the connected component containing x is
larger than log n is no greater than n . There are n vertices in Gn .pn /; thus the
probability that at least one of them is in a connected component larger than log n
is certainly no larger than nn D n1 ! 0 as n ! 1. This completes the
proof of Theorem 10.1.
1
P1 process in discrete time. Let fq1
We define a random population n gnD0 be a nonneg-
ative sequence satisfying nD0 qn D 1. We will refer to fqn gnD0 as the offspring
distribution of the process. Consider an initial particle alive at time t D 0 and set
10.5 Galton–Watson Branching Process 117
denote the mean number of offspring of a particle. It is easy to show that EXtC1 D
EXt (Exercise 10.7), from which it follows that EXt D t , t 0. From this, it
follows that if < 1, then limt!1 EXt D 0. Since EXt P .Xt 1/, it follows
that limt!1 P .Xt 1/ D 0, which means that the process has probability 1 of
extinction. The fact that EXt is growing exponentially in t when > 1 would
suggest, but not prove, that for > 1 the probability of extinction is less than 1. In
fact, we can use the method of generating functions to prove the following result.
Define
1
X
.s/ D qn s n ; s 2 Œ0; 1: (10.19)
nD0
118 10 Giant Component in a Sparse Random Graph
The function .s/ is the probability generating function for the distribution fqn g1
nD0 .
In particular then, since q0 C q1 < 1, is a strictly convex function on Œ0; 1, and
consequently, so is .s/ WD .s/ s. We have .0/ D q0 > 0 and .1/ D 0.
Also, lims!1 0 .s/ D lims!1 0 .s/ 1 D 1. Since is strictly convex, it
follows that if 1, then 0 .s/ < 0 for s 2 Œ0; 1/, and consequently .s/ > 0,
for s 2 Œ0; 1/. However, if > 1, then 0 .s/ > 0 for s < 1 and sufficiently
close to 1. Using this along with the strict convexity and the fact that .0/ > 0
and .1/ D 0, it follows that there exists a unique ˛ 2 .0; 1/ such that .˛/ D 0
and that .s/ > 0, for s 2 .0; ˛/, and .s/ < 0, for s 2 .˛; 1/. (The reader should
verify this.) We have thus shown that
Now let t WD P .Xt D 0/ denote the probability that extinction has occurred by
time t . Of course, 0 D 0. We claim that
To prove this, first note that when t D 1, (10.21) says that 1 D .0/ D q0 , which
is of course true. Now consider t > 1. We first calculate P .Xt D 0jX1 D n/, the
probability that Xt D 0, conditioned on X1 D n. By the conditioning, at time t D 1,
there are n particles, and each of these particles will contribute independently to the
10.5 Galton–Watson Branching Process 119
We assume that pn D nc , with c > 1. From the analysis in Sect. 10.2, we have seen
that for x 2 Œn, the size of the connected component of Gn .pn / containing x is
given by T D minft 0 W Yt D 0g.
Consider a Galton–Watson branching process fXt g1 tD0 in the alternative form
described at the end of Sect. 10.5, and let the offspring distribution be the Poisson
m
distribution with parameter c; that is, qm D e c cmŠ . The probability generating
function of this distribution is given by
1
X 1
X cm m
.s/ D qm s m D e c s D e c.s1/ : (10.23)
mD0 mD0
mŠ
where ˛ 2 .0; 1/ is the unique solution s 2 .0; 1/ to the equation .s/ D s, that is,
to the equation e c.s1/ D s. Substituting z D 1 s in this equation, this becomes
If Text < 1, then of course Xt D 0 for all t Text . For any fixed t 1, as soon as
one knows the values of fWs gtsD1 , one knows the values of fXs gtsD1 . (We note that
it might happen that these values of fWs gtsD1 result in Xs0 D 0 for some s0 < t ,
in which case the values of fWs gtsDs0 C1 are superfluous for determining the values
of fXs gtsD1 .) If rN WD frs gtsD1 are the values obtained for fWs gtsD1 , let lN WD fls gtsD1
denote the corresponding values for fXs gtsD1 . We write lN D l. N r/.
N Note that Text t
occurs if and only if ls > 0, for 0 s t 1, or, equivalently, if and only if
lt1 > 0.
Now consider the process fYt g1 tD0 introduced in Sect. 10.2. Recall that T is
equal to the smallest t for which Yt D 0. Note from (10.3) and (10.26) that
fYt gTtD0 is defined recursively in a way very similar to the way fXt gTtD0 ext
is defined.
The difference is that the independent sequence of random variables fWt g1 tD1
10.6 Proof of Theorem 10.2 121
Y
t
P .fWs gtsD1 D r/
N D P .Ws D rs /: (10.27)
sD1
Y
t
P .fZs gtsD1 D r/
N D P .Zs D rs jYs1 D ls1 /; (10.28)
sD1
Thus, we conclude from (10.27) and (10.28) that for any fixed t ,
N D P .fXs gt D l/;
lim P .fYs gtsD1 D l/ N for all lN D fls gt satisfying lt1 > 0:
n!1 sD1 sD1
(10.29)
Since Text , the extinction time for fXt g1
tD0 , is the smallest t for which Xt D 0, and
Text t is equivalent to lt1 > 0, and since T is the smallest t for which Yt D 0, it
follows from (10.29) that
From (10.24), we have limt!1 P .Text t / D ˛; thus, for any > 0, there exists an
integer such that P .Text t / 2 .˛ 2 ; ˛/, if t . It then follows from (10.30)
that there exists an n1; D n1; .t / such that
(Recall that Yt has also been defined recursively for t > T and can take on negative
values for such t .) Thus, letting YOt be the random variable distributed according to
the distribution Bin.n 1; 1 .1 nc /t /, it follows from Lemma 10.1 that
X
P . n T .1 ˛ /n/ P .YOt t 1/: (10.33)
nt.1˛ /n
One has limn!1 .1 nc /bn D e cb , uniformly over b in a bounded set. (The
reader should verify this by taking the logarithm of .1 nc /bn and applying Taylor’s
ct
formula.) Applying this with b D nt , with 0 t n, it follows that .1 nc /t e n
is small for large n, uniformly over t 2 Œ0; n. Thus, for ı D ı. /, which has been
ct
defined above, there exists an n2;ı D n2;ı. / such that 1 .1 nc /t 1 e n ı,
for n n2;ı and 0 t n. Let YNt be a random variable distributed according to
the distribution Bin.n 1; 1 e n ı/. Then P .YOt t 1/ P .YNt t 1/, if
ct
Every t in the summation on the right hand side of (10.34) is of the form t D bn n,
ct
with bn 1 ˛ . Thus, it follows from (10.32) that 1 e n ı D
cbn
1e ı bn C ı. We now apply part (ii) of Proposition 10.2 with n 1 in
place of n, with D 1e cbn ı, and with 0 D bn . Note that and 0 are bounded
from 0 and from 1 as n varies and as t varies over the above range. Also, we have
> 0 C ı. Consequently, there exists a constant > 0 such that .0 ; / , for
all ; 0 as above. Thus, we have for n n2;ı ,
10.6 Proof of Theorem 10.2 123
for some n3;ı D n3;ı. / . This is left to the reader as Exercise 10.8.
We now analyze the probability P .t < T < n/, for fixed t . As in (10.33), we
have
X
P .t < T < n/ P .YOs s 1/; (10.38)
t<s< n
As in the proofs of Propositions 10.1 and 10.2, we have for any > 0
Q
P .YQs n s/ e .ns/ Ee Ys : (10.40)
P
We can represent the random variable YQs as YQs D n1
j D1 Bj , where the fBj gj D1 are
n1
Q
Y
n1
c c n1
Ee Ys D Ee Bj D .1 /s /e C 1 .1 /s : (10.41)
j D1
n n
We will show that for an appropriate choice of > 0, the expression in the square
brackets above is negative and bounded away from 0 for all s 1 and sufficiently
large M . Let
c s
fs;M ./ WD .M 1/ C M log .1 / .e 1/ C 1 :
Ms
0
M.1 Mc s /s e
fs;M ./ D .M 1/ C : (10.44)
.1 Mc s /s .e 1/ C 1
For any fixed , defining g.y/ D y.eye 1/C1
, for y > 0, it is easy to check that
g 0 .y/ > 0; therefore, g is increasing. The last term on the right hand side of (10.44)
is M g.y/, with y D .1 Mc s /s . Since 1 x e x , for x 0, we have
c
.1 Mc s /s e M , if n D M s c, and thus the last term on the right hand
c
side of (10.44) is bounded from above by M g.e M /, independent of s, for s Mc .
Thus, from (10.44), we have
c
0 M e M e M e
fs;M ./ .M 1/ C c
M
D M C 1 C c D
e .e 1/ C 1 e 1 C e M
c
1 eM c
1CM c ; for all s :
e 1Ce M M
c
Since limM !1 M 1e M c D ce , uniformly over 2 Œ0; 1, and since c > 1,
e 1Ce M
it follows that there exists a 0 > 0 and an M0 such that if 2 Œ0; 0 and M M0 ,
0
then fs;M ./ 1c2
, for all s 1. It then follows that fs;M .0 / 0 .c1/
2
, for all
M M0 and s 1. Choosing D 0 in (10.43) and using this last inequality for
fs;M .0 /, we conclude that
0 .c1/
P .YQs n s/ e 2 s
; for n M0 s; s 1: (10.45)
10.6 Proof of Theorem 10.2 125
X 1
X
0 .c1/
0 .c1/
0 .c1/ e 2 t
P .t < T < n/ e 2 s
< e 2 s
D 0 .c1/
;
t<s< n sDt 1 e 2
1
if : (10.46)
M0
Now (10.31), (10.36), (10.37), and (10.46) guarantee that for any 2 .0; 1/, we can
choose t and n such that for all n n , one has
˛ P .T t / ˛ C I
P T > t ; T 62 ..1 ˛ /n; .1 ˛ C /n/ I
(10.47)
1 ˛ 2 P T 2 ..1 ˛ /n; .1 ˛ C /n/ 1 ˛ C ;
for all n n :
(The third set of inequalities above is a consequence of the first two sets of
inequalities.)
We recall that the above estimates have been obtained when p D nc , with
c > 1, and where 1 ˛ D 1 ˛.c/ is the unique root z 2 .0; 1/ of the equation
1 e cz z D 0. The reader can check that the above estimates hold uniformly for
c 2 Œc1 ; c2 , for any 1 < c1 < c2 . Thus, consider as before a fixed c > 1 and
˛ D ˛.c/, and let ı > 0 satisfy c ı > 1. For c 0 2 Œc ı; c, let ˛ 0 WD ˛.c 0 /. Then
for all > 0, there exists a t > 0 and a n > 0 such that for all n n and all
0
c 0 2 Œc ı; c, one has for the graph G.n; cn /,
˛ 0 P .T t / ˛ 0 C I
P T > t ; T 62 ..1 ˛ 0 /n; .1 ˛ 0 C /n/ I
(10.48)
1 ˛ 0 2 P T 2 ..1 ˛ 0 /n; .1 ˛ 0 C /n/ 1 ˛ 0 C ;
for all n n :
Return now to our graph G.n; nc /, with n considerably larger than the n
in (10.48). (We will quantify “considerably larger” a bit later on.) Recall that we
started out by choosing arbitrarily some vertex x in the graph G.n; nc /, and then
applied our algorithm, obtaining T , which is the size of the connected component
containing x. Call this the first step in a “game.” If it results in T t , say that a
“draw” occurred on the first step. If it results in .1˛ /n < T < .1˛ C /n, say
that a “win” occurred on the first step. Otherwise, say that a “loss” occurred on the
first step. If a win or a loss occurs on this first step, we stop the procedure and say
that the game ended in a win or loss, respectively. If a draw occurs, then consider the
remaining n T vertices that are not in the connected component containing x, and
126 10 Giant Component in a Sparse Random Graph
consider the corresponding edges. This gives a graph of size n0 D n T . Note that
by the definition of the algorithm, there is no pair of points in this new graph that has
already been checked by the algorithm. Therefore, the conditional edge probabilities
for this new graph, conditioned on having implemented the algorithm, are as before,
namely nc , independently for each edge. This edge probability can be written as
0
pn0 D nc 0 , where c 0 D nTn
c. Now T t . Thus, if n n is sufficiently large,
then c 0 2 Œc ı; c and n0 D n T n , so the estimates (10.48) (with n replaced
by n0 ) will hold for this new graph, which has n0 vertices and edge probabilities
0
pn0 D nc 0 . Choose an arbitrary vertex x1 from this new graph and repeat the above
algorithm on the new graph. Let T1 denote the random variable T for this second
step. If a win or a loss occurs on the second step of the game, then we stop the game
and say that the game ended in a win or a loss, respectively. (Of course, here we
define win, loss, and draw in terms of T1 ; n0 , and ˛ 0 instead of T; n, and ˛. However,
the same t is used.) If a draw occurs on this second step, then we consider the
n0 T1 D n T T1 vertices that are neither in the connected component of x
nor of x1 . We continue like this for a maximum of M steps, where M is chosen
M
sufficiently large to satisfy ˛.c ı/ C < . (We work with > 0 sufficiently
small so that ˛.c ı/ C < 1.) The reason for this choice of M will become
clear below. If after M steps, a win or a loss has not occurred, then we declare
that the game has ended in a draw. Note that the smallest possible graph size that
can ever be used in this game is n t .M 1/. The smallest modified value of c
1/
that can ever be used is nt .M n
c. We can now quantify what we meant when
we said at the outset of this paragraph that we are choosing n “considerably larger”
than n . We choose n sufficiently large so that n t .M 1/ n and so that
nt .M 1/
n
c c ı. Thus, the estimates in (10.48) are valid for all of the steps of
the game.
It is easy to check that ˛ D ˛.c/ is decreasing for c > 1. Thus, if the game ends
in a win, then there is a connected component of size between .1 ˛.c ı/ /n
and .1 ˛.c/ C /n. What is the probability that the game ends in a win? Let W
denote the event that the game ends in a win, let D denote the event that it ends in a
draw, and let L denote the event that it ends in a loss. We have
The game ends in a draw if there was a draw on M consecutive steps. Since on any
given step the probability of a draw is no greater than ˛.c ı/ C , the probability
of obtaining M consecutive draws is no greater than
M
˛.c ı/ C ; so by the choice of M , we have
M
P .D/ ˛.c ı/ C < : (10.50)
If one played a game with three possible outcomes on each step—win, loss, or
draw—with respective nonzero probabilities p 0 , q 0 , and r 0 , and the outcomes of all
the steps were independent of one another, and one continued to play step after step
0
until either a win or a loss occurred, then the probability of a win would be p0pCq 0
0
and the probability of a loss would be p0qCq 0 (Exercise 10.9). Conditioned on D c ,
our game essentially reduces to this game. However, the probabilities of win and
loss and draw are not exactly fixed, but can vary a little according to (10.48). Thus,
we can conclude that
P .LjD c / D : (10.52)
1 ˛.c ı/ 2 C 1 ˛.c ı/
In conclusion, we have demonstrated the following. Consider any c > 1 and any
ı > 0 such that c ı > 1. Then for each sufficiently small > 0 and sufficiently
large n depending on , with probability at least 1 1˛.cı/
there will exist
a connected component of G.n; n / of size between .1 ˛.c ı/ /n and .1
c
˛.c/ C /n. If the connected component above, which has been shown to exist
with probability close to 1 and which is of size around .1 ˛/n, is in fact with
probability close to 1 the largest connected component, then the above estimates
prove (10.1), since by (10.25) the ˇ defined in the statement of the theorem is in
fact 1 ˛. Thus, to complete the proof of (10.1) and (10.2), it suffices to prove
that with probability approaching 1 as n ! 1, every other component of G.n; nc /
is of size O.log n/, as n ! 1. In fact, we will prove here the weaker result that
with probability approaching 1 as n ! 1, every other component is of size o.n/
as n ! 1. In Exercise 10.10, the reader is guided through a proof that every other
component is of size O.log n/.
To prove that every other component is of size o.n/ with probability approaching
1 as n ! 1, assume to the contrary. Then for an unbounded sequence of n’s,
the following holds. As above, with probability at least 1 1˛.cı/
, there
will exist a connected component of G.n; n / of size between .1 ˛.c ı/ /n
c
and .1 ˛.c/ C /n, and by our assumption, for some > 0, with probability
at least , there will be another connected component of size at least n. We may
take < 1 ˛.c ı/ . But if this were true, then at the first step of our
algorithm, when we randomly selected a vertex x, the probability that it would be
in a connected component of size at least n would be at least
.1 ˛.c ı/ /n n
1 C :
1 ˛.c ı/ n n
128 10 Giant Component in a Sparse Random Graph
2
For and ı sufficiently small, this number will be larger than 1 ˛.c/ C 2
, in
2
which case the algorithm would have to give P .T t / < ˛.c/ 2
. However, for
> 0 sufficiently small, this contradicts the first line of (10.47).
Exercise 10.1. This exercise refers to Remark 3 after Theorem 10.2. Prove that for
1
any > 0 and large n, the number of edges of Gn . nc / is equal to 12 cn C O.n 2 C /
with high probability. Show directly that ˇ.c/ 2 , for 1 < c < 2, where ˇ.c/ is as
c
in Theorem 10.2.
Exercise 10.2. Let Dn denote the number of disconnected vertices in the Erdős–
Rényi graph Gn .pn /. For this exercise, it will be convenient to represent Dn as a sum
of indicator random variables. Let Dn;iPbe equal to 1 if the vertex i is disconnected
and equal to 0 otherwise. Then Dn D niD1 Dn;i .
(a) Calculate EDn . P P
(b) Calculate EDn2 . (Hint: Write EDn2 D E. niD1 Dn;i /. nj D1 Dn;j /.)
Exercise 10.3. In this exercise, you are guided through a proof of the result noted
in Remark 2 after Theorem 10.2, namely that:
if pn D log nCc
n
n
, then as n!1, the probability that the Erdős–Rényi graph Gn .pn /
possesses at least one disconnected vertex approaches 0 if limn!1 cn D 1, while
for any M , the probability that it possesses at least M disconnected vertices
approaches 1 if limn!1 cn D 1.
Let Dn be as in Exercise 10.2, with pn D log nCc
n
n
.
(a) Use Exercise 10.2(a) to show that limn!1 EDn equals 0 if limn!1 cn D 1
and equals 1 if limn!1 cn D 1. (Hint: Consider log EDn and note that by
2
Taylor’s remainder theorem, log.1 x/ D x .1x1 /2 x2 , for 0 < x < 1,
where x D x .x/ satisfies 0 < x < x.)
(b) Use (a) to show that if limn!1 cn D 1, then limn!1 P .Dn D 0/ D 1.
(c) Use Exercise 10.2(b) to calculate EDn2 .
(d) Show
that if limn!1 cn D 1, then the variance
2 .Dn / satisfies
2 .Dn / D
o .EDn /2 . (Hint: Recall that
2 .Dn / D EDn2 .EDn /2 .)
(e) Use Chebyshev’s inequality with (a) and (d) to conclude that if limn!1 cn D
1, then for any M , limn!1 P .Dn M / D 1.
Exercise 10.4. Recall from Chap. 5 that the probability generating function PX .s/
of a nonnegative random variable X taking integral values is defined by
1
X
PX .s/ D Es X D s i P .X D i /:
iD0
X
n
PY .s/ D Es Y D E.s Y jZ D m/P .Z D m/;
mD0
and conclude that Y Bin.n; pp 0 /. Conclude from this that (10.7) and (10.9)
imply (10.6).
Exercise 10.5. Let f ./ D e 0 .e C 1 /, with 0 < < 0 < 1. Show that
1 10 0
inf0 f ./ is attained at some 0 > 0 and that f .0 / D 1 0 0
2 .0; 1/.
Pn
Exercise 10.6. If X Bin.n; p/, then X can be represented as X D iD1 Bi ,
where fBi gniD1 are independent and identically distributed random variables dis-
tributed according to the Bernoulli distribution with parameter p; that is, P .Bi D
1/ D 1 P .Bi D 0/ D p.
(a) Use the above representation to prove that
and that
(Hint: For (10.54), represent X1 using the random variables fBi gniD1 1
and
represent X2 using the first n2 of these very same random variables. For (10.55),
let fUi gniD1 be independent and identically distributed random variables, dis-
tributed according to the uniform distribution on Œ0; 1; that is, P .a Ui
.1/
b/ D b a, for 0 a < b 1. Define random variables fBi gniD1 and
.2/ n
fBi giD1 by the formulas
( (
.1/ 1; if Ui p1 I .2/ 1; if Ui p2 I
Bi D Bi D
0; if Ui > p1 ; 0; if Ui > p2 :
.1/ .2/
Now represent X1 and X2 through fBi gniD1 and fBi gniD1 , respectively. This
method is called coupling.)
(b) Prove (10.54) and (10.55) directly P fact that if X Bin.n; p/, then for
from the
0 k n, one has P .X k/ D nj Dk jn p j .1 p/nj .
130 10 Giant Component in a Sparse Random Graph
Chapter Notes
The context in which Theorems 10.1 and 10.2 were originally proven by Erdős
and Rényi
in 1960 [18] is a little different from the context presented here. Let
N WD n2 . Define G.n; M /, 0 M N , to be the random graph with n vertices
and exactly M edges, where the M edges are selected uniformly at random from
the N possible edges. One can consider an evolving random graph fG.n; t /gN tD0 . By
definition, G.n; 0/ is the graph on n vertices with no edges. Then sequentially, given
G.n; t /, for 0 t N 1, one obtains the graph G.n; t C1/ by choosing at random
from the complete graph Kn one of the edges that is not in G.n; t / and adjoining
it to G.n; t /. Erdős and Rényi looked at evolving graphs of the form G.n; tn /, with
tn D Œ cn2
. They showed that if c < 1, then with probability approaching 1 as
n ! 1, the largest component of G.n; tn / is of size O.log n/, while if c > 1,
then with probability approaching 1 as n ! 1 there is one component of size
approximately ˇ.c/ n, and all other components are of size O.log n/. To see how
10.6 Proof of Theorem 10.2 131
this connects up to the version given in this chapter, note that the expected number
of edges in the graph Gn . nc / is nc n2 D c.n1/ 2
. A detailed study of the borderline
case, when tn n2 as n ! 1, was undertaken by Bollobás [8]. Our proofs of
Theorems 10.1 and 10.2 are along the lines of the method sketched briefly in the
book of Alon and Spencer [2]. We are not aware in the literature of a complete
proof of Theorems 10.1 and 10.2 with all the details.
The large deviations bound in Proposition 10.2 is actually tight. That is, in part
(i), where 0 > , for any > 0, one has for sufficiently large n, P .Sn
0 n/ e ..0 ;/C /n . Thus, in particular, limn!1 n1 log P .Sn 0 n/ D .0 ; /.
Similarly, in part (ii), where 0 < , limn!1 n1 log P .Sn 0 n/ D .0 ; /.
Consider two measures, P and 0 , defined on a finite or countably infinite set A.
Then H.0 I / WD x2A 0 .x/ log .x/ 0 .x/
is called the relative entropy of 0 with
respect to . It plays a fundamental role in the theory of large deviations. In the
case that A is a two-point set, say A D f0; 1g, and .f1g/ D 1 .f0g/ D and
0 .f1g/ D 1 0 .f0g/ D 0 , one has H.0 I / D .0 ; /. For more on large
deviations, see the book by Dembo and Zeitouni [13].
For some basic results on the Galton–Watson branching process, using prob-
abilistic methods, see the advanced probability textbook of Durrett [16]. Two
standard texts on branching processes are the books of Harris [24] and of Athreya
and Ney [7].
Appendix A
A Quick Primer on Discrete Probability
P .C \ D/
P .DjC / D :
P .C /
R.G. Pinsky, Problems from the Discrete to the Continuous, Universitext, 133
DOI 10.1007/978-3-319-07965-3, © Springer International Publishing Switzerland 2014
134 A Primer on Discrete Probability
Note that the set of x 2 R for which P .X D x/ > 0 is either finite or countably
infinite; thus, these summations are well defined. We frequently denote EX by . If
P .X 0/ D 1 and the condition above in the definition of EX does not hold, then
we write EX P D 1. In the sequel, when we say that the expectation of X “exists,”
we mean that x2R jxj P .X D x/ < 1.
Given a function W R ! R and a random variable X , we can define a new
random variable Y D .X /. One can calculate EY according to the definition of
expectation above or in the following equivalent way:
X X
EY D .x/P .X D x/; if j .x/jP .X D x/ < 1:
x2R x2R
2 .X / D EX 2 2 : (A.1)
2
P .jX j / :
2
A Primer on Discrete Probability 135
Proof.
X X .x /2
P .jX j / D P .X D x/ P .X D x/
2
x2RWjxj x2RWjxj
X .x /2
2
P .X D x/ D :
x2R
2 2
Let fXj gnj D1 be a finite collection of random variables on a probability space
.; P /. We call X D .X1 ; : : : ; Xn / a random vector. The joint probability function
of these random variables, or equivalently, the probability function of the random
vector, is given by
P
In particular then, if EXj exists, it can be written as EXj D x2Rn xj pX .x/.
Similarly, if EXk exists, for all k, then we have
X
n X Xn X
n
X
E ck X k D . ck xk /pX .x/ D ck xk pX .x/ :
kD1 x2Rn kD1 kD1 x2Rn
It follows from this that the expectation is linear; that is, if EXk exists for k D
1; : : : ; n, then
X
n X
n
E ck X k D ck EXk ;
kD1 kD1
Y
n
P .X1 D x1 ; X2 D x2 ; : : : ; Xn D xn / D P .Xj D xj /;
j D1
for all xj 2 R; j D 1; 2; : : : ; n:
Let ffi gniD1 be real-valued functions with fi defined at least on the set fx 2 R W
P .Xi D x/ > 0g. Assume that Ejfi .Xi /j < 1, for i D 1; : : : ; n. From the
definition of independence it is easy to show that if fXj gnj D1 are independent, then
Y
n Y
n
E fi .Xi / D Efi .Xi /: (A.2)
iD1 iD1
The variance is of course not linear. However the variance of a sum of independent
random variables is equal to the sum of the variances of the random variables:
It suffices to prove (A.3) for n D 2 and then use induction. Let i D EXi , i D 1; 2.
We have
2 2
where the last equality follows because (A.2) shows that E.X1 1 /.X2 2 / D
E.X1 1 /E.X2 2 / D 0.
Chebyshev’s inequality and (A.3) allow for an exceedingly short proof of
an important result—the weak law of large numbers for sums of independent,
identically distributed (IID) random variables.
Theorem A.1. Let fXn g1nD1 be a sequence of independent, identically distributed
2
Pncommon variance
is finite. Denote their
random variables and assume that their
common expectation by . Let Sn D j D1 Xj . Then for any > 0,
Sn
lim P .j j / D 0:
n!1 n
Proof. We have ESn D n, and since the random variables are independent and
identically distributed, it follows from (A.3) that
2 .Sn / D n
2 . Now applying
Chebyshev’s inequality to Sn with D n gives
A Primer on Discrete Probability 137
n
2
P .jSn nj n / ;
.n /2
Yn
lim P .j 1j / D 0:
n!1 EYn
2 .Yn /
P .jYn EYn j jEYn j/ 2 :
EYn
If X and Y are random variables on a probability space .; P /, and if
P .X Dx/>0, then the conditional probability function of Y given X D x is
defined by
P .X D x; Y D y/
pY jX .yjx/ WD P .Y D yjX D x/ D :
P .X D x/
2 .X / D p.1 p/.
Let n 2 N and let p 2 Œ0; 1. A random variable X satisfying
!
n j
P .X D j / D p .1 p/nj ; j D 0; 1; : : : ; n;
j
is called a binomial random variable, and one writes X Bin.n; p/. The random
variable X can be thought of as the number of “successes” in n independent trials,
where on each trial there are two possible outcomes—“success” and “failure”—
and the probability of “success” is p on each trial. Letting fZi gniD1 be independent,
identically distributed random variables
Pn distributed according to Ber.p/, it follows
that X can be realized as X D iD1 Zi . From the formula for the expected
value and variance of a Bernoulli random variable, and from the linearity of the
expectation and (A.3), the above representation immediately yields EX D np and
2 .X / D np.1 p/.
A random variable X satisfying
n
P .X D n/ D e ; n D 0; 1; : : : ;
nŠ
where > 0, is called a Poisson random variable, and one writes X Pois./.
One can check easily that EX D and
2 .X / D .
Proposition A.3 (Poisson Approximation to the Binomial Distribution). For
n 2 N and p 2 Œ0; 1, let Xn;p Bin.n; p/. For > 0, let X Pois./. Then
thus,
j
lim P .Xn;p D j / D e D P .X D j /:
n!1;p!0;np! jŠ
A Primer on Discrete Probability 139
Exercise A.2. Prove that P .A1 [A2 / D P .A1 /CP .A2 /P .A1 \A2 /, for arbitrary
events A1 ; A2 . Then prove more generally that for any finite n and arbitrary events
fAk gnkD1 , one has
X X
P .[nkD1 Ak / D P .Ai / P .Ai \ Aj /C
1in 1i<j n
X
P .Ai \ Aj \ Ak / C .1/n1 P .A1 \ A2 \ An /:
1i<j <kn
Ỳ
P .\`j D1 Bj / D P .Bj /; for any ` R and any
j D1 (A.5)
Q`
Using this, we need to prove that P .\`j D1 Bjc / D c
j D1 P .Bj /, for any sub-
collection fBj gj D1 of fAk gkD1 . Let pj D P .Bj / and p D P .\`j D1 Bjc /. Then
c ` c R
Q
we need to prove that p D `j D1 .1 pj /. Write
140 A Primer on Discrete Probability
Ỳ X X
.1 pj / D 1 pi C pi pj ;
j D1 1i` 1i<j `
and use (A.5) along with the principle of inclusion–exclusion, which appears in
Exercise A.2.
Exercise A.4. Using (A.4), show that
We review without proof some basic results concerning power series. For more
details, the reader should consult an advanced calculus or undergraduate analysis
text. We also illustrate the utility of generating functions by analyzing the one that
arises from the Fibonacci sequence.
Let fan g1nD0 be a sequence of real numbers. Define formally the generating
function F .t / of fan g1
nD0 by
1
X
F .t / D an t n ; (B.1)
nD0
P
that is, if the tail of the series 1nD0 jan t j converges to 0 uniformly over jt j .
n
The number r0 in (2) is called the radius of convergence of the power series.
R.G. Pinsky, Problems from the Discrete to the Continuous, Universitext, 141
DOI 10.1007/978-3-319-07965-3, © Springer International Publishing Switzerland 2014
142 B Power Series and Generating Functions
1
r0 D p
n a
:
lim supn!1 n
of the Fibonacci numbers. Multiply both sides of (B.2) by t n and then sum both
sides over n, with n running from 2 to 1. This gives us
1
X 1
X 1
X
fn t n D fn1 t n C fn2 t n :
nD2 nD2 nD2
F .t / t D tF .t / C t 2 F .t /;
p
of convergence jr C j D 51
2
. Thus, the generating function of the Fibonacci series
is given by
p
t 51
F .t / D ; jt j < : (B.4)
1 t t2 2
t
We now use the method of partial fractions to represent the function 1tt 2
in an
explicit power series. Using the fact that r C r D 1, we write
thus,
t t
D : (B.5)
1 t t2 .t r C 1/.t r C C 1/
t A B t .Ar C C Br / C .A C B/
D C C D :
.t r C
C 1/.t r C 1/ tr C 1 tr C 1 .t r C 1/.t r C C 1/
(B.6)
Comparing the left-most and right-most terms in (B.6) , we conclude that ACB D 0
and Ar C C Br D 1. Solving for A and B, we obtain A D r C r 1
D
p1 and
5
B D r r
1
C D
p1 . Thus, from (B.5) and the first equality in (B.6), we arrive at
5
the partial fraction representation
t 1 1 1
Dp
C
: (B.7)
1 t t2 5 1 C t r 1 C t r
1 1 p
1 X X 1C 5 n n
n n n
D .1/ .r / t D . / t I
1 C t r nD0 nD0
2
p (B.8)
X1 X1
1 n C n n 1 5 n n
D .1/ .r / t D . / t :
1 C t rC nD0 nD0
2
Comparing (B.3) with (B.9), we conclude that the nth Fibonacci number fn is
given explicitly by
p p
1 1C 5 n 1 5 n
fn D p . / . / : (B.10)
5 2 2
In order to obtain an asymptotic formula for the discrete quantity nŠ, it is extremely
useful to be able to embed this quantity in a function of a continuous variable.
Integrating by parts and then applying induction shows that nŠ D .n C 1/, n 2 N,
where the gamma function .t / is defined by
Z 1
.t / D x t1 e x dx; t > 0:
0
Proof. In the literature one can find literally dozens of proofs of Stirling’s formula.
We present here an elementary proof that uses Laplace’s asymptotic method [14].
We begin by giving the intuition for the method. We write
Z 1
.t C 1/ D e t .x/
dx; (C.3)
0
where
t .x/ D t log x x:
R.G. Pinsky, Problems from the Discrete to the Continuous, Universitext, 145
DOI 10.1007/978-3-319-07965-3, © Springer International Publishing Switzerland 2014
146 C Proof of Stirling’s Formula
.x t /2
t log t t DW O t .x/:
2t
R1 1 2 p
Since 1 e 2 z d z D 2, we conclude that
Z 1 p
O t .x/
e dx t t e t 2 t ; as t ! 1:
0
where
Z 1
N / D p1
.t e
tg. pz /
t d z:
p
2 t
C Proof of Stirling’s Formula 147
N / D 1:
lim .t (C.5)
t!1
N / D N L .t / C p1 T C .t / C p1 TL .t /;
.t (C.6)
L
2 2
where
Z L
1 tg. pz /
N L .t / D p e t dz
2 L
and
Z 1 Z L
tg. pz / tg. pz /
TLC .t / D e t d z; TL .t / D p e t d z:
L t
From Taylor’s remainder formula it follows that for any > 0 and sufficiently small
v, one has
1 1
.1 /v 2 g.v/ .1 C /v 2 :
2 2
Z L
1
lim N L .t / D p
1 2
e 2 z d z: (C.7)
t!1 2 L
0 p p
Since t g. pz t / D t 1 1
D p tz is increasing in z, we have
1C pz t Cz
t
p Z p
t CL 1 z 0 tg. pz t / t C L tg. pLt /
TLC .t / p t g. p / e dz D p e D
tL L t tL
p
t C L tŒ pLt log.1C pLt /
p e :
tL
L2 3
By Taylor’s formula, we have log.1 C L
p
t
/ D L
p
t
2t
C O.t 2 / as t ! 1; thus,
1 1 L2
lim sup TLC .t / e 2 : (C.8)
t!1 L
148 C Proof of Stirling’s Formula
1 1 L2
lim sup TL .t / e 2 : (C.9)
t!1 L
The standard way to prove the identity in the title of this appendix is via Fourier
series. We give a completely elementary proof, following [1]. Consider the double
integral
Z 1 Z 1
1
I D dxdy: (D.1)
0 0 1 xy
(Actually, the expression on the right hand side of (D.1) is an improper integral,
R1R1 1
because the integrand blows up at .x; y/ D .1; 1/. Thus, 0 0 1xy dxdy WD
R 1 R 1 1
lim !0C 0 0 1xy
dxdy. Since the integrand is nonnegative, there is no
R1R1 1
problem applying the standard rules of calculus directly to 0 0 1xy dxdy.) On
the one hand, expanding the integrand in a geometric series and integrating term by
term gives
Z 1 Z 1
1X 1 Z
X 1 Z 1
I D .xy/ dxdy D
n
x n y n dxdy D
0 0 nD0 nD0 0 0
1 Z 1
X Z 1 1
X X 1 1
1
x n dx y n dy D D : (D.2)
nD0 0 0 nD0
.n C 1/ 2
nD1
n2
(The interchanging of the order of the integration and the summation is justified by
the fact that all the summands are nonnegative.)
On the other hand, consider the change of variables u D yCx 2
, v D yx 2
. This
ı
transformation p rotates the square Œ0; 1Œ0; 1 clockwise by 45 and shrinks its sides
by the factor 2. The new domain is f.u; v/ W 0 u 12 ; u v ug [ f.u; v/ W
1
2
u 1; u 1 v 1 ug. The Jacobian @.x;y/@.u;v/
of the transformation is equal to
1
2, so the area element dxdy gets replaced by 2d udv. The function 1xy becomes
1
1u2 Cv 2
. Since the function and the domain are symmetric with respect to the u-axis,
we have
R.G. Pinsky, Problems from the Discrete to the Continuous, Universitext, 149
DOI 10.1007/978-3-319-07965-3, © Springer International Publishing Switzerland 2014
P1 2
150 D Proof of 1
nD1 n2 D 6
Z 1 Z Z Z
2 u
dv 1 1u
dv
I D4 du C 4 d u:
0 0 1u Cv
2 2 1
2 0 1u Cv
2 2
R
Using the integration formula dx
x 2 Ca2
D 1
a
arctan xa , we obtain
Z 1 Z
2 1 u 1
1 1u
I D4 p arctan p du C 4 p arctan p d u:
0 1 u2 1 u2 1
2 1 u2 1 u2
Now the derivative of g.u/ WD arctan p u 2 is p 1 2 , and the derivative of
q 1u
1u 1u
1u
h.u/ WD arctan p
2
D arctan 1Cu
is 2
1p1
2
. Thus, we conclude that
1u 1u
Z 1 Z 1 1
2
0
I D4 g.u/g .u/ d u 8 h.u/h0 .u/ d u D 2g 2 .u/j02 4h2 .u/j11 D
1 2
0 2
1 1 1
2 arctan2 p arctan2 0 4 arctan2 0 arctan2 p D 6 arctan2 p
3 3 3
2
D 6. /2 D : (D.3)
6 6
Comparing (D.2) and (D.3) gives
X1
1 2
2
D :
nD1
n 6
Index

A
Abel summation, 77
arcsine distribution, 37
average order, 13

B
Bernoulli random variable, 138
binomial random variable, 138
branching process, see Galton–Watson branching process, 117

C
Chebyshev's ψ-function, 70
Chebyshev's θ-function, 68
Chebyshev's inequality, 134
Chebyshev's theorem, 68
Chinese remainder theorem, 19
clique, 89
coloring of a graph, 104
composition of an integer, 5
cycle index, 58
cycle type, 51

D
derangement, 49
Dyck path, 40

E
Erdős–Rényi graph, 89
Euler φ-function, 11
Euler product formula, 19
Ewens sampling formula, 52
expected value, 134
extinction, 117

F
Fibonacci sequence, 142
finite graph, 89

G
Galton–Watson branching process, 117
generating function, 141
giant component, 110

H
Hardy–Ramanujan theorem, 81

I
independent events, 133
independent random variables, 135

L
large deviations, 113

M
Mertens' theorems, 75
Möbius function, 8
Möbius inversion, 10
multiplicative function, 9

P
p-adic, 71
partition of an integer, 1
Poisson approximation to the binomial distribution, 138
Poisson random variable, 138
prime number theorem, 67

R
Ramsey number, 105
random variable, 133
relative entropy, 115, 131
restricted partition of an integer, 1

S
sieve method, 19
simple, symmetric random walk, 35
square-free integer, 8
Stirling numbers of the first kind, 54

T
tampering detection, 99
total variation distance, 99

V
variance, 134

W
weak convergence, 139
weak law of large numbers, 136, 137