of Continued fractions
Contents
Preface
References
Index
Preface
thus explaining the name of τ . The above equation will be also written as
where λ denotes Lebesgue measure and γ is what we now call Gauss’ measure, defined by
\[ \gamma(A) = \frac{1}{\log 2} \int_A \frac{dx}{x+1}, \qquad A \in \mathcal{B}_I . \]
Gauss asked for an estimate of the convergence rate in the above limiting
relation, and this has actually been the first problem of the metrical theory
of continued fractions. Ramifications of this problem, which was given a
first solution only in 1928, still pervade the current developments. Chapter
2 contains a detailed treatment of Gauss’ problem by an elementary ap-
proach and functional-theoretic methods as well. The latter are applied to
the Perron–Frobenius operator associated with τ , considered as acting on
various Banach spaces including that of functions of bounded variation on
I.
Gauss’ measure is important since it is preserved by τ, that is, $\gamma(\tau^{-1}(A)) = \gamma(A)$ for any $A \in \mathcal{B}_I$. This implies that, by its very definition, the sequence $(a_n)_{n \in \mathbf{N}_+}$ is strictly stationary under γ. As such, there should exist a doubly infinite version of it, say $(\bar a_\ell)_{\ell \in \mathbf{Z}}$, $\mathbf{Z} = \{\cdots, -1, 0, 1, \cdots\}$, defined on a richer probability space. It appears that this doubly infinite version can be effectively constructed on $(I^2, \mathcal{B}_I^2, \bar\gamma)$, where $\bar\gamma$ is the so-called extended Gauss’ measure defined by
\[ \bar\gamma(B) = \frac{1}{\log 2} \iint_B \frac{dx\,dy}{(xy+1)^2}, \qquad B \in \mathcal{B}_I^2 . \]
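Put $\bar a_{-n}(\omega,\theta) = a_{n+1}(\theta)$, $\bar a_0(\omega,\theta) = a_1(\theta)$, $\bar a_n(\omega,\theta) = a_n(\omega)$ for any $n \in \mathbf{N}_+$ and $(\omega,\theta) \in \Omega^2$. Then whatever $\ell \in \mathbf{Z}$, $k \in \mathbf{N}$, and $n \in \mathbf{N}_+$, the probability distribution of the random vector $(\bar a_\ell, \cdots, \bar a_{\ell+k})$ under $\bar\gamma$ is identical with that of the random vector $(a_n, \cdots, a_{n+k})$ under γ, that is, $(\bar a_\ell)_{\ell \in \mathbf{Z}}$ under $\bar\gamma$ is a doubly infinite version of $(a_n)_{n \in \mathbf{N}_+}$ under γ. A distinctive feature of our treatment is the consistent use of the extended incomplete quotients $\bar a_\ell$, $\ell \in \mathbf{Z}$. It appears that
\[ \bar\gamma\left([0,x] \times I \mid \bar a_0, \bar a_{-1}, \cdots\right) = \frac{(a+1)x}{ax+1} \quad \bar\gamma\text{-a.s.} \]
for any $x \in I$, where $a = [\bar a_0, \bar a_{-1}, \cdots]$, which in turn implies that
\[ \bar\gamma\left(\bar a_{\ell+1} = i \mid \bar a_\ell, \bar a_{\ell-1}, \cdots\right) = \frac{a+1}{(a+i)(a+i+1)} \quad \bar\gamma\text{-a.s.} \]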
Acknowledgements
Much of our original work included in this book has been carried out in the
framework of our association with the Bucharest ‘Gheorghe Mihoc’ Centre
for Mathematical Statistics of the Romanian Academy, and the Department
of Probability and Statistics (CROSS), Faculty ITS, of the Delft University
of Technology.
Many institutions and persons have helped us in various ways.
The first of us wishes to acknowledge the hospitality of Université René
Descartes – Paris 5, Université des Sciences et des Technologies de Lille,
and Université Victor Segalen – Bordeaux 2. He is grateful to Bui Trong
Lieu, Michel Schreiber (both of Paris 5), George Haiman (Lille), and Jean-
Marc Deshouillers (Bordeaux 2) for their kind invitations at these locations
where his stays in the period 1996–1999 were very helpful in completing
parts of the book. He is also grateful to the Nederlandse Organisatie voor
Finally, we must thank all the people with Kluwer Academic Publishers
who helped during the development and production of this book project.
Abbreviations
Cov = covariance
Var = variance
Frequently Used Notation
Symbols
Z = (−N) ∪ N+ = {· · · , −1, 0, 1, · · · }
R+ = (x ∈ R : x ≥ 0) , R++ = (x ∈ R : x > 0)
z ∗ = complex conjugate of z ∈ C
BM = Bn ∩ M := (B ∩ M : B ∈ B n ), M ∈ B n , n ∈ N+
λ = Lebesgue measure on B
λ2 = Lebesgue measure on B2
P f −1 = P -distribution of r.v. f
∗ = convolution of measures
$F_0 = F_1 = 1$, $F_{n+1} = F_n + F_{n-1}$, $n \in \mathbf{N}_+$
$g = (\sqrt{5} - 1)/2$, $G = g + 1$ (‘golden ratios’)
an , 3, 14 || · || L , 54
ā` , 31 Lp , 55
B(I), 53 ||·||p , 55
|| · ||, 53 L∞ , 55
||·||v , 56 Lpµ , 54
||·||v,µ , 56 ||·||p,µ , 54
BV (I), 54 L∞
µ , 55
|| · || v , 54 ||.||∞,µ , 55
C 1 (I), 53 να , 197
|| · || 1 , 53 Pλ , 60
cτ Pois µ, 317 Pi , 22
γ̄, 26 Pµ , 57
γa , 36 pn , 4, 19
F, 324 qn , 4, 19
Qν , 328 W , 319
rn , 14 yn , 15
r̄` , 34 y ` , 34
s (f ), 53
sn , 14
san , 36
s̄` , 34
σ (C), 313
ten , 263
e
ten , 273
τ, 2
τ , 25
Θn , 27
Θ0n , 251
Θen , 263
e e , 280
Θ n
¡ (n) ¢
u i , 18
U := Pγ , 59
un , 14
uan , 38
ū` , 34
v (f ), 55
¡ ¢
v i(n) , 18
Chapter 1
\[ v_0 := a, \quad v_1 := b, \]
\[ v_0 = a_1 v_1 + v_2, \]
\[ v_1 = a_2 v_2 + v_3, \]
where $0 \le v_{m+1} < v_m$. Clearly, the procedure should stop after finitely many steps: there exists $n \in \mathbf{N}_+$ such that $v_n \ne 0$ and $v_{n+1} = 0$. Then, as is well known, we have
\[ v_n = \text{g.c.d.}(a, b). \]
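The division chain above can be traced in a few lines of code. The following is only an illustrative sketch: the function name `euclid_digits` is an assumption of this example and not notation from the text, and the inputs are taken to be integers with a > b > 0.

```python
def euclid_digits(a, b):
    """Run Euclid's algorithm on integers a > b > 0.

    Returns (quotients, gcd), where quotients = [a_1, ..., a_n] come from
    v_{m-1} = a_m * v_m + v_{m+1} and gcd is the last non-zero remainder v_n.
    """
    v_prev, v_cur = a, b                 # v_0 := a, v_1 := b
    quotients = []
    while v_cur != 0:
        q, r = divmod(v_prev, v_cur)     # v_{m-1} = q * v_m + r with 0 <= r < v_m
        quotients.append(q)
        v_prev, v_cur = v_cur, r
    return quotients, v_prev


digits, g = euclid_digits(43, 19)        # 43 = 2*19 + 5, 19 = 3*5 + 4, 5 = 1*4 + 1, 4 = 4*1
```

For the pair (43, 19) the quotients are 2, 3, 1, 4 and the last non-zero remainder is 1.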
Remark. The running time of Euclid’s algorithm depends on the number
of division steps required to get the g.c.d. of the given positive integers
v0 > v1 . In an 1844 paper of the French mathematician Gabriel Lamé it
is essentially shown that (i) given n ∈ N+ , if Euclid’s algorithm applied to
v0 and v1 requires exactly n division steps and v0 is as small as possible
satisfying this condition, then v0 = Fn+1 and v1 = Fn ; (ii) if v1 < v0 < m ∈
N+ , then the number of division steps required by Euclid’s algorithm when
applied to v0 and v1 is at most
\[ \left\lfloor \log(\sqrt{5}\, m) / \log((\sqrt{5} + 1)/2) \right\rfloor - 2 \approx \lfloor 2.078 \log m + 1.672 \rfloor - 2. \]
For historical details we refer the reader to Shallit (1994), and for recent
developments to Knuth (1981, Section 4.5.3) and Hensley (1994). It should
be noted that the latter are based on results to be proved in this and later
chapters. □
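Both parts of Lamé’s statement can be checked by brute force for small arguments. The sketch below is a numerical illustration rather than a proof; the helper names `division_steps`, `lame_bound` and `worst_case` are assumptions of this example, and the Fibonacci numbers use the convention F_0 = F_1 = 1 from the list of notation.

```python
import math


def division_steps(v0, v1):
    """Number of division steps Euclid's algorithm needs for v0 > v1 > 0."""
    steps = 0
    while v1 != 0:
        v0, v1 = v1, v0 % v1
        steps += 1
    return steps


def lame_bound(m):
    """floor(log(sqrt(5) m) / log((sqrt(5) + 1) / 2)) - 2, the bound of part (ii)."""
    phi = (math.sqrt(5) + 1) / 2
    return math.floor(math.log(math.sqrt(5) * m) / math.log(phi)) - 2


def worst_case(m):
    """Largest number of division steps over all pairs v1 < v0 < m."""
    return max(division_steps(v0, v1)
               for v0 in range(2, m) for v1 in range(1, v0))


# Fibonacci numbers with F_0 = F_1 = 1.
fib = [1, 1]
while len(fib) < 12:
    fib.append(fib[-1] + fib[-2])
```

With these helpers, `division_steps(fib[n + 1], fib[n])` equals n, and `worst_case(m)` never exceeds `lame_bound(m)` on the small range tried.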
To consider Euclid’s algorithm more closely we define the so-called continued fraction transformation $\tau : I \to I$ by
\[ \tau(x) = \begin{cases} x^{-1} - \lfloor x^{-1} \rfloor & \text{if } x \ne 0, \\ 0 & \text{if } x = 0. \end{cases} \]
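and
\[ \frac{v_m}{v_{m-1}} = \tau^{m-1}(x), \quad 1 \le m \le n, \qquad \tau^n(x) = 0, \]
where $\tau^0$ = identity map and $\tau^\ell$, $\ell \in \mathbf{N}_+$, is the composition of τ with itself ℓ times. Note that
\[ a_m(x) = a_1\left(\tau^{m-1}(x)\right), \quad 1 \le m \le n. \tag{1.1.2} \]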
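Relation (1.1.2) says that the digits of a rational x can be read off by iterating τ. A minimal sketch in exact rational arithmetic follows; the names `tau` and `digits` are assumptions of this example, and a_1(x) is taken to be ⌊1/x⌋ for x ≠ 0.

```python
from fractions import Fraction


def tau(x):
    """Continued fraction transformation: tau(x) = 1/x - floor(1/x), tau(0) = 0."""
    if x == 0:
        return Fraction(0)
    inv = 1 / Fraction(x)
    return inv - (inv.numerator // inv.denominator)


def digits(x):
    """Digits a_1, ..., a_n of a rational x in [0, 1), via a_m(x) = a_1(tau^{m-1}(x))."""
    x = Fraction(x)
    out = []
    while x != 0:
        inv = 1 / x
        out.append(inv.numerator // inv.denominator)   # a_1(x) = floor(1/x)
        x = tau(x)
    return out
```

For x = 19/43 the loop stops after four steps, in agreement with Euclid’s algorithm applied to the pair (43, 19).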
Basic properties 3
x = a0 + [a1 , · · · , an ] , (1.1.4)
[a0 ; a1 , · · · , an ] .
[a0 ; a1 , · · · , an ] = [a0 ; a1 , · · · , an − 1, 1] .
τ n (x − a0 ) ∈ Ω = I\Q, n ∈ N.
Let us define
Hence
x = [a0 ; a1 + τ (x − a0 )] = · · · = [a0 ; a1 , · · · , an−1 , an + τ n (x − a0 )] (1.1.5)
for any n ≥ 2.
The two cases x ∈ Q and x ∈ R\Q can be treated in a unitary manner
if we define
a1 (0) = ∞,
the symbol ∞ being subject to the rules 1/∞ = 0, 1/0 = ∞. Equations
(1.1.5) are then valid for any x ∈ R. Clearly, for any x ∈ Q there exists
n = n (x) ∈ N+ such that am (x) = ∞ for any m ≥ n.
The integers a1 (x), a2 (x), · · · will be called the (continued fraction) digits
of x ∈ R whilst the functions x → ai (x) ∈ N+ ∪ {∞}, x ∈ R, i ∈ N+ ,
will be called the incomplete (or partial ) quotients of the continued fraction
expansion. Euclid’s algorithm implies that x ∈ R has finitely many finite
continued fraction digits if and only if x ∈ Q.
Thus
Q2 (x1 , x2 ) = x1 x2 + 1, Q3 (x1 , x2 , x3 ) = x1 x2 x3 + x1 + x3 ,
Q4 (x1 , x2 , x3 , x4 ) = x1 x2 x3 x4 + x1 x2 + x1 x4 + x3 x4 + 1,
\[ [x_1, \cdots, x_n] = \frac{Q_{n-1}(x_2, \cdots, x_n)}{Q_n(x_1, \cdots, x_n)}, \quad n \in \mathbf{N}_+ . \tag{1.1.6} \]
The proof by induction is immediate and is left to the reader. The continu-
ants enjoy the symmetry property
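\[ \omega_n(\omega) = \frac{Q_{n-1}(a_2, \cdots, a_n)}{Q_n(a_1, \cdots, a_n)}, \tag{1.1.10} \]
\[ \begin{aligned} q_n &= a_n q_{n-1} + q_{n-2}, \quad n \ge 2, \\ p_n &= a_n p_{n-1} + p_{n-2}, \quad n \ge 3, \end{aligned} \tag{1.1.11} \]
\[ p_{n+1} \ge F_n, \quad q_n \ge F_n, \quad n \in \mathbf{N}. \tag{1.1.13} \]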
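The recurrences (1.1.11) give a direct way to compute the convergents. The sketch below assumes the initial values p_0 = 0, q_0 = 1, p_1 = 1, q_1 = a_1, and the function names are illustrative only. It compares the recurrence with a direct evaluation of the nested fraction [a_1, ..., a_n].

```python
from fractions import Fraction


def convergents(a):
    """Lists p, q with p[n]/q[n] = [a_1, ..., a_n] for digits a = [a_1, ..., a_n].

    Starts from p_0 = 0, q_0 = 1, p_1 = 1, q_1 = a_1 and applies (1.1.11).
    """
    p, q = [0, 1], [1, a[0]]
    for an in a[1:]:
        p.append(an * p[-1] + p[-2])
        q.append(an * q[-1] + q[-2])
    return p, q


def cf_value(a):
    """Evaluate [a_1, ..., a_n] directly from the nested fraction."""
    value = Fraction(0)
    for an in reversed(a):
        value = 1 / (an + value)
    return value


p, q = convergents([2, 3, 1, 4])
```

For the digits 2, 3, 1, 4 the last convergent is 19/43, and the denominators dominate the Fibonacci numbers as in (1.1.13).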
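Notice that by (1.1.5), (1.1.6), (1.1.7), (1.1.10), and (1.1.11) we also have
\[ \omega = [a_1 + \tau(\omega)] = \frac{1}{a_1 + \tau(\omega)} = \frac{p_1 + \tau(\omega) p_0}{q_1 + \tau(\omega) q_0}, \]
\[ \omega = \left[a_1, a_2 + \tau^2(\omega)\right] = \frac{a_2 + \tau^2(\omega)}{a_1 a_2 + 1 + a_1 \tau^2(\omega)} = \frac{p_2 + \tau^2(\omega) p_1}{q_2 + \tau^2(\omega) q_1}, \]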
and for n ≥ 3,
and remark that (1.1.14) also holds for any rational ω in [0, 1).
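Remark. A matrix approach to equations (1.1.12) and (1.1.14) is as follows. Consider the matrices
\[ M_n = \begin{pmatrix} p_{n-1} & p_n \\ q_{n-1} & q_n \end{pmatrix}, \quad n \in \mathbf{N}, \]
so that $M_0$ = identity matrix, and define
\[ M_{-1} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}. \]
\[ M_n = M_{n-1} A_n, \quad n \in \mathbf{N}, \]
where
\[ A_n = \begin{pmatrix} 0 & 1 \\ 1 & a_n \end{pmatrix}, \quad n \in \mathbf{N}, \]
with $a_0 = 0$. Hence
\[ M_n = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \prod_{i=0}^{n} \begin{pmatrix} 0 & 1 \\ 1 & a_i \end{pmatrix}, \quad n \in \mathbf{N}, \]
\[ \det M_n = (-1)^n, \quad n \in \mathbf{N}. \]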
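whence
\[ M_n(0) = \frac{(1, 0)\, M_{n-1}\, (1, a_n)^T}{(0, 1)\, M_{n-1}\, (1, a_n)^T} = \frac{p_n}{q_n} := \begin{cases} [a_1, \cdots, a_n] & \text{if } n \in \mathbf{N}_+, \\ 0 & \text{if } n = 0. \end{cases} \]
It follows that
\[ M_n(z) = \frac{p_n + z p_{n-1}}{q_n + z q_{n-1}} = [a_1, \cdots, a_{n-1}, a_n + z], \quad n \ge 2, \]
for any $z \in \mathbf{C}$, $z \ne -q_n/q_{n-1}$, and
\[ M_1(z) = \frac{1}{a_1 + z} = \frac{p_1 + z p_0}{q_1 + z q_0} \]
for any $z \in \mathbf{C}$, $z \ne -a_1$. Now, (1.1.14) follows from the last two equations by taking $z = \tau^n(\omega)$, $n \ge 2$, respectively $z = \tau(\omega)$, $\omega \in \Omega$.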
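Finally, it is obvious by (1.1.10′) that $p_n$ and $q_n$, $n \in \mathbf{N}_+$, can be actually defined as
\[ \begin{pmatrix} p_n \\ q_n \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 1 & a_1 \end{pmatrix} \cdots \begin{pmatrix} 0 & 1 \\ 1 & a_n \end{pmatrix} \begin{pmatrix} 0 \\ 1 \end{pmatrix}. \]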
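The matrix products in this Remark are easy to reproduce. The sketch below is an illustration with hypothetical helper names (`mat_mul`, `M`, `det`). It multiplies the matrices A_1, ..., A_n starting from the identity M_0, then reads p_{n-1}, p_n, q_{n-1}, q_n off the result.

```python
def mat_mul(X, Y):
    """Product of two 2x2 integer matrices given as nested tuples."""
    return ((X[0][0] * Y[0][0] + X[0][1] * Y[1][0], X[0][0] * Y[0][1] + X[0][1] * Y[1][1]),
            (X[1][0] * Y[0][0] + X[1][1] * Y[1][0], X[1][0] * Y[0][1] + X[1][1] * Y[1][1]))


def M(a):
    """M_n = M_{n-1} A_n with M_0 = identity and A_k = ((0, 1), (1, a_k))."""
    result = ((1, 0), (0, 1))
    for ak in a:
        result = mat_mul(result, ((0, 1), (1, ak)))
    return result


def det(X):
    """Determinant of a 2x2 matrix."""
    return X[0][0] * X[1][1] - X[0][1] * X[1][0]
```

For the digits 2, 3, 1, 4 one gets M_4 with columns (4, 9) and (19, 43), and the determinant alternates in sign with n.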
It is worth mentioning that any irrational number
ω = [a0 ; a1 , a2 , · · · ] ∈ R
for any z0 ∈ C. This simple remark is the starting point for understanding
by the use of elementary results about continued fractions the behaviour of
the geodesic flow on a certain Riemann surface. For details see Series (1982,
1991). See also Adler (1991), Faivre (1993), and Nakada (1995). For another
representation of irrationals ω ∈ R in terms of matrices R and L = (P Q)2 Q
see Raney (1973). □
We can now prove the result announced before defining the continuants.
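Proposition 1.1.1 For any $x \in [0, 1)$ we have
\[ x - \omega_n(x) = \frac{(-1)^n \tau^n(x)}{q_n \left(q_n + \tau^n(x)\, q_{n-1}\right)}, \quad n \in \mathbf{N}. \tag{1.1.15} \]
and
\[ \lim_{n \to \infty} \omega_n(\omega) = \omega. \tag{1.1.17} \]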
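For rational x the identity (1.1.15) can be verified exactly. The sketch below assumes the conventions p_{-1} = 1, q_{-1} = 0, p_0 = 0, q_0 = 1 (so that M_0 is the identity matrix) and checks every n up to the point where τ^n(x) = 0. The function name is hypothetical.

```python
from fractions import Fraction


def check_error_formula(x):
    """Check (1.1.15) exactly for a rational x in [0, 1), for every n until tau^n(x) = 0."""
    x = Fraction(x)
    t = x                        # t = tau^n(x), starting with n = 0
    p_prev, q_prev = 1, 0        # p_{-1}, q_{-1}
    p, q = 0, 1                  # p_0, q_0
    n = 0
    while True:
        lhs = x - Fraction(p, q)
        rhs = (-1) ** n * t / (q * (q + t * q_prev))
        if lhs != rhs:
            return False
        if t == 0:
            return True
        inv = 1 / t
        a = inv.numerator // inv.denominator     # a_{n+1} = floor(1 / tau^n(x))
        t = inv - a                              # tau^{n+1}(x)
        p_prev, p = p, a * p + p_prev
        q_prev, q = q, a * q + q_prev
        n += 1
```

The check uses only rational arithmetic, so equality is tested exactly rather than up to rounding.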
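Proof. Equation (1.1.15) follows from (1.1.12) and (1.1.14). Next, since
\[ \frac{1}{\tau^n(\omega)} = a_{n+1} + \tau^{n+1}(\omega), \quad n \in \mathbf{N}, \ \omega \in \Omega, \]
by (1.1.11) we have
\[ \frac{\tau^n(\omega)}{q_n \left(q_n + \tau^n(\omega)\, q_{n-1}\right)} = \frac{1}{q_n \left(q_n (a_{n+1} + \tau^{n+1}(\omega)) + q_{n-1}\right)} = \frac{1}{q_n \left(q_{n+1} + q_n \tau^{n+1}(\omega)\right)}, \]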
\[ \omega_n - \omega_{n-1} = \frac{(-1)^{n+1}}{q_n q_{n-1}}, \quad n \in \mathbf{N}_+, \ \omega \in \Omega, \tag{1.1.18} \]
which in conjunction with (1.1.15) yields
for any ω ∈ Ω. Clearly, the above inequalities also hold for any rational
ω ∈ [0, 1) with some inequality signs ‘<’ replaced by ‘≤’.
In what follows we shall write
ω = [a1 , a2 , · · · ] , ω ∈ Ω,
ωn = [i1 , · · · , in ] , n ∈ N+ .
\[ a_1(\omega) = \lfloor 1/\omega \rfloor = i_1 . \]
This follows from (1.1.20) letting $n \to \infty$ and noting that $\lim_{n \to \infty} [i_2, \cdots, i_n]$ exists and lies in the open interval (0, 1). □
where $b \in \mathbf{N}_+$ is not a perfect square, and $a, c \in \mathbf{Z}$, $c \ne 0$. Then $x' = (a - \sqrt{b})/c$ is called the algebraic conjugate of x. A purely periodic quadratic irrationality x is characterized by the inequalities $x > 1$, $-1 < x' < 0$. We have, for example,
\[ \frac{1 + \sqrt{7}}{2} = \left[\, \overline{1; 1, 4, 1} \,\right] \]
and
\[ \frac{1 + \sqrt{2}}{3} = \left[ 1, \overline{4, 8} \right]. \]
The first quadratic irrationality above is purely periodic and has period length 4 while the second one has period length 2 but is not purely periodic.
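The two expansions can be reproduced in exact integer arithmetic. The sketch below uses the standard recursion on complete quotients (P + √D)/Q. This is a technique brought in for illustration, not one described in the text, and the function name is hypothetical. It returns the integer part a_0 followed by a_1, a_2, ....

```python
from math import isqrt


def quadratic_cf(a, b, c, count):
    """First `count` digits a_0, a_1, ... of x = (a + sqrt(b)) / c.

    Assumes b is a positive non-square integer and c != 0. Works with
    complete quotients (P + sqrt(D)) / Q in exact integer arithmetic.
    """
    # Rescale so that Q divides D - P^2, which keeps every later Q an integer.
    P, Q, D = a * abs(c), c * abs(c), b * c * c
    root = isqrt(D)                      # floor(sqrt(D)); sqrt(D) itself is irrational
    digits = []
    for _ in range(count):
        # floor((P + sqrt(D)) / Q), taking the sign of Q into account
        digit = (P + root) // Q if Q > 0 else (P + root + 1) // Q
        digits.append(digit)
        P = digit * Q - P
        Q = (D - P * P) // Q
    return digits
```

For (1 + √7)/2 the digits repeat the block 1, 1, 4, 1 from the very start, while for (1 + √2)/3 the block 4, 8 repeats only after the digits 0 and 1.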
Apart from that, the continued fraction expansion of even a single ad-
ditional algebraic number is not explicitly known. We do not know even
whether the sequence of digits is unbounded for such a number. [In connec-
tion with this matter see, however, Brjuno (1964) and Richtmyer (1975).]
For transcendental numbers of interest it is not clear when to expect a continued fraction expansion with a good ‘pattern’. For example, in a paper titled De Fractionibus Continuis, published in 1737, Leonhard Euler gave a nice continued fraction expansion for $e = \sum_{n \in \mathbf{N}} 1/n!$, namely
\[ e = [2; 1, 2, 1, 1, 4, 1, 1, 6, 1, \cdots, 1, 2n, 1, \cdots]. \]
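Euler’s pattern is easy to test numerically. The sketch below generates the digits 2; 1, 2, 1, 1, 4, 1, ... and evaluates a finite truncation as an exact fraction. The helper names `e_digits` and `convergent` are assumptions of this example.

```python
from fractions import Fraction
import math


def e_digits(count):
    """First `count` digits of e = [2; 1, 2, 1, 1, 4, 1, 1, 6, ...]."""
    out = [2]
    k = 1
    while len(out) < count:
        out.extend([1, 2 * k, 1])
        k += 1
    return out[:count]


def convergent(digits):
    """Value of [a_0; a_1, ..., a_n] as an exact fraction."""
    value = Fraction(digits[-1])
    for a in reversed(digits[:-1]):
        value = a + 1 / value
    return value
```

Truncating after twenty digits gives the fraction 28 245 729 / 10 391 023, which agrees with `math.e` to within double-precision rounding.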
= [1; (n − 1)/2, 6n, (5n − 1)/2, 1, 1, (7n − 1)/2, 18n, (11n − 1)/2, 1, · · · ]
The digits of π do not appear to follow any pattern and are widely suspected
to be in some sense random.
There is a vague folklore statement [cf. Thakur (1996)] that the nice
patterns come from the connection with hypergeometric functions and the
representation of the latter by certain generalized continued fraction expan-
sions. For more on that see Chudnovsky and Chudnovsky (1991, 1993).
thus it is equal to
\[ \frac{(10\,391\,023)^2 \log 10\,391\,023}{\log \log 10\,391\,023} \left| e - \frac{28\,245\,729}{10\,391\,023} \right| \]
If we define a1 (0) = ∞ then the above equations also define the incomplete
quotients for the rational numbers in [0, 1). As we have noted in Subsec-
tion 1.1.1, for any rational x ∈ [0, 1) there exists n = n (x) ∈ N+ such
that am (x) = ∞ for any m ≥ n.
The metric point of view in studying the sequence (an )n∈N+ is to con-
sider that the an , n ∈ N+ , are N+ -valued random variables on (I, BI )
which are defined µ-a.s. in I for any probability measure µ on BI assigning
measure 0 to the rationals in I . (Such a µ is clearly Lebesgue measure λ.)
Alternatively, we can look at the an , n ∈ N+ , as N+ ∪ {∞}-valued random
variables which are defined everywhere in [0, 1). It is clear, for example, that
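\[ a_1(0) = \infty, \qquad a_1(x) = 1, \ x \in \left( \tfrac{1}{2}, 1 \right), \]
\[ a_1(x) = i, \ x \in \left( \frac{1}{i+1}, \frac{1}{i} \right], \ i \ge 2, \]
\[ a_2(0) = a_2\left( \frac{1}{i} \right) = \infty, \ i \ge 2, \]
\[ a_2(x) = 1, \ x \in \bigcup_{i \in \mathbf{N}_+} \left( \frac{1}{i+1}, \frac{1}{i + 1/2} \right), \]
\[ a_2(x) = j, \ x \in \bigcup_{i \in \mathbf{N}_+} \left[ \frac{1}{i + 1/j}, \frac{1}{i + 1/(j+1)} \right), \ j \ge 2. \]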
\[ r_n = \frac{1}{\tau^{n-1}} = [a_n; a_{n+1}, a_{n+2}, \cdots], \tag{1.2.1} \]
\[ s_n = \frac{q_{n-1}}{q_n}, \qquad y_n = \frac{1}{s_n}, \tag{1.2.2} \]
\[ u_n(\omega) = q_{n-1}^{-2} \left| \omega - \frac{p_{n-1}}{q_{n-1}} \right|^{-1}, \quad \omega \in \Omega, \tag{1.2.3} \]
where, as usual, $p_n/q_n = [a_1, \cdots, a_n]$, $n \in \mathbf{N}_+$, is the nth convergent, $p_0 = 0$, $q_0 = 1$. Note that $q_n = y_1 \cdots y_n = (s_1 \cdots s_n)^{-1}$, $n \in \mathbf{N}_+$. Next, it follows from the first equation (1.1.11) that
\[ \frac{1}{s_n} = a_n + s_{n-1}, \quad n \in \mathbf{N}_+, \]
with $s_0 = 0$. Hence
\[ s_n = [a_n, \cdots, a_1], \quad n \in \mathbf{N}_+ . \tag{1.2.2′} \]
\[ u_n = s_{n-1} + r_n, \quad n \in \mathbf{N}_+ . \tag{1.2.3′} \]
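Relations (1.2.2′) and (1.2.3′) can be checked exactly on a rational test point, for every n with τ^{n-1}(ω) ≠ 0. The sketch below is an illustration under that restriction (the text states the definitions for ω ∈ Ω), and the function name is hypothetical.

```python
from fractions import Fraction


def check_s_and_u(omega):
    """Verify 1/s_n = a_n + s_{n-1} and u_n = s_{n-1} + r_n for a rational omega in (0, 1)."""
    omega = Fraction(omega)
    t = omega                    # tau^{n-1}(omega), starting with n = 1
    p_prev, p = 1, 0             # p_{n-2}, p_{n-1}
    q_prev, q = 0, 1             # q_{n-2}, q_{n-1}
    s = Fraction(0)              # s_{n-1}, with s_0 = 0
    ok = True
    while t != 0:
        r = 1 / t                                        # r_n = 1 / tau^{n-1}
        u = 1 / (q * q * abs(omega - Fraction(p, q)))    # definition (1.2.3)
        ok = ok and (u == s + r)                         # relation (1.2.3')
        a = r.numerator // r.denominator                 # a_n
        p_prev, p = p, a * p + p_prev
        q_prev, q = q, a * q + q_prev
        ok = ok and (Fraction(q_prev, q) == 1 / (a + s)) # s_n = q_{n-1}/q_n = 1/(a_n + s_{n-1})
        s = Fraction(q_prev, q)
        t = r - a                                        # tau^n(omega)
    return ok
```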
\[ \lim_{n \to \infty} F_n(x) = \frac{\log(x+1)}{\log 2}, \quad x \in I. \]
Gauss’ proof has never been found. Later, in a letter dated 30th January
1812, Gauss asked Laplace what we now call:
Gauss’ Problem. Estimate the error
\[ e_n(x) := F_n(x) - \frac{\log(x+1)}{\log 2}, \quad n \in \mathbf{N}, \ x \in I. \]
Gauss’ letter has been published on pages 371–372 of his Werke, Volume 1,
Section 1, Teubner, Leipzig, 1917. Almost the whole letter is reproduced
on pages 396–397 of J.V. Uspensky’s Introduction to Mathematical Proba-
bility, McGraw-Hill, New York, 1937. See also Gray (1984, p. 123) for other
historical details about Gauss’ problem.
The first one to give a solution to Gauss’ problem (implicitly proving Gauss’ 1800 assertion) was R.O. Kuzmin, who showed in 1928 [see Kuzmin (1928, 1932)] that $e_n(x) = O(q^{\sqrt{n}})$ as $n \to \infty$, with $0 < q < 1$, uniformly in $x \in I$. Kuzmin’s proof is reproduced in Khintchine (1956, 1963, 1964). Independently, Paul Lévy showed one year later [see Lévy (1929) and also Lévy (1954, Ch. IX)] that $|e_n(x)| \le q^n$, $n \in \mathbf{N}_+$, $x \in I$, with $q = 3.5 - 2\sqrt{2} = 0.67157 \cdots$. We present a slightly improved version of Lévy’s solution in Subsection 1.3.5. Using Kuzmin’s approach, Szűsz (1961) claimed to have lowered the Lévy estimate for q to 0.4. Actually, Szűsz’s argument yields just 0.485 rather than 0.4. The optimal value of q was determined by Wirsing (1974), who found that it was equal to 0.303 663 002 · · · .
Chapter 2 is devoted to a thorough treatment of Gauss’ problem. In
particular, Corollary 2.3.6 provides a complete solution to a generalization
of it, where the interval [0, x), x ∈ I, is replaced by an arbitrary set A ∈ BI .
The limiting distribution function log(x + 1)/ log 2, x ∈ I, occurring
in Gauss’ problem motivates the introduction of what we now call Gauss’
measure γ, which is defined on BI by
\[ \gamma(A) = \frac{1}{\log 2} \int_A \frac{dx}{x+1}, \quad A \in \mathcal{B}_I . \]
Then clearly γ([0, x]) = log(x+1)/ log 2, x ∈ I. We are going to prove that γ
and τ enjoy an important property. First, we note that τ does not preserve
λ. This means that we do not have λ(τ −1 (A)) = λ (A) for any A ∈ BI .
Indeed, for, e.g., A = (1/2, 1) we have
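\[ \tau^{-1}(A) = \bigcup_{i \in \mathbf{N}_+} \left( \frac{1}{i+1}, \frac{1}{i + 1/2} \right) \]
and
\[ \lambda\left(\tau^{-1}(A)\right) = \sum_{i \in \mathbf{N}_+} \left( \frac{1}{i + 1/2} - \frac{1}{i+1} \right) = 2 \sum_{i \in \mathbf{N}_+} \left( \frac{1}{2i+1} - \frac{1}{2i+2} \right) = 2 \left( \log 2 - 1 + \frac{1}{2} \right) = 2 \log 2 - 1 \]
while $\lambda(A) = 1/2$.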
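The computation above, and the contrast with Gauss’ measure, can be illustrated numerically. The sketch below sums the lengths and the γ-masses of the intervals making up τ^{-1}(A) for A = (1/2, 1). The partial sums are truncated, so the comparison is approximate only, and the helper names are assumptions of this example.

```python
import math


def lebesgue_of_preimage(terms):
    """Partial sum of lambda(tau^{-1}(A)) for A = (1/2, 1)."""
    return sum(1 / (i + 0.5) - 1 / (i + 1) for i in range(1, terms + 1))


def gauss_measure(lo, hi):
    """gamma((lo, hi)) = (log(1 + hi) - log(1 + lo)) / log 2."""
    return (math.log1p(hi) - math.log1p(lo)) / math.log(2)


def gauss_of_preimage(terms):
    """Partial sum of gamma(tau^{-1}(A)) for the same A."""
    return sum(gauss_measure(1 / (i + 1), 1 / (i + 0.5)) for i in range(1, terms + 1))
```

The Lebesgue sum approaches 2 log 2 − 1 ≈ 0.386 rather than λ(A) = 1/2, whereas the γ-sum approaches γ(A) = log(4/3)/log 2.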
For this it is enough to show that the above equation holds for any interval
A = (0, u], 0 < u ≤ 1. As
\[ \tau^{-1}((0, u]) = \bigcup_{i \in \mathbf{N}_+} \left[ \frac{1}{u+i}, \frac{1}{i} \right), \]
\[ I(i^{(n)}) = \left( \omega \in \Omega : a_k(\omega) = i_k, \ 1 \le k \le n \right). \]
We are going to prove that any $I(i^{(n)})$ is the set of irrationals from a certain open interval with rational endpoints. The sets $I(i^{(n)})$, $i^{(n)} \in \mathbf{N}_+^n$, are
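where
\[ u(i^{(n)}) = \begin{cases} \dfrac{p_n + p_{n-1}}{q_n + q_{n-1}} & \text{if } n \text{ is odd}, \\[2mm] \dfrac{p_n}{q_n} & \text{if } n \text{ is even}, \end{cases} \qquad v(i^{(n)}) = \begin{cases} \dfrac{p_n}{q_n} & \text{if } n \text{ is odd}, \\[2mm] \dfrac{p_n + p_{n-1}}{q_n + q_{n-1}} & \text{if } n \text{ is even}. \end{cases} \]
We have
\[ \frac{p_n + p_{n-1}}{q_n + q_{n-1}} = \begin{cases} [i_1 + 1] & \text{if } n = 1, \\ [i_1, \cdots, i_{n-1}, i_n + 1] & \text{if } n > 1, \end{cases} \]
\[ \lambda(I(i^{(n)})) = \frac{1}{q_n (q_n + q_{n-1})} \tag{1.2.5} \]
and
\[ \max_{i^{(n)} \in \mathbf{N}_+^n} \lambda(I(i^{(n)})) = \lambda(I(1(n))) = \frac{1}{F_n F_{n+1}}, \quad n \in \mathbf{N}_+, \tag{1.2.6} \]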
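The endpoints and the length of a fundamental interval follow directly from the last two convergents. In the sketch below the function name is hypothetical, and sorting the two endpoints stands in for the odd/even case distinction.

```python
from fractions import Fraction


def fundamental_interval(i):
    """Endpoints (u, v) and length of the interval whose irrationals form I(i_1, ..., i_n)."""
    p_prev, q_prev, p, q = 1, 0, 0, 1            # (p_{-1}, q_{-1}) and (p_0, q_0)
    for ik in i:
        p_prev, p = p, ik * p + p_prev
        q_prev, q = q, ik * q + q_prev
    ends = sorted([Fraction(p, q), Fraction(p + p_prev, q + q_prev)])
    length = Fraction(1, q * (q + q_prev))       # formula (1.2.5)
    return ends[0], ends[1], length
```

For example, I(2) sits in (1/3, 1/2) and I(1, 1) in (1/2, 2/3), both of length 1/6.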
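\[ \frac{p_{n-1}}{q_{n-1}} = [i_1, \cdots, i_{n-1}], \qquad \frac{p_n^-}{q_n^-} = [i_1, \cdots, i_{n-1}, i_n - 1] \]
Then
\[ I_{p/q} = I(i_1, \cdots, i_n) \cup I(i_1, \cdots, i_{n-1}, i_n - 1, 1) \tag{1.2.7} \]
\[ = \begin{cases} \Omega \cap \left( \dfrac{p + p_{n-1}}{q + q_{n-1}}, \dfrac{p + p_n^-}{q + q_n^-} \right) & \text{if } n \text{ is odd}, \\[2mm] \Omega \cap \left( \dfrac{p + p_n^-}{q + q_n^-}, \dfrac{p + p_{n-1}}{q + q_{n-1}} \right) & \text{if } n \text{ is even} \end{cases} \]
and
\[ \lambda\left(I_{p/q}\right) = \frac{3}{(q + q_{n-1})(q + q_n^-)} . \]
We have
\[ \max_{\{p, q \in \mathbf{N}_+ : \, n(p,q) = n\}} \lambda\left(I_{p/q}\right) = \lambda\left(I_{F_n/F_{n+1}}\right) = \frac{3}{(F_{n-1} + F_{n+1}) F_{n+2}}, \quad n \in \mathbf{N}_+ . \]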
Assuming that $p/q < \omega$, that is, n is even, from (1.2.8) we obtain
\[ \frac{p}{q} < \omega < \frac{p}{q} + \frac{1}{q (q + q_{n-1})} = \frac{p + p_{n-1}}{q + q_{n-1}} \]
[by (1.1.12)]. Similarly, assuming that $p/q > \omega$, that is, n is odd, we obtain
\[ \frac{p}{q} > \omega > \frac{p}{q} - \frac{1}{q (q + q_{n-1})} = \frac{p + p_{n-1}}{q + q_{n-1}} \]
\[ \gamma(a_k = i_1, \cdots, a_{k+n-1} = i_n) = \frac{1}{\log 2} \log \frac{1 + v(i^{(n)})}{1 + u(i^{(n)})}, \quad k \in \mathbf{N}_+ . \]
In particular,
\[ \gamma(a_k = i) = \frac{1}{\log 2} \log \frac{(i+1)^2}{i (i+2)} = \frac{1}{\log 2} \log \left( 1 + \frac{1}{i (i+2)} \right) \tag{1.2.9} \]
for any $k, i \in \mathbf{N}_+$.
Proof. Theorem 1.2.1 and equation (1.2.4). □
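Formula (1.2.9) gives the distribution of a single digit under γ. The short sketch below (helper names are assumptions of this example) evaluates it and checks that the probabilities add up correctly: the product of the ratios (i+1)²/(i(i+2)) over 1 ≤ i ≤ n telescopes to 2(n+1)/(n+2).

```python
import math


def gauss_digit_probability(i):
    """gamma(a_k = i) = log(1 + 1/(i (i + 2))) / log 2, formula (1.2.9)."""
    return math.log1p(1 / (i * (i + 2))) / math.log(2)


def partial_total(n):
    """Sum of gamma(a_k = i) over 1 <= i <= n; telescopes to 1 + log2((n + 1)/(n + 2))."""
    return sum(gauss_digit_probability(i) for i in range(1, n + 1))
```

In particular γ(a_k = 1) = log(4/3)/log 2, so the digit 1 occurs with probability of about 0.415.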
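Corollary 1.2.6 (Brodén–Borel–Lévy formula) For any $n \in \mathbf{N}_+$ we have
\[ \lambda(\tau^n < x \mid a_1, \cdots, a_n) = \frac{(s_n + 1) x}{s_n x + 1}, \quad x \in I, \tag{1.2.10} \]
where $s_n$ is defined by (1.2.2) or (1.2.2′).
Proof. Clearly, for any $n \in \mathbf{N}_+$ and $x \in I$,
\[ (\tau^n < x) \cap I(a_1, \cdots, a_n) = \begin{cases} \left( \omega \in \Omega : \dfrac{p_n + x p_{n-1}}{q_n + x q_{n-1}} < \omega < \dfrac{p_n}{q_n} \right) & \text{if } n \text{ is odd}, \\[2mm] \left( \omega \in \Omega : \dfrac{p_n}{q_n} < \omega < \dfrac{p_n + x p_{n-1}}{q_n + x q_{n-1}} \right) & \text{if } n \text{ is even}. \end{cases} \]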
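The set description in the proof reduces (1.2.10) to a ratio of two interval lengths, which can be compared with the closed form exactly. The sketch below does this for rational x. The function names are hypothetical, and s_n is computed from the recursion 1/s_n = a_n + s_{n-1} with s_0 = 0.

```python
from fractions import Fraction


def conditional_lambda(i, x):
    """lambda(tau^n < x | a_1 = i_1, ..., a_n = i_n), computed from interval lengths."""
    x = Fraction(x)
    p_prev, q_prev, p, q = 1, 0, 0, 1
    for ik in i:
        p_prev, p = p, ik * p + p_prev
        q_prev, q = q, ik * q + q_prev
    # (tau^n < x) inside I(i) lies between p_n/q_n and (p_n + x p_{n-1})/(q_n + x q_{n-1}).
    part = abs(Fraction(p, q) - (p + x * p_prev) / (q + x * q_prev))
    whole = abs(Fraction(p, q) - Fraction(p + p_prev, q + q_prev))
    return part / whole


def bbl_formula(i, x):
    """Right-hand side of (1.2.10) with s_n = [i_n, ..., i_1]."""
    x = Fraction(x)
    s = Fraction(0)
    for ik in i:
        s = 1 / (ik + s)
    return (s + 1) * x / (s * x + 1)
```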
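\[ = \frac{1}{x} \int_0^{x-1} G_{n-1}(s)\, ds, \]
\[ \frac{d}{dx} \lambda(r_n < x) = \int_0^1 \frac{(s+1)\, dG_{n-1}(s)}{(s+x)^2} = \frac{2}{(x+1)^2} + \int_0^1 \frac{(s - x + 2)\, G_{n-1}(s)\, ds}{(s+x)^3} . \tag{1.2.19} \]
Also, for any $n \in \mathbf{N}_+$ we have
\[ \frac{d}{dx} \lambda(u_n < x) = \frac{1}{x} G_{n-1}(x-1) - \frac{1}{x^2} \int_0^{x-1} G_{n-1}(s)\, ds \tag{1.2.20} \]
\[ = \begin{cases} \dfrac{1}{x} G_{n-1}(x-1) - \dfrac{1}{x^2} \displaystyle\int_0^{x-1} G_{n-1}(s)\, ds & \text{if } 1 \le x \le 2, \\[3mm] \dfrac{1}{x^2} \left( 2 - \displaystyle\int_0^1 G_{n-1}(s)\, ds \right) & \text{if } x > 2 \end{cases} \]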
whatever (ω, θ) ∈ Ω2 .
Equations (1.3.1) and (1.3.10 ) imply that
for any ω ∈ Ω. Note that the above equation also holds for n = 0 if we define
τ 0 = identity map.
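Now, define the extended Gauss measure $\bar\gamma$ on $\mathcal{B}_I^2$ by
\[ \bar\gamma(B) = \frac{1}{\log 2} \iint_B \frac{dx\,dy}{(xy+1)^2}, \quad B \in \mathcal{B}_I^2 . \]
Note that
\[ \bar\gamma(A \times I) = \bar\gamma(I \times A) = \gamma(A) \tag{1.3.4} \]
for any $A \in \mathcal{B}_I$. The result below shows that $\bar\gamma$ plays with respect to $\bar\tau$ the part played by γ with respect to τ (cf. Theorem 1.2.1).
Theorem 1.3.1 The extended Gauss measure $\bar\gamma$ is preserved by $\bar\tau$.
Proof. We should show that $\bar\gamma(\bar\tau^{-1}(B)) = \bar\gamma(B)$ for any $B \in \mathcal{B}_I^2$ or, equivalently, since $\bar\tau$ is invertible on $\Omega^2$, that $\bar\gamma(\bar\tau(B)) = \bar\gamma(B)$ for any $B \in \mathcal{B}_I^2$. As the set of Cartesian products $I(i^{(m)}) \times I(j^{(n)})$, $i^{(m)} \in \mathbf{N}_+^m$, $j^{(n)} \in$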
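\[ \Theta_n = u_{n+1}^{-1}, \quad n \in \mathbf{N}. \tag{1.3.6} \]
Hence
\[ 0 < \Theta_n < 1, \quad n \in \mathbf{N}. \]
It is rather easy to obtain more information about $\Theta_n$, $n \in \mathbf{N}$. It follows from (1.2.3′) and (1.2.1) that
\[ \Theta_n = \frac{1}{s_n + r_{n+1}} = \frac{\tau^n}{s_n \tau^n + 1}, \quad n \in \mathbf{N}. \]
Moreover, as $s_n^{-1} = a_n + s_{n-1}$ and $r_n = a_n + r_{n+1}^{-1}$, $n \in \mathbf{N}_+$, we also have
\[ \Theta_{n-1} = \frac{1}{s_{n-1} + r_n} = \frac{1}{s_{n-1} + a_n + r_{n+1}^{-1}} = \frac{s_n}{s_n \tau^n + 1}, \quad n \in \mathbf{N}_+ . \]
Thus it appears that
\[ (\Theta_{n-1}, \Theta_n) = \Psi(\tau^n, s_n), \quad n \in \mathbf{N}_+, \tag{1.3.7} \]
\[ \Theta_{n-1} + \Theta_n < 1, \quad n \in \mathbf{N}_+, \]
whence
\[ \min(\Theta_{n-1}, \Theta_n) < \frac{1}{2}, \quad n \in \mathbf{N}_+, \]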
For i ∈ N+ put
Vi = I (i) × Ω
Hi = Ω × I (i) .
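It follows from the definition of $\bar\tau$ that
\[ \bar\tau(V_i) = H_i, \quad V_i = \bar\tau^{-1}(H_i), \quad i \in \mathbf{N}_+, \]
and notice that its symmetric image with respect to the diagonal α = β is $H_i^* = \Psi H_i$, $i \in \mathbf{N}_+$. (For i = 1 both quadrangles are in fact triangles.) Define the mapping $F : \Delta \to \Delta$ as $F = \Psi \bar\tau \Psi^{-1}$.
It is easy to check that for any $i \in \mathbf{N}_+$ we have
\[ (\alpha, \beta) \in V_i^* \ \Rightarrow \ F(\alpha, \beta) = \left( \beta, \ \alpha + i \sqrt{1 - 4 \alpha \beta} - i^2 \beta \right). \tag{1.3.10} \]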
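\[ \Psi^{-1}(\Theta_{n-1}, \Theta_n) = (\tau^n, s_n), \]
whence
\[ \bar\tau \Psi^{-1}(\Theta_{n-1}, \Theta_n) = (\tau^{n+1}, s_{n+1}), \quad n \in \mathbf{N}_+ . \]
Therefore, by (1.3.7) again,
we obtain
\[ \Theta_{n-1} = \Theta_{n+1} + a_{n+1} \sqrt{1 - 4 \Theta_n \Theta_{n+1}} - a_{n+1}^2 \Theta_n, \quad n \in \mathbf{N}_+ . \tag{1.3.12′} \]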
if and only if $(\Theta_{n-1}, \Theta_n) \in V_i^*$, that is, by (1.3.7), if and only if $a_{n+1} = i$.
Finally, note that
\[ \Theta_{n-1}(\xi_i^*) \ne \Theta_n(\xi_i^*) \tag{1.3.16} \]
for any $i, n \in \mathbf{N}_+$.
Now, on account of (1.3.14) through (1.3.16) we can state the following result.
Theorem 1.3.2 For any $\omega \in \Omega$ and $n \in \mathbf{N}_+$ we have
\[ \min(\Theta_{n-1}, \Theta_n, \Theta_{n+1}) < \frac{1}{\sqrt{a_{n+1}^2 + 4}} \tag{1.3.17} \]
and
\[ \max(\Theta_{n-1}, \Theta_n, \Theta_{n+1}) > \frac{1}{\sqrt{a_{n+1}^2 + 4}} . \tag{1.3.18} \]
with
a1 (ω, θ) = a1 (ω) , (ω, θ) ∈ Ω2 .
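Clearly, by (1.3.1′) and (1.3.2′) we have
\[ \bar\gamma\left([0, x] \times I \mid \bar a_0, \bar a_{-1}, \cdots\right) = \frac{(a+1) x}{ax + 1} \quad \bar\gamma\text{-a.s.}, \]
\[ \frac{\bar\gamma([0, x] \times I_n)}{\bar\gamma(I \times I_n)} = \frac{(\log 2)^{-1} \displaystyle\int_{I_n} dy \int_0^x \frac{du}{(uy+1)^2}}{\gamma(I_n)} = \frac{(\log 2)^{-1} \displaystyle\int_{I_n} \frac{x (y+1)}{xy + 1} \frac{dy}{y+1}}{\gamma(I_n)} = \frac{\displaystyle\int_{I_n} \frac{x (y+1)}{xy+1}\, \gamma(dy)}{\gamma(I_n)} = \frac{x (y_n + 1)}{x y_n + 1} \]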
γ (a`+1 = i| a` , a`−1 , · · · ), i ∈ N+ ,
1
s` = [ a` , a`−1 , · · · ] , y` = ,
s̄`
r` = r0 ◦ τ ` , u` = s0 ◦ τ `−1
+ r1 ◦ τ `−1
, ` ∈ Z.
It follows from the above equations, Theorem 1.3.1, and Corollary 1.3.6 that $(\bar s_\ell)_{\ell \in \mathbf{Z}}$ is a strictly stationary Ω-valued Markov process on $(I^2, \mathcal{B}_I^2, \bar\gamma)$ with the following transition mechanism: from state $s \in \Omega$ the possible transitions are to any state $1/(s+i)$ with corresponding transition probability $P_i(s)$, $i \in \mathbf{N}_+$. Clearly, for any $\ell \in \mathbf{Z}$ we have
Similar considerations can be made about the process $(\bar y_\ell)_{\ell \in \mathbf{Z}}$. This is a strictly stationary Ω′-valued Markov process on $(I^2, \mathcal{B}_I^2, \bar\gamma)$, where Ω′ = the set of irrationals in $[1, \infty)$. The transition mechanism of $(\bar y_\ell)_{\ell \in \mathbf{Z}}$ is as follows: from state $y \in \Omega'$ the only possible transitions are to any state $y^{-1} + i$ with corresponding transition probability $P_i(1/y)$, $i \in \mathbf{N}_+$. For any $\ell \in \mathbf{Z}$ we have
\[ \bar\gamma(\bar y_\ell < x) = \bar\gamma(\bar y_0 < x) = \gamma([x^{-1}, 1]) = \gamma'([1, x]), \quad x \in [1, \infty), \]
lowed by state
µ ¶
−1 1 −1 −1
τ (s, ω) = , ω − bω c .
s + bω −1 c
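For any $\ell \in \mathbf{Z}$ we have
\[ \bar\gamma\left(\bar s_{\ell-1} < x, \ \bar r_\ell^{-1} < y\right) = \bar\gamma\left(\bar s_0 < x, \ \bar r_1^{-1} < y\right) = \bar\gamma([0, y] \times [0, x]) = \frac{1}{\log 2} \int_0^y \int_0^x \frac{du\,dv}{(uv+1)^2} = \frac{\log(xy+1)}{\log 2}, \quad x, y \in I. \]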
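The process $(\bar u_\ell)_{\ell \in \mathbf{Z}}$, which is a functional of $(\bar s_{\ell-1}, \bar r_\ell^{-1})_{\ell \in \mathbf{Z}}$ (note that $\bar u_\ell = \bar s_{\ell-1} + \bar r_\ell$, $\ell \in \mathbf{Z}$), is no longer Markovian but is still a strictly stationary one. For any $\ell \in \mathbf{Z}$ we have
\[ \bar\gamma(\bar u_\ell < x) = \bar\gamma(\bar u_1 < x) = \bar\gamma(\bar s_0 + \bar r_1 < x) = \frac{1}{\log 2} \iint_D \frac{du\,dv}{(uv+1)^2}, \quad x \in [1, \infty), \]
where $D = \left( (u, v) \in I^2 : u + v^{-1} < x \right)$. Hence
\[ \bar\gamma(\bar u_\ell < x) = \begin{cases} \dfrac{1}{\log 2} \left( \log x - \dfrac{x-1}{x} \right) & \text{if } 1 \le x \le 2, \\[3mm] \dfrac{1}{\log 2} \left( \log 2 - \dfrac{1}{x} \right) & \text{if } x \ge 2. \end{cases} \]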
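\[ \gamma_a([0, x]) = \frac{(a+1) x}{ax + 1}, \quad x \in I, \ a \in I. \]
In particular, $\gamma_0 = \lambda$. The density $h_a$ of $\gamma_a$ is
\[ h_a(x) = \frac{a+1}{(ax+1)^2}, \quad x \in I, \ a \in I, \]
\[ \sup_{A \in \mathcal{B}_I} |\gamma_a(A) - \gamma_b(A)| \le \frac{1}{4} |b - a|, \quad a, b \in I. \tag{1.3.19} \]
\[ \sup_{x \in I} |\gamma_a([0, x]) - \gamma_b([0, x])| = \frac{\alpha (1 - \alpha) |b - a|}{(1 + \alpha a)(1 + \alpha b)}, \quad a, b \in I. \]
\[ s_n^a = \frac{1}{s_{n-1}^a + a_n}, \quad n \in \mathbf{N}_+ . \tag{1.3.20} \]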
It follows from the properties just described of the process (s` )`∈Z that
the sequence (san )n∈N+ is an I-valued Markov chain on (I, BI , γa ) which
starts at sa0 = a and has the following transition mechanism: from state
\[ \gamma_a(\tau^n < x \mid a_1, \cdots, a_n) = \frac{(s_n^a + 1) x}{s_n^a x + 1}, \quad x \in I. \tag{1.3.21} \]
On the one hand, it follows from Theorems 1.3.1 and 1.3.5 (see also
Remark 1 after Corollary 1.3.6) that the conditional probability (1.3.22) is
γ-a.s. equal to
\[ \frac{(s_n^a + 1) x}{s_n^a x + 1} . \]
On the other hand, putting
γ̄a ( · ) = γ ( · | a0 , a−1 , · · · ) ,
whatever the set A belonging to the σ-algebra generated by the random vari-
ables an+1 , an+2 , · · · , that is, τ −n (BI ).
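\[ \gamma_a(r_1 < x) = 1 - \frac{a+1}{x+a}, \]
\[ \gamma_a(u_1^a < x) = \begin{cases} 0 & \text{if } x \le a + 1, \\ 1 - \dfrac{a+1}{x} & \text{if } x > a + 1, \end{cases} \]
\[ \gamma_a(r_{n+1} < x \mid a_1, \ldots, a_n) = 1 - \frac{s_n^a + 1}{x + s_n^a}, \]
\[ \gamma_a(u_{n+1}^a < x \mid a_1, \ldots, a_n) = \begin{cases} 0 & \text{if } x \le s_n^a + 1, \\ 1 - \dfrac{s_n^a + 1}{x} & \text{if } x > s_n^a + 1. \end{cases} \]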
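Corollary 1.3.11 For any $a \in I$ and $n \in \mathbf{N}_+$ let $G_n^a(s) = \gamma_a(s_n^a < s)$, $s \in \mathbf{R}$, $G_0^a(s) = 0$ or 1 according as $s \le a$ or $s > a$. For any $a \in I$, $n \in \mathbf{N}_+$, and $x \ge 1$ we have
\[ \gamma_a(r_n < x) = \int_0^1 \frac{x-1}{s+x}\, dG_{n-1}^a(s) = (x-1) \left( \frac{1}{x+1} + \int_0^1 \frac{G_{n-1}^a(s)\, ds}{(s+x)^2} \right), \]
\[ \gamma_a(u_n^a < x) = \begin{cases} \displaystyle\int_0^{x-1} \left( 1 - \frac{s+1}{x} \right) dG_{n-1}^a(s) & \text{if } 1 \le x \le 2, \\[3mm] \displaystyle\int_0^1 \left( 1 - \frac{s+1}{x} \right) dG_{n-1}^a(s) & \text{if } x > 2 \end{cases} \]
\[ = \frac{1}{x} \int_0^{x-1} G_{n-1}^a(s)\, ds. \]
and
\[ G_{n+1}^a\left( \frac{1}{m} \right) = F_n^a\left( \frac{1}{m} \right), \quad m, n \in \mathbf{N}_+, \ a \in I. \tag{1.3.27} \]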
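while
\[ G_1^a\left( \frac{1}{m} \right) - G_1^a\left( \frac{1}{m + \theta} \right) = \gamma_a\left( \frac{1}{m + \theta} \le \frac{1}{a_1 + a} < \frac{1}{m} \right) = \int_{0+}^{\theta} P_m(s)\, dG_0^a(s), \tag{1.3.28′} \]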
we obtain
\[ |F_n^a(x) - G(x)| \le \alpha_n^a \int_0^1 \frac{x (1 - x)}{(xs + 1)^2}\, ds = \frac{x (1 - x)}{x + 1}\, \alpha_n^a, \]
hence
\[ |F_n^a(x) - G(x)| \le (3 - 2\sqrt{2})\, \alpha_n^a \tag{1.3.30} \]
for any $a, x \in I$ and $n \in \mathbf{N}$. Let us note that
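where
\[ \beta(m, \theta) = \int_0^{\theta} \left| \frac{dP_m(s)}{ds} \right| ds + P_m(\theta). \]
It is easy to check that $\beta(m, \theta) \le 1/2$ for any $m \in \mathbf{N}_+$ and $\theta \in [0, 1)$. Actually,
\[ \beta(m, \theta) = \begin{cases} 1/2 & \text{if } m = 1, \\ 4/(3 + \theta) - 2/(2 + \theta) - 1/6 & \text{if } m = 2 \text{ and } \theta \le \sqrt{2} - 1, \\ 6 - 4\sqrt{2} - 1/6 & \text{if } m = 2 \text{ and } \theta \ge \sqrt{2} - 1, \\ 2 P_m(\theta) - 1/m(m+1) & \text{if } m \ge 3. \end{cases} \]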
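Hence
\[ \alpha_{n+1}^a = \sup_{m \in \mathbf{N}_+, \ \theta \in [0,1)} \left| G_{n+1}^a\left( \frac{1}{m + \theta} \right) - G\left( \frac{1}{m + \theta} \right) \right| \le (3.5 - 2\sqrt{2})\, \alpha_n^a \tag{1.3.31} \]
for any $a \in I$ and $n \in \mathbf{N}_+$.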
Finally, by (1.3.27), (1.3.270 ), and (1.3.280 ),
µ ¶ µ ¶
1 1 1
G01 = G01 =
m+θ m m+1
and
µ ¶ µ ¶ Z θ
1 1
Ga1 = Ga1 − Pm (s)dGa0 (s)
m+θ m 0
µ ¶
1
F0a − Pm (a) if 0 ≤ θ ≤ a,
m
= µ ¶
1
a
F0 if θ > a
m
a+1
if 0 ≤ θ ≤ a,
a+m+1
=
a+1
if θ > a
a+m
for any a ∈ (0, 1], θ ∈ [0, 1), and m ∈ N+ . It is easy to see that
¯ µ ¶ µ ¶¯
¯ a 1 1 ¯ 1
a
α1 = sup ¯G1 − G ¯ ≤ , a ∈ I. (1.3.32)
¯ m+θ m+θ ¯ 2
m∈N+ , θ∈[0,1)
where the supremum is taken over all A ∈ B1k and B ∈ Bk+n∞ such that
µ (A) µ (B) 6= 0, and k ∈ N+ .
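Define
\[ \varepsilon_n = \sup \left| \frac{\gamma_a(B)}{\gamma(B)} - 1 \right|, \quad n \in \mathbf{N}_+, \]
where the supremum is taken over all $a \in I$ and $B \in \mathcal{B}_n^\infty$ with $\gamma(B) > 0$. Note that the sequence $(\varepsilon_n)_{n \in \mathbf{N}_+}$ is non-increasing since $\mathcal{B}_{n+1}^\infty \subset \mathcal{B}_n^\infty$ for any $n \in \mathbf{N}_+$. We shall show that $\varepsilon_n$ can be expressed in terms of $F_{n-1}^a$, $a \in I$, and G, namely, $\varepsilon_n = \varepsilon_n'$ with
\[ \varepsilon_n' = \sup_{a, x \in I} \left| \frac{dF_{n-1}^a(x)/dx}{g(x)} - 1 \right|, \quad n \in \mathbf{N}_+, \]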
As
\[ 1 \le \frac{(a+1)(x+1)}{(ax+1)^2} \le 2, \quad a, x \in I, \]
it follows that
\[ \varepsilon_1 = 2 \log 2 - 1 = 0.38629 \cdots . \]
\[ = \sum_{i \in \mathbf{N}_+} \frac{(a+1) x}{(x + a + i)(a + i)}, \quad a, x \in I. \]
Then
\[ \varepsilon_2 = \sup_{a, x \in I} \left| \frac{dF_1^a(x)/dx}{g(x)} - 1 \right| = \sup_{a, x \in I} \left| (\log 2)(a+1)(x+1) \sum_{i \in \mathbf{N}_+} \frac{1}{(x + a + i)^2} - 1 \right| . \]
Hence
\[ \varepsilon_2 = \max\left( \zeta(2) \log 2 - 1, \ 1 - 2 (\zeta(2) - 1) \log 2 \right) \]
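But
\[ \int_0^1 \frac{|x (s+2) - 1|}{(xs+1)^3}\, ds = \begin{cases} \displaystyle\int_0^1 \frac{1 - x (s+2)}{(xs+1)^3}\, ds & \text{if } 0 \le x \le \tfrac{1}{3}, \\[3mm] \displaystyle\int_0^{(1-2x)/x} \frac{1 - x (s+2)}{(xs+1)^3}\, ds - \int_{(1-2x)/x}^1 \frac{1 - x (s+2)}{(xs+1)^3}\, ds & \text{if } \tfrac{1}{3} \le x \le \tfrac{1}{2}, \\[3mm] \displaystyle\int_0^1 \frac{x (s+2) - 1}{(xs+1)^3}\, ds & \text{if } \tfrac{1}{2} \le x \le 1 \end{cases} \]
\[ = \begin{cases} 2 (x+1)^{-2} - 1 & \text{if } 0 \le x \le \tfrac{1}{3}, \\ -2 (x+1)^{-2} - 1 + (2x (1-x))^{-1} & \text{if } \tfrac{1}{3} \le x \le \tfrac{1}{2}, \\ 1 - 2 (x+1)^{-2} & \text{if } \tfrac{1}{2} \le x \le 1 \end{cases} \]
and
\[ (x+1) \int_0^1 \frac{|x (s+2) - 1|}{(xs+1)^3}\, ds = \begin{cases} 2 (x+1)^{-1} - (x+1) & \text{if } 0 \le x \le \tfrac{1}{3}, \\ -2 (x+1)^{-1} - (x+1) + (x+1)(2x (1-x))^{-1} & \text{if } \tfrac{1}{3} \le x \le \tfrac{1}{2}, \\ x + 1 - 2 (x+1)^{-1} & \text{if } \tfrac{1}{2} \le x \le 1 \end{cases} \ \le 1. \]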
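Therefore
\[ \sup_{a, x \in I} \left| \frac{dF_n^a(x)/dx}{g(x)} - 1 \right| \le (\log 2) \sup_{a, s \in I} |G_n^a(s) - G(s)|, \quad n \in \mathbf{N}. \]
Then
\[ \varepsilon_1' = \varepsilon_1 \le \log 2 \]
and, by Theorem 1.3.12,
\[ \varepsilon_{n+1}' = \varepsilon_{n+1} \le \frac{1}{2} (\log 2)\, c^{n-1}, \quad n \in \mathbf{N}_+ . \]
□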
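Theorem 1.3.14 For any $a \in I$ we have
\[ \psi_{\gamma_a}(n) \le \frac{\varepsilon_n + \varepsilon_{n+1}}{1 - \varepsilon_{n+1}}, \quad n \in \mathbf{N}_+ . \tag{1.3.33} \]
Also,
\[ \psi_\gamma(n) = \varepsilon_n, \quad n \in \mathbf{N}_+ . \tag{1.3.34} \]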
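where the supremum is taken over all $B \in \mathcal{B}_{k+n}^\infty$ with $\gamma(B) > 0$, $i^{(k)} \in \mathbf{N}_+^k$, and $k \in \mathbf{N}$. For arbitrarily given $k, \ell, n \in \mathbf{N}_+$, $i^{(k)} \in \mathbf{N}_+^k$, and $j^{(\ell)} \in \mathbf{N}_+^\ell$ put
\[ A = I(i^{(k)}), \qquad B = \left( (a_{k+n}, \cdots, a_{k+n+\ell-1}) = j^{(\ell)} \right) \]
and note that $\gamma_a(A)\, \gamma_a(B) \ne 0$ for any $a \in I$. By (1.3.35) we have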
and
whence
we obtain ψγ (n) ≤ εn , n ∈ N+ .
To prove the converse inequality remark that the ψ-mixing coefficients
under the extended Gauss measure γ̄ of the doubly infinite sequence (ā` )`∈Z
of extended incomplete quotients, are equal to the corresponding ψ-mixing
coefficients under γ of (an )n∈N+ . This is obvious by the very definitions of
(ā` )`∈Z and ψ-mixing coefficients. See Subsection 1.3.3 and Section A3.1.
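As $(\bar a_\ell)_{\ell \in \mathbf{Z}}$ is strictly stationary under $\bar\gamma$, we have
\[ \psi_\gamma(n) = \psi_{\bar\gamma}(n) = \sup \left| \frac{\bar\gamma(\bar A \cap \bar B)}{\bar\gamma(\bar A)\, \bar\gamma(\bar B)} - 1 \right|, \quad n \in \mathbf{N}_+, \]
where the upper bound is taken over all $\bar A \in \sigma(\bar a_n, \bar a_{n+1}, \cdots)$ and $\bar B \in \sigma(\bar a_0, \bar a_{-1}, \cdots)$ for which $\bar\gamma(\bar A)\, \bar\gamma(\bar B) \ne 0$. Clearly, $\bar A = A \times I$ and $\bar B = I \times B$, with $A \in \mathcal{B}_n^\infty = \tau^{-n+1}(\mathcal{B}_I)$ and $B \in \mathcal{B}_I$. Then
\[ \psi_\gamma(n) = \sup_{\substack{A \in \tau^{-n+1}(\mathcal{B}_I), \ B \in \mathcal{B}_I \\ \gamma(A) \gamma(B) \ne 0}} \left| \frac{\bar\gamma(A \times B)}{\gamma(A)\, \gamma(B)} - 1 \right|, \quad n \in \mathbf{N}_+ . \tag{1.3.38} \]
for any $A, B \in \mathcal{B}_I$. It then follows from (1.3.38) and the very definition of $\varepsilon_n$ that
\[ \psi_\gamma(n) \ge \sup_{\substack{b \in I, \ A \in \tau^{-n+1}(\mathcal{B}_I) \\ \gamma(A) \ne 0}} \left| \frac{\gamma_b(A)}{\gamma(A)} - 1 \right| = \varepsilon_n, \quad n \in \mathbf{N}_+ . \]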
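Corollary 1.3.15 The sequence $(a_n)_{n \in \mathbf{N}_+}$ is ψ-mixing under γ and any $\gamma_a$, $a \in I$. For any $a \in I$ we have $\psi_{\gamma_a}(1) \le (\varepsilon_1 + \varepsilon_2)/(1 - \varepsilon_2) = 0.61231 \cdots$ and
\[ \psi_{\gamma_a}(n) \le \frac{(\log 2)\, c^{n-2} (1 + c)}{2 - (\log 2)\, c^{n-1}}, \quad n \ge 2. \]
Also, $\psi_\gamma(1) = 2 \log 2 - 1 = 0.38629 \cdots$, $\psi_\gamma(2) = \zeta(2) \log 2 - 1 = 0.14018 \cdots$ and
\[ \psi_\gamma(n) \le \frac{1}{2} (\log 2)\, c^{n-2}, \quad n \ge 3. \]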
The doubly infinite sequence (ā` )`∈Z of extended incomplete quotients is
ψ-mixing under the extended Gauss measure γ̄, and its ψ-mixing coefficients
are equal to the corresponding ψ-mixing coefficients under γ of (an )n∈N+ .
The proof follows from Proposition 1.3.13 and Theorem 1.3.14. As al-
ready noted, the last assertion is obvious by the very definitions of (ā` )`∈Z
and ψ-mixing coefficients. □
Remark. The above result will be improved in Chapter 2. See Proposition 2.3.7. □
Proposition 1.3.16 (F. Bernstein’s theorem) Let $(c_n)_{n \in \mathbf{N}_+}$ be a sequence of positive numbers. The random event $(a_n \ge c_n)$ occurs infinitely often with γ-probability 0 or 1, according as the series $\sum_{n \in \mathbf{N}_+} 1/c_n$ converges or diverges. In other words, $\gamma(a_n \ge c_n \text{ i.o.})$ is either 0 or 1 according as the series $\sum_{n \in \mathbf{N}_+} 1/c_n$ converges or diverges.
Proof. We can clearly assume that cn ≥ 1, n ∈ N+ . Let En = (an ≥
cn ), n ∈ N+ . By (1.2.9) we have
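Assume now that $\sum_{n \in \mathbf{N}_+} 1/c_n$ diverges. It follows from Theorem 1.3.14 that for any $k, n \in \mathbf{N}_+$ such that $k \le n$ we have
therefore
\[ \gamma\left( E_{n+1}^c \mid E_k^c \cap \cdots \cap E_n^c \right) \le 1 - \frac{1 - \varepsilon_1}{2 c_{n+1}} \]
for any $k, n \in \mathbf{N}_+$ such that $k \le n$.
It follows that for any $k, m \in \mathbf{N}_+$ we have
\[ \gamma\left( E_k^c \cap \cdots \cap E_{k+m}^c \right) \le \prod_{i=0}^{m} \left( 1 - \frac{1 - \varepsilon_1}{2 c_{k+i}} \right), \]
whence
\[ \gamma\left( E_k^c \cap E_{k+1}^c \cap \cdots \right) \le \lim_{m \to \infty} \prod_{i=0}^{m} \left( 1 - \frac{1 - \varepsilon_1}{2 c_{k+i}} \right) = 0 \]
since $\sum_{n \in \mathbf{N}_+} 1/c_n$ diverges.
Finally,
\[ = 1 - \lim_{k \to \infty} \gamma\left( E_k^c \cap E_{k+1}^c \cap \cdots \right) = 1. \]
□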
In Chapter 3 we shall need the following result.
Corollary 1.3.17 Let $b_n$, $n \in \mathbf{N}_+$, be real-valued random variables on $(I, \mathcal{B}_I)$ such that $a_n \le b_n \le a_n + c$, $n \in \mathbf{N}_+$, for some $c \in \mathbf{R}_+$. Let $(c_n)_{n \in \mathbf{N}_+}$ be a sequence of positive numbers. Then $\gamma(b_n \ge c_n \text{ i.o.})$ is either 0 or 1 according as the series $\sum_{n \in \mathbf{N}_+} 1/c_n$ converges or diverges.
Proof. Clearly,
\[ s(f) := \sup_{x' \ne x''} \frac{|f(x') - f(x'')|}{|x' - x''|} < \infty . \]
54 Chapter 2
|| f || L = || f || + s (f ) , f ∈ L (I) .
Clearly,
C 1 (I) ⊂ L (I) ⊂ C (I) ⊂ B (I) .
The variation $\mathrm{var}_A f$ over $A \subset I$ of a function $f : I \to \mathbf{C}$ is defined as
\[ \sup \sum_{i=1}^{k-1} |f(t_i) - f(t_{i-1})|, \]
(Note that the value of the integral is the same for all functions in an equivalence class.)
To define $L_\mu^\infty$ we should first define the µ-essential supremum. For a measurable function $f : I \to \mathbf{R}$, its µ-essential supremum, which is denoted µ-ess sup f, is defined as
\[ \inf \{ a \in \mathbf{R} : \mu(f > a) = 0 \} . \]
(Note that the value of the essential supremum is the same for all functions in an equivalence class.) Clearly, $L_\mu^\infty \subset L_\mu^p$ for any $p \ge 1$.
The special case p = 2 is an important one: $L_\mu^2$ can be also considered as a Hilbert space with inner product $(\cdot, \cdot)_\mu$ defined by
\[ (f, g)_\mu = \int_I f g^*\, d\mu, \quad f, g \in L_\mu^2 . \]
In the case where µ = λ we simply write $L^p$, $\|f\|_p$, $L^\infty$, $\|f\|_\infty$, and ess sup f instead of $L_\lambda^p$, $\|f\|_{p,\lambda}$, $L_\lambda^\infty$, $\|f\|_{\infty,\lambda}$, and λ-ess sup f, respectively.
or, equivalently,
\[ \int_I g P_\mu f\, d\mu = \int_I (g \circ \tau) f\, d\mu \tag{2.1.2} \]
(iii) $\|P_\mu\|_{p,\mu} := \sup \left( \|P_\mu f\|_{p,\mu} : f \in L_\mu^p, \ \|f\|_{p,\mu} = 1 \right) \le 1$ for any $p \ge 1$ and $p = \infty$;
(iv) for any $n \in \mathbf{N}_+$ the nth power $P_\mu^n$ of $P_\mu$ is the Perron–Frobenius operator of the nth iterate $\tau^n$ of τ under µ;
(v) $(P_\mu f)^* = P_\mu f^*$ for any $f \in L_\mu^1$;
(vi) $P_\mu((g \circ \tau) f) = g P_\mu f$ for any $f \in L_\mu^1$ and $g \in L_\mu^\infty$ and for any $f \in L_\mu^p$ and $g \in L_\mu^q$ with $p > 1$ and $q = p/(p-1)$;
\[ P_{\bar\gamma}^n f = f \circ \bar\tau^{-n} \]
a.e. in $I^2$ for any $n \in \mathbf{N}_+$.
We should, however, note that the Perron–Frobenius operator of an in-
vertible transformation, like τ̄ , is not of great value for deriving asymptotic
properties of its nth power as n → ∞. For an interesting discussion of the
Perron–Frobenius operator of τ̄ in connection with the time evolution of
certain spatially homogeneous cosmologies (‘mixmaster universe’), we refer
the reader to Mayer (1987). □
Proposition 2.1.2 The Perron–Frobenius operator $P_\gamma := U$ of τ under γ is given a.e. in I by the equation
\[ U f(x) = \sum_{i \in \mathbf{N}_+} P_i(x)\, f\left( \frac{1}{x+i} \right), \quad f \in L_\gamma^1 . \tag{2.1.3} \]
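\[ P^n_\mu f(x) = \frac{U^n g(x)}{(x+1)\,h(x)}, \qquad f \in L^1_\mu,\ n \in \mathbb{N}_+. \tag{2.1.7} \]
\[ = \frac{1}{(x+1)\,h(x)} \sum_{i \in \mathbb{N}_+} P_i(x)\, U^n g\!\left(\frac{1}{x+i}\right) = \frac{U^{n+1} g(x)}{(x+1)\,h(x)} \quad \text{a.e. in } I, \]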
which is obviously true. Assume that (2.1.8) holds for some $n \in \mathbb{N}$. Then
\[ \mu\bigl(\tau^{-(n+1)}(A)\bigr) = \mu\bigl(\tau^{-n}(\tau^{-1}(A))\bigr) = \int_{\tau^{-1}(A)} \frac{U^n f(x)}{x+1}\,dx = (\log 2) \int_{\tau^{-1}(A)} U^n f \, d\gamma. \]
Therefore
\[ \mu\bigl(\tau^{-(n+1)}(A)\bigr) = (\log 2) \int_A U^{n+1} f \, d\gamma = \int_A \frac{U^{n+1} f(x)}{x+1}\,dx, \]
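\[ T_0 = P_\lambda - \Pi_1. \]
Hence
\[ \Pi_1^2 = \Pi_1, \qquad P_\lambda \Pi_1 = \Pi_1 P_\lambda = \Pi_1, \qquad T_0 \Pi_1 = \Pi_1 T_0 = 0. \tag{2.1.9} \]
\[ T U^\infty = U^\infty T = 0. \tag{2.1.14} \]
\[ U^n = U^\infty + T^n, \qquad n \in \mathbb{N}_+. \]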
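we can write
\[ S_2 = -\sum_{i \in \mathbb{N}_+} \left( f\!\left(\frac{1}{x+1}\right) - f\!\left(\frac{1}{x+i}\right) \right) \bigl(P_i(y) - P_i(x)\bigr). \]
As
\[ P_1(0) = \sum_{i \in \mathbb{N}_+} P_{i+1}(0) = \frac12 \]
and
\[ f\!\left(\frac{1}{i+1}\right) \ge f(0), \qquad i \in \mathbb{N}_+, \]
we finally obtain
\[ \operatorname{var} U f \le \frac12 \bigl(f(1) - f(0)\bigr) = \frac12 \operatorname{var} f. \]
Since for $f$ defined by $f(x) = 0$, $0 \le x < 1$, and $f(1) = 1$ we have
$\operatorname{var} U f = (\operatorname{var} f)/2$, it follows that the constant 1/2 cannot be lowered. □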
Corollary 2.1.13 If $f \in BV(I)$ is real-valued, then
\[ \operatorname{var} U f \le \frac12 \operatorname{var} f. \]
The constant 1/2 cannot be lowered.
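The bound can be illustrated numerically. The following is a minimal sketch, assuming the weights P_i(x) = (x + 1)/((x + i)(x + i + 1)) (an assumption here, since their definition is not reproduced in this passage) and using the hypothetical test function f(x) = x², for which var f = 1. The variation of U f is estimated over a uniform grid, which only gives a lower bound for the true variation.

```python
def P(i, x):
    # Assumed weight P_i(x) = (x + 1) / ((x + i)(x + i + 1)); labeled as an assumption.
    return (x + 1.0) / ((x + i) * (x + i + 1.0))

def U(f, x, terms=2000):
    # Truncated series U f(x) = sum_i P_i(x) f(1/(x + i)).
    return sum(P(i, x) * f(1.0 / (x + i)) for i in range(1, terms + 1))

def grid_variation(g, points=200):
    # Variation of g along a uniform grid on [0, 1] (a lower bound for var g).
    values = [g(k / points) for k in range(points + 1)]
    return sum(abs(b - a) for a, b in zip(values, values[1:]))

f = lambda x: x * x          # non-decreasing, so var f = f(1) - f(0) = 1
var_Uf = grid_variation(lambda x: U(f, x))
print(var_Uf)                # expected to stay below var f / 2 = 0.5
```

The grid estimate is at least |U f(0) − U f(1)| and, by the corollary, should not exceed 1/2.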
Proof. By Hahn’s decomposition of a signed measure, for any f ∈ BV (I)
there exist monotone functions f1 , f2 ∈ B (I) such that f = f1 − f2 and
var f = var f1 + var f2 . [To obtain this consider the signed measure µ on
BI defined by µ ((a, b]) = f (b) − f (a), a < b, a, b ∈ I.] Then by Proposition
2.1.12 we have
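Hence
\[ \sum_{i \in \mathbb{N}_+} \frac{P_i(y) - P_i(x)}{y - x}\, f\!\left(\frac{1}{x+i}\right) = \sum_{i \in \mathbb{N}_+} \frac{i}{(x+i+1)(y+i+1)} \left( f\!\left(\frac{1}{x+i+1}\right) - f\!\left(\frac{1}{x+i}\right) \right). \tag{2.1.19} \]
Assume that $x > y$. It then follows from (2.1.18) and (2.1.19) that
\[ \left| \frac{U f(y) - U f(x)}{y - x} \right| \le s(f) \sum_{i \in \mathbb{N}_+} \left( \frac{P_i(y)}{(y+i)^2} + \frac{i}{(y+i)(y+i+1)^3} \right). \]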
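\begin{align*}
&\le \sum_{i \in \mathbb{N}_+} \left( \frac{1}{i^3 (i+1)} + \frac{1}{(i+1)^3} \right) \\
&= \sum_{i \in \mathbb{N}_+} \left( \frac{1}{i^3} - \frac{1}{i^2} + \frac{1}{i} - \frac{1}{i+1} + \frac{1}{(i+1)^3} \right) \\
&= \zeta(3) - \zeta(2) + 1 + \zeta(3) - 1 = 2\zeta(3) - \zeta(2).
\end{align*}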
As clearly
\[ \sup_{x, y \in I,\ x > y} \left| \frac{U f(y) - U f(x)}{y - x} \right| = s(U f), \]
we obtain (2.1.17).
Finally, it is easy to check that for $f(x) = x$, $x \in I$, we have $s(f) = 1$
and $s(U f) = 2\zeta(3) - \zeta(2)$. The proof is complete. □
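As a quick arithmetic check of the constant appearing above, the sketch below sums the series term by term and compares it with the closed form 2ζ(3) − ζ(2). The numerical value of ζ(3) is hard-coded as a known constant; the function name and the truncation level are illustrative choices, not part of the text.

```python
import math

ZETA3 = 1.2020569031595942   # Apery's constant zeta(3), quoted to double precision
ZETA2 = math.pi ** 2 / 6     # zeta(2)

def lipschitz_series(terms=200_000):
    # Partial sum of sum_i ( 1/(i^3 (i+1)) + 1/(i+1)^3 ).
    return sum(1.0 / (i ** 3 * (i + 1)) + 1.0 / (i + 1) ** 3
               for i in range(1, terms + 1))

closed_form = 2 * ZETA3 - ZETA2
print(lipschitz_series(), closed_form)
```

Both numbers should agree to many digits, at roughly 0.75918.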
Proof. Equations (2.1.19) and (2.1.18) show that for $f \in C^1(I)$ the series
defining $U f$ can be differentiated term by term since the series of the derivatives
is uniformly convergent, it being dominated by a convergent series of
positive constants (cf. further Subsection 2.2.1). Then (2.1.20) follows from
(2.1.17) since for any $f \in C^1(I)$ we have $s(f) = \|f'\|$. □
Proof. The result follows from Corollary 2.1.13 and Propositions 2.1.14
and 2.1.15, having in view that U preserves the constant functions. In the
case of BV (I) we should note that for a complex-valued f ∈ BV (I) we have
\[ \|U^n f_0 - U^\infty f_0\|_v \le C_0\, q^n\, \|f_0\|_v, \qquad n \in \mathbb{N}_+. \]
□
Remark. As for $q$, we conjecture that its (optimal) value is $g = (3 - \sqrt{5})/2 = 0.38196\cdots$, as in a further related result, namely, Corollary 2.5.7. □
In the next three sections we will take up Gauss’ problem assuming that
$F_0' = d\mu/d\lambda$ belongs to Banach spaces ‘smaller’ than BEV(I).
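where
\[ H_1 = H(\cdots, \bar a_{-2}, \bar a_{-1}, \bar a_0, \bar a_1, \bar a_2, \cdots). \]
Clearly $(H_\ell)_{\ell \in \mathbb{Z}}$ is a strictly stationary process on $(I^2, \mathcal{B}_I^2, \bar\gamma)$. Set $S_0 = 0$, $S_n = \sum_{i=1}^n H_i$, $n \in \mathbb{N}_+$. We start with some well known results.
Theorem 2.1.18 If $E_{\bar\gamma} H_1^2 < \infty$, $E_{\bar\gamma} H_1 = 0$, and $\lim_{n\to\infty} E_{\bar\gamma} H_1 H_n = 0$,
then the finite or infinite limit $\lim_{n\to\infty} E_{\bar\gamma} S_n^2$ exists. We have $\lim_{n\to\infty} E_{\bar\gamma} S_n^2 < \infty$
if and only if there exists $g \in L^2_{\bar\gamma}(I^2)$ such that $H_1 = g \circ \bar\tau - g$ a.e. in $I^2$.
This is a special case of Theorem 18.2.2 in Ibragimov and Linnik (1971).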
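Proposition 2.1.19 If $E_{\bar\gamma} H_1^2 < \infty$, $E_{\bar\gamma} H_1 = 0$, and the series
\[ \sigma^2 = E_{\bar\gamma} H_1^2 + 2 \sum_{n \in \mathbb{N}_+} E_{\bar\gamma} H_1 H_{n+1} \tag{2.1.26} \]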
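Therefore
\[ \frac{1}{n} \left| E_{\bar\gamma} S_n^2 - n\sigma^2 \right| \le 2 \left( \frac{\sum_{j=1}^{n-1} j\, |E_{\bar\gamma} H_1 H_{j+1}|}{n} + \sum_{j \ge n} |E_{\bar\gamma} H_1 H_{j+1}| \right), \]
and the right hand side is $o(1)$ as $n \to \infty$ when $\sum_{n \in \mathbb{N}_+} |E_{\bar\gamma} H_1 H_{n+1}| < \infty$
(note that $\sum_{n \in \mathbb{N}_+} |u_n| < \infty$ implies $\lim_{n\to\infty} \sum_{j=1}^{n} j |u_j| / n = 0$), so that
(2.1.27) holds. Finally, since
\[ \frac{\sum_{j=1}^{n-1} j\, |E_{\bar\gamma} H_1 H_{j+1}|}{n} + \sum_{j \ge n} |E_{\bar\gamma} H_1 H_{j+1}| \le \frac{\sum_{j \in \mathbb{N}_+} j\, |E_{\bar\gamma} H_1 H_{j+1}|}{n}, \]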
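where the upper bound is taken over all $(\omega, \theta), (\omega', \theta') \in I^2(i_{-n}, \cdots, i_n)$ and
$i_k \in \mathbb{N}_+$, $-n \le k \le n$. Assume that $E_{\bar\gamma} H_1^2 = \int_{I^2} h^2 \, d\bar\gamma < \infty$ and $\sum_{n \in \mathbb{N}_+} c_n < \infty$.
Then (2.1.29) holds.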
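Proof. For any $n \in \mathbb{N}_+$ we have
\begin{align*}
&= \sum_{i_{-n},\cdots,i_n \in \mathbb{N}_+} \int_{I^2(i_{-n},\cdots,i_n)} \left( h(\omega', \theta') - \frac{\int_{I^2(i_{-n},\cdots,i_n)} h(\omega, \theta)\, \bar\gamma(d\omega, d\theta)}{\bar\gamma(I^2(i_{-n},\cdots,i_n))} \right)^2 \bar\gamma(d\omega', d\theta') \\
&= \sum_{i_{-n},\cdots,i_n \in \mathbb{N}_+} \frac{\int_{I^2(i_{-n},\cdots,i_n)} \bar\gamma(d\omega', d\theta') \left( \int_{I^2(i_{-n},\cdots,i_n)} \bigl(h(\omega', \theta') - h(\omega, \theta)\bigr)\, \bar\gamma(d\omega, d\theta) \right)^2}{\bar\gamma^2(I^2(i_{-n},\cdots,i_n))} \\
&\le \sum_{i_{-n},\cdots,i_n \in \mathbb{N}_+} \bar\gamma\bigl(I^2(i_{-n},\cdots,i_n)\bigr)\, c_n^2 = c_n^2.
\end{align*}
Hence the series occurring in (2.1.29) is dominated by the convergent series
$\sum_{n \in \mathbb{N}_+} c_n$, which completes the proof. □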
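for any $(\omega, \theta), (\omega', \theta') \in \Omega^2$, then the assumption of Proposition 2.1.22 holds.
Indeed, for $(\omega, \theta), (\omega', \theta') \in I^2(i_{-n}, \cdots, i_n)$ we have
\[ \left| \frac{1}{\theta} - \frac{1}{\theta'} \right| \le \sup_{i_{-1},\cdots,i_{-n} \in \mathbb{N}_+} \lambda\bigl(I(i_{-1}, \cdots, i_{-n})\bigr) = (F_n F_{n+1})^{-1} \]
and similarly
\[ \left| \frac{1}{\omega} - \frac{1}{\omega'} \right| \le (F_{n-1} F_n)^{-1}. \]
Hence
\[ |h(\omega, \theta) - h(\omega', \theta')| \le c\, 2^{\varepsilon} (F_{n-1} F_n)^{-\varepsilon}, \qquad n \in \mathbb{N}_+, \]
for any $(\omega, \theta), (\omega', \theta') \in I^2(i_{-n}, \cdots, i_n)$, $i_k \in \mathbb{N}_+$, $-n \le k \le n$, and clearly
the series $\sum_{n \in \mathbb{N}_+} (F_{n-1} F_n)^{-\varepsilon}$ is convergent.
In particular, (2.1.30) holds if $h$ satisfies a Hölder condition of order
$\varepsilon > 0$, that is,
\[ \sup_{(\omega,\theta),(\omega',\theta') \in \Omega^2} \frac{|h(\omega, \theta) - h(\omega', \theta')|}{(|\omega' - \omega| + |\theta' - \theta|)^{\varepsilon}} < \infty. \]
□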
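The convergence of the Fibonacci series used in this remark is easy to observe numerically, because Fibonacci numbers grow geometrically. The sketch below assumes the convention F_0 = F_1 = 1 (consistent with the bound (F_1 F_2)^{-1} = 1/2 for first-order cylinders, but still an assumption here); the function names are illustrative.

```python
def fib(n_max):
    # Fibonacci numbers with the assumed convention F_0 = F_1 = 1.
    F = [1, 1]
    while len(F) <= n_max:
        F.append(F[-1] + F[-2])
    return F

def partial_sum(eps, n_max):
    # sum_{n=1}^{n_max} (F_{n-1} F_n)^(-eps)
    F = fib(n_max)
    return sum((F[n - 1] * F[n]) ** (-eps) for n in range(1, n_max + 1))

for eps in (0.25, 0.5, 1.0):
    print(eps, partial_sum(eps, 60), partial_sum(eps, 120))
```

For each exponent the two partial sums are nearly identical, as expected from a geometrically decaying tail.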
The results above clearly apply to the special case where $H$ is a real-valued
function on $\mathbb{N}_+^{\mathbb{N}_+}$. In this case we set
In the present case the conditional mean value occurring in (2.1.31) and $\sigma^2$
can be expressed in terms of the random variable $h$ on $(I, \mathcal{B}_I)$ defined on
$\Omega$ (thus a.e. in $I$) by
\[ h([i_1, i_2, \cdots]) = H(i_1, i_2, \cdots) \]
for any $(i_\ell)_{\ell \in \mathbb{N}_+} \in \mathbb{N}_+^{\mathbb{N}_+}$. Clearly,
\[ E_\gamma H_1 = \int_I h \, d\gamma, \qquad E_\gamma H_1^2 = \int_I h^2 \, d\gamma, \]
\[ E_\gamma(H_1 \mid a_1, \cdots, a_n)(\omega) = \frac{1}{\gamma\bigl(I(i^{(n)})\bigr)} \int_{I(i^{(n)})} h \, d\gamma \]
In turn, the second assumption holds if for some positive constants $c$ and $\varepsilon$
we have
\[ |h(\omega) - h(\omega')| \le c \left| \frac{1}{\omega} - \frac{1}{\omega'} \right|^{\varepsilon}, \qquad \omega, \omega' \in \Omega. \tag{2.1.32} \]
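\begin{align*}
&= \left( \sum_{i^{(n)} \in \mathbb{N}_+^n} \int_{I(i^{(n)})} \left| h(\omega) - \frac{1}{\gamma\bigl(I(i^{(n)})\bigr)} \int_{I(i^{(n)})} h(\omega')\, \gamma(d\omega') \right|^p \gamma(d\omega) \right)^{1/p} \\
&= \left( \sum_{i^{(n)} \in \mathbb{N}_+^n} \frac{1}{\gamma^p\bigl(I(i^{(n)})\bigr)} \int_{I(i^{(n)})} \gamma(d\omega) \left| \int_{I(i^{(n)})} \bigl(h(\omega) - h(\omega')\bigr)\, \gamma(d\omega') \right|^p \right)^{1/p} \\
&\le \left( \max_{i^{(n)} \in \mathbb{N}_+^n} \gamma\bigl(I(i^{(n)})\bigr) \sum_{i^{(n)} \in \mathbb{N}_+^n} \operatorname{var}^{(p)}_{I(i^{(n)})} h \right)^{1/p} \\
&\le \left( \frac{1}{\log 2} (F_n F_{n+1})^{-1} \right)^{1/p} \left( \operatorname{var}^{(p)}_{\Omega} h \right)^{1/p}.
\end{align*}
Hence the series occurring in (2.1.31) is dominated by
\[ \frac{\bigl(\operatorname{var}^{(p)}_{\Omega} h\bigr)^{1/p}}{(\log 2)^{1/p}} \sum_{n \in \mathbb{N}_+} (F_n F_{n+1})^{-1/p}, \]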
converges absolutely, and we have $\sigma = 0$ if and only if there exists $b \in L^2_\gamma(I)$
such that $h = b \circ \tau - b$ a.e. in $I$. In particular, if $h$ is essentially unbounded
then $\sigma \ne 0$.
Proof. By (2.0.2) and Proposition 2.1.7(ii) we have
for some positive $q < 1$. This clearly entails the absolute convergence of both
series (2.1.33) and $\sum_{n \in \mathbb{N}_+} n \int_I h\, U^n h \, d\gamma$. Then Corollary 2.1.20 completes
the proof of the first two assertions concerning $\sigma$.
Without appealing to Corollary 2.1.20, the characterization of the case
$\sigma = 0$ can be given a direct proof as follows. Put $h_1 = \sum_{n \in \mathbb{N}_+} U^n h$. By
(2.1.34) this series converges in BEV(I), and we have $h_1 = U h + U h_1 = U(h + h_1)$.
Writing $g = h + h_1$ we note that $U g \in BEV(I)$ and
\[ \sigma^2 = \int_I \bigl( h^2 + 2 h h_1 \bigr) \, d\gamma = \int_I \bigl( g^2 - (U g)^2 \bigr) \, d\gamma. \]
By (2.1.2) we have
\[ \int_I (U g)^2 \, d\gamma = \int_I ((U g) \circ \tau)\, g \, d\gamma \]
and
\[ \int_I (U g)^2 \, d\gamma = \int_I ((U g) \circ \tau)^2 \, d\gamma. \]
[Note that (2.1.2) implies in general that $\int_I f \, d\gamma = \int_I f \circ \tau \, d\gamma$, $f \in L^1_\gamma$, which
also follows from the fact that $\tau$ is $\gamma$-preserving.] Consequently, we can write
\begin{align*}
\sigma^2 &= \int_I g^2 \, d\gamma - 2 \int_I ((U g) \circ \tau)\, g \, d\gamma + \int_I ((U g) \circ \tau)^2 \, d\gamma \\
&= \int_I \bigl( g - (U g) \circ \tau \bigr)^2 \, d\gamma.
\end{align*}
\[ h = (U g) \circ \tau - U g \quad \text{a.e. in } I, \tag{2.1.35} \]
that is, $\sigma = 0$.
Finally, since $U g \in BEV(I)$ as shown above, equation (2.1.35) cannot
hold in the case where $h$ is essentially unbounded, that is, we cannot have
$\sigma = 0$. □
Corollary 2.1.25 Let $f : \mathbb{N}_+ \to \mathbb{R}$ such that $E_\gamma f^2(a_1) < \infty$, $E_\gamma f(a_1) = 0$. Put
\[ \sigma^2 = E_\gamma f^2(a_1) + 2 \sum_{n \in \mathbb{N}_+} E_\gamma f(a_1) f(a_{n+1}) \tag{2.1.36} \]
and
\[ v(U h) \le \sum_{i \in \mathbb{N}_+} |f(i)| \operatorname{var} P_i \le C \sum_{i \in \mathbb{N}_+} \frac{|f(i)|}{i^2} \]
for some $C > 0$. The last series is convergent since $E_\gamma |f(a_1)| < \infty$, so that
$U h \in BEV(I)$. Then by Proposition 2.1.24 we have $\sigma = 0$ if and only if
there exists $b \in L^2_\gamma(I)$ such that $h = b \circ \tau - b$ a.e. in $I$, and we have to show
that this happens if and only if $f = 0$. Clearly, if $f = 0$ then $\sigma = 0$. To
prove the converse we first note that
\[ U h = U(b \circ \tau) - U b = b - U b \quad \text{a.e. in } I. \]
This equation holds for $b$ equal to $h_1 = \sum_{n \in \mathbb{N}_+} U^n h \in BEV(I)$. Putting
$b = b_1 + h_1$ we get $b_1 = U b_1$. But by Proposition 2.1.7 the last equation
only holds for a.e. constant functions $b_1$. This shows that actually $b \in BEV(I)$.
Next, whatever $i \in \mathbb{N}_+$, for $u \in (1/(i+1), 1/i)$ the equation
$h(u) = (b \circ \tau)(u) - b(u)$ a.e. in $I$ implies
\[ f(i) = b(x) - b\!\left( \frac{1}{x+i} \right) \]
a.e. in $I$. Hence
\[ F_n(x) = \mu(\tau^n < x), \qquad x \in I, \]
\[ (U f)' = -V f', \qquad f \in C^1(I), \tag{2.2.2} \]
\[ v\varphi \le V\varphi \le w\varphi. \]
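Hence
\[ g(u) = \left( \frac{1}{u} + 1 \right) h\!\left( \frac{1}{u} - 1 \right) - \frac{1}{u}\, h\!\left( \frac{1}{u} \right), \qquad u \in (0, 1], \]
and we indeed have $U g = h$ since
\begin{align*}
U g(x) &= (x + 1) \sum_{i \in \mathbb{N}_+} \left( \frac{h(x + i - 1)}{x + i} - \frac{h(x + i)}{x + i + 1} \right) \\
&= (x + 1) \left( \frac{h(x)}{x + 1} - \lim_{i \to \infty} \frac{h(x + i)}{x + i + 1} \right) = h(x), \qquad x \in \mathbb{R}_+.
\end{align*}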
satisfies
\[ U g_a(x) = h_a(x), \qquad x \in I. \]
We come to $V$ via (2.2.2). Setting
\[ \varphi_a(x) = g_a'(x) = \frac{1 - a}{(ax + 1)^2} + \frac{a + 1}{((a + 1)x + 1)^2}, \qquad x \in I, \]
we have
\[ V\varphi_a(x) = -(U g_a)'(x) = \frac{1}{(x + a + 1)^2}, \qquad x \in I. \]
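The identities above lend themselves to a numerical sanity check. The sketch below makes two assumptions that are not spelled out in this passage: the weights P_i(x) = (x + 1)/((x + i)(x + i + 1)), and h_a(x) = 1/(x + a + 1), which is the choice consistent with the displayed formula for V φ_a. Under these assumptions g_a simplifies to (1 + u)/(1 + au) − 1/(1 + (a + 1)u).

```python
A = 0.3126597  # value of a quoted in the text

def P(i, x):
    # Assumed weight P_i(x) = (x + 1) / ((x + i)(x + i + 1)).
    return (x + 1.0) / ((x + i) * (x + i + 1.0))

def h_a(x, a=A):
    # Assumed h_a(x) = 1/(x + a + 1), consistent with V phi_a = -(h_a)'.
    return 1.0 / (x + a + 1.0)

def g_a(u, a=A):
    # g(u) = (1/u + 1) h(1/u - 1) - (1/u) h(1/u), simplified for h = h_a.
    return (1.0 + u) / (1.0 + a * u) - 1.0 / (1.0 + (a + 1.0) * u)

def phi_a(x, a=A):
    # Displayed derivative of g_a.
    return (1.0 - a) / (a * x + 1.0) ** 2 + (a + 1.0) / ((a + 1.0) * x + 1.0) ** 2

def U(f, x, terms=100_000):
    # Truncated series U f(x) = sum_i P_i(x) f(1/(x + i)).
    return sum(P(i, x) * f(1.0 / (x + i)) for i in range(1, terms + 1))

for x in (0.0, 0.5, 1.0):
    print(x, U(g_a, x), h_a(x))
```

The truncated series telescopes, so the two printed columns should agree up to a tail of order 1/terms².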
Let us choose $a$ by asking that
\[ \frac{\varphi_a}{V\varphi_a}(0) = \frac{\varphi_a}{V\varphi_a}(1). \]
This amounts to
or
\[ 2(a + 1)^4 - 3(a + 1) - 2 = 0, \]
which yields as unique acceptable solution
\[ a = 0.3126597\cdots. \]
that is,
\[ v\varphi \le V\varphi \le w\varphi, \]
where
\[ v = \frac{1}{2(a + 1)^2} > 0.29017, \qquad w = \frac{1}{m(a)} < 0.30796. \]
□
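The numerical values of a and v quoted above can be reproduced with a few lines of code. This is only a sketch: it solves the quartic 2(a + 1)⁴ − 3(a + 1) − 2 = 0 by plain bisection on [0, 1] (the bracket is an illustrative choice) and then evaluates v = 1/(2(a + 1)²). The constant w depends on m(a), which is not defined in this passage, so it is not computed.

```python
def quartic(a):
    # Left-hand side of 2(a+1)^4 - 3(a+1) - 2 = 0.
    b = a + 1.0
    return 2.0 * b ** 4 - 3.0 * b - 2.0

def bisect(func, lo, hi, iterations=80):
    # Plain bisection; assumes func(lo) and func(hi) have opposite signs.
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if func(lo) * func(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

a = bisect(quartic, 0.0, 1.0)
v = 1.0 / (2.0 * (a + 1.0) ** 2)
print(a, v)
```

Since the quartic is increasing in a + 1 ≥ 1, the root found in [0, 1] is the only one there.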
\[ v^n \varphi \le V^n \varphi \le w^n \varphi, \qquad n \in \mathbb{N}_+. \]
\[ \le |\mu(\tau^n < x) - G(x)| \]
Proof. For any $n \in \mathbb{N}$ and $y \in I$ set $d_n(y) = \mu\bigl(\tau^n < e^{y \log 2} - 1\bigr) - y$ so
that
\[ d_n(G(x)) = \mu(\tau^n < x) - G(x), \qquad x \in I. \]
Then by (2.2.1) we have
\[ d_n(G(x)) = \int_0^x \frac{U^n f_0(u)}{u + 1} \, du - G(x). \]
\[ d_n'(G(x))\, \frac{1}{(x + 1)\log 2} = \frac{U^n f_0(x)}{x + 1} - \frac{1}{(x + 1)\log 2}, \]
\[ (U^n f_0(x))' = \frac{1}{(\log 2)^2}\, \frac{d_n''(G(x))}{x + 1}, \qquad n \in \mathbb{N},\ x \in I. \]
Hence, by (2.2.3),
\[ \mu(\tau^n < x) - G(x) = (-1)^{n+1} (\log 2)^2\, \frac{\theta + 1}{2}\, V^n f_0'(\theta)\, G(x)\, (1 - G(x)) \]
for any $n \in \mathbb{N}$ and $x \in I$, and another suitable $\theta = \theta(n, x) \in I$. The result
stated follows now from Corollary 2.2.2.
In the special case $\mu = \lambda$ we have $f_0(x) = x + 1$, $x \in I$. Then with
$a = 0.3126597\cdots$ we have
\[ \alpha = \min_{x \in I} \frac{\varphi(x)}{f_0'(x)} = \frac{1 - a}{(a + 1)^2} + \frac{a + 1}{(a + 2)^2} = 0.644333\cdots, \]
\[ \beta = \max_{x \in I} \frac{\varphi(x)}{f_0'(x)} = 2, \]
so that
\[ \mu(\tau^n < x) - G(x) \]
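\[ V \ge F. \tag{2.2.5} \]
Set $v_0 = v$, $w_0 = w$, and
\[ v_n = \inf \frac{\varphi_{n+1}}{\varphi_n}, \qquad w_n = \sup \frac{\varphi_{n+1}}{\varphi_n}, \qquad n \in \mathbb{N}_+. \]
Then
\[ v_n \varphi_n \le \varphi_{n+1} \le w_n \varphi_n, \qquad n \in \mathbb{N}, \tag{2.2.9} \]
whence
\[ v_n V\varphi_n \le V\varphi_{n+1} \le w_n V\varphi_n, \]
that is,
\[ v_n \varphi_{n+1} \le \varphi_{n+2} \le w_n \varphi_{n+1}. \]
Therefore $v_{n+1} \ge v_n$ and $w_{n+1} \le w_n$, $n \in \mathbb{N}$. We are going to improve
these inequalities.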
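It follows from (2.2.5) and (2.2.9) that
whence
\[ v_{n+1} \ge v_n + \frac{F(\varphi_{n+1} - v_n \varphi_n)}{\|\varphi_{n+1}\|}, \qquad n \in \mathbb{N}. \tag{2.2.10} \]
Similarly,
whence
\[ w_{n+1} \le w_n - \frac{F(w_n \varphi_n - \varphi_{n+1})}{\|\varphi_{n+1}\|}, \qquad n \in \mathbb{N}. \tag{2.2.10$'$} \]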
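\[ d_{n+1} \le d_n (1 - e_n), \qquad n \in \mathbb{N}, \tag{2.2.11} \]
Hence
\[ e_{n+1} \ge \frac{v_n}{w_{n+1}}\, e_n, \qquad n \in \mathbb{N}. \tag{2.2.12} \]
In conjunction with (2.2.11) and (2.2.12), assumption (2.2.7) which can be
written as
\[ e_0 - \frac{d_0}{w_0} > 0, \]
ensures exponential decrease of the $d_n$, $n \in \mathbb{N}$, since
whence
\[ w_n e_n - d_n \ge w_0 e_0 - d_0, \]
\[ 1 \ge e_n \ge \frac{1}{w_n} (w_0 e_0 - d_0) \ge e_0 - \frac{d_0}{w_0} > 0, \tag{2.2.13} \]
and
\[ d_n \le d_0 \left( 1 - e_0 + \frac{d_0}{w_0} \right)^n, \qquad n \in \mathbb{N}. \tag{2.2.14} \]
Put $\lambda_0 = \lim_{n\to\infty} v_n = \lim_{n\to\infty} w_n$, and define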
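Hence
\[ \tilde\varphi_n \le \prod_{i=0}^{n-1} \left( 1 + \frac{d_i}{v_0} \right) \tilde\varphi_0 \le A\, \tilde\varphi_0, \qquad n \in \mathbb{N}_+. \tag{2.2.16} \]
\[ 0 \le \tilde\varphi_{n+1} - \tilde\varphi_n \le \frac{d_n}{v_0}\, \tilde\varphi_n \le \frac{d_n A}{v_0}\, \tilde\varphi_0, \qquad n \in \mathbb{N}. \]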
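Therefore by (2.2.14) the series $\sum_{n \in \mathbb{N}} \|\tilde\varphi_{n+1} - \tilde\varphi_n\|$ converges. By the
completeness of $B$ the limit $\psi = \lim_{n\to\infty} \tilde\varphi_n$ exists. Letting $n \to \infty$ in
$v_n \tilde\varphi_n \le V\tilde\varphi_n \le w_n \tilde\varphi_n$, $n \in \mathbb{N}$, yields $V\psi = \lambda_0 \psi$.
Since $\tilde\varphi_{n+1} \ge \tilde\varphi_n \ge \cdots \ge \tilde\varphi_0 = \varphi$, we have $\psi \ge \varphi$. As
$1 \ge e_n = F(\varphi_n)/\|\varphi_{n+1}\| = F(\tilde\varphi_n)/\|V\tilde\varphi_n\|$, $n \in \mathbb{N}$, letting $n \to \infty$ yields
$1 \ge F(\psi)/\lambda_0 \|\psi\|$. Finally, by (2.2.13) we have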
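\[ \tilde v_n = \inf \frac{f_n}{\lambda_0^n \psi}, \qquad \tilde w_n = \sup \frac{f_n}{\lambda_0^n \psi}, \qquad n \in \mathbb{N}. \]
Hence
which yields
\[ \tilde v_{n+1} \ge \tilde v_n + \frac{1}{\lambda_0^{n+1} \|\psi\|}\, F(f_n - \tilde v_n \lambda_0^n \psi) \ge \tilde v_n, \qquad n \in \mathbb{N}. \]
Similarly,
\[ \tilde w_{n+1} \le \tilde w_n - \frac{1}{\lambda_0^{n+1} \|\psi\|}\, F(\tilde w_n \lambda_0^n \psi - f_n) \le \tilde w_n, \qquad n \in \mathbb{N}. \]
Therefore
\[ \tilde w_{n+1} - \tilde v_{n+1} \le (\tilde w_n - \tilde v_n) \left( 1 - \frac{F(\psi)}{\lambda_0 \|\psi\|} \right), \qquad n \in \mathbb{N}, \]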
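whence
\[ \tilde w_n - \tilde v_n \le \operatorname{osc} \frac{f}{\psi} \left( 1 - \frac{F(\psi)}{\lambda_0 \|\psi\|} \right)^n, \qquad n \in \mathbb{N}, \]
since
\[ \tilde w_0 - \tilde v_0 = \sup \frac{f}{\psi} - \inf \frac{f}{\psi} = \operatorname{osc} \frac{f}{\psi}. \]
If we denote by $G(f)$ the common limit of $\tilde v_n$ and $\tilde w_n$ as $n \to \infty$, then we
have
\[ \tilde v_n, \tilde w_n = G(f) + \tilde\theta_n \operatorname{osc} \frac{f}{\psi} \left( 1 - \frac{F(\psi)}{\lambda_0 \|\psi\|} \right)^n, \qquad n \in \mathbb{N}, \]
with a suitable $\tilde\theta_n \in \mathbb{R}$ satisfying $|\tilde\theta_n| \le 1$. Hence, by the very definition
of the $\tilde v_n$ and $\tilde w_n$, $n \in \mathbb{N}$, equation (2.2.8) should hold. Since
\[ |G(f)| \le \max(|\tilde v_0|, |\tilde w_0|) \le \frac{\|f\|}{\inf \psi}, \qquad f \in B, \]
it follows that
\[ \|G\| = \sup_{f \in B} \frac{|G(f)|}{\|f\|} \le \frac{1}{\inf \psi}. \]
The fact that $G$ is a positive linear functional is an immediate consequence
of equation (2.2.8). □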
Let us show that Theorem 2.2.4 applies to Gauss’ problem as considered
in Subsection 2.2.1. The space B is Cr (I), the collection of all real-valued
functions in C (I) , and the operator V the one denoted there by the same
letter. As function ϕ we could use the function ϕa constructed in Subsection
2.2.1 with a = 0.3126597 · · · . Nevertheless, it is more convenient to use V ϕa
instead, for which the same values of v and w apply. Thus we take
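\[ \varphi(x) = \frac{1}{(x + a + 1)^2}, \qquad x \in I, \]
\begin{align*}
V f(x) &\ge \sum_{i \in \mathbb{N}_+} \frac{i}{(x + i + 1)^2} \int_{1/(x+i+1)}^{1/(x+i)} f(y) \, dy \\
&= \int_0^1 k(x, y)\, f(y) \, dy, \qquad x \in I,
\end{align*}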
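where
\[ k(x, 0) = 0, \qquad x \in I, \]
\[ k(x, y) = \frac{\lfloor y^{-1} - x \rfloor}{(x + \lfloor y^{-1} - x \rfloor + 1)^2}, \qquad x \in I,\ y \in (0, 1]. \]
\[ t \to (t + x + 1)^{-2}, \qquad t \ge 2, \]
\[ k(x, y) \ge \frac{y^{-1} - x}{(y^{-1} + 1)^2} \ge \frac{y^{-1} - 1}{(y^{-1} + 1)^2} = \frac{y(1 - y)}{(y + 1)^2} \]
\[ \frac{w\, F(\varphi)}{\|V\varphi\|} \ge \frac{F(\varphi)}{\|\varphi\|} = (a + 1)^2 F(\varphi) > 0.033184. \tag{2.2.17} \]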
Since $w - v < 0.01779$, inequality (2.2.7) holds. Thus Theorem 2.2.4 applies
and we have
\[ \frac{F(\psi)}{\|\psi\|} \ge (a + 1)^2 F(\varphi) - (w - v) > 0.01539. \tag{2.2.18} \]
and
\[ \tilde\psi(x) = \int_0^x \frac{\Psi(u) - U^\infty \Psi}{u + 1} \, du, \qquad x \in I. \]
It is easy to check that
\[ \bigl( (x + 1)\, \tilde\psi'(x) \bigr)' = \psi(x), \qquad x \in I, \]
\[ \le \|\psi\| \operatorname{osc} \frac{f_0'}{\psi}\, (\log 2)^2 (\lambda_0 - 0.01539)^n\, G(x)\, (1 - G(x)), \]
where $\lambda_0 = 0.303\,663\,002\,898\,732\,658\cdots$,
\[ \frac{1}{(x + a + 1)^2} \le \psi(x) \le \frac{3.41}{(x + a + 1)^2}, \qquad x \in I, \]
with $a = 0.3126597$, and $G$ is a positive bounded functional on $C_r(I)$ such
that
\[ \|G\| \le \frac{1}{\inf \psi} \le (a + 2)^2 = 5.34839\cdots. \]
In particular, for any $n \in \mathbb{N}$ and $x \in I$ we have
\[ \left| \lambda(\tau^n < x) - G(x) - (-\lambda_0)^n\, G(1)\, \tilde\psi(x) \right| \tag{2.2.19} \]
as $n \to \infty$. Using this equation Wirsing (1974) has obtained the value given
in the statement. Note that in Knuth (1981, p. 350) the first 20 (RCF)
digits of $\lambda_0$ are given as 3, 3, 2, 2, 3, 13, 1, 174, 1, 1, 1, 2, 2, 2, 1, 1, 1, 2, 2, 1. The
20th convergent equals
\[ \frac{227\,769\,828}{750\,074\,345}, \]
which yields 14 exact significant digits of $\lambda_0$.
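The quoted convergent can be recomputed from the 20 continued fraction digits with the standard recurrence p_k = d_k p_{k−1} + p_{k−2}, q_k = d_k q_{k−1} + q_{k−2}. The sketch below uses exact rational arithmetic; the function name is illustrative.

```python
from fractions import Fraction

# First 20 regular continued fraction digits of lambda_0 as quoted in the text.
DIGITS = [3, 3, 2, 2, 3, 13, 1, 174, 1, 1, 1, 2, 2, 2, 1, 1, 1, 2, 2, 1]

def convergent(digits):
    # p_n/q_n of [0; d_1, ..., d_n] via the three-term recurrence.
    p_prev, p = 1, 0
    q_prev, q = 0, 1
    for d in digits:
        p_prev, p = p, d * p + p_prev
        q_prev, q = q, d * q + q_prev
    return Fraction(p, q)

c20 = convergent(DIGITS)
print(c20, float(c20))
```

The result can be compared directly with the fraction displayed above and with the decimal expansion of λ0 quoted in the text.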
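\[ w e_0 = \frac{w\, F(\varphi)}{\|V(\varphi)\|} \ge (a + 1)^2 F(\varphi) \ge 0.033184, \]
it follows that
\[ A \le \exp \frac{\sum_{n \in \mathbb{N}} d_n}{v} \le \exp \frac{w (w - v)}{v\, (w e_0 - (w - v))} \le 3.409\cdots. \]
\[ \operatorname{osc} \frac{f_0'}{\psi} = \operatorname{osc} \frac{1}{\psi} = \frac{1}{\inf \psi} - \frac{1}{\sup \psi} \le (a + 2)^2 - \frac{(a + 1)^2}{3.41} = 4.843094\cdots, \]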
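\[ \|U^n f - U^\infty f\| \le \left( \lambda_0^n\, |G(f')| + \operatorname{osc} \frac{f'}{\psi}\, (\lambda_0 - 0.01539)^n \right) \int_I \gamma(dx) \int_0^x \psi \, d\lambda \tag{2.2.20} \]
and
\[ \|U^n f - U^\infty f\| \ge \left( \lambda_0^n\, |G(f')| - \operatorname{osc} \frac{f'}{\psi}\, (\lambda_0 - 0.01539)^n \right) \int_I \gamma(dx) \int_0^x \psi \, d\lambda. \tag{2.2.21} \]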
\[ U^n f(x) - U^n f(y) = \]
\[ = (-1)^n \left( G(f')\, \lambda_0^n \int_x^y \psi \, d\lambda + \operatorname{osc} \frac{f'}{\psi}\, (\lambda_0 - 0.01539)^n\, \tilde\theta_n \int_x^y \psi \, d\lambda \right) \]
\[ \|U^n f - U^\infty f\| \ge |U^n f(0) - U^\infty f|. \]
□
we have
\[ |U^n f(1) - U^n f(0)| \le 2\, \|U^n f - U^\infty f\|. \]
Finally, noting that by (2.2.3) we have
\[ \operatorname{var} U^n f = \int_I |(U^n f)'| \, d\lambda = \int_I |V^n f'| \, d\lambda, \]
\[ |U^n f(1) - U^n f(0)| = \left| \int_I (U^n f)' \, d\lambda \right| = \left| \int_I V^n f' \, d\lambda \right|, \]
and
\[ \|U^n f - U^\infty f\| \ge \frac12 \left( \lambda_0^n\, |G(f')| - \operatorname{osc} \frac{f'}{\psi}\, (\lambda_0 - 0.01539)^n \right) \int_I \psi \, d\lambda \]
This follows easily using Theorem 2.2.6 and equations (2.2.3) and (2.2.8).
The details are left to the reader. □
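\[ \frac{1}{(x + a + 1)^2} \le \psi(x) \le \frac{3.41}{(x + a + 1)^2}, \qquad x \in I, \]
and
\[ \|U^n f - U^\infty f\| \ge \left( \lambda_0^n\, |G(f')| - \operatorname{osc} \frac{f'}{\psi}\, (\lambda_0 - 0.01539)^n \right) \int_I \gamma(dx) \int_0^x \psi \, d\lambda. \]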
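since
\[ U^\infty f_0 = \int_I f_0 \, d\gamma = \frac{1}{\log 2} \int_I F_0' \, d\lambda = \frac{1}{\log 2}. \]
Note that
\[ \int_I \gamma(dx) \int_0^x \psi \, d\lambda \le \frac{\|\psi\|}{\log 2} \int_0^1 \frac{x \, dx}{x + 1} = \|\psi\| \left( \frac{1}{\log 2} - 1 \right) \tag{2.2.27} \]
and
\[ \mu(\tau^{-n}(A)) - \gamma(A) = \gamma(A^c) - \mu(\tau^{-n}(A^c)) \tag{2.2.28} \]
for any $n \in \mathbb{N}$ and $A \in \mathcal{B}_I$.
Now, (2.2.25) follows from (2.2.26) through (2.2.28) and Theorem 2.2.8.
□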
Corollary 2.2.10 The spectral radius of the operator $U - U^\infty$ in $L(I)$
equals $\lambda_0$.
Proof. Obvious by Theorem 2.2.8. □
As an application of Theorem 2.2.8 we shall derive the asymptotic behaviour of
\[ \gamma_a(u_n^a < x), \qquad x \ge 1, \]
as $n \to \infty$ for any $a \in I$. While it is natural to think that for any $a \in I$ the
limit distribution function
is somewhat surprising to find out that the (exact) convergence rate is $O(\lambda_0^n)$
for most $a \in I$.
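Theorem 2.2.11 For any $n \in \mathbb{N}_+$ and $x \ge 1$ we have
\[ \sup_{a \in I} \left| \gamma_a(u_{n+1}^a < x) - H(x) \right| \le 3.2228\, \frac{I_{(1,\infty)}(x)}{x}\, \lambda_0^n \bigl( 1 + (0.94932)^n \bigr), \tag{2.2.29} \]
where
\[ H(x) = \begin{cases} \dfrac{1}{\log 2} \left( \log x - \dfrac{x - 1}{x} \right) & \text{if } 1 \le x \le 2, \\[2ex] \dfrac{1}{\log 2} \left( \log 2 - \dfrac{1}{x} \right) & \text{if } x \ge 2. \end{cases} \]
In (2.2.29), $\lambda_0$ cannot be replaced by a smaller constant, and the exact
convergence rate to 0 of the left hand side of (2.2.29) is $O(\lambda_0^n)$.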
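Proof. By Proposition 1.3.10, for any $a \in I$, $x \ge 1$, and $n \in \mathbb{N}_+$ we have
\[ \gamma_a(u_{n+1}^a < x \mid a_1, \ldots, a_n) = \left( 1 - \frac{s_n^a + 1}{x} \right) I_{(s_n^a + 1, \infty)}(x). \]
Hence
\begin{align*}
\gamma_a\!\left( u_{n+1}^a \ge \frac{1}{t} \,\Big|\, a_1, \ldots, a_n \right) &= 1 - \bigl(1 - t(s_n^a + 1)\bigr)\, I_{(s_n^a + 1, \infty)}\!\left( \frac{1}{t} \right) \\
&= \min(1, t(s_n^a + 1)) = f_t(s_n^a)
\end{align*}
for any $a \in I$, $t \in (0, 1]$, and $n \in \mathbb{N}_+$, with
\[ f_t(y) = \min(1, t(y + 1)), \qquad y \in I. \]
Therefore, by Proposition 2.1.10,
\[ \gamma_a\!\left( u_{n+1}^a \ge \frac{1}{t} \right) = E_{\gamma_a}\!\left( \gamma_a\!\left( u_{n+1}^a \ge \frac{1}{t} \,\Big|\, a_1, \ldots, a_n \right) \right) = U^n f_t(a), \tag{2.2.30} \]
for any $a \in I$, $t \in (0, 1]$, and $n \in \mathbb{N}_+$. It is easy to check that (2.2.30) holds
for $n = 0$, too. Clearly, $f_t \in L(I)$ for any $t \in (0, 1]$, and
\[ U^\infty f_t = \int_I f_t(y)\, \gamma(dy) = \begin{cases} \dfrac{t}{\log 2} & \text{if } 0 < t \le 1/2, \\[2ex] \dfrac{1}{\log 2} (1 - t + \log(2t)) & \text{if } 1/2 \le t \le 1. \end{cases} \]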
\[ \operatorname{osc} \frac{f_t'}{\psi} \le 5.348396\, t\, I_{(0,1)}(t) \]
and
\[ |G(f_t')| \le \|G\|\, \|f_t'\| \le 5.348396\, t\, I_{(0,1)}(t) \]
\[ \le 0.60256. \]
for any $n \in \mathbb{N}$ and $t \in (0, 1]$. Hence, by putting $1/t = x$, (2.2.29) follows.
Finally, the assertion concerning the optimality of $\lambda_0$ also follows from
Theorem 2.2.8. □
Remarks. 1. The convergence of $\lambda(u_n < x)$ to $H(x)$, $x \ge 1$, as $n \to \infty$
was first proved, in sketch form, by Doeblin (1940, p. 365) with an unspecified
convergence rate. A detailed proof following Doeblin’s suggestions was given
by Samur (1989, Lemma 4.5) together with a slower convergence rate than
that occurring in Theorem 2.2.11.
2. Theorem 2.2.8 shows that the convergence rate to 0 as $n \to \infty$ of
\[ \sup_{a \in I} \sup_{x \ge 1} \left| \gamma_a(u_{n+1}^a < x) - H(x) \right| \]
is $O(\alpha^n)$ with $0 < \alpha < \lambda_0$. It follows from equation (2.2.22), which is valid
for $f \in L(I)$ too, that this happens if and only if $a \in E$, with $E$ defined in
and
\[ \sup_{x \ge 1} \left| \gamma_1(u_{n+1}^1 < x) - H(x) \right| = O(\lambda_0^n) \]
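and
\[ |D_t \lambda(\Theta_n \le t) - \tilde h(t)| \le k_0\, \frac{\min(t^{-1} - 1, 1) + t^{-1} I_{[1/2,1)}(t)}{F_n F_{n+1}}, \]
where
\[ \tilde H(t) = \begin{cases} \dfrac{t}{\log 2} & \text{if } 0 \le t \le 1/2, \\[2ex] \dfrac{1}{\log 2} (1 - t + \log(2t)) & \text{if } 1/2 \le t \le 1 \end{cases} \]
and
\[ \tilde h(t) = \frac{d\tilde H}{dt} = \begin{cases} \dfrac{1}{\log 2} & \text{if } 0 \le t \le 1/2, \\[2ex] \dfrac{1}{\log 2} \left( \dfrac{1}{t} - 1 \right) & \text{if } 1/2 \le t \le 1. \end{cases} \]
Remark. The first result above improves on the convergence rate obtained
by Faivre (1998a) while the second one on that obtained by Knuth
(1984). □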
Note that $H$ is known [see Duren (1970)] as the ordinary Hardy space of
functions holomorphic in the half-plane $\operatorname{Re} z > -1/2$, which is a Hilbert
space with inner product $(\cdot, \cdot)_H$ defined by
\[ (f, g)_H = \frac{1}{2\pi} \int_{\mathbb{R}} f\!\left( -\frac12 + iy \right) g^*\!\left( -\frac12 + iy \right) dy, \qquad f, g \in H, \]
and norm
\[ \|\varphi\|_2 = (\varphi, \varphi)^{1/2}, \qquad \varphi \in L^2(\mathbb{R}_+). \]
A Paley–Wiener theorem holds, giving a simple characterization of the
elements of $H$ [see Duren (1970)]: $f \in H$ if and only if there exists
$\varphi \in L^2(\mathbb{R}_+)$ such that
\[ f(z) = \int_{\mathbb{R}_+} e^{-zs - s/2}\, \varphi(s) \, ds, \qquad \operatorname{Re} z > -1/2; \]
\[ \|f\|_H = \|\varphi\|_2. \tag{2.3.1} \]
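\[ A = S M^{-1} : H \to L^2(\mathbb{R}_+) \]
with inverse
\[ A^{-1} = M S^{-1} : S\bigl(L^2(\mathbb{R}_+)\bigr) \to H. \]
where
\[ k(s, t) = \frac{J_1\bigl(2\sqrt{st}\bigr)}{\bigl((e^s - 1)(e^t - 1)\bigr)^{1/2}}, \qquad s, t \in \mathbb{R}_+, \]
and $J_1$ is the Bessel function of order 1 defined by
\[ J_1(s) = \frac{s}{2} \sum_{k \in \mathbb{N}} \frac{(-1)^k}{k!\,(k + 1)!} \left( \frac{s}{2} \right)^{2k}, \qquad s \in \mathbb{R}_+. \]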
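The series for J_1 and the kernel k(s, t) translate directly into code. The sketch below is a plain-Python illustration using only the standard library; the truncation at 60 terms is an arbitrary choice that is more than sufficient for moderate arguments, and the function names are not from the text.

```python
import math

def bessel_j1(s, terms=60):
    # J_1(s) = (s/2) * sum_k (-1)^k / (k! (k+1)!) * (s/2)^(2k)
    half = s / 2.0
    total = 0.0
    for k in range(terms):
        total += (-1) ** k / (math.factorial(k) * math.factorial(k + 1)) * half ** (2 * k)
    return half * total

def kernel(s, t):
    # k(s, t) = J_1(2 sqrt(st)) / sqrt((e^s - 1)(e^t - 1)) for s, t > 0.
    return bessel_j1(2.0 * math.sqrt(s * t)) / math.sqrt(math.expm1(s) * math.expm1(t))

print(bessel_j1(1.0))                       # J_1(1), about 0.44005
print(kernel(0.7, 1.9), kernel(1.9, 0.7))   # the kernel is symmetric in (s, t)
```

The symmetry of the kernel, visible in the second line of output, is what makes the associated integral operator self-adjoint on L²(R₊).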
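Then
\[ P_\lambda = A^{-1} K A. \tag{2.3.2} \]
Proof. Note first that the range of $K$ is included in $S\bigl(L^2(\mathbb{R}_+)\bigr)$.
Let $\varphi \in L^2(\mathbb{R}_+)$ and put $f = M\varphi \in H$. We have
\[ A^{-1} K A f = M S^{-1} K S \varphi. \]
But
\begin{align*}
\bigl(S^{-1} K S \varphi\bigr)(s) &= \left( \frac{s}{1 - e^{-s}} \right)^{1/2} \int_{\mathbb{R}_+} k(s, t) \left( \frac{1 - e^{-t}}{t} \right)^{1/2} \varphi(t) \, dt \\
&= \int_{\mathbb{R}_+} e^{\frac{s - t}{2}} \left( \frac{s}{t} \right)^{1/2} \frac{J_1\bigl(2\sqrt{st}\bigr)}{e^s - 1}\, \varphi(t) \, dt, \qquad s \in \mathbb{R}_+,
\end{align*}
whence
\begin{align*}
\bigl(M S^{-1} K S \varphi\bigr)(z) &= \int_{\mathbb{R}_+^2} e^{-zs - \frac{t}{2}} \left( \frac{s}{t} \right)^{1/2} \frac{J_1\bigl(2\sqrt{st}\bigr)}{e^s - 1}\, \varphi(t) \, ds \, dt \\
&= \int_{\mathbb{R}_+} \sum_{k \in \mathbb{N}_+} \frac{1}{(z + k)^2} \exp\!\left( -\frac{t}{z + k} - \frac{t}{2} \right) \varphi(t) \, dt,
\end{align*}
□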
See, e.g., Kanwal (1997, Ch. 7). Note that 0 cannot be an eigenvalue since
$K\varphi = 0$ implies that $\varphi = 0$ by the invertibility of the Hankel transform.
See, e.g., Magnus et al. (1966, Ch. 11). As usual, we order the eigenvalues
according to their absolute values, that is, $|\lambda_1| \ge |\lambda_2| \ge \cdots$, where we list
each eigenvalue according to its multiplicity. We then have
\[ K\varphi = \sum_{j \in \mathbb{N}_+} \lambda_j (\varphi, \varphi_j)\, \varphi_j, \qquad \varphi \in L^2(\mathbb{R}_+), \tag{2.3.3} \]
with
\[ \sum_{i \in I} (\|y_i\|\, \|x_i\|)^r < \infty \]
for any $r > 0$. Here $I$ is a countable set while $x_i \in B$ and $y_i \in B^*$ = the dual
Banach space of $B$ (consisting of all bounded linear functionals on $B$) for any
$i \in I$. Such operators have been introduced and studied by Grothendieck
(1955, 1956). They are compact and thus have discrete spectra. Moreover,
most of matrix algebra can be extended to them. In particular, one can
define the trace of such an operator as
\[ \operatorname{Tr} L = \sum_{i \in I} y_i(x_i) = \sum_{j \in \mathbb{N}_+} \lambda_j, \tag{2.3.4} \]
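\begin{align*}
\operatorname{Tr} K^2 = \sum_{j \in \mathbb{N}_+} \lambda_j^2 &= \iint_{\mathbb{R}_+^2} k(s, t)\, k(t, s) \, ds \, dt \\
&= \iint_{\mathbb{R}_+^2} \frac{J_1^2\bigl(2\sqrt{st}\bigr)}{(e^s - 1)(e^t - 1)} \, ds \, dt = 1.103839654\cdots. \tag{2.3.5}
\end{align*}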
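We have
\[ \int_{\mathbb{R}_+} s e^{-s} \bigl( L_n^1(s) \bigr)^2 \, ds = n + 1, \qquad n \in \mathbb{N}, \]
\[ \int_{\mathbb{R}_+} s e^{-s} L_m^1(s)\, L_n^1(s) \, ds = 0, \qquad m, n \in \mathbb{N},\ m \ne n. \]
See, e.g., Magnus et al. (1966, Ch. 5). We expand $J_1\bigl(2\sqrt{st}\bigr)/\sqrt{st}$, $s, t \in \mathbb{R}_+$,
in terms of the $L_n^1(s)$, $n \in \mathbb{N}$, to obtain
\[ \frac{J_1\bigl(2\sqrt{st}\bigr)}{\sqrt{st}} = \sum_{n \in \mathbb{N}} L_n^1(s)\, C_n(t), \qquad s, t \in \mathbb{R}_+, \]
where
\begin{align*}
C_n(t) &= \frac{1}{n + 1} \int_{\mathbb{R}_+} s e^{-s} L_n^1(s)\, \frac{J_1\bigl(2\sqrt{st}\bigr)}{\sqrt{st}} \, ds \\
&= n! \sum_{m=0}^{n} \sum_{k \in \mathbb{N}} \frac{(-1)^{m+k} (m + k + 1)!\, t^k}{k!\,(k + 1)!\, m!\,(m + 1)!\,(n - m)!} \\
&= \frac{e^{-t} t^n}{(n + 1)!}, \qquad n \in \mathbb{N},\ t \in \mathbb{R}_+.
\end{align*}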
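The expansion of J_1(2√(st))/√(st) in generalized Laguerre polynomials, with the closed-form coefficients C_n(t) = e^{−t} tⁿ/(n + 1)!, can be tested numerically. The sketch below assumes the explicit formula L_n^1(s) = Σ_m (−1)^m C(n+1, n−m) s^m/m! for the Laguerre polynomials (a standard formula, but not stated in this passage); the truncation levels and sample point are illustrative.

```python
import math

def laguerre1(n, s):
    # Generalized Laguerre polynomial L_n^1(s) = sum_m (-1)^m C(n+1, n-m) s^m / m!.
    return sum((-1) ** m * math.comb(n + 1, n - m) * s ** m / math.factorial(m)
               for m in range(n + 1))

def coeff(n, t):
    # Closed form of the coefficient C_n(t) = e^{-t} t^n / (n+1)!.
    return math.exp(-t) * t ** n / math.factorial(n + 1)

def lhs(s, t, terms=60):
    # J_1(2 sqrt(st)) / sqrt(st) from the Bessel series: sum_k (-1)^k (st)^k / (k! (k+1)!).
    return sum((-1) ** k * (s * t) ** k / (math.factorial(k) * math.factorial(k + 1))
               for k in range(terms))

def rhs(s, t, terms=40):
    # Truncated Laguerre expansion sum_n L_n^1(s) C_n(t).
    return sum(laguerre1(n, s) * coeff(n, t) for n in range(terms))

print(lhs(1.5, 0.7), rhs(1.5, 0.7))
```

The two values should coincide up to rounding, since the coefficients decay factorially in n.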
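It follows that
\[ K\varphi = \sum_{n \in \mathbb{N}} (\varphi, \beta_n)\, \alpha_n, \qquad \varphi \in L^2(\mathbb{R}_+), \tag{2.3.6} \]
for any $r > 0$. Since $(e^s - 1)^{-1} = \sum_{k \in \mathbb{N}_+} e^{-ks}$, $s \in \mathbb{R}_{++}$, the computation
of $\|\alpha_n\|_2$ reduces to that of a standard integral:
\begin{align*}
\|\alpha_n\|_2^2 &= \sum_{k \in \mathbb{N}_+} \int_{\mathbb{R}_+} s e^{-ks} \bigl( L_n^1(s) \bigr)^2 \, ds \\
&= \sum_{k \in \mathbb{N}_+} \frac{n + 1}{k^{2n+2}} \sum_{p=0}^{n} \binom{n + 1}{p} \binom{n}{p} (k - 1)^{2p},
\end{align*}
and since $\binom{n+1}{p} \le 2^{n+1}$, $0 \le p \le n$, we obtain
\[ \|\alpha_n\|_2^2 \le 2^{n+1} (n + 1) \sum_{k \in \mathbb{N}_+} \frac{\bigl( (k - 1)^2 + 1 \bigr)^n}{k^{2n+2}} \le 2^{n+1} (n + 1)\, \zeta(2). \]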
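Next, as $\int_{\mathbb{R}_+} s^m e^{-s} \, ds = m!$, $m \in \mathbb{N}$, we have
\begin{align*}
\|\beta_n\|_2^2 &= \frac{1}{((n + 1)!)^2} \sum_{k \ge 3} \int_{\mathbb{R}_+} s^{2n+1} e^{-ks} \, ds \\
&= \frac{(2n + 1)!}{((n + 1)!)^2} \sum_{k \ge 3} \frac{1}{k^{2n+2}} = \frac{\binom{2n+1}{n+1}}{n + 1} \sum_{k \ge 3} \frac{1}{k^{2n+2}}, \qquad n \in \mathbb{N}.
\end{align*}
Since
\begin{align*}
\sum_{k \ge 3} \frac{1}{k^{2n+2}} &= \sum_{j=0}^{2} \sum_{\ell \in \mathbb{N}_+} \frac{1}{(3\ell + j)^{2n+2}} \\
&\le 3 \sum_{\ell \in \mathbb{N}_+} \frac{1}{(3\ell)^{2n+2}} = 3^{-2n-1} \zeta(2n + 2)
\end{align*}
and
\[ \binom{2n + 1}{n + 1} \le 2^{2n+1}, \qquad \zeta(2n + 2) \le \zeta(2), \qquad n \in \mathbb{N}, \]
we obtain
\[ \|\beta_n\|_2^2 \le \frac{\zeta(2)}{n + 1} \left( \frac{2}{3} \right)^{2n+1}, \qquad n \in \mathbb{N}. \]
Finally, for any $r > 0$ we have
\[ \sum_{n \in \mathbb{N}} (\|\alpha_n\|_2\, \|\beta_n\|_2)^r \le \left( \frac{2}{\sqrt{3}}\, \zeta(2) \right)^r \sum_{n \in \mathbb{N}} \left( \left( \frac{2\sqrt{2}}{3} \right)^r \right)^n < \infty. \]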
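and
\begin{align*}
\operatorname{Tr} K^2 &= \frac12 \sum_{i, j \in \mathbb{N}_+} \left( \frac{ij + 2}{\sqrt{ij\,(ij + 4)}} - 1 \right) \\
&= \frac12 \sum_{k \in \mathbb{N}_+} \left( \frac{k + 2}{\sqrt{k (k + 4)}} - 1 \right) t(k),
\end{align*}
where $t(k)$ is the number of divisors of $k$, equal to $\prod_\alpha (n_\alpha + 1)$ if
$1 < k = \prod_\alpha p_\alpha^{n_\alpha}$ is the factorization of $k$ into distinct primes, and $t(1) = 1$. □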
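The regrouping of the double sum over (i, j) into a single sum weighted by the divisor function t(k) can be illustrated with partial sums. The sketch below computes t(k) with a simple sieve and compares the two groupings restricted to products ij ≤ limit; the cutoff 2000 is an arbitrary illustrative choice, and since the series converges slowly (the tail behaves roughly like (log N)/N) the partial sum only approaches the value of Tr K² quoted in (2.3.5) from below.

```python
import math

def divisor_counts(limit):
    # t(k) for 1 <= k <= limit by a sieve over multiples.
    t = [0] * (limit + 1)
    for d in range(1, limit + 1):
        for multiple in range(d, limit + 1, d):
            t[multiple] += 1
    return t

def term(k):
    # (1/2) * ((k + 2) / sqrt(k (k + 4)) - 1)
    return 0.5 * ((k + 2) / math.sqrt(k * (k + 4)) - 1.0)

def single_sum(limit):
    # Partial sum over k <= limit, weighted by the number of divisors.
    t = divisor_counts(limit)
    return sum(term(k) * t[k] for k in range(1, limit + 1))

def double_sum(limit):
    # Same terms grouped by pairs (i, j) with i * j <= limit.
    return sum(term(i * j) for i in range(1, limit + 1)
               for j in range(1, limit // i + 1))

print(single_sum(2000), double_sum(2000))
```

Both groupings give the same partial sum because every k ≤ limit is counted once for each ordered factorization k = ij.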
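\[ \varphi_1(s) = \frac{1}{(\log 2)^{1/2}} \left( \frac{1 - e^{-s}}{s} \right)^{1/2} e^{-s/2}, \qquad s \in \mathbb{R}_+. \]
Proof. Since $\int_{\mathbb{R}_+} s^k e^{-s} \, ds = k!$, $k \in \mathbb{N}$, we have
\begin{align*}
K\varphi_1(s) &= \frac{1}{(\log 2)^{1/2} (e^s - 1)^{1/2}} \int_{\mathbb{R}_+} J_1\bigl(2\sqrt{st}\bigr)\, t^{-1/2} e^{-t} \, dt \\
&= \frac{s^{1/2}}{(\log 2)^{1/2} (e^s - 1)^{1/2}} \sum_{k \in \mathbb{N}} \frac{(-1)^k s^k}{k!\,(k + 1)!} \int_{\mathbb{R}_+} t^k e^{-t} \, dt \\
&= \frac{s^{1/2} (1 - e^{-s})}{(\log 2)^{1/2} (e^s - 1)^{1/2}\, s} = \varphi_1(s), \qquad s \in \mathbb{R}_+,
\end{align*}
and
\begin{align*}
\|\varphi_1\|_2^2 &= \frac{1}{\log 2} \int_{\mathbb{R}_+} \frac{(1 - e^{-s})\, e^{-s}}{s} \, ds \\
&= \frac{1}{\log 2} \sum_{k \in \mathbb{N}_+} \frac{(-1)^{k+1}}{k!} \int_{\mathbb{R}_+} s^{k-1} e^{-s} \, ds \\
&= \frac{1}{\log 2} \sum_{k \in \mathbb{N}_+} \frac{(-1)^{k+1}}{k} = 1.
\end{align*}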
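The normalization of φ_1 can also be confirmed by direct quadrature of the integral of (1 − e^{−s}) e^{−s}/s. The sketch below uses a composite Simpson rule on [0, 60], with the integrand extended by its limit 1 at s = 0; the cutoff and step count are illustrative choices, and the neglected tail is below e^{−60}.

```python
import math

def integrand(s):
    # (1 - e^{-s}) e^{-s} / s, extended by its limit 1 at s = 0.
    if s == 0.0:
        return 1.0
    return -math.expm1(-s) * math.exp(-s) / s

def simpson(func, a, b, n):
    # Composite Simpson rule with an even number n of subintervals.
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3.0

norm_sq = simpson(integrand, 0.0, 60.0, 60_000) / math.log(2.0)
print(norm_sq)   # squared L^2 norm of phi_1, expected to be close to 1
```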
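Proof. For any fixed $z$ with $\operatorname{Re} z > -1/2$ consider the function
\[ \varphi(s) = e^{-zs - s/2} \left( \frac{s}{1 - e^{-s}} \right)^{1/2}, \qquad s \in \mathbb{R}_+, \]
where
\[ e_j = (\varphi, \varphi_j) = \psi_j(z), \qquad j \in \mathbb{N}_+. \]
Parseval’s equation then yields
\[ \sum_{j \in \mathbb{N}_+} |e_j|^2 = \|\varphi\|_2^2. \]
But
\begin{align*}
\|\varphi\|_2^2 &= \int_{\mathbb{R}_+} \left| e^{-zs - s/2} \right|^2 \frac{s \, ds}{1 - e^{-s}} \\
&= \int_{\mathbb{R}_+} e^{-2s \operatorname{Re} z}\, \frac{s \, ds}{e^s - 1} = \sum_{j \in \mathbb{N}_+} \int_{\mathbb{R}_+} e^{-(2 \operatorname{Re} z + j) s}\, s \, ds \\
&= \sum_{j \in \mathbb{N}_+} \left[ -e^{-(2 \operatorname{Re} z + j) s} \left( \frac{s}{2 \operatorname{Re} z + j} + \frac{1}{(2 \operatorname{Re} z + j)^2} \right) \right]_0^\infty \\
&= \sum_{j \in \mathbb{N}_+} \frac{1}{(2 \operatorname{Re} z + j)^2}, \qquad \operatorname{Re} z > -1/2,
\end{align*}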
(−1)n+1 ψj (0)
ψj (−i − [i1 , . . . , in ] + z) = n+2 (1 − λj ) z + O(1)
λj
\[ \le \left( \frac{\pi^2 \log 2}{6} - 1 \right) |\lambda_\ell|^{n-1} \min(\gamma(A), 1 - \gamma(A)) \]
for any $a \in I$, $A \in \mathcal{B}_I$, $\ell \ge 2$, and $n \in \mathbb{N}_+$. (Clearly, $\sum_{j=2}^{1} = 0$.)
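\[ h_a(z) = \frac{a + 1}{(az + 1)^2}, \qquad \operatorname{Re} z > -1/2. \]
prove that this last equation holds with $A P_\lambda h_a$ given by (2.3.12). We have
\[ S^{-1}(A P_\lambda h_a)(s) = (a + 1)\, \frac{s}{1 - e^{-s}}\, e^{-s/2 - as}, \qquad s \in \mathbb{R}_+, \]
\begin{align*}
\bigl(M S^{-1} A P_\lambda h_a\bigr)(z) &= (a + 1) \int_{\mathbb{R}_+} \frac{s e^{-s} e^{-(z + a)s}}{1 - e^{-s}} \, ds \\
&= (a + 1) \sum_{j \in \mathbb{N}_+} \int_{\mathbb{R}_+} s e^{-(z + j + a)s} \, ds \\
&= (a + 1) \sum_{j \in \mathbb{N}_+} \frac{1}{(z + j + a)^2}
\end{align*}
\[ P_\lambda^n h_a(x) - \frac{1}{(x + 1)\log 2} = (a + 1) \sum_{j \ge 2} \lambda_j^{n-1} \psi_j(a)\, \psi_j(x), \qquad x \in I. \tag{2.3.14} \]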
Since
\[ \int_I \gamma(da)\, \gamma_a\bigl(\tau^{-n}(A)\bigr) = \gamma(\tau^{-n}(A)) = \gamma(A), \qquad n \in \mathbb{N},\ A \in \mathcal{B}_I, \]
\[ \le \zeta(2) - \frac{1}{\log 2} \tag{2.3.15} \]
Note that
\[ \frac{\pi^2 \log 2}{6} - 1 = 0.14018\cdots = \varepsilon_2 \]
(cf. Subsection 1.3.6). □
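Corollary 2.3.6 For any $a, x \in I$, $n \in \mathbb{N}_+$, and $\ell \ge 2$ we have
\[ \gamma_a(\tau^n < x) - \gamma([0, x]) = (a + 1) \sum_{j \ge 2} \lambda_j^{n-1} \psi_j(a) \int_0^x \psi_j \, d\lambda, \]
\[ \frac{d}{dx} \gamma_a(\tau^n < x) - \frac{1}{(x + 1)\log 2} = (a + 1) \sum_{j \ge 2} \lambda_j^{n-1} \psi_j(a)\, \psi_j(x), \]
\begin{align*}
&\left| \gamma_a(\tau^n < x) - \gamma([0, x]) - (a + 1) \sum_{j=2}^{\ell - 1} \lambda_j^{n-1} \psi_j(a) \int_0^x \psi_j \, d\lambda \right| \\
&\qquad \le \left( \frac{\pi^2 \log 2}{6} - 1 \right) |\lambda_\ell|^{n-1} \left( \frac12 - \left| \frac12 - \gamma([0, x]) \right| \right),
\end{align*}
\begin{align*}
&\left| \frac{d}{dx} \gamma_a(\tau^n < x) - \frac{1}{(x + 1)\log 2} - (a + 1) \sum_{j=2}^{\ell - 1} \lambda_j^{n-1} \psi_j(a)\, \psi_j(x) \right| \\
&\qquad \le \left( \frac{\pi^2}{6} - \frac{1}{\log 2} \right) |\lambda_\ell|^{n-1}\, \frac{1}{x + 1}.
\end{align*}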
Next (cf. Corollary 1.2.5), for any $a \in I$, $n, k \in \mathbb{N}_+$, and $i^{(k)} \in \mathbb{N}_+^k$ we have
\[ \left| \frac{\gamma_a\bigl( (a_{n+1}, \cdots, a_{n+k}) = i^{(k)} \bigr)}{\gamma\bigl( [u(i^{(k)}), v(i^{(k)})] \bigr)} - 1 \right| \le \left( \frac{\pi^2 \log 2}{6} - 1 \right) \lambda_0^{n-1}, \]
where
\[ f_a(x) = \frac{x + 1}{(ax + 1)^2}, \qquad a, x \in I. \]
Differentiating (2.3.16) with respect to $x$ and then putting $x = a$ yield
In particular, $\psi_2^2(0) = -\lambda_0\, G(1)\, \tilde\psi'(0) = \lambda_0\, G(1)\, U^\infty \Psi \ne 0$ (since $G(1) > 0$).
Now, it follows from (2.3.16) that for any $x \in I$ such that $\tilde\psi'(x) \ne 0$ the ratio
$\psi_2(x)/\tilde\psi'(x)$ has a constant value equal to $-(\operatorname{sgn} \psi_2(0)) (\lambda_0 G(1)/U^\infty \Psi)^{1/2}$,
and that for any $a \in I$ such that $\psi_2(a) \ne 0$ the ratio $G(f_a')/\psi_2(a)$ has a
constant value equal to $G(1)/\psi_2(0)$. Then
\[ \tilde\psi(x) = -(\operatorname{sgn} \psi_2(0)) \left( \frac{U^\infty \Psi}{\lambda_0\, G(1)} \right)^{1/2} \int_0^x \psi_2 \, d\lambda \]
and
\[ \psi_2(x) = -(\operatorname{sgn} \psi_2(0)) \left( \frac{\lambda_0\, G(1)}{U^\infty \Psi} \right)^{1/2} \tilde\psi'(x) \]
for any $x \in I$.
Remark. It follows from Corollary 2.3.6 that the exact convergence rate
to 0 as $n \to \infty$ of
It is not difficult to check that the above formula yields $\psi_\gamma(2) = \varepsilon_2$. Otherwise
it seems to be of little value. □
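\[ \frac{dm}{dt} = \frac{t}{e^t - 1}, \qquad t \in \mathbb{R}_+. \]
Note that
\[ m(\mathbb{R}_+) = \int_{\mathbb{R}_+} t \sum_{k \in \mathbb{N}_+} e^{-kt} \, dt = \sum_{k \in \mathbb{N}_+} \frac{1}{k^2} = \zeta(2). \]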
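The total mass m(R₊) = ζ(2) = π²/6 can be verified by numerical integration of the density t/(eᵗ − 1). The sketch below uses a composite Simpson rule on [0, 80], with the density extended by its limit 1 at t = 0; the cutoff and step count are illustrative, and the neglected tail is far below double precision.

```python
import math

def density(t):
    # dm/dt = t / (e^t - 1), extended by its limit 1 at t = 0.
    return 1.0 if t == 0.0 else t / math.expm1(t)

def simpson(func, a, b, n):
    # Composite Simpson rule with an even number n of subintervals.
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3.0

total_mass = simpson(density, 0.0, 80.0, 80_000)
print(total_mass, math.pi ** 2 / 6)
```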
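Consider the Hilbert space $L^2(\mathbb{R}_+, \mathcal{B}_{\mathbb{R}_+}, m) = L^2_m(\mathbb{R}_+)$ of $m$-square integrable
functions $f : \mathbb{R}_+ \to \mathbb{C}$ with inner product $(\cdot, \cdot)_m$ defined by
\[ (\varphi, \psi)_m = \int_{\mathbb{R}_+} \varphi \psi^* \, dm, \qquad \varphi, \psi \in L^2_m(\mathbb{R}_+), \]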
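Let $D$ denote the half-plane $\operatorname{Re} z > -1/2$ and consider the measure $\nu$
on $\mathcal{B}_D$ with density
\[ \frac{d\nu}{dx\,dy} = \begin{cases} \dfrac{1}{\pi}\, \dfrac{1}{(x + 1)^2 + y^2} & \text{if } -\dfrac12 < x < 0,\ y \in \mathbb{R}, \\[2ex] 0 & \text{otherwise.} \end{cases} \]
Note that
\[ \nu(D) = \frac{1}{\pi} \int_{-1/2}^{0} dx \int_{\mathbb{R}} \frac{dy}{(x + 1)^2 + y^2} = \int_{-1/2}^{0} \frac{dx}{x + 1} = \log 2. \]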
Consider the Hilbert space $H^2(\nu)$ of functions $f$ holomorphic in $D$ such
that $|(z + 1)^{-1} f(z)|$ is bounded in every half-plane $\operatorname{Re} z > -1/2 + \varepsilon$, $\varepsilon > 0$,
and
\[ \|f\|_{2,\nu} = \left( \int_D |f|^2 \, d\nu \right)^{1/2} < \infty, \]
and
\[ \|\tilde f\|_{2,\gamma} \le \|f\|_{2,\nu}. \tag{2.4.2} \]
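Alternatively,
\[ U^n g = \sum_{k \in \mathbb{N}_+} \lambda_k^n\, (g, M\hat\varphi_k)_\nu\, M\hat\varphi_k, \qquad n \in \mathbb{N}_+,\ g \in H^2(\nu). \]
and, by (2.4.1),
\[ (g, M\hat\varphi_1)_\nu\, M\hat\varphi_1 = \frac{1}{\log 2}\, (g, 1)_\nu = U^\infty \tilde g, \qquad g \in H^2(\nu). \]
As 0 is not an eigenvalue of $\hat K$, we also have
\[ M^{-1} g = \sum_{k \in \mathbb{N}_+} (M^{-1} g, \hat\varphi_k)_m\, \hat\varphi_k, \qquad g \in H^2(\nu), \]
or, alternatively,
\[ g = \sum_{k \in \mathbb{N}_+} (g, M\hat\varphi_k)_\nu\, M\hat\varphi_k, \qquad g \in H^2(\nu). \]
Then
\[ \|M^{-1} g\|_{2,m}^2 = \|g\|_{2,\nu}^2 = \sum_{k \in \mathbb{N}_+} |(M^{-1} g, \hat\varphi_k)_m|^2 = \sum_{k \in \mathbb{N}_+} |(g, M\hat\varphi_k)_\nu|^2 \]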
\[ |\mu(\tau^{-n}(A)) - \gamma(A)| \le (\log 2)\, \gamma^{1/2}(A) \left( \frac{1}{\pi} \iint_D |h(x + iy)|^2 \, dx\,dy - \frac{1}{\log 2} \right)^{1/2} |\lambda_2|^n \tag{2.4.7} \]
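But
\[ (I_A, U^n \tilde g - U^\infty \tilde g)_\gamma = \frac{1}{\log 2} \int_A \frac{U^n \tilde g(x) - U^\infty \tilde g}{x + 1} \, dx \]
and, by Proposition 2.1.5,
\[ \int_A \frac{U^n \tilde g(x) - U^\infty \tilde g}{x + 1} \, dx = \mu\bigl(\tau^{-n}(A)\bigr) - \gamma(A) \]
since
\[ U^\infty \tilde g = \frac{1}{\log 2} \int_I \frac{(x + 1)\, h(x)}{x + 1} \, dx = \frac{1}{\log 2}. \]
\[ \left| \mu\bigl(\tau^{-n}(A)\bigr) - \gamma(A) \right| \le (\log 2)\, \gamma^{1/2}(A)\, \|U^n \tilde g - U^\infty \tilde g\|_{2,\gamma} \tag{2.4.9} \]
for any $n \in \mathbb{N}_+$ and $A \in \mathcal{B}_I$. Now, (2.4.7) follows from (2.4.9) and Proposition 2.4.1. □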
\[ \le \left( \|g\|_{2,\nu}^2 - |U^\infty \tilde g|^2 \log 2 - \sum_{2 \le k \le \ell} |(g, M\hat\varphi_k)_\nu|^2 \right) |\lambda_{\ell+1}|^{2n} \]
with the usual convention which assigns value 0 to a sum over the empty
set. Proposition 2.4.1 and Corollary 2.4.2 can be accordingly generalized. □
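\[ = (z + 1)\, \psi_k(z), \qquad k \in \mathbb{N}_+,\ z \in D. \]
\[ g_a(z) = \frac{(a + 1)(z + 1)}{(az + 1)^2}, \qquad z \in D, \]
Hence
\[ (M^{-1} U g_a, \hat\varphi_k)_m = (a + 1) \int_{\mathbb{R}_+} e^{-at}\, \hat\varphi_k(t)\, m(dt) = \hat\psi_k(a), \qquad a \in I,\ k \in \mathbb{N}_+. \]
Therefore
\[ U^n g_a = \sum_{k \in \mathbb{N}_+} \lambda_k^{n-1}\, \hat\psi_k(a)\, \hat\psi_k, \qquad n \in \mathbb{N}_+,\ a \in I, \tag{2.4.12} \]
\begin{align*}
&= (a + 1)^2 \sum_{k \in \mathbb{N}_+} \int_{\mathbb{R}_+} t e^{-(2a + k)t} \, dt \tag{2.4.13} \\
&= (a + 1)^2 \sum_{k \in \mathbb{N}_+} \frac{1}{(2a + k)^2},
\end{align*}
\[ \le \left( \|U g_a\|_{2,\nu}^2 - |U^\infty \tilde g_a|^2 \log 2 \right)^{1/2} |\lambda_2|^{n-1}. \tag{2.4.15} \]
is just the Radon–Nikodym derivative $d\gamma_a/d\lambda$. Now, (2.4.16) follows from
(2.4.9) and (2.4.13) through (2.4.15). □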
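Remarks. 1. On account of the remark following Corollary 2.4.2, inequality
(2.4.16) can be generalized as follows. For any $a \in I$, $\ell, n \in \mathbb{N}_+$,
and $A \in \mathcal{B}_I$ we have
\begin{align*}
&\left| \gamma_a\bigl(\tau^{-n}(A)\bigr) - \gamma(A) - (\log 2) \sum_{2 \le k \le \ell} \lambda_k^{n-1}\, \hat\psi_k(a) \int_A \hat\psi_k \, d\gamma \right| \tag{2.4.17} \\
&\qquad \le (\log 2)\, \gamma^{1/2}(A) \left( (a + 1)^2 \sum_{k \in \mathbb{N}_+} \frac{1}{(2a + k)^2} - \sum_{1 \le k \le \ell} \hat\psi_k^2(a) \right)^{1/2} |\lambda_{\ell+1}|^{n-1}.
\end{align*}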
\[ \alpha u_0 \le T^p x \le \beta u_0. \]
\[ D_1 = (z \in \mathbb{C} : |z - 1| < 3/2) \]
Both operators $P_\lambda$ and $U$ take $A(D_1)$ into itself. Obviously, for $f \in A(D_1)$
we define $P_\lambda f$ and $U f$ by
\[ P_\lambda f(z) = \sum_{i \in \mathbb{N}_+} \frac{1}{(z + i)^2}\, f\!\left( \frac{1}{z + i} \right), \qquad z \in D_1, \]
and
\[ U f(z) = \sum_{i \in \mathbb{N}_+} P_i(z)\, f\!\left( \frac{1}{z + i} \right), \qquad z \in D_1, \]
and
\[ f_1(z) = \frac{(\log 2)^{-1}}{z + 1}, \qquad z \in D_1. \]
Since $P_\lambda(f_1 f) = f_1 U f$, $f \in A(D_1)$, the spectra of the operators $P_\lambda$ and $U$
on $A(D_1)$ are identical, algebraic multiplicities of the eigenvalues included.
Theorem 2.4.4 The spectra of $U$ on $A(D_1)$ and on $H^2(\nu)$ (see Subsection 2.4.1)
are identical, algebraic multiplicities of the eigenvalues included.
Consider the subspaces
\[ A^\perp(D_1) = \left( f \in A(D_1) : U^\infty f = \int_I f \, d\gamma = 0 \right) \]
and
\[ \tilde A^\perp(D_1) = \left( f \in A(D_1) : \int_I f \, d\lambda = 0 \right) \]
of $A(D_1)$ and the real subspaces $A_r^\perp(D_1)$ $\bigl(\tilde A_r^\perp(D_1)\bigr)$ of $A^\perp(D_1)$ $\bigl(\tilde A^\perp(D_1)\bigr)$
consisting of functions that take real values on $\mathbb{R} \cap D_1 = [-1/2, 5/2]$. Note
that by Proposition 2.1.1(ii) $U$ leaves invariant both subspaces $A^\perp(D_1)$ and
$A_r^\perp(D_1)$ while $P_\lambda$ leaves invariant both subspaces $\tilde A^\perp(D_1)$ and $\tilde A_r^\perp(D_1)$.
The complexification of $A_r^\perp(D_1)$ $\bigl(\tilde A_r^\perp(D_1)\bigr)$ is just $A^\perp(D_1)$ $\bigl(\tilde A^\perp(D_1)\bigr)$.
Also, the spectrum of $T_0$ on $\tilde A^\perp(D_1)$ is identical with the spectrum of $U$ on
$A^\perp(D_1)$.
The set
\[ C = \left( f \in A_r^\perp(D_1) : f' \ge 0 \text{ on } [-1/2, 5/2] \right) \]
is a reproducing cone in $A_r^\perp(D_1)$. Define $u_0 \in A(D_1)$ by
\[ u_0(z) = z + 1 - \frac{1}{\log 2}, \qquad z \in D_1. \]
Clearly, $u_0 \in C$.
Theorem 2.4.5 The operator $-U$ on $A_r^\perp(D_1)$ is positive with respect
to the cone $C$. Moreover, $-U$ is $u_0$-positive. Hence the operator $-U + U^\infty$
on $A(D_1)$ has a simple positive dominant eigenvalue equal to $\lambda_0$ (cf.
Theorem 2.2.5) with eigenfunction $f_2$ in the interior $C^o$ of $C$. There is no
other eigenfunction in $C$.
Corollary 2.4.6 The operator $-T_0$ on $\tilde A_r^\perp(D_1)$ is positive with respect to
the (reproducing) cone $f_1 C = (f_1 f : f \in C)$. Moreover, $-T_0$ is $f_1 u_0$-positive.
Hence the operator $-T_0$ on $A(D_1)$ has a simple positive dominant eigenvalue
equal to $\lambda_0$ with eigenfunction $\hat f_2 = f_1 f_2$. There is no other eigenfunction
in $f_1 C$.
Note that a minimax principle for $-\lambda_0$ holds. We namely have
\[ \min_{f \in C^o}\ \max_{-1/2 \le x \le 5/2} \frac{(U f)'(x)}{f'(x)} = -\lambda_0 = \max_{f \in C^o}\ \min_{-1/2 \le x \le 5/2} \frac{(U f)'(x)}{f'(x)}. \]
Hence
\[ \min_{-1/2 \le x \le 5/2} \frac{(U f)'(x)}{f'(x)} \le -\lambda_0 \le \max_{-1/2 \le x \le 5/2} \frac{(U f)'(x)}{f'(x)} \]
for any $f \in C^o$. For example, taking
\[ f(z) = \frac{z + 1}{z + 1.14617} - c, \qquad z \in D_1, \]
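\[ 0.2995 \le \lambda_0 \le 0.3038, \]
Moreover,
\[ \lambda(\beta + u) \le \left( \frac{\sqrt{5} - 1}{2} \right)^u \lambda(\beta), \qquad u \in \mathbb{R}_+. \]
(iv) The spectral radius $\rho(\beta)$ of the linear operator $T_{0\beta} : A_\infty(D_1) \to A_\infty(D_1)$
is strictly smaller than $\lambda(\beta)$, and for any $f \in A_\infty(D_1)$ such that
$f|_{[-1/2, 5/2]} > 0$ we have
\[ \frac{G_\beta^n f(z)}{\lambda^n(\beta)\, \ell_\beta(f)\, g_\beta(z)} = 1 + O\!\left( \left( \frac{\rho(\beta)}{\lambda(\beta)} \right)^n \right) \]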
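for some $\varphi \in L^2_{m'}(\mathbb{R}_+)$, the Hilbert space of $m'$-square integrable functions $\varphi : \mathbb{R}_+ \to \mathbb{C}$ with inner product $(\cdot,\cdot)_{m'}$ defined by
\[ (\varphi, \psi)_{m'} = \int_{\mathbb{R}_+} \varphi \psi^{*}\, dm', \quad \varphi, \psi \in L^2_{m'}(\mathbb{R}_+) \]
and norm
\[ \|\varphi\|_{2,m'} = \left( \int_{\mathbb{R}_+} |\varphi|^2\, dm' \right)^{1/2}, \quad \varphi \in L^2_{m'}(\mathbb{R}_+). \]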
where
\[ y_{i_1 \cdots i_n} = \frac{p_{n-1} + q_n + \sqrt{(p_{n-1} + q_n)^2 + 4(-1)^{n-1}}}{2} \]
with, as usual,
\[ \frac{p_n}{q_n} = [i_1, \cdots, i_n], \quad \text{g.c.d.}(p_n, q_n) = 1, \quad p_0 = 0, \]
and
λ1 (4) = 0.19945 88183 43767 26019 ··· ,
λ2 (4) = −0.07573 95140 84360 60892 · · · ,
λ3 (4) = 0.02856 64037 69818 52783 ··· ,
λ4 (4) = −0.01077 74165 76612 69829 · · · ,
λ5 (4) = 0.00407 09406 93426 42144 ··· .
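is defined by
\[ G_{\alpha,\beta} F(z, w) = \sum_{i \in \mathbb{N}_+} \frac{1}{(z+i)^{\alpha} (w+i)^{\beta}}\, F\!\left( \frac{1}{z+i}, \frac{1}{w+i} \right) \]
for any $F \in B^{\infty}(D_2)$ and $(z, w) \in D_2^2$. The spectral properties of $G_{\alpha,\beta}$, which is positive and nuclear of trace class, are strongly related to those of $G_{\alpha+\beta+2\ell}$, $\ell \in \mathbb{N}$. For details see Vallée (1997).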
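where
\[ \begin{aligned} u_{i_n \cdots i_1} &= u_{i_n} \circ \cdots \circ u_{i_1}, \\ P_{i_1 \cdots i_n}(x) &= P_{i_1}(x)\, P_{i_2}(u_{i_1}(x)) \cdots P_{i_n}(u_{i_{n-1} \cdots i_1}(x)), \quad n \ge 2, \end{aligned} \tag{2.5.2} \]
\[ U^n f(x) = E_x\bigl( f(s_n^x) \bigr) \]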
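Now, by (1.2.4), $I(i^{(n)})$ is the set of irrationals in the interval with endpoints $p_n/q_n$ and $(p_n + p_{n-1})/(q_n + q_{n-1})$. Since
\[ \frac{p_n}{q_n} = [i_1, \cdots, i_n] = \begin{cases} 1/i_1 & \text{if } n = 1, \\[4pt] \dfrac{1}{i_1 + p_{n-1}(i_2, \cdots, i_n)/q_{n-1}(i_2, \cdots, i_n)} & \text{if } n > 1 \end{cases} \]
and
\[ \frac{p_n + p_{n-1}}{q_n + q_{n-1}} = \begin{cases} 1/(i_1 + 1) & \text{if } n = 1, \\[4pt] [i_1, \cdots, i_{n-1}, i_n + 1] & \text{if } n > 1 \end{cases} = \begin{cases} 1/(i_1 + 1) & \text{if } n = 1, \\[4pt] \dfrac{1}{i_1 + p_n(i_2, \cdots, i_n, 1)/q_n(i_2, \cdots, i_n, 1)} & \text{if } n > 1 \end{cases} \]
we can write
\[ P_{i_1 \cdots i_n}(x) = (x+1) \times \frac{1}{q_{n-1}(i_2, \cdots, i_n)(x + i_1) + p_{n-1}(i_2, \cdots, i_n)} \times \frac{1}{q_n(i_2, \cdots, i_n, 1)(x + i_1) + p_n(i_2, \cdots, i_n, 1)} \tag{2.5.5} \]
for any $n \ge 2$, $i^{(n)} \in \mathbb{N}_+^n$, and $x \in I$.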
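The closed form (2.5.5) can be checked against the product formula (2.5.2) in exact rational arithmetic. The sketch below assumes the one-step quantities $P_i(x) = (x+1)/((x+i)(x+i+1))$ and $u_i(x) = 1/(x+i)$, which this excerpt does not restate; the function names are illustrative only.

```python
from fractions import Fraction

def convergents(digits):
    """Return (p_{n-1}, q_{n-1}, p_n, q_n) for [i_1, ..., i_n], with p_0 = 0, q_0 = 1."""
    p_prev, q_prev = 1, 0   # p_{-1}, q_{-1}
    p, q = 0, 1             # p_0, q_0
    for i in digits:
        p_prev, p = p, i * p + p_prev
        q_prev, q = q, i * q + q_prev
    return p_prev, q_prev, p, q

def P_product(digits, x):
    """Product formula (2.5.2), using the assumed P_i and u_i."""
    result = Fraction(1)
    for i in digits:
        result *= (x + 1) / ((x + i) * (x + i + 1))  # P_i at the current point
        x = 1 / (x + i)                              # move to u_i(x)
    return result

def P_closed(digits, x):
    """Closed form (2.5.5); requires n >= 2 digits."""
    i1, tail = digits[0], list(digits[1:])
    _, _, p_a, q_a = convergents(tail)         # p_{n-1}, q_{n-1} of (i_2, ..., i_n)
    _, _, p_b, q_b = convergents(tail + [1])   # p_n, q_n of (i_2, ..., i_n, 1)
    return (x + 1) / ((q_a * (x + i1) + p_a) * (q_b * (x + i1) + p_b))
```

For instance, `P_closed([1, 1], Fraction(0))` and `P_product([1, 1], Fraction(0))` both evaluate the same quantity $\alpha_{11}$.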
A useful alternative representation of U n f, n ∈ N+ , when f ∈ BV (I) is
available.
Proposition 2.5.1 If $f \in BV(I)$ then for any $n \in \mathbb{N}_+$ and $x \in I$ we have
\[ U^n f(x) = \int_{[0,1)} U^n I_{(a,1]}(x)\, df(a) + f(0) \]
with $\int_{[0,x)} df = f(x) - f(0)$, $x \in I$.
Proof. Since $f$ can be represented as the difference of two non-decreasing functions, we may and shall assume that $f$ is non-decreasing. Then for any $x \in I$ we have
\[ f(x) - f(0) = \int_{[0,1)} I_{(a,1]}(x)\, df(a). \]
where the first three upper bounds are taken over non-constant functions f ,
and f ↑ (↓) means that f is non-decreasing (non-increasing).
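Proof. It is clear that
\[ \sup_{f \in B(I),\, f\uparrow} \frac{\operatorname{var} U^n f}{\operatorname{var} f} = \sup_{f \in B(I),\, f\downarrow} \frac{\operatorname{var} U^n f}{\operatorname{var} f} \]
since
\[ \frac{\operatorname{var} U^n(-f)}{\operatorname{var}(-f)} = \frac{\operatorname{var} U^n f}{\operatorname{var} f}. \]
Next, let
\[ v_n = \sup_{f \in B(I),\, f\uparrow} \frac{\operatorname{var} U^n f}{\operatorname{var} f}, \quad n \in \mathbb{N}_+. \]
Then (cf. the proof of Corollary 2.1.13) for any non-constant $f \in BV(I)$ there exist two non-decreasing functions $f_1$ and $f_2$ such that $f = f_1 - f_2$ and $\operatorname{var} f = \operatorname{var} f_1 + \operatorname{var} f_2$. Therefore
\[ \operatorname{var} U^n f \le \operatorname{var} U^n f_1 + \operatorname{var} U^n f_2 \]
Hence
\[ \sup_{f \in BV(I)} \frac{\operatorname{var} U^n f}{\operatorname{var} f} \le v_n \]
and since
\[ \sup_{f \in BV(I)} \frac{\operatorname{var} U^n f}{\operatorname{var} f} \ge \sup_{f \in B(I),\, f\uparrow} \frac{\operatorname{var} U^n f}{\operatorname{var} f} = v_n, \]
\[ \operatorname{var} U^n I_{(a,1]} \le \sup_{f \in B(I),\, f\uparrow} \frac{\operatorname{var} U^n f}{\operatorname{var} f} \le \sup_{a \in (0,1]} \operatorname{var} U^n I_{(a,1]} \]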
where $g = [1, 1, 1, \cdots] = (\sqrt{5} - 1)/2 = 0.6180339\cdots$.
Without any loss of generality, throughout this subsection we assume that $f \in BV(I)$ is non-decreasing. To simplify the writing put
\[ P_{i_1 \cdots i_n}(0) = \alpha_{i_1 \cdots i_n}, \quad u_{i_1 \cdots i_n}(0) = \beta_{i_1 \cdots i_n}, \quad i_1, \cdots, i_n \in \mathbb{N}_+. \]
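\[ \begin{aligned} &= \sum_{i_1, \cdots, i_n \in \mathbb{N}_+} \bigl[ P_{i_1 \cdots i_n}(0)\, f(u_{i_n \cdots i_1}(0)) - P_{i_1 \cdots i_n}(1)\, f(u_{i_n \cdots i_1}(1)) \bigr] \\ &= \sum_{i_1, \cdots, i_n \in \mathbb{N}_+} \bigl[ P_{i_1 \cdots i_n}(0)\, f(u_{i_n \cdots i_1}(0)) - 2 P_{(i_1+1) i_2 \cdots i_n}(0)\, f(u_{i_n \cdots i_2 (i_1+1)}(0)) \bigr] \\ &= \sum_{i_2, \cdots, i_n \in \mathbb{N}_+} \Bigl( \alpha_{1 i_2 \cdots i_n} f(\beta_{i_n \cdots i_2 1}) - \sum_{i_1 \in \mathbb{N}_+} \alpha_{(i_1+1) i_2 \cdots i_n} f(\beta_{i_n \cdots i_2 (i_1+1)}) \Bigr). \end{aligned} \]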
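It is easy to see that if $n$ is odd then $\operatorname{var} U^n I_{(a,1]}$ has a constant value for
\[ a \in \begin{cases} \left[ \dfrac{1}{j_1 + 1}, \dfrac{1}{j_1} \right) & \text{if } n = 1, \\[6pt] \bigl[\, [j_1, \cdots, j_{n-1}, j_n + 1],\ [j_1, \cdots, j_n] \,\bigr) & \text{if } n > 1 \end{cases} \]
while if $n$ is even then $\operatorname{var} U^n I_{(a,1]}$ has a constant value for
that is, in both cases, on the closure without the right endpoint of any fundamental interval $I(j^{(n)})$, $j^{(n)} = (j_1, \cdots, j_n) \in \mathbb{N}_+^n$. Write $1(n)$ for $(j_1, \cdots, j_n)$ with $j_k = 1$, $1 \le k \le n$, $n \in \mathbb{N}_+$. Then in particular for
that is,
\[ a \in \left[ \frac{F_{2m+1}}{F_{2m+2}}, \frac{F_{2m}}{F_{2m+1}} \right), \quad m \in \mathbb{N}, \tag{2.5.8} \]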
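we have
and
\[ \begin{aligned} v'_{2m+1} &:= \operatorname{var} U^{2m+1} I_{(a,1]} \\ &= \sum_{q=0}^{m-2}\ \sum_{i_2, \cdots, i_{2m-2q} \in \mathbb{N}_+} \Bigl[ \alpha_{1 i_2 i_3 \cdots i_{2m-2q-1} (i_{2m-2q}+1) 1 \cdots 1} - \sum_{i_1 \in \mathbb{N}_+} \alpha_{(i_1+1) i_2 i_3 \cdots i_{2m-2q-1} (i_{2m-2q}+1) 1 \cdots 1} \Bigr] \\ &\quad + \sum_{i_2 \in \mathbb{N}_+} \Bigl( \alpha_{1 i_2 1 \cdots 1} - \sum_{i_1 \in \mathbb{N}_+} \alpha_{(i_1+1) i_2 1 \cdots 1} \Bigr) + \sum_{i_1 \in \mathbb{N}_+} \alpha_{(i_1+1) 1 \cdots 1} \end{aligned} \]
for $m \ge 2$. (In the last equation the number of subscripts of the $\alpha$'s is $2m + 1$.) Similarly, for
that is,
\[ a \in \left[ \frac{F_{2m+1}}{F_{2m+2}}, \frac{F_{2m+2}}{F_{2m+3}} \right), \quad m \in \mathbb{N}, \tag{2.5.9} \]
we have
\[ v'_2 := \operatorname{var} U^2 I_{(a,1]} = \sum_{i_1 \in \mathbb{N}_+} \alpha_{(i_1+1)1}, \]
\[ \begin{aligned} v'_{2m+2} &:= \operatorname{var} U^{2m+2} I_{(a,1]} \\ &= \sum_{q=0}^{m-1}\ \sum_{i_2, \cdots, i_{2m-2q+1} \in \mathbb{N}_+} \Bigl[ \sum_{i_1 \in \mathbb{N}_+} \alpha_{(i_1+1) i_2 i_3 \cdots i_{2m-2q} (i_{2m-2q+1}+1) 1 \cdots 1} - \alpha_{1 i_2 i_3 \cdots i_{2m-2q} (i_{2m-2q+1}+1) 1 \cdots 1} \Bigr] \\ &\quad + \sum_{i_1 \in \mathbb{N}_+} \alpha_{(i_1+1) 1 \cdots 1} \end{aligned} \]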
for m ∈ N+ . (In the last equation the number of subscripts of the α’s is
2m + 2.)
Since $g$ belongs to all intervals (2.5.8) and (2.5.9), the UB Conjecture amounts to
\[ v_n = v'_n, \quad n \in \mathbb{N}_+. \]
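The case $n = 1$. This case was dealt with in Proposition 2.1.12. Actually, writing $i$ for $i_1$, equation (2.5.6) yields
\[ \operatorname{var} U f = \alpha_1 f(\beta_1) - \sum_{i \in \mathbb{N}_+} \alpha_{i+1} f(\beta_{i+1}). \]
Hence
\[ \operatorname{var} U I_{(a,1]} = \frac{1}{i+1} \]
for $a \in \left[ \dfrac{1}{i+1}, \dfrac{1}{i} \right)$, $i \in \mathbb{N}_+$, and
\[ v_1 = \sup_{a \in [0,1)} \operatorname{var} U I_{(a,1]} = \frac{1}{2} = \operatorname{var} U I_{(g,1]} = v'_1 \]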
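The value $\operatorname{var} U I_{(a,1]} = 1/(i+1)$ on $[1/(i+1), 1/i)$ can be reproduced by direct summation. This is a minimal sketch assuming $\alpha_i = P_i(0) = 1/(i(i+1))$ and $\beta_i = u_i(0) = 1/i$, which follow from the usual one-step formulas but are not restated in this excerpt; the function name is hypothetical.

```python
from fractions import Fraction

def var_U_indicator(a):
    """var U I_(a,1] = alpha_1 f(beta_1) - sum_i alpha_{i+1} f(beta_{i+1}) for f = I_(a,1]."""
    if not 0 < a < 1:
        raise ValueError("a must lie in (0, 1)")
    alpha = lambda i: Fraction(1, i * (i + 1))   # assumed: alpha_i = P_i(0)
    total = alpha(1)                             # f(beta_1) = f(1) = 1 because a < 1
    i = 1
    while Fraction(1, i + 1) > a:                # f(beta_{i+1}) = 1 iff 1/(i+1) > a
        total -= alpha(i + 1)
        i += 1
    return total
```

The loop stops after finitely many steps because $f(\beta_{i+1})$ vanishes as soon as $1/(i+1) \le a$.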
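But
\[ \sum_{i \in \mathbb{N}_+} \alpha_{(i+1)(j+1)} = \sum_{i \in \mathbb{N}_+} \frac{1}{\bigl( (i+1)(j+1) + 1 \bigr) \bigl( (i+1)(j+2) + 1 \bigr)} \le \frac{1}{(j+1)(j+2)} \sum_{i \in \mathbb{N}_+} \frac{1}{(i+1)^2} \tag{2.5.11} \]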
for any non-decreasing f ∈ B(I). Now, note that for f = I(a,1] with
a ∈ [1/2, 2/3), in particular for a = g, we have
\[ \operatorname{var} U^2 I_{(a,1]} = \sum_{i \in \mathbb{N}_+} \alpha_{(i+1)1}, \]
X X
+ α1i2 i3 ···in − α(i1 +1)(i2 +1)i3 ···in f (βin ···i3 i21 )
i2 ∈N+ i1 ∈N
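\[ \le \sum_{i_3, \cdots, i_n \in \mathbb{N}_+} \Bigl[ \sum_{i_2 \in \mathbb{N}_+} \Bigl( \alpha_{1 i_2 i_3 \cdots i_n} - \sum_{i_1 \in \mathbb{N}_+} \alpha_{(i_1+1) i_2 i_3 \cdots i_n} \Bigr) f(\beta_{i_n \cdots i_3}) + \sum_{i_1 \in \mathbb{N}_+} \alpha_{(i_1+1) 1 i_3 \cdots i_n} \bigl( f(\beta_{i_n \cdots i_3}) - f(\beta_{i_n \cdots i_3 1}) \bigr) \Bigr]. \tag{2.5.14} \]
For an even $n$ the corresponding inequality is
\[ \operatorname{var} U^n f \le \sum_{i_3, \cdots, i_n \in \mathbb{N}_+} \Bigl[ \sum_{i_2 \in \mathbb{N}_+} \Bigl( \sum_{i_1 \in \mathbb{N}_+} \alpha_{(i_1+1) i_2 i_3 \cdots i_n} - \alpha_{1 i_2 i_3 \cdots i_n} \Bigr) f(\beta_{i_n \cdots i_3}) + \sum_{i_1 \in \mathbb{N}_+} \alpha_{(i_1+1) 1 i_3 \cdots i_n} \bigl( f(\beta_{i_n \cdots i_3 1}) - f(\beta_{i_n \cdots i_3}) \bigr) \Bigr]. \tag{2.5.15} \]
Put
\[ \delta_{i_3 \cdots i_n} = (-1)^{n-1} \sum_{i_2 \in \mathbb{N}_+} \Bigl( \alpha_{1 i_2 i_3 \cdots i_n} - \sum_{i_1 \in \mathbb{N}_+} \alpha_{(i_1+1) i_2 i_3 \cdots i_n} \Bigr) \]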
\[ \psi(z+1) = \psi(z) + \frac{1}{z} \]
for $z \ne 0, -1, -2, \cdots$. Tables for $\psi$ can be found in Abramowitz and Stegun (1964).
Putting
\[ \delta^{(n)}(f) = \sum_{i_3, \cdots, i_n \in \mathbb{N}_+} \delta_{i_3 \cdots i_n} f(\beta_{i_n \cdots i_3}), \]
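for any $n \ge 3$. Here we used the fact that $\alpha_{(i_1+1) 1 i_3 \cdots i_n} < \alpha_{(i_1+1) 1 1 \cdots 1}$ for any $n \ge 3$ and $(i_3, \cdots, i_n) \ne 1(n-2)$, which follows at once from (2.5.5).
First, note that by (2.5.16) we have
\[ \delta^{(n)}(f) \le \frac{1}{2} \sum_{i_3, \cdots, i_n \in \mathbb{N}_+} |\delta_{i_3 \cdots i_n}|\, (f(1) - f(0)). \tag{2.5.18} \]
Since
\[ \frac{1}{2} \sum_{i_3, \cdots, i_n \in \mathbb{N}_+} |\delta_{i_3 \cdots i_n}| = \sup \sum_{(i_3, \cdots, i_n) \in A} \delta_{i_3 \cdots i_n}, \]
\[ \frac{1}{2} \sum_{i_3, \cdots, i_n \in \mathbb{N}_+} |\delta_{i_3 \cdots i_n}| \ge \frac{1}{2} \sum_{i \in \mathbb{N}_+} |\delta_i|. \]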
Hence the right hand side of (2.5.17) does not tend to 0 as n → ∞, and
(2.5.18) is useless for n ≥ 3. As a matter of fact, it is a general result which
does not take into account that f is non-decreasing.
If for some given $n \ge 3$ the inequality
for any $a \in [0, 1)$. It is easy to see that the right hand side of (2.5.20) is equal to $v'_n$. Since whatever $n \in \mathbb{N}_+$ we have
it follows from (2.5.20) that $v_n = v'_n$. Thus if (2.5.19) holds then for the given $n$ the UB Conjecture holds, too.
In particular for $n = 3$, writing $i, j, k$ for $i_1, i_2, i_3$, respectively, we have
\[ \alpha_{ijk} = \frac{1}{\bigl( i(jk+1) + k \bigr) \bigl( i(j(k+1)+1) + k + 1 \bigr)}, \quad i, j, k \in \mathbb{N}_+. \]
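is positive for $k = 1$ and negative for $k > 1$. Then (2.5.19) clearly holds in this case. Hence the UB Conjecture holds for $n = 3$ and
\[ \begin{aligned} v_3 &= \delta_1 + \sum_{i \in \mathbb{N}_+} \alpha_{(i+1)11} \\ &= \sum_{j \in \mathbb{N}_+} \left( \frac{1}{(j+2)(2j+3)} + \psi\!\left( 2 + \frac{1}{j+1} \right) - \psi\!\left( 2 + \frac{2}{2j+1} \right) \right) + \psi\!\left( 2 + \frac{2}{3} \right) - \psi\!\left( 2 + \frac{1}{2} \right) \\ &= \log 4 - \frac{7}{6} + \sum_{j \in \mathbb{N}_+} \left( \psi\!\left( 2 + \frac{1}{j+1} \right) - \psi\!\left( 2 + \frac{2}{2j+1} \right) \right) + \frac{3}{5} + \frac{3}{2} + \psi\!\left( \frac{2}{3} \right) - \frac{2}{3} - 2 - \psi\!\left( \frac{1}{2} \right) \\ &= \log 4 - \frac{7}{6} - \frac{17}{30} + \log \frac{4}{\sqrt{27}} + \frac{\pi}{2\sqrt{3}} + \sum_{j \in \mathbb{N}_+} \left( \psi\!\left( 2 + \frac{1}{j+1} \right) - \psi\!\left( 2 + \frac{2}{2j+1} \right) \right). \end{aligned} \]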
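Two of the simplifications used in the evaluation of $v_3$ can be sanity-checked numerically: $\sum_{j \ge 1} 1/((j+2)(2j+3)) = \log 4 - 7/6$ and $\psi(2/3) - \psi(1/2) = \log(4/\sqrt{27}) + \pi/(2\sqrt{3})$. The sketch below uses the standard series $\psi(x) - \psi(y) = \sum_{k \ge 0} \bigl( 1/(k+y) - 1/(k+x) \bigr)$, truncated; the truncation length and function names are arbitrary choices, not taken from the text.

```python
import math

def digamma_diff(x, y, terms=200000):
    """Truncated series for psi(x) - psi(y); the tail is of order (x - y) / terms."""
    return sum(1.0 / (k + y) - 1.0 / (k + x) for k in range(terms))

def rational_sum(terms=200000):
    """Partial sum of 1 / ((j + 2)(2j + 3)) over j = 1..terms; the tail is about 1 / (2 * terms)."""
    return sum(1.0 / ((j + 2) * (2 * j + 3)) for j in range(1, terms + 1))

# Closed forms appearing in the displayed chain of equalities.
closed_sum = math.log(4) - 7.0 / 6.0
closed_psi = math.log(4 / math.sqrt(27)) + math.pi / (2 * math.sqrt(3))
```

Both partial sums agree with the closed forms to within the stated truncation error.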
\[ + \sum_{i^{(n)} \in \mathbb{N}_+^n} \gamma_0(I(i^{(n)})) \bigl( I_{(a,1]}(u_{i_n \cdots i_1}(0)) - I_{(a,1]}(u_{i_n \cdots i_1}(1)) \bigr). \tag{2.5.22} \]
Note that if a ∈ Ω then just one of the differences
I(a,1] (uin ···i1 (0)) − I(a,1] (uin ···i1 (1)), i(n) ∈ Nn+ ,
(i1 < j1 ), if n = 1;
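and then
\[ \max_{i^{(n)} \in \mathbb{N}_+^n} \gamma(I(i^{(n)})) \le \frac{2}{3 \log 2}\, \sigma(n), \quad n \in \mathbb{N}_+. \tag{2.5.26} \]
Now, by (2.5.24) through (2.5.26), with $k_1 = \log 2 = 0.69315\cdots$, $k_2 = \zeta(2) \log 2 - 1 = 0.14018\cdots$, $\theta = g^2 = (\sqrt{5} - 1)^2/4 = (3 - \sqrt{5})/2 = 0.38196\cdots$, it follows from (2.5.23) that
\[ \begin{aligned} \operatorname{var} U^n I_{(a,1]} &\le \frac{4 k_2}{3 \log 2} \left( \lambda_0^{n-2} \sigma(0) + \lambda_0^{n-3} \sigma(1) + \cdots + \sigma(n-2) + \frac{k_1}{2 k_2}\, \sigma(n-1) \right) + \sigma(n) \\ &= \frac{4 k_2}{3 \log 2}\, \sigma(n-1) \left( \lambda_0^{n-2} \frac{\sigma(0)}{\sigma(n-1)} + \lambda_0^{n-3} \frac{\sigma(1)}{\sigma(n-1)} + \cdots + \frac{\sigma(n-2)}{\sigma(n-1)} + \frac{k_1}{2 k_2} \right) + \sigma(n). \end{aligned} \]
Since
\[ \frac{\sigma(k)}{\sigma(n)} \le \frac{1}{2}\, \theta^{k-n-1}, \quad k, n \in \mathbb{N}, \]
and
\[ \frac{\sigma(n-1)}{\sigma(n)} \le \frac{8}{3}, \quad n \ge 3, \]
we finally obtain
\[ \operatorname{var} U^n I_{(a,1]} \le \left( 1 + \frac{16 k_1}{9 \log 2} + \frac{16 k_2}{9 \theta (\theta - \lambda_0) \log 2} \right) \sigma(n). \]
We have
\[ \begin{aligned} 1 + \frac{16 k_1}{9 \log 2} + \frac{16 k_2}{9 \theta (\theta - \lambda_0) \log 2} &= 1 + \frac{16}{9 \log 2} \left( k_1 + \frac{k_2}{\theta (\theta - \lambda_0)} \right) \\ &= 1 + \frac{32}{9 \log 2} \left( \frac{\log 2}{2} + \frac{\zeta(2) \log 2 - 1}{7 - 3\sqrt{5} - (3 - \sqrt{5}) \lambda_0} \right) \end{aligned} \]
and the proof is complete for any odd $n$. The case of an even $n$ can be treated similarly. □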
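The constant in the last display can be evaluated numerically. The sketch below compares the two equivalent forms of the constant; the numerical value used for $\lambda_0$ (approximately $0.3036630029$) is an assumption supplied here, chosen inside the interval $0.2995 \le \lambda_0 \le 0.3038$ quoted earlier, and the function names are illustrative.

```python
import math

LOG2 = math.log(2)
ZETA2 = math.pi ** 2 / 6          # zeta(2)

def constant_first_form(lambda0):
    """1 + 16 k1 / (9 log 2) + 16 k2 / (9 theta (theta - lambda0) log 2)."""
    k1 = LOG2
    k2 = ZETA2 * LOG2 - 1
    theta = (3 - math.sqrt(5)) / 2
    return 1 + 16 * k1 / (9 * LOG2) + 16 * k2 / (9 * theta * (theta - lambda0) * LOG2)

def constant_second_form(lambda0):
    """1 + 32/(9 log 2) * (log 2 / 2 + (zeta(2) log 2 - 1) / (7 - 3 sqrt5 - (3 - sqrt5) lambda0))."""
    s5 = math.sqrt(5)
    denom = 7 - 3 * s5 - (3 - s5) * lambda0
    return 1 + 32 / (9 * LOG2) * (LOG2 / 2 + (ZETA2 * LOG2 - 1) / denom)

lambda0_assumed = 0.3036630029    # assumed numerical value of lambda_0
k0_estimate = constant_first_form(lambda0_assumed)
```

With this assumed $\lambda_0$ the constant comes out a little below 15, and the two forms agree to rounding error.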
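Corollary 2.5.4 Let $f \in BV(I)$. For any $n \in \mathbb{N}$ we have
\[ \| U^n f - U^{\infty} f \| \le \frac{k_0 \operatorname{var} f}{F_n F_{n+1}}. \]
\[ \| U^n f - U^{\infty} f \| \le \operatorname{var} U^n f, \quad n \in \mathbb{N}, \]
and the result stated is implied by Theorem 2.5.3 for $n \in \mathbb{N}_+$. The case $n = 0$ can be checked directly. □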
Remark. It was claimed in Iosifescu (1997, p.76) that Theorem 2.5.3
holds with k0 = 1/ log 2 for all n ∈ N large enough. (This is clearly true
for n = 1, 2, or 3.) A flaw detected by Adriana Berechet in the method
of proof in that paper invalidates the conclusion. We conjecture, however,
that both Theorem 2.5.3 and Corollary 2.5.4 hold with k0 = 1/ log 2 for any
n ∈ N. 2
\[ \frac{a+1}{2 (F_n + a F_{n-1})(F_{n+1} + a F_n)} \le \sup_{x \in I} |\gamma_a(s_n^a \le x) - G(x)| \le \frac{k_0}{F_n F_{n+1}}. \]
Proof. (i) The upper bound. We have already used in Subsection 2.5.1 the property of $U$ of being the transition operator of the Markov chain $(s_n^a)_{n \in \mathbb{N}}$ for any $a \in I$. Therefore in particular
\[ U^n I_{[0,x]}(a) = E_a\bigl( I_{[0,x]}(s_n^a) \bigr) = \gamma_a(s_n^a \le x) \]
Hence
\[ \sup_{x \in I} |\gamma_a(s_n^a \le x) - G(x)| \ge \frac{1}{2} \sup_{s \in I} \gamma_a(s_n^a = s) \tag{2.5.27} \]
for any $a \in I$ and $n \in \mathbb{N}$.
Next, recall (see Subsection 2.5.1) that
for any $a \in I$ and $(i_1, \cdots, i_n) = i^{(n)} \in \mathbb{N}_+^n$. By (2.5.5) and (2.5.27) we then have
\[ \sup_{s \in I} \gamma_a(s_n^a = s) = P_{1(n)}(a), \quad a \in I,\ n \in \mathbb{N}_+, \tag{2.5.28} \]
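The argument used in the proof of Corollary 2.5.4, and Theorem 2.5.3 yield
\[ \| U^n f - U^{\infty} f \|_v = \| U^n f - U^{\infty} f \| + \operatorname{var} U^n f \le 2 \operatorname{var} U^n f \le \frac{4 k_0}{F_n F_{n+1}} \operatorname{var} f \le \frac{4 k_0}{F_n F_{n+1}} \| f \|_v \]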
for any $n \in \mathbb{N}$ and $f \in BV(I)$. (We took into account that, as mentioned at the beginning of this section, here $f$ is complex-valued. See the proof of Proposition 2.1.16.) Hence
\[ \lim_{n \to \infty} \| U^n - U^{\infty} \|_v^{1/n} \le g^2. \]
γa (τ n ≤ x, san ≤ y)
\[ \frac{a+1}{2 (F_n + a F_{n-1})(F_{n+1} + a F_n)} \le \sup_{x, y \in I} \left| \gamma_a(\tau^n \le x,\ s_n^a \le y) - \frac{\log(xy + 1)}{\log 2} \right| \le \frac{k_0}{F_n F_{n+1}}. \]
Proof. Set $G_n^a(y) = \gamma_a(s_n^a \le y)$, $H_n^a(y) = G_n^a(y) - G(y)$, $a, y \in I$, $n \in \mathbb{N}$. Theorem 2.5.5 yields
\[ |H_n^a(y)| \le \frac{k_0}{F_n F_{n+1}}, \quad a, y \in I,\ n \in \mathbb{N}. \tag{2.5.32} \]
and $n \in \mathbb{N}$ we have
\[ \begin{aligned} \gamma_a(\tau^n \le x,\ s_n^a \le y) &= \int_0^y \gamma_a(\tau^n \le x \mid s_n^a = z)\, dG_n^a(z) \\ &= \int_0^y \frac{(z+1)x}{zx+1}\, dG_n^a(z) \\ &= \frac{1}{\log 2} \int_0^y \frac{(z+1)x}{zx+1}\, \frac{dz}{z+1} + \int_0^y \frac{(z+1)x}{zx+1}\, dH_n^a(z) \\ &= \frac{\log(xy+1)}{\log 2} + \left. \frac{(z+1)x}{zx+1}\, H_n^a(z) \right|_{z=0}^{z=y} - \int_0^y \frac{x - x^2}{(zx+1)^2}\, H_n^a(z)\, dz. \end{aligned} \]
[When applying formula (1.3.21) we used the fact that the $\sigma$-algebras generated by $(a_1, \cdots, a_n)$ and by $s_n^a$ are identical for any $a \in I$ and $n \in \mathbb{N}_+$.] Hence, by (2.5.32),
\[ \left| \gamma_a(\tau^n \le x,\ s_n^a \le y) - \frac{\log(xy+1)}{\log 2} \right| \le \frac{k_0}{F_n F_{n+1}} \left( \frac{(y+1)x}{xy+1} + \frac{(x - x^2)y}{xy+1} \right) \le \frac{k_0}{F_n F_{n+1}} \]
for any $a, x, y \in I$ and $n \in \mathbb{N}$, so that the upper bound holds.
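The last inequality rests on the bracketed factor $\frac{(y+1)x}{xy+1} + \frac{(x-x^2)y}{xy+1}$ not exceeding 1 on the unit square. A short exact check on a rational grid is sketched below; it also tests the algebraic identity $1 - \text{bracket} = (1-x)(1-xy)/(1+xy)$, which is an observation added here rather than a formula from the text.

```python
from fractions import Fraction

def bracket(x, y):
    """The factor multiplying k0 / (F_n F_{n+1}) in the upper bound."""
    return ((y + 1) * x + (x - x * x) * y) / (x * y + 1)

def slack(x, y):
    """Claimed closed form of 1 - bracket(x, y); non-negative on the unit square."""
    return (1 - x) * (1 - x * y) / (1 + x * y)

grid = [Fraction(k, 20) for k in range(21)]           # rational points in [0, 1]
worst = max(bracket(x, y) for x in grid for y in grid)
```

On this grid the maximum is attained at $x = 1$, where the factor equals 1 for every $y$.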
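To get the lower bound we note that by Theorem 2.5.5 for any $a \in I$ and $n \in \mathbb{N}$ we have
\[ \begin{aligned} \sup_{x, y \in I} \left| \gamma_a(\tau^n \le x,\ s_n^a \le y) - \frac{\log(xy+1)}{\log 2} \right| &\ge \sup_{y \in I} \left| \gamma_a(\tau^n \le 1,\ s_n^a \le y) - \frac{\log(y+1)}{\log 2} \right| \\ &= \sup_{y \in I} |\gamma_a(s_n^a \le y) - G(y)| \ge \frac{a+1}{2 (F_n + a F_{n-1})(F_{n+1} + a F_n)}. \end{aligned} \]
□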
Remarks. 1. We can replace $\gamma_a(\tau^n \le x,\ s_n^a \le y)$ by $\lambda(\tau^n \le x,\ s_n^a \le y)$ in the statement of Theorem 2.5.8 since it is possible to relate these quantities by noticing that $|s_n^a - s_n^0| \le 1/F_n^2$, $n \in \mathbb{N}$, $a \in I$. The new upper and lower bounds are of order $O(g^{2n})$ as $n \to \infty$, too.
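\[ \lim_{n \to \infty} \lambda([a_1, \cdots, a_n] \le x,\ s_n^0 \le y) = \frac{x \log(y+1)}{\log 2} \tag{2.5.33} \]
\[ \le \max\bigl( \mu(x - (F_n F_{n+1})^{-1} < \tau^0 \le x),\ \mu(x < \tau^0 \le x + (F_n F_{n+1})^{-1}) \bigr) \to 0 \]
for $n \in \mathbb{N}_+$. Clearly,
\[ \int_I f_n^a\, d\gamma_a = \int_I f\, d\gamma_a, \quad n \in \mathbb{N}. \tag{2.5.34} \]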
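Since for any $n \in \mathbb{N}_+$ and $x \in I \setminus E_n$ there is a unique $i^{(n)} \in \mathbb{N}_+^n$ such that $x \in I(i^{(n)})$ and since
\[ \max_{i^{(n)} \in \mathbb{N}_+^n} \gamma_a(I(i^{(n)})) \to 0 \]
Clearly,
\[ \Bigl| \sum_{i^{(k)} \in \mathbb{N}_+^k} \int_{I(i^{(k)})} (f - f_k^a)(h \circ s_n^a)\, d\gamma_a \Bigr| \le \| h \| \int_I |f - f_k^a|\, d\gamma_a. \tag{2.5.39} \]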
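\[ \gamma_a(I(i^{(k)})) = \frac{a+1}{(q_k + a p_k)(q_k + q_{k-1} + a(p_k + p_{k-1}))}, \]
where
\[ \frac{p_k}{q_k} = [i_1, \ldots, i_k], \quad \text{g.c.d.}(p_k, q_k) = 1, \quad k \in \mathbb{N}_+, \]
and $p_0 = 0$, $q_0 = 1$. With the change of variable
\[ u = \frac{p_k + t\, p_{k-1}}{q_k + t\, q_{k-1}}, \quad t \in I, \]
noting that
\[ s_n^a(u) = s_{n-k}^{a'}(t) \]
for $t \in \Omega$, where
\[ a' = \begin{cases} [i_k, \ldots, i_2, i_1 + a] & \text{if } k > 1, \\ 1/(i_1 + a) & \text{if } k = 1 \end{cases} \;=\; \frac{q_{k-1} + a p_{k-1}}{q_k + a p_k}, \]
we obtain
\[ \begin{aligned} \int_{I(i^{(k)})} h(s_n^a(u))\, \gamma_a(du) &= (a+1) \int_{I(i^{(k)})} \frac{h(s_n^a(u))\, du}{(au+1)^2} \\ &= (a+1) \int_I \frac{h(s_{n-k}^{a'}(t))\, dt}{\bigl( t(q_{k-1} + a p_{k-1}) + q_k + a p_k \bigr)^2}. \end{aligned} \]
Hence
\[ \begin{aligned} \frac{1}{\gamma_a(I(i^{(k)}))} \int_{I(i^{(k)})} h(s_n^a(u))\, \gamma_a(du) &= (a'+1) \int_I \frac{h(s_{n-k}^{a'}(t))\, dt}{(a' t + 1)^2} \\ &= \int_I (h \circ s_{n-k}^{a'})\, d\gamma_{a'} = \int_I h(v)\, dG_{n-k}^{a'}(v), \end{aligned} \tag{2.5.41} \]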
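where $G_m^{a'}(v) = \gamma_{a'}(s_m^{a'} < v)$, $m \in \mathbb{N}$, $v \in I$. By Theorem 2.5.5 we have
\[ |G_m^a(v) - G(v)| \le \frac{k_0}{F_m F_{m+1}} \]
for any $a, v \in I$ and $m \in \mathbb{N}$. Then
\[ \left| \int_I h(v)\, dG_{n-k}^{a'}(v) - \int_I h\, d\gamma \right| = \left| \int_I G_{n-k}^{a'}(v)\, dh(v) - \int_I G(v)\, dh(v) \right| \le \frac{k_0 \operatorname{var} h}{F_{n-k} F_{n-k+1}}. \tag{2.5.42} \]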
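The two algebraic facts behind the change of variable, namely $a' = [i_k, \ldots, i_2, i_1 + a] = (q_{k-1} + a p_{k-1})/(q_k + a p_k)$ and the equality of the transformed density with $(a'+1)/(a't+1)^2$ after division by $\gamma_a(I(i^{(k)}))$, can be verified exactly. The following is a small sketch with illustrative helper names, using the recurrences $p_n = i_n p_{n-1} + p_{n-2}$, $q_n = i_n q_{n-1} + q_{n-2}$.

```python
from fractions import Fraction

def convergents(digits):
    """Return (p_{k-1}, q_{k-1}, p_k, q_k) for [i_1, ..., i_k], with p_0 = 0, q_0 = 1."""
    p_prev, q_prev, p, q = 1, 0, 0, 1
    for i in digits:
        p_prev, p = p, i * p + p_prev
        q_prev, q = q, i * q + q_prev
    return p_prev, q_prev, p, q

def a_prime_cf(digits, a):
    """[i_k, ..., i_2, i_1 + a], built from the innermost term 1/(i_1 + a) outward."""
    value = 1 / (digits[0] + a)
    for i in digits[1:]:
        value = 1 / (i + value)
    return value

def a_prime_ratio(digits, a):
    pm, qm, p, q = convergents(digits)
    return (qm + a * pm) / (q + a * p)

def density_pair(digits, a, t):
    """Return (normalised transformed density, (a'+1)/(a't+1)^2) at the point t."""
    pm, qm, p, q = convergents(digits)
    mass = (a + 1) / ((q + a * p) * (q + qm + a * (p + pm)))   # gamma_a(I(i^(k)))
    lhs = (a + 1) / (t * (qm + a * pm) + q + a * p) ** 2 / mass
    ap = a_prime_ratio(digits, a)
    return lhs, (ap + 1) / (ap * t + 1) ** 2
```

For example, with digits $(2, 3, 1)$ and $a = 1/3$ both expressions for $a'$ give $24/31$.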
respectively.
Now, (2.5.37) follows from (2.5.38), (2.5.38$'$), (2.5.39), (2.5.39$'$), (2.5.43), and (2.5.43$'$). □
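Corollary 2.5.10 For any $a, x, y \in I$ and $n \in \mathbb{N}$ we have
\[ \bigl| \gamma_a(\tau^0 \le x,\ s_n^a \le y) - \gamma_a([0, x])\, G(y) \bigr| \le \inf_{0 \le k \le n} \left( \delta_k^a(x) + \frac{k_0}{F_{n-k} F_{n-k+1}}\, \gamma_a([0, x]) \right) \tag{2.5.44} \]
where
\[ \delta_k^a(x) = \begin{cases} 0 & \text{if } x \in E_k, \\[4pt] \dfrac{2(a+1)(x - a_k)(b_k - x)}{(b_k - a_k)(ax+1)^2} & \text{if } x \in (a_k, b_k), \end{cases} \]
and $[a_k, b_k]$ is the closure of the (unique) fundamental interval of order $k \in \mathbb{N}$ containing $x \in I \setminus E_k$.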
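Proof. Clearly,
\[ \gamma_a(\tau^0 \le x,\ s_n^a \le y) = \int_I I_{[0,x]}\, (I_{[0,y]} \circ s_n^a)\, d\gamma_a \]
\[ \frac{a+1}{2 (F_n + a F_{n-1})(F_{n+1} + a F_n)} \le \sup_{x, y \in I} \bigl| \gamma_a(\tau^0 \le x,\ s_n^a \le y) - \gamma_a([0, x])\, G(y) \bigr| \le \left( \frac{a+1}{2} + k_0 \right) \frac{1}{F_{\lfloor n/2 \rfloor} F_{\lfloor n/2 \rfloor + 1}}. \tag{2.5.45} \]
\[ \delta_k^a(x) \le \frac{a+1}{2} \max_{i^{(k)}} \lambda(I(i^{(k)})) = \frac{a+1}{2 F_k F_{k+1}}, \quad k \in \mathbb{N},\ a, x \in I. \tag{2.5.46} \]
The upper bound from (2.5.45) follows by using (2.5.46) and taking $k = \lfloor n/2 \rfloor$.
Next, as in the proof of Theorem 2.5.8, we get
\[ \sup_{x, y \in I} \bigl| \gamma_a(\tau^0 \le x,\ s_n^a \le y) - \gamma_a([0, x])\, G(y) \bigr| \ge \sup_{y \in I} |\gamma_a(s_n^a \le y) - G(y)| \ge \frac{a+1}{2 (F_n + a F_{n-1})(F_{n+1} + a F_n)} \]
Remark. The upper bound in Corollary 2.5.11 is $O(g^n)$ as $n \to \infty$, with $g = (\sqrt{5} - 1)/2$. The lower bound is $O(g^{2n})$ as $n \to \infty$ so that the problem of the exact rate of convergence is unsettled. □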
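Corollary 2.5.12 Let $\mu \in \operatorname{pr}(\mathcal{B}_I)$ such that $\mu \ll \lambda$ and let $g_a = d\mu/d\gamma_a$, $a \in I$. Then we have
\[ \bigl| \mu(\tau^0 \le x,\ s_n^a \le y) - \mu([0, x])\, G(y) \bigr| \le \inf_{0 \le k \le n} \left( \int_I \bigl| g_a I_{[0,x]} - (g_a I_{[0,x]})_k^a \bigr|\, d\gamma_a + \frac{k_0}{F_{n-k} F_{n-k+1}}\, \mu([0, x]) \right) \tag{2.5.47} \]
for any $a, x, y \in I$ and $n \in \mathbb{N}$. In particular, if $g_a$ has a version $\widetilde{g}_a$ of bounded variation, then
\[ \int_I \bigl| g_a I_{[0,x]} - (g_a I_{[0,x]})_k^a \bigr|\, d\gamma_a \le \begin{cases} \dfrac{(a+1) \operatorname{var}_{[0,x]} \widetilde{g}_a}{(F_k + a F_{k-1})(F_{k+1} + a F_k)} & \text{if } x \in E_k, \\[8pt] \dfrac{(a+1) \operatorname{var}_{[0,x]} \widetilde{g}_a}{(F_k + a F_{k-1})(F_{k+1} + a F_k)} + 2 \displaystyle\int_{a_k}^{x} g_a(t)\, \gamma_a(dt) & \text{if } x \in (a_k, b_k), \end{cases} \tag{2.5.48} \]
where $[a_k, b_k]$ is the closure of the (unique) fundamental interval of order $k \in \mathbb{N}$ containing $x \in I \setminus E_k$.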
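Proof. We have
\[ \mu(\tau^0 \le x,\ s_n^a \le y) = \int_I I_{[0,x]}\, (I_{[0,y]} \circ s_n^a)\, g_a\, d\gamma_a \]
for any $a, x, y \in I$ and $n \in \mathbb{N}$. Theorem 2.5.9 applies with $f = g_a I_{[0,x]}$ and $h = I_{[0,y]}$, $x, y \in I$, yielding (2.5.47). Next, (2.5.48) can be obtained noting that (i) for a typical fundamental interval $I(i^{(k)})$ of order $k \in \mathbb{N}$ contained in $[0, x]$ we have
\[ \begin{aligned} \int_{I(i^{(k)})} \bigl| g_a I_{[0,x]} - (g_a I_{[0,x]})_k^a \bigr|\, d\gamma_a &= \int_{I(i^{(k)})} \Bigl| g_a(t) - \frac{1}{\gamma_a(I(i^{(k)}))} \int_{I(i^{(k)})} g_a(s)\, \gamma_a(ds) \Bigr|\, \gamma_a(dt) \\ &= \frac{1}{\gamma_a(I(i^{(k)}))} \int_{I(i^{(k)})} \Bigl| \int_{I(i^{(k)})} \bigl( \widetilde{g}_a(t) - \widetilde{g}_a(s) \bigr)\, \gamma_a(ds) \Bigr|\, \gamma_a(dt) \end{aligned} \]
\[ \begin{aligned} &= \int_{a_k}^{x} \Bigl| g_a(t) - \frac{1}{\gamma_a([a_k, b_k])} \int_{a_k}^{x} g_a(s)\, \gamma_a(ds) \Bigr|\, \gamma_a(dt) + \frac{1}{\gamma_a([a_k, b_k])} \int_{x}^{b_k} \Bigl| \int_{a_k}^{x} g_a(s)\, \gamma_a(ds) \Bigr|\, \gamma_a(dt) \\ &\le \int_{a_k}^{x} g_a(t)\, \gamma_a(dt) + \frac{1}{\gamma_a([a_k, b_k])} \int_{a_k}^{b_k} \Bigl( \int_{a_k}^{x} g_a(s)\, \gamma_a(ds) \Bigr) \gamma_a(dt) \\ &= 2 \int_{a_k}^{x} g_a(t)\, \gamma_a(dt). \end{aligned} \]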
Limit theorems
3.0 Preliminaries
As in Subsection 2.5.4 let $g$ be a $\lambda$-integrable complex-valued function on $I$. We particularize here the framework considered there taking $a = 0$ and accordingly $\gamma_0 = \lambda$. Denote by $E_k$, $k \in \mathbb{N}$, the set consisting of the endpoints of all fundamental intervals of rank $\ell$, $0 \le \ell \le k$. For any $n \in \mathbb{N}_+$ we associate with $g$ a function $g_n$ which has a constant value on each fundamental interval $I(i^{(n)})$, $i^{(n)} \in \mathbb{N}_+^n$, of rank $n$. Specifically,
\[ g_n(x) = \frac{1}{\lambda\bigl( I(i^{(n)}) \bigr)} \int_{I(i^{(n)})} g\, d\lambda, \quad x \in I(i^{(n)}),\ i^{(n)} \in \mathbb{N}_+^n,\ n \in \mathbb{N}_+. \tag{3.0.1} \]
Then
\[ \int_I g_n\, d\lambda = \int_I g\, d\lambda, \quad n \in \mathbb{N}_+, \tag{3.0.2} \]
and
\[ \lim_{n \to \infty} g_n(x) = g(x) \quad \text{a.e. in } I. \tag{3.0.3} \]
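As an illustration of (3.0.1) and (3.0.2), take the sample function $g(x) = x$ and rank $n = 1$, where the fundamental intervals are $I(i) = (1/(i+1), 1/i)$ up to a countable set. The choice of $g$ and the function names below are assumptions made purely for the example.

```python
from fractions import Fraction

def average_on_rank1(i):
    """Value of g_1 on I(i) = (1/(i+1), 1/i) for the sample choice g(x) = x."""
    left, right = Fraction(1, i + 1), Fraction(1, i)
    integral = (right ** 2 - left ** 2) / 2      # integral of x over I(i)
    return integral / (right - left)             # divide by lambda(I(i))

def integral_of_g1(up_to):
    """Integral of g_1 over the union of I(1), ..., I(up_to)."""
    return sum(average_on_rank1(i) * (Fraction(1, i) - Fraction(1, i + 1))
               for i in range(1, up_to + 1))
```

Here `integral_of_g1(N)` equals $\int_{1/(N+1)}^{1} x\,dx = (1 - 1/(N+1)^2)/2$ exactly, which tends to $\int_I g\,d\lambda = 1/2$ as $N$ grows, in line with (3.0.2).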
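It follows from (3.0.2) and (3.0.3) that $\lim_{n \to \infty} \int_I |g - g_n|\, d\lambda = 0$. Hence
\[ \omega_{g,A}(n) = \int_A |g - g_n|\, d\lambda \to 0 \tag{3.0.4} \]
and
\[ \left| \int_I g h\, d\lambda \right| \le \int_I |g_s - g|\, |h|\, d\lambda + \left| \int_I g_s h\, d\lambda \right|, \]
where $g_s$ is defined by (3.0.1) and $s < n$, $s \in \mathbb{N}_+$, is arbitrary. Since $|h| = 1 - \gamma(A) = \gamma(A^c)$ on $A$ and $|h| = \gamma(A)$ on $A^c$, we have
\[ \int_I |g_s - g|\, |h|\, d\lambda \le \gamma(A^c)\, \omega_{g,A}(s) + \gamma(A)\, \omega_{g,A^c}(s). \tag{3.0.5} \]
Next,
\[ \begin{aligned} \left| \int_I g_s h\, d\lambda \right| &= \Bigl| \sum_{i^{(s)} \in \mathbb{N}_+^s} \int_{I(i^{(s)})} g_s h\, d\lambda \Bigr| \\ &= \Bigl| \sum_{i^{(s)} \in \mathbb{N}_+^s} \Bigl( \frac{1}{\lambda(I(i^{(s)}))} \int_{I(i^{(s)})} g\, d\lambda \Bigr) \int_{I(i^{(s)})} h\, d\lambda \Bigr| \\ &= \Bigl| \sum_{i^{(s)} \in \mathbb{N}_+^s} \frac{\mu(I(i^{(s)}))}{\lambda(I(i^{(s)}))} \Bigl( \lambda(I(i^{(s)}) \cap A) - \lambda(I(i^{(s)}))\, \gamma(A) \Bigr) \Bigr|. \end{aligned} \]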
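Now, the result stated follows from (3.0.5), (3.0.6), and (3.0.4). □
Let $f_n : \mathbb{N}_+ \to \mathbb{R}$, $n \in \mathbb{N}_+$, and define
\[ X_{nj} = f_n(a_j), \quad 1 \le j \le n, \]
\[ S_{n0} = 0, \quad S_{nk} = \sum_{j=1}^{k} X_{nj}, \quad 1 \le k \le n, \quad S_{nn} = S_n, \quad n \in \mathbb{N}_+. \]
for any $\varepsilon > 0$. It follows from Proposition A3.5 (see also Section A1.4) that whatever $\varepsilon > 0$ we have
\[ d_P\bigl( \gamma S_{nk}^{-1}, \delta_0 \bigr) \le \frac{\varepsilon}{4}, \quad 1 \le k \le k_n, \]
for any $n$ large enough ($\ge n_\varepsilon$). Therefore for some $\theta \le \varepsilon/4$ we have
\[ \delta_0(A) < \gamma S_{nk}^{-1}(A^{\theta}) + \theta \]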
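Note that $\widetilde{\xi}_n$ is $\mathcal{B}_{k_n+1}^{\infty}$-measurable and then by Lemma 3.0.1 and Lemma 2.1.1 in Iosifescu and Grigorescu (1990) we have
\[ \lim_{n \to \infty} \left( \int_D h\, d(\gamma \widetilde{\xi}_n^{-1}) - \int_D h\, d(\mu \widetilde{\xi}_n^{-1}) \right) = 0 \tag{3.0.9} \]
We can now conclude the proof using (3.0.9) through (3.0.11). If, for example, $\gamma \xi_n^{-1} \xrightarrow{w} \nu$ for some $\nu \in \operatorname{pr}(\mathcal{B}_D)$, then it follows from (3.0.10) that $\gamma \widetilde{\xi}_n^{-1} \xrightarrow{w} \nu$, too. Next, (3.0.9) implies that $\mu \widetilde{\xi}_n^{-1} \xrightarrow{w} \nu$, which in conjunction with (3.0.11) yields $\mu \xi_n^{-1} \xrightarrow{w} \nu$. □
Remark. Lemma 3.0.2 still holds when the process $\xi_n$ is replaced by the process $\xi_n^C = \bigl( \xi_n^C(t) \bigr)_{t \in I}$ defined by
\[ \xi_n^C(t) = S_{n \lfloor nt \rfloor} + (nt - \lfloor nt \rfloor) \bigl( S_{n(\lfloor nt \rfloor + 1)} - S_{n \lfloor nt \rfloor} \bigr), \quad t \in I, \]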
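The process $\xi_n^C$ is the polygonal (piecewise linear) interpolation of the partial sums $S_{n0}, S_{n1}, \ldots, S_{nn}$. A minimal sketch, assuming the partial sums of one row of the array are supplied as a list with `partial_sums[k]` standing for $S_{nk}$ (an illustrative data layout, not one fixed by the text):

```python
import math

def xi_C(partial_sums, t):
    """Polygonal interpolation of S_{n0}, ..., S_{nn} evaluated at t in [0, 1]."""
    n = len(partial_sums) - 1          # partial_sums[0] is S_{n0} = 0
    k = math.floor(n * t)
    if k >= n:                         # t = 1: the interpolation weight is zero
        return partial_sums[n]
    weight = n * t - k                 # fractional part nt - floor(nt)
    return partial_sums[k] + weight * (partial_sums[k + 1] - partial_sums[k])
```

For the partial sums `[0, 1, 3, 6]` (so $n = 3$), `xi_C(..., 0.5)` lies halfway between $S_{31} = 1$ and $S_{32} = 3$.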
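\[ X = \{ X_{nj},\ 1 \le j \le n,\ n \in \mathbb{N}_+ \}, \]
where
\[ X_{nj} = \left( \frac{a_j}{n} \right)^{\alpha} I_{(a_j > \theta n)}. \tag{3.1.1} \]
For this array we have
\[ S_{nk} = n^{-\alpha} \sum_{j=1}^{k} a_j^{\alpha} I_{(a_j > \theta n)}, \quad 1 \le k \le n, \quad S_n = S_{nn}, \quad n \in \mathbb{N}_+. \tag{3.1.2} \]
\[ \gamma(|S_{nk}| > \varepsilon) \le \sum_{j=1}^{k} \gamma\Bigl( |X_{nj}| > \frac{\varepsilon}{k} \Bigr) = k\, \gamma\Bigl( |X_{n1}| > \frac{\varepsilon}{k} \Bigr) = k\, \gamma\Bigl( a_1 > n \max\Bigl( \theta, \Bigl( \frac{\varepsilon}{k} \Bigr)^{1/\alpha} \Bigr) \Bigr) \]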
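where:
(i) if $\alpha \in \mathbb{R}_{++}$ then $\nu = \operatorname{Pois} \rho$ with
\[ \frac{d\rho}{d\lambda}(x) = \frac{x^{-1-1/\alpha}}{\alpha \log 2}\, \delta_x((\theta^{\alpha}, \infty)), \quad x \in \mathbb{R}; \]
\[ \frac{d\rho}{d\lambda}(x) = -\frac{x^{-1-1/\alpha}}{\alpha \log 2}\, \delta_x((0, \theta^{\alpha})), \quad x \in \mathbb{R}; \]
(iii) if $\alpha = 0$ then $\nu = \operatorname{Pois}\bigl( (\theta \log 2)^{-1} \delta_1 \bigr)$, that is, $\nu$ is the Poisson distribution $P\bigl( (\theta \log 2)^{-1} \bigr)$ with parameter $(\theta \log 2)^{-1}$.
Proof. We only prove (i), the proofs of (ii) and (iii) being completely similar.
Consider the measures $\mu_n$ on $\mathcal{B}$ defined by
\[ \mu_n(A) = \gamma\Bigl( \Bigl( \frac{a_1}{n} \Bigr)^{\alpha} \in A,\ a_1 > \theta n \Bigr), \quad A \in \mathcal{B},\ n \in \mathbb{N}_+. \]
Clearly,
and
\[ = \frac{1}{\log 2} \lim_{n \to \infty} n \log\Bigl( 1 + \frac{1}{\lfloor n (\max(x, \theta^{\alpha}))^{1/\alpha} \rfloor + 1} \Bigr) = \frac{1}{\log 2}\, \frac{1}{(\max(x, \theta^{\alpha}))^{1/\alpha}} = \rho((x, \infty)). \]
Finally,
\[ \lim_{n \to \infty} n\, \mu_n(\mathbb{R}) = \lim_{n \to \infty} n\, \gamma(a_1 > n\theta) = \frac{1}{\theta \log 2} = \rho(\mathbb{R}). \]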
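The limit $n\,\gamma(a_1 > n\theta) \to 1/(\theta \log 2)$ can be observed numerically from the tail formula $\gamma(a_1 \ge k) = \log(1 + 1/k)/\log 2$ for integer $k$. A small sketch with illustrative function names:

```python
import math

def gauss_tail(k):
    """gamma(a_1 >= k) for a positive integer k, under Gauss' measure."""
    return math.log1p(1.0 / k) / math.log(2)

def scaled_tail(n, theta):
    """n * gamma(a_1 > theta * n); a_1 > theta*n means a_1 >= floor(theta*n) + 1."""
    return n * gauss_tail(math.floor(theta * n) + 1)

def limit_value(theta):
    """The claimed limit 1 / (theta log 2)."""
    return 1.0 / (theta * math.log(2))
```

For $\theta = 1/2$ the scaled tail at $n = 10^6$ is already within about $10^{-5}$ of the limit $2/\log 2$.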
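where
\[ \delta_{nj} = n^{-\alpha} \bigl( b_j^{\alpha} I_{(b_j > \theta n)} - a_j^{\alpha} I_{(a_j > \theta n)} \bigr), \quad 1 \le j \le n. \]
Notice that $(a_j > \theta n) \subset (b_j > \theta n)$, $1 \le j \le n$, and put
\[ \delta_n' = n^{-\alpha} \sum_{j=1}^{n} b_j^{\alpha} \bigl( I_{(b_j > \theta n)} - I_{(a_j > \theta n)} \bigr) = n^{-\alpha} \sum_{j=1}^{n} b_j^{\alpha} I_{(b_j > \theta n,\ a_j \le \theta n)}, \]
\[ \delta_n'' = n^{-\alpha} \sum_{j=1}^{n} |b_j^{\alpha} - a_j^{\alpha}|\, I_{(a_j > \theta n)}. \]
Then $\sum_{j=1}^{n} |\delta_{nj}| \le \delta_n' + \delta_n''$, and we are going to prove that $\delta_n'$ and $\delta_n''$ both converge to 0 in $\gamma$-probability as $n \to \infty$.
We have
\[ \gamma(\delta_n' > 0) \le n\, \gamma(\theta n - c < a_1 \le \theta n) \to 0 \]
as $n \to \infty$ while
\[ \delta_n'' \le c_\alpha\, n^{-1}\, n^{-(\alpha - 1)} \sum_{j=1}^{n} a_j^{\alpha - 1} I_{(a_j > \theta n)}, \]
where
\[ c_\alpha = \begin{cases} c\, \alpha (1 + c)^{\alpha - 1} & \text{if } \alpha \ge 1, \\ c\, |\alpha| & \text{if } \alpha < 1. \end{cases} \]
[We have used the inequality $(1+a)^{\alpha} - 1 \le a \bigl( \{\alpha\} + \lfloor \alpha \rfloor (1+a)^{\alpha - 1} \bigr)$, valid for non-negative $a$ and $\alpha$, which implies $1 - (1+a)^{-\alpha} \le a\alpha$.] By Theorem 3.1.2, $\delta_n''$ converges to 0 in $\gamma$-probability as $n \to \infty$. It follows that $d_0(\xi_n, \xi_n')$ is dominated by the sum of two non-negative random variables both converging in $\gamma$-probability to 0 as $n \to \infty$. The proof is complete. □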
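Corollary 3.1.5 Let $b_n$ denote either $y_n$, $r_n$, or $u_n$, $n \in \mathbb{N}_+$. Put
\[ S_{nk}' = n^{-\alpha} \sum_{j=1}^{k} b_j^{\alpha} I_{(b_j > \theta n)}, \quad 1 \le k \le n, \]
In particular,
\[ \lim_{n \to \infty} \mu\Bigl( \frac{M_n \log 2}{n} \le x \Bigr) = e^{-1/x}, \quad x \in \mathbb{R}_{++}. \]
Proof. Let $1 \le k \le n$. It is easy to see that $S_n' = \sum_{j=1}^{n} I_{(b_j > \theta n)}$ is less than $k$ if and only if $M_n^{(k)}$ does not exceed $\theta n$, that is,
\[ \bigl( M_n^{(k)} \le \theta n \bigr) = \bigl( S_n' < k \bigr) \tag{3.1.5} \]
\[ \mu\bigl( M_n^{(k)} \le \theta n \bigr) = \mu\bigl( S_n' < k \bigr) = \sum_{j=0}^{k-1} \mu\bigl( S_n' = j \bigr) \to e^{-(\theta \log 2)^{-1}} \sum_{j=0}^{k-1} \frac{1}{j!\, (\theta \log 2)^{j}} \]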
Proof. We have (bn ≥ cn i.o.) ⊂ (Mn ≥ cn i.o.) since bn (ω) ≥ cn for some
n ∈ N+ and ω ∈ Ω implies Mn (ω) ≥ cn . Conversely, if Mn (ω) ≥ cn for some
n ∈ N+ and ω ∈ Ω, then there exists n0 ≤ n such that Mn (ω) = bn0 (ω) ≥
cn ≥ cn0 . Hence (Mn ≥ cn i.o.) ⊂ (bn ≥ cn i.o.). Therefore (Mn ≥ cn i.o.) =
(bn ≥ cn i.o.) , and the conclusion follows from Corollary 1.3.17. 2
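\[ \lim_{n \to \infty} \frac{M_n}{c_n} = 0 \quad \text{a.e.} \tag{3.1.6} \]
or
\[ \limsup_{n \to \infty} \frac{M_n}{c_n} = \infty \quad \text{a.e.} \tag{3.1.7} \]
according as the series $\sum_{n \in \mathbb{N}_+} 1/c_n$ converges or diverges.
Proof. First, assume that $s = \sum_{n \in \mathbb{N}_+} 1/c_n < \infty$. Choose positive numbers $d_n$, $n \in \mathbb{N}_+$, with $\lim_{n \to \infty} d_n = \infty$ such that $\sum_{n \in \mathbb{N}_+} d_n/c_n < \infty$. This is always possible. Indeed, put $s_n = \sum_{i=1}^{n} 1/c_i$, $n \in \mathbb{N}_+$, and define
\[ E_1 = \{ j \in \mathbb{N}_+ : s_j \le 3s/4 \}, \]
\[ E_n = \Bigl\{ j \in \mathbb{N}_+ : 3s \sum_{i=1}^{n-1} 4^{-i} < s_j \le 3s \sum_{i=1}^{n} 4^{-i} \Bigr\}, \quad n \ge 2. \]
\[ E_1 = \{ j \in \mathbb{N}_+ : s_j \le 4 \}, \]
\[ E_n = \bigl\{ j \in \mathbb{N}_+ : 4^{n-1} < s_j \le 4^{n} \bigr\}, \quad n \ge 2. \]
$3 \cdot 2^{n_k - 1}$, $k = 1, 3, \cdots$. Clearly, this implies $\sum_{n \in \mathbb{N}_+} d_n/c_n = \infty$. By Proposition 3.1.8 we have
\[ \gamma\Bigl( \frac{M_n}{c_n} \ge \frac{1}{d_n} \ \text{i.o.} \Bigr) = 1, \]
which is equivalent to (3.1.7). □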
Theorem 3.1.10 Let $(c_n)_{n \in \mathbb{N}_+}$ be a non-decreasing sequence of positive numbers such that the sequence $(n/c_n)_{n \in \mathbb{N}_+}$ is non-decreasing. Then
\[ \gamma\Bigl( M_n \le \frac{n}{c_n \log 2} \ \text{i.o.} \Bigr) \]
converges or diverges.
The proof is completely similar to that given for the i.i.d. case in Barndorff–Nielsen (1961). Theorem 3.1.6 plays an essential part in the present case. For details in the case $b_n = a_n$, $n \in \mathbb{N}_+$, see Philipp (1976, pp. 384–385). □
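Corollary 3.1.11 We have
\[ \limsup_{n \to \infty} (\inf) \frac{\log M_n - \log n}{\log \log n} = 1\ (0) \quad \text{a.e.}, \]
whence
\[ \lim_{n \to \infty} \frac{\log M_n}{\log n} = 1 \quad \text{a.e.} \]
Proof. For the lim sup case we should show that for any $\varepsilon > 0$ we have
\[ \gamma\Bigl( \frac{\log M_n - \log n}{\log \log n} \ge 1 + \varepsilon \ \text{i.o.} \Bigr) = 0 \]
and
\[ \gamma\Bigl( \frac{\log M_n - \log n}{\log \log n} \ge 1 - \varepsilon \ \text{i.o.} \Bigr) = 1 \]
or, equivalently,
\[ \gamma\bigl( M_n \ge n (\log n)^{1+\varepsilon} \ \text{i.o.} \bigr) = 0 \]
and
\[ \gamma\bigl( M_n \ge n (\log n)^{1-\varepsilon} \ \text{i.o.} \bigr) = 1. \]
These equations clearly hold by Proposition 3.1.8.
For the lim inf case we should show that for any $\varepsilon > 0$ we have
\[ \gamma\Bigl( \frac{\log M_n - \log n}{\log \log n} \le \varepsilon \ \text{i.o.} \Bigr) = 1 \]
and
\[ \gamma\Bigl( \frac{\log M_n - \log n}{\log \log n} \le -\varepsilon \ \text{i.o.} \Bigr) = 0 \]
or, equivalently,
\[ \gamma\bigl( M_n \le n (\log n)^{\varepsilon} \ \text{i.o.} \bigr) = 1 \]
and
\[ \gamma\bigl( M_n \le n (\log n)^{-\varepsilon} \ \text{i.o.} \bigr) = 0 \]
It is easy to check that these equations hold by Theorem 3.1.10. □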
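Corollary 3.1.12 We have
\[ \liminf_{n \to \infty} \frac{M_n \log \log n}{n} = \frac{1}{\log 2} \quad \text{a.e.} \]
and
\[ \gamma\Bigl( \frac{M_n \log \log n}{n} - \frac{1}{\log 2} \le -\varepsilon \ \text{i.o.} \Bigr) = 0 \]
or, equivalently,
\[ \gamma\Bigl( M_n \le \frac{n (1 + \varepsilon')}{(\log \log n)(\log 2)} \ \text{i.o.} \Bigr) = 1 \]
and
\[ \gamma\Bigl( M_n \le \frac{n (1 - \varepsilon')}{(\log \log n)(\log 2)} \ \text{i.o.} \Bigr) = 0, \]
where $\varepsilon' = \varepsilon \log 2$. This follows immediately from Theorem 3.1.10. □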
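To conclude this subsection we consider the $k$th smallest $m_n^{(k)}$ of $b_1, \cdots, b_n$, $1 \le k \le n$, $n \in \mathbb{N}_+$. Clearly, $m_n^{(1)} = M_n^{(n)}$. In general, we have $m_n^{(k)} = M_n^{(n-k+1)}$, $1 \le k \le n$. Then by (3.1.5) we have
\[ \bigl( m_n^{(k)} \le \theta n \bigr) = \bigl( S_n' < n - k + 1 \bigr) \]
for any $\theta \in \mathbb{R}_{++}$ and $n \in \mathbb{N}_+$. Hence, for any $\mu \in \operatorname{pr}(\mathcal{B}_I)$ such that $\mu \ll \lambda$,
\[ \mu\bigl( m_n^{(k)} \le \theta n \bigr) = \mu\bigl( S_n' < n - k + 1 \bigr) = \sum_{j=0}^{n-k} \mu(S_n' = j) = 1 - \sum_{j=n-k+1}^{n} \mu(S_n' = j). \]
for any fixed $k \in \mathbb{N}_+$, where $a_n^{(k)}$ denotes the $k$th smallest of $a_1, \cdots, a_n$. As $m_n^{(k)} \le a_n^{(k)} + 2$, $n \in \mathbb{N}_+$, $1 \le k \le n$, it follows that
\[ \lim_{n \to \infty} \frac{m_n^{(k)}}{n} = 0 \quad \text{a.e.} \]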
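The two order-statistic identities used here, $m_n^{(k)} = M_n^{(n-k+1)}$ and $(M_n^{(k)} \le c) = (\#\{j : b_j > c\} < k)$, are purely combinatorial and can be checked on any finite sample. The sketch below uses an arbitrary made-up sample; the function names are illustrative.

```python
def kth_smallest(values, k):
    """k-th smallest element (1-based) of a finite sample."""
    return sorted(values)[k - 1]

def kth_largest(values, k):
    """k-th largest element (1-based) of a finite sample."""
    return sorted(values, reverse=True)[k - 1]

def exceedances(values, level):
    """Number of sample points strictly above the given level."""
    return sum(1 for v in values if v > level)

sample = [7, 1, 3, 12, 3, 5]   # arbitrary illustrative values standing in for b_1, ..., b_n
```

With `n = len(sample)`, the $k$th smallest coincides with the $(n-k+1)$th largest for every $k$, and the $k$th largest is at most a level exactly when fewer than $k$ sample points exceed that level.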
say, then $P(\eta_k < g(n) \text{ for } p \text{ values } k,\ 1 \le k \le n) \to e^{-\theta} \theta^{p}/p!$ as $n \to \infty$ for any fixed $p \in \mathbb{N}$.
In particular this result applies to a sequence $(\eta_n)_{n \in \mathbb{N}_+}$ for which
with
\[ g(n) = 1 + \frac{2\theta \log 2}{n}, \quad n \in \mathbb{N}_+. \]
For such a sequence, similarly to (3.1.4) we can write
\[ \lim_{n \to \infty} P\Bigl( \frac{n (\eta_n^{(k)} - 1)}{2 \log 2} \ge x \Bigr) = e^{-x} \sum_{j=0}^{k-1} \frac{x^{j}}{j!}, \quad x \in \mathbb{R}_{++}, \tag{3.1.9} \]
for any fixed $k \in \mathbb{N}_+$, where $\eta_n^{(k)}$ denotes the $k$th smallest of $\eta_1, \cdots, \eta_n$, $1 \le k \le n$.
We cannot assert that (3.1.9) is true for $\eta_n = a_n$, $n \in \mathbb{N}_+$, since the equation $\gamma(a_1 \ge x) = \log(1 + 1/x)/\log 2$ holds just for $x \in \mathbb{N}_+$. It is conjectured in Iosifescu (1978) that (3.1.9) holds true for $\eta_n = r_n$, $n \in \mathbb{N}_+$, under any $P \ll \lambda$. [Notice that $\gamma(r_1 \ge x) = \log(1 + 1/x)/\log 2$ for any $x \ge 1$, but the sequence $(r_n)_{n \in \mathbb{N}_+}$ is not $\psi$-mixing under $\gamma$.] □
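\[ H_1 = H(\cdots, a_{-2}, a_{-1}, a_0, a_1, a_2, \cdots). \]
\[ \xi_n^C(t) = \frac{1}{\sigma \sqrt{n}} \bigl( S_{\lfloor nt \rfloor} + (nt - \lfloor nt \rfloor)(H_{\lfloor nt \rfloor + 1} - E_\gamma H_1) \bigr), \]
\[ \xi_n^D(t) = \frac{1}{\sigma \sqrt{n}}\, S_{\lfloor nt \rfloor}, \quad t \in I, \]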
If $\sigma > 0$ then $\gamma \xi_n^{-1} \xrightarrow{w} W$ in both $C$ and $D$, where $\xi_n$ stands for either $\xi_n^C$ or $\xi_n^D$. The last conclusion still holds when $\gamma$ is replaced by any $\mu \in \operatorname{pr}(\mathcal{B}_I^2)$ such that $\mu \ll \lambda^2$.
Proof. This is a transcription of Theorem 21.1 in Billingsley (1968), with an improvement by Popescu (1978) (concerning the possibility of replacing $\gamma$ by $\mu$), for the special case of the doubly infinite sequence $(a_\ell)_{\ell \in \mathbb{Z}}$. Note that in Proposition 2.1.22 a class of functions $H$ is indicated, for which (3.2.1) holds. □
Next, we state a strong invariance principle.
Theorem 3.2.2 Assume that there exist constants $0 < \delta \le 2$ and $c > 0$ such that $E_\gamma |H_1|^{2+\delta} < \infty$ and
\[ E_\gamma^{1/(2+\delta)} \bigl| H_1 - E_\gamma(H_1 \mid a_{-n}, \cdots, a_n) \bigr|^{2+\delta} \le c\, n^{-(2 + 7/\delta)}, \quad n \in \mathbb{N}_+, \tag{3.2.3} \]
so that (3.2.1) holds and
\[ \lim_{n \to \infty} \frac{1}{n} E_\gamma S_n^2 = \sigma^2 \ge 0 \]
exists finitely and is given by the absolutely convergent series (3.2.2). If $\sigma > 0$ then the strong invariance principle holds for the stochastic processes $\xi_n^C$ and $\xi_n^D$, $n \in \mathbb{N}_+$. That is, without changing their distributions, we can redefine these processes on a common richer probability space together with a standard Brownian motion process $(w(t))_{t \in I}$ such that
\[ \sup_{t \in I} |\xi_n(t) - w(t)| = O(n^{-a}) \quad \text{a.s.} \]
\[ H_1 = H(a_1, a_2, \cdots), \]
and we have a strictly stationary sequence $(H_n)_{n \in \mathbb{N}_+}$ on $(I, \mathcal{B}_I, \gamma)$. With the same definitions as before for $S_n$, $\xi_n^C$ and $\xi_n^D$, $n \in \mathbb{N}_+$, where $E_{\bar\gamma} \bar{H}_1$ is replaced by $E_\gamma H_1$, we can state the following special cases of Theorems 3.2.1 and 3.2.2.
Theorem 3.2.1$'$ Assume that $E_\gamma H_1^2 < \infty$ and
\[ \sum_{n \in \mathbb{N}_+} E_\gamma^{1/2} \bigl[ H_1 - E_\gamma(H_1 \mid a_1, \cdots, a_n) \bigr]^2 < \infty \tag{3.2.1$'$} \]
so that
\[ \lim_{n \to \infty} \frac{1}{n} E_\gamma S_n^2 = \sigma^2 \ge 0 \]
If $\sigma > 0$ then $\gamma \xi_n^{-1} \xrightarrow{w} W$ in both $C$ and $D$, where $\xi_n$ stands for either $\xi_n^C$ or $\xi_n^D$. The last conclusion still holds when $\gamma$ is replaced by any $\mu \in \operatorname{pr}(\mathcal{B}_I)$ such that $\mu \ll \lambda$.
Note that inequality (2.1.32) and Proposition 2.1.23 describe two classes of functions $H$ for which (3.2.1$'$) holds.
Theorem 3.2.2$'$ Assume that there exist constants $0 < \delta \le 2$ and $c > 0$ such that $E_\gamma |H_1|^{2+\delta} < \infty$ and
\[ \lim_{n \to \infty} \frac{1}{n} E_\gamma S_n^2 = \sigma^2 \ge 0 \]
exists finitely and is given by the absolutely convergent series (3.2.2$'$). If $\sigma > 0$ then the strong invariance principle holds for the stochastic processes $\xi_n^C$ and $\xi_n^D$, $n \in \mathbb{N}_+$. That is, without changing their distributions, we can redefine these processes on a common richer probability space together with a standard Brownian motion process $(w(t))_{t \in I}$ such that
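\[ E_\gamma H_1^{r} = \frac{1}{\log 2} \sum_{i^{(k)} \in \mathbb{N}_+^k} H^{r}(i^{(k)}) \log \frac{1 + v(i^{(k)})}{1 + u(i^{(k)})} \]
with $r = 1$ or $2$, and
\[ \sigma^2 = E_\gamma H_1^2 - E_\gamma^2 H_1 + 2 \sum_{n \in \mathbb{N}_+} \Bigl( \sum_{i^{(n+k)} \in \mathbb{N}_+^{n+k}} \frac{H(i^{(k)})\, H(i_{n+1}, \cdots, i_{n+k})}{\log 2} \log \frac{1 + v(i^{(n+k)})}{1 + u(i^{(n+k)})} - E_\gamma^2 H_1 \Bigr). \tag{3.2.2$''$} \]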
for some $\delta > 0$, then there exist two positive constants $a < 1$ and $c$ such that
\[ \Bigl| \gamma\Bigl( \frac{\sum_{j=1}^{n} H_j - n E_\gamma H_1}{\sigma \sqrt{n}} < x \Bigr) - \Phi(x) \Bigr| \le c\, n^{-a} \]
for any $x \in \mathbb{R}$ and $n \in \mathbb{N}_+$.
Proof. This is a transcription of Theorem 1 in Iosifescu (1968) for the special case of the sequence $(a_n)_{n \in \mathbb{N}_+}$ of incomplete quotients. □
Remark. It is an open problem to determine the optimal value of $a$ in Theorem 3.2.3. We conjecture that $a = \delta/2$, that is, the same value as in the case of i.i.d. random variables with finite $(2 + \delta)$-absolute moment. □
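In what follows, by restricting the class of functions $H$ we give more precise results in the case $k = 1$. To emphasize this special framework we change the notation by using the letter $f$ instead of $H$.
Theorem 3.2.4 Let $f : \mathbb{N}_+ \to \mathbb{R}$, $A_n \in \mathbb{R}$, $B_n \in \mathbb{R}_{++}$, $n \in \mathbb{N}_+$, with $\lim_{n \to \infty} B_n = \infty$, and define
\[ X_{nj} = B_n^{-1} \bigl( f(a_j) - A_n \bigr), \quad 1 \le j \le n, \]
\[ S_{n0} = 0, \quad S_{nk} = \sum_{j=1}^{k} X_{nj}, \quad 1 \le k \le n, \quad S_{nn} = S_n, \quad n \in \mathbb{N}_+, \]
\[ F(x) = \frac{1}{\log 2} \sum_{\{k : |f(k)| \le x\}} f^2(k)\, k^{-2}, \]
\[ \widetilde{F}(x) = \frac{1}{\log 2} \sum_{\{k : |f(k)| \le x\}} f^2(k) \log\Bigl( 1 + \frac{1}{k(k+2)} \Bigr), \quad x \in \mathbb{R}_+. \]
(I) The stochastic process $\xi_n^D = \xi_n = (\xi_n(t))_{t \in I}$ defined for any $n \in \mathbb{N}_+$ by $\xi_n(t) = S_{n \lfloor nt \rfloor}$, $t \in I$, satisfies
\[ \gamma \xi_n^{-1} \xrightarrow{w} W_D \quad \text{in } \mathcal{B}_D, \]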
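The two truncated sums in Theorem 3.2.4, the one with weights $k^{-2}$ and the one with weights $\log(1 + 1/(k(k+2)))$, are easy to tabulate for a concrete $f$. In the sketch below the cutoff `kmax` is a hypothetical truncation parameter introduced only to make the sums finite; the weights $\log(1 + 1/(k(k+2)))/\log 2$ are the probabilities $\gamma(a_1 = k)$.

```python
import math

LOG2 = math.log(2)

def F(f, x, kmax):
    """(1/log 2) * sum of f(k)^2 / k^2 over k <= kmax with |f(k)| <= x."""
    return sum(f(k) ** 2 / k ** 2
               for k in range(1, kmax + 1) if abs(f(k)) <= x) / LOG2

def F_tilde(f, x, kmax):
    """Same index set, with k^{-2} replaced by log(1 + 1/(k(k+2)))."""
    return sum(f(k) ** 2 * math.log1p(1.0 / (k * (k + 2)))
               for k in range(1, kmax + 1) if abs(f(k)) <= x) / LOG2

identity = lambda k: k     # sample choice f(k) = k
constant = lambda k: 1     # sample choice f(k) = 1
```

For $f(k) = k$ the first sum is $\lfloor x \rfloor/\log 2$, and the second is strictly smaller because $\log(1 + 1/(k(k+2))) < k^{-2}$. For $f \equiv 1$ the second sum telescopes to $\log_2\bigl(2(K+1)/(K+2)\bigr)$ with $K$ = `kmax`.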
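(ii) When $\lim_{x \to \infty} \widetilde{F}(x) = E_\gamma f^2(a_1) = \infty$, assertion (I) above holds with a bounded sequence $(A_n)_{n \in \mathbb{N}_+}$ if and only if
\[ \lim_{x \to \infty} \frac{x^2 \displaystyle\sum_{\{k : |f(k)| > x\}} k^{-2}}{\displaystyle\sum_{\{k : |f(k)| \le x\}} f^2(k)\, k^{-2}} = 0 \tag{3.2.4} \]
\[ \frac{E_\gamma f^2(a_1) - E_\gamma^2 f(a_1) + 2 \sum_{n \in \mathbb{N}_+} \bigl( E_\gamma f(a_1) f(a_{n+1}) - E_\gamma^2 f(a_1) \bigr)}{E_\gamma f^2(a_1)} \]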
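Fix $\delta \in (0, 1)$ and put $X_{nj\delta} = X_{nj} I_{(|X_{nj}| \le \delta)} - E_\gamma X_{nj} I_{(|X_{nj}| \le \delta)}$ for any $1 \le j \le n$, $n \in \mathbb{N}_+$. As $\gamma S_n^{-1} \xrightarrow{w} N(0, 1)$ by (II), it follows from Theorem A3.11(i) that
\[ \lim_{n \to \infty} E_\gamma \Bigl( \sum_{j=1}^{n} X_{nj\delta} \Bigr)^{2} = 1. \tag{3.2.6} \]
for any $n$ large enough since $\delta \in (0, 1)$, $(A_n)_{n \in \mathbb{N}_+}$ is bounded, and $\lim_{n \to \infty} B_n = \infty$. Then for such an $n$ we have
\[ E_\gamma X_{n1}^2 I_{(|X_{n1}| \le \delta)} \le B_n^{-2} E_\gamma (f(a_1) - A_n)^2 I_{(|f(a_1)| \le B_n)} \le 2 B_n^{-2} \bigl( \widetilde{F}(B_n) + A_n^2 \bigr), \]
whence, by (3.2.5),
\[ E_\gamma X_{n1}^2 I_{(|X_{n1}| \le \delta)} \le 4 B_n^{-2} \widetilde{F}(B_n) \tag{3.2.8} \]
for any $n$ large enough. It follows from (3.2.6) through (3.2.8) that there exist $c > 0$ and $n_0 \in \mathbb{N}_+$ such that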
(|Xn1 | > ε) = (|f (a1 ) − An | > εBn ) ⊃ (|f (a1 )| > |An | + εBn )
Noting that limn→∞ Bn+1 /Bn = 1 (this follows from, e.g., Theorem A3.9,
but a direct proof can be also easily given), the last equation implies
P
Finally, for a > 0 we have F (x) ∼ x4a/(2a+1) /2a log 2 and x2 {k:|f (k)|>x} k
−2
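with the usual convention which assigns value 0 to a sum over the empty set, where $(B_n)_{n \in \mathbb{N}_+}$ is any sequence satisfying $\lim_{n \to \infty} n B_n^{-2} F(B_n) = 1$ with $F$ defined as in Theorem 3.2.4, and $E_\gamma(b_0)$ is equal to
\[ E_\gamma f(y_0) = \frac{1}{\log 2} \int_1^{\infty} \frac{f(x)\, dx}{x(x+1)}, \quad E_\gamma f(r_0) = E_\gamma f(r_1) = \frac{1}{\log 2} \int_1^{\infty} \frac{f(x)\, dx}{x(x+1)} \]
or
\[ E_\gamma f(u_0) = \frac{1}{\log 2} \Bigl( \int_1^{2} \frac{(x-1) f(x)\, dx}{x^2} + \int_2^{\infty} \frac{f(x)\, dx}{x^2} \Bigr) \]
according as $b_n$ denotes $y_n$, $r_n$ or $u_n$, $n \in \mathbb{N}_+$. Then
\[ \mu (\xi_n')^{-1} \xrightarrow{w} W_D \quad \text{in } \mathcal{B}_D \]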
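\[ = \sum_{i_{-n}, \cdots, i_n \in \mathbb{N}_+} \frac{1}{\bar\gamma^{2+\delta}(I^2(i_{-n}, \cdots, i_n))} \int_{I^2(i_{-n}, \cdots, i_n)} \bar\gamma(d\omega', d\theta') \times \Bigl| \int_{I^2(i_{-n}, \cdots, i_n)} \bigl( h(\omega', \theta') - h(\omega, \theta) \bigr)\, \bar\gamma(d\omega, d\theta) \Bigr|^{2+\delta}. \tag{3.2.14} \]
Now, under (i) it is easy to check that $h$ satisfies an inequality of the form (2.1.30), which yields $c_n \le c\, r^{n}$, $n \in \mathbb{N}_+$, for some $c > 0$ and $0 < r < 1$,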
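for any $\mu \in \operatorname{pr}(\mathcal{B}_I^2)$ such that $\mu \ll \lambda^2$, where $\xi_n$ stands for either $\xi_n^C$ or $\xi_n^D$ defined as in Section 3.2.1, for our special $H$ given by (3.2.13) and with $\sigma(f) = \sigma(H)$ defined by (3.2.12). But
\[ \bigl| b_n(\omega) - b_n(\omega, \theta) \bigr| \le (F_{n-1} F_n)^{-1}, \quad n \in \mathbb{N}_+,\ (\omega, \theta) \in \Omega^2. \]
\[ \begin{aligned} \sup_{t \in I} \bigl| \xi_n'(t, \omega) - \xi_n(t, (\omega, \theta)) \bigr| &\le \frac{1}{\sigma(f) \sqrt{n}} \max_{1 \le i \le n} \bigl| S_i'(\omega) - S_i(\omega, \theta) \bigr| \\ &\le \frac{1}{\sigma(f) \sqrt{n}} \sum_{i=1}^{n} \bigl| f(b_i(\omega)) - f\bigl( b_i(\omega, \theta) \bigr) \bigr| \\ &\le \frac{s_\varepsilon(f)}{\sigma(f) \sqrt{n}} \sum_{i=1}^{n} \bigl| b_i(\omega) - b_i(\omega, \theta) \bigr|^{\varepsilon} = O\bigl( n^{-1/2} \bigr) \end{aligned} \]
\[ \le \frac{O(1)}{\sigma(f) \sqrt{n}} = O\bigl( n^{-1/2} \bigr) \quad \gamma\text{-a.s.} \]
for any $\mu \in \operatorname{pr}(\mathcal{B}_I^2)$ such that $\mu \ll \lambda^2$. Now, (3.2.15) and (3.2.16) imply at once that
\[ \mu (\xi_n')^{-1} \xrightarrow{w} W \quad \text{in both } \mathcal{B}_C \text{ and } \mathcal{B}_D \]
for any $\mu \in \operatorname{pr}(\mathcal{B}_I)$ such that $\mu \ll \lambda$.
To prove (b) note that for $\delta > 0$ by Theorem 3.2.2 we have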
converges absolutely. If $\sigma(f) \ne 0$ then both the weak and strong invariance principles hold as described in Theorems 3.2.1$'$ and 3.2.2$'$ for the stochastic
It follows from Proposition 2.1.23 and its proof that both (3.2.1$'$) and (3.2.3$'$) hold in our special case, hence the present statement. □
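Remark. Convergence rates in the central limit theorem are available for the sequence $\bigl( \sum_{i=1}^{n} f(r_i) - n E_\gamma f(r_1) \bigr)_{n \in \mathbb{N}_+}$. Hofbauer and Keller (1982, p. 133) proved that
\[ \sup_{x \in \mathbb{R}} \Bigl| \gamma\Bigl( \frac{\sum_{i=1}^{n} f(r_i) - n E_\gamma f(r_1)}{\sigma(f) \sqrt{n}} < x \Bigr) - \Phi(x) \Bigr| = O(n^{-a}) \]
\[ = \frac{1}{\log 2} \sum_{k \in \mathbb{N}_+} \frac{(-1)^{k+1}}{k^2} = \frac{\pi^2}{12 \log 2} \]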
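The constant $\pi^2/(12 \log 2)$ arises from the alternating series $\sum_{k \ge 1} (-1)^{k+1}/k^2 = \pi^2/12$. A quick numerical confirmation is sketched below; the number of terms is an arbitrary choice, and the alternating-series bound guarantees a truncation error below $1/(N+1)^2$.

```python
import math

def alternating_sum(terms):
    """Partial sum of (-1)^(k+1) / k^2 for k = 1..terms."""
    return sum((-1) ** (k + 1) / k ** 2 for k in range(1, terms + 1))

partial = alternating_sum(100000)
constant = partial / math.log(2)              # approximates pi^2 / (12 log 2)
exact = math.pi ** 2 / (12 * math.log(2))
```

The resulting value is approximately 1.18657.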
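while the corresponding $\sigma(f) = \sigma < \infty$ is non-zero. This can be shown
as follows. By the reversibility of $(\bar a_\ell)_{\ell\in\mathbb{Z}}$ (see Subsection 1.3.3), the finite
dimensional distributions under $\bar\gamma$ of $(\bar y_\ell)_{\ell\in\mathbb{Z}}$ and $(\bar r_\ell)_{\ell\in\mathbb{Z}}$ are identical. Then
$$\begin{aligned}
\sigma^2 &= \lim_{n\to\infty}\frac{1}{n}E_{\bar\gamma}\left(\sum_{i=1}^{n}\left(\log \bar y_i - \frac{\pi^2}{12\log 2}\right)\right)^2 \\
&= \lim_{n\to\infty}\frac{1}{n}E_{\bar\gamma}\left(\sum_{i=1}^{n}\left(\log \bar r_i - \frac{\pi^2}{12\log 2}\right)\right)^2 \\
&= \lim_{n\to\infty}\frac{1}{n}E_{\gamma}\left(\sum_{i=1}^{n}\left(\log r_i - \frac{\pi^2}{12\log 2}\right)\right)^2 .
\end{aligned}$$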
So, $\sigma^2$ coincides with (2.1.33) in the case where the function $h$ is defined by
$$h(\omega) = \log\frac{1}{\omega} - \frac{\pi^2}{12\log 2}, \quad \omega\in\Omega.$$
In this case convergence rates in the central limit theorem are available.
Misevičius (1981) proved that
$$\sup_{x\in\mathbb{R}}\left|\lambda\left(\frac{\log q_n - n\pi^2/(12\log 2)}{\sigma\sqrt{n}} < x\right) - \Phi(x)\right| = O\left(\frac{\log n}{\sqrt{n}}\right) \tag{3.2.17}$$
we can derive the corresponding result for the random variable $z_n$ defined
by
$$z_n(\omega) = \left|\omega - \frac{p_n(\omega)}{q_n(\omega)}\right|, \quad \omega\in\Omega,\ n\in\mathbb{N}_+.$$
We have
$$\left|\mu\left(\frac{\log z_n + n\pi^2/(6\log 2)}{2\sigma\sqrt{n}} < x\right) - \Phi(x)\right| = O\left(\frac{1}{\sqrt{n}}\right)$$
$$h(\omega) = \omega - \frac{1}{\log 2} + 1, \quad \omega\in\Omega,$$
$$S_{n0} = 0, \quad S_{nk} = \sum_{j=1}^{k} X_{nj},\ 1\le k\le n, \quad S_{nn} = S_n, \quad n\in\mathbb{N}_+.$$
and
$$\lim_{x\to\infty}\frac{1}{\tilde F(x)}\sum_{\{k:\,f(k)>x\}} k^{-2} = \frac{k_1}{k_1+k_2}, \qquad
\lim_{x\to\infty}\frac{1}{\tilde F(x)}\sum_{\{k:\,f(k)<-x\}} k^{-2} = \frac{k_2}{k_1+k_2}. \tag{3.3.2}$$
(iii) If either (I) or (II) above holds, then $\gamma$ can be replaced in (i) by any
$\mu \in \mathrm{pr}(\mathcal{B}_I)$ such that $\mu \ll \lambda$.
Proof. (i) and (iii) follow from Theorem A3.7 and Lemma 3.0.2, respectively.
The proof of (ii) is entirely similar to that working in the case of i.i.d.
random variables. See Samur (1989, p. 62) and Araujo and Giné (1980, pp.
81, 84–85, 87–88). □
Remark. In principle, from Theorem 3.3.1 we might derive the asymp-
totic behaviour as n → ∞ of random variables as, e.g.,
with the usual convention which assigns value 0 to a sum over the empty
set. Then
$$\mu\eta_n^{-1} \xrightarrow{\;w\;} Q_{\nu_\alpha} \quad \text{in } B_D$$
for any $\mu \in \mathrm{pr}(\mathcal{B}_I)$ such that $\mu \ll \lambda$.
Proof. (i) By Lemma A2.6(iii) it is sufficient to show that
$$\sum_{\{k:\,f(k)>x\}} k^{-2} \sim (f_1(x))^{-1} \quad \text{as } x\to\infty. \tag{3.3.3}$$
Hence
$$1 \le \frac{\sum_{\{k:\,f(k)>x\}} k^{-2}}{\sum_{k>f_2(x)} k^{-2}} \le 1 + \frac{\sum_{f_1(x)\le k\le f_2(x)} k^{-2}}{\sum_{k>f_2(x)} k^{-2}} \tag{3.3.5}$$
$$\xi_n(t) = \frac{1}{B_n}\sum_{j\le\lfloor nt\rfloor}\left(f(a_j) - E_\gamma f(a_1) I_{(f(a_1)\le B_n)}\right), \quad t\in I,$$
with $B_n$ satisfying
$$\lim_{n\to\infty} n B_n^{-2} F(B_n) = \frac{k_1+k_2}{2-\alpha}. \tag{3.3.9}$$
It is therefore sufficient to prove that in (3.3.9) we can take $B_n = f(n)$, $n\in\mathbb{N}_+$,
$k_1 = \alpha/\log 2$, $k_2 = 0$, and that
$$= \frac{\alpha}{(1-\alpha)\log 2}.$$
f1 (f (n) − 1) ≤ n ≤ f2 (f (n)) , n ∈ N+ .
f1 (f (n) − 1) ∼ f1 (f (n)) as n → ∞.
As f1 ∼ f2 , it follows that
fi (f (n)) ∼ n as n → ∞, i = 1, 2. (3.3.11)
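$$E_\gamma f(a_1) I_{(f(a_1)\le f(n))} \sim \frac{1}{\log 2}\sum_{\{k:\,f(k)\le f(n)\}} f(k)k^{-2} \quad \text{as } n\to\infty.$$
$$E_\gamma f(a_1) I_{(a_1\ge f_1(f(n)))} \sim \frac{1}{\log 2}\sum_{k\ge f_1(f(n))} f(k)k^{-2} \quad \text{as } n\to\infty.$$
$$\sim \frac{1}{\log 2}\,\frac{n}{f_1(f(n))}\,\frac{f(f_1(f(n)))}{f(n)}\,\frac{\alpha}{\alpha-1} \sim \frac{\alpha}{(\alpha-1)\log 2} \quad \text{as } n\to\infty,$$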
$$= \frac{\sin(\pi/\alpha)}{\pi}\int_0^x \frac{dt}{t^{1-1/\alpha}(1-t)^{1/\alpha}}, \quad 0\le x\le 1,$$
for any $\mu \in \mathrm{pr}(\mathcal{B}_I)$ such that $\mu \ll \lambda$.
Proof. It is easy to check that να defined in Corollary 3.3.2 is a strictly
stable probability and να ((0, ∞)) = 1/α for any 1 < α < 2. Then (3.3.18)
is an immediate consequence of Theorem 5.1 in de Acosta (1982). □
Remarks. 1. Proposition 3.3.3 holds for α = 2, too. In this case the
limiting distribution is the classical arc-sine law mentioned in Remark 2
following Theorem 3.2.4. However, the assumption on f in Proposition
3.3.3 is slightly stronger [cf. Corollary A2.7(ii)] than the assumption on f
in Theorem 3.2.4, under which the arc-sine law holds.
2. It follows from Proposition 3.3.3 [cf. Theorem 5.2 in de Acosta (1982)]
that
$$\mu\big(\lambda(t\in I : \xi_{\nu_\alpha}(t) > 0) < x\big) = \frac{\sin(\pi/\alpha)}{\pi}\int_0^x \frac{dt}{t^{1-1/\alpha}(1-t)^{1/\alpha}}, \quad 0\le x\le 1,$$
for any $1 < \alpha < 2$. This generalizes P. Lévy's arc-sine law for Brownian
motion. □
we have
$$A_n = E_\gamma a_1 I_{(a_1\le n)} = \frac{1}{\log 2}\sum_{j=1}^{n} j\log\frac{(j+1)^2}{j(j+2)}
= \frac{1}{\log 2}\left(\log(n+2) - (n+1)\log\frac{n+2}{n+1}\right), \quad n\in\mathbb{N}_+.$$
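The closed form for $A_n$ comes from telescoping the finite sum, and its growth is $(\log n - 1 + o(1))/\log 2$. The following is a minimal numerical sanity check in Python, not part of the original argument; the function names `A_sum` and `A_closed` are illustrative only.

```python
import math

def A_sum(n):
    """A_n as the finite sum (1/log 2) * sum_{j=1}^n j*log((j+1)^2 / (j(j+2)))."""
    total = sum(j * math.log((j + 1) ** 2 / (j * (j + 2))) for j in range(1, n + 1))
    return total / math.log(2)

def A_closed(n):
    """Closed form (1/log 2) * (log(n+2) - (n+1)*log((n+2)/(n+1)))."""
    return (math.log(n + 2) - (n + 1) * math.log((n + 2) / (n + 1))) / math.log(2)

def A_asymptotic(n):
    """Leading behaviour (log n - 1)/log 2, without the o(1) term."""
    return (math.log(n) - 1) / math.log(2)

for n in (1, 10, 1000):
    print(n, A_sum(n), A_closed(n), A_asymptotic(n))
```

The first two columns should agree to rounding error for every $n$, while the third approaches them only as $n$ grows.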
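Hence
$$A_n = \frac{1}{\log 2}\left(\log n - 1 + o(1)\right) \tag{3.3.19}$$
as $n\to\infty$. For any $\mu \in \mathrm{pr}(\mathcal{B}_I)$ such that $\mu \ll \lambda$ by Corollary 3.3.2(ii) we
have
$$\mu(\eta_n(1))^{-1} \xrightarrow{\;w\;} \nu_1, \tag{3.3.20}$$
where
$$\eta_n(1) = \frac{1}{n}\sum_{j=1}^{n}(a_j - A_n), \quad n\in\mathbb{N}_+.$$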
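where
$$\zeta_n(1) = \frac{1}{n}\sum_{j=1}^{n}\left(a_j + \frac{C-\log n}{\log 2}\right), \quad n\in\mathbb{N}_+,$$
$$\left|\gamma(\zeta_n(1) < x) - \nu'((-\infty,x))\right| \le c_0\,\frac{(\log n)^2}{n} \tag{3.3.22}$$
for any $n\in\mathbb{N}_+$ and $x\in\mathbb{R}$.
To conclude let us note that (3.3.21) is a special case of
$$\mu\zeta_n^{-1} \xrightarrow{\;w\;} Q_{\nu'} \quad \text{in } B_D,$$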
It follows from (3.3.19) and the strict stationarity of $(a_n)_{n\in\mathbb{N}_+}$ under $\gamma$ that
$$E_\gamma t_n(h) = \frac{n}{\log 2}\left(\log h(n) - 1 + o(1)\right) \tag{3.3.24}$$
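Let us find upper bounds for the $\gamma$-probabilities of $A_1$, $A_2$, and $A_3$. We have
$$A_1 = \left(t_n - E_\gamma t_n > \tfrac12 E_\gamma t_n\right) \subset \left(\left|t_n - E_\gamma t_n\right| > \tfrac12 E_\gamma t_n\right).$$
Since $t'_n \le t_n$, $n\in\mathbb{N}_+$, and $E_\gamma t_n/2 - E_\gamma t'_n < 0$ for $n$ large enough, for
such an $n$ we have
$$A_2 = \left(t_n < \tfrac12 E_\gamma t_n\right) \subset \left(t'_n < \tfrac12 E_\gamma t_n\right) = \left(t'_n - E_\gamma t'_n < \tfrac12 E_\gamma t_n - E_\gamma t'_n\right)
\subset \left(\left|t'_n - E_\gamma t'_n\right| > E_\gamma t'_n - \tfrac12 E_\gamma t_n\right).$$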
c02 n2 −2
γ(A2 ) ≤ ¡ ¢2 ≤ c5 (log n) . (3.3.28)
Eγ t0n − Eγ tn /2
Noting that
n ³
[ ´
(tn 6= tn ) = ai > n(blog4/3 nc + 1) ,
i=1
whence
³ ³ ´´
γ(tn 6= tn ) ≤ nγ a1 > n blog4/3 nc + 1 ≤ c6 (log n)−4/3 , (3.3.29)
we obviously have
γ(A3 ) ≤ c6 (log n)−4/3 . (3.3.30)
Next, let us find an upper bound for
¯ ¯ 4
¯1
¯ 1 ¯¯ X
Eγ ¯ − = Ii (n),
tn Eγ tn ¯
i=1
where Z ¯ ¯
¯1 1 ¯
Ii (n) = ¯ − ¯
¯ tn E t ¯ dγ, 1 ≤ i ≤ 4.
Ai γ n
Since tn ≤ tn , n ∈ N+ , on A1 we have
1 1 2
≤ < . (3.3.31)
tn tn 3Eγ tn
Finally, set
wn = (tn − Eγ tn )/Eγ tn
and note that by (3.3.24) and (3.3.25) we have
with the usual convention which assigns value 0 to a sum over the empty
set, where $f : [1,\infty) \to \mathbb{R}_{++}$ is bounded on finite intervals and regularly
varying of index $\beta > 1$. Then $d_0(\eta_n, \eta'_n)$ converges to 0 in $\gamma$-probability as
$n\to\infty$.
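Proof. Write $f(x) = x^\beta L(x)$, $x\in[1,\infty)$, where $L$ is slowly varying. For
any $n\in\mathbb{N}_+$ we have
$$d_0(\eta_n,\eta'_n) \le \sup_{t\in I}\left|\eta_n(t) - \eta'_n(t)\right| \le \frac{1}{f(n)}\sum_{j=1}^{n}\left|f(a_j) - f(b_j)\right| \le \delta'_n + \delta''_n, \tag{3.3.35}$$
where
$$\delta'_n = \frac{1}{f(n)}\sum_{j=1}^{n}\left(b_j^\beta - a_j^\beta\right)L(a_j), \qquad \delta''_n = \frac{1}{f(n)}\sum_{j=1}^{n} b_j^\beta\left|L(b_j) - L(a_j)\right|.$$
Using the inequality $(1+a)^\alpha - 1 \le a\left(\{\alpha\} + \lfloor\alpha\rfloor(1+a)^{\alpha-1}\right)$, valid for non-negative
$a$ and $\alpha$, we obtain
$$b_j^\beta - a_j^\beta \le c\beta(1+c)^{\beta-1}a_j^{\beta-1}, \quad 1\le j\le n,$$
whence
$$\delta'_n \le c\beta(1+c)^{\beta-1}\frac{1}{f(n)}\sum_{j=1}^{n} a_j^{-1}f(a_j).$$
Writing
$$a_j^{-1}f(a_j) = a_j^{-1}f(a_j)I_{(a_j\le M)} + a_j^{-1}f(a_j)I_{(a_j>M)}, \quad 1\le j\le n,$$
for an arbitrarily given $M\ge 1$, we easily obtain
$$\delta'_n \le c\beta(1+c)^{\beta-1}\left(\frac{n}{f(n)}\max_{1\le i\le M}\frac{f(i)}{i} + \frac{1}{M}\,\frac{1}{f(n)}\sum_{j=1}^{n} f(a_j)\right).$$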
$$+ \sum_{j=1}^{n}\left(\frac{b_j}{a_j}\right)^{\beta} f(a_j)\left|\frac{L(b_j)}{L(a_j)} - 1\right| I_{(a_j>M)}$$
$$\le \left(1 + (1+c)^\beta\right)\Big(\sup_{1\le x\le M+c} f(x)\Big)\frac{n}{f(n)} + \frac{(1+c)^\beta}{f(n)}\sup_{0\le s\le c,\ x>M}\left|\frac{L(x+s)}{L(x)} - 1\right|\sum_{j=1}^{n} f(a_j).$$
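with the usual convention which assigns value 0 to a sum over the empty set,
where $f : [1,\infty) \to \mathbb{R}_{++}$ is bounded on finite intervals and regularly varying
of index $1/\alpha$, $0 < \alpha < 1$. Let $\mu \in \mathrm{pr}(\mathcal{B}_I)$ such that $\mu \ll \lambda$. Then
$$\mu{\eta'_n}^{-1} \xrightarrow{\;w\;} Q_{\nu_\alpha} \quad \text{in } B_D.$$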
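with the usual convention which assigns value 0 to a sum over the empty
set, where $m(f, \bar b_0)$ and $E_{\bar\gamma} f(\bar b_0)$ are equal to
$$E_{\bar\gamma} f(\bar y_0) = E_{\bar\gamma} f(\bar r_0) = E_\gamma f(r_1) = \frac{1}{\log 2}\int_1^\infty \frac{f(x)\,dx}{x(x+1)},$$
$$E_{\bar\gamma} f(\bar u_0) = \frac{1}{\log 2}\left(\int_1^2 \frac{f(x)(x-1)\,dx}{x^2} + \int_2^\infty \frac{f(x)\,dx}{x^2}\right),$$
$$\begin{aligned}
E_{\bar\gamma} f(\bar y_0) = E_{\bar\gamma} f(\bar r_0) = E_\gamma f(r_1)
&= \frac{1}{\log 2}\int_1^\infty \frac{x^{1/\alpha}\,dx}{x(x+1)} = \frac{1}{\log 2}\int_0^1 \frac{v^{-1/\alpha}\,dv}{v+1} \\
&= \frac{1}{\log 2}\sum_{j\in\mathbb{N}_+}\frac{1}{(2j-1-1/\alpha)(2j-1/\alpha)} \\
&= \frac{1}{2\log 2}\left(\psi\left(1-\frac{1}{2\alpha}\right) - \psi\left(\frac12-\frac{1}{2\alpha}\right)\right),
\end{aligned}$$
$$m(f,\bar u_0) = E_{\bar\gamma}\left(\bar r_0 - \bar a_0 + \bar y_0^{-1}\right) = m(f,\bar r_0) + E_{\bar\gamma}\left(\bar y_0^{-1}\right)
= \frac{2}{\log 2}\int_1^\infty \frac{dx}{x^2(x+1)} = 2\left((\log 2)^{-1} - 1\right).$$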
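It follows that if for any $n\in\mathbb{N}_+$ the process $\zeta'_n = (\zeta'_n(t))_{t\in I}$ is defined
by
$$\zeta'_n(t) = \frac{1}{n}\sum_{j\le\lfloor nt\rfloor}\left(b_j + \frac{C-\log n}{\log 2}\right), \quad t\in I,$$
and
$$\lim_{n\to\infty}\mu\left(\frac{\mathrm{card}\{1\le k\le n : \sum_{j=1}^{k} u_j > k(\log n - C)/\log 2\}}{n} < x\right)
= \mu\big(\lambda(t\in I : \xi_{\nu'''}(t) > 0) < x\big), \quad 0\le x\le 1.$$
□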
1 ¡ ¡ ¢¢
θn (t) = √ Sbntc + (nt − bntc) Hbntc+1 − Eγ H1
σ 2n log log n
1
= √ ξC , t ∈ I.
2n log log n n
converges or diverges.
Proof. These results follow from Theorem 3.2.20 and properties of standard
Brownian motion. See Jain and Taylor (1973) and Jain, Jogdeo and
Stout (1975) [cf. Philipp and Stout (1975)]. □
Except for the sufficiency of the moment assumption $E_\gamma H_1^2 < \infty$ in
the case considered there, the considerations on Theorem 3.4.1 following
Corollary 3.4.2 are valid for Proposition 3.4.4, too.
We note that Proposition 3.4.4(i) implies the classical law of the iterated
logarithm
$$\gamma\left(\limsup_{n\to\infty}\frac{S_n}{\sigma\sqrt{2n\log\log n}} = 1\right) = 1. \tag{3.4.1}$$
To obtain (3.4.1) we should take successively $\theta(n) = (1+\varepsilon)\sqrt{2\log\log n}$ and
$\theta(n) = (1-\varepsilon)\sqrt{2\log\log n}$, $0 < \varepsilon < 1$, $n\in\mathbb{N}_+$. Also, Proposition 3.4.4(ii)
implies Chung's law of the iterated logarithm for maximum absolute partial
sums
$$\gamma\left(\liminf_{n\to\infty}\frac{\max_{1\le i\le n}|S_i|}{\sigma\sqrt{n/(\log\log n)}} = \frac{\pi}{\sqrt 8}\right) = 1. \tag{3.4.2}$$
To obtain (3.4.2) we should take successively $\theta(n) = (\sqrt 8/\pi)(1+\varepsilon)\sqrt{\log\log n}$
and $\theta(n) = (\sqrt 8/\pi)(1-\varepsilon)\sqrt{\log\log n}$, $0 < \varepsilon < 1$, $n\in\mathbb{N}_+$.
We conjecture that in the special case where $H$ only depends on finitely
many coordinates of a current point in $\mathbb{N}_+^{\mathbb{N}_+}$, Chung's law of the iterated
logarithm (3.4.2) holds only assuming that $E_\gamma H_1^2 < \infty$ [as (3.4.1) does]. See
Jain and Pruitt (1975) for the i.i.d. case.
Proof. The results follow at once from Theorem 3.2.9(b) and Strassen's
law of the iterated logarithm for standard Brownian motion [see Theorem 1
in Strassen (1964)]. □
Note that in the present context we cannot make considerations similar
to those following Corollary 3.4.2.
Example 3.4.6 Let $f(x) = \log x$, $x\in[1,\infty)$. As we have seen in Example
3.2.11, in the cases where $b_n = y_n$ or $b_n = r_n$, $n\in\mathbb{N}_+$, we have
$$E_\gamma f(b_0) = \frac{\pi^2}{12\log 2}$$
and $\sigma(f) = \sigma < \infty$ is non-zero. It follows that Strassen's law of the iterated
logarithm holds for the corresponding processes $\theta'_n$, $n\in\mathbb{N}_+$. In particular,
the classical law of the iterated logarithm
$$\gamma\left(\limsup_{n\to\infty}\frac{\log q_n - n\pi^2/(12\log 2)}{\sigma\sqrt{2n\log\log n}} = 1\right) = 1$$
holds. This had been proved by Gordin and Reznik (1970) and Philipp and
Stackelberg (1969). □
A result similar to Proposition 3.4.4 holds.
Proposition 3.4.7 Let $\theta : [1,\infty) \to \mathbb{R}_{++}$ be non-decreasing. Then
under the assumptions of Theorem 3.2.9 the following assertions hold:
(i) $\gamma\left(S'_n > \sigma(f)\sqrt{n}\,\theta(n) \text{ i.o.}\right) = 0$ or $1$
according as
$$\int_1^\infty \frac{\theta(t)}{t}\exp\left(-\frac{\theta^2(t)}{2}\right)dt$$
converges or diverges.
(ii) $\gamma\left(\max_{1\le i\le n}|S'_i| < \sigma(f)\sqrt{n}/\theta(n) \text{ i.o.}\right) = 0$ or $1$
according as
$$\int_1^\infty \frac{\theta^2(t)}{t}\exp\left(-\frac{\pi^2\theta^2(t)}{8}\right)dt$$
converges or diverges.
Proof. These results follow from Theorem 3.2.9 and properties of standard
Brownian motion. See Jain and Taylor (1973) and Jain, Jogdeo and
Stout (1975) [cf. Philipp and Stout (1975)]. □
The remarks following Proposition 3.4.4 concerning the classical and
Chung’s laws of the iterated logarithm apply mutatis mutandis in the present
context, too.
It is obvious that all the results stated in this section still hold when γ
is replaced by any µ ∈ pr(BI ) such that µ ¿ λ.
Chapter 4
$$\lim_{n\to\infty}\mu(T^n(A)) = 1.$$
Ergodic theory of continued fractions 221
and
$$\tilde f\circ T = \tilde f \quad \mu\text{-a.s.}$$
Moreover, $\int_X \tilde f\,d\mu = \int_X f\,d\mu$ and if, in addition, $T$ is ergodic under $\mu$, then
$\tilde f$ is $\mu$-a.s. a constant equal to $\int_X f\,d\mu$.
A proof of the ergodic theorem can be found in, e.g., Billingsley (1965),
Walters (1982), Petersen (1983) or Cornfeld et al. (1982). In particular, in
Keane (1991) a short proof, essentially based on an idea of Kamae (1982),
is outlined. See also Katznelson and Weiss (1982). □
Under suitable assumptions it is possible to refine Birkhoff’s theorem by
giving an estimate of the convergence rate to the limit f˜. The result stated
below is a special case of Theorem 3 of Gál and Koksma (1950).
Proposition 4.0.4 Let $T$ be a $\mu$-preserving transformation of $X$ which
is ergodic under $\mu$. Assume that
$$\int_X\left(\sum_{\kappa=0}^{n-1} f\circ T^\kappa - n\int_X f\,d\mu\right)^2 d\mu = O(\Psi(n))$$
and define T : XT → XT by
where $A_j\in\mathcal{X}$, $0\le j\le n$, $n\in\mathbb{N}$, by setting
$$\mu(C(A_0,\dots,A_n)) = \mu\left(\bigcap_{0\le j\le n} T^{-n+j}(A_j)\right), \quad n\in\mathbb{N}.$$
is called a skew product of T and (Tx )x∈X . In many cases the natural
extensions are constructed as skew products. Several examples can be found
in the next sections.
Assuming that $T$ is $\mu$-preserving and $T_x$ is $\nu_x$-preserving for any $x\in X$,
we might expect the skew-product $S$ to be $\nu$-preserving, where $\nu$ is the
probability measure on $\mathcal{X}\otimes\mathcal{Y}$ defined by
$$\nu(A\times B) = \int_A \nu_x(B)\,\mu(dx), \quad A\in\mathcal{X},\ B\in\mathcal{Y}.$$
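for any $(\omega_i)_{i\in\mathbb{N}} = (\omega_0,\omega_1,\cdots)\in\Omega_\tau$. Let us remark that by the very definition
of $\Omega_\tau$ we have $\omega_{i+1} = 1/(\kappa_i+\omega_i)$ for some $\kappa_i\in\mathbb{N}_+$ whatever $i\in\mathbb{N}$.
Hence $\Omega_\tau$ can be viewed as the Cartesian product
$$\Omega\times\mathbb{N}_+^{\mathbb{N}_+}$$
and
$$\left(\tau(\omega_0),\ \frac{1}{\lfloor\omega_0^{-1}\rfloor + [\lfloor\omega_1^{-1}\rfloor, \lfloor\omega_2^{-1}\rfloor, \cdots]}\right)\in\Omega^2.$$
These considerations show that we can identify $\tilde\tau:\Omega_\tau\to\Omega_\tau$ and $\bar\tau:\Omega^2\to\Omega^2$
defined as in Subsection 1.3.1 by
$$\bar\tau(\omega,\theta) = \left(\tau(\omega),\ \frac{1}{a_1(\omega)+\theta}\right), \quad (\omega,\theta)\in\Omega^2.$$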
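To make the two-dimensional map concrete, here is a small Python sketch using exact rational arithmetic. It is an illustration only: the map is defined on $\Omega^2$ (irrationals), and the rational starting point and the function names `tau` and `tau_bar` are assumptions made for the example. Starting from $\theta = 0$, the second coordinate after $n$ steps is the finite continued fraction $[a_n, \dots, a_1]$.

```python
import math
from fractions import Fraction

def tau(w):
    """Continued fraction transformation tau(w) = 1/w - floor(1/w), for w in (0, 1)."""
    inv = 1 / w
    return inv - math.floor(inv)

def tau_bar(w, th):
    """Two-dimensional map (w, th) -> (tau(w), 1/(a_1(w) + th)), with a_1(w) = floor(1/w)."""
    a1 = math.floor(1 / w)
    return tau(w), 1 / (a1 + th)

# Illustrative starting point: 10/37 = [3, 1, 2, 3], paired with theta = 0.
point = (Fraction(10, 37), Fraction(0))
for step in range(1, 4):
    point = tau_bar(*point)
    print(step, point)
```

Each step moves the leading digit of the first coordinate onto the front of the expansion of the second coordinate, which is why the map is invertible on pairs.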
$$\frac{\mathrm{card}\{\kappa : a_\kappa = i,\ 1\le\kappa\le n\}}{n} = \frac{1}{\log 2}\log\left(1+\frac{1}{i(i+2)}\right) + o\left(n^{-1/2}\log^{(3+\varepsilon)/2} n\right) \quad \text{a.e.}$$
as $n\to\infty$.
Proof. The first equation in the above statement follows from (4.1.1) by
taking $f = I_{(a_1=i)}$, hence $f\circ\tau^\kappa = I_{(a_1\circ\tau^\kappa=i)} = I_{(a_{\kappa+1}=i)}$, $\kappa\in\mathbb{N}$. The second
equation follows from Proposition 4.0.4 on account of Corollaries 1.3.15 and
A3.3 which yield $\Psi(n) = n$, $n\in\mathbb{N}_+$. □
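The limiting digit frequencies can be tabulated directly and compared with an empirical count. The sketch below is illustrative and makes one loud assumption: it uses the digits of a pseudo-random rational with a large denominator as a stand-in for a "typical" irrational, which is a heuristic and not what the proposition states. The names `gauss_kuzmin` and `cf_digits` are hypothetical.

```python
import math
import random
from fractions import Fraction

def gauss_kuzmin(i):
    """Asymptotic relative frequency of digit i: (1/log 2) * log(1 + 1/(i(i+2)))."""
    return math.log1p(1 / (i * (i + 2))) / math.log(2)

def cf_digits(x, limit):
    """First `limit` regular continued fraction digits of a rational x in (0, 1)."""
    digits = []
    while x != 0 and len(digits) < limit:
        x = 1 / x
        a = math.floor(x)
        digits.append(a)
        x -= a
    return digits

random.seed(0)  # arbitrary seed, for reproducibility only
omega = Fraction(random.getrandbits(4000) | 1, 2 ** 4000)  # stand-in for a generic point
digits = cf_digits(omega, 2000)
for i in range(1, 6):
    empirical = digits.count(i) / len(digits)
    print(i, round(gauss_kuzmin(i), 4), round(empirical, 4))
```

With only a couple of thousand digits the empirical column fluctuates at roughly the $n^{-1/2}$ scale, in line with the error term above.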
$$= \frac{1}{\log 2}\log\frac{1+v(i^{(m)})}{1+u(i^{(m)})} + o\left(n^{-1/2}\log^{(3+\varepsilon)/2} n\right) \quad \text{a.e.}$$
as $n\to\infty$.
The proof is quite similar to that of the preceding proposition. In (4.1.1)
we should take $f = I_{((a_1,\cdots,a_m)=i^{(m)})}$. □
It is important to note that the asymptotic relative digit frequencies as
well as the asymptotic relative $m$-digit block frequencies, $m\ge 2$, constitute
probability distributions on $\mathbb{N}_+$ and $\mathbb{N}_+^m$, respectively. This is quite easily
checked in the first case and not so easily in the second one (by induction on
$m$). Actually, this follows from (4.1.1) on account of the countable additivity
of the integral there with respect to the integrand.
We now give other results related to asymptotic relative digit frequencies.
Corollary 4.1.3 (Asymptotic relative frequencies of digits between two
given values) For any $i, j\in\mathbb{N}_+$ such that $i\le j$ we have
$$\lim_{n\to\infty}\frac{\mathrm{card}\{\kappa : i\le a_\kappa\le j,\ 1\le\kappa\le n\}}{n} = \frac{1}{\log 2}\log\frac{(i+1)(j+1)}{i(j+2)} \quad \text{a.e.}$$
$$\frac{\mathrm{card}\{\kappa : i\le a_\kappa\le j,\ 1\le\kappa\le n\}}{n} = \frac{1}{\log 2}\log\frac{(i+1)(j+1)}{i(j+2)} + o\left(n^{-1/2}\log^{(3+\varepsilon)/2} n\right) \quad \text{a.e.}$$
as $n\to\infty$.
This is a direct consequence of Proposition 4.1.1, which can also be
obtained from (4.1.1) by taking $f = I_{(i\le a_1\le j)}$.
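The range formula is the telescoped sum of the single-digit frequencies $\frac{1}{\log 2}\log\left(1+\frac{1}{k(k+2)}\right)$ over $i\le k\le j$. A short Python check of that identity follows; the function names are illustrative.

```python
import math

def freq_range_closed(i, j):
    """Closed form (1/log 2) * log((i+1)(j+1) / (i(j+2)))."""
    return math.log2((i + 1) * (j + 1) / (i * (j + 2)))

def freq_range_sum(i, j):
    """Sum of the single-digit frequencies log2(1 + 1/(k(k+2))) for k = i..j."""
    return sum(math.log2(1 + 1 / (k * (k + 2))) for k in range(i, j + 1))

for i, j in [(1, 1), (3, 17), (11, 100)]:
    print((i, j), freq_range_closed(i, j), freq_range_sum(i, j))
```

Taking $i = j$ recovers the single-digit frequency, since $(i+1)^2/(i(i+2)) = 1 + 1/(i(i+2))$.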
The proof rests on a special case of a result from Whittaker and Watson
(1927, Section 12.13), which reads as follows.
Let $\alpha_i, \beta_i\in\mathbb{C}\setminus\mathbb{N}_+$, $1\le i\le r$, for a given $r\in\mathbb{N}_+$. Then the infinite
product
$$\prod_{n\in\mathbb{N}_+}\frac{(n-\alpha_1)(n-\alpha_2)\cdots(n-\alpha_r)}{(n-\beta_1)(n-\beta_2)\cdots(n-\beta_r)}$$
converges if and only if $\sum_{i=1}^{r}\alpha_i = \sum_{i=1}^{r}\beta_i$. If this condition is fulfilled,
then
$$\prod_{n\in\mathbb{N}_+}\frac{(n-\alpha_1)(n-\alpha_2)\cdots(n-\alpha_r)}{(n-\beta_1)(n-\beta_2)\cdots(n-\beta_r)} = \prod_{i=1}^{r}\frac{\Gamma(1-\beta_i)}{\Gamma(1-\alpha_i)}. \tag{4.1.2}$$
□
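As a quick illustration of (4.1.2), take $r = 2$, $\alpha = (1/2, -1/2)$ and $\beta = (0, 0)$, so that both sums vanish. The product is then $\prod_n (1 - 1/(4n^2))$, and the Gamma side gives $1/(\Gamma(1/2)\Gamma(3/2)) = 2/\pi$ (the Wallis product). The Python sketch below compares a partial product with the Gamma expression; the parameter choice and function names are illustrative assumptions.

```python
import math

def partial_product(alphas, betas, terms):
    """Partial product over n = 1..terms of prod_i (n - alpha_i) / (n - beta_i)."""
    value = 1.0
    for n in range(1, terms + 1):
        for a, b in zip(alphas, betas):
            value *= (n - a) / (n - b)
    return value

def gamma_side(alphas, betas):
    """Right-hand side of (4.1.2): prod_i Gamma(1 - beta_i) / Gamma(1 - alpha_i)."""
    return math.prod(math.gamma(1 - b) / math.gamma(1 - a) for a, b in zip(alphas, betas))

alphas, betas = (0.5, -0.5), (0.0, 0.0)
print(partial_product(alphas, betas, 200_000), gamma_side(alphas, betas), 2 / math.pi)
```

The partial product converges slowly (the relative error after $N$ factors is of order $1/N$), so the Gamma expression is the practical way to evaluate such products.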
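For example, using the well known relations $\Gamma(z)\Gamma(1-z) = \pi/\sin\pi z$,
$z\notin\mathbb{Z}$, and $\Gamma(z+1) = z\Gamma(z)$, $z\notin -\mathbb{N}$, if we take $m = 2$ and $\ell = 1$ then we
find that
$$\lim_{n\to\infty}\frac{\mathrm{card}\{\kappa : a_\kappa\equiv 1 \bmod 2,\ 1\le\kappa\le n\}}{n}
= \frac{1}{\log 2}\log\frac{\Gamma(1/2)\Gamma(3/2)}{\Gamma^2(1)} = \frac{\log\pi}{\log 2} - 1 = 0.6514\cdots \quad \text{a.e.},$$
$$= \frac{1}{\log 2}\log\frac{\Gamma(1/4)\Gamma(3/4)}{\Gamma^2(1/2)} = \frac12 \quad \text{a.e.},$$
$$= \frac{1}{\log 2}\sum_{i\in\mathbb{N}_+}\sum_{j\in\mathbb{N}_+}\log\frac{(4ij+1)(4ij+2i+2j+2)}{(4ij+2i+1)(4ij+2j+1)} \quad \text{a.e.},$$
$$\frac{1}{\log 2}\sum_{i\in\mathbb{N}_+}\log\frac{\Gamma\left(1+\frac{2i+1}{4i}\right)\Gamma\left(1+\frac{1}{4i+2}\right)}{\Gamma\left(1+\frac{1}{4i}\right)\Gamma\left(1+\frac{i+1}{2i+1}\right)}.$$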
Nolte (op. cit.) proved that the last quantity can be expressed as
µ ¶
1 X n ζ(n) − 1 2−n 2−2n 2n−1 − 1
α+ (−1) (2 −2 − 1)(ζ(n) − 1) + 2n−2 ,
log 2 n 2
n≥2
where
µ ¶
2 √ 4 1
α = log 2 − 1 + log 6 2π − log Γ = 0.08167 · · · .
log 2 log 2 4
Actually, all the results we have proved so far are special cases of the
following result.
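Proposition 4.1.6 Given $m\in\mathbb{N}_+$, let $H : \mathbb{N}_+^m\to\mathbb{R}$ be such that
$$\sum_{i^{(m)}\in\mathbb{N}_+^m}|H(i^{(m)})|\left(v(i^{(m)}) - u(i^{(m)})\right) < \infty$$
where
$$\alpha_m = \frac{1}{\log 2}\sum_{i^{(m)}\in\mathbb{N}_+^m} H(i^{(m)})\log\frac{1+v(i^{(m)})}{1+u(i^{(m)})}.$$
If, in addition,
$$E_\lambda H^2(a_1,\cdots,a_m) = \sum_{i^{(m)}\in\mathbb{N}_+^m} H^2(i^{(m)})\left(v(i^{(m)}) - u(i^{(m)})\right) < \infty$$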
$$\frac{1}{n}\sum_{\kappa=0}^{n-1} H(a_\kappa,\cdots,a_{\kappa+m-1}) = \alpha_m + o\left(n^{-1/2}\log^{(3+\varepsilon)/2} n\right) \quad \text{a.e.}$$
as $n\to\infty$.
For the proof this time the choice of f in (4.1.1) is
Eλ H 2 (a1 , · · · , am ) < ∞
1X
n−1 ³ 1 ´
H(aκ , · · · , aκ+m−1 ) = αm + o n− 2 αm
2
log2+ε n a.e.
n
κ=0
as n → ∞. 2
We shall now consider other important special cases of Proposition 4.1.6.
With $m = 1$ and
$$H(i) = H_p(i) = \begin{cases} i^p & \text{if } p < 1,\ p\ne 0,\\ \log i & \text{if } p = 0\end{cases}$$
and
$$\lim_{n\to\infty}\left(\frac{a_1^p+\cdots+a_n^p}{n}\right)^{1/p} = K_p \quad \text{a.e.}$$
$$= 2.685452\cdots$$
and
$$K_p = \left(\frac{1}{\log 2}\sum_{i\in\mathbb{N}_+} i^p\log\left(1+\frac{1}{i(i+2)}\right)\right)^{1/p} = \left(\frac{1}{\log 2}\int_0^1\frac{(\lfloor 1/t\rfloor)^p}{1+t}\,dt\right)^{1/p}.$$
In particular,
K−1 = 1.745405 · · · , K−2 = 1.450340 · · · , K−3 = 1.313507 · · · ,
K−4 = 1.236961 · · · , K−5 = 1.189003 · · · , K−6 = 1.156552 · · · ,
K−7 = 1.133323 · · · , K−8 = 1.115964 · · · , K−9 = 1.102543 · · · ,
K−10 = 1.091877 · · · .
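The series for $K_p$ can be summed naively to reproduce the first few of these values. The Python sketch below is a plain truncated sum, not the rapidly converging series of Bailey et al.; the function name `khinchin` and the truncation length are assumptions. For negative $p$ the tail is tiny, whereas for $p = 0$ the truncation error after $N$ terms is of order $(\log N)/N$, so only the leading digits of $K_0$ are reliable.

```python
import math

def khinchin(p, terms=200_000):
    """Truncated-series approximation of K_p for p < 1 (K_0 is the geometric-mean constant)."""
    def weight(i):
        # Asymptotic relative frequency of digit i.
        return math.log1p(1 / (i * (i + 2))) / math.log(2)
    if p == 0:
        # K_0 = exp( sum_i log(i) * weight(i) )
        return math.exp(sum(math.log(i) * weight(i) for i in range(1, terms + 1)))
    # K_p = ( sum_i i^p * weight(i) )^(1/p)
    return sum(i ** p * weight(i) for i in range(1, terms + 1)) ** (1 / p)

for p in (0, -1, -2, -3):
    print(p, khinchin(p))
```

For serious numerical work the accelerated series mentioned in the text are the appropriate tool; this sum is only meant to make the definition tangible.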
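More precisely, whatever $\varepsilon > 0$ we have
$$(a_1\cdots a_n)^{1/n} = K_0 + o\left(n^{-1/2}\log^{(3+\varepsilon)/2} n\right) \quad \text{a.e.}$$
as $n\to\infty$, and
$$\left(\frac{a_1^p+\cdots+a_n^p}{n}\right)^{1/p} = K_p + o\left(n^{-1/2}\log^{(3+\varepsilon)/2} n\right) \quad \text{a.e.}$$
for any $p < 1/2$, $p\ne 0$, as $n\to\infty$.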
The cases p = 0 and p = −1 leading to the asymptotic a.e. values K0 and
K−1 of the geometric, respectively, harmonic mean of the first n incomplete
quotients as n → ∞ , were studied by Khintchine (1934/35). Ever since its
discovery much effort has been put in the numerical evaluation of K0 . See
Lehmer (1939), Pedersen (1959), Shanks and Wrench, Jr. (1959), Wrench,
Jr. (1960). In the last reference K0 has been evaluated to 155 decimal places.
Recently, using work by Wrench, Jr. and Shanks (1996), Bailey et al. (1997)
have presented rapidly converging series for any Kp , p < 1, allowing them to
evaluate K0 and K−1 to 7,350 decimal places and Kp for p = −2, −3, · · · , −10
to 50 decimal places. Setting
$$\zeta(s,n) = \zeta(s) - \sum_{i=1}^{n} i^{-s}, \quad s > 1,\ n\in\mathbb{N}_+,$$
where
$$A_i = \sum_{\kappa=1}^{2i-1}(-1)^{\kappa-1}/\kappa, \quad i\in\mathbb{N}_+;$$
(ii) whatever the negative integer p, for any n ∈ N+ we have
µ ¶
P j−p−1
X ζ(2i + j − p, n)
1
j∈N −p − 1
Kpp =
log 2 i
i∈N+
X µ ¶
1
− (i − 1)p log 1 − 2 ;
i
2≤i≤n
where the infimum is taken over all open coverings $\mathcal{U} = \{U_i\}_i$ of $E$ such that
$\mathrm{diam}(U_i)\le\varepsilon$. The Hausdorff measure $H^\delta(E)$ and the Hausdorff dimension
$\dim_H(E)$ of $E$ are then defined as
$$H^\delta(E) = \lim_{\varepsilon\to 0} H^\delta_\varepsilon(E), \qquad \dim_H(E) = \inf\left\{\delta : H^\delta(E) = 0\right\}.$$
Thus $\limsup_{n\to\infty} t_n/n = \infty$ a.e. and it remains to show that $\limsup$ can be
replaced by $\lim$. Actually, we shall prove much more.
Theorem 4.1.9 [Diamond and Vaaler (1986)] We have
$$t_n = \frac{1+o(1)}{\log 2}\,n\log n + \theta_n\max_{1\le i\le n} a_i \quad \text{a.e.}$$
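$$= \frac{n}{\log 2}(1+o(1))\sum_{j=1}^{\lfloor h(n)\rfloor}\frac{1}{j} = n\log\lfloor h(n)\rfloor\,(1+o(1))/\log 2$$
as $n\to\infty$. But
$$E_\gamma(t'_1)^2 = \frac{1}{\log 2}\sum_{j=1}^{\lfloor h(n)\rfloor} j^2\log\left(1+\frac{1}{j(j+2)}\right) = \lfloor h(n)\rfloor\,(1+o(1))/\log 2$$
$$n_\kappa = \lfloor\exp\kappa^{1-\varepsilon}\rfloor, \quad \kappa\in\mathbb{N}_+.$$
Note that
$$n_{\kappa-1} = \left(1+O(\kappa^{-\varepsilon})\right)n_\kappa$$
as $\kappa\to\infty$ so that $n_{\kappa-1}/n_\kappa$ and $h(n_{\kappa-1})/h(n_\kappa)$ both converge to 1 as $\kappa\to\infty$.
By the choice of the $n_\kappa$ it is obvious that the series with general term
$$\frac{E_\gamma\left(t'_{n_\kappa} - E_\gamma t'_{n_\kappa}\right)^2}{n_\kappa h(n_\kappa)\kappa^{1+\varepsilon}}, \quad \kappa\in\mathbb{N}_+,$$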
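is convergent. Hence by Beppo Levi's theorem the random series with general
term
$$\frac{\left(t'_{n_\kappa} - E_\gamma t'_{n_\kappa}\right)^2}{n_\kappa h(n_\kappa)\kappa^{1+\varepsilon}}, \quad \kappa\in\mathbb{N}_+,$$
is convergent a.e. Therefore
$$\left|t'_{n_\kappa} - E_\gamma t'_{n_\kappa}\right| = o\left(n_\kappa\,\kappa^{(1+\varepsilon)/2}\log^{(1+2\varepsilon)/4} n_\kappa\right) \quad \text{a.e.}$$
as $\kappa\to\infty$.
Next, for any $n\in\mathbb{N}_+$ satisfying $n_{\kappa-1} < n\le n_\kappa$ for some $\kappa\in\mathbb{N}_+$ we
clearly have
$$t'_{n_{\kappa-1}}\le t'_n\le t'_{n_\kappa},$$
so that
$$(1+o(1))E_\gamma t'_{n_{\kappa-1}}\le t'_n\le(1+o(1))E_\gamma t'_{n_\kappa} \quad \text{a.e.}$$
as $n\to\infty$, and since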
hold for two distinct indices i, j ≤ n. To proceed fix i < j. It follows from
Corollary 1.3.15 that
γ(ai > h(n), aj > h(n)) = O(γ(ai > h(n))γ(aj > h(n)))
for all sufficiently large n. By (4.1.3) and (4.1.4) the proof is complete. 2
Remarks. 1. It is now clear from the above theorem and Proposition
3.1.7 why tn /n log n converges in probability, rather than a.e., to 1/ log 2 as
n → ∞. The obstacle to a.e. convergence is the occurrence of a single large
value of the digits. At the same time, a.e. convergence can be obtained by
excluding at most one summand.
2. It is interesting to compare Theorems 3.3.4 and 4.1.9 (see also Corol-
lary 3.1.11). 2
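Corollary 4.1.10 Whatever $0\le\varepsilon < 1$ we have
$$\lim_{n\to\infty}\frac{a_1+\cdots+a_n}{n(\log n)^\varepsilon} = \infty \quad \text{a.e.}$$
$$t_n = \frac{1+o(1)}{\log 2}\,n\log n + \theta_n c_n \quad \text{a.e.}$$
as $n\to\infty$, where $\theta_n$ is an $I$-valued random variable for any $n\in\mathbb{N}_+$.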
Then
$$\limsup_{n\to\infty}\frac{a_1+\cdots+a_n}{d_n} = \frac{1}{\log 2} \quad \text{a.e.}$$
1 + o(1) dn
tn ≤ dn + a.e.
log 2 log log 10κ
$$S = \{n\in\mathbb{N}_+ : c_n < n\log n\},$$
we have
$$\lim_{x\to\infty}\frac{1}{\log x}\sum_{n\le x,\ n\in S}\frac{1}{n} = 0,$$
that is, S has logarithmic density zero. It then follows from Corollary 4.1.11
that
a1 + · · · + an = O(cn )
as n → ∞ for all integers n outside a set of logarithmic density 0. See also
Corollary 3.1.9.
3. Theorem 4.1.9 can be easily generalized for a function $H : \mathbb{N}_+\to\mathbb{R}_{++}$
satisfying
$$\sum_{1\le i\le n} H^2(i)/i^2 \Big/ \Big(\sum_{1\le i\le n} H(i)/i^2\Big)^2 = O\left(n\log^{-\frac32-\varepsilon} n\right)$$
π − 3 = [ 7, 15, 1, 292, 1, 1, 1, 2, 1, 3, · · · ] .
In Bailey et al. (1997, p. 423) it is asserted that, based on the first 17,001,303
continued fraction digits of π − 3, the geometric mean is 2.68639 and the
harmonic mean is 1.745882, which are reasonably close to K0 and K−1 —see
Proposition 4.1.8. Clearly, no conclusion can be drawn beyond this.
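The leading digits of $\pi - 3$ quoted above are easy to reproduce from a decimal approximation of $\pi$. The sketch below uses exact rational arithmetic on a 50-decimal truncation; that truncation, and the names `PI_50` and `cf_digits`, are assumptions of the example. Only the first few digits obtained this way are trustworthy, since each continued fraction digit consumes roughly one decimal digit of precision.

```python
import math
from fractions import Fraction

# 50 decimals of pi; the truncation limits how many CF digits are reliable.
PI_50 = Fraction("3.14159265358979323846264338327950288419716939937510")

def cf_digits(x, limit):
    """First `limit` regular continued fraction digits of a rational x in (0, 1)."""
    digits = []
    while x != 0 and len(digits) < limit:
        x = 1 / x
        a = math.floor(x)
        digits.append(a)
        x -= a
    return digits

digits = cf_digits(PI_50 - 3, 10)
geometric_mean = math.exp(sum(math.log(a) for a in digits) / len(digits))
harmonic_mean = len(digits) / sum(1 / a for a in digits)
print(digits, geometric_mean, harmonic_mean)
```

With so few digits the two means are dominated by the single large digit 292 and say nothing about $K_0$ or $K_{-1}$; the comparison in the text relies on millions of digits.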
For computations concerning the continued fraction digits of various ir-
rationals in I we refer the reader to Alexandrov (1978), Brjuno (1964),
Choong, Daykin and Rathbone (1971) (see nevertheless D. Shanks’ review
[MR 52 # 7073] of this paper), Lang and Trotter (1972), Richtmyer (1975),
Shiu (1995), and J.O. Shallit’s review [MR 96b: 11165] of this last paper.
Presenting an algorithm for computing the continued fraction expansion
of numbers which are zeroes of differentiable functions, Shiu (1995)
√ obtained
statistics of the√ first 10000 digits of irrationals in I such as 3 2 − 1, π − 3,
π 2 − 9, log 2, 2 2 − 2. Table 1 below is compiled from his Table 1. The last
column contains the (theoretical) asymptotic relative digit frequencies
$$\frac{1}{\log 2}\log\left(1+\frac{1}{i(i+2)}\right), \quad 1\le i\le 10,$$
$$\frac{1}{\log 2}\log\frac{12\times 101}{11\times 102}$$
of the digits in the range [11, 100] in the 11th line, and the asymptotic
relative frequency
$$\frac{1}{\log 2}\log\frac{102}{101}$$
of the digits exceeding 100 in the last line. Cf. Propositions 4.1.1, 4.1.3, and
4.1.4.
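The three kinds of theoretical entries described above (digits 1 to 10, digits in [11, 100], digits exceeding 100) partition $\mathbb{N}_+$, so they must add up to 1. The following Python sketch computes them and the corresponding expected counts in 10000 digits; the variable names are illustrative.

```python
import math

# Theoretical asymptotic relative frequencies for single digits 1..10.
single = [math.log2(1 + 1 / (i * (i + 2))) for i in range(1, 11)]
# Digits in the range [11, 100].
middle = math.log2(12 * 101 / (11 * 102))
# Digits exceeding 100.
tail = math.log2(102 / 101)

total = sum(single) + middle + tail
for i, p in enumerate(single, start=1):
    print(i, round(p, 6), round(10000 * p))
print("11-100", round(middle, 6), round(10000 * middle))
print(">100", round(tail, 6), round(10000 * tail))
print("total", total)
```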
Table 1. Columns: digit $i$; frequency of occurrence of $i$ in the first 10000 digits of each of the five irrationals considered; theoretical asymptotic relative frequency. [Table body not recovered.]
It is also interesting to note that setting M10000 (ω) = max1≤κ≤10000 aκ (ω)
(cf. Subsection 3.1.3) we have
√ √
M10000 ( 3 2 − 1) = a1990 ( 3 2 − 1) = 12737,
M10000 (π − 3) = a431 (π − 3) = 20776,
M10000 (π 2 − 9) = a1234 (π 2 − 9) = 12013,
M10000 (log 2) = a9168 (log 2) = 963664,
√ √
M10000 (2 2 − 2) = a6342 (2 2 − 2) = 44122 ,
and that in all cases just considered there exist digits not exceeding 100
which do not appear, viz.
√
74, 86, 91, 96, 97, 99, and 100 for 3 2 − 1;
90, 91, and 96 for π − 3;
91 and 92 for π 2 − 9;
55, 73, 76, 96, and 97 for log 2;
√
79, 80, 81, 82, 91, 94, 97, and 99 for 2 2 − 2.
whatever the given m-digit block. Actually, the above equation holds for all
x ∈ I except for a set of Lebesgue measure zero. This can easily be seen by
applying Birkhoff’s ergodic theorem to the transformation T x = bx mod 1
of I. A number that is normal in all bases b ∈ N+ , b ≥ 2, is called normal.
However, even if there are lots of normal numbers, when we are given a
‘concrete’ number x ∈ I the existence result just mentioned does not help
to decide whether x is normal or not. Such a problem cannot be handled
by methods known today. (Will it ever be solved?) For instance, it is not
known whether π − 3, e − 2, or any irrational algebraic number is normal
or not. The first example of a normal number in base 10 was given by
Champernowne (1933). His number is
x = 0. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 · · ·
if all its asymptotic relative m-digit block frequencies exist and are equal to
those occurring in Proposition 4.1.2 for any m ∈ N+ . In other words, ω is
a normal continued fraction number if it does not belong to the exceptional
sets of λ-measure zero excluded in Proposition 4.1.2 for any m ∈ N+ . For
instance, the quadratic irrationalities are not normal since they eventually
have periodic expansions, and neither is e − 2.
A construction of the Champernowne type for a normal continued frac-
tion number was given by Adler, Keane, and Smorodinsky (1981). Their
example is as follows. Let (rn )n∈N+ be the sequence of rationals in (0,1) ob-
tained by first writing r1 = 1/2, then r2 = 1/3 and r3 = 2/3, then r4 = 1/4,
r5 = 2/4, r6 = 3/4, etc., at each stage m ∈ N+ writing all quotients with
denominator $m+1$ in increasing order. Let $r_i = [a_{i,1}, a_{i,2}, \dots, a_{i,n_i}]$ be the
continued fraction expansion of $r_i$, with $a_{i,n_i}\ne 1$, $i\in\mathbb{N}_+$. The irrational $\omega$
with continued fraction expansion
$$[a_{1,1}, a_{2,1}, a_{3,1}, a_{3,2}, a_{4,1}, a_{5,1}, a_{6,1}, a_{6,2}, a_{7,1}, a_{8,1}, a_{8,2}, a_{9,1}, a_{9,2}, a_{9,3}, \cdots],$$
which is obtained by concatenating the expansions of r1 , r2 , · · · in the given
order, is a normal continued fraction number. The first 14 digits of ω are
2, 3, 1, 2, 4, 2, 1, 3, 5, 2, 2, 1, 1, 2.
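The construction just described is straightforward to carry out by machine. The Python sketch below enumerates the fractions $k/(m+1)$, $1\le k\le m$, for $m = 1, 2, \dots$, expands each one by the Euclidean algorithm (which automatically yields a last digit different from 1) and concatenates the digits. The function names are hypothetical and not taken from Adler, Keane and Smorodinsky.

```python
from fractions import Fraction

def cf_digits_rational(r):
    """Continued fraction digits of a rational r in (0, 1) via the Euclidean algorithm."""
    p, q = r.numerator, r.denominator
    digits = []
    while p:
        a, rem = divmod(q, p)
        digits.append(a)
        q, p = p, rem
    return digits

def aks_digits(count):
    """First `count` digits obtained by concatenating the expansions of r_1, r_2, ..."""
    out = []
    m = 1
    while len(out) < count:
        for k in range(1, m + 1):  # all quotients with denominator m + 1, in increasing order
            out.extend(cf_digits_rational(Fraction(k, m + 1)))
        m += 1
    return out[:count]

print(aks_digits(14))  # → [2, 3, 1, 2, 4, 2, 1, 3, 5, 2, 2, 1, 1, 2]
```

Note that non-reduced quotients such as $2/4$ are kept in the enumeration; `Fraction` reduces them, so $r_5 = 2/4$ contributes the single digit 2.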
Another example of a different nature had been given by Postnikov
(1960).
We should emphasize that even if the empirical evidence pleads in favour
of normality for the continued fraction expansion of algebraic irrationals of
degree exceeding 2, or of π − 3, π 2 − 9 etc., the only mathematical results
proved so far are the examples of normal continued fraction numbers just
discussed.
Finally, a few words about the empirical evidence concerning Theorem
4.1.9. Von Neumann and Tuckerman (1955) computed $t_n(\sqrt[3]{2}-1)$ and
$n\log n/\log 2$ for $n = 100(100)2000$. It appears that $t_n(\sqrt[3]{2}-1)\log 2/(n\log n)$
is most of the time greater than 1 and often nearly 2. As $t_n\log 2/(n\log n)$
converges just in probability to 1 as $n\to\infty$, these deviations cannot be seen
as significant.
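for any measurable function $\bar f : I^2\to\mathbb{R}$ such that $\iint_{I^2}|\bar f|\,d\lambda^2 < \infty$. As
in Subsection 4.1.1, for suitable choices of $\bar f$, Proposition 4.0.4 will lead to
estimates of convergence rates in (4.1.6).
We now give several results which can be derived from (4.1.6).
Proposition 4.1.13 For any $B\in\mathcal{B}_I^2$ we have
$$\lim_{n\to\infty}\frac{1}{n}\sum_{k=0}^{n-1} I_B(\tau^k,\bar s_k) = \frac{1}{\log 2}\iint_B\frac{dx\,dy}{(xy+1)^2} \quad \text{a.e. in } I^2.$$
and
$$\lim_{n\to\infty}\frac{1}{n}\sum_{k=0}^{n-1} I_A(\bar s_k) = \gamma(A) \quad \text{a.e. in } I^2.$$
$$\lim_{n\to\infty}\frac{1}{n}\sum_{k=0}^{n-1}\bar\mu\left(\bar\tau^{-k}(B)\right) = \bar\gamma(B), \quad B\in\mathcal{B}_I^2. \tag{4.1.7}$$
In particular,
$$\lim_{n\to\infty}\frac{1}{n}\sum_{k=0}^{n-1}\bar\mu\left(\bar\tau^{-k}(I\times A)\right) = \lim_{n\to\infty}\frac{1}{n}\sum_{k=0}^{n-1}\bar\mu(\bar s_k\in A) = \gamma(A), \quad A\in\mathcal{B}_I. \tag{4.1.8}$$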
$$\bar g(x,y) = \frac{1}{\log 2}\,\frac{1}{(xy+1)^2}, \quad (x,y)\in I^2.$$
Now, since $\bar\tau$ is strongly mixing (see Subsections 4.0.1 and 4.0.2), the last
integral in the equations above converges to
$$\iint_{I^2} I_B\,d\bar\gamma\iint_{I^2}(\bar h/\bar g)\,d\bar\gamma = \bar\gamma(B)\bar\mu(I^2) = \bar\gamma(B)$$
as $n\to\infty$. □
Remarks. 1. Proposition 2.1.5 shows that measures $\mu\tau^{-n}$, $n\in\mathbb{N}$, can
be expressed in terms of the Perron–Frobenius operator $P_\gamma = U$ of $\tau$ with
respect to $\gamma$. A similar representation holds for the case of a measure $\bar\mu$ as
in Proposition 4.1.15. It is easy to check that we have
$$\bar\mu(\bar\tau^{-n}(B)) = \iint_B \bar P^n_{\bar\gamma}\bar f\,d\bar\gamma, \quad B\in\mathcal{B}_I^2,$$
where $\bar f = \bar h/\bar g$ and $\bar P_{\bar\gamma}$ is the Perron–Frobenius operator of $\bar\tau$ under $\bar\gamma$. See
the Remark following Proposition 2.1.1.
If the endomorphism $(\bar\tau,\bar\gamma)$ were exact, then from Proposition 4.0.2 we
might have deduced that convergence in (4.1.9) is uniform with respect to
$B\in\mathcal{B}_I^2$. Since $(\bar\tau,\bar\gamma)$ is not exact, such a conclusion cannot be reached this
way. It is an open problem whether this is really true.
2. Proposition 4.1.15 is a first step towards the solution of what can be
called Gauss' problem for the natural extension $\bar\tau$ of $\tau$. □
$$h_n(\theta,B) = \mu\left(\bar\tau^n(\,\cdot\,,\theta)\in B\right), \quad n\in\mathbb{N}_+.$$
and
$$\lim_{n\to\infty}\frac{1}{n}\sum_{k=0}^{n-1} I_{B_\varepsilon^+\setminus B_\varepsilon^-}(\tau^k,\bar s_k) = \bar\gamma(B_\varepsilon^+\setminus B_\varepsilon^-) \quad \text{a.e. in } I^2.$$
It is now easy to see that (4.1.14) and the last three equations imply the
result stated. □
Remark. Theorem 4.1.16(i) has been proved by Barbolosi and Faivre
(1995) while (ii) is implicit (or implicitly used) in many papers by Dutch
authors. See, e.g., Bosma et al. (1983) or Jager (1986). 2
Theorem 4.1.16 has a host of consequences. We state some of them.
Corollary 4.1.17 Let µ ∈ pr(BI ) such that µ ¿ λ. For any B ∈ BI2
such that λ2 (∂B) = 0 we have
for any n ∈ N. 2
Let us note that in Theorem 2.5.8 the (optimal) convergence rate in
(4.1.15) has been obtained in the case where µ = γa for the class of rectangles
B = [0, x] × [0, y], x, y ∈ I. Using this result we can prove
Proposition 4.1.18 Let $B$ be a simply connected subset of $I^2$ such that
$\partial B = \bigcup_{i=1}^{m}\ell_i$ for some $m\in\mathbb{N}_+$, where either
$$\ell_i := \{(x, f_i(x)) : a_i\le x\le b_i\}$$
by equation (1.3.7). □
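Let us define random variables $\rho_n$ and $\Theta'_n$ by
$$\rho_n(\omega) = \frac{\left|\omega - \dfrac{p_{n+1}}{q_{n+1}}\right|}{\left|\omega - \dfrac{p_n}{q_n}\right|}, \qquad \Theta'_n = q_nq_{n+1}\left|\omega - \frac{p_n}{q_n}\right|, \quad \omega\in\Omega,\ n\in\mathbb{N}.$$
It is easy to see that $\rho_n = s_{n+1}\tau^{n+1}$ and $\Theta'_n = 1/(s_{n+1}\tau^{n+1}+1)$ so that
$\Theta'_n = 1/(\rho_n+1)$, $n\in\mathbb{N}$.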
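The relation $\Theta'_n = 1/(\rho_n+1)$ is purely algebraic, so it can be checked exactly with rational arithmetic. The Python sketch below does this for a rational test point built from an arbitrary digit list; using a rational in place of an irrational $\omega$, the particular digits, and the function names are all assumptions of the example.

```python
from fractions import Fraction

def convergents(digits):
    """Pairs (p_n, q_n), n = 0..len(digits), with p_0/q_0 = 0/1 and the usual recurrences."""
    p_prev, q_prev, p, q = 1, 0, 0, 1
    out = [(p, q)]
    for a in digits:
        p_prev, p = p, a * p + p_prev
        q_prev, q = q, a * q + q_prev
        out.append((p, q))
    return out

digits = [3, 1, 4, 1, 5, 9, 2, 6]   # arbitrary digits; last digit is not 1
conv = convergents(digits)
omega = Fraction(*conv[-1])          # the test point [3, 1, 4, 1, 5, 9, 2, 6]

def rho(n):
    (p0, q0), (p1, q1) = conv[n], conv[n + 1]
    return abs(omega - Fraction(p1, q1)) / abs(omega - Fraction(p0, q0))

def theta_prime(n):
    (p0, q0), (_, q1) = conv[n], conv[n + 1]
    return q0 * q1 * abs(omega - Fraction(p0, q0))

checks = [theta_prime(n) == 1 / (rho(n) + 1) for n in range(len(digits))]
print(checks)
```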
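Corollary 4.1.21 For any $\mu \in \mathrm{pr}(\mathcal{B}_I)$ such that $\mu \ll \lambda$ we have
$$\lim_{n\to\infty}\mu(\rho_n\le t) = \frac{1}{\log 2}\left(\log(t+1) - \frac{t\log t}{t+1}\right), \quad t\in I,$$
$$\lim_{n\to\infty}\mu(\Theta'_n\le t) = \begin{cases} 0 & \text{if } 0\le t\le 1/2,\\[4pt] \dfrac{\log\left(2t^t(1-t)^{1-t}\right)}{\log 2} & \text{if } 1/2\le t\le 1.\end{cases}$$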
For µ = λ the convergence rate in the equations above is O(gn ) as n → ∞.
The proof is left to the reader. 2
For other results of the same type, which can be derived as before, we
refer the reader to Bosma et al. (1983), Jager (1986), Kraaikamp (1994).
Corollary 4.1.22 For any $t, t_1, t_2\in I$ the limits
$$\lim_{n\to\infty}\frac{1}{n}\,\mathrm{card}\{k : \Theta_k\le t,\ 0\le k\le n-1\},$$
$$\lim_{n\to\infty}\frac{1}{n}\,\mathrm{card}\{k : \Theta_k\le t_1,\ \Theta_{k+1}\le t_2,\ 0\le k\le n-1\},$$
$$\lim_{n\to\infty}\frac{1}{n}\,\mathrm{card}\{k : \rho_k\le t,\ 0\le k\le n-1\},$$
and
$$\lim_{n\to\infty}\frac{1}{n}\,\mathrm{card}\{k : \Theta'_k\le t,\ 0\le k\le n-1\},$$
all exist a.e. in $I$ and are equal to the corresponding values of the limiting
distribution functions occurring in Corollaries 4.1.19, 4.1.20, and 4.1.21,
respectively.
where $\tilde H$ has been defined in Theorem 2.2.13. Corollary 4.1.22 only covers
the case $k_n = n$, $n\in\mathbb{N}_+$.
2. In the case $k_n = n$, $n\in\mathbb{N}_+$, equation (4.1.16) has been conjectured
by H.W. Lenstra Jr. Actually, this conjecture is implicit in Doeblin (1940),
which enables us to call it after both Doeblin and Lenstra. The Doeblin–Lenstra
conjecture has been proved by Bosma et al. (1983) by using, even
if not explicitly, Theorem 4.1.16(ii) in a special case. □
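Corollary 4.1.23 The equations
$$\lim_{n\to\infty}\frac{1}{n}\sum_{k=0}^{n-1}\Theta_k = \frac{1}{4\log 2} = 0.36067\cdots$$
$$\lim_{n\to\infty}\frac{1}{n}\sum_{k=0}^{n-1}\Theta_k\Theta_{k+1} = \frac{1}{6}\left(1-\frac{1}{4\log 2}\right) = 0.10655\cdots$$
$$\lim_{n\to\infty}\frac{1}{n}\sum_{k=0}^{n-1}\rho_k = \frac{\pi^2}{12\log 2} - 1 = 0.18656\cdots$$
and
$$\lim_{n\to\infty}\frac{1}{n}\sum_{k=0}^{n-1}\Theta'_k = \frac12 + \frac{1}{4\log 2} = 0.86067\cdots$$
all hold a.e. in $I$.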
Proof. We consider just the first equation, leaving the calculation details
to the reader, as the same idea underlies the proofs in the other cases.
By Corollary 4.1.22 we have
$$\lim_{n\to\infty}\frac{1}{n}\sum_{k=0}^{n-1} I_{[0,t]}(\Theta_k) = \tilde H(t)$$
a.e. in $I$ for any $t\in I\cap\mathbb{Q}$. Hence for any fixed $\omega\in\Omega$ not belonging to the
exceptional set the distribution function
$$F_n(t) := \frac{1}{n}\sum_{k=0}^{n-1} I_{[0,t]}(\Theta_k), \quad t\in I,$$
converges weakly to $\tilde H$ as $n\to\infty$. Consequently,
$$\int_I t\,dF_n(t) = \frac{1}{n}\sum_{k=0}^{n-1}\Theta_k$$
converges to
$$\int_I t\,d\tilde H(t) = \frac{1}{4\log 2}$$
as $n\to\infty$ for any $\omega\in\Omega$ not belonging to the exceptional set, thus a.e. in $I$.
While for the last two equations the reasoning is quite similar, in the
case of the second equation we should consider two-dimensional distribution
functions, and the value of the limit equals $\iint_{I^2} t_1t_2\,dH(t_1,t_2)$. □
We turn now to limit properties of certain associated random variables.
It follows from (4.1.6) that for any measurable real-valued function $f$ on $I$
such that $\int_I |f|\,d\lambda < \infty$ we have
$$\lim_{n\to\infty}\frac{1}{n}\sum_{k=0}^{n-1} f(\bar s_k) = \int_I f\,d\gamma \quad \text{a.e. in } I^2. \tag{4.1.17}$$
From (4.1.17) we can derive a weaker result for the sequences $(s^a_n)_{n\in\mathbb{N}}$, $a\in I$.
Theorem 4.1.24 Let $f : I\to\mathbb{R}$ be continuous. Then for any $a\in I$ we
have
$$\lim_{n\to\infty}\frac{1}{n}\sum_{k=0}^{n-1} f(s^a_k) = \int_I f\,d\gamma \quad \text{a.e. in } I.$$
Proof. We have $|\bar s_k - s^a_k|\le(F_kF_{k+1})^{-1}$ for any $k\in\mathbb{N}$, $(\omega,\theta)\in\Omega\times I$,
$a\in I$. The result then follows from (4.1.17) and the uniform continuity of
$f$ on $I$. □
Remarks. 1. The above result also follows from a theorem of Breiman
(1960) on account of the Markov property of the sequences $(s^a_n)_{n\in\mathbb{N}}$, $a\in I$.
2. The corresponding result for $y^a_n = 1/s^a_n$, $n\in\mathbb{N}_+$, $a\in I$, can be easily
stated. In this form it can be found in Elton (1987) and Grigorescu and
Popescu (1989). □
Corollary 4.1.25 For any $m\in\mathbb{N}_+$ and $a\in I$ we have
$$\lim_{n\to\infty}\frac{1}{n}\sum_{k=0}^{n-1}(s^a_k)^m = \frac{1}{\log 2}\sum_{i\in\mathbb{N}_+}\frac{(-1)^{i-1}}{m+i} \quad \text{a.e. in } I.$$
\[ \lim_{n\to\infty} \frac{1}{n} \log(s_1^a s_2^a \cdots s_n^a) = -\frac{\pi^2}{12 \log 2} \quad \text{a.e. in } \Omega. \]
More precisely, whatever ε > 0, for any a ∈ I we have
\[ \frac{1}{n} \log(s_1^a s_2^a \cdots s_n^a) = -\frac{\pi^2}{12 \log 2} + o\left( n^{-1/2} \log^{(3+\varepsilon)/2} n \right) \quad \text{a.e. in } \Omega \]
as n → ∞, where the constant implied in o depends on both ε and the current point ω ∈ Ω.
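In particular, for a = 0 the above equations amount to
\[ \lim_{n\to\infty} \sqrt[n]{q_n} = e^{\pi^2/(12 \log 2)} \quad \text{a.e. in } \Omega \tag{4.1.19} \]
and
\[ \sqrt[n]{q_n} = e^{\pi^2/(12 \log 2)} + o\left( n^{-1/2} \log^{(3+\varepsilon)/2} n \right) \quad \text{a.e. in } \Omega \tag{4.1.20} \]
as n → ∞, respectively.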
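The limit in (4.1.19) can be illustrated numerically. The sketch below is only an experiment under stated assumptions: a pseudo-random rational with a 4000-bit denominator stands in for a "typical" point of Ω, and only its first 800 RCF digits are used; the seed, the bit length and the function names are arbitrary illustrative choices.

```python
import math
import random
from fractions import Fraction

# Limit value in (4.1.19); numerically about 3.2758.
LEVY = math.exp(math.pi ** 2 / (12 * math.log(2)))

def rcf_digits(x, n):
    """First n RCF digits a_1, ..., a_n of a rational x in (0, 1)
    (fewer if the expansion terminates earlier)."""
    digits = []
    while x != 0 and len(digits) < n:
        inv = 1 / x
        a = inv.numerator // inv.denominator   # integer part of 1/x
        digits.append(a)
        x = inv - a                            # Gauss map tau(x)
    return digits

def denominators(digits):
    """q_1, ..., q_n from q_k = a_k q_{k-1} + q_{k-2}, q_0 = 1, q_{-1} = 0."""
    q_prev, q = 0, 1
    out = []
    for a in digits:
        q_prev, q = q, a * q + q_prev
        out.append(q)
    return out

rng = random.Random(2024)
bits = 4000
x = Fraction(rng.getrandbits(bits) | 1, 1 << bits)
qs = denominators(rcf_digits(x, 800))
estimate = math.exp(math.log(qs[-1]) / len(qs))   # n-th root of q_n
print(len(qs), estimate, LEVY)
```

Since the convergence in (4.1.20) is slow, the estimate should only be expected to land in the rough vicinity of the limit for n of this size.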
Proof. By the mean value theorem we have
\[ \left| \frac{\log x - \log y}{x - y} \right| \le \frac{1}{\min(x, y)} \]
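\[ \lim_{n\to\infty} \frac{1}{n} \log \lambda(I(a_1, \cdots, a_n)) = -\frac{\pi^2}{6 \log 2} \quad \text{a.e. in } \Omega \]
and, for any ε > 0,
\[ \frac{1}{n} \log \lambda(I(a_1, \cdots, a_n)) = -\frac{\pi^2}{6 \log 2} + o\left( n^{-1/2} \log^{(3+\varepsilon)/2} n \right) \quad \text{a.e. in } \Omega \]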
as n → ∞, where the constant implied in o depends on both ε and ω ∈ Ω.
Proof. By (1.2.2) and (1.2.5) we have
4.2.1 Preliminaries
The following rather old and well known remark is fundamental. For a ∈ Z, b ∈ N+ and x ∈ [0, 1) we have
\[ a + \cfrac{1}{1 + \cfrac{1}{b + x}} = a + 1 + \cfrac{-1}{b + 1 + x}. \]
This operation is called a singularization. We have singularized the digit 1
in
[ · ; · · · , a, 1, b, · · · ]
The effect of a singularization is that a new and shorter continued frac-
tion expansion is obtained. Moreover, we will see that the sequence of
convergents associated with the ‘new’ continued fraction expansion is a sub-
sequence of the sequence of convergents of the ‘old’ one. For example, given
n ∈ N+ , if we singularize the digit an+1 (ω) = 1 in the RCF expansion of
some ω ∈ Ω, then the sequence of convergents of the ‘new’ continued frac-
tion expansion is obtained by deleting the nth term from the sequence of
RCF convergents of ω. Obviously, the ‘new’ continued fraction expansion is
no longer an RCF expansion!
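The effect of a singularization on a finite expansion can be checked with exact rational arithmetic. The following is a minimal sketch, not the book's formalism: a finite continued fraction [a0; e1/a1, ..., en/an] is stored as an integer a0 plus a list of (e, a) pairs, and the helper names are illustrative.

```python
from fractions import Fraction

def evaluate(a0, pairs):
    """Value of [a0; e1/a1, ..., en/an], computed from the tail."""
    value = Fraction(0)
    for e, a in reversed(pairs):
        value = Fraction(e) / (a + value)
    return a0 + value

def convergents(a0, pairs):
    """All convergents, obtained by truncating after k = 0, 1, ..., n pairs."""
    return [evaluate(a0, pairs[:k]) for k in range(len(pairs) + 1)]

def singularize(a0, pairs, i):
    """Singularize the digit pairs[i] = (e, 1), whose successor has sign +1.

    The digit 1 is removed, e is added to the preceding digit, and the
    following pair (1, b) becomes (-e, b + 1)."""
    e, a = pairs[i]
    e_next, a_next = pairs[i + 1]
    assert a == 1 and e_next == 1
    new = list(pairs)
    new[i + 1] = (-e, a_next + 1)
    del new[i]
    if i == 0:
        a0 += e
    else:
        e_prev, a_prev = new[i - 1]
        new[i - 1] = (e_prev, a_prev + e)
    return a0, new

# RCF expansion [0; 2, 1, 3, 4] = 17/47; singularize the digit a_2 = 1.
old = (0, [(1, 2), (1, 1), (1, 3), (1, 4)])
new = singularize(*old, 1)
print(new)                   # (0, [(1, 3), (-1, 4), (1, 4)])
print(convergents(*old))     # 0, 1/2, 1/3, 4/11, 17/47
print(convergents(*new))     # 0, 1/3, 4/11, 17/47
```

In this example the value 17/47 is unchanged and exactly one convergent, 1/2, disappears from the list, in line with the remark above.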
Starting from the RCF expansion of a given x ∈ [0, 1) it is not possible
(i) to singularize two consecutive digits equal to 1, and (ii) to singularize
digits other than 1.
It is also important to note that once we have singled out digits equal to
1 to be singularized, the order in which they are singularized has no impact
on the final result. Of course, just one singularization does not make the
new expansion ‘really faster’ than the old one. However, many algorithms
can be devised such that for almost all x ∈ [0, 1) infinitely many convergents
are skipped. Before considering such algorithms, let us fix notation.
Let x ∈ [0, 1) with RCF expansion
x = [a1 , a2 , · · · ] .
is called a 1-block if either k = 1 and $a_{k+n}(x) \ne 1$ (if n is finite) or k > 1 and $a_{k-1}(x) \ne 1$, $a_{k+n}(x) \ne 1$ (if n is finite). The first algorithm we consider is:
A For any x ∈ [0, 1) singularize the first, third, fifth, etc., components in
any 1-block.
B For any x ∈ [0, 1) singularize the last, third from last, fifth from last,
etc., components in any 1-block.
C For any x ∈ [0, 1) singularize all digits an+1 (x) = 1 for which Θn (x) ≥
1/2 (see Subsection 1.3.2) whatever n ∈ N.
In Subsection 4.3.2 it is shown that algorithm C is well defined, that is, not
in conflict with the requirements of the singularization procedure.
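Algorithms A and B only depend on the positions of the 1-blocks, so they are easy to sketch on a finite list of RCF digits. The code below is an illustration under simplifying assumptions: the digit list is treated as a complete finite expansion (a 1-block cut off by truncation is handled as if it ended there), positions are 0-based, and the function names are made up for the example. Algorithm C is not covered, since it needs the approximation coefficients.

```python
def one_blocks(digits):
    """(start, length) of every maximal run of digits equal to 1."""
    blocks, i = [], 0
    while i < len(digits):
        if digits[i] == 1:
            j = i
            while j < len(digits) and digits[j] == 1:
                j += 1
            blocks.append((i, j - i))
            i = j
        else:
            i += 1
    return blocks

def algorithm_a(digits):
    """Positions singularized by A: first, third, fifth, ... in each 1-block."""
    return [s + k for s, m in one_blocks(digits) for k in range(0, m, 2)]

def algorithm_b(digits):
    """Positions singularized by B: last, third from last, ... in each 1-block."""
    return sorted(s + m - 1 - k for s, m in one_blocks(digits) for k in range(0, m, 2))

digits = [3, 1, 1, 1, 1, 2, 1, 5]
print(one_blocks(digits))    # [(1, 4), (6, 1)]
print(algorithm_a(digits))   # [1, 3, 6]
print(algorithm_b(digits))   # [2, 4, 6]
```

Neither rule ever selects two adjacent 1's, which is the constraint (i) stated above, and the two rules differ only on 1-blocks of even length.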
Example 4.2.3 Let x be as in Example 4.2.1. A simple calculation
shows that the first four digits equal to 1 in the RCF expansion of x should
not be singularized if we apply algorithm C to it. 2
From this example it is clear that, in general, algorithm C does not
yield expansions which are fastest. In Subsection 4.3.3 we will discuss an
algorithm which yields both fastest and closest expansions. This algorithm
was introduced by Selenius (1960) and—independently—by Bosma (1987),
and is called the optimal continued fraction (OCF) expansion. Finally, in
Subsection 4.2.5 we will answer question (iii) above.
\[ = a_0 + \cfrac{e_1}{a_1 + \cfrac{e_2}{a_2 + \ddots + \cfrac{e_n}{a_n}}} \in R \cup \{-\infty, \infty\}. \]
If M = N+ then we say that the CF considered is infinite and look at it as
the sequence
((ek )1≤k≤n , (ak )0≤k≤n )n∈N+
of all finite CF’s which are obtained by finite truncation. In both cases we
can associate with a CF its convergents
\[ \frac{p_0^e}{q_0^e} := a_0, \qquad \frac{p_k^e}{q_k^e} := [a_0; e_1/a_1, \cdots, e_k/a_k], \quad 1 \le k \le n, \]
and
\[ M_n^e := A_0^e \cdots A_n^e, \quad n \in N. \]
Clearly,
with $p_{-1}^e = 1$, $q_{-1}^e = 0$, which implies that the sequences $(p_n^e)_{n \in N}$ and $(q_n^e)_{n \in N}$ satisfy the recurrence relations
and clearly $s_1^e := q_0^e/q_1^e = 1/a_1$. It follows from (4.2.2) and (4.2.3) that
More generally,
\[ M_n^e(z) = \frac{p_n^e + z p_{n-1}^e}{q_n^e + z q_{n-1}^e} = [a_0; e_1/a_1, \cdots, e_{n-1}/a_{n-1}, e_n/(a_n + z)], \quad n \ge 2, \]
we also have
\[ \Theta_n^e(a_0 + t_0^e) = \frac{s_{n+1}^e}{s_{n+1}^e t_{n+1}^e + 1}, \quad n \in N. \tag{4.2.6} \]
The numbers $\Theta_n^e$, n ∈ N, associated with a (finite or infinite) SRCF expansion are called its approximation coefficients. Compare with the RCF expansion case in Subsection 1.3.2.
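For a finite SRCF expansion the convergents and approximation coefficients can be computed directly. The sketch below assumes the usual recurrences $p_k^e = a_k p_{k-1}^e + e_k p_{k-2}^e$ and $q_k^e = a_k q_{k-1}^e + e_k q_{k-2}^e$ (their display is not reproduced in this part of the text, so this form is an assumption here) and takes the approximation coefficient to be $(q_k^e)^2 |x - p_k^e/q_k^e|$. The function names are illustrative.

```python
from fractions import Fraction

def srcf_convergents(a0, pairs):
    """Pairs (p_k, q_k) for [a0; e1/a1, ..., en/an], using
    p_k = a_k p_{k-1} + e_k p_{k-2} and q_k = a_k q_{k-1} + e_k q_{k-2},
    started from p_{-1} = 1, q_{-1} = 0, p_0 = a0, q_0 = 1."""
    p_prev, q_prev, p, q = 1, 0, a0, 1
    out = [(p, q)]
    for e, a in pairs:
        p_prev, p = p, a * p + e * p_prev
        q_prev, q = q, a * q + e * q_prev
        out.append((p, q))
    return out

def approximation_coefficients(x, convergents):
    """Theta_k = q_k^2 |x - p_k/q_k| for each convergent p_k/q_k."""
    return [q * q * abs(x - Fraction(p, q)) for p, q in convergents]

# The expansion [0; 1/3, -1/4, 1/4] has value 17/47.
conv = srcf_convergents(0, [(1, 3), (-1, 4), (1, 4)])
print(conv)                                              # [(0, 1), (1, 3), (4, 11), (17, 47)]
print(approximation_coefficients(Fraction(17, 47), conv))
```

For x = 17/47 the coefficients come out as 17/47, 12/47, 11/47 and 0; the last one vanishes because the expansion is finite.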
We conclude this subsection with a few examples of well known SRCF
expansions.
1. The RCF expansion: this is the SRCF expansion for which en = 1 for
any n ∈ N+ .
6. The continued fraction with odd incomplete quotients (Odd CF) ex-
pansion: this is the SRCF expansion for which e1 = 1, an ≡ 1 mod 2,
en+1 + an ≥ 2, n ∈ N+ . It was introduced by Rieger (1981a) [see
also Barbolosi (1990), Hartono and Kraaikamp (2002), and Schweiger
(1995, Ch. 3)].
7. The continued fraction with even incomplete quotients (Even CF) ex-
pansion: this is the SRCF expansion for which e1 = 1, an ≡ 0 mod 2,
en+1 + an ≥ 2, n ∈ N+ . See also Kraaikamp and Lopes (1996) and
Schweiger (1995, Ch. 3).
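\[ \begin{pmatrix} 0 & c \\ 1 & a \end{pmatrix} \begin{pmatrix} 0 & d \\ 1 & 1 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & b \end{pmatrix} = \begin{pmatrix} 0 & c \\ 1 & a + d \end{pmatrix} \begin{pmatrix} 0 & -d \\ 1 & b + 1 \end{pmatrix}, \tag{4.2.8} \]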
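\[ (\tilde e_k)_{k \in M \setminus \{\ell+1\}}, \quad (\tilde a_k)_{k \in \{0\} \cup (M \setminus \{\ell+1\})} \tag{4.2.10} \]
Therefore
\[ \begin{pmatrix} \tilde p_\ell^e \\ \tilde q_\ell^e \end{pmatrix} = M_{\ell+1}^e \begin{pmatrix} -e_{\ell+1} & 0 \\ e_{\ell+1} & 1 \end{pmatrix} \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} p_{\ell+1}^e \\ q_{\ell+1}^e \end{pmatrix}, \]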
4.2.4 S-expansions
From now on we will concentrate on one special singularization process. The set S of continued fraction expansions to be singularized is the set of all (finite or infinite) RCF expansions. Since in this case all the e's are +1, we will speak of singularizing $a_{\ell+1} = 1$ instead of singularizing the pair $a_{\ell+1} = 1$, $e_{\ell+2} = 1$.
Before describing the general rule (as we should according to the defini-
tion just given) remark that Example 4.2.1 actually describes a singulariza-
tion process: S plus algorithm A yield the NICF expansion! Now, notice
that algorithm A is equivalent to
where
SB = [g, 1) × I ⊂ I 2 .
Finally, using properties of the approximation coefficients Θn , n ∈ N, de-
fined in Subsection 1.3.2, we can also show that algorithm C—leading to
Minkowski’s DCF expansion—is equivalent to
where
\[ S_C = \left\{ (x, y) \in I^2 ; \ \frac{x}{xy + 1} \ge \frac{1}{2} \right\}. \]
S ⊂ [1/2, 1) × I,
S ∩ τ̄ (S) ⊂ {(g,g)},
(ii) S ⊂ [1/2, 1) × I;
\[ \lim_{n\to\infty} \frac{A(S, n)}{n} = \bar\gamma(S) \quad \text{a.e.} \]
\[ \bar\gamma(M_1) = \bar\gamma(M_2) = 1 - \frac{\log G}{\log 2}. \]
Next, put S1 = S ∩ M1 and S2 = S ∩ M2 . Clearly,
τ̄ (S1 ) ∪ S2 ⊂ M2
\[ \le \bar\gamma(M_2) = 1 - \frac{\log G}{\log 2}. \]
[Figure: the regions $S_1$, $S_2$ and $\bar\tau(S_1)$ in the unit square; horizontal axis marked 0, 1/2, g, 1; vertical axis marked g, 1.]
That a singularization area actually can have γ̄-measure $1 - (\log 2)^{-1} \log G$ is shown by the cases of $S_A^*$ and $S_B^*$. □
On account of Proposition 4.2.6, a singularization area S will be called maximal if
\[ \bar\gamma(S) = 1 - \frac{\log G}{\log 2}. \]
Given a singularization area S, let $B_S$ be a subset of $I^2$ such that whatever ω = [a_1, a_2, · · · ] ∈ Ω any digit $1 = a_{\ell+1} = a_{\ell+1}(\omega)$ is unchanged by S-singularization if and only if $(\tau^\ell, s_\ell) \in B_S$, ℓ ∈ N. Clearly, such a set, which determines the occurrence of digits equal to 1 in the S-expansion, should have the following properties:
\[ = \lim_{k\to\infty} \frac{k}{n_S(k)} + \bar\gamma(S) \quad \text{a.e.,} \]
The minimum of the limit above is attained if and only if S is maximal, and is equal to
\[ \frac{1}{\log G} \log \frac{G^3}{4} = 0.11915 \cdots . \]
and define
\[ c_S = \sup \, (t \in (0, 1] : A(t) \cap S = \emptyset) . \]
Put
\[ L_S = \min(c_S, 1/2) . \]
Let ω ∈ Ω and p, q ∈ N+ with g.c.d.(p, q) = 1, p < q. If
\[ \tilde\Theta = \tilde\Theta(\omega, p/q) = q^2 \left| \omega - \frac{p}{q} \right| < L_S , \]
\[ L_S = g^2 = 0.38196 \cdots . \]
This value was also found by Ito (1987) and by Jager and Kraaikamp (1989).
Their methods are different. Ito (op. cit.) developed a theory for determin-
ing the Legendre constants for a class of continued fractions, larger than the
class of S-expansions. Unfortunately, his method is rather complicated.
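\[ [\tilde a_0; \tilde e_1/\tilde a_1, \tilde e_2/\tilde a_2, \cdots ] \]
\[ \tau^n = [a_{n+1}, a_{n+2}, \cdots ], \quad n \in N, \]
\[ s_n = [a_n, \cdots, a_1], \quad n \in N_+, \quad s_0 = 0, \]
and put
\[ \tilde t_n^e = [\tilde e_{n+1}/\tilde a_{n+1}, \cdots ], \quad n \in N, \]
\[ \tilde s_n^e = \begin{cases} 0 & \text{if } n = 0, \\ 1/\tilde a_1 & \text{if } n = 1, \\ [1/\tilde a_n, \tilde e_n/\tilde a_{n-1}, \cdots, \tilde e_2/\tilde a_1] & \text{if } n > 1. \end{cases} \]
where $(p_n/q_n)_{n \in N}$ and $(\tilde p_n^e/\tilde q_n^e)_{n \in N}$ are the sequences of RCF convergents and S-convergents of x, respectively. Also,
\[ x = \frac{p_n + \tau^n p_{n-1}}{q_n + \tau^n q_{n-1}} = \frac{\tilde p_k^e + \tilde t_k^e \tilde p_{k-1}^e}{\tilde q_k^e + \tilde t_k^e \tilde q_{k-1}^e} \tag{4.2.11} \]
\[ \Delta := I^2 \setminus S, \quad \Delta^- = \bar\tau(S), \quad \Delta^+ = \Delta \setminus \Delta^- . \]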
(ii) if pn /qn is not an S-convergent, then both pn−1 /qn−1 and pn+1 /qn+1
are S-convergents;
Proof. (i) This follows directly from Definition 4.2.5 and Proposition
4.2.4.
(ii) This follows from the fact that in the sequence of RCF convergents we cannot remove two or more consecutive convergents and still have a sequence of convergents of some SRCF expansion.
(iii) If (τ n , sn ) ∈ ∆+ then the very definition of ∆+ implies that
Hence neither an nor an+1 is singularized and therefore both pn−1 /qn−1 and
pn /qn are S-convergents. But then there exists k ∈ N+ such that
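Since all the fractions are in their lowest terms and their denominators are positive we should have
\[ \tilde q_{k-1}^e = q_{n-1}, \quad \tilde q_k^e = q_n . \]
\[ \frac{p_n + \tau^n p_{n-1}}{q_n + \tau^n q_{n-1}} = \frac{p_n + \tilde t_k^e p_{n-1}}{q_n + \tilde t_k^e q_{n-1}} , \]
hence $\tilde t_k^e = \tau^n$. Finally, we have
\[ \tilde s_k^e = \frac{\tilde q_{k-1}^e}{\tilde q_k^e} = \frac{q_{n-1}}{q_n} = s_n . \]
\[ \tilde q_{k-1}^e = q_{n-2}, \quad \tilde q_k^e = q_n . \]
Since
\[ p_n = a_n p_{n-1} + p_{n-2} = p_{n-1} + p_{n-2}, \qquad q_n = a_n q_{n-1} + q_{n-2} = q_{n-1} + q_{n-2} \tag{4.2.12} \]
we have
\[ \tilde s_k^e = \frac{q_{n-2}}{q_n} = \frac{q_n - q_{n-1}}{q_n} = 1 - s_n . \]
\[ \frac{p_n + \tau^n p_{n-1}}{q_n + \tau^n q_{n-1}} = \frac{p_n + \tilde t_k^e p_{n-2}}{q_n + \tilde t_k^e q_{n-2}} , \]
\[ \tilde t_k^e + \tilde t_k^e \tau^n + \tau^n = 0 , \]
whence
\[ \tilde t_k^e = -\frac{\tau^n}{\tau^n + 1} . \]
The converse is obvious. □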
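Now, define the transformation $\bar\tau_\Delta : \Delta \to \Delta$ as
\[ \bar\tau_\Delta(x, y) = \begin{cases} \bar\tau(x, y) & \text{if } \bar\tau(x, y) \notin S, \\ \bar\tau^2(x, y) & \text{if } \bar\tau(x, y) \in S \end{cases} \]
\[ \frac{1}{\bar\gamma(\Delta) \log 2} \, \frac{1}{(xy + 1)^2}, \quad (x, y) \in \Delta . \]
\[ \frac{1}{(1 - \bar\gamma(S)) \log 2} \, \frac{1}{(xy + 1)^2}, \quad (x, y) \in A_S . \]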
Then (AS , BAS , τ̄S , ρ) is an ergodic dynamical system which underlies the
corresponding S-expansion.
Proof. The conclusion follows on account of equations (4.2.13) through (4.2.15) noting that the dynamical systems $(\Delta, B_\Delta, \bar\tau_\Delta, \bar\gamma_\Delta)$ and $(A_S, B_{A_S}, \bar\tau_S, \rho)$ are isomorphic by the very definition of the latter. See Remark 1 following Proposition 4.0.5 and Petersen (1983, Sections 1.3 and 2.3). □
Remark. The entropy of the maps $\bar\tau_\Delta$ and $\bar\tau_S$ can be easily obtained using Abramov's formula [see e.g. Petersen (1983, p. 257)]. Since $H(\tau) = \pi^2/(6 \log 2)$ (see Remark following Corollary 4.1.28), we have
\[ H(\bar\tau_\Delta) = \frac{H(\tau)}{\bar\gamma(\Delta)} = \frac{1}{1 - \bar\gamma(S)} \, \frac{\pi^2}{6 \log 2} = H(\bar\tau_S), \]
which shows that entropy is maximal, namely $\pi^2/(6 \log G)$, for maximal singularization areas. □
At first sight the dynamical system (AS , BAS , τ̄S , ρ) looks very intricate.
However, it is quite helpful. We have the following result.
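where $\bar\tau_S^{(1)}(x, y)$ is the first coordinate of $\bar\tau_S(x, y)$. Let a : [0, 1) → N+ ∪ {∞} be defined as in Chapter 1, that is,
\[ a(t) = \begin{cases} \lfloor t^{-1} \rfloor & \text{if } t \in (0, 1), \\ \infty & \text{if } t = 0. \end{cases} \]
We have
\[ f(x, y) = \begin{cases} a(x) & \text{if } \operatorname{sgn} x = 1 \text{ and } \bar\tau(x, y) \notin S, \\ a(x) + 1 & \text{if } \operatorname{sgn} x = 1 \text{ and } \bar\tau(x, y) \in S, \\ a(-x/(x + 1)) + 1 & \text{if } \operatorname{sgn} x = -1 \text{ and } \bar\tau(M^{-1}(x, y)) \notin S, \\ a(-x/(x + 1)) + 2 & \text{if } \operatorname{sgn} x = -1 \text{ and } \bar\tau(M^{-1}(x, y)) \in S \end{cases} \]
and
\[ \bar\tau_S(x, y) = \left( |x^{-1}| - f(x, y), \ (f(x, y) + y \operatorname{sgn} x)^{-1} \right), \quad (x, y) \in A_S . \]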
Proof. We should distinguish four cases, of which only two will be con-
sidered here. The other two cases can be treated similarly. Cf. Kraaikamp
(1991, p. 26).
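Therefore
\[ \bar\tau_S(x, y) = M(\bar\tau_\Delta(M^{-1}(x, y))) = \left( \frac{ -\dfrac{x - 1 + x a(x)}{1 - x a(x)} }{ 1 + \dfrac{x - 1 + x a(x)}{1 - x a(x)} }, \ 1 - \frac{a(x) + y}{a(x) + y + 1} \right) = \left( \frac{1}{x} - (a(x) + 1), \ \frac{1}{a(x) + y + 1} \right), \]
where
\[ f(x, y) = a(x) + 1. \]
2. Let $(x, y) \in M(\Delta^-)$ and $\bar\tau(M^{-1}(x, y)) \notin S$. Then sgn x = −1 and we have
\[ \bar\tau_S(x, y) = M \bar\tau M^{-1}(x, y) = \bar\tau M^{-1}(x, y) = \bar\tau\left( -\frac{x}{x + 1}, \ 1 - y \right) \]
\[ = \left( -\frac{1}{x/(x + 1)} - a\left( -\frac{x}{x + 1} \right), \ \frac{1}{a(-x/(x + 1)) + 1 - y} \right) \]
\[ = \left( -\frac{1}{x} - 1 - a\left( -\frac{x}{x + 1} \right), \ \frac{1}{a(-x/(x + 1)) + 1 + y \operatorname{sgn} x} \right), \]
where
\[ f(x, y) = a(-x/(x + 1)) + 1. \]
□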
Corollary 4.2.15 We have
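Let $A_S^i$, i = 1, 2, be the projections of $A_S$ onto the two axes and let $\lambda_{A_S^i}$ denote the probability measure defined by
\[ \lambda_{A_S^i}(A) = \frac{\lambda(A \cap A_S^i)}{\lambda(A_S^i)}, \quad A \in B_{A_S^i}, \quad i = 1, 2. \]
Proposition 4.2.16 Let $\mu \in \mathrm{pr}(B_{A_S^1})$ such that $\mu \ll \lambda_{A_S^1}$. For any $B \in B_{A_S}$ such that $\lambda_{A_S^1} \otimes \lambda_{A_S^2}(\partial B) = 0$ we have
\[ \lim_{n\to\infty} \mu\left( (\tilde t_n^e, \tilde s_n^e) \in B \right) = \rho_S(B), \]
\[ \lim_{n\to\infty} \frac{1}{n} \sum_{k=0}^{n-1} I_B(\tilde t_k^e, \tilde s_k^e) = \rho_S(B) \quad \text{a.e. in } A_S^1 . \]
For any $\mu \in \mathrm{pr}(B_{A_S^1})$ such that $\mu \ll \lambda_{A_S^1}$ and any $(t_1, t_2) \in I^2$ we have
\[ \lim_{n\to\infty} \mu\left( \tilde\Theta_{n-1}^e \le t_1, \ \tilde\Theta_n^e \le t_2 \right) = \rho(B), \]
\[ \lim_{n\to\infty} \frac{1}{n} \, \mathrm{card}\{ k : \tilde\Theta_k^e \le t_1, \ \tilde\Theta_{k+1}^e \le t_2, \ 0 \le k \le n - 1 \} = \rho(B) \quad \text{a.e. in } A_S^1 , \]
where
\[ B = \left\{ (x, y) \in A_S ; \ \frac{y}{xy + 1} \le t_1, \ \frac{|x|}{xy + 1} \le t_2 \right\} . \]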
with
\[ (e_1(x), b_1(x)) = \left( \operatorname{sgn} x, \ \lfloor |x^{-1}| + 1 - \alpha \rfloor \right), \quad x \in I_\alpha . \]
Here $N_\alpha^n$ denotes the composition of $N_\alpha$ with itself n times while $N_\alpha^0$ is the identity map.
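The digits of an α-expansion are easy to generate for a rational starting point. The sketch below assumes that Nakada's map has the form $N_\alpha(x) = |x^{-1}| - \lfloor |x^{-1}| + 1 - \alpha \rfloor$ for x ≠ 0, in agreement with the digit rule just stated, and that $I_\alpha = [\alpha - 1, \alpha]$; the function name and the digit cap are illustrative.

```python
import math
from fractions import Fraction

def alpha_digits(x, alpha, max_digits=50):
    """Pairs (e_n, b_n) of the alpha-expansion of a rational x in I_alpha.

    Each step records e = sgn x and b = floor(|1/x| + 1 - alpha), then
    replaces x by |1/x| - b (the assumed form of N_alpha)."""
    digits = []
    while x != 0 and len(digits) < max_digits:
        e = 1 if x > 0 else -1
        inv = abs(1 / x)
        b = math.floor(inv + 1 - alpha)
        digits.append((e, b))
        x = inv - b
    return digits

x = Fraction(17, 47)
print(alpha_digits(x, Fraction(1)))      # alpha = 1: the RCF digits
print(alpha_digits(x, Fraction(1, 2)))   # alpha = 1/2: nearest integer digits
```

For 17/47 the case α = 1 returns the RCF digits 2, 1, 3, 4 (all signs +1), while α = 1/2 returns (1, 3), (−1, 4), (1, 4), which is what singularizing the digit 1 in [0; 2, 1, 3, 4] produces.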
The theory of α-expansions can be developed by parallelling that of the
RCF expansion. This has been done by Nakada (1981), Nakada et al. (1977),
Bosma et al. (1983), and Popescu (2000). Originally, these expansions were
defined by McKinney (1907).
Our approach here consists in putting any α-expansion in the framework
of the S-expansion theory by giving a suitable singularization area Sα , α ∈
[1/2, 1]. This will allow us to retrieve results derived by Nakada and co-
workers (op. cit.) using different methods.
We should distinguish two cases: (i) α ∈ [1/2, g] and (ii) α ∈ (g, 1].
Case (i). Before giving the singularization areas Sα , α ∈ [1/2, g], we
first return to the special case α = 1/2 which yields the NICF expansion.
Recall that the NICF expansion of an irrational number can be obtained
from its RCF expansion by applying algorithm A from Subsection 4.2.1 to
the latter. We noticed in Subsection 4.2.4 that this is equivalent to
where
SA = [1/2, 1) × [0, g] .
For α ∈ (1/2, g], notice that
Hence
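Writing $A_\alpha$ for $A_{S_\alpha}$ (see again the general case in Subsection 4.2.5) we take
\[ \begin{aligned} A_\alpha &= \left( I^2 \setminus (S_\alpha \cup \bar\tau(S_\alpha)) \right) \cup \left( M(\bar\tau(S_\alpha)) \setminus (\{0\} \times [0, 1/2]) \right) \\ &= ([\alpha - 1, g - 1) \times [0, 1 - g)) \\ &\quad \cup ([g - 1, (1 - 2\alpha)/\alpha] \times [0, 1 - g]) \\ &\quad \cup (((1 - 2\alpha)/\alpha, 0) \times [0, 1/2]) \cup ([0, (2\alpha - 1)/(1 - \alpha)] \times [0, 1/2)) \\ &\quad \cup (((2\alpha - 1)/(1 - \alpha), \alpha) \times [0, g)) . \end{aligned} \]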
we deduce that $f_\alpha(x, y)$ does not depend on y and that we should have
\[ f_\alpha(x, y) = \lfloor |x^{-1}| + 1 - \alpha \rfloor, \quad (x, y) \in A_\alpha, \ x \ne 0. \]
Hence $x \mapsto |x^{-1}| - f_\alpha(x, y)$ is Nakada's transformation $N_\alpha$. On account of Theorem 4.2.14 we can therefore state the main result for the case α ∈ [1/2, g].
Theorem 4.3.1 [Nakada (1981)] Let 1/2 ≤ α ≤ g. Consider the probability measure $\bar\gamma_\alpha$ on $B_{A_\alpha}$ with density
\[ \frac{1}{\log G} \, \frac{1}{(xy + 1)^2}, \quad (x, y) \in A_\alpha , \]
where (x, y) ∈ Aα . Then (Aα , BAα , N̄α , γ̄α ) is an ergodic dynamical system
underlying the corresponding α-expansion.
Taking projection onto the first axis, we deduce from Theorem 4.3.1 the
following result.
[Figure 4.2: $S_\alpha$ for 1/2 ≤ α ≤ g]
From Figure 4.2 it is clear that the vertices (α, g) and ((1 − α)/α, 1) of
Sα determine the value of the Legendre constant Lα := LSα . See Theorem
4.2.11. More precisely, we have the following result.
Theorem 4.3.3 Let 1/2 ≤ α ≤ g. Then
.
It follows at once from this and (4.3.1) that $B_{S_\alpha} = \emptyset$, which is consistent with Proposition 4.2.7. □
Case (ii). Let α ∈ (g, 1]. Put
Sα = [α, 1] × I . (4.3.2)
Hence τ̄ (Sα ) = [0, (1−α)/α]×[1/2, 1], and Sα ∩ τ̄ (Sα ) = ∅ since for α ∈ (g, 1]
we have
(1 − α)/α < α .
It is then easy to check that $S_\alpha$ is indeed a singularization area. However, a simple calculation shows that
\[ \bar\gamma(S_\alpha) = 1 - \frac{\log(1 + \alpha)}{\log 2} , \]
thus for no value of α under consideration here the singularization area $S_\alpha$ is maximal.
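The "simple calculation" can be reproduced with a closed form for the extended Gauss measure of a rectangle: integrating the density $1/((xy + 1)^2 \log 2)$ over $[a, b] \times [c, d]$ gives $\log\frac{(bd+1)(ac+1)}{(ad+1)(bc+1)} / \log 2$. The sketch below uses this formula; it assumes the values $G = (\sqrt{5} + 1)/2$ and $g = G - 1$ for the golden-ratio constants, and the function names are illustrative.

```python
import math

G = (math.sqrt(5) + 1) / 2   # assumed value of G (golden ratio)
g = G - 1                    # assumed value of g

def gauss2_rectangle(a, b, c, d):
    """Extended Gauss measure of the rectangle [a, b] x [c, d] in I^2."""
    num = (b * d + 1) * (a * c + 1)
    den = (a * d + 1) * (b * c + 1)
    return math.log(num / den) / math.log(2)

def measure_S_alpha(alpha):
    """Measure of S_alpha = [alpha, 1] x I for g < alpha <= 1."""
    return gauss2_rectangle(alpha, 1.0, 0.0, 1.0)

maximal = 1 - math.log(G) / math.log(2)   # measure of a maximal area
for alpha in (0.65, 0.8, 1.0):
    print(alpha, measure_S_alpha(alpha), 1 - math.log(1 + alpha) / math.log(2))
print(maximal, gauss2_rectangle(0.5, 1.0, 0.0, g))   # S_A = [1/2, 1) x [0, g]
```

The printed values for $S_\alpha$ stay strictly below the maximal value and approach it only as α decreases to g, whereas the rectangle $[1/2, 1) \times [0, g]$ attains it.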
Next, with M defined as in Subsection 4.2.5 we have
Thus we can state the main result for the case α ∈ (g, 1].
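Theorem 4.3.4 [Nakada (1981)] Let g < α ≤ 1. Consider the probability measure $\bar\gamma_\alpha$ on $B_{A_\alpha}$ with density
\[ \frac{1}{\log(1 + \alpha)} \, \frac{1}{(xy + 1)^2}, \quad (x, y) \in A_\alpha , \]
[Figure: the regions $S_\alpha$, $\bar\tau(S_\alpha)$ and $M(\bar\tau(S_\alpha))$ for g < α ≤ 1; horizontal axis marked α − 1, 0, (1 − α)/α, α, 1; vertical axis marked 1/2.]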
Taking again projection onto the first axis, we deduce from Theorem
4.3.4 the following result.
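Corollary 4.3.5 Let g < α ≤ 1. Consider the probability measure $\mu_\alpha$ on $B_{I_\alpha}$ with density
\[ \frac{1}{\log(1 + \alpha)} \times \begin{cases} 1/(x + 2) & \text{if } x \in [\alpha - 1, (1 - \alpha)/\alpha], \\ 1/(x + 1) & \text{if } x \in ((1 - \alpha)/\alpha, \alpha]. \end{cases} \]
Since for our values of α we have (1 − α)/α < 1/(1 + α), we find that the set $B_\alpha := B_{S_\alpha}$ from Proposition 4.2.7 is (1/(1 + α), α) × I. Then
\[ \bar\gamma_\alpha(B_\alpha) = 2 - \frac{\log(2 + \alpha)}{\log(1 + \alpha)} , \]
\[ \lim_{n\to\infty} \frac{1}{n} \, \mathrm{card}\{ k ; \ \tilde a_k = 1, \ 1 \le k \le n \} = 2 - \frac{\log(2 + \alpha)}{\log(1 + \alpha)} \quad \text{a.e.} \]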
However, one might ask whether there are values of α for which still smaller values can be obtained for the corresponding approximation coefficients $\tilde\Theta_n^e(\alpha) = \tilde\Theta_n^e$, n ∈ N. Beforehand it is clear that a value smaller than $1/\sqrt{5} = 0.447 \cdots$ can never be found by a classical result of A. Hurwitz [see Perron (1954, p. 49)], according to which for every $\theta < 1/\sqrt{5}$ there exist irrational numbers x such that the inequality $q^2 |x - (p/q)| < \theta$ is verified only for finitely many p/q ∈ Q.
The above-mentioned method from Kraaikamp (1990) can easily be adap-
ted for S-expansions. As an example we will mention here the case of α-
expansions, for which the first result below is due to Bosma et al. (1983).
Theorem 4.3.8 Let α ∈ [1/2, 1]. For any irrational number in $I_\alpha$ and any n ∈ N+ we have
\[ \tilde\Theta_n^e < c(\alpha) \]
and
\[ \min(\tilde\Theta_{n-1}^e, \tilde\Theta_n^e) < V(\alpha) , \]
Since $\min(\Theta_n, \Theta_{n+1}) < 1/2$ (cf. Subsection 1.3.2), the DCF expansion picks at least one out of two consecutive RCF convergents. Since
\[ \bar\gamma(S_{DCF}) = 1 - \frac{1}{2 \log 2} , \]
(ii) $a_k, a_{k+2} \ne 1$, k ∈ N+;
\[ A_{DCF} := A_{S_{DCF}} = \Delta_{DCF}^+ \cup M(\bar\tau(S_{DCF})) , \]
Furthermore, writing $f_{DCF}$ for $f_{S_{DCF}}$ and $\bar\tau_{DCF}$ for $\bar\tau_{S_{DCF}}$ we have
\[ f_{DCF}(x, y) = \left\lfloor |x^{-1}| + \frac{ \lfloor |x^{-1}| \rfloor + y \operatorname{sgn} x - 1 }{ 2 ( \lfloor |x^{-1}| \rfloor + y \operatorname{sgn} x ) - 1 } \right\rfloor , \]
and
\[ \bar\tau_{DCF}(x, y) = \left( |x^{-1}| - f_{DCF}(x, y), \ (f_{DCF}(x, y) + y \operatorname{sgn} x)^{-1} \right) \]
\[ B_2 = B_1 \cap \left( (x, y) \in E_1 : 0 \le (x - y)^2 + x + y \le 3/4 \right) . \]
The result above can be also stated in an equivalent form concerning the existence for any $(t_1, t_2) \in I^2$ of the limit a.e. equal to $H(t_1, t_2)$ of
\[ \frac{1}{n} \, \mathrm{card}\{ k : \tilde\Theta_k^e \le t_1, \ \tilde\Theta_{k+1}^e \le t_2, \ 0 \le k \le n - 1 \} \]
as n → ∞. It then follows, e.g., that
\[ \lim_{n\to\infty} \frac{1}{n} \sum_{k=0}^{n-1} \tilde\Theta_k^e = \frac{1}{4} \quad \text{a.e.} \]
whatever k ∈ N.
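Note that γ̄(B) is equal to
\[ \frac{1}{\log 2} \left( \int_{1/2}^{1} dt \int_{(2t-1)/t}^{1/(2-t)} \frac{du}{(tu + 1)^2} - \int_{1/2}^{2 - \sqrt{2}} dt \int_{(2t-1)/(2-3t)}^{1/(2-t)} \frac{du}{(tu + 1)^2} \right) \]
\[ = \frac{1}{\log 2} \left( \log(\sqrt{2} - 1) + \sqrt{2} - \frac{1}{2} \right) = 0.0473 \cdots . \]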
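Corollary 4.3.12 Let $[\tilde a_0^e; \tilde a_1^e, \tilde a_2^e, \cdots ]$ be the DCF expansion of an irrational number. Then
\[ \lim_{n\to\infty} \frac{1}{n} \, \mathrm{card}\{ k : \tilde a_k^e = 1, \ 1 \le k \le n \} = \rho_{DCF}(B) = \frac{\bar\gamma(B)}{1 - \bar\gamma(S_{DCF})} = 2 \left( \log(\sqrt{2} - 1) + \sqrt{2} - \frac{1}{2} \right) = 0.0656 \cdots \quad \text{a.e.} \]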
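\[ \tilde p_{k+1}^e = \tilde a_{k+1}^e \tilde p_k^e + \tilde e_{k+1} \tilde p_{k-1}^e , \qquad \tilde q_{k+1}^e = \tilde a_{k+1}^e \tilde q_k^e + \tilde e_{k+1} \tilde q_{k-1}^e , \]
and $\tilde p_k^e/\tilde q_k^e$, k ∈ N, are the OCF convergents of x.
Next, the sequence of OCF convergents $(\tilde p_k^e/\tilde q_k^e)_{k \in N}$ is a subsequence of the sequence $(p_n/q_n)_{n \in N}$ of RCF convergents. If we define n(k) in such a way that $\tilde p_k^e/\tilde q_k^e = p_{n(k)}/q_{n(k)}$, k ∈ N+, then
\[ n(k + 1) = \begin{cases} n(k) + 1 & \text{if } \tilde e_{k+2} = 1, \\ n(k) + 2 & \text{if } \tilde e_{k+2} = -1 \end{cases} \]
with
\[ n(0) = \begin{cases} 0 & \text{if } x > 0, \\ 1 & \text{if } x < 0. \end{cases} \]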
Finally, it appears that the OCF expansion gives approximation coefficients $\tilde\Theta_n^e = (\tilde q_n^e)^2 |x - (\tilde p_n^e/\tilde q_n^e)| < 1/2$ for any n ∈ N and, at the same time, it is a fastest expansion. Fastest SRCF expansions for which all convergents are RCF convergents can be defined as those in which always the maximal number of RCF convergents is skipped, meaning that whenever a 1-block of length m ∈ N+ occurs in the RCF expansion, exactly ⌊(m + 1)/2⌋ out of the m 1's are skipped. (Note that this implies that for fastest SRCF expansions only a choice is left in deciding which RCF convergents will be skipped when m is even.) A still more precise definition of 'fastest' is as follows. Writing $n_\alpha(k) := n_{S_\alpha}(k)$, k ∈ N+, α ∈ [1/2, 1], by Theorem 4.2.8 we have a.e.
\[ \lim_{k\to\infty} \frac{n_\alpha(k)}{k} = \begin{cases} \dfrac{\log 2}{\log G} = 1.44042 \cdots & \text{if } 1/2 \le \alpha \le g, \\ \dfrac{\log 2}{\log(\alpha + 1)} & \text{if } g < \alpha \le 1. \end{cases} \]
where the $q_i$ and $\tilde q_i^e$, i ∈ N+, are associated with the RCF expansion and the SRCF expansion considered, respectively. Cf. Bosma (1987, p. 364).
The next result [cf. Bosma and Kraaikamp (1990)] places OCF expan-
sions in the context of the S-expansion theory. More precisely, it shows how
singularizing appropriately the RCF expansion yields the OCF expansion.
(Note that it is for this reason that we have anticipated notation by denoting
the OCF expansion as an S-expansion.)
Lemma 4.3.14 Let ω ∈ Ω have RCF expansion [a_1, a_2, · · · ], RCF convergents $p_n/q_n$, and RCF approximation coefficients $\Theta_n$, n ∈ N. Consider the set
\[ S_{OCF} = \left\{ (x, y) \in I^2 ; \ y < \min\left( x, \ \frac{2x - 1}{1 - x} \right) \right\} . \]
Then for any n ∈ N+ the following three assertions are equivalent:
(iii) $(\tau^n, s_n) \in S_{OCF}$.
Proof. For the proof of the equivalence of (i) and (ii) we refer the reader
to Corollary (4.20) of Bosma (1987). Here we show that (ii) and (iii) are
equivalent.
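Since
\[ \Theta_{n-1} = \frac{s_n}{s_n \tau^n + 1}, \qquad \Theta_n = \frac{\tau^n}{s_n \tau^n + 1}, \quad n \in N_+ , \tag{4.3.2} \]
we have
\[ \frac{|q_n \omega - p_n|}{|q_{n-1} \omega - p_{n-1}|} = \frac{\Theta_n}{\Theta_{n-1}} \, \frac{q_{n-1}}{q_n} = \tau^n < 1, \quad \omega \in \Omega . \tag{4.3.3} \]
Also
\[ \Theta_{n-1} < \Theta_n \ \text{ if and only if } \ \tau^n > s_n . \tag{4.3.4} \]
Furthermore, if $a_{n+1} = 1$ then $p_{n+1} = p_n + p_{n-1}$ and $q_{n+1} = q_n + q_{n-1}$, and by (4.3.3) we have
\[ a_{n+1} = 1 \ \text{ and } \ \Theta_{n+1} < \Theta_n \ \text{ if and only if } \ s_n < \frac{2\tau^n - 1}{1 - \tau^n} . \tag{4.3.5} \]
Combining (4.3.4) and (4.3.5) with the definition of $S_{OCF}$ completes the proof. □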
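The identities (4.3.2) can be checked with exact arithmetic on a rational ω, for which every quantity is a fraction. The sketch below assumes the standard conventions $p_{-1} = 1$, $q_{-1} = 0$, $p_0 = 0$, $q_0 = 1$, $s_n = q_{n-1}/q_n$, $\Theta_n = q_n^2 |\omega - p_n/q_n|$ and $\tau^n$ equal to the n-th iterate of the Gauss map; the function name is illustrative.

```python
from fractions import Fraction

def check_theta_identities(omega, n_max):
    """For n = 1, 2, ... return pairs of booleans telling whether
    Theta_{n-1} = s_n / (s_n tau^n + 1) and Theta_n = tau^n / (s_n tau^n + 1)."""
    p_prev, q_prev, p, q = 1, 0, 0, 1      # p_{-1}, q_{-1}, p_0, q_0
    tau = omega                            # tau^0 = omega
    results = []
    for _ in range(n_max):
        if tau == 0:                       # the expansion has terminated
            break
        inv = 1 / tau
        a = inv.numerator // inv.denominator             # digit a_n
        theta_prev = q * q * abs(omega - Fraction(p, q)) # Theta_{n-1}
        p_prev, p = p, a * p + p_prev
        q_prev, q = q, a * q + q_prev
        tau = inv - a                                    # tau^n
        s = Fraction(q_prev, q)                          # s_n
        theta = q * q * abs(omega - Fraction(p, q))      # Theta_n
        results.append((theta_prev == s / (s * tau + 1),
                        theta == tau / (s * tau + 1)))
    return results

print(check_theta_identities(Fraction(17, 47), 10))
```

For ω = 17/47 = [2, 1, 3, 4] all four pairs come out true; for instance n = 1 gives $\Theta_0 = 17/47$ and $\Theta_1 = 26/47$ on both sides.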
[Figure: the regions $S_{OCF}$, $\bar\tau(S_{OCF})$ and $M(\bar\tau(S_{OCF}))$; horizontal axis marked −1/2, 0, 1/2, g, 1; vertical axis marked 1/2, 1.]
where $\Pi = \left( (x, y) \in R_{++}^2 : 4x^2 + y^2 < 1, \ x^2 + 4y^2 < 1 \right)$.
The result above can be also stated in an equivalent form concerning the existence for any $(t_1, t_2) \in I^2$ of the limit a.e. equal to $H(t_1, t_2)$ of
\[ \frac{1}{n} \, \mathrm{card}\{ k : \tilde\Theta_k^e \le t_1, \ \tilde\Theta_{k+1}^e \le t_2, \ 0 \le k \le n - 1 \} \]
as n → ∞. It then follows, e.g., that
\[ \lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} \tilde\Theta_k^e = \frac{\arctan \frac{1}{2}}{4 \log G} = 0.24087 \cdots \quad \text{a.e.} \tag{4.3.6} \]
\[ \frac{1}{4 \log 2} = 0.36067 \cdots \quad \text{for the RCF expansion,} \]
\[ \frac{1}{4} = 0.25 \quad \text{for the DCF expansion,} \]
\[ \frac{\sqrt{5} - 2}{2 \log G} = 0.24528 \cdots \quad \text{for the NICF and SCF expansions,} \]
\[ \frac{\sqrt{8G + 6} - 2G - 1}{\log G} = 0.24195 \cdots \quad \text{for the } \alpha_0\text{-expansion,} \]
where $\alpha_0 = 0.55821 \cdots$. See Corollary 4.1.23 and Proposition 4.3.10 for the first two values, and Bosma et al. (1983) for the last two ones. Note how close the value in (4.3.6) is to
\[ 1 - \bar\gamma(S_{OCF}) = \frac{\log G}{2} = 0.24061 \cdots . \]
The latter gives an a priori bound for the a.e. asymptotic arithmetic mean
of the approximation coefficients. It can be shown that the value in (4.3.6)
is in fact ‘the best one can get’ for any irrational number. More precisely,
we have the following result.
Theorem 4.3.17 [Bosma and Kraaikamp (1991)] Whatever the SRCF expansion with convergents $p_n^e/q_n^e$ and approximation coefficients $\Theta_n^e$, n ∈ N, we have
\[ \frac{1}{m} \sum_{k=1}^{m} \Theta_k^e \ge \frac{1}{n} \sum_{k=1}^{n} \tilde\Theta_k^e , \quad n \in N_+ , \]
for any irrational number, where $m = \mathrm{card}\{ k : q_k^e < \tilde q_{n+1}^e, \ k \in N_+ \}$ and $\tilde q_n^e$ and $\tilde\Theta_n^e$, n ∈ N+, are associated with the OCF expansion.
be a (finite or infinite) CF with $a_{\ell+1} > 1$, $e_{\ell+1} = 1$ for some ℓ ∈ N for which ℓ + 1 ∈ M. The transformation $\iota_\ell$ which takes (4.4.1) into the CF
\[ (\tilde e_k)_{k \in \tilde M}, \quad (\tilde a_k)_{k \in \{0\} \cup \tilde M}, \tag{4.4.2} \]
where $\tilde M = M$ if M = N+ and $\tilde M = \{k : 1 \le k \le n + 1\}$ if M = {k : 1 ≤ k ≤ n}, n ∈ N+, with $\tilde e_k = e_k$, k ∈ M, k ≤ ℓ, $\tilde e_{\ell+1} = -1$,
resulting after the insertion $\iota_\ell$ of the pair (1, −1) before $a_{\ell+1}$ (> 1), $e_{\ell+1}$ (= 1), is obtained by inserting the term $(p_\ell^e + p_{\ell-1}^e)/(q_\ell^e + q_{\ell-1}^e)$ in the set $(p_k^e/q_k^e)_{k \in \{0\} \cup M}$ before the convergent $p_\ell^e/q_\ell^e$. As usual, here $p_{-1}^e = 1$, $q_{-1}^e = 0$.
The proof is similar to that of Proposition 4.2.4 by using appropriate matrix identities. □
Starting from the RCF expansion, by appropriate insertions we can ob-
tain many classical SRCF expansions, and also continued fraction algorithms
which are not SRCF expansions. Amongst the former we mention the Lehner
continued fraction (LCF) expansion, and amongst the latter the Farey con-
tinued fraction (FCF) expansion. Both these expansions will be studied in
the next subsection.
In particular, we can obtain this way the OddCF and EvenCF expansions
—see the examples of SRCF expansions at the end of Subsection 4.2.2—as
well as the backward continued fraction (BCF) expansion that we will study
in Subsection 4.4.3.
where
\[ (b(x), e(x)) = \begin{cases} (2, -1) & \text{if } 1 \le x < \frac{3}{2}, \\ (1, 1) & \text{if } \frac{3}{2} \le x < 2, \end{cases} \]
then
\[ (b_n(x), e_{n+1}(x)) = (b(L^n(x)), e(L^n(x))), \quad x \in [1, 2), \]
for any n ∈ N. Here $L^n$, n ∈ N+, denotes the composition of L with itself n times while $L^0$ is the identity map.
Denoting as usual the RCF convergents of a real number x = [a_0; a_1, a_2, · · · ] by $(p_n/q_n)_{n \in N}$ and defining the mediant convergents of x by
\[ \frac{k p_n + p_{n-1}}{k q_n + q_{n-1}}, \quad 1 \le k < a_{n+1}, \quad n = 1, 2, \cdots \]
(so that if an+1 = 1 then there is no mediant convergent), we will see that
the set of LCF convergents of x is the union of the sets of RCF and mediant
convergents of x. It is for this reason that the LCF expansion was called the
mother of all SRCF expansions in Dajani and Kraaikamp (op. cit.).
Proposition 4.4.2 Let x ∈ [1, 2) \ Q, with RCF expansion
[ 1; a1 , a2 , · · · ].
If n ≥ 1 then we replace
[ 1; 1, · · · , 1, an+1 , · · · ]
by
\[ \iota_{n + a_{n+1} - 1}( \cdots (\iota_{n+1}(\iota_n([\, 1; 1, \cdots, 1, a_{n+1}, \cdots ])) \cdots ) \]
(ii) Let n′ > n be the smallest integer m′ > n for which $e'_{m'+1} = 1$ and $b'_{m'+1} > 1$. Apply to (4.4.4) the procedure from (i) to $b'_{n'+1}$.
We also have
\[ L(x) = \frac{1}{I(h(x - 1))}, \quad x \in [1, 2), \]
where the bijective function h : [0, 1) → [1/3, 2/3) is defined by
\[ h(x) = \begin{cases} \dfrac{1}{2 - x} & \text{if } 0 \le x < 1/2, \\ \dfrac{x}{x + 1} & \text{if } 1/2 \le x < 1. \end{cases} \]
Ito (op. cit.) showed that I is ν-preserving, where ν is the σ-finite, infinite measure on $B_{[0,1)}$ with density $x^{-1}$, x ∈ (0, 1), and that $([0, 1), B_{[0,1)}, I, \nu)$ is an ergodic dynamical system. This implies that L is µ-preserving, where µ is the σ-finite, infinite measure on $B_{[1,2)}$ with density $(x - 1)^{-1}$, x ∈ (1, 2), and that $([1, 2), B_{[1,2)}, L, \mu)$ is an ergodic dynamical system underlying the LCF expansion.
We will now exhibit the relationship between the LCF expansion and an algorithm yielding the so called Farey continued fraction (FCF) expansion. The latter is an infinite CF expansion of any x ∈ [−1, 0) ∪ (0, ∞) of the form
\[ \cfrac{f_1}{d_1 + \cfrac{f_2}{d_2 + \ddots}} := [\, f_1/d_1, f_2/d_2, \cdots ] , \tag{4.4.5} \]
where $(d_n, f_n)$ is equal to either (1, 1) or (2, −1), n ∈ N+. Formally, as shown by Dajani and Kraaikamp (op. cit.), if we define the transformation F : [−1, ∞) → [−1, ∞) by
\[ F(x) = \begin{cases} \dfrac{f(x)}{x} - d(x) & \text{if } x \ne 0, \\ 0 & \text{if } x = 0, \end{cases} \]
where
\[ (d(x), f(x)) = \begin{cases} (2, -1) & \text{if } -1 \le x < 0, \\ (1, 1) & \text{if } x \ge 0, \end{cases} \]
then
\[ (d_n(x), f_n(x)) = \left( d(F^{n-1}(x)), f(F^{n-1}(x)) \right), \quad x \in [-1, \infty), \]
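The digit rule for the Farey map can be tried out on a rational starting point, where the orbit reaches 0 after finitely many steps in the examples below (the cap on the number of digits is only a safeguard). This is a sketch with illustrative function names, using exact fractions.

```python
from fractions import Fraction

def farey_digits(x, max_digits=200):
    """Pairs (d_n, f_n) obtained by iterating F(x) = f(x)/x - d(x),
    with (d, f) = (2, -1) for -1 <= x < 0 and (1, 1) for x >= 0."""
    digits = []
    while x != 0 and len(digits) < max_digits:
        d, f = (2, -1) if x < 0 else (1, 1)
        digits.append((d, f))
        x = Fraction(f) / x - d
    return digits

def farey_value(digits):
    """Evaluate the finite fraction [f_1/d_1, ..., f_n/d_n] from the tail."""
    value = Fraction(0)
    for d, f in reversed(digits):
        value = Fraction(f) / (d + value)
    return value

digits = farey_digits(Fraction(3, 7))
print(digits)                # [(1, 1), (1, 1), (2, -1), (1, 1), (2, -1)]
print(farey_value(digits))   # 3/7
```

Evaluating the finite fraction built from the recorded pairs returns the starting point, which is the relation x = f(x)/(d(x) + F(x)) applied repeatedly.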
L̄n (x, y) = (Ln (x), [en (x)/bn−1 (x), · · · , e2 (x)/b1 (x), e1 (x)/(b0 (x) + y)])
\[ \bar L^{-n}(x, y) = \left( [\, d_n(y); f_n(y)/d_{n-1}(y), \cdots, f_2(y)/d_1(y), f_1(y)/x \,], \ F^n(y) \right) \]
whatever (x, y) ∈ D0 .
Remark. It is interesting to compare the last two equations above with (1.3.1′) and (1.3.2′). This might suggest developments similar to those in Section 1.3. □
Theorem 4.4.3 The quadruple $(D, B_D, \bar L, \bar\mu)$ is an ergodic dynamical system which is a natural extension of the dynamical system
\[ ([1, 2), B_{[1,2)}, L, \mu) . \]
We should next show that $\bar L$ is $\bar\mu$-preserving and, finally, that the σ-algebra generated by
\[ \bigcup_{n \in N} \bar L^n \pi_1^{-1}\left( B_{[1,2)} \right) \]
coincides with $B_D$. We leave the details to the reader, who can find them in Dajani and Kraaikamp (op. cit.). □
Let us denote by φ the σ-finite, infinite measure on $B_{[-1,\infty)}$ with density $(x + 1)^{-1} - (x + 2)^{-1}$, x ∈ (−1, ∞). It is easy to check that F is φ-preserving.
Theorem 4.4.4 The map ξ : [−1, 0) ∪ (0, ∞) → [1, 2) defined by
\[ x = [\, f_1/d_1, f_2/d_2, \cdots ] \]
is an isomorphism from $([-1, \infty), B_{[-1,\infty)}, F, \phi)$ to $([1, 2), B_{[1,2)}, L, \mu)$.
= [ d2 ; f2 /d3 , f3 /d4 , · · · ]
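\[ \lim_{n\to\infty} \frac{b_1 + \cdots + b_n}{n} = 2 \quad \text{a.e.} \]
\[ \frac{1}{b_1} + \cdots + \frac{1}{b_m} = k + \frac{1}{2} \sum_{i=1}^{k} (a_i - 1) + \frac{j}{2} = \frac{m + k}{2} . \]
\[ \frac{k}{m} \le \frac{1}{\frac{1}{k} \sum_{i=1}^{k} a_i} , \]
\[ \lim_{m\to\infty} \frac{m}{\frac{1}{b_1} + \cdots + \frac{1}{b_m}} = 2 . \]
\[ \frac{m}{\frac{1}{b_1} + \cdots + \frac{1}{b_m}} \le \sqrt[m]{b_1 \cdots b_m} \le \frac{b_1 + \cdots + b_m}{m} \ (\le 2) , \]
\[ [\, f_1/d_1, f_2/d_2, \cdots ] . \]
Then
\[ \lim_{n\to\infty} \frac{n}{\frac{1}{d_1} + \cdots + \frac{1}{d_n}} = 2 \quad \text{a.e.,} \]
\[ \lim_{n\to\infty} \sqrt[n]{d_1 \cdots d_n} = 2 \quad \text{a.e.,} \]
\[ \lim_{n\to\infty} \frac{d_1 + \cdots + d_n}{n} = 2 \quad \text{a.e.} \]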
As with Proposition 4.4.2 we leave to the reader the proof of the following
result.
Proposition 4.4.7 Let ω ∈ Ω with RCF expansion [a1 , a2 , · · · ]. Then
the BCF expansion (4.4.6) of ω is given by the following algorithm.
(i) If a1 = 1 then singularize a1 to arrive at
See also Zagier (1981, Aufgabe 3, p. 131). It also follows easily from (4.4.8)
that every quadratic irrationality has an (eventually) periodic BCF expan-
sion.
2. Again, as for the LCF expansion, it heuristically follows from Corollary 4.1.10 and the insertion mechanism that the BCF transformation β should be ergodic, with invariant σ-finite, infinite measure. □
For the LCF expansion it was intuitively clear that $\sqrt[n]{b_1 \cdots b_n} \to 2$ a.e. as n → ∞ since the only digits are 1 and 2, and 'there are very few 1's against the 2's' (by Corollary 4.1.10). For the BCF expansion such an argument clearly does not work. However, we have the following result.
Theorem 4.4.8 Let ω ∈ Ω with BCF expansion (4.4.6). Then
\[ \lim_{n\to\infty} \sqrt[n]{c_1 \cdots c_n} = 2 \quad \text{a.e.} \]
and
\[ \lim_{n\to\infty} \frac{n}{\frac{1}{c_1} + \cdots + \frac{1}{c_n}} = 2 \quad \text{a.e.} \]
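Proof. Let [a_1, a_2, · · · ] be the RCF expansion of ω. For any given sufficiently large m ∈ N+ there (uniquely) exist integers k ∈ N+ and j ∈ N such that
\[ m = a_1 + a_3 + \cdots + a_{2k-1} + j, \quad 0 \le j < a_{2k+1} . \]
It follows from (4.4.8) that
\[ c_1 \cdots c_m = 2^{\sum_{i=1}^{k} (a_{2i-1} - 1) + j - 1} \prod_{i=1}^{k} (a_{2i} + 2) , \]
and therefore
\[ \frac{1}{m} \sum_{i=1}^{m} \log c_i = \frac{\log 2}{m} \left( \sum_{i=1}^{k} a_{2i-1} - k + j - 1 \right) + \frac{1}{m} \sum_{i=1}^{k} \log(a_{2i} + 2) \]
\[ = (\log 2) \left( 1 - \frac{k + 1}{\sum_{i=1}^{k} a_{2i-1} + j} \right) + \frac{\sum_{i=1}^{k} \log(a_{2i} + 2)}{\sum_{i=1}^{k} a_{2i-1} + j} . \]
Since
\[ \frac{k + 1}{\sum_{i=1}^{k} a_{2i-1} + j} = \frac{1}{\frac{1}{k+1} \sum_{i=1}^{k} a_{2i-1} + \frac{j}{k+1}} \to 0 \quad \text{a.e.} \]
as m → ∞, and
\[ \frac{\sum_{i=1}^{k} \log(a_{2i} + 2)}{\sum_{i=1}^{k} a_{2i-1} + j} \to 0 \quad \text{a.e.} \]
as m → ∞, we deduce that
\[ \sqrt[m]{c_1 \cdots c_m} \to 2 \quad \text{a.e.} \]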
A1.1
Let X be an arbitrary non-empty set. A non-empty collection $\mathcal{X}$ of subsets of X is said to be a σ-algebra (in X) if and only if it is closed under the formation of complements and countable unions. Clearly, ∅ and X both belong to $\mathcal{X}$, and $\mathcal{X}$ is also closed under the formation of countable intersections. For any non-empty collection $\mathcal{C}$ of subsets of X the σ-algebra generated by $\mathcal{C}$, denoted $\sigma(\mathcal{C})$, is defined as the smallest σ-algebra in X which contains $\mathcal{C}$. Clearly, $\sigma(\mathcal{C})$ is the intersection of all σ-algebras in X which contain $\mathcal{C}$.
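On a finite set the generated σ-algebra can be computed by brute force, since countable unions reduce to finite ones. The following sketch closes a collection under complements and pairwise unions until nothing new appears; the function name is illustrative, and the approach is only practical for very small sets.

```python
from itertools import combinations

def generated_sigma_algebra(universe, collection):
    """Smallest family of subsets of a finite universe that contains the
    given collection and is closed under complements and unions."""
    universe = frozenset(universe)
    sets = {frozenset(), universe} | {frozenset(c) for c in collection}
    changed = True
    while changed:
        changed = False
        current = list(sets)
        for a in current:                       # close under complements
            comp = universe - a
            if comp not in sets:
                sets.add(comp)
                changed = True
        for a, b in combinations(current, 2):   # close under pairwise unions
            u = a | b
            if u not in sets:
                sets.add(u)
                changed = True
    return sets

sigma = generated_sigma_algebra({1, 2, 3, 4}, [{1}, {2}])
print(sorted(sorted(s) for s in sigma))
```

With X = {1, 2, 3, 4} and the collection {{1}, {2}}, the atoms of the result are {1}, {2} and {3, 4}, so the generated σ-algebra has 8 elements.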
A pair $(X, \mathcal{X})$ consisting of a non-empty set X and a σ-algebra $\mathcal{X}$ in X is called a measurable space. In the special case where X is a denumerable set the usual σ-algebra in X is $\mathcal{P}(X)$, the collection of all subsets of X. Clearly, $\mathcal{P}(X)$ is generated by the elements of X: $\mathcal{P}(X) = \sigma(\{x\} : x \in X)$.
The product of two measurable spaces $(X, \mathcal{X})$ and $(Y, \mathcal{Y})$ is the measurable space $(X \times Y, \mathcal{X} \otimes \mathcal{Y})$, where the product σ-algebra $\mathcal{X} \otimes \mathcal{Y}$ is defined as $\sigma(\mathcal{C})$ with $\mathcal{C} = (A \times B : A \in \mathcal{X}, B \in \mathcal{Y})$.
A1.2
Let (X, X ) and (Y, Y) be two measurable spaces. A map f : X → Y from X
into Y is said to be (X , Y)-measurable or a Y -valued random variable (r.v.)
on X if and only if the inverse image f^{−1}(A) = (x ∈ X : f (x) ∈ A) of
every set A ∈ Y is in X . Setting f^{−1}(Y) = (f^{−1}(A) : A ∈ Y), the above
condition can be compactly written as f^{−1}(Y) ⊂ X . [Note that f^{−1}(Y) is
always a σ-algebra in X whatever f : X → Y ! ]
Let (X, X ) be a measurable space, let ((Yi , Yi ))i∈I be a family of
measurable spaces, and for any i ∈ I let fi be a Yi -valued r.v. on X. Then
the σ-algebra σ(∪_{i∈I} f_i^{−1}(Y_i)) is called the σ-algebra generated by the family
(f_i)_{i∈I} and is denoted σ((f_i)_{i∈I}). Clearly, this is the smallest σ-algebra S ⊂ X
having the property that f_i is (S, Y_i)-measurable for any i ∈ I.
A1.3
Let (X, X ) be a measurable space. A function µ : X → R+ is said to
be a (finite) measure on X if and only if it is completely additive, that
is, for any sequence (A_i)_{i∈N+} of pairwise disjoint elements of X we have
µ(∪_{i∈N+} A_i) = Σ_{i∈N+} µ(A_i). Complete additivity is equivalent to finite
additivity [that is, for any finite collection A_1 , . . . , A_n of pairwise disjoint
elements of X , we have µ(∪_{i=1}^{n} A_i) = Σ_{i=1}^{n} µ(A_i)] in conjunction with
continuity at ∅ (that is, for any decreasing sequence A_1 ⊃ A_2 ⊃ . . . of elements
of X with ∩_{i∈N+} A_i = ∅ we have lim_{n→∞} µ(A_n) = 0). Clearly, finite
additivity implies µ (∅) = 0. In the special case where X is a denumerable
set a measure µ on P(X) is defined by simply giving the values µ ({x}) for
the elements x ∈ X. A probability on X is a measure P on X satisfying
P (X) = 1. An important example of a probability on X is that of the
probability δx concentrated at x for any given x ∈ X, which is defined by
δx (A) = IA (x), A ∈ X . The collection of all measures (probabilities) on X
will be denoted m(X ) (pr(X )).
A triple (X, X , P ) consisting of a measurable space (X, X ) and a prob-
ability P on X is called a probability space. [The traditional notation for
a probability space is (Ω, K, P ). The points ω ∈ Ω are interpreted as the
possible outcomes (elementary events) of a random experiment, and the sets
A ∈ K as the (random) events associated with it; these are the subsets of
Ω arising as the truth sets of certain statements concerning the experiment.]
We say that A ∈ X occurs P -almost surely, and write A P -a.s., if and only
if P (A) = 1. Let (Y, Y) be a measurable space and let f be a Y -valued
r.v. on X. The P -distribution of f is the probability P f^{−1} on Y defined by
P f^{−1}(A) = P (f^{−1}(A)), A ∈ Y.
Let (X, X ) and (Y, Y) be two measurable spaces. The product measure
of µ ∈ m(X ) and ν ∈ m(Y) is the (unique) measure µ ⊗ ν ∈ m (X ⊗ Y)
satisfying the equation µ ⊗ ν(A × B) = µ(A)ν(B) for any A ∈ X and B ∈ Y.
A1.4
Let X be a metric space with metric d. The usual σ-algebra in X, denoted
BX , is that of Borel subsets of X, that is, the σ-algebra generated by the
for any h ∈ Cr (X) = the set of all real-valued bounded continuous functions
on (X, d). An equivalent definition is obtained by asking that
In particular, the above result holds for a continuous f for which clearly
Df = ∅. For a characterization via weak convergence of almost every-
where continuous functions f , that is, such that P (Df ) = 0, see Mazzone
(1995/96).
A1.5
In this section (X, d) is the real line with the usual Euclidean distance.
The characteristic function (ch.f.) or Fourier transform of a measure
µ ∈ m(B) is the complex-valued function $\hat{\mu}$ defined on R by
$$\hat{\mu}(t) = \int_{\mathbf{R}} e^{itx}\, \mu(dx), \qquad t \in \mathbf{R}.$$
If $\hat{\mu} = \hat{\nu}$ for two measures µ, ν ∈ m(B), then µ = ν.
Proposition A1.2 (Lévy–Cramér continuity theorem) Let P, P_n ∈
pr(B), n ∈ N+ .
(i) $P_n \xrightarrow{w} P$ ∈ pr(B) implies $\lim_{n\to\infty} \hat{P}_n = \hat{P}$ pointwise, and the
convergence of ch.f.s is uniform on compact subsets of R.
(ii) If $\lim_{n\to\infty} \hat{P}_n = h$ pointwise and h is continuous at 0, then h is the
ch.f. of a probability P ∈ pr(B) and $P_n \xrightarrow{w} P$.
Let µ, ν ∈ m(B). The convolution µ ∗ ν is the measure on B defined by
$$\mu * \nu(A) = \int_{\mathbf{R}} \mu(A - x)\, \nu(dx), \qquad A \in \mathcal{B},$$
where A − x := (y − x : y ∈ A) , x ∈ R.
The convolution operator ∗ is associative and commutative. We have
$$\widehat{\mu * \nu} = \hat{\mu}\, \hat{\nu}, \qquad \mu, \nu \in m(\mathcal{B}).$$
$$P^{*n} = P f_n^{-1}, \qquad f_n(x) = A_n x + B_n, \quad x \in \mathbf{R}. \tag{A1.2}$$
with a ∈ R, k1 , k2 ≥ 0, k1 + k2 > 0.
A1.6
Let C = Cr (I) be the metric space of real-valued continuous functions
on I = [0, 1] with the uniform metric
$$W(x : x(0) = 0) = 1,$$
$$W\big(x : x(t_i) - x(t_{i-1}) \le a_i,\ 1 \le i \le k\big)
= \prod_{i=1}^{k} \frac{1}{\sqrt{2\pi (t_i - t_{i-1})}} \int_{-\infty}^{a_i} e^{-u^2/(2(t_i - t_{i-1}))}\, du$$
for any ℓ ∈ L. The distance d0 (x, y) (≤ d(x, y)) for x, y ∈ D is defined as
the infimum of all ε > 0 for which there exists ℓ ∈ L such that s0 (ℓ) ≤ ε
and sup_{t∈I} |x(t) − y(ℓ(t))| ≤ ε. The metrics d0 and d generate the same
topology in D. Nevertheless, while D is complete and separable under d0 ,
separability does not hold under d.
The σ-algebra BD of Borel sets in (D, d0 ) coincides with the σ-algebra B I ∩
D. Wiener measure W can be immediately extended from BC to BD as the
topologies induced in D by the metrics d0 and d are identical. Hence A∩C ∈
BC for any A ∈ BD . This allows us to define W (A) = W (A ∩ C), A ∈ BD .
Clearly, C is the support of W in D, that is, the smallest closed subset of
D whose W -measure equals 1.
General references: Araujo and Giné (1980), Billingsley (1968), Halmos
(1950), Hoffmann–Jørgensen (1994), Samorodnitsky and Taqqu (1994).
Appendix 2: Regularly varying functions
A2.1
A measurable function R : [r, ∞) → R+ , where r ∈ R+ , is said to be
regularly varying (at ∞) of index α ∈ R if and only if there exists x0 ≥ r
such that R([x0 , ∞)) ⊂ R++ and
$$\lim_{x\to\infty} \frac{R(tx)}{R(x)} = t^{\alpha}$$
for any t ∈ R++ . A regularly varying function of index 0 is called a slowly
varying function.
It is obvious that R is regularly varying of index α if and only if it can
be written in the form
$$R(x) = x^{\alpha} L(x), \qquad x \in (r, \infty),$$
where L is a slowly varying function.
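As a numerical illustration of the definition, the sketch below takes the assumed example R(x) = x^α log x, so that L = log plays the role of the slowly varying factor, and compares R(tx)/R(x) with t^α for increasing x. The choice of R, α and t is hypothetical and serves only to show how slowly the limit may be approached.

```python
import math


def R(x, alpha=1.5):
    """Illustrative regularly varying function x**alpha * log(x) of index alpha."""
    return x ** alpha * math.log(x)


def ratio(t, x, alpha=1.5):
    """The quotient R(tx)/R(x), which should approach t**alpha as x grows."""
    return R(t * x, alpha) / R(x, alpha)


alpha, t = 1.5, 3.0
for x in (1e2, 1e4, 1e8, 1e16, 1e100):
    # Here ratio = t**alpha * (1 + log(t)/log(x)), so the error decays like 1/log(x).
    print(f"x = {x:.0e}   R(tx)/R(x) = {ratio(t, x, alpha):.6f}   t^alpha = {t ** alpha:.6f}")
```

The relative error equals log t / log x for this example, which makes the logarithmic rate of convergence explicit.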
The general form of a slowly varying function is described by the cel-
ebrated Karamata theorem below [cf. Seneta (1976, Theorem 1.2 and its
Corollary)].
Theorem A2.1 (Representation theorem) Let r ∈ R+ . A function L :
[r, ∞) → R+ is slowly varying if and only if
$$L(x) = c(x) \exp\left(\int_{x_0}^{x} \frac{\varepsilon(t)}{t}\, dt\right), \qquad x \ge x_0,$$
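(ii) $\lim_{x\to\infty} x^{\varepsilon} L(x) = \infty$ and $\lim_{x\to\infty} x^{-\varepsilon} L(x) = 0$ for any ε > 0;
$$\lim_{x\to\infty} \frac{x^{\alpha+1} L(x)}{\int_{x_0}^{x} y^{\alpha} L(y)\, dy} = \alpha + 1 \tag{A2.1}$$
while the function $x \to \int_{x_0}^{x} y^{\alpha} L(y)\, dy$, x > x0 , is regularly varying of index
α + 1.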
Conversely, if L : [r, ∞) → R+ is measurable and bounded on finite
intervals in [x0 , ∞) for some x0 ≥ r and (A2.1) holds for some α > −1, then
L is a slowly varying function while the function $x \to \int_{x_0}^{x} y^{\alpha} L(y)\, dy$, x > x0 ,
is regularly varying of index α + 1. The last assertion also holds for α = −1.
Theorem A2.4 Let r ∈ R+ . If L : [r, ∞) → R+ is a slowly varying
function, then
$$\lim_{x\to\infty} \int_{x}^{\infty} y^{\alpha} L(y)\, dy < \infty \tag{A2.2}$$
for any α < −1. If $\lim_{x\to\infty} \int_{x}^{\infty} y^{-1} L(y)\, dy < \infty$ then for any α ≤ −1 we
have
$$\lim_{x\to\infty} \frac{x^{\alpha+1} L(x)}{\int_{x}^{\infty} y^{\alpha} L(y)\, dy} = -(\alpha + 1) \tag{A2.3}$$
while the function $x \to \int_{x}^{\infty} y^{\alpha} L(y)\, dy$, for x large enough, is regularly
varying of index α + 1.
Conversely, if L : [r, ∞) → R+ is measurable, satisfies (A2.2), and
(A2.3) holds for some α < −1, then L is a slowly varying function while
the function $x \to \int_{x}^{\infty} y^{\alpha} L(y)\, dy$, for x large enough, is regularly varying of
index α + 1.
A2.2
An important class of pairs of regularly varying functions is defined as
follows. Let ξ be a non-degenerate real-valued random variable on a probability
space (Ω, K, P ), and define real-valued functions F and $\tilde{F}$ on [0, ∞) by
$$x^2 \tilde{F}(x) + F(x) = 2x^2 \int_{x}^{\infty} u^{-3} F(u)\, du, \qquad x \in \mathbf{R}_+. \tag{A2.5}$$
$$\lim_{x\to\infty} \frac{x^2 \tilde{F}(x)}{F(x)} = c \tag{A2.6}$$
A2.3
Let f : [1, ∞) → R++ be a measurable function which is bounded on finite
intervals and such that limx→∞ f (x) = ∞. For any y ∈ [f (1), ∞) define
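$$\lim_{y\to\infty} \frac{f_1(y)}{f_2(y)} = 1.$$
$$\lim_{y\to\infty} \frac{f_0(y)}{f_2(y)} = 1$$
$$\lim_{x\to\infty} \frac{f^2(x)}{x \sum_{\{k \in \mathbf{N}_+ :\ k \le x\}} f^2(k)\, k^{-2}} = 0. \tag{A2.7}$$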
A3.1
Let (Ω, K, P ) be a probability space. For any two σ-algebras K1 and K2
included in the σ-algebra K define the dependence coefficients
Clearly,
α(K1 , K2 ) ≤ ϕ(K1 , K2 ) ≤ ψ(K1 , K2 )
and
X = {Xnj , 1 ≤ j ≤ jn , jn ∈ N+ , n ∈ N+ } (A3.1)
where $\mathbf{N}_+^{(k)} = \{n \in \mathbf{N}_+ : j_n > k\}$, k ∈ N+ , and δ stands for either α, ϕ or
ψ. Clearly, in the case of an infinite sequence (Xn )n∈N+ we can write
(Xk+1 , · · · , Xk+h ), 0 ≤ k ≤ n − h,
real-valued r.v. on (X, X ), and assume that $E f^2(X_1) < \infty$. Then the
series
$$\sigma^2 = E f^2(X_1) - E^2 f(X_1) + 2 \sum_{n \in \mathbf{N}_+} E\big(f(X_1) - E f(X_1)\big)\big(f(X_{n+1}) - E f(X_1)\big)$$
as n → ∞.
The above results are already folklore. See, e.g., Doukhan (1994, Ch. 1).
Proposition A3.4 [Gordin (1971, Remark 3)] In addition to the hy-
potheses of Corollary A3.3 assume that ψ(1) < 1. Then σ = 0 if and only
if f = const.
A3.2
For an array (A3.1) of real-valued r.v.s on (Ω, K, P ) set
$$S_{nk} = \sum_{j=1}^{k} X_{nj}, \quad 1 \le k \le j_n, \qquad S_{n j_n} = S_n, \quad n \in \mathbf{N}_+.$$
Then such an array is said to be strongly infinitesimal (s.i. for short) if and
only if it is strictly stationary and for any sequence (kn )n∈N+ of natural
integers such that kn ≤ jn , n ∈ N+ , and limn→∞ kn /jn = 0 the sum Snkn
converges in P -probability to 0 as n → ∞.
All results given below were proved by J.D. Samur, as indicated at appro-
priate places, in the more general case of Banach valued random variables.
Proposition A3.5 If (A3.1) is a ϕ-mixing s.i. array of real-valued r.v.s,
then
$$\lim_{n\to\infty} \max_{1 \le k \le k_n} d_P\big(P S_{nk}^{-1}, \delta_0\big) = 0$$
A3.3
Let ν be an infinitely divisible probability on B. We denote by Qν the
distribution (on BD ) of a stochastic process ξν = (ξν (t))t∈I with stationary
independent increments, ξν (0) = 0 a.s., trajectories in D, and ξν (1) having
probability distribution ν. When ν is Gaussian the process ξν can be taken
with trajectories in C. In this case the distribution of ξν is concentrated on
BC , and we shall denote it by Q0ν .
Given an array (A3.1) of real-valued r.v.s., for any n ∈ N+ define the
stochastic processes ξnD = (ξnD (t))t∈I and ξnC = (ξnC (t))t∈I by
Theorem A3.8 [Samur (1987, Corollary 3.5 and § 3.6.4)] Let (A3.1) be
a ϕ-mixing strictly stationary array of real-valued r.v.s. Let ν be a probability
measure on B. Then the following statements are equivalent:
I. $P S_n^{-1} \xrightarrow{w} \nu$, the array (A3.1) is s.i., and $\lim_{n\to\infty} j_n P(|X_{n1}| > \varepsilon) = 0$
for any ε > 0.
II. ν is Gaussian and $P (\xi_n^D)^{-1} \xrightarrow{w} Q_\nu$ in BD .
III. ν is Gaussian and $P (\xi_n^C)^{-1} \xrightarrow{w} Q_\nu^0$ in BC .
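$$P'\,(X'_{n1}, \cdots, X'_{n j_n})^{-1} = P\,(X_{n1}, \cdots, X_{n j_n})^{-1}, \qquad n \in \mathbf{N}_+,$$
$$P' \zeta^{-1} = Q_\nu^0,$$
$$\max_{1 \le k \le j_n} \left| \sum_{j=1}^{k} X'_{nj} - \zeta\!\left(\frac{k}{j_n}\right) \right| \to 0 \quad P'\text{-a.s. as } n \to \infty.$$
$$X_{nj} = \frac{1}{B_n}\,(X_j - A_n), \qquad 1 \le j \le n, \quad n \in \mathbf{N}_+.$$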
rn (Arn − An )
lim = 0.
n→∞ Bn
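$$\lim_{x\to\infty} \frac{E^2\, \eta\, I_{(|\eta| \le x)}}{E\, \eta^2 I_{(|\eta| \le x)}} = 0. \tag{A3.2}$$
Assume that
$$0 < E X_1^2 \le \infty, \qquad \lim_{x\to\infty} \frac{x^2 P(|X_1| > x)}{E X_1^2 I_{(|X_1| \le x)}} = 0,$$
we have $P \bar{\xi}_n^{-1} \xrightarrow{w} W_D$ in BD , where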
1.1
As we have noted, the basic reference for classical non-metric results on
different types of continued fraction expansions is Perron (1954, 1957).
There exist several metrical results about Euclid’s algorithm. Let b, n ∈
N+ with 1 ≤ b < n. Then b/n = [a1 , · · · , aτ (b,n) ] with aτ (b,n) ≥ 2, and
τ (b, n) ∈ N+ is the number of division steps occurring when b and n are
input to the algorithm. Since Euclid’s algorithm applied to b and n behaves
essentially the same as when applied to b/g.c.d.(b, n) and n/g.c.d.(b, n),
it is convenient to consider the average number τn of division steps when b
is relatively prime to n and chosen at random, that is, probability 1/ϕ(n)
is given to any integer in the range [1, n] which is prime to n. Here ϕ is
Euler’s ϕ-function defined by
$$\varphi(n) = n \prod_{p \mid n} \left(1 - \frac{1}{p}\right), \qquad n \ge 2,$$
and ϕ(1) = 1, where the product is taken over all prime numbers p which
divide n. Clearly,
$$\tau_n = \frac{1}{\varphi(n)} \sum_{\substack{k=1 \\ \mathrm{g.c.d.}(k,n)=1}}^{n} \tau(k, n).$$
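The quantities τ(b, n), ϕ(n) and τn just defined are easy to compute directly. The sketch below is an illustration under stated assumptions: the function names `division_steps`, `euler_phi` and `average_steps` are hypothetical, and the sample moduli are arbitrary primes. It prints τn / log n, which can be compared with the leading coefficient (12 log 2)/π² quoted in these notes.

```python
import math


def division_steps(b, n):
    """Number of division steps of Euclid's algorithm applied to b and n (1 <= b <= n)."""
    steps = 0
    while b != 0:
        n, b = b, n % b
        steps += 1
    return steps


def euler_phi(n):
    """Euler's phi-function via the product over the primes dividing n."""
    result, m, p = n, n, 2
    while p * p <= m:
        if m % p == 0:
            while m % p == 0:
                m //= p
            result -= result // p
        p += 1
    if m > 1:  # one prime factor larger than sqrt(n) may remain
        result -= result // m
    return result


def average_steps(n):
    """tau_n: average of tau(k, n) over 1 <= k <= n with g.c.d.(k, n) = 1."""
    total = sum(division_steps(k, n) for k in range(1, n + 1) if math.gcd(k, n) == 1)
    return total / euler_phi(n)


leading = 12 * math.log(2) / math.pi ** 2
for n in (101, 1009, 10007):
    print(n, average_steps(n) / math.log(n), leading)
```

For n = 5 the four admissible fractions 1/5 = [5], 2/5 = [2, 2], 3/5 = [1, 1, 2] and 4/5 = [1, 4] need 1, 2, 3 and 2 steps, so τ5 = 2.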
The leading coefficient (12 log 2)/π 2 = 0.84276... was independently derived
by Dixon (1970, 1971) and Heilbronn (1969). A very interesting discussion of
this topic can be found in Knuth (1981, Section 4.5.3). See also Lochs (1961),
Szüsz (1980), and Tonkov (1974). For recent generalizations of Dixon’s and
Heilbronn’s results, see Hensley (1994). The largest quotient
$$\max_{1 \le k \le \tau(b,n)} a_k$$
1.2
Whole sections or chapters on the metrical theory of continued fractions
can be found in the books by Billingsley (1965), Ibragimov and Linnik
(1971), Iosifescu and Grigorescu (1990), Kac (1959), Khin(t)chin(e) (1956,
1963, 1964), Knuth (1981), Koksma (1936), Lévy (1954), Rockett and Szüsz
(1992), Sinai (1994), Urban (1923).
1.3
The natural extension τ̄ of τ has been introduced in a more general context
by Nakada (1981) in order to derive ergodic properties of associated random
variables. See Sections 4.0 and 4.1.
The extended incomplete quotients have been first introduced by Faivre
(1996) and, in general, the extended random variables by Iosifescu (1997),
who proved Theorem 1.3.5 which motivates the consideration of the condi-
tional probability measures γa , a ∈ I. Proposition 1.3.8 and Corollary 1.3.9
can also be found in the latter reference.
Subsections 1.3.5 and 1.3.6 rely on the work of Iosifescu (1989, 2000
b). It is worth mentioning that to our knowledge it is the first time that
mixing coefficients have been computed exactly. A first estimate, ψ(n) ≤
(0.8)^n , n ∈ N+ , of the ψ-mixing coefficients is due to Philipp (1988). As
to other types of mixing, it seems possible to prove a kind of α-mixing for
(r̄_ℓ)_{ℓ∈Z} using the Markovian structure of (s̄_ℓ)_{ℓ∈Z} and the reversibility of
(ā_ℓ)_{ℓ∈Z} .
It is the appropriate place to mention that the sequence (an )n∈N+ enjoys
another mixing property known as the almost Markov property, a concept
introduced by the Lithuanian school; see especially the references to the
papers by V.A. Statulevičius and B. Riauba in Heinrich (1987) and
Misevičius (1971). See also Saulis and Statulevičius (1991). Let µ ∈ pr(BI ) and
for k, n ∈ N+ define the random variable
Then, as shown in Heinrich (op. cit.) [for a slightly weaker form of this result
see Misevičius (1981)], assuming that µ ≪ λ and that f = dµ/dλ ∈ L(I)
and is bounded away from 0, we have
Finally, note that it has not been usual to prove F. Bernstein’s theo-
rem (Proposition 1.3.16) as an application of ψ-mixing of the sequence of
incomplete quotients.
2.1
Theorem 2.1.6 and Proposition 2.1.7 are in fact corollaries of the ergodic
theorem of Ionescu Tulcea and Marinescu (1950) [see also Hennion (1993)],
which is a deep generalization of an ergodic theorem of Doeblin and Fortet
(1937). Cf. Iosifescu (1993b). As noted by Iosifescu (1993a), it is hard to
understand how Doeblin (1940) missed a geometric rate solution to Gauss’
problem, which could have been obtained by using the latter theorem.
Subsection 2.1.3 relies on the work of Iosifescu (1992, 1993, 1994). In
particular, Propositions 2.1.11 and 2.1.12 have allowed for the simplest so-
lution known to date to Gauss’ problem, which is included in the first two
references just quoted. Proposition 2.1.11 has also been proved by Szüsz
(1961) for f ∈ C^1(I).
In connection with Proposition 2.1.17 we note that in the case of a
singular µ ∈ pr(BI ) the solution to the corresponding Gauss' problem has not
yet been systematically studied. See Remark 2 following Corollary 4.1.10
for a case where the limit clearly differs from Gauss’ measure.
2.2
Subsections 2.2.1 and 2.2.2 contain a very detailed presentation of
E. Wirsing's celebrated 1974 paper. This also includes the effective computation
of the numerical constants occurring there.
Subsection 2.2.3 relies on the work of Iosifescu (2000 a, c). That Theo-
rem 2.2.6 holds for f ∈ L(I), that is, that Theorem 2.2.8 holds, had been
announced in Iosifescu (1992) and subsequently used by Faivre (1998a). We
stress again the importance of a study of the set E defined in Remark 1
following Theorem 2.2.6. (See also Remark 2 following Theorem 2.2.11.)
2.3
This section contains a detailed presentation of K.I. Babenko’s work on
Gauss’ problem, with some improvements and generalizations. Information
about the life and work of K.I. Babenko (1919–1987) can be found in Russian
Math. Surveys 35 (1980), no. 2, 265–275, and 43 (1988), no. 2, 138–151.
Proposition 2.3.2 and its proof are due to Mayer and Roepstorff (1987).
For a = 0, that is, under Lebesgue measure λ = γ0 the exact Gauss–Kuzmin–
Lévy Theorem 2.3.5 has been proved by Babenko (1978). The general case
a ∈ I has been announced by Iosifescu (2000 b). Note that equation (2.3.14)
is equivalent to equation (3.6) in Hensley (1992).
We stress the fact that for some a ∈ I the exact convergence rate in
Gauss’ problem under γa is faster than Wirsing’s optimal rate O(λ_0^n) as
n → ∞. See the Remark after the proof of Corollary 2.3.6.
It should be noted that by Proposition 2.1.17 for any i^{(k)} ∈ N_+^k the
limit of µ[(a_{n+1} , . . . , a_{n+k}) = i^{(k)}] as n → ∞ exists and is equal to γ(I(i^{(k)}))
whatever µ ∈ pr(BI ) such that µ ≪ λ. Corollary 2.3.6 shows that in the
case where µ = γa , a ∈ I, a good convergence rate also holds.
A note of historical nature is in order concerning the equation
$$\lim_{n\to\infty} \lambda(a_n = k) = \frac{1}{\log 2} \log\left(1 + \frac{1}{k(k+2)}\right), \qquad k \in \mathbf{N}_+,$$
which is a weaker form of a result given in Corollary 2.3.6. This formula was
first obtained as early as 1900. Two papers of the Swedish astronomer Hugo
Gyldén, whose understanding of the approximate computation of planetary
motions led him in 1888 to study the asymptotics of λ(an = k), k ∈ N+ , as
n → ∞, were taken up for revision by his fellow-countrymen Torsten Brodén
and Anders Wiman, both mathematicians associated with Lund University.
Wiman (1900) finally obtained the correct result after Sisyphean computations.
Two subsequent papers, both published in 1901, of Brodén and Wiman were
then considered by Émile Borel as the first ones to notice the applicability
of measure theory in probability. The reader will find precise references
and all the necessary details in von Plato (1994, Ch. 2). This book is a
fascinating account of the emergence of measure-theoretic probability in the
first third of the 20th century (until the publication of A.N. Kolmogorov’s
Grundbegriffe der Wahrscheinlichkeitsrechnung in 1933). It is convincingly
argued there that the theory of the continued fraction expansion should
be counted among the fields that brought infinitary events and the idea of
measure 0 into probability.
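The limiting formula for λ(an = k) quoted above lends itself to a quick numerical check. The following is a minimal sketch, not a substitute for the proofs cited: it samples points from Lebesgue measure with a pseudo-random generator, iterates the continued fraction transformation in floating point (adequate for the 10th digit, an assumption about rounding), and compares the empirical frequency of a10 = k with the limit. The function names and the sample size are hypothetical.

```python
import math
import random


def limit_probability(k):
    """The limit (1/log 2) * log(1 + 1/(k(k+2))) of lambda(a_n = k)."""
    return math.log(1.0 + 1.0 / (k * (k + 2))) / math.log(2.0)


def nth_digit(x, n):
    """n-th RCF digit of x in (0, 1) by iterating x -> 1/x - floor(1/x).

    Returns None if the orbit hits 0 before n digits are produced.
    """
    a = None
    for _ in range(n):
        if x == 0.0:
            return None
        y = 1.0 / x
        a = int(y)  # y > 0, so int() is the floor
        x = y - a
    return a


def empirical_frequency(k, n=10, samples=20000, seed=0):
    """Relative frequency of the event a_n = k over pseudo-random points of (0, 1)."""
    rng = random.Random(seed)
    hits = total = 0
    while total < samples:
        a = nth_digit(rng.random(), n)
        if a is None:
            continue
        total += 1
        hits += (a == k)
    return hits / total


for k in (1, 2, 3, 4):
    print(k, empirical_frequency(k), limit_probability(k))
```

The limit probabilities telescope: their sum over 1 ≤ k ≤ K equals log₂(2(K + 1)/(K + 2)), which tends to 1.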
2.5
This section relies on the work of Iosifescu (1994, 1997, 1999). For a = 0,
that is, under Lebesgue measure λ = γ0 , the optimal convergence rate O(g^{2n})
in Theorem 2.5.5 (without explicit lower and upper bounds) was first
shown by Dürner (1992) using a different approach. For a = 0, too, Theorem
2.2.8 with just an upper bound O(g^n) [instead of the optimal one O(g^{2n})]
was proved by a different method by Dajani and Kraaikamp (1994).
The proof given here emphasizes the importance of the generalized Brodén–
Borel–Lévy formula (1.3.21).
It is hard to understand why A. Denjoy’s 1936 Comptes Rendus Notes
went unnoticed so many years. The method of proving and generalizing
Denjoy’s results here, is quite different from that suggested by him.
3.0
The idea underlying Lemma 3.0.1 goes back to Philipp (1970). Lemma 3.0.2
is a special case of a result of Samur (1989, Lemma 2.3).
3.1
Except for Theorem 3.1.6, the results in Subsections 3.1.1 and 3.1.2 have
been proved by Samur (1989). The classical Poisson law [Theorem 3.1.2
(iii)] under any µ ≪ λ was first given a complete proof by Iosifescu
(1977), who filled a gap in an incomplete proof by Doeblin (1940, p. 358).
(1981). The functional version of this central limit theorem was proved by
Philipp and Webb (1973).
3.4
We only mention here a result not covered by those given in this section.
It is about Doeblin’s sequence (Sn )n∈N+ just discussed. Doeblin (1940, p.
361) asserted the validity of the law of the iterated logarithm
$$\lambda\left(\limsup_{n\to\infty} \frac{S_n - A_n}{\sqrt{2 A_n \log\log A_n}} = 1\right) = 1.$$
A complete proof was again given by Philipp (1970). The functional version
of this law of the iterated logarithm might follow from a more general result
in Szüsz and Volkmann (1982, p. 458).
4.0
Most of the results stated for probability measures are still valid for finite
measures and even for σ-finite, infinite measures. See, e.g., Aaronson (1997).
4.1
Khin(t)chin(e) [1934/35, 1936; 1963 (or 1964), Ch. 3] proved the a.e.
convergence of arithmetic means $\sum_{i=1}^{n} f(a_i, \cdots, a_{i+k-1})/n$, n ∈ N+ , for some
fixed k ∈ N, under an unnecessarily strong assumption on the function
f : N_+^k → R. His proofs are quite intricate since he made no use of the
Birkhoff–Khinchin (!) ergodic theorem which, as we have seen, provides
short and elegant proofs. (This should be certainly associated with the fact
that ergodic theory at the time was restricted to invertible transformations.
But even so a way out could have perhaps been found.) Unlike Khinchin,
Doeblin (1940, p. 366) did make use of the ergodic theorem. He proved that
the continued fraction transformation τ is ergodic under λ [a different proof
had been given earlier by Knopp (1926), see also Martin (1934)]. Since τ
is γ-preserving, this enabled him to derive (in an equivalent form) equa-
tion (4.1.1), thus to retrieve Khinchin’s results under weaker assumptions
in a straightforward manner. It is the appropriate place to note that, in
spite of the fact that, e.g., Billingsley (1965, p. 49) fully credits Doeblin
for the idea leading to (4.1.1), many authors assert that this idea is due to
Ryll–Nardzewski (1951). Actually, the only real advance made after 1940
in using ergodic theorems in the metric theory of the RCF expansion originated
with Nakada (1981) who, as already mentioned, introduced the natural
extension τ̄ of τ , allowing one to derive equation (4.1.6). It is again really surprising
that Doeblin (1940, p. 365) asserts that his version of Theorem 2.2.11 (see
Remark 1 following that theorem) implies that
$$\lim_{n\to\infty} \frac{1}{n}\, \mathrm{card}\{k : \Theta_k^{-1} < x,\ 1 \le k \le n\} = H(x), \qquad x \ge 1,$$
and that $n^{-1} \sum_{i=1}^{n} \Theta_i$ converges a.e. as n → ∞ to a constant (not indicated).
Now, Doeblin’s first assertion above is equivalent to the first case considered
in Corollary 4.1.22 while the second one is the first equation in Corollary
4.1.23 without the value of the limit. How did Doeblin guess these results
whose proofs involve the use of τ̄ ?
It should be noted that special cases of the Khinchin-Doeblin results
have been known before. For example, as already noted, Proposition 4.1.1
and its consequences were first proved (without convergence rates) by Lévy
(1929).
The application of the Gál–Koksma theorem to the RCF expansion,
yielding the convergence rates indicated, is due to de Vroedt (1962, 1964).
Let us finally mention that in Philipp (1967) a more general problem is
considered. Given an arbitrary sequence (In )n∈N+ of intervals contained in
I, it is shown there that for any ε > 0 the random variable
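$$\mathrm{card}\{k : \tau^k \in I_k,\ 1 \le k \le n\}, \qquad n \in \mathbf{N}_+,$$
is equal to
$$\sum_{k=1}^{n} \gamma(I_k) + O\left(\left(\sum_{k=1}^{n} \gamma(I_k)\right)^{1/2} \log^{\frac{3+\varepsilon}{2}}\left(\sum_{k=1}^{n} \gamma(I_k)\right)\right) \quad \text{a.e.}$$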
Moeckel (1982), then Jager and Liardet (1988), using quite different
methods showed—amongst other things—that if we consider modulo 2 the
sequence (qn )n∈N+ of the denominators of the RCF convergents of any given
ω ∈ Ω, then the asymptotic relative frequencies of the digit blocks 01, 10,
and 11 all are a.e. equal to 1/3. [Note that the digit block 00 cannot occur
since $|p_{n-1} q_n - p_n q_{n-1}| = 1$, n ∈ N+ .] Jager and Liardet (op. cit.) showed
that results of this kind can be easily derived from the ergodicity of a certain
skew product. To define it we need some notation. For any integer m ≥ 2
let G(m) denote the finite group of 2 × 2 matrices with entries from Z/mZ
(the classes of remainders modulo m) and determinant equal to ±1, that is,
$$G(m) = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} : a, b, c, d \in \mathbf{Z}/m\mathbf{Z},\ ad - bc = \pm 1 \right\}.$$
Here the product is taken over all prime numbers p which divide m.
Jager and Liardet’s skew product Tm : Ω × G(m) → Ω × G(m) is then
defined by
$$T_m(\omega, A) = \left(\tau(\omega),\ A \begin{pmatrix} 0 & 1 \\ 1 & a_1(\omega) \bmod m \end{pmatrix}\right), \qquad (\omega, A) \in \Omega \times G(m).$$
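The statement about the denominators qn modulo 2 can be explored numerically through the recurrence q_n = a_n q_{n−1} + q_{n−2}, which is what the second column of the matrix product in Tm tracks for m = 2. The sketch below is an illustration under a loudly stated assumption: instead of an a.e. point ω it uses the (finitely many) RCF digits of a pseudo-random rational with a 4000-bit denominator, on the heuristic that such digits behave statistically like those of a typical ω. The function names are hypothetical.

```python
import random
from collections import Counter


def rcf_digits(b, n):
    """RCF digits of b/n (0 < b < n), obtained from Euclid's algorithm."""
    digits = []
    while b != 0:
        a, r = divmod(n, b)
        digits.append(a)
        n, b = b, r
    return digits


def parity_block_frequencies(digits):
    """Relative frequencies of the blocks (q_{n-1} mod 2, q_n mod 2), n >= 1."""
    prev, cur = 0, 1  # q_{-1} = 0, q_0 = 1
    counts = Counter()
    for a in digits:
        prev, cur = cur, (a * cur + prev) % 2  # q_n = a_n q_{n-1} + q_{n-2} mod 2
        counts[(prev, cur)] += 1
    total = sum(counts.values())
    return {block: c / total for block, c in counts.items()}


rng = random.Random(1)
denominator = rng.getrandbits(4000) | (1 << 3999)  # force a 4000-bit denominator
numerator = rng.randrange(1, denominator)
freqs = parity_block_frequencies(rcf_digits(numerator, denominator))
for block in ((0, 1), (1, 0), (1, 1), (0, 0)):
    print(block, freqs.get(block, 0.0))
```

The block 00 never occurs, in agreement with the determinant identity, while the other three blocks appear with frequencies close to 1/3.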
for (ω, θ, A) ∈ Ω² × G(m). Then T̄m is γ̄ ⊗ hm -preserving, and (T̄m , γ̄ ⊗ hm ) is
an ergodic automorphism. Hence Dajani and Kraaikamp (op. cit.) deduced,
e.g., that for any integers m ≥ 2, 0 ≤ a, b ≤ m − 1, with g.c.d.(a, b, m) = 1
and for any (t1 , t2 ) ∈ I² we have
$$\lim_{n\to\infty} \frac{1}{n}\, \mathrm{card}\{k : \Theta_{k-1} < t_1,\ \Theta_k < t_2,\ p_k \equiv a,\ q_k \equiv b \bmod m,\ 1 \le k \le n\} = \frac{H(t_1, t_2)}{J(m)} \quad \text{a.e.},$$
where the distribution function H has been defined in Corollary 4.1.20.
Their paper contains a host of other results. They also showed that these
results can be extended to S-expansions (cf. Sections 4.2 and 4.3). It is
interesting to note that the sequences of numerators and denominators of
the S-convergents have – mod m – the same asymptotic behaviour as that
just indicated for the sequences of numerators and denominators of the RCF
convergents.
It may seem difficult to compare, e.g., the decimal expansion with the
RCF expansion, since their dynamics are different. However, Lochs (1964)
obtained a then surprising result that had to serve as a prototype for further
results of the same kind. Let ω ∈ Ω and consider the rational number
x_n = x_n(ω) := ⌊10^n ω⌋/10^n , which yields the first n decimal digits of ω, and
y_n = x_n + 10^{−n} , n ∈ N+ . Clearly, for n large enough we have y_n < 1. Next,
let ω = [a_1 , a_2 , · · · ], x_n = [b_1 , · · · , b_k ], and y_n = [c_1 , · · · , c_ℓ ] be the RCF
expansions of ω, x_n , and y_n , respectively, and for n ∈ N+ large enough put
$$m_n = m_n(\omega) = \max\{i \le \max(k, \ell) : b_j = c_j,\ 1 \le j \le i\}.$$
In other words, m_n(ω) is the largest integer such that the closed interval
[x_n , y_n ] is contained in the closure of the fundamental interval
I(a_1 , · · · , a_{m_n(ω)}) (containing ω). For example, if ω = ∛2 − 1 = 0.259921 · · · then
x_5 = 0.25992, y_5 = 0.25993, ω = [3, 1, 5, 1, 1, · · · ], x_5 = [3, 1, 5, 1, 1, 4, 2, 5, 1,
3], and y_5 = [3, 1, 5, 1, 1, 5, 5, 1, 2, 1, 4, 3]. Therefore m_5(ω) = 5, that is,
from the first 5 decimal digits of ω we obtain its first 5 RCF digits. Using
arithmetic properties of τ and Paul Lévy’s result (4.1.19), Lochs (op. cit.)
proved that
$$\lim_{n\to\infty} \frac{m_n}{n} = \frac{6 \log 2 \log 10}{\pi^2} = 0.97027014 \cdots \quad \text{a.e.}$$
This means that, roughly speaking, usually around 97% of the RCF digits
are determined by the decimal digits. Using an early mainframe computer,
by way of example, Lochs (1963) calculated that the first 1000 decimal digits
of π determine 968 RCF digits of it!
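The worked example with ω = ∛2 − 1 can be reproduced in exact arithmetic. The sketch below is an illustration only: the function names are hypothetical, the number ω is the one from the example above, and ⌊10^n ω⌋ is obtained from an integer cube root so that no floating-point rounding enters. It computes m_n as the length of the common prefix of the RCF expansions of x_n and y_n.

```python
import math
from fractions import Fraction


def integer_cube_root(n):
    """Largest integer r with r**3 <= n (n >= 0), found by bisection."""
    lo, hi = 0, 1
    while hi ** 3 <= n:
        hi *= 2
    while hi - lo > 1:  # invariant: lo**3 <= n < hi**3
        mid = (lo + hi) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid
    return lo


def rcf_digits(x):
    """RCF digits of a rational x in [0, 1), last digit >= 2 (Euclid's algorithm)."""
    digits = []
    b, n = x.numerator, x.denominator
    while b != 0:
        a, r = divmod(n, b)
        digits.append(a)
        n, b = b, r
    return digits


def lochs_m(n):
    """m_n for omega = 2**(1/3) - 1: common prefix length of x_n and y_n."""
    scale = 10 ** n
    floor_value = integer_cube_root(2 * scale ** 3) - scale  # floor(10^n * omega)
    x = Fraction(floor_value, scale)
    y = Fraction(floor_value + 1, scale)
    m = 0
    for b, c in zip(rcf_digits(x), rcf_digits(y)):
        if b != c:
            break
        m += 1
    return m


lochs_constant = 6 * math.log(2) * math.log(10) / math.pi ** 2
for n in (5, 50, 200):
    print(n, lochs_m(n), lochs_m(n) / n, lochs_constant)
```

For n = 5 this recovers the expansions of x_5 and y_5 listed above and hence m_5 = 5; the ratios m_n/n for larger n can be compared with Lochs' constant.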
Lochs’ result was generalized to a wider class of transformations of I by
Bosma et al. (1999). Their results are based on the Shannon–McMillan–
Breiman theorem in information theory [see Billingsley (1965, p. 129)] while
Lochs’ limit appears in fact to be the ratio of the entropies of the transfor-
mations S : I → I defined as Sx = 10 x mod 1, x ∈ I, underlying the decimal
expansion, and τ . Finally, Dajani and Fieldsteel (2001) gave wider applica-
tions and simpler proofs of results describing the rate at which the digits of
one number theoretical expansion determine those of another. Their proofs
are based on general measure-theoretic covering arguments and not on the
dynamics of specific maps.
We mention that Lochs’ problem was also considered by Faivre (1997,
1998b), who showed that (i) for any ε > 0 there exist positive constants
a < 1 and A such that
$$\lambda\left(\left|\frac{m_n}{n} - \frac{6 \log 2 \log 10}{\pi^2}\right| \ge \varepsilon\right) \le A a^n, \qquad n \in \mathbf{N}_+,$$
and (ii) the random variable $\big(m_n - 6(\log 2)(\log 10)\, n/\pi^2\big)/\sqrt{n}$ is
asymptotically N (0, σ) for some σ > 0 (which is related to the constant denoted by
the same letter in Example 3.2.11). Clearly, Lochs’ result is implied by (i)
via the Borel–Cantelli lemma.
Cassels (1959) showed that there exist numbers x which are normal in
base 3 but non-normal in any base that is not a power of 3. This result was
generalized by Schmidt (1960) as follows. Let the notation r ∼ s stand for
r, s ∈ N+ being powers of the same integer. It is fairly obvious that if r ∼ s
then normality of x in base r and normality of x in base s imply each other.
If r ≁ s then this implication does not hold. In fact, Schmidt (op. cit.) showed
that in the latter case there is a set of numbers x, having the power of the
continuum, which are normal in base r but not even simply normal in base s.
(Simple normality means that each single digit occurs with the proper
frequency.) Motivated by this,
Schweiger (1969) defined two number theoretical transformations T and S
on I (or I^d , the d-dimensional unit cube, d ∈ N+ ) to be equivalent (T ∼ S)
if there exist positive integers m, n ∈ N+ such that T^m = S^n . Schweiger
then showed that T ∼ S implies that every T -normal number is S-normal,
and conjectured that T ≁ S implies the opposite conclusion.
Surprisingly, Kraaikamp and Nakada (2000) proved that the RCF and
NICF expansions share the same set of normal numbers. Clearly, in itself
In Burton et al. (op. cit.) the natural extension of the ergodic dynamical
system underlying the λ-CF expansion was obtained for any q ≥ 3—the case
q = 3 is in fact the NICF expansion. [Previously, Nakada (op. cit.) obtained
a similar result for any even q.] From this a large number of results similar to
those holding for the RCF expansion, were obtained for the λ-CF expansion.
At first sight Nakada’s α-expansions and those of Tanaka and Ito (1981)
bear a close resemblance. Let α ∈ [1/2, 1], Iα = [α − 1, α], and define the
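transformation Tα : Iα → Iα by
$$T_\alpha(x) = \begin{cases} x^{-1} - \lfloor x^{-1} + 1 - \alpha \rfloor & \text{if } x \in I_\alpha \setminus \{0\}, \\ 0 & \text{if } x = 0. \end{cases}$$
It yields a unique Tanaka–Ito α-expansion of the form
$$x = \cfrac{1}{b_1 + \cfrac{1}{b_2 + \ddots}}, \qquad x \in I_\alpha \setminus \{0\},$$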
which is finite if and only if x is rational, and where b_i ∈ Z \ {0}, i ∈
N+ . In spite of the similarities it is much harder to obtain results for the
Tanaka–Ito α-expansions than for the Nakada α-expansions discussed
in Subsection 4.3.1. E.g., Tanaka and Ito (op. cit.) were only able to give
the explicit form of the density of the invariant measure for 1/2 ≤ α ≤ g.
For these values of α they were also able to derive the entropy of Tα . It
is interesting to note that the latter is independent of α ∈ [1/2, g], and is
equal to π²/(6 log g), which is the value corresponding to an S-expansion
with maximal singularization area.
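The digits b_i of the Tanaka–Ito α-expansion can be generated by iterating Tα, reading off b = ⌊x^{−1} + 1 − α⌋ at each step. The sketch below is an illustration under stated assumptions: it works with rational x and rational α in exact arithmetic, the function names and the `max_steps` guard are hypothetical, and the test values are arbitrary.

```python
import math
from fractions import Fraction


def tanaka_ito_digits(x, alpha, max_steps=1000):
    """Digits b_1, b_2, ... of x in [alpha - 1, alpha] under
    T_alpha(x) = 1/x - floor(1/x + 1 - alpha), stopping when the orbit hits 0."""
    x, alpha = Fraction(x), Fraction(alpha)
    digits = []
    for _ in range(max_steps):  # guard against unexpectedly long orbits
        if x == 0:
            break
        inverse = 1 / x
        b = math.floor(inverse + 1 - alpha)
        digits.append(b)
        x = inverse - b  # T_alpha(x), again a point of [alpha - 1, alpha)
    return digits


def evaluate(digits):
    """Value of the finite continued fraction 1/(b_1 + 1/(b_2 + ...))."""
    value = Fraction(0)
    for b in reversed(digits):
        value = 1 / (b + value)
    return value


print(tanaka_ito_digits(Fraction(3, 5), 1))               # alpha = 1 gives the RCF digits
print(tanaka_ito_digits(Fraction(2, 5), Fraction(1, 2)))  # negative digits may occur
print(evaluate(tanaka_ito_digits(Fraction(5, 13), Fraction(3, 4))))
```

For α = 1 the map reduces to the regular continued fraction transformation, while for α = 1/2 the example 2/5 = 1/(3 + 1/(−2)) shows a negative digit.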
References
Aliev, I., Kanemitsu, S., and Schinzel, A. (1998) On the metric theory
of continued fractions. Colloq. Math. 77, 141–146.
Araujo, A. and Giné, E. (1980) The Central Limit Theorem for Real
and Banach Valued Random Variables. Wiley, New York.
Babenko, K.I. and Jur'ev, S.P. (1978) On the discretization of a
problem of Gauss. Soviet Math. Dokl. 19, 731–735.
Bailey, D.H., Borwein, J.M., and Crandall, R.E. (1997) On the Khint-
chine constant. Math. Comp. 66, 417–431.
Bedford, T., Keane, M., and Series, C. (Eds.) (1991) Ergodic Theory,
Symbolic Dynamics and Hyperbolic Spaces. Oxford University Press,
Oxford.
Bosma, W., Dajani, K., and Kraaikamp, C. (1999) Entropy and count-
ing correct digits. Report No. 9925 (June), Univ. Nijmegen, Dept. of
Math., Nijmegen (The Netherlands).
Bosma, W., Jager, H., and Wiedijk, F. (1983) Some metrical observa-
tions on the approximation by continued fractions. Indag. Math. 45,
281–299.
Dajani, K., Kraaikamp, C., and Solomyak, B. (1996) The natural ex-
tension of the β-transformation. Acta Math. Hungar. 73, 97–109.
Davison, J.L. and Shallit, J.O. (1991) Continued fractions for some
alternating series. Monatsh. Math. 111, 119–126.
Gál, I.S. and Koksma, J.F. (1950) Sur l’ordre de grandeur des fonctions
sommables. Indag. Math. 12, 638–653.
Heyde, C.C. and Scott, D.J. (1973) Invariance principles for the law of
the iterated logarithm for martingales and processes with stationary
increments. Ann. Probab. 1, 428–436.
Ito, Sh. (1989) Algorithms with mediant convergents and their metri-
cal theory. Osaka J. Math. 26, 557–578.
Jager, H. (1985) Metrical results for the nearest integer continued frac-
tion. Indag. Math. 47, 417–427.
Jain, N.C. and Pruitt, W.E. (1975) The other law of the iterated
logarithm. Ann. Probab. 3, 1046–1049.
Jain, N.C. and Taylor, S.J. (1973) Local asymptotic laws for Brownian
motion. Ann. Probab. 1, 527–549.
Jenkinson, O. and Pollicott, M. (2001) Computing the dimension of
dynamically defined sets: E2 and bounded continued fractions. Er-
godic Theory and Dynamical Systems 21, 1429–1445.
Jain, N.C., Jogdeo, K., and Stout, W.F. (1975) Upper and lower
functions for martingales and mixing processes. Ann. Probab. 3, 119–145.
Jones, W.B. and Thron, W.J. (1980) Continued Fractions: Analytic
Theory and Applications. Addison-Wesley, Reading, Mass.
Kac, M. (1959) Statistical Independence in Probability and Statistics.
Wiley, New York.
Kaijser, T. (1983) A note on random continued fractions. Probability
and Mathematical Statistics: Essays in Honour of Carl-Gustav
Esseen, 74–84. Uppsala Univ., Dept. of Math., Uppsala.
Kakeya, S. (1924) On a generalized scale of notations. Japan J. Math.
1, 95–108.
Kalpazidou, S. (1985a) On a random system with complete connec-
tions associated with the continued fraction to the nearer integer ex-
pansion. Rev. Roumaine Math. Pures Appl. 30, 527–537.
Kalpazidou, S. (1985b) On some bidimensional denumerable chains of
infinite order. Stochastic Process. Appl. 19, 341–357.
Kalpazidou, S. (1985c) Denumerable chains of infinite order and
Hurwitz expansion. Selected Papers Presented at the 16th European
Meeting of Statisticians (Marburg, 1984). Statist. Decisions, Suppl. Issue
no. 2, 83–87.
Kalpazidou, S. (1986a) A class of Markov chains arising in the met-
rical theory of the continued fraction to the nearer integer expansion.
Rev. Roumaine Math. Pures Appl. 31, 877–890.
Kalpazidou, S. (1986b) Some asymptotic results on digits of the near-
est integer continued fraction. J. Number Theory 22, 271–279.
Kalpazidou, S. (1986c) On nearest continued fractions with stochasti-
cally independent and identically distributed digits. J. Number Theory
24, 114–125.
Keane, M.S. (1991) Ergodic theory and subshifts of finite type. In:
Bedford, T. et al. (Eds.) (1991), 35–70.
Kraaikamp, C. and Lopes, A. (1996) The theta group and the con-
tinued fraction expansion with even partial quotients. Geometriae
Dedicata 59, 293–333.
Lévy, P. (1929) Sur les lois de probabilité dont dépendent les quotients
complets et incomplets d’une fraction continue. Bull. Soc. Math. France
57, 178–194.
Magnus, W., Oberhettinger, F., and Soni, R.P. (1966) Formulas and
Theorems for the Special Functions of Mathematical Physics, 3rd Edi-
tion. Springer–Verlag, Berlin.
Nakada, H., Ito, Sh., and Tanaka, S. (1977) On the invariant measure
for the transformations associated with some real continued fraction.
Keio Engrg. Rep. 30, 159–175.
Perron, O. (1954, 1957) Die Lehre von den Kettenbrüchen. Band I:
Elementare Kettenbrüche; Band II: Analytisch-funktionentheoretische
Kettenbrüche. Teubner, Stuttgart. (1st Edition 1913; 2nd Edition 1929)
Rieger, G.J. (1977) Die metrische Theorie der Kettenbrüche seit Gauss.
Abh. Braunschweig. Wiss. Gesellsch. 27, 103–117.
Rieger, G.J. (1981b) Über die Länge von Kettenbrüchen mit ungeraden
Teilnennern. Abh. Braunschweig. Wiss. Gesellsch. 32, 61–69.
Sebe, G.I. (2001b) Gauss’ problem for the continued fraction expan-
sion with odd partial quotients revisited. Rev. Roumaine Math. Pures
Appl. 46, 839–852.
Szűsz, P. (1961) Über einen Kusminschen Satz. Acta Math. Acad. Sci.
Hungar. 12, 447–453.
Vajda, S. (1989) Fibonacci and Lucas Numbers, and the Golden Sec-
tion: Theory and Applications. E. Horwood, Chichester.
Viader, P., Paradis, J., and Bibiloni, L. (1998) A new light on Minkow-
ski’s ?(x)-function. J. Number Theory 73, 212–227.
Index