
The Metrical Theory of Continued Fractions

Marius Iosifescu and Cor Kraaikamp


Contents

Preface ix

Frequently Used Notation xv

1 Basic properties of the continued fraction expansion 1


1.1 A generalization of Euclid’s algorithm . . . . . . . . . . . . . 1
1.1.1 The continued fraction transformation τ . . . . . . . . 1
1.1.2 Continuants and convergents . . . . . . . . . . . . . . 4
1.1.3 Some special continued fraction expansions . . . . . . 11
1.2 Basic metric properties . . . . . . . . . . . . . . . . . . . . . . 14
1.2.1 Defining random variables of interest . . . . . . . . . . 14
1.2.2 Gauss’ problem and measure . . . . . . . . . . . . . . 15
1.2.3 Fundamental intervals, and applications . . . . . . . . 17
1.3 The natural extension of τ . . . . . . . . . . . . . . . . . . . . 25
1.3.1 Definition and basic properties . . . . . . . . . . . . . 25
1.3.2 Approximation coefficients . . . . . . . . . . . . . . . . 27
1.3.3 Extended random variables . . . . . . . . . . . . . . . 31
1.3.4 The conditional probability measures . . . . . . . . . . 36
1.3.5 Paul Lévy’s solution to Gauss’ problem . . . . . . . . 39
1.3.6 Mixing properties . . . . . . . . . . . . . . . . . . . . . 43

2 Solving Gauss’ problem 53


2.0 Banach space preliminaries . . . . . . . . . . . . . . . . . . . 53
2.0.1 A few classical Banach spaces . . . . . . . . . . . . . . 53
2.0.2 Bounded essential variation . . . . . . . . . . . . . . . 55
2.1 The Perron–Frobenius operator . . . . . . . . . . . . . . . . . 56
2.1.1 Definition and basic properties . . . . . . . . . . . . . 56
2.1.2 Asymptotic behaviour . . . . . . . . . . . . . . . . . . 62


2.1.3 Restricting the domain of the Perron–Frobenius operator . . . . . . . 64
2.1.4 A solution to Gauss’ problem for probability measures with densities . . 70
2.1.5 Computing variances of certain sums . . . . . . . . . . 71
2.2 Wirsing’s solution to Gauss’ problem . . . . . . . . . . . . . . 79
2.2.1 Elementary considerations . . . . . . . . . . . . . . . . 79
2.2.2 A functional-theoretic approach . . . . . . . . . . . . . 85
2.2.3 The case of Lipschitz densities . . . . . . . . . . . . . 95
2.3 Babenko’s solution to Gauss’ problem . . . . . . . . . . . . . 101
2.3.1 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . 101
2.3.2 A symmetric linear operator . . . . . . . . . . . . . . . 103
2.3.3 An ‘exact’ Gauss–Kuzmin–Lévy theorem . . . . . . . 111
2.3.4 ψ-mixing revisited . . . . . . . . . . . . . . . . . . . . 119
2.4 Extending Babenko’s and Wirsing’s work . . . . . . . . . . . 120
2.4.1 The Mayer–Roepstorff Hilbert space approach . . . . 120
2.4.2 The Mayer–Roepstorff Banach space approach . . . . 127
2.4.3 Mayer–Ruelle operators . . . . . . . . . . . . . . . . . 130
2.5 The Markov chain associated with the continued fraction expansion . . . 135
2.5.1 The Perron–Frobenius operator on BV (I) . . . . . . . 135
2.5.2 An upper bound . . . . . . . . . . . . . . . . . . . . . 139
2.5.3 Two asymptotic distributions . . . . . . . . . . . . . . 151
2.5.4 A generalization of a result of A. Denjoy . . . . . . . . 156

3 Limit theorems 165


3.0 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
3.1 The Poisson law . . . . . . . . . . . . . . . . . . . . . . . . . 169
3.1.1 The case of incomplete quotients . . . . . . . . . . . . 169
3.1.2 The case of associated random variables . . . . . . . 171
3.1.3 Some extreme value theory . . . . . . . . . . . . . . . 173
3.2 Normal convergence . . . . . . . . . . . . . . . . . . . . . . . 179
3.2.1 Two general invariance principles . . . . . . . . . . . . 179
3.2.2 The case of incomplete quotients . . . . . . . . . . . . 182
3.2.3 The case of associated random variables . . . . . . . . 188
3.3 Convergence to non-normal stable laws . . . . . . . . . . . . . 196
3.3.1 The case of incomplete quotients . . . . . . . . . . . . 196
3.3.2 Sums of incomplete quotients . . . . . . . . . . . . . . 202
3.3.3 The case of associated random variables . . . . . . . . 207
3.4 Fluctuation results . . . . . . . . . . . . . . . . . . . . . . . . 213

3.4.1 The case of incomplete quotients . . . . . . . . . . . . 213


3.4.2 The case of associated random variables . . . . . . . . 215

4 Ergodic theory of continued fractions 219


4.0 Ergodic theory preliminaries . . . . . . . . . . . . . . . . . . . 219
4.0.1 A few general concepts . . . . . . . . . . . . . . . . . . 219
4.0.2 The special case of the transformations τ and τ̄ . . . . 224
4.1 Classical results and generalizations . . . . . . . . . . . . . . 225
4.1.1 The case of incomplete quotients . . . . . . . . . . . . 225
4.1.2 Empirical evidence, and normal continued fraction numbers . . . . . . 240
4.1.3 The case of associated and extended random variables 244
4.2 Other continued fraction expansions . . . . . . . . . . . . . . 257
4.2.1 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . 257
4.2.2 Semi-regular continued fraction expansions . . . . . . 260
4.2.3 The singularization process . . . . . . . . . . . . . . . 264
4.2.4 S-expansions . . . . . . . . . . . . . . . . . . . . . . . 266
4.2.5 Ergodic properties of S-expansions . . . . . . . . . . . 273
4.3 Examples of S-expansions . . . . . . . . . . . . . . . . . . . . 281
4.3.1 Nakada’s α-expansions . . . . . . . . . . . . . . . . . . 281
4.3.2 Minkowski’s diagonal continued fraction expansion . . 289
4.3.3 Bosma’s optimal continued fraction expansion . . . . . 292
4.4 Continued fraction expansions with σ-finite, infinite invariant measure . . 299
4.4.1 The insertion process . . . . . . . . . . . . . . . . . . . 299
4.4.2 The Lehner and Farey continued fraction expansions . 300
4.4.3 The backward continued fraction expansion . . . . . . 307

Appendix 1: Spaces, functions, and measures 313


A1.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 313
A1.2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 313
A1.3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 314
A1.4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 314
A1.5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 316
A1.6 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 319

Appendix 2: Regularly varying functions 321


A2.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 321
A2.2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 323
A2.3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 324

Appendix 3: Limit theorems for mixing random variables 325


A3.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 325
A3.2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 327
A3.3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 328

Notes and Comments 333

References 347

Index 377
Preface

This monograph is intended to be a complete treatment of the metrical theory
of the (regular) continued fraction expansion and related representations
of real numbers. We have attempted to give the best possible results known
so far, with proofs which are the simplest and most direct.
The book has had a long gestation period because we first decided to
write it in March 1994. This gave us the possibility of essentially improving
the initial versions of many parts of it. Even if the two authors are different
in style and approach, every effort has been made to hide the differences.
Let $\Omega$ denote the set of irrationals in $I = [0, 1]$. Define the (regular) continued fraction transformation $\tau$ by $\tau(\omega)$ = fractional part of $1/\omega$, $\omega \in \Omega$. Write $\tau^n$ for the $n$th iterate of $\tau$, $n \in \mathbf{N} = \{0, 1, \cdots\}$, with $\tau^0$ = identity map. The positive integers $a_n(\omega) = a_1(\tau^{n-1}(\omega))$, $n \in \mathbf{N}_+ = \{1, 2, \cdots\}$, where $a_1(\omega)$ = integer part of $1/\omega$, $\omega \in \Omega$, are called the (regular continued fraction) digits of $\omega$. Writing

\[ [x_1] = 1/x_1, \qquad [x_1, \cdots, x_n] = 1/(x_1 + [x_2, \cdots, x_n]), \quad n \ge 2, \]

for arbitrary indeterminates $x_i$, $1 \le i \le n$, we have

\[ \omega = \lim_{n\to\infty} [a_1(\omega), \cdots, a_n(\omega)], \quad \omega \in \Omega, \]

thus explaining the name of $\tau$. The above equation will also be written as

\[ \omega = [a_1(\omega), a_2(\omega), \cdots], \quad \omega \in \Omega. \]

The $a_n$, $n \in \mathbf{N}_+$, to be called incomplete quotients, are clearly positive integer-valued random variables which are defined almost surely on $(I, \mathcal{B}_I)$ with respect to any probability measure assigning probability 0 to the set $I \setminus \Omega$ of rationals in $I$. (Here $\mathcal{B}_I$ denotes the σ-algebra of Borel subsets of $I$.) The metrical theory of the (regular) continued fraction expansion is about the sequence $(a_n)_{n \in \mathbf{N}_+}$ of its incomplete quotients, and related sequences.


C.F. Gauss stated in 1812 that, in current notation,

\[ \lim_{n\to\infty} \lambda(\tau^{-n}([0, x))) = \gamma([0, x]), \quad x \in I, \]

where $\lambda$ denotes Lebesgue measure and $\gamma$ is what we now call Gauss’ measure, defined by

\[ \gamma(A) = \frac{1}{\log 2} \int_A \frac{dx}{x + 1}, \quad A \in \mathcal{B}_I. \]

Gauss asked for an estimate of the convergence rate in the above limiting
relation, and this has actually been the first problem of the metrical theory
of continued fractions. Ramifications of this problem, which was given a
first solution only in 1928, still pervade the current developments. Chapter
2 contains a detailed treatment of Gauss’ problem by an elementary ap-
proach and functional-theoretic methods as well. The latter are applied to
the Perron–Frobenius operator associated with τ , considered as acting on
various Banach spaces including that of functions of bounded variation on
I.
Gauss’ measure is important since it is preserved by $\tau$, that is, $\gamma(\tau^{-1}(A)) = \gamma(A)$ for any $A \in \mathcal{B}_I$. This implies that, by its very definition, the sequence $(a_n)_{n \in \mathbf{N}_+}$ is strictly stationary under $\gamma$. As such, there should exist a doubly infinite version of it, say $(\bar a_\ell)_{\ell \in \mathbf{Z}}$, $\mathbf{Z} = \{\cdots, -1, 0, 1, \cdots\}$, defined on a richer probability space. It appears that this doubly infinite version can be effectively constructed on $(I^2, \mathcal{B}_I^2, \bar\gamma)$, where $\bar\gamma$ is the so-called extended Gauss’ measure defined by

\[ \bar\gamma(B) = \frac{1}{\log 2} \iint_B \frac{dx\,dy}{(xy + 1)^2}, \quad B \in \mathcal{B}_I^2. \]

Put $\bar a_{-n}(\omega, \theta) = a_{n+1}(\theta)$, $\bar a_0(\omega, \theta) = a_1(\theta)$, $\bar a_n(\omega, \theta) = a_n(\omega)$ for any $n \in \mathbf{N}_+$ and $(\omega, \theta) \in \Omega^2$. Then whatever $\ell \in \mathbf{Z}$, $k \in \mathbf{N}$, and $n \in \mathbf{N}_+$ the probability distribution of the random vector $(\bar a_\ell, \cdots, \bar a_{\ell+k})$ under $\bar\gamma$ is identical with that of the random vector $(a_n, \cdots, a_{n+k})$ under $\gamma$, that is, $(\bar a_\ell)_{\ell \in \mathbf{Z}}$ under $\bar\gamma$ is a doubly infinite version of $(a_n)_{n \in \mathbf{N}_+}$ under $\gamma$. A distinctive feature of our treatment is the consistent use of the extended incomplete quotients $\bar a_\ell$, $\ell \in \mathbf{Z}$. It appears that

\[ \bar\gamma([0, x] \times I \mid \bar a_0, \bar a_{-1}, \cdots) = \frac{(a + 1)x}{ax + 1} \quad \bar\gamma\text{-a.s.} \]

for any $x \in I$, where $a = [\bar a_0, \bar a_{-1}, \cdots]$, which in turn implies that

\[ \bar\gamma(\bar a_{\ell+1} = i \mid \bar a_\ell, \bar a_{\ell-1}, \cdots) = \frac{a + 1}{(a + i)(a + i + 1)} \quad \bar\gamma\text{-a.s.} \]


for any $i \in \mathbf{N}_+$ and $\ell \in \mathbf{Z}$. The last equation emphasizes a ‘chain of infinite order’ structure of the incomplete quotients when properly defined on a richer probability space. This idea goes back to W. Doeblin (1940) and, hopefully, is fully clarified by our treatment. Also, the considerations above motivate the introduction of the family $(\gamma_a)_{a \in I}$ of probability measures on $\mathcal{B}_I$ defined by their distribution functions

\[ \gamma_a([0, x]) = \frac{(a + 1)x}{ax + 1}, \quad x \in I. \]

In particular, $\gamma_0 = \lambda$. Besides $\gamma$, these probability measures, which we call conditional, are the most natural ones associated with the regular continued fraction expansion. It appears that $(\bar a_\ell)_{\ell \in \mathbf{Z}}$ is $\psi$-mixing under $\bar\gamma$ while $(a_n)_{n \in \mathbf{N}_+}$ is $\psi$-mixing under $\gamma$ and any $\gamma_a$, $a \in I$, and that the $\psi$-mixing coefficients of the latter under $\gamma$ (which are equal to the corresponding ones of the former under $\bar\gamma$) can in principle be exactly calculated. The facts just described are part of our Chapter 1.
Chapter 3 is devoted to limit theorems for incomplete quotients, related
random variables, and their extended versions. These include weak conver-
gence to the Poisson, normal, and non-normal stable laws as well as the law
of the iterated logarithm, in both classical and functional approaches, and
are essentially based, in general, on the $\psi$-mixing property of both $(\bar a_\ell)_{\ell \in \mathbf{Z}}$ and $(a_n)_{n \in \mathbf{N}_+}$.
The treatment of the ergodic properties of the regular continued fraction expansion, leading to strong laws of large numbers, is deferred to Chapter 4. The reason is
that whilst these properties are inherited by the continued fraction expan-
sions which can be derived from the regular continued fraction expansion
by the procedures called singularization and insertion, the limit properties
in Chapter 3 do not transfer automatically to continued fraction expansions
so derived. We give applications of the ergodic properties of the continued
fraction transformation τ and its natural extension τ̄ . After an introduc-
tion, in which several general ergodic theoretical concepts and results—such
as Birkhoff’s ergodic theorem—are described, various classical results and
important recent results, based on the natural extension, are derived. It is
then shown that—via singularization and insertion—the ergodic properties
of very many other continued fraction expansions can easily be obtained.
In particular, the ergodic properties of the so called S-expansions are de-
scribed in detail. Several examples of S-expansions are studied, such as
Nakada’s α-expansions, Minkowski’s diagonal continued fraction expansion
and Bosma’s optimal continued fraction expansion. Also, the connection
between the regular continued fraction expansion and continued fraction

expansions with σ-finite, infinite invariant measures, such as the backward
continued fraction expansion and Lehner’s continued fraction expansion, is
explained.
To make the book as self-contained as reasonably possible, we have included three appendices containing lesser-known notions and results from measure theory, regularly varying functions, and limit theorems for mixing sequences of random variables, which we use frequently, especially in Chapter 3. We urge the reader to become familiar with the appendices early on so
as to be aware of what can be found there as needed. We also warn the
reader that Chapter 3 and some subsections of Chapter 2 are more involved
or more abstract, and thus they make more difficult reading.
The concluding notes and comments aim at giving credit, pointing
to results not included in the main text, or tracing historical developments.
The references list greatly exceeds the number of works quoted in the
course of the book. It should be consulted with the purpose of discovering
historical sources, parallel research, and starting points for new investiga-
tions.
For what our work is not, the reader is referred to the books by Brezinski
(1991) and von Plato (1994)—for the history of continued fractions—Jones
and Thron (1980), Lorenzen and Waadeland (1992), Olds (1963), Perron
(1954, 1957), Rockett and Szüsz (1992), Schmidt (1980), Sprindžuk (1979),
Sudan (1959), and Wall (1948)—for various, mainly non-metric, aspects of
the theory of continued fractions.

Acknowledgements

Much of our original work included in this book has been carried out in the
framework of our association with the Bucharest ‘Gheorghe Mihoc’ Centre
for Mathematical Statistics of the Romanian Academy, and the Department
of Probability and Statistics (CROSS), Faculty ITS, of the Delft University
of Technology.
Many institutions and persons have helped us in various ways.
The first of us wishes to acknowledge the hospitality of Université René
Descartes – Paris 5, Université des Sciences et des Technologies de Lille,
and Université Victor Segalen – Bordeaux 2. He is grateful to Bui Trong
Lieu, Michel Schreiber (both of Paris 5), George Haiman (Lille), and Jean-
Marc Deshouillers (Bordeaux 2) for their kind invitations at these locations
where his stays in the period 1996–1999 were very helpful in completing
parts of the book. He is also grateful to the Nederlandse Organisatie voor

Wetenschappelijk Onderzoek (NWO), the Dutch organization for scientific
research, for two one-month research grants in the years 2000 and 2001,
and to the Department of Probability and Statistics (CROSS) for invitations
allowing several short stays in Delft during which much of the joint work on
the book was done. A short stay in the spring of 2000 at the Department of
Mathematics of Uppsala University, for which he is grateful to Allan Gut,
was very beneficial for gathering recent literature on the subject. Last
but not least, he gratefully acknowledges generous financial support
in the years 2000 and 2001 from a French–Romanian CNRS International
Project of Scientific Cooperation (PICS) directed by Haïm Brezis and Doina
Ciorănescu (both of Université Pierre et Marie Curie – Paris 6). This allowed
him to spend more time in Delft, which was decisive for completing the book.
Finally, he wishes to acknowledge the technical help he has received from
Adriana Grădinaru who changed his handwritten, hardly legible drafts into
a camera ready copy.
The second author would also like to thank the Romanian Academy for
their support during his visits to Bucharest.
Adriana Berechet read several versions of the typescript, and with her
penetrating mind detected some inaccuracies and slips. Expressing our in-
debtedness to her, we wish to make it clear that any remaining errors are
our own.

Finally, we must thank all the people with Kluwer Academic Publishers
who helped during the development and production of this book project.

Delft, November 2001 M.I.


C.K.
Frequently Used Notation

Abbreviations

a.e. = almost everywhere (with respect to Lebesgue measure)

a.s. = almost surely (with respect to any other measure)

Cov = covariance

g.c.d. = greatest common divisor

i.i.d.= independent identically distributed

i.o. = infinitely often

log = natural logarithm

p.m. = probability measure

s.i. = strongly infinitesimal

r.v. = random variable

var = total variation

Var = variance

□ = end of example, proof, or remark


Symbols

N = {0, 1, 2, · · · } , N+ = {1, 2, · · · } , −N = {· · · , −2, −1, 0}

Z = (−N) ∪ N+ = {· · · , −1, 0, 1, · · · }

Q = the set of rational numbers

R = the set of real numbers

$\lfloor a \rfloor$ = integer part of $a \in \mathbf{R}$

$\{a\}$ = fractional part of $a \in \mathbf{R}$

$\mathbf{R}_+ = \{x \in \mathbf{R} : x \ge 0\}$, $\mathbf{R}_{++} = \{x \in \mathbf{R} : x > 0\}$

I = [0, 1] = the unit interval of R

Ω = I \ Q = the set of irrationals in I

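C = the set of complex numbers

$i = \sqrt{-1}$ (imaginary unit)

$z^*$ = complex conjugate of $z \in \mathbf{C}$

$\mathbf{R}^n$ = real $n$-vector space, or Euclidean $n$-space, $n \in \mathbf{N}_+$; $\mathbf{R}^1 = \mathbf{R}$

$\mathcal{B}^n$ = σ-algebra of Borel sets in $\mathbf{R}^n$; $\mathcal{B}^1 = \mathcal{B}$

$\mathcal{B}_M = \mathcal{B}^n \cap M := \{B \cap M : B \in \mathcal{B}^n\}$, $M \in \mathcal{B}^n$, $n \in \mathbf{N}_+$

$\mathcal{B}_I = \mathcal{B} \cap I$ = σ-algebra of Borel sets in $I$

$\mathcal{B}_I^2 = \mathcal{B}_{I^2}$ = σ-algebra of Borel sets in $I^2$

$A^c$ = complementary set of the set $A$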



$I_A$ = indicator function of the set $A$

$\partial A$ = boundary of the Borel set $A$

$\delta_x$ = p.m. concentrated at the point $x$

$\lambda$ = Lebesgue measure on $\mathcal{B}$

$\lambda_2$ = Lebesgue measure on $\mathcal{B}^2$

N (0, 1) = standard normal distribution

Φ = standard normal distribution function

P (θ) = Poisson distribution with parameter θ

$P f^{-1}$ = $P$-distribution of r.v. $f$

∗ = convolution of measures

⊗ = product of σ-algebras or measures

C = 0.577 215 · · · (Euler’s constant)

$F_n$ = $n$th Fibonacci number: $F_0 = F_1 = 1$, $F_{n+1} = F_n + F_{n-1}$, $n \in \mathbf{N}_+$

$g = (\sqrt{5} - 1)/2$, $G = g + 1$ (‘golden ratios’)

$K_0 = 2.685\,452 \cdots$ (Khinchin’s constant)

$K_{-1} = 1.745\,405 \cdots$ (Khinchin’s constant)

$\lambda_0 = 0.303\,663\,002\,898\,732\,568 \cdots$ (Wirsing’s constant)

$\zeta(2) = \sum_{i \in \mathbf{N}_+} i^{-2} = \pi^2/6$

an , 3, 14 || · || L , 54

ā` , 31 Lp , 55

B(I), 53 ||·||p , 55

|| · ||, 53 L∞ , 55

BEV (I), 55 ||.||∞ , 55

||·||v , 56 Lpµ , 54

||·||v,µ , 56 ||·||p,µ , 54

BV (I), 54 L∞
µ , 55

|| · || v , 54 ||.||∞,µ , 55

C, 319 m(X ), 314

C(I), 53 µ-ess sup, 55

C 1 (I), 53 να , 197

|| · || 1 , 53 Pλ , 60

cτ Pois µ, 317 Pi , 22

γ, 16 Pi1 ···in , 136

γ̄, 26 Pµ , 57

γa , 36 pn , 4, 19

d0 , 319 pen , 261

dP , 315 peen , 265

D = D(I), 319 Pois µ, 317

ess sup, 55 pr(X ), 314

F, 324 qn , 4, 19

G, Gan , 39 qne , 261

L(I), 53 qene , 265



Qν , 328 W , 319

rn , 14 yn , 15

r̄` , 34 y ` , 34

s (f ), 53

sn , 14

san , 36

s̄` , 34

σ (C), 313

σ ((fi )i∈I ), 314

ten , 263
e
ten , 273

τ, 2

τ , 25

Θn , 27

Θ0n , 251

Θen , 263
e e , 280
Θ n
¡ (n) ¢
u i , 18

U := Pγ , 59

un , 14

uan , 38

ū` , 34

v (f ), 55
¡ ¢
v i(n) , 18
Chapter 1

Basic properties of the continued fraction expansion

In this chapter the (regular) continued fraction expansion is introduced and
notation fixed. Some basic properties to be used in subsequent chapters are
also derived.

1.1 A generalization of Euclid’s algorithm


1.1.1 The continued fraction transformation τ
In Proposition 2 of Book VII, Euclid gave an algorithm (now bearing his name) for finding the greatest common divisor (g.c.d.) of two given integers: let $a, b \in \mathbf{Z}$ and assume for convenience that $a > b > 0$. Put

\[ v_0 := a, \qquad v_1 := b, \]

and determine $a_1 \in \mathbf{N}_+$, $v_2 \in \mathbf{N}$, such that

\[ v_0 = a_1 v_1 + v_2, \]

where $0 \le v_2 < v_1$. If $v_2 \ne 0$ then we repeat this procedure and obtain

\[ v_1 = a_2 v_2 + v_3, \]

where $0 \le v_3 < v_2$. In general, if $v_m \ne 0$ for some $m \ge 2$, then we obtain

\[ v_{m-1} = a_m v_m + v_{m+1}, \tag{1.1.1} \]


where $0 \le v_{m+1} < v_m$. Clearly, the procedure should stop after finitely many steps: there exists $n \in \mathbf{N}_+$ such that $v_n \ne 0$ and $v_{n+1} = 0$. Then, as is well known, we have

\[ v_n = \mathrm{g.c.d.}(a, b). \]
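The division scheme (1.1.1) can be sketched computationally as follows. This is a minimal illustration, assuming integer inputs with $a > b > 0$; the function name `euclid` and its return convention are our own choices and do not come from the text.

```python
def euclid(a: int, b: int) -> tuple[list[int], int]:
    """Run Euclid's algorithm on integers a > b > 0.

    Returns the quotients a_1, ..., a_n of (1.1.1) together with
    v_n = g.c.d.(a, b), the last non-zero remainder.
    """
    if not a > b > 0:
        raise ValueError("expected a > b > 0")
    v_prev, v_cur = a, b                      # v_0, v_1
    quotients = []
    while v_cur != 0:
        # v_{m-1} = a_m v_m + v_{m+1} with 0 <= v_{m+1} < v_m
        a_m, v_next = divmod(v_prev, v_cur)
        quotients.append(a_m)
        v_prev, v_cur = v_cur, v_next
    return quotients, v_prev
```

For instance, `euclid(43, 19)` runs through 43 = 2·19 + 5, 19 = 3·5 + 4, 5 = 1·4 + 1, 4 = 4·1 + 0, so the quotients are 2, 3, 1, 4 and the g.c.d. is 1.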
Remark. The running time of Euclid’s algorithm depends on the number of division steps required to get the g.c.d. of the given positive integers $v_0 > v_1$. In an 1844 paper of the French mathematician Gabriel Lamé it is essentially shown that (i) given $n \in \mathbf{N}_+$, if Euclid’s algorithm applied to $v_0$ and $v_1$ requires exactly $n$ division steps and $v_0$ is as small as possible satisfying this condition, then $v_0 = F_{n+1}$ and $v_1 = F_n$; (ii) if $v_1 < v_0 < m \in \mathbf{N}_+$, then the number of division steps required by Euclid’s algorithm when applied to $v_0$ and $v_1$ is at most

\[ \left\lfloor \log(\sqrt{5}\,m) / \log((\sqrt{5} + 1)/2) \right\rfloor - 2 \approx \lfloor 2.078 \log m + 1.672 \rfloor - 2, \]

where $\lfloor\,\cdot\,\rfloor : \mathbf{R} \to \mathbf{Z}$ is the greatest integer function, that is,

\[ \lfloor x \rfloor = \text{greatest integer not exceeding } x \in \mathbf{R}. \]

For historical details we refer the reader to Shallit (1994), and for recent developments to Knuth (1981, Section 4.5.3) and Hensley (1994). It should be noted that the latter are based on results to be proved in this and later chapters. □
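Both parts of the remark can be checked numerically for small inputs. The sketch below is only an illustration under our own assumptions: the helper names are hypothetical, the Fibonacci numbers follow the convention $F_0 = F_1 = 1$ of the notation list, and the bound in (ii) is evaluated in floating point, which is adequate for small $m$ but not a rigorous computation.

```python
import math


def division_steps(v0: int, v1: int) -> int:
    """Number of division steps Euclid's algorithm needs for v0 > v1 > 0."""
    steps = 0
    while v1 != 0:
        v0, v1 = v1, v0 % v1
        steps += 1
    return steps


def fibonacci(n: int) -> int:
    """F_n with the convention F_0 = F_1 = 1."""
    a, b = 1, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def lame_bound(m: int) -> int:
    """Bound (ii): floor(log(sqrt(5) m) / log((sqrt(5) + 1) / 2)) - 2."""
    golden = (math.sqrt(5) + 1) / 2
    return math.floor(math.log(math.sqrt(5) * m) / math.log(golden)) - 2
```

With these helpers, `division_steps(fibonacci(n + 1), fibonacci(n))` equals `n`, in line with (i), and for every pair $v_1 < v_0 < m$ the step count stays at or below `lame_bound(m)`.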
To consider Euclid’s algorithm more closely we define the so-called continued fraction transformation $\tau : I \to I$ by

\[ \tau(x) = \begin{cases} x^{-1} - \lfloor x^{-1} \rfloor & \text{if } x \ne 0, \\ 0 & \text{if } x = 0. \end{cases} \]

Then putting $x = b/a$ we obviously have

\[ a_1 = a_1(x) = \lfloor v_0/v_1 \rfloor, \ \cdots, \ a_n = a_n(x) = \lfloor v_{n-1}/v_n \rfloor \]

and

\[ \frac{v_m}{v_{m-1}} = \tau^{m-1}(x), \quad 1 \le m \le n, \qquad \tau^n(x) = 0, \]

where $\tau^0$ = identity map and $\tau^\ell$, $\ell \in \mathbf{N}_+$, is the composition of $\tau$ with itself $\ell$ times. Note that

\[ a_m(x) = a_1(\tau^{m-1}(x)), \quad 1 \le m \le n. \tag{1.1.2} \]
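For rational $x$ the map $\tau$ and relation (1.1.2) can be evaluated exactly. The following sketch assumes $x$ is given as a `fractions.Fraction` in $[0, 1)$; the names `tau` and `digits` are ours, chosen for illustration.

```python
from fractions import Fraction


def tau(x: Fraction) -> Fraction:
    """Continued fraction transformation: fractional part of 1/x, tau(0) = 0."""
    if x == 0:
        return Fraction(0)
    y = 1 / x
    return y - (y.numerator // y.denominator)


def digits(x: Fraction) -> list[int]:
    """Digits a_m(x) = a_1(tau^{m-1}(x)) of a rational x in [0, 1), as in (1.1.2)."""
    out = []
    while x != 0:
        y = 1 / x
        out.append(y.numerator // y.denominator)   # a_1 of the current iterate
        x = tau(x)
    return out
```

With $a = 43$, $b = 19$, that is $x = 19/43$, the iterates of $\tau$ are $5/19$, $4/5$, $1/4$, $0$, which are exactly the ratios $v_{m}/v_{m-1}$ produced by Euclid's algorithm, and the digits are 2, 3, 1, 4.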

As $v_{m-1} = a_m v_m + v_{m+1}$, we have

\[ \frac{1}{\tau^{m-1}(x)} = a_m + \tau^m(x), \quad 1 \le m \le n. \]

If for arbitrary indeterminates $x_i$, $1 \le i \le n$, $n \in \mathbf{N}_+$, we write

\[ [x_1] = \frac{1}{x_1}, \qquad [x_1, \cdots, x_n] = \frac{1}{x_1 + [x_2, \cdots, x_n]}, \quad n \ge 2, \]

then it follows that

\[ x = [a_1 + \tau(x)] = [a_1, \cdots, a_{m-1}, a_m + \tau^m(x)] = [a_1, \cdots, a_n] \tag{1.1.3} \]

for $1 < m \le n$. An expression as on the right hand side of (1.1.3) is called a finite (regular) continued fraction (RCF for short). It follows from Euclid’s algorithm that each rational number $x \notin \mathbf{Z}$ can be written as

\[ x = a_0 + [a_1, \cdots, a_n], \tag{1.1.4} \]

where $a_0 = \lfloor x \rfloor$. (Note that for any $x \in \mathbf{R}$, $x \notin \mathbf{Z}$, the fractional part $x - \lfloor x \rfloor$ of $x$ is a number in the open interval $(0, 1)$!) The right hand side of (1.1.4) will be denoted by

\[ [a_0; a_1, \cdots, a_n]. \]

Euclid’s algorithm yields $a_n \ge 2$. Hence each rational number $x \notin \mathbf{Z}$ has two continued fraction expansions, namely,

\[ [a_0; a_1, \cdots, a_n] = [a_0; a_1, \cdots, a_n - 1, 1]. \]
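The bracket notation and the two expansions of a rational number can be illustrated with exact arithmetic. This is a small sketch under the assumption that all digits are positive integers; `bracket` and `rcf` are hypothetical helper names.

```python
from fractions import Fraction


def bracket(xs) -> Fraction:
    """[x_1, ..., x_n] = 1 / (x_1 + [x_2, ..., x_n]), with [x_1] = 1 / x_1.

    The empty bracket is taken to be 0, so that rcf(a0, []) = a0.
    """
    value = Fraction(0)
    for x in reversed(xs):
        value = 1 / (x + value)
    return value


def rcf(a0: int, digits) -> Fraction:
    """[a_0; a_1, ..., a_n] = a_0 + [a_1, ..., a_n], as in (1.1.4)."""
    return a0 + bracket(digits)
```

For example, `rcf(2, [3, 1, 4])` and `rcf(2, [3, 1, 3, 1])` both evaluate to $43/19$, the second being the expansion with last digit 1.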

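Of course, there is no reason whatsoever to stick to rationals. Let $x \in \mathbf{R} \setminus \mathbf{Q}$ and, as in the case of rationals, put $a_0 = \lfloor x \rfloor$. It follows from the very definition of $\tau$ that

\[ \tau^n(x - a_0) \in \Omega = I \setminus \mathbf{Q}, \quad n \in \mathbf{N}. \]

Let us define

\[ a_n = a_n(x) = \lfloor 1/\tau^{n-1}(x - a_0) \rfloor, \quad n \in \mathbf{N}_+, \]

so that, similarly to (1.1.2),

\[ a_n(x) = a_1(\tau^{n-1}(x - a_0)), \quad n \in \mathbf{N}_+. \tag{1.1.2$'$} \]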

Hence

\[ x = [a_0; a_1 + \tau(x - a_0)] = \cdots = [a_0; a_1, \cdots, a_{n-1}, a_n + \tau^n(x - a_0)] \tag{1.1.5} \]

for any $n \ge 2$.
The two cases $x \in \mathbf{Q}$ and $x \in \mathbf{R} \setminus \mathbf{Q}$ can be treated in a unitary manner if we define

\[ a_1(0) = \infty, \]

the symbol $\infty$ being subject to the rules $1/\infty = 0$, $1/0 = \infty$. Equations (1.1.5) are then valid for any $x \in \mathbf{R}$. Clearly, for any $x \in \mathbf{Q}$ there exists $n = n(x) \in \mathbf{N}_+$ such that $a_m(x) = \infty$ for any $m \ge n$.
The integers $a_1(x), a_2(x), \cdots$ will be called the (continued fraction) digits of $x \in \mathbf{R}$ whilst the functions $x \mapsto a_i(x) \in \mathbf{N}_+ \cup \{\infty\}$, $x \in \mathbf{R}$, $i \in \mathbf{N}_+$, will be called the incomplete (or partial) quotients of the continued fraction expansion. Euclid’s algorithm implies that $x \in \mathbf{R}$ has finitely many finite continued fraction digits if and only if $x \in \mathbf{Q}$.
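For an irrational $x$ the digits can only be approximated on a computer. The sketch below iterates the definition in double precision; this is an assumption of convenience, since rounding errors grow at every step and only the first few digits can be trusted. The function name `first_digits` is ours, and the code simply stops when an iterate is exactly 0 instead of returning the symbol $\infty$.

```python
import math


def first_digits(x: float, count: int) -> tuple[int, list[int]]:
    """Return a_0 = floor(x) and up to `count` digits a_1, a_2, ... of x.

    Floating-point sketch: the digits are reliable only while the
    accumulated rounding error stays small.
    """
    a0 = math.floor(x)
    t = x - a0                      # tau^0(x - a_0)
    digits = []
    for _ in range(count):
        if t == 0:                  # finite expansion reached
            break
        y = 1 / t
        a = math.floor(y)           # a_n = floor(1 / tau^{n-1}(x - a_0))
        digits.append(a)
        t = y - a                   # tau^n(x - a_0)
    return a0, digits
```

Applied to `math.pi` with `count=4` this yields $a_0 = 3$ and the digits 7, 15, 1, 292; applied to `math.sqrt(2)` it yields $a_0 = 1$ followed by digits equal to 2.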

1.1.2 Continuants and convergents


Throughout the first three chapters, without express mention to the contrary, we will assume that $x \in [0, 1)$, which implies that $a_0 = 0$, and write

\[ [0; a_1, \cdots, a_n] = [a_1, \cdots, a_n], \quad n \in \mathbf{N}_+. \]

We will usually drop the dependence on $x$ in the notation. Define

\[ \omega_0 = 0, \qquad \omega_n = \omega_n(x) = [a_1, \cdots, a_n], \quad x \in [0, 1), \ n \in \mathbf{N}_+. \]

Clearly, $\omega_n \in \mathbf{Q}$, say

\[ \omega_n = \frac{p_n}{q_n}, \quad n \in \mathbf{N}_+, \]

where $p_n, q_n \in \mathbf{N}_+$ and $\mathrm{g.c.d.}(p_n, q_n) = 1$. The number $\omega_n = \omega_n(x)$ is called the $n$th (regular continued fraction) (RCF) convergent of $x$, $n \in \mathbf{N}$. As a rule, in the first three chapters the specification RCF will be dropped. Clearly, for any $x \in \mathbf{Q}$ there exists $n = n(x) \in \mathbf{N}$ such that $\omega_m(x) = x$ for any $m \ge n$. We shall show that for any irrational $\omega \in \Omega := I \setminus \mathbf{Q}$ we have

\[ \lim_{n\to\infty} \omega_n(\omega) = \omega. \]

For that we need some preparation. Define recursively polynomials $Q_n$ of $n$ variables, $n \in \mathbf{N}$, by

\[ Q_n(x_1, \cdots, x_n) = \begin{cases} 1 & \text{if } n = 0, \\ x_1 & \text{if } n = 1, \\ x_1 Q_{n-1}(x_2, \cdots, x_n) + Q_{n-2}(x_3, \cdots, x_n) & \text{if } n \ge 2. \end{cases} \]

Thus

\[ Q_2(x_1, x_2) = x_1 x_2 + 1, \qquad Q_3(x_1, x_2, x_3) = x_1 x_2 x_3 + x_1 + x_3, \]
\[ Q_4(x_1, x_2, x_3, x_4) = x_1 x_2 x_3 x_4 + x_1 x_2 + x_1 x_4 + x_3 x_4 + 1, \]

etc. In general, as noted by Leonhard Euler, for any $n \in \mathbf{N}_+$, $Q_n(x_1, \cdots, x_n)$ is the sum of all terms which can be obtained starting from $x_1 \cdots x_n$ and deleting zero or more non-overlapping pairs $(x_i, x_{i+1})$ of consecutive variables. There are $F_n$ such terms. (Prove it!) The polynomials $Q_n$, $n \in \mathbf{N}$, are called continuants, and their basic property is that

\[ [x_1, \cdots, x_n] = \frac{Q_{n-1}(x_2, \cdots, x_n)}{Q_n(x_1, \cdots, x_n)}, \quad n \in \mathbf{N}_+. \tag{1.1.6} \]
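The recursive definition of the continuants, Euler's description of their terms, and the basic property (1.1.6) can be explored numerically. The sketch below is an illustration only: it assumes positive integer arguments, represents each monomial of $Q_n$ by the set of indices of the variables it contains, and uses helper names of our own choosing.

```python
from fractions import Fraction


def continuant(xs) -> int:
    """Q_n(x_1, ..., x_n) evaluated directly from the recursive definition."""
    if len(xs) == 0:
        return 1
    if len(xs) == 1:
        return xs[0]
    return xs[0] * continuant(xs[1:]) + continuant(xs[2:])


def continuant_terms(n: int, first: int = 1) -> set:
    """Monomials of Q_n(x_first, ..., x_{first+n-1}) as frozensets of indices."""
    if n == 0:
        return {frozenset()}                       # the constant term 1
    if n == 1:
        return {frozenset({first})}
    with_first = {t | {first} for t in continuant_terms(n - 1, first + 1)}
    return with_first | continuant_terms(n - 2, first + 2)


def bracket(xs) -> Fraction:
    """[x_1, ..., x_n] computed from its defining recursion."""
    value = Fraction(0)
    for x in reversed(xs):
        value = 1 / (x + value)
    return value


def basic_property_holds(xs) -> bool:
    """Check (1.1.6) for a non-empty list of positive integers."""
    return bracket(xs) == Fraction(continuant(xs[1:]), continuant(xs))
```

For the digits 2, 3, 1, 4 one gets $Q_3(3, 1, 4) = 19$ and $Q_4(2, 3, 1, 4) = 43$, in agreement with $[2, 3, 1, 4] = 19/43$, and `continuant_terms(4)` has five elements, matching $F_4 = 5$.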

The proof by induction is immediate and is left to the reader. The continuants enjoy the symmetry property

\[ Q_n(x_1, \cdots, x_n) = Q_n(x_n, \cdots, x_1), \quad n \in \mathbf{N}_+. \tag{1.1.7} \]

This follows from Euler’s remark above. Hence

\[ Q_n(x_1, \cdots, x_n) = x_n Q_{n-1}(x_1, \cdots, x_{n-1}) + Q_{n-2}(x_1, \cdots, x_{n-2}) \tag{1.1.8} \]

for any $n \ge 2$. The continuants also satisfy the equation

\[ Q_n(x_1, \cdots, x_n)\, Q_n(x_2, \cdots, x_{n+1}) - Q_{n+1}(x_1, \cdots, x_{n+1})\, Q_{n-1}(x_2, \cdots, x_n) = (-1)^n, \quad n \in \mathbf{N}_+. \tag{1.1.9} \]

The proof is immediate. For $n = 1$ equation (1.1.9) is true. By the very definition of $Q_n$, for any $n \ge 2$ we have

\begin{align*}
& Q_n(x_1, \cdots, x_n)\, Q_n(x_2, \cdots, x_{n+1}) - Q_{n+1}(x_1, \cdots, x_{n+1})\, Q_{n-1}(x_2, \cdots, x_n) \\
&\quad = \bigl(x_1 Q_{n-1}(x_2, \cdots, x_n) + Q_{n-2}(x_3, \cdots, x_n)\bigr)\, Q_n(x_2, \cdots, x_{n+1}) \\
&\qquad - \bigl(x_1 Q_n(x_2, \cdots, x_{n+1}) + Q_{n-1}(x_3, \cdots, x_{n+1})\bigr)\, Q_{n-1}(x_2, \cdots, x_n) \\
&\quad = (-1)\, Q_{n-1}(x_2, \cdots, x_n)\, Q_{n-1}(x_3, \cdots, x_{n+1}) - (-1)\, Q_n(x_2, \cdots, x_{n+1})\, Q_{n-2}(x_3, \cdots, x_n) \\
&\quad = \cdots = (-1)^{n-1} \bigl(Q_1(x_n)\, Q_1(x_{n+1}) - Q_2(x_n, x_{n+1})\bigr) = (-1)^n.
\end{align*}
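Identities (1.1.7), (1.1.8) and (1.1.9) lend themselves to a quick numerical sanity check. The sketch below evaluates the continuants from their recursive definition on integer arguments; it is an illustration, not a proof, and the function names are hypothetical.

```python
def continuant(xs) -> int:
    """Q_n(x_1, ..., x_n) evaluated from the recursive definition."""
    if len(xs) == 0:
        return 1
    if len(xs) == 1:
        return xs[0]
    return xs[0] * continuant(xs[1:]) + continuant(xs[2:])


def identities_hold(xs) -> bool:
    """Check (1.1.7), (1.1.8), (1.1.9) for xs = [x_1, ..., x_{n+1}], n >= 2."""
    n = len(xs) - 1
    if n < 2:
        raise ValueError("need at least three values x_1, x_2, x_3")
    head = xs[:n]                                   # x_1, ..., x_n
    symmetric = continuant(head) == continuant(head[::-1])            # (1.1.7)
    last_step = continuant(head) == (
        head[-1] * continuant(head[:-1]) + continuant(head[:-2])      # (1.1.8)
    )
    determinant = (
        continuant(head) * continuant(xs[1:])       # Q_n(x_1..x_n) Q_n(x_2..x_{n+1})
        - continuant(xs) * continuant(xs[1:n])      # Q_{n+1}(x_1..x_{n+1}) Q_{n-1}(x_2..x_n)
    )
    return symmetric and last_step and determinant == (-1) ** n       # (1.1.9)
```

With `xs = [2, 3, 1, 4]` (so $n = 3$) the left hand side of (1.1.9) is $9 \cdot 19 - 43 \cdot 4 = -1 = (-1)^3$.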



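Now, let $\omega \in \Omega = I \setminus \mathbf{Q}$ have digits $a_1(\omega), a_2(\omega), \cdots$. It follows from (1.1.6) and (1.1.9) that

\[ \begin{gathered} \omega_n(\omega) = \frac{Q_{n-1}(a_2, \cdots, a_n)}{Q_n(a_1, \cdots, a_n)}, \\ p_n = Q_{n-1}(a_2, \cdots, a_n), \qquad q_n = Q_n(a_1, \cdots, a_n), \quad n \in \mathbf{N}_+. \end{gathered} \tag{1.1.10} \]

Hence $p_n(\omega) = q_{n-1}(\tau(\omega))$, $n \in \mathbf{N}_+$, $\omega \in \Omega$, and using (1.1.8) we obtain

\[ \begin{aligned} q_n &= a_n q_{n-1} + q_{n-2}, & n &\ge 2, \\ p_n &= a_n p_{n-1} + p_{n-2}, & n &\ge 3, \end{aligned} \tag{1.1.11} \]

with $q_0 = 1$, $q_1 = a_1$, $p_1 = 1$, $p_2 = a_2$. If we define $p_0 = q_{-1} = 0$, $p_{-1} = 1$, then equations (1.1.11) hold for any $n \in \mathbf{N}_+$. It follows from (1.1.9) and (1.1.10) that

\[ p_n q_{n-1} - p_{n-1} q_n = (-1)^{n+1}, \quad n \in \mathbf{N}. \tag{1.1.12} \]

Clearly, either (1.1.10) or (1.1.11) implies that

\[ p_{n+1} \ge F_n, \qquad q_n \ge F_n, \quad n \in \mathbf{N}. \tag{1.1.13} \]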
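The recurrences (1.1.11), with the initial values $p_{-1} = 1$, $q_{-1} = 0$, $p_0 = 0$, $q_0 = 1$, give a direct way of computing convergents. A minimal sketch, assuming a finite list of positive integer digits; the function name `convergents` is our own.

```python
def convergents(digits) -> list[tuple[int, int]]:
    """Return [(p_0, q_0), (p_1, q_1), ..., (p_n, q_n)] via (1.1.11)."""
    p_prev, q_prev = 1, 0          # p_{-1}, q_{-1}
    p_cur, q_cur = 0, 1            # p_0, q_0
    out = [(p_cur, q_cur)]
    for a in digits:
        # p_n = a_n p_{n-1} + p_{n-2},  q_n = a_n q_{n-1} + q_{n-2}
        p_prev, p_cur = p_cur, a * p_cur + p_prev
        q_prev, q_cur = q_cur, a * q_cur + q_prev
        out.append((p_cur, q_cur))
    return out
```

For the digits 2, 3, 1, 4 the pairs $(p_n, q_n)$ are $(0, 1), (1, 2), (3, 7), (4, 9), (19, 43)$, and consecutive pairs satisfy (1.1.12). When all digits equal 1 the denominators are exactly the Fibonacci numbers $F_n$, which is the extremal case of (1.1.13).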

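Notice that by (1.1.5), (1.1.6), (1.1.7), (1.1.10), and (1.1.11) we also have

\[ \omega = [a_1 + \tau(\omega)] = \frac{1}{a_1 + \tau(\omega)} = \frac{p_1 + \tau(\omega)\, p_0}{q_1 + \tau(\omega)\, q_0}, \]

\[ \omega = [a_1, a_2 + \tau^2(\omega)] = \frac{a_2 + \tau^2(\omega)}{a_1 a_2 + 1 + a_1 \tau^2(\omega)} = \frac{p_2 + \tau^2(\omega)\, p_1}{q_2 + \tau^2(\omega)\, q_1}, \]

and for $n \ge 3$,

\begin{align*}
\omega &= [a_1, \cdots, a_{n-1}, a_n + \tau^n(\omega)] = \frac{Q_{n-1}(a_n + \tau^n(\omega), a_{n-1}, \cdots, a_2)}{Q_n(a_n + \tau^n(\omega), a_{n-1}, \cdots, a_1)} \\
&= \frac{(a_n + \tau^n(\omega))\, Q_{n-2}(a_2, \cdots, a_{n-1}) + Q_{n-3}(a_2, \cdots, a_{n-2})}{(a_n + \tau^n(\omega))\, Q_{n-1}(a_1, \cdots, a_{n-1}) + Q_{n-2}(a_1, \cdots, a_{n-2})} \\
&= \frac{a_n p_{n-1} + p_{n-2} + \tau^n(\omega)\, p_{n-1}}{a_n q_{n-1} + q_{n-2} + \tau^n(\omega)\, q_{n-1}} = \frac{p_n + \tau^n(\omega)\, p_{n-1}}{q_n + \tau^n(\omega)\, q_{n-1}}.
\end{align*}

Therefore we can assert that

\[ \omega = \frac{p_n + \tau^n(\omega)\, p_{n-1}}{q_n + \tau^n(\omega)\, q_{n-1}}, \quad \omega \in \Omega, \ n \in \mathbf{N}, \tag{1.1.14} \]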

and remark that (1.1.14) also holds for any rational ω in [0, 1).
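For rational $\omega$ equation (1.1.14) can be verified exactly, step by step, until the expansion terminates. The sketch below assumes $\omega$ is a `fractions.Fraction` in $[0, 1)$ and stops at the first $n$ with $\tau^n(\omega) = 0$; the function name is hypothetical.

```python
from fractions import Fraction


def holds_1_1_14(x: Fraction) -> bool:
    """Check x = (p_n + tau^n(x) p_{n-1}) / (q_n + tau^n(x) q_{n-1}) for n = 0, 1, ...

    The loop runs until tau^n(x) = 0, i.e. until the finite expansion ends.
    """
    p_prev, q_prev, p_cur, q_cur = 1, 0, 0, 1     # p_{-1}, q_{-1}, p_0, q_0
    t = Fraction(x)                               # tau^0(x)
    ok = (p_cur + t * p_prev) / (q_cur + t * q_prev) == x
    while t != 0:
        y = 1 / t
        a = y.numerator // y.denominator          # next digit
        t = y - a                                 # next iterate of tau
        p_prev, p_cur = p_cur, a * p_cur + p_prev
        q_prev, q_cur = q_cur, a * q_cur + q_prev
        ok = ok and (p_cur + t * p_prev) / (q_cur + t * q_prev) == x
    return ok
```

For instance `holds_1_1_14(Fraction(19, 43))` checks the identity for $n = 0, 1, 2, 3, 4$, the last case reducing to $\omega = p_4/q_4 = 19/43$.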
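Remark. A matrix approach to equations (1.1.12) and (1.1.14) is as follows. Consider the matrices

\[ M_n = \begin{pmatrix} p_{n-1} & p_n \\ q_{n-1} & q_n \end{pmatrix}, \quad n \in \mathbf{N}, \]

so that $M_0$ = identity matrix, and define

\[ M_{-1} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}. \]

Then equations (1.1.11) imply that

\[ M_n = M_{n-1} A_n, \quad n \in \mathbf{N}, \]

where

\[ A_n = \begin{pmatrix} 0 & 1 \\ 1 & a_n \end{pmatrix}, \quad n \in \mathbf{N}, \]

with $a_0 = 0$. Hence

\[ M_n = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \prod_{i=0}^{n} \begin{pmatrix} 0 & 1 \\ 1 & a_i \end{pmatrix}, \quad n \in \mathbf{N}, \]

and (1.1.12) is nothing but the equation

\[ \det M_n = (-1)^n, \quad n \in \mathbf{N}. \]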
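The matrix product for $M_n$ is easy to evaluate by hand or by machine. The sketch below uses plain tuples for 2 × 2 integer matrices; the helper names `mat_mul`, `M` and `det` are our own and only serve this illustration.

```python
def mat_mul(A, B):
    """Product of two 2 x 2 matrices given as nested tuples."""
    return (
        (A[0][0] * B[0][0] + A[0][1] * B[1][0], A[0][0] * B[0][1] + A[0][1] * B[1][1]),
        (A[1][0] * B[0][0] + A[1][1] * B[1][0], A[1][0] * B[0][1] + A[1][1] * B[1][1]),
    )


def M(digits):
    """M_n = M_{-1} A_0 A_1 ... A_n with A_i = ((0, 1), (1, a_i)) and a_0 = 0."""
    result = ((0, 1), (1, 0))                     # M_{-1}
    for a in [0] + list(digits):
        result = mat_mul(result, ((0, 1), (1, a)))
    return result


def det(A) -> int:
    """Determinant of a 2 x 2 matrix."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]
```

For the digits 2, 3, 1, 4 the product is the matrix with rows $(4, 19)$ and $(9, 43)$, that is $(p_3, p_4)$ and $(q_3, q_4)$, and its determinant is $(-1)^4 = 1$.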

Clearly, $M_{-1}, M_n, A_n \in SL(2, \mathbf{Z})$, $n \in \mathbf{N}$, that is, the entries of these $2 \times 2$ matrices belong to $\mathbf{Z}$ and their determinants are equal either to $1$ or $-1$. Recall that any matrix

\[ M = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL(2, \mathbf{Z}) \]

can be viewed as a Möbius transformation, denoted by the same letter, of the compactified complex plane $\mathbf{C}^*$, which is defined by

\[ M(z) = \begin{pmatrix} a & b \\ c & d \end{pmatrix}(z) := \frac{az + b}{cz + d}, \quad z \in \mathbf{C}^*. \]

With $T$ denoting transpose we also have

\[ M(z) = \frac{(1, 0)\, M\, (z, 1)^T}{(0, 1)\, M\, (z, 1)^T}, \quad z \in \mathbf{C}^*, \]

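which implies at once that

\[ M'(M''(z)) = (M' M'')(z), \quad z \in \mathbf{C}^*, \]

for any $M', M'' \in SL(2, \mathbf{Z})$.
Next, for any $z \in \mathbf{C}$ and $n \in \mathbf{N}$ we have

\[ \begin{pmatrix} p_n + z p_{n-1} \\ q_n + z q_{n-1} \end{pmatrix} = M_n \begin{pmatrix} z \\ 1 \end{pmatrix} = M_{n-1} A_n \begin{pmatrix} z \\ 1 \end{pmatrix} = M_{n-1} \begin{pmatrix} 1 \\ a_n + z \end{pmatrix}. \]

In particular, for $z = 0$ we have

\[ \begin{pmatrix} p_n \\ q_n \end{pmatrix} = M_n \begin{pmatrix} 0 \\ 1 \end{pmatrix} = M_{n-1} \begin{pmatrix} 1 \\ a_n \end{pmatrix}, \quad n \in \mathbf{N}, \tag{1.1.10$'$} \]

whence

\[ M_n(0) = \frac{(1, 0)\, M_{n-1}\, (1, a_n)^T}{(0, 1)\, M_{n-1}\, (1, a_n)^T} = \frac{p_n}{q_n} = \begin{cases} [a_1, \cdots, a_n] & \text{if } n \in \mathbf{N}_+, \\ 0 & \text{if } n = 0. \end{cases} \]

It follows that

\[ M_n(z) = \frac{p_n + z p_{n-1}}{q_n + z q_{n-1}} = [a_1, \cdots, a_{n-1}, a_n + z], \quad n \ge 2, \]

for any $z \in \mathbf{C}$, $z \ne -q_n/q_{n-1}$, and

\[ M_1(z) = \frac{1}{a_1 + z} = \frac{p_1 + z p_0}{q_1 + z q_0} \]

for any $z \in \mathbf{C}$, $z \ne -a_1$. Now, (1.1.14) follows from the last two equations by taking $z = \tau^n(\omega)$, $n \ge 2$, respectively $z = \tau(\omega)$, $\omega \in \Omega$.
Finally, it is obvious by (1.1.10$'$) that $p_n$ and $q_n$, $n \in \mathbf{N}_+$, can be actually defined as

\[ \begin{pmatrix} p_n \\ q_n \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 1 & a_1 \end{pmatrix} \cdots \begin{pmatrix} 0 & 1 \\ 1 & a_n \end{pmatrix} \begin{pmatrix} 0 \\ 1 \end{pmatrix}. \]

It is worth mentioning that any irrational number

\[ \omega = [a_0; a_1, a_2, \cdots] \in \mathbf{R} \]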

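can be represented in terms of only two elements of $SL(2, \mathbf{Z})$, namely

\[ Q = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \quad \text{and} \quad R = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}, \]

so that $Q(z) = -1/z$, $R(z) = z + 1$, $z \in \mathbf{C}$. It is not hard to check that $Q$ and $R$ generate $SL(2, \mathbf{Z})$ and that

\[ \omega = \lim_{n\to\infty} R^{a_0} Q R^{-a_1} Q R^{a_2} Q \cdots R^{-a_{2n-1}} Q R^{a_{2n}}(z_0) \]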

for any z0 ∈ C. This simple remark is the starting point for understanding
by the use of elementary results about continued fractions the behaviour of
the geodesic flow on a certain Riemann surface. For details see Series (1982,
1991). See also Adler (1991), Faivre (1993), and Nakada (1995). For another
representation of irrationals ω ∈ R in terms of matrices R and L = (P Q)2 Q
see Raney (1973). □
We can now prove the result announced before defining the continuants.
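Proposition 1.1.1 For any x ∈ [0, 1) we have

\[ x - \omega_n(x) = \frac{(-1)^n \tau^n(x)}{q_n \left( q_n + \tau^n(x)\, q_{n-1} \right)}, \quad n \in \mathbb{N}. \tag{1.1.15} \]

For any ω ∈ Ω we have

\[ \frac{1}{q_n (q_{n+1} + q_n)} < |\omega - \omega_n(\omega)| < \frac{1}{q_n q_{n+1}}, \quad n \in \mathbb{N}, \tag{1.1.16} \]

and

\[ \lim_{n \to \infty} \omega_n(\omega) = \omega. \tag{1.1.17} \]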

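Proof. Equation (1.1.15) follows from (1.1.12) and (1.1.14). Next, since
\[ \frac{1}{\tau^n(\omega)} = a_{n+1} + \tau^{n+1}(\omega), \quad n \in \mathbb{N},\ \omega \in \Omega, \]
by (1.1.11) we have
\[ \frac{\tau^n(\omega)}{q_n \left( q_n + \tau^n(\omega)\, q_{n-1} \right)} = \frac{1}{q_n \left( q_n (a_{n+1} + \tau^{n+1}(\omega)) + q_{n-1} \right)} = \frac{1}{q_n \left( q_{n+1} + q_n \tau^{n+1}(\omega) \right)}, \]
and (1.1.16) follows.

Finally, (1.1.17) follows from (1.1.16) and (1.1.13). □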
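Since (1.1.15) is stated for every x ∈ [0, 1), it can be checked in exact rational arithmetic. The sketch below is only an illustration under our own naming (`check_error_formula` is not from the text); it assumes the recurrence $q_{k} = a_k q_{k-1} + q_{k-2}$ and stops as soon as $\tau^n(x) = 0$.

```python
from fractions import Fraction


def check_error_formula(x, steps):
    """For n = 0, 1, ..., compare x - p_n/q_n with the right-hand side of (1.1.15)."""
    p_prev, q_prev, p, q = 1, 0, 0, 1      # (p_{-1}, q_{-1}) and (p_0, q_0)
    t = x                                  # t holds tau^n(x), starting with n = 0
    results = []
    for n in range(steps + 1):
        lhs = x - Fraction(p, q)
        rhs = (-1) ** n * t / (q * (q + t * q_prev))
        results.append(lhs == rhs)
        if t == 0:                         # rational x: the expansion has ended
            break
        inv = 1 / t
        a = inv.numerator // inv.denominator   # a_{n+1} = floor(1 / tau^n(x))
        p_prev, q_prev, p, q = p, q, a * p + p_prev, a * q + q_prev
        t = inv - a                        # tau^{n+1}(x)
    return results


print(check_error_formula(Fraction(19, 43), 6))
```

For x = 19/43 = [2, 3, 1, 4] the loop visits n = 0, ..., 4 and every comparison is an exact equality of fractions.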
Remark. It is easy to see that (1.1.15) implies
\[ |x - \omega_n(x)| \le \frac{1}{q_n q_{n+1}}, \quad n \in \mathbb{N}, \]
for any x ∈ [0, 1). Of course, for a rational x the inequality above is meaningful just for finitely many values of n ∈ N. □
Notice that (1.1.12) implies that

\[ \omega_n - \omega_{n-1} = \frac{(-1)^{n+1}}{q_n q_{n-1}}, \quad n \in \mathbb{N}_+,\ \omega \in \Omega, \tag{1.1.18} \]
which in conjunction with (1.1.15) yields

0 = ω0 < ω2 < ω4 < · · · < ω3 < ω1 < 1 (1.1.19)

for any ω ∈ Ω. Clearly, the above inequalities also hold for any rational
ω ∈ [0, 1) with some inequality signs ‘<’ replaced by ‘≤’.
In what follows we shall write

ω = [a1 , a2 , · · · ] , ω ∈ Ω,

to mean precisely equation (1.1.17).


The next result shows that the continued fraction expansion of an irra-
tional number is unique in a certain sense.
Proposition 1.1.2 Let $(i_n)_{n \in \mathbb{N}_+}$ be a sequence of positive integers. Define the rational numbers

\[ \omega_n = [i_1, \cdots, i_n], \quad n \in \mathbb{N}_+. \]

Then the limit

\[ \lim_{n \to \infty} \omega_n = \omega \]

exists, where ω ∈ Ω and, moreover, the $i_n$, $n \in \mathbb{N}_+$, are the continued fraction digits of ω.
Proof. Writing $\omega_n = p_n/q_n$, $n \in \mathbb{N}_+$, $\omega_0 = 0$, where $p_n, q_n \in \mathbb{N}_+$ and g.c.d.$(p_n, q_n) = 1$, it follows from (1.1.18) that
\[ \omega_n = \sum_{k=1}^{n} \frac{(-1)^{k+1}}{q_{k-1} q_k}, \quad n \in \mathbb{N}_+. \]
As $q_k$ increases with k, Leibniz's theorem on alternating series ensures the existence of $\lim_{n \to \infty} \omega_n$, say, ω, and (1.1.19) shows that 0 < ω < 1.
It remains to show that $a_n(\omega) = i_n$, $n \in \mathbb{N}_+$. This will also prove that ω ∈ Ω, since if ω ∈ Q then we should have $a_m(\omega) = a_{m+1}(\omega) = \cdots = \infty$ for some $m \in \mathbb{N}_+$. As
\[ \omega_n = \frac{1}{i_1 + [i_2, \cdots, i_n]}, \quad n \ge 2, \tag{1.1.20} \]
it is sufficient to show that
\[ a_1(\omega) = \lfloor 1/\omega \rfloor = i_1. \]
This follows from (1.1.20) letting n → ∞ and noting that $\lim_{n \to \infty} [i_2, \cdots, i_n]$ exists and lies in the open interval (0, 1). □

1.1.3 Some special continued fraction expansions


The continued fraction expansion of a real number is a fundamental repre-
sentation of it through its connection with the Euclidean algorithm and with
‘best’ rational approximations [see, e.g., Hardy and Wright (1979, Ch. 11)].
At the same time very little is known about the explicit continued fraction
expansions of some interesting numbers.
We already know that these expansions are finite (i.e., terminating) ex-
actly for rational numbers. Also, by a well known theorem of J.-L. Lagrange
[for all classical non-metric results the basic reference is Perron (1954, 1957)],
the sequence of digits of an irrational number x is eventually periodic if and
only if x is a quadratic irrationality. Here ‘eventually periodic’ means that
if
x = [a0 ; a1 , a2 , · · · ] ,
then there exist k ∈ N and ℓ ∈ N+ such that $a_n = a_{n+\ell}$ for any n ≥ k, and we use the notation
\[ x = \begin{cases} [\,\overline{a_0; \cdots, a_{\ell-1}}\,] & \text{if } k = 0, \\ [a_0; \overline{a_1, \cdots, a_\ell}\,] & \text{if } k = 1, \\ [a_0; a_1, \cdots, a_{k-1}, \overline{a_k, \cdots, a_{k+\ell-1}}\,] & \text{if } k \ge 2 \end{cases} \]
as a convenient abbreviation. The smallest such ℓ ∈ N+ is called the period length of x. If we can take k = 0, then x is called purely periodic. Next, a quadratic irrationality is a number of the form
\[ x = \frac{a + \sqrt{b}}{c}, \]
where b ∈ N+ is not a perfect square, and a, c ∈ Z, c ≠ 0. Then $x' = (a - \sqrt{b})/c$ is called the algebraic conjugate of x. A purely periodic quadratic irrationality x is characterized by the inequalities x > 1, −1 < x′ < 0. We have, for example,
\[ \frac{1 + \sqrt{7}}{2} = [\,\overline{1; 1, 4, 1}\,] \]
and
\[ \frac{1 + \sqrt{2}}{3} = [1, \overline{4, 8}\,]. \]
The first quadratic irrationality above is purely periodic and has period
length 4 while the second one has period length 2 but is not purely periodic.
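The two examples can be reproduced with exact integer arithmetic. The sketch below uses the classical recursion on complete quotients $(P + \sqrt{D})/Q$, which is a standard technique and not taken from the text; the name `quadratic_cf` is ours, and the function assumes that b is not a perfect square. The digit $a_0$ is included, so a number in (0, 1) starts with the digit 0.

```python
from math import isqrt


def quadratic_cf(a, b, c):
    """Digits of (a + sqrt(b)) / c as (preperiod, period), with a_0 included."""
    # Rescale so that c divides b - a*a; this keeps every later step in integers.
    if (b - a * a) % c != 0:
        a, b, c = a * abs(c), b * c * c, c * abs(c)
    root = isqrt(b)                       # floor(sqrt(b))
    seen, digits = {}, []
    while (a, c) not in seen:             # (a, c) identifies the complete quotient
        seen[(a, c)] = len(digits)
        # floor((a + sqrt(b)) / c), treating the sign of c separately
        d = (a + root) // c if c > 0 else (a + root + 1) // c
        digits.append(d)
        a = d * c - a
        c = (b - a * a) // c
    start = seen[(a, c)]
    return digits[:start], digits[start:]


print(quadratic_cf(1, 7, 2))   # (1 + sqrt 7) / 2
print(quadratic_cf(1, 2, 3))   # (1 + sqrt 2) / 3
```

The first repeated pair (a, c) marks the start of the period, so the returned period has the minimal length.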
Apart from that, the continued fraction expansion of even a single ad-
ditional algebraic number is not explicitly known. We do not know even
whether the sequence of digits is unbounded for such a number. [In connec-
tion with this matter see, however, Brjuno (1964) and Richtmyer (1975).]
For transcendental numbers of interest it is not clear when to expect a continued fraction expansion with a good ‘pattern’. For example, in a paper titled De Fractionibus Continuis, published in 1737, Leonhard Euler gave a nice continued fraction expansion for $e = \sum_{n \in \mathbb{N}} 1/n!$, namely

e = [2; 1, 2, 1, 1, 4, 1, 1, 6, 1, · · · , 1, 2n, 1, · · · ] .

In this expansion the digits are eventually comprised of a meshing of two


arithmetic progressions, one of which has zero common difference while the
other has difference two. Generalizing the above result, Euler showed—the
overline in the notation indicates infinite arithmetic progressions— that

\[ e^{1/n} = [1; \overline{n - 1 + 2in, 1, 1}\,]_{i \in \mathbb{N}} = [1; n - 1, 1, 1, 3n - 1, 1, 1, 5n - 1, 1, \cdots] \]

for any 1 < n ∈ N+, and

\[ e^{2/n} = [1; \overline{(n - 1)/2 + 3in,\ 6n + 12in,\ (5n - 1)/2 + 3in,\ 1, 1}\,]_{i \in \mathbb{N}} \]
\[ = [1; (n - 1)/2, 6n, (5n - 1)/2, 1, 1, (7n - 1)/2, 18n, (11n - 1)/2, 1, \cdots] \]

for any odd n ∈ N+ greater than 1.
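These patterns can be confirmed digit by digit without floating point. The sketch below is our own illustration: it encloses exp(x), for a rational 0 < x ≤ 1, between a Taylor partial sum and that sum plus a remainder bound, and keeps only the continued fraction digits on which both rational bounds agree. The function names and the default of 40 Taylor terms are assumptions.

```python
from fractions import Fraction
from math import factorial


def cf_digits(x, count):
    """First `count` regular continued fraction digits [a_0; a_1, ...] of a rational x."""
    digits = []
    for _ in range(count):
        a = x.numerator // x.denominator
        digits.append(a)
        x -= a
        if x == 0:
            break
        x = 1 / x
    return digits


def exp_digits(x, terms=40, count=12):
    """Digits of exp(x), 0 < x <= 1 rational, shared by a lower and an upper bound."""
    low = sum(x ** k / factorial(k) for k in range(terms + 1))
    # Remainder of the series is at most 2 x^(terms+1) / (terms+1)! when x <= 1.
    high = low + 2 * x ** (terms + 1) / factorial(terms + 1)
    common = []
    for d_low, d_high in zip(cf_digits(low, count), cf_digits(high, count)):
        if d_low != d_high:
            break
        common.append(d_low)
    return common


print(exp_digits(Fraction(1)))       # e
print(exp_digits(Fraction(1, 3)))    # e^(1/3), pattern with n = 3
print(exp_digits(Fraction(2, 3)))    # e^(2/3), pattern with n = 3
```

Any real number between two rationals that share a digit prefix has the same prefix, which is why the common digits are certified.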


Recently, Clemens et al. (1995) have given explicit formulae relating
continued fraction expansions with almost periodic or almost symmetric
patterns in their digits, and series whose terms satisfy certain recurrence
relations. The method developed by these authors ties together as a single
phenomenon previous results by Davison and Shallit (1991), Köhler (1980),


Pethő (1982), Shallit (1979, 1982 a,b), van der Poorten and Shallit (1992),
and Tamura (1991), who have found continued fraction expansions for num-
bers expressed by certain types of series.
On the other hand, nobody has made any sense out of the pattern in the
continued fraction expansion for π :

π = [3; 7, 15, 1, 292, 1, 1, 1, 2, 1, 3, · · · ] .

The digits of π do not appear to follow any pattern and are widely suspected
to be in some sense random.
There is a vague folklore statement [cf. Thakur (1996)] that the nice
patterns come from the connection with hypergeometric functions and the
representation of the latter by certain generalized continued fraction expan-
sions. For more on that see Chudnovsky and Chudnovsky (1991, 1993).

Remark. Using the continued fraction expansion for e, Alzer (1998) proved that
\[ \min_{p, q \in \mathbb{N}_+,\ q \ge 3} \frac{q^2 \log q}{\log \log q} \left| e - \frac{p}{q} \right| \]
exists and is only attained at the 19th convergent of e,
\[ \frac{p_{19}}{q_{19}} = \frac{28\,245\,729}{10\,391\,013}, \]
thus it is equal to
\[ \frac{(10\,391\,013)^2 \log 10\,391\,013}{\log \log 10\,391\,013} \left| e - \frac{28\,245\,729}{10\,391\,013} \right| = 0.386\,249\,199\,819 \cdots. \]
Further, the inequality
\[ \frac{q^2 \log q}{\log \log q} \left| e - \frac{p}{q} \right| < c \]
has infinitely many solutions in integers p, q ∈ N+ if and only if c ≥ 1/2.

For further developments see Elsner (1999). □
1.2 Basic metric properties


1.2.1 Defining random variables of interest
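By (1.1.2′) the incomplete quotients $a_n$, $n \in \mathbb{N}_+$, of the irrationals in I are defined by
\[ a_1(\omega) = \lfloor 1/\omega \rfloor, \quad a_n(\omega) = a_1\bigl(\tau^{n-1}(\omega)\bigr), \quad \omega \in \Omega,\ n \in \mathbb{N}_+. \]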

If we define a1 (0) = ∞ then the above equations also define the incomplete
quotients for the rational numbers in [0, 1). As we have noted in Subsec-
tion 1.1.1, for any rational x ∈ [0, 1) there exists n = n (x) ∈ N+ such
that am (x) = ∞ for any m ≥ n.
The metric point of view in studying the sequence (an )n∈N+ is to con-
sider that the an , n ∈ N+ , are N+ -valued random variables on (I, BI )
which are defined µ-a.s. in I for any probability measure µ on BI assigning
measure 0 to the rationals in I . (Such a µ is clearly Lebesgue measure λ.)
Alternatively, we can look at the an , n ∈ N+ , as N+ ∪ {∞}-valued random
variables which are defined everywhere in [0, 1). It is clear, for example, that
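\[ \begin{aligned}
a_1(0) &= \infty, \qquad a_1(x) = 1, \quad x \in \left( \tfrac{1}{2}, 1 \right), \\
a_1(x) &= i, \quad x \in \left( \frac{1}{i+1}, \frac{1}{i} \right], \quad i \ge 2, \\
a_2(0) &= a_2\!\left( \frac{1}{i} \right) = \infty, \quad i \ge 2, \\
a_2(x) &= 1, \quad x \in \bigcup_{i \in \mathbb{N}_+} \left( \frac{1}{i+1}, \frac{1}{i + 1/2} \right), \\
a_2(x) &= j, \quad x \in \bigcup_{i \in \mathbb{N}_+} \left[ \frac{1}{i + 1/j}, \frac{1}{i + 1/(j+1)} \right), \quad j \ge 2.
\end{aligned} \]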

The distinction between the two cases is nevertheless immaterial as we


shall only consider probability measures on BI assigning measure 0 to the
rationals in I .
The probability structure of (an )n∈N+ under λ will be given later. See
Proposition 1.2.7.
Let us define some related random variables. For any n ∈ N+ put

\[ r_n = \frac{1}{\tau^{n-1}} = [a_n; a_{n+1}, a_{n+2}, \cdots], \tag{1.2.1} \]
\[ s_n = \frac{q_{n-1}}{q_n}, \quad y_n = \frac{1}{s_n}, \tag{1.2.2} \]
\[ u_n(\omega) = q_{n-1}^{-2} \left| \omega - \frac{p_{n-1}}{q_{n-1}} \right|^{-1}, \quad \omega \in \Omega, \tag{1.2.3} \]
where, as usual, $p_n/q_n = [a_1, \cdots, a_n]$, $n \in \mathbb{N}_+$, is the nth convergent, $p_0 = 0$, $q_0 = 1$. Note that $q_n = y_1 \cdots y_n = (s_1 \cdots s_n)^{-1}$, $n \in \mathbb{N}_+$. Next, it follows from the first equation (1.1.11) that
\[ \frac{1}{s_n} = a_n + s_{n-1}, \quad n \in \mathbb{N}_+, \]
with $s_0 = 0$. Hence

\[ s_n = [a_n, \cdots, a_1], \quad n \in \mathbb{N}_+. \tag{1.2.2$'$} \]

Finally, using (1.1.15) it is easy to see that

\[ u_n = s_{n-1} + r_n, \quad n \in \mathbb{N}_+. \tag{1.2.3$'$} \]
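The relations between the associated variables are algebraic identities, so they can be tested exactly on a point with a finite expansion. The following sketch uses our own helper names and assumes the recurrence $q_k = a_k q_{k-1} + q_{k-2}$; it evaluates $u_n$ directly from its definition (1.2.3) and compares it with $s_{n-1} + r_n$.

```python
from fractions import Fraction


def from_digits(digits):
    """Value of the finite continued fraction [d_1, d_2, ..., d_m] in (0, 1]."""
    value = Fraction(0)
    for d in reversed(digits):
        value = 1 / (d + value)
    return value


def associated_variables(digits, n):
    """Return (s_{n-1}, r_n, u_n) at omega = [digits], for 1 <= n <= len(digits)."""
    omega = from_digits(digits)
    p_prev, q_prev, p, q = 1, 0, 0, 1                # ends at (p_{n-1}, q_{n-1})
    for a in digits[: n - 1]:
        p_prev, q_prev, p, q = p, q, a * p + p_prev, a * q + q_prev
    s = Fraction(q_prev, q)                          # s_{n-1} = q_{n-2} / q_{n-1}
    r = 1 / from_digits(digits[n - 1:])              # r_n = 1 / tau^{n-1}(omega)
    u = 1 / (q * q * abs(omega - Fraction(p, q)))    # definition (1.2.3)
    return s, r, u


digits = [2, 3, 1, 4, 2]
for n in range(1, len(digits) + 1):
    s, r, u = associated_variables(digits, n)
    print(n, s, r, u, u == s + r)
```

The first component also illustrates (1.2.2′): for n = 3 it equals [3, 2], the first two digits read backwards.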

In what follows we shall refer to the qn , rn , sn , un , yn , n ∈ N+ , as asso-


ciated (with (an )n∈N+ ) random variables.
It is clear that 0 < sn < 1 whilst rn , un , yn > 1, n ∈ N+ . We defer
to Subsection 1.2.3 the study of distributional properties under λ of the
associated random variables.

1.2.2 Gauss’ problem and measure


Of paramount importance for the metric theory of the continued fraction expansion, actually its first basic result, is the asymptotic behaviour of the distribution function $F_n(x) = \lambda(\tau^n < x) = \lambda(\tau^{-n}([0, x)))$, x ∈ I, of $\tau^n$ as n → ∞. C.F. Gauss wrote on 25th October 1800 in his diary that (in modern notation)

\[ \lim_{n \to \infty} F_n(x) = \frac{\log(x + 1)}{\log 2}, \quad x \in I. \]
Gauss’ proof has never been found. Later, in a letter dated 30th January 1812, Gauss asked Laplace what we now call:
Gauss’ Problem. Estimate the error
\[ e_n(x) := F_n(x) - \frac{\log(x + 1)}{\log 2}, \quad n \in \mathbb{N},\ x \in I. \]
Gauss’ letter has been published on pages 371–372 of his Werke, Volume 1,
Section 1, Teubner, Leipzig, 1917. Almost the whole letter is reproduced
on pages 396–397 of J.V. Uspensky’s Introduction to Mathematical Proba-
bility, McGraw-Hill, New York, 1937. See also Gray (1984, p. 123) for other
historical details about Gauss’ problem.
The first one to give a solution to Gauss’ problem (implicitly proving Gauss’ 1800 assertion) was R.O. Kuzmin, who showed in 1928 [see Kuzmin (1928, 1932)] that $e_n(x) = O(q^{\sqrt{n}})$ as n → ∞, with 0 < q < 1, uniformly in x ∈ I. Kuzmin’s proof is reproduced in Khintchine (1956, 1963, 1964). Independently, Paul Lévy showed one year later [see Lévy (1929) and also Lévy (1954, Ch. IX)] that $|e_n(x)| \le q^n$, $n \in \mathbb{N}_+$, x ∈ I, with $q = 3.5 - 2\sqrt{2} = 0.67157 \cdots$. We present a slightly improved version of Lévy’s solution in Subsection 1.3.5. Using Kuzmin’s approach, Szűsz (1961) claimed to have lowered the Lévy estimate for q to 0.4. Actually, Szűsz’s argument yields just 0.485 rather than 0.4. The optimal value of q was determined by Wirsing (1974), who found that it was equal to 0.303 663 002 · · · .
Chapter 2 is devoted to a thorough treatment of Gauss’ problem. In
particular, Corollary 2.3.6 provides a complete solution to a generalization
of it, where the interval [0, x), x ∈ I, is replaced by an arbitrary set A ∈ BI .
The limiting distribution function log(x + 1)/ log 2, x ∈ I, occurring
in Gauss’ problem motivates the introduction of what we now call Gauss’
measure γ, which is defined on BI by
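\[ \gamma(A) = \frac{1}{\log 2} \int_A \frac{dx}{x + 1}, \quad A \in \mathcal{B}_I. \]
Then clearly γ([0, x]) = log(x + 1)/log 2, x ∈ I. We are going to prove that γ and τ enjoy an important property. First, we note that τ does not preserve λ. This means that the equality $\lambda(\tau^{-1}(A)) = \lambda(A)$ does not hold for every $A \in \mathcal{B}_I$. Indeed, for, e.g., A = (1/2, 1) we have
\[ \tau^{-1}(A) = \bigcup_{i \in \mathbb{N}_+} \left( \frac{1}{i + 1}, \frac{1}{i + 1/2} \right) \]
and
\[ \lambda\bigl(\tau^{-1}(A)\bigr) = \sum_{i \in \mathbb{N}_+} \left( \frac{1}{i + 1/2} - \frac{1}{i + 1} \right) = 2 \sum_{i \in \mathbb{N}_+} \left( \frac{1}{2i + 1} - \frac{1}{2i + 2} \right) = 2 \left( \log 2 - 1 + \frac{1}{2} \right) = 2 \log 2 - 1 \]
while λ(A) = 1/2.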
Instead, τ does preserve γ and we state formally this result, which is a


basic one in the metric theory of the RCF expansion.
Theorem 1.2.1 Gauss’ measure γ is preserved by τ, and the sequence
(an )n∈N+ is strictly stationary under γ.
Proof. We should show that
\[ \gamma\bigl(\tau^{-1}(A)\bigr) = \gamma(A), \quad A \in \mathcal{B}_I. \]
For this it is enough to show that the above equation holds for any interval A = (0, u], 0 < u ≤ 1. As
\[ \tau^{-1}((0, u]) = \bigcup_{i \in \mathbb{N}_+} \left[ \frac{1}{u + i}, \frac{1}{i} \right), \]
we only need to verify that
\[ \int_0^u \frac{dx}{x + 1} = \sum_{i \in \mathbb{N}_+} \int_{1/(u+i)}^{1/i} \frac{dx}{x + 1}, \]
which is an easy exercise.

Since $a_n = a_1 \circ \tau^{n-1}$, $n \in \mathbb{N}_+$, the second assertion is obvious. □
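One way to carry out the exercise in the proof, offered here only as a sketch, is to evaluate each integral and observe that the partial sums telescope:
\[ \sum_{i=1}^{N} \int_{1/(u+i)}^{1/i} \frac{dx}{x + 1} = \sum_{i=1}^{N} \log \frac{(i + 1)(u + i)}{i\,(u + i + 1)} = \log \frac{(N + 1)(u + 1)}{u + N + 1} \longrightarrow \log(u + 1) = \int_0^u \frac{dx}{x + 1} \]
as N → ∞, since the factors (i + 1)/i multiply to N + 1 and the factors (u + i)/(u + i + 1) multiply to (u + 1)/(u + N + 1).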
Remark. The expectation of $a_1$ under γ is infinite. Indeed
\[ \frac{1}{\log 2} \int_0^1 \frac{a_1(x)}{x + 1}\, dx = \frac{1}{\log 2} \sum_{i \in \mathbb{N}_+} i \int_{1/(i+1)}^{1/i} \frac{dx}{x + 1} = \infty. \]

1.2.3 Fundamental intervals, and applications


For any $n \in \mathbb{N}_+$ and $i^{(n)} = (i_1, \cdots, i_n) \in \mathbb{N}_+^n$ define

\[ I(i^{(n)}) = \{ \omega \in \Omega : a_k(\omega) = i_k,\ 1 \le k \le n \}. \]

For example, for any $i \in \mathbb{N}_+$ we have

\[ I(i) = \{ \omega \in \Omega : a_1(\omega) = i \} = \Omega \cap \left( \frac{1}{i + 1}, \frac{1}{i} \right). \]

We are going to prove that any $I(i^{(n)})$ is the set of irrationals from a certain open interval with rational endpoints. The sets $I(i^{(n)})$, $i^{(n)} \in \mathbb{N}_+^n$, are called fundamental intervals of rank n. Let us make the convention that $I(i^{(0)}) = \Omega$.
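Theorem 1.2.2 For any $n \in \mathbb{N}_+$ and $i^{(n)} = (i_1, \cdots, i_n) \in \mathbb{N}_+^n$ let
\[ \frac{p_{n-1}}{q_{n-1}} = [i_1, \cdots, i_{n-1}], \quad \frac{p_n}{q_n} = [i_1, \cdots, i_n] \]
with g.c.d.$(p_{n-1}, q_{n-1})$ = g.c.d.$(p_n, q_n)$ = 1, $p_0 = 0$, $q_0 = 1$. Then

\[ I(i^{(n)}) = \Omega \cap \bigl( u(i^{(n)}), v(i^{(n)}) \bigr), \tag{1.2.4} \]

where
\[ u(i^{(n)}) = \begin{cases} \dfrac{p_n + p_{n-1}}{q_n + q_{n-1}} & \text{if } n \text{ is odd}, \\[2ex] \dfrac{p_n}{q_n} & \text{if } n \text{ is even}, \end{cases} \qquad v(i^{(n)}) = \begin{cases} \dfrac{p_n}{q_n} & \text{if } n \text{ is odd}, \\[2ex] \dfrac{p_n + p_{n-1}}{q_n + q_{n-1}} & \text{if } n \text{ is even}. \end{cases} \]
We have
\[ \frac{p_n + p_{n-1}}{q_n + q_{n-1}} = \begin{cases} [i_1 + 1] & \text{if } n = 1, \\ [i_1, \cdots, i_{n-1}, i_n + 1] & \text{if } n > 1, \end{cases} \]
\[ \lambda(I(i^{(n)})) = \frac{1}{q_n (q_n + q_{n-1})} \tag{1.2.5} \]
and
\[ \max_{i^{(n)} \in \mathbb{N}_+^n} \lambda(I(i^{(n)})) = \lambda(I(1(n))) = \frac{1}{F_n F_{n+1}}, \quad n \in \mathbb{N}_+, \tag{1.2.6} \]

with $1(n) = (i_1, \cdots, i_n)$, where $i_1 = \cdots = i_n = 1$.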


Proof. Since $[i_1, \cdots, i_{n-1}, i_n + \omega] \in I(i^{(n)})$, n ≥ 2, and $[i_1 + \omega] \in I(i_1)$ for any ω ∈ Ω, we have $\tau^n(I(i^{(n)})) = \Omega$ for any $n \in \mathbb{N}_+$ and $i^{(n)} \in \mathbb{N}_+^n$. In conjunction with (1.1.14) this proves (1.2.4). It thus appears that $I(i^{(n)})$ is the image of Ω under the map
\[ \omega \to \frac{p_n + \omega p_{n-1}}{q_n + \omega q_{n-1}}, \quad \omega \in \Omega. \]
Next, (1.2.5) follows from (1.2.4) and (1.1.12).

Finally, (1.2.6) is an immediate consequence of (1.2.5), as the minimum of $q_n$ is attained for $i_1 = \cdots = i_n = 1$ [cf. (1.1.13)]. □
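The endpoints in (1.2.4) and the length formula (1.2.5) are easy to tabulate. The sketch below is illustrative only: the function names are ours, the convergents come from the usual recurrence (an assumption about (1.1.11)), and the search over rank-4 digit blocks is restricted to digits 1 to 4 to keep it small.

```python
from fractions import Fraction
from itertools import product


def convergents(digits):
    """Return (p_{n-1}, q_{n-1}, p_n, q_n) for [i_1, ..., i_n], with p_0 = 0, q_0 = 1."""
    p_prev, q_prev, p, q = 1, 0, 0, 1
    for a in digits:
        p_prev, q_prev, p, q = p, q, a * p + p_prev, a * q + q_prev
    return p_prev, q_prev, p, q


def fundamental_interval(digits):
    """Endpoints (u, v) of the fundamental interval, following (1.2.4)."""
    p_prev, q_prev, p, q = convergents(digits)
    at_convergent = Fraction(p, q)
    at_mediant = Fraction(p + p_prev, q + q_prev)
    odd_rank = len(digits) % 2 == 1
    return (at_mediant, at_convergent) if odd_rank else (at_convergent, at_mediant)


def interval_length(digits):
    u, v = fundamental_interval(digits)
    return v - u


print(fundamental_interval([3]))
print(interval_length([2, 3]))
longest = max(product(range(1, 5), repeat=4), key=interval_length)
print(longest, interval_length(longest))
```

Among the blocks searched, the longest interval belongs to (1, 1, 1, 1), in line with (1.2.6).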
Remark. When denoting by $p_n$ and $q_n$, $n \in \mathbb{N}_+$, quantities seemingly different from those already defined in Subsection 1.1.2, we clearly abused the notation. However, it should be noted that according to the context $p_n$ and $q_n$ will appear to be either functions of ω ∈ Ω or of $i^{(n)} \in \mathbb{N}_+^n$ as well. Actually, $p_n(i^{(n)})$ ($q_n(i^{(n)})$) is the common value of $p_n$ ($q_n$) as defined in Subsection 1.1.2 at all points $\omega \in I(i^{(n)})$, $n \in \mathbb{N}_+$. □
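Corollary 1.2.3 For p, q ∈ N+ with p < q and g.c.d.(p, q) = 1 let
\[ \frac{p}{q} = [i_1, \cdots, i_n] = [i_1, \cdots, i_{n-1}, i_n - 1, 1] \]

for some n = n(p/q) ∈ N+, where $i_n \ge 2$. Define

\[ \frac{p_{n-1}}{q_{n-1}} = [i_1, \cdots, i_{n-1}], \quad \frac{p_n^-}{q_n^-} = [i_1, \cdots, i_{n-1}, i_n - 1] \]

with g.c.d.$(p_{n-1}, q_{n-1})$ = g.c.d.$(p_n^-, q_n^-)$ = 1, $p_0 = 0$, $q_0 = 1$, and
\[ I_{p/q} = \left\{ \omega \in \Omega : \frac{p}{q} \text{ is a convergent of } \omega \right\}. \]

Then
\[ I_{p/q} = I(i_1, \cdots, i_n) \cup I(i_1, \cdots, i_{n-1}, i_n - 1, 1) \tag{1.2.7} \]
\[ = \begin{cases} \Omega \cap \left( \dfrac{p + p_{n-1}}{q + q_{n-1}}, \dfrac{p + p_n^-}{q + q_n^-} \right) & \text{if } n \text{ is odd}, \\[2ex] \Omega \cap \left( \dfrac{p + p_n^-}{q + q_n^-}, \dfrac{p + p_{n-1}}{q + q_{n-1}} \right) & \text{if } n \text{ is even} \end{cases} \]
and
\[ \lambda(I_{p/q}) = \frac{3}{(q + q_{n-1})(q + q_n^-)}. \]
We have
\[ \max_{\{p, q \in \mathbb{N}_+ :\ n(p, q) = n\}} \lambda(I_{p/q}) = \lambda(I_{F_n/F_{n+1}}) = \frac{3}{(F_{n-1} + F_{n+1}) F_{n+2}}, \quad n \in \mathbb{N}_+. \]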
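Proof. By (1.1.11) we have

\[ p = p_n^- + p_{n-1}, \quad q = q_n^- + q_{n-1}. \]
It then follows from (1.2.4) that
\[ I(i_1, \cdots, i_{n-1}, i_n - 1, 1) = \begin{cases} \Omega \cap \left( \dfrac{p}{q}, \dfrac{p + p_n^-}{q + q_n^-} \right) & \text{if } n \text{ is odd}, \\[2ex] \Omega \cap \left( \dfrac{p + p_n^-}{q + q_n^-}, \dfrac{p}{q} \right) & \text{if } n \text{ is even} \end{cases} \]
while, by (1.2.4) again,
\[ I(i_1, \cdots, i_n) = \begin{cases} \Omega \cap \left( \dfrac{p + p_{n-1}}{q + q_{n-1}}, \dfrac{p}{q} \right) & \text{if } n \text{ is odd}, \\[2ex] \Omega \cap \left( \dfrac{p}{q}, \dfrac{p + p_{n-1}}{q + q_{n-1}} \right) & \text{if } n \text{ is even}. \end{cases} \]
The last two equations show that (1.2.7) holds.
To compute $\lambda(I_{p/q})$ we have to use (1.1.12) three times. Finally, we should note that the maximum of $\lambda(I_{p/q})$ is obtained for $i_1 = \cdots = i_{n-1} = 1$, $i_n = 2$. □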
Corollary 1.2.4 (Legendre’s theorem) For ω ∈ Ω and p, q ∈ N+ with p < q and g.c.d.(p, q) = 1 let
\[ \frac{p}{q} = [i_1, \cdots, i_n], \quad \frac{p_{n-1}}{q_{n-1}} = [i_1, \cdots, i_{n-1}] \]
with $p_0 = 0$, $q_0 = 1$, where the length n = n(p/q) ∈ N+ of the continued fraction expansion of p/q is chosen in such a way that it is even if p/q < ω and odd otherwise. Define
\[ \Theta = q^2 \left| \omega - \frac{p}{q} \right|. \]
Then
\[ \Theta < \frac{q}{q + q_{n-1}} \quad \text{if and only if} \quad \frac{p}{q} \text{ is a convergent of } \omega. \]
In particular, if Θ ≤ 1/2 then p/q is a convergent of ω.
Proof. If p/q is a convergent of ω, then by (1.1.15) we have
\[ \Theta = q^2 \left| \omega - \frac{p}{q} \right| = \frac{q\, \tau^n(\omega)}{q + \tau^n(\omega)\, q_{n-1}} < \frac{q}{q + q_{n-1}}. \]
Conversely, if Θ < q/(q + q_{n−1}) then

\[ \left| \omega - \frac{p}{q} \right| < \frac{1}{q (q + q_{n-1})}. \tag{1.2.8} \]

Assuming that p/q < ω, that is, n is even, from (1.2.8) we obtain

\[ \frac{p}{q} < \omega < \frac{p}{q} + \frac{1}{q (q + q_{n-1})} = \frac{p + p_{n-1}}{q + q_{n-1}} \]

[by (1.1.12)]. Similarly, assuming that p/q > ω, that is, n is odd, we obtain

\[ \frac{p}{q} > \omega > \frac{p}{q} - \frac{1}{q (q + q_{n-1})} = \frac{p + p_{n-1}}{q + q_{n-1}} \]

[by (1.1.12) again]. In both cases we thus have $\omega \in I(i_1, \cdots, i_n)$. Hence $p/q = [i_1, \cdots, i_n]$ is a convergent of ω.
The special case follows from the inequality $q/(q + q_{n-1}) > 1/2$ which holds since $q > q_{n-1}$. □
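The special case of Legendre's theorem lends itself to a brute-force check. The sketch below is our own illustration: ω is replaced by a rational number with a long expansion (the first 14 digits of e − 2, an arbitrary choice), so that all comparisons are exact, and only denominators up to a small bound are searched.

```python
from fractions import Fraction
from math import gcd


def digits_to_convergents(digits):
    """All convergents p_k/q_k, k = 1..len(digits), of [digits]."""
    convs = []
    p_prev, q_prev, p, q = 1, 0, 0, 1
    for a in digits:
        p_prev, q_prev, p, q = p, q, a * p + p_prev, a * q + q_prev
        convs.append(Fraction(p, q))
    return convs


def legendre_candidates(omega, max_q):
    """All reduced p/q with p < q <= max_q and q^2 |omega - p/q| < 1/2."""
    found = []
    for q in range(2, max_q + 1):
        for p in range(1, q):
            close = q * q * abs(omega - Fraction(p, q)) < Fraction(1, 2)
            if gcd(p, q) == 1 and close:
                found.append(Fraction(p, q))
    return found


digits = [1, 2, 1, 1, 4, 1, 1, 6, 1, 1, 8, 1, 1, 10]
convs = digits_to_convergents(digits)
omega = convs[-1]                       # rational stand-in for e - 2
candidates = legendre_candidates(omega, 60)
print(candidates)
print(set(candidates) <= set(convs))
```

Every fraction that passes the test Θ < 1/2 should appear in the list of convergents; the converse fails in general, since a convergent may have Θ ≥ 1/2.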
Corollary 1.2.5 For any $n \in \mathbb{N}_+$ and $i^{(n)} = (i_1, \cdots, i_n) \in \mathbb{N}_+^n$ we have

\[ \gamma(a_k = i_1, \cdots, a_{k+n-1} = i_n) = \frac{1}{\log 2} \log \frac{1 + v(i^{(n)})}{1 + u(i^{(n)})}, \quad k \in \mathbb{N}_+. \]

In particular,
\[ \gamma(a_k = i) = \frac{1}{\log 2} \log \frac{(i + 1)^2}{i (i + 2)} = \frac{1}{\log 2} \log \left( 1 + \frac{1}{i (i + 2)} \right) \tag{1.2.9} \]

for any k, i ∈ N+.
Proof. Theorem 1.2.1 and equation (1.2.4). □
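Formula (1.2.9) gives the law of a single digit under γ. As a small numerical illustration (the function name is ours), the probabilities can be tabulated and their partial sums compared with the telescoped value $\log_2 (2(N + 1)/(N + 2))$, which tends to 1.

```python
from math import log


def gauss_digit_probability(i):
    """gamma(a_k = i) according to (1.2.9)."""
    return log(1 + 1 / (i * (i + 2))) / log(2)


for i in range(1, 6):
    print(i, gauss_digit_probability(i))

N = 1000
partial = sum(gauss_digit_probability(i) for i in range(1, N + 1))
print(partial, log(2 * (N + 1) / (N + 2)) / log(2))
```

The digit 1 alone carries probability log(4/3)/log 2, a little over 41 per cent.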
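Corollary 1.2.6 (Brodén–Borel–Lévy formula) For any $n \in \mathbb{N}_+$ we have
\[ \lambda(\tau^n < x \mid a_1, \cdots, a_n) = \frac{(s_n + 1)\, x}{s_n x + 1}, \quad x \in I, \tag{1.2.10} \]
where $s_n$ is defined by (1.2.2) or (1.2.2′).
Proof. Clearly, for any $n \in \mathbb{N}_+$ and x ∈ I,

\[ \lambda(\tau^n < x \mid a_1, \cdots, a_n) = \frac{\lambda\bigl( (\tau^n < x) \cap I(a_1, \cdots, a_n) \bigr)}{\lambda(I(a_1, \cdots, a_n))}. \]

By (1.1.14) and (1.2.4) we have

\[ (\tau^n < x) \cap I(a_1, \cdots, a_n) = \begin{cases} \left\{ \omega \in \Omega : \dfrac{p_n + x p_{n-1}}{q_n + x q_{n-1}} < \omega < \dfrac{p_n}{q_n} \right\} & \text{if } n \text{ is odd}, \\[2ex] \left\{ \omega \in \Omega : \dfrac{p_n}{q_n} < \omega < \dfrac{p_n + x p_{n-1}}{q_n + x q_{n-1}} \right\} & \text{if } n \text{ is even}. \end{cases} \]

Hence, using (1.2.5) and (1.1.12),

\[ \lambda(\tau^n < x \mid a_1, \cdots, a_n) = \frac{q_n (q_n + q_{n-1})\, x}{q_n (q_n + x q_{n-1})} = \frac{(s_n + 1)\, x}{s_n x + 1} \]

for any $n \in \mathbb{N}_+$ and x ∈ I, and the proof is complete. □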
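The proof reduces (1.2.10) to a ratio of two interval lengths, which can be reproduced exactly. In the sketch below, whose names are ours, the conditional probability is computed once as that ratio and once from the closed formula, for prescribed digits $a_1, \cdots, a_n$ and a rational x.

```python
from fractions import Fraction


def broden_borel_levy(digits, x):
    """Return (ratio of interval lengths, closed formula (1.2.10)) for given digits."""
    p_prev, q_prev, p, q = 1, 0, 0, 1
    for a in digits:
        p_prev, q_prev, p, q = p, q, a * p + p_prev, a * q + q_prev

    def point(t):
        # the map t -> (p_n + t p_{n-1}) / (q_n + t q_{n-1}) used in the proof
        return Fraction(p + t * p_prev) / Fraction(q + t * q_prev)

    by_lengths = abs(point(x) - point(0)) / abs(point(1) - point(0))
    s_n = Fraction(q_prev, q)
    by_formula = (s_n + 1) * x / (s_n * x + 1)
    return by_lengths, by_formula


print(broden_borel_levy([2, 3, 1], Fraction(2, 5)))
```

Taking absolute values removes the need to distinguish odd and even n.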

Remark. For x ∈ N+ equation (1.2.10) has been obtained by the Swedish


mathematician T. Brodén as early as 1900 [see Brodén (1900, p. 246)], nine
years before É. Borel [see Borel (1909)]. Lévy (1929) also obtained and
used (1.2.10). This equation was called the Borel-Lévy formula by Doeblin
(1940). A generalization of (1.2.10) will be given in Proposition 1.3.8. 2
The Brodén–Borel–Lévy formula (1.2.10) allows us to determine the
probability structure of (an )n∈N+ under λ.
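Proposition 1.2.7 For any i, n ∈ N+ we have
\[ \lambda(a_1 = i) = \frac{1}{i (i + 1)}, \tag{1.2.11} \]

\[ \lambda(a_{n+1} = i \mid a_1, \cdots, a_n) = P_i(s_n), \tag{1.2.12} \]

where
\[ P_i(x) = \frac{x + 1}{(x + i)(x + i + 1)}, \quad x \in I. \tag{1.2.13} \]
Proof. As we have already noted,
\[ \{ \omega \in \Omega : a_1(\omega) = i \} = \Omega \cap \left( \frac{1}{i + 1}, \frac{1}{i} \right), \quad i \in \mathbb{N}_+, \]

and (1.2.11) follows at once.

Since $\tau^n(\omega) = [a_{n+1}(\omega), a_{n+2}(\omega), \cdots]$, $n \in \mathbb{N}_+$, ω ∈ Ω, we have

\[ \{ \omega \in \Omega : a_{n+1}(\omega) = i \} = \left\{ \omega \in \Omega : \tau^n(\omega) \in \left( \frac{1}{i + 1}, \frac{1}{i} \right) \right\} \]
for any n, i ∈ N+ so that
\[ \lambda(a_{n+1} = i \mid a_1, \cdots, a_n) = \lambda\left( \tau^n \in \left( \frac{1}{i + 1}, \frac{1}{i} \right) \,\Big|\, a_1, \cdots, a_n \right), \]
and (1.2.12) follows from (1.2.10). □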
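For the reader's convenience, the last step can be written out as follows (a sketch of the computation, with $s = s_n$). By (1.2.10),
\[ \lambda\left( \tau^n \in \left( \tfrac{1}{i+1}, \tfrac{1}{i} \right) \Big|\, a_1, \cdots, a_n \right) = \frac{(s + 1)/i}{s/i + 1} - \frac{(s + 1)/(i + 1)}{s/(i + 1) + 1} = \frac{s + 1}{s + i} - \frac{s + 1}{s + i + 1} = \frac{s + 1}{(s + i)(s + i + 1)} = P_i(s). \]
The same decomposition shows that the $P_i$ form a probability distribution for each fixed x ∈ I, since
\[ \sum_{i \in \mathbb{N}_+} P_i(x) = (x + 1) \sum_{i \in \mathbb{N}_+} \left( \frac{1}{x + i} - \frac{1}{x + i + 1} \right) = 1. \]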
Remark. Proposition 1.2.7 is the starting point of an approach to the
metrical theory of the continued fraction expansion via dependence with
complete connections. See Iosifescu and Grigorescu (1990, Section 5.2). 2
Corollary 1.2.8 The sequence (sn )n∈N+ with s0 = 0 is a Q ∩ I-valued
Markov chain on (I, BI , λ) with the following transition mechanism: from
state s ∈ Q ∩ I the possible transitions are to any state 1/ (s + i) with
corresponding transition probability Pi (s), i ∈ N+ .
We conclude this subsection by considering the random variables rn and
un , n ∈ N+ , introduced in Subsection 1.2.1.
Proposition 1.2.9 For any $n \in \mathbb{N}_+$ and x ≥ 1 we have
\[ \lambda(r_1 < x) = \lambda(u_1 < x) = 1 - \frac{1}{x}, \tag{1.2.14} \]
\[ \lambda(r_{n+1} < x \mid a_1, \cdots, a_n) = 1 - \frac{s_n + 1}{s_n + x}, \tag{1.2.15} \]
\[ \lambda(u_{n+1} < x \mid a_1, \cdots, a_n) = \begin{cases} 0 & \text{if } x \le s_n + 1, \\ 1 - \dfrac{s_n + 1}{x} & \text{if } x > s_n + 1. \end{cases} \tag{1.2.16} \]
Proof. Equations (1.2.14) are obvious since $r_1 = u_1 = 1/\tau^0$. Then for any $n \in \mathbb{N}_+$ and x ≥ 1 we have
\[ \lambda(r_{n+1} < x \mid a_1, \cdots, a_n) = \lambda\left( \tau^n > \frac{1}{x} \,\Big|\, a_1, \cdots, a_n \right) \]
and

\[ \lambda(u_{n+1} < x \mid a_1, \cdots, a_n) = \lambda(r_{n+1} < x - s_n \mid a_1, \cdots, a_n) = \lambda\left( \tau^n > \frac{1}{x - s_n} \,\Big|\, a_1, \cdots, a_n \right). \]

To obtain equations (1.2.15) and (1.2.16) it remains to use (1.2.10). □


Corollary 1.2.10 For any $n \in \mathbb{N}_+$ let $G_n(s) = \lambda(s_n < s)$, s ∈ R, and let $G_0(s)$ = 0 or 1 according as s ≤ 0 or s > 0. For any $n \in \mathbb{N}_+$ and x ≥ 1 we have
\[ \lambda(r_n < x) = \int_0^1 \frac{x - 1}{s + x}\, dG_{n-1}(s) = (x - 1) \left( \frac{1}{x + 1} + \int_0^1 \frac{G_{n-1}(s)\, ds}{(s + x)^2} \right), \tag{1.2.17} \]
\[ \lambda(u_n < x) = \begin{cases} \displaystyle\int_0^{x-1} \left( 1 - \frac{s + 1}{x} \right) dG_{n-1}(s) & \text{if } 1 \le x \le 2, \\[2ex] \displaystyle\int_0^1 \left( 1 - \frac{s + 1}{x} \right) dG_{n-1}(s) & \text{if } x > 2 \end{cases} \tag{1.2.18} \]
\[ = \begin{cases} \displaystyle\frac{1}{x} \int_0^{x-1} G_{n-1}(s)\, ds & \text{if } 1 \le x \le 2, \\[2ex] \displaystyle 1 - \frac{2}{x} + \frac{1}{x} \int_0^1 G_{n-1}(s)\, ds & \text{if } x > 2 \end{cases} \]
\[ = \frac{1}{x} \int_0^{x-1} G_{n-1}(s)\, ds, \]
\[ \frac{d}{dx} \lambda(r_n < x) = \int_0^1 \frac{(s + 1)\, dG_{n-1}(s)}{(s + x)^2} = \frac{2}{(x + 1)^2} + \int_0^1 \frac{(s - x + 2)\, G_{n-1}(s)\, ds}{(s + x)^3}. \tag{1.2.19} \]
Also, for any $n \in \mathbb{N}_+$ we have

\[ \frac{d}{dx} \lambda(u_n < x) = \frac{1}{x} G_{n-1}(x - 1) - \frac{1}{x^2} \int_0^{x-1} G_{n-1}(s)\, ds \tag{1.2.20} \]
\[ = \begin{cases} \displaystyle\frac{1}{x} G_{n-1}(x - 1) - \frac{1}{x^2} \int_0^{x-1} G_{n-1}(s)\, ds & \text{if } 1 \le x \le 2, \\[2ex] \displaystyle\frac{1}{x^2} \left( 2 - \int_0^1 G_{n-1}(s)\, ds \right) & \text{if } x > 2 \end{cases} \]
a.e. in [1, ∞).

Proof. The first equality in (1.2.17) follows at once from (1.2.15). To obtain the second one we integrate by parts noting that $G_n(0) = 0$ and $G_n(1) = 1$ for any n ∈ N.
Similarly, the first equality in (1.2.18) follows at once from (1.2.16). To obtain the second and third ones we integrate by parts and then note that $G_n(s) = 1$ for any n ∈ N and s ≥ 1.
Finally, equations (1.2.19) and (1.2.20) follow immediately from (1.2.17) and (1.2.18), respectively. □

1.3 The natural extension of τ


1.3.1 Definition and basic properties
The incomplete quotients $a_n$, $n \in \mathbb{N}_+$, are expressed in terms of $a_1$ and the powers of the continued fraction transformation τ. Such a thing is not possible for the variables $s_n$ or $u_n$, $n \in \mathbb{N}_+$. To rule out this inconvenience we consider the so called natural extension $\bar\tau$ of τ which is a transformation of (0, 1) × I defined by
\[ \bar\tau(\omega, \theta) = \left( \tau(\omega), \frac{1}{a_1(\omega) + \theta} \right), \quad (\omega, \theta) \in (0, 1) \times I. \tag{1.3.1} \]

This is a one-to-one transformation of $\Omega^2$ with inverse

\[ \bar\tau^{-1}(\omega, \theta) = \left( \frac{1}{a_1(\theta) + \omega}, \tau(\theta) \right), \quad (\omega, \theta) \in \Omega^2. \tag{1.3.2} \]

It is easy to see that for any n ≥ 2 we have

\[ \bar\tau^n(\omega, \theta) = \bigl( \tau^n(\omega), [a_n(\omega), \cdots, a_2(\omega), a_1(\omega) + \theta] \bigr) \tag{1.3.1$'$} \]

whatever (ω, θ) ∈ Ω × I, and

\[ \bar\tau^{-n}(\omega, \theta) = \bigl( [a_n(\theta), \cdots, a_2(\theta), a_1(\theta) + \omega], \tau^n(\theta) \bigr) \tag{1.3.2$'$} \]

whatever (ω, θ) ∈ $\Omega^2$.
Equations (1.3.1) and (1.3.1′) imply that

\[ \bar\tau^n(\omega, 0) = \bigl( \tau^n(\omega), s_n(\omega) \bigr), \quad n \in \mathbb{N}_+, \tag{1.3.3} \]

for any ω ∈ Ω. Note that the above equation also holds for n = 0 if we define $\bar\tau^0$ = identity map.
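The formulas (1.3.1) and (1.3.2) are simple enough to experiment with. The sketch below is an illustration under our own naming; it works on rational points with finite expansions so that every step is exact, which means it only makes sense while the first coordinate stays non-zero.

```python
from fractions import Fraction


def first_digit(x):
    """a_1(x) = floor(1/x) for a rational x in (0, 1]."""
    return x.denominator // x.numerator


def gauss_map(x):
    """tau(x) = 1/x - a_1(x)."""
    return 1 / x - first_digit(x)


def extension(omega, theta):
    """(1.3.1): (omega, theta) -> (tau(omega), 1 / (a_1(omega) + theta))."""
    return gauss_map(omega), 1 / (first_digit(omega) + theta)


def extension_inverse(omega, theta):
    """(1.3.2): (omega, theta) -> (1 / (a_1(theta) + omega), tau(theta))."""
    return 1 / (first_digit(theta) + omega), gauss_map(theta)


omega, theta = Fraction(19, 43), Fraction(3, 7)     # 19/43 = [2, 3, 1, 4]
print(extension_inverse(*extension(omega, theta)) == (omega, theta))

state = (omega, Fraction(0))
for n in range(1, 4):
    state = extension(*state)
    print(n, state)                                 # (tau^n(omega), s_n(omega))
```

Starting from θ = 0, the second coordinate runs through $s_1 = 1/2$, $s_2 = 2/7$, $s_3 = 7/9$, as (1.3.3) predicts for the digits 2, 3, 1.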
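Now, define the extended Gauss measure $\bar\gamma$ on $\mathcal{B}_I^2$ by
\[ \bar\gamma(B) = \frac{1}{\log 2} \iint_B \frac{dx\, dy}{(xy + 1)^2}, \quad B \in \mathcal{B}_I^2. \]
Note that
\[ \bar\gamma(A \times I) = \bar\gamma(I \times A) = \gamma(A) \tag{1.3.4} \]
for any $A \in \mathcal{B}_I$. The result below shows that $\bar\gamma$ plays with respect to $\bar\tau$ the part played by γ with respect to τ (cf. Theorem 1.2.1).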
Theorem 1.3.1 The extended Gauss measure $\bar\gamma$ is preserved by $\bar\tau$.
Proof. We should show that $\bar\gamma(\bar\tau^{-1}(B)) = \bar\gamma(B)$ for any $B \in \mathcal{B}_I^2$ or, equivalently, since $\bar\tau$ is invertible on $\Omega^2$, that $\bar\gamma(\bar\tau(B)) = \bar\gamma(B)$ for any $B \in \mathcal{B}_I^2$. As the set of Cartesian products $I(i^{(m)}) \times I(j^{(n)})$, $i^{(m)} \in \mathbb{N}_+^m$, $j^{(n)} \in \mathbb{N}_+^n$, m, n ∈ N, generates the σ-algebra $\mathcal{B}_I^2$, it is enough to show that

\[ \bar\gamma\bigl( \bar\tau( I(i^{(m)}) \times I(j^{(n)}) ) \bigr) = \bar\gamma\bigl( I(i^{(m)}) \times I(j^{(n)}) \bigr) \tag{1.3.5} \]
for any $i^{(m)} \in \mathbb{N}_+^m$, $j^{(n)} \in \mathbb{N}_+^n$, m, n ∈ N. It follows from (1.3.4) and Theorem 1.2.1 that (1.3.5) holds for m = 0 and n ∈ N. If $m \in \mathbb{N}_+$ then it is easy to see that
\[ \bar\tau\bigl( I(i^{(m)}) \times I(j^{(n)}) \bigr) = I(i_2, \cdots, i_m) \times I(i_1, j_1, \cdots, j_n), \quad n \in \mathbb{N}_+, \]
where $I(i_2, \cdots, i_m)$ equals Ω for m = 1. Also, if $I(i^{(m)}) = \Omega \cap (a, b)$ and $I(j^{(n)}) = \Omega \cap (c, d)$, with a, b, c, d ∈ Q ∩ I, then $I(i_2, \cdots, i_m) = \Omega \cap (b^{-1} - i_1, a^{-1} - i_1)$ and $I(i_1, j_1, \cdots, j_n) = \Omega \cap ((d + i_1)^{-1}, (c + i_1)^{-1})$.
A simple computation yields
\[ \bar\gamma((a, b) \times (c, d)) = \frac{1}{\log 2} \log \frac{(bd + 1)(ac + 1)}{(bc + 1)(ad + 1)}, \]
and then
\[ \bar\gamma\bigl( (b^{-1} - i_1, a^{-1} - i_1) \times ((d + i_1)^{-1}, (c + i_1)^{-1}) \bigr) \]
\[ = \frac{1}{\log 2} \log \frac{\bigl( (a^{-1} - i_1)(c + i_1)^{-1} + 1 \bigr)\bigl( (b^{-1} - i_1)(d + i_1)^{-1} + 1 \bigr)}{\bigl( (a^{-1} - i_1)(d + i_1)^{-1} + 1 \bigr)\bigl( (b^{-1} - i_1)(c + i_1)^{-1} + 1 \bigr)} \]
\[ = \frac{1}{\log 2} \log \frac{(bd + 1)(ac + 1)}{(bc + 1)(ad + 1)}, \]
that is, (1.3.5) holds. □
For more details on natural extensions we refer the reader to Subsection
4.0.1.
1.3.2 Approximation coefficients


On account of Legendre’s theorem (see Corollary 1.2.4), for any ω ∈ Ω we define the approximation coefficients $\Theta_n = \Theta_n(\omega)$ as
\[ \Theta_n = \Theta_n(\omega) = q_n^2 \left| \omega - \frac{p_n}{q_n} \right|, \quad n \in \mathbb{N}. \]

Clearly, $\Theta_0(\omega) = \omega$, ω ∈ Ω, and by (1.2.3) we have

\[ \Theta_n = u_{n+1}^{-1}, \quad n \in \mathbb{N}. \tag{1.3.6} \]

Hence
\[ 0 < \Theta_n < 1, \quad n \in \mathbb{N}. \]
It is rather easy to obtain more information about $\Theta_n$, n ∈ N. It follows from (1.2.3′) and (1.2.1) that
\[ \Theta_n = \frac{1}{s_n + r_{n+1}} = \frac{\tau^n}{s_n \tau^n + 1}, \quad n \in \mathbb{N}. \]
Moreover, as $s_n^{-1} = a_n + s_{n-1}$ and $r_n = a_n + r_{n+1}^{-1}$, $n \in \mathbb{N}_+$, we also have

\[ \Theta_{n-1} = \frac{1}{s_{n-1} + r_n} = \frac{1}{s_{n-1} + a_n + r_{n+1}^{-1}} = \frac{s_n}{s_n \tau^n + 1}, \quad n \in \mathbb{N}_+. \]
Thus it appears that

\[ (\Theta_{n-1}, \Theta_n) = \Psi(\tau^n, s_n), \quad n \in \mathbb{N}_+, \tag{1.3.7} \]

the function $\Psi : I^2 \to \mathbb{R}_+^2$ being defined by

\[ \Psi(x, y) = \left( \frac{y}{xy + 1}, \frac{x}{xy + 1} \right), \quad (x, y) \in I^2. \]

Clearly, Ψ is a $C^1$-diffeomorphism between the interior of $I^2$ and the interior of the triangle ∆ with vertices (0, 0), (1, 0) and (0, 1). It then follows from (1.3.7) that

\[ \Theta_{n-1} + \Theta_n < 1, \quad n \in \mathbb{N}_+, \]

whence
\[ \min(\Theta_{n-1}, \Theta_n) < \frac{1}{2}, \quad n \in \mathbb{N}_+, \]
a well known result due to Vahlen (1895).


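The inverse $\Psi^{-1}$ of Ψ is given by
\[ \Psi^{-1}(\alpha, \beta) = \left( \frac{2\beta}{1 + \sqrt{1 - 4\alpha\beta}}, \frac{2\alpha}{1 + \sqrt{1 - 4\alpha\beta}} \right), \quad (\alpha, \beta) \in \Delta. \]

For $i \in \mathbb{N}_+$ put
\[ V_i = I(i) \times \Omega, \quad H_i = \Omega \times I(i). \]
It follows from the definition of $\bar\tau$ that

\[ \bar\tau(V_i) = H_i, \quad V_i = \bar\tau^{-1}(H_i), \quad i \in \mathbb{N}_+, \]

and that for any $i \in \mathbb{N}_+$ we have

\[ \bar\tau^n \in V_i \text{ if and only if } a_{n+1} = i, \quad n \in \mathbb{N}, \tag{1.3.8} \]

\[ \bar\tau^n \in H_i \text{ if and only if } a_n = i, \quad n \in \mathbb{N}_+. \tag{1.3.9} \]

Furthermore, the set $V_i^* = \Psi V_i$ is a quadrangle with vertices
\[ \left( 0, \frac{1}{i} \right), \quad \left( \frac{i}{i + 1}, \frac{1}{i + 1} \right), \quad \left( \frac{i + 1}{i + 2}, \frac{1}{i + 2} \right), \quad \text{and} \quad \left( 0, \frac{1}{i + 1} \right), \]

and its reflection in the diagonal α = β is $H_i^* = \Psi H_i$, $i \in \mathbb{N}_+$. (For i = 1 both quadrangles are in fact triangles.) Define the mapping F : ∆ → ∆ as $F = \Psi \bar\tau \Psi^{-1}$.
It is easy to check that for any $i \in \mathbb{N}_+$ we have
\[ (\alpha, \beta) \in V_i^* \Rightarrow F(\alpha, \beta) = \left( \beta,\ \alpha + i \sqrt{1 - 4\alpha\beta} - i^2 \beta \right). \tag{1.3.10} \]

Now, by (1.3.7) we have

\[ \Psi^{-1}(\Theta_{n-1}, \Theta_n) = (\tau^n, s_n), \]

whence
\[ \bar\tau\, \Psi^{-1}(\Theta_{n-1}, \Theta_n) = (\tau^{n+1}, s_{n+1}), \quad n \in \mathbb{N}_+. \]
Therefore, by (1.3.7) again,

\[ F(\Theta_{n-1}, \Theta_n) = \Psi \bar\tau \Psi^{-1}(\Theta_{n-1}, \Theta_n) = \Psi(\tau^{n+1}, s_{n+1}) = (\Theta_n, \Theta_{n+1}), \quad n \in \mathbb{N}_+. \tag{1.3.11} \]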
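Hence, by (1.3.3), (1.3.8), and (1.3.10),

\[ \Theta_{n+1} = \Theta_{n-1} + a_{n+1} \sqrt{1 - 4 \Theta_{n-1} \Theta_n} - a_{n+1}^2 \Theta_n, \quad n \in \mathbb{N}_+. \tag{1.3.12} \]

Similarly, for any $i \in \mathbb{N}_+$ we have

\[ (\alpha, \beta) \in H_i^* \Rightarrow F^{-1}(\alpha, \beta) = \left( \beta + i \sqrt{1 - 4\alpha\beta} - i^2 \alpha,\ \alpha \right). \tag{1.3.13} \]

As by (1.3.3), (1.3.9), and (1.3.13) we have

\[ F^{-1}(\Theta_n, \Theta_{n+1}) = (\Theta_{n-1}, \Theta_n), \quad n \in \mathbb{N}_+, \]

we obtain
\[ \Theta_{n-1} = \Theta_{n+1} + a_{n+1} \sqrt{1 - 4 \Theta_n \Theta_{n+1}} - a_{n+1}^2 \Theta_n, \quad n \in \mathbb{N}_+. \tag{1.3.12$'$} \]

We note that both (1.3.12) and (1.3.12′) can be established by direct computation using the relationships between $\Theta_n$, $r_n$, $s_n$, and $a_n$, $n \in \mathbb{N}_+$.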
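We are now able to derive some classical results in Diophantine approximation. Put
\[ f_i(\alpha, \beta) = \alpha + i \sqrt{1 - 4\alpha\beta} - i^2 \beta, \quad i \in \mathbb{N}_+, \]

so that (1.3.10) can be rewritten as

\[ (\alpha, \beta) \in V_i^* \Rightarrow F(\alpha, \beta) = (\beta, f_i(\alpha, \beta)). \]

It is easy to check that

\[ \frac{\partial f_i}{\partial \alpha}(\alpha, \beta) < 0, \quad \frac{\partial f_i}{\partial \beta}(\alpha, \beta) < 0, \quad (\alpha, \beta) \in V_i^*,\ i \in \mathbb{N}_+. \tag{1.3.14} \]
The only fixed point of $\bar\tau$ in $V_i$ is $(\xi_i, \xi_i)$, where
\[ \xi_i = [i, i, i, \cdots] = \frac{-i + \sqrt{i^2 + 4}}{2}, \quad i \in \mathbb{N}_+, \]
while the only fixed point of F in $V_i^* = \Psi V_i$ is $(\xi_i^*, \xi_i^*)$, where
\[ (\xi_i^*, \xi_i^*) = \Psi(\xi_i, \xi_i) = \left( \frac{1}{\sqrt{i^2 + 4}}, \frac{1}{\sqrt{i^2 + 4}} \right), \quad i \in \mathbb{N}_+. \tag{1.3.15} \]
Note that by (1.3.11) we have $(\Theta_{n-1}, \Theta_n, \Theta_{n+1}) = (\Theta_{n-1}, F(\Theta_{n-1}, \Theta_n))$, $n \in \mathbb{N}_+$. Hence, for any i, n ∈ N+,

\[ (\Theta_{n-1}, \Theta_n, \Theta_{n+1}) = (\Theta_{n-1}, \Theta_n, f_i(\Theta_{n-1}, \Theta_n)) \]

if and only if $(\Theta_{n-1}, \Theta_n) \in V_i^*$, that is, by (1.3.7), if and only if $a_{n+1} = i$.
Finally, note that
\[ \Theta_{n-1}(\xi_i^*) \ne \Theta_n(\xi_i^*) \tag{1.3.16} \]
for any i, n ∈ N+.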
Now, on account of (1.3.14) through (1.3.16) we can state the following
result.
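Theorem 1.3.2 For any ω ∈ Ω and $n \in \mathbb{N}_+$ we have
\[ \min(\Theta_{n-1}, \Theta_n, \Theta_{n+1}) < \frac{1}{\sqrt{a_{n+1}^2 + 4}} \tag{1.3.17} \]

and
\[ \max(\Theta_{n-1}, \Theta_n, \Theta_{n+1}) > \frac{1}{\sqrt{a_{n+1}^2 + 4}}. \tag{1.3.18} \]

Inequality (1.3.17) generalizes a result of Borel (1903) according to which

\[ \min(\Theta_{n-1}, \Theta_n, \Theta_{n+1}) < \frac{1}{\sqrt{5}}, \quad n \in \mathbb{N}_+. \tag{1.3.11} \]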
A great number of people independently found (1.3.17). See, e.g., Bage-
mihl and McLaughlin (1966), Obrechkoff (1951), Sendov (1959/60).
Inequality (1.3.18) is due to Tong (1983). Actually, the method sketched
above yields easy proofs of generalizations of a great number of classical
results by M. Fujiwara, B. Segre, J. LeVeque, P. Szűsz, and others. We will
mention here a generalization of a result of B. Segre. For other results the
reader is referred to Jager and Kraaikamp (1989) and Kraaikamp (1991).
Theorem 1.3.3 Let ρ ≥ 0 and $n \in \mathbb{N}_+$. Then of the three inequalities
\[ \Theta_{2n-1} < \frac{\rho}{\sqrt{a_{2n+1}^2 + 4\rho}}, \quad \Theta_{2n} < \frac{1}{\sqrt{a_{2n+1}^2 + 4\rho}}, \quad \Theta_{2n+1} < \frac{\rho}{\sqrt{a_{2n+1}^2 + 4\rho}} \]

at least one is satisfied and at least one is not satisfied.

Corollary 1.3.4 [Segre (1945)] Let ρ ≥ 0 and ω ∈ Ω. Then there are infinitely many rational numbers p/q with p < q and g.c.d.(p, q) = 1 satisfying the inequalities
\[ -\frac{\rho}{\sqrt{1 + 4\rho}}\, \frac{1}{q^2} < \omega - \frac{p}{q} < \frac{1}{\sqrt{1 + 4\rho}}\, \frac{1}{q^2}. \]
Remark. Tong (1994) proved the optimal version of Theorem 1.3.2 by showing that for any ω ∈ Ω and $n \in \mathbb{N}_+$ we have
\[ \min(\Theta_{n-1}, \Theta_n, \Theta_{n+1}) < \frac{1}{\sqrt{(a_{n+1} + |\tau^{n+1} - s_n|)^2 + 4}} \]
and
\[ \max(\Theta_{n-1}, \Theta_n, \Theta_{n+1}) > \frac{1}{\sqrt{(a_{n+1} - |\tau^{n+1} - s_n|)^2 + 4}}. \]
□

1.3.3 Extended random variables


It is well known [see, e.g., Doob (1953, p. 456)] that a doubly infinite version of $(a_n)_{n \in \mathbb{N}_+}$ under γ (i.e., when the process is a strictly stationary one, see Theorem 1.2.1) should exist on a richer probability space. It is possible to construct it effectively by using the natural extension $\bar\tau$ as follows. Define extended incomplete quotients $\bar a_\ell$, ℓ ∈ Z, on $\Omega^2$ by
\[ \bar a_{\ell+1}(\omega, \theta) = \bar a_1\bigl( \bar\tau^{\ell}(\omega, \theta) \bigr), \quad \ell \in \mathbb{Z}, \]

with
\[ \bar a_1(\omega, \theta) = a_1(\omega), \quad (\omega, \theta) \in \Omega^2. \]
Clearly, by (1.3.1′) and (1.3.2′) we have

\[ \bar a_n(\omega, \theta) = a_n(\omega), \quad \bar a_0(\omega, \theta) = a_1(\theta), \quad \bar a_{-n}(\omega, \theta) = a_{n+1}(\theta), \quad n \in \mathbb{N}_+,\ (\omega, \theta) \in \Omega^2. \]

Similarly to the interpretation of the $a_n$, $n \in \mathbb{N}_+$, in Subsection 1.2.1, we can consider the $\bar a_\ell$, ℓ ∈ Z, as N+-valued random variables on $(I^2, \mathcal{B}_I^2)$ which are defined µ-a.s. in $I^2$ for any probability measure µ on $\mathcal{B}_I^2$ assigning measure 0 to $I^2 \setminus \Omega^2$. (Such a µ is clearly $\bar\gamma$.) Alternatively, we can look at the $\bar a_\ell$, ℓ ∈ Z, as N+ ∪ {∞}-valued random variables which are defined everywhere in $[0, 1)^2$, as the $a_n$, $n \in \mathbb{N}_+$, can be defined everywhere in [0, 1) (cf. Subsection 1.2.1). In the latter case a typical trajectory of $(\bar a_\ell)_{\ell \in \mathbb{Z}}$ is either
— a doubly infinite sequence of natural numbers;
— a doubly infinite sequence of elements of N+ ∪ {∞} in which the
natural numbers appear finitely many times in consecutive positions;
— a doubly infinite sequence of elements of N+ ∪ {∞} in which the
natural numbers appear in consecutive positions from a certain rank on or
up to a certain rank.
The distinction between the two cases is again immaterial.


Since $\bar\tau$ preserves $\bar\gamma$, the doubly infinite sequence $(\bar a_\ell)_{\ell\in\mathbf{Z}}$ is strictly
stationary under $\bar\gamma$. It is indeed a doubly infinite version of $(a_n)_{n\in\mathbf{N}_+}$ under $\gamma$,
that is, the distribution of $(\bar a_h, \cdots, \bar a_{h+m})$ under $\bar\gamma$ and that of $(a_k, \cdots, a_{k+m})$
under $\gamma$ are identical for any $h \in \mathbf{Z}$, $m \in \mathbf{N}$, and $k \in \mathbf{N}_+$.
The probability structure of $(\bar a_\ell)_{\ell\in\mathbf{Z}}$ under $\bar\gamma$ is described by Corollary
1.3.6 to Theorem 1.3.5 below. The latter also brings to light an important
family of probability measures on $\mathcal{B}_I$, to be called conditional, which we
shall consider in some detail in the next subsection.
Theorem 1.3.5 For any $x \in I$ we have
\[
\bar\gamma\bigl([0,x]\times I \mid \bar a_0, \bar a_{-1}, \cdots\bigr) = \frac{(a+1)x}{ax+1} \quad \bar\gamma\text{-a.s.},
\]
where $a = [\bar a_0, \bar a_{-1}, \cdots]$.


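Proof. As is well known,
\[
\bar\gamma\bigl([0,x]\times I \mid \bar a_0, \bar a_{-1}, \cdots\bigr)
= \lim_{n\to\infty} \bar\gamma\bigl([0,x]\times I \mid \bar a_0, \cdots, \bar a_{-n}\bigr) \quad \bar\gamma\text{-a.s.}
\]
For typographical convenience let us denote by $I_n$ the fundamental interval
$I(\bar a_0, \cdots, \bar a_{-n})$ for any arbitrarily fixed values of the $\bar a_i$, $i = 0, -1, \ldots, -n$.
Then we have
\[
\begin{aligned}
\bar\gamma\bigl([0,x]\times I \mid \bar a_0, \cdots, \bar a_{-n}\bigr)
&= \frac{\bar\gamma([0,x]\times I_n)}{\bar\gamma(I\times I_n)} \\
&= \frac{(\log 2)^{-1}\displaystyle\int_{I_n} dy \int_0^x \frac{du}{(uy+1)^2}}{\gamma(I_n)}
 = \frac{(\log 2)^{-1}\displaystyle\int_{I_n} \frac{x(y+1)}{xy+1}\,\frac{dy}{y+1}}{\gamma(I_n)} \\
&= \frac{\displaystyle\int_{I_n} \frac{x(y+1)}{xy+1}\,\gamma(dy)}{\gamma(I_n)}
 = \frac{x(y_n+1)}{xy_n+1}
\end{aligned}
\]
for some $y_n \in I_n$. Since
\[
\lim_{n\to\infty} y_n = [\bar a_0, \bar a_{-1}, \cdots] = a,
\]
the proof is complete. □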
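The limiting step of the proof can be observed numerically. The sketch below is an illustration under our own naming, not part of the original argument: it uses the closed form $\bar\gamma([0,x]\times[c,d]) = (\log 2)^{-1}\log\frac{xd+1}{xc+1}$ obtained from the inner integral above, and takes as endpoints of a fundamental interval $I(i_1,\cdots,i_n)$ the two numbers $[i_1,\cdots,i_n]$ and $[i_1,\cdots,i_n+1]$.

```python
import math

def cf_value(digits, tail=0.0):
    """Floating-point value of [d1, ..., dn + tail] as a continued fraction in (0, 1]."""
    x = tail
    for d in reversed(digits):
        x = 1.0 / (d + x)
    return x

def conditional_cdf(x, digits):
    """Ratio gamma_bar([0,x] x I_n) / gamma_bar(I x I_n) for I_n = I(digits)."""
    c = cf_value(digits)             # endpoint [i1, ..., in]
    d = cf_value(digits, tail=1.0)   # endpoint [i1, ..., in + 1]
    lo, hi = min(c, d), max(c, d)
    # log((x*hi + 1)/(x*lo + 1)) and log((hi + 1)/(lo + 1)), written with log1p
    num = math.log1p(x * (hi - lo) / (x * lo + 1.0))
    den = math.log1p((hi - lo) / (lo + 1.0))
    return num / den

def limit_cdf(x, a):
    """Right-hand side of Theorem 1.3.5: (a + 1) x / (a x + 1)."""
    return (a + 1.0) * x / (a * x + 1.0)

digits = [1, 2] * 6            # a_0, a_{-1}, ..., a_{-11} of a periodic past
a = cf_value([1, 2] * 20)      # good approximation of [1, 2, 1, 2, ...]
gap = max(abs(conditional_cdf(x, digits) - limit_cdf(x, a)) for x in (0.1, 0.5, 0.9))
```

Lengthening `digits` shrinks the fundamental interval and hence `gap`, in line with $y_n \to a$.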



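Corollary 1.3.6 For any $i \in \mathbf{N}_+$ we have
\[
\bar\gamma(\bar a_1 = i \mid \bar a_0, \bar a_{-1}, \cdots) = P_i(a) \quad \bar\gamma\text{-a.s.},
\]
where $a = [\bar a_0, \bar a_{-1}, \cdots]$ and the functions $P_i$, $i \in \mathbf{N}_+$, are defined by
(1.2.13).
Proof. We have
\[
(\bar a_1 = i) =
\begin{cases}
\left(\dfrac12, 1\right) \times [0,1) & \text{if } i = 1, \\[2mm]
\left(\dfrac{1}{i+1}, \dfrac1i\right] \times [0,1) & \text{if } i \ge 2.
\end{cases}
\]
Hence the conditional probability in the statement is $\bar\gamma$-a.s. equal to
\[
\frac{(a+1)/i}{1 + a/i} - \frac{(a+1)/(i+1)}{1 + a/(i+1)} = P_i(a).
\]
□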
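For the reader's convenience, the last identity can be written out in full. The display below is a worked step, assuming (as is consistent with the expression for $\gamma_a(s_1^a = 1/(a+i))$ used in Subsection 1.3.6) that (1.2.13) reads $P_i(x) = (x+1)/((x+i)(x+i+1))$; the two terms come from Theorem 1.3.5 applied with $x = 1/i$ and $x = 1/(i+1)$.

```latex
\[
\frac{(a+1)/i}{1+a/i} - \frac{(a+1)/(i+1)}{1+a/(i+1)}
 = \frac{a+1}{a+i} - \frac{a+1}{a+i+1}
 = \frac{a+1}{(a+i)(a+i+1)}
 = P_i(a).
\]
```

Summing over $i \in \mathbf{N}_+$ the middle expression telescopes to 1, so the $P_i(a)$ do form a probability distribution.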
Remarks. 1. The strict stationarity of $(\bar a_\ell)_{\ell\in\mathbf{Z}}$ under $\bar\gamma$ implies that the
conditional probability
\[
\bar\gamma(\bar a_{\ell+1} = i \mid \bar a_\ell, \bar a_{\ell-1}, \cdots), \quad i \in \mathbf{N}_+,
\]
does not depend on $\ell \in \mathbf{Z}$ and is $\bar\gamma$-a.s. equal to $P_i(a)$, where $a = [\bar a_\ell, \bar a_{\ell-1}, \cdots]$.
Thus Proposition 1.2.7 and Corollary 1.3.6 provide interpretations of
$P_i(x)$ for all $x \in [0,1)$.
2. The process $(\bar a_\ell)_{\ell\in\mathbf{Z}}$ is an example of what is called an infinite-order
chain in the theory of dependence with complete connections, see Section
5.5 in Iosifescu and Grigorescu (1990). The existence of such chains is not
obvious. To ensure the existence several restrictions should be imposed. See,
e.g., Theorems 5.5.1 and 5.5.2 in Iosifescu and Grigorescu (op. cit.). The
latter refers to $\mathbf{N}_+$-valued infinite-order chains and makes explicit use of the
continued fraction expansion. The simple effective construction of $(\bar a_\ell)_{\ell\in\mathbf{Z}}$ on
the probability space $(I^2, \mathcal{B}_I^2, \bar\gamma)$ fully clarifies an idea of Wolfgang Doeblin
[see Doeblin (1940)], who was the first to use dependence with complete
connections in the metric theory of the continued fraction expansion. □
Note that by its very construction $(\bar a_\ell)_{\ell\in\mathbf{Z}}$ is a reversible process, that
is, the finite dimensional distributions under $\bar\gamma$ of $(\bar a_\ell)_{\ell\in\mathbf{Z}}$ and $(\bar a_{-\ell})_{\ell\in\mathbf{Z}}$ are
identical. A similar property holds for $(a_n)_{n\in\mathbf{N}_+}$ under $\gamma$, as is shown by
the following result.

Proposition 1.3.7 The random sequence $(a_n)_{n\in\mathbf{N}_+}$ on $(I, \mathcal{B}_I, \gamma)$ is reversible,
i.e., the distributions of $(a_\ell : m \le \ell \le n)$ and $(a_{m+n-\ell} : m \le \ell \le n)$
are identical for any $m, n \in \mathbf{N}_+$, $m \le n$.
Proof. By the strict stationarity under $\bar\gamma$ of $(\bar a_\ell)_{\ell\in\mathbf{Z}}$, the distribution of
$(\bar a_\ell : m \le \ell \le n)$ is identical with the distribution of $(\bar a_{\ell-m-n+1} : m \le \ell \le n)$
(both under $\bar\gamma$). But by the very definition of $(\bar a_\ell)_{\ell\in\mathbf{Z}}$ the first distribution
is identical with that of $(a_\ell : m \le \ell \le n)$ while the second one is identical
with that of $(a_{m+n-\ell} : m \le \ell \le n)$ (both under $\gamma$). □
Remark. The result stated in Proposition 1.3.7 amounts to the fact that
the γ-measures of the fundamental intervals I (i1 , · · · , in ) and I (in , · · · , i1 )
are equal for any n ∈ N+ and i1 , · · · , in ∈ N+ . This can be also proved by
direct computation using results from Subsection 1.2.3. See Philipp (1967)
and Dürner (1992). 2
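Define extended associated random variables $\bar s_\ell$, $\bar y_\ell$, $\bar r_\ell$ and $\bar u_\ell$ as
\[
\bar s_\ell = [\bar a_\ell, \bar a_{\ell-1}, \cdots], \quad \bar y_\ell = \frac{1}{\bar s_\ell},
\]
\[
\bar r_\ell = [\bar a_\ell; \bar a_{\ell+1}, \bar a_{\ell+2}, \cdots], \quad \bar u_\ell = \bar s_{\ell-1} + \bar r_\ell, \quad \ell \in \mathbf{Z}.
\]
Clearly,
\[
\bar s_\ell = \bar s_0 \circ \bar\tau^{\,\ell}, \quad \bar y_\ell = \bar y_0 \circ \bar\tau^{\,\ell}, \quad
\bar r_\ell = \bar r_0 \circ \bar\tau^{\,\ell}, \quad \bar u_\ell = \bar s_0 \circ \bar\tau^{\,\ell-1} + \bar r_1 \circ \bar\tau^{\,\ell-1}, \quad \ell \in \mathbf{Z}.
\]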
It follows from the above equations, Theorem 1.3.1, and Corollary 1.3.6 that
$(\bar s_\ell)_{\ell\in\mathbf{Z}}$ is a strictly stationary $\Omega$-valued Markov process on $(I^2, \mathcal{B}_I^2, \bar\gamma)$ with
the following transition mechanism: from state $s \in \Omega$ the possible transitions
are to any state $1/(s+i)$ with corresponding transition probability
$P_i(s)$, $i \in \mathbf{N}_+$. Clearly, for any $\ell \in \mathbf{Z}$ we have
\[
\bar\gamma(\bar s_\ell < x) = \bar\gamma(\bar s_0 < x) = \bar\gamma(I \times [0,x]) = \gamma([0,x]), \quad x \in I.
\]
Similar considerations can be made about the process $(\bar y_\ell)_{\ell\in\mathbf{Z}}$. This is a
strictly stationary $\Omega'$-valued Markov process on $(I^2, \mathcal{B}_I^2, \bar\gamma)$, where $\Omega'$ = the
set of irrationals in $[1,\infty)$. The transition mechanism of $(\bar y_\ell)_{\ell\in\mathbf{Z}}$ is as follows:
from state $y \in \Omega'$ the only possible transitions are to any state $y^{-1} + i$ with
corresponding transition probability $P_i(1/y)$, $i \in \mathbf{N}_+$. For any $\ell \in \mathbf{Z}$ we have
\[
\bar\gamma(\bar y_\ell < x) = \bar\gamma(\bar y_0 < x) = \gamma([x^{-1}, 1]) = \gamma'([1,x]), \quad x \in [1,\infty),
\]

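where $\gamma'$ is the probability measure on $\mathcal{B}_{[1,\infty)}$ defined by
\[
\gamma'(A') = \frac{1}{\log 2} \int_{A'} \frac{dy}{y(y+1)}, \quad A' \in \mathcal{B}_{[1,\infty)}.
\]
Next, the process $(\bar r_\ell)_{\ell\in\mathbf{Z}}$ is a strictly stationary $\Omega'$-valued 'deterministic'
Markov process on $(I^2, \mathcal{B}_I^2, \bar\gamma)$ in which state $r \in \Omega'$ is followed by state
$1/(r - \lfloor r \rfloor) = 1/\tau(r^{-1})$. Obviously, for any $\ell \in \mathbf{Z}$ we have
\[
\bar\gamma(\bar r_\ell < x) = \bar\gamma(\bar r_1 < x) = \gamma(r_1 < x) = \gamma'([1,x)), \quad x \in [1,\infty).
\]
Note that by the reversibility of $(\bar a_\ell)_{\ell\in\mathbf{Z}}$ the finite-dimensional distributions
under $\bar\gamma$ of $(\bar s_\ell)_{\ell\in\mathbf{Z}}$ and $(\bar r_\ell^{-1})_{\ell\in\mathbf{Z}}$ are identical.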
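Finally, the process $(\bar s_{\ell-1}, \bar r_\ell^{-1})_{\ell\in\mathbf{Z}}$ is a strictly stationary $\Omega^2$-valued
'deterministic' Markov process on $(I^2, \mathcal{B}_I^2, \bar\gamma)$ in which state $(s,\omega) \in \Omega^2$ is
followed by state
\[
\bar\tau^{-1}(s,\omega) = \left( \frac{1}{s + \lfloor \omega^{-1} \rfloor},\ \omega^{-1} - \lfloor \omega^{-1} \rfloor \right).
\]
For any $\ell \in \mathbf{Z}$ we have
\[
\begin{aligned}
\bar\gamma\bigl(\bar s_{\ell-1} < x,\ \bar r_\ell^{-1} < y\bigr)
&= \bar\gamma\bigl(\bar s_0 < x,\ \bar r_1^{-1} < y\bigr) = \bar\gamma([0,y]\times[0,x]) \\
&= \frac{1}{\log 2} \int_0^y \!\! \int_0^x \frac{du\,dv}{(uv+1)^2}
 = \frac{\log(xy+1)}{\log 2}, \quad x, y \in I.
\end{aligned}
\]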
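The process $(\bar u_\ell)_{\ell\in\mathbf{Z}}$, which is a functional of $(\bar s_{\ell-1}, \bar r_\ell^{-1})_{\ell\in\mathbf{Z}}$ (note that
$\bar u_\ell = \bar s_{\ell-1} + \bar r_\ell$, $\ell \in \mathbf{Z}$), is no longer Markovian but is still a strictly stationary
one. For any $\ell \in \mathbf{Z}$ we have
\[
\bar\gamma(\bar u_\ell < x) = \bar\gamma(\bar u_1 < x) = \bar\gamma(\bar s_0 + \bar r_1 < x)
= \frac{1}{\log 2} \iint_D \frac{du\,dv}{(uv+1)^2}, \quad x \in [1,\infty),
\]
where $D = \bigl((u,v) \in I^2 : u + v^{-1} < x\bigr)$. Hence
\[
\bar\gamma(\bar u_\ell < x) =
\begin{cases}
\dfrac{1}{\log 2}\left( \log x - \dfrac{x-1}{x} \right) & \text{if } 1 \le x \le 2, \\[3mm]
\dfrac{1}{\log 2}\left( \log 2 - \dfrac{1}{x} \right) & \text{if } x \ge 2.
\end{cases}
\]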
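The two-case formula for the distribution of the extended variable $\bar u_\ell$ can be cross-checked by quadrature. The sketch below is ours and only illustrative: it performs the inner integration over $u$ in closed form, $\int_0^U du/(uv+1)^2 = U/(Uv+1)$ with $U = \min(1, x - 1/v)$, and applies a plain midpoint rule in $v$ over $(1/x, 1]$.

```python
import math

def cdf_u_closed_form(x):
    """Two-case closed form for the distribution function of u_bar at x >= 1."""
    if x <= 2.0:
        return (math.log(x) - (x - 1.0) / x) / math.log(2.0)
    return (math.log(2.0) - 1.0 / x) / math.log(2.0)

def cdf_u_quadrature(x, n=20000):
    """Midpoint rule for (1/log 2) * double integral of du dv/(uv+1)^2 over D."""
    lo, hi = 1.0 / x, 1.0          # D is empty for v <= 1/x
    h = (hi - lo) / n
    total = 0.0
    for k in range(n):
        v = lo + (k + 0.5) * h
        upper = min(1.0, x - 1.0 / v)      # u ranges over [0, upper)
        total += upper / (upper * v + 1.0)  # inner integral in closed form
    return total * h / math.log(2.0)

errors = [abs(cdf_u_closed_form(x) - cdf_u_quadrature(x)) for x in (1.2, 1.5, 2.0, 3.0, 10.0)]
```

Both branches agree at $x = 2$, where each gives $(\log 2 - 1/2)/\log 2$.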

1.3.4 The conditional probability measures


Motivated by Theorem 1.3.5 we shall consider the family of (conditional)
probability measures $(\gamma_a)_{a\in I}$ on $\mathcal{B}_I$ defined by their distribution functions
\[
\gamma_a([0,x]) = \frac{(a+1)x}{ax+1}, \quad x \in I,\ a \in I.
\]
In particular, $\gamma_0 = \lambda$. The density $h_a$ of $\gamma_a$ is
\[
h_a(x) = \frac{a+1}{(ax+1)^2}, \quad x \in I,\ a \in I,
\]
and [see, e.g., Billingsley (1968, p. 224)] we then have
\[
\begin{aligned}
\sup_{A\in\mathcal{B}_I} |\gamma_a(A) - \gamma_b(A)|
&= \frac12 \int_I |h_a(x) - h_b(x)|\,dx \\
&= \frac12\,|b-a| \int_I \frac{\bigl|(ab+a+b)x^2 + 2x - 1\bigr|}{(ax+1)^2 (bx+1)^2}\,dx \\
&= |b-a| \int_0^\alpha \frac{1 - 2x - (ab+a+b)x^2}{(ax+1)^2 (bx+1)^2}\,dx \\
&= \frac{\alpha(1-\alpha)\,|b-a|}{(\alpha a+1)(\alpha b+1)},
\end{aligned}
\]
where $\alpha = \bigl(1 + \sqrt{(a+1)(b+1)}\bigr)^{-1}$, $a, b \in I$. Hence
\[
\sup_{A\in\mathcal{B}_I} |\gamma_a(A) - \gamma_b(A)| \le \frac14\,|b-a|, \quad a, b \in I. \tag{1.3.19}
\]
It is easy to see that we also have
\[
\sup_{x\in I} |\gamma_a([0,x]) - \gamma_b([0,x])| = \frac{\alpha(1-\alpha)\,|b-a|}{(1+\alpha a)(1+\alpha b)}, \quad a, b \in I.
\]
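The closed form for the distance between two conditional measures, and the bound (1.3.19), can be checked against a direct quadrature of the two densities. This is only a sketch with function names of our choosing; the midpoint rule and the sample pairs $(a, b)$ are arbitrary.

```python
import math

def density(a, x):
    """Density h_a(x) = (a + 1)/(a x + 1)^2 of gamma_a."""
    return (a + 1.0) / (a * x + 1.0) ** 2

def tv_closed_form(a, b):
    """alpha (1 - alpha) |b - a| / ((alpha a + 1)(alpha b + 1))."""
    alpha = 1.0 / (1.0 + math.sqrt((a + 1.0) * (b + 1.0)))
    return alpha * (1.0 - alpha) * abs(b - a) / ((alpha * a + 1.0) * (alpha * b + 1.0))

def tv_quadrature(a, b, n=20000):
    """Midpoint rule for (1/2) * integral over I of |h_a - h_b|."""
    h = 1.0 / n
    return 0.5 * h * sum(
        abs(density(a, (k + 0.5) * h) - density(b, (k + 0.5) * h)) for k in range(n)
    )

pairs = [(0.0, 1.0), (0.2, 0.9), (0.5, 0.5), (1.0, 0.3)]
gaps = [abs(tv_closed_form(a, b) - tv_quadrature(a, b)) for a, b in pairs]
```

For $a = 0$, $b = 1$ the distance equals $3 - 2\sqrt2 \approx 0.1716$, comfortably below the bound $1/4$.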

For any $a \in I$ put $s_0^a = a$ and
\[
s_n^a = \frac{1}{s_{n-1}^a + a_n}, \quad n \in \mathbf{N}_+. \tag{1.3.20}
\]
It follows from the properties just described of the process $(\bar s_\ell)_{\ell\in\mathbf{Z}}$ that
the sequence $(s_n^a)_{n\in\mathbf{N}_+}$ is an $I$-valued Markov chain on $(I, \mathcal{B}_I, \gamma_a)$ which
starts at $s_0^a = a$ and has the following transition mechanism: from state
$s \in I$ the possible transitions are to any state $1/(s+i)$ with corresponding
transition probability $P_i(s)$, $i \in \mathbf{N}_+$. [Strictly speaking, this only holds
for any $a \in E \subset \Omega$, for some $E \in \mathcal{B}_I$ with $\lambda(E) = 1$, as $(s_n^a)_{n\in\mathbf{N}}$ under $\gamma_a$
is a version of $(\bar s_n)_{n\in\mathbf{N}}$ under $\bar\gamma(\,\cdot \mid \bar s_0 = a)$, $a \in E$. The validity of the above
assertion for the remaining $a \in I \setminus E$ follows by continuity on account of
(1.3.19).]
Proposition 1.3.8 (Generalized Brodén–Borel–Lévy formula) For any
$a \in I$ and $n \in \mathbf{N}_+$ we have
\[
\gamma_a(\tau^n < x \mid a_1, \cdots, a_n) = \frac{(s_n^a+1)x}{s_n^a x + 1}, \quad x \in I. \tag{1.3.21}
\]
Proof. For any $n \in \mathbf{N}_+$ and $x \in I$ consider the conditional probability
\[
\bar\gamma\bigl(\bar\tau^{-n}([0,x]\times I) \mid \bar a_n, \cdots, \bar a_1, \bar a_0, \bar a_{-1}, \cdots\bigr). \tag{1.3.22}
\]
Put $a = [\bar a_0, \bar a_{-1}, \cdots]$ (actually, $a(\omega,\theta) = \theta$, $(\omega,\theta) \in \Omega^2$) and note that
\[
[\bar a_n, \cdots, \bar a_1, \bar a_0, \bar a_{-1}, \cdots] = s_n^a .
\]
On the one hand, it follows from Theorems 1.3.1 and 1.3.5 (see also
Remark 1 after Corollary 1.3.6) that the conditional probability (1.3.22) is
$\bar\gamma$-a.s. equal to
\[
\frac{(s_n^a+1)x}{s_n^a x + 1}.
\]
On the other hand, putting
\[
\bar\gamma_a(\,\cdot\,) = \bar\gamma(\,\cdot \mid \bar a_0, \bar a_{-1}, \cdots),
\]
it is clear that (1.3.22) is $\bar\gamma$-a.s. equal to
\[
\frac{\bar\gamma_a\bigl(\bar\tau^{-n}([0,x]\times I) \cap (I(a_1, \cdots, a_n)\times I)\bigr)}{\bar\gamma_a\bigl(I(a_1, \cdots, a_n)\times I\bigr)}. \tag{1.3.23}
\]
Since $\bar\tau^{-n}([0,x]\times I) = \tau^{-n}([0,x]) \times I$ and $\bar\gamma_a(A \times I) = \gamma_a(A)$, $A \in \mathcal{B}_I$,
the fraction in (1.3.23) is equal to
\[
\gamma_a\bigl(\tau^{-n}([0,x]) \mid I(a_1, \cdots, a_n)\bigr) = \gamma_a(\tau^n < x \mid a_1, \cdots, a_n).
\]
Therefore (1.3.21) holds for any $a \in E \subset \Omega$, for some $E \in \mathcal{B}_I$ with $\lambda(E) = 1$,
hence by continuity [use (1.3.19)] for the remaining $a \in I \setminus E$. □
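Formula (1.3.21) can also be confirmed by exact rational arithmetic. The sketch below is an illustration with helper names of our own: the event $(\tau^n < x)$ intersected with a fundamental interval is the interval between the points with $\tau^n = 0$ and $\tau^n = x$, and its $\gamma_a$-measure is read off from the distribution function $(a+1)t/(at+1)$.

```python
from fractions import Fraction

def cdf(a, t):
    """Distribution function of gamma_a: gamma_a([0, t]) = (a + 1) t / (a t + 1)."""
    return (a + 1) * t / (a * t + 1)

def point(digits, t):
    """The number omega with a_1, ..., a_n = digits and tau^n(omega) = t."""
    x = t
    for d in reversed(digits):
        x = 1 / (d + x)
    return x

def s_n(a, digits):
    """s_n^a from the recursion s_0^a = a, s_k^a = 1/(s_{k-1}^a + a_k)."""
    s = a
    for d in digits:
        s = 1 / (s + d)
    return s

def conditional_prob(a, digits, x):
    """gamma_a(tau^n < x | a_1..a_n = digits), computed from the definition."""
    w0 = point(digits, Fraction(0))
    wx = point(digits, x)
    w1 = point(digits, Fraction(1))
    return abs(cdf(a, wx) - cdf(a, w0)) / abs(cdf(a, w1) - cdf(a, w0))

a, digits, x = Fraction(1, 3), [2, 1, 4], Fraction(2, 5)
s = s_n(a, digits)
lhs = conditional_prob(a, digits, x)
rhs = (s + 1) * x / (s * x + 1)
```

Because everything is a `Fraction`, `lhs` and `rhs` can be compared with `==` rather than up to rounding; taking `a = Fraction(0)` gives the classical formula for Lebesgue measure.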

Remark. Equation (1.3.21) can be also proved by direct computation
(cf. the proof of Corollary 1.2.6). □

Corollary 1.3.9 For any $a \in I$ and $n \in \mathbf{N}_+$ we have
\[
\gamma_a(A \mid a_1, \cdots, a_n) = \gamma_{s_n^a}(\tau^n(A)) \tag{1.3.24}
\]
whatever the set $A$ belonging to the σ-algebra generated by the random variables
$a_{n+1}, a_{n+2}, \cdots$, that is, $\tau^{-n}(\mathcal{B}_I)$.

We now give a generalization of Proposition 1.2.9, where Lebesgue measure
$\lambda\,(= \gamma_0)$ is replaced by $\gamma_a$, $a \in I$. Define first the random variables $u_n^a$
as
\[
u_n^a = s_{n-1}^a + r_n, \quad n \in \mathbf{N}_+,\ a \in I.
\]

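Proposition 1.3.10 For any $a \in I$, $n \in \mathbf{N}_+$, and $x \ge 1$ we have
\[
\gamma_a(r_1 < x) = 1 - \frac{a+1}{x+a},
\]
\[
\gamma_a(u_1^a < x) =
\begin{cases}
0 & \text{if } x \le a+1, \\[1mm]
1 - \dfrac{a+1}{x} & \text{if } x > a+1,
\end{cases}
\]
\[
\gamma_a(r_{n+1} < x \mid a_1, \ldots, a_n) = 1 - \frac{s_n^a+1}{x+s_n^a},
\]
\[
\gamma_a(u_{n+1}^a < x \mid a_1, \ldots, a_n) =
\begin{cases}
0 & \text{if } x \le s_n^a+1, \\[1mm]
1 - \dfrac{s_n^a+1}{x} & \text{if } x > s_n^a+1.
\end{cases}
\]
The proof is entirely similar to that of Proposition 1.2.9. □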

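Corollary 1.3.11 For any $a \in I$ and $n \in \mathbf{N}_+$ let $G_n^a(s) = \gamma_a(s_n^a < s)$, $s \in \mathbf{R}$,
and let $G_0^a(s) = 0$ or $1$ according as $s \le a$ or $s > a$. For any $a \in I$,
$n \in \mathbf{N}_+$, and $x \ge 1$ we have
\[
\gamma_a(r_n < x) = \int_0^1 \frac{x-1}{s+x}\,dG_{n-1}^a(s)
= (x-1)\left( \frac{1}{x+1} + \int_0^1 \frac{G_{n-1}^a(s)\,ds}{(s+x)^2} \right),
\]
\[
\gamma_a(u_n^a < x) =
\left.
\begin{cases}
\displaystyle\int_0^{x-1} \left( 1 - \frac{s+1}{x} \right) dG_{n-1}^a(s) & \text{if } 1 \le x \le 2 \\[4mm]
\displaystyle\int_0^{1} \left( 1 - \frac{s+1}{x} \right) dG_{n-1}^a(s) & \text{if } x > 2
\end{cases}
\right\}
= \frac1x \int_0^{x-1} G_{n-1}^a(s)\,ds.
\]
Equations similar to (1.2.19) and (1.2.20) hold, too.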

1.3.5 Paul Lévy’s solution to Gauss’ problem


We now present the elegant solution given by Lévy (1929) to Gauss' problem.
Actually, as Lévy has done in the case $a = 0$, we shall obtain estimates for
both 'errors' $F_n^a - G$ and $G_n^a - G$, $a \in I$, $n \in \mathbf{N}$, where
\[
F_n^a(x) = \gamma_a(\tau^n < x), \quad x \in I, \qquad G_n^a(s) = \gamma_a(s_n^a < s), \quad s \in \mathbf{R},
\]
and $G(s) = 0$, $\gamma([0,s])$, or $1$ according as $s < 0$, $s \in I$, or $s > 1$.
It follows from Corollary 1.3.11 that
\[
F_n^a(x) = \int_0^1 \frac{x(s+1)}{xs+1}\,dG_n^a(s) \tag{1.3.25}
\]
for any $a, x \in I$ and $n \in \mathbf{N}$. It is easy to check that
\[
G(x) = \int_0^1 \frac{x(s+1)}{xs+1}\,dG(s), \quad x \in I, \tag{1.3.26}
\]
and
\[
G_{n+1}^a\!\left(\frac1m\right) = F_n^a\!\left(\frac1m\right), \quad m, n \in \mathbf{N}_+,\ a \in I. \tag{1.3.27}
\]
The last equation is still valid for $n = 0$ and $a \ne 0$ while
\[
G_1^0\!\left(\frac1m\right) = F_0^0\!\left(\frac{1}{m+1}\right) = \frac{1}{m+1}, \quad m \in \mathbf{N}_+. \tag{1.3.27′}
\]
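Since $(s_n^a)_{n\in\mathbf{N}}$ is a Markov chain on $(I, \mathcal{B}_I, \gamma_a)$ (see the preceding subsection),
for any $m, n \in \mathbf{N}_+$, $a \in I$, and $\theta \in [0,1)$ we have
\[
\begin{aligned}
G_{n+1}^a\!\left(\frac1m\right) - G_{n+1}^a\!\left(\frac{1}{m+\theta}\right)
&= \gamma_a\!\left( \frac{1}{m+\theta} \le s_{n+1}^a < \frac1m \right) \\
&= E\left( \gamma_a\!\left( \left. \frac{1}{m+\theta} \le s_{n+1}^a < \frac1m \,\right|\, s_n^a \right) \right) \\
&= \int_0^\theta P_m(s)\,dG_n^a(s)
\end{aligned}
\tag{1.3.28}
\]
while
\[
\begin{aligned}
G_1^a\!\left(\frac1m\right) - G_1^a\!\left(\frac{1}{m+\theta}\right)
&= \gamma_a\!\left( \frac{1}{m+\theta} \le \frac{1}{a_1+a} < \frac1m \right) \\
&= \int_{0+}^\theta P_m(s)\,dG_0^a(s),
\end{aligned}
\tag{1.3.28′}
\]
that is, (1.3.28) also holds for $n = 0$ if $a \ne 0$.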


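It is easy to check that
\[
\int_0^\theta P_m(s)\,dG(s) = G\!\left(\frac1m\right) - G\!\left(\frac{1}{m+\theta}\right) \tag{1.3.29}
\]
for any $m \in \mathbf{N}_+$ and $\theta \in [0,1)$.
Now, by (1.3.25) and (1.3.26) we have
\[
\begin{aligned}
F_n^a(x) - G(x) &= \int_0^1 \frac{x(s+1)}{xs+1}\,d\bigl(G_n^a(s) - G(s)\bigr) \\
&= -\int_0^1 \bigl(G_n^a(s) - G(s)\bigr)\,\frac{\partial}{\partial s}\!\left( \frac{x(s+1)}{xs+1} \right) ds
\end{aligned}
\]
for any $a, x \in I$ and $n \in \mathbf{N}$. Setting
\[
\alpha_n^a = \sup_{s\in I} |G_n^a(s) - G(s)|, \quad a \in I,\ n \in \mathbf{N},
\]
we obtain
\[
|F_n^a(x) - G(x)| \le \alpha_n^a \int_0^1 \frac{x(1-x)}{(xs+1)^2}\,ds = \frac{x(1-x)}{x+1}\,\alpha_n^a,
\]
hence
\[
|F_n^a(x) - G(x)| \le (3 - 2\sqrt2)\,\alpha_n^a \tag{1.3.30}
\]
for any $a, x \in I$ and $n \in \mathbf{N}$. Let us note that
\[
\alpha_0^a = \max\bigl(G(a),\ 1 - G(a)\bigr), \quad a \in I.
\]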
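The constant in (1.3.30) comes from an elementary maximization, which we spell out as a worked step (the computation is ours, added for convenience):

```latex
\[
\frac{\partial}{\partial s}\,\frac{x(s+1)}{xs+1} = \frac{x(1-x)}{(xs+1)^2}, \qquad
\int_0^1 \frac{x(1-x)}{(xs+1)^2}\,ds = (1-x)\left(1 - \frac{1}{x+1}\right) = \frac{x(1-x)}{x+1},
\]
\[
\frac{d}{dx}\,\frac{x(1-x)}{x+1} = \frac{1 - 2x - x^2}{(x+1)^2} = 0
\iff x = \sqrt2 - 1, \qquad
\max_{x\in I} \frac{x(1-x)}{x+1} = \frac{(\sqrt2-1)(2-\sqrt2)}{\sqrt2} = 3 - 2\sqrt2 .
\]
```

Numerically, $3 - 2\sqrt2 = 0.17157\cdots$.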

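Theorem 1.3.12 For any $n \in \mathbf{N}_+$ and $a \in I$ we have
\[
\sup_{x\in I} |F_n^a(x) - G(x)| \le \frac12\,(3 - 2\sqrt2)(3.5 - 2\sqrt2)^{n-1},
\]
\[
\sup_{x\in I} |G_n^a(x) - G(x)| \le \frac12\,(3.5 - 2\sqrt2)^{n-1}.
\]
Proof. By (1.3.27) through (1.3.30), for any $m, n \in \mathbf{N}_+$, $a \in I$, and
$\theta \in [0,1)$ (also for $n = 0$ and any $m \in \mathbf{N}_+$, $a \in (0,1]$, and $\theta \in [0,1)$) we
have
\[
\begin{aligned}
\left| G_{n+1}^a\!\left(\frac{1}{m+\theta}\right) - G\!\left(\frac{1}{m+\theta}\right) \right|
&\le \left| G_{n+1}^a\!\left(\frac1m\right) - G\!\left(\frac1m\right) \right| \\
&\quad + \left| G_{n+1}^a\!\left(\frac1m\right) - G_{n+1}^a\!\left(\frac{1}{m+\theta}\right) - G\!\left(\frac1m\right) + G\!\left(\frac{1}{m+\theta}\right) \right| \\
&= \left| F_n^a\!\left(\frac1m\right) - G\!\left(\frac1m\right) \right| + \left| \int_0^\theta P_m(s)\,d\bigl(G_n^a(s) - G(s)\bigr) \right| \\
&\le (3 - 2\sqrt2)\,\alpha_n^a \\
&\quad + \left| \int_0^\theta \bigl(G(s) - G_n^a(s)\bigr)\,dP_m(s) + P_m(\theta)\bigl(G_n^a(\theta) - G(\theta)\bigr) \right| \\
&\le \bigl(3 - 2\sqrt2 + \beta(m,\theta)\bigr)\,\alpha_n^a,
\end{aligned}
\]
where
\[
\beta(m,\theta) = \int_0^\theta \left| \frac{dP_m(s)}{ds} \right| ds + P_m(\theta).
\]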
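It is easy to check that $\beta(m,\theta) \le 1/2$ for any $m \in \mathbf{N}_+$ and $\theta \in [0,1)$.
Actually,
\[
\beta(m,\theta) =
\begin{cases}
1/2 & \text{if } m = 1, \\[1mm]
4/(3+\theta) - 2/(2+\theta) - 1/6 & \text{if } m = 2 \text{ and } \theta \le \sqrt2 - 1, \\[1mm]
6 - 4\sqrt2 - 1/6 & \text{if } m = 2 \text{ and } \theta \ge \sqrt2 - 1, \\[1mm]
2P_m(\theta) - 1/m(m+1) & \text{if } m \ge 3.
\end{cases}
\]
Hence
\[
\alpha_{n+1}^a = \sup_{m\in\mathbf{N}_+,\ \theta\in[0,1)} \left| G_{n+1}^a\!\left(\frac{1}{m+\theta}\right) - G\!\left(\frac{1}{m+\theta}\right) \right|
\le (3.5 - 2\sqrt2)\,\alpha_n^a \tag{1.3.31}
\]
for any $a \in I$ and $n \in \mathbf{N}_+$.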
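Finally, by (1.3.27), (1.3.27′), and (1.3.28′),
\[
G_1^0\!\left(\frac{1}{m+\theta}\right) = G_1^0\!\left(\frac1m\right) = \frac{1}{m+1}
\]
and
\[
\begin{aligned}
G_1^a\!\left(\frac{1}{m+\theta}\right)
&= G_1^a\!\left(\frac1m\right) - \int_0^\theta P_m(s)\,dG_0^a(s) \\
&= \begin{cases}
F_0^a\!\left(\dfrac1m\right) - P_m(a) & \text{if } a \le \theta < 1, \\[3mm]
F_0^a\!\left(\dfrac1m\right) & \text{if } 0 \le \theta < a
\end{cases} \\
&= \begin{cases}
\dfrac{a+1}{a+m+1} & \text{if } a \le \theta < 1, \\[3mm]
\dfrac{a+1}{a+m} & \text{if } 0 \le \theta < a
\end{cases}
\end{aligned}
\]
for any $a \in (0,1]$, $\theta \in [0,1)$, and $m \in \mathbf{N}_+$. (The unit mass of $G_0^a$ sits at $a$, so the
integral equals $P_m(a)$ exactly when $a \le \theta$.) It is easy to see that
\[
\alpha_1^a = \sup_{m\in\mathbf{N}_+,\ \theta\in[0,1)} \left| G_1^a\!\left(\frac{1}{m+\theta}\right) - G\!\left(\frac{1}{m+\theta}\right) \right| \le \frac12, \quad a \in I. \tag{1.3.32}
\]
It follows from (1.3.31) and (1.3.32) that
\[
\alpha_n^a \le \frac12\,(3.5 - 2\sqrt2)^{n-1}, \quad n \in \mathbf{N}_+,\ a \in I.
\]
By (1.3.30) the proof is complete. □
Theorem 1.3.12 shows that both $F_n^a$ and $G_n^a$ converge very fast to Gauss'
distribution function $G$. Actually, the convergence is even considerably
faster. See Corollary 2.3.6 and Theorem 2.5.5.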

1.3.6 Mixing properties


We conclude this section by studying the ψ-mixing coefficients of $(a_n)_{n\in\mathbf{N}_+}$
under either $\gamma_a$, $a \in I$, or $\gamma$. Theorem 1.3.12 plays here an important part.
For any $k \in \mathbf{N}_+$ let $\mathcal{B}_1^k = \sigma(a_1, \cdots, a_k)$ and $\mathcal{B}_k^\infty = \sigma(a_k, a_{k+1}, \cdots)$
denote the σ-algebras generated by the random variables $a_1, \cdots, a_k$, respectively,
$a_k, a_{k+1}, \cdots$. Clearly, $\mathcal{B}_1^k$ is the σ-algebra generated by the closures
of the fundamental intervals of rank $k$ while $\mathcal{B}_k^\infty = \tau^{-k+1}(\mathcal{B}_I)$, $k \in \mathbf{N}_+$.
For any $\mu \in \mathrm{pr}(\mathcal{B}_I)$ consider the ψ-mixing coefficients (cf. Section A3.1)
\[
\psi_\mu(n) = \sup \left| \frac{\mu(A \cap B)}{\mu(A)\,\mu(B)} - 1 \right|, \quad n \in \mathbf{N}_+,
\]
where the supremum is taken over all $A \in \mathcal{B}_1^k$ and $B \in \mathcal{B}_{k+n}^\infty$ such that
$\mu(A)\,\mu(B) \ne 0$, and $k \in \mathbf{N}_+$.
Define
\[
\varepsilon_n = \sup \left| \frac{\gamma_a(B)}{\gamma(B)} - 1 \right|, \quad n \in \mathbf{N}_+,
\]
where the supremum is taken over all $a \in I$ and $B \in \mathcal{B}_n^\infty$ with $\gamma(B) > 0$.
Note that the sequence $(\varepsilon_n)_{n\in\mathbf{N}_+}$ is non-increasing since $\mathcal{B}_{n+1}^\infty \subset \mathcal{B}_n^\infty$ for any
$n \in \mathbf{N}_+$. We shall show that $\varepsilon_n$ can be expressed in terms of $F_{n-1}^a$, $a \in I$,
and $G$, namely, $\varepsilon_n = \varepsilon_n'$ with
\[
\varepsilon_n' = \sup_{a,x\in I} \left| \frac{dF_{n-1}^a(x)/dx}{g(x)} - 1 \right|, \quad n \in \mathbf{N}_+,
\]
where $g(x) = G'(x) = (\log 2)^{-1}/(x+1)$, $x \in I$.
Indeed, by the very definition of $\varepsilon_n'$, for any $a, x \in I$ we have
\[
\varepsilon_n'\,g(x) \ge \left| \frac{dF_{n-1}^a(x)}{dx} - g(x) \right|.
\]
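By integrating the above inequality over $B \in \mathcal{B}_n^\infty$ we obtain
\[
\gamma(B)\,\varepsilon_n' \ge \int_B \left| \frac{dF_{n-1}^a(x)}{dx} - g(x) \right| dx
\ge \left| \int_B dF_{n-1}^a(x) - \int_B g(x)\,dx \right| = |\gamma_a(B) - \gamma(B)|
\]
for any $B \in \mathcal{B}_n^\infty$, $n \in \mathbf{N}_+$, and $a \in I$. Hence $\varepsilon_n' \ge \varepsilon_n$, $n \in \mathbf{N}_+$. On the other
hand, for any arbitrarily given $n \in \mathbf{N}_+$ let $B_{x,h}^+ = (x \le \tau^{n-1} < x+h) \in \mathcal{B}_n^\infty$,
with $x \in [0,1)$, $h > 0$, $x+h \in I$, and $B_{x,h}^- = (x-h \le \tau^{n-1} < x) \in \mathcal{B}_n^\infty$,
with $x \in (0,1]$, $h > 0$, $x-h \in I$. Clearly,
\[
\varepsilon_n \ge \max\left( \left| \frac{\gamma_a(B_{x,h}^+)}{\gamma(B_{x,h}^+)} - 1 \right|,\ \left| \frac{\gamma_a(B_{x,h}^-)}{\gamma(B_{x,h}^-)} - 1 \right| \right)
\]
for any $a \in I$ and suitable $x \in I$ and $h > 0$. Letting $h \to 0$ we get
$\varepsilon_n \ge \varepsilon_n'$, $n \in \mathbf{N}_+$. Therefore $\varepsilon_n = \varepsilon_n'$, $n \in \mathbf{N}_+$.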
¡ ¢
It is easy to compute $\varepsilon_1' = \varepsilon_1$ and $\varepsilon_2' = \varepsilon_2$. Since $F_0^a(x) = \gamma_a(\tau^0 < x) =
\gamma_a([0,x])$, $a, x \in I$, we have
\[
\varepsilon_1 = \sup_{a,x\in I} \left| \frac{dF_0^a(x)/dx}{g(x)} - 1 \right|
= \sup_{a,x\in I} \left| \frac{(a+1)(x+1)}{(ax+1)^2}\,\log 2 - 1 \right|.
\]
As
\[
1 \le \frac{(a+1)(x+1)}{(ax+1)^2} \le 2, \quad a, x \in I,
\]
it follows that
\[
\varepsilon_1 = 2\log 2 - 1 = 0.38629\cdots.
\]
Next, as $\gamma_a(s_1^a = 1/(a+i)) = P_i(a)$, $a \in I$, $i \in \mathbf{N}_+$, by Proposition 1.3.8
we have
\[
F_1^a(x) = \sum_{i\in\mathbf{N}_+} \frac{(a+i+1)x}{x+a+i}\cdot\frac{a+1}{(a+i)(a+i+1)}
= \sum_{i\in\mathbf{N}_+} \frac{(a+1)x}{(x+a+i)(a+i)}, \quad a, x \in I.
\]
Then
\[
\varepsilon_2 = \sup_{a,x\in I} \left| \frac{dF_1^a(x)/dx}{g(x)} - 1 \right|
= \sup_{a,x\in I} \left| (\log 2)(a+1)(x+1) \sum_{i\in\mathbf{N}_+} \frac{1}{(x+a+i)^2} - 1 \right|.
\]
It is not difficult to check that
\[
2(\zeta(2) - 1) \le (a+1)(x+1) \sum_{i\in\mathbf{N}_+} \frac{1}{(x+a+i)^2} \le \zeta(2), \quad a, x \in I.
\]
Hence
\[
\varepsilon_2 = \max\bigl(\zeta(2)\log 2 - 1,\ 1 - 2(\zeta(2)-1)\log 2\bigr) = \zeta(2)\log 2 - 1 = 0.14018\cdots.
\]


For $n \ge 3$ the computation of $\varepsilon_n$ becomes forbidding. Instead, Theorem
1.3.12 can be used to derive good upper bounds for $\varepsilon_n$ whatever $n \in \mathbf{N}_+$.
Proposition 1.3.13 We have $\varepsilon_1 < \log 2$ and
\[
\varepsilon_n \le \frac12\,(\log 2)\,c^{\,n-2}, \quad n \ge 2,
\]
where $c = 3.5 - 2\sqrt2 = 0.67157\cdots$.
Proof. It follows from (1.3.25) and (1.3.26) that
\[
\frac{dF_n^a(x)}{dx} = \int_0^1 \frac{s+1}{(xs+1)^2}\,dG_n^a(s)
\]
and
\[
g(x) = \int_0^1 \frac{s+1}{(xs+1)^2}\,dG(s)
\]
for any $a, x \in I$ and $n \in \mathbf{N}$. Using the last two equations, integration by
parts yields
\[
\begin{aligned}
\left| \frac{dF_n^a(x)}{dx} - g(x) \right|
&= \left| \int_0^1 \frac{s+1}{(xs+1)^2}\,d\bigl(G_n^a(s) - G(s)\bigr) \right| \\
&= \left| \int_0^1 \bigl(G_n^a(s) - G(s)\bigr)\,\frac{\partial}{\partial s}\!\left( \frac{s+1}{(xs+1)^2} \right) ds \right| \\
&\le \sup_{s\in I} |G_n^a(s) - G(s)| \int_0^1 \frac{|x(s+2) - 1|}{(xs+1)^3}\,ds.
\end{aligned}
\]
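But
\[
\begin{aligned}
\int_0^1 \frac{|x(s+2)-1|}{(xs+1)^3}\,ds
&= \begin{cases}
\displaystyle\int_0^1 \frac{1 - x(s+2)}{(xs+1)^3}\,ds & \text{if } 0 \le x \le \frac13, \\[4mm]
\displaystyle\int_0^{(1-2x)/x} \frac{1 - x(s+2)}{(xs+1)^3}\,ds - \int_{(1-2x)/x}^1 \frac{1 - x(s+2)}{(xs+1)^3}\,ds & \text{if } \frac13 \le x \le \frac12, \\[4mm]
\displaystyle\int_0^1 \frac{x(s+2) - 1}{(xs+1)^3}\,ds & \text{if } \frac12 \le x \le 1
\end{cases} \\[2mm]
&= \begin{cases}
2(x+1)^{-2} - 1 & \text{if } 0 \le x \le \frac13, \\[1mm]
-2(x+1)^{-2} - 1 + (2x(1-x))^{-1} & \text{if } \frac13 \le x \le \frac12, \\[1mm]
1 - 2(x+1)^{-2} & \text{if } \frac12 \le x \le 1
\end{cases}
\end{aligned}
\]
and
\[
(x+1) \int_0^1 \frac{|x(s+2)-1|}{(xs+1)^3}\,ds =
\begin{cases}
2(x+1)^{-1} - (x+1) & \text{if } 0 \le x \le \frac13 \\[1mm]
-2(x+1)^{-1} - (x+1) + (x+1)(2x(1-x))^{-1} & \text{if } \frac13 \le x \le \frac12 \\[1mm]
x + 1 - 2(x+1)^{-1} & \text{if } \frac12 \le x \le 1
\end{cases}
\ \le 1.
\]
Therefore
\[
\sup_{a,x\in I} \left| \frac{dF_n^a(x)/dx}{g(x)} - 1 \right| \le (\log 2) \sup_{a,s\in I} |G_n^a(s) - G(s)|, \quad n \in \mathbf{N}.
\]
Then
\[
\varepsilon_1' = \varepsilon_1 \le \log 2
\]
and, by Theorem 1.3.12,
\[
\varepsilon_{n+1}' = \varepsilon_{n+1} \le \frac12\,(\log 2)\,c^{\,n-1}, \quad n \in \mathbf{N}_+.
\]
□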
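The three-case evaluation of the integral used in this proof, and the bound by 1 after multiplication by $x+1$, are easy to cross-check numerically. The following is a sketch only, with names of our own and a plain midpoint rule.

```python
def closed_form(x):
    """Piecewise value of the integral over [0,1] of |x(s+2) - 1| / (xs+1)^3 ds."""
    if x <= 1 / 3:
        return 2 / (x + 1) ** 2 - 1
    if x <= 1 / 2:
        return -2 / (x + 1) ** 2 - 1 + 1 / (2 * x * (1 - x))
    return 1 - 2 / (x + 1) ** 2

def quadrature(x, n=20000):
    """Midpoint rule for the same integral."""
    h = 1 / n
    return h * sum(
        abs(x * (s + 2) - 1) / (x * s + 1) ** 3
        for s in ((k + 0.5) * h for k in range(n))
    )

sample = (0.0, 0.2, 1 / 3, 0.4, 0.5, 0.8, 1.0)
gaps = [abs(closed_form(x) - quadrature(x)) for x in sample]
scaled = [(x + 1) * closed_form(x) for x in sample]
```

On the sample points the scaled values stay at or below 1, with equality at both $x = 0$ and $x = 1$.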
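Theorem 1.3.14 For any $a \in I$ we have
\[
\psi_{\gamma_a}(n) \le \frac{\varepsilon_n + \varepsilon_{n+1}}{1 - \varepsilon_{n+1}}, \quad n \in \mathbf{N}_+. \tag{1.3.33}
\]
Also,
\[
\psi_\gamma(n) = \varepsilon_n, \quad n \in \mathbf{N}_+. \tag{1.3.34}
\]
Proof. It follows from (1.3.24) that for any $a \in I$ we have
\[
\varepsilon_n = \sup \left| \frac{\gamma_a\bigl(B \mid I(i^{(k)})\bigr)}{\gamma(B)} - 1 \right|, \quad n \in \mathbf{N}_+, \tag{1.3.35}
\]
where the supremum is taken over all $B \in \mathcal{B}_{k+n}^\infty$ with $\gamma(B) > 0$, $i^{(k)} \in \mathbf{N}_+^k$,
and $k \in \mathbf{N}$. For arbitrarily given $k, \ell, n \in \mathbf{N}_+$, $i^{(k)} \in \mathbf{N}_+^k$, and $j^{(\ell)} \in \mathbf{N}_+^\ell$
put
\[
A = I(i^{(k)}), \quad B = \bigl((a_{k+n}, \cdots, a_{k+n+\ell-1}) = j^{(\ell)}\bigr)
\]
and note that $\gamma_a(A)\,\gamma_a(B) \ne 0$ for any $a \in I$. By (1.3.35) we have
\[
|\gamma_a(B \mid A) - \gamma(B)| \le \varepsilon_n\,\gamma(B) \tag{1.3.36}
\]
and
\[
|\gamma_a(B) - \gamma(B)| \le \varepsilon_{n+k}\,\gamma(B). \tag{1.3.37}
\]
It follows from (1.3.36) and (1.3.37) that
\[
|\gamma_a(B \mid A) - \gamma_a(B)| \le (\varepsilon_n + \varepsilon_{n+k})\,\gamma(B),
\]
whence
\[
|\gamma_a(A \cap B) - \gamma_a(A)\,\gamma_a(B)| \le (\varepsilon_n + \varepsilon_{n+k})\,\gamma_a(A)\,\gamma(B).
\]
Finally, note that (1.3.37) yields
\[
\gamma(B) \le \frac{\gamma_a(B)}{1 - \varepsilon_{n+k}}.
\]
Since the sequence $(\varepsilon_n)_{n\in\mathbf{N}_+}$ is non-increasing, we have
\[
\frac{\varepsilon_n + \varepsilon_{n+k}}{1 - \varepsilon_{n+k}} \le \frac{\varepsilon_n + \varepsilon_{n+1}}{1 - \varepsilon_{n+1}}, \quad k, n \in \mathbf{N}_+,
\]
which completes the proof of (1.3.33).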


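To prove (1.3.34) we first note that putting $A = I(i^{(k)})$ for any given
$k \in \mathbf{N}_+$ and $i^{(k)} \in \mathbf{N}_+^k$, by (1.3.35) we have
\[
|\gamma_a(A \cap B) - \gamma_a(A)\,\gamma(B)| \le \varepsilon_n\,\gamma_a(A)\,\gamma(B)
\]
for any $a \in I$, $B \in \mathcal{B}_{k+n}^\infty$, and $n \in \mathbf{N}_+$. By integrating the above inequality
over $a \in I$ with respect to $\gamma$ and taking into account that
\[
\int_I \gamma_a(E)\,\gamma(da) = \gamma(E), \quad E \in \mathcal{B}_I,
\]
we obtain $\psi_\gamma(n) \le \varepsilon_n$, $n \in \mathbf{N}_+$.
To prove the converse inequality remark that the ψ-mixing coefficients
under the extended Gauss measure $\bar\gamma$ of the doubly infinite sequence $(\bar a_\ell)_{\ell\in\mathbf{Z}}$
of extended incomplete quotients are equal to the corresponding ψ-mixing
coefficients under $\gamma$ of $(a_n)_{n\in\mathbf{N}_+}$. This is obvious by the very definitions of
$(\bar a_\ell)_{\ell\in\mathbf{Z}}$ and ψ-mixing coefficients. See Subsection 1.3.3 and Section A3.1.
As $(\bar a_\ell)_{\ell\in\mathbf{Z}}$ is strictly stationary under $\bar\gamma$, we have
\[
\psi_\gamma(n) = \psi_{\bar\gamma}(n) = \sup \left| \frac{\bar\gamma(\bar A \cap \bar B)}{\bar\gamma(\bar A)\,\bar\gamma(\bar B)} - 1 \right|, \quad n \in \mathbf{N}_+,
\]
where the upper bound is taken over all $\bar A \in \sigma(\bar a_n, \bar a_{n+1}, \cdots)$ and $\bar B \in
\sigma(\bar a_0, \bar a_{-1}, \cdots)$ for which $\bar\gamma(\bar A)\,\bar\gamma(\bar B) \ne 0$. Clearly, $\bar A = A \times I$ and $\bar B = I \times B$,
with $A \in \mathcal{B}_n^\infty = \tau^{-n+1}(\mathcal{B}_I)$ and $B \in \mathcal{B}_I$. Then
\[
\psi_\gamma(n) = \sup_{\substack{A \in \tau^{-n+1}(\mathcal{B}_I),\ B \in \mathcal{B}_I \\ \gamma(A)\gamma(B) \ne 0}} \left| \frac{\bar\gamma(A \times B)}{\gamma(A)\,\gamma(B)} - 1 \right|, \quad n \in \mathbf{N}_+. \tag{1.3.38}
\]
Now, it is easy to check that
\[
\bar\gamma(A \times B) = \int_A \gamma(da)\,\gamma_a(B) = \int_B \gamma(db)\,\gamma_b(A)
\]
for any $A, B \in \mathcal{B}_I$. It then follows from (1.3.38) and the very definition of
$\varepsilon_n$ that
\[
\psi_\gamma(n) \ge \sup_{\substack{b \in I,\ A \in \tau^{-n+1}(\mathcal{B}_I) \\ \gamma(A) \ne 0}} \left| \frac{\gamma_b(A)}{\gamma(A)} - 1 \right| = \varepsilon_n, \quad n \in \mathbf{N}_+.
\]
This completes the proof of (1.3.34). □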

Corollary 1.3.15 The sequence $(a_n)_{n\in\mathbf{N}_+}$ is ψ-mixing under $\gamma$ and any
$\gamma_a$, $a \in I$. For any $a \in I$ we have $\psi_{\gamma_a}(1) \le (\varepsilon_1 + \varepsilon_2)/(1 - \varepsilon_2) = 0.61231\cdots$
and
\[
\psi_{\gamma_a}(n) \le \frac{(\log 2)\,c^{\,n-2}(1+c)}{2 - (\log 2)\,c^{\,n-1}}, \quad n \ge 2.
\]
Also, $\psi_\gamma(1) = 2\log 2 - 1 = 0.38629\cdots$, $\psi_\gamma(2) = \zeta(2)\log 2 - 1 = 0.14018\cdots$
and
\[
\psi_\gamma(n) \le \frac12\,(\log 2)\,c^{\,n-2}, \quad n \ge 3.
\]
The doubly infinite sequence $(\bar a_\ell)_{\ell\in\mathbf{Z}}$ of extended incomplete quotients is
ψ-mixing under the extended Gauss measure $\bar\gamma$, and its ψ-mixing coefficients
are equal to the corresponding ψ-mixing coefficients under $\gamma$ of $(a_n)_{n\in\mathbf{N}_+}$.
The proof follows from Proposition 1.3.13 and Theorem 1.3.14. As already
noted, the last assertion is obvious by the very definitions of $(\bar a_\ell)_{\ell\in\mathbf{Z}}$
and ψ-mixing coefficients. □
Remark. The above result will be improved in Chapter 2. See Proposition
2.3.7. □
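The numerical constants quoted in this subsection can be reproduced in a few lines. The sketch below is merely a convenience (variable and function names are ours); it evaluates $\varepsilon_1$, $\varepsilon_2$, $c$ and the bound on $\psi_{\gamma_a}(1)$, and checks that the displayed bound for $n \ge 2$ is exactly what (1.3.33) gives once the estimates of Proposition 1.3.13 are substituted.

```python
import math

LOG2 = math.log(2.0)
ZETA2 = math.pi ** 2 / 6          # zeta(2)
C = 3.5 - 2.0 * math.sqrt(2.0)    # the constant c of Proposition 1.3.13

eps1 = 2.0 * LOG2 - 1.0           # = psi_gamma(1)
eps2 = ZETA2 * LOG2 - 1.0         # = psi_gamma(2)
psi_a_1 = (eps1 + eps2) / (1.0 - eps2)   # bound on psi_{gamma_a}(1)

def eps_bound(n):
    """Proposition 1.3.13 bound (1/2)(log 2) c^(n-2), valid for n >= 2."""
    return 0.5 * LOG2 * C ** (n - 2)

def psi_a_bound(n):
    """Corollary 1.3.15 bound for psi_{gamma_a}(n), n >= 2."""
    return LOG2 * C ** (n - 2) * (1.0 + C) / (2.0 - LOG2 * C ** (n - 1))

def psi_a_from_1333(n):
    """Right-hand side of (1.3.33) with eps_n, eps_{n+1} replaced by their bounds."""
    return (eps_bound(n) + eps_bound(n + 1)) / (1.0 - eps_bound(n + 1))
```

For instance `psi_a_bound(10)` is already of order $10^{-2}$, which illustrates the geometric decay at rate $c$.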
Proposition 1.3.16 (F. Bernstein's theorem) Let $(c_n)_{n\in\mathbf{N}_+}$ be a sequence
of positive numbers. The random event $(a_n \ge c_n)$ occurs infinitely
often with γ-probability 0 or 1, according as the series $\sum_{n\in\mathbf{N}_+} 1/c_n$ converges
or diverges. In other words, $\gamma(a_n \ge c_n \text{ i.o.})$ is either 0 or 1 according
as the series $\sum_{n\in\mathbf{N}_+} 1/c_n$ converges or diverges.
Proof. We can clearly assume that $c_n \ge 1$, $n \in \mathbf{N}_+$. Let $E_n = (a_n \ge
c_n)$, $n \in \mathbf{N}_+$. By (1.2.9) we have
\[
\gamma(E_n) = \gamma(a_n \ge c_n) = \gamma(a_1 \ge c_n) = \gamma(a_1 \ge c_n') = \frac{1}{\log 2}\,\log\left( 1 + \frac{1}{c_n'} \right),
\]
where either $c_n' = \lfloor c_n \rfloor + 1$ or $c_n' = \lfloor c_n \rfloor$. Hence
\[
\frac{1}{2c_n} \le \gamma(E_n) \le \frac{2}{c_n \log 2}, \quad n \in \mathbf{N}_+,
\]
since $x \log 2 \le \log(1+x) \le x$ for any $x \in I$. Thus if $\sum_{n\in\mathbf{N}_+} 1/c_n$ converges,
then the result stated follows from the Borel–Cantelli lemma.
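Assume now that $\sum_{n\in\mathbf{N}_+} 1/c_n$ diverges. It follows from Theorem 1.3.14
that for any $k, n \in \mathbf{N}_+$ such that $k \le n$ we have
\[
\bigl| \gamma(E_k^c \cap \cdots \cap E_n^c \cap E_{n+1}) - \gamma(E_k^c \cap \cdots \cap E_n^c)\,\gamma(E_{n+1}) \bigr|
\le \varepsilon_1\,\gamma(E_k^c \cap \cdots \cap E_n^c)\,\gamma(E_{n+1}),
\]
where $\varepsilon_1 = 2\log 2 - 1 = 0.38629\cdots$. Hence
\[
\gamma(E_{n+1} \mid E_k^c \cap \cdots \cap E_n^c) \ge (1 - \varepsilon_1)\,\gamma(E_{n+1}) \ge \frac{1 - \varepsilon_1}{2c_{n+1}},
\]
therefore
\[
\gamma\bigl(E_{n+1}^c \mid E_k^c \cap \cdots \cap E_n^c\bigr) \le 1 - \frac{1 - \varepsilon_1}{2c_{n+1}}
\]
for any $k, n \in \mathbf{N}_+$ such that $k \le n$.
It follows that for any $k, m \in \mathbf{N}_+$ we have
\[
\gamma\bigl(E_k^c \cap \cdots \cap E_{k+m}^c\bigr) \le \prod_{i=0}^m \left( 1 - \frac{1 - \varepsilon_1}{2c_{k+i}} \right),
\]
whence
\[
\gamma\bigl(E_k^c \cap E_{k+1}^c \cap \cdots\bigr) \le \lim_{m\to\infty} \prod_{i=0}^m \left( 1 - \frac{1 - \varepsilon_1}{2c_{k+i}} \right) = 0
\]
since $\sum_{n\in\mathbf{N}_+} 1/c_n$ diverges.
Finally,
\[
\begin{aligned}
\gamma(a_n \ge c_n \text{ i.o.}) &= \gamma\bigl(\cap_{k\in\mathbf{N}_+} \cup_{i\ge k} E_i\bigr)
= \lim_{k\to\infty} \gamma\bigl(\cup_{i\ge k} E_i\bigr) = \lim_{k\to\infty} \gamma\bigl((\cap_{i\ge k} E_i^c)^c\bigr) \\
&= 1 - \lim_{k\to\infty} \gamma\bigl(E_k^c \cap E_{k+1}^c \cap \cdots\bigr) = 1.
\end{aligned}
\]
□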
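The two-sided estimate of $\gamma(E_n)$ on which both halves of the proof rest can be checked directly from the formula $\gamma(a_1 \ge k) = \log(1 + 1/k)/\log 2$ quoted above. The sketch below is illustrative and uses our own naming; it takes $c_n' = \lceil c_n \rceil$, the smallest integer not less than $c_n$.

```python
import math

def gamma_digit_at_least(c):
    """gamma(a_1 >= c) for real c >= 1, i.e. log(1 + 1/ceil(c)) / log 2."""
    k = math.ceil(c)
    return math.log1p(1.0 / k) / math.log(2.0)

def bounds_hold(c):
    """Check 1/(2c) <= gamma(a_1 >= c) <= 2/(c log 2)."""
    p = gamma_digit_at_least(c)
    return 1.0 / (2.0 * c) <= p <= 2.0 / (c * math.log(2.0))

checks = [bounds_hold(c) for c in (1.0, 1.5, 2.0, 7.3, 100.0, 1e6)]
```

As an example of the dichotomy, $c_n = n$ gives a divergent series, so $a_n \ge n$ occurs infinitely often γ-a.s., whereas $c_n = n^2$ gives a convergent series, so $a_n \ge n^2$ occurs only finitely often γ-a.s.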
In Chapter 3 we shall need the following result.
Corollary 1.3.17 Let $b_n$, $n \in \mathbf{N}_+$, be real-valued random variables on
$(I, \mathcal{B}_I)$ such that $a_n \le b_n \le a_n + c$, $n \in \mathbf{N}_+$, for some $c \in \mathbf{R}_+$. Let
$(c_n)_{n\in\mathbf{N}_+}$ be a sequence of positive numbers. Then $\gamma(b_n \ge c_n \text{ i.o.})$ is either
0 or 1 according as the series $\sum_{n\in\mathbf{N}_+} 1/c_n$ converges or diverges.
Proof. Clearly,
\[
(a_n \ge c_n \text{ i.o.}) \subset (b_n \ge c_n \text{ i.o.}) \subset (a_n \ge \max(1, c_n - c) \text{ i.o.}),
\]
and the series $\sum_{n\in\mathbf{N}_+} 1/c_n$ and $\sum_{n\in\mathbf{N}_+} 1/\max(1, c_n - c)$ are both convergent
or divergent. □
Chapter 2

Solving Gauss’ problem

In this chapter a generalization of Gauss’ problem stated in Subsection 1.2.1


is solved. Several applications are also given.

2.0 Banach space preliminaries


2.0.1 A few classical Banach spaces
In this subsection we describe some Banach spaces which are often men-
tioned throughout the book. We consider just functions defined on I, but
almost all considerations below can be easily extended to more general cases.
We denote by $B(I)$ the collection of all bounded measurable functions
$f : I \to \mathbf{C}$. This is a commutative Banach algebra with unit under the
supremum norm
\[
\|f\| = \sup_{x\in I} |f(x)|, \quad f \in B(I).
\]
We denote by $C(I)$ the collection of all continuous functions $f : I \to \mathbf{C}$.
This is a commutative Banach algebra with unit under the supremum norm.
We denote by $C^1(I)$ the collection of all functions $f : I \to \mathbf{C}$ which have
a continuous derivative. This is a commutative Banach algebra with unit
under the norm
\[
\|f\|_1 = \|f\| + \|f'\|, \quad f \in C^1(I).
\]
We denote by $L(I)$ the collection of all Lipschitz functions $f : I \to \mathbf{C}$,
that is, those for which
\[
s(f) := \sup_{x' \ne x''} \frac{|f(x') - f(x'')|}{|x' - x''|} < \infty.
\]
This is a commutative Banach algebra with unit under the norm
\[
\|f\|_L = \|f\| + s(f), \quad f \in L(I).
\]
Clearly,
\[
C^1(I) \subset L(I) \subset C(I) \subset B(I).
\]
The variation $\mathrm{var}_A f$ over $A \subset I$ of a function $f : I \to \mathbf{C}$ is defined as
\[
\sup \sum_{i=1}^{k-1} |f(t_{i+1}) - f(t_i)|,
\]
the supremum being taken over $t_1 < \cdots < t_k$, $t_i \in A$, $1 \le i \le k$, and $k \ge 2$.
We write simply $\mathrm{var} f$ for $\mathrm{var}_I f$. If $\mathrm{var} f < \infty$ then $f$ is called a function
of bounded variation. The collection $BV(I)$ of all functions $f : I \to \mathbf{C}$ of
bounded variation is a commutative Banach algebra with unit under the
norm
\[
\|f\|_v = \|f\| + \mathrm{var} f, \quad f \in BV(I).
\]
Clearly,
\[
L(I) \subset BV(I) \subset B(I).
\]
Let $\mu$ be a measure on $\mathcal{B}_I$. Two measurable functions $f : I \to \mathbf{C}$ and
$g : I \to \mathbf{C}$ are said to be µ-indistinguishable, or to be µ-versions of each
other, if and only if $\mu(f \ne g) = 0$. Let us partition the collection of all
measurable complex-valued functions defined on $I$ into (equivalence) classes
of µ-indistinguishable functions. For any real number $p \ge 1$ we denote by
$L^p(I, \mathcal{B}_I, \mu) = L^p_\mu$ the collection of all such classes of µ-indistinguishable
functions $f : I \to \mathbf{C}$ for which $\int_I |f|^p\,d\mu < \infty$. Clearly, $L^p_\mu \subset L^{p'}_\mu$ if $p \ge p' \ge 1$.
Next, $L^p_\mu$ is a Banach space under the norm
\[
\|f\|_{p,\mu} = \left( \int_I |f|^p\,d\mu \right)^{1/p}, \quad f \in L^p_\mu.
\]
(Note that the value of the integral is the same for all functions in an equivalence
class.)
To define $L^\infty_\mu$ we should first define the µ-essential supremum. For a
measurable function $f : I \to \mathbf{R}$, its µ-essential supremum, which is denoted
µ-ess sup $f$, is defined as
\[
\inf\{a \in \mathbf{R} : \mu(f > a) = 0\}.
\]
A measurable function $f : I \to \mathbf{C}$ is said to be µ-essentially bounded if and
only if
\[
\mu\text{-ess sup}\,|f| < \infty.
\]
Note that
\[
\mu\text{-ess sup}\,|f| = \inf \|\tilde f\|,
\]
where the lower bound is taken over all µ-versions $\tilde f$ of $f$. We denote
by $L^\infty(I, \mathcal{B}_I, \mu) = L^\infty_\mu$ the collection of all classes of µ-essentially bounded
complex-valued µ-indistinguishable functions defined on $I$; $L^\infty_\mu$ is a commutative
Banach algebra with unit under the norm
\[
\|f\|_{\infty,\mu} = \mu\text{-ess sup}\,|f|, \quad f \in L^\infty_\mu.
\]
(Note that the value of the essential supremum is the same for all functions
in an equivalence class.) Clearly, $L^\infty_\mu \subset L^p_\mu$ for any $p \ge 1$.
The special case $p = 2$ is an important one: $L^2_\mu$ can be also considered
as a Hilbert space with inner product $(\cdot,\cdot)_\mu$ defined by
\[
(f, g)_\mu = \int_I f g^*\,d\mu, \quad f, g \in L^2_\mu.
\]
In the case where $\mu = \lambda$ we simply write $L^p$, $\|f\|_p$, $L^\infty$, $\|f\|_\infty$, and
ess sup $f$ instead of $L^p_\lambda$, $\|f\|_{p,\lambda}$, $L^\infty_\lambda$, $\|f\|_{\infty,\lambda}$, and λ-ess sup $f$, respectively.

2.0.2 Bounded essential variation


A variation $v(f)$ for $f \in L^\infty$ is defined as $v(f) = \inf \mathrm{var} \tilde f$, the infimum
being taken over all λ-versions $\tilde f$ of $f$. If $v(f) < \infty$ then $f \in L^\infty$ is called
a function of bounded essential variation. It can be shown that
\[
v(f) = \lim_{0 < a \to 0} \frac1a \int_0^1 |f(u+a) - f(u)|\,du,
\]
where for $x > 1$ we define $f(x) = f(1)$. Clearly, if $f \in BV(I)$ then,
in general, $v(f) \le \mathrm{var} f$. This is a special instance of the following more
general result due to Stadje (1985). If $v(f) < \infty$ then the limit
\[
\tilde f(t) = \lim_{0 < a \to 0} \frac1a \int_t^{t+a} f(u)\,du
\]
exists for any $t \in I$, the function $\tilde f$ is a right-continuous λ-version of $f$, and
$\mathrm{var} \tilde f = v(f)$. The collection $BEV(I)$ of all functions $f \in L^\infty$ of bounded
essential variation is a commutative Banach algebra with unit under any of
the norms
\[
\|f\|_{v,\mu} = v(f) + \|f\|_{1,\mu}, \quad f \in BEV(I),
\]
with $\mu \in \mathrm{pr}(\mathcal{B}_I)$ such that $\mu \equiv \lambda$. See Răuţu and Zbăganu (1989). In the
case where $\mu = \lambda$ we simply write $\|f\|_v$ instead of $\|f\|_{v,\lambda}$.
Proposition 2.0.1 (i) Let $\mu \in \mathrm{pr}(\mathcal{B}_I)$. If $f \in BV(I)$ then
\[
\|f\| \le \mathrm{var} f + \left| \int_I f\,d\mu \right|. \tag{2.0.1}
\]
(ii) Let $\mu \in \mathrm{pr}(\mathcal{B}_I)$ with $\mu \equiv \lambda$. If $f \in BEV(I)$ then
\[
\mu\text{-ess sup}\,|f| \le v(f) + \left| \int_I f\,d\mu \right|. \tag{2.0.2}
\]
Proof. (i) For any $x \in I$ we can write
\[
|f(x)| - \left| \int_I f\,d\mu \right| \le \left| f(x) - \int_I f\,d\mu \right| = \left| \int_I (f(x) - f(u))\,\mu(du) \right| \le \mathrm{var} f,
\]
from which (2.0.1) follows at once.
(ii) (2.0.2) follows from (2.0.1) since
\[
\mu\text{-ess sup}\,|f| = \inf_{\tilde f} \|\tilde f\|, \quad v(f) = \inf_{\tilde f} \mathrm{var} \tilde f,
\]
the infimum being taken over all µ-versions $\tilde f$ of $f$, and
\[
\int_I \tilde f\,d\mu = \int_I f\,d\mu
\]
for such an $\tilde f$. □

2.1 The Perron–Frobenius operator


2.1.1 Definition and basic properties
Let $\mu \in \mathrm{pr}(\mathcal{B}_I)$ such that
\[
\mu\bigl(\tau^{-1}(A)\bigr) = 0 \text{ whenever } \mu(A) = 0, \quad A \in \mathcal{B}_I, \tag{2.1.1}
\]
where $\tau$ is the continued fraction transformation defined in Subsection 1.1.1.
In particular, this condition is satisfied if $\tau$ is µ-preserving, that is,
$\mu\tau^{-1} = \mu$, to mean $\mu(\tau^{-1}(A)) = \mu(A)$ for any $A \in \mathcal{B}_I$. In general,
assuming that $\mu \ll \lambda$ and putting $h = d\mu/d\lambda$, it is easy to check that (2.1.1)
holds if and only if $\lambda(E) = 0$, where $E = (x \in I : h(x) = 0)$.
The Perron–Frobenius operator $P_\mu$ of $\tau$ under $\mu$ is defined as the
bounded linear operator on $L^1_\mu$ which takes $f \in L^1_\mu$ into $P_\mu f \in L^1_\mu$ with
\[
\int_A P_\mu f\,d\mu = \int_{\tau^{-1}(A)} f\,d\mu, \quad A \in \mathcal{B}_I,
\]
or, equivalently,
\[
\int_I g\,P_\mu f\,d\mu = \int_I (g \circ \tau)\,f\,d\mu \tag{2.1.2}
\]
for any $f \in L^1_\mu$ and $g \in L^\infty_\mu$. The existence of $P_\mu f$ is ensured by the
Radon–Nikodym theorem on account of (2.1.1). Actually, $P_\mu$ so defined takes $L^p_\mu$
into itself for any $p \ge 1$ and $p = \infty$. So, (2.1.2) holds for any $f \in L^p_\mu$
and $g \in L^q_\mu$, with $p > 1$ and $q = p/(p-1)$. In particular, (2.1.2) holds for
any $f, g \in L^2_\mu$.
The probabilistic interpretation of $P_\mu$ is immediate: if an $I$-valued random
variable $\xi$ on $I$ has µ-density $h$, that is, $\mu(\xi \in A) = \int_A h\,d\mu$, $A \in \mathcal{B}_I$,
with $h \ge 0$ and $\int_I h\,d\mu = 1$, then $\tau \circ \xi$ has µ-density $P_\mu h$. In the special case
$\mu = \lambda$ we obviously have
\[
P_\lambda f(x) = \frac{d}{dx} \int_{\tau^{-1}([0,x])} f\,d\lambda \quad \text{a.e. in } I.
\]

Proposition 2.1.1 The following properties hold:
(i) $P_\mu$ is positive, that is, $P_\mu f \ge 0$ if $f \ge 0$;
(ii) $P_\mu$ preserves integrals, that is,
\[
\int_I P_\mu f\,d\mu = \int_I f\,d\mu, \quad f \in L^1_\mu;
\]
(iii) $\|P_\mu\|_{p,\mu} := \sup\bigl(\|P_\mu f\|_{p,\mu} : f \in L^p_\mu,\ \|f\|_{p,\mu} = 1\bigr) \le 1$ for any $p \ge 1$
and $p = \infty$;
(iv) for any $n \in \mathbf{N}_+$ the $n$th power $P_\mu^n$ of $P_\mu$ is the Perron–Frobenius
operator of the $n$th iterate $\tau^n$ of $\tau$ under $\mu$;
(v) $(P_\mu f)^* = P_\mu f^*$ for any $f \in L^1_\mu$;
(vi) $P_\mu((g \circ \tau) f) = g\,P_\mu f$ for any $f \in L^1_\mu$ and $g \in L^\infty_\mu$ and for any $f \in
L^p_\mu$ and $g \in L^q_\mu$ with $p > 1$ and $q = p/(p-1)$;
(vii) $P_\mu f = f$ if and only if $\tau$ is ν-preserving, where $\nu$ is defined by
$\nu(A) = \int_A f\,d\mu$, $A \in \mathcal{B}_I$. In particular, $P_\mu 1 = 1$ if and only if $\tau$ is
µ-preserving.
For the proof see Boyarski and Góra (1997, Ch. 4), Lasota and Mackey
(1985, Ch. 3) or Mackey (1992, Ch. 4). □
Remark. The above considerations on the Perron–Frobenius operator of
the continued fraction transformation τ under different probability measures
on BI apply mutatis mutandis to the general case of a transformation of an
arbitrary probability space. For example, in the case of the natural extension
τ of
¡ τ2 ¢(see Subsection 1.3.1) we should start by considering measures µ ∈
pr BI such that
¡ ¢
µ τ −1 (B) = 0 whenever µ (B) = 0, B ∈ BI2 . (2.1.10 )

Assuming that $\mu \ll \lambda^2$ (two-dimensional Lebesgue measure) and putting $h = d\mu/d\lambda^2$, it is easy to check that (2.1.1$'$) holds if and only if $\lambda^2(E) = 0$, where $E = \{(x,y) \in I^2 : h(x,y) = 0\}$.
The Perron–Frobenius operator $\overline{P}_\mu$ of $\bar\tau$ under $\mu$ is the bounded linear operator on $L^1_\mu(I^2)$ which takes $f \in L^1_\mu(I^2)$ into $\overline{P}_\mu f \in L^1_\mu(I^2)$ with
\[ \int_B \overline{P}_\mu f\,d\mu = \int_{\bar\tau^{-1}(B)} f\,d\mu, \quad B \in \mathcal{B}_I^2. \]

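It is also quite easy to check that if $\mu \ll \lambda^2$ and $h = d\mu/d\lambda^2 > 0$ a.e. in $I^2$, then
\[ \overline{P}_\mu f(x,y) = \frac{(h \circ \bar\tau^{-1})(x,y)\,(f \circ \bar\tau^{-1})(x,y)}{y^2\,(x + \lfloor 1/y \rfloor)^2\, h(x,y)} \]
a.e. in $I^2$. Alternatively,
\[ \overline{P}_\mu f(x,y) = \left(\frac{s_1^x(y)}{\tau^0(y)}\right)^{2} \frac{(h \circ \bar\tau^{-1})(x,y)}{h(x,y)}\,(f \circ \bar\tau^{-1})(x,y) \]
a.e. in $I^2$. In particular, for $\mu = \bar\gamma$, when
\[ h(x,y) = \frac{1}{\log 2}\,\frac{1}{(xy+1)^2}, \quad (x,y) \in I^2, \]
we have $\overline{P}_{\bar\gamma} f = f \circ \bar\tau^{-1}$ a.e. in $I^2$. Hence
\[ \overline{P}_\mu^{\,n} f(x,y) = \left(\frac{s_1^x(y) \cdots s_n^x(y)}{\tau^0(y) \cdots \tau^{n-1}(y)}\right)^{2} \frac{(h \circ \bar\tau^{-n})(x,y)}{h(x,y)}\,(f \circ \bar\tau^{-n})(x,y), \qquad \overline{P}_{\bar\gamma}^{\,n} f = f \circ \bar\tau^{-n} \]
a.e. in $I^2$ for any $n \in \mathbf{N}_+$.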
We should, however, note that the Perron–Frobenius operator of an invertible transformation, like $\bar\tau$, is not of great value for deriving asymptotic properties of its $n$th power as $n \to \infty$. For an interesting discussion of the Perron–Frobenius operator of $\bar\tau$ in connection with the time evolution of certain spatially homogeneous cosmologies (‘mixmaster universe’), we refer the reader to Mayer (1987). □
Proposition 2.1.2 The Perron–Frobenius operator $P_\gamma := U$ of $\tau$ under $\gamma$ is given a.e. in $I$ by the equation
\[ Uf(x) = \sum_{i \in \mathbf{N}_+} P_i(x)\, f\!\left(\frac{1}{x+i}\right), \quad f \in L^1_\gamma. \tag{2.1.3} \]

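Proof. Let $\tau_i : I_i \to I$ denote the restriction of $\tau$ to the interval $I_i = (1/(i+1), 1/i]$, $i \in \mathbf{N}_+$, that is,
\[ \tau_i(u) = \frac{1}{u} - i, \quad u \in I_i. \]
For any $f \in L^1_\gamma$ and any $A \in \mathcal{B}_I$ we have
\[ \int_{\tau^{-1}(A)} f\,d\gamma = \sum_{i \in \mathbf{N}_+} \int_{\tau^{-1}(A) \cap I_i} f\,d\gamma = \sum_{i \in \mathbf{N}_+} \int_{\tau_i^{-1}(A)} f\,d\gamma. \tag{2.1.4} \]
For any $i \in \mathbf{N}_+$, by the change of variable $x = \tau_i^{-1}(y) = (y+i)^{-1}$ we successively obtain
\begin{align*}
\int_{\tau_i^{-1}(A)} f\,d\gamma &= \frac{1}{\log 2} \int_{\tau_i^{-1}(A)} \frac{f(x)\,dx}{x+1} \\
&= \frac{1}{\log 2} \int_A f\!\left(\frac{1}{y+i}\right) \frac{1}{(y+i)^{-1}+1}\,\frac{dy}{(y+i)^2} \\
&= \frac{1}{\log 2} \int_A P_i(y)\, f\!\left(\frac{1}{y+i}\right) \frac{dy}{y+1} \\
&= \int_A P_i(y)\, f\!\left(\frac{1}{y+i}\right) \gamma(dy). \tag{2.1.5}
\end{align*}
Now, (2.1.3) follows from (2.1.4) and (2.1.5). □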
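As an illustrative aside, formula (2.1.3) can be evaluated numerically by truncating the series. The sketch below is not part of the original text: the helper names `P` and `U` and the truncation level `n_terms` are assumptions made only for this example. It checks that $U1 \approx 1$ (which reflects $\sum_i P_i = 1$) and that $Uf(0) = \sum_i 1/(i^2(i+1)) = \zeta(2) - 1$ for $f(x) = x$.

```python
import math


def P(i, x):
    """Weight P_i(x) = (x + 1) / ((x + i)(x + i + 1))."""
    return (x + 1.0) / ((x + i) * (x + i + 1.0))


def U(f, x, n_terms=100_000):
    """Truncated version of (2.1.3): sum over i <= n_terms of P_i(x) f(1/(x+i)).

    The truncation level is an assumption of this sketch; the neglected
    weights add up to (x + 1) / (x + n_terms + 1).
    """
    return sum(P(i, x) * f(1.0 / (x + i)) for i in range(1, n_terms + 1))


if __name__ == "__main__":
    # U preserves constants up to the truncated tail mass.
    print(U(lambda t: 1.0, 0.3))
    # For f(x) = x the value at 0 is zeta(2) - 1.
    print(U(lambda t: t, 0.0), math.pi ** 2 / 6 - 1)
```

The plain truncation is accurate to roughly `2 / n_terms` for bounded `f`; a tail correction would be needed for higher precision.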



Proposition 2.1.3 Let $\mu \in \mathrm{pr}(\mathcal{B}_I)$. Assume that $\mu \ll \lambda$ and $h = d\mu/d\lambda > 0$ a.e. in $I$. Then the Perron–Frobenius operator $P_\mu$ of $\tau$ under $\mu$ is given a.e. in $I$ by the equation
\[ P_\mu f(x) = \frac{1}{h(x)} \sum_{i \in \mathbf{N}_+} \frac{h\left((x+i)^{-1}\right)}{(x+i)^2}\, f\!\left(\frac{1}{x+i}\right) = \frac{Ug(x)}{(x+1)\,h(x)}, \quad f \in L^1_\mu, \tag{2.1.6} \]
where $g(x) = (x+1)\,h(x)\,f(x)$, $x \in I$.
The powers of $P_\mu$ are given a.e. in $I$ by the equation
\[ P_\mu^n f(x) = \frac{U^n g(x)}{(x+1)\,h(x)}, \quad f \in L^1_\mu,\ n \in \mathbf{N}_+. \tag{2.1.7} \]

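Proof. The proof of (2.1.6) is entirely similar to that of (2.1.3), and is left to the reader. Note that $f \in L^1_\mu$ entails $g \in L^1_\gamma$.
To prove (2.1.7) note that it holds for $n = 1$. Assuming that (2.1.7) holds for some $n \in \mathbf{N}_+$, we have
\begin{align*}
P_\mu^{n+1} f(x) &= P_\mu\left(P_\mu^n f\right)(x) = P_\mu\!\left(\frac{U^n g}{(\cdot + 1)\,h}\right)(x) \\
&= \frac{1}{h(x)} \sum_{i \in \mathbf{N}_+} \frac{h\left((x+i)^{-1}\right)}{(x+i)^2}\; U^n g\!\left(\frac{1}{x+i}\right) \Big/ \left[\left(\frac{1}{x+i} + 1\right) h\left((x+i)^{-1}\right)\right] \\
&= \frac{1}{(x+1)\,h(x)} \sum_{i \in \mathbf{N}_+} P_i(x)\, U^n g\!\left(\frac{1}{x+i}\right) = \frac{U^{n+1} g(x)}{(x+1)\,h(x)} \quad \text{a.e. in } I,
\end{align*}
and the proof is complete. □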


Corollary 2.1.4 The Perron–Frobenius operator $P_\lambda$ of $\tau$ under $\lambda$ is given a.e. in $I$ by the equation
\[ P_\lambda f(x) = \sum_{i \in \mathbf{N}_+} \frac{1}{(x+i)^2}\, f\!\left(\frac{1}{x+i}\right), \quad f \in L^1. \]
The powers of $P_\lambda$ are given a.e. in $I$ by the equation
\[ P_\lambda^n f(x) = \frac{U^n g(x)}{x+1}, \quad f \in L^1,\ n \in \mathbf{N}_+, \]
where $g(x) = (x+1)\,f(x)$, $x \in I$.

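Proposition 2.1.5 Let $\mu \in \mathrm{pr}(\mathcal{B}_I)$. Assume that $\mu \ll \lambda$ and let $h = d\mu/d\lambda$. Then
\[ \mu\left(\tau^{-n}(A)\right) = \int_A \frac{U^n f(x)}{x+1}\,dx \tag{2.1.8} \]
for any $n \in \mathbf{N}$ and $A \in \mathcal{B}_I$, where $f(x) = (x+1)\,h(x)$, $x \in I$.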

Proof. For $n = 0$ equation (2.1.8) reduces to
\[ \mu(A) = \int_A h(x)\,dx, \quad A \in \mathcal{B}_I, \]
which is obviously true. Assume that (2.1.8) holds for some $n \in \mathbf{N}$. Then
\[ \mu\left(\tau^{-(n+1)}(A)\right) = \mu\left(\tau^{-n}\left(\tau^{-1}(A)\right)\right) = \int_{\tau^{-1}(A)} \frac{U^n f(x)}{x+1}\,dx = (\log 2) \int_{\tau^{-1}(A)} U^n f\,d\gamma. \]
By the very definition of the Perron–Frobenius operator $U = P_\gamma$ we have
\[ \int_{\tau^{-1}(A)} U^n f\,d\gamma = \int_A U^{n+1} f\,d\gamma. \]
Therefore
\[ \mu\left(\tau^{-(n+1)}(A)\right) = (\log 2) \int_A U^{n+1} f\,d\gamma = \int_A \frac{U^{n+1} f(x)}{x+1}\,dx, \]
and the proof is complete. □

Remark. It should be noted that (2.1.8) holds without assuming that $h > 0$ a.e. Since
\[ \mu(\tau^n \in A) = \mu\left(\tau^{-n}(A)\right) = \int_A P_\mu^n 1\,d\mu, \quad n \in \mathbf{N},\ A \in \mathcal{B}_I, \]
it is possible to derive (2.1.8) from Proposition 2.1.3 assuming that $h > 0$ a.e., which clearly restricts the generality of the result. □

2.1.2 Asymptotic behaviour


It is easy to check that the function $x \mapsto 1/(x+1)$ is an eigenfunction of $P_\lambda$ corresponding to the eigenvalue 1.
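The eigenfunction claim can be checked numerically from the series for $P_\lambda$ given in Corollary 2.1.4. The following sketch is only an illustration; the names `P_lambda` and `rho` and the truncation level are assumptions of the example. Since $\frac{1}{(x+i)^2} \cdot \frac{1}{(x+i)^{-1} + 1} = \frac{1}{(x+i)(x+i+1)}$, the series telescopes to $1/(x+1)$.

```python
def P_lambda(f, x, n_terms=100_000):
    """Truncated series for P_lambda f(x) = sum_i f(1/(x+i)) / (x+i)^2."""
    return sum(f(1.0 / (x + i)) / (x + i) ** 2 for i in range(1, n_terms + 1))


def rho(x):
    """Candidate eigenfunction x -> 1/(x+1)."""
    return 1.0 / (x + 1.0)


if __name__ == "__main__":
    for x in (0.0, 0.25, 0.5, 1.0):
        # The two columns should agree up to the truncation error 1/(x+n_terms+1).
        print(x, P_lambda(rho, x), rho(x))
```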
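Define on $L^1$ the linear operators $\Pi_1$ and $T_0$ by
\[ \Pi_1 f(x) = \frac{(\log 2)^{-1}}{x+1} \int_I f\,d\lambda, \quad f \in L^1,\ x \in I, \qquad T_0 = P_\lambda - \Pi_1. \]
Hence
\[ \Pi_1^2 = \Pi_1, \quad P_\lambda \Pi_1 = \Pi_1 P_\lambda = \Pi_1, \quad T_0 \Pi_1 = \Pi_1 T_0 = 0. \tag{2.1.9} \]
It follows from the last equation (2.1.9) that
\[ P_\lambda^n = \Pi_1 + T_0^n, \quad n \in \mathbf{N}_+. \tag{2.1.10} \]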

Theorem 2.1.6 The only eigenvalue of modulus 1 of $P_\lambda : L^1 \to L^1$ is 1 and this eigenvalue is simple. The operator $T_0$ has the following properties:
(i) $T_0(BEV(I)) \subset BEV(I)$;
(ii) there exists $0 < q < 1$ such that $\|T_0^n\|_v = O(q^n)$ as $n \to \infty$ (equivalently, the spectral radius of $T_0$ in $BEV(I)$ under $\|\cdot\|_v$ is less than 1);
(iii) $\sup_{n \in \mathbf{N}_+} \|T_0^n\|_1 < \infty$ and $\lim_{n \to \infty} \|T_0^n h\|_1 = 0$ for any $h \in L^1$.
Proof. This is a special case of Theorem 5.3.12 in Iosifescu and Grigorescu (1990). □
The result just stated concerning the asymptotic behaviour of $T_0^n$ as $n \to \infty$ can be used to derive the asymptotic behaviour of $U^n$ as $n \to \infty$. It follows from Corollary 2.1.4 and equation (2.1.10) that
\[ U^n g(x) = U^\infty g + (x+1)\, T_0^n\!\left(\frac{g}{\cdot + 1}\right)(x) \tag{2.1.11} \]
a.e. in $I$ for any $g \in L^1_\gamma$, where
\[ U^\infty g = \int_I g\,d\gamma. \]
It is obvious that $U^\infty U^\infty = U U^\infty = U^\infty$. Using the last equation (2.1.9) it is easy to check that
\[ U^\infty U = U^\infty. \tag{2.1.12} \]
Now, defining the linear operator $T : L^1_\gamma \to L^1_\gamma$ by
\[ Tg(x) = (x+1)\, T_0\!\left(\frac{g}{\cdot + 1}\right)(x), \quad g \in L^1_\gamma, \]
a.e. in $I$, it is easy to check that
\[ T^n g(x) = (x+1)\, T_0^n\!\left(\frac{g}{\cdot + 1}\right)(x), \quad g \in L^1_\gamma, \tag{2.1.13} \]
a.e. in $I$ for any $n \in \mathbf{N}_+$, and
\[ T U^\infty = U^\infty T = 0. \tag{2.1.14} \]
It follows from (2.1.11) and (2.1.13) that
\[ U^n = U^\infty + T^n, \quad n \in \mathbf{N}_+. \]

Proposition 2.1.7 The only eigenvalue of modulus 1 of $U : L^1_\gamma \to L^1_\gamma$ is 1 and this eigenvalue is simple. The corresponding eigenspace consists of the a.e. constant functions on $I$. The linear operator $T : L^1_\gamma \to L^1_\gamma$ has the following properties:
(i) $T(BEV(I)) \subset BEV(I)$;
(ii) there exists $0 < q < 1$ such that $\|T^n\|_{v,\gamma} = O(q^n)$ as $n \to \infty$ (equivalently, the spectral radius of $T$ in $BEV(I)$ under $\|\cdot\|_{v,\gamma}$ is less than 1);
(iii) $\sup_{n \in \mathbf{N}_+} \|T^n\|_{1,\gamma} < \infty$ and $\lim_{n \to \infty} \|T^n h\|_{1,\gamma} = 0$ for any $h \in L^1_\gamma$.
Proof. By (2.1.11) and (2.1.13), all the conclusions are immediate consequences of the corresponding conclusions of Theorem 2.1.6. In checking (ii) we have to use Proposition 2.0.1(ii). □
Remark. Since
\[ \frac{\lambda(A)}{2 \log 2} \le \gamma(A) \le \frac{\lambda(A)}{\log 2}, \quad A \in \mathcal{B}_I, \]
the domains of the operators $U$, $U^\infty$ and $T$ can as well be taken to be $L^1$, and then in (ii) and (iii) the norms $\|\cdot\|_{v,\gamma}$ and $\|\cdot\|_{1,\gamma}$ should be replaced by the norms $\|\cdot\|_v$ and $\|\cdot\|_1$, respectively. □
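Corollary 2.1.8 For any $h \in L^1$ we have
\[ \lim_{n \to \infty} \int_I |U^n h - U^\infty h|\,d\gamma = \lim_{n \to \infty} \int_I |U^n h - U^\infty h|\,d\lambda = 0. \]
Hence, for any $h \in L^1$,
\[ \lim_{n \to \infty} \int_A U^n h\,d\mu = \mu(A)\, U^\infty h \tag{2.1.15} \]
uniformly with respect to $A \in \mathcal{B}_I$, where $\mu$ stands for either $\lambda$ or $\gamma$.
Proof. For any $A \in \mathcal{B}_I$ we have
\begin{align*}
\left| \int_A U^n h\,d\mu - \mu(A)\, U^\infty h \right| &= \left| \int_A \left(U^n h - U^\infty h\right) d\mu \right| \\
&\le \int_A |U^n h - U^\infty h|\,d\mu \\
&\le \int_I |U^n h - U^\infty h|\,d\mu \longrightarrow 0
\end{align*}
as $n \to \infty$, and the proof is complete. □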


Remark. It is not possible to show that $U^n h \to U^\infty h$ a.e. as $n \to \infty$ by using (2.1.15). It is an open problem whether this is actually true. Cf. Petek (1989) and Iosifescu (1992, p. 912). □

2.1.3 Restricting the domain of the Perron–Frobenius operator

The asymptotic properties of the Perron–Frobenius operator $U : L^1_\gamma \to L^1_\gamma$, as described by Proposition 2.1.7, are not strong enough to lead to a satisfactory solution to Gauss’ problem, whilst when restricting $U$ to $BEV(I)$ they are substantially better. See further Proposition 2.1.17.
In the next sections the domain of $U$ will be successively restricted to various Banach spaces. In this subsection we show that $U$, defined by
\[ Uf(x) = \sum_{i \in \mathbf{N}_+} P_i(x)\, f\!\left(\frac{1}{x+i}\right) \tag{2.1.16} \]
for any $x \in I$, is a bounded linear operator on any of the Banach spaces $B(I)$, $C(I)$, $BV(I)$, $L(I)$, and $C^1(I)$.
Proposition 2.1.9 The operator $U$ defined by (2.1.16) is a bounded linear operator of norm 1 on both $B(I)$ and $C(I)$.
Proof. It is obvious that if $f \in B(I)$ then $Uf \in B(I)$ and $\|Uf\| \le \|f\|$. Next, if $f \in C(I)$ then $Uf \in C(I)$ since the series defining $Uf$ is uniformly convergent, it being dominated by a convergent series of positive constants. We also have $\|Uf\| \le \|f\|$, $f \in C(I) \subset B(I)$, as a consequence of the validity of this inequality for $f \in B(I)$. In both cases $\|U\| = 1$ since $U$ preserves the constant functions. □
A different interpretation is available for the operator $U : B(I) \to B(I)$.
Proposition 2.1.10 The operator $U : B(I) \to B(I)$ is the transition operator of both the Markov chain $(s_n^a)_{n \in \mathbf{N}}$ on $(I, \mathcal{B}_I, \gamma_a)$, for any $a \in I$, and the Markov chain $(\bar s_\ell)_{\ell \in \mathbf{Z}}$ on $(I^2, \mathcal{B}_I^2, \bar\gamma)$.
Proof. As noted in Subsection 1.3.4, for any $a \in I$ the sequence $(s_n^a)_{n \in \mathbf{N}}$ is an $I$-valued Markov chain with the following transition mechanism: from state $s \in I$ the possible transitions are to any state $1/(s+i)$ with corresponding transition probability $P_i(s)$, $i \in \mathbf{N}_+$. Then the transition operator of $(s_n^a)_{n \in \mathbf{N}}$ takes $f \in B(I)$ to the function defined by
\[ E\left(f\left(s_{n+1}^a\right) \mid s_n^a = s\right) = \sum_{i \in \mathbf{N}_+} P_i(s)\, f\!\left(\frac{1}{s+i}\right) = Uf(s), \quad s \in I, \]
that is, it coincides with the operator $U$ whatever $a \in I$.
A similar reasoning is valid for the case of the Markov chain $(\bar s_\ell)_{\ell \in \mathbf{Z}}$, whose transition mechanism is identical with that of $(s_n^a)_{n \in \mathbf{N}}$. (See Subsection 1.3.3.) □
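The transition mechanism described in the proof can be simulated directly, which gives a Monte Carlo check that the one-step conditional expectation coincides with $Uf$. The sketch below is an illustration only; the function names, the sample size and the seed are assumptions of the example. It uses the fact that the partial sums of the weights telescope, $\sum_{i \le k} P_i(s) = 1 - (s+1)/(s+k+1)$, so the index $i$ can be drawn by inverting this distribution function.

```python
import math
import random


def P(i, s):
    """Transition probability P_i(s) = (s + 1) / ((s + i)(s + i + 1))."""
    return (s + 1.0) / ((s + i) * (s + i + 1.0))


def U(f, s, n_terms=100_000):
    """Truncated transition operator: sum_i P_i(s) f(1/(s+i))."""
    return sum(P(i, s) * f(1.0 / (s + i)) for i in range(1, n_terms + 1))


def step(s, rng):
    """One transition from state s: move to 1/(s+i) with probability P_i(s).

    The smallest k with 1 - (s+1)/(s+k+1) >= u is ceil((s+1) u / (1-u)).
    """
    u = rng.random()
    i = max(1, math.ceil((s + 1.0) * u / (1.0 - u)))
    return 1.0 / (s + i)


def monte_carlo(f, s, n_samples=200_000, seed=0):
    """Empirical mean of f over simulated one-step transitions from s."""
    rng = random.Random(seed)
    return sum(f(step(s, rng)) for _ in range(n_samples)) / n_samples


if __name__ == "__main__":
    f = lambda t: t
    # The simulated mean and the series value should be close.
    print(monte_carlo(f, 0.5), U(f, 0.5))
```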
To prove a result similar to Proposition 2.1.9 for the Banach spaces $BV(I)$, $L(I)$, and $C^1(I)$ we need some preparation.
We first prove that the operator $U : B(I) \to B(I)$ takes monotone functions into monotone functions, reversing the direction of monotonicity.
Proposition 2.1.11 If $f \in B(I)$ is non-decreasing (non-increasing), then $Uf$ is non-increasing (non-decreasing).
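Proof. To make a choice assume that $f$ is non-decreasing. Let $y > x$, $x, y \in I$. We have $Uf(y) - Uf(x) = S_1 + S_2$, where
\[ S_1 = \sum_{i \in \mathbf{N}_+} P_i(y) \left( f\!\left(\frac{1}{y+i}\right) - f\!\left(\frac{1}{x+i}\right) \right), \qquad S_2 = \sum_{i \in \mathbf{N}_+} \left(P_i(y) - P_i(x)\right) f\!\left(\frac{1}{x+i}\right). \]
Clearly, $S_1 \le 0$. We shall prove that $S_2 \le 0$, too. Since
\[ \sum_{i \in \mathbf{N}_+} P_i(u) = 1, \quad u \in I, \]
we can write
\[ S_2 = -\sum_{i \in \mathbf{N}_+} \left( f\!\left(\frac{1}{x+1}\right) - f\!\left(\frac{1}{x+i}\right) \right) \left(P_i(y) - P_i(x)\right). \]
As is easy to see, the function $P_1$ is decreasing while the functions $P_i$, $i \ge 3$, are all increasing. Note also that
\[ f\!\left(\frac{1}{x+1}\right) - f\!\left(\frac{1}{x+i}\right) \ge f\!\left(\frac{1}{x+1}\right) - f\!\left(\frac{1}{x+2}\right) \ge 0, \quad i \ge 2. \]
Therefore
\begin{align*}
S_2 &= -\sum_{i \ge 2} \left( f\!\left(\frac{1}{x+1}\right) - f\!\left(\frac{1}{x+i}\right) \right) \left(P_i(y) - P_i(x)\right) \\
&\le -\left( f\!\left(\frac{1}{x+1}\right) - f\!\left(\frac{1}{x+2}\right) \right) \sum_{i \ge 2} \left(P_i(y) - P_i(x)\right) \\
&= \left( f\!\left(\frac{1}{x+1}\right) - f\!\left(\frac{1}{x+2}\right) \right) \left(P_1(y) - P_1(x)\right) \le 0,
\end{align*}
as claimed. Thus $Uf(y) - Uf(x) \le 0$, and the proof is complete. □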


Remark. It is possible to show more generally that if $f \in L^1$ is non-decreasing (non-increasing), then $Uf$ is non-increasing (non-decreasing). The proof, along the same lines as above, is left to the reader. □
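Proposition 2.1.12 If $f \in B(I)$ is monotone, then
\[ \operatorname{var} Uf \le \frac{1}{2} \operatorname{var} f. \]
The constant $1/2$ cannot be lowered.
Proof. Assume, with no loss of generality, that $f$ is non-decreasing. [Note that if $f$ is non-increasing, then $-f$ is non-decreasing while $\operatorname{var} U(-f) = \operatorname{var} Uf$ and $\operatorname{var}(-f) = \operatorname{var} f$.] Then by Proposition 2.1.11 we have
\[ \operatorname{var} Uf = Uf(0) - Uf(1) = \sum_{i \in \mathbf{N}_+} \left( P_i(0)\, f\!\left(\frac{1}{i}\right) - P_i(1)\, f\!\left(\frac{1}{i+1}\right) \right). \]
Since $P_i(1) = 2 P_{i+1}(0)$, $i \in \mathbf{N}_+$, it follows that
\[ \operatorname{var} Uf = P_1(0)\, f(1) - \sum_{i \in \mathbf{N}_+} P_{i+1}(0)\, f\!\left(\frac{1}{i+1}\right). \]
As
\[ P_1(0) = \sum_{i \in \mathbf{N}_+} P_{i+1}(0) = \frac{1}{2} \]
and
\[ f\!\left(\frac{1}{i+1}\right) \ge f(0), \quad i \in \mathbf{N}_+, \]
we finally obtain
\[ \operatorname{var} Uf \le \frac{1}{2} \left(f(1) - f(0)\right) = \frac{1}{2} \operatorname{var} f. \]
Since for $f$ defined by $f(x) = 0$, $0 \le x < 1$, and $f(1) = 1$ we have $\operatorname{var} Uf = (\operatorname{var} f)/2$, it follows that the constant $1/2$ cannot be lowered. □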
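The two facts used in this proof, the identity $P_i(1) = 2P_{i+1}(0)$ and the extremal example, can be verified in exact rational arithmetic. The sketch below is only an illustration; the helper names are assumptions of the example. For the step function $f = 1_{\{1\}}$ only the term $i = 1$ at $x = 0$ contributes, so the truncated sum is already exact and $Uf(0) - Uf(1) = 1/2$.

```python
from fractions import Fraction


def P(i, x):
    """Exact weight P_i(x) = (x + 1) / ((x + i)(x + i + 1))."""
    x = Fraction(x)
    return (x + 1) / ((x + i) * (x + i + 1))


def U_partial(f, x, n_terms):
    """Partial sum of the series for U f(x), in exact arithmetic."""
    x = Fraction(x)
    return sum(P(i, x) * f(1 / (x + i)) for i in range(1, n_terms + 1))


def step_function(t):
    """f(t) = 0 for 0 <= t < 1 and f(1) = 1, so var f = 1."""
    return 1 if t == 1 else 0


if __name__ == "__main__":
    print(all(P(i, 1) == 2 * P(i + 1, 0) for i in range(1, 50)))
    # var U f = U f(0) - U f(1), because U f is non-increasing.
    print(U_partial(step_function, 0, 50) - U_partial(step_function, 1, 50))
```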
Corollary 2.1.13 If $f \in BV(I)$ is real-valued, then
\[ \operatorname{var} Uf \le \frac{1}{2} \operatorname{var} f. \]
The constant $1/2$ cannot be lowered.
Proof. By Hahn’s decomposition of a signed measure, for any $f \in BV(I)$ there exist monotone functions $f_1, f_2 \in B(I)$ such that $f = f_1 - f_2$ and $\operatorname{var} f = \operatorname{var} f_1 + \operatorname{var} f_2$. [To obtain this consider the signed measure $\mu$ on $\mathcal{B}_I$ defined by $\mu((a,b]) = f(b) - f(a)$, $a < b$, $a, b \in I$.] Then by Proposition 2.1.12 we have
\[ \operatorname{var} Uf = \operatorname{var}(Uf_1 - Uf_2) \le \operatorname{var} Uf_1 + \operatorname{var} Uf_2 \le \frac{1}{2} \left(\operatorname{var} f_1 + \operatorname{var} f_2\right) = \frac{1}{2} \operatorname{var} f. \]
The optimality of the constant $1/2$ follows from Proposition 2.1.12. □
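Proposition 2.1.14 We have
\[ s(Uf) \le \left(2\zeta(3) - \zeta(2)\right) s(f) \tag{2.1.17} \]
for any $f \in L(I)$. The constant $\theta = 2\zeta(3) - \zeta(2) = 0.7591797\cdots$ cannot be lowered.
Proof. For $x \ne y$, $x, y \in I$, we have
\begin{align*}
\frac{Uf(y) - Uf(x)}{y - x} &= \sum_{i \in \mathbf{N}_+} \frac{P_i(y) - P_i(x)}{y - x}\, f\!\left(\frac{1}{x+i}\right) \tag{2.1.18} \\
&\quad - \sum_{i \in \mathbf{N}_+} P_i(y)\, \frac{f\!\left(\frac{1}{y+i}\right) - f\!\left(\frac{1}{x+i}\right)}{\frac{1}{y+i} - \frac{1}{x+i}}\, \frac{1}{(x+i)(y+i)}.
\end{align*}
Next, remark that
\[ P_i(x) = \frac{i}{x+i+1} - \frac{i-1}{x+i}, \quad i \in \mathbf{N}_+, \]
and then
\[ \frac{P_i(y) - P_i(x)}{y - x} = \frac{i-1}{(x+i)(y+i)} - \frac{i}{(x+i+1)(y+i+1)}, \quad i \in \mathbf{N}_+. \]
Hence
\[ \sum_{i \in \mathbf{N}_+} \frac{P_i(y) - P_i(x)}{y - x}\, f\!\left(\frac{1}{x+i}\right) = \sum_{i \in \mathbf{N}_+} \frac{i}{(x+i+1)(y+i+1)} \left( f\!\left(\frac{1}{x+i+1}\right) - f\!\left(\frac{1}{x+i}\right) \right). \tag{2.1.19} \]
Assume that $x > y$. It then follows from (2.1.18) and (2.1.19) that
\[ \left| \frac{Uf(y) - Uf(x)}{y - x} \right| \le s(f) \sum_{i \in \mathbf{N}_+} \left( \frac{P_i(y)}{(y+i)^2} + \frac{i}{(y+i)(y+i+1)^3} \right). \]
Now, the function $g$ defined by
\[ g(y) = \sum_{i \in \mathbf{N}_+} \frac{P_i(y)}{(y+i)^2}, \quad y \in I, \]
is precisely $Uh$ for $h(y) = y^2$, $y \in I$. Since $h$ is increasing, $g$ is decreasing by Proposition 2.1.11. Therefore for any $y \in I$ we have
\begin{align*}
\sum_{i \in \mathbf{N}_+} \left( \frac{P_i(y)}{(y+i)^2} + \frac{i}{(y+i)(y+i+1)^3} \right) &\le \sum_{i \in \mathbf{N}_+} \left( \frac{1}{i^3(i+1)} + \frac{1}{(i+1)^3} \right) \\
&= \sum_{i \in \mathbf{N}_+} \left( \frac{1}{i^3} - \frac{1}{i^2} + \frac{1}{i} - \frac{1}{i+1} + \frac{1}{(i+1)^3} \right) \\
&= \zeta(3) - \zeta(2) + 1 + \zeta(3) - 1 = 2\zeta(3) - \zeta(2).
\end{align*}
As clearly
\[ \sup_{x, y \in I,\ x > y} \left| \frac{Uf(y) - Uf(x)}{y - x} \right| = s(Uf), \]
we obtain (2.1.17).
Finally, it is easy to check that for $f(x) = x$, $x \in I$, we have $s(f) = 1$ and $s(Uf) = 2\zeta(3) - \zeta(2)$. The proof is complete. □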
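The value of the constant and its sharpness for $f(x) = x$ can be examined numerically. The sketch below is an illustration under stated assumptions: the function names, truncation levels and the step `h` are choices of the example, and the slope of $Uf$ is estimated only at the endpoint $0$, where a direct computation of the derivative of $\sum_i (x+1)/((x+i)^2(x+i+1))$ gives $\zeta(2) - 2\zeta(3)$.

```python
def theta_series(n_terms=200_000):
    """2*zeta(3) - zeta(2) via the series sum_i (1/(i^3 (i+1)) + 1/(i+1)^3)."""
    return sum(1.0 / (i ** 3 * (i + 1)) + 1.0 / (i + 1) ** 3
               for i in range(1, n_terms + 1))


def U_identity(x, n_terms=100_000):
    """U f(x) for f(t) = t, i.e. sum_i P_i(x) / (x + i)."""
    return sum((x + 1.0) / ((x + i) ** 2 * (x + i + 1.0))
               for i in range(1, n_terms + 1))


def slope_at_zero(h=1e-4):
    """One-sided difference quotient of U f at 0 (absolute value)."""
    return abs((U_identity(h) - U_identity(0.0)) / h)


if __name__ == "__main__":
    # Both numbers should be close to the Lipschitz constant theta.
    print(theta_series(), slope_at_zero())
```

The difference quotient carries an $O(h)$ discretization error, so only about three digits of agreement should be expected from it.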

Proposition 2.1.15 We have
\[ \|(Uf)'\| \le \left(2\zeta(3) - \zeta(2)\right) \|f'\| \tag{2.1.20} \]
for any $f \in C^1(I)$. The constant $\theta = 2\zeta(3) - \zeta(2) = 0.7591797\cdots$ cannot be lowered.

Proof. Equations (2.1.19) and (2.1.18) show that for $f \in C^1(I)$ the series defining $Uf$ can be differentiated term by term since the series of the derivatives is uniformly convergent, it being dominated by a convergent series of positive constants (cf. further Subsection 2.2.1). Then (2.1.20) follows from (2.1.17) since for any $f \in C^1(I)$ we have $s(f) = \|f'\|$. □

Now, we can state the result announced.

Proposition 2.1.16 The operator $U$ defined by (2.1.16) is a bounded linear operator of norm 1 on any of the Banach spaces $BV(I)$, $L(I)$, and $C^1(I)$.

Proof. The result follows from Corollary 2.1.13 and Propositions 2.1.14 and 2.1.15, having in view that $U$ preserves the constant functions. In the case of $BV(I)$ we should note that for a complex-valued $f \in BV(I)$ we have
\[ \max\left(\operatorname{var} \operatorname{Re} f,\ \operatorname{var} \operatorname{Im} f\right) \le \operatorname{var} f \le \operatorname{var} \operatorname{Re} f + \operatorname{var} \operatorname{Im} f. \]
Hence by Corollary 2.1.13 we have $\operatorname{var} Uf \le \operatorname{var} f$ for such an $f$. □



2.1.4 A solution to Gauss’ problem for probability measures with densities

Let $\mu \in \mathrm{pr}(\mathcal{B}_I)$ such that $\mu \ll \lambda$. By Proposition 2.1.5 for any $n \in \mathbf{N}$ we have
\[ \mu\left(\tau^{-n}(A)\right) = \int_A \frac{U^n f_0(x)}{x+1}\,dx, \quad A \in \mathcal{B}_I, \tag{2.1.21} \]
with $f_0(x) = (x+1)\,F_0'(x)$, $x \in I$, where $F_0' = d\mu/d\lambda$. We shall consider Gauss’ problem in a more general form, namely, that of the asymptotic behaviour of $\mu(\tau^{-n}(A))$ as $n \to \infty$ for any $A \in \mathcal{B}_I$.
Equation (2.1.21) shows that solving this more general Gauss’ problem for a given $\mu \in \mathrm{pr}(\mathcal{B}_I)$ amounts to studying the behaviour of the $n$th power of the Perron–Frobenius operator $U$ on a suitable Banach space. On account of the results obtained in Subsection 2.1.2 we can state the following result.
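Proposition 2.1.17 Let $\mu \in \mathrm{pr}(\mathcal{B}_I)$ such that $\mu \ll \lambda$. We have
\[ \lim_{n \to \infty} \sup_{A \in \mathcal{B}_I} \left| \mu\left(\tau^{-n}(A)\right) - \gamma(A) \right| = 0. \tag{2.1.22} \]
If $F_0' = d\mu/d\lambda \in BEV(I)$ then there exists a constant $C \in \mathbf{R}_+$ such that
\[ \left| \mu\left(\tau^{-n}(A)\right) - \gamma(A) \right| \le C\, q^n\, \gamma(A) \tag{2.1.23} \]
for any $n \in \mathbf{N}_+$ and $A \in \mathcal{B}_I$. Here $0 < q < 1$ is the constant occurring in Proposition 2.1.7(ii).
Proof. We have
\[ \mu\left(\tau^{-n}(A)\right) - \gamma(A) = \int_A \frac{U^n f_0(x) - U^\infty f_0}{x+1}\,dx \tag{2.1.24} \]
since
\[ U^\infty f_0 = \int_I f_0\,d\gamma = \frac{1}{\log 2} \int_I F_0'\,d\lambda = \frac{1}{\log 2}, \]
and equation (2.1.22) follows by (2.1.15).
If $F_0' \in BEV(I)$ then for some $C_0 \in \mathbf{R}_+$ by Proposition 2.1.7(ii) we have
\[ \|U^n f_0 - U^\infty f_0\|_v \le C_0\, q^n\, \|f_0\|_v, \quad n \in \mathbf{N}_+. \]
It then follows from Proposition 2.0.1(ii) that
\[ \operatorname{ess\,sup} |U^n f_0 - U^\infty f_0| \le C_0\, q^n\, \|f_0\|_v, \quad n \in \mathbf{N}_+. \tag{2.1.25} \]
Now, (2.1.23) follows from (2.1.24) and (2.1.25). □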
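For $\mu = \lambda$ we have $F_0' = 1$, hence $f_0(x) = x + 1$ and $U^\infty f_0 = 1/\log 2$, and the convergence of $U^n f_0$ can be observed numerically for small $n$. The sketch below is only an illustration: the names, the grid, the truncation level and the tail correction (the neglected weights, of total mass $(x+1)/(x+N+1)$, are multiplied by $f(0)$, assuming $f$ is continuous at $0$) are all assumptions of the example. The recursion is naive and costs about `n_terms**n` evaluations per point, so it is meant for $n \le 2$.

```python
import math


def P(i, x):
    """Weight P_i(x) = (x + 1) / ((x + i)(x + i + 1))."""
    return (x + 1.0) / ((x + i) * (x + i + 1.0))


def U(f, x, n_terms=200):
    """Truncated series for U f(x) with a simple tail correction."""
    head = sum(P(i, x) * f(1.0 / (x + i)) for i in range(1, n_terms + 1))
    tail_mass = (x + 1.0) / (x + n_terms + 1.0)
    return head + tail_mass * f(0.0)


def apply_U(prev):
    """Return the function x -> U prev (x)."""
    return lambda x: U(prev, x)


def sup_deviation(n, grid_size=11):
    """Max over a grid of |U^n f_0 - 1/log 2| for f_0(x) = x + 1."""
    g = lambda x: x + 1.0
    for _ in range(n):
        g = apply_U(g)
    limit = 1.0 / math.log(2.0)
    return max(abs(g(k / (grid_size - 1)) - limit) for k in range(grid_size))


if __name__ == "__main__":
    for n in range(3):
        print(n, sup_deviation(n))
```

The printed deviations should decrease with `n`, in line with (2.1.22) and (2.1.24).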


Remark. As for $q$, we conjecture that its (optimal) value is $g^2 = (3 - \sqrt{5})/2 = 0.38196\cdots$, as in a further related result, namely, Corollary 2.5.7. □
In the next three sections we will take up Gauss’ problem assuming that $F_0' = d\mu/d\lambda$ belongs to Banach spaces ‘smaller’ than $BEV(I)$.

2.1.5 Computing variances of certain sums


In this subsection, using properties of the Perron–Frobenius operator $U$ on $BEV(I)$, we give some results concerning the variances of certain sums of random variables constructed starting from either the $\bar a_\ell$, $\ell \in \mathbf{Z}$, or the $a_n$, $n \in \mathbf{N}_+$. These results will be used in Chapter 3.
Let $H$ be a real-valued function on $\mathbf{N}_+^{\mathbf{Z}}$. Set $H_\ell = H_1 \circ \bar\tau^{\,\ell-1}$, $\ell \in \mathbf{Z}$, where
\[ H_1 = H(\cdots, \bar a_{-2}, \bar a_{-1}, \bar a_0, \bar a_1, \bar a_2, \cdots). \]
Clearly $(H_\ell)_{\ell \in \mathbf{Z}}$ is a strictly stationary process on $(I^2, \mathcal{B}_I^2, \bar\gamma)$. Set $S_0 = 0$, $S_n = \sum_{i=1}^n H_i$, $n \in \mathbf{N}_+$. We start with some well known results.
Theorem 2.1.18 If $E_{\bar\gamma} H_1^2 < \infty$, $E_{\bar\gamma} H_1 = 0$, and $\lim_{n \to \infty} E_{\bar\gamma} H_1 H_n = 0$, then the finite or infinite limit $\lim_{n \to \infty} E_{\bar\gamma} S_n^2$ exists. We have $\lim_{n \to \infty} E_{\bar\gamma} S_n^2 < \infty$ if and only if there exists $g \in L^2_{\bar\gamma}(I^2)$ such that $H_1 = g \circ \bar\tau - g$ a.e. in $I^2$.
This is a special case of Theorem 18.2.2 in Ibragimov and Linnik (1971).
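Proposition 2.1.19 If $E_{\bar\gamma} H_1^2 < \infty$, $E_{\bar\gamma} H_1 = 0$, and the series
\[ \sigma^2 = E_{\bar\gamma} H_1^2 + 2 \sum_{n \in \mathbf{N}_+} E_{\bar\gamma} H_1 H_{n+1} \tag{2.1.26} \]
converges absolutely, then $\sigma^2 \ge 0$ and
\[ E_{\bar\gamma} S_n^2 = n \left(\sigma^2 + o(1)\right) \tag{2.1.27} \]
as $n \to \infty$. If the stronger assumption $\sum_{n \in \mathbf{N}_+} n\, |E_{\bar\gamma} H_1 H_{n+1}| < \infty$ holds, then
\[ E_{\bar\gamma} S_n^2 = n \left(\sigma^2 + O(n^{-1})\right) \tag{2.1.28} \]
as $n \to \infty$.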
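Proof. By strict stationarity, for any $n > 1$ we have
\[ E_{\bar\gamma} S_n^2 = \sum_{i,j=1}^{n} E_{\bar\gamma} H_i H_j = n\, E_{\bar\gamma} H_1^2 + 2 \sum_{j=1}^{n-1} (n-j)\, E_{\bar\gamma} H_1 H_{j+1}. \]
Therefore
\[ \frac{1}{n} \left| E_{\bar\gamma} S_n^2 - n\sigma^2 \right| \le 2 \left( \frac{\sum_{j=1}^{n-1} j\, |E_{\bar\gamma} H_1 H_{j+1}|}{n} + \sum_{j \ge n} |E_{\bar\gamma} H_1 H_{j+1}| \right), \]
and the right hand side is $o(1)$ as $n \to \infty$ when $\sum_{n \in \mathbf{N}_+} |E_{\bar\gamma} H_1 H_{n+1}| < \infty$ (note that $\sum_{n \in \mathbf{N}_+} |u_n| < \infty$ implies $\lim_{n \to \infty} \sum_{j=1}^{n} j\, |u_j| / n = 0$), so that (2.1.27) holds. Finally, since
\[ \frac{\sum_{j=1}^{n-1} j\, |E_{\bar\gamma} H_1 H_{j+1}|}{n} + \sum_{j \ge n} |E_{\bar\gamma} H_1 H_{j+1}| \le \frac{\sum_{j \in \mathbf{N}_+} j\, |E_{\bar\gamma} H_1 H_{j+1}|}{n}, \]
equation (2.1.28) holds, too, under our stronger assumption. □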
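The counting identity behind the first display of this proof, and the resulting $O(n^{-1})$ rate, can be illustrated with any summable covariance sequence. In the sketch below the covariance $r(j) = 2^{-j}$ is a hypothetical choice made only for the example (it is not taken from the text), and the function names are likewise assumptions. For this choice $\sigma^2 = 3$ and $E S_n^2 / n - \sigma^2 \approx -4/n$.

```python
def second_moment_direct(r, n):
    """E S_n^2 written as the double sum of covariances r(|i - j|)."""
    return sum(r(abs(i - j)) for i in range(1, n + 1) for j in range(1, n + 1))


def second_moment_formula(r, n):
    """n r(0) + 2 sum_{j=1}^{n-1} (n - j) r(j), as in the proof."""
    return n * r(0) + 2 * sum((n - j) * r(j) for j in range(1, n))


def sigma2(r, n_terms=200):
    """Truncated series r(0) + 2 sum_{j>=1} r(j), the analogue of (2.1.26)."""
    return r(0) + 2 * sum(r(j) for j in range(1, n_terms + 1))


if __name__ == "__main__":
    r = lambda j: 0.5 ** j  # hypothetical covariance, for illustration only
    print(second_moment_direct(r, 50), second_moment_formula(r, 50))
    for n in (10, 100, 1000):
        print(n, second_moment_formula(r, n) / n - sigma2(r))
```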


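Corollary 2.1.20 Assume that $E_{\bar\gamma} H_1^2 < \infty$, $E_{\bar\gamma} H_1 = 0$, and
\[ \sum_{n \in \mathbf{N}_+} n\, |E_{\bar\gamma} H_1 H_{n+1}| < \infty. \]
Then $\sigma = 0$ if and only if there exists $g \in L^2_{\bar\gamma}(I^2)$ such that $H_1 = g \circ \bar\tau - g$ a.e. in $I^2$.
Proposition 2.1.21 If $E_{\bar\gamma} H_1^2 < \infty$, $E_{\bar\gamma} H_1 = 0$, and
\[ \sum_{n \in \mathbf{N}_+} E_{\bar\gamma}^{1/2} \left[ H_1 - E_{\bar\gamma}(H_1 \mid \bar a_{-n}, \cdots, \bar a_n) \right]^2 < \infty, \tag{2.1.29} \]
then series (2.1.26) converges absolutely.
On account of Corollary 1.3.15, this is a transcription of part of Theorem 18.6.1 in Ibragimov and Linnik (1971) for the special case of the doubly infinite sequence $(\bar a_\ell)_{\ell \in \mathbf{Z}}$. □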
Note that both the conditional mean value occurring in (2.1.29) and $\sigma^2$ can be expressed in terms of the random variable $h$ on $(I^2, \mathcal{B}_I^2)$ defined on $\Omega^2$ (thus a.e. in $I^2$) by
\[ h([i_1, i_2, \cdots], [i_0, i_{-1}, \cdots]) = H(\cdots, i_{-1}, i_0, i_1, \cdots) \]
for any $(i_\ell)_{\ell \in \mathbf{Z}} \in \mathbf{N}_+^{\mathbf{Z}}$. Clearly,
\[ E_{\bar\gamma} H_1 = \int_{I^2} h\,d\bar\gamma, \qquad E_{\bar\gamma} H_1^2 = \int_{I^2} h^2\,d\bar\gamma, \]
\[ E_{\bar\gamma}(H_1 \mid \bar a_{-n}, \cdots, \bar a_n)(\omega, \theta) = \frac{1}{\bar\gamma\left(I^2(i_{-n}, \cdots, i_n)\right)} \int_{I^2(i_{-n}, \cdots, i_n)} h\,d\bar\gamma \]
for $(\omega, \theta) \in I^2(i_{-n}, \cdots, i_n)$, where
\[ I^2(i_{-n}, \cdots, i_n) = I(i_1, \cdots, i_n) \times I(i_0, i_{-1}, \cdots, i_{-n}) \]
for any $i_k \in \mathbf{N}_+$, $-n \le k \le n$, $n \in \mathbf{N}_+$, and
\[ \sigma^2 = \int_{I^2} h^2\,d\bar\gamma + 2 \sum_{n \in \mathbf{N}_+} \int_{I^2} h\,(h \circ \bar\tau^{\,n})\,d\bar\gamma. \]
Condition (2.1.29) is fulfilled for a large class of functions $h$ as shown by the following result.
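Proposition 2.1.22 Put
\[ c_n = \sup \left| h(\omega, \theta) - h(\omega', \theta') \right|, \quad n \in \mathbf{N}_+, \]
where the upper bound is taken over all $(\omega, \theta), (\omega', \theta') \in I^2(i_{-n}, \cdots, i_n)$ and $i_k \in \mathbf{N}_+$, $-n \le k \le n$. Assume that $E_{\bar\gamma} H_1^2 = \int_{I^2} h^2\,d\bar\gamma < \infty$ and $\sum_{n \in \mathbf{N}_+} c_n < \infty$. Then (2.1.29) holds.
Proof. For any $n \in \mathbf{N}_+$, writing $J = I^2(i_{-n}, \cdots, i_n)$ inside the sums, we have
\begin{align*}
E_{\bar\gamma} \left[ H_1 - E_{\bar\gamma}(H_1 \mid \bar a_{-n}, \cdots, \bar a_n) \right]^2
&= \sum_{i_{-n}, \cdots, i_n \in \mathbf{N}_+} \int_J \left( h(\omega', \theta') - \frac{\int_J h(\omega, \theta)\, \bar\gamma(d\omega, d\theta)}{\bar\gamma(J)} \right)^{2} \bar\gamma(d\omega', d\theta') \\
&= \sum_{i_{-n}, \cdots, i_n \in \mathbf{N}_+} \frac{\int_J \bar\gamma(d\omega', d\theta') \left( \int_J \left( h(\omega', \theta') - h(\omega, \theta) \right) \bar\gamma(d\omega, d\theta) \right)^{2}}{\bar\gamma^2(J)} \\
&\le \sum_{i_{-n}, \cdots, i_n \in \mathbf{N}_+} \bar\gamma\left(I^2(i_{-n}, \cdots, i_n)\right) c_n^2 = c_n^2.
\end{align*}
Hence the series occurring in (2.1.29) is dominated by the convergent series $\sum_{n \in \mathbf{N}_+} c_n$, which completes the proof. □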
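Remark. If for some positive constants $c$ and $\varepsilon$ we have
\[ \left| h(\omega, \theta) - h(\omega', \theta') \right| \le c \left( \left| \frac{1}{\omega} - \frac{1}{\omega'} \right| + \left| \frac{1}{\theta} - \frac{1}{\theta'} \right| \right)^{\varepsilon} \tag{2.1.30} \]
for any $(\omega, \theta), (\omega', \theta') \in \Omega^2$, then the assumption of Proposition 2.1.22 holds. Indeed, for $(\omega, \theta), (\omega', \theta') \in I^2(i_{-n}, \cdots, i_n)$ we have
\[ \left| \frac{1}{\theta} - \frac{1}{\theta'} \right| \le \sup_{i_{-1}, \cdots, i_{-n} \in \mathbf{N}_+} \lambda\left(I(i_{-1}, \cdots, i_{-n})\right) = (F_n F_{n+1})^{-1} \]
and similarly
\[ \left| \frac{1}{\omega} - \frac{1}{\omega'} \right| \le (F_{n-1} F_n)^{-1}. \]
Hence
\[ \left| h(\omega, \theta) - h(\omega', \theta') \right| \le c\, 2^{\varepsilon} (F_{n-1} F_n)^{-\varepsilon}, \quad n \in \mathbf{N}_+, \]
for any $(\omega, \theta), (\omega', \theta') \in I^2(i_{-n}, \cdots, i_n)$, $i_k \in \mathbf{N}_+$, $-n \le k \le n$, and clearly the series $\sum_{n \in \mathbf{N}_+} (F_{n-1} F_n)^{-\varepsilon}$ is convergent.
In particular, (2.1.30) holds if $h$ satisfies a Hölder condition of order $\varepsilon > 0$, that is,
\[ \sup_{(\omega, \theta), (\omega', \theta') \in \Omega^2} \frac{\left| h(\omega, \theta) - h(\omega', \theta') \right|}{\left( |\omega' - \omega| + |\theta' - \theta| \right)^{\varepsilon}} < \infty. \]
□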
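The identity $\sup \lambda(I(i_1, \cdots, i_n)) = (F_n F_{n+1})^{-1}$ used in this remark can be checked by brute force over small digits. The sketch below relies on two facts that are not restated in this section and are therefore labeled as assumptions: the standard length formula $\lambda(I(i_1, \cdots, i_n)) = 1/(q_n (q_n + q_{n-1}))$ for a fundamental interval, with $q_n$ the denominator of the $n$th convergent, and the Fibonacci convention $F_0 = F_1 = 1$. The function names and the digit bound are choices of the example.

```python
from fractions import Fraction
from itertools import product


def interval_length(digits):
    """Length 1 / (q_n (q_n + q_{n-1})) of the fundamental interval I(digits)."""
    q_prev, q = 0, 1  # q_{-1} = 0, q_0 = 1
    for a in digits:
        q_prev, q = q, a * q + q_prev
    return Fraction(1, q * (q + q_prev))


def fib(n):
    """Fibonacci numbers with the assumed convention F_0 = F_1 = 1."""
    a, b = 1, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def max_length(n, max_digit=4):
    """Largest length among intervals of order n with digits <= max_digit."""
    return max(interval_length(d)
               for d in product(range(1, max_digit + 1), repeat=n))


if __name__ == "__main__":
    for n in range(1, 6):
        # The maximum is attained when all digits equal 1.
        print(n, max_length(n), Fraction(1, fib(n) * fib(n + 1)))
```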
The results above clearly apply to the special case where $H$ is a real-valued function on $\mathbf{N}_+^{\mathbf{N}_+}$. In this case we set
\[ H_n = H(a_n, a_{n+1}, \cdots) = H_1 \circ \tau^{n-1}, \quad n \in \mathbf{N}_+. \]
Then $(H_n)_{n \in \mathbf{N}_+}$ is a strictly stationary sequence on $(I, \mathcal{B}_I, \gamma)$. Theorem 2.1.18, Proposition 2.1.19, Corollary 2.1.20, and Proposition 2.1.21 hold in the present case if in their statements we replace $\bar\gamma$ by $\gamma$, $I^2$ by $I$, $\bar\tau$ by $\tau$ and inequality (2.1.29) by
\[ \sum_{n \in \mathbf{N}_+} E_\gamma^{1/2} \left[ H_1 - E_\gamma(H_1 \mid a_1, \cdots, a_n) \right]^2 < \infty. \tag{2.1.31} \]
In the present case the conditional mean value occurring in (2.1.31) and $\sigma^2$ can be expressed in terms of the random variable $h$ on $(I, \mathcal{B}_I)$ defined on $\Omega$ (thus a.e. in $I$) by
\[ h([i_1, i_2, \cdots]) = H(i_1, i_2, \cdots) \]
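for any $(i_\ell)_{\ell \in \mathbf{N}_+} \in \mathbf{N}_+^{\mathbf{N}_+}$. Clearly,
\[ E_\gamma H_1 = \int_I h\,d\gamma, \qquad E_\gamma H_1^2 = \int_I h^2\,d\gamma, \]
\[ E_\gamma(H_1 \mid a_1, \cdots, a_n)(\omega) = \frac{1}{\gamma\left(I(i^{(n)})\right)} \int_{I(i^{(n)})} h\,d\gamma \]
for any $\omega \in I(i^{(n)})$, $i^{(n)} \in \mathbf{N}_+^n$, $n \in \mathbf{N}_+$, and
\[ \sigma^2 = \int_I h^2\,d\gamma + 2 \sum_{n \in \mathbf{N}_+} \int_I h\,(h \circ \tau^n)\,d\gamma = \int_I h^2\,d\gamma + 2 \sum_{n \in \mathbf{N}_+} \int_I h\, U^n h\,d\gamma \]
[the last equation is a consequence of (2.1.2)].
It follows from Proposition 2.1.22 that condition (2.1.31) is fulfilled if we assume that $\int_I h^2\,d\gamma < \infty$ and $\sum_{n \in \mathbf{N}_+} c_n < \infty$, where
\[ c_n = \sup_{i^{(n)} \in \mathbf{N}_+^n}\ \sup_{\omega, \omega' \in I(i^{(n)})} \left| h(\omega) - h(\omega') \right|, \quad n \in \mathbf{N}_+. \]
In turn, the second assumption holds if for some positive constants $c$ and $\varepsilon$ we have
\[ \left| h(\omega) - h(\omega') \right| \le c \left| \frac{1}{\omega} - \frac{1}{\omega'} \right|^{\varepsilon}, \quad \omega, \omega' \in \Omega. \tag{2.1.32} \]
In particular, (2.1.32) holds if $h$ satisfies a Hölder condition of order $\varepsilon > 0$, that is,
\[ \sup_{\omega, \omega' \in \Omega} \frac{\left| h(\omega) - h(\omega') \right|}{|\omega - \omega'|^{\varepsilon}} < \infty. \]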
To indicate another class of functions $h$ for which (2.1.31) holds let us recall that a function $h : I \to \mathbf{C}$ is said to be of bounded $p$-variation, $p \ge 1$, on $A \subset I$ if and only if
\[ \operatorname{var}_A^{(p)} h := \sup \sum_{i=1}^{k-1} \left| h(t_{i+1}) - h(t_i) \right|^p < \infty, \]
the supremum being taken over $t_1 < \cdots < t_k$, $t_i \in A$, $1 \le i \le k$, and $k \ge 2$. We write simply $\operatorname{var}^{(p)} h$ for $\operatorname{var}_I^{(p)} h$. If $\operatorname{var}^{(p)} h < \infty$ then $h$ is called a function of bounded $p$-variation. Clearly, $\operatorname{var}^{(1)} h = \operatorname{var} h$ and a function of bounded variation is also a function of bounded $p$-variation for any $p > 1$. (The converse of this assertion is in general not true.) More generally, a function of bounded $p$-variation, $p \ge 1$, is also a function of bounded $p'$-variation, $p' > p$.
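Proposition 2.1.23. If $h$ is a function of bounded $p$-variation on $\Omega$, then (2.1.31) holds.
Proof. Without any loss of generality, on account of the last assertion above we can assume that $p \ge 2$. It is obvious that
\[ \left| h(\omega) - h(\omega') \right| \le \left( \operatorname{var}_A^{(p)} h \right)^{1/p} \]
for any $A \subset \Omega$ and $\omega, \omega' \in A$. Then
\begin{align*}
E_\gamma^{1/2} \left[ H_1 - E_\gamma(H_1 \mid a_1, \cdots, a_n) \right]^2
&\le E_\gamma^{1/p} \left| H_1 - E_\gamma(H_1 \mid a_1, \cdots, a_n) \right|^p \\
&= \left( \sum_{i^{(n)} \in \mathbf{N}_+^n} \int_{I(i^{(n)})} \left| h(\omega) - \frac{1}{\gamma\left(I(i^{(n)})\right)} \int_{I(i^{(n)})} h(\omega')\, \gamma(d\omega') \right|^p \gamma(d\omega) \right)^{1/p} \\
&= \left( \sum_{i^{(n)} \in \mathbf{N}_+^n} \frac{1}{\gamma^p\left(I(i^{(n)})\right)} \int_{I(i^{(n)})} \gamma(d\omega) \left| \int_{I(i^{(n)})} \left( h(\omega) - h(\omega') \right) \gamma(d\omega') \right|^p \right)^{1/p} \\
&\le \left( \max_{i^{(n)} \in \mathbf{N}_+^n} \gamma\left(I(i^{(n)})\right) \sum_{i^{(n)} \in \mathbf{N}_+^n} \operatorname{var}_{I(i^{(n)})}^{(p)} h \right)^{1/p} \\
&\le \left( \frac{1}{\log 2}\, (F_n F_{n+1})^{-1} \right)^{1/p} \left( \operatorname{var}_\Omega^{(p)} h \right)^{1/p}.
\end{align*}
Hence the series occurring in (2.1.31) is dominated by
\[ \frac{\left( \operatorname{var}_\Omega^{(p)} h \right)^{1/p}}{(\log 2)^{1/p}} \sum_{n \in \mathbf{N}_+} (F_n F_{n+1})^{-1/p}, \]
and clearly the last series is convergent. □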


It is important to know when $\sigma^2$ defined in terms of $H$ or, equivalently, in terms of $h$, is non-zero. In the result below the function $h$, which is only defined on $\Omega$, is considered as the representative of a class of $\lambda$-indistinguishable functions on $I$, after having been extended in an arbitrary manner to the whole of $I$.
Proposition 2.1.24 Assume that $h \in L^2_\gamma(I)$, $\int_I h\,d\gamma = 0$, and $Uh \in BEV(I)$. Then the series
\[ \sigma^2 = \int_I h^2\,d\gamma + 2 \sum_{n \in \mathbf{N}_+} \int_I h\, U^n h\,d\gamma \tag{2.1.33} \]
converges absolutely, and we have $\sigma = 0$ if and only if there exists $b \in L^2_\gamma(I)$ such that $h = b \circ \tau - b$ a.e. in $I$. In particular, if $h$ is essentially unbounded then $\sigma \ne 0$.
Proof. By (2.0.2) and Proposition 2.1.7(ii) we have
\[ \operatorname{ess\,sup} |U^n h| \le \|U^n h\|_v \le q^{n-1} \|Uh\|_v, \quad n \in \mathbf{N}_+, \tag{2.1.34} \]
for some positive $q < 1$. This clearly entails the absolute convergence of both series (2.1.33) and $\sum_{n \in \mathbf{N}_+} n \int_I h\, U^n h\,d\gamma$. Then Corollary 2.1.20 completes the proof of the first two assertions concerning $\sigma$.
Without appealing to Corollary 2.1.20, the characterization of the case $\sigma = 0$ can be given a direct proof as follows. Put $h_1 = \sum_{n \in \mathbf{N}_+} U^n h$. By (2.1.34) this series converges in $BEV(I)$, and we have $h_1 = Uh + Uh_1 = U(h + h_1)$. Writing $g = h + h_1$ we note that $Ug \in BEV(I)$ and
\[ \sigma^2 = \int_I \left( h^2 + 2 h h_1 \right) d\gamma = \int_I \left( g^2 - (Ug)^2 \right) d\gamma. \]
By (2.1.2) we have
\[ \int_I (Ug)^2\,d\gamma = \int_I \left((Ug) \circ \tau\right) g\,d\gamma \]
and
\[ \int_I (Ug)^2\,d\gamma = \int_I \left((Ug) \circ \tau\right)^2 d\gamma. \]
[Note that (2.1.2) implies in general that $\int_I f\,d\gamma = \int_I f \circ \tau\,d\gamma$, $f \in L^1_\gamma$, which also follows from the fact that $\tau$ is $\gamma$-preserving.] Consequently, we can write
\[ \sigma^2 = \int_I g^2\,d\gamma - 2 \int_I \left((Ug) \circ \tau\right) g\,d\gamma + \int_I \left((Ug) \circ \tau\right)^2 d\gamma = \int_I \left( g - (Ug) \circ \tau \right)^2 d\gamma. \]
Now, if $\sigma = 0$ then $g = (Ug) \circ \tau$ a.e. in $I$. Hence
\[ h = (Ug) \circ \tau - Ug \quad \text{a.e. in } I, \tag{2.1.35} \]
that is, we can take $b = Ug$. Conversely, if $h = b \circ \tau - b$ a.e. in $I$ then $S_n = b \circ \tau^n - b$ a.e. in $I$ for any $n \in \mathbf{N}_+$. Hence
\[ n^{-1} E_\gamma S_n^2 \le 4 n^{-1} \int_I b^2\,d\gamma \to 0 \quad \text{as } n \to \infty, \]
that is, $\sigma = 0$.
Finally, since $Ug \in BEV(I)$ as shown above, equation (2.1.35) cannot hold in the case where $h$ is essentially unbounded, that is, we cannot have $\sigma = 0$. □
Corollary 2.1.25 Let $f : \mathbf{N}_+ \to \mathbf{R}$ such that $E_\gamma f^2(a_1) < \infty$, $E_\gamma f(a_1) = 0$. Put
\[ \sigma^2 = E_\gamma f^2(a_1) + 2 \sum_{n \in \mathbf{N}_+} E_\gamma f(a_1)\, f(a_{n+1}). \tag{2.1.36} \]
Then $\sigma = 0$ if and only if $f = 0$.
Proof. As a special case of (2.1.26) with (2.1.31) trivially satisfied, series (2.1.36) is absolutely convergent. Moreover, in the present case $h$ is defined by
\[ h(\omega) = f(\lfloor 1/\omega \rfloor), \quad \omega \in \Omega, \]
and by hypothesis $h \in L^2_\gamma(I)$ and $\int_I h\,d\gamma = 0$. We then have
\[ Uh(\omega) = \sum_{i \in \mathbf{N}_+} P_i(\omega)\, f(i), \quad \omega \in \Omega, \]
and
\[ v(Uh) \le \sum_{i \in \mathbf{N}_+} |f(i)| \operatorname{var} P_i \le C \sum_{i \in \mathbf{N}_+} \frac{|f(i)|}{i^2} \]
for some $C > 0$. The last series is convergent since $E_\gamma |f(a_1)| < \infty$, so that $Uh \in BEV(I)$. Then by Proposition 2.1.24 we have $\sigma = 0$ if and only if there exists $b \in L^2_\gamma(I)$ such that $h = b \circ \tau - b$ a.e. in $I$, and we have to show that this happens if and only if $f = 0$. Clearly, if $f = 0$ then $\sigma = 0$. To prove the converse we first note that
\[ Uh = U(b \circ \tau) - Ub = b - Ub \quad \text{a.e. in } I. \]
This equation holds for $b$ equal to $h_1 = \sum_{n \in \mathbf{N}_+} U^n h \in BEV(I)$. Putting $b = b_1 + h_1$ we get $b_1 = U b_1$. But by Proposition 2.1.7 the last equation only holds for a.e. constant functions $b_1$. This shows that actually $b \in BEV(I)$. Next, whatever $i \in \mathbf{N}_+$, for $u \in (1/(i+1), 1/i)$ the equation $h(u) = (b \circ \tau)(u) - b(u)$ a.e. in $I$ implies
\[ f(i) = b(x) - b\!\left(\frac{1}{x+i}\right) \]
a.e. in $I$. Hence
\[ n f(i) = b(x) - b([i(n-1), x+i]) \]
a.e. in $I$ for any $n \ge 2$, where $i(n-1) = (i_1, \cdots, i_{n-1})$ with $i_1 = \cdots = i_{n-1} = i$. If $f(i) \ne 0$ then this contradicts the fact that $b \in BEV(I)$. The proof is complete. □
We note another criterion for having $\sigma \ne 0$ under stronger assumptions than in Proposition 2.1.24.
Proposition 2.1.26 Let $h : I \to \mathbf{R}$ be continuous except for a finite number of points of $I$ and assume that $\inf_{x \in (0,\delta)} |h(x)| > 0$ for some $\delta > 0$, $\int_I h\,d\gamma = 0$, and $\int_I h^2\,d\gamma < \infty$. If the series defining $Uh$ is uniformly convergent in $I$ and $Uh \in BV(I)$, then $\sigma$ defined by (2.1.33) is non-zero.
For the proof see Samur (1996). □

2.2 Wirsing’s solution to Gauss’ problem


2.2.1 Elementary considerations
Let $\mu \in \mathrm{pr}(\mathcal{B}_I)$ such that $\mu \ll \lambda$. For any $n \in \mathbf{N}$ put
\[ F_n(x) = \mu(\tau^n < x), \quad x \in I, \]
with $\tau^0$ = identity map. As $(\tau^n < x) = \tau^{-n}((0,x))$, by Proposition 2.1.5 we have
\[ F_n(x) = \int_0^x \frac{U^n f_0(u)}{u+1}\,du, \quad n \in \mathbf{N},\ x \in I, \tag{2.2.1} \]
with $f_0(x) = (x+1)\,F_0'(x)$, $x \in I$, where $F_0' = d\mu/d\lambda$. [Clearly, (2.2.1) is a special case of (2.1.21).]
In this subsection we will assume that $F_0' \in C^1(I)$. In other words, we study the behaviour of $U^n$ as $n \to \infty$, assuming that the domain of $U$ is $C^1(I)$.
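Let $f \in C^1(I)$. Then
\[ Uf(x) = \sum_{i \in \mathbf{N}_+} P_i(x)\, f\!\left(\frac{1}{x+i}\right) = \sum_{i \in \mathbf{N}_+} \left( \frac{i}{x+i+1} - \frac{i-1}{x+i} \right) f\!\left(\frac{1}{x+i}\right), \quad x \in I, \]
can be differentiated term by term to give
\begin{align*}
(Uf)'(x) &= -\sum_{i \in \mathbf{N}_+} \left( \left( \frac{i}{(x+i+1)^2} - \frac{i-1}{(x+i)^2} \right) f\!\left(\frac{1}{x+i}\right) + \left( \frac{i}{x+i+1} - \frac{i-1}{x+i} \right) \frac{1}{(x+i)^2}\, f'\!\left(\frac{1}{x+i}\right) \right) \\
&= -\sum_{i \in \mathbf{N}_+} \left( \frac{i}{(x+i+1)^2} \left( f\!\left(\frac{1}{x+i}\right) - f\!\left(\frac{1}{x+i+1}\right) \right) + \frac{x+1}{(x+i)^3 (x+i+1)}\, f'\!\left(\frac{1}{x+i}\right) \right), \quad x \in I,
\end{align*}
since the series of derivatives is uniformly convergent, it being dominated by a convergent series of positive constants. Hence
\[ (Uf)' = -V f', \quad f \in C^1(I), \tag{2.2.2} \]
where $V : C(I) \to C(I)$ is defined by
\[ Vg(x) = \sum_{i \in \mathbf{N}_+} \left( \frac{i}{(x+i+1)^2} \int_{1/(x+i+1)}^{1/(x+i)} g(u)\,du + \frac{x+1}{(x+i)^3 (x+i+1)}\, g\!\left(\frac{1}{x+i}\right) \right), \quad g \in C(I),\ x \in I. \]
Clearly,
\[ (U^n f)' = (-1)^n V^n f', \quad n \in \mathbf{N}_+,\ f \in C^1(I). \tag{2.2.3} \]
We are going to show that $V^n$ takes certain functions into functions with very small values when $n \in \mathbf{N}_+$ is large.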
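Relation (2.2.2) lends itself to a quick numerical sanity check. The sketch below is only an illustration; the function names, the test function $f(t) = t^2$, the truncation level and the finite-difference step are assumptions of the example. For $g = f'$ the integral in the definition of $V$ is available in closed form, $\int_{1/(x+i+1)}^{1/(x+i)} f'(u)\,du = f(1/(x+i)) - f(1/(x+i+1))$, so no quadrature is needed.

```python
def U(f, x, n_terms=20_000):
    """Truncated series for U f(x) = sum_i P_i(x) f(1/(x+i))."""
    return sum((x + 1.0) / ((x + i) * (x + i + 1.0)) * f(1.0 / (x + i))
               for i in range(1, n_terms + 1))


def V_of_derivative(f, df, x, n_terms=20_000):
    """Truncated series for V f'(x), with the integral of f' done exactly."""
    total = 0.0
    for i in range(1, n_terms + 1):
        upper = 1.0 / (x + i)
        lower = 1.0 / (x + i + 1.0)
        total += i / (x + i + 1.0) ** 2 * (f(upper) - f(lower))
        total += (x + 1.0) / ((x + i) ** 3 * (x + i + 1.0)) * df(upper)
    return total


def derivative_of_U(f, x, h=1e-5):
    """Central difference approximation of (U f)'(x)."""
    return (U(f, x + h) - U(f, x - h)) / (2.0 * h)


if __name__ == "__main__":
    f = lambda t: t * t
    df = lambda t: 2.0 * t
    # The two printed numbers should be negatives of each other.
    print(derivative_of_U(f, 0.4), V_of_derivative(f, df, 0.4))
```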
Proposition 2.2.1 There are positive constants v > 0.29017 and w <
0.30796, and a real-valued function ϕ ∈ C (I) such that

vϕ ≤ V ϕ ≤ wϕ.
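Proof. Let $h : \mathbf{R}_+ \to \mathbf{R}$ be a continuous bounded function such that $\lim_{x \to \infty} h(x)/x = 0$. We look for a function $g : (0,1] \to \mathbf{R}$ such that $Ug = h$, assuming that the equation
\[ Ug(x) = \sum_{i \in \mathbf{N}_+} P_i(x)\, g\!\left(\frac{1}{x+i}\right) = h(x) \tag{2.2.4} \]
holds for $x \in \mathbf{R}_+$. Then (2.2.4) yields
\[ \frac{h(x)}{x+1} - \frac{h(x+1)}{x+2} = \frac{1}{(x+1)(x+2)}\, g\!\left(\frac{1}{x+1}\right), \quad x \in \mathbf{R}_+. \]
Hence
\[ g(u) = \left(\frac{1}{u} + 1\right) h\!\left(\frac{1}{u} - 1\right) - \frac{1}{u}\, h\!\left(\frac{1}{u}\right), \quad u \in (0,1], \]
and we indeed have $Ug = h$ since
\begin{align*}
Ug(x) &= (x+1) \sum_{i \in \mathbf{N}_+} \left( \frac{h(x+i-1)}{x+i} - \frac{h(x+i)}{x+i+1} \right) \\
&= (x+1) \left( \frac{h(x)}{x+1} - \lim_{i \to \infty} \frac{h(x+i)}{x+i+1} \right) = h(x), \quad x \in \mathbf{R}_+.
\end{align*}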

In particular, for any fixed $a \in I$ we consider the function $h_a : \mathbf{R}_+ \to \mathbf{R}$ defined by
\[ h_a(x) = \frac{1}{x+a+1}, \quad x \in \mathbf{R}_+. \]
We have just seen that the function $g_a : (0,1] \to \mathbf{R}$ defined by
\[ g_a(x) = \left(\frac{1}{x} + 1\right) h_a\!\left(\frac{1}{x} - 1\right) - \frac{1}{x}\, h_a\!\left(\frac{1}{x}\right) = \frac{x+1}{ax+1} - \frac{1}{(a+1)x+1}, \quad x \in (0,1], \]
satisfies
\[ U g_a(x) = h_a(x), \quad x \in I. \]
We come to $V$ via (2.2.2). Setting
\[ \varphi_a(x) = g_a'(x) = \frac{1-a}{(ax+1)^2} + \frac{a+1}{((a+1)x+1)^2}, \quad x \in I, \]
we have
\[ V\varphi_a(x) = -(U g_a)'(x) = \frac{1}{(x+a+1)^2}, \quad x \in I. \]
Let us choose a by asking that
ϕa ϕa
(0) = (1) .
V ϕa V ϕa
This amounts to

(a + 1)3 (2a + 1) + (a − 1) (a + 2)2 = 0

or
2 (a + 1)4 − 3 (a + 1) − 2 = 0,
which yields as unique acceptable solution

a = 0.3126597 · · · .

For this value of $a$ the function $\varphi_a / V\varphi_a$ attains its maximum, equal to
$2(a+1)^2 = 3.44615\cdots$, at $x = 0$ and at $x = 1$, and has a minimum equal to

\[ m(a) = a^3 + a^2 - a + 1 + 3a(a+2)\left(1 - a - a^2\right)\left( (1-a)\,\delta + \frac{a+1}{\delta} \right) = 3.247229\cdots \]

at $x = (\delta - 1)/(1 - a(\delta - 1)) = 0.3655\cdots$, where

\[ \delta = \left( \frac{a(a+1)(a+2)}{(1-a)(1-a-a^2)} \right)^{1/3} = 1.328024\cdots. \]

It follows that for $\varphi = \varphi_a$ with $a = 0.3126597\cdots$ we have

\[ \frac{\varphi}{2(a+1)^2} \le V\varphi \le \frac{\varphi}{m(a)}, \]

that is,

\[ v\varphi \le V\varphi \le w\varphi, \]

where

\[ v = \frac{1}{2(a+1)^2} > 0.29017, \qquad w = \frac{1}{m(a)} < 0.30796. \]
□
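The constants in Proposition 2.2.1 can be reproduced numerically. The sketch below solves $2(a+1)^4 - 3(a+1) - 2 = 0$ by bisection and then scans the ratio $\varphi_a / V\varphi_a$ on a grid, using the closed form $V\varphi_a(x) = (x+a+1)^{-2}$. The bisection bracket $[0, 1]$, the grid size and the function names are assumptions made for this illustration only.

```python
def solve_a(lo=0.0, hi=1.0, iters=100):
    """Bisection for the root of 2(a+1)^4 - 3(a+1) - 2 = 0 in [lo, hi]."""
    f = lambda a: 2.0 * (a + 1.0) ** 4 - 3.0 * (a + 1.0) - 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def ratio(x, a):
    """phi_a(x) / (V phi_a)(x), with V phi_a(x) = 1/(x+a+1)^2."""
    phi = (1.0 - a) / (a * x + 1.0) ** 2 + (a + 1.0) / ((a + 1.0) * x + 1.0) ** 2
    return phi * (x + a + 1.0) ** 2

a = solve_a()
grid = [k / 100_000 for k in range(100_001)]
values = [ratio(x, a) for x in grid]
v = 1.0 / max(values)  # lower constant: V phi >= v * phi
w = 1.0 / min(values)  # upper constant: V phi <= w * phi
print(a, v, w)
```

The grid maximum is attained at the endpoints, which are grid points, so `v` equals $1/(2(a+1)^2)$ up to rounding; `w` is a grid approximation of $1/m(a)$.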

Remark. As noted by Wirsing (1974, p. 513), a better choice of $\varphi$
is $\varphi = 8\varphi_{a'} - 7\varphi_{a''}$ with $a' = 0.6247$ and $a'' = 0.7$, which yields $v = 0.3020$, $w = 0.3043$. □
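Corollary 2.2.2 Let $f_0 \in C^1(I)$ such that $f_0' > 0$. Put

\[ \alpha = \min_{x \in I} \frac{\varphi(x)}{f_0'(x)}, \qquad \beta = \max_{x \in I} \frac{\varphi(x)}{f_0'(x)}. \]

Then

\[ \frac{\alpha}{\beta}\, v^n f_0' \le V^n f_0' \le \frac{\beta}{\alpha}\, w^n f_0', \qquad n \in \mathbf{N}_+. \tag{2.2.4} \]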
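Proof. Since $V$ is a positive operator (that is, takes non-negative functions
into non-negative functions) we have

\[ v^n \varphi \le V^n \varphi \le w^n \varphi, \qquad n \in \mathbf{N}_+. \]

Noting that $\alpha f_0' \le \varphi \le \beta f_0'$ we then can write

\[ \frac{\alpha}{\beta}\, v^n f_0' \le \frac{1}{\beta}\, v^n \varphi \le \frac{1}{\beta}\, V^n \varphi \le V^n f_0' \le \frac{1}{\alpha}\, V^n \varphi \le \frac{1}{\alpha}\, w^n \varphi \le \frac{\beta}{\alpha}\, w^n f_0', \qquad n \in \mathbf{N}_+, \]

which shows that (2.2.4) holds. □
Remark. A similar result holds if $f_0 \in C^1(I)$ and $f_0' < 0$. □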
Theorem 2.2.3 (Near-optimal solution to Gauss’ problem) Let $f_0 \in C^1(I)$
such that $f_0' > 0$. For any $n \in \mathbf{N}_+$ and $x \in I$ we have

\[ \frac{(\log 2)^2\, \alpha \min_{x \in I} f_0'(x)}{2\beta}\, v^n G(x)(1 - G(x)) \le |\mu(\tau^n < x) - G(x)| \le \frac{(\log 2)^2\, \beta \max_{x \in I} f_0'(x)}{\alpha}\, w^n G(x)(1 - G(x)), \]

where $\alpha$, $\beta$, $v$, and $w$ are defined in Proposition 2.2.1 and Corollary 2.2.2.
In particular, for any $n \in \mathbf{N}_+$ and $x \in I$ we have

\[ 0.07739\, v^n G(x)(1 - G(x)) \le |\lambda(\tau^n < x) - G(x)| \le 1.49132\, w^n G(x)(1 - G(x)). \]

Proof. For any $n \in \mathbf{N}$ and $y \in I$ set $d_n(y) = \mu\left(\tau^n < e^{y \log 2} - 1\right) - y$ so
that

\[ d_n(G(x)) = \mu(\tau^n < x) - G(x), \qquad x \in I. \]

Then by (2.2.1) we have

\[ d_n(G(x)) = \int_0^x \frac{U^n f_0(u)}{u+1}\, du - G(x). \]

Differentiating twice with respect to $x$ yields

\[ d_n'(G(x))\, \frac{1}{(x+1)\log 2} = \frac{U^n f_0(x)}{x+1} - \frac{1}{(x+1)\log 2}, \]

\[ (U^n f_0(x))' = \frac{1}{(\log 2)^2}\, \frac{d_n''(G(x))}{x+1}, \qquad n \in \mathbf{N},\ x \in I. \]

Hence, by (2.2.3),

\[ d_n''(G(x)) = (-1)^n (\log 2)^2 (x+1)\, V^n f_0'(x), \qquad n \in \mathbf{N},\ x \in I. \]

Since $d_n(0) = d_n(1) = 0$, it follows from a well known interpolation formula
that

\[ d_n(y) = -\frac{y(1-y)}{2}\, d_n''(\theta), \qquad n \in \mathbf{N},\ y \in I, \]

for a suitable $\theta = \theta(n, y) \in I$. Therefore

\[ \mu(\tau^n < x) - G(x) = (-1)^{n+1} (\log 2)^2\, \frac{\theta+1}{2}\, V^n f_0'(\theta)\, G(x)(1 - G(x)) \]

for any $n \in \mathbf{N}$ and $x \in I$, and another suitable $\theta = \theta(n, x) \in I$. The result
stated follows now from Corollary 2.2.2.
In the special case $\mu = \lambda$ we have $f_0(x) = x + 1$, $x \in I$. Then with
$a = 0.3126597\cdots$ we have

\[ \alpha = \min_{x \in I} \frac{\varphi(x)}{f_0'(x)} = \frac{1-a}{(a+1)^2} + \frac{a+1}{(a+2)^2} = 0.644333\cdots, \qquad \beta = \max_{x \in I} \frac{\varphi(x)}{f_0'(x)} = 2, \]

so that

\[ \frac{(\log 2)^2 \alpha}{2\beta} = 0.07739\cdots, \qquad \frac{(\log 2)^2 \beta}{\alpha} = 1.49131\cdots. \]

The proof is complete. □


Remark. It follows from the above proof that for any $n \in \mathbf{N}$ the difference

\[ \mu(\tau^n < x) - G(x) \]

has a constant sign equal to $(-1)^{n+1}$ whatever $0 < x < 1$. □
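For $n = 1$ and $\mu = \lambda$ the bounds of Theorem 2.2.3 and the sign rule of the remark can be checked directly, since $\lambda(\tau < x) = \sum_{i \ge 1} (1/i - 1/(i+x))$. The Python sketch below is only an illustration: the truncation level, the crude tail estimate $x/(N+1)$, and the use of the quoted bounds $0.29017$ and $0.30796$ in place of the exact $v$ and $w$ are assumptions made here, not part of the text.

```python
import math

V_LOW, W_HIGH = 0.29017, 0.30796  # bounds for v and w quoted in Proposition 2.2.1

def gauss_cdf(x):
    """G(x) = log(1 + x) / log 2 on [0, 1]."""
    return math.log(1.0 + x) / math.log(2.0)

def lambda_tau_lt(x, n_terms=200_000):
    """lambda(tau < x) = sum_{i>=1} x / (i (i + x)); the tail beyond n_terms
    is approximated by x / (n_terms + 1)."""
    s = sum(x / (i * (i + x)) for i in range(1, n_terms + 1))
    return s + x / (n_terms + 1.0)

def check(x):
    """Return (lower bound, lambda(tau < x) - G(x), upper bound) for n = 1."""
    g = gauss_cdf(x)
    diff = lambda_tau_lt(x) - g
    lower = 0.07739 * V_LOW * g * (1.0 - g)
    upper = 1.49132 * W_HIGH * g * (1.0 - g)
    return lower, diff, upper

for x in (0.1, 0.5, 0.9):
    print(x, check(x))
```

For each sample point the difference should be positive, in line with the sign $(-1)^{n+1}$ for $n = 1$, and should lie between the two bounds.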

2.2.2 A functional-theoretic approach


The question naturally arises whether the operator V has an eigenvalue λ0
such that v ≤ λ0 ≤ w (see Theorem 2.2.3). This will indeed follow from the
result below.
Let B be a collection of bounded real-valued functions defined on a set
X, with the following properties: (i) B is a linear space over R; (ii) B
is complete with respect to the supremum norm, and (iii) B contains the
constant functions.
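Theorem 2.2.4 Let $V : B \to B$ be a positive bounded linear operator
and $F : B \to \mathbf{R}$ a positive bounded linear functional such that

\[ V \ge F. \tag{2.2.5} \]

Assume that there exist $\varphi \in B$ with

\[ m(\varphi) = \inf_{x \in X} \varphi(x) > 0 \]

and two positive numbers $v$ and $w$, $v \le w$, such that

\[ v \le \frac{V\varphi(x)}{\varphi(x)} \le w, \qquad x \in X, \tag{2.2.6} \]

and

\[ F(\varphi) > \left(1 - \frac{v}{w}\right) \|V\varphi\|. \tag{2.2.7} \]

Then $V$ has an eigenvalue $\lambda_0 \in [v, w]$ with corresponding positive eigenfunction
$\psi \in B$ such that

\[ \psi \ge \varphi \ge m(\varphi) > 0, \qquad 0 < w\, \frac{F(\varphi)}{\|V\varphi\|} - (w - v) \le \frac{F(\psi)}{\|\psi\|} \le \lambda_0, \]

and for any $n \in \mathbf{N}$ and $f \in B$ we have

\[ V^n f = G(f)\, \lambda_0^n\, \psi + \operatorname{osc} \frac{f}{\psi} \left(\lambda_0 - \frac{F(\psi)}{\|\psi\|}\right)^{n} \theta_n\, \psi, \tag{2.2.8} \]

where $G : B \to \mathbf{R}$ is a positive bounded linear functional with $\|G\| \le 1/m(\varphi)$,
and $\theta_n : X \to \mathbf{R}$ is a function satisfying $|\theta_n| \le 1$.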
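Proof. Define $\varphi_n = V^n \varphi$, $n \in \mathbf{N}$, $\varphi_0 = \varphi$. Since $V$ is positive, from
(2.2.6) we get

\[ v\varphi_n \le \varphi_{n+1} \le w\varphi_n, \qquad n \in \mathbf{N}. \]

It follows that

\[ \inf_{x \in X} \varphi_n(x) > 0, \qquad n \in \mathbf{N}. \]

Set $v_0 = v$, $w_0 = w$, and

\[ v_n = \inf \frac{\varphi_{n+1}}{\varphi_n}, \qquad w_n = \sup \frac{\varphi_{n+1}}{\varphi_n}, \qquad n \in \mathbf{N}_+. \]

Then

\[ v_n \varphi_n \le \varphi_{n+1} \le w_n \varphi_n, \qquad n \in \mathbf{N}, \tag{2.2.9} \]

whence

\[ v_n V\varphi_n \le V\varphi_{n+1} \le w_n V\varphi_n, \]

that is,

\[ v_n \varphi_{n+1} \le \varphi_{n+2} \le w_n \varphi_{n+1}. \]

Therefore $v_{n+1} \ge v_n$ and $w_{n+1} \le w_n$, $n \in \mathbf{N}$. We are going to improve
these inequalities.
It follows from (2.2.5) and (2.2.9) that

\[ \varphi_{n+2} - v_n \varphi_{n+1} = V(\varphi_{n+1} - v_n \varphi_n) \ge F(\varphi_{n+1} - v_n \varphi_n) \ge \frac{\varphi_{n+1}}{\|\varphi_{n+1}\|}\, F(\varphi_{n+1} - v_n \varphi_n), \]

whence

\[ v_{n+1} \ge v_n + \frac{F(\varphi_{n+1} - v_n \varphi_n)}{\|\varphi_{n+1}\|}, \qquad n \in \mathbf{N}. \tag{2.2.10} \]

Similarly,

\[ w_n \varphi_{n+1} - \varphi_{n+2} = V(w_n \varphi_n - \varphi_{n+1}) \ge F(w_n \varphi_n - \varphi_{n+1}) \ge \frac{\varphi_{n+1}}{\|\varphi_{n+1}\|}\, F(w_n \varphi_n - \varphi_{n+1}), \]

whence

\[ w_{n+1} \le w_n - \frac{F(w_n \varphi_n - \varphi_{n+1})}{\|\varphi_{n+1}\|}, \qquad n \in \mathbf{N}. \tag{2.2.10$'$} \]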
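Putting $d_n = w_n - v_n$ and $e_n = F(\varphi_n)/\|\varphi_{n+1}\|$, $n \in \mathbf{N}$, it follows from
(2.2.10) and (2.2.10$'$) that

\[ d_{n+1} \le d_n (1 - e_n), \qquad n \in \mathbf{N}, \tag{2.2.11} \]

which shows that $e_n \le 1$, $n \in \mathbf{N}$.

Now, note that (2.2.9) implies

\[ F(\varphi_{n+1}) \ge v_n F(\varphi_n) \quad \text{and} \quad \|\varphi_{n+2}\| \le w_{n+1} \|\varphi_{n+1}\|, \qquad n \in \mathbf{N}. \]

Hence

\[ e_{n+1} \ge \frac{v_n}{w_{n+1}}\, e_n, \qquad n \in \mathbf{N}. \tag{2.2.12} \]

In conjunction with (2.2.11) and (2.2.12), assumption (2.2.7), which can be
written as

\[ e_0 - \frac{d_0}{w_0} > 0, \]

ensures exponential decrease of the $d_n$, $n \in \mathbf{N}$, since

\[ w_{n+1} e_{n+1} - d_{n+1} \ge v_n e_n - d_n (1 - e_n) = w_n e_n - d_n, \qquad n \in \mathbf{N}, \]

whence

\[ w_n e_n - d_n \ge w_0 e_0 - d_0, \]

\[ 1 \ge e_n \ge \frac{1}{w_n} (w_0 e_0 - d_0) \ge e_0 - \frac{d_0}{w_0} > 0, \tag{2.2.13} \]

and

\[ d_n \le d_0 \left(1 - e_0 + \frac{d_0}{w_0}\right)^{n}, \qquad n \in \mathbf{N}. \tag{2.2.14} \]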
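Put $\lambda_0 = \lim_{n \to \infty} v_n = \lim_{n \to \infty} w_n$, and define

\[ \tilde{\varphi}_0 = \varphi_0 = \varphi, \qquad \tilde{\varphi}_n = \varphi_n (v_0 \cdots v_{n-1})^{-1}, \qquad n \in \mathbf{N}_+. \]

Then (2.2.9) amounts to

\[ \tilde{\varphi}_n \le \tilde{\varphi}_{n+1} \le \frac{w_n}{v_n}\, \tilde{\varphi}_n = \left(1 + \frac{d_n}{v_n}\right) \tilde{\varphi}_n \le \left(1 + \frac{d_n}{v_0}\right) \tilde{\varphi}_n, \qquad n \in \mathbf{N}, \tag{2.2.15} \]

and (2.2.14) implies that

\[ A = \prod_{n \in \mathbf{N}} \left(1 + \frac{d_n}{v_0}\right) < \infty. \]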
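Hence

\[ \tilde{\varphi}_n \le \prod_{i=0}^{n-1} \left(1 + \frac{d_i}{v_0}\right) \tilde{\varphi}_0 \le A\, \tilde{\varphi}_0, \qquad n \in \mathbf{N}_+. \tag{2.2.16} \]

It follows from (2.2.15) and (2.2.16) that

\[ 0 \le \tilde{\varphi}_{n+1} - \tilde{\varphi}_n \le \frac{d_n}{v_0}\, \tilde{\varphi}_n \le \frac{d_n A}{v_0}\, \tilde{\varphi}_0, \qquad n \in \mathbf{N}. \]

Therefore by (2.2.14) the series $\sum_{n \in \mathbf{N}} \|\tilde{\varphi}_{n+1} - \tilde{\varphi}_n\|$ converges. By the
completeness of $B$ the limit $\psi = \lim_{n \to \infty} \tilde{\varphi}_n$ exists. Letting $n \to \infty$ in
$v_n \tilde{\varphi}_n \le V\tilde{\varphi}_n \le w_n \tilde{\varphi}_n$, $n \in \mathbf{N}$, yields $V\psi = \lambda_0 \psi$.
Since $\tilde{\varphi}_{n+1} \ge \tilde{\varphi}_n \ge \cdots \ge \tilde{\varphi}_0 = \varphi$, we have $\psi \ge \varphi$. As
$1 \ge e_n = F(\varphi_n)/\|\varphi_{n+1}\| = F(\tilde{\varphi}_n)/\|V\tilde{\varphi}_n\|$, $n \in \mathbf{N}$, letting $n \to \infty$ yields
$1 \ge F(\psi)/(\lambda_0 \|\psi\|)$. Finally, by (2.2.13) we have

\[ \frac{F(\psi)}{\|\psi\|} = \frac{\lambda_0 F(\psi)}{\|V\psi\|} = \lim_{n \to \infty} w_n e_n \ge w_0 e_0 - d_0 = w\, \frac{F(\varphi)}{\|V\varphi\|} - w + v > 0. \]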

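To prove (2.2.8) let $f \in B$ and define $f_n = V^n f$, $n \in \mathbf{N}$, $f_0 = f$,

\[ \tilde{v}_n = \inf \frac{f_n}{\lambda_0^n \psi}, \qquad \tilde{w}_n = \sup \frac{f_n}{\lambda_0^n \psi}, \qquad n \in \mathbf{N}. \]

Hence

\[ f_{n+1} - \tilde{v}_n \lambda_0^{n+1} \psi = V(f_n - \tilde{v}_n \lambda_0^n \psi) \ge F(f_n - \tilde{v}_n \lambda_0^n \psi) \ge \frac{\psi}{\|\psi\|}\, F(f_n - \tilde{v}_n \lambda_0^n \psi), \]

which yields

\[ \tilde{v}_{n+1} \ge \tilde{v}_n + \frac{1}{\lambda_0^{n+1} \|\psi\|}\, F(f_n - \tilde{v}_n \lambda_0^n \psi) \ge \tilde{v}_n, \qquad n \in \mathbf{N}. \]

Similarly,

\[ \tilde{w}_{n+1} \le \tilde{w}_n - \frac{1}{\lambda_0^{n+1} \|\psi\|}\, F(\tilde{w}_n \lambda_0^n \psi - f_n) \le \tilde{w}_n, \qquad n \in \mathbf{N}. \]

Therefore

\[ \tilde{w}_{n+1} - \tilde{v}_{n+1} \le (\tilde{w}_n - \tilde{v}_n) \left(1 - \frac{F(\psi)}{\lambda_0 \|\psi\|}\right), \qquad n \in \mathbf{N}, \]

whence

\[ \tilde{w}_n - \tilde{v}_n \le \operatorname{osc} \frac{f}{\psi} \left(1 - \frac{F(\psi)}{\lambda_0 \|\psi\|}\right)^{n}, \qquad n \in \mathbf{N}, \]

since

\[ \tilde{w}_0 - \tilde{v}_0 = \sup \frac{f}{\psi} - \inf \frac{f}{\psi} = \operatorname{osc} \frac{f}{\psi}. \]

If we denote by $G(f)$ the common limit of $\tilde{v}_n$ and $\tilde{w}_n$ as $n \to \infty$, then we
have

\[ \tilde{v}_n,\ \tilde{w}_n = G(f) + \tilde{\theta}_n \operatorname{osc} \frac{f}{\psi} \left(1 - \frac{F(\psi)}{\lambda_0 \|\psi\|}\right)^{n}, \qquad n \in \mathbf{N}, \]

with a suitable $\tilde{\theta}_n \in \mathbf{R}$ satisfying $|\tilde{\theta}_n| \le 1$. Hence, by the very definition
of the $\tilde{v}_n$ and $\tilde{w}_n$, $n \in \mathbf{N}$, equation (2.2.8) should hold. Since

\[ |G(f)| \le \max(|\tilde{v}_0|, |\tilde{w}_0|) \le \frac{\|f\|}{\inf \psi}, \qquad f \in B, \]

it follows that

\[ \|G\| = \sup_{f \in B} \frac{|G(f)|}{\|f\|} \le \frac{1}{\inf \psi}. \]

The fact that $G$ is a positive linear functional is an immediate consequence
of equation (2.2.8). □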
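The bracketing mechanism of the proof, with $v_n = \inf \varphi_{n+1}/\varphi_n$ increasing and $w_n = \sup \varphi_{n+1}/\varphi_n$ decreasing to a common limit $\lambda_0$, can be watched in the simplest possible setting. The Python sketch below takes $X$ to be a two-point set, so that $B = \mathbf{R}^2$ and $V$ is a matrix with positive entries; this toy matrix, the starting vector and the function names are assumptions chosen for illustration and have nothing to do with the operator $V$ of Subsection 2.2.1. For this matrix one may take $F(f) = f(0) + f(1)$, so that $V \ge F$ on non-negative $f$.

```python
def apply(V, phi):
    """Matrix-vector product, i.e. the action of V on a function on X = {0, 1}."""
    return [sum(V[i][j] * phi[j] for j in range(len(phi))) for i in range(len(V))]

def bracket_eigenvalue(V, phi, steps=30):
    """Return the sequences v_n = min(phi_{n+1}/phi_n) and w_n = max(phi_{n+1}/phi_n)."""
    lows, highs = [], []
    for _ in range(steps):
        nxt = apply(V, phi)
        ratios = [b / a for a, b in zip(phi, nxt)]
        lows.append(min(ratios))
        highs.append(max(ratios))
        scale = max(nxt)
        phi = [t / scale for t in nxt]  # rescaling leaves the ratios unchanged
    return lows, highs

V = [[2.0, 1.0], [1.0, 3.0]]           # toy positive operator (assumption)
lows, highs = bracket_eigenvalue(V, [1.0, 1.0])
print(lows[-1], highs[-1])             # both close to the dominant eigenvalue
```

Here the dominant eigenvalue is $(5 + \sqrt{5})/2$, and the gap $w_n - v_n$ shrinks geometrically, mirroring (2.2.14).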
Let us show that Theorem 2.2.4 applies to Gauss’ problem as considered
in Subsection 2.2.1. The space B is Cr (I), the collection of all real-valued
functions in C (I) , and the operator V the one denoted there by the same
letter. As function ϕ we could use the function ϕa constructed in Subsection
2.2.1 with a = 0.3126597 · · · . Nevertheless, it is more convenient to use V ϕa
instead, for which the same values of $v$ and $w$ apply. Thus we take

\[ \varphi(x) = \frac{1}{(x+a+1)^2}, \qquad x \in I, \]

with $a = 0.3126597\cdots$. Finally, the functional $F$ can be constructed as
follows. Let $f \in C_r(I)$, $f \ge 0$. [Note that actually the considerations below
hold for any non-negative $f \in B(I)$.] Then

\[ Vf(x) \ge \sum_{i \in \mathbf{N}_+} \frac{i}{(x+i+1)^2} \int_{1/(x+i+1)}^{1/(x+i)} f(y)\, dy = \int_0^1 k(x, y) f(y)\, dy, \qquad x \in I, \]
where

\[ k(x, 0) = 0, \quad x \in I, \qquad k(x, y) = \frac{\lfloor y^{-1} - x \rfloor}{\left(x + \lfloor y^{-1} - x \rfloor + 1\right)^2}, \quad x \in I,\ y \in (0, 1]. \]

If $0 < y \le 1/3$ then $\lfloor y^{-1} - x \rfloor \ge 2$, and since

\[ t \mapsto t\,(t + x + 1)^{-2}, \qquad t \ge 2, \]

is a decreasing function, we have

\[ k(x, y) \ge \frac{y^{-1} - x}{(y^{-1} + 1)^2} \ge \frac{y^{-1} - 1}{(y^{-1} + 1)^2} = \frac{y(1-y)}{(y+1)^2} \]

for $x \in I$, $0 \le y \le 1/3$. If $1/3 < y \le 1/2$ then either $k(x, y) = (2 + x)^{-2}$ or
$k(x, y) = 2(3 + x)^{-2}$. Hence $k(x, y) \ge 1/9$ for $x \in I$, $1/3 < y \le 1/2$. Thus
we have $Vf \ge F(f)$, where

\[ F(f) = \int_0^{1/3} \frac{y(1-y)}{(y+1)^2}\, f(y)\, dy + \frac{1}{9} \int_{1/3}^{1/2} f(y)\, dy. \]

Elementary calculations yield

\[ F(\varphi) = \int_0^{1/3} \frac{y(1-y)\, dy}{(y+1)^2 (y+a+1)^2} + \frac{1}{9} \int_{1/3}^{1/2} \frac{dy}{(y+a+1)^2} \]

\[ = \int_0^{1/3} \left( \frac{3a+4}{a^3 (y+1)} - \frac{2}{a^2 (y+1)^2} - \frac{3a+4}{a^3 (y+a+1)} - \frac{a^2+3a+2}{a^2 (y+a+1)^2} \right) dy + \frac{1}{9} \int_{1/3}^{1/2} \frac{dy}{(y+a+1)^2} \]

\[ = \frac{3a+4}{a^3} \log \frac{4(a+1)}{3a+4} - \frac{88a^2 + 279a + 216}{18 a^2 (2a+3)(3a+4)}. \]
As $V\varphi \le w\varphi$, we have

\[ \frac{w F(\varphi)}{\|V\varphi\|} \ge \frac{F(\varphi)}{\|\varphi\|} = (a+1)^2 F(\varphi) > 0.033184. \tag{2.2.17} \]

Since $w - v < 0.01779$, inequality (2.2.7) holds. Thus Theorem 2.2.4 applies
and we have

\[ \frac{F(\psi)}{\|\psi\|} \ge (a+1)^2 F(\varphi) - (w - v) > 0.01539. \tag{2.2.18} \]
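The closed form for $F(\varphi)$ can be compared with a direct numerical quadrature of its defining integrals. The Python sketch below uses a composite Simpson rule; the rule, the number of subintervals and the function names are assumptions made here for illustration, and the value of $a$ is the truncated one quoted in the text.

```python
import math

A = 0.3126597  # value of a from Subsection 2.2.1 (truncated)

def phi(y, a=A):
    """phi(y) = 1 / (y + a + 1)^2."""
    return 1.0 / (y + a + 1.0) ** 2

def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    s = f(lo) + f(hi)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * f(lo + k * h)
    return s * h / 3.0

def F_quadrature(f):
    """F(f) = int_0^{1/3} y(1-y)/(y+1)^2 f(y) dy + (1/9) int_{1/3}^{1/2} f(y) dy."""
    first = simpson(lambda y: y * (1 - y) / (y + 1) ** 2 * f(y), 0.0, 1.0 / 3.0)
    second = simpson(f, 1.0 / 3.0, 0.5) / 9.0
    return first + second

def F_phi_closed(a=A):
    """Closed form of F(phi) quoted in the text."""
    log_term = (3 * a + 4) / a ** 3 * math.log(4 * (a + 1) / (3 * a + 4))
    rational = (88 * a ** 2 + 279 * a + 216) / (18 * a ** 2 * (2 * a + 3) * (3 * a + 4))
    return log_term - rational

print(F_quadrature(phi), F_phi_closed(), (A + 1) ** 2 * F_phi_closed())
```

Note that the closed form is a difference of two numbers of size about 10, so a few digits are lost to cancellation; double precision is still ample for checking (2.2.17).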
To state the result corresponding to Theorem 2.2.3 we should first introduce
some notation. Let

\[ \Psi(x) = \int_0^x \psi(u)\, du \]

and

\[ \tilde{\psi}(x) = \int_0^x \frac{\Psi(u) - U^\infty \Psi}{u+1}\, du, \qquad x \in I. \]

It is easy to check that

\[ \left( (x+1)\, \tilde{\psi}'(x) \right)' = \psi(x), \qquad x \in I, \]

and $\tilde{\psi}(0) = \tilde{\psi}(1) = 0$.


Remarks. 1. As noted by Wirsing (1974, p. 521), using as function $\varphi$
the function $V(8\varphi_{a'} - 7\varphi_{a''})$ with $a' = 0.6247$ and $a'' = 0.7$ one can improve
(2.2.18) to

\[ F(\psi)/\|\psi\| \ge 0.031. \]

2. Wirsing (1974, § 5) proved that the functions $\psi$ and $\tilde{\psi}$ are analytic.
Their analytic continuations are holomorphic in the whole complex plane
with a cut along the negative real axis from $-\infty$ to $-1$, which is the natural
boundary of these functions. □
Theorem 2.2.5 Let $f_0 \in C^1(I)$ (equivalently, $d\mu/d\lambda = F_0' \in C^1(I)$).
For any $n \in \mathbf{N}$ and $x \in I$ we have

\[ \left| \mu(\tau^n < x) - G(x) - (-\lambda_0)^n\, G(f_0')\, \tilde{\psi}(x) \right| \le \|\psi\| \operatorname{osc} \frac{f_0'}{\psi}\, (\log 2)^2 (\lambda_0 - 0.01539)^n\, G(x)(1 - G(x)), \]

where $\lambda_0 = 0.303\,663\,002\,898\,732\,658\cdots$,

\[ \frac{1}{(x+a+1)^2} \le \psi(x) \le \frac{3.41}{(x+a+1)^2}, \qquad x \in I, \]

with $a = 0.3126597$, and $G$ is a positive bounded functional on $C_r(I)$ such
that

\[ \|G\| \le \frac{1}{\inf \psi} \le (a+2)^2 = 5.34839\cdots. \]

In particular, for any $n \in \mathbf{N}$ and $x \in I$ we have

\[ \left| \lambda(\tau^n < x) - G(x) - (-\lambda_0)^n\, G(1)\, \tilde{\psi}(x) \right| \le 4.605\, (\lambda_0 - 0.01539)^n\, G(x)(1 - G(x)). \tag{2.2.19} \]

Proof. We use the same trick as in the proof of Theorem 2.2.3. For $n \in \mathbf{N}$
and $y \in I$ set $d_n(y) = \mu\left(\tau^n < e^{y \log 2} - 1\right) - y - (-\lambda_0)^n\, G(f_0')\, \tilde{\psi}\left(e^{y \log 2} - 1\right)$ so
that

\[ d_n(G(x)) = \mu(\tau^n < x) - G(x) - (-\lambda_0)^n\, G(f_0')\, \tilde{\psi}(x), \qquad x \in I. \]

Differentiating twice with respect to $x$ yields

\[ \frac{1}{(\log 2)^2}\, \frac{d_n''(G(x))}{x+1} = (U^n f_0)'(x) - (-\lambda_0)^n\, G(f_0') \left( (x+1)\, \tilde{\psi}'(x) \right)' = (-1)^n V^n f_0'(x) - (-\lambda_0)^n\, G(f_0')\, \psi(x). \]

Hence, by Theorem 2.2.4 and (2.2.18),

\[ \left| d_n''(G(x)) \right| \le 2\, \|\psi\| \operatorname{osc} \frac{f_0'}{\psi}\, (\log 2)^2 (\lambda_0 - 0.01539)^n, \qquad n \in \mathbf{N},\ x \in I. \]

Since $d_n(0) = d_n(1) = 0$, the first inequality in the statement follows (cf. the
proof of Theorem 2.2.3).
In principle, Theorem 2.2.4 provides the means for computing $\lambda_0$ to any
accuracy. It follows from that theorem that for any real-valued $f \in C^1(I)$
and $n \in \mathbf{N}$ we have

\[ U^n f(1) - U^n f(0) = (-1)^n \lambda_0^n\, G(f') \int_0^1 \psi\, d\lambda + (\lambda_0 - 0.01539)^n \operatorname{osc} \frac{f'}{\psi} \int_0^1 \tilde{\theta}_n\, \psi\, d\lambda \]

with a suitable $\tilde{\theta}_n : I \to \mathbf{R}$ satisfying $|\tilde{\theta}_n| \le 1$. Therefore if $f' > 0$ then

\[ \frac{U^n f(1) - U^n f(0)}{U^{n-1} f(1) - U^{n-1} f(0)} = -\lambda_0 + O\!\left( \left( \frac{\lambda_0 - 0.01539}{\lambda_0} \right)^{n} \right) \]

as $n \to \infty$. Using this equation Wirsing (1974) has obtained the value given
in the statement. Note that in Knuth (1981, p. 350) the first 20 (RCF)
digits of $\lambda_0$ are given as 3, 3, 2, 2, 3, 13, 1, 174, 1, 1, 1, 2, 2, 2, 1, 1, 1, 2, 2, 1. The
20th convergent equals

\[ \frac{227\,769\,828}{750\,074\,345}, \]

which yields 14 exact significant digits of $\lambda_0$.
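The 20th convergent can be recomputed exactly from the quoted digits with rational arithmetic. The Python sketch below is an illustration; the helper name `convergent` is introduced here and is not from the text.

```python
from fractions import Fraction

def convergent(digits):
    """Value of the regular continued fraction [d1, d2, ..., dk]
    = 1 / (d1 + 1 / (d2 + ... + 1 / dk))."""
    value = Fraction(0)
    for d in reversed(digits):
        value = 1 / (d + value)
    return value

digits = [3, 3, 2, 2, 3, 13, 1, 174, 1, 1, 1, 2, 2, 2, 1, 1, 1, 2, 2, 1]
c20 = convergent(digits)
print(c20, float(c20))
```

Since `Fraction` keeps numerator and denominator in lowest terms, the result can be compared directly with the fraction displayed above.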
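Now, we refer to the proof of Theorem 2.2.4. It is shown there that
$\varphi \le \psi \le A\varphi$, with

\[ A = \prod_{n \in \mathbf{N}} \left(1 + \frac{d_n}{v}\right), \qquad d_n \le (w - v) \left(1 - e_0 + \frac{w - v}{w}\right)^{n}, \quad n \in \mathbf{N}, \]

where in the present case $v > 0.29017$ and $w < 0.30796$. Then since by
(2.2.17) we have

\[ w e_0 = \frac{w F(\varphi)}{\|V(\varphi)\|} \ge (a+1)^2 F(\varphi) \ge 0.033184, \]

it follows that

\[ A \le \exp \frac{\sum_{n \in \mathbf{N}} d_n}{v} \le \exp \frac{w (w - v)}{v \left(w e_0 - (w - v)\right)} \le 3.409\cdots. \]

In the special case $\mu = \lambda$ we have

\[ \operatorname{osc} \frac{f_0'}{\psi} = \operatorname{osc} \frac{1}{\psi} = \frac{1}{\inf \psi} - \frac{1}{\sup \psi} \le (a+2)^2 - \frac{(a+1)^2}{3.41} = 4.843094\cdots, \]

and (2.2.19) follows. □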
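The numerical constants at the end of this proof are plain arithmetic and can be rechecked in a few lines. In the Python sketch below the quoted bounds $0.29017$, $0.30796$ and $0.033184$ are used in place of the exact quantities $v$, $w$ and $w e_0$, and the bound $\|\psi\| \le 3.41/(a+1)^2$ is the one implied by the estimate on $\psi$ in Theorem 2.2.5; the variable names are assumptions made here.

```python
import math

a = 0.3126597
v, w = 0.29017, 0.30796   # bounds from Proposition 2.2.1
we0 = 0.033184            # lower bound for w * e_0 from (2.2.17)

# Upper bound for A = prod (1 + d_n / v), via sum d_n <= w (w - v) / (w e_0 - (w - v)).
A_bound = math.exp(w * (w - v) / (v * (we0 - (w - v))))

# Upper bound for osc(1 / psi) in the case mu = lambda.
osc_bound = (a + 2) ** 2 - (a + 1) ** 2 / 3.41

# Constant in (2.2.19): (log 2)^2 * ||psi|| * osc(1/psi), with ||psi|| <= 3.41/(a+1)^2.
const = math.log(2) ** 2 * (3.41 / (a + 1) ** 2) * osc_bound

print(A_bound, osc_bound, const)
```

The three printed values should be consistent with the constants $3.409\cdots$, $4.843094\cdots$ and $4.605$ quoted in the text.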


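Theorem 2.2.6 Let $f \in C^1(I)$ be real-valued. For any $n \in \mathbf{N}$ we have

\[ \|U^n f - U^\infty f\| \le \left( \lambda_0^n \left| G(f') \right| + \operatorname{osc} \frac{f'}{\psi}\, (\lambda_0 - 0.01539)^n \right) \int_I \gamma(dx) \int_0^x \psi\, d\lambda \tag{2.2.20} \]

and

\[ \|U^n f - U^\infty f\| \ge \left( \lambda_0^n \left| G(f') \right| - \operatorname{osc} \frac{f'}{\psi}\, (\lambda_0 - 0.01539)^n \right) \int_I \gamma(dx) \int_0^x \psi\, d\lambda. \tag{2.2.21} \]

Here $G$ is a positive bounded linear functional on $C_r(I)$ with $\|G\| \le 5.34839\cdots$,
and the last inequality is meaningful for $n \in \mathbf{N}_+$ large enough.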
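Proof. It follows from (2.2.3) and (2.2.8) that

\[ U^n f(x) - U^n f(y) = (-1)^n \left( G(f')\, \lambda_0^n \int_x^y \psi\, d\lambda + \operatorname{osc} \frac{f'}{\psi}\, (\lambda_0 - 0.01539)^n \int_x^y \tilde{\theta}_n\, \psi\, d\lambda \right) \]

for any $n \in \mathbf{N}$ and $x, y \in I$ with a suitable $\tilde{\theta}_n : I \to \mathbf{R}$ satisfying $|\tilde{\theta}_n| < 1$.
Integrating over $y \in I$ with respect to $\gamma$, on account of (2.1.12) we obtain

\[ U^n f(x) - U^\infty f = (-1)^n \left( G(f')\, \lambda_0^n \int_I \gamma(dy) \int_x^y \psi\, d\lambda + \operatorname{osc} \frac{f'}{\psi}\, (\lambda_0 - 0.01539)^n \int_I \gamma(dy) \int_x^y \tilde{\theta}_n\, \psi\, d\lambda \right) \tag{2.2.22} \]

for any $n \in \mathbf{N}$ and $x \in I$. Hence (2.2.20) and (2.2.21) follow at once. For
the lower bound (2.2.21) we should note that

\[ \|U^n f - U^\infty f\| \ge |U^n f(0) - U^\infty f|. \]
□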

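Remarks. 1. Equation (2.2.22) shows that whatever $f \in C^1(I)$ the exact
rate of convergence of $U^n f(x) - U^\infty f$ to 0 as $n \to \infty$ is $O(\lambda_0^n)$ for any $x \notin E$,
where

\[ E = \left\{ x \in I : \int_I \gamma(dy) \int_x^y \psi\, d\lambda = 0 \right\}. \]

Clearly, $E$ is not empty since

\[ \int_I \gamma(dy) \int_0^y \psi\, d\lambda > 0 \quad \text{and} \quad \int_I \gamma(dy) \int_1^y \psi\, d\lambda < 0. \]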

2. By (2.1.12) and Proposition 2.0.1(i) with $\mu = \gamma$, for any $f \in C^1(I)$
we have

\[ \|U^n f - U^\infty f\| \le \operatorname{var} U^n f, \qquad n \in \mathbf{N}. \]

Next, since

\[ U^n f(1) - U^n f(0) = U^n f(1) - U^\infty f - (U^n f(0) - U^\infty f), \]

we have

\[ |U^n f(1) - U^n f(0)| \le 2\, \|U^n f - U^\infty f\|. \]

Finally, noting that by (2.2.3) we have

\[ \operatorname{var} U^n f = \int_I \left| (U^n f)' \right| d\lambda = \int_I \left| V^n f' \right| d\lambda, \]

\[ |U^n f(1) - U^n f(0)| = \left| \int_I (U^n f)'\, d\lambda \right| = \left| \int_I V^n f'\, d\lambda \right|, \]

from (2.2.8) we obtain

\[ \|U^n f - U^\infty f\| \le \left( \lambda_0^n \left| G(f') \right| + \operatorname{osc} \frac{f'}{\psi}\, (\lambda_0 - 0.01539)^n \right) \int_I \psi\, d\lambda \]

and

\[ \|U^n f - U^\infty f\| \ge \frac{1}{2} \left( \lambda_0^n \left| G(f') \right| - \operatorname{osc} \frac{f'}{\psi}\, (\lambda_0 - 0.01539)^n \right) \int_I \psi\, d\lambda \]

for any $n \in \mathbf{N}$ and any real-valued $f \in C^1(I)$.

Since

\[ \int_I \gamma(dx) \int_0^x \psi\, d\lambda < \int_I \psi\, d\lambda, \]

the upper bound for $\|U^n f - U^\infty f\|$ just derived is slightly worse than that
given in Theorem 2.2.6. The comparison of the lower bounds for $\|U^n f - U^\infty f\|$,
here and in Theorem 2.2.6, amounts to a comparison of $\int_I \psi\, d\lambda / 2$
and $\int_I \gamma(dx) \int_0^x \psi\, d\lambda$, a question we cannot answer. □
Corollary 2.2.7 The spectral radius of the operator $U - U^\infty$ in $C^1(I)$ is
equal to $\lambda_0$.
Proof. We should show that

\[ \lim_{n \to \infty} \|U^n - U^\infty\|_1^{1/n} = \lim_{n \to \infty} \left( \sup_{0 \ne f \in C^1(I)} \frac{\|U^n f - U^\infty f\|_1}{\|f\|_1} \right)^{1/n} = \lambda_0. \]

This follows easily using Theorem 2.2.6 and equations (2.2.3) and (2.2.8).
The details are left to the reader. □

2.2.3 The case of Lipschitz densities


Theorem 2.2.4 can also be used to solve Gauss’ problem in the case where
$F_0' = d\mu/d\lambda \in L(I)$. In other words, Theorem 2.2.4 enables us to study the
behaviour of $U^n$ as $n \to \infty$ assuming that the domain of $U$ is $L(I)$.
Let $f \in L(I)$. Then the derivative $f'$ exists a.e. in $I$ and is bounded by
$s(f)$. Abusing the notation, we will also denote by $f'$ the extension to $I$ of
the derivative of $f$, which is obtained by assigning the value 0 at the points
where $f$ is not differentiable.
It is obvious that the operator $V : C(I) \to C(I)$ introduced in Subsection
2.2.1 can be extended to $B(I)$ with $Vg$, $g \in B(I)$, defined by the same
formula as in the case of a continuous $g$. The point is that, as is easy to see,
equations (2.2.2) and (2.2.3) hold now a.e. in $I$, that is,

\[ (U^n f)' = (-1)^n V^n f', \qquad f \in L(I),\ n \in \mathbf{N}_+, \tag{2.2.23} \]

a.e. in $I$, with the null set of exempted points depending on $f$ and $n$.

Let us now apply Theorem 2.2.4 to our $V$ in the case where $B$ is $B_r(I)$,
the collection of all real-valued functions in $B(I)$, with the same function $\varphi$
and functional $F$ as in the case where $B = C_r(I) \subset B_r(I)$, which has been
considered in Subsection 2.2.2. It follows that the operator $V : B_r(I) \to B_r(I)$
has an eigenvalue $\lambda_0 = 0.303\,663\,002\,898\,732\,658\cdots$ with corresponding
positive eigenfunction $\psi \in C(I)$ satisfying

\[ \frac{1}{(x+a+1)^2} \le \psi(x) \le \frac{3.41}{(x+a+1)^2}, \qquad x \in I, \]

where $a = 0.3126597\cdots$, and

\[ V^n g = G(g)\, \lambda_0^n\, \psi + \operatorname{osc} \frac{g}{\psi}\, (\lambda_0 - 0.01539)^n\, \theta_n\, \psi \tag{2.2.24} \]

for any $n \in \mathbf{N}$ and $g \in B_r(I)$. Here $G : B_r(I) \to \mathbf{R}$ is a positive bounded
linear functional with $\|G\| \le (a+2)^2$ and $\theta_n : I \to \mathbf{R}$ is a function satisfying
$|\theta_n| \le 1$.
Theorem 2.2.8 Let $f \in L(I)$ be real-valued. For any $n \in \mathbf{N}_+$ we have

\[ \|U^n f - U^\infty f\| \le \left( \lambda_0^n \left| G(f') \right| + \operatorname{osc} \frac{f'}{\psi}\, (\lambda_0 - 0.01539)^n \right) \int_I \gamma(dx) \int_0^x \psi\, d\lambda \]

and

\[ \|U^n f - U^\infty f\| \ge \left( \lambda_0^n \left| G(f') \right| - \operatorname{osc} \frac{f'}{\psi}\, (\lambda_0 - 0.01539)^n \right) \int_I \gamma(dx) \int_0^x \psi\, d\lambda. \]

Here $G$ is a positive bounded functional on $B_r(I)$ with $\|G\| < 5.34839\cdots$,
and the last inequality is meaningful for $n \in \mathbf{N}_+$ large enough.
The proof is identical with that of Theorem 2.2.6. Instead of (2.2.3) and
(2.2.8) we should use (2.2.23) and (2.2.24). In particular, equation (2.2.22)
holds for $f \in L(I)$, too. □
Remark. The contents of Remarks 1 and 2 following the proof of Theorem 2.2.6
apply mutatis mutandis to the present $L(I)$ framework. □
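Corollary 2.2.9 Let $f_0 \in L(I)$ (equivalently, $d\mu/d\lambda = F_0' \in L(I)$). For
any $n \in \mathbf{N}$ and $A \in \mathcal{B}_I$ we have

\[ \left| \mu(\tau^{-n}(A)) - \gamma(A) \right| \le (1 - \log 2) \left( \lambda_0^n \left| G(f_0') \right| + \operatorname{osc} \frac{f_0'}{\psi}\, (\lambda_0 - 0.01539)^n \right) \|\psi\|\, \min(\gamma(A), 1 - \gamma(A)). \tag{2.2.25} \]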
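Proof. By Proposition 2.1.5, for any $n \in \mathbf{N}$ and $A \in \mathcal{B}_I$ we have

\[ \mu(\tau^{-n}(A)) - \gamma(A) = \int_A \frac{U^n f_0(x) - U^\infty f_0}{x+1}\, dx \tag{2.2.26} \]

since

\[ U^\infty f_0 = \int_I f_0\, d\gamma = \frac{1}{\log 2} \int_I F_0'\, d\lambda = \frac{1}{\log 2}. \]

Note that

\[ \int_I \gamma(dx) \int_0^x \psi\, d\lambda \le \frac{\|\psi\|}{\log 2} \int_0^1 \frac{x\, dx}{x+1} = \|\psi\| \left( \frac{1}{\log 2} - 1 \right) \tag{2.2.27} \]

and

\[ \mu(\tau^{-n}(A)) - \gamma(A) = \gamma(A^c) - \mu(\tau^{-n}(A^c)) \tag{2.2.28} \]

for any $n \in \mathbf{N}$ and $A \in \mathcal{B}_I$.
Now, (2.2.25) follows from (2.2.26) through (2.2.28) and Theorem 2.2.8. □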
Corollary 2.2.10 The spectral radius of the operator $U - U^\infty$ in $L(I)$
equals $\lambda_0$.
Proof. Obvious by Theorem 2.2.8. □
As an application of Theorem 2.2.8 we shall derive the asymptotic behaviour of

\[ \gamma_a(u_n^a < x), \qquad x \ge 1, \]

as $n \to \infty$ for any $a \in I$. While it is natural to think that for any $a \in I$ the
limit distribution function

\[ \lim_{n \to \infty} \gamma_a(u_n^a < x) \]

is the common distribution function $\bar{\gamma}(\bar{u}_1 < x)$, $x \ge 1$, of the extended
random variables $\bar{u}_\ell$, $\ell \in \mathbf{Z}$ (cf. the last paragraph of Subsection 1.3.3), it
is somewhat surprising to find out that the (exact) convergence rate is $O(\lambda_0^n)$
for most $a \in I$.
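Theorem 2.2.11 For any $n \in \mathbf{N}_+$ and $x \ge 1$ we have

\[ \sup_{a \in I} \left| \gamma_a(u_{n+1}^a < x) - H(x) \right| \le 3.2228\, \frac{I_{(1,\infty)}(x)}{x}\, \lambda_0^n \left(1 + (0.94932)^n\right), \tag{2.2.29} \]

where

\[ H(x) = \begin{cases} \dfrac{1}{\log 2} \left( \log x - \dfrac{x-1}{x} \right) & \text{if } 1 \le x \le 2, \\[2ex] \dfrac{1}{\log 2} \left( \log 2 - \dfrac{1}{x} \right) & \text{if } x \ge 2. \end{cases} \]

In (2.2.29), $\lambda_0$ cannot be replaced by a smaller constant, and the exact convergence
rate to 0 of the left hand side of (2.2.29) is $O(\lambda_0^n)$.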
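Proof. By Proposition 1.3.10, for any $a \in I$, $x \ge 1$, and $n \in \mathbf{N}_+$ we have

\[ \gamma_a(u_{n+1}^a < x \mid a_1, \ldots, a_n) = \left( 1 - \frac{s_n^a + 1}{x} \right) I_{(s_n^a + 1, \infty)}(x). \]

Hence

\[ \gamma_a\!\left( u_{n+1}^a \ge \frac{1}{t} \,\Big|\, a_1, \ldots, a_n \right) = 1 - \left(1 - t(s_n^a + 1)\right) I_{(s_n^a + 1, \infty)}\!\left(\frac{1}{t}\right) = \min(1, t(s_n^a + 1)) = f_t(s_n^a) \]

for any $a \in I$, $t \in (0, 1]$, and $n \in \mathbf{N}_+$, with

\[ f_t(y) = \min(1, t(y+1)), \qquad y \in I. \]

Therefore, by Proposition 2.1.10,

\[ \gamma_a\!\left( u_{n+1}^a \ge \frac{1}{t} \right) = E\!\left( \gamma_a\!\left( u_{n+1}^a \ge \frac{1}{t} \,\Big|\, a_1, \ldots, a_n \right) \right) = U^n f_t(a), \tag{2.2.30} \]

for any $a \in I$, $t \in (0, 1]$, and $n \in \mathbf{N}_+$. It is easy to check that (2.2.30) holds
for $n = 0$, too. Clearly, $f_t \in L(I)$ for any $t \in (0, 1]$, and

\[ U^\infty f_t = \int_I f_t(y)\, \gamma(dy) = \begin{cases} \dfrac{t}{\log 2} & \text{if } 0 < t \le 1/2, \\[2ex] \dfrac{1}{\log 2} \left(1 - t + \log(2t)\right) & \text{if } 1/2 \le t \le 1. \end{cases} \]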
Next, $0 \le f_t'(y) \le t\, I_{(0,1)}(t)$, $t \in (0, 1]$, $y \in I$. Hence

\[ \operatorname{osc} \frac{f_t'}{\psi} \le 5.348396\; t\, I_{(0,1)}(t) \]

and

\[ \left| G(f_t') \right| \le \|G\|\, \|f_t'\| \le 5.348396\; t\, I_{(0,1)}(t) \]

for any $t \in (0, 1]$. Finally,

\[ \int_I \gamma(dx) \int_0^x \psi\, d\lambda \le \frac{3.41}{\log 2} \int_I \left( \frac{1}{1.312659} - \frac{1}{x + 1.312659} \right) \frac{dx}{x+1} = \frac{3.41}{0.312659} \left( \frac{1}{\log 2} \log \frac{2.312659}{1.312659} - \frac{1}{1.312659} \right) \le 0.60256. \]

Consequently, Theorem 2.2.8 yields

\[ \sup_{a \in I} \left| \gamma_a\!\left( u_{n+1}^a \ge \frac{1}{t} \right) - U^\infty f_t \right| \le 3.2228\; t\, I_{(0,1)}(t)\, \lambda_0^n \left(1 + (0.94932)^n\right) \]

for any $n \in \mathbf{N}$ and $t \in (0, 1]$. Hence, by putting $1/t = x$, (2.2.29) follows.
Finally, the assertion concerning the optimality of $\lambda_0$ also follows from
Theorem 2.2.8. □
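The identification of the limit distribution can be checked numerically: integrating $f_t(y) = \min(1, t(y+1))$ against Gauss' measure $\gamma(dy) = dy / ((y+1)\log 2)$ and putting $t = 1/x$ should give $1 - H(x)$. The Python sketch below uses a midpoint rule; the quadrature rule, its resolution and the function names are assumptions made for this illustration. It also recomputes the ratio $(\lambda_0 - 0.01539)/\lambda_0$ appearing as $0.94932$ in (2.2.29).

```python
import math

LOG2 = math.log(2.0)
LAMBDA_0 = 0.303663002898732658  # value of lambda_0 quoted in the text

def H(x):
    """Limit distribution function H(x), x >= 1, from Theorem 2.2.11."""
    if x <= 2.0:
        return (math.log(x) - (x - 1.0) / x) / LOG2
    return (LOG2 - 1.0 / x) / LOG2

def U_inf_f(t, n=20_000):
    """Midpoint-rule approximation of the integral of f_t(y) = min(1, t(y+1))
    with respect to Gauss' measure on [0, 1]."""
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        y = (k + 0.5) * h
        total += min(1.0, t * (y + 1.0)) / ((y + 1.0) * LOG2)
    return total * h

for x in (1.2, 2.0, 3.5):
    print(x, 1.0 - U_inf_f(1.0 / x), H(x))  # the two values should agree

print((LAMBDA_0 - 0.01539) / LAMBDA_0)
```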
Remarks. 1. The convergence of $\lambda(u_n < x)$ to $H(x)$, $x \ge 1$, as $n \to \infty$
was first proved, in sketch form, by Doeblin (1940, p. 365) with an unspecified
convergence rate. A detailed proof following Doeblin’s suggestions was given
by Samur (1989, Lemma 4.5) together with a slower convergence rate than
that occurring in Theorem 2.2.11.
2. Theorem 2.2.8 shows that the convergence rate to 0 as $n \to \infty$ of

\[ \sup_{a \in I} \sup_{x \ge 1} \left| \gamma_a(u_{n+1}^a < x) - H(x) \right| \]

is $O(\lambda_0^n)$. It is possible for some $a \in I$ that the convergence rate to 0 as
$n \to \infty$ of

\[ \sup_{x \ge 1} \left| \gamma_a(u_{n+1}^a < x) - H(x) \right| \]

is $O(\alpha^n)$ with $0 < \alpha < \lambda_0$. It follows from equation (2.2.22), which is valid
for $f \in L(I)$ too, that this happens if and only if $a \in E$, with $E$ defined in
Remark 1 following Theorem 2.2.6. In particular, 0 and 1 do not belong to
$E$, thus

\[ \sup_{x \ge 1} \left| \lambda(u_{n+1} < x) - H(x) \right| = O(\lambda_0^n) \]

and

\[ \sup_{x \ge 1} \left| \gamma_1(u_{n+1}^1 < x) - H(x) \right| = O(\lambda_0^n) \]

as $n \to \infty$. It would be interesting to effectively determine elements of $E$. □

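The asymptotic behaviour as $n \to \infty$ of the probability density of $u_n^a$,
$n \in \mathbf{N}_+$, $a \in I$, which exists a.e. by Corollary 1.3.11, can be established
using a result to be proved later in Subsection 2.5.3. Set

\[ h(x) = \frac{dH(x)}{dx} = \begin{cases} \dfrac{x-1}{x^2 \log 2} & \text{if } 1 \le x \le 2, \\[2ex] \dfrac{1}{x^2 \log 2} & \text{if } x \ge 2. \end{cases} \]

Recalling that

\[ G(x) = \begin{cases} 0 & \text{if } x \le 0, \\[1ex] \dfrac{\log(x+1)}{\log 2} & \text{if } 0 \le x \le 1, \\[2ex] 1 & \text{if } x > 1, \end{cases} \]

it is easy to check that

\[ H(x) = \frac{1}{x} \int_0^{x-1} G(s)\, ds, \qquad x \ge 1. \]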

Corollary 1.3.11 then yields

\[ \gamma_a(u_n^a < x) - H(x) = \frac{1}{x} \int_0^{x-1} \left( G_{n-1}^a(s) - G(s) \right) ds \tag{2.2.31} \]

for any $a \in I$, $n \in \mathbf{N}_+$, and $x \ge 1$. Letting $D_x \gamma_a(u_n^a < x)$ denote any one
of the four (two for $x = 1$) unilateral derivatives of $\gamma_a(u_n^a < x)$ at $x$, we can
state the following result.
Proposition 2.2.12 For any $n \in \mathbf{N}_+$, $a \in I$, and $x \ge 1$ we have

\[ \left| D_x \gamma_a(u_n^a < x) - h(x) \right| \le \frac{k_0 \left[ \min(x-1, 1) + x\, I_{(1,2]}(x) \right]}{x^2}\, \frac{1}{F_{n-1} F_n}, \]

where $k_0$ is a constant not exceeding 14.8.
The proof follows from (2.2.31) and Theorem 2.5.5. The details are left
to the reader. □
Remark. The upper bound in Proposition 2.2.12 is $O(g^{2n})$ as $n \to \infty$
with $g = (\sqrt{5} - 1)/2$, $g^2 = (3 - \sqrt{5})/2 = 0.38196\cdots$. It is an open
problem whether this yields the optimal convergence rate. □
Theorem 2.2.11 and Proposition 2.2.12 can be restated in terms of the
approximation coefficients defined in Subsection 1.3.2. Indeed, by (1.3.6) we
have $u_{n+1} = u_{n+1}^0 = \Theta_n^{-1}$, $n \in \mathbf{N}$, and the results below are easily checked.

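Theorem 2.2.13 For any $n \in \mathbf{N}_+$ and $t \in I$ we have

\[ \left| \lambda(\Theta_n \le t) - \tilde{H}(t) \right| \le 3.2228\; t\, I_{(0,1)}(t)\, \lambda_0^n \left(1 + (0.94932)^n\right) \]

and

\[ \left| D_t \lambda(\Theta_n \le t) - \tilde{h}(t) \right| \le k_0\, \frac{\min(t^{-1} - 1, 1) + t^{-1} I_{[1/2,1)}(t)}{F_n F_{n+1}}, \]

where

\[ \tilde{H}(t) = \begin{cases} \dfrac{t}{\log 2} & \text{if } 0 \le t \le 1/2, \\[2ex] \dfrac{1}{\log 2} \left(1 - t + \log(2t)\right) & \text{if } 1/2 \le t \le 1 \end{cases} \]

and

\[ \tilde{h}(t) = \frac{d\tilde{H}}{dt} = \begin{cases} \dfrac{1}{\log 2} & \text{if } 0 \le t \le 1/2, \\[2ex] \dfrac{1}{\log 2} \left( \dfrac{1}{t} - 1 \right) & \text{if } 1/2 \le t \le 1. \end{cases} \]

Remark. The first result above improves on the convergence rate obtained
by Faivre (1998a) while the second one on that obtained by Knuth
(1984). □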

2.3 Babenko’s solution to Gauss’ problem


2.3.1 Preliminaries
Let $H_{-1/2} = H$ denote the collection of all complex-valued functions $f$ which
are holomorphic in the half-plane $\operatorname{Re} z > -1/2$, bounded in every half-plane
$\operatorname{Re} z > -1/2 + \varepsilon$, $\varepsilon > 0$, and which satisfy

\[ \int_{\mathbf{R}} \left| f\!\left( -\frac{1}{2} + iy \right) \right|^2 dy < \infty. \]

Note that $H$ is known [see Duren (1970)] as the ordinary Hardy space of
functions holomorphic in the half-plane $\operatorname{Re} z > -1/2$, which is a Hilbert
space with inner product $(\cdot, \cdot)_H$ defined by

\[ (f, g)_H = \frac{1}{2\pi} \int_{\mathbf{R}} f\!\left( -\frac{1}{2} + iy \right) g^*\!\left( -\frac{1}{2} + iy \right) dy, \qquad f, g \in H, \]

therefore a Banach space under the norm $\|\cdot\|_H$ defined by

\[ \|f\|_H = \left( \frac{1}{2\pi} \int_{\mathbf{R}} \left| f\!\left( -\frac{1}{2} + iy \right) \right|^2 dy \right)^{1/2}, \qquad f \in H. \]
R

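Let $L^2(\mathbf{R}_+, \mathcal{B}_{\mathbf{R}_+}, \lambda) = L^2(\mathbf{R}_+)$ denote the Hilbert space of square
$\lambda$-integrable functions $\varphi : \mathbf{R}_+ \to \mathbf{C}$ with the usual scalar product

\[ (\varphi, \psi) = \int_{\mathbf{R}_+} \varphi\, \psi^*\, d\lambda, \qquad \varphi, \psi \in L^2(\mathbf{R}_+), \]

and norm

\[ \|\varphi\|_2 = (\varphi, \varphi)^{1/2}, \qquad \varphi \in L^2(\mathbf{R}_+). \]

A Paley–Wiener theorem holds, giving a simple characterization of the
elements of $H$ [see Duren (1970)]: $f \in H$ if and only if there exists
$\varphi \in L^2(\mathbf{R}_+)$ such that

\[ f(z) = \int_{\mathbf{R}_+} e^{-zs - s/2}\, \varphi(s)\, ds, \qquad \operatorname{Re} z > -1/2; \]

the function $\varphi$ is unique (in the $L^2$-sense) and

\[ \|f\|_H = \|\varphi\|_2. \tag{2.3.1} \]

In other words, the linear operator $M : L^2(\mathbf{R}_+) \to H$ defined by

\[ M\varphi(z) = \int_{\mathbf{R}_+} e^{-zs - s/2}\, \varphi(s)\, ds, \qquad \varphi \in L^2(\mathbf{R}_+),\ \operatorname{Re} z > -1/2, \]

is an isometry and the image under $M$ of $L^2(\mathbf{R}_+)$ is $H$.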


Notice that in Babenko (1978) an equivalent definition of $H$ is considered.
We follow here Mayer (1991). See also Hensley [(1992, p. 344) and (1994, p.
145)].
It is easy to check that the Perron–Frobenius operator $P_\lambda$ of $\tau$ under $\lambda$
takes $H$ into itself. Obviously, for $f \in H$ we define $P_\lambda f$ by

\[ P_\lambda f(z) = \sum_{i \in \mathbf{N}_+} \frac{1}{(z+i)^2}\, f\!\left( \frac{1}{z+i} \right), \qquad \operatorname{Re} z > -1/2. \]
2.3.2 A symmetric linear operator


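Consider the linear operator $S : L^2(\mathbf{R}_+) \to L^2(\mathbf{R}_+)$ defined by

\[ S\varphi(s) = \left( \frac{1 - e^{-s}}{s} \right)^{1/2} \varphi(s), \qquad \varphi \in L^2(\mathbf{R}_+),\ s \in \mathbf{R}_+. \]

Clearly, $S$ is invertible and

\[ S^{-1}\varphi(s) = \left( \frac{s}{1 - e^{-s}} \right)^{1/2} \varphi(s), \qquad \varphi \in S\left(L^2(\mathbf{R}_+)\right),\ s \in \mathbf{R}_+. \]

Consider also the linear operator

\[ A = S M^{-1} : H \to L^2(\mathbf{R}_+) \]

with inverse

\[ A^{-1} = M S^{-1} : S\left(L^2(\mathbf{R}_+)\right) \to H. \]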

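Proposition 2.3.1 Define the symmetric linear operator $K : L^2(\mathbf{R}_+) \to L^2(\mathbf{R}_+)$
by

\[ K\varphi(s) = \int_{\mathbf{R}_+} k(s, t)\, \varphi(t)\, dt, \qquad \varphi \in L^2(\mathbf{R}_+),\ s \in \mathbf{R}_+, \]

where

\[ k(s, t) = \frac{J_1\!\left(2\sqrt{st}\right)}{\left( (e^s - 1)(e^t - 1) \right)^{1/2}}, \qquad s, t \in \mathbf{R}_+, \]

and $J_1$ is the Bessel function of order 1 defined by

\[ J_1(s) = \frac{s}{2} \sum_{k \in \mathbf{N}} \frac{(-1)^k}{k!\,(k+1)!} \left( \frac{s}{2} \right)^{2k}, \qquad s \in \mathbf{R}_+. \]

Then

\[ P_\lambda = A^{-1} K A. \tag{2.3.2} \]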

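Proof. Note first that the range of $K$ is included in $S\left(L^2(\mathbf{R}_+)\right)$.
Let $\varphi \in L^2(\mathbf{R}_+)$ and put $f = M\varphi \in H$. We have

\[ A^{-1} K A f = M S^{-1} K S \varphi. \]

But

\[ \left( S^{-1} K S \varphi \right)(s) = \left( \frac{s}{1 - e^{-s}} \right)^{1/2} \int_{\mathbf{R}_+} k(s, t) \left( \frac{1 - e^{-t}}{t} \right)^{1/2} \varphi(t)\, dt = \int_{\mathbf{R}_+} \left( \frac{s}{t} \right)^{1/2} e^{\frac{s-t}{2}}\, \frac{J_1\!\left(2\sqrt{st}\right)}{e^s - 1}\, \varphi(t)\, dt, \qquad s \in \mathbf{R}_+, \]

whence

\[ \left( M S^{-1} K S \varphi \right)(z) = \int_{\mathbf{R}_+^2} e^{-zs - \frac{t}{2}} \left( \frac{s}{t} \right)^{1/2} \frac{J_1\!\left(2\sqrt{st}\right)}{e^s - 1}\, \varphi(t)\, ds\, dt = \int_{\mathbf{R}_+} \left( \sum_{k \in \mathbf{N}_+} \frac{1}{(z+k)^2} \exp\!\left( -\frac{t}{z+k} - \frac{t}{2} \right) \right) \varphi(t)\, dt, \]

for $\operatorname{Re} z > -1/2$, on account of the identity

\[ \sum_{k \in \mathbf{N}_+} \frac{1}{(z+k)^2} \exp\!\left( -\frac{t}{z+k} \right) = \int_{\mathbf{R}_+} \left( \frac{s}{t} \right)^{1/2} e^{-zs}\, \frac{J_1\!\left(2\sqrt{st}\right)}{e^s - 1}\, ds \]

which is valid for $t \in \mathbf{R}_+$ and $\operatorname{Re} z > -1$ [see Watson (1944, formula
7.13.9)]. It remains to note that

\[ \int_{\mathbf{R}_+} \varphi(t) \exp\!\left( -\frac{t}{z+k} - \frac{t}{2} \right) dt = (M\varphi)\!\left( \frac{1}{z+k} \right) = f\!\left( \frac{1}{z+k} \right) \]

for any $k \in \mathbf{N}_+$ and $\operatorname{Re} z > -1$, to obtain

\[ \left( A^{-1} K A f \right)(z) = \sum_{k \in \mathbf{N}_+} \frac{1}{(z+k)^2}\, f\!\left( \frac{1}{z+k} \right) = (P_\lambda f)(z), \qquad \operatorname{Re} z > -1/2. \]
□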
As an integral symmetric linear operator with continuous kernel, $K$ is
a compact operator on $L^2(\mathbf{R}_+)$ with only real eigenvalues $\lambda_j$, $j \in \mathbf{N}_+$,
satisfying

\[ \lim_{j \to \infty} |\lambda_j| = 0. \]

See, e.g., Kanwal (1997, Ch. 7). Note that 0 cannot be an eigenvalue since
$K\varphi = 0$ implies that $\varphi = 0$ by the invertibility of the Hankel transform.
See, e.g., Magnus et al. (1966, Ch. 11). As usual, we order the eigenvalues
according to their absolute values, that is, $|\lambda_1| \ge |\lambda_2| \ge \cdots$, where we list
each eigenvalue according to its multiplicity. We then have

\[ K\varphi = \sum_{j \in \mathbf{N}_+} \lambda_j\, (\varphi, \varphi_j)\, \varphi_j, \qquad \varphi \in L^2(\mathbf{R}_+), \tag{2.3.3} \]

where $\varphi_j$ is a (real-valued) eigenfunction corresponding to $\lambda_j$, that is, $K\varphi_j = \lambda_j \varphi_j$,
$j \in \mathbf{N}_+$, and the $\varphi_j$, $j \in \mathbf{N}_+$, define an orthonormal system in
$L^2(\mathbf{R}_+)$. Note that this system is complete since 0 is not an eigenvalue of
$K$.
We actually can prove more about $K$. For that we recall that a linear
operator $L$ on a Banach space $B$ of norm $\|\cdot\|$ is called nuclear of order 0
(or of trace class) if and only if it can be written as

\[ Lx = \sum_{i \in I} y_i(x)\, x_i, \qquad x \in B, \]

with

\[ \sum_{i \in I} \left( \|y_i\|\, \|x_i\| \right)^r < \infty \]

for any $r > 0$. Here $I$ is a countable set while $x_i \in B$ and $y_i \in B^*$, the dual
Banach space of $B$ (consisting of all bounded linear functionals on $B$), for any
$i \in I$. Such operators have been introduced and studied by Grothendieck
(1955, 1956). They are compact and thus have discrete spectra. Moreover,
most of matrix algebra can be extended to them. In particular, one can
define the trace of such an operator as

\[ \operatorname{Tr} L = \sum_{i \in I} y_i(x_i) = \sum_{j \in \mathbf{N}_+} \lambda_j, \tag{2.3.4} \]

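where $\lambda_j$, $j \in \mathbf{N}_+$, are the eigenvalues of $L$, each of them counted with its
multiplicity. The traces of the powers $L^n$, $n \ge 2$, are also well defined. The
analog of the characteristic polynomial of a matrix for a nuclear operator of
order 0 is known as the Fredholm determinant, which is an entire function
of $z \in \mathbf{C}$ given by the formula

\[ \det(\mathrm{Id} - zL) = \prod_{j \in \mathbf{N}_+} (1 - \lambda_j z). \]

Then the equation

\[ \det(\mathrm{Id} - zL) = \exp\left( \operatorname{Tr} \log(\mathrm{Id} - zL) \right) = \exp\!\left( -\sum_{k \in \mathbf{N}_+} \frac{z^k}{k}\, \operatorname{Tr} L^k \right) \]

holds for $|z| < 1$. Hence

\[ \operatorname{Tr} L^n = \sum_{j \in \mathbf{N}_+} \lambda_j^n, \qquad n \in \mathbf{N}_+. \]

Moreover, generalized traces defined as

\[ \sum_{j \in \mathbf{N}_+} |\lambda_j|^\varepsilon \]

exist for any $\varepsilon > 0$.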
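In finite dimensions the relation between the determinant and the traces of powers, $\det(\mathrm{Id} - zL) = \exp\left(-\sum_{k \ge 1} z^k \operatorname{Tr} L^k / k\right)$, can be verified directly. The Python sketch below does this for a $2 \times 2$ matrix; the matrix, the value of $z$ (chosen so that the series converges) and the helper names are assumptions made for illustration only.

```python
import math

def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def det_via_traces(L, z, terms=200):
    """exp(-sum_{k=1}^{terms} z^k / k * Tr L^k), a truncation of the trace formula."""
    power, s = L, 0.0
    for k in range(1, terms + 1):
        s += z ** k / k * trace(power)
        power = matmul(power, L)
    return math.exp(-s)

L = [[0.5, 0.2], [0.1, 0.3]]  # toy operator (assumption)
z = 0.7
# Direct 2x2 determinant of Id - zL.
direct = (1 - z * L[0][0]) * (1 - z * L[1][1]) - (z * L[0][1]) * (z * L[1][0])
print(direct, det_via_traces(L, z))
```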


Let us finally note that in some Banach spaces every bounded linear
operator is nuclear of order 0. A typical example of such a Banach space is
A∞ (D1 ), to be defined in Subsection 2.4.3.
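Proposition 2.3.2 $K$ is a nuclear operator of trace class. Hence

\[ \sum_{j \in \mathbf{N}_+} |\lambda_j|^\varepsilon < \infty \]

for any $\varepsilon > 0$. We have

\[ \operatorname{Tr} K = \sum_{j \in \mathbf{N}_+} \lambda_j = \int_{\mathbf{R}_+} k(s, s)\, ds = \int_{\mathbf{R}_+} \frac{J_1(2s)}{e^s - 1}\, ds = 0.7711255237\cdots, \]

\[ \operatorname{Tr} K^2 = \sum_{j \in \mathbf{N}_+} \lambda_j^2 = \iint_{\mathbf{R}_+^2} k(s, t)\, k(t, s)\, ds\, dt = \iint_{\mathbf{R}_+^2} \frac{J_1^2\!\left(2\sqrt{st}\right)}{(e^s - 1)(e^t - 1)}\, ds\, dt = 1.103839654\cdots. \tag{2.3.5} \]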
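Proof. Consider the Laguerre polynomials

\[ L_n^1(s) = (n+1)! \sum_{m=0}^{n} (-1)^m \frac{s^m}{(m+1)!\, m!\, (n-m)!}, \qquad n \in \mathbf{N},\ s \in \mathbf{R}_+. \]

We have

\[ \int_{\mathbf{R}_+} s e^{-s} \left( L_n^1(s) \right)^2 ds = n + 1, \qquad n \in \mathbf{N}, \]

\[ \int_{\mathbf{R}_+} s e^{-s} L_m^1(s)\, L_n^1(s)\, ds = 0, \qquad m, n \in \mathbf{N},\ m \ne n. \]

See, e.g., Magnus et al. (1966, Ch. 5). We expand $J_1\!\left(2\sqrt{st}\right)/\sqrt{st}$, $s, t \in \mathbf{R}_+$,
in terms of the $L_n^1(s)$, $n \in \mathbf{N}$, to obtain

\[ \frac{J_1\!\left(2\sqrt{st}\right)}{\sqrt{st}} = \sum_{n \in \mathbf{N}} L_n^1(s)\, C_n(t), \qquad s, t \in \mathbf{R}_+, \]

where

\[ C_n(t) = \frac{1}{n+1} \int_{\mathbf{R}_+} s e^{-s} L_n^1(s)\, \frac{J_1\!\left(2\sqrt{st}\right)}{\sqrt{st}}\, ds = n! \sum_{m=0}^{n} \sum_{k \in \mathbf{N}} \frac{(-1)^{m+k} (m+k+1)!\, t^k}{k!\, (k+1)!\, m!\, (m+1)!\, (n-m)!} = \frac{e^{-t} t^n}{(n+1)!}, \qquad n \in \mathbf{N},\ t \in \mathbf{R}_+. \]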
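The Laguerre expansion can be checked numerically at a sample point by comparing the power series of $J_1\!\left(2\sqrt{st}\right)/\sqrt{st}$ with the truncated sum $\sum_n L_n^1(s)\, e^{-t} t^n / (n+1)!$. The Python sketch below is an illustration only; the sample point, the truncation levels and the function names are assumptions made here.

```python
import math

def laguerre1(n, s):
    """L_n^1(s) = (n+1)! * sum_{m=0}^{n} (-1)^m s^m / ((m+1)! m! (n-m)!)."""
    return math.factorial(n + 1) * sum(
        (-1) ** m * s ** m
        / (math.factorial(m + 1) * math.factorial(m) * math.factorial(n - m))
        for m in range(n + 1)
    )

def bessel_ratio(s, t, terms=40):
    """J_1(2 sqrt(st)) / sqrt(st) = sum_k (-1)^k (st)^k / (k! (k+1)!)."""
    return sum(
        (-1) ** k * (s * t) ** k / (math.factorial(k) * math.factorial(k + 1))
        for k in range(terms)
    )

def laguerre_expansion(s, t, terms=40):
    """sum_n L_n^1(s) C_n(t) with C_n(t) = exp(-t) t^n / (n+1)!."""
    return sum(
        laguerre1(n, s) * math.exp(-t) * t ** n / math.factorial(n + 1)
        for n in range(terms)
    )

s, t = 1.5, 0.7
print(bessel_ratio(s, t), laguerre_expansion(s, t))  # should agree closely
```

The coefficients $t^n/(n+1)!$ decay factorially, so a few dozen terms suffice at moderate $s$ and $t$; for large $s$ the alternating sum defining $L_n^1(s)$ loses accuracy in floating point.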

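It follows that

\[ K\varphi = \sum_{n \in \mathbf{N}} (\varphi, \beta_n)\, \alpha_n, \qquad \varphi \in L^2(\mathbf{R}_+), \tag{2.3.6} \]

where $\alpha_n, \beta_n \in L^2(\mathbf{R}_+)$ are given by

\[ \alpha_n(s) = \frac{s^{1/2} L_n^1(s)}{(e^s - 1)^{1/2}}, \qquad \beta_n(t) = \frac{t^{n+1/2} e^{-t}}{(e^t - 1)^{1/2}\, (n+1)!}, \qquad s, t \in \mathbf{R}_+. \]

To prove the first assertion we should show that

\[ \sum_{n \in \mathbf{N}} \left( \|\alpha_n\|_2\, \|\beta_n\|_2 \right)^r < \infty \]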
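for any $r > 0$. Since $(e^s - 1)^{-1} = \sum_{k \in \mathbf{N}_+} e^{-ks}$, $s \in \mathbf{R}_{++}$, the computation
of $\|\alpha_n\|_2$ reduces to that of a standard integral:

\[ \|\alpha_n\|_2^2 = \sum_{k \in \mathbf{N}_+} \int_{\mathbf{R}_+} s e^{-ks} \left( L_n^1(s) \right)^2 ds = \sum_{k \in \mathbf{N}_+} \frac{n+1}{k^{2n+2}} \sum_{p=0}^{n} \binom{n+1}{p} \binom{n}{p} (k-1)^{2p}, \]

and since $\binom{n+1}{p} \le 2^{n+1}$, $0 \le p \le n$, we obtain

\[ \|\alpha_n\|_2^2 \le 2^{n+1} (n+1) \sum_{k \in \mathbf{N}_+} \frac{\left( (k-1)^2 + 1 \right)^n}{k^{2n+2}} \le 2^{n+1} (n+1)\, \zeta(2). \]

Next, as $\int_{\mathbf{R}_+} s^m e^{-s}\, ds = m!$, $m \in \mathbf{N}$, we have

\[ \|\beta_n\|_2^2 = \frac{1}{((n+1)!)^2} \sum_{k \ge 3} \int_{\mathbf{R}_+} s^{2n+1} e^{-ks}\, ds = \frac{(2n+1)!}{((n+1)!)^2} \sum_{k \ge 3} \frac{1}{k^{2n+2}} = \frac{\binom{2n+1}{n+1}}{n+1} \sum_{k \ge 3} \frac{1}{k^{2n+2}}, \qquad n \in \mathbf{N}. \]

Since

\[ \sum_{k \ge 3} \frac{1}{k^{2n+2}} = \sum_{j=0}^{2} \sum_{\ell \in \mathbf{N}_+} \frac{1}{(3\ell + j)^{2n+2}} \le 3 \sum_{\ell \in \mathbf{N}_+} \frac{1}{(3\ell)^{2n+2}} = 3^{-2n-1}\, \zeta(2n+2) \]

and

\[ \binom{2n+1}{n+1} \le 2^{2n+1}, \qquad \zeta(2n+2) \le \zeta(2), \qquad n \in \mathbf{N}, \]

we obtain

\[ \|\beta_n\|_2^2 \le \frac{\zeta(2)}{n+1} \left( \frac{2}{3} \right)^{2n+1}, \qquad n \in \mathbf{N}. \]

Finally, for any $r > 0$ we have

\[ \sum_{n \in \mathbf{N}} \left( \|\alpha_n\|_2\, \|\beta_n\|_2 \right)^r \le \left( \frac{2}{\sqrt{3}}\, \zeta(2) \right)^{r} \sum_{n \in \mathbf{N}} \left( \left( \frac{2\sqrt{2}}{3} \right)^{r} \right)^{n} < \infty. \]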
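The formulae for $\operatorname{Tr} K$ and $\operatorname{Tr} K^2$ in the statement follow from (2.3.4)
and (2.3.6), which, as is easily checked, yield

\[ \operatorname{Tr} K = \sum_{n \in \mathbf{N}} (\alpha_n, \beta_n) = \int_{\mathbf{R}_+} k(s, s)\, ds, \]

\[ \operatorname{Tr} K^2 = \sum_{m, n \in \mathbf{N}} (\alpha_m, \beta_n)(\alpha_n, \beta_m) = \iint_{\mathbf{R}_+^2} k(s, t)\, k(t, s)\, ds\, dt. \]

Concerning the numerical values of $\operatorname{Tr} K$ and $\operatorname{Tr} K^2$ we refer the reader
to Mayer and Roepstorff (1987, Section 3). □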
Remark. There is an interesting relationship between $\operatorname{Tr}K^n$ and the non-zero fixed points of $\tau^n$ for any $n\in\mathbb{N}_+$. It can be shown [see Mayer and Roepstorff (1987, Section 3) and (1988, Section 3)] that
\[
\operatorname{Tr}K^n = \sum_{i_1,\dots,i_n\in\mathbb{N}_+}\left[x_{i_1\cdots i_n}^{-2}\prod_{k=2}^{n}x_{i_k\cdots i_n i_1\cdots i_{k-1}}^{-2} - (-1)^n\right]^{-1},
\]
with $\prod_{k=2}^{1}=1$, where $x_{i_1\cdots i_n}=[\,\overline{i_1,\dots,i_n}\,]$, $i_1,\dots,i_n\in\mathbb{N}_+$. (For notation see Subsection 1.1.3.) Clearly, these quadratic irrationalities are all non-zero solutions of the equation $\tau^n x = x$. Hence
\[
x_{i_1\cdots i_n} = \frac{1}{2q_{n-1}}\left(p_{n-1}-q_n+\bigl((p_{n-1}+q_n)^2+4(-1)^{n-1}\bigr)^{1/2}\right)
\]
for any $n\in\mathbb{N}_+$ and $i_1,\dots,i_n\in\mathbb{N}_+$. Here, as usual,
\[
\frac{p_n}{q_n} = [i_1,\dots,i_n],\qquad \mathrm{g.c.d.}(p_n,q_n)=1,\qquad n\in\mathbb{N}_+,
\]
with $p_0=0$, $q_0=1$. In particular,
\[
x_i = \left(\frac{i^2}{4}+1\right)^{1/2}-\frac{i}{2},\qquad i\in\mathbb{N}_+,
\]
\[
x_{ij} = \left(\frac{j^2}{4}+\frac{j}{i}\right)^{1/2}-\frac{j}{2},\qquad i,j\in\mathbb{N}_+ .
\]
It is asserted in Babenko (1978, p. 140) that for any $n\in\mathbb{N}_+$, in our notation, we have
\[
\operatorname{Tr}K^n = \frac{(-1)^{n-1}}{2}\sum_{i_1,\dots,i_n\in\mathbb{N}_+}\left[1-\frac{p_{n-1}+q_n}{\bigl((p_{n-1}+q_n)^2+4(-1)^{n-1}\bigr)^{1/2}}\right].
\]
For $n=1$ and $n=2$ this is in agreement with the Mayer–Roepstorff formula, as is easily checked. Clearly, Babenko's formula is much simpler than Mayer–Roepstorff's. It can be shown that it is true for any $n\in\mathbb{N}_+$. See Subsection 2.4.3.
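The agreement for $n=1$ and $n=2$ can be illustrated numerically. The sketch below assumes the closed forms for $x_i$ and $x_{ij}$ given above, together with $p_0+q_1=i$ and $p_1+q_2=ij+2$ (which follow from $[i]=1/i$ and $[i,j]=j/(ij+1)$); all function names are hypothetical helpers introduced for illustration.

```python
import math

def tau(x):
    """Continued fraction map: fractional part of 1/x."""
    y = 1.0 / x
    return y - math.floor(y)

def x1(i):
    """Fixed point of tau with all digits equal to i."""
    return math.sqrt(i * i / 4 + 1) - i / 2

def x2(i, j):
    """Period-2 point of tau with digits i, j, i, j, ..."""
    return math.sqrt(j * j / 4 + j / i) - j / 2

def mr_term_n1(i):
    """Mayer-Roepstorff summand for n = 1: [x_i^{-2} + 1]^{-1}."""
    return 1.0 / (x1(i) ** -2 + 1)

def mr_term_n2(i, j):
    """Mayer-Roepstorff summand for n = 2: [x_{ij}^{-2} x_{ji}^{-2} - 1]^{-1}."""
    return 1.0 / (x2(i, j) ** -2 * x2(j, i) ** -2 - 1)

def babenko_term(n, t):
    """Babenko summand, where t stands for p_{n-1} + q_n."""
    sign = (-1) ** (n - 1)
    return sign / 2 * (1 - t / math.sqrt(t * t + 4 * sign))

if __name__ == "__main__":
    print(mr_term_n1(1), babenko_term(1, 1))
    print(mr_term_n2(2, 3), babenko_term(2, 2 * 3 + 2))
```

The check is termwise, so it does not depend on how the infinite sums are truncated.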
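Let us finally note that by the above we have
\[
\operatorname{Tr}K = \frac{1}{2}\sum_{i\in\mathbb{N}_+}\left(1-\frac{i}{\sqrt{i^2+4}}\right)
\]
and
\[
\operatorname{Tr}K^2 = \frac{1}{2}\sum_{i,j\in\mathbb{N}_+}\left(\frac{ij+2}{\sqrt{ij(ij+4)}}-1\right)
= \frac{1}{2}\sum_{k\in\mathbb{N}_+}\left(\frac{k+2}{\sqrt{k(k+4)}}-1\right)t(k),
\]
where $t(k)$ is the number of divisors of $k$, equal to $\prod_\alpha(n_\alpha+1)$ if $1<k=\prod_\alpha p_\alpha^{n_\alpha}$ is the factorization of $k$ into distinct primes, and $t(1)=1$. □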
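The regrouping of the double sum by the value $k=ij$ can be checked on a finite range. This is a minimal sketch under the assumption that both sums are restricted to $ij\le N$ (respectively $k\le N$); the sieve for $t(k)$ and the helper names are illustrative, and the truncated values only approximate the traces because the tails decay slowly.

```python
import math

def f(k):
    """Summand (1/2) * ((k + 2) / sqrt(k (k + 4)) - 1) of Tr K^2."""
    return 0.5 * ((k + 2) / math.sqrt(k * (k + 4)) - 1)

def divisor_counts(limit):
    """t(k) for 1 <= k <= limit by a simple sieve (index 0 is unused)."""
    t = [0] * (limit + 1)
    for d in range(1, limit + 1):
        for multiple in range(d, limit + 1, d):
            t[multiple] += 1
    return t

def double_sum(limit):
    """Sum of f(i * j) over all pairs (i, j) with i * j <= limit."""
    return sum(
        f(i * j)
        for i in range(1, limit + 1)
        for j in range(1, limit // i + 1)
    )

def divisor_sum(limit):
    """Sum of t(k) f(k) over k <= limit."""
    t = divisor_counts(limit)
    return sum(t[k] * f(k) for k in range(1, limit + 1))

def trace_K_partial(terms):
    """Partial sum of Tr K = (1/2) sum_i (1 - i / sqrt(i^2 + 4))."""
    return 0.5 * sum(1 - i / math.sqrt(i * i + 4) for i in range(1, terms + 1))

if __name__ == "__main__":
    print(double_sum(500), divisor_sum(500))
    print(trace_K_partial(200000))
```

Both truncated sums for $\operatorname{Tr}K^2$ increase towards the value in (2.3.5) as the cutoff grows.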

Corollary 2.3.3 The dominant eigenvalue $\lambda_1$ of $K$ is simple and is equal to 1. The corresponding eigenfunction $\varphi_1$ is defined by
\[
\varphi_1(s) = \frac{1}{(\log 2)^{1/2}}\left(\frac{1-e^{-s}}{s}\right)^{1/2}e^{-s/2},\qquad s\in\mathbb{R}_+ .
\]

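Proof. Since $\int_{\mathbb{R}_+}s^k e^{-s}\,ds=k!$, $k\in\mathbb{N}$, we have
\[
\begin{aligned}
K\varphi_1(s) &= \frac{1}{(\log 2)^{1/2}(e^s-1)^{1/2}}\int_{\mathbb{R}_+}J_1\bigl(2\sqrt{st}\bigr)\,t^{-1/2}e^{-t}\,dt \\
&= \frac{s^{1/2}}{(\log 2)^{1/2}(e^s-1)^{1/2}}\sum_{k\in\mathbb{N}}\frac{(-1)^k s^k}{k!\,(k+1)!}\int_{\mathbb{R}_+}t^k e^{-t}\,dt \\
&= \frac{s^{1/2}(1-e^{-s})}{(\log 2)^{1/2}(e^s-1)^{1/2}\,s} = \varphi_1(s),\qquad s\in\mathbb{R}_+,
\end{aligned}
\]
and
\[
\begin{aligned}
\|\varphi_1\|_2^2 &= \frac{1}{\log 2}\int_{\mathbb{R}_+}\frac{(1-e^{-s})e^{-s}}{s}\,ds \\
&= \frac{1}{\log 2}\sum_{k\in\mathbb{N}_+}\frac{(-1)^{k+1}}{k!}\int_{\mathbb{R}_+}s^{k-1}e^{-s}\,ds \\
&= \frac{1}{\log 2}\sum_{k\in\mathbb{N}_+}\frac{(-1)^{k+1}}{k} = 1.
\end{aligned}
\]
Thus 1 is an eigenvalue of $K$ with corresponding eigenfunction $\varphi_1$. It must be the dominant eigenvalue since $\lambda_n=1$ implies $\operatorname{Tr}K^2\ge n$, which contradicts (2.3.5) unless $n=1$. It must also be simple since $\lambda_1=\lambda_2$ implies $\operatorname{Tr}K^2\ge 2$, which again contradicts (2.3.5). □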
Concerning the remaining eigenvalues $\lambda_n$, $n\ge 2$, we first have
\[
\lambda_2 = -\lambda_0 = -0.30366\ 30028\ 98732\ 65859\cdots
\]
(this follows from Theorem 2.2.5 and Theorem 2.3.5 below). Next, extensive computations [cf. Daudé et al. (1997, Section 6) and MacLeod (1993)] yield
\[
\begin{aligned}
\lambda_3 &= \phantom{-}0.10088\ 45092\ 93104\ 07530\cdots,\\
\lambda_4 &= -0.03549\ 61590\ 21659\ 84540\cdots,\\
\lambda_5 &= \phantom{-}0.01284\ 37903\ 62440\ 26481\cdots,\\
\lambda_6 &= -0.00471\ 77775\ 11571\ 03107\cdots,\\
\lambda_7 &= \phantom{-}0.00174\ 86751\ 24305\ 51191\cdots,\\
\lambda_8 &= -0.00065\ 20208\ 58320\ 50290\cdots,\\
\lambda_9 &= \phantom{-}0.00024\ 41314\ 65524\ 51581\cdots,\\
\lambda_{10} &= -0.00009\ 16890\ 83768\ 59330\cdots.
\end{aligned}
\]
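As a rough consistency check, the ten eigenvalues listed above can be summed and compared with the trace values. The sketch below simply truncates $\operatorname{Tr}K=\sum_j\lambda_j$ and $\operatorname{Tr}K^2=\sum_j\lambda_j^2$ after $\lambda_{10}$; the constant and function names are illustrative, and the digits are copied from the list above (rounded to double precision).

```python
# Eigenvalues lambda_1, ..., lambda_10 as listed in the text.
EIGENVALUES = [
    1.0,
    -0.30366300289873265859,
    0.10088450929310407530,
    -0.03549615902165984540,
    0.01284379036244026481,
    -0.00471777751157103107,
    0.00174867512430551191,
    -0.00065202085832050290,
    0.00024413146552451581,
    -0.00009168908376859330,
]

def truncated_traces(eigenvalues):
    """Return (sum of eigenvalues, sum of squared eigenvalues)."""
    return sum(eigenvalues), sum(e * e for e in eigenvalues)

if __name__ == "__main__":
    tr1, tr2 = truncated_traces(EIGENVALUES)
    print(tr1)  # truncated Tr K
    print(tr2)  # truncated Tr K^2, to be compared with (2.3.5)
```

Because the squares decay much faster than the eigenvalues themselves, the truncated $\operatorname{Tr}K^2$ is already very close to the value in (2.3.5), whereas the truncated $\operatorname{Tr}K$ is only accurate to a few digits.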

It has been conjectured in Babenko (1978) that all eigenvalues $\lambda_j$, $j\in\mathbb{N}_+$, are simple. Another conjecture [Mayer and Roepstorff (1988)] is that $(-1)^{j+1}\lambda_j>0$, $j\in\mathbb{N}_+$.

2.3.3 An ‘exact’ Gauss–Kuzmin–Lévy theorem


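Let us define the functions $\psi_j\in H$, $j\in\mathbb{N}_+$, by
\[
\psi_j(z) = \bigl(A^{-1}\varphi_j\bigr)(z) = \int_{\mathbb{R}_+}e^{-zs-s/2}\left(\frac{s}{1-e^{-s}}\right)^{1/2}\varphi_j(s)\,ds,\qquad \operatorname{Re}z>-1/2.
\]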
Note that since $\lambda_j\varphi_j=K\varphi_j$ implies
\[
|\varphi_j(s)|\le C_j\,s^{1/2}e^{-s/2},\qquad s\in\mathbb{R}_+,
\]
for some suitable $C_j\in\mathbb{R}_+$, it follows that $\psi_j$ is regular in the half-plane $\operatorname{Re}z>-1$. It is possible to show that actually the $\psi_j$, $j\in\mathbb{N}_+$, are regular outside a cut along the negative axis from $-1$ to $-\infty$, which is their natural boundary.
In particular,
\[
\psi_1(z) = \frac{1}{(\log 2)^{1/2}}\int_{\mathbb{R}_+}e^{-zs-s}\,ds
= -\frac{1}{(\log 2)^{1/2}}\left.\frac{e^{-(z+1)s}}{z+1}\right|_{0}^{\infty}
= \frac{1}{(\log 2)^{1/2}}\,\frac{1}{z+1},\qquad \operatorname{Re}z>-1. \tag{2.3.7}
\]

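Proposition 2.3.4 We have
\[
\sum_{j\in\mathbb{N}_+}|\psi_j(z)|^2 = \sum_{j\in\mathbb{N}_+}\frac{1}{(2\operatorname{Re}z+j)^2},\qquad \operatorname{Re}z>-1/2, \tag{2.3.8}
\]
\[
\max_{x\in I}|\psi_j(x)|\le\left(\frac{\pi^2}{6}-\frac{1}{4\log 2}\right)^{1/2} = 1.13325209315\cdots,\qquad j\ge 2. \tag{2.3.9}
\]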

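Proof. For any fixed $z$ with $\operatorname{Re}z>-1/2$ consider the function
\[
\varphi(s) = e^{-zs-s/2}\left(\frac{s}{1-e^{-s}}\right)^{1/2},\qquad s\in\mathbb{R}_+,
\]
which clearly belongs to $L^2(\mathbb{R}_+)$. On account of the completeness of the system $(\varphi_j)_{j\in\mathbb{N}_+}$, whose properties are described in the lines following equation (2.3.3), we can write
\[
\varphi = \sum_{j\in\mathbb{N}_+}e_j\varphi_j,
\]
where
\[
e_j = (\varphi,\varphi_j) = \psi_j(z),\qquad j\in\mathbb{N}_+ .
\]
Parseval's equation then yields
\[
\sum_{j\in\mathbb{N}_+}|e_j|^2 = \|\varphi\|_2^2 .
\]
But
\[
\begin{aligned}
\|\varphi\|_2^2 &= \int_{\mathbb{R}_+}\bigl|e^{-zs-s/2}\bigr|^2\frac{s\,ds}{1-e^{-s}}
= \int_{\mathbb{R}_+}e^{-2s\operatorname{Re}z}\frac{s\,ds}{e^{s}-1}
= \sum_{j\in\mathbb{N}_+}\int_{\mathbb{R}_+}e^{-(2\operatorname{Re}z+j)s}\,s\,ds \\
&= \sum_{j\in\mathbb{N}_+}\left.\left(-e^{-(2\operatorname{Re}z+j)s}\left(\frac{s}{2\operatorname{Re}z+j}+\frac{1}{(2\operatorname{Re}z+j)^2}\right)\right)\right|_{0}^{\infty}
= \sum_{j\in\mathbb{N}_+}\frac{1}{(2\operatorname{Re}z+j)^2},\qquad \operatorname{Re}z>-1/2,
\end{aligned}
\]
and (2.3.8) follows.
Finally, (2.3.9) follows from (2.3.7) and (2.3.8) since
\[
\min_{x\in I}\psi_1(x) = \frac{1}{2(\log 2)^{1/2}} .
\]
□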
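The constant in (2.3.9) and the way it arises from (2.3.7) and (2.3.8) can be reproduced numerically. The following is a sketch only: it evaluates the constant, and also the pointwise quantity $\bigl(\sum_j(2x+j)^{-2}-\psi_1^2(x)\bigr)^{1/2}$, which by (2.3.8) bounds $|\psi_j(x)|$ for each $j\ge 2$. The truncation length and the function names are assumptions made for illustration.

```python
import math

def psi1(x):
    """psi_1(x) = 1 / ((log 2)^{1/2} (x + 1)), from (2.3.7)."""
    return 1.0 / (math.sqrt(math.log(2)) * (x + 1))

def sum_sq(x, terms=200000):
    """Truncation of sum_j 1 / (2x + j)^2, the right-hand side of (2.3.8)."""
    return sum(1.0 / (2 * x + j) ** 2 for j in range(1, terms + 1))

def bound_2_3_9():
    """The constant (pi^2/6 - 1/(4 log 2))^{1/2} of (2.3.9)."""
    return math.sqrt(math.pi ** 2 / 6 - 1 / (4 * math.log(2)))

def pointwise_bound(x, terms=200000):
    """sqrt(sum_j 1/(2x+j)^2 - psi_1(x)^2), an upper bound for |psi_j(x)|, j >= 2."""
    return math.sqrt(sum_sq(x, terms) - psi1(x) ** 2)

if __name__ == "__main__":
    print(bound_2_3_9())
    for x in (0.0, 0.5, 1.0):
        print(x, pointwise_bound(x))
```

The uniform constant combines the largest value of the sum (attained at $x=0$) with the smallest value of $\psi_1$ (attained at $x=1$), so the pointwise quantity is never larger than it on $I$.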

Remarks. 1. It is conjectured in Babenko (1978, p. 140) that $\psi_j(0)\neq 0$ and $|\psi_j(0)|=\max_{x\in I}|\psi_j(x)|$, $j\ge 2$. Note that $\psi_2(0)\neq 0$ is implicit in Wirsing (1974).
2. If $\psi_j(0)\neq 0$ for some $j\ge 2$, then

(−1)n+1 ψj (0)
ψj (−i − [i1 , . . . , in ] + z) = n+2 (1 − λj ) z + O(1)
λj

as $z\to 0$ for any $n\in\mathbb{N}_+$, $i,i_1,\dots,i_n\in\mathbb{N}_+$, $i_n\ge 2$, with $\varepsilon<|\arg z|<\pi-\varepsilon$ whatever $\varepsilon>0$. This was proved by Wirsing (1974) for $j=2$, thus establishing the cut along the negative real axis from $-1$ to $-\infty$ as the natural boundary of the functions $\psi$ and $\tilde\psi$ in Subsection 2.2.2. (See Remark 2 before Theorem 2.2.5.) It is asserted in Babenko & Jur'ev (1978) that Wirsing's reasoning also works for any $j\ge 3$. □
We are now able to prove an 'exact' Gauss–Kuzmin–Lévy theorem for the measures $\gamma_a$, $a\in I$ (cf. Subsection 1.3.4).
Theorem 2.3.5 For any $a\in I$, $A\in\mathcal{B}_I$, and $n\in\mathbb{N}_+$ we have
\[
\gamma_a\bigl(\tau^{-n}(A)\bigr)-\gamma(A) = (a+1)\sum_{j\ge 2}\lambda_j^{n-1}\psi_j(a)\int_A\psi_j\,d\lambda. \tag{2.3.10}
\]
Next, $\int_I\psi_j\,d\lambda=0$, $j\ge 2$, and
\[
\left|\gamma_a\bigl(\tau^{-n}(A)\bigr)-\gamma(A)-(a+1)\sum_{j=2}^{\ell-1}\lambda_j^{n-1}\psi_j(a)\int_A\psi_j\,d\lambda\right|
\le\left(\frac{\pi^2\log 2}{6}-1\right)|\lambda_\ell|^{n-1}\min\bigl(\gamma(A),1-\gamma(A)\bigr)
\]
for any $a\in I$, $A\in\mathcal{B}_I$, $\ell\ge 2$, and $n\in\mathbb{N}_+$. (Clearly, $\sum_{j=2}^{1}=0$.)

Proof. For any $a\in I$ consider the function $h_a$ defined by
\[
h_a(z) = \frac{a+1}{(az+1)^2},\qquad \operatorname{Re}z>-1/2.
\]
Note that $h_0$ does not belong to $H$. Instead, the function
\[
P_\lambda h_a(z) = (a+1)\sum_{i\in\mathbb{N}_+}\frac{1}{(z+a+i)^2},\qquad \operatorname{Re}z>-1/2,
\]
does belong to $H$ for any $a\in I$.
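The closed form for $P_\lambda h_a$ can be checked against a direct application of the operator. The sketch below assumes the expression $P_\lambda f(z)=\sum_{i}(z+i)^{-2}f\bigl(1/(z+i)\bigr)$ for the Perron–Frobenius operator under $\lambda$ (as written out in Subsection 2.4.2), restricted to real arguments and truncated to finitely many terms; the names are illustrative.

```python
def h(a, x):
    """Density h_a(x) = (a + 1) / (a x + 1)^2 of gamma_a."""
    return (a + 1) / (a * x + 1) ** 2

def P_lambda(f, x, terms=20000):
    """Truncated Perron-Frobenius operator: sum_i f(1/(x+i)) / (x+i)^2."""
    return sum(f(1.0 / (x + i)) / (x + i) ** 2 for i in range(1, terms + 1))

def P_lambda_h_closed(a, x, terms=20000):
    """Truncated closed form (a + 1) * sum_i 1 / (x + a + i)^2."""
    return (a + 1) * sum(1.0 / (x + a + i) ** 2 for i in range(1, terms + 1))

if __name__ == "__main__":
    a, x = 0.3, 0.5
    print(P_lambda(lambda y: h(a, y), x), P_lambda_h_closed(a, x))
```

Since the identity holds term by term, the two truncated sums agree up to rounding regardless of the cutoff.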


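By (2.3.2) and (2.3.3) for any $g\in H$ and $n\in\mathbb{N}$ we have
\[
P_\lambda^n g = A^{-1}K^nAg = A^{-1}\Bigl(\sum_{j\in\mathbb{N}_+}\lambda_j^n(Ag,\varphi_j)\varphi_j\Bigr) = \sum_{j\in\mathbb{N}_+}\lambda_j^n(Ag,\varphi_j)\psi_j .
\]
Hence, for any $n\in\mathbb{N}_+$ and $a\in I$,
\[
P_\lambda^n h_a = P_\lambda^{n-1}(P_\lambda h_a) = \sum_{j\in\mathbb{N}_+}\lambda_j^{n-1}(AP_\lambda h_a,\varphi_j)\psi_j . \tag{2.3.11}
\]
We assert that for any $a\in I$ we have
\[
(AP_\lambda h_a)(s) = (a+1)\,e^{-s/2-as}\left(\frac{s}{1-e^{-s}}\right)^{1/2},\qquad s\in\mathbb{R}_+ . \tag{2.3.12}
\]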

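This can be checked as follows. Since $P_\lambda h_a = MS^{-1}(AP_\lambda h_a)$, we have to prove that this last equation holds with $AP_\lambda h_a$ given by (2.3.12). We have
\[
S^{-1}(AP_\lambda h_a)(s) = (a+1)\,\frac{s}{1-e^{-s}}\,e^{-s/2-as},\qquad s\in\mathbb{R}_+,
\]
\[
\begin{aligned}
MS^{-1}(AP_\lambda h_a)(z) &= (a+1)\int_{\mathbb{R}_+}\frac{s e^{-s}e^{-(z+a)s}}{1-e^{-s}}\,ds \\
&= (a+1)\sum_{j\in\mathbb{N}_+}\int_{\mathbb{R}_+}s e^{-(z+j+a)s}\,ds \\
&= (a+1)\sum_{j\in\mathbb{N}_+}\frac{1}{(z+j+a)^2} \\
&= P_\lambda h_a(z),\qquad \operatorname{Re}z>-1/2.
\end{aligned}
\]
Thus (2.3.12) holds and we then have
\[
(AP_\lambda h_a,\varphi_j) = (a+1)\,\psi_j(a),\qquad a\in I,\ j\in\mathbb{N}_+ . \tag{2.3.13}
\]
Therefore (2.3.11) and (2.3.13) imply that
\[
P_\lambda^n h_a = (a+1)\sum_{j\in\mathbb{N}_+}\lambda_j^{n-1}\psi_j(a)\,\psi_j,\qquad a\in I,\ n\in\mathbb{N}_+ .
\]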

The last equation holds in $H$. By (2.3.9), Proposition 2.3.2, and Corollary 2.3.3, the series $\sum_{j\in\mathbb{N}_+}\lambda_j^{n-1}\psi_j(a)\psi_j$ is uniformly and absolutely convergent in $I$ for any $a\in I$ and $n\in\mathbb{N}_+$. Hence whatever $a\in I$ and $n\in\mathbb{N}_+$ by (2.3.7) we have
\[
P_\lambda^n h_a(x)-\frac{1}{(x+1)\log 2} = (a+1)\sum_{j\ge 2}\lambda_j^{n-1}\psi_j(a)\,\psi_j(x),\qquad x\in I. \tag{2.3.14}
\]
Equation (2.3.10) follows by integrating the last equation over $A\in\mathcal{B}_I$ since by the very definition of the Perron–Frobenius operator we can write
\[
\int_A P_\lambda^n h_a\,d\lambda = \int_{\tau^{-n}(A)}h_a\,d\lambda = \int_{\tau^{-n}(A)}d\gamma_a = \gamma_a\bigl(\tau^{-n}(A)\bigr),\qquad n\in\mathbb{N}.
\]
Since
\[
\int_I\gamma(da)\,\gamma_a\bigl(\tau^{-n}(A)\bigr) = \gamma\bigl(\tau^{-n}(A)\bigr) = \gamma(A),\qquad n\in\mathbb{N},\ A\in\mathcal{B}_I,
\]
if we divide equation (2.3.10) by $(a+1)(\log 2)$ and integrate the equation obtained over $a\in I$, then we obtain
\[
0 = \sum_{j\ge 2}\lambda_j^{n-1}\int_I\psi_j\,d\lambda\int_A\psi_j\,d\lambda,\qquad n\in\mathbb{N}_+,\ A\in\mathcal{B}_I .
\]
Taking $A=I$ and $n=1$ we deduce that
\[
\int_I\psi_j\,d\lambda = 0,\qquad j\ge 2.
\]

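Finally, for $a\in I$, $A\in\mathcal{B}_I$, $\ell\ge 2$, and $n\in\mathbb{N}_+$ set
\[
D_{a,\ell,n}(A) = D(A) = \left|\gamma_a\bigl(\tau^{-n}(A)\bigr)-\gamma(A)-(a+1)\sum_{j=2}^{\ell-1}\lambda_j^{n-1}\psi_j(a)\int_A\psi_j\,d\lambda\right|
\]
and note that $D(A)=D(I\setminus A)$. It follows from (2.3.10) that
\[
\begin{aligned}
D(A) &\le (a+1)|\lambda_\ell|^{n-1}\int_A\Bigl(\sum_{j\ge\ell}|\psi_j(a)|\,|\psi_j(x)|\Bigr)dx \\
&\le (a+1)|\lambda_\ell|^{n-1}\int_A\Bigl(\sum_{j\ge\ell}\psi_j^2(a)\Bigr)^{1/2}\Bigl(\sum_{j\ge\ell}\psi_j^2(x)\Bigr)^{1/2}dx \\
&= (\log 2)\,|\lambda_\ell|^{n-1}\int_A\Bigl((a+1)^2\sum_{j\ge\ell}\psi_j^2(a)\Bigr)^{1/2}\Bigl((x+1)^2\sum_{j\ge\ell}\psi_j^2(x)\Bigr)^{1/2}\gamma(dx).
\end{aligned}
\]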

Now, equation (2.3.8) implies
\[
(a+1)^2\sum_{j\ge\ell}\psi_j^2(a)\le(a+1)^2\Bigl(\sum_{j\in\mathbb{N}_+}\frac{1}{(2a+j)^2}-\frac{1}{(a+1)^2\log 2}\Bigr)\le\zeta(2)-\frac{1}{\log 2} \tag{2.3.15}
\]
for any $a\in I$ and $\ell\ge 2$. (The last inequality can be easily checked.) We therefore obtain
\[
D(A)\le\left(\frac{\pi^2\log 2}{6}-1\right)|\lambda_\ell|^{n-1}\gamma(A).
\]
Since $D(A)=D(I\setminus A)$ we conclude that
\[
D(A)\le\left(\frac{\pi^2\log 2}{6}-1\right)|\lambda_\ell|^{n-1}\min\bigl(\gamma(A),1-\gamma(A)\bigr).
\]
Note that
\[
\frac{\pi^2\log 2}{6}-1 = 0.14018\cdots = \varepsilon_2
\]
(cf. Subsection 1.3.6). □
Corollary 2.3.6 For any $a,x\in I$, $n\in\mathbb{N}_+$, and $\ell\ge 2$ we have
\[
\gamma_a(\tau^n<x)-\gamma([0,x]) = (a+1)\sum_{j\ge 2}\lambda_j^{n-1}\psi_j(a)\int_0^x\psi_j\,d\lambda,
\]
\[
\frac{d}{dx}\gamma_a(\tau^n<x)-\frac{1}{(x+1)\log 2} = (a+1)\sum_{j\ge 2}\lambda_j^{n-1}\psi_j(a)\,\psi_j(x),
\]
\[
\left|\gamma_a(\tau^n<x)-\gamma([0,x])-(a+1)\sum_{j=2}^{\ell-1}\lambda_j^{n-1}\psi_j(a)\int_0^x\psi_j\,d\lambda\right|
\le\left(\frac{\pi^2\log 2}{6}-1\right)|\lambda_\ell|^{n-1}\left(\frac12-\left|\frac12-\gamma([0,x])\right|\right),
\]
\[
\left|\frac{d}{dx}\gamma_a(\tau^n<x)-\frac{1}{(x+1)\log 2}-(a+1)\sum_{j=2}^{\ell-1}\lambda_j^{n-1}\psi_j(a)\,\psi_j(x)\right|
\le\left(\frac{\pi^2}{6}-\frac{1}{\log 2}\right)|\lambda_\ell|^{n-1}\frac{1}{x+1}.
\]
Next (cf. Corollary 1.2.5), for any $a\in I$, $n,k\in\mathbb{N}_+$, and $i^{(k)}\in\mathbb{N}_+^k$ we have
\[
\left|\frac{\gamma_a\bigl((a_{n+1},\dots,a_{n+k})=i^{(k)}\bigr)}{\gamma\bigl([u(i^{(k)}),v(i^{(k)})]\bigr)}-1\right|\le\left(\frac{\pi^2\log 2}{6}-1\right)\lambda_0^{n-1},
\]
which for $k=1$ reduces to
\[
\left|\frac{\gamma_a(a_{n+1}=i)}{(\log 2)^{-1}\log\bigl(1+1/i(i+2)\bigr)}-1\right|\le\left(\frac{\pi^2\log 2}{6}-1\right)\lambda_0^{n-1},
\]
for any $a\in I$ and $i,n\in\mathbb{N}_+$.
Proof. The first equation is (2.3.10) for $A=[0,x)$, $x\in I$, while the second one is simply (2.3.14). (Clearly, the latter can be obtained from the former by differentiation.) The first inequality is that occurring in Theorem 2.3.5 for $A=[0,x)$, $x\in I$, while the second one is easily obtained using (2.3.15). Finally, the last inequality (the general case) is that occurring in Theorem 2.3.5 for $A=[u(i^{(k)}),v(i^{(k)})]$ and $\ell=2$. □
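To illustrate the case $k=1$, the sketch below evaluates the limiting probabilities $(\log 2)^{-1}\log\bigl(1+1/i(i+2)\bigr)$ and the relative error bound $\varepsilon_2\lambda_0^{n-1}$. The numerical value of $\lambda_0$ is the one quoted in Subsection 2.3.2; the function names are hypothetical and chosen for illustration only.

```python
import math

LAMBDA0 = 0.30366300289873265859               # Wirsing's constant, as quoted in the text
EPS2 = math.pi ** 2 * math.log(2) / 6 - 1      # = 0.14018...

def gauss_digit_prob(i):
    """gamma(a_1 = i) = log(1 + 1/(i (i + 2))) / log 2."""
    return math.log1p(1.0 / (i * (i + 2))) / math.log(2)

def relative_error_bound(n):
    """Bound eps_2 * lambda_0^(n-1) on |gamma_a(a_{n+1} = i) / gamma(a_1 = i) - 1|."""
    return EPS2 * LAMBDA0 ** (n - 1)

if __name__ == "__main__":
    for i in range(1, 6):
        print(i, gauss_digit_prob(i))
    for n in (1, 5, 10):
        print(n, relative_error_bound(n))
```

The partial sums of the probabilities telescope to $\log_2\bigl(2(N+1)/(N+2)\bigr)$, which tends to 1 as $N\to\infty$; the bound itself is uniform in both $a$ and $i$.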
It is interesting to compare Theorem 2.2.5 (with $\mu=\gamma_a$, $a\in I$) and Corollary 2.3.6. It is easy to see that for any $a,x\in I$ we have
\[
-\lambda_0\,G(f_a')\,\tilde\psi(x) = \psi_2(a)\int_0^x\psi_2\,d\lambda, \tag{2.3.16}
\]
where
\[
f_a(x) = \frac{x+1}{(ax+1)^2},\qquad a,x\in I.
\]
Differentiating (2.3.16) with respect to $x$ and then putting $x=a$ yield
\[
\psi_2^2(a) = -\lambda_0\,G(f_a')\,\tilde\psi'(a),\qquad a\in I.
\]
In particular, $\psi_2^2(0)=-\lambda_0G(1)\tilde\psi'(0)=\lambda_0G(1)U^\infty\Psi\neq 0$ (since $G(1)>0$). Now, it follows from (2.3.16) that for any $x\in I$ such that $\tilde\psi'(x)\neq 0$ the ratio $\psi_2(x)/\tilde\psi'(x)$ has a constant value equal to $-(\operatorname{sgn}\psi_2(0))\bigl(\lambda_0G(1)/U^\infty\Psi\bigr)^{1/2}$, and that for any $a\in I$ such that $\psi_2(a)\neq 0$ the ratio $G(f_a')/\psi_2(a)$ has a constant value equal to $G(1)/\psi_2(0)$. Then
\[
\tilde\psi(x) = -(\operatorname{sgn}\psi_2(0))\left(\frac{U^\infty\Psi}{\lambda_0G(1)}\right)^{1/2}\int_0^x\psi_2\,d\lambda
\]
and
\[
\psi_2(x) = -(\operatorname{sgn}\psi_2(0))\left(\frac{\lambda_0G(1)}{U^\infty\Psi}\right)^{1/2}\tilde\psi'(x)
\]
for any $x\in I$.
Remark. It follows from Corollary 2.3.6 that the exact convergence rate to 0 as $n\to\infty$ of
\[
\sup_{x\in I}\,\bigl|\gamma_a(\tau^n<x)-\gamma([0,x])\bigr|,\qquad a\in I, \tag{2.3.17}
\]
is $O(\lambda_0^n)$ as long as $\psi_2(a)\neq 0$. In particular this holds for $a=0$ since, as we have just shown, $\psi_2(0)\neq 0$. If $\psi_2(a)=\cdots=\psi_{j-1}(a)=0$ and $\psi_j(a)\neq 0$ for some $j\ge 3$, then the exact convergence rate to 0 as $n\to\infty$ of (2.3.17) is $O(\lambda_j^n)$.
The high accuracy computations of MacLeod (1993) show, however, that the only possible value of $j$ is $j=3$, since there exists a unique $a\in I$, very close to 0.4, with $\psi_2(a)=0$ while $\psi_3(a)\neq 0$. □

2.3.4 ψ-mixing revisited


Theorem 2.3.5 allows for an important improvement of Corollary 1.3.15. With the notation in Subsection 1.3.6, it follows from Theorem 2.3.5 that
\[
\varepsilon_{n+1}\le\left(\frac{\pi^2\log 2}{6}-1\right)\lambda_0^{n-1},\qquad n\in\mathbb{N}_+ . \tag{2.3.18}
\]
It is easy to check that for $n=1$ we actually have equality in (2.3.18), that is,
\[
\varepsilon_2 = \frac{\pi^2\log 2}{6}-1 = 0.14018\cdots,
\]
in accordance with the result obtained in Subsection 1.3.6.
We can thus reformulate Corollary 1.3.15 as follows.
Proposition 2.3.7 The sequence $(a_n)_{n\in\mathbb{N}_+}$ is $\psi$-mixing under $\gamma$ and any $\gamma_a$, $a\in I$. For any $a\in I$ we have $\psi_{\gamma_a}(1)\le 0.61231\cdots$ and
\[
\psi_{\gamma_a}(n)\le\frac{\varepsilon_2\lambda_0^{n-2}(1+\lambda_0)}{1-\varepsilon_2\lambda_0^{n-1}},\qquad n\ge 2.
\]
In particular $\psi_{\gamma_a}(2)\le\varepsilon_2(1+\lambda_0)/(1-\varepsilon_2\lambda_0)=0.19087\cdots$ for any $a\in I$. Also, $\psi_\gamma(1)=\varepsilon_1=2\log 2-1=0.38629\cdots$, $\psi_\gamma(2)=\varepsilon_2=0.14018\cdots$, and
\[
\psi_\gamma(n)\le\varepsilon_2\lambda_0^{n-2},\qquad n\ge 3.
\]
The doubly infinite sequence $(\bar a_\ell)_{\ell\in\mathbb{Z}}$ of extended incomplete quotients is $\psi$-mixing under the extended Gauss measure $\bar\gamma$ and its $\psi$-mixing coefficients are equal to the corresponding $\psi$-mixing coefficients under $\gamma$ of $(a_n)_{n\in\mathbb{N}_+}$.
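The numerical constants in Proposition 2.3.7 can be reproduced directly from the closed forms. The following is a small sketch using the value of $\lambda_0$ quoted in Subsection 2.3.2; the function names are illustrative and not taken from the text.

```python
import math

LAMBDA0 = 0.30366300289873265859             # Wirsing's constant, as quoted in the text
EPS1 = 2 * math.log(2) - 1                   # psi_gamma(1) = 0.38629...
EPS2 = math.pi ** 2 * math.log(2) / 6 - 1    # psi_gamma(2) = 0.14018...

def psi_bound_gamma_a(n):
    """Upper bound on psi_{gamma_a}(n) for n >= 2, uniform in a."""
    if n < 2:
        raise ValueError("the closed-form bound is stated for n >= 2")
    return EPS2 * LAMBDA0 ** (n - 2) * (1 + LAMBDA0) / (1 - EPS2 * LAMBDA0 ** (n - 1))

def psi_bound_gamma(n):
    """Upper bound eps_2 * lambda_0^(n-2) on psi_gamma(n) for n >= 3."""
    if n < 3:
        raise ValueError("the bound is stated for n >= 3")
    return EPS2 * LAMBDA0 ** (n - 2)

if __name__ == "__main__":
    print(EPS1, EPS2)
    print(psi_bound_gamma_a(2))
    print([psi_bound_gamma(n) for n in range(3, 8)])
```

Both bounds decay geometrically at rate $\lambda_0$, so each additional step of separation reduces them by a factor of roughly 0.3.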
Remark. From Theorem 2.3.5 we can also obtain a formula expressing the $\psi$-mixing coefficients $\psi_\gamma(n)$, $n\ge 2$, in terms of the eigenvalues $\lambda_j$ and functions $\psi_j$, $j\ge 2$, as
\[
\psi_\gamma(n+1) = (\log 2)\sup_{a,b\in I}\,(a+1)(b+1)\Bigl|\sum_{j\ge 2}\lambda_j^{n-1}\psi_j(a)\,\psi_j(b)\Bigr|,\qquad n\in\mathbb{N}_+ .
\]
It is not difficult to check that the above formula yields $\psi_\gamma(2)=\varepsilon_2$. Otherwise it seems to be of little value. □

2.4 Extending Babenko’s and Wirsing’s work


2.4.1 The Mayer–Roepstorff Hilbert space approach
In this subsection we describe the setting devised by Mayer and Roepstorff
(1987) for Babenko’s work which is thus simplified and extended. Proofs
are in general not given, and for them the reader is referred to the original
paper.
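Let $m$ denote the measure on $\mathcal{B}_{\mathbb{R}_+}$ with density
\[
\frac{dm}{dt} = \frac{t}{e^t-1},\qquad t\in\mathbb{R}_+ .
\]
Note that
\[
m(\mathbb{R}_+) = \int_{\mathbb{R}_+}t\sum_{k\in\mathbb{N}_+}e^{-kt}\,dt = \sum_{k\in\mathbb{N}_+}\frac{1}{k^2} = \zeta(2).
\]
Consider the Hilbert space $L^2(\mathbb{R}_+,\mathcal{B}_{\mathbb{R}_+},m)=L_m^2(\mathbb{R}_+)$ of $m$-square integrable functions $f:\mathbb{R}_+\to\mathbb{C}$ with inner product $(\cdot,\cdot)_m$ defined by
\[
(\varphi,\psi)_m = \int_{\mathbb{R}_+}\varphi\psi^*\,dm,\qquad \varphi,\psi\in L_m^2(\mathbb{R}_+),
\]
and norm
\[
\|\varphi\|_{2,m} = \left(\int_{\mathbb{R}_+}|\varphi|^2\,dm\right)^{1/2},\qquad \varphi\in L_m^2(\mathbb{R}_+).
\]
Let $D$ denote the half-plane $\operatorname{Re}z>-1/2$ and consider the measure $\nu$ on $\mathcal{B}_D$ with density
\[
\frac{d\nu}{dx\,dy} =
\begin{cases}
\dfrac{1}{\pi}\,\dfrac{1}{(x+1)^2+y^2} & \text{if } -\dfrac12<x<0,\ y\in\mathbb{R},\\[2mm]
0 & \text{otherwise.}
\end{cases}
\]
Note that
\[
\nu(D) = \frac{1}{\pi}\int_{-1/2}^{0}dx\int_{\mathbb{R}}\frac{dy}{(x+1)^2+y^2} = \int_{-1/2}^{0}\frac{dx}{x+1} = \log 2.
\]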
Consider the Hilbert space $H^2(\nu)$ of functions $f$ holomorphic in $D$ such that $\bigl|(z+1)^{-1}f(z)\bigr|$ is bounded in every half-plane $\operatorname{Re}z>-1/2+\varepsilon$, $\varepsilon>0$, and
\[
\|f\|_{2,\nu} = \left(\int_D|f|^2\,d\nu\right)^{1/2}<\infty,
\]
with inner product $(\cdot,\cdot)_\nu$ defined by
\[
(f,g)_\nu = \int_D fg^*\,d\nu,\qquad f,g\in H^2(\nu).
\]
Thus $H^2(\nu)$ is a Banach space under the norm $\|\cdot\|_{2,\nu}$.
Let $\tilde f$ denote the restriction of $f\in H^2(\nu)$ to $I$. Then
\[
U^\infty\tilde f = \int_I\tilde f\,d\gamma = \frac{(f,1)_\nu}{\log 2} \tag{2.4.1}
\]
and
\[
\|\tilde f\|_{2,\gamma}\le\|f\|_{2,\nu}. \tag{2.4.2}
\]
Next, the linear mapping $M:L_m^2(\mathbb{R}_+)\to H^2(\nu)$ defined by
\[
M\varphi(z) = (z+1)\int_{\mathbb{R}_+}e^{-zt}\varphi(t)\,m(dt),\qquad \varphi\in L_m^2(\mathbb{R}_+),\ z\in D,
\]
is an isometry and the image under $M$ of $L_m^2(\mathbb{R}_+)$ is $H^2(\nu)$.
The Perron–Frobenius operator $U$ takes $H^2(\nu)$ into itself. Obviously, for $f\in H^2(\nu)$ we define $Uf$ by
\[
Uf(z) = \sum_{i\in\mathbb{N}_+}P_i(z)\,f\!\left(\frac{1}{z+i}\right),\qquad z\in D.
\]
The mapping $\hat K:\varphi\to\hat K\varphi$, where
\[
\hat K\varphi(s) = \int_{\mathbb{R}_+}\frac{J_1\bigl(2\sqrt{st}\bigr)}{\sqrt{st}}\,\varphi(t)\,m(dt),\qquad \varphi\in L_m^2(\mathbb{R}_+),\ s\in\mathbb{R}_+,
\]
defines on $L_m^2(\mathbb{R}_+)$ an integral symmetric linear operator with continuous kernel
\[
\hat k(s,t) = \frac{J_1\bigl(2\sqrt{st}\bigr)}{\sqrt{st}} = \sum_{n\in\mathbb{N}}\frac{(-1)^n}{n!\,(n+1)!}(st)^n,\qquad s,t\in\mathbb{R}_+ .
\]
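$\hat K$ has infinite-dimensional range, is nuclear (of trace class) and, therefore, compact. The spectra of the operators $\hat K$ and $K$ (introduced in Subsection 2.3.2) coincide. Thus with the notation from Subsection 2.3.2 for the eigenvalues of $K$ we have
\[
\hat K\varphi = \sum_{k\in\mathbb{N}_+}\lambda_k(\varphi,\hat\varphi_k)_m\,\hat\varphi_k,\qquad \varphi\in L_m^2(\mathbb{R}_+), \tag{2.4.3}
\]
where $\hat\varphi_k$ is an eigenfunction corresponding to $\lambda_k$, that is, $\hat K\hat\varphi_k=\lambda_k\hat\varphi_k$, $k\in\mathbb{N}_+$, and the $\hat\varphi_k$, $k\in\mathbb{N}_+$, define an orthonormal basis in $L_m^2(\mathbb{R}_+)$. Actually,
\[
\hat\varphi_k(t) = t^{-1/2}\bigl(e^t-1\bigr)^{1/2}\varphi_k(t),\qquad k\in\mathbb{N}_+,\ t\in\mathbb{R}_+,
\]
where the $\varphi_k$, $k\in\mathbb{N}_+$, are those introduced in Subsection 2.3.2.
The operators $M$, $\hat K$ and $U$ are connected by the equation $U=M\hat KM^{-1}$. Hence
\[
U^n = M\hat K^nM^{-1},\qquad n\in\mathbb{N}_+ . \tag{2.4.4}
\]
From (2.4.3) we have
\[
\hat K^n\varphi = \sum_{k\in\mathbb{N}_+}\lambda_k^n(\varphi,\hat\varphi_k)_m\,\hat\varphi_k,\qquad n\in\mathbb{N}_+,\ \varphi\in L_m^2(\mathbb{R}_+). \tag{2.4.5}
\]
It then follows from (2.4.4) and (2.4.5) that
\[
U^ng = \sum_{k\in\mathbb{N}_+}\lambda_k^n(M^{-1}g,\hat\varphi_k)_m\,M\hat\varphi_k,\qquad n\in\mathbb{N}_+,\ g\in H^2(\nu).
\]
Alternatively,
\[
U^ng = \sum_{k\in\mathbb{N}_+}\lambda_k^n(g,M\hat\varphi_k)_\nu\,M\hat\varphi_k,\qquad n\in\mathbb{N}_+,\ g\in H^2(\nu).
\]
For $k=1$ we have $\lambda_1=1$ and
\[
\hat\varphi_1(t) = \frac{1}{(\log 2)^{1/2}}\,t^{-1}\bigl(e^t-1\bigr)e^{-t},\qquad t\in\mathbb{R}_+ .
\]
Therefore
\[
M\hat\varphi_1(z) = (z+1)\int_{\mathbb{R}_+}e^{-zt}\hat\varphi_1(t)\,m(dt)
= \frac{z+1}{(\log 2)^{1/2}}\int_{\mathbb{R}_+}e^{-(z+1)t}\,dt = \frac{1}{(\log 2)^{1/2}},\qquad z\in D,
\]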
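and, by (2.4.1),
\[
(g,M\hat\varphi_1)_\nu\,M\hat\varphi_1 = \frac{1}{\log 2}(g,1)_\nu = U^\infty\tilde g,\qquad g\in H^2(\nu).
\]
As 0 is not an eigenvalue of $\hat K$, we also have
\[
M^{-1}g = \sum_{k\in\mathbb{N}_+}(M^{-1}g,\hat\varphi_k)_m\,\hat\varphi_k,\qquad g\in H^2(\nu),
\]
or, alternatively,
\[
g = \sum_{k\in\mathbb{N}_+}(g,M\hat\varphi_k)_\nu\,M\hat\varphi_k,\qquad g\in H^2(\nu).
\]
Then
\[
\|M^{-1}g\|_{2,m}^2 = \|g\|_{2,\nu}^2 = \sum_{k\in\mathbb{N}_+}\bigl|(M^{-1}g,\hat\varphi_k)_m\bigr|^2 = \sum_{k\in\mathbb{N}_+}\bigl|(g,M\hat\varphi_k)_\nu\bigr|^2
\]
for any $g\in H^2(\nu)$. Therefore
\[
\|U^ng-U^\infty\tilde g\|_{2,\nu}^2 = \sum_{k\ge 2}|\lambda_k|^{2n}\bigl|(g,M\hat\varphi_k)_\nu\bigr|^2
\le\bigl(\|g\|_{2,\nu}^2-|U^\infty\tilde g|^2\log 2\bigr)|\lambda_2|^{2n} \tag{2.4.6}
\]
for any $n\in\mathbb{N}_+$ and $g\in H^2(\nu)$.
Inequalities (2.4.2) and (2.4.6) imply the following result.
Proposition 2.4.1 Let $g\in H^2(\nu)$. Then for any $n\in\mathbb{N}_+$ we have
\[
\|U^n\tilde g-U^\infty\tilde g\|_{2,\gamma}\le\bigl(\|g\|_{2,\nu}^2-|U^\infty\tilde g|^2\log 2\bigr)^{1/2}|\lambda_2|^{n}.
\]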

Corollary 2.4.2 ($L^2$-version of the Gauss–Kuzmin–Lévy theorem) Let $h:D\to\mathbb{C}$ be such that the function $z\to(z+1)h(z)$, $z\in D$, belongs to $H^2(\nu)$ and the restriction of $h$ to $I$ is the Radon–Nikodym derivative with respect to $\lambda$ of a probability measure $\mu$ on $\mathcal{B}_I$. Then
\[
\bigl|\mu\bigl(\tau^{-n}(A)\bigr)-\gamma(A)\bigr|
\le(\log 2)\,\gamma^{1/2}(A)\left(\frac{1}{\pi}\iint_D|h(x+iy)|^2\,dx\,dy-\frac{1}{\log 2}\right)^{1/2}|\lambda_2|^{n} \tag{2.4.7}
\]
for any $n\in\mathbb{N}_+$ and $A\in\mathcal{B}_I$.
Proof. Let $g(z)=(z+1)h(z)$, $z\in D$. For any $A\in\mathcal{B}_I$ and $n\in\mathbb{N}_+$ we have
\[
\bigl|(I_A,U^n\tilde g-U^\infty\tilde g)_\gamma\bigr|\le\left(\int_I I_A^2\,d\gamma\right)^{1/2}\|U^n\tilde g-U^\infty\tilde g\|_{2,\gamma}. \tag{2.4.8}
\]
But
\[
(I_A,U^n\tilde g-U^\infty\tilde g)_\gamma = \frac{1}{\log 2}\int_A\frac{U^n\tilde g(x)-U^\infty\tilde g}{x+1}\,dx
\]
and, by Proposition 2.1.5,
\[
\int_A\frac{U^n\tilde g(x)-U^\infty\tilde g}{x+1}\,dx = \mu\bigl(\tau^{-n}(A)\bigr)-\gamma(A)
\]
since
\[
U^\infty\tilde g = \frac{1}{\log 2}\int_I\frac{(x+1)h(x)}{x+1}\,dx = \frac{1}{\log 2}.
\]
Therefore (2.4.8) amounts to
\[
\bigl|\mu\bigl(\tau^{-n}(A)\bigr)-\gamma(A)\bigr|\le(\log 2)\,\gamma^{1/2}(A)\,\|U^n\tilde g-U^\infty\tilde g\|_{2,\gamma} \tag{2.4.9}
\]
for any $n\in\mathbb{N}_+$ and $A\in\mathcal{B}_I$. Now, (2.4.7) follows from (2.4.9) and Proposition 2.4.1. □

Remark. Inequality (2.4.6) can obviously be generalized as follows. For any $n,\ell\in\mathbb{N}_+$ and $g\in H^2(\nu)$ we have
\[
\Bigl\|U^ng-U^\infty\tilde g-\sum_{2\le k\le\ell}\lambda_k^n(g,M\hat\varphi_k)_\nu\,M\hat\varphi_k\Bigr\|_{2,\nu}^2
\le\Bigl(\|g\|_{2,\nu}^2-|U^\infty\tilde g|^2\log 2-\sum_{2\le k\le\ell}\bigl|(g,M\hat\varphi_k)_\nu\bigr|^2\Bigr)|\lambda_{\ell+1}|^{2n}
\]
with the usual convention which assigns value 0 to a sum over the empty set. Proposition 2.4.1 and Corollary 2.4.2 can be accordingly generalized. □
We can again derive the 'exact' Gauss–Kuzmin–Lévy Theorem 2.3.5.
First, we clearly have
\[
\begin{aligned}
\hat\psi_k(z) := M\hat\varphi_k(z) &= (z+1)\int_{\mathbb{R}_+}e^{-zt}\hat\varphi_k(t)\,m(dt) \\
&= (z+1)\int_{\mathbb{R}_+}e^{-zt}\,t^{1/2}\bigl(e^t-1\bigr)^{-1/2}\varphi_k(t)\,dt \\
&= (z+1)\,\psi_k(z),\qquad k\in\mathbb{N}_+,\ z\in D.
\end{aligned} \tag{2.4.10}
\]
Second, the function $g_a$, $a\in I$, defined by
\[
g_a(z) = \frac{(a+1)(z+1)}{(az+1)^2},\qquad z\in D,
\]
does not belong to $H^2(\nu)$ for $a=0$. Instead, the function
\[
Ug_a(z) = (a+1)(z+1)\sum_{j\in\mathbb{N}_+}\frac{1}{(z+a+j)^2},\qquad z\in D,
\]
does belong to $H^2(\nu)$ for any $a\in I$. Then
\[
U^ng_a = U^{n-1}(Ug_a) = \sum_{k\in\mathbb{N}_+}\lambda_k^{n-1}(M^{-1}Ug_a,\hat\varphi_k)_m\,\hat\psi_k
\]
for any $a\in I$ and $n\in\mathbb{N}_+$. Now, it is easy to check that
\[
M^{-1}Ug_a(t) = (a+1)\,e^{-at},\qquad a\in I,\ t\in\mathbb{R}_+ . \tag{2.4.11}
\]
Hence
\[
(M^{-1}Ug_a,\hat\varphi_k)_m = (a+1)\int_{\mathbb{R}_+}e^{-at}\hat\varphi_k(t)\,m(dt) = \hat\psi_k(a),\qquad a\in I,\ k\in\mathbb{N}_+ .
\]
Therefore
\[
U^ng_a = \sum_{k\in\mathbb{N}_+}\lambda_k^{n-1}\hat\psi_k(a)\,\hat\psi_k,\qquad n\in\mathbb{N}_+,\ a\in I, \tag{2.4.12}
\]
which by (2.4.10) is identical with (2.3.14).

Note that by (2.4.11) for any $a\in I$ we have
\[
\begin{aligned}
\|Ug_a\|_{2,\nu}^2 = \|M^{-1}Ug_a\|_{2,m}^2 &= (a+1)^2\int_{\mathbb{R}_+}\frac{e^{-2at}\,t\,dt}{e^t-1} \\
&= (a+1)^2\sum_{k\in\mathbb{N}_+}\int_{\mathbb{R}_+}t\,e^{-(2a+k)t}\,dt \\
&= (a+1)^2\sum_{k\in\mathbb{N}_+}\frac{1}{(2a+k)^2},
\end{aligned} \tag{2.4.13}
\]
that is, by Proposition 2.3.4,
\[
\|Ug_a\|_{2,\nu}^2 = (a+1)^2\sum_{k\in\mathbb{N}_+}|\psi_k(a)|^2 .
\]
This result is not at all surprising. It can be derived immediately from (2.4.12) with $n=1$ on account of the fact that $(\hat\psi_k)_{k\in\mathbb{N}_+}$ is an orthonormal basis in $H^2(\nu)$. (Remark that the $\psi_k$, $k\in\mathbb{N}_+$, are not pairwise orthogonal in $H$!)
Next,
\[
U^\infty U\tilde g_a = U^\infty\tilde g_a = \frac{1}{\log 2},\qquad a\in I. \tag{2.4.14}
\]
It then follows from Proposition 2.4.1 that for any $n\in\mathbb{N}_+$ we have
\[
\|U^n\tilde g_a-U^\infty\tilde g_a\|_{2,\gamma} = \bigl\|U^{n-1}(U\tilde g_a)-U^\infty\tilde g_a\bigr\|_{2,\gamma}
\le\bigl(\|Ug_a\|_{2,\nu}^2-|U^\infty\tilde g_a|^2\log 2\bigr)^{1/2}|\lambda_2|^{n-1}. \tag{2.4.15}
\]

Proposition 2.4.3 For any $a\in I$, $n\in\mathbb{N}_+$ and $A\in\mathcal{B}_I$ we have
\[
\bigl|\gamma_a\bigl(\tau^{-n}(A)\bigr)-\gamma(A)\bigr|
\le(\log 2)\,\gamma^{1/2}(A)\Bigl((a+1)^2\sum_{k\in\mathbb{N}_+}\frac{1}{(2a+k)^2}-\frac{1}{\log 2}\Bigr)^{1/2}|\lambda_2|^{n-1}. \tag{2.4.16}
\]
Proof. The function
\[
\tilde h_a(x) = \frac{\tilde g_a(x)}{x+1} = \frac{a+1}{(ax+1)^2},\qquad x\in I,
\]
is just the Radon–Nikodym derivative $d\gamma_a/d\lambda$. Now, (2.4.16) follows from (2.4.9) and (2.4.13) through (2.4.15). □
Remarks. 1. On account of the remark following Corollary 2.4.2, inequality (2.4.16) can be generalized as follows. For any $a\in I$, $\ell,n\in\mathbb{N}_+$, and $A\in\mathcal{B}_I$ we have
\[
\Bigl|\gamma_a\bigl(\tau^{-n}(A)\bigr)-\gamma(A)-(\log 2)\sum_{2\le k\le\ell}\lambda_k^{n-1}\hat\psi_k(a)\int_A\hat\psi_k\,d\gamma\Bigr|
\le(\log 2)\,\gamma^{1/2}(A)\Bigl((a+1)^2\sum_{k\in\mathbb{N}_+}\frac{1}{(2a+k)^2}-\sum_{1\le k\le\ell}\hat\psi_k^2(a)\Bigr)^{1/2}|\lambda_{\ell+1}|^{n-1}. \tag{2.4.17}
\]
2. It is instructive to compare the inequality in Theorem 2.3.5 with (2.4.17). The difference between them reflects the difference between the Hilbert spaces $H$ and $H^2(\nu)$. □

2.4.2 The Mayer–Roepstorff Banach space approach


In this subsection we give a summary of the work of Mayer and Roepstorff (1988) on the $u_0$-positivity of the Perron–Frobenius operators $P_\lambda$ and $U=P_\gamma$ on a suitable Banach space.
Let us first recall a few concepts concerning positive operators with respect to a cone in a real Banach space $B$. A closed convex subset $C$ of $B$ is called a cone if and only if (i) $x\in C$ and $a\in\mathbb{R}_+$ imply $ax\in C$, and (ii) $x\in C$ and $-x\in C$ imply $x=0$.
A cone $C$ induces a partial order $\le_C$ ($\le$ for short): $x\le y$ if and only if $y-x\in C$. A cone $C$ is said to be reproducing if and only if $B=C-C$, that is, any $z\in B$ can be written as $z=x-y$ with $x,y\in C$. A linear operator $T:B\to B$ is said to be positive with respect to a cone $C$ if and only if $TC\subset C$.
Let $C$ be a cone and $0\neq u_0\in C$. An operator $T$ that is positive with respect to $C$ is said to be $u_0$-positive if and only if for any $0\neq x\in C$ there exist $p\in\mathbb{N}_+$ and $\alpha,\beta\in\mathbb{R}_{++}$ such that
\[
\alpha u_0\le T^px\le\beta u_0 .
\]
Compact operators on the complexification of $B$, which are positive with respect to a reproducing cone $C\subset B$ and $u_0$-positive for some $0\neq u_0\in C$, enjoy properties similar to those of finite positive matrices. They obey a generalization of the Perron–Frobenius theorem for such matrices. For details the reader is referred to Krasnoselskii (1964).
Coming back to our problem, let
\[
D_1 = \{z\in\mathbb{C}:|z-1|<3/2\}
\]
and consider the collection $A(D_1)$ of all holomorphic functions in $D_1$ which together with their first derivatives are continuous in $\overline{D}_1$; $A(D_1)$ is a Banach space under the norm
\[
\|f\| = \max\Bigl(\sup_{z\in D_1}|f(z)|,\ \sup_{z\in D_1}|f'(z)|\Bigr),\qquad f\in A(D_1).
\]
Both operators $P_\lambda$ and $U$ take $A(D_1)$ into itself. Obviously, for $f\in A(D_1)$ we define $P_\lambda f$ and $Uf$ by
\[
P_\lambda f(z) = \sum_{i\in\mathbb{N}_+}\frac{1}{(z+i)^2}\,f\!\left(\frac{1}{z+i}\right),\qquad z\in D_1,
\]
and
\[
Uf(z) = \sum_{i\in\mathbb{N}_+}P_i(z)\,f\!\left(\frac{1}{z+i}\right),\qquad z\in D_1,
\]
respectively. Both $P_\lambda$ and $U$ are nuclear operators of trace class on $A(D_1)$.


Let us write (compare with Subsection 2.1.2) $P_\lambda=\Pi_1+T_0$, where
\[
\Pi_1f(z) = f_1(z)\int_I f\,d\lambda,\qquad f\in A(D_1),\ z\in D_1,
\]
and
\[
f_1(z) = \frac{(\log 2)^{-1}}{z+1},\qquad z\in D_1 .
\]
Since $P_\lambda(f_1f)=f_1\,Uf$, $f\in A(D_1)$, the spectra of the operators $P_\lambda$ and $U$ on $A(D_1)$ are identical, algebraic multiplicities of the eigenvalues included.
Theorem 2.4.4 The spectra of $U$ on $A(D_1)$ and on $H^2(\nu)$ (see Subsection 2.4.1) are identical, algebraic multiplicities of the eigenvalues included.
Consider the subspaces
\[
A^\perp(D_1) = \Bigl\{f\in A(D_1): U^\infty f = \int_I f\,d\gamma = 0\Bigr\}
\]
and
\[
\tilde A^\perp(D_1) = \Bigl\{f\in A(D_1): \int_I f\,d\lambda = 0\Bigr\}
\]
of $A(D_1)$ and the real subspaces $A_r^\perp(D_1)$ $\bigl(\tilde A_r^\perp(D_1)\bigr)$ of $A^\perp(D_1)$ $\bigl(\tilde A^\perp(D_1)\bigr)$ consisting of functions that take real values on $\mathbb{R}\cap\overline{D}_1=[-1/2,5/2]$. Note that by Proposition 2.1.1(ii) $U$ leaves invariant both subspaces $A^\perp(D_1)$ and $A_r^\perp(D_1)$ while $P_\lambda$ leaves invariant both subspaces $\tilde A^\perp(D_1)$ and $\tilde A_r^\perp(D_1)$. The complexification of $A_r^\perp(D_1)$ $\bigl(\tilde A_r^\perp(D_1)\bigr)$ is just $A^\perp(D_1)$ $\bigl(\tilde A^\perp(D_1)\bigr)$. Also, the spectrum of $T_0$ on $\tilde A^\perp(D_1)$ is identical with the spectrum of $U$ on $A^\perp(D_1)$.
The set
\[
C = \bigl\{f\in A_r^\perp(D_1): f'\ge 0 \text{ on } [-1/2,5/2]\bigr\}
\]
is a reproducing cone in $A_r^\perp(D_1)$. Define $u_0\in A(D_1)$ by
\[
u_0(z) = z+1-\frac{1}{\log 2},\qquad z\in D_1 .
\]
Clearly, $u_0\in C$.
Theorem 2.4.5 The operator $-U$ on $A_r^\perp(D_1)$ is positive with respect to the cone $C$. Moreover, $-U$ is $u_0$-positive. Hence the operator $-U+U^\infty$ on $A(D_1)$ has a simple positive dominant eigenvalue equal to $\lambda_0$ (cf. Theorem 2.2.5) with eigenfunction $f_2$ in the interior $C^o$ of $C$. There is no other eigenfunction in $C$.
Corollary 2.4.6 The operator $-T_0$ on $\tilde A_r^\perp(D_1)$ is positive with respect to the (reproducing) cone $f_1C=(f_1f:f\in C)$. Moreover, $-T_0$ is $f_1u_0$-positive. Hence the operator $-T_0$ on $A(D_1)$ has a simple positive dominant eigenvalue equal to $\lambda_0$ with eigenfunction $\hat f_2=f_1f_2$. There is no other eigenfunction in $f_1C$.
Note that a minimax principle for $-\lambda_0$ holds. Namely, we have
\[
\min_{f\in C^o}\ \max_{-1/2\le x\le 5/2}\frac{(Uf)'(x)}{f'(x)} = -\lambda_0 = \max_{f\in C^o}\ \min_{-1/2\le x\le 5/2}\frac{(Uf)'(x)}{f'(x)} .
\]
Hence
\[
\min_{-1/2\le x\le 5/2}\frac{(Uf)'(x)}{f'(x)}\le-\lambda_0\le\max_{-1/2\le x\le 5/2}\frac{(Uf)'(x)}{f'(x)}
\]
for any $f\in C^o$. For example, taking
\[
f(z) = \frac{z+1}{z+1.14617}-c,\qquad z\in D_1,
\]
with $c$ chosen such that $f\in A^\perp(D_1)$, we obtain
\[
0.2995\le\lambda_0\le 0.3038,
\]
that is, an approximation which is good enough.

2.4.3 Mayer–Ruelle operators


Statistical mechanics problems motivated the consideration of a class of
operators including as a special case the Perron–Frobenius operator Pλ of
τ under λ. This class has been thoroughly studied by Mayer (1990, 1991).
Nowadays, these operators are named after him and D. Ruelle.
Let $D_1=\{z\in\mathbb{C}:|z-1|<3/2\}$ and consider the collection $A_\infty(D_1)$ of all holomorphic functions in $D_1$ which are continuous in $\overline{D}_1$; $A_\infty(D_1)$ is a Banach space under the supremum norm
\[
\|f\| = \sup_{z\in D_1}|f(z)|,\qquad f\in A_\infty(D_1).
\]
For any $\beta\in\mathbb{C}$ with $\operatorname{Re}\beta>1$ and $f\in A_\infty(D_1)$ define
\[
G_\beta f(z) = \sum_{i\in\mathbb{N}_+}\frac{1}{(z+i)^\beta}\,f\!\left(\frac{1}{z+i}\right),\qquad z\in D_1 .
\]
It is easy to check that $G_\beta$ is a bounded linear operator on $A_\infty(D_1)$. Hence, as mentioned when discussing nuclear operators in Subsection 2.3.2, $G_\beta$ is nuclear of order 0 and thus has a discrete spectrum.
For β = 2, Gβ has the same analytical expression as Pλ . In what follows
we give without proofs the most important properties of the Mayer–Ruelle
operator Gβ for Re β > 1, which generalize those of Pλ . For proofs we
refer the reader to Mayer (1990, 1991). See also Daudé et al. (1997), Faivre
(1992), Flajolet and Vallée (1998, 2000), and Vallée (1997).
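For $\beta=2$ the operator can be exercised on the function $x\mapsto 1/(x+1)$, which is proportional to the Gauss density and should therefore be reproduced by $G_2=P_\lambda$. The sketch below assumes real arguments only and truncates the defining sum; with $N$ terms the sum telescopes to $1/(x+1)-1/(x+N+1)$. The function names are hypothetical.

```python
def mayer_ruelle(f, x, beta, terms=100000):
    """Truncated G_beta f(x) = sum_i (x + i)^(-beta) f(1 / (x + i)) for real x."""
    return sum((x + i) ** (-beta) * f(1.0 / (x + i)) for i in range(1, terms + 1))

def gauss_shape(x):
    """1 / (x + 1): the Gauss density up to the factor 1 / log 2."""
    return 1.0 / (x + 1.0)

if __name__ == "__main__":
    for x in (0.0, 0.3, 1.0):
        print(x, mayer_ruelle(gauss_shape, x, 2), gauss_shape(x))
```

The slow $1/N$ convergence visible here is specific to $\beta=2$ with this test function; for larger real $\beta$ the terms decay faster.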
Theorem 2.4.7 Let $\beta$ be real, strictly greater than 1.
(i) The operator $G_\beta:A_\infty(D_1)\to A_\infty(D_1)$ has a positive dominant eigenvalue $\lambda(\beta)$ which is simple and strictly greater in absolute value than all other eigenvalues. The corresponding eigenfunction $g_\beta\in A_\infty(D_1)$ is strictly positive on $\overline{D}_1\cap\mathbb{R}=[-1/2,5/2]$.
(ii) The map $\beta\to\lambda(\beta)$ defines on $(1,\infty)$ a strictly decreasing and log-concave function with
\[
\lim_{\beta\downarrow 1}\lambda(\beta)=\infty,\qquad \lambda(2)=1,\qquad \lim_{\beta\to\infty}\frac{\log\lambda(\beta)}{\beta}=\log\frac{\sqrt5-1}{2}.
\]
Moreover,
\[
\lambda(\beta+u)\le\left(\frac{\sqrt5-1}{2}\right)^{u}\lambda(\beta),\qquad u\in\mathbb{R}_+ .
\]
(iii) There exists a linear functional $\ell_\beta$ on $A_\infty(D_1)$ with $\ell_\beta(g_\beta)=1$ and $\ell_\beta(f)>0$ for any $f\in A_\infty(D_1)$ such that $f|_{[-1/2,5/2]}>0$ (here $f|_{[-1/2,5/2]}$ denotes the restriction of $f$ to $[-1/2,5/2]$). If $\Pi_{1\beta}$ denotes the projection defined as
\[
\Pi_{1\beta}f = \ell_\beta(f)\,g_\beta,\qquad f\in A_\infty(D_1),
\]
then
\[
G_\beta = \lambda(\beta)\,\Pi_{1\beta}+T_{0\beta}
\]
with $\Pi_{1\beta}T_{0\beta}=T_{0\beta}\Pi_{1\beta}=0$. Hence
\[
G_\beta^n = \lambda^n(\beta)\,\Pi_{1\beta}+T_{0\beta}^n,\qquad n\in\mathbb{N}_+ .
\]
(iv) The spectral radius $\rho(\beta)$ of the linear operator $T_{0\beta}:A_\infty(D_1)\to A_\infty(D_1)$ is strictly smaller than $\lambda(\beta)$, and for any $f\in A_\infty(D_1)$ such that $f|_{[-1/2,5/2]}>0$ we have
\[
\frac{G_\beta^nf(z)}{\lambda^n(\beta)\,\ell_\beta(f)\,g_\beta(z)} = 1+O\!\left(\left(\frac{\rho(\beta)}{\lambda(\beta)}\right)^{n}\right)
\]
as $n\to\infty$, where the constant implied in $O$ is independent of $z\in D_1$ (but dependent on $f$ and $\beta$).
(v) There exists $\varepsilon=\varepsilon(\beta)>0$ such that for any $\alpha\in\mathbb{C}$ satisfying $|\alpha-\beta|\le\varepsilon$ the dominant spectral properties of $G_\beta:A_\infty(D_1)\to A_\infty(D_1)$ transfer to $G_\alpha:A_\infty(D_1)\to A_\infty(D_1)$: quantities $\lambda(\alpha)$, $\rho(\alpha)$, $g_\alpha$, $\ell_\alpha$ (thus $\Pi_{1\alpha}$) and $T_{0\alpha}$ can be defined to represent the dominant spectral objects associated with $G_\alpha$, and all of them are analytic with respect to $\alpha$. Moreover, let $a\in(\rho(\beta)/\lambda(\beta),1)$. For any $f\in A_\infty(D_1)$ such that $f|_{[-1/2,5/2]}>0$ we have
\[
\frac{G_\alpha^nf(z)}{\lambda^n(\alpha)\,\ell_\alpha(f)\,g_\alpha(z)} = 1+O(a^n)
\]
as $n\to\infty$, where the constant implied in $O$ is independent of $z\in D_1$ and $\alpha$ satisfying $|\alpha-\beta|\le\varepsilon$, but depends on $a$, $f$, and $\beta$. Finally, $\rho(\beta+it)<\rho(\beta)$ for $t\in[-\varepsilon,\varepsilon]$, $t\neq 0$.
The proof is the same Perron–Frobenius type of argument used in the case $\beta=2$, which has been sketched in the preceding subsection. There the existence of a dominant simple real (in fact, negative) eigenvalue of $T_{02}=T_0$ followed by considering the subspace $A(D_1)\subset A_\infty(D_1)$. □
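As in the special case $\beta=2$, the Mayer–Ruelle operators enjoy better properties when defined on suitable Hilbert spaces.
Let $\operatorname{Re}\beta>1$. Consider the collection $H^{(\beta)}$ of functions $f$ which are holomorphic in the half-plane $\operatorname{Re}z>-1/2$, bounded in any half-plane $\operatorname{Re}z>-1/2+\varepsilon$, $\varepsilon>0$, and can be represented in the form
\[
f(z) = \int_{\mathbb{R}_+}e^{-zs}\varphi(s)\,s^{(\beta-1)/2}\,m'(ds),\qquad \operatorname{Re}z>-1/2, \tag{2.4.18}
\]
where $m'$ is the measure on $\mathcal{B}_{\mathbb{R}_+}$ with density
\[
\frac{dm'}{ds} =
\begin{cases}
\dfrac{1}{e^s-1} & \text{if } s>0,\\[2mm]
0 & \text{if } s=0,
\end{cases}
\]
for some $\varphi\in L_{m'}^2(\mathbb{R}_+)$, the Hilbert space of $m'$-square integrable functions $\varphi:\mathbb{R}_+\to\mathbb{C}$ with inner product $(\cdot,\cdot)_{m'}$ defined by
\[
(\varphi,\psi)_{m'} = \int_{\mathbb{R}_+}\varphi\psi^*\,dm',\qquad \varphi,\psi\in L_{m'}^2(\mathbb{R}_+),
\]
and norm
\[
\|\varphi\|_{2,m'} = \left(\int_{\mathbb{R}_+}|\varphi|^2\,dm'\right)^{1/2},\qquad \varphi\in L_{m'}^2(\mathbb{R}_+).
\]
Introducing the inner product
\[
(f_1,f_2)^{(\beta)} = (\varphi_1,\varphi_2)_{m'},
\]
where $\varphi_i$ is associated with $f_i$, $i=1,2$, by (2.4.18), $H^{(\beta)}$ is made a Hilbert space with norm
\[
\|f\|^{(\beta)} = \|\varphi\|_{2,m'},\qquad f\in H^{(\beta)},
\]
where $f$ and $\varphi$ are again associated by (2.4.18).
Theorem 2.4.8 Let $\operatorname{Re}\beta>1$.
(i) The linear operator $G_\beta$ takes boundedly $H^{(\beta)}$ into itself.
(ii) For any $f\in H^{(\beta)}$ we have
\[
G_\beta f(z) = \int_{\mathbb{R}_+}e^{-zs}\,K_\beta\varphi(s)\,s^{(\beta-1)/2}\,m'(ds),\qquad \operatorname{Re}z>-1/2,
\]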
where $K_\beta:L_{m'}^2(\mathbb{R}_+)\to L_{m'}^2(\mathbb{R}_+)$ is a symmetric integral operator defined by
\[
K_\beta\varphi(s) = \int_{\mathbb{R}_+}J_{\beta-1}\bigl(2\sqrt{st}\bigr)\,\varphi(t)\,m'(dt),\qquad \varphi\in L_{m'}^2(\mathbb{R}_+),\ s\in\mathbb{R}_+ .
\]
Here $J_{\beta-1}$ is the Bessel function of order $\beta-1$ defined by
\[
J_{\beta-1}(u) = \left(\frac{u}{2}\right)^{\beta-1}\sum_{k\in\mathbb{N}}\frac{(-1)^k}{k!\,\Gamma(k+\beta)}\left(\frac{u}{2}\right)^{2k},\qquad u\in\mathbb{R}_+ .
\]
Hence $G_\beta:H^{(\beta)}\to H^{(\beta)}$ can be diagonalized in an orthonormal basis of $H^{(\beta)}$. Moreover, if $\beta\in\mathbb{R}$ then $G_\beta$ is self-adjoint and its spectrum is real.
(iii) The spectra of the operators $G_\beta:A_\infty(D_1)\to A_\infty(D_1)$, $G_\beta:H^{(\beta)}\to H^{(\beta)}$ and $K_\beta:L_{m'}^2(\mathbb{R}_+)\to L_{m'}^2(\mathbb{R}_+)$ are identical. Hence for any real $\beta>1$ these spectra are all real.
Let us note in particular that for $\beta=2$ the symmetric operator $K_2$ from Theorem 2.4.8 is different from the symmetric operator $K$ from Proposition 2.3.1. They are related by the simple relation $K_2=SKS^{-1}$, where $S:L^2(\mathbb{R}_+)\to L_{m'}^2(\mathbb{R}_+)$ is an invertible linear operator defined by
\[
S\varphi(s) = (e^s-1)^{1/2}\varphi(s),\qquad s\in\mathbb{R}_+ .
\]
Hence the spectra of $K$ and $K_2$ are identical.


As for $K$, formulae for the trace of $K_\beta$ and its powers are available. Denoting by $\lambda_i(\beta)$, $i\in\mathbb{N}_+$, the eigenvalues of $K_\beta$ taken in order of decreasing moduli and counting their multiplicity, we have
\[
\operatorname{Tr}K_\beta = \sum_{i\in\mathbb{N}_+}\lambda_i(\beta) = \sum_{i\in\mathbb{N}_+}\frac{1}{y_i^{\beta-2}\bigl(y_i^2+1\bigr)},
\]
where $y_i=\bigl(i+\sqrt{i^2+4}\bigr)/2$, $i\in\mathbb{N}_+$, and, in general,
\[
\operatorname{Tr}K_\beta^n = \sum_{i\in\mathbb{N}_+}\lambda_i^n(\beta) = \sum_{i_1,\dots,i_n\in\mathbb{N}_+}\frac{1}{y_{i_1\cdots i_n}^{\beta-2}\bigl(y_{i_1\cdots i_n}^2+(-1)^{n-1}\bigr)},
\]
where
\[
y_{i_1\cdots i_n} = \frac{p_{n-1}+q_n+\sqrt{(p_{n-1}+q_n)^2+4(-1)^{n-1}}}{2}
\]
with, as usual,
\[
\frac{p_n}{q_n} = [i_1,\dots,i_n],\qquad \mathrm{g.c.d.}(p_n,q_n)=1,\qquad p_0=0,
\]
for any $n\in\mathbb{N}_+$ and $i_1,\dots,i_n\in\mathbb{N}_+$. Let us note that for $\beta=2$ we recover Babenko's formula for $\operatorname{Tr}K^n$, $n\in\mathbb{N}_+$. See the remark following the proof of Proposition 2.3.2.
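In particular [see Daudé et al. (1997)], we have

$$\operatorname{Tr} K_4 = \frac{7}{2} - \frac{2}{\sqrt5} - \frac{7}{2\sqrt2} + \frac12 \sum_{i\ge 2} (-1)^i\, \frac{i-1}{i+1} \binom{2i}{i} \Big(\zeta(2i) - 1 - \frac{1}{2^{2i}}\Big) = 0.14446\ 23962\ 46160\ 81588 \cdots,$$

$$\operatorname{Tr} K_4^2 = 0.04647\ 18256\ 42727\ 93983 \cdots,$$

and

$$\begin{aligned}
\lambda_1(4) &= \phantom{-}0.19945\ 88183\ 43767\ 26019 \cdots,\\
\lambda_2(4) &= -0.07573\ 95140\ 84360\ 60892 \cdots,\\
\lambda_3(4) &= \phantom{-}0.02856\ 64037\ 69818\ 52783 \cdots,\\
\lambda_4(4) &= -0.01077\ 74165\ 76612\ 69829 \cdots,\\
\lambda_5(4) &= \phantom{-}0.00407\ 09406\ 93426\ 42144 \cdots.
\end{aligned}$$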
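The trace formulae above can be checked numerically by direct summation. The following is a minimal sketch, not part of the original text: the function names and the truncation levels are illustrative assumptions, and the only inputs are the formulae for $y_i$ and $y_{i_1 i_2}$ quoted above (for $n = 2$ one has $p_1 = 1$ and $q_2 = i_1 i_2 + 1$).

```python
import math

def trace_K(beta, terms=200_000):
    """Partial sum of Tr K_beta = sum_i 1 / (y_i**(beta-2) * (y_i**2 + 1))."""
    parts = []
    for i in range(1, terms + 1):
        y = (i + math.sqrt(i * i + 4)) / 2
        parts.append(1.0 / (y ** (beta - 2) * (y * y + 1)))
    return math.fsum(parts)

def trace_K_squared(beta, cutoff=20_000):
    """Partial sum of Tr K_beta^2 over all pairs (i1, i2) with i1 * i2 <= cutoff."""
    parts = []
    for i1 in range(1, cutoff + 1):
        for i2 in range(1, cutoff // i1 + 1):
            t = i1 * i2 + 2  # p_1 + q_2 for [i1, i2] = i2 / (i1*i2 + 1)
            y = (t + math.sqrt(t * t - 4)) / 2  # n = 2, so the sign under the root is -4
            parts.append(1.0 / (y ** (beta - 2) * (y * y - 1)))
    return math.fsum(parts)

tr1 = trace_K(4)
tr2 = trace_K_squared(4)
print(f"Tr K_4   ~ {tr1:.10f}")
print(f"Tr K_4^2 ~ {tr2:.10f}")
```

The terms decay like $i^{-4}$, so the truncation error of both partial sums is far below the ten digits printed; the results can be compared with the values of $\operatorname{Tr} K_4$ and $\operatorname{Tr} K_4^2$ quoted above.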

To conclude this brief discussion of Mayer–Ruelle operators we mention two generalizations of them.
a. For any subset $M$ of $\mathbb{N}_+$ define

$$G_{M,\beta} f(z) = \sum_{i\in M} \frac{1}{(z+i)^\beta}\, f\Big(\frac{1}{z+i}\Big), \quad z \in D_1,$$

whatever $\beta \in \mathbb{C}$ with $\mathrm{Re}\,\beta > 1$ and $f \in A_\infty(D_1)$. Clearly, $G_{M,\beta}$ is a bounded linear operator on $A_\infty(D_1)$, hence a nuclear one of trace class, which coincides with $G_\beta$ when $M = \mathbb{N}_+$. Now, for an arbitrarily fixed $k \in \mathbb{N}_+$, let $M_i$, $1 \le i \le k$, be subsets of $\mathbb{N}_+$ and write $\mathbf{M} = (M_1, \dots, M_k)$. Consider the linear operator $G_{\mathbf{M},\beta} : A_\infty(D_1) \to A_\infty(D_1)$ defined as

$$G_{\mathbf{M},\beta} = G_{M_k,\beta} \circ \cdots \circ G_{M_1,\beta},$$

which is nuclear of trace class, too.


The operators GM,β for various M control the dynamics of continued
fraction expansions of irrationals subject to periodical constraints. Their
spectral properties are entirely similar to those of Gβ . For details see Vallée
(1998), who considered systematically such operators. See, however, Fluch
(1986, 1992) for special cases.
b. The second generalization has been motivated by the study of the transformation

$$z \mapsto \frac{1}{z} - \Big\lfloor \mathrm{Re}\,\frac{1}{z} \Big\rfloor, \quad 0 \ne z \in \mathbb{C},$$

which extends to the complex domain the continued fraction transformation $\tau$. Let

$$D_2 = \Big\{ z : |z - 1| < \frac54 \Big\},$$

and consider the collection $B_\infty(D_2)$ of all functions $F$ which are holomorphic in $D_2^2$ and continuous in $\overline{D}_2^{\,2}$. Under the supremum norm

$$\|F\| = \sup_{(z,w)\in \overline{D}_2^{\,2}} |F(z, w)|,$$

$B_\infty(D_2)$ is a Banach space.

Then for any $(\alpha, \beta) \in \mathbb{C}^2$ with $\mathrm{Re}\,(\alpha + \beta) > 1$ a bounded linear operator

$$G_{\alpha,\beta} : B_\infty(D_2) \to B_\infty(D_2)$$

is defined by

$$G_{\alpha,\beta} F(z, w) = \sum_{i\in\mathbb{N}_+} \frac{1}{(z+i)^\alpha (w+i)^\beta}\, F\Big(\frac{1}{z+i}, \frac{1}{w+i}\Big)$$

for any $F \in B_\infty(D_2)$ and $(z, w) \in D_2^2$. The spectral properties of $G_{\alpha,\beta}$, which is positive and nuclear of trace class, are strongly related to those of $G_{\alpha+\beta+2\ell}$, $\ell \in \mathbb{N}$. For details see Vallée (1997).

2.5 The Markov chain associated with the continued fraction expansion

2.5.1 The Perron–Frobenius operator on BV (I)
In this section we study the Perron–Frobenius operator U on BV (I). This
is motivated by Proposition 2.1.10 which establishes U as the transition
operator of certain Markov chains. Throughout, except for Corollary 2.5.7,
we consider just real-valued functions in BV (I).
By Proposition 2.1.16, the operator $U$ defined by (2.1.16) is a bounded linear operator of norm 1 on $BV(I)$. Moreover, by Corollary 2.1.13 we have

$$\operatorname{var} Uf \le \frac12 \operatorname{var} f$$

for any $f \in BV(I)$, the constant 1/2 being optimal. Hence

$$\operatorname{var} U^n f \le 2^{-n} \operatorname{var} f$$

for any $f \in BV(I)$ and $n \in \mathbb{N}_+$. As might be expected, we shall see that the constant $2^{-n}$ is not optimal for $n > 1$. A natural problem thus arises: what is the upper bound of $\operatorname{var} U^n f / \operatorname{var} f$ over non-constant $f \in BV(I)$? A satisfactory answer to this problem will be given in Theorem 2.5.3 and Corollary 2.5.6.
It is easy to check by induction with respect to $n \in \mathbb{N}_+$ that

$$U^n f(x) = \sum_{i_1,\dots,i_n\in\mathbb{N}_+} P_{i_1\cdots i_n}(x)\, f(u_{i_n\cdots i_1}(x)), \quad x \in I, \tag{2.5.1}$$

where

$$\begin{aligned}
u_{i_n\cdots i_1} &= u_{i_n} \circ \cdots \circ u_{i_1},\\
P_{i_1\cdots i_n}(x) &= P_{i_1}(x)\, P_{i_2}(u_{i_1}(x)) \cdots P_{i_n}(u_{i_{n-1}\cdots i_1}(x)), \quad n \ge 2,
\end{aligned} \tag{2.5.2}$$

and the functions $u_i$ and $P_i$, $i \in \mathbb{N}_+$, are defined by

$$u_i(x) = \frac{1}{x+i}, \qquad P_i(x) = \frac{x+1}{(x+i)(x+i+1)}, \quad x \in I.$$

Note that by Proposition 2.1.10 we have

$$U^n f(x) = E_x\big(f(s_n^x)\big)$$

for any $n \in \mathbb{N}$, $f \in B(I)$, and $x \in I$ (remember that $s_0^x = x$, $x \in I$), where $E_x$ denotes the mean value operator with respect to the probability measure $\gamma_x$. As

$$s_n^x = u_{a_n\cdots a_1}(x), \quad x \in I,\ n \in \mathbb{N}_+,$$

we thus have

$$U^n f(x) = \sum_{i^{(n)}\in\mathbb{N}_+^n} \gamma_x\big((a_1, \dots, a_n) = i^{(n)}\big)\, f(u_{i_n\cdots i_1}(x)) \tag{2.5.3}$$

for any $n \in \mathbb{N}_+$, $f \in B(I)$, and $x \in I$. Hence

$$P_{i_1\cdots i_n}(x) = \gamma_x\big(I(i^{(n)})\big) \tag{2.5.4}$$

for any $x \in I$, $n \in \mathbb{N}_+$, and $(i_1, \dots, i_n) = i^{(n)} \in \mathbb{N}_+^n$. Of course, equation (2.5.4) could also be obtained by direct computation.
Now, by (1.2.4), $I(i^{(n)})$ is the set of irrationals in the interval with endpoints $p_n/q_n$ and $(p_n + p_{n-1})/(q_n + q_{n-1})$. Since

$$\frac{p_n}{q_n} = [i_1, \dots, i_n] = \begin{cases} 1/i_1 & \text{if } n = 1,\\[4pt] \dfrac{1}{i_1 + p_{n-1}(i_2, \dots, i_n)/q_{n-1}(i_2, \dots, i_n)} & \text{if } n > 1 \end{cases}$$

and

$$\frac{p_n + p_{n-1}}{q_n + q_{n-1}} = \begin{cases} 1/(i_1 + 1) & \text{if } n = 1,\\[2pt] [i_1, \dots, i_{n-1}, i_n + 1] & \text{if } n > 1 \end{cases} = \begin{cases} 1/(i_1 + 1) & \text{if } n = 1,\\[4pt] \dfrac{1}{i_1 + p_n(i_2, \dots, i_n, 1)/q_n(i_2, \dots, i_n, 1)} & \text{if } n > 1 \end{cases}$$

we can write

$$P_{i_1\cdots i_n}(x) = (x+1) \times \frac{1}{q_{n-1}(i_2, \dots, i_n)(x + i_1) + p_{n-1}(i_2, \dots, i_n)} \times \frac{1}{q_n(i_2, \dots, i_n, 1)(x + i_1) + p_n(i_2, \dots, i_n, 1)} \tag{2.5.5}$$

for any $n \ge 2$, $i^{(n)} \in \mathbb{N}_+^n$, and $x \in I$.
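The agreement between the product formula (2.5.2) and the closed form (2.5.5) can be verified in exact rational arithmetic. The sketch below is illustrative only: the helper names (`P_product`, `P_closed`, `convergent`) are assumptions introduced here, and convergents are computed with the usual recurrence $p_k = d_k p_{k-1} + p_{k-2}$, $q_k = d_k q_{k-1} + q_{k-2}$ starting from $p_0 = 0$, $q_0 = 1$.

```python
from fractions import Fraction

def u(i, x):
    """u_i(x) = 1 / (x + i)."""
    return 1 / (x + i)

def P(i, x):
    """P_i(x) = (x + 1) / ((x + i)(x + i + 1))."""
    return (x + 1) / ((x + i) * (x + i + 1))

def P_product(word, x):
    """P_{i1...in}(x) as the product in (2.5.2)."""
    prob = Fraction(1)
    for i in word:
        prob *= P(i, x)
        x = u(i, x)  # move to u_{i_k ... i_1}(x) for the next factor
    return prob

def convergent(digits):
    """Return (p, q) with p/q = [d1, ..., dn] in lowest terms; (0, 1) for no digits."""
    p_prev, q_prev, p, q = 1, 0, 0, 1
    for d in digits:
        p_prev, q_prev, p, q = p, q, d * p + p_prev, d * q + q_prev
    return p, q

def P_closed(word, x):
    """P_{i1...in}(x) from the closed form (2.5.5), valid for n >= 2."""
    i1, tail = word[0], list(word[1:])
    p, q = convergent(tail)          # p_{n-1}(i2..in), q_{n-1}(i2..in)
    p1, q1 = convergent(tail + [1])  # p_n(i2..in,1),  q_n(i2..in,1)
    return (x + 1) / ((q * (x + i1) + p) * (q1 * (x + i1) + p1))

x0 = Fraction(2, 7)
for word in [(1, 1), (3, 1, 4), (2, 7, 1, 8, 2)]:
    assert P_product(word, x0) == P_closed(word, x0)
print(P_product((1, 1, 1), Fraction(0)))  # alpha_{111}
```

Using `Fraction` rather than floats makes the comparison an exact identity check instead of a tolerance test.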
A useful alternative representation of $U^n f$, $n \in \mathbb{N}_+$, when $f \in BV(I)$ is available.
Proposition 2.5.1 If $f \in BV(I)$ then for any $n \in \mathbb{N}_+$ and $x \in I$ we have

$$U^n f(x) = \int_{[0,1)} U^n I_{(a,1]}(x)\, df(a) + f(0)$$

with $\int_{[0,x)} df = f(x) - f(0)$, $x \in I$.
Proof. Since $f$ can be represented as the difference of two non-decreasing functions, we may and shall assume that $f$ is non-decreasing. Then for any $x \in I$ we have

$$f(x) - f(0) = \int_{[0,1)} I_{(a,1]}(x)\, df(a).$$

By (2.5.1), using the above equation and Fubini's theorem we obtain

$$\begin{aligned}
U^n f(x) &= \sum_{i_1,\dots,i_n\in\mathbb{N}_+} P_{i_1\cdots i_n}(x)\, f(u_{i_n\cdots i_1}(x))\\
&= \sum_{i_1,\dots,i_n\in\mathbb{N}_+} P_{i_1\cdots i_n}(x) \int_{[0,1)} I_{(a,1]}(u_{i_n\cdots i_1}(x))\, df(a) + f(0)\\
&= \int_{[0,1)} \Big( \sum_{i_1,\dots,i_n\in\mathbb{N}_+} P_{i_1\cdots i_n}(x)\, I_{(a,1]}(u_{i_n\cdots i_1}(x)) \Big) df(a) + f(0)\\
&= \int_{[0,1)} U^n I_{(a,1]}(x)\, df(a) + f(0)
\end{aligned}$$

for any $n \in \mathbb{N}_+$ and $x \in I$. □


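Corollary 2.5.2 For any $n \in \mathbb{N}_+$ we have

$$\sup_{f\in BV(I)} \frac{\operatorname{var} U^n f}{\operatorname{var} f} = \sup_{f\in B(I),\, f\uparrow} \frac{\operatorname{var} U^n f}{\operatorname{var} f} = \sup_{f\in B(I),\, f\downarrow} \frac{\operatorname{var} U^n f}{\operatorname{var} f} = \sup_{a\in[0,1)} \operatorname{var} U^n I_{(a,1]},$$

where the first three upper bounds are taken over non-constant functions $f$, and $f\uparrow$ ($\downarrow$) means that $f$ is non-decreasing (non-increasing).
Proof. It is clear that

$$\sup_{f\in B(I),\, f\uparrow} \frac{\operatorname{var} U^n f}{\operatorname{var} f} = \sup_{f\in B(I),\, f\downarrow} \frac{\operatorname{var} U^n f}{\operatorname{var} f}$$

since

$$\frac{\operatorname{var} U^n(-f)}{\operatorname{var}(-f)} = \frac{\operatorname{var} U^n f}{\operatorname{var} f}.$$

Next, let

$$v_n = \sup_{f\in B(I),\, f\uparrow} \frac{\operatorname{var} U^n f}{\operatorname{var} f}, \quad n \in \mathbb{N}_+.$$

Then (cf. the proof of Corollary 2.1.13) for any non-constant $f \in BV(I)$ there exist two non-decreasing functions $f_1$ and $f_2$ such that $f = f_1 - f_2$ and $\operatorname{var} f = \operatorname{var} f_1 + \operatorname{var} f_2$. Therefore

$$\operatorname{var} U^n f \le \operatorname{var} U^n f_1 + \operatorname{var} U^n f_2 \le v_n (\operatorname{var} f_1 + \operatorname{var} f_2) = v_n \operatorname{var} f, \quad n \in \mathbb{N}_+.$$

Hence

$$\sup_{f\in BV(I)} \frac{\operatorname{var} U^n f}{\operatorname{var} f} \le v_n$$

and since

$$\sup_{f\in BV(I)} \frac{\operatorname{var} U^n f}{\operatorname{var} f} \ge \sup_{f\in B(I),\, f\uparrow} \frac{\operatorname{var} U^n f}{\operatorname{var} f} = v_n,$$

the first equation should hold.

To derive the last equation let $f \in B(I)$ be non-decreasing. Then $U^n f$ is a monotone function by Proposition 2.1.11, and Proposition 2.5.1 implies that

$$U^n f(1) - U^n f(0) = \int_{[0,1)} \big( U^n I_{(a,1]}(1) - U^n I_{(a,1]}(0) \big)\, df(a)$$

for any $n \in \mathbb{N}_+$. Noting that $I_{(a,1]} : I \to I$ is also a non-decreasing function for any $a \in [0, 1)$, we obtain

$$\operatorname{var} U^n f \le \Big( \sup_{a\in[0,1)} \operatorname{var} U^n I_{(a,1]} \Big) \operatorname{var} f.$$

Hence, for any $a \in [0, 1)$ and $n \in \mathbb{N}_+$,

$$\operatorname{var} U^n I_{(a,1]} \le \sup_{f\in B(I),\, f\uparrow} \frac{\operatorname{var} U^n f}{\operatorname{var} f} \le \sup_{a\in[0,1)} \operatorname{var} U^n I_{(a,1]}$$

and the proof is complete. □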

2.5.2 An upper bound


On account of Corollary 2.5.2 our guess for the upper bound of $\operatorname{var} U^n f / \operatorname{var} f$ over non-constant $f \in BV(I)$ is given in the conjecture below.
UB Conjecture. For any $n \in \mathbb{N}_+$ we have

$$v_n = \sup_{a\in[0,1)} \operatorname{var} U^n I_{(a,1]} = \operatorname{var} U^n I_{(g,1]},$$

where $g = [1, 1, 1, \cdots] = (\sqrt5 - 1)/2 = 0.6180339 \cdots$.
Without any loss of generality, throughout this subsection we assume that $f \in BV(I)$ is non-decreasing. To simplify the writing put

$$P_{i_1\cdots i_n}(0) = \alpha_{i_1\cdots i_n}, \qquad u_{i_1\cdots i_n}(0) = \beta_{i_1\cdots i_n}, \quad i_1, \dots, i_n \in \mathbb{N}_+.$$
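If $n$ is odd then by Proposition 2.1.11 and equations (2.5.1), (2.5.2), and (2.5.5) we have

$$\begin{aligned}
\operatorname{var} U^n f &= U^n f(0) - U^n f(1)\\
&= \sum_{i_1,\dots,i_n\in\mathbb{N}_+} \big[ P_{i_1\cdots i_n}(0)\, f(u_{i_n\cdots i_1}(0)) - P_{i_1\cdots i_n}(1)\, f(u_{i_n\cdots i_1}(1)) \big]\\
&= \sum_{i_1,\dots,i_n\in\mathbb{N}_+} \big[ P_{i_1\cdots i_n}(0)\, f(u_{i_n\cdots i_1}(0)) - 2 P_{(i_1+1)i_2\cdots i_n}(0)\, f(u_{i_n\cdots i_2(i_1+1)}(0)) \big]\\
&= \sum_{i_2,\dots,i_n\in\mathbb{N}_+} \Big[ \alpha_{1i_2\cdots i_n} f(\beta_{i_n\cdots i_2 1}) - \sum_{i_1\in\mathbb{N}_+} \alpha_{(i_1+1)i_2\cdots i_n} f(\beta_{i_n\cdots i_2(i_1+1)}) \Big].
\end{aligned} \tag{2.5.6}$$

Similarly, if $n$ is even then we have

$$\operatorname{var} U^n f = U^n f(1) - U^n f(0) = \sum_{i_2,\dots,i_n\in\mathbb{N}_+} \Big[ \sum_{i_1\in\mathbb{N}_+} \alpha_{(i_1+1)i_2\cdots i_n} f(\beta_{i_n\cdots i_2(i_1+1)}) - \alpha_{1i_2\cdots i_n} f(\beta_{i_n\cdots i_2 1}) \Big]. \tag{2.5.7}$$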

It is easy to see that if $n$ is odd then $\operatorname{var} U^n I_{(a,1]}$ has a constant value for

$$a \in \begin{cases} \Big[ \dfrac{1}{j_1+1}, \dfrac{1}{j_1} \Big) & \text{if } n = 1,\\[6pt] \big[\, [j_1, \dots, j_{n-1}, j_n + 1],\ [j_1, \dots, j_n] \,\big) & \text{if } n > 1 \end{cases}$$

while if $n$ is even then $\operatorname{var} U^n I_{(a,1]}$ has a constant value for

$$a \in \big[\, [j_1, \dots, j_n],\ [j_1, \dots, j_{n-1}, j_n + 1] \,\big),$$

that is, in both cases, on the closure without the right endpoint of any fundamental interval $I(j^{(n)})$, $j^{(n)} = (j_1, \dots, j_n) \in \mathbb{N}_+^n$. Write $1(n)$ for $(j_1, \dots, j_n)$ with $j_k = 1$, $1 \le k \le n$, $n \in \mathbb{N}_+$. Then in particular for

$$a \in \big[\, [1(2m+2)],\ [1(2m+1)] \,\big), \quad m \in \mathbb{N},$$

that is,

$$a \in \Big[ \frac{F_{2m+1}}{F_{2m+2}}, \frac{F_{2m}}{F_{2m+1}} \Big), \quad m \in \mathbb{N}, \tag{2.5.8}$$
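we have

$$v'_1 := \operatorname{var} U I_{(a,1]} = 1/2,$$

$$v'_3 := \operatorname{var} U^3 I_{(a,1]} = \sum_{i_2\in\mathbb{N}_+} \Big( \alpha_{1i_21} - \sum_{i_1\in\mathbb{N}_+} \alpha_{(i_1+1)i_21} \Big) + \sum_{i_1\in\mathbb{N}_+} \alpha_{(i_1+1)11},$$

and

$$\begin{aligned}
v'_{2m+1} := \operatorname{var} U^{2m+1} I_{(a,1]} = {} & \sum_{q=0}^{m-2}\ \sum_{i_2,\dots,i_{2m-2q}\in\mathbb{N}_+} \Big[ \alpha_{1i_2i_3\cdots i_{2m-2q-1}(i_{2m-2q}+1)1\cdots1} - \sum_{i_1\in\mathbb{N}_+} \alpha_{(i_1+1)i_2i_3\cdots i_{2m-2q-1}(i_{2m-2q}+1)1\cdots1} \Big]\\
& + \sum_{i_2\in\mathbb{N}_+} \Big( \alpha_{1i_21\cdots1} - \sum_{i_1\in\mathbb{N}_+} \alpha_{(i_1+1)i_21\cdots1} \Big) + \sum_{i_1\in\mathbb{N}_+} \alpha_{(i_1+1)1\cdots1}
\end{aligned}$$

for $m \ge 2$. (In the last equation the number of subscripts of the $\alpha$'s is $2m + 1$.) Similarly, for

$$a \in \big[\, [1(2m+2)],\ [1(2m+3)] \,\big), \quad m \in \mathbb{N},$$

that is,

$$a \in \Big[ \frac{F_{2m+1}}{F_{2m+2}}, \frac{F_{2m+2}}{F_{2m+3}} \Big), \quad m \in \mathbb{N}, \tag{2.5.9}$$

we have

$$v'_2 := \operatorname{var} U^2 I_{(a,1]} = \sum_{i_1\in\mathbb{N}_+} \alpha_{(i_1+1)1},$$

$$\begin{aligned}
v'_{2m+2} := \operatorname{var} U^{2m+2} I_{(a,1]} = {} & \sum_{q=0}^{m-1}\ \sum_{i_2,\dots,i_{2m-2q+1}\in\mathbb{N}_+} \Big[ \sum_{i_1\in\mathbb{N}_+} \alpha_{(i_1+1)i_2i_3\cdots i_{2m-2q}(i_{2m-2q+1}+1)1\cdots1} - \alpha_{1i_2i_3\cdots i_{2m-2q}(i_{2m-2q+1}+1)1\cdots1} \Big]\\
& + \sum_{i_1\in\mathbb{N}_+} \alpha_{(i_1+1)1\cdots1}
\end{aligned}$$

for $m \in \mathbb{N}_+$. (In the last equation the number of subscripts of the $\alpha$'s is $2m + 2$.)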
Since $g$ belongs to all intervals (2.5.8) and (2.5.9), the UB Conjecture amounts to

$$v_n = v'_n, \quad n \in \mathbb{N}_+.$$

The case $n = 1$. This case was dealt with in Proposition 2.1.12. Actually, writing $i$ for $i_1$, equation (2.5.6) yields

$$\operatorname{var} Uf = \alpha_1 f(\beta_1) - \sum_{i\in\mathbb{N}_+} \alpha_{i+1} f(\beta_{i+1}).$$

Hence

$$\operatorname{var} U I_{(a,1]} = \frac{1}{i+1}$$

for $a \in \big[ \frac{1}{i+1}, \frac{1}{i} \big)$, $i \in \mathbb{N}_+$, and

$$v_1 = \sup_{a\in[0,1)} \operatorname{var} U I_{(a,1]} = \frac12 = \operatorname{var} U I_{(g,1]} = v'_1$$

as $g \in [1/2, 1)$. Thus in this case the UB Conjecture holds.


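The case $n = 2$. Write $i$ for $i_1$ and $j$ for $i_2$. Then we have

$$\alpha_{ij} = \frac{1}{(ij+1)(i(j+1)+1)}, \quad i, j \in \mathbb{N}_+,$$

and equation (2.5.7) yields

$$\begin{aligned}
\operatorname{var} U^2 f &= \sum_{j\in\mathbb{N}_+} \Big[ \sum_{i\in\mathbb{N}_+} \alpha_{(i+1)j} f(\beta_{j(i+1)}) - \alpha_{1j} f(\beta_{j1}) \Big]\\
&= \sum_{i\in\mathbb{N}_+} \alpha_{(i+1)1} f(\beta_{1(i+1)}) + \sum_{j\in\mathbb{N}_+} \Big[ \sum_{i\in\mathbb{N}_+} \alpha_{(i+1)(j+1)} f(\beta_{(j+1)(i+1)}) - \alpha_{1j} f(\beta_{j1}) \Big].
\end{aligned} \tag{2.5.10}$$

Clearly, $\beta_{(j+1)(i+1)} < \beta_{j1}$ for any $i, j \in \mathbb{N}_+$. Hence

$$\operatorname{var} U^2 f \le f(1) \sum_{i\in\mathbb{N}_+} \alpha_{(i+1)1} + \sum_{j\in\mathbb{N}_+} f(\beta_{j1}) \Big[ \sum_{i\in\mathbb{N}_+} \alpha_{(i+1)(j+1)} - \alpha_{1j} \Big].$$

But

$$\begin{aligned}
\sum_{i\in\mathbb{N}_+} \alpha_{(i+1)(j+1)} &= \sum_{i\in\mathbb{N}_+} \frac{1}{\big((i+1)(j+1)+1\big)\big((i+1)(j+2)+1\big)}\\
&\le \frac{1}{(j+1)(j+2)} \sum_{i\in\mathbb{N}_+} \frac{1}{(i+1)^2}\\
&= (\zeta(2) - 1)\, \alpha_{1j} < \alpha_{1j}
\end{aligned} \tag{2.5.11}$$

for any $j \in \mathbb{N}_+$. Since $f(\beta_{j1}) \ge f(0)$, $j \in \mathbb{N}_+$, and

$$\sum_{j\in\mathbb{N}_+} \Big[ \sum_{i\in\mathbb{N}_+} \alpha_{(i+1)(j+1)} - \alpha_{1j} \Big] = - \sum_{i\in\mathbb{N}_+} \alpha_{(i+1)1},$$

(2.5.10) and (2.5.11) imply that

$$\operatorname{var} U^2 f \le \sum_{i\in\mathbb{N}_+} \alpha_{(i+1)1}\, (f(1) - f(0)) = \sum_{i\in\mathbb{N}_+} \alpha_{(i+1)1} \operatorname{var} f \tag{2.5.12}$$

for any non-decreasing $f \in B(I)$. Now, note that for $f = I_{(a,1]}$ with $a \in [1/2, 2/3)$, in particular for $a = g$, we have

$$\operatorname{var} U^2 I_{(a,1]} = \sum_{i\in\mathbb{N}_+} \alpha_{(i+1)1},$$

that is, the constant

$$\sum_{i\in\mathbb{N}_+} \alpha_{(i+1)1} = \sum_{i\in\mathbb{N}_+} \frac{1}{(i+2)(2i+3)} = 2 \sum_{i\in\mathbb{N}_+} \Big( \frac{1}{2i+3} - \frac{1}{2i+4} \Big) = 2 \Big( \log 2 - 1 + \frac12 - \frac13 + \frac14 \Big) = \log 4 - \frac76 = 0.21962 \cdots$$

occurring in (2.5.12) cannot be lowered. Therefore for $n = 2$ we have

$$v_2 = \log 4 - \frac76 = 0.21962 \cdots,$$

and the UB Conjecture holds in this case.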
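As a numerical sanity check of the case $n = 2$, the series for $v_2$ and inequality (2.5.11) can be evaluated by partial sums. This is only a sketch under stated assumptions: the truncation levels `N` and `M` are arbitrary choices made here, so the comparison with $\log 4 - 7/6$ is accurate only up to the neglected tail (roughly $1/(2N)$).

```python
import math

def alpha(i, j):
    """alpha_{ij} = P_{ij}(0) = 1 / ((ij + 1)(i(j + 1) + 1))."""
    return 1.0 / ((i * j + 1) * (i * (j + 1) + 1))

N = 200_000  # truncation level (an arbitrary choice for this sketch)

# v_2 = sum_{i >= 1} alpha_{(i+1)1}; compare with the closed form log 4 - 7/6.
v2_partial = math.fsum(alpha(i + 1, 1) for i in range(1, N + 1))
v2_closed = math.log(4) - 7.0 / 6.0
print(f"partial sum {v2_partial:.6f}, closed form {v2_closed:.6f}")

# Inequality (2.5.11): sum_i alpha_{(i+1)(j+1)} <= (zeta(2) - 1) * alpha_{1j}.
zeta2 = math.pi ** 2 / 6
M = 20_000
for j in range(1, 6):
    lhs = math.fsum(alpha(i + 1, j + 1) for i in range(1, M + 1))
    rhs = (zeta2 - 1) * alpha(1, j)
    print(j, f"{lhs:.6f}", "<=", f"{rhs:.6f}")
```

The partial sums on the left of (2.5.11) underestimate the full series, so the printed comparison illustrates the inequality rather than proving it.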
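The case $n \ge 3$. We could try to treat this case similarly to the case $n = 2$. Using (2.5.5) it is not difficult to generalize inequality (2.5.11) to

$$\sum_{i_1\in\mathbb{N}_+} \alpha_{(i_1+1)(i_2+1)i_3\cdots i_n} \le (\zeta(2) - 1)\, \alpha_{1i_2\cdots i_n} < \alpha_{1i_2\cdots i_n} \tag{2.5.13}$$

for any $n \ge 3$ and $i_2, \dots, i_n \in \mathbb{N}_+$. Next, to make a choice let us assume that $n$ is odd. Then it is easy to see that

$$\beta_{i_n\cdots i_3(i_2+1)(i_1+1)} > \beta_{i_n\cdots i_3i_21}, \qquad \beta_{i_n\cdots i_31(i_1+1)} > \beta_{i_n\cdots i_31}, \qquad \beta_{i_n\cdots i_3i_21} < \beta_{i_n\cdots i_3}$$

for any $i_1, \dots, i_n \in \mathbb{N}_+$. Then by (2.5.6) and (2.5.13) we have

$$\begin{aligned}
\operatorname{var} U^n f &\le \sum_{i_3,\dots,i_n\in\mathbb{N}_+} \Big[ - \sum_{i_1\in\mathbb{N}_+} \alpha_{(i_1+1)1i_3\cdots i_n} f\big(\beta_{i_n\cdots i_31(i_1+1)}\big) + \sum_{i_2\in\mathbb{N}_+} \Big( \alpha_{1i_2i_3\cdots i_n} - \sum_{i_1\in\mathbb{N}_+} \alpha_{(i_1+1)(i_2+1)i_3\cdots i_n} \Big) f(\beta_{i_n\cdots i_3i_21}) \Big]\\
&\le \sum_{i_3,\dots,i_n\in\mathbb{N}_+} \Big[ \sum_{i_2\in\mathbb{N}_+} \Big( \alpha_{1i_2i_3\cdots i_n} - \sum_{i_1\in\mathbb{N}_+} \alpha_{(i_1+1)i_2i_3\cdots i_n} \Big) f(\beta_{i_n\cdots i_3}) + \Big( \sum_{i_1\in\mathbb{N}_+} \alpha_{(i_1+1)1i_3\cdots i_n} \Big) \big( f(\beta_{i_n\cdots i_3}) - f(\beta_{i_n\cdots i_31}) \big) \Big].
\end{aligned} \tag{2.5.14}$$

For an even $n$ the corresponding inequality is

$$\operatorname{var} U^n f \le \sum_{i_3,\dots,i_n\in\mathbb{N}_+} \Big[ \sum_{i_2\in\mathbb{N}_+} \Big( \sum_{i_1\in\mathbb{N}_+} \alpha_{(i_1+1)i_2i_3\cdots i_n} - \alpha_{1i_2i_3\cdots i_n} \Big) f(\beta_{i_n\cdots i_3}) + \Big( \sum_{i_1\in\mathbb{N}_+} \alpha_{(i_1+1)1i_3\cdots i_n} \Big) \big( f(\beta_{i_n\cdots i_31}) - f(\beta_{i_n\cdots i_3}) \big) \Big]. \tag{2.5.15}$$

Put

$$\delta_{i_3\cdots i_n} = (-1)^{n-1} \sum_{i_2\in\mathbb{N}_+} \Big( \alpha_{1i_2i_3\cdots i_n} - \sum_{i_1\in\mathbb{N}_+} \alpha_{(i_1+1)i_2i_3\cdots i_n} \Big)$$

for any $i_3, \dots, i_n \in \mathbb{N}_+$. Note that

$$\sum_{i_3,\dots,i_n\in\mathbb{N}_+} \delta_{i_3\cdots i_n} = (-1)^{n-1} \Big( \alpha_1 - \sum_{i_1\in\mathbb{N}_+} \alpha_{i_1+1} \Big) = 0. \tag{2.5.16}$$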

Using (2.5.5), which implies

$$P_{i_1\cdots i_n}(0) = (-1)^n \left( \frac{1}{i_1 + \dfrac{p_n(i_2, \dots, i_n, 1)}{q_n(i_2, \dots, i_n, 1)}} - \frac{1}{i_1 + \dfrac{p_{n-1}(i_2, \dots, i_n)}{q_{n-1}(i_2, \dots, i_n)}} \right)$$

for any $n \ge 2$ and $i_1, \dots, i_n \in \mathbb{N}_+$, it is easy to see that $\delta_{i_3\cdots i_n}$ can be expressed in terms of the digamma function $\psi$ as

$$\delta_{i_3\cdots i_n} = \psi\Big( 2 + \frac{p'_{n-2}}{q'_{n-2}} \Big) - \psi\Big( 2 + \frac{p'_{n-2} + p'_{n-3}}{q'_{n-2} + q'_{n-3}} \Big) + \sum_{i\in\mathbb{N}_+} \left[ \psi\left( 2 + \frac{1}{i + \dfrac{p'_{n-2}}{q'_{n-2}}} \right) - \psi\left( 2 + \frac{1}{i + \dfrac{p'_{n-2} + p'_{n-3}}{q'_{n-2} + q'_{n-3}}} \right) \right],$$

where $p'_m = p_m(i_3, \dots, i_{m+2})$, $q'_m = q_m(i_3, \dots, i_{m+2})$, $m \in \mathbb{N}_+$, and $p'_0 = 0$, $q'_0 = 1$. Let us recall that the digamma function can be expressed by the convergent series

$$\psi(z) = -C + \sum_{j\in\mathbb{N}_+} \Big( \frac1j - \frac{1}{j+z-1} \Big) = -C + \sum_{j\in\mathbb{N}_+} \frac{z-1}{j(j+z-1)}$$

for $z \ne 0, -1, -2, \cdots$, where $C = 0.57721 \cdots$ is the Euler constant. As is well known, $\psi$ satisfies the equation

$$\psi(z+1) = \psi(z) + \frac1z$$

for $z \ne 0, -1, -2, \cdots$. Tables for $\psi$ can be found in Abramowitz and Stegun (1964).
Putting

$$\delta^{(n)}(f) = \sum_{i_3,\dots,i_n\in\mathbb{N}_+} \delta_{i_3\cdots i_n} f(\beta_{i_n\cdots i_3}),$$
inequalities (2.5.14) and (2.5.15) imply that

$$\operatorname{var} U^n f \le \delta^{(n)}(f) + \sum_{i_1\in\mathbb{N}_+} \alpha_{(i_1+1)1\cdots1} \operatorname{var} f \tag{2.5.17}$$

for any $n \ge 3$. Here we used the fact that $\alpha_{(i_1+1)1i_3\cdots i_n} < \alpha_{(i_1+1)11\cdots1}$ for any $n \ge 3$ and $(i_3, \dots, i_n) \ne 1(n-2)$, which follows at once from (2.5.5).
First, note that by (2.5.16) we have

$$\delta^{(n)}(f) \le \frac12 \sum_{i_3,\dots,i_n\in\mathbb{N}_+} |\delta_{i_3\cdots i_n}|\, (f(1) - f(0)). \tag{2.5.18}$$

Since

$$\frac12 \sum_{i_3,\dots,i_n\in\mathbb{N}_+} |\delta_{i_3\cdots i_n}| = \sup \sum_{(i_3,\dots,i_n)\in A} \delta_{i_3\cdots i_n},$$

where the supremum is taken over all $A \subset \mathbb{N}_+^{n-2}$, it follows that

$$\frac12 \sum_{i_3,\dots,i_n\in\mathbb{N}_+} |\delta_{i_3\cdots i_n}| \ge \frac12 \sum_{i\in\mathbb{N}_+} |\delta_i|.$$

Hence the right hand side of (2.5.17) does not tend to 0 as $n \to \infty$, and (2.5.18) is useless for $n \ge 3$. As a matter of fact, it is a general result which does not take into account that $f$ is non-decreasing.
If for some given $n \ge 3$ the inequality

$$\delta^{(n)}(I_{(a,1]}) \le \delta^{(n)}(I_{(g,1]}) \tag{2.5.19}$$

holds for any $a \in [0, 1)$, then by (2.5.17) we have

$$\operatorname{var} U^n I_{(a,1]} \le \delta^{(n)}(I_{(g,1]}) + \sum_{i_1\in\mathbb{N}_+} \alpha_{(i_1+1)1\cdots1} \tag{2.5.20}$$

for any $a \in [0, 1)$. It is easy to see that the right hand side of (2.5.20) is equal to $v'_n$. Since whatever $n \in \mathbb{N}_+$ we have

$$\operatorname{var} U^n I_{(g,1]} = v'_n,$$

it follows from (2.5.20) that $v_n = v'_n$. Thus if (2.5.19) holds then for the given $n$ the UB Conjecture holds, too.
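In particular for $n = 3$, writing $i, j, k$ for $i_1, i_2, i_3$, respectively, we have

$$\alpha_{ijk} = \frac{1}{\big(i(jk+1)+k\big)\big(i(j(k+1)+1)+k+1\big)}, \quad i, j, k \in \mathbb{N}_+.$$

It has been proved in Iosifescu (1994) that

$$\delta_k = \sum_{j\in\mathbb{N}_+} \Big( \alpha_{1jk} - \sum_{i\in\mathbb{N}_+} \alpha_{(i+1)jk} \Big)$$

is positive for $k = 1$ and negative for $k > 1$. Then (2.5.19) clearly holds in this case. Hence the UB Conjecture holds for $n = 3$ and

$$\begin{aligned}
v_3 &= \delta_1 + \sum_{i\in\mathbb{N}_+} \alpha_{(i+1)11}\\
&= \sum_{j\in\mathbb{N}_+} \Big( \frac{1}{(j+2)(2j+3)} + \psi\Big(2 + \frac{1}{j+1}\Big) - \psi\Big(2 + \frac{2}{2j+1}\Big) \Big) + \psi\Big(2 + \frac23\Big) - \psi\Big(2 + \frac12\Big)\\
&= \log 4 - \frac76 + \sum_{j\in\mathbb{N}_+} \Big( \psi\Big(2 + \frac{1}{j+1}\Big) - \psi\Big(2 + \frac{2}{2j+1}\Big) \Big) + \frac35 + \frac32 + \psi\Big(\frac23\Big) - \frac23 - 2 - \psi\Big(\frac12\Big)\\
&= \log 4 - \frac76 - \frac{17}{30} + \log\frac{4}{\sqrt{27}} + \frac{\pi}{2\sqrt3} + \sum_{j\in\mathbb{N}_+} \Big( \psi\Big(2 + \frac{1}{j+1}\Big) - \psi\Big(2 + \frac{2}{2j+1}\Big) \Big).
\end{aligned}$$

We have [see Iosifescu (1994, p.115)]

$$0.09104 < v_3 < 0.09759$$

while a computation using MATHEMATICA yields

$$0.09436 < v_3 < 0.09445.$$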
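The digamma expression for $v_3$ can be evaluated with a short script. The sketch below is not the computation referred to in the text: the `digamma` helper is an assumption of this example (upward recurrence $\psi(x) = \psi(x+1) - 1/x$ followed by the standard asymptotic series), and the series over $j$ is simply truncated after `terms` summands, which leaves an error of a few units in the sixth decimal place.

```python
import math

def digamma(x):
    """psi(x) for x > 0: shift x above 10 via psi(x) = psi(x + 1) - 1/x,
    then apply the asymptotic series ln x - 1/(2x) - 1/(12x^2) + 1/(120x^4) - 1/(252x^6)."""
    shift = 0.0
    while x < 10.0:
        shift -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = math.log(x) - 0.5 / x - inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return shift + series

def v3_partial(terms=100_000):
    """Truncated value of v_3 = log 4 - 7/6 + psi(2 + 2/3) - psi(2 + 1/2)
    + sum_j [psi(2 + 1/(j+1)) - psi(2 + 2/(2j+1))]."""
    head = math.log(4) - 7.0 / 6 + digamma(2 + 2.0 / 3) - digamma(2.5)
    series = math.fsum(
        digamma(2 + 1.0 / (j + 1)) - digamma(2 + 2.0 / (2 * j + 1))
        for j in range(1, terms + 1)
    )
    return head + series

v3 = v3_partial()
print(f"v_3 ~ {v3:.5f}")
```

The summands are negative and decay like $1/j^2$, so the truncated value slightly overestimates $v_3$; the printed figure can be compared with the two enclosures quoted above.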

Returning to the general case, a good upper bound for $v_n$, $n \in \mathbb{N}_+$, is available. For a lower bound see further Corollary 2.5.6.
Theorem 2.5.3 We have

$$v_n \le \frac{k_0}{F_n F_{n+1}} \tag{2.5.21}$$

for any $n \in \mathbb{N}_+$. Here and throughout the remainder of this section, $k_0$ is a constant not exceeding 14.8.

Proof. Clearly, (2.5.21) holds for $n = 1, 2, 3$ as was shown before.

By Corollary 2.5.2 and on account of the constancy of the function $a \to \operatorname{var} U^n I_{(a,1]}$ on any fundamental interval of order $n$, we have

$$v_n = \sup_{a\in\Omega} \operatorname{var} U^n I_{(a,1]}, \quad n \in \mathbb{N}_+.$$

If to make a choice we assume that $n \in \mathbb{N}_+$ is odd, then by Proposition 2.1.11 and equation (2.5.3) for any $a \in I$ we have

$$\begin{aligned}
\operatorname{var} U^n I_{(a,1]} &= U^n I_{(a,1]}(0) - U^n I_{(a,1]}(1)\\
&= \sum_{i^{(n)}\in\mathbb{N}_+^n} \big( \gamma_0(I(i^{(n)})) - \gamma_1(I(i^{(n)})) \big)\, I_{(a,1]}(u_{i_n\cdots i_1}(1))\\
&\quad + \sum_{i^{(n)}\in\mathbb{N}_+^n} \gamma_0(I(i^{(n)})) \big( I_{(a,1]}(u_{i_n\cdots i_1}(0)) - I_{(a,1]}(u_{i_n\cdots i_1}(1)) \big).
\end{aligned} \tag{2.5.22}$$

Note that if $a \in \Omega$ then just one of the differences

$$I_{(a,1]}(u_{i_n\cdots i_1}(0)) - I_{(a,1]}(u_{i_n\cdots i_1}(1)), \quad i^{(n)} \in \mathbb{N}_+^n,$$

is $\ne 0$ (and equal to 1). Also, for an arbitrarily given $a = [j_1, j_2, \cdots] \in \Omega$ the set

$$\big\{ i^{(n)} \in \mathbb{N}_+^n : u_{i_n\cdots i_1}(1) > a \big\}$$

consists of the $i^{(n)} = (i_1, \dots, i_n) \in \mathbb{N}_+^n$ satisfying

$(i_1 < j_1)$, if $n = 1$;

$(i_3 < j_1) \cup (i_3 = j_1, i_2 > j_2) \cup (i_3 = j_1, i_2 = j_2, i_1 < j_3)$, if $n = 3$;

$(i_n < j_1) \cup (i_n = j_1, i_{n-1} > j_2) \cup (i_n = j_1, i_{n-1} = j_2, i_{n-2} < j_3) \cup \cdots \cup (i_n = j_1, \dots, i_3 = j_{n-2}, i_2 > j_{n-1}) \cup (i_n = j_1, \dots, i_2 = j_{n-1}, i_1 < j_n)$, if $n \ge 5$.

Therefore, putting $\mu = \gamma_0 - \gamma_1$, it follows from (2.5.22) that for
$a = [j_1, j_2, \dots] \in \Omega$ and any odd $n \ge 5$ we have

$$\begin{aligned}
\operatorname{var} U^n I_{(a,1]} \le {} & |\mu(a_n < j_1)| + |\mu(a_n = j_1, a_{n-1} > j_2)|\\
& + |\mu(a_n = j_1, a_{n-1} = j_2, a_{n-2} < j_3)| + \cdots\\
& + |\mu(a_n = j_1, \dots, a_3 = j_{n-2}, a_2 > j_{n-1})|\\
& + |\mu(a_n = j_1, \dots, a_2 = j_{n-1}, a_1 < j_n)|\\
& + \max_{i^{(n)}\in\mathbb{N}_+^n} \gamma_0(I(i^{(n)})).
\end{aligned} \tag{2.5.23}$$

We shall use the inequalities

$$\begin{aligned}
|\gamma_0(A) - \gamma_1(A)| &\le (\log 2)\, \gamma(A),\\
|\gamma_a(\tau^{-n}(A)) - \gamma(A)| &\le (\zeta(2) \log 2 - 1)\, \lambda_0^{n-1}\, \gamma(A),
\end{aligned} \tag{2.5.24}$$

which are valid for any $a \in I$, $A \in \mathcal{B}_I$, and $n \in \mathbb{N}_+$, with $\lambda_0 = 0.303663 \cdots$ (Wirsing's constant).
The first inequality follows by integrating over $A$ the double inequality

$$-\frac{1}{x+1} \le 1 - \frac{2}{(x+1)^2} \le \frac{1}{x+1}, \quad x \in I,$$

while the second one is the inequality in Theorem 2.3.5 for $\ell = 2$. Note that

$$\begin{aligned}
(a_n < j_1) &= \tau^{-n+1}(a_1 < j_1),\\
(a_n = j_1, a_{n-1} > j_2) &= \tau^{-n+2}(a_2 = j_1, a_1 > j_2),\\
(a_n = j_1, a_{n-1} = j_2, a_{n-2} < j_3) &= \tau^{-n+3}(a_3 = j_1, a_2 = j_2, a_1 < j_3),\\
&\ \ \vdots\\
(a_n = j_1, \dots, a_3 = j_{n-2}, a_2 > j_{n-1}) &= \tau^{-1}(a_{n-1} = j_1, \dots, a_2 = j_{n-2}, a_1 > j_{n-1})
\end{aligned}$$

and

$$\begin{aligned}
(a_2 = j_1, a_1 > j_2) &\subset (a_2 = j_1),\\
(a_3 = j_1, a_2 = j_2, a_1 < j_3) &\subset (a_3 = j_1, a_2 = j_2),\\
&\ \ \vdots\\
(a_{n-1} = j_1, \dots, a_2 = j_{n-2}, a_1 > j_{n-1}) &\subset (a_{n-1} = j_1, \dots, a_2 = j_{n-2}),\\
(a_n = j_1, \dots, a_2 = j_{n-1}, a_1 < j_n) &\subset (a_n = j_1, \dots, a_2 = j_{n-1}).
\end{aligned}$$

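Next, by Theorem 1.2.2 we have

$$\max_{i^{(n)}\in\mathbb{N}_+^n} \gamma_0(I(i^{(n)})) = \frac{1}{F_n F_{n+1}} \ (:= \sigma(n)), \quad n \in \mathbb{N}_+,$$

and then

$$\max_{i^{(n)}\in\mathbb{N}_+^n} \gamma(I(i^{(n)})) \le \frac{2}{3 \log 2}\, \sigma(n), \quad n \in \mathbb{N}_+. \tag{2.5.26}$$

Now, by (2.5.24) through (2.5.26), with $k_1 = \log 2 = 0.69315 \cdots$, $k_2 = \zeta(2) \log 2 - 1 = 0.14018 \cdots$, $\theta = g^2 = (\sqrt5 - 1)^2/4 = (3 - \sqrt5)/2 = 0.38196 \cdots$, it follows from (2.5.23) that

$$\begin{aligned}
\operatorname{var} U^n I_{(a,1]} &\le \frac{4k_2}{3 \log 2} \Big( \lambda_0^{n-2} \sigma(0) + \lambda_0^{n-3} \sigma(1) + \cdots + \sigma(n-2) + \frac{k_1}{2k_2}\, \sigma(n-1) \Big) + \sigma(n)\\
&= \frac{4k_2}{3 \log 2}\, \sigma(n-1) \Big( \lambda_0^{n-2} \frac{\sigma(0)}{\sigma(n-1)} + \lambda_0^{n-3} \frac{\sigma(1)}{\sigma(n-1)} + \cdots + \frac{\sigma(n-2)}{\sigma(n-1)} + \frac{k_1}{2k_2} \Big) + \sigma(n).
\end{aligned}$$

Since

$$\frac{\sigma(k)}{\sigma(n)} \le \frac12\, \theta^{k-n-1}, \quad k, n \in \mathbb{N},$$

and

$$\frac{\sigma(n-1)}{\sigma(n)} \le \frac83, \quad n \ge 3,$$

we finally obtain

$$\operatorname{var} U^n I_{(a,1]} \le \Big( 1 + \frac{16 k_1}{9 \log 2} + \frac{16 k_2}{9\, \theta(\theta - \lambda_0) \log 2} \Big) \sigma(n).$$

We have

$$\begin{aligned}
1 + \frac{16 k_1}{9 \log 2} + \frac{16 k_2}{9\, \theta(\theta - \lambda_0) \log 2} &= 1 + \frac{16}{9 \log 2} \Big( k_1 + \frac{k_2}{\theta(\theta - \lambda_0)} \Big)\\
&= 1 + \frac{32}{9 \log 2} \Big( \frac{\log 2}{2} + \frac{\zeta(2) \log 2 - 1}{7 - 3\sqrt5 - (3 - \sqrt5)\lambda_0} \Big)\\
&= 14.780 \cdots < 14.8
\end{aligned}$$

and the proof is complete for any odd $n$. The case of an even $n$ can be treated similarly. □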
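The constant appearing at the end of the proof is easy to re-evaluate. The following sketch only checks that the expression stays below 14.8; the variable names are illustrative, and Wirsing's constant is entered with the six decimals quoted in the text, which is an assumption about sufficient precision.

```python
import math

LAMBDA0 = 0.303663          # Wirsing's constant, truncated as quoted in the text
ZETA2 = math.pi ** 2 / 6    # zeta(2)

k1 = math.log(2)
k2 = ZETA2 * math.log(2) - 1
theta = (3 - math.sqrt(5)) / 2   # theta = g^2

# 1 + 16 k1 / (9 log 2) + 16 k2 / (9 theta (theta - lambda0) log 2)
k0_bound = 1 + 16 / (9 * math.log(2)) * (k1 + k2 / (theta * (theta - LAMBDA0)))

# Equivalent second form used in the text.
k0_alt = 1 + 32 / (9 * math.log(2)) * (
    math.log(2) / 2 + k2 / (7 - 3 * math.sqrt(5) - (3 - math.sqrt(5)) * LAMBDA0)
)

print(f"k1 = {k1:.5f}, k2 = {k2:.5f}, theta = {theta:.5f}")
print(f"constant = {k0_bound:.4f} (alternative form {k0_alt:.4f})")
```

The two forms agree because $\theta(\theta - \lambda_0) = \big(7 - 3\sqrt5 - (3 - \sqrt5)\lambda_0\big)/2$.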
Corollary 2.5.4 Let $f \in BV(I)$. For any $n \in \mathbb{N}$ we have

$$\| U^n f - U^\infty f \| \le \frac{k_0 \operatorname{var} f}{F_n F_{n+1}}.$$

Proof. By (2.1.12) and Proposition 2.0.1 (i) we have

$$\| U^n f - U^\infty f \| \le \operatorname{var} U^n f, \quad n \in \mathbb{N},$$

and the result stated is implied by Theorem 2.5.3 for $n \in \mathbb{N}_+$. The case $n = 0$ can be checked directly. □
Remark. It was claimed in Iosifescu (1997, p.76) that Theorem 2.5.3 holds with $k_0 = 1/\log 2$ for all $n \in \mathbb{N}$ large enough. (This is clearly true for $n = 1, 2$, or 3.) A flaw detected by Adriana Berechet in the method of proof in that paper invalidates the conclusion. We conjecture, however, that both Theorem 2.5.3 and Corollary 2.5.4 hold with $k_0 = 1/\log 2$ for any $n \in \mathbb{N}$. □

2.5.3 Two asymptotic distributions


We are now able to derive the asymptotic behaviour of $\gamma_a(s_n^a \le x)$ as $n \to \infty$ for any $a, x \in I$.
Theorem 2.5.5 For any $a \in I$ and $n \in \mathbb{N}$ we have

$$\frac{a+1}{2(F_n + aF_{n-1})(F_{n+1} + aF_n)} \le \sup_{x\in I} |\gamma_a(s_n^a \le x) - G(x)| \le \frac{k_0}{F_n F_{n+1}}.$$

Proof. (i) The upper bound. We have already used in Subsection 2.5.1 the property of $U$ of being the transition operator of the Markov chain $(s_n^a)_{n\in\mathbb{N}}$ for any $a \in I$. Therefore in particular

$$U^n I_{[0,x]}(a) = E_a\big( I_{[0,x]}(s_n^a) \big) = \gamma_a(s_n^a \le x)$$

for any $a, x \in I$ and $n \in \mathbb{N}$. As

$$U^\infty I_{[0,x]} = \int_I I_{[0,x]}\, d\gamma = \gamma([0, x]) = G(x), \quad x \in I,$$

Corollary 2.5.4 yields the upper bound announced.

(ii) The lower bound. We start with two simple remarks. First, using the continuity of $G$ and the equations $\lim_{h\downarrow0} \gamma_a(s_n^a \le x - h) = \gamma_a(s_n^a < x)$ and $\lim_{h\downarrow0} \gamma_a(s_n^a < x + h) = \gamma_a(s_n^a \le x)$, $x \in I$, it is easy to see that

$$\sup_{x\in I} |\gamma_a(s_n^a \le x) - G(x)| = \sup_{x\in I} |\gamma_a(s_n^a < x) - G(x)|$$

for any $a \in I$ and $n \in \mathbb{N}$. Second, for any $s \in I$ we have

$$\begin{aligned}
\gamma_a(s_n^a = s) &= \gamma_a(s_n^a \le s) - G(s) - \big( \gamma_a(s_n^a < s) - G(s) \big)\\
&\le \sup_{x\in I} |\gamma_a(s_n^a \le x) - G(x)| + \sup_{x\in I} |\gamma_a(s_n^a < x) - G(x)|\\
&= 2 \sup_{x\in I} |\gamma_a(s_n^a \le x) - G(x)|.
\end{aligned}$$

Hence

$$\sup_{x\in I} |\gamma_a(s_n^a \le x) - G(x)| \ge \frac12 \sup_{s\in I} \gamma_a(s_n^a = s) \tag{2.5.27}$$

for any $a \in I$ and $n \in \mathbb{N}$.
Next, recall (see Subsection 2.5.1) that

$$\gamma_a\big(s_n^a = [i_n, \dots, i_2, i_1 + a]\big) = \gamma_a(I(i^{(n)})) = P_{i_1\cdots i_n}(a), \quad n \ge 2,$$

$$\gamma_a\Big( s_1^a = \frac{1}{i_1 + a} \Big) = \gamma_a(I(i_1)) = P_{i_1}(a)$$

for any $a \in I$ and $(i_1, \dots, i_n) = i^{(n)} \in \mathbb{N}_+^n$. By (2.5.5) and (2.5.27) we then have

$$\sup_{s\in I} \gamma_a(s_n^a = s) = P_{1(n)}(a), \quad a \in I,\ n \in \mathbb{N}_+, \tag{2.5.28}$$

where we write $1(n)$ for $(i_1, \dots, i_n)$ with $i_1 = \cdots = i_n = 1$, $n \in \mathbb{N}_+$. With the convention $F_{-1} = 0$, by equation (2.5.5) again,

$$P_{1(n)}(a) = \frac{a+1}{\big((a+1)F_{n-1} + F_{n-2}\big)\big((a+1)F_n + F_{n-1}\big)} = \frac{a+1}{(F_n + aF_{n-1})(F_{n+1} + aF_n)}, \quad a \in I,\ n \in \mathbb{N}_+. \tag{2.5.29}$$

The lower bound announced now follows from (2.5.27) through (2.5.29). The case $n = 0$ can be checked directly. □
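Formula (2.5.29) can be confirmed in exact arithmetic by multiplying out the factors $P_1$ along the orbit $a \mapsto 1/(a+1)$. The sketch below assumes the Fibonacci indexing suggested by the text ($F_{-1} = 0$, $F_0 = F_1 = 1$); the function names are illustrative.

```python
from fractions import Fraction

def fib(n):
    """Fibonacci numbers with F_{-1} = 0, F_0 = F_1 = 1 (indexing assumed from the text)."""
    if n == -1:
        return 0
    a, b = 0, 1  # F_{-1}, F_0
    for _ in range(n):
        a, b = b, a + b
    return b

def P_ones_product(n, a):
    """P_{1(n)}(a) as a product of P_1 along the orbit x -> u_1(x) = 1/(x + 1)."""
    prob, x = Fraction(1), Fraction(a)
    for _ in range(n):
        prob *= 1 / (x + 2)  # P_1(x) = (x + 1) / ((x + 1)(x + 2)) = 1 / (x + 2)
        x = 1 / (x + 1)
    return prob

def P_ones_closed(n, a):
    """Right-hand side of (2.5.29)."""
    a = Fraction(a)
    return (a + 1) / ((fib(n) + a * fib(n - 1)) * (fib(n + 1) + a * fib(n)))

for n in range(1, 11):
    for a in (Fraction(0), Fraction(1, 3), Fraction(1)):
        assert P_ones_product(n, a) == P_ones_closed(n, a)
print(P_ones_closed(3, 0), P_ones_closed(3, 1))
```

At $a = 1$ the closed form reduces to $2/(F_{n+1}F_{n+2})$, which is the smallest value of $P_{1(n)}$ on $I$ if the function is decreasing in $a$.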
Remarks. 1. It is easy to see that $P_{1(n)}(\cdot)$ is a decreasing function. Hence

$$P_{1(n)}(a) \ge P_{1(n)}(1) = \frac{2}{F_{n+1} F_{n+2}} \tag{2.5.30}$$

for any $a \in I$ and $n \in \mathbb{N}_+$.
2. Both lower and upper bounds in Theorem 2.5.5 are $O(g^{2n})$ as $n \to \infty$ with $g = (\sqrt5 - 1)/2$, $g^2 = (3 - \sqrt5)/2 = 0.38196 \cdots$. Thus the optimal convergence rate has been obtained. □
Corollary 2.5.6 For any $n \in \mathbb{N}_+$ we have

$$v_n \ge \frac{2}{F_{n+1} F_{n+2}}.$$

Proof. As noted in the proof of Theorem 2.5.5, we have

$$\gamma_a(s_n^a \le x) = U^n I_{[0,x]}(a), \qquad G(x) = U^\infty I_{[0,x]}$$

for any $a, x \in I$ and $n \in \mathbb{N}$. Then Theorem 2.5.5, inequality (2.5.30), and the argument used in the proof of Corollary 2.5.4 yield

$$\frac{2}{F_{n+1} F_{n+2}} \le \sup_{x\in I} \| U^n I_{[0,x]} - U^\infty I_{[0,x]} \| \le \sup_{x\in I} \operatorname{var} U^n I_{[0,x]} \tag{2.5.31}$$

for any $n \in \mathbb{N}$. By Corollary 2.5.2 the proof is complete. □

Remark. Theorem 2.5.3 and Corollary 2.5.6 show that $v_n = O(g^{2n})$ as $n \to \infty$, and this convergence rate is optimal. □
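To see how the bounds of Theorem 2.5.3 and Corollary 2.5.6 sit around the values of $v_n$ known from Subsection 2.5.2, one can tabulate them for $n = 1, 2, 3$. This is an illustrative sketch: the Fibonacci indexing ($F_0 = F_1 = 1$) is an assumption read off from the text, and for $v_3$ only the numerical enclosure quoted earlier is used, not an exact value.

```python
import math

def fib(n):
    """F_0 = F_1 = 1, F_{n+1} = F_n + F_{n-1} (indexing assumed from the text)."""
    a, b = 1, 1
    for _ in range(n):
        a, b = b, a + b
    return a

K0_PROVEN = 14.8                    # constant of Theorem 2.5.3
K0_CONJECTURED = 1 / math.log(2)    # conjectured constant from the Remark after Corollary 2.5.4

# (lower, upper) enclosures of v_n taken from the text.
known = {
    1: (0.5, 0.5),
    2: (math.log(4) - 7 / 6, math.log(4) - 7 / 6),
    3: (0.09436, 0.09445),
}

for n, (lo, hi) in known.items():
    lower = 2 / (fib(n + 1) * fib(n + 2))           # Corollary 2.5.6
    conj = K0_CONJECTURED / (fib(n) * fib(n + 1))   # conjectured upper bound
    upper = K0_PROVEN / (fib(n) * fib(n + 1))       # Theorem 2.5.3
    print(f"n={n}: {lower:.5f} <= v_n in [{lo:.5f}, {hi:.5f}] <= {conj:.5f} <= {upper:.5f}")

# Successive bounds shrink by the factor F_n / F_{n+2}, which tends to g^2.
g2 = (3 - math.sqrt(5)) / 2
print(fib(30) / fib(32), g2)
```

The last line illustrates why both bounds are $O(g^{2n})$: the ratio $F_n F_{n+1} / (F_{n+1} F_{n+2}) = F_n / F_{n+2}$ converges to $g^2$.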
Corollary 2.5.7 The spectral radius of the operator $U - U^\infty$ in $BV(I)$ equals $g^2 = (3 - \sqrt5)/2 = 0.38196 \cdots$.
Proof. We should show that

$$\lim_{n\to\infty} \| U^n - U^\infty \|_v^{1/n} = \lim_{n\to\infty} \Big( \sup_{0 \ne f\in BV(I)} \frac{\| U^n f - U^\infty f \|_v}{\| f \|_v} \Big)^{1/n} = g^2.$$

The argument used in the proof of Corollary 2.5.4, and Theorem 2.5.3 yield

$$\| U^n f - U^\infty f \|_v = \| U^n f - U^\infty f \| + \operatorname{var} U^n f \le 2 \operatorname{var} U^n f \le \frac{4k_0}{F_n F_{n+1}} \operatorname{var} f \le \frac{4k_0}{F_n F_{n+1}} \| f \|_v$$

for any $n \in \mathbb{N}$ and $f \in BV(I)$. (We took into account that, as mentioned at the beginning of this section, here $f$ is complex-valued. See the proof of Proposition 2.1.16.) Hence

$$\lim_{n\to\infty} \| U^n - U^\infty \|_v^{1/n} \le g^2.$$

The converse inequality follows by taking $f = I_{[0,x]}$, $x \in I$, and using (2.5.31). □

Theorem 2.5.5 allows a quick derivation of the asymptotic behaviour of

$$\gamma_a(\tau^n \le x, s_n^a \le y)$$

as $n \to \infty$ for any $a, x, y \in I$, and of the (optimal) convergence rate, the same as above.

Theorem 2.5.8 For any $a \in I$ and $n \in \mathbb{N}$ we have

$$\frac{a+1}{2(F_n + aF_{n-1})(F_{n+1} + aF_n)} \le \sup_{x,y\in I} \Big| \gamma_a(\tau^n \le x, s_n^a \le y) - \frac{\log(xy+1)}{\log 2} \Big| \le \frac{k_0}{F_n F_{n+1}}.$$

Proof. Set $G_n^a(y) = \gamma_a(s_n^a \le y)$, $H_n^a(y) = G_n^a(y) - G(y)$, $a, y \in I$, $n \in \mathbb{N}$. Theorem 2.5.5 yields

$$|H_n^a(y)| \le \frac{k_0}{F_n F_{n+1}}, \quad a, y \in I,\ n \in \mathbb{N}. \tag{2.5.32}$$

By the generalized Brodén–Borel–Lévy formula (1.3.21), for any $a, x, y \in I$ and $n \in \mathbb{N}$ we have

$$\begin{aligned}
\gamma_a(\tau^n \le x, s_n^a \le y) &= \int_0^y \gamma_a(\tau^n \le x \mid s_n^a = z)\, dG_n^a(z)\\
&= \int_0^y \frac{(z+1)x}{zx+1}\, dG_n^a(z)\\
&= \frac{1}{\log 2} \int_0^y \frac{(z+1)x}{zx+1}\, \frac{dz}{z+1} + \int_0^y \frac{(z+1)x}{zx+1}\, dH_n^a(z)\\
&= \frac{\log(xy+1)}{\log 2} + \frac{(z+1)x}{zx+1}\, H_n^a(z) \Big|_{z=0}^{z=y} - \int_0^y \frac{x - x^2}{(zx+1)^2}\, H_n^a(z)\, dz.
\end{aligned}$$

[When applying formula (1.3.21) we used the fact that the $\sigma$-algebras generated by $(a_1, \dots, a_n)$ and by $s_n^a$ are identical for any $a \in I$ and $n \in \mathbb{N}_+$.] Hence, by (2.5.32),

$$\Big| \gamma_a(\tau^n \le x, s_n^a \le y) - \frac{\log(xy+1)}{\log 2} \Big| \le \frac{k_0}{F_n F_{n+1}} \Big( \frac{(y+1)x}{xy+1} + \frac{(x - x^2)y}{xy+1} \Big) \le \frac{k_0}{F_n F_{n+1}}$$

for any $a, x, y \in I$ and $n \in \mathbb{N}$, so that the upper bound holds.
To get the lower bound we note that by Theorem 2.5.5 for any $a \in I$ and $n \in \mathbb{N}$ we have

$$\begin{aligned}
\sup_{x,y\in I} \Big| \gamma_a(\tau^n \le x, s_n^a \le y) - \frac{\log(xy+1)}{\log 2} \Big| &\ge \sup_{y\in I} \Big| \gamma_a(\tau^n \le 1, s_n^a \le y) - \frac{\log(y+1)}{\log 2} \Big|\\
&= \sup_{y\in I} |\gamma_a(s_n^a \le y) - G(y)| \ge \frac{a+1}{2(F_n + aF_{n-1})(F_{n+1} + aF_n)}.
\end{aligned}$$

□
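The main term in the proof comes from integrating the conditional distribution $(z+1)x/(zx+1)$ against Gauss' measure. A small numerical sketch of that step follows; the midpoint rule, the step count, and the function names are choices made for this example and are not part of the text.

```python
import math

def kernel(z, x):
    """Conditional distribution gamma_a(tau^n <= x | s_n^a = z) = (z + 1)x / (zx + 1)."""
    return (z + 1) * x / (z * x + 1)

def gauss_density(z):
    """Density of Gauss' measure: G'(z) = 1 / ((z + 1) log 2)."""
    return 1.0 / ((z + 1) * math.log(2))

def joint_limit_numeric(x, y, steps=20_000):
    """Midpoint-rule approximation of int_0^y kernel(z, x) dG(z)."""
    h = y / steps
    return h * math.fsum(
        kernel((k + 0.5) * h, x) * gauss_density((k + 0.5) * h) for k in range(steps)
    )

def joint_limit_closed(x, y):
    """Closed form log(xy + 1) / log 2."""
    return math.log(x * y + 1) / math.log(2)

for x, y in [(0.3, 0.7), (1.0, 1.0), (0.5, 0.25)]:
    print(x, y, joint_limit_numeric(x, y), joint_limit_closed(x, y))
```

The factor $z + 1$ cancels between the kernel and the density, leaving $\int_0^y x/(zx+1)\,dz = \log(xy+1)$, which is what the comparison confirms.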
Remarks. 1. We can replace $\gamma_a(\tau^n \le x, s_n^a \le y)$ by $\lambda(\tau^n \le x, s_n^a \le y)$ in the statement of Theorem 2.5.8 since it is possible to relate these quantities by noticing that $|s_n^a - s_n^0| \le 1/F_n^2$, $n \in \mathbb{N}$, $a \in I$. The new upper and lower bounds are of order $O(g^{2n})$ as $n \to \infty$, too.
2. As noted at the end of Subsection 1.3.3, $\log(xy+1)/\log 2$, $x, y \in I$, is the joint distribution function under $\bar\gamma$ of the extended random variables $\bar\tau^n$ and $\bar s_n$. □

2.5.4 A generalization of a result of A. Denjoy


Sixty five years ago, A. Denjoy published a Comptes Rendus Note [see Denjoy (1936 b)] in which he sketched a proof of the fact that (in our notation)

$$\lim_{n\to\infty} \lambda\big([a_1, \dots, a_n] \le x,\ s_n^0 \le y\big) = \frac{x \log(y+1)}{\log 2} \tag{2.5.33}$$

uniformly with respect to $x, y \in I$. Of course, for $x = 1$ this follows at once from Theorem 2.5.5. In this subsection we prove that (2.5.33) holds with $\lambda$ replaced by any probability measure $\mu$ on $\mathcal{B}_I$ absolutely continuous with respect to $\lambda$, in particular with $\lambda$ replaced by any $\gamma_a$, $a \in (0, 1]$. An estimate of the convergence rate is also given. These will follow from Theorem 2.5.9 below.
Since $|[a_1, \dots, a_n] - \tau^0| \le (F_n F_{n+1})^{-1}$, $n \in \mathbb{N}_+$, it is easy to see that for any probability measure $\mu$ on $\mathcal{B}_I$ absolutely continuous with respect to $\lambda$, we have

$$\begin{aligned}
& \big| \mu([a_1, \dots, a_n] \le x, s_n^0 \le y) - \mu(\tau^0 \le x, s_n^0 \le y) \big|\\
& \quad \le \max\big( \mu(x - (F_n F_{n+1})^{-1} < \tau^0 \le x),\ \mu(x < \tau^0 \le x + (F_n F_{n+1})^{-1}) \big) \to 0
\end{aligned}$$

uniformly with respect to $x, y \in I$ as $n \to \infty$. This allows us to replace $[a_1, \dots, a_n]$ by $\tau^0$ in (2.5.33) and its generalizations.
Fix $a \in I$ arbitrarily. Let $f$ be a $\lambda$-integrable complex-valued function on $I$. Since $\gamma_a$ is equivalent to $\lambda$ for any $a \in I$, $f$ is $\gamma_a$-integrable, too. Denote by $E_k$, $k \in \mathbb{N}$, the set consisting of the endpoints of all fundamental intervals of rank $\ell$, $0 \le \ell \le k$. For any $n \in \mathbb{N}$ we associate with $f$ a function $f_n^a$ which has a constant value on each fundamental interval of rank $n$. Specifically, $f_0^a = \int_I f\, d\gamma_a$ and

$$f_n^a(x) = \frac{1}{\gamma_a(I(i^{(n)}))} \int_{I(i^{(n)})} f\, d\gamma_a, \quad x \in I(i^{(n)}),\ i^{(n)} \in \mathbb{N}_+^n,$$

for $n \in \mathbb{N}_+$. Clearly,

$$\int_I f_n^a\, d\gamma_a = \int_I f\, d\gamma_a, \quad n \in \mathbb{N}. \tag{2.5.34}$$

Since for any $n \in \mathbb{N}_+$ and $x \in I \setminus E_n$ there is a unique $i^{(n)} \in \mathbb{N}_+^n$ such that $x \in I(i^{(n)})$ and since

$$\max_{i^{(n)}\in\mathbb{N}_+^n} \gamma_a(I(i^{(n)})) \to 0$$

as $n \to \infty$, by a well known property of the Lebesgue integral we have

$$\lim_{n\to\infty} f_n^a(x) = f(x) \tag{2.5.35}$$

a.e. in $I$. It follows from (2.5.34) and (2.5.35) that

$$\lim_{n\to\infty} \int_I |f - f_n^a|\, d\gamma_a = 0. \tag{2.5.36}$$

By (2.5.36) the right hand side of (2.5.37) below converges to 0 as $n \to \infty$.

Remark. It is easy to check that $(f_n^a)_{n\in\mathbb{N}}$ is a martingale on $(I, \mathcal{B}_I, \gamma_a)$ whatever $a \in I$. □
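Theorem 2.5.9 Let $f$ be a $\lambda$-integrable complex-valued function on $I$ and let $h \in BV(I)$ be real-valued. Then

$$\Big| \int_I f\, (h \circ s_n^a)\, d\gamma_a - \int_I f\, d\gamma_a \int_I h\, d\gamma \Big| \le \inf_{0\le k\le n} \Big( \| h \| \int_I |f - f_k^a|\, d\gamma_a + \frac{k_0 \operatorname{var} h}{F_{n-k} F_{n-k+1}} \int_I |f|\, d\gamma_a \Big) \tag{2.5.37}$$

for any $a \in I$ and $n \in \mathbb{N}$.

Proof. For any $a \in I$ and $k, n \in \mathbb{N}_+$, $k \le n$, we have

$$\int_I f\, (h \circ s_n^a)\, d\gamma_a = \sum_{i^{(k)}\in\mathbb{N}_+^k} \Big( \int_{I(i^{(k)})} (f - f_k^a)(h \circ s_n^a)\, d\gamma_a + \int_{I(i^{(k)})} f_k^a\, (h \circ s_n^a)\, d\gamma_a \Big). \tag{2.5.38}$$

Clearly,

$$\Big| \sum_{i^{(k)}\in\mathbb{N}_+^k} \int_{I(i^{(k)})} (f - f_k^a)(h \circ s_n^a)\, d\gamma_a \Big| \le \| h \| \int_I |f - f_k^a|\, d\gamma_a. \tag{2.5.39}$$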
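Next, for any fixed $i^{(k)} \in \mathbb{N}_+^k$ we can write

$$\int_{I(i^{(k)})} f_k^a\, (h \circ s_n^a)\, d\gamma_a = \frac{1}{\gamma_a(I(i^{(k)}))} \int_{I(i^{(k)})} f\, d\gamma_a \int_{I(i^{(k)})} (h \circ s_n^a)\, d\gamma_a. \tag{2.5.40}$$

It is easy to check that

$$\gamma_a(I(i^{(k)})) = \frac{a+1}{(q_k + ap_k)\big(q_k + q_{k-1} + a(p_k + p_{k-1})\big)},$$

where

$$\frac{p_k}{q_k} = [i_1, \dots, i_k], \quad \text{g.c.d.}\,(p_k, q_k) = 1, \quad k \in \mathbb{N}_+,$$

and $p_0 = 0$, $q_0 = 1$. With the change of variable

$$u = \frac{p_k + t\, p_{k-1}}{q_k + t\, q_{k-1}}, \quad t \in I,$$

noting that

$$s_n^a(u) = s_{n-k}^{a'}(t)$$

for $t \in \Omega$, where

$$a' = \begin{cases} [i_k, \dots, i_2, i_1 + a] & \text{if } k > 1,\\ 1/(i_1 + a) & \text{if } k = 1 \end{cases} \;=\; \frac{q_{k-1} + ap_{k-1}}{q_k + ap_k},$$

we obtain

$$\int_{I(i^{(k)})} h(s_n^a(u))\, \gamma_a(du) = (a+1) \int_{I(i^{(k)})} \frac{h(s_n^a(u))\, du}{(au+1)^2} = (a+1) \int_I \frac{h(s_{n-k}^{a'}(t))\, dt}{\big(t(q_{k-1} + ap_{k-1}) + q_k + ap_k\big)^2}.$$

Hence

$$\frac{1}{\gamma_a(I(i^{(k)}))} \int_{I(i^{(k)})} h(s_n^a(u))\, \gamma_a(du) = (a'+1) \int_I \frac{h(s_{n-k}^{a'}(t))\, dt}{(a't+1)^2} = \int_I (h \circ s_{n-k}^{a'})\, d\gamma_{a'} = \int_I h(v)\, dG_{n-k}^{a'}(v), \tag{2.5.41}$$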
where $G_m^{a'}(v) = \gamma_{a'}(s_m^{a'} < v)$, $m \in \mathbb{N}$, $v \in I$. By Theorem 2.5.5 we have

$$|G_m^a(v) - G(v)| \le \frac{k_0}{F_m F_{m+1}}$$

for any $a, v \in I$ and $m \in \mathbb{N}$. Then

$$\Big| \int_I h(v)\, dG_{n-k}^{a'}(v) - \int_I h\, d\gamma \Big| = \Big| \int_I G_{n-k}^{a'}(v)\, dh(v) - \int_I G(v)\, dh(v) \Big| \le \frac{k_0 \operatorname{var} h}{F_{n-k} F_{n-k+1}}. \tag{2.5.42}$$

It follows from (2.5.40) through (2.5.42) that

$$\Big| \sum_{i^{(k)}\in\mathbb{N}_+^k} \int_{I(i^{(k)})} f_k^a\, (h \circ s_n^a)\, d\gamma_a - \int_I f\, d\gamma_a \int_I h\, d\gamma \Big| \le \frac{k_0 \operatorname{var} h}{F_{n-k} F_{n-k+1}} \int_I |f|\, d\gamma_a. \tag{2.5.43}$$

Finally, (2.5.38), (2.5.39), and (2.5.43) for $k = 0$ and $n \in \mathbb{N}$ should be replaced by

$$\int_I f\, (h \circ s_n^a)\, d\gamma_a = \int_I (f - f_0^a)(h \circ s_n^a)\, d\gamma_a + \int_I f_0^a\, (h \circ s_n^a)\, d\gamma_a, \tag{2.5.38$'$}$$

$$\Big| \int_I (f - f_0^a)(h \circ s_n^a)\, d\gamma_a \Big| \le \| h \| \int_I |f - f_0^a|\, d\gamma_a, \tag{2.5.39$'$}$$

and

$$\Big| \int_I f_0^a\, (h \circ s_n^a)\, d\gamma_a - \int_I f\, d\gamma_a \int_I h\, d\gamma \Big| \le \frac{k_0 \operatorname{var} h}{F_n F_{n+1}} \int_I |f|\, d\gamma_a, \tag{2.5.43$'$}$$

respectively.
Now, (2.5.37) follows from (2.5.38), (2.5.38$'$), (2.5.39), (2.5.39$'$), (2.5.43), and (2.5.43$'$). □
Corollary 2.5.10 For any $a, x, y \in I$ and $n \in \mathbb{N}$ we have

$$\big| \gamma_a(\tau^0 \le x, s_n^a \le y) - \gamma_a([0, x])\, G(y) \big| \le \inf_{0\le k\le n} \Big( \delta_k^a(x) + \frac{k_0}{F_{n-k} F_{n-k+1}}\, \gamma_a([0, x]) \Big) \tag{2.5.44}$$

where

$$\delta_k^a(x) = \begin{cases} 0 & \text{if } x \in E_k,\\[4pt] \dfrac{2(a+1)(x - a_k)(b_k - x)}{(b_k - a_k)(ax+1)^2} & \text{if } x \in (a_k, b_k), \end{cases}$$

and $[a_k, b_k]$ is the closure of the (unique) fundamental interval of order $k \in \mathbb{N}$ containing $x \in I \setminus E_k$.
Proof. Clearly,

$$\gamma_a(\tau^0 \le x, s_n^a \le y) = \int_I I_{[0,x]}\, (I_{[0,y]} \circ s_n^a)\, d\gamma_a$$

for any $a, x, y \in I$ and $n \in \mathbb{N}$. Theorem 2.5.9 applies with $f = I_{[0,x]}$ and $h = I_{[0,y]}$, $x, y \in I$, yielding (2.5.44) since, as is easy to see, in the present case

$$\int_I |f - f_k^a|\, d\gamma_a = \delta_k^a(x), \quad k \in \mathbb{N},\ a, x \in I.$$

□
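Corollary 2.5.11 For any $a \in I$ and $n \in \mathbf{N}$ we have
\[
\frac{a+1}{2(F_n + aF_{n-1})(F_{n+1} + aF_n)}
\le \sup_{x,y\in I}\bigl|\gamma_a(\tau^0 \le x,\ s_n^a \le y) - \gamma_a([0,x])\,G(y)\bigr|
\le \left(\frac{a+1}{2} + k_0\right)\frac{1}{F_{\lfloor n/2\rfloor}F_{\lfloor n/2\rfloor+1}}. \tag{2.5.45}
\]
Proof. We clearly have
\[
\delta_k^a(x) \le \frac{a+1}{2}\,\max_{i^{(k)}}\lambda(I(i^{(k)})) = \frac{a+1}{2F_kF_{k+1}}, \qquad k \in \mathbf{N},\ a, x \in I. \tag{2.5.46}
\]
The upper bound from (2.5.45) follows by using (2.5.46) and taking $k = \lfloor n/2\rfloor$.
Next, as in the proof of Theorem 2.5.8, we get
\[
\sup_{x,y\in I}\bigl|\gamma_a(\tau^0 \le x,\ s_n^a \le y) - \gamma_a([0,x])\,G(y)\bigr|
\ge \sup_{y\in I}|\gamma_a(s_n^a \le y) - G(y)|
\ge \frac{a+1}{2(F_n + aF_{n-1})(F_{n+1} + aF_n)}
\]
for any $a \in I$ and $n \in \mathbf{N}$, and so the lower bound holds, too. □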


Remark. The upper bound in Corollary 2.5.11 is $O(g^n)$ as $n \to \infty$, with $g = (\sqrt{5} - 1)/2$. The lower bound is $O(g^{2n})$ as $n \to \infty$, so that the problem of the exact rate of convergence is unsettled. □
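The gap between the two rates in the Remark can be seen numerically. The following sketch assumes the Fibonacci convention $F_{-1} = 0$, $F_0 = F_1 = 1$ (an assumption made here so that $F_{n-1}$ is defined for $n = 0$), and it leaves out the constant $k_0$, whose value is not given in this passage; only the $n$-dependent factors of (2.5.45) are computed. All function names are illustrative.

```python
from math import sqrt


def fib(n):
    """F_n with the assumed convention F_{-1} = 0, F_0 = F_1 = 1."""
    prev, cur = 0, 1  # F_{-1}, F_0
    for _ in range(n):
        prev, cur = cur, prev + cur
    return cur if n >= 0 else 0


def lower_bound(a, n):
    """Left-hand side of (2.5.45)."""
    return (a + 1) / (2 * (fib(n) + a * fib(n - 1)) * (fib(n + 1) + a * fib(n)))


def upper_factor(n):
    """The factor 1 / (F_[n/2] * F_[n/2]+1) on the right-hand side of (2.5.45)."""
    m = n // 2
    return 1 / (fib(m) * fib(m + 1))


G_SMALL = (sqrt(5) - 1) / 2  # the number g of the Remark

if __name__ == "__main__":
    a = 0.5  # illustrative value of a in I
    for n in (10, 20, 40):
        # The lower bound shrinks by about g^2 per step, the upper factor by g^2 per two steps.
        print(n,
              lower_bound(a, n + 1) / lower_bound(a, n),
              upper_factor(n + 2) / upper_factor(n),
              G_SMALL ** 2)
```

The printed ratios approach $g^2$ in both columns, but the lower bound takes one step of $n$ to gain that factor while the upper factor takes two, which is exactly the $g^{2n}$ versus $g^n$ discrepancy.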
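Corollary 2.5.12 Let $\mu \in \mathrm{pr}(\mathcal{B}_I)$ such that $\mu \ll \lambda$ and let $g_a = d\mu/d\gamma_a$, $a \in I$. Then we have
\[
\bigl|\mu(\tau^0 \le x,\ s_n^a \le y) - \mu([0,x])\,G(y)\bigr|
\le \inf_{0\le k\le n}\left(\int_I \bigl|g_a I_{[0,x]} - (g_a I_{[0,x]})_k^a\bigr|\,d\gamma_a + \mu([0,x])\,\frac{k_0}{F_{n-k}F_{n-k+1}}\right) \tag{2.5.47}
\]
for any $a, x, y \in I$ and $n \in \mathbf{N}$. In particular, if $g_a$ has a version $\tilde g_a$ of bounded variation, then
\[
\int_I \bigl|g_a I_{[0,x]} - (g_a I_{[0,x]})_k^a\bigr|\,d\gamma_a
\le \begin{cases}
\dfrac{(a+1)\,\mathrm{var}_{[0,x]}\,\tilde g_a}{(F_k + aF_{k-1})(F_{k+1} + aF_k)} & \text{if } x \in E_k,\\[10pt]
\dfrac{(a+1)\,\mathrm{var}_{[0,x]}\,\tilde g_a}{(F_k + aF_{k-1})(F_{k+1} + aF_k)} + 2\displaystyle\int_{a_k}^{x} g_a(t)\,\gamma_a(dt) & \text{if } x \in (a_k, b_k),
\end{cases} \tag{2.5.48}
\]
where $[a_k, b_k]$ is the closure of the (unique) fundamental interval of order $k \in \mathbf{N}$ containing $x \in I \setminus E_k$.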
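Proof. We have
\[
\mu(\tau^0 \le x,\ s_n^a \le y) = \int_I I_{[0,x]}\,(I_{[0,y]}\circ s_n^a)\,g_a\,d\gamma_a
\]
for any $a, x, y \in I$ and $n \in \mathbf{N}$. Theorem 2.5.9 applies with $f = g_a I_{[0,x]}$ and $h = I_{[0,y]}$, $x, y \in I$, yielding (2.5.47). Next, (2.5.48) can be obtained noting that (i) for a typical fundamental interval $I(i^{(k)})$ of order $k \in \mathbf{N}$ contained in $[0, x]$ we have
\[
\begin{aligned}
\int_{I(i^{(k)})} \bigl|g_a I_{[0,x]} - (g_a I_{[0,x]})_k^a\bigr|\,d\gamma_a
&= \int_{I(i^{(k)})}\left|g_a(t) - \frac{1}{\gamma_a(I(i^{(k)}))}\int_{I(i^{(k)})} g_a(s)\,\gamma_a(ds)\right|\gamma_a(dt)\\
&= \frac{1}{\gamma_a(I(i^{(k)}))}\int_{I(i^{(k)})}\left|\int_{I(i^{(k)})}\bigl(\tilde g_a(t) - \tilde g_a(s)\bigr)\,\gamma_a(ds)\right|\gamma_a(dt)\\
&\le \gamma_a(I(i^{(k)}))\,\mathrm{var}_{I(i^{(k)})}\,\tilde g_a,
\end{aligned}
\]
and (ii) for $x \in (a_k, b_k)$ we have
\[
\begin{aligned}
\int_{a_k}^{b_k} \bigl|g_a I_{[0,x]} - (g_a I_{[0,x]})_k^a\bigr|\,d\gamma_a
&= \int_{a_k}^{x}\left|g_a(t) - \frac{1}{\gamma_a([a_k,b_k])}\int_{a_k}^{x} g_a(s)\,\gamma_a(ds)\right|\gamma_a(dt)\\
&\quad + \frac{1}{\gamma_a([a_k,b_k])}\int_{x}^{b_k}\left|\int_{a_k}^{x} g_a(s)\,\gamma_a(ds)\right|\gamma_a(dt)\\
&\le \int_{a_k}^{x} g_a(t)\,\gamma_a(dt) + \frac{1}{\gamma_a([a_k,b_k])}\int_{a_k}^{b_k}\left(\int_{a_k}^{x} g_a(s)\,\gamma_a(ds)\right)\gamma_a(dt)\\
&= 2\int_{a_k}^{x} g_a(t)\,\gamma_a(dt).
\end{aligned}
\]
The proof is complete. □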


Corollary 2.5.13 Let $\mu \in \mathrm{pr}(\mathcal{B}_I)$ such that $\mu \ll \lambda$ and let $g_a = d\mu/d\gamma_a$, $a \in I$. Then we have
\[
|\mu(s_n^a \le x) - G(x)| \le \inf_{0\le k\le n}\left(\int_I |g_a - (g_a)_k^a|\,d\gamma_a + \frac{k_0}{F_{n-k}F_{n-k+1}}\right)
\]
for any $a, x \in I$ and $n \in \mathbf{N}$. If $g_a$ has a version of bounded variation, then the right-hand side of the above inequality is $O(g^n)$ as $n \to \infty$ uniformly with respect to $a, x \in I$.
Proof. Take $x = 1$ in (2.5.47), and then $x = 1$ and $k = \lfloor n/2\rfloor$ in (2.5.48). □
Remark. Corollary 2.5.13 shows that the limiting distribution as $n \to \infty$ of $s_n^a$ under a probability measure on $\mathcal{B}_I$ absolutely continuous with respect to $\lambda$ is always Gauss' $\gamma$ for any $a \in I$. The problem of the exact rate of convergence, which should normally depend on $g_a$, remains unsettled. □
Other special cases of Theorem 2.5.9 and its corollaries can be considered. For example, we can check that
\[
\lim_{n\to\infty}\gamma(\tau^0 \le x,\ s_n^a \le y) = G(x)\,G(y), \qquad a, x, y \in I. \tag{2.5.49}
\]
It is interesting to note that (2.5.49) points to the asymptotic independence of $\tau^0$ and $s_n^a$ under $\gamma$ as $n \to \infty$.
As already noted at the beginning of this subsection, we can easily obtain the results corresponding to Corollaries 2.5.10 through 2.5.12 in the case considered by A. Denjoy, where $\tau^0$ is replaced by $[a_1, \cdots, a_n]$. A definite difference occurs just in the convergence rates while the limiting probabilities are not altered.
Chapter 3

Limit theorems

This chapter is devoted to functional versions of central limit and other weak theorems, and of the law of the iterated logarithm for the incomplete quotients and associated random variables. The reader should keep in mind throughout that the sequence $(a_n)_{n\in\mathbf{N}_+}$ of incomplete quotients is ψ-mixing under different probability measures on $\mathcal{B}_I$ (see Subsections 1.3.6 and 2.3.4), while frequent reference is made to the three appendices at the end of the book.

3.0 Preliminaries
As in Subsection 2.5.4 let $g$ be a $\lambda$-integrable complex-valued function on $I$. We particularize here the framework considered there, taking $a = 0$ and accordingly $\gamma_0 = \lambda$. Denote by $E_k$, $k \in \mathbf{N}$, the set consisting of the endpoints of all fundamental intervals of rank $\ell$, $0 \le \ell \le k$. For any $n \in \mathbf{N}_+$ we associate with $g$ a function $g_n$ which has a constant value on each fundamental interval $I(i^{(n)})$, $i^{(n)} \in \mathbf{N}_+^n$, of rank $n$. Specifically,
\[
g_n(x) = \frac{1}{\lambda\bigl(I(i^{(n)})\bigr)}\int_{I(i^{(n)})} g\,d\lambda, \qquad x \in I(i^{(n)}),\ i^{(n)} \in \mathbf{N}_+^n,\ n \in \mathbf{N}_+. \tag{3.0.1}
\]
Then
\[
\int_I g_n\,d\lambda = \int_I g\,d\lambda, \qquad n \in \mathbf{N}_+, \tag{3.0.2}
\]
and
\[
\lim_{n\to\infty} g_n(x) = g(x) \quad \text{a.e. in } I. \tag{3.0.3}
\]
It follows from (3.0.2) and (3.0.3) that $\lim_{n\to\infty}\int_I |g - g_n|\,d\lambda = 0$. Hence
\[
\omega_{g,A}(n) = \int_A |g - g_n|\,d\lambda \to 0 \tag{3.0.4}
\]
uniformly with respect to $A \in \mathcal{B}_I$ as $n \to \infty$.


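We shall now prove a result which, in a sense, is dual to Theorem 2.5.9.
Lemma 3.0.1 Let $\mu \in \mathrm{pr}(\mathcal{B}_I)$ such that $\mu \ll \lambda$ and let $g = d\mu/d\lambda$. For any $n \in \mathbf{N}_+$ and $A \in \mathcal{B}_n^\infty = \tau^{-n+1}(\mathcal{B}_I)$ we have
\[
|\mu(A) - \gamma(A)| \le \inf_{1\le s<n}\bigl(\gamma(A^c)\,\omega_{g,A}(s) + \gamma(A)\,\omega_{g,A^c}(s) + \gamma(A)\,\varepsilon_{n-s}\bigr),
\]
with $\varepsilon_n$, $n \in \mathbf{N}_+$, defined as in Subsection 1.3.6. Hence
\[
\lim_{n\to\infty}\ \sup_{A\in\mathcal{B}_n^\infty}|\mu(A) - \gamma(A)| = 0.
\]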

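Proof. Put $h = I_A - \gamma(A)$, $A \in \mathcal{B}_n^\infty$. Then
\[
\mu(A) - \gamma(A) = \int_I g h\,d\lambda
\]
and
\[
\left|\int_I g h\,d\lambda\right| \le \int_I |g_s - g|\,|h|\,d\lambda + \left|\int_I g_s h\,d\lambda\right|,
\]
where $g_s$ is defined by (3.0.1) and $s < n$, $s \in \mathbf{N}_+$, is arbitrary. Since $|h| = 1 - \gamma(A) = \gamma(A^c)$ on $A$ and $|h| = \gamma(A)$ on $A^c$, we have
\[
\int_I |g_s - g|\,|h|\,d\lambda \le \gamma(A^c)\,\omega_{g,A}(s) + \gamma(A)\,\omega_{g,A^c}(s). \tag{3.0.5}
\]
Next,
\[
\begin{aligned}
\left|\int_I g_s h\,d\lambda\right|
&= \left|\sum_{i^{(s)}\in\mathbf{N}_+^s}\int_{I(i^{(s)})} g_s h\,d\lambda\right|\\
&= \left|\sum_{i^{(s)}\in\mathbf{N}_+^s}\left(\frac{1}{\lambda(I(i^{(s)}))}\int_{I(i^{(s)})} g\,d\lambda\right)\int_{I(i^{(s)})} h\,d\lambda\right|\\
&= \left|\sum_{i^{(s)}\in\mathbf{N}_+^s}\frac{\mu(I(i^{(s)}))}{\lambda(I(i^{(s)}))}\Bigl(\lambda\bigl(I(i^{(s)})\cap A\bigr) - \lambda\bigl(I(i^{(s)})\bigr)\gamma(A)\Bigr)\right|.
\end{aligned}
\]
It then follows from equation (1.3.35) that
\[
\left|\int_I g_s h\,d\lambda\right| \le \gamma(A)\,\varepsilon_{n-s}. \tag{3.0.6}
\]
Now, the result stated follows from (3.0.5), (3.0.6), and (3.0.4). □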
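Let $f_n : \mathbf{N}_+ \to \mathbf{R}$, $n \in \mathbf{N}_+$, and define
\[
X_{nj} = f_n(a_j), \quad 1 \le j \le n,
\]
\[
S_{n0} = 0, \qquad S_{nk} = \sum_{j=1}^{k} X_{nj}, \quad 1 \le k \le n, \qquad S_{nn} = S_n, \quad n \in \mathbf{N}_+.
\]
For any $n \in \mathbf{N}_+$ define the process $\xi_n = (\xi_n(t))_{t\in I}$ by $\xi_n(t) = S_{n\lfloor nt\rfloor}$, $t \in I$.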


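Lemma 3.0.2 Let $\mu \in \mathrm{pr}(\mathcal{B}_I)$ such that $\mu \ll \lambda$. Assume that the array $X = \{X_{nj},\ 1 \le j \le n,\ n \in \mathbf{N}_+\}$ is s.i. under $\gamma$.
(i) If either $(\gamma\xi_n^{-1})_{n\in\mathbf{N}_+}$ or $(\mu\xi_n^{-1})_{n\in\mathbf{N}_+}$ converges weakly in $\mathcal{B}_D$, then both sequences converge weakly in $\mathcal{B}_D$ and have the same limit.
(ii) If either $(\gamma S_n^{-1})_{n\in\mathbf{N}_+}$ or $(\mu S_n^{-1})_{n\in\mathbf{N}_+}$ converges weakly in $\mathcal{B}$, then both sequences converge weakly in $\mathcal{B}$ and have the same limit.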
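Proof. Clearly, (ii) is an immediate consequence of (i). Let us therefore prove the latter. Take a sequence $(k_n)_{n\in\mathbf{N}_+}$ such that $k_n \le n$, $\lim_{n\to\infty} k_n/n = 0$, and $\lim_{n\to\infty} k_n = \infty$. As $X$ is s.i. under $\gamma$, we have
\[
\lim_{n\to\infty}\gamma(|S_{nk_n}| > \varepsilon) = 0 \tag{3.0.7}
\]
for any $\varepsilon > 0$.
Let us first show that
\[
\lim_{n\to\infty}\gamma\Bigl(\max_{1\le k\le k_n}|S_{nk}| > \varepsilon\Bigr) = 0 \tag{3.0.8}
\]
for any $\varepsilon > 0$. It follows from Proposition A3.5 (see also Section A1.4) that whatever $\varepsilon > 0$ we have
\[
d_P\bigl(\gamma S_{nk}^{-1}, \delta_0\bigr) \le \frac{\varepsilon}{4}, \qquad 1 \le k \le k_n,
\]
for any $n$ large enough ($\ge n_\varepsilon$). Therefore for some $\theta \le \varepsilon/4$ we have
\[
\delta_0(A) < \gamma S_{nk}^{-1}(A^\theta) + \theta
\]
for any $n \ge n_\varepsilon$, $1 \le k \le k_n$, and $A \in \mathcal{B}$. Hence, with $A = (-\theta, \theta)$ for which $A^\theta = (-2\theta, 2\theta)$, we obtain
\[
\gamma S_{nk}^{-1}\Bigl(\Bigl(-\frac{\varepsilon}{2}, \frac{\varepsilon}{2}\Bigr)\Bigr) > \gamma S_{nk}^{-1}\bigl(A^\theta\bigr) > 1 - \theta \ge 1 - \frac{\varepsilon}{4}
\]
for any $n \ge n_\varepsilon$ and $1 \le k \le k_n$. Equivalently,
\[
\min_{1\le k\le k_n}\gamma\Bigl(|S_{nk}| < \frac{\varepsilon}{2}\Bigr) > 1 - \frac{\varepsilon}{4}, \qquad n \ge n_\varepsilon.
\]
If $\varepsilon$ is small enough so that
\[
1 - \frac{\varepsilon}{4} > \varphi_\gamma(1),
\]
then by an Ottaviani type inequality [see Lemma 1.1.6 in Iosifescu and Theodorescu (1969)] we can write
\[
\gamma\Bigl(\max_{1\le k\le k_n}|S_{nk}| > \varepsilon\Bigr) \le \frac{\gamma\bigl(|S_{nk_n}| \ge \frac{\varepsilon}{2}\bigr)}{1 - \frac{\varepsilon}{4} - \varphi_\gamma(1)}
\]
for any $n \ge n_\varepsilon$. Hence (3.0.8) holds on account of (3.0.7).
Next, for any $n \in \mathbf{N}_+$ consider the process $\tilde\xi_n = (\tilde\xi_n(t))_{t\in I}$ defined by
\[
\tilde\xi_n(t) = S_{n\lfloor nt\rfloor} - S_{n\min(\lfloor nt\rfloor,\,k_n)}, \qquad t \in I.
\]
Note that $\tilde\xi_n$ is $\mathcal{B}_{k_n+1}^\infty$-measurable and then by Lemma 3.0.1 and Lemma 2.1.1 in Iosifescu and Grigorescu (1990) we have
\[
\lim_{n\to\infty}\left(\int_D h\,d(\gamma\tilde\xi_n^{-1}) - \int_D h\,d(\mu\tilde\xi_n^{-1})\right) = 0 \tag{3.0.9}
\]
for any bounded continuous real-valued function $h$ on $D$. On the other hand (see Section A1.6), for any $n \in \mathbf{N}_+$ we have
\[
d_0(\xi_n, \tilde\xi_n) \le \sup_{t\in I}|\xi_n(t) - \tilde\xi_n(t)| \le \max_{1\le k\le k_n}|S_{nk}|.
\]
It then follows from (3.0.8) that
\[
d_0(\xi_n, \tilde\xi_n) \text{ converges to } 0 \text{ in } \gamma\text{-probability as } n \to \infty. \tag{3.0.10}
\]
Hence as $\mu \ll \gamma$ we also have that
\[
d_0(\xi_n, \tilde\xi_n) \text{ converges to } 0 \text{ in } \mu\text{-probability as } n \to \infty. \tag{3.0.11}
\]
We can now conclude the proof using (3.0.9) through (3.0.11). If, for example, $\gamma\xi_n^{-1} \xrightarrow{w} \nu$ for some $\nu \in \mathrm{pr}(\mathcal{B}_D)$, then it follows from (3.0.10) that $\gamma\tilde\xi_n^{-1} \xrightarrow{w} \nu$, too. Next, (3.0.9) implies that $\mu\tilde\xi_n^{-1} \xrightarrow{w} \nu$, which in conjunction with (3.0.11) yields $\mu\xi_n^{-1} \xrightarrow{w} \nu$. □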
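Remark. Lemma 3.0.2 still holds when the process $\xi_n$ is replaced by the process $\xi_n^C = (\xi_n^C(t))_{t\in I}$ defined by
\[
\xi_n^C(t) = S_{n\lfloor nt\rfloor} + (nt - \lfloor nt\rfloor)\bigl(S_{n(\lfloor nt\rfloor+1)} - S_{n\lfloor nt\rfloor}\bigr), \qquad t \in I,
\]
with the convention $S_{n0} = 0$, $n \in \mathbf{N}_+$. □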

3.1 The Poisson law


3.1.1 The case of incomplete quotients
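Let $\theta \in \mathbf{R}_{++}$ and $\alpha \in \mathbf{R}$ be arbitrarily given. Consider the array
\[
X = \{X_{nj},\ 1 \le j \le n,\ n \in \mathbf{N}_+\},
\]
where
\[
X_{nj} = \Bigl(\frac{a_j}{n}\Bigr)^{\alpha} I_{(a_j > \theta n)}. \tag{3.1.1}
\]
For this array we have
\[
S_{nk} = n^{-\alpha}\sum_{j=1}^{k} a_j^{\alpha}\,I_{(a_j > \theta n)}, \quad 1 \le k \le n, \qquad S_n = S_{nn}, \quad n \in \mathbf{N}_+. \tag{3.1.2}
\]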

Proposition 3.1.1 The array (3.1.1) is s.i. under γ.


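Proof. We only consider the case $\alpha \in \mathbf{R}_{++}$. The other cases can be treated similarly. We have
\[
\begin{aligned}
\gamma(|S_{nk}| > \varepsilon)
&\le \sum_{j=1}^{k}\gamma\Bigl(|X_{nj}| > \frac{\varepsilon}{k}\Bigr) = k\,\gamma\Bigl(|X_{n1}| > \frac{\varepsilon}{k}\Bigr)\\
&= k\,\gamma\Bigl(a_1 > n\max\Bigl(\theta, \Bigl(\frac{\varepsilon}{k}\Bigr)^{1/\alpha}\Bigr)\Bigr)\\
&\le k\,\gamma(a_1 > n\theta), \qquad 1 \le k \le n.
\end{aligned}
\]
Hence $X_{n1}$ converges in $\gamma$-probability to 0 as $n \to \infty$, and for any $0 < a < 1$ we have
\[
\limsup_{n\to\infty}\ \max_{1\le k\le an}\gamma(|S_{nk}| > \varepsilon) \le \lim_{n\to\infty} a\,n\,\gamma(a_1 > n\theta) = \frac{a}{\theta\log 2},
\]
which is less than 1 if we choose
\[
0 < a < \min(1, \theta\log 2).
\]
On account of Proposition A3.6 the proof is complete. □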
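The limit $n\,\gamma(a_1 > n\theta) \to 1/(\theta\log 2)$ used in the proof can be illustrated with a few lines of code. This is a minimal sketch: it relies on the formula $\gamma(a_1 > x) = \log\bigl(1 + 1/(\lfloor x\rfloor + 1)\bigr)/\log 2$ for $x \ge 0$, and the function names and the sample values of $\theta$ are illustrative choices.

```python
from math import floor, log, log2


def gauss_tail(x):
    """gamma(a_1 > x) = log2(1 + 1 / (floor(x) + 1)) for x >= 0."""
    return log2(1 + 1 / (floor(x) + 1))


def scaled_tail(n, theta):
    """n * gamma(a_1 > n * theta)."""
    return n * gauss_tail(n * theta)


def tail_limit(theta):
    """The limiting value 1 / (theta * log 2)."""
    return 1 / (theta * log(2))


if __name__ == "__main__":
    for theta in (0.5, 2.0):  # illustrative values
        for n in (10, 10**3, 10**6):
            print(theta, n, scaled_tail(n, theta), tail_limit(theta))
```

Multiplying the limit by any $a$ with $0 < a < \min(1, \theta\log 2)$ gives a value below 1, which is the condition required in the last step of the proof.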


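Theorem 3.1.2 We have
\[
\gamma S_n^{-1} \xrightarrow{w} \nu \quad \text{in } \mathcal{B}, \tag{3.1.3}
\]
where:
(i) if $\alpha \in \mathbf{R}_{++}$ then $\nu = \mathrm{Pois}\,\rho$ with
\[
\frac{d\rho}{d\lambda}(x) = \delta_x\bigl((\theta^\alpha, \infty)\bigr)\,\frac{x^{-1-1/\alpha}}{\alpha\log 2}, \qquad x \in \mathbf{R};
\]
(ii) if $-\alpha \in \mathbf{R}_{++}$ then $\nu = \mathrm{Pois}\,\rho$ with
\[
\frac{d\rho}{d\lambda}(x) = -\delta_x\bigl((0, \theta^\alpha)\bigr)\,\frac{x^{-1-1/\alpha}}{\alpha\log 2}, \qquad x \in \mathbf{R};
\]
(iii) if $\alpha = 0$ then $\nu = \mathrm{Pois}\bigl((\theta\log 2)^{-1}\delta_1\bigr)$, that is, $\nu$ is the Poisson distribution $P\bigl((\theta\log 2)^{-1}\bigr)$ with parameter $(\theta\log 2)^{-1}$.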

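Proof. We only prove (i), the proofs of (ii) and (iii) being completely similar.
Consider the measures $\mu_n$ on $\mathcal{B}$ defined by
\[
\mu_n(A) = \gamma\Bigl(\Bigl(\frac{a_1}{n}\Bigr)^{\alpha} \in A,\ a_1 > \theta n\Bigr), \qquad A \in \mathcal{B},\ n \in \mathbf{N}_+.
\]
Clearly,
\[
\mu_n(\mathbf{R}) = \gamma(a_1 > \theta n) \le 1, \qquad \mu_n([-\theta^\alpha, \theta^\alpha]) = 0, \qquad n \in \mathbf{N}_+,
\]
and
\[
\gamma(X_{n1} \in A) = \gamma(a_1 \le \theta n)\,\delta_0(A) + \mu_n(A), \qquad A \in \mathcal{B},\ n \in \mathbf{N}_+.
\]
Also, for any $x \in \mathbf{R}$ we have
\[
\begin{aligned}
\lim_{n\to\infty} n\,\mu_n((x, \infty))
&= \lim_{n\to\infty} n\,\gamma\Bigl(a_1 > n\bigl(\max(x, \theta^\alpha)\bigr)^{1/\alpha}\Bigr)\\
&= \frac{1}{\log 2}\lim_{n\to\infty} n\log\left(1 + \frac{1}{\bigl\lfloor n(\max(x, \theta^\alpha))^{1/\alpha}\bigr\rfloor + 1}\right)\\
&= \frac{1}{\log 2}\,\frac{1}{(\max(x, \theta^\alpha))^{1/\alpha}} = \rho((x, \infty)).
\end{aligned}
\]
Finally,
\[
\lim_{n\to\infty} n\,\mu_n(\mathbf{R}) = \lim_{n\to\infty} n\,\gamma(a_1 > n\theta) = \frac{1}{\theta\log 2} = \rho(\mathbf{R}).
\]
Therefore all hypotheses of Theorem A3.10 are fulfilled, and (3.1.3) holds. □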
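As a consistency check between the density in Theorem 3.1.2(i) and the tail $\rho((x,\infty))$ computed in the proof, one can differentiate the tail numerically. The sketch below covers only the case $\alpha > 0$; the function names, the finite-difference step and the sample parameters are illustrative assumptions.

```python
from math import log


def rho_tail(x, theta, alpha):
    """rho((x, inf)) = 1 / (log 2 * max(x, theta^alpha)^(1/alpha)), for alpha > 0."""
    return 1 / (log(2) * max(x, theta ** alpha) ** (1 / alpha))


def rho_density(x, theta, alpha):
    """Density of rho in Theorem 3.1.2(i): zero up to theta^alpha, power law beyond it."""
    if x <= theta ** alpha:
        return 0.0
    return x ** (-1 - 1 / alpha) / (alpha * log(2))


def minus_tail_derivative(x, theta, alpha, step=1e-5):
    """Central finite difference of -d/dx rho((x, inf))."""
    return -(rho_tail(x + step, theta, alpha) - rho_tail(x - step, theta, alpha)) / (2 * step)


if __name__ == "__main__":
    theta, alpha = 1.5, 2.0  # illustrative parameters; theta^alpha = 2.25
    for x in (3.0, 4.0, 10.0):
        print(x, rho_density(x, theta, alpha), minus_tail_derivative(x, theta, alpha))
    # Total mass: rho(R) = 1 / (theta * log 2)
    print(rho_tail(0.0, theta, alpha), 1 / (theta * log(2)))
```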
Now, on account of Proposition 3.1.1, Theorem 3.1.2, Lemma 3.0.2, and Theorem A3.7 we can state the following result. (See Section A3.3 for notation.)
Corollary 3.1.3 Let $\mu \in \mathrm{pr}(\mathcal{B}_I)$ such that $\mu \ll \lambda$. Then $\mu\xi_n^{-1} \xrightarrow{w} Q_\nu$ in $\mathcal{B}_D$, hence $\mu S_n^{-1} \xrightarrow{w} \nu$ in $\mathcal{B}$, where $\xi_n = (S_{n\lfloor nt\rfloor})_{t\in I}$, with the convention $S_{n0} = 0$, $n \in \mathbf{N}_+$.

3.1.2 The case of associated random variables


We shall now show that both Theorem 3.1.2 and Corollary 3.1.3 still hold when $a_j$ is replaced by either $y_j$, $r_j$, or $u_j$, $1 \le j \le n$, in (3.1.1) and (3.1.2). This will follow from the result below.
Lemma 3.1.4 Let $b_n$, $n \in \mathbf{N}_+$, be real-valued random variables on $(I, \mathcal{B}_I)$ such that
\[
a_n \le b_n \le a_n + c, \qquad n \in \mathbf{N}_+,
\]
for some $c \in \mathbf{R}_+$. For any $n \in \mathbf{N}_+$ consider the stochastic processes $\xi_n = (S_{n\lfloor nt\rfloor})_{t\in I}$ and $\xi_n' = (S'_{n\lfloor nt\rfloor})_{t\in I}$, where $S_{nk}$, $1 \le k \le n$, is defined by (3.1.2) and
\[
S'_{nk} = n^{-\alpha}\sum_{j=1}^{k} b_j^{\alpha}\,I_{(b_j > \theta n)}, \qquad 1 \le k \le n,
\]
with the convention $S'_{n0} = 0$. Then $d_0(\xi_n, \xi_n')$ converges to 0 in $\gamma$-probability as $n \to \infty$.
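Proof. For any $n \in \mathbf{N}_+$ we have
\[
d_0(\xi_n, \xi_n') \le \sup_{t\in I}|S_{n\lfloor nt\rfloor} - S'_{n\lfloor nt\rfloor}| \le \sum_{j=1}^{n}|\delta_{nj}|,
\]
where
\[
\delta_{nj} = n^{-\alpha}\bigl(b_j^{\alpha} I_{(b_j > \theta n)} - a_j^{\alpha} I_{(a_j > \theta n)}\bigr), \qquad 1 \le j \le n.
\]
Notice that $(a_j > \theta n) \subset (b_j > \theta n)$, $1 \le j \le n$, and put
\[
\delta_n' = n^{-\alpha}\sum_{j=1}^{n} b_j^{\alpha}\bigl(I_{(b_j > \theta n)} - I_{(a_j > \theta n)}\bigr) = n^{-\alpha}\sum_{j=1}^{n} b_j^{\alpha}\,I_{(b_j > \theta n,\ a_j \le \theta n)},
\]
\[
\delta_n'' = n^{-\alpha}\sum_{j=1}^{n}|b_j^{\alpha} - a_j^{\alpha}|\,I_{(a_j > \theta n)}.
\]
Then $\sum_{j=1}^{n}|\delta_{nj}| \le \delta_n' + \delta_n''$, and we are going to prove that $\delta_n'$ and $\delta_n''$ both converge to 0 in $\gamma$-probability as $n \to \infty$.
We have
\[
\gamma(\delta_n' > 0) \le n\,\gamma(\theta n - c < a_1 \le \theta n) \to 0
\]
as $n \to \infty$ while
\[
\delta_n'' \le c_\alpha\,n^{-1}\Bigl(n^{-(\alpha-1)}\sum_{j=1}^{n} a_j^{\alpha-1}\,I_{(a_j > \theta n)}\Bigr),
\]
where
\[
c_\alpha = \begin{cases}
c\,\alpha\,(1+c)^{\alpha-1} & \text{if } \alpha \ge 1,\\
c\,|\alpha| & \text{if } \alpha < 1.
\end{cases}
\]
[We have used the inequality $(1+a)^{\alpha} - 1 \le a\bigl(\{\alpha\} + \lfloor\alpha\rfloor(1+a)^{\alpha-1}\bigr)$, valid for non-negative $a$ and $\alpha$, which implies $1 - (1+a)^{-\alpha} \le a\alpha$.] By Theorem 3.1.2, $\delta_n''$ converges to 0 in $\gamma$-probability as $n \to \infty$. It follows that $d_0(\xi_n, \xi_n')$ is dominated by the sum of two non-negative random variables both converging in $\gamma$-probability to 0 as $n \to \infty$. The proof is complete. □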
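The elementary inequality quoted in brackets in the proof can be spot-checked on a grid. The sketch below is a numerical illustration only, not a proof; the grid, the tolerance and the function names are arbitrary choices made here.

```python
from math import floor


def lhs(a, alpha):
    """(1 + a)^alpha - 1."""
    return (1 + a) ** alpha - 1


def rhs(a, alpha):
    """a * ({alpha} + floor(alpha) * (1 + a)^(alpha - 1))."""
    whole = floor(alpha)
    frac = alpha - whole
    return a * (frac + whole * (1 + a) ** (alpha - 1))


def holds_on_grid(a_values, alpha_values, tol=1e-9):
    """True if lhs <= rhs (up to rounding) and 1 - (1 + a)^(-alpha) <= a * alpha on the grid."""
    for a in a_values:
        for alpha in alpha_values:
            if lhs(a, alpha) > rhs(a, alpha) + tol:
                return False
            if 1 - (1 + a) ** (-alpha) > a * alpha + tol:
                return False
    return True


if __name__ == "__main__":
    a_grid = [i / 10 for i in range(0, 51)]       # a in [0, 5]
    alpha_grid = [i / 4 for i in range(0, 17)]    # alpha in [0, 4]
    print(holds_on_grid(a_grid, alpha_grid))
```

For integer $\alpha = 1$ both sides coincide, which is why a small tolerance is used in the comparison.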
Corollary 3.1.5 Let $b_n$ denote either $y_n$, $r_n$, or $u_n$, $n \in \mathbf{N}_+$. Put
\[
S'_{nk} = n^{-\alpha}\sum_{j=1}^{k} b_j^{\alpha}\,I_{(b_j > \theta n)}, \qquad 1 \le k \le n,
\]
and for any $n \in \mathbf{N}_+$ consider the stochastic process $\xi_n' = (S'_{n\lfloor nt\rfloor})_{t\in I}$, with the convention $S'_{n0} = 0$. Let $\mu \in \mathrm{pr}(\mathcal{B}_I)$ such that $\mu \ll \lambda$. Then $\mu\xi_n'^{-1} \xrightarrow{w} Q_\nu$ in $\mathcal{B}_D$, hence $\mu S_{nn}'^{-1} \xrightarrow{w} \nu$ in $\mathcal{B}$.
Proof. Lemma 3.1.4 applies with $c = 1$ in the case of $y_n$ and $r_n$ and with $c = 2$ in the case of $u_n$. Since $\mu \ll \gamma$, the distance $d_0(\xi_n, \xi_n')$ converges to 0 in $\mu$-probability, too, as $n \to \infty$. This property and Corollary 3.1.3 imply the result stated. □
Let $b_n$ denote either $a_n$, $y_n$, $r_n$ or $u_n$, $n \in \mathbf{N}_+$, and consider the special case $\alpha = 0$. By Corollaries 3.1.3 and 3.1.5, under any $\mu \in \mathrm{pr}(\mathcal{B}_I)$ such that $\mu \ll \lambda$, the random variable
\[
S_n' = \sum_{j=1}^{n} I_{(b_j > \theta n)}
\]
is asymptotically $P\bigl((\theta\log 2)^{-1}\bigr)$ as $n \to \infty$. It is possible to estimate the rate of convergence of $\gamma(S_n' = k)$, $k \in \mathbf{N}$, to its Poisson limit. The following result holds.
Theorem 3.1.6 Let $k \in \mathbf{N}$ and $0 < \delta < 1$ be fixed. We have
\[
\bigl|\gamma(S_n' = k) - e^{-\theta}\theta^k/k!\bigr| \le c\exp\bigl(-(\log n)^{\delta}\bigr), \qquad n \in \mathbf{N}_+,
\]
for $\theta = O(n^a)$, $0 \le a < 1$, where $c$ only depends, perhaps, on $\delta$, $a$, and $k$.
The proof for the case $b_n = a_n$, $n \in \mathbf{N}_+$, $k = 0$, can be found in Philipp (1976, p. 382), where the proviso $\theta = O(n^a)$, $0 \le a \le 1$, does not appear. Cf. Galambos (1972) and Iosifescu (1978, p. 35).

3.1.3 Some extreme value theory


Throughout this subsection let again $b_n$ denote either $a_n$, $y_n$, $r_n$ or $u_n$, $n \in \mathbf{N}_+$. For $1 \le k \le n$ let $M_n^{(k)}$ be the $k$th largest of $b_1, \cdots, b_n$. Clearly, $M_n^{(1)} = M_n$ is the maximum of $b_1, \cdots, b_n$. The asymptotic distribution of $M_n^{(k)}$ as $n \to \infty$ for any fixed $k$ can be easily obtained from previous results as shown below.
Proposition 3.1.7 Let $\mu \in \mathrm{pr}(\mathcal{B}_I)$ such that $\mu \ll \lambda$. For any fixed $k \in \mathbf{N}_+$ we have
\[
\lim_{n\to\infty}\mu\left(\frac{M_n^{(k)}\log 2}{n} \le x\right) = e^{-1/x}\sum_{j=0}^{k-1}\frac{x^{-j}}{j!}, \qquad x \in \mathbf{R}_{++}. \tag{3.1.4}
\]
In particular,
\[
\lim_{n\to\infty}\mu\left(\frac{M_n\log 2}{n} \le x\right) = e^{-1/x}, \qquad x \in \mathbf{R}_{++}.
\]
Proof. Let $1 \le k \le n$. It is easy to see that $S_n' = \sum_{j=1}^{n} I_{(b_j > \theta n)}$ is less than $k$ if and only if $M_n^{(k)}$ does not exceed $\theta n$, that is,
\[
\bigl(M_n^{(k)} \le \theta n\bigr) = \bigl(S_n' < k\bigr) \tag{3.1.5}
\]
for any $\theta \in \mathbf{R}_{++}$ and $n \in \mathbf{N}_+$. Hence, by Corollaries 3.1.3 and 3.1.5,
\[
\mu\bigl(M_n^{(k)} \le \theta n\bigr) = \mu\bigl(S_n' < k\bigr) = \sum_{j=0}^{k-1}\mu\bigl(S_n' = j\bigr)
\to \sum_{j=0}^{k-1} e^{-(\theta\log 2)^{-1}}\,\frac{1}{j!\,(\theta\log 2)^{j}}
\]
as $n \to \infty$ for any fixed $k \in \mathbf{N}_+$. Putting $x = \theta\log 2$ we obtain the result stated. □
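The limit in (3.1.4) is the distribution function of a Poisson variable with parameter $1/x$ evaluated at $k - 1$, which the following sketch makes explicit. The function name and sample arguments are illustrative assumptions; only the right-hand side of (3.1.4) is evaluated, no simulation of the continued fraction digits is attempted.

```python
from math import exp, factorial


def kth_largest_limit(x, k):
    """Right-hand side of (3.1.4): exp(-1/x) * sum_{j=0}^{k-1} x^(-j) / j!, for x > 0."""
    return exp(-1 / x) * sum(x ** (-j) / factorial(j) for j in range(k))


if __name__ == "__main__":
    x = 2.0  # illustrative value of x = theta * log 2
    for k in (1, 2, 3, 10):
        print(k, kth_largest_limit(x, k))
    # k = 1 gives exp(-1/x), the limit law of the normalized maximum
    print(exp(-1 / x))
```

For fixed $x$ the values increase with $k$ towards 1, as expected since the $k$th largest term is smaller for larger $k$.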
Remark. The limit distribution for the special case $k = 1$ is known as the Type II Extreme Value distribution for sequences of i.i.d. random variables. See, e.g., de Haan (1970). The same result can also be obtained from general results of Loynes (1965) for mixing strictly stationary sequences. □
In what follows we give some almost sure asymptotic properties of $M_n$ due to Philipp (1976), which improve upon results of Galambos (1974). We start with an F. Bernstein type theorem (see Proposition 1.3.16).
Proposition 3.1.8 Let $(c_n)_{n\in\mathbf{N}_+}$ be a non-decreasing sequence of positive numbers. Then
\[
\gamma(M_n \ge c_n \text{ i.o.})
\]
is either 0 or 1 according as the series $\sum_{n\in\mathbf{N}_+} 1/c_n$ converges or diverges.
Proof. We have $(b_n \ge c_n \text{ i.o.}) \subset (M_n \ge c_n \text{ i.o.})$ since $b_n(\omega) \ge c_n$ for some $n \in \mathbf{N}_+$ and $\omega \in \Omega$ implies $M_n(\omega) \ge c_n$. Conversely, if $M_n(\omega) \ge c_n$ for some $n \in \mathbf{N}_+$ and $\omega \in \Omega$, then there exists $n' \le n$ such that $M_n(\omega) = b_{n'}(\omega) \ge c_n \ge c_{n'}$. Hence $(M_n \ge c_n \text{ i.o.}) \subset (b_n \ge c_n \text{ i.o.})$. Therefore $(M_n \ge c_n \text{ i.o.}) = (b_n \ge c_n \text{ i.o.})$, and the conclusion follows from Corollary 1.3.17. □
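Corollary 3.1.9 Let $(c_n)_{n\in\mathbf{N}_+}$ be as in Proposition 3.1.8. Then either
\[
\lim_{n\to\infty}\frac{M_n}{c_n} = 0 \quad \text{a.e.} \tag{3.1.6}
\]
or
\[
\limsup_{n\to\infty}\frac{M_n}{c_n} = \infty \quad \text{a.e.} \tag{3.1.7}
\]
according as the series $\sum_{n\in\mathbf{N}_+} 1/c_n$ converges or diverges.
Proof. First, assume that $s = \sum_{n\in\mathbf{N}_+} 1/c_n < \infty$. Choose positive numbers $d_n$, $n \in \mathbf{N}_+$, with $\lim_{n\to\infty} d_n = \infty$ such that $\sum_{n\in\mathbf{N}_+} d_n/c_n < \infty$. This is always possible. Indeed, put $s_n = \sum_{i=1}^{n} 1/c_i$, $n \in \mathbf{N}_+$, and define
\[
E_1 = \{j \in \mathbf{N}_+ : s_j \le 3s/4\},
\]
\[
E_n = \Bigl\{j \in \mathbf{N}_+ : 3s\sum_{i=1}^{n-1} 4^{-i} < s_j \le 3s\sum_{i=1}^{n} 4^{-i}\Bigr\}, \qquad n \ge 2.
\]
Consider the increasing sequence $(n_k)_{k\in\mathbf{N}_+}$ of indices $n$ for which $E_n \ne \emptyset$ and take $d_j = 2^{n_{k-1}}$ if $j \in E_{n_k}$, $k \in \mathbf{N}_+$, with $n_0 = 0$. Then we have
\[
\sum_{j\in E_{n_k}} 1/c_j \le 3s\bigl(4^{-n_k} + 4^{-(n_k-1)} + \cdots + 4^{-n_{k-1}}\bigr) \le 4^{-n_{k-1}+1}s, \qquad k \in \mathbf{N}_+,
\]
hence
\[
\sum_{n\in\mathbf{N}_+} d_n/c_n = \sum_{k\in\mathbf{N}_+}\sum_{j\in E_{n_k}} d_j/c_j \le 4s\sum_{k\in\mathbf{N}_+} 2^{-n_{k-1}} \le 8s.
\]
By Proposition 3.1.8 we have
\[
\gamma\Bigl(\frac{M_n}{c_n} \ge \frac{1}{d_n} \text{ i.o.}\Bigr) = 0,
\]
which is equivalent to (3.1.6).
Second, assume that $\sum_{n\in\mathbf{N}_+} 1/c_n = \infty$. Choose positive numbers $d_n$, $n \in \mathbf{N}_+$, with $\lim_{n\to\infty} d_n = 0$ such that $\sum_{n\in\mathbf{N}_+} d_n/c_n = \infty$. This is again always possible. Indeed, put $s_n = \sum_{i=1}^{n} 1/c_i$, $n \in \mathbf{N}_+$, and define
\[
E_1 = \{j \in \mathbf{N}_+ : s_j \le 4\},
\]
\[
E_n = \{j \in \mathbf{N}_+ : 4^{n-1} < s_j \le 4^{n}\}, \qquad n \ge 2.
\]
Consider the increasing sequence $(n_k)_{k\in\mathbf{N}_+}$ of indices $n$ for which $E_n \ne \emptyset$ and take $d_j = 2^{-n_{k-1}}$ if $j \in E_{n_k} \cup E_{n_{k+1}}$, $k = 1, 3, \cdots$, with $n_0 = 0$. Then
\[
\sum_{j\in E_{n_k}\cup E_{n_{k+1}}} 1/c_j \ge 4^{n_k} - 4^{n_{k-1}} \ge 3\cdot 4^{n_{k-1}},
\]
whence
\[
\sum_{j\in E_{n_k}\cup E_{n_{k+1}}} d_j/c_j \ge 3\cdot 2^{n_{k-1}}, \qquad k = 1, 3, \cdots.
\]
Clearly, this implies $\sum_{n\in\mathbf{N}_+} d_n/c_n = \infty$. By Proposition 3.1.8 we have
\[
\gamma\Bigl(\frac{M_n}{c_n} \ge \frac{1}{d_n} \text{ i.o.}\Bigr) = 1,
\]
which is equivalent to (3.1.7). □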
Theorem 3.1.10 Let $(c_n)_{n\in\mathbf{N}_+}$ be a non-decreasing sequence of positive numbers such that the sequence $(n/c_n)_{n\in\mathbf{N}_+}$ is non-decreasing. Then
\[
\gamma\Bigl(M_n \le \frac{n}{c_n\log 2} \text{ i.o.}\Bigr)
\]
is either 0 or 1 according as the series
\[
\sum_{n\in\mathbf{N}_+}\frac{\log\log n}{n\exp c_n}
\]
converges or diverges.
The proof is completely similar to that given for the i.i.d. case in Barndorff–Nielsen (1961). Theorem 3.1.6 plays an essential part in the present case. For details in the case $b_n = a_n$, $n \in \mathbf{N}_+$, see Philipp (1976, pp. 384–385). □
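Corollary 3.1.11 We have
\[
\limsup_{n\to\infty}\frac{\log M_n - \log n}{\log\log n} = 1 \quad\text{and}\quad \liminf_{n\to\infty}\frac{\log M_n - \log n}{\log\log n} = 0 \quad \text{a.e.},
\]
whence
\[
\lim_{n\to\infty}\frac{\log M_n}{\log n} = 1 \quad \text{a.e.}
\]
Proof. For the lim sup case we should show that for any $\varepsilon > 0$ we have
\[
\gamma\Bigl(\frac{\log M_n - \log n}{\log\log n} \ge 1 + \varepsilon \text{ i.o.}\Bigr) = 0
\]
and
\[
\gamma\Bigl(\frac{\log M_n - \log n}{\log\log n} \ge 1 - \varepsilon \text{ i.o.}\Bigr) = 1
\]
or, equivalently,
\[
\gamma\bigl(M_n \ge n(\log n)^{1+\varepsilon} \text{ i.o.}\bigr) = 0
\]
and
\[
\gamma\bigl(M_n \ge n(\log n)^{1-\varepsilon} \text{ i.o.}\bigr) = 1.
\]
These equations clearly hold by Proposition 3.1.8.
For the lim inf case we should show that for any $\varepsilon > 0$ we have
\[
\gamma\Bigl(\frac{\log M_n - \log n}{\log\log n} \le \varepsilon \text{ i.o.}\Bigr) = 1
\]
and
\[
\gamma\Bigl(\frac{\log M_n - \log n}{\log\log n} \le -\varepsilon \text{ i.o.}\Bigr) = 0
\]
or, equivalently,
\[
\gamma\bigl(M_n \le n(\log n)^{\varepsilon} \text{ i.o.}\bigr) = 1
\]
and
\[
\gamma\bigl(M_n \le n(\log n)^{-\varepsilon} \text{ i.o.}\bigr) = 0.
\]
It is easy to check that these equations hold by Theorem 3.1.10. □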
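Corollary 3.1.12 We have
\[
\liminf_{n\to\infty}\frac{M_n\log\log n}{n} = \frac{1}{\log 2} \quad \text{a.e.}
\]
Proof. We should show that for any $\varepsilon > 0$ we have
\[
\gamma\Bigl(\frac{M_n\log\log n}{n} - \frac{1}{\log 2} \le \varepsilon \text{ i.o.}\Bigr) = 1
\]
and
\[
\gamma\Bigl(\frac{M_n\log\log n}{n} - \frac{1}{\log 2} \le -\varepsilon \text{ i.o.}\Bigr) = 0
\]
or, equivalently,
\[
\gamma\Bigl(M_n \le \frac{n(1+\varepsilon')}{(\log\log n)(\log 2)} \text{ i.o.}\Bigr) = 1
\]
and
\[
\gamma\Bigl(M_n \le \frac{n(1-\varepsilon')}{(\log\log n)(\log 2)} \text{ i.o.}\Bigr) = 0,
\]
where $\varepsilon' = \varepsilon\log 2$. This follows immediately from Theorem 3.1.10. □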
To conclude this subsection we consider the $k$th smallest $m_n^{(k)}$ of $b_1, \cdots, b_n$, $1 \le k \le n$, $n \in \mathbf{N}_+$. Clearly, $m_n^{(1)} = M_n^{(n)}$. In general, we have $m_n^{(k)} = M_n^{(n-k+1)}$, $1 \le k \le n$. Then by (3.1.5) we have
\[
\bigl(m_n^{(k)} \le \theta n\bigr) = \bigl(S_n' < n - k + 1\bigr)
\]
for any $\theta \in \mathbf{R}_{++}$ and $n \in \mathbf{N}_+$. Hence, for any $\mu \in \mathrm{pr}(\mathcal{B}_I)$ such that $\mu \ll \lambda$,
\[
\mu\bigl(m_n^{(k)} \le \theta n\bigr) = \mu\bigl(S_n' < n - k + 1\bigr)
= \sum_{j=0}^{n-k}\mu(S_n' = j) = 1 - \sum_{j=n-k+1}^{n}\mu(S_n' = j).
\]
Since $n^{-1}S_n'$ converges to 0 in $\mu$-probability as $n \to \infty$ by Corollaries 3.1.3 and 3.1.5, we have
\[
\lim_{n\to\infty}\mu\bigl(S_n' = n - m\bigr) = 0
\]
for any fixed $m \in \mathbf{N}$. Consequently,
\[
\lim_{n\to\infty}\mu\bigl(m_n^{(k)} \le \theta n\bigr) = 1 \tag{3.1.8}
\]
for any fixed $k \in \mathbf{N}_+$. This result is not at all surprising. Indeed, by Proposition 4.1.1 we have
\[
\lim_{n\to\infty} a_n^{(k)} = 1 \quad \text{a.e.}
\]
for any fixed $k \in \mathbf{N}_+$, where $a_n^{(k)}$ denotes the $k$th smallest of $a_1, \cdots, a_n$. As $m_n^{(k)} \le a_n^{(k)} + 2$, $n \in \mathbf{N}_+$, $1 \le k \le n$, it follows that
\[
\lim_{n\to\infty}\frac{m_n^{(k)}}{n} = 0 \quad \text{a.e.}
\]
for any fixed $k \in \mathbf{N}_+$, which clearly entails (3.1.8).


Remark. It is proved in Iosifescu (1977) that if $(\eta_n)_{n\in\mathbf{N}_+}$ is a strictly stationary ψ-mixing sequence of positive random variables on a probability space $(\Omega, \mathcal{K}, P)$ such that for some real-valued function $g$ on $\mathbf{N}_+$ there exists the positive finite limit
\[
\lim_{n\to\infty} nP(\eta_n < g(n)) = \theta,
\]
say, then $P(\eta_k < g(n) \text{ for } p \text{ values } k,\ 1 \le k \le n) \to e^{-\theta}\theta^{p}/p!$ as $n \to \infty$ for any fixed $p \in \mathbf{N}$.
In particular this result applies to a sequence $(\eta_n)_{n\in\mathbf{N}_+}$ for which
\[
P(\eta_1 \ge x) = \log(1 + 1/x)/\log 2, \qquad x \ge 1,
\]
with
\[
g(n) = 1 + \frac{2\theta\log 2}{n}, \qquad n \in \mathbf{N}_+.
\]
For such a sequence, similarly to (3.1.4) we can write
\[
\lim_{n\to\infty} P\left(\frac{n(\eta_n^{(k)} - 1)}{2\log 2} \ge x\right) = e^{-x}\sum_{j=0}^{k-1}\frac{x^{j}}{j!}, \qquad x \in \mathbf{R}_{++}, \tag{3.1.9}
\]
for any fixed $k \in \mathbf{N}_+$, where $\eta_n^{(k)}$ denotes the $k$th smallest of $\eta_1, \cdots, \eta_n$, $1 \le k \le n$.
We cannot assert that (3.1.9) is true for $\eta_n = a_n$, $n \in \mathbf{N}_+$, since the equation $\gamma(a_1 \ge x) = \log(1 + 1/x)/\log 2$ holds just for $x \in \mathbf{N}_+$. It is conjectured in Iosifescu (1978) that (3.1.9) holds true for $\eta_n = r_n$, $n \in \mathbf{N}_+$, under any $P \ll \lambda$. [Notice that $\gamma(r_1 \ge x) = \log(1 + 1/x)/\log 2$ for any $x \ge 1$, but the sequence $(r_n)_{n\in\mathbf{N}_+}$ is not ψ-mixing under $\gamma$.] □
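The normalization $g(n) = 1 + 2\theta\log 2/n$ in the Remark is chosen so that $nP(\eta_1 < g(n)) \to \theta$. A minimal numerical sketch of this limit, assuming only the stated tail $P(\eta_1 \ge x) = \log(1 + 1/x)/\log 2$ for $x \ge 1$ (function names and sample values are illustrative):

```python
from math import log, log2


def prob_eta_below(x):
    """P(eta_1 < x) = 1 - log(1 + 1/x) / log 2, for x >= 1."""
    return 1 - log2(1 + 1 / x)


def g(n, theta):
    """The threshold g(n) = 1 + 2 * theta * log 2 / n from the Remark."""
    return 1 + 2 * theta * log(2) / n


def scaled_probability(n, theta):
    """n * P(eta_1 < g(n)), which should approach theta as n grows."""
    return n * prob_eta_below(g(n, theta))


if __name__ == "__main__":
    theta = 1.5  # illustrative value
    for n in (10, 10**3, 10**6):
        print(n, scaled_probability(n, theta))
```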

3.2 Normal convergence


3.2.1 Two general invariance principles
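Assume the framework of Subsection 2.1.5. Thus let $H$ be a real-valued function on $\mathbf{N}_+^{\mathbf{Z}}$. Set $H_l = H_1\circ\bar\tau^{\,l-1}$, $l \in \mathbf{Z}$, where
\[
H_1 = H(\cdots, \bar a_{-2}, \bar a_{-1}, \bar a_0, \bar a_1, \bar a_2, \cdots).
\]
Then $(H_l)_{l\in\mathbf{Z}}$ is a strictly stationary process on $(I^2, \mathcal{B}_I^2, \bar\gamma)$. Set $S_0 = 0$, $S_n = \sum_{i=1}^{n} H_i - nE_{\bar\gamma}H_1$, $n \in \mathbf{N}_+$, assuming that the mean value $E_{\bar\gamma}H_1$ exists and is finite. For any $n \in \mathbf{N}_+$ let us define the stochastic processes $\xi_n^C = (\xi_n^C(t))_{t\in I}$ and $\xi_n^D = (\xi_n^D(t))_{t\in I}$ by
\[
\xi_n^C(t) = \frac{1}{\sigma\sqrt{n}}\bigl(S_{\lfloor nt\rfloor} + (nt - \lfloor nt\rfloor)(H_{\lfloor nt\rfloor+1} - E_{\bar\gamma}H_1)\bigr),
\]
\[
\xi_n^D(t) = \frac{1}{\sigma\sqrt{n}}\,S_{\lfloor nt\rfloor}, \qquad t \in I,
\]
where $\sigma = \sigma(H)$ is a positive number which will be specified later.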


We start with a weak invariance principle.
Theorem 3.2.1 Assume that $E_{\bar\gamma}H_1^2 < \infty$ and
\[
\sum_{n\in\mathbf{N}_+} E_{\bar\gamma}^{1/2}\bigl[H_1 - E_{\bar\gamma}(H_1 \mid \bar a_{-n}, \cdots, \bar a_n)\bigr]^2 < \infty \tag{3.2.1}
\]
so that by Propositions 2.1.19 and 2.1.21
\[
\lim_{n\to\infty}\frac{1}{n}E_{\bar\gamma}S_n^2 = \sigma^2 \ge 0
\]
exists finitely and is given by the absolutely convergent series
\[
\sigma^2 = E_{\bar\gamma}H_1^2 - E_{\bar\gamma}^2H_1 + 2\sum_{n\in\mathbf{N}_+}\bigl(E_{\bar\gamma}H_1H_{n+1} - E_{\bar\gamma}^2H_1\bigr). \tag{3.2.2}
\]
If $\sigma > 0$ then $\bar\gamma\xi_n^{-1} \xrightarrow{w} W$ in both $C$ and $D$, where $\xi_n$ stands for either $\xi_n^C$ or $\xi_n^D$. The last conclusion still holds when $\bar\gamma$ is replaced by any $\mu \in \mathrm{pr}(\mathcal{B}_I^2)$ such that $\mu \ll \lambda^2$.
Proof. This is a transcription of Theorem 21.1 in Billingsley (1968), with an improvement by Popescu (1978) (concerning the possibility of replacing $\bar\gamma$ by $\mu$), for the special case of the doubly infinite sequence $(\bar a_l)_{l\in\mathbf{Z}}$. Note that in Proposition 2.1.22 a class of functions $H$ is indicated, for which (3.2.1) holds. □
Next, we state a strong invariance principle.
Theorem 3.2.2 Assume that there exist constants $0 < \delta \le 2$ and $c > 0$ such that $E_{\bar\gamma}|H_1|^{2+\delta} < \infty$ and
\[
E_{\bar\gamma}^{1/(2+\delta)}\bigl|H_1 - E_{\bar\gamma}(H_1 \mid \bar a_{-n}, \cdots, \bar a_n)\bigr|^{2+\delta} \le c\,n^{-(2+7/\delta)}, \qquad n \in \mathbf{N}_+, \tag{3.2.3}
\]
so that (3.2.1) holds and
\[
\lim_{n\to\infty}\frac{1}{n}E_{\bar\gamma}S_n^2 = \sigma^2 \ge 0
\]
exists finitely and is given by the absolutely convergent series (3.2.2). If $\sigma > 0$ then the strong invariance principle holds for the stochastic processes $\xi_n^C$ and $\xi_n^D$, $n \in \mathbf{N}_+$. That is, without changing their distributions, we can redefine these processes on a common richer probability space together with a standard Brownian motion process $(w(t))_{t\in I}$ such that
\[
\sup_{t\in I}|\xi_n(t) - w(t)| = O(n^{-a}) \quad \text{a.s.}
\]
as $n \to \infty$, with a random constant implied in $O$, for each $a > 0$ small enough, depending on $\delta$. Here $\xi_n$ stands for either $\xi_n^C$ or $\xi_n^D$.
Proof. This is a transcription of Theorem 7.1.1 in Philipp and Stout (1975) for the special case of the doubly infinite sequence $(\bar a_l)_{l\in\mathbf{Z}}$. □
For further reference we also consider the special case where $H$ only depends on the coordinates with positive indices of a current point in $\mathbf{N}_+^{\mathbf{Z}}$, i.e., $H$ is a real-valued function on $\mathbf{N}_+^{\mathbf{N}_+}$. (Completely similar considerations can be made in the case where $H$ only depends on the coordinates with non-positive indices of a current point in $\mathbf{N}_+^{\mathbf{Z}}$, i.e., $H$ is a real-valued function on $\mathbf{N}_+^{(-\mathbf{N})}$.) In this case we set $H_n = H_1\circ\tau^{n-1}$, $n \in \mathbf{N}_+$, where
\[
H_1 = H(a_1, a_2, \cdots),
\]
and we have a strictly stationary sequence $(H_n)_{n\in\mathbf{N}_+}$ on $(I, \mathcal{B}_I, \gamma)$. With the same definitions as before for $S_n$, $\xi_n^C$ and $\xi_n^D$, $n \in \mathbf{N}_+$, where $E_{\bar\gamma}H_1$ is replaced by $E_\gamma H_1$, we can state the following special cases of Theorems 3.2.1 and 3.2.2.
Theorem 3.2.1' Assume that $E_\gamma H_1^2 < \infty$ and
\[
\sum_{n\in\mathbf{N}_+} E_\gamma^{1/2}\bigl[H_1 - E_\gamma(H_1 \mid a_1, \cdots, a_n)\bigr]^2 < \infty \tag{3.2.1$'$}
\]
so that
\[
\lim_{n\to\infty}\frac{1}{n}E_\gamma S_n^2 = \sigma^2 \ge 0
\]
exists finitely and is given by the absolutely convergent series
\[
\sigma^2 = E_\gamma H_1^2 - E_\gamma^2H_1 + 2\sum_{n\in\mathbf{N}_+}\bigl(E_\gamma H_1H_{n+1} - E_\gamma^2H_1\bigr). \tag{3.2.2$'$}
\]
If $\sigma > 0$ then $\gamma\xi_n^{-1} \xrightarrow{w} W$ in both $C$ and $D$, where $\xi_n$ stands for either $\xi_n^C$ or $\xi_n^D$. The last conclusion still holds when $\gamma$ is replaced by any $\mu \in \mathrm{pr}(\mathcal{B}_I)$ such that $\mu \ll \lambda$.
Note that inequality (2.1.32) and Proposition 2.1.23 describe two classes of functions $H$ for which (3.2.1') holds.
Theorem 3.2.2' Assume that there exist constants $0 < \delta \le 2$ and $c > 0$ such that $E_\gamma|H_1|^{2+\delta} < \infty$ and
\[
E_\gamma^{1/(2+\delta)}\bigl|H_1 - E_\gamma(H_1 \mid a_1, \cdots, a_n)\bigr|^{2+\delta} \le c\,n^{-(2+7/\delta)}, \qquad n \in \mathbf{N}_+, \tag{3.2.3$'$}
\]
so that (3.2.1') holds and
\[
\lim_{n\to\infty}\frac{1}{n}E_\gamma S_n^2 = \sigma^2 \ge 0
\]
exists finitely and is given by the absolutely convergent series (3.2.2'). If $\sigma > 0$ then the strong invariance principle holds for the stochastic processes $\xi_n^C$ and $\xi_n^D$, $n \in \mathbf{N}_+$. That is, without changing their distributions, we can redefine these processes on a common richer probability space together with a standard Brownian motion process $(w(t))_{t\in I}$ such that
\[
\sup_{t\in I}|\xi_n(t) - w(t)| = O(n^{-a}) \quad \text{a.s.}
\]
as $n \to \infty$, with a random constant implied in $O$, for each $a > 0$ small enough, depending on $\delta$. Here $\xi_n$ stands for either $\xi_n^C$ or $\xi_n^D$.

3.2.2 The case of incomplete quotients


An important special case of Theorem 3.2.1' is obtained when the function $H$ only depends on finitely many coordinates of a current point of $\mathbf{N}_+^{\mathbf{N}_+}$, i.e., when $H$ is a real-valued function on $\mathbf{N}_+^k$ for a given $k \in \mathbf{N}_+$. In this case $H_n = H(a_n, \ldots, a_{n+k-1})$, $n \in \mathbf{N}_+$, assumption (3.2.1') is trivially satisfied, and by Corollary 1.2.5 we have
\[
E_\gamma H_1^r = \frac{1}{\log 2}\sum_{i^{(k)}\in\mathbf{N}_+^k} H^r(i^{(k)})\log\frac{1 + v(i^{(k)})}{1 + u(i^{(k)})}
\]
with $r = 1$ or 2, and
\[
\sigma^2 = E_\gamma H_1^2 - E_\gamma^2H_1
+ 2\sum_{n\in\mathbf{N}_+}\left(\sum_{i^{(n+k)}\in\mathbf{N}_+^{n+k}}\frac{H(i^{(k)})\,H(i_{n+1}, \cdots, i_{n+k})}{\log 2}\log\frac{1 + v(i^{(n+k)})}{1 + u(i^{(n+k)})} - E_\gamma^2H_1\right). \tag{3.2.2$''$}
\]
Note that in the case $k = 1$ by either Corollary 2.1.25 or Proposition A3.4 we have $\sigma = 0$ if and only if $H = \text{const}$. It is an open problem to find necessary and sufficient conditions in terms of $H$ for $\sigma = 0$ to hold in the case $k > 1$.
The special framework assumed allows for an estimate of the convergence rate in the classical central limit theorem. Thus we have the following result.
Theorem 3.2.3 If $\sigma > 0$ and
\[
E_\gamma|H_1|^{2+\delta} = \frac{1}{\log 2}\sum_{i^{(k)}\in\mathbf{N}_+^k}\bigl|H(i^{(k)})\bigr|^{2+\delta}\log\frac{1 + v(i^{(k)})}{1 + u(i^{(k)})} < \infty
\]
for some $\delta > 0$, then there exist two positive constants $a < 1$ and $c$ such that
\[
\left|\gamma\left(\frac{\sum_{j=1}^{n} H_j - nE_\gamma H_1}{\sigma\sqrt{n}} < x\right) - \Phi(x)\right| \le c\,n^{-a}
\]
for any $x \in \mathbf{R}$ and $n \in \mathbf{N}_+$.
Proof. This is a transcription of Theorem 1 in Iosifescu (1968) for the special case of the sequence $(a_n)_{n\in\mathbf{N}_+}$ of incomplete quotients. □
Remark. It is an open problem to determine the optimal value of $a$ in Theorem 3.2.3. We conjecture that $a = \delta/2$, that is, the same value as in the case of i.i.d. random variables with finite $(2+\delta)$-absolute moment. □
In what follows, by restricting the class of functions $H$ we give more precise results in the case $k = 1$. To emphasize this special framework we change the notation by using the letter $f$ instead of $H$.
Theorem 3.2.4 Let $f : \mathbf{N}_+ \to \mathbf{R}$, $A_n \in \mathbf{R}$, $B_n \in \mathbf{R}_{++}$, $n \in \mathbf{N}_+$, with $\lim_{n\to\infty} B_n = \infty$, and define
\[
X_{nj} = B_n^{-1}\bigl(f(a_j) - A_n\bigr), \qquad 1 \le j \le n,
\]
\[
S_{n0} = 0, \qquad S_{nk} = \sum_{j=1}^{k} X_{nj}, \quad 1 \le k \le n, \qquad S_{nn} = S_n, \quad n \in \mathbf{N}_+,
\]
\[
F(x) = \frac{1}{\log 2}\sum_{\{k:\,|f(k)|\le x\}} f^2(k)\,k^{-2},
\]
\[
\tilde F(x) = E_\gamma f^2(a_1)I_{(|f(a_1)|\le x)} = \frac{1}{\log 2}\sum_{\{k:\,|f(k)|\le x\}} f^2(k)\log\Bigl(1 + \frac{1}{k(k+2)}\Bigr), \qquad x \in \mathbf{R}_+.
\]
(i) The following assertions are equivalent.
(I) The stochastic process $\xi_n^D = \xi_n = (\xi_n(t))_{t\in I}$ defined for any $n \in \mathbf{N}_+$ by $\xi_n(t) = S_{n\lfloor nt\rfloor}$, $t \in I$, satisfies
\[
\gamma\xi_n^{-1} \xrightarrow{w} W_D \quad \text{in } \mathcal{B}_D,
\]
where $W_D$ is the Wiener measure on $\mathcal{B}_D$.
(II) $\gamma S_n^{-1} \xrightarrow{w} N(0, 1)$, and the array $X = \{X_{nj},\ 1 \le j \le n,\ n \in \mathbf{N}_+\}$ is s.i. under $\gamma$.
(ii) When $\lim_{x\to\infty}\tilde F(x) = E_\gamma f^2(a_1) = \infty$, assertion (I) above holds with a bounded sequence $(A_n)_{n\in\mathbf{N}_+}$ if and only if
\[
\lim_{x\to\infty}\frac{x^2\sum_{\{k:\,|f(k)|>x\}} k^{-2}}{\sum_{\{k:\,|f(k)|\le x\}} f^2(k)\,k^{-2}} = 0 \tag{3.2.4}
\]
or, equivalently (see Theorem A2.5), if and only if $F$ is slowly varying. If this is the case, then we can take $A_n = E_\gamma f(a_1)$, $n \in \mathbf{N}_+$, and any sequence $(B_n)_{n\in\mathbf{N}_+}$ such that $\lim_{n\to\infty} nB_n^{-2}F(B_n) = 1$.
When $E_\gamma f^2(a_1) < \infty$, assertion (I) holds with a bounded sequence $(A_n)_{n\in\mathbf{N}_+}$ if and only if $f \ne \text{const}$. If this is the case, then we can take $A_n = E_\gamma f(a_1)$ and $B_n = \sqrt{n}\,\sigma_{(0)}E_\gamma^{1/2}f^2(a_1)$, $n \in \mathbf{N}_+$, for some $\sigma_{(0)} > 0$.
(iii) If either (I) or (II) holds, then $\gamma$ can be replaced in (i) by any $\mu \in \mathrm{pr}(\mathcal{B}_I)$ such that $\mu \ll \lambda$.
Proof. (i) and (iii) follow from Theorem A3.7 and Lemma 3.0.2, respectively.
We thus have to only prove (ii). First, since
\[
\lim_{k\to\infty}\frac{\log\bigl(1 + \frac{1}{k(k+2)}\bigr)}{k^{-2}} = 1,
\]
either $F$ and $\tilde F$ both tend to $\infty$ as $x \to \infty$ and $\lim_{x\to\infty} F(x)/\tilde F(x) = 1$ or both have finite limits as $x \to \infty$. Consequently, $F$ is slowly varying if and only if $\tilde F$ is.
Assume that (3.2.4) holds. Note that this does always happen when
\[
0 < E_\gamma f^2(a_1) = \lim_{x\to\infty}\tilde F(x) < \infty.
\]
Then Theorem A3.12 applies with $X_n = f(a_n)$, $n \in \mathbf{N}_+$, and
\[
m^2(X_1) = \begin{cases}
\dfrac{E_\gamma^2 f(a_1)}{E_\gamma f^2(a_1)} & \text{if } E_\gamma f^2(a_1) < \infty,\\[8pt]
0 & \text{if } E_\gamma f^2(a_1) = \infty,
\end{cases}
\]
\[
\varphi_1^{(0)} = 1, \qquad \varphi_n^{(0)} = \begin{cases}
\dfrac{E_\gamma f(a_1)f(a_n)}{E_\gamma f^2(a_1)} & \text{if } E_\gamma f^2(a_1) < \infty,\\[8pt]
0 & \text{if } E_\gamma f^2(a_1) = \infty
\end{cases}
\]
for $n \ge 2$ [use Proposition A3.1 and equation (A3.2)], and $\sigma_{(0)}^2$ equals either
\[
\frac{E_\gamma f^2(a_1) - E_\gamma^2 f(a_1) + 2\sum_{n\in\mathbf{N}_+}\bigl(E_\gamma f(a_1)f(a_{n+1}) - E_\gamma^2 f(a_1)\bigr)}{E_\gamma f^2(a_1)}
\]
or 1 according as $E_\gamma f^2(a_1) < \infty$ or $E_\gamma f^2(a_1) = \infty$. Noting that when $E_\gamma f^2(a_1) < \infty$ by Corollary 2.1.25 we have $\sigma_{(0)} \ne 0$ if and only if $f \ne \text{const.}$, we conclude that with $A_n$ and $B_n$, $n \in \mathbf{N}_+$, as indicated we have $\gamma\xi_n^{-1} \xrightarrow{w} W_D$, that is, (I) holds with a bounded sequence $(A_n)_{n\in\mathbf{N}_+}$.
Next, assume that (I) or, equivalently, (II) holds with a bounded se-
quence (An )n∈N+ . Clearly, this cannot happen if f = const. It thus remains
to show that Fe is slowly varying when

lim Fe (x) = ∞. (3.2.5)


x→∞

Fix δ ∈ (0, 1) and put Xnjδ = Xnj I(|Xnj |≤δ) − Eγ Xnj I(|Xnj |≤δ) for any
w
1 ≤ j ≤ n, n ∈ N+ . As γSn−1 → N (0, 1) by (II), it follows from Theorem
A3.11(i) that
 2
Xn
lim Eγ  Xnjδ  = 1. (3.2.6)
n→∞
j=1

On the other hand, it follows from Corollary A3.2 that


 2  
Xn X
Eγ  Xnjδ  ≤ 1 + 2 ψ(k) nEγ Xn1
2
I(|Xn1 |≤δ) , n ∈ N+ . (3.2.7)
j=1 k∈N+

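Now, note that $|f(i) - A_n| \le \delta B_n$ entails
\[ |f(i)| \le |A_n| + \delta B_n = B_n \bigl( |A_n| B_n^{-1} + \delta \bigr) \le B_n \]
for any $n$ large enough since $\delta \in (0, 1)$, $(A_n)_{n \in \mathbb{N}_+}$ is bounded, and $\lim_{n\to\infty} B_n = \infty$.
Then for such an $n$ we have
\[
E_\gamma X_{n1}^2 I_{(|X_{n1}| \le \delta)} \le B_n^{-2} E_\gamma (f(a_1) - A_n)^2 I_{(|f(a_1)| \le B_n)}
\le 2 B_n^{-2} \bigl( \widetilde F(B_n) + A_n^2 \bigr),
\]
whence, by (3.2.5),
\[ E_\gamma X_{n1}^2 I_{(|X_{n1}| \le \delta)} \le 4 B_n^{-2} \widetilde F(B_n) \tag{3.2.8} \]
for any $n$ large enough. It follows from (3.2.6) through (3.2.8) that there
exist $c > 0$ and $n_0 \in \mathbb{N}_+$ such that
\[ n B_n^{-2} \widetilde F(B_n) \ge c, \quad n \ge n_0. \tag{3.2.9} \]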

Finally, by Theorem A3.11 we also have
\[ \lim_{n\to\infty} n \gamma(|X_{n1}| > \varepsilon) = 0 \]
for any $\varepsilon > 0$. Since
\[ (|X_{n1}| > \varepsilon) = (|f(a_1) - A_n| > \varepsilon B_n) \supset (|f(a_1)| > |A_n| + \varepsilon B_n) \]
and $\lim_{n\to\infty} (|A_n| + \varepsilon B_n)/\varepsilon B_n = 1$, we then have
\[ \lim_{n\to\infty} n \gamma(|f(a_1)| > B_n) = 0. \tag{3.2.10} \]
It follows from (3.2.9) and (3.2.10) that
\[ \lim_{n\to\infty} \frac{B_n^2\, \gamma(|f(a_1)| > B_n)}{E_\gamma f^2(a_1) I_{(|f(a_1)| \le B_n)}} = 0. \]
Noting that $\lim_{n\to\infty} B_{n+1}/B_n = 1$ (this follows from, e.g., Theorem A3.9,
but a direct proof can also easily be given), the last equation implies
\[ \lim_{x\to\infty} \frac{x^2\, \gamma(|f(a_1)| > x)}{E_\gamma f^2(a_1) I_{(|f(a_1)| \le x)}} = 0, \]
which shows by Theorem A2.5 that $\widetilde F$ is slowly varying.


Remarks. 1. Theorem 3.2.4 still holds if we replace $D$ by $C$, $W_D$ by $W_C$,
and the stochastic process $\xi_n^D$ by the stochastic process $\xi_n^C$ defined by
\[ \xi_n^C(t) = S_{n\lfloor nt \rfloor} + (nt - \lfloor nt \rfloor) \bigl( S_{n(\lfloor nt \rfloor + 1)} - S_{n\lfloor nt \rfloor} \bigr), \quad t \in I,\ n \in \mathbb{N}_+. \]
This follows from Theorem A3.8.

2. For the many consequences of Theorem 3.2.4 (as well as of other
similar further results) concerning, e.g., the asymptotic behaviour as $n \to \infty$
of random variables such as $\min_{0 \le k \le n} S_{nk}$, $\max_{0 \le k \le n} S_{nk}$, $\max_{0 \le k \le n} |S_{nk}|$, $U_n$ =
number of indices $k$, $1 \le k \le n$, for which $S_{nk} > 0$, we refer the reader to
Billingsley (1968, § 11). In particular, in the last case we have an arc-sine law
\[ \lim_{n\to\infty} \mu \Bigl( \frac{U_n}{n} < a \Bigr) = \frac{2}{\pi} \arcsin \sqrt{a}, \quad 0 \le a \le 1, \]
for any $\mu \in \mathrm{pr}(\mathcal{B}_I)$ such that $\mu \ll \lambda$. □
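Example 3.2.5 Let $f(n) = n^{a+1/2}$, $n \in \mathbb{N}_+$, with $a \in \mathbb{R}$. Clearly, for
$a < 0$ we have $E_\gamma f^2(a_1) < \infty$. For $a = 0$ we have $E_\gamma f^2(a_1) = \infty$, $F(x) \sim
2 \log x / \log 2$, and $x^2 \sum_{\{k : |f(k)| > x\}} k^{-2} = O(1)$ as $x \to \infty$. Thus (3.2.4) holds and
we can take
\[ A_n = E_\gamma\bigl(a_1^{1/2}\bigr) = \frac{1}{\log 2} \sum_{k \in \mathbb{N}_+} k^{1/2} \log \Bigl( 1 + \frac{1}{k(k+2)} \Bigr) \]
and $B_n = (n \log n / \log 2)^{1/2}$, $n \in \mathbb{N}_+$. It is easy to check that
\[ \zeta(3/2)/(6 \log 2) < A_n < \zeta(3/2)/\log 2 \]
and that, since $\log\bigl(1 + 1/(k(k+2))\bigr) = 2\log(k+1) - \log k - \log(k+2)$, we can also write
\[ A_n = \frac{1}{\log 2} \sum_{k \ge 2} \bigl( 2\sqrt{k-1} - \sqrt{k} - \sqrt{k-2} \bigr) \log k, \quad n \in \mathbb{N}_+. \]
Finally, for $a > 0$ we have $F(x) \sim x^{4a/(2a+1)}/(2a \log 2)$ and $x^2 \sum_{\{k : |f(k)| > x\}} k^{-2}
\sim x^{4a/(2a+1)}$ as $x \to \infty$, that is, (3.2.4) does not hold. □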
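The two series for the centring constant in the case a = 0 of Example 3.2.5 can be compared numerically. The sketch below is only an illustration: the function names are hypothetical, both series are truncated at a finite number of terms, and the second series is taken with the same 1/log 2 prefactor as the first, which is what the summation-by-parts identity requires (an assumption about the intended formula).

```python
import math

LOG2 = math.log(2.0)


def zeta_three_halves(terms=100_000):
    """Approximate zeta(3/2): partial sum plus Euler-Maclaurin tail correction."""
    s = 1.5
    partial = math.fsum(k ** -s for k in range(1, terms + 1))
    # Tail estimate: integral of x^-s from `terms` to infinity minus half the last term.
    return partial + terms ** (1 - s) / (s - 1) - 0.5 * terms ** -s


def a_direct(terms):
    """Partial sum of (1/log 2) * sum_k sqrt(k) * log(1 + 1/(k(k+2)))."""
    total = math.fsum(
        math.sqrt(k) * math.log1p(1.0 / (k * (k + 2))) for k in range(1, terms + 1)
    )
    return total / LOG2


def a_rearranged(terms):
    """Partial sum of (1/log 2) * sum_{k>=2} (2 sqrt(k-1) - sqrt(k) - sqrt(k-2)) log k."""
    total = math.fsum(
        (2 * math.sqrt(k - 1) - math.sqrt(k) - math.sqrt(k - 2)) * math.log(k)
        for k in range(2, terms + 1)
    )
    return total / LOG2


if __name__ == "__main__":
    n = 200_000
    z = zeta_three_halves()
    print("lower bound :", z / (6 * LOG2))
    print("direct      :", a_direct(n))
    print("rearranged  :", a_rearranged(n))
    print("upper bound :", z / LOG2)
```

Both partial sums converge slowly (the tails are of order $N^{-1/2}$ and $N^{-1/2}\log N$), so agreement to two decimal places is all that should be expected at this truncation level.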


As a special case of Theorem 3.2.2′ we note the following result.
Proposition 3.2.6 Let $f : \mathbb{N}_+ \to \mathbb{R}$ be a non-constant function. Assume
that there exists a constant $\delta > 0$ such that $E_\gamma |f(a_1)|^{2+\delta} < \infty$. Put
$S_0 = 0$, $S_n = \sum_{i=1}^n f(a_i) - n E_\gamma f(a_1)$, $n \in \mathbb{N}_+$. Let
\[ \sigma^2 = E_\gamma f^2(a_1) - E_\gamma^2 f(a_1) + 2 \sum_{n \in \mathbb{N}_+} \bigl( E_\gamma f(a_1) f(a_{n+1}) - E_\gamma^2 f(a_1) \bigr), \]
which by Corollary 2.1.25 is positive. Then the strong invariance principle
holds for the stochastic processes $\xi_n^C$ and $\xi_n^D$, $n \in \mathbb{N}_+$. That is, without
changing their distributions we can redefine these processes on a common
richer probability space together with a standard Brownian motion process
$(w(t))_{t \in I}$ such that
\[ \sup_{t \in I} |\xi_n(t) - w(t)| = O(n^{-a}) \quad \text{a.s.} \tag{3.2.11} \]
as $n \to \infty$, with a random constant implied in $O$, for each $a > 0$ small
enough, depending on $\delta$. Here $\xi_n$ stands for either $\xi_n^C$ or $\xi_n^D$.
Remark. It follows from a general result of Heyde and Scott (1973) that
if we only assume $E_\gamma f^2(a_1) < \infty$, then instead of (3.2.11) we can only assert
that
\[ \sup_{t \in I} |\xi_n(t) - w(t)| = o\bigl( (\log \log n)^{1/2} \bigr) \quad \text{a.s.} \]
as $n \to \infty$, with a random constant implied in $o$. □

3.2.3 The case of associated random variables


Write $b_n$ for either $y_n$, $r_n$ or $u_n$, $n \in \mathbb{N}_+$, respectively $\bar b_\ell$ for either $\bar y_\ell$, $\bar r_\ell$ or
$\bar u_\ell$, $\ell \in \mathbb{Z}$. We now give a partial extension of Theorem 3.2.4 to the sequence
$(b_n)_{n \in \mathbb{N}_+}$ in the case of infinite variance.
Theorem 3.2.7 Assume $f : [1, \infty) \to \mathbb{R}_+$ is regularly varying of index
1/2, $E_\gamma f^2(a_1) = \infty$, and $f(x) = x^{1/2} L(x)$, where $L(x) = c \exp\bigl( \int_1^x \varepsilon(t) t^{-1}\, dt \bigr)$,
$x \ge 1$, with $c > 0$, $\varepsilon : [1, \infty) \to \mathbb{R}_+$ continuous, and $\lim_{t\to\infty} \varepsilon(t) = 0$. For
any $n \in \mathbb{N}_+$ define the stochastic process $\xi_n' = (\xi_n'(t))_{t \in I}$ by
\[ \xi_n'(t) = \frac{1}{B_n} \sum_{j \le \lfloor nt \rfloor} \bigl( f(b_j) - E_\gamma f(\bar b_0) \bigr), \quad t \in I, \]
with the usual convention which assigns value 0 to a sum over the empty
set, where $(B_n)_{n \in \mathbb{N}_+}$ is any sequence satisfying $\lim_{n\to\infty} n B_n^{-2} F(B_n) = 1$
with $F$ defined as in Theorem 3.2.4, and $E_\gamma f(\bar b_0)$ is equal to
\[
E_\gamma f(\bar y_0) = \frac{1}{\log 2} \int_1^\infty \frac{f(x)\, dx}{x(x+1)}, \qquad
E_\gamma f(\bar r_0) = E_\gamma f(r_1) = \frac{1}{\log 2} \int_1^\infty \frac{f(x)\, dx}{x(x+1)}
\]
or
\[ E_\gamma f(\bar u_0) = \frac{1}{\log 2} \Bigl( \int_1^2 \frac{(x-1) f(x)\, dx}{x^2} + \int_2^\infty \frac{f(x)\, dx}{x^2} \Bigr) \]
according as $b_n$ denotes $y_n$, $r_n$ or $u_n$, $n \in \mathbb{N}_+$. Then
\[ \mu \xi_n'^{-1} \xrightarrow{w} W_D \quad \text{in } \mathcal{B}_D \]
for any $\mu \in \mathrm{pr}(\mathcal{B}_I)$ such that $\mu \ll \lambda$.

The proof of Theorem 3.2.7 for the cases where $b_n = r_n$ or $b_n = u_n$, $n \in \mathbb{N}_+$,
can be found in Samur (1989, pp. 75–77). The case where $b_n = y_n$, $n \in \mathbb{N}_+$,
can be treated in a similar manner. □

We note that the hypothesis of a slowly varying $F$ occurring in Theorem
3.2.4 is replaced here by stronger hypotheses. [By Corollary A2.7(ii) the
assumptions on $f$ imply that $F$ is slowly varying.] And even the Karamata
representation of $f$ is assumed to present special features (compare with
Theorem A2.1).
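Example 3.2.8 Let $f(x) = x^{1/2}$, $x \in [1, \infty)$ (cf. Example 3.2.5). Theorem
3.2.7 holds with $B_n = (n \log n / \log 2)^{1/2}$, $n \in \mathbb{N}_+$, and
\[ E_\gamma f(\bar y_0) = E_\gamma f(r_1) = \frac{1}{\log 2} \int_1^\infty \frac{dx}{\sqrt{x}\,(x+1)} = \frac{\pi}{2 \log 2}, \]
\[ E_\gamma f(\bar u_0) = \frac{1}{\log 2} \Bigl( \int_1^2 \frac{(x-1)\, dx}{x^{3/2}} + \int_2^\infty \frac{dx}{x^{3/2}} \Bigr) = \frac{4(\sqrt{2} - 1)}{\log 2}. \]
□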
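The two constants in Example 3.2.8 follow from elementary integrations. The worked check below is a sketch added for verification, using the substitution $x = u^2$ in the first integral and explicit antiderivatives in the second.

```latex
\begin{align*}
\int_1^\infty \frac{dx}{\sqrt{x}\,(x+1)}
  &= \int_1^\infty \frac{2\,du}{u^2+1} \qquad (x = u^2) \\
  &= 2\Bigl(\frac{\pi}{2} - \frac{\pi}{4}\Bigr) = \frac{\pi}{2}, \\[1ex]
\int_1^2 \frac{(x-1)\,dx}{x^{3/2}} + \int_2^\infty \frac{dx}{x^{3/2}}
  &= \Bigl[\,2\sqrt{x} + \frac{2}{\sqrt{x}}\,\Bigr]_1^2
     + \Bigl[-\frac{2}{\sqrt{x}}\Bigr]_2^\infty \\
  &= (3\sqrt{2} - 4) + \sqrt{2} = 4(\sqrt{2} - 1).
\end{align*}
```

Dividing both results by $\log 2$ gives the stated values $\pi/(2\log 2)$ and $4(\sqrt{2}-1)/\log 2$.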
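The next result covers the case of finite variance.
Theorem 3.2.9 Let $f : [1, \infty) \to \mathbb{R}$. Assume that either
(i) $f$ satisfies a Lipschitz condition of order $0 < \varepsilon \le 1$, that is,
\[ \sup_{x \ne y,\ x, y \ge 1} \frac{|f(x) - f(y)|}{|x - y|^\varepsilon} := s_\varepsilon(f) < \infty, \]
and $\int_1^\infty |f(x)|^{2+\delta} x^{-2}\, dx < \infty$ for some $\delta \ge 0$
or
(ii) $f = I_{(b, \infty)}$ for some $b > 1$.
Put $S_0' = 0$, $S_n' = \sum_{i=1}^n (f(b_i) - E_\gamma f(\bar b_0))$, $n \in \mathbb{N}_+$, and for any $n \in \mathbb{N}_+$
define the stochastic processes $\xi_n^{\prime C} = (\xi_n^{\prime C}(t))_{t \in I}$ and $\xi_n^{\prime D} = (\xi_n^{\prime D}(t))_{t \in I}$ on
$(I, \mathcal{B}_I, \gamma)$ by
\[ \xi_n^{\prime C}(t) = \frac{1}{\sigma(f) \sqrt{n}} \bigl( S'_{\lfloor nt \rfloor} + (nt - \lfloor nt \rfloor)(f(b_{\lfloor nt \rfloor + 1}) - E_\gamma f(\bar b_0)) \bigr), \]
\[ \xi_n^{\prime D}(t) = \frac{S'_{\lfloor nt \rfloor}}{\sigma(f) \sqrt{n}}, \quad t \in I, \]
where $\sigma(f)$ is a positive number which is defined by (3.2.12) below. Then
\[ \lim_{n\to\infty} \frac{1}{n} E_\gamma \Bigl( \sum_{i=1}^n \bigl( f(b_i) - E_\gamma f(\bar b_0) \bigr) \Bigr)^2 = \sigma^2(f) \ge 0 \tag{3.2.12} \]
exists finitely. If $\sigma(f) > 0$ then

(a) assuming that $\delta = 0$, for any $\mu \in \mathrm{pr}(\mathcal{B}_I)$ such that $\mu \ll \lambda$ we have
\[ \mu \xi_n'^{-1} \xrightarrow{w} W \quad \text{in both } \mathcal{B}_C \text{ and } \mathcal{B}_D, \]
where $\xi_n'$ stands for either $\xi_n^{\prime C}$ or $\xi_n^{\prime D}$;

(b) assuming that $\delta > 0$, the strong invariance principle holds for the
stochastic processes $\xi_n^{\prime C}$ and $\xi_n^{\prime D}$, $n \in \mathbb{N}_+$. That is, without changing their
distributions we can redefine these processes on a richer common probability
space together with a standard Brownian motion process $(w(t))_{t \in I}$ such that
\[ \sup_{t \in I} |\xi_n'(t) - w(t)| = O(n^{-a}) \quad \text{a.s.} \]
as $n \to \infty$, with a random constant implied in $O$, for each $a > 0$ small
enough, depending on $\delta$. Here $\xi_n'$ stands for either $\xi_n^{\prime C}$ or $\xi_n^{\prime D}$.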
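Proof. We shall show that (a) and (b) follow from Theorems 3.2.1 and
3.2.2, respectively. We use the notation of Subsection 2.1.5. Define
\[
H((i_\ell)_{\ell \in \mathbb{Z}}) = f\bigl( \bar b_1([i_1, i_2, \cdots], [i_0, i_{-1}, \cdots]) \bigr), \quad (i_\ell)_{\ell \in \mathbb{Z}} \in \mathbb{N}_+^{\mathbb{Z}},
\tag{3.2.13}
\]
\[ H_1 = H((\bar a_\ell)_{\ell \in \mathbb{Z}}), \quad H_m = H_1 \circ \tau^{m-1}, \quad m \in \mathbb{N}_+. \]
Hence
\[
h(\omega, \theta) = \begin{cases}
f(1/\theta) & \text{in the case where } \bar b_\ell = \bar y_\ell,\ \ell \in \mathbb{Z}, \\
f(1/\omega) & \text{in the case where } \bar b_\ell = \bar r_\ell,\ \ell \in \mathbb{Z}, \\
f(\theta + 1/\omega) & \text{in the case where } \bar b_\ell = \bar u_\ell,\ \ell \in \mathbb{Z}
\end{cases}
\]
for $(\omega, \theta) \in \Omega^2$. Also, as in the proof of Proposition 2.1.22 we easily obtain
\[
\begin{aligned}
& E_\gamma |H_1 - E_\gamma(H_1 \mid a_{-n}, \cdots, a_n)|^{2+\delta} \\
&\quad = \sum_{i_{-n}, \cdots, i_n \in \mathbb{N}_+} \frac{1}{\bar\gamma^{2+\delta}(I^2(i_{-n}, \cdots, i_n))} \int_{I^2(i_{-n}, \cdots, i_n)} \bar\gamma(d\omega', d\theta') \\
&\qquad \times \Bigl| \int_{I^2(i_{-n}, \cdots, i_n)} (h(\omega', \theta') - h(\omega, \theta))\, \bar\gamma(d\omega, d\theta) \Bigr|^{2+\delta}.
\end{aligned}
\tag{3.2.14}
\]
Now, under (i) it is easy to check that $h$ satisfies an inequality of the form
(2.1.30), which yields $c_n \le c r^n$, $n \in \mathbb{N}_+$, for some $c > 0$ and $0 < r < 1$,
with $c_n$, $n \in \mathbb{N}_+$, defined as in Proposition 2.1.22. It follows from (3.2.14)
that
\[ E_\gamma^{1/(2+\delta)} |H_1 - E_\gamma(H_1 \mid a_{-n}, \cdots, a_n)|^{2+\delta} \le c r^n, \quad n \in \mathbb{N}_+. \]
Hence (3.2.3) clearly holds.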


Next, we are going to show that under (ii) condition (3.2.3) also holds.
In the case where bl = y l , l ∈ Z, for any given n ∈ N+ there is at most
one fundamental interval I(i0 , i−1 , ..., i−n ) such that 1/b ∈ I (i0 , i−1 , ..., i−n ).
Similarly, in the case where bl = rl , l ∈ Z, for any given n ∈ N+ , there is
at most one fundamental interval I(i1 , ..., in ) such that 1/b ∈ I (i1 , ..., in ).
Therefore by (3.2.14) in both these cases Eγ |H1 − Eγ (H1 |a−n , ..., an )|2+δ
does not exceed (Fn Fn+1 log 2)−1 for all n ∈ N+ , hence (3.2.3) holds. In
the case where bl = ul , l ∈ Z, the last integral in (3.2.14) may be different
from 0 only for those rectangles I 2 (i−n , ..., in ) which are intersected by the
hyperbola y + 1/x = 1/b. It is easy to see that for n large enough the total
Euclidean area of them does not exceed (Fn Fn+1 )−1 so that (3.2.3) holds in
this case, too.
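To prove (a) note that for $\delta = 0$ by Theorem 3.2.1 we have
\[ \mu \xi_n^{-1} \xrightarrow{w} W \quad \text{in both } \mathcal{B}_C \text{ and } \mathcal{B}_D \tag{3.2.15} \]
for any $\mu \in \mathrm{pr}(\mathcal{B}_I^2)$ such that $\mu \ll \lambda^2$, where $\xi_n$ stands for either $\xi_n^C$ or $\xi_n^D$
defined as in Section 3.2.1, for our special $H$ given by (3.2.13) and with
$\sigma(f) = \sigma(H)$ defined by (3.2.12). But
\[ |b_n(\omega) - \bar b_n(\omega, \theta)| \le (F_{n-1} F_n)^{-1}, \quad n \in \mathbb{N}_+,\ (\omega, \theta) \in \Omega^2. \]
[In the case where $b_n = r_n$, $n \in \mathbb{N}_+$, we even have $b_n(\omega) = \bar b_n(\omega, \theta)$, $n \in \mathbb{N}_+$,
$(\omega, \theta) \in \Omega^2$.] Thus under (i) we have
\[
\begin{aligned}
\sup_{t \in I} |\xi_n'(t, \omega) - \xi_n(t, (\omega, \theta))|
&\le \frac{1}{\sigma(f) \sqrt{n}} \max_{1 \le i \le n} |S_i'(\omega) - S_i(\omega, \theta)| \\
&\le \frac{1}{\sigma(f) \sqrt{n}} \sum_{i=1}^n \bigl| f(b_i(\omega)) - f\bigl(\bar b_i(\omega, \theta)\bigr) \bigr| \\
&\le \frac{s_\varepsilon(f)}{\sigma(f) \sqrt{n}} \sum_{i=1}^n \bigl| b_i(\omega) - \bar b_i(\omega, \theta) \bigr|^\varepsilon \\
&= O\bigl( n^{-1/2} \bigr)
\end{aligned}
\]
as $n \to \infty$, with a non-random constant independent of $(\omega, \theta) \in \Omega^2$ implied
in $O$, while under (ii) it is easy to see that
\[
\begin{aligned}
\sup_{t \in I} |\xi_n'(t, \omega) - \xi_n(t, (\omega, \theta))|
&\le \frac{1}{\sigma(f) \sqrt{n}} \sum_{i=1}^n \bigl| I_{(b, \infty)}(b_i(\omega)) - I_{(b, \infty)}(\bar b_i(\omega, \theta)) \bigr| \\
&\le \frac{O(1)}{\sigma(f) \sqrt{n}} = O\bigl( n^{-1/2} \bigr) \quad \bar\gamma\text{-a.s.}
\end{aligned}
\]
with a random constant implied in $O$. Therefore in both cases
\[ \sup_{t \in I} |\xi_n'(t, \omega) - \xi_n(t, (\omega, \theta))| = O\bigl( n^{-1/2} \bigr) \quad \mu\text{-a.s.} \tag{3.2.16} \]
for any $\mu \in \mathrm{pr}(\mathcal{B}_I^2)$ such that $\mu \ll \lambda^2$. Now, (3.2.15) and (3.2.16) imply at
once that
\[ \mu \xi_n'^{-1} \xrightarrow{w} W \quad \text{in both } \mathcal{B}_C \text{ and } \mathcal{B}_D \]
for any $\mu \in \mathrm{pr}(\mathcal{B}_I)$ such that $\mu \ll \lambda$.
To prove (b) note that for $\delta > 0$ by Theorem 3.2.2 we have
\[ \sup_{t \in I} |\xi_n(t) - w(t)| = O(n^{-a}) \quad \text{a.s.} \]
as $n \to \infty$. By (3.2.16) it is obvious that the strong invariance principle
holds as stated for the stochastic processes $\xi_n^{\prime C}$ or $\xi_n^{\prime D}$, $n \in \mathbb{N}_+$. □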
In the case where $b_n = r_n$, $n \in \mathbb{N}_+$, under different assumptions on $f$,
we can derive from Theorems 3.2.1′ and 3.2.2′ the following result.
Theorem 3.2.10 Let $f : [1, \infty) \to \mathbb{R}$ and define the function $g$ by
$g(u) = f(1/u)$, $u \in (0, 1]$. Assume that $g$ is a function of bounded $p$-variation,
$p \ge 1$. Put
\[ S_0' = 0, \quad S_n' = \sum_{i=1}^n f(r_i) - n E_\gamma f(r_1), \quad n \in \mathbb{N}_+. \]
Then the series
\[ \sigma^2(f) = \int_I g^2\, d\gamma - \Bigl( \int_I g\, d\gamma \Bigr)^2 + 2 \sum_{n \in \mathbb{N}_+} \Bigl( \int_I g\, U^n g\, d\gamma - \Bigl( \int_I g\, d\gamma \Bigr)^2 \Bigr) \]
converges absolutely. If $\sigma(f) \ne 0$ then both the weak and strong invariance
principles hold as described in Theorems 3.2.1′ and 3.2.2′ for the stochastic
processes $\xi_n^{\prime C}$ and $\xi_n^{\prime D}$, $n \in \mathbb{N}_+$, defined as in Theorem 3.2.9 with $b_n =
r_n$, $n \in \mathbb{N}_+$.
Proof. In this case the function $H$ considered in Theorems 3.2.1′ and
3.2.2′ is defined by
\[ H(i_1, i_2, \ldots) = g([i_1, i_2, \ldots]), \quad (i_n)_{n \in \mathbb{N}_+} \in \mathbb{N}_+^{\mathbb{N}_+}. \]
It follows from Proposition 2.1.23 and its proof that both (3.2.1′) and (3.2.3′)
hold in our special case, hence the present statement. □
Remark. Convergence rates in the central limit theorem are available for
the sequence $\bigl( \sum_{i=1}^n f(r_i) - n E_\gamma f(r_1) \bigr)_{n \in \mathbb{N}_+}$. Hofbauer and Keller (1982, p.
133) proved that
\[ \sup_{x \in \mathbb{R}} \Bigl| \gamma \Bigl( \frac{\sum_{i=1}^n f(r_i) - n E_\gamma f(r_1)}{\sigma(f) \sqrt{n}} < x \Bigr) - \Phi(x) \Bigr| = O(n^{-a}) \]
as $n \to \infty$ for some $0 < a \le 1/2$. Rousseau-Egèle (1983) showed that in the
case $p = 1$ we can take $a = 1/2$. See also Iosifescu and Grigorescu (1990,
pp. 212–213) and Misevičius (1971). □
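Example 3.2.11 Let $f(x) = \log x$, $x \in [1, \infty)$. This is clearly a Lipschitz
function since $f'(x) = 1/x \le 1$ for any $x \in [1, \infty)$. Also, it is easy to
see that $E_\gamma |f(\bar b_0)|^\alpha < \infty$ for any $\alpha \in \mathbb{R}_+$. In the cases where $b_n = y_n$ or
$b_n = r_n$, $n \in \mathbb{N}_+$, Theorem 3.2.9 holds with
\[
\begin{aligned}
E_\gamma f(\bar b_0) &= \frac{1}{\log 2} \int_1^\infty \frac{\log x\, dx}{x(x+1)} \\
&= \frac{1}{\log 2} \Bigl( - \log x \log \frac{x+1}{x} \Bigr|_1^\infty + \int_1^\infty \frac{1}{x} \log \Bigl( 1 + \frac{1}{x} \Bigr) dx \Bigr) \\
&= \frac{1}{\log 2} \sum_{k \in \mathbb{N}_+} \frac{(-1)^{k+1}}{k} \int_1^\infty \frac{dx}{x^{k+1}} \\
&= \frac{1}{\log 2} \sum_{k \in \mathbb{N}_+} \frac{(-1)^{k+1}}{k^2} \\
&= \frac{\pi^2}{12 \log 2}
\end{aligned}
\]
while the corresponding $\sigma(f) = \sigma < \infty$ is non-zero. This can be shown
as follows. By the reversibility of $(\bar a_\ell)_{\ell \in \mathbb{Z}}$ (see Subsection 1.3.3) the finite
dimensional distributions under $\bar\gamma$ of $(\bar y_\ell)_{\ell \in \mathbb{Z}}$ and $(\bar r_\ell)_{\ell \in \mathbb{Z}}$ are identical. Then
\[
\begin{aligned}
\sigma^2 &= \lim_{n\to\infty} \frac{1}{n} E_{\bar\gamma} \Bigl( \sum_{i=1}^n \Bigl( \log \bar y_i - \frac{\pi^2}{12 \log 2} \Bigr) \Bigr)^2 \\
&= \lim_{n\to\infty} \frac{1}{n} E_{\bar\gamma} \Bigl( \sum_{i=1}^n \Bigl( \log \bar r_i - \frac{\pi^2}{12 \log 2} \Bigr) \Bigr)^2 \\
&= \lim_{n\to\infty} \frac{1}{n} E_\gamma \Bigl( \sum_{i=1}^n \Bigl( \log r_i - \frac{\pi^2}{12 \log 2} \Bigr) \Bigr)^2.
\end{aligned}
\]
So, $\sigma^2$ coincides with (2.1.33) in the case where the function $h$ is defined by
\[ h(\omega) = \log \frac{1}{\omega} - \frac{\pi^2}{12 \log 2}, \quad \omega \in \Omega. \]
It is easy to check that $Uh \in BV(I)$ while $h$ is essentially unbounded. Hence
$\sigma \ne 0$ by Proposition 2.1.24.
It is worth mentioning that Mayer (1990) showed that $-\pi^2/12 \log 2$ is
the value at $\beta = 2$ of the first derivative of the dominant eigenvalue $\lambda(\beta)$
of the Mayer–Ruelle operator $G_\beta$. See Theorem 2.4.7. Also, Hensley (1994)
showed that $\sigma^2 = \lambda''(2) - (\lambda'(2))^2 > 1/6$.
Note that in the case where $b_n = y_n$, $n \in \mathbb{N}_+$, we have
\[ S_n' = \sum_{i=1}^n \log y_i - \frac{n \pi^2}{12 \log 2} = \log q_n - \frac{n \pi^2}{12 \log 2}, \quad n \in \mathbb{N}_+. \]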
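The centring constant $\pi^2/(12\log 2)$ for $\log q_n$ can be observed numerically. The following is a minimal sketch, not part of the original argument: it assumes that a dyadic rational with a pseudo-random numerator is an adequate stand-in for a typical point of $I$, computes its incomplete quotients exactly by Euclid's algorithm, and evaluates $(\log q_n)/n$. All function names and the parameters `seed`, `n` and `bits` are hypothetical.

```python
import math
import random

# Levy's constant pi^2 / (12 log 2), the centring constant for (log q_n) / n.
LEVY = math.pi ** 2 / (12 * math.log(2))


def partial_quotients(num, den, count):
    """Return up to `count` continued fraction digits of num/den in (0, 1)."""
    digits = []
    while num and len(digits) < count:
        a, remainder = divmod(den, num)  # 1/omega = a + tau(omega)
        digits.append(a)
        num, den = remainder, num
    return digits


def log_denominator(digits):
    """log q_n, with q_k = a_k q_{k-1} + q_{k-2}, q_0 = 1, q_{-1} = 0."""
    q_prev, q = 0, 1
    for a in digits:
        q_prev, q = q, a * q + q_prev
    return math.log(q)


def sample_rate(seed, n=2000, bits=8000):
    """(log q_n) / n for a pseudo-random dyadic rational with `bits` binary digits."""
    rng = random.Random(seed)
    den = 1 << bits
    num = rng.getrandbits(bits) | 1  # odd numerator: 0 < num/den < 1 in lowest terms
    digits = partial_quotients(num, den, n)
    return log_denominator(digits) / len(digits)


if __name__ == "__main__":
    print("Levy constant:", LEVY)
    for seed in range(3):
        print("seed", seed, "->", sample_rate(seed))
```

By the central limit theorem discussed here, the fluctuations of $(\log q_n)/n$ around the constant are of order $n^{-1/2}$, so individual runs should only be expected to agree with it to one or two decimal places.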

In this case convergence rates in the central limit theorem are available.
Misevičius (1981) proved that
\[ \sup_{x \in \mathbb{R}} \Bigl| \lambda \Bigl( \frac{\log q_n - n \pi^2 / 12 \log 2}{\sigma \sqrt{n}} < x \Bigr) - \Phi(x) \Bigr| = O\Bigl( \frac{\log n}{\sqrt{n}} \Bigr) \tag{3.2.17} \]
as $n \to \infty$. Vallée (1997) was able to obtain the optimal convergence rate
in (3.2.17) using Mayer–Ruelle operators. She proved that for $\mu \in \mathrm{pr}(\mathcal{B}_I)$
such that $\mu \ll \lambda$ and the Radon–Nikodym derivative $d\mu/d\lambda$ is analytic and
strictly positive in $I$, we have
\[ \sup_{x \in \mathbb{R}} \Bigl| \mu \Bigl( \frac{\log q_n - n \pi^2 / 12 \log 2}{\sigma \sqrt{n}} < x \Bigr) - \Phi(x) \Bigr| = O\Bigl( \frac{1}{\sqrt{n}} \Bigr) \tag{3.2.18} \]
as $n \to \infty$. The same result for $\mu = \lambda$ had also been obtained by Morita
(1994). For further results on the sequence $(\log q_n)_{n \in \mathbb{N}_+}$ see Misevičius
(1992) and Vallée (1997). See also Example 3.4.6.
From (3.2.18), using the double inequality
\[ \frac{1}{2 q_{n+1}^2(\omega)} \le \Bigl| \omega - \frac{p_n(\omega)}{q_n(\omega)} \Bigr| \le \frac{1}{q_n^2(\omega)}, \quad \omega \in \Omega,\ n \in \mathbb{N}_+, \]
we can derive the corresponding result for the random variable $z_n$ defined
by
\[ z_n(\omega) = \Bigl| \omega - \frac{p_n(\omega)}{q_n(\omega)} \Bigr|, \quad \omega \in \Omega,\ n \in \mathbb{N}_+. \]
We have
\[ \Bigl| \mu \Bigl( \frac{\log z_n + n \pi^2 / 6 \log 2}{2 \sigma \sqrt{n}} < x \Bigr) - \Phi(x) \Bigr| = O\Bigl( \frac{1}{\sqrt{n}} \Bigr) \]
as $n \to \infty$. The details are left to the reader.


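In the case where $b_n = u_n$, $n \in \mathbb{N}_+$, Theorem 3.2.9 should hold with
\[
\begin{aligned}
E_\gamma f(\bar b_0) &= \frac{1}{\log 2} \Bigl( \int_1^2 \frac{(x-1) \log x}{x^2}\, dx + \int_2^\infty \frac{\log x\, dx}{x^2} \Bigr) \\
&= \frac{1}{\log 2} \Bigl( \frac{1}{x} (\log x + 1) \Bigr|_1^2 + \frac{1}{2} (\log x)^2 \Bigr|_1^2 - \frac{1}{x} (\log x + 1) \Bigr|_2^\infty \Bigr) = 1 + \frac{1}{2} \log 2
\end{aligned}
\]
while we conjecture that $\sigma(f)$ is non-zero. □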
Example 3.2.12 Let $f(x) = 1/x$, $x \in [1, \infty)$. This is also a Lipschitz
function since $|f'(x)| = 1/x^2 \le 1$ for all $x \in [1, \infty)$ while $g(\omega) =
f(1/\omega)$, $\omega \in \Omega$, is a function of bounded variation. Both Theorems 3.2.9, in
the case where $b_n = r_n$, $n \in \mathbb{N}_+$, and 3.2.10 hold with
\[ E_\gamma f(r_1) = E_\gamma f(\bar r_0) = \frac{1}{\log 2} \int_1^\infty \frac{dx}{x^2 (x+1)} = \frac{1}{\log 2} - 1 \]
while the corresponding $\sigma(f) = \sigma$ is non-zero. Indeed, $\sigma^2$ coincides with
(2.1.33) in the case where the function $h$ is defined by
\[ h(\omega) = \omega - \frac{1}{\log 2} + 1, \quad \omega \in \Omega, \]
and Proposition 2.1.26 applies. □
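The value of the mean in Example 3.2.12 can be verified by a partial fraction decomposition. The short derivation below is a worked check added for the reader, not part of the original text.

```latex
\begin{align*}
\frac{1}{x^2(x+1)} &= \frac{1}{x^2} - \frac{1}{x} + \frac{1}{x+1}, \\
\int_1^\infty \frac{dx}{x^2(x+1)}
  &= \Bigl[-\frac{1}{x}\Bigr]_1^\infty + \Bigl[\log\frac{x+1}{x}\Bigr]_1^\infty
   = 1 - \log 2, \\
\frac{1}{\log 2}\int_1^\infty \frac{dx}{x^2(x+1)} &= \frac{1}{\log 2} - 1 .
\end{align*}
```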



3.3 Convergence to non-normal stable laws


3.3.1 The case of incomplete quotients
We start with a result which parallels Theorem 3.2.4.
Theorem 3.3.1 Let $f : \mathbb{N}_+ \to \mathbb{R}$, $A_n \in \mathbb{R}$, $B_n \in \mathbb{R}_{++}$, $n \in \mathbb{N}_+$, with
$\lim_{n\to\infty} B_n = \infty$, and define
\[ X_{nj} = B_n^{-1} (f(a_j) - A_n), \quad 1 \le j \le n, \]
\[ S_{n0} = 0, \quad S_{nk} = \sum_{j=1}^k X_{nj}, \quad 1 \le k \le n, \quad S_{nn} = S_n, \quad n \in \mathbb{N}_+. \]
Let $k_1, k_2 \ge 0$, $k_1 + k_2 > 0$, $\alpha \in (0, 2)$, and denote by $\nu = \nu(k_1, k_2, \alpha)$ the
stable p.m. $c_1\, \mathrm{Pois}\, \mu(k_1, k_2, \alpha)$ (see Section A1.5).
(i) The following assertions are equivalent.
(I) The stochastic process $\xi_n^D = \xi_n = (\xi_n(t))_{t \in I}$ defined for any $n \in \mathbb{N}_+$
by $\xi_n(t) = S_{n\lfloor nt \rfloor}$, $t \in I$, satisfies
\[ \gamma \xi_n^{-1} \xrightarrow{w} Q_\nu \quad \text{in } \mathcal{B}_D, \]
where the p.m. $Q_\nu$ is defined as in Section A3.3.
(II) $\gamma S_n^{-1} \xrightarrow{w} \nu$, and the array $X = \{X_{nj},\ 1 \le j \le n,\ n \in \mathbb{N}_+\}$ is s.i.
under $\gamma$.
(ii) Assertion (I) above holds if and only if
\[ \widetilde F(x) = \sum_{\{k : |f(k)| > x\}} k^{-2}, \quad x \in \mathbb{R}_+, \quad \text{is regularly varying of index } -\alpha \tag{3.3.1} \]
and
\[
\lim_{x\to\infty} \frac{1}{\widetilde F(x)} \sum_{\{k : f(k) > x\}} k^{-2} = \frac{k_1}{k_1 + k_2}, \qquad
\lim_{x\to\infty} \frac{1}{\widetilde F(x)} \sum_{\{k : f(k) < -x\}} k^{-2} = \frac{k_2}{k_1 + k_2}
\tag{3.3.2}
\]
or, equivalently (see Theorem A2.5), if and only if
\[ F(x) = (\log 2)^{-1} \sum_{\{k : |f(k)| \le x\}} f^2(k) k^{-2}, \quad x \in \mathbb{R}_+, \]
is regularly varying of index $2 - \alpha$ and (3.3.2) holds or, equivalently, if and
only if
\[ \lim_{x\to\infty} \frac{x^2 \widetilde F(x)}{F(x)} = \frac{2 - \alpha}{\alpha} \log 2 \]
and (3.3.2) holds. If this is the case, then we can take
\[ A_n = E_\gamma f(a_1) I_{(|f(a_1)| \le B_n)}, \quad n \in \mathbb{N}_+, \]
and any sequence $(B_n)_{n \in \mathbb{N}_+}$ such that
\[ \lim_{n\to\infty} n B_n^{-2} F(B_n) = (k_1 + k_2)/(2 - \alpha). \]
(iii) If either (I) or (II) above holds, then $\gamma$ can be replaced in (i) by any
$\mu \in \mathrm{pr}(\mathcal{B}_I)$ such that $\mu \ll \lambda$.
Proof. (i) and (iii) follow from Theorem A3.7 and Lemma 3.0.2, respectively.
The proof of (ii) is entirely similar to that working in the case of i.i.d.
random variables. See Samur (1989, p. 62) and Araujo and Giné (1980, pp.
81, 84–85, 87–88). □
Remark. In principle, from Theorem 3.3.1 we might derive the asymptotic
behaviour as $n \to \infty$ of random variables such as, e.g.,
\[ \min_{0 \le k \le n} S_{nk}, \quad \max_{0 \le k \le n} S_{nk}, \quad \text{or} \quad \max_{0 \le k \le n} |S_{nk}|. \]
This depends on the possibility of determining the distribution of the random
vector
\[ \Bigl( \inf_{t \in I} \xi_\nu(t),\ \sup_{t \in I} \xi_\nu(t),\ \xi_\nu(1) \Bigr), \]
where $\xi_\nu = (\xi_\nu(t))_{t \in I}$ is a stochastic process with stationary independent
increments, $\xi_\nu(0) = 0$ a.s., trajectories in $D$, and $\xi_\nu(1)$ having probability
distribution $\nu$ (see Section A3.3). Note that this problem could be solved in
the case of normal convergence, when $\nu$ is the standard normal distribution
and $\xi_\nu$ is the standard Brownian motion process (see Remark 2 following
Theorem 3.2.4). □
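The third equivalent condition in Theorem 3.3.1(ii) can be checked numerically for the model case $f(k) = k^{1/\alpha}$. The sketch below is an illustration under that assumption only; the function names and the choice of evaluation point are hypothetical. It evaluates $x^2 \widetilde F(x)/F(x)$ at a point $x$ for which $\{k : f(k) \le x\} = \{1, \ldots, K\}$ and compares the result with $((2-\alpha)/\alpha)\log 2$.

```python
import math


def tail_ratio(alpha, cutoff):
    """x^2 * F~(x) / F(x) for f(k) = k**(1/alpha), evaluated at x = (cutoff + 0.5)**(1/alpha).

    With this x, the set {k : f(k) <= x} is exactly {1, ..., cutoff}.
    """
    x = (cutoff + 0.5) ** (1.0 / alpha)
    head = math.fsum(k ** -2.0 for k in range(1, cutoff + 1))
    f_tilde = math.pi ** 2 / 6 - head  # sum of k^-2 over k > cutoff
    f_trunc = math.fsum(
        k ** (2.0 / alpha - 2.0) for k in range(1, cutoff + 1)
    ) / math.log(2)  # (log 2)^-1 * sum of f(k)^2 k^-2 over k <= cutoff
    return x * x * f_tilde / f_trunc


def limit_value(alpha):
    """The limit ((2 - alpha) / alpha) * log 2 stated in Theorem 3.3.1(ii)."""
    return (2.0 - alpha) / alpha * math.log(2)


if __name__ == "__main__":
    for alpha in (0.5, 1.0, 1.5):
        print(alpha, tail_ratio(alpha, 200_000), limit_value(alpha))
```

Convergence is slow when $\alpha$ is close to 2, since the relative error then decays like $K^{-(2/\alpha - 1)}$; the values of $\alpha$ used above keep the error at the percent level for $K = 200{,}000$.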
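Corollary 3.3.2 Let $k_1$, $k_2$, $\alpha$, and $\nu = \nu(k_1, k_2, \alpha)$ be as in Theorem
3.3.1.
(i) Let $f \in \mathcal{F}$ (see Section A2.3). Then (3.3.1) and (3.3.2) hold if and
only if $f$ is regularly varying of index $1/\alpha$.
(ii) Assume $f : [1, \infty) \to \mathbb{R}_{++}$ is bounded on finite intervals and regularly
varying of index $1/\alpha$. Let
\[
\nu_\alpha = \begin{cases}
\delta_{\alpha/(1-\alpha)\log 2} * \nu\Bigl( \dfrac{\alpha}{\log 2}, 0, \alpha \Bigr) & \text{if } \alpha \ne 1, \\[2ex]
\nu\Bigl( \dfrac{1}{\log 2}, 0, 1 \Bigr) & \text{if } \alpha = 1,
\end{cases}
\]
and for any $n \in \mathbb{N}_+$ define the stochastic process $\eta_n = (\eta_n(t))_{t \in I}$ by
\[
\eta_n(t) = \frac{1}{f(n)} \times \begin{cases}
\displaystyle \sum_{j \le \lfloor nt \rfloor} f(a_j) & \text{if } \alpha < 1, \\[2ex]
\displaystyle \sum_{j \le \lfloor nt \rfloor} \bigl( f(a_j) - E_\gamma f(a_1) I_{(f(a_1) \le f(n))} \bigr) & \text{if } \alpha = 1, \\[2ex]
\displaystyle \sum_{j \le \lfloor nt \rfloor} (f(a_j) - E_\gamma f(a_1)) & \text{if } \alpha > 1,
\end{cases}
\]
with the usual convention which assigns value 0 to a sum over the empty
set. Then
\[ \mu \eta_n^{-1} \xrightarrow{w} Q_{\nu_\alpha} \quad \text{in } \mathcal{B}_D \]
for any $\mu \in \mathrm{pr}(\mathcal{B}_I)$ such that $\mu \ll \lambda$.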
Proof. (i) By Lemma A2.6(iii) it is sufficient to show that
\[ \sum_{\{k : f(k) > x\}} k^{-2} \sim (f_1(x))^{-1} \quad \text{as } x \to \infty. \tag{3.3.3} \]
For any $x \ge 1$ by the definition of $f_1$ and $f_2$ (see Section A2.3) we have
\[ \{k : k > f_2(x)\} \subset \{k : f(k) > x\} \subset \{k : k \ge f_1(x)\}. \tag{3.3.4} \]
Hence
\[
1 \le \frac{\sum_{\{k : f(k) > x\}} k^{-2}}{\sum_{k > f_2(x)} k^{-2}}
\le 1 + \frac{\sum_{f_1(x) \le k \le f_2(x)} k^{-2}}{\sum_{k > f_2(x)} k^{-2}}
\tag{3.3.5}
\]
for any $x \ge 1$. But
\[ \sum_{f_1(x) \le k \le f_2(x)} k^{-2} \le (f_1(x) - 1)^{-1} - (f_2(x))^{-1}, \tag{3.3.6} \]
\[ \sum_{k > f_2(x)} k^{-2} \ge (f_2(x) + 1)^{-1} \tag{3.3.7} \]
for any $x \ge 1$, and
\[ (f_1(x))^{-1} \sim (f_2(x))^{-1} \sim \sum_{k > f_2(x)} k^{-2} \quad \text{as } x \to \infty. \tag{3.3.8} \]
Now, (3.3.3) follows from (3.3.5) through (3.3.8).


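(ii) By Lemma A2.6(ii) we have $f \in \mathcal{F}$. It follows from (i) above and
Theorem 3.3.1 that
\[ \mu \xi_n^{-1} \xrightarrow{w} Q_{\nu_\alpha} \quad \text{in } \mathcal{B}_D \]
for any $\mu \in \mathrm{pr}(\mathcal{B}_I)$ such that $\mu \ll \lambda$, where for any $n \in \mathbb{N}_+$ the process
$\xi_n = (\xi_n(t))_{t \in I}$ is defined by
\[ \xi_n(t) = \frac{1}{B_n} \sum_{j \le \lfloor nt \rfloor} \bigl( f(a_j) - E_\gamma f(a_1) I_{(f(a_1) \le B_n)} \bigr), \quad t \in I, \]
with $B_n$ satisfying
\[ \lim_{n\to\infty} n B_n^{-2} F(B_n) = \frac{k_1 + k_2}{2 - \alpha}. \tag{3.3.9} \]
It is therefore sufficient to prove that in (3.3.9) we can take $B_n = f(n)$, $n \in
\mathbb{N}_+$, $k_1 = \alpha/\log 2$, $k_2 = 0$, and that
\[
\begin{aligned}
\lim_{n\to\infty} E_\gamma (\eta_n(1) - \xi_n(1))
&= \lim_{n\to\infty} \frac{n}{f(n)} \times \begin{cases}
E_\gamma f(a_1) I_{(f(a_1) \le f(n))} & \text{if } \alpha < 1, \\
- E_\gamma f(a_1) I_{(f(a_1) > f(n))} & \text{if } \alpha > 1
\end{cases} \\
&= \frac{\alpha}{(1 - \alpha) \log 2}.
\end{aligned}
\tag{3.3.10}
\]
To proceed notice first that by the very definition of $f_1$ and $f_2$ we have
\[ f_1(f(n) - 1) \le n \le f_2(f(n)), \quad n \in \mathbb{N}_+. \]
Since $f_1$ is regularly varying, by Corollary A2.2(i) we have
\[ f_1(f(n) - 1) \sim f_1(f(n)) \quad \text{as } n \to \infty. \]
As $f_1 \sim f_2$, it follows that
\[ f_i(f(n)) \sim n \quad \text{as } n \to \infty, \quad i = 1, 2. \tag{3.3.11} \]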

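Taking up (3.3.9) we begin by noting that (3.3.4) implies that
\[
\frac{\sum_{k < f_1(x)} f^2(k) k^{-2}}{\sum_{k \le f_2(x)} f^2(k) k^{-2}}
\le \frac{\sum_{\{k : f(k) < x\}} f^2(k) k^{-2}}{\sum_{k \le f_2(x)} f^2(k) k^{-2}} \le 1
\tag{3.3.12}
\]
for all $x \ge 1$. Next, we use Theorem A2.3 taking
\[ L(x) = x^{-2/\alpha} f^2(\lfloor x \rfloor) (\lfloor x \rfloor + 1)/\lfloor x \rfloor, \quad x \ge 1, \]
which is a slowly varying function. We easily obtain
\[ \lim_{x\to\infty} \frac{x \sum_{k \le x} f^2(k) k^{-2}}{f^2(x)} = \frac{\alpha}{2 - \alpha}. \tag{3.3.13} \]
Clearly, (3.3.13) also holds when $\sum_{k \le x}$ is replaced by $\sum_{k < x}$. Because $f_1 \sim
f_2$ and $f$ is regularly varying, it follows from (3.3.13) that the first fraction
in (3.3.12) tends to 1 as $x \to \infty$. Then by (3.3.13) again and (3.3.11) we
obtain
\[
\begin{aligned}
\frac{n}{f^2(n)} F(f(n)) &\sim \frac{n}{f^2(n) \log 2} \sum_{k \le f_2(f(n))} f^2(k) k^{-2} \\
&\sim \frac{1}{\log 2}\, \frac{n}{f_2(f(n))}\, \frac{f^2(f_2(f(n)))}{f^2(n)}\, \frac{\alpha}{2 - \alpha} \\
&\sim \frac{\alpha}{(2 - \alpha) \log 2} \quad \text{as } n \to \infty,
\end{aligned}
\tag{3.3.14}
\]
that is, (3.3.9) is satisfied as stated.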
Now, coming to (3.3.10) assume first $\alpha < 1$. Then since
\[ \lim_{k\to\infty} \frac{\log\bigl( 1 + \frac{1}{k(k+2)} \bigr)}{k^{-2}} = 1 \tag{3.3.15} \]
and $\sum_{k \in \mathbb{N}_+} f(k) k^{-2} = \infty$, we have
\[ E_\gamma f(a_1) I_{(f(a_1) \le f(n))} \sim \frac{1}{\log 2} \sum_{\{k : f(k) \le f(n)\}} f(k) k^{-2} \quad \text{as } n \to \infty. \]
Therefore the asymptotic behaviour of
\[ \frac{n}{f(n)} E_\gamma f(a_1) I_{(f(a_1) \le f(n))} \]
as $n \to \infty$ can be obtained from (3.3.14) by replacing $f^2$ by $f$, thus $\alpha$ by
$2\alpha$ (note that while $f^2$ is regularly varying of index $2/\alpha$, $f$ is regularly
varying of index $1/\alpha$). Thus
\[ \frac{n}{f(n)} E_\gamma f(a_1) I_{(f(a_1) \le f(n))} \sim \frac{\alpha}{(1 - \alpha) \log 2} \quad \text{as } n \to \infty, \]
that is, (3.3.10) holds when $\alpha < 1$.


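Finally, let $\alpha > 1$. We now use Theorem A2.4 taking
\[ L(x) = x^{-1/\alpha} f(\lfloor x \rfloor) (\lfloor x \rfloor + 1)/\lfloor x \rfloor, \quad x \ge 1, \]
which is a slowly varying function. We easily obtain
\[ \lim_{x\to\infty} \frac{x \sum_{k \ge x} f(k) k^{-2}}{f(x)} = \frac{\alpha}{\alpha - 1}. \tag{3.3.16} \]
Clearly, (3.3.16) also holds when $\sum_{k \ge x}$ is replaced by $\sum_{k > x}$. By (3.3.4),
similarly to (3.3.12) we have
\[
\frac{E_\gamma f(a_1) I_{(a_1 > f_2(f(n)))}}{E_\gamma f(a_1) I_{(a_1 \ge f_1(f(n)))}}
\le \frac{E_\gamma f(a_1) I_{(f(a_1) > f(n))}}{E_\gamma f(a_1) I_{(a_1 \ge f_1(f(n)))}} \le 1, \quad n \in \mathbb{N}_+.
\tag{3.3.17}
\]
It follows from (3.3.16) that the first fraction in (3.3.17) tends to 1 as $n \to \infty$.
Notice then that since $\sum_{k \in \mathbb{N}_+} f(k) k^{-2} < \infty$, by (3.3.15) we have
\[ E_\gamma f(a_1) I_{(a_1 \ge f_1(f(n)))} \sim \frac{1}{\log 2} \sum_{k \ge f_1(f(n))} f(k) k^{-2} \quad \text{as } n \to \infty. \]
Using (3.3.16) again we thus obtain
\[
\begin{aligned}
\frac{n}{f(n)} E_\gamma f(a_1) I_{(f(a_1) > f(n))} &\sim \frac{n}{f(n) \log 2} \sum_{k \ge f_1(f(n))} f(k) k^{-2} \\
&\sim \frac{1}{\log 2}\, \frac{n}{f_1(f(n))}\, \frac{f(f_1(f(n)))}{f(n)}\, \frac{\alpha}{\alpha - 1} \\
&\sim \frac{\alpha}{(\alpha - 1) \log 2} \quad \text{as } n \to \infty,
\end{aligned}
\]
that is, (3.3.10) holds when $\alpha > 1$, too. □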


To complete the remark following Theorem 3.3.1 we note that Corollary
3.3.2 allows us to derive in some cases the asymptotic behaviour as $n \to \infty$
of the random variable $U_n$ = number of indices $k$, $1 \le k \le n$, for which
$S_{nk} > 0$.
Proposition 3.3.3 Assume $f$ is bounded on finite intervals and regularly
varying of index $1/\alpha$ with $1 < \alpha < 2$. Then
\[
\begin{aligned}
\lim_{n\to\infty} \mu\Bigl( \frac{U_n}{n} < x \Bigr)
&= \lim_{n\to\infty} \mu\Bigl( \frac{\mathrm{card}\bigl\{ 1 \le k \le n : \sum_{j=1}^k f(a_j) > k E_\gamma f(a_1) \bigr\}}{n} < x \Bigr) \\
&= \frac{\sin(\pi/\alpha)}{\pi} \int_0^x \frac{dt}{t^{1 - 1/\alpha} (1 - t)^{1/\alpha}}, \quad 0 \le x \le 1,
\end{aligned}
\tag{3.3.18}
\]
for any $\mu \in \mathrm{pr}(\mathcal{B}_I)$ such that $\mu \ll \lambda$.
Proof. It is easy to check that $\nu_\alpha$ defined in Corollary 3.3.2 is a strictly
stable probability and $\nu_\alpha((0, \infty)) = 1/\alpha$ for any $1 < \alpha < 2$. Then (3.3.18)
is an immediate consequence of Theorem 5.1 in de Acosta (1982). □
Remarks. 1. Proposition 3.3.3 holds for $\alpha = 2$, too. In this case the
limiting distribution is the classical arc-sine law mentioned in Remark 2
following Theorem 3.2.4. However, the assumption on $f$ in Proposition
3.3.3 is slightly stronger [cf. Corollary A2.7(ii)] than the assumption on $f$
in Theorem 3.2.4, under which the arc-sine law holds.
2. It follows from Proposition 3.3.3 [cf. Theorem 5.2 in de Acosta (1982)]
that
\[ \mu\bigl( \lambda(t \in I : \xi_{\nu_\alpha}(t) > 0) < x \bigr) = \frac{\sin(\pi/\alpha)}{\pi} \int_0^x \frac{dt}{t^{1 - 1/\alpha} (1 - t)^{1/\alpha}}, \quad 0 \le x \le 1, \]
for any $1 < \alpha < 2$. This generalizes P. Lévy's arc-sine law for Brownian
motion. □
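The limiting distribution function in (3.3.18) is a Beta$(1/\alpha, 1 - 1/\alpha)$ law, which reduces to the classical arc-sine law at $\alpha = 2$. The sketch below illustrates both facts numerically; the function names are hypothetical, the integral is approximated by a midpoint rule after the substitution $t = s^\alpha$ (chosen here to remove the endpoint singularity at 0), and $x$ is assumed to be strictly less than 1.

```python
import math


def generalized_arcsine_cdf(x, alpha, steps=20_000):
    """Midpoint-rule value of sin(pi/alpha)/pi * int_0^x t^(1/alpha-1) (1-t)^(-1/alpha) dt, 0 <= x < 1."""
    # With t = s**alpha we get t^(1/alpha - 1) dt = alpha ds, so the integrand
    # becomes alpha * (1 - s**alpha)**(-1/alpha), which is smooth on [0, x**(1/alpha)].
    upper = x ** (1.0 / alpha)
    h = upper / steps
    total = sum(
        alpha * (1.0 - ((i + 0.5) * h) ** alpha) ** (-1.0 / alpha)
        for i in range(steps)
    )
    return math.sin(math.pi / alpha) / math.pi * total * h


def total_mass(alpha):
    """Total mass of the density, via B(1/alpha, 1 - 1/alpha) = Gamma(1/alpha) Gamma(1 - 1/alpha)."""
    beta = math.gamma(1.0 / alpha) * math.gamma(1.0 - 1.0 / alpha)
    return math.sin(math.pi / alpha) / math.pi * beta


if __name__ == "__main__":
    for x in (0.1, 0.5, 0.9):
        classical = 2.0 / math.pi * math.asin(math.sqrt(x))
        print(x, generalized_arcsine_cdf(x, 2.0), classical)
    print("mass at alpha = 1.5:", total_mass(1.5))
```

The normalizing constant $\sin(\pi/\alpha)/\pi$ is exactly what the reflection formula $\Gamma(z)\Gamma(1-z) = \pi/\sin(\pi z)$ requires for the density to integrate to 1.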

3.3.2 Sums of incomplete quotients


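From Corollary 3.3.2 we can derive results for the sums $t_n = \sum_{j=1}^n a_j$, $n \in
\mathbb{N}_+$, of incomplete quotients by taking $f(x) = x$, $x \in [1, \infty)$. In this case
we have
\[
\begin{aligned}
A_n = E_\gamma a_1 I_{(a_1 \le n)} &= \frac{1}{\log 2} \sum_{j=1}^n j \log \frac{(j+1)^2}{j(j+2)} \\
&= \frac{1}{\log 2} \Bigl( \log(n+2) - (n+1) \log \frac{n+2}{n+1} \Bigr), \quad n \in \mathbb{N}_+.
\end{aligned}
\]
Hence
\[ A_n = \frac{1}{\log 2} (\log n - 1 + o(1)) \tag{3.3.19} \]
as $n \to \infty$. For any $\mu \in \mathrm{pr}(\mathcal{B}_I)$ such that $\mu \ll \lambda$ by Corollary 3.3.2(ii) we
have
\[ \mu (\eta_n(1))^{-1} \xrightarrow{w} \nu_1, \tag{3.3.20} \]
where
\[ \eta_n(1) = \frac{1}{n} \sum_{j=1}^n (a_j - A_n), \quad n \in \mathbb{N}_+. \]
It follows from (3.3.19) and (3.3.20) that
\[ \mu (\zeta_n(1))^{-1} \xrightarrow{w} \delta_{(C-1)/\log 2} * \nu_1 := \nu', \tag{3.3.21} \]
where
\[ \zeta_n(1) = \frac{1}{n} \sum_{j=1}^n \Bigl( a_j + \frac{C - \log n}{\log 2} \Bigr), \quad n \in \mathbb{N}_+, \]
and $C = 0.57722 \cdots$ is Euler's constant. Note that the ch.f. of $\nu'$ is
\[ \widehat{\nu'}(t) = \exp\Bigl( - \frac{\pi}{2 \log 2} \Bigl( 1 + i\, \mathrm{sgn}\, t\, \frac{2}{\pi} \log |t| \Bigr) |t| \Bigr), \quad t \in \mathbb{R}, \]
see Section A1.5. Hence $\nu'$ is strictly stable.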
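The closed form of the truncated mean $A_n = E_\gamma a_1 I_{(a_1 \le n)}$ and the asymptotic relation (3.3.19) are easy to confirm numerically. The following is a small sketch with hypothetical function names; it compares the defining sum with the closed form and then checks that $A_n \log 2 - (\log n - 1)$ is small for large $n$.

```python
import math

LOG2 = math.log(2.0)


def truncated_mean_sum(n):
    """A_n as the sum (1/log 2) * sum_{j<=n} j * log((j+1)^2 / (j(j+2)))."""
    total = math.fsum(j * math.log1p(1.0 / (j * (j + 2))) for j in range(1, n + 1))
    return total / LOG2


def truncated_mean_closed(n):
    """A_n in closed form: (log(n+2) - (n+1) log((n+2)/(n+1))) / log 2."""
    return (math.log(n + 2) - (n + 1) * math.log1p(1.0 / (n + 1))) / LOG2


if __name__ == "__main__":
    for n in (1, 10, 1000):
        print(n, truncated_mean_sum(n), truncated_mean_closed(n))
    n = 10 ** 6
    print("o(1) term at n = 10^6:", truncated_mean_closed(n) * LOG2 - (math.log(n) - 1.0))
```

The identity rests on $(j+1)^2/(j(j+2)) = 1 + 1/(j(j+2))$, which is why `math.log1p` can be used for both expressions.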


A convergence rate in (3.3.21) is available in the special case where $\mu = \gamma$.
Heinrich (1987) proved that there exists $c_0 \in \mathbb{R}_{++}$ such that
\[ \bigl| \gamma(\zeta_n(1) < x) - \nu'((-\infty, x)) \bigr| \le c_0 \frac{(\log n)^2}{n} \tag{3.3.22} \]
for any $n \in \mathbb{N}_+$ and $x \in \mathbb{R}$.
To conclude let us note that (3.3.21) is a special case of
\[ \mu \zeta_n^{-1} \xrightarrow{w} Q_{\nu'} \quad \text{in } \mathcal{B}_D, \]
where for any $n \in \mathbb{N}_+$ the process $\zeta_n = (\zeta_n(t))_{t \in I}$ is defined by
\[ \zeta_n(t) = \frac{1}{n} \sum_{j \le \lfloor nt \rfloor} \Bigl( a_j + \frac{C - \log n}{\log 2} \Bigr), \quad t \in I. \]
As a consequence (compare with Remark 2 following Proposition 3.3.3) we
have
\[
\lim_{n\to\infty} \mu\Bigl( \frac{\mathrm{card}\{ 1 \le k \le n : \sum_{j=1}^k a_j > k(\log n - C)/\log 2 \}}{n} < x \Bigr)
= \mu\bigl( \lambda(t \in I : \xi_{\nu'}(t) > 0) < x \bigr), \quad 0 \le x \le 1.
\]
An explicit expression of the last distribution function is not known.
Immediate consequences of (3.3.21) and (3.3.22) are that (i) for any $\mu \in
\mathrm{pr}(\mathcal{B}_I)$ such that $\mu \ll \lambda$ we have
\[ \frac{t_n}{n \log n} \longrightarrow \frac{1}{\log 2} \quad \text{in } \mu\text{-probability as } n \to \infty, \tag{3.3.23} \]
and (ii) for any $\varepsilon > 0$ and $n \in \mathbb{N}_+$ we have
\[
\gamma\Bigl( \Bigl| \frac{t_n}{n \log n} - \frac{1}{\log 2} \Bigr| \le \varepsilon \Bigr)
\ge \nu'\Bigl( \Bigl[ -\varepsilon \log n + \frac{C}{\log 2},\ \varepsilon \log n + \frac{C}{\log 2} \Bigr] \Bigr) - \frac{2 c_0 (\log n)^2}{n}.
\]
Khintchine (1934/35) proved using (3.3.23) that the series $\sum_{n \in \mathbb{N}_+} 1/t_n$
is divergent a.e. in $I$. A stronger result is Theorem 3.3.4 below. This was
stated by Doeblin (1940), but his proof is incorrect. We reproduce here the
proof of Iosifescu (1996).
Theorem 3.3.4 The series
\[ \sum_{n \ge 2} \Bigl( \frac{1}{t_n} - \frac{\log 2}{n \log n} \Bigr) \]
is absolutely convergent a.e. in $I$.


Proof. In what follows, the letter $c$ with different indices will denote
suitable positive constants. Let $h : \mathbb{N}_+ \to \mathbb{N}_+$ be a function such that
$\lim_{n\to\infty} h(n) = \infty$. For any $n \in \mathbb{N}_+$ put
\[ t_n(h) = \sum_{i=1}^n a_i I_{(a_i \le h(n))}. \]
It follows from (3.3.19) and the strict stationarity of $(a_n)_{n \in \mathbb{N}_+}$ under $\gamma$ that
\[ E_\gamma t_n(h) = \frac{n}{\log 2} (\log h(n) - 1 + o(1)) \tag{3.3.24} \]
as $n \to \infty$. Next, for any $n \in \mathbb{N}_+$ we have
\[ E_\gamma a_1^2 I_{(a_1 \le n)} = \frac{1}{\log 2} \sum_{j=1}^n j^2 \log\Bigl( 1 + \frac{1}{j(j+2)} \Bigr) \le c_1 n, \]
and Corollary A3.2 yields
\[ E_\gamma (t_n(h) - E_\gamma t_n(h))^2 \le c_2\, n\, h(n), \quad n \in \mathbb{N}_+. \tag{3.3.25} \]
Now, write $\bar t_n = t_n(h)$ for $h(n) = n\bigl( \lfloor \log^{4/3} n \rfloor + 1 \bigr)$ and $t_n' = t_n(h)$ for
$h(n) = n$, $n \in \mathbb{N}_+$. For any $n \ge 3$ by (3.3.24) we have
\[ \Bigl| \frac{1}{E_\gamma \bar t_n} - \frac{\log 2}{n \log n} \Bigr| \le c_3 \frac{\log \log n}{n \log^2 n}. \]
Since the series $\sum_{n \ge 3} (\log \log n)/n \log^2 n$ is convergent, it is sufficient to
prove that the series
\[ \sum_{n \ge 2} \Bigl( \frac{1}{t_n} - \frac{1}{E_\gamma \bar t_n} \Bigr) \tag{3.3.26} \]
is absolutely convergent a.e. in $I$.


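For any $n \ge 2$ consider the random events
\[
\begin{aligned}
A_1(n) = A_1 &= \bigl( \bar t_n > \tfrac32 E_\gamma \bar t_n \bigr), \qquad
A_2(n) = A_2 = \bigl( \bar t_n < \tfrac12 E_\gamma \bar t_n \bigr), \\
A_3(n) = A_3 &= \bigl( \tfrac12 E_\gamma \bar t_n \le \bar t_n \le \tfrac32 E_\gamma \bar t_n \bigr) \cap \bigl( t_n \ne \bar t_n \bigr), \\
A_4(n) = A_4 &= \bigl( \tfrac12 E_\gamma \bar t_n \le \bar t_n \le \tfrac32 E_\gamma \bar t_n \bigr) \cap \bigl( t_n = \bar t_n \bigr).
\end{aligned}
\]
Let us find upper bounds for the $\gamma$-probabilities of $A_1$, $A_2$, and $A_3$. We have
\[ A_1 = \bigl( \bar t_n - E_\gamma \bar t_n > \tfrac12 E_\gamma \bar t_n \bigr) \subset \bigl( |\bar t_n - E_\gamma \bar t_n| > \tfrac12 E_\gamma \bar t_n \bigr). \]
By (3.3.24) and (3.3.25) the Bienaymé–Chebyshev inequality implies
\[ \gamma(A_1) \le 4 c_2 n^2 \bigl( \lfloor \log^{4/3} n \rfloor + 1 \bigr) / \bigl( E_\gamma \bar t_n \bigr)^2 \le c_4 (\log n)^{-2/3}. \tag{3.3.27} \]
Since $t_n' \le \bar t_n$, $n \in \mathbb{N}_+$, and $E_\gamma \bar t_n / 2 - E_\gamma t_n' < 0$ for $n$ large enough, for
such an $n$ we have
\[
\begin{aligned}
A_2 = \bigl( \bar t_n < \tfrac12 E_\gamma \bar t_n \bigr) &\subset \bigl( t_n' < \tfrac12 E_\gamma \bar t_n \bigr) = \bigl( t_n' - E_\gamma t_n' < \tfrac12 E_\gamma \bar t_n - E_\gamma t_n' \bigr) \\
&\subset \bigl( |t_n' - E_\gamma t_n'| > E_\gamma t_n' - \tfrac12 E_\gamma \bar t_n \bigr).
\end{aligned}
\]
Again by (3.3.24) and (3.3.25), the Bienaymé–Chebyshev inequality implies
\[ \gamma(A_2) \le \frac{c_2 n^2}{\bigl( E_\gamma t_n' - E_\gamma \bar t_n / 2 \bigr)^2} \le c_5 (\log n)^{-2}. \tag{3.3.28} \]
Noting that
\[ (t_n \ne \bar t_n) = \bigcup_{i=1}^n \bigl( a_i > n(\lfloor \log^{4/3} n \rfloor + 1) \bigr), \]
whence
\[ \gamma(t_n \ne \bar t_n) \le n \gamma\bigl( a_1 > n \bigl( \lfloor \log^{4/3} n \rfloor + 1 \bigr) \bigr) \le c_6 (\log n)^{-4/3}, \tag{3.3.29} \]
we obviously have
\[ \gamma(A_3) \le c_6 (\log n)^{-4/3}. \tag{3.3.30} \]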
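Next, let us find an upper bound for
\[ E_\gamma \Bigl| \frac{1}{t_n} - \frac{1}{E_\gamma \bar t_n} \Bigr| = \sum_{i=1}^4 I_i(n), \]
where
\[ I_i(n) = \int_{A_i} \Bigl| \frac{1}{t_n} - \frac{1}{E_\gamma \bar t_n} \Bigr| d\gamma, \quad 1 \le i \le 4. \]
Since $\bar t_n \le t_n$, $n \in \mathbb{N}_+$, on $A_1$ we have
\[ \frac{1}{t_n} \le \frac{1}{\bar t_n} < \frac{2}{3 E_\gamma \bar t_n}. \tag{3.3.31} \]
It follows from (3.3.24), (3.3.27), and (3.3.31) that
\[ I_1(n) \le c_7 n^{-1} (\log n)^{-5/3}. \tag{3.3.32} \]
Since $t_n \ge n$, $n \in \mathbb{N}_+$, by (3.3.24), (3.3.28), and (3.3.30) we have
\[ I_2(n) \le c_8 n^{-1} (\log n)^{-2}, \qquad I_3(n) \le c_9 n^{-1} (\log n)^{-4/3}. \tag{3.3.33} \]
Finally, set
\[ w_n = (\bar t_n - E_\gamma \bar t_n)/E_\gamma \bar t_n \]
and note that by (3.3.24) and (3.3.25) we have
\[ E_\gamma |w_n| \le E_\gamma^{1/2} w_n^2 \le c_{10} (\log n)^{-1/3}. \]
Since on $A_4$ we have $t_n = \bar t_n$ and $2/3 \le 1/(1 + w_n) \le 2$, it follows that
\[
\begin{aligned}
I_4(n) &= \int_{A_4} \Bigl| \frac{1}{\bar t_n} - \frac{1}{E_\gamma \bar t_n} \Bigr| d\gamma = \int_{A_4} \frac{|w_n|}{(1 + w_n) E_\gamma \bar t_n}\, d\gamma \\
&\le \frac{2}{E_\gamma \bar t_n} E_\gamma |w_n| \le c_{11} n^{-1} (\log n)^{-4/3}.
\end{aligned}
\tag{3.3.34}
\]
Therefore by (3.3.32) through (3.3.34) we have
\[ E_\gamma \Bigl| \frac{1}{t_n} - \frac{1}{E_\gamma \bar t_n} \Bigr| = O\bigl( n^{-1} (\log n)^{-4/3} \bigr) \]
as $n \to \infty$. As the series $\sum_{n \ge 2} n^{-1} (\log n)^{-4/3}$ is convergent, by Beppo
Levi's theorem series (3.3.26) is absolutely convergent a.e. in $I$. The proof
is complete. □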
Corollary 3.3.5 We have
\[ \lim_{n\to\infty} \frac{\sum_{i=1}^n 1/t_i}{\log \log n} = \log 2 \quad \text{a.e.} \]
Proof. This follows immediately from Theorem 3.3.4 since, as is well
known,
\[ \lim_{n\to\infty} \Bigl( \sum_{i=2}^n \frac{1}{i \log i} - \log \log n \Bigr) \]
exists and is finite. □
For further results on the sums tn , n ∈ N+ , see Theorem 4.1.9 and its
corollaries.
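The elementary fact used in the proof of Corollary 3.3.5, namely that $\sum_{i=2}^n 1/(i \log i) - \log\log n$ converges, can be observed numerically. The sketch below uses a hypothetical function name and starts the sum at $i = 2$, since the term with $i = 1$ is undefined; it only illustrates monotone stabilization and does not assert a value for the limit.

```python
import math


def centred_sum(n):
    """sum_{i=2}^n 1/(i log i) - log log n, for n >= 2."""
    partial = math.fsum(1.0 / (i * math.log(i)) for i in range(2, n + 1))
    return partial - math.log(math.log(n))


if __name__ == "__main__":
    for n in (10 ** 3, 10 ** 4, 10 ** 5):
        print(n, centred_sum(n))
```

Because $x \mapsto 1/(x \log x)$ is decreasing on $[2, \infty)$, the centred sums decrease in $n$ and are bounded below by $-\log\log 2$, which is the integral-comparison argument behind the convergence.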

3.3.3 The case of associated random variables


We shall now show that Corollary 3.3.2 still holds in the case where $\alpha < 1$
when $a_j$ is replaced by either $y_j$, $r_j$, or $u_j$, $j \in \mathbb{N}_+$. This will follow from
the result below (compare with Lemma 3.1.4).
Lemma 3.3.6 Let $b_n$, $n \in \mathbb{N}_+$, be real-valued random variables on
$(I, \mathcal{B}_I)$ such that
\[ a_n \le b_n \le a_n + c, \quad n \in \mathbb{N}_+, \]
for some $c \in \mathbb{R}_+$. For any $n \in \mathbb{N}_+$ consider the stochastic processes $\eta_n =
(\eta_n(t))_{t \in I}$ and $\eta_n' = (\eta_n'(t))_{t \in I}$ defined by
\[ \eta_n(t) = \frac{1}{f(n)} \sum_{j \le \lfloor nt \rfloor} f(a_j), \qquad \eta_n'(t) = \frac{1}{f(n)} \sum_{j \le \lfloor nt \rfloor} f(b_j), \quad t \in I, \]
with the usual convention which assigns value 0 to a sum over the empty
set, where $f : [1, \infty) \to \mathbb{R}_{++}$ is bounded on finite intervals and regularly
varying of index $\beta > 1$. Then $d_0(\eta_n, \eta_n')$ converges to 0 in $\gamma$-probability as
$n \to \infty$.
Proof. Write $f(x) = x^\beta L(x)$, $x \in [1, \infty)$, where $L$ is slowly varying. For
any $n \in \mathbb{N}_+$ we have
\[
d_0(\eta_n, \eta_n') \le \sup_{t \in I} |\eta_n(t) - \eta_n'(t)|
\le \frac{1}{f(n)} \sum_{j=1}^n |f(a_j) - f(b_j)| \le \delta_n' + \delta_n'',
\tag{3.3.35}
\]
where
\[ \delta_n' = \frac{1}{f(n)} \sum_{j=1}^n \bigl( b_j^\beta - a_j^\beta \bigr) L(a_j), \qquad \delta_n'' = \frac{1}{f(n)} \sum_{j=1}^n b_j^\beta\, |L(b_j) - L(a_j)|. \]
Using the inequality $(1 + a)^\alpha - 1 \le a \bigl( \{\alpha\} + \lfloor \alpha \rfloor (1 + a)^{\alpha - 1} \bigr)$, valid for
non-negative $a$ and $\alpha$, we obtain
\[ b_j^\beta - a_j^\beta \le c \beta (1 + c)^{\beta - 1} a_j^{\beta - 1}, \quad 1 \le j \le n, \]
whence
\[ \delta_n' \le c \beta (1 + c)^{\beta - 1} \frac{1}{f(n)} \sum_{j=1}^n a_j^{-1} f(a_j). \]
Writing
\[ a_j^{-1} f(a_j) = a_j^{-1} f(a_j) I_{(a_j \le M)} + a_j^{-1} f(a_j) I_{(a_j > M)}, \quad 1 \le j \le n, \]
for an arbitrarily given $M \ge 1$, we easily obtain
\[ \delta_n' \le c \beta (1 + c)^{\beta - 1} \Bigl( \frac{n}{f(n)} \max_{1 \le i \le M} \frac{f(i)}{i} + \frac{1}{M} \frac{1}{f(n)} \sum_{j=1}^n f(a_j) \Bigr). \]
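Then for any $\varepsilon > 0$ by Corollary 3.3.2(ii) we have
\[
\begin{aligned}
\limsup_{n\to\infty} \gamma\bigl( \delta_n' > c \beta (1 + c)^{\beta - 1} \varepsilon \bigr)
&\le \limsup_{n\to\infty} \gamma(\eta_n(1) > M \varepsilon / 2) \\
&\le \nu_{1/\beta}\Bigl( \Bigl[ \frac{M \varepsilon}{2}, \infty \Bigr) \Bigr) \longrightarrow 0 \quad \text{as } M \to \infty.
\end{aligned}
\]
Hence $\delta_n'$ converges to 0 in $\gamma$-probability as $n \to \infty$.
Next, for any fixed $M \ge 1$ we can write
\[
\begin{aligned}
\delta_n'' &\le \frac{1}{f(n)} \Bigl( \sum_{j=1}^n \Bigl( f(b_j) + \Bigl( \frac{b_j}{a_j} \Bigr)^\beta f(a_j) \Bigr) I_{(a_j \le M)}
+ \sum_{j=1}^n \Bigl( \frac{b_j}{a_j} \Bigr)^\beta f(a_j) \Bigl| \frac{L(b_j)}{L(a_j)} - 1 \Bigr| I_{(a_j > M)} \Bigr) \\
&\le \bigl( 1 + (1 + c)^\beta \bigr) \Bigl( \sup_{1 \le x \le M + c} f(x) \Bigr) \frac{n}{f(n)}
+ \frac{(1 + c)^\beta}{f(n)} \sup_{0 \le s \le c,\ x > M} \Bigl| \frac{L(x + s)}{L(x)} - 1 \Bigr| \sum_{j=1}^n f(a_j).
\end{aligned}
\]
Given $\eta > 0$, choose $M \ge 1$ such that
\[ \sup_{0 \le s \le c} \Bigl| \frac{L(x + s)}{L(x)} - 1 \Bigr| \le \eta \]
for $x > M$, which is possible by the Karamata representation of $L$ (see
Theorem A2.1). Then for any $\varepsilon > 0$ by Corollary 3.3.2(ii) again we have
\[
\begin{aligned}
\limsup_{n\to\infty} \gamma(\delta_n'' > \varepsilon)
&\le \limsup_{n\to\infty} \gamma\bigl( \eta_n(1) > \eta^{-1} (1 + c)^{-\beta} \varepsilon / 2 \bigr) \\
&\le \nu_{1/\beta}\Bigl( \Bigl[ \frac{\eta^{-1} (1 + c)^{-\beta} \varepsilon}{2}, \infty \Bigr) \Bigr) \longrightarrow 0 \quad \text{as } \eta \to 0.
\end{aligned}
\]
Hence $\delta_n''$ converges to 0 in $\gamma$-probability as $n \to \infty$.
By (3.3.35) the proof is complete. □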
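The elementary inequality $(1+a)^\alpha - 1 \le a(\{\alpha\} + \lfloor\alpha\rfloor (1+a)^{\alpha-1})$ used in the proof of Lemma 3.3.6, and the resulting bound on $b^\beta - a^\beta$, can be spot-checked on a grid. The sketch below is a numerical sanity check only (it proves nothing); the function names and the grid are hypothetical, and the second bound is tested with $c = b - a$ and $a \ge 1$, $\beta \ge 1$.

```python
import math


def bernoulli_gap(a, alpha):
    """Right side minus left side of (1+a)^alpha - 1 <= a({alpha} + floor(alpha)(1+a)^(alpha-1))."""
    whole = math.floor(alpha)
    frac = alpha - whole
    lhs = (1.0 + a) ** alpha - 1.0
    rhs = a * (frac + whole * (1.0 + a) ** (alpha - 1.0))
    return rhs - lhs


def power_gap(a, b, beta):
    """c*beta*(1+c)^(beta-1)*a^(beta-1) - (b^beta - a^beta), with c = b - a >= 0 and a >= 1."""
    c = b - a
    bound = c * beta * (1.0 + c) ** (beta - 1.0) * a ** (beta - 1.0)
    return bound - (b ** beta - a ** beta)


if __name__ == "__main__":
    worst = min(
        bernoulli_gap(a, alpha)
        for a in (0.0, 0.1, 1.0, 10.0)
        for alpha in (0.0, 0.3, 1.0, 1.5, 2.0, 4.2)
    )
    print("smallest gap on the grid:", worst)
```

A non-negative gap (up to rounding) on every grid point is consistent with the inequality; equality occurs, for instance, at $\alpha = 1$.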


Corollary 3.3.7 Let $b_n$ denote either $y_n$, $r_n$ or $u_n$, $n \in \mathbf{N}_+$. For any $n \in \mathbf{N}_+$ consider the stochastic process
\[
\eta'_n = \left( \frac{1}{f(n)} \sum_{j \le \lfloor nt \rfloor} f(b_j) \right)_{t \in I}
\]
with the usual convention which assigns value 0 to a sum over the empty set, where $f : [1, \infty) \to \mathbf{R}_{++}$ is bounded on finite intervals and regularly varying of index $1/\alpha$, $0 < \alpha < 1$. Let $\mu \in \mathrm{pr}(\mathcal{B}_I)$ be such that $\mu \ll \lambda$. Then
\[
\mu (\eta'_n)^{-1} \stackrel{w}{\longrightarrow} Q_{\nu_\alpha} \quad \text{in } \mathcal{B}_D .
\]
Proof. Lemma 3.3.6 applies with $c = 1$ in the case of $y_n$ and $r_n$ and with $c = 2$ in the case of $u_n$. Since $\mu \ll \lambda$, the distance $d_0(\eta_n, \eta'_n)$ converges to 0 in $\mu$-probability, too, as $n \to \infty$. This property and Corollary 3.3.2(ii) imply the result stated. □
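In the case where $\alpha \ge 1$ we have results which complement Theorem 3.2.7. Write $\bar b_0$ for either $\bar y_0$, $\bar r_0$ or $\bar u_0$.
Theorem 3.3.8 Let $b_n$ denote either $y_n$, $r_n$ or $u_n$. Assume $f : [1, \infty) \to \mathbf{R}_{++}$ is regularly varying of index $1/\alpha$, $\alpha \in [1, 2)$, $\mathrm{E}_\gamma f^2(a_1) = \infty$, and $f(x) = x^{1/\alpha} L(x)$, where $L(x) = c \exp\left( \int_1^x \varepsilon(t) t^{-1} \, dt \right)$, $x \ge 1$, with $c > 0$, $\varepsilon : [1, \infty) \to \mathbf{R}$ continuous, and $\lim_{t \to \infty} \varepsilon(t) = 0$. For any $n \in \mathbf{N}_+$ define the process $\bar\eta'_n = (\bar\eta'_n(t))_{t \in I}$ by
\[
\bar\eta'_n(t) = \frac{1}{f(n)} \times
\begin{cases}
\displaystyle \sum_{j \le \lfloor nt \rfloor} \left( f(b_j) - m(f, \bar b_0) - \mathrm{E}_\gamma f(a_1) I_{(f(a_1) \le f(n))} \right) & \text{if } \alpha = 1, \\[2ex]
\displaystyle \sum_{j \le \lfloor nt \rfloor} \left( f(b_j) - \mathrm{E}_{\bar\gamma} f(\bar b_0) \right) & \text{if } \alpha > 1
\end{cases}
\]
with the usual convention which assigns value 0 to a sum over the empty set, where $m(f, \bar b_0)$ and $\mathrm{E}_{\bar\gamma} f(\bar b_0)$ are equal to
\[
m(f, \bar y_0) = m(f, \bar r_0) = \mathrm{E}_{\bar\gamma}\left( f(\bar r_0) - f(\bar a_0) \right) = \mathrm{E}_\gamma\left( f(r_1) - f(a_1) \right)
= \frac{1}{\log 2} \int_1^\infty \frac{\left( f(x) - f(\lfloor x \rfloor) \right) dx}{x(x + 1)},
\]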
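\[
\begin{aligned}
m(f, \bar u_0) &= \mathrm{E}_{\bar\gamma}\left( f(\bar u_0) - f(\bar a_0) \right) \\
&= \frac{1}{\log 2} \int_1^\infty \! \int_1^\infty \left( f\left( x + \frac{1}{y} \right) - f(\lfloor x \rfloor) \right) (xy + 1)^{-2} \, dx \, dy \\
&= \frac{1}{\log 2} \left( \int_1^2 \frac{(f(x) - f(1))(x - 1)}{x^2} \, dx
+ \int_2^\infty \frac{f(x) - (\lfloor x \rfloor - x + 1) f(\lfloor x - 1 \rfloor) - (x - \lfloor x \rfloor) f(\lfloor x \rfloor)}{x^2} \, dx \right),
\end{aligned}
\]
\[
\mathrm{E}_{\bar\gamma} f(\bar y_0) = \mathrm{E}_{\bar\gamma} f(\bar r_0) = \mathrm{E}_\gamma f(r_1) = \frac{1}{\log 2} \int_1^\infty \frac{f(x) \, dx}{x(x + 1)},
\]
\[
\mathrm{E}_{\bar\gamma} f(\bar u_0) = \frac{1}{\log 2} \left( \int_1^2 \frac{f(x)(x - 1) \, dx}{x^2} + \int_2^\infty \frac{f(x) \, dx}{x^2} \right),
\]
according as $b_n$ denotes $y_n$, $r_n$ or $u_n$, $n \in \mathbf{N}_+$. Then
\[
\mu (\bar\eta'_n)^{-1} \stackrel{w}{\longrightarrow} Q_{\nu_\alpha} \quad \text{in } \mathcal{B}_D
\]
for any $\mu \in \mathrm{pr}(\mathcal{B}_I)$ such that $\mu \ll \lambda$, where $\nu_\alpha$ is defined as in Corollary 3.3.2(ii).
The proof of Theorem 3.3.8 for the cases $b_n = r_n$ or $b_n = u_n$, $n \in \mathbf{N}_+$, can be found in Samur (1989, pp. 75–77). The case where $b_n = y_n$, $n \in \mathbf{N}_+$, can be treated in a similar manner. □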
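Example 3.3.9 Let $f(x) = x^{1/\alpha}$, $x \in [1, \infty)$, where $\alpha \in (1, 2)$. (For the case $\alpha = 2$ see Example 3.2.8.) Theorem 3.3.8 holds with
\[
\begin{aligned}
\mathrm{E}_{\bar\gamma} f(\bar y_0) = \mathrm{E}_{\bar\gamma} f(\bar r_0) = \mathrm{E}_\gamma f(r_1)
&= \frac{1}{\log 2} \int_1^\infty \frac{x^{1/\alpha} \, dx}{x(x + 1)} = \frac{1}{\log 2} \int_0^1 \frac{v^{-1/\alpha} \, dv}{v + 1} \\
&= \frac{1}{\log 2} \sum_{j \in \mathbf{N}_+} \frac{1}{(2j - 1 - 1/\alpha)(2j - 1/\alpha)} \\
&= \frac{1}{2 \log 2} \left( \psi\left( 1 - \frac{1}{2\alpha} \right) - \psi\left( \frac{1}{2} - \frac{1}{2\alpha} \right) \right),
\end{aligned}
\]
where $\psi$ is the digamma function (see p. 145), and
\[
\mathrm{E}_{\bar\gamma} f(\bar u_0) = \frac{1}{\log 2} \left( \int_1^2 \frac{(x - 1) \, dx}{x^{2 - 1/\alpha}} + \int_2^\infty \frac{dx}{x^{2 - 1/\alpha}} \right) = \frac{\alpha^2 (2^{1/\alpha} - 1)}{(\alpha - 1) \log 2}.
\]
□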
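Example 3.3.10 Let $f(x) = x$, $x \in [1, \infty)$. Theorem 3.3.8 holds with
\[
m(f, \bar y_0) = m(f, \bar r_0) = \frac{1}{\log 2} \int_1^\infty \frac{(x - \lfloor x \rfloor) \, dx}{x(x + 1)}
= \frac{1}{\log 2} \int_1^\infty \frac{dx}{x^2 (x + 1)} = (\log 2)^{-1} - 1,
\]
\[
m(f, \bar u_0) = \mathrm{E}_{\bar\gamma}\left( \bar r_0 - \bar a_0 + \bar y_0^{-1} \right) = m(f, \bar r_0) + \mathrm{E}_{\bar\gamma}\left( \bar y_0^{-1} \right)
= \frac{2}{\log 2} \int_1^\infty \frac{dx}{x^2 (x + 1)} = 2 \left( (\log 2)^{-1} - 1 \right).
\]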
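It follows that if for any $n \in \mathbf{N}_+$ the process $\zeta'_n = (\zeta'_n(t))_{t \in I}$ is defined by
\[
\zeta'_n(t) = \frac{1}{n} \sum_{j \le \lfloor nt \rfloor} \left( b_j + \frac{C - \log n}{\log 2} \right), \quad t \in I,
\]
where $b_n$ denotes either $y_n$, $r_n$ or $u_n$, $n \in \mathbf{N}_+$, then for any $\mu \in \mathrm{pr}(\mathcal{B}_I)$ such that $\mu \ll \lambda$ we have
\[
\mu (\zeta'_n)^{-1} \stackrel{w}{\longrightarrow} Q_{\nu''} \quad \text{in } \mathcal{B}_D
\]
in the cases where $b_n = y_n$ or $b_n = r_n$, $n \in \mathbf{N}_+$, with $\nu'' = \delta_{C/\log 2 - 1} * \nu_1$, and
\[
\mu (\zeta'_n)^{-1} \stackrel{w}{\longrightarrow} Q_{\nu'''} \quad \text{in } \mathcal{B}_D
\]
in the case where $b_n = u_n$, $n \in \mathbf{N}_+$, with $\nu''' = \delta_{(C + 1)/\log 2 - 2} * \nu_1$. As a consequence (compare with the similar result for the incomplete quotients $a_n$, $n \in \mathbf{N}_+$, in Subsection 3.3.2) we have
\[
\begin{aligned}
&\lim_{n \to \infty} \mu\left( \frac{\mathrm{card}\{1 \le k \le n : \sum_{j=1}^{k} y_j > k(\log n - C)/\log 2\}}{n} < x \right) \\
&\quad = \lim_{n \to \infty} \mu\left( \frac{\mathrm{card}\{1 \le k \le n : \sum_{j=1}^{k} r_j > k(\log n - C)/\log 2\}}{n} < x \right) \\
&\quad = \mu\left( \lambda(t \in I : \xi_{\nu''}(t) > 0) < x \right), \quad 0 \le x \le 1,
\end{aligned}
\]
and
\[
\lim_{n \to \infty} \mu\left( \frac{\mathrm{card}\{1 \le k \le n : \sum_{j=1}^{k} u_j > k(\log n - C)/\log 2\}}{n} < x \right)
= \mu\left( \lambda(t \in I : \xi_{\nu'''}(t) > 0) < x \right), \quad 0 \le x \le 1.
\]
□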
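The closed forms for $f(x) = x$ in Example 3.3.10 can be checked numerically. The following is only a sketch: the helper `simpson` and the variable names are illustrative and not part of the text. The substitution $x = 1/t$ turns $\int_1^\infty dx / (x^2 (x+1))$ into $\int_0^1 t/(1+t) \, dt$, which is smooth and easy to integrate.

```python
import math

def simpson(g, a, b, n=1000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3

# With x = 1/t, the integral of 1/(x^2 (x+1)) over [1, inf)
# becomes the integral of t/(1+t) over [0, 1].
integral = simpson(lambda t: t / (1 + t), 0.0, 1.0)

m_r = integral / math.log(2)       # value stated for the y- and r-cases
m_u = 2 * integral / math.log(2)   # value stated for the u-case
closed_form = 1 / math.log(2) - 1  # (log 2)^{-1} - 1
```

Under these assumptions `m_r` agrees with `closed_form` and `m_u` with twice that value, up to the quadrature error.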

3.4 Fluctuation results


3.4.1 The case of incomplete quotients
We start with a direct consequence of Theorem 3.2.2′.
Let $K \subset C$ be the collection of all absolutely continuous functions $x \in C$ for which $x(0) = 0$ and $\int_0^1 [x'(t)]^2 \, dt \le 1$. Here $x'$ stands for the derivative of $x$, which exists a.e. in $I$.
Let $H$ be a real-valued function on $\mathbf{N}_+^{\mathbf{N}_+}$. Set $H_n = H(a_n, a_{n+1}, \cdots)$, $n \in \mathbf{N}_+$, and assume that $\mathrm{E}_\gamma H_1^2 < \infty$ and (3.2.1′) holds. Denoting $S_n = \sum_{i=1}^{n} H_i - n \mathrm{E}_\gamma H_1$, $n \in \mathbf{N}_+$, and assuming that $\sigma^2$ defined by (3.2.2′) is non-zero, for any $n \ge 3$ put
\[
\theta_n(t) = \frac{1}{\sigma \sqrt{2n \log\log n}} \left( S_{\lfloor nt \rfloor} + (nt - \lfloor nt \rfloor) \left( H_{\lfloor nt \rfloor + 1} - \mathrm{E}_\gamma H_1 \right) \right)
= \frac{1}{\sqrt{2n \log\log n}} \, \xi_n^C(t), \quad t \in I.
\]

Theorem 3.4.1 (Strassen's law of the iterated logarithm). Assume that $\mathrm{E}_\gamma |H_1|^{2 + \delta} < \infty$ for some constant $\delta > 0$, (3.2.3′) holds, and $\sigma^2$ defined by (3.2.2′) is non-zero. Then the sequence $(\theta_n)_{n \ge 3}$, viewed as a subset of $C$, is a relatively compact set whose derived set coincides a.e. with $K$.
Proof. The result follows from Strassen's law of the iterated logarithm for standard Brownian motion [see Theorem 1 in Strassen (1964)] and Theorem 3.2.2′. □
Corollary 3.4.2 (Classical law of the iterated logarithm). Under the assumptions of Theorem 3.4.1 the set of accumulation points of the sequence
\[
\left( S_n / \sigma \sqrt{2n \log\log n} \right)_{n \ge 3}
\]
coincides a.e. with the segment $[-1, 1]$.


In the special case where $H$ only depends on finitely many coordinates of a current point of $\mathbf{N}_+^{\mathbf{N}_+}$, i.e., when $H$ is a real-valued function on $\mathbf{N}_+^k$ for a given $k \in \mathbf{N}_+$, certain assumptions in Theorem 3.4.1 are no longer necessary. In this case $H_n = H(a_n, \cdots, a_{n+k-1})$, $n \in \mathbf{N}_+$, and (3.2.3′) is trivially satisfied. Also, $\sigma^2$ reduces to (3.2.2″) and when $k = 1$ by Corollary 2.1.25 we have $\sigma^2 = 0$ if and only if $H = \mathrm{const}$. Finally, it is enough to assume that $\mathrm{E}_\gamma H_1^2 < \infty$. This follows from the work of Heyde and Scott (1973). Cf. the remark following Proposition 3.2.6.
We state a most striking result.
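Proposition 3.4.3 Let $f : \mathbf{N}_+ \to \mathbf{R}$ be a nonconstant function. Assume that $\mathrm{E}_\gamma f^2(a_1) < \infty$ and put $S_n = \sum_{i=1}^{n} f(a_i) - n \mathrm{E}_\gamma f(a_1)$, $n \in \mathbf{N}_+$. Let
\[
\sigma^2 = \mathrm{E}_\gamma f^2(a_1) - \mathrm{E}_\gamma^2 f(a_1) + 2 \sum_{n \in \mathbf{N}_+} \left( \mathrm{E}_\gamma f(a_1) f(a_{n+1}) - \mathrm{E}_\gamma^2 f(a_1) \right),
\]
which by Corollary 2.1.25 is non-zero. For any $n \ge 3$ put
\[
\theta_n(t) = \frac{1}{\sigma \sqrt{2n \log\log n}} \left( S_{\lfloor nt \rfloor} + (nt - \lfloor nt \rfloor) \left( f(a_{\lfloor nt \rfloor + 1}) - \mathrm{E}_\gamma f(a_1) \right) \right), \quad t \in I.
\]
Then the sequence $(\theta_n)_{n \ge 3}$, viewed as a subset of $C$, is a relatively compact set whose derived set coincides a.e. with $K$. In particular, the set of accumulation points of the sequence $(S_n / \sigma \sqrt{2n \log\log n})_{n \ge 3}$ coincides a.e. with the segment $[-1, 1]$.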
The almost sure invariance principle is instrumental in establishing integral tests which characterize the asymptotic growth rates of partial sums and maximum absolute partial sums.
Proposition 3.4.4 Let $\theta : [1, \infty) \to \mathbf{R}_{++}$ be non-decreasing. Then under the assumptions of Theorem 3.4.1 the following assertions hold:

(i) $\gamma\left( S_n > \sigma \sqrt{n} \, \theta(n) \text{ i.o.} \right) = 0$ or 1 according as
\[
\int_1^\infty \frac{\theta(t)}{t} \exp\left( -\frac{\theta^2(t)}{2} \right) dt
\]
converges or diverges.

(ii) $\gamma\left( \max_{1 \le i \le n} |S_i| < \sigma \sqrt{n} / \theta(n) \text{ i.o.} \right) = 0$ or 1 according as
\[
\int_1^\infty \frac{\theta^2(t)}{t} \exp\left( -\frac{\pi^2 \theta^2(t)}{8} \right) dt
\]
converges or diverges.
Proof. These results follow from Theorem 3.2.2′ and properties of standard Brownian motion. See Jain and Taylor (1973) and Jain, Jogdeo and Stout (1975) [cf. Philipp and Stout (1975)]. □
Except for the sufficiency of the moment assumption $\mathrm{E}_\gamma H_1^2 < \infty$ in the case considered there, the considerations on Theorem 3.4.1 following Corollary 3.4.2 are valid for Proposition 3.4.4, too.
We note that Proposition 3.4.4(i) implies the classical law of the iterated logarithm
\[
\gamma\left( \limsup_{n \to \infty} \frac{S_n}{\sigma \sqrt{2n \log\log n}} = 1 \right) = 1. \tag{3.4.1}
\]
To obtain (3.4.1) we should take successively $\theta(n) = (1 + \varepsilon) \sqrt{2 \log\log n}$ and $\theta(n) = (1 - \varepsilon) \sqrt{2 \log\log n}$, $0 < \varepsilon < 1$, $n \in \mathbf{N}_+$. Also, Proposition 3.4.4(ii) implies Chung's law of the iterated logarithm for maximum absolute partial sums
\[
\gamma\left( \liminf_{n \to \infty} \frac{\max_{1 \le i \le n} |S_i|}{\sigma \sqrt{n / (\log\log n)}} = \frac{\pi}{\sqrt{8}} \right) = 1. \tag{3.4.2}
\]
To obtain (3.4.2) we should take successively $\theta(n) = (\sqrt{8}/\pi)(1 + \varepsilon) \sqrt{\log\log n}$ and $\theta(n) = (\sqrt{8}/\pi)(1 - \varepsilon) \sqrt{\log\log n}$, $0 < \varepsilon < 1$, $n \in \mathbf{N}_+$.
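The step from the integral tests to the two laws of the iterated logarithm can be spelled out as follows. This is a sketch of the standard computation rather than a quotation from the sources cited. Only the behaviour of the integrands at infinity matters, so the lower limit of integration is taken large enough for $\log\log t$ to be positive.

```latex
% (i) With \theta(t) = (1 \pm \varepsilon)\sqrt{2\log\log t}:
\frac{\theta(t)}{t}\exp\Bigl(-\frac{\theta^2(t)}{2}\Bigr)
  = \frac{(1 \pm \varepsilon)\sqrt{2\log\log t}}{t\,(\log t)^{(1 \pm \varepsilon)^2}} .
% Substituting s = \log t, the integral converges iff the exponent
% (1 \pm \varepsilon)^2 exceeds 1. Hence
%   sign +: convergence, so S_n > (1+\varepsilon)\sigma\sqrt{2n\log\log n}
%           occurs only finitely often a.e.;
%   sign -: divergence, so S_n > (1-\varepsilon)\sigma\sqrt{2n\log\log n}
%           occurs infinitely often a.e.
% Letting \varepsilon \to 0 gives (3.4.1).

% (ii) With \theta(t) = (\sqrt{8}/\pi)(1 \pm \varepsilon)\sqrt{\log\log t}:
\frac{\theta^2(t)}{t}\exp\Bigl(-\frac{\pi^2\theta^2(t)}{8}\Bigr)
  = \frac{8(1 \pm \varepsilon)^2}{\pi^2}\,
    \frac{\log\log t}{t\,(\log t)^{(1 \pm \varepsilon)^2}} ,
% which again converges for the sign + and diverges for the sign -, so that
\frac{\pi}{\sqrt{8}\,(1+\varepsilon)}
  \le \liminf_{n\to\infty}
      \frac{\max_{1\le i\le n}|S_i|}{\sigma\sqrt{n/\log\log n}}
  \le \frac{\pi}{\sqrt{8}\,(1-\varepsilon)} \quad \text{a.e.}
```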
We conjecture that in the special case where $H$ only depends on finitely many coordinates of a current point in $\mathbf{N}_+^{\mathbf{N}_+}$, Chung's law of the iterated logarithm (3.4.2) holds only assuming that $\mathrm{E}_\gamma H_1^2 < \infty$ [as (3.4.1) does]. See Jain and Pruitt (1975) for the i.i.d. case.

3.4.2 The case of associated random variables


Write $b_n$ for either $y_n$, $r_n$ or $u_n$, $n \in \mathbf{N}_+$, respectively $\bar b_0$ for either $\bar y_0$, $\bar r_0$ or $\bar u_0$.
Theorem 3.4.5 Let $f : [1, \infty) \to \mathbf{R}$ satisfy either (i) or (ii) of Theorem 3.2.9. With the notation of that theorem assume that $\sigma(f) > 0$ and put
\[
\theta'_n(t) = \frac{1}{\sqrt{2n \log\log n}} \, \xi'^{\,C}_n(t), \quad n \ge 3, \ t \in I.
\]
If $\delta > 0$ then the sequence $(\theta'_n)_{n \ge 3}$, viewed as a subset of $C$, is a relatively compact set whose derived set coincides a.e. with $K$. In particular, the set of accumulation points of the sequence $(S'_n / \sigma \sqrt{2n \log\log n})_{n \ge 3}$ coincides a.e. with the segment $[-1, 1]$.
Proof. The results follow at once from Theorem 3.2.9(b) and Strassen's law of the iterated logarithm for standard Brownian motion [see Theorem 1 in Strassen (1964)]. □
Note that in the present context we cannot make considerations similar
to those following Corollary 3.4.2.
Example 3.4.6 Let $f(x) = \log x$, $x \in [1, \infty)$. As we have seen in Example 3.2.11, in the cases where $b_n = y_n$ or $b_n = r_n$, $n \in \mathbf{N}_+$, we have
\[
\mathrm{E}_{\bar\gamma} f(\bar b_0) = \frac{\pi^2}{12 \log 2}
\]
and $\sigma(f) = \sigma < \infty$ is non-zero. It follows that Strassen's law of the iterated logarithm holds for the corresponding processes $\theta'_n$, $n \in \mathbf{N}_+$. In particular, the classical law of the iterated logarithm
\[
\gamma\left( \limsup_{n \to \infty} \frac{\log q_n - n\pi^2 / (12 \log 2)}{\sigma \sqrt{2n \log\log n}} = 1 \right) = 1
\]
holds. This had been proved by Gordin and Reznik (1970) and Philipp and Stackelberg (1969). □
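To make the centring constant $\pi^2/(12 \log 2)$ concrete, here is a minimal sketch that computes the denominators $q_n$ of the convergents by the usual recurrence $q_k = a_k q_{k-1} + q_{k-2}$ and compares $n^{-1} \log q_n$ with that constant. The function names and the use of exact rationals are choices made for this illustration only.

```python
import math
from fractions import Fraction

def rcf_digits(x, n):
    """First n RCF digits of a rational x in (0, 1); fewer if the expansion ends."""
    digits = []
    while x != 0 and len(digits) < n:
        a = math.floor(1 / x)
        digits.append(a)
        x = 1 / x - a
    return digits

def continuants(digits):
    """Return [q_0, q_1, ..., q_n] with q_0 = 1 and q_k = a_k q_{k-1} + q_{k-2}."""
    q_prev, q = 0, 1
    out = [q]
    for a in digits:
        q_prev, q = q, a * q + q_prev
        out.append(q)
    return out

LEVY = math.pi ** 2 / (12 * math.log(2))  # about 1.18657

def log_q_over_n(digits):
    """(1/n) log q_n for the given digit string."""
    return math.log(continuants(digits)[-1]) / len(digits)
```

For the all-ones digit string the $q_n$ are Fibonacci numbers and `log_q_over_n` tends to $\log((1+\sqrt{5})/2) \approx 0.4812$, not to `LEVY`. This does not contradict the example, whose statement holds only almost everywhere.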
A result similar to Proposition 3.4.4 holds.
Proposition 3.4.7 Let $\theta : [1, \infty) \to \mathbf{R}_{++}$ be non-decreasing. Then under the assumptions of Theorem 3.2.9 the following assertions hold:

(i) $\gamma\left( S'_n > \sigma(f) \sqrt{n} \, \theta(n) \text{ i.o.} \right) = 0$ or 1 according as
\[
\int_1^\infty \frac{\theta(t)}{t} \exp\left( -\frac{\theta^2(t)}{2} \right) dt
\]
converges or diverges.

(ii) $\gamma\left( \max_{1 \le i \le n} |S'_i| < \sigma(f) \sqrt{n} / \theta(n) \text{ i.o.} \right) = 0$ or 1 according as
\[
\int_1^\infty \frac{\theta^2(t)}{t} \exp\left( -\frac{\pi^2 \theta^2(t)}{8} \right) dt
\]
converges or diverges.
Proof. These results follow from Theorem 3.2.9 and properties of standard Brownian motion. See Jain and Taylor (1973) and Jain, Jogdeo and Stout (1975) [cf. Philipp and Stout (1975)]. □
The remarks following Proposition 3.4.4 concerning the classical and
Chung’s laws of the iterated logarithm apply mutatis mutandis in the present
context, too.
It is obvious that all the results stated in this section still hold when $\gamma$ is replaced by any $\mu \in \mathrm{pr}(\mathcal{B}_I)$ such that $\mu \ll \lambda$.

Chapter 4

Ergodic theory of continued fractions

In this chapter applications of the ergodic properties of the continued fraction transformation $\tau$ and its natural extension $\bar\tau$ are given. Next, two operations ('singularization' and 'insertion') on incomplete quotients are introduced, which make it possible to obtain most of the continued fraction expansions related to the RCF expansion. Ergodic properties of these expansions are also derived.

4.0 Ergodic theory preliminaries


4.0.1 A few general concepts
Let $(X, \mathcal{X}, \mu)$ be a probability space. An $X$-valued random variable on $X$, i.e., an $(\mathcal{X}, \mathcal{X})$-measurable map from $X$ into itself (see Section A1.2), is called a transformation of $X$. A transformation $T$ of $X$ is said to be $\mu$-non-singular if and only if $\mu(T^{-1}(A)) = 0$ for any $A \in \mathcal{X}$ for which $\mu(A) = 0$; it is said to be measure preserving if and only if $\mu T^{-1} = \mu$, i.e., $\mu(T^{-1}(A)) = \mu(A)$ for any $A \in \mathcal{X}$ (see Section A1.3). (When the probability $\mu$ should be emphasized we shall say that $T$ is $\mu$-preserving.) Clearly, any $\mu$-preserving transformation of $X$ is $\mu$-non-singular. A pair $(T, \mu)$, where $T$ is a $\mu$-preserving transformation of $X$, is called an endomorphism of $X$. An endomorphism $(T, \mu)$ of $X$ is called an automorphism if and only if $T$ is bijective [that is, $T(X) = X$ and $T^{-1}$ exists] and $T^{-1}$ is $(\mathcal{X}, \mathcal{X})$-measurable. A quadruple $(X, \mathcal{X}, T, \mu)$, where $(T, \mu)$ is an endomorphism of $X$, is called a (measurable) dynamical system.

A transformation $T$ of $X$ is said to be ergodic (or metrically transitive, or indecomposable) under $\mu$ if and only if the sets $A \in \mathcal{X}$ with $T^{-1}(A) = A$, which are called $T$-invariant, satisfy either $\mu(A) = 0$ or $\mu(A) = 1$. An equivalent definition, even if seemingly more general, is that
\[
\mu\left( (T^{-1}(A) \setminus A) \cup (A \setminus T^{-1}(A)) \right) = 0
\]
for $A \in \mathcal{X}$ if and only if either $\mu(A) = 0$ or $\mu(A) = 1$. Finally, in terms of functions this is equivalent to $f = f \circ T$ $\mu$-a.s. for an $X$-valued random variable $f$ on $X$ if and only if $f$ is constant $\mu$-a.s.
In particular, $T$ is ergodic under $\mu$ if it is strongly mixing under $\mu$, that is,
\[
\lim_{n \to \infty} \mu(T^{-n}(A) \cap B) = \mu(A) \mu(B)
\]
for any sets $A, B \in \mathcal{X}$. This is equivalent to
\[
\lim_{n \to \infty} \int_X (f \circ T^n) \, g \, d\mu = \int_X f \, d\mu \int_X g \, d\mu
\]
for any $f \in L^\infty(X, \mathcal{X}, \mu)$ and $g \in L^1(X, \mathcal{X}, \mu)$.
Proposition 4.0.1 Let $T$ be a $\mu$-non-singular transformation of $X$. If $T$ is ergodic under $\mu$, then there exists at most one probability measure $\nu$ on $\mathcal{X}$ such that $\nu \ll \mu$ and $(T, \nu)$ is an endomorphism of $X$. Conversely, if there exists a unique measure $\nu$ on $\mathcal{X}$ with $\nu \ll \mu$ and $d\nu/d\mu > 0$ $\mu$-a.s. such that $(T, \nu)$ is an endomorphism of $X$, then $T$ is ergodic under $\mu$.
The proof of Proposition 4.0.1, which entails the concept of the Perron–Frobenius operator of $T$ (cf. Section 2.1), can be found in Lasota and Mackey (1985). □
An endomorphism $(T, \mu)$ of $X$ is said to be exact if and only if, putting
\[
\mathcal{X}_n = \left\{ T^{-n}(A) : A \in \mathcal{X} \right\}, \quad n \in \mathbf{N},
\]
where $T^0$ is the identity map, the tail $\sigma$-algebra $\bigcap_{n \in \mathbf{N}} \mathcal{X}_n$ is $\mu$-trivial, i.e., it contains only sets $A$ for which either $\mu(A) = 0$ or $\mu(A) = 1$. If an endomorphism $(T, \mu)$ of $X$ is exact, then $T$ is ergodic under $\mu$; also, for any $A \in \mathcal{X}$ for which $\mu(A) > 0$ and $T^n(A) \in \mathcal{X}$, $n \in \mathbf{N}_+$, we have
\[
\lim_{n \to \infty} \mu(T^n(A)) = 1.
\]
Proposition 4.0.2 Let $T$ be a $\mu$-preserving transformation of $X$ for which $T(A) \in \mathcal{X}$ for any $A \in \mathcal{X}$. Then the endomorphism $(T, \mu)$ is exact if and only if
\[
\lim_{n \to \infty} \left\| P^n f - \int_X f \, d\mu \right\|_{1, \mu} = 0
\]
for any non-negative $f \in L^1(X, \mathcal{X}, \mu)$, where $P$ is the Perron–Frobenius operator of $T$ under $\mu$ (cf. Section 2.1).
For the proof see Boyarski and Góra (1997, p. 82). □
Theorem 4.0.3 (Birkhoff's individual ergodic theorem) Let $T$ be a $\mu$-preserving transformation of $X$. Then for any $f \in L^1(X, \mathcal{X}, \mu)$ there exists $\tilde f \in L^1(X, \mathcal{X}, \mu)$ such that
\[
\lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} f(T^k(x)) = \tilde f \quad \mu\text{-a.s.}
\]
and
\[
\tilde f \circ T = \tilde f \quad \mu\text{-a.s.}
\]
Moreover, $\int_X \tilde f \, d\mu = \int_X f \, d\mu$ and if, in addition, $T$ is ergodic under $\mu$, then $\tilde f$ is $\mu$-a.s. a constant equal to $\int_X f \, d\mu$.
A proof of the ergodic theorem can be found in, e.g., Billingsley (1965), Walters (1982), Petersen (1983) or Cornfeld et al. (1982). In particular, in Keane (1991) a short proof, essentially based on an idea of Kamae (1982), is outlined. See also Katznelson and Weiss (1982). □
Under suitable assumptions it is possible to refine Birkhoff's theorem by giving an estimate of the convergence rate to the limit $\tilde f$. The result stated below is a special case of Theorem 3 of Gál and Koksma (1950).
Proposition 4.0.4 Let $T$ be a $\mu$-preserving transformation of $X$ which is ergodic under $\mu$. Assume that
\[
\int_X \left( \sum_{\kappa=0}^{n-1} f \circ T^\kappa - n \int_X f \, d\mu \right)^2 d\mu = O(\Psi(n))
\]
as $n \to \infty$, where $\Psi : \mathbf{N}_+ \to \mathbf{R}$ is a function such that the sequence $(\Psi(n)/n)_{n \in \mathbf{N}_+}$ is non-decreasing. Then whatever $\varepsilon > 0$ we have
\[
\sum_{\kappa=0}^{n-1} f(T^\kappa(x)) = n \int_X f \, d\mu + o\left( \Psi^{1/2}(n) \log^{\frac{3 + \varepsilon}{2}} n \right) \quad \mu\text{-a.s.}
\]
as $n \to \infty$. Here the constant implied in $o$ depends on $\varepsilon$ and the current point $x \in X$.

Given a transformation $T$ of $X$ we can define its so-called natural extension $\bar T$ as follows. Let
\[
X_T = \left\{ (x_i)_{i \in \mathbf{N}} \in X^{\mathbf{N}} : x_i = T(x_{i+1}), \ i \in \mathbf{N} \right\}
\]
and define $\bar T : X_T \to X_T$ by
\[
\bar T\left( (x_i)_{i \in \mathbf{N}} \right) = (T(x_0), x_0, x_1, \cdots)
\]
for any $(x_i)_{i \in \mathbf{N}} = (x_0, x_1, \cdots) \in X_T$. It is easy to check that $\bar T$ is bijective. If $T$ is $\mu$-preserving, then we can also define a measure $\bar\mu$ on the $\sigma$-algebra $\mathcal{X}_T \subset \mathcal{X}^{\mathbf{N}}$ generated by the cylinder sets
\[
C(A_0, \ldots, A_n) = \left\{ (x_i)_{i \in \mathbf{N}} \in X_T : x_j \in A_j, \ 0 \le j \le n \right\},
\]
where $A_j \in \mathcal{X}$, $0 \le j \le n$, $n \in \mathbf{N}$, by setting
\[
\bar\mu(C(A_0, \ldots, A_n)) = \mu\left( \bigcap_{0 \le j \le n} T^{-n+j}(A_j) \right), \quad n \in \mathbf{N}.
\]
Proposition 4.0.5 If $T$ is $\mu$-preserving, then $\bar T$ is $\bar\mu$-preserving; $\bar T$ is ergodic (strongly mixing) under $\bar\mu$ if and only if $T$ is ergodic (strongly mixing) under $\mu$.
Clearly, if $(T, \mu)$ is an endomorphism of $X$, then $(\bar T, \bar\mu)$ is an automorphism of $X_T$.
Remarks. 1. The definition just given of the natural extension $\bar T$ of $T$ is a constructive one. More generally, starting from a transformation $T$ of $X$ which is $\mu$-preserving ($\mu T^{-1} = \mu$), a bijective transformation $\bar T : \bar X \to \bar X$ is called a natural extension of $T$ if and only if (i) there exists a measurable space $(\bar X, \bar{\mathcal{X}})$ and a probability measure $\bar\mu$ on $\bar{\mathcal{X}}$ such that $\bar T$ is $\bar\mu$-preserving, and (ii) there exists a random variable $f : \bar X \to X$ such that the $\sigma$-algebra generated by $\bigcup_{n \in \mathbf{N}} \bar T^n f^{-1}(\mathcal{X})$ (see Section A1.1) coincides with $\bar{\mathcal{X}}$ up to sets of $\bar\mu$-probability 0, $f \circ \bar T = T \circ f$ $\bar\mu$-a.s., and $\bar\mu f^{-1} = \mu$. The natural extension is unique up to isomorphism. By this we mean that if $\bar T_i : \bar X_i \to \bar X_i$, $i = 1, 2$, are natural extensions of $T : X \to X$, with $\bar T_i$ being $\bar\mu_i$-preserving for a probability measure $\bar\mu_i$ on $\bar{\mathcal{X}}_i$ (the $\sigma$-algebra in $\bar X_i$), $i = 1, 2$, then there exist $E_i \in \bar{\mathcal{X}}_i$ with $\bar\mu_i(E_i) = 0$, $i = 1, 2$, and a one-to-one random variable $g : \bar X_1 \setminus E_1 \to \bar X_2 \setminus E_2$ such that $g \bar T_1 = \bar T_2 g$ on $\bar X_1 \setminus E_1$ and $\bar\mu_1(g^{-1}(E)) = \bar\mu_2(E)$ for any set $E$ in $\bar{\mathcal{X}}_2$ which is included in $\bar X_2 \setminus E_2$. In the case of the constructive definition we clearly have $\bar X = X_T$ while $f$ is defined by
\[
f\left( (x_i)_{i \in \mathbf{N}} \right) = x_0, \quad (x_i)_{i \in \mathbf{N}} \in X_T.
\]
Note that the definition of isomorphism of two natural extensions of a given endomorphism also applies to the case of two arbitrary endomorphisms or dynamical systems.
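2. Unlike ergodicity or strong mixing, exactness does not transfer from an endomorphism $(T, \mu)$ to its natural extension $(\bar T, \bar\mu)$. As $\bar T$ is invertible, $(\bar T, \bar\mu)$ cannot be exact since
\[
\bar\mu\left( \bar T(A) \right) = \bar\mu\left( \bar T^{-1}(\bar T(A)) \right) = \bar\mu(A),
\]
hence $\bar\mu\left( \bar T^n(A) \right) = \bar\mu(A)$ for any $n \in \mathbf{N}_+$ and $A \in \bar{\mathcal{X}}$. Instead, $(\bar T, \bar\mu)$ always is a K-automorphism, which means that there exists an algebra $\mathcal{A} \subset \bar{\mathcal{X}}$ such that $\bar T^{-1}(\mathcal{A}) \subset \mathcal{A}$, $\bigcup_{n \in \mathbf{N}_+} \bar T^n(\mathcal{A})$ generates $\bar{\mathcal{X}}$, and the tail $\sigma$-algebra $\bigcap_{n \in \mathbf{N}_+} \bar T^{-n}(\mathcal{A})$ is $\bar\mu$-trivial. Cf. Petersen (1983, Section 2.5). □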
Finally, let us consider together with the probability space $(X, \mathcal{X}, \mu)$ and a transformation $T : X \to X$, a family of probability spaces $((Y, \mathcal{Y}, \nu_x))_{x \in X}$ and a family $(T_x)_{x \in X}$ of transformations of $Y$ such that the map $(x, y) \in X \times Y \to T_x(y) \in Y$ is a $Y$-valued random variable on $X \times Y$. The map $S : X \times Y \to X \times Y$ defined by
\[
S(x, y) = (T(x), T_x(y)), \quad (x, y) \in X \times Y,
\]
is called a skew product of $T$ and $(T_x)_{x \in X}$. In many cases the natural extensions are constructed as skew products. Several examples can be found in the next sections.
Assuming that $T$ is $\mu$-preserving and $T_x$ is $\nu_x$-preserving for any $x \in X$, we might expect the skew product $S$ to be $\nu$-preserving, where $\nu$ is the probability measure on $\mathcal{X} \otimes \mathcal{Y}$ defined by
\[
\nu(A \times B) = \int_A \nu_x(B) \, \mu(dx), \quad A \in \mathcal{X}, \ B \in \mathcal{Y}.
\]
Unfortunately, such a result does not hold, even though it is claimed in Boyarski and Góra (1997, p. 64). It is contradicted, e.g., by the case of the natural extension $\bar\tau$ of $\tau$. Cf. the next subsection.
4.0.2 The special case of the transformations $\tau$ and $\bar\tau$

It is possible to give a direct proof of the ergodicity under $\gamma$ of the continued fraction transformation $\tau$. See, e.g., Billingsley (1965, pp. 44–45).
Results proved in Chapter 2 allow us to assert that actually $\tau$ is strongly mixing under $\gamma$ and any $\gamma_a$, $a \in I$, thus in particular under $\gamma_0 = \lambda$. This is a direct consequence of Corollary 1.3.15. Therefore $\tau$ is also ergodic under $\gamma$ and any $\gamma_a$, $a \in I$. Moreover, the endomorphism $(\tau, \gamma)$ is exact by Corollary 2.1.8 and Proposition 4.0.2. It follows from Proposition 4.0.1 that any $\nu \ll \lambda$ for which $\tau$ is $\nu$-preserving should coincide with $\gamma$.
As for $\bar\tau$, we shall show that it can be viewed as the natural extension of $\tau$ in the meaning of the constructive definition given in the preceding subsection. Indeed, in our case $X_T$ from the preceding subsection is
\[
\Omega_\tau = \{ (\omega_i)_{i \in \mathbf{N}} \in \Omega^{\mathbf{N}} : \omega_i = \tau(\omega_{i+1}), \ i \in \mathbf{N} \},
\]
and the natural extension of $\tau$ appears to be (we are bound to change notation) the transformation given by
\[
\tilde\tau\left( (\omega_i)_{i \in \mathbf{N}} \right) = (\tau(\omega_0), \omega_0, \omega_1, \cdots)
\]
for any $(\omega_i)_{i \in \mathbf{N}} = (\omega_0, \omega_1, \cdots) \in \Omega_\tau$. Let us remark that by the very definition of $\Omega_\tau$ we have $\omega_{i+1} = 1/(\kappa_i + \omega_i)$ for some $\kappa_i \in \mathbf{N}_+$ whatever $i \in \mathbf{N}$. Hence $\Omega_\tau$ can be viewed as the Cartesian product
\[
\Omega \times \mathbf{N}_+^{\mathbf{N}_+}
\]
or, equivalently, $\Omega \times \Omega = \Omega^2$. More precisely, there is a one-to-one correspondence between $\Omega_\tau$ and $\Omega^2$ given by
\[
(\omega_i)_{i \in \mathbf{N}} \in \Omega_\tau \ \leftrightarrow \ \left( \omega_0, [\lfloor \omega_1^{-1} \rfloor, \lfloor \omega_2^{-1} \rfloor, \cdots] \right) \in \Omega^2.
\]
Then there also is a one-to-one correspondence between
\[
\tilde\tau\left( (\omega_i)_{i \in \mathbf{N}} \right) = (\tau(\omega_0), \omega_0, \omega_1, \cdots) \in \Omega_\tau
\]
and
\[
\left( \tau(\omega_0), \frac{1}{\lfloor \omega_0^{-1} \rfloor + [\lfloor \omega_1^{-1} \rfloor, \lfloor \omega_2^{-1} \rfloor, \cdots]} \right) \in \Omega^2.
\]
These considerations show that we can identify $\tilde\tau : \Omega_\tau \to \Omega_\tau$ and $\bar\tau : \Omega^2 \to \Omega^2$ defined as in Subsection 1.3.1 by
\[
\bar\tau(\omega, \theta) = \left( \tau(\omega), \frac{1}{a_1(\omega) + \theta} \right), \quad (\omega, \theta) \in \Omega^2.
\]
It follows from Proposition 4.0.5 that $\bar\tau$ is strongly mixing (thus ergodic) under $\bar\gamma$. Also, $(\bar\tau, \bar\gamma)$ is a K-automorphism. Clearly, $\bar\tau$ can be viewed as a skew product.
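The two-dimensional map $\bar\tau(\omega, \theta) = (\tau(\omega), 1/(a_1(\omega) + \theta))$ and its invertibility can be illustrated as follows. This is a minimal sketch: it works with exact rationals for convenience, although $\Omega$ in the text is the set of irrationals in $I$, and the formula used in `tau_bar_inv` is derived here from the definition rather than quoted from the book.

```python
import math
from fractions import Fraction

def tau(w):
    """Continued fraction map: tau(w) = 1/w - floor(1/w), with tau(0) = 0."""
    if w == 0:
        return Fraction(0)
    return 1 / w - math.floor(1 / w)

def a1(w):
    """First incomplete quotient a_1(w) = floor(1/w), for w in (0, 1]."""
    return math.floor(1 / w)

def tau_bar(w, th):
    """Natural extension: (w, th) -> (tau(w), 1/(a_1(w) + th))."""
    return tau(w), 1 / (a1(w) + th)

def tau_bar_inv(w, th):
    """Inverse of tau_bar: (w, th) -> (1/(a_1(th) + w), tau(th)).

    The digit a_1(w) shifted into the second coordinate by tau_bar is read
    back as a_1(th), which is what makes the extended map invertible.
    """
    return 1 / (a1(th) + w), tau(th)

point = (Fraction(13, 47), Fraction(2, 5))
image = tau_bar(*point)          # (8/13, 5/17)
recovered = tau_bar_inv(*image)  # back to (13/47, 2/5)
```

The first coordinate forgets the digit $a_1(\omega) = 3$, while the second coordinate stores it, so no information is lost.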

4.1 Classical results and generalizations


4.1.1 The case of incomplete quotients
Since $\tau$ is $\gamma$-preserving and ergodic under $\gamma$, it follows from Theorem 4.0.3 that
\[
\lim_{n \to \infty} \frac{1}{n} \sum_{\kappa=0}^{n-1} f \circ \tau^\kappa = \frac{1}{\log 2} \int_0^1 \frac{f(x)}{x + 1} \, dx \quad \text{a.e.} \tag{4.1.1}
\]
for any measurable function $f : I \to \mathbf{R}$ such that $\int_I |f| \, d\lambda < \infty$. It is clear that under suitable further assumptions on $f$, Proposition 4.0.4 should lead to estimates of convergence rates in (4.1.1).
We now state several classical results which can be derived from (4.1.1) by specializing $f$, together with the corresponding estimates of the convergence rates, when available. Let us note that throughout this subsection the constants implied in $o$ will depend on $\varepsilon$, the current point in $\Omega$, and the other variables involved.
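Proposition 4.1.1 [Asymptotic relative digit frequencies, Lévy (1929)] For any $i \in \mathbf{N}_+$ we have
\[
\lim_{n \to \infty} \frac{\mathrm{card}\{\kappa : a_\kappa = i, \ 1 \le \kappa \le n\}}{n} = \frac{1}{\log 2} \log\left( 1 + \frac{1}{i(i + 2)} \right) \quad \text{a.e.}
\]
More precisely, whatever $\varepsilon > 0$, for any $i \in \mathbf{N}_+$ we have
\[
\frac{\mathrm{card}\{\kappa : a_\kappa = i, \ 1 \le \kappa \le n\}}{n} = \frac{1}{\log 2} \log\left( 1 + \frac{1}{i(i + 2)} \right) + o\left( n^{-\frac{1}{2}} \log^{(3 + \varepsilon)/2} n \right) \quad \text{a.e.}
\]
as $n \to \infty$.
Proof. The first equation in the above statement follows from (4.1.1) by taking $f = I_{(a_1 = i)}$, hence $f \circ \tau^\kappa = I_{(a_1 \circ \tau^\kappa = i)} = I_{(a_{\kappa+1} = i)}$, $\kappa \in \mathbf{N}$. The second equation follows from Proposition 4.0.4 on account of Corollaries 1.3.15 and A3.3 which yield $\Psi(n) = n$, $n \in \mathbf{N}_+$. □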
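The limiting frequencies of Proposition 4.1.1 can be compared with empirical digit counts. The sketch below is illustrative only: the helper names, the seed, and the choice of a random dyadic rational with 4000 bits as a stand-in for a 'typical' point are assumptions of this example, and only the leading digits of such a rational behave like those of a generic real number.

```python
import math
import random
from collections import Counter
from fractions import Fraction

def limiting_frequency(i):
    """Limit in Proposition 4.1.1: log(1 + 1/(i(i+2))) / log 2."""
    return math.log2(1 + 1 / (i * (i + 2)))

def rcf_digits(x, n):
    """First n RCF digits of a rational x in (0, 1); fewer if the expansion ends."""
    digits = []
    while x != 0 and len(digits) < n:
        a = math.floor(1 / x)
        digits.append(a)
        x = 1 / x - a
    return digits

def empirical_frequencies(bits=4000, n=1000, seed=0):
    """Relative frequencies of the digits 1, 2, 3 among the first n digits."""
    rng = random.Random(seed)
    x = Fraction(rng.getrandbits(bits) | 1, 2 ** bits)  # odd numerator, so x != 0
    digits = rcf_digits(x, n)
    counts = Counter(digits)
    return {i: counts[i] / len(digits) for i in (1, 2, 3)}

# Limits for comparison: about 0.4150, 0.1699 and 0.0931 for i = 1, 2, 3.
limits = {i: limiting_frequency(i) for i in (1, 2, 3)}
```

Calling `empirical_frequencies()` gives values that should lie near `limits`, with fluctuations of the order suggested by the error term in the proposition.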
A more general result yielding the asymptotic relative $m$-digit block frequencies is also easily obtained.
Proposition 4.1.2 Whatever $\varepsilon > 0$, for any $m \in \mathbf{N}_+$ and $i^{(m)} = (i_1, \cdots, i_m) \in \mathbf{N}_+^m$ we have
\[
\frac{\mathrm{card}\{\kappa : (a_\kappa, \cdots, a_{\kappa+m-1}) = i^{(m)}, \ 1 \le \kappa \le n\}}{n}
= \frac{1}{\log 2} \log \frac{1 + v(i^{(m)})}{1 + u(i^{(m)})} + o\left( n^{-\frac{1}{2}} \log^{(3 + \varepsilon)/2} n \right) \quad \text{a.e.}
\]
as $n \to \infty$.
The proof is quite similar to that of the preceding proposition. In (4.1.1) we should take $f = I_{((a_1, \cdots, a_m) = i^{(m)})}$. □
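The block frequencies in Proposition 4.1.2 are easy to evaluate once $u(i^{(m)})$ and $v(i^{(m)})$ are known. The sketch below assumes, as the notation of Chapter 1 suggests but this section does not restate, that they are the left and right endpoints of the fundamental interval of $i^{(m)}$, i.e. the smaller and the larger of $[i_1, \cdots, i_m]$ and $[i_1, \cdots, i_m + 1]$. The function names are illustrative.

```python
import math
from fractions import Fraction

def cf_value(digits):
    """Value of the finite continued fraction [i_1, ..., i_m] as a Fraction."""
    x = Fraction(0)
    for a in reversed(digits):
        x = 1 / (a + x)
    return x

def block_frequency(block):
    """log((1 + v)/(1 + u)) / log 2, with u < v the assumed interval endpoints."""
    block = list(block)
    e1 = cf_value(block)
    e2 = cf_value(block[:-1] + [block[-1] + 1])
    u, v = min(e1, e2), max(e1, e2)
    return math.log2(float((1 + v) / (1 + u)))

freq_11 = block_frequency([1, 1])  # interval (1/2, 2/3): log2(10/9)
freq_1 = block_frequency([1])      # interval (1/2, 1):   log2(4/3)
```

Summing `block_frequency([i, j])` over $j$ recovers `block_frequency([i])`, in line with the remark in the text that the block frequencies form a probability distribution.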
It is important to note that the asymptotic relative digit frequencies as well as the asymptotic relative $m$-digit block frequencies, $m \ge 2$, constitute probability distributions on $\mathbf{N}_+$ respectively $\mathbf{N}_+^m$. This is quite easily checked in the first case and not so easily in the second one (induction on $m$!). Actually, this follows from (4.1.1) on account of the countable additivity of the integral there with respect to the integrand.
We now give other results related to asymptotic relative digit frequencies.
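Corollary 4.1.3 (Asymptotic relative frequencies of digits between two given values) For any $i, j \in \mathbf{N}_+$ such that $i \le j$ we have
\[
\lim_{n \to \infty} \frac{\mathrm{card}\{\kappa : i \le a_\kappa \le j, \ 1 \le \kappa \le n\}}{n} = \frac{1}{\log 2} \log \frac{(i + 1)(j + 1)}{i(j + 2)} \quad \text{a.e.}
\]
More precisely, whatever $\varepsilon > 0$, for any $i, j \in \mathbf{N}_+$ such that $i \le j$ we have
\[
\frac{\mathrm{card}\{\kappa : i \le a_\kappa \le j, \ 1 \le \kappa \le n\}}{n} = \frac{1}{\log 2} \log \frac{(i + 1)(j + 1)}{i(j + 2)} + o\left( n^{-\frac{1}{2}} \log^{\frac{3 + \varepsilon}{2}} n \right) \quad \text{a.e.}
\]
as $n \to \infty$.
This is a direct consequence of Proposition 4.1.1, which can be also obtained from (4.1.1) by taking $f = I_{(i \le a_1 \le j)}$.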
Proposition 4.1.4 (Asymptotic relative frequencies of digits exceeding a given value) For any $i \in \mathbf{N}_+$ we have
\[
\lim_{n \to \infty} \frac{\mathrm{card}\{\kappa : a_\kappa \ge i, \ 1 \le \kappa \le n\}}{n} = \frac{1}{\log 2} \log \frac{i + 1}{i} \quad \text{a.e.}
\]
More precisely, whatever $\varepsilon > 0$, for any $i \in \mathbf{N}_+$ we have
\[
\frac{\mathrm{card}\{\kappa : a_\kappa \ge i, \ 1 \le \kappa \le n\}}{n} = \frac{1}{\log 2} \log \frac{i + 1}{i} + o\left( n^{-\frac{1}{2}} \log^{\frac{3 + \varepsilon}{2}} n \right) \quad \text{a.e.}
\]
as $n \to \infty$.
The proof is quite similar to that of Proposition 4.1.1. In (4.1.1) we should take $f = I_{(a_1 \ge i)}$. □
Let us note that on account of the complete additivity of the asymptotic relative digit frequencies, the first half of Proposition 4.1.4 is a direct consequence of the first half of Proposition 4.1.1.
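The way the limits in Corollary 4.1.3 and Proposition 4.1.4 arise from the single-digit frequencies of Proposition 4.1.1 is a telescoping product, which can be written out as follows.

```latex
% Single-digit frequency: 1 + 1/(k(k+2)) = (k+1)^2 / (k(k+2)).
\sum_{k=i}^{j} \frac{1}{\log 2}\log\frac{(k+1)^2}{k(k+2)}
  = \frac{1}{\log 2}\log\prod_{k=i}^{j}\frac{k+1}{k}\cdot\frac{k+1}{k+2}
  = \frac{1}{\log 2}\log\Bigl(\frac{j+1}{i}\cdot\frac{i+1}{j+2}\Bigr)
  = \frac{1}{\log 2}\log\frac{(i+1)(j+1)}{i(j+2)} .
% Letting j \to \infty, (j+1)/(j+2) \to 1, which gives
\sum_{k \ge i} \frac{1}{\log 2}\log\frac{(k+1)^2}{k(k+2)}
  = \frac{1}{\log 2}\log\frac{i+1}{i} ,
% and for i = 1 the total mass is \log 2 / \log 2 = 1.
```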
Now, let $m \in \mathbf{N}_+$ be such that $m \ge 2$, and fix arbitrarily an $\ell \in \mathbf{N}_+$ not exceeding $m$. It then follows from Proposition 4.1.1 that
\[
\lim_{n \to \infty} \frac{\mathrm{card}\{\kappa : a_\kappa \equiv \ell \bmod m, \ 1 \le \kappa \le n\}}{n}
= \frac{1}{\log 2} \sum_{p=0}^{\infty} \log \frac{(\ell + pm + 1)^2}{(\ell + pm)(\ell + pm + 2)} \quad \text{a.e.}
\]
[By taking $f = I_{(a_1 \equiv \ell \bmod m)}$ in (4.1.1), an estimate of the convergence rate can be also obtained.] It has been shown that the sum of the series above can be expressed in terms of Euler's Gamma-function. To be precise, the following result holds.
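Proposition 4.1.5 [Nolte (1990)] We have
\[
\frac{1}{\log 2} \sum_{p=0}^{\infty} \log \frac{(\ell + pm + 1)^2}{(\ell + pm)(\ell + pm + 2)}
= \frac{1}{\log 2} \log\left( \frac{\Gamma\left( \frac{\ell}{m} \right) \Gamma\left( \frac{\ell + 2}{m} \right)}{\Gamma^2\left( \frac{\ell + 1}{m} \right)} \right).
\]
The proof rests on a special case of a result from Whittaker and Watson (1927, Section 12.13), which reads as follows.
Let $\alpha_i, \beta_i \in \mathbf{C} \setminus \mathbf{N}_+$, $1 \le i \le r$, for a given $r \in \mathbf{N}_+$. Then the infinite product
\[
\prod_{n \in \mathbf{N}_+} \frac{(n - \alpha_1)(n - \alpha_2) \cdots (n - \alpha_r)}{(n - \beta_1)(n - \beta_2) \cdots (n - \beta_r)}
\]
converges if and only if $\sum_{i=1}^{r} \alpha_i = \sum_{i=1}^{r} \beta_i$. If this condition is fulfilled, then
\[
\prod_{n \in \mathbf{N}_+} \frac{(n - \alpha_1)(n - \alpha_2) \cdots (n - \alpha_r)}{(n - \beta_1)(n - \beta_2) \cdots (n - \beta_r)} = \prod_{i=1}^{r} \frac{\Gamma(1 - \beta_i)}{\Gamma(1 - \alpha_i)}. \tag{4.1.2}
\]
□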
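For example, using the well-known relations $\Gamma(z) \Gamma(1 - z) = \pi / \sin \pi z$, $z \notin \mathbf{Z}$, and $\Gamma(z + 1) = z \Gamma(z)$, $z \notin -\mathbf{N}$, if we take $m = 2$ and $\ell = 1$ then we find that
\[
\lim_{n \to \infty} \frac{\mathrm{card}\{\kappa : a_\kappa \equiv 1 \bmod 2, \ 1 \le \kappa \le n\}}{n}
= \frac{1}{\log 2} \log \frac{\Gamma(1/2) \Gamma(3/2)}{\Gamma^2(1)} = \frac{\log \pi}{\log 2} - 1 = 0.6514 \cdots \quad \text{a.e.},
\]
i.e., about 65 % of the occurring digits are odd a.e.
Next, using the same relations for the function $\Gamma$, for $m = 4$ and $\ell = 1$ we find that
\[
\lim_{n \to \infty} \frac{\mathrm{card}\{\kappa : a_\kappa \equiv 1 \bmod 4, \ 1 \le \kappa \le n\}}{n}
= \frac{1}{\log 2} \log \frac{\Gamma(1/4) \Gamma(3/4)}{\Gamma^2(1/2)} = \frac{1}{2} \quad \text{a.e.},
\]
i.e., about half of the occurring digits are $\equiv 1 \bmod 4$ a.e.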
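The Gamma-function expression of Proposition 4.1.5 can be checked against a truncated version of the series. This is a minimal sketch using only the standard library; the function names and the truncation length are choices made for this illustration.

```python
import math

def residue_frequency_gamma(l, m):
    """Closed form: log2( Gamma(l/m) Gamma((l+2)/m) / Gamma((l+1)/m)^2 )."""
    g = math.gamma
    return math.log2(g(l / m) * g((l + 2) / m) / g((l + 1) / m) ** 2)

def residue_frequency_series(l, m, terms=200000):
    """Partial sum over digits k = l, l+m, l+2m, ... of the frequency of k."""
    total = 0.0
    for p in range(terms):
        k = l + p * m
        # (k+1)^2 / (k(k+2)) = 1 + 1/(k(k+2))
        total += math.log1p(1 / (k * (k + 2)))
    return total / math.log(2)

odd_digits = residue_frequency_gamma(1, 2)     # log(pi)/log(2) - 1
one_mod_four = residue_frequency_gamma(1, 4)   # 1/2
even_digits = residue_frequency_gamma(2, 2)    # 2 - log(pi)/log(2)
```

The case $\ell = m$ corresponds to digits divisible by $m$, so `odd_digits + even_digits` equals 1 up to rounding.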


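Similar considerations can be made about 2-digit blocks. For example, we have
\[
\lim_{n \to \infty} \frac{\mathrm{card}\{\kappa : (a_\kappa, a_{\kappa+1}) \equiv (0, 0) \bmod 2, \ 1 \le \kappa \le n\}}{n}
= \frac{1}{\log 2} \sum_{i \in \mathbf{N}_+} \sum_{j \in \mathbf{N}_+} \log \frac{(4ij + 1)(4ij + 2i + 2j + 2)}{(4ij + 2i + 1)(4ij + 2j + 1)} \quad \text{a.e.},
\]
which by (4.1.2) is equal to
\[
\frac{1}{\log 2} \sum_{i \in \mathbf{N}_+} \log \frac{\Gamma\left( 1 + \frac{2i + 1}{4i} \right) \Gamma\left( 1 + \frac{1}{4i + 2} \right)}{\Gamma\left( 1 + \frac{1}{4i} \right) \Gamma\left( 1 + \frac{i + 1}{2i + 1} \right)}.
\]
Nolte (op. cit.) proved that the last quantity can be expressed as
\[
\alpha + \frac{1}{\log 2} \sum_{n \ge 2} (-1)^n \, \frac{\zeta(n) - 1}{n} \left( (2^{2 - n} - 2^{2 - 2n} - 1)(\zeta(n) - 1) + \frac{2^{n - 1} - 1}{2^{2n - 2}} \right),
\]
where
\[
\alpha = \log 2 - 1 + \frac{2}{\log 2} \log\left( 6 \sqrt{2\pi} \right) - \frac{4}{\log 2} \log \Gamma\left( \frac{1}{4} \right) = 0.08167 \cdots .
\]
Setting $y = 2 - \log \pi / \log 2 = 0.3485 \ldots$, Nolte's computations show that
\[
\lim_{n \to \infty} \frac{\mathrm{card}\{\kappa : (a_\kappa, a_{\kappa+1}) \equiv (a, b) \bmod 2, \ 1 \le \kappa \le n\}}{n}
\]
is a.e. equal to

$z = 0.11694 \cdots$ for $(a, b) = (0, 0)$;
$y - z = 0.23156 \cdots$ for $(a, b) = (0, 1)$ or $(1, 0)$;
$1 - 2y + z = 0.41993 \cdots$ for $(a, b) = (1, 1)$.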
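The numerical values quoted from Nolte can be reproduced from the zeta-series above. The sketch below relies on a reading of the displayed formula, namely $z = \alpha + (\log 2)^{-1} \sum_{n \ge 2} (-1)^n \frac{\zeta(n) - 1}{n} \bigl( (2^{2-n} - 2^{2-2n} - 1)(\zeta(n) - 1) + (2^{n-1} - 1)/2^{2n-2} \bigr)$ with $\alpha = \log 2 - 1 + \frac{2}{\log 2} \log(6\sqrt{2\pi}) - \frac{4}{\log 2} \log \Gamma(1/4)$. That reading is an assumption of this example, as are the helper names and the simple Euler-Maclaurin tail used for $\zeta(n) - 1$.

```python
import math

def zeta_minus_one(s, cutoff=1000):
    """zeta(s) - 1 for s >= 2: direct sum below cutoff plus an Euler-Maclaurin tail."""
    head = sum(k ** -s for k in range(2, cutoff))
    tail = (cutoff ** (1 - s) / (s - 1)      # integral of x^{-s} over [cutoff, inf)
            + 0.5 * cutoff ** -s             # half of the first tail term
            + s * cutoff ** (-s - 1) / 12)   # first derivative correction
    return head + tail

def nolte_z(terms=60):
    """Frequency of 2-digit blocks with both digits even, via the zeta series."""
    log2 = math.log(2)
    alpha = (log2 - 1
             + 2 / log2 * math.log(6 * math.sqrt(2 * math.pi))
             - 4 / log2 * math.lgamma(0.25))
    series = 0.0
    for n in range(2, terms + 2):
        zm1 = zeta_minus_one(n)
        bracket = ((2.0 ** (2 - n) - 2.0 ** (2 - 2 * n) - 1) * zm1
                   + (2.0 ** (n - 1) - 1) / 2.0 ** (2 * n - 2))
        series += (-1) ** n * zm1 / n * bracket
    return alpha + series / log2

y = 2 - math.log(math.pi) / math.log(2)  # frequency of even digits
z = nolte_z()                            # (even, even)
mixed = y - z                            # (even, odd) and (odd, even)
odd_odd = 1 - 2 * y + z                  # (odd, odd)
```

The four block frequencies `z`, `mixed`, `mixed`, `odd_odd` sum to 1 by construction, which matches the remark that block frequencies form a probability distribution.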

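Actually, all the results we have proved so far are special cases of the following result.
Proposition 4.1.6 Given $m \in \mathbf{N}_+$, let $H : \mathbf{N}_+^m \to \mathbf{R}$ be such that
\[
\sum_{i^{(m)} \in \mathbf{N}_+^m} |H(i^{(m)})| \left( v(i^{(m)}) - u(i^{(m)}) \right) < \infty
\]
[which is equivalent to $\mathrm{E}_\gamma |H(a_1, \cdots, a_m)| < \infty$]. Then we have
\[
\lim_{n \to \infty} \frac{1}{n} \sum_{\kappa=0}^{n-1} H(a_\kappa, \cdots, a_{\kappa+m-1}) = \alpha_m \quad \text{a.e.},
\]
where
\[
\alpha_m = \frac{1}{\log 2} \sum_{i^{(m)} \in \mathbf{N}_+^m} H(i^{(m)}) \log \frac{1 + v(i^{(m)})}{1 + u(i^{(m)})}.
\]
If, in addition,
\[
\mathrm{E}_\lambda H^2(a_1, \cdots, a_m) = \sum_{i^{(m)} \in \mathbf{N}_+^m} H^2(i^{(m)}) \left( v(i^{(m)}) - u(i^{(m)}) \right) < \infty
\]
[which is equivalent to $\mathrm{E}_\gamma H^2(a_1, \cdots, a_m) < \infty$], then whatever $\varepsilon > 0$ we have
\[
\frac{1}{n} \sum_{\kappa=0}^{n-1} H(a_\kappa, \cdots, a_{\kappa+m-1}) = \alpha_m + o\left( n^{-\frac{1}{2}} \log^{(3 + \varepsilon)/2} n \right) \quad \text{a.e.}
\]
as $n \to \infty$.
For the proof this time the choice of $f$ in (4.1.1) is
\[
f(\omega) = H(a_1(\omega), \cdots, a_m(\omega)), \quad \omega \in \Omega,
\]
while Corollaries 1.3.15 and A3.3 should be also invoked. □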


Remark. A generalization of the second half of Proposition 4.1.6 was given by Philipp (1967). It allows the integer $m$ to vary in relation to $n$, and reads as follows.
Proposition 4.1.7 Let $H : \bigcup_{m \in \mathbf{N}_+} \mathbf{N}_+^m \to \mathbf{R}$ be such that
\[
\mathrm{E}_\lambda H^2(a_1, \cdots, a_m) < \infty
\]
for any $m \in \mathbf{N}_+$. Whatever $\varepsilon > 0$, if $2^m \le n < 2^{m+1}$ then

1X
n−1 ³ 1 ´
H(aκ , · · · , aκ+m−1 ) = αm + o n− 2 αm
2
log2+ε n a.e.
n
κ=0

as n → ∞. 2
We shall now consider other important special cases of Proposition 4.1.6. With $m = 1$ and
\[
H(i) = H_p(i) =
\begin{cases}
i^p & \text{if } p < 1, \ p \ne 0, \\
\log i & \text{if } p = 0
\end{cases}
\]
for $i \in \mathbf{N}_+$, we obtain the following results.


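Proposition 4.1.8 We have
\[
\lim_{n \to \infty} (a_1 \cdots a_n)^{1/n} = K_0 \quad \text{a.e.}
\]
and
\[
\lim_{n \to \infty} \left( \frac{a_1^p + \cdots + a_n^p}{n} \right)^{1/p} = K_p \quad \text{a.e.}
\]
for any $p < 1$, $p \ne 0$, where
\[
K_0 = \prod_{i \in \mathbf{N}_+} \left( 1 + \frac{1}{i(i + 2)} \right)^{\log i / \log 2}
= \exp\left( \frac{1}{\log 2} \int_0^1 \frac{\log \lfloor 1/t \rfloor}{1 + t} \, dt \right) = 2.685452 \cdots
\]
and
\[
K_p = \left( \frac{1}{\log 2} \sum_{i \in \mathbf{N}_+} i^p \log\left( 1 + \frac{1}{i(i + 2)} \right) \right)^{1/p}
= \left( \frac{1}{\log 2} \int_0^1 \frac{(\lfloor 1/t \rfloor)^p}{1 + t} \, dt \right)^{1/p}.
\]
In particular,
$K_{-1} = 1.745405 \cdots$, $K_{-2} = 1.450340 \cdots$, $K_{-3} = 1.313507 \cdots$,
$K_{-4} = 1.236961 \cdots$, $K_{-5} = 1.189003 \cdots$, $K_{-6} = 1.156552 \cdots$,
$K_{-7} = 1.133323 \cdots$, $K_{-8} = 1.115964 \cdots$, $K_{-9} = 1.102543 \cdots$,
$K_{-10} = 1.091877 \cdots$.
More precisely, whatever $\varepsilon > 0$ we have
\[
(a_1 \cdots a_n)^{1/n} = K_0 + o\left( n^{-\frac{1}{2}} \log^{\frac{3 + \varepsilon}{2}} n \right) \quad \text{a.e.}
\]
as $n \to \infty$, and
\[
\left( \frac{a_1^p + \cdots + a_n^p}{n} \right)^{1/p} = K_p + o\left( n^{-\frac{1}{2}} \log^{\frac{3 + \varepsilon}{2}} n \right) \quad \text{a.e.}
\]
for any $p < 1/2$, $p \ne 0$, as $n \to \infty$.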
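The constants $K_0$, $K_{-1}$ and $K_{-2}$ can be approximated directly from the series in Proposition 4.1.8. The sketch below is a plain truncation, not one of the rapidly converging methods cited in the text. The tail correction $(\log N + 1)/N \approx \int_N^\infty x^{-2} \log x \, dx$ used for $K_0$ is an approximation introduced for this example, and `K` is meant for negative $p$, where the series converges quickly.

```python
import math

def log_K0(N=200000):
    """log K_0 from the series sum_i log(i) log(1 + 1/(i(i+2))) / log 2."""
    s = sum(math.log(i) * math.log1p(1 / (i * (i + 2))) for i in range(2, N + 1))
    tail = (math.log(N) + 1) / N  # approximate contribution of the terms i > N
    return (s + tail) / math.log(2)

def K(p, N=200000):
    """K_p for negative p, from the truncated series for K_p^p."""
    s = sum(i ** p * math.log1p(1 / (i * (i + 2))) for i in range(1, N + 1))
    return (s / math.log(2)) ** (1 / p)

K0 = math.exp(log_K0())  # geometric mean constant, about 2.685452
K_minus_1 = K(-1)        # harmonic mean constant, about 1.745405
K_minus_2 = K(-2)        # about 1.450340
```

For $0 < p < 1$ the terms decay only like $i^{p-2}$, so a plain truncation of this kind would need a tail correction as well.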
The cases $p = 0$ and $p = -1$, leading to the asymptotic a.e. values $K_0$ and $K_{-1}$ of the geometric, respectively harmonic, mean of the first $n$ incomplete quotients as $n \to \infty$, were studied by Khintchine (1934/35). Ever since its discovery much effort has been put into the numerical evaluation of $K_0$. See Lehmer (1939), Pedersen (1959), Shanks and Wrench, Jr. (1959), Wrench, Jr. (1960). In the last reference $K_0$ has been evaluated to 155 decimal places. Recently, using work by Wrench, Jr. and Shanks (1996), Bailey et al. (1997) have presented rapidly converging series for any $K_p$, $p < 1$, allowing them to evaluate $K_0$ and $K_{-1}$ to 7,350 decimal places and $K_p$ for $p = -2, -3, \cdots, -10$ to 50 decimal places. Setting
\[
\zeta(s, n) = \zeta(s) - \sum_{i=1}^{n} i^{-s}, \quad s > 1, \ n \in \mathbf{N}_+,
\]
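the following identities hold:

(i) for any $n \in \mathbf{N}_+$ we have
\[
\log K_0 = \frac{1}{\log 2} \left( \sum_{i \in \mathbf{N}_+} \frac{A_i}{i} \, \zeta(2i, n) - \sum_{2 \le i \le n} \log\left( 1 - \frac{1}{i} \right) \log\left( 1 + \frac{1}{i} \right) \right),
\]
where
\[
A_i = \sum_{\kappa=1}^{2i-1} (-1)^{\kappa - 1} / \kappa, \quad i \in \mathbf{N}_+;
\]

(ii) whatever the negative integer $p$, for any $n \in \mathbf{N}_+$ we have
\[
K_p^p = \frac{1}{\log 2} \left( \sum_{i \in \mathbf{N}_+} \frac{1}{i} \sum_{j \in \mathbf{N}} \binom{j - p - 1}{-p - 1} \zeta(2i + j - p, n) - \sum_{2 \le i \le n} (i - 1)^p \log\left( 1 - \frac{1}{i^2} \right) \right);
\]

(iii) in particular, for any $n \in \mathbf{N}_+$ we have
\[
\frac{1}{K_{-1}} = \frac{1}{\log 2} \left( \sum_{i \in \mathbf{N}_+} \frac{n^{-1} - \sum_{j=2}^{2i} \zeta(j, n)}{i} - \sum_{2 \le i \le n} \frac{\log(1 - i^{-2})}{i - 1} \right).
\]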
Clearly, for $n = 1$ the sums $\sum_{2 \le i \le n}$ occurring above are empty, thus zero, so that both $K_p^p \log 2$ whatever the negative integer $p$ and $(\log K_0)(\log 2)$ can be cast in terms of series involving values of the Riemann zeta function and rationals.
From (i) above, the elegant integral representation
\[
\log K_0 = -\frac{1}{\log 2} \int_0^1 \frac{\log[\sin(\pi t) / \pi t]}{t(t + 1)} \, dt
\]
can be derived. Let us note that we also have
\[
\log K_0 = \log 2 + \frac{1}{\log 2} \int_0^1 \frac{\log[\pi t (1 - t^2) / \sin \pi t]}{t(t + 1)} \, dt,
\]
as shown in Shanks and Wrench, Jr. (1959). Actually, the second equation for $\log K_0$ follows from the first one since
\[
\int_0^1 \frac{\log(1 - t^2)}{t(t + 1)} \, dt = -\log^2 2.
\]
See Bailey et al. (op. cit. p. 419).


Remarks. 1. Whatever $p \in \mathbf{R}$ the series $\sum_{i \in \mathbf{N}_+} a_i^p$ is divergent a.e. For $p < 0$ the assertion follows immediately from Proposition 4.1.8 while for $p \ge 0$ it is obvious since in this case clearly $\sum_{i=1}^{n} a_i^p \ge n$, $n \in \mathbf{N}_+$. For $p < 0$ arbitrarily large in absolute value this might seem strange at first sight. Actually, things are quite natural since by Proposition 4.1.1 any digit $i \in \mathbf{N}_+$ occurs a.e. infinitely often (and thus there is no need to invoke Proposition 4.1.8).
2. It has been proved by Šalát (1969, 1984) that from a topological standpoint the sets of probability 1 in Propositions 4.1.1 and 4.1.8 (for $p = 0$) are only of the first Baire category, i.e., they are countable unions of nowhere dense subsets of $I$.
3. A set which is 'small' in the measure theoretical sense can be quite 'large' from the point of view of topology. Consider, for example, the set $E_2$ of all numbers in $[0, 1)$ whose RCF digits are 1 or 2. It is a trivial consequence of Proposition 4.1.1 that $\lambda(E_2) = \gamma(E_2) = 0$. On the other hand, it is also clear that $E_2$ has the power of the continuum.
To express the ‘topological size’ of sets like E2 the concepts of Hausdorff
measure and Hausdorff dimension are suitable. We first recall their formal
definitions and then outline two applications of these concepts to continued
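fractions. Given a subset $E$ of $\mathbb{R}^n$, for any $\varepsilon, \delta > 0$ put
\[
  H_\varepsilon^\delta(E) = \inf_{\mathcal{U}} \Bigl\{ \sum_i \operatorname{diam}(U_i)^\delta \Bigr\},
\]
where the infimum is taken over all open coverings $\mathcal{U} = \{U_i\}_i$ of $E$ such that
$\operatorname{diam}(U_i) \le \varepsilon$. The Hausdorff measure $H^\delta(E)$ and the Hausdorff dimension
$\dim_H(E)$ of $E$ are then defined as
\[
  H^\delta(E) = \lim_{\varepsilon \to 0} H_\varepsilon^\delta(E), \qquad
  \dim_H(E) = \inf\bigl\{ \delta : H^\delta(E) = 0 \bigr\}.
\]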

See Falconer (1986, 1990), Harman (1998), and Rogers (1998).


It follows from Proposition 1.1.1 (see also Corollary 4.1.30) that for any
$\omega \in \Omega$ the inequality
\[
  \Bigl| \omega - \frac{p}{q} \Bigr| < \frac{1}{q^2}
\]
has infinitely many solutions in integers $p, q \in \mathbb{N}_+$ with g.c.d. $(p, q) = 1$.
Let then $M_c$ denote the set of all $x \in [0, 1)$ satisfying
\[
  \Bigl| x - \frac{p}{q} \Bigr| < \frac{1}{q^c}
\]
for infinitely many pairs $(p, q)$ of positive integers. Clearly, if $c \le 2$ then
$M_c = [0, 1)$, but what happens when $c > 2$? It is fairly easy to show that
$\lambda(M_c) = 0$ for $c > 2$. On the other hand, V. Jarník proved in 1929 that
$\dim_H(M_c) = 2/c$ for any $c > 2$. A simplified proof of this result can be
found in Falconer (1990, p. 142).
Using iterated function systems (IFS)—which is another name for depen-
dence with complete connections—it is possible to calculate the Hausdorff
dimension of sets defined by number-theoretic properties. For instance, the
set E2 just defined is the attractor of the IFS consisting of the two (non-
linear) contractions
\[
  u_1(x) = \frac{1}{1+x} \quad\text{and}\quad u_2(x) = \frac{1}{2+x} .
\]
It was first shown by Jarník that $\frac{1}{3} \le \dim_H(E_2) \le \frac{2}{3}$, but Jenkinson and
Pollicott (2001) found that

dimH (E2 ) = 0.53128 05062 77205 14162 44686 · · · ,

an approximation accurate to 25 decimal places, which improves earlier
estimates of Hensley (1996). A striking feature of Jenkinson and Pollicott’s
method is that successive approximations of $\dim_H(E_2)$ converge at a
super-exponential rate. Their method can also be used to efficiently compute the
Hausdorff dimension of other sets consisting of numbers whose RCF digits
are constrained to belong to any given finite subset of $\mathbb{N}_+$. □
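To get a feeling for the value of $\dim_H(E_2)$, one can experiment with a much cruder device than the method of Jenkinson and Pollicott. The sketch below is not their algorithm: it is a heuristic, pressure-type estimate which assumes that the dimension is well approximated by the exponent $s$ at which the sums of $s$-th powers of the lengths $1/(q_n(q_n + q_{n-1}))$ of the fundamental intervals with digits in $\{1, 2\}$ stop changing between two consecutive depths. The function names and the depth are illustrative assumptions.

```python
def interval_lengths(digits, depth):
    """Lengths 1/(q_n (q_n + q_{n-1})) of all depth-n fundamental intervals
    whose RCF digits are taken from `digits`."""
    states = [(1, 0)]  # pairs (q_k, q_{k-1}), starting from q_0 = 1, q_{-1} = 0
    for _ in range(depth):
        states = [(a * q + q_prev, q) for (q, q_prev) in states for a in digits]
    return [1.0 / (q * (q + q_prev)) for (q, q_prev) in states]

def dimension_estimate(digits=(1, 2), depth=10):
    """Heuristic estimate: the exponent s at which the ratio of the power sums
    at depths depth + 1 and depth equals 1, located by bisection on (0, 1)."""
    coarse = interval_lengths(digits, depth)
    fine = interval_lengths(digits, depth + 1)

    def ratio(s):
        return sum(x ** s for x in fine) / sum(x ** s for x in coarse)

    low, high = 0.0, 1.0  # ratio(0) = len(digits) > 1, ratio(1) < 1
    for _ in range(50):
        mid = 0.5 * (low + high)
        if ratio(mid) > 1.0:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)

estimate = dimension_estimate()
print(estimate)
```

Passing another finite digit set, for instance `digits=(1, 3)`, gives the analogous heuristic estimate for that set; no claim is made here about how many decimals are reliable.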
The case $p = 1$ is not settled by Proposition 4.1.8. For $H(i) = i$, $i \in \mathbb{N}_+$,
the series
\[
  \sum_{i \in \mathbb{N}_+} |H(i)|\,(v(i) - u(i)) = \sum_{i \in \mathbb{N}_+} \frac{i}{i(i+1)}
  = \sum_{i \in \mathbb{N}_+} \frac{1}{i+1}
\]
is divergent. In this case $E_\gamma H(a_1) = \infty$; nevertheless, we have
\[
  \lim_{n \to \infty} \frac{a_1 + \cdots + a_n}{n} = \infty \quad \text{a.e.}
\]
Before proving this (see Corollary 4.1.10 and Remark 1 following it) let us
recall that in Subsection 3.3.2 we noted that, writing $t_n = a_1 + \cdots + a_n$,
$n \in \mathbb{N}_+$, $t_n/(n \log n)$ converges in $\mu$-probability to $1/\log 2$ as $n \to \infty$ for
any $\mu \in \mathrm{pr}(\mathcal{B}_I)$ such that $\mu \ll \lambda$. It follows that $t_{n_\kappa}/(n_\kappa \log n_\kappa)$ converges
a.e. to $1/\log 2$ as $\kappa \to \infty$, where $(n_\kappa)_{\kappa \in \mathbb{N}_+}$ is some sequence of positive
integers with $\lim_{\kappa \to \infty} n_\kappa = \infty$. Hence $t_{n_\kappa}/n_\kappa$ converges a.e. to $\infty$ as $\kappa \to \infty$.
Thus $\limsup_{n \to \infty} t_n/n = \infty$ a.e. and it remains to show that lim sup can be
replaced by lim. Actually, we shall prove much more.
Theorem 4.1.9 [Diamond and Vaaler (1986)] We have
\[
  t_n = \frac{1 + o(1)}{\log 2}\, n \log n + \theta_n \max_{1 \le i \le n} a_i \quad \text{a.e.}
\]
as $n \to \infty$, where $\theta_n$ is an $I$-valued random variable for any $n \in \mathbb{N}_+$.


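Proof. Given $\varepsilon > 0$ and $n \in \mathbb{N}_+$ set
\[
  a_i' = a_i\, I_{(a_i \le h(n))}, \qquad 1 \le i \le n,
\]
where $h : \mathbb{N}_+ \to \mathbb{R}$ is defined by $h(n) = n \log^{\frac{1}{2}+\varepsilon} n$, and $t_n' = a_1' + \cdots + a_n'$.
Then
\[
  E_\gamma t_n' = \frac{n}{\log 2} \sum_{j=1}^{\lfloor h(n) \rfloor} j \log\Bigl(1 + \frac{1}{j(j+2)}\Bigr)
  = \frac{n}{\log 2} \sum_{j=1}^{\lfloor h(n) \rfloor} \frac{1}{j}\,(1 + o(1))
  = n \log \lfloor h(n) \rfloor\, (1 + o(1))/\log 2
\]
as $n \to \infty$. By Corollaries 1.3.15 and A3.2 we have
\[
  \mathrm{Var}_\gamma\, t_n' = O(n\, \mathrm{Var}_\gamma\, t_1') = O(n\, E_\gamma (t_1')^2)
\]
as $n \to \infty$. But
\[
  E_\gamma (t_1')^2 = \frac{1}{\log 2} \sum_{j=1}^{\lfloor h(n) \rfloor} j^2 \log\Bigl(1 + \frac{1}{j(j+2)}\Bigr)
  = \lfloor h(n) \rfloor\, (1 + o(1))/\log 2
\]
as $n \to \infty$. Therefore $\mathrm{Var}_\gamma\, t_n' = O(n \lfloor h(n) \rfloor)$ as $n \to \infty$.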


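Now, consider the sequence $(n_\kappa)_{\kappa \in \mathbb{N}_+}$ defined as
\[
  n_\kappa = \lfloor \exp \kappa^{1-\varepsilon} \rfloor, \qquad \kappa \in \mathbb{N}_+ .
\]
Note that
\[
  n_{\kappa-1} = \bigl(1 + O(\kappa^{-\varepsilon})\bigr)\, n_\kappa
\]
as $\kappa \to \infty$ so that $n_{\kappa-1}/n_\kappa$ and $h(n_{\kappa-1})/h(n_\kappa)$ both converge to 1 as $\kappa \to \infty$.
By the choice of the $n_\kappa$ it is obvious that the series with general term
\[
  \frac{E_\gamma (t_{n_\kappa}' - E_\gamma t_{n_\kappa}')^2}{n_\kappa\, h(n_\kappa)\, \kappa^{1+\varepsilon}}, \qquad \kappa \in \mathbb{N}_+ ,
\]
is convergent. Hence by Beppo Levi’s theorem the random series with general
term
\[
  \frac{(t_{n_\kappa}' - E_\gamma t_{n_\kappa}')^2}{n_\kappa\, h(n_\kappa)\, \kappa^{1+\varepsilon}}, \qquad \kappa \in \mathbb{N}_+ ,
\]
is convergent a.e. Therefore
\[
  |t_{n_\kappa}' - E_\gamma t_{n_\kappa}'| = o\Bigl( n_\kappa\, \kappa^{(1+\varepsilon)/2} \log^{(1+2\varepsilon)/4} n_\kappa \Bigr) \quad \text{a.e.}
\]
as $\kappa \to \infty$. Now, it is easy to check that
\[
  n_\kappa\, \kappa^{(1+\varepsilon)/2} \log^{(1+2\varepsilon)/4} n_\kappa
  = O\Bigl( \frac{E_\gamma t_{n_\kappa}'}{\log^{\varepsilon/3} n_\kappa} \Bigr)
  = o\bigl( E_\gamma t_{n_\kappa}' \bigr) \quad \text{a.e.}
\]
as $\kappa \to \infty$ provided that $\varepsilon < 0.126$. Thus
\[
  t_{n_\kappa}' = (1 + o(1))\, E_\gamma t_{n_\kappa}' \quad \text{a.e.}
\]
as $\kappa \to \infty$.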
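Next, for any $n \in \mathbb{N}_+$ satisfying $n_{\kappa-1} < n \le n_\kappa$ for some $\kappa \in \mathbb{N}_+$ we
clearly have
\[
  t_{n_{\kappa-1}}' \le t_n' \le t_{n_\kappa}' ,
\]
so that
\[
  (1 + o(1))\, E_\gamma t_{n_{\kappa-1}}' \le t_n' \le (1 + o(1))\, E_\gamma t_{n_\kappa}' \quad \text{a.e.}
\]
as $\kappa \to \infty$. On account of the properties already noted of the sequence
$(n_\kappa)_{\kappa \in \mathbb{N}_+}$ we easily obtain
\[
  t_n' = (1 + o(1))\, E_\gamma t_n' \quad \text{a.e.}
\]
as $n \to \infty$, and since
\[
  n \log \lfloor h(n) \rfloor - n \log n = o(n \log n)
\]
as $n \to \infty$, we can also write
\[
  t_n' = (1 + o(1))\, \frac{n \log n}{\log 2} \quad \text{a.e.} \tag{4.1.3}
\]
as $n \to \infty$.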
To complete the proof we shall show that a.e. there exist at most finitely
many integers $n \in \mathbb{N}_+$ for which the inequalities
\[
  a_i > h(n), \qquad a_j > h(n)
\]
hold for two distinct indices $i, j \le n$. To proceed fix $i < j$. It follows from
Corollary 1.3.15 that
\[
  \gamma(a_i > h(n),\, a_j > h(n)) = O\bigl(\gamma(a_i > h(n))\, \gamma(a_j > h(n))\bigr)
  = O\bigl(\gamma^2(a_1 > h(n))\bigr) = O\bigl((h(n))^{-2}\bigr)
  = O\bigl(n^{-2} (\log n)^{-1-2\varepsilon}\bigr)
\]
as $n \to \infty$. Hence the probability of the random event
\[
  (a_i > h(n),\ a_j > h(n) \text{ for distinct indices } i, j \le 2n)
\]
is of order at most $(\log n)^{-1-2\varepsilon}$. For $\kappa \in \mathbb{N}_+$ let
\[
  E_\kappa = \bigcup_{\ell \ge \kappa} \bigl(a_i > h(2^\ell),\ a_j > h(2^\ell) \text{ for distinct indices } i, j \le 2^{\ell+1}\bigr).
\]
Then $\gamma(E_\kappa) = O\bigl(\sum_{\ell \ge \kappa} \ell^{-1-2\varepsilon}\bigr) \to 0$ as $\kappa \to \infty$. It is now clear that for
$\omega \notin E_\kappa$ and $n > 2^{\kappa+1}$ there exists at most one index $i \le n$ for which
$a_i(\omega) > h(n)$.
Consequently, we can assert that
\[
  0 \le t_n - t_n' \le \max_{1 \le i \le n} a_i \quad \text{a.e.} \tag{4.1.4}
\]
for all sufficiently large $n$. By (4.1.3) and (4.1.4) the proof is complete. □
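The two $(1 + o(1))$ evaluations of the truncated moments used in the proof above can be checked by a short expansion. The following is a sketch of the missing step; the abbreviation $m = \lfloor h(n) \rfloor$ is introduced here only for convenience.

```latex
% For j >= 1 we have 0 < 1/(j(j+2)) <= 1/3, and log(1+x) = x + O(x^2) as x -> 0, hence
\[
  \log\Bigl(1 + \frac{1}{j(j+2)}\Bigr) = \frac{1}{j(j+2)} + O(j^{-4}) .
\]
% First moment: multiply by j and sum over j <= m.
\[
  j \log\Bigl(1 + \frac{1}{j(j+2)}\Bigr) = \frac{1}{j+2} + O(j^{-3}) = \frac{1}{j} + O(j^{-2}),
  \qquad
  \sum_{j=1}^{m} j \log\Bigl(1 + \frac{1}{j(j+2)}\Bigr) = \log m + O(1) .
\]
% Second moment: multiply by j^2 and sum over j <= m.
\[
  j^2 \log\Bigl(1 + \frac{1}{j(j+2)}\Bigr) = \frac{j}{j+2} + O(j^{-2}) = 1 + O(j^{-1}),
  \qquad
  \sum_{j=1}^{m} j^2 \log\Bigl(1 + \frac{1}{j(j+2)}\Bigr) = m + O(\log m) .
\]
```

With $m = \lfloor h(n) \rfloor \to \infty$ both remainders are of smaller order than the main terms, which is what the factors $(1 + o(1))$ express.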
Remarks. 1. It is now clear from the above theorem and Proposition
3.1.7 why tn /n log n converges in probability, rather than a.e., to 1/ log 2 as
n → ∞. The obstacle to a.e. convergence is the occurrence of a single large
value of the digits. At the same time, a.e. convergence can be obtained by
excluding at most one summand.
2. It is interesting to compare Theorems 3.3.4 and 4.1.9 (see also
Corollary 3.1.11). □
Corollary 4.1.10 Whatever $0 \le \varepsilon < 1$ we have
\[
  \lim_{n \to \infty} \frac{a_1 + \cdots + a_n}{n (\log n)^\varepsilon} = \infty \quad \text{a.e.}
\]

Remarks. 1. The equation
\[
  \lim_{n \to \infty} \frac{a_1 + \cdots + a_n}{n} = \infty \quad \text{a.e.}
\]
can also be derived from a slight generalization of equation (4.1.1). Hartman
(1951) proved that if $f : I \to \mathbb{R}_+$ is measurable and $\int_I f\, d\lambda = \infty$, then the
limit in (4.1.1) exists and is equal to $\infty$ a.e. The equation above then
follows by taking $f(\omega) = a_1(\omega)$, $\omega \in \Omega$. It is interesting to note that if we
take $f(\omega) = a_2(\omega)/a_1(\omega)$ or $f(\omega) = a_1(\omega)/a_2(\omega)$, $\omega \in \Omega$, then we obtain
\[
  \lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} \frac{a_{i+1}}{a_i}
  = \lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} \frac{a_i}{a_{i+1}} = \infty \quad \text{a.e.}
\]

2. Salem (1943) proved that the celebrated Minkowski’s ? function can
be expressed in terms of the $t_n$, $n \in \mathbb{N}$, as
\[
  ?(x) = \sum_{i \in \mathbb{N}_+} (-1)^{i-1}\, 2^{1 - t_i(x)}
\]
for any $x \in I$, if we consider that $a_i(x) = \infty$ for any large enough $i \in \mathbb{N}_+$
when $x \in I \setminus \Omega$. It is known that ? is a strictly increasing singular function,
that is, $?'(x) = 0$ a.e. in $I$. Recently, Viader et al. (1998) have shown that
\[
  \Bigl( x \in I : \lim_{n \to \infty} \frac{t_n(x)}{n} = \infty \Bigr) \cap \bigl( x \in I : ?'(x) \text{ exists finitely} \bigr)
  \subset \bigl( x \in I : ?'(x) = 0 \bigr),
\]
thus making more precise the set where the derivative of ? vanishes.
Note that the sequence $(a_n)_{n \in \mathbb{N}_+}$ is i.i.d. with common $\mu$-distribution
$(2^{-m} : m \in \mathbb{N}_+)$ under the probability measure $\mu$ induced by ? on $\mathcal{B}_I$. Cf.
Lagarias (1992, p. 45).
3. Vardi (1995, 1997) discussed an interesting relationship between the
St. Petersburg game [see, e.g., Feller (1968, X.4)] and the sequence (an )n∈N+ ,
on account of the properties of the sequence (tn )n∈N+ . That game is a well
known example of a sequence of independent identically distributed random
variables with infinite mean value, and was considered as a paradox since no
‘fair’ entry fee exists. It appears that $(a_n)_{n \in \mathbb{N}_+}$ makes a reasonable choice
of entry fees for the St. Petersburg game. □
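Salem's series from Remark 2 is easy to evaluate exactly at rational points, where the convention $a_i(x) = \infty$ beyond the last digit turns it into a finite sum. The sketch below assumes that convention; the function names are illustrative, and exact rational arithmetic from Python's standard `fractions` module is used to avoid rounding.

```python
from fractions import Fraction

def finite_rcf_digits(x):
    """RCF digits of a rational x in [0, 1), obtained by iterating the map
    x -> 1/x - floor(1/x) in exact arithmetic until 0 is reached."""
    digits = []
    while x != 0:
        inverse = 1 / x
        digit = inverse.numerator // inverse.denominator
        digits.append(digit)
        x = inverse - digit
    return digits

def minkowski_question_mark(x):
    """Salem's series sum_i (-1)^(i-1) 2^(1 - t_i(x)), with t_i = a_1 + ... + a_i;
    for rational x only finitely many terms are nonzero."""
    total, partial_sum = Fraction(0), 0
    for index, digit in enumerate(finite_rcf_digits(x), start=1):
        partial_sum += digit
        total += Fraction((-1) ** (index - 1), 2 ** (partial_sum - 1))
    return total

for point in (Fraction(1, 2), Fraction(1, 3), Fraction(2, 3), Fraction(2, 5)):
    print(point, "->", minkowski_question_mark(point))
```

For example, $2/5 = [2, 2]$ gives $t_1 = 2$, $t_2 = 4$, and the series reduces to $2^{-1} - 2^{-3} = 3/8$.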
Corollary 4.1.11 Let $(c_n)_{n \in \mathbb{N}_+}$ be a non-decreasing sequence of positive
numbers satisfying $\sum_{n \in \mathbb{N}_+} c_n^{-1} < \infty$. Then
\[
  t_n = \frac{1 + o(1)}{\log 2}\, n \log n + \theta_n c_n \quad \text{a.e.}
\]
as $n \to \infty$, where $\theta_n$ is an $I$-valued random variable for any $n \in \mathbb{N}_+$.
Proof. This is an immediate consequence of Theorem 4.1.9 and
Proposition 1.3.16 (F. Bernstein’s theorem). □
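Corollary 4.1.12 Set $d_n = \exp(\kappa \log^2 \kappa)\, \kappa \log^2 \kappa$ for
\[
  \exp((\kappa - 1) \log^2 (\kappa - 1)) < n \le \exp(\kappa \log^2 \kappa), \qquad \kappa \ge 2. \tag{4.1.5}
\]
Then
\[
  \limsup_{n \to \infty} \frac{a_1 + \cdots + a_n}{d_n} = \frac{1}{\log 2} \quad \text{a.e.}
\]
Proof. In Corollary 4.1.11 set
\[
  c_n = d_n/(\log \log 10\kappa)
\]
for $n$ in the range (4.1.5). It is easy to check that $\sum_{n \in \mathbb{N}_+} c_n^{-1} < \infty$ and that
(4.1.5) implies
\[
  n \log n \le d_n, \qquad n \in \mathbb{N}_+ .
\]
Then by Corollary 4.1.11 we have
\[
  t_n \le \frac{1 + o(1)}{\log 2}\, d_n + \frac{d_n}{\log \log 10\kappa} \quad \text{a.e.}
\]
as $\kappa \to \infty$, so $\limsup_{n \to \infty} t_n/d_n \le 1/\log 2$ a.e. To complete the proof we
note that setting $n_\kappa = \exp((\kappa + 1) \log^2 (\kappa + 1))$ we have $d_{n_\kappa} = n_\kappa \log n_\kappa$,
$\kappa \in \mathbb{N}_+$, and $\lim_{\kappa \to \infty} t_{n_\kappa}/d_{n_\kappa} = 1/\log 2$. □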
Remarks. 1. Philipp (1988, Theorem 1) proved that (i) for any sequence
$(c_n)_{n \in \mathbb{N}_+}$ of positive numbers such that $\sum_{n \in \mathbb{N}_+} c_n^{-1} < \infty$, we have
$\limsup_{n \to \infty} t_n/c_n = 0$ a.e., and (ii) for any sequence $(c_n)_{n \in \mathbb{N}_+}$ of positive
numbers such that the sequence $(c_n/n)_{n \in \mathbb{N}_+}$ is non-decreasing and
$\sum_{n \in \mathbb{N}_+} c_n^{-1} = \infty$, we have $\limsup_{n \to \infty} t_n/c_n = \infty$ a.e. Corollary 4.1.11
shows that the condition on the sequence $(c_n/n)_{n \in \mathbb{N}_+}$ in (ii) cannot be
dispensed with.
2. It is easy to show, see Diamond and Vaaler (op. cit., pp. 81–82), that
if $(c_n)_{n \in \mathbb{N}_+}$ is as in Corollary 4.1.11, then setting
\[
  S = \{ n \in \mathbb{N}_+ : c_n < n \log n \},
\]
we have
\[
  \lim_{x \to \infty} \frac{1}{\log x} \sum_{n \le x,\ n \in S} \frac{1}{n} = 0,
\]
that is, $S$ has logarithmic density zero. It then follows from Corollary 4.1.11
that
\[
  a_1 + \cdots + a_n = O(c_n)
\]
as $n \to \infty$ for all integers $n$ outside a set of logarithmic density 0. See also
Corollary 3.1.9.
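3. Theorem 4.1.9 can be easily generalized for a function $H : \mathbb{N}_+ \to \mathbb{R}_{++}$
satisfying
\[
  \Bigl( \sum_{1 \le i \le n} H^2(i)/i^2 \Bigr) \Big/ \Bigl( \sum_{1 \le i \le n} H(i)/i^2 \Bigr)^2
  = O\bigl( n \log^{-\frac{3}{2} - \varepsilon} n \bigr)
\]
as $n \to \infty$ for some $\varepsilon > 0$. [Clearly, $H(i) = i$, $i \in \mathbb{N}_+$, satisfies the condition
above.] For such a function $H$ we have
\[
  \sum_{i=1}^{n} H(a_i) = \frac{1 + o(1)}{\log 2}\, n \sum_{1 \le i \le n} H(i) \log\Bigl(1 + \frac{1}{i(i+2)}\Bigr)
  + \theta_n \max_{1 \le i \le n} H(a_i) \quad \text{a.e.},
\]
where $\theta_n$ is an $I$-valued random variable for any $n \in \mathbb{N}_+$. The proof can be
found in Diamond and Vaaler (op. cit.). □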

4.1.2 Empirical evidence, and normal continued fraction numbers

We shall now discuss the considerable amount of empirical evidence already
accumulated on continued fraction expansions of certain real numbers. The
interest of such computations lies in comparing statistics of such expansions
with known theoretical limiting distributions.
It is clear that, for instance, all quadratic irrationalities and the number
e − 2 are contained in the exceptional set in Proposition 4.1.8. See
Subsection 1.1.3.
Clearly, all the numbers just mentioned are also contained in the
exceptional set in Proposition 4.1.1.
As we have already mentioned in Subsection 1.1.3, π − 3 seems to lie in the
opposite direction; its continued fraction expansion is

π − 3 = [ 7, 15, 1, 292, 1, 1, 1, 2, 1, 3, · · · ] .

In Bailey et al. (1997, p. 423) it is asserted that, based on the first 17,001,303
continued fraction digits of π − 3, the geometric mean is 2.68639 and the
harmonic mean is 1.745882, which are reasonably close to K0 and K−1 —see
Proposition 4.1.8. Clearly, no conclusion can be drawn beyond this.
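The leading digits quoted above can be reproduced from any sufficiently good rational approximation of π − 3 by iterating the continued fraction transformation in exact arithmetic. The sketch below is only an illustration: the 15-decimal truncation of π and the helper names are assumptions made here, and such a truncation determines only a limited number of leading digits (ten are requested). The same helper computes the geometric mean $K_0(\omega, n)$ of the first $n$ digits.

```python
from fractions import Fraction
import math

def rcf_digits(x, count):
    """Up to `count` RCF digits of a rational x in (0, 1), obtained by
    iterating tau(x) = 1/x - floor(1/x) in exact arithmetic."""
    digits = []
    while x != 0 and len(digits) < count:
        inverse = 1 / x
        digit = inverse.numerator // inverse.denominator
        digits.append(digit)
        x = inverse - digit
    return digits

def geometric_mean(values):
    """(v_1 ... v_n)^(1/n), computed through logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

# pi truncated to 15 decimals; an assumption of this sketch, not data from the text.
PI_TRUNCATED = Fraction(3141592653589793, 10**15)
leading_digits = rcf_digits(PI_TRUNCATED - 3, 10)

print(leading_digits)
print(geometric_mean(leading_digits))
```

Ten digits are far too few for the geometric mean to say anything about $K_0$; the statistics quoted from Bailey et al. rest on millions of digits.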
For computations concerning the continued fraction digits of various ir-
rationals in I we refer the reader to Alexandrov (1978), Brjuno (1964),
Choong, Daykin and Rathbone (1971) (see nevertheless D. Shanks’ review
[MR 52 # 7073] of this paper), Lang and Trotter (1972), Richtmyer (1975),
Shiu (1995), and J.O. Shallit’s review [MR 96b: 11165] of this last paper.
Presenting an algorithm for computing the continued fraction expansion
of numbers which are zeroes of differentiable functions, Shiu (1995) obtained
statistics of the first 10000 digits of irrationals in $I$ such as $\sqrt[3]{2} - 1$, $\pi - 3$,
$\pi^2 - 9$, $\log 2$, $2^{\sqrt{2}} - 2$. Table 1 below is compiled from his Table 1. The last
column contains the (theoretical) asymptotic relative digit frequencies
\[
  \frac{1}{\log 2} \log\Bigl(1 + \frac{1}{i(i+2)}\Bigr), \qquad 1 \le i \le 10,
\]
in the first 10 lines, the asymptotic relative frequency
\[
  \frac{1}{\log 2} \log \frac{12 \times 101}{11 \times 102}
\]
of the digits in the range [11, 100] in the 11th line, and the asymptotic
relative frequency
\[
  \frac{1}{\log 2} \log \frac{102}{101}
\]
of the digits exceeding 100 in the last line. Cf. Propositions 4.1.1, 4.1.3, and
4.1.4.
             Frequency of occurrence of i in 10000 digits of
Digit i     ∛2 − 1    π − 3   π² − 9    log 2   2^√2 − 2    Theoretical asymptotic
                                                             relative frequency

1             4173     4206     4134     4149       4192    0.415037499 ···
2             1675     1672     1706     1666       1639    0.169925001 ···
3              946      882      948      905        933    0.093109404 ···
4              636      597      581      600        616    0.058893689 ···
5              421      443      401      390        390    0.040641984 ···
6              295      282      302      334        278    0.029747343 ···
7              240      224      232      226        213    0.022720076 ···
8              163      186      185      187        190    0.017921908 ···
9              122      143      138      142        135    0.014499569 ···
10             118      123      117      137        135    0.011972641 ···
11 − 100      1060     1113     1111     1113       1130    0.111317022 ···
≥ 101          151      129      145      151        149    0.014213859 ···

Table 1
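The last column of Table 1 can be recomputed directly from the formulas given before the table. The sketch below does so; the function names are illustrative, and the closed forms for a range of digits come from telescoping the product of the factors $(i+1)^2/(i(i+2))$.

```python
import math

def digit_frequency(i):
    """Asymptotic relative frequency of the digit i: log(1 + 1/(i(i+2))) / log 2."""
    return math.log2(1.0 + 1.0 / (i * (i + 2)))

def range_frequency(low, high=None):
    """Asymptotic relative frequency of the digits in [low, high]; with
    high=None the range is unbounded above. Both cases use the telescoped
    product of (i+1)^2 / (i (i+2)) over the range."""
    if high is None:
        return math.log2((low + 1) / low)
    return math.log2((low + 1) * (high + 1) / (low * (high + 2)))

for i in range(1, 11):
    print(i, round(digit_frequency(i), 9))
print("11-100", round(range_frequency(11, 100), 9))
print(">=101", round(range_frequency(101), 9))
```

The twelve frequencies add up to 1, which gives a convenient consistency check on the telescoped formulas.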
It is also interesting to note that setting $M_{10000}(\omega) = \max_{1 \le \kappa \le 10000} a_\kappa(\omega)$
(cf. Subsection 3.1.3) we have
\[
\begin{aligned}
  M_{10000}(\sqrt[3]{2} - 1) &= a_{1990}(\sqrt[3]{2} - 1) = 12737, \\
  M_{10000}(\pi - 3) &= a_{431}(\pi - 3) = 20776, \\
  M_{10000}(\pi^2 - 9) &= a_{1234}(\pi^2 - 9) = 12013, \\
  M_{10000}(\log 2) &= a_{9168}(\log 2) = 963664, \\
  M_{10000}(2^{\sqrt{2}} - 2) &= a_{6342}(2^{\sqrt{2}} - 2) = 44122,
\end{aligned}
\]
and that in all cases just considered there exist digits not exceeding 100
which do not appear, viz.

  74, 86, 91, 96, 97, 99, and 100 for $\sqrt[3]{2} - 1$;
  90, 91, and 96 for $\pi - 3$;
  91 and 92 for $\pi^2 - 9$;
  55, 73, 76, 96, and 97 for $\log 2$;
  79, 80, 81, 82, 91, 94, 97, and 99 for $2^{\sqrt{2}} - 2$.

Concerning Khinchin’s constant $K_0$, computations of
\[
  K_0(\omega, n) = (a_1(\omega) \cdots a_n(\omega))^{\frac{1}{n}}
\]
for $n \le 10000$ and various $\omega \in \Omega$, including those considered above, suggest
that, e.g., $\pi - 3$ is not in the exceptional set. However, it should be pointed
out that even if there is convergence, the rate must be very slow. It
was found that $K_0(\pi - 3, 10000)$ differs from $K_0$ by more than $K_0(\pi - 3, 100)$
does!
The existence of the asymptotic relative digit and, more generally, m-
digit block frequencies (Propositions 4.1.1 and 4.1.2) raises naturally the
question of normality for the continued fraction expansion.
The idea of normality, first introduced by É. Borel in 1909, is an attempt
to formalize the notion of a real number being random. A real number
x ∈ I is said to be normal in base b, b ∈ N+ , b ≥ 2, if and only if in its
representation in base b all digits 0, 1, · · · , b−1 appear asymptotically equally
often, i.e., with asymptotic relative frequencies all equal to 1/b. In addition,
for each $m \in \mathbb{N}_+$ the $b^m$ different $m$-digit blocks must occur equally often.
In other words, for any $m \in \mathbb{N}_+$ we should have
\[
  \lim_{n \to \infty} \frac{1}{n} \Bigl( \text{number of occurrences of a given $m$-digit block in the first $n + m - 1$ base-$b$ digits of $x$} \Bigr) = b^{-m}
\]

whatever the given m-digit block. Actually, the above equation holds for all
x ∈ I except for a set of Lebesgue measure zero. This can easily be seen by
applying Birkhoff’s ergodic theorem to the transformation T x = bx mod 1
of I. A number that is normal in all bases b ∈ N+ , b ≥ 2, is called normal.
However, even if there are lots of normal numbers, when we are given a
‘concrete’ number x ∈ I the existence result just mentioned does not help
to decide whether x is normal or not. Such a problem cannot be handled
by methods known today. (Will it ever be solved?) For instance, it is not
known whether π − 3, e − 2, or any irrational algebraic number is normal
or not. The first example of a normal number in base 10 was given by
Champernowne (1933). His number is

x = 0. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 · · ·

but an explicit example of a normal number is still lacking.
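The digit frequencies of Champernowne's number are easy to tabulate for a finite prefix. The sketch below merely generates the first decimal digits and counts them; the function names and the prefix length are arbitrary choices made here, and a finite count of course proves nothing about normality.

```python
from collections import Counter
from itertools import count

def champernowne_digits(length):
    """First `length` decimal digits of 0.123456789101112..., as a string."""
    pieces, total = [], 0
    for k in count(1):
        text = str(k)
        pieces.append(text)
        total += len(text)
        if total >= length:
            break
    return "".join(pieces)[:length]

def relative_frequencies(digits):
    """Relative frequency of each decimal digit in the given digit string."""
    counts = Counter(digits)
    return {d: counts.get(d, 0) / len(digits) for d in "0123456789"}

prefix = champernowne_digits(100_000)
frequencies = relative_frequencies(prefix)
for digit in "0123456789":
    print(digit, frequencies[digit])
```

Counting $m$-digit blocks instead of single digits only requires sliding a window of width $m$ over `prefix`.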


Clearly, a similar problem can be considered for the continued fraction
expansion (which has the advantage of not being related to any base). An
irrational ω ∈ I is said to be a normal continued fraction number if and only
if all its asymptotic relative m-digit block frequencies exist and are equal to
those occurring in Proposition 4.1.2 for any m ∈ N+ . In other words, ω is
a normal continued fraction number if it does not belong to the exceptional
sets of λ-measure zero excluded in Proposition 4.1.2 for any m ∈ N+ . For
instance, the quadratic irrationalities are not normal since they eventually
have periodic expansions, and neither is e − 2.
A construction of the Champernowne type for a normal continued frac-
tion number was given by Adler, Keane, and Smorodinsky (1981). Their
example is as follows. Let $(r_n)_{n \in \mathbb{N}_+}$ be the sequence of rationals in (0,1)
obtained by first writing $r_1 = 1/2$, then $r_2 = 1/3$ and $r_3 = 2/3$, then $r_4 = 1/4$,
$r_5 = 2/4$, $r_6 = 3/4$, etc., at each stage $m \in \mathbb{N}_+$ writing all quotients with
denominator $m + 1$ in increasing order. Let $r_i = [a_{i,1}, a_{i,2}, \dots, a_{i,n_i}]$ be the
continued fraction expansion of $r_i$, with $a_{i,n_i} \ne 1$, $i \in \mathbb{N}_+$. The irrational $\omega$
with continued fraction expansion
\[
  [a_{1,1}, a_{2,1}, a_{3,1}, a_{3,2}, a_{4,1}, a_{5,1}, a_{6,1}, a_{6,2}, a_{7,1}, a_{8,1}, a_{8,2}, a_{9,1}, a_{9,2}, a_{9,3}, \cdots],
\]
which is obtained by concatenating the expansions of $r_1, r_2, \cdots$ in the given
order, is a normal continued fraction number. The first 14 digits of $\omega$ are
2, 3, 1, 2, 4, 2, 1, 3, 5, 2, 2, 1, 1, 2.
Another example of a different nature had been given by Postnikov
(1960).
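The concatenation described for the example of Adler, Keane, and Smorodinsky can be generated mechanically. The sketch below follows the description in the text (all quotients with denominator $m + 1$, in increasing order, each expanded with last digit different from 1); the generator and function names are illustrative.

```python
from fractions import Fraction
from itertools import count, islice

def finite_rcf(r):
    """RCF digits of a rational r in (0, 1). Iterating 1/x - floor(1/x) until 0
    is reached yields the expansion whose last digit differs from 1."""
    digits = []
    while r != 0:
        inverse = 1 / r
        digit = inverse.numerator // inverse.denominator
        digits.append(digit)
        r = inverse - digit
    return digits

def concatenated_digits():
    """Digits of the concatenated expansions of 1/2, 1/3, 2/3, 1/4, 2/4, 3/4, ..."""
    for denominator in count(2):
        for numerator in range(1, denominator):
            # Fraction reduces automatically, e.g. 2/4 is expanded as 1/2 = [2].
            yield from finite_rcf(Fraction(numerator, denominator))

first_digits = list(islice(concatenated_digits(), 14))
print(first_digits)
```

Taking a longer slice and counting digits gives an empirical look at how slowly the frequencies of this particular number approach the theoretical ones.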
We should emphasize that even if the empirical evidence pleads in favour
of normality for the continued fraction expansion of algebraic irrationals of
degree exceeding 2, or of π − 3, π 2 − 9 etc., the only mathematical results
proved so far are the examples of normal continued fraction numbers just
discussed.
Finally, a few words about the empirical evidence concerning Theorem
4.1.9. Von Neumann and Tuckerman (1955) computed $t_n(\sqrt[3]{2} - 1)$ and
$n \log n/\log 2$ for $n = 100(100)2000$. It appears that $t_n(\sqrt[3]{2} - 1) \log 2/(n \log n)$
is most of the time greater than 1 and often nearly 2. As $t_n \log 2/(n \log n)$
converges just in probability to 1 as $n \to \infty$, these deviations cannot be seen
as significant.

4.1.3 The case of associated and extended random variables


Since $\bar\tau$ is $\bar\gamma$-preserving and ergodic under $\bar\gamma$ (see Subsection 4.0.2), it follows
again from Theorem 4.0.3 that
\[
  \lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} \bar f \circ \bar\tau^k
  = \frac{1}{\log 2} \int_0^1 dx \int_0^1 \frac{\bar f(x, y)}{(xy + 1)^2}\, dy \quad \text{a.e. in } I^2 \tag{4.1.6}
\]
for any measurable function $\bar f : I^2 \to \mathbb{R}$ such that $\iint_{I^2} |\bar f|\, d\lambda^2 < \infty$. As
in Subsection 4.1.1, for suitable choices of $\bar f$, Proposition 4.0.4 will lead to
estimates of convergence rates in (4.1.6).
We now give several results which can be derived from (4.1.6).
Proposition 4.1.13 For any $B \in \mathcal{B}_I^2$ we have
\[
  \lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} I_B(\tau^k, \bar s_k)
  = \frac{1}{\log 2} \iint_B \frac{dx\, dy}{(xy + 1)^2} \quad \text{a.e. in } I^2 .
\]
Proof. The equation above follows from (4.1.6) by taking $\bar f = I_B$,
$B \in \mathcal{B}_I^2$, and noting that by the very definition of the extended incomplete
quotients (see Subsection 1.3.3), equations (1.3.1) and (1.3.1′) can be
written as
\[
  \bar\tau^n(\omega, \theta) = (\tau^n(\omega), \bar s_n(\omega, \theta)), \qquad (\omega, \theta) \in \Omega \times I,
\]
for any $n \in \mathbb{N}_+$. (The last equation holds for $n = 0$, too.) □
Corollary 4.1.14 For any $A \in \mathcal{B}_I$ we have
\[
  \lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} I_A(\tau^k) = \gamma(A) \quad \text{a.e. in } I,
\]
and
\[
  \lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} I_A(\bar s_k) = \gamma(A) \quad \text{a.e. in } I^2 .
\]
Proof. The first equation follows by taking $B = A \times I$. [It might be
also derived from equation (4.1.1).] The second equation follows by taking
$B = I \times A$. □
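It follows by dominated convergence from Proposition 4.1.13 that for any
$\bar\mu \in \mathrm{pr}(\mathcal{B}_I^2)$ we have
\[
  \lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} \bar\mu\bigl(\bar\tau^{-k}(B)\bigr) = \bar\gamma(B), \qquad B \in \mathcal{B}_I^2 . \tag{4.1.7}
\]
In particular,
\[
  \lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} \bar\mu\bigl(\bar\tau^{-k}(I \times A)\bigr)
  = \lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} \bar\mu(\bar s_k \in A)
  = \gamma(A), \qquad A \in \mathcal{B}_I . \tag{4.1.8}
\]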
We are going to show under suitable assumptions that in (4.1.7) actual
convergence holds instead of Cesàro convergence while in (4.1.8) the
extended random variable $\bar s_k$ can be replaced by $s_k^a$, $k \in \mathbb{N}$, for a fixed $a \in I$.
Proposition 4.1.15 Let $\bar\mu \in \mathrm{pr}(\mathcal{B}_I^2)$ such that $\bar\mu \ll \lambda^2$. Then
\[
  \lim_{n \to \infty} \bar\mu\bigl(\bar\tau^{-n}(B)\bigr) = \bar\gamma(B) \tag{4.1.9}
\]
for any $B \in \mathcal{B}_I^2$.


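Proof. Let $\bar h = d\bar\mu/d\lambda^2$. Then for any $B \in \mathcal{B}_I^2$ we have
\[
  \bar\mu(\bar\tau^{-n}(B)) = \iint_{I^2} I_B \circ \bar\tau^n\, d\bar\mu = \iint_{I^2} (I_B \circ \bar\tau^n)(\bar h/\bar g)\, d\bar\gamma,
\]
where $\bar g = d\bar\gamma/d\lambda^2$, that is,
\[
  \bar g(x, y) = \frac{1}{\log 2}\, \frac{1}{(xy + 1)^2}, \qquad (x, y) \in I^2 .
\]
Now, since $\bar\tau$ is strongly mixing (see Subsections 4.0.1 and 4.0.2), the last
integral in the equations above converges to
\[
  \iint_{I^2} I_B\, d\bar\gamma \iint_{I^2} (\bar h/\bar g)\, d\bar\gamma = \bar\gamma(B)\, \bar\mu(I^2) = \bar\gamma(B)
\]
as $n \to \infty$. □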
Remarks. 1. Proposition 2.1.5 shows that measures $\mu\tau^{-n}$, $n \in \mathbb{N}$, can
be expressed in terms of the Perron–Frobenius operator $P_\gamma = U$ of $\tau$ with
respect to $\gamma$. A similar representation holds for the case of a measure $\bar\mu$ as
in Proposition 4.1.15. It is easy to check that we have
\[
  \bar\mu(\bar\tau^{-n}(B)) = \iint_B \bar P_{\bar\gamma}^n \bar f\, d\bar\gamma, \qquad B \in \mathcal{B}_I^2 ,
\]
where $\bar f = \bar h/\bar g$ and $\bar P_{\bar\gamma}$ is the Perron–Frobenius operator of $\bar\tau$ under $\bar\gamma$. See
the Remark following Proposition 2.1.1.
If the endomorphism $(\bar\tau, \bar\gamma)$ were exact, then from Proposition 4.0.2 we
might have deduced that convergence in (4.1.9) is uniform with respect to
$B \in \mathcal{B}_I^2$. Since $(\bar\tau, \bar\gamma)$ is not exact, such a conclusion cannot be reached this
way. It is an open problem whether this is really true.
2. Proposition 4.1.15 is a first step towards the solution of what can be
called Gauss’ problem for the natural extension $\bar\tau$ of $\tau$. □
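Theorem 4.1.16 Let $\mu \in \mathrm{pr}(\mathcal{B}_I)$ such that $\mu \ll \lambda$. For any $B \in \mathcal{B}_I^2$
such that $\lambda^2(\partial B) = 0$ we have
\[
  \text{(i)} \quad \lim_{n \to \infty} \mu\bigl(\bar\tau^n(\,\cdot\,, a) \in B\bigr) = \bar\gamma(B)
\]
uniformly with respect to $a \in I$;
\[
  \text{(ii)} \quad \lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} I_B(\tau^k, s_k^a) = \bar\gamma(B) \quad \text{a.e. in } I
\]
uniformly with respect to $a \in I$.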
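Proof. (i) For any $\theta \in I$ and $B \in \mathcal{B}_I^2$ set
\[
  h_n(\theta, B) = \mu\bigl(\bar\tau^n(\,\cdot\,, \theta) \in B\bigr), \qquad n \in \mathbb{N}_+ .
\]
By Fubini’s theorem we have
\[
\begin{aligned}
  (\mu \otimes \lambda)\bigl(\bar\tau^{-n}(B)\bigr)
  &= \iint_{I^2} I_B(\bar\tau^n(\omega, \theta))\, \mu(d\omega)\, d\theta \\
  &= \int_0^1 d\theta \int_0^1 I_B(\bar\tau^n(\omega, \theta))\, \mu(d\omega) \\
  &= \int_0^1 \mu\bigl(\bar\tau^n(\,\cdot\,, \theta) \in B\bigr)\, d\theta = \int_0^1 h_n(\theta, B)\, d\theta .
\end{aligned}
\]
Since $\mu \otimes \lambda \ll \lambda^2$, it follows from Proposition 4.1.15 that
\[
  \lim_{n \to \infty} \int_0^1 h_n(\theta, B)\, d\theta = \bar\gamma(B) \tag{4.1.10}
\]
for any $B \in \mathcal{B}_I^2$.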


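Now, note that, letting $d$ denote the Euclidean distance in $I^2$, by
Theorem 1.2.2 we have
\[
  d\bigl(\bar\tau^n(\omega, \theta), \bar\tau^n(\omega, a)\bigr) \le \max_{i^{(n)} \in \mathbb{N}_+^n} \lambda\bigl(I(i^{(n)})\bigr) = \frac{1}{F_n F_{n+1}}, \qquad n \in \mathbb{N}_+ , \tag{4.1.11}
\]
for any $\theta, a \in I$. Given $\varepsilon > 0$, let
\[
  B_\varepsilon^+ = \bigcup_{(x, y) \in B} D_\varepsilon(x, y),
\]
where $D_\varepsilon(x, y)$ is the open disk of radius $\varepsilon$ centered at $(x, y) \in I^2$, and
\[
  B_\varepsilon^- = \bigl( (x, y) \in B : D_\varepsilon(x, y) \subset B \bigr).
\]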


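By (4.1.11), for $n \ge n_0(\varepsilon)$ great enough and any $\theta, a \in I$ we have
\[
  (\omega : \bar\tau^n(\omega, \theta) \in B_\varepsilon^-) \subset (\omega : \bar\tau^n(\omega, a) \in B)
  \subset (\omega : \bar\tau^n(\omega, \theta) \in B_\varepsilon^+). \tag{4.1.12}
\]
On the other hand, for any $n \in \mathbb{N}$ and $\theta \in I$ we trivially have
\[
  (\omega : \bar\tau^n(\omega, \theta) \in B_\varepsilon^-) \subset (\omega : \bar\tau^n(\omega, \theta) \in B)
  \subset (\omega : \bar\tau^n(\omega, \theta) \in B_\varepsilon^+). \tag{4.1.13}
\]
Hence
\[
  -h_n\bigl(\theta, B_\varepsilon^+ \setminus B_\varepsilon^-\bigr) \le h_n(\theta, B) - h_n(a, B) \le h_n\bigl(\theta, B_\varepsilon^+ \setminus B_\varepsilon^-\bigr)
\]
for any $n \ge n_0(\varepsilon)$ and $\theta, a \in I$. Integrating the double inequality above over
$\theta \in I$ yields
\[
  \Bigl| \int_0^1 h_n(\theta, B)\, d\theta - h_n(a, B) \Bigr| \le \int_0^1 h_n\bigl(\theta, B_\varepsilon^+ \setminus B_\varepsilon^-\bigr)\, d\theta
\]
for any $n \ge n_0(\varepsilon)$ whatever $a \in I$. Finally, let first $n \to \infty$ then $\varepsilon \to 0$ in
the last inequality. By (4.1.10) we obtain
\[
  \limsup_{n \to \infty}\, \sup_{a \in I} |\bar\gamma(B) - h_n(a, B)| \le \lim_{\varepsilon \to 0} \bar\gamma(B_\varepsilon^+ \setminus B_\varepsilon^-) = \bar\gamma(\partial B) = 0
\]
since $\lambda^2(\partial B) = 0$, and the proof of (i) is complete.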


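(ii) It is easy to check that (4.1.12) and (4.1.13) imply the inequalities
\[
  I_{B_\varepsilon^-}\bigl(\tau^k, \bar s_k\bigr) \le I_B\bigl(\tau^k, s_k^a\bigr) \le I_{B_\varepsilon^+}\bigl(\tau^k, \bar s_k\bigr)
\]
for any $a \in I$, $(\omega, \theta) \in \Omega \times I$, and any $k \ge n_0(\varepsilon)$ great enough. Also, we
trivially have
\[
  I_{B_\varepsilon^-}\bigl(\tau^k, \bar s_k\bigr) \le I_B\bigl(\tau^k, \bar s_k\bigr) \le I_{B_\varepsilon^+}\bigl(\tau^k, \bar s_k\bigr)
\]
for any $k \in \mathbb{N}$ and $(\omega, \theta) \in \Omega \times I$. Hence
\[
  \bigl| I_B(\tau^k, \bar s_k) - I_B(\tau^k, s_k^a) \bigr| \le I_{B_\varepsilon^+ \setminus B_\varepsilon^-}(\tau^k, \bar s_k) \tag{4.1.14}
\]
for any $k \ge n_0(\varepsilon)$, $a \in I$, and $(\omega, \theta) \in \Omega \times I$. By Proposition 4.1.13 we have
\[
  \lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} I_B(\tau^k, \bar s_k) = \bar\gamma(B)
\]
and
\[
  \lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} I_{B_\varepsilon^+ \setminus B_\varepsilon^-}(\tau^k, \bar s_k) = \bar\gamma(B_\varepsilon^+ \setminus B_\varepsilon^-) \quad \text{a.e. in } I^2 .
\]
Since $\lambda^2(\partial B) = 0$, we have
\[
  \lim_{\varepsilon \to 0} \bar\gamma(B_\varepsilon^+ \setminus B_\varepsilon^-) = \bar\gamma(\partial B) = 0.
\]
It is now easy to see that (4.1.14) and the last three equations imply the
result stated. □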
Remark. Theorem 4.1.16(i) has been proved by Barbolosi and Faivre
(1995) while (ii) is implicit (or implicitly used) in many papers by Dutch
authors. See, e.g., Bosma et al. (1983) or Jager (1986). □
Theorem 4.1.16 has a host of consequences. We state some of them.
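Corollary 4.1.17 Let $\mu \in \mathrm{pr}(\mathcal{B}_I)$ such that $\mu \ll \lambda$. For any $B \in \mathcal{B}_I^2$
such that $\lambda^2(\partial B) = 0$ we have
\[
  \lim_{n \to \infty} \mu\bigl((\tau^n, s_n^a) \in B\bigr) = \bar\gamma(B) \tag{4.1.15}
\]
uniformly with respect to $a \in I$.
Proof. This is just a transcription of the result stated in Theorem
4.1.16(i) as
\[
  \bar\tau^n(\omega, a) = (\tau^n(\omega), \bar s_n(\omega, a)) = (\tau^n(\omega), s_n^a(\omega)), \qquad (\omega, a) \in \Omega \times I,
\]
for any $n \in \mathbb{N}$. □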
Let us note that in Theorem 2.5.8 the (optimal) convergence rate in
(4.1.15) has been obtained in the case where $\mu = \gamma_a$ for the class of rectangles
$B = [0, x] \times [0, y]$, $x, y \in I$. Using this result we can prove
Proposition 4.1.18 Let $B$ be a simply connected subset of $I^2$ such that
$\partial B = \bigcup_{i=1}^{m} \ell_i$ for some $m \in \mathbb{N}_+$, where either
\[
  \ell_i := \bigl( (x, f_i(x)) : a_i \le x \le b_i \bigr)
\]
with $0 \le a_i < b_i \le 1$ and $f_i : [a_i, b_i] \to I$ continuous and monotone, or
\[
  \ell_i := \bigl( (c_i, y) : a_i' \le y \le b_i' \bigr)
\]
with $c_i \in I$ and $0 \le a_i' < b_i' \le 1$. Then
\[
  \gamma_a\bigl((\tau^n, s_n^a) \in B\bigr) = \bar\gamma(B) + O(g^n)
\]
as $n \to \infty$, where the constant implied in $O$ depends on $m$ and the quantities
defining the $\ell_i$, $1 \le i \le m$.
The proof in the case $a = 0$ can be found in Dajani and Kraaikamp
(1994). □
By particularizing the set B in Corollary 4.1.17 and Proposition 4.1.18
we obtain results originally derived by ad hoc methods. We shall state below
some of them leaving the calculation details to the reader.
Corollary 4.1.19 For any $\mu \in \mathrm{pr}(\mathcal{B}_I)$ such that $\mu \ll \lambda$ and any $t \in I$
we have
\[
  \lim_{n \to \infty} \mu(\Theta_n \le t) = \widetilde H(t),
\]
where $\widetilde H$ has been defined in Theorem 2.2.13. For $\mu = \lambda$ the convergence
rate in the equation above is $O(g^n)$ as $n \to \infty$.
Proof. This follows from Corollary 4.1.17 with $a = 0$ and
\[
  B = \Bigl( (x, y) \in I^2 : \frac{x}{xy + 1} \le t \Bigr), \qquad t \in I,
\]
and Proposition 4.1.18, as $\Theta_n = \tau^n/(s_n \tau^n + 1)$, $n \in \mathbb{N}$, by equation (1.3.7).
Note that, however, Theorem 2.2.13 yields a better convergence rate! □
Corollary 4.1.20 For any $\mu \in \mathrm{pr}(\mathcal{B}_I)$ such that $\mu \ll \lambda$ and any
$(t_1, t_2) \in I^2$ we have
\[
  \lim_{n \to \infty} \mu(\Theta_{n-1} \le t_1,\ \Theta_n \le t_2) = H(t_1, t_2),
\]
where $H$ is the distribution function with density
\[
  \begin{cases}
    \dfrac{1}{\log 2}\, \dfrac{1}{\sqrt{1 - 4 t_1 t_2}} & \text{if } t_1 \ge 0,\ t_2 \ge 0,\ t_1 + t_2 < 1, \\[2ex]
    0 & \text{elsewhere.}
  \end{cases}
\]
For $\mu = \lambda$ the convergence rate in the equation above is $O(g^n)$ as $n \to \infty$.
Proof. This follows from Corollary 4.1.17 with $a = 0$ and
\[
  B = \Bigl( (x, y) \in I^2 : \frac{y}{xy + 1} \le t_1,\ \frac{x}{xy + 1} \le t_2 \Bigr), \qquad (t_1, t_2) \in I^2 ,
\]
and Proposition 4.1.18, as
\[
  \Theta_{n-1} = \frac{s_n}{s_n \tau^n + 1}, \qquad \Theta_n = \frac{\tau^n}{s_n \tau^n + 1}, \qquad n \in \mathbb{N},
\]
by equation (1.3.7). □
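Let us define random variables $\rho_n$ and $\Theta_n'$ by
\[
  \rho_n(\omega) = \frac{\bigl| \omega - \frac{p_{n+1}}{q_{n+1}} \bigr|}{\bigl| \omega - \frac{p_n}{q_n} \bigr|}, \qquad
  \Theta_n' = q_n q_{n+1} \Bigl| \omega - \frac{p_n}{q_n} \Bigr|, \qquad \omega \in \Omega,\ n \in \mathbb{N}.
\]
It is easy to see that $\rho_n = s_{n+1} \tau^{n+1}$ and $\Theta_n' = 1/(s_{n+1} \tau^{n+1} + 1)$ so that
$\Theta_n' = 1/(\rho_n + 1)$, $n \in \mathbb{N}$.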
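Corollary 4.1.21 For any $\mu \in \mathrm{pr}(\mathcal{B}_I)$ such that $\mu \ll \lambda$ we have
\[
  \lim_{n \to \infty} \mu(\rho_n \le t) = \frac{1}{\log 2} \Bigl( \log(t + 1) - \frac{t \log t}{t + 1} \Bigr), \qquad t \in I,
\]
\[
  \lim_{n \to \infty} \mu(\Theta_n' \le t) =
  \begin{cases}
    0 & \text{if } 0 \le t \le 1/2, \\[1ex]
    \dfrac{\log\bigl(2 t^t (1 - t)^{1-t}\bigr)}{\log 2} & \text{if } 1/2 \le t \le 1.
  \end{cases}
\]
For $\mu = \lambda$ the convergence rate in the equations above is $O(g^n)$ as $n \to \infty$.
The proof is left to the reader. □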
For other results of the same type, which can be derived as before, we
refer the reader to Bosma et al. (1983), Jager (1986), Kraaikamp (1994).
Corollary 4.1.22 For any $t, t_1, t_2 \in I$ the limits
\[
  \lim_{n \to \infty} \frac{1}{n}\, \mathrm{card}\{ k : \Theta_k \le t,\ 0 \le k \le n - 1 \},
\]
\[
  \lim_{n \to \infty} \frac{1}{n}\, \mathrm{card}\{ k : \Theta_k \le t_1,\ \Theta_{k+1} \le t_2,\ 0 \le k \le n - 1 \},
\]
\[
  \lim_{n \to \infty} \frac{1}{n}\, \mathrm{card}\{ k : \rho_k \le t,\ 0 \le k \le n - 1 \},
\]
and
\[
  \lim_{n \to \infty} \frac{1}{n}\, \mathrm{card}\{ k : \Theta_k' \le t,\ 0 \le k \le n - 1 \},
\]
all exist a.e. in $I$ and are equal to the corresponding values of the limiting
distribution functions occurring in Corollaries 4.1.19, 4.1.20, and 4.1.21,
respectively.
The proof is immediate on account of Theorem 4.1.16(ii) and the
corollaries referred to in the statement. □
Remarks. 1. It has been proved by Hensley (1998) that if $(k_n)_{n \in \mathbb{N}_+}$ is a
strictly increasing sequence of positive integers, then for any $t \in I$ we have
\[
  \lim_{n \to \infty} \frac{1}{n}\, \mathrm{card}\{ j : \Theta_{k_j} \le t,\ 0 \le j \le n - 1 \} = \widetilde H(t) \quad \text{a.e. in } I, \tag{4.1.16}
\]
where $\widetilde H$ has been defined in Theorem 2.2.13. Corollary 4.1.22 only covers
the case $k_n = n$, $n \in \mathbb{N}_+$.
2. In the case $k_n = n$, $n \in \mathbb{N}_+$, equation (4.1.16) has been conjectured
by H.W. Lenstra Jr. Actually, this conjecture is implicit in Doeblin (1940),
which enables us to call it after both Doeblin and Lenstra. The
Doeblin–Lenstra conjecture has been proved by Bosma et al. (1983) by using, even
if not explicitly, Theorem 4.1.16(ii) in a special case. □
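Corollary 4.1.23 The equations
\[
  \lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} \Theta_k = \frac{1}{4 \log 2} = 0.36067 \cdots
\]
\[
  \lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} \Theta_k \Theta_{k+1} = \frac{1}{6} \Bigl( 1 - \frac{1}{4 \log 2} \Bigr) = 0.10655 \cdots
\]
\[
  \lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} \rho_k = \frac{\pi^2}{12 \log 2} - 1 = 0.18656 \cdots
\]
and
\[
  \lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} \Theta_k' = \frac{1}{2} + \frac{1}{4 \log 2} = 0.86067 \cdots
\]
all hold a.e. in $I$.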
Proof. We consider just the first equation, leaving the calculation details
to the reader, as the same idea underlies the proofs in the other cases.
By Corollary 4.1.22 we have
\[
  \lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} I_{[0,t]}(\Theta_k) = \widetilde H(t)
\]
a.e. in $I$ for any $t \in I \cap \mathbb{Q}$. Hence for any fixed $\omega \in \Omega$ not belonging to the
exceptional set the distribution function
\[
  F_n(t) := \frac{1}{n} \sum_{k=0}^{n-1} I_{[0,t]}(\Theta_k), \qquad t \in I,
\]
converges weakly to $\widetilde H$ as $n \to \infty$. Consequently,
\[
  \int_I t\, dF_n(t) = \frac{1}{n} \sum_{k=0}^{n-1} \Theta_k
\]
should converge to
\[
  \int_I t\, d\widetilde H(t) = \frac{1}{4 \log 2}
\]
as $n \to \infty$ for any $\omega \in \Omega$ not belonging to the exceptional set, thus a.e. in $I$.
While for the last two equations the reasoning is quite similar, in the
case of the second equation we should consider two-dimensional distribution
functions, and the value of the limit equals $\iint_{I^2} t_1 t_2\, dH(t_1, t_2)$. □
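The four constants of Corollary 4.1.23 can be cross-checked numerically without the limiting distribution functions. The sketch below assumes that each time average equals the integral, against the density $1/(\log 2\,(xy+1)^2)$ of $\bar\gamma$ on $I^2$, of the corresponding function of $(x, y) = (\tau^n, s_n)$ given in the text: $x/(xy+1)$ for $\Theta_n$, $xy/(xy+1)^2$ for $\Theta_{n-1}\Theta_n$, $xy$ for $\rho_n$ and $1/(xy+1)$ for $\Theta_n'$. The midpoint rule, the grid size and the function name are illustrative choices.

```python
import math

def gamma_bar_mean(func, cells=400):
    """Midpoint-rule approximation of the integral of func(x, y) over the unit
    square against the density 1 / (log 2 * (x y + 1)^2)."""
    h = 1.0 / cells
    total = 0.0
    for i in range(cells):
        x = (i + 0.5) * h
        for j in range(cells):
            y = (j + 0.5) * h
            total += func(x, y) / (x * y + 1.0) ** 2
    return total * h * h / math.log(2.0)

mean_theta = gamma_bar_mean(lambda x, y: x / (x * y + 1.0))
mean_theta_pair = gamma_bar_mean(lambda x, y: x * y / (x * y + 1.0) ** 2)
mean_rho = gamma_bar_mean(lambda x, y: x * y)
mean_theta_prime = gamma_bar_mean(lambda x, y: 1.0 / (x * y + 1.0))

print(mean_theta, 1 / (4 * math.log(2)))
print(mean_theta_pair, (1 - 1 / (4 * math.log(2))) / 6)
print(mean_rho, math.pi ** 2 / (12 * math.log(2)) - 1)
print(mean_theta_prime, 0.5 + 1 / (4 * math.log(2)))
```

Each printed pair shows the quadrature value next to the closed form stated in the corollary; the replacement of $\bar s_k$ by $s_k$ is taken for granted here, as justified in the text by Theorem 4.1.16(ii).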
We turn now to limit properties of certain associated random variables.
It follows from (4.1.6) that for any measurable real-valued function $f$ on $I$
such that $\int_I |f|\, d\lambda < \infty$ we have
\[
  \lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} f(\bar s_k) = \int_I f\, d\gamma \quad \text{a.e. in } I^2 . \tag{4.1.17}
\]
From (4.1.17) we can derive a weaker result for the sequences $(s_n^a)_{n \in \mathbb{N}}$, $a \in I$.
Theorem 4.1.24 Let $f : I \to \mathbb{R}$ be continuous. Then for any $a \in I$ we
have
\[
  \lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} f(s_k^a) = \int_I f\, d\gamma \quad \text{a.e. in } I.
\]
Proof. We have $|\bar s_k - s_k^a| \le (F_k F_{k+1})^{-1}$ for any $k \in \mathbb{N}$, $(\omega, \theta) \in \Omega \times I$,
$a \in I$. The result then follows from (4.1.17) and the uniform continuity of
$f$ on $I$. □
Remarks. 1. The above result also follows from a theorem of Breiman
(1960) on account of the Markov property of the sequences $(s_n^a)_{n \in \mathbb{N}}$, $a \in I$.
2. The corresponding result for $y_n^a = 1/s_n^a$, $n \in \mathbb{N}_+$, $a \in I$, can be easily
stated. In this form it can be found in Elton (1987) and Grigorescu and
Popescu (1989). □
Corollary 4.1.25 For any $m \in \mathbb{N}_+$ and $a \in I$ we have
\[
  \lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} (s_k^a)^m = \frac{1}{\log 2} \sum_{i \in \mathbb{N}_+} \frac{(-1)^{i-1}}{m + i} \quad \text{a.e. in } I.
\]
In particular, for $m = 1$ the value of the limit is $(1/\log 2) - 1$.
The proof amounts to computing the integral
\[
  \frac{1}{\log 2} \int_0^1 \frac{t^m}{t + 1}\, dt,
\]
which yields the result stated. □
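The integral in the proof above can be evaluated by expanding $1/(t + 1)$ into a geometric series. A sketch of the computation, with termwise integration justified by dominated convergence, reads as follows.

```latex
% For 0 <= t < 1 the partial sums of the alternating series below are bounded by t^m <= 1,
% so the series may be integrated term by term.
\[
  \frac{t^m}{t + 1} = \sum_{i \in \mathbb{N}_+} (-1)^{i-1}\, t^{m+i-1},
  \qquad
  \int_0^1 \frac{t^m}{t + 1}\, dt = \sum_{i \in \mathbb{N}_+} \frac{(-1)^{i-1}}{m + i} .
\]
% Special case m = 1, using log 2 = 1 - 1/2 + 1/3 - 1/4 + ... :
\[
  \sum_{i \in \mathbb{N}_+} \frac{(-1)^{i-1}}{1 + i} = \frac{1}{2} - \frac{1}{3} + \frac{1}{4} - \cdots = 1 - \log 2,
  \qquad
  \frac{1}{\log 2}\,(1 - \log 2) = \frac{1}{\log 2} - 1 .
\]
```

The case $m = 1$ can also be read off directly from $\int_0^1 t/(t+1)\, dt = 1 - \log 2$.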
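Taking f(x) = log x, x ∈ I, in (4.1.17) and noting that
\begin{align*}
\int_I \log x \, \gamma(dx) &= \frac{1}{\log 2} \int_0^1 \frac{\log x}{x+1} \, dx \\
&= \frac{1}{\log 2} \left( \log(x+1) \log x \Big|_0^1 - \int_0^1 \frac{\log(x+1)}{x} \, dx \right) \\
&= -\frac{1}{\log 2} \sum_{k\in\mathbb{N}} \frac{(-1)^k}{k+1} \int_0^1 x^k \, dx = -\frac{1}{\log 2} \sum_{k\in\mathbb{N}} \frac{(-1)^k}{(k+1)^2} \\
&= -\frac{1}{\log 2} \left( \zeta(2) - \frac{2}{4} \zeta(2) \right) = -\frac{\pi^2}{12 \log 2},
\end{align*}
we obtain
\[
\lim_{n\to\infty} \frac{1}{n} \log(\bar{s}_0 \bar{s}_1 \cdots \bar{s}_{n-1}) = -\frac{\pi^2}{12 \log 2} \quad \text{a.e. in } \Omega
\]
or, equivalently,
\[
\lim_{n\to\infty} \frac{1}{n} \log(\bar{y}_0 \bar{y}_1 \cdots \bar{y}_{n-1}) = \frac{\pi^2}{12 \log 2} \quad \text{a.e. in } \Omega.
\]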
In the last equation we can give an estimate of the convergence rate. We
have shown in Example 3.2.11 that
\[
\lim_{n\to\infty} \frac{1}{n} E_{\bar{\gamma}} \left( \sum_{i=0}^{n-1} \left( \log \bar{y}_i - \frac{\pi^2}{12 \log 2} \right) \right)^2 > 0.
\]
Then for any ε > 0 by Theorem 4.0.4 we obtain
\[
\frac{1}{n} \sum_{k=0}^{n-1} \log \bar{y}_k = -\frac{1}{n} \sum_{k=0}^{n-1} \log \bar{s}_k
= \frac{\pi^2}{12 \log 2} + o\left( n^{-1/2} \log^{(3+\varepsilon)/2} n \right) \quad \text{a.e. in } \Omega \tag{4.1.18}
\]
as n → ∞, where the constant implied in o depends on ε and the current
point $(\omega, \theta) \in \Omega^2$.
While we cannot take f(x) = log x, x ∈ I, in Theorem 4.1.24 since
this is not a continuous function on I, we can however replace $\bar{s}_k$ by $s_k^a$,
k ∈ N, a ∈ I, in (4.1.18) as shown below.
Theorem 4.1.26 For any a ∈ I we have
\[
\lim_{n\to\infty} \frac{1}{n} \log(s_1^a s_2^a \cdots s_n^a) = -\frac{\pi^2}{12 \log 2} \quad \text{a.e. in } \Omega.
\]
More precisely, whatever ε > 0, for any a ∈ I we have
\[
\frac{1}{n} \log(s_1^a s_2^a \cdots s_n^a) = -\frac{\pi^2}{12 \log 2} + o\left( n^{-1/2} \log^{(3+\varepsilon)/2} n \right) \quad \text{a.e. in } \Omega
\]
as n → ∞, where the constant implied in o depends on both ε and the current
point ω ∈ Ω.
In particular, for a = 0 the above equations amount to
\[
\lim_{n\to\infty} \sqrt[n]{q_n} = e^{\pi^2/(12 \log 2)} \quad \text{a.e. in } \Omega \tag{4.1.19}
\]
and
\[
\sqrt[n]{q_n} = e^{\pi^2/(12 \log 2)} + o\left( n^{-1/2} \log^{(3+\varepsilon)/2} n \right) \quad \text{a.e. in } \Omega \tag{4.1.20}
\]
as n → ∞, respectively.
Proof. By the mean value theorem we have
\[
\left| \frac{\log x - \log y}{x - y} \right| \le \frac{1}{\min(x, y)}
\]
for any 0 < x, y ≤ 1, x ≠ y. Next, note that
\[
0 < \frac{v(i^{(k)})}{u(i^{(k)})} - 1 \le \max\left( \frac{1}{F_{k-1} F_{k+1}}, \frac{1}{F_k^2} \right)
\]
for any fundamental interval $I(i^{(k)}) = \Omega \cap \left( u(i^{(k)}), v(i^{(k)}) \right)$, $i^{(k)} \in \mathbb{N}_+^k$, k ∈ N+.
This follows easily from (1.1.12), (1.1.13), and Theorem 1.1.2.
Consequently, for any k ∈ N+ and a ∈ I we have
\[
\left| \log \bar{s}_k - \log s_k^a \right| \le \max\left( \frac{1}{F_{k-1} F_{k+1}}, \frac{1}{F_k^2} \right) = O(g^{2k}) \tag{4.1.21}
\]
as k → ∞, whatever the current point (ω, θ) ∈ Ω.

Clearly, by (4.1.18) and (4.1.21) the proof is complete for any a ∈ I. In
the special case a = 0 we only should note that
\[
s_k^0 = \frac{q_{k-1}}{q_k}, \quad k \in \mathbb{N}_+.
\]
□
Remark. The convergence rate in Theorem 4.1.26 with a = 0 is slightly
better than that derived by Philipp (1967, p. 122). Equation (4.1.19) was
first derived by Lévy (1929) using a different method. □
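Equation (4.1.19) lends itself to a quick numerical experiment. The sketch below is an illustration only, under the assumption that a random rational with a 6000-bit denominator is an adequate stand-in for a typical ω ∈ Ω over its first 1000 digits; the helper names, the seed, and both sizes are arbitrary choices made here.

```python
import math
import random

def rcf_digits(p, q, n):
    """First n RCF digits a_1, a_2, ... of p/q with 0 < p < q (fewer if the expansion ends)."""
    digits = []
    while p and len(digits) < n:
        a, r = divmod(q, p)
        digits.append(a)
        q, p = p, r
    return digits

def denominator(digits):
    """q_n from q_k = a_k q_{k-1} + q_{k-2}, with q_{-1} = 0 and q_0 = 1."""
    q_prev, q = 0, 1
    for a in digits:
        q_prev, q = q, a * q + q_prev
    return q

random.seed(0)                      # arbitrary seed, for reproducibility only
BITS = 6000                         # precision of the rational stand-in for omega
p = random.getrandbits(BITS) | 1    # odd numerator, so p / 2**BITS is in lowest terms
digits = rcf_digits(p, 1 << BITS, 1000)
estimate = math.log(denominator(digits)) / len(digits)   # (1/n) log q_n
levy = math.pi ** 2 / (12 * math.log(2))                 # limit in (4.1.19), in log form
print(len(digits), estimate, levy)
```

The two printed values should be close but not equal: by (4.1.20) the error decays only slightly faster than $n^{-1/2}$ up to logarithmic factors, so a few percent of discrepancy at n = 1000 is to be expected.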
Corollary 4.1.27 We have
\[
\lim_{n\to\infty} \frac{1}{n} \log \left| \omega - \frac{p_n}{q_n} \right| = -\frac{\pi^2}{6 \log 2} \quad \text{a.e. in } \Omega
\]
and, for any ε > 0,
\[
\frac{1}{n} \log \left| \omega - \frac{p_n}{q_n} \right| = -\frac{\pi^2}{6 \log 2} + o\left( n^{-1/2} \log^{(3+\varepsilon)/2} n \right) \quad \text{a.e. in } \Omega
\]
as n → ∞, where the constant implied in o depends on both ε and ω ∈ Ω.
Proof. It follows from (1.1.16) that for any ω ∈ Ω and n ∈ N we have
\[
\frac{1}{2 q_{n+1}^2} < \left| \omega - \frac{p_n}{q_n} \right| < \frac{1}{q_n^2}.
\]
Then the results stated are immediate consequences of equations (4.1.19)
and (4.1.20). □
Corollary 4.1.28 We have
\[
\lim_{n\to\infty} \frac{1}{n} \log \lambda\left( I(a_1, \cdots, a_n) \right) = -\frac{\pi^2}{6 \log 2} \quad \text{a.e. in } \Omega
\]
and, for any ε > 0,
\[
\frac{1}{n} \log \lambda\left( I(a_1, \cdots, a_n) \right) = -\frac{\pi^2}{6 \log 2} + o\left( n^{-1/2} \log^{(3+\varepsilon)/2} n \right) \quad \text{a.e. in } \Omega
\]
as n → ∞, where the constant implied in o depends on both ε and ω ∈ Ω.
Proof. By (1.2.2) and (1.2.5) we have
\[
\log \lambda\left( I(a_1, \cdots, a_n) \right) = -2 \log q_n - \log(s_n + 1), \quad n \in \mathbb{N}_+.
\]
Since $s_n \in I$, the results stated are again immediate consequences of equations
(4.1.19) and (4.1.20). □
Remark. The result above implies that the entropy H(τ) of the continued
fraction transformation τ is equal to $\pi^2/(6 \log 2)$. See e.g., Billingsley (1965,
p. 134). □
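Corollary 4.1.29 For any ε > 0 we have
\[
\sqrt[n]{p_n(\omega)} = \omega^{1/n} e^{\pi^2/(12 \log 2)} + o\left( n^{-1/2} \log^{(3+\varepsilon)/2} n \right) \quad \text{a.e. in } \Omega
\]
as n → ∞, where the constant implied in o depends on both ε and ω ∈ Ω.

The proof follows from the inequality
\[
\left| \sqrt[n]{p_n(\omega)} - \sqrt[n]{\omega \, q_n(\omega)} \right| \le \frac{1}{F_{n+1} F_n^{(n-1)/n}}, \quad \omega \in \Omega, \ n \in \mathbb{N}_+,
\]
which can be easily checked. □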


Corollary 4.1.30 (Khinchin's fundamental theorem of Diophantine approximation)
Let $f : \mathbb{N}_+ \to \mathbb{R}_{++}$.
(i) If $\sum_{i\in\mathbb{N}_+} f(i) = \infty$ and $i f(i) \ge (i+1) f(i+1)$, i ∈ N+, then a.e. in
Ω the inequality
\[
\left| \omega - \frac{p}{q} \right| < \frac{f(q)}{q}
\]
has infinitely many solutions in integers p, q ∈ N+ with g.c.d.(p, q) = 1.
(ii) If $\sum_{i\in\mathbb{N}_+} f(i) < \infty$, then a.e. in Ω the above inequality has at most
finitely many solutions in integers p, q ∈ N+ with g.c.d.(p, q) = 1.
The proof follows from Theorem 4.1.26 with a = 0 and F. Bernstein's
theorem (Proposition 1.3.16). See, e.g., Billingsley (1965, p. 48). □

4.2 Other continued fraction expansions

4.2.1 Preliminaries

In this section we study a large class of continued fraction expansions which
can be derived from the RCF expansion. Before defining them formally let
us briefly describe the underlying idea.

The following rather old and well known remark is fundamental. For
a ∈ Z, b ∈ N+ and x ∈ [0, 1) we have
\[
a + \cfrac{1}{1 + \cfrac{1}{b + x}} = a + 1 + \frac{-1}{b + 1 + x}.
\]
This operation is called a singularization. We have singularized the digit 1
in [ · ; · · · , a, 1, b, · · · ].
The effect of a singularization is that a new and shorter continued frac-
tion expansion is obtained. Moreover, we will see that the sequence of
convergents associated with the ‘new’ continued fraction expansion is a sub-
sequence of the sequence of convergents of the ‘old’ one. For example, given
n ∈ N+ , if we singularize the digit an+1 (ω) = 1 in the RCF expansion of
some ω ∈ Ω, then the sequence of convergents of the ‘new’ continued frac-
tion expansion is obtained by deleting the nth term from the sequence of
RCF convergents of ω. Obviously, the ‘new’ continued fraction expansion is
no longer an RCF expansion!
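The identity behind a singularization can be confirmed with exact rational arithmetic. The following sketch uses Python's `fractions` module; the function names `before` and `after` and the grid of test values are illustrative assumptions.

```python
from fractions import Fraction

def before(a, b, x):
    """a + 1/(1 + 1/(b + x)): the digit 1 sits between a and b."""
    return a + 1 / (1 + 1 / (b + x))

def after(a, b, x):
    """a + 1 + (-1)/(b + 1 + x): the digit 1 has been singularized."""
    return a + 1 + Fraction(-1) / (b + 1 + x)

# Exact comparison on a small grid of a in Z, b in N+, x in [0, 1).
checks = [
    before(a, b, Fraction(k, 7)) == after(a, b, Fraction(k, 7))
    for a in range(-3, 4) for b in range(1, 6) for k in range(7)
]
print(all(checks))
```

Because the arithmetic is exact, equality here is a genuine check of the identity at the sampled points rather than a floating-point coincidence.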
Starting from the RCF expansion of a given x ∈ [0, 1) it is not possible
(i) to singularize two consecutive digits equal to 1, and (ii) to singularize
digits other than 1.
It is also important to note that once we have singled out digits equal to
1 to be singularized, the order in which they are singularized has no impact
on the final result. Of course, just one singularization does not make the
new expansion ‘really faster’ than the old one. However, many algorithms
can be devised such that for almost all x ∈ [0, 1) infinitely many convergents
are skipped. Before considering such algorithms, let us fix notation.
Let x ∈ [0, 1) with RCF expansion
\[
x = [a_1, a_2, \cdots].
\]
Any finite or infinite string of consecutive digits
\[
a_k(x) = 1, \ a_{k+1}(x) = 1, \ \cdots, \ a_{k+n-1}(x) = 1, \quad k \in \mathbb{N}_+, \ n \in \mathbb{N}_+ \cup \{\infty\},
\]
is called a 1-block if either k = 1 and $a_{k+n}(x) \ne 1$ (if n is finite) or k > 1 and
$a_{k-1}(x) \ne 1$, $a_{k+n}(x) \ne 1$ (if n is finite). The first algorithm we consider is:

A For any x ∈ [0, 1) singularize the first, third, fifth, etc., components in
any 1-block.

Applying algorithm A to a (finite or infinite) RCF expansion $[a_1, a_2, \cdots]$
yields a (finite or infinite) continued fraction of the form
\[
b_0 + \cfrac{e_1}{b_1 + \cfrac{e_2}{b_2 + \ddots}} \tag{4.2.1}
\]
or $[b_0; e_1/b_1, e_2/b_2, \cdots]$, for short. In (4.2.1) we have $b_0 \in \{0, 1\}$, $b_n \in \mathbb{N}_+$,
$e_n \in \{-1, 1\}$, and $b_n + e_{n+1} \ge 2$, n ∈ N+.

Example 4.2.1 Let $x = (-3 + \sqrt{17})/2 = 0.56155 \cdots$. As a quadratic
irrationality x should have a periodic RCF expansion (see Subsection 1.1.3).
We easily find that
\[
x = [0; 1, 1, 3, 1, 1, 3, \cdots] = [0; \overline{1, 1, 3}].
\]
Applying algorithm A to the RCF expansion of x yields
\[
x = [1; -1/2, 1/4, -1/2, 1/4, \cdots]
\]
or $x = [1; \overline{-1/2, 1/4}]$, for short. □
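Algorithm A can be tried on a finite truncation of this expansion. The sketch below is one possible implementation under assumptions made here: a CF is stored as two Python lists `a = [a0, a1, ...]` and `e = [1, e1, ...]` (the entry `e[0]` is an unused placeholder), a trailing digit 1 with no successor is left alone, and all function names are hypothetical.

```python
from fractions import Fraction

def singularize(a, e, j):
    """Singularize the digit a[j] == 1 (with e[j+1] == 1); returns new lists."""
    a, e = a[:], e[:]
    assert a[j] == 1 and e[j + 1] == 1
    a[j - 1] += e[j]      # the preceding digit absorbs e_j
    a[j + 1] += 1         # the following digit grows by 1
    e[j + 1] = -e[j]      # the sign in front of the following digit flips
    del a[j], e[j]        # the digit 1 disappears
    return a, e

def algorithm_a_positions(a):
    """Indices of the 1st, 3rd, 5th, ... digit of every 1-block among a[1:]."""
    n = len(a) - 1
    picks, j = [], 1
    while j <= n:
        if a[j] != 1:
            j += 1
            continue
        k = j
        while k <= n and a[k] == 1:
            k += 1                       # the block occupies indices j .. k-1
        picks += [p for p in range(j, k, 2) if p < n]
        j = k
    return picks

def algorithm_a(a):
    e = [1] * len(a)
    # Work from right to left so that earlier indices stay valid after deletions.
    for j in sorted(algorithm_a_positions(a), reverse=True):
        a, e = singularize(a, e, j)
    return a, e

def value(a, e):
    """Exact value of the finite CF [a0; e1/a1, ..., en/an]."""
    x = Fraction(a[-1])
    for k in range(len(a) - 1, 0, -1):
        x = a[k - 1] + e[k] / x
    return x

rcf = [0, 1, 1, 3, 1, 1, 3, 1, 1, 3]     # truncation of [0; 1, 1, 3, 1, 1, 3, ...]
b, eps = algorithm_a(rcf)
print(b, eps[1:], value(b, eps) == value(rcf, [1] * len(rcf)))
```

On this truncation the result begins with 1; −1/2, 1/4, −1/2, 1/4, in line with the example; only the final digit differs from the periodic pattern, because the truncated expansion has no further 1-block after it.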

By the very construction, the convergents
\[
\frac{p_n^e}{q_n^e} := b_0 + \cfrac{e_1}{b_1 + \cfrac{e_2}{b_2 + \ddots + \cfrac{e_n}{b_n}}}, \quad n = 1, 2, \cdots,
\]
of (4.2.1) are a subset of the convergents of $[a_1, a_2, \cdots]$. Therefore in the
case of an infinite RCF expansion we have
\[
\lim_{n\to\infty} \frac{p_n^e}{q_n^e} = [a_1, a_2, \cdots].
\]

Several questions naturally arise:

(i) Are there other algorithms yielding continued fraction expansions with
the property above?
(ii) Does algorithm A always yield fastest continued fraction expansions?
Closest expansions? (The precise meaning of these terms will be explained
later. See Subsection 4.3.3. Informally, one would like the
denominators $q_n^e$, n ∈ N+, to grow as fast as possible and the approximation
coefficients associated with the new expansion to be as
small as possible.)
(iii) Is there an underlying ergodic transformation?


We can easily answer the first question. The second algorithm we con-
sider is:

B For any x ∈ [0, 1) singularize the last, third from last, fifth from last,
etc., components in any 1-block.

Example 4.2.2 Let x be as in Example 4.2.1. Applying algorithm B to
the RCF expansion of x yields
\[
x = [0; 1/2, -1/4, 1/2, -1/4, \cdots],
\]
or $x = [0; \overline{1/2, -1/4}]$, for short. □
Clearly, in general, algorithms A and B yield different results. Actually
it is possible to show that, in a sense, one cannot do better than either
of these algorithms. Since one can singularize just digits equal to 1, and
since two consecutive 1’s cannot be both singularized, it is not possible to
go faster than either algorithm A or algorithm B. Slower algorithms are trivially at
hand. Here is an example of such an algorithm:

C For any x ∈ [0, 1) singularize all digits an+1 (x) = 1 for which Θn (x) ≥
1/2 (see Subsection 1.3.2) whatever n ∈ N.

In Subsection 4.3.2 it is shown that algorithm C is well defined, that is, not
in conflict with the requirements of the singularization procedure.
Example 4.2.3 Let x be as in Example 4.2.1. A simple calculation
shows that the first four digits equal to 1 in the RCF expansion of x should
not be singularized if we apply algorithm C to it. □
From this example it is clear that, in general, algorithm C does not
yield expansions which are fastest. In Subsection 4.3.3 we will discuss an
algorithm which yields both fastest and closest expansions. This algorithm
was introduced by Selenius (1960) and—independently—by Bosma (1987),
and is called the optimal continued fraction (OCF) expansion. Finally, in
Subsection 4.2.5 we will answer question (iii) above.

4.2.2 Semi-regular continued fraction expansions


Apart from the RCF expansion there exist many so-called semi-regular continued
fraction expansions. To define the latter we start by defining a continued
fraction (CF) as a pair of two sets $e = (e_k)_{k\in M}$ and $(a_k)_{k\in\{0\}\cup M}$ of
integers with $e_k \in \{-1, 1\}$ and $a_0 \in \mathbb{Z}$, $a_k \in \mathbb{N}_+$, k ∈ M, where either
M = {k : 1 ≤ k ≤ n} for some n ∈ N+ or M = N+. Next, for arbitrary
indeterminates $x_i, y_i$, 1 ≤ i ≤ n, n ∈ N+, write
\[
[y_1/x_1] = \frac{y_1}{x_1}, \qquad [y_1/x_1, \cdots, y_n/x_n] = \frac{y_1}{x_1 + [y_2/x_2, \cdots, y_n/x_n]}, \quad n \ge 2.
\]
If card M = n ∈ N+ then we say that the CF considered has length n and
assign it the value
\[
[a_0; e_1/a_1, \cdots, e_n/a_n] := a_0 + [e_1/a_1, \cdots, e_n/a_n]
= a_0 + \cfrac{e_1}{a_1 + \cfrac{e_2}{a_2 + \ddots + \cfrac{e_n}{a_n}}} \in \mathbb{R} \cup \{-\infty, \infty\}.
\]
If M = N+ then we say that the CF considered is infinite and look at it as
the sequence
\[
\left( (e_k)_{1\le k\le n}, (a_k)_{0\le k\le n} \right)_{n\in\mathbb{N}_+}
\]
of all finite CF's which are obtained by finite truncation. In both cases we
can associate with a CF its convergents
\[
\frac{p_0^e}{q_0^e} := a_0, \qquad \frac{p_k^e}{q_k^e} := [a_0; e_1/a_1, \cdots, e_k/a_k], \quad 1 \le k \le n,
\]
for either some n ∈ N+ or any n ∈ N+, with $p_0^e = a_0$, $q_0^e = 1$, $p_k^e \in \mathbb{Z}$, $q_k^e \in \mathbb{N}_+$,
g.c.d.$(|p_k^e|, q_k^e) = 1$, 1 ≤ k ≤ n.
To ensure the convergence of the sequence of convergents of an infinite
CF, which would enable us to speak of a CF expansion, additional conditions
should be imposed on the $e_k$ and $a_k$, k ∈ N+. One possibility, yielding the
so-called semi-regular continued fraction (SRCF) expansion, is to ask that
$e_{i+1} + a_i \ge 1$, i ∈ N+, and $e_{i+1} + a_i \ge 2$ infinitely often (in the infinite
case). It can be shown that the sequence of convergents of an infinite SRCF
expansion converges to an irrational number. See Tietze (1913) [cf. Perron
(1954, §37)]. This will be written as
\[
\lim_{k\to\infty} \frac{p_k^e}{q_k^e} := [a_0; e_1/a_1, e_2/a_2, \cdots].
\]
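As in the RCF expansion case a matrix theory is associated with an
SRCF expansion (or, more generally, with a CF). Consider (cf. Remark
preceding Proposition 1.1.1) the matrices
\[
A_0^e := \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & a_0 \end{pmatrix} = \begin{pmatrix} 1 & a_0 \\ 0 & 1 \end{pmatrix}, \qquad
A_n^e := \begin{pmatrix} 0 & e_n \\ 1 & a_n \end{pmatrix}, \quad n \in \mathbb{N}_+,
\]
and
\[
M_n^e := A_0^e \cdots A_n^e, \quad n \in \mathbb{N}.
\]
Clearly,
\[
\det M_0^e = 1, \qquad \det M_n^e = (-1)^n e_1 \cdots e_n, \quad n \in \mathbb{N}_+. \tag{4.2.2}
\]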

One can prove that
\[
M_n^e = \begin{pmatrix} p_{n-1}^e & p_n^e \\ q_{n-1}^e & q_n^e \end{pmatrix}, \quad n \in \mathbb{N}, \tag{4.2.3}
\]
with $p_{-1}^e = 1$, $q_{-1}^e = 0$, which implies that the sequences $(p_n^e)_{n\in\mathbb{N}}$ and
$(q_n^e)_{n\in\mathbb{N}}$ satisfy the recurrence relations
\[
p_n^e = a_n p_{n-1}^e + e_n p_{n-2}^e, \qquad q_n^e = a_n q_{n-1}^e + e_n q_{n-2}^e, \quad n \in \mathbb{N}_+.
\]
The second equation above implies at once that
\[
s_n^e := \frac{q_{n-1}^e}{q_n^e} = [1/a_n, e_n/a_{n-1}, \cdots, e_2/a_1], \quad n > 1, \tag{4.2.4}
\]
and clearly $s_1^e := q_0^e/q_1^e = 1/a_1$. It follows from (4.2.2) and (4.2.3) that
\[
p_{-1}^e q_0^e - p_0^e q_{-1}^e = 1, \qquad
p_{n-1}^e q_n^e - p_n^e q_{n-1}^e = (-1)^n e_1 \cdots e_n, \quad n \in \mathbb{N}_+,
\]
showing that indeed g.c.d.$(|p_n^e|, q_n^e) = 1$, n ∈ N.
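The recurrence relations and the determinant identity can be exercised on a small example. In the sketch below the list layout (`a = [a0, ..., an]`, `e = [1, e1, ..., en]` with `e[0]` an unused placeholder) and the function name are assumptions made for illustration; the sample CF is the finite expansion [1; −1/2, 1/4, −1/2, 1/4].

```python
from math import gcd

def convergents(a, e):
    """Pairs (p_k, q_k), k = 0..n, from p_k = a_k p_{k-1} + e_k p_{k-2} (same for q)."""
    p_prev, q_prev = 1, 0          # p_{-1}, q_{-1}
    p, q = a[0], 1                 # p_0, q_0
    out = [(p, q)]
    for k in range(1, len(a)):
        p_prev, p = p, a[k] * p + e[k] * p_prev
        q_prev, q = q, a[k] * q + e[k] * q_prev
        out.append((p, q))
    return out

a = [1, 2, 4, 2, 4]
e = [1, -1, 1, -1, 1]
conv = convergents(a, e)

# p_{n-1} q_n - p_n q_{n-1} = (-1)^n e_1 ... e_n, as in (4.2.2)-(4.2.3).
sign = 1
determinant_ok = True
for n in range(1, len(a)):
    sign *= -e[n]
    (p0, q0), (p1, q1) = conv[n - 1], conv[n]
    determinant_ok = determinant_ok and (p0 * q1 - p1 * q0 == sign)

print(conv, determinant_ok, all(gcd(abs(p), q) == 1 for p, q in conv))
```

The last convergent, 41/73 = 0.5616…, is already close to the number 0.56155… of Example 4.2.1, whose algorithm A expansion starts with these digits.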
Next (see again the RCF expansion case), looking at $M_n^e$ as a Möbius
transformation one can show that
\[
M_n^e(0) = \frac{p_n^e}{q_n^e}, \quad n \in \mathbb{N}.
\]
More generally,
\[
M_n^e(z) = \frac{p_n^e + z p_{n-1}^e}{q_n^e + z q_{n-1}^e} = [a_0; e_1/a_1, \cdots, e_{n-1}/a_{n-1}, e_n/(a_n + z)], \quad n \ge 2,
\]
for any z ∈ C, $z \ne -1/s_n^e$, and
\[
a_0 + \frac{e_1}{a_1 + z} = M_1^e(z) = \frac{p_1^e + z p_0^e}{q_1^e + z q_0^e}
\]
for any z ∈ C, $z \ne -1/s_1^e$. It follows that putting $t_n^e = [e_{n+1}/a_{n+1}, \cdots]$, n ∈ N,
we have
\[
a_0 + t_0^e = \frac{p_n^e + t_n^e p_{n-1}^e}{q_n^e + t_n^e q_{n-1}^e}, \quad n \in \mathbb{N}.
\]
Finally, defining
\[
\Theta_n^e(a_0 + t_0^e) = (q_n^e)^2 \left| a_0 + t_0^e - \frac{p_n^e}{q_n^e} \right|, \quad n \in \mathbb{N},
\]
it is easy to check that
\[
\Theta_n^e(a_0 + t_0^e) = \frac{e_{n+1} t_n^e}{s_n^e t_n^e + 1} = \frac{|t_n^e|}{s_n^e t_n^e + 1}, \quad n \in \mathbb{N}. \tag{4.2.5}
\]
Since
\[
(t_n^e)^{-1} = e_{n+1}(a_{n+1} + t_{n+1}^e), \qquad s_n^e + e_{n+1} a_{n+1} = \frac{e_{n+1}}{s_{n+1}^e}, \quad n \in \mathbb{N},
\]
we also have
\[
\Theta_n^e(a_0 + t_0^e) = \frac{s_{n+1}^e}{s_{n+1}^e t_{n+1}^e + 1}, \quad n \in \mathbb{N}. \tag{4.2.6}
\]
The numbers $\Theta_n^e$, n ∈ N, associated with a (finite or infinite) SRCF
expansion are called its approximation coefficients. Compare with the RCF
expansion case in Subsection 1.3.2.
We conclude this subsection with a few examples of well known SRCF
expansions.

1. The RCF expansion: this is the SRCF expansion for which $e_n = 1$ for
any n ∈ N+.

2. Nakada's α-expansions for α ∈ [1/2, 1]: see Subsection 4.3.1.

3. The nearest integer continued fraction (NICF) expansion: this is the
SRCF expansion for which $e_{n+1} + a_n \ge 2$ for any n ∈ N+. It was introduced
by Minnigerode (1873) and studied by Hurwitz (1889). Actually,
the NICF expansion is the 1/2-expansion, and is obtained by applying
algorithm A defined in Subsection 4.2.1 to the RCF expansion.

4. The singular continued fraction (SCF) expansion: this is the SRCF
expansion for which $e_n + a_n \ge 2$, n ∈ N+. It was introduced by
Hurwitz (1889). Actually, the SCF expansion is the g-expansion with
$g = (\sqrt{5} - 1)/2$, the golden ratio, and is obtained by applying algorithm
B defined in Subsection 4.2.1 to the RCF expansion.

5. Minkowski's diagonal continued fraction (DCF) expansion: this is the
SRCF expansion which is obtained by applying algorithm C defined
in Subsection 4.2.1 to the RCF expansion. See Subsection 4.3.2.

6. The continued fraction with odd incomplete quotients (Odd CF) expansion:
this is the SRCF expansion for which $e_1 = 1$, $a_n \equiv 1 \bmod 2$,
$e_{n+1} + a_n \ge 2$, n ∈ N+. It was introduced by Rieger (1981a) [see
also Barbolosi (1990), Hartono and Kraaikamp (2002), and Schweiger
(1995, Ch. 3)].

7. The continued fraction with even incomplete quotients (Even CF) expansion:
this is the SRCF expansion for which $e_1 = 1$, $a_n \equiv 0 \bmod 2$,
$e_{n+1} + a_n \ge 2$, n ∈ N+. See also Kraaikamp and Lopes (1996) and
Schweiger (1995, Ch. 3).

4.2.3 The singularization process


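The following two easily checked identities are fundamental for the theory
which we develop in this section:
\[
\begin{pmatrix} 1 & a \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 0 & c \\ 1 & 1 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & b \end{pmatrix}
= \begin{pmatrix} 1 & a + c \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 0 & -c \\ 1 & b + 1 \end{pmatrix}, \tag{4.2.7}
\]
\[
\begin{pmatrix} 0 & c \\ 1 & a \end{pmatrix} \begin{pmatrix} 0 & d \\ 1 & 1 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & b \end{pmatrix}
= \begin{pmatrix} 0 & c \\ 1 & a + d \end{pmatrix} \begin{pmatrix} 0 & -d \\ 1 & b + 1 \end{pmatrix}, \tag{4.2.8}
\]
where a, b, c and d are arbitrary real or complex numbers.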
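Both identities are polynomial in a, b, c, d, so checking them on a grid of integers is a reasonable sanity test. The sketch below uses plain tuples for 2×2 matrices; the helper names `mul`, `check_427` and `check_428` are hypothetical.

```python
def mul(A, B):
    """Product of two 2x2 matrices given as ((a, b), (c, d))."""
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

def check_427(a, b, c):
    left = mul(mul(((1, a), (0, 1)), ((0, c), (1, 1))), ((0, 1), (1, b)))
    right = mul(((1, a + c), (0, 1)), ((0, -c), (1, b + 1)))
    return left == right

def check_428(a, b, c, d):
    left = mul(mul(((0, c), (1, a)), ((0, d), (1, 1))), ((0, 1), (1, b)))
    right = mul(((0, c), (1, a + d)), ((0, -d), (1, b + 1)))
    return left == right

values = range(-3, 4)
ok_427 = all(check_427(a, b, c) for a in values for b in values for c in values)
ok_428 = all(check_428(a, b, c, d)
             for a in values for b in values for c in values for d in values)
print(ok_427, ok_428)
```

In the application to singularization one takes c = e_{ℓ+1} (or d = e_{ℓ+1}), which is why the sign in front of the following digit flips.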


Let
\[
(e_k)_{k\in M}, \quad (a_k)_{k\in\{0\}\cup M} \tag{4.2.9}
\]
be a (finite or infinite) CF with $a_{\ell+1} = 1$, $e_{\ell+2} = 1$ for some ℓ ∈ N for which
ℓ + 2 ∈ M. The transformation $\sigma_\ell$ which takes (4.2.9) into the CF
\[
(\tilde{e}_k)_{k\in M\setminus\{\ell+1\}}, \quad (\tilde{a}_k)_{k\in\{0\}\cup(M\setminus\{\ell+1\})} \tag{4.2.10}
\]
with $\tilde{e}_k = e_k$, k ∈ M, k < ℓ + 1 or k ≥ ℓ + 3, $\tilde{e}_{\ell+2} = -e_{\ell+1}$, $\tilde{a}_k = a_k$,
k ∈ {0} ∪ M, k < ℓ or k ≥ ℓ + 3, $\tilde{a}_\ell = a_\ell + e_{\ell+1}$, $\tilde{a}_{\ell+2} = a_{\ell+2} + 1$, is called a
singularization of the pair $(a_{\ell+1}, e_{\ell+2})$.
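Let $(p_k^e/q_k^e)_{k\in\{0\}\cup M}$ and $(\tilde{p}_k^e/\tilde{q}_k^e)_{k\in\{0\}\cup(M\setminus\{\ell+1\})}$ be the sets of convergents
associated with (4.2.9) and (4.2.10), respectively. We are going to derive the
relationship between these sets. Let $(M_k^e)_{k\in\{0\}\cup M}$ and $(\widetilde{M}_k^e)_{k\in\{0\}\cup(M\setminus\{\ell+1\})}$
be the sets of matrices defined in the preceding subsection, associated with
(4.2.9) and (4.2.10), respectively. We have
\[
\begin{pmatrix} \tilde{p}_k^e \\ \tilde{q}_k^e \end{pmatrix} = \widetilde{M}_k^e \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \quad k \in \{0\} \cup (M\setminus\{\ell+1\}).
\]
Clearly, $\widetilde{M}_k^e = M_k^e$ for k < ℓ and, moreover, by (4.2.7) and (4.2.8) we have
$\widetilde{M}_k^e = M_{k+1}^e$ for k ≥ ℓ + 1. The matrix $\widetilde{M}_\ell^e$ will then be given by
\[
\widetilde{M}_\ell^e = M_{\ell-1}^e \begin{pmatrix} 0 & e_\ell \\ 1 & a_\ell + e_{\ell+1} \end{pmatrix}
\]
with $M_{-1}^e := \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ and $e_0 = 1$. Hence
\[
\widetilde{M}_\ell^e = M_{\ell+1}^e \begin{pmatrix} 0 & e_{\ell+1} \\ 1 & 1 \end{pmatrix}^{-1} \begin{pmatrix} 0 & e_\ell \\ 1 & a_\ell \end{pmatrix}^{-1} \begin{pmatrix} 0 & e_\ell \\ 1 & a_\ell + e_{\ell+1} \end{pmatrix}
= M_{\ell+1}^e \begin{pmatrix} -e_{\ell+1} & 0 \\ e_{\ell+1} & 1 \end{pmatrix}.
\]
Therefore
\[
\begin{pmatrix} \tilde{p}_\ell^e \\ \tilde{q}_\ell^e \end{pmatrix} = M_{\ell+1}^e \begin{pmatrix} -e_{\ell+1} & 0 \\ e_{\ell+1} & 1 \end{pmatrix} \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} p_{\ell+1}^e \\ q_{\ell+1}^e \end{pmatrix},
\]
and we can state the following result.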

and we can state the following result.


Proposition 4.2.4 Let ℓ ∈ N such that ℓ + 2 ∈ M. The set of convergents
\[
(\tilde{p}_k^e/\tilde{q}_k^e)_{k\in\{0\}\cup(M\setminus\{\ell+1\})}
\]
resulting after the singularization $\sigma_\ell$ of the pair $(a_{\ell+1}, e_{\ell+2}) = (1, 1)$, is
obtained by deleting $p_\ell^e/q_\ell^e$ from the set $(p_k^e/q_k^e)_{k\in\{0\}\cup M}$.
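Proposition 4.2.4 can be observed directly on a short RCF expansion. The sketch below is an illustration under assumptions made here: CFs are stored as lists `a = [a0, ..., an]` and `e = [1, e1, ..., en]` (with `e[0]` an unused placeholder), and the Python index j of the singularized digit corresponds to ℓ + 1 in the text.

```python
def convergents(a, e):
    """Pairs (p_k, q_k), k = 0..n, from p_k = a_k p_{k-1} + e_k p_{k-2} (same for q)."""
    p_prev, q_prev = 1, 0
    p, q = a[0], 1
    out = [(p, q)]
    for k in range(1, len(a)):
        p_prev, p = p, a[k] * p + e[k] * p_prev
        q_prev, q = q, a[k] * q + e[k] * q_prev
        out.append((p, q))
    return out

def singularize(a, e, j):
    """Singularize the pair (a[j], e[j+1]) = (1, 1); returns new lists."""
    a, e = a[:], e[:]
    assert a[j] == 1 and e[j + 1] == 1
    a[j - 1] += e[j]
    a[j + 1] += 1
    e[j + 1] = -e[j]
    del a[j], e[j]
    return a, e

rcf = [0, 1, 1, 3, 1, 1, 3]
signs = [1] * len(rcf)
old = convergents(rcf, signs)

results = {}
for j in (1, 4):                     # digits a_1 = 1 and a_4 = 1, i.e. l = 0 and l = 3
    new_a, new_e = singularize(rcf, signs, j)
    results[j] = convergents(new_a, new_e)
    # Expected: the old list with the entry of index l = j - 1 removed.
    print(j, results[j] == old[:j - 1] + old[j:])
```

Singularizing a_4 = 1, for instance, removes the convergent 4/7 and leaves 0/1, 1/1, 1/2, 5/9, 9/16, 32/57.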
In what follows a singularization process will consist of a set S of continued
fractions and a rule which determines in an unambiguous way the pairs
$a_{\ell+1} = 1$, $e_{\ell+2} = 1$ that should be singularized for any member of S.

Remark. For an infinite CF the sequence of convergents of the 'new' CF
obtained after singularization, is a subsequence of the sequence of convergents
of the 'old' one. Therefore if the 'old' CF converged to x, so does the
'new' one, and it converges faster. In particular, this holds for any SRCF
expansion to be singularized.

4.2.4 S-expansions
From now on we will concentrate on one special singularization process.
The set S of continued fraction expansions to be singularized is the set of
all (finite or infinite) RCF expansions. Since in this case all the e's are
+1, we will speak of singularizing $a_{\ell+1} = 1$ instead of singularizing the pair
$a_{\ell+1} = 1$, $e_{\ell+2} = 1$.
Before describing the general rule (as we should according to the definition
just given) remark that Example 4.2.1 actually describes a singularization
process: S plus algorithm A yield the NICF expansion! Now, notice
that algorithm A is equivalent to

singularize $a_{\ell+1} = 1$ if and only if $(\tau^\ell, s_\ell) \in S_A$, ℓ ∈ N,

where (cf. Subsection 1.3) $\tau^\ell = [a_{\ell+1}, a_{\ell+2}, \cdots]$, $s_\ell = [a_\ell, \cdots, a_1]$, ℓ ∈ N,
with $s_0 = 0$, and
\[
S_A = [1/2, 1) \times [0, g] \subset I^2.
\]
We recall that the golden ratios g and G are defined as
\[
g = \frac{\sqrt{5} - 1}{2}, \qquad G = g + 1.
\]
Similarly, we can verify that algorithm B, leading to Hurwitz' SCF expansion,
is equivalent to

singularize $a_{\ell+1} = 1$ if and only if $(\tau^\ell, s_\ell) \in S_B$, ℓ ∈ N,

where
\[
S_B = [g, 1) \times I \subset I^2.
\]
Finally, using properties of the approximation coefficients $\Theta_n$, n ∈ N, defined
in Subsection 1.3.2, we can also show that algorithm C, leading to
Minkowski's DCF expansion, is equivalent to

singularize $a_{\ell+1} = 1$ if and only if $(\tau^\ell, s_\ell) \in S_C$, ℓ ∈ N,

where
\[
S_C = \left\{ (x, y) \in I^2 ; \ \frac{x}{xy + 1} \ge \frac{1}{2} \right\}.
\]

These three examples lead to the idea of prescribing by a subset $S \subset I^2$
which digits $1 = a_{\ell+1}$ are to be singularized in the RCF expansion in the
form of the condition $(\tau^\ell, s_\ell) \in S$, ℓ ∈ N. Such an S cannot be just any set
but must satisfy the conditions
\[
S \subset [1/2, 1) \times I,
\]
since otherwise $a_{\ell+1}$ would not be equal to 1, and
\[
S \cap \bar{\tau}(S) \subset \{(g, g)\},
\]
since otherwise one would be forced to singularize two consecutive digits
both equal to 1, which is impossible. Thus we are led, in a natural way,
to the following definition which exactly describes all S-expansions.
Definition 4.2.5 A subset S of $I^2$ is said to be a singularization area if
and only if

(i) $S \in \mathcal{B}_I^2$ and $\bar{\gamma}(\partial S) = 0$;

(ii) $S \subset [1/2, 1) \times I$;

(iii) $S \cap \bar{\tau}(S) \subset \{(g, g)\}$.

If S is a singularization area, then the S-expansion of ω ∈ Ω is defined as the
SRCF expansion converging to ω which is obtained from the RCF expansion
of ω by singularizing a digit $1 = a_{\ell+1} = a_{\ell+1}(\omega)$ if and only if $(\tau^\ell, s_\ell) \in S$,
whatever ℓ ∈ N.
Remarks. 1. We need the continuity condition $\bar{\gamma}(\partial S) = 0$ in order to be
able to draw the following conclusion. Let A(S, n) be the random variable
defined as
\[
A(S, n) = \mathrm{card}\{ j : (\tau^j, s_j) \in S, \ 1 \le j \le n \}, \quad n \in \mathbb{N}_+.
\]
By Theorem 4.1.16(ii) we then have
\[
\lim_{n\to\infty} \frac{A(S, n)}{n} = \bar{\gamma}(S) \quad \text{a.e.}
\]
2. Actually, the sets $S_A$ and $S_B$ do not satisfy condition (iii). Indeed, in
both cases, $S \cap \bar{\tau}(S)$ is a line segment. Of course, this can be easily repaired
by taking
\[
S_A^* = ([1/2, g] \times [0, g]) \cup ((g, 1) \times [0, g))
\]
and
\[
S_B^* = ([g, 1) \times [0, g]) \cup ((g, 1) \times (g, 1])
\]
instead of $S_A$ and $S_B$, respectively.
3. Since
\[
\gamma([1/2, 1) \times I) = (\log 2)^{-1} \log \frac{4}{3} = 0.41503 \cdots,
\]
a singularization area S never can have γ-measure greater than 0.41503 · · · .
But condition (iii) forces the maximal possible γ-measure of a singularization
area S to be essentially smaller than 0.41503 · · · as shown below.
Proposition 4.2.6 For any singularization area S we have
\[
\gamma(S) \le 1 - \frac{\log G}{\log 2} = 0.30575 \cdots,
\]
where the bound is sharp.
Proof. Define $M_1 = S_A^*$ with $S_A^*$ as before and $M_2 = ([0, g) \times (g, 1]) \cup
([g, 1) \times [g, 1])$. It is easy to check that $M_2 = \bar{\tau}(M_1)$ and
\[
\gamma(M_1) = \gamma(M_2) = 1 - \frac{\log G}{\log 2}.
\]
Next, put $S_1 = S \cap M_1$ and $S_2 = S \cap M_2$. Clearly,
\[
\bar{\tau}(S_1) \cup S_2 \subset M_2
\]
and by Definition 4.2.5(iii) we have
\[
\bar{\tau}(S_1) \cap S_2 \subset \{(g, g)\},
\]
see also Figure 4.1.

We now see that
\[
\gamma(S) = \gamma(S_1) + \gamma(S_2) = \gamma(\bar{\tau}(S_1)) + \gamma(S_2) = \gamma(\bar{\tau}(S_1) \cup S_2)
\le \gamma(M_2) = 1 - \frac{\log G}{\log 2}.
\]

[Figure 4.1: $S = S_1 \cup S_2$ and $\bar{\tau}(S_1)$, drawn in the unit square with 1/2 and g marked on the horizontal axis and g on the vertical axis]

That a singularization area actually can have $\bar{\gamma}$-measure $1 - (\log 2)^{-1}(\log G)$
is shown by the cases of $S_A^*$ and $S_B^*$. □
On account of Proposition 4.2.6, a singularization area S will be called
maximal if
\[
\gamma(S) = 1 - \frac{\log G}{\log 2}.
\]
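The numerical values quoted for the strip [1/2, 1) × I and for the areas attached to algorithms A and B can be reproduced by integrating the density $1/(\log 2 \,(xy+1)^2)$ over rectangles. The sketch below is illustrative: the function name and step count are choices made here, and the inner integral over y is done in closed form, $\int_{y_0}^{y_1} dy/(xy+1)^2 = (y_1 - y_0)/((x y_0 + 1)(x y_1 + 1))$, so that only a one-dimensional midpoint rule is needed.

```python
import math

g = (math.sqrt(5) - 1) / 2
G = g + 1

def gauss2_rectangle(x0, x1, y0, y1, steps=20_000):
    """Measure of [x0, x1] x [y0, y1] under the density 1 / (log 2 * (x*y + 1)^2)."""
    h = (x1 - x0) / steps
    total = 0.0
    for i in range(steps):
        x = x0 + (i + 0.5) * h
        # Closed-form integral over y of 1/(x*y + 1)^2 on [y0, y1].
        total += (y1 - y0) / ((x * y0 + 1.0) * (x * y1 + 1.0))
    return total * h / math.log(2)

strip = gauss2_rectangle(0.5, 1.0, 0.0, 1.0)    # [1/2, 1) x I
area_A = gauss2_rectangle(0.5, 1.0, 0.0, g)     # [1/2, 1) x [0, g]
area_B = gauss2_rectangle(g, 1.0, 0.0, 1.0)     # [g, 1) x I
bound = 1 - math.log(G) / math.log(2)
print(strip, area_A, area_B, bound)
```

Boundary segments have measure zero, so the rectangles may be used in place of the repaired areas; both area values should agree with the bound 0.30575…, and the strip with 0.41503….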
Given a singularization area S, let $B_S$ be a subset of $I^2$ such that whatever
$\omega = [a_1, a_2, \cdots] \in \Omega$ any digit $1 = a_{\ell+1} = a_{\ell+1}(\omega)$ is unchanged by
S-singularization if and only if $(\tau^\ell, s_\ell) \in B_S$, ℓ ∈ N. Clearly, such a set (which
determines the occurrence of digits equal to 1 in the S-expansion) should
have the following properties:

(1) $B_S \subset [1/2, 1) \times I$ since $a_{\ell+1} = 1$;

(2) $B_S \cap S = \emptyset$ since $a_{\ell+1} = 1$ is not singularized;

(3) $\bar{\tau}^{-1}(B_S) \cap S = \emptyset$ since $a_\ell$ is not singularized;

(4) $\bar{\tau}(B_S) \cap S = \emptyset$ since $a_{\ell+2}$ is not singularized.

On account of the considerations above, the subset $B_S$ of $I^2$ defined as
\[
B_S = ([1/2, 1) \times I) \setminus (S \cup \bar{\tau}^{-1}(S) \cup \bar{\tau}(S))
\]
is called the preservation area of 1's.

We have the following result.


We have the following result.
Proposition 4.2.7 If S is maximal, then $\gamma(B_S) = 0$. In general, the
converse of this statement does not hold.
Proof. Let $M_1$, $M_2$, $S_1$ and $S_2$ be as in the proof of Proposition 4.2.6.
Put moreover
\[
B_1 = B_S \cap M_1, \qquad B_2 = B_S \cap M_2.
\]
It is now easy to see that
\begin{align*}
\bar{\tau}(B_1) \cap (\bar{\tau}(S_1) \cup S_2) &= \emptyset, & \bar{\tau}(B_1) \cup \bar{\tau}(S_1) \cup S_2 &\subset M_2, \\
B_2 \cap (\bar{\tau}(S_1) \cup S_2) &= \emptyset, & B_2 \cup \bar{\tau}(S_1) \cup S_2 &\subset M_2.
\end{align*}
Hence, since S is maximal,
\[
\bar{\gamma}(B_2) = 0, \qquad \bar{\gamma}(B_1) = \bar{\gamma}(\bar{\tau}(B_1)) = 0,
\]
which completes the proof. (The reader is invited to give an example where
the converse does not hold.) □
We conclude this subsection by deriving a number of results, which are
obtained as easy spin-off. Let S be a singularization area and ω ∈ Ω. As
the sequence $(\tilde{p}_k^e/\tilde{q}_k^e)_{k\in\mathbb{N}_+}$ of S-convergents of ω is a subsequence of the
sequence $(p_n/q_n)_{n\in\mathbb{N}_+}$ of its RCF convergents, there exists an increasing
random function $n_S : \mathbb{N}_+ \to \mathbb{N}_+$ such that
\[
\begin{pmatrix} \tilde{p}_k^e \\ \tilde{q}_k^e \end{pmatrix} = \begin{pmatrix} p_{n_S(k)} \\ q_{n_S(k)} \end{pmatrix}, \quad k \in \mathbb{N}_+.
\]
Theorem 4.2.8 Let S be a singularization area. Then
\[
\lim_{k\to\infty} \frac{n_S(k)}{k} = \frac{1}{1 - \gamma(S)} \quad \text{a.e.}
\]
Proof. It follows from the definition of $n_S$ that
\[
n_S(k) = k + \sum_{j=1}^{n_S(k)} I_S(\tau^j, s_j).
\]
Since $\bar{\gamma}(\partial S) = 0$, by Theorem 4.1.16(ii) we have
\[
1 = \lim_{k\to\infty} \frac{k}{n_S(k)} + \lim_{k\to\infty} \frac{1}{n_S(k)} \sum_{j=1}^{n_S(k)} I_S(\tau^j, s_j)
= \lim_{k\to\infty} \frac{k}{n_S(k)} + \bar{\gamma}(S) \quad \text{a.e.},
\]
whence the result stated. □


Remark. Theorem 4.2.8 implies that
\[
\lim_{k\to\infty} \frac{n_S(k)}{k} \le \frac{\log 2}{\log G} = 1.4404 \cdots \quad \text{a.e.},
\]
the upper bound being attained if and only if S is maximal. In words:
sparsest sequences of S-convergents are given by maximal singularization
areas. As the singularization area $S_A^*$ which yields the NICF is maximal, we
have thus re-proved a theorem of Adams (1979), see also Jager (1982) and
Nakada (1981). □
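The limit log 2 / log G for the NICF can be observed in a small simulation. This is a sketch under explicit assumptions: a random rational with a 30000-bit denominator stands in for a typical ω over its first 5000 digits, algorithm A is applied in its combinatorial form (first, third, fifth, ... digit of every 1-block), and n/(n − number of singularized digits) is used as a proxy for $n_S(k)/k$. All names, the seed, and the sizes are illustrative.

```python
import math
import random

def rcf_digits(p, q, n):
    """First n RCF digits a_1, a_2, ... of p/q with 0 < p < q (fewer if the expansion ends)."""
    digits = []
    while p and len(digits) < n:
        a, r = divmod(q, p)
        digits.append(a)
        q, p = p, r
    return digits

def singularized_by_algorithm_a(digits):
    """Number of digits algorithm A removes: 1st, 3rd, 5th, ... entry of each 1-block."""
    count = 0
    position_in_block = 0
    for a in digits:
        if a == 1:
            position_in_block += 1
            if position_in_block % 2 == 1:
                count += 1
        else:
            position_in_block = 0
    return count

random.seed(1)                      # arbitrary seed, for reproducibility only
BITS = 30_000
p = random.getrandbits(BITS) | 1    # odd numerator, so p / 2**BITS is in lowest terms
digits = rcf_digits(p, 1 << BITS, 5000)
n = len(digits)
k = n - singularized_by_algorithm_a(digits)   # surviving convergents among the first n
ratio = n / k
target = math.log(2) / math.log((1 + math.sqrt(5)) / 2)
print(n, k, ratio, target)
```

The observed ratio should fall near 1.44; the agreement is statistical, so it improves only slowly as the number of digits grows.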
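The following corollary gives the S-expansion analogues of two classical
results of P. Lévy in Subsection 4.1.3.
Corollary 4.2.9 Let S be a singularization area and let $(\tilde{p}_k^e/\tilde{q}_k^e)_{k\in\mathbb{N}_+}$ be
the corresponding sequence of S-convergents. Then
\[
\lim_{k\to\infty} \frac{1}{k} \log \tilde{q}_k^e = \frac{1}{1 - \bar{\gamma}(S)} \, \frac{\pi^2}{12 \log 2} \quad \text{a.e.},
\]
\[
\lim_{k\to\infty} \frac{1}{k} \log \left| \omega - \frac{\tilde{p}_k^e}{\tilde{q}_k^e} \right| = \frac{1}{1 - \bar{\gamma}(S)} \, \frac{-\pi^2}{6 \log 2} \quad \text{a.e.}
\]
Proof. This is an immediate consequence of Theorems 4.1.26 and 4.2.8.
We have
\[
\lim_{k\to\infty} \frac{1}{k} \log \tilde{q}_k^e = \lim_{k\to\infty} \frac{n_S(k)}{k} \, \frac{1}{n_S(k)} \log q_{n_S(k)} = \frac{1}{1 - \bar{\gamma}(S)} \, \frac{\pi^2}{12 \log 2} \quad \text{a.e.},
\]
and similarly for the second equation. □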

By the mechanism of singularization the collection of RCF convergents
that are deleted to obtain the S-convergents has the same cardinality as the
set of the $\tilde{e}_\ell$, ℓ ∈ N+, which are equal to −1. It is easy to see that
\[
n_S(k) - k = \frac{1}{2} \left( k - \sum_{\ell=1}^{k} \tilde{e}_\ell \right).
\]
Therefore we can state the following result.

Corollary 4.2.10 We have
\[
\lim_{k\to\infty} \frac{1}{k} \sum_{\ell=1}^{k} \tilde{e}_\ell = \frac{1 - 3\gamma(S)}{1 - \gamma(S)} \quad \text{a.e.}
\]
The minimum of the limit above is attained if and only if S is maximal, and
is equal to
\[
\frac{1}{\log G} \log \frac{G^3}{4} = 0.11915 \cdots.
\]
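The step from the counting identity to Corollary 4.2.10 is short enough to write out. The following is a sketch of that computation, using only the identity for $n_S(k) - k$ and the limit in Theorem 4.2.8:

```latex
\begin{align*}
\frac{1}{k} \sum_{\ell=1}^{k} \tilde{e}_\ell
  &= 1 - 2\,\frac{n_S(k) - k}{k} = 3 - 2\,\frac{n_S(k)}{k} \\
  &\longrightarrow 3 - \frac{2}{1 - \gamma(S)} = \frac{1 - 3\gamma(S)}{1 - \gamma(S)}
  \quad \text{a.e. as } k \to \infty .
\end{align*}
% The limit decreases as \gamma(S) increases, so it is smallest for maximal S,
% where 1 - \gamma(S) = \log G / \log 2:
\[
3 - \frac{2 \log 2}{\log G} = \frac{3 \log G - 2 \log 2}{\log G}
  = \frac{1}{\log G} \log \frac{G^3}{4} .
\]
```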

We conclude this subsection by giving the S-expansion analogue of Legendre's
theorem (see Corollary 1.2.4).
Theorem 4.2.11 Let
\[
A(t) = \left\{ (x, y) \in I^2 : x/(xy + 1) < t, \ y \in \mathbb{Q} \right\}, \quad 0 < t \le 1,
\]
and define
\[
c_S = \sup \left\{ t \in (0, 1] : A(t) \cap S = \emptyset \right\}.
\]
Put
\[
L_S = \min(c_S, 1/2).
\]
Let ω ∈ Ω and p, q ∈ N+ with g.c.d.(p, q) = 1, p < q. If
\[
\widetilde{\Theta} = \widetilde{\Theta}(\omega, p/q) = q^2 \left| \omega - \frac{p}{q} \right| < L_S,
\]
then p/q is an S-convergent of ω. The constant $L_S$ is best possible.
Proof. Suppose that $\widetilde{\Theta}(\omega, p/q) < L_S$ and that p/q is not an S-convergent
of ω. Since $L_S \le 1/2$, p/q is an RCF convergent of ω by Corollary 1.2.4,
i.e., there exists n ∈ N+ such that $p/q = p_n/q_n$. Now, since $p_n/q_n$ is not an
S-convergent, by the very definition of an S-expansion we have $(\tau^n, s_n) \in S$.
The definition of $L_S$ then implies
\[
\frac{\tau^n}{s_n \tau^n + 1} \ge c_S \ge L_S,
\]
which by the definition of the approximation coefficients in Subsection 1.3.2
yields
\[
\widetilde{\Theta}(\omega, p/q) = \Theta_n \ge L_S,
\]
contrary to the hypothesis. Finally, it follows from the definition of $L_S$ and
Corollary 1.2.4 that $L_S$ is best possible. □
Remarks. 1. Rieger (1979) and Adams (1979) gave a proof of Corollary
4.2.10 for the special case of the NICF expansion, using a formula of Spence
and Abel for the dilogarithm. We see that these transcendent techniques
can be avoided, which was also observed by Jager (1982).
2. An easy calculation shows that for $S = S_A^*$ (the singularization area
yielding the NICF expansion) we have
\[
L_S = g^2 = 0.38196 \cdots.
\]
This value was also found by Ito (1987) and by Jager and Kraaikamp (1989).
Their methods are different. Ito (op. cit.) developed a theory for determining
the Legendre constants for a class of continued fractions, larger than the
class of S-expansions. Unfortunately, his method is rather complicated.

4.2.5 Ergodic properties of S-expansions


In this subsection we show that for any S-expansion there exists an 'underlying'
two-dimensional ergodic dynamical system. These systems will be obtained
via an induced transformation from $(I^2, \mathcal{B}_I^2, \bar{\tau}, \bar{\gamma})$, the two-dimensional
ergodic dynamical system underlying the RCF expansion. Using the ergodic
dynamical systems thus obtained we will then deduce more metric and arithmetic
properties of S-expansions.
Let S be a singularization area and let $x = [a_0; a_1, a_2, \cdots] = a_0 +
[a_1, a_2, \cdots]$, $a_0 \in \mathbb{Z}$, $[a_1, a_2, \cdots] \in \Omega$. Denote by
\[
[\tilde{a}_0; \tilde{e}_1/\tilde{a}_1, \tilde{e}_2/\tilde{a}_2, \cdots]
\]
the S-expansion of x (cf. Subsection 4.2.3). Recall that this is an SRCF
expansion satisfying $\tilde{e}_{n+1} + \tilde{a}_n \ge 1$, n ∈ N+.
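As before let
\begin{align*}
\tau^n &= [a_{n+1}, a_{n+2}, \cdots], \quad n \in \mathbb{N}, \\
s_n &= [a_n, \cdots, a_1], \quad n \in \mathbb{N}_+, \quad s_0 = 0,
\end{align*}
and put
\[
\tilde{t}_n^e = [\tilde{e}_{n+1}/\tilde{a}_{n+1}, \cdots], \quad n \in \mathbb{N},
\]
\[
\tilde{s}_n^e = \begin{cases}
0 & \text{if } n = 0, \\
1/\tilde{a}_1 & \text{if } n = 1, \\
[1/\tilde{a}_n, \tilde{e}_n/\tilde{a}_{n-1}, \cdots, \tilde{e}_2/\tilde{a}_1] & \text{if } n > 1.
\end{cases}
\]
By equations (1.2.2) and (4.2.4) we have
\[
s_n = q_{n-1}/q_n, \qquad \tilde{s}_n^e = \tilde{q}_{n-1}^e/\tilde{q}_n^e, \quad n \in \mathbb{N},
\]
where $(p_n/q_n)_{n\in\mathbb{N}}$ and $(\tilde{p}_n^e/\tilde{q}_n^e)_{n\in\mathbb{N}}$ are the sequences of RCF convergents
and S-convergents of x, respectively. Also,
\[
x = \begin{cases}
\dfrac{p_n + \tau^n p_{n-1}}{q_n + \tau^n q_{n-1}}, \\[2ex]
\dfrac{\tilde{p}_k^e + \tilde{t}_k^e \tilde{p}_{k-1}^e}{\tilde{q}_k^e + \tilde{t}_k^e \tilde{q}_{k-1}^e}
\end{cases} \tag{4.2.11}
\]
for any k, n ∈ N, with $p_{-1} = \tilde{p}_{-1}^e = 1$, and $q_{-1} = \tilde{q}_{-1}^e = 0$. Finally, put
\[
\Delta := I^2 \setminus S, \qquad \Delta^- = \bar{\tau}(S), \qquad \Delta^+ = \Delta \setminus \Delta^-.
\]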

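Theorem 4.2.12 For any n ∈ N+ the following assertions hold:

(i) $(\tau^n, s_n) \in S$ if and only if $p_n/q_n$ is not an S-convergent;

(ii) if $p_n/q_n$ is not an S-convergent, then both $p_{n-1}/q_{n-1}$ and $p_{n+1}/q_{n+1}$
are S-convergents;

(iii) $(\tau^n, s_n) \in \Delta^+$ is equivalent to the existence of k = k(n) ∈ N such that
\[
\begin{cases} \tilde{p}_{k-1}^e = p_{n-1}, & \tilde{p}_k^e = p_n, \\ \tilde{q}_{k-1}^e = q_{n-1}, & \tilde{q}_k^e = q_n, \end{cases}
\quad \text{and} \quad
\begin{cases} \tilde{t}_k^e = \tau^n \ (\Rightarrow e_{k+1} = +1), \\ \tilde{s}_k^e = s_n; \end{cases}
\]

(iv) $(\tau^n, s_n) \in \Delta^-$ is equivalent to the existence of k = k(n) ∈ N such that
\[
\begin{cases} \tilde{p}_{k-1}^e = p_{n-2}, & \tilde{p}_k^e = p_n, \\ \tilde{q}_{k-1}^e = q_{n-2}, & \tilde{q}_k^e = q_n, \end{cases}
\quad \text{and} \quad
\begin{cases} \tilde{t}_k^e = -\tau^n/(\tau^n + 1) \ (\Rightarrow e_{k+1} = -1), \\ \tilde{s}_k^e = 1 - s_n. \end{cases}
\]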

Proof. (i) This follows directly from Definition 4.2.5 and Proposition
4.2.4.
(ii) This follows from the fact that in the sequence of RCF convergents
we cannot remove two or more consecutive convergents and still have a
sequence of convergents of some SRCF.
(iii) If $(\tau^n, s_n) \in \Delta^+$ then the very definition of $\Delta^+$ implies that
\[
(\tau^{n-1}, s_{n-1}) \notin S \quad \text{and} \quad (\tau^n, s_n) \notin S.
\]
Hence neither $a_n$ nor $a_{n+1}$ is singularized and therefore both $p_{n-1}/q_{n-1}$ and
$p_n/q_n$ are S-convergents. But then there exists k ∈ N+ such that
\[
\frac{\tilde{p}_{k-1}^e}{\tilde{q}_{k-1}^e} = \frac{p_{n-1}}{q_{n-1}}, \qquad \frac{\tilde{p}_k^e}{\tilde{q}_k^e} = \frac{p_n}{q_n}.
\]
Since all the fractions are in their lowest terms and their denominators are
positive we should have
\[
\tilde{p}_{k-1}^e = p_{n-1}, \quad \tilde{p}_k^e = p_n, \qquad \tilde{q}_{k-1}^e = q_{n-1}, \quad \tilde{q}_k^e = q_n.
\]
Then (4.2.11) implies that
\[
\frac{p_n + \tau^n p_{n-1}}{q_n + \tau^n q_{n-1}} = \frac{p_n + \tilde{t}_k^e p_{n-1}}{q_n + \tilde{t}_k^e q_{n-1}},
\]
hence $\tilde{t}_k^e = \tau^n$. Finally, we have
\[
\tilde{s}_k^e = \frac{\tilde{q}_{k-1}^e}{\tilde{q}_k^e} = \frac{q_{n-1}}{q_n} = s_n.
\]
The converse is obvious.

(iv) If $(\tau^n, s_n) \in \Delta^-$ then the very definition of $\Delta^-$ implies that
\[
(\tau^{n-1}, s_{n-1}) \in S \quad \text{and} \quad (\tau^n, s_n) \notin S.
\]
Hence $a_n = 1$, and it should be singularized according to Definition 4.2.5.
Then $p_{n-2}/q_{n-2}$ and $p_n/q_n$ are consecutive S-convergents by (ii). Again,
there exists k ∈ N+ such that
\[
\tilde{p}_{k-1}^e = p_{n-2}, \quad \tilde{p}_k^e = p_n, \qquad \tilde{q}_{k-1}^e = q_{n-2}, \quad \tilde{q}_k^e = q_n.
\]
Since
\begin{align*}
p_n &= a_n p_{n-1} + p_{n-2} = p_{n-1} + p_{n-2}, \\
q_n &= a_n q_{n-1} + q_{n-2} = q_{n-1} + q_{n-2}
\end{align*}
\hfill (4.2.12)

we have
\[
\tilde{s}_k^e = \frac{q_{n-2}}{q_n} = \frac{q_n - q_{n-1}}{q_n} = 1 - s_n.
\]
Next, from (4.2.11) we have
\[
\frac{p_n + \tau^n p_{n-1}}{q_n + \tau^n q_{n-1}} = \frac{p_n + \tilde{t}_k^e p_{n-2}}{q_n + \tilde{t}_k^e q_{n-2}},
\]
and using equations (4.2.12) and (1.1.12) we obtain
\[
\tilde{t}_k^e + \tilde{t}_k^e \tau^n + \tau^n = 0,
\]
whence
\[
\tilde{t}_k^e = -\frac{\tau^n}{\tau^n + 1}.
\]
The converse is obvious. □
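Now, define the transformation $\bar\tau_\Delta : \Delta \to \Delta$ as
\[
\bar\tau_\Delta(x, y) =
\begin{cases}
\bar\tau(x, y) & \text{if } \bar\tau(x, y) \notin S,\\[2pt]
\bar\tau^2(x, y) & \text{if } \bar\tau(x, y) \in S
\end{cases}
\]
for any $(x, y) \in \Delta = I^2 \setminus S$. This is a very simple instance of an induced transformation. Cf., e.g., Petersen (1983, Sections 2.3 and 2.4). According to the general theory, it follows that $(\Delta, \mathcal{B}_\Delta, \bar\tau_\Delta, \bar\gamma_\Delta)$ is an ergodic dynamical system. Here $\bar\gamma_\Delta$ is the probability measure on $\mathcal{B}_\Delta$ with density
\[
\frac{1}{\bar\gamma(\Delta)\log 2}\,\frac{1}{(xy + 1)^2}, \qquad (x, y) \in \Delta .
\]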

Next, Theorem 4.2.12 leads us naturally to consider the map $M : \Delta \to \mathbb{R}^2$ defined by
\[
M(x, y) =
\begin{cases}
(x, y), & (x, y) \in \Delta^+,\\[2pt]
(-x/(x + 1),\ 1 - y), & (x, y) \in \Delta^- .
\end{cases}
\]
Set $A_S = M(\Delta)$. Clearly, $A_S$ consists of $\Delta^+ = I^2 \setminus (S \cup \bar\tau(S))$ and the image $M(\bar\tau(S))$ of $\Delta^- = \bar\tau(S)$ under $M$, which lies in the second quadrant of the plane. Also, $M : \Delta \to A_S$ is one-to-one. We can then define the transformation $\bar\tau_S : A_S \to A_S$ as $\bar\tau_S = M \bar\tau_\Delta M^{-1}$, and Theorem 4.2.12 implies that
\[
\left(\tilde t^e_{k+1}, \tilde s^e_{k+1}\right) = \bar\tau_S\left(\tilde t^e_k, \tilde s^e_k\right), \qquad k \in \mathbb{N}. \tag{4.2.13}
\]
It is immediate that the determinant of the Jacobian $J$ of $M|_{\Delta^-}$ is equal to $1/(x + 1)^2 > 0$. For $(x, y) \in \Delta^-$ we have
\[
\frac{1}{(xy + 1)^2}\,|J|^{-1} = \left(\frac{x + 1}{xy + 1}\right)^2 = \frac{1}{(st + 1)^2},
\]
where $t = -x/(x + 1)$ and $s = 1 - y$. This shows that
\[
\frac{1}{\log 2}\iint_{M(\Delta^-)} \frac{ds\,dt}{(st + 1)^2}
= \frac{1}{\log 2}\iint_{\Delta^-} |J|\,|J|^{-1}\,\frac{dx\,dy}{(xy + 1)^2}
= \bar\gamma(\bar\tau(S)) = \bar\gamma(S). \tag{4.2.14}
\]
Note also that
\[
\bar\gamma\left(\Delta^+\right) = 1 - \bar\gamma(S) - \bar\gamma(\bar\tau(S)) = 1 - 2\bar\gamma(S). \tag{4.2.15}
\]

Theorem 4.2.13 Let $\rho$ be the probability measure on $\mathcal{B}_{A_S}$ with density
\[
\frac{1}{(1 - \bar\gamma(S))\log 2}\,\frac{1}{(xy + 1)^2}, \qquad (x, y) \in A_S .
\]
Then $(A_S, \mathcal{B}_{A_S}, \bar\tau_S, \rho)$ is an ergodic dynamical system which underlies the corresponding S-expansion.
Proof. The conclusion follows on account of equations (4.2.13) through (4.2.15) noting that the dynamical systems $(\Delta, \mathcal{B}_\Delta, \bar\tau_\Delta, \bar\gamma_\Delta)$ and $(A_S, \mathcal{B}_{A_S}, \bar\tau_S, \rho)$ are isomorphic by the very definition of the latter. See Remark 1 following Proposition 4.0.5 and Petersen (1983, Sections 1.3 and 2.3). □
Remark. The entropy of the maps $\bar\tau_\Delta$ and $\bar\tau_S$ can be easily obtained using Abramov's formula [see e.g. Petersen (1983, p. 257)]. Since $H(\tau) = \pi^2/(6 \log 2)$ (see Remark following Corollary 4.1.28), we have
\[
H(\bar\tau_\Delta) = \frac{H(\tau)}{\bar\gamma(\Delta)} = \frac{1}{1 - \bar\gamma(S)}\,\frac{\pi^2}{6\log 2} = H(\bar\tau_S),
\]
which shows that entropy is maximal $\left(\pi^2/(6 \log G)\right)$ for maximal singularization areas. □
At first sight the dynamical system (AS , BAS , τ̄S , ρ) looks very intricate.
However, it is quite helpful. We have the following result.

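Theorem 4.2.14 Let the map $f : A_S \to \mathbb{R} \cup \{\infty\}$ be defined by
\[
f(x, y) = \left|x^{-1}\right| - \bar\tau_S^{(1)}(x, y), \qquad (x, y) \in A_S,
\]
where $\bar\tau_S^{(1)}(x, y)$ is the first coordinate of $\bar\tau_S(x, y)$. Let $a : [0, 1) \to \mathbb{N}_+ \cup \{\infty\}$ be defined as in Chapter 1, that is,
\[
a(t) =
\begin{cases}
\lfloor t^{-1} \rfloor & \text{if } t \in (0, 1),\\[2pt]
\infty & \text{if } t = 0.
\end{cases}
\]
We have
\[
f(x, y) =
\begin{cases}
a(x) & \text{if } \operatorname{sgn} x = 1 \text{ and } \bar\tau(x, y) \notin S,\\[2pt]
a(x) + 1 & \text{if } \operatorname{sgn} x = 1 \text{ and } \bar\tau(x, y) \in S,\\[2pt]
a(-x/(x + 1)) + 1 & \text{if } \operatorname{sgn} x = -1 \text{ and } \bar\tau(M^{-1}(x, y)) \notin S,\\[2pt]
a(-x/(x + 1)) + 2 & \text{if } \operatorname{sgn} x = -1 \text{ and } \bar\tau(M^{-1}(x, y)) \in S
\end{cases}
\]
and
\[
\bar\tau_S(x, y) = \left(\left|x^{-1}\right| - f(x, y),\ (f(x, y) + y \operatorname{sgn} x)^{-1}\right), \qquad (x, y) \in A_S .
\]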

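Proof. We should distinguish four cases, of which only two will be considered here. The other two cases can be treated similarly. Cf. Kraaikamp (1991, p. 26).

1. Let $(x, y) \in \Delta^+$ and $\bar\tau(x, y) \in S$. Then $\operatorname{sgn} x = 1$ and
\begin{align*}
\bar\tau_\Delta(M^{-1}(x, y)) = \bar\tau^2(x, y)
&= \bar\tau\left(\frac{1}{x} - a(x),\ \frac{1}{a(x) + y}\right)\\
&= \left(\frac{1}{x^{-1} - a(x)} - 1,\ \frac{1}{1 + 1/(a(x) + y)}\right)\\
&= \left(\frac{x - 1 + x a(x)}{1 - x a(x)},\ \frac{a(x) + y}{a(x) + y + 1}\right) \in \Delta^- .
\end{align*}
Therefore
\begin{align*}
\bar\tau_S(x, y) = M(\bar\tau_\Delta(M^{-1}(x, y)))
&= \left(\frac{-\dfrac{x - 1 + x a(x)}{1 - x a(x)}}{1 + \dfrac{x - 1 + x a(x)}{1 - x a(x)}},\ 1 - \frac{a(x) + y}{a(x) + y + 1}\right)\\
&= \left(\frac{1}{x} - (a(x) + 1),\ \frac{1}{a(x) + y + 1}\right).
\end{align*}
Thus we see that
\[
\bar\tau_S(x, y) = \left(\left|x^{-1}\right| - f(x, y),\ (f(x, y) + y \operatorname{sgn} x)^{-1}\right),
\]
where
\[
f(x, y) = a(x) + 1 .
\]

2. Let $(x, y) \in M(\Delta^-)$ and $\bar\tau\left(M^{-1}(x, y)\right) \notin S$. Then $\operatorname{sgn} x = -1$ and we have
\begin{align*}
\bar\tau_S(x, y) = M \bar\tau_\Delta M^{-1}(x, y) = \bar\tau M^{-1}(x, y)
&= \bar\tau\left(-\frac{x}{x + 1},\ 1 - y\right)\\
&= \left(-\frac{1}{x/(x + 1)} - a\left(-\frac{x}{x + 1}\right),\ \frac{1}{a(-x/(x + 1)) + 1 - y}\right)\\
&= \left(-\frac{1}{x} - 1 - a\left(-\frac{x}{x + 1}\right),\ \frac{1}{a(-x/(x + 1)) + 1 + y \operatorname{sgn} x}\right).
\end{align*}
Thus we see that
\[
\bar\tau_S(x, y) = \left(\left|x^{-1}\right| - f(x, y),\ (f(x, y) + y \operatorname{sgn} x)^{-1}\right),
\]
where
\[
f(x, y) = a(-x/(x + 1)) + 1 .
\]
□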
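Corollary 4.2.15 We have

(i) $f(x, y) \in \mathbb{N}_+$ for $(x, y) \in A_S$, $x \neq 0$;

(ii) $\tilde a_{k+1} = f(\tilde t^e_k, \tilde s^e_k)$, $k \in \mathbb{N}$, with $(\tilde t^e_0, \tilde s^e_0) = (x - \tilde a_0, 0)$.

Let $A^i_S$, $i = 1, 2$, be the projections of $A_S$ onto the two axes and let $\lambda_{A^i_S}$ denote the probability measure defined by
\[
\lambda_{A^i_S}(A) = \frac{\lambda\left(A \cap A^i_S\right)}{\lambda\left(A^i_S\right)}, \qquad A \in \mathcal{B}_{A^i_S},\ i = 1, 2.
\]

Proposition 4.2.16 Let $\mu \in \mathrm{pr}\left(\mathcal{B}_{A^1_S}\right)$ such that $\mu \ll \lambda_{A^1_S}$. For any $B \in \mathcal{B}_{A_S}$ such that $\lambda_{A^1_S} \otimes \lambda_{A^2_S}(\partial B) = 0$ we have
\[
\lim_{n\to\infty} \mu\left((\tilde t^e_n, \tilde s^e_n) \in B\right) = \rho_S(B),
\]
\[
\lim_{n\to\infty} \frac{1}{n} \sum_{k=0}^{n-1} I_B(\tilde t^e_k, \tilde s^e_k) = \rho_S(B) \quad \text{a.e. in } A^1_S .
\]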

Proof. This is the result corresponding to Theorem 4.1.16 and Corollary 4.1.17 for the ergodic dynamical system $(A_S, \mathcal{B}_{A_S}, \bar\tau_S, \rho)$. It is easy to see that the proof of Theorem 4.1.16 for the case of the ergodic dynamical system $(I^2, \mathcal{B}^2_I, \bar\tau, \bar\gamma)$ carries over to the present case. □

Corollary 4.2.17 Consider the approximation coefficients
\[
\tilde\Theta^e_n = (\tilde q^e_n)^2 \left|\tilde a_0 + \tilde t^e_0 - \tilde p^e_n/\tilde q^e_n\right|, \qquad n \in \mathbb{N}.
\]
For any $\mu \in \mathrm{pr}(\mathcal{B}_{A^1_S})$ such that $\mu \ll \lambda_{A^1_S}$ and any $(t_1, t_2) \in I^2$ we have
\[
\lim_{n\to\infty} \mu\left(\tilde\Theta^e_{n-1} \le t_1,\ \tilde\Theta^e_n \le t_2\right) = \rho(B),
\]
\[
\lim_{n\to\infty} \frac{1}{n}\,\mathrm{card}\{k : \tilde\Theta^e_k \le t_1,\ \tilde\Theta^e_{k+1} \le t_2,\ 0 \le k \le n - 1\} = \rho(B) \quad \text{a.e. in } A^1_S,
\]
where
\[
B = \left\{(x, y) \in A_S ;\ \frac{y}{xy + 1} \le t_1,\ \frac{|x|}{xy + 1} \le t_2\right\}.
\]
Proof. The results stated follow from Proposition 4.2.16 on account of equations (4.2.5) and (4.2.6). □

4.3 Examples of S-expansions


4.3.1 Nakada’s α-expansions
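Let $I_\alpha = [\alpha - 1, \alpha]$, $\alpha \in \mathbb{R}$, so that $I_1 = I$. In this subsection we will consider transformations $N_\alpha : I_\alpha \to I_\alpha$ defined by
\[
N_\alpha(x) =
\begin{cases}
\left|x^{-1}\right| - \left\lfloor \left|x^{-1}\right| + 1 - \alpha \right\rfloor & \text{if } x \neq 0,\\[2pt]
0 & \text{if } x = 0
\end{cases}
\]
for $x \in I_\alpha$, with $\alpha \in [1/2, 1]$. Any irrational number $x \in I_\alpha$ has an infinite SRCF expansion called α-expansion, of the form
\[
\cfrac{e_1}{b_1 + \cfrac{e_2}{b_2 + \ddots}} := [\, e_1/b_1,\ e_2/b_2,\ \cdots ],
\]
where
\[
(e_n, b_n) = (e_n(x), b_n(x)) = \left(e_1(N_\alpha^{n-1}(x)),\ b_1(N_\alpha^{n-1}(x))\right), \qquad n \in \mathbb{N}_+,
\]
with
\[
(e_1(x), b_1(x)) = \left(\operatorname{sgn} x,\ \left\lfloor \left|x^{-1}\right| + 1 - \alpha \right\rfloor\right), \qquad x \in I_\alpha .
\]
Here $N_\alpha^n$ denotes the composition of $N_\alpha$ with itself $n$ times while $N_\alpha^0$ is the identity map.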
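The definition of $N_\alpha$ and of the digits $(e_n, b_n)$ can be tried out directly. The following is a minimal sketch, not part of the original text: it uses exact rational arithmetic, so it only illustrates the finite expansions of rational inputs, and the function names `nakada_step`, `alpha_expansion` and `evaluate` are hypothetical.

```python
from fractions import Fraction
from math import floor

def nakada_step(x, alpha):
    """Return (e_1(x), b_1(x), N_alpha(x)) for x != 0 in I_alpha."""
    inv = abs(1 / x)                  # |x^{-1}|
    b = floor(inv + 1 - alpha)        # b_1(x) = floor(|x^{-1}| + 1 - alpha)
    e = 1 if x > 0 else -1            # e_1(x) = sgn x
    return e, b, inv - b

def alpha_expansion(x, alpha, max_digits=20):
    """Digits (e_n, b_n) obtained by iterating N_alpha; stops when the orbit hits 0."""
    digits = []
    while x != 0 and len(digits) < max_digits:
        e, b, x = nakada_step(x, alpha)
        digits.append((e, b))
    return digits

def evaluate(digits):
    """Value of the finite SRCF [e_1/b_1, e_2/b_2, ...]."""
    value = Fraction(0)
    for e, b in reversed(digits):
        value = e / (b + value)
    return value

# alpha = 1/2 corresponds to the NICF case, alpha = 1 to the RCF case.
nicf_digits = alpha_expansion(Fraction(3, 8), Fraction(1, 2))
rcf_digits = alpha_expansion(Fraction(3, 8), 1)
```

For $x = 3/8$ the two choices of α give different digit strings that evaluate to the same number, which is the effect singularization is meant to capture.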
The theory of α-expansions can be developed by parallelling that of the
RCF expansion. This has been done by Nakada (1981), Nakada et al. (1977),
Bosma et al. (1983), and Popescu (2000). Originally, these expansions were
defined by McKinney (1907).
Our approach here consists in putting any α-expansion in the framework
of the S-expansion theory by giving a suitable singularization area Sα , α ∈
[1/2, 1]. This will allow us to retrieve results derived by Nakada and co-
workers (op. cit.) using different methods.
We should distinguish two cases: (i) α ∈ [1/2, g] and (ii) α ∈ (g, 1].
Case (i). Before giving the singularization areas Sα , α ∈ [1/2, g], we
first return to the special case α = 1/2 which yields the NICF expansion.
Recall that the NICF expansion of an irrational number can be obtained from its RCF expansion by applying algorithm A from Subsection 4.2.1 to the latter. We noticed in Subsection 4.2.4 that this is equivalent to

singularize $a_{\ell+1} = 1$ if and only if $(\tau^\ell, s_\ell) \in S_A$, $\ell \in \mathbb{N}$,

where
\[
S_A = [1/2, 1) \times [0, g] .
\]
For $\alpha \in (1/2, g]$, notice that
\[
\bar\tau\left([1/2, \alpha) \times [0, g]\right) = ((1 - \alpha)/\alpha, 1] \times [g, 1] .
\]
In particular, for $\alpha = g$ we have
\begin{align*}
(S_A \setminus ([1/2, \alpha) \times [0, g])) \cup (((1 - \alpha)/\alpha, 1] \times [g, 1])
&= (S_A \setminus ([1/2, g) \times [0, g])) \cup ((g, 1] \times [g, 1])\\
&= ([g, 1) \times [0, g]) \cup ((g, 1] \times [g, 1]),
\end{align*}
which only slightly differs from the singularization area $S_B^*$ of Hurwitz's SCF expansion, which coincides with the g-expansion. See Remark 2 following Definition 4.2.5. It therefore seems natural to try as singularization areas $S_\alpha$ for $\alpha \in [1/2, g]$ the sets
\[
S_\alpha = ([\alpha, g) \times [0, g)) \cup ([g, (1 - \alpha)/\alpha] \times [0, g]) \cup (((1 - \alpha)/\alpha, 1] \times I) . \tag{4.3.1}
\]

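Hence
\begin{align*}
\bar\tau(S_\alpha) = {} & ([0, (2\alpha - 1)/(1 - \alpha)) \times [1/2, 1])\\
& \cup ([(2\alpha - 1)/(1 - \alpha), g] \times [g, 1])\\
& \cup ((g, (1 - \alpha)/\alpha] \times (g, 1]) .
\end{align*}
It is easy to check that $S_\alpha$ is indeed a singularization area: obviously, $\bar\gamma(\partial S_\alpha) = 0$, $S_\alpha \subset [1/2, 1] \times I$, and clearly $S_\alpha \cap \bar\tau(S_\alpha) = \{(g, g)\}$. Also,
\[
\bar\gamma(S_\alpha) = 1 - \frac{\log G}{\log 2},
\]
hence $S_\alpha$ is maximal for any $\alpha \in [1/2, g]$.
Notice that with $M$ defined as in Subsection 4.2.5 we have
\begin{align*}
M(\bar\tau(S_\alpha)) = {} & ([\alpha - 1, g - 1) \times [0, 1 - g))\\
& \cup ([g - 1, (1 - 2\alpha)/\alpha] \times [0, 1 - g])\\
& \cup (((1 - 2\alpha)/\alpha, 0] \times [0, 1/2]) .
\end{align*}
Writing $A_\alpha$ for $A_{S_\alpha}$ (see again the general case in Subsection 4.2.5) we take
\begin{align*}
A_\alpha = {} & \left(I^2 \setminus (S_\alpha \cup \bar\tau(S_\alpha))\right) \cup (M(\bar\tau(S_\alpha)) \setminus (\{0\} \times [0, 1/2]))\\
= {} & ([\alpha - 1, g - 1) \times [0, 1 - g))\\
& \cup ([g - 1, (1 - 2\alpha)/\alpha] \times [0, 1 - g])\\
& \cup (((1 - 2\alpha)/\alpha, 0) \times [0, 1/2]) \cup ([0, (2\alpha - 1)/(1 - \alpha)] \times [0, 1/2))\\
& \cup (((2\alpha - 1)/(1 - \alpha), \alpha) \times [0, g)) .
\end{align*}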

If we denote by $f_\alpha : A_\alpha \to \mathbb{R} \cup \{\infty\}$ the function corresponding to the function $f$ in Theorem 4.2.14, then it is easy to see that actually $f_\alpha$ maps $A_\alpha$ into $\mathbb{N}_+$ and that
\[
\left|x^{-1}\right| - f_\alpha(x, y) \in [\alpha - 1, \alpha), \qquad x \in [\alpha - 1, \alpha) \setminus \{0\}.
\]
Since there exists only one $n \in \mathbb{N}_+$ such that
\[
\left|x^{-1}\right| - n \in [\alpha - 1, \alpha),
\]
we deduce that $f_\alpha(x, y)$ does not depend on $y$ and that we should have
\[
f_\alpha(x, y) = \left\lfloor \left|x^{-1}\right| + 1 - \alpha \right\rfloor, \qquad (x, y) \in A_\alpha,\ x \neq 0.
\]
Hence $x \mapsto \left|x^{-1}\right| - f_\alpha(x, y)$ is Nakada's transformation $N_\alpha$. On account of Theorem 4.2.14 we can therefore state the main result for the case $\alpha \in [1/2, g]$.
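Theorem 4.3.1 [Nakada (1981)] Let $1/2 \le \alpha \le g$. Consider the probability measure $\bar\gamma_\alpha$ on $\mathcal{B}_{A_\alpha}$ with density
\[
\frac{1}{\log G}\,\frac{1}{(xy + 1)^2}, \qquad (x, y) \in A_\alpha,
\]
and the transformation $\bar N_\alpha : A_\alpha \to A_\alpha$ defined by
\[
\bar N_\alpha(x, y) = \left(\left|x^{-1}\right| - \left\lfloor \left|x^{-1}\right| + 1 - \alpha \right\rfloor,\ \left(\left\lfloor \left|x^{-1}\right| + 1 - \alpha \right\rfloor + y \operatorname{sgn} x\right)^{-1}\right),
\]
where $(x, y) \in A_\alpha$. Then $(A_\alpha, \mathcal{B}_{A_\alpha}, \bar N_\alpha, \bar\gamma_\alpha)$ is an ergodic dynamical system underlying the corresponding α-expansion.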
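The two-dimensional map $\bar N_\alpha$ of Theorem 4.3.1 can be iterated in the same way as $N_\alpha$. The sketch below is only an illustration with exact rationals and a hypothetical function name; it is meant to show that, starting from $(x, 0)$, the second coordinate reproduces the ratios of consecutive denominators of the convergents.

```python
from fractions import Fraction
from math import floor

def nakada_natural_extension(x, y, alpha):
    """One step of the two-dimensional map of Theorem 4.3.1, for x != 0."""
    inv = abs(1 / x)                    # |x^{-1}|
    b = floor(inv + 1 - alpha)          # the digit floor(|x^{-1}| + 1 - alpha)
    sgn = 1 if x > 0 else -1
    return inv - b, 1 / (b + y * sgn)   # (N_alpha(x), (b + y sgn x)^{-1})

alpha = Fraction(1, 2)
first = nakada_natural_extension(Fraction(3, 8), Fraction(0), alpha)
second = nakada_natural_extension(*first, alpha)
```

For $x = 3/8$ and $\alpha = 1/2$ the convergents are $1/3$ and $3/8$, and the second coordinates after one and two steps are $1/3$ and $3/8$, that is, $q_0/q_1$ and $q_1/q_2$.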
Taking projection onto the first axis, we deduce from Theorem 4.3.1 the
following result.
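Corollary 4.3.2 Let $1/2 \le \alpha \le g$. Consider the probability measure $\mu_\alpha$ on $\mathcal{B}_{I_\alpha}$ with density
\[
\frac{1}{\log G} \times
\begin{cases}
1/(x + G + 1) & \text{if } x \in [\alpha - 1, (1 - 2\alpha)/\alpha],\\[2pt]
1/(x + 2) & \text{if } x \in ((1 - 2\alpha)/\alpha, (2\alpha - 1)/(1 - \alpha)),\\[2pt]
1/(x + G) & \text{if } x \in [(2\alpha - 1)/(1 - \alpha), \alpha] .
\end{cases}
\]
Then $(I_\alpha, \mathcal{B}_{I_\alpha}, N_\alpha, \mu_\alpha)$ is an ergodic dynamical system.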
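As a sanity check on Corollary 4.3.2, one can confirm that the stated density has total mass 1 for every $\alpha \in [1/2, g]$. The sketch below integrates the three pieces in closed form; the function name `mass_case_i` is hypothetical and the check is numerical, not a proof.

```python
from math import log, sqrt

G = (1 + sqrt(5)) / 2   # golden ratio
g = G - 1               # g = 1/G

def mass_case_i(alpha):
    """Total mass of the density of Corollary 4.3.2 (case 1/2 <= alpha <= g)."""
    a = alpha - 1                      # left end of I_alpha
    b = (1 - 2 * alpha) / alpha        # first break point
    c = (2 * alpha - 1) / (1 - alpha)  # second break point
    total = (log((b + G + 1) / (a + G + 1))   # integral of 1/(x + G + 1) over [a, b]
             + log((c + 2) / (b + 2))         # integral of 1/(x + 2) over (b, c)
             + log((alpha + G) / (c + G)))    # integral of 1/(x + G) over [c, alpha]
    return total / log(G)

masses = [mass_case_i(alpha) for alpha in (0.5, 0.55, 0.6, g)]
```

The three logarithms telescope to $\log G$, using $G^2 = G + 1$, which is why the normalizing constant does not depend on α in this range.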


Remark. For $\alpha = 1/2$ we obtain the NICF expansion, and the corresponding result has been derived independently by Rieger (1979) and Rockett (1980). □

Figure 4.2: $S_\alpha$ for $1/2 \le \alpha \le g$ (horizontal axis marked at $0$, $1/2$, $\alpha$, $g$, $(1 - \alpha)/\alpha$, $1$)

From Figure 4.2 it is clear that the vertices $(\alpha, g)$ and $((1 - \alpha)/\alpha, 1)$ of $S_\alpha$ determine the value of the Legendre constant $L_\alpha := L_{S_\alpha}$. See Theorem 4.2.11. More precisely, we have the following result.
Theorem 4.3.3 Let $1/2 \le \alpha \le g$. Then
\[
L_\alpha = \min(\alpha/(1 + \alpha g),\ 1 - \alpha).
\]
Remark. Notice that for the values of $\alpha \in [1/2, g]$ under consideration we have
\[
\bar\tau\left([1/2, \alpha) \times [0, g)\right) \subset S_\alpha .
\]
It follows at once from this and (4.3.1) that $B_{S_\alpha} = \emptyset$, which is consistent with Proposition 4.2.7. □
Case (ii). Let $\alpha \in (g, 1]$. Put
\[
S_\alpha = [\alpha, 1] \times I . \tag{4.3.2}
\]
Hence $\bar\tau(S_\alpha) = [0, (1 - \alpha)/\alpha] \times [1/2, 1]$, and $S_\alpha \cap \bar\tau(S_\alpha) = \emptyset$ since for $\alpha \in (g, 1]$ we have
\[
(1 - \alpha)/\alpha < \alpha .
\]
It is then easy to check that $S_\alpha$ is indeed a singularization area. However, a simple calculation shows that
\[
\bar\gamma(S_\alpha) = 1 - \frac{\log(1 + \alpha)}{\log 2},
\]
thus for no value of α under consideration here the singularization area $S_\alpha$ is maximal.
Next, with $M$ defined as in Subsection 4.2.5 we have
\[
M(\bar\tau(S_\alpha)) = [\alpha - 1, 0] \times [0, 1/2] .
\]
Define $A_\alpha$ exactly as in case (i) and denote by $f_\alpha : A_\alpha \to \mathbb{R} \cup \{\infty\}$ the function corresponding to the function $f$ in Theorem 4.2.14. The expression of $A_\alpha$ is now simpler, namely,
\[
A_\alpha = ([\alpha - 1, 0) \times [0, 1/2]) \cup ([0, (1 - \alpha)/\alpha] \times [0, 1/2)) \cup (((1 - \alpha)/\alpha, \alpha) \times I),
\]
see Figure 4.3.
Similarly to case (i) we find that $f_\alpha(x, y)$ is independent of $y$ and that in fact we have again
\[
f_\alpha(x, y) = \left\lfloor \left|x^{-1}\right| + 1 - \alpha \right\rfloor, \qquad (x, y) \in A_\alpha,\ x \neq 0.
\]
Thus we can state the main result for the case $\alpha \in (g, 1]$.
Theorem 4.3.4 [Nakada (1981)] Let $g < \alpha \le 1$. Consider the probability measure $\bar\gamma_\alpha$ on $\mathcal{B}_{A_\alpha}$ with density
\[
\frac{1}{\log(1 + \alpha)}\,\frac{1}{(xy + 1)^2}, \qquad (x, y) \in A_\alpha,
\]
and the transformation $\bar N_\alpha : A_\alpha \to A_\alpha$ defined as in Theorem 4.3.1. Then $(A_\alpha, \mathcal{B}_{A_\alpha}, \bar N_\alpha, \bar\gamma_\alpha)$ is an ergodic dynamical system.
Figure 4.3: $S_\alpha$ for $g \le \alpha \le 1$ (regions labelled $\bar\tau(S_\alpha)$ and $M(\bar\tau(S_\alpha))$)

Taking again projection onto the first axis, we deduce from Theorem
4.3.4 the following result.
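Corollary 4.3.5 Let $g < \alpha \le 1$. Consider the probability measure $\mu_\alpha$ on $\mathcal{B}_{I_\alpha}$ with density
\[
\frac{1}{\log(1 + \alpha)} \times
\begin{cases}
1/(x + 2) & \text{if } x \in [\alpha - 1, (1 - \alpha)/\alpha],\\[2pt]
1/(x + 1) & \text{if } x \in ((1 - \alpha)/\alpha, \alpha] .
\end{cases}
\]
Then $(I_\alpha, \mathcal{B}_{I_\alpha}, N_\alpha, \mu_\alpha)$ is an ergodic dynamical system.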
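That the density in Corollary 4.3.5 is normalized can be checked by a short direct computation. The following worked equation is a sketch added for illustration and uses only the density as stated.

```latex
\begin{align*}
\int_{\alpha-1}^{(1-\alpha)/\alpha} \frac{dx}{x+2}
 + \int_{(1-\alpha)/\alpha}^{\alpha} \frac{dx}{x+1}
&= \log\frac{(1+\alpha)/\alpha}{1+\alpha} + \log\frac{1+\alpha}{1/\alpha} \\
&= \log\frac{1}{\alpha} + \log\bigl(\alpha(1+\alpha)\bigr)
 = \log(1+\alpha),
\end{align*}
% so dividing by \log(1+\alpha) gives total mass 1.
```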


We conclude the discussion of case (ii) with some results from Kraaikamp (1991). It is obvious that the vertex $(\alpha, 1)$ of $S_\alpha$ determines the value of the Legendre constant $L_\alpha := L_{S_\alpha}$. As $\min(\alpha/(\alpha + 1), 1/2) = \alpha/(\alpha + 1)$ in case (ii), we have the following result. See again Theorem 4.2.11.
Theorem 4.3.6 Let $g < \alpha \le 1$. Then
\[
L_\alpha = \frac{\alpha}{\alpha + 1} .
\]
Next, it is easy to check that
\[
\bar\tau^{-1}(S_\alpha) \cap ([1/2, 1] \times I) = [1/2, 1/(1 + \alpha)] \times I .
\]
Since for our values of α we have $(1 - \alpha)/\alpha < 1/(1 + \alpha)$, we find that the set $B_\alpha := B_{S_\alpha}$ from Proposition 4.2.7 is $(1/(1 + \alpha), \alpha) \times I$. Then
\[
\bar\gamma_\alpha(B_\alpha) = 2 - \frac{\log(2 + \alpha)}{\log(1 + \alpha)},
\]
and we can state the following result.

Theorem 4.3.7 Let $g < \alpha \le 1$. For the α-expansion $[\tilde e_1/\tilde a_1, \tilde e_2/\tilde a_2, \cdots] = [e_1/b_1, e_2/b_2, \cdots]$ of irrationals in $I_\alpha$ we have
\[
\lim_{n\to\infty} \frac{1}{n}\,\mathrm{card}\{k ;\ \tilde a_k = 1,\ 1 \le k \le n\} = 2 - \frac{\log(2 + \alpha)}{\log(1 + \alpha)} \quad \text{a.e.}
\]

Remarks. 1. The case $\alpha = 1$ gives the classical result from Proposition 4.1.1.
2. For $\alpha \in [g, 1]$ the limit $2 - \frac{\log(2 + \alpha)}{\log(1 + \alpha)}$ increases monotonically from 0 to $2 - \frac{\log 3}{\log 2} = 0.4150 \cdots$, the asymptotic relative frequency of digit 1 in the RCF expansion. At $\alpha = 0.76292 \cdots$ we have already lost half of the original 1's.
3. It follows from Corollary 4.2.10 that for the α-expansion with $\alpha \in (g, 1]$ we have
\[
\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} \tilde e_k = 3 - \frac{\log 4}{\log(1 + \alpha)} \quad \text{a.e.}
\]
□
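The value of α at which half of the 1's have disappeared can be recovered numerically from the limit in Theorem 4.3.7. The sketch below uses plain bisection on $[g, 1]$, relying on the monotonicity stated in Remark 2; the function names are hypothetical.

```python
from math import log, sqrt

g = (sqrt(5) - 1) / 2

def freq_of_ones(alpha):
    """Asymptotic frequency of the digit 1 in the alpha-expansion, g < alpha <= 1."""
    return 2 - log(2 + alpha) / log(1 + alpha)

def solve_half_frequency(lo=g, hi=1.0, tol=1e-12):
    """Bisection for the alpha with freq_of_ones(alpha) = freq_of_ones(1) / 2."""
    target = freq_of_ones(1.0) / 2
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if freq_of_ones(mid) < target:
            lo = mid     # frequency still too small: move right
        else:
            hi = mid
    return (lo + hi) / 2

alpha_half = solve_half_frequency()
```

The result agrees with the value $0.76292 \cdots$ quoted in Remark 2.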
We conclude this subsection by giving the analogue of Vahlen's theorem (see Subsection 1.3.2) for α-expansions with $\alpha \in [1/2, 1]$. For the NICF and Hurwitz' SCF expansions this analogue was independently given by Kurosu (1924) and Sendov (1959/60). Kraaikamp (1990) proved the Kurosu–Sendov results by giving a domain in $\mathbb{R}^2$ where the point $(\tilde\Theta^e_{n-1}, \tilde\Theta^e_n)$ always lies. For the two expansions just mentioned, that is for $\alpha = 1/2$ and $\alpha = g$, we have
\[
\min(\tilde\Theta^e_{n-1}, \tilde\Theta^e_n) < 2g^3 = 0.4721 \cdots,
\]
and the constant $2g^3$ is best possible.

However, one might ask whether there are values of α for which still smaller values can be obtained for the corresponding approximation coefficients $\tilde\Theta^e_n(\alpha) = \tilde\Theta^e_n$, $n \in \mathbb{N}$. Beforehand it is clear that a value smaller than $1/\sqrt{5} = 0.447 \cdots$ can never be found by a classical result of A. Hurwitz [see Perron (1954, p. 49)], according to which for every $\theta < 1/\sqrt{5}$ there exist irrational numbers $x$ such that the inequality $q^2 |x - (p/q)| < \theta$ is verified only for finitely many $p/q \in \mathbb{Q}$.
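The above-mentioned method from Kraaikamp (1990) can easily be adapted for S-expansions. As an example we will mention here the case of α-expansions, for which the first result below is due to Bosma et al. (1983).
Theorem 4.3.8 Let $\alpha \in [1/2, 1]$. For any irrational number in $I_\alpha$ and any $n \in \mathbb{N}_+$ we have
\[
\tilde\Theta^e_n < c(\alpha)
\]
and
\[
\min(\tilde\Theta^e_{n-1}, \tilde\Theta^e_n) < V(\alpha),
\]
where the functions $c, V : [1/2, 1] \to \mathbb{R}$ are defined by
\[
c(\alpha) = \max\left(G\,\frac{1 - \alpha}{g\alpha + 1},\ \alpha\right), \qquad \frac{1}{2} \le \alpha \le 1,
\]
and
\[
V(\alpha) =
\begin{cases}
\max\left(\dfrac{g}{1 + g\alpha},\ 4\alpha - 2\right) & \text{if } \frac{1}{2} \le \alpha \le g,\\[10pt]
\max\left(\dfrac{2(1 - \alpha)}{\alpha + 1},\ \dfrac{\alpha}{\alpha^2 + 1}\right) & \text{if } g \le \alpha \le 1.
\end{cases}
\]
The bounds $c(\alpha)$ and $V(\alpha)$ are best possible.

For the proof see Kraaikamp (1991).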
Remark. A simple calculation yields $\min_\alpha c(\alpha) = c(\alpha_0) = \alpha_0$, with
\[
\alpha_0 = \frac{1}{2}\left(-2 - \sqrt{5} + \sqrt{6\sqrt{5} + 15}\right) = 0.5473 \cdots .
\]
Moreover, we have $\min_\alpha V(\alpha) = V(\alpha_1) = 0.4484 \cdots$, a constant slightly larger than $1/\sqrt{5}$, where
\[
\alpha_1 = \frac{1 - 3g + \sqrt{10 - 11g}}{4g^2} = 0.6121 \cdots < g.
\]
□
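The constants in the Remark can be checked numerically. The sketch below assumes the reading of Theorem 4.3.8 in which $c(\alpha) = \max(G(1-\alpha)/(g\alpha+1), \alpha)$ and $V$ is the two-branch maximum with branches $g/(1+g\alpha)$, $4\alpha-2$ and $2(1-\alpha)/(\alpha+1)$, $\alpha/(\alpha^2+1)$; this reading is inferred from the printed layout and from the numerical values quoted in the text, so treat it as an assumption.

```python
from math import sqrt

G = (1 + sqrt(5)) / 2
g = G - 1

def c(alpha):
    """Upper bound for a single approximation coefficient (assumed form)."""
    return max(G * (1 - alpha) / (g * alpha + 1), alpha)

def V(alpha):
    """Vahlen-type bound for min(Theta_{n-1}, Theta_n) (assumed form)."""
    if alpha <= g:
        return max(g / (1 + g * alpha), 4 * alpha - 2)
    return max(2 * (1 - alpha) / (alpha + 1), alpha / (alpha ** 2 + 1))

alpha0 = (-2 - sqrt(5) + sqrt(6 * sqrt(5) + 15)) / 2
alpha1 = (1 - 3 * g + sqrt(10 - 11 * g)) / (4 * g ** 2)

# Coarse grid over [1/2, 1] to compare against the claimed minimizers.
grid = [0.5 + 0.5 * i / 5000 for i in range(5001)]
min_c = min(c(a) for a in grid)
min_V = min(V(a) for a in grid)
```

Under this reading, $\alpha = 1/2$ and $\alpha = g$ both give $V = 2g^3$, and $\alpha = 1$ gives $V = 1/2$, in line with the Kurosu–Sendov bound and with Vahlen's theorem as mentioned in the text.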

4.3.2 Minkowski’s diagonal continued fraction expansion


Let $x \in \mathbb{R}$ be such that neither $x$ nor $2x$ belongs to $\mathbb{Z}$. Consider the sequence σ of all irreducible fractions $p/q \in \mathbb{Q}$ with $q \in \mathbb{N}_+$ satisfying
\[
\left|x - \frac{p}{q}\right| < \frac{1}{2q^2},
\]
ordered in such a way that their denominators form an increasing sequence. It can be shown [see, e.g., Perron (1954, §45)] that there exists a unique SRCF expansion whose sequence of convergents coincides with σ. Legendre's theorem (see Corollary 1.2.4) implies that we take precisely those RCF convergents for which $\Theta_n < 1/2$. By (4.2.5) this SRCF expansion, which is called Minkowski's diagonal continued fraction (DCF) expansion, is an S-expansion with singularization area
\[
S = S_{DCF} := \left\{(x, y) \in I^2 : \frac{x}{xy + 1} \ge \frac{1}{2}\right\}.
\]
Since $\min(\Theta_n, \Theta_{n+1}) < 1/2$ (cf. Subsection 1.3.2), the DCF expansion picks at least one out of two consecutive RCF convergents. Since
\[
\bar\gamma(S_{DCF}) = 1 - \frac{1}{2\log 2},
\]
the singularization area $S_{DCF}$ is not maximal. Also, by Theorem 4.2.8 we have
\[
\lim_{k\to\infty} \frac{n_{S_{DCF}}(k)}{k} = 2\log 2 = 1.3862 \cdots \quad \text{a.e.}
\]
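The selection rule "keep exactly the RCF convergents with $\Theta_n < 1/2$" is easy to experiment with. The sketch below is an illustration only: it works with a rational stand-in for $\pi - 3$ (14 decimals, which is an assumption of the example and enough for the first few convergents), and the function names are hypothetical.

```python
from fractions import Fraction
from math import floor

def rcf_convergents(x, n):
    """First n RCF convergents p_k/q_k (k >= 1) of a number x in (0, 1)."""
    p_prev, q_prev, p, q = 1, 0, 0, 1      # (p_{-1}, q_{-1}) and (p_0, q_0)
    out, t = [], x
    while t != 0 and len(out) < n:
        a = floor(1 / t)                   # next RCF digit
        t = 1 / t - a
        p_prev, p = p, a * p + p_prev
        q_prev, q = q, a * q + q_prev
        out.append(Fraction(p, q))
    return out

def theta(x, c):
    """Approximation coefficient q^2 |x - p/q| of the reduced fraction c = p/q."""
    return c.denominator ** 2 * abs(x - c)

def dcf_convergents(x, n):
    """Those among the first n RCF convergents that satisfy Theta < 1/2."""
    return [c for c in rcf_convergents(x, n) if theta(x, c) < Fraction(1, 2)]

x = Fraction(14159265358979, 10 ** 14)     # rational stand-in for pi - 3
kept = dcf_convergents(x, 4)
```

Among the first four RCF convergents $1/7$, $15/106$, $16/113$, $4687/33102$ only $1/7$ and $16/113$ survive; the two dropped ones each precede an RCF digit equal to 1, as singularization requires.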
It can be shown [cf. Kraaikamp (1989, p. 210)] that the DCF expansion of any $\omega \in \Omega$ can be obtained from its RCF expansion $[a_1, a_2, \cdots]$ by singularizing any digit $a_{k+1}(\omega) = 1$ if and only if one of the following four conditions is fulfilled:

(i) $k = 0$, that is, $a_1 = 1$;

(ii) $a_k, a_{k+2} \neq 1$, $k \in \mathbb{N}_+$;

(iii) $a_k \neq 1$, $a_{k+2} = 1$, and $[a_{k+3}, a_{k+4}, \cdots] > [a_k - 1, \cdots, a_1]$, $k \in \mathbb{N}_+$, with the convention that the value of $[a_k - 1, \cdots, a_1]$ for $k = 1$ is $[a_1 - 1]$;

(iv) $a_k = 1$, $a_{k+2} \neq 1$, and $[a_{k-1}, \cdots, a_1] > [a_{k+2} - 1, a_{k+3}, \cdots]$, $k \ge 2$.

It is also interesting to note that the DCF expansion of a quadratic irrationality is periodic.

The general theory developed in Subsections 4.2.4 and 4.2.5 allows us to state the following results. For detailed proofs the reader is referred to Kraaikamp (op. cit.).
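With the notation in Subsection 4.2.5, for the DCF expansion case we have
\[
\Delta^+_{DCF} = \left\{(x, y) \in \mathbb{R}^2_{++} : \frac{x}{xy + 1} < \frac{1}{2},\ \frac{y}{xy + 1} < \frac{1}{2}\right\},
\]
\[
M(\bar\tau(S_{DCF})) = \left\{(x, y) \in \mathbb{R}^2 : \frac{(x + 1)(1 - y)}{xy + 1} \ge \frac{1}{2},\ -\frac{1}{2} \le x \le 0,\ y \ge 0\right\},
\]
\[
A_{DCF} := A_{S_{DCF}} = \Delta^+_{DCF} \cup M(\bar\tau(S_{DCF})),
\]
see also Figure 4.4. (The inequality in the second set is obtained by substituting $M^{-1}(x, y) = (-x/(x + 1), 1 - y)$ into the condition $y/(xy + 1) \ge 1/2$ which describes $\bar\tau(S_{DCF})$.)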


Figure 4.4: $S_{DCF}$ (regions labelled $\bar\tau(S_{DCF})$, $S_{DCF}$ and $M(\bar\tau(S_{DCF}))$; horizontal axis marked at $-1/2$, $0$, $1/2$, $1$; vertical axis marked at $1/2$ and $1$)

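Furthermore, writing $f_{DCF}$ for $f_{S_{DCF}}$ and $\bar\tau_{DCF}$ for $\bar\tau_{S_{DCF}}$ we have
\[
f_{DCF}(x, y) = \left\lfloor \left|x^{-1}\right| + \frac{\left\lfloor \left|x^{-1}\right| \right\rfloor + y \operatorname{sgn} x - 1}{2\left(\left\lfloor \left|x^{-1}\right| \right\rfloor + y \operatorname{sgn} x\right) - 1} \right\rfloor,
\]
and
\[
\bar\tau_{DCF}(x, y) = \left(\left|x^{-1}\right| - f_{DCF}(x, y),\ (f_{DCF}(x, y) + y \operatorname{sgn} x)^{-1}\right)
\]
for $(x, y) \in A_{DCF}$.

Proposition 4.3.9 Let $\rho_{DCF}$ be the probability measure on $\mathcal{B}_{A_{DCF}}$ with density
\[
\frac{2}{(xy + 1)^2}, \qquad (x, y) \in A_{DCF} .
\]
Then $(A_{DCF}, \mathcal{B}_{A_{DCF}}, \bar\tau_{DCF}, \rho_{DCF})$ is an ergodic dynamical system which underlies the DCF expansion.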
Proposition 4.3.10 For any $\mu \in \mathrm{pr}\left(\mathcal{B}_{[-1/2, 1]}\right)$ such that $\mu \ll \lambda$ and any $(t_1, t_2) \in I^2$ we have
\[
\lim_{n\to\infty} \mu\left(\tilde\Theta^e_{n-1} \le t_1,\ \tilde\Theta^e_n \le t_2\right) = H(t_1, t_2).
\]
Here $H$ is the distribution function with density $d_1 + d_2$, where
\[
d_1(x, y) = \frac{2 I_{B_1}(x, y)}{\sqrt{1 - 4xy}}, \qquad d_2(x, y) = \frac{2 I_{B_2}(x, y)}{\sqrt{1 + 4xy}},
\]
with
\[
B_1 = [0, 1/2] \times [0, 1/2],
\]
\[
B_2 = B_1 \cap \left\{(x, y) \in E_1 : 0 \le (x - y)^2 + x + y \le 3/4\right\}.
\]
The result above can be also stated in an equivalent form concerning the existence for any $(t_1, t_2) \in I^2$ of the limit a.e. equal to $H(t_1, t_2)$ of
\[
\frac{1}{n}\,\mathrm{card}\{k : \tilde\Theta^e_k \le t_1,\ \tilde\Theta^e_{k+1} \le t_2,\ 0 \le k \le n - 1\}
\]
as $n \to \infty$. It then follows, e.g., that
\[
\lim_{n\to\infty} \frac{1}{n} \sum_{k=0}^{n-1} \tilde\Theta^e_k = \frac{1}{4} \quad \text{a.e.}
\]

We also note the following results.

Proposition 4.3.11 An RCF digit $a_{k+1}$ equal to 1 does not disappear in the DCF expansion if and only if
\[
(\tau^k, s_k) \in B = \left\{(x, y) \in A_{DCF} : y < \frac{1 - 2x}{3x - 2},\ y > \frac{2x - 1}{x},\ y < \frac{1}{2 - x}\right\}
\]
whatever $k \in \mathbb{N}$.
Note that $\bar\gamma(B)$ is equal to
\begin{align*}
& \frac{1}{\log 2}\left(\int_{1/2}^{1} dt \int_{(2t-1)/t}^{1/(2-t)} \frac{du}{(tu + 1)^2} - \int_{1/2}^{2 - \sqrt{2}} dt \int_{(2t-1)/(2-3t)}^{1/(2-t)} \frac{du}{(tu + 1)^2}\right)\\
&\quad = \frac{1}{\log 2}\left(\log(\sqrt{2} - 1) + \sqrt{2} - \frac{1}{2}\right) = 0.0473 \cdots .
\end{align*}

Corollary 4.3.12 Let $[\tilde a^e_0; \tilde a^e_1, \tilde a^e_2, \cdots]$ be the DCF expansion of an irrational number. Then
\begin{align*}
\lim_{n\to\infty} \frac{1}{n}\,\mathrm{card}\{k : \tilde a^e_k = 1,\ 1 \le k \le n\}
&= \rho_{DCF}(B) = \frac{\bar\gamma(B)}{1 - \bar\gamma(S_{DCF})}\\
&= 2\left(\log(\sqrt{2} - 1) + \sqrt{2} - \frac{1}{2}\right) = 0.0656 \cdots \quad \text{a.e.}
\end{align*}
This asymptotic relative frequency ($6.56 \cdots \%$) should be compared with the asymptotic relative frequency of digit 1 in the RCF expansion ($2 - \frac{\log 3}{\log 2} = 41.50 \cdots \%$). See Proposition 4.1.1 and Subsection 4.1.2.

4.3.3 Bosma’s optimal continued fraction expansion


A remarkable geometrical interpretation of the RCF expansion of an irrational number was given by Klein (1895). The idea behind it is to represent any irreducible $p/q \in \mathbb{Q} \cap I$ by an integer-valued vector in $\mathbb{R}^2_+$, namely, by the point $(q, p) \in \mathbb{R}^2_+$, and to represent an irrational number $\omega \in \Omega$ by a half-line $L$ with slope ω. The approximation of ω by its RCF convergents amounts to systematically finding integer-valued vectors close to $L$. More precisely, starting from $V_{-1} = (0, 1)$ and $V_0 = (1, 0)$ we define $V_n$ recursively by
\[
V_n = a_n V_{n-1} + V_{n-2}, \qquad n \in \mathbb{N}_+,
\]
where $a_n \in \mathbb{N}_+$ is maximal with respect to the property that $V_n$ is on the same side of $L$ as $V_{n-2}$. It then appears that the positive integers $a_1, a_2, \cdots$ are in fact the RCF digits of ω, that is, $\omega = [a_1, a_2, \cdots]$.
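Klein's construction can be followed step by step in code. The sketch below is an illustration under two assumptions: the starting vectors are taken to be $V_{-1} = (0, 1)$ and $V_0 = (1, 0)$ (points written as $(q, p)$), and ω is represented by a floating-point number, which is adequate only for the first few digits. The function name `klein_digits` is hypothetical.

```python
from math import sqrt

def klein_digits(omega, n_digits):
    """RCF digits of omega found by Klein's vector recursion V_n = a_n V_{n-1} + V_{n-2}."""
    prev2, prev1 = (0, 1), (1, 0)          # V_{-1} and V_0 as points (q, p)

    def above(v):
        # True when the lattice point (q, p) lies above the half-line p = omega * q.
        return v[1] - omega * v[0] > 0

    digits, vectors = [], []
    for _ in range(n_digits):
        side = above(prev2)                # V_n must stay on the side of V_{n-2}
        a = 1
        while True:
            cand = ((a + 1) * prev1[0] + prev2[0], (a + 1) * prev1[1] + prev2[1])
            if above(cand) != side:        # a + 1 would cross the line: a is maximal
                break
            a += 1
        new = (a * prev1[0] + prev2[0], a * prev1[1] + prev2[1])
        digits.append(a)
        vectors.append(new)
        prev2, prev1 = prev1, new
    return digits, vectors

digits, vectors = klein_digits(sqrt(2) - 1, 6)
```

For $\omega = \sqrt{2} - 1 = [2, 2, 2, \cdots]$ the vectors visited are $(2, 1), (5, 2), (12, 5), \cdots$, that is, the convergents $1/2, 2/5, 5/12, \cdots$ written as $(q, p)$.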

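Bosma (1987) gave a similar interpretation of α-expansions and, inspired by this, presented a very interesting SRCF expansion formally defined as follows.
Definition 4.3.13 Let $-1/2 < x < 1/2$. Put $\tilde a^e_0 = 0$, $\tilde t^e_0 = x$, $\tilde e_1 = \operatorname{sgn} \tilde t^e_0$, $\tilde p^e_{-1} = 1$, $\tilde q^e_{-1} = 0$, $\tilde p^e_0 = 0$, $\tilde q^e_0 = 1$, $\tilde s^e_0 = 0$, and define recursively
\begin{align*}
\tilde a^e_{k+1} &= \left\lfloor \left|\tilde t^e_k\right|^{-1} + \frac{\left\lfloor \left|\tilde t^e_k\right|^{-1} \right\rfloor + \tilde e_{k+1} \tilde s^e_k}{2\left(\left\lfloor \left|\tilde t^e_k\right|^{-1} \right\rfloor + \tilde e_{k+1} \tilde s^e_k\right) + 1} \right\rfloor,\\
\tilde t^e_{k+1} &= \left|\tilde t^e_k\right|^{-1} - \tilde a^e_{k+1}, \qquad \tilde e_{k+2} = \operatorname{sgn} \tilde t^e_{k+1},\\
\tilde p^e_{k+1} &= \tilde a^e_{k+1} \tilde p^e_k + \tilde e_{k+1} \tilde p^e_{k-1}, \qquad \tilde q^e_{k+1} = \tilde a^e_{k+1} \tilde q^e_k + \tilde e_{k+1} \tilde q^e_{k-1},\\
\tilde s^e_{k+1} &= \tilde q^e_k / \tilde q^e_{k+1}, \qquad k \in \mathbb{N}.
\end{align*}
The optimal continued fraction (OCF) expansion of $x$, denoted OCF($x$), is the SRCF expansion $[\tilde e_1/\tilde a^e_1, \tilde e_2/\tilde a^e_2, \cdots]$. For an irrational $x \in \mathbb{R}$ such that $2x \notin \mathbb{Z}$, OCF($x$) $= [\tilde a^e_0; \tilde e_1/\tilde a^e_1, \tilde e_2/\tilde a^e_2, \cdots]$ is defined as $\tilde a^e_0 + [\tilde e_1/\tilde a^e_1, \tilde e_2/\tilde a^e_2, \cdots]$, where $\tilde a^e_0 \in \mathbb{Z}$ is such that $-1/2 < x - \tilde a^e_0 < 1/2$, and $[\tilde e_1/\tilde a^e_1, \tilde e_2/\tilde a^e_2, \cdots] =$ OCF($x - \tilde a^e_0$).
It is not difficult to see that the $\tilde t^e_k$ and $\tilde s^e_k$ have the usual meaning, that is,
\[
\tilde t^e_k = [\tilde e_{k+1}/\tilde a^e_{k+1}, \cdots], \qquad k \in \mathbb{N},
\]
\[
\tilde s^e_k =
\begin{cases}
0 & \text{if } k = 0,\\[2pt]
1/\tilde a^e_1 & \text{if } k = 1,\\[2pt]
[1/\tilde a^e_k,\ \tilde e_k/\tilde a^e_{k-1}, \cdots, \tilde e_2/\tilde a^e_1] & \text{if } k \ge 2
\end{cases}
\]
and $\tilde p^e_k/\tilde q^e_k$, $k \in \mathbb{N}$, are the OCF convergents of $x$.
Next, the sequence of OCF convergents $(\tilde p^e_k/\tilde q^e_k)_{k \in \mathbb{N}}$ is a subsequence of the sequence $(p_n/q_n)_{n \in \mathbb{N}}$ of RCF convergents. If we define $n(k)$ in such a way that $\tilde p^e_k/\tilde q^e_k = p_{n(k)}/q_{n(k)}$, $k \in \mathbb{N}_+$, then
\[
n(k + 1) =
\begin{cases}
n(k) + 1 & \text{if } \tilde e_{k+2} = 1,\\[2pt]
n(k) + 2 & \text{if } \tilde e_{k+2} = -1
\end{cases}
\]
with
\[
n(0) =
\begin{cases}
0 & \text{if } x > 0,\\[2pt]
1 & \text{if } x < 0.
\end{cases}
\]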
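The recursion of Definition 4.3.13 translates almost line by line into code. The following sketch is only an illustration: it uses exact rationals, so expansions are finite and the loop stops once the remainder is 0, and the function name `ocf_expansion` is hypothetical.

```python
from fractions import Fraction
from math import floor

def ocf_expansion(x, max_digits=20):
    """Digits (e_k, a_k) and convergents p_k/q_k from the recursion of Definition 4.3.13,
    for a rational x with -1/2 < x < 1/2."""
    t, s = x, Fraction(0)
    p_prev, q_prev, p, q = 1, 0, 0, 1          # (p_{-1}, q_{-1}) and (p_0, q_0)
    digits, convergents = [], []
    while t != 0 and len(digits) < max_digits:
        e = 1 if t > 0 else -1                 # e_{k+1} = sgn t_k
        inv = abs(1 / t)                       # |t_k|^{-1}
        u = floor(inv) + e * s                 # floor(|t_k|^{-1}) + e_{k+1} s_k
        a = floor(inv + u / (2 * u + 1))       # a_{k+1}
        t = inv - a                            # t_{k+1}
        p_prev, p = p, a * p + e * p_prev
        q_prev, q = q, a * q + e * q_prev
        s = Fraction(q_prev, q)                # s_{k+1} = q_k / q_{k+1}
        digits.append((e, a))
        convergents.append(Fraction(p, q))
    return digits, convergents

digits, convergents = ocf_expansion(Fraction(3, 8))
# Rational stand-in for pi - 3 (an assumption of this example).
_, pi_convergents = ocf_expansion(Fraction(14159265358979, 10 ** 14), max_digits=2)
```

For $x = 3/8$ the RCF convergents are $1/2, 1/3, 3/8$, while the recursion above produces only $1/3$ and $3/8$, so one RCF convergent is skipped.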
Finally, it appears that the OCF expansion gives approximation coefficients $\tilde\Theta^e_n = (\tilde q^e_n)^2 |x - (\tilde p^e_n/\tilde q^e_n)| < 1/2$ for any $n \in \mathbb{N}$ and, at the same time, it is a fastest expansion. Fastest SRCF expansions for which all convergents are RCF convergents can be defined as those in which always the maximal number of RCF convergents is skipped, meaning that whenever a 1-block of length $m \in \mathbb{N}_+$ occurs in the RCF expansion, exactly $\lfloor (m + 1)/2 \rfloor$ out of the $m$ 1's are skipped. (Note that this implies that for fastest SRCF expansions only a choice is left in deciding which RCF convergents will be skipped when $m$ is even.) A still more precise definition of 'fastest' is as follows. Writing $n_\alpha(k) := n_{S_\alpha}(k)$, $k \in \mathbb{N}_+$, $\alpha \in [1/2, 1]$, by Theorem 4.2.8 we have a.e.
\[
\lim_{k\to\infty} \frac{n_\alpha(k)}{k} =
\begin{cases}
\dfrac{\log 2}{\log G} = 1.44042 \cdots & \text{if } 1/2 \le \alpha \le g,\\[10pt]
\dfrac{\log 2}{\log(\alpha + 1)} & \text{if } g < \alpha \le 1.
\end{cases}
\]
Then an (arbitrary) SRCF expansion is said to be fastest if and only if $n_{SRCF}(k) = n_{1/2}(k)$ for infinitely many $k \in \mathbb{N}_+$. Here the non-decreasing function $n_{SRCF} : \mathbb{N}_+ \to \mathbb{N}_+$ is defined by
\[
q_{n_{SRCF}(k)} \le \tilde q^e_k < q_{n_{SRCF}(k) + 1}, \qquad k \in \mathbb{N}_+,
\]
where the $q_i$ and $\tilde q^e_i$, $i \in \mathbb{N}_+$, are associated with the RCF expansion and the SRCF expansion considered, respectively. Cf. Bosma (1987, p. 364).
The next result [cf. Bosma and Kraaikamp (1990)] places OCF expan-
sions in the context of the S-expansion theory. More precisely, it shows how
singularizing appropriately the RCF expansion yields the OCF expansion.
(Note that it is for this reason that we have anticipated notation by denoting
the OCF expansion as an S-expansion.)
Lemma 4.3.14 Let $\omega \in \Omega$ have RCF expansion $[a_1, a_2, \cdots]$, RCF convergents $p_n/q_n$, and RCF approximation coefficients $\Theta_n$, $n \in \mathbb{N}$. Consider the set
\[
S_{OCF} = \left\{(x, y) \in I^2 ;\ y < \min\left(x, \frac{2x - 1}{1 - x}\right)\right\}.
\]
Then for any $n \in \mathbb{N}_+$ the following three assertions are equivalent:

(i) $p_n/q_n$ is not an OCF convergent of ω;

(ii) $a_{n+1} = 1$, $\Theta_{n-1} < \Theta_n$ and $\Theta_n > \Theta_{n+1}$;

(iii) $(\tau^n, s_n) \in S_{OCF}$.

Proof. For the proof of the equivalence of (i) and (ii) we refer the reader to Corollary (4.20) of Bosma (1987). Here we show that (ii) and (iii) are equivalent.
Since
\[
\Theta_{n-1} = \frac{s_n}{s_n \tau^n + 1}, \qquad \Theta_n = \frac{\tau^n}{s_n \tau^n + 1}, \qquad n \in \mathbb{N}_+, \tag{4.3.2}
\]
we have
\[
\frac{|q_n \omega - p_n|}{|q_{n-1} \omega - p_{n-1}|} = \frac{\Theta_n}{\Theta_{n-1}}\,\frac{q_{n-1}}{q_n} = \tau^n < 1, \qquad \omega \in \Omega. \tag{4.3.3}
\]
Also
\[
\Theta_{n-1} < \Theta_n \ \text{ if and only if } \ \tau^n > s_n . \tag{4.3.4}
\]
Furthermore, if $a_{n+1} = 1$ then $p_{n+1} = p_n + p_{n-1}$ and $q_{n+1} = q_n + q_{n-1}$, and by (4.3.3) we have
\begin{align*}
\Theta_{n+1} &= q_{n+1} |q_{n+1} \omega - p_{n+1}|\\
&= (q_n + q_{n-1}) |(q_n + q_{n-1}) \omega - (p_n + p_{n-1})|\\
&= (q_n + q_{n-1}) |(q_{n-1} \omega - p_{n-1}) + (q_n \omega - p_n)|\\
&= (q_n + q_{n-1}) (|q_{n-1} \omega - p_{n-1}| - |q_n \omega - p_n|)
\end{align*}
since $q_n \omega - p_n$ and $q_{n-1} \omega - p_{n-1}$ have different signs, as shown by equation (1.1.18). Thus
\[
\Theta_{n+1} = \Theta_{n-1}\left(1 + \frac{q_n}{q_{n-1}}\right) - \Theta_n\left(1 + \frac{q_{n-1}}{q_n}\right).
\]
It follows from (4.3.3) that
\[
a_{n+1} = 1 \ \text{ and } \ \Theta_{n+1} < \Theta_n \ \text{ if and only if } \ s_n < \frac{2\tau^n - 1}{1 - \tau^n} . \tag{4.3.5}
\]
Combining (4.3.4) and (4.3.5) with the definition of $S_{OCF}$ completes the proof. □
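The step leading to (4.3.5) can be spelled out as follows. This is a sketch added for illustration; it abbreviates $s = s_n = q_{n-1}/q_n$ and $\tau = \tau^n$, and uses only (4.3.2) and the expression for $\Theta_{n+1}$ obtained in the proof when $a_{n+1} = 1$.

```latex
\begin{align*}
\Theta_{n+1}
 &= \frac{s}{s\tau+1}\Bigl(1+\frac{1}{s}\Bigr) - \frac{\tau}{s\tau+1}\,(1+s)
  = \frac{(1+s)(1-\tau)}{s\tau+1}, \\
\Theta_{n+1} < \Theta_n
 &\iff (1+s)(1-\tau) < \tau
  \iff s\,(1-\tau) < 2\tau - 1
  \iff s < \frac{2\tau-1}{1-\tau},
\end{align*}
% the last equivalence holding because 0 < \tau < 1.
```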

Remarks. 1. It is easy to check that
\[
\bar\gamma(S_{OCF}) = 1 - \frac{\log G}{\log 2},
\]
so $S_{OCF}$ is a maximal singularization area. See Figure 4.5. Notice that $S_{OCF}$ contains $S_{DCF}$, hence any sequence of OCF convergents is a subsequence of the corresponding sequence of DCF convergents. Since $\bar\tau(S_{OCF}) \subset I^2 \setminus S_{OCF}$, the set $B_{S_{OCF}}$ of the OCF preservation area of 1's is empty. Hence any OCF incomplete quotient (or digit) is greater than or equal to 2.
2. It now appears that the function $n : \mathbb{N}_+ \to \mathbb{N}_+$ considered above is in fact $n_{S_{OCF}}$. It then follows from Theorem 4.2.8 that
\[
\lim_{k\to\infty} \frac{n(k)}{k} = \frac{\log 2}{\log G} = 1.4404 \cdots \quad \text{a.e.}
\]
□
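As in the DCF expansion case, the general theory developed in Subsections 4.2.4 and 4.2.5 allows us to state the following results. For detailed proofs the reader is referred to Bosma and Kraaikamp (1990, 1991).
With the notation in Subsection 4.2.5, for the OCF case we have
\[
\Delta_{OCF} = I^2 \setminus S_{OCF} = \left\{(x, y) \in I^2 : y \ge \min\left(x, \frac{2x - 1}{1 - x}\right)\right\},
\]
\[
\Delta^-_{OCF} = \bar\tau(S_{OCF}) = \left\{(x, y) \in I^2 : (y, x) \in S_{OCF}\right\},
\]
that is, reflecting $S_{OCF}$ in the diagonal $y = x$ yields $\Delta^-_{OCF}$, and
\begin{align*}
A_{OCF} := A_{S_{OCF}} = M(\Delta_{OCF})
= \Bigl\{ & (x, y) \in (-1/2, g) \times [0, g] : y \le \min\left(\frac{2x + 1}{x + 1}, \frac{x + 1}{x + 2}\right)\\
& \text{and } y \ge \max\left(0, \frac{2x - 1}{1 - x}\right)\Bigr\},
\end{align*}
see Figure 4.5.
Furthermore, writing $f_{OCF}$ for $f_{S_{OCF}}$ and $\bar\tau_{OCF}$ for $\bar\tau_{S_{OCF}}$ we have
\[
f_{OCF}(x, y) = \left\lfloor \left|x^{-1}\right| + \frac{\left\lfloor \left|x^{-1}\right| \right\rfloor + y \operatorname{sgn} x}{2\left(\left\lfloor \left|x^{-1}\right| \right\rfloor + y \operatorname{sgn} x\right) + 1} \right\rfloor,
\]
\[
\bar\tau_{OCF}(x, y) = \left(\left|x^{-1}\right| - f_{OCF}(x, y),\ (f_{OCF}(x, y) + y \operatorname{sgn} x)^{-1}\right)
\]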
for $(x, y) \in A_{OCF}$.

Figure 4.5: $S_{OCF}$ (regions labelled $\bar\tau(S_{OCF})$, $S_{OCF}$ and $M(\bar\tau(S_{OCF}))$; horizontal axis marked at $-1/2$, $0$, $1/2$, $g$, $1$; vertical axis marked at $1/2$ and $1$)


Theorem 4.3.15 Let $\rho_{OCF}$ be the probability measure on $\mathcal{B}_{A_{OCF}}$ with density
\[
\frac{1}{\log G}\,\frac{1}{(xy + 1)^2}, \qquad (x, y) \in A_{OCF} .
\]
Then $(A_{OCF}, \mathcal{B}_{A_{OCF}}, \bar\tau_{OCF}, \rho_{OCF})$ is an ergodic dynamical system which underlies the OCF expansion.
Remark. For both DCF and OCF expansions the two-dimensional sets $A_{DCF}$ and $A_{OCF}$ have curved boundaries. This implies that the functions $f_{DCF}$ and $f_{OCF}$ depend on both their arguments $x$ and $y$, and not only on $x$ as in the case of α-expansions, $\alpha \in [1/2, 1]$. As a result, no one-dimensional ergodic dynamical system exists for either DCF or OCF expansion. □
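Proposition 4.3.16 For any $\mu \in \mathrm{pr}\left(\mathcal{B}_{[-1/2, g]}\right)$ such that $\mu \ll \lambda$ and any $(t_1, t_2) \in I^2$ we have
\[
\lim_{n\to\infty} \mu\left(\tilde\Theta^e_{n-1} \le t_1,\ \tilde\Theta^e_n \le t_2\right) = H(t_1, t_2).
\]
Here $H$ is the distribution function with density
\[
\begin{cases}
\dfrac{1}{\log G}\left(\dfrac{1}{\sqrt{1 - 4xy}} + \dfrac{1}{\sqrt{1 + 4xy}}\right) & \text{if } (x, y) \in \Pi,\\[10pt]
0 & \text{elsewhere,}
\end{cases}
\]
where $\Pi = \left\{(x, y) \in \mathbb{R}^2_{++} : 4x^2 + y^2 < 1,\ x^2 + 4y^2 < 1\right\}$.
The result above can be also stated in an equivalent form concerning the existence for any $(t_1, t_2) \in I^2$ of the limit a.e. equal to $H(t_1, t_2)$ of
\[
\frac{1}{n}\,\mathrm{card}\{k : \tilde\Theta^e_k \le t_1,\ \tilde\Theta^e_{k+1} \le t_2,\ 0 \le k \le n - 1\}
\]
as $n \to \infty$. It then follows, e.g., that
\[
\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} \tilde\Theta^e_k = \frac{\arctan \frac{1}{2}}{4 \log G} = 0.24087 \cdots \quad \text{a.e.} \tag{4.3.6}
\]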

Other consequences are that for any irrational number we have

(i) $0 < \tilde\Theta^e_n < 1/2$, $n \in \mathbb{N}_+$;

(ii) $0 < \tilde\Theta^e_{n-1} + \tilde\Theta^e_n < 2/\sqrt{5}$, hence $\min(\tilde\Theta^e_{n-1}, \tilde\Theta^e_n) < 1/\sqrt{5}$, $n \in \mathbb{N}_+$.

In connection with (ii) above, it should be noted that the constant $1/\sqrt{5}$ in the second inequality is 'best possible' by A. Hurwitz's result mentioned just before Theorem 4.3.8.
Remark. The a.e. asymptotic arithmetic mean (4.3.6) should be compared with the corresponding values
\[
\frac{1}{4 \log 2} = 0.36067 \cdots \quad \text{for the RCF expansion,}
\]
\[
\frac{1}{4} = 0.25 \quad \text{for the DCF expansion,}
\]
\[
\frac{\sqrt{5} - 2}{2 \log G} = 0.24528 \cdots \quad \text{for the NICF and SCF expansions,}
\]
\[
\frac{\sqrt{8G + 6} - 2G - 1}{\log G} = 0.24195 \cdots \quad \text{for the } \alpha_0\text{-expansion,}
\]
where $\alpha_0 = 0.55821 \cdots$. See Corollary 4.1.23 and Proposition 4.3.10 for the first two values, and Bosma et al. (1983) for the last two ones. Note how close the value in (4.3.6) is to
\[
1 - \bar\gamma(S_{OCF}) = \frac{\log G}{2} = 0.24061 \cdots .
\]
The latter gives an a priori bound for the a.e. asymptotic arithmetic mean of the approximation coefficients. It can be shown that the value in (4.3.6) is in fact 'the best one can get' for any irrational number. More precisely, we have the following result.
Theorem 4.3.17 [Bosma and Kraaikamp (1991)] Whatever the SRCF expansion with convergents $p^e_n/q^e_n$ and approximation coefficients $\Theta^e_n$, $n \in \mathbb{N}$, we have
\[
\frac{1}{m} \sum_{k=1}^{m} \Theta^e_k \ge \frac{1}{n} \sum_{k=1}^{n} \tilde\Theta^e_k, \qquad n \in \mathbb{N}_+,
\]
for any irrational number, where $m = \mathrm{card}\{k : q^e_k < \tilde q^e_{n+1},\ k \in \mathbb{N}_+\}$ and $\tilde q^e_n$ and $\tilde\Theta^e_n$, $n \in \mathbb{N}_+$, are associated with the OCF expansion.

4.4 Continued fraction expansions with σ-finite, infinite invariant measure
4.4.1 The insertion process
We have seen in previous subsections how the concept of singularization leads to a class of SRCF expansions for which the underlying ergodic theory can be developed.
The idea of adding a convergent instead of removing one (as singularization does) leads to the concept of insertion, to some extent the opposite of that of singularization. Now, the fundamental identity is
\[
a + \frac{1}{b + x} = a + 1 + \cfrac{-1}{1 + \cfrac{1}{b - 1 + x}},
\]
where $a \in \mathbb{Z}$, $b \in \mathbb{N}_+$, $b > 1$, $x \in [0, 1)$. Let (cf. Subsection 4.2.2)
\[
(e_k)_{k \in M}, \quad (a_k)_{k \in \{0\} \cup M} \tag{4.4.1}
\]
be a (finite or infinite) CF with $a_{\ell+1} > 1$, $e_{\ell+1} = 1$ for some $\ell \in \mathbb{N}$ for which $\ell + 1 \in M$. The transformation $\iota_\ell$ which takes (4.4.1) into the CF
\[
(\tilde e_k)_{k \in \tilde M}, \quad (\tilde a_k)_{k \in \{0\} \cup \tilde M}, \tag{4.4.2}
\]
where $\tilde M = M$ if $M = \mathbb{N}_+$ and $\tilde M = \{k : 1 \le k \le n + 1\}$ if $M = \{k : 1 \le k \le n\}$, $n \in \mathbb{N}_+$, with $\tilde e_k = e_k$, $k \in M$, $k \le \ell$, $\tilde e_{\ell+1} = -1$, $\tilde e_{\ell+2} = 1$, $\tilde e_k = e_{k-1}$, $k \in M$, $k \ge \ell + 3$, $\tilde a_k = a_k$, $k \in \{0\} \cup M$, $k \le \ell - 1$, $\tilde a_\ell = a_\ell + 1$, $\tilde a_{\ell+1} = 1$, $\tilde a_{\ell+2} = a_{\ell+1} - 1$, $\tilde a_k = a_{k-1}$, $k \ge \ell + 3$, is called an insertion of the pair $(1, -1)$ before $a_{\ell+1}$, $e_{\ell+1}$. Let $(p^e_k/q^e_k)_{k \in \{0\} \cup M}$ and $(\tilde p^e_k/\tilde q^e_k)_{k \in \{0\} \cup \tilde M}$ be the sets associated with (4.4.1) and (4.4.2), respectively.
The result corresponding to Proposition 4.2.4 can be stated as follows.
Proposition 4.4.1 Let $\ell \in \mathbb{N}$ such that $\ell + 1 \in M$. The set of convergents
$(\tilde p_k^e/\tilde q_k^e)_{k\in\{0\}\cup\tilde M}$ resulting after the insertion $\iota_\ell$ of the pair $(1, -1)$ before
$a_{\ell+1}$ $(> 1)$, $e_{\ell+1}$ $(= 1)$, is obtained by inserting the term
$(p_\ell^e + p_{\ell-1}^e)/(q_\ell^e + q_{\ell-1}^e)$ in the set $(p_k^e/q_k^e)_{k\in\{0\}\cup M}$ before the convergent
$p_\ell^e/q_\ell^e$. As usual, here $p_{-1}^e = 1$, $q_{-1}^e = 0$.
The proof is similar to that of Proposition 4.2.4 by using appropriate
matrix identities. □
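The fundamental identity and Proposition 4.4.1 can be checked on small examples. The following is a minimal sketch in exact rational arithmetic; the function names (`identity_holds`, `convergents`, `insert`, `check_insertion`) and the list-of-pairs encoding `pairs[k-1] = (e_k, a_k)` are illustrative assumptions, not notation from the text.

```python
from fractions import Fraction

def identity_holds(a, b, x):
    """Check a + 1/(b+x) == a + 1 - 1/(1 + 1/(b-1+x)) for b > 1."""
    lhs = a + 1 / (b + x)
    rhs = a + 1 - 1 / (1 + 1 / (b - 1 + x))
    return lhs == rhs

def convergents(a0, pairs):
    """Return [(p_k, q_k)] for k = 0..n, where pairs[k-1] = (e_k, a_k)."""
    p_prev, q_prev = 1, 0              # p_{-1} = 1, q_{-1} = 0
    p, q = a0, 1
    out = [(p, q)]
    for e, a in pairs:
        p, p_prev = a * p + e * p_prev, p
        q, q_prev = a * q + e * q_prev, q
        out.append((p, q))
    return out

def insert(a0, pairs, l):
    """Apply the insertion iota_l; requires e_{l+1} = 1 and a_{l+1} > 1."""
    digits = [a0] + [a for _, a in pairs]
    signs = [e for e, _ in pairs]
    e_next, a_next = pairs[l]
    if e_next != 1 or a_next <= 1:
        raise ValueError("insertion needs e_{l+1} = 1 and a_{l+1} > 1")
    new_digits = digits[:l] + [digits[l] + 1, 1, a_next - 1] + digits[l + 2:]
    new_signs = signs[:l] + [-1, 1] + signs[l + 1:]
    return new_digits[0], list(zip(new_signs, new_digits[1:]))

def check_insertion(a0, pairs, l):
    """True if the new convergents are the old ones plus one mediant term."""
    old = convergents(a0, pairs)
    new = convergents(*insert(a0, pairs, l))
    p_l, q_l = old[l]
    p_lm1, q_lm1 = old[l - 1] if l > 0 else (1, 0)
    return new == old[:l] + [(p_l + p_lm1, q_l + q_lm1)] + old[l:]
```

For the RCF [0; 3, 2, 5], `check_insertion(0, [(1, 3), (1, 2), (1, 5)], 1)` compares the convergent list before and after inserting in front of the digit 2.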
Starting from the RCF expansion, by appropriate insertions we can ob-
tain many classical SRCF expansions, and also continued fraction algorithms
which are not SRCF expansions. Amongst the former we mention the Lehner
continued fraction (LCF) expansion, and amongst the latter the Farey con-
tinued fraction (FCF) expansion. Both these expansions will be studied in
the next subsection.
In particular, we can obtain this way the OddCF and EvenCF expansions
—see the examples of SRCF expansions at the end of Subsection 4.2.2—as
well as the backward continued fraction (BCF) expansion that we will study
in Subsection 4.4.3.

4.4.2 The Lehner and Farey continued fraction expansions


Lehner (1994) showed that any number x ∈ [1, 2) has a unique infinite SRCF
expansion of the form
\[ b_0 + \cfrac{e_1}{b_1 + \cfrac{e_2}{b_2 + \ddots}} := [\, b_0; e_1/b_1, e_2/b_2, \cdots ] , \tag{4.4.3} \]
where (bn , en+1 ) is equal to either (1, 1) or (2, −1), n ∈ N. We shall call
this expansion the Lehner continued fraction (LCF ) expansion. Dajani and
Kraaikamp (2000) called it the Lehner fraction or the Lehner expansion, and
showed that if we define the transformation L : [1, 2) → [1, 2) by
\[ L(x) = \frac{e(x)}{x - b(x)} , \quad x \in [1, 2), \]
where
\[ (b(x), e(x)) = \begin{cases} (2, -1) & \text{if } 1 \le x < \tfrac{3}{2}, \\ (1, 1) & \text{if } \tfrac{3}{2} \le x < 2, \end{cases} \]
then
\[ (b_n(x), e_{n+1}(x)) = (b(L^n(x)), e(L^n(x))) , \quad x \in [1, 2), \]
for any $n \in \mathbb{N}$. Here $L^n$, $n \in \mathbb{N}_+$, denotes the composition of $L$ with itself
$n$ times while $L^0$ is the identity map.
Denoting as usual the RCF convergents of a real number $x = [a_0; a_1, a_2, \cdots]$
by $(p_n/q_n)_{n\in\mathbb{N}}$ and defining the mediant convergents of $x$ by
\[ \frac{k p_n + p_{n-1}}{k q_n + q_{n-1}} , \quad 1 \le k < a_{n+1}, \ n = 1, 2, \cdots \]
(so that if an+1 = 1 then there is no mediant convergent), we will see that
the set of LCF convergents of x is the union of the sets of RCF and mediant
convergents of x. It is for this reason that the LCF expansion was called the
mother of all SRCF expansions in Dajani and Kraaikamp (op. cit.).
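As an illustration, the LCF digits can be generated by iterating the map L, and the resulting convergents compared with the RCF and mediant convergents. The sketch below uses a rational approximation of √2 (an assumption made so that exact arithmetic can be used); the helper names `lcf_digits` and `lcf_convergents` are hypothetical.

```python
from fractions import Fraction

def lcf_digits(x, n):
    """First n pairs (b_k, e_{k+1}), k = 0..n-1, obtained by iterating L on x in [1, 2)."""
    out = []
    for _ in range(n):
        b, e = (2, -1) if x < Fraction(3, 2) else (1, 1)
        out.append((b, e))
        x = e / (x - b)                 # L(x) = e(x) / (x - b(x))
    return out

def lcf_convergents(digits):
    """Convergents of [b_0; e_1/b_1, e_2/b_2, ...] from the (b_k, e_{k+1}) pairs."""
    p_prev, q_prev = 1, 0
    p, q = digits[0][0], 1
    out = [Fraction(p, q)]
    for (_, e), (b, _) in zip(digits, digits[1:]):
        p, p_prev = b * p + e * p_prev, p
        q, q_prev = b * q + e * q_prev, q
        out.append(Fraction(p, q))
    return out

# Rational stand-in for sqrt(2) = [1; 2, 2, 2, ...]; only the first few digits are trusted.
x = Fraction(14142135623730951, 10**16)
digits = lcf_digits(x, 8)
convs = lcf_convergents(digits)
```

Here the RCF convergents 1, 3/2, 7/5, 17/12 and the mediants 2, 4/3, 10/7, 24/17 of √2 both show up in `convs`, interleaved.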
Proposition 4.4.2 Let x ∈ [1, 2) \ Q, with RCF expansion

[ 1; a1 , a2 , · · · ].

Then the LCF expansion (4.4.3) of x is given by the following algorithm.


(i) Let $n$ be the smallest $m \in \mathbb{N}$ for which $a_{m+1} > 1$. If $n = 0$, that is,
$a_1 > 1$, then we replace $[1; a_1, a_2, \cdots]$ by

\[ [\, 2; \underbrace{-1/2, \cdots, -1/2}_{(a_1-2)\ \text{times}}, -1/1, 1/1, 1/a_2, \cdots ] . \]

If $n \ge 1$ then we replace

\[ [\, 1; 1, \cdots, 1, a_{n+1}, \cdots ] \]

by

\[ \iota_{n+a_{n+1}-1}( \cdots (\iota_{n+1}(\iota_n([\, 1; 1, \cdots, 1, a_{n+1}, \cdots ]))) \cdots ) \]
\[ = [\, 1; \underbrace{1/1, \cdots, 1/1}_{(n-1)\ \text{times}}, 1/2, \underbrace{-1/2, \cdots, -1/2}_{(a_{n+1}-2)\ \text{times}}, -1/1, 1/1, 1/a_{n+2}, \cdots ] , \]

where $\iota_n$ is defined as in Subsection 4.4.1. Denote the SRCF expansion
of $x$ thus obtained by

\[ [\, b'_0; e'_1/b'_1, e'_2/b'_2, \cdots ]. \tag{4.4.4} \]

(ii) Let $n' > n$ be the smallest integer $m' > n$ for which $e'_{m'+1} = 1$ and
$b'_{m'+1} > 1$. Apply to (4.4.4) the procedure from (i) to $b'_{n'+1}$.

The proof is easy and left to the reader. □


Remark. It follows from the very insertion mechanism that any RCF or
mediant convergent is an LCF convergent. Conversely, the sequence of LCF
convergents is obtained after all mediant convergents have been inserted into
the sequence of RCF convergents. Another immediate consequence is that
the LCF expansion of a quadratic irrationality is (eventually) periodic. □
Note that the transformation $L$ [which is implicit in Lehner (1994)] is
isomorphic to the transformation $I : [0, 1) \to [0, 1)$ defined by
\[ I(x) = \begin{cases} \dfrac{x}{1-x} & \text{if } 0 \le x < 1/2, \\[2mm] \dfrac{1-x}{x} & \text{if } 1/2 \le x < 1, \end{cases} \]
which was used by Ito (1989) to generate the RCF and mediant convergents
of any x ∈ [0, 1). More precisely, we have

L(x) = I(x − 1) + 1, x ∈ [1, 2).

We also have
\[ L(x) = \frac{1}{I(h(x-1))} , \quad x \in [1, 2), \]
where the bijective function $h : [0, 1) \to [1/3, 2/3)$ is defined by
\[ h(x) = \begin{cases} \dfrac{1}{2-x} & \text{if } 0 \le x < 1/2, \\[2mm] \dfrac{x}{x+1} & \text{if } 1/2 \le x < 1. \end{cases} \]
Ito (op. cit.) showed that $I$ is $\nu$-preserving, where $\nu$ is the σ-finite, infinite
measure on $\mathcal{B}_{[0,1)}$ with density $x^{-1}$, $x \in (0, 1)$, and that $([0, 1), \mathcal{B}_{[0,1)}, I, \nu)$ is
an ergodic dynamical system. This implies that $L$ is $\mu$-preserving, where $\mu$
is the σ-finite, infinite measure on $\mathcal{B}_{[1,2)}$ with density $(x-1)^{-1}$, $x \in (1, 2)$,
and that $([1, 2), \mathcal{B}_{[1,2)}, L, \mu)$ is an ergodic dynamical system underlying the
LCF expansion.
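The two relations between L and I, and the invariance of the density 1/x under I, can be verified pointwise in exact arithmetic. This is only a sketch: the function `transfer_of_inverse_density` is a hypothetical helper that sums ρ(y)/|I'(y)| over the two preimages y of x, with ρ(y) = 1/y, which is the standard change-of-variables criterion for an invariant density.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def L(x):
    """Lehner map on [1, 2): 1/(2 - x) on [1, 3/2), 1/(x - 1) on [3/2, 2)."""
    return 1 / (2 - x) if x < Fraction(3, 2) else 1 / (x - 1)

def I(x):
    """Ito's map on [0, 1)."""
    return x / (1 - x) if x < HALF else (1 - x) / x

def h(x):
    """The bijection h : [0, 1) -> [1/3, 2/3)."""
    return 1 / (2 - x) if x < HALF else x / (x + 1)

def transfer_of_inverse_density(x):
    """Sum of rho(y)/|I'(y)| over the preimages of x in (0, 1), with rho(y) = 1/y."""
    y1 = x / (1 + x)            # branch y/(1-y) = x, where |I'(y)| = 1/(1-y)^2
    y2 = 1 / (1 + x)            # branch (1-y)/y = x, where |I'(y)| = 1/y^2
    return (1 / y1) * (1 - y1) ** 2 + (1 / y2) * y2 ** 2

points = [1 + Fraction(k, 17) for k in range(1, 17)]
relation_1 = all(L(x) == I(x - 1) + 1 for x in points)
relation_2 = all(L(x) == 1 / I(h(x - 1)) for x in points)
invariant = all(transfer_of_inverse_density(x - 1) == 1 / (x - 1) for x in points)
```

The sample points avoid x = 3/2, where L leaves the interval [1, 2).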

We will now exhibit the relationship between the LCF expansion and an
algorithm yielding the so called Farey continued fraction (FCF ) expansion.
The latter is an infinite CF expansion of any x ∈ [−1, 0) ∪ (0, ∞) of the form
\[ \cfrac{f_1}{d_1 + \cfrac{f_2}{d_2 + \ddots}} := [\, f_1/d_1, f_2/d_2, \cdots ] , \tag{4.4.5} \]
where (dn , fn ) is equal to either (1, 1) or (2, −1), n ∈ N+ . Formally, as
shown by Dajani and Kraaikamp (op. cit.), if we define the transformation
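$F : [-1, \infty) \to [-1, \infty)$ by
\[ F(x) = \begin{cases} \dfrac{f(x)}{x} - d(x) & \text{if } x \ne 0, \\[2mm] 0 & \text{if } x = 0, \end{cases} \]
where
\[ (d(x), f(x)) = \begin{cases} (2, -1) & \text{if } -1 \le x < 0, \\ (1, 1) & \text{if } x \ge 0, \end{cases} \]
then
\[ (d_n(x), f_n(x)) = \left( d(F^{n-1}(x)), f(F^{n-1}(x)) \right) , \quad x \in [-1, \infty), \]
for any $n \in \mathbb{N}_+$. Here $F^n$, $n \in \mathbb{N}_+$, denotes the composition of $F$ with itself
$n$ times while $F^0$ is the identity map.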
By its very definition the FCF expansion is not an SRCF expansion
since the condition fn+1 + dn ≥ 1, n ∈ N+ , is violated.
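To make the FCF algorithm concrete, the following sketch iterates F, records the digits (d_n, f_n), and rebuilds x from the digits and the remaining tail F^n(x). The function names are illustrative assumptions, and the starting point is a rational approximation of √2 − 1 chosen only so that exact arithmetic applies.

```python
from fractions import Fraction

def fcf_step(x):
    """Return (d(x), f(x), F(x)) for x in [-1, 0) or (0, infinity)."""
    d, f = (2, -1) if x < 0 else (1, 1)
    return d, f, Fraction(f) / x - d

def fcf_digits(x, n):
    """First n digit pairs (d_k, f_k) of x and the tail F^n(x)."""
    digits = []
    for _ in range(n):
        if x == 0:                      # F(0) = 0: no further digits
            break
        d, f, x = fcf_step(x)
        digits.append((d, f))
    return digits, x

def rebuild(digits, tail):
    """Evaluate f_1/(d_1 + f_2/(d_2 + ... + f_n/(d_n + tail)))."""
    t = tail
    for d, f in reversed(digits):
        t = Fraction(f) / (d + t)
    return t

x = Fraction(4142135623730951, 10**16)   # rational stand-in for sqrt(2) - 1
digits, tail = fcf_digits(x, 5)
```

Since x = f(x)/(d(x) + F(x)) by the definition of F, `rebuild(digits, tail)` returns x exactly.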
Put $D = [1, 2) \times [-1, \infty)$, and define the transformation $\bar L : D \to D$ by
\[ \bar L(x, y) = \left( L(x), \frac{e(x)}{b(x) + y} \right) , \quad (x, y) \in D. \]

It is easy to check that $\bar L$ is a one-to-one transformation of
$D' := [1, 2) \times ([-1, 0) \cup (0, \infty))$ with inverse
\[ \bar L^{-1}(x, y) = \left( \frac{f(y)}{x} + d(y), F(y) \right) , \quad (x, y) \in D'. \]

Also, for any $n \ge 2$ we have

\[ \bar L^n(x, y) = \left( L^n(x), [\, e_n(x)/b_{n-1}(x), \cdots, e_2(x)/b_1(x), e_1(x)/(b_0(x) + y) ] \right) \]

whatever $(x, y) \in D$, and

\[ \bar L^{-n}(x, y) = \left( [\, d_n(y); f_n(y)/d_{n-1}(y), \cdots, f_2(y)/d_1(y), f_1(y)/x ], F^n(y) \right) \]

whatever $(x, y) \in D'$.
Remark. It is interesting to compare the last two equations above with
(1.3.1′) and (1.3.2′). This might suggest developments similar to those in
Section 1.3. □
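The inverse formula can be tested by a round trip through the two-dimensional map. A minimal sketch, assuming the digit functions (b, e) and (d, f) as defined above; the names `L_bar` and `L_bar_inv` are hypothetical, and the sample points avoid y = −1 (where b(x) + y may vanish) and x = 3/2.

```python
from fractions import Fraction

def lehner_digit(x):
    """(b(x), e(x)) for x in [1, 2)."""
    return (2, -1) if x < Fraction(3, 2) else (1, 1)

def farey_digit(y):
    """(d(y), f(y)) for y in [-1, infinity)."""
    return (2, -1) if y < 0 else (1, 1)

def L_bar(x, y):
    """Two-dimensional map (x, y) -> (L(x), e(x)/(b(x) + y))."""
    b, e = lehner_digit(x)
    return e / (x - b), e / (b + y)

def L_bar_inv(x, y):
    """Claimed inverse (x, y) -> (f(y)/x + d(y), F(y)) for y != 0."""
    d, f = farey_digit(y)
    return f / x + d, f / y - d

xs = [1 + Fraction(k, 7) for k in range(7)]
ys = [Fraction(-1, 2), Fraction(1, 3), Fraction(2), Fraction(5)]
round_trip_ok = all(L_bar_inv(*L_bar(x, y)) == (x, y) for x in xs for y in ys)
```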
Theorem 4.4.3 The quadruple $(D, \mathcal{B}_D, \bar L, \bar\mu)$ is an ergodic dynamical
system which is a natural extension of the dynamical system
\[ ([1, 2), \mathcal{B}_{[1,2)}, L, \mu) . \]

Here $\bar\mu$ is the σ-finite, infinite measure on $\mathcal{B}_D$ with density $(x+y)^{-2}$,
$(x, y) \in D = [1, 2) \times [-1, \infty)$.
Proof. Let $\pi_1 : [1, 2) \times [-1, \infty) \to [1, 2)$ denote the projection onto the
first axis. Cf. Remark 1 after Proposition 4.0.5. Then it is easy to check
that $\pi_1 \circ \bar L = L \circ \pi_1$, and that
\[ \bar\mu\left( \pi_1^{-1}(A) \right) = \mu(A), \quad A \in \mathcal{B}_{[1,2)} . \]

We should next show that $\bar L$ is $\bar\mu$-preserving and, finally, that the σ-algebra
generated by
\[ \bigcup_{n\in\mathbb{N}} \bar L^n \pi_1^{-1} \left( \mathcal{B}_{[1,2)} \right) \]
coincides with $\mathcal{B}_D$. We leave the details to the reader, who can find them in
Dajani and Kraaikamp (op. cit.). □
Let us denote by $\phi$ the σ-finite, infinite measure on $\mathcal{B}_{[-1,\infty)}$ with density
$(x+1)^{-1} - (x+2)^{-1}$, $x \in (-1, \infty)$. It is easy to check that $F$ is $\phi$-preserving.
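One way to carry out this check is the usual change-of-variables criterion for an invariant density. The computation below is a sketch: the symbols ρ (the density of φ) and y_1, y_2 (the two preimages of a point x > −1 under F) are introduced here for illustration only, and |F'(y)| = 1/y^2 on both branches.

```latex
% rho(x) = (x+1)^{-1} - (x+2)^{-1} = 1/((x+1)(x+2))
\begin{align*}
y_1 &= \frac{1}{1+x} > 0, &
\frac{\rho(y_1)}{|F'(y_1)|} &= \rho(y_1)\, y_1^2 = \frac{1}{(x+2)(2x+3)}, \\
y_2 &= -\frac{1}{x+2} \in [-1, 0), &
\frac{\rho(y_2)}{|F'(y_2)|} &= \rho(y_2)\, y_2^2 = \frac{1}{(x+1)(2x+3)}, \\
&& \frac{1}{(x+2)(2x+3)} + \frac{1}{(x+1)(2x+3)} &= \frac{1}{(x+1)(x+2)} = \rho(x).
\end{align*}
```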
Theorem 4.4.4 The map $\xi : [-1, 0) \cup (0, \infty) \to [1, 2)$ defined by

\[ \xi(x) = [\, d_1; f_1/d_2, f_2/d_3, \cdots ] , \]

if $x \in [-1, 0) \cup (0, \infty)$ has FCF expansion

\[ x = [\, f_1/d_1, f_2/d_2, \cdots ] \]

is an isomorphism from $([-1, \infty), \mathcal{B}_{[-1,\infty)}, F, \phi)$ to $([1, 2), \mathcal{B}_{[1,2)}, L, \mu)$.

Proof. It is clear that $\xi$ is bijective. Since

\[ L(\xi(x)) = L([\, d_1; f_1/d_2, f_2/d_3, \cdots ]) = [\, d_2; f_2/d_3, f_3/d_4, \cdots ] = \xi([\, f_2/d_2, f_3/d_3, \cdots ]) = \xi(F(x)) , \]

we only need to show that $\xi$ is measurable and that $\mu(A) = \phi\left( \xi^{-1}(A) \right)$ for
any $A \in \mathcal{B}_{[1,2)}$. Whilst measurability is obvious, the equation above can
be easily checked. The details can be found in Dajani and Kraaikamp (op.
cit.). □
An immediate consequence of Theorems 4.4.3 and 4.4.4 is that
\[ ([-1, \infty), \mathcal{B}_{[-1,\infty)}, F, \phi) \]
is an ergodic dynamical system underlying the FCF expansion.
Remark. Corollary 4.1.10 in conjunction with the insertion concept provides
a heuristic argument why the dynamical system $([1, 2), \mathcal{B}_{[1,2)}, L, \mu)$
should be ergodic, where $L$ is $\mu$-preserving for a σ-finite, infinite measure
$\mu$. After all, an insertion before a digit > 1 is simply building a tower over
the RCF cylinder corresponding to that digit. Since the LCF expansion is
obtained by using insertion as many times as possible in order to ‘shrink
away’ any RCF digit > 1, it follows that the system thus obtained should
be ergodic (it includes the RCF dynamical system as an induced system),
but by Corollary 4.1.10 it should have infinite mass. □
The next result corresponds to Proposition 4.1.8 for the values p =
−1, 0, 1 there.
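Theorem 4.4.5 Let $x \in [1, 2) \setminus \mathbb{Q}$ with LCF expansion
$[\, b_0; e_1/b_1, e_2/b_2, \cdots ]$.
Then
\[ \lim_{n\to\infty} \frac{n}{\dfrac{1}{b_1} + \cdots + \dfrac{1}{b_n}} = 2 \ \text{ a.e.}, \]
\[ \lim_{n\to\infty} \sqrt[n]{b_1 \cdots b_n} = 2 \ \text{ a.e.}, \]
\[ \lim_{n\to\infty} \frac{b_1 + \cdots + b_n}{n} = 2 \ \text{ a.e.} \]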
Proof. Let $[1; a_1, a_2, \cdots]$ be the RCF expansion of $x$. For any given
sufficiently large $m \in \mathbb{N}_+$ there (uniquely) exist integers $k \in \mathbb{N}_+$ and $j \in \mathbb{N}$
such that
\[ m = a_1 + \cdots + a_k + j , \quad 0 \le j < a_{k+1} . \]

By Proposition 4.4.2 the LCF expansion is obtained by replacing any RCF
digit $\ell$ by a block of LCF digits of length $\ell$ consisting of $(\ell - 1)$ 2’s followed
by one 1. Then

\[ \frac{1}{b_1} + \cdots + \frac{1}{b_m} = k + \frac{1}{2} \sum_{i=1}^{k} (a_i - 1) + \frac{j}{2} = \frac{m + k}{2} . \]

This implies that

\[ \frac{m}{\dfrac{1}{b_1} + \cdots + \dfrac{1}{b_m}} = \frac{1}{\dfrac{1}{2}\left( 1 + \dfrac{k}{m} \right)} . \]
Since $0 \le j < a_{k+1}$, we have

\[ \frac{k}{m} \le \frac{1}{\dfrac{1}{k} \sum_{i=1}^{k} a_i} , \]

which converges a.e. to 0 by Corollary 4.1.10. Hence

\[ \lim_{m\to\infty} \frac{m}{\dfrac{1}{b_1} + \cdots + \dfrac{1}{b_m}} = 2 . \]

Since any $b_n$, $n \in \mathbb{N}_+$, is equal to either 1 or 2, recalling the classical
inequalities

\[ \frac{m}{\dfrac{1}{b_1} + \cdots + \dfrac{1}{b_m}} \le \sqrt[m]{b_1 \cdots b_m} \le \frac{b_1 + \cdots + b_m}{m} \ (\le 2) , \]

the result follows. □
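The counting step in this proof can be reproduced on a finite list of RCF digits. The sketch below builds the LCF digits by the block replacement described in the proof and checks the identity for the sum of reciprocals; it indexes the digits from the start of the first block (an assumption about where the sum begins), and the names `lcf_blocks` and `harmonic_identity_holds` are hypothetical.

```python
from fractions import Fraction

def lcf_blocks(rcf_digits):
    """Replace each RCF digit a by (a - 1) twos followed by a single one."""
    out = []
    for a in rcf_digits:
        out.extend([2] * (a - 1) + [1])
    return out

def harmonic_identity_holds(rcf_digits, k, j):
    """Check 1/b_1 + ... + 1/b_m == (m + k)/2 for m = a_1 + ... + a_k + j, 0 <= j < a_{k+1}."""
    m = sum(rcf_digits[:k]) + j
    prefix = lcf_blocks(rcf_digits)[:m]
    return len(prefix) == m and sum(Fraction(1, b) for b in prefix) == Fraction(m + k, 2)
```

For the digits 3, 1, 4, 1, 5 with k = 2 and j = 2 the first m = 6 LCF digits are 2, 2, 1, 1, 2, 2, whose reciprocals sum to 4 = (6 + 2)/2.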


Corollary 4.4.6 Let $x \in [-1, \infty) \setminus \mathbb{Q}$, with FCF expansion
$[\, f_1/d_1, f_2/d_2, \cdots ]$. Then
\[ \lim_{n\to\infty} \frac{n}{\dfrac{1}{d_1} + \cdots + \dfrac{1}{d_n}} = 2 \ \text{ a.e.}, \]
\[ \lim_{n\to\infty} \sqrt[n]{d_1 \cdots d_n} = 2 \ \text{ a.e.}, \]
\[ \lim_{n\to\infty} \frac{d_1 + \cdots + d_n}{n} = 2 \ \text{ a.e.} \]

The proof follows from Theorems 4.4.4 and 4.4.5. □

4.4.3 The backward continued fraction expansion


Until now we have used only the insertion mechanism in this section. As
an example of combining singularization and insertion we discuss here the
backward continued fraction (BCF ) expansion.
Any irrational number $\omega \in I$ has an infinite CF expansion of the form
\[ 1 - \cfrac{1}{c_1 - \cfrac{1}{c_2 - \ddots}} := [\, 1; -1/c_1, -1/c_2, \cdots ] , \tag{4.4.6} \]
where $2 \le c_n = c_n(\omega) \in \mathbb{N}_+$, so that (4.4.6) is an SRCF expansion. There
is a transformation $\beta : I \to I$ naturally associated with the RCF transformation
$\tau$, which is defined by
\[ \beta(x) = \begin{cases} (x-1)^{-1} - \left\lfloor (x-1)^{-1} \right\rfloor & \text{if } x \in [0, 1), \\ 0 & \text{if } x = 1. \end{cases} \]
The graph of $\beta$ can be obtained from that of $\tau$ by reflecting the latter in the
line $x = 1/2$. It is for this reason that (4.4.6) has been called ‘backward’.
Note also that $\beta(x) = -N_0(x - 1)$, $x \in I$, where $N_0$ is defined in Subsection
4.3.1.
In terms of $\beta$, the incomplete BCF quotients are given by
$c_n = c_1\left( \beta^{n-1}(\omega) \right)$, $n \in \mathbb{N}_+$, with $c_1 = \left\lfloor (1 - \omega)^{-1} \right\rfloor$, $\omega \in \Omega$. Here $\beta^n$, $n \in \mathbb{N}_+$,
denotes the composition of $\beta$ with itself $n$ times while $\beta^0$ is the identity
map. Rényi (1957) showed that $\beta$ is $\nu$-preserving, where $\nu$ is Ito’s σ-finite,
infinite measure with density $x^{-1}$, $x \in (0, 1)$, which has been considered in
Subsection 4.4.2, and that the dynamical system $(I, \mathcal{B}_I, \beta, \nu)$ is ergodic. See
also Adler and Flatto (1984).

As with Proposition 4.4.2 we leave to the reader the proof of the following
result.
Proposition 4.4.7 Let $\omega \in \Omega$ with RCF expansion $[a_1, a_2, \cdots]$. Then
the BCF expansion (4.4.6) of $\omega$ is given by the following algorithm.
(i) If $a_1 = 1$ then singularize $a_1$ to arrive at

\[ [\, 1; -1/(a_2 + 1), 1/a_3, \cdots ] \]

as a new SRCF expansion of $\omega$. If $a_1 > 1$ then insert $(a_1 - 1)$ times
$-1/1$ before $a_1$ to arrive at

\[ [\, 1; \underbrace{-1/2, \cdots, -1/2}_{(a_1-2)\ \text{times}}, -1/1, 1/1, 1/a_2, \cdots ] \]

as a new SRCF expansion of $\omega$, and then singularize the digit 1 appearing
before $1/a_2$ in this expansion of $\omega$. In either case we obtain
as SRCF expansion of $\omega$

\[ [\, 1; (-1/2)^{a_1-1}, -1/(a_2 + 1), 1/a_3, \cdots ] , \tag{4.4.7} \]

where $(-1/2)^{a_1-1}$ abbreviates $\underbrace{-1/2, \cdots, -1/2}_{(a_1-1)\ \text{times}}$.

(ii) Let $n$ be the smallest integer $m \in \mathbb{N}_+$ for which $e_m = 1$ in (4.4.7).
Apply to the latter expansion the procedure from (i) to $a_n$.

Remarks. 1. The above insertion/singularization mechanism implies
that $\omega$ has a BCF expansion

\[ [\, 1; (-1/2)^{a_1-1}, -1/(a_2 + 2), (-1/2)^{a_3-1}, -1/(a_4 + 2), \cdots ] . \tag{4.4.8} \]

See also Zagier (1981, Aufgabe 3, p. 131). It also follows easily from (4.4.8)
that every quadratic irrationality has an (eventually) periodic BCF expansion.
2. Again, as for the LCF expansion, it heuristically follows from Corollary
4.1.10 and the insertion mechanism that the BCF transformation $\beta$
should be ergodic, with invariant σ-finite, infinite measure. □
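The digit pattern in (4.4.8) can be tested on finite RCF expansions. The following sketch converts an even-length RCF digit list into BCF digits and compares the two values exactly. For a finite expansion the last BCF digit is a_{2k} + 1 rather than a_{2k} + 2; this truncation adjustment is not stated in the text and is an assumption of the sketch, as are the function names.

```python
from fractions import Fraction

def rcf_value(digits):
    """Value of the finite RCF [0; a_1, ..., a_n]."""
    t = Fraction(0)
    for a in reversed(digits):
        t = 1 / (a + t)
    return t

def bcf_digits(rcf_digits):
    """BCF digits c_1, c_2, ... of a finite RCF with an even number of digits."""
    if len(rcf_digits) % 2 != 0:
        raise ValueError("expected an even number of RCF digits")
    out = []
    for odd, even in zip(rcf_digits[0::2], rcf_digits[1::2]):
        out.extend([2] * (odd - 1) + [even + 2])   # pattern of (4.4.8)
    out[-1] -= 1                                   # finite-truncation adjustment
    return out

def bcf_value(cs):
    """Value of 1 - 1/(c_1 - 1/(c_2 - ... - 1/c_n))."""
    t = Fraction(0)
    for c in reversed(cs):
        t = 1 / (c - t)
    return 1 - t
```

For instance, [0; 3, 1, 4, 1] = 6/23 has BCF digits 2, 2, 3, 2, 2, 2, 2, and `bcf_value` of these digits returns 6/23 again.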

For the LCF expansion it was intuitively clear that $\sqrt[n]{b_1 \cdots b_n} \to 2$ a.e. as
$n \to \infty$ since the only digits are 1 and 2, and ‘there are very few 1’s against
the 2’s’ (by Corollary 4.1.10). For the BCF expansion such an argument
clearly does not work. However, we have the following result.
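Theorem 4.4.8 Let $\omega \in \Omega$ with BCF expansion (4.4.6). Then

\[ \lim_{n\to\infty} \sqrt[n]{c_1 \cdots c_n} = 2 \ \text{ a.e.} \]

and
\[ \lim_{n\to\infty} \frac{n}{\dfrac{1}{c_1} + \cdots + \dfrac{1}{c_n}} = 2 \ \text{ a.e.} \]

Proof. Let $[a_1, a_2, \cdots]$ be the RCF expansion of $\omega$. For any given sufficiently
large $m \in \mathbb{N}_+$ there (uniquely) exist integers $k \in \mathbb{N}_+$ and $j \in \mathbb{N}$
such that
\[ m = a_1 + a_3 + \cdots + a_{2k-1} + j , \quad 0 \le j < a_{2k+1} . \]
It follows from (4.4.8) that
\[ c_1 \cdots c_m = 2^{\sum_{i=1}^{k} (a_{2i-1} - 1) + j - 1} \prod_{i=1}^{k} (a_{2i} + 2) , \]
and therefore
\[ \frac{1}{m} \sum_{i=1}^{m} \log c_i = \frac{\log 2}{m} \left( \sum_{i=1}^{k} a_{2i-1} - k + j - 1 \right) + \frac{1}{m} \sum_{i=1}^{k} \log(a_{2i} + 2) \]
\[ = (\log 2) \left( 1 - \frac{k+1}{\sum_{i=1}^{k} a_{2i-1} + j} \right) + \frac{\sum_{i=1}^{k} \log(a_{2i} + 2)}{\sum_{i=1}^{k} a_{2i-1} + j} . \]

Since
\[ \frac{k+1}{\sum_{i=1}^{k} a_{2i-1} + j} = \frac{1}{\dfrac{1}{k+1} \sum_{i=1}^{k} a_{2i-1} + \dfrac{j}{k+1}} \to 0 \ \text{ a.e.} \]
as $m \to \infty$, and
\[ \frac{\sum_{i=1}^{k} \log(a_{2i} + 2)}{\sum_{i=1}^{k} a_{2i-1} + j} \to 0 \ \text{ a.e.} \]
as $m \to \infty$, we deduce that

\[ \sqrt[m]{c_1 \cdots c_m} \to 2 \ \text{ a.e.} \]

as $m \to \infty$. Next, since $c_n \ge 2$, $n \in \mathbb{N}_+$, we have

\[ \frac{m}{\dfrac{1}{c_1} + \cdots + \dfrac{1}{c_m}} \ge 2 . \]
Using the same inequalities as in the proof of Theorem 4.4.5 we therefore
obtain
\[ 2 \le \lim_{m\to\infty} \frac{m}{\dfrac{1}{c_1} + \cdots + \dfrac{1}{c_m}} \le \lim_{m\to\infty} \sqrt[m]{c_1 \cdot c_2 \cdots c_m} = 2 , \]
that is,
\[ \lim_{m\to\infty} \frac{m}{\dfrac{1}{c_1} + \cdots + \dfrac{1}{c_m}} = 2 \ \text{ a.e.} \]
□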
Remark. The asymptotic behaviour of the arithmetic mean
\[ \frac{c_1 + \cdots + c_m}{m} \]
as $m \to \infty$ was posed as an open problem in Dajani and Kraaikamp (2000).
If we write $m$ as before, then an easy calculation yields
\[ \frac{c_1 + \cdots + c_m}{m} = 2 + \frac{\sum_{i=1}^{k} a_{2i}}{j + \sum_{i=1}^{k} a_{2i-1}} , \]

with $0 \le j < a_{2k+1}$. Thus we need to study the behaviour of

\[ \frac{\sum_{i=1}^{k} a_{2i}}{\sum_{i=1}^{k} a_{2i-1}} \tag{4.4.9} \]

as $k \to \infty$. The asymptotic behaviour of the numerator in (4.4.9) is the same
as that of the denominator, and Aaronson (1986) showed that the fraction
converges to 1 in probability. However, one expects that infinitely often
the denominator is much larger than the numerator, and vice versa. Thus
Dajani and Kraaikamp (op. cit.) conjectured that the lim inf and lim sup
of (4.4.9) are a.e. equal to 0 and +∞, respectively. Recently, Aaronson and
Nakada (2001) have proved this conjecture. □
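The 'easy calculation' behind the formula for the arithmetic mean can be replayed on a finite digit list. A minimal sketch, assuming the digit pattern of (4.4.8) with j further digits equal to 2 after the k-th block; the function names are hypothetical.

```python
from fractions import Fraction

def bcf_prefix(rcf_digits, k, j):
    """First m = a_1 + a_3 + ... + a_{2k-1} + j BCF digits, following (4.4.8)."""
    out = []
    for i in range(k):
        odd, even = rcf_digits[2 * i], rcf_digits[2 * i + 1]
        out.extend([2] * (odd - 1) + [even + 2])
    return out + [2] * j               # requires 0 <= j < a_{2k+1}

def mean_identity_holds(rcf_digits, k, j):
    """Check (c_1 + ... + c_m)/m == 2 + (a_2 + ... + a_{2k})/(j + a_1 + ... + a_{2k-1})."""
    cs = bcf_prefix(rcf_digits, k, j)
    odd_sum = sum(rcf_digits[0:2 * k:2])
    even_sum = sum(rcf_digits[1:2 * k:2])
    m = odd_sum + j
    return len(cs) == m and Fraction(sum(cs), m) == 2 + Fraction(even_sum, m)
```

With RCF digits 3, 1, 4, 1, 5, 9, 2, 6 and k = 2, j = 3 there are m = 10 BCF digits with sum 22, so the mean is 2 + 2/10.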
Appendix 1: Spaces, functions, and measures

A1.1
Let X be an arbitrary non-empty set. A non-empty collection X of subsets
of X is said to be a σ-algebra (in X) if and only if it is closed under the for-
mation of complements and countable unions. Clearly, ∅ and X both belong
to X , and X is also closed under the formation of countable intersections.
For any non-empty collection C of subsets of X the σ-algebra generated by
C, denoted σ(C), is defined as the smallest σ-algebra in X which contains C.
Clearly, σ(C) is the intersection of all σ-algebras in X which contain C.
A pair (X, X ) consisting of a non-empty set X and a σ-algebra X in X is
called a measurable space. In the special case where X is a denumerable set
the usual σ-algebra in X is P(X), the collection of all subsets of X. Clearly,
P(X) is generated by the elements of X : P(X) = σ ({x} : x ∈ X).
The product of two measurable spaces (X, X ) and (Y, Y) is the measur-
able space (X × Y, X ⊗ Y), where the product σ-algebra X ⊗ Y is defined as
σ(C) with C = (A × B : A ∈ X , B ∈ Y).

A1.2
Let (X, X ) and (Y, Y) be two measurable spaces. A map f : X → Y from X
into Y is said to be (X , Y)-measurable or a Y -valued random variable (r.v.)
on X if and only if the inverse image f −1 (A) = (x ∈ X : f (x) ∈ A) of
every set A ∈ Y is in X . Setting f −1 (Y) = (f −1 (A) : A ∈ Y), the above
condition can be compactly written as f −1 (Y) ⊂ X . [Note that f −1 (Y) is
always a σ-algebra in X whatever f : X → Y ! ]
Let (X, X ) be a measurable space, let ((Yi , Yi ))i∈I be a family of
measurable spaces, and for any i ∈ I let fi be a Yi -valued r.v. on X. Then

the σ-algebra $\sigma\left( \bigcup_{i\in I} f_i^{-1}(\mathcal{Y}_i) \right)$ is called the σ-algebra generated by the family
$(f_i)_{i\in I}$ and is denoted $\sigma((f_i)_{i\in I})$. Clearly, this is the smallest σ-algebra $\mathcal{S} \subset \mathcal{X}$
having the property that $f_i$ is $(\mathcal{S}, \mathcal{Y}_i)$-measurable for any $i \in I$.

A1.3
Let $(X, \mathcal{X})$ be a measurable space. A function $\mu : \mathcal{X} \to \mathbb{R}_+$ is said to
be a (finite) measure on $\mathcal{X}$ if and only if it is completely additive, that
is, for any sequence $(A_i)_{i\in\mathbb{N}_+}$ of pairwise disjoint elements of $\mathcal{X}$ we have
$\mu\left( \bigcup_{i\in\mathbb{N}_+} A_i \right) = \sum_{i\in\mathbb{N}_+} \mu(A_i)$. Complete additivity is equivalent to finite
additivity [that is, for any finite collection $A_1, \ldots, A_n$ of pairwise disjoint
elements of $\mathcal{X}$, we have $\mu\left( \bigcup_{i=1}^{n} A_i \right) = \sum_{i=1}^{n} \mu(A_i)$] in conjunction with continuity
at ∅ (that is, for any decreasing sequence $A_1 \supset A_2 \supset \ldots$ of elements
of $\mathcal{X}$ with $\bigcap_{i\in\mathbb{N}_+} A_i = \emptyset$ we have $\lim_{n\to\infty} \mu(A_n) = 0$). Clearly, finite
additivity implies $\mu(\emptyset) = 0$. In the special case where X is a denumerable
set a measure µ on P(X) is defined by simply giving the values µ ({x}) for
the elements x ∈ X. A probability on X is a measure P on X satisfying
P (X) = 1. An important example of a probability on X is that of the
probability δx concentrated at x for any given x ∈ X, which is defined by
δx (A) = IA (x), A ∈ X . The collection of all measures (probabilities) on X
will be denoted m(X ) (pr(X )).
A triple (X, X , P ) consisting of a measurable space (X, X ) and a prob-
ability P on X is called a probability space. [The traditional notation for
a probability space is (Ω, K, P ). The points ω ∈ Ω are interpreted as the
possible outcomes (elementary events) of a random experiment, and the sets
A ∈ K as the (random) events associated with it; these are the subsets of
Ω arising as the truth sets of certain statements concerning the experiment.]
We say that A ∈ X occurs P -almost surely, and write A P -a.s., if and only
if $P(A) = 1$. Let $(Y, \mathcal{Y})$ be a measurable space and let $f$ be a $Y$-valued
r.v. on $X$. The $P$-distribution of $f$ is the probability $P f^{-1}$ on $\mathcal{Y}$ defined by
$P f^{-1}(A) = P(f^{-1}(A))$, $A \in \mathcal{Y}$.
Let (X, X ) and (Y, Y) be two measurable spaces. The product measure
of µ ∈ m(X ) and ν ∈ m(Y) is the (unique) measure µ ⊗ ν ∈ m (X ⊗ Y)
satisfying the equation µ ⊗ ν(A × B) = µ(A)ν(B) for any A ∈ X and B ∈ Y.

A1.4
Let X be a metric space with metric d. The usual σ-algebra in X, denoted
BX , is that of Borel subsets of X, that is, the σ-algebra generated by the
collection of all open subsets of X. In the special case where $X = \mathbb{R}^n$
($n$-dimensional Euclidean space) we write $\mathcal{B}^n$ for $\mathcal{B}_{\mathbb{R}^n}$, $n \in \mathbb{N}_+$, and $\mathcal{B} = \mathcal{B}^1$.
Further, if $X$ is a Borel subset $M$ of $\mathbb{R}^n$, then $\mathcal{B}_M = \mathcal{B}^n \cap M = (A \cap M : A \in \mathcal{B}^n)$, $n \in \mathbb{N}_+$.
A sequence $(\mu_n)_{n\in\mathbb{N}_+}$ of measures on $\mathcal{B}_X$ is said to converge weakly to a
measure $\mu$ on $\mathcal{B}_X$, and we write $\mu_n \xrightarrow{w} \mu$, if and only if
\[ \lim_{n\to\infty} \int_X h \, d\mu_n = \int_X h \, d\mu \]

for any $h \in C_r(X)$ = the set of all real-valued bounded continuous functions
on $(X, d)$. An equivalent definition is obtained by asking that

\[ \lim_{n\to\infty} \mu_n(A) = \mu(A) \tag{A1.1} \]

for any $A \in \mathcal{B}_X$ for which $\mu(\partial A) = 0$, where $\partial A$ is the boundary of $A$
defined as the closure of $A$ minus the interior of $A$. In the special case
where $X = \mathbb{R}$, putting $F_n(x) = \mu_n((-\infty, x])$ and $F(x) = \mu((-\infty, x])$,
$x \in \mathbb{R}$, equation (A1.1) holds if and only if $\lim_{n\to\infty} \mu_n(\mathbb{R}) = \mu(\mathbb{R})$ and
$\lim_{n\to\infty} F_n(x) = F(x)$ for any point of continuity $x$ of $F$.
The Prokhorov metric $d_P$ on $\mathrm{pr}(\mathcal{B}_X)$ is defined by

\[ d_P(P, Q) = \inf(\varepsilon > 0 : P(A) \le Q(A^\varepsilon) + \varepsilon, \ A \subset X, \ A \text{ closed}), \quad P, Q \in \mathrm{pr}(\mathcal{B}_X), \]

where $A^\varepsilon = (x : d(x, A) < \varepsilon)$ and $d(x, A) = \inf(d(x, y) : y \in A)$. If the
metric space $(X, d)$ is separable, then for $P, P_n \in \mathrm{pr}(\mathcal{B}_X)$, $n \in \mathbb{N}_+$, the
weak convergence of $P_n$ to $P$ is equivalent to $\lim_{n\to\infty} d_P(P_n, P) = 0$.
Let $(X, d)$ and $(Y, d')$ be two metric spaces. Consider a $Y$-valued r.v. $f$
on $X$. The set $D_f$ of all discontinuity points of $f$ belongs to $\mathcal{B}_X$ since it
can be written as $\bigcup_\varepsilon \bigcap_\delta A_{\varepsilon,\delta}$, where $\varepsilon$ and $\delta$ vary over the positive rational
numbers, and $A_{\varepsilon,\delta}$ is the (open) set of all points $x \in X$ for which there
exist $x', x'' \in X$ such that $d(x, x') < \delta$, $d(x, x'') < \delta$ and $d'(f(x'), f(x'')) \ge \varepsilon$.
Proposition A1.1 If $P_n, P \in \mathrm{pr}(\mathcal{B}_X)$, $P_n \xrightarrow{w} P$, and $P(D_f) = 0$, then
$P_n f^{-1} \xrightarrow{w} P f^{-1}$.

In particular, the above result holds for a continuous f for which clearly
Df = ∅. For a characterization via weak convergence of almost every-
where continuous functions f , that is, such that P (Df ) = 0, see Mazzone
(1995/96).

A1.5
In this section (X, d) is the real line with the usual Euclidean distance.
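The characteristic function (ch.f.) or Fourier transform of a measure
$\mu \in m(\mathcal{B})$ is the complex-valued function $\hat\mu$ defined on $\mathbb{R}$ by
\[ \hat\mu(t) = \int_{\mathbb{R}} e^{itx} \mu(dx), \quad t \in \mathbb{R}. \]

If $\hat\mu = \hat\nu$ for two measures $\mu, \nu \in m(\mathcal{B})$, then $\mu = \nu$.
Proposition A1.2 (Lévy-Cramér continuity theorem) Let $P, P_n \in \mathrm{pr}(\mathcal{B})$, $n \in \mathbb{N}_+$.
(i) $P_n \xrightarrow{w} P \in \mathrm{pr}(\mathcal{B})$ implies $\lim_{n\to\infty} \hat P_n = \hat P$ pointwise, and the convergence
of ch.f.s is uniform on compact subsets of $\mathbb{R}$.
(ii) If $\lim_{n\to\infty} \hat P_n = h$ pointwise and $h$ is continuous at 0, then $h$ is the
ch.f. of a probability $P \in \mathrm{pr}(\mathcal{B})$ and $P_n \xrightarrow{w} P$.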
Let $\mu, \nu \in m(\mathcal{B})$. The convolution $\mu * \nu$ is the measure on $\mathcal{B}$ defined by
\[ \mu * \nu(A) = \int_{\mathbb{R}} \mu(A - x) \nu(dx), \quad A \in \mathcal{B}, \]

where $A - x := (y - x : y \in A)$, $x \in \mathbb{R}$.
The convolution operator $*$ is associative and commutative. We have

\[ \widehat{\mu * \nu} = \hat\mu \, \hat\nu, \quad \mu, \nu \in m(\mathcal{B}). \]

For any $n \in \mathbb{N}_+$ let $f_i$, $1 \le i \le n$, be real-valued r.v.s on a probability
space $(\Omega, \mathcal{K}, P)$. The $f_i$ are said to be independent if and only if the
σ-algebras $f_i^{-1}(\mathcal{B})$, $1 \le i \le n$, are $P$-independent, that is,
\[ P\left( \bigcap_{i=1}^{n} A_i \right) = \prod_{i=1}^{n} P(A_i) \]

for any $A_i \in f_i^{-1}(\mathcal{B})$, $1 \le i \le n$. For independent real-valued r.v.s $f_i$, $1 \le i \le n$,
the ch.f. of the $P$-distribution $P\left( \sum_{i=1}^{n} f_i \right)^{-1}$ of the sum $\sum_{i=1}^{n} f_i$ is equal
to the product of the ch.f.s of the $P$-distributions $P f_i^{-1}$ of the summands,
$1 \le i \le n$. Also, $P\left( \sum_{i=1}^{n} f_i \right)^{-1}$ is the convolution of the $P f_i^{-1}$, $1 \le i \le n$.
Let $\mu \in m(\mathcal{B})$. For any $n \in \mathbb{N}_+$ the $n$th convolution $\mu^{*n}$ of $\mu$ with itself
is defined recursively by $\mu^{*1} = \mu$ and $\mu^{*n} = \mu^{*(n-1)} * \mu$ for $n \ge 2$. Define
also $\mu^{*0}$ as $\delta_0$.
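Let $\mu \in m(\mathcal{B})$. The Poisson probability $\mathrm{Pois}\,\mu$ associated with $\mu$ is defined
as
\[ \mathrm{Pois}\,\mu = e^{-\mu(\mathbb{R})} \sum_{n\in\mathbb{N}} \frac{\mu^{*n}}{n!} = e^{\mu - \mu(\mathbb{R})} . \]
Its ch.f. is $\widehat{\mathrm{Pois}\,\mu} = \exp(\hat\mu - \hat\mu(0))$. The classical Poisson distribution $P(\theta)$
with parameter $\theta > 0$ is $\mathrm{Pois}(\theta\delta_1)$.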
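For the classical case μ = θδ_1 the two expressions for the ch.f. can be compared numerically. The sketch below sums the defining series term by term (the n-fold convolution of θδ_1 is θ^n δ_n, whose ch.f. is θ^n e^{itn}) and compares it with exp(μ̂(t) − μ̂(0)) = exp(θ(e^{it} − 1)). The function names and the truncation at 80 terms are assumptions of the sketch.

```python
import cmath
import math

def pois_chf_series(theta, t, terms=80):
    """ch.f. of Pois(theta * delta_1), summed from the series definition."""
    return math.exp(-theta) * sum(
        theta ** n / math.factorial(n) * cmath.exp(1j * t * n)
        for n in range(terms)
    )

def pois_chf_closed(theta, t):
    """exp(mu_hat(t) - mu_hat(0)) with mu = theta * delta_1."""
    return cmath.exp(theta * (cmath.exp(1j * t) - 1))

# Difference between the two expressions at a sample point.
gap = abs(pois_chf_series(2.5, 0.7) - pois_chf_closed(2.5, 0.7))
```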
A measure on $\mathcal{B}$ is said to be a Lévy measure if and only if it integrates
the function $\min(1, x^2)$ on the whole of $\mathbb{R}$. Given a Lévy measure $\mu$, the
$\tau$-centered Poisson probability $c_\tau \mathrm{Pois}\,\mu$, $\tau > 0$, is defined as the probability
with characteristic function
\[ \exp\left( \int_{\mathbb{R}} \left( e^{itx} - 1 - itx \, I_{[-\tau,\tau]}(x) \right) \mu(dx) \right) . \]

We have $c_\tau \mathrm{Pois}\,\mu = (\mathrm{Pois}\,\mu) * \delta_{b(\tau)}$, where

\[ b(\tau) = - \int_{-\tau}^{\tau} x \, \mu(dx). \]

A probability $P \in \mathrm{pr}(\mathcal{B})$ is said to be infinitely divisible if and only if for
any $n \in \mathbb{N}_+$ there exists $P_n \in \mathrm{pr}(\mathcal{B})$ such that $P_n^{*n} = P$.
Proposition A1.3 (Lévy–Khinchin representation) $P \in \mathrm{pr}(\mathcal{B})$ is infinitely
divisible if and only if there exist $\sigma \ge 0$ and a Lévy measure $\nu$, and
for any $\tau > 0$ there exists $a_\tau \in \mathbb{R}$ such that
\[ \hat P(t) = \exp\left( i t a_\tau - \frac{\sigma^2 t^2}{2} + \int_{\mathbb{R}} \left( e^{itx} - 1 - itx \, I_{[-\tau,\tau]}(x) \right) \nu(dx) \right) , \quad t \in \mathbb{R}. \]

It follows from Proposition A1.3 that an infinitely divisible probability is
the convolution of a normal distribution $N(a_\tau, \sigma^2)$ and a $\tau$-centered Poisson
probability $c_\tau \mathrm{Pois}\,\nu$. Either of the two terms can be degenerate, that is, the
cases $\sigma = 0$ and $\nu \equiv 0$ are allowed.
An important special class of infinitely divisible probabilities on $\mathcal{B}$ is that
of stable probabilities. A probability $P \in \mathrm{pr}(\mathcal{B})$ is said to be stable if and
only if for any $n \in \mathbb{N}_+$ there exist $A_n \in \mathbb{R}_{++}$ and $B_n \in \mathbb{R}$ such that

\[ P^{*n} = P f_n^{-1} , \]

where $f_n$ is the affine function on $\mathbb{R}$ defined by

\[ f_n(x) = A_n x + B_n , \quad x \in \mathbb{R}. \tag{A1.2} \]
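If $B_n = 0$ for any $n \in \mathbb{N}_+$, then $P$ is said to be strictly stable. It appears
that the only constants $A_n$ allowed in (A1.2) are $A_n = n^{1/\alpha}$, $n \in \mathbb{N}_+$, with
$\alpha \in (0, 2]$, and then $\alpha$ is called the order of $P$. A probability $P \in \mathrm{pr}(\mathcal{B})$ is
stable of order $\alpha$ if and only if its ch.f. $\hat P$ has the form

\[ \hat P(t) = \exp\left[ i a t - c |t|^\alpha \left( 1 - i b \, \mathrm{sgn}\, t \ \sigma(t, \alpha) \right) \right] , \quad t \in \mathbb{R}, \]

where $a, b, c \in \mathbb{R}$ with $|b| \le 1$ and $c \ge 0$, and

\[ \sigma(t, \alpha) = \begin{cases} \operatorname{tg} \dfrac{\pi\alpha}{2} & \text{if } \alpha \ne 1, \\[2mm] \dfrac{2}{\pi} \log |t| & \text{if } \alpha = 1. \end{cases} \]

In particular, a stable probability has order 2 if and only if it is normal.
An important example of a stable probability is that of the 1-centered
Poisson probability $c_1 \mathrm{Pois}\,\mu_{k_1,k_2,\alpha}$, $0 < \alpha < 2$, $k_1, k_2 \ge 0$, $k_1 + k_2 > 0$,
whose Lévy measure has density
\[ \frac{\mu_{k_1,k_2,\alpha}(dx)}{dx} = \left( k_2 I_{(-\infty,0)}(x) + k_1 I_{(0,\infty)}(x) \right) |x|^{-1-\alpha} , \quad x \ne 0. \]
The ch.f. $h_{k_1,k_2,\alpha}$ of $c_1 \mathrm{Pois}\,\mu_{k_1,k_2,\alpha}$ is

\[ h_{k_1,k_2,\alpha}(t) = \exp\left[ k_2 \int_{-\infty}^{0} \left( e^{itx} - 1 - itx \, I_{[-1,0)}(x) \right) |x|^{-1-\alpha} dx + k_1 \int_{0}^{\infty} \left( e^{itx} - 1 - itx \, I_{(0,1]}(x) \right) x^{-1-\alpha} dx \right] , \quad t \in \mathbb{R}, \]

which can be expressed in terms of elementary functions as follows. We have

\[ h_{k_1,k_2,1}(t) = \exp\left\{ i (k_2 - k_1)(C - 1) t - \frac{\pi(k_1 + k_2)}{2} \left( 1 + i \, \frac{2}{\pi} \, \frac{k_1 - k_2}{k_1 + k_2} \, \mathrm{sgn}\, t \, \log |t| \right) |t| \right\} , \]
where $C = 0.57721\ldots$ is Euler’s constant, while for $\alpha \ne 1$, $0 < \alpha < 2$,
\[ h_{k_1,k_2,\alpha}(t) = \exp\left\{ \frac{i (k_2 - k_1) t}{1 - \alpha} + (k_1 + k_2) \frac{\Gamma(2 - \alpha)}{\alpha(\alpha - 1)} \cos\frac{\pi\alpha}{2} \left( 1 + i \, \frac{k_1 - k_2}{k_1 + k_2} \, \mathrm{sgn}\, t \ \operatorname{tg} \frac{\pi\alpha}{2} \right) |t|^\alpha \right\} , \]

where $\Gamma$ is the classical gamma function.
Actually, any stable probability of order $\alpha \ne 2$ has the form

\[ \delta_a * c_1 \mathrm{Pois}\,\mu_{k_1,k_2,\alpha} \]

with $a \in \mathbb{R}$, $k_1, k_2 \ge 0$, $k_1 + k_2 > 0$.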

A1.6
Let $C = C_r(I)$ be the metric space of real-valued continuous functions
on $I = [0, 1]$ with the uniform metric

\[ d(x, y) = \sup_{t\in I} |x(t) - y(t)| , \quad x, y \in C. \]

The space $C$ is complete and separable. The σ-algebra $\mathcal{B}_C$ of Borel sets
in $(C, d)$ coincides with the σ-algebra $\mathcal{B}^I \cap C$. Here $\mathcal{B}^I$ denotes the σ-algebra
in $\mathbb{R}^I$ generated by the collection of its subsets of the form $\prod_{t\in I} A_t$,
where $A_t \in \mathcal{B}$, $t \in I$, and $A_t \ne \mathbb{R}$ for finitely many $t \in I$.
Of paramount importance is the probability $W$ on $\mathcal{B}_C$ known as the
Wiener measure, for which

\[ W(x : x(0) = 0) = 1, \]

\[ W(x : x(t_i) - x(t_{i-1}) \le a_i , \ 1 \le i \le k) = \prod_{i=1}^{k} \frac{1}{\sqrt{2\pi(t_i - t_{i-1})}} \int_{-\infty}^{a_i} e^{-u^2/2(t_i - t_{i-1})} du \]

for any $k \in \mathbb{N}_+$, $0 \le t_0 < t_1 < \cdots < t_k \le 1$, $a_i \in \mathbb{R}$, $1 \le i \le k$.


Let $D = D(I) \ (\supset C_r(I))$ be the metric space of real-valued functions
on $I$ which are right continuous and have left limits, with the Skorohod
metric $d_0$ to be defined below. Clearly, we can also consider the uniform
metric $d$ in $D$ which is defined similarly to that in $C$, that is,
$d(x, y) = \sup_{t\in I} |x(t) - y(t)|$, $x, y \in D$.
Let $\mathcal{L}$ denote the set of all strictly increasing continuous functions
$\ell : I \to I$ with $\ell(0) = 0$, $\ell(1) = 1$, and put

\[ s_0(\ell) = \sup_{s \ne t} \left| \log\left[ (\ell(t) - \ell(s)) / (t - s) \right] \right| \]

for any $\ell \in \mathcal{L}$. The distance $d_0(x, y) \ (\le d(x, y))$ for $x, y \in D$ is defined as
the infimum of all $\varepsilon > 0$ for which there exists $\ell \in \mathcal{L}$ such that $s_0(\ell) \le \varepsilon$
and $\sup_{t\in I} |x(t) - y(\ell(t))| \le \varepsilon$. The metrics $d_0$ and $d$ generate the same
topology in $D$. Nevertheless, while $D$ is complete and separable under $d_0$,
separability does not hold under $d$.
The σ-algebra $\mathcal{B}_D$ of Borel sets in $(D, d_0)$ coincides with the σ-algebra $\mathcal{B}^I \cap D$.
Wiener measure $W$ can be immediately extended from $\mathcal{B}_C$ to $\mathcal{B}_D$ as the
topologies induced in $D$ by the metrics $d_0$ and $d$ are identical. Hence $A \cap C \in \mathcal{B}_C$
for any $A \in \mathcal{B}_D$. This allows us to define $W(A) = W(A \cap C)$, $A \in \mathcal{B}_D$.
Clearly, $C$ is the support of $W$ in $D$, that is, the smallest closed subset of
$D$ whose $W$-measure equals 1.
General references: Araujo and Giné (1980), Billingsley (1968), Halmos
(1950), Hoffmann–Jørgensen (1994), Samorodnitsky and Taqqu (1994).
Appendix 2: Regularly varying functions

A2.1
A measurable function $R : [r, \infty) \to \mathbb{R}_+$, where $r \in \mathbb{R}_+$, is said to be
regularly varying (at ∞) of index $\alpha \in \mathbb{R}$ if and only if there exists $x_0 \ge r$
such that $R([x_0, \infty)) \subset \mathbb{R}_{++}$ and
\[ \lim_{x\to\infty} \frac{R(tx)}{R(x)} = t^\alpha \]
for any $t \in \mathbb{R}_{++}$. A regularly varying function of index 0 is called a slowly
varying function.
It is obvious that $R$ is regularly varying of index $\alpha$ if and only if it can
be written in the form
\[ R(x) = x^\alpha L(x), \quad x \in (r, \infty), \]
where $L$ is a slowly varying function.
The general form of a slowly varying function is described by the celebrated
Karamata theorem below [cf. Seneta (1976, Theorem 1.2 and its
Corollary)].
Theorem A2.1 (Representation theorem) Let $r \in \mathbb{R}_+$. A function
$L : [r, \infty) \to \mathbb{R}_+$ is slowly varying if and only if
\[ L(x) = c(x) \exp\left( \int_{x_0}^{x} \frac{\varepsilon(t)}{t} \, dt \right) , \quad x \ge x_0 , \]

for some $x_0 \ge r$, where the function $c : [x_0, \infty) \to \mathbb{R}_+$ is bounded and
measurable and $\lim_{x\to\infty} c(x) = c > 0$ while the function $\varepsilon : [x_0, \infty) \to \mathbb{R}$ is
continuous and $\lim_{x\to\infty} \varepsilon(x) = 0$.
Corollary A2.2 If $L$ is a slowly varying function, then

(i) $\lim_{x\to\infty} L(x + y)/L(x) = 1$ for any $y \in \mathbb{R}_{++}$;

(ii) $\lim_{x\to\infty} x^\varepsilon L(x) = \infty$ and $\lim_{x\to\infty} x^{-\varepsilon} L(x) = 0$ for any $\varepsilon > 0$;

(iii) $L$ is bounded on finite intervals in $[x_0, \infty)$ if $x_0 \ge r$ is large enough.

There exist necessary or sufficient integral conditions for slow variation
which are easy to check and use for theoretical and practical purposes. Here
are two such results. See, e.g., Seneta (1976, pp. 53-56 and 86-88).
Theorem A2.3 Let $r \in \mathbb{R}_+$. If $L : [r, \infty) \to \mathbb{R}_+$ is a slowly varying
function and $x_0 \ge r$ is so large that $L$ is bounded on finite intervals in $[x_0, \infty)$,
then for any $\alpha \ge -1$ we have

\[ \lim_{x\to\infty} \frac{x^{\alpha+1} L(x)}{\int_{x_0}^{x} y^\alpha L(y) \, dy} = \alpha + 1 \tag{A2.1} \]
while the function $x \to \int_{x_0}^{x} y^\alpha L(y) \, dy$, $x > x_0$, is regularly varying of index
$\alpha + 1$.
Conversely, if $L : [r, \infty) \to \mathbb{R}_+$ is measurable and bounded on finite
intervals in $[x_0, \infty)$ for some $x_0 \ge r$ and (A2.1) holds for some $\alpha > -1$, then
$L$ is a slowly varying function while the function $x \to \int_{x_0}^{x} y^\alpha L(y) \, dy$, $x > x_0$,
is regularly varying of index $\alpha + 1$. The last assertion also holds for $\alpha = -1$.
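A simple numerical illustration: L(x) = log x is slowly varying on [1, ∞), and for α = 0 and x_0 = 1 the integral in (A2.1) has the closed form x log x − x + 1. The sketch below, with hypothetical function names, evaluates the slow-variation ratio and the ratio in (A2.1) at a few large arguments; convergence is only logarithmic, so the values approach 1 slowly.

```python
import math

def slow_variation_ratio(t, x):
    """L(tx)/L(x) for L = log; tends to 1 as x grows."""
    return math.log(t * x) / math.log(x)

def a21_ratio(x):
    """x L(x) / int_1^x L(y) dy for L = log and alpha = 0; tends to alpha + 1 = 1."""
    integral = x * math.log(x) - x + 1.0     # closed form of int_1^x log(y) dy
    return x * math.log(x) / integral

ratios = [a21_ratio(10.0 ** p) for p in (3, 6, 12)]
```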
Theorem A2.4 Let $r \in \mathbb{R}_+$. If $L : [r, \infty) \to \mathbb{R}_+$ is a slowly varying
function, then
\[ \lim_{x\to\infty} \int_{x}^{\infty} y^\alpha L(y) \, dy < \infty \tag{A2.2} \]
for any $\alpha < -1$. If $\lim_{x\to\infty} \int_{x}^{\infty} y^{-1} L(y) \, dy < \infty$ then for any $\alpha \le -1$ we
have
\[ \lim_{x\to\infty} \frac{x^{\alpha+1} L(x)}{\int_{x}^{\infty} y^\alpha L(y) \, dy} = -(\alpha + 1) \tag{A2.3} \]
while the function $x \to \int_{x}^{\infty} y^\alpha L(y) \, dy$, for $x$ large enough, is regularly
varying of index $\alpha + 1$.
Conversely, if $L : [r, \infty) \to \mathbb{R}_+$ is measurable, satisfies (A2.2), and
(A2.3) holds for some $\alpha < -1$, then $L$ is a slowly varying function while
the function $x \to \int_{x}^{\infty} y^\alpha L(y) \, dy$, for $x$ large enough, is regularly varying of
index $\alpha + 1$.

A2.2
An important class of pairs of regularly varying functions is defined as fol-
lows. Let ξ be a non-degenerate real-valued random variable on a probability
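space (Ω, K, P), and define real-valued functions F and F̃ on [0, ∞) by

\[ F(x) = E\bigl(\xi^2 I_{(|\xi| \le x)}\bigr), \qquad \widetilde F(x) = P(|\xi| > x), \qquad x \in \mathbb{R}_+ . \]

Clearly, F is non-decreasing and F̃ non-increasing. It is easy to check that

\[ F(x) = -\int_0^x u^2\,d\widetilde F(u), \qquad \widetilde F(x) = \int_x^{\infty} u^{-2}\,dF(u), \qquad x \in \mathbb{R}_+, \]

whence by integrating by parts we obtain

\[ F(x) + x^2 \widetilde F(x) = 2 \int_0^x u\,\widetilde F(u)\,du, \tag{A2.4} \]

\[ x^2 \widetilde F(x) + F(x) = 2x^2 \int_x^{\infty} u^{-3} F(u)\,du, \qquad x \in \mathbb{R}_+ . \tag{A2.5} \]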

Theorem A2.5 If either F or F̃ varies regularly, then the limit

\[ \lim_{x\to\infty} \frac{x^2 \widetilde F(x)}{F(x)} = c \tag{A2.6} \]

exists and 0 ≤ c ≤ ∞. Conversely, if (A2.6) holds with 0 < c < ∞, then

\[ F(x) \sim x^{2 - \frac{2}{1+c}} L(x), \qquad \widetilde F(x) \sim c\,x^{-\frac{2}{1+c}} L(x) \]

as x → ∞, where L is a slowly varying function. Finally, (A2.6) holds with
c = 0 if and only if F is slowly varying while (A2.6) holds with c = ∞ if
and only if F̃ is slowly varying.
The proof follows immediately from equations (A2.4) and (A2.5) by
using Theorems A2.3 and A2.4. □

A2.3
Let f : [1, ∞) → R++ be a measurable function which is bounded on finite
intervals and such that limx→∞ f (x) = ∞. For any y ∈ [f (1), ∞) define

f0 (y) = inf{x ≥ 1 : f (x) ≥ y}, f1 (y) = inf{x ≥ 1 : f (x) > y},

f2 (y) = sup{x ≥ 1 : f (x) ≤ y}.


Clearly, the functions fi : [f (1), ∞) → [1, ∞), i = 0, 1, 2, are well defined,
any of them is non-decreasing, 1 ≤ f0 ≤ f1 ≤ f2 , and limy→∞ fi (y) =
∞, i = 0, 1, 2. We say that f ∈ F if and only if

\[ \lim_{y\to\infty} \frac{f_1(y)}{f_2(y)} = 1. \]

Lemma A2.6 [Samur (1989, Lemma 2.11)] (i) If f : [1, ∞) → R++ is


non-decreasing and limx→∞ f (x) = ∞, then f ∈ F.
(ii) If f : [1, ∞) → R++ is bounded on finite intervals and regularly
varying of index α > 0, then f ∈ F. Moreover,

\[ \lim_{y\to\infty} \frac{f_0(y)}{f_2(y)} = 1 \]

and fi is regularly varying of index 1/α, i = 0, 1, 2.


(iii) If f ∈ F and f1 is regularly varying of index 1/α for some α > 0,
then f is regularly varying of index α.
Corollary A2.7 Let f ∈ F, and define a real-valued function F on R+
by
\[ F(x) = (\log 2)^{-1} \sum_{\{k \in \mathbb{N}_+ :\ |f(k)| \le x\}} f^2(k)\,k^{-2}, \qquad x \in \mathbb{R}_+ . \]

(i) F is slowly varying if and only if

\[ \lim_{x\to\infty} \frac{f^2(x)}{x \sum_{\{k \in \mathbb{N}_+ :\ k \le x\}} f^2(k)\,k^{-2}} = 0. \tag{A2.7} \]

(ii) If f ∈ F is regularly varying of index 1/2, then (A2.7) holds, that
is, F is slowly varying.
Appendix 3: Limit theorems
for mixing random variables

A3.1
Let (Ω, K, P ) be a probability space. For any two σ-algebras K1 and K2
included in the σ-algebra K define the dependence coefficients

α(K1 , K2 ) = sup (|P (A1 ∩ A2 ) − P (A1 )P (A2 )| : Ai ∈ Ki , i = 1, 2) ,

ϕ(K1 , K2 ) = sup (|P (A2 |A1 ) − P (A2 )| : Ai ∈ Ki , i = 1, 2, P (A1 ) > 0) ,

\[ \psi(K_1, K_2) = \sup\left( \left| \frac{P(A_2 \mid A_1)}{P(A_2)} - 1 \right| : A_i \in K_i,\ P(A_i) > 0,\ i = 1, 2 \right). \]

Clearly,
α(K1 , K2 ) ≤ ϕ(K1 , K2 ) ≤ ψ(K1 , K2 )
and

0 ≤ α(K1 , K2 ), ϕ(K1 , K2 ) ≤ 1, 0 ≤ ψ(K1 , K2 ) ≤ ∞.
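For finite σ-algebras the three coefficients can be computed by brute force. The sketch below is only an illustration under stated assumptions: K1 and K2 are taken to be generated by two discrete random variables with a given joint probability mass function `p[i][j]`, and the names `events` and `dependence_coefficients` are hypothetical helpers introduced here. Exact rational arithmetic is used so that the chain α ≤ ϕ ≤ ψ can be checked without rounding.

```python
from fractions import Fraction
from itertools import chain, combinations

def events(n):
    """All subsets of {0, ..., n-1}: the sigma-algebra generated by n atoms."""
    idx = range(n)
    return chain.from_iterable(combinations(idx, r) for r in range(n + 1))

def dependence_coefficients(p):
    """Return (alpha, phi, psi) for sigma(X) and sigma(Y), given the joint pmf p[i][j]."""
    rows, cols = len(p), len(p[0])
    alpha = phi = psi = Fraction(0)
    for a1 in events(rows):
        pa1 = sum(p[i][j] for i in a1 for j in range(cols))
        for a2 in events(cols):
            pa2 = sum(p[i][j] for i in range(rows) for j in a2)
            joint = sum(p[i][j] for i in a1 for j in a2)
            alpha = max(alpha, abs(joint - pa1 * pa2))
            if pa1 > 0:
                cond = joint / pa1          # P(A2 | A1)
                phi = max(phi, abs(cond - pa2))
                if pa2 > 0:
                    psi = max(psi, abs(cond / pa2 - 1))
    return alpha, phi, psi

# A symmetric, positively dependent pair of binary variables.
p = [[Fraction(2, 5), Fraction(1, 10)],
     [Fraction(1, 10), Fraction(2, 5)]]
print(dependence_coefficients(p))   # alpha = 3/20, phi = 3/10, psi = 3/5
```

For an independent pair all three coefficients vanish, which gives a quick sanity check of the definitions.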

Let (X, X ) be a measurable space and consider an array

X = {Xnj , 1 ≤ j ≤ jn , jn ∈ N+ , n ∈ N+ } (A3.1)

of X-valued r.v.s defined on (Ω, K, P ). [An infinite sequence (Xn )n∈N+ of


X-valued r.v.s can be seen as the (triangular) array {Xnj ≡ Xj , 1 ≤ j ≤ n,
n ∈ N+ } .] For such an array define the dependence coefficients

\[ \delta(k) = \sup_{n \in \mathbb{N}_+^{(k)}}\ \max_{1 \le h \le j_n - k} \delta\bigl(\sigma(X_{nj}, 1 \le j \le h),\ \sigma(X_{nj}, h + k \le j \le j_n)\bigr), \]

where $\mathbb{N}_+^{(k)} = \{n \in \mathbb{N}_+ : j_n > k\}$, k ∈ N+, and δ stands for either α, ϕ or
ψ. Clearly, in the case of an infinite sequence (Xn )n∈N+ we can write

\[ \delta(k) = \sup_{h, \ell \in \mathbb{N}_+} \delta\bigl(\sigma(X_j, 1 \le j \le h),\ \sigma(X_j, h + k \le j \le h + k + \ell)\bigr). \]

It is obvious that the sequence (δ(k))k∈N+ is non-increasing. An array (resp.


sequence) of r.v.s is said to be δ-mixing if and only if limk→∞ δ(k) = 0. It
can be shown [Bradley (1986, p. 184)] that ϕ(1) < 1 whenever ψ(1) < ∞.
A finite collection (Xi )1≤i≤n , n ≥ 2, of X-valued r.v.s is said to be
strictly stationary if and only if the probability distribution of

(Xk+1 , · · · , Xk+h ), 0 ≤ k ≤ n − h,

does not depend on k whatever 1 ≤ h < n. A sequence (Xn )n∈N+ of X-


valued r.v.s is said to be strictly stationary if and only if the probability
distribution of (Xk+1 , · · · , Xk+h ) does not depend on k ∈ N whatever h ∈
N+ . An array of X-valued r.v.s is said to be strictly stationary if and only
if any row of it is strictly stationary.
Proposition A3.1 Let (A3.1) be a ψ-mixing array of X-valued r.v.s.
Let ξ and η be real-valued random variables which are σ(Xnj , 1 ≤ j ≤ h)-
and σ (Xnj , h + k ≤ j ≤ jn )-measurable, respectively, for some h, k, n ∈ N+ .
Assume that E |ξ| , E |η| < ∞ and ψ(k) < ∞. Then Cov (ξ, η) exists and

|Cov (ξ, η)| ≤ ψ(k)E |ξ| E |η| .

In particular, if Eξ 2 < ∞ and Eη 2 < ∞ then

|Cov (ξ, η)| ≤ ψ(k) Var1/2 ξ Var1/2 η.

Corollary A3.2 Let (A3.1) be a ψ-mixing strictly stationary array of
real-valued r.v.s with ψ(1) < ∞. Assume that $E X_{n1}^2 < \infty$ for some n ∈ N+.
Then
\[ \operatorname{Var}\Bigl( \sum_{j=1}^{k} X_{nj} \Bigr) < k \Bigl( 1 + 2 \sum_{j=1}^{j_n} \psi(j) \Bigr) \operatorname{Var} X_{n1}, \qquad 1 \le k \le j_n . \]

Corollary A3.3 Let (Xn)n∈N+ be a ψ-mixing strictly stationary
sequence of X-valued r.v.s. Assume that $\sum_{n \in \mathbb{N}_+} \psi(n) < \infty$. Let f be a
real-valued r.v. on (X, X ), and assume that Ef²(X1) < ∞. Then the
series
\[ \sigma^2 = E f^2(X_1) - E^2 f(X_1) + 2 \sum_{n \in \mathbb{N}_+} E\bigl(f(X_1) - E f(X_1)\bigr)\bigl(f(X_{n+1}) - E f(X_1)\bigr) \]
is absolutely convergent and σ ≥ 0. We have

\[ \operatorname{Var}\Bigl( \sum_{j=1}^{n} f(X_j) \Bigr) = n(\sigma^2 + o(1)) \]

as n → ∞.
The above results are already folklore. See, e.g., Doukhan (1994, Ch. 1).
Proposition A3.4 [Gordin (1971, Remark 3)] In addition to the hy-
potheses of Corollary A3.3 assume that ψ(1) < 1. Then σ = 0 if and only
if f = const.

A3.2
For an array (A3.1) of real-valued r.v.s on (Ω, K, P ) set
\[ S_{nk} = \sum_{j=1}^{k} X_{nj}, \quad 1 \le k \le j_n, \qquad S_{n j_n} = S_n, \quad n \in \mathbb{N}_+ . \]

Then such an array is said to be strongly infinitesimal (s.i. for short) if and
only if it is strictly stationary and for any sequence (kn )n∈N+ of natural
integers such that kn ≤ jn , n ∈ N+ , and limn→∞ kn /jn = 0 the sum Snkn
converges in P -probability to 0 as n → ∞.
All results given below were proved by J.D. Samur, as indicated at appro-
priate places, in the more general case of Banach valued random variables.
Proposition A3.5 If (A3.1) is a ϕ-mixing s.i. array of real-valued r.v.s,
then
\[ \lim_{n\to\infty} \max_{1 \le k \le k_n} d_P\bigl(P S_{nk}^{-1}, \delta_0\bigr) = 0 \]
for any sequence (kn)n∈N+ of natural integers such that kn ≤ jn, n ∈ N+,
and limn→∞ kn/jn = 0.
This is a consequence of a more general result [Samur (1984, Theorem
3.3)].
Proposition A3.6 [Samur (1987, § 3.4.3.2)] Let (A3.1) be a ϕ-mixing
strictly stationary array of real-valued r.v.s such that $P S_n^{-1}$ converges weakly
to some probability measure on B. Then the array (A3.1) is s.i. if and only
if Xn1 converges in P -probability to 0 as n → ∞, and for any ε > 0 there
exists 0 < a = a(ε) < 1 such that

\[ \limsup_{n\to\infty} \max_{1 \le k \le a j_n} P(|S_{nk}| > \varepsilon) < 1. \]

A3.3
Let ν be an infinitely divisible probability on B. We denote by Qν the
distribution (on BD ) of a stochastic process ξν = (ξν (t))t∈I with stationary
independent increments, ξν (0) = 0 a.s., trajectories in D, and ξν (1) having
probability distribution ν. When ν is Gaussian the process ξν can be taken
with trajectories in C. In this case the distribution of ξν is concentrated on
$B_C$, and we shall denote it by $Q'_\nu$.
Given an array (A3.1) of real-valued r.v.s, for any n ∈ N+ define the
stochastic processes $\xi_n^D = (\xi_n^D(t))_{t \in I}$ and $\xi_n^C = (\xi_n^C(t))_{t \in I}$ by

\[ \xi_n^D(t) = S_{n \lfloor j_n t \rfloor}, \]

\[ \xi_n^C(t) = S_{n \lfloor j_n t \rfloor} + (j_n t - \lfloor j_n t \rfloor)\bigl(S_{n(\lfloor j_n t \rfloor + 1)} - S_{n \lfloor j_n t \rfloor}\bigr), \qquad t \in I, \]

with the convention Sn0 = 0, n ∈ N+. Clearly, for any n ∈ N+ the
trajectories of $\xi_n^D$ and $\xi_n^C$ are in D and C, respectively.
Theorem A3.7 [Samur (1987, Theorem 3.2 and Corollary 3.3)] Let
(A3.1) be a ϕ-mixing strictly stationary array of real-valued r.v.s such that
ψ(1) < ∞. Let ν be a probability measure on B. Then the following
statements are equivalent:
I. $P S_n^{-1} \xrightarrow{w} \nu$ and the array (A3.1) is s.i.
II. ν is infinitely divisible and $P(\xi_n^D)^{-1} \xrightarrow{w} Q_\nu$ in $B_D$.
Remark. If the assumption ψ(1) < ∞ does not hold, then Theorem A3.7
still holds with statement I replaced by
I′. $P S_n^{-1} \xrightarrow{w} \nu$, the array (A3.1) is s.i., and

\[ \sup_{n \in \mathbb{N}_+} j_n P(|X_{n1}| > \varepsilon) < \infty, \qquad \lim_{n\to\infty} j_n P(|X_{n1}| > \varepsilon,\ |X_{nj}| > \varepsilon) = 0 \]

for any ε > 0 and any integer j ≥ 2. □

Theorem A3.8 [Samur (1987, Corollary 3.5 and § 3.6.4)] Let (A3.1) be
a ϕ-mixing strictly stationary array of real-valued r.v.s. Let ν be a probability
measure on B. Then the following statements are equivalent:
I. $P S_n^{-1} \xrightarrow{w} \nu$, the array (A3.1) is s.i., and limn→∞ jn P (|Xn1| > ε) = 0
for any ε > 0.
II. ν is Gaussian and $P(\xi_n^D)^{-1} \xrightarrow{w} Q_\nu$ in $B_D$.
III. ν is Gaussian and $P(\xi_n^C)^{-1} \xrightarrow{w} Q'_\nu$ in $B_C$.
IV. ν is Gaussian, and on a common probability space (Ω′, K′, P′) there
exist an array
\[ X' = \{X'_{nj},\ 1 \le j \le j_n,\ j_n \in \mathbb{N}_+,\ n \in \mathbb{N}_+\} \]
of real-valued r.v.s and a stochastic process ζ = (ζ(t))t∈I with
trajectories in C which satisfy

\[ P'(X'_{n1}, \cdots, X'_{n j_n})^{-1} = P(X_{n1}, \cdots, X_{n j_n})^{-1}, \qquad n \in \mathbb{N}_+, \]

\[ P' \zeta^{-1} = Q'_\nu, \]

\[ \max_{1 \le k \le j_n} \Bigl| \sum_{j=1}^{k} X'_{nj} - \zeta\Bigl(\frac{k}{j_n}\Bigr) \Bigr| \to 0 \quad P'\text{-a.s. as } n \to \infty. \]

Remark. If ϕ(1) < 1 and ν is Gaussian, then statement I above can be
replaced by
I′. $P S_n^{-1} \xrightarrow{w} \nu$, and the array (A3.1) is s.i. □

Theorem A3.9 [Samur (1987, § 3.4.3.1)] Let (Xn)n∈N+ be a ϕ-mixing
strictly stationary sequence of real-valued r.v.s. Let (Bn)n∈N+ be a sequence
of positive numbers such that limn→∞ Bn = ∞, and let (An)n∈N+ be a
sequence of real numbers. Assume that
\[ P\Bigl( \frac{1}{B_n} \sum_{j=1}^{n} (X_j - A_n) \Bigr)^{-1} \xrightarrow{w} \nu, \]
where ν is a non-degenerate probability measure on B. Then ν is stable. Let
α ∈ (0, 2] be the order of ν and write

\[ X_{nj} = \frac{1}{B_n}(X_j - A_n), \qquad 1 \le j \le n,\ n \in \mathbb{N}_+ . \]

The array X = {Xnj, 1 ≤ j ≤ n, n ∈ N+} is s.i. if and only if:
(i) $B_n = n^{1/\alpha} L(n)$, n ∈ N+, for some slowly varying function L : R+ →
R++ integrable over finite intervals, and
(ii) for any sequence (rn)n∈N+ of natural integers such that rn ≤ n and
limn→∞ rn/n = 0 we have

\[ \lim_{n\to\infty} \frac{r_n (A_{r_n} - A_n)}{B_n} = 0. \]

Theorem A3.10 [Samur (1984, Theorem 5.6)] Let (A3.1) be a ϕ-mixing
strictly stationary array of real-valued r.v.s such that ψ(1) < ∞. Assume
there exist positive measures µn on B, n ∈ N+, such that µn(R) ≤ 1 and
µn([−t, t]) = 0, n ∈ N+, for some t ∈ R++. If $P X_{n1}^{-1} = (1 - \mu_n(\mathbb{R}))\delta_0 + \mu_n$
and jn µn converges weakly to a finite measure µ on B, then $P S_n^{-1} \xrightarrow{w} \operatorname{Pois} \mu$.
Theorem A3.11 [Samur (1984, Theorems 4.1 and 4.2)] Let (A3.1) be a
ϕ-mixing strictly stationary s.i. array of real-valued r.v.s such that ϕ(1) < 1.
Assume that $P S_n^{-1}$ converges weakly to a probability measure ν on B. Then
ν is Gaussian if and only if

\[ \lim_{n\to\infty} j_n P(|X_{n1}| > \varepsilon) = 0 \]

for any ε > 0. If ν = N (m, σ²) then for any ε > 0 we have

(i) $\lim_{n\to\infty} E\Bigl( \sum_{j=1}^{j_n} \bigl( X_{nj} I_{(|X_{nj}| \le \varepsilon)} - E X_{nj} I_{(|X_{nj}| \le \varepsilon)} \bigr) \Bigr)^2 = \sigma^2$
and
(ii) $\lim_{n\to\infty} E \sum_{j=1}^{j_n} X_{nj} I_{(|X_{nj}| \le \varepsilon)} = m$.

For any real-valued r.v. η put

\[ m^2(\eta) = \begin{cases} E^2 \eta / E \eta^2 & \text{if } 0 < E\eta^2 < \infty, \\ 0 & \text{if } E\eta^2 = \infty. \end{cases} \]

It can be proved that if Eη² = ∞ then

\[ \lim_{x\to\infty} \frac{E^2 \eta I_{(|\eta| \le x)}}{E \eta^2 I_{(|\eta| \le x)}} = 0. \tag{A3.2} \]

See, e.g., Araujo and Giné (1980, p. 80).


Theorem A3.12 [Samur (1985), Corollary 3.4] Let (Xn)n∈N+ be a
ϕ-mixing strictly stationary sequence of real-valued r.v.s for which
\[ \sum_{n \in \mathbb{N}_+} \varphi^{1/2}(n) < \infty. \]
Assume that
\[ 0 < E X_1^2 \le \infty, \qquad \lim_{x\to\infty} \frac{x^2 P(|X_1| > x)}{E X_1^2 I_{(|X_1| \le x)}} = 0, \]
and the limits
\[ \varphi_n^{(0)} := \lim_{x\to\infty} \frac{E X_1 X_n I_{(|X_1| \le x,\ |X_n| \le x)}}{E X_1^2 I_{(|X_1| \le x)}}, \qquad n \in \mathbb{N}_+, \]
exist and are all finite. Put $S_n = \sum_{i=1}^{n} X_i$, n ∈ N+, S0 = 0. Then the
following assertions hold:
(i) E |X1| < ∞.
(ii) The series
\[ \sigma_{(0)}^2 = \varphi_1^{(0)} - m^2(X_1) + 2 \sum_{n \ge 2} \bigl(\varphi_n^{(0)} - m^2(X_1)\bigr) \]
converges absolutely and its sum is non-negative.
(iii) If $\sigma_{(0)} \ne 0$ then for any sequence (Bn)n∈N+ of positive numbers with
limn→∞ Bn = ∞ satisfying

\[ \lim_{n\to\infty} n B_n^{-2} E X_1^2 I_{(|X_1| \le B_n)} = 1 \]

we have $P \bar\xi_n^{-1} \xrightarrow{w} W_D$ in $B_D$, where

\[ \bar\xi_n(t) = \frac{S_{\lfloor nt \rfloor} - \lfloor nt \rfloor E X_1}{\sigma_{(0)} B_n}, \qquad n \in \mathbb{N}_+,\ t \in I. \]

When $a^2 = E X_1^2 < \infty$ we can take $B_n = |a| n^{1/2}$, n ∈ N+.

Notes and Comments

1.1
As we have noted, the basic reference for classical non-metric results on
different types of continued fraction expansions is Perron (1954, 1957).
There exist several metrical results about Euclid’s algorithm. Let b, n ∈
N+ with 1 ≤ b < n. Then b/n = [a1 , · · · , aτ (b,n) ] with aτ (b,n) ≥ 2, and
τ (b, n) ∈ N+ is the number of division steps occurring when b and n are
input to the algorithm. Since Euclid’s algorithm applied to b and n behaves
essentially the same as when applied to b/g.c.d.(b, n) and n/g.c.d.(b, n),
it is convenient to consider the average number τn of division steps when b
is relatively prime to n and chosen at random, that is, probability 1/ϕ(n)
is given to any integer in the range [1, n] which is prime to n. Here ϕ is
Euler’s ϕ-function defined by
\[ \varphi(n) = n \prod_{p \mid n} \left(1 - \frac{1}{p}\right), \qquad n \ge 2, \]

and ϕ(1) = 1, where the product is taken over all prime numbers p which
divide n. Clearly,
\[ \tau_n = \frac{1}{\varphi(n)} \sum_{\substack{k=1 \\ \mathrm{g.c.d.}(k,n)=1}}^{n} \tau(k, n). \]

Porter (1975) and Knuth (1976) showed that


\[ \tau_n = \frac{12 \log 2}{\pi^2} \log n + c + O(n^{-1/6+\varepsilon}) \]
as n → ∞ for any ε > 0, with
\[ c = \frac{6 \log 2}{\pi^2} \left( 3 \log 2 + 4C - 24\pi^{-2}\zeta'(2) - 2 \right) - \frac{1}{2} = 1.467078\ldots . \]
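The average τn can be computed directly for moderate n and set beside the two leading terms of the Porter–Knuth expansion. The following is a minimal sketch, not taken from the cited papers: the function names `division_steps` and `average_steps` and the sample moduli are assumptions chosen here for illustration, and the constant c is simply the numerical value quoted above.

```python
import math

def division_steps(b: int, n: int) -> int:
    """Number of division steps of Euclid's algorithm on (b, n), 1 <= b < n."""
    steps = 0
    while b:
        n, b = b, n % b
        steps += 1
    return steps

def average_steps(n: int) -> float:
    """tau_n: mean of tau(k, n) over 1 <= k <= n with g.c.d.(k, n) = 1."""
    ks = [k for k in range(1, n + 1) if math.gcd(k, n) == 1]
    return sum(division_steps(k, n) for k in ks) / len(ks)

LEADING = 12 * math.log(2) / math.pi**2   # leading coefficient, about 0.84276
PORTER_C = 1.467078                        # value of c quoted in the text

for n in (101, 1009, 10007):
    # Exact average next to the two-term approximation.
    print(n, average_steps(n), LEADING * math.log(n) + PORTER_C)
```

For instance, 5/13 = [2, 1, 1, 2], so `division_steps(5, 13)` returns 4, the number of partial quotients.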


The leading coefficient (12 log 2)/π² = 0.84276... was independently derived
by Dixon (1970, 1971) and Heilbronn (1969). A very interesting discussion of
this topic can be found in Knuth (1981, Section 4.5.3). See also Lochs (1961),
Szüsz (1980), and Tonkov (1974). For recent generalizations of Dixon’s and
Heilbronn’s results, see Hensley (1994). The largest quotient

\[ \max_{1 \le k \le \tau(b,n)} a_k \]

occurring in Euclid’s algorithm when b and n are input to the algorithm,


has been studied by Hensley (1991).
The continued fraction transformation τ underlies a chaotic discrete dy-
namical system which exhibits in an accessible manner all the common fea-
tures of such systems. See, e.g., Corless (1992).

1.2
Whole sections or chapters on the metrical theory of continued fractions
can be found in the books by Billingsley (1965), Ibragimov and Linnik
(1971), Iosifescu and Grigorescu (1990), Kac (1959), Khin(t)chin(e) (1956,
1963, 1964), Knuth (1981), Koksma (1936), Lévy (1954), Rockett and Szüsz
(1992), Sinai (1994), Urban (1923).

1.3
The natural extension τ̄ of τ has been introduced in a more general context
by Nakada (1981) in order to derive ergodic properties of associated random
variables. See Sections 4.0 and 4.1.
The extended incomplete quotients have been first introduced by Faivre
(1996) and, in general, the extended random variables by Iosifescu (1997),
who proved Theorem 1.3.5 which motivates the consideration of the condi-
tional probability measures γa , a ∈ I. Proposition 1.3.8 and Corollary 1.3.9
can also be found in the latter reference.
Subsections 1.3.5 and 1.3.6 rely on the work of Iosifescu (1989, 2000
b). It is worth mentioning that to our knowledge it is the first time that
mixing coefficients have been computed exactly. A first estimate, ψ(n) ≤
(0.8)^n, n ∈ N+, of the ψ-mixing coefficients is due to Philipp (1988). As
to other types of mixing, it seems possible to prove a kind of α-mixing for
$(\bar r_\ell)_{\ell \in \mathbb{Z}}$ using the Markovian structure of $(\bar s_\ell)_{\ell \in \mathbb{Z}}$ and the reversibility of
$(\bar a_\ell)_{\ell \in \mathbb{Z}}$.

It is the appropriate place to mention that the sequence (an )n∈N+ enjoys
another mixing property known as the almost Markov property, a concept
introduced by the Lithuanian school—see especially the references to the
papers by V.A. Statulevic̆ius and B. Riauba in Heinrich (1987) and Mis-
evic̆ius (1971). See also Saulis and Statulevic̆ius (1991). Let µ ∈ pr(BI ) and
for k, n ∈ N+ define the random variable

αk,n (µ) = sup |µ (B|σ(a1 , · · · , ak+n−1 )) − µ (B|σ(ak+1 , · · · , ak+n−1 ))| ,

where the supremum is taken over all B ∈ σ(ak+n , ak+n+1 , · · · ). Put

\[ \chi_\mu(n) = \sup_{k \in \mathbb{N}_+} \operatorname{ess\,sup} \alpha_{k,n}(\mu). \]

Then, as shown in Heinrich (op. cit.) (for a slightly weaker form of this result
see Misevic̆ius (1981)), assuming that µ ≪ λ and that f = dµ/dλ ∈ L(I)
and is bounded away from 0, we have

\[ \chi_\mu(n) \le 2^{-n+1} \Bigl( 24 + s(f) / \inf_{x \in I} f(x) \Bigr), \qquad n \in \mathbb{N}_+ . \]

Finally, note that it has not been usual to prove F. Bernstein’s theo-
rem (Proposition 1.3.16) as an application of ψ-mixing of the sequence of
incomplete quotients.

2.1
Theorem 2.1.6 and Proposition 2.1.7 are in fact corollaries of the ergodic
theorem of Ionescu Tulcea and Marinescu (1950) [see also Hennion (1993)],
which is a deep generalization of an ergodic theorem of Doeblin and Fortet
(1937). Cf. Iosifescu (1993b). As noted by Iosifescu (1993a), it is hard to
understand how Doeblin (1940) missed a geometric rate solution to Gauss’
problem, which could have been obtained by using the latter theorem.
Subsection 2.1.3 relies on the work of Iosifescu (1992, 1993, 1994). In
particular, Propositions 2.1.11 and 2.1.12 have allowed for the simplest so-
lution known to date to Gauss’ problem, which is included in the first two
references just quoted. Proposition 2.1.11 has been also proved by Szüsz
(1961) for f ∈ C¹(I).
In connection with Proposition 2.1.17 we note that in the case of a
singular µ ∈ pr(BI ) the solution to the corresponding Gauss’ problem has not
yet been systematically studied. See Remark 2 following Corollary 4.1.10
for a case where the limit clearly differs from Gauss’ measure.

2.2
Subsections 2.2.1 and 2.2.2 contain a very detailed presentation of E. Wirsing’s
celebrated 1974 paper. This also includes the effective computation
of numerical constants occurring there.
Subsection 2.2.3 relies on the work of Iosifescu (2000 a, c). That Theo-
rem 2.2.6 holds for f ∈ L(I), that is, that Theorem 2.2.8 holds, had been
announced in Iosifescu (1992) and subsequently used by Faivre (1998a). We
stress again the importance of a study of the set E defined in Remark 1
following Theorem 2.2.6. (See also Remark 2 following Theorem 2.2.11.)

2.3
This section contains a detailed presentation of K.I. Babenko’s work on
Gauss’ problem, with some improvements and generalizations. Information
about the life and work of K.I. Babenko (1919–1987) can be found in Russian
Math. Surveys 35 (1980), no. 2, 265–275, and 43 (1988), no. 2, 138–151.
Proposition 2.3.2 and its proof are due to Mayer and Roepstorff (1987).
For a = 0, that is, under Lebesgue measure λ = γ0 the exact Gauss–Kuzmin–
Lévy Theorem 2.3.5 has been proved by Babenko (1978). The general case
a ∈ I has been announced by Iosifescu (2000 b). Note that equation (2.3.14)
is equivalent to equation (3.6) in Hensley (1992).
We stress the fact that for some a ∈ I the exact convergence rate in
Gauss’ problem under γa is faster than Wirsing’s optimal rate $O(\lambda_0^n)$ as
n → ∞. See the Remark after the proof of Corollary 2.3.6.
It should be noted that by Proposition 2.1.17 for any $i^{(k)} \in \mathbb{N}_+^k$ the
limit of $\mu[(a_{n+1}, \ldots, a_{n+k}) = i^{(k)}]$ as n → ∞ exists and is equal to $\gamma(I(i^{(k)}))$
whatever µ ∈ pr(BI ) such that µ ≪ λ. Corollary 2.3.6 shows that in the
case where µ = γa , a ∈ I, a good convergence rate also holds.
A note of historical nature is in order concerning the equation
\[ \lim_{n\to\infty} \lambda(a_n = k) = \frac{1}{\log 2} \log\left( 1 + \frac{1}{k(k+2)} \right), \qquad k \in \mathbb{N}_+, \]

which is a weaker form of a result given in Corollary 2.3.6. This formula was
first obtained as early as 1900. Two papers of the Swedish astronomer Hugo
Gyldén, whose understanding of the approximate computation of planetary
motions led him in 1888 to study the asymptotics of λ(an = k), k ∈ N+ , as
n → ∞, were taken up for revision by his fellow-countrymen Torsten Brodén
and Anders Wiman, both mathematicians associated with Lund University.
Wiman (1900) finally got the correct result after Sisyphean computations.
Two subsequent papers, both published in 1901, of Brodén and Wiman were
then considered by Émile Borel as the first ones to notice the applicability
of measure theory in probability. The reader will find precise references
and all the necessary details in von Plato (1994, Ch. 2). This book is a
fascinating account of the emergence of measure-theoretic probability in the
first third of the 20th century (until the publication of A.N. Kolmogorov’s
Grundbegriffe der Wahrscheinlichkeitsrechnung in 1933). It is convincingly
argued there that the theory of the continued fraction expansion should
be counted among the fields that brought infinitary events and the idea of
measure 0 into probability.

2.5
This section relies on the work of Iosifescu (1994, 1997, 1999). For a = 0,
that is, under Lebesgue measure λ = γ0 , the optimal convergence rate $O(g^{2n})$
in Theorem 2.5.5 (without explicit lower and upper bounds) was first
shown by Dürner (1992) using a different approach. Also for a = 0, Theorem
2.2.8 with just an upper bound $O(g^n)$ [instead of the optimal one $O(g^{2n})$]
was proved by a different method by Dajani and Kraaikamp (1994).
The proof given here emphasizes the importance of the generalized Brodén–
Borel–Lévy formula (1.3.21).
It is hard to understand why A. Denjoy’s 1936 Comptes Rendus Notes
went unnoticed so many years. The method of proving and generalizing
Denjoy’s results here, is quite different from that suggested by him.

3.0
The idea underlying Lemma 3.0.1 goes back to Philipp (1970). Lemma 3.0.2
is a special case of a result of Samur (1989, Lemma 2.3).

3.1
Except for Theorem 3.1.6, the results in Subsections 3.1.1 and 3.1.2 have
been proved by Samur (1989). The classical Poisson law [Theorem 3.1.2
(iii)] under any µ ≪ λ has been first given a complete proof by Iosifescu
(1977), who filled a gap in an incomplete proof by Doeblin (1940, p. 358).

3.2 & 3.3


Subsections 3.2.2 and 3.2.3 mainly rely on the work of Samur (1989, 1996),
who applied his earlier results for different mixing random variables to the
special case of random variables occurring in the metrical theory of contin-
ued fractions. The presentation here is more transparent due to the consis-
tent use of the extended random variables which only appear in an implicit
manner in Samur’s treatment.
For the first versions of most of the results in these sections credit should
be given to Doeblin (1940). An extensive analysis of Doeblin’s paper has
been made by Iosifescu (1990, 1993 a,b), where the reader can find a com-
prehensive evaluation of Doeblin’s important contributions to the metrical
theory of continued fractions as compared with subsequent work in the field.
It should be noted that Samur (1989) has also dealt with more general
partial sums Sn defined as follows. Let (fn )n∈N+ be a sequence of H-valued
functions on N+ , where H is a separable Hilbert space, and put
$S_n = \sum_{i=1}^{n} f_n(a_i)$, n ∈ N+ . He derived sufficient conditions for the laws of certain
random functions associated with the Sn , n ∈ N+ , to converge weakly (in
the Skorohod space of H-valued functions on I) to an infinitely divisible
probability measure on H.
Another generalization of the case considered in Theorem 3.2.4 is that
of partial sums
\[ S_n = \sum_{i=1}^{n} f_i(a_i), \]

where (fn )n∈N+ is a sequence of real-valued functions on N+ . A very special


case has been taken up by Doeblin (1940, p. 360), with fn (j) = 1 or 0
according as j ≥ cn or j < cn , n, j ∈ N+ . Here (cn )n∈N+ is a sequence
of positive numbers. In this case Sn is the number of occurrences of the
random events (ai ≥ ci ), 1 ≤ i ≤ n. By F. Bernstein’s theorem (see
Corollary 1.3.16), limn→∞ Sn < ∞ or = ∞ a.e. in I according as the series
$\sum_{n \in \mathbb{N}_+} 1/c_n$ converges or diverges. Doeblin gave valid hints for a proof that
if $\sum_{n \in \mathbb{N}_+} 1/c_n = \infty$ then (Sn )n∈N+ obeys the central limit theorem under
λ. More precisely, $(S_n - A_n)/\sqrt{A_n}$ is asymptotically N (0, 1) under λ as
n → ∞, with
\[ A_n = \frac{1}{\log 2} \sum_{i=1}^{n} \log\left( 1 + \frac{1}{c_i} \right), \qquad n \in \mathbb{N}_+ . \]

A complete proof with an estimate of the convergence rate under any µ ≪ λ
has been given by Philipp (1970). This result has been improved by Zuparov
(1981). The functional version of this central limit theorem was proved by
Philipp and Webb (1973).

3.4
We only mention here a result not covered by those given in this section.
It is about Doeblin’s sequence (Sn )n∈N+ just discussed. Doeblin (1940, p.
361) asserted the validity of the law of the iterated logarithm
\[ \lambda\left( \limsup_{n\to\infty} \frac{S_n - A_n}{\sqrt{2 A_n \log\log A_n}} = 1 \right) = 1. \]

A complete proof was again given by Philipp (1970). The functional version
of this law of the iterated logarithm might follow from a more general result
in Szüsz and Volkmann (1982, p. 458).

4.0
Most of the results stated for probability measures are still valid for finite
measures and even for σ-finite, infinite measures. See, e.g., Aaronson (1997).

4.1
Khin(t)chin(e) [1934/35, 1936; 1963 (or 1964), Ch. 3] proved the a.e.
convergence of arithmetic means $\sum_{i=1}^{n} f(a_i, \cdots, a_{i+k-1})/n$, n ∈ N+ , for some
fixed k ∈ N, under an unnecessarily strong assumption on the function
$f : \mathbb{N}_+^k \to \mathbb{R}$. His proofs are quite intricate since he made no use of the
Birkhoff–Khinchin (!) ergodic theorem which, as we have seen, provides
short and elegant proofs. (This should be certainly associated with the fact
that ergodic theory at the time was restricted to invertible transformations.
But even so a way out could have perhaps been found.) Unlike Khinchin,
Doeblin (1940, p. 366) did make use of the ergodic theorem. He proved that
the continued fraction transformation τ is ergodic under λ [a different proof
had been given earlier by Knopp (1926), see also Martin (1934)]. Since τ
is γ-preserving, this enabled him to derive (in an equivalent form) equa-
tion (4.1.1), thus to retrieve Khinchin’s results under weaker assumptions
in a straightforward manner. It is the appropriate place to note that, in
spite of the fact that, e.g., Billingsley (1965, p. 49) fully credits Doeblin
for the idea leading to (4.1.1), many authors assert that this idea is due to
Ryll–Nardzewski (1951). Actually, the only real advance made after 1940
in using ergodic theorems in the metric theory of RCF expansion originated
with Nakada (1981) who, as already mentioned, introduced the natural
extension τ̄ of τ , which makes it possible to derive equation (4.1.6). It is again
really surprising that Doeblin (1940, p. 365) asserts that his version of
Theorem 2.2.11 (see Remark 1 following that theorem) implies that
\[ \lim_{n\to\infty} \frac{1}{n} \operatorname{card}\{k : \Theta_k^{-1} < x,\ 1 \le k \le n\} = H(x), \qquad x \ge 1, \]
and that $n^{-1} \sum_{i=1}^{n} \Theta_i$ converges a.e. as n → ∞ to a constant (not indicated).
Now, Doeblin’s first assertion above is equivalent to the first case considered
in Corollary 4.1.22 while the second one is the first equation in Corollary
4.1.23 without the value of the limit. How did Doeblin guess these results
whose proofs involve the use of τ̄ ?
It should be noted that special cases of the Khinchin-Doeblin results
have been known before. For example, as already noted, Proposition 4.1.1
and its consequences were first proved (without convergence rates) by Lévy
(1929).
The application of the Gál–Koksma theorem to the RCF expansion,
yielding the convergence rates indicated, is due to de Vroedt (1962, 1964).
Let us finally mention that in Philipp (1967) a more general problem is
considered. Given an arbitrary sequence (In )n∈N+ of intervals contained in
I, it is shown there that for any ε > 0 the random variable

card{k : τ k ∈ Ik , 1 ≤ k ≤ n }, n ∈ N+ ,

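is equal to
\[ \sum_{k=1}^{n} \gamma(I_k) + O\left( \Bigl( \sum_{k=1}^{n} \gamma(I_k) \Bigr)^{1/2} \log^{3/2+\varepsilon} \Bigl( \sum_{k=1}^{n} \gamma(I_k) \Bigr) \right) \quad \text{a.e.} \]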

as n → ∞, where the constant implied in O depends on both ε and the


current point ω ∈ Ω.

Moeckel (1982), then Jager and Liardet (1988), using quite different
methods showed—amongst other things—that if we consider modulo 2 the
sequence (qn )n∈N+ of the denominators of the RCF convergents of any given
ω ∈ Ω, then the asymptotic relative frequencies of the digit blocks 01, 10,
and 11 all are a.e. equal to 1/3. [Note that the digit block 00 cannot occur
since |pn−1 qn − pn qn−1 | = 1, n ∈ N+ .] Jager and Liardet (op. cit.) showed
that results of this kind can be easily derived from the ergodicity of a certain
skew product. To define it we need some notation. For any integer m ≥ 2
let G(m) denote the finite group of 2 × 2 matrices with entries from Z/mZ
(the classes of remainders modulo m) and determinant equal to ±1, that is,
\[ G(m) = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} : a, b, c, d \in \mathbb{Z}/m\mathbb{Z},\ ad - bc = \pm 1 \right\}. \]

It is known that the cardinality of G(m) is given by the formula

\[ \operatorname{card} G(m) = \begin{cases} 2J(2) = 6 & \text{if } m = 2, \\ 2mJ(m) & \text{if } m \ge 3, \end{cases} \]

where J is Jordan’s arithmetical totient function defined by

\[ J(m) = m^2 \prod_{p \mid m} \left( 1 - \frac{1}{p^2} \right), \qquad m \ge 2. \]

Here the product is taken over all prime numbers p which divide m.
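The cardinality formula is easy to confirm by exhaustive enumeration for small m. The sketch below is an illustration only; the helper names (`prime_divisors`, `jordan_totient`, `card_G_bruteforce`, `card_G_formula`) are introduced here and do not come from the cited papers.

```python
from itertools import product

def prime_divisors(m: int) -> list:
    """Distinct primes dividing m, found by trial division."""
    primes, d = [], 2
    while d * d <= m:
        if m % d == 0:
            primes.append(d)
            while m % d == 0:
                m //= d
        d += 1
    if m > 1:
        primes.append(m)
    return primes

def jordan_totient(m: int) -> int:
    """J(m) = m^2 * prod over primes p | m of (1 - 1/p^2), in exact integers."""
    j = m * m
    for p in prime_divisors(m):
        j = j // (p * p) * (p * p - 1)
    return j

def card_G_bruteforce(m: int) -> int:
    """Count 2x2 matrices over Z/mZ whose determinant is +1 or -1 modulo m."""
    units = {1 % m, (-1) % m}
    return sum(1 for a, b, c, d in product(range(m), repeat=4)
               if (a * d - b * c) % m in units)

def card_G_formula(m: int) -> int:
    """Closed form: 2 J(2) = 6 for m = 2, and 2 m J(m) for m >= 3."""
    return 2 * jordan_totient(m) if m == 2 else 2 * m * jordan_totient(m)

for m in range(2, 8):
    print(m, card_G_bruteforce(m), card_G_formula(m))
```

The case m = 2 is special because +1 and −1 coincide modulo 2, so the two determinant classes collapse into one.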
Jager and Liardet’s skew product Tm : Ω × G(m) → Ω × G(m) is then
defined by
\[ T_m(\omega, A) = \left( \tau(\omega),\ A \begin{pmatrix} 0 & 1 \\ 1 & a_1(\omega) \bmod m \end{pmatrix} \right), \qquad (\omega, A) \in \Omega \times G(m). \]

These authors showed that Tm is γ ⊗ hm -preserving, where hm is the Haar


measure on G(m), that is, the uniform one assigning measure 1/card G(m)
to any element of G(m), and that (Tm , γ ⊗ hm ) is an ergodic endomorphism.
Hence they deduced, e.g., that given integers m ≥ 2, a, b ∈ N+ , with
g.c.d.(a, b, m) = 1, we have
\[ \lim_{n\to\infty} \frac{1}{n} \operatorname{card}\{k : p_k \equiv a,\ q_k \equiv b \bmod m,\ 1 \le k \le n\} = \frac{1}{J(m)} \quad \text{a.e.,} \]

a result also obtained by Moeckel (1982). Subsequently, Nolte (1990) gave


other interesting applications of Jager and Liardet’s endomorphism.
A natural extension T̄m of Tm was obtained and studied by Dajani and
Kraaikamp (1998). It appears that we can take T̄m : Ω² × G(m) → Ω² × G(m)
defined by
\[ \bar T_m((\omega, \theta), A) = \left( \bar\tau(\omega, \theta),\ A \begin{pmatrix} 0 & 1 \\ 1 & a_1(\omega) \bmod m \end{pmatrix} \right) \]
for (ω, θ, A) ∈ Ω² × G(m). Then T̄m is γ̄ ⊗ hm -preserving, and (T̄m , γ̄ ⊗ hm ) is
an ergodic automorphism. Hence Dajani and Kraaikamp (op. cit.) deduced,
e.g., that for any integers m ≥ 2, 0 ≤ a, b ≤ m − 1, with g.c.d.(a, b, m) = 1
and for any (t1 , t2 ) ∈ I² we have
\[ \lim_{n\to\infty} \frac{1}{n} \operatorname{card}\{k : \Theta_{k-1} < t_1,\ \Theta_k < t_2,\ p_k \equiv a,\ q_k \equiv b \bmod m,\ 1 \le k \le n\} = \frac{H(t_1, t_2)}{J(m)} \quad \text{a.e.,} \]
where the distribution function H has been defined in Corollary 4.1.20.
Their paper contains a host of other results. They also showed that these
results can be extended to S-expansions (cf. Sections 4.2 and 4.3). It is
interesting to note that the sequences of numerators and denominators of
the S-convergents have – mod m – the same asymptotic behaviour as that
just indicated for the sequences of numerators and denominators of the RCF
convergents.

It may seem difficult to compare, e.g., the decimal expansion with the
RCF expansion, since their dynamics are different. However, Lochs (1964)
obtained a then surprising result that had to serve as a prototype for further
results of the same kind. Let ω ∈ Ω and consider the rational number
xn = xn (ω) := ⌊10^n ω⌋/10^n , which yields the first n decimal digits of ω, and
yn = xn + 10^{−n} , n ∈ N+ . Clearly, for n large enough we have yn < 1. Next,
let ω = [a1 , a2 , · · · ], xn = [b1 , · · · , bk ], and yn = [c1 , · · · , cℓ ] be the RCF
expansions of ω, xn , and yn , respectively, and for n ∈ N+ large enough put
\[ m_n = m_n(\omega) = \max\{ i \le \max(k, \ell) : b_j = c_j,\ 1 \le j \le i \}. \]
In other words, mn (ω) is the largest integer such that the closed interval
[xn , yn ] is contained in the closure of the fundamental interval
$I(a_1, \cdots, a_{m_n(\omega)})$ (containing ω). For example, if ω = ∛2 − 1 = 0.259921 · · · then
x5 = 0.25992, y5 = 0.25993, ω = [3, 1, 5, 1, 1, · · · ], x5 = [3, 1, 5, 1, 1, 4, 2, 5, 1,
3], and y5 = [3, 1, 5, 1, 1, 5, 5, 1, 2, 1, 4, 3]. Therefore m5 (ω) = 5, that is,
from the first 5 decimal digits of ω we obtain its first 5 RCF digits. Using
arithmetic properties of τ and Paul Lévy’s result (4.1.19), Lochs (op. cit.)
proved that
\[ \lim_{n\to\infty} \frac{m_n}{n} = \frac{6 \log 2 \log 10}{\pi^2} = 0.97027014 \cdots \quad \text{a.e.} \]
This means that, roughly speaking, usually around 97% of the RCF digits
are determined by the decimal digits. Using an early mainframe computer,
by way of example, Lochs (1963) calculated that the first 1000 decimal digits
of π determine 968 RCF digits of it!
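The example with ω = ∛2 − 1 can be reproduced with exact rational arithmetic. The sketch below is only an illustration: the names `rcf_digits` and `lochs_m` are introduced here, and the decimal prefix of ω is passed in as an integer rather than computed from ω. The common prefix is searched up to the shorter of the two expansions, which is all that can agree.

```python
import math
from fractions import Fraction

def rcf_digits(x: Fraction) -> list:
    """RCF digits [a1, a2, ...] of a rational 0 < x <= 1 (last digit >= 2 if x < 1)."""
    digits = []
    num, den = x.numerator, x.denominator
    while num:
        a, r = divmod(den, num)   # 1/x = a + r/num
        digits.append(a)
        den, num = num, r
    return digits

def lochs_m(decimal_prefix: int, n: int) -> int:
    """m_n when the first n decimal digits of omega form the integer decimal_prefix."""
    b = rcf_digits(Fraction(decimal_prefix, 10**n))        # x_n
    c = rcf_digits(Fraction(decimal_prefix + 1, 10**n))    # y_n = x_n + 10^(-n)
    m = 0
    while m < min(len(b), len(c)) and b[m] == c[m]:
        m += 1
    return m

LOCHS = 6 * math.log(2) * math.log(10) / math.pi**2   # Lochs' constant

print(rcf_digits(Fraction(25992, 10**5)))   # expansion of x_5
print(rcf_digits(Fraction(25993, 10**5)))   # expansion of y_5
print(lochs_m(25992, 5), LOCHS)
```

Running it on the prefix 25992 with n = 5 recovers the two expansions of x5 and y5 quoted above and the value m5 = 5.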
Lochs’ result was generalized to a wider class of transformations of I by
Bosma et al. (1999). Their results are based on the Shannon–McMillan–
Breiman theorem in information theory [see Billingsley (1965, p. 129)] while
Lochs’ limit appears in fact to be the ratio of the entropies of the transfor-
mations S : I → I defined as Sx = 10 x mod 1, x ∈ I, underlying the decimal
expansion, and τ . Finally, Dajani and Fieldsteel (2001) gave wider applica-
tions and simpler proofs of results describing the rate at which the digits of
one number theoretical expansion determine those of another. Their proofs
are based on general measure-theoretic covering arguments and not on the
dynamics of specific maps.
We mention that Lochs’ problem was also considered by Faivre (1997,
1998b), who showed that (i) for any ε > 0 there exist positive constants
a < 1 and A such that
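$$\lambda\left( \left| \frac{m_n}{n} - \frac{6 \log 2 \, \log 10}{\pi^2} \right| \ge \varepsilon \right) \le A a^n, \quad n \in \mathbf{N}_+,$$
and (ii) the random variable $\left( m_n - 6(\log 2)(\log 10)\, n/\pi^2 \right)/\sqrt{n}$ is asymptotically
$N(0, \sigma)$ for some $\sigma > 0$ (which is related to the constant denoted by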
the same letter in Example 3.2.11). Clearly, Lochs’ result is implied by (i)
via the Borel–Cantelli lemma.
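The Borel–Cantelli step can be sketched as follows, writing $c = 6 \log 2 \, \log 10/\pi^2$ and $E_n(\varepsilon) = \{\, |m_n/n - c| \ge \varepsilon \,\}$ (both symbols are introduced here for illustration only). By (i),

$$\sum_{n \ge 1} \lambda\big(E_n(\varepsilon)\big) \le \sum_{n \ge 1} A a^n = \frac{A a}{1 - a} < \infty,$$

so for each fixed $\varepsilon > 0$ almost every $\omega$ belongs to only finitely many of the sets $E_n(\varepsilon)$. Applying this with $\varepsilon = 1/k$, $k \in \mathbf{N}_+$, and discarding the resulting countable union of null sets gives $m_n/n \to c$ a.e.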

Cassels (1959) showed that there exist numbers x which are non-normal in
base 3 but normal in any base that is not a power of 3. This result was
generalized by Schmidt (1960) as follows. Let the notation $r \sim s$ stand for
$r, s \in \mathbf{N}_+$ being powers of the same integer. It is fairly obvious that if $r \sim s$
then normality of x in base r and normality of x in base s imply each other.
If $r \not\sim s$ then this implication does not hold. In fact, Schmidt (op. cit.) showed
that in the latter case there is a set of numbers x, having the power of the
continuum, which are normal in base r but not even simply normal in base s.
(Simple normality means that each single digit occurs with the proper
frequency.) Motivated by this, Schweiger (1969) defined two number theoretical
transformations T and S on I (or $I^d$, the d-dimensional unit cube, $d \in \mathbf{N}_+$)
to be equivalent ($T \sim S$) if there exist $m, n \in \mathbf{N}_+$ such that $T^m = S^n$.
Schweiger then showed that $T \sim S$ implies that every T-normal number is
S-normal, and conjectured that $T \not\sim S$ implies the opposite conclusion.
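The relation between two bases (being powers of the same integer) is easy to test. Here is a minimal sketch with hypothetical helper names `primitive_root` and `same_power_class`: two bases are equivalent exactly when they share the same smallest integer of which each is a power.

```python
from math import gcd


def primitive_root(r: int) -> int:
    """Smallest integer t >= 2 such that r is a power of t (requires r >= 2)."""
    exponents = {}
    n, p = r, 2
    while p * p <= n:  # trial division
        while n % p == 0:
            exponents[p] = exponents.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        exponents[n] = exponents.get(n, 0) + 1
    g = 0
    for e in exponents.values():
        g = gcd(g, e)  # r is a perfect g-th power
    t = 1
    for prime, e in exponents.items():
        t *= prime ** (e // g)
    return t


def same_power_class(r: int, s: int) -> bool:
    """True if r and s are powers of the same integer."""
    return primitive_root(r) == primitive_root(s)


print(same_power_class(4, 8))   # True: both are powers of 2
print(same_power_class(2, 10))  # False
```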
Surprisingly, Kraaikamp and Nakada (2000) proved that the RCF and
NICF expansions share the same set of normal numbers. Clearly, in itself
this is not a counter-example to Schweiger’s conjecture, since the RCF
transformation $\tau$ and the NICF transformation $N_{1/2}$ ‘live’ on different intervals.
However, in Kraaikamp and Nakada (2001) two counter-examples are given.

4.2 & 4.3


Section 4.2 fully relies on the work of Kraaikamp (1991), see also his 1989
paper.
There exists a host of CF expansions which would have deserved to
be discussed here. Two such expansions are the Rosen continued fraction
expansions, and the α-expansions of Tanaka and Ito (1981). We will briefly
discuss both of them.
Although Rosen (1954) introduced his CF expansions in the mid-1950s,
it is only very recently that there has been any investigation of their metric
properties—see Burton et al. (2000), Gröchenig and Haas (1996), Nakada
(1995), Sebe (2002), and Schmidt (1993).
The groups which underlie the Rosen continued fraction expansions are
Fuchsian groups of the first kind—discrete subgroups of PSL(2, R) acting
upon the Poincaré upper half-plane by Möbius (fractional linear) transfor-
mations, with all of R as their limit sets.
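Let $\lambda = \lambda_q = 2 \cos(\pi/q)$ for $q \in \{3, 4, \ldots\}$, and put
$$A = \begin{pmatrix} 1 & \lambda \\ 0 & 1 \end{pmatrix}, \qquad B = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}.$$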
Then the group $G_q$ generated by $A$ and $B$ is called the Hecke (triangle)
group of index $q$. Rosen (op. cit.) defined a CF expansion related to $G_q$,
$q \ge 4$. (Note that for $q = 3$ we have the modular group.) Fix some such $q$
and let $J_q = [-\lambda/2, \lambda/2]$. Then the transformation $f_q : J_q \to J_q$ defined by
$$f_q(x) = \frac{\operatorname{sgn} x}{x} - \left\lfloor \frac{\operatorname{sgn} x}{\lambda x} + \frac{1}{2} \right\rfloor \lambda, \quad x \in J_q \setminus \{0\}, \qquad f_q(0) = 0,$$
leads to a CF expansion of the form
$$x = \cfrac{e_1}{b_1 \lambda + \cfrac{e_2}{b_2 \lambda + \ddots}},$$
where $e_i$ is equal to either $1$ or $-1$ and $b_i \in \mathbf{N}$, $i \in \mathbf{N}_+$. We call this the
Rosen, or $\lambda$-continued fraction ($\lambda$-CF), expansion of $x \in J_q \setminus \{0\}$.
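The digits of a Rosen expansion can be generated by iterating the map above. Here is a minimal floating-point sketch: the names `rosen_digits` and `rosen_value` are hypothetical, and because of rounding only the first few digits of a long expansion should be trusted.

```python
import math


def rosen_digits(x: float, q: int, n_digits: int) -> list[tuple[int, int]]:
    """First pairs (e_i, b_i) of the Rosen expansion of x in J_q."""
    lam = 2 * math.cos(math.pi / q)
    digits = []
    for _ in range(n_digits):
        if x == 0:
            break  # f_q(0) = 0: the expansion stops
        e = 1 if x > 0 else -1
        b = math.floor(e / (lam * x) + 0.5)
        digits.append((e, b))
        x = e / x - b * lam  # x <- f_q(x)
    return digits


def rosen_value(digits: list[tuple[int, int]], q: int) -> float:
    """Evaluate the finite lambda-CF with the given digits, innermost term first."""
    lam = 2 * math.cos(math.pi / q)
    t = 0.0
    for e, b in reversed(digits):
        t = e / (b * lam + t)
    return t


digits = rosen_digits(0.3, 4, 40)
print(digits[:4])
print(rosen_value(digits, 4))  # close to 0.3
```

Evaluating the truncated expansion from the inside out recovers the starting point up to a small error, which gives a simple sanity check on the digits.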

In Burton et al. (op. cit.) the natural extension of the ergodic dynamical
system underlying the λ-CF expansion was obtained for any q ≥ 3—the case
q = 3 is in fact the NICF expansion. [Previously, Nakada (op. cit.) obtained
a similar result for any even q.] From this, a large number of results similar to
those holding for the RCF expansion were obtained for the λ-CF expansion.

At first sight Nakada’s α-expansions and those of Tanaka and Ito (1981)
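bear a close resemblance. Let $\alpha \in [1/2, 1]$, $I_\alpha = [\alpha - 1, \alpha]$, and define the
transformation $T_\alpha : I_\alpha \to I_\alpha$ by
$$T_\alpha(x) = \begin{cases} x^{-1} - \lfloor x^{-1} + 1 - \alpha \rfloor & \text{if } x \in I_\alpha \setminus \{0\}, \\ 0 & \text{if } x = 0. \end{cases}$$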
It yields a unique Tanaka–Ito α-expansion of the form
$$x = \cfrac{1}{b_1 + \cfrac{1}{b_2 + \ddots}}, \quad x \in I_\alpha \setminus \{0\},$$
which is finite if and only if $x$ is rational, and where $b_i \in \mathbf{Z} \setminus \{0\}$,
$i \in \mathbf{N}_+$. In spite of the similarities it is much harder to obtain results for the
Tanaka–Ito α-expansions as compared to the Nakada α-expansions discussed
in Subsection 4.3.1. E.g., Tanaka and Ito (op. cit.) were only able to give
the explicit form of the density of the invariant measure for 1/2 ≤ α ≤ g.
For these values of α they were also able to derive the entropy of $T_\alpha$. It
is interesting to note that the latter is independent of α ∈ [1/2, g], and is
equal to $\pi^2/(6 \log G)$, where $G = 1/g$, which is the value corresponding to an
S-expansion with maximal singularization area.
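For rational arguments the Tanaka–Ito digits can be computed exactly by iterating the transformation defined above. This is a minimal sketch with hypothetical function names `tanaka_ito_digits` and `cf_value`; the `max_digits` guard is an addition of the sketch, not part of the definition.

```python
import math
from fractions import Fraction


def tanaka_ito_digits(x: Fraction, alpha: Fraction, max_digits: int = 100) -> list[int]:
    """Digits b_1, b_2, ... of the Tanaka-Ito alpha-expansion of a rational x in I_alpha."""
    digits = []
    while x != 0 and len(digits) < max_digits:
        inv = 1 / x
        b = math.floor(inv + 1 - alpha)
        digits.append(b)
        x = inv - b  # x <- T_alpha(x)
    return digits


def cf_value(digits: list[int]) -> Fraction:
    """Evaluate 1/(b_1 + 1/(b_2 + ...)) for a finite digit list."""
    t = Fraction(0)
    for b in reversed(digits):
        t = 1 / (b + t)
    return t


d = tanaka_ito_digits(Fraction(3, 5), Fraction(3, 5))
print(d)            # [2, -3]
print(cf_value(d))  # 3/5
```

Negative digits appear as soon as an iterate falls in the negative part of the interval, which is what distinguishes these expansions from the RCF case α = 1.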

It should be noted that limit properties such as those in Chapter 3 for CF
expansions other than the RCF expansion need the corresponding
Gauss–Kuzmin–Lévy theorems (implying ψ-mixing of the sequence of their
incomplete quotients). In this respect we mention the papers of Dajani and
Kraaikamp (1999), Iosifescu and Kalpazidou (1993), Kalpazidou (1985a, c,
1986d, e, 1987b), Popescu (1997a, b, 1999, 2000), Rieger (1978, 1979), Rock-
ett (1980), and Sebe (2000a, b, 2001a, b, 2002). It appears, as noted in the
Preface, that for any single CF expansion a specific approach is required,
which has to more or less mimic that working for the RCF expansion.
We conclude by briefly discussing a generalization of the RCF expansion
known as $f$-expansions (which, in general, are not CF expansions). Let $f$ be
a continuous strictly decreasing (increasing) real-valued function defined on
$[1, \beta]$, where either $2 < \beta \in \mathbf{N}_+$ or $\beta = \infty$ ($[0, \beta]$, where either $1 < \beta \in \mathbf{N}_+$
or $\beta = \infty$), such that $f(1) = 1$ and $f(\beta) = 0$ ($f(0) = 0$ and $f(\beta) = 1$),
with the convention $f(\beta) = \lim_{x \to \beta} f(x)$ for $\beta = \infty$. Denote by $f^{-1}$ the
inverse function of $f$, which is defined on $I$. Such a function $f$ can be used
to represent most real numbers $t \in I$ as
$$t = f(a_1(t) + f(a_2(t) + \cdots)) := \lim_{n \to \infty} f_n(a_1(t), \cdots, a_n(t)),$$
where $f_n$ is defined recursively by
$$f_1(x_1) = f(x_1), \qquad f_2(x_1, x_2) = f_1(x_1 + f(x_2)),$$
and
$$f_{n+1}(x_1, \cdots, x_{n+1}) = f_n(x_1, \cdots, x_{n-1}, x_n + f(x_{n+1})), \quad n \ge 2.$$
Here the ‘incomplete quotients’ $a_n(t)$ are defined recursively as
$$a_n(t) = \left\lfloor f^{-1}(\{r_{n-1}(t)\}) \right\rfloor$$
with
$$r_0(t) = t, \qquad r_n(t) = f^{-1}(\{r_{n-1}(t)\}), \quad n \in \mathbf{N}_+.$$
Note that
$$r_n(t) = a_n(t) + f(a_{n+1}(t) + f(a_{n+2}(t) + \cdots)), \quad n \in \mathbf{N}_+.$$
The above representation of $t$ is called its $f$-expansion. Clearly, the RCF
expansion is obtained for $f(x) = 1/x$, $x \ge 1$, and the part of the continued
fraction transformation $\tau$ is now played by the $f$-expansion transformation
$\tau_f$ of $I$ defined by $\tau_f(t) = \{f^{-1}(t)\}$, $t \in I$. [Some caution is necessary in the
case where $\beta = \infty$ when either $\tau_f(0)$ or $\tau_f(1)$ should be given the value 0.]
Also, the natural extension $\bar{\tau}_f$ of $\tau_f$ is defined by
$$\bar{\tau}_f(t, u) = (\tau_f(t), f(a_1(t) + u))$$
for the points $(t, u)$ of a suitable subset of $I^2$ of Lebesgue measure 1. The
$f$-expansions were first considered by Kakeya (1924), who proved that if
$f^{-1}$ is absolutely continuous and $|(f^{-1})'| > 1$ a.e. in $I$ then, save possibly
for a countable subset of $I$, every $t \in I$ has an $f$-expansion. A metrical
theory of $f$-expansions paralleling that of the RCF expansion is available.
See, e.g., Iosifescu and Grigorescu (1990, Section 5.4) and the references
therein. Finally, if β does not belong to N+ ∪ {∞}, then the corresponding
f leads to a so-called f-expansion with dependent digits. For recent results
on such f -expansions, see Barrionuevo et al. (1994), Dajani and Kraaikamp
(1996, 2001), and Dajani et al. (1994).
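The digit recursion for f-expansions translates directly into code. This is a minimal sketch under stated assumptions: the name `f_expansion_digits` is hypothetical, the caller supplies the inverse function, exact rational arithmetic is used, and the loop stops when a fractional part vanishes (a convention of the sketch, not of the theory).

```python
import math
from fractions import Fraction
from typing import Callable


def f_expansion_digits(
    t: Fraction, f_inv: Callable[[Fraction], Fraction], n: int
) -> list[int]:
    """First (at most n) incomplete quotients a_1(t), a_2(t), ... of t in I."""
    digits = []
    r = t  # r_0(t) = t
    for _ in range(n):
        frac = r - math.floor(r)  # fractional part {r_{k-1}(t)}
        if frac == 0:
            break  # expansion terminates here (sketch-level convention)
        r = f_inv(frac)  # r_k(t) = f^{-1}({r_{k-1}(t)})
        digits.append(math.floor(r))  # a_k(t) = floor(r_k(t))
    return digits


# f(x) = 1/x on [1, oo) gives the RCF digits.
print(f_expansion_digits(Fraction(25992, 100000), lambda y: 1 / y, 20))
# [3, 1, 5, 1, 1, 4, 2, 5, 1, 3]

# f(x) = x/10 on [0, 10] gives the decimal digits.
print(f_expansion_digits(Fraction(1, 4), lambda y: 10 * y, 10))
# [2, 5]
```

Passing the inverse function rather than the function itself mirrors the recursion, which only ever applies the inverse.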
References

Aaronson, J. (1986) Random f-expansions. Ann. Probab. 14, 1037–1057.
Aaronson, J. (1997) An Introduction to Infinite Ergodic Theory. Math-
ematical Surveys and Monographs 50. Amer. Math. Soc., Providence,
RI.
Aaronson, J. and Nakada, H. (2001) Sums without maxima. Preprint.
Abramov, L.M. (1959) Entropy of induced automorphisms. Dokl.
Akad. Nauk SSSR 128, 647–650. (Russian)
Abramowitz, M. and Stegun, I.A. (Eds.) (1964) Handbook of Math-
ematical Functions with Formulas, Graphs, and Mathematical Tables.
National Bureau of Standards, Washington, D.C.
de Acosta, A. (1982) Invariance principles in probability for triangular
arrays of B-valued random vectors and some applications. Ann. Probab.
10, 346–373.
Adams, W.W. (1979) On a relationship between the convergents of
the nearest integer and regular continued fractions. Math. Comp. 33,
1321–1331.
Adler, R.L. (1991) Geodesic flows, interval maps, and symbolic dy-
namics. In: Bedford, T. et al. (Eds.) (1991), 93–123.
Adler, R.L. and Flatto, L. (1984) The backward continued fraction
map and geodesic flow. Ergodic Theory and Dynamical Systems 4,
487–492.
Adler, R., Keane, M., and Smorodinsky, M. (1981) A construction of a
normal number for the continued fraction transformation. J. Number
Theory 13, 95–105.

Alexandrov, A.G. (1978) Computer investigation of continued fractions.
Algorithmic Studies in Combinatorics, 142–161. Nauka, Moscow. (Russian)

Aliev, I., Kanemitsu, S., and Schinzel, A. (1998) On the metric theory
of continued fractions. Colloq. Math. 77, 141–146.

Alzer, H. (1998) On rational approximation to e. J. Number Theory
68, 57–62.

Araujo, A. and Giné, E. (1980) The Central Limit Theorem for Real
and Banach Valued Random Variables. Wiley, New York.

Babenko, K.I. (1978) On a problem of Gauss. Soviet Math. Dokl. 19,
136–140.

Babenko, K.I. and Jur’ev, S.P. (1978) On the discretization of a
problem of Gauss. Soviet Math. Dokl. 19, 731–735.

Bagemihl, F. and McLaughlin, J.R. (1966) Generalization of some
classical theorems concerning triples of consecutive convergents to
simple continued fractions. J. Reine Angew. Math. 221, 146–149.

Bailey, D.H., Borwein, J.M., and Crandall, R.E. (1997) On the Khint-
chine constant. Math. Comp. 66, 417–431.

Baladi, V. and Keller, G. (1990) Zeta functions and transfer operators
for piecewise monotonic transformations. Comm. Math. Phys. 127,
459–477.

Barbolosi, D. (1990) Sur le développement en fractions continues à
quotients partiels impairs. Monatsh. Math. 109, 25–37.

Barbolosi, D. (1993) Automates et fractions continues. J. Théor.
Nombres Bordeaux 5, 1–22.

Barbolosi, D. (1997) Une application du théorème ergodique
sous-additif à la théorie métrique des fractions continues. J. Number Theory
66, 172–182.

Barbolosi, D. (1999) Sur l’ordre de grandeur des quotients partiels
du développement en fractions continues régulières. Monatsh. Math.
128, 189–200.

Barbolosi, D. and Faivre, C. (1995) Metrical properties of some random
variables connected with the continued fraction expansion. Indag.
Math. (N.S.) 6, 257–265.

Barndorff–Nielsen, O. (1961) On the rate of growth of the partial maxima
of a sequence of independent identically distributed random variables.
Math. Scand. 9, 383–394.

Barrionuevo, J., Burton, R.M., Dajani, K., and Kraaikamp, C. (1996)
Ergodic properties of generalized Lüroth series. Acta Arith. 74, 311–327.

Bedford, T., Keane, M., and Series, C. (Eds.) (1991) Ergodic Theory,
Symbolic Dynamics and Hyperbolic Spaces. Oxford University Press,
Oxford.

Berechet, A. (2001a) A Kuzmin-type theorem with exponential convergence
for a class of fibred systems. Ergodic Theory and Dynamical
Systems 21, 673–688.

Berechet, A. (2001b) Perron–Frobenius operators acting on BV(I) as
contractors. Ergodic Theory and Dynamical Systems 21, 1609–1624.

Bernstein, F. (1911) Über eine Anwendung der Mengenlehre auf ein
aus der Theorie der säkularen Störungen herrührendes Problem. Math.
Ann. 71, 417–439.

Billingsley, P. (1965) Ergodic Theory and Information. Wiley, New
York.

Billingsley, P. (1968) Convergence of Probability Measures. Wiley, New
York.

Borel, É. (1903) Contribution à l’analyse arithmétique du continu.
J. Math. Pures Appl. (5) 9, 329–375.

Borel, É. (1909) Les probabilités dénombrables et leurs applications
arithmétiques. Rend. Circ. Mat. Palermo 27, 247–271.

Bosma, W. (1987) Optimal continued fractions. Indag. Math. 49,
353–379.

Bosma, W. and Kraaikamp, C. (1990) Metrical theory for optimal
continued fractions. J. Number Theory 34, 251–270.

Bosma, W. and Kraaikamp, C. (1991) Optimal approximation by continued
fractions. J. Austral. Math. Soc. Ser. A 50, 481–504.

Bosma, W., Dajani, K., and Kraaikamp, C. (1999) Entropy and count-
ing correct digits. Report No. 9925 (June), Univ. Nijmegen, Dept. of
Math., Nijmegen (The Netherlands).

Bosma, W., Jager, H., and Wiedijk, F. (1983) Some metrical observa-
tions on the approximation by continued fractions. Indag. Math. 45,
281–299.

Bowman, K.O. and Shenton, L.R. (1989) Continued Fractions in Statistical
Applications. Marcel Dekker, New York.

Boyarsky, A. and Góra, P. (1997) Laws of Chaos: Invariant Measures
and Dynamical Systems in One Dimension. Birkhäuser, Boston.

Bradley, R.C. (1986) Basic properties of strong mixing conditions. In:
Eberlein, E. and Taqqu, M.S. (Eds.) Dependence in Probability and
Statistics, 165–192. Birkhäuser, Boston.

Breiman, L. (1960) A strong law of large numbers for a class of Markov
chains. Ann. Math. Statist. 31, 801–803.

Brezinski, C. (1991) History of Continued Fractions and Padé Approximants.
Springer–Verlag, Berlin.

Brjuno, A.D. (1964) The expansion of algebraic numbers into continued
fractions. Z. Vyčisl. Mat. i Mat. Fiz. 4, 211–221. (Russian)

Brodén, T. (1900) Wahrscheinlichkeitsbestimmungen bei der gewöhnlichen
Kettenbruchentwickelung reeller Zahlen. Öfversigt af Kongl.
Svenska Vetenskaps-Akademiens Förhandlingar 57, 239–266.

Brown, G. and Yin, Q. (1996) Metrical theory for Farey continued
fractions. Osaka J. Math. 33, 951–970.

Bruckheimer, M. and Arcavi, A. (1995) Farey series and Pick’s area
theorem. Math. Intelligencer 17, no. 4, 64–67.

de Bruijn, N.G. and Post, K.A. (1968) A remark on uniformly distributed
sequences and Riemann integrability. Indag. Math. 30, 149–150.

Bunimovich, L.A. (1996) Continued fractions and geometrical optics.
Amer. Math. Soc. Transl. (2) 171, 45–55.
Burton, R.M., Kraaikamp, C., and Schmidt, T.A. (2000) Natural ex-
tensions for the Rosen fractions. Trans. Amer. Math. Soc. 352, 1277–
1298.
Cassels, J.W.S. (1959) On a problem of Steinhaus about normal num-
bers. Colloq. Math. 7, 95–101.
Chaitin, G.J. (1998) The Limits of Mathematics: A Course on Infor-
mation Theory and the Limits of Formal Reasoning. Springer–Verlag
Singapore, Singapore.
Champernowne, D.G. (1933) The construction of decimals normal in
the scale of ten. J. London Math. Soc. 8, 254–260.
Chatterji, S.D. (1966) Masse, die von regelmässigen Kettenbrüchen
induziert sind. Math. Ann. 164, 113–117.
Choong, K.Y., Daykin, D.E., and Rathbone, C.R. (1971) Rational
approximations to π. Math. Comp. 25, 387–392.
Chudnovsky, D.V. and Chudnovsky, G.V. (1991) Classical constants
and functions: computations and continued fraction expansions. In:
Chudnovsky, D.V. et al. (Eds.) Number Theory (New York, 1989/1990),
13–74. Springer–Verlag, New York.
Chudnovsky, D.V. and Chudnovsky, G.V. (1993) Hypergeometric and
modular function identities, and new rational approximations to and
continued fraction expansions of classical constants and functions. In:
Knopp, M. and Sheingorn, M. (Eds.) (1993), 117–162.
Clemens, L.E., Merrill, K.D., and Roeder, D.W. (1995) Continued
fractions and series. J. Number Theory 54, 309–317.
Cohn, H. (Ed.) (1993) Doeblin and Modern Probability (Blaubeuren,
Germany, 1991). Contemporary Mathematics 149. Amer. Math. Soc.,
Providence, RI.
Corless, R.M. (1992) Continued fractions and chaos. Amer. Math.
Monthly 99, 203–215.
Cornfeld, I.P., Fomin, S.V., and Sinai, Ya.G. (1982) Ergodic Theory.
Springer–Verlag, Berlin.

Dajani, K. and Fieldsteel, A. (2001) Equipartition of interval partitions
and an application to number theory. Proc. Amer. Math. Soc.
129, 3453–3460.

Dajani, K. and Kraaikamp, C. (1994) Generalization of a theorem of
Kusmin. Monatsh. Math. 118, 55–73.

Dajani, K. and Kraaikamp, C. (1996) On approximation by Lüroth
series. J. Théor. Nombres Bordeaux 8, 331–346.

Dajani, K. and Kraaikamp, C. (1998) A note on the approximation by
continued fractions under an extra condition. New York J. Math. 3A,
69–80.

Dajani, K. and Kraaikamp, C. (1999) A Gauss–Kusmin theorem for
optimal continued fractions. Trans. Amer. Math. Soc. 351, 2055–2079.

Dajani, K. and Kraaikamp, C. (2000) ‘The mother of all continued
fractions’. Colloq. Math. 84/85, 109–123.

Dajani, K. and Kraaikamp, C. (2001) From greedy to lazy expansions
and their driving dynamics. Preprint No. 1186, Utrecht Univ., Dept.
of Math., Utrecht.

Dajani, K., Kraaikamp, C., and Solomyak, B. (1996) The natural ex-
tension of the β-transformation. Acta Math. Hungar. 73, 97–109.

Daudé, H., Flajolet, P., and Vallée, B. (1997) An average-case analysis
of the Gaussian algorithm for lattice reduction. Combinatorics,
Probability and Computing 6, 397–433.

Davenport, H. (1999) The Higher Arithmetic: An Introduction to the
Theory of Numbers, 7th Edition. Cambridge Univ. Press, Cambridge.

Davison, J.L. and Shallit, J.O. (1991) Continued fractions for some
alternating series. Monatsh. Math. 111, 119–126.

Delmer, F. and Deshouillers, J-M. (1993) On a generalization of Farey
sequences, I. In: Knopp, M. and Sheingorn, M. (Eds.) (1993), 243–246.

Delmer, F. and Deshouillers, J-M. (1995) On a generalization of Farey
sequences, II. J. Number Theory 55, 60–67.

Denker, M. and Jakubowski, A. (1989) Stable limit distributions for
strongly mixing sequences. Statist. Probab. Lett. 8, 477–483.
Denjoy, A. (1936a) Sur les fractions continues. C.R. Acad. Sci. Paris
202, 371–374.
Denjoy, A. (1936b) Sur une formule de Gauss. C.R. Acad. Sci. Paris
202, 537–540.
Diamond, H.G. and Vaaler, J.D. (1986) Estimates for partial sums of
continued fraction partial quotients. Pacific J. Math. 122, 73–82.
Dixon, J.D. (1970) The number of steps in the Euclidean algorithm.
J. Number Theory 2, 414–422.
Dixon, J.D. (1971) A simple estimate for the number of steps in the
Euclidean algorithm. Amer. Math. Monthly 78, 374–376.
Doeblin, W. (1940) Remarques sur la théorie métrique des fractions
continues. Compositio Math. 7, 353–371.
Doeblin, W. and Fortet, R. (1937) Sur des chaînes à liaisons complètes.
Bull. Soc. Math. France 65, 132–148.
Doob, J.L. (1953) Stochastic Processes. Wiley, New York.
Doukhan, P. (1994) Mixing: Properties and Examples. Lecture Notes
in Statist. 85. Springer–Verlag, New York.
Duren, P.L. (1970) Theory of $H^p$ Spaces. Academic Press, New York.
Dürner, A. (1992) On a theorem of Gauss–Kuzmin–Lévy. Arch. Math.
(Basel) 58, 251–256.
Elsner, C. (1999) On arithmetic properties of the convergents of Euler’s
number. Colloq. Math. 79, 133–145.
Elton, H.J. (1987) An ergodic theorem for iterated maps. Ergodic The-
ory and Dynamical Systems 7, 481–488.
Faivre, C. (1992) Distribution of Lévy constants for quadratic num-
bers. Acta Arith. 61, 13–34.
Faivre, C. (1993) Sur la mesure invariante de l’extension naturelle de la
transformation des fractions continues. J. Théor. Nombres Bordeaux
5, 323–332.

Faivre, C. (1996) On the central limit theorem for random variables
related to the continued fraction expansion. Colloq. Math. 71, 153–159.

Faivre, C. (1997) On decimal and continued fraction expansions of a
real number. Acta Arith. 82, 119–128.

Faivre, C. (1998a) The rate of convergence of approximations of a
continued fraction. J. Number Theory 68, 21–28.

Faivre, C. (1998b) A central limit theorem related to decimal and
continued fraction expansions. Arch. Math. (Basel) 70, 455–463.

Falconer, K.J. (1986) The Geometry of Fractal Sets. Cambridge Univ.
Press, Cambridge.

Falconer, K. (1990) Fractal Geometry: Mathematical Foundations and
Applications. Wiley, Chichester.

Feller, W. (1968) An Introduction to Probability Theory and Its Applications,
Vol. I, 3rd Edition. Wiley, New York.

Finch, S. (1995) Favorite Mathematical Constants. Available at:
http://www.mathsoft.com/asolve/constant/constant.html

Flajolet, P. and Vallée, B. (1998) Continued fractions algorithms, functional
operators, and structure constants. Theoret. Comput. Sci. 194,
1–34.

Flajolet, P. and Vallée, B. (2000) Continued fractions, comparison
algorithms, and fine structure constants. Constructive, Experimental,
and Nonlinear Analysis (Limoges, 1999), 53–82. Amer. Math. Soc.,
Providence, RI.

Fluch, W. (1986) Eine Verallgemeinerung des Kuz’min-Theorems. Anz.
Österreich. Akad. Wiss. Math.-Natur. Kl. Sitzungsber. II 195, 325–339.

Fluch, W. (1992) Ein Operator der Kettenbruchtheorie. Anz. Österreich.
Akad. Wiss. Math.-Natur. Kl. 129, 39–49.

Gál, I.S. and Koksma, J.F. (1950) Sur l’ordre de grandeur des fonctions
sommables. Indag. Math. 12, 638–653.

Galambos, J. (1972) The distribution of the largest coefficient in continued
fraction expansions. Quart. J. Math. Oxford Ser. (2) 23, 147–151.
Galambos, J. (1973) The largest coefficient in continued fractions and
related problems. In: Osgood, Ch. (Ed.) Diophantine Approximation
and its Applications (Proc. Conf., Washington, D.C., 1972), 101–109.
Academic Press, New York.
Galambos, J. (1974) An iterated logarithm type theorem for the largest
coefficient in continued fractions. Acta Arith. 25, 359–364.
Gologan, R.-N. (1989) Applications of Ergodic Theory. Technical Pub-
lishing House, Bucharest. (Romanian)
Gordin, M.I. (1971) On the behavior of the variances of sums of ran-
dom variables forming a stationary process. Theory Probab. Appl. 16,
474–484.
Gordin, M.I. and Reznik, M.H. (1970) The law of the iterated loga-
rithm for the denominators of continued fractions. Vestnik Leningrad.
Univ. 25, no. 13, 28–33. (Russian)
Gray, J.J. (1984) A commentary on Gauss’ mathematical diary, 1796–
1814, with an English translation. Exposition. Math. 2, 97–130.
Grigorescu, S. and Popescu, G. (1989) Random systems with complete
connections as a framework for fractals. Stud. Cerc. Mat. 41, 481–489.
Gröchenig, K. and Haas, A. (1996) Backward continued fractions and
their invariant measures. Canad. Math. Bull. 39, 186–198.
Grothendieck, A. (1955) Produits tensoriels topologiques et espaces
nucléaires. Mem. Amer. Math. Soc. 16. Amer. Math. Soc., Providence,
RI.
Grothendieck, A. (1956) La théorie de Fredholm. Bull. Soc. Math.
France 84, 319–384.
de Haan, L. (1970) On Regular Variation and its Application to the
Weak Convergence of Sample Extremes. Math. Centre Tracts 32.
Math. Centrum, Amsterdam.
Halmos, P.R. (1950) Measure Theory. Van Nostrand, New York. (Re-
printed 1974 by Springer–Verlag, New York)

Hardy, G.H. and Wright, E. (1979) An Introduction to the Theory of
Numbers, 5th Edition. Clarendon Press, Oxford. [Reprinted (with
corrections) 1983]

Harman, G. (1998) Metric Number Theory. Oxford University Press,
New York.

Harman, G. and Wong, K.C. (2000) A note on the metrical theory of
continued fractions. Amer. Math. Monthly 107, 834–837.

Hartman, S. (1951) Quelques propriétés ergodiques des fractions continues.
Studia Math. 12, 271–278.

Hartono, Y. and Kraaikamp, C. (2002) On continued fractions with
odd partial quotients. Rev. Roumaine Math. Pures Appl. 47, no. 1.

Heilbronn, H. (1969) On the average length of a class of finite continued
fractions. Number Theory and Analysis (Papers in Honor of Edmund
Landau), 87–96. Plenum, New York.

Heinrich, H. (1987) Rates of convergence in stable limit theorems for
sums of exponentially ψ-mixing random variables with an application
to metric theory of continued fractions. Math. Nachr. 131, 149–165.

Hennion, H. (1993) Sur un théorème spectral et son application aux
noyaux lipschitziens. Proc. Amer. Math. Soc. 118, 627–634.

Hensley, D. (1988) A truncated Gauss–Kuzmin law. Trans. Amer.
Math. Soc. 306, 307–327.

Hensley, D. (1991) The largest digit in the continued fraction expansion
of a rational number. Pacific J. Math. 151, 237–255.

Hensley, D. (1992) Continued fraction Cantor sets, Hausdorff dimension,
and functional analysis. J. Number Theory 40, 336–358.

Hensley, D. (1994) The number of steps in the Euclidean algorithm.
J. Number Theory 49, 142–182.

Hensley, D. (1996) A polynomial time algorithm for the Hausdorff
dimension of continued fraction Cantor sets. J. Number Theory 58,
9–45.

Hensley, D. (1998) Metric Diophantine approximation and probability.
New York J. Math. 4, 249–257.

Hensley, D. (2000) The statistics of the continued fraction digit sum.
Pacific J. Math. 192, 103–120.

Heyde, C.C. and Scott, D.J. (1973) Invariance principles for the law of
the iterated logarithm for martingales and processes with stationary
increments. Ann. Probab. 1, 428–436.

Hofbauer, F. and Keller, G. (1982) Ergodic properties of invariant
measures for piecewise monotonic transformations. Math. Z. 180, 119–140.

Hoffmann-Jørgensen, J. (1994) Probability with a View toward Statistics,
Vols. I and II. Chapman & Hall, New York.

Hurwitz, A. (1889) Über eine besondere Art der Kettenbruch-Entwicklung
reeller Grössen. Acta Math. 12, 367–405.

Ibragimov, I.A. and Linnik, Yu.V. (1971) Independent and Stationary
Sequences of Random Variables. Wolters–Noordhoff, Groningen.

Ionescu Tulcea, C.T. and Marinescu, G. (1950) Théorie ergodique
pour des classes d’opérations non complètement continues. Ann. of
Math. (2) 52, 140–147.

Iosifescu, M. (1968) The law of the iterated logarithm for a class of
dependent random variables. Theory Probab. Appl. 13, 304–313. Addendum,
ibid. 15 (1970), 160.

Iosifescu, M. (1972) On Strassen’s version of the loglog law for some
classes of dependent random variables. Z. Wahrsch. Verw. Gebiete
24, 155–158.

Iosifescu, M. (1977) A Poisson law for φ-mixing sequences establishing
the truth of a Doeblin statement. Rev. Roumaine Math. Pures Appl.
22, 1441–1447.

Iosifescu, M. (1978) Recent advances in the metric theory of continued
fractions. Trans. Eighth Prague Conf. on Information Theory, Statistical
Decision Functions, Random Processes (Prague, 1978), Vol. A,
27–40. Reidel, Dordrecht.

Iosifescu, M. (1989) On mixing coefficients for the continued fraction
expansion. Stud. Cerc. Mat. 41, 491–499.

Iosifescu, M. (1990) A survey of the metric theory of continued fractions,
fifty years after Doeblin’s 1940 paper. In: Grigelionis, B. et
al. (Eds.) Probability Theory and Mathematical Statistics (Proc. Fifth
Vilnius Conference, 1989), Vol. I, 550–572. Mokslas, Vilnius & VSP,
Utrecht.
Iosifescu, M. (1992) A very simple proof of a generalization of the
Gauss–Kuzmin–Lévy theorem on continued fractions, and related ques-
tions. Rev. Roumaine Math. Pures Appl. 37, 901–914.
Iosifescu, M. (1993a) Doeblin and the metric theory of continued frac-
tions: a functional theoretical approach to Gauss’ 1812 problem. In:
Cohn, H. (Ed.) (1993), 97–110.
Iosifescu, M. (1993b) A basic tool in mathematical chaos theory: Doe-
blin and Fortet’s ergodic theorem and Ionescu Tulcea and Marinescu’s
generalization. In: Cohn, H. (Ed.) (1993), 111–124.
Iosifescu, M. (1994) On the Gauss–Kuzmin–Lévy theorem, I. Rev. Rou-
maine Math. Pures Appl. 39, 97–117.
Iosifescu, M. (1995) On the Gauss–Kuzmin–Lévy theorem, II. Rev. Rou-
maine Math. Pures Appl. 40, 91–105.
Iosifescu, M. (1996) On some series involving sums of incomplete quo-
tients of continued fractions. Stud. Cerc. Mat. 48, 31–36. Corrigen-
dum, ibid. 48, 146.
Iosifescu, M. (1997a) On the Gauss–Kuzmin–Lévy theorem, III. Rev.
Roumaine Math. Pures Appl. 42, 71–88.
Iosifescu, M. (1997b) A reversible random sequence arising in the met-
ric theory of the continued fraction expansion. Rev. Anal. Numér.
Théor. Approx. 26, 91–93.
Iosifescu, M. (1999) On a 1936 paper of Arnaud Denjoy on the metrical
theory of the continued fraction expansion. Rev. Roumaine Math.
Pures Appl. 44, 777–792.
Iosifescu, M. (2000a) An exact convergence rate result with application
to Gauss’ 1812 problem. Proc. Romanian Acad. Ser. A 1, 11–13.
Iosifescu, M. (2000b) Exact values of ψ-mixing coefficients of the se-
quence of incomplete quotients of the continued fraction expansion.
Proc. Romanian Acad. Ser. A 1, 67–69.

Iosifescu, M. (2000c) On the distribution of continued fraction approximations:
optimal rates. Proc. Romanian Acad. Ser. A 1, 143–145.

Iosifescu, M. and Grigorescu, S. (1990) Dependence with Complete
Connections and its Applications. Cambridge Univ. Press, Cambridge.

Iosifescu, M. and Kalpazidou, S. (1993) The nearest integer continued
fraction expansion: an approach in the spirit of Doeblin. In: Cohn,
H. (Ed.) (1993), 125–137.

Iosifescu, M. and Kraaikamp, C. (2001) On Denjoy’s canonical continued
fraction expansion. Submitted.

Iosifescu, M. and Theodorescu, R. (1969) Random Processes and Learning.
Springer–Verlag, Berlin.

Ito, Sh. (1987) On Legendre’s theorem related to Diophantine approximations.
Séminaire de Théorie des Nombres, 1987–1988 (Talence,
1987–1988), Exp. No. 44, 19 pp.

Ito, Sh. (1989) Algorithms with mediant convergents and their metri-
cal theory. Osaka J. Math. 26, 557–578.

Jager, H. (1982) On the speed of convergence of the nearest integer
continued fraction. Math. Comp. 39, 555–558.

Jager, H. (1985) Metrical results for the nearest integer continued frac-
tion. Indag. Math. 47, 417–427.

Jager, H. (1986a) The distribution of certain sequences connected with
the continued fraction. Indag. Math. 48, 61–69.

Jager, H. (1986b) Continued fractions and ergodic theory. Transcendental
Number Theory and Related Topics, 55–59. RIMS Kokyuroku 599.
Kyoto Univ., Kyoto.

Jager, H. and Kraaikamp, C. (1989) On the approximation by continued
fractions. Indag. Math. 51, 289–307.

Jager, H. and Liardet, P. (1988) Distributions arithmétiques des
dénominateurs de convergents de fractions continues. Indag. Math. 50,
181–197.

Jain, N.C. and Pruitt, W.E. (1975) The other law of the iterated
logarithm. Ann. Probab. 3, 1046–1049.

Jain, N.C. and Taylor, S.J. (1973) Local asymptotic laws for Brownian
motion. Ann. Probab. 1, 527–549.
Jenkinson, O. and Pollicott, M. (2001) Computing the dimension of
dynamically defined sets: E2 and bounded continued fractions. Er-
godic Theory and Dynamical Systems 21, 1429–1445.
Jain, N.C., Jogdeo, K., and Stout, W.F. (1975) Upper and lower functions
for martingales and mixing processes. Ann. Probab. 3, 119–145.
Jones, W.B. and Thron, W.J. (1980) Continued Fractions: Analytic
Theory and Applications. Addison-Wesley, Reading, Mass.
Kac, M. (1959) Statistical Independence in Probability and Statistics.
Wiley, New York.
Kaijser, T. (1983) A note on random continued fractions. Probability
and Mathematical Statistics: Essays in Honour of Carl-Gustav
Esseen, 74–84. Uppsala Univ., Dept. of Math., Uppsala.
Kakeya, S. (1924) On a generalized scale of notations. Japan J. Math.
1, 95–108.
Kalpazidou, S. (1985a) On a random system with complete connec-
tions associated with the continued fraction to the nearer integer ex-
pansion. Rev. Roumaine Math. Pures Appl. 30, 527–537.
Kalpazidou, S. (1985b) On some bidimensional denumerable chains of
infinite order. Stochastic Process. Appl. 19, 341–357.
Kalpazidou, S. (1985c) Denumerable chains of infinite order and Hurwitz
expansion. Selected Papers Presented at the 16th European Meeting
of Statisticians (Marburg, 1984). Statist. Decisions, Suppl. Issue
no. 2, 83–87.
Kalpazidou, S. (1986a) A class of Markov chains arising in the met-
rical theory of the continued fraction to the nearer integer expansion.
Rev. Roumaine Math. Pures Appl. 31, 877–890.
Kalpazidou, S. (1986b) Some asymptotic results on digits of the near-
est integer continued fraction. J. Number Theory 22, 271–279.
Kalpazidou, S. (1986c) On nearest continued fractions with stochasti-
cally independent and identically distributed digits. J. Number Theory
24, 114–125.

Kalpazidou, S. (1986d) On a problem of Gauss–Kuzmin type for
continued fractions with odd partial quotients. Pacific J. Math. 123,
103–114.

Kalpazidou, S. (1986e) A Gaussian measure for certain continued
fractions. Proc. Amer. Math. Soc. 96, 629–635.

Kalpazidou, S. (1987a) On the entropy of the expansion with odd
partial quotients. In: Grigelionis, B. et al. (Eds.) Probability Theory
and Mathematical Statistics (Proc. Fourth Vilnius Conf., 1985), Vol.
II, 55–62. VNU Science Press, Utrecht.

Kalpazidou, S. (1987b) On the application of dependence with
complete connections to the metrical theory of G-continued fractions.
Lithuanian Math. J. 27, no. 1, 32–40.

Kamae, T. (1982) A simple proof of the ergodic theorem using
nonstandard analysis. Israel J. Math. 42, 284–290.

Kanwal, R.P. (1997) Linear Integral Equations: Theory and Technique,
2nd Edition. Birkhäuser, Boston.

Kargaev, P. and Zhigljavsky, A. (1997) Asymptotic distribution of the
distance function to the Farey points. J. Number Theory 65, 130–149.

Katznelson, Y. and Weiss, B. (1982) A simple proof of some ergodic
theorems. Israel J. Math. 42, 291–296.

Keane, M.S. (1991) Ergodic theory and subshifts of finite type. In:
Bedford, T. et al. (Eds.) (1991), 35–70.

Keller, G. (1984) On the rate of convergence to equilibrium in
one-dimensional systems. Comm. Math. Phys. 96, 181–193.

Khintchine, A. (1934/35) Metrische Kettenbruchprobleme.
Compositio Math. 1, 361–382.

Khintchine, A. (1936) Zur metrischen Kettenbruchtheorie. Compositio
Math. 3, 276–285.

Khintchine, A.J. (1956) Kettenbrüche. Teubner, Leipzig. [Translation
of the 2nd (1949) Russian Edition; 1st Russian Edition 1935]

Khintchine, A.Ya. (1963) Continued Fractions. Noordhoff, Groningen.
[Translation of the 3rd (1961) Russian Edition]

Khinchin, A.Ya. (1964) Continued Fractions. Univ. Chicago Press,
Chicago. [Translation of the 3rd (1961) Russian Edition]

Klein, F. (1895) Über eine geometrische Auffassung der gewöhnlichen
Kettenbruchentwicklung. Nachr. König. Gesellsch. Wiss. Göttingen
Math.-Phys. Kl. 45, 357–359. [French version (1896) Sur une
représentation géométrique du développement en fraction continue
ordinaire. Nouvelles Ann. Math. (3), 15, 327–331]

Knopp, K. (1926) Mengentheoretische Behandlung einiger Probleme
der diophantischen Approximationen und der transfiniten
Wahrscheinlichkeiten. Math. Ann. 95, 409–426.

Knopp, M. and Sheingorn, M. (Eds.) (1993) A Tribute to Emil
Grosswald: Number Theory and Related Analysis. Contemporary
Mathematics 143. Amer. Math. Soc., Providence, RI.

Knuth, D.E. (1976) Evaluation of Porter’s constant. Comput. Math.
Appl. 2, 137–139.

Knuth, D.E. (1981) The Art of Computer Programming, Vol. 2:
Seminumerical Algorithms, 2nd Edition. Addison-Wesley, Reading, Mass.

Knuth, D.E. (1984) The distribution of continued fraction
approximations. J. Number Theory 19, 443–448.

Köhler, G. (1980) Some more predictable continued fractions. Monatsh.
Math. 89, 95–100.

Koksma, J.F. (1936) Diophantische Approximationen. J. Springer,
Berlin.

Kraaikamp, C. (1987) The distribution of some sequences connected
with the nearest integer continued fraction. Indag. Math. 49, 177–191.

Kraaikamp, C. (1989) Statistic and ergodic properties of Minkowski’s
diagonal continued fraction. Theoret. Comput. Sci. 65, 197–212.

Kraaikamp, C. (1990) On the approximation by continued fractions,
II. Indag. Math. (N.S.) 1, 63–75.

Kraaikamp, C. (1991) A new class of continued fractions. Acta Arith.
57, 1–39.

Kraaikamp, C. (1993) Maximal S-expansions are Bernoulli shifts. Bull.
Soc. Math. France 121, 117–131.

Kraaikamp, C. (1994) On symmetric and asymmetric Diophantine
approximation by continued fractions. J. Number Theory 46, 137–157.

Kraaikamp, C. and Liardet, P. (1991) Good approximations and
continued fractions. Proc. Amer. Math. Soc. 112, 303–309.

Kraaikamp, C. and Lopes, A. (1996) The theta group and the
continued fraction expansion with even partial quotients. Geometriae
Dedicata 59, 293–333.

Kraaikamp, C. and Meester, R. (1998) Convergence of continued
fraction type algorithms and generators. Monatsh. Math. 125, 1–14.

Kraaikamp, C. and Nakada, H. (2000) On normal numbers for
continued fractions. Ergodic Theory and Dynamical Systems 20, 1405–1421.

Kraaikamp, C. and Nakada, H. (2001) On a problem of Schweiger
concerning normal numbers. J. Number Theory 86, 330–340.

Krasnoselskii, M. (1964) Positive Solutions of Operator Equations.
Noordhoff, Groningen.

Krengel, U. (1985) Ergodic Theorems (with a Supplement by Antoine
Brunel). W. de Gruyter, Berlin.

Kuipers, L. and Niederreiter, H. (1974) Uniform Distribution of
Sequences. Wiley, New York.

Kurosu, K. (1924) Notes on some points in the theory of continued
fractions. Japan J. Math. 1, 17–21. Corrigendum, ibid. 2 (1926), 64.

Kuzmin, R.O. (1928) On a problem of Gauss. Dokl. Akad. Nauk SSSR
Ser. A, 375–380. [Russian; French version in Atti Congr.
Internaz. Mat. (Bologna, 1928), Tomo VI, 83–89. Zanichelli, Bologna,
1932]

Lagarias, J.C. (1992) Number theory and dynamical systems. In:
Burr, S.A. (Ed.) The Unreasonable Effectiveness of Number Theory,
35–72. Proc. Sympos. Appl. Math. 46. Amer. Math. Soc., Providence,
RI.

Lang, S. and Trotter, H. (1972) Continued fractions for some algebraic
numbers. J. Reine Angew. Math. 255, 112–134. Addendum, ibid. 267
(1974), 219–220.

Lasota, A. and Mackey, M.C. (1985) Probabilistic Properties of
Deterministic Systems. Cambridge Univ. Press, Cambridge. [2nd Edition
(1994) Chaos, Fractals, and Noise: Stochastic Aspects of Dynamics.
Applied Mathematical Sciences 97. Springer–Verlag, New York]

Legendre, A.M. (1798) Essai sur la théorie des nombres. Duprat,
Paris. [2ème édition (1808), Courcier, Paris; 3ème édition (1830),
Didot, Paris; reprinted (1955), Blanchard, Paris]

Lehmer, D. (1939) Note on an absolute constant of Khintchine. Amer.
Math. Monthly 46, 148–152.

Lehner, J. (1994) Semiregular continued fractions whose partial
denominators are 1 or 2. In: Abikoff, W. et al. (Eds.) The
Mathematical Legacy of Wilhelm Magnus: Groups, Geometry and Special
Functions (Brooklyn, NY, 1992), 407–410. Contemporary
Mathematics 169. Amer. Math. Soc., Providence, RI.

Lévy, P. (1929) Sur les lois de probabilité dont dépendent les quotients
complets et incomplets d’une fraction continue. Bull. Soc. Math. France
57, 178–194.

Lévy, P. (1936) Sur le développement en fraction continue d’un nombre
choisi au hasard. Compositio Math. 3, 286–303.

Lévy, P. (1952) Fractions continues aléatoires. Rend. Circ. Mat. Palermo
(2) 1, 170–208.

Lévy, P. (1954) Théorie de l’addition des variables aléatoires, 2ème
édition. Gauthier-Villars, Paris. (1ère édition 1937)

Liardet, P. and Stambul, P. (2000) Séries de Engel et fractions
continues. J. Théor. Nombres Bordeaux 12, 37–68.

Lin, M. (1978) Quasi-compactness and uniform ergodicity of positive
operators. Israel J. Math. 29, 309–311.

Lochs, G. (1961) Statistik der Teilnenner der zu den echten Brüchen
gehörigen regelmässigen Kettenbrüche. Monatsh. Math. 65, 27–52.

Lochs, G. (1963) Die ersten 968 Kettenbruchnenner von π. Monatsh.
Math. 67, 311–316.

Lochs, G. (1964) Vergleich der Genauigkeit von Dezimalbruch und
Kettenbruch. Abh. Math. Sem. Hamburg 27, 142–144.

Lorenzen, L. and Waadeland, H. (1992) Continued Fractions and
Applications. North-Holland, Amsterdam.

Loynes, R.M. (1965) Extreme values in uniformly mixing stationary
stochastic processes. Ann. Math. Statist. 36, 993–999.

Lyons, R. (2000) Singularity of some random continued fractions. J.
Theoret. Probab. 13, 535–545.

Mackey, M.C. (1992) Time’s Arrow: The Origins of Thermodynamic
Behavior. Springer–Verlag, New York.

MacLeod, A.J. (1993) High-accuracy numerical values in the
Gauss–Kuzmin continued fraction problem. Comput. Math. Appl. 26, 37–44.

Magnus, W., Oberhettinger, F., and Soni, R.P. (1966) Formulas and
Theorems for the Special Functions of Mathematical Physics, 3rd
Edition. Springer–Verlag, Berlin.

Marcus, S. (1961) Les approximations diophantiennes et la catégorie
de Baire. Math. Z. 76, 42–45.

Marques Henriques, J. (1966) On probability measures generated by
regular continued fractions. Gaz. Mat. (Lisboa) 27, no. 103–104,
16–22.

Martin, M.H. (1934) Metrically transitive point transformations. Bull.
Amer. Math. Soc. 40, 606–612.

Mayer, D.H. (1987) Relaxation properties of the mixmaster universe.
Physics Lett. A 122, 390–394.

Mayer, D. (1990) On the thermodynamic formalism for the Gauss map.
Comm. Math. Phys. 130, 311–333.

Mayer, D. (1991) Continued fractions and related transformations. In:
Bedford, T. et al. (Eds.) (1991), 175–222.

Mayer, D. and Roepstorff, G. (1987) On the relaxation time of Gauss’
continued-fraction map. I. The Hilbert space approach
(Koopmanism). J. Statist. Phys. 47, 149–171.

Mayer, D. and Roepstorff, G. (1988) On the relaxation time of Gauss’
continued-fraction map. II. The Banach space approach (transfer
operator method). J. Statist. Phys. 50, 331–344.

Mazzone, F. (1995/96) A characterization of almost everywhere
continuous functions. Real Anal. Exchange 21, no. 1, 317–319.

McKinney, T.E. (1907) Concerning a certain type of continued
fractions depending on a variable parameter. Amer. J. Math. 29, 213–278.

Minkowski, H. (1900) Über die Annäherung an eine reelle Grösse durch
rationale Zahlen. Math. Ann. 54, 91–124.

Minnigerode, B. (1873) Über eine neue Methode, die Pell’sche
Gleichung aufzulösen. Nachr. König. Gesellsch. Wiss. Göttingen
Math.-Phys. Kl. 23, 619–652.

Misevičius, G. (1971) Asymptotic expansions for the distribution
functions of sums of the form $\sum_{j=0}^{n-1} f(T^j t)$. Ann. Univ.
Sci. Budapest Eötvös Sect. Math. 14, 77–92. (Russian)

Misevičius, G. (1981) Estimate of the remainder term in the limit
theorem for the denominators of continued fractions. Lithuanian Math. J.
21, 245–253.

Misevičius, G. (1992) The optimal zone for large deviations of the
denominators of continued fractions. New Trends in Probability and
Statistics (Palanga, 1991), Vol. 2, 83–90. VSP, Utrecht.

Moeckel, R. (1982) Geodesics on modular surfaces and continued
fractions. Ergodic Theory and Dynamical Systems 2, 69–83.

Mollin, R.A. (1999) Continued fraction gems. Nieuw Arch. Wiskunde
(4) 17, 383–405.

Morita, T. (1994) Local limit theorem and distribution of periodic
orbits of Lasota-Yorke transformations with infinite Markov partitions.
J. Math. Soc. Japan 46, 309–343. Errata, ibid. 47 (1995), 191–192.

Nakada, H. (1981) Metrical theory for a class of continued fraction
transformations and their natural extensions. Tokyo J. Math. 4,
399–426.

Nakada, H. (1990) The metrical theory of complex continued fractions.
Acta Arith. 56, 279–289.

Nakada, H. (1995) Continued fractions, geodesic flows and Ford circles.
In: Takahashi, Y. (Ed.), Algorithms, Fractals and Dynamics, 179–191.
Plenum, New York.

Nakada, H., Ito, Sh., and Tanaka, S. (1977) On the invariant measure
for the transformations associated with some real continued fraction.
Keio Engrg. Rep. 30, 159–175.

von Neumann, J. and Tuckerman, B. (1955) Continued fraction
expansion of $2^{1/3}$. Math. Tables Aids Comput. 9, 23–24.

Nolte, V.N. (1990) Some probabilistic results on the convergents of
continued fractions. Indag. Math. (N.S.) 1, 381–389.

Obrechkoff, N. (1951) Sur l’approximation des nombres irrationnels
par des nombres rationnels. C.R. Acad. Bulgare Sci. 3, no. 1, 1–4.

Olds, C.D. (1963) Continued Fractions. Random House, Toronto.

Pedersen, P. (1959) On the expansion of π in a regular continued
fraction. II. Nordisk Mat. Tidskr. 7, 165–168.

Perron, O. (1954, 1957) Die Lehre von den Kettenbrüchen. Band I:
Elementare Kettenbrüche; Band II: Analytisch-funktionentheoretische
Kettenbrüche. Teubner, Stuttgart. (1st Edition 1913; 2nd Edition 1929)

Petek, P. (1989) The continued fraction of a random variable.
Exposition. Math. 7, 369–378.

Petersen, K. (1983) Ergodic Theory. Cambridge Univ. Press,
Cambridge.

Pethő, A. (1982) Simple continued fractions for the Fredholm numbers.
J. Number Theory 14, 232–236.

Philipp, W. (1967) Some metrical theorems in number theory. Pacific
J. Math. 20, 109–127.

Philipp, W. (1970) Some metrical theorems in number theory II. Duke
Math. J. 37, 447–458. Errata, ibid. 37, 788.

Philipp, W. (1976) A conjecture of Erdös on continued fractions. Acta
Arith. 28, 379–386.

Philipp, W. (1988) Limit theorems for sums of partial quotients of
continued fractions. Monatsh. Math. 105, 195–206.

Philipp, W. and Stackelberg, O.P. (1969) Zwei Grenzwertsätze für
Kettenbrüche. Math. Ann. 181, 152–156.

Philipp, W. and Stout, W. (1975) Almost Sure Invariance Principles
for Partial Sums of Weakly Dependent Random Variables. Mem. Amer.
Math. Soc. 161. Amer. Math. Soc., Providence, RI.

Philipp, W. and Webb, G.R. (1973) An invariance principle for mixing
sequences of random variables. Z. Wahrsch. Verw. Gebiete 25,
223–237.

von Plato, J. (1994) Creating Modern Probability: Its Mathematics,
Physics and Philosophy in Historical Perspective. Cambridge Univ.
Press, Cambridge.

van der Poorten, A. and Shallit, J. (1992) Folded continued fractions.
J. Number Theory 40, 237–250.

Popescu, C. (1997a) Continued fractions with odd partial quotients:
an approach in the spirit of Doeblin. Stud. Cerc. Mat. 49, 107–117.

Popescu, C. (1997b) On the rate of convergence in Gauss’ problem
for the continued fraction expansion with odd partial quotients. Stud.
Cerc. Mat. 49, 231–244.

Popescu, C. (1999) On the rate of convergence in Gauss’ problem
for the nearest integer continued fraction expansion. Rev. Roumaine
Math. Pures Appl. 44, 257–267.

Popescu, C. (2000) On a Gauss–Kuzmin problem for the α-continued
fractions. Rev. Roumaine Math. Pures Appl. 45, 993–1004.

Popescu, G. (1978) Asymptotic behaviour of random systems with
complete connections, I, II. Stud. Cerc. Mat. 30, 37–68, 181–215.
(Romanian)

Porter, J.W. (1975) On a theorem of Heilbronn. Mathematika 22,
20–28.

Postnikov, A.G. (1960) Arithmetic Modeling of Random Processes.
Trudy Mat. Inst. Steklov. 57. Nauka, Moscow. [Russian; English
translation Selected Transl. in Math. Statist. and Probab. 13 (1973),
41–122]

Raney, G.N. (1973) On continued fractions and finite automata. Math.
Ann. 206, 265–283.

Răuţu, G. and Zbăganu, G. (1989) Some Banach algebras of functions
of bounded variation. Stud. Cerc. Mat. 41, 513–519.

Rényi, A. (1957) Representations for real numbers and their ergodic
properties. Acta Math. Acad. Sci. Hungar. 8, 477–493.

Richtmyer, R.D. (1975) Continued fraction expansion of algebraic
numbers. Adv. in Math. 16, 362–367.

Rieger, G.J. (1977) Die metrische Theorie der Kettenbrüche seit Gauss.
Abh. Braunschweig. Wiss. Gesellsch. 27, 103–117.

Rieger, G.J. (1978) Ein Gauss–Kusmin–Lévy–Satz für Kettenbrüche
nach nächsten Ganzen. Manuscripta Math. 24, 437–448.

Rieger, G.J. (1979) Mischung und Ergodizität bei Kettenbrüchen nach
nächsten Ganzen. J. Reine Angew. Math. 310, 171–181.

Rieger, G.J. (1981a) Ein Heilbronn–Satz für Kettenbrüche mit
ungeraden Teilnennern. Math. Nachr. 101, 295–307.

Rieger, G.J. (1981b) Über die Länge von Kettenbrüchen mit ungeraden
Teilnennern. Abh. Braunschweig. Wiss. Gesellsch. 32, 61–69.

Rieger, G.J. (1984) On the metrical theory of the continued fractions
with odd partial quotients. Topics in Classical Number Theory
(Budapest, 1981), Vol. II, 1371–1418. Colloq. Math. Soc. János Bolyai
34. North-Holland, Amsterdam.

Rivat, J. (1999) On the metric theory of continued fractions.
Colloq. Math. 79, 9–15.

Rockett, A.M. (1980) The metrical theory of continued fractions to
the nearer integer. Acta Arith. 38, 97–103.

Rockett, A.M. and Szűsz, P. (1992) Continued Fractions. World
Scientific, Singapore.

Rogers, C.A. (1998) Hausdorff measures, 2nd Printing, with a
Foreword by K. Falconer. Cambridge Univ. Press, Cambridge.

Rosen, D. (1954) A class of continued fractions associated with certain
properly discontinuous groups. Duke Math. J. 21, 549–563.

Rousseau-Egèle, J. (1983) Un théorème de la limite locale pour une
classe de transformations dilatantes et monotones par morceaux. Ann.
Probab. 11, 772–788.

Ruelle, D. (1978) Thermodynamic Formalism. The Mathematical
Structures of Classical Equilibrium Statistical Mechanics. Addison-Wesley,
Reading, Mass.

Ryll–Nardzewski, C. (1951) On the ergodic theorems. II. Ergodic
theory of continued fractions. Studia Math. 12, 74–79.

Šalát, T. (1967) Remarks on the ergodic theory of the continued
fractions. Mat. Časopis Sloven. Akad. Vied 17, 121–130.

Šalát, T. (1969) Bemerkung zu einem Satz von P. Lévy in der metrischen
Theorie der Kettenbrüche. Math. Nachr. 41, 91–94.

Šalát, T. (1984) On a metric result in the theory of continued fractions.
Acta Math. Univ. Comenian. 44–45, 49–53.

Salem, R. (1943) On some singular monotonic functions which are
strictly increasing. Trans. Amer. Math. Soc. 53, 427–439.

Samorodnitsky, G. and Taqqu, M.S. (1994) Stable Non-Gaussian
Random Processes: Stochastic Models with Infinite Variance. Chapman &
Hall, New York.

Samur, J.D. (1984) Convergence of sums of mixing triangular arrays
of random vectors with stationary rows. Ann. Probab. 12, 390–426.

Samur, J.D. (1985) A note on the convergence to Gaussian laws of
sums of stationary ϕ-mixing triangular arrays. Probability in Banach
Spaces V (Proceedings, Medford, 1984), 387–399. Lecture Notes in
Math. 1153. Springer–Verlag, Berlin.

Samur, J.D. (1987) On the invariance principle for stationary ϕ-mixing
triangular arrays with infinitely divisible limits. Probab. Theory
Related Fields 75, 245–259.

Samur, J.D. (1989) On some limit theorems for continued fractions.
Trans. Amer. Math. Soc. 316, 53–79.

Samur, J.D. (1991) A functional central limit theorem in Diophantine
approximation. Proc. Amer. Math. Soc. 111, 901–911.

Samur, J.D. (1996) Some remarks on a probability limit theorem for
continued fractions. Trans. Amer. Math. Soc. 348, 1411–1428.

Saulis, L. and Statulevičius, V. (1991) Limit Theorems for Large
Deviations. Kluwer, Dordrecht.

Schmidt, A.L. (1975) Diophantine approximation of complex numbers.
Acta Math. 134, 1–85.

Schmidt, A.L. (1983) Ergodic theory for complex continued fractions.
Monatsh. Math. 93, 39–62.

Schmidt, T.A. (1993) Remarks on the Rosen λ-continued fractions.
In: Pollington, A. and Moran, W. (Eds.), Number Theory with an
Emphasis on the Markoff Spectrum, 227–238. Marcel Dekker, New
York.

Schmidt, W.M. (1960) On normal numbers. Pacific J. Math. 10,
661–672.

Schmidt, W.M. (1980) Diophantine Approximation. Lecture Notes in
Math. 785. Springer–Verlag, Berlin.

Schweiger, F. (1969) Eine Bemerkung zu einer Arbeit von S.D.
Chatterji. Mat. Časopis Sloven. Akad. Vied 19, 89–91.

Schweiger, F. (1995) Ergodic Theory of Fibred Systems and Metric
Number Theory. Clarendon Press, Oxford.

Schweiger, F. (2000a) Kuzmin’s theorem revisited. Ergodic Theory
and Dynamical Systems 20, 557–565.

Schweiger, F. (2000b) Multidimensional Continued Fractions. Oxford
Univ. Press, Oxford.

Sebe, G.I. (1999) Spectral analysis of the Ruelle operator associated
with the topological infinite order chain of the continued fraction
expansion. Rev. Roumaine Math. Pures Appl. 44, 277–291.

Sebe, G.I. (2000a) The Gauss–Kuzmin theorem for Hurwitz’s singular
continued fraction expansion. Rev. Roumaine Math. Pures Appl. 45,
495–514.

Sebe, G.I. (2000b) A two-dimensional Gauss–Kuzmin theorem for
singular continued fractions. Indag. Math. (N.S.) 11, 593–605.

Sebe, G.I. (2001a) On convergence rate in the Gauss–Kuzmin problem
for the grotesque continued fractions. Monatsh. Math. 133, 241–254.

Sebe, G.I. (2001b) Gauss’ problem for the continued fraction
expansion with odd partial quotients revisited. Rev. Roumaine Math. Pures
Appl. 46, 839–852.

Sebe, G.I. (2002) A Gauss–Kuzmin theorem for the Rosen fractions.
J. Théor. Nombres Bordeaux 14.

Segre, B. (1945) Lattice points in infinite domains, and asymmetric
Diophantine approximation. Duke Math. J. 12, 337–365.

Selenius, C.-O. (1960) Konstruktion und Theorie halbregelmässiger
Kettenbrüche mit idealer relativer Approximationen. Acta Acad. Abo.
Math. Phys. 22, no. 2, 1–75.

Sendov, B. (1959/60) Der Vahlensatz über die singulären Kettenbrüche
und die Kettenbrüche nach nächsten Ganzen. Annuaire Univ. Sofia
Fac. Sci. Phys. Math. Livre 1 Math. 54, 251–258.

Seneta, E. (1976) Regularly Varying Functions. Lecture Notes in
Math. 508. Springer–Verlag, Berlin.

Series, C. (1982) Non-Euclidean geometry, continued fractions, and
ergodic theory. Math. Intelligencer 4, no. 1, 24–31.

Series, C. (1991) Geometrical methods of symbolic coding. In:
Bedford, T. et al. (Eds.) (1991), 125–151.

Shallit, J. (1979) Simple continued fractions for some irrational
numbers. J. Number Theory 11, 209–217.

Shallit, J.O. (1982a) Simple continued fractions for some irrational
numbers, II. J. Number Theory 14, 228–231.

Shallit, J.O. (1982b) Explicit descriptions of some continued fractions.
Fibonacci Quart. 20, 77–81.

Shallit, J. (1994) Origins of the analysis of the Euclidean algorithm.
Historia Math. 21, 401–419.

Shanks, D. and Wrench, J.W., Jr. (1959) Khintchine’s constant. Amer.
Math. Monthly 66, 276–279.

Shiu, P. (1995) Computation of continued fractions without input
values. Math. Comp. 64, 1307–1317.

Sinai, Ya.G. (1994) Topics in Ergodic Theory. Princeton Univ. Press,
Princeton, NJ.

Sloane, N.J.A. and Plouffe, S. (1995) The Encyclopedia of Integer
Sequences. Academic Press, San Diego.

Sprindžuk, V.G. (1979) Metric Theory of Diophantine
Approximations. Wiley, New York.

Stadje, W. (1985) Bemerkung zu einem Satz von Akcoglu und Krengel.
Studia Math. 81, 307–310.

Strassen, V. (1964) An invariance principle for the law of the iterated
logarithm. Z. Wahrsch. Verw. Gebiete 3, 211–226.

Sudan, G. (1959) The Geometry of Continued Fractions. Technical
Publishing House, Bucharest. (Romanian)

Szűsz, P. (1961) Über einen Kusminschen Satz. Acta Math. Acad. Sci.
Hungar. 12, 447–453.

Szűsz, P. (1962) Verallgemeinerung und Anwendungen eines
Kusminschen Satzes. Acta Arith. 7, 149–160.

Szűsz, P. (1980) On the length of continued fractions representing a
rational number with given denominator. Acta Arith. 37, 55–59.

Szűsz, P. and Volkmann, B. (1982) On Strassen’s law of the iterated
logarithm. Z. Wahrsch. Verw. Gebiete 61, 453–458.

Tamura, J. (1991) Symmetric continued fractions related to certain
series. J. Number Theory 38, 251–264.

Tanaka, S. and Ito, Sh. (1981) On a family of continued-fraction
transformations and their ergodic properties. Tokyo J. Math. 4,
153–175.

Thakur, D.S. (1996) Exponential and continued fractions. J. Number
Theory 59, 248–261.

Tietze, H. (1913) Über die raschesten Kettenbruchentwicklungen reeller
Zahlen. Monatsh. Math. Phys. 24, 209–242.

Tong, J. (1983) The conjugate property of the Borel theorem on
Diophantine approximation. Math. Z. 184, 151–153.

Tong, J. (1994) The best approximation function to irrational
numbers. J. Number Theory 49, 89–94.

Tonkov, T. (1974) On the average length of finite continued fractions.
Acta Arith. 26, 47–57.

Urban, F.M. (1923) Grundlagen der Wahrscheinlichkeitsrechnung und
der Theorie der Beobachtungsfehler. Teubner, Leipzig.

Urbański, M. (2001) Porosity in conformal infinite iterated function
systems. J. Number Theory 88, 283–312.

Vahlen, K.T. (1895) Über Näherungswerthe und Kettenbrüche. J. Reine
Angew. Math. 115, 221–233.

Vajda, S. (1989) Fibonacci and Lucas Numbers, and the Golden
Section: Theory and Applications. E. Horwood, Chichester.

Vallée, B. (1997) Opérateurs de Ruelle–Mayer généralisés et analyse
des algorithmes d’Euclide et de Gauss. Acta Arith. 81, 101–144.

Vallée, B. (1998) Dynamique des fractions continues à contraintes
périodiques. J. Number Theory 72, 183–235.

Vallée, B. (2000) Digits and continuants in Euclidean algorithms.
Ergodic versus Tauberian theorems. J. Théor. Nombres Bordeaux 12,
531–570.

Vardi, I. (1995) The limiting distribution of the St. Petersburg game.
Proc. Amer. Math. Soc. 123, 2875–2882.

Vardi, I. (1997) The St. Petersburg game and continued fractions.
C.R. Acad. Sci. Paris Ser. I Math. 324, 913–918.

Veech, W.A. (1982) Gauss measures for transformations on the space
of interval exchange maps. Ann. of Math. (2) 115, 201–242.

Vershik, A.M. and Sidorov, N.A. (1993) Arithmetic expansions
associated with the rotation of a circle. Algebra i Analiz 5, no. 6, 97–115.
(Russian)

Viader, P., Paradis, J., and Bibiloni, L. (1998) A new light on
Minkowski’s ?(x)-function. J. Number Theory 73, 212–227.

Viswanath, D. (2000) Random Fibonacci sequences and the number
1.13198824.... Math. Comp. 69, 1131–1155.

de Vroedt, C. (1962) Measure-theoretical investigations concerning
continued fractions. Indag. Math. 24, 583–591.

de Vroedt, C. (1964) Metrical problems concerning continued fractions.
Compositio Math. 16, 191–195.

Wall, H.S. (1948) Analytic Theory of Continued Fractions. Van
Nostrand, New York.

Walters, P. (1982) An Introduction to Ergodic Theory. Graduate Texts
in Mathematics 79. Springer–Verlag, New York.

Watson, G.N. (1944) A Treatise on the Theory of Bessel Functions,
2nd Edition. Cambridge Univ. Press, Cambridge.

Whittaker, E.T. and Watson, G.N. (1927) A Course of Modern
Analysis. Cambridge Univ. Press, Cambridge.

Wiman, A. (1900) Über eine Wahrscheinlichkeitsaufgabe bei
Kettenbruchentwickelungen. Öfversicht af Kongl. Svenska
Vetenskaps-Akademiens Förhandlingar 57, 829–841.

Wirsing, E. (1974) On the theorem of Gauss–Kusmin–Lévy and a
Frobenius type theorem for function spaces. Acta Arith. 24,
507–528.

Wrench, J.W., Jr. (1960) Further evaluation of Khintchine’s constant.
Math. Comp. 14, 370–371.

Wrench, J.W., Jr. and Shanks, D. (1966) Questions concerning
Khintchine’s constant and the efficient computation of regular continued
fractions. Math. Comp. 20, 444–448.

Zagier, D.B. (1981) Zetafunktionen und quadratische Körper. Eine
Einführung in die höhere Zahlentheorie. Springer–Verlag, Berlin-New
York.

Zuparov, T.M. (1981) On a theorem from the metric theory of
continued fractions. Izv. Akad. UzSSR Ser. Fiz.-Mat. Nauk no. 6, 9–12.
(Russian)
Index

Aaronson, J., 311, 339
Abramov’s formula, 277
Acosta, A. de, 202
Adams, W.W., 271, 272
Adler, R.L., 9, 244, 307
α-expansion, 281, 344, 345
Alexandrov, A.G., 241
algorithm A, 259
algorithm B, 260
algorithm C, 260
almost Markov property, 335
Alzer, H., 13
approximation coefficient, 27, 263
Araujo, A., 197, 320, 331
arc-sine law, 187
    generalization of, 202
array, 325
    strictly stationary, 326
    strongly infinitesimal (s.i.), 327
associated random variables, 15
    extended, 34
automorphism, 219

Babenko, K.I., 103, 109, 111, 113, 336
backward continued fraction (BCF) expansion, 307
Bagemihl, F., 30
Bailey, D.H., 231, 233, 241
Barbolosi, D., 249, 264
Barndorff–Nielsen, O., 176
Barrionuevo, J., 346
Berechet, A., xiii, 151
Bernstein, F.
    F. Bernstein’s theorem, 49, 174
Bibiloni, L., 238
Billingsley, P., 36, 180, 187, 221, 224, 257, 320, 334, 343
Birkhoff’s individual ergodic theorem, 221
Borel sets, 314
Borel, É., 22, 30, 243, 337
Borwein, J.M., 231, 233, 241
Bosma, W., 249, 251, 252, 260, 281, 288, 293–296, 298, 299, 343
boundary, 315
bounded essential variation, 55
bounded p-variation, 75
Boyarski, A., 58, 221, 223
Bradley, R.C., 326
Breiman, L., 253
Brezinski, C., xii
Brjuno, A.D., 12, 241
Brodén, T., 22, 336, 337
Brodén–Borel–Lévy formula, 21
    generalized, 37
Burton, R.M., 344, 345

Cassels, J.W.S., 343
Champernowne, D.G., 243
characteristic function, 316
Choong, K.Y., 241
Chudnovsky, D.V., 13


Chudnovsky, G.V., 13
Clemens, L.E., 12
conditional probability measures, 36
continuant, 5
continued fraction (CF), 260
continued fraction digits, 4
continued fraction expansion, 4
continued fraction expansion for e, 12
continued fraction expansion for π, 13
continued fraction transformation, 2
    natural extension of, 25
continued fraction with even incomplete quotients (Even CF) expansion, 264
continued fraction with odd incomplete quotients (Odd CF) expansion, 264
convolution, 316
Corless, R.M., 334
Cornfeld, I.P., 221
Crandall, R.E., 231, 233, 241

Dajani, K., 250, 300, 303–305, 310, 311, 337, 341, 343, 345, 346
Daudé, H., 111, 130, 134
Davison, J.L., 13
Daykin, D.E., 241
Denjoy, A., 156, 163, 337
dependence coefficients, 325
dependence with complete connections, 23, 234
diagonal continued fraction (DCF) expansion, 289
Diamond, H.G., 235, 239, 240
digamma function ψ, 145
Diophantine approximation, 29
    fundamental theorem of, 257
Dixon, J.D., 334
δ-mixing, 326
Doeblin, W., xi, 22, 33, 99, 204, 252, 335, 337–340
Doeblin–Lenstra conjecture, 252
Doob, J.L., 31
Doukhan, P., 327
Duren, P.L., 102
Dürner, A., 34, 337
dynamical system, 219

Elsner, C., 13
Elton, H.J., 253
endomorphism, 219
entropy, 257, 277
Euclid’s algorithm, 1, 2
Euler, L., 5, 12

Faivre, C., 9, 101, 130, 249, 334, 336, 343
Falconer, K.J., 233, 234
Farey continued fraction (FCF) expansion, 303
Feller, W., 238
f-expansion, 346
    with dependent digits, 346
Fieldsteel, A., 343
Flajolet, P., 111, 130, 134
Flatto, L., 307
Fluch, W., 134
Fortet, R., 335
Fourier transform, 316
Fujiwara, M., 30
fundamental interval, 18

Gál, I.S., 221, 340
Góra, P., 58, 221, 223
Galambos, J., 173, 174
Gauss, C.F., x, 15

Gauss–Kusmin–Lévy theorem
    ‘exact’, 111, 125
    $L^2$-version, 123
Gauss’ measure, 16
    extended, 26
Gauss’ Problem, 15
    Babenko’s solution to, 101f
    Paul Lévy’s solution to, 39f
    Wirsing’s solution to, 79f
Gauss’ problem for τ̄, 246
geodesic flow, 9
Giné, E., 197, 320, 331
Gordin, M.I., 216, 327
Gröchenig, K., 344
Gray, J.J., 16
Grigorescu, S., 23, 33, 62, 168, 193, 253, 334, 346
Grothendieck, A., 105
Gyldén, H., 336

Haan, L. de, 174
Haas, A., 344
Halmos, P.R., 320
Hardy, G.H., 11
Harman, G., 233
Hartman, S., 238
Hartono, Y., 264
Hausdorff dimension, 233
Hausdorff measure, 233
Heilbronn, H., 334
Heinrich, H., 203, 335
Hennion, H., 335
Hensley, D., 2, 103, 194, 234, 334, 336
Heyde, C.C., 188, 214
Hofbauer, F., 193
Hoffmann–Jørgensen, J., 320
Hurwitz, A., 263, 264, 288, 298

Ibragimov, I.A., 71, 72, 334
infinite-order chain, 33
insertion, 300
Ionescu Tulcea, C.T., 335
Iosifescu, M., 23, 33, 62, 64, 147, 151, 168, 173, 178, 179, 183, 193, 204, 334–337, 345, 346
isomorphism, 222
iterated function systems, 234
Ito, Sh., 273, 281, 302, 344, 345

Jager, H., 30, 249, 251, 252, 271–273, 281, 288, 298, 340, 341
Jain, N.C., 215, 216
Jarník, V., 234
Jenkinson, O., 234
Jogdeo, K., 215, 216
Jones, W.B., xii
Jur’ev, S.P., 113

Kac, M., 334
Kakeya, S., 346
Kalpazidou, S., 345
Kamae, T., 221
Kanwal, R.P., 105
Karamata theorem, 321
Katznelson, Y., 221
K-automorphism, 223
Keane, M.S., 221, 244
Keller, G., 193
Khin(t)chin(e), A.Ya., 16, 204, 231, 252, 257, 334, 339, 340
Knopp, K., 339
Knuth, D.E., 2, 92, 101, 333, 334
Köhler, G., 13
Koksma, J.F., 221, 334, 340
Kolmogorov, A.N., 337
Kraaikamp, C., 30, 250, 251, 264, 273, 278, 286–290, 294, 296,

299, 300, 303–305, 310, 311, MacLeod, A.J., 111, 119


337, 341–346 Magnus, W., 105, 107
Krasnoselskii, M., 128 Marinescu, G., 335
Kurosu, K., 287 Martin, M.H., 339
Kuzmin, R.O., 16 matrix approach, 7
Mayer, D.H., 59, 103, 109, 111,
Lagarias, J.C., 238 120, 127, 130, 194, 336
Lagrange, J.-L., 11 Mazzone, F., 315
Lamé, G., 2 McKinney, T.E., 281
Lang, S., 241 McLaughlin, J.R., 30
Laplace, P.S., 15 measurable space, 313
Lasota, A., 58, 220 measure, 314
λ-continued fraction (λ-CF) expan- mediant convergents, 301
sion, 344 Merrill, K.D., 12
Law of the iterated logarithm Minnigerode, B., 263
Chung’s, 215 Misevičius, G., 193, 194, 335
classical, 213 Möbius transformation, 7
Strassen’s, 213, 216 Moeckel, R., 340, 341
Legendre constants, 273 Morita, T., 195
Legendre’s theorem, 20 ‘Mother of all SRCF expansions’,
Lehmer, D., 231 301
Lehner continued fraction (LCF)
expansion, 300 Nakada, H., 9, 271, 281, 283, 285,
Lehner, J., 300, 302 311, 334, 340, 343–345
Lenstra, H.W., 252 nearest integer continued fraction
LeVeque, J., 30 (NICF) expansion, 263
Lévy-Cramér continuity theorem, Neumann, J. von, 244
316 Nolte, V.N., 227, 229, 341
Lévy–Khinchin representation, 317 normal continued fraction number,
Lévy measure, 317 243
Lévy, Paul, 16, 22, 39, 256, 271, normal number, 243
334, 340, 342 number normal in base b, 243
Liardet, P., 340, 341
Linnik, Yu.V., 71, 72, 334 Oberhettinger, F., 105, 107
Lochs, G., 334, 342 Obrechkoff, N., 30
Lopes, A., 264 Olds, C.D., xii
Lorenzen, L., xii 1–block, 258
Loynes, R.M., 174 Operator
Mayer–Ruelle, 130
Mackey, M.C., 58, 220 generalization of, 134

nuclear of order 0 (of trace class), independent, 316


105 P-distribution of, 314
trace of, 105 Raney, G.N., 9
Perron–Frobenius, 57, 58 Rathbone, C.R., 241
transition, 65 Răuţu, G., 56
optimal continued fraction (OCF) (regular) continued fraction (RCF),
expansion, 293 3, 4
convergents of [= (RCF) con-
Paradis, J., 238 vergents], 4
Pedersen, P., 231 digits of, 4
Perron, O., xii, 11, 261, 288, 289, asymptotic relative digit fre-
333 quencies, 225
Petek, P., 64 asymptotic relative frequencies
Petersen, K., 221, 223, 276, 277 of digits between two given
Pethő, A., 13 values, 226
Philipp, W., 34, 173, 174, 176, 181, asymptotic relative frequencies
215, 216, 230, 239, 256, of digits exceeding a given
334, 337–340 value, 227
Plato, J. von, xii, 337 asymptotic relative m-digit block
Poisson probability, 317 frequencies, 226
τ-centered, 317 incomplete (partial) quotients
Pollicott, M., 234 of, 4
Poorten, A. van der, 13 extended, 31
Popescu, C., 180, 281, 345 regularly varying function, 321
Popescu, G., 253 index of, 321
Porter, J.W., 333 Reznik, M.H., 216
Postnikov, A.G., 244 Riauba, R., 335
preservation area, 269 Richtmyer, R.D., 12, 241
probability, 314 Rieger, G.J., 264, 272, 284, 345
infinitely divisible, 317 Rockett, A.M., xii, 284, 334, 345
stable, 317 Roeder, D.W., 12
order of, 318 Roepstorff, G., 109, 111, 120, 127,
strictly stable, 318 336
probability space, 314 Rogers, C.A., 233
Prokhorov metric, 315 Rosen continued fraction expansion,
Pruitt, W.E., 215 344
ψ-mixing coefficient, 43 Rosen, D., 344
quadratic irrationality, 11 Rousseau-Egèle, J., 193
Ruelle, D., 130
random variable (r.v.), 313 Ryll–Nardzewski, C., 340

Šalát, T., 233 spectral radius, 95


Salem, R., 238 Sprindžuk, V.G., xii
σ-algebra, 313 St. Petersburg game, 238
Samorodnitsky, G., 320 Stackelberg, O.P., 216
Samur, J.D., 79, 99, 188, 197, 211, Stadje, W., 55
324, 327–331, 337, 338 Statulevičius, V.A., 335
Saulis, L., 335 Stout, W.F., 181, 215, 216
Schmidt, T.A., 344 Strassen, V., 213, 216
Schmidt, W.M., xii, 343 Sudan, G., xii
Schweiger, F., 264, 343 Szűsz, P., xii, 16, 30, 334, 335, 339
S-convergent, 270
Scott, D.J., 188, 214 Tamura, J., 13
Sebe, G.I., 344, 345 Tanaka, S., 281, 344, 345
Segre, B., 30 Taqqu, M.S., 320
Selenius, C.-O., 260 Taylor, S.J., 215, 216
semi-regular continued fraction (SRCF) Thakur, D.S., 13
expansion, 261 Theodorescu, R., 168
closest, 259 Thron, W.J., xii
fastest, 259, 294 Tietze, H., 261
Sendov, B., 30, 287 Tong, J., 30, 31
Seneta, E., 321, 322 Tonkov, T., 334
Series, C., 9 transformation, 219
S-expansion, 267 ergodic, 220
Shallit, J.O., 2, 13, 241 exact, 220
Shanks, D., 231, 232, 241 measure preserving, 219
Shiu, P., 241 natural extension of, 222
Sinai, Ya.G., 334 non-singular, 219
singular continued fraction (SCF) strongly mixing, 220
expansion, 264 Trotter, H., 241
singularization, 258, 265 Tuckerman, B., 244
singularization area, 267 UB Conjecture, 139
maximal, 269 Urban, F.M., 334
singularization process, 265 Uspensky, J.V., 16
skew product, 223
Jager and Liardet’s, 341 Vaaler, J.D., 235, 239, 240
Skorohod metric d₀, 319 Vahlen, K.T., 28
slowly varying function, 321 Vallée, B., 111, 130, 134, 135, 194
representation theorem, 321 Vardi, I., 238
Smorodinsky, M., 244 Viader, P., 238
Soni, R.P., 105, 107 Volkmann, B., 339

Vroedt, C. de, 340

Waadeland, H., xii


Wall, H.S., xii
Walters, P., 221
Watson, G.N., 104, 227
weak convergence, 315
Webb, G.R., 339
Weiss, B., 221
Whittaker, E.T., 227
Wiedijk, F., 249, 251, 252, 281,
288, 298
Wiener measure, 319
Wiman, A., 336, 337
Wirsing, E., 16, 83, 91, 92, 113,
336
Wrench, J.W., 231, 232
Wright, E., 11

Zagier, D.B., 308


Zbăganu, G., 56
Zuparov, T.M., 338
