Viana-Lectures On Lyapunov Exponents-Cambridge (2014)

Lectures on Lyapunov Exponents
MARCELO VIANA
Instituto Nacional de Matemática Pura e Aplicada (IMPA),
Rio de Janeiro
University Printing House, Cambridge CB2 8BS, United Kingdom
Cambridge University Press is part of the University of Cambridge.

It furthers the University’s mission by disseminating knowledge in the pursuit of
education, learning and research at the highest international levels of excellence.
www.cambridge.org
Information on this title: www.cambridge.org/9781107081734
© Marcelo Viana 2014
This publication is in copyright. Subject to statutory exception
and to the provisions of relevant collective licensing agreements,
no reproduction of any part may take place without the written
permission of Cambridge University Press.
First published 2014
Printed in the United Kingdom by CPI Group Ltd, Croydon CRO 4YY
A catalogue record for this publication is available from the British Library
Library of Congress Cataloging-in-Publication data

Viana, Marcelo, author.
Lectures on Lyapunov exponents / Marcelo Viana, Instituto Nacional de Matemática
Pura e Aplicada (IMPA), Rio de Janeiro.
pages cm. – (Cambridge studies in advanced mathematics ; 145)
Includes bibliographical references and index.
ISBN 978-1-107-08173-4 (Hardback)
1. Lyapunov exponents. I. Title.
QA372.V53 2014
515 .48–dc23 2014021609
ISBN 978-1-107-08173-4 Hardback
Cambridge University Press has no responsibility for the persistence or accuracy of
URLs for external or third-party internet websites referred to in this publication,
and does not guarantee that any content on such websites is, or will remain,
accurate or appropriate.
Contents
Preface page xi
1 Introduction 1
1.1 Existence of Lyapunov exponents 1
1.2 Pinching and twisting 2
1.3 Continuity of Lyapunov exponents 3
1.4 Notes 3
1.5 Exercises 4
2 Linear cocycles 6
2.1 Examples 7
2.1.1 Products of random matrices 7
2.1.2 Derivative cocycles 8
2.1.3 Schrödinger cocycles 9
2.2 Hyperbolic cocycles 10
2.2.1 Definition and properties 10
2.2.2 Stability and continuity 14
2.2.3 Obstructions to hyperbolicity 16
2.3 Notes 18
2.4 Exercises 19
3 Extremal Lyapunov exponents 20
3.1 Subadditive ergodic theorem 20
3.1.1 Preparing the proof 21
3.1.2 Fundamental lemma 23
3.1.3 Estimating ϕ− 24
3.1.4 Bounding ϕ+ from above 26
3.2 Theorem of Furstenberg and Kesten 28
3.3 Herman’s formula 29
3.4 Theorem of Oseledets in dimension 2 30
vii
viii Contents
3.4.1 One-sided theorem 30

3.4.2 Two-sided theorem 34
3.5 Notes 36
3.6 Exercises 36
4 Multiplicative ergodic theorem 38
4.1 Statements 38
4.2 Proof of the one-sided theorem 40
4.2.1 Constructing the Oseledets flag 40
4.2.2 Measurability 41
4.2.3 Time averages of skew products 44
4.2.4 Applications to linear cocycles 47
4.2.5 Dimension reduction 48
4.2.6 Completion of the proof 52
4.3 Proof of the two-sided theorem 53
4.3.1 Upgrading to a decomposition 53
4.3.2 Subexponential decay of angles 55
4.3.3 Consequences of subexponential decay 56
4.4 Two useful constructions 59
4.4.1 Inducing and Lyapunov exponents 59
4.4.2 Invariant cones 61
4.5 Notes 63
4.6 Exercises 64
5 Stationary measures 67
5.1 Random transformations 67
5.2 Stationary measures 70
5.3 Ergodic stationary measures 75
5.4 Invertible random transformations 77
5.4.1 Lift of an invariant measure 79
5.4.2 s-states and u-states 81
5.5 Disintegrations of s-states and u-states 85
5.5.1 Conditional probabilities 85
5.5.2 Martingale construction 86
5.5.3 Remarks on 2-dimensional linear cocycles 89
5.6 Notes 91
5.7 Exercises 91
6 Exponents and invariant measures 96
6.1 Representation of Lyapunov exponents 97
6.2 Furstenberg’s formula 102
6.2.1 Irreducible cocycles 102
Contents ix
6.2.2 Continuity of exponents for irreducible cocycles 103

6.3 Theorem of Furstenberg 105
6.3.1 Non-atomic measures 106
6.3.2 Convergence to a Dirac mass 108
6.3.3 Proof of Theorem 6.11 111
6.4 Notes 112
6.5 Exercises 113
7 Invariance principle 115
7.1 Statement and proof 116
7.2 Entropy is smaller than exponents 117
7.2.1 The volume case 118
7.2.2 Proof of Proposition 7.4. 119
7.3 Furstenberg’s criterion 124
7.4 Lyapunov exponents of typical cocycles 125
7.4.1 Eigenvalues and eigenspaces 126
7.5 Notes 130
7.6 Exercises 131
8 Simplicity 133
8.1 Pinching and twisting 133
8.2 Proof of the simplicity criterion 134
8.3 Invariant section 137
8.3.1 Grassmannian structures 137
8.3.2 Linear arrangements and the twisting property 139
8.3.3 Control of eccentricity 140
8.3.4 Convergence of conditional probabilities 143
8.4 Notes 147
8.5 Exercises 147
9 Generic cocycles 150
9.1 Semi-continuity 151
9.2 Theorem of Mañé–Bochi 153
9.2.1 Interchanging the Oseledets subspaces 155
9.2.2 Coboundary sets 157
9.2.4 Derivative cocycles and higher dimensions 161
9.3 Hölder examples of discontinuity 164
9.4 Notes 168
9.5 Exercises 169
x Contents
10 Continuity 171
10.1 Invariant subspaces 172
10.2 Expanding points in projective space 174
10.3 Proof of the continuity theorem 176
10.4 Couplings and energy 178
10.5 Conclusion of the proof 181
10.5.1 Proof of Proposition 10.9 183
10.6 Final comments 186
10.7 Notes 189
10.8 Exercises 189
References 191
Index 198
Preface
1. The study of characteristic exponents originated from the fundamental work

of Aleksandr Mikhailovich Lyapunov [85] on the stability of solutions of dif-
ferential equations. Consider a linear equation
v̇(t) = B(t) · v(t) (1)
where B(·) is a bounded function from R to the space of d × d matrices. By the
general theory of differential equations, there exists a so-called fundamental
matrix At , t ∈ R such that v(t) = At · v0 is the unique solution of (1) with initial
condition v(0) = v0 . If the characteristic exponents
1
λ (v) = lim sup log At · v (2)
t→∞ t
are negative, for all v = 0, then the trivial solution v(t) ≡ 0 is asymptotically
stable, and even exponentially asymptotically stable. The stability theorem of
Lyapunov asserts that, under an additional regularity condition, stability re-
mains valid for nonlinear perturbations
ẇ(t) = B(t) · w(t) + F(t, w) with F(t, w) ≤ const w1+ε .
That is, the trivial solution w(t) ≡ 0 is still exponentially asymptotically stable.
The regularity condition of Lyapunov means, essentially, that the limit in
(2) does exist, even if one replaces vectors v by l-vectors v1 ∧ · · · ∧ vl ; that
is, elements of the k-exterior power of Rd , for any 0 ≤ l ≤ d. This is usually
difficult to check in specific situations. But the multiplicative ergodic theorem
of Oseledets asserts that Lyapunov regularity holds with full probability, in
great generality. In particular, it holds on almost every flow trajectory, relative
to any probability measure invariant under the flow.
2. The work of Furstenberg, Kesten, Oseledets, Kingman, Ledrappier, Guiv-

arc’h, Raugi, Gol’dsheid, Margulis and other mathematicians, mostly in the
xi
xii Preface
1960s–80s, built the study of Lyapunov characteristic exponents into a very

active research field in its own right, and one with an unusually vast array of in-
teractions with other areas of Mathematics and Physics, such as stochastic pro-
cesses (random matrices and, more generally, random walks on groups), spec-
tral theory (Schrödinger-type operators) and smooth dynamics (non-uniform
hyperbolicity), to mention just a few.
My own involvement with the subject goes back to the late 20th century
and was initially motivated by my work with Christian Bonatti and José F.
Alves on the ergodic theory of partially hyperbolic diffeomorphisms and, soon
afterwards, with Jairo Bochi on the dependence of Lyapunov exponents on the
underlying dynamical system. The way these two projects unfolded very much
inspired the choice of topics in the present book.
3. A diffeomorphism f : M → M is called partially hyperbolic if there exists

a D f -invariant decomposition
T M = Es ⊕ Ec ⊕ Eu
of the tangent bundle such that E s is uniformly contracted and E u is uniformly

expanded by the derivative D f , whereas the behavior of D f along the center
bundle E c lies somewhere in between. It soon became apparent that to improve
our understanding of such systems one should try to get a better hold of the
behavior of D f | E c and, in particular, of its Lyapunov exponents. In doing
this, we turned to the classical linear theory for inspiration.
That program proved to be very fruitful, as much in the linear context (e.g.
the proof of the Zorich–Kontsevich conjecture, by Artur Avila and myself) as
in the setting of partially hyperbolic dynamics we had in mind originally (e.g
the rigidity results by Artur Avila, Amie Wilkinson and myself), and remains
very active to date, with important contributions from several mathematicians.
4. Before that, in the early 1980s, Ricardo Mañé came to the surprising conclu-
sion that generic (a residual subset of) volume-preserving C1 diffeomorphisms
on any surface have zero Lyapunov exponents, or else they are globally hyper-
bolic (Anosov); in fact, the second alternative is possible only if the surface is
the torus T2 . This discovery went against the intuition drawn from the classical
theory of Furstenberg.
Although Mañé did not write a complete proof of his findings, his approach
was successfully completed by Bochi almost two decades later. Moreover, the
conclusions were extended to arbitrary dimension, both in the volume-preserv-
ing and in the symplectic case, by Bochi and myself.
Downloaded from University Publishing Online. This is copyrighted material

http://dx.doi.org/10.1017/CBO9781139976602.001
Preface xiii
5. In this monograph I have sought to cover the fundamental aspects of the

classical theory (mostly in Chapters 1 through 6), as well as to introduce some
of the more recent developments (Chapters 7 through 10).
The text started from a graduate course that I taught at IMPA during the
(southern hemisphere) summer term of 2010. The very first draft consisted of
lecture notes taken by Carlos Bocker, José Régis Varão and Samuel Feitosa.
The unpublished notes [9] and [28], by Artur Avila and Jairo Bochi were im-
portant for setting up the first part of the course.
The material was reviewed and expanded later that year, in my seminar,
with the help of graduate students and post-docs of IMPA’s Dynamics group.
I taught the course again in early 2014, and I took that occasion to add some
proofs, to reorganize the exercises and to include historic notes in each of the
chapters. Chapter 10 was completely rewritten and this preface was also much
expanded.
6. The diagram below describes the logical connections between the ten chap-
ters. The first two form an introductory cycle. In Chapter 1 we offer a glimpse
of what is going to come by stating three main results, whose proofs will ap-
pear, respectively, in Chapters 3, 6 and 10. In Chapter 2 we introduce the no-
tion of linear cocycle, upon which is built the rest of the text. We examine more
closely the particular case of hyperbolic cocycles, especially in dimension 2,
as this will be useful in Chapter 9.
Ch. 2 Ch. 1
3 4 6 Ch. 5
7 8 9 Ch. 10
In the next four chapters we present the main classical results, including
the Furstenberg–Kesten theorem and the subadditive ergodic theorem of King-
man (Chapter 3), the multiplicative ergodic theorem of Oseledets (Chapter 4),
Ledrappier’s exponent representation theorem, Furstenberg’s formula for ex-
ponents of irreducible cocycles and Furstenberg’s simplicity theorem in dimen-
sion 2 (Chapter 6). The proof of the multiplicative ergodic theorem is based on
the subadditive ergodic theorem and also heralds the connection between Lya-

xiv Preface
punov exponents and invariant/stationary measures that lies at the heart of the
results in Chapter 6. In Chapter 5 we provide general tools to develop that
connection, in both the invertible and the non-invertible case.
7. The last four chapters are devoted to more advanced material. The main
goal there is to provide a friendly introduction to the existing research litera-
ture. Thus, the emphasis is on transparency rather than generality or complete-
ness. This means that, as a rule, we choose to state the results in the simplest
possible (yet relevant) setting, with suitable references given for stronger state-
ments.
Chapter 7 introduces the invariance principle and exploits some of its con-
sequences, in the context of locally constant linear cocycles. This includes
Furstenberg’s criterion for λ− = λ+ , that extends Furstenberg’s simplicity the-
orem to arbitrary dimension. The invariance principle has been used recently
to analyze much more general dynamical systems, linear and nonlinear, whose
Lyapunov exponents vanish. A finer extension of Furstenberg’s theorem ap-
pears in Chapter 8, where we present a criterion for simplicity of the whole
Lyapunov spectrum.
Then, in Chapter 9, we turn our attention to the contrasting Mañé–Bochi
phenomenon of systems whose Lyapunov spectra are generically not sim-
ple. We prove an instance of the Mañé–Bochi theorem, for continuous linear
cocycles. Moreover, we explain how those methods can be adapted to con-
struct examples of discontinuous dependence of Lyapunov exponents on the
cocycle, even in the Hölder-continuous category. Having raised the issue of
(dis)continuity, in Chapter 10 we prove that for products of random matrices
in GL(2) the Lyapunov exponents do depend continuously on the cocycle data.
8. Each chapter ends with set of notes and a list of exercises. Some of the ex-
ercises are actually used in the proofs. They should be viewed as an invitation
for the reader to take an active part in the arguments. Throughout, it is assumed
that the reader is familiar with the basic ideas of Measure Theory, Differential
Topology and Ergodic Theory. All that is needed can be found, for instance,
in my book with Krerley Oliveira, Fundamentos da Teoria Ergódica [114]; a
translation into English is under way.
I thank David Tranah, of Cambridge University Press, for his interest in

this book and for patiently waiting for the writing to be completed. I am also
grateful to Vaughn Climenhaga, and David himself, for a careful revision of
the manuscript that very much helped improve the presentation.
Rio de Janeiro, March, 2014

Marcelo Viana

1
Introduction
This chapter is a kind of overture. Simplified statements of three theorems

are presented that set the tone for the whole text. Much broader versions of
these theorems will appear later and several other themes around them will be
introduced and developed as we move on. At this initial stage we choose to
focus on the following special, yet significant, setting.
Let A1 , . . . , Am be invertible 2×2 real matrices and let p1 , . . . , pm be positive
numbers with p1 + · · · + pm = 1. Consider
Ln = Ln−1 · · · L1 L0 , n ≥ 1,
where the L j are independent random variables with identical probability dis-
tributions, such that
the probability of {L j = Ai } is equal to pi
for all j ≥ 0 and i = 1, . . . , m. In brief, our goal is to describe the (almost
certain) behavior of Ln as n → ∞.
1.1 Existence of Lyapunov exponents

We begin with the following seminal result of Furstenberg and Kesten [56]:
Theorem 1.1 There exist real numbers λ+ and λ− such that
1 1
lim log Ln = λ+ and lim log (Ln )−1 −1 = λ−
n n n n
with full probability.

The numbers λ+ and λ− are called extremal Lyapunov exponents. Clearly,
λ+ ≥ λ− (1.1)

2 Introduction
because B ≥ B−1 −1 for any invertible matrix B. If B has determinant ±1
then we even have B ≥ 1 ≥ B−1 −1 . Hence,
λ+ ≥ 0 ≥ λ− (1.2)
when all matrices Ai , 1 ≤ i ≤ m have determinant ±1.
1.2 Pinching and twisting

Next, we discuss conditions for the inequalities (1.1) and (1.2) to be strict.
Let B be the monoid generated by the matrices Ai , i = 1, . . . , m; that is, the
set of all products Ak1 · · · Akn with 1 ≤ k j ≤ m and n ≥ 0 (for n = 0 interpret the
product to be the identity matrix). We say that B is pinching if for any constant
κ > 1 there exists some B ∈ B such that
B > κ B−1 −1 . (1.3)
This means that the images of the unit circle under the elements of B are
ellipses with arbitrarily large eccentricity. See Figure 1.1.
B
B−1 −1
B
1
Figure 1.1 Eccentricity and pinching
We say that the monoid B is twisting if given any vector lines F, G1 , . . . ,

Gn ⊂ R2 there exists B ∈ B such that
/ {G1 , . . . , Gn }.
B(F) ∈ (1.4)
The following result is a variation of a theorem of Furstenberg [54]:
Theorem 1.2 Assume B is pinching and twisting. Then λ− < λ+ . In partic-

ular, if | det Ai | = 1 for all 1 ≤ i ≤ m then both extremal Lyapunov exponents
are different from zero.

1.3 Continuity of Lyapunov exponents 3
1.3 Continuity of Lyapunov exponents

The extremal Lyapunov exponents λ+ and λ− may be viewed as functions of
the data
A1 , . . . , Am , p1 , . . . , pm .
Let the matrices A j vary in the linear group GL(2) of invertible 2 × 2 matrices
and the probability vectors (p1 , . . . , pm ) vary in the open simplex
Δm = {(p1 , . . . , pm ) : p1 > 0, . . . , pm > 0 and p1 + · · · + pm = 1}.
The following result is part of a theorem of Bocker and Viana [35]:
Theorem 1.3 The extremal Lyapunov exponents λ± depend continuously on
(A1 , . . . , Am , p1 , . . . , pm ) ∈ GL(2)m × Δm at all points.
Example 1.4 Let m = 2, with

σ 0 cos θ − sin θ
A1 = and A = R A R
θ 1 −θ , Rθ =
0 σ −1
2
sin θ cos θ
for some σ > 1 and θ ∈ R. By Theorem 1.3, the Lyapunov exponents λ±
depend continuously on the parameter σ and θ . Moreover, using Theorem 1.2,
we have λ+ = 0 if and only if p1 = p2 = 1/2 and θ = π /2+nπ for some n ∈ Z.
1.4 Notes
Theorem 1.1 is a special case of the theorem of Furstenberg and Kesten [56],
which is valid in any dimension d ≥ 2. The full statement and the proof will
appear in Chapter 3: we will deduce this theorem from an even more general
statement, the subadditive ergodic theorem of Kingman [74]. Kingman’s theo-
rem will also be used in Chapter 4 to prove the fundamental result of the theory
of Lyapunov exponents, the multiplicative ergodic theorem of Oseledets [92].
It is natural to ask whether the type of asymptotic behavior prescribed by
Theorem 1.1 for the norm Ln and conorm (Ln )−1 −1 extends to the indi-
vidual matrix coefficients Li,n j . Furstenberg, Kesten [56] proved that this is so
if the coefficients of the matrices Ai , 1 ≤ i ≤ m are all strictly positive. The ex-
ample in Exercise 1.3 shows that this assumption cannot be removed. On the
other hand, the theorem of Oseledets theorem does contain such a description
for the matrix column vectors.
Theorem 1.2 is also the tip of a series of fundamental results, which are to
be discussed in Chapters 6 through 8. The full statement and proof of Fursten-
berg’s theorem for 2-dimensional cocycles (Furstenberg [54]) will be given in

4 Introduction
Chapter 6. The extension to any dimension will be stated and proved in Chap-
ter 7: it will be deduced from the invariance principle (Ledrappier [81], Bon-
atti, Gomez-Mont and Viana [37], Avila and Viana [16], Avila, Santamaria and
Viana [13]), a general tool that has several other applications, both for linear
and nonlinear systems.
In dimension larger than 2, there is a more ambitious problem: rather than
asking when λ− < λ+ , one wants to know when all the Lyapunov exponents
are distinct. That will be the subject of Chapter 8, which is based on Avila and
Viana [14, 15].
Furstenberg and Kifer [57] proved continuity of the Lyapunov exponents of
products of random matrices, restricted to the (almost) irreducible case. A vari-
ation of their argument will be given in Section 6.2.2. The reducible case re-
quires a delicate analysis of the random walk defined by the cocycle in projec-
tive space. That was carried out by Bocker and Viana [35], in the 2-dimensional
case, using certain discretizations of projective space. At the time of writing,
Avila, Eskin and Viana [12] are extending the statement of the theorem to arbi-
trary dimension, using a very different strategy. The proof of Theorem 1.3 that
we present in Chapter 10 is based on this more recent approach.
The problem of the dependence of Lyapunov exponents on the data can
be formulated in the broader context of linear cocycles that we are going to
introduce in Chapter 2. We will see in Chapter 9 that, in contrast, continuity
often breaks down in that generality.
1.5 Exercises
The following elementary notions are used in some of the exercises that fol-
low. We call a 2 × 2 matrix hyperbolic if it has two distinct real eigenvalues,
parabolic if it has a unique real eigenvalue, with a one-dimensional eigenspace,
and elliptic if it has two distinct complex eigenvalues. Multiples of the identity
belong to neither of these three classes.
Exercise 1.1 Show that, in dimension d = 2, if | det Ai | = 1 for all 1 ≤ i ≤ m

then λ+ + λ− = 0.
Exercise 1.2 Calculate the extremal Lyapunov exponents for m = 2 and p1 ,

p2 > 0 with p1 + p2 = 1 and

σ 0 σ −1 0
(1) A1 = and A2 = , where σ > 1;
0 σ −1 0 σ

1.5 Exercises 5

σ 0 0 −1
(2) A1 = and A2 = , where σ > 1.
0 σ −1 1 0
Exercise 1.3 (Furstenberg and Kesten [56]) Take m = 2 with p1 = p2 = 1/2
and

2 0 0 1
A1 = and A2 = .
0 1 1 0
Show that limn (1/n) log |Li,n j | does not exist for any i, j, with full probability.
Exercise 1.4 Show that if some matrix Ai , 1 ≤ i ≤ m is either hyperbolic or
parabolic then the monoid B is pinching.
Exercise 1.5 Show that the monoid B may be pinching even if all the matri-
ces Ai , 1 ≤ i ≤ m are elliptic.
Exercise 1.6 Suppose that there exists 1 ≤ i ≤ m such that Ai is conjugate to
an irrational rotation. Conclude that B is twisting.
Exercise 1.7 Suppose that there exist 1 ≤ i, j ≤ m such that Ai and A j are
either hyperbolic or parabolic and that they have no common eigenspace. Con-
clude that B is twisting (and pinching).
Exercise 1.8 Let Ai , i = 1, 2 be as in the second part of Exercise 1.2. Check
that
λ+ (A1 , A2 , 1, 0) = lim λ+ (A1 , A2 , 1 − p2 , p2 ).
p2 →0
Thus, the hypothesis p1 > 0, . . . , pm > 0 cannot be removed in Theorem 1.3.

2
Linear cocycles
Linear cocycles are the basic object upon which this text is built. Here we
define this concept and introduce a few examples. Special attention is given to
(uniformly) hyperbolic cocycles, a class that is often used as a kind of paradigm
for the behavior of more general systems.
Let (M, B, μ ) be a probability space and f : M → M be a measure-preserving
map. Let A : M → GL(d) be a measurable function with values in the linear
group GL(d) of invertible d × d matrices with real coefficients. Sometimes we
let A take values in the special linear group SL(d) of real d × d matrices with
determinant ±1. The linear cocycle defined by A over f is the transformation
F : M × Rd → M × Rd , (x, v) → ( f (x), A(x)v). (2.1)
Observe that F n (x, v) = ( f n (x), An (x)) for every n ≥ 1, where
An (x) = A( f n−1 (x)) · · · A( f (x))A(x).
If f is invertible then so is F. Moreover, F −n (x, v) = ( f −n (x), A−n (x)) for all

n ≥ 1, where
A−n (x) = A( f −n (x))−1 · · · A( f −1 (x))−1 = An ( f −n (x))−1 .
The Furstenberg–Kesten theorem (Theorem 1.1) extends to this setting, as

follows: for any f -invariant probability measure μ such that log A±1 ∈ L1 (μ ),
1 1
λ+ (x) = lim log An (x) and λ− (x) = lim log An (x)−1 −1 (2.2)
n n n n
exist at μ -almost every point x. This fact will be proven in Chapter 3.

More generally, one may consider A to take values in the group GL(d, C) of
invertible d × d matrices with complex coefficients, or the subgroup SL(d, C)
of matrices with determinant in the unit circle. This gives rise to complex linear
cocycles M × Cd → M × Cd . Of course, every complex cocycle in dimension

2.1 Examples 7
d is also a real cocycle in dimension 2d and, conversely, every d-dimensional

real linear cocycle defines a d-dimensional complex linear cocycle. The two
theories, real and complex, are actually very similar. We focus on the real case,
except where stated otherwise.
2.1 Examples
We illustrate this notion with three important classes of linear cocycles, arising
from probability theory, dynamical systems, and spectral theory, respectively.
2.1.1 Products of random matrices

The situation we considered in Chapter 1 can be modeled by (a special case
of) the following class of linear cocycles. Let X = GL(d) and M = X Z (or
M = X N ) and
f : M → M, (αk )k → (αk+1 )k
be the shift map on X. Consider the function
A : M → GL(d), (αk )k → α0
and let F : M × Rd → M × Rd be the linear cocycle defined by A over f . Note

that the kth iterate of F is given by

F n (αk )k , v = (αk+n )k , αn−1 . . . α1 α0 v .
Given a probability measure p in the space GL(d), consider the product mea-
sure μ = pZ (or μ = pN ), which is characterized by

μ {(αk )k : αi ∈ Ei , . . . , α j ∈ E j } = p(Ei ) · · · p(E j )
for every i ≤ j and any measurable sets E1 , . . . , E j ⊂ X. It is clear that μ is

invariant under the shift map.
We call a locally constant linear cocycle the following slightly more general
construction. Let (Y, Y , q) be any probability space and then consider N = Y Z
endowed with the product σ -algebra C = Y Z and the product measure ν = qZ
(or N = Y N endowed with C = Y N and ν = qN ). Let g : N → N be the shift
map. Moreover, let B : N → GL(d) be any measurable function depending only
on the zeroth coordinate; that is, of the form B(y) = β (y0 ) for some measurable
function β : Y → GL(d). Then consider the linear cocycle G : N ×Rd → N ×Rd

8 Linear cocycles
defined by B over g. Note that G is semi-conjugate to a cocycle F as in the

previous paragraph, with p = β∗ q:
N × Rd
G / N × Rd
Φ Φ

M × Rd
F / M × Rd

with Φ (xk )k , v = (β (xk ))k , v). For this reason, the two cocycles are equiva-
lent for most of our purposes.
2.1.2 Derivative cocycles

Consider a diffeomorphism f : M → M on the torus M = Td of dimension
d ≥ 1. It is easy to construct smooth vector fields X1 , . . . , Xd on Td such that
{X1 (x), . . . , Xd (x)} is a basis of the tangent space Tx M, for every x ∈ M. One
says that the torus is a parallelizable manifold. The derivative cocycle of f is
F : M × Rd → M × Rd , (x, v) → ( f (x), A(x)v),
where A(x) ∈ GL(d) is the matrix, with respect to these bases, of the derivative
D f (x) : Tx M → T f (x) M.
For more general diffeomorphisms, on non-parallelizable manifolds, the pre-
vious construction does not apply. However, one can still view the derivative
map D f : T M → M as a linear cocycle, in the following more general sense.
Let π : V → M be a finite-dimensional vector bundle. This means that V is
equipped with a family of homeomorphisms hα : Uα × Rd → π −1 (Uα ) such
that:
(i) {Uα } is an open cover of M;

(ii) π ◦ hα (x, v) = x for every x ∈ Uα and any α ;
(iii) for every x ∈ Uα ∩ Uβ and any α , β , there exists a linear isomorphism
Lα ,β (x) : Rd → Rd such that h−1β ◦ hα (x, v) = (x, Lα ,β (x)v) for every v.
The integer d ≥ 1 is the dimension of the vector bundle. A linear cocycle on V

over a transformation f : M → M is a measurable transformation F : V → V
such that π ◦ F = f ◦ π and the actions Fx : Vx → V f (x) on the fibers are linear
isomorphisms:
V
F /V
π π
f
M /M

2.1 Examples 9
For our purposes there is not much to gain from considering such generality.
So, most of the time we will stick to the case of trivial fiber bundles; that is, to
linear cocycles of the form (2.1).
2.1.3 Schrödinger cocycles

Consider = {(un )n∈Z : ∑n |un |2 < ∞}. The Schrödinger operator associated
2
with a sequence (Vn )n∈Z in R, is defined by
H : 2 → 2 , u = (un )n∈Z → H(u) = (un+1 + un−1 +Vn un )n∈Z . (2.3)
In the most interesting models the sequence Vn is generated from a dynamical

system f : M → M and a function V : M → R (the so-called potential) through
Vn = V ( f n (x)), for some x ∈ M. The most studied cases are:
(1) Random Schrödinger cocycles: Let M = X Z be a shift space, f : M → M

be the shift map, and μ = pZ be a Bernoulli measure on M. Fix x ∈ M and
then take Vn = V ( f n (x)), where the function V : M → R is such that V (x)
depends only on the zeroth coordinate of x ∈ M.
(2) Quasi-periodic Schrödinger cocycles: Let μ the normalized Lebesgue mea-
sure on M = Td and f : Td → Td be an irrational translation. Fix x ∈ Td
and take Vn = V ( f n (x)), where V : Td → R is an analytic function.
A main objective is to understand the spectral theory of these operators (a

theorem of Pastur [97] asserts that when the base system ( f , μ ) is ergodic the
spectrum of H is the same for almost all choices of x ∈ M; the same is true for
the absolutely continuous spectrum, the singular continuous spectrum and the
pure point spectrum, by Kunz and Souillard [78]). Thus, one is led to studying
the eigenvalue equation
H(u) = Eu, for E ∈ R. (2.4)
While, by definition, the eigenvectors of H are the solutions of this equation in

the space 2 , it is useful to consider (2.4) for any real sequence u = (un )n∈Z .
Note that the equation may be rewritten as
un+1 + un−1 +V ( f n (x))un = Eun
or, still equivalently,

un+1 E −V ( f n (x)) −1 un
= .
un 1 0 un−1

10 Linear cocycles
This suggests that we consider the linear cocycle FE : M × R2 → M × R2

defined over f : M → M by the function

E −V (y) −1
A : M → SL(2), A(y) = .
1 0
We have just seen that u = (un )n∈Z is a solution to H(u) = Eu if and only
if U = (un , un−1 )n∈Z is a trajectory of the linear cocycle FE . The behavior of
these linear cocycles provides useful information about the spectral properties
of the Schrödinger operator. For example, if the Lyapunov exponents of FE
are different from zero then E cannot be an eigenvalue of H : 2 → 2 . See
Damanik [48] for much more information.
2.2 Hyperbolic cocycles

We are going to define an important class of cocycles whose behavior is partic-
ularly well understood. We focus on the two-dimensional setting, but we also
comment briefly on the general case.
2.2.1 Definition and properties

Let M be a compact metric space and f : M → M be a homeomorphism. We
call a continuous cocycle
F : M × R2 → M × R2 , (x, v) → ( f (x), A(x)v)
hyperbolic if there are C > 0 and λ < 1 and, for every x ∈ M, there exist
transverse lines Exs and Exu in R2 such that
(1) A(x)Exs = E sf (x) and A(x)Exu = E uf(x)

(2) An (x)vs ≤ Cλ n vs and A−n (x)vu ≤ Cλ n vu
for every vs ∈ Exs , vu ∈ Exu , x ∈ M, and n ≥ 1.
Proposition 2.1 Let F : M × R2 → M × R2 be the linear cocycle defined by a

continuous function A : M → SL(2) over a homeomorphism f : M → M. Then
F is hyperbolic if and only if there exist constants c > 0 and σ > 1 such that
An (x) ≥ cσ n for all x ∈ M and n ≥ 1.
Proof (We are going to use (2.2), whose proof will be given in Section 3.2.
Similar arguments will appear in Section 3.4, for proving the multiplicative
ergodic theorem in dimension 2.)

Suppose that F is hyperbolic. Condition (2) in the definition implies that

An (x)vu ≥ C−1 λ −n vu , and so
An (x) ≥ C−1 λ −n for every x and n.
This means that we may take c = C−1 and σ = λ −1 . In the converse direction,
suppose that An (x) ≥ cσ n for every x and n. In particular, An (x) > 1 for
every large n. Let un (x) and sn (x) be unit vectors, respectively, most expanded
and most contracted by An (x) (Exercise 2.3):
An (x)un (x) = An (x) ≥ cσ n and
−1 −1
(2.5)
A (x)sn (x) = A (x)
n n
= An (x)−1 ≤ c−1 σ −n .
Lemma 2.2 There are C1 , C2 > 0 such that
| sin (sn (x), sn+1 (x))| ≤ C1 σ −n An (x)−1 ≤ C2 σ −2n .
for all n ≥ 0 and x ∈ M.

Proof Write αn = sn (x), sn+1 (x) . Then
sn (x) = sin αn un+1 (x) + cos αn sn+1 (x).
Then, since the images of sn+1 (x) and un+1 (x) under An+1 (x) are orthogonal,
An+1 (x)sn (x) ≥ sin αn An+1 (x)un+1 (x) = | sin αn | An+1 (x).
On the other hand,
An+1 (x)sn (x) ≤ A( f n (x)) An (x)sn (x) = A( f n (x)) An (x)−1 .
Let C0 > 0 be an upper bound for the norm of A. Then, substituting (2.5),
A( f n (x)) C0 C0
| sin αn | ≤ ≤ ≤ .
An+1 (x) An (x) cσ n+1 An (x) c2 σ 2n+1
This implies the conclusion of the lemma, with cσ C1 = c2 σ C2 = C0 .
Lemma 2.2 implies that (sn (x))n is a Cauchy sequence in projective space;
that is, it can be made a Cauchy sequence by multiplying some of the vectors
sn (x) by −1. Let s(x) = limn sn (x). Then,
∞ ∞
| sin (sn (x), s(x))| ≤ C1 ∑ σ −m Am (x)−1 ≤ C2 ∑ σ −2m (2.6)
m=n m=n
for every x ∈ M and every n ≥ 1. In particular, the convergence is uniform.

This also implies that s(x) is a continuous function of x ∈ M.
Lemma 2.3 A(x)s(x) is collinear to s( f (x)) for every x ∈ M.

12 Linear cocycles

Proof Let βn = A(x)sn+1 (x), sn ( f (x)) . Then
A(x)sn+1 (x) = cos βn sn ( f (x)) + sin βn un ( f (x)).
Applying An ( f (x)) to both sides,
An+1 (x)sn+1 (x) ≥ | sin βn | An ( f (x))un ( f (x)) − An ( f (x))sn ( f (x)).
Substituting (2.5), we get that c−1 σ −n−1 ≥ cσ n | sin βn | − c−1 σ −n , and so
| sin βn | ≤ 2c−2 σ −2n .
So | sin βn | → 0 as n → ∞, and this implies the claim in the lemma.
Lemma 2.4 For any σ0 < σ there exists n0 ≥ 1 such that An (x)s(x) ≤ σ0−n
for every x ∈ M and n ≥ n0 .
Proof First, take x ∈ M to be such that limn n−1 log An (x) exists. Observe
that, according to (2.2), this is the case for μ -almost every point x and every
f -invariant probability measure μ . We claim that
1
lim sup log An (x)s(x) ≤ − log σ . (2.7)
n n

For proving this, let γn = s(x), sn (x) . Then s(x) = cos γn sn (x) + sin γn un (x)
and so, using (2.5) and (2.6),
An (x)s(x) ≤ | cos γn |An (x)sn (x) + | sin γn |An (x)un (x)
∞
≤ An −1 +C1 ∑ σ − j A j −1 An .
m=n
The assumption implies that for any ε > 0 there exists nε ≥ 1 such that
e−nε ≤ Am −1 An ≤ enε for every m ≥ n ≥ nε .
Using (2.5) once more, it follows that
∞
An (x)s(x) ≤ c−1 σ −n +C1 enε ∑ σ − j ≤ C1 enε σ −n for every n ≥ nε ,
j=n
where the constant C1 depends only on c, C1 , and σ . This implies our claim.
Thus, we have shown that (2.7) holds for μ -almost every x and any f -invariant
probability measure μ .
Now, suppose that the conclusion of the lemma is false. Then there exists
σ0 < σ and, for every k ≥ 1, there exists nk ≥ k and xk ∈ M such that
−nk
Ank (xk )s(xk ) > σ0 .
n −1
Define μk = n−1
k ∑ j=0 δ f j (xk ) and φ (x) = log A(x)s(x). The previous relation
k


may be written, equivalently, as φ d μk > − log σ0 . Since the space of prob-
ability measures on M is weak∗ -compact, up to restricting to a subsequence
if necessary we may assume that the sequence (μk )k converges in the weak∗
topology to some probability measure μ on M. Since the function is continu-
ous, it follows that

φ d μ ≥ − log σ0 .

Clearly, f∗ μk = μk + n−1k δ f nk (xk ) − δxk . Making k → ∞ we find that f∗ μ = μ ;
that is, μ is invariant under f . Let φ̃ be the Birkhoff time average of φ :
1 n−1 1
φ̃ (x) = lim
n
∑ φ ( f j (x)) = lim
n j=0 n n
log An (x)s(x) .

On the one hand, by the ergodic theorem, φ̃ d μ = φ d μ ≥ − log σ0 . On the
other hand, (2.7) gives that φ̃ (x) ≤ − log σ for μ -almost every x. These two
inequalities are incompatible. This contradiction proves the lemma.
Let σ0 ∈ (1, σ ) be fixed. By Lemma 2.4, there exists C3 > 0 such that
An (x)s(x) ≤ C3 σ0−n for every x ∈ M and every n ≥ 1. (2.8)
Analogously, considering backward iterates instead, one constructs a unit vec-
tor u(x) such that A−1 (x)u(x) is collinear to u(x) for every x ∈ M and
A−n (x)u(x) ≤ C3 σ0−n for every x ∈ M and every n ≥ 1. (2.9)
Incidentally, this is the first time in the proof that we have used the assumption
that the cocycle is invertible.
Lemma 2.5 The vectors s(x) and u(x) are transverse for every x ∈ M.
Proof From (2.8) we get that An ( f −n (x))s( f −n (x)) ≤ C3 σ −n for every x
and n. This may be rewritten as
A−n (x)s(x) ≥ C3−1 σ0n ,
in view of Lemma 2.3. Moreover, using (2.9),
A−n (x)u(x) ≤ C3 σ0−n ,
for every x and n. Then A−n (x)s(x) is larger than A−n (x)u(x) for every
large n. In particular, s(x) and u(x) cannot be collinear, as claimed.
Define Exs and Exu to be the lines generated by s(x) and u(x), respectively.
Lemmas 2.2–2.5 yield all the properties in the conclusion of Proposition 2.1.

14 Linear cocycles
2.2.2 Stability and continuity

Let C0 (M, SL(2)) denote the space of continuous functions M → SL(2), en-
dowed with the distance
d(A, B) = sup A(x) − B(x).
x∈M
We are going to deduce from Proposition 2.1 that hyperbolicity corresponds

to an open subset of this space. Moreover, the invariant decomposition varies
continuously on this subset.
Proposition 2.6 Let f : M → M be fixed. Suppose that the linear cocycle F
defined by A : M → SL(2) over f is hyperbolic, and let R2 = Exs ⊕ Exu be the
corresponding invariant decomposition.
There exists δ > 0 such that the cocycle defined over f by any continuous
function B : M → SL(2) with d(A, B) < δ is also hyperbolic.
Moreover, given ε > 0, the B-invariant decomposition R2 = EB,x s ⊕ Eu , x ∈
B,x
M satisfies | sin (Ex , EB,x )| < ε and | sin (Ex , EB,x )| < ε for all x ∈ M, if δ is
s s u u
small enough.
Proof For each x ∈ M and γ > 0, let Cu (x, γ ) be the set of vectors v ∈ Rd
whose coordinates (vs , vu ) in the decomposition Rd = Exs ⊕ Exu satisfy vs ≤
γ vu (this is a particular case of a cone, of which we will hear more in Sec-
tion 4.4.2). Let C > 0 and λ < 1 be constants as in the definition of hyperbol-
icity. Fix k ≥ 1 large enough so that Cλ k ≤ 1/3. Then, for any v ∈ Cu (x, 1) and
x ∈ M,
1 1 1
Ak (x)vu ≥ 3vu and Ak (x)vs ≤ vs ≤ vu ≤ Ak (x)vu ,
3 3 9
and so Ak (x)v ∈ Cu (x, 1/9). Then, for any B : M → SL(2) such that d(A, B) is
sufficiently small, for every v ∈ Cu (x, 1), and every x ∈ M,
u
Bk (x)v ≥ 2vu and Bk (x)v ∈ Cu (x, 1).
u
So, by induction, Bkn (x)v ≥ 2n vu for every v ∈ Cu (x, 1), every x ∈ M,
and every n ≥ 1. This implies that Bn (x) ≥ cσ n for every x ∈ M and n ≥ 1,
where σ = 21/k and c > 0 are independent of B, x, and n. By Proposition 2.1,
it follows that the cocycle defined by B is hyperbolic, as claimed.
To prove the other claim in the proposition, let ε > 0 be fixed. Let s(x)
be a unit vector in the direction of EB,x s and, for each n ≥ 1, let s (x) be
B,n
n
the unit vector most contracted by B (x). By (2.6), there is a constant C > 0
independent of B, x, and n, such that
| sin (sB,n (x), sB (x))| ≤ Cσ −2n .

Fix n ≥ 1, large enough so that Cσ −2n < ε /4 and then assume d(A, B) is small
enough so that | sin (sA,n (x), sB,n (x))| < ε /4 for every x ∈ M and n ≥ 1. Then
| sin (sA (x), sB (x))| ≤ | sin (sA,n (x), sB,n (x))| + 2Cσ −2n < ε .
s )| < ε for every x ∈ M. The corresponding state-
This proves that | sin (Exs , EB,x
ment for E u is analogous.
Here is another interesting consequence of Proposition 2.1.
Proposition 2.7 Suppose that A : M → SL(2) is such that A(x) has posi-
tive entries for every x ∈ M. Then the linear cocycle F defined by A over any
homeomorphism f : M → M is hyperbolic.
Proof By hypothesis,

ax bx
A(x) =
cx dx
with ax , bx , cx , dx > 0 and ax dx − bx cx = 1. Since M is compact and A is con-
tinuous, there exists a positive lower bound δ > 0 for ax , bx , cx , and dx over all
x ∈ M. It is clear that the cocycle preserves the subset of vectors with positive
entries: if v = (v0 , w0 ) is such that v0 > 0 and w0 > 0 then A(x)v = (v1 , w1 )
with v1 > 0 and w1 > 0. Moreover,
v1 w1 > (1 + 2bx cx )v0 w0 ≥ (1 + 2δ 2 )v0 w0 .
Then, by induction, the iterates (vn , wn ) = An (x)v have vn > 0, wn > 0 and
vn wn > (1 + 2δ 2 )n v0 w0 . This implies that the norm of An (x) grows exponen-
tially fast: An (x) ≥ cσ n for some c > 0 and σ = (1 + 2δ 2 )1/2 . Now the claim
follows directly from Proposition 2.1.
The notion of hyperbolicity extends naturally to linear cocycles in any di-
mension and one may even consider linear cocycles on general vector bundles,
as mentioned in Section 2.1.2. Namely, let f : M → M be a homeomorphism
on a compact space M. A continuous linear cocycle F : V → V over f is hy-
perbolic if there are constants C > 0 and λ < 1 and for every x ∈ M there exists
a direct sum decomposition Vx = Exs ⊕ Exu of the fiber over x satisfying
(1) Fx (Exs ) = E sf (x) and Fx (Exu ) = E uf(x)
(2) Fxn (vs ) ≤ Cλ n vs and Fx−n (vu ) ≤ Cλ n vu for every vs ∈ Exs , vu ∈ Exu ,
and n ≥ 1.
The conclusion of Proposition 2.6 remains valid in this generality, as we are
going to explain. First of all, there is a natural topology in the space of linear
cocycles over f : M → M, associated with the following norm. Fix any finite

16 Linear cocycles
set of trivializing coordinates hα : Uα × Rd → π −1 (Uα ) for the vector bundle

V . The local expressions

h−1
β ◦ F ◦ hα : Uα ∩ f
−1
(Uβ ) × Rd → Uβ × Rd
of F are maps of the form (x, v) → ( f (x), Fα ,β (x)v). Define

F = max sup Fα ,β (x) : x ∈ Uα ∩ f −1 (Uβ ) .
α ,β
The topology associated with this norm does not depend on the choice of the
trivializing atlas.
Then the set of hyperbolic cocycles is an open subset of the space of all
linear cocycles over f : M → M. Moreover, the invariant sub-bundles E s and E u
vary continuously with the linear cocycle inside that open set, in the following
sense: restricted to the each trivializing domain Uα , we may view x → Ex∗ ,
∗ ∈ {s, u} as a map from Uα to the Grassmannian Gr(d); if F is a hyperbolic
linear cocycle and G is a nearby linear cocycle then the maps Uα → Gr(d)
associated with G are uniformly close to those of F, for each α . The proof of
these facts is part of Exercise 2.6 (see also Shub [107]).
By the Grassmannian Gr(d) we mean the (disjoint) union of the Grass-
mannian manifolds Gr(l, d), 0 ≤ l ≤ d, whose elements are the l-dimensional
vector subspaces of Rd .
2.2.3 Obstructions to hyperbolicity

We describe a few mechanisms that exclude the presence of hyperbolic cocy-
cles in certain situations.
Example 2.8 Let M be a compact, connected metric space, f : M → M be a
continuous transformation, and A : M → SL(d) be a continuous function such
that for every 1 ≤ i ≤ d − 1 there exists a periodic point pi ∈ M of f , with
period κi ≥ 1 such that the eigenvalues {β ji : 1 ≤ j ≤ d} of each Aκi (pi ) satisfy
|β1i | ≥ · · · ≥ |βi−1
i
| > |βii | = |βii+1 | > |βi+2
i
| ≥ · · · ≥ |βdi | (2.10)
and βii , βi+1
i are complex conjugate (not real). Such an A may be found, for
instance, starting with a constant cocycle and deforming it on disjoint neigh-
borhoods of the periodic orbits. Property (2.10) remains valid for every B :
M → SL(d) in a C0 neighborhood U of A. Consider any B ∈ U and suppose
the associated cocycle is hyperbolic, with hyperbolic decomposition Exs ⊕ Exu at
each x ∈ M. Since the hyperbolic decomposition is continuous (Exercises 2.1
and 2.6), and M is connected, the dimensions of Exs and Exu are constant. Let
dim Ex = l. Then, for any point p ∈ M with f κ (p) = p, the l largest (in norm)
eigenvalues of Bκ (p) are strictly larger than the other d − l eigenvalues. This

is incompatible with (2.10) when p = pl . This contradiction shows that no

cocycle in U can be hyperbolic.
Example 2.9 (Michael Herman) Let S1 = R/Z and f : S1 → S1 be a continu-
ous transformation, with topological degree deg( f ) ∈ Z. Let A : S1 → SL(2) be
of the form A(x) = A0 R2πα (x) where A0 ∈ SL(d), α : S1 → S1 is a continuous
function with topological degree deg(α ) ∈ Z, and Rθ denotes the rotation of
angle θ . Let U be isotopy class of A in the space of maps from S1 to SL(2).
This is a C0 neighborhood of A. We claim that if 2 deg(α ) is not a multiple of
deg( f ) − 1 then the linear cocycle associated with every B ∈ U is not hyper-
bolic. Indeed, let B ∈ U and suppose the associated cocycle is hyperbolic. Let
Exs ⊕ Exu be the hyperbolic decomposition. Let us view E s : x → Exs as a con-
tinuous map from S1 to the projectivization PR2 of the real plane. The graph
{(x, Exs ) : x ∈ S1 } represents some element (η , ζ ) of the fundamental group
π1 (S1 × PR2 ) = Z ⊕ Z. Since B is isotopic to A, the image of graph(E s ) under
the cocycle must represent
(η deg( f ), ζ + 2 deg(α )) ∈ π1 (S1 × PR2 )
(the factor 2 comes from the fact that S1 is the 2-fold covering of PR2 ). This
must be collinear to (η , ζ ), because E s is invariant under the cocycle. In other
words, we must have ζ + 2 deg(α ) = deg( f )ζ . Then deg( f ) − 1 must divide
2 deg(α ). This proves our claim.
Example 2.10 A diffeomorphism f : M → M on a compact manifold is an
Anosov diffeomorphism if the derivative cocycle F = D f is hyperbolic. The
existence of Anosov diffeomorphisms imposes strong restrictions on the mani-
fold M. For one thing, the Euler characteristic must be zero. In dimension 2 the
Klein bottle may also be excluded, so that the only surface that admits Anosov
diffeomorphisms is the torus T2 .
Anosov diffeomorphisms can also be constructed in the high-dimensional
tori, as follows. Let A be any d × d a matrix with integer coefficients and de-
terminant 1. Then A defines a diffeomorphism of Td = Rd /Zd and, assuming
that A has no eigenvalues in the unit circle, this diffeomorphism is Anosov.
Every Anosov diffeomorphism on a torus is topologically conjugate to such a
hyperbolic automorphism.
Anosov diffeomorphisms with a similar algebraic flavor may be constructed
on the more general class of infranilmanifolds, which are suitable quotients of
Lie groups. Again, all Anosov diffeomorphisms on infranilmanifolds are topo-
logically conjugate to a hyperbolic automorphism of the manifold. Moreover,
all known examples of Anosov diffeomorphisms are defined on infranilmani-
folds. See Newhouse [91], Franks [52], Manning [90].

18 Linear cocycles
2.3 Notes
The theory of products of random matrices was effectively initiated by Fursten-
berg and Kesten [56] and Furstenberg [54]. There is now a vast literature on
the subject, some of which will be discussed later. In addition to the references
we provide along the way, let us mention the collections [7, 44, 51] and the
book [46].
The time-independent Schrödinger equation describes the so-called orbitals,
or stationary waves, in quantum mechanics. It takes the form HΨ = E ψ where
Ψ is the wave function, E is the energy of the quantum state Ψ and H is the
Hamiltonian operator, an hermitian operator that characterizes the total energy
of any wave function. In the case of a single particle moving in an electric field
the Hamiltonian operator is given by

h̄2
HΨ(x) = − Δ +V (x) Ψ(x)
2m
where h̄ is the reduced Planck constant, m is the mass, V is the potential en-
ergy and Δ is the Laplacian operator. Our discussion in Section 2.1.3 concerns
the case when the space variable x is one-dimensional and discrete. Then, the
Laplacian is simply given by ΔΨ(n) = Ψ(n + 1) − 2Ψ(n) + Ψ(n − 1) and this
readily leads to the form of the Schrödinger operator in (2.3).
Much of the mathematical study of Schrödinger operators is motivated by
an observation made in 1958 by the American physicist Philip Anderson. He
argued in [1] that, while ideal crystals are always conductors, the presence of
impurities should cause the crystal to loose all its conductivity properties and,
thus, become an insulator: the electrons are trapped due to the crystal lattice
disorder (this discovery earned Anderson the Nobel Prize for Physics in 1977).
Mathematically, such disordered systems are modeled by suitable (random)
Schrödinger operators H : 2 (Zd ) → 2 (Zd ) and the Anderson localization
phenomenon corresponds to the operator H having only a pure point spec-
trum. In other words, localization means that the space 2 (Zd ) admits a Hilbert
basis formed by eigenvectors of H. See Damanik [47] and Jitomirskaya and
Marx [66] for surveys on this and related problems. In the text, we restricted
ourselves to hinting (in the case d = 1) that the Lyapunov exponents of the
associated Schrödinger cocycles have a say in the spectral theory of these op-
erators. Damanik [48] contains a lot more information on this topic.
The observations in Section 2.2 will be useful in Chapter 9. The notion of
(uniform) hyperbolicity is due to Smale; see [109] and references therein. He
had in mind diffeomorphisms and smooth flows (derivative cocycles) and the
purpose was twofold: to prove that hyperbolicity characterizes the structural

2.4 Exercises 19
stability of the dynamics (the Stability Conjectures of Palis and Smale [96]),
and to conclude that most smooth systems are structurally stable (which turned
out not to be true).
Of course, hyperbolicity was implicit in some previous works, for instance
in the proof by E. Hopf [65] that the geodesic flow on any surface with neg-
ative curvature is ergodic. Anosov [2] introduced the class of globally hyper-
bolic diffeomorphisms and flows to extend this result to arbitrary dimension: he
observed that the geodesic flow on any compact manifold with negative (sec-
tional) curvature is globally hyperbolic and he proved that volume-preserving
globally hyperbolic C2 diffeomorphisms and flows are ergodic.
Especially in the context of diffeomorphisms and smooth flows, the con-
cept of (uniform) hyperbolicity has been broadened to that of non-uniform hy-
perbolicity, which refers to systems whose Lyapunov exponents are non-zero
at almost every point, relative to some distinguished invariant measure. This
was initiated by Pesin [99, 100] with major contributions also by Katok [68],
Mañé [86], Ledrappier [79], Ledrappier and Young [83, 84] and Barreira, Pesin
and Schmeling [19] among others. See also the book of Barreira and Pesin [18]
and references therein. For most of these results the system is assumed to be
C1+Hölder .
2.4 Exercises
Exercise 2.1 Show that Exs and Exu are unique when they exist. Moreover,
show they depend continuously on the point x ∈ M.
Exercise 2.2 Prove that if the Schrödinger cocycle FE is hyperbolic then the
spectral equation H(u) = Eu has no solution in 2 .
Exercise 2.3 Show that given B ∈ SL(2) such that B = 1, there exist unit
vectors s and u such that B(u) = B and B(s) = B−1 −1 = B−1 . These
vectors are unique, up to multiplication by −1, they are orthogonal, and their
images B(s) and B(u) are also orthogonal.
Exercise 2.4 Check that B|r1 /B|r2 ≤ B2 for any lines r1 , r2 ⊂ R2 and
B ∈ SL(2).
Exercise 2.5 Prove that a linear cocycle F is hyperbolic if and only if any
iterate F k , k = 0 is hyperbolic.
Exercise 2.6 Extend Exercise 2.1, Proposition 2.6, and Exercise 2.5 to linear
cocycles on vector bundles of arbitrary dimension.

3
Extremal Lyapunov exponents
As announced already in (2.2), the theorem of Furstenberg and Kesten [56]

states that, under a suitable integrability assumption,
the norm An (x) and the conorm An (x)−1 −1
of any linear cocycle have well-defined exponential rates as time n → ∞, for

almost every point x. The precise statement, which contains Theorem 2.2, is
given in Section 3.2. Indeed, we obtain the result as a straightforward conse-
quence of the so-called subadditive ergodic theorem of Kingman [74], which
we state and prove in Section 3.1.
We take the occasion, in Section 3.3, to illustrate Herman’s [62] subhar-
monic trick for bounding the largest Lyapunov exponents from below.
The multiplicative ergodic theorem of Oseledets [92] improves the Fursten-
berg–Kesten theorem in that it provides exponential rates for the iterates
An (x)v
of all vectors, rather than just for the norm and conorm of the matrices. The
full statements will appear in Chapter 4. Here (Section 3.4), we treat the special
case of cocycles in dimension 2, which is much simpler and contains a few of
the elements of the general case.
3.1 Subadditive ergodic theorem

Let (M, B, μ ) be a probability space and f : M → M be a measure-preserving
transformation. The positive part and the negative part of a measurable func-
tion ϕ : M → [−∞, +∞] are the non-negative functions defined by, respectively,
ϕ + (x) = max{0, ϕ (x)} and ϕ − (x) = max{0, −ϕ (x)}.
20

A measurable function ϕ is called (essentially) invariant if ϕ ( f (x)) = ϕ (x) for

μ -almost all x ∈ M. By definition, a measurable subset of M is invariant if its
characteristic function is invariant. A sequence ϕn : M → [−∞, +∞), n ≥ 1, of
measurable functions is subadditive, relative to f , if
ϕm+n ≤ ϕm + ϕn ◦ f m for all m, n ≥ 1.
Example 3.1 Given any measurable function ψ : M → R, consider its orbital
sum ϕn = ∑n−1 j=0 ψ ◦ f . Then ϕm+n = ϕm + ϕn ◦ f for every m and n and, in par-
j m
ticular, (ϕn )n is a subadditive sequence, and even an additive sequence, since

the equality always holds. It is clear that, conversely, every additive sequence
is the orbital sum of its first term.
Example 3.2 Given any measurable function A : M → GL(d), consider the
sequence ϕn (x) = log An (x), where An (x) = A( f n−1 (x)) · · · A( f (x))A(x). As
B1 B2 ≤ B1 B2 for every B1 , B2 ∈ GL(d), the sequence (ϕn )n is subad-
ditive.
Theorem 3.3 (Kingman) Let ϕn : M → [−∞, +∞), n ≥ 1 be a subadditive

sequence of measurable functions such that ϕ1+ ∈ L1 (μ ). Then (ϕn /n)n con-
verges μ -almost everywhere to some invariant function ϕ : M → [−∞, +∞).
Moreover, the positive part ϕ + is integrable and

1 1
ϕ d μ = lim ϕn d μ = inf ϕn d μ ∈ [−∞, +∞).
n n n n
The proof of this theorem will be presented later. An interesting feature is

that it does not use the ergodic theorem of Birkhoff. Thus, the latter can be
obtained as a consequence (Corollary 3.10).
3.1.1 Preparing the proof

A sequence (an )n in [−∞, +∞) is said to be subadditive if am+n ≤ am + an
holds for any m, n ≥ 1.
Lemma 3.4 If (an )n is a subadditive sequence then
an an
lim = inf ∈ [−∞, ∞). (3.1)
n n n n
Proof If am = −∞ for some m then, by subadditivity, an = −∞ for all n > m.

Then both sides of (3.1) are equal to −∞, and so the lemma holds in this case.
From now, assume that an ∈ R for all n.
Let L = infn (an /n) ∈ [−∞, +∞) and let L be any real number bigger than L.
Then we can find k ≥ 1 such that ak /k < L . For n > k, we may write n = kp+q,

22 Extremal Lyapunov exponents
where p and q are integer numbers such that p ≥ 1 and 1 ≤ q ≤ k. Then, by

subadditivity,
an ≤ akp + aq ≤ pak + aq ≤ pak + α ,
where α = max{ai : 1 ≤ i ≤ k}. Then,
an pk ak α
≤ + .
n n k n
Observe that pk/n converges to 1 and α /n converges to zero when n → ∞.
Therefore, since ak /k < L , we have
an
L≤ < L
n
for any large enough n. Making L → L, we conclude that
an an
lim = L = inf .
n n n n
This completes the argument.
Now let (ϕn )n be as in Theorem 3.3. By subadditivity,
ϕn ≤ ϕ1 + ϕ1 ◦ f + · · · + ϕ1 ◦ f n−1 .
This inequality remains true if we replace ϕn and ϕ1 by ϕn+ and ϕ1+ , respec-
tively. So, the hypothesis that ϕ1+ ∈ L1 (μ ) implies that ϕn+ ∈ L1 (μ ) for any n.
On the other hand, the hypothesis that (ϕn )n is subadditive implies that

an = ϕn d μ , n ≥ 1,
is a subadditive sequence in [−∞, +∞). Then, by Lemma 3.4,

an an
lim = inf = L ∈ [−∞, ∞)
n n n n
exists. Define ϕ− : M → [−∞, ∞] and ϕ+ : M → [−∞, ∞] by

ϕn ϕn
ϕ− (x) = lim inf (x) and ϕ+ (x) = lim sup (x).
n n n n
It is clear that ϕ− (x) ≤ ϕ+ (x) for every x ∈ M. We are going to prove that

ϕ− d μ ≥ L ≥ ϕ+ d μ , (3.2)
provided that every function ϕn is bounded from −∞. Consequently, the two
functions ϕ− and ϕ+ coincide at μ -almost every point and their integrals are
equal to L. This yields the theorem in this case, with ϕ = ϕ− = ϕ+ (the fact
that ϕ is an invariant function is proved in Exercise 3.2). At the end, we use
a truncation trick to remove the boundedness condition and, hence, prove the
theorem in complete generality.

3.1.2 Fundamental lemma

In this section we assume that ϕ− > −∞ at every point. Fix ε > 0 and define,
for each k ∈ N,

Ek = x ∈ M : ϕ j (x) ≤ j ϕ− (x) + ε for some j ∈ {1, . . . , k} .
It is clear that Ek ⊂ Ek+1 for any k. Moreover, the definition of ϕ− (x) implies

that M = k Ek . Define also,

ϕ− (x) + ε if x ∈ Ek
ψk (x) =
ϕ1 (x) if x ∈ Ekc .
The definition of Ek implies that ϕ1 (x) > ϕ− (x) + ε for every x ∈ Ekc . Thus,
ψk (x) decreases to ϕ− (x) + ε as k → ∞, for every x ∈ M. In particular, by the
monotonic convergence theorem,

ψk d μ → (ϕ− + ε ) d μ as k → ∞.
The crucial step in the proof of the theorem is the following estimate:
Lemma 3.5 For any n > k ≥ 1 and μ -almost every point x ∈ M,

n−k−1 n−1
ϕn (x) ≤ ∑ ψk ( f i (x)) + ∑ max{ψk , ϕ1 }( f i (x)).
i=0 i=n−k
This means that the sequence (ϕn )n is bounded from above by an orbital
sum of ψk and an “error term”. Since orbital sums are additive sequences (re-
call Example 3.1), this pretty much reduces our subadditive setting to the much
easier additive case. Recall that ψk converges to ϕ− + ε when k goes to infin-
ity. The last sum on the right-hand side of the inequality (the “error term”) is
negligible because it is the sum of a fixed number k of integrable functions,
and n may be taken to be much larger than k.
Proof Take x ∈ M such that ϕ− (x) = ϕ− ( f j (x)) for any j ≥ 1 (this is the case
for μ -almost every point; see Exercise 3.2). Consider the sequence, possibly
finite, of integer numbers
m0 ≤ n1 < m1 ≤ n2 < m2 < · · · (3.3)
defined inductively in the following way (see also Figure 3.1).

Take m0 = 0. Given j ≥ 1, let n j be the smallest integer greater or equal than
m j−1 that satisfies f n j (x) ∈ Ek (assuming it exists). Then, by definition of Ek ,
there exists m j such that 1 ≤ m j − n j ≤ k and
ϕm j −n j ( f n j (x)) ≤ (m j − n j )(ϕ− ( f n j (x)) + ε ). (3.4)

Ekc Ekc Ekc Ekc
m0 n1 m1 nl ml n
E E Ekc Ekc
m0 n1 m1 nl ml nl1 n
Figure 3.1 Splitting the trajectory of a point
This completes the definition of the sequence (3.3). Given any n ≥ k, let l ≥ 0
be largest such that ml ≤ n. By subadditivity,
n j −1
ϕn j −m j−1 ( f m j−1 (x)) ≤ ∑ ϕ1 ( f i (x))
i=m j−1
for any j = 1, . . . , l such that m j−1 = n j , and similarly for ϕn−ml ( f ml (x)). Thus,
l
ϕn (x) ≤ ∑ ϕ1 ( f i (x)) + ∑ ϕm j −n j ( f n j (x)) (3.5)
i∈I j=1
l
where I = j=1 [m j−1 , n j ) ∪ [ml , n). Observe that

l
ϕ1 ( f i (x)) = ψk ( f i (x)) for any i∈ [m j−1 , n j ) ∪ [ml , min{nl+1 , n}),
j=1
since f i (x) ∈ Ekc in all these cases. Moreover, as ϕ− is constant on orbits (Ex-
ercise 3.2) and ψk ≥ ϕ− + ε , the relation (3.4) implies that
m j −1 m j −1
ϕm j −n j ( f n j (x)) ≤ ∑ (ϕ− ( f i (x)) + ε ) ≤ ∑ ψk ( f i (x))
i=n j i=n j
for every j = 1, . . . , l. Thus, using (3.5) we conclude that

min{nl+1 ,n}−1 n−1
ϕn (x) ≤ ∑ ψk ( f i (x)) + ∑ ϕ1 ( f i (x)).
i=0 i=nl+1
Since nl+1 > n − k, the lemma is proved.
3.1.3 Estimating ϕ−
In order to prove (3.2), in this section we prove the following lemma:


Lemma 3.6 ϕ− d μ = L.
Proof Suppose, for the time being, that ϕn /n is uniformly bounded from be-
low; that is, there exists κ > 0 such that ϕn /n ≥ −κ for every n. In particular,
ϕ− ≥ −κ > −∞. Applying Fatou’s lemma to the sequence of non-negative
functions ϕn /n + κ , we obtain that ϕ− is integrable and

ϕn
ϕ− d μ ≤ lim d μ = L.
n
To prove the opposite inequality, observe that Lemma 3.5 implies that

1 n−k k
ϕn d μ ≤ ψk d μ + max{ψk , ϕ1 } d μ (3.6)
n n n
Note that max{ψk , ϕ1 } ≤ max{ϕ− + ε , ϕ1+ } and this last function is integrable.
So, the lim supn of the final term in (3.6) is non-positive. Hence, taking n → ∞

we obtain that L ≤ ψk d μ for any k. Then, making k → ∞ we conclude that

L≤ ϕ− d μ + ε

Finally, making ε → 0 we obtain that L ≤ ϕ− d μ . This proves the lemma
when ϕn /n is uniformly bounded from below.
Now, let us remove that hypothesis. Define, for each κ > 0,
ϕnκ = max{ϕn , −κ n} and ϕ−κ = max{ϕ− , −κ }.
The sequence (ϕnκ )n satisfies the hypotheses of Theorem 3.3: it is subadditive

and the positive part of ϕ1κ is integrable. Moreover, ϕ−κ = lim infn (1/n)ϕnκ .
Hence, the argument in the previous paragraph shows that

1
ϕ−κ d μ = inf ϕnκ d μ . (3.7)
n n
By the monotone convergence theorem, we also have that

ϕn d μ = inf ϕnκ d μ and ϕ− d μ = inf ϕ−κ d μ . (3.8)
κ κ
Combining (3.7) and (3.8), we find that

1 1
ϕ− d μ = inf ϕ−κ = inf inf ϕnκ d μ = inf ϕn d μ = L.
κ κ n n n n
This completes the proof of this lemma.

3.1.4 Bounding ϕ+ from above

We are going to show that ϕ+ d μ ≤ L if every ϕn is bounded from −∞. That
will complete the proof of (3.2). First, we prove a couple of auxiliary results:
Lemma 3.7 If φ : M → R is integrable with respect to μ then
1
lim φ ( f n (x)) = 0 for μ -almost all x ∈ M.
n n
Proof Fix any ε > 0. Since μ is invariant under f ,

μ {x ∈ M : |φ ( f n (x))| ≥ nε } = μ {x ∈ M : |φ (x)| ≥ nε }
∞ |φ (x)|
= ∑μ x∈M:k≤ < k+1 .
k=n ε
Adding these inequalities over every n ∈ N, we obtain
∞ ∞ |φ (x)|
∑ μ {x ∈ M : |φ ( f n (x))| ≥ nε } = ∑ kμ x ∈ M : k ≤ ε < k + 1
n=1 k=1

|φ |
dμ.
≤
ε
Since φ is assumed to be integrable, the right-hand side is finite. Hence, we
may use the Borel–Cantelli lemma to conclude that the set B(ε ) of points x
such that |φ ( f n (x))| ≥ nε for infinitely many values of n has measure zero.
By the definition of B(ε ), for every x ∈ / B(ε ) there exists p ≥ 1 such that

|φ ( f n (x))| < nε for every n ≥ p. Now, consider B = ∞ i=1 B(1/i). Then B has
measure zero and limn (1/n)φ ( f n (x)) = 0 for every x ∈ / B.
Lemma 3.8 For any fixed k,
ϕkn ϕn
lim sup = k lim sup .
n n n n
Proof The inequality ≤ is clear, since ϕkn /kn is a subsequence of ϕn /n. To
get the other inequality, write n = kqn + rn with rn ∈ {1, . . . , k}. By subadditiv-
ity,
ϕn ≤ ϕkqn + ϕrn ◦ f kqn ≤ ϕkqn + ψ ◦ f kqn
where ψ = max{ϕ1+ , . . . , ϕk+ }. Note that n/qn → k as n → ∞. Moreover, since
ψ ∈ L1 (μ ), we may use Lemma 3.7 to check that ψ ◦ f n /n converges to zero
at μ -almost every point. Thus, dividing the previous relation by n and taking
the lim sup when n → ∞, we obtain that
1 1 1 1 1
lim sup ϕn ≤ lim sup ϕkqn + lim sup ψ ◦ f kqn = lim sup ϕkq ,
n n n n n n k q q
as claimed in the lemma.


Lemma 3.9 Suppose that inf ϕn > −∞ for any n. Then ϕ+ d μ ≤ L.
Proof For each fixed k and n ≥ 1, consider θn = − ∑n−1
j=0 ϕk ◦ f . Observe
jk
that
θn d μ = −n ϕk d μ for every n, (3.9)
since f k preserves the measure μ . As the sequence (ϕn )n is subadditive, we

have that θn ≤ −ϕkn for any n. Then, using Lemma 3.8,
θn ϕkn ϕn
θ− = lim inf ≤ − lim sup = −k lim sup = −kϕ+
n n n n n n
and so
θ− d μ ≤ −k ϕ+ d μ . (3.10)
Observe also that the sequence (θn )n is additive: θm+n = θm + θn ◦ f km for any
m, n ≥ 1. As θ1 = −ϕk is bounded from above by − inf ϕk , we also have that
the function θ1+ is bounded and, consequently, integrable. Thus, we can apply
Lemma 3.6, together with (3.9), to conclude that

θn
θ− d μ = lim d μ = − ϕk d μ . (3.11)
n n
Putting the relations (3.10) and (3.11) together, we obtain that

1
ϕ+ d μ ≤ ϕk d μ .
k

Finally, taking the infimum on k yields ϕ+ d μ ≤ L.
Lemmas 3.6 and 3.9 prove the relation (3.2) and, thus, Theorem 3.3 when
inf ϕn > −∞ for any n. In the general case, define
ϕnκ = max{ϕn , −κ n} , ϕ−κ = max{ϕ− , −κ } and ϕ+κ = max{ϕ+ , −κ }
for any constant κ > 0. The previous arguments can be applied to the sequence
(ϕnκ )n for any fixed κ > 0. Therefore, ϕ+κ = ϕ−κ at μ -almost every point and
for any κ > 0. Since ϕ−κ → ϕ− and ϕ+κ → ϕ+ when κ → ∞, it follows that
ϕ− = ϕ+ at μ -almost every point. The proof of Theorem 3.3 is complete.
Corollary 3.10 (Birkhoff ergodic theorem) Let ϕ : M → R be a μ -integrable
function. Then
1 n−1
ϕ̃ (x) = lim
n
∑ ϕ ( f j (x))
n j=0
exists at μ -every point. Moreover, the function ϕ̃ is invariant and μ -integrable,

with ϕ̃ d μ = ϕ d μ .

Proof According to Example 3.1, this is a particular case of Theorem 3.3.
Now we can also deduce the following strong version of Lemma 3.7 (the
conclusion is the same while the assumption is weaker) that will be useful
later.
Corollary 3.11 Let φ : M → R be a measurable function such that the func-
tion ψ = φ ◦ f − φ is integrable with respect to μ . Then
1
lim φ ( f n (x)) = 0 for μ -almost all x ∈ M.
n n
In particular, this holds if φ ∈ L1 (μ ).

Proof Note that φ ( f n (x)) = φ (x) + ∑n−1
j=0 ψ ( f (x)) for every x and every n.
j
So, by the Birkhoff ergodic theorem applied to the integrable function ψ ,

1 1 1 n−1
lim φ ( f n (x)) = lim φ (x) + lim ∑ ψ ( f j (x)) (3.12)
n n n n n n
j=0
exists at μ -almost every point. On the other hand, since μ is f -invariant,

1
μ x : φ ( f n (x)) ≥ c = μ y : |φ (y)| > nc → 0 when n → ∞.
n
In other words, the sequence (1/n) φ ◦ f n converges to zero in measure. Thus,
the limit in (3.12) must be zero at μ -almost every point.
3.2 Theorem of Furstenberg and Kesten

Let F : M × Rd → M × Rd be given by F(x, v) = ( f (x), A(x)v), for some mea-
surable function A : M → GL(d). Let L1 (μ ) denote the space of μ -integrable
functions on M.
Theorem 3.12 (Furstenberg–Kesten) If log+ A±1 ∈ L1 (μ ) then
1 1
λ+ (x) = lim log An (x) and λ− (x) = lim log (An (x))−1 −1
n n n n
exist for μ -almost everywhere x ∈ M. Moreover, the functions λ± are invariant

and μ -integrable, with

1
λ+ d μ = lim log An (x)d μ
n n

1
λ− d μ = lim log (An (x))−1 −1 d μ .
n n

3.3 Herman’s formula 29
Proof This is a direct consequence of Theorem 3.3. Define

ϕn (x) = log An (x) and ψn (x) = log (An (x))−1
The hypothesis implies that ϕ1+ , ψ1+ ∈ L1 (μ ) and so ϕ1 (x), ψ1 (x) ∈ [−∞, +∞)
for μ -almost every x. Since the norm of linear operators is sub-multiplicative,
the sequences ϕn and ψn are subadditive (Example 3.2). Now the conclusion
of Theorem 3.12 follows immediately from applying Theorem 3.3 to these
sequences.
3.3 Herman’s formula

Michael Herman devised a method for bounding Lyapunov exponents from
below, based on the theory of (sub)harmonic functions. Here is one application.
Theorem 3.13 (Herman’s formula) Let S1 = R/Z and f : S1 → S1 be an
Let A : S → SL(2) be ofthe form A(x)
irrational 1 = A0 R2π x where
rotation.
σ 0 cos θ − sin θ
A0 = for some σ > 0 and Rθ = . Then
0 σ −1 sin θ cos θ
σ + σ −1
λ+ ≥ log (3.13)
2
where λ+ is the largest Lyapunov exponent of the cocycle defined by A, relative
to the unique f -invariant probability measure m.
Proof Let ω ∈ 2π (R \ Q) be the angle of the rotation f : S1 → S1 . By Theo-

rem 3.12,

1
λ+ = lim log An (y) dm(y)
n n S1
2π (3.14)
1
= lim log A0 Rx+(n−1)ω · · · A0 Rx+ω A0 Rx dx.
n 2π n 0
This does not depend on the choice of the norm; we take · to be given
by the maximum absolute value of the coefficients. The idea of the proof is
to extend the function in the last integral to a subharmonic function on the
complex plane and then deduce the claim of the theorem from the average
property of subharmonic functions (the value at any point is not bigger than
the average of the function over any circle centered at that point). For this,
consider the complex matrices
2
(z + 1)/2 (−z2 + 1)/(2i)
R(z) =
(z2 − 1)/(2i) (z2 + 1)/2

defined for z ∈ C. Observe that R(eiθ ) = eiθ Rθ for every θ . Thus, z → R(z) is
a kind of holomorphic extension of the family θ → Rθ of rotations. Now write
Cn (z) = A0 R(e(n−1)ω i z) · · · A0 R(eω i z)A0 R(z)
Then Cn (eix ) = eiτ An (x) with τ = nx + n(n − 1)ω /2, and so (3.14) becomes
2π
1
λ+ = lim log Cn (eix ) dx.
n 2π n 0
The function z → log Cn (z) is subharmonic because the absolute value of a
holomorphic function is subharmonic and the maximum of subharmonic func-
tions is subharmonic (see [45, § 19.4]). It follows that
1
λ+ ≥ lim log Cn (0).
n n
Moreover,
1 1
lim log Cn (0) = lim log (A0 R(0))n
n n n n
equals the logarithm of the spectral radius of A0 R(0). An explicit calculation

gives that the spectral radius is (σ + σ −1 )/2. This proves the claim.
For σ = 1 the theorem gives that λ+ is positive. Observe that deg( f ) = 1 and
deg(θ ) = 1, where θ (x) = x for x ∈ S1 . So, by the criterion in Example 2.9, the
cocycle cannot be hyperbolic.
3.4 Theorem of Oseledets in dimension 2

We are going to state and prove a version of the multiplicative ergodic theorem
for 2-dimensional cocycles, both invertible and non-invertible. The full state-
ment of the theorem will the topic of Chapter 4. The reason for including this
discussion of a special case is that it can handled by much simpler methods
and, thus, provides a useful glimpse into the main features of the theorem, in a
very transparent situation.
Let F : M × R2 → M × R2 be given by F(x, v) = ( f (x), A(x)v), for some
measurable function A : M → GL(2) satisfying log+ A±1 ∈ L1 (μ ).
3.4.1 One-sided theorem

Theorem 3.14 For μ -almost every x ∈ M,

(1) either λ− (x) = λ+ (x) and

1
lim log An (x)v = λ± (x), for all v ∈ R2 ;
n n
(2) or λ+ (x) > λ− (x) and there exists a vector line Exs ⊂ R2 such that

1 λ− (x) if v ∈ Exs \ {0}
lim log A (x)v =
n
n n λ+ (x) if v ∈ Rd \ Exs .
Moreover A(x)Exs = E sf (x) for every x as in (2).
Proof We treat the case when A takes values in SL(2) and leave it to the
reader (Exercise 3.3) to extend the conclusions to the general GL(2) setting.
Consider any x as in the conclusion of Theorem 3.12 and let λ (x) = λ+ (x) =
−λ− (x).
First, let us consider x ∈ M such that λ (x) = 0. For any v ∈ R2 ,
An (x)−1 v = An (x)−1 −1 v ≤ An (x)v ≤ An (x)v,
and so
1 1 1
log(An (x)−1 v) ≤ log An (x)v ≤ log(An (x)v).
n n n
Sending n → ∞ the left-hand side goes to −λ (x) = 0 and the right-hand side
goes to λ (x) = 0. So, we are done with the case when the exponent vanishes.
Now let us suppose that λ (x) > 0. Then An (x) ≈ enλ (x) is larger than 1
for every large n. So (Exercise 2.3), there exist unit vectors sn (x) and un (x),
respectively, most contracted and most expanded under An (x):
An (x)sn (x) = An (x)−1 and An (x)un (x) = An (x). (3.15)
Lemma 3.15 The angle (sn (x), sn+1 (x)) decreases exponentially:
1
lim sup log | sin (sn (x), sn+1 (x))| ≤ −2λ (x).
n n
Proof Let us denote αn = (sn (x), sn+1 (x)). Then
sn (x) = sin αn un+1 (x) + cos αn sn+1 (x).
It follows that
An+1 (x)sn (x) ≥ sin αn An+1 (x)un+1 (x) = | sin αn | An+1 (x),
and
An+1 (x)sn (x) ≤ A( f n (x))An (x)sn (x) = A( f n (x))An (x)−1 .

Hence
A( f n (x))
| sin αn | ≤ .
An+1 (x) An (x)
Corollary 3.11 ensures that
1
lim log A( f n (x)) = 0.
n n
So, taking limit superior on both sides of the previous expression, we find that
1
lim sup log | sin αn | ≤ −2λ (x).
n n
This completes the proof of the lemma.
Lemma 3.16 The sequence (sn (x))n is Cauchy in projective space.
Proof Consider any ε > 0 such that −2λ (x) + ε < 0. By Lemma 3.15,
| sin αn | ≤ en(−2λ (x)+ε )
for every large n. Then, up to replacing some s j (x) by −s j (x),
sn (x) − sn+1 (x) ≤ 2en(−2λ (x)+ε )
for every large n. Consequently, there exists C > 0 such that
sn+k (x) − sn (x) ≤ Cen(−2λ (x)+ε )
for every k ≥ 1 and n large enough. In particular, the sequence is Cauchy.
Define s(x) = lim sn (x) whenever the limit exists.
Lemma 3.17 The vector s(x) is contracted at the rate −λ (x):
1
lim log An (x)s(x) = −λ (x).
n n
Proof Let βn = (s(x), sn (x)). Then s(x) = cos βn sn (x) + sin βn un (x) and so
An (x)s(x) ≤ | cos βn |An (x)sn (x) + | sin βn |An (x)un (x).

Then (Exercise 3.6),

1
lim sup log An (x)s(x)
n n
1 1
≤ max lim sup | cos βn | An (x)sn (x), lim sup | sin βn | An (x)un (x)
n n n n
1
≤ max lim sup log An (x)−1 ,
n n
1 1
lim sup log | sin βn | + lim sup An (x)un (x)
n n n→∞ n
≤ max{−λ (x), −2λ (x) + λ (x)} = −λ (x).
This completes the proof.
Lemma 3.18 If v ∈ R2 is not collinear with s(x) then
1
lim log An (x)v = λ (x).
n n
Proof Denote γn = (v, sn (x)). Then v = cos γn sn (x) + sin γn un (x) and so
An (x)v ≥ | sin γn |An (x)un (x) − | cos γn |An (x)sn (x).
Note that | sin γn | is bounded from zero for all large n, because sn (x) → s(x)
and v is not collinear to s(x). Using also (3.15),
An (x)un (x) ≈ enλ (x) and An (x)sn (x) ≈ e−nλ (x) .
Substituting this in the previous inequality, and taking the limit as n → ∞, we
get
1
lim inf log An (x)v ≥ λ (x).
n n
From A (x)v ≤ A (x)v we immediately get the opposite inequality:
n n
1 1
lim sup log An (x)v ≤ lim log An (x) = λ (x).
n n n n
The proof of the lemma is complete.
Lemma 3.19 A(x)s(x) is collinear to s( f (x)).
Proof By Lemma 3.17,
1 1
lim log An ( f (x))A(x)s(x) = lim log An+1 (x)s(x) = −λ (x).
n n n n+1
By Lemma 3.18,
1
lim log An ( f (x))v = λ ( f (x)) = λ (x)
n n

for every v not collinear to s( f (x)). This implies the claim of the lemma.
Take Exs to be the line Rs(x) generated by s(x). Lemmas 3.15 through 3.19
contain all the claims in Theorem 3.14.
3.4.2 Two-sided theorem

Theorem 3.20 If f : M → M is invertible then for μ -almost every x ∈ M:
(1) either λ− (x) = λ+ (x) and

1
lim log An (x)v = λ± (x) for all v ∈ R2 ;
n→±∞ n
(2) or λ− (x) < λ+ (x) and there is a direct sum decomposition R2 = Exu ⊕ Exs
such that

1 λ− (x) if v ∈ Exs \ {0}
lim log A (x)v =
n
n→+∞ n λ+ (x) if v ∈ Rd \ Exs

1 λ+ (x) if v ∈ Exu \ {0}
lim log An (x)v =
n→−∞ n λ− (x) if v ∈ Rd \ Exu .
Moreover, in the latter case, A(x)Exu = E uf(x) and A(x)Exs = E sf (x) and the angle
between the two lines decreases subexponentially along orbits:
1
lim log | sin (E ufn (x) , E sf n (x) )| = 0.
n→±∞ n
Proof We deal with the case when A(x) ∈ SL(2) for all x and let the reader
extend the conclusions to the general setting. For x as in the conclusion of
Theorem 3.12, write λ (x) = λ+ (x) = −λ− (x).
The case λ (x) = 0 follows directly from Theorem 3.14 applied to F and to
its inverse F −1 . From now on, assume that λ (x) > 0. Let Exs = Rs(x) and Exu =
Ru(x) be the subspaces given by Theorem 3.14 for F and F −1 , respectively.
We need to check that these two lines are transverse:
Lemma 3.21 The vectors s(x) and u(x) are non-collinear, for μ -almost every
point in {x : λ (x) > 0}.
Proof We have limn→−∞ n−1 log An (x) | Exu = λ (x), by Theorem 3.14 ap-
plied to F −1 . Thus, the lemma will follow if we prove that
1
lim log An (x) | Exs = −λ (x).
n→−∞ n

From Theorem 3.14 applied to F −1 , we know that the limit on the left-hand
side exists. Let us denote it by ψ (x). Consider the sequence of functions
1 1
ψn (x) = log A−n (x) | Exs and φn (y) = log (An (y) | Eys )−1 .
−n −n
From the definition of A−n we get that ψn (x) = φn ( f −n (x)) for every n ≥ 1.
Since E s is one-dimensional, we may write
1
φn (x) = log An (y) | Eys .
n
Then limn→∞ φn (y) = −λ (y), by Lemma 3.17. In particular, the sequence φn
converges to −λ in measure; that is,
lim μ ({y : |φn (y) + λ− (y)| > δ }) = 0 for any δ > 0.
n
Then, since μ is f -invariant,

lim μ ({y : |φn ( f −n (x)) + λ− ( f −n (x))| > δ }) = 0 for any δ > 0.
n
In view of the previous observations, and the fact that the function λ− is in-
variant, this implies that
lim μ ({y : |ψn (x) + λ (x)| > δ }) = 0 for any δ > 0.
n
This means that ψn converges to −λ in measure. On the other hand, ψn con-

verges to ψ almost everywhere and, consequently, in measure. By uniqueness
of the limit, it follows that ψ = λ− , as claimed.
Lemma 3.22 Let θ (y) = (Eys , Eyu ). For μ -almost every x with λ (x) > 0,
1
lim log | sin θ ( f n (x))| = 0.
n→±∞ n
Proof From the elementary relation (Exercise 3.4)
| sin θ ( f (x))|
A(x)−2 ≤ ≤ A(x)2
| sin θ (x)|

we find that log | sin θ ( f (x))| − log | sin θ (x)| ≤ 2 log A(x) and, in partic-
ular, log | sin θ | ◦ f − log | sin θ | ∈ L1 (μ ). So we may conclude from Corol-
lary 3.11 that
1
lim log | sin θ ( f n (x))| = 0
n→±∞ n
as stated.
This finishes the proof of Theorem 3.20.

3.5 Notes
Theorem 3.3 is an application of the main result of Kingman [74]. The proof
we presented here is due to Avila and Bochi [10], inspired by a proof of the
Birkhoff ergodic theorem by Katznelson and Weiss [69]. See Ledrappier [80,
§ I.2] for another proof inspired by [69].
Theorem 3.12, which we obtained as a consequence of the subadditive er-
godic theorem, was actually proven before, by Furstenberg and Kesten [56].
The subadditive ergodic theorem will reappear in Chapter 4, as part of the
proof of the Oseledets theorem.
Theorem 3.13 is perhaps the simplest application of the method devised
by Herman [62] for estimating Lyapunov exponents from below. Avila and
Bochi [11] proved that (3.13) is, actually, an equality.
The arguments in Section 3.4 are inspired by the proof of Proposition 2.1.
Other proofs of the Oseledets theorem in two dimensions have been given by
Young [119] and Bochi [29].
3.6 Exercises
Exercise 3.1 Check that, given any measurable function A : M → GL(d),
log+ A±1 ∈ L1 (μ ) ⇔ | log A±1 | ∈ L1 (μ ) ⇔ log− A±1 ∈ L1 (μ ).
Moreover, if A takes values in SL(d), all these conditions are equivalent to

log A ∈ L1 (μ ).
Exercise 3.2 Check that the functions ϕ− and ϕ+ are invariant.
Exercise 3.3 Given A : M → GL(d), define c(x) = | det A(x)|1/d and then let
B : M → SL(d) be given by A(x) = c(x)B(x). Show that log c and log+ B±1
are in L1 (μ ) if log+ A±1 is in L1 (μ ). Check that, for μ -almost every x ∈ M,
1 1
lim log An (x)v = t(x) + lim log Bn (x)v for every v = 0,
n n n n
where t(x) = limn n−1 ∑n−1 j

j=0 log c( f (x)) is the Birkhoff time average of the
function log c. Deduce that the associated cocycles F(x, v) = ( f (x), A(x)v) and
G(x, v) = ( f (x), B(x)v) have the same Oseledets flag/decomposition at almost
every point. Moreover, the Lyapunov spectrum of the former cocycle is the
t(x)-translate of the Lyapunov spectrum of the latter.

3.6 Exercises 37
Exercise 3.4 Show that, for any B ∈ SL(2) and non-zero vectors u, v ∈ R2 ,
| sin (B(u), B(v))|
B−2 ≤ ≤ B2 .
| sin (u, v)|
Exercise 3.5 Prove that limn→∞ n−1 log | sin (An (x)v, E ufn (x) )| = −2λ (x) for
/ Exs . If the cocycle F is invertible then the same holds as n → −∞, with
any v ∈
Ex in the place of Exs .
u
Exercise 3.6 Show that if an , bn > 0 for every n then

1 1 1
(1) lim sup log(an + bn ) = max lim sup log an , lim sup log bn .
n n n n n n

1 1 1
(2) lim sup log a2n + b2n = max lim sup log an , lim sup log bn .
n n n n n n
1 1 1
(3) lim inf log(an + bn ) ≥ max lim inf log an , lim inf log bn .
n n n n n n

1 1 1
(4) lim inf log a2n + b2n ≥ max lim inf log an , lim inf log bn .
n n n n n n

4
Multiplicative ergodic theorem
This chapter is devoted to the fundamental result in the theory of Lyapunov ex-
ponents: the multiplicative ergodic theorem of Oseledets [92]. The statements
are given in Section 4.1 and proofs appear in Section 4.2 (one-sided version,
for general cocycles) and Section 4.3 (two-sided version, for invertible cocy-
cles). Some related issues are discussed in Section 4.4.
Throughout, we take (M, B, μ ) to be a complete separable probability space.
Recall that complete means that any subset of a measurable set with zero mea-
sure is measurable (and has zero measure) and separable means that there
exists a countable family E ⊂ B such that for any ε > 0 and any B ∈ B there
exists E ∈ E such that μ (BΔE) < ε .
Let F : M × Rd → M × Rd be the linear cocycle defined by a measurable
function A : M → GL(d) over a measurable transformation f : M → M that pre-
serves the probability measure μ . It is assumed that the functions log+ A±1
are integrable with respect to μ .
4.1 Statements
Recall that the Grassmannian of Rd is the disjoint union Gr(d) of the Grass-
mannian manifolds Gr(l, d), 0 ≤ l ≤ d. A map x → Vx with values in Gr(d) is
measurable if and only if there exist measurable, linearly independent vector
fields that span Vx at each point (Exercise 4.1).
A flag in Rd is a decreasing family W 1 · · · W k {0} of vector subspaces
of the d-dimensional Euclidean space. The flag is called complete if k = d and
dimW j = d + 1 − j for all j = 1, . . . , d.
Theorem 4.1 (Oseledets) For μ -almost every x ∈ M there is k = k(x), num-
38

4.1 Statements 39
bers λ1 (x) > · · · > λk (x) and a flag Rd = Vx1 · · · Vxk {0}, such that, for
all i = 1, . . . , k:
(a) k( f (x)) = k(x) and λi ( f (x)) = λi (x) and A(x) ·Vxi = V fi (x) ;
(b) the maps x → k(x) and x → λi (x) and x → Vxi (with values in N and R and
Gr(d), respectively) are measurable;
1
(c) lim log An (x)v = λi (x) for all v ∈ Vxi \Vxi+1 (with Vxk+1 = {0}).
n n
When μ is ergodic, it follows that the values of k(x) and of each of the
Lyapunov exponents λi (x) are constant on a full measure subset, and so are the
dimensions of the Oseledets subspaces Vxi . We call dimVxi −dimVxi+1 the multi-
plicity of the corresponding Lyapunov exponent λi (x). The Lyapunov spectrum
of F is the set of all Lyapunov exponents, each counted with multiplicity. The
Lyapunov spectrum is simple if all Lyapunov exponents have multiplicity 1 or,
equivalently, if the Oseledets flag is complete.
When the transformation f : M → M is invertible, we have a stronger con-
clusion:
Theorem 4.2 (Oseledets) Suppose that f : M → M is invertible. Then, for μ -

almost every x ∈ M, there exists a direct sum decomposition Rd = Ex1 ⊕· · ·⊕Exk
such that, for every i = 1, . . . , k:
k j
(a) A(x) · Exi = E if (x) and Vxi = j=i Ex and
1
(b) lim log An (x)v = λi (x) for all v ∈ Exi \ {0} and
n→±∞ n
j
1
(c) lim log sin E if n (x) , E f n (x) = 0 whenever I ∩ J = 0.
/
n→±∞ n
i∈I j∈J
By definition, the angle (V,W ) between two subspaces V and W of Rd is

the smallest angle between non-zero vectors v ∈ V and w ∈ W .
Clearly, the multiplicity of each Lyapunov exponent λi coincides with the di-
mension dim Exi = dimVxi − dimVxi+1 of the associated Oseledets subspace Exi .
Thus, the Lyapunov spectrum is simple if and only if dim Exi = 1 for every i.

Remark 4.3 The sums kj=i Exj of Oseledets subspaces corresponding to the
smallest Lyapunov exponents depend only on the forward iterates of the linear
cocycle, since they coincide with the subspaces Vxi . Analogously, any sum of
Oseledets subspaces corresponding to the largest Lyapunov exponents depends
only on the backward iterates.

40 Multiplicative ergodic theorem
4.2 Proof of the one-sided theorem

As we are going to see, it is rather easy to get a weaker form of Theorem 4.1,
where limit is replaced by limit superior in part (c). In Section 4.2.1 we find
the Lyapunov exponents λi (x) and the Oseledets subspaces Vxi and we check
that they are invariant. Then, in Section 4.2.2, we show that these objects are
measurable.
The main difficulty is to prove that the limit in part (c) actually exists.
Roughly speaking, the proof is by induction on the number k of Oseledets sub-
spaces. Sections 4.2.3 and 4.2.4 prepare the proof of the case k = 1, whereas
the inductive step is prepared in Section 4.2.5. In Section 4.2.6 we wrap up the
proof.
4.2.1 Constructing the Oseledets flag

For each v ∈ Rd \ {0} and x ∈ M, define
1
λ (x, v) = lim sup log An (x)v. (4.1)
n n
In the next lemma we collect a few basic properties of this function. Recall that,
by Theorem 3.12, the extremal Lyapunov exponents λ± (x) are well defined and
finite for μ -almost every x.
Lemma 4.4 For μ -almost every x ∈ M and any v, v ∈ Rd \ {0},
(i) λ− (x) ≤ λ (x, v) ≤ λ+ (x);

(ii) λ (x, cv) = λ (x, v) for v ∈ Rd \ {0} and c = 0;
(iii) λ (x, v + v ) = max{λ (x, v), λ (x, v )} if v + v = 0;
(iv) λ (x, v) = λ ( f (x), A(x)v).
Proof For claim (i), just observe that
An (x)−1 −1 v ≤ An (x)v ≤ An (x) v
and take the limit/limit superior as n → ∞. Part (ii) follows directly from the
definition (4.1). Claim (iii) is a direct consequence of Exercise 3.6. Part (iv)
also follows directly from the definition.
For any x ∈ M as in Lemma 4.4, take k(x) ≥ 1 to be the number of elements

of {λ (x, v) : v ∈ Rd \ {0}}, let λ1 (x) > · · · > λk(x) (x) be those elements, in
decreasing order, and let

Vxi = v ∈ Rd \ {0} : λ (x, v) ≤ λi (x) ∪ {0} for i = 1, . . . , k(x).

By part (i) of Lemma 4.4, every λi (x) is a real number. Moreover, parts (ii) and
(iii) ensure that Vxi is a vector subspace, for every i.
It follows directly from these definitions that the Vxi constitute a flag:
k(x)
Rd = Vx1 · · · Vx {0}.
In particular, k(x) ≤ d. Another direct consequence of the definitions is:
λ (x, v) = λi (x) for every v ∈ Vxi \Vxi+1 . (4.2)
In particular, by part (i) of Lemma 4.4,
λ− (x) ≤ λi (x) ≤ λ+ (x) for every i = 1, . . . , k(x). (4.3)
Finally, by Lemma 4.4(iv), the functions x → k(x) and x → λi (x) and x → Vxi
are all invariant: for every i = 1, . . . , k(x),
k( f (x)) = k(x) and λi ( f (x)) = λi (x) and V fi (x) = A(x)Vxi .
This proves part (a) of the theorem.
4.2.2 Measurability
Next, we are going to show that the functions k, λi and V i are measurable,
as claimed in part (b) of the theorem. We will use the following criteria for
measurability, whose proof can be found in Castaing and Valadier [43].
Let (X, B, μ ) be a complete probability space and Y be a separable complete
metric space. Denote by B(Y ) the Borel σ -algebra of Y .
Proposition 4.5 (Theorem III.23 in [43]) Let B ⊗ B(Y ) be the product σ -
algebra in X ×Y and π : X ×Y → X be the canonical projection. Then π (E) ∈
B for every E ∈ B ⊗ B(Y ).
Proposition 4.6 (Theorem III.30 in [43]) Let K (Y ) be the space of compact
subsets of Y , with the Hausdorff topology. The following are equivalent:
(i) a map x → Kx from X to K (Y ) is measurable;
(ii) its graph {(x, y) : y ∈ Kx } is in B ⊗ B(Y );
(iii) {x ∈ X : Kx ∩U = 0} / ∈ B for any open set U ⊂ Y .
Moreover, any of these conditions implies that there exists a measurable map
σ : X → Y such that σ (x) ∈ Kx for every x ∈ X.
Here is a useful application to maps with values in the Grassmannian:
Corollary 4.7 Let (X, B, μ ) be a complete probability space and x → Vx be
a map from X to the Grassmannian Gr(d). The following are equivalent:

(a) the map x → Vx is measurable;

(b) its graph {(x, v) ∈ X × Rd : v ∈ Vx } is in B ⊗ B(Rd ).
Proof For each subspace V of Rd , define PV = {ξ ∈ PRd : ξ ⊂ V }. Consider

the following projectivized versions of conditions (a) and (b):
(a ) the map x → PVx , from M to K (PRd ) is measurable;

(b ) its graph {(x, ξ ) ∈ M × PRd : ξ ∈ PVx } is in B ⊗ B(PRd ).
Note that (a ) ⇔ (b ), by Proposition 4.6. Next, consider the map
P : Gr(Rd ) → K (PRd ), V → PV.
This map is continuous and injective. Since Gr(d) is compact, it follows that P
is a homeomorphism onto its image. The composition of P with the map in (a)
is, precisely, the map in (a ). Thus, (a) ⇔ (a ). Similarly,
{(x, v) ∈ M × Rd : v ∈ Vx }

= p−1 {(x, ξ ) ∈ M × PRd : ξ ∈ PVx } ∪ M × {0} .
where

p : M × Rd \ {0} → M × PRd , p(x, v) = (x, [v]).
So, since the map p is measurable, (b ) ⇒ (b). Next, we prove that (b) ⇒ (b ).
Let π : M × Rd → M be the canonical projection to the first coordinate and let
U be any open subset of PRd . Observe that
{x ∈ M : PVx ∩U = 0}
/
(4.4)
= π {(x, v) ∈ M × Rd : v ∈ Vx } ∩ p−1 (M ×U) .
Assume (b); that is, take {(x, v) ∈ M × Rd : v ∈ Vx } to be in B ⊗ B(Rd ). Then,

using Proposition 4.5, it follows from (4.4) that {x ∈ M : PVx ∩U = 0}
/ is in B.
By Proposition 4.6(iii), this implies the statement in (b ):
{(x, ξ ) ∈ M × PRd : ξ ∈ PVx } is in B ⊗ B(PRd ),
Thus (b) ⇒ (b ). This completes the proof that (a) ⇔ (b).
Let e1 , . . . , ed be an arbitrary basis of Rd . By Exercise 3.6,
λ1 (x) = max{λ (x, ei ) : 1 ≤ i ≤ d}.
Since (x, v) → λ (x, v) is measurable, it follows that x → λ1 (x) is a measurable

function and
V∗2 = {(x, v) ∈ M × Rd \ {0} : λ (x, v) < λ1 (x)}

is a measurable subset of M × Rd . Observe that

π (V∗2 ) = {x ∈ M : λ (x, v) < λ1 (x) for some v ∈ Rd \ {0}}
= {x ∈ M : k(x) ≥ 2}.
By Proposition 4.5, this is a measurable subset of M. For x ∈ π (V∗2 ), define
Vx2 = {v ∈ Rd : (x, v) ∈ V 2 } ∪ {0}.
Since V∗2 ∪ (M × {0}) is a measurable subset of M × Rd , Corollary 4.7 gives

that x → Vx2 is a measurable map on π (V∗2 ). Then, by Exercise 4.1, each
Ml2 = {x ∈ π (V∗2 ) : dimVx2 = l}, 1≤l≤d
is a measurable subset and for each l there exist measurable functions
v1 , . . . , vl : Ml2 → Rd
such that {v1 (x), . . . , vl (x)} is a basis of Vx2 for every x. Then
λ2 (x) = max{λ (x, ui (x)) : 1 ≤ i ≤ l}
is a measurable function on Ml2 , for every 1 ≤ l ≤ d. Next, let
V∗3 = {(x, v) ∈ M × Rd \ {0} : λ (x, v) < λ2 (x)}
and Vx3 = {v ∈ Rd : (x, v) ∈ V∗3 } ∪ {0} for each x ∈ π (V∗3 ). Just as before,
π (V∗3 ) = {x ∈ M : k(x) ≥ 3} is a measurable subset of M, the map x → Vx3 is
measurable on π (V∗3 ), each
Ml3 = {x ∈ π (V∗3 ) : dimVx3 = l}, 1≤l≤d
is a measurable subset and, for each l, there exist measurable functions
v1 , . . . , vl : Ml3 → Rd
such that {v1 (x), . . . , vl (x)} is a basis of Vx3 for every x. Repeating this argument
successively, we find that
(i) {x ∈ M : k(x) ≥ s} is measurable for every s ≥ 1; thus, the function x →

k(x) is measurable;
(ii) each Lyapunov exponent λi (x) is a measurable function of x on the set
π (V∗i ) = {x ∈ M : k(x) ≥ i};
(iii) each Oseledets subspace Vxi is a measurable function of x on π (V∗i ).
This proves part (b) of the theorem.

4.2.3 Time averages of skew products

The relation (4.2) gives a weaker version of part (c), with limit superior instead
of limit. We are left to prove that the limit does exist. The heart of the argument
is Proposition 4.8 below.
Let P be a compact metric space. Consider the space C0 (P) of continuous
real functions on P, endowed with the norm
ψ 0 = sup{|ψ (ξ )| : ξ ∈ P}.
It is well known that C0 (P) is a separable space (see [114, Theorem A.3.13]).
Denote by F the space of measurable functions Ψ : M × P → R such that
Ψ(x, ·) ∈ C0 (P) for μ -almost every x and x → Ψ(x, ·)0 is integrable with
respect to μ , modulo identifying any two functions that coincide on some N ×P
with μ (N) = 1. Then,

Ψ1 = Ψ(x, ·)0 d μ (x) (4.5)
defines a complete norm on F . Using the assumption that (M, B, μ ) is a sep-

arable probability space, one also gets that (F , · ) is separable. See Exer-
cise 4.2.
Let M (μ ) be the space of probability measures on M ×P such that π∗ η = μ ,
where π : M × P → M is the canonical projection. The weak∗ topology is the
smallest topology on M (μ ) such that the operator

M (μ ) → R, η → Ψ dη is continuous, for every Ψ ∈ F .
A variation of the classical Banach–Alaoglu argument (see Theorem 2.1 in

[114]) shows that this topology is metrizable and compact (Exercise 4.4).
Proposition 4.8 Let G : M × P → M × P be a measurable map of the form

G (x, v) = ( f (x), Gx (v)), where Gx : P → P is a continuous map for μ -almost
every x. Given any Φ ∈ F , define
n−1 n−1
1 1
I(x) = lim inf ∑ Φ(G j (x, v)) and S(x) = lim sup ∑ Φ(G j (x, v)).
n n v∈P j=0 n n v∈P
j=0
The limits exist at μ -almost every point and there exist G -invariant probability
measures ηI ∈ M (μ ) and ηS ∈ M (μ ) such that

Φ d ηI = I dμ and Φ d ηS = S dμ. (4.6)
Proof The roles of the infimum and the supremum are exchanged if one re-
places Φ by −Φ. Thus, it suffices to prove the claims pertaining to either one

of them. For each n ≥ 1, define

n−1
In (x) = inf
v∈P
∑ Φ(G j (x, v)) (4.7)
j=0
Every In is a measurable function, since it coincides with the infimum taken

over any countable dense subset of P. Note that I1 is integrable, because Φ ∈
F . Moreover, the sequence (In )n is super-additive:
Im+n (x) ≥ Im (x) + In ( f m (x)) for every m, n and x.
Then, by the subadditive ergodic theorem applied to (−In )n ,

1
I(x) = lim In (x) exists for μ -almost every x.
n n
Consider the subsets Γn of M × P defined by

n−1
Γn = {(x, v) ∈ M × P : ∑ Φ(G j (x, v)) = In (x)}
j=0
and let Γn (x) = {v ∈ P : (x, v) ∈ Γn } for each x ∈ M. This is a non-empty com-

pact subset of P, for μ -almost every x, since P is compact and v → Φ(G j (x, v))
is continuous for every j. Then, since Γn is measurable, Proposition 4.6 gives
that there exists a measurable map vn : M → P such that vn (x) ∈ Γn (x) for
μ -almost every x. Now let

1 n−1 j
ξn = δx,vn (x) d μ (x) and ηn = ∑ G∗ ξn .
n j=0
It is clear that π∗ ξn = μ for every n. It follows that π∗ ηn = μ for every n,

because G is a skew product over f and the measure μ is f -invariant. By the
compactness of M (μ ), there exists (nk )k → ∞ such that (ηnk )k converges to
some ηI ∈ M (μ ). Given any Ψ ∈ F ,
1

(Ψ ◦ G ) d ηnk − Ψ d ηnk = (Ψ ◦ G nk ) d ξnk − Ψ d ξnk
nk
(4.8)
2 2
≤ Ψ(x, ·)0 d μ (x) = Ψ1 .
nk nk
Observe that Ψ ◦ G ∈ F , since G is measurable, Gx is continuous and
(Ψ ◦ G )(x, ·)0 ≤ Ψ( f (x), ·)0
for μ -almost every x. Thus, by the definition of the weak ∗

topology, the left-
hand side of (4.8) converges to | (Ψ ◦ G ) d ηI − Ψ d ηI . The right-hand side

converges to zero. So, we have shown that

(Ψ ◦ G ) d ηI = Ψ d ηI for all Ψ ∈ F .
This implies that ηI is a G -invariant measure (Exercise 4.3). Finally,

1 nk −1
Φ d ηI = lim
k
Φ d ηnk = lim
k
∑ Φ(G j (x, vnk (x))) d μ (x)
nk j=0

1
= lim Ink (x) d μ (x) = I dμ
k nk
(the last step uses the subadditive ergodic theorem). This proves our claim.
Corollary 4.9 In the setting of Proposition 4.8, for μ -almost every x there
exist vI (x) ∈ P and vS (x) ∈ P such that
1 n−1 1 n−1
lim
n
∑ Φ(G j (x, vI (x))) = I(x) and
n j=0
lim
n
∑ Φ(G j (x, vS (x))) = S(x).
n j=0
Proof Up to replacing Φ by −Φ, it suffices to prove either one of the two

claims. It is clear that, for every v ∈ P and μ -almost every x,
1 n−1 1 n−1
I(x) ≤ lim inf
n
∑
n j=0
Φ(G j (x, v)) ≤ lim sup ∑ Φ(G j (x, v)) ≤ S(x).
n n j=0
(4.9)
By the Birkhoff ergodic theorem, given any G -invariant measure η ,

1 n−1
Φ̃(x, v) = lim
n
∑ Φ(G j (x, v))
n j=0
(4.10)

exists for η -almost every point (x, v), and satisfies Φ̃ d η = Φ d η . Taking

η = ηI , this gives that Φ̃ d η = I d μ . In view of (4.9), this implies that the
measurable set
E = {(x, v) ∈ M × P : Φ̃(x, v) = I(x)}
has full ηI -measure. By Proposition 4.5, the set π (E) is measurable. Moreover,
it has full μ -measure: since π∗ ηI = μ ,
μ (π (E)) = ηI (π −1 (π (E))) ≥ ηI (E) = 1.
Now, it is clear that every x ∈ π (E) satisfies the claim relative to I and vI .
Remark 4.10 When G is invertible, we may replace (4.10) by
1 n−1
Φ̃(x, v) = lim
n→±∞ n
∑ Φ(G j (x, v))
j=0

(the two limits exist and are equal almost everywhere). Then we get
1 n−1 1 n−1
lim
n→±∞ n
∑ Φ(G j (x, vI (x))) = I(x) and lim
n→±∞ n
∑ Φ(G j (x, vS (x))) = S(x)
j=0 j=0
in the conclusion of Corollary 4.9.
4.2.4 Applications to linear cocycles

Let us go back to linear cocycles. We are going to deduce:
Proposition 4.11 Let x → Vx be a measurable invariant sub-bundle for the

cocycle F : M × Rd → M × Rd . Then, for μ -almost every point x,
1
(a) lim log (An (x) | Vx )−1 −1 = min{λ (x, v) : v ∈ Vx \ {0}};
n n
1
(b) lim log (An (x) | Vx ) = max{λ (x, v) : v ∈ Vx \ {0}}.
n n
Proof Up to restricting to convenient invariant measurable subsets of M, and

considering the normalized restriction of μ to such subsets, we may suppose
that the dimension of Vx is constant. Let l = dimVx for every x. By Exercise 4.4
and Gram–Schmidt, we may find measurable functions {v1 (x), . . . , vl (x)} that
constitute a orthonormal basis of Vx at every point. Using such a basis, we may
identify Vx with Rl through an isometry.
Denote D(x) = A(x) | Vx and let G : M × Rl → M × Rl be the linear cocycle
induced by the map D : M → GL(l) over f . Clearly, D(x) ≤ A(x) and
D(x)−1 ≤ A(x)−1 and so the hypotheses imply log+ D±1 ∈ L1 (μ ). We
also consider the projectivization
G : M × PRl → M × PRl
of G. Consider Φ : M × PRl → R, defined by

D(x)v
Φ(x, [v]) = log .
v
It is clear that Φ ∈ F ; that is, Φ is measurable and Φ(x, ·) ∈ C0 (PRd ) for every
x. For any vector v ∈ Rl \ {0} and any n ≥ 0,
1 n−1 1 Dn (x)v

lim sup
n
∑
n j=0
Φ(G j (x, [v])) = lim sup log
n n v
= λ (x, v).
Moreover,
In (x) = log Dn (x)−1 −1 and Sn (x) = log Dn (x)

for every n ≥ 0, and so

I(x) = λ− (x) and S(x) = λ+ (x).
According to Lemma 4.4(i), we have λ− (x) ≤ λ (x, v) ≤ λ+ (x) for every v = 0.
So, the conclusion of Corollary 4.9 means that
min{λ (x, v) : v ∈ Rl \ {0}} = λ− (x) and max{λ (x, v) : v ∈ Rl \ {0}} = λ+ (x),
as we wanted to prove.
Remark 4.12 If f is invertible, using Remark 4.10 we get:
1
(a) lim log (An (x) | Vx )−1 −1 = min{λ (x, v) : v ∈ Vx \ {0}};
n→±∞ n
1
(b) lim log (An (x) | Vx ) = max{λ (x, v) : v ∈ Vx \ {0}}.
n→±∞ n
The special case Vx = Rd in Proposition 4.11 gives:
Corollary 4.13 λ+ (x) = λ1 (x) and λ− (x) = λk(x) (x) for μ -almost every x.
4.2.5 Dimension reduction

Let us consider the following situation. Let x → Vx be a measurable invariant
sub-bundle and α (x) < β (x) be measurable invariant functions such that
(i) λ (x, v) ≤ α (x) for every v ∈ Vx \ {0};
(ii) λ (x, u) ≥ β (x) for every u ∈ Rd \Vx ;
for μ -almost every x. By Proposition 4.11, condition (i) implies
1
(iii) lim log An (x) | Vx ≤ α (x).
n n
Let Vx⊥ denote the orthogonal complement of Vx . Note that x → Vx⊥ is mea-
surable, since x → Vx is taken to be measurable and the orthogonal complement
map ⊥: Gr(l, d) → Gr(d − l, d) is a diffeomorphism for every l. Let

B(x) 0
A(x) = (4.11)
C(x) D(x)
be the expression of A relative to the direct sum decomposition Rd = Vx⊥ ⊕Vx .

Clearly, D(x) is just the restriction A(x) | Vx of A(x) to the invariant sub-bundle.
It is also clear that the norms of B(x), C(x) and D(x), and their inverses, are
bounded by the norms of A(x)±1 . Thus, the hypotheses ensure that
log B±1 , log C±1 , log D±1 ∈ L1 (μ ). (4.12)

Proposition 4.14 For μ -almost every x, any u ∈ Vx⊥ \ {0} and any v ∈ Vx ,
1 1
(a) lim sup log Bn (x)u = lim sup log An (x)(u + v).
n n n n
1 1
(b) if lim log B (x)u exists then lim log An (x)(u + v) exists for all v ∈
n
n n n n
Vx , and the two limits coincide.
Proof First, we prove (a). By Exercise 3.6 and the assumptions (i) and (ii),
1 1
lim sup logAn (x)(u + v) ≤ lim sup log An (x)u + An (x)v
n n n n
1 1
= max lim sup log An (x)u, lim sup log An (x)v
n n n n
1
= lim sup log An (x)u
n n
and, analogously,
1 1
lim sup log An (x)u ≤ lim sup log An (x)(u + v) + An (x)v
n n n n
1 1
= max lim sup log An (x)(u + v), lim sup log An (x)v
n n n n
1
= lim sup log An (x)(u + v).
n n
This proves that, for any u ∈ Vx⊥ \ {0} and v ∈ Vx ,
1 1
lim sup log An (x)(u + v) = lim sup log An (x)u.
n n n n
So, from now on we consider v = 0. We will need the following fact:
Lemma 4.15 Given any ε > 0, there exists a measurable function dε (x) > 0
such that
Dn ( f m (x)) ≤ dε (x)eα (x)n+(m+n)ε for every m, n ≥ 0.
Proof Define
bε (x) = sup{Dn (x) e−n(α (x)+ε ) : n ≥ 0}.
Observe that 1 ≤ bε (x) < ∞ at μ -almost every point, by condition (iii). From
bε ( f (x)) = sup{Dn ( f (x)) e−n(α (x)+ε ) : n ≥ 0} (4.13)
we get that
bε ( f (x)) ≥ D(x)−1 eα (x)+ε sup{Dn (x)e−n(α (x)+ε ) : n ≥ 1}.

There are two possibilities. If the supremum on the right-hand side coincides
with bε (x), then this yields
log bε ( f (x)) ≥ log bε (x) + log D(x)−1 + α (x) + ε .
Otherwise, if the supremum in the definition of bε (x) is attained at n = 0, then

bε (x) = 1 ≤ bε ( f (x)). In any event,
log bε ( f (x)) − log bε (x) ≥ min{− log+ D(x) + α (x) + ε , 0}. (4.14)
Similarly, (4.13) yields bε ( f (x)) ≤ D(x)−1 eα (x)+ε bε (x), and so
log bε ( f (x)) − log bε (x) ≤ log+ D(x)−1 + α (x) + ε . (4.15)
The inequalities (4.14)–(4.15), together with (4.12), ensure that log bε ◦ f −

log bε is in L1 (μ ). Then, using Corollary 3.11,
1
lim log bε ( f m (x)) = 0
m m
Let dε (x) = sup{bε ( f m (x))e−ε m
: m ≥ 0}. The previous relation ensures that
dε (x) is finite at μ -almost every point. By definition, bε ( f m (x)) ≤ dε (x)eε m
and
Dn ( f m (x)) ≤ bε ( f m (x))en(α (x)+ε ) ≤ dε (x)eα (x)n+(m+n)ε
for every m, n ≥ 0.
Going back to the proof of Proposition 4.14, observe that

n
B (x) 0
An (x) = for every n ≥ 0,
Cn (x) Dn (x)
where
n−1
Cn (x) = ∑ Dn− j−1 ( f j+1 (x))C( f j (x))B j (x). (4.16)
j=0
Given x ∈ M and u ∈ Vx \ {0}, consider

1
γ = max α (x), lim sup log Bn (x)u
n n
In particular,
1
lim sup log Bn (x)u ≤ γ (4.17)
n n
which implies that, given any ε > 0, there exists a real number bε such that
B j (x)u ≤ bε e j(γ +ε ) for every j. (4.18)

By Corollary 3.11 and (4.12), there exists a measurable function cε such that
C( f j (x)) ≤ cε (x)e jε for every j. (4.19)
By Lemma 4.15, there exists a measurable function dε such that
Di ( f j (x)) ≤ dε (x)eiα (x)+(i+ j)ε for every i and j. (4.20)
Substituting the estimates (4.18)–(4.20) in (4.16), we find that

n−1
Cn (x)u ≤ ∑ dε (x)e(n− j−1)α (x)+nε cε (x)e jε bε e j(γ +ε ) ≤ naε en(γ +3ε ) ,
j=0
where aε = bε cε (x)dε (x). Consequently,
1
lim sup log Cn (x)u ≤ γ + 3ε . (4.21)
n n
Since An (x)u = (Bn (x)u,Cn (x)u), we have that
An (x)u2 = Bn (x)u2 + Cn (x)u2 ,
So, by Exercise 3.6, the inequalities (4.17) and (4.21) give

1 1
lim sup log Bn (x)u ≤ lim sup log An (x)u ≤ γ + 3ε .
n n n n
Since ε > 0 is arbitrary, this implies that

1 1
lim sup log Bn (x)u ≤ lim sup log An (x)u ≤ γ . (4.22)
n n n n
Now, the assumption (ii) yields

1
α (x) < β (x) ≤ lim sup log An (x)u ≤ γ
n n
and so α (x) is strictly smaller than γ . So, by the definition of γ ,
1
lim sup log Bn (x)u = γ . (4.23)
n n
The two relations (4.22) and (4.23) yield part (a) of the proposition.
Now we can readily deduce part (b). Note that
An (x)(u + v)2 = Bn (x)u2 + Cn (x)u + Dn (x)v2 ,

since An (x)(u + v) = (Bn (x)u,Cn (x)u + Dn (x)v). So, by Exercise 3.6,

1
lim inf log An (x)(u + v)
n n
1 1
≥ max{lim inf log Bn (x)u, lim inf log Cn (x)u + Dn (x)v}
n n n n
1 1
≥ lim inf log B (x)u = lim sup log Bn (x)u.
n
n n n n
Then the claim follows immediately from part (a) of the proposition. This fin-
ishes the proof of Proposition 4.14.
4.2.6 Completion of the proof

Keep in mind that our goal, to finish the proof of Theorem 4.1, is to show that
1
lim log An (x)v (4.24)
n n
exists for μ -almost every x ∈ M, every v ∈ Vxi \Vxi+1 and every 1 ≤ i ≤ k. It will
follow that the limit is equal to λ (x, v) = λi (x), of course.
Up to replacing M by suitable invariant measurable subsets, and considering
the normalized restriction of μ to each of such subsets, we may suppose that
k(x) is independent of x, and so is the dimension l ≥ 1 of the invariant sub-
bundle Vx = Vxk . Let α (x) = λk (x) and β (x) = λk−1 (x). It is clear that conditions
(i) and (ii) in Section 4.2.5 are satisfied in this context, and so we will be able
to use Proposition 4.14. Moreover, Proposition 4.11 implies that, for μ -almost
every x,
1 1
lim log (An (x) | Vx )−1 −1 = λk (x) = lim log An (x) | Vx
n n n n
and so,
1
lim log An (x)v = λk (x) for all v ∈ Vx \ {0}. (4.25)
n n
Consider the expression (4.11) of A relative to the direct sum decomposition

Rd = Vx⊥ ⊕Vx . Using a measurable orthonormal basis {w1 (x), . . . , wd−l (x)}, we
may identify Vx⊥ with Rd−l and view each B(x) as an element of GL(d − l),
depending measurably on the point x. Recall that log± B ∈ L1 (μ ), by (4.12).
Define
Uxi = Vx⊥ ∩Vxi , for every i = 1, . . . , k.
By Proposition 4.14(a), for every 1 ≤ i ≤ k − 1 and u ∈ Uxi \Uxi+1 ,
1 1
lim sup log Bn (x)u = lim sup log An (x)u = λi (x).
n n n n

Thus, Rd−l = Ux1 · · · Uxk−1 {0} is the Oseledets flag of B, and the Lya-
punov exponents are λ1 (x), . . . , λk−1 (x). By induction on k,
1
lim log Bn (x)u = λi (x) for all u ∈ Uxi \Uxi+1
n n
and every i = 1, . . . , k − 1. By Proposition 4.14(b), it follows that

1
lim log An (x)v = λi (x) for all v ∈ Vxi \Vxi+1 (4.26)
n n
and every i = 1, . . . , k − 1. The relations (4.25) and (4.26) contain (4.24).

The proof of Theorem 4.1 is complete.
4.3 Proof of the two-sided theorem

Now let us prove Theorem 4.2. We have seen in Remark 4.12 that the limits
are not affected when one takes n → −∞ instead. The main remaining steps
for proving Theorem 4.2 are: to upgrade the Oseledets flag to an invariant
decomposition (Section 4.3.1) and to prove the subexponential decay of the
angles (Section 4.3.2).
4.3.1 Upgrading to a decomposition

We pick-up where we left, in Section 4.2.6. It is no restriction to suppose that
k(x)
k(x) and the dimension l ≥ 1 of the invariant sub-bundle Vx = Vx are inde-
pendent of x. Let α (x) = λk (x) and β (x) = λk−1 (x). Consider the expression
(4.11) of A relative to the decomposition Rd = Vx⊥ ⊕Vx . By Remark 4.12,
1 1
lim log Dn (x)−1 −1 = lim log Dn (x) = α (x) (4.27)
n→±∞ n n→±∞ n
for μ -almost every x. Similarly, since λ1 , . . . , λk−1 are the Lyapunov exponents
of B (see Section 4.2.6), Remark 4.12 gives
1
lim log Bn (x)−1 −1 = β (x) for μ -almost every x. (4.28)
n→±∞ n
We are going to use these facts in the proof of the following result.
Proposition 4.16 If f : M → M is invertible then there exists a measurable
invariant sub-bundle x → Wx such that Rd = Wx ⊕Vx for μ -almost every x.
Proof Let L be the space of all measurable maps L : x → Lx assigning to
μ -almost every x a linear map Lx : Vx⊥ → Vx . The graph transform T : L → L
is the transformation characterized by the condition that, for every x ∈ M, the

image of the graph of Lx under A(x) coincides with the graph of T (L) f (x) .
Using the expression (4.11), this condition translates to the following relation:

T (L) f (x) = C(x) + D(x)Lx B(x)−1 for every x. (4.29)
Clearly, the graph of some L ∈ L is an invariant sub-bundle if and only if
T (L)x = Lx for μ -almost every x. Thus, to prove the proposition it suffices to
find such an (essentially) fixed point of the graph transform.
With that in mind, let us rewrite (4.29) as
T (L)y = Ry + S(L)y for every y, (4.30)
where R : y → Ry = C( f −1 (y))B( f −1 (y))−1 = C( f −1 (y))B−1 (y) is an element
of L and S : L → L is the linear operator defined by
S(L)y = D( f −1 (y))L f −1 (y) B−1 (y).
We claim that there exist measurable functions a(x) > 0 and ε (x) > 0 such that
Sk (R)y ≤ a(y)e−kε (y) for every k ≥ 0 and μ -almost every y. (4.31)
Assume this fact for a while. Then the sum L : y → Ly = ∑∞ k
k=0 S (R)y is well
defined μ -almost everywhere, and it is a measurable map; in other words, L ∈
L . Moreover, L is a fixed point of the graph transform:
∞ ∞ ∞
Ry + S(L)y = Ry + S ∑ Sk (R) y
= Ry + ∑ Sk (R)y = ∑ Sk (R)y = Ly .
k=0 k=1 k=0
We are left to prove the claim (4.31). Begin by observing that, for any k ≥ 0,
Sk (R)y = Dk ( f −k (y))R f −k (y) B−k (y).
For each y, take
1
ε (y) =
β (y) − α (y) .
5
By (4.27) and Lemma 4.15, there exists a measurable function d(·) such that
Dk ( f −k (y)) ≤ d(y)ek(α (y)+2ε (y)) for every k ≥ 0. (4.32)
Similarly, (4.28) and Lemma 4.15 imply that there exists a measurable function
b(·) such that
Bk (y)−1 ≤ b(y)ek(−β (y)+ε (y)) for every k ≥ 0. (4.33)
The observation (4.12) implies that log+ R ≤ log C ◦ f −1 + log+ B−1 is
in L1 (μ ). Then we may use Corollary 3.11: there exists a measurable function
r such that
R f −k (y) ≤ r(y)ekε (y) for every k ≥ 0. (4.34)

Let a(y) = b(y)d(y)r(y). Putting (4.32)–(4.34) together,

Sk (L)y ≤ b(y)d(y)r(y)ek(α (y)−β (y)+4ε (y)) = a(y)e−kε (y)
This proves (4.28), and that completes the proof of the proposition.
The relation (4.27) implies that
1
lim log An (x)v = α (x) = λk (x) for all v ∈ Vz \ {0} (4.35)
n→±∞ n
and μ -almost every x. Identify Wx with Rd−l through some measurable or-
thonormal basis and let Ã : M → GL(d − l) be given by Ã(x) = A(x) | Wx .
Define
Wxi = Wx ∩Vxi for every i = 1, . . . , k.
By Proposition 4.14(a), for every 1 ≤ i ≤ k − 1 and u ∈ Wxi \Wxi+1 ,
1 1
lim sup log Ãn (x)u = lim sup log An (x)u = λi (x).
n n n n
Thus, Wx = Wx1 · · · Wxk−1 {0} is the Oseledets flag of Ã, and the Lya-
punov exponents are λ1 (x), . . . , λk−1 (x). By induction on k, there exists an Ã-
invariant splitting Wx = Ex1 ⊕ · · · ⊕ Exk−1 such that

k−1
Wxj = Exi for every j = 1, . . . , k − 1 (4.36)
i= j
1
lim log An (x)u = λi (x) for all u ∈ Exi \ {0} (4.37)
n→±∞ n
and μ -almost every x. Denote Exk = Vx = Vxk . Then Rd = Ex1 ⊕ · · · ⊕ Exk−1 ⊕ Exk
is an A-invariant splitting and (4.36) leads to

k
Vxj = Vxk ⊕Wxj = Exi for every j = 1, . . . , k.
i= j
This proves part (a) of the theorem. Part (b) is contained in (4.35) and (4.37).
4.3.2 Subexponential decay of angles

All that is left to do is to prove part (c) of Theorem 4.2. As explained before,
it is no restriction to assume that k(x) and the dimensions of the Oseledets
subspaces are constant μ -almost everywhere. Given disjoint subsets I and J of
{1, . . . , k}, define

φ (x) = sin Exi , Exj .
i∈I j∈J

Since the Oseledets subspaces are invariant,

j
φ ( f (x)) | sin (A(x) i∈I Exi , A(x) j∈J Ex )|
= .
φ (x) | sin ( i∈I Exi , j∈J Exj )|
By elementary linear algebra (Exercise 3.5), the right-hand side is bounded
above by A(x)A(x)−1 and below by (A(x)A(x)−1 )−1 . In particular,
the hypotheses log+ A±1 ∈ L1 (μ ) imply that log φ ◦ f −log φ is μ -integrable.
Equivalently, log φ ◦ f −1 −log φ is μ -integrable. Applying Corollary 3.11, both
to f and to f −1 we conclude that
1
lim φ ( f n (x)) = 0,
n→±∞ n
which is what we wanted to prove. The proof of Theorem 4.2 is complete.
4.3.3 Consequences of subexponential decay

Let us begin by pointing out that we proved a stronger fact than was stated in
part (b) of the theorem, namely:
1 1
lim log (An (x) | Exi )−1 −1 = lim log An (x) | Exi = λi (x) (4.38)
n→±∞ n n→±∞ n
for μ -almost every x (this implies, for instance, that the limit in (b) is uniform
over all unit vectors). It follows (Exercise 4.7) that
1
lim log det(An (x) | Exi ) = λi (x) dim Exi
n→±∞ n
for μ -almost every x. Moreover, using Theorem 4.2(c) and Exercise 4.7,

1
lim log det An (x) | Exi = ∑ λi (x) dim Exi (4.39)
n→±∞ n
i∈I i∈I
for any I ⊂ {1, . . . , k(x)}. For example,

k(x)
1
lim log | det An (x)| = ∑ λi (x) dim Exi (4.40)
n→±∞ n
i=1
for μ -almost every x ∈ M. In particular, if | det A| ≡ 1 then the sum of all Lya-
punov exponents, counted with multiplicity, is identically zero. Analogously,
1
lim log det(An (x) | Exu ) = ∑ λi (x) dim Exi
n→±∞ n
λ (x)>0 i
(4.41)
1
lim log det(An (x) | Exs ), = ∑ λi (x) dim Exi
n→±∞ n
λ (x)<0 i


where Exu = λi (x)>0 Exi is the unstable bundle and Exs = λi (x)<0 Exi is the
stable bundle.
One can interpret and expand these conclusions, using the formalism of ex-
terior linear algebra. Let us recall a few basic notions involved, referring the
reader to Bourbaki [39, § III.5] for more information.
Let E = Rd and 0 ≤ l ≤ d. The exterior l-power Λl E of E is the vector
space of alternating l-linear forms ω : E ∗ × · · · × E ∗ → R on the dual space E ∗
(a 0-linear form is just a real constant). Its elements are called l-vectors. Define
the exterior product of vectors v1 , . . . , vl ∈ E to be the alternating l-linear form
v1 ∧ · · · ∧ vl : E ∗ × · · · × E ∗ → R defined by

v1 ∧ · · · ∧ vl (φ1 , . . . , φl ) = ∑ sgn(σ ) φσ (1) (v1 ) · · · φσ (l) (vl ),
σ
where the sum is over all permutations of {1, . . . , l}. These are special ele-
ments of Λl E, that we call decomposable l-vectors. Every l-vector is a sum of
decomposable l-vectors. Moreover, if {e j : j = 1, . . . , d} is a basis of E then
{e j1 ∧ · · · ∧ e jl : 1 ≤ j1 < · · · < jl ≤ d} is a basis of Λl E. Thus,

d
dim Λl E = .
l
Every linear map L : E → E induces a linear map Λl L : Λl E → Λl E, by

Λl L(ω ) : (φ1 , . . . , φl ) → ω (φ1 ◦ L, . . . , φl ◦ L), (4.42)
for ω ∈ Λl E and φ1 , . . . , φl ∈ E ∗ . This map preserves the subset Λld E of decom-
posable l-vectors, since
Λl L(v1 ∧ · · · ∧ vl ) = L(v1 ) ∧ · · · ∧ L(vl ).
Recall that Gr(l, d) denotes the Grassmannian manifold, whose elements are
the l-dimensional subspaces of E. Since
v1 ∧ · · · ∧ vl = 0 ⇔ {v1 , . . . , vl } is linearly independent,
there is a well-defined map Ψ : Λld E \ {0} → Gr(l, d) assigning to each non-
zero decomposable l-vector v1 ∧ · · · ∧ vl the linear subspace spanned by the
vectors v1 , . . . , vl . Two l-vectors have the same image under Ψ if and only if
one is a multiple of the other. Hence, this map factors through a bijection
PΛld E → Gr(l, d),
that we can use to identify the projectivization of the space of decomposable
l-vectors with the Grassmannian manifold. Then (4.42) corresponds to the nat-
ural action F → L(F) of L on the Grassmannian.

Now take E to be endowed with some inner product (v, w) → v · w. For de-
composable l-vectors, define

(v1 ∧ · · · ∧ vl ) · (w1 ∧ · · · ∧ wl ) = det vi · w j 1≤i, j≤d . (4.43)
The right-hand side does not depend on the particular decomposition of the
l-vectors as exterior products. Moreover, (4.43) extends to an inner product on
the exterior power Λl E. If {e j : j = 1, . . . , d} is an orthonormal basis of E then
{e j1 ∧ · · · ∧ e jl : 1 ≤ j1 < · · · < jl ≤ d} is an orthonormal basis of Λl E. The
norm v1 ∧ · · · ∧ vl of a non-zero decomposable l-vector coincides with its
volume, defined as the unique positive real number vol(v1 , . . . , vl ) such that
v1 ∧ · · · ∧ vl = ± vol(v1 , . . . , vl ) u1 ∧ · · · ∧ ul ,
for any orthonormal basis {u1 , . . . , ul } of the subspace V = Ψ(v1 ∧ · · · ∧ vl ).
This definition does not depend on the choice of this basis. The determinant of
a linear isomorphism L : E → E restricted to V is given, in absolute value, by

det(L | V ) = vol(L(v1 ), . . . , L(vl )) = Λ L(v1 ∧ · · · ∧ vl ) .
l
(4.44)
vol(v1 , . . . , vl ) v1 ∧ · · · ∧ vl
Proposition 4.17 Let F : M × Rd → M × Rd , F(x, v) = ( f (x), A(x)v) be an
invertible linear cocycle and let χ1 (x) ≤ · · · ≤ χd (x) be the Lyapunov exponents
of F, counted with multiplicity. For each 0 ≤ l ≤ d, let Λl F be the linear cocycle
induced by F on the exterior l-power; that is,
Λl F : M × Λl (Rd ) → M × Λl (Rd ), Λl F(x, ω ) = ( f (x), Λl A(x)ω ).
The Lyapunov exponents of the linear cocycle Λl F, counted with multiplicity,
are the sums χi1 (x) + · · · + χil (x), with 1 ≤ i1 < · · · < il ≤ d.
The proof of the proposition goes as follows. Let λ1 (x) > · · · > λk (x) be the
Lyapunov exponents and Rd = Ex1 ⊕ · · · ⊕ Exk be the Oseledets decomposition
of F. We may suppose that k and the dimensions dm = dim Exm are constant
μ -almost everywhere. For each (l1 , . . . , lk ) with 0 ≤ lm ≤ dm and ∑km=1 lm = l,
l ,...,l
define Ex 1 k to be the subspace of Λl (Rd ) generated by the l-vectors v1 ∧
· · · ∧ vl with
m−1 m
v j ∈ Exm for ∑ li < j ≤ ∑ li . (4.45)
i=1 i=1
l ,...,l
Every x → Ex 1 k is a Λl F-invariant sub-bundle. Moreover, these sub-bundles
are in general position with respect to each other, and

d d
dim Ex l1 ,...,lk
= 1 ··· k .
l1 lk

l ,...,l
Then (Exercise 4.5), we have l1 ,...,lk Ex 1 k = Λl (Rd ). Take v1 ∧ · · · ∧ vl as in
(4.45) and let Vx be the subspace generated by v1 , . . . , vl . Using (4.44),
1 1
lim log Λl A(x)n (v1 ∧ · · · ∧ vl ) = lim log det(An (x) | Vx ).
n→±∞ n n→±∞ n
For each m = 1, . . . , k, take Vxm to be the subspace generated by the vectors v j
k
i=1 li < j ≤ ∑i=1 li . Then Vx =
with ∑m−1 m m
m=1 Vx and, using Theorem 4.2(c) and
Exercise 4.7,
1 k
1
lim log det(An (x) | Vx ) = ∑ lim log det(An (x) | Vxm ).
n→±∞ n n→±∞ n
m=1
By (4.38) and Exercise 4.7,

1
lim log det(An (x) | Vxm ) = lm λm (x).
n→±∞ n
Putting these three relations together,

k
1
lim log Λl A(x)n (v1 ∧ · · · ∧ vl ) = ∑ lm λm (x).
n→±∞ n
m=1
Since these decomposable k-vectors v1 ∧ · · · ∧ vl generate the space E l1 ,...,lk , it

follows that the latter is contained in some Oseledets subspace, with Lyapunov
exponent ∑km=1 lm λm (x).
Thus, the Lyapunov exponents of Λl F are the numbers ∑km=1 lm λm (x), with
l ,...,l
0 ≤ lm ≤ dm and ∑km=1 lm = l, each with multiplicity equal to dim Ex 1 k (some
of these numbers may coincide, in which case the multiplicities add up). This
is a rephrasing of the conclusion of Proposition 4.17.
4.4 Two useful constructions

Here we present a couple of constructions concerning Lyapunov exponents that
will be useful later. The reader may choose to skip this section at first reading,
and come back to it when these ideas are called for.
4.4.1 Inducing and Lyapunov exponents

Let Z ⊂ M be a positive μ -measure subset and g : Z → Z be the first return
map, defined by
g(x) = f r(x) (x), r(x) = inf{n ≥ 1 : f n (x) ∈ Z} (4.46)

(r(x) is finite for μ -almost every x, by Poincaré recurrence). Let ν be the nor-
malized restriction of μ to Z; that is,
μ (E)
ν (E) = for every measurable set E ⊂ Z. (4.47)
μ (Z)
Let G : Z × Rd → Z × Rd be the linear cocycle over g defined by the function
B : Z → GL(d), B(x) = Ar(x) (x). We are going to see that the Lyapunov expo-
nents of G at each point x are obtained from the Lyapunov exponents of F by
multiplication by a constant c(x) ≥ 1.
Proposition 4.18 Take the transformation f : M → M to be invertible.
(1) The probability measure ν is g-invariant and we have log+ B±1 ∈ L1 (ν )

whenever log+ A± ∈ L1 (μ ).
(2) The Oseledets decomposition of G coincides with the restriction of the
Oseledets decomposition of F.
(3) For ν -almost every x ∈ Z there exists c(x) ≥ 1 such that the Lyapunov
exponents satisfy λ j (G, x) = c(x)λ j (F, x) for every j.
Proof For each j ≥ 1, let Z j be the subset of points x ∈ Z such that r(x) = j.
Then {Z j : j ≥ 1} and { f j (Z j ) : j ≥ 1} are partitions of full measure subsets
of Z. Note also that g | Z j = f j | Z j for all j ≥ 1. For any measurable set E ⊂ Z
and any j ≥ 1,

μ g−1 (E ∩ f j (Z j )) = μ f − j (E ∩ f j (Z j )) = μ E ∩ Z j ,
because μ is invariant under f . It follows that

∞ −1 ∞
μ g−1 (E) = ∑μ g (E ∩ f j (Z j )) = ∑μ E ∩ Z j = μ (E).
j=1 j=1
So, the normalized restriction ν is invariant under g. Next, from the definition
B(x) = Ar(x) (x), we get
∞ ∞ j−1
Z
log+ B d μ = ∑ log+ A j d μ ≤ ∑∑ log+ A ◦ f i d μ .
j=1 Z j j=1 i=0 Z j
Since μ is invariant under f and the domains f i (Z j ) are pairwise disjoint for
all 0 ≤ i ≤ j − 1, it follows that
∞ j−1
Z
log+ B d μ ≤ ∑∑ f i (Z j )
log+ A d μ ≤ log+ A d μ .
j=1 i=0
The corresponding bound for the norm of the inverse is obtained in the same

way. This proves that log+ B±1 is ν -integrable. It is clear that the restriction
of the Oseledets decomposition of F is invariant under G. Define
1 k−1
c(x) = lim ∑ r(g j (x)).
k→∞ k j=0
Note that r is integrable relative to ν :

∞ ∞ j−1
Z
r dμ = ∑ j μ (Z j ) = ∑ ∑ μ ( f i (Z j )) ≤ 1.
j=1 j=1 i=0
Thus, by the ergodic theorem, c(x) is well defined at ν -almost every x. It is

clear from the definition that c(x) ≥ 1. Now, given any non-zero vector v and a
generic point x ∈ Z,
1 1
lim log Bk (x)v = c(x) lim log An (x)v .
k→±∞ k n→±∞ n
(Theorem 4.2 ensures that the limit on the right-hand side exists.) This implies
that the Oseledets decomposition of G coincides with the restriction of the
Oseledets decomposition of F, and that the Lyapunov exponents of the two
cocycles are related by the factor c(x), as claimed in the proposition.
4.4.2 Invariant cones

The cone of radius a > 0 around a subspace V of Rd is defined by
C(V, a) = {v1 + v2 ∈ V ⊕V ⊥ : v2 < av1 }. (4.48)
Proposition 4.19 Assume there exist δ > 0, b > a > 0, and an F-invariant
measurable decomposition Rd = Vx ⊕Wx , defined at μ -almost every point, such
that the dimensions of Vx and Wx are constant,
| sin (Vx ,Wx )| ≥ δ and A(x)(C(Vx , b)) ⊂ C(V f (x) , a)
for μ -almost every x ∈ M. Then the Lyapunov exponents of F along the sub-
bundle Vx are strictly larger than the Lyapunov exponents of F along Wx .
Proof It is no restriction to suppose that the subspace Vx is constant: pick

any measurable orthonormal basis {v1 (x), · · · , vd (x)} of the space Rd such that
{v1 (x), . . . , vl (x)} is a basis of Vx and use it to identify Rd to itself; then Vx is
identified to Rl × {0}. So, from now on we suppose Vx = V for every x ∈ M.
The cone C(V, b) is the union of all the graphs of linear maps u : V → V ⊥
such that u < b. In the space H (V, b) of such maps, consider the Hilbert

g0
f0 ψ
φ
{u : u < b}
Figure 4.1 Cross-ratio and projective metric
metric (see Birkhoff [26] and [114, Section 12.3.1]), defined by the logarithm
of the cross-ratio
φ − ψ0 ψ − φ0
d(φ , ψ ) = log , (4.49)
φ − φ0 ψ − ψ0
where φ0 and ψ0 are the points where the line through φ and ψ hits the bound-
ary {u : u = b}, denoted in such a way that φ0 is closer to φ and ψ0 is
closer to ψ (see Figure 4.1). The hypothesis implies that each A(x)(graph(φ )),
φ ∈ H (V, b) may be written as graph(ψ ) for some ψ ∈ H (V, a) ⊂ H (V, b).
Since H (V, a) is relatively compact in H (V, b), it has finite diameter for the
Hilbert metric. By a crucial property of the Hilbert metric (see [114, Propo-
sition 12.3.6]), the action H (V, b) → H (V, b) of A(x) thus defined is a θ -
contraction with respect to this metric, for some θ < 1 that depends only
on an upper bound for the diameter and, hence, can be expressed in terms
of a and b alone. It follows (Exercise 4.11) that the width of the iterates
An ( f −n (x))(C(V, b)) decays as Ke−κ n , for some K > 0 and κ > 0 that depend
only on a and b. Of course, the intersection of the iterate satisfies
∞

An ( f −n (x))(C(V, b)) = V. (4.50)
n=1
We are going to deduce that any unit vector v ∈ V is expanded at a definitely

faster rate than any unit vector w ∈ Wx . Indeed, one can find c0 > 0 depending
only on b and such that v + c0 w ∈ C(V, b). Then,
An (x)v + c0 An (x)w ∈ C(V, Ke−κ n ) for every n ≥ 1.
Since An (x)w ∈ W f n (x) and | sin (W f n (x) ,V f n (x) )| ≥ δ , the previous relation
implies that
An (x)w ≤ K0 e−κ n An (x)v for every n ≥ 1,

4.5 Notes 63
where K0 > 0 depends only on K, c0 , and δ , and so is completely determined

by a, b, and δ . It follows that
1 1
lim log An (x)w ≤ lim log An (x)v − κ
n n n n
for every v ∈ V and w ∈ Wx , and μ -almost every x. This proves that any Lya-
punov exponent of a vector in Wx is smaller, by a definite amount −κ , than any
Lyapunov exponent of a vector in V .
4.5 Notes
Theorems 4.1 and 4.2 are part of the main result in Oseledets [92]: the results of
Oseledets are stated in the slightly broader context of linear cocycles on finite-
dimensional vector bundles. Other proofs were given by Raghunathan [101],
Ruelle [105] and Ledrappier [80], among others. While the early proofs relied
mostly on linear algebra calculations, dynamical proofs were later found by
Mañé [89] and Walters [117]. Our presentation is closest to [117] and to its
extension to the invertible case by Bochi [28].
The assumption log+ A±1 ∈ L1 (μ ) is unnecessarily strong for Theorem
4.1: it suffices to assume that the function log+ A is integrable; however, in
this case the smallest Lyapunov exponent may be −∞. The multiplicative er-
godic theorem also extends to some infinite-dimensional cocycles with finitely
many non-negative Lyapunov exponents: see Ruelle [106] and Mañé [87].
Proposition 4.17 shows that the subadditive ergodic theorem suffices to iden-
tify all the Lyapunov exponents, including their multiplicities, not just the ex-
tremal ones: for every 1 ≤ l ≤ d, the l-largest Lyapunov exponent (counted
with multiplicity) coincides with λ+ (Λl F) − λ+ (Λl−1 F). This observation was
part of the early proofs of the Oseledets theorem.
Most of what follows in this book turns around the multiplicative ergodic
theorem. Chapter 6 shows that the Lyapunov exponents may be expressed, in
an explicit way, in terms of the invariant measures and the stationary measures
of the projective cocycle PF. The invariance principle in Chapter 7 is a general
tool for analysing the special situation when all the Lyapunov exponents are
equal (and to prove that this is seldom the case). In Chapter 8 we will investi-
gate conditions under which all the Lyapunov exponents are distinct. Finally,
in Chapters 9 and 10 we will investigate how Lyapunov exponents vary with
the linear cocycle and with the invariant measure μ .
The constructions in Sections 4.4.1 and 4.4.2 will be used in the proof of
Propositions 9.13 and 8.3, respectively.

4.6 Exercises
Exercise 4.1 Prove that a map x → Vx with values in Gr(d) is measurable if
and only if Ml = {x ∈ M : dimVx = l} is a measurable set for every 0 ≤ l ≤ d
and, for each l, there exist measurable vector fields vi : Ml → Rd , i = 1, . . . , l
such that {v1 (x), . . . , vl (x)} is a basis of Vx for every x ∈ Ml .
Exercise 4.2 Prove that (4.5) defines a complete separable norm in F :
(1) Check that · 1 is indeed a norm in F .
(2) Show that any Cauchy sequence in F admits some subsequence (Ψn )n
such that Ψn − Ψn+1 1 ≤ 2−2n for every n. Deduce that there exists N ⊂
M with μ (N) = 1 such that Ψ(x, v) = limn Ψn (x, v) exists for every (x, v) ∈
N × P. Argue that ψ ∈ F and use bounded convergence to get that Ψn −
Ψ1 → 0.
(3) Use the fact that C0 (P) is separable, and the assumption that the probabil-
ity space (M, B, μ ) is separable, to conclude that (F , · 1 ) is separable
(compare Theorem 2 in [120, page 137]).
Exercise 4.3 Show that two probability measures η , ζ ∈ M (μ ) coincide if

and only if Ψ d η = Ψ d ζ for every Ψ ∈ F .
Exercise 4.4 Prove that the weak∗ topology on M (μ ) is metrizable and com-
pact:
(1) Fix a countable dense subset {Ψk : k ≥ 1} of the unit ball of F and define
∞

d(ξ , η ) = ∑ 2−k Ψk d ξ − Φk d η for any ξ , η ∈ M (μ ).
k=1
Show that d is a distance compatible with the weak∗ -topology.

(2) Given any sequence in M (μ ), use a diagonal argument to find a subse-

quence (ηn )n such that limn Ψk d ηn exists for every k. Conclude that
there exists a unique bounded linear operator g : F → R such that

g(Ψk ) = lim Ψk d ηn for every k.
n

Then Ψ d η = g(Ψ) defines a probability measure η on M ×P that projects
down to μ . Moreover, (ηn )n converges to η in the weak∗ topology.
Exercise 4.5 Check that if the d1 + · · · + dk = d then

d d1 d
= ∑ ··· k ,
l l ,...,l
l 1 lk
1 k
where the sum is over all 0 ≤ li ≤ di with ∑ki=1 li = l.

4.6 Exercises 65
Exercise 4.6 Show that if {v1 , . . . , vk } and {w1 , . . . , wk } are orthonormal bases
for the same subspace V then the matrix (vi · w j )i, j is orthonormal.
Exercise 4.7 Show that if L : Rd → Rd is a linear isomorphism and V1 ,V2 are

linear subspaces of Rd then
1 | sin (L(V1 ), L(V2 ))|
(1) −1
≤ ≤ L L−1 ;
L L | sin (V1 ,V2 )|
(2) (L | V1 )−1 −d1 ≤ | det(L | V1 )| ≤ (L | V1 )d1 , where d1 = dimV1 ;
(3) | sin (V1 ,V2 )| = π −1 −1 , where π : V1 → V2 is the orthogonal projection;
| det(L | V1 ⊕V2 )|
(4) | sin (L(V1 ), L(V2 ))|d1 ≤ ≤ | sin (V1 ,V2 )|−d1 .
| det(L | V1 )| | det(L | V2 )|
Exercise 4.8 Prove that if μ is ergodic for f then ν is ergodic for the induced
map g, and c(x) = 1/μ (Z) for ν -almost every x.
Exercise 4.9 Extend Proposition 4.18 and Exercise 4.8 to the non-invertible
case, using the concept of natural extension (see [114, Section 2.4.2]).
Exercise 4.10 Let V,W be subspaces of Rd with dimV = dimW . Prove that:
(1) given a > 0 there is b > 0 such that C(W, b) ⊂ C(V, 2a) if W ⊂ C(V, a);
(2) if b < 1/10 then W ⊂ C(V, b) implies C(V, b) ⊂ C(W, 3b).
Exercise 4.11 Let ρ > 0 be small. Show that if H ⊂ H (V, b) is contained

in the ρ -neighborhood of φ ≡ 0, relative to the Hilbert metric d, then the union
C of all graph(φ ), φ ∈ H is contained in C(V, ρ /b). More generally, relate the
d-diameter of a subset H of H (V, b) to the width of the corresponding cone
C.
Exercise 4.12 Check that one may replace the assumption | sin (Vx ,Wx )| ≥
δ in Proposition 4.19 by a condition of subexponential decay of the angles:
1
lim log | sin (An (x)Vx , An (x)Wx )| = 0.
n n
Exercise 4.13 Let B ∈ GL(d) be such that B(C(V, b)) ⊂ C(V, a) for some
subspace V ⊂ Rd and some 0 < a < b. Prove that:
(1) B admits a dominated decomposition Rd = U ⊕ S with dimU = dimV ; that

is, B(U) = U, B(S) = S, and every eigenvalue of B restricted to U is larger,
in norm, than every eigenvalue of B restricted to S;

(2) there is r > 0 such that B̃ = B1 · · · B admits a dominated decomposition

Rd = UB̃ ⊕ SB̃ , for any ≥ 1 and any B1 , . . . , B in the r-neighborhood of
B; moreover, UB̃ is uniformly close to U if r is small.

5
Stationary measures
Further development of the theory requires that we exploit in more depth the
connection between the Lyapunov exponents of a linear cocycle and the invari-
ant measures of the corresponding projective cocycle, of which we had a brief
glimpse in the proof of the multiplicative ergodic theorem. In this chapter we
introduce a general formalism that will be very useful towards that end.
Linearity is not relevant at this stage, so we formulate the results for a class
of systems more general than linear and projective cocycles, that we call ran-
dom transformations. The definition and fundamental properties of such sys-
tems are discussed in Section 5.1.
In Section 5.2 we introduce the key notion of stationary measure for a ran-
dom transformation. We will see in the next chapter that the measures station-
ary under the projective cocycle completely determine the Lyapunov exponents
of the linear cocycle. The properties of stationary measures of general (possi-
bly non-invertible) random transformations are studied in Sections 5.2 and 5.3.
The invertible case is treated in more detail in Section 5.4 and leads to the
important concepts of u-state and s-state, which are invariant probability mea-
sures whose disintegrations along the fibers have special invariance properties.
These disintegrations are revisited, from a different angle, in Section 5.5.
5.1 Random transformations

Denote by M the space of sequences X N (or X Z ) in some probability space
(X, X , p), endowed with the product σ -algebra A = X N (or X Z ) and the
product measure μ = pN (or μ = pZ ). Moreover, let f : M → M be the shift
map on M.
Let (N, B) be a measurable space and take M × N to be endowed with the
product σ -algebra A ⊗ B. Given a set E ⊂ M × N and points x ∈ M and v ∈ N,
67

68 Stationary measures
we will consider the slices defined by
Ev = {x ∈ M : (x, v) ∈ E} and E x = {v ∈ N : (x, v) ∈ E}.
If E ⊂ M × N is measurable then so are Ev ⊂ M and E x ⊂ N (Exercise 5.1).

A random transformation (or locally constant skew product) over f is a
measurable transformation of the form
F : M × N → M × N, F(x, v) = ( f (x), Fx (v))
where Fx depends only on the zeroth coordinate of x ∈ M.
Example 5.1 (Random matrices) Let X = {A1 , . . . , Am } ⊂ GL(d). Consider

the probability measure defined on X by p = p1 δA1 + · · · + pm δAm , with p1 ,
. . . , pm > 0 and p1 + · · · + pm = 1. Let N = Rd and Fx = x0 for every x in
M = X Z . Two random transformations associated with this data are relevant
for our purposes: the locally constant linear cocycle F : M × Rd → M × Rd ,

(αn )n , v → (αn+1 )n , α0 (v) ,
and the corresponding projective cocycle PF : M × PRd → M × PRd ,

(αn )n , [v] → (αn+1 )n , [α0 (v)] .
Example 5.2 Let F1 , F2 : S1 → S1 be homeomorphisms of the circle N = S1 .

Consider X = {1, 2} and M = X N . Let f : M → M be the shift map and μ = pN
with p = p1 δ1 + p2 δ2 . Let F : M × N → M × N, F(x, v) = ( f (x), Fx0 (v)). Then
F n (x, v) = ( f n (x), Fi(n−1) · · · Fi(0) (v)) where the Fi( j) ∈ {F1 , F2 } are independent
random variables with identical distribution p.
Let F : M × N → M × N be a random transformation. The transition proba-

bilities associated with F are defined by

p(v, B) = μ {x ∈ M : Fx (v) ∈ B} = μ F −1 (M × B)v ,
for each v ∈ N and each measurable set B ⊂ N. The set on the right-hand side is
measurable (Exercise 5.1) and so p(v, B) is well defined. It is clear that p(v, B)
is countably additive on the second variable, and so each p(v, ·) is a probability
measure on N. Moreover (Exercise 5.1), the function N → R, v → p(v, B) is
measurable, for any measurable B ⊂ N.
The transition operator P associated with the random transformation F
is the linear map P acting on the space of bounded measurable functions
ϕ : N → R by

P ϕ : N → R, P ϕ (v) = ϕ (Fx (v)) d μ (x). (5.1)

5.1 Random transformations 69
Note that x → ϕ (Fx (v)) is measurable (Exercise 5.2), and so P ϕ (v) is well
defined. It is clear that the function P ϕ defined by (5.1) is bounded. Moreover,
P ϕ is measurable. To see this, begin by supposing that ϕ is the characteristic
function of some set B ⊂ N. Then
P ϕ (v) = μ ({x : Fx (v) ∈ B}) = p(v, B) (5.2)
is a measurable function of v, as observed in the previous paragraph. By the
linearity of the transition operator, it follows that P ϕ is measurable whenever
ϕ is a simple function; that is, a linear combination of characteristic functions.
In general, there exists a sequence (ϕn )n of simple functions converging uni-
formly to ϕ . Thus, P ϕ is the uniform limit of measurable functions P ϕn ,
n ≥ 1, and so it is measurable.
The adjoint transition operator P ∗ associated with the random transforma-
tion F acts on the space of probability measures η on N by

P ∗ η (B) = (Fx )∗ η (B) d μ (x) = η Fx−1 (B) d μ (x), (5.3)
for any measurable set B ⊂ N. Observe that

η Fx−1 (B) = η F −1 (M × B)x (5.4)
is a measurable function of x, by Exercise 5.1. Thus, the integral in (5.3) is well
defined. Moreover, it is clear from the definition (5.3) that P ∗ η is countably
additive and satisfies P ∗ η (N) = 1. Thus, P ∗ η is a probability measure on N.
We will take the space T (N) of measurable transformations N → N to be
endowed with some σ -algebra relative to which the map F : M → T (N),
F (x) = Fx is measurable. For instance, this could be the push-forward of the
σ -algebra A under the map F . Then we denote by ν the push-forward F∗ μ
of the probability measure μ under F . The definitions (5.1) and (5.3) translate
to

P ϕ (v) = ϕ (g(v)) d ν (g) and P ∗ η (B) = η (g−1 (B)) d ν (g). (5.5)
So, the transition operators are completely characterized by the probability

measure ν . We have the following duality relation:
Lemma 5.3 Let ϕ : N → R be a bounded measurable function. Then

ϕ d(P ∗ η ) = (P ϕ ) d η .
Proof Suppose first that ϕ is the characteristic function of some measurable

set B ⊂ N. Then, using (5.4),

ϕ d(P η ) = P η (B) = η F −1 (M ×B)x d μ (x) = (μ × η )(F −1 (M ×B))
∗ ∗

and, using (5.2),

(P ϕ ) d η = μ {x : Fx (v) ∈ B} d η (v) = (μ × η )(F −1 (M × B)).
This proves the equality in this case. By linearity, we get that the equality holds
for every simple function ϕ . The general case follows, because every bounded
measurable function is a uniform limit of simple functions.
5.2 Stationary measures

A probability measure η on N is called stationary for the random transforma-
tion F if P ∗ η = η ; that is, if

η (B) = η (Fx−1 (B)) d μ (x) = η (g−1 (B)) d ν (g) (5.6)
for every measurable set B ⊂ N. The probability measure ν = F∗ μ was intro-

duced in Section 5.1. The expression on the right-hand side of (5.6) shows that
ν determines the set of stationary measures entirely.
Recall that a measure η is said to be invariant under a transformation g if
η (g−1 (B)) = η (B) for every measurable set. The definition (5.6) means that η
is called stationary if such an equality holds on average over all Fx , the average
being with respect to μ . Thus, clearly, any probability measure that is invariant
under Fx for μ -almost every x is also stationary. The following example (see
also Exercise 6.8) illustrates the fact that the converse is far from being true.
Example 5.4 Let f : M → M be the shift map on M = {1, 2}Z and μ be a
Bernoulli measure supported on the whole M = {1, 2}Z . Let A : M → SL(2)
be a locally constant function,
A | [0; 1] ≡ A1 and A | [0; 2] ≡ A2 ,
where A1 and A2 are hyperbolic matrices with no common eigenspace. Let
F : M × PR2 → M × PR2 be the projective cocycle defined by A over f . For
each i = 1, 2, the probability measures on PR2 invariant under the action of
Ai are the convex combinations of the Dirac masses on the corresponding
eigenspaces. Thus, the hypotheses imply that A1 and A2 have no common in-
variant probability measure. On the other hand, Proposition 5.6 below ensures
that F does have some stationary measure.
The following characterization of stationary measures is specific to one-
sided random transformations; that is, such that M = X N and f : M → M is
the one-sided shift. The two-sided situation will be analyzed later.

Proposition 5.5 Let F : M × N → M × N be a one-sided random transfor-

mation. A probability measure η on N is stationary for F if and only if the
probability measure μ × η on M × N is F-invariant.
Proof First, we prove the ‘if’ claim. Suppose that μ × η is invariant under F.
Given any bounded measurable function ϕ : N → R, define ψ : M × N → R by
ψ (x, v) = ϕ (v). Then, using Lemma 5.3,

ϕ (v)(dP ∗ η )(v) = P ϕ (v) d η (v) = ψ (x, Fx (v)) d μ (x) d η (v)

= ψ ( f (x), Fx (v)) d μ (x) d η (v)

= ψ (x, v) d μ (x) d η (v) = ϕ (v) d η (v).
Since ϕ is arbitrary, this shows that P ∗ η = η , as we wanted to prove.

Now we prove the ‘only if’ claim. Let η be a stationary measure on N.
Given any bounded measurable function ψ : M × N → R, consider the function

ϕ : N → R defined by ϕ (v) = ψ (x, v)d μ (x). Then

ψ (x, v) d μ (x) d η (v) = ϕ (v) d η (v) = P ϕ (v) d η (v)

= ϕ (Fx (v)) d μ (x) d η (v)

= ψ (y, Fx (v)) d μ (y) d μ (x) d η (v).
Write x = (x0 , x1 , . . . ) and y = (y0 , y1 , . . . ). Since Fx depends only on x0 , the

triple integral may be written as

ψ (y, Fx0 (v) d pN (y) d p(x0 ) d η (v).
Let us write z = (x0 , y0 , y1 , . . . ). Then f (z) = y, and so the previous integral

becomes

ψ ( f (z), Fz0 (v)) d pN (z) d η (v) = ψ ( f (z), Fz (v)) d μ (z) d η (v).
This proves that

ψ d μ dη = (ψ ◦ F) d μ d η .
Since ψ is arbitrary, this shows that μ × η is invariant under F.
There is more than one useful topology in the space of probability measures

on a measurable space N. The uniform topology is defined by the total varia-

tion norm

ξ − η = sup ψ d ξ − ψ d η ,
|ψ |≤1
where the supremum is over all measurable functions ψ : M → R with |ψ | ≤ 1.

The pointwise topology is the smallest topology such that
ξ → ξ (E) is continuous for every measurable E ⊂ N.

It follows that ξ → ψ d ξ is continuous for every bounded measurable func-
tion ψ : M → R. When N is a metric space, we may also consider the weak∗
topology, which is the smallest topology such that

ξ → ϕ d ξ is continuous for each bounded uniformly continuous function ϕ .
It is clear from the definitions that uniform convergence implies pointwise con-
vergence which, in turn, implies weak∗ convergence. Both converses are false,
in general. The weak∗ topology is especially useful, not the least because it is
compact if N is compact (see [114, Theorem 2.1.5]).
Proposition 5.6 Assume that N is a compact metric space and Fx : N → N is

continuous for every x ∈ M. Then there is some stationary measure for F.
Proof The first step is
Lemma 5.7 If ϕ : N → R is continuous then P ϕ : N → R is also continuous.
Proof Given n ∈ N, v ∈ N, and δ > 0, define

1
B(v, n, δ ) = x ∈ M : Fx B v, ⊂ B(Fx (v), δ ) .
n
Fix v ∈ N and ε > 0. Then fix δ > 0 such that

ε
d(z, y) < δ ⇒ |ϕ (z) − ϕ (y)| < .
2
The fact that every Fx is continuous ensures that μ (B(v, n, δ )c ) → 0 as n → ∞.
Fix n large enough so that
ε
μ (B(v, n, δ )c ) < .
4 sup |ϕ |

Now we are ready to prove continuity. For w ∈ N such that d(v, w) < δ ,

|P ϕ (v) − P ϕ (w)| = [ϕ (Fx (v)) − ϕ (Fx (w))] d μ (x)

≤ | · · · | d μ (x) + | · · · | d μ (x)
B(v,n,δ ) B(v,n,δ )c
ε
≤ μ (B(v, n, δ )) + 2 sup |ϕ |μ (B(v, n, δ )c ) ≤ ε
2
This proves the lemma.
Lemma 5.8 The operator P ∗ is continuous relative to the weak∗ topology

in the space of probability measures on N.
Proof Let (ηk )k be a sequence of probability measures on N converging to

η in the weak∗ topology. Let ϕ : N → R be continuous. By Lemma 5.7, the
function P ϕ is also continuous. Then, using Lemma 5.3 and the definition of
the weak∗ topology,

∗
ϕ d(P ηk ) = (P ϕ ) d ηk → (P ϕ ) d η = ϕ d(P ∗ η ).
This proves that P ∗ ηk converges to P ∗ η in the weak∗ topology, and so P ∗

is continuous.
To finish the proof of Proposition 5.6, let ξ be an arbitrary probability mea-

sure on N and then define
1 n−1 ∗ j
ηn = ∑ (P ξ ).
n j=0
Since the space of probability measures on N is weak∗ compact, there exists

(nk )k → ∞ such that ηnk converges in the weak∗ topology to some probability
measure η . On the one hand, by Lemma 5.8, P ∗ ηnk converges to P ∗ η . On
the other hand,
1 n 1 ∗ nk
P ∗ ηk = ∑
nk j=1
P ∗j
η j , = ηk +
nk
P ξ − ξ
and so P ∗ ηk also converges to η . These two facts show that P ∗ η = η as we

wanted to prove.
Since the stationary measures are the fixed points of P ∗ , it also follows from
Lemma 5.8 that, under the assumptions of Proposition 5.6, the set of stationary
measures for F is closed (hence, compact) for the weak∗ topology. In fact,
one has a stronger statement, where the cocycle and the Bernoulli measure are

allowed to vary: the set of triples (p, F, η ) such that η is a stationary measure
is closed.
To state this in precise terms, let (pk )k be a sequence of probability measures
on X, let μk = pN Z
k (or pk ) for each k ≥ 1, let Fk : M × N → M × N, k ≥ 1 be
random transformations over the Bernoulli shifts f : (M, μk ) → (M, μk ), and
let ηk be a stationary measure for each Fk , k ≥ 1.
Proposition 5.9 Take N to be a compact metric space and Fx : N → N to be
continuous for every x ∈ M.
(a) Suppose that (pk )k converges to p in the pointwise topology, (Fk )k con-
verges to F uniformly on M × N, and (ηk )k converges to η in the weak∗
topology. Then η is a stationary measure for F.
(b) If X is a metric space and x → Fx (v) is continuous for every v ∈ N, then the
previous conclusion remains true when (pk )k converges to p in the weak∗
topology and the other assumptions are unchanged.
Proof The argument is essentially the same for both claims, pointwise and
weak∗ . For each k ≥ 1, let Pk and Pk∗ be the transition operators associated
with the random transformation Fk .
Lemma 5.10 Pk ϕ converges uniformly to P ϕ , for any continuous ϕ : N →
R.
Proof Let ε > 0. By definition, for any v ∈ N,

Pk ϕ (v) − P ϕ (v) = ϕ (Fk,x (v)) d pk (x) − ϕ (Fx (v)) d pk (x)

+ ϕ (Fx (v)) d pk (x) − ϕ (Fx (v)) d p(x).
(5.7)
Since ϕ is (uniformly) continuous and Fk is uniformly close to F,
|ϕ (Fk,x (v)) − ϕ (Fx (v))| < ε for every (x, v) ∈ M × N
if k is large enough. Then the first term of (5.7) on the right-hand side of (5.7)
is smaller than ε . The last term of (5.7) is also smaller than ε if k is large,
because (pk )k converges to p and the function x → ϕ (Fx (v)) is bounded and
measurable (respectively, continuous). This proves the lemma.
Proceeding with the proof of Proposition 5.9, consider any sequence (ηk )k
converging to some η . For any continuous function ϕ : N → R,

ϕ d(Pk∗ ηk ) − ϕ d(P ∗ η ) =

∗
= (Pk ϕ ) d ηk − (P ϕ ) d ηk + ϕ d(P ηk ) − ϕ d(P ∗ η ).

5.3 Ergodic stationary measures 75
By Lemma 5.10, the first difference on the right-hand side goes to zero when
k → ∞. The second difference goes to zero as well because, by Lemma 5.8,
P ∗ ηk converges to P ∗ η in the weak∗ topology. Since ϕ is arbitrary, this
proves that Pk∗ ηk converges to P ∗ η in the weak∗ topology, as k → ∞. When
each ηk is stationary for Fk this means that ηk converges to P ∗ η . Then P ∗ η =
η , as we wanted to prove.
5.3 Ergodic stationary measures

A bounded measurable function ϕ : N → R is stationary if it satisfies P ϕ = ϕ .
A set B ⊂ N is stationary if its characteristic function XB is stationary. Let η
be a stationary measure. A bounded measurable function ϕ : N → R is η -
stationary if P ϕ (v) = ϕ (v) for η -almost every v ∈ N. A set B ⊂ N is η -
stationary if XB is η -stationary.
Proposition 5.11 Let η be a stationary measure. The following conditions

are equivalent:
(a) every η -stationary function is constant on some set with full η -measure;
(b) if B ⊂ N is an η -stationary set then η (B) is either 0 or 1.
Proof Property (b) is a special case of property (a). To prove that (b) also
implies (a), let ϕ : N → R be η -stationary. We claim that
B(c) = {v ∈ N : ϕ (v) > c}
is an η -stationary set for every c ∈ R. Consequently, η (B(c)) is either 0 or

1, for every c ∈ R. Let c̄ be the supremum of the set of values of c such that
η (B(c)) = 1. Then ϕ (v) = c̄ for η -almost every v ∈ B(c̄). We are left to prove
the claim.
Lemma 5.12 If η is a stationary measure and ψ1 , ψ2 : N → R are η -station-

ary then so are the functions max{ψ1 , ψ2 } and min{ψ1 , ψ2 }.
Proof Begin by noting that if ψ is η -stationary then |ψ | = |P ψ | ≤ P|ψ |

at η -almost every point and (P|ψ | − |ψ |) d η = 0, because η is stationary.
These two facts together imply that |ψ | is η -stationary. It is clear that the sum
and the difference of η -stationary functions are also η -stationary. Then, for
any ψ1 and ψ2 as in the statement
ψ1 + ψ2 |ψ1 − ψ2 |
max{ψ1 , ψ2 } = +
2 2

and
ψ1 + ψ2 |ψ1 − ψ2 |
min{ψ1 , ψ2 } = −
2 2
are η -stationary. This proves lemma.
It follows from Lemma 5.12 that ϕn = min{1, n max{ϕ − c, 0}} is η -station-

ary for every n ≥ 1. Now, ϕn converges monotonically to the characteristic
function of B(c). So (Exercise 5.7), the characteristic function is η -stationary,
as claimed. This finishes the proof of the proposition.
A stationary measure η is ergodic if it satisfies conditions (a) and (b) in

Proposition 5.11. It follows from Exercise 5.6 that the definition does not
change if one considers, instead, sets or functions that are actually stationary
(not just η -stationary).
In the one-sided case, we have the following complement to Proposition 5.5:
Proposition 5.13 Let F : M × N → M × N be a one-sided random transfor-

mation and η be a stationary measure. Then η is ergodic if and only if the
F-invariant probability measure m = μ × η is ergodic.
Proof Suppose that m = μ × η is ergodic and let B ⊂ N be a stationary set.

The latter means that

XB (v) = PXB (v) = XB (Fx (v)) d μ (x)
for every v ∈ N. This may be read as follows: v ∈ B implies Fx (v) ∈ B for

μ -almost every x and v ∈ / B implies Fx (v) ∈/ B for μ -almost every x. Then
the set M×B is (F, m)-invariant, in the sense that the symmetric difference
(M×B)ΔF −1 (M×B) has zero m-measure. By ergodicity, it follows that η (B) =
m(M×B) is either 0 or 1. This proves the “if” part of the statement.
To prove the “only if” part, let ψ : M × N → R be any bounded measurable
F-invariant function; that is, such that ψ ◦ F = ψ . Let ϕ : N → R be defined by

ϕ (v) = ψ (y, v) d μ (y). Then, for any v ∈ N,

P ϕ (v) = ϕ (Fx (v)) d μ (x) = ψ (y, Fx (v)) d μ (y) d μ (x).
By assumption, Fx depends only on the zeroth coordinate x0 . On the other hand,

f (x) = (x1 , . . . , xn , . . . ) is independent of x0 . Thus, recalling that μ = pN , the
last integral coincides with

ψ ( f (x), Fx (v)) d μ (x) = ψ (x, v) d μ (x) = ϕ (v)
for every v ∈ N. This means that ϕ is stationary. Hence, by hypothesis, there

exists C ∈ R such that ϕ (v) = C for η -almost every v. Moreover, for each k ≥ 1
and x0 , x1 , . . . , xk−1 ∈ X,

ψ (x, v) d μ (xk , xk+1 , . . . ) = ψ ( f k (x), Fxk (v)) d μ (xk , xk+1 , . . . ).
Changing variables y = f k (x) = (xk , xk+1 , . . . ), and observing that Fxk depends
only on x0 , . . . , xk−1 , we conclude that

ψ (x, v) d μ (xk , xk+1 , . . . ) = ψ (y, Fxk (v)) d μ (y) = ϕ (Fxk (v)).
This proves that

ψ (x, v) d μ (xk , xk+1 , . . . ) = C for every k, x0 , . . . , xk−1 , v.
Then (Exercise 5.8), we have ψ (x, v) = C for m-almost all (x, v) ∈ M × N. This
proves that m is ergodic, and so the proof of the proposition is complete.
The next theorem is an analogue for random transformations of the ergodic

decomposition theorem in deterministic dynamics (see [114, Theorem 5.1.3]):
every stationary measure can be written as a convex combination of ergodic
stationary measures.
Theorem 5.14 (Ergodic decomposition) Take N to be a separable complete

metric space. Then the space of ergodic stationary measures is a measurable
subset of the space of all stationary measures. Moreover, every stationary mea-

sure η may be represented as η = ξ d η̂ (ξ ), meaning that

η (B) = ξ (B) d η̂ (ξ ) for every Borel set B ⊂ N,
where η̂ is a probability measure in the space of stationary measures giving

full weight to the subset of ergodic stationary measures.
See Kifer [71, Appendix A.1] for a proof. The statement extends, immedi-
ately, to the case when N is just a Borel subset of a separable complete metric
space, because any measure on N may be identified with a measure on the
larger metric space.
5.4 Invertible random transformations

We have seen in Proposition 5.5 that, in the one-sided case, a probability mea-
sure η is stationary if and only if the probability measure μ × η is invariant.
Here we discuss the relations between these two concepts when the random

transformation F is invertible and, in particular, the shift f : M → M is two-

sided.
Throughout, we assume that X is a separable complete metric space. Then
(Exercise 5.12), M = X Z (and M = X N ) itself is also a separable complete
metric space.
Let f : (M, μ ) → (M, μ ) be a two-sided Bernoulli shift: M = X Z and the
invariant probability measure μ = pZ for some probability measure p on X.
Denote
Z+ = {n ∈ Z : n ≥ 0} and Z− = {n ∈ Z : n < 0}.
± ±
Then write M ± = X Z and μ ± = pZ . Let π ± : M → M ± be the canonical
projections, and define f ± : M ± → M ± by
f + ◦ π+ = π+ ◦ f and f − ◦ π − = π − ◦ f −1 .
Then f + : (M + , μ + ) → (M + , μ + ) and f − : (M − , μ − ) → (M − , μ − ) are one-
sided Bernoulli shifts and μ = μ − × μ + .
Now let F : M × N → M × N be an invertible random transformation over
f : (M, μ ) → (M, μ ). There are two non-invertible random transformations nat-
urally associated with F, namely:
F + : M + × N → M + × N, F + (x+ , v) = ( f + (x+ ), Fx++ (v)) over f +
F − : M − × N → M − × N, F − (x− , v) = ( f − (x− ), Fx−− (v)) over f − ,
−1
where Fx++ = Fx for any x ∈ M with π + (x) = x+ and Fx−− = Ff −1 (x) for
any x ∈ M with π − (x) = x− . Since Fx : N → N depends only on the zeroth
coordinate x0 of the point x ∈ M, these transformations are indeed well defined
and locally constant: Fx++ depends only on the coordinate x0+ and Fx−− depends
−
only on the coordinate x−1 .
A probability measure η on N is forward stationary for F if it is stationary
for F + ; that is, if

η (B) = η (Fx++ )−1 (B) d μ + (x+ ) = η Fx−1 (B) d μ (x)
for every measurable set B. Analogously, η is backward stationary for F if it

is stationary for F − ; that is, if

η (B) = η (Fx−− )−1 (B) d μ − (x− ) = η Ff −1 (x) (B) d μ (x)

= η Fx (B) d μ (x)
for every measurable set B (the last equality uses the fact that the measure μ is
invariant under f ).

In what follows we are going to analyze these notions from several different
angles. The conclusions can be summarized as follows (some of the notions
involved have yet to be defined):
η is forward stationary ⇔ the product m+ = μ + × η is F + -invariant

⇔ the lift m of η (and m+ ) is a u-state
⇔ the disintegration of m factors through M −
η is backward stationary ⇔ the product m− = μ − × η is F − -invariant
⇔ the lift m of η (and m− ) is an s-state
⇔ the disintegration of m factors through M + .
In general, the sets of forward stationary measures and backward stationary

measures are distinct and even disjoint. That is the case, for instance, for locally
constant projective cocycles that are pinching and twisting (Exercise 5.24).
5.4.1 Lift of an invariant measure

The invariant probabilities of F over μ are in one-to-one correspondence with
the invariant probabilities of F + over μ + , and the same is true for F − and μ − .
To state this fact precisely, consider the canonical projections
P : M×N → M and Q : M×N → N

± ± ±
(5.8)
P : M ×N → M and Q± : M ± × N → N,
and let Π± : M × N → M ± × N be given by Π± (x, v) = (π ± (x), v). Recall that

π ± : M → M ± are the canonical projections between the shift spaces. The next
two lemmas remain true if one replaces all +-signs by −-signs.
Lemma 5.15
(1) If m is an F-invariant probability measure with P∗ m = μ then m+ = Π+

∗m
is an F + -invariant probability measure with P∗+ m+ = μ + .
(2) Given any F + -invariant probability measure m+ with P∗+ m+ = μ + there
exists a unique F-invariant probability measure m such that Π+ ∗m=m
+
and P∗ m = μ .
Proof The first claim is an immediate consequence of the obvious relations

F + ◦ Π+ = Π+ ◦ F and P+ ◦ Π+ = π + ◦ P. To prove the second claim, let C be
the σ -algebra of M × N and, for each n ≥ 0, denote by Cn the sub-σ -algebra

of sets of the form F n (M − × G) for some G ⊂ M + × N. Note that

∞

C0 ⊂ C1 ⊂ · · · ⊂ Cn ⊂ · · · and Cn generates C . (5.9)
n=0
Suppose that m is an F-invariant probability measure such that Π+ +

∗m=m .
Then
m(F n (M − × G)) = m(M − × G) = m+ (G) (5.10)
for any measurable set G ⊂ M + × N. This proves that m is uniquely determined
on Cn for every n. Then, by (5.9), m is uniquely determined on C . To prove
existence, take (5.10) to define m on each σ -algebra Cn . These definitions are
consistent: if
F n (M − × Gn ) = F n−1 (M − × Gn−1 )
then
M − × Gn = F −1 (M − × Gn−1 ) = M − × (F + )−1 (Gn−1 )
and, using that m+ is F+ -invariant, we find that m+ (Gn−1 ) = m+ (Gn ). So,
(5.10) does define a probability measure on C . By construction, this probabil-
ity measure is F-invariant.
The probability measure m is called the lift of m+ . Note that the operations
in parts (1) and (2) of Lemma 5.15 are inverse to each other: by definition, the
Π+ + +
∗ -projection of the lift of m coincides with m and, by uniqueness, the lift
of the Π+ -projection of m coincides with m.
Lemma 5.16 An F + -invariant probability measure m+ is ergodic for F + if
and only if its lift m is ergodic for F.
Proof Let G ⊂ M + ×N be a measurable set invariant under F + . Then M − ×G
is invariant under F. If m is ergodic, it follows that m+ (G) = m(M − × G) is
either 0 or 1. Thus, m+ is ergodic.
To prove the converse, let ψ : M ×N → R be a bounded measurable function.
We claim that the Birkhoff time average, relative to the transformation F, is
constant m-almost everywhere. It follows that, m is ergodic for F. To prove the
claim, suppose first that there exists k ≥ 1 such that ψ (x, v) depends only on
(x−k , . . . , x0 , . . . ) and v ∈ N. Then, for every n ≥ k
ψ (F n (x, v)) = ψ ( f n (x), Fxn (v))
depends only on x+ = (x0 , . . . , xn , . . . ) and v. Consequently, the time average ψ̃
is constant on each set M − × {(x+ , v)}. This means that ψ̃ : M × N → R may
be identified with a function ψ̂ : M + × N → R, defined on a subset with full

m+ -measure. Moreover, the fact that ψ̃ ◦ F = ψ̃ at m-almost every point entails

that ψ̂ ◦ F + = ψ̂ at m+ -almost every point. By ergodicity of m+ , it follows that
ψ̂ is constant on a subset with full m+ -measure. Equivalently, ψ̃ is constant on
a subset with full m-measure. This proves the claim in this case. In general, for
each k ≥ 1, let ψk : M × N → R be defined by

ψk (x, v) = ψ (x, v) d μ (. . . , x−m , . . . , x−k+1 ).
By the previous special case of our claim, the time average ψ̃k is constant m-
almost everywhere. Now, (ψk )k converges to ψ at m-almost every point (Ex-
ercise 5.16). By the bounded convergence theorem, it follows that (ψk )k con-
verges to ψ in the space L1 (m). Then the time averages (ψ̃k )k also converge to
ψ̃ in L1 (m). It follows that ψ̃ is constant m-almost everywhere.
5.4.2 s-states and u-states

In view of the previous observations, to every forward stationary measure η
one can associate an F-invariant probability measure m, namely the lift of the
F + -invariant probability measure m+ = μ + × η ; similarly for every backward
stationary measure. In either case, we call m the lift of the stationary measure
η . Our purpose now is to characterize the classes of F-invariant measures one
finds in this way.
Let m be an F-invariant probability measure on M × N with P∗ m = μ . We
call m an s-state if, for measurable sets A− ⊂ M − , A+ ⊂ M + , and B ⊂ N,
m(A− × A+ × B)
does not depend on A− . (5.11)
μ (A− × A+ )
Analogously, m is called a u-state if, for measurable sets A− ⊂ M − , A+ ⊂ M + ,

and B ⊂ N,
m(A− × A+ × B)
does not depend on A+ . (5.12)
μ (A− × A+ )
We call m an su-state if it is both an s-state and a u-state. These are measures
of a very special kind:
Proposition 5.17 If m is a u-state then η = Q∗ m is a forward stationary

measure. Conversely, if η is a forward stationary measure on N then its lift
m is a u-state. Moreover, m is ergodic if and only if η is ergodic. The same is
true replacing, simultaneously, forward stationary by backward stationary and
u-state by s-state.

Proof Let m be a u-state and m+ = Π+

∗ m. For any measurable sets A ⊂ M
+ +
and B ⊂ N,
m+ (A+ × B) m(M − × A+ × B)
=
μ + (A+ ) μ (M − × A+ )
depends only on B. Let this expression be denoted η (B). It is clear from the
definition that η is a probability measure on N. Moreover, m+ = μ + × η and,
since m+ is invariant under F + , Proposition 5.5 asserts that η is forward sta-
tionary for F. Finally, Q∗ m = Q+ ∗ m = η . This proves the first claim.
+
To prove the converse, suppose η is a stationary measure and let m be its

lift. We want to prove that
m(A− × A+ × B)
μ (A− × A+ )
does not depend on A+ . It is no restriction to take A− to be a cylinder,
A− = {x− ∈ M − : x−k ∈ X−k , . . . , x−1 ∈ X−1 },
for some measurable sets X−k , . . . , X−1 ⊂ X, because the cylinders generate
the σ -algebra of M − . Then, since m is invariant under F,
m(A− × A+ × B) = m(F −k (A− × A+ × B))

= m(M − × G) = (μ + × η )(G)
where G is the set of (x+ , v) ∈ M + × N with x0 ∈ X−k , . . . , xk−1 ∈ X−1 ,

(xk , . . . , xn , . . . ) ∈ A+ , and v ∈ (Fxk )−1 (B). Notee that Fxk is a function of x+ ,
indeed it depends only on x0 , . . . , xk−1 . We conclude that m(A− × A+ × B) is
given by

μ + (A+ )η ((Fxk )−1 (B)) d pk (x0 , . . . , xk−1 ).
X−k ×···×X−1
Consequently,
k −1
X−k ×···×X−1 η ((Fx ) (B)) d p (x0 , . . . , xk−1 )
k
m(A− × A+ × B)
=
μ (A− × A+ ) μ − (A− )
does not depend on A+ , as we wanted to prove. By Lemma 5.16, m is ergodic
if and only if m+ is ergodic. By Proposition 5.13, m+ is ergodic if and only if
η is ergodic.
For the remainder of this section we take N to be a compact metric space,

and Fx : N → N to be a homeomorphism for every x ∈ M. Moreover, x0 → Fx0
is a measurable map from X to the space of continuous transformations on N,

endowed with the topology of uniform convergence. Keep in mind that we take
X and M to be separable complete metric spaces.
It follows from Propositions 5.6 and 5.17 that the set of u-states (and s-
states) is non-empty. Exercise 5.19 outlines an alternative, more direct proof
of this fact, and similar ideas will be developed in Section 6.1.
The following stronger statement will be useful later. Let (pk )k be a seq-
uence of probability measures on X and (μk )k be the corresponding Bernoulli
measures on M. Let Fk : M × N → M × N, k ≥ 1 be random transformations
over the Bernoulli shifts f : (M, μk ) → (M, μk ) and, for each k ≥ 1, let mk be
an s-state (respectively, a u-state) for Fk .
Proposition 5.18 Suppose that (pk )k converges to p in the uniform topology,

(Fk )k converges to F uniformly on M × N, and (mk )k converges to m in the
weak∗ topology. Then m is an s-state (respectively, a u-state) for F.
Moreover, the assumption on (pk )k is not necessary when F is continuous.
Proof Since each mk projects to μk and (mk )k → m in the weak∗ topology, the
projections (μk )k must converge to the projection of m in the weak∗ topology.
Now, the hypothesis (pk )k → p implies that (μk )k → μ in the weak∗ topology
(Exercise 5.14). Thus, m projects down to μ .
Lemma 5.19 Let (mk )k be a sequence of probability measures on M × N pro-

jecting to Bernoulli measures (μk )k on M and satisfying (5.11). If (mk )k con-
verges to some probability measure m and (μk )k converges to some Bernoulli
measure μ , in the weak∗ topology, then m satisfies (5.11). Analogously for con-
dition (5.12).
Proof Write mk = μk− × m+ k where mk is the projection of mk to M × N

+ +
− −
(Exercise 5.17). The hypothesis gives that (mk )k → m and (μk )k → μ in the
weak∗ topology. Restricting to some subsequence, we may assume that (m+ k )k
converges to some probability measure m+ on M + × N. Then (Exercise 5.14)
m = μ − × m+ and this means that m satisfies (5.11). Exchanging ±-signs, one
gets the same claim for (5.12).
To finish the proof of Proposition 5.18, we still have to prove that m is F-

invariant. We claim that

(ϕ ◦ Fk ) dmk − (ϕ ◦ F) dm → 0 as k → ∞, (5.13)
for any bounded uniformly continuous function ϕ : M × N → R. This means

that (Fk )∗ mk converges to F∗ m in the weak∗ topology. The hypothesis also
implies that (Fk )∗ mk = mk converges to m in the weak∗ topology. It follows

that F∗ m = m, as we wanted to prove. Observe that

(ϕ ◦ Fk ) dmk − (ϕ ◦ F) dmk → 0 as k → ∞,
because ϕ ◦ Fk − ϕ ◦ F converges uniformly to zero. Thus, to prove (5.13) we

only have to show that

(ϕ ◦ F) dmk − (ϕ ◦ F) dm → 0 as k → ∞, (5.14)
for any bounded uniformly continuous function ϕ : M × N → R. This is clear if

F is continuous, because mk is assumed to converge to m in the weak∗ topology.
This proves the second claim in the proposition. Now we prove the first claim,
where F is not assumed to be continuous. Arguing as in Exercise 5.13, it is no
restriction to suppose that ϕ : M × N → R is such that ϕ (x, v) depends only on
(x−n , . . . , x0 , . . . , xn−1 , v), for some fixed m ≥ 1. For each x ∈ M, consider the
continuous function Φx : N → R defined by
Φx (v) = ϕ ◦ F(x, v) = ϕ ( f (x), Fx (v)).
Since Φx depends only on (x−n+1 , . . . , x0 , . . . , xn ), we may view Φ : x → Φx as

a (measurable) map from X 2n to the set C0 (N) of continuous functions on N.
This is a separable metric space for the norm of uniform convergence. Thus,
we may apply the theorem of Lusin (see [114, Appendix A.3]) to find, for any
given ε > 0, a compact set K ⊂ X 2n such that p2n (K c ) < ε and Φ is continuous
on K. By construction, the restriction of ϕ ◦ F to K × N → R is continuous.
Then, by the extension theorem of Tietze, there exists a continuous function
ψ : X 2n × N → R such that sup |ψ | ≤ sup |ϕ | and ψ (x, v) = ϕ ◦ F(x, v) for every
x ∈ K and v ∈ N. Then (functions on X 2n × N are also viewed as functions on
M × N that depend only on (x−n+1 , . . . , x0 , . . . , xn )),

(ϕ ◦ F) dmk − (ϕ ◦ F) dm = ψ dmk − ψ dm

+ (ϕ ◦ F − ψ ) dmk − (ϕ ◦ F − ψ ) dm.
K c ×N K c ×N
(5.15)
The first difference on the right-hand side is less than ε in absolute value if k is
large, because ψ is continuous and mk converges to m in the weak∗ topology.
The remainder terms in (5.15) are bounded by

2 sup |ϕ | mk (K c × N) + m(K c × N) = 2 sup |ϕ | p2n c 2n
k (K ) + p (K ) .
c
The hypothesis that pk converges to p implies that p2n 2n

k converges to p , in
k (K ) < ε for all large k.
the uniform sense (Exercise 5.14). In particular, p2n c

Substituting these estimates in (5.15) we find that

(ϕ ◦ F) dmk − (ϕ ◦ F) dm < ε + 4 sup |ϕ | ε
if k is large enough. This completes the proof of (5.14) and the proposition.
5.5 Disintegrations of s-states and u-states

For future use, we give an alternative characterization of s-states and u-states,
in terms of their disintegrations along the vertical fibers. Take X, M = X Z and
N to be separable complete metric spaces.
5.5.1 Conditional probabilities

Let m be any probability measure on M × N that projects down to μ . A dis-
integration of m along vertical fibers is a measurable family {mx : x ∈ M} of
probability measures on N satisfying

m(E) = mx {v : (x, v) ∈ E} d μ (x) for any measurable set E ⊂ M × N.
M
The mx are called conditional probabilities of m.

The family P = {v : (x, v) ∈ E} of vertical fibers is a measurable partition
of M × N, in the sense of Rokhlin; that is, it is the limit of some increasing
sequence of finite partitions P1 ≺ · · · ≺ Pn ≺ · · · . To see this, just consider
any countable basis of open sets {Uk : k ∈ N} of M, and take
n

Pn = Uk × N,Ukc × N .
k=1
Then, Rokhlin’s disintegration theorem (see [114, Section 5.2]) gives that a dis-
integration along vertical fibers does exist. Moreover, it is essentially unique:
any two disintegrations coincide on a full μ -measure subset. Indeed, the proof
of Rokhlin’s theorem provides the following explicit characterization of the
conditional probabilities:
m(Uk × B)
mx (B) = lim for μ -almost every x ∈ M (5.16)
Uk →x m(Uk )
and any B ⊂ N in some countable algebra AN generating the Borel σ -algebra
of N. The limit is taken over suitable decreasing sequences of open sets Uk
whose intersection consists of the point x only.

Proposition 5.20 An F-invariant probability measure m on M ×N with P∗ m =

μ is an s-state if and only if it admits a disintegration that factors through M + ;
that is, such that each mx depends only on x+ = π + (x). Dually, m is a u-state
if and only if it admits a disintegration that factors through M − .
Proof Suppose that m admits a disintegration {mx : x ∈ M} where each mx

depends only on x+ = π + (x): we write mx = mx+ . For any A− , A+ , B,

m(A− × A+ × B) A− d μ
− (x− )
A+ mx+ (B) d μ
+ (x+ )
= .
μ (A− × A+ ) μ (A− )μ + (A+ )
−
The right-hand side may be rewritten as

A+ mx+ (B) d μ
+ (x+ ) m(M − × A+ × B)
= ,
μ + (A+ ) μ (M − × A+ )
and so it does not depend on A− . Conversely, suppose m is a u-state. Fix count-
able bases of open sets {Uk− : k ∈ N} of M − and {Ul+ : l ∈ N} of M + . The
products Uk− ×Ul+ form a basis of open sets of M. As observed in (5.16), there
exists a full μ -measure set M0 ⊂ M and a countable algebra AN generating the
Borel σ -algebra of N, such that
m(Uk− ×Ul+ × B)
mx (B) = lim for all x ∈ M0 and B ∈ AN , (5.17)
Uk− ×Ul+ →x μ (Uk− ×Ul+ )
for every x ∈ M0 and B ∈ AN , where the limit is taken over any sequence of
basis elements Uk− × Ul+ shrinking down to x. Given any pair x, y ∈ M0 such
that π + (x) = π + (y), we may consider the same neighborhoods Ul+ for both
points. Then, as the right-hand side of (5.17) does not depend on Uk− , we get
that mx (B) = my (B) for every B ∈ AN . Since the algebra AN is generating, this
implies that mx = my . We have shown that mx factors through M + , restricted
to the full measure subset M0 . Then we can force this property on the whole M
by redefining the mx appropriately for x ∈ M \ M0 (since this is a zero measure
set, the family mx continues to be a disintegration of m). This proves the claim
for s-states. The dual claim, for u-states, is analogous.
5.5.2 Martingale construction

Let (X, B, μ ) be a probability space. The conditional expectation E(Ψ | A ) of
an integrable function Ψ : X → R relative to a σ -algebra A ⊂ B is the unique
A -measurable function on X such that

E(Ψ | A ) d μ = Ψd μ for all A ∈ A .
A A

A martingale is a sequence (Ψn , Bn ), n ≥ 0 where Bn , n ≥ 0 is a non-decreas-

ing sequence of sub-σ -algebras of B, each Ψn : X → R, n ≥ 0 is Bn -measur-
able and integrable, and
E(Ψn+1 | Bn ) = Ψn almost everywhere, for every n ≥ 0. (5.18)
Theorem 5.21 (martingale convergence theorem) Given any martingale

(Ψn , Bn ), n ≥ 0, such that supn E(|Ψn |) < ∞, there exists a measurable func-
tion Ψ : X → R with E(|Ψ|) < ∞, such that
(1) Ψn → Ψ almost everywhere;
(2) E(Ψ | Bn ) = Ψn almost everywhere, for every n ≥ 0;
(3) Ψ is B∞ -measurable, where B∞ denotes the σ -algebra generated by the
union of all Bn , n ≥ 0.
A proof of this classical theorem can be found in Durrett [49, § 5.2], for
example. We are going to use the martingale convergence theorem to give a
useful alternative proof of Proposition 5.17 in the case when is N a compact
metric space.
Let F : M × N → M × N and F + : M + × N → M + × N be as in the previous
section. Let m+ be any F + -invariant probability measure that projects down to
μ + and let {m+y : y ∈ M } be a disintegration of m . Given any x ∈ M, define
+ +
mx = mx+ where x = π (x).

+ + + +
Lemma 5.22 The weak∗ limit mx = limn An ( f −n (x))∗ m+f −n (x) exists for μ -
almost every x ∈ M.
Proof Given any continuous function ϕ : N → R and n ≥ 0, define

Ψϕ ,n (x) = ϕ d An ( f −n (x))∗ m+f −n (x) . (5.19)
For each n ≥ 0, let Fn be the σ -algebra generated by the cylinders

[−n : Δ−n , . . . , Δl ] = {x ∈ M : xi ∈ Δi for i = −n, . . . , l}
with l ≥ −n and Δi ⊂ X a measurable set for i = −n, . . . , l. It is clear that
Fn ⊂ Fn+1 for every n ≥ 0. Since m+ y depends only on y = π (y) and A(x)
+ +
depends only on the zeroth coordinate x0 , the value of Ψϕ ,n (x) depends only on
xk , k ≥ −n. Hence, every Ψϕ ,n is Fn -measurable. Moreover, given any cylinder
C = [−n : Δ−n , . . . , Δl ],

Ψϕ ,n+1 (x) d μ (x) = ϕ ◦ An+1 ( f −n−1 (x)) dm+f −n−1 (x) d μ (x)
C C

= ϕ ◦ An ( f −n (x)) dq f −n−1 (x) d p(x−n ) · · · d p(x−1 )
C


y d p(y0 ). Observe that dqy = m f (y) for μ -almost every
+
where qy = X A(y)∗ m+
y, because m+ is F + -invariant (Exercise 5.20). Therefore,

Ψϕ ,n+1 (x) d μ (x) = ϕ ◦ An ( f −n (x)) dm+f −n (x) d μ (x)
C C

= Ψϕ ,n (x) d μ (x).
C
Since these cylinders C generate the σ -algebra Fn , this proves that (Ψϕ ,n , Fn )
is a martingale.
Then, by the martingale convergence theorem, there exists a measurable
function Ψϕ : M → R such that
Ψϕ (x) = lim Ψϕ ,n (x) (5.20)
for every x in some full μ -measure set Mϕ ⊂ M. Now we use once more the
fact that the space of continuous functions on the compact metric space N
admits countable dense subsets. Taking the intersection of the Mϕ over all ϕ in
such a dense subset one obtains a full μ -measure set M∗ ⊂ M such that (5.20)
holds for every x ∈ M∗ and every continuous function ϕ : N → R. Every map
ϕ → Ψϕ (x) is linear and non-negative and it sends ϕ ≡ 1 to Ψ1 (x) = 1. Thus,
by the Riesz representation theorem (see [114, Theorem A.3.11]), it defines a
probability measure mx on N:

ϕ dmx = Ψϕ (x), for each continuous ϕ : N → R.
The definition gives that mx is the weak∗ limit of An ( f −n (x))∗ m+f −n (x) .
The corollary that follows means that the lift of m+ is precisely the prob-
ability measure m on M × N that projects down to μ and admits the family
{mx : x ∈ M} in Lemma 5.22 as a disintegration.
Corollary 5.23 The lift of m+ is the probability measure m defined on M × N

by

m(E) = mx E ∩ ({x} × N) d μ (x)
for every measurable set E ⊂ M × N.
Proof Recall the proof of Lemma 5.22. The definition (5.19) implies
m f (x) = lim An+1 ( f −n (x))∗ m f −n (x) = A(x)∗ mx

n
for μ -almost every x. Therefore (Exercise 5.20), the probability measure m is

invariant under F. Moreover, given any cylinder C = [0 : Δ0 , . . . , Δl ] in F0 and

any continuous function ϕ : N → R,

XC ϕ dm = ϕ dmx d μ (x) = Ψϕ (x) d μ (x)
C C
= Ψϕ ,0 (x) d μ (x) = ϕ dm+
x d μ (x) = XC ϕ dm+
C C
because E(Ψϕ | F0 ) = Ψϕ ,0 . This proves that m projects to m+ under the map

Π+ : M × N → M + × N.
Corollary 5.24 Suppose that m+ = μ + × η for some forward stationary mea-
sure η . Then the lift m is a u-state.
Proof x ≡ η and so
Recall the proof of Lemma 5.22. In the present case, m+
each
Ψϕ ,n (x) = ϕ d An ( f −n (x))∗ η

depends only on x−n , . . . , x−1 . Thus, the limit Ψϕ (x) = ϕ dmx depends only
on x− = π − (x). This means that m is a u-state.
5.5.3 Remarks on 2-dimensional linear cocycles

Let F : M × R2 → M × R2 , F(x, v) = ( f (x), A(x)v) be an invertible locally con-
stant linear cocycle and PF : M × PR2 → M × PR2 be the associated projective
cocycle. By definition, the base dynamics f : (M, μ ) → (M, μ ) is a two-sided
Bernoulli shift. The function A : M → GL(2) is assumed to satisfy the integra-
bility conditions log+ A±1 ∈ L1 (μ ). Take the Lyapunov exponents λ+ and
λ− to be distinct and let R2 = Ex+ ⊕ Ex− be the Oseledets decomposition.
Let ms and mu be defined on the product space M × PR2 by

ms (C × D) = μ {x ∈ C : Ex− ∈ D} = δEx− (D)d μ (x)
C

m (C × D) = μ {x ∈ C : Ex+ ∈ D} = δEx+ (D)d μ (x)
u
C
for all measurable sets C ⊂ M and D ⊂ PR2 .

In other words, ms and mu are
the probability measures on M × PR that project down to μ and admit, re-
2
spectively, δEx− and δEx+ as conditional probabilities along the projective fibers.
Note that ms and mu are PF-invariant, because (Exercise 5.20)
A(x)∗ δEx− = δE − and A(x)∗ δEx+ = δE + μ -almost everywhere.
f (x) f (x)
Let G ⊂ M × PR2 be a measurable set invariant under PF. Then the set Cs =
{x ∈ M : (x, Ex− ) ∈ G} is invariant under f , and so μ (Cs ) is either 0 or 1.

Consequently, ms (G) is either 0 or 1, and this proves that ms is ergodic. A

similar argument proves that mu is ergodic. Moreover, ms is an s-state, because
Exs depends only on x− = π − (x); similarly, mu is a u-state.
Lemma 5.25 Any PF-invariant probability measure m that projects down to
μ may be written as a convex combination m = ams + bmu , with a, b ≥ 0 and
a+b = 1
Proof For each k ≥ 1, consider the set
1
Bk = (x, v) ∈ M × PR2 : | sin (v, Ex∗ )| ≥ | sin (Exs , Exu )| for ∗ = s, u .
k
Consider any (x, v) ∈ Bk . Since λ− < λ+ , the angle between An (x)v and E ufn (x)
decays exponentially fast as n → +∞ (Exercise 3.5). On the other hand, by
the theorem of Oseledets, the angle between E sf n (x) and E ufn (x) decays sub-
exponentially. This implies that F n (x, v) eventually leaves Bk . So, by Poincaré
recurrence, m(Bk ) = 0 for every F-invariant probability measure m and every
k ≥ 1. This means that m is concentrated on the set of points (x, v) ∈ M × PR2
with v ∈ {Exs , Exu }, and so its disintegration has the form
mx = a(x)δExs + b(x)δE u (x)
with a(x), b(x) ≥ 0 and a(x) + b(x) = 1. Moreover, the functions a(·) and b(·)
are ( f , μ )-invariant and so, by ergodicity of μ , they are constant on a full μ -
measure set.
Remark 5.26 The hypotheses that F is locally constant and the system ( f , μ )
is a Bernoulli shift are not used in the proof of Lemma 5.25. Thus, the con-
clusion extends to any 2-dimensional linear cocycle over an ergodic system
satisfying the integrability conditions in Theorem 3.20 and such that λ− < λ+ .
This will be used in the proof of Proposition 9.13.
Thus, ms and mu are the unique ergodic PF-invariant probability measures.
The next result also follows from the lemma.
Corollary 5.27 Either ms is the unique s-state or all PF-invariant probability
measures that project down to μ are s-states. Either mu is the unique u-state
or all PF-invariant probability measures that project down to μ are u-states.
Proof Suppose that there exists some s-state m = ms . Then m = ams + bmu
with b = 0, and so we may write
1 a
mu = m − ms .
b b
This implies that mu is an s-state, and therefore so is any convex combination

5.6 Notes 91
of ms and mu . By Lemma 5.25, this proves the first claim. The second one is
analogous.
In the next chapter, especially in Section 6.1, we will push this kind of pic-
ture to linear cocycles in arbitrary dimension.
5.6 Notes
The books of Kifer [72] and Arnold [5] contain systematic developments of the
theory of random transformations, from two different perspectives. Our defini-
tion is more restrictive than Arnold’s, as we consider only the Bernoulli case.
However, several parts of the material we present here have been developed in
much more generality.
The notions of s-state and u-state were introduced by Bonatti, Gomez-Mont
and Viana [37] and were further developed in a series of papers by Avila and
Viana [16], by Avila, Santamaria and Viana [13] and by Avila, Viana and
Wilkinson [17]. The analysis of the invertible case presented in Section 5.4
is probably new.
The idea of disintegration into conditional probabilities relies on the disinte-
gration theorem of Rokhlin [102]; see also [114, Section 5.2]. The martingale
construction in Section 5.5.2 goes back to Furstenberg [55].
The remarks in Section 5.5.3 are from Avila and Viana [16]. They hold more
generally: the cocycle need not be locally constant and it suffices that the in-
variant measure μ has local product structure, meaning that the restriction of
μ to a neighborhood of every point x is equivalent to the product μx+ × μx− of
a measure μx+ on the unstable manifold of x by a measure μx− on the stable
manifold of x.
5.7 Exercises
Exercise 5.1 Show that if E is a measurable subset of M × N then:
(1) the sets Ev = {x ∈ M : (x, v) ∈ E} and E x = {v ∈ N : (x, v) ∈ E} are mea-

surable subsets of M and N, respectively, for every v ∈ N and x ∈ M;
(2) the functions N → R, v → ξ (Ev ) and M → R, x → η (E x ) are measurable,
for any probability measures ξ on (M, A ) and η on (N, B);

(3) moreover, (ξ × η )(E) = ξ (Ev ) d η (v) = η (ξ x ) d ξ (x).
Exercise 5.2 Show that if F : M × N → M × N, F(x, v) = ( f (x), Fx (v)) is

measurable then the maps M → N, x → Fx (v) and Fx : N → N are measurable,

for every v ∈ N and x ∈ M.
Exercise 5.3 Check that the transition operators can be defined in terms of
the transition probabilities alone:

(1) P ϕ (v) = ϕ (w) d p(v, w) for every v ∈ N, and

(2) P ∗ η (B) = p(v, B) d η (v) for every measurable set B ⊂ N.
Exercise 5.4 Consider the situation in Example 5.1. Show that a probability
measure η is stationary for the projective cocycle PF : M × PRd → M × PRd
if and only if it satisfies η = ∑m
i=1 pi (Ai )∗ η .
Exercise 5.5 Calculate all the stationarymeasures ofthe projective cocycle

PF : X N × PR2 → X N × PR2 given by PF (αn )n∈N , [v] = (αn+1 )n∈N , [α0 v])
where X = {A1 , A2 } and p = p1 δA1 + p2 δA2 , with

σ 0
A1 = A−1 = for some σ > 1 and p1 , p2 > 0.
2 0 σ −1
Exercise 5.6 Let η be a stationary measure and ϕ : N → R be an η -stationary
function. Show that there is a stationary function ψ : N → R with ϕ (v) = ψ (v)
for η -almost every v ∈ N. Show that if ϕ is a characteristic function of some
subset of N then ψ may also be taken to be a characteristic function.
Exercise 5.7 Let η be a stationary measure and ϕn : N → R, n ≥ 1, be a mono-
tone sequence of η -stationary functions converging η -almost everywhere to
some bounded measurable function ϕ : N → R. Check that ϕ is η -stationary.
Exercise 5.8 Let φ : M → R be a bounded measurable function and C ∈ R.
(1) Show that if {x ∈ M : φ (x) > C} has positive μ -measure then there exist

k, x0 , . . . , xk−1 such that φ (x) d μ (xk , xk+1 , . . . ) > C.

(2) Deduce that if φ (x) d μ (xk , xk+1 , . . . ) = C for every k, x0 , . . . , xk−1 then
φ (x) = C for μ -almost every x ∈ M.
Exercise 5.9 Check that the projection to N of a probability measure m in-
variant under a random transformation F : M × N → M × N need not be a
stationary measure for F.
Exercise 5.10 Check that a stationary measure η is ergodic if and only if it is
extremal; that is, if it cannot be written as a convex combination η = α1 η1 +
α2 η2 where α1 , α2 > 0 with α1 + α2 = 1, and the probability measures η1 ,
η2 are stationary and mutually singular (two measures η1 and η2 are mutually
singular if there is a measurable subset such that η1 (A) = 0 and η2 (Ac ) = 0).

5.7 Exercises 93
Exercise 5.11 Let F : M × N → M × N be a one-sided random transformation

and η be a stationary measure. Let ξ d η̂ (ξ ) be an ergodic decomposition of

η . Show that (μ × ξ ) d η̂ (ξ ) is an ergodic decomposition of the F-invariant
probability measure m = μ × η .
Exercise 5.12 Let (Xi , di ), i ∈ I, be separable, complete metric spaces, where
I ⊂ Z. Let X be the cartesian product Πi∈I Xi , endowed with the product topol-
ogy. Show that X is a separable space and
d(x, y) = ∑ 2−|i| min{1, di (xi , yi )}, x, y ∈ X, (5.21)
i∈I
defines a complete metric, compatible with the product topology on X.

Exercise 5.13 Let (Xi , di ), i ∈ I, be metric spaces, where I ⊂ Z, and let the
cartesian product X = ∏i∈I Xi be endowed with the distance (5.21). For any
bounded uniformly continuous function ϕ : X → R and any ε > 0, find a finite
set J ⊂ I and a bounded uniformly continuous function ψ : X → R such that
sup |ϕ − ψ | < ε and ψ (x) depends only on the coordinates x j with j ∈ J.
Exercise 5.14 Let (Xi , di ), i ∈ I be separable metric spaces, where I ⊂ Z,
and let the cartesian product X = ∏i∈I Xi be endowed with the distance (5.21).
Prove that:
(1) If I is finite then the following is true relative to the uniform topology, the
pointwise topology and the weak∗ topology: if (pn,i )n converges to pi for
every i ∈ I then (pn = ∏i∈I pn,i )n converges to p = ∏i∈I pi . Indeed:
(a) pn − p ≤ ∑i∈I pn,i − pi ;
(b) {E ⊂ X : pn (E) → p(E)} is a monotone class and contains every finite
union of measurable rectangles ∏i∈I Ei ;
(c) for any bounded uniformly continuous ϕ : X → R and ε > 0 there is a
countable partition {Rk : k ≥ 1} of X into rectangles with p(∂ Rk ) = 0
and |ϕ (z) − ϕ (w)| ≤ ε if z and w are in the same Rk .
(2) When I is infinite, the statement remains true for the weak∗ topology (use
Exercise 5.13) but, in general, not for the pointwise topology nor the uni-
form topology: Take Xi = [0, 1] and pn,i = (1 − n−1 )δ0 + n−1 δ1 .
Exercise 5.15 Consider the situation in Example 5.1. Let η be a probability
measure on N. Show that:
−1
(1) η is forward stationary if and only if η (B) = ∑m
i=1 pi η (Ai (B)) for every
measurable B ⊂ N;
(2) η is backward stationary if and only if η (B) = ∑mi=1 pi η (Ai (B)) for every
measurable B ⊂ N.

The sets of backward and forward stationary measures may be disjoint.
Exercise 5.16 Let φ : M → R be a bounded measurable function and

φk : M → R, φk (x) = φ (x) d μ (xk , xk+1 , . . . )
for each k ≥ 1. Use the martingale convergence theorem to prove that (φk (x))k
converges to φ (x) for μ -almost every x.
Exercise 5.17 Prove that an F-invariant probability measure m on M × N is
(1) an s-state if and only if m = μ − × m+ for some probability measure m+ on

M + × N; then m+ = Π+ +
∗ m and it is F -invariant;
(2) a u-state if and only if m = μ × m for some probability measure m− on
+ −
M − × N; then m− = Π− −
∗ m and it is F -invariant;
(3) an su-state if and only if m = μ × η where η , the projection of m to N, is
both forward stationary and backward stationary.
Exercise 5.18 Check that the conditions (5.11) and (5.12) are preserved by,
respectively, backward and forward iteration under the random transformation.
Namely,
(1) if m is a probability measure on M ×N that projects down to μ and satisfies

(5.11), then the same is true for F∗−1 m;
(2) if m is a probability measure on M ×N that projects down to μ and satisfies
(5.12), then the same is true for F∗ m.
Exercise 5.19 Prove the following.
(1) The space M (μ ) of probability measures on M × N that project down to μ

is non-empty and compact for the weak∗ topology.
(2) The push-forward F∗ : M (μ ) → M (μ ) is continuous.
(3) The subset of measures m ∈ M (μ ) satisfying (5.12) is invariant under F∗
and is closed in the weak∗ topology.
(4) Any Cesaro limit of the forward iterates of any element of this subset is an
F-invariant probability measure and a u-state for F.
(5) A dual construction yields s-states.
Exercise 5.20 Let m be a probability measure on M × N that projects down

to an f -invariant probability measure μ . Prove that:
(1) if f : M → M is invertible (two-sided shift) then m is F-invariant if and

only if (Fx )∗ mx = m f (x) for μ -almost every x ∈ M;

5.7 Exercises 95
(2) if f : M → M is a one-sided shift then m is F-invariant if and only if

m f (x) = (Fy0 )∗ m(y0 ,x1 ,...,xn ,...) d p(y0 )
for μ -almost every x = (x0 , x1 , . . . , xn , . . .) ∈ M.

Exercise 5.21 Let m be the lift of an F + -invariant probability measure m+ .
Show that m is an s-state if and only if any of the following conditions holds:
(1) m = μ − × m+ ;
(2) mx = m+ x+
for μ -almost every x ∈ M, where x+ = π + (x);
(3) (Fx+ )∗ mx+ = m+f + (x+ ) for μ + -almost every x+ ∈ M + ;
+
where {mx : x ∈ M} and {m+

x+
: x+ ∈ M + } are disintegrations of m and m+ .
Exercise 5.22 Prove that if m is an F-invariant probability measure on M × N
that projects down to μ then its ergodic components also project down to μ .
Exercise 5.23 Prove that if m is an s-state (respectively, a u-state) then its
ergodic components are s-states (respectively, u-states).
Exercise 5.24 Let F : M × R2 → M × R2 be an invertible locally constant
linear cocycle over a two-sided Bernoulli shift. Show that if F is pinching and
twisting then the projective cocycle PF admits a unique u-state and a unique
s-state, and they are distinct. Conclude that the forward stationary measure and
the backward stationary measure are also unique and distinct.

6
Exponents and invariant measures
The main theme in the present chapter is that the Lyapunov exponents and
Oseledets subspaces of a linear cocycle F may be retrieved from the invariant
measures and the stationary measures of the associated projective cocycle PF.
This theme will be substantiated by three important results.
A theorem of Ledrappier [80], which we discuss in Section 6.1, states that
the Lyapunov exponents of F coincide with the integrals of the function
A(x)v
Φ : M × PRd → R, Φ(x, [v]) = log (6.1)
v
with respect to the ergodic invariant probability measures of PF. Moreover,
any such probability measure is concentrated on the corresponding Oseledets
sub-bundle.
When the cocycle is strongly irreducible and locally constant, the largest
Lyapunov exponent is given by the integral of Φ with respect to any PF-
invariant measure of the form μ × η . That is the content of Furstenberg’s for-
mula [54], that we present in Section 6.2.
Finally, Section 6.3 is devoted to Furstenberg’s [54] criterion for the ex-
tremal Lyapunov exponents of locally constant SL(2)-dimensional cocycles to
coincide: if λ− = λ+ then either the cocycle lives in a compact subgroup or it
leaves invariant some finite subset of PR2 .
Throughout, PF : M × PRd → M × PRd is the projective cocycle associated
with a linear cocycle F : M × Rd → M × Rd defined over an ergodic transfor-
mation f : (M, μ ) → (M, μ ) by a measurable function A : M → GL(d). We
always take M to be a separable complete metric space and μ to be a Borel
measure on M such that log+ A±1 ∈ L1 (μ ). The extremal Lyapunov expo-
nents are denoted, indifferently, as λ± (F, μ ) or λ± (A, μ ).
For some of the results we take ( f , μ ) to be a Bernoulli shift and A to be
locally constant; that is, such that A(x) depends only on the zeroth coordinate
96

of x ∈ M. Then F and PF are random transformations, in the sense of the

previous chapter.
6.1 Representation of Lyapunov exponents

Let λ1 > · · · > λk be the Lyapunov exponents of the linear cocycle F (keep in
mind that we take ( f , μ ) to be ergodic) and then let Rd = Vx1 ⊃ · · · ⊃ Vxk ⊃ {0}
be the Oseledets flag of F (Theorem 4.1). When the cocycle is invertible, Rd =
Ex1 ⊕ · · · ⊕ Exk denotes the Oseledets decomposition of F (Theorem 4.2).
Theorem 6.1 (Ledrappier) Given any PF-invariant ergodic probability mea-

sure m on M × PRd that projects down to μ , there exists j ∈ {1, . . . , k} such
that
Φ dm = λ j and m {(x, [v]) : v ∈ Vxj \Vxj+1 } = 1. (6.2)
Conversely, given j ∈ {1, . . . , k} there is an PF-invariant ergodic probability

measure m projecting to μ and satisfying (6.2). When F is invertible, one may
replace Vxj \Vxj+1 by Exj in (6.2).
Proof Let m be PF-invariant and ergodic. On the one hand, by ergodicity,

1 An (x)v 1 n−1
lim log = lim ∑ Φ ◦ PF i (x, [v]) = Φ dm (6.3)
n n v n n
i=0
for m-almost every (x, [v]). On the other hand, for μ -almost every x ∈ M and
every [v] ∈ PRd , the expression on left-hand side of (6.3) converges to some

Lyapunov exponent λ j . It follows that Φ dm = λ j and m gives full weight to
the set
{(x, [v]) ∈ M × PRd : v ∈ Vxj \Vxj+1 }
of pairs (x, [v]) for which the limit in (6.3) is λ j . When F is invertible, one can
take both limits n → ±∞ in (6.3). We conclude that m gives full weight to the
set
{(x, [v]) ∈ M × PRd : v ∈ Exj }
of pairs (x, [v]) for which the limit in (6.3) is λ j for both n → +∞ and n → −∞.
To prove the converse statement we need a few auxiliary lemmas. Let M (μ )
denote the space of probability measures on M × PRd that project down to μ .
Lemma 6.2 The push-forward F∗ : M (μ ) → M (μ ) is well defined and con-

tinuous relative to the weak∗ topology.

98 Exponents and invariant measures
Proof Let (mn )n be a sequence in M (μ ) converging, in the weak∗ sense, to

some probability measure m. We want to show that F∗ mn converges to F∗ m;
that is,

(ϕ ◦ F) dmn → (ϕ ◦ F) dm (6.4)
for every bounded continuous function ϕ : M × PRd → R. Since f : M → M

and A : M → SL(d) are measurable, we can use Lusin’s theorem to find, for
every ε > 0, compact sets K ⊂ M such that μ (K c ) < ε and the restriction of F to
L = K ×PRd is continuous. Then, ϕ ◦F is continuous restricted to L and, by the
Tietze extension theorem, there is some continuous function ψ : M × PRd → R
such that sup |ψ | ≤ sup |ϕ | and ψ = ϕ ◦ F restricted to L. It follows that

(ϕ ◦ F) dmn − (ϕ ◦ F) dm ≤ ψ dmn − ψ dm+

+ ψ dmn − ψ dm + (ϕ ◦ F) dmn − (ϕ ◦ F) dm
Lc Lc Lc Lc
The first term on the right-hand side is smaller than ε if n is large enough,
because mn → m and ψ is continuous. The remain terms are bounded by

2 sup |ϕ | mn (Lc ) + m(Lc ) = 4 sup |ϕ | μ (K c ) ≤ 4 sup |ϕ | ε .
This proves (6.4), and so the argument is complete.
Lemma 6.3 M (μ ) is sequentially compact, relative to the weak∗ topology.

Proof We are going to use the fact that every Borel measure in a separable,
completely metrizable space is tight (see [114, Appendix A.3]). This means
that for any ε > 0 we may find a compact set K ⊂ M such that μ (K) > 1 − ε .
Then K × PRd is compact and m(K × PRd ) = μ (K) > 1 − ε for every m ∈
M (μ ). This proves that the set M (μ ) is tight. Hence, by the theorem of Pro-
horov (see [114, Theorem 2.1.8]), it is sequentially compact.
Lemma 6.4 Let x → Vx be a measurable sub-bundle of M × Rd . Then the

subset of probability measures m ∈ M (μ ) such that m({(x, [v]) : v ∈ Vx }) = 1
is closed in the weak∗ topology.
Proof Let (mn )n be a sequence in M (μ ) such that
mn ({(x, [v]) : v ∈ Vx }) = 1 for all n,
and which converges to some m in the weak∗ topology. We want to show that
m({(x, [v]) : v ∈ Vx }) = 1. Since the sub-bundle V is assumed to be measurable,
for every ε > 0 one can use Lusin’s theorem to find a compact set K ⊂ M
with μ (K) > 1 − ε and such that the restriction of V to K is continuous. Then

{(x, [v]) ∈ K × PRd : v ∈ Vx } is closed and so its m-measure is bounded below

by
lim sup mn ({(x, [v]) ∈ K × PRd : v ∈ Vx }) ≥ μ (K) > 1 − ε .
n
Since ε is arbitrary, this proves that m({(x, [v]) : v ∈ Vx )}) = 1.
We are now ready to conclude the proof of Theorem 6.1. Let j be fixed.
Let us start with the invertible case. Since the Oseledets sub-bundle x → Exj is
measurable it follows from Exercise 4.1 that it admits a measurable section;
that is, there exists a measurable vector field x → σ (x) such that σ (x) ∈ Exj
for every x. Let m0 be the probability measure in M (μ ) that admits δσ (x) as a
disintegration. In other words,

m0 (B) = μ {x ∈ M : (x, σ (x)) ∈ B}
for every measurable B ⊂ M × PRd . Then define, for n ≥ 1,
1 n−1
mn = ∑ (PF)i∗ m0 .
n i=0
It is clear that mn ({(x, [v]) : v ∈ Exj }) = 1 for every n ≥ 0. Then, by Lemma 6.4,
the same is true for every accumulation point of the sequence (mn )n . Using
Lemma 6.2, every accumulation point is a PF-invariant probability measure.
By Lemma 6.3, accumulation points do exist. Considering ergodic compo-
nents, one gets that there exists some PF-invariant ergodic probability measure
m such that m({(x, [v]) : v ∈ Exj }) = 1. Then Φ̃(x, [v]) = λ j for m-almost every

(x, [v]), and so Φ dm = Φ̃ dm = λ j . This finishes the proof in the invertible
case.
The general case can be deduced easily, through the following invertible
extension of F : M × Rd → M × Rd . Let (see [114, Section 2.4.2])
fˆ : M̂ → M̂, fˆ(. . . , xn , . . . , x−1 , x0 ) → (. . . , xn , . . . , x−1 , x0 , f (x0 ))
be the natural extension of f : M → M, defined on the space of pre-orbits
M̂ = {(xn )n≤0 : f (xn ) = xn+1 for every n < 0}.
Denote by π : M̂ → M the projection to the zeroth coordinate. There is a unique

fˆ-invariant measure on M̂ such that π∗ μ̂ = μ . Define Â = A ◦ π : M̂ → GL(d)
and then let F̂ : M̂ × Rd → M̂ × Rd be the linear cocycle defined by Â over fˆ.
Note that

±1
log Â d μ̂ = log A±1 d μ .

It is also clear that F and F̂ have the same Lyapunov exponents, and their
Oseledets flags (V i )i and (V̂ i )i are related by
V̂x̂i = Vπi (x̂) for every x̂ ∈ M̂ and every i.
By the previous paragraph, there exists some PF̂-invariant probability measure

m̂ on M̂ × Rd that projects down to μ̂ and satisfies m̂({(x, [v]) : v ∈ Êxj }) = 1.
Let m be the image of m̂ under the map π × id : M̂ × PRd → M × PRd . Then
m is a PF-invariant probability measure and

m {(x, [v]) : v ∈ Vxj \Vxj+1 } = 1.
because Êx̂j ⊂ Vπj(x̂) \ Vπj+1

(x̂)
for every x̂ ∈ M̂. Considering ergodic components,
one concludes that there exists some PF-invariant ergodic probability measure
m such that m({(x, [v]) : v ∈ Vxj \Vxj+1 }) = 1. The other claim in (6.2) follows
immediately, just as in the invertible case.
The following application of Theorem 6.1 will be useful later on. Take the
linear cocycle F : M × Rd → M × Rd to be locally constant and, for the time
being, invertible.
Proposition 6.5 For invertible locally constant cocycles,

(1) λ+ (F, μ ) = max Φ dm : m a u-state for PF},

(2) λ− (F, μ ) = min Φ dm : m an s-state for PF}.
Proof It follows from Theorem 6.1 and Exercise 5.22 that

λ− (F, μ ) ≤ Φ dm ≤ λ+ (F, μ ) (6.5)
for every PF-invariant probability measure m that projects down to μ . Thus, we

need to show that the maximum is realized by some u-state, and the minimum
is realized by some s-state.
Consider the Oseledets sub-bundle E 1 corresponding to the largest Lya-
punov exponent λ1 = λ+ (F, μ ). As observed in Remark 4.3, the subspace Ex1
depends only on the negative part x− = π − (x) of the point x. So we may find a
measurable section x → σ (x) to the sub-bundle E 1 such that σ (x) depends only
on x− . Let m0 be the probability measure in M (μ ) that admits {δσ (x) : x ∈ M}
as a disintegration. In other words,
m0 (B) = μ ({x ∈ M : (x, σ (x)) ∈ B})
for every measurable B ⊂ M × PRd . Then define, for n ≥ 1,
1 n−1
mn = ∑ (PF)∗j m0 .
n j=0

Using Lemma 6.2 one easily gets that every accumulation point of this se-
quence is an PF-invariant probability measure that projects down to μ . Accu-
mulation points do exist, by Lemma 6.3. By Exercise 5.18, every mn satisfies
the u-state condition (5.12). Hence, according to Lemma 5.19, every accumu-
lation point m is a u-state for PF. Note also that mn ({(x, [v]) : v ∈ Exj }) = 1 for
every n ≥ 0. Then, in view of Lemma 6.4, the same is true for every accumu-
lation point m. This implies that
1 n−1
lim
n
∑ Φ(F j (x, [v])) = λ1
n j=0

for m-almost every (x, [v]), and so Φ dm = λ1 = λ+ (F, μ ). This shows that
m realizes the maximum in (6.5), as required. A dual argument shows that the
minimum in (6.5) is realized by some s-state.
Now consider the non-invertible cocycles F ± and PF ± associated with F
and PF, as introduced in Section 5.4. We have seen that every PF-invariant
probability measure m is the lift of some PF + -invariant probability measure
m+ . Note that
Φ dm = Φ dm+
M×PRd M + ×PRd
because Φ depends only on the zeroth coordinate. Moreover, m is a u-state if

and only if m+ = μ + × η for some forward stationary measure η . In this case,

Φ dm = Φ d(μ + × η ) = Φ d(μ × η ).
M×PRd M + ×PRd M×PRd
Similar remarks apply to PF − -invariant probabilities and backward stationary

measures. Thus, Proposition 6.5 immediately gives
Corollary 6.6 For invertible locally constant cocycles,

(a) λ+ (F, μ ) = max Φ d(μ × η ) : η forward stationary for PF},

(b) λ− (F, μ ) = min Φ d(μ × η ) : η backward stationary for PF}
This also leads to the one-sided version of Proposition 6.5:
Proposition 6.7 For general locally constant cocycles,

λ+ (F, μ ) = max Φ d(μ × η ) : η a stationary measure for PF .
Proof This follows from the previous corollary applied to the invertible ex-
tension F̂ : M̂ × Rd → M̂ × Rd of the linear cocycle F : M × Rd → M × Rd .
More precisely, let fˆ : M̂ → M̂ be the two-sided shift on M̂ = X Z and let
Â : M̂ → GL(d) be given by Â(x̂) = A(π (x̂)), where π : M̂ → M is the canonical

projection. Define F̂(x̂, v) = ( fˆ(x̂), Â(x̂)v). It is clear that λ+ (F, μ ) = λ+ (F̂, μ̂ ),

where μ̂ = pZ . Moreover, a probability measure η on PRd is stationary for PF
if and only if it is stationary for PF̂.
6.2 Furstenberg’s formula

In this section we take the linear cocycle F : M × Rd → M × Rd to be locally
constant and the shift f : (M, μ ) → (M, μ ) to be one-sided.
6.2.1 Irreducible cocycles

A linear cocycle F is irreducible if there is no proper subspace of Rd invariant
under A(x) for μ -almost every x ∈ M; and it is strongly irreducible if there is
no finite family of proper subspaces invariant by A(x) for μ -almost every x.
Proposition 6.7 implies that the integral of the function (6.1) with respect
to some PF-invariant probability measure of the form m = μ × η is equal to
the largest Lyapunov exponent. We are going to see that if one assumes strong
irreducibility then the same is true for every invariant probability measure of
the form μ × η . That also implies that for j > 1 the measure m in Theorem 6.1
cannot be a product measure μ × η .
Theorem 6.8 (Furstenberg’s formula) If F : M × Rd → M × Rd is strongly

irreducible then
λ+ (F, μ ) = Φ d(μ × η )
for any stationary measure η of the associated projective cocycle PF.
Proof Let m = μ × η . By the ergodic theorem, the Birkhoff average
1 An (x)v 1 n−1

Ψ(x, [v]) = lim log = lim ∑ Φ ◦ PF i (x, [v])
n n v n n
i=0

exists at m-almost every point, and it satisfies Ψ dm = Φ dm. By the theo-
rem of Oseledets, there exists M0 ⊂ M such that μ (M0 ) = 1 and
1 An (x)v
lim log = λ1 for every x ∈ M0 and v ∈
/ Vx2 .
n n v
So, it suffices to check that the set of pairs (x, v) with v ∈ Vx2 has zero m-
measure. For this we need the lemma that follows. A measure ξ in a measur-
able space (X, B) is non-atomic if ξ ({x}) = 0 for every x ∈ X.

6.2 Furstenberg’s formula 103
Lemma 6.9 If the cocycle F is strongly irreducible then η (V ) = 0 for any

proper projective subspace V of PRd and any PF-stationary measure η . In
particular, every F-stationary measure η is non-atomic.
Proof Suppose that there is some proper projective subspace V with η (V ) >
0. Let d0 ≥ 1 be the smallest dimension of such subspaces, c be the maxi-
mum value of η (V ) over all subspaces of dimension d0 , and V be the family
of all subspaces V of dimension d0 such that η (V ) = c. Since subspaces are
closed subsets of PRd , we have η (V ) = c for any accumulation point V of
any sequence (Vn )n of subspaces of dimension d0 such that η (Vn ) → c. By
compactness of the Grassmannian manifold G(d0 , d), accumulation points do
exist, and so the family V is non-empty. Moreover, V is finite: since η is a
probability measure and the elements of V are essentially disjoint (due to the
choice of d0 ) we have that #V ≤ 1/c. We write V = {V1 , . . . ,Vn }. Since

c = η (Vi ) = η A(x)−1 (Vi ) d μ (x)
and η (A(x)−1 (Vi )) ≤ c for all x ∈ M, we must have η (A(x)−1 (Vi )) = c for
μ -almost every A(x). In view of the choice of V , this implies that
A(x)−1 (Vi ) ∈ V
for μ -almost every x ∈ M. This means that the set V is invariant under al-
most every A(x0 ), which contradicts the strong irreducibility hypothesis. This
contradiction proves the first part of the lemma. The last part is a special case
(subspaces of dimension 1).

Hence, η (Vx2 ) = 0 for every x ∈ M and so m {(x, [v]) : v ∈ Vx2 } = 0. It

follows that Φ dm = Ψ dm = λ1 for any PF-invariant probability measure
of the form m = μ × η , as we wanted to prove.
6.2.2 Continuity of exponents for irreducible cocycles

Let I be the space of pairs (A, p) where A : X → SL(d) is a measurable func-
tion and p is a probability measure on X such that log+ A±1 ∈ L1 (p) and A
is strongly irreducible: no finite family of subspaces of Rd is invariant under
A(x) for p-almost every x. For any (A, p) ∈ I we denote μ = pN and ν = A∗ p.
Sometimes, we let ourselves view A as a function M → SL(d) depending only
on the zeroth coordinate. Then ν = A∗ μ .
A sequence of measures (ξk )k in GL(d) is called uniformly integrable if
given ε > 0, there exist R > 0 and k0 ≥ 1 such that

log+ B±1 d ξk (B) < ε (6.6)
{B>R}

for all k ≥ k0 and for both choices of the sign. This is the case, for example, if
there is some compact set K ⊂ GL(d) such that ξk (K) = 1 for every k. Theo-
rem 6.8 has the following interesting consequence:
Corollary 6.10 Let (A, p) ∈ I and (Ak , pk ), k ≥ 1 be a sequence in I such

that the probability measures νk = (Ak )∗ pk converge to ν = A∗ p in the weak∗
topology and are uniformly integrable. Then λ+ (Ak , μk ), k ≥ 1 converges to
λ+ (A, μ ) as k → ∞, where μk = pNk.
Proof For each k ≥ 1, define Fk (x, v) = ( f (x), Ak (x0 )v) and then let ηk be any
Fk -stationary measure. By Theorem 6.8,

λ+ (Ak , μk ) = Φ(B, [v]) d νk (B) d ηk ([v]), (6.7)

where Φ(B, [v]) = log B(v)/v . Up to restricting to a subsequence of an
arbitrary subsequence, we may suppose that ηk converges to some probabil-
ity measure η on PRd . Note that η is F-stationary, by Proposition 5.9. Thus,
Theorem 6.8 also gives

λ+ (A, μ ) = Φ(B, [v]) d ν (B)d η ([v]). (6.8)
So, we only have to show that the right-hand side of (6.7) converges to the
right-hand side of (6.8) when k → ∞. Let ε > 0. Since log+ A±1 ∈ L1 (p), we
may find R > 0 such that

log+ B±1 d ν (B) = log+ A(x)±1 d p(x) < ε .
{B>R} {A(x)>R}
The assumption of uniform integrability ensures that, increasing R if necessary,

the same holds for νk if k is sufficiently large. Now observe that
|Φ(B, [v])| ≤ log+ B + log+ B−1 for every (B, [v]) ∈ GL(d) × PRd .
Thus,

Φ d νk d ηk − Φ d ν d η ≤ 4ε (6.9)
{B>R} {B>R}
for every large k. Next, observe that νk × ηk converges to ν × η in the weak∗

topology (Exercise 5.14). Since the function Φ is continuous, using Exer-
cise 6.1 we get that

Φ d ν k d ηk − Φ dν dη ≤ ε (6.10)
{B≤R} {B≤R}
for every large k. Combining (6.7)–(6.10) we get that |λ+ (Ak , μk ) − λ (A, μ )| <
5ε for every k sufficiently large.

6.3 Theorem of Furstenberg

Let F : M × R2 → M × R2 be a locally constant linear cocycle defined by a
measurable function A : X → SL(2) over a one-sided shift f : M → M. We
continue to denote M = xN and μ = pN and ν = A∗ p.
The result that follows was stated before, in Theorem 1.2, but this time the
hypotheses are formulated directly in terms of the support of the measure ν ,
rather than the monoid B generated by the support. Exercises 6.6 and 6.7 show
that the two formulations are equivalent. In the next chapter we will discuss
extensions of the statement to arbitrary dimensions.
Theorem 6.11 (Furstenberg) Assume that
(a) the support of ν is not contained in a compact subgroup of SL(2) and

(b) there is no non-empty finite set L ⊂ PR2 such that B(L) = L for every B in
the support of ν .
Then the extremal Lyapunov exponents are distinct: λ− < 0 < λ+ .
Recall that the support of a measure θ on a topological space is the (closed)

subset supp θ of points whose neighborhoods all have positive measure. It fol-
lows from the theorem (and Exercises 6.6 and 6.7) that there are four possibil-
ities for SL(2)-cocycles:
(1) the cocycle is non-pinching: supp ν is contained in some compact sub-

group of SL(2);
(2) the cocycle is non-twisting, with either one or two invariant subspaces:
supp ν is contained in a triangular subgroup or a diagonal subgroup;
(3) the cocycle is non-twisting, with an invariant set formed by two subspaces
which are interchanged by some of the matrices;
(4) in all other cases, the Lyapunov exponents are distinct and non-zero.
This discussion is much refined by the following theorem of Thieullen [111],

which classifies SL(2)-cocycles up to conjugacy. A function φ : M → R is
cohomologous to zero mod r, for a given r > 0, if there exists a measurable
function u : M → R such that
φ + u ◦ f − u ∈ rZ almost everywhere. (6.11)
Given a set E ⊂ M, we define RE : M → SL(2) by

rotation of π /2 if x ∈ E
RE (x) =
id otherwise.

Theorem 6.12 (Thieullen) Let ( f , μ ) be ergodic. If A : M → SL(2) satisfies

log+ A±1 ∈ L1 (μ ) then there exists P : M → SL(2) such that the function
B(x) = P−1 ( f (x)) · A(x) · P(x) has one of the following forms, μ -almost every-
where:

cos θ (x) − sin θ (x)
(a) B(x) = with θ not cohomologous to zero mod π .
sin θ (x) cos θ (x)

a(x) b(x)
(b) B(x) = with log |a| d μ = 0 and log |b| ∈ L1 (μ ).
0 1/a(x)

cca(x) 0
(c) B(x) = RE (x) where log |a| ∈ L1 (μ ) and the charac-
0 1/a(x)
teristic function of E ⊂ M is not cohomologous to zero mod 2.

a(x) 0
(d) B(x) = with λ = log |a| d μ > 0.
0 1/a(x)
Moreover, B(x) and P( f (x))P−1 (x) are bounded above by A(x). In par-
ticular, the Lyapunov exponents of B are well defined and coincide with the
Lyapunov exponents of A. The Lyapunov exponents vanish in cases (a)–(c),
and they are ±λ in case (d).
In the remainder of the present section we prove Theorem 6.11.
6.3.1 Non-atomic measures

Let B : R2 → R2 be a non-zero linear map. When B is invertible, we denote by
B∗ θ the push-forward of a probability measure θ under the projective action
[v] → [B(v)]. When B is non-invertible (of rank 1, necessarily) the projective
action is defined at all points except [ker B] ∈ PR2 , and so B∗ θ is still defined
for every non-atomic probability measure θ . In this case, B∗ θ is the Dirac mass
at the image [B(R2 )] ∈ PR2 .
Lemma 6.13 The map (B, θ ) → B∗ θ is continuous, for B in the space of

non-zero linear maps of R2 , with the coefficients topology, and θ in the space
of non-atomic probability measures on PR2 , with the weak∗ topology.
Proof Let (Bn )n converge to B and (θn )n converge to θ . Consider any con-

tinuous function ϕ : PR2 → R. Then | (ϕ ◦ Bn ) d θn − (ϕ ◦ B) d θ | is bounded
by

(ϕ ◦ Bn ) d θn − (ϕ ◦ B) d θn + (ϕ ◦ B) d θn − (ϕ ◦ B) d θ . (6.12)

Suppose that B is invertible. Then the projective action of Bn converges to

the projective action of B, uniformly on PR2 . The first difference in (6.12) is
bounded by sup |ϕ ◦ Bn − ϕ ◦ B|, which goes to zero when n → ∞. The second
also goes to zero, because ϕ ◦ B is continuous and θn converges to θ . Since ϕ
is arbitrary, this proves that (Bn )∗ θn converges to B∗ θ in the weak∗ topology.
Now suppose B is non-invertible, of rank 1. Given ε > 0, fix a closed neigh-
borhood U of [ker B] ∈ PR2 such that θ (U) < ε . Then θn (U) < ε for every

large n. The difference | (ϕ ◦ Bn ) d θn − (ϕ ◦ B) d θ | is bounded by

( ϕ ◦ Bn ) d θ n − (ϕ ◦ B) d θ + (ϕ ◦ Bn ) d θn − (ϕ ◦ B) d θ . (6.13)
Uc Uc U U
The projective action of Bn converges to the projective action of B uniformly

on U c . So the same argument as before shows that the first difference in (6.13)
is less than ε if n is large enough. The second is bounded by 2ε sup |ϕ |. Since
ε is arbitrary, this proves that (Bn )∗ θn converges to B∗ θ also in this case.
We use Bt : Rd → Rd to denote the adjoint of a linear map B : Rd → Rd :
Bt (u) · v = u · B(v) for every u, v ∈ Rd .
Lemma 6.14 Let θ be a non-atomic probability measure on PR2 and (Bn )n

be a sequence in SL(2). The following conditions are equivalent:
(1) Bn → ∞ as n → ∞;
(2) every limit point of (Bn )∗ θ in the weak∗ topology is a Dirac mass δ[v] .
If (Bn )∗ θ → δ[v] then Btn (u)/Bn → |u · v|/v for every u ∈ R2 .
Proof Suppose that Bn does not converge to ∞. Then (Bn )n admits some
subsequence converging to an invertible linear map B : R2 → R2 . Then, by
Lemma 6.13, the corresponding push-forwards of θ converge to B∗ θ , which is
non-atomic. This shows that (2) implies (1). Now let us prove that (1) implies
(2) and the last conclusion in the statement. Suppose that Bn → ∞. Up to
restricting to a subsequence, we may suppose that Ln = Bn /Bn converges to
some L : R2 → R2 . Note that L = 1 and | det L| = limn 1/Bn 2 = 0, and so
L has rank 1. Let V = L(R2 ). By Lemma 6.13, the image (Bn )∗ θ = (Ln )∗ θ
converges to δV . Now fix unit vectors v ∈ V and w ∈ V ⊥ . Any vector u ∈ R2
may be written as u = (u · v)v + (u · w)w. Then
Btn u
= Ltn (u) → Lt (u) = |u · v|
Bn
because Lt (v) = ±v and Lt (w) = 0.

Corollary 6.15 The stabilizer H(θ ) = {B ∈ SL(2) : B∗ θ = θ } of any non-

atomic probability measure θ on PR2 is a compact subgroup of SL(2).
Proof The fact that H(θ ) is a subgroup of SL(2) is clear from the definition
and the fact that it is closed follows from Lemma 6.13. So, we only have to
prove that the norm of the elements of Hθ is bounded. That is a simple con-
sequence of Lemma 6.14: if there exists a sequence An ∈ H(θ ), n ≥ 1 with
An → ∞ then the sequence (An )∗ η accumulates on some Dirac mass; that is
not possible because this sequence is constant equal to η , which is non-atomic.
This contradiction proves that H(θ ) is indeed bounded in norm.
6.3.2 Convergence to a Dirac mass

The adjoint Ft of a linear cocycle F(x, v) = ( f (x), A(x)v) is the linear cocy-
cle defined over the inverse f −1 : M → M by F t (x, v) = ( f −1 (x), At (x)v), where
At (x) = A( f −1 (x))t . The n th iterate is given by (x, v) → ( f −n (x), At n (x)), where
At n (x) = At ( f −n+1 (x)) · · · At ( f −1 (x))At (x)

= A( f −n (x))t · · · A( f −2 (x))t A( f −1 (x))t .
The adjoint F t is not locally constant, because At (x) depends on the first (not
the zeroth) coordinate of x. However, F t is conjugate to the locally constant
cocycle F (x, v) = ( f −1 (x), A(x)t v) by the transformation (x, v) → ( f −1 (x), v).
We denote by PF t and PF the projective cocycles associated with F t and F ,
respectively. The notions of stationary measures for PF t and PF coincide.
Let η t be any PF t -stationary probability measure on PR2 . We are going to
analyze the iterates of η t under the matrices
An (x)t = A(x)t · · · A( f n−1 (x))t = At n ( f n (x)).
Lemma 6.16 For μ -almost every x ∈ M, there exists a probability measure

mx on PR2 such that An (x)t∗ η t → mx in the weak∗ topology.
Proof This is analogous to Lemma 5.22; we just describe the main steps.
Given any continuous function ϕ : PR2 → R and n ≥ 0, define

Ψϕ ,n (x) = ϕ ◦ An (x)t d η t .
Let F0 = {0,/ M} and, for each n ≥ 1, let Fn be the σ -algebra generated by the
cylinders [0 : Δ0 , . . . , Δn−1 ] with Δi ⊂ X a measurable set for i = 0, . . . , n − 1.
Then Fn ⊂ Fn+1 for every n ≥ 0. Since An (x)t depends only on x0 , . . . , xn−1 ,
so does the value of Ψϕ ,n (x). Hence, every Ψϕ ,n is Fn -measurable. Moreover,

for any cylinder C ∈ Fn ,

Ψϕ ,n+1 (x) d μ (x) = ϕ ◦ An+1 (x)t d η t d μ (x)
C C

= ϕ ◦ An (x)t d η t d μ (x)
C
because η t is PF t -stationary. This proves that (Ψϕ ,n , Fn ) is a martingale. It

follows that there exists a measurable function Ψϕ : M → R such that
Ψϕ (x) = lim Ψϕ ,n (x)
for every x in some full μ -measure set M0 ⊂ M. Using the fact that the space
C0 (PR2 ) is separable, one can choose the full measure subset M0 to be the
same for every continuous function ϕ . By the Riesz representation theorem,
each linear map ϕ → Ψϕ (x), x ∈ M0 defines a probability measure mx on PR2 .
By construction, mx is the weak∗ limit of An (x)t∗ η t .
Lemma 6.17 Let mx be as in Lemma 6.16. Then An (x)t∗ Bt∗ η t → mx for μ -

almost every x and ν -almost every B ∈ SL(2).
Proof Given any continuous function ϕ : PR2 → R and any n ≥ 0, define

Ψϕ ,n : M → R as in the proof of Lemma 6.16:

Ψϕ ,n (x) = ϕ ◦ An (x)t d η t .

We claim that Ψϕ ,n Ψϕ ,n+1 d μ = Ψ2ϕ ,n d μ for every n ≥ 0. Indeed, the ex-
pression on the left-hand side may be written as

Ψϕ ,n (x) ϕ ◦ An (x)t ◦ A( f n (x))t d η t d μ (x).
Since Ψϕ ,n (x) and An (x)t depend only on x0 , . . . , xn−1 , and A( f n (x))t depends
only on xn , this is the same as

Ψϕ ,n (x) ϕ ◦ An (x)t ◦ A( f n (x))t d η t d p(xn ) d pn (x0 , . . . , xn−1 ).
Using the assumption that η t is PF t -stationary, this expression becomes

Ψϕ ,n (x) ϕ ◦ An (x)t d η t d pn (x0 , . . . , xn−1 ) = Ψ2ϕ ,n d μ .
That proves our claim. Now it follows that

2
Ψϕ ,n+1 − Ψϕ ,n d μ = Ψ2ϕ ,n+1 − 2 Ψϕ ,n Ψϕ ,n+1 + Ψ2ϕ ,n d μ

= Ψ2ϕ ,n+1 d μ − Ψ2ϕ ,n d μ .

Therefore, by telescopic cancellation,

l−1 2
∑ Ψϕ ,n+1 − Ψϕ ,n d μ = Ψ2ϕ ,l d μ − Ψ2ϕ ,0 d μ ≤ sup |ϕ |2 (6.14)
n=0
for every l ≥ 1. The integral on the left-hand side may be rewritten as

2
ϕ ◦ An (x)t A( f n (x))t d η t − ϕ ◦ An (x)t d η t d pn (x0 , . . . , xn−1 ) d p(xn );
that is,

2
ϕ ◦ An (x)t Bt d η t − ϕ ◦ An (x)t d η t d pn (x0 , . . . , xn−1 ) d ν (B)

2
= ϕ ◦ An (x)t Bt d η t − ϕ ◦ An (x)t d η t d μ (x) d ν (B).
Thus, (6.14) means that

∞
2
∑ ϕ ◦ An
(x)t t
B d η t
− ϕ ◦ An
(x)t
d η t
d μ (x) d ν (B) < ∞.
n=0
By Borel–Cantelli, this implies that

lim ϕ ◦ An (x)t Bt d η t − ϕ ◦ An (x)t d η t = 0 (6.15)
n
for (μ × ν )-almost every (x, B). Taking the intersection of those full measure
sets over all functions ϕ in some countable dense subset of C0 (PR2 ), we find
a full (μ × ν )-measure set of pairs (x, B) such that (6.15) holds; that is,

lim ϕ ◦ An (x)t Bt d η t = ϕ dmx ,
n
for all continuous functions ϕ : PR2 → R. This is the claim of the lemma.
Lemma 6.18 For μ -almost any x there is V (x) ∈ PR2 such that mx = δV (x) .
Proof By Lemma 6.9, the PF t -stationary measure η t is non-atomic. Hence,
by Corollary 6.15, the stabilizer H(η t ) is a compact subgroup of SL(2). By
Lemma 6.17, for μ -almost every x ∈ M,
lim An (x)t∗ Bt∗ η t = lim An (x)t∗ η t = mx for ν -almost every B.
n n
Fix x and let P be any accumulation point of the sequence An (x)t /An (x)t .
Then P = 1 and, by Lemma 6.13, the previous relation implies
P∗ Bt∗ η t = P∗ η t = mx for ν -almost every Bt .
If P was invertible then Bt∗ η t = η t for ν -almost every B, contradicting the fact
that the stabilizer of η t is compact. Thus, P has rank 1 and so mx = P∗ η t is the
Dirac measure δV (x) with V (x) = P(R2 ).

6.3.3 Proof of Theorem 6.11

Let η be any PF-stationary measure. By Theorem 6.8, the largest Lyapunov
exponent λ+ is given by

λ+ = Φ dη d μ . (6.16)
Our goal is to show that the integral is positive. Let V (x) be as in Lemma 6.18.
Corollary 6.19 / V (x)⊥ and μ -almost every x ∈
An (x)u → ∞ for every u ∈
M.
Proof We have seen in Lemma 6.18 that An (x)t∗ η t converges to a Dirac mass,
at μ -almost every point. Write V (x) = [v]. Then Lemma 6.14 gives that
An (x)u |u · v|
An (x)t → ∞ and →
A (x)
n t v
/ V (x)⊥ .
for every u. In particular, An (x)u → ∞ for every u ∈
Since η is non-atomic, by Lemma 6.9, the set {(x,V (x)) : x ∈ M} has zero
μ × η -measure. So, Corollary 6.19 implies that
n
An (x)v
lim ∑ Φ(F j (x, [v])) = lim log =∞ (6.17)
n
j=0
n v
for μ × η -almost every (x, [v]). We need the following abstract result:
Lemma 6.20 Let (X, X , m) be a probability space and T : X → X a measure-
preserving map. If ϕ ∈ L1 (X, m) is such that
n−1
lim ∑ ϕ (T j (x)) = +∞ for m-almost every x ∈ X,
n
j=0

then ϕ dm > 0.
j=0 ϕ ◦ T and, for each fixed ε > 0, define

Let Sn = ∑n−1
Proof j

Aε = {x ∈ X : Sn (x) > ε for all n ≥ 1} and Bε = T −k (Aε ).
k≥0
We claim that if Sn (x) → ∞ then x belongs to Bε for some ε > 0. Indeed,

suppose otherwise. Then x ∈ / B1/l 2 for any l ≥ 1. This means that T k (x) ∈
/ A1/l 2
for any k ≥ 0 and l ≥ 1. Equivalently, for any k ≥ 0 and l ≥ 1 there exists
n ≥ 1 such that Sn ( f k (x)) ≤ 1/l 2 . Then we may construct a sequence (nl )l
with n0 = 0 and
Snl ( f n+ ···+nl−1 (x)) ≤ 1/l 2 for every l ≥ 1.

By construction, Sn1 +···+nl (x) ≤ 1 + 1/4 + · · · + 1/l 2 does not converge to ∞.

This proves our claim. It follows from the hypothesis of the lemma that the
union of all Bε , ε > 0 has full measure.
Now let ψ be the Birkhoff average of ϕ and let τε be the sojourn time
in Aε ; that is, the Birkhoff average of the characteristic function XAε . The
hypothesis of the lemma implies that ψ (x) ≥ 0 for almost every x, and so

ϕ dm = ψ dm ≥ 0. Suppose, for contradiction, that the integrals vanish.
Then ψ = 0 almost everywhere. Given x ∈ Bε , let k ≥ 0 be the smallest integer
such that T k (x) ∈ Aε . Then
n−1 k−1 n−1
∑ ϕ (T j x) ≥ ∑ f (T j x) + ∑ ε XAε (T j x), for all n ≥ 1.
j=0 j=0 j=k
Dividing by n and sending n → ∞, we obtain 0 = ψ (x) ≥ ετε (x) for every

x ∈ Bε . It is clear that τε (x) for every x ∈
/ Bε . So, we conclude that m(Aε ) =

τε dm = 0. Then m(Bε ) also vanishes, for every ε > 0. This contradicts the
claim in the previous paragraph. This contradiction proves the lemma.

In view of (6.16) and (6.17), Lemma 6.20 gives that λ+ = Φ d η d μ is
strictly positive. This finishes the proof of Theorem 6.11.
6.4 Notes
Our presentation in Section 6.1 is a variation of the original proof of Theo-
rem 6.1, which was due to Ledrappier [80, § I.5].
Theorem 6.8 is taken from Furstenberg [54]. Corollary 6.10 is contained in
results of Furstenberg, and Kifer [57] and of Hennion [61]. Related results, for
some reducible cocycles, were obtained by Kifer and Slud [70, 73]. We will
return to the topic of continuity of Lyapunov exponents in Chapter 10.
Theorem 6.11 is also due to Furstenberg [54]. Several extensions have been
obtained, by Virtser [116], Guivarc’h [59], Royer [103], Ledrappier and Royer
[82] and others. This includes the proof that the statement remains true when
the base dynamics ( f , μ ) is just a Markov shift.
More recently, Bonatti, Gomez-Mont and Viana [37] extended the crite-
rion to Hölder-continuous cocycles with invariant holonomies over hyperbolic
homeomorphisms. The invariant measure only needs to have local product
structure. Currently, the strongest statements are due to Viana [113], for Hölder-
continuous cocycles over (non-uniformly) hyperbolic systems, and Avila, San-
tamaria and Viana [13], when the base dynamics is partially hyperbolic.
Two different generalizations of Furstenberg’s theorem are at the heart of

6.5 Exercises 113
the next two chapters: the extension of the Furstenberg criterion to arbitrary
dimension, which we discuss in Chapter 7, and the criterion for simplicity of
the whole Lyapunov spectrum, to be presented in Chapter 8.
6.5 Exercises
Exercise 6.1 Let Y ⊂ X be either open or closed. Suppose that (μn )n con-
verges to μ in the weak∗ topology on X and let (μY,n )n and μY be their nor-
malized restrictions to Y . Conclude that (μY,n )n converges to μY in the weak∗
topology on Y .
Exercise 6.2 Use Exercise 5.22 to check that for each j = 1, . . . , k, the prob-
ability measure m in the second part of Theorem 6.1 may be chosen to be
ergodic.
Exercise 6.3 Use Exercise 5.23 to check that Proposition 6.5 remains true if
one restricts to ergodic s-states and ergodic u-states.
Exercise 6.4 Show that Proposition 6.7 and the expressions (a) and (b) for the
Lyapunov exponents remain true if one restricts to ergodic stationary measures.
Exercise 6.5 Suppose that X = SL(2) and the probability measure p has
finite support; that is, it is of the form p = p1 δA1 + · · · + pm δAm . Moreover,
let A : M → SL(2) be the projection to the zeroth coordinate. Show that the
associated linear cocycle is strongly irreducible if and only if it is twisting.
Exercise 6.6 Prove that the following conditions are equivalent.
(1) The support of ν is not contained in any compact subgroup of SL(2) (hy-
pothesis (a) in Theorem 6.11).
(2) The support of ν is not contained in a set of the form GRG−1 where
G ∈ SL(2) and R ⊂ SL(2) is the group of rotations.
(3) The linear cocycle F is pinching: the monoid B generated by supp ν con-
tains matrices with arbitrarily large norm.
Exercise 6.7 Let F be pinching. Prove that the following conditions are
equivalent.
(1) There is no finite set L ⊂ PR2 such that B(L) = L for ν -almost every B ∈
SL(2) (hypothesis (b) in Theorem 6.11).
(2) There is no set L ⊂ PR2 with one or two elements such that B(L) = L for
ν -almost every B ∈ SL(2).

(3) The linear cocycle F is twisting: for any V0 , V1 , . . . , Vk ∈ PR2 there is

B ∈ B such that B(V0 ) = Vi for all i = 1, . . . , k.
Exercise 6.8 Calculate the stationary measures for the product of random
matrices in the second part of Exercise 1.2. Deduce that this cocycle is a point
of continuity for the Lyapunov exponents among locally constant cocycles.
Exercise 6.9 Let F be an SL(2)-cocycle. Use Lemma 6.9 and Corollary 6.15
to show that if F satisfies conditions (a) and (b) in Theorem 6.11 then there
exists no probability measure on PR2 such that B∗ η = η for every B ∈ supp ν .
Exercise 6.10 Let F be an SL(2)-cocycle. Prove the converse to Exercise 6.9
and deduce that if λ± = 0 then there exists some probability measure η on PR2
such that B∗ η = η for every B ∈ supp ν .
Exercise 6.11 Show that the cocycle F in Section 6.3.2 satisfies the hypothe-
ses of Theorem 6.11 if and only if F does.

7
Invariance principle
The main result in this chapter takes our understanding of the invariant mea-
sures of linear and projective cocycles, and their links to Lyapunov exponents,
one step further. Most important, it applies precisely when the information
provided by the theorem of Oseledets is poor, namely, when all Lyapunov ex-
ponents coincide and so the Oseledets decomposition is trivial.
As a motivation for the statement, let us start with a brief discussion of
the 2-dimensional case. Let F : M × R2 → M × R2 be an invertible locally
constant linear cocycle over a Bernoulli shift f : (M, μ ) → (M, μ ), and let
PF : M × PR2 → M × PR2 be the associated projective cocycle.
Suppose first that the Lyapunov exponents λ± of F are distinct. Then, as
we have seen in Section 5.5.3, there exists an s-state ms and a u-state mu such
that every PF-invariant probability measure that projects down to μ is a con-
vex combination of ms and mu . Moreover, these are the unique PF-ergodic
probability measures, and they determine the Lyapunov exponents:

λ+ = Φ dmu and λ− = Φ dms . (7.1)

When the Lyapunov exponents coincide, we still get that λ± = Φ dm for
every PF-invariant probability measure. Moreover, the invariance principle
that we are going to prove in this chapter provides a surprisingly precise char-
acterization of these measures: every PF-invariant probability measure that
projects down to μ is a product measure m = μ × η , where η is both forward
and backward stationary.
The invariance principle holds in arbitrary dimension, as we are going to
see in Sections 7.1 and 7.2. In Section 7.3 we deduce the following result,
which was originally due to Furstenberg [54]: if λ− = λ+ then there exists
some probability measure η on projective space that is invariant under A(x)
for almost every x. In Section 7.4 we observe that the latter is a very restrictive
115

116 Invariance principle
condition on the cocycle. Thus, one has λ− < λ+ for generic locally constant
cocycles.
7.1 Statement and proof

Let F̂ : M̂ × Rd → M̂ × Rd be an invertible linear cocycle defined over a two-
sided Bernoulli shift fˆ : (M̂, μ̂ ) → (M̂, μ̂ ) by a measurable Â : M̂ → GL(d)
that depends only on the zeroth coordinate. Let PF̂ : M̂ × PRd → M̂ × PRd be
the associated projective cocycle. Write M̂ = X Z , where X and M̂ are taken to
be separable complete metric spaces and μ̂ = pZ . Assume that log+ Â±1 ∈
L1 (μ̂ ) and let λ− and λ+ be the extremal Lyapunov exponents of F̂.
Theorem 7.1 (Linear invariance principle) If λ− = λ+ then every PF̂-inv-

ariant probability measure m̂ that projects down to μ̂ is both an s-state and a
u-state. Consequently, m̂ = μ̂ × η for some probability measure η on PRd that
is both forward and backward stationary.
Theorem 7.1 is a consequence of the following theorem of Ledrappier [81].

Let F : M ×Rd → M ×Rd be a linear cocycle defined over a measure-preserving
transformation f : (M, μ ) → (M, μ ) by a measurable function A : M → GL(d)
such that log+ A±1 ∈ L1 (μ ). Let PF : M × PRd → M × PRd be the associ-
ated projective cocycle. Note F, PF, and f are not assumed to be invertible.
Moreover, f is arbitrary (not necessarily a shift) and A need not be locally con-
stant either. We do assume that M is a separable complete metric space (and so
(M, μ ) is a Lebesgue space, see [114, Section 8.5.2]).
Theorem 7.2 (Ledrappier) Assume that λ− (F, x) = λ+ (F, x) for μ -almost

every x ∈ M. Then
m f (x) = A(x)∗ mx for μ -almost every x ∈ M,
for any disintegration {mx : x ∈ M} of any PF-invariant probability measure

m that projects down to μ .
Proof of Theorem 7.1 Let M = X N , μ = pN and f : (M, μ ) → (M, μ ) be the

one-sided Bernoulli shift. Define A : M → GL(d) by Â = A ◦ π + and then take
F : M × Rd → M × Rd to be the linear cocycle and PF : M × PRd → M × PRd
to be the projective cocycle defined by A over f . Then λ− (F, x) = λ− and
λ+ (F, x) = λ+ for μ -almost every x ∈ M. Given any PF̂-invariant probability
measure m̂, let m = Π+ ∗ m̂ be its projection to M × PR . We know from Corol-
d
lary 5.23 that the disintegrations {m̂x̂ : x̂ ∈ M̂} and {mx : x ∈ M} of m̂ and m,

respectively, are related by

m̂x̂ = lim Ân ( fˆ−n (x̂))∗ mπ + ( fˆ−n (x̂)) for μ̂ -almost every x̂ ∈ M̂.
n
Now, using Theorem 7.2 and the relations f ◦ π + = π + ◦ f and Â = A ◦ π + ,

Ân ( fˆ−n (x̂))∗ mπ + ( fˆ−n (x̂)) = An (π + ( fˆ−n (x̂)))∗ mπ + ( fˆ−n (x̂))
= m f n (π + ( fˆ−n (x̂))) = mπ + (x)
for μ̂ -almost every x̂. Substituting this in the previous expression we conclude
that m̂x̂ = mπ + (x) for μ̂ -almost every x̂. In particular, m̂x̂ depends only on π + (x),
and that proves that m̂ is an s-state. An entirely dual argument, using backward
iterates instead, proves that m̂ is a u-state.
7.2 Entropy is smaller than exponents

In this section we prove Theorem 7.2. Let m be a probability measure on M ×
PRd that projects down to μ and let {mx : x ∈ M} be a disintegration of m. For
each x, let J(x, ·) be the Radon–Nikodym derivative

d A(x)−1
∗ m f (x)
J(x, v) = (v) .
dmx
This means that
A(x)−1
∗ m f (x) = J(x, ·)mx + ξx (7.2)
for some positive measure ξx (totally) singular with respect to mx . The fibered
entropy of m is defined by

h(m) = − log J dm = − log J dm + ∞m({J = 0}). (7.3)
{J>0}

Note that {J>0} J dm = J dm ≤ 1, because the left-hand side of (7.2) is a
probability measure and ξx is a positive measure. Then, by convexity,

− log J dm ≥ − log J dm ≥ 0. (7.4)
{J>0} {J>0}
This shows that the integral in (7.3) is well defined and h(m) ∈ [0, +∞].
Proposition 7.3 If h(m) = 0 then A(x)∗ mx = m f (x) for μ -almost every x.
Proof The first inequality in (7.4) is strict unless log J is constant m-almost

everywhere. The second one is strict unless J dm = 1 (equivalently, ξx = 0
for μ -almost every x). Therefore, h(m) = 0 if and only if m({J = 0}) = 0,


and J dm = 1, and log J is constant on a full m-measure set. In particular,
if h(m) = 0 then J = 1 on a full m-measure set. That is, A(x)−1 ∗ m f (x) = mx
or, equivalently, A(x)∗ mx = m f (x) for μ -almost every x ∈ M. This proves the
claim.
Proposition 7.4 0 ≤ h(m) ≤ d(λ+ − λ− ).
Theorem 7.2 is a direct consequence of Propositions 7.3 and 7.4. In the

remainder of this section we prove Proposition 7.4.
7.2.1 The volume case

As a motivation for the proof, and for the definition of h(m), we begin by
discussing the case when m = μ × λ , where λ is the Haar measure on PRd .
This will not be used in the proof of the general case, so the reader may choose
to skip this section altogether.
In the case under consideration mx = λ for every x ∈ M, and so the Radon–
Nikodym derivative
A(x)−1
∗ λ
J(x, [v]) = ([v]) = | det DΦx ([v])|
dλ
where Φx : PRd → PRd , Φx ([v]) = [A(x)v] denotes the projective action of
A(x).
d
It follows that J(x, [v]) = | det A(x)| v/A(x)v at every point, and so

A(x)v]|
h(m) = d log d μ (x) d λ ([v]) − log | det A(x)| d μ (x)
v

=d Φ d μ dλ − log | det A(x)| d μ (x),

where Φ is the function in (6.1). Let m = mα d α be the ergodic decomposi-
tion of m = μ × λ . By Theorem 6.1, for each α there exists j = j(α ) such that

Φ dmα = λ j . Thus,
k k
Φ d μ dλ = ∑ c jλ j, ∑ c j = 1,
j=1 j=1
where c j is the total weight of ergodic components with j(α ) = j. By (4.40),

k
log | det A(x)| d μ (x) = ∑ λ j dim E j .
j=1

Therefore,
k
h(m) = ∑ (dc j − dim E j )λ j ≤ d(λ+ − λ− ). (7.5)
j=1
The last inequality is, obviously, not sharp.
7.2.2 Proof of Proposition 7.4.

Consider any Δ > λ+ − λ− ≥ 0. We want to prove that h(m) ≤ dΔ for any F-
invariant probability measure m that projects down to μ . For the time being,
let us assume that m is ergodic for F; the general case will be deduced at the
end.

Given ε > 0, define hε (m) = − log Jε dm where Jε = J + ε . By the mono-
tone convergence theorem, hε (m) increases to h(m) as ε decreases to 0. Sup-
pose, for contradiction, that h(m) > dΔ. Then, for any ε > 0 small enough,
hε (m) > dΔ + 4ε . (7.6)
Lemma 7.5 Each fiber {x} × PRd admits partitions Pn (x), defined for every
large n, such that
(1) #Pn (x) ≤ en(dΔ+2ε ) ,
(2) diam Pn (x) ≤ e−n(Δ+2ε ) ,
(3) mx (∂ Pn (x, v)) = 0 for every v ∈ PRd .
Figure 7.1 A refining sequence of partitions of PRd
Proof We start from a regular triangulation of the sphere Sd−1 into 2d sim-
plices, as pictured in Figure 7.1 for the case d = 3. We take the sphere to carry
the standard metric with curvature ≡ 1. By identifying antipodes, one obtains
a triangulation T1 of the projective space PRd into 2d−1 simplices. Each sim-
plex can be split into 2d−1 regular simplices, each of which is the image of
the original simplex by a (1/2)-contraction, as illustrated in Figure 7.1. By

repeating this procedure successively, we obtain an increasing sequence Tk of

triangulations of PRd satisfying
(a) #Tk = 2k(d−1) for every k ≥ 1

(b) C−1 ≤ 2k diam Tk (v) ≤ C for every k ≥ 1 and v ∈ PRd ,
where the constant C > 1 is independent of k and v.

By construction, the boundaries of the atoms of every Tk are contained in
finitely many projective hyperplanes. Thus, for each x ∈ M and k ≥ 1, we may
find an orthogonal transformation Bx,k : Rd → Rd such that the partition Tx,k =
Bx,k (Tk ) satisfies
(c) mx (Tx,k (v)) = 0 for every v ∈ PRd .
Moreover, properties (a) and (b) are not affected when one replaces Tk by
Tx,k , because the projective action of Bx,k is an isometry. Assuming ε is small
enough relative to Δ, for every large n ≥ 1 there exists some integer k ≥ 1 such
that
dΔ + 2ε Δ + 2ε logC
n ≥k≥n + .
(d − 1) log 2 log 2 log 2
Then the partition Pn (x) = Tx,k satisfies the conditions in the conclusion of
the lemma.
Let Pn (x, v) denote the atom of the partition Pn (x) that contains the point
v. For each 0 ≤ k ≤ n let Pn,k (x) be the partition of {x} × PRd given by the
pull-back of Pn ( f k (x)) under Ak (x):

Pn,k (x, v) = Ak (x)−1 Pn (F k (x, v)) for each v ∈ PRd .
Observe that Pn,0 (x, v) = Pn (x, v). Then define, for 0 ≤ k < n,

m f (x) Pn,k (F(x, v))
Jn,k (x, v) =
mx Pn,k+1 (x, v))

m f n (x) Pn (F n (x, v)) n−1
Jn (x, v) = = ∏ Jn,k (x, v)
mx Pn,n (x, v)) k=0
and, for each ε > 0,

n−1
Jn,k,ε (x, v) = Jn,k (x, v) + ε and Jn,ε (x, v) = ∏ Jn,k (x, v).
k=0
Lemma 7.6 sup0≤k≤n log Jn,k,ε − log Jε L1 (m) converges to zero as n → ∞.

Let us assume this fact for a while and use it to complete the proof of Propo-
sition 7.4. Since we assume m to be ergodic,
1 1 n−1 1 n−1
lim log Jn,ε = lim ∑ log Jn,k,ε ◦ F j = lim ∑ log Jε ◦ F j
n n n n n n
k=0 k=0

= log Jε dm = hε (m)
at m-almost every point (the second equality uses Lemma 7.6). Consequently,
1 1
lim sup log m f n (x) Pn (F n (x, v)) ≤ lim sup log Jn (x, v)
n n n n
1
≤ lim log Jn,ε (x, v) = −hε (m)
n n
for m-almost every (x, v). In particular, for every large n ≥ 1 there exists En ⊂
M × PRd such that m(En ) > 1/2 and

m f n (x) Pn (F n (x, v)) ≤ en(−hε (m)+ε ) for all (x, v) ∈ En .
Since every Pn (y) has at most en(dΔ+2ε ) atoms (Lemma 7.5), and recalling
(7.6), it follows that the m f n (x) -measure of the intersection of F n (En ) with the
fiber of f n (x) does not exceed
en(dΔ+2ε ) en(−hε (m)+ε ) ≤ e−nε .
Since x is arbitrary, we conclude that m(F n (En )) ≤ e−nε , contradicting the

fact that m(En ) > 1/2 for every n. This contradiction reduces the proof of
Proposition 7.4 to proving Lemma 7.6.
Lemma 7.7 sup0≤k≤n sup{diam Pn,k (x, v) : v ∈ PRd } converges to zero when
n → ∞, for μ -almost every x ∈ M.
Proof The derivative of the action of Ak (x)−1 in projective space is bounded

by Ak (x)Ak (x)−1 (Exercise 7.1). Hence, using Lemma 7.5,
diam Pn,k (x, v) ≤ Ak (x)Ak (x)−1 diam Pn (x, v)

≤ Ak (x)Ak (x)−1 e−n(Δ+2ε ) .
For μ -almost every x ∈ M,
1
lim log Ak (x)Ak (x)−1 = λ+ − λ− .
n n
Hence, there exists k0 (x) ≥ 1 such that
diam Pn,k (x, v) ≤ ekΔ e−n(Δ+2ε ) ≤ e−2nε for every k0 (x) < k ≤ n.

Fix k0 (x) and let K0 (x) = max{Ak (x)Ak (x)−1 : 0 ≤ k ≤ k0 (x)}. Then
diam Pn,k (x, v) ≤ K0 (x)e−n(Δ+2ε ) ≤ e−2nε
for every n sufficiently large. This proves that, for μ -almost every x ∈ M, there
exists n0 (x) ≥ 1 such that sup0≤k≤n supv diam Pn,k (x, v) ≤ e−n2ε for every n ≥
n0 (x). That implies the conclusion of the lemma.
Lemma 7.8 Let X be a complete metric space and η0 , η1 be Borel probability

measures on X such that η1 ≥ cη0 . Let ρ : X → [0, +∞] be the Radon-Nikodym
derivative ρ = d η1 /d η0 and, given any countable partition P of X, define
ρP (x) = η1 (P(x))/η0 (P(x)) for x ∈ X. Then:

(1) log ρ d η0 ≤ log ρP d η0 ≤ 0;
(2) given ε > 0 there is δ > 0 such that log ρP − log ρ L1 (η0 ) ≤ ε for any
countable partition P with η0 (∂ P) = 0 and diam P < δ .

Proof By convexity, log ρP d η0 ≤ log ρP d η0 = 0. Similarly,

log ρ d η0 ≤ η0 (P(x)) log ρP (x)
P(x)

for every atom P(x), and so log ρ d η0 ≤ log ρP d η0 . This proves part (1)
of the lemma. Next, note that the functions ρP satisfy a uniform integrability
condition: for all Y ⊂ X with η0 (Y ) < 1/e,

| log ρP | d η0 ≤ −η0 (Y )(log η0 (Y ) + log c). (7.7)
Y
Indeed, the assumption implies − log ρP ≤ − log c and so the claim is trivial if
log ρP happens to be negative on Y . When log ρP ≥ 0 on the set Y , the claim
follows from convexity:

η0 η0 η1 (P(Y )) 1
log ρP d ≤ log ρP d ≤ log ≤ log ,
Y η0 (Y ) Y η0 (Y ) η0 (Y ) η0 (Y )
where P(Y ) denotes the union of all atoms of P that intersect Y . The gen-
eral case of (7.7) is handled by splitting Y into two subsets where log ρP has
constant sign. Next, we claim that if R refines Q then
log ρR − log ρQ L1 (η0 ) ≤ log ρ − log ρQ L1 (η0 ) . (7.8)
To see that this is so, write

| log ρR − log ρQ | d η0 = ∑ | log ρR − log ρQ | d η0 ,
R⊂Q R

where the sum is over the pairs of atoms R ∈ R and Q ∈ Q with R ⊂ Q. Since
ρR and ρQ are constant on each R ∈ R, this may be rewritten as

∑ 0η (R) | log ρ R − log ρ Q | = ∑ log ρ d η0 − η0 (R) log ρ Q
R
R⊂Q R⊂Q

≤ ∑ | log ρ − log ρQ | d η0 .
R⊂Q R
The combination of these two relations proves (7.8).

We are ready to prove part (2) of the lemma. Let (Qn )n be any refining
sequence of partitions with m(∂ Qn ) = 0 and diam Qn → 0. By the martingale
convergence theorem (Theorem 5.21), ρQn → ρ at η0 -almost every point. By
uniform integrability (7.7), it follows that log ρQn → log ρ in L1 (η0 ). Given
ε > 0, fix n sufficiently large so that
log ρQn − log ρ L1 (η0 ) < ε /4. (7.9)
Given any partition P with η0 (∂ P) = 0, let R = P ∨ Qn (the coarsest par-

tition that refines both P and Qn ). Combining (7.8) with (7.9), we get
log ρR − log ρ L1 (η0 ) < ε /2. (7.10)
Now, let Y be the set of points x ∈ X such P(x) ⊂ Qn (x). Since η0 (∂ Qn ) = 0

we may find δ > 0 such that diam P < δ implies η0 (Y ) is small enough that
−η0 (Y )(log η0 (Y ) + log c) < ε /4. Then, combining (7.8) with the fact that
R(x) = P(x) for all x ∈ X \Y , we find

log ρP − log ρR L1 (η0 ) ≤ (| log ρP | + | log ρR |) d η0 < ε /2.
Y
Together with (7.10), this implies log ρP − log ρ L1 (η0 ) < ε , as claimed.
Proof of Lemma 7.6 First, apply Lemma 7.8 with X = PRd and c = ε and
P = Pn,k (x) and η0 = mx and η1 = A(x)−1
∗ m f (x) + ε mx . Note that
ρ = J + ε = Jε and ρP = Jn,k + ε = Jn,k,ε .

Moreover, Lemma 7.7 ensures that the diameter of P is small if n is large
enough. It follows from Lemma 7.8 that, for any γ > 0 and μ -almost every x,
one has
log Jn,k,ε − log Jε L1 (mx ) < γ for all 0 ≤ k < n (7.11)
if n is sufficiently large n. Note that log Jε L1 (mx ) ≤ 1 − log ε , because

log Jε dmx ≤ 1 and − log Jε dmx ≤ − log ε ,
{Jε ≥1} {Jε <1}

and analogously for Jn,k,ε . This shows that the left-hand side of (7.11) is uni-
formly bounded, and so we may use the bounded convergence theorem to de-
duce that
log Jn,k,ε − log Jε L1 (m) < 2γ for all 0 ≤ k < n
if n is sufficiently large. That finishes the proof of the lemma.
This proves Proposition 7.4 in the ergodic case. Now let m be an arbitrary

F-invariant probability measure that projects down to μ , and let m = mα d α
be an ergodic decomposition of m. Using Exercise 7.3, it follows that the claim
of Proposition 7.4 holds for m as well:

h(m) = − log J dmα d α = h(mα ) d α ≤ dΔ.
This completes the proof of Proposition 7.4 and Theorem 7.2.
7.3 Furstenberg’s criterion

Let us go back to the setting at the beginning of Section 7.1. Let ν = Â∗ μ̂ .
Among the consequences of the invariance principle is the following result of
Furstenberg [54] (the SL(2)-version appeared already in Exercise 6.10):
Theorem 7.9 (Furstenberg) If λ− = λ+ then there exists some probability

measure η on PRd such that B∗ η = η for every B ∈ supp ν . Indeed, this holds
for any PF̂-stationary measure η .
Proof By Proposition 5.6, the set of stationary measures is non-empty. Let

η be any stationary measure in PRd and let m be its lift to M × PRd . Then
m is an PF-invariant probability measure. By Theorem 7.1, m is an su-state
and so (Exercise 5.17) m = μ × η . Now the fact that m is PF̂-invariant means
(Exercise 5.20) that Â(x)∗ η = η for μ -almost every x ∈ M.
Example 7.10 Consider the locally constant cocycle defined by

2 0
A1 = and A2 = Rθ with θ ∈ R \ Q.
0 2−1
On the one hand, the A1 -invariant probability measures in PR2 are the convex
combinations of the Dirac masses at [(1, 0)] and [(0, 1)]. On the other hand, A2
has a unique invariant probability measure; namely, the Haar measure in PR2 .
Thus, the two matrices have no common invariant probability measure. Hence,
Theorem 7.9 implies that λ− < 0 < λ+ .

Let us comment a bit on the relations between Theorems 6.11 and 7.9. The
former was stated for non-invertible cocycles, whereas presently we deal with
the invertible case. However, that distinction is clearly irrelevant here: as we
have observed before (for instance, while proving Proposition 6.7), every lo-
cally constant cocycle has a trivial extension to an invertible locally constant
cocycle. Observe also that any compact subgroup of GL(d) must be contained
in SL(d), and so the two groups have the same compact subgroups.
It is not difficult to see that, up to these simple observations, Theorem 7.9
contains the high-dimensional version of Theorem 6.11:
Corollary 7.11 (Furstenberg) Assume that
(a) the support of ν is not contained in a compact subgroup of GL(d);

(b) the cocycle F̂ is strongly irreducible.
Then λ− < λ+ .
Proof According to Exercise 7.5, the conditions (a) and (b) imply that there
exists no probability measure η on PRd that is invariant under every B ∈
supp ν . The conclusion then follows from Theorem 7.9.
For SL(2)-cocycles, Theorem 6.11 also contains Theorem 7.9, as we saw in

Exercise 6.10. However, the analogue of Exercise 6.10 in dimensions d ≥ 3
(the converse to Exercise 7.5) is only partly true, by Exercises 7.6 and 7.7.
7.4 Lyapunov exponents of typical cocycles

Let X = {1, . . . , m} be a finite set with m ≥ 2 elements and let p be a probability
measure on X with p({i}) > 0 for every i ∈ X. Let f : M → M be the shift map
on M = pN and consider μ = pN . The set of functions A : X → GL(d) may
be identified with GL(d)m which, in turn, may be viewed as an open subset of
a Euclidean space. Let λ± (A) denote the extremal Lyapunov exponents of the
locally constant linear cocycle F : M × Rd → M × Rd defined by A over ( f , μ ).
Theorem 7.12 The subset Z of the functions A : X → GL(d) such that

λ− (A) = λ+ (A) is contained in a finite union of closed proper submanifolds
of GL(d)m . In particular, the closure of Z is nowhere dense and has volume
zero.
We prove this theorem later. Let us point out now that the conclusion can
be sharpened considerably. Exercise 7.8 is one example of this. Another is that
the arguments we are going to present remain valid if one replaces GL(d) by

the subgroup SL(d): just note that the curves B(t) defined in (7.12) and (7.14)
lie in SL(d) if the initial matrix A does.
7.4.1 Eigenvalues and eigenspaces

For each r, s ≥ 0 with r + 2s = d, let G(r, s) be the subset of matrices A ∈ GL(d)
having r real eigenvalues and s pairs of (strictly) complex conjugate eigenval-
ues, such that all the eigenvalues that do not belong to the same complex conju-
gate pair have distinct norms. Every G(r, s) is open and the complement of the

union G = r,s G(r, s) is a small subset of GL(d), meaning that it is contained
in a finite union of closed proper submanifolds.
For A ∈ G(r, s), let λ1 (A), . . . , λr (A) be the real eigenvalues, in decreasing
order of the absolute value, and μ1 (A), μ̄1 (A), . . . , μs (A), μ̄s (A) be the pairs of
complex eigenvalues, also in decreasing order of the absolute value. Moreover,
let ξ j (A) ∈ Gr(1, d) be the eigenspace corresponding to each real eigenvalue
λ j (A) and ηk (A) ∈ Gr(2, d) be the invariant plane corresponding to each pair
of complex conjugate eigenvalues μk (A) and μ̄k (A).
We need to analyse how these eigenvalues and invariant subspaces vary in-
side G(r, s). Let us start with the case when all eigenvalues are real:
Lemma 7.13 The maps A → λ j (A) and A → ξ j (A) are smooth on G(d, 0) for
j = 1, . . . , d. Moreover, the map G(d, 0) → Gr(1, d)d , A → (ξ1 (B), . . . , ξd (B))
is a submersion.
Proof Each λ j (A) is a simple root of the polynomial det(A − λ id), and so it
has a smooth continuation on G(d, 0), given by the implicit function theorem.
Denote L j (A) = A − λ j (A) id. This matrix depends smoothly on A ∈ G(d, 0)
and, since λ j (A) remains a simple eigenvalue throughout, it always has rank
d − 1. Let L j (A)a be the adjoint matrix. Its entries are the cofactors of L j (A),
and so the adjoint is non-zero and varies smoothly with A. In particular, the
columns of L j (A)a are smooth functions of A. Moreover, L j (A) · L j (A)a =
det L j (A) id = 0, which implies that every non-zero column of L j (A)a is an
eigenvector for A, associated with the eigenvalue λ j (A). This shows that every
A → ξ j (A) is smooth. To check that the derivative of
ξ : A → (ξ1 (A), . . . , ξd (A)) ∈ Gr(1, d)d
is onto, consider any differentiable curve β (t) = (β1 (t), . . . , βd (t)) on Gr(1, d)d
such that β (0) = ξ (A). Take t → P(t) to be a differentiable curve in GL(d)
such that the columns of P(t) are non-zero vectors in the direction of the β j (t).
Define
B(t) = P(t) diag[λ1 (A), . . . , λd (A)]P(t)−1 . (7.12)

Then t → B(t) is also a differentiable curve in GL(d) with B(0) = A and

ξ (B(t)) = β (t) for all t. In particular, the derivative Dξ (A) maps B (0) to
β (0). So the derivative of ξ is surjective, as claimed.
Next we deal with complex eigenvalues. Given A ∈ GL(d), let μ be a com-
plex solution of det(A − μ id) = 0 and v ∈ Cd \ {0} be such that Av = μ v. Then,
since the matrix A is taken to be real, det(A − μ̄ id) = 0 and Av̄ = μ̄ v̄. Assuming
that μ ∈
/ R, the conjugate vectors v and v̄ are linearly independent and so (Ex-
ercise 7.9) the real vectors ℜv and ℑv are also linearly independent. Moreover,
the real plane ψ (v) = span{ℜv, ℑv} is invariant under A. This plane depends
only on the complex projective class of v and so we may consider
ψ : {[v] ∈ PCd : v and v̄ are linearly independent} → Gr(2, d). (7.13)
We invite the reader to check that this map is smooth and is a submersion
(Exercise 7.10).
Lemma 7.14 The maps A → λ j (A) and A → μk (A) and A → ξ j (A) and A →
ηk (A) are smooth on G(r, s), for j = 1, . . . , r and k = 1, . . . s. Furthermore,
G(r, s) → Gr(1, d)r × Gr(2, d)s , A → (ξ1 (A), . . . , ξr (A), η1 (A), . . . , ηs (A))
is a submersion.
Proof Smoothness of the eigenvalues λ j and μk follows from the implicit
function theorem. Recall that ξ j (A) ∈ PRd denotes the eigenspace associated
with each real eigenvalue λ j (A). Let ζk (A) ∈ PCd be the eigenspace associated
with each complex eigenvalue μk (A). The same arguments as in Lemma 7.13
imply that A → ξ j (A) and A → ζk (A) are smooth maps. By the observations
preceding this lemma, it follows that ηk (A) = ψ (ζk (A)) ∈ Gr(2, d) is also a
smooth function of A. Moreover (Exercise 7.10), to prove that the map in the
statement is a submersion it suffices to show that
ζ : A → (ξ1 (A), . . . , ξr (A), ζ1 (A), . . . , ζs (A)) ∈ (PRd )r × (PCd )s
is a submersion. Let α (t) = (β1 (t), . . . , βr (t), γ1 (t), . . . , γs (t)) be any differen-
tiable curve in (PRd )r × (PCd )s with α (0) = ζ (A). Define
B(t) = P(t) diag[λ1 (A), . . . , λr (A),
(7.14)
μ1 (A), μ̄1 (A), . . . , μs (A), μ̄s (A)] P(t)−1 .
where t → P(t) is a differentiable curve such that the columns of P(t) are
non-zero vectors u j (t) ∈ β j (t) and vk (t), v̄k (t) with vk (t) ∈ γk (t). Observe that
t → B(t) is a differentiable curve in GL(d), even if P(t) need not be real.
Moreover, B(0) = A and ζ (B(t)) = α (t) for all t. So Dζ (A) maps B (0) to
α (0). This shows that the derivative Dζ (A) is indeed surjective.


Since the complement of G is a small subset of GL(d), the set of A ∈ GL(d)m
such that Ai ∈
/ G for some i = 1, . . . , m is a small subset of GL(d)m . So, we may
restrict ourselves to the open set
G = {A ∈ GL(d)m : Ai ∈ G for every i = 1, . . . , m}.
In view of Theorem 7.9, it suffices to show that the subset of A ∈ G such that
the matrices Ai , i = 1, . . . , m admit a common invariant probability is small.
If Ai ∈ G(ri , si ) then, as all the eigenvalues have different absolute values, ev-
ery Ai -invariant probability measure on PRd is a convex combination of Dirac
masses on the eigenspaces ξ j (A), j = 1, . . . , ri and of probability measures
supported in the invariant planes ηk (A), k = 1, . . . , si (each ηk (A) is naturally
identified with a 1-dimensional subspace of the projective space). In particular,
the support must be contained in
Σ(Ai ) = {ξ1 (Ai ), . . . , ξri (Ai )} ∪ η1 (Ai ) ∪ · · · ∪ ηsi (Ai ).
Suppose first that d ≥ 4. Since the ξ j (Ai ) are points and the ηk (Ai ) are
lines in the projective space, and dim PRd ≥ 3, it follows from Lemmas 7.13
and 7.14 that there exists a small subset Z1 of G(r1 , s1 )×G(r2 , s2 )×GL(d)m−2
such that
ξa (A1 ) = ξb (A2 ) (7.15)
ξa (A1 ) ∈
/ ηe (A2 ) and ξb (A2 ) ∈
/ ηc (A1 ) (7.16)
ηc (A1 )∩ηe (A2 ) = 0/ (7.17)
for every A ∈ G(r1 , s1 )×G(r2 , s2 )×GL(d)m−2 \Z
1 and 1 ≤ a ≤ r1 , 1 ≤ b ≤ r2 ,
1 ≤ c ≤ s1 and 1 ≤ e ≤ s2 . This implies that Σ(A1 ) ∩ Σ(A2 ) = 0/ which, by the
remarks in the previous paragraph, implies that A1 and A2 have no common
invariant probability measure. This proves the theorem in dimension d ≥ 4.
Now suppose that d = 3. The previous arguments extend to this case when
either s1 = 0 or s2 = 0; that is, when the condition (7.17) is void. When s1 =
s2 = 1, there is one difficulty: since the projective space PR3 is only 2-dimen-
sional, one cannot force the pair of 1-dimensional submanifolds η1 (A1 ) and
η1 (A2 ) to be disjoint, as required in (7.17). This can be bypassed as follows.
By the same arguments as before, there exists a small subset of Z2 of
G(1, 1)2 × GL(d)m−2 such that every A ∈ G(1, 1) × G(2, 2) × GL(d)m−2 \ Z2
satisfies (7.15) and (7.16) and η1 (A1 ) = η1 (A2 ) (instead of (7.17)). Suppose
that A1 and A2 have some common invariant probability measure θ . Observing
that
Σ(A1 ) ∩ Σ(A2 ) = η1 (A1 ) ∩ η1 (A2 )

consists of a single point z in the projective space, θ has to be the Dirac mass
at z. Then z has to be a fixed point for Ai inside η1 (Ai ), for i = 1, 2. This is
impossible, because the eigenspace ηi (Ai ) contains no Ai -invariant line. This
contradiction proves the theorem in dimension d = 3.
Now we treat the case d = 2. If s1 = s2 = 0 then the conditions (7.16) and
(7.17) are void and we can use the same arguments as we did for d ≥ 4. If s1 = 0
and s2 = 1 then Σ(A1 )∩Σ(A2 ) consists of the two points ξ1 (A1 ) and ξ2 (A1 ). So,
if A1 and A2 have a common invariant probability, then this must be a convex
combination of the Dirac masses at the two eigenspaces of A1 . Then the union
of these eigenspaces has to be invariant under A2 , which is impossible because
the action of A2 ∈ G(0, 1) on the projective space is a rotation whose angle is
not a multiple of π . Up to exchanging the roles of the matrices, this also covers
the case d = 2 with s1 = 1 and s2 = 0.
The only remaining case, d = 2 with s1 = s2 = 1, requires an essentially new
ingredient that we are going to borrow from Douady and Earle [50]. Recall that
every matrix B ∈ GL(2) with positive determinant induces an automorphism
hB of the Poincaré half-plane H:

a b az + b
B= −→ hB (z) = . (7.18)
c d cz + d
The action of B on the projective plane may be identified with the action of hB
on the boundary of H, via
∂ H → PR2 , x → [(x, 1)]
(including x = ∞) so that B-invariant measures on the projective plane may

be seen as hB -invariant measures sitting on the real axis. It is also easy to
check that hB has a fixed point in the open disc H if and only if B ∈ G(0, 1).
Define φ (B) to be this (unique) fixed point. It is easy to see that B → φ (B) is
a smooth submersion: use the explicit expression for the fixed point extracted
from (7.18).
Lemma 7.15 If A1 , a2 ∈ G(0, 1) have a common invariant probability mea-

sure μ on ∂ H then φ (A1 ) = φ (A2 ).
Proof It is clear that A1 and A2 have no invariant measures with atoms of

mass larger than 1/3: such atoms would correspond to periodic points in the
projective plane, with periods 1 or 2, which are forbidden by the definition of
G(0, 1). In Proposition 1 of [50] a map μ → B(μ ) is constructed that assigns
to each probability measure μ with no atoms of mass ≥ 1/2 (see Remark 2 in
[50, page 26]) a point B(μ ) in the half-plane H, called conformal barycenter

of μ , such that
B(h∗ μ ) = h(B(μ )) for every conformal automorphism h : H → H.
When μ is Ai -invariant this implies hAi (B(μ )) = B((hAi )∗ μ ) = B(μ ), and so

the conformal barycenter must coincide with the fixed point φ (Ai ) of the au-
tomorphism hAi . Thus, if μ is a invariant measure for both A1 and A2 then
φ (A1 ) = B(μ ) = φ (A2 ).
Since φ is a submersion, Z3 = {A ∈ G(0, 1)2 × GL(d)m−2 : φ (A1 ) = φ (A2 )}

is a small subset of G(0, 1)2 × GL(d)m−2 . By Lemma 7.15, if A is in the com-
plement of Z3 then there are no common invariant probability measures.
Altogether, this proves that Z is a small subset of GL(d)m , as claimed in
the first part of Theorem 7.12. The second part is an immediate consequence,
because the closure of a small subset is still a small subset, and small subsets
have volume zero and, consequently, are nowhere dense.
7.5 Notes
Theorem 7.2 is a special case of the main result of Ledrappier [81]. Ledrap-
pier shows how to deduce both Furstenberg’s theorem (Theorem 7.9) and a
celebrated theorem of Kotani [77]. The proof in Section 7.2 is from Avila and
Viana [16], where a much more general statement is obtained.
The invariance principle for linear cocycles goes back to Bonatti, Gomez-
Mont and Viana [37]. They considered both locally constant cocycles and
Hölder-continuous cocycles with invariant holonomies over hyperbolic homeo-
morphisms with invariant probabilities having local product structure (the no-
tion of local product structure was defined in the Notes to Chapter 5 and that
of invariant holonomies will be discussed in Section 10.6). Viana [113] used a
non-uniform version of these methods to prove that almost all linear cocycles
over any hyperbolic system, uniform or not, have some non-vanishing Lya-
punov exponent.
The expression invariance principle was coined by Avila and Viana [16],
who extended the statement to smooth cocycles over hyperbolic homeomor-
phisms; that is, cocycles that act by diffeomorphisms on the fibers. Avila, San-
tamaria and Viana [13] further extended the statement to smooth cocycles over
partially hyperbolic volume-preserving diffeomorphisms. This turned the in-
variance principle into a very flexible tool, with several applications in the
theory of partially hyperbolic systems.
Among these, we mention the rigidity results of Avila and Viana [16], for
symplectic diffeomorphisms, and Avila, Viana and Wilkinson [17], for time-
one maps of geodesic flows, the construction of measures of maximal entropy

7.6 Exercises 131
by Rodriguez-Hertz, Rodriguez-Hertz, Tahzibi and Ures [64] and the results of

Viana and Yang [115] on the existence and finiteness of physical measures.
Theorem 7.9 and Corollary 7.11 extend Theorem 6.11 to arbitrary dimen-
sion. They were originally due to Furstenberg [54] and also hold more gen-
erally than stated (see [81, Corollary 1]): it suffices to assume that A+ ∩ A−
contains only zero and full measure sets, where A± is the σ -algebra generated
by the functions A ◦ f ±n , n ≥ 1. Clearly, this is the case if ( f , μ ) is a Bernoulli
shift and A is locally constant.
Section 7.4 is a simplified version of material from Bonatti, Gomez-Mont
and Viana [37], which has also been extended by Avila, Santamaria and Viana
[13] to cocycles over partially hyperbolic diffeomorphisms. In fact, for the 2-
dimensional case we follow the latter paper more closely.
7.6 Exercises
Exercise 7.1 Let N = PRd and Ψ : N → N be the projective action of some
B ∈ GL(d). Prove that, for every [v] ∈ N and ξ ∈ T[v] N,
(1) T[v] N = orthogonal complement of [v];
v
(2) DΨ([v])ξ = orthogonal projection of B(ξ ) to T[B(v)] N;
B(v)
(3) DΦ([v]) ≤ BB−1 and DΦ([v])−1 ≤ BB−1 ;

d
v
(4) | det DΨ([v])| = | det B| .
B(v)
Exercise 7.2 Let θ be a probability measure on PRd and consider any finite
family of projective hyperplanes H1 , . . . , Hn ⊂ PRd . Show that there exists
some orthogonal transformation B : Rd → Rd such that θ (B(H j )) = 0 for every
j = 1, . . . , n.
Exercise 7.3 Prove that if m is an F-invariant probability measure that projects
down to μ then almost every ergodic component mα projects down to μ .
Exercise 7.4 Prove that Corollary 6.15 extends to any dimension d ≥ 2.
Namely, that the stabilizer H(θ ) = {B ∈ GL(d) : B∗ θ = θ } of any non-atomic
probability measure θ on PRd is a compact subgroup of GL(d).
Exercise 7.5 Show that Exercise 6.9 extends to any dimension d ≥ 2. Namely,
that if (a) the support of ν is contained in no compact subgroup of GL(d) and
(b) the cocycle is strongly irreducible, then there exists no probability measure
on PRd such that B∗ η = η for every B ∈ supp ν .

Exercise 7.6 Show that if the support of ν is contained in a compact subgroup

of GL(d), d ≥ 2 then there exists some probability measure on PRd such that
B∗ η = η for every B ∈ supp ν .
Exercise 7.7 Show that there exist SL(3)-cocycles that are not irreducible
and yet admit no probability measure on PR3 with B∗ η = η for every B ∈
supp ν . Check that this is the case for any random product of A1 = shear
along the xy-plane, A2 = irrational rotation around the z-axis and A3 = hy-
perbolic map on the xy-plane cross identity along the z-axis, with probabilities
p1 , p2 , p3 > 0.
Exercise 7.8 Check that the submanifolds in Theorem 7.12 may be chosen
with codimension ≥ [m/2].
Exercise 7.9 Given v ∈ Cd , prove that the following conditions are equiva-
lent:
(1) v and v̄ are C-linearly dependent;
(2) ℜv and ℑv are R-linearly dependent;
(3) v = eiθ w for some θ ∈ R and w ∈ Rd .
Exercise 7.10 Show that the set R = {[w] ∈ PCd : w ∈ Cd \ {0}} is closed in
PCd and the map ψ : PCd \ R → Gr(2, d) in (7.13) is a submersion. This can
be done in the following steps.
(1) Let 1 ≤ i < j ≤ d be fixed. Given vectors u, v ∈ Rd , let πi, j (u, v) be the
2 × 2-matrix whose rows are (ui , vi ) and (u j , v j ), and πi,∗ j (u, v) be the (d −
2) × 2-matrix whose rows are (ul , vl ) for l ∈ / {i, j}. Denote by Gi, j the
subset of planes L ∈ Gr(2, d) admitting a basis {u, v} such that πi, j (u, v)
is invertible. Check that the matrix φi, j (L) = πi,∗ j (u, v)πi, j (u, v)−1 does not
depend on the choice of the basis, and these maps φi, j : Gi, j → R2(d−2)
form an atlas of Gr(2, d).
(2) Observe that the map ψ is given by ψ ([w]) = πi,∗ j (ℜw, ℑw)πi, j (ℜw, ℑw)−1
in such local coordinates. Deduce that ψ is smooth and the derivative is
given by
Dψ ([w])ẇ = πi,∗ j (ℜẇ, ℑẇ)πi, j (ℜw, ℑw)−1
− πi,∗ j (ℜw, ℑw)πi, j (ℜw, ℑw)−1 πi, j (ℜẇ, ℑẇ)πi, j (ℜw, ℑw)−1 ,
for every ẇ ∈ T[w] PCd . Given Ḃ ∈ Tψ ([w]) Gr(2, d) (a real (d − 2) × 2 ma-

trix), consider u̇, v̇ ∈ Rd such that πi,∗ j (u̇, v̇) = Ḃπi, j (ℜw, ℑw) and πi, j (u̇, v̇) =
0. Let ẇ = u̇ + iv̇ and check that Dψ ([w])ẇ = Ḃ. Conclude that the deriva-
tive of ψ is indeed surjective.

8
Simplicity
In the previous two chapters we found sufficient conditions for the largest and
smallest Lyapunov exponents λ− and λ+ to be distinct. We are now going to
see that, under only mildly stronger conditions, all Lyapunov exponents are
distinct.
Let F : M × Rd → M × Rd be the linear cocycle defined over a measure-
preserving transformation f : (M, μ ) → (M, μ ) by some measurable function
A : M → GL(d) such that log+ A±1 ∈ L1 (μ ). The Lyapunov spectrum of F
is simple if all the Lyapunov exponents have multiplicity 1; that is, if
dimVxj − dimVxj+1 = 1 for every j, (8.1)
where Rd = Vx1 > · · · > Vxk > {0} is the Oseledets flag in Theorem 4.1. In other
words, F has d distinct Lyapunov exponents. When F is invertible, condition
(8.1) may be written as dim Exj = 1 for every j, where Ex1 ⊕ · · · ⊕ Exk is the
Oseledets decomposition in Theorem 4.2.
The main result in this chapter (Theorem 8.1) is a general criterion for sim-
plicity. We restrict ourselves to the case when F : M × Rd → M × Rd is locally
constant, although the statement is a lot more general. That is, we suppose
that ( f , μ ) is a Bernoulli shift, either one-sided or two-sided, and the function
A : M → GL(d) depends only on the zeroth coordinate. Let ν = A∗ μ be the
probability measure on GL(d) obtained by push-forward of μ under A.
We need to recall a few elementary notions from linear algebra.
8.1 Pinching and twisting

The singular values of a linear operator B ∈ GL(d) are the positive square roots
σ1 , . . . , σd of the eigenvalues of the self-adjoint operator Bt B. They have the
following geometric interpretation: the image of the unit sphere under B : Rd →
133

134 Simplicity
Rd is an ellipsoid with semi-axes of length σ1 , . . . , σd . We always take the

singular values to be numbered in non-increasing order: σ1 (B) ≥ · · · ≥ σd (B).
The eccentricity of B is defined by
σk (B)
ecc(B) = inf .
1≤k≤d−1 σk+1 (B)
Given a (d − k)-dimensional subspace G of Rd , the hyperplane section dual
to G is the set of all F ∈ Gr(k, d) such that F ∩ G = {0}. This is a closed,
nowhere-dense subset of Gr(k, d).
A monoid is a set B endowed with an associative binary operation that
admits a unit element. The monoid associated with the linear cocycle F is the
set B ⊂ GL(d) of all products B1 · · · Bn with Bi ∈ supp ν for 1 ≤ i ≤ n and n ≥ 0
(for n = 0 interpret the product to be the identity). A monoid B ⊂ GL(d) is
• pinching if it contains matrices with arbitrarily large eccentricity;

• twisting if, given any 1 ≤ k ≤ d − 1, any F ∈ Gr(k, d), and any finite family
G1 , . . . , Gn ∈ Gr(d − k, d), there exists B ∈ B such that B(F) ∩ Gi = {0} for
every 1 ≤ i ≤ n.
We leave it to the reader to check that for d = 2 these conditions coincide with
those in the definitions of pinching and twisting in Section 1.2.
Theorem 8.1 Suppose that the monoid B associated with F is pinching and
twisting. Then the Lyapunov spectrum of the linear cocycle F is simple.
The hypotheses of the theorem are typical among locally constant cocycles
(Exercise 8.1). It is also interesting to compare these hypotheses with those
of the high-dimensional Furstenberg theorem. It is clear that pinching implies
the non-compactness condition (a) in Corollary 7.11 and the implication is
strict if d ≥ 3. Moreover, twisting implies the strong irreducibility condition
(b) in Corollary 7.11. Lemma 8.7 below contains a kind of converse to this
last observation.
8.2 Proof of the simplicity criterion

For proving Theorem 8.1, it is no restriction to suppose the Bernoulli shift
f : (M, μ ) → (M, μ ) to be two-sided, M = X Z and μ = pZ , and we do so.
Given B ∈ GL(d) and 1 ≤ k ≤ d − 1 such that σk (B) > σk+1 (B), define
Eku (B) ∈ Gr(k, d) and Eks (B) ∈ Gr(d − k, d) to be the subspaces given by:
• Eku (B) is spanned by the eigenvectors of Bt B with eigenvalue ≥ σk (B)2 ;

8.2 Proof of the simplicity criterion 135
• Eks (B) is spanned by the eigenvectors of Bt B with eigenvalue ≤ σk+1 (B)2 .

In geometric terms: Eku (B) is the pre-image of the subspace generated by the
k largest semi-axes, and Eku (B) is the pre-image of the subspace generated by
the d − k smallest semi-axes of the ellipsoid {B(v) : v = 1}.
Proposition 8.2 For each 1 ≤ k ≤ d − 1 there exists a measurable function
ξ u : M − → Gr(k, d) such that ξ̂ u = ξ u ◦ π − : M → Gr(k, d) satisfies
(1) ξ̂ u ( f (x)) = A(x)ξ̂ u (x) for μ -almost every x ∈ M.
(2) For μ -almost every x ∈ M,
σd−k (A−n (x))
→ ∞ and Ed−k
s
(A−n (x)) → ξ̂ u (x).
σd−k+1 (A−n (x))
(3) For every hyperplane section S ⊂ Gr(k, d) there exists a set of x− ∈ M −
with positive μ − -measure such that ξ u (x− ) ∈
/ S.
The proof of this proposition will be given in a while. Replacing k by d − k
and reversing time, we also obtain:
Proposition 8.3 For each 1 ≤ k ≤ d − 1 there exists a measurable function
ξ s : M + → Gr(d − k, d) such that ξ̂ s = ξ s ◦ π + : M → Gr(d − k, d) satisfies
(1) ξ̂ s ( f (x)) = A(x) · ξ̂ s (x) for μ -almost every x ∈ M.
(2) For μ −almost every x ∈ M,
σk (An (x))
→ ∞ and Eks (An (x)) → ξ̂ s (x).
σk+1 (An (x))
(3) For every hyperplane section S ⊂ Gr(d −k, d) there exists a set of x+ ∈ M +
with positive μ + -measure such that ξ s (x+ ) ∈
/ S.
Proof of Theorem 8.1 Let 1 ≤ k ≤ d − 1 be fixed. Let ξ u : M − → Gr(k, d) be
as in Proposition 8.2 and ξ s : M + → Gr(d − k, d) be as in Proposition 8.3. We
write each x ∈ M as (x− , x+ ) with x− = π − (x) and x+ = π + (x). We claim that
ξ u (x− ) ∩ ξ s (x+ ) = {0} for μ −almost every x ∈ M. (8.2)
Indeed, the set of x ∈ M for which this property holds is ( f , μ )-invariant. So, by
ergodicity, either the claim is true or else ξ u (x− ) ∩ ξ s (x+ ) = {0} for μ -almost
every x. By Fubini, the latter would imply that there exists x+ ∈ M + such that
ξ u (x− ) is contained in the hyperplane section dual to ξ s (x+ ) for μ − -almost
every x− ∈ M − , and that would contradict Proposition 8.2. This contradiction
proves (8.2).
Thus, we have an F-invariant measurable decomposition Rd = ξ̂ s (x)⊕ ξ̂ u (x)
defined at almost every point. We want to prove that the Lyapunov exponents of

136 Simplicity
F along ξ̂ u are strictly bigger than those along ξ̂ s . Denote ξns (x) = Eks (An (x)).
By Proposition 8.3, the sequence ξns converges to ξ̂ s at μ -almost every point.
In particular, the angle between ξns (x) and ξ̂ u (x) is bounded from zero for all
large n. So, we may choose α > 0, n0 ≥ 1, and a set Z ⊂ M with μ (Z) > 0 such
that
ξ̂ u (x) ⊂ C(ξns (x)⊥ , α ) and ξ̂ s (x) ∩C(ξns (x)⊥ , 4α ) = 0/ (8.3)
for every n ≥ n0 and x ∈ Z. Note that C(V, a) denotes the cone of radius a > 0
around a subspace V of Rd , as defined in Section 4.4.2. Fix β ∈ (0, 1/10) such
that (Exercise 4.10)
C(ξ̂ u (x), 4β ) ⊂ C(ξns (x)⊥ , 2α ).
Then, for any x ∈ Z and n ≥ n0 ,

An (x) C(ξ̂ u (x), 4β ) ⊂ C An (x)(ξns (x)⊥ ), 2α En (x) (8.4)
where
σk+1 (An (x))
En (x) = .
σk (An (x))
By Proposition 8.3, En (x) goes to zero as n → ∞. Thus, increasing n0 and
reducing Z if necessary, we may suppose 2α En (x) ≤ β for every n ≥ n0 and
x ∈ Z. Now, (8.4) and the fact that ξ̂ u is F-invariant imply that

ξ̂ u ( f n (x)) ∈ C An (x)(ξns (x)⊥ ), β .

Then C An (x)(ξns (x)⊥ ), β ⊂ C ξ̂ u ( f n (x)), 3β (Exercise 4.10). Altogether,
this proves that

An (x) C(ξ̂ u (x), 4β ) ⊂ C(ξ̂ u ( f n (x)), 3β ) (8.5)
for every n ≥ n0 and x ∈ Z.
Reducing Z once more, if necessary, we may assume that its first n0 iterates
are pairwise disjoint. Equivalently, the first return time r(x) ≥ n0 for every
x ∈ Z. Let g(x) : Z → Z, g(x) = f r(x) (x) be the first return map and G : Z ×Rd →
Z × Rd be the linear cocycle G(x, v) = (g(x), B(x)v) with B(x) = Ar(x) (x). Then
property (8.5) implies that every B(x) maps the cone C(ξ̂ u (x), 4β ) inside a
strictly smaller cone C(ξ̂ u ( f n (x)), 3β ). By (8.3), these cones are disjoint from
ξ̂ s (x). So, we are in a position to apply Proposition 4.19 to conclude that the
Lyapunov exponents of G along ξ̂ u (x) are strictly bigger than the Lyapunov
exponents of G along ξ̂ s (x). Then, by Proposition 4.18, the same is true for the
original cocycle F: the first k exponents are strictly larger than all the remaining
ones. Since k is arbitrary, this proves that all the Lyapunov exponents of F are
distinct, as claimed in the theorem.

Now let us prove Proposition 8.2 and Proposition 8.3.
8.3 Invariant section

Let Fk : M × Gr(k, d) → M × Gr(k, d) be the Grassmannian cocycle associated
with the linear cocycle F : M ×Rd → M ×Rd . That is, Fk (x,V ) = ( f (x), A(x)V )
for every (x,V ) ∈ M × Gr(k, d). For k = 1 this is just the projective cocycle
PF : M × PRd → M × PRd .
For the proofs of Propositions 8.2 and 8.3, we need to expand a bit more on
the formalism introduced in Section 4.3.2.
8.3.1 Grassmannian structures

Let E = Rd and ΛE be the disjoint union of the exterior powers Λk E with
0 ≤ k ≤ d. The exterior product on E is the bilinear map
∧ : ΛE × ΛE → ΛE,
characterized by (u1 ∧ · · · ∧ uk ) ∧ (v1 ∧ · · · ∧ vl ) = (u1 ∧ · · · uk ∧ v1 ∧ · · · ∧ vl ).
Let H be a hyperplane (that is, a codimension-1 linear subspace) of the vec-
tor space Λk E. Then H may be written as
H = {ω ∈ Λk E : ω ∧ υ = 0}
for some non-zero υ ∈ Λd−k E. We call the hyperplane geometric if υ may be
chosen to be a decomposable (d − k)-vector; that is, υ = υk+1 ∧ · · · ∧ υd for
some choice of vectors υi in E. Observe that,
ω ∈ H ⇔ ω ∧ υ = 0 ⇔ Ψ(ω ) ∩ Ψ(υ ) = {0} for any ω = ω1 ∧ · · · ∧ ωk
(the map Ψ was introduced in Section 4.3.2). Thus, the intersection of Gr(k, d)
with the projective image of the geometric hyperplane H coincides with the
hyperplane section of Gr(k, d) dual to the (d − k)-dimensional subspace Ψ(υ )
generated by the vectors υk+1 , . . . , υd . This shows that the hyperplane sections
of the Grassmannian manifold are, precisely, the intersections of Gr(k, d) with
the projective images of geometric hyperplanes of Λk E. As observed before,
hyperplane sections are closed nowhere-dense subsets of the Grassmannian
manifold.
A geometric subspace of Λk E is a finite intersection of geometric hyper-
planes. Similarly, a linear section of Gr(k, d) is a finite intersection of hyper-
plane sections. Equivalently, a linear section is the intersection of the Grass-
mannian with the projective image of some geometric subspace of the exterior
power.

138 Simplicity
We call linear arrangement in Λk E any finite union of geometric subspaces.

A linear arrangement in Gr(k, d) is a finite union of linear sections. Thus, a
linear arrangement in Gr(k, d) is the intersection of the Grassmannian manifold
with the projective image of some linear arrangement in Λk E. In particular, ev-
ery linear arrangement is a closed nowhere-dense subset of the Grassmannian.
Lemma 8.4 If {Lα : α ∈ I} is an arbitrary family of linear arrangements in

Gr(k, d) then α ∈I Lα coincides with the intersection of the Lα over a finite
subfamily.
Proof By definition, for each α ∈ I there exists some linear arrangement Lα

in Λk E such that Lα = PLα ∩ Gr(k, d). Let I1 ⊂ I be an arbitrary finite subset.
We may write

Lα = V1 ∪ · · · ∪Vm
α ∈I1
where every V j ⊂ Λk E is a geometric subspace. Renumbering the V j if neces-

sary, we may suppose that there is s ∈ {0, . . . , m} such that V j ⊂ α ∈I Lα if

and only if 1 ≤ j ≤ s. If s = m then α ∈I1 Lα = α ∈I Lα and so the claim is
proved. Otherwise, for each j ∈ {s + 1, . . . , n} there is α j ∈ I such that V j is not
contained in Lα j . Let I2 = I1 ∪ {αs+1 , . . . , αn }. Then

Lα = W1 ∪ · · · ∪Wn
α ∈I2
where Wi = Vi for i ≤ s and each Wi with i > s is a proper subspace of some V j

with j > s. In particular,

max dimW j : W j ⊂ Lα ≤ max dimV j : V j ⊂ Lα − 1.
α ∈I α ∈I
Thus, repeating this procedure not more than dim Λk E times, we find a finite

set Ir ⊂ I such that α ∈Ir Lα = α ∈I Lα . Then

Lα = P Lα ∩ Gr(k, d) = P Lα ∩ Gr(k, d) = Lα ,
α ∈I α ∈I α ∈Ir α ∈Ir
which proves the claim.
Corollary 8.5 The family of linear arrangements in Gr(k, d) is closed under

finite unions and arbitrary intersections.
Proof The statement about finite unions and finite intersections is an immedi-
ate consequence of the definition of linear arrangement. The claim for arbitrary
intersections follows directly from Lemma 8.4.

Corollary 8.6 Let L be a linear arrangement in Gr(k, d) and let B ∈ GL(d).

If B(L ) ⊂ L then B(L ) = L .
Proof Consider the non-increasing sequence Bn (L ), n ≥ 1. By Lemma 8.4

there exists p ≥ 1 such that Bn (L ) = B p (L ) for every n ≥ p. In particular,
B p+1 (L ) = B p (L ) and, applying B−p to both sides, this gives B(L ) = L .
8.3.2 Linear arrangements and the twisting property

A linear arrangement is proper if it is neither empty nor the whole Gr(k, d).
Lemma 8.7 A monoid B is twisting if and only if, for any 1 ≤ k ≤ d − 1,

there is no proper linear arrangement L in Gr(k, d) with B(L ) = L for all
B ∈ B.
Proof Suppose that there exists a proper linear arrangement L ⊂ Gr(k, d)

invariant under every B ∈ B. Fix F ∈ L and consider G1 , . . . , Gm ∈ Gr(d −
k, d) such that L is contained in the union of the hyperplane sections S1 ,
. . . , Sm dual to G1 , . . . , Gm . Then, for every B ∈ B, there exists j such that
B(F) ∈ S j or, equivalently, F ∩ G j = {0}. This shows that B is not twisting.
Conversely, suppose that B is not twisting. Then, there exist F, G1 , . . . , Gm
such that, for every B ∈ B, there exists j ∈ {1, . . . , m} such that B(F) belongs
to the hyperplane section S j dual to G j . Consider

L = B−1 S1 ∪ · · · ∪ Sm .
B∈B
Then L is a linear arrangement, by Corollary 8.5, and it is proper, since

F ∈ L . Moreover, B−1 (L ) ⊃ L for every B ∈ B. Using Corollary 8.6 we
conclude that B(L ) = L for every B ∈ B. This finishes the proof.
Corollary 8.8 The inverse monoid B −1 = {B−1 : B ∈ B} is pinching or

twisting if and only if B is pinching or twisting, respectively.
Proof By Exercise 8.2, any B and B−1 have the same singular values. Thus,
B −1 is pinching if and only if B is pinching. The claim about twisting is a di-
rect consequence of Lemma 8.7, because B(L ) = L if and only if B−1 (L ) =
L.
The proof of the following corollary mimics that of Lemma 6.9.
Corollary 8.9 If B is twisting then any Fk -stationary measure gives zero

weight to every proper linear arrangement in Gr(k, d).

140 Simplicity
Proof Suppose η (S ) > 0 for some linear section S ⊂ Gr(k, d). By defini-
tion, S = Gr(k, d) ∩ PS for some geometric subspace S ⊂ Λk E. Let d0 ≥ 1 be
the smallest dimension of a geometric subspace S such that η (Gr(k, d) ∩ PS) >
0 and let c > 0 be the supremum of η (Gr(k, d) ∩ PS) over all subspaces S of
dimension d0 .
Let V be the family of all S = Gr(k, d) ∩ PS such that S is a geometric sub-
space with dim S = d0 and η (S ) = c. We claim that V is non-empty. Indeed,
consider any sequence Sn = Gr(k, d) ∩ PSn where Sn is a geometric subspace
with dim Sn = d0 and η (Sn ) → c. Up to restricting to a subsequence, we may
suppose (Exercise 8.8) that (Sn )n converges to some geometric subspace S of
dimension d0 . Since S = Gr(k, d)∩PS is a closed subset of the Grassmannian,
it follows that η (S ) = c. This proves our claim.
We are also going to show that V is finite. Indeed, if Si = Gr(k, d) ∩ PSi ,
i = 1, 2 are two distinct elements of V then S1 = S2 then dim(S1 ∩ S2 ) < d0
and so S1 ∩ S2 = Gr(k, d) ∩ P(S1 ∩ S2 ) has zero η -measure. Therefore, #V ≤
1/c < ∞.
Now, consider any S ∈ V . Since η is Fk -stationary,

c = η (S ) = η (B−1 (S )) d ν (B)
and so η (B−1 (S )) = c for ν -almost every B. It follows that η (B−1 (S )) = c

for every B ∈ supp ν . This proves that B−1 (S ) ∈ V for all B ∈ B and S ∈

V . Consequently, L0 = S ∈V S is a proper linear arrangement in Gr(k, d),
and L0 is invariant under every B ∈ B. By Lemma 8.7, this contradicts the
assumption that B is twisting. This contradiction proves that η (S ) = 0 for
every linear section S , and so η (L ) = 0 for every linear arrangement L ⊂
Gr(k, d).
8.3.3 Control of eccentricity

Let 1 ≤ k ≤ d − 1 be fixed. Recall that Eku (B) ∈ Gr(k, d) is the subspace of
dimension k most expanded by B and Eks (B) ∈ Gr(d − k, d) is the subspace
of dimension d − k most contracted by B. They are defined when σk (B) >
σk+1 (B).
Lemma 8.10 Let (Bn )n be a sequence in GL(d) with σk (Bn )/σk+1 (Bn ) → ∞
and Bn (Eku (Bn )) → Eku and Eks (Bn ) → Eks as n → ∞. Let K be any compact
subset of Gr(k, d) which does not intersect the hyperplane section dual to Eks .
Then Bn (K) → Eku as n → ∞.
Proof Suppose that the conclusion is not true. Then, restricting to a subse-

quence if necessary, there exist Fn ∈ K such that Bn (Fn ) converges to some

F = Eku . Take hn ∈ Fn with hn = 1 such that
Bn (hn )
→h∈
/ Eku .
Bn (hn )
By Exercise 8.3, there exists c1 > 0 depending only on the distance from K to
the hyperplane section dual to Eks such that
Bn (hn ) ≥ c1 σk (Bn ) for every n. (8.6)
s (B−1 ), there exists c > 0 depending
Since (Exercise 8.2) Bn (Eku (Bn )) = Ed−k n 2
u
only on the distance from h to Ek such that
1 = hn = B−1 −1
n (Bn (hn )) ≥ c2 σd−k (Bn )Bn (hn ).
for all large n. Consequently, since σd−k (B−1

n ) = σk+1 (Bn )
−1 (Exercise 8.2),
σk (Bn )
1 ≥ c1 c2
σk+1 (Bn )
for all n large, which contradicts the hypothesis.
Lemma 8.11 Let (Bn )n be a sequence in GL(d). Suppose that there is F ∈
Gr(k, d) such that the set {F ∈ Gr(k, d) : Bn (F ) → F} is not contained in a
hyperplane section. Then σk (Bn )/σk+1 (Bn ) → ∞ and F = limn Bn (Eku (Bn )).
Proof Suppose that σk (Bn )/σk+1 (Bn ) is bounded along some subsequence.
Passing to a subsequence, and replacing Bn by Yn Bn Zn , with convenient Yn ,
Zn ∈ GL(d) such that Yn±1 and Zn±1 are bounded, we may assume that
there exist l < k < m such that
• σ j (Bn )/σk (Bn ) → ∞ for j ∈ {1, . . . , l},

• σ j (Bn ) = σk (Bn ) for j ∈ {l + 1, . . . , m},
• σ j (Bn )/σk (Bn ) → 0 for j ∈ {m + 1, . . . , d},
and there exists an orthonormal basis {e1 , . . . , ed }, independent of n, such that
• Bn (ei ) = σi (Bn )ei for every i and n.

Let E u , E c , E s be the spans of {e1 , . . . , el }, {el+1 , . . . , em }, {em+1 , . . . , ed }, re-
spectively. We claim that the set of F ∈ Gr(k, d) such that Bn (F ) → F is
contained in the hyperplane section dual to some G ∈ Gr(k, d). We split the
argument into three cases.
If F is not contained in E u ⊕ E c , take G to be any subspace of dimension
d − k containing E s . Observe that F ∩ E s = {0}, necessarily, and so F ∩ G =
{0}. That proves the claim in this case.

142 Simplicity
From now on, suppose F ⊂ E u ⊕ E c . If F does not contain E u , take G to be

any subspace of dimension d − k contained in E c ⊕ E s . Observe that F cannot
be transverse to E c ⊕ E s , and so dim F ∩ (E c ⊕ E s ) > k − l. Then
dim F ∩ (E c ⊕ E s ) + dim G > k − l + d − k = dim E c ⊕ E s
and that implies that F ∩ G = {0}.
Finally, suppose E u ⊂ F ⊂ E c ⊕ E u . Then F = E u ⊕ F c for some subspace
F ⊂ E c with dimension k − l. Take G to be any subspace of dimension d − k
c
that contains E s and intersects F c . Observe that F must be transverse to E c ⊕

E s and F ∩ (E c ⊕ E s ) must be the graph of a linear map F c → E s . Hence,
F ∩ G = {0}. The proof of our claim is complete.
Thus, we have shown that if {F ∈ Gr(k, d) : Bn (F ) → F} is not contained
in a hyperplane section, for some F ∈ Gr(k, d), then σk (Bn )/σk+1 (Bn ) → ∞.
Moreover, passing to a subsequence if necessary, we may assume that Eks (Bn )
converges to some Eks . Since {F ∈ Gr(k, d) : Bn (F ) → F} is not contained in
the hyperplane section dual to Eks , Lemma 8.10 gives that F = lim Bn (Eku (Bn )).
That completes the argument.
The statement and proof of the next result are akin to those of Lemma 6.14.
Lemma 8.12 Let (Bn )n be a sequence in GL(d) and β be any probability
measure on Gr(k, d).
(1) Assume that β gives zero weight to every hyperplane section and assume
σk (Bn )/σk+1 (Bn ) → ∞ and Bn (Eku (Bn )) → Eku . Then (Bn )∗ β converges in
the weak∗ topology to a Dirac mass on Eku .
(2) Assume that β is not supported in a hyperplane section and assume (Bn )∗ β
converges in the weak∗ topology to a Dirac mass on some Eku ∈ Gr(k, d).
Then σk (Bn )/σk+1 (Bn ) → ∞ and Bn (Eku (Bn )) → Eku .
Proof Let us first prove (1). Up to restricting to some subsequence of an
arbitrary subsequence, we may assume that Eks (Bn ) also converges to some
Eks . Take a compact set K disjoint from the hyperplane section dual to Eks and
such that β (K) > 1 − ε . Then (Bn )∗ β (Bn (K)) > 1 − ε and, by Lemma 8.10,
Bn (K) is close to Eku for all large n. This shows that (Bn )∗ β converges to the
Dirac measure at Eku . Now we prove (2). For each m ≥ 1, let Vm denote the
neighborhood of radius 1/m around Eku . The hypothesis implies that for each
m ≥ 1 there exists nm ≥ 1 such that

β B−1n (Vm ) > 1 − 2
−m
for every n ≥ nm .

Define X = ∞ ∞ −1
l=1 m=l Bnm (Vm ). Then β (X) = 1 and so, by the hypothesis on
β , the set X is not contained in any hyperplane section. Moreover, for every

F ∈ X one has Bnm (F ) → Eku as m → ∞. Using Lemma 8.11, we conclude that

σk (Bn )/σk+1 (Bn ) → ∞ and Bn (Eku (Bn )) → Eku restricted to the subsequence
(nm )m . The same argument can be applied starting from any subsequence of
(Bn )n . Thus, we have shown that σk (Bn )/σk+1 (Bn ) → ∞ and Bn (Eku (Bn )) → Eku
restricted to some subsequence of every subsequence. This implies the claim,
and so the proof of the lemma is complete.
8.3.4 Convergence of conditional probabilities

We call homogeneous measure any probability measure β in Gr(k, d) such
that β (L ) = 0 for every proper linear arrangement L . Corollary 8.9 states
that every Fk -stationary measure is homogeneous, if the associated monoid
B is twisting. Clearly, the support of a homogeneous measure can never be
contained in a hyperplane section.
Lemma 8.13 There exist θ > 0, m ≥ 1, and matrices P, Q, Qi , 1 ≤ i ≤ m in

B, with the following properties:
(1) P admits an invariant decomposition Rd = UP ⊕ SP where dimUP = k and

all eigenvalues of P restricted to UP are larger, in norm, than all eigenval-
ues of P restricted to SP .
(2) Q(UP ) ∩ SP = {0}
(3) For every G ∈ Gr(d − k, d) there exists i ∈ {1, . . . , m} such that the angle
between Qi (UP ) and G is at least 2θ .
Proof Since B is pinching, we may find Bn ∈ B such that σk (Bn )/σk+1 (Bn )
goes to infinity. Up to restricting to a subsequence, we may suppose that
Bn (Eku (Bn )) and Eks (Bn ) converge to subspaces Eku and Eks , respectively. Since
B is twisting, we may find B0 ∈ B such that B0 (Eku ) ∩ Eks = {0}. We are go-
ing to take P = B0 Bn for some large n. Fix c > b > a > 0, large enough so
that P(Eku (Bn )) is contained in the cone C(Eku (Bn ), a) for every large n. Since
σk (Bn ) is much larger than σk+1 (Bn ), the map P = B0 Bn sends C(Eku (Bn ), c) to
a thin cone around P(Eku (Bn )). Hence, assuming n is large enough,

P C(Eku (Bn ), c) ⊂ C(Eku (Bn ), b).
Then (Exercise 4.13), P admits a dominated decomposition as in claim (1).

Since B is twisting, there exists Q ∈ B as in claim (2). For the same reason,
given any G ∈ Gr(d − k, d) there exists QG ∈ B such that QG (UP ) ∩ G = {0}.
Moreover, one may take QG = QG for every G ∈ Gr(d − k, d) in a neighbor-
hood of G. So, by compactness of Gr(d − k, d), one can choose matrices Q1 ,

144 Simplicity
. . . , Qm such that

max | sin Qi (UP ), G)| > 0.
1≤i≤m
Using compactness once more, the expression on the left is bounded below by
some 2θ > 0.
By definition, P = P1 · · · PN(P) for some Pj ∈ supp ν for 1 ≤ j ≤ N(P) and

N(P) ≥ 1. Fix any such factorization of P and let r > 0. Define VP to be the set
of all products
P = P1 · · · PN(P)

with Pj in the r-neighborhood of Pj for 1 ≤ j ≤ N(P).
Define N(Q), VQ and N(Qi ), VQi , 1 ≤ i ≤ m analogously, starting from factor-

izations of Q and Qi , 1 ≤ i ≤ m instead. Take r > 0 to be small enough that the
VQi , 1 ≤ i ≤ m are pairwise disjoint and the conclusions of Lemma 8.13 remain
true, for slightly smaller θ > 0, when P, Q, and Qi , 1 ≤ i ≤ m are replaced by
any P ∈ VP , Q ∈ VQ , and Qi ∈ VQi , 1 ≤ i ≤ m. For each l ≥ 1, let VPl denote
the set of all products
P̃ = P(1) · · · P(l) , with P( j) ∈ VP for all 1 ≤ j ≤ l.

Assuming r > 0 is small, every P̃ ∈ l VPl admits a dominated decomposition
Rd = UP̃ ⊕ SP̃ as in part (1) of Lemma 8.13 (Exercise 4.13). Moreover, UP̃ is
uniformly close to UP , so that parts (2) and (3) remain valid as well.
Let CP denote the subset of x ∈ M such that A( f j−1 (x)) is in the r-neighbor-
hood of Pj for every j = 1, . . . , N(P). Note that CP is a cylinder of M, since A
is locally constant, and CP has positive μ -measure, because every Pj ∈ supp ν .
Analogously, starting from the chosen factorizations of Q and Qi , 1 ≤ i ≤ m, we
define cylinders CQ and CQi , 1 ≤ i ≤ m with positive μ -measure. For each l ≥ 1,
let CPl denote the subset of x ∈ M such that f jN(P) (x) ∈ CP for j = 0, . . . , l − 1.
Let nl,i = lN(P) + N(Q) + lN(P) + N(Qi ) and define Yl,i to be the subset of
points x ∈ M such that
f −nl,i (x) ∈ CPl and f −nl,i +lN(P) (x) ∈ CQ and

(8.7)
−nl,i +lN(P)+N(Q) −nl,i +lN(P)+N(Q)+lN(P)
f (x) ∈ CPl and f (x) ∈ CQi .
Then Yl,i is an nl,i -cylinder of M with μ (Yl,i ) > 0, for every l and i. For each l
fixed, the cylinders Yl,i , 1 ≤ i ≤ m are pairwise disjoint.
Given ε > 0, we say a probability measure ρ in Gr(k, d) is ε -concentrated
if there exists some ε -neighborhood Bε ⊂ Gr(k, d) such that ρ (Bε ) > 1 − ε .
Lemma 8.14 Given ε > 0 and any homogeneous measure β in Gr(k, d), there
is l0 ≥ 1 and, given any B0 ∈ B, there is i = i(B0 ) ∈ {1, . . . , m} such that B∗ β

is ε -concentrated, for every B = B0 Qi P Q P with Qi ∈ VQi , P ∈ VPl , Q ∈ VQ ,

P ∈ VPl , and l > l0 .
Proof Since β is homogeneous, it follows from (1) in Lemma 8.13 that, if l

is large, most of the mass of P∗ β is concentrated near the sum UP ∈ Gr(k, d)
of the eigenspaces of P associated with the k eigenvalues with largest norms.
Then most of the mass of (Q P )∗ β is concentrated near Q (UP ). By (2) in
Lemma 8.13, the latter is transverse to SP . So, if l is large then most of the mass
of (P Q P )∗ β is concentrated near the sum UP ∈ Gr(k, d) of the eigenspaces
associated with the k eigenvalues of P with largest norms. Choose θ > 0 as
in Lemma 8.13 and let GB0 ∈ Gr(d − k, d) be as in Exercise 8.6. The action of
all B0 ∈ B is equicontinuous restricted to the set of k-dimensional subspaces
whose angle to GB0 is at least θ . Let i = i(B0 ) be as in (3) of Lemma 8.13, for
G = GB0 . Then most of the mass of (Qi P QP )∗ β is concentrated near Qi (UP ),
and so most of the mass of B∗ β is concentrated near B0 Qi (UP ), if l is large
enough. Moreover, the equicontinuity property allows us to take the condition
on l uniform on B0 . This proves the lemma.
In what follows, take l0 = l0 (ε , β ) ≥ 1 to be as in Lemma 8.14.
Lemma 8.15 There is c0 > 0 and measurable sets Zl ⊂ M with μ (Xl ) > c0
for every l ≥ 1 such that, for every ε > 0, every homogeneous measure β in
Gr(k, d), and every x ∈ Zl with l > l0 , there exists n > l such that An ( f −n (x))∗ β
is ε -concentrated.
Proof For each l ≥ 1 and 1 ≤ i ≤ m, let nl,i and Yl,i be as defined in (8.7).
Define nl = maxi nl,i By ergodicity of ( f −nl , μ ), for μ -almost every x ∈ M
there exist infinitely many values of k ≥ 1 such that f −knl (x) ∈ Yl,1 ∪ · · · ∪Yl,m .
Choose k = k(x) minimum with this property and then let n = knl + nl,i and
B0 = Aknl ( f −knl (x)) and B = An ( f −n (x)).
The definitions give that B = B0 Qi P Q P with P , P ∈ VPl , Q ∈ VQ , and Qi ∈
VQi for some i ∈ {1, . . . , m}. Moreover (Exercise 8.7),
μ (Yl, j ) μ (CQ j )
μ (Zl ) ≥ min = min
1≤ j≤m μ (Yl,1 ) + · · · + μ (Yl.m ) 1≤ j≤m μ (CQ1 ) + · · · + μ (CQm )
where Zl is the set of points x ∈ M for which this i = i(B0 ). Let c0 > 0 denote
the expression on the right-hand side. To conclude, apply Lemma 8.14.
Proof of Proposition 8.2 Let η be any stationary measure for Fk : stationary

measures do exist, by Proposition 5.6. Let m be the lift of η to M × Gr(k, d).

146 Simplicity
By Corollaries 5.23 and 5.24, m is an Fk -invariant u-state and its disintegration

is given by
mx = lim An ( f −n (x))∗ η for μ -almost every x. (8.8)
n
Let Xl , l ≥ 1 be as in Lemma 8.15 and Z be the set of points x that belong to Xl

for infinitely many values of l. Then μ (Z) ≥ c0 and An ( f −n (x))∗ η accumulates
at a Dirac mass for all x ∈ Z. So, mx is a Dirac mass for every x ∈ Z. By the
ergodicity of ( f , μ ), it follows that mx is a Dirac mass for μ -almost every x.
Denote the support of this Dirac mass by ξ̂ u (x). Since m is a u-state, we can
factorize ξ̂ u = ξ u ◦ π− for some measurable ξ u : M − → Gr(k, d). The fact that
m is Fk -invariant means that ξ̂ u ( f (x)) = A(x)(ξ̂ u (x)) for μ -almost every x.
That proves claim (1). By Corollary 8.9, the support of η cannot be contained
in any hyperplane section. It follows from part (2) of Lemma 8.12 that
σk (An ( f −n (x)))
→ ∞ and An ( f −n (x))Eku (An ( f −n (x))) → ξ̂ u (x),
σk+1 (An ( f −n (x)))
for μ -almost every x. In view of Exercise 8.2, this is the same as
σd−k (A−n (x))
→ ∞ and Ed−k
s
(A−n (x)) → ξ̂ u (x),
σd−k+1 (A−n (x))
which is precisely claim (2) in the proposition. Moreover, claim (3) follows
directly from the observation that m projects to η in Gr(k, d) and the support
of η is not contained in any hyperplane section. This finishes the proof of
Proposition 8.2.
To deduce Proposition 8.3, consider the inverse F −1 : M × Rd → M × Rd ,
which is the linear cocycle defined over f −1 : M → M by
A−1 : M → GL(d), A−1 (x) = A( f −1 (x))−1 .
The associated monoid is B −1 = {B−1 : B ∈ B} and, by Corollary 8.8, this
monoid is pinching and twisting. Strictly speaking, F −1 is not locally con-
stant, because A−1 (x) depends on the first (not the zeroth) coordinate of x ∈ M.
However, the cocycle F : M × Rd → M × Rd defined over f −1 : M → M by
A : M → GL(d), A (x) = A(x)−1
is locally constant, and it is conjugate to F −1 by the map (x, v) → ( f −1 (x), v)
on M × Rd . In particular, the two cocycles have the same associated monoid
B −1 . Applying Proposition 8.2 to the cocycle F , with k replaced by d − k, and
then translating the conclusions to F −1 through the conjugacy, we obtain the
claims in Proposition 8.3.
Now the proof of Theorem 8.1 is complete.

8.4 Notes 147
8.4 Notes
Simplicity criteria were first proven in the special case of products of random
matrices. Guivarc’h and Raugi [60] obtained sufficient conditions (strong ir-
reducibility together with the contraction property) for the extremal Lyapunov
exponents λ± to be simple. Using exterior powers as in Section 4.3.2, this
yields sufficient conditions for simplicity of all Lyapunov exponents.
A more direct criterion for simplicity of the whole Lyapunov spectrum of
a product of random matrices was given by Gol’dsheid and Margulis [58],
based on considering the action of the cocycle on the Grassmannian manifolds.
The contraction property in [60] is replaced by a more explicit condition on
the Zariski closure of the group generated by the support of the probability
measure p.
Bonatti, Viana [38] extended the criterion in [60] to Hölder-continuous lin-
ear cocycles with invariant holonomies and Avila and Viana [14] obtained a
similar extension for [58]. In either case, the base dynamics is assumed to be
a hyperbolic homeomorphism endowed with an invariant probability measure
with local product structure.
A main application was the proof of the Zorich–Kontsevich conjecture on
the Lyapunov spectrum of the Teichmüller flow on strata of Abelian differen-
tials, by Avila and Viana [15]. See also the proofs, by Cambrainha [42], that
typical symplectic cocycles have only non-zero Lyapunov exponents, and by
Herrera [63], that certain multidimensional continued fraction algorithms have
simple Lyapunov spectra.
The presentation in this chapter was adapted from Avila and Viana [14, 15].
8.5 Exercises
Exercise 8.1 Let m ≥ 2 and S be the set of all (A1 , . . . , Am ) ∈ GL(d)m such
that the monoid generated by the set {A1 , . . . , Am } is pinching and twisting.
Prove that S is open and has full Lebesgue measure in GL(d)m .
Exercise 8.2 Prove that, for every B ∈ GL(d) and 1 ≤ k ≤ d,
σk (B−1 ) = σd−k+1 (B)−1 and σk (Bt ) = σk (B).

Moreover, assuming that σk (B) > σk+1 (B), show:
(1) Eks (B) and Eku (B) are orthogonal complements to each other;
(2) B(Eks (B)) and B(Eku (B)) are orthogonal complements to each other;
(3) Eks (B−1 ) = B(Ed−k
u (B)) and E u (B−1 ) = B(E s (B));
k d−k

148 Simplicity
(4) Eks (Bt ) = B(Eks (B)) and Eku (Bt ) = B(Eku (B)).
Exercise 8.3 Let B ∈ GL(d) and 1 ≤ k ≤ d − 1 be such that σk (B) > σk+1 (B).
Prove that:
(1) B(h) ≥ σk (x)h+ k for every h ∈ R , where h represents the component

d +
u
of h along Ek (B);
(2) if F ∈ Gr(k, d) is transverse to Eks (x) then B(h) ≥ cσk (B)h for all
h ∈ F, where c > 0 does not depend on B, but only on the distance between
F and the hyperplane section dual to Eks (x).
Exercise 8.4 Let B be a monoid. Prove that:
(1) if there is B1 ∈ B whose eigenvalues are all distinct in norm, then B is

pinching;
(2) if there exists B1 as above and there exists B2 ∈ B such that B2 (V ) ∩W =
{0} for any pair of B1 -invariant subspaces with complementary dimen-
sions, then B is twisting.
Exercise 8.5 Show that the adjoint monoid Bt = {Bt : B ∈ B} is pinching

or twisting if and only if B is pinching or twisting, respectively.
Exercise 8.6 For each B ∈ GL(d) let GB be some subspace of dimension

d − k spanned by eigenvectors associated with the d − k smallest eigenvalues
of Bt B. If σk (B) > σk+1 (B) then the only choice is GB = Eks (B). Let θ > 0 be
fixed. Show that for any ε > 0 there exists δ > 0 such that
| sin (F1 , F2 )| < δ ⇒ | sin (B(F1 ), B(F2 ))| < ε
for any B ∈ GL(d) and F1 , F2 ∈ Gr(k, d) with | sin (Fi , GB )| > θ for i = 1, 2.
Exercise 8.7 Let f : (M, μ ) → (M, μ ) be a Bernoulli shift, M = X Z and μ =

pZ . Let m, n ≥ 1, and Y1 , . . . , Ym be positive measure subsets of M defined by
imposing conditions on coordinates x0 , . . . , xn−1 only. Define g(x) = f kn (x)
with k = k(x) ≥ 1 minimum such that f kn (x) ∈ Y1 ∪ · · · ∪Ym . Prove that, given
any sequence (ik )k with values in {1, . . . , m},
μ (Y j )
μ (Z) ≥ min
1≤ j≤m μ (Y1 ) + · · · + μ (Ym )
where Z is the set of all points x ∈ M for which g(x) ∈ Yik(x) .
Exercise 8.8 Prove that:
(1) the set of decomposable k-vectors is a closed subset of Λk (Rd ) and its
intersection with the unit ball is compact;

8.5 Exercises 149
(2) the set of geometric hyperplanes is a compact subset of the space of hy-
perplanes of Λk (Rd );
(3) for every fixed l, the set of geometric subspaces is a compact subset of the
space of l-dimensional subspaces of Λk (Rd ).
Exercise 8.9 Prove that if B is pinching and twisting then each associated
Grassmannian cocycle Fk : M × Gr(k, d) → M × Gr(k, d), 1 ≤ k ≤ d − 1 has a
measure m defined on M ×
unique invariant u-state; u
namely, the probability
Gr(k, d) by m (E) = μ {x ∈ M : (x, ξ̂ (x)) ∈ E} . Analogously, each Fk has a
u u
unique invariant s-state.

9
Generic cocycles
In Chapters 7 and 8 we came to the conclusion that, for a significant class

of linear cocycles, the Lyapunov exponents are most of the time distinct. The
results were stated for locally constant cocycles over Bernoulli shifts but, as
observed at the end of both chapters, the conclusions extend much beyond:
roughly speaking, they remain valid for Hölder-continuous cocycles with in-
variant holonomies, assuming that the base dynamics is sufficiently “chaotic”.
Rather in contrast, in the early 1980s Mañé [88] announced that generic (that
is, a residual subset of all) area-preserving C1 diffeomorphisms on any surface
have λ± = 0 at almost every point, or else they are Anosov diffeomorphisms.
Actually, as observed in Example 2.10, the second alternative is possible only
if the surface is the torus T2 . A complete proof of Mañé’s claim was first given
by Bochi [30], based on an unpublished draft by Mañé himself. This family of
ideas is the subject of the present chapter.
In Section 9.1, we make a few useful observations about semi-continuity

of Lyapunov exponents. Then, in Section 9.2, we state and prove a version of
the Mañé–Bochi theorem for continuous linear cocycles (Theorem 9.5). This
can be extended in several ways: to the original setting of diffeomorphisms
(derivative cocycles); to higher dimensions; and to continuous time systems.
Some of these are briefly discussed in Section 9.2.4.
This theory is also connected to the question of how Lyapunov exponents

vary with the linear cocycle, which will be the central topic of Chapter 10.
Indeed, Theorem 9.5 yields (in Corollary 9.8) a complete characterization of
the continuity points of the Lyapunov exponents in the realm of continuous
linear cocycles. Similar ideas allow us, in Section 9.3, to give examples of
discontinuity of the Lyapunov exponents among Hölder-continuous cocycles.
150

9.1 Semi-continuity 151
9.1 Semi-continuity
Let f : M → M be a measurable transformation on a separable complete metric
space M and let μ be a Borel probability measure on M invariant under f . Let
F : M × Rd → M × Rd be the linear cocycle defined over ( f , μ ) by a bounded
measurable function A : M → GL(d).
To begin with, we are going to show that the extremal Lyapunov expo-
nents are semi-continuous functions of the corresponding cocycle. We write
λ± (A, μ ) to mean λ± (F, μ ). Similarly, let χ1 (A, μ ) ≥ · · · ≥ χd (A, μ ) denote all
the Lyapunov exponents of F, counted with multiplicity.
The first lemma and its corollaries hold for pairs (A, μ ) in any of the follow-
ing situations:
• A bounded and continuous, with the topology of uniform convergence (given

by the C0 -norm), and μ ∈ M (M) with the weak∗ topology;
• A bounded and measurable, with the topology of uniform convergence, and
μ ∈ M (M) with the pointwise topology.
Lemma 9.1 The function (A, μ ) → λ+ (A, μ ) is upper semi-continuous and

the function (A, μ ) → λ− (A, μ ) is lower semi-continuous. More generally, for
any 0 ≤ k ≤ d, the function (A, μ ) → ∑ki=1 χi (A, μ ) is upper semi-continuous
and the function (A, μ ) → ∑di=d−k+1 χi (A, μ ) is lower semi-continuous.

Proof Observe that (A, μ ) → log An d μ is continuous for every n ≥ 1.
Indeed,

log An d μ − log Bn d ν

≤ log An d(μ − ν ) + log An − log Bn d ν
and the first term goes to zero when μ → ν , because log An is bounded,
and the second term goes to zero when A − B → 0, because the difference
log An − log Bn converges uniformly to zero. So, the case k = 1 of the
claim follows directly from the identities

1
λ+ (A, μ ) = inf log An d μ and
n≥1 n

1
λ− (A, μ ) = sup log (An )−1 −1 d μ
n≥1 n
in Theorem 3.12. To deduce the general case, consider the cocycle Λk F in-

152 Generic cocycles
duced by F in the exterior k-power Λk (Rd ). From Proposition 4.17 we get
χ1 (A, μ ) + · · · + χk (A, μ ) = λ+ (Λk A, μ )

χd−k+1 (A, μ ) + · · · + χd (A, μ ) = λ− (Λk A, μ ).
Then it suffices to observe that A → Λk A is continuous with respect to the

topology of uniform convergence (Exercise 9.1).
A subset of a topological space is called residual if it contains a countable

intersection of open dense subsets, and it is called meager if its complement is
residual or, in other words, if it is contained in a countable union of nowhere-
dense closed sets. The ambient is called a Baire space if every residual subset
is dense or, equivalently, if every meager subset is nowhere dense. Complete
metric spaces and locally compact topological spaces are Baire spaces.
Corollary 9.2 The set of discontinuity points (A, μ ) for the Lyapunov expo-
nents λ± is a meager subset of the domain.
Proof This is a general feature of semi-continuous functions. We recall the

argument. Let Q be any countable dense subset of R and, for each q ∈ Q, define
Fq = ∂ {(A, μ ) : λ+ (A, μ ) < q}.
The assumption gives that Fq is the boundary of an open set, and so it is closed
and nowhere dense. Now let (A, μ ) be a point of discontinuity for λ+ . Then
there exists (An , μn ) → (A, μ ) such that limn λ+ (An , μn ) < λ+ (A, μ ). Let q be
any element of Q between these two numbers. Then (A, μ ) ∈ Fq . This proves

that the set of discontinuity points of λ+ is contained in q∈Q Fq . Analogously,
one gets that the set of discontinuity points of λ− is contained in a countable
union of closed nowhere-dense subsets.
Corollary 9.3 If (A, μ ) is such that all the Lyapunov exponents are equal,
then (A, μ ) is a continuity point for the Lyapunov exponents.
Proof Denote λ = λ+ (A, μ ) = λ− (A, μ ). By upper semi-continuity of the

largest exponent and lower semi-continuity of the smallest exponent,
λ+ (B, ν ) < λ + ε and λ− (B, ν ) > λ − ε
for every (B, ν ) close to (A, μ ). Then |χ j (B, ν )− χ j (A, μ )| = |χ j (B, ν )− λ | < ε
for every j = 1, . . . , d, and so (A, μ ) is indeed a continuity point.
Another rather general class of continuity points for the Lyapunov exponents
are the hyperbolic cocycles introduced in Section 2.2.

Lemma 9.4 The Lyapunov exponents vary continuously with A ∈ GL(2) in

the open subset of hyperbolic cocycles in C0 (M, SL(2)).
Proof Let Rd = EA,xs ⊕ E u be the hyperbolic decomposition for the cocycle

A,x
defined by A. Given x ∈ M, define
A(x)vs A(x)vu
gsA (x) = and guA (x) =
vs vu
for any non-zero vs ∈ Exs and vu ∈ Exu (the definition does not depend on the
choice of these vectors). The Lyapunov exponents are given by

λ+ (A, μ ) = log guA d μ and λ− (A, μ ) = log gsA d μ . (9.1)
By Proposition 2.6, the invariant sub-bundles EAs and EAu depend continuously
on A. Thus, gsB and guB are uniformly close to gsA and guA , respectively, if B is
close to A. Then, by (9.1), the exponents λ± (B, μ ) are close to λ± (A, μ ).
9.2 Theorem of Mañé–Bochi

An invariant probability measure of a transformation f : M → M is aperiodic if
the set of periodic points of has zero measure. Given an invariant set Λ ⊂ M, we
say that a cocycle F : M × Rd → M × Rd is hyperbolic over Λ if the restriction
of F to Λ × Rd is hyperbolic.
Theorem 9.5 Let f : M → M be a homeomorphism and μ be an aperi-

odic ergodic probability measure on some compact metric space M. For any
continuous function A : M → SL(2), either the cocycle associated with A is
hyperbolic over the support of μ or A is approximated in C0 (M, SL(2)) by
continuous functions C : M → SL(2) such that λ± (C, μ ) = 0.
Before proving this theorem, let us list a few applications:
Corollary 9.6 In the setting of Theorem 9.5, the set of A ∈ C0 (M, SL(2))
for which either the linear cocycle is hyperbolic over the support of μ or else
λ± (A, μ ) = 0 is a residual subset of C0 (M, SL(2)).
Proof By Proposition 2.6, the subset H of functions A for which the lin-
ear cocycle is hyperbolic is open in C0 (M, SL(2)). By Lemma 9.1, the subset
L(δ ) of continuous functions A : M → SL(2) for which λ+ (A, μ ) < δ is also
open. Fix any sequence (δn )n → 0. Then the set in the statement coincides with

n H ∪ L(δn ). By Theorem 9.5, the intersection is dense and, hence, so is each
of the open sets H ∪ L(δn ).

In some cases, the hyperbolicity alternative can be excluded a priori. Then

one gets an abundance of vanishing exponents:
Example 9.7 Let S1 = R/Z and f : S1 → S1 be a continuous transforma-

tion, with degree deg( f ) ∈ Z. Let A : S1 → SL(2) be of the form A(x) =
A0 R2πα (x) where A0 ∈ SL(d) and α : S1 → S1 is a continuous function with de-
gree deg(α ) ∈ Z and Rθ denotes the rotation of angle θ . Assume that μ is sup-
ported on the whole S1 . Assume that 2 deg(α ) is not a multiple of deg( f ) − 1.
Then, continuous cocycles with zero Lyapunov exponents form a residual sub-
set of the isotopy class of A for, as we have seen in Example 2.9, no element
of the isotopy class can be hyperbolic.
Theorem 9.5 also yields a complete characterization of the continuity points

for the Lyapunov exponents among continuous cocycles:
Corollary 9.8 In the setting of Theorem 9.5, a function A : M → SL(2) is a

continuity point for the Lyapunov exponents in C0 (M, SL(2)) if and only if the
corresponding linear cocycle is hyperbolic over the support of μ or else the
Lyapunov exponents λ± (A, μ ) vanish.
Proof By Lemma 9.4, Lyapunov exponents are continuous at every hyper-

bolic cocycle. By Corollary 9.3, the same is true for every cocycle with van-
ishing Lyapunov exponents. Theorem 9.5 implies that there are no other con-
tinuity points.
Let us give an outline of the proof of Theorem 9.5. We will fill in the details
in the next three sections.
Start from any A ∈ C0 (M, SL(2)) whose associated linear cocycle is not
hyperbolic over supp μ . Proposition 9.10 below states that if the Lyapunov ex-
ponents λ± (A, μ ) are non-zero then one can find a nearby measurable function
B : M → SL(2) that preserves the union of the Oseledets sub-bundles of A
but exchanges the two Oseledets sub-bundles over some positive measure set
Z ⊂ M.
Another key point, that will be established in Proposition 9.12, is that Z
may be chosen so that the second return map Z → Z is ergodic. According to
Proposition 9.13 below, it follows that λ± (B, μ ) = 0. This is not quite the end
yet, because B need not be continuous. However, using Lusin’s theorem, there
exist continuous functions C : M → SL(2) with uniformly bounded norm and
coinciding with B outside sets that have arbitrarily small measure. Then the
Lyapunov exponents λ± (C, μ ) are arbitrarily close to zero.
This proves that every non-hyperbolic cocycle is approximated by contin-

uous cocycles with arbitrarily small Lyapunov exponents. A Baire argument

wraps up the proof of the theorem.
9.2.1 Interchanging the Oseledets subspaces

Let us now present the arguments in detail.
Lemma 9.9 Let f : M → M be an invertible transformation and μ be an

invariant aperiodic probability measure. For any measurable set Y ⊂ M with
μ (Y ) > 0 and any m ≥ 1 there exists a measurable set Z ⊂ Y such that μ (Z) > 0
and Z, f (Z), . . . , f m−1 (Z) are pairwise disjoint.
Proof Since fixed points form a zero measure subset of Y , we may find Y1 ⊂ Y
such that μ (Y1 Δ f (Y1 )) > 0. Then Z1 = Y1 \ f (Y1 ) has positive measure and it
is disjoint from its image f (Z1 ). Similarly, we may find Y2 ⊂ Z1 such that
μ (Y2 Δ f 2 (Y2 )) > 0. Then Z2 = Y2 \ f 2 (Y2 ) has positive measure and Z2 , f (Z2 ),
and f 2 (Z2 ) are pairwise disjoint. Continuing in this way we find Z = Zm as in
the statement.
Proposition 9.10 Let A ∈ C0 (M, SL(2)) be such that the associated linear
cocycle F is not hyperbolic over supp μ . Then for every ε > 0 there exists
m ≥ 1 and Z ⊂ M with μ (Z) > 0 such that Z, f (Z), . . . , f m−1 (Z) are pairwise
disjoint, and there exists a measurable map B : M → SL(2), satisfying
m−1 j
(a) A(x) − B(x) < ε for every x ∈ M and A = B outside j=0 f (Z);
(b) B(x)Exs= E ufm (x)
and =B(x)Eus for x ∈ Z,E ufm (x) = Exs ⊕ Exu is where Rd
the Oseledets decomposition of the linear cocycle F associated with A.
Proof There are two parts. First, we explain how to construct a perturbation
that maps E u to E s . More precisely, we find Y ⊂ M, p ≥ 1, and J : M → SL(2)
satisfying (a) and J p (x)Exu = E sf p (x) but, possibly, not J p (x)Exs = E ufp (x) . Next,
we explain how to modify the construction and obtain Z, m, and B satisfying
all the conditions in the statement.
Mapping E u to E s : Fix δ > 0, much smaller than ε > 0. If the set of points
x such that | sin (Exu , Exs )| < δ has positive measure, we may take Y to be
that set, p = 1, and J(x) = A(x)R(x) for x ∈ Y , where R(x) is a small rotation
such that R(x)Exs = Exu . From now on, we suppose that | sin (Exu , Exs )| ≥ δ for
almost every x. For each p ≥ 1, consider the set

Γ p = x ∈ M : A p (x) | Exu < 2A p (x) | Exs || .

The assumption that the cocycle is non-hyperbolic implies that μ (Γ p ) > 0 for
every p. Indeed, suppose there exists p ≥ 1 such that
A p (x) | Exu
≥2 for μ -almost every x ∈ M.
A p (x) | Exs
Then (Exercise 2.4), for any k ≥ 1 and μ -almost every x
Akp (x) | Exu

Akp (x)2 ≥ ≥ 2k .
Akp (x) | Exs
Then, by continuity, Akp (x) ≥ 2k/2 for every k ≥ 1 and x in the support of
μ . It follows, by Proposition 2.1, that the linear cocycle F p is hyperbolic over
supp μ . Then (Exercise 2.5), F is hyperbolic over supp μ . This contradiction
proves that μ (Γ p ) > 0 for every p, as claimed.
Now fix θ > 0 much smaller than δ and p ≥ 1 much larger than | log θ |. By
Lemma 9.9, we may find a measurable set Y ⊂ Γ p with μ (Y ) > 0 and such that
Y , f (Y ), . . . , f p−1 (Y ) are pairwise disjoint. For each x ∈ Y and 0 < j < m − 1,
define
J( f j (x)) = A( f j (x))L j (x),
where L j (x) is a linear map that fixes E sf j (x) and E uf j (x) , expanding the former
and contracting the latter by a definite factor e±θ :
L j (x) : E sf j (x) ⊕ E uf j (x) → E sf j (x) ⊕ E uf j (x) , vs + vu → eθ vs + e−θ vu .
Since the angle between the two sub-bundles is bounded from zero, one has
that A( f j (x)) − J( f j (x))) < ε for all 0 < j < p − 1 and x ∈ Y , as long as θ
is small enough. Then
A( f p−1 (x))J( f p−2 (x)) · · · J( f (x))A(x) | Exs 2(p−2)θ A (x) | Ex

p s
≥ e
A( f p−1 (x))J( f p−2 (x)) · · · J( f (x))A(x) | Exu A p (x) | Exu
≥ e2(p−2)θ −1 > 2δ −1 θ −2 ,
as long as p is large enough. Since the angle between the Oseledets subspaces
is bounded below by δ , this implies that there exists some line r ⊂ R2 such
that, denoting r p = A( f p−1 (x))J( f p−2 (x)) · · · J( f (x))A(x)r,
| sin (r, Exu )| < θ and | sin (r p , E sf p (x) )| < θ .
Define J(x) = A(x)R(x) and J( f p−1 (x)) = L(x)A( f p−1 (x)) where R(x) and
L(x) are small rotations mapping Exu to r and r p to E sf p (x) , respectively. This
preserves A − J < ε and also gives J p (x)Exu = E sf p (x) , as required.

Mapping E s to E u : The latter property implies that J p (x)Exs = E sf p (x) . Then

(Exercise 3.15),
lim | sin (Am−p ( f p (x))J p (x)Exs , E ufm (x) )| = 0
m
Moreover, by Poincaré recurrence,

lim sup | sin (E ufm (x) , E sf m (x) )| > 0.
m
Thus, we may find m > p and a positive measure set Z ⊂ Y such that, for ev-
ery z ∈ Z, the angle between Am−p ( f p (x))J p (x)Exs and E ufm (x) is much smaller
than the angle between the two Oseledets subspaces at f m (x). hence, there
exists a linear map S( f m (x)) close to the identity that fixes E sf m (x) and maps
Am−p ( f p (x))J p (x)Exs to E ufm (x) . By Lemma 9.9, Z, f (Z), . . . , f m−1 (Z) may be
assumed to be pairwise disjoint. Let B : M → SL(2) be given by
⎧
⎨ J(x) if x ∈ Z ∪ f (Z) ∪ · · · ∪ f p−1 (Z)
B(x) = S( f (x))A(x) if x ∈ f m−1 (Z)
⎩
A(x) in all other cases.
It is clear from the construction that Z, m, and B satisfy all the properties in the
conclusion of the proposition.
9.2.2 Coboundary sets

We want to deduce from Proposition 9.10 that the Lyapunov exponents of B
vanish. The following observation shows that additional information is needed
for this.
Let A : M → SL(2) be such that the Lyapunov exponents of the associated
cocycle are non-zero, and let Rd = Exs ⊕ Exu be the corresponding Oseledets
decomposition. Fix a measurable set W ⊂ M and let H(x) : R2 → R2 be defined
as follows: if x ∈ W then H(x) is an orthonormal map that exchanges Exs with
Exu ; otherwise, H(x) = id. Then take B : M → SL(2) to be given by B(x) =
H( f (x))A(x)H(x)−1 for every x ∈ M. By construction,
B(x)Exs = Exu and B(x)Exu = Exs for x ∈ W Δ f (W )
(9.2)
B(x)Exs = Exs and B(x)Exu = Exu for x ∈
/ W Δ f (W ).
On the other hand, since the norm of H ±1 is bounded, the Lyapunov expo-
nents of A and B coincide. Hence, the property of exchanging the Oseledets
subspaces of A alone cannot force the Lyapunov exponents of B to vanish.
This problem is handled by the notion of coboundary set, introduced by
Knill [75] in this context. A measurable set Z ⊂ M is a coboundary for f if

there exists some measurable set W ⊂ M such that Z coincides with W Δ f (W )

up to a zero measure set.
Given any measurable set Z ⊂ M with μ (Z) > 0, let ν be the normalized
restriction of μ to Z, and let g : Z → Z be the first return map: g(x) = f r(x) (x)
where r(x) ≥ 1 is the first time the forward orbit of x hits z (actually, g is defined
on a full measure subset of Z). By Proposition 4.18, the map g is ergodic for
the probability measure ν .
Lemma 9.11 Let f : M → M be an invertible transformation and μ be an
ergodic probability measure. A set Z ⊂ M with μ (Z) > 0 is a coboundary for
f if and only if the second return map g2 : Z → Z is not ergodic for ν .
Proof Suppose that there exists W ⊂ M be such that Z = W Δ f (W ) up to a
zero measure set. Let U = W \ f (W ) and V = f (W ) \ W . Given x ∈ U, let
n ≥ 1 be minimum such that f n (x) ∈ / W . Then g(x) = f n (x) ∈ V . Given x ∈ V ,
let n ≥ 1 be minimum such that f n (x) ∈ W . Then g(x) = f n (x) ∈ U. This
shows that g(U) ⊂ V and g(V ) ⊂ U. It follows that ν (U) = ν (V ) = 1/2 and
g2 (U) ⊂ U. In particular, g2 is not ergodic.
To prove the converse, suppose there exists U ⊂ Z such that 0 < ν (U) < 1
and g2 (U) = U. Then U ∪ g(U) is a (g, ν )-invariant set with positive measure.
Since g is ergodic, it follows that ν (U ∪ g(U)) = 1. In other words, Z = U ∪
g(U) up to a zero measure set. Define W = { f j (x) : x ∈ U and 0 ≤ i < r(x)}.
Then f (W ) = { f j (x) : x ∈ U and 0 < i ≤ r(x)} and so

W Δ f (W ) = f j (x) : x ∈ U and i ∈ {0, r(x)} = U ∪ g(U) = Z
up to a zero measure set. Thus, Z is a coboundary set.
Proposition 9.12 Assume that μ is aperiodic for f . Then any set with positive
measure has some subset with positive measure that is not a coboundary set.
Proof Since M is a compact metric space, the probability space (M, B, μ ) is a
Lebesgue space (see [114, Section 8.5.2]). Moreover, μ is non-atomic, because
it is aperiodic. So, (M, μ ) is isomorphic to the interval [0, 1], endowed with the
Lebesgue measure. Theorem 9.3 in Friedman [53] asserts that, for any ergodic
measure-preserving transformation in a Lebesgue space, the family W ( f , μ ),
of measurable sets Z ⊂ M such that the first return map fZ : Z → Z is weak
mixing, is dense in the σ -algebra of M, relative to the distance d(A1 , A2 ) =
μ (A1 ΔA2 ). In particular, W ( f , μ ) contains some measurable set Z with μ (Z) >
0.
Now, given any Y ⊂ M with μ (Y ) > 0, let h = fY : Y → Y be the first return
map and ν be the normalized restriction of μ to Y . Then ν is aperiodic for h.
By the previous paragraph, there is Z ∈ W (h, ν ) with ν (Z) > 0. This means

that hZ : Z → Z is weak mixing and μ (Z) > 0. Since weak mixing is preserved
under iteration, the second return map h2Z is also weak mixing and, hence,
ergodic. Clearly, hZ coincides with the first return map fZ : Z → Z under the
transformation f . Thus, fZ2 is ergodic and so Z is not a coboundary set.
Therefore, we may always choose the set Z in Proposition 9.10 in such a
way that it is not a coboundary.
Proposition 9.13 Let Z ⊂ M, m ≥ 1, and B : M → SL(2) be as in the con-
clusion of Proposition 9.10 and assume that Z is not a coboundary set. Then
λ± (B, μ ) = 0.
Proof Let Rd = Exs ⊕ Exu denote the Oseledets decomposition for A. Suppose
that λ± (B, μ ) are different from zero and let Rd = EB,x
s ⊕ E u be the corre-
B,x
sponding Oseledets decomposition. We are going to use the inducing construc-
tion in Section 4.4.1. More precisely, we consider the linear cocycle
G : Z × R 2 → Z × R2 , G(x, v) = ( f r(x) (x), Br(x) v)
over the first return map g : Z → Z. We also consider the corresponding pro-
jective cocycle PG : Z × PR2 → Z × PR2 . By construction,
Br(x) (Exs ) = Exu and Br(x) (Exu ) = Exs ν -almost everywhere. (9.3)
By Proposition 4.18, the Lyapunov exponents λ± (G, ν ) are different from zero,
and the Oseledets decomposition of G is given by Rd = EB,x s ⊕ E u restricted
B,x
to the domain Z. Let m be the probability measure defined on Z × PR2 by
1 1
m(X) = ν ({x ∈ Z : (x, Exs ) ∈ X}) + ν ({x ∈ Z : (x, Exu ) ∈ X}) .
2 2
In other words, m projects down to μ and its disintegration is given by
1 1
x → δExs + δExu .
2 2
It follows from (9.3) that m is invariant under PG. Then, by Lemma 5.25 and
Remark 5.26, m is a linear combination of the probability measures msB and muB
defined on Z × PR2 by

msB (X) = ν {x ∈ Z : (x, EB,x
s
) ∈ X}

muB (X) = ν {x ∈ Z : (x, EB,x
u
) ∈ X} .
Lemma 9.14 The probability measure m is ergodic for PG.
Proof Suppose that there is an invariant set X ⊂ Z × PR2 with 0 < m(X) < 1.
Let Z0 be the set of x ∈ Z whose fiber X ∩ ({x} × PR2 ) contains neither Exs nor
Exu . In view of (9.3), Z0 is a (g, ν )-invariant set and so its ν -measure is either

0 or 1. Since m(X) > 0, we must have ν (Z0 ) = 0. Similarly, m(X) < 1 implies
that ν (Z2 ) = 0, where Z2 is the set of x ∈ Z whose fiber contains both Exs and
Exu . Now let Zs be the set of x ∈ Z whose fiber contains Exs but not Exu , and
let Zu be the set of x ∈ Z whose fiber contains Exu but not Exs . The previous
observations show that Zs ∪ Zu has full ν -measure. Moreover, (9.3) implies
g(Zs ) = Zu and g(Zu ) = Zs
up to zero measure sets. Thus, ν (Zs ) = ν (Zu ) = 1/2 and g2 (Zs ) = Zs . This
implies that g2 is not ergodic, contradicting Lemma 9.11. This contradiction
proves the present lemma.
It follows that m coincides with either msB and muB . This is a contradiction,
because the conditional probabilities of m are supported on exactly two points
on each fiber, whereas the conditional probabilities of both muB and msB are
Dirac masses on a single point. This contradiction proves that the Lyapunov
exponents λ± (B, μ ) are zero, as claimed.

So far we have shown that any continuous cocycle F which is not hyperbolic
over the support of μ can be approximated, in the uniform norm, by measur-
able cocycles with vanishing exponents. To complete the proof of Theorem 9.5
we must replace this by continuous cocycles. We are going to use the following
semi-continuity result:
Lemma 9.15 Let B : M → SL(d), d ≥ 2 be such that log B ∈ L1 (μ ). Given
L > 0 and δ > 0, there exists ρ > 0 such that λ+ (J,
μ ) < λ+ (B, μ ) + δ for
any
measurable function J : M → SL(d) such that μ {x ∈ M : B(x) = J(x)} < ρ
and J∞ < L.
Proof Given δ > 0, fix n ≥ 1 such that

1 δ
log Bn d μ ≤ λ+ (B, μ ) + .
n 2
Given J : M → SL(d) such that J < L, denote Δ = {x ∈ M : B(x) = J(x)}
and Δn = Δ ∪ f (Δ) ∪ · · · ∪ f n−1 (Δ). Then

1 1 1
λ+ (J, μ ) ≤ log J n d μ = log J n d μ + log Bn d μ .
n n Δn n M\Δn
The right-hand side is bounded by

1 δ δ
μ (Δn ) log L + λ+ (B, μ ) + ≤ μ (Δ) log L + λ+ (B, μ ) + .
n 2 2

It follows that λ+ (J, μ ) ≤ λ+ (B, μ ) + δ as long as μ (Δ) < ρ for some ρ > 0
sufficiently small. This proves the lemma.
Let us proceed with the proof of Theorem 9.5. Previously, we have shown
that for any ε > 0 there exist measurable functions B : M → SL(2) such that
A − B∞ < ε and λ+ (B, μ ) = 0. Fix any δ > 0. Given any ρ > 0 there exist
continuous functions J : M → SL(2) such that J∞ ≤ B∞ and J = B out-
side a set with measure ρ (Exercise 9.4). By Lemma 9.15, this implies that
λ+ (J, μ ) < δ , as long as ρ is chosen small enough.
Let H be the subset of continuous functions A : M → SL(2) such that the as-
sociated linear cocycle is hyperbolic over supp μ and, for any δ > 0, let L(δ ) be
the subset of continuous functions A : M → SL(2) such that λ+ (A, μ ) < δ . By
Propositions 2.6 and 9.1, H and the L(δ ) are all open subsets of C0 (M, SL(2)).
The reasoning in the previous paragraph shows that L(δ ) is dense in the closed
set C0 (M, SL(2))\H, for every δ > 0. Since C0 (M, SL(2)) is a complete metric

space, it follows from Baire’s theorem (Exercise 9.3) that ∞ k=1 L(1/k) is dense
in C0 (M, SL(2)) \ H. In other words, every non-hyperbolic continuous cocy-
cle is approximated in C0 (M, SL(2)) by another continuous cocycle, whose
Lyapunov exponents are zero. This finishes the proof of Theorem 9.5.
9.2.4 Derivative cocycles and higher dimensions

Recall that a C1 diffeomorphism f : M → M is Anosov if the derivative cocycle
F = D f is hyperbolic. We denote by Diff1m (M) the space of volume-preserving
diffeomorphisms on a compact Riemannian manifold, endowed with the C1 -
topology: two diffeomorphisms are C1 -close if they are uniformly close and
their derivatives are also uniformly close. Theorem 9.5 is still true, albeit much
harder, for derivative cocycles of area-preserving diffeomorphisms:
Theorem 9.16 (Mañé [88], Bochi [30]) Let M be a compact Riemannian

manifold of dimension 2. For every f ∈ Diff1m (M), either f is an Anosov dif-
feomorphism or f is approximated in Diff1m (M) by diffeomorphisms whose
Lyapunov exponents vanish almost everywhere.
Corollary 9.17 Let M be a compact Riemannian manifold of dimension 2. If

f ∈ Diff1m (M) is a continuity point for the Lyapunov exponents then either f
is an Anosov diffeomorphism or the Lyapunov exponents vanish almost every-
where. The continuity points form a residual subset of Diff1m (M).
The 2-dimensional torus is the only compact surface that carries Anosov
diffeomorphisms (see [52, 91] and Example 2.10). In all other cases, Corol-

lary 9.17 gives that a residual subset of all area-preserving diffeomorphisms

have zero Lyapunov exponents almost everywhere.
Now we discuss extensions of the previous results to arbitrary dimension.
For the statements we need the notion of dominated decomposition. Let M be
a compact metric space and F : M × Rd → M × Rd be a continuous cocycle
over some homeomorphism f : M → M. Let
Rd = Vx1 ⊕ · · · ⊕Vxl , x∈Λ
be an F-invariant decomposition, defined for x in some invariant set Λ ⊂ M

and such that every x → dimVxi is constant on Λ. We call the decomposition
dominated over Λ if there exist C > 0 and θ < 1 such that
F n (x)vi
≤ Cθ n for every unit vector vi ∈ Vxi , vi−1 ∈ Vxi−1 ,
F n (x)vi−1
and for every x ∈ X and 1 < i ≤ l. In other words, the cocycle is more con-
tractive along each Vxi than along Vxi−1 , by a definite factor. By convention, the
trivial decomposition into a single subspace (the case l = 1) is dominated.
In the next statement G is any subgroup of GL(d) that acts transitively on
the projective space; that is, such that {g(r) : g ∈ G} = PRd for any r ∈ PRd .
That includes many interesting subgroups, such as the special linear group
SL(d) and the symplectic group (if d is even). When d is even, it also includes
SL(d/2, C) and GL(d/2, C), viewed as subgroups of GL(d).
Theorem 9.18 (Bochim and Viana [34]) Let f : M → M be a homeomor-

phism on a compact metric space M and μ be an aperiodic ergodic probabil-
ity measure. Then A : M → G is a continuity point for the Lyapunov exponents
among G-valued continuous cocycles if and only if the Oseledets decomposi-
tion of the associated linear cocycle over f is dominated (possibly trivially)
over the support of μ . The continuity points form a residual subset of the space
of all continuous G-valued cocycles.
The statement actually proven in [34] is a bit stronger: in particular, it in-

cludes the non-ergodic case.
Theorem 9.19 (Bochi and Viana [34]) For any compact manifold M, there
exists a residual subset R of the space Diff1m (M) of volume-preserving diffeo-
morphisms such that for every f ∈ R the Oseledets decomposition of F = D f
is dominated (possibly trivially) on every orbit of f .
Example 9.20 Bonatti and Viana [38] constructed open sets U of volume-
preserving diffeomorphisms on M = T4 that are not Anosov and yet admit a
unique decomposition T M = E s ⊕ E u that is invariant and dominated for the

derivative cocycle. Both subspaces E s and E u are 2-dimensional. Tahzibi [110]

showed that every C2 diffeomorphism in U is ergodic. By Avila [8], the set
of C2 diffeomorphisms is dense in Diff1m (M). By Oxtoby and Ulam [93], the
set of ergodic diffeomorphisms is always a countable intersection of open sets.
Combining these three facts we get that ergodic diffeomorphisms form a resid-
ual subset R of U . For any f ∈ R, Theorem 9.19 and Exercise 9.5 give that
the Oseledets decomposition extends continuously to a dominated decomposi-
tion on the whole M. Then, by uniqueness, the Oseledets decomposition must
coincide with E s ⊕ E u at almost every point. In particular, every f ∈ R has ex-
actly one positive exponent and one negative exponent, both with multiplicity
2.
Now take M to be endowed with a symplectic form; that is, a closed non-
degenerate differential 2-form ω . Existence of such a form implies that the
dimension of M is even, d = 2k. Moreover, ω k = ω ∧ · · · ∧ ω is a volume form
on M. Let m be the corresponding volume measure, normalized so that m(M) =
1. A diffeomorphism f : M → M is symplectic if the derivative preserves the
symplectic form; that is, if
ωx (v, w) = ω f (x) (D f (x)v, D f (x)w) for every x ∈ M and v, w ∈ Tx M.

Every symplectic diffeomorphism preserves the volume measure m. The space
Diff1ω (M) of symplectic diffeomorphisms is a closed subset of Diff1m (M). In
dimension d = 2 the two spaces coincide.
Theorem 9.21 (Bochi and Viana [34], Bochi [31]) There exists a residual
subset R of the space Diff1ω (M) of symplectic diffeomorphisms such that for
every f ∈ R the Oseledets decomposition of F = D f is dominated (possibly
trivially) on every orbit of f .
As a consequence, one gets the following dichotomy for a residual subset of

symplectic diffeomorphisms that was also announced by Mañé [88]:
(1) either f is an Anosov diffeomorphism, with invariant decomposition T M =

E u ⊕ E s such that dim E u = dim E s and D f | E s is uniformly contracting
and D f | E u is uniformly expanding;
(2) or, for almost every x, there is a decomposition E s ⊕ E c ⊕ E u dominated on
the orbit of x, with dim E s = dim E u (possibly equal to zero) and dim E c ≥
2, such that D f | E s is uniformly contracting, D f | E u is uniformly expand-
ing, and the Lyapunov exponents of D f | E c vanish.
For continuous time systems new difficulties arise, especially in the presence
of equilibrium points of the flow. In his thesis, Bessa [20] obtained a version

of Theorem 9.5 for 2-dimensional continuous time linear cocycles. The ex-
tension to arbitrary dimension, corresponding to Theorem 9.18, was obtained
by Bessa [22]. As happens for the discrete time systems, the case of deriva-
tive cocycles is much harder. A version of Theorem 9.16 for 3-dimensional
flows was proven by Bessa [21], in the case when the flow has no equilibrium
points. This restriction was removed soon afterwards by Araújo and Bessa [3].
Concerning extensions to higher dimensions, the continuous time version of
Theorem 9.19 was settled by Bessa and Rocha [24] and a version of Theo-
rem 9.21 for Hamiltonian flows with two degrees of freedom was obtained by
Bessa and Dias [23].
9.3 Hölder examples of discontinuity

Corollary 9.8 shows that continuity of the Lyapunov exponents, although cor-
responding to residual subsets of the domain (by Corollary 9.2), can only occur
at cocycles that are rather special from a dynamical viewpoint. While this is
perhaps specific to the C0 -topology in the space of cocycles (see the discus-
sion in Section 10.6), it is interesting to point out that a variation of the proof
of Theorem 9.5 also yields examples of discontinuity in the space of Hölder-
continuous cocycles over a Bernoulli shift, with arbitrary Hölder constant. That
is the purpose of the present section and also motivates some of the ideas we
will put forward in Section 10.6.
Let f : M → M be the shift map on the space M = X Z , with X = {1, 2},
endowed with the metric
d(x, y) = 2−N(x,y) , N(x, y) = sup{N ≥ 0 : xn = yn whenever |n| < N}.
Let μ = pZ , where p = p1 δ1 + p2 δ2 with p1 , p2 positive and distinct.

Let r ∈ (0, ∞) be fixed. A function A : M → GL(d) is r-Hölder-continuous
if there exists C > 0 such that A(x) − A(y) ≤ Cd(x, y)r for any x, y ∈ M. The
space H r (M) of all r-Hölder-continuous functions A : M → GL(d) comes with
a natural topology, defined by the r-Hölder norm
A(x) − A(y)
AH r = sup A(x) : x ∈ M + sup : x
= y . (9.4)
d(x, y)r
Given σ > 1, consider the locally constant cocycle associated with the func-
tion A : M → SL(2) given by

σ 0
A(x) = if x0 = 1
0 σ −1

and
−1
σ 0
A(x) = if x0 = 2.
0 σ
Note that A is r-Hölder continuous for any r > 0. The Lyapunov exponents
λ± (A, μ ) = ±|p1 − p2 | log σ are non-zero.
Theorem 9.22 For any r > 0 such that 22r < σ there exist r-Hölder-contin-
uous cocycles B : M → SL(2) with vanishing Lyapunov exponents and such
that A − BHr is arbitrarily close to zero. Hence, A is a point of discontinuity
for the Lyapunov exponents on H r ({1, 2}).
The hypothesis on the Hölder constant r is probably not sharp. For reasons
to be discussed in Section 10.6, it would be interesting to weaken it to 2r < σ 2 .
Here is an outline of the proof of Theorem 9.22. Note that the original co-
cycle A preserves the horizontal and vertical line bundles Hx = R(1, 0) and
Vx = R(0, 1). Then the Oseledets subspaces must coincide with Hx and Vx al-
most everywhere. We choose cylinders Zn ⊂ M whose first n iterates f i (Zn ),
0 ≤ i ≤ n − 1 are pairwise disjoint. Then we construct cocycles Bn by modify-
ing A on some of these iterates so that
Bnn (x)Hx = V f n (x) and Bnn (x)Vx = H f n (x) for all x ∈ Zn .
We deduce from this property that the Lyapunov exponents of Bn vanish. More-
over, by construction, each Bn is constant on every atom of some finite parti-
tion of M into cylinders. In particular, Bn is Hölder continuous for every r > 0.
From the construction we also get that
2r n/2
Bn − AH r ≤ const 2 /σ (9.5)
decays to zero as n → ∞. This is how we get the claims in the theorem.
In the remainder of this section we fill the details in this outline, to prove
Theorem 9.22. Let n = 2k + 1 for some k ≥ 1 and Zn = [0; 2, . . . , 2, 1, . . . , 1, 1]
where the symbol 2 appears k times and the symbol 1 appears k + 1 times. Note
that the f i (Zn ), 0 ≤ i ≤ 2k are pairwise disjoint. Let
εn = σ −k and δn = arctan εn . (9.6)
Define Rn : M → SL(2) by
Rn (x) = rotation of angle δn if x ∈ f k (Zn )

1 0
Rn (x) = if x ∈ Zn ∪ f 2k (Zn )
εn 1
Rn (x) = id in all other cases.

and then take Bn = ARn .

Lemma 9.23 Bnn (x)Hx = V f n (x) and Bnn (x)Vx = H f n (x) for all x ∈ Zn .
Proof Note that for any x ∈ Zn ,
Bkn (x)Hx = R(εn , 1) and Bkn (x)Vx = V f k (x)
Bk+1
n (x)Hx = V f k+1 (x) and n (x)Vx = R(−εn , 1)
Bk+1
B2k
n (x)Hx = V f 2k (x) and n (x)Vx = R(−1, εn ).
B2k
The claim follows by iterating one more time.
2r k
Lemma 9.24 r ≤ C 2 /σ
There is C > 0 such that Bn − AH for every n.
Proof Let Ln = Bn − A. Clearly, sup Ln ≤ sup A Rn − id and this is
bounded by σ εn . Now let us estimate the second term in the definition (9.4). If
x and y are not in the same cylinder [0; a] then d(x, y) = 1, and so
Ln (x) − Ln (y)
≤ 2 sup Ln ≤ 2σ εn . (9.7)
d(x, y)r
From now on we suppose x and y belong to the same cylinder. Then, since A is
constant on cylinders,
Ln (x) − Ln (y) A(x)(Rn (x) − Rn (y)) Rn (x) − Rn (y)
= ≤σ .
d(x, y)r d(x, y)r d(x, y)r
If neither x nor y belong to Zn ∪ f k (Zn ) ∪ f 2k (Zn ) then Rn (x) = Rn (y) and so
the expression on the right vanishes. The same holds if x and y belong to the
same f i (Zn ) with i ∈ {0, k, 2k}. We are left to consider the case when one of
the points belongs to some f i (Zn ) with i ∈ {0, k, 2k} and the other one does
not. Then d(x, y) ≥ 2−2k and so, using once more that Rn − id ≤ εn at every
point, we have
Ln (x) − Ln (y) Rn (x) − Rn (y)
≤σ ≤ 2σ εn 22kr .
d(x, y)r d(x, y)r
Combining this with (9.7), we conclude that
k
Ln r ≤ σ εn + 2σ εn 22kr ≤ 3σ 22r /σ .
Now it suffices to take C = 3σ to complete the proof of the lemma.
Now we want to prove that λ± (Bn ) = 0 for every n. Let μn be the normalized
restriction of μ to Zn and fn : Zn → Zn be the first return map (defined on a full
measure subset). We have the following explicit description:

Zn = [0; w, b, w] (up to a zero measure subset)
b∈B

where w = (2, . . . , 2, 1, . . . , 1, 1) and the union is over the set Ω of all finite
words b = (b1 , . . . , bs ) not having w as a subword; moreover,
fn | [0; w, b, w] = f n+s | [0; w, b, w] for each b ∈ Ω.
Thus, ( fn , μn ) is a Bernoulli shift with an infinite alphabet Ω and probabil-
ity vector given by pb = μn ([0; w, b, w]). Let B̂n : Zn → SL(2) be the cocycle
induced by B over fn ; that is,
B̂n | [0; w, b, w] = Bn+s
n | [0; w, b, w] for each b ∈ Ω.
By Proposition 4.18, the Lyapunov spectrum of the induced cocycle is obtained
multiplying the Lyapunov spectrum of the original cocycle by the average re-
turn time. In our setting this means that
1
λ± (B̂n ) = λ± (Bn ).
μ (Zn )
Therefore, it suffices to prove that λ± (B̂n ) = 0 for every n.
Suppose that the Lyapunov exponents of B̂n are different from zero and let
Rd = Exu ⊕ Exs be the Oseledets decomposition (defined almost everywhere in
Zn ). The key observation is that, as a consequence of Lemma 9.23, the cocycle
B̂n permutes the vertical and horizontal sub-bundles:
B̂n (x)Hx = V fn (x) and B̂n (x)Vx = H fn (x) for all x ∈ Zn . (9.8)
Let m be the probability measure defined on M × PR2 by
1 1
mn (G) = μn ({x ∈ Zn : Vx ∈ G}) + μn ({x ∈ Zn : Hx ∈ G}) .
2 2
In other words, mn projects down to μn and its disintegration is given by
1
x → (δHx + δVx ).
2
It is clear from (9.8) that mn is B̂n -invariant. Using Lemma 5.25 we get that
mn is a linear combination of the probability measures msn and mun defined on
M × PR2 by

msn (G) = μn {x ∈ Zn : (x, Exs ) ∈ G}

mun (G) = μn {x ∈ Zn : (x, Exu ) ∈ G} .
Lemma 9.25 The probability measure mn is ergodic.

Proof Suppose that there is an invariant set X ⊂ M × PR2 with mn (X ) ∈
(0, 1). Let X0 be the set of x ∈ Zn whose fiber X ∩({x}×PR2 ) contains neither
Hx nor Vx . In view of (9.8), X0 is an ( fn , μn )-invariant set and so its μn -measure

is either 0 or 1. Since mn (X ) > 0, we must have μn (X0 ) = 0. The same kind

of argument shows that μn (X2 ) = 0, where X2 is the set of x ∈ Zn whose fiber
contains both Hx and Vx . Now let XH be the set of x ∈ Zn whose fiber contains
Hx but not Vx , and let XV be the set of x ∈ Zn whose fiber contains Vx but not
Hx . The previous observations show that XH ∪ XV has full μn -measure and it
follows from (9.8) that
fn (XH ) = XV and fn (XV ) = XH .
Thus, μn (XH ) = 1/2 = μn (XV ) and fn2 (XH ) = XH and fn2 (XV ) = XV . This is a
contradiction because fn is Bernoulli and, in particular, the second iterate fn2 is
ergodic.
Thus, mn must coincide with either msn and mun . This is a contradiction, be-
cause the conditional probabilities of mn are supported on exactly two points on
each fiber, whereas the conditional probabilities of either mun and msn are Dirac
masses on a single point. This contradiction proves that the Lyapunov expo-
nents of B̂n vanish for every n, and that concludes the proof of Theorem 9.22.
9.4 Notes
The original announcement of Theorem 9.16 was made by Mañé in his invited
address to the ICM 1983, in Warsaw [88]. A draft of the proof circulated for
several years, but it remained incomplete when Mañé passed away in 1995.
The first complete proof was provided by Bochi in his thesis, based on that
draft, and was published in [30].
A few related results had been obtained in the meantime. Knill [75, 76] con-
sidered L∞ cocycles with values in SL(2) and proved that, as long as the base
dynamics is aperiodic, the set of cocycles with non-zero exponents is never
open. This was refined to the C0 case by Bochi [27], who proved that an SL(2)-
cocycle is a continuity point for the Lyapunov exponents in C0 (M, SL(2)) if
and only if it is hyperbolic or else the exponents vanish (Corollary 9.8).
Our presentation in Section 9.2 is adapted from the texts of Bochi [27] and
Avila and Bochi [9]. A different strategy was used by Bochi [30], extending to
the special case of derivative cocycles.
Related results were also obtained for L p cocycles with p < ∞. Arbieto and
Bochi [4] proved that Lyapunov exponents are still semi-continuous on the
cocycle relative to the L p -norm, for any 1 ≤ p < ∞. Then, using a result of
Arnold and Cong [6], they concluded that the cocycles whose exponents are
all equal are precisely the continuity points for the Lyapunov exponents on

9.5 Exercises 169
L p (μ ). It follows that these cocycles form a residual subset of L p (μ ). See also

Bessa and Vilarinho [25].
Bochi and Viana [34] extended Theorems 9.5 and 9.16 to arbitrary dimen-
sion (Theorems 9.18 and 9.19). They also proved a version of Theorem 9.19 for
symplectic diffeomorphisms which was later improved by Bochi [31] to obtain
Theorem 9.21. Moreover, Bessa and his coauthors [3, 20, 21, 22, 23, 24] ex-
tended most of these statements to the continuous-time setting, as we explained
at the end of Section 9.2.4.
The examples in Section 9.3 were constructed by Bocker and Viana [35],
based on the method of proof of Theorem 9.5. Such examples notwithstand-
ing, it should be stressed that the behavior of Hölder-continuous cocycles is
usually very different from the behavior of typical continuous cocycles. Ex-
ample 9.7 provides a striking illustration of this fact: one can use the invari-
ance principle to show (see [36, Corollary 12.34]) that, under two mild ad-
ditional assumptions, every Hölder-continuous cocycle in a C0 -neighborhood
has non-vanishing Lyapunov exponents. See Chapter 12 of Bonatti, Dı́az and
Viana [36] and the survey papers of Bochi and Viana [32, 33] for more detailed
discussions.
9.5 Exercises
Exercise 9.1 Prove that the maps A → Λk A, 1 ≤ k ≤ d − 1 are continuous for
the topology of uniform convergence (C0 -norm). Moreover, the same is true
for the L1 -norm, restricted to maps A : M → GL(d) with A∞ ≤ C, for any
constant C > 1.
Exercise 9.2 Let A : M → GL(d) be such that A±1 ∈ L∞ (μ ). Show that for
every ε > 0 and C > 1 there exists δ > 0 such that
λ+ (B, μ ) ≤ λ+ (A, μ ) + ε and λ− (B, μ ) ≤ λ− (A, μ ) − ε
for any B : M → GL(d) with B±1 ∞ ≤ C and A − B1 ≤ δ . Moreover, this

statement remains true if one replaces λ+ and λ− by, respectively, ∑ki=1 χi and
∑di=d−k+1 χi , for any 0 ≤ k ≤ d.
Exercise 9.3 Let X be a complete metric space, F be a closed subset and An ,
n ≥ 1 be open subsets whose closure contains F for every n. Show that the

closure of n An contains F.
Exercise 9.4 Use the theorem of Lusin and the Tietze extension theorem to
show that, given any measurable function B ∈ SL(d) and any ρ > 0, there exist

continuous functions J : M → SL(d) such that J∞ ≤ B∞ and J = B outside

a set with measure ρ .
Exercise 9.5 Suppose that the linear cocycle F admits a dominated decom-
position Rd = Vx1 ⊕ · · · ⊕Vxl , x ∈ Λ on some invariant set Λ. Conclude that:
(1) The decomposition is continuous (the subspaces Vxi depend continuously
on x ∈ Λ), and even admits a continuous extension to the closure of Λ.
(2) Any linear cocycle sufficiently close to F in the uniform convergence norm
admits a dominated decomposition on Λ into subspaces with the same di-
mensions.
Exercise 9.6 Show that an SL(2)-cocycle admits a dominated decomposition
over an invariant set Λ ⊂ M if and only if it is hyperbolic over Λ.
Exercise 9.7 Prove the following fact, which was implicit in Section 9.3: If
Z is a cylinder in a Bernoulli shift space (M, μ ) then Z is not a coboundary set.

10
Continuity
We have seen in Chapter 9 that the Lyapunov exponents may depend in a com-
plicated way on the underlying linear cocycle. The theme, in the context of
products of random matrices, of the present chapter is that this dependence is
always continuous.
Let G (d) denote the space of compactly supported probability measures p
on GL(d), endowed with the following topology: p is close to p if it is close
in the weak∗ -topology and supp p is contained in a small neighborhood of
supp p. Let λ+ (p) and λ− (p) denote the extremal Lyapunov exponents of the
product of random matrices associated with a given p ∈ G (d), in the sense of
Section 2.1.1. In other words, λ± (p) = λ± (A, μ ), where A : GL(d)N → GL(d),
(αk )k → α0 and μ = pN . A probability measure η on PRd will be called p-
stationary if it is stationary for this cocycle. We are going to prove:
Theorem 10.1 (Bocker and Viana) The functions G (2) → R, p → λ± (p) are
continuous at every point in the domain.
Avila, Eskin and Viana announced recently that this statement remains true
in arbitrary dimension. Even more, for any d ≥ 2, all the Lyapunov exponents
depend continuously on the probability distribution p ∈ G (d). The proof will
appear in [12].
It is easy to see that Theorem 10.1 implies Theorem 1.3. Indeed, given
(Ai, j , p j )i, j ∈ GL(2)m × Δm , let us consider p = ∑ j p j δA j ∈ G (2). For any
nearby element (Ai, j , pj )i, j of GL(2)m × Δm , the corresponding probability
measure p = ∑ j pj δAj is close to p inside G (2); note that the assumption that
p j > 0 for every j is crucial for ensuring that supp p is contained in a small
neighborhood of supp p. So, the Lyapunov exponents λ± (Ai, j , pj )i, j = λ± (p )
are close to λ± (Ai, j , p j )i, j = λ± (p), as claimed in Theorem 1.3.
Here is another direct consequence of Theorem 10.1. Let X be a separable
complete metric space and f : M → M be the shift map on M = X N . Con-
171

172 Continuity
sider the space of measurable functions A : X → GL(2) such that log A±1
are bounded, with the topology of uniform convergence. Fix any probability
measure p on X and let λ± (A, p) be the extremal Lyapunov exponents relative
to μ = pN of the locally constant linear cocycle defined by A over f . Observe
that λ± (A, p) = λ± (A∗ p) and A → A∗ p is continuous. Thus, it follows from
Theorem 10.1 that the functions A → λ± (A, p) are continuous at every point.
The proof of Theorem 10.1 occupies Sections 10.1 through 10.5. Then, in
Section 10.6, we try to put this theorem and the results of Chapter 9 together
in a consistent possible scenario.
10.1 Invariant subspaces

According to Corollary 9.3, we only need to consider the case λ− (p) < λ+ (p).
Let (pk )k be a sequence converging to some p in G (2). In other words, given
any continuous functions ϕ1 , . . . , ϕl : GL(d) → R and any ε > 0,

| ϕ j d pk − ϕ j d p| < ε for j = 1, . . . , l
and supp pk is contained in the ε -neighborhood of supp p, for every large k. We

are going to show that λ+ (pk ) → λ+ (p) as k → ∞. Then, by Exercise 10.1, we
also have λ− (pk ) → λ− (p) as k → ∞.
For each k, let ηk be a pk -stationary measure that realizes the largest Lya-
punov exponent for pk :

λ+ (pk ) = Φ d ηk d pk .
Up to restricting to a subsequence, we may suppose that (ηk )k converges to

some probability measure η on PR2 . By Proposition 5.9(b), this is a p-station-
ary measure. Moreover, since Φ is continuous,

λ+ (pk ) = Φ d ηk d pk converges to Φ d η d p.
Thus, if η realizes the largest Lyapunov exponent for p then we are done. In
what follows we suppose that

Φ d η d p < λ+ (p) (10.1)
and we prove that this leads to a contradiction.

We will use α = (αn )n to represent a generic point of M = GL(d)N and we
will write α (n) = αn−1 · · · α0 for n ≥ 1.
Proposition 10.2 There exists some L ∈ PR2 such that

10.1 Invariant subspaces 173
(a) η ({L}) > 0;

(b) α (L) = L for every α ∈ supp p;
(c) for μ -almost all α ∈ GL(2)N ,

1 λ− (p) if v ∈ L \ {0}
lim log α (n) (v) =
n n λ+ (p) if v ∈ R2 \ L.
Proof According to Proposition 5.5, the product measure μ × η is invariant
under the projective cocycle PF : M × PR2 → M × PR2 defined by the function
A : M → GL(2) over the shift f : M → M on M = GL(2)N . So, we may use the
ergodic theorem to conclude that
n−1 1
Φ̃ α , [v] = lim ∑ Φ PF j (α , [v]) = lim log α (n) (v)
n n n
j=0
exists for μ × η -almost every point, is constant on the orbits of PF and satis-

fies Φ̃ d η d μ = Φ d η d μ . Thus, the assumption implies that Φ̃ < λ+ (p) on
some subset with positive measure for μ × η .
We claim that for μ -almost every α ∈ M there exists a unique L(α ) ∈ PR2
such that Φ̃(α , L(α )) < λ+ (p). Moreover,
L( f (α )) = α0 (L(α )) for μ -almost every α ∈ M. (10.2)
Indeed, consider
Z = {(α , [v]) ∈ M × PR2 : Φ̃(α , [v]) < λ+ (p)}.
This is a measurable, PF-invariant set with positive measure for μ × η . Hence,

the projection Z = {α ∈ M : (α , [v]) ∈ Z for some [v]} is a measurable (Propo-
sition 4.5), f -invariant set with positive measure for μ . Since ( f , μ ) is ergodic,
it follows that μ (Z) = 1. This proves the existence part of the claim.
Next, given any α ∈ M, suppose that there exist two distinct points [v1 ] and
[v2 ] in projective space such that Φ̃(α , [vi ]) < λ+ (p) for i = 1, 2. Since every
vector in R2 may be written as linear combination of v1 and v2 , it follows that
1
lim log α (n) < λ+ (p).
n n
This can only happen on a subset with zero measure for μ , because the largest
Lyapunov exponent is equal to λ+ (p) at μ -almost every point α ∈ M. That
proves the uniqueness part of the claim. Finally, (10.2) is a direct consequence
of the fact that the function Φ̃ is constant on the orbits of PF.
Let η0 be any ergodic component (Theorem 5.14) of η such that μ × η0 (Z )
is positive. Then μ × η0 (Z ) = 1, by ergodicity, and so η0 ({L(α )}) = 1 for μ -
almost every α ∈ M. It follows that there exists L ∈ PR2 such that L(α ) =

174 Continuity
L for μ -almost every α ∈ M. This subspace L satisfies the conditions in the

statement. Indeed, condition (a) is given by η ({L}) = μ × η (Z ) > 0. The
relation (10.2) gives that α (L) = L for p-almost every α , which is equivalent
to condition (b). By construction, Φ̃(α , [v]) = λ+ (p) for [v] = L and Φ̃(α , L) <
λ+ (p) for μ -almost all α ∈ M. Since Φ̃ only takes the values λ± (p), by the
Oseledets theorem, it follows that Φ̃(α , L) = λ− (p) for μ -almost all α ∈ M.
Thus, L satisfies condition (c) as well.
10.2 Expanding points in projective space

Let L be the subspace given by Proposition 10.2. We are going to view L as a
kind of repelling fixed point for the random walk defined by p on the projec-
tive space and to analyse such a repeller from the point of view of (the random
walks defined by) nearby probability measures. The first step will be to estab-
lish in which sense L is repelling. For that, we need to introduce some notation.
Most of the time the dimension d will be arbitrary, although our interest is in
d = 2.
Let P(d) be the space of probability measures on PRd , endowed with the
weak∗ topology. Given p ∈ G (d) and η ∈ P(d), we denote by p ∗ η the push-
forward of the product measure p × η under the map (α , x) → α (x). That is,

p ∗ η (D) = η (α −1 (D)) d p(α ) for every measurable set D ⊂ PRd .
(10.3)
Observe that a probability η is p-stationary if and only if p ∗ η = η .
Similarly, given p1 , p2 ∈ G (d), we denote by p1 ∗ p2 ∈ G (d) the push-
forward of the product measure p1 × p2 under the map (α1 , α2 ) → α1 α2 . In
other words,

p1 ∗ p2 (E) = p2 (α1−1 E) d p1 (α1 ) = p1 (E α2−1 ) d p2 (α2 ) (10.4)
for every measurable set E ⊂ GL(d). We call p1 ∗ p2 the convolution of p1 and

p2 . Clearly, p1 ∗ p2 is compactly supported if p1 and p2 are. Given p ∈ G (d),
define p(1) = p and p(n+1) = p ∗ p(n) for every n ≥ 1.
Remark 10.3 The operations (10.3) and (10.4) make sense, more generally,
for measures (not necessarily probabilities) p1 , p2 and p in GL(d) and η in
PRd .
Given α ∈ GL(d) and [v] ∈ PRd , we denote by Dα ([v]) the derivative at

the point [v] of the action of α in the projective space. In explicit terms (see

10.2 Expanding points in projective space 175
Exercise 10.3):
projα (v) α (v̇)
Dα ([v])v̇ = for every v̇ ∈ T[v] PRd = {v}⊥ (10.5)
α (v)/v
where projw : u → u − w(u · w)/(w · w) denotes the orthogonal projection to the
hyperplane orthogonal to w. When d = 2, the tangent hyperplanes are lines and
so Dα ([v]) is a real number.
Given p ∈ G (d) and x ∈ PRd , we say that x is p-invariant if it is a fixed point
for p-almost every α ∈ GL(d) or, equivalently, for every α ∈ supp p. Then we
say that x is p-expanding if there exist ≥ 1 and c > 0 such that

log Dβ (x)v̇ d p() (β ) ≥ 2c for every unit vector v̇ ∈ Tx PRd . (10.6)
Part (b) of Proposition 10.2 means that L is p-invariant. In view of part (c),
the next proposition implies that L is also p-expanding:
Proposition 10.4 Let x ∈ PRd be a p-invariant point and suppose that there
exist a < b such that

1 ≤ a if v ∈ x \ {0}
lim log α (n) (v)
n n ≥ b if v ∈ Rd \ x.
for μ -almost all α ∈ M. Then x is a p-expanding point.
Proof By Proposition 4.14 and the hypothesis,

1 1
lim log projα (n) (x) α (n) (v̇) = lim log α (n) (v̇) ≥ b
n n n n
for every unit vector v̇ ∈ Tx RPd = {x}⊥ and μ -almost every α ∈ M. Then,
1
lim log Dα (n) (x)v̇ ≥ b − a > 0
n n
for every unit vector v̇ ∈ {x}⊥ and μ -almost every α ∈ M. Since the support
of p is compact, the inequalities in Exercise 10.3 ensure that there exists C > 0
such that −C ≤ log Dα (x)v̇ ≤ C for every α ∈ supp p and every univ vector
v̇ ∈ {v}⊥ . It follows that
1
−C ≤ log Dα (n) (x)v̇ ≤ C
n
for every α ∈ supp p(n) , every unit vector v̇ ∈ {v}⊥ and every n ≥ 1. So, using
the bounded convergence theorem, the previous inequality implies that

1 1
lim log Dβ (x)v̇ d p(n) (β ) = lim log Dα (n) (x)v̇ d μ (α ) ≥ b − a
n n n n

176 Continuity
for every unit vector v̇ ∈ {x}⊥ . Let c = (b − a)/2 and, for each unit vector
v̇ ∈ {x}⊥ , let n(v̇) ≥ 1 be the smallest value of n such that

1
log Dβ (x)v̇d p(n) (β ) > c.
n
Note that n(v̇) depends upper semi-continuously on v̇, and so
n0 = sup{n(v̇) : v̇ ∈ {x}⊥ with v̇ = 1}
is finite. Fix C > 0 such that

log Dβ (x)v̇d p(n) (β ) ≥ −C
for 1 ≤ n ≤ n0 and every unit vector v̇ ∈ {x}⊥ . Then

n
log Dβ (x)v̇d p(n) (β ) ≥ c n0 −C ≥ 2c
n0
for every unit vector v̇ ∈ {x}⊥ , if n is sufficiently large.
The heart of the proof of Theorem 10.1 is the statement that p-expanding
points are invisible for the p-stationary measures that are limits of pk -station-
ary non-atomic measures with pk → p. In precise terms:
Theorem 10.5 Suppose that x ∈ PRd is a p-expanding point. Let (pk )k be
a sequence in G (d) converging to p and, for each k ≥ 1, let ηk ∈ P(d) be a
pk -stationary measure. Suppose that (ηk )k converges to some η . If ηk is non-
atomic for k arbitrarily large, then η ({x}) = 0.
The proof of this theorem will be given in Sections 10.4 and 10.5. Right
now, let us explain how it can be used to complete the proof of Theorem 10.1.
10.3 Proof of the continuity theorem

The existence of an invariant subspace L implies that no matrix α ∈ supp p
is elliptic. If the support consisted only of parabolic matrices (with the same
invariant subspace) and multiples of the identity then we would have λ− (p) =
λ+ (p), which is assumed not to be the case. Thus, supp p contains some hy-
perbolic matrix. Since the set of hyperbolic matrices is open, it follows that it
has positive measure for p. This fact will be useful in a while.
By Proposition 10.2 and Theorem 10.5, we may suppose that ηk is atomic
for every large k. Then, by Lemma 6.9, there exists a finite set Lk ⊂ PR2
such that α (Lk ) = Lk for every α ∈ supp p and η ({y}) > 0 for every y ∈ Lk .
The open set of hyperbolic matrices has positive measure for pk , for every

10.3 Proof of the continuity theorem 177
large k, because pk is close to p in the weak∗ topology. Observe, furthermore,

that for a hyperbolic matrix any finite invariant set consists of either one or
two eigenspaces. Thus, for every large k, the set Lk can have no more than 2
elements.
First, suppose that Lk has a single element Lk . We claim that

Φ(α , Lk ) d pk (α ) = λ+ (pk ). (10.7)
To prove this, let ζk be the Dirac mass at the point Lk . The product measure
μk × ζk is PF-invariant, and so it follows from the ergodic theorem that

Φ(α , Lk ) d pk (α ) = Φ̃(α , Lk ) d μk (α ). (10.8)
We also have that Φ̃(α , [v]) = λ+ (pk ) for μk × ηk -almost every (α , [v]), be-

cause Φ̃(α , [v]) d ηk ([v]) d μk (α ) = λ+ (pk ) and Φ̃(α , [v]) ≤ λ+ (pk ) for μk -
almost every α ∈ M and every [v] ∈ PR2 . Since ηk ({Lk }) > 0, this implies
Φ̃(α , Lk ) = λ+ (pk ) for μk -almost every α ∈ M. Together with (10.8), this im-
plies the claim (10.7).
Up to restricting to a subsequence, may assume that (Lk )k converges to some
subspace L0 ∈ PR2 . Let ζ0 be the Dirac mass at L0 . Since supp pk converges
to supp p in the Hausdorff distance (Exercise 10.4), it follows that α (L0 ) = L0
for every α ∈ supp p. Thus, the product measure μ × ζ0 is invariant under the
projective cocycle PF. Since Φ is a continuous function, passing to the limit
in (10.7) we get that

λ+ (pk ) → Φ(α , L0 ) d p(α ) = Φ̃(α , L0 ) d μ (α ) (10.9)
According to Proposition 10.2, there are two possibilities:
(i) If L0 = L then the right-hand side of (10.9) is equal to λ+ (p). Then

Φ d ηk d μk = λ+ (pk ) converges to λ+ (p), which contradicts (10.1).
(ii) If L0 = L then the right-hand side of (10.9) is λ− (p). Then (Exercise 10.1)
we also have that λ− (pk ) → λ+ (p). This is a contradiction, because
λ− (pk ) ≤ λ+ (pk ) for every k and λ− (p) < λ+ (p).
We have shown that, for k large, Lk cannot consist of a single point.

Finally, suppose that, for all k large, Lk consists of two points, Lk and Lk ,
with positive measure for η . Let ζk = (δLk + δL )/2. Arguing as we did for
k
(10.7), we find that

1 1
Φ(α , Lk ) d pk (α ) + Φ(α , Lk ) d pk (α ) = λ+ (pk ). (10.10)
2 2

178 Continuity
We may assume that (Lk )k and (Lk )k converge to subspaces L0 and L0 , respec-
tively. Let ζ0 = (δL0 + δL )/2. Passing to the limit in (10.10),
0

1 1
λ+ (pk ) → Φ(α , L0 ) d p(α ) + Φ(α , L0 ) d p(α )
2 2
(10.11)
1 1
= Φ̃(α , L0 ) d p(α ) + Φ̃(α , L0 ) d p(α ).
2 2
If L0 and L0 are both different from L then the right-hand side is equal to
λ+ (p) and we reach a contradiction just as we did in case (i) of the previous
paragraph. Now suppose that L0 = L. We may assume that, for each k large,
there exists αk ∈ supp pk such that αk (Lk ) = Lk : otherwise, we could take Lk =
{Lk } instead, and that case has already been dealt with. Passing to the limit
along a convenient subsequence, we find α0 ∈ supp p such that α0 (L0 ) = L0 .
By part (b) of Proposition 10.2, this implies that L0 is also equal to L. Then the
right-hand side of (10.11) is equal to λ− (p), and we reach a contradiction just
as we did in case (ii) of the previous paragraph. So, Lk cannot consist of two
points either.
We have reduced the proof of Theorem 10.1 to proving Theorem 10.5.
10.4 Couplings and energy

For simplicity, we will write P = PRd and G = GL(d). Let d be the distance
on P defined by the angle between two directions. For any Borel measure ξ on
P × P and δ > 0, define the δ -energy of ξ to be

Eδ (ξ ) = d(x, y)−δ d ξ (x, y).
Let π i : P × P → P be the projection on the i th coordinate, for i = 1, 2.

By definition, the mass of a measure η on P is the total measure η = η (P)
of the ambient space. If η1 and η2 are measures on P with the same mass, a
coupling of η1 and η2 is a measure on P × P that projects to ηi on the i th
coordinate for i = 1, 2. For instance,
1 1
ξ= η1 × η2 = η1 × η2
η1 η2
is a coupling of η1 and η2 . Define:
eδ (η1 , η2 ) = inf{Eδ (ξ ) : ξ a coupling of η1 and η2 }. (10.12)
The infimum is always achieved, because the function ξ → Eδ (ξ ) is lower
semi-continuous (Exercise 10.5).

10.4 Couplings and energy 179
A self-coupling of a measure η ∈ P(d) is a coupling of η and η . We call a

self-coupling symmetric if it is invariant under the involution ι : (x, y) → (y, x).
Define the δ -energy of η to be:
eδ (η ) = eδ (η , η ) = inf{Eδ (ξ ) : ξ a self-coupling of η }. (10.13)
We call p ∗ η the convolution of η by p. Observe that a probability η is

p-stationary if and only if p ∗ η = η . If ξ is any self-coupling then ξ =
(ξ + ι∗ ξ )/2 is a symmetric self-coupling and Eδ (ξ ) = Eδ (ξ ) Thus, the in-
fimum in (10.13) is always achieved at some symmetric self-coupling of η .
Such symmetric self-couplings are called δ -optimal.
Example 10.6 Let P = [0, 2] and η be the convex combination of the Dirac
masses at 0, 1 and 2; that is,
1 1 1
η = δ0 + δ1 + δ2 .
3 3 3
Since η is a probability, the product ξ1 = η × η is a (symmetric) self-coupling
of η . Observe that the product ξ1 has atoms on the diagonal of P × P and, thus,
its δ -energy is infinite. Now, consider
1 1 1 1 1 1
ξ2 = δ(0,1) + δ(0,2) + δ(1,0) + δ(1,2) + δ(2,0) + δ(2,1) .
6 6 6 6 6 6
Observe that ξ2 is a symmetric self-coupling of η . Moreover, the support of ξ2
does not intersect the diagonal and, consequently, Eδ (ξ2 ) is finite. This proves
that eδ (η ) < ∞. It is easy to check that ξ2 is a δ -optimal self-coupling of η .
This notion of energy allows us to characterize the presence of atoms:
Lemma 10.7 For any η ∈ P(d) and δ > 0:
(1) if η ({x}) > η /2 for some x ∈ P then eδ (η ) = ∞;

(2) if η ({x}) < η /2 for every x ∈ P then eδ (η ) < ∞.
Proof First we prove (1). Take x as in the hypothesis. For any self-coupling
ξ of η , we have
ξ ({x}c × P) = η ({x}c ) = ξ (P × {x}c ).
Taking the union, ξ ({(x, x)}c ) ≤ 2η ({x}c ) < η (P) = ξ (P × P). This implies
that (x, x) is an atom for ξ , and so Eδ (ξ ) = ∞.
Now we prove (2). The hypothesis implies (Exercise 10.7) that there exists
ρ > 0 such that η (B(x, 2ρ )) < η (B(x, 2ρ )c ) for every x ∈ P. Let x1 , . . . , xl be

180 Continuity
such that U = {B(xi , ρ ) : i = 1, . . . , l} covers P. We claim that there exists a

finite sequence ξ0 , . . . , ξl of symmetric self-couplings of η such that

ξ j B(xi , ρ ) × B(xi , ρ ) = 0 for every 1 ≤ i ≤ j. (10.14)
Denote Δr = {(y, z) ∈ P × P : d(y, z) < r}. Take r > 0 to be a Lebesgue number

for the open cover U . Then, by definition,

l
Δr ⊂ B(xi , ρ ) × B(xi , ρ ).
i=1

It follows that ξl Δr = 0 and so Eδ (ξl ) < ∞. We are left to prove our claim.
We are going to construct the sequence ξ j by induction of j. Let ξ0 be an
arbitrary self-coupling of η . For each j = 1, . . . , l, suppose that ξ j−1 has been
constructed in such a way that B(xi , ρ ) × B(xi , ρ ) has zero measure for 1 ≤ i ≤
j −1. Denote ξ = ξ j−1 and B = B(x j , ρ ) and C = B(x j , 2ρ ) and D = B(x j , 2ρ )c .
Note that (see Figure 10.1)
• ξ (C ×C) + ξ (C × D) = η (C), because π∗1 ξ = η ;

• ξ (D ×C) + ξ (D × D) = η (D), because π∗1 ξ = η ;
• ξ (C × D) = ξ (D ×C), because ξ is symmetric.
Since η (C) < η (D), this implies that ξ (B × B) ≤ ξ (C × C) <ξ (D × D). Let
θ = ξ (B × B)/ξ (D × D) and then let ζ be any coupling of π∗ ξ | B × B and
1
θ π∗1 ξ | D × D . This is well defined, because of the choice of θ . Moreover, ζ

is concentrated on B × D and, in particular, it satisfies ζ (Δρ ) = 0. Define (see
Figure 10.1)

ξ j = ξ − ξ | B × B − θ ξ | D × D + ζ + ι∗ ζ . (10.15)
Observe that ξ j is a (positive) measure, because θ < 1. It is also clear that

ξ j is symmetric. Moreover,

π∗1 ξ j = π∗1 ξ − π∗1 ξ | B × B − θ π∗1 ξ | D × D

+ π∗1 ξ | B × B + θ π∗1 ξ | D × D = π∗1 ξ = η .
This shows that ξ j is indeed a self-coupling of η . The definition (10.15) also

gives that ξ j (B × B) = 0 and (since ζ and ι∗ ζ give zero weight to Δρ )

ξ j B(xi , ρ ) × B(xi , ρ ) ≤ ξ B(xi , ρ ) × B(xi , ρ )) = 0
for every 1 ≤ i ≤ j − 1. Thus, ξ j satisfies (10.14).

y
D +ζ −θ (ξ | D × D)
C
B + ι∗ ζ
B C D
Figure 10.1 Removing mass from the neighborhood of the diagonal
10.5 Conclusion of the proof

We are ready to finish the proof of Theorem 10.5 and, thus, of Theorem 10.1.
Keep in mind that we denote P = PRd and G = GL(d).
Let x be a p-expanding point. Fix constants c > 0 and ≥ 1 as in the def-
inition (10.6). Up to restricting to a subsequence, we may suppose that (ηk )n
converges to η . Assume, for contradiction, that η ({x}) > 0. Then let U0 ⊂ P
be an open neighborhood of x such that
9
η ({x}) > η (Ū0 ).
10
Lemma 10.8 Assuming that δ > 0 is small enough and k ≥ 1 is large enough,

then Dβ (x)v̇−δ d pk (β ) ≤ 1 − cδ for every unit vector v̇ ∈ Tx P.
()
Proof Fix some compact neighborhood V of the support of p. Take k to be

large enough that supp pk ⊂ V . Each function

Dβ (x)v̇−δ d pk (β )
()
ψv̇,k (δ ) =
is differentiable and the derivative is given by

Dβ (x)v̇−δ log Dβ (x)v̇ d pk (β ).
()
ψv̇,k (δ ) = −

(0) = − log Dβ (x)v̇ d p (β ). Since p ()() in the weak∗ ()
Note that ψv̇,k k k →p
topology, and the function β → log Dβ (x)v̇ is continuous on V , we get that

lim ψv̇,k (0) = − log Dβ (x)v̇ d p() (β ) ≤ −2c for every v̇.
k

182 Continuity
This convergence is uniform on v̇, as the sequence of functions v̇ → ψv̇,k (0)
is equicontinuous. So, assuming that k is large enough, ψv̇,k (0) ≤ −3c/2 for
every v̇. Since β and v̇ live inside compact sets V and Tx P, respectively, the
1
factor Dβ (x)v̇−δ is uniformly close to 1 if δ is close to 0. Thus, in particular,

(δ ) ≤ −c for every v̇, every large k and every small δ . Since ψ (0) = 1,
ψv̇,k v̇,k
this gives the claim.
Fix δ as in Lemma 10.8 and take k to be large enough that the conclusion of
the lemma holds; further conditions will be imposed on k along the way. For
()
notational simplicity, we write Π = p() and Πk = pk for each k.
Let U1 ⊂ U0 be an open neighborhood of x such that

d(β (y), β (z))−δ dΠk (β ) ≤ (1 − cδ )d(y, z)−δ , (10.16)
for any pair of distinct points y, z ∈ U1 . Since x is a p-invariant point, β (x) = x

for every β ∈ supp Π. Let K be a compact neighborhood of the support of Π
and U1 ⊃ U2 ⊃ U3 ⊃ U4 ⊃ U5 be open neighborhoods of x such that
β −1 (U2 ) ⊂ U1 and Ū3 ⊂ U2 and Ū4 ⊂ U3 and β (U5 ) ⊂ U4 for every β ∈ K.
Take k to be large enough that supp Πk ⊂ K.

The δ -energy of every ηk | U1 is finite, because the ηk are assumed to be
non-atomic. The strategy for the proof of Theorem 10.5 is to use the expansion
property (10.16) to find a uniform bound for these δ -energies. That ensures
that the δ -energy remains finite at k = ∞, contradicting the existence of a fat
atom for η | U1 at x. The main step for realizing this strategy is the following
proposition:
Proposition 10.9 For each k ≥ 1, let ξk be any symmetric self-coupling of

ηk | U1 . Then there exists C > 0 such that
eδ (ηk | U1 ) ≤ (1 − cδ )Eδ (ξk ) +C for every k sufficiently large.
Theorem 10.5 can be deduced as follows. Suppose there exists a subsequence

(k j ) j → ∞ such that ηk j has no atoms in U1 and, in particular, eδ (ηk j | U1 )
is finite. Let ξk j be a δ -optimal symmetric self-coupling of ηk j ; that is, such
that Eδ (ξk j ) = eδ (ηk j | U1 ). Then Proposition 10.9 gives that eδ (ηk j | U1 ) ≤
C/(cδ ). Let η̂ be any accumulation point of the sequence (ηk j | U1 ) j . On the
one hand, by lower semi-continuity,
C
eδ (η̂ ) ≤ lim sup eδ (ηk j | U1 ) ≤ < ∞.
j cδ

On the other hand, η | U1 ≤ η̂ ≤ η | Ū1 and, in particular,

9
η̂ ({x}) > η̂ .
10
The latter implies that eδ (η̂ ) = ∞, which contradicts the previous conclusion.
This contradiction proves Theorem 10.5.
10.5.1 Proof of Proposition 10.9

We need to construct a (symmetric) self-coupling of ηk | U1 whose δ -energy
is significantly smaller than that of ξk . A good starting point is the diagonal
convolution ξ̃k of Πk and ξk ; that is, the image of Πk × ξk under the diagonal
push-forward (β , (y, z)) → (β (y), β (z)). Equivalently,

ξ̃k (D1 , D2 ) = ξk (β −1 (D1 ) × β −1 (D2 )) dΠk (β ) (10.17)
for any measurable sets D1 , D2 ⊂ PRd . Indeed, it is clear that ξ̃k is a symmetric
measure and the expansion property (10.16) implies that

Eδ (ξ̃k ) = d(y, z)−δ d ξ̃k (y, z) = d(β (y), β (z))−δ dΠk (β ) d ξk (y, z)

≤ (1 − cδ ) d(y, z)−δ d ξk (y, z) = (1 − cδ )Eδ (ξk ).
(10.18)
However, the restriction of ξ̃k to U1 × U1 is not a self-coupling of ηk | U1 .
Indeed, let ηk1 be the projection of ξ̃k | U1 × U1 on either coordinate (thus,
ξ̃k | U1 × U1 is a symmetric self-coupling of ηk1 ). The next lemma shows that
ηk1 ≤ ηk | U1 and it describes the difference between the two measures pre-
cisely. There are two terms, ik and ok , that correspond to displacements of
mass in opposite directions across the border of U1 under the diagonal push-
forward: ik accounts for mass that is mapped from the outside into U1 × U1
whereas ok comes from mass that overflows U1 ×U1 . See Figure 10.2.
Lemma 10.10 (ηk | U1 ) − ηk1 = ik + ok where ik is the restriction to U1 of

Πk ∗ (ηk | U1c ) and ok is the projection of ξ̃k | (U1 ×U1c ) on the first coordinate.
Moreover,
1
ok ≤ ik and lim sup ik ≤ η ({x}).
k 10
Proof It follows from (10.17) that

π∗j ξ̃k = (ηk | U1 ) ◦ β −1 dΠk (β ) = Πk ∗ (ηk | U1 ) for j = 1, 2.

184 Continuity
10.5 Conclusion of the proof
U1c
Ok
Ik
U1
U2
U2 U1 U1c
Figure 10.2 Diagonal push-forward: mass in region Ik projects to ik and mass in

region Ok projects to ok
Combining this with the fact that ηk is Πk -stationary, we find

π∗1 ξ̃k = Πk ∗ ηk − Πk ∗ (ηk | U1c ) = ηk − Πk ∗ (ηk | U1c ),
and so π∗1 ξ̃k | U1 = (ηk | U1 ) − ik . Moreover,

π∗1 ξ̃k | U1 = π∗1 ξ̃k | U1 ×U1 + π∗1 ξ̃k | U1 ×U1c = ηk1 + ok .
The first claim is a direct consequence of these two equalities.
Next, observe that

Πk ∗ (ηk | U1 ) (U1c ) + Πk ∗ (ηk | U1 ) (U1 ) = Πk ∗ (ηk | U1 ) (P) = ηk (U1 )
and, using that ηk is Πk -stationary,

Πk ∗ (ηk | U1c ) (U1 ) + Πk ∗ (ηk | U1 ) (U1 ) = Πk ∗ ηk (U1 ) = ηk (U1 ).
This implies that

Πk ∗ (ηk | U1c ) (U1 ) = Πk ∗ (ηk | U1 ) (U1c ). (10.19)
The left-hand side is precisely ik . As for the right-hand side,

Πk ∗ (ηk | U1 ) (U1c ) = (π∗2 ξ̃k )(U1c ) = ξ̃k (P ×U1c ) ≥ ξ̃k (U1 ×U1c ) = ok .
This proves that ok ≤ ik .
Finally, the first part of the lemma implies that ik ≤ ηk | U1 . Moreover, the
choice of U2 ensures that U2 ∩ β −1 (U1 ) = 0/ for every β ∈ supp Πk and so
ik (U2 ) = 0. Thus,
1
lim sup ik ≤ lim sup ηk (Ū1 \U2 ) ≤ η (Ū1 \U2 ) ≤ η ({x}).
k k 10
This finishes the proof of the lemma.

According to Lemma 10.10, the difference (ηk | U1 ) − ηk1 is positive and

relatively small. This suggests that we try to construct the self-coupling of
ηk | U1 we are looking for, with small δ -energy, by adding suitable correcting
terms to the restriction of ξ̃k to U1 × U1 . These correcting terms should be
concentrated outside a neighborhood of the diagonal, if possible, so that their
contribution to the total energy is bounded. We choose:
!
ζk
ξ̄k = (ξ̃k | U1 ×U1 ) − 4 (ξ̃k | U4 ×U4 )
ηk
1
+ (ok | U3 ) × ik + ik × (ok | U3 ) (10.20)
ik
1
+ 4 (ζk × ηk4 ) + (ηk4 × ζk )
ηk
where ηk4 is the projection (on either coordinate) of ξ̃k | U4 ×U4 and

ok | U3
ζk = 1 − ik + (ok | U3c ).
ik
Lemma 10.11 Assume that k ≥ 1 is large enough. Then ξ̄k is a symmetric
self-coupling of ηk | U1 . Moreover,
Eδ (ξ̄k ) ≤ Eδ (ξ̃k | U1 ×U1 ) +C,
where C is a positive constant independent of k.
Proof It is clear that the three terms on the right-hand side of (10.20) are sym-
metric. Moreover, ok | U3 ≤ ok ≤ ik . This ensures that ζk is a positive
measure. Now it is clear that the last two terms in (10.20) are positive. To see
that the first one is also positive, it suffices to check that ζk < ηk4 . That
is easily done as follows. We have ηk (U5 ) ≥ (9/10)ηk (U1 ) and that implies
ξk (U5 ×U5 ) ≥ (8/10)ηk (U1 ). Since β −1 (U4 ) ⊃ U5 for every β ∈ supp Πk , the
definition (10.17) gives that ξ̃k (U4 ×U4 ) ≥ ξk (U5 ×U5 ). Thus,
8 8
ηk4 = ξ̃k (U4 ×U4 ) ≥ ηk (U1 ) ≥ η ({x}).
10 10
Furthermore, assuming k is sufficiently large,
3
ζk ≤ ik + ok ≤ 2ik ≤ η ({x}).
10
In particular, ζk < ηk4 as we wanted to prove.
Next, observe that the projection of ξ̄k on either coordinate is equal to
! ! !
ζk 4 ok | U3 ζk 4
ηk − 4 ηk + (ok | U3 ) +
1
ik + ζk + 4 ηk
ηk ik ηk

186 Continuity
U1
(ok | U3 ) × ik
U2
ηk4 × ζk
U3
ik × (ok | U3 )
U4
ζk × ηk4
U4 U3 U2 U1
Figure 10.3 Coupling terms far from the diagonal
and that adds up to ηk1 + ik + ok = ηk | U1 . So, ξ̄k is a self-coupling of ηk | U1

as claimed.
The final step is to estimate the δ -energy of ξ̄k . It is important to notee that
the two correcting terms
1 1
(ok | U3 ) × ik + ik × (ok | U3 ) and (ζk × ηk4 ) + (ηk4 × ζk )
ik ηk
4
are supported away from the diagonal of P × P. Indeed, ok | U3 is concentrated

on U3 while ik is concentrated on U2c , as we have seen. Similarly, ζk is con-
centrated on U3c while ηk4 is concentrated on U4 . See Figure 10.3. Moreover,
their masses are uniformly bounded (by 1, say). Thus, the δ -energy of the
sum of these two terms is bounded by some constant C > 0, for every large
k. As for the first term in (10.20), it is clear that its δ -energy is bounded by
Eδ (ξ̃k | U1 ×U1 ). Thus, Eδ (ξ̄k ) ≤ Eδ (ξ̃k | U1 ×U1 ) +C, as claimed.
Combining (10.18) with Lemma 10.11, we get
eδ (ηki ) ≤ Eδ (ξ̄k ) ≤ Eδ (ξ̃k | U1 ×U1 ) +C ≤ Eδ (ξ̃k ) +C ≤ (1 − cδ )Eδ (ξk ) +C
as claimed in Proposition 10.9.
10.6 Final comments

We saw in Chapter 9 that discontinuity of the Lyapunov exponents is common
among cocycles with low regularity: continuous or Hölder-continuous cocy-
cles with small Hölder constant. On the other hand, Theorems 1.3 and 10.1
show that, at least in dimension 2, for locally constant cocycles (cocycles with

10.6 Final comments 187
“infinite Hölder constant”) one always has continuity. In what follows, we

present a possible global scenario for 2-dimensional cocycles over “chaotic”
systems, encompassing these two contrasting conclusions.
Let X = {1, . . . , m} and f : M → M be the shift map on M = X Z . Fix θ ∈
(0, 1) and endow M with the metric
dθ (x, y) = θ N(x,y) , where N(x, y) = max{N ≥ 0 : xn = yn for all |n| < N}.
Let A : M → GL(2) be r-Hölder continuous for this metric. In other words,
there exists C1 > 0 such that
A(x) − A(y) ≤ C1 θ Nr for any x, y with xn = yn for all |n| < N. (10.21)
The cocycle F defined by A over f is fiber-bunched if there exists C2 > 0 and
λ < 1 such that
An (x) An (x)−1 θ nr ≤ C2 λ n for every x ∈ M and n ≥ 1. (10.22)
This notion was introduced in [37], under a different name.
Conjecture 10.12 Lyapunov exponents vary continuously restricted to the
subset of fiber-bunched elements A : M → GL(2) of the space of r-Hölder-
continuous cocycles.
Observe that in Section 9.3 we had A = A−1 = σ and θ = 1/2, and
so the cocycle is fiber-bunched if and only if σ 2 < 2r . Thus, the hypothesis
σ > 22r in Theorem 9.22 is incompatible with fiber-bunching.
It is clear from the definition that fiber-bunched cocycles form an open sub-
set of the space of r-Hölder-continuous cocycles for any r > 0. An important
feature of fiber-bunched cocycles is that they admit invariant stable and unsta-
ble holonomies. Let us explain this.
s (x) of a point x ∈ M is the set of all y ∈ M that have
The local stable set Wloc
the same positive part; that is, such that π + (x) = π + (y). The local unstable
set is defined analogously, requiring π − (x) = π − (y) instead. The stable and
unstable sets are defined by
∞
s n ∞
u −n
W s (x) = f −n Wloc ( f (x)) and W u (x) = f n Wloc ( f (x)) ,
n=0 n=0
respectively. An s-holonomy for the cocycle associated with A is a family of

linear maps hsx,y : Rd → Rd , defined for every x and y in the same stable set,
such that
(1) hsy,z ◦ hsx,y = hsx,z and hsx,x = id for every x, y, z in the same stable set;
(2) A(y) ◦ hsx,y = hsf (x), f (y) ◦ A(x) for every x, y in the same stable set;

188 Continuity
(3) (x, y, v) → hsx,y (v) is continuous, when x and y vary in the set of points in
the same local stable set;
(4) there are C > 0 and t > 0 such that hsx,y is (C,t)-Hölder for every x and y
in the same local stable set.
Analogously, one defines u-holonomy for the cocycle. If A is fiber-bunched
then
hsx,y = lim An (y)−1 An (x) and hux,y = lim A−n (y)−1 A−n (x)
n n
are well defined and they yield an s-holonomy and a u-holonomy for the cocy-
cle.
Fiber-bunching is not necessary for the existence of invariant holonomies,
of course. For instance, for locally constant cocycles one may always take
hsx,y = id for all y ∈ Wloc
s
(x) and hux,y = id for all y ∈ Wloc
u
(x).
Thus, the following statement contains both Theorem 1.3 and Conjecture 10.12:
Conjecture 10.13 Let H be a family of r-Hölder-continuous A : M → GL(2)
such that stable and unstable holonomies, hsA,x,y and huA,x,y , exist for all A ∈ H
and vary continuously on H . Then the Lyapunov exponents vary continuously
on H .
Here, the continuity condition means that (x, y, v, A) → hsA,x,y (v) is contin-
uous when x and y vary in the set of points in the same local stable set, and
(x, y, v, A) → huA,x,y (v) is continuous when x and y vary in the set of points in
the same local unstable set. Moreover, the Hölder constants C and t in condi-
tion (4) are assumed to be locally uniform in A.
Hyperbolic cocycles need not be fiber-bunched neither do the cocycles with
λ− = λ+ . Thus, Corollary 9.3 and Lemma 9.4 imply that the converse to
Conjecture 10.12 is not true in general. Instead, we state:
Conjecture 10.14 Consider any r-continuous function A : M → GL(2) such
that λ− (A, μ ) < λ+ (A, μ ) and the cocycle defined by A over f is not hyper-
bolic. If A is not fiber-bunched then there exists a sequence (An )n converging
to A in the space of r-Hölder-continuous functions such that limn λ+ (An , μ ) <
λ+ (A, μ ).
Young [118] constructed a C1 -open set of SL(2)-cocycles over an expand-
ing circle map which are neither hyperbolic nor fiber-bunched and whose Lya-
punov exponents are uniformly bounded from zero. The estimates are not suf-
ficient to decide whether the Lyapunov exponents vary continuously in this set.
Still, if such an open set can be constructed for Hölder-continuous cocycles,
this could be a good test case for Conjecture 10.14.

10.7 Notes 189
10.7 Notes
Theorem 10.1 was proven by Bocker and Viana [35], based on certain esti-
mates of the random walk induced by the cocycle on PR2 , which they ob-
tained through a suitable discretization of the projective space. Avila, Eskin
and Viana [12] use a very different analysis of the random walk to extend
the theorem to arbitrary dimension d ≥ 2. The presentation in Sections 10.1
through 10.5 is based on the latter.
The dependence of Lyapunov exponents on the linear cocycle or the base
dynamics has been studied by many authors. Ruelle [104] proved real-analytic
dependence of the largest exponent on the cocycle, for linear cocycles admit-
ting an invariant convex cone field. Short afterwards, Furstenberg and Kifer [57,
70] and Hennion [61] studied the dependence of the largest exponent of ran-
dom matrices on the probability distribution, proving continuity with respect
to the weak∗ topology in the almost irreducible case; that is, when there is
at most one invariant subspace. Kifer [70] observed that Lyapunov exponents
may jump when the probability vector degenerates (Exercise 1.8) and John-
son [67] also found examples of discontinuous dependence of the exponent on
the energy E, for Schrödinger cocycles over quasi-periodic flows.
For random matrices satisfying strong irreducibility and the contraction prop-
erty, Le Page [94, 95] proved local Hölder continuity and even smooth depen-
dence of the largest exponent on the cocycle; the assumptions ensure that the
largest exponent is simple (multiplicity 1), by work of Guivarc’h and Raugi [60]
and Gol’dsheid and Margulis [58]. Le Page’s result cannot be improved: a con-
struction of Halperin (see Simon and Taylor [108]) shows that for every α > 0
one can find random Schrödinger cocycles near which the exponents fail to
be α -Hölder continuous. For random matrices with finitely many values and,
more generally, for locally constant cocycles over Markov shifts, Peres [98]
showed that simple exponents are locally real-analytic functions of the tran-
sition data. Recently, Bourgain and Jitomirskaya [40, 41] proved continuous
dependence of the exponents on the energy E, for quasi-periodic Schrödinger
cocycles.
10.8 Exercises
Exercise 10.1 Prove that if (pk )k converges to p in the space G (2) then
(λ+ (pk ) + λ− (pk ))k converges to λ+ (p) + λ− (p). Conclude that λ+ (pk ) →
λ+ (p) if and only if λ− (pk ) → λ− (p).
Exercise 10.2 Prove the following properties of the convolution operations:

190 Continuity
(1) ∗ : G (d) × P(d) → P(d) defined in (10.3) is continuous and it satisfies

(p1 ∗ p2 ) ∗ η = p1 ∗ (p2 ∗ η ).
(2) ∗ : G (d) × G (d) → G (d) defined in (10.4) is commutative, associative and
continuous. Is there a unit element?
Exercise 10.3 Verify the equality (10.5) and prove that
v̇
≤ Dα ([v])v̇ ≤ α α −1 v̇
α α −1
for every [v] ∈ PRd and v̇ ∈ {v}⊥ .
Exercise 10.4 Prove that if (pk )k converges to p in G (d) then (supp pk )k
converges to supp p relative to the Hausdorff distance, defined in the space of
compact subsets of GL(d) by
dH (K1 , K2 ) = inf{r > 0 : K1 ⊂ Br (K2 ) and K2 ⊂ Br (K1 )}.
Exercise 10.5 Prove the following.
(1) The function ξ → Eδ (ξ ) is lower semi-continuous with respect to the
weak∗ topology in the space of measures with a given mass on N × N.
(2) The function η → eδ (η ) is lower semi-continuous with respect to the
weak∗ topology in the space of measures with a given mass on N.
Exercise 10.6 Let η be a probability measure on a compact metric space P.
Check that the function U → eδ (η | U) may not be monotone.
Exercise 10.7 Let η be a Borel measure on some compact metric space P
such that η ({x}) < η /2 for every x ∈ P. Show that there exists τ > 0 such
that η (B(z, τ )) < η (B(z, τ )c ) for every z ∈ P.
Exercise 10.8 Let η be a probability on [0, 1] of the form η = (δ0 + μ )/2,
where μ is a probability on (0, 1]. Check that, depending on μ , the δ -energy
of η may be finite or infinite.
Exercise 10.9 State and prove an analogue of Lemma 10.7 for couplings of
any two measures η1 and η2 with the same mass.
Exercise 10.10 Check that the fiber-bunching property (10.22) does not de-
pend on the choice of θ in the definition of the distance.

References
[1] P. Anderson. Absence of diffusion in certain random lattices. Phys. Rev.,

109:1492–1505, 1958.
[2] D. V. Anosov. Geodesic flows on closed Riemannian manifolds of negative
curvature. Proc. Steklov Math. Inst., 90:1–235, 1967.
[3] V. Araújo and M. Bessa. Dominated splitting and zero volume for incompress-
ible three-flows. Nonlinearity, 21:1637–1653, 2008.
[4] A. Arbieto and J. Bochi. L p -generic cocycles have one-point Lyapunov spec-
trum. Stoch. Dyn., 3:73–81, 2003.
[5] L. Arnold. Random Dynamical Systems. Springer-Verlag, 1998.
[6] L. Arnold and N. D. Cong. On the simplicity of the Lyapunov spectrum of
products of random matrices. Ergod. Theory Dynam. Sys., 17:1005–1025, 1997.
[7] L. Arnold and V. Wishtutz (Eds.). Lyapunov Exponents (Bremen, 1984), volume
1186 of Lecture Notes in Math. Springer, 1986.
[8] A. Avila. Density of positive Lyapunov exponents for quasiperiodic SL(2, R)-
cocycles in arbitrary dimension. J. Mod. Dyn., 3:631–636, 2009.
[9] A. Avila and J. Bochi. Lyapunov exponents: parts I & II. Notes of mini-course
given at the School on Dynamical Systems, ICTP, 2008.
[10] A. Avila and J. Bochi. Proof of the subadditive ergodic theorem. Preprint
www.mat.puc-rio.br/∼jairo/.
[11] A. Avila and J. Bochi. A formula with some applications to the theory of Lya-
punov exponents. Israel J. Math., 131:125–137, 2002.
[12] A. Avila, A .Eskin and M. Viana. Continuity of Lyapunov exponents of random
matrix products. In preparation.
[13] A. Avila, J. Santamaria, and M. Viana. Holonomy invariance: rough regularity
and Lyapunov exponents. Astérisque, 358:13–74, 2013.
[14] A. Avila and M. Viana. Simplicity of Lyapunov spectra: a sufficient criterion.
Port. Math., 64:311–376, 2007.
[15] A. Avila and M. Viana. Simplicity of Lyapunov spectra: proof of the Zorich-
Kontsevich conjecture. Acta Math., 198:1–56, 2007.
[16] A. Avila and M. Viana. Extremal Lyapunov exponents: an invariance principle
and applications. Inventiones Math., 181:115–178, 2010.
[17] A. Avila, M. Viana, and A. Wilkinson. Absolute continuity, Lyapunov exponents
and rigidity I: geodesic flows. Preprint www.preprint.impa.br 2011.
191

192 References
[18] L. Barreira and Ya. Pesin. Nonuniform Hyperbolicity: Dynamics of Systems with
Nonzero Lyapunov Exponents. Cambridge University Press, 2007.
[19] L. Barreira, Ya. Pesin and J. Schmeling. Dimension and product structure of
hyperbolic measures. Ann. of Math., 149:755–783, 1999.
[20] M. Bessa. Dynamics of generic 2-dimensional linear differential systems. J.
Differential Equations, 228:685–706, 2006.
[21] M. Bessa. The Lyapunov exponents of generic zero divergence three-
dimensional vector fields. Ergodic Theory Dynam. Systems, 27:1445–1472,
2007.
[22] M. Bessa. Dynamics of generic multidimensional linear differential systems.
Adv. Nonlinear Stud., 8:191–211, 2008.
[23] M. Bessa and J. L. Dias. Generic dynamics of 4-dimensional C2 Hamiltonian
systems. Comm. Math. Phys., 281:597–619, 2008.
[24] M. Bessa and J. Rocha. Contributions to the geometric and ergodic theory of
conservative flows. Ergodic Theory Dynam. Sys., 33:1709–1731, 2013.
[25] M. Bessa and H. Vilarinho. Fine properties of lp-cocycles. J. Differential Equa-
tions, 256:2337–2367, 2014.
[26] G. Birkhoff. Lattice Theory, volume 25. A.M.S. Colloq. Publ., 1967.
[27] J. Bochi. Discontinuity of the Lyapunov exponents for non-hyperbolic cocycles.
Preprint www.mat.puc-rio.br/∼jairo/.
[28] J. Bochi. The multiplicative ergodic theorem of Oseledets. Preprint
www.mat.puc-rio.br/∼jairo/.
[29] J. Bochi. Proof of the Oseledets theorem in dimension 2 via hyperbolic geometry.
Preprint www.mat.puc-rio.br/∼jairo/.
[30] J. Bochi. Genericity of zero Lyapunov exponents. Ergod. Theory Dynam. Sys.,
22:1667–1696, 2002.
[31] J. Bochi. C1 -generic symplectic diffeomorphisms: partial hyperbolicity and zero
centre Lyapunov exponents. J. Inst. Math. Jussieu, 8:49–93, 2009.
[32] J. Bochi and M. Viana. Pisa lectures on Lyapunov exponents. In Dynamical Sys-
tems. Part II, Pubbl. Cent. Ric. Mat. Ennio Giorgi, pages 23–47. Scuola Norm.
Sup., 2003.
[33] J. Bochi and M. Viana. Lyapunov exponents: how frequently are dynamical
systems hyperbolic? In Modern Dynamical Systems and Applications, pages
271–297. Cambridge University Press, 2004.
[34] J. Bochi and M. Viana. The Lyapunov exponents of generic volume-preserving
and symplectic maps. Ann. of Math., 161:1423–1485, 2005.
[35] C. Bocker and M. Viana. Continuity of Lyapunov exponents for 2D random
matrices. Preprint www.impa.br/∼viana/out/bernoulli.pdf.
[36] C. Bonatti, L. J. Dı́az, and M. Viana. Dynamics Beyond Uniform Hyperbolicity,
volume 102 of Encyclopaedia of Mathematical Sciences. Springer-Verlag, 2005.
[37] C. Bonatti, X. Gómez-Mont, and M. Viana. Généricité d’exposants de Lyapunov
non-nuls pour des produits déterministes de matrices. Ann. Inst. H. Poincaré
Anal. Non Linéaire, 20:579–624, 2003.
[38] C. Bonatti and M. Viana. SRB measures for partially hyperbolic systems whose
central direction is mostly contracting. Israel J. Math., 115:157–193, 2000.
[39] N. Bourbaki. Algebra. I. Chapters 1–3. Elements of Mathematics (Berlin).
Springer-Verlag, 1989. Translated from the French, Reprint of the 1974 edition.

References 193
[40] J. Bourgain. Positivity and continuity of the Lyapounov exponent for shifts on
Td with arbitrary frequency vector and real analytic potential. J. Anal. Math.,
96:313–355, 2005.
[41] J. Bourgain and S. Jitomirskaya. Continuity of the Lyapunov exponent for
quasiperiodic operators with analytic potential. J. Statist. Phys., 108:1203–1218,
2002.
[42] M. Cambrainha. Generic symplectic cocycles are hyperbolic. PhD thesis, IMPA,
2013.
[43] C. Castaing and M. Valadier. Convex Analysis and Measurable Multifunctions.
Lecture Notes in Mathematics, Vol. 580. Springer-Verlag, 1977.
[44] J. E. Cohen, H. Kesten, and C. M. Newman (Eds.). Random Matrices and their
Applications (Brunswick, Maine, 1984), volume 50 of Contemp. Math. Amer.
Math. Soc., 1986.
[45] J. Conway. Functions of One Complex Variable. II, volume 159 of Graduate
Texts in Mathematics. Springer-Verlag, 1995.
[46] A. Crisanti, G.Paladin, and A. Vulpiani. Products of Random Matrices in Statis-
tical Physics, volume 104 of Springer Series in Solid-State Sciences. Springer-
Verlag, 1993. With a foreword by Giorgio Parisi.
[47] D. Damanik. Schrödinger operators with dynamically defined potentials. In
preparation.
[48] David Damanik. Lyapunov exponents and spectral analysis of ergodic
Schrödinger operators: a survey of Kotani theory and its applications. In Spec-
tral Theory and Mathematical Physics: a Festschrift in Honor of Barry Simon’s
60th birthday, volume 76 of Proc. Sympos. Pure Math., pages 539–563. Amer.
Math. Soc., 2007.
[49] L. J. Dı́az and R. Ures. Persistent homoclinic tangencies at the unfolding of
cycles. Ann. Inst. H. Poincaré, Anal. Non-linéaire, 11:643–659, 1996.
[50] A. Douady and J. C. Earle. Conformally natural extension of homeomorphisms
of the circle. Acta Math., 157:23–48, 1986.
[51] R. Dudley, H. Kunita, F. Ledrappier, and P. Hennequin (Eds.). École d’été de
probabilités de Saint-Flour, XII—1982, volume 1097 of Lecture Notes in Math.
Springer, 1984.
[52] J. Franks. Anosov diffeomorphisms. In Global Analysis (Proc. Sympos. Pure
Math., Vol. XIV, Berkeley, Calif., 1968), pages 61–93. Amer. Math. Soc., 1970.
[53] N. Friedman. Introduction to Ergodic Theory. Van Nostrand, 1969.
[54] H. Furstenberg. Non-commuting random products. Trans. Amer. Math. Soc.,
108:377–428, 1963.
[55] H. Furstenberg. Boundary theory and stochastic processes on homogeneous
spaces. In Harmonic Analysis in Homogeneous Spaces, volume XXVI of Proc.
Sympos. Pure Math. (Williamstown MA, 1972), pages 193–229. Amer. Math.
Soc., 1973.
[56] H. Furstenberg and H. Kesten. Products of random matrices. Ann. Math. Statist.,
31:457–469, 1960.
[57] H. Furstenberg and Yu. Kifer. Random matrix products and measures in projec-
tive spaces. Israel J. Math, 10:12–32, 1983.
[58] I. Ya. Gol’dsheid and G. A. Margulis. Lyapunov indices of a product of random
matrices. Uspekhi Mat. Nauk., 44:13–60, 1989.

194 References
[59] Y. Guivarc’h. Marches aléatories à pas markovien. Comptes Rendus Acad. Sci.
Paris, 289:211–213, 1979.
[60] Y. Guivarc’h and A. Raugi. Products of random matrices: convergence theorems.
Contemp. Math., 50:31–54, 1986.
[61] H. Hennion. Loi des grands nombres et perturbations pour des produits
réductibles de matrices aléatoires indépendantes. Z. Wahrsch. Verw. Gebiete,
67:265–278, 1984.
[62] M. Herman. Une méthode nouvelle pour minorer les exposants de Lyapunov
et quelques exemples montrant le caractère local d’un théorème d’Arnold et de
Moser sur le tore de dimension 2. Comment. Math. Helvetici, 58:453–502, 1983.
[63] A. Herrera. Simplicity of the Lyapunov spectrum for multidimensional continued
fraction algorithms. PhD thesis, IMPA, 2009.
[64] F. Rodriguez Hertz, M. A. Rodriguez Hertz, A. Tahzibi, and R. Ures. Maxi-
mizing measures for partially hyperbolic systems with compact center leaves.
Ergodic Theory Dynam. Sys., 32:825–839, 2012.
[65] E. Hopf. Statistik der geodätischen Linien in Mannigfaltigkeiten negativer
Krümmung. Ber. Verh. Sächs. Akad. Wiss. Leipzig, 91:261–304, 1939.
[66] S. Jitomirskaya and C. Marx. Dynamics and spectral theory of quasi-periodic
Schrödinger type operators. In preparation.
[67] R. Johnson. Lyapounov numbers for the almost periodic Schrödinger equation.
Illinois J. Math., 28:397–419, 1984.
[68] A. Katok. Lyapunov exponents, entropy and periodic points of diffeomorphisms.
Publ. Math. IHES, 51:137–173, 1980.
[69] Y. Katznelson and B. Weiss. A simple proof of some ergodic theorems. Israel J.
Math., 42:291–296, 1982.
[70] Yu. Kifer. Perturbations of random matrix products. Z. Wahrsch. Verw. Gebiete,
61:83–95, 1982.
[71] Yu. Kifer. Ergodic Theory of Random Perturbations. Birkhäuser, 1986.
[72] Yu. Kifer. General random perturbations of hyperbolic and expanding transfor-
mations. J. Analyse Math., 47:11–150, 1986.
[73] Yu. Kifer and E. Slud. Perturbations of random matrix products in a reducible
case. Ergodic Theory Dynam. Sys., 2:367–382 (1983), 1982.
[74] J. Kingman. The ergodic theorem of subadditive stochastic processes. J. Royal
Statist. Soc., 30:499–510, 1968.
[75] O. Knill. The upper Lyapunov exponent of SL(2, R) cocycles: discontinuity and
the problem of positivity. In Lyapunov Exponents (Oberwolfach, 1990), volume
1486 of Lecture Notes in Math., pages 86–97. Springer-Verlag, 1991.
[76] O. Knill. Positive Lyapunov exponents for a dense set of bounded measurable
SL(2, R)-cocycles. Ergod. Theory Dynam. Sys., 12:319–331, 1992.
[77] S. Kotani. Lyapunov indices determine absolutely continuous spectra of sta-
tionary random one-dimensional Schrödinger operators. In Stochastic Analysis,
pages 225–248. North Holland, 1984.
[78] H. Kunz and Bernard B. Souillard. Sur le spectre des opérateurs aux différences
finies aléatoires. Comm. Math. Phys., 78:201–246, 1980/81.
[79] F. Ledrappier. Propriétés ergodiques des mesures de Sinaı̈ Publ. Math. I.H.E.S.,
59:163–188, 1984.

References 195
[80] F. Ledrappier. Quelques propriétés des exposants caractéristiques. volume 1097

of Lect. Notes in Math., pages 305–396, Springer-Verlag, 1984.
[81] F. Ledrappier. Positivity of the exponent for stationary sequences of matrices.
In Lyapunov Exponents (Bremen, 1984), volume 1186 of Lect. Notes in Math.,
pages 56–73. Springer-Verlag, 1986.
[82] F. Ledrappier and G. Royer. Croissance exponentielle de certain produits
aléatorires de matrices. Comptes Rendus Acad. Sci. Paris, 290:513–514, 1980.
[83] F. Ledrappier and L.-S. Young The metric entropy of diffeomorphisms. I. Char-
acterization of measures satisfying Pesin’s entropy formula. Ann. of Math.,
122:509–539, 1985.
[84] F. Ledrappier and L.-S. Young The metric entropy of diffeomorphisms. II. Rela-
tions between entropy, exponents and dimension. Ann. of Math., 122:540–574,
1985.
[85] A. M. Lyapunov. The General Problem of the Stability of Motion. Taylor & Fran-
cis Ltd., 1992. Translated from Edouard Davaux’s French translation (1907) of
the 1892 Russian original and edited by A.T. Fuller, with an introduction and
preface by Fuller, a biography of Lyapunov by V.I. Smirnov, and a bibliogra-
phy of Lyapunov’s works compiled by J.F. Barrett, Lyapunov centenary issue,
Reprint of Internat. J. Control 55 (1992), no. 3 [MR1154209 (93e:01035)], With
a foreword by Ian Stewart.
[86] R. Mañé. A proof of Pesin’s formula. Ergod. Theory Dynam. Sys., 1:95–101,
1981.
[87] R. Mañé. Lyapunov exponents and stable manifolds for compact transforma-
tions. In Geometric Dynamics, volume 1007 of Lect. Notes in Math., pages
522–577. Springer-Verlag, 1982.
[88] R. Mañé. Oseledec’s theorem from the generic viewpoint. In Procs. Interna-
tional Congress of Mathematicians, Vol. 1, 2 (Warsaw, 1983), pages 1269–1276,
Warsaw, 1984. PWN Publ.
[89] R. Mañé. Ergodic Theory and Differentiable Dynamics. Springer-Verlag, 1987.
[90] A. Manning. There are no new Anosov diffeomorphisms on tori. Amer. J. Math.,
96:422–429, 1974.
[91] S. Newhouse. On codimension one Anosov diffeomorphisms. Amer. J. Math.,
92:761–770, 1970.
[92] V. I. Oseledets. A multiplicative ergodic theorem: Lyapunov characteristic num-
bers for dynamical systems. Trans. Moscow Math. Soc., 19:197–231, 1968.
[93] J. C. Oxtoby and S. M. Ulam. Measure-preserving homeomorphisms and metri-
cal transitivity. Ann. of Math., 42:874–920, 1941.
[94] É. Le Page. Théorèmes limites pour les produits de matrices aléatoires. In
Probability Measures on Groups (Oberwolfach, 1981), volume 928 of Lecture
Notes in Math., pages 258–303. Springer-Verlag, 1982.
[95] É. Le Page. Régularité du plus grand exposant caractéristique des produits de
matrices aléatoires indépendantes et applications. Ann. Inst. H. Poincaré Probab.
Statist., 25:109–142, 1989.
[96] J. Palis and S. Smale. Structural stability theorems. In Global Analysis, volume
XIV of Proc. Sympos. Pure Math. (Berkeley 1968), pages 223–232. Amer. Math.
Soc., 1970.

196 References
[97] L. Pastur. Spectral properties of disordered systems in the one-body approxima-

tion. Comm. Math. Phys., 75:179–196, 1980.
[98] Y. Peres. Analytic dependence of Lyapunov exponents on transition probabil-
ities. In Lyapunov Exponents (Oberwolfach, 1990), volume 1486 of Lecture
Notes in Math., pages 64–80. Springer-Verlag, 1991.
[99] Ya. B. Pesin. Families of invariant manifolds corresponding to non-zero charac-
teristic exponents. Math. USSR. Izv., 10:1261–1302, 1976.
[100] Ya. B. Pesin. Characteristic Lyapunov exponents and smooth ergodic theory.
Russian Math. Surveys, 324:55–114, 1977.
[101] M. S. Raghunathan. A proof of Oseledec’s multiplicative ergodic theorem. Israel
J. Math., 32:356–362, 1979.
[102] V. A. Rokhlin. On the fundamental ideas of measure theory. A. M. S. Transl.,
10:1–54, 1962. Transl. from Mat. Sbornik 25 (1949), 107–150. First published
by the A. M. S. in 1952 as Translation Number 71.
[103] G. Royer. Croissance exponentielle de produits markoviens de matrices. Ann.
Inst. H. Poincaré, 16:49–62, 1980.
[104] D. Ruelle. Analyticity properties of the characteristic exponents of random ma-
trix products. Adv. in Math., 32:68–80, 1979.
[105] D. Ruelle. Ergodic theory of differentiable dynamical systems. Inst. Hautes
Études Sci. Publ. Math., 50:27–58, 1979.
[106] D. Ruelle. Characteristic exponents and invariant manifolds in Hilbert space.
Annals of Math., 115:243–290, 1982.
[107] M. Shub. Global Stability of Dynamical Systems. Springer-Verlag, 1987.
[108] B. Simon and M. Taylor. Harmonic analysis on SL(2, R) and smoothness of the
density of states in the one-dimensional Anderson model. Comm. Math. Phys.,
101:1–19, 1985.
[109] S. Smale. Differentiable dynamical systems. Bull. Am. Math. Soc., 73:747–817,
1967.
[110] A. Tahzibi. Stably ergodic diffeomorphisms which are not partially hyperbolic.
Israel J. Math., 142:315–344, 2004.
[111] Ph. Thieullen. Ergodic reduction of random products of two-by-two matrices. J.
Anal. Math., 73:19–64, 1997.
[112] M. Viana. Lyapunov exponents and strange attractors. In J.-P. Françoise, G. L.
Naber, and S. T. Tsou, editors, Encyclopedia of Mathematical Physics. Elsevier,
2006.
[113] M. Viana. Almost all cocycles over any hyperbolic system have nonvanishing
Lyapunov exponents. Ann. of Math., 167:643–680, 2008.
[114] M. Viana and K. Oliveira. Fundamentos da Teoria Ergódica. Coleção Fronteiras
da Matemática. Sociedade Brasileira de Matemática, 2014.
[115] M. Viana and J. Yang. Physical measures and absolute continuity for one-
dimensional center directions. Ann. Inst. H. Poincaré Anal. Non Linéaire,
30:845–877, 2013.
[116] A. Virtser. On products of random matrices and operators. Th. Prob. Appl.,
34:367–377, 1979.
[117] P. Walters. A dynamical proof of the multiplicative ergodic theorem. Trans.
Amer. Math. Soc., 335:245–257, 1993.

References 197
[118] L.-S. Young. Some open sets of nonuniformly hyperbolic cocycles. Ergod.
Theory Dynam. Sys., 13(2):409–415, 1993.
[119] L.-S. Young. Ergodic theory of differentiable dynamical systems. In Real and
Complex Dynamical Systems, volume NATO ASI Series, C-464, pages 293–336.
Kluwer Acad. Publ., 1995.
[120] A. C. Zaanen. Integration. North-Holland Publishing Co., 1967. Completely
revised edition of An Introduction to the Theory of Integration.

Index
∗ GL(d), GL(d, C)
convolution, 174, 179 linear group, 6
Bt PRd , PCd
adjoint linear operator, 107 projective space, real and complex, 17
C(V, a) SL(d), SL(d, C)
cone of width a around V , 61 special linear group, 6, 7
C0 (P) Td
space of continuous functions, 44 d-dimensional torus, 17
Df B(Y )
derivative of a smooth map, 8 Borel σ -algebra, 41
Es, Eu G (d)
stable and unstable bundles, 10, 57 space of compactly supported measures on
Eδ (ξ ), eδ (η ) GL(d), 171
δ -energy of a measure, 178, 179 K (Y )
Ev , E x space of compact subsets, 41
slices of a set in a product space, 67, 91 M (μ )
G(d), G(k, d) space of measures projecting down to μ , 44
Grassmannian, 16, 38 P(d)
L1 ( μ ) space of probability measures in PRd , 174
space of integrable functions, 28 ecc(B)
Rθ eccentricity of a linear map, 133
rotation, 3 2
Vx⊥ space of square integrable sequences, 9
orthogonal complement, 48 λ (x, v)
W s (x), W u (x) Lyapunov exponent function, 40
(global) stable and unstable sets, 187 λ+ , λ−
s (x), W u (x)
Wloc extremal Lyapunov exponents, 1, 6
loc
local stable and unstable sets, 187 PF
Δm projective cocycle, 68
open simplex, 3 projw u
Δr orthogonal projection, 175
r-neighborhood of the diagonal, 180 α
Λl E generic element of GL(d)N , 172
exterior l-power, 57 ϕ +, ϕ −
Λld E positive and negative parts of a function, 20
decomposable l-vectors, 57 p ∗ η , p1 ∗ p2
198

http://ebooks.cambridge.org/null/ebook.jsf?bid=CBO9781139976602
Index 199
convolution, 174, 179 probability space, 38

sn (x) conditional
most contracted vector, 31 expectation, 86
un (x) probabilities, 85
most expanded vector, 31 cone, 61, 65
δ -energy of a measure, 178, 179 conformal barycenter, 130
ε -concentrated measure, 144 conorm, 20
C0 topology, 153 continuity theorem, 3, 171
C1 topology, 161 contraction property, 189
l-vector, 57 convergence
decomposable, 57 pointwise, 74
L∞ cocycle, 168 weak∗ , 74
L p cocycle, 168 convolution, 174, 179, 189
p-expanding point, 175 diagonal, 183
p-invariant point, 175 coupling, 178
s-state, 81, 83, 86, 94, 149 cross-ratio, 61
su-state, 81 cylinder, 82, 87, 89, 170
u-state, 81, 83, 86, 94, 149 decomposable l-vector, 57
additive sequence, 21 derivative cocycle, 8, 161
adjoint determinant, 58
cocycle, 108 diagonal
monoid, 148 convolution, 183
operator, 69, 107 push-forward, 183
Anderson localization, 18 dimension of a vector bundle, 8
angle between subspaces, 39 disintegration, 85, 87
Anosov diffeomorphism, 17, 161 dominated decomposition, 65, 162, 163
aperiodic measure, 153, 155, 158 duality, 69
backward stationary measure, 78, 101 eccentricity of a linear map, 133
Baire space, 152 elliptic matrix, 4
barycenter, 130 energy of a measure, 178, 179
Bernoulli shift, 7 ergodic
decomposition, 95
characteristic exponents, xi
decomposition theorem, 77
coboundary set, 158, 159, 170
measure, 159, 167
cocycle
stationary measure, 76
L∞ , 168
theorem of Birkhoff, 27
L p , 168
essentially
continuous, 154, 168
invariant function, 21, 36
derivative, 161
invariant set, 21
fiber-bunched, 187
Grassmannian, 137 expectation, 86
Hölder continuous, 165, 187 exponents representation theorem, 97
inverse, 146 exterior
invertible, 34, 39 power, 57, 151
irreducible, 102 product, 57, 137
locally constant, 7 extremal
projective, 68, 92, 137 Lyapunov exponents, 1
strongly irreducible, 102, 103 measure, 92
cohomological condition, 105 fiber-bunched cocycle, 187
complete fibered
flag, 38 entropy, 117

200 Index
Jacobian, 117 cocycle, 146

first return monoid, 139
map, 59, 158 invertible cocycle, 34, 39
time, 59 invertible extension, 99, 101
flag, 38 irreducibility, 102
complete, 38 strong, 102, 134
of Oseledets, 39, 41 irreducible cocycle, 102, 103
formula strongly, 102
of Furstenberg, 102 Lebesgue space, 158
of Herman, 29 lift
forward stationary measure, 78, 101 of a stationary measure, 81, 88, 89
Furstenberg’s of an invariant measure, 79–81
criterion, 2, 105, 124 linear
formula, 102 arrangement, 137, 139, 143
geometric proper, 139
hyperplane, 137 cocycle, 6
subspace, 137 hyperbolic, 10, 15
graph transform, 53 invertible extension, 101
Grassmannian, 16, 38 on a vector bundle, 8
cocycle, 137 derivative cocycle, 8
manifold, 16, 38, 57 group, 6
special, 6, 7
Hölder invariance principle, 116
continuity, 164 Schrödinger cocycle, 9
norm, 164 section, 137
Hamiltonian operator, 18 local
Hausdorff distance, 190 stable set, 187
Herman’s formula, 29 unstable set, 187
Hilbert metric, 61 local product structure, 91
holonomy locally constant
stable, 188 cocycle, 7, 96
unstable, 188 skew product, 67
homogeneous measure, 143 Lyapunov
hyperbolic exponents, 39, 40, 53, 60
cocycle, 17, 153 extremal, 1
linear cocycle, 10, 15 spectrum, 39, 133
matrix, 4
martingale, 86, 88
hyperbolicity
convergence theorem, 86
non-uniform, 19
mass of a measure, 178
uniform, 10, 15
matrix
hyperplane, 137
elliptic, 4
section, 134, 137
hyperbolic, 4
inducing, 59, 166 parabolic, 4
interchanging subspaces, 155, 160, 166, 167 meager subset, 152
invariance principle, 115, 116 measurable
invariant section, 99
function, 21 sub-bundle, 64, 98
holonomies, 187 measure
measure, 70, 79, 97 backward stationary, 78
set, 21 ergodic, 76
inverse extremal, 92

Index 201
forward stationary, 78 Radon–Nikodym derivative, 117

invariant, 70 random
non-atomic, 102, 106 matrices, 7, 68
stationary, 70 Schrödinger cocycle, 9
monoid, 2, 134 transformation, 67, 97
most contracted transformation (invertible), 77
subspace, 134, 140 transformation (one-sided), 70, 76
vector, 11, 19, 31 residual subset, 152, 153, 168
most expanded rotation, 3
subspace, 134, 140 s-holonomy, 188
vector, 11, 19, 31 Schrödinger
multiplicative ergodic theorem cocycle, 9
one-sided, 30, 38 quasi-periodic, 9
two-sided, 34, 39 random, 9
multiplicity of a Lyapunov exponent, 39, 133 operator, 9
natural extension, 65, 99 second return map, 154, 158, 167
negative part of a function, 20 self-coupling, 179
non-atomic measure, 102, 106 optimal, 179
non-compactness condition, 105, 132, 134 symmetric, 179
normalized restriction, 59, 113 semi-continuity, 151, 152, 160
open simplex, 3 separable
orbital sum, 21 metric space, 41
orthogonal complement, 48 probability space, 38
Oseledets sequence
decomposition, 34, 39, 60, 97 additive, 21
flag, 39, 41, 53, 97 subadditive, 21
sub-bundles, 39 super-additive, 45
subspaces, 39, 40 sequential compactness, 83
simple Lyapunov spectrum, 39, 133
parabolic matrix, 4 simplicity criterion, 134
parallelizable manifold, 8 singular
pinching, 2, 113, 134, 139, 148 measures, 117
point values, 133
p-expanding, 175 slices of a set in a product space, 68
p-invariant, 175 small subset, 126
pointwise special linear group, 6, 7
convergence, 74 stabilizer of a measure, 108, 131
topology, 72 stable
positive part of a function, 20 bundle, 10, 57, 135
potential, 9 holonomy, 188
probabilistic repeller, 174 set, 187
probability space stationary
complete, 38 function, 75
separable, 38 measure, 70, 101, 115, 171
projection theorem, 41 set, 75
projective strong irreducibility, 102, 103, 105, 134
action, 106 subadditive
cocycle, 68, 92, 137 ergodic theorem, 20, 21
space, 17 sequence, 21
proper linear arrangement, 139 subexponential decay, 31, 34, 35, 39
quasi-periodic Schrödinger cocycle, 9 subharmonic function, 29, 30

202 Index
superadditive sequence, 45 unstable

support of a measure, 105 bundle, 10, 57, 135
symmetric self-coupling, 179 holonomy, 188
optimal, 179 set, 187
symplectic vector
diffeomorphism, 163 bundle, 8
form, 163 dimension, 8
theorem most contracted, 31
invariance principle, 116 most expanded, 31
martingale convergence, 86, 87 volume
of Baire, 169 of a decomposable l-vector, 58
of Birkhoff, 27 weak∗
of continuity, 3, 171 convergence, 74, 83
of ergodic decomposition, 77 topology, 72
of Furstenberg, 2, 105
of Furstenberg (formula), 102
of Furstenberg-Kesten, 1, 20, 28
of Herman (formula), 29
of Kingman, 21
of Kingman (subadditive), 20
of Ledrappier, 97, 116
of Mañé–Bochi, 153
of Mañé-Bochi, 161
of Oseledets (one-sided), 30, 38
of Oseledets (two-sided), 34, 39
of Prohorov, 83
of Rokhlin, 85
of simplicity, 134
of Thieullen, 106
projection, 41
subadditive, 21
topological degree, 17
topology
C0 , 153
C1 , 161
pointwise, 72
uniform, 72
weak∗ , 72
total variation norm, 72
transition
operator, 68
operator (adjoint), 69
probabilities, 68
transitive action, 162
twisting, 2, 113, 134, 139, 148
u-holonomy, 188
uniform
convergence, 83
hyperbolicity, 10, 15
integrability, 103
topology, 72

v
To Tania, Miguel and Anita,

for their understanding.


Viana-Lectures On Lyapunov Exponents-Cambridge (2014)

Caricato da

Informazioni sul documento

Titolo originale

Copyright

Formati disponibili

Condividi questo documento

Condividi o incorpora il documento

Opzioni di condivisione

Hai trovato utile questo documento?

Questo contenuto è inappropriato?

Copyright:

Formati disponibili

Viana-Lectures On Lyapunov Exponents-Cambridge (2014)

Caricato da

Copyright:

Formati disponibili

Lectures on Lyapunov Exponents

Cambridge University Press is part of the University of Cambridge.

Library of Congress Cataloging-in-Publication data

3.4.1 One-sided theorem 30

6.2.2 Continuity of exponents for irreducible cocycles 103

1. The study of characteristic exponents originated from the fundamental work

2. The work of Furstenberg, Kesten, Oseledets, Kingman, Ledrappier, Guiv-

1960s–80s, built the study of Lyapunov characteristic exponents into a very

3. A diffeomorphism f : M → M is called partially hyperbolic if there exists

of the tangent bundle such that E s is uniformly contracted and E u is uniformly

Downloaded from University Publishing Online. This is copyrighted material

5. In this monograph I have sought to cover the fundamental aspects of the

Downloaded from University Publishing Online. This is copyrighted material

I thank David Tranah, of Cambridge University Press, for his interest in

Rio de Janeiro, March, 2014

Downloaded from University Publishing Online. This is copyrighted material

This chapter is a kind of overture. Simplified statements of three theorems

1.1 Existence of Lyapunov exponents

with full probability.

Downloaded from University Publishing Online. This is copyrighted material

when all matrices Ai , 1 ≤ i ≤ m have determinant ±1.

1.2 Pinching and twisting

Figure 1.1 Eccentricity and pinching

We say that the monoid B is twisting if given any vector lines F, G1 , . . . ,

The following result is a variation of a theorem of Furstenberg [54]:

Theorem 1.2 Assume B is pinching and twisting. Then λ− < λ+ . In partic-

Downloaded from University Publishing Online. This is copyrighted material

1.3 Continuity of Lyapunov exponents

Downloaded from University Publishing Online. This is copyrighted material

Exercise 1.1 Show that, in dimension d = 2, if | det Ai | = 1 for all 1 ≤ i ≤ m

Exercise 1.2 Calculate the extremal Lyapunov exponents for m = 2 and p1 ,

Downloaded from University Publishing Online. This is copyrighted material

Thus, the hypothesis p1 > 0, . . . , pm > 0 cannot be removed in Theorem 1.3.

Downloaded from University Publishing Online. This is copyrighted material

F : M × Rd → M × Rd , (x, v) → ( f (x), A(x)v). (2.1)

Observe that F n (x, v) = ( f n (x), An (x)) for every n ≥ 1, where

An (x) = A( f n−1 (x)) · · · A( f (x))A(x).

If f is invertible then so is F. Moreover, F −n (x, v) = ( f −n (x), A−n (x)) for all

A−n (x) = A( f −n (x))−1 · · · A( f −1 (x))−1 = An ( f −n (x))−1 .

The Furstenberg–Kesten theorem (Theorem 1.1) extends to this setting, as

exist at μ -almost every point x. This fact will be proven in Chapter 3.

Downloaded from University Publishing Online. This is copyrighted material

d is also a real cocycle in dimension 2d and, conversely, every d-dimensional

2.1.1 Products of random matrices

be the shift map on X. Consider the function

and let F : M × Rd → M × Rd be the linear cocycle defined by A over f . Note

for every i ≤ j and any measurable sets E1 , . . . , E j ⊂ X. It is clear that μ is

Downloaded from University Publishing Online. This is copyrighted material

defined by B over g. Note that G is semi-conjugate to a cocycle F as in the

2.1.2 Derivative cocycles

(i) {Uα } is an open cover of M;

The integer d ≥ 1 is the dimension of the vector bundle. A linear cocycle on V

Downloaded from University Publishing Online. This is copyrighted material

2.1.3 Schrödinger cocycles

H : 2 → 2 , u = (un )n∈Z → H(u) = (un+1 + un−1 +Vn un )n∈Z . (2.3)

In the most interesting models the sequence Vn is generated from a dynamical

(1) Random Schrödinger cocycles: Let M = X Z be a shift space, f : M → M

A main objective is to understand the spectral theory of these operators (a

H(u) = Eu, for E ∈ R. (2.4)

While, by definition, the eigenvectors of H are the solutions of this equation in

un+1 + un−1 +V ( f n (x))un = Eun

or, still equivalently,

Downloaded from University Publishing Online. This is copyrighted material