
GRADUATE STUDIES IN MATHEMATICS 201
Geometric Relativity
Dan A. Lee
EDITORIAL COMMITTEE
Daniel S. Freed (Chair)
Bjorn Poonen
Gigliola Staffilani
Jeff A. Viaclovsky
2010 Mathematics Subject Classification. Primary 53-01, 53C20, 53C21, 53C24, 53C27,
53C44, 53C50, 53C80, 83C05, 83C57.
For additional information and updates on this book, visit

www.ams.org/bookpages/gsm-201
Library of Congress Cataloging-in-Publication Data

Names: Lee, Dan A., 1978- author.
Title: Geometric relativity / Dan A. Lee.
Description: Providence, Rhode Island : American Mathematical Society, [2019] | Series: Gradu-
ate studies in mathematics ; volume 201 | Includes bibliographical references and index.
Identifiers: LCCN 2019019111 | ISBN 9781470450816 (alk. paper)
Subjects: LCSH: General relativity (Physics)–Mathematics. | Geometry, Riemannian. | Differ-
ential equations, Partial. | AMS: Differential geometry – Instructional exposition (textbooks,
tutorial papers, etc.). msc | Differential geometry – Global differential geometry – Global
Riemannian geometry, including pinching. msc | Differential geometry – Global differential
geometry – Methods of Riemannian geometry, including PDE methods; curvature restrictions.
msc | Differential geometry – Global differential geometry – Rigidity results. msc | Differential
geometry – Global differential geometry – Spin and Spin^c geometry. msc | Differential geometry – Global
differential geometry – Geometric evolution equations (mean curvature flow, Ricci flow, etc.).
msc | Differential geometry – Global differential geometry – Lorentz manifolds, manifolds with
indefinite metrics. msc | Differential geometry – Global differential geometry – Applications to
physics. msc | Relativity and gravitational theory – General relativity – Einstein’s equations
(general structure, canonical formalism, Cauchy problems). msc | Relativity and gravitational
theory – General relativity – Black holes. msc
Classification: LCC QC173.6 .L44 2019 | DDC 530.1101/516373–dc23
LC record available at https://lccn.loc.gov/2019019111
Copying and reprinting. Individual readers of this publication, and nonprofit libraries acting
for them, are permitted to make fair use of the material, such as to copy select pages for use
in teaching or research. Permission is granted to quote brief passages from this publication in
reviews, provided the customary acknowledgment of the source is given.
Republication, systematic copying, or multiple reproduction of any material in this publication
is permitted only under license from the American Mathematical Society. Requests for permission
to reuse portions of AMS publication content are handled by the Copyright Clearance Center. For
more information, please visit www.ams.org/publications/pubpermissions.
Send requests for translation rights and licensed reprints to reprint-permission@ams.org.
© 2019 by the author. All rights reserved.
Printed in the United States of America.

∞ The paper used in this book is acid-free and falls within the guidelines
established to ensure permanence and durability.
Visit the AMS home page at https://www.ams.org/
10 9 8 7 6 5 4 3 2 1 24 23 22 21 20 19
For my parents, Rupert and Gloria Lee
Contents
Preface
Part 1. Riemannian geometry
Chapter 1. Scalar curvature
§1.1. Notation and review of Riemannian geometry
§1.2. A survey of scalar curvature results
Chapter 2. Minimal hypersurfaces
§2.1. Basic definitions and the Gauss-Codazzi equations
§2.2. First and second variation of volume
§2.3. Minimizing hypersurfaces and positive scalar curvature
§2.4. More scalar curvature rigidity theorems
Chapter 3. The Riemannian positive mass theorem
§3.1. Background
§3.2. Special cases of the positive mass theorem
§3.3. Reduction to Theorem 1.30
§3.4. A few words on Ricci flow
Chapter 4. The Riemannian Penrose inequality
§4.1. Riemannian apparent horizons
§4.2. Inverse mean curvature flow
§4.3. Bray’s conformal flow
Chapter 5. Spin geometry
§5.1. Background
§5.2. The Dirac operator
§5.3. Witten’s proof of the positive mass theorem
§5.4. Related results
Chapter 6. Quasi-local mass
§6.1. Bartnik mass and static metrics
§6.2. Bartnik minimizers
§6.3. Brown-York mass
§6.4. Bartnik data with η = 0
Part 2. Initial data sets
Chapter 7. Introduction to general relativity
§7.1. Spacetime geometry
§7.2. The Einstein field equations
§7.3. The Einstein constraint equations
§7.4. Black holes and Penrose incompleteness
§7.5. Marginally outer trapped surfaces
§7.6. The Penrose inequality
Chapter 8. The spacetime positive mass theorem
§8.1. Proof for n < 8
§8.2. Spacetime positive mass rigidity
§8.3. Proof for spin manifolds
Chapter 9. Density theorems for the constraint equations
§9.1. The constraint operator
§9.2. The density theorem for vacuum constraints
§9.3. The density theorem for DEC (Theorem 8.3)
Appendix A. Some facts about second-order linear elliptic operators
§A.1. Basics
§A.2. Weighted spaces on asymptotically flat manifolds
§A.3. Inverse function theorem and Lagrange multipliers
Bibliography
Index
Preface
The mathematical study of general relativity is a large and active field.

This book is an attempt to introduce students to just one part of this field.
Specifically, as the title suggests, this book deals primarily with problems
in general relativity that are essentially geometric in character, meaning
that they can be attacked using the methods of Riemannian geometry and
partial differential equations. However, since there are still so many topics
that match this description, we have chosen to further narrow the focus of
this book to the following concept. This book is primarily about the positive
mass theorem and the various ideas that surround it and have grown from
it. It is about understanding the interplay between mass, scalar curvature,
minimal surfaces, and related concepts.
Many geometric problems in general relativity specialize to problems
in pure Riemannian geometry. The most famous of these is the positive
mass theorem, first proved by Richard Schoen and Shing-Tung Yau in 1979
[SY79c, SY81a], and later by Edward Witten using an unrelated method
[Wit81]. Around two decades later, Gerhard Huisken and Tom Ilma-
nen proved a generalization of the positive mass theorem called the Pen-
rose inequality [HI01], which was later proved using a different approach
by Hubert Bray [Bra01]. The goal of this book is to explain the back-
ground context and proofs of all of these theorems, while introducing var-
ious related concepts along the way. Unfortunately, there are many topics
and results that would fit together nicely with the material in this book,
and an argument could certainly be made that they belong in this book,
but for one reason or another, we had to leave them out. At the top of
the wish list for topics we would have liked to include are: a thorough
discussion of the Jang equation as in [SY81b, Eic13, Eic09, AM09], a
complete proof of the rigidity of the spacetime positive mass theorem as

in [BC96, HL17] (see Section 8.2), compactly supported scalar curvature
deformations as in [Cor00, CS06, Cor17] (see Theorems 3.51 and 6.14),
and a tour of constant mean curvature foliations and their relationship to
center of mass [HY96, QT07, Hua09, EM13].
The main prerequisite for this book is a working understanding of
Riemannian geometry (from books such as [Cha06, dC92, Jos11, Lee97,
Pet16, Spi79]) and basic knowledge of elliptic linear partial differential
equations, especially Sobolev spaces (various parts of [Eva10,GT01,Jos13]).
Certain facts from partial differential equations are recalled in the Appendix,
with special attention given to the topics which are the least “standard”—
most notably the theory of weighted spaces on asymptotically flat manifolds.
A modest amount of knowledge of algebraic topology is assumed (at the level
of a typical one-year graduate course such as [Hat02, Bre97]) and will typ-
ically only be used on a superficial level. No knowledge of physics at all is
required. In fact, the book has been structured in such a way that Part 1
contains almost no physics. Although the Riemannian positive mass theo-
rem was originally motivated by physical considerations, it is the author’s
conviction that it eventually would have been discovered for purely mathe-
matical reasons. Part 2 includes a short crash course in general relativity,
but again, only the most shallow understanding of physics is involved.
Despite the level of prerequisites, this book is still, unfortunately, not
self-contained. We will typically skip arguments that rely on a large body
of specialized knowledge (e.g., geometric measure theory). More generally,
there are many places in the book where we only give sketches of proofs.
This is sometimes because the results draw upon a wide variety of facts in
geometric analysis, and it is not realistic to include all relevant background
material. In other cases, it is because our goal is less to give a complete
proof than to give the reader a guide for how to understand those proofs.
For example, we avoid the most technical details in the two proofs of the
Penrose inequality in Chapter 4, partly because the author has little to offer
in terms of improved exposition of those details. The interested reader can
and should consult the original papers [HI01, Bra01, BL09]. Since this
book is intended to be an introduction to a field of active research, we are
not shy about presenting statements of some theorems without any proof at
all. We hope that this will help the reader to understand the current state
of what is known and offer directions for further study and research.
In order to simplify the discussion, most definitions and theorems will
be stated for manifolds, metrics, functions, vector fields, etc., which are
smooth. Except where explicitly stated otherwise, the reader should assume
that everything is smooth. (Despite this, because of the use of elliptic theory,
we will of course still need to use Sobolev spaces for our proofs.) The reason
for this is to prevent having to discuss what the optimal regularity is for the
hypotheses of each theorem. The reader will have to refer to the research
literature if interested in more precise statements.
When we refer to concepts or ideas that are especially common or well
known, instead of citing a textbook, we will sometimes cite Wikipedia. The
reasoning is that in today’s world, although Wikipedia is rarely the best
source, it is often the fastest source. Here, the reader can get a quick intro-
duction (or refresher) on the concept and then seek a more traditional math-
ematical text as desired. These citations will be marked with the name of
the relevant article. For example, the citation [Wik, Riemannian geometry]
means that the reader should visit
http://en.wikipedia.org/wiki/Riemannian_geometry.
There are many exercises sprinkled throughout the text. Some of them
are routine computations of facts and formulas that are used heavily through-
out the text. Others serve as simple “reality checks” to make sure the reader
understands statements of definitions or theorems on a basic level. Finally,
there are some exercises (and “check this” statements) that ask the reader
to fill in the details of some proof—these are meant to mimic the sort of
routine computations that tend to come up in research.
The motivation for writing this book came from the fact that, to the
author’s knowledge, there is no graduate-level text that gives a full account
of the positive mass theorem and related theorems. This presents an un-
necessarily high barrier to entry into the field, despite the fact that the core
material in this book is now quite well understood by the research com-
munity. A fair amount of the material in Part 1 was presented as a series
of lectures during the Fall of 2015 as part of the General Relativity and
Geometric Analysis seminar at Columbia University.
I would like to thank Hubert Bray, who is the person most responsible for
shepherding me into this field of research. He taught me much of what I know
about the subject matter of this book and strongly shaped my intuition and
perspective. He also encouraged me to write this book and came up with the
title. I thank Richard Schoen, my doctoral advisor, for teaching me about
geometric analysis and supporting my research in geometric relativity. I have
also learned a great deal about this subject from him through many private
conversations, unpublished lecture notes, and talks I have attended over the
years. Similarly, I thank my other collaborators in the field, who have taught
me so much throughout my career: André Neves, Jeffrey Jauregui, Christina
Sormani, Michael Eichmair, Philippe LeFloch, and especially Lan-Hsuan
Huang, who kindly discussed certain technical issues related to this book.
I also thank Mu-Tao Wang for inviting me to give lectures at Columbia

on the positive mass theorem at the very beginning of this project, and
Greg Galloway for explaining to me various things that made their way
into the introduction to general relativity in Part 2. Indeed, the exposition
there owes a great deal to his excellent lecture notes [Gal14]. I thank
Pengzi Miao for some helpful conversations while writing this book, as well
as the anonymous reviewers who offered constructive feedback on an earlier
draft. As an undergraduate, I wrote my senior thesis on Witten’s proof of
the positive mass theorem under the direction of Peter Kronheimer, and in
some sense this book might be thought of as the culmination of that project,
which began nearly two decades ago.
Part 1
Riemannian geometry
Chapter 1
Scalar curvature
1.1. Notation and review of Riemannian geometry

1.1.1. Riemannian metrics and local frames. We begin with some
material that appears in most textbooks on differential geometry and Rie-
mannian geometry, in order to settle our notation and terminology. Up until
the discussion of curvature, this material should be thought of as a refresher
rather than a self-contained introduction to Riemannian geometry.
Let $M^n$ be a smooth $n$-manifold. We will always assume that our given
manifolds are connected unless explicitly stated otherwise (or unless we
construct objects which are not obviously connected). We will use $C^\infty(M)$ to
denote the space of smooth functions on $M$, and for any vector bundle $V$
on $M$, we let $C^\infty(V)$ denote the space of smooth sections of $V$. We use $TM$
and $T^*M$ to denote the tangent bundle and cotangent bundle, respectively.
For any vector bundles $V$ and $W$ on $M$, we can form their tensor product
$V \otimes W$, antisymmetric products $\bigwedge^k V$, and symmetric products $\bigodot^k V$.
For example, a vector field is an element of $C^\infty(TM)$ while a $k$-form is an
element of $C^\infty(\bigwedge^k T^*M)$.
A Riemannian metric $g$ on $M$ is an element of $C^\infty(T^*M \odot T^*M)$, or in
other words, a symmetric $(0,2)$-tensor, that is positive definite at each point.
Equivalently, it defines an inner product on the tangent space of each point,
varying smoothly from point to point. Given $p \in M$ and two tangent vectors
$v, w \in T_pM$, we typically denote this inner product by $\langle v, w\rangle_g$ or simply
$\langle v, w\rangle$ if the meaning is clear, and we define the norm of a tangent vector by
$|v|^2 := \langle v, v\rangle$. The metric sets up an isomorphism between $TM$ and $T^*M$,
sometimes called the musical isomorphism, extending the linear algebra fact
that an inner product sets up an isomorphism between a vector space and
its dual space. Explicitly, the musical isomorphism $\flat : T_pM \to T_p^*M$ is
the map $v \mapsto \langle v, \cdot\rangle$ for any $v \in T_pM$. In particular, we naturally obtain an
inner product on $T^*M$ as well (which we abusively also call $g$ and denote
by $\langle\cdot,\cdot\rangle$).
A local frame is a choice of smooth vector fields $v_1, \ldots, v_n$ defined on an
open set $U \subset M$ such that at each point $p \in U$, $v_1, \ldots, v_n$ forms a basis of the
tangent space $T_pM$. The most frequently used local frames are coordinate
frames and orthonormal frames. If $x^1, \ldots, x^n$ are smooth coordinates on
$U$, then $\frac{\partial}{\partial x^1}, \ldots, \frac{\partial}{\partial x^n}$ is a local frame which we will call a coordinate frame.
An orthonormal frame is just a local frame that happens to be orthonormal
with respect to the given metric $g$ at each point of $U$. Many computations
in Riemannian geometry can be carried out using either coordinate frames
or orthonormal frames, with the choice often being a matter of taste. In this
book, our default choice will be to use orthonormal frames.
Given any local frame $v_1, \ldots, v_n$ over $U$, there is a corresponding dual
local coframe of 1-forms $v^1, \ldots, v^n$, which forms a basis of the cotangent
space $T_p^*M$ at each point $p \in U$. This dual coframe is constructed using
the pointwise dual basis construction of linear algebra, so that $v^i(v_j) = \delta^i_j$,
where $\delta^i_j$ is the Kronecker delta (which is 1 if $i = j$, and 0 otherwise).
In particular, the dual coframe of $\frac{\partial}{\partial x^1}, \ldots, \frac{\partial}{\partial x^n}$ is $dx^1, \ldots, dx^n$. The dual
coframe of an orthonormal frame is again orthonormal. Given any vector
field $X$ defined over $U$, we can write it as $X = \sum_{i=1}^n X^i v_i$ for some uniquely
determined functions $X^i \in C^\infty(U)$. Similarly, for any 1-form $\omega$ over $U$, we
can write it as $\omega = \sum_{i=1}^n \omega_i v^i$ for some uniquely determined functions
$\omega_i \in C^\infty(U)$. In this text we will often use Einstein summation notation [Wik,
Einstein notation] for repeated indices, but we will not use it exclusively.
(In particular, we will often abandon it when working in an orthonormal
frame.) For example, we can write $X = X^i v_i$ and $\omega = \omega_i v^i$. Using this
local representation, we can think of the list of functions $X^i$ as the vector
field $X$. Indeed, a common abuse of notation is to refer to $X^i$ directly as a
vector field. In this language, the aforementioned musical isomorphism maps
$X^i$ to $(X^\flat)_j = g_{ij} X^i$ (now indexed by $j$). This is an example of “lowering
indices.”
Using the local frame, we can also express the metric as an explicit
symmetric matrix of functions $g_{ij}$ by defining
\[ g_{ij} := \langle v_i, v_j\rangle, \]
so that $g = g_{ij}\, v^i \otimes v^j$. If $X^i$ and $Y^i$ are vector fields, $\langle X, Y\rangle = g_{ij} X^i Y^j$.
Note that if we use an orthonormal frame, then $g_{ij}$ is just the Kronecker
delta $\delta_{ij}$.
The notation $g^{ij}$ indicates the matrix inverse of the matrix $g_{ij}$ so that
we have $g^{ik} g_{kj} = g_{jk} g^{ki} = \delta^i_j$. Note that this $g^{ij}$ also represents the inner
product on $T^*M$ described earlier, so that if $\omega_i$ and $\eta_i$ are 1-forms, then
$\langle\omega, \eta\rangle = g^{ij}\omega_i\eta_j$. Also, the inverse of $\flat$, denoted by $\sharp : T^*M \to TM$, maps
$\omega_i$ to the vector field $(\omega^\sharp)^j = g^{ij}\omega_i$ (now indexed by $j$). This is an example
of “raising indices.”
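To make raising and lowering indices concrete, here is a short worked example. It is only a sketch under our own assumptions: we take polar coordinates $(r, \varphi)$ on $\mathbb{R}^2 \setminus \{0\}$ with the Euclidean metric, and the component names $X^r, X^\varphi, \omega_r, \omega_\varphi$ are chosen for illustration.

```latex
% Euclidean metric in the coordinate frame (\partial_r, \partial_\varphi)
\[
  g = dr \otimes dr + r^2\, d\varphi \otimes d\varphi,
  \qquad
  (g_{ij}) = \begin{pmatrix} 1 & 0 \\ 0 & r^2 \end{pmatrix},
  \qquad
  (g^{ij}) = \begin{pmatrix} 1 & 0 \\ 0 & r^{-2} \end{pmatrix}.
\]
% Lowering an index: (X^\flat)_j = g_{ij} X^i
\[
  X = X^r\, \partial_r + X^\varphi\, \partial_\varphi
  \;\longmapsto\;
  X^\flat = X^r\, dr + r^2 X^\varphi\, d\varphi .
\]
% Raising an index: (\omega^\sharp)^j = g^{ij} \omega_i
\[
  \omega = \omega_r\, dr + \omega_\varphi\, d\varphi
  \;\longmapsto\;
  \omega^\sharp = \omega_r\, \partial_r + r^{-2}\, \omega_\varphi\, \partial_\varphi .
\]
```

Applying one map after the other returns the original components, as expected since $g^{ij}$ is the matrix inverse of $g_{ij}$.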
Recall that, given a smooth function $f$, we can define the 1-form $df$ via
the equation $df(X) = Xf$, where the right-hand side is the action of an
arbitrary vector field $X$ on $f$. Given a metric, we define the gradient of $f$
to be $\nabla f := (df)^\sharp$. Note that
\[ \nabla_X f := \langle\nabla f, X\rangle = df(X) = Xf = L_X f, \]
where the last term is the Lie derivative, so that we have many notations
for the same (simple) thing. Note that although $\nabla f$ depends on the choice
of metric, $\nabla_X f$ does not. If there are chosen coordinates $x^1, \ldots, x^n$, we will
sometimes use the notation $\partial f$ to mean the (locally defined) vector field
whose components are
\[ \partial_i f := \frac{\partial f}{\partial x^i} = f_i. \]
Indeed, we will sometimes write the vector fields $\frac{\partial}{\partial x^1}, \ldots, \frac{\partial}{\partial x^n}$ as $\partial_1, \ldots, \partial_n$
for the purpose of readability.
1.1.2. Volume. The Riemannian metric naturally gives rise to a volume
form as follows:
\[ \mathrm{dvol} := \sqrt{\det g}\; v^1 \wedge \cdots \wedge v^n, \tag{1.1} \]
where $\det g$ is the determinant of the matrix $g_{ij}$ described above. Note that
for an orthonormal coframe $e^1, \ldots, e^n$, we just have $\mathrm{dvol}_g = e^1 \wedge \cdots \wedge e^n$.
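As an illustration of formula (1.1), the following sketch evaluates the volume form in one familiar case. The setting is our assumption, not the text's: polar coordinates $(r, \varphi)$ on the punctured Euclidean plane, with $a > 0$ a hypothetical radius.

```latex
% Coordinate frame (\partial_r, \partial_\varphi), dual coframe (dr, d\varphi)
\[
  (g_{ij}) = \begin{pmatrix} 1 & 0 \\ 0 & r^2 \end{pmatrix},
  \qquad
  \det g = r^2,
  \qquad
  \mathrm{dvol} = \sqrt{\det g}\; dr \wedge d\varphi = r\, dr \wedge d\varphi .
\]
% Area of the punctured disk \{ 0 < r < a \}
\[
  \int_{\{0 < r < a\}} \mathrm{dvol}
  = \int_0^{2\pi}\!\!\int_0^a r \, dr \, d\varphi
  = \pi a^2 .
\]
```

The same form arises from the orthonormal coframe $e^1 = dr$, $e^2 = r\,d\varphi$, since $e^1 \wedge e^2 = r\, dr \wedge d\varphi$.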
Exercise 1.1. Prove that the right-hand side of equation (1.1) depends only
on $g$ and the orientation of the local frame, and not on the choice of local
frame $v_1, \ldots, v_n$.
The exercise means that a metric and a choice of orientation combine
to give a well-defined global volume form on a Riemannian manifold $(M, g)$.
This volume form can then be used to construct a volume measure on $M$
(which can also be called Riemannian measure or Lebesgue measure on $M$).
Specifically, the measurable sets in each coordinate patch are precisely those
that correspond to Lebesgue measurable sets in $\mathbb{R}^n$ via the coordinate chart,
and then the measurable sets in $M$ are countable unions of those sets. Then
for any measurable set $U$ in $M$, we can define its volume to be
\[ \mu(U) := |U| := \int_U \mathrm{dvol}. \]
Note that since a Riemannian metric induces a metric space structure on $M$,
one can also define $n$-dimensional Hausdorff measure and see that this measure
agrees with the Riemannian measure, though this takes some work to
show.
If $M$ is not orientable, even though there is no globally defined volume
form, one can still define a volume measure on $M$. This is intuitively
clear since the measure is local in nature, but formally this can be easily
defined by pushing forward the volume measure on the orientable double
cover and then dividing by 2. (See the related notion of density [Wik,
Density on a manifold].) When we integrate over $M$, we will denote the
volume measure by $d\mu$, or perhaps $d\mu_M$ or $d\mu_g$ if more clarity is desired.
We will be interested in understanding how volume changes under
deformation of the metric. For this we will want to compute the linearization
of the volume form. If $M$ is orientable, then the linearization of $\mathrm{dvol}$ at $g$ is
defined to be the linear operator
\[ D(\mathrm{dvol})|_g : C^\infty(T^*M \odot T^*M) \to C^\infty\big(\textstyle\bigwedge^n T^*M\big) \]
such that for any $\dot g \in C^\infty(T^*M \odot T^*M)$,
\[ D(\mathrm{dvol})|_g(\dot g) := \left.\frac{\partial}{\partial t}\right|_{t=0} \mathrm{dvol}_{g_t} \]
for any smooth family of Riemannian metrics $g_t$ on $M$ such that $g_0 = g$ and
$\dot g = \left.\frac{d}{dt}\right|_{t=0} g_t$.
In order to compute this, let $v_1, \ldots, v_n$ be a positively oriented local
frame. Then if we write $g_t$ in this frame, we have
\[ \mathrm{dvol}_{g_t} = \sqrt{\det g_t}\; v^1 \wedge \cdots \wedge v^n. \]
Note that $v^1 \wedge \cdots \wedge v^n$ is fixed and does not depend on $t$. In order to
differentiate this, we use the following linear algebra fact, sometimes called
Jacobi’s formula.
Exercise 1.2. Let $A : (a, b) \to GL_m(\mathbb{R})$ be a differentiable family of
invertible matrices. Show that for any $t \in (a, b)$,
\[ \frac{d}{dt}\det(A(t)) = \operatorname{tr}\big(A^{-1}(t)\,A'(t)\big)\det(A(t)). \]
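By the exercise above,
\[ \frac{\partial}{\partial t}\sqrt{\det g_t} = \frac{1}{2}\operatorname{tr}\!\left(g_t^{-1}\frac{\partial}{\partial t} g_t\right)\sqrt{\det g_t}. \]
Evaluating this at $t = 0$, we obtain
\[ D(\mathrm{dvol})|_g(\dot g) = \frac{1}{2}\operatorname{tr}\big(g^{-1}\dot g\big)\sqrt{\det g}\; v^1 \wedge \cdots \wedge v^n. \]
In the right-hand side of this formula, $g$ and $\dot g$ are thought of as matrices,
using the local frame. Using more geometric notation, we obtain the
following.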
Proposition 1.3. Given a metric $g$ on a manifold $M$, the linearization of
the volume form at $g$ is
\[ D(\mathrm{dvol})|_g(\dot g) = \frac{1}{2}(\operatorname{tr}_g \dot g)\,\mathrm{dvol}_g \]
for all $\dot g \in C^\infty(T^*M \odot T^*M)$. Since the computation is local, it also follows
that
\[ D(d\mu)|_g(\dot g) = \frac{1}{2}(\operatorname{tr}_g \dot g)\,d\mu_g, \]
where the left side is interpreted in the obvious way.
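Proposition 1.3 can be checked by hand on a conformal family of metrics. The family $g_t = e^{2tf} g$ below, with $f$ an arbitrary smooth function on an $n$-manifold, is our own choice of example and is offered as a sketch.

```latex
% Conformal deformation g_t = e^{2tf} g, so that g_0 = g
\[
  \dot g = \left.\frac{d}{dt}\right|_{t=0} e^{2tf} g = 2f\,g,
  \qquad
  \operatorname{tr}_g \dot g = g^{ij}\,(2f\,g_{ij}) = 2nf .
\]
% Prediction of Proposition 1.3
\[
  D(\mathrm{dvol})|_g(\dot g) = \tfrac{1}{2}\,(2nf)\,\mathrm{dvol}_g = nf\,\mathrm{dvol}_g .
\]
% Direct computation: \det g_t = e^{2ntf} \det g in any local frame
\[
  \mathrm{dvol}_{g_t} = e^{ntf}\,\mathrm{dvol}_g
  \quad\Longrightarrow\quad
  \left.\frac{\partial}{\partial t}\right|_{t=0} \mathrm{dvol}_{g_t} = nf\,\mathrm{dvol}_g .
\]
```

The two computations agree, as they should.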
1.1.3. Lie derivatives. The main difficulty involved in differentiating a
vector field $Y$ on $M$ is that the values of $Y$ at different points $p$ and $q$ lie in
different fibers $T_pM$ and $T_qM$, and thus there is no natural way to compare
them. Let $X$ be a vector field on $M$. We will now define the Lie derivative
$L_XY$, which is also a vector field on $M$. The Lie derivative deals with the
basic problem of comparing $T_pM$ to $T_qM$ by using $X$ itself to construct an
isomorphism between those spaces.
Given $p \in M$, we will describe how to define $L_XY$ at the point $p$. For
each $x$ near $p$, let $t \mapsto \Phi_t(x)$, $t \in [0, \epsilon)$, be the curve in $M$ solving the ODE
\[ \frac{\partial}{\partial t}\Phi_t(x) = X(\Phi_t(x)), \]
with initial condition $\Phi_0(x) = x$. By the existence and uniqueness theorem
for ODEs, there exists $\epsilon > 0$, independent of $x$ in some neighborhood of $p$,
such that a unique solution exists for all $t \in [0, \epsilon)$. In particular, $\Phi_t$ is a local
diffeomorphism near $p$, so that $d\Phi_t|_p : T_pM \to T_{\Phi_t(p)}M$ is an isomorphism.
If $X$ has compact support, then $\Phi_t$ will define a diffeomorphism for all $t \in \mathbb{R}$,
and in fact, for any $t, s \in \mathbb{R}$, $\Phi_{t+s} = \Phi_s \circ \Phi_t$. The maps $\Phi_t$ are called the
one-parameter family of diffeomorphisms of $M$ generated by $X$.
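In any case, we can define
\[ (L_XY)(p) := \left.\frac{\partial}{\partial t}\right|_{t=0} (\Phi_t^*Y)(p), \]
where $\Phi_t^*$ denotes the pullback by $(d\Phi_t|_p)^{-1}$, that is,
$(\Phi_t^*Y)(p) = (d\Phi_t|_p)^{-1}\,Y(\Phi_t(p)) \in T_pM$.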
The Lie derivative $L_XY$ is also denoted by the Lie bracket $[X, Y]$, and
in local coordinates $x^1, \ldots, x^n$, one can explicitly compute
\[ [X, Y]^i = X^j \frac{\partial}{\partial x^j} Y^i - Y^j \frac{\partial}{\partial x^j} X^i, \]
and from this it is clear that $[Y, X] = -[X, Y]$.
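A small worked instance of the coordinate formula for the bracket may be useful. The vector fields below, on $\mathbb{R}^2$ with coordinates $(x^1, x^2) = (x, y)$, are hypothetical choices made only for illustration.

```latex
% X = \partial_x has components (1, 0); Y = x \partial_y has components (0, x)
\[
  [X, Y]^1 = X^j \partial_j Y^1 - Y^j \partial_j X^1 = 0 - 0 = 0,
  \qquad
  [X, Y]^2 = X^j \partial_j Y^2 - Y^j \partial_j X^2 = \partial_x(x) - 0 = 1 .
\]
% Hence
\[
  [\partial_x,\; x\,\partial_y] = \partial_y,
  \qquad
  [x\,\partial_y,\; \partial_x] = -\partial_y .
\]
```

Note that although the coordinate fields $\partial_x$ and $\partial_y$ commute, rescaling one of them by a non-constant function produces a nonzero bracket.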
More generally, one can use the same idea to define the Lie derivative of
other tensor fields besides vector fields. (By tensor field, we mean a section
of a bundle that is a tensor product of some number of copies of $TM$ and
some number of copies of $T^*M$.) This is because a local diffeomorphism also
induces an isomorphism between fibers of the tensor bundle. One can show
that with this definition, the Lie derivative obeys appropriate Leibniz rules
with respect to products of tensors. For example, given vector fields $X, Y, Z$
and a 1-form $\omega$, we have facts such as $L_X(Y \otimes Z) = (L_XY) \otimes Z + Y \otimes (L_XZ)$
and $\nabla_X(\omega(Y)) = (L_X\omega)(Y) + \omega(L_XY)$.
We say that $X$ is a Killing field if the local diffeomorphisms $\Phi_t$ that it
generates are isometries, meaning that for all tangent vectors $v, w \in T_pM$
we have $\langle d\Phi_t|_p(v), d\Phi_t|_p(w)\rangle = \langle v, w\rangle$.
Exercise 1.4. Show that $X$ is a Killing field for the metric $g$ if and only if
\[ L_X g = 0, \]
where $g$ is thought of as a $(0, 2)$-tensor.
Observe that if $X$ is Killing, then if we choose local coordinates $x^1, \ldots, x^n$
such that $X = \partial_1$, then $g_{ij}$ written with respect to those local coordinates
will be independent of the variable $x^1$.
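The following sketch illustrates the definition of a Killing field on the Euclidean plane. The two vector fields, the polar coordinates $(r, \varphi)$, and the rotation matrix $R_t$ are our own choices for the example.

```latex
% Rotation field: X = -y \partial_x + x \partial_y = \partial_\varphi in polar coordinates.
% Its flow \Phi_t is rotation by angle t, with d\Phi_t = R_t orthogonal:
\[
  R_t = \begin{pmatrix} \cos t & -\sin t \\ \sin t & \cos t \end{pmatrix},
  \qquad
  \langle R_t v, R_t w\rangle = \langle v, w\rangle ,
\]
% so X is Killing. Consistently, in coordinates (r, \varphi) with X = \partial_\varphi,
\[
  (g_{ij}) = \begin{pmatrix} 1 & 0 \\ 0 & r^2 \end{pmatrix}
  \quad\text{does not depend on } \varphi .
\]
% Dilation field: Z = x \partial_x + y \partial_y has flow \Phi_t(p) = e^t p, so
\[
  \langle d\Phi_t(v), d\Phi_t(w)\rangle = e^{2t}\,\langle v, w\rangle \neq \langle v, w\rangle
  \quad (t \neq 0,\ \langle v, w\rangle \neq 0),
\]
% and Z is not Killing.
```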
One important aspect of the Lie derivative $L_XY$ is that it is not linear
over $C^\infty(M)$ in the $X$ input. In this sense, it does not behave as we would
like a “directional derivative” to behave; it differentiates the $X$ input as well
as the Y input. Another method of “comparing tangent spaces” comes from
the idea of parallel transport, which turns out to be equivalent to the notion
of connection. But unlike Lie differentiation, this requires making a choice
of connection.
1.1.4. Levi-Civita connection and divergence. Recall that a connection
$\nabla$ is a map from $C^\infty(TM) \times C^\infty(TM)$ to $C^\infty(TM)$ that is linear over
$C^\infty(M)$ in its first input but not in its second input, where it instead obeys
a Leibniz rule,
\[ \nabla_X(fY) = f\,\nabla_XY + (\nabla_Xf)\,Y, \]
where $X, Y$ are vector fields and $f$ is a function. A Riemannian metric $g$
leads to a natural choice of connection called the Levi-Civita connection. The
Levi-Civita connection is the unique connection which is both compatible
with $g$, meaning that $\nabla_X\langle Y, Z\rangle = \langle\nabla_XY, Z\rangle + \langle Y, \nabla_XZ\rangle$, and also torsion-free,
meaning that $\nabla_XY - \nabla_YX = [X, Y]$, where $X, Y, Z$ are arbitrary vector
fields, and $[X, Y]$ denotes the Lie bracket. If we choose local coordinates, the
connection $\nabla$ can be represented by functions $\Gamma^k_{ij}$ called Christoffel symbols,
defined by the formula
\[ \nabla_{\partial_i}\partial_j = \Gamma^k_{ij}\,\partial_k. \]
Using the properties of the Levi-Civita connection, one can derive
\[ \Gamma^k_{ij} = \frac{1}{2}\, g^{k\ell}\,(g_{i\ell,j} + g_{j\ell,i} - g_{ij,\ell}), \tag{1.2} \]
where the commas denote differentiation, that is, $g_{ij,\ell} = \frac{\partial}{\partial x^\ell} g_{ij}$. If one uses
a general local frame rather than a coordinate frame, we can still do this but
the formula for $\Gamma$ does not come out as nicely. (See Koszul’s formula [Wik,
Fundamental theorem of Riemannian geometry] to see what happens more
generally.)
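Formula (1.2) is easiest to absorb through a small computation. As a sketch, and assuming (our choice) polar coordinates $(r, \varphi)$ on the punctured Euclidean plane, the only non-constant metric component is $g_{\varphi\varphi} = r^2$:

```latex
% g_{rr} = 1, g_{r\varphi} = 0, g_{\varphi\varphi} = r^2; g^{rr} = 1, g^{\varphi\varphi} = r^{-2}
\[
  \Gamma^r_{\varphi\varphi}
  = \tfrac{1}{2}\, g^{rr}\,\big(g_{\varphi r,\varphi} + g_{\varphi r,\varphi} - g_{\varphi\varphi,r}\big)
  = \tfrac{1}{2}\,(-2r) = -r ,
\]
\[
  \Gamma^\varphi_{r\varphi} = \Gamma^\varphi_{\varphi r}
  = \tfrac{1}{2}\, g^{\varphi\varphi}\,\big(g_{r\varphi,\varphi} + g_{\varphi\varphi,r} - g_{r\varphi,\varphi}\big)
  = \tfrac{1}{2}\, r^{-2}\,(2r) = \frac{1}{r} ,
\]
% all other Christoffel symbols vanish. Equivalently,
\[
  \nabla_{\partial_\varphi}\partial_\varphi = -r\,\partial_r,
  \qquad
  \nabla_{\partial_r}\partial_\varphi = \nabla_{\partial_\varphi}\partial_r = \frac{1}{r}\,\partial_\varphi,
  \qquad
  \nabla_{\partial_r}\partial_r = 0 .
\]
```

The nonzero symbols reflect the curvilinear coordinates, not any curvature of the metric, which is flat.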
The case of orthonormal frames is also particularly important. Given an
orthonormal frame $e_1, \ldots, e_n$, there exist locally defined connection 1-forms
$\omega^i_j$ such that
\[ \nabla e_j = \omega^i_j\, e_i. \]
If $\theta^1, \ldots, \theta^n$ is the orthonormal dual coframe to $e_1, \ldots, e_n$, then $\omega^i_j$ can be
computed using the equation
\[ d\theta^i = -\omega^i_j \wedge \theta^j, \]
which is sometimes written as simply $d\theta = -\omega \wedge \theta$.
Exercise 1.5. Show that $\omega^i_j$ is antisymmetric in $i$ and $j$.
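To see how the structure equation determines the connection 1-forms in practice, here is a sketch for the flat plane. The orthonormal frame below, adapted to polar coordinates $(r, \varphi)$ on $\mathbb{R}^2 \setminus \{0\}$, is our own choice of example.

```latex
% Orthonormal frame and dual coframe for g = dr^2 + r^2 d\varphi^2
\[
  e_1 = \partial_r, \quad e_2 = \tfrac{1}{r}\,\partial_\varphi,
  \qquad
  \theta^1 = dr, \quad \theta^2 = r\, d\varphi,
  \qquad
  d\theta^1 = 0, \quad d\theta^2 = dr \wedge d\varphi .
\]
% Candidate connection forms, antisymmetric in the two indices
\[
  \omega^2_1 = d\varphi, \qquad \omega^1_2 = -\,d\varphi, \qquad \omega^1_1 = \omega^2_2 = 0 .
\]
% Check of d\theta^i = -\omega^i_j \wedge \theta^j
\[
  -\omega^1_2 \wedge \theta^2 = d\varphi \wedge r\, d\varphi = 0 = d\theta^1,
  \qquad
  -\omega^2_1 \wedge \theta^1 = -\,d\varphi \wedge dr = dr \wedge d\varphi = d\theta^2 .
\]
```

In particular $\nabla e_1 = \omega^2_1\, e_2 = d\varphi \otimes e_2$, so $e_1$ is constant along radial lines and turns toward $e_2$ as $\varphi$ increases.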
Much like the Lie derivative, one can extend the Levi-Civita connection
to more general tensor fields via appropriate Leibniz rules. As is standard,
given a tensor such as $T_{jk}$ in a local frame, the notation $\nabla_i T_{jk}$, or even
more briefly $T_{jk;i}$, is usually taken to mean the “$ijk$-component” of $\nabla T$ in
that frame. (Or more formally, the indices can just be used as abstract
placeholders [Wik, Abstract index notation].)
Exercise 1.6. Show that the metric $g$ is always constant with respect to
its own Levi-Civita connection, that is, $\nabla g = 0$. Use this to show that with
respect to a local frame, we have
\[ (L_Xg)_{ij} = \nabla_i X_j + \nabla_j X_i = X_{j;i} + X_{i;j} \]
for any vector field $X$.
Given the Levi-Civita connection and a local frame $v_1, \ldots, v_n$, we can
define the divergence of a vector field $X$ by
\[ \operatorname{div} X := \nabla_i X^i = X^i_{;i}, \]
where we might also write this as $\operatorname{div}_M$ or $\operatorname{div}_g$ for added clarity. One
can show that this definition is independent of choice of local frame. More
generally, one can also talk about the divergence of other tensor fields by
taking the covariant derivative and then taking an appropriate trace.
Theorem 1.7 (Divergence theorem). Let $(M, g)$ be a compact manifold,
possibly with boundary, and let $X$ be a smooth vector field on $M$. Then
\[ \int_M (\operatorname{div} X)\, d\mu_M = \int_{\partial M} \langle X, \nu\rangle\, d\mu_{\partial M}, \]
where $\nu$ is the outward-pointing unit normal of $\partial M$, which is equipped with
the induced metric.
It is a simple exercise to see that when M is orientable, this is equivalent

to Stokes’ Theorem for (n − 1)-forms on M , and the nonorientable case
follows from the orientable case by passing to the orientable double cover.
Observe that if $f$ is a function and $X$ is a vector field, then
\[ \operatorname{div}(fX) = \langle\nabla f, X\rangle + f\,(\operatorname{div} X). \]
Together with the divergence theorem, this tells us that the gradient $\nabla$ on
functions and $-\operatorname{div}$ on vector fields are formal adjoint operators of each
other. In particular, this motivates us to define the Laplacian of a smooth
function $f$ by
\[ \Delta f := \operatorname{div}(\nabla f). \]
Some authors only refer to the Laplacian on Euclidean space as “the Laplacian,”
and instead call this more general operator the Laplace-Beltrami operator.
We will call it the $g$-Laplacian and write it as $\Delta_g$ when we want to
emphasize the dependence on the metric, and we will refer to any function
$f$ solving $\Delta f = 0$ as a $g$-harmonic function. Be aware that there are some
differential geometers who choose to define the Laplacian with the opposite
sign. With our choice of sign, $-\Delta$ on a compact manifold has a nonnegative
spectrum unbounded from above. (See Theorem A.13.)
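The adjointness statement and the sign of the spectrum can be made explicit with a short integration by parts. This is a sketch under the simplifying assumption (ours) that $M$ is compact with empty boundary, and $h$ denotes an auxiliary smooth function introduced for the example.

```latex
% Integrate div(fX) = <\nabla f, X> + f div X over M; the divergence theorem
% gives zero on the left because \partial M is empty.
\[
  0 = \int_M \operatorname{div}(fX)\, d\mu
    = \int_M \langle\nabla f, X\rangle\, d\mu + \int_M f\,(\operatorname{div} X)\, d\mu
  \quad\Longrightarrow\quad
  \int_M \langle\nabla f, X\rangle\, d\mu = \int_M f\,(-\operatorname{div} X)\, d\mu .
\]
% Take X = \nabla h:
\[
  \int_M \langle\nabla f, \nabla h\rangle\, d\mu = -\int_M f\,\Delta h\, d\mu .
\]
% Take h = f:
\[
  \int_M f\,(-\Delta f)\, d\mu = \int_M |\nabla f|^2\, d\mu \;\ge\; 0 .
\]
```

The last line shows that any eigenvalue of $-\Delta$ with a smooth eigenfunction is nonnegative, in line with the sign convention chosen in the text.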
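Observe that there is a close relationship between divergence and volume.
Exercise 1.8. Show that for any smooth vector field $X$, we have
\[ (\operatorname{div} X)\, \mathrm{dvol}_g = L_X(\mathrm{dvol}_g). \]
Show that in local coordinates $x^1, \ldots, x^n$, we have
\[ \operatorname{div} X = \frac{1}{\sqrt{\det g}}\, \frac{\partial}{\partial x^i}\Big(\sqrt{\det g}\; X^i\Big), \]
and consequently, for any smooth function $f$,
\[ \Delta f = \frac{1}{\sqrt{\det g}}\, \frac{\partial}{\partial x^i}\Big(\sqrt{\det g}\; g^{ij}\, \frac{\partial}{\partial x^j} f\Big). \]
Given an orthonormal frame $e_1, \ldots, e_n$, we have
\[ \Delta f = \sum_{i=1}^n \nabla_{e_i}\nabla_{e_i} f. \]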
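As an illustration of the coordinate formulas for the divergence and the Laplacian, the sketch below specializes them to the Euclidean plane. The polar coordinates $(r, \varphi)$ are our assumption; there $\sqrt{\det g} = r$, $g^{rr} = 1$, and $g^{\varphi\varphi} = r^{-2}$.

```latex
% Divergence of X = X^r \partial_r + X^\varphi \partial_\varphi
\[
  \operatorname{div} X
  = \frac{1}{r}\Big(\partial_r\big(r X^r\big) + \partial_\varphi\big(r X^\varphi\big)\Big)
  = \partial_r X^r + \frac{1}{r}\, X^r + \partial_\varphi X^\varphi .
\]
% Laplacian of a smooth function f
\[
  \Delta f
  = \frac{1}{r}\Big(\partial_r\big(r\, \partial_r f\big) + \partial_\varphi\big(r \cdot r^{-2}\, \partial_\varphi f\big)\Big)
  = \partial_r^2 f + \frac{1}{r}\, \partial_r f + \frac{1}{r^2}\, \partial_\varphi^2 f .
\]
```

This recovers the familiar polar form of the Euclidean Laplacian.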
Finally, recall that the Hessian of a function $f$ is
\[ \operatorname{Hess} f := \nabla\nabla f, \]
which is the $(0, 2)$-tensor obtained by applying the Levi-Civita connection
to the gradient vector field of $f$. Note that the trace of $\operatorname{Hess} f$ using the
metric $g$ recovers $\Delta f$. (Check this.)
1.1.5. Curvature. The Riemann curvature tensor $\mathrm{Riem}$ is an element of
$C^\infty\big(\bigotimes^4 T^*M\big)$ defined so that for all vector fields $X, Y, Z, W \in C^\infty(TM)$,
we have
\[ \mathrm{Riem}(X, Y, Z, W) := \langle -\nabla_X\nabla_Y Z + \nabla_Y\nabla_X Z + \nabla_{[X,Y]} Z,\; W\rangle. \]
Given a local frame $v_1, \ldots, v_n$, we can represent $\mathrm{Riem}$ in that frame by
\[ R_{ijk\ell} := \mathrm{Riem}(v_i, v_j, v_k, v_\ell). \]
The tensor $\mathrm{Riem}$ is antisymmetric in the first pair of inputs and the last
pair of inputs, and symmetric when interchanging those pairs, or in other
words, $\mathrm{Riem} \in C^\infty\big((\bigwedge^2 T^*M) \odot (\bigwedge^2 T^*M)\big)$. There is one more symmetry,
known as the first Bianchi identity:
\[ R_{ijk\ell} + R_{jki\ell} + R_{kij\ell} = 0. \]
The second Bianchi identity concerns the derivatives of the curvature tensor:
\[ R_{ijk\ell;m} + R_{ij\ell m;k} + R_{ijmk;\ell} = 0. \]
Recall that the semicolons denote covariant differentiation.
Given an orthonormal frame $e_1, \ldots, e_n$, the Riemann curvature tensor
can be computed in terms of the connection 1-forms $\omega^i_j$. Specifically,
\[ \mathrm{Riem}(e_i, e_j, e_k, e_\ell) = -\langle \Omega(e_i, e_j)\, e_k,\; e_\ell\rangle, \tag{1.3} \]
where $\Omega$ is the $\operatorname{End}(TM)$-valued 2-form defined by
\[ \Omega = d\omega + \omega \wedge \omega, \]
or more precisely, $\Omega = \Omega^i_j\, e_i \otimes \theta^j$, where $\Omega^i_j$ is the 2-form given by
\[ \Omega^i_j = d\omega^i_j + \omega^i_k \wedge \omega^k_j, \]
and $\theta^1, \ldots, \theta^n$ is the orthonormal dual coframe of $e_1, \ldots, e_n$.
Remark 1.9. Other texts define the Riemann curvature tensor with three
lower indices and one raised index, instead of four lower and zero raised
as we have done here. This is an insignificant difference since the metric
provides a natural way to raise and lower indices, as described earlier.
More significantly, many texts use the opposite sign convention for the
definition of the Riemann curvature tensor. This is significant because the
sign of the curvature is very important! However, the literature is consistent
when defining sectional curvature, Ricci curvature, and scalar curvature.
Consequently, regardless of how the Riemann tensor is defined, positive
curvature assumptions are always consistent with spherical geometry while
negative curvature assumptions are always consistent with hyperbolic geometry.
With the convention used in this book, the sphere has a curvature
tensor which is positive definite as a symmetric bilinear form on $\bigwedge^2 T_p M$ at
each point $p$.
The sectional curvature K(Π) of a 2-plane Π ⊂ Tp M can be defined as

follows. If e1 , e2 is an orthonormal basis for Π, then K(Π), which we also
denote K(e1 , e2 ), is just Riem(e1 , e2 , e1 , e2 ).
The Ricci curvature Ric is defined to be the trace of the Riemann curvature
tensor over the second and fourth components. With respect to a local
frame $v_1, \ldots, v_n$, the local expression for Ric is $R_{ij} := \mathrm{Ric}(v_i, v_j)$, and thus
\[ R_{ij} = g^{k\ell} R_{ikj\ell}. \]
Finally, we can define the scalar curvature $R$ to be the trace of the Ricci
curvature so that with respect to a local frame, we have
\[ R = g^{ij} R_{ij}. \]
In particular, recall that when $M$ is two-dimensional, the scalar curvature
is just twice the Gauss curvature $K$. We will often use the notation $R_g$ or
$R_M$ to refer to the scalar curvature of the Riemannian manifold $(M, g)$, and
similarly for Riem and Ric.
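To make the sign conventions concrete, the following sketch records the curvature quantities of the unit round sphere $S^n$ under the definitions above. The formula for Riem is the standard constant-curvature expression written in this book's sign convention; it is offered as a sanity check rather than as part of the original text.

```latex
% Unit round sphere S^n; X, Y, Z, W are tangent vectors at a point,
% and e_1, e_2 are orthonormal.
\begin{align*}
\mathrm{Riem}(X,Y,Z,W) &= \langle X,Z\rangle\langle Y,W\rangle - \langle X,W\rangle\langle Y,Z\rangle, \\
K(e_1,e_2) &= \mathrm{Riem}(e_1,e_2,e_1,e_2) = 1, \\
R_{ij} &= g^{k\ell}R_{ikj\ell}
        = g^{k\ell}\left(g_{ij}\,g_{k\ell} - g_{i\ell}\,g_{kj}\right) = (n-1)\,g_{ij}, \\
R &= g^{ij}R_{ij} = n(n-1).
\end{align*}
```

In dimension two this gives $R = 2 = 2K$, in agreement with the relation between scalar curvature and Gauss curvature recalled above.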
Exercise 1.10. We define the Einstein tensor by
\[ G := \mathrm{Ric} - \tfrac{1}{2} R g. \]
Contract the second Bianchi identity to prove that the Einstein tensor is
divergence-free. That is,
\[ (\operatorname{div} G)_i := \nabla^j G_{ij} = 0. \]
It is sometimes convenient to fix a background metric $\bar g$ and compare
the geometry of $g$ to that of $\bar g$. We write $\bar\nabla$ for the Levi-Civita connection
of $\bar g$. Note that in a single local coordinate chart,
one can always choose the background metric to be the Euclidean metric
determined by the local coordinates, that is, one can choose $\bar g_{ij} = \delta_{ij}$. In
that case, $\bar\nabla f$ is the same thing as $\partial f$.
The difference between the Levi-Civita connections of $g$ and $\bar g$,
\[ W := \nabla - \bar\nabla, \]
is then a tensor (unlike $\Gamma$), which can sometimes be convenient. With
respect to a frame $v_1, \ldots, v_n$, we can write the components of $W$ via
\[ (\nabla_i - \bar\nabla_i)(v_j) = W^k_{ij} v_k. \]
Exercise 1.11. Derive the following formula:
\[ W^k_{ij} = \tfrac{1}{2} g^{k\ell} \left( \bar\nabla_i g_{j\ell} + \bar\nabla_j g_{i\ell} - \bar\nabla_\ell g_{ij} \right). \]
Clearly, this generalizes equation (1.2). Note that the expression $\bar\nabla_\ell g_{ij}$
actually denotes a component of the tensor $\bar\nabla g$, not the derivative of the
function $g_{ij}$.
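Exercise 1.12. Show that the Ricci curvatures of $g$ and $\bar g$ are related by
\[ R_{ij} = \bar R_{ij} + \left( \bar\nabla_k W^k_{ij} - \bar\nabla_j W^k_{ki} \right) + \left( W^k_{k\ell} W^\ell_{ij} - W^k_{j\ell} W^\ell_{ik} \right). \]
In particular,
\[ R = g^{ij} \bar R_{ij} + g^{ij} \left( \bar\nabla_k W^k_{ij} - \bar\nabla_j W^k_{ki} \right) + g^{ij} \left( W^k_{k\ell} W^\ell_{ij} - W^k_{j\ell} W^\ell_{ik} \right). \]
Note that in a local coordinate chart, if one replaces $\bar g_{ij}$ by $\delta_{ij}$ and $W$ by $\Gamma$,
one obtains (the more common) formulas for $R_{ij}$ and $R$ in local coordinates.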
A simple way to build new metrics from old ones is the warped product
construction. The following useful proposition gives formulas for the Ricci
curvature of a warped product metric. We omit the proof, which is fairly
involved. See [Che17, Section 3] or [Bes08, Proposition 9.106] for details.
Proposition 1.13. Let $(B, g_B)$ and $(F, g_F)$ be Riemannian manifolds such
that $F$ has dimension $k > 1$, and let $f$ be a positive function on $B$. Let $g$
be the warped product metric on $B \times F$ with warping factor $f$, given by the
equation
\[ g = g_B + f^2 g_F. \]
Let $X, Y$ be vectors in $B \times F$ tangent to the $B$ directions, and let $V, W$ be
vectors tangent to the $F$ directions. Then:
• $\mathrm{Ric}_g(X, Y) = \mathrm{Ric}_B(X, Y) - \frac{k}{f} (\mathrm{Hess} f)(X, Y)$,
• $\mathrm{Ric}_g(X, V) = 0$,
• $\mathrm{Ric}_g(V, W) = \mathrm{Ric}_F(V, W) - \left[ \frac{\Delta f}{f} + (k-1) \frac{|\nabla f|^2}{f^2} \right] \langle V, W \rangle$.
Consequently,
\[ R_g = R_B + \frac{R_F}{f^2} - 2k \frac{\Delta f}{f} - k(k-1) \frac{|\nabla f|^2}{f^2}. \]
Exercise 1.14. Recall that normal coordinates for the Euclidean, hyper-
bolic, and spherical metrics naturally express all three of these metrics as
warped products of a one-dimensional base and a spherical fiber, with the
only difference between the three metrics being the choice of warping factor.
Using your knowledge that these three spaces all have constant curvature
0, −1, and 1, respectively, check that the above proposition holds in these
three cases.
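As an illustration of how the check in Exercise 1.14 might go, here is a sketch of the spherical case only, for $n \geq 3$. We assume the standard normal-coordinate form $g = dr^2 + \sin^2 r \, g_{S^{n-1}}$ on $(0, \pi) \times S^{n-1}$, so that $B = (0, \pi)$, $F$ is the unit sphere $S^{n-1}$, $k = n - 1$, and $f = \sin r$; these identifications are our assumptions for the example.

```latex
% Base B = (0,\pi) with metric dr^2:
%   Ric_B = 0, Hess f = f'' dr^2, \Delta f = f'' = -\sin r, |\nabla f|^2 = \cos^2 r.
% Fiber F = unit S^{n-1}:
%   Ric_F = (k-1) g_F, and g_F = f^{-2} \langle\cdot,\cdot\rangle on fiber directions.
\begin{align*}
\mathrm{Ric}_g(\partial_r,\partial_r)
  &= 0 - \frac{k}{\sin r}\,(-\sin r) = k = n-1, \\
\mathrm{Ric}_g(V,W)
  &= \left[ \frac{k-1}{\sin^2 r} + 1 - (k-1)\frac{\cos^2 r}{\sin^2 r} \right]\langle V,W\rangle
   = k\,\langle V,W\rangle = (n-1)\,\langle V,W\rangle .
\end{align*}
```

So $\mathrm{Ric}_g = (n-1) g$, as expected for constant curvature 1. The Euclidean ($f = r$) and hyperbolic ($f = \sinh r$) cases follow the same pattern and give $\mathrm{Ric}_g = 0$ and $\mathrm{Ric}_g = -(n-1) g$, respectively.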
1.1.6. Scalar curvature. Scalar curvature has a simple geometric interpretation
at the local level. It measures the deviation of the volume of
infinitesimally small geodesic balls from the volume of balls in Euclidean
space. Positive scalar curvature corresponds to less volume while negative
scalar curvature corresponds to more volume. This is similar to how sec-
tional curvature measures the deviation between geodesic rays.
Exercise 1.15. Let $B_r(p)$ be the geodesic ball of radius $r$ around $p$ in a
smooth Riemannian manifold $(M^n, g)$, and let $|B_r(p)|$ denote its volume.
Let $\omega_n$ denote the volume of a unit ball in Euclidean $\mathbb{R}^n$. Prove that for all
small $r$,
\[ \frac{|B_r(p)|}{\omega_n r^n} = 1 - \frac{R(p)}{6(n+2)}\, r^2 + O(r^4), \]
where $R(p)$ is the scalar curvature at $p$. The “big O” notation means that
$O(r^4)$ stands in for a quantity that is bounded by $C r^4$ for some constant $C$
independent of $r$.
State and prove a similar formula for the volume of the geodesic sphere
$\partial B_r(p)$.
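A quick sanity check of the coefficient $\frac{1}{6(n+2)}$ of $R(p) r^2$ in the expansion of Exercise 1.15, assuming the standard fact that a geodesic ball of radius $r$ in the unit sphere $S^2$ has area $2\pi(1 - \cos r)$. Here $n = 2$, $\omega_2 = \pi$, and $R = 2$; the example is ours.

```latex
\begin{align*}
\frac{|B_r(p)|}{\omega_2 r^2}
  &= \frac{2\pi(1-\cos r)}{\pi r^2}
   = \frac{2}{r^2}\left( \frac{r^2}{2} - \frac{r^4}{24} + O(r^6) \right)
   = 1 - \frac{r^2}{12} + O(r^4), \\
\frac{R(p)}{6(n+2)}\,r^2 &= \frac{2}{6\cdot 4}\,r^2 = \frac{r^2}{12}.
\end{align*}
```

The two expressions agree to the stated order, and the sign confirms that positive scalar curvature corresponds to less volume than in Euclidean space.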
Unfortunately, unlike the situation for sectional curvature, this sort of

“local” interpretation of scalar curvature cannot be “integrated” to obtain
any kind of nonlocal result for scalar curvature. (An example of a re-
sult like this for sectional curvature would be Toponogov’s Theorem [Wik,
Toponogov’s theorem].) Indeed, in order to control the volumes of larger ge-
odesic balls, one typically needs to control the Ricci curvature [Wik, Bishop–
Gromov inequality]. Or in other words, although Exercise 1.15 gives us a
very nice local interpretation of scalar curvature, it does not seem to be
useful for understanding the global nature of scalar curvature.
We present one lesser-known comparison result for scalar curvature,
which could be thought of as a scalar curvature analog of the more famous
Bonnet-Myers Theorem [Wik, Myers’s theorem]. This theorem is due to
Leon Green, who credits Marcel Berger with discovering it independently.
Theorem 1.16 (Green [Gre63], Berger). Let $(M^n, g)$ be a compact Riemannian
manifold whose average scalar curvature is at least $n(n-1)$. Then
the conjugate radius of (M, g) is less than or equal to π. Moreover, if it is
equal to π, then (M, g) must be a spherical space form with constant curva-
ture 1.
Recall that the conjugate radius is the supremum of all r with the prop-
erty that any two conjugate points along a unit speed geodesic are at least
r units apart. Or equivalently, it is the supremum of all r with the property
that the exponential map at every p ∈ M has nonsingular derivative at every
point of the ball Br (0) ⊂ Tp M .
Proof. We follow the proof in [TW14]. Although the proof is fairly ele-
mentary and accessible, it uses background material that will not be used
much in the rest of this book. Specifically, we assume familiarity with
the index form for geodesics. Recall that the energy functional of a curve
$\gamma : [0, a] \to M$ is
\[ E(\gamma) = \int_0^a |\gamma'(t)|^2 \, dt. \]
Let $\gamma : [0, a] \to M$ be a unit speed geodesic, and let $X$ be a vector field
defined along $\gamma$ such that $X(0) = X(a) = 0$, and consider a smooth family
of curves $\gamma_s$ with $\gamma_0 = \gamma$ whose deformation vector field is $X$. Recall that
the index form along $\gamma$ may be defined to be the second variation of the
energy functional in the $X$ direction, that is,
\[ I(X, X) := \left. \frac{d^2}{ds^2} \right|_{s=0} E(\gamma_s), \tag{1.4} \]
where $X = \left. \frac{\partial}{\partial s} \right|_{s=0} \gamma_s$. Recall that this can be computed to be
\[ I(X, X) = \int_0^a \left( |X'(t)|^2 - \mathrm{Riem}(\gamma'(t), X(t), \gamma'(t), X(t)) \right) dt. \tag{1.5} \]
Now assume the hypotheses of the theorem, and let $a$ be the conjugate
radius. Let $p \in M$, let $u$ be a unit vector in $T_p M$, and let $\gamma : [0, a] \to M$ be
the unique geodesic with $\gamma(0) = p$ and $\gamma'(0) = u$. Recall that the fact that $\gamma$ has
no conjugate points before reaching $a$ means that $\gamma$ locally minimizes the
energy of paths from $\gamma(0)$ to $\gamma(a)$. This means that for any vector field $X$
along $\gamma$ vanishing at the endpoints, we have $I(X, X) \geq 0$. In particular, if
we choose $V_1, \ldots, V_{n-1}$ so that $\gamma', V_1, \ldots, V_{n-1}$ forms a parallel orthonormal
basis along $\gamma$ and set $X_i = \sin\left(\frac{\pi t}{a}\right) V_i$, then we have $I(X_i, X_i) \geq 0$ for each
$i$ from 1 to $n - 1$. Summing this inequality over $i$ and using (1.5) yields
\[ (n-1) \frac{\pi^2}{2a} - \int_0^a \mathrm{Ric}(\gamma'(t), \gamma'(t)) \sin^2 \frac{\pi t}{a} \, dt \geq 0. \]
(So far this is the exact same argument used to prove the Bonnet-Myers
Theorem.) The next step is to integrate this inequality over all possible
starting pairs $(p, u)$ determining $\gamma$. That is, we integrate over the unit
sphere bundle $SM$ lying inside the tangent bundle (which has a natural
metric coming from $g$). This gives us
\[ (n-1) \frac{\pi^2}{2a}\, \omega_{n-1} |M| - \int_0^a \left( \int_{SM} \mathrm{Ric}(\gamma'(t), \gamma'(t)) \, d\mu_{SM} \right) \sin^2 \frac{\pi t}{a} \, dt \geq 0, \]
where $\omega_{n-1} |M|$ is the total volume of $SM$ (so $\omega_{n-1}$ here is the volume of the
unit sphere $S^{n-1}$), and where $\gamma$ itself should now be thought of as depending
on the point $(p, u) \in SM$. The key point is that the integral
$\int_{SM} \mathrm{Ric}(\gamma'(t), \gamma'(t)) \, d\mu_{SM}$ is actually independent of $t$ since the geodesic flow
is a diffeomorphism of $SM$ preserving $d\mu_{SM}$. Therefore

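\[ \begin{aligned}
(n-1) \frac{\pi^2}{2a}\, \omega_{n-1} |M|
  &\geq \left( \int_{SM} \mathrm{Ric}(u, u) \, d\mu_{SM} \right) \left( \int_0^a \sin^2 \frac{\pi t}{a} \, dt \right) \\
  &= \left( \frac{\omega_{n-1}}{n} \int_M R \, d\mu_M \right) \frac{a}{2} \\
  &\geq (n-1)\, \omega_{n-1} |M| \, \frac{a}{2},
\end{aligned} \]
where we used our assumed lower bound on the average of $R$ in the last line.
Therefore $a \leq \pi$.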
If $a = \pi$, then every inequality becomes an equality. In particular the
vector fields $X_i$ described above become Jacobi fields, and from this one can
see that the sectional curvatures $K(\gamma', V)$ along each geodesic $\gamma$ are all equal
to 1. Therefore $(M, g)$ has constant curvature 1.
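For readers who want the intermediate computations in the proof above, here is a sketch of the two steps that are only asserted there: the evaluation of the summed index forms, and the fiberwise average of Ric over unit vectors. It assumes that $\omega_{n-1}$ denotes the volume of the unit sphere $S^{n-1}$, so that $SM$ has volume $\omega_{n-1} |M|$; the notation $S_pM$ for the unit sphere in $T_pM$ is ours.

```latex
% Step 1: X_i = \sin(\pi t/a) V_i with V_i parallel and of unit length,
% so |X_i'|^2 = (\pi/a)^2 \cos^2(\pi t/a).
\begin{align*}
\sum_{i=1}^{n-1} I(X_i,X_i)
  &= \int_0^a \left[ (n-1)\frac{\pi^2}{a^2}\cos^2\frac{\pi t}{a}
     - \sin^2\frac{\pi t}{a}\sum_{i=1}^{n-1}\mathrm{Riem}(\gamma',V_i,\gamma',V_i) \right] dt \\
  &= (n-1)\frac{\pi^2}{2a} - \int_0^a \mathrm{Ric}(\gamma',\gamma')\,\sin^2\frac{\pi t}{a}\,dt ,
\end{align*}
% using \int_0^a \cos^2(\pi t/a)\,dt = a/2 and the definition of Ric as a trace.
%
% Step 2: on each fiber S_pM \cong S^{n-1}, the average of u^i u^j is g^{ij}/n, hence
\begin{align*}
\int_{S_pM} \mathrm{Ric}(u,u)\,du = \frac{\omega_{n-1}}{n}\,R(p),
\qquad
\int_{SM} \mathrm{Ric}(u,u)\,d\mu_{SM} = \frac{\omega_{n-1}}{n}\int_M R\,d\mu_M .
\end{align*}
% Combining with \int_0^a \sin^2(\pi t/a)\,dt = a/2 and \int_M R\,d\mu_M \ge n(n-1)|M|
% gives (n-1)\pi^2/(2a) \ge (n-1)a/2, that is, a \le \pi.
```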
Exercise 1.17. Using equation (1.4) as the definition of I(X, X), prove
equation (1.5).
Two techniques arose which revolutionized our understanding of scalar

curvature. One technique uses spinors while the other uses minimal hyper-
surfaces. Both of these techniques will be discussed in this book, though we
will go into more detail about the minimal hypersurface technique.
In our study of scalar curvature, it will be useful to understand how it
changes under deformations. Just as we did for the volume form, we define
the linearization of R at g to be the linear operator
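\[ DR|_g : C^\infty(T^*M \odot T^*M) \to C^\infty(M) \]
such that for any $\dot g \in C^\infty(T^*M \odot T^*M)$,
\[ DR|_g(\dot g) := \left. \frac{d}{dt} \right|_{t=0} R_{g_t} \]
for any smooth family of Riemannian metrics $g_t$ on $M$ such that $g_0 = g$ and
$\dot g = \left. \frac{d}{dt} \right|_{t=0} g_t$.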
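Exercise 1.18. Prove that
\[ DR|_g(\dot g) = -\Delta_g(\operatorname{tr}_g \dot g) + \operatorname{div}_g(\operatorname{div}_g \dot g) - \langle \mathrm{Ric}_g, \dot g \rangle_g, \]
where the double divergence of a symmetric 2-tensor $h$ can be defined so that in any
orthonormal frame, $\operatorname{div}_g(\operatorname{div}_g h) = \sum_{i,j=1}^n \nabla_{e_i} \nabla_{e_j} h_{ij}$. Hint: Use Exercise 1.12, expressing
$R_{g_t}$ in terms of the background metric $g$. Differentiating at zero will cause
many terms to vanish since $W = 0$ at $t = 0$.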
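One inexpensive consistency check on the formula in Exercise 1.18 is the case of a constant rescaling. The family $g_t = (1 + tc) g$ with $c$ a constant is our choice of test case; it uses only the standard scaling law $R_{\lambda g} = \lambda^{-1} R_g$ for constant $\lambda > 0$.

```latex
% Test case: g_t = (1 + tc) g, so \dot g = c\,g with c constant,
% tr_g \dot g = nc, and div_g(c\,g) = 0 because \nabla g = 0.
\begin{align*}
\left.\frac{d}{dt}\right|_{t=0} R_{g_t}
  &= \left.\frac{d}{dt}\right|_{t=0} \frac{R_g}{1+tc} = -c\,R_g, \\
-\Delta_g(\operatorname{tr}_g \dot g) + \operatorname{div}_g(\operatorname{div}_g \dot g)
  - \langle \mathrm{Ric}_g, \dot g\rangle_g
  &= -\Delta_g(nc) + 0 - c\,\langle \mathrm{Ric}_g, g\rangle_g = -c\,R_g .
\end{align*}
```

Both sides agree, which in particular confirms the sign of the Ricci term.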
A more detailed computation shows that if $g = \bar g + \dot g$, then
\[ R_g = \bar R + DR|_{\bar g}(\dot g) + Q(g), \]
where $Q(g)$ is a contraction of three copies of $g^{-1}$ (that is, $g$ with raised
indices) and two copies of $\bar\nabla \dot g = \bar\nabla g$.
1.2. A survey of scalar curvature results

In this section we will survey some of the literature on scalar curvature
of compact manifolds. Our inspiration for the study of scalar curvature
begins with the two-dimensional case, in which the scalar curvature is just
twice the Gauss curvature. Recall the Gauss-Bonnet Theorem [Wik, Gauss-
Bonnet theorem].
Theorem 1.19 (Gauss-Bonnet Theorem). For a compact Riemannian surface
$(M^2, g)$, possibly with boundary, we have
\[ \int_M K \, d\mu = 2\pi \chi(M) - \int_{\partial M} \kappa \, ds, \]
where $K$ is the Gauss curvature of $(M, g)$, $\kappa$ is the geodesic curvature of
$\partial M$, and $ds$ is its line element. The sign convention for $\kappa$ is such that the
boundary of the Euclidean unit disk has $\kappa = 1$. Recall that $\chi(M)$ is the Euler
characteristic, which is a topological invariant of $M$.
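Two elementary cases illustrate the boundary term and its sign convention in the Gauss-Bonnet formula. Both examples (the Euclidean unit disk $D$ and the closed unit hemisphere $S^2_+$) are our choices, included only as a check.

```latex
% Euclidean unit disk D: K = 0, \chi(D) = 1, and \kappa = 1 on a boundary of length 2\pi.
\[
\int_D K\,d\mu = 0 = 2\pi\cdot 1 - 2\pi = 2\pi\chi(D) - \int_{\partial D}\kappa\,ds .
\]
% Closed unit hemisphere S^2_+: K = 1, area 2\pi, \chi = 1,
% and the boundary equator is a geodesic, so \kappa = 0.
\[
\int_{S^2_+} K\,d\mu = 2\pi = 2\pi\cdot 1 - 0 = 2\pi\chi(S^2_+) - \int_{\partial S^2_+}\kappa\,ds .
\]
```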
In the case of no boundary, this sets up a simple relationship between

Gauss curvature and topology. Obviously, nothing quite so nice will be true
in higher dimensions, but we can still ask questions about how the topology
relates to sign restrictions on the scalar curvature. First, it turns out that
negative scalar curvature places no restriction on the topology.
Theorem 1.20 (Aubin [Aub70]). Every compact manifold of dimension
at least 3 admits a metric with constant negative scalar curvature.
This was later generalized by J. Bland and M. Kalka to show that non-
compact manifolds of dimension at least 3 admit complete metrics of con-
stant negative scalar curvature [BK89].
In fact, it turns out that a much more striking theorem is true.
Theorem 1.21 (Lohkamp [Loh94]). Every manifold of dimension at least
3 admits a complete metric with negative Ricci curvature.
The compact three-dimensional case had been established earlier by

L. Zhiyong Gao and Shing-Tung Yau [GY86] using a different method,
and then refined by Robert Brooks [Bro89].
Lohkamp proved another striking theorem about the nature of negative
scalar curvature, illustrating how scalar curvature can always be “pushed
down.”
Theorem 1.22 (Lohkamp [Loh99]). Let $n \geq 3$, and let $(M^n, g)$ be a Riemannian
manifold. Let $U$ be an open subset of $M$, and let $f \in C^\infty(M)$
such that $f < R_g$ on $U$ and $f = R_g$ on $M \setminus U$. Then for any $\epsilon > 0$, there
exists a metric $g_\epsilon$ such that $g_\epsilon = g$ outside an $\epsilon$-neighborhood of $U$, while
$f - \epsilon \leq R_{g_\epsilon} \leq f$ inside the $\epsilon$-neighborhood. Moreover, $g_\epsilon$ may be chosen to
be arbitrarily $C^0$-close to $g$.
This leaves open the question of which compact manifolds admit pos-
itive scalar curvature, or nonnegative scalar curvature. The question of
which manifolds admit positive scalar curvature is a deep and complicated
one. Meanwhile, the question of nonnegative scalar curvature is very closely
related, as shown by the following theorem, attributed to J.-P. Bourguignon
in [KW75b]. The proof will be presented in Section 2.3.2.
Theorem 1.23 (Bourguignon). Suppose that (M, g) is a compact Riemann-
ian manifold with nonnegative scalar curvature, but M does not admit any
metric with positive scalar curvature. Then g must be Ricci-flat.
We say that a manifold is Yamabe positive if it admits a metric with
positive scalar curvature. The first result restricting the topology of Yamabe
positive manifolds was proved by A. Lichnerowicz, who showed that any spin
manifold with positive scalar curvature cannot have any harmonic spinors.
Then, by the Atiyah-Singer index theorem, the following is immediate.
Theorem 1.24 (Lichnerowicz [Lic63]). If $M^n$ is a spin manifold that admits
positive scalar curvature, then its Hirzebruch Â genus vanishes.
We will touch on this result more in Chapter 5. For now, we only note
that spin is a topological property (stronger than orientability), and that the
Â genus is a topological invariant that is only nontrivial when n is a multiple
of four. This theorem was later extended by N. Hitchin, who was able to
upgrade the result to the vanishing of an invariant that lies in $KO^{-*}(\mathrm{pt})$.
Theorem 1.25 (Hitchin [Hit74]). If M is a spin manifold that admits pos-
itive scalar curvature, then its Atiyah-Milnor-Singer invariant α vanishes.
When n is a multiple of 4, α essentially comes from Â, but it gives new
information when n is equal to 1 or 2 (mod 8). It is trivial in all other
dimensions. This α invariant is actually an invariant of the spin cobordism
class of M . Using the α invariant, one can show that there are exotic spheres
which do not admit positive scalar curvature.
These are obstructions to positive scalar curvature. What about exis-
tence? First we consider which manifolds are known to carry metrics with
positive scalar curvature. Obviously, any manifold with positive sectional
curvature or positive Ricci curvature has positive scalar curvature. More-
over, we have the following.
Exercise 1.26. Suppose M is a compact manifold that carries a metric

of positive scalar curvature. Let N be any compact manifold. Prove that
M × N carries a metric of positive scalar curvature.
A much more sophisticated result is the following, which was proved

by R. Schoen and S.-T. Yau [SY79d] and by M. Gromov and H. B. Law-
son [GL80b].
Theorem 1.27 (Surgery for positive scalar curvature). Suppose M is a

compact manifold (not necessarily connected) that carries a metric of posi-
tive scalar curvature. Then any manifold obtained from M by surgeries in
codimension at least 3 also carries a metric of positive scalar curvature. In
particular, for dimension $n \geq 3$, if $M^n$ and $N^n$ carry metrics of positive
scalar curvature, then so does their connected sum $M \# N$.
Recall that surgery on a k-sphere in M is a topological procedure in

which one removes a closed tubular neighborhood $S^k \times \bar B^{n-k}$ of that $k$-sphere
and replaces it by $\bar B^{k+1} \times S^{n-k-1}$, which has the same boundary
[Wik, Surgery theory]. Note that the $k = 0$ case (which is surgery in
codimension n) involves removing two disjoint n-balls and replacing them
by a connecting cylinder. In particular, if the two disjoint n-balls lie on
different components of M , this is what we usually call the connected sum
construction. The proof of Theorem 1.27 is perhaps easiest to think about in
this codimension n case. Schoen and Yau glued the two metrics together in
a simple way and then followed this by a global conformal change in order to
impose positive scalar curvature. Alternatively, Gromov and Lawson used
a construction that involved interpolating between the metric on a small
annulus around a point in M and a tiny cylindrical metric, in such a way
that the positive scalar curvature is preserved. For a complete version of
Gromov and Lawson’s proof, see the treatment by Jonathan Rosenberg and
S. Stolz in [RS01].
Given the theorem above, constructing Yamabe positive manifolds be-
comes primarily a topological problem. For simply connected compact man-
ifolds of dimension at least 5, Gromov and Lawson were able to show that the
property of admitting a metric of positive scalar curvature is a spin cobor-
dism invariant, and that nonspin manifolds always admit metrics of positive
scalar curvature [GL80b]. Building on this foundational work, S. Stolz was
able to show that for simply connected compact manifolds of dimension at
least 5, Hitchin’s theorem (Theorem 1.25) gives all possible obstructions.
Theorem 1.28 (Stolz [Sto92]). Let $n \geq 5$. If $M^n$ is a compact simply
connected manifold, then it carries a metric of positive scalar curvature if
and only if either $M$ is not spin, or $M$ is spin and $\alpha = 0$.
This leaves only the low-dimensional cases ($n = 3$ or 4) and the
non-simply connected cases. In three dimensions, the problem is completely
understood.
Theorem 1.29 (Classification of 3-manifolds carrying positive scalar curvature).
A compact 3-manifold admits a positive scalar curvature metric if and
only if it is a connected sum of spherical space forms and copies of $S^2 \times S^1$.
The reverse implication follows immediately from the connected sum

case of Theorem 1.27. The forward implication can either be proved us-
ing the minimal surface technique of Schoen and Yau [SY79d], or by the
spinor technique of Gromov and Lawson [GL80a]. (Note that all orientable
3-manifolds are spin.) In either case, in order to obtain the theorem as
stated above, one must use G. Perelman’s proof [Per02, Per03b, Per03a]
(see [MT07]) of the Poincaré conjecture [Wik, Poincare conjecture], as
well as I. Agol’s proof [Ago13] of the virtually Haken conjecture [Wik,
Virtually Haken conjecture]. (Of course, these results were not available
back in 1979.)
In dimension 4, once again the techniques of Schoen-Yau and Gromov-
Lawson yield various obstructions, but in addition to these results, one
also has new obstructions to positive scalar curvature arising from
Seiberg-Witten theory. Since those techniques are outside the scope of this book,
we will say no more about them. For more information, see the survey
article [Ros07].
For reasons to be described later, we are especially interested in un-
derstanding the case of the torus. For a time, it was an important open
question whether the 3-torus can carry a metric of positive scalar curva-
ture [KW75b, Ger75]. Theorem 1.29 answered this question in the nega-
tive, but we can ask the same question for higher-dimensional tori or, more
generally, for connected sums with higher-dimensional tori.
Theorem 1.30. Let $T^n$ be the $n$-dimensional torus, and let $M^n$ be a compact
manifold. Then $T^n \# M$ cannot carry a metric of positive scalar curvature.
The n = 3 case is a special case of Theorem 1.29. Schoen and Yau proved
the result in dimensions less than 8 [SY79d], and this is the case that we will
discuss in greater detail in Chapter 2. Soon afterward, Gromov and Lawson
discovered a proof that works whenever M is spin [GL80a]. A result of
Nathan Smale [Sma93] implies the n = 8 case. In recent years, proofs in
higher dimensions have appeared in a preprint by Schoen and Yau [SY17]
and a series of preprints by Lohkamp [Loh06, Loh15c, Loh15a, Loh15b].
Theorem 1.30 has central importance for us because of its relevance to

the positive mass theorem. Specifically, in Section 3.3 we will explain how
the positive mass theorem follows from Theorem 1.30.
For the case of spin manifolds, Gromov and Lawson actually proved
a much more general theorem than Theorem 1.30, and, since that time,
there has been a good deal of progress in using spinor techniques. For the
case of nonsimply connected compact spin manifolds of dimension at least
5, the primary motivating problem is the stable Gromov-Lawson-Rosenberg
Conjecture. The subject is primarily topological, and we refer the interested
reader to the survey article [Ros07], where the conjecture and partial results
are discussed.
Another important theorem for surfaces is the uniformization theorem
[Wik, Uniformization theorem]. It is sometimes stated in terms of complex
geometry, but here we state it in terms of curvature.
Theorem 1.31 (Uniformization Theorem). For any compact Riemannian
surface $(M^2, g)$, there exists a metric conformal to $g$ which has constant
Gauss curvature.
Recall that, given a metric g, a metric g̃ is said to be conformal to

g if angle measurements between tangent vectors are the same, whether
measured using g or g̃. One can see that this is the same as saying that each
metric is a positive function times the other. The Uniformization Theorem
generalizes to higher dimensions in a very nice way.
Theorem 1.32 (Yamabe problem). For any compact Riemannian mani-
fold (M, g), there exists a metric conformal to g which has constant scalar
curvature.
This theorem, first proposed by H. Yamabe [Yam60], was proved over

many years via important contributions from N. Trudinger [Tru68],
T. Aubin [Aub76], and R. Schoen [Sch84], and it has a long story of
its own. The final step by Schoen used the positive mass theorem (Theo-
rem 3.18), which we will discuss in Chapter 3, in an essential way. For an
overview of the Yamabe problem, see the excellent survey article [LP87].
We briefly discuss the relationship between conformal changes to the
metric and scalar curvature.
Definition 1.33. On a manifold of dimension $n \geq 3$, we say that $\tilde g$ is
conformal to $g$ if and only if there exists a smooth positive function $u$ such
that $\tilde g = u^{\frac{4}{n-2}} g$. The function $u$ is called a conformal change of metric.
The set of all metrics $\tilde g$ obtained in this way from $g$ is called the conformal
class of $g$. A choice of conformal class of metrics on a manifold is called a
conformal structure.
The choice of exponent $\frac{4}{n-2}$ is immaterial to the definition, but it turns
out to be a convenient choice for purposes of analysis.
Exercise 1.34. Let $(M^n, g)$ be a Riemannian manifold, and let $u$ be a
smooth positive function on $M$. If $\tilde g = u^{\frac{4}{n-2}} g$, show that
\[ R_{\tilde g} = u^{-\frac{n+2}{n-2}} \left( -\frac{4(n-1)}{n-2} \Delta_g u + R_g u \right). \tag{1.6} \]
Hint: Use Exercise 1.12.
We can define the conformal Laplacian
\[ L_g u := -\frac{4(n-1)}{n-2} \Delta_g u + R_g u, \tag{1.7} \]
so that we can write
\[ R_{\tilde g} = u^{-\frac{n+2}{n-2}} L_g u. \tag{1.8} \]
Note that the conformal Laplacian is a symmetric second-order linear elliptic
operator. Since the right side of equation (1.6) is a nonlinear elliptic expres-
sion in u, we see why the Yamabe problem is at least naively a reasonable
problem to study.
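Two immediate consequences of equation (1.6) may help fix ideas; both are our illustrations rather than statements from the text. First, a constant conformal factor reproduces the usual scaling law for scalar curvature. Second, the Yamabe problem becomes a semilinear equation for $u$, where $\lambda$ denotes the (hypothetical) constant value of the new scalar curvature.

```latex
% (a) Constant factor u = c > 0: \Delta_g c = 0, so with \tilde g = c^{4/(n-2)} g,
\[
R_{\tilde g} = c^{-\frac{n+2}{n-2}}\, R_g\, c = c^{-\frac{4}{n-2}}\, R_g .
\]
% (b) \tilde g = u^{4/(n-2)} g has constant scalar curvature \lambda if and only if
\[
-\frac{4(n-1)}{n-2}\,\Delta_g u + R_g\, u = \lambda\, u^{\frac{n+2}{n-2}}, \qquad u > 0 .
\]
```

Case (a) matches the scaling law $R_{\lambda g} = \lambda^{-1} R_g$ with $\lambda = c^{4/(n-2)}$, which is a useful check on the exponents.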
Next we consider the problem of prescribing scalar curvature. That
is, given a function f on M , is there a metric g whose scalar curvature is
equal to f ? This problem was answered definitively by Kazdan and Warner.
However, note that the theorem as stated below uses the solution of the
Yamabe problem (Theorem 1.32), which post-dates their work.
Theorem 1.35 (Kazdan-Warner trichotomy [KW75a]). All compact man-
ifolds can be placed in three different categories: (1) manifolds that admit
positive scalar curvature, (2) manifolds that admit nonnegative scalar cur-
vature, but not positive scalar curvature, and (3) everything else, that is,
manifolds that do not admit nonnegative scalar curvature.
In dimension at least 3: for manifolds of type (1), any function can be
prescribed as the scalar curvature; for manifolds of type (2), a function can
be prescribed as the scalar curvature if and only if it is negative somewhere
or is identically zero; for manifolds of type (3), a function can be prescribed
as the scalar curvature if and only if it is negative somewhere.
In dimension 2, we have the same result, except that in case (1), the
scalar curvature must be positive somewhere.
Chapter 2
Minimal hypersurfaces
2.1. Basic definitions and the Gauss-Codazzi equations

Let $\Sigma^m$ be a submanifold of a Riemannian manifold $(M^n, g)$. The metric $g$
induces a metric $h$ on $\Sigma$. If we denote the Levi-Civita connection of $(\Sigma, h)$
by $\hat\nabla$, then it is known that for any $p \in \Sigma$, $X \in T_p\Sigma$, and any vector field
$Y \in C^\infty(T\Sigma)$,
\[ \hat\nabla_X Y = (\nabla_X \tilde Y)^\top, \]
the tangential component of ∇X Ỹ at p (that is, its orthogonal projection
to the tangent space Tp Σ), where Ỹ is any extension of Y to a vector field
on $M$. Let $N\Sigma$ be the normal bundle of $\Sigma$ in $M$. Recall that the second
fundamental form of $\Sigma$ is a tensor $\mathbf{A} \in C^\infty(T^*\Sigma \otimes T^*\Sigma \otimes N\Sigma)$ defined so
that for any $p \in \Sigma$ and $X, Y \in T_p\Sigma$,
\[ \mathbf{A}(X, Y) := (\nabla_X \tilde Y)^\perp, \]
the normal component of $\nabla_X \tilde Y$ (that is, its orthogonal projection to the
normal space $N_p\Sigma$), where $\tilde Y$ is any extension of $Y$ to a vector field on $M$.
Recall that $\mathbf{A}$ is symmetric, that is, for all $X, Y \in T_p\Sigma$,
\[ \mathbf{A}(X, Y) = \mathbf{A}(Y, X). \]
Note that in the literature, many authors use A or II instead of A.
Equivalently, we can define the shape operator (or Weingarten map)
$S \in C^\infty(N^*\Sigma \otimes T^*\Sigma \otimes T\Sigma)$ so that for any $X \in T_p\Sigma$ and $\nu \in N_p\Sigma$,
\[ S_\nu(X) := (-\nabla_X \tilde\nu)^\top, \]
the tangential component of $-\nabla_X \tilde\nu$, where $\tilde\nu$ is any extension of $\nu$ that remains
normal along $\Sigma$. The second fundamental form and the shape operator are
related by the Weingarten equation,
\[ \langle \mathbf{A}(X, Y), \nu \rangle = \langle S_\nu(X), Y \rangle. \]
In the hypersurface case, when $\Sigma$ has dimension $n - 1$, it is often convenient
to choose a distinguished unit normal vector. If there exists a global
choice of unit normal vector $\nu$ (i.e., $\Sigma$ has trivial normal bundle), we say
that $\Sigma$ is two-sided. Recall that if the ambient manifold $M$ is orientable,
then a hypersurface $\Sigma$ is two-sided if and only if it is orientable. Given such
a $\nu$, we can think of the second fundamental form as a scalar-valued bilinear
form rather than as a normal vector-valued bilinear form by defining
\[ A(X, Y) := \langle \mathbf{A}(X, Y), -\nu \rangle. \]
Keep in mind that there is always at least an implicit choice of unit normal
$\nu$ whenever the notation $A(X, Y)$ is used. In general, if a unit normal $\nu$
is not specified, and $\Sigma$ has an inside and outside, it is typically implicitly
assumed that $\nu$ is the outward normal. Similarly, if $\nu$ is understood, then
we write $S := S_{-\nu}$ for the shape operator. Note that
\[ \langle \nabla_X Y, -\nu \rangle = A(X, Y) = \langle S(X), Y \rangle = \langle \nabla_X \nu, Y \rangle. \tag{2.1} \]
Remark 2.1. The −ν instead of ν that appears in our definitions for A and
S is simply a convention chosen for this text. This somewhat curious choice
stems from our desire to simultaneously have (1) the bilinear form A and the
operator S be positive for spheres in Euclidean space, and (2) the outward
unit normal be our default choice of normal. Unfortunately, this convention
is contrary to the classical definition of the shape operator! However, we
feel that the benefits of this convention outweigh this drawback.
Exercise 2.2. Verify that if (M, g) is Euclidean space and Σ is a sphere
of radius $r$, and we choose $\nu$ to be the outward unit normal, then $A = \frac{1}{r} h$,
where $h$ is the induced metric on $\Sigma$.
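A sketch of the computation behind Exercise 2.2, assuming the sphere is centered at the origin of $\mathbb{R}^n$ so that the outward unit normal at $x$ is $\nu(x) = x/r$. It uses the last equality in (2.1) and the definition of the mean curvature scalar as $\operatorname{tr}_h A$.

```latex
% \Sigma = \{ |x| = r \} \subset \mathbb{R}^n, \nu(x) = x/r, and \nabla is the flat connection.
\begin{align*}
\nabla_X \nu &= \frac{1}{r}\,\nabla_X x = \frac{1}{r}\,X
  \quad\text{for } X \in T_p\Sigma, \\
A(X,Y) &= \langle \nabla_X \nu, Y\rangle = \frac{1}{r}\,\langle X,Y\rangle = \frac{1}{r}\,h(X,Y), \\
H &= \operatorname{tr}_h A = \frac{n-1}{r}.
\end{align*}
```

In particular $A$ is positive definite for the outward normal, which is the behavior the sign convention of Remark 2.1 is designed to produce.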
The mean curvature vector $\mathbf{H}$ is the trace of $\mathbf{A}$ over the tangential
directions, that is, at $p \in \Sigma$,
\[ \mathbf{H} := \sum_{i=1}^m \mathbf{A}(e_i, e_i), \]
where $e_1, \ldots, e_m$ is any orthonormal basis of the tangent space $T_p\Sigma$. In the
hypersurface case, we can define the mean curvature scalar
\[ H := \langle \mathbf{H}, -\nu \rangle = \operatorname{tr}_h A = \operatorname{tr} S. \]
Exercise 2.3. Given a hypersurface $\Sigma$ in $(M, g)$ with normal $\nu$, prove that
for any smooth function $f$,
\[ \Delta_\Sigma f = \Delta_g f - \nabla_\nu \nabla_\nu f + \langle \mathbf{H}, \nabla f \rangle. \]
Given a vector field $X$ in $M$, defined along $\Sigma$, we can define the tangential
divergence of $X$ to be
\[ \operatorname{div}_\Sigma X := \sum_{i=1}^m \langle \nabla_{e_i} X, e_i \rangle, \tag{2.2} \]
where $e_1, \ldots, e_m$ is any orthonormal frame for $\Sigma$. Notice that this generalizes
the traditional definition of divergence on $\Sigma$ to vectors that are not
necessarily tangent to $\Sigma$. Also observe that this notation gives us another
expression for the mean curvature scalar of $\Sigma$:
\[ H = \operatorname{tr} S = \operatorname{div}_\Sigma \nu. \]
Exercise 2.4. Show that for any frame $v_1, \ldots, v_m$ for $\Sigma$, we have
\[ \operatorname{div}_\Sigma X = \sum_{i,j=1}^m \langle v^i, v^j \rangle \langle \nabla_{v_i} X, v_j \rangle. \]
The intrinsic and extrinsic curvatures of Σ and the ambient curvature are
all related to each other according to the Gauss-Codazzi equations [Wik,
Gauss-Codazzi equations], which we will separate into what we call the
Gauss equation and the Peterson-Codazzi-Mainardi equation.
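Theorem 2.5 (Gauss equation). Let $\Sigma$ be a submanifold of $(M, g)$. For any
$p \in \Sigma$ and any tangent vectors $X, Y, Z, W \in T_p\Sigma$, we have
\[ \mathrm{Riem}_M(X, Y, Z, W) = \mathrm{Riem}_\Sigma(X, Y, Z, W) + \langle \mathbf{A}(X, W), \mathbf{A}(Y, Z) \rangle - \langle \mathbf{A}(X, Z), \mathbf{A}(Y, W) \rangle. \]
Theorem 2.6 (Peterson-Codazzi-Mainardi equation). Let $\Sigma$ be a submanifold
of $(M, g)$. For any $p \in \Sigma$, any tangent vectors $X, Y, Z \in T_p\Sigma$, and any
normal vector $\nu \in N_p\Sigma$,
\[ \mathrm{Riem}_M(X, Y, Z, \nu) = \langle (\nabla_Y \mathbf{A})(X, Z) - (\nabla_X \mathbf{A})(Y, Z), \nu \rangle. \]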
Now let us consider the hypersurface case. Let $e_1, \ldots, e_{n-1}$ be an
orthonormal basis of $T_p\Sigma$, and let $\nu$ be the distinguished unit normal in $N_p\Sigma$.
The Gauss equation implies that
\[ \mathrm{Riem}_M(e_i, e_j, e_i, e_j) = \mathrm{Riem}_\Sigma(e_i, e_j, e_i, e_j) + A(e_i, e_j) A(e_j, e_i) - A(e_i, e_i) A(e_j, e_j). \]
If we sum the above equation over both $i$ and $j$ from 1 to $n - 1$ (in other
words, take two traces), we obtain
\[ \sum_{i,j=1}^{n-1} \mathrm{Riem}_M(e_i, e_j, e_i, e_j) = R_\Sigma + |A|^2 - H^2. \]
If we set $e_n = \nu$, then $e_1, \ldots, e_n$ is an orthonormal basis of $T_p M$, and so the
left side is
\[ \begin{aligned}
\sum_{i,j=1}^{n-1} \mathrm{Riem}_M(e_i, e_j, e_i, e_j)
  &= \sum_{i,j=1}^{n} \mathrm{Riem}_M(e_i, e_j, e_i, e_j) - \sum_{i=1}^{n} \mathrm{Riem}_M(e_i, e_n, e_i, e_n) \\
  &\quad - \sum_{j=1}^{n} \mathrm{Riem}_M(e_n, e_j, e_n, e_j) + \mathrm{Riem}_M(e_n, e_n, e_n, e_n) \\
  &= R_M - 2\,\mathrm{Ric}_M(e_n, e_n),
\end{aligned} \]
where we used the symmetries of the Riemann tensor. Hence, we obtain the
following.
Corollary 2.7 (Traced Gauss equation). Let $\Sigma$ be a hypersurface of $(M, g)$.
For any $p \in \Sigma$ and any unit normal $\nu \in N_p\Sigma$, we have
\[ R_M = R_\Sigma + 2\,\mathrm{Ric}_M(\nu, \nu) + |A|^2 - H^2. \]
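As a consistency check on the signs in the traced Gauss equation, consider a round sphere of radius $r$ in Euclidean $\mathbb{R}^n$. The example is ours; it uses the result $A = \frac{1}{r} h$ of Exercise 2.2 and the standard value $R_\Sigma = (n-1)(n-2)/r^2$ for the scalar curvature of a round $(n-1)$-sphere of radius $r$.

```latex
% Ambient space M = \mathbb{R}^n: R_M = 0 and Ric_M = 0.
% From A = h/r on the (n-1)-dimensional sphere \Sigma:
%   |A|^2 = (n-1)/r^2 and H^2 = (n-1)^2/r^2.
\[
R_\Sigma + 2\,\mathrm{Ric}_M(\nu,\nu) + |A|^2 - H^2
  = \frac{(n-1)(n-2) + (n-1) - (n-1)^2}{r^2} = 0 = R_M .
\]
```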
2.2. First and second variation of volume

Our goal in this section is to study the volume functional μ on the space of all
compact m-submanifolds of M n , either with or without boundary. In par-
ticular, we would like to understand the critical points of this functional, as
well as the local minima. Formally, we can think of the volume functional
as a function on the infinite-dimensional (nonlinear) space of all compact
m-dimensional submanifolds. In general, if ℳ is a finite-dimensional
manifold, and f : ℳ −→ V is a smooth map into a vector space V, then we
define the directional derivative of f in the direction v at the point p to be
$Df|_p(v) := \frac{d}{dt}\big|_{t=0} f(\gamma(t))$, where γ is any smooth path in ℳ with γ(0) = p
and γ′(0) = v. The derivative map at p, $Df|_p : T_p\mathcal{M} \to V$, is sometimes
called the linearization of f at p. A critical point of f is a point p
where the linearization vanishes. This concept can be generalized to infinite
dimensions, but we will do this without formally defining a notion of
infinite-dimensional manifold, because we do not require such machinery.
In our infinite-dimensional setting described above, rather than taking
ℳ to be the space of all compact m-submanifolds of M, it is sometimes
desirable to fix a single submanifold $\Sigma^m$ and take ℳ to be the space of all
smooth embeddings (or immersions, depending on the context) of Σ into M,
because this space is easier to work with. There is another similar approach,
which is the one we will follow below. We will fix a specific submanifold
$\Sigma^m \subset M^n$, and take ℳ to be $\mathrm{Diff}_0(M)$, the space of all diffeomorphisms
of M in the same component as the identity. By pushing forward Σ via

diffeomorphism, this space will parameterize all of the submanifolds of M
that are isotopic to Σ (indeed, this is the definition of isotopic). Of course,
this parameterization introduces a huge amount of redundancy, but these
redundancies will not cause problems for us. Even without the formalism
of infinite-dimensional manifolds, we can still think intuitively about what
the “tangent space” of Diff 0 (M ) at the identity should be: the space of
smooth vector fields $C^\infty(TM)$. Explicitly, if one considers a smooth¹ path
$\Phi : (-\epsilon, \epsilon) \to \mathrm{Diff}_0(M)$ such that $\Phi_0$ is the identity, then we can define
a vector field $X(p) = \frac{\partial}{\partial t}\big|_{t=0}\Phi_t(p)$ for all p ∈ M. More generally, we can
define $X_t(\Phi_t(p)) = \frac{\partial}{\partial t}\Phi_t(p)$ so that $X_0 = X$. We abbreviate this by writing
$X_t = \frac{\partial}{\partial t}\Phi_t$. We often refer to the family $\Phi_t$ as a one-parameter family of
deformations and $X_t$ as its first-order deformation vector field.
Fix a compact submanifold Σ of a Riemannian manifold (M, g) with
induced metric h and induced volume measure dμΣ = dμh . Define Σt =
Φt (Σ), where Φt is as described above. Our first goal is to compute the
linearization of the volume functional at Σ in the direction of X, also called
the first variation of volume with respect to the first-order deformation X,
which is just the quantity $\frac{d}{dt}\big|_{t=0}\mu(\Sigma_t)$. The idea behind the computation
is conceptually similar to computing the first variation of the energy of
curves—a standard computation in most Riemannian geometry textbooks.
Let ht denote the induced metric on Σt , pulled back to Σ via Φt , so that
h0 = h is just the original metric on Σ. Geometric quantities relating to Σt
will be labeled with a Σt . If they are instead labeled with ht , or sometimes
just t, this refers to the same quantity pulled back to Σ. For example,
$d\mu_t := \Phi_t^* d\mu_{\Sigma_t}$. In particular,
\[
\mu(\Sigma_t) = \int_{\Sigma_t} d\mu_{\Sigma_t} = \int_\Sigma d\mu_t. \tag{2.3}
\]
Writing the volume as an integral over the fixed space Σ makes it conceptu-
ally more straightforward to compute the derivative. From Proposition 1.3,
we know that
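\[
\frac{\partial}{\partial t}\Big|_{t=0} d\mu_t = \frac{1}{2}(\mathrm{tr}_h \dot h)\, d\mu_h,
\]
where $\dot h = \frac{\partial}{\partial t}\big|_{t=0} h_t$. So in order to find the first variation of the volume
measure, it suffices to compute $\mathrm{tr}_h \dot h$.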
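
¹ One might suspect that this begs the question of the manifold structure on $\mathrm{Diff}_0(M)$, but
one can define this smoothness without too much fuss. You may want to try it yourself.

Let $e_1, \dots, e_m$ be a local orthonormal frame for (Σ, h). Let $(h_t)_{ij}$ be
the expression for $h_t$ with respect to this fixed choice of frame. Define
$e_i(t) := d\Phi_t(e_i)$ to be the push forward of $e_i$, which lives on $\Sigma_t$. By definition
of $h_t$ and $X_t$, we have
\begin{align*}
\frac{\partial}{\partial t}(h_t)_{ij} &= \frac{\partial}{\partial t}\langle e_i, e_j\rangle_{h_t} \\
&= \frac{\partial}{\partial t}\Phi_t^*\langle e_i(t), e_j(t)\rangle_g \\
&= \Phi_t^*\big(X_t\langle e_i(t), e_j(t)\rangle_g\big) \\
&= \Phi_t^*\big(\langle\nabla_{X_t} e_i(t), e_j(t)\rangle_g + \langle e_i(t), \nabla_{X_t} e_j(t)\rangle_g\big) \\
&= \Phi_t^*\big(\langle\nabla_{e_i(t)} X_t, e_j(t)\rangle_g + \langle e_i(t), \nabla_{e_j(t)} X_t\rangle_g\big),
\end{align*}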
where we used properties of the Levi-Civita connection ∇ in the last two
equalities. (Take note of how the torsion-free property was used, and why
that step is valid.) In particular, we have

\[
\frac{\partial}{\partial t}\Big|_{t=0}(h_t)_{ij} = \langle\nabla_{e_i} X, e_j\rangle + \langle e_i, \nabla_{e_j} X\rangle. \tag{2.4}
\]
Since the trace of the above expression is just 2 divΣ X, we obtain the fol-
lowing.
Lemma 2.8 (First variation of the volume measure). The linearization of
the volume measure at Σ in the direction of the vector field X is

\[
D(d\mu)|_\Sigma(X) := \frac{\partial}{\partial t}\Big|_{t=0} d\mu_{h_t} = (\mathrm{div}_\Sigma X)\, d\mu_\Sigma. \tag{2.5}
\]
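We now decompose $X = \hat X + X^\perp$ into its tangential and normal components
so we can see how the expression above depends on those components:
\begin{align*}
\mathrm{div}_\Sigma X &= \sum_{i=1}^{m} \langle\nabla_{e_i} X, e_i\rangle \\
&= \sum_{i=1}^{m} \big(\langle\nabla_{e_i} \hat X, e_i\rangle + \langle\nabla_{e_i} X^\perp, e_i\rangle\big) \\
&= \mathrm{div}_\Sigma \hat X + \sum_{i=1}^{m} \big(e_i\langle X^\perp, e_i\rangle - \langle X^\perp, \nabla_{e_i} e_i\rangle\big) \tag{2.6} \\
&= \mathrm{div}_\Sigma \hat X - \langle H, X^\perp\rangle.
\end{align*}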
Exercise 2.9. Use the same sort of reasoning as above to show that the
linearization of the induced metric itself in the direction of the vector field
X is
\[
Dh|_\Sigma(X) = L_{\hat X} h - 2\langle A, X^\perp\rangle, \tag{2.7}
\]
where Dh is interpreted to mean the first variation of the pullback of the
induced metric to Σ.
In looking at equation (2.6), we should not be surprised to see the divΣ X̂

term. This is because a tangential first-order deformation can arise from a
family of diffeomorphisms that preserves Σ, and, for such a deformation,
the definition of Lie derivative shows that D(dμ)|Σ (X̂) = LX̂ (dμΣ ), which
equals (divΣ X̂) dμΣ by Exercise 1.8. If a tangential deformation X̂ vanishes
at the boundary ∂Σ, then of course it should not have any effect on the
total volume of Σ since such a deformation does not even change Σ (to first-
order). However, changing the boundary does change the volume. Indeed,
combining equation (2.6) with Lemma 2.8 and the divergence theorem, we
immediately obtain the following.
Proposition 2.10 (First variation of volume). The linearization of the total
volume functional μ at Σ in the direction of the vector field X is

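\[
D\mu|_\Sigma(X) := \frac{d}{dt}\Big|_{t=0}\mu(\Sigma_t) = -\int_\Sigma \langle H, X^\perp\rangle\, d\mu_\Sigma + \int_{\partial\Sigma} \langle\hat X, \eta\rangle\, d\mu_{\partial\Sigma},
\]
where η is the outward-pointing conormal unit vector (tangent to Σ but
orthogonal to ∂Σ).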
In the two-sided hypersurface case we may write X ⊥ = ϕν, where ν is
the distinguished unit normal, and this formula reduces to

\[
D\mu|_\Sigma(X) = \int_\Sigma H\phi\, d\mu_\Sigma + \int_{\partial\Sigma} \langle\hat X, \eta\rangle\, d\mu_{\partial\Sigma}.
\]
Note that this formula does not assume that X is a normal variation,
nor that X vanishes at the boundary ∂Σ. (Do you see why the boundary
contribution is intuitive?) From this formula, one easily sees that if H
is identically zero, then the volume of Σ is stationary with respect to all
possible deformations of Σ that preserve the boundary. For this reason
any submanifold Σ with vanishing H is called a minimal submanifold of
(M, g). Despite the nomenclature, a minimal submanifold need not be a
local minimum of the volume functional (in the space of all submanifolds
with the same boundary) but only a critical point.
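To see the hypersurface form of the first variation formula at work, here is a small worked example added for illustration. It assumes the round sphere of radius r in Euclidean space with outward unit normal ν, the convention $H = \mathrm{div}_\Sigma \nu$ suggested by (2.6), and a hypothetical choice of deformation, the radial dilation $\Phi_t(x) = (1 + t/r)x$, for which X = ν on Σ (so ϕ = 1 and the tangential part vanishes). The symbol $\omega_{n-1}$, the volume of the unit (n − 1)-sphere, is notation introduced only for this sketch.

```latex
% Sketch: first variation of volume for \Sigma = S^{n-1}_r \subset \mathbb{R}^n under radial dilation.
% Assumptions: H = (n-1)/r, \phi = 1, \partial\Sigma = \emptyset, \omega_{n-1} = vol(S^{n-1}_1).
\begin{align*}
\mu(\Sigma_t) &= \omega_{n-1}\,(r+t)^{n-1}
  \quad\Longrightarrow\quad
  \frac{d}{dt}\Big|_{t=0}\mu(\Sigma_t) = (n-1)\,\omega_{n-1}\, r^{n-2}, \\
\int_\Sigma H\phi\, d\mu_\Sigma &= \frac{n-1}{r}\cdot\omega_{n-1}\, r^{n-1} = (n-1)\,\omega_{n-1}\, r^{n-2}.
\end{align*}
```

The two computations agree, and there is no boundary term because the sphere is closed.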
In order to assess whether we do have a local minimum, we must compute
the second derivative, that is, the second variation of volume, $\frac{d^2}{dt^2}\big|_{t=0}\mu(\Sigma_t)$.
First, note that Proposition 2.10 tells us that for any t,
\[
\frac{\partial}{\partial t} d\mu_t = \Phi_t^*(\mathrm{div}_{\Sigma_t} X_t)\, d\mu_t.
\]
From this we can see that

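\[
\frac{\partial^2}{\partial t^2}\Big|_{t=0} d\mu_t = \left(\frac{\partial}{\partial t}\Big|_{t=0}\Phi_t^*(\mathrm{div}_{\Sigma_t} X_t) + (\mathrm{div}_\Sigma X)^2\right) d\mu_\Sigma. \tag{2.8}
\]
So we focus on
\[
\frac{\partial}{\partial t}\Phi_t^*(\mathrm{div}_{\Sigma_t} X_t) = \Phi_t^*\big(X_t(\mathrm{div}_{\Sigma_t} X_t)\big).
\]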
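Recalling that $e_i(t) := d\Phi_t(e_i)$ and using Exercise 2.4 to expand the right
side, we have
\begin{align*}
X_t(\mathrm{div}_{\Sigma_t} X_t) &= X_t \sum_{i,j=1}^{m} \langle e^i(t), e^j(t)\rangle\langle\nabla_{e_i(t)} X_t, e_j(t)\rangle \\
&= \sum_{i,j=1}^{m} \big(X_t\langle e^i(t), e^j(t)\rangle\big)\langle\nabla_{e_i(t)} X_t, e_j(t)\rangle \tag{2.9} \\
&\quad + \sum_{i,j=1}^{m} \langle e^i(t), e^j(t)\rangle\langle\nabla_{X_t}\nabla_{e_i(t)} X_t, e_j(t)\rangle \tag{2.10} \\
&\quad + \sum_{i,j=1}^{m} \langle e^i(t), e^j(t)\rangle\langle\nabla_{e_i(t)} X_t, \nabla_{X_t} e_j(t)\rangle. \tag{2.11}
\end{align*}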
We will handle each of the three terms above separately. We start with
the first term (2.9). Recall the linear algebra fact that for a matrix-valued
function B(t),
\[
\frac{d}{dt} B(t)^{-1} = -B(t)^{-1} B'(t) B(t)^{-1}.
\]
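For completeness, the linear algebra fact follows from differentiating the identity $B(t)B(t)^{-1} = I$; the short derivation below is added for convenience and assumes only that B(t) is differentiable and invertible.

```latex
% Derivation of the derivative of a matrix inverse from B(t) B(t)^{-1} = I.
\begin{align*}
0 = \frac{d}{dt}\big(B(t)\,B(t)^{-1}\big)
  &= B'(t)\,B(t)^{-1} + B(t)\,\frac{d}{dt}B(t)^{-1} \\
\Longrightarrow\quad
\frac{d}{dt}B(t)^{-1} &= -B(t)^{-1}\,B'(t)\,B(t)^{-1}.
\end{align*}
```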
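Using this together with equation (2.4), we obtain
\begin{align*}
X_t\langle e^i(t), e^j(t)\rangle\big|_{t=0} &= \frac{\partial}{\partial t}\Big|_{t=0}\Phi_t^*\big(\langle e^i(t), e^j(t)\rangle\big) \\
&= \frac{\partial}{\partial t}\Big|_{t=0} h_t^{ij} \\
&= -\sum_{k,\ell=1}^{m} h^{ik}\left(\frac{\partial}{\partial t}\Big|_{t=0}(h_t)_{k\ell}\right) h^{\ell j} \\
&= -\sum_{k,\ell=1}^{m} h^{ik}\big(\langle\nabla_{e_k} X, e_\ell\rangle + \langle e_k, \nabla_{e_\ell} X\rangle\big) h^{\ell j} \\
&= -\langle\nabla_{e_i} X, e_j\rangle - \langle e_i, \nabla_{e_j} X\rangle.
\end{align*}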
Therefore the total contribution of the first term (2.9) at t = 0 is
\[
-\sum_{i=1}^{m} |(\nabla_{e_i} X)^\top|^2 - \sum_{i,j=1}^{m} \langle e_i, \nabla_{e_j} X\rangle\langle\nabla_{e_i} X, e_j\rangle. \tag{2.12}
\]
Moving on to the second term (2.10),
\[
\langle\nabla_{X_t}\nabla_{e_i(t)} X_t, e_j(t)\rangle = \langle\nabla_{e_i(t)}\nabla_{X_t} X_t, e_j(t)\rangle - \mathrm{Riem}(X_t, e_i(t), X_t, e_j(t)),
\]
where Riem refers to the ambient curvature of (M, g). If we define Z to be
the vector field $\nabla_{X_t} X_t$ at t = 0, then the total contribution of the second
term (2.10) at t = 0 is
\[
\mathrm{div}_\Sigma Z - \sum_{i=1}^{m} \mathrm{Riem}(X, e_i, X, e_i). \tag{2.13}
\]
Moving on to the third term (2.11),
\[
\langle\nabla_{e_i(t)} X_t, \nabla_{X_t} e_j(t)\rangle = \langle\nabla_{e_i(t)} X_t, \nabla_{e_j(t)} X_t\rangle.
\]
Therefore the total contribution of the third term (2.11) at t = 0 is
\[
\sum_{i=1}^{m} |\nabla_{e_i} X|^2. \tag{2.14}
\]
Finally, by combining the terms (2.12), (2.13), and (2.14), and inserting
them into (2.8), we obtain the following.
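Proposition 2.11 (Second variation of the volume measure). Given the
setup described above,
\begin{align*}
\frac{\partial^2}{\partial t^2}\Big|_{t=0} d\mu_t = \Bigg( &\sum_{i=1}^{m} \Big[ |(\nabla_{e_i} X)^\perp|^2 - \mathrm{Riem}(X, e_i, X, e_i) \Big] \\
&- \sum_{i,j=1}^{m} \langle e_i, \nabla_{e_j} X\rangle\langle\nabla_{e_i} X, e_j\rangle + \mathrm{div}_\Sigma Z + (\mathrm{div}_\Sigma X)^2 \Bigg)\, d\mu_\Sigma.
\end{align*}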
This formula is quite general but perhaps not so easy to use. The role of
Z might seem a bit mysterious at first. In computing the second derivative
along the path Φt , the computation will depend on more than just the
“tangent vector” X to this curve at the point Σ. It may be useful to think
about the finite-dimensional analog, where the situation is clearer.
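One way to make the finite-dimensional analog concrete is the following sketch, added for illustration. It assumes a smooth function f on $\mathbb{R}^k$ and a smooth path γ with γ(0) = p and γ′(0) = v; the acceleration γ″(0) plays the role that Z plays in Proposition 2.11.

```latex
% Second derivative of f along a path \gamma in \mathbb{R}^k (chain rule applied twice).
% Assumptions: f : \mathbb{R}^k \to \mathbb{R} smooth, \gamma(0) = p, \gamma'(0) = v.
\begin{align*}
\frac{d}{dt} f(\gamma(t)) &= Df|_{\gamma(t)}\big(\gamma'(t)\big), \\
\frac{d^2}{dt^2}\Big|_{t=0} f(\gamma(t)) &= D^2 f|_p(v, v) + Df|_p\big(\gamma''(0)\big).
\end{align*}
```

The second term depends on the path beyond its tangent vector v, but it drops out when p is a critical point of f. This matches the way the Z terms disappear in the second variation formula for minimal hypersurfaces with fixed boundary.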
We will now specialize to the case of a two-sided hypersurface Σn−1 ⊂
M n with a distinguished unit normal ν and rewrite the formula in terms of
the decomposition
X = X̂ + ϕν
into its tangential and normal components. We can also decompose
Z = Ẑ + ζν.
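For many applications, it is sufficient to consider the case of variations that
are purely normal, so that $\hat X = 0$ and X = ϕν. In this special case, it is not
difficult to deal with each term that appears in Proposition 2.11. First, we
have
\begin{align*}
\sum_{i=1}^{n-1} \Big[ |(\nabla_{e_i} X)^\perp|^2 - \mathrm{Riem}(X, e_i, X, e_i) \Big]
&= \sum_{i=1}^{n-1} \Big[ |(\nabla_{e_i}(\phi\nu))^\perp|^2 - \phi^2\,\mathrm{Riem}(\nu, e_i, \nu, e_i) \Big] \\
&= \sum_{i=1}^{n-1} |\nabla_{e_i}\phi|^2 - \phi^2\,\mathrm{Ric}(\nu, \nu) \\
&= |\nabla\phi|^2 - \phi^2\,\mathrm{Ric}(\nu, \nu),
\end{align*}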
where we used the fact that ∇ei ν is tangential. Next, we have

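\begin{align*}
\sum_{i,j=1}^{n-1} \langle e_i, \nabla_{e_j} X\rangle\langle\nabla_{e_i} X, e_j\rangle
&= \sum_{i,j=1}^{n-1} \langle e_i, \nabla_{e_j}(\phi\nu)\rangle\langle\nabla_{e_i}(\phi\nu), e_j\rangle \\
&= \phi^2 \sum_{i,j=1}^{n-1} \langle e_i, \nabla_{e_j}\nu\rangle\langle\nabla_{e_i}\nu, e_j\rangle \\
&= \phi^2 |A|^2.
\end{align*}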
Finally, by equation (2.6), we have
divΣ Z = Hζ + divΣ Ẑ,
(divΣ X)2 = H 2 ϕ2 .
Putting it all together, we obtain the following.

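Proposition 2.12 (Normal first and second variation for hypersurfaces).
In the two-sided hypersurface case, given a purely normal variation X = ϕν,
we have
\begin{align*}
\frac{\partial}{\partial t}\Big|_{t=0} d\mu_t &= H\phi\, d\mu_\Sigma, \\
\frac{\partial^2}{\partial t^2}\Big|_{t=0} d\mu_t &= \Big( |\nabla\phi|^2 - (\mathrm{Ric}(\nu, \nu) + |A|^2 - H^2)\phi^2 + H\zeta + \mathrm{div}_\Sigma \hat Z \Big)\, d\mu_\Sigma.
\end{align*}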
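The second formula of Proposition 2.12 can be tested on the round sphere of radius r in Euclidean space; the sketch below is added for illustration. It assumes ν outward with $A = r^{-1}h$ and $H = (n-1)/r$, and it reads Z as the acceleration $\partial^2\Phi_t/\partial t^2$ of the paths $t \mapsto \Phi_t(p)$ at t = 0, which is how $\nabla_{X_t}X_t$ enters the computation above. The two flows are hypothetical choices with the same first-order deformation X = ν (so ϕ = 1) but different Z.

```latex
% Check of the second variation of d\mu on S^{n-1}_r \subset \mathbb{R}^n with \phi = 1.
% Common term: -(\mathrm{Ric}(\nu,\nu) + |A|^2 - H^2) = H^2 - |A|^2 = (n-1)(n-2)/r^2.
\begin{align*}
\text{Flow (a): } \Phi_t(x) &= (1 + t/r)\,x, \quad Z = 0, \quad d\mu_t = (1 + t/r)^{n-1} d\mu_\Sigma, \\
\frac{\partial^2}{\partial t^2}\Big|_{t=0} d\mu_t &= \frac{(n-1)(n-2)}{r^2}\, d\mu_\Sigma
  = \big(H^2 - |A|^2\big)\, d\mu_\Sigma ; \\[4pt]
\text{Flow (b): } \Phi_t(x) &= e^{t/r}x, \quad Z = \tfrac{1}{r}\nu \ (\zeta = \tfrac{1}{r},\ \hat Z = 0), \quad d\mu_t = e^{(n-1)t/r} d\mu_\Sigma, \\
\frac{\partial^2}{\partial t^2}\Big|_{t=0} d\mu_t &= \frac{(n-1)^2}{r^2}\, d\mu_\Sigma
  = \Big(\frac{(n-1)(n-2)}{r^2} + H\zeta\Big)\, d\mu_\Sigma .
\end{align*}
```

In both cases the direct computation matches the proposition, and the difference between the two flows is exactly the Hζ term.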
Exercise 2.13 (First variation of mean curvature for hypersurfaces). Using
the above proposition or otherwise, prove that the first variation of the mean
curvature H of a hypersurface is given by the following formula, where X
need not be a normal variation:
\[
DH|_\Sigma(X) := \frac{\partial}{\partial t}\Big|_{t=0} H_t = -\Delta_\Sigma\phi - (|A|^2 + \mathrm{Ric}(\nu, \nu))\phi + \nabla_{\hat X} H,
\]
where $H_t = \Phi_t^*(H_{\Sigma_t})$ and $\Delta_\Sigma$ is the Laplacian on Σ computed using the
induced metric.
Exercise 2.14. Let $\Sigma^{n-1}$ be a two-sided hypersurface of $(M^n, g)$ with n ≥ 3,
and let $\tilde g = u^{\frac{4}{n-2}} g$ be a conformally related metric, where u is some positive
smooth function. Show that
\[
\tilde H = u^{-\frac{2}{n-2}}\left( H + \frac{2(n-1)}{n-2}\, u^{-1}\nabla_\nu u \right),
\]
where H is the mean curvature of Σ in (M, g) computed with respect to the
unit normal ν and $\tilde H$ is the mean curvature of Σ in $(M, \tilde g)$ computed with
respect to its unit normal pointing in the same direction as ν. Hint: Use
Proposition 2.12.
We easily obtain the following.

Theorem 2.15 (Second variation formula for minimal hypersurfaces). Let
$\Sigma^{n-1}$ be a compact two-sided minimal hypersurface of $M^n$, possibly with
boundary, and let $\Sigma_t$ be a smooth family of compact hypersurfaces of M with
$\Sigma_0 = \Sigma$, whose first-order deformation vector field $X = \hat X + \phi\nu$ along $\Sigma^{n-1}$
vanishes at ∂Σ. Then
\[
\frac{d^2}{dt^2}\Big|_{t=0}\mu(\Sigma_t) = \int_\Sigma \Big( |\nabla\phi|^2 - (\mathrm{Ric}(\nu, \nu) + |A|^2)\phi^2 \Big)\, d\mu_\Sigma.
\]
Technically, we have only proved the theorem above if X is a normal

variation, but it is true more generally as long as we have the vanishing con-
dition, and this makes sense intuitively since tangential components should
not contribute as long as they do not move the boundary. (In any case, we
prove a more general statement in Theorem 2.19.)
Definition 2.16. A compact minimal submanifold is called stable if its
second variation of volume is nonnegative for all boundary-preserving de-
formations. In light of Theorem 2.15, for the case of a two-sided minimal
hypersurface Σ, this is equivalent to demanding that for all ϕ ∈ C0∞ (Σ),
that is, smooth functions ϕ on Σ vanishing at ∂Σ, we have

\[
\int_\Sigma \Big( |\nabla\phi|^2 - (\mathrm{Ric}(\nu, \nu) + |A|^2)\phi^2 \Big)\, d\mu_\Sigma \ge 0.
\]
This inequality is called the stability inequality. If the inequality is strict for
all nonzero ϕ, then we say that Σ is a strictly stable minimal hypersurface.
Given a hypersurface Σ in M , we can define the stability operator LΣ for Σ
by
LΣ ϕ := −ΔΣ ϕ − (Ric(ν, ν) + |A|2 )ϕ
for all smooth functions ϕ on Σ. (We use an upright letter L in order
to distinguish this from the conformal Laplacian.) Therefore stability of
a two-sided minimal hypersurface is equivalent to nonnegativity of its sta-
bility operator (with Dirichlet boundary condition), and strict stability is
equivalent to its positivity.
From the above characterization of stability, we easily obtain the follow-

ing observation of James Simons.
Proposition 2.17 (Simons [Sim68]). A Riemannian manifold with positive
Ricci curvature cannot contain any stable two-sided closed minimal hyper-
surfaces.
Proof. Suppose there did exist such a hypersurface. Setting ϕ = 1 in the

stability inequality leads to a contradiction.
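The contradiction can be spelled out in one line; this step is added for the reader's convenience. Since the hypersurface is closed, the constant function ϕ = 1 is an admissible test function, and the only inputs are Ric > 0 and $|A|^2 \ge 0$.

```latex
% Stability inequality with \phi \equiv 1 on a closed hypersurface in a manifold with Ric > 0.
\[
0 \;\le\; \int_\Sigma \Big( |\nabla 1|^2 - (\mathrm{Ric}(\nu,\nu) + |A|^2)\cdot 1^2 \Big)\, d\mu_\Sigma
  \;=\; -\int_\Sigma \big(\mathrm{Ric}(\nu,\nu) + |A|^2\big)\, d\mu_\Sigma \;<\; 0 .
\]
```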
Observe that the stability operator is the same thing as the first variation
of mean curvature (Exercise 2.13) under normal variation. We can also
remove the troublesome Ricci term in the expression using the traced Gauss
equation (Corollary 2.7), which tells us that
\[
\mathrm{Ric}(\nu, \nu) = \frac{1}{2}\big(R_M - R_\Sigma - |A|^2 + H^2\big).
\]
Inserting this into the definition of LΣ (see also Exercise 2.13), we obtain
the useful formula
\[
DH|_\Sigma(\phi\nu) = L_\Sigma\phi = -\Delta_\Sigma\phi + \frac{1}{2}\big(R_\Sigma - R_M - |A|^2 - H^2\big)\phi. \tag{2.15}
\]
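The algebra behind (2.15) is short; it is written out here as a check, using only the traced Gauss equation (Corollary 2.7) and the definition of the stability operator.

```latex
% Substituting Ric(\nu,\nu) = (R_M - R_\Sigma - |A|^2 + H^2)/2 into the zeroth-order term of L_\Sigma.
\begin{align*}
-\big(\mathrm{Ric}(\nu,\nu) + |A|^2\big)
  &= -\tfrac{1}{2}\big(R_M - R_\Sigma - |A|^2 + H^2\big) - |A|^2 \\
  &= \tfrac{1}{2}\big(R_\Sigma - R_M + |A|^2 - H^2 - 2|A|^2\big) \\
  &= \tfrac{1}{2}\big(R_\Sigma - R_M - |A|^2 - H^2\big).
\end{align*}
```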
When Σ is minimal, H vanishes, and we see that the stability inequality for
a minimal hypersurface can be restated as

\[
\int_\Sigma \Big( |\nabla\phi|^2 + \frac{1}{2}\big(R_\Sigma - R_M - |A|^2\big)\phi^2 \Big)\, d\mu_\Sigma \ge 0. \tag{2.16}
\]
We can now start to see the connection between minimal surfaces and
scalar curvature. The following observation was first made by R. Schoen
and S.-T. Yau.
Proposition 2.18 (Schoen-Yau [SY79b]). If (M, g) is a 3-manifold with
positive scalar curvature, then every stable, two-sided closed minimal surface
Σ in M must be a sphere or a projective plane. If M is orientable, then Σ
must be a sphere.
Proof. Suppose Σ is a stable, two-sided compact minimal surface in a
3-manifold (M, g) with positive scalar curvature. If we use the test function
ϕ = 1, then the stability inequality (2.16) tells us that
\[
\int_\Sigma \big(R_\Sigma - R_M - |A|^2\big)\, d\mu_\Sigma \ge 0.
\]
By assumption, this tells us that
\[
\int_\Sigma R_\Sigma\, d\mu_\Sigma > 0.
\]
The Gauss-Bonnet Theorem and the classification of surfaces tell us that Σ

must be a sphere or a projective plane. If M is orientable, then so is Σ, in
which case it must be a sphere.
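The last step of the proof can be unpacked as follows; this sketch is added for convenience and assumes Σ is connected. It uses the standard facts that the scalar curvature of a surface is twice its Gauss curvature K, and that the Euler characteristic of a closed connected surface is 2 − 2g in the orientable case (genus g) and 2 − k in the nonorientable case (connected sum of k projective planes, k ≥ 1).

```latex
% Gauss-Bonnet applied to the inequality from the stability argument.
\begin{align*}
0 \;<\; \int_\Sigma R_\Sigma\, d\mu_\Sigma \;=\; 2\int_\Sigma K\, d\mu_\Sigma \;=\; 4\pi\,\chi(\Sigma)
  \quad&\Longrightarrow\quad \chi(\Sigma) > 0, \\
\chi(\Sigma) = 2 - 2g > 0 \quad&\Longrightarrow\quad g = 0 \quad (\Sigma \cong S^2), \\
\chi(\Sigma) = 2 - k > 0 \quad&\Longrightarrow\quad k = 1 \quad (\Sigma \cong \mathbb{RP}^2).
\end{align*}
```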
We close this section with the general second variation formula, in which
we do not assume that X is normal or vanishes at the boundary, nor that
Σ is minimal.
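Theorem 2.19 (General second variation of volume of hypersurfaces). Let
$\Sigma^{n-1}$ be a compact two-sided hypersurface of $M^n$, possibly with boundary,
and let $\Sigma_t$ be a smooth family of compact hypersurfaces of M with $\Sigma_0 = \Sigma$.
Let $X_t$ be its deformation vector field defined along $\Sigma_t$, where $X = \hat X + \phi\nu$
is the vector field at t = 0. Let $Z = \hat Z + \zeta\nu$ be the vector field $\nabla_X X_t$. Then
\begin{align*}
\frac{d^2}{dt^2}\Big|_{t=0}\mu(\Sigma_t) &= \int_\Sigma \Big[ |\nabla\phi|^2 - (\mathrm{Ric}(\nu, \nu) + |A|^2 - H^2)\phi^2 \\
&\qquad\quad + H\big(\zeta - 2\nabla_{\hat X}\phi + A(\hat X, \hat X)\big) \Big]\, d\mu_\Sigma \\
&\quad + \int_{\partial\Sigma} \big\langle 2H\phi\hat X - 2\phi S(\hat X) + (\mathrm{div}_\Sigma \hat X)\hat X - \hat\nabla_{\hat X}\hat X + \hat Z, \eta\big\rangle\, d\mu_{\partial\Sigma}.
\end{align*}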
Proof. The proof is rather involved, though the individual steps are elemen-
tary. The reader may wish to skip this proof. Throughout the computation,
we assume that the ei are parallel at our point of interest. We begin with
Proposition 2.11 and we split those terms into four parts:
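\begin{align*}
&\sum_{i=1}^{m} \Big[ |(\nabla_{e_i} X)^\perp|^2 - \mathrm{Riem}(X, e_i, X, e_i) \Big]
 - \sum_{i,j=1}^{m} \langle e_i, \nabla_{e_j} X\rangle\langle\nabla_{e_i} X, e_j\rangle + \mathrm{div}_\Sigma Z + (\mathrm{div}_\Sigma X)^2 \\
&\qquad = C_Z + C_{\mathrm{perp}} + 2C_{\mathrm{cross}} + C_{\mathrm{tan}}.
\end{align*}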
Here the terms on the right side are defined as follows. CZ = divΣ Z is just
the contribution from Z, which easily leads to the desired ζ and Ẑ terms
via (2.6). Other than that term, the rest of the expression is quadratic in
X. Since X = X̂ + ϕν, we can decompose the various X terms into normal-
normal terms Cperp , tangent-tangent terms Ctan , and cross-terms Ccross . In
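Proposition 2.12, we already computed $C_{\mathrm{perp}}$. We turn our attention to
\begin{align*}
C_{\mathrm{cross}} &= \sum_{i=1}^{n-1} \Big[ \langle(\nabla_{e_i}\hat X)^\perp, (\nabla_{e_i}(\phi\nu))^\perp\rangle - \mathrm{Riem}(\hat X, e_i, \phi\nu, e_i) \Big] \\
&\quad - \sum_{i,j=1}^{n-1} \langle e_i, \nabla_{e_j}\hat X\rangle\langle\nabla_{e_i}(\phi\nu), e_j\rangle + (\mathrm{div}_\Sigma \hat X)(\mathrm{div}_\Sigma \phi\nu) \\
&= \sum_{i=1}^{n-1} \Big[ \langle A(e_i, \hat X), (\nabla_{e_i}\phi)\nu\rangle - \phi\,\mathrm{Riem}(\hat X, e_i, \nu, e_i) \Big] \tag{2.17} \\
&\quad - \sum_{i,j=1}^{n-1} \langle e_i, \nabla_{e_j}\hat X\rangle\langle\phi S(e_i), e_j\rangle + H\phi\,\mathrm{div}_\Sigma \hat X \\
&= -A(\nabla\phi, \hat X) - \sum_{i=1}^{n-1} \Big[ \phi\,\mathrm{Riem}(\hat X, e_i, \nu, e_i) + \phi A(\hat\nabla_{e_i}\hat X, e_i) \Big] + H\phi\,\mathrm{div}_\Sigma \hat X,
\end{align*}
where $\hat\nabla$ is the induced connection of Σ. The idea is to reorganize these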

terms into divergences. One might reasonably guess what those divergence
terms might be, and with the right choices the curvature term will vanish.
Claim.
(2.18) Ccross = divΣ (HϕX̂ − ϕS(X̂)) − H∇X̂ ϕ.
The first and third terms in the claim give us

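\begin{align*}
\mathrm{div}_\Sigma(H\phi\hat X) - H\nabla_{\hat X}\phi &= \langle(\nabla H)\phi, \hat X\rangle + \langle H\nabla\phi, \hat X\rangle + H\phi\,\mathrm{div}_\Sigma \hat X - H\nabla_{\hat X}\phi \\
&= \phi\nabla_{\hat X} H + H\phi\,\mathrm{div}_\Sigma \hat X \\
&= \phi\nabla_{\hat X} \sum_{i=1}^{n-1} \langle\nabla_{e_i}\nu, e_i\rangle + H\phi\,\mathrm{div}_\Sigma \hat X \\
&= \phi \sum_{i=1}^{n-1} \langle\nabla_{\hat X}\nabla_{e_i}\nu, e_i\rangle + H\phi\,\mathrm{div}_\Sigma \hat X.
\end{align*}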
Meanwhile, the second term in the claim is

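\begin{align*}
-\mathrm{div}_\Sigma(\phi S(\hat X)) &= -\langle\nabla\phi, S(\hat X)\rangle - \phi\,\mathrm{div}_\Sigma(S(\hat X)) \\
&= -A(\nabla\phi, \hat X) - \phi \sum_{i=1}^{n-1} \langle\nabla_{e_i}\nabla_{\hat X}\nu, e_i\rangle.
\end{align*}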
Putting the last two computations together, we get
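\begin{align*}
&\mathrm{div}_\Sigma(H\phi\hat X - \phi S(\hat X)) - H\nabla_{\hat X}\phi \\
&\quad = H\phi\,\mathrm{div}_\Sigma \hat X - A(\nabla\phi, \hat X) + \phi \sum_{i=1}^{n-1} \langle\nabla_{\hat X}\nabla_{e_i}\nu - \nabla_{e_i}\nabla_{\hat X}\nu, e_i\rangle \\
&\quad = H\phi\,\mathrm{div}_\Sigma \hat X - A(\nabla\phi, \hat X) + \phi \sum_{i=1}^{n-1} \Big[ -\mathrm{Riem}(\hat X, e_i, \nu, e_i) + \langle\nabla_{[\hat X, e_i]}\nu, e_i\rangle \Big] \\
&\quad = H\phi\,\mathrm{div}_\Sigma \hat X - A(\nabla\phi, \hat X) + \phi \sum_{i=1}^{n-1} \Big[ -\mathrm{Riem}(\hat X, e_i, \nu, e_i) + \langle\nabla_{-\hat\nabla_{e_i}\hat X}\nu, e_i\rangle \Big] \\
&\quad = H\phi\,\mathrm{div}_\Sigma \hat X - A(\nabla\phi, \hat X) + \phi \sum_{i=1}^{n-1} \Big[ -\mathrm{Riem}(\hat X, e_i, \nu, e_i) - A(\hat\nabla_{e_i}\hat X, e_i) \Big],
\end{align*}
verifying the claim. Next we consider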

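\begin{align*}
C_{\mathrm{tan}} &= \sum_{i=1}^{n-1} \Big[ |(\nabla_{e_i}\hat X)^\perp|^2 - \mathrm{Riem}(\hat X, e_i, \hat X, e_i) \Big]
 - \sum_{i,j=1}^{n-1} \langle e_i, \nabla_{e_j}\hat X\rangle\langle\nabla_{e_i}\hat X, e_j\rangle + (\mathrm{div}_\Sigma \hat X)^2 \\
&= \sum_{i=1}^{n-1} \Big[ A(e_i, \hat X)^2 - \mathrm{Riem}(\hat X, e_i, \hat X, e_i) - \langle e_i, \hat\nabla_{\hat\nabla_{e_i}\hat X}\hat X\rangle \Big] + (\mathrm{div}_\Sigma \hat X)^2.
\end{align*}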
Claim.
\[
C_{\mathrm{tan}} = \mathrm{div}_\Sigma\big[(\mathrm{div}_\Sigma \hat X)\hat X - \hat\nabla_{\hat X}\hat X\big] + HA(\hat X, \hat X).
\]
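The first term in the claim is
\begin{align*}
\mathrm{div}_\Sigma\big[(\mathrm{div}_\Sigma \hat X)\hat X\big] &= \langle\nabla(\mathrm{div}_\Sigma \hat X), \hat X\rangle + (\mathrm{div}_\Sigma \hat X)^2 \\
&= \nabla_{\hat X} \sum_{i=1}^{n-1} \langle\hat\nabla_{e_i}\hat X, e_i\rangle + (\mathrm{div}_\Sigma \hat X)^2 \\
&= \sum_{i=1}^{n-1} \langle\nabla_{\hat X}\hat\nabla_{e_i}\hat X, e_i\rangle + (\mathrm{div}_\Sigma \hat X)^2.
\end{align*}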
The second term in the claim is
\[
\mathrm{div}_\Sigma(-\hat\nabla_{\hat X}\hat X) = -\sum_{i=1}^{n-1} \langle\hat\nabla_{e_i}(\hat\nabla_{\hat X}\hat X), e_i\rangle.
\]
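Using the Gauss equation (Theorem 2.5), we can see that the sum of those
two terms gives us
\begin{align*}
&\mathrm{div}_\Sigma\big((\mathrm{div}_\Sigma \hat X)\hat X - \hat\nabla_{\hat X}\hat X\big) \\
&\quad = \sum_{i=1}^{n-1} \Big[ -\mathrm{Riem}_\Sigma(\hat X, e_i, \hat X, e_i) + \langle\hat\nabla_{[\hat X, e_i]}\hat X, e_i\rangle \Big] + (\mathrm{div}_\Sigma \hat X)^2 \\
&\quad = \sum_{i=1}^{n-1} \Big[ -\mathrm{Riem}(\hat X, e_i, \hat X, e_i) + A(\hat X, e_i)^2 - A(\hat X, \hat X)A(e_i, e_i) - \langle\hat\nabla_{\hat\nabla_{e_i}\hat X}\hat X, e_i\rangle \Big] + (\mathrm{div}_\Sigma \hat X)^2 \\
&\quad = C_{\mathrm{tan}} - HA(\hat X, \hat X),
\end{align*}
verifying the claim. Finally, if we combine the two claims with our calculation
in Proposition 2.12 and the divergence theorem, the result follows.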

2.3. Minimizing hypersurfaces and positive scalar curvature

2.3.1. Three-dimensional results. In order to get some mileage out of
Proposition 2.18, one needs to be able to find stable minimal surfaces. We
will not prove the existence theorems stated in this section, because their
proofs would take us too far from our main focus. Instead, we will just offer
a taste of the ideas used in their proofs, as well as offer references for further
study. We will also avoid defining concepts that are not used much in the
rest of the book.
Historically, this line of inquiry began with the classical Plateau prob-
lem: given a simple closed curve γ in R3 , does there exist an immersed
minimal disk whose boundary is γ? In a seminal breakthrough in the birth
of geometric analysis, this problem was solved, independently, by Jesse Dou-
glas [Dou31] and Tibor Radó [Rad30]. Notably, Douglas was awarded an
inaugural Fields Medal for this work in 1936. See [GM08] for discussion of
these discoveries.
The most naive approach to solving this problem is the so-called di-
rect method, a generalization of Dirichlet’s principle. Start with a sequence
of disks with the given boundary γ, whose areas approach the infimum of
all possible areas (we call this a minimizing sequence), and then hope to
extract a subsequential limit. In this approach, there is an obvious compli-
cation arising from the diffeomorphism invariance of area. One can see this
same problem arise when trying to prove the existence of length-minimizing
curves in a Riemannian manifold. In a typical Riemannian geometry text-
book, we learn that one way to get around this problem is to observe that
a length-minimizing curve that is parameterized by arclength minimizes
energy in addition to minimizing length. Conversely, an energy-minimizing
map will also minimize length, while simultaneously being parameterized
by arclength. The advantage is that since energy depends on parameteriza-
tion, it breaks the diffeomorphism invariance and can therefore be minimized
more directly.
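The relationship between length and energy for curves can be summarized by the Cauchy-Schwarz inequality; the sketch below is added for illustration. The normalization of the energy with a factor of 1/2 and the parameter interval [a, b] are assumptions of this sketch, and $\gamma_0$ denotes a length-minimizing curve of constant speed on [a, b].

```latex
% Length vs. energy for a curve \gamma : [a,b] \to M (assumed normalization E = (1/2)\int |\gamma'|^2).
\begin{align*}
L(\gamma) = \int_a^b |\gamma'(t)|\, dt, &\qquad E(\gamma) = \frac{1}{2}\int_a^b |\gamma'(t)|^2\, dt, \\
L(\gamma)^2 \le (b-a)\int_a^b |\gamma'(t)|^2\, dt &= 2(b-a)\,E(\gamma),
  \quad\text{with equality iff } |\gamma'| \text{ is constant}, \\
2(b-a)\,E(\gamma_0) = L(\gamma_0)^2 \le L(\gamma)^2 &\le 2(b-a)\,E(\gamma)
  \quad\text{for every competitor } \gamma \text{ on } [a,b].
\end{align*}
```

The last line shows that a constant-speed length minimizer also minimizes energy, and equality throughout forces any energy minimizer to have constant speed and minimal length.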
In two dimensions, there is no such thing as arclength parameterization,
but perhaps the next best thing is the use of so-called isothermal coordinates,
which is just a conformal parameterization. In modern language, one can
show that the image of a conformal map from a surface into a Riemannian
manifold is minimal if and only if that map is harmonic, where harmonic
means that the map is a critical point of the energy functional for maps.
Although Douglas and Radó did not quite employ the direct method,
the importance of isothermal coordinates was already understood at the
time. Later, R. Courant was able to solve the Plateau problem via the
direct method [Cou37], and C. B. Morrey [Mor48] was able to extend
this work to the Plateau problem in Riemannian manifolds. Following up
on these ideas, Jonathan Sacks and Karen Uhlenbeck were able to prove
existence theorems for minimal spheres in Riemannian manifolds [SU81].
For higher genus surfaces, we have the following theorem, which was proved,
independently, by Schoen and Yau, and by Sacks and Uhlenbeck.
Theorem 2.20 (Schoen-Yau [SY79b], Sacks-Uhlenbeck [SU82]). Let
$(M^3, g)$ be a compact Riemannian manifold. (1) If $\pi_1(M, *)$ contains a
noncyclic abelian subgroup, then there exists a smooth minimal embedding of
the 2-torus $\varphi : T^2 \to M$ that minimizes area among all other maps from
the torus that induce maps of the fundamental groups that are conjugate to
that of φ. (2) If $\pi_1(M, *)$ contains a subgroup isomorphic to $\pi_1(\Sigma, *)$ for some
orientable surface Σ with genus greater than 1, then there exists a smooth
minimal embedding φ : Σ −→ M that minimizes area among all other maps
from Σ that induce maps of the fundamental groups that are conjugate to
that of φ.
One difficulty in these theorems, compared to, for example, the much
older theorem of C. B. Morrey, is that one must contend with the conformal
geometry of closed Riemann surfaces rather than disks. Indeed, the case of
minimal spheres, treated in [SU81], is particularly interesting.
Since the minimal surfaces constructed by Theorem 2.20 are certainly
stable, if we combine this theorem with Proposition 2.18, we immediately
obtain Schoen and Yau’s main theorem of [SY79b], which gave the first
topological restriction on positive scalar curvature that was not proved using
spinors. (We will discuss the spinor technique in Chapter 5.)
Corollary 2.21 (Schoen-Yau [SY79b]). Let M be an orientable compact
3-manifold. If either (1) $\pi_1(M, *)$ contains a noncyclic abelian subgroup, or
(2) $\pi_1(M, *)$ contains a subgroup isomorphic to $\pi_1(\Sigma, *)$ for some surface Σ
with genus greater than 1, then M cannot carry a metric with positive scalar
curvature.
One can see that the basic idea behind Corollary 2.21 is actually quite
simple. It is very similar to the reasoning used to prove Synge’s Theorem
[Wik, Synge’s theorem]. The main difficulty lies in Theorem 2.20, rig-
orously establishing the existence of minimal surfaces that one intuitively
hopes should exist. The topological restrictions imposed by Corollary 2.21
are strong enough so that, when combined with current understanding of
the classification of 3-manifolds, they are sufficient to prove Theorem 1.29.
We omit the proof, which is a purely topological argument. (One can deal
with the nonorientable case by passing to the double cover.)
2.3.2. Dimensions less than or equal to 8. The general technique of

using harmonic maps from surfaces into a manifold, described above, cannot
be generalized to find minimal hypersurfaces in higher dimensions. Instead
of trying to break the diffeomorphism invariance as discussed above, another
approach is to use a formalism that does not rely on parameterization at all.
This approach falls under the general umbrella of geometric measure theory.
The main theorem of relevance to us is the following.
Theorem 2.22 (Existence and regularity of minimizing hypersurfaces). Let
$(M^n, g)$ be a compact Riemannian manifold with n < 8. For each nonzero
homology class $\alpha \in H_{n-1}(M, \mathbb{Z})$, there exists an integral sum of smooth
oriented minimal hypersurfaces Σ ∈ α that minimizes volume among all
smooth cycles in α.
The “integral sum” here means that Σ may be a disjoint union of smooth
minimal hypersurfaces, each of which may have “integer multiplicity.” One
way to rephrase the above theorem is the following. For each nonzero α ∈
$H_{n-1}(M, \mathbb{Z})$, we can find an (n − 1)-dimensional compact oriented manifold
Σ, not necessarily connected, and a map f : Σ −→ M whose induced map on
homology gives f∗ ([Σ]) = α, such that this pair (Σ, f ) has minimum volume
among all possible pairs satisfying the above, and f is a smooth embedding
on each component of Σ such that the maps from the different components
are either disjoint or exactly the same.
A proof of Theorem 2.22, together with all of the necessary background,
would require an entire book of its own, but we will attempt to tell the
highly abbreviated story of this theorem. A much better telling of this story
can be found in the accessible survey paper of C. De Lellis [DL16]. For a
full proof of Theorem 2.22, see Leon Simon’s book [Sim83]. To learn about
geometric measure theory, see the book [LY02], or [Mor16] for a lighter
introduction.
The approach to Theorem 2.22 is to use the so-called direct method.
Start with a minimizing sequence of smooth cycles in α, and try to extract
a subsequential limit. This is only possible if one chooses a sufficiently weak
topology and looks for a limit in the completion with respect to that topol-
ogy. We can view our smooth cycles as currents in the sense of G. de Rham
[Wik, Current (mathematics)], that is, as objects that are dual to smooth
differential (n − 1)-forms on M via integration. The topology one chooses is
the weak topology dual to the smooth topology of (n − 1)-forms. Or equiva-
lently in this context, we can use the flat topology [Wik, Flat convergence].
Once we are working in the completion of the space of smooth cycles in
this topology, it is trivial to extract a subsequential limit. This comple-
tion consists of (n − 1)-dimensional integral currents in M . These abstract
objects generalize oriented hypersurfaces but still have a good deal of struc-
ture. This formalism of integral currents was pioneered by Herbert Federer
and Wendell Fleming [FF60]. (A more set-theoretic approach led to early
results of E. R. Reifenberg [Rei60].)
After verifying that this limit indeed minimizes (a suitable generaliza-
tion of) volume, the remaining task is to prove that the limit object that
one obtains is actually a smooth hypersurface. A good analogy (for those
familiar with the topic) is using the direct method to solve the Dirichlet
problem for Laplace’s equation: in that case, we are looking for a function
that minimizes energy, subject to the Dirichlet boundary constraint. If we
choose an energy-minimizing sequence of functions, we cannot expect it to
converge in, say, the C 2 topology, but the sequence will be bounded in the
Sobolev space W 1,2 , and hence we can extract a subsequence converging
weakly in W 1,2 . This procedure allows us to find an energy-minimizer in
W 1,2 , simply by virtue of the formalism. However, in the end we must prove
that this energy-minimizer is actually C 2 and hence a classical solution of
Laplace’s equation.
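The analogy can be written schematically as follows; this is a sketch added for illustration. The bounded domain Ω, the boundary data g, and the normalization of the energy are assumptions introduced here, and the boundedness of the minimizing sequence uses the Poincaré inequality applied to $u_k - g$.

```latex
% Direct method for the Dirichlet problem (schematic).
% Assumptions: \Omega bounded domain, g \in W^{1,2}(\Omega) prescribes the boundary values.
\begin{align*}
E(u) &= \frac{1}{2}\int_\Omega |\nabla u|^2\, dx, \qquad
  \mathcal{A} = \{\, u \in W^{1,2}(\Omega) : u - g \in W^{1,2}_0(\Omega) \,\}, \\
E(u_k) &\to \inf_{\mathcal{A}} E
  \quad\Longrightarrow\quad \sup_k \|u_k\|_{W^{1,2}} < \infty
  \quad\Longrightarrow\quad u_{k_j} \rightharpoonup u \text{ weakly in } W^{1,2},\ u \in \mathcal{A}, \\
E(u) &\le \liminf_{j\to\infty} E(u_{k_j}) = \inf_{\mathcal{A}} E
  \qquad\text{(weak lower semicontinuity of } E\text{)}.
\end{align*}
```

Existence of the minimizer u is thus nearly automatic; the substantive work is showing that u is $C^2$, which parallels the regularity question for integral currents.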
Exercise 2.23. Given two points p, q in a complete Riemannian manifold
(M, g), prove the well-known fact that there exists a minimizing geodesic
between them using the direct method as follows. Consider an energy-
minimizing sequence of smooth paths from p to q, extract a subsequential
limit in W 1,2 , and then prove that this limit is indeed energy-minimizing
and smooth.
Essentially, once the formalism of integral currents is in place, the ques-

tion of existence of minimizers becomes very easy, thereby placing all of
the difficulty in the question of regularity. All of the formalism we have
described so far (and hence, the existence theory) works just as well in
higher codimension. However, the regularity theory is far more complicated
in higher codimension, where the analog of Theorem 2.22 is F. Almgren’s
“big regularity theorem” [Alm00], so called because of its difficulty and its
nearly thousand page length. Recently, this work has been streamlined and
simplified by De Lellis and E. Spadaro [DLS14]. See [DL16] for a survey
of this work.
Returning to the codimension one case, W. Fleming first established
Theorem 2.22 for n = 3 [Fle62]. A major breakthrough came from Ennio
De Giorgi [DG61], who first understood how regularity of the tangent cone
at any point could be used to prove local regularity near that point. (The
paper was about sets of finite perimeter, but the insights carry over to the
setting of integral currents.) The tangent cone of an object at a point is
the result of blowing up the object at that point. A tangent cone being
regular just means that it is a hyperplane (which is, of course, the tangent
space of any smooth hypersurface). Consequently, if one can show that the
hyperplane is the only possible minimizing codimension one tangent cone,
then De Giorgi’s argument should imply regularity. Almgren was able to
carry out this argument for n = 4 [Alm66], and later J. Simons was able
to show that all minimizing codimension one tangent cones are hyperplanes
for n < 8 [Sim68], leading to Theorem 2.22.
It turns out that when n ≥ 8, it is indeed possible that a codimension one
minimizing integral current is not smooth, as demonstrated by E. Bombieri,
E. De Giorgi, and E. Giusti [BDGG69], who proved that the Simons cone
[Sim68] is a nontrivial minimizing cone. However, it was shown by H. Fed-
erer that the “singular set” of a codimension one minimizing integral current
has Hausdorff dimension less than or equal to n − 8 [Fed70]. Or in other
words, the minimizing integral current is a smooth hypersurface away from
a singular set of codimension at least 7 inside of it. L. Simon was able
to prove more about the structure of the singular set [Sim95]. Although
these are fairly strong results, the singularities still cause problems for many
geometric arguments.
When n = 8, Federer showed that the singular points are isolated
[Fed70], and then Nathan Smale was able to perturb these isolated sin-
gularities away.
Theorem 2.24 (Smale [Sma93]). Let M be an eight-dimensional compact
manifold. For each nonzero homology class $\alpha \in H_{n-1}(M, \mathbb{Z})$, there is a
dense open set of metrics (in any $C^k$ topology) for which α can be represented
by an integral sum of smooth oriented minimal hypersurfaces Σ ∈ α that
minimizes volume among all smooth cycles in α.
Theorem 2.22 is quite powerful and elegant, but one downside (especially
in contrast to Theorem 2.20, for example) is that it does not give you direct
control over the topology of the minimizing hypersurface. However, sometimes
one can still get around this. For example, we can use Theorem 2.22 in place
of Theorem 2.20 to see why the 3-torus $T^3$ cannot carry a metric of positive
scalar curvature: by topological reasoning, one can find a class α in $H_2(T^3, \mathbb{Z})$ that
cannot be represented by a sum of 2-spheres. (We will explain this argu-
ment in more detail below.) Then the area-minimizer in α, whose existence
is guaranteed by Theorem 2.22, must have at least one component that is
not a sphere. But this contradicts Proposition 2.18.
We now turn our attention to proving Theorem 1.30, which implies that
a torus does not admit a metric of positive scalar curvature. We first prove
a higher-dimensional generalization of Proposition 2.18.
Proposition 2.25 (Schoen-Yau). Let (M n , g) be a Riemannian manifold
with positive scalar curvature. Then every stable, closed two-sided minimal
hypersurface of M carries a metric of positive scalar curvature.
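Proof. The n = 3 case is Proposition 2.18, and the n = 2 case is even
simpler. (Exercise.) So we may assume n > 3. Let $(M^n, g)$ be a compact
Riemannian manifold with positive scalar curvature, and suppose that $\Sigma^{n-1}$
is a stable, closed two-sided minimal hypersurface of M. Let $L_h$ be the
conformal Laplacian (1.7) of the induced metric h on Σ. The crux of the
argument is that if $R_M > 0$, then the stability inequality implies positivity
of the conformal Laplacian. For any smooth ϕ on Σ that is not identically zero,
\[
\begin{aligned}
\langle L_h \phi, \phi \rangle_{L^2(\Sigma)}
&= \int_\Sigma \left( -\frac{4(n-2)}{n-3} \Delta_\Sigma \phi + R_\Sigma \phi \right) \phi \, d\mu_\Sigma \\
&= \int_\Sigma \left( \frac{4(n-2)}{n-3} |\nabla \phi|^2 + R_\Sigma \phi^2 \right) d\mu_\Sigma \\
&= 2 \int_\Sigma \left( \frac{2(n-2)}{n-3} |\nabla \phi|^2 + \frac{1}{2} R_\Sigma \phi^2 \right) d\mu_\Sigma \\
&\ge 2 \int_\Sigma \left( |\nabla \phi|^2 + \frac{1}{2} R_\Sigma \phi^2 \right) d\mu_\Sigma \\
&> 2 \int_\Sigma \left( |\nabla \phi|^2 + \frac{1}{2} (R_\Sigma - R_M - |A|^2) \phi^2 \right) d\mu_\Sigma \\
&\ge 0,
\end{aligned}
\]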
where we used RM > 0 and the stability inequality (2.16) in the last two
lines.
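The individual steps of the chain above can be unpacked as follows. This is only a sketch of the bookkeeping; it assumes that the stability inequality (2.16), which is not restated in this section, has the form appearing in the last integrand of the chain.

```latex
% Step "=" to ">=": the gradient coefficient is at least 1 when n > 3.
\[
\frac{2(n-2)}{n-3} = 2 + \frac{2}{n-3} > 1 \qquad (n > 3).
\]
% Step ">=" to ">": R_M > 0 and |A|^2 >= 0 give, wherever \phi \neq 0,
\[
R_\Sigma \phi^2 > \left( R_\Sigma - R_M - |A|^2 \right) \phi^2 .
\]
% Final ">= 0": the assumed form of the stability inequality (2.16),
\[
\int_\Sigma \left( |\nabla \phi|^2 + \tfrac{1}{2} \left( R_\Sigma - R_M - |A|^2 \right) \phi^2 \right) d\mu_\Sigma \ge 0 .
\]
```

The strict inequality is the only place where positivity of the ambient scalar curvature enters, which is why the test function must not vanish identically.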
It is a standard PDE fact (see Theorem A.10) that Lh has a principal
eigenfunction ϕ1 and principal eigenvalue λ1 , meaning that ϕ1 is a smooth
positive function and λ1 is a constant such that Lh ϕ1 = λ1 ϕ1 . By the
computation above, if we use ϕ1 as our test function ϕ, we see that λ1 > 0.
Using this ϕ1 as our conformal change for $(\Sigma^{n-1}, h)$, we define $\tilde{h} = \phi_1^{4/(n-3)} h$.
Then by equation (1.8), we obtain
\[
R_{\tilde{h}} = \phi_1^{-\frac{n+1}{n-3}} L_h \phi_1 = \phi_1^{-\frac{n+1}{n-3}} \lambda_1 \phi_1 > 0,
\]
completing the proof.
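The exponents in the last display come from applying the conformal change formula on the hypersurface, whose dimension is n − 1 rather than n. The sketch below assumes that (1.7) and (1.8), which are not restated in this section, have the standard form shown on an m-dimensional manifold; the letter m is introduced here only for this check.

```latex
% Assumed general forms of (1.7) and (1.8) on an m-dimensional manifold (m >= 3):
\[
L_g u = -\frac{4(m-1)}{m-2} \Delta_g u + R_g u, \qquad
R_{u^{4/(m-2)} g} = u^{-\frac{m+2}{m-2}} L_g u .
\]
% Substituting m = n - 1 (the dimension of \Sigma):
\[
\frac{4(m-1)}{m-2} = \frac{4(n-2)}{n-3}, \qquad
\frac{4}{m-2} = \frac{4}{n-3}, \qquad
\frac{m+2}{m-2} = \frac{n+1}{n-3}.
\]
```

With m = n the same formulas give the exponents 4/(n − 2) and (n + 2)/(n − 2) that appear in the proof of Theorem 1.23.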
Exercise 2.26. Use the argument above to prove that if (M, g) has non-
negative scalar curvature and contains a compact hypersurface Σ, then
\[
\lambda_1(L_\Sigma) \le \frac{1}{2} \lambda_1(L_h),
\]
where λ1 (LΣ ) is the principal eigenvalue of the stability operator (2.15) and
λ1 (Lh ) is the principal eigenvalue of the conformal Laplacian of the induced
metric h = g|Σ . Moreover, if g has strictly positive scalar curvature, then
the inequality is strict. Hint: Use the Rayleigh quotient characterization of
the principal eigenvalue, as explained in the proof of Theorem A.10.
The last part of the proof of Proposition 2.25 can be conveniently re-
stated as follows.
Corollary 2.27. Let (Σ, h) be a Riemannian manifold, and let Lh denote
its conformal Laplacian. If the principal eigenvalue of Lh is positive (i.e., Lh
is a strictly positive operator), then h is conformal to a metric with positive
scalar curvature.
In view of the preceding exercise and corollary, the proof of Proposition 2.25
can be summarized as follows. Stability and positive scalar curvature
imply that $0 \le \lambda_1(L_\Sigma) < \frac{1}{2} \lambda_1(L_h)$, which implies that Σ is Yamabe
positive.
Using Proposition 2.25 together with Theorem 2.22, one can inductively
derive various topological obstructions to scalar curvature in dimensions
up to 7, as Schoen and Yau did in [SY79d]. See [Ros07] for a general
topological formulation of what Schoen and Yau’s inductive argument yields.
One particularly interesting case for us (as stated in [SY17]) is the following.
Theorem 2.28. Let $M^n$ be a compact orientable manifold, and suppose
that there exist classes $\omega_1, \ldots, \omega_n \in H^1(M, \mathbb{Z})$ such that their cup product
$\omega_1 \cup \cdots \cup \omega_n \in H^n(M, \mathbb{Z})$ is nonzero. Then M cannot carry a metric of
positive scalar curvature.
By Poincaré duality [Wik, Poincare duality], this could instead be
phrased in terms of intersections of homology classes, which is probably
a more natural point of view for the intuition behind applying Proposition
2.25 inductively. We present the full proof for n ≤ 8, and we will briefly discuss some
of the ideas used in the n > 8 case in the following subsection.
Proof of the n ≤ 8 case. The theorem is trivial for n = 1 and is a consequence
of the Gauss-Bonnet Theorem for n = 2. We will prove the theorem
by induction, assuming that it holds in dimension n − 1 and using that to
prove that it holds in dimension n. (We take n = 2 as the base of our
induction.) For the induction step, we allow M to be disconnected.
Suppose that $M^n$ satisfies the hypotheses of the theorem, and that M
does carry a metric of positive scalar curvature. Let $\alpha = [M] \cap \omega_1 \in H_{n-1}(M, \mathbb{Z})$
be the Poincaré dual of $\omega_1$. By Theorem 2.22, there exists
a (possibly disconnected) compact oriented manifold $\Sigma^{n-1}$ and a map
$f : \Sigma \to M$ such that $f_*([\Sigma]) = \alpha$ and f is a stable minimal embedding of
each component of Σ. (When n = 8, we instead apply Theorem 2.24, which
requires us to first perturb the metric slightly.) Since M is orientable, f is a
two-sided embedding. By Proposition 2.25, it follows that Σ admits a metric
of positive scalar curvature. Meanwhile, for 2 ≤ i ≤ n, we can consider the
pullbacks $f^*\omega_i \in H^1(\Sigma, \mathbb{Z})$, and observe that
\[
f_*\bigl( [\Sigma] \cap (f^*\omega_2 \cup \cdots \cup f^*\omega_n) \bigr) = [M] \cap (\omega_1 \cup \cdots \cup \omega_n),
\]
which is nonzero by assumption. Therefore $f^*\omega_2 \cup \cdots \cup f^*\omega_n$ is nonzero
in $H^{n-1}(\Sigma, \mathbb{Z})$, which allows us to use our induction hypothesis to reach a
contradiction.
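The displayed identity in this proof is an instance of the naturality (projection) formula for cap products. A sketch, assuming the convention $(\sigma \cap a) \cap b = \sigma \cap (a \cup b)$; under other conventions the identity holds up to a sign, which does not affect the nonvanishing. The symbol β is shorthand introduced here.

```latex
% Shorthand (introduced here) for the product of the remaining classes.
\[
\beta := \omega_2 \cup \cdots \cup \omega_n \in H^{n-1}(M, \mathbb{Z}), \qquad
f^*\beta = f^*\omega_2 \cup \cdots \cup f^*\omega_n .
\]
% Projection formula, then f_*[\Sigma] = \alpha = [M] \cap \omega_1.
\[
f_*\bigl( [\Sigma] \cap f^*\beta \bigr)
= f_*[\Sigma] \cap \beta
= \bigl( [M] \cap \omega_1 \bigr) \cap \beta
= [M] \cap (\omega_1 \cup \beta).
\]
```

If $f^*\beta$ were zero, the left side would vanish, while the right side is the Poincaré dual of a nonzero class and hence nonzero.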
Let us give an overview of how the proof of Theorem 2.28 works when n ≤ 8, so

that we can better understand what goes into it. We begin with a compact
manifold M n satisfying the topological hypotheses of Theorem 2.28 and
suppose that it has positive scalar curvature. Essentially what we did was
construct a nested “slicing” of submanifolds Σ2 ⊂ · · · ⊂ Σn−1 ⊂ Σn = M
according to the following procedure: since M contains a nontrivial Hn−1
homology class, we can construct a minimizing hypersurface Σn−1 in M
by Theorem 2.22. (Note that the invocation of Theorem 2.22 is the most
technical part of the proof, and consequently it is the one part of the proof
that we leave unexplained.) Moreover, the topological hypotheses on M
are such that Σn−1 will inherit those same topological hypotheses. We use
the principal eigenfunction of the conformal Laplacian of Σn−1 to make
a conformal change to Σn−1 . The stability of Σn−1 guarantees that this
conformal change will give Σn−1 positive scalar curvature (as in the proof of
Proposition 2.25). Next we choose Σn−2 to be a minimizing hypersurface in
Σn−1 with respect to the new conformally changed metric, and we iterate the
process until we construct a two-dimensional surface Σ2 which has positive
scalar curvature. Since the topological hypotheses are passed on at each
step, all the way to Σ2, we eventually obtain a contradiction to the
Gauss-Bonnet Theorem.
We can now prove Theorem 1.30, which we restate for convenience.
Theorem 2.29. Let T n be the n-dimensional torus, and let M n be a com-
pact manifold. Then T n #M cannot carry a metric of positive scalar curva-
ture.
Proof. First suppose that M is orientable. We can easily produce $\omega_1, \ldots, \omega_n$
as in the hypotheses of Theorem 2.28 for the torus $T^n$. Next, consider the
map from $T^n \# M$ to $T^n$ that squashes M to a point. Pulling back the $\omega_i$
by this map yields cohomology classes which can be used to apply Theorem 2.28
to $T^n \# M$. The case when M is nonorientable can be handled by
first passing to the orientable double cover of $T^n \# M$.
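One concrete choice of classes can be sketched as follows. The sketch assumes the model $T^n = \mathbb{R}^n / \mathbb{Z}^n$ with coordinates $\theta_1, \ldots, \theta_n$ taken modulo 1, and writes π for the collapse map; both names are introduced here for illustration, and the signs depend on orientation conventions.

```latex
% Classes on the torus: \omega_i is the integral class represented by d\theta_i.
\[
\omega_i = [d\theta_i] \in H^1(T^n, \mathbb{Z}), \qquad
\langle \omega_1 \cup \cdots \cup \omega_n, [T^n] \rangle
= \int_{T^n} d\theta_1 \wedge \cdots \wedge d\theta_n = 1 .
\]
% The collapse map \pi : T^n \# M \to T^n has degree 1 for compatible orientations, so
\[
\langle \pi^*\omega_1 \cup \cdots \cup \pi^*\omega_n, [T^n \# M] \rangle
= \deg(\pi) \, \langle \omega_1 \cup \cdots \cup \omega_n, [T^n] \rangle = 1 \neq 0 .
\]
```

The pulled-back classes therefore satisfy the cup product hypothesis of Theorem 2.28 on the connected sum.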
Theorem 1.30 is of central importance to us, because we will use it to

prove the positive mass theorem. As a simplified test of the positive mass
conjecture, R. Geroch conjectured that there cannot exist a nonflat metric
of nonnegative scalar curvature on Rn that is equal to the Euclidean metric
outside a compact set [Ger75]. A. Fischer and J. Marsden had ruled out the
possibility of examples that are close to the Euclidean metric in their detailed
study of the linearization of scalar curvature [FM75]. Even at the time
of Geroch’s conjecture, it was already understood to be equivalent to the
nonexistence of positive scalar curvature on the torus (Theorem 1.30) via
Theorem 1.23, which we restate here for convenience.
Theorem 2.30. Suppose that (M, g) is a compact Riemannian manifold
with nonnegative scalar curvature, but M does not admit any metric with
positive scalar curvature. Then g must be Ricci-flat.
Proof. We present an elementary version of the proof in [KW75b]. Assume

(M, g) as in the hypotheses. Our first step is to show that g is scalar-flat.
Suppose that Rg is positive at least at one point. In this case, we use the
same basic argument as in Proposition 2.25. That is, by Theorem A.10, we
can choose ϕ1 to be a principal eigenfunction of the conformal Laplacian Lg
with eigenvalue λ1 , and then
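\[
\begin{aligned}
\lambda_1 \langle \phi_1, \phi_1 \rangle_{L^2(M)}
&= \langle L_g \phi_1, \phi_1 \rangle_{L^2(M)} \\
&= \int_M \left( -\frac{4(n-1)}{n-2} \Delta_g \phi_1 + R_g \phi_1 \right) \phi_1 \, d\mu_g \\
&= \int_M \left( \frac{4(n-1)}{n-2} |\nabla \phi_1|^2 + R_g \phi_1^2 \right) d\mu_g \\
&> 0.
\end{aligned}
\]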
Therefore λ1 > 0. Using this ϕ1 as our conformal change for g, we define
$\tilde{g} = \phi_1^{4/(n-2)} g$. Then by equation (1.8), we obtain
\[
R_{\tilde{g}} = \phi_1^{-\frac{n+2}{n-2}} L_g \phi_1 = \phi_1^{-\frac{n+2}{n-2}} \lambda_1 \phi_1 > 0.
\]
Thus g̃ has positive scalar curvature, which contradicts our original hypoth-
esis, and thus g must be scalar-flat.
Next we will show that if g is scalar-flat but not Ricci-flat, then it can
be perturbed to a new metric whose conformal Laplacian is strictly positive,
which will yield a contradiction by the general argument described above.
Specifically, we claim that the metric gt := g − tRicg has this property for
small t > 0. To see this we will compute a lower bound for the Rayleigh
quotients for $L_{g_t}$,
\[
A_t(u, u) := \frac{1}{\|u\|^2_{L^2(M, g_t)}} \int_M \left( \frac{4(n-1)}{n-2} |\nabla u|^2_{g_t} + R_{g_t} u^2 \right) d\mu_{g_t}.
\]
(The reader may wish to review the proof of Theorem A.10 for the relevance
of these Rayleigh quotients.) Observe that since Rg = 0, the constant
functions are principal eigenfunctions for Lg . By Exercise 1.10, we also
know that Ricg is divergence-free. Using Exercise 1.18 and the fact Rg = 0,
we now compute

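\[
\begin{aligned}
\frac{d}{dt}\Big|_{t=0} A_t(1, 1)
&= \frac{d}{dt}\Big|_{t=0} \frac{1}{|M|_{g_t}} \int_M R_{g_t} \, d\mu_{g_t} \\
&= \frac{1}{|M|_g} \int_M (DR|_g)(-\mathrm{Ric}_g) \, d\mu_g \\
&= \frac{1}{|M|_g} \int_M \left( \Delta_g(\mathrm{tr}_g(\mathrm{Ric}_g)) - \mathrm{div}_g(\mathrm{div}_g(\mathrm{Ric}_g)) + |\mathrm{Ric}_g|^2 \right) d\mu_g \\
&= \frac{1}{|M|_g} \int_M |\mathrm{Ric}_g|^2 \, d\mu_g \\
&> 0.
\end{aligned}
\]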
So there exists ε > 0 such that for small enough t, we have $A_t(u, u) \ge \varepsilon t$ for
all nonzero constant functions u.
Now consider nonzero smooth functions u that are orthogonal to the
constants in $L^2(M, g)$. For such u, we have $A_0(u, u) \ge \lambda_2(L_g) > \lambda_1(L_g) = 0$,
where λ2 is the second eigenvalue of $L_g$. For small enough t, the quantities
$g_t$, $d\mu_{g_t}$, and $R_{g_t}$ can only change by a small amount, and thus it is clear
that $A_t(u, u) \ge \frac{1}{2} \lambda_2(L_g) > 0$ for small enough t, for all nonzero smooth
functions u orthogonal to the constants in $L^2(M, g)$. Thus for small t > 0,
we see that $A_t(u, u) \ge \varepsilon t$ for all nonzero smooth functions u on M. In other
words, $\lambda_1(L_{g_t}) \ge \varepsilon t > 0$, completing the proof.
Remark 2.31. Note that the proof above uses a small deformation in the
direction of $-\mathrm{Ric}_g$. From a modern perspective, one can simply use Ricci
flow to execute this argument in a clean way: Ricci flow evolves a family of
metrics $g_t$ according to
\[
\frac{\partial}{\partial t} g = -2\,\mathrm{Ric}.
\]
By Exercises 1.18 and 1.10, Ricci flow evolves R according to
\[
\frac{\partial}{\partial t} R = \Delta_g R + 2|\mathrm{Ric}|^2, \tag{2.19}
\]
where we have suppressed the dependence on t in the notation. From this
point of view, if we start Ricci flow with initial metric g0 with zero scalar
curvature, then the parabolic strong maximum principle implies that gt must
have nonnegative scalar curvature. Moreover, it can only remain scalar-flat
if the term 2|Ric|2 is identically zero. For the reader interested in learning
more about Ricci flow, there are many great resources, but one particularly
good starting point is the book [CLN06].
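The evolution equation (2.19) can be recovered in two lines. This sketch assumes that Exercise 1.18 supplies the linearization $(DR|_g)(h) = -\Delta_g(\mathrm{tr}_g h) + \mathrm{div}_g(\mathrm{div}_g h) - \langle h, \mathrm{Ric}_g \rangle$, which is the form consistent with the computation in the proof of Theorem 1.23 above, and that Exercise 1.10 supplies the contracted Bianchi identity $\mathrm{div}_g \mathrm{Ric} = \frac{1}{2} dR$.

```latex
% Apply the assumed linearization formula with h = -2 Ric.
\[
\frac{\partial R}{\partial t}
= (DR|_g)(-2\,\mathrm{Ric})
= 2\Delta_g(\mathrm{tr}_g \mathrm{Ric}) - 2\,\mathrm{div}_g(\mathrm{div}_g \mathrm{Ric}) + 2|\mathrm{Ric}|^2 .
\]
% tr_g Ric = R, and div_g Ric = (1/2) dR gives div_g(div_g Ric) = (1/2) \Delta_g R.
\[
\frac{\partial R}{\partial t}
= 2\Delta_g R - \Delta_g R + 2|\mathrm{Ric}|^2
= \Delta_g R + 2|\mathrm{Ric}|^2 .
\]
```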
We now obtain a generalization of Geroch’s conjecture as a corollary of

Theorem 1.30 and Theorem 1.23. It is a special case of positive mass rigidity
(Theorem 3.19).
Corollary 2.32. Let (M, g) be a Riemannian manifold with nonnegative
scalar curvature such that there is a compact set K ⊂ M with $(M \setminus K, g)$
isometric to the Euclidean metric on $\mathbb{R}^n \setminus B_r(0)$ for some r > 0. Then
(M, g) is isometric to Euclidean space.
Proof. All of the interesting geometry of (M, g) is contained in K, which

is contained in a large Euclidean cube. If we identify the faces of the cube,
we get a new compact Riemannian manifold (M̃ , g̃) such that M̃ has the
topology of T n #K̃, where K̃ is the manifold obtained by taking K and
collapsing ∂K to a point. The new object (M̃ , g̃) clearly has nonnegative
scalar curvature and, by Theorem 1.30, it cannot carry a metric of positive
scalar curvature. So by Theorem 1.23, g̃ is Ricci-flat.
Recall the Hodge Theorem [Wik, Hodge theory] which states that every
element of H 1 (M̃ , R) can be represented by a harmonic 1-form, that is, a
1-form whose Hodge Laplacian is zero. By considering the map from M̃ to
T n that squashes K to a point (as in our proof of Theorem 1.30), one can see
that the dimension of H 1 (M̃ , R) is at least as large as that of H 1 (T n , R),
which is n. Therefore we have n linearly independent harmonic 1-forms
ω1 , . . . , ωn on M̃ . Next we consider the Weitzenböck formula for 1-forms,
which states that every 1-form ω satisfies
\[
\Delta_H \omega = \nabla^* \nabla \omega + \mathrm{Ric}(\omega^\sharp, \cdot),
\]
where $\Delta_H$ is the Hodge Laplacian and $\nabla^*$ is the formal adjoint operator
of ∇. Since M̃ is Ricci-flat, it follows that $\nabla^* \nabla \omega_i = 0$ for each of our
harmonic 1-forms $\omega_i$. By definition of the adjoint, $\|\nabla \omega_i\|_{L^2(\tilde{M})} = 0$, or in
other words each $\omega_i$ is a parallel 1-form. In particular, $\omega_1, \ldots, \omega_n$ forms a
global parallel coframe and, with respect to this coframe, the matrix g̃ij is
constant. Therefore the metric g̃ must be flat, and hence g is flat.
Exercise 2.33. Complete the proof above by showing that if (M, g) is iso-
metric to Euclidean space outside a compact set and is flat everywhere,
then it must be globally isometric to Euclidean space. (Hint: Look at
the universal cover by applying either the Killing-Hopf Theorem [Wik,
Killing-Hopf theorem] or the Cartan-Hadamard Theorem [Wik, Cartan-
Hadamard theorem].)
Exercise 2.34. In the proof above, once we know g̃ is Ricci-flat, it follows
that the original (M, g) is Ricci-flat. For the reader familiar with the Bishop-
Gromov comparison theorem [Wik, Bishop-Gromov inequality], use this
theorem (in place of the Hodge Theorem and Weitzenböck formula) to prove
that if (M, g) is isometric to Euclidean space outside a compact set and is
Ricci-flat everywhere, then it must be globally isometric to Euclidean space.
2.3.3. Higher dimensions. We now discuss some of the basic ideas used
in Schoen and Yau’s approach to Theorem 2.28 for general dimension in
[SY17]. As we have seen, the problem in dimensions n > 8 is that Theo-
rem 2.22 does not hold. More specifically, minimal hypersurfaces can have
singular sets of codimension 7. The Schoen-Yau approach can be described
as taking these problematic singularities head-on. Another approach to
proving Theorem 2.28 in all dimensions is to attempt to perturb away the
singularities as in Theorem 2.24. Lohkamp follows this approach in higher di-
mensions using his new concept of skin structures [Loh06,Loh15c,Loh15a,
Loh15b].
In the n ≤ 8 proof of Theorem 2.28 described in the previous subsection,
starting with (M, g), we built a nested slicing $\Sigma_2 \subset \cdots \subset \Sigma_{n-1} \subset \Sigma_n = M$
with the property that each $\Sigma_i$ is a minimizing hypersurface in $\Sigma_{i+1}$ with
respect to the metric $\tilde{g}_{i+1} := (\phi_{i+1} \cdots \phi_{n-1})^{\frac{4}{n-2}} g_{i+1}$, where $g_{i+1}$ is the metric
on $\Sigma_{i+1}$ induced by the original metric g, and each $\phi_j > 0$ is the principal
eigenfunction of the conformal Laplacian on $\Sigma_j$ with respect to the metric
induced by $\tilde{g}_{j+1}$. As mentioned, for n > 8, the minimizing hypersurfaces may
have codimension 7 singular sets. If one naively attempts to push through
this argument even with the singularities, the main issue is that although
Σn−1 can have a singular set of codimension at least 7, which is quite small,
the next slice Σn−2 might intersect that singular set and consequently it
could potentially have a singular set as large as codimension 6 inside it. In
order to finish the argument, we need to be sure that Σ2 is smooth enough

to apply the Gauss-Bonnet Theorem, and there is no obvious reason why
this should be true. And this is even assuming that we can make sense of
the idea of constructing the eigenfunctions ϕi at each step.
What one really requires is some sort of regularity theorem, not for
minimal hypersurfaces, but for “minimal k-slicings” Σk ⊂ · · · ⊂ Σn−1 ⊂
Σn = M of the sort described above. As long as one can show that the Σk
in such a k-slicing has singular set of codimension at least 3, that would be
enough to show that the Σ2 appearing in a minimal 2-slicing is smooth. A
result such as this can be proved in a similar manner to how regularity of
minimal hypersurfaces is proved: in that case, the key input is a regularity
result for minimizing tangent cones. Here what we need is a regularity result
for homogeneous minimal k-slicings.
In order to obtain an appropriate regularity result for minimal k-slicings,
Schoen and Yau needed to alter the construction quite a bit. Unfortunately,
a complete discussion of Schoen and Yau’s proof of Theorem 2.28 [SY17] is
beyond the scope of this book since it is primarily concerned with regularity
issues, essentially going beyond the sort of geometric measure theory argu-
ments that we have already skipped over in this book. However, we can go
into some more detail on the geometric side of the proof. The first observa-
tion is that there is quite a bit of a gap between positivity of the stability
operator and positivity of the conformal Laplacian, and in the construction
above, a lot of “useful positivity” is being thrown away. (To see this pre-
cisely, examine the inequalities used in the proof of Proposition 2.25.) In
order to avoid this, instead of deforming to positive scalar curvature at each
step, it is possible to wait to do this until the end. Specifically, we will
describe an alternative proof of Theorem 2.28 for n ≤ 8.
Alternative proof of Theorem 2.28 (n ≤ 8). Let $(M^n, g)$ be a compact
manifold satisfying the hypotheses of Theorem 2.28, and suppose that its
scalar curvature $R_n > 0$. Using the exact same reasoning that we used in our
original proof, it is straightforward to build a nested sequence
$\Sigma_2 \subset \cdots \subset \Sigma_{n-1} \subset \Sigma_n = M$ with the property that each $\Sigma_i$ is a minimizing hypersurface
in $\Sigma_{i+1}$ with respect to the volume measure $u_{i+1} \cdots u_{n-1} \, d\mu_i$, where $d\mu_i$ is
the volume measure of the metric $g_i$ induced on $\Sigma_i$ by the original ambient
metric g, and each $u_j > 0$ is the principal eigenfunction of the stability
operator (instead of the conformal Laplacian) of $\Sigma_j$ in $\Sigma_{j+1}$ with respect to
the volume measure $u_{j+1} \cdots u_{n-1} \, d\mu_j$. To put it more explicitly, if we set
$\rho_j := u_j \cdots u_{n-1}$, then $\Sigma_i$ is defined to be the minimizer of the weighted
volume functional $V_{\rho_{i+1}}(\Sigma) = \int_\Sigma \rho_{i+1} \, d\mu_i$ over all Σ in its homology class in
$\Sigma_{i+1}$. Meanwhile, $u_j$ is the principal eigenfunction of the stability operator
$L_j$ of $V_{\rho_{j+1}}$ at $\Sigma_j$, described more explicitly below in equation (2.21).
Given this setup, we claim that each Σi constructed in this way is

Yamabe positive. In particular, Σ2 is Yamabe positive, and we use this
to complete the proof. The big difference in this alternative proof is that it
takes quite a bit of calculation to verify that Σi is Yamabe positive.
The calculation is assisted by the fact that this setup has a helpful
interpretation in terms of warped products. For each j, consider the
n-dimensional warped product manifold
\[
\left( \Sigma_j \times T^{n-j}, \ \hat{g}_j := g_j + \sum_{p=j}^{n-1} u_p^2 \, dt_p^2 \right),
\]
where $(t_j, \ldots, t_{n-1})$ are the coordinates on the torus $T^{n-j}$. Using this definition,
it is straightforward to see that for any i-dimensional hypersurface Σ in
$\Sigma_{i+1}$, $V_{\rho_{i+1}}(\Sigma)$ is precisely the volume of $\Sigma \times T^{n-i-1}$ computed with respect
to the metric $\hat{g}_{i+1}$. In particular, this means that $\Sigma_i \times T^{n-i-1}$ is minimizing
in $(\Sigma_{i+1} \times T^{n-i-1}, \hat{g}_{i+1})$ with respect to variations that are independent of
the $T^{n-i-1}$ factor. We let $\tilde{g}_i$ denote the metric on $\Sigma_i \times T^{n-i-1}$ induced by
$(\Sigma_{i+1} \times T^{n-i-1}, \hat{g}_{i+1})$, so that
\[
\tilde{g}_i := g_i + \sum_{p=i+1}^{n-1} u_p^2 \, dt_p^2,
\]
where $(t_{i+1}, \ldots, t_{n-1})$ are the coordinates on the torus $T^{n-i-1}$. Since there
is a lot of notation to keep track of, the main thing to keep in mind is that
\[
(\Sigma_i \times T^{n-i-1}, \tilde{g}_i) \hookrightarrow (\Sigma_{i+1} \times T^{n-i-1}, \hat{g}_{i+1})
\]
is a minimal embedding with respect to variations independent of the $T^{n-i-1}$
factor. If we let $\tilde{A}_i$ denote the second fundamental form of this embedding,
then applying the stability inequality (2.16) to this embedding of warped
products above, we see that the stability of $\Sigma_i$ with respect to $V_{\rho_{i+1}}$ translates
to the statement that
\[
\int_{\Sigma_i} \left( |\nabla \phi|^2 + \frac{1}{2} (\tilde{R}_i - \hat{R}_{i+1} - |\tilde{A}_i|^2) \phi^2 \right) \rho_{i+1} \, d\mu_i \ge 0 \tag{2.20}
\]
for all $\phi \in C^\infty(\Sigma_i)$. Here $\tilde{R}_i$ and $\hat{R}_{i+1}$ denote the scalar curvatures of $\tilde{g}_i$
and $\hat{g}_{i+1}$, respectively. The corresponding stability operator on $\Sigma_i$ is then
\[
L_i = -\tilde{\Delta}_i + \frac{1}{2} (\tilde{R}_i - \hat{R}_{i+1} - |\tilde{A}_i|^2), \tag{2.21}
\]
where $\tilde{\Delta}_i$ denotes the Laplacian with respect to $\tilde{g}_i$ (for functions independent
of the $T^{n-i-1}$ factor), so that $u_i$ is the principal eigenfunction of this
operator $L_i$. The stability tells us that $L_i u_i \ge 0$.
Our next task is to use the stability inequality above involving R̃i and
R̂i+1 to show that Σi is Yamabe positive. We first observe that the con-
struction gives us the following.
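Claim.
\[
\hat{R}_{i+1} \ge R_g \ge 0.
\]
Note that for each j, $\hat{g}_j = \tilde{g}_j + u_j^2 \, dt_j^2$.
Exercise 2.35. Use Proposition 1.13 to show that
\[
R(g + u^2 \, dt^2) = R(g) - 2u^{-1} \Delta_g u.
\]
By the exercise, it follows that
\[
\hat{R}_j = \tilde{R}_j - 2 u_j^{-1} \tilde{\Delta}_j u_j.
\]
Using the fact that $L_j u_j \ge 0$, this becomes
\[
\hat{R}_j \ge \tilde{R}_j - (\tilde{R}_j - \hat{R}_{j+1} - |\tilde{A}_j|^2) \ge \hat{R}_{j+1}.
\]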
Iterating this all the way up to ĝn = gn = g proves the claim.
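For the reader who wants the intermediate step spelled out, here is a sketch of how the inequality $L_j u_j \ge 0$ enters, using only the formula (2.21) for $L_j$ and the positivity of $u_j$.

```latex
% L_j u_j >= 0 with L_j as in (2.21), divided by u_j > 0 and multiplied by 2:
\[
-2 u_j^{-1} \tilde{\Delta}_j u_j \ge -\left( \tilde{R}_j - \hat{R}_{j+1} - |\tilde{A}_j|^2 \right).
\]
% Substituting into \hat{R}_j = \tilde{R}_j - 2 u_j^{-1} \tilde{\Delta}_j u_j:
\[
\hat{R}_j \ge \hat{R}_{j+1} + |\tilde{A}_j|^2 \ge \hat{R}_{j+1},
\qquad\text{so}\qquad
\hat{R}_{i+1} \ge \hat{R}_{i+2} \ge \cdots \ge \hat{R}_n = R_g .
\]
```

Note that the nonnegative term $|\tilde{A}_j|^2$ is discarded at every step of the iteration.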

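The $\tilde{R}_i$ term is a bit trickier to deal with.
Claim.
\[
\tilde{R}_i \le R_i - 4 \rho_{i+1}^{-1/2} \Delta_i \rho_{i+1}^{1/2}.
\]
For this computation, we fix i, and then for each j from i + 1 to n − 1,
we define $\bar{g}_j := g_i + \sum_{p=j}^{n-1} u_p^2 \, dt_p^2$ (with $\bar{g}_n = g_i$). In particular, $\bar{g}_{i+1} = \tilde{g}_i$. By
Exercise 2.35, we have
\[
\begin{aligned}
\bar{R}_j &= \bar{R}_{j+1} - 2 u_j^{-1} \Delta_{j+1} u_j \\
&= \bar{R}_{j+1} - 2 u_j^{-1} \rho_{j+1}^{-1} \mathrm{div}_i(\rho_{j+1} \nabla u_j) \\
&= \bar{R}_{j+1} - 2 u_j^{-1} \Delta_i u_j - 2 \langle \nabla \log \rho_{j+1}, \nabla \log u_j \rangle \\
&= \bar{R}_{j+1} - 2 u_j^{-1} \Delta_i u_j - 2 \sum_{p=j+1}^{n-1} \langle \nabla \log u_p, \nabla \log u_j \rangle,
\end{aligned}
\]
where the gradients are all with respect to $g_i$. Iterating this for j from i + 1
to n − 1, we obtain
\[
\tilde{R}_i = R_i - 2 \sum_{j=i+1}^{n-1} u_j^{-1} \Delta_i u_j - 2 \sum_{i<p<q<n} \langle \nabla \log u_p, \nabla \log u_q \rangle.
\]
Next observe that, in general, $u^{-1} \Delta u = \Delta \log u + |\nabla \log u|^2$. Using this and
rewriting the sum over i < p < q < n, we obtain
\[
\begin{aligned}
\tilde{R}_i &= R_i - 2 \sum_{j=i+1}^{n-1} \left( \Delta_i \log u_j + |\nabla \log u_j|^2 \right)
- \Bigl| \sum_{j=i+1}^{n-1} \nabla \log u_j \Bigr|^2 + \sum_{j=i+1}^{n-1} |\nabla \log u_j|^2 \\
&\le R_i - 2 \Delta_i \log \rho_{i+1} - |\nabla \log \rho_{i+1}|^2 \\
&= R_i - 4 \rho_{i+1}^{-1/2} \Delta_i \rho_{i+1}^{1/2},
\end{aligned} \tag{2.22}
\]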
completing our proof of the second claim.
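Two elementary identities are used silently in the last steps of this computation. They are sketched here for a general positive function ρ and general vectors $a_j$; the letters f and $a_j$ are introduced only for this check.

```latex
% For a positive function \rho, write \rho^{1/2} = e^{f} with f = (1/2) \log \rho.
\[
\rho^{-1/2} \Delta \rho^{1/2} = e^{-f} \Delta e^{f} = \Delta f + |\nabla f|^2
= \tfrac{1}{2} \Delta \log \rho + \tfrac{1}{4} |\nabla \log \rho|^2 .
\]
% Multiplying by 4 gives the identity used in the last line of (2.22):
\[
4 \rho^{-1/2} \Delta \rho^{1/2} = 2 \Delta \log \rho + |\nabla \log \rho|^2 .
\]
% The cross terms are rewritten using, with a_j = \nabla \log u_j for i < j < n,
\[
2 \sum_{p < q} \langle a_p, a_q \rangle = \Bigl| \sum_j a_j \Bigr|^2 - \sum_j |a_j|^2,
\qquad \sum_j a_j = \nabla \log \rho_{i+1} .
\]
```

The inequality in (2.22) then amounts to discarding the remaining nonnegative sum of the $|\nabla \log u_j|^2$.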
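Combining the stability inequality (2.20) with the two claims above, we
see that for any $\phi \in C^\infty(\Sigma_i)$,
\[
\int_{\Sigma_i} \left[ |\nabla \phi|^2 + \left( \frac{1}{2} R_i - 2 \rho_{i+1}^{-1/2} \Delta_i \rho_{i+1}^{1/2} \right) \phi^2 \right] \rho_{i+1} \, d\mu_i \ge 0, \tag{2.23}
\]
which obviously implies that
\[
\int_{\Sigma_i} \left[ 2|\nabla \phi|^2 + \left( \frac{1}{2} R_i - 2 \rho_{i+1}^{-1/2} \Delta_i \rho_{i+1}^{1/2} \right) \phi^2 \right] \rho_{i+1} \, d\mu_i \ge 0,
\]
where the only change we made was placing a 2 in front of the $|\nabla \phi|^2$ term.
The reason why we do that is that if we now use $\varphi \rho_{i+1}^{-1/2}$ as a test function
in the above inequality, we can force all of the $\rho_{i+1}$ terms to drop out, and
we will be left with
\[
0 \le \int_{\Sigma_i} \left( 2|\nabla \varphi|^2 + \frac{1}{2} R_i \varphi^2 \right) d\mu_i.
\]
(Check this.) Just as we saw in the proof of Proposition 2.25, the right side
is less than or equal to $\frac{1}{2} \langle \varphi, L_{g_i} \varphi \rangle$, where $L_{g_i}$ is the conformal Laplacian of
$g_i$, as long as $2 \le \frac{2(i-1)}{i-2}$, which it is. Therefore $L_{g_i}$ is a positive operator
and thus $\Sigma_i$ is Yamabe positive.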
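The cancellation requested in "(Check this.)" can be sketched as follows. Here $w := \rho_{i+1}^{1/2}$ is an abbreviation introduced for this check, the test function is ϕ = φ/w (written `\phi = \varphi / w` below), and all integrals are over the closed manifold $\Sigma_i$.

```latex
% Gradient of the test function \phi = \varphi / w, multiplied by w^2 = \rho_{i+1}:
\[
|\nabla \phi|^2 w^2 = |\nabla \varphi|^2 - 2 \frac{\varphi}{w} \langle \nabla \varphi, \nabla w \rangle + \frac{\varphi^2}{w^2} |\nabla w|^2 .
\]
% Integration by parts for the potential term:
\[
- \int_{\Sigma_i} \frac{\Delta_i w}{w} \varphi^2 \, d\mu_i
= \int_{\Sigma_i} \Bigl\langle \nabla \frac{\varphi^2}{w}, \nabla w \Bigr\rangle d\mu_i
= \int_{\Sigma_i} \left( 2 \frac{\varphi}{w} \langle \nabla \varphi, \nabla w \rangle - \frac{\varphi^2}{w^2} |\nabla w|^2 \right) d\mu_i .
\]
% Twice the first line plus twice the second: all terms involving w cancel, leaving
\[
\int_{\Sigma_i} \left[ 2|\nabla \phi|^2 + \left( \tfrac{1}{2} R_i - 2 w^{-1} \Delta_i w \right) \phi^2 \right] w^2 \, d\mu_i
= \int_{\Sigma_i} \left( 2 |\nabla \varphi|^2 + \tfrac{1}{2} R_i \varphi^2 \right) d\mu_i .
\]
% Coefficient comparison with the conformal Laplacian in dimension i >= 3:
\[
\frac{2(i-1)}{i-2} = 2 + \frac{2}{i-2} \ge 2 .
\]
```

The factor 2 in front of the gradient term is exactly what makes the cross terms cancel; with the coefficient 1 from (2.23) a term involving $|\nabla w|^2$ would survive.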
Although the alternative proof above is more complicated than the orig-
inal proof, it has the benefit that it is easier to see which positive terms have
been thrown away in the course of the argument; therefore, it suggests how
one can alter the construction to take advantage of those positive terms.
Specifically, when we wrote down (2.23), we threw away the $|\tilde{A}_i|^2$ term, and
in inequality (2.22), we threw away some $|\nabla \log u_j|^2$ terms. In light of these
facts, Schoen and Yau defined the operator
\[
\mathcal{L}_i := L_i + \frac{3}{8} |\tilde{A}_i|^2 + \frac{1}{8n} \sum_{p=i+1}^{n-1} \left( |\nabla \log u_p|^2 + |\tilde{A}_p|^2 \right),
\]
where $L_i$ is the stability operator described in (2.21). We can now execute
the exact same construction of $\Sigma_k \subset \cdots \subset \Sigma_{n-1} \subset \Sigma_n = M$ as in the
proof above, except that we define each $u_i$ to be the principal eigenfunction
of $\mathcal{L}_i$ rather than $L_i$. It is this setup that Schoen and Yau define to be
a minimal k-slicing. The point here is that the added positive terms in
$\mathcal{L}_i$ make its associated quadratic form more coercive than the one for $L_i$,
but it still has the geometric property that positive scalar curvature of M
implies Yamabe positivity of the $\Sigma_2$ slice of a minimal 2-slicing. This can be
directly verified by going through similar computations as in our alternative
proof of Theorem 2.28, except making sure not to throw away the positive
terms we threw away before. On the other hand, the improved analytic
properties of $\mathcal{L}_i$ turn out to be strong enough to prove the desired regularity
of minimal k-slicings. Specifically, it is important to be able to show that
the eigenfunctions $u_j$ do not “concentrate” at the singular set.
2.4. More scalar curvature rigidity theorems

Observe that we can generalize Proposition 2.18 to give a relationship be-
tween lower scalar curvature bounds, topology, and area.
Proposition 2.36. Let (M, g) be a 3-manifold with scalar curvature Rg ≥ κ
for some constant κ ∈ R, and let Σ be a stable, two-sided closed minimal
surface in M . Then
κ|Σ| ≤ 4πχ(Σ).
Moreover, if equality is attained, then Σ is a totally geodesic surface with
constant Gauss curvature equal to $\frac{1}{2}\kappa$, such that along Σ, $R_g = \kappa$ and
$\mathrm{Ric}_g(\nu, \nu) = 0$.
When κ > 0, we have already seen the topological restriction χ(Σ) > 0
in Proposition 2.18, but now we see that we also obtain an upper bound on
the area of Σ. When κ = 0, we see that Σ must be S 2 , RP2 , a torus, or a
Klein bottle, but no area bounds are obtained. And when κ < 0, there is no
topological restriction, but we do obtain a lower bound on area if Σ is not
an S 2 , RP2 , a torus, or a Klein bottle. The equality case of Proposition 2.36
with κ = 0 was first observed in [FCS80], while the area bounds were first
noted in [SZ97].
Exercise 2.37. Prove Proposition 2.36. Hint: Follow the proof of Propo-
sition 2.18 to prove the inequality. For the equality case, show that 1 is in
the kernel of the stability operator.
If we upgrade the assumption on Σ in Proposition 2.36 from stable to
area-minimizing, then we obtain a splitting theorem. We say that a surface
Σ is locally area-minimizing if it has area less than or equal to that of all
nearby surfaces, where nearby is meant in the smooth sense.
Theorem 2.38 (Scalar curvature splitting theorem in three dimensions).

Let (M, g) be a 3-manifold with scalar curvature Rg ≥ κ for some constant
κ ∈ R, and let Σ be a locally area-minimizing two-sided closed surface in M .
If
κ|Σ| = 4πχ(Σ),
then Σ has constant Gauss curvature equal to $\frac{1}{2}\kappa$, and M splits as a Riemannian
product $\Sigma \times (-\varepsilon, \varepsilon)$ near Σ. In particular, it is impossible for Σ to
be strictly locally area-minimizing.
If we further assume that M is complete and Σ is area-minimizing in its
isotopy class, then the product Σ × R is a Riemannian covering of (M, g).
It is useful to think of this theorem as being a theorem about three

different cases: the κ = 0 case was proved by Mingliang Cai and Gregory
Galloway [CG00,Gal11], the κ > 0 case was proved by Hubert Bray, Simon
Brendle, and André Neves [BBN10], and the κ < 0 case was proved by
Ivaldo Nunes [Nun13]. It is worth pointing out that Theorem 2.20 provides
hypotheses guaranteeing that minimizing tori and minimizing higher genus
surfaces exist. For spheres, a theorem of W. Meeks and S.-T. Yau [MY80]
shows the existence of minimizing spheres if π2 (M ) is nontrivial and M
contains no projective planes.
There are also some noncompact analogs of the κ = 0 case of Theo-
rem 2.38. See Theorem 3.46 and the recent preprint of O. Chodosh, M. Eich-
mair, and Vlad Moraru [CEM18].
We will give a somewhat unified proof of Theorem 2.38, roughly following
the exposition given by M. Micallef and V. Moraru [MM15].
Proof. Assume the hypotheses of the theorem, and let ν be a global unit
normal on Σ. For each smooth function u on Σ, we consider the image
hypersurface Σ[u] of Σ under the map $F_u(x) = \exp_x(u(x)\nu)$. All hypersurfaces
that are close to Σ = Σ[0] in the smooth sense can be parameterized by
functions u that are close to zero.
For α ∈ (0, 1), consider the map
\[
\Psi : C^{2,\alpha}(\Sigma) \times \mathbb{R} \to C^{0,\alpha}(\Sigma) \times \mathbb{R},
\]
defined by
\[
\Psi(u, s) = \left( F_u^* H_{\Sigma[u]} - s, \ \frac{1}{|\Sigma|} \int_\Sigma u \, d\mu_\Sigma \right),
\]
where $F_u^* H_{\Sigma[u]}$ is the mean curvature scalar of the image surface Σ[u], pulled
back to the original surface Σ. Here, Ψ(u, s) is technically only defined for
sufficiently small $u \in C^{2,\alpha}(\Sigma)$, and one can check that the image lies in
$C^{0,\alpha}(\Sigma) \times \mathbb{R}$. By Exercise 2.13 and Proposition 2.36, we can compute the
linearization of Ψ to be
\[
D\Psi|_{(0,0)}(u, s) = \left( -\Delta_\Sigma u - s, \ \frac{1}{|\Sigma|} \int_\Sigma u \, d\mu_\Sigma \right).
\]
It is well known that the kernel of $\Delta_\Sigma$ is the constants and that the image
is the orthogonal complement of the constants (Theorem A.8), and from this
it follows that $D\Psi|_{(0,0)}$ is an isomorphism. (Note that this explains why we
augment the mean curvature operator with the extra parameter s.) Hence
we can invoke the inverse function theorem (Theorem A.43) to see that
there exist ε > 0 and a smooth map $(v, H) : (-\varepsilon, \varepsilon) \to C^{2,\alpha}(\Sigma) \times \mathbb{R}$ such
that $\Psi(v(t), H(t)) = (0, t)$ for all $t \in (-\varepsilon, \varepsilon)$. The equation $\Psi(v(t), H(t)) = (0, t)$
means each surface $\Sigma_t := \Sigma[v(t)]$ has constant mean curvature H(t)
(hence our choice of variable name), and differentiating this equation at
t = 0 shows that $\frac{\partial}{\partial t}\big|_{t=0} v(t) = 1$ (since it must be in the kernel of $\Delta_\Sigma$ and
have average equal to 1). This means that the surfaces $\Sigma_t$ form a smooth
foliation of constant mean curvature hypersurfaces for small values of t. In
particular, by taking ε smaller if necessary, a tubular neighborhood of Σ can
be identified with $\Sigma \times (-\varepsilon, \varepsilon)$, via the map $\Psi : \Sigma \times (-\varepsilon, \varepsilon) \to M$ given by
$\Psi(x, t) = F_{v(t)}(x)$. Under this diffeomorphism, we can rewrite the metric as
\[
g = h_t + \phi_t^2 \, dt^2,
\]
where $h_t$ is the induced metric on the constant mean curvature surface
$\Sigma_t = \Sigma \times \{t\}$, $\phi_t \nu_t = \frac{\partial}{\partial t}$ is the deformation vector field corresponding
to the map Ψ(x, t), and $\nu_t$ is the unit normal of $\Sigma_t$. Since $\phi_0 = \frac{\partial}{\partial t}\big|_{t=0} v(t) = 1$, we can
choose ε small enough so that $\frac{1}{2} < \phi_t < 2$ for all t. We can also demand
that $\frac{1}{2} < \frac{|\Sigma_t|}{|\Sigma|} < 2$.
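To see concretely why the linearization of Ψ at (0, 0) is an isomorphism, and why the extra parameter s is needed, one can solve the linear system directly. The pair (f, a) below is a hypothetical right-hand side introduced for this sketch, and the regularity of the solution u is taken for granted (it would follow from standard Schauder theory, which is an assumption here rather than something stated in the text).

```latex
% Given (f, a) in C^{0,\alpha}(\Sigma) \times \mathbb{R}, solve D\Psi|_{(0,0)}(u, s) = (f, a):
\[
-\Delta_\Sigma u - s = f, \qquad \frac{1}{|\Sigma|} \int_\Sigma u \, d\mu_\Sigma = a .
\]
% Integrating the first equation over the closed surface \Sigma forces
\[
s = -\frac{1}{|\Sigma|} \int_\Sigma f \, d\mu_\Sigma ,
\]
% so that f + s has mean zero and -\Delta_\Sigma u = f + s is solvable.
% The solution u is unique up to an additive constant, which is fixed by the average a.
```

Without the parameter s, the equation $-\Delta_\Sigma u = f$ would only be solvable for f of mean zero, so the mean curvature operator alone could not have surjective linearization.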
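By (2.15), for each t, we have
\[
\begin{aligned}
H'(t) &= -\Delta_{\Sigma_t} \phi_t + \frac{1}{2} (R_{\Sigma_t} - R_M - |A_{\Sigma_t}|^2 - H(t)^2) \phi_t \\
&\le -\Delta_{\Sigma_t} \phi_t + \frac{1}{2} (2K_{\Sigma_t} - \kappa) \phi_t.
\end{aligned}
\]
Dividing both sides by $\phi_t$ and integrating over $\Sigma_t$, we obtain
\[
\begin{aligned}
H'(t) \int_{\Sigma_t} \phi_t^{-1} \, d\mu_{\Sigma_t}
&\le \int_{\Sigma_t} \left[ -\phi_t^{-1} \Delta_{\Sigma_t} \phi_t + K_{\Sigma_t} - \frac{1}{2} \kappa \right] d\mu_{\Sigma_t} \\
&= -\int_{\Sigma_t} \phi_t^{-2} |\nabla \phi_t|^2 \, d\mu_{\Sigma_t} + 2\pi\chi(\Sigma) - \frac{1}{2} \kappa |\Sigma_t| \\
&\le \frac{1}{2} \kappa (|\Sigma_0| - |\Sigma_t|) \\
&= -\frac{1}{2} \kappa \int_0^t \frac{d}{ds} |\Sigma_s| \, ds \\
&= -\frac{1}{2} \kappa \int_0^t H(s) \int_{\Sigma_s} \phi_s \, d\mu_{\Sigma_s} \, ds,
\end{aligned} \tag{2.24}
\]
where we used integration by parts, the Gauss-Bonnet Theorem, our hypothesis
$\kappa|\Sigma| = 4\pi\chi(\Sigma)$, and the first variation formula (Proposition 2.10).
We claim that, after suitably shrinking ε again, H(t) ≤ 0 for all t ∈ [0, ε).
To prove the claim, we consider three cases.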
Case 1: κ = 0. Then inequality (2.24) says H (t) ≤ 0 for all t, and the
claim is immediate since H(0) = 0.
Case 2: κ > 0. Suppose that $H(t_1) > 0$ for some value of $t_1 > 0$. Further
suppose that H(t) ≥ 0 for all $t \in [0, t_1]$. Then inequality (2.24) implies that
H′(t) ≤ 0 for all $t \in [0, t_1]$, which together with H(0) = 0 contradicts the
assumption that $H(t_1) > 0$. Therefore H(t) must be negative somewhere in
$[0, t_1]$. Choose $t_0$ to be some time achieving the negative minimum of H(t)
over $[0, t_1]$. Then inequality (2.24) implies that for any $t \in [0, t_1]$,
\[ H'(t)\int_{\Sigma_t}\varphi_t^{-1}\,d\mu_{\Sigma_t} \le -\tfrac12\kappa H(t_0)\int_0^t\!\int_{\Sigma_s}\varphi_s\,d\mu_{\Sigma_s}\,ds. \tag{2.25} \]
Using our estimates on $\varphi_t$ and $|\Sigma_t|$, it follows that $H'(t)\cdot\frac14 \le -\frac12\kappa H(t_0)\cdot 4t$,
or more simply,
\[ H'(t) \le -8\kappa H(t_0)\,\varepsilon. \tag{2.26} \]
Thus
\[ H(t_1) = H(t_0) + \int_{t_0}^{t_1} H'(s)\,ds \le H(t_0) - 8\kappa H(t_0)\,\varepsilon^2 = (1 - 8\kappa\varepsilon^2)H(t_0), \]
which is a contradiction for small enough ε, since $H(t_1) > 0 > H(t_0)$.
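The passage from (2.25) to (2.26) can be spelled out as follows. This is only a sketch: it uses nothing beyond the bounds 1/2 < ϕt < 2 and 1/2 < |Σt|/|Σ| < 2 fixed earlier in the proof, and it writes explicitly the factor |Σ| that cancels from both sides.

```latex
% Bounds valid for all t in [0, \varepsilon):
\[
\int_{\Sigma_t}\varphi_t^{-1}\,d\mu_{\Sigma_t} \ge \tfrac12|\Sigma_t| \ge \tfrac14|\Sigma|,
\qquad
\int_0^t\!\int_{\Sigma_s}\varphi_s\,d\mu_{\Sigma_s}\,ds \le 2\cdot 2|\Sigma|\,t = 4|\Sigma|\,t .
\]
% Since \kappa > 0 and H(t_0) < 0, the right side of (2.25) is nonnegative.
% If H'(t) \le 0, then (2.26) holds trivially. Otherwise H'(t) > 0 and
\[
\tfrac14|\Sigma|\,H'(t)
\le H'(t)\int_{\Sigma_t}\varphi_t^{-1}\,d\mu_{\Sigma_t}
\le -\tfrac12\kappa H(t_0)\cdot 4|\Sigma|\,t,
\qquad\text{so}\qquad
H'(t) \le -8\kappa H(t_0)\,t \le -8\kappa H(t_0)\,\varepsilon .
\]
```

Integrating this bound over an interval of length at most ε is what produces the factor ε² in the final estimate for H(t1).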
Case 3: κ < 0. Again suppose that $H(t_1) > 0$ for some value of $t_1 > 0$,
but now choose $t_0$ to be the time achieving the maximum of H(t) over
$[0, t_1]$. Then because of the reversed sign on κ, we obtain the exact same
inequalities (2.25) and (2.26) for all $t \in [0, t_1]$. Thus
\[ H(t_0) = H(0) + \int_0^{t_0} H'(s)\,ds \le -8\kappa H(t_0)\,\varepsilon^2, \]
which is again a contradiction for small enough ε.
Now that we have established the claim H(t) ≤ 0 for t ∈ [0, ε), it follows
from the first variation formula that |Σt | ≤ |Σ|. But by the locally area-
minimizing property of |Σ|, we see that for small enough t, Σt must be
locally area-minimizing as well. Hence, by Proposition 2.36, it follows that
each Σt is totally geodesic and Ricg (νt , νt ) vanishes along Σt . Therefore the
metric ht on Σt = Σ × {t} is constant in t, and the first variation of mean
curvature (2.15) tells us that ΔΣt ϕt = 0, and hence ϕt is constant over Σt .
(In fact, our normalization of Ψ forces it to be identically 1.) Putting all of
this together, we see that g = h + dt² on Σ × [0, ε), where h is the induced
metric on Σ (after possibly shrinking ε appropriately). The same argument
works for t < 0, completing the proof.
If Σ is minimizing in its isotopy class, then we can use a standard open-
closed argument on t ∈ R to obtain a local isometry from Σ × R to M , which
must be a Riemannian covering since Σ × R is complete.
Exercise 2.39. Find a counterexample to show that the hypothesis of sta-
bility in Proposition 2.36 is insufficient to prove that the local splitting in
the conclusion of Theorem 2.38 exists.
Proposition 2.36 could be extended to higher dimensions, but it would
require replacing 4πχ(Σ) by the integral of the scalar curvature, which is
not a topological invariant. Not only would this be a far less interesting
statement, but the proof of Theorem 2.38 would break down without this
topological invariance. However, with some extra work, we can still obtain
a meaningful higher-dimensional result in the κ = 0 case. Specifically, we
have the following generalization of the κ = 0 case of Proposition 2.36 to
higher dimensions, which is a simple extension of Proposition 2.25.
Proposition 2.40. Let (M n , g) be a Riemannian manifold with nonnegative
scalar curvature, and assume Σn−1 is a stable, closed two-sided minimal
hypersurface of M . Then Σ is Yamabe positive, or Σ is a totally geodesic,
Ricci-flat hypersurface of M such that Rg and Ricg (ν, ν) vanish along Σ.
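Proof. We follow the exact same argument as in Proposition 2.25, except
with $R_M \ge 0$ instead of $R_M > 0$. We obtain
\[ \lambda_1\|\varphi_1\|_{L^2(\Sigma)}^2 \ge 2\int_\Sigma\left[|\nabla\varphi_1|^2 + \tfrac12\left(R_\Sigma - R_M - |A|^2\right)\varphi_1^2\right] d\mu_\Sigma \ge 0, \]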
where ϕ1 and λ1 are the principal eigenfunction and eigenvalue of the con-
formal Laplacian Lh of the induced metric h on Σ.
Case 1: λ1 > 0. Then using ϕ1 as a conformal factor gives Σ positive

scalar curvature, as already shown in the proof of Proposition 2.25.
Case 2: λ1 = 0. Then we see that RM and A must vanish along Σ, and
that ϕ1 is constant. Since Lh ϕ1 = 0, it follows that Σ is scalar-flat, and
then by the traced Gauss equation (Corollary 2.7), Ric(ν, ν) also vanishes.
Since Σ is scalar-flat, Theorem 1.23 says that Σ is either Yamabe positive,
or else it is Ricci-flat.
A corresponding splitting theorem was proved by Mingliang Cai.
Theorem 2.41 (Cai [Cai02]). Let (M^n, g) be a Riemannian manifold with
nonnegative scalar curvature, and let Σ be a locally volume-minimizing
two-sided closed hypersurface in M. If Σ is not Yamabe positive, then Σ is
Ricci-flat and M splits as a Riemannian product Σ × (−ε, ε) near Σ. In particular,
if Σ is strictly locally volume-minimizing, then it must be Yamabe positive.
If we further assume that M is complete and Σ is volume-minimizing in
its isotopy class, then the product Σ×R is a Riemannian covering of (M, g).
Proof. Our presentation here uses an elegant argument taken from Galloway
in [Gal18] rather than following the approach in [Cai02]. Assume
the hypotheses of the theorem, including the assumption that Σ is not
Yamabe positive. We begin the proof in the exact same way as in the proof
of Theorem 2.38. Specifically, we construct a map Ψ : Σ × (−ε, ε) −→ M
such that under this diffeomorphism, we can rewrite the metric as
\[ g = h_t + \varphi_t^2\, dt^2, \]
where $\varphi_t > 0$ and each slice $\Sigma_t = \Sigma \times \{t\}$ has constant mean curvature H(t).
As before, we have the equation
\[ H'(t) = L_{\Sigma_t}\varphi_t = -\Delta_{\Sigma_t}\varphi_t + \tfrac12\left(R_{h_t} - R_M - |A_{\Sigma_t}|^2 - H(t)^2\right)\varphi_t, \]
where $h_t$ is the induced metric on $\Sigma_t$, and $L_{\Sigma_t}$ is the stability operator for
$\Sigma_t$, as defined in equation (2.15).
As in the proof of Theorem 2.38 we would like to prove that H′(t) ≤ 0
for all t ∈ (0, ε). Suppose there exists some small τ such that H′(τ) > 0.
Then $L_{\Sigma_\tau}\varphi_\tau > 0$. Then there is a number c > 0 such that
\[ L_{\Sigma_\tau}\varphi_\tau \ge c\,\varphi_\tau. \]
It is a general fact that this inequality for a positive function $\varphi_\tau$ leads to
the lower eigenvalue bound
\[ 0 < c \le \lambda_1(L_{\Sigma_\tau}). \]
For the proof, see Theorem A.11. Recall from Exercise 2.26 that
\[ \lambda_1(L_{\Sigma_\tau}) \le \tfrac12\lambda_1(L_{h_\tau}), \]
where $L_{h_\tau}$ denotes the conformal Laplacian. So we have $0 < \lambda_1(L_{h_\tau})$, which
implies that Σ is Yamabe positive (Corollary 2.27). As this contradicts our
original assumption, it proves that H′(t) ≤ 0 for all t ∈ (0, ε), and the rest
of the proof proceeds exactly as in Theorem 2.38.
In all of the results above, we deal with two-sided minimal hypersurfaces,

but we have the following rigidity result for a one-sided minimal hypersur-
face.
Theorem 2.42 (Bray-Brendle-Eichmair-Neves [BBEN10]). Let (M, g) be
a compact 3-manifold with Rg ≥ 6, and suppose Σ is an embedded projective
plane in M that minimizes area among all embedded projective planes. Then
|Σ| ≤ 2π,
and equality is attained if and only if (M, g) is isometric to a standard RP3
with constant sectional curvature equal to 1.
If (M, g) admits an embedded projective plane at all, then the existence
of a minimizer Σ follows from a powerful theorem of W. Meeks, L. Simon,
and S.-T. Yau [MSY82], as explained in [BBEN10]. Note that when Σ
is two-sided, Proposition 2.36 tells us that |Σ| ≤ 2π/3, a much lower bound
on area than the one in the theorem. The inequality for the one-sided case
comes from the so-called “Hersch trick” [Her70]. Although one can no
longer use a global unit normal vector ν as a deformation in the second
variation formula, one can use other natural normal vectors arising from the
coordinate functions x1 , x2 , and x3 defined on the sphere covering Σ. They
give rise to normal vectors on Σ since they are odd functions on S 2 . More-
over, x1 , x2 , and x3 are useful for calculation since they are eigenfunctions
of the Laplacian on S 2 . See [BBEN10] for the details of this argument.
For the rigidity, they showed that Ricci flow will break the inequality unless
(M, g) has constant sectional curvature.
In the case of a sphere S 3 , we do not expect to find stable minimal 2-
spheres, but a theorem of Leon Simon and Francis Smith guarantees the
existence of an embedded minimal 2-sphere [Smi82]. The following striking
theorem of Fernando Marques and André Neves is an interesting companion
to the previous one.
Theorem 2.43 (Marques-Neves [MN12]). Let g be a metric on S 3 such
that Rg ≥ 6. Then if Σ is an embedded minimal 2-sphere in (S 3 , g) that
minimizes area among all embedded minimal spheres, then
|Σ| ≤ 4π,
and equality is attained if and only if g is a standard round metric on S 3

with constant sectional curvature equal to 1.
Since the minimal spheres here are not stable, the nature of the proof is
quite different from the other ones described in this section, and it requires
the use of so-called min-max methods.
It is natural to wonder whether there is a hyperbolic analog of Corol-
lary 2.32.
Theorem 2.44 (Scalar curvature rigidity of hyperbolic space). Let n < 8,
and let (M^n, g) be a Riemannian manifold with Rg ≥ −n(n − 1) such that
there is a compact set K ⊂ M with (M ∖ K, g) isometric to an exterior region
of standard hyperbolic space, H^n ∖ B_r(0) for some r > 0. (By standard, we
mean that the sectional curvature is −1.) Then (M, g) is isometric to H^n.
In higher dimensions, the result still holds if we assume M is spin.
The spin case was proved by work of Maung Min-Oo [MO89] together
with that of Lars Andersson and Mattias Dahl [AD98]. Theorem 2.44 is
now seen as a special case of the positive mass theorem for asymptotically
hyperbolic manifolds, which was proved for spin manifolds by Xiaodong
Wang [Wan01] and improved by P. Chruściel and M. Herzlich [CH03].
The n < 8 case of Theorem 2.44 was proved by L. Andersson, M. Cai,
and G. Galloway [ACG08] as a stepping stone toward a restricted version
of the positive mass theorem for asymptotically hyperbolic manifolds. A
version of the Penrose inequality (Conjecture 4.12) has also been conjectured
for asymptotically hyperbolic spaces (see [LN15, Amb15] for some partial
results).
Theorem 2.44 led Min-Oo to conjecture that a similar statement might
be true for spherical geometry. That is, given a Riemannian manifold
(M^n, g) with Rg ≥ n(n − 1) such that there is a compact set K ⊂ M
with (M ∖ K, g) isometric to a standard spherical region B_r(p) ⊂ S^n for
some r > π/2 and p ∈ S^n, does this imply that (M, g) is isometric to the
standard round unit sphere? The point of having r > π/2 is that we are
only allowing the standard S n to be changed within one hemisphere. It is
easy to see that the conjecture fails if we allow r < π/2. (Imagine a coun-
terexample as an exercise.) When n = 2, the answer was known to be yes by
classical work of Toponogov [Top59]. If we upgrade the hypothesis to the
much stronger assumption Ricg ≥ (n−1)g in place of Rg ≥ n(n−1), Fengbo
Hang and Xiaodong Wang showed that the answer is yes [HW09]. (In this
paper they also give a nice proof of the Toponogov result.) It was somewhat
surprising when counterexamples to Min-Oo’s conjecture were discovered for
n ≥ 3.
Theorem 2.45 (Brendle-Marques-Neves [BMN11]). Let n ≥ 3. There

exists a smooth metric g on the sphere S n with the following properties:
• The metric g agrees with the standard unit sphere metric on a ball
Br (p) for some r > π/2 and p ∈ S n . (Here, Br (p) is a geodesic
ball in the standard sphere.)
• Rg ≥ n(n − 1).
• Rg > n(n − 1) somewhere.
Note that by doubling the nontrivial hemisphere through the antipodal

map, one can find a nontrivial example of an RP3 which is standard near
its equatorial RP2 and satisfies R ≥ 6, which is quite interesting in light of
Theorem 2.42.
On the other hand, one obtains an affirmative answer to Min-Oo’s con-
jecture (for nearby metrics) if one is willing to bump up the size of r. That
is, we restrict the possible deformations to a small spherical cap.
Theorem 2.46 (Brendle-Marques [BM11]). Let n ≥ 3. There exist r0 < π
and ε > 0 with the following property. Let g be a metric on S^n such that
Rg ≥ n(n − 1) and g agrees with the standard unit sphere metric on B_{r0}(p)
for some p. Then if g is within ε of the standard metric in C² distance, then
(S^n, g) is the standard round unit sphere.
Chapter 3
The Riemannian
positive mass theorem
3.1. Background
3.1.1. The Schwarzschild metric. In the study of partial differential
equations, it is often instructive to understand solutions with many symme-
tries. For example, looking for spherically symmetric solutions of Laplace’s
equation leads to the discovery of the fundamental solution. Here we will
look at an analog of the “fundamental solution” for the constant scalar cur-
vature equation.
Let g be a spherically symmetric metric. In other words, suppose that g
is a warped product of a line with the standard (n − 1)-sphere. Explicitly, if
we use the notation dΩ2 to denote the standard unit sphere metric on S n−1 ,
we consider metrics of the form
\[ g = ds^2 + r(s)^2\, d\Omega^2 \]
for some positive function r(s). Note that $\frac{dr}{ds} = 0$ corresponds to the places
where the symmetric spheres are minimal (in fact, totally geodesic), and any
region where r(s) is constant corresponds to g being cylindrical (that is, the
Riemannian product of an interval with a sphere). Wherever $\frac{dr}{ds}$ is nonzero,
we can use r as a coordinate and rewrite the metric as
\[ g = \frac{dr^2}{V(r)} + r^2\, d\Omega^2 \]
for some positive function V (r). Let us focus our attention on such metrics
since the general case can always be broken up into pieces like this, plus
cylindrical pieces.
Exercise 3.1. Show that the scalar curvature of g is
\[ R_g = \frac{n-1}{r^2}\left[(n-2)\bigl(1 - V(r)\bigr) - rV'(r)\right]. \]
Try doing the computation in two different ways: first, using Proposi-
tion 1.13, and second, using the first and second variation formulas (Propo-
sitions 2.10 and 2.12) with a unit normal variation, together with the traced
Gauss equation (Corollary 2.7).
In light of the above, the prescribed scalar curvature equation for spher-
ically symmetric spaces is a nonhomogeneous linear ordinary differential
equation in V .
Exercise 3.2. Let κ be a constant. Use the previous exercise to show that
the only spherically symmetric metrics with constant scalar curvature κ are
(up to diffeomorphism) of the form
\[ g = \frac{dr^2}{V(r)} + r^2\, d\Omega^2 \]
with
\[ V(r) = 1 - \frac{2m}{r^{n-2}} - \frac{\kappa}{n(n-1)}\,r^2, \]
where m is some constant parameter.
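As a numerical sanity check (not a solution of the exercise), one can plug this family of functions V into the scalar curvature formula of Exercise 3.1 and confirm that the result is the constant κ. The sketch below uses only the Python standard library; the helper names `scalar_curvature`, `make_V`, and `max_deviation` and the sample parameter values are illustrative choices, and V′ is approximated by a central difference.

```python
def scalar_curvature(V, r, n, h=1e-5):
    """Formula of Exercise 3.1: R_g = (n-1)/r^2 * [(n-2)(1 - V(r)) - r V'(r)].

    V'(r) is approximated by a central difference with step h.
    """
    dV = (V(r + h) - V(r - h)) / (2 * h)
    return (n - 1) / r**2 * ((n - 2) * (1 - V(r)) - r * dV)


def make_V(m, kappa, n):
    """V(r) = 1 - 2m / r^(n-2) - kappa r^2 / (n(n-1)), as in Exercise 3.2."""
    return lambda r: 1 - 2 * m / r**(n - 2) - kappa * r**2 / (n * (n - 1))


def max_deviation(m, kappa, n, radii):
    """Largest |R_g - kappa| over the sample radii."""
    V = make_V(m, kappa, n)
    return max(abs(scalar_curvature(V, r, n) - kappa) for r in radii)


# Illustrative parameters (assumptions, chosen so that V > 0 at the sample radii).
for n in (3, 4, 5):
    for kappa in (-0.3, 0.0, 0.3):
        print(n, kappa, max_deviation(0.1, kappa, n, [1.5, 2.0, 3.0]))
```

The printed deviations should be at the level of finite-difference error, which is consistent with V solving a linear ordinary differential equation whose homogeneous solutions are the multiples of r^(2−n).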
Obviously, since m here is an arbitrary parameter, the factor of 2 and

the minus sign are immaterial, but it is conventional to parameterize the
solutions in this way. When κ = 0, we will call these metrics Schwarzschild
metrics. When κ > 0, we call them Schwarzschild–de Sitter metrics, and
when κ < 0, we call them Schwarzschild–anti-de Sitter metrics. These
metrics were first discovered by K. Schwarzschild for κ = 0, and by F. Kottler
and H. Weyl in the general case, in the context of general relativity. Be
aware that in the literature, these names frequently refer to (closely related)
Lorentzian metrics.
For now we are mainly interested in the Schwarzschild metrics. We
define the Schwarzschild metric of mass m to be
\[ g_m = \left(1 - \frac{2m}{r^{n-2}}\right)^{-1} dr^2 + r^2\, d\Omega^2. \tag{3.1} \]
We would like to understand the natural manifolds on which these metrics
live. When m = 0, this is obviously the Euclidean metric, so we take
Euclidean Rn to be the Schwarzschild space of mass zero.
For m > 0, it is easy to see that for large r, the metric is complete.
There appears to be a singularity at $r = (2m)^{\frac{1}{n-2}}$, but it turns out that this
is merely a “coordinate singularity” rather than a true geometric singularity.
It is a similar phenomenon to how spherical coordinates in R³ have a
“singularity” at the origin. On a historical note, this coordinate singularity
caused a great deal of confusion in the early days of general relativity
[Wik, Schwarzschild metric#History]. This history is recounted in detail
by J. Eisenstaedt in [Eis93]. One way to see that $r = (2m)^{\frac{1}{n-2}}$ does not
represent a true “geometric singularity” is to change our choice of radial
coordinate. One particularly nice choice is the one that exhibits the conformal
factor relating $g_m$ to the Euclidean metric.

Figure 3.1. The Schwarzschild space of mass m > 0.
Exercise 3.3. Show that there exists a radial coordinate ρ such that $g_m$
takes the form
\[ g_m = [u(\rho)]^{\frac{4}{n-2}}\left(d\rho^2 + \rho^2\, d\Omega^2\right), \]
where
\[ u(\rho) = 1 + \frac{m}{2\rho^{n-2}}. \]
Verify that this formula is correct, and that the region where $\rho > \left(\frac{m}{2}\right)^{\frac{1}{n-2}}$
corresponds to where $r > (2m)^{\frac{1}{n-2}}$. (This is a bit tedious, but it is essentially
just single variable calculus.)
Note that since the new expression for $g_m$ makes sense for all ρ > 0,
we now have a metric on all of $(0, \infty) \times S^{n-1}$. Show that the map
$\rho \mapsto \left(\frac{m}{2}\right)^{\frac{2}{n-2}}\rho^{-1}$ defines an isometry on $((0, \infty) \times S^{n-1}, g_m)$. Use this to help
explain why $((0, \infty) \times S^{n-1}, g_m)$ is complete.
Observe that if we switch to rectangular coordinates $x^1, \ldots, x^n$ on $\mathbb{R}^n$
with ρ = |x|, we see that $g_m$ can be written as a metric on $\mathbb{R}^n \smallsetminus \{0\}$ via
\[ (g_m)_{ij} = [u(x)]^{\frac{4}{n-2}}\,\delta_{ij}, \]
where
\[ u(x) = 1 + \frac{m}{2|x|^{n-2}}. \]
These coordinates are often called isotropic coordinates for the Schwarzschild
metric in the physics literature.
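The change of coordinates in Exercise 3.3 can be probed numerically before attempting the calculus. Matching the r² dΩ² terms of the two expressions for the metric forces r = ρ u(ρ)^(2/(n−2)); the sketch below (standard library only, with illustrative function names and sample values that are not from the text) checks that this r also matches the radial terms, that the inversion map preserves r, and that the fixed sphere of the inversion sits at r = (2m)^(1/(n−2)).

```python
def u(rho, m, n):
    """Conformal factor u(rho) = 1 + m / (2 rho^(n-2))."""
    return 1 + m / (2 * rho**(n - 2))


def areal_radius(rho, m, n):
    """r(rho) = rho * u^(2/(n-2)), forced by matching the r^2 dOmega^2 terms."""
    return rho * u(rho, m, n) ** (2 / (n - 2))


def radial_mismatch(rho, m, n, h=1e-6):
    """(dr/drho)^2 - V(r) u^(4/(n-2)), with V(r) = 1 - 2m / r^(n-2).

    This vanishes exactly when dr^2 / V(r) = u^(4/(n-2)) drho^2.
    dr/drho is approximated by a central difference with step h.
    """
    dr = (areal_radius(rho + h, m, n) - areal_radius(rho - h, m, n)) / (2 * h)
    r = areal_radius(rho, m, n)
    V = 1 - 2 * m / r**(n - 2)
    return dr**2 - V * u(rho, m, n) ** (4 / (n - 2))


def inversion(rho, m, n):
    """The map rho -> (m/2)^(2/(n-2)) / rho from Exercise 3.3."""
    return (m / 2) ** (2 / (n - 2)) / rho


# Illustrative values: mass 1 in dimension 4, at a point inside the fixed sphere.
print(radial_mismatch(0.2, 1.0, 4))
print(areal_radius(0.2, 1.0, 4), areal_radius(inversion(0.2, 1.0, 4), 1.0, 4))
```

Since the inversion preserves r and exchanges ρ → 0 with ρ → ∞, the region near ρ = 0 is a second copy of the region near infinity, which is the picture behind the two ends of the space.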
From now on, for any m > 0, we will refer to the above Riemannian
manifold ((0, ∞)×S n−1 , gm ) as the Schwarzschild space of mass m. From the
exercise above, we can see that this space is indeed a complete, noncompact
manifold with two ends that is scalar-flat everywhere. Moreover, we can see
that the sphere at $\rho = \left(\frac{m}{2}\right)^{\frac{1}{n-2}}$ is totally geodesic (and therefore minimal).
See Figure 3.1.

The negative mass Schwarzschild metric can also be written in isotropic
coordinates, but this does not allow us to extend the metric in this case.
Exercise 3.4. Show that for m < 0, the geometry of the metric gm becomes
singular as r approaches zero, where r is the coordinate used in formula (3.1).
That is, show that there is no way to extend the metric gm to a smooth
Riemannian metric on a larger space.
3.1.2. Asymptotic flatness. Observe that for any Schwarzschild metric,

as ρ approaches infinity (or zero), the metric is asymptotic to the Euclidean
metric (in some sense). These Schwarzschild spaces will be our models for
the study of asymptotically flat manifolds with nonnegative scalar curvature.
Notice that in ρ coordinates, for large ρ, gm differs from the Euclidean metric
by a quantity of order O(ρ2−n ). Although this is a reasonable class of metrics
to study, it turns out that many theorems of interest can be proved for the
wider class of metrics described below.
Definition 3.5. Let n ≥ 3. A Riemannian manifold (M^n, g) is said to
be asymptotically flat if there exists a bounded set K such that M ∖ K
is a finite union of ends $M_1, \ldots, M_\ell$ such that for each $M_k$, there exists a
diffeomorphism
\[ \Phi_k : M_k \longrightarrow \mathbb{R}^n \smallsetminus \bar{B}_1(0), \]
where $\bar{B}_1(0)$ is the standard closed unit ball (we will often write $B_1$ in
place of $B_1(0)$), such that if we think of each $\Phi_k$ as a coordinate chart
with coordinates $x^1, \ldots, x^n$, then in that coordinate chart (which we will
often call the asymptotically flat coordinate chart or sometimes the exterior
coordinate chart), we have
\[ g_{ij}(x) = \delta_{ij} + O_2(|x|^{-q}) \]
for some $q > \frac{n-2}{2}$. Here, $O_2(|x|^{-q})$ refers to an unspecified function in the
weighted space $C^2_{-q}$. We say that $f \in C^2_{-q}$ if
\[ |f(x)| + |x| \cdot |\partial f(x)| + |x|^2 \cdot |\partial^2 f(x)| < C|x|^{-q} \]
for some constant C. (See Definition A.22.) Here, $\partial = \bar\nabla$ refers to derivatives
with respect to the Euclidean background metric. We will refer to this q as
the asymptotic decay rate of g. See Figure 3.2.
Moreover, we also require that the scalar curvature is integrable over
(M, g).

Figure 3.2. An asymptotically flat manifold with three ends.
It is perhaps more accurate to call these manifolds asymptotically Euclidean
rather than asymptotically flat (and many authors do so), but by
this point, the latter term has stuck. We will see a bit later on why the last
part of the definition concerning scalar curvature is desirable. For now, note
that even if one assumes an asymptotic decay rate of q = n − 2 (which is
the case of most interest), it only follows that $R_g = O(|x|^{-n})$, which is not
strong enough decay to guarantee that $R_g \in L^1$.
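The borderline nature of the rate $|x|^{-n}$ can be seen from a model computation in polar coordinates. The following is a sketch using the Euclidean volume element (which is comparable to the volume element of g on an asymptotically flat end), with $\omega_{n-1}$ denoting the volume of the unit (n − 1)-sphere and δ an auxiliary positive exponent introduced only for comparison.

```latex
\[
\int_{\{|x|>1\}} |x|^{-n}\,dx
= \omega_{n-1}\int_1^\infty r^{-n}\,r^{n-1}\,dr
= \omega_{n-1}\int_1^\infty \frac{dr}{r} = \infty,
\qquad\text{whereas}\qquad
\int_{\{|x|>1\}} |x|^{-n-\delta}\,dx
= \omega_{n-1}\int_1^\infty r^{-1-\delta}\,dr
= \frac{\omega_{n-1}}{\delta} < \infty
\quad\text{for every } \delta > 0 .
\]
```

So a bound of the form $O(|x|^{-n})$ misses integrability only logarithmically, which is why integrability of the scalar curvature is imposed as a separate hypothesis.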
Exercise 3.6. Explicitly show that a Schwarzschild space of mass m > 0 is
asymptotically flat.
We have worded the definition of asymptotic flatness so that it makes

sense for incomplete manifolds, but typically we are interested in complete
manifolds (sometimes with boundary), in which case K is compact. Be fore-
warned that different papers typically define asymptotically flat manifolds
in slightly different ways. The basic idea is always the same, but the de-
tails may be important for the technical needs of the paper. In particular,
another common way to define asymptotic flatness uses weighted Sobolev
norms, that is, an integral decay condition rather than a pointwise decay
condition. (See the Appendix for more on weighted Sobolev spaces.) Within
the class of asymptotically flat manifolds, we have the following simpler class.
Definition 3.7. Let n ≥ 3. A Riemannian manifold (M^n, g) is said to
be asymptotically Schwarzschild if there exists a bounded set K such that
M ∖ K is a finite union of ends $M_1, \ldots, M_\ell$ such that for each $M_k$, there
exist a real number $m_k$ and a diffeomorphism
\[ \Phi_k : M_k \longrightarrow \mathbb{R}^n \smallsetminus \bar{B}_1(0), \]
such that if we think of each $\Phi_k$ as a coordinate chart with coordinates
$x^1, \ldots, x^n$, then in that coordinate chart, we have
\[ g_{ij}(x) = \left(1 + \frac{2m_k}{n-2}\,|x|^{2-n}\right)\delta_{ij} + O_2(|x|^{1-n}). \]
This $m_k$ is called the mass of the end $M_k$.
As one would expect, the Schwarzschild space of mass m has two ends,
each of which is asymptotically Schwarzschild with mass m, according to
this definition. (Check this.)
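For the end where ρ → ∞, this check amounts to a binomial expansion of the conformal factor in isotropic coordinates. The following sketch uses only the expression for u(x) given earlier in this section.

```latex
\[
[u(x)]^{\frac{4}{n-2}}
= \Bigl(1 + \frac{m}{2}\,|x|^{2-n}\Bigr)^{\frac{4}{n-2}}
= 1 + \frac{4}{n-2}\cdot\frac{m}{2}\,|x|^{2-n} + O_2\bigl(|x|^{2(2-n)}\bigr)
= 1 + \frac{2m}{n-2}\,|x|^{2-n} + O_2\bigl(|x|^{4-2n}\bigr).
\]
% Since 4 - 2n \le 1 - n exactly when n \ge 3, the remainder is O_2(|x|^{1-n}),
% which is the form required in Definition 3.7 with m_k = m.
```

The other end, where ρ → 0, can be reduced to this one by the inversion isometry of Exercise 3.3.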
3.1.3. Motivation for mass. Observe that the parameter m in Definition 3.7
cleanly describes the deviation of our metric g from being Euclidean.
But what about asymptotically flat metrics that are not asymptotically
Schwarzschild? For these metrics there is a generalization of this mass called
the ADM mass, named after R. Arnowitt, S. Deser, and C. Misner, who first
defined this mass using physical reasoning [ADM60, ADM61, ADM62].
In fact, they first developed the concept of asymptotically flat spaces for
this purpose. Here we provide some alternative motivation for ADM mass
based on some more simplistic physical reasoning.
Let us review the Newtonian theory of gravity. In this theory, our uni-
verse is R3 , and the effect of gravity is dictated by the gravitational potential,
which is a function V : R3 −→ R. Specifically, the effect of gravity is that
the acceleration of any test particle is equal to −∇V . The gravitational po-
tential, in turn, is determined by the distribution of matter in the universe,
which can be represented by a mass density function ρ : R3 −→ R. The two
functions are related by Poisson’s equation
\[ \Delta V = 4\pi\rho, \]
where we choose units in which Newton’s gravitational constant G is equal
to 1. We also impose the boundary condition
\[ \lim_{x\to\infty} V(x) = 0. \]
As long as ρ has reasonable decay, Poisson’s equation can be solved using
the well-known formula
\[ V(x) = -\int_{\mathbb{R}^3} \frac{\rho(y)}{|x - y|}\,dy, \tag{3.2} \]
where dy is just ordinary Lebesgue measure on R³. If we allow ρ to be
interpreted as a distribution, then this reasoning still holds. In particular,
we can represent a “point mass” m at x0 by taking the mass density ρ(x) to
be the Dirac delta function mδ(x − x0 ), and recover (a version of) Newton’s
law of gravitation, which states that this point mass m at x0 creates a
gravitational potential
\[ V_{m,x_0}(x) = -\frac{m}{|x - x_0|}. \tag{3.3} \]
Of course, this is (up to a constant factor), the fundamental solution of
Laplace’s equation. Conversely, if we instead assume equation (3.3) as the
potential arising from a point mass and combine it with the “principle of
superposition,” we can recover equation (3.2), and consequently we recover
Poisson’s equation as well. (This is the approach taken in most introductory
physics courses.) Moreover, we have the following.
Theorem 3.8 (Newton’s shell theorem). Suppose that ρ is a purely radial,
compactly supported function. (That is, the distribution of matter is spherically
symmetric around the origin.) Then $V(x) = -\frac{m}{|x|}$ for all x outside the
support of ρ, where $m = \int_{\mathbb{R}^3} \rho(x)\,dx$.
Proof. First observe that V must be a radial function. (Can you prove
this?) Therefore outside the support of ρ, V is a radial harmonic function,
and every radial harmonic function that decays at infinity is of the form
$V(x) = -\frac{m}{|x|}$ for some constant m. (This is an easy fact to check. See
Section A.1.4.)
Finally,
\begin{align*}
\int_{\mathbb{R}^3} \rho(x)\,dx &= \int_{\mathbb{R}^3} \frac{1}{4\pi}\,\Delta V\,dx \\
&= \lim_{r\to\infty} \int_{S_r} \frac{1}{4\pi}\,\frac{\partial V}{\partial r}\,d\mu_{S_r} \\
&= \lim_{r\to\infty} \int_{S_r} \frac{1}{4\pi}\,\frac{m}{|x|^2}\,d\mu_{S_r}(x) \\
&= m.
\end{align*}

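What happens if we pursue the same argument for matter distributions
that are not spherically symmetric? Consider ρ supported in the region
$|x| < r_1$. As before, V(x) is harmonic for $|x| \ge r_1$. By expanding in
spherical harmonics (Corollary A.19), we have
\[ V(x) = -\frac{m}{|x|} + O_1(|x|^{-2}) \]
for some constant m. Then
\begin{align*}
\int_{\mathbb{R}^3} \rho(x)\,dx &= \int_{\mathbb{R}^3} \frac{1}{4\pi}\,\Delta V\,dx \\
&= \lim_{r\to\infty} \int_{S_r} \frac{1}{4\pi}\,\frac{\partial V}{\partial r}\,d\mu_{S_r} \\
&= \lim_{r\to\infty} \int_{S_r} \frac{1}{4\pi}\left(\frac{m}{|x|^2} + O(|x|^{-3})\right) d\mu_{S_r}(x) \\
&= m.
\end{align*}
Once again, we have $m = \int_{\mathbb{R}^3} \rho(x)\,dx$. We define this quantity to be the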
total mass of the system. The physical significance of the total mass of the
system, m, is the following. The top-order behavior of V (x) for large x is
exactly the same as the potential arising from a point mass m at the origin.
In other words, the total mass tells us about the asymptotic behavior of V .
It also happens to be equal to the integral of the mass density, but note
that the latter quantity does not have any obvious physical significance. For
example, the total mass is completely irrelevant to test particles close to the
support of ρ. A test particle can only “feel” the quantity m when it is near
infinity. From this perspective, it is perhaps more natural to define the mass
by
\[ m := \lim_{r\to\infty} \frac{1}{4\pi} \int_{S_r} \frac{\partial V}{\partial r}\,d\mu_{S_r}, \tag{3.4} \]
and think of the fact that $m = \int_{\mathbb{R}^3} \rho(x)\,dx$ as a useful theorem. It holds
because of the “principle of superposition,” or in other words, because of
the linearity of the Laplacian.
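The flux definition (3.4) is easy to test numerically for a matter distribution that is not spherically symmetric. The sketch below is an illustration under stated assumptions, not part of the text's argument: it takes a finite collection of point masses (so V is a superposition of potentials of the form (3.3)), approximates ∂V/∂r by a central difference, and integrates over the sphere S_r with a midpoint rule in spherical angles. The function names and the sample masses are hypothetical.

```python
import math


def potential(x, masses):
    """V(x) = -sum_k m_k / |x - x_k| for point masses given as (m_k, x_k) pairs."""
    return -sum(m / math.dist(x, p) for m, p in masses)


def flux_mass(masses, r, n_theta=100, n_phi=200, h=1e-4):
    """Approximate (1/4pi) * integral over S_r of dV/dr (the quantity in (3.4) before the limit)."""
    d_theta = math.pi / n_theta
    d_phi = 2 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            # Euclidean outward unit normal of S_r at the angles (theta, phi)
            nu = (math.sin(theta) * math.cos(phi),
                  math.sin(theta) * math.sin(phi),
                  math.cos(theta))
            outer = tuple((r + h) * c for c in nu)
            inner = tuple((r - h) * c for c in nu)
            dV_dr = (potential(outer, masses) - potential(inner, masses)) / (2 * h)
            total += dV_dr * r**2 * math.sin(theta) * d_theta * d_phi
    return total / (4 * math.pi)


# Two off-center point masses with total mass 3 (illustrative values).
masses = [(2.0, (0.3, 0.0, 0.0)), (1.0, (-0.2, 0.4, 0.1))]
print(flux_mass(masses, 5.0), flux_mass(masses, 20.0))
```

Both radii should give approximately the total mass, up to quadrature error: by the divergence theorem the flux does not depend on r once the sphere encloses all of the matter, so only the limit r → ∞ is needed when the matter is not compactly supported.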
We now consider a simplistic geometric model of an isolated gravitational
system in general relativity, in three dimensions. We consider a snapshot in
time to be a complete, asymptotically flat manifold (M, g). Again, we can
consider the matter distribution to be represented by a mass density function
ρ : M −→ R. However, unlike in Newtonian gravity, ρ does not determine
the metric g (which plays a similar role to the gravitational potential) but
only constrains it, according to the equation
Rg = 16πρ.
That is, the scalar curvature is the mass density, up to a constant. (See
Chapter 7 for details. More precisely, this is equivalent to the Einstein
constraint equations in Definition 7.16 for the case k = 0 and n = 3.)
Meanwhile, the asymptotic flatness condition serves as a sort of boundary
condition.
What should the total mass be? As argued above, it should
not be $\int_M \rho\,d\mu_g$, but rather it should be an asymptotic integral involving
g, since the total mass should pick up the asymptotic physical behavior.
Here is one way to derive it (which we note is not the original motivation).
If the theory were linear (that is, if $R_g$ were a linear operator of g), then
it would be true that the total mass should be $\frac{1}{16\pi}\int_M R_g\,d\mu_g$. In the case
when (M, g) is very close to the Euclidean background metric $(\mathbb{R}^3, \bar{g})$, the
scalar curvature is approximately linear. That is, the scalar curvature can
be approximated by its linearization at the Euclidean background metric
$\bar{g}_{ij} = \delta_{ij}$. That is,
\[ R_g \approx DR|_{\bar{g}}(g - \bar{g}). \]
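In this case it is reasonable to define mass to be
\begin{align*}
m &:= \frac{1}{16\pi} \int_{\mathbb{R}^3} DR|_{\bar{g}}(g - \bar{g})\,d\bar\mu \tag{3.5} \\
&= \frac{1}{16\pi} \int_{\mathbb{R}^3} \left(-\bar\Delta(\overline{\mathrm{tr}}\,g) + \overline{\mathrm{div}}(\overline{\mathrm{div}}\,g)\right) d\bar\mu \\
&= \lim_{r\to\infty} \frac{1}{16\pi} \int_{S_r} \left(\overline{\mathrm{div}}\,g - d(\overline{\mathrm{tr}}\,g)\right)(\bar\nu)\,d\bar\mu_{S_r} \\
&= \lim_{r\to\infty} \frac{1}{16\pi} \int_{S_r} \sum_{i,j=1}^{3} (g_{ij,i} - g_{ii,j})\,\frac{x^j}{|x|}\,d\bar\mu_{S_r},
\end{align*}
where we used Exercise 1.18, and $\bar\nu$ is the Euclidean outward normal to $S_r$.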

Although this formula was derived under the assumption that the metric
was globally close to Euclidean, the formula should be a good definition for
all asymptotically flat metrics, since all such metrics are close to Euclidean
as we approach infinity, and the formula itself is defined in terms of the
asymptotic behavior of g and should not care about the behavior of g in any
compact region.
We can now understand the statement of the positive mass theorem.
Physically, we know that the mass density function ρ should be a nonneg-
ative function. In Newtonian gravity, the divergence theorem tells us that
as long as ρ is nonnegative everywhere and positive somewhere, the total
mass m defined by equation (3.4) is also positive. This means that the grav-
itational potential is asymptotic to a potential created by a positive point
mass. Physically, this means that far-away test particles are “attracted” to
the source masses. In our simplistic geometric model of general relativity,
we would like the same to be true: that nonnegative mass density (i.e., non-
negative scalar curvature) implies nonnegative mass. This is the content of
the positive mass theorem. We can see that this is physically very desirable.
A counterexample would suggest that perhaps there is some configuration of
matter that is somehow “repulsive” at large distances. On the other hand,
mathematically, it is far from obvious that the positive mass theorem should
be true.
3.1.4. ADM mass.

Definition 3.9. Given an asymptotically flat manifold (M^n, g) with ends
$M_1, \ldots, M_\ell$, we say that the ADM mass of the end $M_k$ is
\[ m_{\mathrm{ADM}}(M_k, g) = \lim_{\rho\to\infty} \frac{1}{2(n-1)\omega_{n-1}} \int_{S_\rho} \left(\overline{\mathrm{div}}\,g - d(\overline{\mathrm{tr}}\,g)\right)(\bar\nu)\,d\bar\mu_{S_\rho}. \tag{3.6} \]
The barred quantities are all quantities computed using the Euclidean background
metric determined by the asymptotically flat coordinate chart $\Phi_k$
used in Definition 3.5. The $S_\rho$ refers to the coordinate sphere of radius ρ,
$d\bar\mu_{S_\rho}$ is its volume measure induced by the Euclidean metric, $\bar\nu$ is its
Euclidean outward normal vector, and $\omega_{n-1}$ is the volume of the unit
(n − 1)-sphere.¹ More explicitly, we can write the more commonly used coordinate
expression
\[ m_{\mathrm{ADM}}(M_k, g) = \lim_{\rho\to\infty} \frac{1}{2(n-1)\omega_{n-1}} \int_{S_\rho} \sum_{i,j=1}^{n} (g_{ij,i} - g_{ii,j})\,\frac{x^j}{|x|}\,d\bar\mu_{S_\rho}. \tag{3.7} \]
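To see the coordinate expression (3.7) in action, the following sketch evaluates the integral over a finite coordinate sphere in dimension n = 3, where the normalizing constant 2(n − 1)ω_{n−1} equals 16π. It is an illustration rather than a definitive implementation: the metric is supplied as a function returning a 3 × 3 array of components in the asymptotically flat chart, derivatives are central differences, the sphere is integrated by a midpoint rule, and the names `adm_flux` and `schwarzschild_isotropic` are hypothetical.

```python
import math


def adm_flux(metric, rho, n_theta=40, n_phi=80, h=None):
    """Integral in (3.7) over the coordinate sphere S_rho for n = 3, before taking rho -> infinity."""
    h = h if h is not None else 1e-3 * rho
    d_theta, d_phi = math.pi / n_theta, 2 * math.pi / n_phi

    def d(i, j, k, x):
        # central difference approximating the partial derivative of g_ij in the x^k direction
        xp, xm = list(x), list(x)
        xp[k] += h
        xm[k] -= h
        return (metric(xp)[i][j] - metric(xm)[i][j]) / (2 * h)

    total = 0.0
    for a in range(n_theta):
        theta = (a + 0.5) * d_theta
        for b in range(n_phi):
            phi = (b + 0.5) * d_phi
            nu = (math.sin(theta) * math.cos(phi),
                  math.sin(theta) * math.sin(phi),
                  math.cos(theta))          # equals x^j / |x| on the sphere
            x = [rho * c for c in nu]
            integrand = sum((d(i, j, i, x) - d(i, i, j, x)) * nu[j]
                            for i in range(3) for j in range(3))
            total += integrand * rho**2 * math.sin(theta) * d_theta * d_phi
    return total / (16 * math.pi)           # 2(n-1) * omega_{n-1} = 16 pi when n = 3


def schwarzschild_isotropic(m):
    """Components (g_m)_ij = u^4 delta_ij with u = 1 + m / (2|x|), i.e. n = 3 isotropic coordinates."""
    def metric(x):
        u = 1 + m / (2 * math.sqrt(sum(c * c for c in x)))
        return [[u**4 if i == j else 0.0 for j in range(3)] for i in range(3)]
    return metric


print(adm_flux(schwarzschild_isotropic(1.0), 1000.0))
```

For the Schwarzschild metric of mass 1 the value at a large radius should be close to 1, and it only approaches the mass in the limit ρ → ∞, which is why the definition is stated with a limit.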
Let us also take a moment to think about what this mass is, without
regard to physical reasoning. The integral of the scalar curvature Rg can-
not be written as a flux integral at infinity, because the operator Rg is not
in divergence form. However, the linearization of Rg near the Euclidean
metric is a divergence (or in other words, Rg is a divergence plus higher-
order terms). The mass is defined to be the flux integral at infinity that
corresponds to this divergence. From this perspective, although the geo-
metric content of the mass is far from clear, one can see that it is intimately
connected to the partial differential operator Rg , when expressed using the
Euclidean background.
Exercise 3.10. Prove that the ADM mass of any end of an asymptotically
flat manifold exists and is finite. (Hint: Use Exercise 1.18, the divergence
theorem, and integrability of the scalar curvature.) Moreover, your argu-
ment should show that, given the other hypotheses of asymptotic flatness,
the integrability of scalar curvature is actually equivalent to the ADM mass
being well-defined and finite. Furthermore, observe that this proof shows
that the Sρ in formula (3.6) can be replaced by any family of surfaces Σρ
that “exhausts” the end Mk . (Here, we interpret ν as the outward normal
of Σρ .)
A priori, the definition of the ADM mass is not obviously a geometric
invariant of the end, since its definition depends on a particular choice of
coordinates. In fact, it was shown by V. Denisov and V. Solovev that if
one allows $q = \frac{n-2}{2}$ in Definition 3.5, then it is possible for a metric to
have two different ADM masses [DS83]. Reassuringly, it was proved by
R. Bartnik [Bar86] and by P. Chruściel [Chr86] that the ADM mass is
indeed a geometric invariant, given the definition of asymptotic flatness in
Definition 3.5.

¹ For n > 3, there does not seem to be a universally accepted convention for the constants
that appear in the definition of mass. Our convention was chosen in order to be consistent with
the fairly simple appearance of the mass parameter m in our definition of the Schwarzschild metric
in (3.1).
At the other extreme, note that if the asymptotic decay rate q is greater
than n − 2, then the ADM mass must be zero. (In fact, Theorem 3.14 below
implies that Ricci decay greater than n is enough to force the ADM mass
to be zero.) The most interesting case, and perhaps the most natural one,
given the definition of the ADM mass, is the borderline case where q = n−2.
This case includes Schwarzschild space as well as other simple examples.
Exercise 3.11. Show that if an end of an asymptotically Schwarzschild

manifold (M n , g) has mass m, then the ADM mass of that end is indeed
equal to m. Note that by the previous exercise, this computation implies
that g has integrable scalar curvature. Use this to prove that an asymptoti-
cally Schwarzschild manifold is asymptotically flat with decay rate q = n − 2
(as defined in Definition 3.5).
Exercise 3.12. Let $(M^n, g)$ be a one-ended asymptotically flat manifold,
and suppose that u is a smooth positive function on M such that in the
asymptotically flat coordinate chart, we have
$$u(x) = 1 + O_2(|x|^{-q})$$
for some $q > \frac{n-2}{2}$. Further assume that $\Delta_g u \in L^1$. Prove that
$\tilde g = u^{\frac{4}{n-2}} g$ is also an asymptotically flat metric on M, and that
its ADM mass is
$$m_{ADM}(\tilde g) = m_{ADM}(g) + \lim_{\rho\to\infty} \frac{-2}{(n-2)\omega_{n-1}} \int_{S_\rho} \frac{\partial u}{\partial r}\, d\bar\mu_{S_\rho}.$$
A particularly important case of the above is when the asymptotics of u
are modeled on that of a harmonic function. Suppose that u is a Euclidean
harmonic function on $\mathbb{R}^n \setminus \bar B_\rho$ for some ρ > 0, such that
$\lim_{x\to\infty} u(x)$ is equal to some constant a. Then by expanding in
spherical harmonics (Corollary A.19), we know that
$$u(x) = a + b|x|^{2-n} + O_2(|x|^{1-n})$$
for some constant b. More generally, Corollary A.38 implies that if g is
asymptotically flat, then any g-harmonic function can be expanded as
$$u(x) = a + b|x|^{2-n} + O_2(|x|^{2-n-\gamma})$$
for some γ > 0.
Exercise 3.13. Let $(M^n, g)$ be a one-ended asymptotically flat manifold,
and suppose that u is a smooth positive function on M such that in the
asymptotically flat coordinate chart, we have
$$u(x) = a + b|x|^{2-n} + O_2(|x|^{2-n-\gamma})$$
for some γ > 0 and some constants a and b with a > 0. Prove that
$\tilde g = u^{\frac{4}{n-2}} g$ is also an asymptotically flat metric on M, and that
its ADM mass is
$$m_{ADM}(\tilde g) = a^2\, m_{ADM}(g) + 2ab.$$
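As a quick consistency check of the formula in Exercise 3.13 (a sketch, not a solution of the exercise), one can take g to be the Euclidean metric and u to be the Schwarzschild conformal factor. This assumes that the Schwarzschild metric in (3.1) is written in the conformally flat form shown in the first comment; that form is an assumption made here, and only the asymptotics at the outer end matter for the check.

```latex
% Consistency check of m_ADM(\tilde g) = a^2 m_ADM(g) + 2ab.
% Assumption: (3.1) reads g_m = (1 + m/(2|x|^{n-2}))^{4/(n-2)} \delta.
\begin{align*}
u(x) &= 1 + \frac{m}{2}\,|x|^{2-n}
  &&\Longrightarrow\quad a = 1,\quad b = \tfrac{m}{2},\\
m_{ADM}(\delta) &= 0
  &&\Longrightarrow\quad
  m_{ADM}\bigl(u^{\frac{4}{n-2}}\delta\bigr)
  = 1^2\cdot 0 + 2\cdot 1\cdot\tfrac{m}{2} = m .
\end{align*}
% Scaling check: for a constant factor u = a (so b = 0), the metric is
% rescaled by \lambda^2 = a^{4/(n-2)}, and the mass scales by
% \lambda^{n-2} = a^2, in agreement with the a^2 m_ADM(g) term.
```

The first check agrees with the statement of Exercise 3.11 that the ADM mass of a Schwarzschild end equals the parameter m.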
The ADM mass can also be expressed in terms of curvature.

Theorem 3.14. Given an asymptotically flat manifold $(M^n, g)$ with ends
$M_1, \ldots, M_\ell$, the ADM mass of an end can be expressed as
$$m_{ADM}(M_k, g) = \lim_{i\to\infty} \frac{-1}{(n-1)(n-2)\omega_{n-1}} \int_{\Sigma_i} G(X, \bar\nu)\, d\bar\mu_{\Sigma_i},$$
where $G := \mathrm{Ric} - \frac{1}{2} R g$ is the Einstein tensor, X is the vector field
$x^i \partial_i$ on the end $M_k \cong \mathbb{R}^n \setminus \bar B_1(0)$, and $\Sigma_i$ is any sequence
that exhausts the end $M_k$ and has the property that
$|\Sigma_i| \le C(\inf_{x\in\Sigma_i} |x|)^{n-1}$ for some C independent of i. The barred
quantities refer to quantities computed using the Euclidean metric, as in
Definition 3.9.
This sort of formula for mass in terms of curvature goes back to A. Ash-
tekar and R. O. Hansen [AH78]. We present a fairly simple proof of Theo-
rem 3.14, essentially due to Pengzi Miao and Luen-Fai Tam [MT16]. It can
also be proved (at least, for a sequence of spheres) using a density argument
(see Lemma 3.48), as explained by Lan-Hsuan Huang [Hua12].
Remark 3.15. The asymptotic decay of $g_{ij} - \delta_{ij}$ implies that $\bar\nu$ and
$d\bar\mu_{\Sigma_i}$ can be replaced by $\nu$ and $d\mu_{\Sigma_i}$ in the theorem above.
Remark 3.16. If $(M^n, g)$ has asymptotic decay rate q = n − 2, then
Theorem 3.14 actually implies a coordinate independent formula for the mass:
$$m_{ADM}(M_k, g) = \lim_{\rho\to\infty} \frac{-1}{(n-1)(n-2)\omega_{n-1}} \int_{S_\rho(p)\cap M_k} \rho\, G(\nu, \nu)\, d\mu_{S_\rho(p)}$$
for any p ∈ M, where $S_\rho(p)$ is the geodesic sphere of radius ρ around p.
Proof. The proof may be thought of as a “linearized” version of the proof

of R. Schoen’s Pohozaev-type identity [Sch88]. We can assume without
loss of generality that g is a metric defined on Rn , since the theorem is
purely an asymptotic statement. As we approach infinity, G can be well-
approximated by its linearization DG|ḡ (g − ḡ) at the Euclidean background
metric ḡij = δij . For simplicity of notation, we will just write this as Ġ,
and similarly we write Ṙ := DR|ḡ (g − ḡ). Since G is divergence-free, we
can see that Ġ is also divergence-free with respect to Euclidean divergence.

For the purpose of the following computation, let us (abusively) think of G
as a (1, 1)-tensor, that is, with one raised index and one lowered index. Let
Ωi ⊂ Rn such that ∂Ωi = Σi . By the divergence theorem, we compute

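\begin{align*}
\frac{-1}{(n-1)(n-2)\omega_{n-1}} \int_{\Sigma_i} G(X)\cdot\bar\nu\, d\bar\mu_{\Sigma_i}
 &\approx \frac{-1}{(n-1)(n-2)\omega_{n-1}} \int_{\Sigma_i} \dot G(X)\cdot\bar\nu\, d\bar\mu_{\Sigma_i}\\
 &= \frac{-1}{(n-1)(n-2)\omega_{n-1}} \int_{\Omega_i} \operatorname{div}(\dot G(X))\, d\bar\mu\\
 &= \frac{-1}{(n-1)(n-2)\omega_{n-1}} \int_{\Omega_i} \left[(\operatorname{div}\dot G)(X) + \dot G\cdot\partial X\right] d\bar\mu\\
 &= \frac{-1}{(n-1)(n-2)\omega_{n-1}} \int_{\Omega_i} (\operatorname{tr}\dot G)\, d\bar\mu\\
 &= \frac{1}{2(n-1)\omega_{n-1}} \int_{\Omega_i} \dot R\, d\bar\mu,
\end{align*}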
which we know gives the correct expression for ADM mass in the limit as
i → ∞, thanks to our computations in (3.5) and Exercise 3.10. In the
above computation, Ġ · ∂X should be interpreted as the full contraction of
Ġ and ∂X with respect to the Euclidean metric. If the steps of the above
computation are unclear, try it in local coordinates.
Exercise 3.17. Fill in the following missing steps in the proof above. First,
justify that the approximation step G ≈ Ġ is valid, that is, in the limit as
i → ∞, the difference between the two integrals vanishes. Also check that
$\operatorname{div}\dot G = 0$ and $\operatorname{tr}\dot G = \left(1 - \frac{n}{2}\right)\dot R$.
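The last three equalities in the proof of Theorem 3.14 can be unpacked in Euclidean coordinates as follows. This is only a sketch of the index computation, using $X^i = x^i$ and treating $\dot G$ as a (1,1)-tensor as in the proof; it also uses that the Euclidean background has vanishing curvature, so that $\dot G = \dot{\mathrm{Ric}} - \frac12 \dot R\,\bar g$.

```latex
\begin{align*}
\partial_j X^i &= \delta^i_j
  \quad\Longrightarrow\quad
  \dot G\cdot\partial X
   = \dot G^{j}{}_{i}\,\partial_j X^i
   = \dot G^{i}{}_{i}
   = \operatorname{tr}\dot G,\\
\operatorname{div}\bigl(\dot G(X)\bigr)
  &= \partial_j\bigl(\dot G^{j}{}_{i} X^i\bigr)
   = (\partial_j \dot G^{j}{}_{i})\,X^i + \dot G^{j}{}_{i}\,\partial_j X^i
   = 0 + \operatorname{tr}\dot G,\\
\operatorname{tr}\dot G
  &= \operatorname{tr}\bigl(\dot{\mathrm{Ric}} - \tfrac12 \dot R\,\bar g\bigr)
   = \dot R - \tfrac{n}{2}\,\dot R
   = \bigl(1 - \tfrac{n}{2}\bigr)\dot R,\\
\frac{-1}{(n-1)(n-2)}\Bigl(1 - \frac{n}{2}\Bigr)
  &= \frac{-1}{(n-1)(n-2)}\cdot\frac{-(n-2)}{2}
   = \frac{1}{2(n-1)} .
\end{align*}
```

The final line accounts for the change of constant between the last two integrals in the proof.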
We now state the positive mass theorem—a theorem that was conjec-
tured as soon as the concept of ADM mass was formulated back in 1961.
Since there is a more general version for initial data sets, the modifier Rie-
mannian is useful in order to avoid confusion, but in the context of Part 1
of this book, there is no risk of ambiguity.
Theorem 3.18 (Riemannian positive mass theorem). Let (M, g) be a com-
plete asymptotically flat manifold with nonnegative scalar curvature. Then
the ADM mass of each end of M is nonnegative.
Schoen and Yau first proved the three-dimensional case in 1979 [SY79c,
SY81a], and they soon saw how to generalize their result to dimensions less
than 8 in [SY79a]. (See also [Sch89].) However, the higher-dimensional
cases were stymied by the problem of singularities of minimal hypersurfaces
that we described in Chapter 2. As mentioned earlier, it is now understood
that the positive mass theorem is a consequence of Theorem 1.30, and this
implication is what we will explain later in the chapter. The
higher-dimensional cases of Theorem 1.30 have been treated in
recent preprints of Schoen and Yau [SY17] and Lohkamp [Loh06, Loh15c,
Loh15a, Loh15b]. Meanwhile, in 1981 E. Witten discovered a proof that
works for all spin manifolds [Wit81]. We will discuss the proof in detail
in Chapter 5. Somewhat surprisingly, the spinor proof of the positive mass
theorem is actually simpler than the spinor proof of Theorem 1.30.
The case of zero mass is usually described as part of the positive mass
conjecture, but we prefer to think of it as a separate statement which follows
from the positive mass theorem (Theorem 3.18).
Theorem 3.19 (Positive mass rigidity). Let (M, g) be a complete asymp-

totically flat manifold with nonnegative scalar curvature. If the ADM mass
of any end of (M, g) is zero, then (M, g) must be isometric to Euclidean
space.
The fact that this is a direct consequence of Theorem 3.18 was essentially
proved by Schoen and Yau in [SY79c]. Witten’s spinor method also yields
the rigidity result, but only in the spin case [Wit81].
Given the above rigidity result, one might wonder about the correspond-
ing stability question. That is, if we have a sequence of complete asymptot-
ically flat manifolds with nonnegative scalar curvature whose masses con-
verge to zero, can we say that these Riemannian manifolds are approaching
Euclidean in some weak sense (given some scale-fixing assumptions)? This
question is rather subtle and essentially wide open, but various related re-
sults can be found in [Cor05, Lee09, LS14, LS12, LS15, HL15, HLS17,
SSA17, BF02, FK02, Fin09]. See Theorem 4.67.
3.2. Special cases of the positive mass theorem

This section can be skipped if the reader is only interested in the general case,
though there are some interesting and useful ideas presented here. Taking
our cue from Section 3.1.3, it would be nice if we could use a divergence
theorem argument to prove the positive mass theorem. There are a few
important special cases where this works. Most of these special cases include
the spherically symmetric case, which is the simplest of all.
3.2.1. Spherically symmetric case.
Proposition 3.20. Let g be a complete asymptotically flat metric
on $\mathbb{R}^n$ which is spherically symmetric in the sense that, under the
diffeomorphism $\mathbb{R}^n \setminus \{0\} \cong (0, \infty) \times S^{n-1}$, the metric can be
expressed as
$$g = \frac{dr^2}{V(r)} + r^2\, d\Omega^2$$
for some smooth positive function V.
If g has nonnegative scalar curvature, then it has nonnegative ADM
mass. Moreover, if the mass is zero, then g is Euclidean.
Technically, the form of the metric g given in the statement of the propo-
sition implicitly assumes that none of the symmetric spheres around the
origin are minimal. The proof can be easily adapted when minimal surfaces
are present, as we will see in Proposition 4.20.
Exercise 3.21. Show that for an asymptotically flat spherically symmetric
metric $g = \frac{dr^2}{V(r)} + r^2\, d\Omega^2$, the ADM mass is given by
$$m_{ADM}(g) = \lim_{r\to\infty} \frac{1}{2}\, r^{n-2}(1 - V(r)).$$
This can be done in a coordinate-free manner using Theorem 3.14 together
with the traced Gauss equation (Corollary 2.7).
Proof. In solving Exercise 3.2, most likely one showed that
$$R = (n-1)\, r^{1-n}\, \frac{d}{dr}\left[ r^{n-2}(1 - V(r)) \right].$$
Thus, if R ≥ 0 everywhere, then $\frac{1}{2} r^{n-2}(1 - V(r))$ is a nondecreasing
function in r for all r > 0. (When n = 3, this expression is just the Hawking
mass of the sphere at radius r. See Definition 4.23.) The assumption that g can
be extended to a complete metric on all of $\mathbb{R}^n$ then implies that V must be
bounded as r → 0. (Check this for yourself.) Consequently,
$$0 = \lim_{r\to 0} \frac{1}{2}\, r^{n-2}(1 - V(r)) \le \lim_{r\to\infty} \frac{1}{2}\, r^{n-2}(1 - V(r)) = m_{ADM}(g).$$
If the ADM mass is actually zero, then we see that the nondecreasing function
$r^{n-2}(1 - V(r))$ must be identically zero, which means that V(r) is
identically 1, and thus g is Euclidean.
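The monotonicity used in this proof can be recorded in terms of a mass function. The symbol $\mu(r)$ below is notation introduced only for this sketch and does not appear in the text; the limit at $r \to 0$ uses $n \ge 3$ and the boundedness of V near the origin.

```latex
\begin{align*}
\mu(r) &:= \tfrac12\, r^{n-2}\bigl(1 - V(r)\bigr)
  \quad\Longleftrightarrow\quad
  V(r) = 1 - \frac{2\mu(r)}{r^{n-2}},\\
R &= (n-1)\, r^{1-n}\,\frac{d}{dr}\bigl[2\mu(r)\bigr]
   = 2(n-1)\, r^{1-n}\,\mu'(r),\\
R \ge 0 &\iff \mu'(r) \ge 0,
  \qquad \lim_{r\to 0}\mu(r) = 0,
  \qquad \lim_{r\to\infty}\mu(r) = m_{ADM}(g).
\end{align*}
% Example: constant \mu \equiv m gives V(r) = 1 - 2m r^{2-n} and R = 0.
% For m > 0 this V is positive only where r^{n-2} > 2m, so the example
% is scalar-flat but does not satisfy the hypotheses on all of R^n.
```

In this form the proposition says that a nondecreasing function that starts at zero has a nonnegative limit, and that the limit is zero only if the function vanishes identically.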
3.2.2. Conformally flat case. The next case we consider is the globally
conformally Euclidean case.
Proposition 3.22. Let u be a smooth positive function on $\mathbb{R}^n$ satisfying
$$u(x) = 1 + O_2(|x|^{-q})$$
for some $q > \frac{n-2}{2}$, and further assume that $\Delta_g u \in L^1$. Then Exercise 3.12
implies that $(\mathbb{R}^n, g_{ij} = u^{\frac{4}{n-2}} \delta_{ij})$ is a complete asymptotically flat
manifold.
If g has nonnegative scalar curvature, then it has nonnegative ADM
mass. Moreover, if the mass is zero, then g is Euclidean.
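Proof. By Exercise 1.8, we know that
$$R_g = -\frac{4(n-1)}{n-2}\, u^{-\frac{n+2}{n-2}}\, \bar\Delta u,$$
where the bar notation indicates the background Euclidean metric $\delta_{ij}$. By
Exercise 3.12, we have
\begin{align*}
m_{ADM}(g) &= m_{ADM}(\delta) + \lim_{\rho\to\infty} \frac{2}{(n-2)\omega_{n-1}} \int_{S_\rho} -\frac{\partial u}{\partial r}\, d\bar\mu_{S_\rho}\\
 &= \frac{2}{(n-2)\omega_{n-1}} \int_{\mathbb{R}^n} -\bar\Delta u\, d\bar\mu\\
 &= \frac{2}{(n-2)\omega_{n-1}} \int_{\mathbb{R}^n} \frac{n-2}{4(n-1)}\, u^{\frac{n+2}{n-2}} R_g\, d\bar\mu.
\end{align*}
It is now clear that $R_g \ge 0$ implies $m_{ADM}(g) \ge 0$. To see the rigidity,
observe that if we furthermore have $m_{ADM}(g) = 0$, then $\bar\Delta u$ vanishes, and
hence u is a harmonic function on all of $\mathbb{R}^n$. Since it approaches 1 at infinity,
it must then be identically equal to 1 by the maximum principle.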
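Simplifying the constants in the last line of this proof gives a compact mass formula for the conformally Euclidean case. The following is a sketch; the second line uses that the volume form of $u^{\frac{4}{n-2}}\delta$ is $u^{\frac{2n}{n-2}}\,d\bar\mu$, and the $n = 3$ line uses $\omega_2 = 4\pi$.

```latex
\begin{align*}
m_{ADM}(g)
  &= \frac{2}{(n-2)\omega_{n-1}}\cdot\frac{n-2}{4(n-1)}
     \int_{\mathbb{R}^n} u^{\frac{n+2}{n-2}} R_g \, d\bar\mu
   = \frac{1}{2(n-1)\omega_{n-1}}
     \int_{\mathbb{R}^n} u^{\frac{n+2}{n-2}} R_g \, d\bar\mu\\
  &= \frac{1}{2(n-1)\omega_{n-1}}
     \int_{\mathbb{R}^n} u^{-1} R_g \, d\mu_g ,\\
n = 3:\quad
m_{ADM}(g) &= \frac{1}{16\pi}\int_{\mathbb{R}^3} u^{5} R_g \, d\bar\mu .
\end{align*}
```

Since u is positive, the sign of the mass is visibly controlled by the sign of $R_g$.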
Notice that one nice feature of the globally conformally Euclidean setting
is that the scalar curvature operator becomes a partial differential operator
on the function u, rather than on a tensor g. We can see that if we linearize
this operator at u = 1, it tells us that for u ≈ 1, we have
$R_g \approx -\frac{4(n-1)}{n-2}\, \Delta u$.
So we see that if we set V = 2(1 − u), then we obtain a “Newtonian limit”
as u gets closer to 1. That is, when n = 3, the equation Rg = 16πρ with
u(∞) = 1 from Section 3.1.3 indeed (informally) reduces to ΔV = 4πρ with
V (∞) = 0 as u becomes closer to the constant function 1.
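The informal Newtonian limit can be checked by direct substitution. This sketch only tracks the constants and ignores the error terms in the linearization; Δ denotes the Euclidean Laplacian.

```latex
\begin{align*}
V = 2(1-u)
  &\quad\Longleftrightarrow\quad u = 1 - \tfrac12 V
  \quad\Longrightarrow\quad \Delta u = -\tfrac12\,\Delta V,\\
R_g \approx -\frac{4(n-1)}{n-2}\,\Delta u
  &= \frac{2(n-1)}{n-2}\,\Delta V
  \quad\xrightarrow{\ n=3\ }\quad R_g \approx 4\,\Delta V,\\
R_g = 16\pi\rho
  &\quad\Longrightarrow\quad \Delta V \approx 4\pi\rho .
\end{align*}
```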
3.2.3. Graphical case. The next case we consider is the case of graphical
hypersurfaces of Euclidean space, due to Mau-Kwong George Lam. The
rigidity was proved by Lan-Hsuan Huang and Damin Wu.
Theorem 3.23 (Lam [Lam11], Huang-Wu [HW13]). Let $f : \mathbb{R}^n \to \mathbb{R}$
be a smooth function such that $\lim_{x\to\infty} f(x)$ is either a constant or ∞. Let
M be the graph of f in $\mathbb{R}^{n+1}$, and let g be the metric on M induced by the
Euclidean metric on $\mathbb{R}^{n+1}$. Assume that $f_i f_j = O_2(|x|^{-q})$ for some
$q > \frac{n-2}{2}$, where the subscripts on f denote partial differentiation, and
assume that $R_g$ is integrable over (M, g).
Then (M, g) is asymptotically flat, and if it has nonnegative scalar cur-
vature, then it has nonnegative ADM mass. Moreover, if the mass is zero,
then (M, g) is Euclidean space.
The key to this proof is the fact that the scalar curvature can be written
as a divergence. This was first observed by Robert Reilly [Rei73].
Lemma 3.24. Let $f : \mathbb{R}^n \to \mathbb{R}$ be a smooth function, let M be the graph
of f in $\mathbb{R}^{n+1}$, and let g be the metric on M induced by the Euclidean metric
on $\mathbb{R}^{n+1}$. Using the coordinates $x \mapsto (x, f(x))$ as coordinates on M, we have
$$R_g = \sum_{i,j=1}^{n} \partial_j \left( \frac{f_{ii} f_j - f_{ij} f_i}{1 + |\partial f|^2} \right),$$
where the subscripts on f denote partial differentiation.
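Proof. Thinking of (M, g) as a hypersurface of Euclidean $\mathbb{R}^{n+1}$, we can use
the traced Gauss equation (2.7) to see that
$$R_g = H^2 - |A|^2,$$
where A and H are the second fundamental form and mean curvature
of M.
The upward unit normal to M is $\nu = \frac{(-\partial f, 1)}{w}$, where $w = \sqrt{1 + |\partial f|^2}$.
Therefore the shape operator, written with respect to x coordinates, is
$$S^i_j = \partial_j\left(\frac{-f_i}{w}\right) = -\left(\frac{f_i}{w}\right)_j.$$
So we have
\begin{align*}
H^2 = (\operatorname{tr} S)^2
 &= \left[\sum_{i=1}^{n} \left(\frac{f_i}{w}\right)_i\right] \cdot \left[\sum_{j=1}^{n} \left(\frac{f_j}{w}\right)_j\right]\\
 &= \sum_{i,j=1}^{n} \left[ \partial_j\left( \left(\frac{f_i}{w}\right)_i \frac{f_j}{w} \right) - \left(\frac{f_i}{w}\right)_{ij} \frac{f_j}{w} \right],\\
|A|^2 = \operatorname{tr}(S^2)
 &= \sum_{i,j=1}^{n} \left(\frac{f_i}{w}\right)_j \left(\frac{f_j}{w}\right)_i\\
 &= \sum_{i,j=1}^{n} \left[ \partial_i\left( \left(\frac{f_i}{w}\right)_j \frac{f_j}{w} \right) - \left(\frac{f_i}{w}\right)_{ji} \frac{f_j}{w} \right]\\
 &= \sum_{i,j=1}^{n} \left[ \partial_j\left( \left(\frac{f_j}{w}\right)_i \frac{f_i}{w} \right) - \left(\frac{f_i}{w}\right)_{ij} \frac{f_j}{w} \right],\\
R_g = H^2 - |A|^2
 &= \sum_{i,j=1}^{n} \partial_j\left[ \left(\frac{f_i}{w}\right)_i \frac{f_j}{w} - \left(\frac{f_j}{w}\right)_i \frac{f_i}{w} \right]\\
 &= \sum_{i,j=1}^{n} \partial_j\left[ \frac{(f_{ii} w - f_i w_i) f_j - (f_{ji} w - f_j w_i) f_i}{w^3} \right]\\
 &= \sum_{i,j=1}^{n} \partial_j\left( \frac{f_{ii} f_j - f_{ij} f_i}{w^2} \right).
\end{align*}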
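Carrying out the differentiation in Lemma 3.24 gives a pointwise formula that can be tested against a classical case. This is a sketch in the lemma's notation, with $w^2 = 1 + |\partial f|^2$, repeated indices summed, and Δ the Euclidean Laplacian. The comparison for n = 2 assumes the standard formula $K = \det(\partial^2 f)/w^4$ for the Gauss curvature of a graph, which is not taken from the text.

```latex
\begin{align*}
\partial_j\bigl(f_{ii}f_j - f_{ij}f_i\bigr)
  &= f_{ii}f_{jj} - f_{ij}f_{ij}
   = (\Delta f)^2 - |\partial^2 f|^2
   && \text{(the third derivatives cancel)},\\
\partial_j\bigl(w^{-2}\bigr) &= -2\,w^{-4} f_k f_{kj},\\
R_g &= \frac{(\Delta f)^2 - |\partial^2 f|^2}{w^{2}}
     - \frac{2\bigl[(\Delta f)\, f_{jk}f_jf_k - f_{ij}f_{jk}f_if_k\bigr]}{w^{4}} .
\end{align*}
% Case n = 2: write H = \partial^2 f and p = \partial f.
% Then (tr H)^2 - |H|^2 = 2 det H, and Cayley-Hamilton,
% H^2 = (tr H) H - (det H) I, gives
% (tr H) H(p,p) - |Hp|^2 = (det H) |p|^2. Therefore
\begin{align*}
R_g &= \frac{2\det H}{w^2} - \frac{2(\det H)\,|p|^2}{w^4}
     = \frac{2\det H\,(w^2 - |p|^2)}{w^4}
     = \frac{2\det H}{w^4} = 2K .
\end{align*}
```

This matches the fact that the scalar curvature of a surface is twice its Gauss curvature.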
Proof of Theorem 3.23. Again, we use the coordinates $x \mapsto (x, f(x))$. In
these coordinates,
$$g_{ij} = \langle \partial_i + f_i \partial_{n+1},\ \partial_j + f_j \partial_{n+1} \rangle = \delta_{ij} + f_i f_j.$$
Therefore our hypotheses imply that the x coordinates are asymptotically
flat coordinates for M.
By our hypotheses, $R_g$ is not just integrable over M with respect to $d\mu_g$,
but also with respect to Euclidean $d\bar\mu$. Applying Lemma 3.24 and the
divergence theorem, we see that nonnegativity of $R_g$ implies that
\begin{align*}
0 \le \int_M R_g\, d\bar\mu
 &= \int_M \sum_{i,j=1}^{n} \partial_j\left( \frac{f_{ii} f_j - f_{ij} f_i}{1 + |\partial f|^2} \right) d\bar\mu\\
 &= \lim_{r\to\infty} \int_{S_r} \sum_{i,j=1}^{n} \frac{f_{ii} f_j - f_{ij} f_i}{1 + |\partial f|^2}\, \nu^j\, d\bar\mu_{S_r}. \tag{3.8}
\end{align*}
We claim that the expression on the right is just the ADM mass, up to a
positive constant. The integrand in the ADM mass expression is
$$(g_{ij,i} - g_{ii,j}) \frac{x^j}{|x|} = [(f_i f_j)_i - (f_i f_i)_j]\, \nu^j = (f_{ii} f_j - f_{ij} f_i)\, \nu^j,$$
which is the same as the integrand in (3.8) except for the factor of $1 + |\partial f|^2$,
which is $1 + O(|x|^{-q})$ by hypothesis.
The rigidity part of the argument is more complicated, so we merely
outline the idea. Suppose that the mass is zero. We will show that f must
be constant. If it is not, then by Sard’s Theorem, there is a smooth level
set Σ := f −1 (c), where c < limx→∞ f (x). If one applies the divergence
theorem argument above but also uses Σ as an inner boundary, one obtains
the formula

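$$2(n-1)\omega_{n-1}\, m_{ADM}(M, g) = \int_\Sigma \frac{|\partial f|^2}{1 + |\partial f|^2}\, H_\Sigma\, d\bar\mu_\Sigma + \int_{\{f(x) > c\}} R_g\, d\bar\mu, \tag{3.9}$$
where $H_\Sigma$ is the mean curvature of Σ inside Euclidean $\mathbb{R}^n$. (Prove this as
an exercise.)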
The key theorem proved by Huang and Wu in [HW13] is that nonnega-
tive scalar curvature of the hypersurface M implies that the mean curvature
of M in Rn+1 cannot change sign. Using this, they can then show that HΣ
also cannot change sign. Thus HΣ ≥ 0. It is a well-known fact that there
are no compact minimal hypersurfaces in Rn . (It follows from the fact that
coordinate functions restricted to Σ are harmonic.) Therefore HΣ ≥ 0 and
HΣ > 0 somewhere, and it follows that mADM (M, g) > 0 by formula (3.9),
which is a contradiction.
3.2.4. Axisymmetric case. Axisymmetry refers to a global $S^1 \cong SO(2)$
symmetry of the metric. The axisymmetric three-dimensional case of the
positive mass theorem was proved by Dieter Brill, and historically this was
the first case that was proven. In fact, Brill’s work predates the work of
Arnowitt, Deser, and Misner, and Brill’s use of mass was a precursor to the
eventual formulation of ADM mass.
Theorem 3.25 (Brill [Bri59]). Let g be an asymptotically flat2 metric on

R3 that is invariant under rotations around the z-axis (we will take this as
our simplified definition of “axisymmetric”), and assume that g has nonneg-
ative scalar curvature. Then the ADM mass of g is nonnegative. Further-
more, it is zero if and only if g is Euclidean.
Proof. The proof begins by choosing cylindrical coordinates (ρ, ϕ, z), where
x = ρ cos ϕ and y = ρ sin ϕ. We consider metrics of the following form:
$$g = e^{-2U+2\alpha}(d\rho^2 + dz^2) + \rho^2 e^{-2U}(d\varphi + \rho B\, d\rho + A\, dz)^2$$
for some functions U, α, A, and B, which are all independent of the angle
variable ϕ. Moreover, assume α vanishes on the z-axis. Clearly, this metric
is invariant under rotations around the z-axis. In particular, $\frac{\partial}{\partial\varphi}$ is a Killing
field. It is a nonobvious, nontrivial fact that any axisymmetric metric on
R3 can be written in the above form, after suitable choice of coordinates.
For a rigorous proof of this fact (which involves construction of isothermal
coordinates), see the work of Chruściel [Chr08, Theorem 2.7], who extended
Brill’s result to simply connected axisymmetric manifolds with more than
one end. A strong enough asymptotic flatness assumption will guarantee
that the functions U , α, A, and B and their derivatives have decay rates
strong enough for the argument below to go through.
2 Technically, this theorem requires appropriate decay of five derivatives of the metric rather
than the usual two. See [Chr08] for details.

Once we have the above form for the metric, the proof follows a fairly
straightforward divergence theorem argument. However, we will only pro-
vide an outline, because the computations are rather involved. The inter-
ested reader may wish to supply the details. (There is an active literature
on axisymmetric metrics, and if one wishes to explore that field, then repro-
ducing these computations would be a worthwhile exercise.)
First, we try to compute the ADM mass in terms of the functions U , α,
A, and B. To do this, first take the above form of the metric and write it
in x, y, z coordinates so that we can use our formula (3.7) for ADM mass.
Recall that we can use any exhaustion to compute the mass, so it makes
sense to use a cylinder. For each r, let Cr denote the cylindrical region
where −r < z < r and ρ < r. Let Wr denote the lateral boundary of this
region where −r < z < r and ρ = r. After many computations, one can
show that

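$$m = \lim_{r\to\infty} \frac{1}{4\pi} \left[ \int_{\partial C_r} \partial\left(U - \tfrac{1}{2}\alpha\right) \cdot \nu\, d\mu_{\partial C_r} + \frac{1}{2} \int_{W_r} \frac{\alpha}{\rho}\, d\mu_{W_r} \right].$$
Since α vanishes on the z-axis, we can use the divergence theorem to obtain
\begin{align*}
m &= \lim_{r\to\infty} \frac{1}{4\pi} \left[ \int_{C_r} \Delta\left(U - \tfrac{1}{2}\alpha\right) d\mu + \frac{1}{2} \int_0^{2\pi}\!\!\int_{-r}^{r}\!\!\int_0^{r} \frac{\partial\alpha}{\partial\rho}\, d\rho\, dz\, d\varphi \right]\\
  &= \lim_{r\to\infty} \frac{1}{4\pi} \int_{C_r} \left[ \Delta\left(U - \tfrac{1}{2}\alpha\right) + \frac{1}{2\rho} \frac{\partial\alpha}{\partial\rho} \right] d\mu.
\end{align*}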
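Meanwhile, to compute R, one can show that
$$e^{-2U+2\alpha} R = 4\Delta\left(U - \tfrac{1}{2}\alpha\right) - 2|\partial U|^2 + \frac{2}{\rho} \frac{\partial\alpha}{\partial\rho} - \frac{1}{2} \rho^2 e^{-2\alpha} (\rho B_z - A_\rho)^2,$$
where the subscripts on A and B denote partial differentiations. Combined
with the above, we obtain
$$m = \lim_{r\to\infty} \frac{1}{16\pi} \int_{C_r} \left[ e^{-2U+2\alpha} R + 2|\partial U|^2 + \frac{1}{2} \rho^2 e^{-2\alpha} (\rho B_z - A_\rho)^2 \right] d\mu,$$
which we can easily see is nonnegative as long as R ≥ 0.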
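The step "combined with the above" is the following substitution. It is a sketch that only rearranges the formula for $e^{-2U+2\alpha}R$ and inserts it into the integrand of the mass formula obtained from the divergence theorem.

```latex
\begin{align*}
4\Bigl[\Delta\bigl(U - \tfrac12\alpha\bigr)
      + \frac{1}{2\rho}\frac{\partial\alpha}{\partial\rho}\Bigr]
 &= 4\Delta\bigl(U - \tfrac12\alpha\bigr)
    + \frac{2}{\rho}\frac{\partial\alpha}{\partial\rho}\\
 &= e^{-2U+2\alpha}R + 2|\partial U|^2
    + \tfrac12\,\rho^2 e^{-2\alpha}(\rho B_z - A_\rho)^2 .
\end{align*}
% Dividing by 4 turns the prefactor 1/(4\pi) into 1/(16\pi).
```

Each term on the right-hand side is nonnegative when R ≥ 0, which is the source of the sign of the mass.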

We now briefly sketch the rigidity argument. Suppose m = 0. Then R,
∂U , and ρBz − Aρ all vanish. This implies that U also vanishes because of
its asymptotics. The equation for R above then tells us that
$\Delta\alpha - \frac{1}{\rho} \frac{\partial\alpha}{\partial\rho} = 0$,
and then a maximum principle argument (together with the asymptotics of
α) tells us that α vanishes also. Finally, the vanishing of ρBz − Aρ tells
us that ρBdρ + Adz is closed and hence equal to dλ for some λ. Hence
g = dρ2 + dz 2 + ρ2 (d(ϕ + λ))2 , and after a simple coordinate change, this
becomes the Euclidean metric in cylindrical coordinates on R3 .
3.2.5. Locally conformally flat manifolds. In this section we will state

a version of the positive mass theorem that is historically important since it
was needed for Schoen’s resolution of the Yamabe problem (Theorem 1.32)
in higher dimensions [Sch84]. However, since the proof is not so closely
related to anything else in this book, we will omit the proof and offer only
a brief discussion. For a more complete discussion, see [SY94, Chapter 6,
SY88].
A Riemannian manifold is called locally conformally flat if every point
has a neighborhood in which the metric is conformal to the Euclidean metric.
By stereographic projection, this is equivalent to saying that every point
has a neighborhood that is conformal to an open subset of the round sphere.
In this way, we can think of a locally conformally flat manifold as having an
atlas of charts in $S^n$ such that the transition functions for the atlas are
conformal transformations from one open set in $S^n$ to another.
It is a classical fact that every conformal transformation from an open
set in S n to another can be uniquely extended to a global conformal trans-
formation of the entire round sphere S n . These transformations are usually
called Möbius transformations, and they are known to be generated by so-
called “inversions” through hyperspheres in S n . Every simply connected lo-
cally conformally flat manifold admits a conformal immersion into S n called
its developing map. Essentially, the developing map can be constructed by
“patching together” the various conformal local charts into S n . The basic
reason why this works is that the conformal transition functions uniquely
extend to conformal maps of the entire sphere.
We note that one can characterize the condition of being locally conformally
flat by the vanishing of the Weyl tensor in dimension at least 4, or the
vanishing of the Cotton tensor in dimension 3. In dimension 2, all metrics are
locally conformally flat. (This is just the statement that there always exist
isothermal coordinates near any point.)
We now briefly touch on the relationship to the Yamabe problem (see
[LP87]). T. Aubin [Aub76] strengthened N. Trudinger’s earlier work
[Tru68] on the Yamabe problem by proving the following.
Theorem 3.26. Let $(M^n, g)$ be a compact Riemannian manifold with n ≥ 6,
and assume that g is not locally conformally flat somewhere (or in other
words, the Weyl tensor does not vanish identically). Then there exists a
metric conformal to g which has constant scalar curvature.
Trudinger had already shown that the Yamabe problem could be solved
on conformal classes that do not contain positive scalar curvature metrics.
Let (M, g) be a compact Riemannian manifold with positive scalar curvature.
Then for any p ∈ M, one can show that there exists a positive Green
function $G_p$ for the conformal Laplacian $L_g$ at p. This means that $G_p$ solves
$L_g G_p = \delta_p$ in the sense of distributions, where $\delta_p$ is the Dirac delta
distribution at p. Because of the way $L_g$ transforms under conformal change, it
follows that a positive Green function exists for any metric g that is
conformal to a metric with positive scalar curvature. If we use $G_p$ as a
conformal factor on $(M \setminus \{p\}, g)$, then one can show that
$\tilde g := G_p^{\frac{4}{n-2}} g$ is asymptotically flat on $M \setminus \{p\}$. Since
$L_g G_p = 0$ on $M \setminus \{p\}$, the metric $\tilde g$ is scalar-flat.
Note that applying this procedure to the round sphere results in the Eu-
clidean metric on the punctured sphere. (This is essentially stereographic
projection.)
R. Schoen’s crucial observation in [Sch84] was that positivity of the
ADM mass of g̃ implies solvability of the Yamabe problem. At the time,
Schoen and Yau had already proved the positive mass theorem in low di-
mensions. In higher dimensions, in light of Theorem 3.26, a fully general
positive mass theorem was not needed but rather only a version for locally
conformally flat manifolds. This discussion points us to the relevance of the
following case of the positive mass theorem.
Theorem 3.27 (Schoen-Yau [SY88]). Let n ≥ 4, and let $(M^n, g)$ be a
compact Riemannian manifold such that g is conformal to a metric with
positive scalar curvature. Let p ∈ M, and let $G_p$ be the Green function for
the conformal Laplacian $L_g$ at p. If g is locally conformally flat, then the
ADM mass of $\tilde g := G_p^{\frac{4}{n-2}} g$ is positive unless (M, g) is conformal to the
round sphere (in which case the ADM mass is zero).
As stated above, we omit the proof and only mention that the proof
involves a careful study of the developing map.
3.2.6. Kähler case. Another case where a divergence theorem argument

works well is the Kähler case. Hans-Joachim Hein and Claude LeBrun
proved a version of the positive mass theorem that holds for all Kähler
metrics [HL16]. We omit the proof since it is primarily a result in Kähler
geometry, but we can understand the case when the underlying complex
manifold is Cn . In this case, the Ricci form is exact, and thus the scalar
curvature can be written as a divergence (with respect to the Kähler met-
ric). Just as in Theorem 3.23, one then applies the divergence theorem and
shows that the boundary integral is the same as the ADM boundary integral
in the limit.
Exercise 3.28. For readers who are knowledgeable about Kähler geometry:
let n > 1, and let $(\mathbb{C}^n, J, g)$ be a complete Kähler manifold that is also
asymptotically flat (of real dimension 2n), where J is the standard complex
structure on $\mathbb{C}^n$. Following the sketch above, prove that if g has nonnegative
scalar curvature, then the ADM mass is nonnegative.
More generally, when the underlying complex manifold is more complicated,
the Ricci form is not exact, but the failure of exactness occurs on a
divisor corresponding to the nontrivial canonical line bundle. Hein and Le-
Brun essentially prove that this failure of exactness contributes to the mass
a quantity equal to the volume of the divisor, which is positive. In fact, they
prove a more general result for all asymptotically locally Euclidean Kähler
manifolds that gives a formula for the mass in terms of (complex) topological
invariants and the integral of the scalar curvature.
3.2.7. Two-dimensional case. Here we discuss a two-dimensional version

of the positive mass theorem, first noted in the literature by Willie Wai
Yeung Wong [Won12]. This is not really a special case of the positive
mass theorem, but rather a “toy model” of it. It turns out that for surfaces
with nonnegative Gauss curvature, asymptotic flatness is too strong of a
condition to be interesting (as we shall see in a moment). Instead, we
consider asymptotically conical surfaces.
Definition 3.29. A Riemannian surface $(M^2, g)$ is said to be asymptotically
conical if there exists a bounded set K such that $M \setminus K$ is a finite union of
ends $M_1, \ldots, M_\ell$ such that for each $M_k$, there exists a diffeomorphism
$$\Phi_k : M_k \to \mathbb{R}^2 \setminus B_1(0) \cong (1, \infty) \times S^1,$$
such that under this diffeomorphism, we have
$$g = dr^2 + r^2\, d\theta^2 + O_1(r^{-q})$$
for some q > 0, where r is a coordinate on (1, ∞) and $d\theta^2$ represents the
metric on the $S^1$ factor whose length is 2πα for some constant α > 0. The
parameter α is called the cone angle of that end.
Here, we have to be a little careful about what is meant by $O_1(r^{-q})$. We
mean that the quantity is a 2-tensor τ with the property that
$|\tau|_{\bar g} + r|\bar\nabla\tau|_{\bar g} = O(r^{-q})$, where the computations are with respect to
the background cone metric $\bar g = dr^2 + r^2\, d\theta^2$.
Theorem 3.30 (Analog of positive mass theorem in two dimensions). Let
$(M^2, g)$ be a complete asymptotically conical surface with nonnegative Gauss
curvature. Then each end has cone angle at most 1, and if any cone angle
is equal to 1, then (M, g) must be the Euclidean plane.
Thus the cone angle (or perhaps 1 minus the cone angle) plays a role
analogous to that of the mass in higher dimensions.
Proof. Let $M_\rho$ be the compact region whose boundary $\partial M_\rho$ is the union of
the circles {r = ρ} in each end. Invoking the Gauss-Bonnet Theorem for
$M_\rho$, we obtain
$$\int_{M_\rho} K_g\, d\mu_g = 2\pi\chi(M_\rho) - \int_{\partial M_\rho} \kappa\, ds.$$
By the implicit assumption of connectedness, $2\pi\chi(M_\rho) \le 2\pi$. Meanwhile,
the asymptotically conical assumption implies that for $\partial M_\rho$,
$\kappa = \frac{1}{\rho} + O(\rho^{-q-1})$, and the length of $\partial M_\rho$ is
$2\pi\rho \sum_k \alpha_k + O(\rho^{1-q})$, where we sum over the cone angles $\alpha_k$ of all
ends. Therefore
$$\int_{M_\rho} K_g\, d\mu_g \le 2\pi\left(1 - \sum_k \alpha_k\right) + O(\rho^{-q}).$$
Taking the limit as ρ → ∞, we obtain $\sum_k \alpha_k \le 1$ as desired. (Actually,
note that for this conclusion, we do not require a sign on $K_g$ but only on the
integral of $K_g$.)
In the case where one of the ends has cone angle 1, we immediately see
that it must be the only end, and $\int_M K_g\, d\mu_g = 0$. But this is only possible if
$K_g = 0$ identically, which means that M is flat. Therefore (M, g) is flat and
has one planar end. Then (M, g) must be Euclidean by the same argument
used in Exercise 2.33.
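For a one-ended surface, the argument in this proof yields an exact formula for the total curvature. The following sketch assumes in addition that M is diffeomorphic to the plane, so that $\chi(M_\rho) = 1$ for large ρ; that topological assumption is not part of Theorem 3.30.

```latex
\begin{align*}
\int_{\partial M_\rho}\kappa\, ds
  &= \frac{1}{\rho}\bigl(2\pi\alpha\rho + O(\rho^{1-q})\bigr) + O(\rho^{-q})
   = 2\pi\alpha + O(\rho^{-q}),\\
\int_{M_\rho} K_g\, d\mu_g
  &= 2\pi\chi(M_\rho) - \int_{\partial M_\rho}\kappa\, ds
   = 2\pi(1-\alpha) + O(\rho^{-q}),\\
\lim_{\rho\to\infty}\int_{M_\rho} K_g\, d\mu_g &= 2\pi(1-\alpha).
\end{align*}
% Example: a flat cone with cone angle \alpha = 1/2 whose tip has been
% smoothed out has total curvature \pi, all of it supported in the
% smoothed region near the tip.
```

In this sense the deficit $1 - \alpha$ measures the total curvature, just as the mass is controlled by the scalar curvature in the conformally flat case.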
3.3. Reduction to Theorem 1.30

We will now explain the general proof of the positive mass theorem (The-
orem 3.18). Schoen and Yau first proved the dimension n = 3 case for
asymptotically Schwarzschild spaces in [SY79c], generalizing to asymptot-
ically flat spaces in [SY81a]. They announced the n < 8 case in [SY79a],
and the n = 8 case follows from the work of N. Smale [Sma93] (Theorem
2.24). All of the relevant arguments used by Schoen and Yau to prove the
theorem for n < 8 are nicely summarized in [Sch89]. The most important
difference between our presentation here and that of [Sch89] is that we
take advantage of Lohkamp’s simplification of the proof in [Loh99], which
is what allows us to reduce the positive mass theorem to Theorem 1.30, or
more precisely, its corollary, Corollary 2.32. Using this reduction, we see
that the positive mass theorem in general dimension is a consequence of
Theorem 1.30 in general dimension.
3.3.1. Reduction to Corollary 2.32. In this section we will need to solve

some elliptic PDEs on asymptotically flat manifolds, so now might be a good
time to read through Section A.2 in the Appendix. At the very least you will
need familiarity with weighted Sobolev spaces and weighted Hölder spaces
on asymptotically flat manifolds.
We first reduce the positive mass theorem to the scalar-flat case.

Lemma 3.31. Let (M, g) be a complete asymptotically flat manifold such
that the scalar curvature R is nonnegative everywhere and positive somewhere.
Further assume that $R \in C^{0,\alpha}_{s-2}$ for some $s < -\frac{n-2}{2}$ and α ∈ (0, 1).
Then g is conformal to a scalar-flat complete asymptotically flat metric $\tilde g$
such that $m_{ADM}(M_k, \tilde g) < m_{ADM}(M_k, g)$ for each end $M_k$.
Remark 3.32. The assumption that $R \in C^{0,\alpha}_{s-2}$ is undesirable since it is not
part of our definition of asymptotic flatness (which only assumes $C^0_{s-2}$). For
the rest of this section we will gloss over this point, but in Section 3.3.3, we
will provide an alternative proof of the positive mass theorem that obviates
the need for this assumption.
Proof. Let $(M^n, g)$ be a complete asymptotically flat manifold with $R_g \ge 0$.
We seek a positive function u such that $\tilde g := u^{\frac{4}{n-2}} g$ is scalar-flat and u
approaches 1 at infinity on each end. By (1.8), this is equivalent to solving
$$L_g u := -\frac{4(n-1)}{n-2} \Delta_g u + R_g u = 0,$$
with boundary condition u(x) → 1 as x → ∞ on each end. Setting v = u − 1,
this is equivalent to
$$L_g v = -\frac{4(n-1)}{n-2} \Delta_g v + R_g v = -R_g,$$
with boundary condition v(x) → 0 as x → ∞ on each end. Since $R_g$ decays
faster than $O(|x|^{-2})$, it is easy to see that we can think of $L_g$ as a map
between weighted Sobolev spaces
$$L_g : W^{2,p}_{-q}(M) \to L^p_{-q-2}(M) \tag{3.10}$$
for any p ≥ 1 and any real number q. (These spaces are defined in
Definition A.20. See also Exercise A.31.) Since we seek to solve $L_g v = -R_g$,
we need $-R_g \in L^p_{-q-2}$, so we must choose q to be less than the asymptotic
decay rate of g in Definition 3.5. Since we also want the solution v to decay
to zero, we need to choose q > 0 and p > n/2. With this choice of p and q,
weighted Sobolev embedding (Theorem A.25) guarantees that
$W^{2,p}_{-q} \subset C^0_{-q}$, so that the domain of (3.10) consists of pointwise decaying
functions.

Next, since we want the operator (3.10) to be surjective, we will also
take q < n − 2, since this is the rate needed for surjectivity of the Laplacian
on these spaces (see Theorem A.40). To summarize, we have assumed that
p > n/2, 0 < q < n − 2, and q is less than the asymptotic decay rate of g.
With these assumptions in place, Corollary A.42 tells us that surjectivity
of (3.10) is equivalent to injectivity. Suppose Lg w = 0. As mentioned above
w approaches zero at infinity, and by elliptic regularity (Theorem A.4), w
is smooth. Since Rg ≥ 0, Lg satisfies a maximum principle (Theorem A.2),

and thus w must be identically zero. This proves our claim that Lg is an
isomorphism.
Hence, we have our desired solution $v \in W^{2,p}_{-q} \subset C^0_{-q}$ such that
$L_g v = -R_g$, so that u = 1 + v solves $L_g u = 0$. By elliptic regularity, both v
and u are smooth. Since $R_g$ is nontrivial, u cannot be identically 1, so by the
maximum principle (Theorem A.2) together with the fact that u approaches
1 at infinity in each end, we know that 0 < u < 1 everywhere. In particular,
we can use u as a conformal factor, and then the metric $\tilde g = u^{\frac{4}{n-2}} g$ is
scalar-flat, by construction.
However, it is not clear that $\tilde g$ is asymptotically flat. Here is where we invoke our Hölder decay assumption on $R_g$. For this step we place another restriction, $q < -s$, on our choice of $q$ so that $R_g \in C^{0,\alpha}_{-q-2}$. Now we can use the weighted elliptic Hölder regularity (Theorem A.33) on the equation $L_g v = -R_g$ to see that $v \in C^{2,\alpha}_{-q}$. In particular, the conformal factor $u = 1 + O_2(|x|^{-q})$ in each end, so as long as we choose $q > \frac{n-2}{2}$, $\tilde g$ is asymptotically flat by Exercise 3.12.
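The only thing left to check is that $m_{\mathrm{ADM}}(M_k, \tilde g) < m_{\mathrm{ADM}}(M_k, g)$ on each end. Let us first consider the case of one end $M_1$. Then Exercise 3.12 says that
\[
\begin{aligned}
m_{\mathrm{ADM}}(M_1, \tilde g) - m_{\mathrm{ADM}}(M_1, g)
&= \lim_{\rho\to\infty} \frac{-2}{(n-2)\omega_{n-1}} \int_{S_\rho} \frac{\partial u}{\partial r}\, d\mu_{S_\rho} \\
&= \lim_{\rho\to\infty} \frac{-2}{(n-2)\omega_{n-1}} \int_{S_\rho} \nabla u \cdot \nu \, d\mu_{(S_\rho, g)} \\
&= \frac{-2}{(n-2)\omega_{n-1}} \int_M \Delta_g u \, d\mu_g \\
&= \frac{-2}{(n-2)\omega_{n-1}} \int_M \frac{n-2}{4(n-1)} R_g u \, d\mu_g \\
&< 0,
\end{aligned}
\]
where we used the divergence theorem in the third line and the defining equation for $u$ in the next one. This completes the proof for the case of one end.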
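For later comparison, the constants in the last two lines of this computation can be combined. The following is only arithmetic on the displayed formula, using $\Delta_g u = \frac{n-2}{4(n-1)} R_g u$, which is a rearrangement of $L_g u = 0$.

```latex
\begin{align*}
m_{\mathrm{ADM}}(M_1, \tilde g) - m_{\mathrm{ADM}}(M_1, g)
  &= \frac{-2}{(n-2)\,\omega_{n-1}} \cdot \frac{n-2}{4(n-1)}
     \int_M R_g\, u \, d\mu_g \\
  &= \frac{-1}{2(n-1)\,\omega_{n-1}} \int_M R_g\, u \, d\mu_g .
\end{align*}
% The integrand is nonnegative because R_g >= 0 and u > 0, and it is not
% identically zero because R_g is nontrivial, so the difference is negative.
```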
Note that if we directly apply this argument to the case of multiple ends, it only shows that the sum of the masses of the ends goes down. Instead, choose $\epsilon > 0$ small enough so that the region $\{x \mid u(x) > 1 - \epsilon\}$ is entirely contained in the asymptotically flat ends. By Sard's Theorem [Wik, Sard's theorem], we can choose $\epsilon$ so that the level set $\{x \mid u(x) = 1 - \epsilon\}$ is smooth. Let $\Sigma_k$ denote the component of that level set in the end $M_k$. Then we can apply the same argument as above to the region $\{x \in M_k \mid u(x) > 1 - \epsilon\}$ instead of all of $M$ to see that in each end $M_k$, we have
\[ m_{\mathrm{ADM}}(M_k, \tilde g) - m_{\mathrm{ADM}}(M_k, g) \le \frac{-2}{(n-2)\omega_{n-1}} \int_{\Sigma_k} \frac{\partial u}{\partial \nu}\, d\mu_{\Sigma_k}, \]
where $\nu$ is the outward-pointing normal for $\Sigma_k$. By the strong maximum principle (Theorem A.2), $\frac{\partial u}{\partial \nu} > 0$ on $\Sigma_k$, and hence the result follows.
In light of the previous lemma, the general case of the positive mass
theorem (Theorem 3.18) follows from the scalar-flat case. Our next step is
to reduce to the harmonically flat case.
Definition 3.33. Let $n \ge 3$. We say that $(M^n, g)$ is harmonically flat outside a bounded set if there exists a bounded set $K$ such that $M \setminus K$ is a finite union of ends $M_1, \dots, M_\ell$ such that for each $M_k$, there exist an $r_k > 0$ and a diffeomorphism
\[ \Phi_k : M_k \longrightarrow \mathbb{R}^n \setminus \bar B_{r_k}(0), \]
such that in this coordinate chart, we have
\[ g_{ij}(x) = u_k(x)^{\frac{4}{n-2}} \delta_{ij}, \]
where $u_k$ is harmonic with respect to the Euclidean metric, and $u_k(x) \to 1$ as $x \to \infty$.
By Corollary A.19, each of these harmonic functions $u_k$ can be expanded as $u_k = 1 + A_k |x|^{2-n} + O_2(|x|^{1-n})$ for some $A_k$. By Exercise 3.13, any metric $g$ that is harmonically flat outside a bounded set $K$ is automatically asymptotically flat, and the ADM mass of the end $M_k$ is just $2A_k$. In
fact, g is asymptotically Schwarzschild. Moreover, by equation (1.8), the
metric g is also scalar-flat outside K. This harmonically flat condition is
useful, but it is quite special. However, the following theorem of Schoen
and Yau [SY81a] shows that these metrics are dense in the space of all
scalar-flat asymptotically flat metrics with nonnegative scalar curvature.
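As a sanity check on the expansion $u_k = 1 + A_k|x|^{2-n} + \dots$, the leading term is itself harmonic away from the origin. The computation below is a sketch using the Euclidean Laplacian of a radial function, $\Delta f = f'' + \frac{n-1}{r} f'$ with $r = |x|$; the symbols $f$ and $r$ are introduced here only for illustration.

```latex
\begin{align*}
f(r) &= r^{2-n}, \qquad
f'(r) = (2-n)\, r^{1-n}, \qquad
f''(r) = (2-n)(1-n)\, r^{-n}, \\
\Delta f &= f'' + \frac{n-1}{r}\, f'
          = (2-n)\, r^{-n} \bigl[(1-n) + (n-1)\bigr] = 0
          \quad \text{for } r > 0 .
\end{align*}
% So u = 1 + A |x|^{2-n} is harmonic on R^n minus the origin and tends to 1.
% By the statement above (Exercise 3.13), the corresponding end has ADM mass 2A;
% writing A = m/2 gives the model conformal factor 1 + (m/2)|x|^{2-n} of mass m.
```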
Lemma 3.34 (Density lemma for scalar-flat metrics). Let $(M^n, g)$ be a scalar-flat complete asymptotically flat manifold. Let $p > n/2$ and $q < n - 2$ such that $q$ is less than the asymptotic decay rate of $g$ in Definition 3.5. Then for any $\epsilon > 0$, there exists a scalar-flat complete asymptotically flat metric $\tilde g$ on $M$ that is also harmonically flat outside a compact set, such that $\|\tilde g - g\|_{W^{2,p}_{-q}} < \epsilon$.
Proof. Let (M n , g) be a complete asymptotically flat manifold with Rg = 0,

and choose p, q as in the hypotheses of the lemma. Let χ be a smooth
nonnegative cut-off function on Rn that is equal to 1 on B1 and vanishes
outside B2 . For λ ≥ 1, define χλ (x) = χ(x/λ). For λ large enough, we can
think of χλ as being defined on M by extending it to be 1 on the compact
region of M . Let ḡ be a smooth background metric equal to the Euclidean

metric on each asymptotically flat end. Define
\[ g_\lambda := \chi_\lambda g + (1 - \chi_\lambda)\bar g, \]
so that $g_\lambda = g$ for $|x| < \lambda$, $g_\lambda = \bar g$ for $|x| > 2\lambda$, and $g_\lambda$ interpolates between the two in the annular region in between. Check that $g_\lambda \to g$ in $W^{2,p}_{-q}$ as $\lambda \to \infty$ because $q$ is smaller than the asymptotic decay rate of $g$.
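The convergence just mentioned can be sketched for the zeroth-order part of the norm. The block below is an illustration under two assumptions that are not taken from the text: that the weighted norm on an end has the form $\|f\|_{L^p_{-q}}^p = \int |f|^p |x|^{qp-n}\,dx$ (this should be compared with Definition A.20), and that $g - \bar g = O_2(|x|^{-q_0})$ for some rate $q_0 > q$. The constants $C$ and $q_0$ are hypothetical names.

```latex
% Assumed norm convention: \|f\|_{L^p_{-q}}^p = \int |f|^p |x|^{qp-n} dx  (check Definition A.20).
% Assumed decay: |g - \bar g| <= C |x|^{-q_0} with q_0 > q.
\begin{align*}
g_\lambda - g &= (1 - \chi_\lambda)(\bar g - g),
  \qquad \text{supported in } \{|x| \ge \lambda\}, \\
\|g_\lambda - g\|_{L^p_{-q}}^p
  &\le C^p \int_{|x| \ge \lambda} |x|^{-q_0 p}\, |x|^{qp - n}\, dx
   = C^p\, \omega_{n-1} \int_\lambda^\infty r^{(q - q_0)p - 1}\, dr \\
  &= \frac{C^p\, \omega_{n-1}}{(q_0 - q)\, p}\; \lambda^{(q - q_0)p}
   \;\longrightarrow\; 0 \quad \text{as } \lambda \to \infty .
\end{align*}
% Derivative terms are handled the same way: each derivative of \chi_\lambda
% costs a factor of order 1/\lambda, comparable to 1/|x| on the annulus
% \lambda <= |x| <= 2\lambda, which matches the shift in the weight.
```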
We attempt to find a conformal factor $u_\lambda$ such that $\tilde g_\lambda := u_\lambda^{\frac{4}{n-2}} g_\lambda$ will be scalar-flat. Just as in the proof of Lemma 3.31, if we set $v_\lambda = u_\lambda - 1$, this boils down to solving
\[ L_{g_\lambda} v_\lambda = -\frac{4(n-1)}{n-2}\Delta_{g_\lambda} v_\lambda + R_{g_\lambda} v_\lambda = -R_{g_\lambda}, \]
with $v_\lambda$ vanishing at the infinity of each end. Assume without loss of generality that $q > 0$, and consider the operator
\[ L_{g_\lambda} : W^{2,p}_{-q}(M) \longrightarrow L^p_{-q-2}(M). \]
We claim that this is an isomorphism. In Lemma 3.31, this was proved by invoking the maximum principle. In this case, it follows from the fact that $L_{g_\lambda}$ is a small perturbation of the Laplacian. More precisely, observe that $R_{g_\lambda} \to 0$ in $L^\infty_{-2}$ as $\lambda \to \infty$, and consequently $L_{g_\lambda} \to L_g = -\frac{4(n-1)}{n-2}\Delta_g$ in the strong operator topology. (Check this.) Since $p > 1$ and $0 < q < n - 2$, Theorem A.40 says that $\Delta_g$ is an isomorphism, and then it follows that $L_{g_\lambda}$ is also an isomorphism for $\lambda$ larger than some fixed $\lambda_0$. Moreover, we will have a uniform injectivity estimate
\[ \|w\|_{W^{2,p}_{-q}} \le C \|L_{g_\lambda} w\|_{L^p_{-q-2}} \tag{3.11} \]
for some $C$ independent of $\lambda > \lambda_0$, for all $w \in W^{2,p}_{-q}$.
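One standard way to see where a uniform constant in an estimate like (3.11) can come from is an absorption argument. The sketch below assumes that the convergence of $L_{g_\lambda}$ to $L_g$ holds in operator norm between the two weighted spaces; the names $C_0$ and $\delta_\lambda$ are introduced here for illustration and do not appear in the text.

```latex
% Assumptions (hypothetical names):
%   \|w\|_{W^{2,p}_{-q}} <= C_0 \|L_g w\|_{L^p_{-q-2}}          (L_g is an isomorphism),
%   \|(L_{g_\lambda} - L_g) w\|_{L^p_{-q-2}} <= \delta_\lambda \|w\|_{W^{2,p}_{-q}},
%   with \delta_\lambda -> 0 as \lambda -> \infty.
\begin{align*}
\|w\|_{W^{2,p}_{-q}}
  &\le C_0\, \|L_g w\|_{L^p_{-q-2}}
   \le C_0\, \|L_{g_\lambda} w\|_{L^p_{-q-2}}
      + C_0\, \delta_\lambda\, \|w\|_{W^{2,p}_{-q}} .
\end{align*}
% Once \lambda is large enough that C_0 \delta_\lambda <= 1/2, the last term
% can be absorbed into the left side:
\begin{align*}
\|w\|_{W^{2,p}_{-q}} \le 2 C_0\, \|L_{g_\lambda} w\|_{L^p_{-q-2}} .
\end{align*}
% Surjectivity follows in the same regime from the Neumann series for
% I + L_g^{-1}(L_{g_\lambda} - L_g), whose perturbation has norm at most 1/2.
```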
Therefore we can solve $L_{g_\lambda} v_\lambda = -R_{g_\lambda}$ for large $\lambda$. Since $q$ is smaller than the asymptotic decay rate of $g$, we can check that $R_{g_\lambda} \to 0$ in $L^p_{-q-2}$. Then the injectivity estimate (3.11) implies that $v_\lambda \to 0$ in $W^{2,p}_{-q}$. Elliptic regularity (Theorem A.4) ensures that $v_\lambda$ is smooth, and since $p > n/2$, weighted Sobolev embedding (Theorem A.25) implies that $v_\lambda \to 0$ in $C^0_{-q}$. In particular, for large enough $\lambda$, $u_\lambda = 1 + v_\lambda > 0$, so that we may use it as a conformal factor.
We claim that $\tilde g_\lambda = u_\lambda^{\frac{4}{n-2}} g_\lambda$ provides a sequence of metrics that will fulfill the requirements of the lemma for large enough $\lambda$. Observe that $\tilde g_\lambda$ is scalar-flat and harmonically flat in the region $|x| > 2\lambda$, by construction. The only thing left to check is that $\tilde g_\lambda \to g$ in $W^{2,p}_{-q}$. Since $\tilde g_\lambda = (1 + v_\lambda)^{\frac{4}{n-2}} g_\lambda$ and $\|v_\lambda\|_{W^{2,p}_{-q}} \to 0$, it follows that $\|\tilde g_\lambda - g_\lambda\|_{W^{2,p}_{-q}} \to 0$. Combining this with the fact that $g_\lambda \to g$ in $W^{2,p}_{-q}$ yields the desired result.
The following lemma shows that if we choose $p > n$ and $q > \frac{n-2}{2}$, then the ADM mass of the metric $\tilde g$ constructed in the previous lemma can be chosen to be arbitrarily close to that of $g$.
Lemma 3.35 (Convergence of ADM masses). Suppose that $g_i$ is a sequence of asymptotically flat metrics converging in $W^{2,p}_{-q}$ to a limit asymptotically flat metric $g$ on an exterior coordinate chart, where $p > n$ and $q > \frac{n-2}{2}$, and assume that $R_{g_i}$ converges to $R_g$ in $L^1$. Then the ADM mass of $g_i$ converges to the ADM mass of $g$.
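Proof. Assume the hypotheses of the lemma. Consider fixed $\rho_0 > 1$ and let $\rho > \rho_0$. We use the notation $S_\rho - S_{\rho_0}$ to mean $S_\rho \cup S_{\rho_0}$, oriented so that $S_\rho$ has outward normal while $S_{\rho_0}$ has inward normal. Then
\[
\begin{aligned}
\int_{S_\rho - S_{\rho_0}} (\operatorname{div} g - d(\operatorname{tr} g))(\nu)\, d\mu_{S_\rho}
&= \int_{\rho_0 < |x| < \rho} \left[-\Delta(\operatorname{tr} g) + \operatorname{div}(\operatorname{div} g)\right] d\mu \\
&= \int_{\rho_0 < |x| < \rho} (R_g - Q(g))\, d\mu,
\end{aligned}
\]
where $Q(g)$ is the quadratic expression from Exercise 1.18.

Taking the limit as $\rho \to \infty$ and using the definition of mass,
\[
2(n-1)\omega_{n-1}\, m_{\mathrm{ADM}}(g) = \int_{S_{\rho_0}} (\operatorname{div} g - d(\operatorname{tr} g))(\nu)\, d\mu_{S_{\rho_0}} + \int_{|x| > \rho_0} (R_g - Q(g))\, d\mu.
\]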
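And of course the same holds true for each $g_i$. Thus
\[
\begin{aligned}
2(n-1)\omega_{n-1}\bigl(m_{\mathrm{ADM}}(g) - m_{\mathrm{ADM}}(g_i)\bigr)
&= \int_{S_{\rho_0}} \bigl[(\operatorname{div} g - d(\operatorname{tr} g)) - (\operatorname{div} g_i - d(\operatorname{tr} g_i))\bigr](\nu)\, d\mu_{S_{\rho_0}} \\
&\quad + \int_{|x| > \rho_0} \bigl[(R_g - R_{g_i}) - (Q(g) - Q(g_i))\bigr]\, d\mu.
\end{aligned}
\]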
Since $p > n$, the $W^{2,p}_{\mathrm{loc}}$ convergence of $g_i$ to $g$ implies (via Sobolev embedding [Wik, Sobolev inequality]) that $g_i$ converges to $g$ in $C^1_{\mathrm{loc}}$. In particular, the flux integral at $S_{\rho_0}$ vanishes in the limit. Since $p > n$ and $q > \frac{n-2}{2}$, the $W^{2,p}_{-q}$ convergence of $g_i$ to $g$ also implies that $Q(g_i)$ converges to $Q(g)$ in $L^1$. (Check this.) And finally, we have the explicit hypothesis that $R_{g_i}$ converges to $R_g$ in $L^1$. Therefore the integral over the region $|x| > \rho_0$ also vanishes in the limit.
Exercise 3.36. In the previous lemma, show that we can replace the hypothesis that $R_{g_i}$ converges to $R_g$ in $L^1$ by the hypothesis that $R_{g_i}$ is uniformly bounded in $L^1_{-n-\delta}$ for some $\delta > 0$. Hint: Deal with the problematic term using a weighted Hölder inequality and choosing $\rho_0$ large.
The previous exercise shows that the ADM mass is continuous in the $W^{2,p}_{-q}$ topology, but only after being restricted to a family whose scalar curvatures are uniformly bounded in $L^1_{-n-\delta}$ for some $\delta > 0$.
Exercise 3.37. Prove that the ADM mass is NOT continuous as a function of asymptotically flat metrics on a fixed exterior coordinate chart, in the $W^{2,p}_{-q}$ topology, where $p > n$ and $q > \frac{n-2}{2}$. Hint: Construct a counterexample sequence by considering metrics that are globally conformal to Euclidean space.
Remark 3.38. If gi is a sequence of complete asymptotically flat metrics
converging to a limit asymptotically flat metric g uniformly on compact sub-
sets of a 3-manifold M , such that each (M, gi ) has nonnegative scalar curva-
ture and contains no minimal surfaces, then mADM (g) ≤ lim inf i→∞ mADM (gi ).
This is a very different sort of theorem than Lemma 3.35. While Lemma 3.35
is just a fact about asymptotics, the result mentioned here is a theorem
about the nature of nonnegative scalar curvature. In fact, the positive mass
theorem is a simple corollary of this result. See [Jau18, JL16, JL19].
Combining the previous three lemmas (Lemmas 3.31, 3.34, and 3.35),
we see that if there is a counterexample to the positive mass theorem (The-
orem 3.18), then there must exist a scalar-flat counterexample that is also
harmonically flat outside a compact set. Explicitly, given a complete asymp-
totically flat metric on a manifold M with nonnegative scalar curvature and
negative mass, Lemma 3.31 allows us to produce a scalar-flat example with
negative mass, then Lemma 3.34 allows us to find a small perturbation of
it that is scalar-flat and harmonically flat outside a compact set, and finally
Lemma 3.35 tells us that a small enough perturbation will still have negative
mass.
The original proof of Schoen and Yau constructed a complete stable
minimal hypersurface inside this potential counterexample and then used a
noncompact version of the argument in Proposition 2.25 to contradict the
positive mass theorem in one lower dimension (or the Gauss-Bonnet Theo-
rem in the base case of dimension 3). However, later work of J. Lohkamp
provides a much simpler argument, reducing the positive mass theorem in
the harmonically flat case to the case where the metric is actually Euclidean
outside a compact set, which is the case where Corollary 2.32 applies.
Lemma 3.39 (Lohkamp [Loh99]). Suppose there exists a counterexample
to the positive mass theorem (Theorem 3.18) on M that is harmonically flat
outside a compact set. So there is at least one end that has negative mass.
Let M − pt be the manifold obtained by taking the one-point compactifica-
tion of every other end. Then there exists a metric of nonnegative scalar
curvature on M − pt that is exactly Euclidean on its one noncompact end,
but not scalar-flat.
Proof. Let (M, g) be complete and harmonically flat outside a compact set
such that g has nonnegative scalar curvature and negative mass in at least
one of its ends. First we would like to reduce to the case of one end. We first
close up all of the ends except for one end (call it M1 ) that is assumed to
have negative mass. We do this by choosing a g-harmonic conformal factor
w such that w tends to 1 at the infinity of M1 , and 0 at the infinities of all
other ends. (Such a function can be constructed using Theorem A.40, for
example.) By the maximum principle, $0 < w < 1$ everywhere, so we can use it as a conformal factor to define a new metric $\tilde g = w^{\frac{4}{n-2}} g$. By equation (1.8), $\tilde g$ still has nonnegative scalar curvature and is still harmonically
flat in the end M1 . Using the same argument that was used in the proof of
Lemma 3.31, we can use the maximum principle and Exercise 3.13 to see
that mADM (g̃) < mADM (g) in the end M1 . Meanwhile, all other ends have
been metrically compactified, and we can see that the missing points can
be smoothly filled in because of the harmonic flatness assumption. (Specif-
ically, after a Kelvin transform, we can see that the neighborhood around
the missing point is conformally Euclidean via a bounded harmonic confor-
mal factor, and a bounded harmonic function on a punctured disk has a
removable discontinuity.)
This leaves us with a counterexample to the positive mass theorem on
M − pt, which only has one end, and that end is harmonically flat. Without
loss of generality, let us assume that $g$ is such a counterexample. In the exterior coordinate chart, for large $|x|$, we have $g_{ij} = u^{\frac{4}{n-2}} \delta_{ij}$ for some harmonic function $u$, which (by Corollary A.19) we can expand as
\[ u(x) = 1 + \frac{m}{2}|x|^{2-n} + O(|x|^{1-n}), \]
where $m = m_{\mathrm{ADM}}(g) < 0$, by Exercise 3.13. Our next step is to interpolate
between g and the Euclidean metric over an annulus in such a way that
nonnegative scalar curvature is preserved. Specifically, we will replace u with
a new function ũ that agrees with u for |x| less than some large constant
$\rho_1$, while $\tilde u$ is exactly constant for $|x|$ larger than some bigger constant $\rho_2$. By equation (1.8), nonnegativity of the scalar curvature of $\tilde g_{ij} := \tilde u^{\frac{4}{n-2}} \delta_{ij}$ is equivalent to superharmonicity of $\tilde u$. Since $m < 0$, the asymptotic expansion of $u$ tells us that for small $\epsilon$, we can find $\rho_1 < \rho_2$ such that
\[
\begin{aligned}
u(x) &< 1 - 3\epsilon && \text{for } |x| < \rho_1, \\
u(x) &> 1 - \epsilon && \text{for } |x| > \rho_2.
\end{aligned}
\]
Since the minimum of two superharmonic functions is again superharmonic, the function $\min(u + 2\epsilon, 1)$ is essentially what we need, but since this function is not smooth, we need to smooth it out somehow while preserving superharmonicity. It is intuitively clear that this should be possible, and here is one simple way to do it. We will choose $\tilde u = \Psi(u)$, where $\Psi$ is a smooth function such that
\[
\begin{aligned}
\Psi(u) &= u + 2\epsilon && \text{for } u < 1 - 3\epsilon, \\
\Psi(u) &= 1 && \text{for } u > 1 - \epsilon, \\
\Psi''(u) &\le 0 && \text{everywhere.}
\end{aligned}
\]
So
\[
\begin{aligned}
\Delta \tilde u &= \operatorname{div}(\nabla \Psi(u)) \\
&= \operatorname{div}(\Psi'(u)\nabla u) \\
&= \nabla \Psi'(u) \cdot \nabla u + \Psi'(u)\Delta u \\
&= \Psi''(u)|\nabla u|^2 \\
&\le 0,
\end{aligned}
\]
and thus $\tilde u$ is superharmonic as desired. Therefore $\tilde g_{ij} = \tilde u^{\frac{4}{n-2}} \delta_{ij}$ agrees with a multiple of $g$ inside a compact set, is exactly $\delta_{ij}$ outside a larger compact set, and has nonnegative scalar curvature everywhere. Furthermore, $\tilde g$ must have strictly positive scalar curvature wherever $\Psi''(u) < 0$ and $\nabla u \ne 0$, and it is easy to see that such points must exist.
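The link between superharmonicity and scalar curvature used here can be made explicit. The sketch below assumes equation (1.8) in the form $R_{\tilde g} = u^{-\frac{n+2}{n-2}} L_g u$ (the form in which it is applied later in this section) and takes the Euclidean metric as the background, so that the background scalar curvature vanishes.

```latex
% Background metric: \delta, with R_\delta = 0. Conformal metric: \tilde g = \tilde u^{4/(n-2)} \delta.
\begin{align*}
R_{\tilde g}
  &= \tilde u^{-\frac{n+2}{n-2}}
     \Bigl( -\frac{4(n-1)}{n-2}\, \Delta \tilde u + R_\delta\, \tilde u \Bigr)
   = -\frac{4(n-1)}{n-2}\; \tilde u^{-\frac{n+2}{n-2}}\, \Delta \tilde u \\
  &= -\frac{4(n-1)}{n-2}\; \tilde u^{-\frac{n+2}{n-2}}\, \Psi''(u)\, |\nabla u|^2
   \;\ge\; 0 .
\end{align*}
% Equality holds exactly where \Psi''(u) = 0 or \nabla u = 0, which is why
% strict positivity occurs wherever \Psi''(u) < 0 and \nabla u is nonzero.
```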
Combining Lemmas 3.31, 3.34, 3.35, and 3.39, we see that a counterex-
ample to the positive mass theorem (Theorem 3.18) can be used to con-
struct a metric as in the conclusion of Lemma 3.39. Since this contradicts
Corollary 2.32, the proof of Theorem 3.18 is complete. Actually, we do not
really need to invoke Corollary 2.32 directly, since the metric obtained from
Lemma 3.39 can be used to construct a metric of positive scalar curvature
on T n #M as in the proof of Theorem 1.23, which contradicts Theorem 1.30
directly. This completes the proof of the positive mass theorem (Theo-
rem 3.18).
Remark 3.40. Technically, we have only proved Theorem 3.18 with the ad-
ditional hypothesis of Hölder decay of the scalar curvature as in Lemma 3.31.
In Section 3.3.3, we remove this hypothesis.
Remark 3.41. To summarize, in this section we have shown that, given any counterexample $(M, g)$ to the positive mass theorem, we can construct a metric of positive scalar curvature on $T^n \# \overline{M}$, where $\overline{M}$ is the compact manifold obtained by one-point compactifying each end of $M$.
3.3.2. Proof of rigidity. Finally, we tackle the rigidity of the positive

mass theorem.
Proof of Theorem 3.19. The proof combines ideas from the proofs of
Theorem 1.23 and Corollary 2.32, but with some added complications. Sup-
pose that (M, g) is a complete asymptotically flat manifold with nonneg-
ative scalar curvature, and assume that the ADM mass of one of its ends
is zero. By Lemma 3.31, g must be scalar-flat, because otherwise we could
find a conformal metric that violates the positive mass theorem. (Techni-
cally, Lemma 3.31 requires a Hölder decay assumption, but we will see in
Section 3.3.3 how to avoid this.)
Our next step is to show that g is Ricci-flat. Although there is a nice
proof of this that uses Ricci flow (see Remark 3.42 below), we will present
Schoen and Yau’s original argument. For simplicity, let us assume for now
that M has only one end. Here we use a Ricci deformation combined with a
conformal change. Let η ≥ 0 be any compactly supported cut-off function.
For small t > 0, let gt := g + tηRicg . Note that since gt equals g outside a
compact set, $m_{\mathrm{ADM}}(g_t) = 0$. Next we make a conformal change back to zero scalar curvature. That is, we look for $u_t$ such that $\tilde g_t = u_t^{\frac{4}{n-2}} g_t$ is scalar-flat. So we must solve
\[ L_{g_t} u_t = -\frac{4(n-1)}{n-2}\Delta_{g_t} u_t + R_{g_t} u_t = 0 \]
with $u_t(\infty) = 1$. As in the proof of Lemma 3.34, this is equivalent to solving $L_{g_t} v_t = -R_{g_t}$ for $v_t = u_t - 1$ with $v_t(\infty) = 0$. Arguing as in that proof, the fact that $R_{g_t}$ smoothly converges to 0 as $t \to 0$ tells us that for small enough $t$, $L_{g_t} : W^{2,p}_{-q} \longrightarrow L^p_{-q-2}$ is an isomorphism for any $p > 1$ and $0 < q < n - 2$. Therefore we can find a smooth solution $v_t$ as desired, and as before, as long as $p > n/2$, for small enough $t$, we have $u_t > 0$. Since $u_t$ is $g$-harmonic outside a compact set, Corollary A.38 and Exercise 3.13 tell us that $\tilde g_t = u_t^{\frac{4}{n-2}} g_t$ is asymptotically flat, and of course it is scalar-flat by construction.
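Following the same reasoning as in the proof of Lemma 3.31, we can see that $\tilde g_t$ is a scalar-flat complete asymptotically flat metric, and
\[
\begin{aligned}
m_{\mathrm{ADM}}(\tilde g_t)
&= m_{\mathrm{ADM}}(g_t) + \lim_{\rho\to\infty} \frac{2}{(n-2)\omega_{n-1}} \int_{S_\rho} -\frac{\partial u_t}{\partial r}\, d\mu_{S_\rho} \\
&= 0 + \frac{2}{(n-2)\omega_{n-1}} \int_M -\Delta_{g_t} u_t \, d\mu_{g_t} \\
&= \frac{-1}{2(n-1)\omega_{n-1}} \int_M R_{g_t} u_t \, d\mu_{g_t},
\end{aligned}
\]
where the second line uses the divergence theorem and the third uses $L_{g_t} u_t = 0$.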
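By the positive mass theorem, we know that $m_{\mathrm{ADM}}(\tilde g_t) \ge 0$ is minimized at $t = 0$. Setting $\dot g = \frac{d}{dt} g_t \big|_{t=0} = \eta \mathrm{Ric}_g$,
\[
\begin{aligned}
0 &= 2(n-1)\omega_{n-1} \frac{d}{dt} m_{\mathrm{ADM}}(\tilde g_t) \Big|_{t=0} \\
&= -\int_M \left[ \frac{d}{dt} R_{g_t} \Big|_{t=0} u_0 + R_{g_0} \frac{d}{dt} u_t \Big|_{t=0} \right] d\mu_g \\
&= -\int_M DR|_g(\dot g)\, d\mu_g \\
&= -\int_M \left[ -\Delta_g(\operatorname{tr}_g \dot g) + \operatorname{div}_g(\operatorname{div}_g \dot g) - \langle \mathrm{Ric}_g, \dot g \rangle_g \right] d\mu_g \\
&= \int_M \eta |\mathrm{Ric}_g|^2 \, d\mu_g,
\end{aligned}
\]
where we used $u_0 = 1$, $R_{g_0} = 0$, the divergence theorem, and the fact that $\dot g = \eta \mathrm{Ric}_g$ is compactly supported. Since the choice of $\eta$ was arbitrary, this proves that $g$ is Ricci-flat.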
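The passage from the linearization of scalar curvature to the final integral can be spelled out term by term. This sketch uses only the facts already cited: $\dot g = \eta\,\mathrm{Ric}_g$ has compact support, so the two divergence-form terms integrate to zero.

```latex
\begin{align*}
\int_M \Delta_g(\operatorname{tr}_g \dot g)\, d\mu_g &= 0,
\qquad
\int_M \operatorname{div}_g(\operatorname{div}_g \dot g)\, d\mu_g = 0
\qquad \text{(divergence theorem, compact support)}, \\
\langle \mathrm{Ric}_g, \dot g \rangle_g
  &= \langle \mathrm{Ric}_g, \eta\, \mathrm{Ric}_g \rangle_g
   = \eta\, |\mathrm{Ric}_g|^2, \\
-\int_M DR|_g(\dot g)\, d\mu_g
  &= -\int_M \bigl( 0 + 0 - \eta\, |\mathrm{Ric}_g|^2 \bigr)\, d\mu_g
   = \int_M \eta\, |\mathrm{Ric}_g|^2 \, d\mu_g .
\end{align*}
% Since \eta >= 0 is an arbitrary compactly supported cut-off and the integral
% vanishes, |Ric_g|^2 must vanish wherever some \eta is positive, hence everywhere.
```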
Finally, to see why Ricci-flatness implies that $(M, g)$ is Euclidean, we can use a noncompact analog of the argument used to prove Corollary 2.32. Instead of invoking the Hodge Theorem for compact manifolds to find harmonic 1-forms, we construct harmonic 1-forms by finding a harmonic coordinate system $y^1, \dots, y^n$ asymptotic to the original one $x^1, \dots, x^n$. That is, we want $\Delta_g y^i = 0$, so if we set $v^i = y^i - x^i$, this is equivalent to solving $\Delta_g v^i = -\Delta_g x^i$. By asymptotic flatness, $\Delta_g x^i \in C^1_{-q-1}$, where $q$ is the asymptotic decay rate (which we take to be less than $n - 2$). By surjectivity of the Laplacian (Theorem A.40), there exists a solution $v^i \in C^2_{1-q}$. Thus $dy^i$ is a harmonic 1-form, and since these 1-forms are asymptotic to $dx^i$, they must form a basis near infinity.
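The decay of $\Delta_g x^i$ claimed above follows from the coordinate expression for the Laplacian. The sketch below assumes the usual formula $\Delta_g f = g^{jk}(\partial_j \partial_k f - \Gamma^l_{jk}\, \partial_l f)$ and an asymptotic decay of the form $g_{ij} - \delta_{ij} = O_2(|x|^{-q})$.

```latex
\begin{align*}
\Delta_g x^i
  &= g^{jk} \bigl( \partial_j \partial_k x^i - \Gamma^l_{jk}\, \partial_l x^i \bigr)
   = -\, g^{jk}\, \Gamma^i_{jk}, \\
\Gamma^i_{jk}
  &= \tfrac12\, g^{il} \bigl( \partial_j g_{kl} + \partial_k g_{jl} - \partial_l g_{jk} \bigr)
   = O_1\bigl(|x|^{-q-1}\bigr) .
\end{align*}
% The second derivatives of the coordinate function x^i vanish, so only the
% Christoffel term survives. It involves one derivative of g, which costs one
% order of decay and one order of regularity, giving \Delta_g x^i in C^1_{-q-1}.
```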
As in the proof of Corollary 2.32 we now invoke the Weitzenböck formula
to see that these 1-forms must be parallel, which implies that g is flat. Then
Exercise 2.33 implies that (M, g) is Euclidean space. Or as an alternative,
one may argue using the Bishop-Gromov comparison theorem as in Exercise
2.34.
Now we discuss how to reduce to the case of one end. Suppose there
are multiple ends, one of which, say M1 , has zero mass. In this case, we
can conformally close the other ends using a g-harmonic conformal factor u
approaching 1 at the infinity of M1 and approaching 0 at all of the other
infinities. This has the effect of “conformally closing” the other ends to ob-
tain a new one-ended, asymptotically flat manifold with nonnegative scalar
curvature, as in the proof of Lemma 3.39. Meanwhile, applying the rea-
soning used in the proof of Lemma 3.31 for the case of multiple ends, the
ADM mass of M1 must go down after this conformal change, resulting in a
negative mass end. This is essentially a contradiction to the positive mass
theorem, and thus there can only be one end.
However, there is a small technical problem with this argument: unlike
in Lemma 3.39, using a conformal factor to implement a one-point com-
pactification of an asymptotically flat end results in a metric that is not
necessarily smooth at the point corresponding to infinity. However, the metric is still $W^{2,p}_{\mathrm{loc}}$ for some $p > n/2$, and consequently continuous. (Check this.) Therefore one should prove that the positive mass theorem still holds
when the metric has isolated singularities of this sort. We will prove this in
Theorem 3.43 below.
Remark 3.42. In order to prove Ricci-flatness of g above, one can use a
Ricci flow argument similar to what was described in Remark 2.31. Let gt be
a Ricci flow with initial condition g. It turns out that Ricci flow preserves
nonnegative scalar curvature, asymptotic flatness, and ADM mass. (See
Section 3.4.) Since gt has zero ADM mass, the first part of the proof above
implies that gt is scalar-flat for all t. Then equation (2.19) implies that g is
Ricci-flat. In many ways, this is a much cleaner and simpler proof, especially
in the case of multiple ends.
Theorem 3.43 (Positive mass theorem with low regularity [GT12]). Let $p > n/2$, and let $g$ be a continuous metric on a smooth one-ended manifold $M^n$ such that outside of some compact set $K$, $g$ is a smooth asymptotically flat metric, while $g$ is $W^{2,p}_{\mathrm{loc}}$ over $K$. If $R_g \ge 0$ as a function in $L^p_{\mathrm{loc}}$, then $m_{\mathrm{ADM}}(g) \ge 0$.
This theorem is much more than what is necessary to deal with iso-
lated singular points, but we provide its statement and proof because it is
conceptually not much harder than dealing with an isolated singular point.
Proof. The proof uses the conformal method of [Mia02, Bra01], but it is
quite a bit simpler. The basic idea is to approximate g by a smooth metric
with nonnegative scalar curvature whose mass can be chosen arbitrarily close
to that of g. By applying the usual positive mass theorem to the smooth
approximation, it then follows that mADM (g) must also be nonnegative.
The approximation works in two steps: we start with a fairly arbitrary

smoothing, and then we perform a conformal deformation to a metric with
nonnegative scalar curvature.
Observe that for any $\epsilon > 0$, we can find a smooth $g_\epsilon$ such that $g_\epsilon$ is identically equal to $g$ outside some $K$, while $\|g_\epsilon - g\|_{W^{2,p}(M)} < \epsilon$. In particular, the Sobolev inequality implies that $g_\epsilon$ converges to $g$ uniformly as $\epsilon \to 0$.
Exercise 3.44. Use the expression for scalar curvature in local coordinates, together with the Hölder inequality and the Sobolev inequality, to show that $R_{g_\epsilon}$ converges to $R_g$ in $L^p(M)$ as $\epsilon \to 0$.
We would like to make a conformal change that removes all of the negative scalar curvature from $g_\epsilon$. To do this, we would like to solve
\[ -\frac{4(n-1)}{n-2}\Delta_{g_\epsilon} u_\epsilon - (R_{g_\epsilon})_- u_\epsilon = 0 \]
with $u_\epsilon$ approaching 1 at each infinity, where $(R_{g_\epsilon})_-$ denotes the negative part of $R_{g_\epsilon}$ (which is defined to be a nonnegative function). Setting $v_\epsilon = u_\epsilon - 1$, this is the same thing as solving the equation
\[ -\frac{4(n-1)}{n-2}\Delta_{g_\epsilon} v_\epsilon - (R_{g_\epsilon})_- v_\epsilon = (R_{g_\epsilon})_- . \]
Just as in our proofs of Lemma 3.34 and Theorem 3.19, we will do this by showing that the operator
\[ -\frac{4(n-1)}{n-2}\Delta_{g_\epsilon} - (R_{g_\epsilon})_- : W^{2,p}_{-q} \longrightarrow L^p_{-q-2} \tag{3.12} \]
is an isomorphism, where $0 < q < n - 2$. We can do this by showing that the operator (3.12) is a small deformation of $-\frac{4(n-1)}{n-2}\Delta_{g_\epsilon}$, which we already know is an isomorphism by Theorem A.40. In fact, the injectivity estimate for $\Delta_{g_\epsilon}$ can be chosen to be independent of $\epsilon$, as can the constants in the Sobolev inequality. While $(R_{g_\epsilon})_-$ does not smoothly converge to 0 as $\epsilon \to 0$, since $p > n/2$, we really only need it to converge in $L^p_{-2}$ in order to see that the operator (3.12) is close to $-\frac{4(n-1)}{n-2}\Delta_{g_\epsilon}$ in the strong operator topology. (Check this.)
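Next, since $R_g \ge 0$, we have $|(R_{g_\epsilon})_-| \le |R_{g_\epsilon} - R_g|$, and then the previous exercise implies that the compactly supported function $(R_{g_\epsilon})_-$ converges to 0 in $L^p_{-2}$. Hence, the desired smooth solutions $v_\epsilon$ exist. Moreover, the injectivity estimate we get from this argument tells us that $v_\epsilon \to 0$ in $W^{2,p}_{-q}$, and since $p > n/2$, $v_\epsilon \to 0$ in $C^0_{-q}$ by weighted Sobolev embedding (Theorem A.25). In particular, $u_\epsilon = 1 + v_\epsilon > 0$ for small enough $\epsilon$. Define $\tilde g_\epsilon = u_\epsilon^{\frac{4}{n-2}} g_\epsilon$. Since $u_\epsilon$ is $g$-harmonic outside $K$, it follows from Corollary A.38 and Exercise 3.13 that $\tilde g_\epsilon$ is asymptotically flat, and the $W^{2,p}_{-q}$ convergence of $u_\epsilon - 1$ implies that $\tilde g_\epsilon - g_\epsilon \to 0$ in $W^{2,p}_{-q}$. Meanwhile, by equation (1.8), the scalar curvature of $\tilde g_\epsilon$ is
\[
\begin{aligned}
R_{\tilde g_\epsilon} &= u_\epsilon^{-\frac{n+2}{n-2}} \left( -\frac{4(n-1)}{n-2}\Delta_{g_\epsilon} u_\epsilon + R_{g_\epsilon} u_\epsilon \right) \\
&= u_\epsilon^{-\frac{4}{n-2}} \bigl( (R_{g_\epsilon})_- + R_{g_\epsilon} \bigr) \\
&= u_\epsilon^{-\frac{4}{n-2}} (R_{g_\epsilon})_+ ,
\end{aligned}
\]
where the exponent in the second line is $-\frac{n+2}{n-2} + 1 = -\frac{4}{n-2}$. Thus $R_{\tilde g_\epsilon}$ is nonnegative everywhere, and we can apply the positive mass theorem to see that $m_{\mathrm{ADM}}(\tilde g_\epsilon) \ge 0$.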
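Finally, we must argue that $m_{\mathrm{ADM}}(g) = \lim_{\epsilon \to 0} m_{\mathrm{ADM}}(\tilde g_\epsilon)$. Since we are not assuming $p > n$, we cannot call upon Lemma 3.35. Instead, using Exercise 3.12, we compute
\[
\begin{aligned}
m_{\mathrm{ADM}}(\tilde g_\epsilon) - m_{\mathrm{ADM}}(g)
&= m_{\mathrm{ADM}}(\tilde g_\epsilon) - m_{\mathrm{ADM}}(g_\epsilon) \\
&= \lim_{\rho\to\infty} \frac{-2}{(n-2)\omega_{n-1}} \int_{S_\rho} \frac{\partial u_\epsilon}{\partial r}\, d\mu_{S_\rho} \\
&= \lim_{\rho\to\infty} \frac{-2}{(n-2)\omega_{n-1}} \int_{S_\rho} \nabla u_\epsilon \cdot \nu \, d\mu_{(S_\rho, g_\epsilon)} \\
&= \frac{-2}{(n-2)\omega_{n-1}} \int_M \Delta_{g_\epsilon} u_\epsilon \, d\mu_{g_\epsilon} \\
&= \frac{2}{(n-2)\omega_{n-1}} \int_K \frac{n-2}{4(n-1)} (R_{g_\epsilon})_- u_\epsilon \, d\mu_{g_\epsilon}.
\end{aligned}
\]
The result now follows from the fact that $(R_{g_\epsilon})_- \to 0$ in $L^p$ while $u_\epsilon \to 1$ uniformly.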
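The final step can be made quantitative with the Hölder inequality on the compact set $K$. This is a sketch; it assumes that the volumes $\mu_{g_\epsilon}(K)$ stay bounded, which is plausible because $g_\epsilon$ converges to $g$ uniformly.

```latex
\begin{align*}
0 \;\le\; \int_K (R_{g_\epsilon})_-\, u_\epsilon \, d\mu_{g_\epsilon}
  \;\le\; \Bigl( \sup_K u_\epsilon \Bigr)\;
          \mu_{g_\epsilon}(K)^{\,1 - \frac1p}\;
          \bigl\| (R_{g_\epsilon})_- \bigr\|_{L^p(K,\, g_\epsilon)} .
\end{align*}
% As \epsilon -> 0: sup_K u_\epsilon -> 1, the volume factor stays bounded,
% and the L^p norm of the negative part tends to 0, so the mass difference
% m_ADM(\tilde g_\epsilon) - m_ADM(g) tends to 0.
```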
Remark 3.45. Although Theorem 3.43 does not come with the rigidity
statement that if the mass is zero, then (M, g) is Euclidean, the rigidity
should follow from the same Ricci flow argument that was used in [MS12].
For the case of isolated singularities that still have regularity $W^{2,p}_{\mathrm{loc}}$ for some $p > n/2$, we can prove rigidity using the same Schoen-Yau argument used in
the smooth case: we can prove Ricci-flatness away from the singular points
and use that to construct a parallel frame.
We take a moment to summarize the logical relationship between three

results: (1) Corollary 2.32, which states that Euclidean space is the only complete manifold with nonnegative scalar curvature that is exactly Euclidean outside a compact set; (2) the positive mass theorem (Theorem 3.18), which states that a complete asymptotically flat manifold with nonnegative scalar curvature must have nonnegative mass; and (3) positive mass rigidity (Theorem 3.19), which states that Euclidean space is the only complete asymptotically flat manifold with nonnegative scalar curvature and mass equal to zero. In
this book, we explained (1) first and then proved in this chapter how (1)
implies (2). In this section, we showed how (2) implies (3), and obviously
(1) is a special case of (3). Originally, in dimensions less than 8, Schoen and
Yau started by proving (2) directly, essentially executing a “noncompact”
version of their proof of (1).
Specifically, Schoen and Yau’s original proof in dimension 3 involved
showing that if a complete asymptotically flat manifold of nonnegative scalar
curvature has negative mass, then they can construct a complete stable
two-sided minimal surface inside it and derive a contradiction to the Gauss-
Bonnet Theorem. (We will describe a generalization of this argument in
detail in Chapter 8.) In light of this perspective, one reasonable question is
whether there is an asymptotically flat version of Cai and Galloway’s κ = 0
case of Theorem 2.38. Indeed, we have the following.
Theorem 3.46 (Chodosh-Eichmair [CCE16]). Let (M, g) be a complete
asymptotically flat 3-manifold with nonnegative scalar curvature. If M con-
tains a smooth, noncompact area-minimizing boundary, then M is Euclidean
space.
In contrast, a more recent preprint by Chodosh and Ketover [CK18]

shows that in the absence of a closed minimal surface, there are, in fact,
many (nonminimizing) complete minimal surfaces. When we say that Σ
is an area-minimizing boundary, we mean that Σ is the boundary of some
region, and given any ball B in M , |Σ ∩ B| minimizes area compared to any
other boundary that agrees with Σ outside B. Theorem 3.46 is significantly
more subtle than its compact analog. One indication of the subtlety is the
following striking theorem.
Theorem 3.47 (Carlotto-Schoen [CS16]). For any n ≥ 3, there exist many
complete asymptotically flat manifolds (M n , g) with nonnegative scalar cur-
vature such that part of (M, g) is isometric to Euclidean half-space, but
(M, g) is not Euclidean space.
This demonstrates how essential the global nature of the area-minimizing

hypothesis is in the statement of Theorem 3.46. It also provides a stark
contrast with Corollary 2.32, which says that we cannot replace the half-
space in Theorem 3.47 by the exterior of a ball. Instead, this theorem
exhibits behavior reminiscent of Theorem 2.45, though it is proved using a
version of Corvino’s techniques from [Cor00]. (See Theorem 3.51.)
3.3.3. More density theorems. In our proof of the positive mass theo-
rem, we used a two-step process to move from a general counterexample to
a scalar-flat counterexample to a counterexample that is harmonically flat
outside a compact set, but the following lemma shows that we can jump
straight to a counterexample that is harmonically flat outside a compact set

(but not necessarily scalar-flat everywhere).
Lemma 3.48 (Density lemma for nonnegative scalar curvature). Let $(M^n, g)$ be a complete asymptotically flat manifold with nonnegative scalar curvature. Let $p > n/2$ and $q < n - 2$ such that $q$ is less than the asymptotic decay rate of $g$ in Definition 3.5. Then for any $\epsilon > 0$, there exists a complete asymptotically flat metric $\tilde g$ with nonnegative scalar curvature on $M$ that is harmonically flat outside a compact set, such that $\|\tilde g - g\|_{W^{2,p}_{-q}} < \epsilon$ and $\|R_{\tilde g} - R_g\|_{L^1} < \epsilon$.
Note that if we take $p > n$ and $q > \frac{n-2}{2}$, then Lemma 3.35 tells us that in the conclusion of Lemma 3.48, we can also demand that
\[ |m_{\mathrm{ADM}}(\tilde g) - m_{\mathrm{ADM}}(g)| < \epsilon. \]
Proof. The proof starts off the same way as in Lemma 3.34, with the same definition of $g_\lambda$, which converges to $g$ in $W^{2,p}_{-q}$. The difference is that instead
of trying to make a conformal factor that deforms gλ to a scalar-flat metric
(which would result in a large deformation since g is not already scalar-flat),
we make a conformal change that removes all of the negative curvature, as
in the proof of Theorem 3.43.
That is, we attempt to solve
\[ -\frac{4(n-1)}{n-2}\Delta_{g_\lambda} u_\lambda - (R_{g_\lambda})_- u_\lambda = 0 \]
with $u_\lambda(\infty) = 1$. As we have done before, set $v_\lambda = u_\lambda - 1$ so that this is equivalent to
\[ -\frac{4(n-1)}{n-2}\Delta_{g_\lambda} v_\lambda - (R_{g_\lambda})_- v_\lambda = (R_{g_\lambda})_- \]
with $v_\lambda(\infty) = 0$. Next, we assume $q > 0$ without loss of generality and then show that
\[ -\frac{4(n-1)}{n-2}\Delta_{g_\lambda} - (R_{g_\lambda})_- : W^{2,p}_{-q} \longrightarrow L^p_{-q-2} \tag{3.13} \]
is an isomorphism. By the construction and asymptotic flatness of $g$, it is not hard to see that $(R_{g_\lambda})_- \to 0$ in $L^\infty_{-2}$. (Check this.) This is good enough to see that the operator (3.13) is close enough to $-\frac{4(n-1)}{n-2}\Delta_g$ (which is an isomorphism by Theorem A.40) to be an isomorphism for large enough $\lambda$ (with an injectivity estimate independent of $\lambda$).
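Hence, the desired solution $u_\lambda$ exists, and it is smooth by elliptic regularity. As in the proof of Lemma 3.34, $u_\lambda - 1 \to 0$ in $W^{2,p}_{-q} \subset C^0_{-q}$, and thus $u_\lambda > 0$ for large $\lambda$. Setting $\tilde g_\lambda = u_\lambda^{\frac{4}{n-2}} g_\lambda$, we see that $\tilde g_\lambda$ is harmonically flat outside a compact set by construction and $\tilde g_\lambda \to g$ in $W^{2,p}_{-q}$. Meanwhile, just as we computed in the proof of Theorem 3.43 (applying equation (1.8) and using $-\frac{n+2}{n-2} + 1 = -\frac{4}{n-2}$), we have $R_{\tilde g_\lambda} = u_\lambda^{-\frac{4}{n-2}} (R_{g_\lambda})_+ \ge 0$. The only thing left to check is that $R_{\tilde g_\lambda} \to R_g$ in $L^1$.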
Exercise 3.49. Complete the last step of the proof by checking that $R_{\tilde g_\lambda} = u_\lambda^{-\frac{4}{n-2}} (R_{g_\lambda})_+ \to R_g$ in $L^1$.
For the proof of the positive mass theorem (Theorem 3.18), we can use
Lemma 3.48 in place of Lemmas 3.31 and 3.34, thereby avoiding the Hölder
decay assumption mentioned in Remark 3.40. We can also remove this
assumption from our proof of positive mass theorem rigidity (Theorem 3.19)
as follows.
Let (M, g) be a complete asymptotically flat manifold with nonnegative
scalar curvature and zero ADM mass in some end Mk . Recall that in the
proof of Theorem 3.19, we invoked Lemma 3.31 to show that g must be
scalar-flat. Here we present an alternative proof of this fact.
Suppose $R > 0$ somewhere, and let $\eta$ be a compactly supported
function with $0 \le \eta \le 1$ and $\eta > 0$ at some point where $R > 0$. Instead
of solving $L_g u = -\frac{4(n-1)}{n-2}\Delta u + Ru = 0$, we can solve
\[ -\frac{4(n-1)}{n-2}\Delta u + \eta R u = 0. \]
Following the proof of Lemma 3.31, one can see that a smooth positive
solution $u$ exists. Moreover, setting $\tilde g = u^{\frac{4}{n-2}} g$, we have
\[ R_{\tilde g} = u^{-\frac{n+2}{n-2}} L_g u = u^{-\frac{n+2}{n-2}} \left( -\frac{4(n-1)}{n-2}\Delta u + Ru \right) = u^{-\frac{n+2}{n-2}} (1 - \eta) R u \ge 0. \]
Since u is g-harmonic outside a compact set, Corollary A.38 and Exercise 3.13
imply that g̃ is asymptotically flat. Finally, the same argument
used in Lemma 3.31 shows that mADM (Mk , g̃) < mADM (Mk , g) = 0, contra-
dicting the positive mass theorem. Hence g must be scalar-flat.
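One standard way to see the strict drop in mass is sketched below. It assumes, for simplicity, that $M$ has a single end and no boundary, that $u = 1 + A|x|^{2-n} + O(|x|^{1-n})$ in that end, and that mass is normalized so that the conformal factor $1 + \frac{m}{2}|x|^{2-n}$ has mass $m$. The constant $A$ is notation introduced here, and the argument in Lemma 3.31 may be organized differently.

```latex
% Sketch under the stated assumptions; A is hypothetical notation.
\begin{align*}
m_{ADM}(M_k, \tilde g) &= m_{ADM}(M_k, g) + 2A, \\
-(n-2)\,\omega_{n-1}\,A &= \lim_{\rho\to\infty} \int_{S_\rho} \partial_\nu u \, d\mu_{S_\rho}
   = \int_M \Delta u \, d\mu_g
   = \frac{n-2}{4(n-1)} \int_M \eta R u \, d\mu_g, \\
m_{ADM}(M_k, \tilde g) - m_{ADM}(M_k, g) &= -\frac{1}{2(n-1)\,\omega_{n-1}} \int_M \eta R u \, d\mu_g < 0 .
\end{align*}
```

The strict inequality uses that $u > 0$ and that $\eta R \ge 0$ is positive somewhere.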
Going a step further than Lemma 3.48, not only can we find a nearby
metric with nonnegative scalar curvature that is harmonically flat outside
a compact set, but we can find one that is exactly Schwarzschild outside a
compact set.
Theorem 3.50 (Bray [Bra97]). Let (M n , g) be a complete asymptotically
flat manifold with nonnegative scalar curvature which is harmonically flat
outside some compact set. For any δ > 0 there exists another asymptotically
flat metric g̃ on M such that
• g̃ has nonnegative scalar curvature,
• g̃ is exactly Schwarzschild outside some large radius,
• 0 < mADM (g̃) − mADM (g) < δ,
• $\|\tilde g - g\|_{C^0} < \delta$.
Proof. The proof is essentially a simple modification of the proof of
Lemma 3.39 described earlier. We know that in the harmonically flat region,
$g_{ij} = u^{\frac{4}{n-2}} \delta_{ij}$ for some harmonic function $u$. By Corollary A.19 and
Exercise 3.13, we know that
\[ u(x) = 1 + \frac{m}{2}|x|^{2-n} + O(|x|^{1-n}), \]
where $m = m_{ADM}(g)$. Choose any $m'$ slightly larger than $m$ and $\epsilon > 0$, and
consider the following construction. Define $v(x) = 1 + \frac{m'}{2}|x|^{2-n}$. We will
interpolate between $u$ and $v$ over an annulus. For small enough $\epsilon$, we can
find $\rho_1 < \rho_2$ such that
\[ u < v - 3\epsilon \ \text{ for } |x| < \rho_1, \qquad u > v - \epsilon \ \text{ for } |x| > \rho_2. \]
We define $\tilde u = \Psi(u - v + 2\epsilon) + v$, where $\Psi$ is some smooth function such that
\[ \Psi(w) = w \ \text{ for } w < -\epsilon, \qquad \Psi(w) = 0 \ \text{ for } w > \epsilon, \qquad \Psi''(w) \le 0 \ \text{ everywhere.} \]
By the same calculation as in the proof of Lemma 3.39, we can see that
$\tilde u$ is superharmonic, and thus $\tilde g = \tilde u^{\frac{4}{n-2}} \delta_{ij}$ is a metric that agrees with a
multiple of $g$ for $|x| < \rho_1$, is exactly Schwarzschild of mass $m'$ for $|x| > \rho_2$,
and has nonnegative scalar curvature everywhere. Furthermore, one can
check that by taking $\epsilon$ small enough and $m'$ close enough to $m$, we can
guarantee that $\tilde g$ is $C^0$ close to $g$.
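The superharmonicity of the interpolated function can be checked directly. The sketch below assumes the reading $\tilde u = \Psi(u - v + 2\epsilon) + v$ with $\Psi'' \le 0$, writes $w = u - v + 2\epsilon$ (notation introduced here), and uses the Euclidean Laplacian $\Delta$ in the harmonically flat region, where both $u$ and $v$ are harmonic.

```latex
\begin{align*}
\nabla \tilde u &= \Psi'(w)\,\nabla(u - v) + \nabla v, \\
\Delta \tilde u &= \Psi''(w)\,|\nabla(u - v)|^2 + \Psi'(w)\,\Delta(u - v) + \Delta v
                 = \Psi''(w)\,|\nabla(u - v)|^2 \le 0, \\
R_{\tilde g} &= -\frac{4(n-1)}{n-2}\,\tilde u^{-\frac{n+2}{n-2}}\,\Delta \tilde u \ge 0
   \quad\text{for } \tilde g_{ij} = \tilde u^{\frac{4}{n-2}}\,\delta_{ij} .
\end{align*}
```

The last line uses $\tilde u > 0$, which holds for small $\epsilon$ because $\tilde u$ is uniformly close to the positive functions $u$ and $v$.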
There is an even more sophisticated theorem due to J. Corvino.

Theorem 3.51 (Corvino [Cor00]). Let $(\mathbb{R}^n \setminus B_{r_0}(0), g)$ be an asymptotically
flat manifold, and assume that $g_{ij} = u^{\frac{4}{n-2}} \delta_{ij}$ for some harmonic function
$u$. (In particular, $g$ is scalar-flat.) If $m_{ADM}(g) \ne 0$, then for any
$r_1 > r_0$, there exists a metric $\tilde g$ on $\mathbb{R}^n \setminus B_{r_0}(0)$ such that
• $\tilde g$ is scalar-flat,
• $\tilde g = g$ in $B_{r_1} \setminus B_{r_0}$,
• $\tilde g$ is exactly Schwarzschild outside some large radius,
• $m_{ADM}(\tilde g)$ may be chosen arbitrarily close to $m_{ADM}(g)$.
Moreover, $\tilde g$ will be close to $g$ in weighted Sobolev space. This result can
be regarded as a “localized gluing” theorem, and the proof requires a deeper
understanding of the linearized scalar curvature operator. See Theorem 6.14
for a result with a similar flavor. Corvino and Schoen also proved a version
of Theorem 3.51 for asymptotically flat initial data sets [CS06, Theorem 4].
That result allows one to glue initial data satisfying harmonic asymptotics
as in Definition 8.2 to Kerr initial data.
3.4. A few words on Ricci flow

Recently, Yu Li has given an independent proof of the positive mass theorem
in dimension 3 using Ricci flow [Li18]. The proof requires extensive use of
the work of Perelman [Per02, Per03b, Per03a] on Ricci flow, and therefore
a complete discussion is beyond the scope of this book. However, we will
sketch the basic idea behind Li’s proof.
Recall that the Ricci flow evolves a family of metrics $g_t$ according to
\[ \frac{\partial}{\partial t} g_t = -2\,\mathrm{Ric}. \]
By the Ricci flow estimates of Wan-Xiong Shi [Shi89], we know that any
complete smooth manifold with bounded curvature can be used as initial
data for Ricci flow. In particular, if (M 3 , g) is a complete asymptotically
flat manifold with nonnegative scalar curvature, we can consider a Ricci flow
of complete metrics gt on M with initial condition g0 = g. Recall that the
scalar curvature R evolves according to equation (2.19):
\[ \frac{\partial}{\partial t} R = \Delta_g R + 2|\mathrm{Ric}|^2. \]
Combining this with the parabolic maximum principle (see [Eva10, Chapter
7], for example), it follows that nonnegativity of R is preserved by Ricci flow.
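Heuristically, the maximum principle argument runs as follows. This is only a formal sketch, assuming the spatial minimum $R_{\min}(t)$ of $R$ is attained (as on a compact manifold); on a complete noncompact manifold one needs a version of the maximum principle that uses the curvature bounds.

```latex
% At a point where R(\cdot, t) attains its spatial minimum, \Delta_g R \ge 0.
% The inequality |\mathrm{Ric}|^2 \ge R^2/n is Cauchy--Schwarz applied to the eigenvalues of Ric.
\[
\frac{d}{dt} R_{\min}(t) \;\ge\; 2\,|\mathrm{Ric}|^2 \;\ge\; \frac{2}{n}\,R_{\min}(t)^2 \;\ge\; 0,
\qquad\text{so}\qquad R_{\min}(0) \ge 0 \implies R_{\min}(t) \ge 0 .
\]
```

Here the time derivative is understood in the sense of forward difference quotients, since $R_{\min}$ need not be differentiable.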
It is also true that asymptotic flatness and ADM mass are both preserved
under Ricci flow, as first observed by Xianzhe Dai and Li Ma [DM07] and
by T. Oliynyk and E. Woolgar [OW07]. (Technically, one must assume a
stronger notion of asymptotic flatness than in Definition 7.17 that involves
decay of higher derivatives of the metric [Li18].) The decay of the metric gt
essentially comes from Shi’s estimates, but integrability of R requires a little
more: we must use the scalar curvature evolution equation above, together
with decay of the |Ric|2 term coming from Shi’s estimates. See [DM07,
Theorem 11, MS12, Lemma 10] for details. To see why the ADM mass is
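preserved, observe that
\begin{align*}
\frac{d}{dt} m_{ADM}(g_t)
 &= \frac{1}{16\pi} \lim_{\rho\to\infty} \int_{S_\rho} \frac{\partial}{\partial t}(g_{ij,i} - g_{ii,j})\,\nu^j \, d\mu_{S_\rho} \\
 &= \frac{1}{16\pi} \lim_{\rho\to\infty} \int_{S_\rho} (-2R_{ij,i} + 2R_{ii,j})\,\nu^j \, d\mu_{S_\rho} \\
 &= \frac{1}{8\pi} \lim_{\rho\to\infty} \int_{S_\rho} (-\nabla_i R_{ij} + \nabla_j R)\,\nu^j \, d\mu_{S_\rho} \\
 &= \frac{1}{16\pi} \lim_{\rho\to\infty} \int_{S_\rho} \nabla_\nu R \, d\mu_{S_\rho},
\end{align*}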
where we use asymptotic flatness in the third equality and the contracted
second Bianchi identity (Exercise 1.10) in the last equality. If we assume
pointwise decay of Rg strong enough for Rg to be integrable, then Shi’s
estimates will show that ∇Rgt decays enough for any t > 0 for the flux
integral to vanish in the limit. More generally, one can work a bit harder in
order to achieve the same result. See [MS12, Lemma 11] for details.
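For reference, the last equality in the computation of $\frac{d}{dt} m_{ADM}(g_t)$ is the contracted second Bianchi identity applied to the integrand. A short worked version, with the constants tracked:

```latex
\begin{align*}
\nabla_i R_{ij} &= \tfrac12\,\nabla_j R
   && \text{(contracted second Bianchi identity)}, \\
\frac{1}{8\pi}\,\bigl(-\nabla_i R_{ij} + \nabla_j R\bigr)\,\nu^j
   &= \frac{1}{8\pi}\cdot\tfrac12\,\nabla_j R\,\nu^j
    = \frac{1}{16\pi}\,\nabla_\nu R .
\end{align*}
```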
Li first proves that if the Ricci flow exists for all time, then the metric
on the exterior coordinate chart must converge to the Euclidean metric
in the weighted space $C^2_{-q}$ for some $q > \frac{n-2}{2}$. This argument works in
all dimensions. The main idea is to use Perelman’s μ-functional to derive
spatial decay estimates on the Riemann curvature tensor that decay in time
as well. Given convergence to Euclidean space, the next step of the argument
is perhaps the most interesting from a non-Ricci flow perspective. Note that
$g_t$ converging to $g_\infty$ in $C^2_{-q}$ is not sufficient to invoke Lemma 3.35 to conclude
that $m_{ADM}(g_\infty) = \lim_{t\to\infty} m_{ADM}(g_t)$. (This is expected since that equation
would give us the nonsense result that $m_{ADM}(g) = 0$.) However, if we
examine the proof of Lemma 3.35 and observe that we have the pointwise
estimate $R_{g_t} \ge 0 = R_{g_\infty}$, we can see that
\[ 0 = m_{ADM}(g_\infty) \le \lim_{t\to\infty} m_{ADM}(g_t) = m_{ADM}(g). \]

Of course, typically, one expects the Ricci flow to run into singularities,
but in three dimensions Perelman showed that one can run past these sin-
gularities using Ricci flow with surgery, as explained in John Morgan and
Gang Tian’s book [MT07]. This is, of course, the most difficult part of the
proof, but if we accept results on Ricci flow with surgery as a black box,
then the only thing that needs to be verified directly is that the long-time
evolution under Ricci flow with surgery can be constructed with only finitely
many surgery times. This is because after each surgery, nonnegative scalar
curvature, asymptotic flatness, and ADM mass are all preserved, and then
after the last surgery time, we will have a smooth Ricci flow that lasts for
all time. In other words, Li’s result described above applies. Li’s proof that
there are only finitely many surgery times again utilizes the μ-functional,
together with an understanding of the surgeries.
Chapter 4
The Riemannian Penrose inequality
4.1. Riemannian apparent horizons

4.1.1. Basic properties. In this chapter we continue to explore the inter-
play between mass, scalar curvature, and minimal hypersurfaces. We will
need the following important lemma, which can be thought of as a nonlin-
ear version of the strong maximum principle for the minimal hypersurface
equation.
Lemma 4.1. Let B̄ be a closed ball in Rn−1 , and consider the cylinder
B̄ × R ⊂ Rn equipped with a Riemannian metric g. Let u1 , u2 : B̄ −→ R,
and let Σ[ui ] denote the graph of ui in B̄ × R. Assume that
u1 ≥ u2 ,
HΣ[u1 ] ≥ 0,
HΣ[u2 ] ≤ 0,
where the mean curvature is computed using the upward normal (so that the
mean curvature of a spherical “cap” in Euclidean space is positive). If Σ[u1 ]
and Σ[u2 ] meet anywhere in their interiors or are tangent at any point of
their boundaries, then u1 = u2 identically on all of B̄.
Rough idea of the proof. The basic idea is that if one writes out the
mean curvature operator in local coordinates, one obtains a quasi-linear
elliptic equation of the form
\[ H_{\Sigma[u]}(x) = \sum_{i,j=1}^{n-1} a_{ij}(x, u, \partial u)\,\partial_i \partial_j u + b(x, u, \partial u). \]
From this, one can show that the difference u1 − u2 satisfies a linear ellip-
tic inequality to which one can apply the usual strong maximum principle
(Theorem A.2). For the Euclidean case, see [Sch83, Lemma 1]. For fuller
details, see [AGH98, Section 3.1].
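The reduction to a linear inequality can be sketched as follows. The symbols $F$, $w$, $u_s$, and the barred coefficients are introduced here for illustration only. The sign convention is the one in the lemma, under which $(-a_{ij})$ is positive definite (in Euclidean space, $H_{\Sigma[u]} = -\operatorname{div}\bigl(\nabla u / \sqrt{1 + |\nabla u|^2}\bigr)$).

```latex
% Write H_{\Sigma[u]} = F(x, u, \partial u, \partial^2 u), where
% F(x, z, p, r) = a_{ij}(x, z, p)\, r_{ij} + b(x, z, p),
% and set w = u_1 - u_2 \ge 0, u_s = u_2 + s\,w for s \in [0, 1].
\begin{align*}
0 \le H_{\Sigma[u_1]} - H_{\Sigma[u_2]}
  &= \int_0^1 \frac{d}{ds}\, F(x, u_s, \partial u_s, \partial^2 u_s)\, ds
   = \bar a_{ij}\,\partial_i \partial_j w + \bar b_k\,\partial_k w + \bar c\, w, \\
\bar a_{ij} &= \int_0^1 a_{ij}(x, u_s, \partial u_s)\, ds, \qquad
\bar b_k = \int_0^1 \frac{\partial F}{\partial p_k}\, ds, \qquad
\bar c = \int_0^1 \frac{\partial F}{\partial z}\, ds .
\end{align*}
% Hence (-\bar a_{ij})\,\partial_i \partial_j w - \bar b_k\,\partial_k w - \bar c\, w \le 0 with w \ge 0.
```

Because $w \ge 0$, the positive part of the zeroth-order coefficient can be discarded without changing the direction of the inequality, so the strong maximum principle (or the Hopf lemma at a boundary point of tangency) forces $w \equiv 0$ once $w$ vanishes at a single point.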
We immediately obtain the following corollary.

Corollary 4.2 (Strong comparison principle for mean curvature). Suppose
Ω1 ⊂ Ω2 are open sets in a Riemannian manifold (M, g) and smooth
hypersurfaces Σ1 and Σ2 (possibly with boundary) lie on ∂Ω1 and ∂Ω2 , re-
spectively, with HΣ1 ≤ 0 and HΣ2 ≥ 0, where these are computed with respect
to the outward-pointing unit normal. If Σ1 touches Σ2 anywhere in their in-
teriors, or if they are tangent to each other at a common boundary point,
then they must be identically equal in a neighborhood of that point.
In particular, this means that a closed minimal hypersurface can never

“penetrate” a foliation by hypersurfaces of nonnegative mean curvature from
the “inside” (where the unit normals point outward). As a consequence, we
have the following extension of Theorem 2.22 to manifolds with (weakly)
mean convex boundaries.
Corollary 4.3. Let n < 8, and let (M n , g) be a compact Riemannian mani-
fold with boundary such that the boundary ∂M has nonnegative mean curva-
ture with respect to the outward-pointing normal. For each nonzero homology
class $\alpha \in H_{n-1}(M, \mathbb{Z})$, there exists an integral sum of smooth oriented
minimal hypersurfaces Σ ∈ α that minimizes volume among all smooth cycles
in α, and each of these minimal hypersurfaces must either be disjoint from
∂M , or else be equal to a component of ∂M (which of course is only possible
if that component is minimal).
Sketch of the proof. The general theory used to construct a minimizer

in Theorem 2.22 works just as well when there is a boundary. The only
difference is that the minimizer might not be regular if it bumps against the
boundary. The basic idea is that one can construct a collar ∂M × [0, 1] with
∂M × {0} isometric to ∂M and with each ∂M × {s} being strictly mean
convex for s > 0. We can glue this collar to M along ∂M × {0} ≅ ∂M ,
and then minimize in the new glued manifold. As long as the collar is
big enough, the minimizer cannot touch the new boundary ∂M × {1}, and
then the regularity theory of Theorem 2.22 tells us that the minimizer is an
integral sum of smooth oriented minimal hypersurfaces. Then Corollary 4.2
guarantees that each of these minimal hypersurfaces cannot penetrate the
interior collar at all, and that it can only touch ∂M if it is equal to a
component of ∂M .
In this chapter we will be interested in hypersurfaces that appear as

boundaries of open sets. In particular, we are interested in the concept of
perimeter. The perimeter of a general open subset Ω of an n-dimensional
manifold is the same as the (n − 1)-dimensional Hausdorff measure of its
reduced boundary, ∂ ∗ Ω. The reduced boundary ∂ ∗ Ω is always contained in
the topological boundary ∂Ω, but it has better measure-theoretic properties.
There is a rich theory of sets of locally finite perimeter (also called Cacciop-
poli sets; see [Wik, Caccioppoli set]), but the details will not be important
to the discussion here. One reason is that as long as ∂Ω is C 1 , the reduced
boundary coincides with the topological boundary.
Definition 4.4. Let (M, g) be a complete Riemannian manifold with a
distinguished noncompact end. In what follows, we only consider open sets
Ω ⊂ M (not necessarily connected) such that ∂Ω is compact, Ω has finite
perimeter, and Ω contains all ends other than the distinguished end (if
any). We will call such sets enclosed regions and their boundaries enclosing
boundaries, or enclosing hypersurfaces if they happen to be smooth.
We say that an enclosed region Ω is a minimizing hull or that Σ = ∂ ∗ Ω
is outward-minimizing if Ω has perimeter less than or equal to every other
enclosed region containing it. If the perimeter is always strictly less, then
we say that Ω is a strictly minimizing hull or that Σ is strictly outward-
minimizing.
If Σ = ∂Ω is a smooth minimal enclosing hypersurface, then we say
that Σ is an outermost minimal hypersurface if there are no other minimal
hypersurfaces enclosing Σ (in the sense of being the boundary of an en-
closed region containing Ω).1 We will often refer to an outermost minimal
hypersurface as an apparent horizon for the distinguished end.
We can adapt the definition above to manifolds with boundary as follows.
We arbitrarily “fill in” ∂M with some Riemannian region W to create a new
complete Riemannian manifold M̃ = M ∪ W without boundary. All of the
definitions above are now understood by replacing Ω by Ω ∪ W . So for
example, Ω is an enclosed region of M if Ω ∪ W is an enclosed region of
M̃ , ∂Ω actually refers to ∂(Ω ∪ W ), and the perimeter of Ω is defined to
be the perimeter of Ω ∪ W . An enclosing boundary is a set of the form
∂(Ω ∪ W ). This gives meaning to our other terms like outward-minimizing
and apparent horizon. It is clear that these definitions are independent of
the choice of W .
1 By convention, we say that a boundary hypersurface encloses itself.
Figure 4.1. Theorem 4.6 states that the boundary of the minimizing
hull of Ω is smooth and minimal away from Ω.
Keep in mind that under our conventions, if Ω contains ∂M , then ∂Ω

does not include ∂M (which is reasonable since one refers to a topological
boundary while the other refers to a manifold boundary anyway). Moreover,
under our conventions, the empty set is an enclosed region whose enclosing
boundary is ∂M , which lets us make sense of the idea that ∂M can be an
apparent horizon, for example.
The terminology used here is not quite standard, but it will be useful
for our purposes.
Apparent horizons have a physical interpretation related to black holes,
which we will discuss in Chapter 7. The simplest example of an apparent
horizon occurs in Schwarzschild space.
Exercise 4.5. Recall from Exercise 3.3 that there exists an isometry of
Schwarzschild space that exchanges its two ends, whose fixed set is a totally
geodesic sphere. If we think of this totally geodesic sphere as enclosing one
of the ends, prove that it is an apparent horizon with respect to the other
end. Hint: Use Corollary 4.2.
Theorem 4.6 (Existence and regularity of strictly minimizing hulls). Let
$(M^n, g)$ be a complete Riemannian manifold (possibly with boundary) with
a distinguished end. For each enclosed region $\Omega \subset M$, define $\Omega'$ to be the
intersection of all strictly minimizing hulls that contain $\Omega$. Then $\Omega'$ is itself
a strictly minimizing hull.
Now assume $n < 8$. If $\partial\Omega$ is $C^2$, then $\partial\Omega'$ is $C^{1,1}$ everywhere and is a
smooth minimal hypersurface away from $\partial\Omega$. (See Figure 4.1.)
The proof is unfortunately outside the scope of this text. See [HI01,
Regularity Theorem 1.3, Tam84] for details.
Theorem 4.7 (Existence and uniqueness of apparent horizons). Let n < 8,
and let (M n , g) be a complete asymptotically flat manifold (possibly with
boundary).
(1) If M has nonempty boundary with nonpositive mean curvature (with
respect to the “outward” normal pointing into M ) and only one end,
then there exists a smooth apparent horizon.
(2) If an end of M has an apparent horizon, then it is unique, and
moreover both the horizon and the region outside the horizon are
orientable.
(3) The apparent horizon encloses all enclosing minimal hypersurfaces.
(4) The apparent horizon is outward-minimizing.
Remark 4.8. Regarding the dimension restriction, one can still show that
an “apparent horizon” exists in higher dimensions if one is willing to widen
the definition of apparent horizon to allow singular behavior, but this is a
concept that has not been addressed much in the literature, and we will not
say more about it.
Sketch of the proof. We start with the first statement. We begin by con-
structing a single enclosing minimal hypersurface homologous to a large
coordinate sphere. By asymptotic flatness, the mean curvature of the coordinate
sphere of radius $\rho$ is approximately $\frac{n-1}{\rho}$, and thus the end is foliated
by hypersurfaces with positive mean curvature. Consider the region Ωρ
enclosed by one of these large coordinate spheres Sρ . Next we minimize
volume in the homology class of Sρ in Ωρ . By Corollary 4.3, we obtain a
smooth minimal hypersurface Σ enclosing ∂M . (It must be enclosing and
have multiplicity 1 because it is homologous to Sρ .)
Consider the family F of all enclosed regions in M whose boundaries
are minimal hypersurfaces in M homologous to Sρ . Above, we argued that
F is nonempty. Since the end of M beyond Sρ is foliated by positive mean
curvature, Corollary 4.2 implies that every element of F lies in Ω. For any
Ω1 , Ω2 ∈ F , we claim that there exists Ω ∈ F containing both Ω1 and Ω2 .
If ∂(Ω1 ∪ Ω2 ) is smooth, then it must be minimal, and thus Ω1 ∪ Ω2 ∈
F . Therefore we consider the case where ∂(Ω1 ∪ Ω2 ) contains a singular
set, which must occur where ∂Ω1 meets ∂Ω2 . The intuition here is that
∂(Ω1 ∪ Ω2 ) should have nonpositive mean curvature in a “weak” sense (visualize
two intersecting planes to see why this should be true), and indeed a result
of M. Kriele and S. Hayward [KH97] implies that it can be smoothed out in
such a way that it has nonpositive mean curvature and encloses ∂(Ω1 ∪ Ω2 ).
Now we apply Corollary 4.3 to produce a new element of F that encloses
Ω1 ∪ Ω2 .
The claim above implies that the region $\bigcup_{\Omega \in F} \Omega$ can be exhausted by a
single increasing sequence $\Omega_i$ with each $\Omega_i \subset \Omega$ and $|\partial\Omega_i| \le |S_\rho|$. Finally,
this sequence of stable minimal hypersurfaces $\partial\Omega_i$ with bounded volume
must converge by estimates of R. Schoen and Leon Simon [SS81]. The limit
gives us an enclosing minimal hypersurface boundary $\Sigma_\infty$ homologous to $S_\rho$.
By construction, $\Sigma_\infty$ must enclose all elements of $F$, which implies that it
is outermost and that property (3) holds, which in turn immediately implies the
uniqueness in property (2). The orientability follows from the fact that $\Sigma_\infty$
is homologous to $S_\rho$. Finally, if property (4) did not hold, then we could
use Corollary 4.3 to construct a new element of $F$ enclosing $\Sigma_\infty$, which is
impossible.
Figure 4.2. An asymptotically flat manifold with three ends. Corollary 4.9
guarantees that each end has a corresponding apparent horizon,
represented in the figure above by dark circles.
Corollary 4.9. Let n < 8, and let (M n , g) be a complete asymptotically
flat manifold whose boundary is either empty or minimal. If M has more
than one end, then there is an apparent horizon corresponding to each end.
(See Figure 4.2.) The result still holds if M has a boundary, as long as that
boundary has nonpositive mean curvature.
Proof. Choose a distinguished end and then cut off all of the other ends at
large coordinate spheres. Now apply Theorem 4.7 to the result.
The following corollary of Theorem 2.41 gives us some control over the
topology of apparent horizons.
Corollary 4.10. If Σ is an apparent horizon in an asymptotically flat man-
ifold (M, g) with nonnegative scalar curvature, then Σ is orientable and
Yamabe positive. In particular, when M is three-dimensional, Σ is topo-
logically a union of spheres.
Proof. Observe that Σ is automatically two-sided since it is a boundary,

and it is orientable by Theorem 4.7. Suppose Σ is not Yamabe positive.
By Theorem 4.7, an apparent horizon is an outward-minimizing minimal
hypersurface. If it were minimizing (on both sides), Theorem 2.41 would
imply that M splits as a product Σ × R, contradicting asymptotic
flatness. However, if one reviews the proof of Theorem 2.41, it becomes
clear that one-sided minimization is enough to prove a one-sided splitting
theorem.
When n = 3, we also have some good control over the topology of the
exterior, even without any scalar curvature assumption.
Theorem 4.11. Let (M 3 , g) be an asymptotically flat manifold whose bound-
ary is either empty or minimal. Assume that M contains no immersed
minimal surfaces. Then M is diffeomorphic to R3 minus a finite number
(possibly zero) of open balls.
Proof. This can be proved using a theorem of W. Meeks, L. Simon, and

S.-T. Yau [MSY82], but we present an argument due to M. Eichmair,
G. Galloway, and D. Pollack [EGP13]. The argument is simple, but it
invokes Thurston’s geometrization of 3-manifolds due to G. Perelman (see
[MT14]).
We first prove that M is simply connected. Suppose that it is not.
By work of J. Hempel [Hem76], together with the geometrization of 3-
manifolds, it follows that there exists a k-fold connected covering M̃ of M
for some k > 1. Then M̃ is an asymptotically flat manifold with at least k
ends (and minimal boundary, if any). Then by Corollary 4.9, M̃ contains
an embedded minimal surface, and hence M contains an immersed minimal
surface, contrary to our hypothesis.
Thus M is simply connected and, in particular, it is orientable. There-
fore ∂M is also orientable. Consequently, we can fill the boundary compo-
nents by handlebodies (where we consider the ball to be a 0-handlebody)
and compactify infinity at a point to obtain a closed manifold $\overline{M}$. This $\overline{M}$
is still simply connected since all of the fundamental group of a handlebody
comes from its boundary surface, which lies in the simply connected space
$M$. Or more concretely, it is not hard to see that any curve in $\overline{M}$ that
intersects one of the handlebodies is homotopic to one that does not. Now we
invoke the Poincaré-Perelman Theorem [MT07] to see that $\overline{M}$ is diffeomorphic
to $S^3$. Therefore $M$ is just $\mathbb{R}^3$ with a certain number of handlebodies
removed. All of these handlebodies must be balls, since it is clear that R3
minus a higher genus handlebody is not simply connected.
4.1.2. The Riemannian Penrose inequality. For physical reasons that

we discuss in Section 7.6, long before the positive mass theorem was proved,
R. Penrose conjectured (in three dimensions) the following refinement of the
positive mass theorem [Pen73].
Conjecture 4.12 ((Riemannian) Penrose inequality). Let $(M^n, g)$ be a
complete asymptotically flat manifold with nonnegative scalar curvature, and
let $\Sigma$ be an apparent horizon with respect to some end $M_k$. Then
\[ m_{ADM}(M_k, g) \ge \frac{1}{2} \left( \frac{|\Sigma|}{\omega_{n-1}} \right)^{\frac{n-2}{n-1}}. \]
Moreover, if equality holds, then the part of $M$ outside $\Sigma$ is isometric to half
of the Schwarzschild space of mass $m_{ADM}(M_k, g)$.
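As a consistency check, equality holds for Schwarzschild space itself. The sketch below assumes the conformally flat form $g = u^{\frac{4}{n-2}}\delta$ with $u = 1 + \frac{m}{2}|x|^{2-n}$ and $m > 0$, in which the horizon is the coordinate sphere of radius $r_0 = (m/2)^{\frac{1}{n-2}}$ fixed by the inversion. The symbol $r_0$ is notation introduced here, and $\omega_{n-1}$ is the volume of the unit $(n-1)$-sphere.

```latex
\begin{align*}
u(r_0) &= 1 + \frac{m}{2}\, r_0^{\,2-n} = 2, \\
|\Sigma| &= \omega_{n-1}\, r_0^{\,n-1}\, u(r_0)^{\frac{2(n-1)}{n-2}}
          = \omega_{n-1} \left( \frac{m}{2} \right)^{\frac{n-1}{n-2}} 4^{\frac{n-1}{n-2}}
          = \omega_{n-1}\,(2m)^{\frac{n-1}{n-2}}, \\
\frac{1}{2} \left( \frac{|\Sigma|}{\omega_{n-1}} \right)^{\frac{n-2}{n-1}} &= \frac{1}{2}\,(2m) = m .
\end{align*}
```

For $n = 3$ this is the familiar statement $|\Sigma| = 16\pi m^2$, that is, $m = \sqrt{|\Sigma| / 16\pi}$.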
The conjecture was first proved by G. Huisken and T. Ilmanen in [HI01]

when n = 3 and |Σ| is replaced by the area of any component of Σ. Their
proof is discussed in Section 4.2. Hubert Bray gave a different proof for the
general case of n = 3 in [Bra01], and this proof was later generalized to
n < 8 in [BL09]. This proof is discussed in Section 4.3.
Remark 4.13. By Theorem 4.7, we can replace the hypothesis that Σ is
an apparent horizon by the assumption that Σ is an outward-minimizing
minimal hypersurface.
Exercise 4.14. Show that the conjecture fails if we replace the hypothesis
that Σ is an apparent horizon by the assumption that Σ is an enclosing
minimal hypersurface. Hint: Try altering the Schwarzschild metric in such
a way that a large minimal surface is created “behind” the apparent horizon.
The intuition behind the outward-minimizing assumption, as illustrated

by the exercise above, is that it does not matter what happens “behind”
the horizon. Because of this, it is perhaps more natural to state the Penrose
conjecture as follows.
Conjecture 4.15 (Boundary version of the Penrose inequality). Let (M n , g)
be a complete one-ended asymptotically flat manifold with boundary and with
nonnegative scalar curvature, and assume that ∂M is an apparent horizon.
Then
\[ m_{ADM}(M, g) \ge \frac{1}{2} \left( \frac{|\partial M|}{\omega_{n-1}} \right)^{\frac{n-2}{n-1}}. \]
Moreover, if equality holds, then M is isometric to half of the Schwarzschild
space of mass mADM (M, g).
In light of this perspective, there should be a version of the positive

mass theorem with minimal boundary. Indeed, H. Bray showed that this is
a direct consequence of the usual positive mass theorem [Bra01].
Theorem 4.16. Let (M, g) be a complete one-ended asymptotically flat
manifold with minimal boundary and nonnegative scalar curvature. Then
the ADM mass of (M, g) is nonnegative.
Figure 4.3. The glued Riemannian manifold (M, g) is smooth everywhere
except along the hypersurface Σ, where g is only Lipschitz,
and we have an inequality between the mean curvatures of Σ, as computed
using g on the two different sides of Σ.
This theorem can be seen as a special case of the following.

Theorem 4.17 (Bray, Miao, Shi-Tam, and McFeron-Székelyhidi). Let
(Mout , gout ) be a complete asymptotically flat manifold with boundary, and
let (Min , gin ) be either a compact Riemannian manifold with boundary or
a complete asymptotically flat manifold with boundary. In either case, as-
sume that ∂Mout is isometric to ∂Min , and let (M, g) be the result of gluing
(Mout , gout ) and (Min , gin ) along this common boundary Σ ⊂ M .
Assume that g has nonnegative scalar curvature away from Σ, and fur-
ther assume that Hout ≤ Hin along Σ, where Hout (respectively, Hin ) is the
mean curvature of Σ as computed by gout (respectively, gin ). Here we use
the normal ν pointing toward Mout . (See Figure 4.3 for a helpful picture.)
Then the ADM mass of each end of (M, g) is nonnegative.
Furthermore, if the mass of any end is zero, then Hout = Hin along Σ,
and moreover (M, g) is Euclidean space. Or more precisely, for some α ∈
(0, 1), there exists a C 1,α diffeomorphism M −→ Rn such that gij (x) = δij
in this coordinate chart.
Observe that Theorem 4.16 is just the case when (Mout , gout ) = (Min , gin )
with minimal boundary. Pengzi Miao [Mia02] generalized Bray’s argument
to obtain the statement of Theorem 4.17, except that rigidity was only
obtained in dimension 3. Yuguang Shi and Luen-Fai Tam gave a proof for
spin manifolds using Witten’s spinor technique in [ST02], which we will
describe in detail in Section 5.4.1. Later, D. McFeron and G. Székelyhidi
gave another proof using Ricci flow and established that the rigidity holds
in general [MS12].
Observe that the reason why Theorem 4.17 does not follow directly
from the ordinary positive mass theorem (Theorem 3.18) is that the met-
ric g need not be smooth across Σ. In general, it is only Lipschitz. One
perspective on Theorem 4.17 is that it is a positive mass theorem for sin-
gular metrics, though the singular behavior here is quite mild. For more
work on the positive mass theorem for other kinds of singular metrics,
see [MM17, Lee13, GT12]. In particular, in the spin case, Theorem 4.17
can be generalized to show that the positive mass theorem holds as long as
the scalar curvature can be defined in a distributional sense [LL15]. These
sorts of theorems might be useful for trying to understand the concept of a
“weak” notion of nonnegative scalar curvature.
Exercise 4.18. Find an example showing that Theorem 4.17 fails without
the assumption Hout ≤ Hin .
Sketch of the proof of Theorem 4.17. As mentioned above, g is only

Lipschitz across Σ, so its scalar curvature is not even defined there. However,
in a sense that can be made precise, the scalar curvature Rg can still be
defined as a distribution. The question is what sort of contribution the
singular behavior of g at Σ makes to this distributional scalar curvature.
Heuristically, we can see that the condition Hout ≤ Hin ensures that the
contribution is not too negative as follows.
Let Σt be the parallel hypersurface to Σ ⊂ M that is a signed distance
of t away from Σ, so that Σt ⊂ Mout for t > 0 and Σt ⊂ Min for t < 0.
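By (2.15), we can see that
\[ \frac{\partial}{\partial t} H_{\Sigma_t} = \frac{1}{2} \left( R_{\Sigma_t} - R_g - |A_{\Sigma_t}|^2 - H_{\Sigma_t}^2 \right). \tag{4.1} \]
Though this equation is not valid at $t = 0$, we see that the singular
behavior of $R_g$ at $\Sigma$ ought to be controlled by the sign of
$-\frac{\partial}{\partial t} H_{\Sigma_t} \big|_{t=0}$, which is
distributionally greater than or equal to zero if $H_{out} \le H_{in}$.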
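The heuristic can be made slightly more explicit by solving (4.1) for $R_g$. In this formal sketch, $\delta_\Sigma$ denotes the surface measure of $\Sigma$ viewed as a distribution (notation introduced here), and $H_{\Sigma_t}$ is assumed to jump from $H_{in}$ to $H_{out}$ as $t$ increases through $0$.

```latex
\begin{align*}
R_g &= R_{\Sigma_t} - |A_{\Sigma_t}|^2 - H_{\Sigma_t}^2 - 2\,\frac{\partial}{\partial t} H_{\Sigma_t}
   && (t \ne 0), \\
\frac{\partial}{\partial t} H_{\Sigma_t} &= (\text{bounded part}) + (H_{out} - H_{in})\,\delta_\Sigma
   && (\text{distributionally across } t = 0), \\
R_g &= (\text{bounded part}) + 2\,(H_{in} - H_{out})\,\delta_\Sigma .
\end{align*}
```

So the singular part of the scalar curvature is nonnegative exactly when $H_{out} \le H_{in}$.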
We follow the proof given by Miao in [Mia02]. We first work to prove
that $m_{ADM}(g) \ge 0$. The overall strategy is the same as in the proof of
Theorem 3.43, except that we have to be more careful with how we choose our
smooth approximation. The conformal deformation step is essentially the
same. Consider a neighborhood of $\Sigma$ with Gaussian normal coordinates in
which $g$ can be written as $g = h_t + dt^2$ on $\Sigma \times (-\delta, \delta)$, where $h_t$ is the induced
metric on $\Sigma \times \{t\}$. For each $\epsilon \in (0, \delta)$, we define $g_\epsilon = h^\epsilon_t + dt^2$, where $h^\epsilon_t$ is
a mollification of $h_t$ over the $t$-variable only, such that $h^\epsilon_t = h_t$ for $|t| > \epsilon$.
Let $\Sigma^\epsilon_t$ refer to $\Sigma \times \{t\}$, thought of as a hypersurface of $(M, g_\epsilon)$. Its induced
metric is $h^\epsilon_t$, and its second fundamental form is $A_{\Sigma^\epsilon_t} = \frac{1}{2} \frac{\partial}{\partial t} h^\epsilon_t$. The key point
here is that since $H_{out} \le H_{in}$, one can show that $-\frac{\partial}{\partial t} H_{\Sigma^\epsilon_t}$ is uniformly bounded
from below in $\epsilon$. The details of the mollification and this computation can
be found in [Mia02]. Since $R_{\Sigma^\epsilon_t}$ and $|A_{\Sigma^\epsilon_t}|^2$ are clearly uniformly bounded in
$\epsilon$, equation (4.1) shows that $R_{g_\epsilon}$ is uniformly bounded from below in $\epsilon$. The
upshot is the following. After an appropriate mollification, we now have a
smooth metric $g_\epsilon$ which agrees with $g$ away from the $\epsilon$-neighborhood of $\Sigma$,
and there is a constant $C > 0$, independent of $\epsilon$, such that $R_{g_\epsilon} > -C$ inside
that neighborhood.
We now make a conformal change that removes all of the negative scalar
curvature, just as in the proof of Theorem 3.43. We attempt to solve
\[ -\frac{4(n-1)}{n-2}\Delta_{g_\epsilon} v_\epsilon - (R_{g_\epsilon})_- v_\epsilon = (R_{g_\epsilon})_- \]
by showing that the operator
\[ -\frac{4(n-1)}{n-2}\Delta_{g_\epsilon} - (R_{g_\epsilon})_- : W^{2,p}_{-q} \longrightarrow L^p_{-q-2} \]
is an isomorphism, where $p > n/2$ and $0 < q < n - 2$. Once again, by
Theorem A.40, $\Delta_{g_\epsilon}$ is an isomorphism with an injectivity estimate
independent of $\epsilon$. As before, we need $(R_{g_\epsilon})_-$ to converge to 0 in $L^p_{-2}$ to show
that our operator can be chosen close enough to $-\frac{4(n-1)}{n-2}\Delta_{g_\epsilon}$ to make it an
isomorphism. But since $|(R_{g_\epsilon})_-| < C$ and it is supported in an $\epsilon$-neighborhood
of $\Sigma$, we have $\|(R_{g_\epsilon})_-\|_{L^p_{-2}} = O(\epsilon^{1/p}) \to 0$. Hence, we can solve for $v_\epsilon$ and define
\[ \tilde g_\epsilon = (1 + v_\epsilon)^{\frac{4}{n-2}} g_\epsilon. \]
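The smallness of the weighted norm is a one-line volume estimate. This sketch assumes a weighted norm of the form $\|f\|_{L^p_{-2}}^p = \int_M |f|^p \sigma^{2p-n}\, d\mu$ for a weight function $\sigma \ge 1$ comparable to $|x|$ in the ends (conventions vary, and the one used in the appendix may differ by harmless factors). It writes $N_\epsilon$ for the $\epsilon$-neighborhood of $\Sigma$ and uses that $\Sigma$ is compact; $N_\epsilon$, $C'$, and $\sigma$ are notation introduced here.

```latex
\[
\|(R_{g_\epsilon})_-\|_{L^p_{-2}}^p
 = \int_{N_\epsilon} |(R_{g_\epsilon})_-|^p\,\sigma^{2p-n}\, d\mu
 \le C^p \Bigl( \sup_{N_\epsilon} \sigma^{2p-n} \Bigr)\, \mathrm{vol}(N_\epsilon)
 \le C'\,\epsilon,
\qquad\text{so}\qquad
\|(R_{g_\epsilon})_-\|_{L^p_{-2}} = O(\epsilon^{1/p}) .
\]
```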
The rest of the argument is the same as in the proof of Theorem 3.43,
except that it no longer needs to be true that $\tilde g_\epsilon \to g$ in $W^{2,p}_{-q}$ or that
$R_{\tilde g_\epsilon} \to R_g$ in $L^1$. But the convergence still holds away from $\Sigma$, and that is
all that is needed to invoke Lemma 3.35, since ADM mass is an asymptotic
quantity.
We now consider the case $m_{ADM}(g) = 0$. Suppose that $H_{out} < H_{in}$
somewhere in $\Sigma$. In this case, one can go further and show that $(R_{g_\epsilon})_+$ is
quite large near there, and consequently so is the nonnegative scalar curvature
$R_{\tilde g_\epsilon}$. As seen in the proof of Lemma 3.31, this positive scalar curvature
implies that $m_{ADM}(\tilde g_\epsilon) > 0$. The hard part is to then show there is a positive
lower bound independent of $\epsilon$, and consequently $m_{ADM}(g) > 0$, giving
a contradiction. See [Mia02] for details.
Unfortunately, this argument does not yield the full rigidity result that
if mADM (g) = 0, then (M, g) is Euclidean space. As mentioned earlier,
McFeron and Székelyhidi [MS12] came up with an alternative method of
proof using Ricci flow in place of the conformal change. The main advantage
of their method is that it does yield a rigidity result. We give only a very brief
outline of the proof. Recall that Ricci flow preserves asymptotic flatness,
nonnegative scalar curvature, and ADM mass. Work of Miles Simon shows
that there exists a Ricci-DeTurck flow g(t) with initial condition g(0) =
$g$, which is Lipschitz but not smooth [Sim02]. Let $g_\epsilon$ denote the same
smoothing of $g$ constructed above by Miao, and consider the Ricci flows $g_\epsilon(t)$
with initial condition $g_\epsilon$. The idea is to compare $g_\epsilon(t)$ to $g(t)$, which will
simultaneously exist for some small time $T$. Miao’s construction shows that
$(R_{g_\epsilon})_-$ is negligible (in an integral sense) and then, since Ricci flow has the
effect of increasing scalar curvature, $(R_{g_\epsilon(t)})_-$ is also controlled as $\epsilon \to 0$.
McFeron and Székelyhidi showed that this control implies that $R_{g(t)} \ge 0$
and also that $m_{ADM}(g(t)) \le \liminf_{\epsilon \to 0} m_{ADM}(g_\epsilon(t))$. Using the positive mass
theorem on the metric $g(t)$ and the fact that Ricci flow (and Ricci-DeTurck
flow) preserves ADM mass, we have, for any small $t < T$,
\[ 0 \le m_{ADM}(g(t)) \le \liminf_{\epsilon \to 0} m_{ADM}(g_\epsilon(t)) = \liminf_{\epsilon \to 0} m_{ADM}(g_\epsilon) = m_{ADM}(g). \]
If mADM (g) = 0, then the inequalities become equalities. In particular,

mADM (g(t)) = 0, so g(t) is Euclidean space for all t < T , which implies that
g is also Euclidean since it is the limit of g(t) as t → 0.
We end this section by observing that in order to prove the inequality in

Conjecture 4.15, one may assume without loss of generality that the metric
is harmonically flat outside a compact set.
Lemma 4.19. Let (M n , g) be a complete one-ended asymptotically flat man-

ifold with boundary and with nonnegative scalar curvature, and assume that
∂M is an apparent horizon. Suppose that (M, g) violates the Penrose in-
equality (as given in Conjecture 4.15). Then there exists a complete one-
ended asymptotically flat manifold with boundary, (M̂ , ĝ), with nonnegative
scalar curvature such that ∂ M̂ is an apparent horizon, (M̂ , ĝ) also violates
the Penrose inequality, and moreover ĝ is harmonically flat outside a com-
pact set.
Proof. Let $\epsilon > 0$. First, we double (M, g) through its boundary to
obtain $(\bar M, \bar g)$. We apply the argument from Theorem 4.17 to obtain a
smoothing $(\bar M, \bar g_\epsilon)$ such that $\bar g_\epsilon$ has nonnegative scalar curvature, is $\epsilon$-close
to $\bar g$ in $C^0$, and has mass within $\epsilon$ of $m_{\mathrm{ADM}}(g)$. Next we apply our density
result (Lemma 3.48) to obtain a new harmonically flat metric $\hat g_\epsilon$ on $\bar M$ that
has nonnegative scalar curvature, is $\epsilon$-close to $\bar g_\epsilon$ in $C^0$, and has mass within
$\epsilon$ of $m_{\mathrm{ADM}}(g)$. By Corollary 4.9, $(\bar M, \hat g_\epsilon)$ has an apparent horizon $\Sigma_\epsilon$ with
respect to one of the ends. Let $\hat M_\epsilon$ be a manifold with boundary obtained
by removing the part of $\bar M$ enclosed by $\Sigma_\epsilon$, so that $\partial\hat M_\epsilon = \Sigma_\epsilon$.
We claim that if the Penrose inequality fails for (M, g), then it will also
fail for $(\hat M_\epsilon, \hat g_\epsilon)$ for small enough $\epsilon$. This is because the mass does not change
much, while the volume of the outermost minimal hypersurface $\Sigma_\epsilon$ in $(\bar M, \hat g_\epsilon)$
cannot be significantly smaller than the volume of ∂M in (M, g). The reason
for this is that ∂M is not just outward-minimizing, but minimizing in $(\bar M, \bar g)$.
Thus $|\Sigma_\epsilon|_{\bar g} \ge |\partial M|_g$, and the $C^0$ closeness of $\hat g_\epsilon$ to $\bar g$ means that $|\Sigma_\epsilon|_{\hat g_\epsilon}$ cannot
be much smaller than $|\partial M|_g$. The result follows.
4.1.3. Special cases of the Penrose inequality. As we did for the pos-
itive mass theorem, we can consider some simple cases of the Penrose in-
equality. Of course, the easiest is the spherically symmetric case.
Proposition 4.20. Let (M, g) be a complete asymptotically flat manifold
diffeomorphic to $[0, \infty) \times S^{n-1}$ which is spherically symmetric in the sense of
Section 3.1.1. If g has nonnegative scalar curvature and ∂M is an apparent
horizon, then
\[ m_{\mathrm{ADM}}(M, g) \ge \frac{1}{2}\left(\frac{|\partial M|}{\omega_{n-1}}\right)^{\frac{n-2}{n-1}}. \]
Proof. Since ∂M is an apparent horizon, our discussion in Section 3.1.1
implies that there exists a diffeomorphism between M and $[r_0, \infty) \times S^{n-1}$
such that
\[ g = V^{-1}\,dr^2 + r^2\,d\Omega^2 \]
for some positive function V(r), where $d\Omega^2$ is the standard round unit sphere
metric, and $r_0$ is determined by the equation $|\partial M| = \omega_{n-1} r_0^{n-1}$.
Observe that minimality of ∂M implies that $\lim_{r\to r_0^+} V(r) = 0$. Following
the same proof we used for Proposition 3.20, we observe that
\[ \frac{1}{2}\left(\frac{|\partial M|}{\omega_{n-1}}\right)^{\frac{n-2}{n-1}} = \lim_{r\to r_0^+} \frac{1}{2} r^{n-2}(1 - V(r)) \le \lim_{r\to\infty} \frac{1}{2} r^{n-2}(1 - V(r)) = m_{\mathrm{ADM}}(g). \]
If we have equality, then $\frac{1}{2} r^{n-2}(1 - V(r))$ is identically equal to $m := m_{\mathrm{ADM}}(g)$
for all r, which means that $V(r) = 1 - \frac{2m}{r^{n-2}}$. Therefore g is
the Schwarzschild metric of mass m (3.1), and $r_0 = (2m)^{\frac{1}{n-2}}$.
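The equality case can be checked numerically. The following is a minimal sketch, assuming the standard formula $\omega_{k} = 2\pi^{(k+1)/2}/\Gamma((k+1)/2)$ for the area of the unit k-sphere; the function names are illustrative and not taken from the text. It computes the right side of the Penrose inequality for the horizon $r_0 = (2m)^{1/(n-2)}$ of the Schwarzschild metric and compares it with m.

```python
import math

def unit_sphere_area(k):
    """Area omega_k of the unit k-sphere S^k (standard Gamma-function formula)."""
    return 2 * math.pi ** ((k + 1) / 2) / math.gamma((k + 1) / 2)

def penrose_rhs(boundary_area, n):
    """Right side of the Penrose inequality: (1/2) (|dM| / omega_{n-1})^((n-2)/(n-1))."""
    return 0.5 * (boundary_area / unit_sphere_area(n - 1)) ** ((n - 2) / (n - 1))

def schwarzschild_horizon_area(m, n):
    """Area omega_{n-1} r0^(n-1) of the horizon r0 = (2m)^(1/(n-2))."""
    r0 = (2 * m) ** (1 / (n - 2))
    return unit_sphere_area(n - 1) * r0 ** (n - 1)

if __name__ == "__main__":
    for n in (3, 4, 5):
        for m in (0.5, 1.0, 2.3):
            area = schwarzschild_horizon_area(m, n)
            # In the equality case the two printed numbers should agree.
            print(n, m, penrose_rhs(area, n))
```

In dimension n = 3 the right side reduces to $\sqrt{|\partial M|/16\pi}$, the form used in Theorem 4.22.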
One can also consider the graphical case.

Theorem 4.21 (Lam [Lam11], Huang-Wu [HW15]). Let Ω be an enclosed
region of $\mathbb{R}^n$ such that each component of ∂Ω is smooth, has positive mean
curvature, and is outward-minimizing with respect to the Euclidean metric
on $\mathbb{R}^n$. Let $f \in C^\infty(\mathbb{R}^n \setminus \bar\Omega) \cap C^0(\mathbb{R}^n \setminus \Omega)$ such that f is constant on ∂Ω and
$\lim_{x\to\infty} f(x)$ is either a constant or ∞. Let M be the graph of f in $\mathbb{R}^{n+1}$,
and let g be the metric on M induced by the Euclidean metric on $\mathbb{R}^{n+1}$.
Assume that $f_i f_j = O_2(|x|^{-q})$ for some $q > \frac{n-2}{2}$, where the subscripts on f
denote partial differentiation, and assume that $R_g$ is integrable over (M, g).
If g has nonnegative scalar curvature and ∂M is minimal in (M, g), then
\[ m_{\mathrm{ADM}}(M, g) \ge \frac{1}{2}\left(\frac{|\partial M|}{\omega_{n-1}}\right)^{\frac{n-2}{n-1}}. \]

Proof. First observe that minimality of ∂M in (M, g) implies that $|\partial f| \to \infty$
as we approach ∂Ω. If we simply follow the proof of Theorem 3.23 and apply
the divergence theorem on $\mathbb{R}^n \setminus \Omega$, then similar to what we saw in (3.9), we
obtain
\[ m_{\mathrm{ADM}}(M, g) \ge \frac{1}{2(n-1)\omega_{n-1}} \int_{\partial\Omega} H_{\partial\Omega}\, d\mu_{\partial\Omega}, \]
where $H_{\partial\Omega}$ is the mean curvature of ∂Ω in Euclidean $\mathbb{R}^n$.
For the last step, a classical inequality of Minkowski (which is a special
case of the Alexandrov-Fenchel inequality) states that for a convex surface Σ,
\[ \frac{1}{2(n-1)\omega_{n-1}} \int_{\Sigma} H_{\Sigma}\, d\mu_{\Sigma} \ge \frac{1}{2}\left(\frac{|\Sigma|}{\omega_{n-1}}\right)^{\frac{n-2}{n-1}}. \]
An argument of G. Huisken using inverse mean curvature flow shows that
this classical result can be generalized to any outward-minimizing Σ with
positive mean curvature. (See [FS14] for a proof.) If we apply this inequality
to every component of ∂Ω and use the fact that $\sum_i A_i^{\frac{n-2}{n-1}} \ge \left(\sum_i A_i\right)^{\frac{n-2}{n-1}}$, we
obtain the desired Penrose inequality, keeping in mind that the volume of
∂Ω inside Euclidean space equals the volume of ∂M in (M, g).
While the inequality is due to G. Lam [Lam11], the rigidity is due to
L.-H. Huang and D. Wu [HW15]. Turning all inequalities into equalities in
the proof described above is enough to show that g is scalar-flat and ∂Ω is a
round sphere, but to go further Huang and Wu show that the mean curvature of
the graph of f inside $\mathbb{R}^{n+1}$ does not change sign (just as in Theorem 3.23)
and then use their strong maximum principle for the scalar curvature of
graphs to obtain the desired result.
4.2. Inverse mean curvature flow

4.2.1. Hawking mass. As mentioned, the Penrose inequality was first
proved in three dimensions by Huisken and Ilmanen, with the area of the
minimal surface replaced by the area of its largest component [HI01].
Theorem 4.22 (Huisken-Ilmanen’s Penrose inequality). Let $(M^3, g)$ be a
complete one-ended asymptotically flat manifold with boundary, whose asymptotic
decay rate (as in Definition 3.5) is at least 1. Assume that g has
nonnegative scalar curvature, and that ∂M is an apparent horizon. Then
\[ m_{\mathrm{ADM}}(M, g) \ge \sqrt{\frac{A}{16\pi}}, \]
where A is the area of any component of ∂M. Moreover, if equality holds,
then M is isometric to half of the Schwarzschild space of mass $m_{\mathrm{ADM}}(M, g)$.
Note that by Lemma 4.19, one can assume without loss of generality
that the asymptotic decay rate is at least 1, for the purpose of proving the
inequality, but not for the purpose of characterizing the equality case.
Our goal in this section is to summarize the main features of Huisken
and Ilmanen’s argument. We restrict our attention to dimension 3, since
the techniques described here do not generalize well to higher dimensions.
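Definition 4.23. Given a closed surface $\Sigma^2$ (see footnote 2) in a Riemannian manifold
$(M^3, g)$, we define its Hawking mass to be
\[ m_{\mathrm{Haw}}(\Sigma) = \sqrt{\frac{|\Sigma|}{16\pi}}\left(1 - \frac{1}{16\pi}\int_{\Sigma} H^2\, d\mu_{\Sigma}\right). \]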
Hawking mass, introduced in [Haw68], was one of the first examples of

a quasi-local mass. The term quasi-local mass is widely used in the literature
but has no precise meaning on its own. It is a phrase used to describe a
concept. Given a region, or the boundary of a region, the quasi-local mass is
supposed to be some kind of measurement of “the amount of mass enclosed.”
As we discussed, mass only really makes sense when measured at infinity,
but for various purposes it is useful to try to understand how much a given
region of space “contributes” to this mass. Of course, in Newtonian gravity
this is straightforward—it is simply the integral of the mass density function
over the region, or equivalently the corresponding boundary flux integral
obtained from applying the divergence theorem. But in general relativity,
there is no obvious way to measure this quasi-local mass, or even say what
it means. In fact, even the question of which properties are considered
desirable in a quasi-local mass is open for debate. However, as is the case
in much of mathematics, the desired properties depend on context and are
closely related to the desired applications. Because of this, there are many
different notions of quasi-local mass, all of which may be interesting and
useful in their own ways.
2 This formula for the Hawking mass is really only appropriate for spheres, but since we are
primarily interested in the Hawking mass of spheres in this section, it is convenient to define the
Hawking mass this way.
Getting back to the Hawking mass, this particular quasi-local mass was
originally motivated by physical considerations, but mathematically we see
that it includes the Willmore energy $\int_{\Sigma} H^2\, d\mu_{\Sigma}$, which was traditionally
studied in $\mathbb{R}^3$ because of its invariance under conformal transformations of
Euclidean $\mathbb{R}^3 \cup \{\infty\}$ (perhaps first discovered by W. Blaschke [Kle68]).
Theorem 4.24 (Willmore inequality [Wil65]). Let Σ be an orientable,
immersed closed surface in Euclidean $\mathbb{R}^3$. Then
\[ \int_{\Sigma} H^2\, d\mu_{\Sigma} \ge 16\pi, \]
with equality only for round spheres.
We include the proof given in [MN14], mainly because it is short, ele-

gant, and elementary.
Proof. We consider the Gauss map ν : Σ −→ S 2 that assigns to each point

in Σ its outward unit normal, thought of as an element of R3 . The derivative
of this map, Dν : T Σ −→ T S 2 , is essentially the shape operator of Σ, whose
determinant is the Gauss curvature K. Let V ⊂ Σ be the set of points in Σ
where K ≥ 0. We claim that ν : V −→ S 2 is surjective. The reason is that
for each direction v ∈ S 2 , we can maximize the value of x · v over x ∈ Σ.
This finds the most extreme point in Σ in the v direction. At this extreme
point, it is clear that ν(x) = v, and that K ≥ 0 at x, proving the claim.
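Given the claim, we apply the area formula to the map ν (in the last
inequality below) to see that
\begin{align*}
\int_{\Sigma} H^2\, d\mu_{\Sigma} &\ge \int_{V} H^2\, d\mu_{\Sigma} \\
&\ge 4\int_{V} K\, d\mu_{\Sigma} \\
&= 4\int_{V} \det(D\nu)\, d\mu_{\Sigma} \\
&\ge 4|S^2| = 16\pi.
\end{align*}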
We omit the equality case, except to note that we can only have H 2 = 4K
where the surface is umbilic (that is, the principal curvatures are equal).
The famed Willmore conjecture [Wil65] states that any immersed torus
has
\[ \int_{\Sigma} H^2\, d\mu_{\Sigma} \ge 8\pi^2, \]
with equality only for so-called Willmore tori. This conjecture was proved
by Fernando Marques and André Neves using minimal surface techniques
[MN14].
Translated for our purposes, the Willmore inequality states that the
Hawking mass of an embedded surface in Euclidean R3 is nonpositive and
zero only for round spheres. One way to interpret this is that since Euclidean
space “contains no mass” in any sense, the Hawking mass only gives a “good”
quasi-local mass for round spheres. In general, we can easily see from the
definition that if we put lots of little wiggles in Σ, its Hawking mass can
be made to be an arbitrarily large negative number. The upshot is that
Hawking mass is most useful when Σ is somewhat nice.
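As a numerical illustration of these statements, here is a minimal sketch that evaluates $\int_\Sigma H^2\, d\mu_\Sigma$ on a torus of revolution in Euclidean $\mathbb{R}^3$. The parametrization is an assumption of the sketch rather than something taken from the text: the tube radius is r, the distance from the axis to the tube center is R > r, the principal curvatures at tube angle θ are 1/r and cos θ/(R + r cos θ), and the area element is r(R + r cos θ) dθ dφ. The function names are illustrative.

```python
import math

def willmore_integral_torus(R, r, steps=2000):
    """Midpoint-rule value of the integral of H^2 over a torus of revolution.

    R is the distance from the axis to the center of the tube, r < R is the
    tube radius, and H is the sum of the two principal curvatures.
    """
    dtheta = 2 * math.pi / steps
    total = 0.0
    for k in range(steps):
        theta = (k + 0.5) * dtheta
        rho = R + r * math.cos(theta)          # distance from the axis
        H = 1.0 / r + math.cos(theta) / rho    # sum of principal curvatures
        total += H * H * r * rho * dtheta      # integrand times area density
    return 2 * math.pi * total                 # the phi-integral is trivial

def hawking_mass(area, willmore_integral):
    """Hawking mass of Definition 4.23 from |Sigma| and the integral of H^2."""
    return math.sqrt(area / (16 * math.pi)) * (1 - willmore_integral / (16 * math.pi))

if __name__ == "__main__":
    # Torus with R / r = sqrt(2): the integral should be close to 8 pi^2.
    w = willmore_integral_torus(math.sqrt(2), 1.0)
    print(w, 8 * math.pi ** 2)
    # Its Hawking mass in Euclidean space is negative.
    print(hawking_mass(4 * math.pi ** 2 * math.sqrt(2), w))
    # A round sphere of radius 1 has area 4 pi and integral 16 pi.
    print(hawking_mass(4 * math.pi, 16 * math.pi))
```

For this family the integral has the closed form $4\pi^2 c^2/\sqrt{c^2 - 1}$ with c = R/r, which is minimized at $c = \sqrt{2}$ with value $8\pi^2$, consistent with the bound in the Willmore conjecture.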
Exercise 4.25. Let (M 3 , g) be an asymptotically flat manifold. Let Σρ be
the coordinate sphere of radius ρ in an asymptotically flat coordinate chart.
Prove that limρ→∞ mHaw (Σρ ) = mADM (g).
One useful property of the Hawking mass is that it is monotone under

inverse mean curvature flow. Both the flow and the monotonicity were first
discovered by R. Geroch [Ger73].
Definition 4.26. We say that a family of hypersurfaces $\Sigma_t$ in a Riemannian
manifold $(M^n, g)$ is evolving under inverse mean curvature flow (or IMCF)
if its first-order deformation vector field is $-\frac{\mathbf{H}}{H^2}$. More explicitly, if we think
of our family of surfaces as a family of maps $\Phi_t : \Sigma \longrightarrow M$ with $\Phi_t(\Sigma) = \Sigma_t$,
then $\Phi_t$ being an inverse mean curvature flow means that for each x ∈ Σ,
\[ \frac{d}{dt}\Phi_t(x) = -\frac{\mathbf{H}}{H^2} = \frac{1}{H}\,\nu, \]
where the right side involves the mean curvature of $\Sigma_t$ evaluated at $\Phi_t(x)$,
and ν is chosen to point in the opposite direction from $\mathbf{H}$. In particular, the
flow can only be well-defined if H is nonvanishing, so we may as well choose
ν so that H > 0. (Typically, ν will be the outward normal.)
Theorem 4.27 (Geroch monotonicity [Ger73]). Let $\Sigma_t$ be a family of connected
two-sided closed surfaces evolving by inverse mean curvature flow in
a Riemannian manifold $(M^3, g)$ with nonnegative scalar curvature. Then
\[ \frac{d}{dt}\, m_{\mathrm{Haw}}(\Sigma_t) \ge 0. \]
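Proof. We already have variation formulas for a normal variation X = ϕν
in Section 2.2. We simply need to apply these formulas when $\varphi = H^{-1}$. We
take it piece by piece. First, recall from Proposition 2.10 (and the discussion
following it) that
\[ \frac{\partial}{\partial t}\, d\mu_t = H\varphi\, d\mu_t = d\mu_t, \]
where $d\mu_t$ denotes the area measure on $\Sigma_t$. Indeed, one way of thinking
about inverse mean curvature flow is that it is precisely the flow that makes
the area measure of $\Sigma_t$ grow exponentially. In particular,
\[ \frac{d}{dt}|\Sigma_t| = |\Sigma_t|. \]
Also recall from (2.15) that
\[ \frac{\partial}{\partial t} H = -\Delta(H^{-1}) + \frac{1}{2}\left(2K - R - |A|^2 - H^2\right)H^{-1}, \]
where K is the Gauss curvature of $\Sigma_t$, R is the scalar curvature of the
ambient metric g, and we suppress much of the dependence on t in our
notation. In the following, $\lambda_1$ and $\lambda_2$ will denote the principal curvatures of
$\Sigma_t$. We compute
\begin{align*}
\frac{d}{dt}\int_{\Sigma_t} H^2\, d\mu_t
&= \int_{\Sigma_t} \left(2H\left[-\Delta(H^{-1}) + \frac{1}{2}\left(2K - R - |A|^2 - H^2\right)H^{-1}\right] + H^2\right) d\mu_t \\
&= \int_{\Sigma_t} \left(2\langle\nabla H, \nabla(H^{-1})\rangle + 2K - R - |A|^2\right) d\mu_t \\
&= \int_{\Sigma_t} \left(-2H^{-2}|\nabla H|^2 + 2K - R - |A|^2\right) d\mu_t \\
&= 4\pi\chi(\Sigma_t) + \int_{\Sigma_t} \left(-2H^{-2}|\nabla H|^2 - R - \frac{1}{2}(\lambda_1 - \lambda_2)^2 - \frac{1}{2}H^2\right) d\mu_t \\
&\le 8\pi - \frac{1}{2}\int_{\Sigma_t} H^2\, d\mu_t,
\end{align*}
where we used integration by parts in the second line, the Gauss-Bonnet
Theorem in the fourth line, and connectedness of $\Sigma_t$ and nonnegativity of
R in the last line. Thus
\begin{align*}
(16\pi)^{3/2}\,\frac{d}{dt}\, m_{\mathrm{Haw}}(\Sigma_t)
&= \frac{d}{dt}\left[\sqrt{|\Sigma_t|}\left(16\pi - \int_{\Sigma_t} H^2\, d\mu_t\right)\right] \\
&= \sqrt{|\Sigma_t|}\left[\frac{1}{2}\left(16\pi - \int_{\Sigma_t} H^2\, d\mu_t\right) - \frac{d}{dt}\int_{\Sigma_t} H^2\, d\mu_t\right] \\
&\ge 0,
\end{align*}
using our previous calculation.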
Note that the use of the Gauss-Bonnet formula above restricts this ar-
gument to dimension 3. The monotonicity provides an idea for how to prove
the Penrose inequality in three dimensions.
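The computation above can be tested on an explicit family. The following is a minimal sketch on the unit round 3-sphere, which is not an example from the text: it assumes the standard facts that the scalar curvature is R = 6 and that a geodesic sphere of radius r < π/2 has area $4\pi\sin^2 r$ and mean curvature 2 cot r. By symmetry these spheres evolve under IMCF through concentric geodesic spheres, and exponential area growth gives $\sin r(t) = e^{t/2}\sin r_0$. Since they are umbilic with constant H and χ = 2, the two displays in the proof (before the final inequality is applied) give $(16\pi)^{3/2}\frac{d}{dt} m_{\mathrm{Haw}} = |\Sigma_t|^{1/2}\int_{\Sigma_t} R\, d\mu_t$, which the code compares with a finite-difference derivative.

```python
import math

SCALAR_CURVATURE_S3 = 6.0  # scalar curvature of the unit round 3-sphere

def sphere_area(r):
    """Area of the geodesic sphere of radius r in the unit round S^3."""
    return 4 * math.pi * math.sin(r) ** 2

def hawking_mass(r):
    """Hawking mass of the geodesic sphere of radius r, using H = 2 cot(r)."""
    area = sphere_area(r)
    H = 2.0 / math.tan(r)
    return math.sqrt(area / (16 * math.pi)) * (1 - H * H * area / (16 * math.pi))

def radius_at_time(t, r0):
    """Radius at IMCF time t, from |Sigma_t| = e^t |Sigma_0|."""
    return math.asin(math.exp(t / 2) * math.sin(r0))

def hawking_mass_rate(r0, dt=1e-4):
    """Central-difference approximation of d/dt m_Haw at t = 0."""
    return (hawking_mass(radius_at_time(dt, r0))
            - hawking_mass(radius_at_time(-dt, r0))) / (2 * dt)

def geroch_rate(r):
    """|Sigma|^(1/2) * (integral of R over Sigma) / (16 pi)^(3/2)."""
    area = sphere_area(r)
    return math.sqrt(area) * SCALAR_CURVATURE_S3 * area / (16 * math.pi) ** 1.5

if __name__ == "__main__":
    for r0 in (0.3, 0.8, 1.2):
        # The two columns should agree, and both are positive.
        print(r0, hawking_mass_rate(r0), geroch_rate(r0))
```

Here the Hawking mass works out to $\frac{1}{2}\sin^3 r$, so it is strictly increasing along the flow, as the theorem predicts when R > 0.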
Let $(M^3, g)$ be a complete one-ended asymptotically flat manifold with
boundary, with nonnegative scalar curvature. Assume that ∂M is minimal
and outward-minimizing. Suppose there is an inverse mean curvature flow
$\Sigma_t$ with initial surface $\Sigma_0 = \partial M$, and suppose that this flow exists for all
time and $\Sigma_t$ flows out toward the infinity of the end. Then by definition of
Hawking mass and Geroch monotonicity, we have
\[ \sqrt{\frac{|\Sigma_0|}{16\pi}} = m_{\mathrm{Haw}}(\Sigma_0) \le \lim_{t\to\infty} m_{\mathrm{Haw}}(\Sigma_t). \]
Because of the parabolic nature of inverse mean curvature flow, it is reasonable
to hope that the flow makes $\Sigma_t$ “nice” as t → ∞, and then, in light of
Exercise 4.25, we might hope that $\lim_{t\to\infty} m_{\mathrm{Haw}}(\Sigma_t) = m_{\mathrm{ADM}}(g)$. (Though
of course, we only need inequality.)
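As a sanity check on this heuristic, consider the three-dimensional Schwarzschild metric in the form $g = V^{-1}dr^2 + r^2 d\Omega^2$ with $V(r) = 1 - 2m/r$ used in Proposition 4.20. The sketch below assumes the standard fact, not derived in this section, that the coordinate sphere $S_r$ is umbilic with mean curvature $H = 2\sqrt{V(r)}/r$.

```latex
\begin{align*}
\int_{S_r} H^2\, d\mu &= \frac{4V(r)}{r^2}\cdot 4\pi r^2 = 16\pi V(r), \\
m_{\mathrm{Haw}}(S_r) &= \sqrt{\frac{4\pi r^2}{16\pi}}\,\bigl(1 - V(r)\bigr)
  = \frac{r}{2}\cdot\frac{2m}{r} = m .
\end{align*}
```

Under these assumptions the Hawking mass is constant along the coordinate spheres, which for r > 2m form a smooth inverse mean curvature flow with $r(t) = 2m\,e^{t/2}$ (so that the area $4\pi r^2$ grows like $e^t$). At the horizon r = 2m we get $\sqrt{|\Sigma_0|/16\pi} = m$, so every inequality in the argument above is an equality in this example.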
Huisken and Ilmanen’s proof of the Penrose inequality essentially makes
this argument rigorous. First, how does one construct an inverse mean
curvature flow? Since the flow is parabolic when H > 0, standard parabolic
theory can be used to prove short-time existence for the flow if the initial
surface has H > 0. However, we want to use the minimal surface Σ0 as our
initial data, and, of course, it has H = 0. This is not merely a technical
issue; we will see that if Σ0 is minimal, then we need it to also be strictly
outward-minimizing in order for the flow to be continuous at time t = 0.
(To get a feel for why this must be so, imagine trying to apply the IMCF
argument to your counterexample from Exercise 4.14.) The larger problem
is that the overall argument requires long-time existence. In general, we do
not expect the flow to exist for all time. Singularities can and will occur.
The solution to this problem is to use a weak formulation of inverse mean
curvature flow that does exist for all time. Of course, one then has to prove
that Geroch monotonicity holds for the weak flow.
4.2.2. Huisken and Ilmanen’s weak inverse mean curvature flow.

Huisken and Ilmanen used a level set approach, inspired by earlier work on
mean curvature flow [ES91, CGG91]. In the level set approach, we think
of Σt as the level set u−1 (t) of a function u : M −→ R which we call the
arrival time function, since u(x) describes the time when the flow reaches
the point x. This point of view is reasonable since we want the inverse mean
curvature flow to push the surface in only one direction (never moving over
the same point twice). Let us translate inverse mean curvature flow of Σt
into a statement about the arrival time function u. The outward unit normal
to the level set $\Sigma_t$ can be written
\[ \nu = \frac{\nabla u}{|\nabla u|}. \]
From this, we have
\[ H = \operatorname{div}_{\Sigma}\left(\frac{\nabla u}{|\nabla u|}\right) = \operatorname{div}_{M}\left(\frac{\nabla u}{|\nabla u|}\right), \]
since the normal derivative makes no contribution. Meanwhile, the speed
of any level set flow is $|\nabla u|^{-1}$. So the inverse mean curvature flow equation
becomes $|\nabla u|^{-1} = H^{-1}$, or
\[ \operatorname{div}_{M}\left(\frac{\nabla u}{|\nabla u|}\right) = |\nabla u|. \tag{4.2} \]
A smooth solution of inverse mean curvature flow gives rise to an arrival time
function u satisfying equation (4.2) with $\nabla u \neq 0$. Because of this, when we
talk about “solutions of inverse mean curvature flow,” we are sometimes
referring to the function u and sometimes referring to its level sets (or even
its sublevel sets). However, the meaning will usually be clear from the
context.
Exercise 4.28. Verify that $u = (n - 1)\log|x|$ solves inverse mean curvature
flow (4.2) on $\mathbb{R}^n \setminus \{0\}$ with the Euclidean metric.
Also check that $u = \frac{1}{2}(n - 1)\log|x|$ is a subsolution of inverse mean
curvature flow on any asymptotically flat end $(\mathbb{R}^n \setminus \bar B_1, g)$ where |x| is sufficiently
large. Being a subsolution means that
\[ \operatorname{div}_{M}\left(\frac{\nabla u}{|\nabla u|}\right) \ge |\nabla u|, \]
or, in other words, the level sets always move outward at least as fast as
they would under inverse mean curvature flow.
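Before attempting the first part of the exercise, it can be reassuring to test equation (4.2) numerically. The following is a minimal sketch that evaluates both sides of (4.2) for $u = (n-1)\log|x|$ in Euclidean space using central differences; the function names and step sizes are illustrative choices, and a small residual is only a sanity check, not a proof.

```python
import math

def arrival_time(x):
    """u(x) = (n - 1) log|x| on R^n minus the origin, with n = len(x)."""
    n = len(x)
    return (n - 1) * math.log(math.sqrt(sum(c * c for c in x)))

def gradient(f, x, h=1e-5):
    """Central-difference gradient of f at the point x."""
    grad = []
    for i in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[i] += h
        backward[i] -= h
        grad.append((f(forward) - f(backward)) / (2 * h))
    return grad

def unit_normal_and_gradient_norm(x):
    """Return (grad u / |grad u|, |grad u|) at x."""
    grad = gradient(arrival_time, x)
    norm = math.sqrt(sum(c * c for c in grad))
    return [c / norm for c in grad], norm

def imcf_residual(x, h=1e-4):
    """div(grad u / |grad u|) - |grad u| at x, by central differences."""
    divergence = 0.0
    for i in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[i] += h
        backward[i] -= h
        divergence += (unit_normal_and_gradient_norm(forward)[0][i]
                       - unit_normal_and_gradient_norm(backward)[0][i]) / (2 * h)
    return divergence - unit_normal_and_gradient_norm(x)[1]

if __name__ == "__main__":
    # Residuals should be near zero, up to finite-difference error.
    print(imcf_residual([1.0, 2.0, -0.5]))
    print(imcf_residual([0.3, -1.2, 0.7, 2.0]))
```

The same harness can be pointed at other candidate functions by replacing `arrival_time`; a positive residual corresponds to the subsolution inequality.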
We seek a weak solution to (4.2) with the boundary condition u = 0

at ∂M and u(∞) = ∞. A weak formulation of (4.2) must generalize it in
such a way that allows the possibility that ∇u = 0 somewhere (and also
allows less smoothness of u). This is a bit tricky because the equation does
not come from a variational problem. Nevertheless Huisken and Ilmanen
developed a weak formulation based on a minimization property as follows.
Definition 4.29. Let u, v be locally Lipschitz functions on a Riemannian
region (U, g). Given compact K ⊂ U, we define
\[ J_u^K(v) := \int_K \left(|\nabla v| + v|\nabla u|\right) d\mu. \]
We say that u (as above) is a weak solution of inverse mean curvature
flow (or IMCF) on U if for every compact K ⊂ U, and for every v as above
with u = v outside K, we have
\[ J_u^K(u) \le J_u^K(v). \]
First we verify that this indeed generalizes the concept of a smooth

IMCF.
Lemma 4.30. Given a smooth function u on a Riemannian region (U, g)
such that ∇u is nonvanishing everywhere, u solves equation (4.2) if and only
if u is a weak solution of IMCF.
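Proof. Assume that u is a weak solution such that u is smooth and ∇u
is nonvanishing. Let $\dot u$ be a smooth function supported in some compact
K ⊂ U, and consider the deformation $u + t\dot u$. By the minimization property,
we have
\begin{align*}
0 &= \left.\frac{d}{dt}\right|_{t=0} J_u^K(u + t\dot u) \\
&= \int_K \left(\frac{\langle\nabla u, \nabla\dot u\rangle}{|\nabla u|} + \dot u|\nabla u|\right) d\mu \\
&= \int_K \left(-\operatorname{div}_M\left(\frac{\nabla u}{|\nabla u|}\right) + |\nabla u|\right)\dot u\, d\mu.
\end{align*}
In order for this to vanish for all choices of $\dot u$, equation (4.2) must hold.
In order to see the reverse, assume u is a smooth solution of (4.2). Now
choose any locally Lipschitz function v that equals u outside some compact
set K, and compute
\begin{align*}
J_u^K(v) &= \int_K \left(|\nabla v| + (v - u)|\nabla u| + u|\nabla u|\right) d\mu \\
&= \int_K \left(|\nabla v| + (v - u)\operatorname{div}_M\left(\frac{\nabla u}{|\nabla u|}\right) + u|\nabla u|\right) d\mu \\
&= \int_K \left(|\nabla v| + \left\langle\nabla(u - v), \frac{\nabla u}{|\nabla u|}\right\rangle + u|\nabla u|\right) d\mu \\
&= \int_K \left(|\nabla v| + |\nabla u| - \left\langle\nabla v, \frac{\nabla u}{|\nabla u|}\right\rangle + u|\nabla u|\right) d\mu \\
&\ge J_u^K(u),
\end{align*}
where the last step uses the Cauchy-Schwarz inequality $\left\langle\nabla v, \frac{\nabla u}{|\nabla u|}\right\rangle \le |\nabla v|$.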

Remark 4.31. Lemma 4.30 should NOT be interpreted to mean that a
smooth evolution and a weak evolution of a given initial surface must agree
until the smooth solution becomes singular. (Indeed, this is false.) The weak
formulation is global in character, while the smooth formulation is not.
As equation (4.2) came from a parabolic flow, it is not elliptic of course.

It is instead “degenerate” elliptic. We will solve the equation weakly by a
process called elliptic regularization: we add a small term to make it elliptic,
and then we can solve the “regularized” equation. We then take the limit
of the “regularized” solutions to obtain our desired weak solution. The key
point about the chosen weak formulation of IMCF is that it is preserved
under this limiting process.
Theorem 4.32 (Weak Existence Theorem 3.1 of [HI01]). Let $(M^n, g)$ be
a complete asymptotically flat manifold with a distinguished end. Then for
any enclosed region $\Omega_0$ with $C^{1,1}$ boundary, there exists a locally Lipschitz
weak solution u to inverse mean curvature flow with initial condition $\Omega_0$.
Having initial condition $\Omega_0$ means that u is a weak solution to IMCF on
$M \setminus \Omega_0$, u = 0 at $\partial\Omega_0$, u < 0 in $\Omega_0$, u ≥ 0 on $M \setminus \Omega_0$, and u(x) → ∞ as
x → ∞.
Remark 4.33. The requirement that u(x) → ∞ as x → ∞ is important,
since it guarantees that each sublevel set Ωt := {u < t} is an enclosed region.
Without such a condition, we could have weak solutions that spontaneously
“jump to infinity.”
Note that this theorem works in any dimension. Moreover, it holds for
noncompact manifolds more general than asymptotically flat ones. All that
is really needed is the existence of a subsolution near infinity.
Sketch of the proof. Since the proof is quite involved and uses a fair bit
of analysis, we will only give a sketch, emphasizing the geometric content as
much as possible. As described above, we first attempt to solve a regularized
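equation. Recall from Exercise 4.28 that $v = \frac{1}{2}(n - 1)\log|x|$ is a subsolution
for large enough |x| in the asymptotic region. For large L, let $V_L$ be the
region bounded between $\partial\Omega_0$ and the coordinate sphere $v^{-1}(L)$. For each
$\epsilon > 0$, we consider the following Dirichlet problem on $V_L$:
\begin{align}
\operatorname{div}\left(\frac{\nabla u_{\epsilon,L}}{\sqrt{|\nabla u_{\epsilon,L}|^2 + \epsilon^2}}\right) &= \sqrt{|\nabla u_{\epsilon,L}|^2 + \epsilon^2}, \tag{4.3} \\
u_{\epsilon,L} &= 0 \quad \text{at } \partial\Omega_0, \tag{4.4} \\
u_{\epsilon,L} &= L - 2 \quad \text{at } v = L. \tag{4.5}
\end{align}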
The nonlinear elliptic equation (4.3) has a nice geometric interpretation.
Exercise 4.34. Given $u_{\epsilon,L}$ as above, define the function $U_{\epsilon,L} : V_L \times \mathbb{R} \longrightarrow \mathbb{R}$
by $U_{\epsilon,L}(x, z) := u_{\epsilon,L}(x) - \epsilon z$. Assuming that $u_{\epsilon,L}$ is $C^2$, verify that $U_{\epsilon,L}$ is
itself a solution of IMCF on the product space $V_L \times \mathbb{R}$ with metric $g + dz^2$.
Clearly, the level sets of $U_{\epsilon,L}$ are just translations of the graph of the
function $\epsilon^{-1}u_{\epsilon,L} : V_L \longrightarrow \mathbb{R}$. Together with the exercise above, we see
that $u_{\epsilon,L}$ solves equation (4.3) if and only if the graph of $\epsilon^{-1}u_{\epsilon,L}$ moves by
downward translation under smooth IMCF.
Figure 4.4. The graph of $z = \epsilon^{-1}u_{\epsilon,L}$ defines a downward translating
solution to IMCF in the product. As $\epsilon \to 0$ and L → ∞, these graphs
become purely vertical, giving us a weak solution to IMCF in the base M.
Given fixed L, one can prove that the above Dirichlet problem can be
solved for sufficiently small $\epsilon$. This is where the bulk of the work is, and it
is accomplished using some standard PDE techniques. The existence of a
subsolution v plays an essential role. (Keep in mind that the existence of
such a v is really a statement about the asymptotic behavior of (M, g).) For
details, see Huisken and Ilmanen’s paper [HI01, Section 3].
Hence, we can construct a sequence of solutions $u_i$ of the regularized
Dirichlet problems with $L_i \to \infty$ and $\epsilon_i \to 0$. These solutions come with
gradient estimates that allow us to apply Arzela-Ascoli to find a locally
Lipschitz limit function u such that $u_i \to u$ uniformly on compact subsets
of $M \setminus \Omega_0$. Clearly, we have the desired boundary conditions for u, that is,
u = 0 at $\partial\Omega_0$, u ≥ 0 on $M \setminus \Omega_0$, and u(x) → ∞ as x → ∞. (We can fill in
u arbitrarily to obtain u < 0 in $\Omega_0$.)
Next we recall that the functions $U_i(x, z) := u_i(x) - \epsilon_i z$ are smooth solutions
of IMCF and converge to U(x, z) := u(x) uniformly on compact sets.
(See Figure 4.4.) The critical observation is that being a weak solution of
IMCF is preserved in the limit (assuming uniformly locally bounded gradients).
Considering the minimization principle used to define the weak flow,
this is not too surprising. Finally, given that U is a weak solution of IMCF
on $(M \setminus \Omega_0) \times \mathbb{R}$, one can see that u is a weak solution on $M \setminus \Omega_0$.
Exercise 4.35. Prove the last assertion in the proof sketch above. That is,
given that U(x, z) = u(x), if U is a weak solution of IMCF on $(M \setminus \Omega_0) \times \mathbb{R}$,
then u is a weak solution of IMCF on $M \setminus \Omega_0$.
Definition 4.36. Given a function u, we define the following notation,
which we will use throughout this section:
\begin{align*}
\Omega_t &:= \{u < t\}, \\
\Omega_t^+ &:= \operatorname{Int}\{u \le t\}, \\
\Sigma_t &:= \partial\Omega_t, \\
\Sigma_t^+ &:= \partial\Omega_t^+.
\end{align*}
The minimization property of u can be recast in terms of the sublevel
sets of u as follows [HI01, Lemma 1.1]. Given a Riemannian region (U, g),
a compact set K ⊂ U, a locally Lipschitz function u, and an open set E of
locally finite perimeter, we define
\[ J_u^K(E) := |\partial^* E \cap K| - \int_{E \cap K} |\nabla u|\, d\mu. \]
Lemma 4.37. Let u be a weak solution of IMCF on (U, g). Then the sublevel
sets $\Omega_t$ minimize $J_u$ on U in the following sense. For any set F of locally
finite perimeter with the symmetric difference $\Omega_t \,\triangle\, F$ contained in some compact set K ⊂ U, we have
\[ J_u^K(\Omega_t) \le J_u^K(F). \]
The minimization property in Lemma 4.37 allows one to derive the fol-
lowing regularity theorem [HI01, Regularity Theorem 1.3]. We omit both
proofs.
Theorem 4.38 ($C^{1,\alpha}$ regularity of weak IMCF). Let n < 8, and let $(M^n, g)$,
$\Omega_0$, and u be as described in Theorem 4.32. Then $\Sigma_t$ and $\Sigma_t^+$ are $C^{1,\alpha}$ for
any $\alpha < \frac{1}{2}$. Moreover, for all t > 0, $\Sigma_s$ converges to $\Sigma_t$ as $s \to t^-$, and for
all t ≥ 0, $\Sigma_s$ converges to $\Sigma_t^+$ as $s \to t^+$, where the convergence is in the
$C^{1,\alpha}$ sense.
In particular, this level of regularity implies that $\partial^*\Omega_t = \partial\Omega_t$ and
$\partial^*\Omega_t^+ = \partial\Omega_t^+$, so we need not worry about reduced boundaries.
The minimization property in Lemma 4.37 also tells us that the sublevel
sets are minimizing hulls.
Corollary 4.39. Let n < 8, and let $(M^n, g)$, $\Omega_0$, and u be as described in
Theorem 4.32. Then:
(1) For t > 0, $\Omega_t$ is a minimizing hull.
(2) For t ≥ 0, $\Omega_t^+$ is a strictly minimizing hull.
(3) For t ≥ 0, $\Omega_t' = \Omega_t^+$, where $\Omega_t'$ denotes the strictly minimizing hull of $\Omega_t$.
(4) For t > 0, $|\partial\Omega_t| = |\partial\Omega_t^+|$. If $\Omega_0$ is a minimizing hull, then this also
holds for t = 0.
Proof. For item (1), let $\Omega_t \subset F$, where F has finite perimeter. We use the
minimization property from Lemma 4.37 to see that
\[ |\partial\Omega_t \cap K| - \int_{\Omega_t \cap K} |\nabla u|\, d\mu \le |\partial^* F \cap K| - \int_{F \cap K} |\nabla u|\, d\mu \]
for any suitably chosen K. Thus
\[ |\partial\Omega_t \cap K| + \int_{F \setminus \Omega_t} |\nabla u|\, d\mu \le |\partial^* F \cap K|. \]
The result follows.
For item (2), an approximation argument shows that for t ≥ 0, $\Omega_t^+$
also obeys the minimization property in Lemma 4.37. Given a competitor
$F \supset \Omega_t^+$, we use the same argument as for (1), but this time we suppose
that F has the same perimeter as $\Omega_t^+$. In this case we must have
\[ \int_{F \setminus \Omega_t^+} |\nabla u|\, d\mu = 0. \]
This means that ∇u = 0 a.e. on $F \setminus \Omega_t^+$. Without loss of generality, we can
take F to be open, and we can deduce that u is constant on $F \setminus \Omega_t^+$. But
by definition of $\Omega_t^+$, this would imply that $\Omega_t^+ = F$, and the result follows.
For item (4), since $\Omega_t$ is a minimizing hull, $|\partial\Omega_t| \le |\partial\Omega_t^+|$. But by the
$J_u$-minimizing property of $\Omega_t$ and the fact that |∇u| vanishes on $\Omega_t^+ \setminus \Omega_t$,
we also obtain the reverse inequality.
For item (3), we know from item (2) that $\Omega_t' \subset \Omega_t^+$. By the same
argument used to prove item (4), $|\partial\Omega_t'| = |\partial\Omega_t^+|$. But since $\Omega_t'$ is a strictly
minimizing hull, this is only possible if $\Omega_t' = \Omega_t^+$.
We can now describe a rough intuitive picture of how the weak IMCF
works: as long as Ωt remains a minimizing hull, it flows by “ordinary”
inverse mean curvature flow. (Here we say “ordinary” instead of “classical,”
since it is a flow of C 1,α surfaces, continuous in the C 1,α topology.) At any
moment when Ωt is about to cease being a minimizing hull, it “jumps” to its
strictly minimizing hull $\Omega_t^+$ and then continues to flow by ordinary inverse
mean curvature flow. However, this is not rigorous because there can be
countably many jump times. Moreover, the weak flow essentially tells us
how to start (or restart) the flow from any strictly minimizing hull (which
typically only has H ≥ 0 but not H > 0), which the classical flow does not
cover.
4.2.3. Geroch monotonicity of the weak flow. Observe that the in-
tuitive picture described above suggests that Geroch monotonicity ought
to hold for the weak flow. We know that Geroch monotonicity holds for
the classical flow, and the weak flow should behave much like the classical
flow, except for some jumps. But the effect of jumps on Hawking mass has
“good sign.” This is because the “jump” is a jump from $\Omega_t$ to its strictly
minimizing hull $\Omega_t^+$. The perimeter $|\Sigma_t|$ is continuous at a jump time by
Corollary 4.39, while $\int_{\Sigma_t} H^2$ can only jump down to $\int_{\Sigma_t^+} H^2$ since $\Sigma_t^+$ is
minimal where it disagrees with $\Sigma_t$. Therefore the Hawking mass should
only be able to jump up at a jump time.
Unfortunately, since that argument is not rigorous, a more involved ar-
gument is needed. First we establish the desired volume growth.
Lemma 4.40. Let $(M^n, g)$, $\Omega_0$, and u be as described in Theorem 4.32.
Then for t > 0, $e^{-t}|\partial\Omega_t|$ is constant.
Exercise 4.41 (Exponential Growth Lemma 1.6 of [HI01]). Prove Lemma
4.40. Hint: Use the minimization property from Lemma 4.37 to see that
$J_u^K(\Omega_t)$ is constant in t. Then use the co-area formula to find an integral
equation for $|\partial\Omega_t|$.
The following corollary is the critical place where the outward-minimizing

assumption in the Penrose inequality is used in the proof.
Corollary 4.42. Under the assumptions of Lemma 4.40, if we further assume
that $\Omega_0$ is a minimizing hull, then $|\partial\Omega_t| = e^t|\partial\Omega_0|$ for all t ≥ 0.
Proof. By Theorem 4.38, we know that $|\partial\Omega_0^+| = \lim_{t\to 0^+} |\partial\Omega_t|$. Together
with Lemma 4.40, this implies that $e^{-t}|\partial\Omega_t| = |\partial\Omega_0^+|$ for all t > 0. Finally,
if $\Omega_0$ is a minimizing hull, then Corollary 4.39 tells us that $|\partial\Omega_0^+| = |\partial\Omega_0|$.
Next we follow the Geroch monotonicity argument in the smooth case,

except we apply it to the regularized solutions used in the proof of Theo-
rem 4.32, as defined in (4.3). Recall that we used a sequence of regularized
solutions ui of (4.3) corresponding to a sequence of downward translating
solutions Ui of the classical IMCF. Also recall that after we pass to the limit
as i → ∞, the level sets $\Sigma_t^i = \{(x, z) \mid U_i(x, z) = t\}$ converge to cylindrical
sets $\tilde\Sigma_t := \Sigma_t \times \mathbb{R}$. Since we ultimately want to “divide out” by this vertical
effect, we consider a nonnegative vertical cutoff function φ(z) supported in
[1, 5] whose integral is 1.
Exercise 4.43. Let $U_i$ be the downward translating solution of IMCF described
in the proof of Theorem 4.32, and let $\Sigma_t^i := \{U_i(x, z) = t\}$. Follow
along with our earlier proof of Geroch monotonicity to show that for
$0 \le t \le L_i - 7$,
\begin{align*}
\frac{d}{dt}\int_{\Sigma_t^i} \phi H^2\, d\mu_t = \int_{\Sigma_t^i} \Big[ &\phi\left(-2H^{-2}|\nabla H|^2 - 2|A|^2 - 2\mathrm{Ric}(\nu, \nu) + H^2\right) \\
&+ \left\langle\nabla\phi, -2H^{-1}\nabla H + H\nu\right\rangle \Big]\, d\mu_t.
\end{align*}
The reason why we take $t \le L_i - 7$ is to make sure that φ vanishes near the
boundary of $\Sigma_t^i$.
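Integrating the result of the previous exercise, we have, for $0 \le r \le s \le L_i - 7$,
\begin{align*}
\int_{\Sigma_s^i} \phi H^2 = \int_{\Sigma_r^i} \phi H^2 + \int_r^s dt \int_{\Sigma_t^i} \Big[ &\phi\left(-2H^{-2}|\nabla H|^2 - 2|A|^2 - 2\mathrm{Ric}(\nu, \nu) + H^2\right) \\
&+ \left\langle\nabla\phi, -2H^{-1}\nabla H + H\nu\right\rangle \Big]\, d\mu_t.
\end{align*}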
A significant part of Huisken and Ilmanen’s paper is concerned with
taking the limit of the above equation as i → ∞. See Section 5 of [HI01]
for details. They show that at almost every time t, $\int_{\Sigma_t^i} \phi H^2$ converges to
$\int_{\Sigma_t} H^2$. Since $\Sigma_t$ is only $C^{1,\alpha}$, its mean curvature H is interpreted in a
standard weak sense, and it can be identified with |∇u|. Dealing with the
$\phi(2H^{-2}|\nabla H|^2 + 2|A|^2)$ term is trickier. In order for this to be sensible, they
observe that for a.e. t, H > 0 a.e. on $\Sigma_t$ (with respect to (n − 1)-dimensional
Hausdorff measure), and then show that $\int_{\Sigma_t^i} \phi H^{-2}|\nabla H|^2$ is lower semicontinuous
under convergence as i → ∞. They also use a weak formulation
of the second fundamental form A such that $\int_{\Sigma_t^i} \phi|A|^2$ is lower semicontinuous
under convergence as i → ∞. Using an approximation argument, they
explain why $\Sigma_t$ satisfies an appropriate Gauss-Bonnet Theorem using the
eigenvalues of A (defined a.e. on $\Sigma_t$).
Meanwhile, the φ(2Ric(ν, ν)) term causes little trouble. The terms in-
volving ∇φ will vanish in the limit. To see why this should happen, infor-
mally, ∇φ points in the vertical direction, whereas, in the limit, ∇H and ν
will become orthogonal to the vertical.
Putting all of these arguments together (along with many other details
being glossed over), one obtains, for 0 ≤ r < s,
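\[
\begin{aligned}
\int_{\Sigma_s} H^2 &\le \int_{\Sigma_r} H^2 + \int_r^s dt \int_{\Sigma_t} \big(-2H^{-2}|\nabla H|^2 - 2|A|^2 - 2\,\mathrm{Ric}(\nu,\nu) + H^2\big)\, d\mu_t \\
&= \int_{\Sigma_r} H^2 + \int_r^s dt \left[ 4\pi\chi(\Sigma_t) + \int_{\Sigma_t} \Big(-2H^{-2}|\nabla H|^2 - R - \tfrac12(\lambda_1 - \lambda_2)^2 - \tfrac12 H^2\Big)\, d\mu_t \right],
\end{aligned}
\tag{4.6}
\]
where the $dt$ integrand is sensible for a.e. $t$. (Compare this to our earlier
computation of $\frac{d}{dt}\int_{\Sigma_t} H^2\, d\mu_t$ for the case of smooth IMCF.)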
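The passage from the first line of (4.6) to the second can be unpacked pointwise. The following is a sketch, assuming the traced Gauss equation in the form $|A|^2 = R - 2K - 2\,\mathrm{Ric}(\nu,\nu) + H^2$ (with $K$ the Gauss curvature of $\Sigma_t$ and $\lambda_1, \lambda_2$ the principal curvatures) and the Gauss-Bonnet Theorem on $\Sigma_t$; the justification in the weak setting is the one cited from [HI01] and is not reproduced here.

```latex
% Pointwise identity on \Sigma_t, substituting 2 Ric(\nu,\nu) = R - 2K - |A|^2 + H^2:
\begin{align*}
-2|A|^2 - 2\,\mathrm{Ric}(\nu,\nu) + H^2
  &= -2|A|^2 - \big(R - 2K - |A|^2 + H^2\big) + H^2 \\
  &= 2K - R - |A|^2 \\
  &= 2K - R - \tfrac12(\lambda_1 - \lambda_2)^2 - \tfrac12 H^2 ,
\end{align*}
% where the last step uses
% |A|^2 = \lambda_1^2 + \lambda_2^2
%       = \tfrac12(\lambda_1 - \lambda_2)^2 + \tfrac12(\lambda_1 + \lambda_2)^2 .
% Integrating over \Sigma_t, Gauss-Bonnet converts the curvature term:
\[
\int_{\Sigma_t} 2K \, d\mu_t = 4\pi\chi(\Sigma_t).
\]
```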
Theorem 4.44 (Geroch monotonicity for weak IMCF). Let (M 3 , g), Ω0 ,
and u be as described in Theorem 4.32, and further assume that Ω0 is a
minimizing hull. Then for 0 ≤ r < s,
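\[
m_{\mathrm{Haw}}(\Sigma_s) \ge m_{\mathrm{Haw}}(\Sigma_r)
+ \frac{1}{(16\pi)^{3/2}} \int_r^s |\Sigma_t|^{1/2} \left[ 16\pi - 8\pi\chi(\Sigma_t)
+ \int_{\Sigma_t} \big( 2H^{-2}|\nabla H|^2 + R + (\lambda_1 - \lambda_2)^2 \big)\, d\mu_t \right] dt .
\]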
Exercise 4.45. Prove Theorem 4.44 by combining the previous inequality

with Corollary 4.42.
So we see that the Hawking mass is monotone as long as R ≥ 0 every-

where and χ(Σt ) ≤ 2 for all t. As long as Σt remains connected, the bound
χ(Σt ) ≤ 2 will hold.
This causes an immediate problem when Σ0 is itself disconnected, and
this is why the area A in the statement of Theorem 4.22 refers to the area
of a single component. Therefore we will restrict ourselves to the case where
the initial condition Σ0 is connected. Even still, we need to check that
Σt remains connected. This will be true if we have some control over the
topology of the ambient space. The following is a slight improvement of
Lemma 4.2 of [HI01].
Lemma 4.46. Let $(M^n, g)$, $\Omega_0$, and $u$ be as described in Theorem 4.32,
and further assume that $M \setminus \Omega_0$ has vanishing first Betti number and that
$\Sigma_0 = \partial\Omega_0$ is connected. Then $\Sigma_t$ is also connected for all $t$.
Proof. Theorem 4.38 says that if $t$ is a jump time, then $\Sigma_t$ can be
approximated by $\Sigma_{t'}$ for some nonjump time $t'$. So without loss of generality, let
us assume that $t$ is not a jump time, so that $\Sigma_t = \{u = t\}$.
We first show that $\{0 \le u \le t\} = \Omega_t \setminus \Omega_0$ is connected. Suppose it is
not. Then it must have a component $K$ that is disjoint from the connected
set $\partial\Omega_0$. Since $\partial K \subset \Sigma_t = \partial\Omega_t$, the interior of $K$, where $0 < u < t$, must be
nonempty. Since $K$ is compact, it follows that $u$ attains a local minimum
in the interior of $K$. Using the $J_u$-minimizing property of $u$, one can show
this is only possible if $u$ is constant on $K$, which is a contradiction. (Prove
this as an exercise.)
Similarly, we can show that $\{u \ge t\} = M \setminus \Omega_t$ is connected. Suppose
it is not. Since $u \to \infty$ at infinity, we know that exactly one component
of $M \setminus \Omega_t$ can be unbounded. So any other component $K$ is compact, and
since $\partial K \subset \Sigma_t^+ = \partial\Omega_t^+$, we can see that the interior of $K$, where $u > t$,
is nonempty. Therefore $u$ attains a local maximum in the interior of $K$,
and we obtain a contradiction as before.
Now consider the reduced Mayer-Vietoris exact sequence of the pair of
sets described above:
\[
H_1(M \setminus \Omega_0, \mathbb{Z}) \longrightarrow \hat H_0(\Sigma_t, \mathbb{Z}) \longrightarrow \hat H_0(M \setminus \Omega_t, \mathbb{Z}) \oplus \hat H_0(\Omega_t \setminus \Omega_0, \mathbb{Z}).
\]
Since we showed that $\Omega_t \setminus \Omega_0$ and $M \setminus \Omega_t$ are connected, the last space
vanishes, and thus $\hat H_0(\Sigma_t, \mathbb{Z})$ lies in the image of $H_1(M \setminus \Omega_0, \mathbb{Z})$. Since
$\hat H_0(\Sigma_t, \mathbb{Z})$ is torsion-free, the assumption that $H_1(M \setminus \Omega_0, \mathbb{R}) = 0$ is enough
to conclude that $\hat H_0(\Sigma_t, \mathbb{Z}) = 0$, and hence $\Sigma_t$ is connected.
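The final algebraic step of this proof can be spelled out as follows. This is a sketch: the symbol $k$ for the number of components of $\Sigma_t$ and the name $\partial$ for the connecting map are introduced here only for illustration, and the reduced homology group is written $\hat H_0$.

```latex
% Exactness at \hat H_0(\Sigma_t,\mathbb Z), with the right-hand group equal to 0:
\[
H_1(M\setminus\Omega_0,\mathbb Z) \xrightarrow{\ \partial\ } \hat H_0(\Sigma_t,\mathbb Z) \longrightarrow 0
\quad\Longrightarrow\quad \partial \text{ is surjective.}
\]
% Vanishing first Betti number means H_1(M\setminus\Omega_0,\mathbb Z)\otimes\mathbb R = 0,
% so every class a is torsion: N a = 0 for some integer N \ge 1.
% Then N\,\partial(a) = \partial(N a) = 0 in \hat H_0(\Sigma_t,\mathbb Z) \cong \mathbb Z^{k-1},
% which is torsion-free, hence \partial(a) = 0 for every a.
\[
\hat H_0(\Sigma_t,\mathbb Z) = \operatorname{im}\partial = 0
\quad\Longrightarrow\quad k = 1 .
\]
```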
Suppose that $M$ is a complete asymptotically flat manifold with an apparent
horizon boundary $\partial M$. As explained above, if $\partial M$ is not connected,
we cannot use it as an initial condition for IMCF if we want the Hawking
mass to remain monotone. Instead, suppose that $\Sigma_0$ is just one component
of $\partial M$, or more generally that it is a connected surface enclosing some
components of $\partial M$ (or possibly none) but not touching the others. We would
like to “modify” the weak IMCF starting at $\Sigma_0$ in such a way that it “jumps”
over the other components of $\partial M$ without touching them. To do this,
we arbitrarily fill in the components of $\partial M$ with regions $W_1, \ldots, W_\ell$ to create
a new space $\tilde M$. We consider IMCF in $\tilde M$ with initial surface $\Sigma_0 = \partial\Omega_0$, but
then we modify the flow as follows. If $\Omega_{t_1}$ is about to touch the fill-in region
$W = \bigcup_{i=1}^{\ell} W_i$, we “jump” to the component $F$ of the strictly minimizing
hull of $\Omega_{t_1} \cup W$ that contains $\Omega_{t_1}$. Then we restart the flow at $\Omega^+_{t_1} := F$.
(See [HI01, Section 6] for the details. In particular, one must check that $F$
is smooth enough to restart the flow.) Clearly, we will have to do this at
most $\ell$ times. Note that the minimizing hull property guarantees that
\[
|\partial\Omega_{t_1}| \le |\partial\Omega^+_{t_1}| .
\]
Moreover, since $\partial\Omega^+_{t_1}$ is minimal where it disagrees with $\partial\Omega_{t_1}$, we also have
\[
\int_{\partial\Omega_{t_1}} H^2 \ge \int_{\partial\Omega^+_{t_1}} H^2 .
\]
Combining these two facts implies that monotonicity holds at these jump
times, that is,
\[
m_{\mathrm{Haw}}(\partial\Omega_{t_1}) \le m_{\mathrm{Haw}}(\partial\Omega^+_{t_1}) .
\]
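The way the area and $\int H^2$ comparisons combine can be sketched as follows, using the Hawking mass in the form $m_{\mathrm{Haw}}(\Sigma) = \sqrt{|\Sigma|/16\pi}\,\big(1 - \frac{1}{16\pi}\int_\Sigma H^2\big)$. The sketch makes one assumption that is not stated in the passage above: that $\int_{\partial\Omega_{t_1}} H^2 \le 16\pi$, so that the bracketed factor is nonnegative and a larger area can only help.

```latex
\begin{align*}
m_{\mathrm{Haw}}(\partial\Omega^+_{t_1})
 &= \sqrt{\frac{|\partial\Omega^+_{t_1}|}{16\pi}}
    \left(1 - \frac{1}{16\pi}\int_{\partial\Omega^+_{t_1}} H^2\right) \\
 % the H^2 integral does not increase across the jump:
 &\ge \sqrt{\frac{|\partial\Omega^+_{t_1}|}{16\pi}}
    \left(1 - \frac{1}{16\pi}\int_{\partial\Omega_{t_1}} H^2\right) \\
 % the area does not decrease, and the bracket is assumed nonnegative:
 &\ge \sqrt{\frac{|\partial\Omega_{t_1}|}{16\pi}}
    \left(1 - \frac{1}{16\pi}\int_{\partial\Omega_{t_1}} H^2\right)
  = m_{\mathrm{Haw}}(\partial\Omega_{t_1}).
\end{align*}
```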
Putting this discussion together with Theorem 4.44 and Lemma 4.46,
we immediately obtain the following.
Corollary 4.47 (Geroch monotonicity for modified weak IMCF). Let
(M 3 , g) be a complete asymptotically flat manifold whose boundary consists
of an outward-minimizing connected surface Σ0 and possibly other com-
ponents which are minimal. Assume that b1 (M ) = 0. Arbitrarily fill in
the boundary ∂M (in a way that does not introduce any b1 ) and consider
the “modified” weak IMCF Ωt as described above, with initial condition
Σ0 = ∂Ω0 . Then for 0 ≤ r < s,
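\[
m_{\mathrm{Haw}}(\Sigma_s) \ge m_{\mathrm{Haw}}(\Sigma_r)
+ \frac{1}{(16\pi)^{3/2}} \int_r^s |\Sigma_t|^{1/2} \int_{\Sigma_t} \big( 2H^{-2}|\nabla H|^2 + R + (\lambda_1 - \lambda_2)^2 \big)\, d\mu_t \, dt .
\]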
In particular, if M has nonnegative scalar curvature, then
mHaw (Σs ) ≥ mHaw (Σr ).
4.2.4. The long-time limit of the flow. Now that we have established
monotonicity for the modified weak IMCF, the only thing left to do is verify
that
\[
\lim_{t \to \infty} m_{\mathrm{Haw}}(\Sigma_t) \le m_{\mathrm{ADM}}(M, g).
\]
Recall from Exercise 4.25 that the Hawking masses of the coordinate spheres
converge to the ADM mass. We will show that Σt becomes “rounder” as
t → ∞ in a sense that is strong enough to estimate the Hawking mass.
First we establish a uniqueness result for the space $\mathbb{R}^n \setminus \{0\}$ [HI01,
Proposition 7.2].
Lemma 4.48. Let $u$ be a solution of weak inverse mean curvature flow
on $\mathbb{R}^n \setminus \{0\}$ with the Euclidean metric, such that the level sets of $u$ are
compact. Then we must have $u = c + (n-1)\log|x|$ for some constant $c$.
This corresponds to $\Sigma_t$ being the sphere of radius $e^{(t-c)/(n-1)}$.
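It is a quick check that the function in Lemma 4.48 really is an IMCF solution. The following sketch assumes the smooth level-set formulation of IMCF, $\operatorname{div}\big(\nabla u / |\nabla u|\big) = |\nabla u|$, and writes $\rho(t)$ (a name introduced here for illustration) for the radius of $\Sigma_t$.

```latex
% u(x) = c + (n-1)\log|x| on \mathbb R^n \setminus \{0\}:
\[
\nabla u = (n-1)\frac{x}{|x|^2}, \qquad
|\nabla u| = \frac{n-1}{|x|}, \qquad
\operatorname{div}\!\left(\frac{\nabla u}{|\nabla u|}\right)
= \operatorname{div}\!\left(\frac{x}{|x|}\right) = \frac{n-1}{|x|} = |\nabla u| .
\]
% Level sets: u = t \iff |x| = \rho(t) := e^{(t-c)/(n-1)}.
% The sphere of radius \rho has mean curvature H = (n-1)/\rho, and
\[
\frac{d\rho}{dt} = \frac{\rho}{n-1} = \frac{1}{H},
\]
% which is the defining speed of inverse mean curvature flow.
```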
The importance of this lemma is that our IMCF on $(M, g)$ can be “blown-
down” to a solution on $\mathbb{R}^n \setminus \{0\}$, as follows. Suppose that $(M^n, g)$ and
$u$ are as described in Corollary 4.47. Then $u$ solves weak IMCF in the
asymptotically flat end, which we may take to be $\mathbb{R}^n \setminus \bar B_r$. Given $\lambda > 0$,
we consider $u^\lambda(x) := u(x/\lambda)$, which can easily be seen to solve weak IMCF
in $\mathbb{R}^n \setminus \bar B_{\lambda r}$ with the scaled metric $g^\lambda(x) = \lambda^2 g(x/\lambda)$. Using the gradient
bounds and the compactness theory from the proof of Theorem 4.32, one can
show that for any sequence of $\lambda$’s approaching zero, there is a subsequence
$\lambda_i$ and constants $c_{\lambda_i}$ such that $u^{\lambda_i} - c_{\lambda_i}$ converges locally uniformly to some
$v$ solving IMCF on $\mathbb{R}^n \setminus \{0\}$ with the Euclidean metric.
Proof. We define the eccentricity of a subset $\Sigma$ of $\mathbb{R}^n \setminus \{0\}$ to be
\[
\theta(\Sigma) = \frac{\sup_{x \in \Sigma} |x|}{\inf_{x \in \Sigma} |x|} .
\]
Let u be as described in the statement of the lemma, and consider its cor-
responding Σt . Weak IMCF obeys a comparison principle, meaning that if
one flow is contained in another at time t, then this continues to hold for
all later t, assuming that the sublevel sets are enclosed regions. (We omit
the proof. See [HI01, Uniqueness Theorem 2.2].) Let S1 be the coordinate
sphere inscribing Σt , and let S2 be the coordinate sphere circumscribing
$\Sigma_t$. By evolving all three by the weak IMCF, the comparison principle tells
us that for $\tau > 0$, $\Sigma_{t+\tau}$ is sandwiched between $e^{\tau/(n-1)} S_1$ and $e^{\tau/(n-1)} S_2$.
Therefore $\theta(\Sigma_t)$ is nonincreasing in $t$.
Moreover, we claim that if Σt is not a coordinate sphere, then θ(Σt ) must
strictly decrease. This is due to a strong maximum principle for smooth
IMCF, which says that if one initial surface is enclosed by a distinct sur-
face, then IMCF must immediately force them to become disjoint. Thus,
if $\Sigma_t$ is smooth, then for $\tau > 0$, $\Sigma_{t+\tau}$ is sandwiched between $e^{\tau/(n-1)} S_1$
and $e^{\tau/(n-1)} S_2$ but not touching them, and thus $\theta(\Sigma_{t+\tau}) < \theta(\Sigma_t)$. If $\Sigma_t$ is
not smooth, we can still obtain the same conclusion by comparing to what
happens to a smooth surface lying between Σt and S2 .
On the space $\mathbb{R}^n \setminus \{0\}$, it makes sense to blow up (rather than blow
down, as described above) the solution $u$ to obtain a new solution $\tilde u$. Then
we can see that for any time $\tau$,
\[
\theta(\tilde\Sigma_\tau) = \lim_{t \to 0} \theta(\Sigma_t).
\]
In particular, $\theta(\tilde\Sigma_\tau)$ is constant. But according to the claim above, this
implies that $\tilde\Sigma_\tau$ is a sphere, and thus $\lim_{t \to 0} \theta(\Sigma_t) = 1$. But since $\theta(\Sigma_t)$ is
nonincreasing, this is only possible if $\theta(\Sigma_t)$ is identically 1, that is, $\Sigma_t$ is the
coordinate sphere for all $t$. The result follows.
Lemma 4.49 (Blowdown Lemma 7.1 of [HI01]). Let $u$ be a weak solution
of IMCF on some asymptotically flat exterior region $(\mathbb{R}^n \setminus \bar B, g)$, such that $u$
has precompact sublevel sets for large $t$. Then there exist constants $c_\lambda \to \infty$
as $\lambda \to 0$ such that $u^\lambda - c_\lambda \to (n-1)\log|x|$ locally uniformly on $\mathbb{R}^n \setminus \{0\}$
as $\lambda \to 0$.
From Lemma 4.48 and the discussion immediately following it, we see
that the desired result will follow as long as we can establish that the blow-
down limit has compact level sets. We will prove this by bounding the
eccentricity of Σt for large t. The gradient bounds that were used in the
proof of Theorem 4.32 tell us that
\[
|\nabla u| = O(|x|^{-1}).
\]
The comparison principle described in the proof of Lemma 4.48 also works
when a weak solution of IMCF is contained in a subsolution of IMCF (that
is, a family of surfaces moving “at least as fast” as IMCF). Recall from
Exercise 4.28 that $\tfrac12 (n-1)\log|x|$ is a subsolution for large enough $|x|$.
Exercise 4.50. Prove that for large t, θ(Σt ) is bounded independent of t.

This can be proved using the subsolution comparison principle described
above, together with the gradient bound described above.
Exercise 4.51. Complete the proof of Lemma 4.49 by showing that the
bound on θ(Σt ) implies that the blow-down limit must have compact level
sets.
As a corollary of Lemma 4.49, it follows that as $t \to \infty$, the rescaled
level set $\frac{1}{r(t)}\Sigma_t$ converges to the unit sphere in $\mathbb{R}^3$, where $r(t) := \sqrt{|\Sigma_t|/4\pi}$.
In particular, note that $\Sigma_t$ must be a topological sphere for large $t$. Using
this, we can prove the following.
Proposition 4.52 (Asymptotic Comparison Lemma 7.4 of [HI01]). Let
$u$ be a weak solution of IMCF on some asymptotically flat exterior region
$(\mathbb{R}^3 \setminus \bar B, g)$ with asymptotic decay rate at least 1 (as in Definition 3.5). Then
\[
\lim_{t \to \infty} m_{\mathrm{Haw}}(\Sigma_t) \le m_{\mathrm{ADM}}(M, g).
\]
Proof. We present a mild simplification of the proof in [HI01]. We know

that for any surface in M ,
0 ≤ 2|A|2 − H 2 = 2(R − 2K − 2Ric(ν, ν) + H 2 ) − H 2 ,
where the traced Gauss equation (Corollary 2.7) was used for the equality.
Thus
−H 2 ≤ −4K − 4Ric(ν, ν) + 2R.
As mentioned earlier, we know that Σt is a sphere for large t. Combining
the previous inequality with the Gauss-Bonnet Theorem, we can estimate
the Hawking mass as follows:
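\[
\begin{aligned}
m_{\mathrm{Haw}}(\Sigma_t) &= \sqrt{\frac{|\Sigma_t|}{16\pi}} \left( 1 - \frac{1}{16\pi} \int_{\Sigma_t} H^2 \, d\mu_t \right) \\
&\le \sqrt{\frac{|\Sigma_t|}{(16\pi)^3}} \left( 16\pi + \int_{\Sigma_t} \big(-4K - 4\,\mathrm{Ric}(\nu,\nu) + 2R\big)\, d\mu_t \right) \\
&= \sqrt{\frac{|\Sigma_t|}{(16\pi)^3}} \int_{\Sigma_t} -4\Big( \mathrm{Ric}(\nu,\nu) - \tfrac12 R \Big)\, d\mu_t \\
&= -\frac{1}{8\pi} \int_{\Sigma_t} G(r(t)\nu, \nu)\, d\mu_t ,
\end{aligned}
\]
where $G$ is the Einstein tensor and $r(t) := \sqrt{|\Sigma_t|/4\pi}$. Note that this is
almost the same expression for the ADM mass as in Theorem 3.14, except
that we have $r(t)\nu$ in place of $X = x^i \partial_i$.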
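The constants in the last two steps of this estimate can be checked directly. The sketch below uses only $|\Sigma_t| = 4\pi r(t)^2$, the identity $G(\nu,\nu) = \mathrm{Ric}(\nu,\nu) - \tfrac12 R$ for a unit normal $\nu$, and the Gauss-Bonnet Theorem on a topological sphere.

```latex
% Gauss-Bonnet on a sphere cancels the 16\pi term:
\[
\int_{\Sigma_t} 4K \, d\mu_t = 4 \cdot 2\pi\chi(S^2) = 16\pi .
\]
% Prefactor in terms of the area radius r(t):
\[
|\Sigma_t| = 4\pi r(t)^2
\quad\Longrightarrow\quad
\sqrt{\frac{|\Sigma_t|}{(16\pi)^3}}
= \frac{2\sqrt{\pi}\, r(t)}{64\,\pi^{3/2}} = \frac{r(t)}{32\pi} .
\]
% Hence
\[
\sqrt{\frac{|\Sigma_t|}{(16\pi)^3}} \int_{\Sigma_t} -4\Big(\mathrm{Ric}(\nu,\nu) - \tfrac12 R\Big)\, d\mu_t
= -\frac{r(t)}{8\pi}\int_{\Sigma_t} G(\nu,\nu)\, d\mu_t
= -\frac{1}{8\pi}\int_{\Sigma_t} G(r(t)\nu,\nu)\, d\mu_t .
\]
```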
We now observe that Lemma 4.49 implies that the rescaled level set
$\frac{1}{r(t)}\Sigma_t$ of area $4\pi$ must converge to the unit sphere in $\mathbb{R}^3$ in $C^1$. The $C^0$
convergence tells us that $\frac{x}{r(t)} \to \frac{x}{|x|}$, while the $C^1$ convergence also tells us
that $\nu \to \frac{X}{|x|}$. Hence $X - r(t)\nu = o(r(t))$. Since the asymptotic decay rate
is at least 1, we have $G_{ij} = O(|x|^{-2})$. Putting all of this together, we have
\[
m_{\mathrm{Haw}}(\Sigma_t) \le -\frac{1}{8\pi} \int_{\Sigma_t} G(X, \nu)\, d\mu_t + o(r(t)),
\]
and now the result follows from Theorem 3.14 (and the remark following
it).
4.2.5. Summary of the argument. We finally obtain an inequality be-

tween Hawking mass and ADM mass.
Theorem 4.53 (Huisken-Ilmanen [HI01]). Let (M 3 , g) be a complete

asymptotically flat manifold with asymptotic decay rate at least 1 (as in Def-
inition 3.5), whose boundary consists of an outward-minimizing connected
surface Σ and possibly other components which are minimal. Assume that
b1 (M ) = 0 and Rg ≥ 0. Then
mHaw (Σ) ≤ mADM (g).
Proof. We construct M̃ and a modified weak IMCF Σt with initial con-

dition Σ, as described earlier (making sure to keep b1 (M̃ ) = 0), using our
existence theorem (Theorem 4.32). Our hypotheses allow us to apply Ge-
roch monotonicity in the form of Corollary 4.47 to see that mHaw (Σt ) is
monotone nondecreasing. Corollary 4.47 in turn relies on Theorem 4.44
and monotonicity at the times when the modified weak IMCF jumps over
the minimal components of ∂M . Note that the H1 (M, Z) = 0 hypothesis
guarantees that connectedness is preserved, which we need for monotonicity,
and that the outward-minimizing assumption is used to guarantee that the
monotonicity extends all the way to t = 0. (Recall that Theorem 4.44 was
proved using the regularized solutions that were used to construct the weak
flow in Theorem 4.32. Note that Theorem 4.32 and Theorem 4.44 are where
the bulk of the technical work comes in, and consequently we skipped over
most of the details in these two proofs.)
Eventually, this flow exhausts all of M̃ in the long-time limit. That
is, the level sets of the arrival time function u are compact, as guaranteed
by Theorem 4.32. Combining the monotonicity of the Hawking mass with
Proposition 4.52, we have
\[
m_{\mathrm{Haw}}(\Sigma) \le \lim_{t \to \infty} m_{\mathrm{Haw}}(\Sigma_t) \le m_{\mathrm{ADM}}(g),
\]
where the second inequality essentially follows from the fact that Σt becomes
“more round” in some sense as t → ∞, which we proved using a blow-down
argument.
We now give the proof of Huisken-Ilmanen’s Penrose inequality.
Proof of Theorem 4.22. Assume the hypotheses of Theorem 4.22 and
without loss of generality, assume $\partial M$ is an apparent horizon. Let $\Sigma$ be
one component of $\partial M$. Since $\Sigma$ is minimal, $m_{\mathrm{Haw}}(\Sigma) = \sqrt{\frac{|\Sigma|}{16\pi}}$, so the Penrose
inequality would follow immediately from Theorem 4.53 if we knew that
$H_1(M, \mathbb{Z}) = 0$. We would like to prove that we can make this assumption
without loss of generality.
To do this, we can remove all closed minimal surfaces (including immersed
ones) and then take the metric completion of whatever is left after
doing so. One can show that the result of this is a new space M + whose
boundary is still an apparent horizon (made up of the original ∂M together
with some new components). This can be proved using an argument similar
to the one used to prove Theorem 4.7. (See [HI01, Lemma 4.1(i)].) Since
M + is now free of immersed minimal surfaces, we can apply Theorem 4.11
to see that M + is diffeomorphic to $\mathbb{R}^3$ minus a finite number of balls. In
particular, b1 (M +) = 0.
We describe an alternative, more direct, approach. For now suppose
we remove all orientable (and hence two-sided) closed embedded minimal
surfaces. (Note that we are only removing nonenclosing ones, since the out-
ermost property of ∂M already guarantees that there are no enclosing ones.)
This results in a space M + that is free of closed embedded minimal surfaces
(but not necessarily immersed ones) in its interior. We can directly prove
that this space has vanishing first homology. Let Ω be a region enclosed
by a large coordinate sphere S = ∂Ω in the asymptotically flat region of
M +. Let K be the compact part of M enclosed by S. Let N = Ω, thought
of as a compact manifold with boundary ∂N = S ∪ ∂M . Recall that by
Poincaré-Lefschetz duality, $H_1(N, \mathbb{Z}) \cong H_2(N, \partial N, \mathbb{Z})$, and that the exact
sequence of the pair (N, ∂N ) gives
\[
H_2(N, \mathbb{Z}) \longrightarrow H_2(N, \partial N, \mathbb{Z}) \longrightarrow H_1(\partial N, \mathbb{Z}).
\]
By Theorem 7.43, ∂N is a union of spheres, so H1 (∂N, Z) = 0. Meanwhile,
positive mean curvature of S allows us to minimize the area of any class in
H2 (N, Z) (as in Theorem 2.22) to represent it as an integral sum of smooth
closed, oriented minimal surfaces. Since the only such minimal surfaces in
M + lie on ∂M , it follows that every class in H2 (N, Z) has vanishing image in
H2 (N, ∂N, Z), so the exact sequence implies that H2 (N, ∂N, Z) = 0. Thus
H1 (N, Z) and consequently H1 (M̃ , Z) both vanish. Finally, observe that it is
not really necessary to remove all of the orientable closed minimal surfaces;
we really only need to inductively remove enough of them (finitely many)
to kill the generators of H1 (M, R).
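The homological bookkeeping in this argument can be summarized as follows. This is a sketch; the name $j_*$ for the map induced by the inclusion of pairs is introduced here only for illustration.

```latex
% Exactness of the pair sequence, with H_1(\partial N,\mathbb Z) = 0:
\[
H_2(N,\mathbb Z) \xrightarrow{\ j_*\ } H_2(N,\partial N,\mathbb Z) \longrightarrow H_1(\partial N,\mathbb Z) = 0
\quad\Longrightarrow\quad j_* \text{ is surjective.}
\]
% Every class in H_2(N,\mathbb Z) is represented by minimal surfaces lying in
% \partial M \subset \partial N, and cycles supported in \partial N are trivial
% relative to \partial N, so j_* = 0. Therefore
\[
H_2(N,\partial N,\mathbb Z) = \operatorname{im} j_* = 0,
\qquad
H_1(N,\mathbb Z) \cong H_2(N,\partial N,\mathbb Z) = 0 .
\]
```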
Next we consider the case of equality. In this case the monotonicity
inequality (Theorem 4.44) becomes an equality, and thus, for a.e. t,

\[
\int_{\Sigma_t} \big( 2H^{-2}|\nabla H|^2 + R + (\lambda_1 - \lambda_2)^2 \big)\, d\mu_t = 0. \tag{4.7}
\]
By the semicontinuity from Theorem 4.38, this actually holds for all t. In
particular, we see that ∇H must vanish a.e. in Σt for all t. By the regularity
from Theorem 4.38, this implies that H is constant on Σt . Together with
the regularity from Theorem 4.38, one can then use the elliptic theory for
the constant mean curvature equation to show that Σt is actually smooth.
(We omit the proof.) The same reasoning can be used to prove that $\Sigma_t^+$ is
also smooth and has constant mean curvature. Recall that $\Sigma_t^+$ must have
$H = 0$ wherever it disagrees with $\Sigma_t$. Therefore in this case, we must have
$\Sigma_t^+ = \Sigma_t$, or else $\Sigma_t^+$ would be minimal, violating the outermost property of
the apparent horizon ∂M . In particular, this means that the filled-in region
W is empty, and thus ∂M is connected and equal to Σ0 , which we know
must be a sphere by Corollary 4.10. The absence of “jumps” also allows us
to use the IMCF comparison principle to show that if one chooses an initial
time t0 > 0, then Σt must agree with the smooth evolution of Σt0 under
classical IMCF for t slightly larger than t0 (which exists by parabolicity
since Σt0 has positive mean curvature). In other words, Σt satisfies classical
IMCF for t > 0.
This allows us to view our inverse mean curvature flow Σt as defining a
diffeomorphism from [0, ∞) × S 2 to M . Since the speed of the flow is H −1 ,
the metric g pulled back to [0, ∞) × S 2 can be expressed as
\[
g = H^{-2}\, dt^2 + h_t ,
\]
where ht is just the induced metric on Σt , pulled back to S 2 . (Here and
below, we engage in some harmless abuse of notation, identifying quantities
with their pullbacks.) By (2.7), we know that $\frac{\partial}{\partial t} h_t = \frac{2}{H} A$. By (4.7), we
know that $\Sigma_t$ is totally umbilic (that is, $\lambda_1 = \lambda_2$ everywhere) for all $t$, that
is, $A = \frac{H}{2} h_t$. Thus $\frac{\partial}{\partial t} h_t = h_t$, and consequently $h_t = e^t h_0$. Meanwhile, from
Exercise 2.13, we know that
\[
\frac{\partial}{\partial t} H = -\Delta_{\Sigma_t} H^{-1} - \big(|A|^2 + \mathrm{Ric}(\nu,\nu)\big) H^{-1} .
\]
Since we already established that H and A are constant on each Σt , the same
must be true for Ric(ν, ν). Since we also know that R = 0 from (4.7), the
traced Gauss equation implies that Σt also has constant Gauss curvature K.
Thus
\[
g = H^{-2}\, dt^2 + e^t h_0 ,
\]
where (∂M, h0 ) is a minimal round sphere in (M, g). Since R = 0 every-
where, we can now use Exercise 3.2 to obtain the desired result that (M, g)
is half of a Schwarzschild space.
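The claims that $\mathrm{Ric}(\nu,\nu)$ and $K$ are constant on each $\Sigma_t$ can be made explicit. The following sketch assumes what the argument has already established: $H = H(t)$ is constant on $\Sigma_t$ (so $\Delta_{\Sigma_t} H^{-1} = 0$), $\Sigma_t$ is totally umbilic (so $|A|^2 = \tfrac12 H^2$), and $R = 0$; the traced Gauss equation is used in the form $|A|^2 = R - 2K - 2\,\mathrm{Ric}(\nu,\nu) + H^2$.

```latex
% Evolution of H with \Delta_{\Sigma_t} H^{-1} = 0 and |A|^2 = H^2/2:
\[
\frac{dH}{dt} = -\Big(\tfrac12 H^2 + \mathrm{Ric}(\nu,\nu)\Big) H^{-1}
\quad\Longrightarrow\quad
\mathrm{Ric}(\nu,\nu) = -H\,\frac{dH}{dt} - \tfrac12 H^2 ,
\]
% and the right-hand side depends only on t.
% Traced Gauss equation with R = 0 and |A|^2 = H^2/2:
\[
2K = R - 2\,\mathrm{Ric}(\nu,\nu) - |A|^2 + H^2
   = -2\,\mathrm{Ric}(\nu,\nu) + \tfrac12 H^2 ,
\]
% so K is also a function of t alone.
```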
Finally, we note that the inverse mean curvature flow argument can be
used to prove the positive mass theorem in three dimensions. Suppose (M, g)
contains a closed minimal surface. Then we can create a new space M + as
in the proof of Theorem 4.22 that contains no closed minimal surfaces in its
interior, but does have a nontrivial minimal boundary. Then Theorem 4.53
immediately implies that mADM (g) > 0. Now suppose that (M, g) does not
contain any closed minimal surfaces. Then for any point p ∈ M , we can
construct a solution $u$ of weak IMCF on $M \setminus \{p\}$ such that $\lim_{x \to p} u(x) = -\infty$,
$\lim_{x \to \infty} u(x) = \infty$, and $m_{\mathrm{Haw}}(\Sigma_t) \ge 0$ for all $t$. The basic idea behind
the construction is the following (as explained in [HI01, Section 8]). For
each $\epsilon > 0$, there exists a solution $u_\epsilon$ to weak IMCF with initial condition
$\partial B_\epsilon(p)$ by Theorem 4.32. Again using gradient estimates and compactness of
solutions, one can extract a subsequence $\epsilon_i$ and find a sequence of real numbers
$c_i \to \infty$ such that $u_{\epsilon_i} - c_i$ converges to a weak solution $u$ on $M \setminus \{p\}$.
Eccentricity estimates can be used to show that $u$ has nonempty, compact
level sets for all $t$. Finally, since $m_{\mathrm{Haw}}(\partial B_{\epsilon_i}(p)) \to 0$, Geroch monotonicity
and Proposition 4.52 prove that $m_{\mathrm{ADM}}(g) \ge \lim_{t \to 0^+} m_{\mathrm{Haw}}(\Sigma_t) \ge 0$.
4.3. Bray’s conformal flow

4.3.1. Definition and construction of the flow. H. Bray was able to
generalize Huisken-Ilmanen’s Penrose inequality (Theorem 4.22) in the sense
that he was able to replace the area A of the largest component of ∂M by
the total area of ∂M [Bra01]. This proof was based on a novel flow that he
called conformal flow. This proof was later extended to establish the Penrose
conjecture in dimensions less than 8 [BL09]. The goal of this section is to
prove this theorem.
Theorem 4.54 (Riemannian Penrose inequality in dimensions less than
8). Let n < 8, and let (M n , g) be a complete one-ended asymptotically flat
manifold with boundary, with nonnegative scalar curvature. Assume that
∂M is an apparent horizon. Then

\[
m_{\mathrm{ADM}}(M, g) \ge \frac12 \left( \frac{|\partial M|}{\omega_{n-1}} \right)^{\frac{n-2}{n-1}} .
\]
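As a consistency check, the $n = 3$ case of this inequality has the same form $\sqrt{|\Sigma|/16\pi}$ as the Hawking mass of a minimal surface used in the proof of Theorem 4.22. The sketch assumes that $\omega_{n-1}$ denotes the volume of the unit $(n-1)$-sphere, which this section does not define explicitly.

```latex
% n = 3: \omega_2 = 4\pi is the area of the unit 2-sphere, and (n-2)/(n-1) = 1/2.
\[
\frac12\left(\frac{|\partial M|}{\omega_{2}}\right)^{\frac12}
= \frac12\sqrt{\frac{|\partial M|}{4\pi}}
= \sqrt{\frac{|\partial M|}{16\pi}} .
\]
```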
The proof relies on the positive mass theorem as an ingredient. The

n < 8 restriction is because the proof also makes direct use of regularity of
minimal hypersurfaces. Unfortunately, knowing the positive mass theorem
in all dimensions does not immediately imply that the above result extends
to all dimensions.
The main tool used in the proof is a flow that we will call the Bray flow
(which he called the conformal flow). Given a complete one-ended manifold
M with boundary, the flow evolves both an enclosed region Ωt and a metric
gt . The flow evolves according to two principles:
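• $\frac{d}{dt} g_t = \frac{4}{n-2}\, \nu_t\, g_t$, where $\nu_t$ is the function on $M$ such that $\nu_t(x) = 0$
on $\Omega_t$, and outside $\Omega_t$, $\nu_t$ is the unique solution to the Dirichlet
problem
\[
\begin{cases}
\Delta_{g_t} \nu_t(x) = 0 & \text{on } M \setminus \Omega_t, \\
\nu_t(x) = 0 & \text{at } \partial\Omega_t, \\
\lim_{x \to \infty} \nu_t(x) = -1.
\end{cases}
\]
• $\Omega_t$ is the strictly minimizing hull of $\bigcup_{s<t} \Omega_s$ in $(M, g_t)$.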
Individually, the two principles are straightforward, with the complication
being that these two rules are coupled to each other. Note that the metric
$g_t$ is always conformal to the initial metric $g_0$. (This is why Bray originally
called it conformal flow.) Explicitly, if we set $g_t = u_t^{\frac{4}{n-2}} g_0$, then
\[
\frac{d}{dt} u_t = v_t ,
\]
where $v_t(x) = 0$ in $\Omega_t$, and outside $\Omega_t$, $v_t$ is the unique solution to the
Dirichlet problem
\[
\begin{cases}
\Delta_{g_0} v_t(x) = 0 & \text{on } M \setminus \Omega_t, \\
v_t(x) = 0 & \text{at } \partial\Omega_t, \\
\lim_{x \to \infty} v_t(x) = -e^{-t}.
\end{cases}
\]
This is an equivalent formulation of the Bray flow.

Exercise 4.55. Let φ be a conformal factor relating the metrics g1 and g2
on $M^n$. That is, $\phi$ is a smooth positive function such that
\[
g_2 = \phi^{\frac{4}{n-2}} g_1 .
\]
Prove that for any smooth function $f$,
\[
\Delta_{g_1}(f\phi) = \phi^{\frac{n+2}{n-2}} \Delta_{g_2} f + f \Delta_{g_1} \phi .
\]
Use this formula to verify that the alternative formulation of the Bray flow
is indeed equivalent to the original.
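The outline of this verification can be sketched as follows. It is only a sketch: it takes the formula of Exercise 4.55 as given, assumes $u_0 = 1$ and that $u_t$ is $g_0$-harmonic outside $\Omega_t$, and exchanges the limit $x \to \infty$ with the time integral without justification.

```latex
% With g_t = u_t^{4/(n-2)} g_0 and \tfrac{d}{dt} u_t = v_t:
\[
\frac{d}{dt} g_t = \frac{4}{n-2}\, u_t^{\frac{4}{n-2}-1}\, v_t\, g_0
= \frac{4}{n-2}\,\frac{v_t}{u_t}\, g_t
\quad\Longrightarrow\quad \nu_t = \frac{v_t}{u_t} .
\]
% Exercise 4.55 with g_1 = g_0, g_2 = g_t, \phi = u_t, f = \nu_t, and \Delta_{g_0} u_t = 0:
\[
\Delta_{g_0} v_t = \Delta_{g_0}(\nu_t u_t) = u_t^{\frac{n+2}{n-2}}\, \Delta_{g_t} \nu_t ,
\]
% so v_t is g_0-harmonic exactly when \nu_t is g_t-harmonic; both vanish on \partial\Omega_t.
% Behavior at infinity:
\[
\lim_{x\to\infty} u_t = 1 - \int_0^t e^{-s}\, ds = e^{-t},
\qquad
\lim_{x\to\infty} \nu_t = \frac{-e^{-t}}{e^{-t}} = -1 .
\]
```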
Although we might hope that Ωt evolves smoothly in t most of the time,

we expect that there will be times when it “jumps.” Indeed, this is clearly
necessary in cases where Ωt must undergo a change in topology. At one of
these jump times, the functions νt and vt above will not be continuous in t.
That is, we do not expect gt to be differentiable but perhaps only Lipschitz.
Because of this, we have to be a bit more careful in how we define the Bray
flow.
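Definition 4.56. Let $M$ be a one-ended manifold. Given an increasing
family $\Omega_t$ of enclosed regions in $M$, we define the following notation:
\[
\begin{aligned}
\Omega_t^+ &:= \bigcap_{s>t} \Omega_s , \\
\Omega_t^- &:= \bigcup_{s<t} \Omega_s , \\
\Sigma_t &:= \partial\Omega_t , \\
\Sigma_t^\pm &:= \partial\Omega_t^\pm .
\end{aligned}
\]
Each time $t$ such that $\Sigma_t^- \ne \Sigma_t^+$ is called a jump time.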
We say that $(M, g_t, \Omega_t)$ is a Bray flow (or conformal flow) on the interval
$[0, T)$ if $g_t$ is a family of metrics on $M$ such that $g_t(x)$ is Lipschitz in $t$ and
$C^1$ in $x$, and smooth in $x$ away from $\Omega_t$, and the following conditions hold
for all $t \in (0, T)$:
• $g_t = u_t^{\frac{4}{n-2}} g_0$, where
\[
u_t = 1 + \int_0^t v_s \, ds ,
\]
and $v_t$ is the function on $M$ such that $v_t(x) = 0$ in $\Omega_t$, and outside
$\Omega_t$, $v_t$ is the unique solution to the Dirichlet problem
\[
\begin{cases}
\Delta_{g_0} v_t(x) = 0 & \text{on } M \setminus \Omega_t, \\
v_t(x) = 0 & \text{at } \Sigma_t, \\
\lim_{x \to \infty} v_t(x) = -e^{-t}.
\end{cases}
\]
• $\Omega_t$ is the strictly minimizing hull of $\Omega_0$ in $(M, g_t)$. (In particular,
$\Omega_t$ is an increasing family.)
Note that our formulation of the second condition has changed, but it is
essentially equivalent to the previous version.
Lemma 4.57. The condition that $(M, g_t)$ is a complete asymptotically flat
manifold is preserved under the Bray flow, as is nonnegativity of scalar
curvature in $(M \setminus \Omega_t, g_t)$.
Proof. Let us assume that the conditions hold at t = 0 and prove that they
continue to hold at any time t > 0. Since vs is g0 -harmonic away from Ωt for
all s < t, it follows that ut is g0 -harmonic away from Ωt . We also see that
$u_t$ is asymptotic to a constant at infinity. By Theorem A.38 and Exercise
3.13, we see that $g_t = u_t^{\frac{4}{n-2}} g_0$ is asymptotically flat. Moreover, since $g_0$ has
nonnegative scalar curvature outside Ωt , we can use (1.6) to see that gt also
has nonnegative scalar curvature outside Ωt .
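The scalar curvature step rests on the standard transformation law for scalar curvature under a conformal change, which is presumably what (1.6) records; that equation is not reproduced in this section, so the formula below is stated as an assumption about its content.

```latex
% Conformal change \tilde g = u^{4/(n-2)} g with u > 0 on an n-manifold:
\[
R_{\tilde g} = u^{-\frac{n+2}{n-2}} \left( -\frac{4(n-1)}{n-2}\, \Delta_g u + R_g\, u \right) .
\]
% Apply this with g = g_0 and u = u_t, using \Delta_{g_0} u_t = 0 outside \Omega_t:
\[
R_{g_t} = u_t^{-\frac{4}{n-2}}\, R_{g_0} \ge 0
\quad\text{on } M \setminus \Omega_t .
\]
```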
It is also the case that Σt being minimal in (M, gt ) should also be pre-
served under the flow. Indeed, we are only interested in the case when Σt is
minimal in (M, gt ), and this fact will be rolled into our existence theorem.
Although we have chosen to formulate the Bray flow in terms of the pair
(Ωt , gt ) on M , we should really think of it as a flow of asymptotically flat
manifolds with boundary $(M \setminus \Omega_t, g_t)$. This flow preserves the nonnegative
scalar curvature condition and the minimal boundary condition, and the
idea is that it should “improve” the space in the sense that these manifolds
should flow toward Schwarzschild space. For this to be the case, we expect
Schwarzschild space to be a sort of fixed point for the flow. More accurately,
Schwarzschild space should be thought of as a “soliton solution” of the Bray
flow, as illustrated in the following example.
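Exercise 4.58. Fix $m > 0$. Let $M = \mathbb{R}^n$ and $(g_t)_{ij} = U_t^{\frac{4}{n-2}} \delta_{ij}$, where
\[
U_t(x) =
\begin{cases}
e^{-t} + \dfrac{m}{2} |x|^{2-n} e^{t} & \text{for } |x| \ge \left(\dfrac{m}{2}\right)^{\frac{1}{n-2}} e^{\frac{2t}{n-2}}, \\[2ex]
\sqrt{2m}\, |x|^{\frac{2-n}{2}} & \text{for } \left(\dfrac{m}{2}\right)^{\frac{1}{n-2}} \le |x| < \left(\dfrac{m}{2}\right)^{\frac{1}{n-2}} e^{\frac{2t}{n-2}}, \\[2ex]
U_0(x) & \text{for } |x| < \left(\dfrac{m}{2}\right)^{\frac{1}{n-2}};
\end{cases}
\]
here $U_0$ is any smooth positive fill-in function whose details are unimportant.
Let $\Omega_t = \big\{ |x| < \left(\frac{m}{2}\right)^{\frac{1}{n-2}} e^{\frac{2t}{n-2}} \big\}$. Verify that $(M, g_t, \Omega_t)$ is a Bray flow for
$t \ge 0$, and that for each $t \ge 0$, $(M \setminus \Omega_t, g_t)$ is isometric to half of the
Schwarzschild space of mass $m$.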
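The isometry claim in this exercise comes from a rescaling of coordinates. The sketch below reads the exterior conformal factor as $U_t(x) = e^{-t} + \frac{m}{2}|x|^{2-n}e^{t}$ and assumes that the Schwarzschild metric of mass $m$ is written in the conformally flat form $\big(1 + \frac{m}{2}|y|^{2-n}\big)^{\frac{4}{n-2}}\delta_{ij}$; the coordinate $y$ is introduced here only for illustration.

```latex
% Rescale: y = e^{-2t/(n-2)} x, so |y|^{2-n} = e^{2t}|x|^{2-n} and |dx|^2 = e^{4t/(n-2)}|dy|^2.
\[
U_t(x) = e^{-t} + \frac{m}{2}|x|^{2-n}e^{t}
       = e^{-t}\Big(1 + \frac{m}{2}|y|^{2-n}\Big),
\]
\[
U_t^{\frac{4}{n-2}}\,|dx|^2
= e^{-\frac{4t}{n-2}}\Big(1 + \frac{m}{2}|y|^{2-n}\Big)^{\frac{4}{n-2}} e^{\frac{4t}{n-2}}\,|dy|^2
= \Big(1 + \frac{m}{2}|y|^{2-n}\Big)^{\frac{4}{n-2}}|dy|^2 .
\]
% The region |x| \ge (m/2)^{1/(n-2)} e^{2t/(n-2)} corresponds to |y| \ge (m/2)^{1/(n-2)},
% which is the exterior of the Schwarzschild horizon.
```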
Theorem 4.59 (Existence of Bray flow). Let $n < 8$, and let $(M^n, g_0)$ be a
one-ended asymptotically flat manifold with boundary. Let $\Sigma_0 = \partial\Omega_0$ be a
strictly outward-minimizing minimal hypersurface. Then there exists a Bray
flow $(M, g_t, \Omega_t)$ for all $t \ge 0$. Moreover:
• $\Sigma_t = \partial\Omega_t$ is an apparent horizon in $(M, g_t)$.
• For all $t_2 > t_1 \ge 0$, $\Sigma_{t_2}$ encloses $\Sigma_{t_1}$ without touching it.
• There are at most countably many jump times, and each $\Sigma_t^\pm$ is
smooth.
Sketch of the proof. The proof is quite technical, so we will only provide
an outline. (For the full details, see [Bra01, Section 4].) We will start with
a version of Bray flow in which vt and Ωt are discrete in time, and then we
take a limit as the discrete time intervals shrink to zero.
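Let $\epsilon > 0$. We will iteratively define $v_t^\epsilon$, $u_t^\epsilon$, $g_t^\epsilon$, and $\Omega_t^\epsilon$. We know what
all of these should be at $t = 0$ (namely, their counterparts without the
superscripts). Assume that we have defined these objects for $t \in [0, k\epsilon]$,
where $k$ is a nonnegative integer. Then we define them for $t \in (k\epsilon, (k+1)\epsilon]$
as follows. The function $v_t^\epsilon$ is defined to be 0 on $\Omega_{k\epsilon}^\epsilon$, and outside $\Omega_{k\epsilon}^\epsilon$, $v_t^\epsilon$ is
the unique solution to the Dirichlet problem
\[
\begin{cases}
\Delta_{g_0} v_t^\epsilon(x) = 0 & \text{on } M \setminus \Omega_{k\epsilon}^\epsilon, \\
v_t^\epsilon(x) = 0 & \text{at } \partial\Omega_{k\epsilon}^\epsilon, \\
\lim_{x \to \infty} v_t^\epsilon(x) = -(1 - \epsilon)^k .
\end{cases}
\]
Then we define
\[
u_t^\epsilon = 1 + \int_0^t v_s^\epsilon \, ds ,
\qquad
g_t^\epsilon = (u_t^\epsilon)^{\frac{4}{n-2}} g_0 ,
\]
and we define $\Omega_t^\epsilon$ to equal $\Omega_{k\epsilon}^\epsilon$ for all $t \in (k\epsilon, (k+1)\epsilon)$, while $\Omega_{(k+1)\epsilon}^\epsilon$ is the
strictly minimizing hull of $\Omega_{k\epsilon}^\epsilon$ with respect to the metric $g_{(k+1)\epsilon}^\epsilon$.
The crucial nontrivial fact is that the hypersurfaces $\Sigma_t^\epsilon := \partial\Omega_t^\epsilon$ are not
only smooth, but also their local $C^{k,\alpha}$ bounds can be shown to be independent
of $\epsilon$. This is proved in [Bra01, Appendix E], using regularity theory of
sets of finite perimeter developed by E. De Giorgi [DG61] (see [MM84]).
In particular, this regularity guarantees that $v_t^\epsilon(x)$ has Lipschitz bounds in
$x$ independent of $\epsilon$, and consequently $u_t^\epsilon(x)$ has Lipschitz bounds in both $x$
and $t$ independent of $\epsilon$. By Arzela-Ascoli, we can find a sequence $\epsilon_i$ such
that $u_t^{\epsilon_i}(x)$ converges uniformly on compact sets to some Lipschitz function
$u_t(x)$.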
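The discrete boundary value is chosen so that it recovers the continuous one in the limit. The following sketch reads the discrete condition as $\lim_{x\to\infty} v_t^\epsilon = -(1-\epsilon)^k$ on the step beginning at time $k\epsilon$, with $\epsilon$ the time step; this reading of the step size is an assumption made here for illustration.

```latex
% At the grid time t = k\epsilon:
\[
(1-\epsilon)^{k} = (1-\epsilon)^{t/\epsilon}
= \exp\!\Big(\frac{t}{\epsilon}\log(1-\epsilon)\Big)
\longrightarrow e^{-t}
\quad (\epsilon \to 0),
\]
% since \log(1-\epsilon) = -\epsilon + O(\epsilon^2).
% This matches the condition \lim_{x\to\infty} v_t(x) = -e^{-t} in the definition of the Bray flow.
```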
We next define $g_t = u_t^{\frac{4}{n-2}} g_0$ and $\Omega_t$ to be the strictly minimizing hull of
$\Omega_0$ with respect to $g_t$. The $C^{k,\alpha}$ bounds on $\Sigma_t^\epsilon$ allow us to extract smooth
limit hypersurfaces $\Sigma_t^{(\gamma)}$ of $\Sigma_t^{\epsilon_i}$, where $\gamma$ is an index for these limit surfaces.
Unfortunately, it need not be the case that $\Omega_t^{(\gamma)} = \Omega_t$, where $\Omega_t^{(\gamma)}$ is the
region enclosed by $\Sigma_t^{(\gamma)}$. However, one can show (with some work) that if
$t_1 \le t_2 \le t_3$, then
\[
\Omega_{t_1} \subset \Omega_{t_2}^{(\gamma)} \subset \Omega_{t_3} . \tag{4.8}
\]
See [Bra01, Section 4] for details.

We define $v_t$ to be 0 on $\Omega_t$, and outside $\Omega_t$, we define $v_t$ to be the unique
solution to the Dirichlet problem
\[
\begin{cases}
\Delta_{g_0} v_t(x) = 0 & \text{on } M \setminus \Omega_t, \\
v_t(x) = 0 & \text{at } \Sigma_t, \\
\lim_{x \to \infty} v_t(x) = -e^{-t}.
\end{cases}
\]
We claim that for almost every $t$, $v_t^{\epsilon_i}(x)$ converges to $v_t(x)$ as $i \to \infty$. Since
$\Omega_t$ has smooth boundary, there can only be countably many times when $\Omega_t$
does not equal $\bigcap_{s>t} \Omega_s$. As long as $t$ is not one of those “jump times” for
the flow, it follows from (4.8) that we do have $\Omega_t^{(\gamma)} = \Omega_t$. This implies that
$\Sigma_t^{\epsilon_i}$ converges to $\Sigma_t$, which implies the claim.
Given the claim, it follows easily that $u_t = 1 + \int_0^t v_s \, ds$.
Next we explain why, for $t_1 < t_2$, $\Sigma_{t_1}$ does not touch $\Sigma_{t_2}$. Since the
metric $g_t$ can only get smaller, one can show that since the hypersurface
$\Sigma_{t_1}$ is minimal with respect to $g_{t_1}$, it must have nonpositive mean curvature
with respect to $g_{t_2}$. Therefore the strong comparison principle (Corollary 4.2)
shows that $\Sigma_{t_2}$ cannot touch $\Sigma_{t_1}$.
Using this fact, we see that for each $x_0 \in M$, there is at most one time $t$
such that $x_0 \in \Sigma_t$. Since $v_t$ is smooth everywhere except at $\Sigma_t$ and has
Lipschitz bounds, one can conclude that $u_t$ is actually $C^1$ in $x$, and it is
clear that $u_t$ must be smooth outside $\Sigma_t$.
4.3.2. Volume of the apparent horizon and monotonicity of mass.

In order to prove the Penrose inequality (Theorem 4.54), we will prove that
the Bray flow on an asymptotically flat manifold with nonnegative scalar
curvature has the following properties:
• The volume of Σt in (M, gt ) is constant in t. Call it A.

• The ADM mass of (M, gt ), which we will call m(t), is nonincreasing.
• With the right choice of coordinates in the asymptotically flat end,
the metric gt outside Σt converges to a Schwarzschild metric of
mass m∞ .
• The volume A∞ of the apparent horizon in this Schwarzschild man-
ifold is greater than or equal to A.
Once we have established these properties, the inequality in Theorem 4.54

follows quite easily. Let (M n , g) be a complete one-ended asymptotically
flat manifold with an apparent horizon boundary and nonnegative scalar
curvature. If n < 8, we can invoke Theorem 4.59 to construct a Bray flow
(M, gt , Ωt ) with initial condition (M, g, ∅). (Recall that with our conven-
tions, we have ∂∅ = ∂M = Σ0 .) Then the itemized list above tells us that

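\[
m_{\mathrm{ADM}}(g) \ge \lim_{t \to \infty} m_{\mathrm{ADM}}(g_t) = m_\infty
= \frac12 \left( \frac{A_\infty}{\omega_{n-1}} \right)^{\frac{n-2}{n-1}}
\ge \frac12 \left( \frac{A}{\omega_{n-1}} \right)^{\frac{n-2}{n-1}} .
\]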
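The middle equality, relating the mass of a Schwarzschild manifold to the volume of its horizon, can be checked directly. The sketch assumes the conformally flat form $\big(1 + \frac{m}{2}|y|^{2-n}\big)^{\frac{4}{n-2}}\delta_{ij}$ of the Schwarzschild metric of mass $m$, with horizon at $|y| = r_0 := (m/2)^{\frac{1}{n-2}}$, and that $\omega_{n-1}$ is the volume of the unit $(n-1)$-sphere; the symbols $y$ and $r_0$ are introduced here only for illustration.

```latex
% Conformal factor on the horizon |y| = r_0, where r_0^{n-2} = m/2:
\[
1 + \frac{m}{2}\, r_0^{2-n} = 2 .
\]
% An (n-1)-dimensional volume scales by the conformal factor to the power 2(n-1)/(n-2):
\[
A_\infty = \omega_{n-1}\, r_0^{\,n-1}\, 2^{\frac{2(n-1)}{n-2}}
\quad\Longrightarrow\quad
\left(\frac{A_\infty}{\omega_{n-1}}\right)^{\frac{n-2}{n-1}}
= r_0^{\,n-2}\cdot 2^{2} = 2m ,
\]
% so that
\[
\frac12\left(\frac{A_\infty}{\omega_{n-1}}\right)^{\frac{n-2}{n-1}} = m .
\]
```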
Lemma 4.60. Assume $(M, g_t, \Sigma_t)$ is a Bray flow as constructed in Theorem 4.59.
Then the volume of $\Sigma_t$ in $(M, g_t)$ is constant in $t$. Moreover,
$\Sigma_t^+ = \Sigma_t$ for all $t \ge 0$.
Sketch of the proof. We will first give the basic idea of why it is true by
considering the case when Σt varies smoothly in t at t = t0 . In this case we
have
\[
\frac{d}{dt}\Big|_{t=t_0} |\Sigma_t|_{g_t}
= \frac{d}{dt}\Big|_{t=t_0} |\Sigma_t|_{g_{t_0}}
+ \frac{d}{dt}\Big|_{t=t_0} |\Sigma_{t_0}|_{g_t} .
\]
The first term vanishes because Σt0 is minimal in gt0 , while the second term
vanishes because gt is unchanging at Σt0 at t = t0 .
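The vanishing of the second term can be made explicit. This is a sketch in the smooth setting, assuming the formulation $g_t = u_t^{\frac{4}{n-2}} g_0$ with $\frac{d}{dt}u_t = v_t$ and $v_t = 0$ on $\Sigma_t$.

```latex
% Volume element of the fixed hypersurface \Sigma_{t_0} (dimension n-1) with respect to g_t:
\[
d\mu_{g_t} = u_t^{\frac{2(n-1)}{n-2}}\, d\mu_{g_0} .
\]
% Differentiate in t at t = t_0:
\[
\frac{d}{dt}\Big|_{t=t_0} |\Sigma_{t_0}|_{g_t}
= \frac{2(n-1)}{n-2} \int_{\Sigma_{t_0}} \frac{v_{t_0}}{u_{t_0}}\, d\mu_{g_{t_0}} = 0 ,
\]
% because v_{t_0} vanishes on \Sigma_{t_0} = \partial\Omega_{t_0}.
```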
In order to prove the result in the general case, we must delve into
the inner workings of the proof of Theorem 4.59. Specifically, we show
that every limit surface $\Sigma_t^{(\gamma)}$ with respect to $g_t$ has volume equal to $A :=
|\Sigma_0|_{g_0}$. (Actually, this fact is needed in the proof of Theorem 4.59 in order
to establish (4.8).) It suffices to show that $A^\epsilon(t) := |\Sigma_t^\epsilon|_{g_t^\epsilon}$ approaches $A$ as
$\epsilon \to 0$. This, in turn, can be proved by showing that
\[
A^\epsilon((k+1)\epsilon) - A^\epsilon(k\epsilon) = o(\epsilon).
\]
Going back to the definitions and using the outward-minimizing property,
this difference can be estimated in terms of the size of $v_{k\epsilon}^\epsilon$ at $\Sigma_{(k+1)\epsilon}^\epsilon$. Finally,
the size of this quantity can be bounded using the uniform estimates on $\Sigma_t^\epsilon$
for most values of $k$. For details, see [Bra01, Section 5].
To see why $\Sigma_t^+ = \Sigma_t$, note that by lower semicontinuity of perimeter, we
have
\[
|\Sigma_t^+|_{g_t} \le \lim_{s\to t^+} |\Sigma_s|_{g_t} = \lim_{s\to t^+} |\Sigma_s|_{g_s} = |\Sigma_t|_{g_t}.
\]
Since $\Sigma_t$ is strictly outward-minimizing and $\Sigma_t^+$ encloses it, it follows that
they must be equal.
Our next task is to show that $m(t)$ is nonincreasing. Whereas Lemma 4.60
is essentially a consequence of the way that the Bray flow is constructed,
the monotonicity of $m(t)$ is at the heart of the overall argument.
For each time $t$, consider the two-ended manifold $(\bar M_t, \bar g_t)$ obtained by
taking $(M \setminus \Omega_t, g_t)$ and gluing it to itself along $\Sigma_t$. That is, $(\bar M_t, \bar g_t)$ is
the “double” of $M \setminus \Omega_t$, obtained by reflecting it through its boundary $\Sigma_t$.
Let $\omega_t$ be the $\bar g_t$-harmonic function on $\bar M_t$ that approaches 1 at one end
and 0 at the other end. We can use $\omega_t$ to conformally close the 0-end by
considering the metric $\tilde g_t = (\omega_t)^{\frac{4}{n-2}} \bar g_t$ on $\bar M_t$. The result is a new one-ended
manifold $(\tilde M_t = \bar M_t \cup \{\mathrm{pt}\}, \tilde g_t)$ with nonnegative scalar curvature. (Recall
that we did something similar to this step in our proof of Lemma 3.39.) See
Figure 4.5 for an illustration of this procedure. This construction involving
doubling the manifold and then conformally closing up the newly created
asymptotically flat end was inspired by [BMuA87]. (See Theorem 6.25.)
Figure 4.5. The space $M \setminus \Omega_t$ is doubled through its boundary $\Sigma_t$,
and then we use a harmonic conformal factor $\omega_t$ to close up the newly
created end, creating a new space $(\tilde M_t, \tilde g_t)$.
Lemma 4.61. Assume (M, gt , Ωt ) is a Bray flow as constructed in Theorem 4.59.
Let $\tilde m(t)$ be the mass of $(\tilde M_t, \tilde g_t)$. Then
\[
m(t) = m(0) - 2\int_0^t \tilde m(s)\,ds.
\]
Corollary 4.62. Assume (M, gt , Ωt ) is a Bray flow as constructed in Theorem 4.59
and that the initial metric g0 has nonnegative scalar curvature.
Then m(t) is monotone nonincreasing.
Proof. By Lemma 4.57, we know that $(\bar M_t, \bar g_t)$ has nonnegative scalar curvature,
and thus, by equation (1.6), $(\tilde M_t, \tilde g_t)$ also has nonnegative scalar
curvature. By the positive mass theorem (Theorem 3.18), we should have
$\tilde m(t) \ge 0$. Then the corollary follows from Lemma 4.61.
There is a complication here, which is that $(\bar M_t, \bar g_t)$ is not smooth at the
hypersurface $\Sigma_t$ where the gluing occurred, and therefore neither is $(\tilde M_t, \tilde g_t)$.
But although the positive mass theorem (Theorem 3.18) does not directly
apply, we can use Theorem 4.17 instead. (We can also use Theorem 3.43 to
take care of the lack of smoothness at the point added at infinity.)
Proof of Lemma 4.61. By Corollary A.38, we can expand
\[
v_t = -e^{-t} + b(t)|x|^{2-n} + O(|x|^{2-n-\gamma})
\]
for some function $b(t)$ and some $\gamma > 0$. Integrating this, we obtain
\[
u_t = e^{-t} + B(t)|x|^{2-n} + O(|x|^{2-n-\gamma}),
\]
where $B(t) = \int_0^t b(s)\,ds$.
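By Exercise 4.55, observe that $\nu_t = \frac{v_t}{u_t}$ is $g_t$-harmonic and equal to $-1$
at infinity and 0 at $\Sigma_t$. By symmetry, we know that the function $\omega_t$ used
in the construction of $(\tilde M_t, \tilde g_t)$ must be equal to $\frac{1}{2}\left(1 - \frac{v_t}{u_t}\right)$ on one end (and
$\frac{1}{2}\left(1 + \frac{v_t}{u_t}\right)$ on the end to be closed up). Therefore, in the one end of $\tilde M_t$
outside of $\Sigma_t$, we have
\[
\tilde g_t = \left[\frac{1}{2}\left(1 - \frac{v_t}{u_t}\right)\right]^{\frac{4}{n-2}} g_t
= \left[\frac{1}{2}(u_t - v_t)\right]^{\frac{4}{n-2}} g_0.
\]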
Note that
\[
\frac{1}{2}(u_t - v_t) = e^{-t} + \frac{1}{2}(B(t) - b(t))|x|^{2-n} + O(|x|^{1-n-\gamma}),
\]
and so by Exercise 3.13, we have
\[
\tilde m(t) = e^{-2t} m(0) + e^{-t}(B(t) - b(t)). \tag{4.9}
\]
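Using $e^{-t}$ as an integrating factor, that is, using the identity
$\frac{d}{ds}\left(e^{-s}B(s)\right) = -e^{-s}(B(s) - b(s))$, we can integrate this to obtain
\[
\int_0^t \tilde m(s)\,ds = \frac{1}{2}(1 - e^{-2t})\,m(0) - e^{-t}B(t).
\]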
Finally, applying Exercise 3.13 to the conformal change $g_t = u_t^{\frac{4}{n-2}} g_0$, we
have
\[
m(t) = e^{-2t} m(0) + 2e^{-t}B(t). \tag{4.10}
\]
Combining this with the previous equation yields the desired result.
Note: This proof is a bit easier to follow if one assumes $\frac{\partial}{\partial t} u_t = v_t$, in
which case one can simply differentiate equation (4.10) and compare it with
equation (4.9) to see that $m'(t) = -2\tilde m(t)$. However, at jump times, the
function $m(t)$ only has left and right side derivatives.
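The relation between equations (4.9), (4.10), and Lemma 4.61 can be sanity-checked numerically. The following is a minimal sketch, not part of the proof: the value of m(0) and the sample coefficient b(t) = sin t are hypothetical choices made only for illustration, and the integral is approximated with a composite Simpson rule.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

# Hypothetical sample data, chosen only for illustration.
m0 = 1.5                            # plays the role of m(0)
b = lambda t: math.sin(t)           # coefficient b(t) in the expansion of v_t
B = lambda t: 1.0 - math.cos(t)     # B(t) = integral of b from 0 to t

def m_tilde(t):
    # Equation (4.9)
    return math.exp(-2 * t) * m0 + math.exp(-t) * (B(t) - b(t))

def m(t):
    # Equation (4.10)
    return math.exp(-2 * t) * m0 + 2 * math.exp(-t) * B(t)

def lemma_rhs(t):
    # Right side of Lemma 4.61: m(0) - 2 * (integral of m_tilde over [0, t])
    return m0 - 2 * simpson(m_tilde, 0.0, t)

for t in (0.5, 1.0, 3.0):
    assert abs(m(t) - lemma_rhs(t)) < 1e-9
```

Any other smooth choice of b with a known antiderivative B vanishing at 0 should work equally well, since the check only uses B' = b.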
4.3.3. Convergence to Schwarzschild. The last step is to prove that

the Bray flow converges to a Schwarzschild space. As seen in Lemma 4.61,
m̃(t) provides a useful measure of how much gt is changing, or, in other
words, how much vt deviates from what it would be on Schwarzschild space.
In light of this, we can see that the following lemma is likely to be useful.
Lemma 4.63. Assume (M, gt , Ωt ) is a Bray flow as constructed in Theorem 4.59
and that the initial metric g0 has nonnegative scalar curvature.
Then $\lim_{t\to\infty} \tilde m(t) = 0$.
Proof. By Lemma 4.61, $\int_0^\infty \tilde m(s)\,ds$ must be finite. So in order to prove the
lemma, it suffices to prove that $\tilde m(t)$ has a one-sided bound on its difference
quotients.
By the definition of $v_t$, $e^t v_t$ is the unique $g_0$-harmonic function that is $-1$ at
infinity and 0 at $\Sigma_t$. Since $\Sigma_t$ moves outward, the maximum principle shows
that for each fixed $x$, the function $e^t v_t(x)$ is nondecreasing in $t$. Using the
same notation as in the proof of Lemma 4.61, this tells us that $e^t b(t)$ is
nondecreasing. By equations (4.9) and (4.10), we have
\[
e^t b(t) = e^t B(t) + m(0) - e^{2t}\tilde m(t)
= e^{2t}\left(\frac{1}{2} m(t) - \tilde m(t)\right) + \frac{1}{2} m(0).
\]
Therefore $e^{2t}(m(t) - 2\tilde m(t))$ is nondecreasing. Next we use the fact that
if $f$ is a function such that $e^{2t} f$ is nondecreasing, then $f + 2\int f$ is also
nondecreasing. (When $f$ is differentiable, this is immediate from
$(e^{2t} f)' = e^{2t}(f' + 2f)$.) Therefore
\[
m(t) - 2\tilde m(t) + 2\int_0^t (m(s) - 2\tilde m(s))\,ds \quad \text{is nondecreasing.}
\]
Using Lemma 4.61 to eliminate the integral of $\tilde m(s)$, we see that
\[
3m(t) - 2\tilde m(t) + 2\int_0^t m(s)\,ds \quad \text{is nondecreasing.}
\]
We will use this to show that the difference quotients of $\tilde m$ are bounded
above. Taking a difference quotient of the above expression with $h > 0$
shows that
\[
0 \le \frac{1}{h}\left[3m(t+h) - 3m(t) - 2\tilde m(t+h) + 2\tilde m(t) + 2\int_t^{t+h} m(s)\,ds\right]
\le -\frac{2}{h}\left(\tilde m(t+h) - \tilde m(t)\right) + 2m(t),
\]
where we used the fact that $m(t)$ is nonincreasing. Thus
\[
\frac{1}{h}\left(\tilde m(t+h) - \tilde m(t)\right) \le m(t) \le m(0).
\]
By Corollary 4.62 and nonnegativity of $m(t)$, we have $\int_0^\infty \tilde m \le \frac{1}{2} m(0)$. Since
$\tilde m(t)$ is integrable and has an upper bound on its difference quotients, the
result follows. (Once again, the argument is simpler if one assumes that
$m(t)$ and $\tilde m(t)$ are differentiable.)
Recall from Exercise 4.58 that the Schwarzschild space is not a fixed
point for the Bray flow but rather a “soliton” solution in the sense that
its evolution is related to the original via diffeomorphism (in the region
outside the horizon). Therefore, we only expect our Bray flow to converge
to Schwarzschild space after pulling back by a suitable diffeomorphism. Or
to put it another way, we are interested in the long-time behavior of the

region outside Σt , but, with respect to a fixed coordinate system, Σt is
running off to infinity. Therefore we will change our choice of coordinates
as t changes. One way to do this is to introduce a one-parameter group of
diffeomorphisms.
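Definition 4.64. Choose a smooth vector field $X$ on $M$ such that
\[
X = \frac{2}{n-2}\, r \frac{\partial}{\partial r}
\]
on $\mathbb{R}^n \setminus B_{r_0}$ for some large $r_0$, where $r = |x|$ is the radial coordinate on
$\mathbb{R}^n \setminus B_{r_0}$. (We extend $X$ inside the compact region so that it is smooth.)
Let $\Phi_t$ be the one-parameter group of diffeomorphisms of $M$ generated by $X$.
Given a Bray flow $(M, g_t, \Omega_t)$, we define the normalized Bray flow
$(M, g_t^*, \Omega_t^*)$ by
\[
\begin{aligned}
g_t^* &= \Phi_t^* g_t, \\
\Omega_t^* &= \Phi_t^{-1}(\Omega_t), \\
\Sigma_t^* &= \Phi_t^{-1}(\Sigma_t) = \partial\Omega_t^*.
\end{aligned}
\]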
Our goal will be to show that under the normalized Bray flow, $(M \setminus \Omega_t^*, g_t^*)$
converges to half of Schwarzschild space, which is a fixed point of the
normalized Bray flow. Recall from Lemma 4.19 that for the purpose of
proving the Penrose inequality, we may assume without loss of generality
that (M, g0 ) is harmonically flat outside a compact set, and we will do so
starting now. By Exercise 4.55, one can see that the Bray flow preserves
the property of being harmonically flat outside a compact set. Being har-
monically flat outside a compact set means that the metric is conformal to
Schwarzschild outside a compact set. Therefore in order to prove conver-
gence to Schwarzschild, we need only prove convergence of a single function.
One important step is to obtain at least a small amount of control
over Ω∗t .
Lemma 4.65. Assume (M, gt , Ωt ) is a Bray flow as constructed in Theorem 4.59,
and assume that g0 is harmonically flat outside a compact set. In
the harmonically flat coordinates, there exists some large r1 such that the
normalized region Ω∗t is always enclosed by the coordinate sphere |x| = r1 .
Idea of the proof. The basic idea is that if Σ∗t extends too far out, that
will cause it to have large volume, but we already know that it has fixed
volume A. The reason why it will have to have large volume is the following.
A well-known property of a smooth minimal hypersurface Σ in Euclidean
Rn is that if p ∈ Σ, then
\[
|\Sigma \cap B_r(p)| \ge \omega_{n-1} r^{n-1}.
\]
This is usually called the monotonicity property for minimal hypersurfaces.

(For example, see [CM11, Corollary 1.13].) The surface Σ∗t is not minimal
in Euclidean Rn , but it is outward-minimizing in (M, gt∗ ) which should be
close to Euclidean out near |x| = r1 . (Actually, one must use an inductive
argument, since this closeness actually depends on where Σ∗t is.) This is
enough to show that Σ∗t satisfies a monotonicity-like lower bound on volume.
For details of this argument, see [BL09, Section 3].
We know that $(M, g_0)$ is harmonically flat for $|x| > r_1$, where $r_1$ is given
by Lemma 4.65. Thus
\[
(g_0)_{ij}(x) = U^{\frac{4}{n-2}} \delta_{ij}
\]
for some harmonic function $U$, for $|x| > r_1$. We can extend $U$ to a globally
defined positive function on $M$, and then define the metric $\bar g$ via
\[
g_0 = U^{\frac{4}{n-2}} \bar g.
\]
The idea behind $\bar g$ is that we can use it as a background metric which is
flat where $|x| > r_1$. Now consider $r_0 > 0$, the vector field $X$, and the family
of diffeomorphisms $\Phi_t$ described in Definition 4.64, and define the following
quantities:
\[
\begin{aligned}
U_t &:= e^t (u_t U) \circ \Phi_t, \\
V_t &:= e^t (v_t U) \circ \Phi_t, \\
\bar g_t &:= e^{\frac{-4t}{n-2}} \Phi_t^* \bar g, \\
\tilde g_t^* &:= \Phi_t^* \tilde g_t \quad \text{outside } \Omega_t^*.
\end{aligned}
\]
Note that since Ut and Vt are harmonic (with respect to the Euclidean
metric), we can expand them in spherical harmonics as in Corollary A.19.
Exercise 4.66. With the definitions above, let $t_0 = \frac{n-2}{4}\log(r_1/r_0)$, and
verify the following facts:
• For $|x| > r_1$, $(\bar g_t)_{ij}(x) = \delta_{ij}$. That is, for $t > t_0$, $\bar g_t$ is also a flat
background metric outside a compact set. It is also equal to $\delta_{ij}$ for
$|x| > r_0$ when $t > t_0$.
• $g_t^* = U_t^{\frac{4}{n-2}} \bar g_t$.
• $\tilde g_t^* = W_t^{\frac{4}{n-2}} \bar g_t$, where $W_t := \frac{1}{2}(U_t - V_t)$ outside $\Omega_t^*$.
• $U_t = U_0 + 2\int_0^t (U_s - W_s + XU_s)\,ds$.
• The following expansions hold:
\[
\begin{aligned}
U_t &= 1 + \frac{m(t)}{2}|x|^{2-n} + O(|x|^{1-n}), \\
V_t &= -1 + \frac{m(t) - 2\tilde m(t)}{2}|x|^{2-n} + O(|x|^{1-n}), \\
W_t &= 1 + \frac{\tilde m(t)}{2}|x|^{2-n} + O(|x|^{1-n}).
\end{aligned}
\]
• In the region $|x| > r_1$, $U_t$, $V_t$, and $W_t$ are all (Euclidean) harmonic
functions. When $t > t_0$, they are harmonic in the region
$(M \setminus B_{r_0}) \setminus \Omega_t^*$.
Recall that the Schwarzschild space of mass m is essentially a fixed point

of the normalized Bray flow. On this model solution, we have
\[
\begin{aligned}
U_t &= 1 + \frac{m}{2}|x|^{2-n}, \\
V_t &= -1 + \frac{m}{2}|x|^{2-n}, \\
W_t &= 1.
\end{aligned}
\]
We will prove convergence to Schwarzschild by showing that, in general, the
O(|x|1−n) terms of Ut , Vt , and Wt vanish in the long-time limit. The key to
this is the function Wt , since it represents the conformal factor of a metric
whose mass m̃(t) vanishes in the limit (by Lemma 4.63). Since the mass is
approaching zero, we expect the metric g̃t∗ to get flatter, that is, for Wt to
approach 1.
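As a small consistency check on the expansions in Exercise 4.66, one can confirm that the stated leading terms of Ut and Vt force Wt = (Ut − Vt)/2 to have |x|^(2−n) coefficient m̃(t)/2, and that the Schwarzschild model corresponds to m̃ = 0. The sketch below tracks only the constant term and the |x|^(2−n) coefficient; the function name and the sample masses are hypothetical and serve only as illustration.

```python
from fractions import Fraction

def leading_coefficients(m, m_tilde):
    """Constant term and |x|^(2-n) coefficient of U_t, V_t and W_t = (U_t - V_t)/2.

    Inputs play the roles of m(t) and m~(t); O(|x|^(1-n)) remainders are ignored.
    """
    m, m_tilde = Fraction(m), Fraction(m_tilde)
    U = (Fraction(1), m / 2)
    V = (Fraction(-1), (m - 2 * m_tilde) / 2)
    W = tuple((u - v) / 2 for u, v in zip(U, V))
    return U, V, W

# Generic (hypothetical) masses: W_t has |x|^(2-n) coefficient m~(t)/2.
U, V, W = leading_coefficients(Fraction(3, 2), Fraction(1, 4))
assert W == (1, Fraction(1, 8))

# Schwarzschild model: m~ = 0, so W_t equals 1 to this order.
U, V, W = leading_coefficients(2, 0)
assert U == (1, 1) and V == (-1, 1) and W == (1, 0)
```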
Theorem 4.67 (Lee [Lee09]). Given $n \ge 3$, $\alpha > 1$, and $\epsilon > 0$, there exists
$\delta > 0$ with the following property.
Let $(M^n, g)$ be a complete asymptotically flat manifold with nonnegative
scalar curvature, with coordinates in some end satisfying
\[
g_{ij}(x) = W(x)^{\frac{4}{n-2}} \delta_{ij}
\]
for $|x| > r$, for some positive harmonic function $W$ on $\mathbb{R}^n \setminus \bar B_r(0)$ approaching
1 at infinity.
If $m_{ADM}(g) < \delta r^{n-2}$, then for all $|x| \ge \alpha r$,
\[
|W(x) - 1| < \epsilon \left(\frac{r}{|x|}\right)^{n-2}.
\]
This result was first proved by Bray in [Bra01] in the case where M is
spin. In that case, it follows from Witten’s spinor proof of the positive mass
theorem, and we provide the argument in Section 5.4.2. For the general
case, see [Lee09].
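Lemma 4.68. Assume (M, gt , Ωt ) is a Bray flow as constructed in Theorem 4.59,
and assume that $g_0$ has nonnegative scalar curvature and is harmonically
flat outside a compact set. Let $m_\infty = \lim_{t\to\infty} m(t)$. For any
$\alpha > 1$, the following limits hold uniformly over all $|x| \ge \alpha r_1$, where $r_1$ is the
constant given in Lemma 4.65:
\[
\begin{aligned}
\lim_{t\to\infty} U_t(x) &= 1 + \frac{m_\infty}{2}|x|^{2-n}, \\
\lim_{t\to\infty} V_t(x) &= -1 + \frac{m_\infty}{2}|x|^{2-n}, \\
\lim_{t\to\infty} W_t(x) &= 1.
\end{aligned}
\]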
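Proof. Define error terms
\[
\hat U_t(x) := U_t(x) - \left(1 + \frac{m(t)}{2}|x|^{2-n}\right)
\]
and
\[
\hat W_t(x) := W_t(x) - \left(1 + \frac{\tilde m(t)}{2}|x|^{2-n}\right).
\]
From Exercise 4.66, Lemma 4.61, and Lemma 4.65, it follows that for
$|x| > r_1$,
\[
\hat U_t = \hat U_0 + 2\int_0^t \left(\hat U_s - \hat W_s + \frac{2}{n-2}\, r \frac{\partial}{\partial r} U_s\right) ds. \tag{4.11}
\]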
By Theorem 4.67 and Lemma 4.63, we know that given any $\alpha > 1$,
$W_t$ converges to 1 uniformly over $|x| > \alpha r_1$, as $t \to \infty$, establishing the
last equation of the lemma to be proved. Or in other words, $\hat W_t$ converges
to 0 uniformly over $|x| \ge \alpha r_1$, as $t \to \infty$. Since $\hat W_t$ is harmonic of order
$O(|x|^{1-n})$ and converging to 0, it follows that for any $\epsilon > 0$, we can choose
$t$ large enough so that for all $|x| > 2\alpha r_1$,
\[
|\hat W_t(x)| < \epsilon C|x|^{1-n}
\]
for some constant $C$ independent of $\epsilon$. Analyzing equation (4.11), one may
conclude that for large enough $t$,
\[
|\hat U_t(x)| < 3\epsilon C|x|^{1-n}.
\]
The first equation of the lemma follows from this, and the second equation
follows immediately from the other two.
Lemma 4.69. Assume all of the hypotheses of Lemma 4.68 and further
suppose that $m_\infty > 0$. Let $r_\infty$ be the Schwarzschild radius (in conformal
coordinates) corresponding to $m_\infty$. That is, $r_\infty = (m_\infty/2)^{\frac{1}{n-2}}$. Assume
$r_0 < r_\infty$. Then there is a subsequence of $\Sigma_t^*$ that converges to the coordinate
sphere $S_{r_\infty}$ in Hausdorff distance.
Recall that we were free to choose r0 to be as small as we like.

Sketch of the proof. Using the general compactness theory for sets of fi-
nite perimeter, we can extract a subsequence Ω∗ti that converges to some
Ω∞ (in the sense that their characteristic functions converge in L1 ). We
can use the outward-minimizing property to help us show that Σ∗ti actually
converges to Σ∞ := ∂Ω∞ in the Hausdorff sense in the region where |x| > r0 .
The main problem is to show that Σ∞ = Sr∞ .
Since $V_t$ is harmonic on $(M \setminus B_{r_0}) \setminus \Omega_t^*$ for large $t$ and uniformly bounded
in $t$, we can choose a subsequence such that $V_{t_i}$ converges uniformly on
compact subsets of $(M \setminus B_{r_0}) \setminus \Omega_\infty$. Since the limit must be harmonic, it
follows from Lemma 4.68 that the limit is $V_\infty(x) = -1 + \frac{m_\infty}{2}|x|^{2-n}$.
Suppose that part of Σ∞ lies within Sr∞ . Choose a point x0 lying outside
Σ∞ but inside Sr∞ . Then V∞ (x0 ) > 0, and so we can find ti large enough so
that Vti (x0 ) > 0 while x0 lies outside Σti . But this contradicts the definition
of Vti .
Now suppose that part of Σ∞ lies outside Sr∞ . Then for some x0 ∈ Σ∞
and some r > 0, the ball B2r (x0 ) lies completely outside Sr∞ . We outline
a proof for how to obtain a contradiction from this. We know that V∞ is
bounded above by some negative number in Br (x0 ). On the other hand, we
know that Vti is zero at Σ∗ (ti ), which cuts through Br (x0 ). The only way
that this can happen is if the gradient of Vti is blowing up. More precisely,
one can show that it blows up badly enough that the energy of Vti blows up
as i → ∞. (For details of this argument, see [BL09, Section 3].) However,
we can bound the energy of Vti independently of i. To see this, note that Vti
is the harmonic function that is equal to 0 at Σti and −1 at infinity. Since
Σ∗ti is contained in Sr1 (by Lemma 4.65), the energy-minimizing property of
harmonic functions shows that the energy of Vti is less than the energy of
the harmonic function that is equal to 0 at Sr1 and −1 at infinity.
Lemma 4.70. Assume all of the hypotheses of Lemma 4.68. Then m∞ > 0.
Sketch of the proof. Just as in the proof of the previous lemma, we extract
a subsequence of $\Sigma_{t_i}$ that converges to some $\Sigma_\infty$ in Hausdorff distance
in the region where $|x| > r_0$. Using the outward-minimizing property of $\Sigma_{t_i}$
and the fact that the volume $A = |\Sigma^*_{t_i}|_{g^*_{t_i}}$ is constant, we can argue that the
part of $\Sigma_\infty$ where $|x| > r_0$ cannot be empty, so long as $r_0 < (A/\omega_{n-1})^{\frac{1}{n-1}}$.
See [BL09, Section 3] for details.
Once we know that there is a point in Σ∞ where |x| > r0 , we can use
the same energy estimate argument used in the previous lemma in order to
contradict the possibility that m∞ ≤ 0.
Proof of Theorem 4.54. We first establish the inequality. By Lemma 4.19,
we may assume without loss of generality all of the hypotheses
of Theorem 4.54 plus the assumption that the metric is harmonically flat
outside a compact set. Using Theorem 4.59, we construct a long-time Bray
flow with initial condition (M, g, ∅). By Lemma 4.70, the long-time limit
of the mass m∞ , which exists by Corollary 4.62, must be positive. Let r∞
be its corresponding Schwarzschild radius (in conformal coordinates). By
Lemma 4.60, A = |Σt |gt . Choose r0 < r∞ and use this r0 to define the
normalized Bray flow (M, gt∗ , Ω∗t ).
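Let $\epsilon > 0$. By Lemma 4.69, for large enough $t$, $\Sigma_t^*$ lies within the sphere
$S_{r_\infty + \epsilon}$. Since $\Sigma_t^*$ is outward-minimizing with respect to $g_t^*$, we can see that
for large enough $t$,
\[
\begin{aligned}
A = |\Sigma^*_{t_i}|_{g^*_{t_i}}
&\le |S_{r_\infty + \epsilon}|_{g^*_{t_i}} \\
&= \int_{S_{r_\infty + \epsilon}} U_{t_i}^{\frac{2(n-1)}{n-2}} \\
&\le \int_{S_{r_\infty + \epsilon}} \left(1 + \frac{m_\infty}{2}|x|^{2-n} + \epsilon\right)^{\frac{2(n-1)}{n-2}},
\end{aligned}
\]
which converges to $\omega_{n-1}(2m_\infty)^{\frac{n-1}{n-2}}$ as $\epsilon \to 0$. Finally, we know that $m_\infty \le
m_{ADM}(g)$ by monotonicity of mass (Corollary 4.62). Thus
\[
m_{ADM}(g) \ge \frac{1}{2}\left(\frac{A}{\omega_{n-1}}\right)^{\frac{n-2}{n-1}}. \tag{4.12}
\]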
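The limiting value ω_{n−1}(2m∞)^{(n−1)/(n−2)} above is exactly the horizon area of the Schwarzschild metric of mass m∞ in conformal coordinates, which is why (4.12) is sharp. The following numerical sketch illustrates this under two stated assumptions: that ω_{n−1} denotes the volume of the unit sphere in R^n, and that the Schwarzschild metric is U^{4/(n−2)} δ with U = 1 + (m/2)|x|^{2−n}. The function names are hypothetical.

```python
import math

def sphere_volume(k):
    """Volume of the unit sphere S^k in R^(k+1): 2 pi^((k+1)/2) / Gamma((k+1)/2)."""
    return 2 * math.pi ** ((k + 1) / 2) / math.gamma((k + 1) / 2)

def schwarzschild_horizon_area(m, n):
    """Area of the coordinate sphere |x| = r_inf in the metric U^(4/(n-2)) delta,
    where U = 1 + (m/2)|x|^(2-n) and r_inf = (m/2)^(1/(n-2))."""
    r_inf = (m / 2) ** (1 / (n - 2))
    U = 1 + (m / 2) * r_inf ** (2 - n)   # equals 2 on the horizon
    # A conformal factor U^(4/(n-2)) scales (n-1)-dimensional area by U^(2(n-1)/(n-2)).
    return sphere_volume(n - 1) * r_inf ** (n - 1) * U ** (2 * (n - 1) / (n - 2))

def penrose_lower_bound(area, n):
    """Right side of (4.12): (1/2) (A / omega_{n-1})^((n-2)/(n-1))."""
    return 0.5 * (area / sphere_volume(n - 1)) ** ((n - 2) / (n - 1))

for n in (3, 4, 7):
    for m in (0.5, 1.0, 3.0):
        A = schwarzschild_horizon_area(m, n)
        # Matches the limit computed in the proof ...
        assert math.isclose(A, sphere_volume(n - 1) * (2 * m) ** ((n - 1) / (n - 2)))
        # ... and gives equality in (4.12).
        assert math.isclose(penrose_lower_bound(A, n), m)
```

For n = 3 this reduces to the familiar relation A = 16πm² between horizon area and mass.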
Next we consider the case of equality. Suppose that we have initial data
$(M, \Omega, g)$, not necessarily harmonically flat outside a compact set, such that
\[
m_{ADM}(g) = \frac{1}{2}\left(\frac{A}{\omega_{n-1}}\right)^{\frac{n-2}{n-1}}.
\]
If we evolve $(M, \Omega, g)$ by Bray flow, Lemma 4.61 states that
\[
m(T) = m(0) - 2\int_0^T \tilde m(s)\,ds
\]
for any $T > 0$. Since the volume must remain constant (Lemma 4.60) and
the Penrose inequality (4.12) continues to hold for each $(M, \Omega_t, g_t)$, it follows
that
\[
\int_0^T \tilde m(s)\,ds \le 0.
\]
Since $\tilde m(s) \ge 0$ by Theorem 4.17, it follows that $\tilde m(s) = 0$ for almost
every $s$ in $[0, T]$. From Lemma 4.60, we know that $\Omega_0 = \Omega_0^+ = \bigcap_{s>0} \Omega_s$. In
particular, this implies that $\omega_s$ converges to $\omega_0$ as $s \to 0^+$ (where $\omega_s$ is the
conformal factor used in the construction of $\tilde g_s$ in the proof of Lemma 4.61).
Since $\tilde m(s)$ is picked up by the asymptotics of $\omega_s$, it follows that $\tilde m(0) = 0$.
By rigidity of the positive mass theorem, this means that $\tilde g_0$ is Euclidean
space. Reversing the construction of $\tilde g_0$, this means that the doubled metric
$(\bar M_0, \bar g_0)$ is globally conformal to Euclidean space $\mathbb{R}^n \setminus \{0\}$, where the
conformal factor is harmonic. This is only possible if $(\bar M_0, \bar g_0)$ is a Schwarzschild
space, or equivalently if $(M \setminus \Omega_0, g)$ is half of a Schwarzschild space. Note
that once again, because of the nonsmoothness of $\tilde g_0$, we have to invoke the
rigidity of Theorem 4.17. (See also Remark 3.45 for dealing with the singular
point in $(\tilde M_0, \tilde g_0)$.)
Chapter 5
Spin geometry
5.1. Background
In our presentation of spinors below, we will attempt to emphasize facts
that are most directly relevant to the computations that we will need to do
later on, while deemphasizing formalism and the conceptual side. Because
of this, our introduction to spinors will be quite brief and fairly shallow.
There are many excellent resources for a more thorough introduction to the
subject. Specifically, see the books [LM89] and [Har90].
5.1.1. Bundle constructions.

Definition 5.1. Let G be a Lie group, and let M be a smooth manifold. We
say that F is a principal G-bundle over M if there exist a smooth projection
map π : F −→ M and a smooth right group action F × G −→ F such that
the group action acts freely and transitively on the fibers π −1 (p) for each
p ∈ M.
For our purposes, it is not necessary to understand this definition abstractly
(or even know much about Lie groups), since we will only be concerned
with rather specific principal G-bundles. Note that the requirement
that the group G acts freely and transitively on the fibers tells us that G acts
by diffeomorphisms on each fiber, and moreover each fiber is (noncanoni-
cally) diffeomorphic to G. That is, each fiber may be thought of as an “affine
copy” of the group G.
On a more practical level, we can understand a principal G-bundle by
looking at its local trivializations. A local trivialization is an open set U in
M and a diffeomorphism Φ : π −1 (U ) −→ U ×G such that Φ is G-equivariant
in the sense that for any p ∈ U and g, h ∈ G, Φ−1 (p, g)·h = Φ−1 (p, gh). Each
local trivialization defines a distinguished local section s : U −→ π −1 (U ) by

s(p) = Φ−1 (p, e) for all p ∈ U , where e is the identity of G. Conversely, by
equivariance, this local section s actually determines the local trivialization
Φ over U , via the equation Φ−1 (p, g) = s(p) · g. Therefore a choice of local
trivialization of a principal G-bundle is equivalent to a choice of local section.
All of the information of the principal G-bundle can be recovered by local
trivializations covering M , together with the transition functions between
them. Given local sections si : Ui −→ π −1 (Ui ), the transition function
tij : Ui ∩ Uj −→ G is defined by the equation sj (p) = si (p) · tij .
Our most fundamental example of a principal G-bundle is the frame
bundle over a manifold M n . In this case, each element of F is a basis
of tangent vectors at some point p (that is, a frame at p), and π of this
element is the base point p. This frame bundle can be seen as a principal
GL(n)-bundle by considering the usual right action by GL(n) on the set of
bases of the vector space Tp M . As described above, any local frame field
s = (u1 , . . . , un ) over an open set U determines a local trivialization of this
bundle. Given an overlapping local frame field s′ = (u′1 , . . . , u′n ) over an open
set U′ , the transition function t : U ∩ U′ −→ GL(n) is the matrix-valued
function that takes the standard basis of Rn to the basis of Rn obtained
by writing u′1 , . . . , u′n in the u1 , . . . , un basis. We denote the frame bundle
by FGL .
If we have a Riemannian metric on M , then we can instead consider the
bundle F whose elements are orthonormal bases of tangent vectors. This is
called the orthonormal frame bundle FO , which can be viewed as a principal
O(n)-bundle over M . Note that one purely topological improvement of the
orthonormal frame bundle over the ordinary frame bundle is that the fibers
and the Lie group O(n) acting on them are compact. If the manifold is also
oriented, then we can further restrict to oriented orthonormal bases, which
leads us to the oriented orthonormal frame bundle FSO , which is a principal
SO(n)-bundle over M . The local trivializations and transition functions are
defined as they were for FGL with the essential difference being that for FSO ,
the transition functions now take values in SO(n).
Definition 5.2. Given a principal G-bundle F and a representation ρ :
G −→ GL(V ), there is an associated vector bundle V (M ) constructed by
taking the quotient of F × V by the diagonal action of G given by (x, v) · g =
(x · g, ρ(g −1 )v), where x ∈ F , v ∈ V , and g ∈ G.
Observe that each fiber of V (M ) is isomorphic to V and carries an action
of G. This construction neatly generalizes the way we turn pointwise
constructions such as $\Lambda^k V^*$ into global constructions such as $\Lambda^k T^*M$. (Take
a moment to think about what the corresponding representation is in this
case.) A local section s of F over U gives rise to a map φ : U × V −→ V (M )
given by φ(p, v) = [s(p), v] for any (p, v) ∈ U × V , where the brackets rep-
resent the quotient map F × V −→ V (M ). We will refer to this map φ as
a local trivialization of the bundle V (M ). (However, note that the usual
definition of a trivialization of a vector bundle uses U × Rm , which is related
to ours simply by composing with an isomorphism V ≅ Rm .) A transition
map t between two local trivializations of F gives rise to the transition map
ρ ◦ t for the corresponding trivializations of V (M ).
One can also use this formalism to generalize the way that the Levi-
Civita connection gives rise to connections on tensor bundles. Recall that
the concept of a connection on a vector bundle is equivalent to parallel
transport, and the concept of parallel transport generalizes nicely to princi-
pal G-bundles. For example, given the Levi-Civita connection on an oriented
Riemannian manifold M , we know what it means to parallel transport an
SO(n)-frame along a path in M . Although there is a general definition of
parallel transport (or connection) on a principal G-bundle, we will not need
it. Instead we will concretely explain how to use parallel transport in FSO
to directly define parallel transport in an associated bundle V (M ).
Let γ be a path in M starting at p. Choose a local SO(n)-frame field
s = (e1 , . . . , en ) over U , giving rise to the local trivialization π −1 (U ) −→
U × SO(n), under which s corresponds to the identity section of U × SO(n).
Define g(t) ∈ SO(n) so that (γ(t), g(t)) corresponds to the parallel trans-
port of s(p) in FSO along γ. Now select any v in V . We now define
[s(γ(t)), ρ(g(t))v] to be the parallel transport of [s(p), v] in V (M ) along γ.
In the local trivialization, we would simply write this as (γ(t), ρ(g(t))v) is
the parallel transport of v at p. One can show that this gives a well-defined
connection on V (M ). In words, after a choice of local SO(n)-frame, parallel
transport of any SO(n)-frame at a point along a curve can be thought of as
a parallel transport of an element of SO(n) along that curve, and then the
action of SO(n) on V tells us how to parallel transport an element of V (M )
along that curve.
5.1.2. Spinors. For n ≥ 3, it is a fact that π1 (SO(n)) = Z2 . (This is

not hard to see when n = 3, and then one can proceed inductively by
looking at the long exact sequence of homotopy groups of the fibration of
SO(n + 1) over S n .) We will explain how to construct an explicit nontrivial
double cover of SO(n) lying inside the Clifford algebra Cl(n). Given an inner
product space $V$, we define $\mathrm{Cl}(V)$ to be the free tensor algebra on $V$ modulo
the relation $v^2 = -|v|^2$ for all $v \in V$, that is,
\[
\mathrm{Cl}(V) = \left(\bigoplus_{r=0}^{\infty} \bigotimes^r V\right) \Big/ I,
\]
where $I$ is the ideal generated by the relations $v \otimes v = -|v|^2$ for all $v \in V$,
though we typically write the Clifford product without any multiplication
symbol. In particular, for any $v, w \in V$, we have $vw + wv = -2\langle v, w\rangle$ in
$\mathrm{Cl}(V)$. If $e_1, \ldots, e_n$ is an orthonormal basis, this reduces to the equation
\[
e_i e_j + e_j e_i = -2\delta_{ij}. \tag{5.1}
\]
Note that $V \subset \mathrm{Cl}(V)$ in a natural way. We use $\mathrm{Cl}(n)$ to denote the Clifford
algebra of $\mathbb{R}^n$ with the standard inner product. It is also not hard to see
that $\mathrm{Cl}(n)$ has dimension $2^n$ as a vector space.
Exercise 5.3. Let $e_1, \ldots, e_n$ be an orthonormal basis of $V$, and let $\theta^1, \ldots, \theta^n$
denote its dual basis. For each $i, j$ from 1 to $n$ with $i \ne j$, define
\[
A_{ji} := e_j \otimes \theta^i - e_i \otimes \theta^j \in \mathrm{End}(V).
\]
Show that for any $i \ne j$ and $v \in V$,
\[
A_{ji}(v) = \frac{1}{2}(e_i e_j v - v e_i e_j),
\]
where the right side is computed using Clifford multiplication.
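For readers who like to experiment, the relation (5.1), the dimension count, and the identity in Exercise 5.3 can all be checked in a small computational model of Cl(n). The sketch below is one possible model and an assumption of this illustration, not a construction from the text: an element is stored as a dictionary from basis blades (bitmasks of the indices appearing in a product e_{i1} · · · e_{ik} with i1 < · · · < ik) to coefficients. All function names are hypothetical, and indices run from 0 rather than 1.

```python
from itertools import combinations, permutations

def blade_sign(a, b):
    """Sign of the product of basis blades a and b (bitmasks), using e_i e_i = -1."""
    swaps, shifted = 0, a >> 1
    while shifted:
        swaps += bin(shifted & b).count("1")   # transpositions needed to sort
        shifted >>= 1
    squares = bin(a & b).count("1")            # each repeated e_i contributes -1
    return -1 if (swaps + squares) % 2 else 1

def mul(x, y):
    """Clifford product of elements stored as {blade bitmask: coefficient}."""
    out = {}
    for a, ca in x.items():
        for b, cb in y.items():
            out[a ^ b] = out.get(a ^ b, 0) + blade_sign(a, b) * ca * cb
    return {k: c for k, c in out.items() if c != 0}

def add(x, y, scale=1):
    """Return x + scale * y."""
    out = dict(x)
    for k, c in y.items():
        out[k] = out.get(k, 0) + scale * c
    return {k: c for k, c in out.items() if c != 0}

def e(i):
    return {1 << i: 1}

def vector(coords):
    return {1 << i: c for i, c in enumerate(coords) if c != 0}

n = 4

# Relation (5.1): e_i e_j + e_j e_i = -2 delta_ij.
for i in range(n):
    for j in range(n):
        assert add(mul(e(i), e(j)), mul(e(j), e(i))) == ({0: -2} if i == j else {})

# Ordered products of distinct basis vectors give 2^n distinct basis blades.
blades = set()
for k in range(n + 1):
    for idx in combinations(range(n), k):
        prod = {0: 1}
        for i in idx:
            prod = mul(prod, e(i))
        blades.update(prod)
assert len(blades) == 2 ** n

def A(j, i, coords):
    """A_ji(v) = theta^i(v) e_j - theta^j(v) e_i, as a Clifford element."""
    out = [0] * len(coords)
    out[j] += coords[i]
    out[i] -= coords[j]
    return vector(out)

# Exercise 5.3, checked on one sample vector: e_i e_j v - v e_i e_j = 2 A_ji(v).
coords = [3, -1, 4, 2]
v = vector(coords)
for i, j in permutations(range(n), 2):
    eiej = mul(e(i), e(j))
    twice_rhs = add(mul(eiej, v), mul(v, eiej), scale=-1)
    assert twice_rhs == add(A(j, i, coords), A(j, i, coords))
```

In this model the basis blades are linearly independent by construction, so the count of 2^n blades illustrates the dimension statement rather than proving it. Likewise, the final loop is a numerical spot check and not a substitute for the exercise.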
We define Spin(n) ⊂ Cl(n) to be all products of elements of the form

vw, where v, w ∈ Rn are unit vectors. It is easy to see that Spin(n) is a
group, and furthermore one can see that it is a Lie group because it is a
closed subgroup of the group of units in the algebra Cl(n) (which is open in
Cl(n)). We can define a homomorphism ξ : Spin(n) −→ SO(n) by defining
how each element of Spin(n) acts on Rn . Explicitly, each generator vw acts
on Rn via reflection through the plane orthogonal to w followed by reflection
through the plane orthogonal to v. (By “plane,” we mean plane through
the origin.)
Proposition 5.4. The map ξ : Spin(n) −→ SO(n) defined above is surjec-
tive, and its kernel is {1, −1}.
Proof. The surjectivity follows directly from the Cartan-Dieudonné Theorem
[Wik, Cartan-Dieudonne theorem], which says that every element of
SO(n) can be written as a product of an even number of reflections.
To compute the kernel, observe that if $x \in \mathbb{R}^n$ and $v$ is a unit vector
in $\mathbb{R}^n$, then $vxv$ is the reflection of $x$ through the plane orthogonal to $v$.
Consequently, the action of $vw \in \mathrm{Spin}(n)$ on $x \in \mathbb{R}^n$ described earlier is
$(vw)x(wv) = (vw)x(vw)^{-1}$, since $(vw)(wv) = 1$. From this it follows that for
any $\varphi \in \mathrm{Spin}(n)$, the action of $\varphi$ on $x \in \mathbb{R}^n$ is $\varphi x \varphi^{-1}$.
Exercise 5.5. Prove that the only elements of Spin(n) that commute with
every x ∈ Rn under Clifford multiplication are 1 and −1. Hint: Write out
the element of Spin(n) in terms of an orthogonal basis.
The exercise shows that if ϕ ∈ Spin(n) acts as the identity on Rn , then

it must be 1 or −1, completing the proof.
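The two facts used in this proof, that vxv is a reflection and that ϕ and −ϕ induce the same rotation, can be checked by hand in a small model of Cl(3). The sketch below rests on the same assumption as any such model: elements are stored as dictionaries from basis blades (bitmasks) to coefficients, which is a choice of this illustration and not notation from the text. Exact rational arithmetic is used so that equalities hold without rounding; the sample vectors are hypothetical.

```python
from fractions import Fraction as Fr

def blade_sign(a, b):
    """Sign of the product of basis blades a and b (bitmasks), using e_i e_i = -1."""
    swaps, shifted = 0, a >> 1
    while shifted:
        swaps += bin(shifted & b).count("1")
        shifted >>= 1
    return -1 if (swaps + bin(a & b).count("1")) % 2 else 1

def mul(x, y):
    """Clifford product of elements stored as {blade bitmask: coefficient}."""
    out = {}
    for a, ca in x.items():
        for b, cb in y.items():
            out[a ^ b] = out.get(a ^ b, 0) + blade_sign(a, b) * ca * cb
    return {k: c for k, c in out.items() if c != 0}

def vector(coords):
    return {1 << i: c for i, c in enumerate(coords) if c != 0}

def coords_of(elem, n):
    """Coordinates of a Clifford element that is a pure vector."""
    assert all(k in [1 << i for i in range(n)] for k in elem), "not a vector"
    return [elem.get(1 << i, 0) for i in range(n)]

def reflect(v, x):
    """Classical reflection of x through the plane orthogonal to the unit vector v."""
    dot = sum(vi * xi for vi, xi in zip(v, x))
    return [xi - 2 * dot * vi for vi, xi in zip(v, x)]

n = 3
v = [Fr(3, 5), Fr(4, 5), Fr(0)]        # unit vectors with rational entries
w = [Fr(0), Fr(5, 13), Fr(12, 13)]
x = [Fr(1), Fr(-2), Fr(7)]
V, W, X = vector(v), vector(w), vector(x)

# v x v is the reflection of x through the plane orthogonal to v.
assert coords_of(mul(mul(V, X), V), n) == reflect(v, x)

# phi = vw has inverse wv, and phi x phi^{-1} reflects through w-perp, then v-perp.
phi, phi_inv = mul(V, W), mul(W, V)
assert mul(phi, phi_inv) == {0: 1}
rotated = coords_of(mul(mul(phi, X), phi_inv), n)
assert rotated == reflect(v, reflect(w, x))

# -phi induces the same rotation, consistent with the kernel being {1, -1}.
neg = lambda el: {k: -c for k, c in el.items()}
assert coords_of(mul(mul(neg(phi), X), neg(phi_inv)), n) == rotated
```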
Exercise 5.6. Construct a path connecting 1 to −1 in Spin(n).
This exercise together with the preceding proposition shows that Spin(n)
is a connected Lie group that double covers SO(n). In particular, they must
have the same dimension, and since π1 (SO(n)) = Z2 , it follows that Spin(n)
is simply connected.
We are interested in the derivative of ξ. Technically, this derivative is
an isomorphism between Lie algebras, but we are primarily interested in
calculating this map explicitly.
Exercise 5.7. Consider the map
Dξ : T1 Spin(n) −→ TId SO(n),
where T1 Spin(n) is regarded as a subspace of Cl(n) and TId SO(n) is regarded
as a subspace of End(Rn ). Then for any i, j from 1 to n with i ≠ j,
we have
Dξ(ei ej ) = 2Aji ,
where e1 , . . . , en is an orthonormal basis for Rn , and Aji is defined as in
Exercise 5.3. Hint: Consider an appropriate path in Spin(n) and explicitly
compute its image under ξ.
An orientable manifold M is said to be spin if its oriented orthonormal

frame bundle FSO can be “lifted” to a principal Spin(n)-bundle FSpin . In
other words, this means that there exists a principal Spin(n)-bundle FSpin
double covering FSO in a way that (equivariantly) respects the double
cover Spin(n) −→ SO(n). This is a purely topological condition, independent
of choice of metric. More precisely, it is equivalent to the vanishing
of the second Stiefel-Whitney class. The particular choice of lifting (up to
homotopy) is called a spin structure on M . In terms of transition functions,
we can always locally lift the SO(n)-valued transition functions for FSO to
Spin(n)-valued transition functions, in two different ways. The property
of being spin means that this can be done in such a way that the cocycle
condition for building a fiber bundle from transition functions is satisfied.
Using one of these bundles FSpin , any representation of Spin(n) gives
rise to an associated bundle. Since there are representations of Spin(n) that
do not descend to representations of SO(n), we obtain new bundles that are
not tensor bundles. We refer to sections of these bundles as spinors. The
intuitive difference between a spinor and tensor is as follows. Both objects
transform as you perform rotations, but if you continuously rotate around
an axis until you perform a complete rotation (explicitly, this means you are
moving through a homotopically nontrivial loop in SO(n)), your spinor will

pick up a factor of −1, while the tensor returns to its original state. (The
physically important, remarkable fact is that actual physical quantities can
display this behavior!)
The true value of these bundles comes when we have a Clifford action
on them, so we would like to build a Clifford bundle Cl(M ) over a Riemannian
manifold M . Consider the following representation ρ : SO(n) −→
GL(Cl(Rn )). For each g ∈ SO(n) and any vectors v1 , . . . , vk ∈ Rn , we
define the action of g on their product in Cl(Rn ) to be ρ(g)(v1 · · · vk ) =
g(v1 ) · · · g(vk ). The Clifford bundle Cl(M ) is defined to be the associated
vector bundle of this representation. The fiber at each p ∈ M is the Clifford
algebra Cl(Tp M ). Note that this construction only requires the metric and
does not require a spin structure.
Now let S be a vector space that carries the structure of a real module
over Cl(n). One can show that S carries an inner product such that each
unit vector $v \in \mathbb{R}^n \subset \mathrm{Cl}(n)$ acts orthogonally on S. (See [LM89, Proposition 5.16] for a proof.) Since $v^2 = -1$, it follows that v acts as a skew-symmetric operator on S. Therefore all vectors in $\mathbb{R}^n \subset \mathrm{Cl}(n)$ act as skew-symmetric operators. Since Spin(n) ⊂ Cl(n), S is also a representation of Spin(n).
Therefore, if M is spin, we can use a principal Spin(n)-bundle FSpin to build
the associated bundle S(M ), which we will call a spinor bundle.1 Note that
although there are other bundles arising from representations of Spin(n), we
reserve the phrase spinor bundle for the ones that carry a Clifford action.
Observe that the module structure of S over Cl(n) carries over to the cor-
responding bundles, so that each fiber of Cl(M ) acts on the corresponding
fiber of S(M ). We will use · to denote this action.
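The step from "orthogonal with $v^2 = -1$" to "skew-symmetric" used above can be written out in one line. This is a sketch for a unit vector $v$ and arbitrary $s, t \in S$:

```latex
% Orthogonality gives <v.a, v.b> = <a, b>; apply it with a = v.s, b = t.
\[
\langle v\cdot s,\; t\rangle
  = \langle v\cdot(v\cdot s),\; v\cdot t\rangle
  = \langle v^2\cdot s,\; v\cdot t\rangle
  = -\langle s,\; v\cdot t\rangle .
\]
% By linearity the same identity holds for every vector v in R^n.
```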
The Levi-Civita connection extends to a connection on Cl(M ) via the
general bundle construction described earlier, but for purposes of calcula-
tion, we can understand it more easily by the fact that it obeys an appro-
priate Leibniz rule:
∇(στ ) = (∇σ)τ + σ(∇τ )
for any σ, τ ∈ C ∞ (Cl(M )).
We can also define a connection on S(M ) induced by the Levi-Civita
connection. Essentially the concept of parallel translation in FSO easily lifts
to parallel translation in FSpin via the covering property, and then we can
define parallel translation in S(M ) in the same way we described for tensor
bundles. Explicitly, let e = (e1 , . . . , en ) be an SO(n)-frame field over U .
This is a section of FSO , so it lifts to a section ẽ of FSpin . So it gives us a
1 What we have defined is a real spinor bundle. In the literature, complex spinor bundles,
which arise from complex modules over Cl(n), are more prevalent, but they are not necessary for
our purposes.
local trivialization U × Spin(n) of FSpin and U × S of S(M ). Let γ be a path

in M starting at p and consider parallel translation along γ. As we discussed
earlier, let g(t) ∈ SO(n) with g(0) = Id be such that (γ(t), g(t)) ∈ U ×SO(n)
corresponding to parallel translation of e(p) along γ in FSO . Consider the
unique continuous lifting g̃(t) of g(t) from SO(n) up to Spin(n) with g̃(0) = 1.
Then (γ(t), g̃(t)) is the parallel translation of ẽ(p) along γ in FSpin . Finally,
we define [ẽ(γ(t)), g̃(t) · s] to be the parallel translation of [ẽ(p), s] along γ in
S(M ) for any s ∈ S. In the local trivialization, we just say that (γ(t), g̃(t)·s)
is the parallel translation of s along γ.
We say that a local section ψ of S(M ) is constant with respect to the
frame e = (e1 , . . . , en ) if ψ is equal to some constant s ∈ S with respect
to the local trivialization of S(M ) coming from e. Or in other words, if
ψ = [ẽ, s].
In the following, we abuse notation slightly by writing everything in the
local trivialization. Set $X = \gamma'(0)$. Using the relationship between covariant
differentiation and parallel transport, we have at p, for any vector $v \in \mathbb{R}^n$,
\[
0 = \nabla_X (g(t)v) = g'(0)v + \nabla_X v = g'(0)v + \omega^i_j(X)(e_i \otimes \theta^j)v,
\]
where the $\omega^i_j$ are the connection 1-forms determined by the local frame (as
discussed in Section 1.1.4), and we use the Einstein summation convention.
Recall from Exercise 1.5 that $\omega^i_j$ is antisymmetric in i and j. Therefore
\[
g'(0) = -\omega^i_j(X)\, e_i \otimes \theta^j = -\frac{1}{2}\, \omega^i_j(X) A_{ji}.
\]
As a consequence of Exercise 5.7, we can see that
\[
\tilde g'(0) = -\frac{1}{4}\, \omega^i_j(X)\, e_i e_j.
\]
We define covariant differentiation in S(M) in terms of parallel transport,
so that we must have
\[
0 = \nabla_X (\tilde g(t) \cdot s) = \tilde g'(0) \cdot s + \nabla_X s.
\]
Therefore
\[
\nabla_X s = \frac{1}{4} \sum_{i,j=1}^n \omega^i_j(X)\, e_i e_j \cdot s.
\]
Hence, we have the following general formula for any spinor ψ that is constant
with respect to the given frame:
\[
\nabla \psi = \frac{1}{4} \sum_{i,j=1}^n \omega^i_j\, e_i e_j \cdot \psi. \tag{5.2}
\]
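As a quick illustration of the general formula for constant spinors, here is its specialization to $n = 2$. This is only a sketch; it uses nothing beyond the antisymmetry $\omega^i_j = -\omega^j_i$ and the relation $e_2 e_1 = -e_1 e_2$.

```latex
% n = 2: the diagonal terms vanish because \omega^1_1 = \omega^2_2 = 0.
\[
\nabla\psi
  = \frac{1}{4}\bigl(\omega^1_2\, e_1 e_2 + \omega^2_1\, e_2 e_1\bigr)\cdot\psi
  = \frac{1}{4}\bigl(\omega^1_2\, e_1 e_2 + (-\omega^1_2)(-e_1 e_2)\bigr)\cdot\psi
  = \frac{1}{2}\,\omega^1_2\; e_1 e_2\cdot\psi .
\]
% In general, each unordered pair {i, j} contributes twice, which is why the
% coefficient 1/4 over all (i, j) equals 1/2 over pairs i < j.
```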
The connection on S(M) respects the Hermitian product in the sense
that for any spinors φ, ψ ∈ C∞(S(M)),
\[
d\langle \phi, \psi \rangle = \langle \nabla \phi, \psi \rangle + \langle \phi, \nabla \psi \rangle.
\]
Exercise 5.8. Show that the above equation can be seen as a direct conse-
quence of (5.2) and the fact that vectors act as skew-symmetric operators
on S, by writing an arbitrary spinor as a linear combination of constant
spinors.
One can similarly show that the connection respects the module struc-
ture of S(M ) over Cl(M ) in the sense that for any v ∈ C ∞ (T M ) and
ψ ∈ C ∞ (S(M )),
∇(v · ψ) = (∇v) · ψ + v · (∇ψ).
5.2. The Dirac operator

We now define the Dirac operator D : C∞(S(M)) −→ C∞(S(M)). For
any ψ ∈ C∞(S(M)), we define
\[
D\psi = \sum_{i=1}^n e_i \cdot \nabla_i \psi,
\]
where e1 , . . . , en is any local orthonormal frame. The equation Dψ = 0 is

called the Dirac equation,2 and solutions of this equation are called harmonic
spinors.
Exercise 5.9. Check that the Dirac operator D is well-defined in the sense
that the definition above is independent of choice of local orthonormal frame.
Also, show that D is formally self-adjoint.
Theorem 5.10 (Schrödinger-Lichnerowicz formula). Let (M, g) be a Riemannian spin manifold. For any ψ ∈ C∞(S(M)),
\[
D^2 \psi = \nabla^* \nabla \psi + \frac{1}{4} R \psi,
\]
where $\nabla^*$ is the formal adjoint of the operator ∇ on S(M).
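A sanity check of the formula in the simplest setting may be helpful. The sketch below assumes flat $\mathbb{R}^n$ with the constant coordinate frame, so that $R = 0$, $\nabla_i = \partial_i$ on spinors that are written in a constant basis, and $\nabla^*\nabla = -\sum_i \partial_i^2$.

```latex
% Symmetrize in (i, j), using that partial derivatives commute.
\[
D^2\psi
  = \sum_{i,j=1}^n e_i e_j\, \partial_i \partial_j \psi
  = \frac{1}{2}\sum_{i,j=1}^n (e_i e_j + e_j e_i)\, \partial_i \partial_j \psi
  = -\sum_{i=1}^n \partial_i^2 \psi
  = \nabla^*\nabla\psi .
\]
% This agrees with the theorem, since R = 0 in flat space.
```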
2 This is not quite the same as the famous “Dirac equation” commonly used in quantum
physics, originally discovered by P. A. M. Dirac.

Proof. We compute at a point p, and choose an orthonormal frame $e_1, \ldots, e_n$
parallel at p. Then using the Clifford relations (5.1), we have
\[
\begin{aligned}
D^2 \psi &= \sum_{i,j=1}^n e_i \cdot \nabla_i (e_j \cdot \nabla_j \psi) \\
&= \sum_{i,j=1}^n e_i e_j \nabla_i \nabla_j \psi \\
&= \sum_{i=1}^n e_i e_i \nabla_i \nabla_i \psi + \sum_{i<j} e_i e_j (\nabla_i \nabla_j - \nabla_j \nabla_i) \psi \\
&= \nabla^* \nabla \psi + \sum_{i<j} e_i e_j R^S_{e_i, e_j} \psi,
\end{aligned}
\]
where $R^S_{e_i, e_j}$ denotes the curvature of the spinor bundle S(M). Let us compute
the operator $R^S$. Since it is a zero-order operator, it suffices to compute
how it acts on a spinor ψ that is constant with respect to the frame,
for which (5.2) applies:
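\[
\begin{aligned}
R^S \psi &= \nabla(\nabla \psi) \\
&= \nabla \Bigl( \frac{1}{4} \sum_{i,j=1}^n \omega^i_j\, e_i e_j \cdot \psi \Bigr) \\
&= \frac{1}{4} \sum_{i,j=1}^n \Bigl( \nabla(\omega^i_j)\, e_i e_j \cdot \psi + \omega^i_j \wedge e_i e_j \cdot \nabla \psi \Bigr) \\
&= \frac{1}{4} \sum_{i,j=1}^n d\omega^i_j\, e_i e_j \cdot \psi + \frac{1}{16} \sum_{i,j,k,\ell=1}^n \omega^i_j \wedge \omega^k_\ell\, e_i e_j e_k e_\ell \cdot \psi.
\end{aligned}
\]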
We use the Clifford relations (5.1) to commute $e_i e_j$ past $e_k e_\ell$:
\[
e_i e_j e_k e_\ell = -2\delta_{jk}\, e_i e_\ell + 2\delta_{j\ell}\, e_i e_k - 2\delta_{ik}\, e_\ell e_j + 2\delta_{i\ell}\, e_k e_j + e_k e_\ell e_i e_j.
\]
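For readers who want to see where each of the four correction terms comes from, the commutation can be carried out one transposition at a time. This is a sketch assuming the relation $e_a e_b = -e_b e_a - 2\delta_{ab}$, which is how (5.1) reads for an orthonormal frame.

```latex
\[
\begin{aligned}
e_i e_j e_k e_\ell
  &= -e_i e_k e_j e_\ell - 2\delta_{jk}\, e_i e_\ell
     && \text{(move } e_k \text{ past } e_j\text{)} \\
  &= e_i e_k e_\ell e_j + 2\delta_{j\ell}\, e_i e_k - 2\delta_{jk}\, e_i e_\ell
     && \text{(move } e_\ell \text{ past } e_j\text{)} \\
  &= -e_k e_i e_\ell e_j - 2\delta_{ik}\, e_\ell e_j
     + 2\delta_{j\ell}\, e_i e_k - 2\delta_{jk}\, e_i e_\ell
     && \text{(move } e_k \text{ past } e_i\text{)} \\
  &= e_k e_\ell e_i e_j + 2\delta_{i\ell}\, e_k e_j - 2\delta_{ik}\, e_\ell e_j
     + 2\delta_{j\ell}\, e_i e_k - 2\delta_{jk}\, e_i e_\ell
     && \text{(move } e_\ell \text{ past } e_i\text{)} .
\end{aligned}
\]
```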
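Therefore
\[
\begin{aligned}
\omega^i_j \wedge \omega^k_\ell\, e_i e_j e_k e_\ell
&= -2\omega^i_k \wedge \omega^k_\ell\, e_i e_\ell + 2\omega^i_\ell \wedge \omega^k_\ell\, e_i e_k - 2\omega^k_j \wedge \omega^k_\ell\, e_\ell e_j \\
&\quad + 2\omega^\ell_j \wedge \omega^k_\ell\, e_k e_j + \omega^i_j \wedge \omega^k_\ell\, e_k e_\ell e_i e_j \\
&= -2\omega^i_k \wedge \omega^k_j\, e_i e_j + 2\omega^i_k \wedge \omega^j_k\, e_i e_j - 2\omega^k_j \wedge \omega^k_i\, e_i e_j \\
&\quad + 2\omega^k_j \wedge \omega^i_k\, e_i e_j + \omega^k_\ell \wedge \omega^i_j\, e_i e_j e_k e_\ell \\
&= 8\omega^k_j \wedge \omega^i_k\, e_i e_j - \omega^i_j \wedge \omega^k_\ell\, e_i e_j e_k e_\ell,
\end{aligned}
\]
where the second equality used reindexing and the third equality used antisymmetry. Therefore
\[
\omega^i_j \wedge \omega^k_\ell\, e_i e_j e_k e_\ell = 4\omega^j_k \wedge \omega^k_i\, e_i e_j.
\]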
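Putting it all together, we obtain
\[
\begin{aligned}
R^S &= \frac{1}{4} \sum_{i,j=1}^n (d\omega + \omega \wedge \omega)_{ij}\, e_i e_j \\
&= \frac{1}{4} \sum_{i,j=1}^n \mathrm{Riem}(\cdot, \cdot, e_i, e_j)\, e_i e_j,
\end{aligned}
\]
by (1.3).
Substituting this into our calculation of $D^2 \psi$, we obtain
\[
D^2 \psi = \nabla^* \nabla \psi - \frac{1}{8} \sum_{i,j,k,\ell=1}^n R_{ijk\ell}\, e_i e_j e_k e_\ell\, \psi.
\]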
We now work on simplifying the second curvature term. We claim that
for any fixed ℓ,
\[
\sum_{i,j,k \text{ distinct}} R_{ijk\ell}\, e_i e_j e_k = 0.
\]
The reason for this is that the expression $e_i e_j e_k$ is invariant under cyclic
permutations, while the first Bianchi identity tells us that the sum of the
cyclic permutations of $R_{ijk\ell}$ in i, j, k vanishes. Explicitly, by reindexing, we
have
\[
\begin{aligned}
\sum_{i,j,k \text{ distinct}} R_{ijk\ell}\, e_i e_j e_k
&= \frac{1}{3} \sum_{i,j,k \text{ distinct}} \bigl( R_{ijk\ell}\, e_i e_j e_k + R_{jki\ell}\, e_j e_k e_i + R_{kij\ell}\, e_k e_i e_j \bigr) \\
&= \frac{1}{3} \sum_{i,j,k \text{ distinct}} (R_{ijk\ell} + R_{jki\ell} + R_{kij\ell})\, e_i e_j e_k \\
&= 0.
\end{aligned}
\]

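Therefore, in the sum $\sum_{i,j,k,\ell} R_{ijk\ell}\, e_i e_j e_k e_\ell$, we are only concerned with
when i, j, k are not distinct. Note that i = j contributes nothing because of
antisymmetry of $R_{ijk\ell}$, so we need only worry about when i = k and when
j = k. (Note that the intersection i = j = k does not contribute anything.)
Therefore, using only symmetries of the curvature tensor and properties of
Clifford multiplication, we have
\[
\begin{aligned}
\sum_{i,j,k,\ell=1}^n -R_{ijk\ell}\, e_i e_j e_k e_\ell
&= \sum_{i,j,\ell=1}^n \bigl( -R_{ijj\ell}\, e_i e_j e_j e_\ell - R_{iji\ell}\, e_i e_j e_i e_\ell \bigr) \\
&= \sum_{i,j,\ell=1}^n \bigl( -R_{jij\ell}\, e_i e_\ell - R_{iji\ell}\, e_j e_\ell \bigr) \\
&= \sum_{i,\ell=1}^n -2R_{i\ell}\, e_i e_\ell \\
&= 2R.
\end{aligned}
\]
Substituting this into the expression for $D^2\psi$ above gives $D^2\psi = \nabla^*\nabla\psi + \frac{1}{4}R\psi$, completing the proof.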
From this formula, together with integration by parts, we immediately

conclude the following theorem.
Theorem 5.11 (Lichnerowicz [Lic63]). If (M, g) is a compact Riemannian

spin manifold with positive scalar curvature, then it has no harmonic spinors
(other than the zero spinor).
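The integration by parts mentioned above can be spelled out as follows. This is a sketch for a harmonic spinor ψ on compact M without boundary, using only Theorem 5.10 and the definition of the formal adjoint $\nabla^*$.

```latex
% D\psi = 0 implies D^2\psi = 0; pair with \psi and integrate over M.
\[
0 = \int_M \langle D^2\psi, \psi\rangle \, d\mu_M
  = \int_M \Bigl( \langle \nabla^*\nabla\psi, \psi\rangle
      + \frac{1}{4} R\,|\psi|^2 \Bigr)\, d\mu_M
  = \int_M \Bigl( |\nabla\psi|^2 + \frac{1}{4} R\,|\psi|^2 \Bigr)\, d\mu_M .
\]
% If R > 0 everywhere, both terms are nonnegative and the second vanishes
% only when \psi \equiv 0.
```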
It follows from the Atiyah-Singer index theorem [Wik, Atiyah-Singer

index theorem] that the nonexistence of harmonic spinors implies vanishing
of the Hirzebruch Â genus mentioned in Chapter 1. See [LM89, Theorem
8.11] for details.
5.3. Witten’s proof of the positive mass theorem

In this section we prove the following theorem of E. Witten [Wit81]. A
mathematically rigorous exposition of Witten’s proof first appeared in [PT82].
Theorem 5.12 (Positive mass theorem for spin manifolds). Let (M, g) be a
complete asymptotically flat spin manifold with nonnegative scalar curvature.
Then the ADM mass of each end is nonnegative. Moreover, if the mass of
any end is zero, then (M, g) is Euclidean space.
Since we would like to apply the Schrödinger-Lichnerowicz formula (The-

orem 5.10) to asymptotically flat spin manifolds, we prove the following
corollary of it which takes into account a boundary term.
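Corollary 5.13. Let Ω be a bounded open set with smooth boundary in a
complete Riemannian spin manifold M, and let ψ ∈ C∞(S(M)). Then
\[
\begin{aligned}
\int_\Omega \Bigl( |\nabla\psi|^2 - |D\psi|^2 + \frac{1}{4} R |\psi|^2 \Bigr)\, d\mu_M
&= \int_{\partial\Omega} \langle \psi, \nabla_\nu \psi + \nu \cdot D\psi \rangle \, d\mu_{\partial\Omega} \\
&= \int_{\partial\Omega} \sum_{i=1}^n \langle \psi, L_i \psi \rangle \nu^i \, d\mu_{\partial\Omega},
\end{aligned}
\]
where $L_i = (\delta_{ij} + e_i e_j) \cdot \nabla_j$.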
Proof. We will prove the second expression involving Li first, and that is
the version of the formula we will use later on. It is easy to see that it is
equal to the manifestly frame-independent expression above it. Adopting
Einstein summation notation, we compute
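\[
\begin{aligned}
-|D\psi|^2 &= -\langle e_i \cdot \nabla_i \psi, e_j \cdot \nabla_j \psi \rangle \\
&= \nabla_i \langle -e_i \cdot \psi, e_j \cdot \nabla_j \psi \rangle + \langle e_i \cdot \psi, \nabla_i (e_j \cdot \nabla_j \psi) \rangle \\
&= \nabla_i \langle \psi, e_i e_j \cdot \nabla_j \psi \rangle - \langle \psi, e_i \cdot \nabla_i (e_j \cdot \nabla_j) \psi \rangle \\
&= \nabla_i \langle \psi, e_i e_j \cdot \nabla_j \psi \rangle - \langle \psi, D^2 \psi \rangle.
\end{aligned}
\]
Meanwhile,
\[
\begin{aligned}
|\nabla\psi|^2 &= \nabla_i \langle \psi, \nabla_i \psi \rangle + \langle \psi, \nabla^* \nabla \psi \rangle \\
&= \nabla_i \langle \psi, \delta_{ij} \nabla_j \psi \rangle + \langle \psi, \nabla^* \nabla \psi \rangle.
\end{aligned}
\]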
Adding these two computations and combining them with the Schrödinger-
Lichnerowicz formula and the divergence theorem yields the desired result.
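Two small facts about the operators $L_i = (\delta_{ij} + e_i e_j)\cdot\nabla_j$ are used implicitly here and later: the two boundary expressions in Corollary 5.13 agree, and the diagonal part of $L_i$ drops out. Both are checked below, writing $\nu = \sum_i \nu^i e_i$ (a sketch, with summation written explicitly).

```latex
% (1) Equality of the two boundary integrands:
\[
\sum_{i=1}^n \nu^i L_i \psi
  = \sum_{i,j=1}^n \nu^i (\delta_{ij} + e_i e_j)\cdot\nabla_j\psi
  = \nabla_\nu \psi + \nu \cdot D\psi .
\]
% (2) The j = i term of L_i vanishes, since e_i e_i = -1 (no sum over i):
\[
\delta_{ii} + e_i e_i = 1 - 1 = 0
\qquad\Longrightarrow\qquad
L_i = \sum_{j \neq i} e_i e_j \cdot \nabla_j .
\]
```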

The idea behind Witten’s proof of the positive mass theorem is to find a
spinor that solves the Dirac equation while being asymptotically constant at
infinity, in which case the boundary term in Corollary 5.13 is proportional
to the mass, completing the theorem. First, let us see why that boundary
term gives us the mass.
Proposition 5.14. Let (M, g) be an asymptotically flat spin manifold, and
let e1 , . . . , en be an orthonormal frame in some end Mk . Let ψ0 be a constant
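spinor with respect to this frame. Then
\[
\lim_{\rho \to \infty} \int_{S_\rho} \sum_{i=1}^n \langle \psi_0, L_i \psi_0 \rangle \nu^i \, d\mu_{S_\rho} = \frac{1}{2} (n-1)\, \omega_{n-1}\, |\psi_0|^2\, m_{ADM}(M_k, g),
\]
where $S_\rho$ is a coordinate sphere in $M_k$.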
Proof. The orthonormal frame $e_1, \ldots, e_n$ can be obtained by orthonormalizing the coordinate frame $\partial_1, \ldots, \partial_n$. Let $q = \frac{n-2}{2}$. Recall from Definition 3.5
that asymptotic flatness means that $h_{ij} := g_{ij} - \delta_{ij}$ decays at a rate faster
than $\frac{n-2}{2}$. For convenience we choose to use the letter q for the constant $\frac{n-2}{2}$ rather than the actual assumed asymptotic decay rate. Therefore
$h_{ij} = o_2(|x|^{-q})$. (We say that a function f is $o_2(|x|^{-q})$ if for any ε > 0, we
have $|f| + |x| \cdot |Df| + |x|^2 \cdot |D^2 f| < \epsilon |x|^{-q}$ for sufficiently large |x|.) Direct
computation can be used to show that
\[
e_i = \partial_i - \frac{1}{2} h_{ij} \partial_j + o_1(|x|^{-q}),
\]
and thus
\[
\begin{aligned}
\omega^i_j(e_k) &= \langle \nabla_{e_k} e_j, e_i \rangle \\
&= \Bigl\langle \nabla_{\partial_k} \Bigl( \partial_j - \frac{1}{2} \sum_{\ell=1}^n h_{j\ell}\, \partial_\ell \Bigr), \partial_i \Bigr\rangle + o(|x|^{-2q-1}) \\
&= \Gamma^i_{jk} - \frac{1}{2} h_{ij,k} + o(|x|^{-2q-1}) \\
&= \frac{1}{2} (g_{ik,j} - g_{jk,i}) + o(|x|^{-2q-1}).
\end{aligned}
\tag{5.3}
\]
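The last equality in (5.3) is the usual Christoffel symbol formula with lower-order terms absorbed into the error. A short worked version follows, as a sketch assuming $g^{ij} = \delta_{ij} + o(|x|^{-q})$ and $\partial g = o(|x|^{-q-1})$, so that products of the two are $o(|x|^{-2q-1})$.

```latex
\[
\Gamma^i_{jk}
  = \frac{1}{2}\bigl(g_{ij,k} + g_{ik,j} - g_{jk,i}\bigr) + o(|x|^{-2q-1}),
\qquad
h_{ij,k} = g_{ij,k},
\]
\[
\Gamma^i_{jk} - \frac{1}{2} h_{ij,k}
  = \frac{1}{2}\bigl(g_{ik,j} - g_{jk,i}\bigr) + o(|x|^{-2q-1}) .
\]
% Swapping i and j changes the sign of g_{ik,j} - g_{jk,i}, which is the
% expected antisymmetry of the connection 1-forms in an orthonormal frame.
```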
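Note that this expression is antisymmetric in i and j, as it should be. Next,
compute
\[
\begin{aligned}
\sum_{i=1}^n \langle \psi_0, L_i \psi_0 \rangle \nu^i
&= \sum_{i \neq j} \langle \psi_0, e_i e_j \nabla_j \psi_0 \rangle \nu^i \\
&= \sum_{i \neq j} \Bigl\langle \psi_0, e_i e_j\, \frac{1}{4} \sum_{k \neq \ell} \omega^k_\ell(e_j)\, e_k e_\ell \cdot \psi_0 \Bigr\rangle \nu^i \\
&= \frac{1}{4} \sum_{\substack{i \neq j \\ k \neq \ell}} \omega^k_\ell(e_j) \langle \psi_0, e_i e_j e_k e_\ell \cdot \psi_0 \rangle \nu^i.
\end{aligned}
\tag{5.4}
\]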
Note that by (5.3), the sum of the expression $\omega^i_j(e_k)$ over cyclic permutations of i, j, k vanishes modulo $o(|x|^{-2q-1})$, that is, $\omega^i_j(e_k) + \omega^j_k(e_i) + \omega^k_i(e_j) = o(|x|^{-2q-1})$. For i, j, k distinct, $e_i e_j e_k$ is invariant under cyclic permutation,
and therefore by the same argument used in the proof of Theorem 5.10,
\[
\sum_{j,k,\ell \text{ distinct}} \omega^k_\ell(e_j)\, e_j e_k e_\ell = o(|x|^{-2q-1}).
\]
Therefore the only relevant terms of (5.4) are when j = k or j = ℓ. The
terms with j = k and i ≠ ℓ must vanish, because in that case $\langle \psi_0, e_i e_\ell \cdot \psi_0 \rangle = 0$ due to the skew-symmetric action of $e_i e_\ell$ on $\psi_0$. Similarly, the terms with
j = ℓ and i ≠ k also vanish.
That leaves us with only the terms with either j = k and i = ℓ, or j = ℓ
and i = k. Using (5.3) in the second equality below, we obtain
\[
\begin{aligned}
\sum_{i=1}^n \langle \psi_0, L_i \psi_0 \rangle \nu^i
&= \frac{1}{4} \sum_{i \neq j} \Bigl( \omega^i_j(e_j) \langle \psi_0, e_i e_j e_i e_j \cdot \psi_0 \rangle
 + \omega^j_i(e_j) \langle \psi_0, e_i e_j e_j e_i \cdot \psi_0 \rangle \Bigr) \nu^i + o(|x|^{-2q-1}) \\
&= \frac{1}{8} \sum_{i \neq j} \bigl( -(g_{ij,j} - g_{jj,i}) + (g_{jj,i} - g_{ij,j}) \bigr) |\psi_0|^2 \nu^i + o(|x|^{-2q-1}) \\
&= \frac{1}{4} \sum_{i,j} (g_{jj,i} - g_{ij,j})\, |\psi_0|^2 \nu^i + o(|x|^{-2q-1}).
\end{aligned}
\]
Since the integral of the $o(|x|^{-2q-1})$ term vanishes in the limit, the result follows.

Corollary 5.15. Assume the same hypotheses as in Proposition 5.14, and
suppose that ψ ∈ C∞(S(M)) is such that $\psi - \psi_0 \in W^{1,2}_{-q}(S(M))$, where $q = \frac{n-2}{2}$. Then
\[
\lim_{\rho \to \infty} \int_{S_\rho} \sum_{i=1}^n \langle \psi, L_i \psi \rangle \nu^i \, d\mu_{S_\rho} = \frac{1}{2} (n-1)\, \omega_{n-1}\, |\psi_0|^2\, m_{ADM}(M_k, g),
\]
where $S_\rho$ is a coordinate sphere in $M_k$.
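Proof. Let ξ = ψ − ψ0. We can break down
\[
\langle \psi, L_i \psi \rangle = \langle \psi_0, L_i \psi_0 \rangle + \langle \psi_0, L_i \xi \rangle + \langle \xi, L_i \psi_0 \rangle + \langle \xi, L_i \xi \rangle. \tag{5.5}
\]
The first term will give us what we want by Proposition 5.14. We want to
show that the other terms do not contribute. It is straightforward to see that
the integrals of the last two terms will not contribute in the limit because
of decay of ξ and ∇ψ0. We will show that the integral of the $\langle \psi_0, L_i \xi \rangle$ term
is the same as the integral of the $\langle \xi, L_i \psi_0 \rangle$ term, and therefore it must also
vanish in the limit. This essentially follows from integration by parts. Define
α to be the (n − 2)-form
\[
\alpha = \sum_{i \neq j} \langle \psi_0, e_i e_j \xi \rangle \, e_i \lrcorner\, e_j \lrcorner\, d\mathrm{vol}_M,
\]
where $d\mathrm{vol}_M$ is the Riemannian volume form on M. Using the antisymmetry
of $e_i e_j$ when i ≠ j, we obtain
\[
\begin{aligned}
d\alpha &= 2 \sum_{i \neq j} \bigl( -\nabla_j \langle \psi_0, e_i e_j \xi \rangle \bigr)\, e_i \lrcorner\, d\mathrm{vol}_M \\
&= 2 \sum_{i \neq j} \bigl( \langle e_i e_j \nabla_j \psi_0, \xi \rangle - \langle \psi_0, e_i e_j \nabla_j \xi \rangle \bigr)\, e_i \lrcorner\, d\mathrm{vol}_M \\
&= 2 \sum_{i} \bigl( \langle L_i \psi_0, \xi \rangle - \langle \psi_0, L_i \xi \rangle \bigr)\, e_i \lrcorner\, d\mathrm{vol}_M.
\end{aligned}
\]
Since dα is an exact (n − 1)-form and $S_\rho$ is a closed hypersurface, Stokes' theorem gives $\int_{S_\rho} d\alpha = 0$. Since integrating against
$e_i \lrcorner\, d\mathrm{vol}_M$ over $S_\rho$ is the same as integrating against $\nu^i \, d\mu_{S_\rho}$, we see that the
second and third terms on the right side of (5.5) make identical contributions
to the integral, as claimed.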
Using the previous corollary, the positive mass theorem will follow from
being able to solve the following Dirac equation with prescribed asymptotics.
Proposition 5.16. Let $(M^n, g)$ be a complete asymptotically flat spin manifold with nonnegative scalar curvature, and let $q = \frac{n-2}{2}$. The operator
\[
D : W^{1,2}_{-q}(S(M)) \longrightarrow L^2_{-q-1}(S(M))
\]
is an isomorphism. Note that $L^2_{-q-1} = L^2$ with this choice of q.
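The remark that $L^2_{-q-1} = L^2$ is a one-line exponent count. The sketch below assumes the common convention for weighted Lebesgue norms written in its first line; this convention is an assumption made here for illustration and should be compared with the definition the book actually uses in its appendix.

```latex
% Assumed convention (hypothetical here): weight |x|^{-sp-n} in the L^p_s norm.
\[
\|f\|_{L^p_s} := \Bigl( \int_M |f|^p\, |x|^{-sp-n}\, d\mu \Bigr)^{1/p} .
\]
% With p = 2 and s = -q-1 = -n/2 the weight exponent vanishes:
\[
-sp - n = 2\Bigl(\frac{n-2}{2} + 1\Bigr) - n = n - n = 0,
\qquad\text{so}\qquad
\|f\|_{L^2_{-q-1}} = \|f\|_{L^2} .
\]
```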
Proof. It is straightforward to check that D is a well-defined bounded linear
operator. Next, we will prove an injectivity estimate. By Corollaries 5.13
and 5.15 with ψ0 = 0, we see that for any $\varphi \in W^{1,2}_{-q}(S(M))$,
\[
\int_M \Bigl( |\nabla\varphi|^2 - |D\varphi|^2 + \frac{1}{4} R |\varphi|^2 \Bigr)\, d\mu_M = 0.
\]
By the nonnegative scalar curvature assumption, we have
\[
\|\nabla\varphi\|_{L^2} \le \|D\varphi\|_{L^2}.
\]
This is the same as writing
\[
\|\nabla\varphi\|_{L^2_{-q-1}} \le \|D\varphi\|_{L^2_{-q-1}}.
\]
Next we invoke the weighted Poincaré inequality (Theorem A.28), which
states that there is a constant C independent of ϕ such that
\[
\|\varphi\|_{L^2_{-q}} \le C \|\nabla|\varphi|\|_{L^2_{-q-1}} \le C \|\nabla\varphi\|_{L^2_{-q-1}}.
\]
Combined with the above, we obtain the injectivity estimate
\[
\|\varphi\|_{W^{1,2}_{-q}} \le (C + 1) \|D\varphi\|_{L^2}.
\]
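The constant $C + 1$ in the injectivity estimate comes from adding the two preceding inequalities. Written out, as a sketch assuming the weighted Sobolev norm is (or is bounded by) the sum of the weighted norms of ϕ and ∇ϕ:

```latex
\[
\|\varphi\|_{W^{1,2}_{-q}}
  \le \|\varphi\|_{L^2_{-q}} + \|\nabla\varphi\|_{L^2_{-q-1}}
  \le C\,\|\nabla\varphi\|_{L^2_{-q-1}} + \|\nabla\varphi\|_{L^2_{-q-1}}
  \le (C+1)\,\|D\varphi\|_{L^2} .
\]
% In particular D\varphi = 0 forces \varphi = 0, which is the injectivity of D.
```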
We now only have to prove surjectivity. That is, given any η ∈ L²(S(M)),
we need to find a spinor $\xi \in W^{1,2}_{-q}(S(M))$ solving Dξ = η. We first consider
the case where η is compactly supported. Our estimates above show that the
pairing $\langle \omega, \varphi \rangle_H := \langle D\omega, D\varphi \rangle_{L^2}$ is equivalent to the $W^{1,2}_{-q}$ Hilbert product of ω
and ϕ. Observe that the map $\varphi \mapsto \langle \eta, \varphi \rangle_{L^2}$ is a well-defined bounded linear
functional on $W^{1,2}_{-q}(S(M))$. (Check this.) Applying the Riesz representation
theorem [Wik, Riesz representation theorem] to this functional and using
the equivalence between the H product and the $W^{1,2}_{-q}$ product, it follows
that there must exist some $\omega \in W^{1,2}_{-q}(S(M))$ with the property that
\[
\langle D\omega, D\varphi \rangle_{L^2} = \langle \eta, \varphi \rangle_{L^2}
\]
for every $\varphi \in W^{1,2}_{-q}(S(M))$. We claim that ξ = Dω is the desired solution.
We know that ξ ∈ L²(S(M)). To prove better regularity, let $\xi_j$ be a sequence
of $W^{1,2}_{-q}$ spinors converging to ξ in L². For any test function $\varphi \in W^{1,2}_{-q}$, we
obtain
\[
\lim_{j \to \infty} \langle D\xi_j, \varphi \rangle_{L^2} = \lim_{j \to \infty} \langle \xi_j, D\varphi \rangle_{L^2} = \langle \xi, D\varphi \rangle_{L^2} = \langle \eta, \varphi \rangle_{L^2},
\]
by construction of ξ. Therefore $D\xi_j$ converges to η in the weak L² topology.
In particular, $\|D\xi_j\|_{L^2}$ is bounded independently of j. The injectivity estimate then implies that $\|\xi_j\|_{W^{1,2}_{-q}}$ is bounded. Therefore $\xi_j$ must converge to
ξ weakly in $W^{1,2}_{-q}$, and we finish the argument by observing that
\[
\langle D\xi, \varphi \rangle_{L^2} = \langle \xi, D\varphi \rangle_{L^2} = \langle \eta, \varphi \rangle_{L^2}
\]
for any compactly supported spinor $\varphi \in W^{1,2}_{-q}$, so it must be the case that
Dξ = η everywhere. Finally, for the general case of η ∈ L²(S(M)), we can
simply use a density argument: approximate η in L² by compactly supported
spinors. Their preimages under D must converge to some $\xi \in W^{1,2}_{-q}$ because
of the injectivity estimate for D, and then it follows that Dξ = η.

Proof of the positive mass theorem (Theorem 5.12). Let (M n , g) be

a complete asymptotically flat spin manifold with nonnegative scalar curva-
ture. Select an end Mk and an orthonormal frame e1 , . . . , en for that end.
Choose ψ0 ∈ C ∞ (S(M )) such that ψ0 is constant with respect to e1 , . . . , en
and |ψ0 | = 1 in Mk , while ψ0 vanishes in all other ends.
Let η = −Dψ0. Check that η ∈ L²(S(M)). By Proposition 5.16, there
exists $\xi \in W^{1,2}_{-q}$ such that Dξ = η, where $q = \frac{n-2}{2}$. Define ψ := ψ0 + ξ.
Combining Corollaries 5.13 and 5.15, we have
\[
\int_M \Bigl( |\nabla\psi|^2 - |D\psi|^2 + \frac{1}{4} R |\psi|^2 \Bigr)\, d\mu_M = \frac{1}{2} (n-1)\, \omega_{n-1}\, m_{ADM}(M_k, g),
\]
where $m_{ADM}(M_k, g)$ is the mass of the selected end (since the contributions
from the other ends will be zero). Noting that
\[
D\psi = D\psi_0 + D\xi = -\eta + \eta = 0,
\]
it follows that
\[
m_{ADM}(M_k, g) = \frac{2}{(n-1)\, \omega_{n-1}} \int_M \Bigl( |\nabla\psi|^2 + \frac{1}{4} R |\psi|^2 \Bigr)\, d\mu_M, \tag{5.6}
\]
which is manifestly nonnegative if R ≥ 0.
In the spin case, we get a simple proof of rigidity of the positive mass
theorem. We now suppose that mADM (M, g) = 0. Then the above equation
implies that ψ is parallel everywhere, that is, ∇ψ = 0. Note that any choice
of constant spinor in the end Mk leads to the construction of a parallel spinor
that is asymptotic to it. In particular, for i = 1, . . . , n, we can construct a
spinor ψi asymptotic to ei · ψ0 in Mk such that ∇ψi = 0. Define Vi to be
the vector field with the property that
\[
\langle V_i, w \rangle = \langle w \cdot \psi, \psi_i \rangle
\]
for any w ∈ Tp M at any point p ∈ M , where ψ is the original parallel spinor
we constructed that is asymptotic to ψ0 .
Exercise 5.17. Show that ∇Vi = 0 everywhere and Vi is asymptotic to ei
at infinity.
By the exercise, V1 , . . . , Vn is a global basis of parallel vector fields,

which implies that (M, g) is flat, and hence (M, g) must be Euclidean space
by Exercise 2.33.
5.4. Related results

5.4.1. A spinor proof of Theorem 4.17.
Theorem 5.18 (Shi-Tam [ST02]). Let (Mout , gout ) be a complete asymp-
totically flat manifold with boundary, and let (Min , gin ) be either a compact
Riemannian manifold with boundary or a complete asymptotically flat mani-
fold with boundary. In either case, assume that ∂Mout is isometric to ∂Min ,
and let (M, g) be the result of gluing (Mout , gout ) and (Min , gin ) along this
common boundary Σ ⊂ M , and assume that (M, g) is spin.
Assume that g has nonnegative scalar curvature away from Σ, and fur-
ther assume that Hout ≤ Hin along Σ, where Hout (respectively, Hin ) is the
mean curvature of Σ as computed by gout (respectively, gin ). Here we use the
normal ν pointing toward Mout . Then the ADM mass of each end of (M, g)
is nonnegative.
Furthermore, if the mass of any end is zero, then Hout = Hin along Σ,
and moreover (M, g) is Euclidean space. Or more precisely, there exists a
$C^{1,\alpha}$ diffeomorphism $M \to \mathbb{R}^n$ such that $g_{ij}(x) = \delta_{ij}$ in this coordinate
chart.
Proof. Assume there is only one end (since the proof is really no different in
the general case). We follow the proof of Theorem 5.12 given in the previous
section. Note that g being Lipschitz is enough regularity that we can still
construct a spinor ψ, with $\psi - \psi_0 \in W^{1,2}_{-q}$ for a constant spinor ψ0, that solves
the Dirac equation Dψ = 0, where $q = \frac{n-2}{2}$. (Recall that ψ0 is chosen to have
|ψ0| = 1 at the infinity of the end we care about, and is zero at the other
infinities.) The main difference here is that the Schrödinger-Lichnerowicz
formula (Theorem 5.10),
\[
D^2 \psi = \nabla^* \nabla \psi + \frac{1}{4} R \psi,
\]
is no longer valid at the singular set Σ. (For one thing, R is not defined
there.) However, away from Σ, everything is smooth, so that this formula is
still valid, and so is its integrated version with boundary (Corollary 5.13).
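In the following we use a hat to denote quantities computed using $g_{in}$
and no hat to denote quantities computed using $g_{out}$:
\[
\begin{aligned}
0 &\le \int_M \Bigl( |\nabla\psi|^2 - |D\psi|^2 + \frac{1}{4} R |\psi|^2 \Bigr)\, d\mu_M \\
&= \int_{M_{out}} \Bigl( |\nabla\psi|^2 - |D\psi|^2 + \frac{1}{4} R |\psi|^2 \Bigr)\, d\mu_M
 + \int_{M_{in}} \Bigl( |\hat\nabla\psi|^2 - |\hat D\psi|^2 + \frac{1}{4} \hat R |\psi|^2 \Bigr)\, d\mu_M \\
&= \frac{1}{2} (n-1)\, \omega_{n-1}\, m_{ADM}(g) - \int_{\partial M_{out}} \langle \psi, (\nabla_\nu + \nu \cdot D) \psi \rangle \, d\mu_{\partial M_{out}}
 + \int_{\partial M_{in}} \langle \psi, (\hat\nabla_\nu + \nu \cdot \hat D) \psi \rangle \, d\mu_{\partial M_{in}} \\
&= \frac{1}{2} (n-1)\, \omega_{n-1}\, m_{ADM}(g) + \int_\Sigma \langle \psi, (\hat\nabla_\nu - \nabla_\nu) \psi + \nu \cdot (\hat D - D) \psi \rangle \, d\mu_\Sigma,
\end{aligned}
\tag{5.7}
\]
where we used Corollary 5.13 on $M_{out}$ and $M_{in}$ separately, and then Corollary 5.15 to identify the boundary term at infinity with the mass, and in all
of the integrals, ν is the unit normal of Σ pointing toward $M_{out}$, which is
the same for both $g_{out}$ and $g_{in}$.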
In order to compute the integrand of the Σ integral above locally, we
choose a local frame near a point of Σ which is adapted to Σ. That is,
we choose e1 , . . . , en−1 to be a local frame for Σ, choose en = ν, and then
extend the whole frame e1 , . . . , en away from Σ by demanding ∇ν ei = 0. Let
ω be the connection 1-form $\omega^i_j(e_k) = \langle \nabla_k e_j, e_i \rangle$ with respect to this frame,
computed using $g_{out}$, and we define $\hat\omega$ similarly, except using $g_{in}$. By our
choice of frame, along Σ we have, for i, j, k = 1, . . . , n − 1,
\[
\begin{aligned}
\omega^i_j(e_n) &= \hat\omega^i_j(e_n) = 0, \\
\omega^i_j(e_k) &= \hat\omega^i_j(e_k), \\
\omega^i_n(e_k) &= -\omega^n_i(e_k) = A(e_k, e_i), \\
\hat\omega^i_n(e_k) &= -\hat\omega^n_i(e_k) = \hat A(e_k, e_i),
\end{aligned}
\]
where the second line is just the induced connection 1-form on Σ, which
is the same for both $g_{out}$ and $g_{in}$. Choose a basis of spinors $\sigma_A$ which is
constant with respect to this frame, and write $\psi = \sum_A \psi^A \sigma_A$ in that basis.
Since it is easy to see that $\hat\nabla_\nu \psi = \nabla_\nu \psi$ along Σ, we need only compute the
$\hat D - D$ term in (5.7). Using formula (5.2) for $\nabla\sigma_A$ and the equations for ω
above, we compute

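\[
\begin{aligned}
\nu \cdot (\hat D - D) \psi
&= \sum_A \psi^A\, \nu \cdot (\hat D - D) \sigma_A \\
&= \sum_A \psi^A \sum_{j=1}^n \nu e_j \cdot (\hat\nabla_j - \nabla_j) \sigma_A \\
&= \frac{1}{4} \sum_A \psi^A \sum_{j=1}^{n-1} \sum_{k,\ell=1}^n \nu e_j\, [\hat\omega^k_\ell(e_j) - \omega^k_\ell(e_j)]\, e_k e_\ell \cdot \sigma_A \\
&= \frac{1}{4} \sum_A \psi^A \sum_{j=1}^{n-1} \Bigl( \sum_{\ell=1}^{n-1} [\hat\omega^n_\ell(e_j) - \omega^n_\ell(e_j)]\, \nu e_j \nu e_\ell
 + \sum_{k=1}^{n-1} [\hat\omega^k_n(e_j) - \omega^k_n(e_j)]\, \nu e_j e_k \nu \Bigr) \cdot \sigma_A \\
&= \frac{1}{4} \sum_A \psi^A \sum_{j,k=1}^{n-1} [2\hat\omega^n_k(e_j) - 2\omega^n_k(e_j)]\, e_j e_k \cdot \sigma_A \\
&= \frac{1}{2} \sum_A \psi^A \sum_{j,k=1}^{n-1} [-\hat A(e_j, e_k) + A(e_j, e_k)]\, e_j e_k \cdot \sigma_A \\
&= \frac{1}{2} \sum_A \psi^A \sum_{j=1}^{n-1} [\hat A(e_j, e_j) - A(e_j, e_j)] \cdot \sigma_A \\
&= \frac{1}{2} (H_{in} - H_{out})\, \psi.
\end{aligned}
\]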
Feeding this into (5.7), we obtain
\[
0 \le \frac{1}{2} (n-1)\, \omega_{n-1}\, m_{ADM}(g) + \int_\Sigma \frac{1}{2} (H_{in} - H_{out})\, |\psi|^2 \, d\mu_\Sigma,
\]
so the nonnegativity of mass now follows from our assumption that Hout ≤
Hin . If we have mADM (M, g) = 0, then just as in the rigidity proof in the
smooth spin case, we obtain parallel spinors that can be used to construct
parallel vector fields, and since these spinors are nonvanishing, the above
inequality tells us that Hout = Hin .
5.4.2. A spinor proof of Theorem 4.67. We can now prove the stability-
type result for the positive mass theorem that we used in the Bray flow proof
of the Penrose inequality in Section 4.3 [Bra01, Corollary 8].
Theorem 5.19 (Bray). Given n ≥ 3, α > 1, and ε > 0, there exists δ > 0
with the following property.
Let $(M^n, g)$ be a complete asymptotically flat spin manifold of nonnegative scalar curvature, with coordinates in some end satisfying
\[
g_{ij}(x) = W(x)^{\frac{4}{n-2}}\, \delta_{ij}
\]
for |x| > r, for some positive harmonic function W on $\mathbb{R}^n \setminus \bar B_r(0)$ approaching 1 at infinity.
If $m_{ADM}(g) < \delta r^{n-2}$, then for all |x| ≥ αr,
\[
|W(x) - 1| < \epsilon \Bigl( \frac{r}{|x|} \Bigr)^{n-2}.
\]
Proof. In the following we will use $B_r$ as an abbreviation for $B_r(0)$. Let
n ≥ 3, α > 1, and ε > 0. By rescaling, we may assume without loss of
generality that r = 1. Define A to be the set of all triples $(M^n, g, W)$ such
that:
• (M, g) is a complete asymptotically flat spin manifold of nonnegative scalar curvature.
• W is a positive harmonic function on $\mathbb{R}^n \setminus \bar B_1$ with $\lim_{x \to \infty} W(x) = 1$.
• In a distinguished end of M, $g_{ij}(x) = W(x)^{\frac{4}{n-2}} \delta_{ij}$ in $\mathbb{R}^n \setminus \bar B_1$.
• In that distinguished end, $m_{ADM}(g) \le 1$.
We can now rephrase our goal as follows. We want to show that there exists
δ > 0 such that for any (M, g, W) ∈ A with $m_{ADM}(g) < \delta$, |W(x) − 1| < ε
for |x| ≥ α. The decay $|W(x) - 1| < \epsilon |x|^{2-n}$ then follows from the fact that
W is harmonic.
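The passage from a sup bound to the stated decay is a comparison argument. The following is a sketch under the stated normalization r = 1; the auxiliary constant $\epsilon'$ is notation introduced here for illustration and is not in the original.

```latex
% u := W - 1 is harmonic on |x| > 1 and tends to 0 at infinity.
% Suppose |u| \le \epsilon' on the sphere |x| = \alpha. The function
% \epsilon' (\alpha/|x|)^{n-2} is harmonic, equals \epsilon' on |x| = \alpha,
% and tends to 0 at infinity, so the maximum principle applied to
% \pm u - \epsilon' (\alpha/|x|)^{n-2} on large annuli gives
\[
|W(x) - 1| \le \epsilon' \Bigl(\frac{\alpha}{|x|}\Bigr)^{n-2}
\qquad \text{for all } |x| \ge \alpha .
\]
% Running the sup-norm argument with \epsilon' = \epsilon\,\alpha^{2-n}
% (in place of \epsilon) therefore yields
\[
|W(x) - 1| \le \epsilon\, |x|^{2-n}
\qquad \text{for all } |x| \ge \alpha .
\]
```

Since α is fixed in advance, replacing ε by $\epsilon\,\alpha^{2-n}$ only changes the resulting δ, which is why the text can treat the two statements as equivalent.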
Define H to be the space of all positive harmonic functions W on $\mathbb{R}^n \setminus \bar B_\alpha$
such that $\lim_{x \to \infty} W(x) = 1$, where the topology on H is given by the $C^0$
(or sup) norm. Since the functions in H are harmonic, it follows that any
sequence in H that converges in $C^0$ actually converges smoothly in the region
$\mathbb{R}^n \setminus \bar B_{2\alpha}$.
Note that any choice of orthonormal frame $e_1, \ldots, e_n$ determines an identification between spinor representation space S and the space of spinors that
are constant with respect to $e_1, \ldots, e_n$. For any spinor σ which is constant
with respect to that frame, define the functional $F_\sigma : H \to \mathbb{R}$ by
\[
F_\sigma(W) = \inf \Bigl\{ \int_{\mathbb{R}^n \setminus B_{2\alpha}} |\nabla\psi|^2 \, d\mu_g \;\Bigm|\; \psi \in C^\infty(S(M)) \text{ such that } \lim_{x \to \infty} \psi(x) = \sigma_W \Bigr\},
\]
where the metric on $\mathbb{R}^n \setminus B_{2\alpha}$ is defined to be $g_{ij}(x) = W(x)^{\frac{4}{n-2}} \delta_{ij}$ and
$\sigma_W$ is the constant spinor corresponding to σ via the orthonormal frame
$e_i = W^{\frac{-2}{n-2}} \partial_i$. The proof will follow from a series of claims.
Claim. For each σ, the functional Fσ : H −→ R is continuous.
By standard elliptic theory, for each fixed W ∈ H, we know that the
energy functional $\int_{\mathbb{R}^n \setminus B_{2\alpha}} |\nabla\psi|^2 \, d\mu_g$ is minimized by the unique Neumann
solution ψ satisfying $\nabla^* \nabla \psi = 0$ and $\nabla_\nu \psi = 0$ at $\partial B_{2\alpha}$, in addition to
$\lim_{x \to \infty} \psi(x) = \sigma_W$. If we were to write out these equations in local coor-
dinates, we would see that all of the coefficients of this elliptic system can
be written explicitly in terms of W and its derivatives. Consequently, the
minimizer depends continuously on W ∈ H. This implies the claim above.
Claim. The restriction map R : A −→ H has relatively compact image
in H.
If (M, g, W) ∈ A, then W is harmonic on $\mathbb{R}^n \setminus \bar B_1$ and by Corollary A.19
and Exercise 3.13,
\[
W(x) = 1 + \frac{m}{2} |x|^{2-n} + O(|x|^{1-n}),
\]
where $m = m_{ADM}(M, g)$. Corollary A.19 also implies that the average value
of W over the sphere of radius ρ is precisely $1 + \frac{m}{2} \rho^{2-n}$. Together with the
Harnack inequality, this can be used to show that W can be bounded purely
in terms of m and α in the region $\mathbb{R}^n \setminus \bar B_\alpha$. (See [Lee09] for a proof.) Since
the definition of A includes the requirement that |m| ≤ 1, the second claim
follows.
Let $\sigma_0$ be a constant spinor of unit length and define $\sigma_i := e_i \cdot \sigma_0$. Let
$F := \sum_{i=0}^n F_{\sigma_i}$.
Claim. If W ∈ H and F (W ) = 0, then W must be the constant function 1.
If F(W) = 0, then $F_{\sigma_i}(W) = 0$ for each i, and the minimizing spinor
realizing $F_{\sigma_i}(W)$ must be a parallel spinor asymptotic to $\sigma_i$. The argument
we gave in our proof of rigidity in Theorem 5.12 then implies that g is flat.
Therefore W must be 1 on $\mathbb{R}^n \setminus B_{2\alpha}$, and since it is harmonic, it is 1 on all
of $\mathbb{R}^n \setminus \bar B_\alpha$, proving the third claim.
Putting the first two claims together, F is a nonnegative continuous
functional on the compact closure of R(A) in H. The third claim says that
this functional can only be zero if W is identically equal to 1. It follows
that for any ε > 0, there exists some δ′ > 0 such that if W ∈ R(A) and
F(W) < δ′, then |W − 1| < ε on $\mathbb{R}^n \setminus \bar B_\alpha$.
We are finally ready to invoke Witten’s argument to complete the proof.
If (M, g, W) ∈ A, then as argued in the previous section, for each
i = 0, 1, . . . , n, there exists some $\psi_i$ asymptotic to $\sigma_i$ solving the Dirac equation. By equation (5.6), for this spinor we have
\[
\frac{1}{2} (n-1)\, \omega_{n-1}\, m_{ADM}(g) = \int_M \Bigl( |\nabla\psi_i|^2 + \frac{1}{4} R |\psi_i|^2 \Bigr)\, d\mu_g \ge F_{\sigma_i}(W),
\]
and consequently
\[
\frac{1}{2} (n+1)(n-1)\, \omega_{n-1}\, m_{ADM}(g) \ge F(W).
\]
Therefore the result is proved, with $\delta = \frac{2\delta'}{(n+1)(n-1)\,\omega_{n-1}}$, where δ′ is the threshold for F(W) obtained from the compactness argument above.
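The choice of δ is just the bookkeeping of the $n + 1$ spinors $\sigma_0, \ldots, \sigma_n$. As a sketch, writing δ′ for the threshold on F(W) produced by the compactness argument:

```latex
% Sum the n+1 inequalities F_{\sigma_i}(W) \le (1/2)(n-1)\omega_{n-1} m_ADM(g).
\[
F(W) = \sum_{i=0}^{n} F_{\sigma_i}(W)
  \le \frac{1}{2}(n+1)(n-1)\,\omega_{n-1}\, m_{ADM}(g) .
\]
% Hence the hypothesis on the mass forces F(W) below the threshold:
\[
m_{ADM}(g) < \frac{2\delta'}{(n+1)(n-1)\,\omega_{n-1}}
\quad\Longrightarrow\quad
F(W) < \delta' .
\]
```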
Chapter 6
Quasi-local mass
6.1. Bartnik mass and static metrics

As briefly mentioned earlier, a quasi-local mass is some way of measuring “how much mass” a given region of space contains. So far we have met
the Hawking mass, which seems to provide an underestimate of “how much
mass” is contained within a surface and was useful when applied to evolu-
tion by inverse mean curvature. There are other notions of quasi-local mass,
with each one useful in different contexts.
One natural concept of quasi-local mass is due to R. Bartnik [Bar89].
Let (Ω, g) be a compact Riemannian manifold with nonempty boundary and
nonnegative scalar curvature. We consider all possible ways of extending Ω
to a complete asymptotically flat manifold with nonnegative scalar curva-
ture. We now take the infimum of the ADM masses of all of these extensions.
The positive mass theorem implies that this infimum is nonnegative. This
might seem like a good definition of a quasi-local mass, but in fact this in-
fimum is likely to be zero, because if Ω is enclosed by an apparent horizon,
then it effectively has no influence on the geometry near infinity. Recall that
this phenomenon was touched upon in Exercise 4.14. Because of this, it is
natural to only consider extensions that do not enclose Ω within a minimal
hypersurface. Moreover, in light of Theorem 4.17, it is natural to consider
extensions that are not smooth across ∂Ω; instead, we only ask that the
metric is Lipschitz at ∂Ω and the mean curvature of ∂Ω as measured from
the outside is less than or equal to the mean curvature as measured from
the inside. (Also recall that the Hawking mass of a surface Σ in (M, g) only
depends on Σ, g|Σ , and HΣ .) We call an extension with these properties
admissible and observe that the space of admissible extensions does not ac-
tually depend on all of (Ω, g), but only on the induced metric and mean
curvature on its boundary. We can summarize the above discussion with
the following definition.
Definition 6.1. Let n ≥ 3, and let (Σn−1 , γ) be a compact Riemannian

manifold equipped with a nonnegative function η. We refer to the triple
(Σ, γ, η) as Bartnik data. We say that (M n , g) is an admissible extension of
(Σ, γ, η) if the following hold:
(1) (M, g) is a complete, asymptotically flat manifold with boundary
∂M identified with Σ.
(2) Rg ≥ 0 in the interior of M .
(3) The metric on Σ = ∂M induced by g is equal to γ.
(4) Let Hg denote the mean curvature of Σ = ∂M in (M, g), computed
with respect to the “outward” normal ν, which points into M . The
mean curvature satisfies Hg ≤ η.
(5) Σ = ∂M is not enclosed by an apparent horizon (except in the case
η ≡ 0, in which case we allow ∂M itself to be a horizon).
Let P(Σ, γ, η) denote the space of all admissible extensions of (Σ, γ, η). We
define the Bartnik mass of (Σ, γ, η) to be
\[
m_B(\Sigma, \gamma, \eta) := \inf_{(M,g) \in \mathcal{P}(\Sigma, \gamma, \eta)} m_{ADM}(M, g).
\]
We call any element of P(Σ, γ, η) achieving this infimum a Bartnik minimizer. A priori, P(Σ, γ, η) could be empty, in which case we take $m_B(\Sigma, \gamma, \eta)$
to be infinity.
Remark 6.2. The case n = 3 is the case of greatest interest, but we will
present results in general dimension when possible. When n = 3, one typi-
cally assumes that Σ is a sphere.
Remark 6.3. Unfortunately, there are many different versions of Bartnik
mass in the literature, and each variant has slightly different
properties. See the paper of Jeffrey Jauregui [Jau19] for a nice explanation
of the different definitions and their relative advantages and disadvantages.
For example, another version of condition (4) asks for Hg = η instead of
Hg ≤ η. With our choice, we have the simple monotonicity statement that
if η1 ≤ η2 , then mB (Σ, γ, η1 ) ≥ mB (Σ, γ, η2 ).
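The monotonicity statement is an inclusion of admissible classes. The sketch below sets aside the exceptional case η ≡ 0 in condition (5) of Definition 6.1, which would need to be checked separately.

```latex
% Conditions (1)-(3) and (5) do not involve \eta, and H_g \le \eta_1 \le \eta_2.
\[
\eta_1 \le \eta_2
\quad\Longrightarrow\quad
\mathcal{P}(\Sigma, \gamma, \eta_1) \subset \mathcal{P}(\Sigma, \gamma, \eta_2) .
\]
% An infimum over a larger set can only be smaller:
\[
m_B(\Sigma, \gamma, \eta_2)
  = \inf_{\mathcal{P}(\Sigma,\gamma,\eta_2)} m_{ADM}
  \;\le\; \inf_{\mathcal{P}(\Sigma,\gamma,\eta_1)} m_{ADM}
  = m_B(\Sigma, \gamma, \eta_1) .
\]
```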
Perhaps the most significant choice made in the definition of admissibil-

ity is our condition (5). One reason why our choice of condition (5) is useful
is the lemma below.
Lemma 6.4. Let n < 8, and let (M n , g) be an extension of Bartnik data with
η ≡ 0 such that (M, g) satisfies condition (5) of Definition 6.1. Then any
other extension which is a sufficiently small smooth perturbation of (M, g)
also satisfies condition (5).
Sketch of the proof. Assume the contrary. That is, suppose that (M, g)
is an extension satisfying condition (5), but there is a sequence of extension
metrics gi converging to g, each of which violates condition (5). So in each
(M, gi ), we have an apparent horizon Σi enclosing ∂M . There exists a
radius r, independent of i, such that all coordinate spheres of radius larger
than r will be mean convex in (M, gi ). By the strong comparison principle
(Corollary 4.2), the coordinate sphere Sr must enclose Σi . The outward-
minimizing property of apparent horizons (Theorem 4.7) then implies that
|Σi |gi ≤ |Sr |gi . Therefore the volume of Σi is bounded independently of i.
Since the Σi are stable minimal hypersurfaces with a uniform volume
bound, we can apply Schoen-Simon estimates [SS81] as in the proof of The-
orem 4.7 in order to extract a limit minimal hypersurface in (M, g) enclos-
ing ∂M . Then by Theorem 4.7, there is an apparent horizon in (M, g) enclos-
ing ∂M , contradicting the assumption that (M, g) satisfies condition (5).
The positive mass theorem (or more precisely, Theorem 4.17) easily gives
us the following nonnegativity property of the Bartnik mass.
Proposition 6.5. Let n ≥ 3, and let $(\Omega^n, g)$ be a compact Riemannian
manifold with nonempty boundary and nonnegative scalar curvature. Suppose
that Σ = ∂Ω is connected and has $H_\Sigma \ge 0$ (with respect to the normal
pointing out of Ω). Then $m_B(\Sigma, g|_\Sigma, H_\Sigma) \ge 0$.
More generally, the same result holds if Σ is just one component of ∂Ω,
as long as the other components are minimal.
Along with proposing this quasi-local mass concept, Bartnik also con-
jectured (for n = 3) that if η > 0, then there always exists an admissible
extension of (Σ, γ, η) realizing the Bartnik mass [Bar89]. A recent discovery
of Michael Anderson and Jeffrey Jauregui showed that the conjecture fails
at the level of generality originally envisaged by Bartnik [Jau13]. Because
of this, we will state a fairly narrow version of Bartnik’s conjecture.
Conjecture 6.6 (Bartnik minimal mass extension problem). If $(\Sigma^2, \gamma, \eta)$
is Bartnik data with η > 0 such that γ has positive Gauss curvature, then
there exists a Bartnik minimizer.
Positive Gauss curvature is probably not the ideal assumption for this
conjecture, but it is a weak enough assumption that the conjecture is highly
nontrivial, while it is narrow enough to guarantee existence of admissible
extensions (see the next section) and also to rule out the Anderson-Jauregui
counterexamples. Those counterexamples come from taking a flat closed
topological 3-ball (B, g) that can be immersed in Euclidean 3-space in such
a way that its interior is embedded, but there is a double point at the
boundary. Since (B, g) is very close to an embedded flat closed ball, they
are able to show (with some work) that the Bartnik mass (of the boundary
data) must be zero. If a Bartnik minimizer did exist, it would be a mass
zero admissible extension of (B, g) and therefore Euclidean. But they obtain
a contradiction by showing that this is impossible for such (B, g).
The power of Bartnik’s conjecture comes from the fact that a Bartnik
minimizer should be very special. For physical reasons, Bartnik also conjec-
tured that any Bartnik minimizer should be vacuum static. Mathematically,
this conjecture was also supported by a calculus of variations computation
which we will see a bit later.
Definition 6.7. A metric g defined in a region U is called vacuum static if
g is scalar-flat and there exists a nontrivial function f on U such that
Δf = 0,
Hess f = f Ric.
The function f is called a static potential. The pair (g, f ) can be referred to
as vacuum static initial data on U .
We will discuss the physical significance of these equations in Chapter 7,
but for now we can regard the vacuum static condition as saying that the
Ricci curvature takes a certain special form.
Exercise 6.8. Recall that the Schwarzschild metric of mass m is given by
the formula
\[ g_m = \frac{dr^2}{V(r)} + r^2 \, d\Omega^2, \]
where
\[ V(r) = 1 - \frac{2m}{r^{n-2}}. \]
Show that $g_m$ is vacuum static with static potential $f = \sqrt{V}$ using
Proposition 1.13.
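As a partial sanity check for Exercise 6.8, one can verify the harmonicity condition Δf = 0 directly in the coordinates above. This sketch does not use Proposition 1.13 (the route the exercise suggests) and leaves the Hessian equation to the reader; it only uses the standard coordinate formula for the Laplacian of a radial function.

```latex
% Partial check: f = \sqrt{V} is harmonic for g_m = V^{-1} dr^2 + r^2 d\Omega^2.
% For a radial function f(r): \sqrt{\det g_m} \propto V^{-1/2} r^{n-1} and g_m^{rr} = V, so
\begin{align*}
\Delta_{g_m} f
 &= V^{1/2}\, r^{1-n}\, \frac{d}{dr}\!\left( r^{n-1}\, V^{1/2}\, \frac{df}{dr} \right), \\
f = \sqrt{V}
 &\implies r^{n-1}\, V^{1/2}\, f'
   = \tfrac12\, r^{n-1}\, V'(r)
   = \tfrac12\, r^{n-1} \cdot 2m(n-2)\, r^{1-n}
   = m(n-2), \\
 &\implies \Delta_{g_m} \sqrt{V}
   = V^{1/2}\, r^{1-n}\, \frac{d}{dr}\bigl( m(n-2) \bigr) = 0 .
\end{align*}
```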
The following exercise illustrates how this concept naturally relates to
the scalar curvature operator.
Exercise 6.9. Use Exercise 1.18 to show that at any smooth metric g,
the formal L2 -adjoint (with respect to g) of the linearized scalar curvature
operator at g is given by
\[ DR|_g^*(f) = -(\Delta f)\, g + \operatorname{Hess} f - f \operatorname{Ric} \]
for any smooth function f. In particular, observe that $DR|_g^*(f) = 0$ is
equivalent to the pair of equations
\[ \Delta f = -\frac{1}{n-1} R f, \]
\[ \operatorname{Hess} f = f \left( \operatorname{Ric} - \frac{1}{n-1} R g \right), \]
where n is the dimension.
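The equivalence in Exercise 6.9 comes from taking a trace. A short sketch, using only tr_g g = n, tr_g Hess f = Δf, and tr_g Ric = R:

```latex
% Trace of DR|_g^*(f) = 0 gives the scalar equation; substituting back gives the Hessian equation.
\begin{align*}
0 = \operatorname{tr}_g DR|_g^*(f)
  &= -n\,\Delta f + \Delta f - R f = -(n-1)\,\Delta f - R f
  &&\implies \Delta f = -\frac{1}{n-1}\, R f, \\
0 = DR|_g^*(f)
  &= -(\Delta f)\, g + \operatorname{Hess} f - f \operatorname{Ric}
  &&\implies \operatorname{Hess} f = f \operatorname{Ric} + (\Delta f)\, g
     = f \Bigl( \operatorname{Ric} - \frac{1}{n-1}\, R g \Bigr).
\end{align*}
```

Conversely, substituting the two displayed equations into the formula for the adjoint gives zero, so the pair is equivalent to the single tensor equation.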
In particular, g being vacuum static in a region U is equivalent to $DR|_g^*$
having nontrivial kernel, together with scalar-flatness. Furthermore, we have
the following lemma.
Lemma 6.10 (Fischer-Marsden [FM75]). Let (U, g) be a Riemannian manifold,
and suppose that f is a nontrivial function such that $DR|_g^*(f) = 0$ in
U in the weak sense. Then f is smooth, its zero set is a smooth totally
geodesic hypersurface, and $R_g$ is constant.
Proof. Exercise 6.9 shows that if $DR|_g^*(f) = 0$ in the weak sense, then f
also satisfies the elliptic equation $\Delta f = -\frac{1}{n-1} R f$ weakly.
Therefore we can invoke elliptic regularity of weak solutions (Theorem A.5)
in order to see that f must be smooth. We will first prove that f has a
smooth zero set by showing that 0 is a regular value of f. Suppose, to the
contrary, that there exists a point x where we have both f(x) = 0 and
df(x) = 0.
Let β be any geodesic starting at x, and note that the Hessian equation
of Exercise 6.9 implies that the composition F = f ◦ β satisfies
\[ F''(t) = \left( \operatorname{Ric}(\beta'(t), \beta'(t)) - \frac{1}{n-1} R \right) \cdot F(t). \]
Then f(x) = 0 and df(x) = 0 tells us that $F(0) = F'(0) = 0$. Consequently,
by uniqueness of solutions to linear ODEs, F(t) is identically zero. Since
this argument works for any β, we see that f vanishes in a neighborhood of
x. This means that the set of all points where f and df both vanish is an
open set. Since it is obviously closed as well, this means that f and df
vanish identically everywhere, which contradicts the nontriviality of f.
Thus 0 is a regular value of f.

Observe that along the zero set Z,
$\operatorname{Hess} f = f \left( \operatorname{Ric} - \frac{1}{n-1} R g \right) = 0$. In
particular, this implies that along Z, ∇f is a parallel normal vector, and
thus Z is totally geodesic.
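Next we show that $R_g$ is constant. To see this, we will take the divergence
of the equation $DR|_g^*(f) = 0$. Recall the Weitzenböck formula for a 1-form ω,
\[ \Delta_H \omega = \nabla^* \nabla \omega + \operatorname{Ric}(\omega^\sharp, \cdot), \]
where $\Delta_H = d\delta + \delta d$ is the Hodge Laplacian. (Recall that δ is
the adjoint of d.) If we apply this formula to ω = df, we obtain
\[ \Delta_H \, df = \nabla^* \nabla \, df + \operatorname{Ric}(\nabla f, \cdot), \]
\[ d\delta \, df = -\operatorname{div}(\operatorname{Hess} f) + \operatorname{Ric}(\nabla f, \cdot). \]
Rearranging, we obtain
\[ \operatorname{div}(\operatorname{Hess} f) = d(\Delta f) + \operatorname{Ric}(\nabla f, \cdot). \]
We can now compute
\begin{align*}
0 &= \operatorname{div}[DR|_g^*(f)] \\
  &= \operatorname{div}[-(\Delta f) g + \operatorname{Hess} f - f \operatorname{Ric}] \\
  &= -d(\Delta f) + \operatorname{div}(\operatorname{Hess} f) - \operatorname{Ric}(\nabla f, \cdot) - f \operatorname{div}(\operatorname{Ric}) \\
  &= -f \operatorname{div}(\operatorname{Ric}) \\
  &= -\tfrac{1}{2} f \, dR,
\end{align*}
where the last line is the fact that the Einstein tensor is divergence-free
(Exercise 1.10). Thus dR vanishes wherever f ≠ 0. Since f vanishes only on a
codimension one subset, continuity gives dR = 0 everywhere. That is, $R_g$ is
constant on U.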
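The individual steps of the divergence computation can be spelled out as follows. This is only a sketch of the bookkeeping; it uses the product rule for the divergence and the contracted Bianchi identity in the form cited from Exercise 1.10.

```latex
% Product rules used when taking the divergence of DR|_g^*(f):
\begin{align*}
\operatorname{div}\bigl( (\Delta f)\, g \bigr) &= d(\Delta f), \\
\operatorname{div}( f \operatorname{Ric} ) &= \operatorname{Ric}(\nabla f, \cdot) + f \operatorname{div}(\operatorname{Ric}).
\end{align*}
% The Einstein tensor G = Ric - (1/2) R g is divergence-free, and div(R g) = dR, so
\begin{align*}
0 = \operatorname{div}\Bigl( \operatorname{Ric} - \tfrac12 R\, g \Bigr)
  = \operatorname{div}(\operatorname{Ric}) - \tfrac12\, dR
\quad\implies\quad
-f \operatorname{div}(\operatorname{Ric}) = -\tfrac12\, f\, dR .
\end{align*}
```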
Corollary 6.11. Let (M, g) be a complete asymptotically flat manifold.
Then
\[ DR|_g : W^{2,p}_{-q}(T^*M \odot T^*M) \longrightarrow L^p_{-q-2}(M) \]
is surjective for all p > 1 and 0 < q < n − 2.
Proof. First check that asymptotic flatness implies that $DR|_g$ indeed maps
between these two spaces. We claim that the image of $DR|_g$ has finite
codimension. To see this, we restrict $DR|_g$ to deformations of the metric of
the form vg for $v \in W^{2,p}_{-q}(M)$, i.e., conformal deformations. We can compute
$DR|_g(vg)$ using Exercise 1.18 to see that it is a second-order elliptic operator
on v of the form in Assumption A.29, and consequently the image of this
operator has finite codimension in $L^p_{-q-2}(M)$ by Corollary A.42. Since the
image of the full map $DR|_g$ contains the image of this restriction, it too has
finite codimension.
Suppose $DR|_g$ is not surjective. Since the image is closed, this implies
that there exists a nonzero element f in the dual space of $L^p_{-q-2}(M)$ such
that $DR|_g^*(f) = 0$ in the weak sense on all of M. By Lemma 6.10, f is
smooth and $R_g$ is constant. Since g is asymptotically flat, this constant
is zero, and thus Δf = 0. It is not hard to see that the dual space of
$L^p_{-q-2}(M)$ is just $L^{p^*}_{q+2-n}(M)$, where $p^* = \frac{p}{p-1}$. Since q < n − 2, we can
apply weighted elliptic regularity (Corollary A.34) to see that f decays at
infinity, and then the maximum principle implies that f vanishes identically,
which is a contradiction.
6.2. Bartnik minimizers

Theorem 6.12 (Corvino [Cor00], Miao [Mia04], Anderson-Jauregui
[Jau13]). Let n < 8, let $(\Sigma^{n-1}, \gamma, \eta)$ be Bartnik data, and suppose that
$(M^n, g)$ is a Bartnik minimizer in P(Σ, γ, η). Then g is vacuum static.
Furthermore, $H_g = \eta$ at ∂M = Σ.
If either n = 3 or η ≢ 0, then the static potential f can be chosen to be
positive in the interior of M and approach 1 at infinity.
The main consequence that g is vacuum static is originally due to
J. Corvino [Cor00] (with the strength of the argument substantially
improved in [Cor17]). Pengzi Miao proved the part that a minimizer must
have Hg = η at ∂M [Mia04]. Lan-Hsuan Huang, Daniel Martin, and Miao
proved the statement about the static potential when n = 3 [HMM18].
(See also references cited therein. The n = 3 restriction comes from in-
voking Theorem 3.46.) A recent preprint of Michael Anderson and J. Jau-
regui proves Theorem 6.12 using an approach very different from that of
Corvino [Jau13].
In some sense, this theorem could be thought of as a generalization of
positive mass rigidity (Theorem 3.19). Just as in that proof, the easier first
step is to show that g is scalar-flat. The “boundary analog” of scalar-flatness
of g is the fact that Hg = η at ∂M .
Lemma 6.13. Let n < 8, let $(\Sigma^{n-1}, \gamma, \eta)$ be Bartnik data with η ≢ 0,
and suppose that $(M^n, g)$ is a Bartnik minimizer in P(Σ, γ, η). Then g is
scalar-flat. Moreover, $H_g = \eta$ at ∂M = Σ.
Proof. Assume the hypotheses of the lemma, and suppose that g is not
scalar-flat. Just as in the proof of positive mass rigidity (Theorem 3.19),
the idea is that we can choose a conformal factor u that reduces the scalar
curvature while maintaining nonnegativity. Since this has the effect of re-
ducing the mass, this will violate the minimizing property of g, giving us our
desired contradiction. The only difference now is that we have to account
for the presence of a boundary.
Following the proof of Theorem 3.19, we let ζ be a positive compactly
supported function such that $\zeta R_g$ is positive somewhere and then seek a
positive conformal factor u such that
\[ -\frac{4(n-1)}{n-2} \Delta_g u + \zeta R_g u = 0, \]
with u(x) → 1 as x → ∞, except this time we also impose the boundary
condition u = 1 at ∂M. This u exists by the same reasoning used before.
Continuing exactly as before, we see that if $\tilde g = u^{\frac{4}{n-2}} g$, then $R_{\tilde g} \ge 0$, and
\[ m_{ADM}(\tilde g) = m_{ADM}(g) - \frac{1}{2(n-1)} \int_M \zeta R_g u \, d\mu_g < m_{ADM}(g). \]
This contradicts the Bartnik minimizing property of g, provided g̃ is an
admissible extension. We can easily see that conditions (1), (2), and (3) in
the definition of admissibility (Definition 6.1) hold. Using Exercise 2.14, we
can see that at ∂M,
\[ H_{\tilde g} = H_g + \frac{2(n-1)}{n-2} \nabla_\nu u, \]
where ν is the outward normal. Recall that the maximum principle implies
that u ≤ 1 everywhere, and therefore ∇ν u ≤ 0. Thus Hg̃ ≤ Hg ≤ η,
fulfilling condition (4) of admissibility. For sufficiently small ζ, Lemma 6.4
shows that condition (5) also holds, and thus g̃ is indeed admissible. (The
use of Lemma 6.4 is the only place where the n < 8 assumption is used.)
Therefore g must be scalar-flat.
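The sign of the new scalar curvature can be checked from the standard conformal transformation law. The sketch below assumes, as our reading of "sufficiently small ζ", that 0 ≤ ζ ≤ 1 pointwise.

```latex
% Conformal change \tilde g = u^{4/(n-2)} g, with u solving
%   -\frac{4(n-1)}{n-2} \Delta_g u + \zeta R_g u = 0.
% Assumption (ours): 0 <= \zeta <= 1.
\begin{align*}
R_{\tilde g}
 &= u^{-\frac{n+2}{n-2}} \Bigl( -\tfrac{4(n-1)}{n-2}\, \Delta_g u + R_g u \Bigr) \\
 &= u^{-\frac{n+2}{n-2}} \bigl( -\zeta R_g u + R_g u \bigr)
    \qquad \text{(using the equation for } u\text{)} \\
 &= (1 - \zeta)\, R_g\, u^{-\frac{4}{n-2}} \;\ge\; 0 .
\end{align*}
```

So the conformal change removes the fraction ζ of the scalar curvature (up to a positive factor) while keeping it nonnegative, which is the mechanism behind the mass decrease.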
The second part of the lemma, regarding the mean curvature, is an inde-
pendent result of Miao [Mia04], and we will only give the rough idea behind
the proof. Suppose that the Bartnik minimizer (M, g) has Hg < η some-
where in ∂M = Σ. We obtain a contradiction by constructing an admissible
extension with strictly smaller mass. Following similar reasoning used in the
proof of Theorem 4.17, one can use a mollification and a conformal change
to produce the smaller mass competitor, except that one must make sure to
only make changes to the extension (M, g) and not the “inside” of Σ as was
done in the proof of Theorem 4.17.
Corvino’s proof that a mass-minimizing admissible extension must be
vacuum static was essentially a corollary of the following theorem [Cor00].
Theorem 6.14 (Localized scalar curvature deformation [Cor00]). Let U
be a precompact open subset of a Riemannian manifold $(M^n, g)$, and assume
that the adjoint of the linearization of scalar curvature, $DR|_g^*$, is injective
over U, or, more precisely,
\[ DR|_g^* : W^{2,2}_{loc}(U) \longrightarrow L^2_{loc}(U) \]
has vanishing kernel. Then for any smooth function κ such that κ − Rg is
supported in U and sufficiently small, there exists a metric g̃ with Rg̃ = κ
and g̃ = g outside U .
For a more explicit statement of Theorem 6.14, see [Cor00]. The nov-
elty of this theorem is that the deformation of the metric is compactly sup-
ported. Without that requirement, a global version of this sort of theorem is
a standard result for elliptic operators and follows from the inverse function
theorem (Theorem A.43). The reason why it is possible to localize in this
way is that the scalar curvature operator R, as an operator on the metric g,
is heavily overdetermined. Corvino realized that this could be exploited to
obtain localized estimates on the operator DR|∗g . We omit the proof of The-
orem 6.14, but we note that in addition to being used to prove Theorem 6.12,
it was also used to prove Theorem 3.51 [Cor00].
We can now loosely explain Corvino’s proof that a Bartnik minimizing
extension (M, g) of Bartnik data (Σ, γ, η) must be vacuum static.
Sketch of Corvino’s approach to Theorem 6.12. As we already saw
in Lemma 6.13, g must be scalar-flat. Suppose it is not vacuum static. An
argument used in Corvino’s proof of Theorem 6.14 shows that there must
exist some precompact open U ⊂ M (not touching ∂M ) on which g is not
vacuum static. By Exercise 6.9 together with scalar-flatness, this is equiva-
lent to saying that DR|∗g is injective on U . So we can apply Theorem 6.14 to
see that there exists an arbitrarily small, compactly supported deformation
g̃ of g that still has nonnegative scalar curvature but is no longer scalar-flat.
Since the deformation is small and compactly supported, we can see that g̃
is also an admissible extension and it has the exact same mass as g. But this
contradicts Lemma 6.13. Corvino has recently sharpened this argument to
show that if an admissible extension (M, g) is not vacuum static, then one
can actually deform g away from a neighborhood of ∂M to find an admissible
extension with strictly smaller mass [Cor17]. (This obviates the need to
invoke Lemma 6.4 and consequently the need to assume that η ≢ 0.)
Therefore we have established Theorem 6.12, except for the part that
says that the static potential f can be chosen to be positive in the interior
of M and approach 1 at infinity. When n = 3, this was proved in [HMM18,
MT15]. We omit the proof.
We will now sketch an alternative approach to Theorem 6.12 along the
lines set forth by Bartnik in [Bar05]. Although this was worked out in a
recent preprint by Anderson and Jauregui [Jau13], we will provide a less
rigorous exposition inspired by a perspective taken in [HL17].
Fix Bartnik data $(\Sigma^{n-1}, \gamma, \eta)$, and suppose $M^n$ is a one-ended space with
∂M = Σ equipped with a background metric ḡ that is Euclidean outside a
compact set. Let α ∈ (0, 1) and $\frac{n-2}{2} < q < n-2$, and define $C^{2,\alpha}_{-q}(\gamma)$ to
be the space of all metrics g on M such that $g - \bar g \in C^{2,\alpha}_{-q}(T^*M \otimes T^*M)$
and g induces the metric γ on ∂M. We would like to use a variational
argument, using the fact that a Bartnik minimizer minimizes ADM mass
over all admissible extensions in $C^{2,\alpha}_{-q}(\gamma)$.¹
The admissibility conditions include two inequalities Rg ≥ 0 and Hg ≤ η,
but Lemma 6.13 tells us that a Bartnik minimizer actually satisfies the
equalities Rg = 0 and Hg = η. Therefore, if we define the constraint space
\[ \mathcal{C} = \left\{ g \in C^{2,\alpha}_{-q}(\gamma) \;:\; g \text{ is scalar-flat, and } H_g = \eta \text{ at } \partial M \right\}, \]
we can see that a Bartnik minimizer minimizes ADM mass over a small
neighborhood in C. (Lemma 6.4 guarantees that all metrics in a small
enough neighborhood are admissible.) As long as this constraint space C
is smooth, we should be able to invoke the method of Lagrange multipliers
(Theorem A.47). Therefore, a crucial needed result is the following.
Theorem 6.15 (Anderson-Jauregui [Jau13]). Let $(M^n, g)$ be a complete
asymptotically flat manifold with induced metric γ on ∂M. Let
$\frac{n-2}{2} < q < n-2$ be less than the asymptotic decay rate of g (as in
Definition 3.5), and let α ∈ (0, 1). If we think of the scalar curvature
operator as a map
\[ R : C^{2,\alpha}_{-q}(\gamma) \longrightarrow C^{0,\alpha}_{-q-2}(M), \]
and the mean curvature operator as a map
\[ H : C^{2,\alpha}_{-q}(\gamma) \longrightarrow C^{1,\alpha}(\partial M), \]
then at any smooth, asymptotically flat metric g, the map
\[ (DR|_g, DH|_g) : C^{2,\alpha}_{-q,0}(T^*M \odot T^*M) \longrightarrow C^{0,\alpha}_{-q-2}(M) \times C^{1,\alpha}(\partial M) \]
is surjective. Here, the 0-subscript in $C^{2,\alpha}_{-q,0}$ denotes vanishing at ∂M.
Surjectivity of the map DR|g in the case without boundary (for Sobolev
spaces) was proved in Corollary 6.11. Note that this surjectivity is pre-
cisely what is needed to invoke Lagrange multipliers (Theorem A.47). The
boundary introduces a highly nontrivial technical issue which was handled
by Anderson and Jauregui by taking an approach involving static spacetimes
(as defined in Section 7.1.3). Instead, we will provide a heuristic sketch that
gives a sense for why the theorem is reasonable.
The following exercise is another illustration of the relationship between
scalar curvature and mean curvature.
Exercise 6.16. Let g be a smooth, asymptotically flat metric on a manifold
M with boundary. Then for any smooth function v on M, and any compactly
supported $\dot g \in C_0^\infty(T^*M \odot T^*M)$,
\[ \int_M v \cdot DR|_g(\dot g) \, d\mu_g = \int_M \langle DR|_g^*(v), \dot g \rangle_g \, d\mu_g + \int_{\partial M} v \cdot DH|_g(\dot g) \, d\mu_{\partial M}. \]
Here, $C_0^\infty$ means vanishing at ∂M.

¹ Technically speaking, in the argument that follows, we are assuming that we have a smooth
Bartnik minimizer that minimizes over the class of $C^{2,\alpha}_{-q}(\gamma)$ competitors.
Heuristic sketch of the proof of Theorem 6.15. We will make the
simplifying assumption that $(DR|_g, DH|_g)$ has closed range, which is
actually the most serious difficulty in the proof. Suppose that $(DR|_g, DH|_g)$
is not surjective. Given closed range, there must exist (generalized)
functions v and u, not both zero, in the dual spaces of $C^{0,\alpha}_{-q-2}(M)$ and
$C^{1,\alpha}(\partial M)$, respectively, such that for all $\dot g \in C^{2,\alpha}_{-q,0}(T^*M \odot T^*M)$, we have
\[ \int_M v \cdot DR|_g(\dot g) \, d\mu_g + \int_{\partial M} u \cdot DH|_g(\dot g) \, d\mu_{\partial M} = 0. \]
We now make another simplifying assumption that the generalized functions
v and u can actually be represented by functions. (Note that, as argued in
Lemma 6.10, elliptic regularity already guarantees that v is represented by
a smooth function in the interior of M.) Then Exercise 6.16 tells us that
\[ \int_M \langle DR|_g^*(v), \dot g \rangle_g \, d\mu_g + \int_{\partial M} (u + v) \cdot DH|_g(\dot g) \, d\mu_{\partial M} = 0. \]
Since ġ is arbitrary, it is clear that $DR|_g^*(v) = 0$. By Lemma 6.10 and
asymptotic flatness of g, it follows that $R_g = 0$, that is, g is scalar-flat.
Consequently, v is g-harmonic by Exercise 6.9. Since v is in the dual space
of $C^{0,\alpha}_{-q-2}(M)$, we know that $v \in L^1_{q+2-n}$, and q + 2 − n < 0 by assumption,
so v decays at infinity in an integral sense. By weighted elliptic regularity
(Corollary A.34), v decays to zero at infinity, and then the maximum
principle implies that v is zero.
In order to show that u also vanishes, it is sufficient to prove directly
that $DH|_g$ is surjective. But this is fairly clear. As part of the solution
to Exercise 6.16, one observes that $DH|_g(\dot g) = \operatorname{tr}_{\partial M} \nabla_\nu \dot g$. For any function
in $C^{1,\alpha}(\partial M)$, one can easily construct a preimage ġ vanishing at ∂M via
integration. The only tricky part is choosing ġ to have $C^{2,\alpha}_{-q,0}(T^*M \odot T^*M)$
regularity, but this can be achieved via a smoothing argument.
As we have seen in Exercise 3.10, the ADM mass is not well-defined on
all of $C^{2,\alpha}_{-q}(\gamma)$ since the scalar curvature need not be integrable. Therefore we
cannot directly apply the method of Lagrange multipliers (Theorem A.47)
to the ADM mass functional. Because of this, it is useful to define a closely
related quantity that is defined on all of $C^{2,\alpha}_{-q}(\gamma)$, due to T. Regge and
C. Teitelboim [RT74].
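Definition 6.17. Let $M^n$ be a one-ended space, possibly with boundary.
Let α ∈ (0, 1) and $\frac{n-2}{2} < q < n-2$. For each $g \in C^{2,\alpha}_{-q}(\gamma)$, the
Regge-Teitelboim Hamiltonian of (M, g) is
\begin{align*}
\mathcal{H}(M, g) &= m_{ADM}(M, g) - \frac{1}{2(n-1)\omega_{n-1}} \int_M R_g \, d\mu_g \\
 &= \frac{1}{2(n-1)\omega_{n-1}} \left( \lim_{\rho \to \infty} \int_{S_\rho} [\operatorname{div} g - d(\operatorname{tr} g)](\nu) \, d\mu_{S_\rho} - \int_M R_g \, d\mu_g \right).
\end{align*}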
Although the individual terms in the definition of $\mathcal{H}(M, g)$ are
technically undefined without the additional assumption $R_g \in L^1(M)$, it is
possible to rewrite it so that it is always well-defined. Specifically, the
rigorous definition is
\begin{align*}
2(n-1)\omega_{n-1} \mathcal{H}(M, g) &= \int_{\partial M} [\operatorname{div} g - d(\operatorname{tr} g)](\nu_g) \, d\mu_{\partial M, g} \\
 &\quad + \int_M \left( \operatorname{div}_g [\operatorname{div} g - d(\operatorname{tr} g)] - R_g \right) d\mu_g.
\end{align*}
One can show that this is always finite by following calculations as in
Exercise 3.10, and the divergence theorem shows that it is equal to the previous
formula whenever the ADM mass is well-defined.
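We next compute the linearization of $\mathcal{H}$. We will do the computation
formally, with the understanding that we can make it rigorous by going
through the rigorous definition:
\begin{align*}
2(n-1)\omega_{n-1} \, D\mathcal{H}|_g(\dot g)
 &= \lim_{\rho \to \infty} \int_{S_\rho} [\operatorname{div} \dot g - d(\operatorname{tr} \dot g)](\nu) \, d\mu_{S_\rho}
    - \int_M \left( DR|_g(\dot g) + \tfrac{1}{2} R_g \operatorname{tr}_g \dot g \right) d\mu_g \\
 &= \lim_{\rho \to \infty} \int_{S_\rho} \left( [\operatorname{div} \dot g - d(\operatorname{tr} \dot g)](\nu)
    - [\operatorname{div}_g \dot g - d(\operatorname{tr}_g \dot g)](\nu_g) \frac{d\mu_{S_\rho, g}}{d\mu_{S_\rho, \bar g}} \right) d\mu_{S_\rho, \bar g} \\
 &\quad - \int_M \left\langle DR|_g^*(1) + \tfrac{1}{2} R_g g, \dot g \right\rangle d\mu_g
    - \int_{\partial M} DH|_g(\dot g) \, d\mu_{\partial M, g} \\
 &= - \int_M \left\langle DR|_g^*(1) + \tfrac{1}{2} R_g g, \dot g \right\rangle d\mu_g
    - \int_{\partial M} DH|_g(\dot g) \, d\mu_{\partial M, g}. \tag{6.1}
\end{align*}
The first equality follows from Exercise 6.16 together with a rewriting of the
boundary term at infinity. The second equality follows from the fact that
the $S_\rho$ integral can be shown to vanish in the limit.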
Heuristic proof of Theorem 6.12. Assume that (M, g) is a Bartnik
minimizer. By Lemma 6.13, $g \in \mathcal{C}$ and, in particular, $R_g = 0$. Since
$\mathcal{H} = m_{ADM}$ on $\mathcal{C}$, it follows that g minimizes $\mathcal{H}$ over $\mathcal{C}$. Since the defining
equations of $\mathcal{C}$ are $R_g = 0$ and $H_g = \eta$, Theorem A.47 tells us that we
can find Lagrange multipliers for the minimizer g of $\mathcal{H}$ subject to the
constraint $\mathcal{C}$. In other words, there must exist (generalized) functions v and u
in the dual spaces of $C^{0,\alpha}_{-q-2}(M)$ and $C^{1,\alpha}(\partial M)$, respectively, such that for
all $\dot g \in C^{2,\alpha}_{-q,0}(T^*M \odot T^*M)$, we have
\[ 2(n-1)\omega_{n-1} \, D\mathcal{H}|_g(\dot g) = \int_M v \cdot DR|_g(\dot g) \, d\mu_g + \int_{\partial M} u \cdot DH|_g(\dot g) \, d\mu_{\partial M}. \tag{6.2} \]
Once again, we will make the simplifying assumption that u and v can be
represented by functions. Then Exercise 6.16 tells us that
\[ 2(n-1)\omega_{n-1} \, D\mathcal{H}|_g(\dot g) = \int_M \langle DR|_g^*(v), \dot g \rangle_g \, d\mu_g + \int_{\partial M} (u + v) \cdot DH|_g(\dot g) \, d\mu_{\partial M}. \]
Combining this with our computation of $D\mathcal{H}|_g(\dot g)$ in (6.1), we have
\[ 0 = \int_M \langle DR|_g^*(v + 1), \dot g \rangle_g \, d\mu_g + \int_{\partial M} (u + v + 1) \cdot DH|_g(\dot g) \, d\mu_{\partial M}. \tag{6.3} \]
Since this must vanish for all ġ, it follows that $DR|_g^*(v + 1) = 0$ in the
interior of M. By Lemma 6.10, v is smooth, and since it is in the dual
space of $C^{0,\alpha}_{-q-2}(M)$, it lies in $L^1_{q+2-n}$. So by weighted elliptic regularity
(Corollary A.34), its limit at infinity is 0. Taking f := v + 1, we obtain the
desired static potential whose limit is 1 at infinity (in an integral sense at
least). Note that equation (6.3) also implies that u = −f at ∂M.
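The passage from (6.1) and (6.2) to (6.3) is a subtraction of two expressions for the same quantity. In the sketch below we write $\mathcal{H}$ for the Regge-Teitelboim Hamiltonian and H for the mean curvature (a notational choice made here to keep the two apart), and we use $R_g = 0$ from Lemma 6.13.

```latex
% With R_g = 0, (6.1) reads
%   2(n-1)\omega_{n-1} D\mathcal{H}|_g(\dot g)
%     = -\int_M <DR|_g^*(1), \dot g>_g d\mu_g - \int_{\partial M} DH|_g(\dot g) d\mu_{\partial M}.
% Subtracting this from the expression obtained from (6.2) via Exercise 6.16:
\begin{align*}
0 &= \Bigl[ \int_M \langle DR|_g^*(v), \dot g \rangle_g \, d\mu_g
      + \int_{\partial M} (u + v)\, DH|_g(\dot g)\, d\mu_{\partial M} \Bigr]
   - \Bigl[ -\int_M \langle DR|_g^*(1), \dot g \rangle_g \, d\mu_g
      - \int_{\partial M} DH|_g(\dot g)\, d\mu_{\partial M} \Bigr] \\
  &= \int_M \langle DR|_g^*(v + 1), \dot g \rangle_g \, d\mu_g
     + \int_{\partial M} (u + v + 1)\, DH|_g(\dot g)\, d\mu_{\partial M},
\end{align*}
% using linearity of DR|_g^*.  Once DR|_g^*(f) = 0 with f := v + 1, the boundary
% integral of (u + f) DH|_g(\dot g) vanishes for all \dot g; since DH|_g is
% surjective (as noted in the sketch of Theorem 6.15), this forces u = -f on \partial M.
```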
All that remains is to show that f is positive. Suppose that f < 0
somewhere in M. Since f is harmonic and tends to 1 at infinity, the maximum
principle implies that f < 0 somewhere on ∂M. Let q be a nontrivial,
nonpositive function on ∂M supported where f < 0. By Theorem 6.15, there
exists a first-order variation ġ such that $DR|_g(\dot g) = 0$ and
$DH|_g(\dot g) = q$. For this choice of ġ, equation (6.2) tells us
\[ 2(n-1)\omega_{n-1} \, D\mathcal{H}|_g(\dot g) = \int_{\partial M} u q \, d\mu_{\partial M} = \int_{\partial M} -f q \, d\mu_{\partial M} < 0. \]
By Theorem 6.15 and the local surjectivity theorem (Theorem A.46),
we can construct a family of scalar-flat metrics whose first-order variation
is ġ, such that the family is admissible for positive parameters. (Lemma 6.4
ensures the no horizon condition.) Therefore we are able to construct an
admissible, scalar-flat extension with smaller $\mathcal{H}$ than g, which contradicts
the Bartnik minimizing property of g. Hence f ≥ 0 on M, and by the strong
maximum principle, it cannot be zero in the interior of M.
6.3. Brown-York mass

An obvious question that we have not yet addressed is how do we know
if there are admissible extensions at all? R. Bartnik discovered a method
of constructing “quasi-spherical” metrics with prescribed scalar curvature
[Bar93]. He considered a family of metrics depending on a single unknown
function and showed that the prescribed scalar curvature equation becomes
a parabolic equation in this unknown. Using this method, Yuguang Shi and
Luen-Fai Tam were able to prove the following.
Theorem 6.18 (Shi-Tam [ST02]). Suppose that $(\Sigma^{n-1}, \gamma, \eta)$ is Bartnik data
such that η > 0 and $(\Sigma^{n-1}, \gamma)$ can be isometrically embedded into Euclidean
$\mathbb{R}^n$ as a strictly convex hypersurface. Then we can construct an admissible
extension of (Σ, γ, η).
Note that in order for the hypotheses to hold, Σ must be a topological
sphere. When n = 3, the relevant isometric embedding problem was
originally posed by H. Weyl, and it was solved in independent works of
L. Nirenberg [Nir53] and A. V. Pogorelov [Pog52].
Theorem 6.19 (Weyl embedding theorem). A closed surface $(\Sigma^2, \gamma)$ can
be isometrically embedded into Euclidean $\mathbb{R}^3$ as a strictly convex surface if
and only if γ has positive Gauss curvature. Furthermore, this embedding is
unique up to Euclidean isometries.
See also [GL94] for the case of nonnegative Gauss curvature. In higher
dimensions, the analogous problem is heavily overdetermined (even local so-
lutions need not exist), and one generally does not expect to find such em-
beddings except under exceptional circumstances (see [LW99]). However,
we will keep our discussion of Theorem 6.18 in general dimension because
nothing about the proof is specific to three dimensions.
Actually, it is not the mere existence of the extension in Theorem 6.18
that is important to us, but rather the specific construction used. So
although we will omit the proof, we will outline the construction: by
hypothesis, (Σ, γ) is isometric to some strictly convex hypersurface $S_0$ in
$\mathbb{R}^n$. Recall that if we flow outward at unit speed in the normal direction
from a strictly convex hypersurface in $\mathbb{R}^n$, we obtain a smooth flow for all
time, and strict convexity is preserved. Let $S_\rho$ be the result of flowing $S_0$
for time ρ. (In particular, there is a natural diffeomorphism from $S_0$ to
$S_\rho$.) Or equivalently, $S_\rho$ is the hypersurface of distance ρ away from $S_0$,
on the outside. Let Ω be the region enclosed by $S_0$ in $\mathbb{R}^n$ so that
∂Ω = $S_0$. The fact that $S_0$ is convex implies that the family $S_\rho$ smoothly
foliates all of $\mathbb{R}^n \setminus \Omega$ and hence the Euclidean metric on $\mathbb{R}^n \setminus \Omega$ can be
expressed as $d\rho^2 + h_\rho$ via the diffeomorphism $\mathbb{R}^n \setminus \Omega \approx [0, \infty) \times \Sigma$,
where $h_\rho$ is the induced metric on $S_\rho$, pulled back to Σ. We consider
metrics of the form
\[ g = u^2 \, d\rho^2 + h_\rho, \]
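where u is an arbitrary positive function on [0, ∞) × Σ. We can compute
the scalar curvature of g using (2.15). In the following, we will use $H_u$
and $A_u$ to denote the mean curvature and second fundamental form of the
surface $S_\rho$ in g, and use H and A to denote the corresponding quantities in
Euclidean space. It is easy to see that $A_u = u^{-1} A$, and thus $H_u = u^{-1} H$.
The unit normal to $S_\rho$ in g is just $u^{-1} \frac{\partial}{\partial \rho}$, and therefore (2.15) tells us that
the variation of $H_u$ with respect to $\frac{\partial}{\partial \rho}$ is
\[ \frac{\partial H_u}{\partial \rho} = -\Delta_{S_\rho} u + \frac{1}{2} \left( R_{S_\rho} - R_g - |A_u|^2 - H_u^2 \right) u, \tag{6.4} \]
and thus
\[ -u^{-2} H \frac{\partial u}{\partial \rho} + u^{-1} \frac{\partial H}{\partial \rho} = -\Delta_{S_\rho} u + \frac{1}{2} (R_{S_\rho} - R_g) u - \frac{1}{2} \left( |A|^2 + H^2 \right) u^{-1}. \]
Using (2.15) to compute the variation of H with respect to its Euclidean
unit normal $\frac{\partial}{\partial \rho}$ gives us
\[ \frac{\partial H}{\partial \rho} = \frac{1}{2} \left( R_{S_\rho} - |A|^2 - H^2 \right). \tag{6.5} \]
Combining these two equations, we obtain
\[ -u^{-2} H \frac{\partial u}{\partial \rho} = -\Delta_{S_\rho} u + \frac{1}{2} R_{S_\rho} (u - u^{-1}) - \frac{1}{2} R_g u. \]
This equation determines $R_g$ in terms of u, so if we want g to be scalar-flat,
then u must satisfy the equation
\[ H \frac{\partial u}{\partial \rho} = u^2 \Delta_{S_\rho} u + \frac{1}{2} R_{S_\rho} (u - u^3). \tag{6.6} \]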
Since H > 0, this is a nonlinear parabolic equation in u. The main technical
content of Theorem 6.18 is the following [ST02, Theorem 2.1].
Lemma 6.20. Given the setup above, for any smooth positive function u0
on Σ, there exists a smooth positive global solution u on [0, ∞) × Σ to (6.6)
with initial condition u(0, x) = u0 (x) for all x ∈ Σ. Moreover,
\[ u = 1 + O_2(\rho^{2-n}) = 1 + m \rho^{2-n} + O_1(\rho^{1-n}) \]
for some constant m.
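As an illustration of equation (6.6) and of the asymptotics in Lemma 6.20, consider the rotationally symmetric case. The setup here is an assumption made for the example: $S_0$ is a round sphere of radius $r_0$ in $\mathbb{R}^n$, so $S_\rho$ is the round sphere of radius $r = r_0 + \rho$, and u is taken to depend only on r. Under these assumptions (6.6) becomes an ODE whose solutions are the Schwarzschild profiles.

```latex
% Rotationally symmetric reduction of (6.6); r = r_0 + \rho, u = u(r).
\begin{align*}
H = \frac{n-1}{r}, \quad R_{S_\rho} = \frac{(n-1)(n-2)}{r^2}, \quad \Delta_{S_\rho} u = 0
 &\implies r\, \frac{du}{dr} = \frac{n-2}{2}\, (u - u^3), \\
w := u^{-2}
 &\implies r\, \frac{dw}{dr} = -(n-2)(w - 1)
  \implies w = 1 - \frac{2m}{r^{n-2}}, \\
u = \Bigl( 1 - \frac{2m}{r^{n-2}} \Bigr)^{-1/2}
 &= 1 + \frac{m}{r^{n-2}} + O(r^{4-2n}).
\end{align*}
% Here 2m is the constant of integration, and the solution is valid where r^{n-2} > 2m.
```

In this case $g = u^2 d\rho^2 + h_\rho$ is the Schwarzschild metric $dr^2/V + r^2 d\Omega^2$ with $V = 1 - 2m/r^{n-2}$, and the expansion agrees with the form $u = 1 + m\rho^{2-n} + O_1(\rho^{1-n})$ in the lemma, since $r^{2-n} = \rho^{2-n} + O(\rho^{1-n})$.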
It is not hard to see that $S_\rho$ is close to a standard large round sphere for
large ρ. If one makes this statement precise, it then follows fairly easily from
the lemma that $g = u^2 d\rho^2 + h_\rho$ is asymptotically flat, and $m_{ADM}(g) = m$.
Since $H_u = u^{-1} H > 0$, we know that [0, ∞) × Σ is foliated by hypersurfaces
which are strictly mean convex with respect to g. In particular, Corollary 4.2
implies that $S_0$ is not enclosed by any horizons in ([0, ∞) × Σ, g). Combining
all of this information, we see that ([0, ∞) × Σ, g) is an admissible extension
of the Bartnik data $(S_0, g_0, u_0^{-1} H)$. Therefore, in order to find an admissible
extension for (Σ, γ, η), we merely have to choose initial condition $u_0 = H/\eta$
in Lemma 6.20, since (Σ, γ) is isometric to $(S_0, g_0)$, where H is the mean
curvature of $S_0$ in Euclidean space.
Given that we have an admissible extension g, the definition of Bartnik
mass tells us that $m_B(\Sigma, \gamma, \eta) \le m_{ADM}(g)$. Let us now try to better
understand the mass of this extension. By Lemma 6.20, it follows that
\begin{align*}
m_{ADM}(g) &= \lim_{\rho \to \infty} \rho^{n-2} (1 - u^{-1}) \\
 &= \lim_{\rho \to \infty} \frac{1}{\omega_{n-1}} \int_{S_\rho} \rho^{-1} (1 - u^{-1}) \, d\mu_{g_\rho} \\
 &= \lim_{\rho \to \infty} \frac{1}{(n-1)\omega_{n-1}} \int_{S_\rho} H (1 - u^{-1}) \, d\mu_{g_\rho} \\
 &= \lim_{\rho \to \infty} \frac{1}{(n-1)\omega_{n-1}} \int_{S_\rho} (H - H_u) \, d\mu_{g_\rho},
\end{align*}
where the third equality follows from the fact that the sphere $S_\rho$ is close to
a standard large round sphere for large ρ. We choose to write the mass in
this way because the integral above is monotone in ρ.
Lemma 6.21 (Shi-Tam monotonicity [ST02]). Given the setup above, the
quantity
\[ \int_{S_\rho} (H - H_u) \, d\mu_{g_\rho} \]
is monotone nonincreasing in ρ, and consequently
\[ \frac{1}{(n-1)\omega_{n-1}} \int_{S_0} (H - H_u) \, d\mu_{g_0} \ge m_{ADM}(g). \]
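Proof. Observe that the traced Gauss equation (Corollary 2.7) in Euclidean
space tells us that
\[ R_{S_\rho} = H^2 - |A|^2. \]
Combining (6.4) and (6.5), we compute
\begin{align*}
\frac{d}{d\rho} \int_{S_\rho} (H - H_u) \, d\mu_{g_\rho}
 &= \int_{S_\rho} \left( H (H - H_u) + \frac{\partial}{\partial \rho} (H - H_u) \right) d\mu_{g_\rho} \\
 &= \int_{S_\rho} \Bigl( H^2 (1 - u^{-1}) + \Delta_{S_\rho} u
    + \tfrac{1}{2} R_{S_\rho} (1 - u) - \tfrac{1}{2} (|A|^2 + H^2)(1 - u^{-1}) \Bigr) d\mu_{g_\rho} \\
 &= \int_{S_\rho} \Bigl( \tfrac{1}{2} R_{S_\rho} (1 - u) + \tfrac{1}{2} (H^2 - |A|^2)(1 - u^{-1}) \Bigr) d\mu_{g_\rho} \\
 &= \int_{S_\rho} \tfrac{1}{2} R_{S_\rho} (2 - u - u^{-1}) \, d\mu_{g_\rho} \\
 &= -\frac{1}{2} \int_{S_\rho} R_{S_\rho} u^{-1} (u - 1)^2 \, d\mu_{g_\rho} \\
 &\le 0,
\end{align*}
since the convex spheres $S_\rho$ must have positive scalar curvature.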
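The monotonicity can be checked by hand in the rotationally symmetric case. The following sketch assumes (for illustration only) that $S_\rho$ is the round sphere of radius r in $\mathbb{R}^n$, that $u = V^{-1/2}$ with $V = 1 - 2m/r^{n-2}$ (the Schwarzschild profile, which one can verify solves (6.6)), and that $\omega_{n-1}$ denotes the area of the unit (n−1)-sphere.

```latex
% Normalized Shi-Tam quantity on round spheres, with s := r^{n-2} and V = 1 - 2m/s.
\begin{align*}
\frac{1}{(n-1)\omega_{n-1}} \int_{S_\rho} (H - H_u)\, d\mu_{g_\rho}
 &= \frac{1}{(n-1)\omega_{n-1}} \cdot \omega_{n-1} r^{n-1} \cdot \frac{n-1}{r} \bigl( 1 - \sqrt{V} \bigr)
  = s \bigl( 1 - \sqrt{V} \bigr), \\
\frac{d}{ds} \Bigl[ s \bigl( 1 - \sqrt{V} \bigr) \Bigr]
 &= 1 - \sqrt{V} - \frac{m}{s \sqrt{V}}
  = \frac{2\sqrt{V} - 2V - (1 - V)}{2 \sqrt{V}}
  = -\frac{\bigl( 1 - \sqrt{V} \bigr)^2}{2 \sqrt{V}} \;\le\; 0, \\
\lim_{s \to \infty} s \bigl( 1 - \sqrt{V} \bigr) &= m .
\end{align*}
% The second line uses m/s = (1 - V)/2.
```

So in this example the normalized integral decreases monotonically to the ADM mass m, as the lemma predicts.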
Although we have carried out this argument starting with a foliation
of Euclidean space by parallel surfaces, it has been observed by others that
there is some flexibility in this, and that other flows can be useful for certain
applications. See, for example, [EMW12, LM17, Lin14].
In light of the above computations, it is natural to define another notion
of quasi-local mass, though it was originally proposed by J. David Brown
and James York for purely physical reasons [BY93].
Definition 6.22. Let $(\Sigma^2, \gamma, \eta)$ be Bartnik data with positive Gauss
curvature. By Theorem 6.19, (Σ, γ) is isometric to some strictly convex surface
Φ(Σ) in Euclidean $\mathbb{R}^3$. We define the Brown-York mass of (Σ, γ, η) to be
\[ m_{BY}(\Sigma, \gamma, \eta) = \frac{1}{8\pi} \int_\Sigma (H - \eta) \, d\mu_\gamma, \]
where H is the mean curvature of Φ(Σ) in Euclidean space, pulled back to Σ.
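A standard example may help fix ideas. The sketch below computes the Brown-York mass of a coordinate sphere {r = const} in the three-dimensional Schwarzschild metric $g_m = dr^2/V + r^2 d\Omega^2$ with $V = 1 - 2m/r$ and r > 2m; the choice of this surface and of η as its Schwarzschild mean curvature are assumptions made for the illustration.

```latex
% Brown-York mass of a coordinate sphere in 3-dimensional Schwarzschild.
% Induced metric: \gamma = r^2 d\Omega^2 (round, Gauss curvature 1/r^2 > 0),
% so \Phi(\Sigma) is the round sphere of radius r in R^3.
\begin{align*}
H &= \frac{2}{r}
   && \text{(mean curvature in Euclidean space)}, \\
\eta &= \frac{2}{r} \sqrt{V}
   && \text{(mean curvature in } g_m \text{, unit normal } \sqrt{V}\, \partial_r), \\
m_{BY}(\Sigma, \gamma, \eta)
  &= \frac{1}{8\pi} \cdot 4\pi r^2 \cdot \frac{2}{r} \bigl( 1 - \sqrt{V} \bigr)
   = r \Bigl( 1 - \sqrt{1 - \tfrac{2m}{r}} \Bigr).
\end{align*}
```

This expression tends to m as r → ∞ and, for m > 0, is strictly larger than m at every finite radius.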
Note that the uniqueness part of Theorem 6.19 guarantees that the
Brown-York mass is well-defined. Our discussion in this section leads up
to the following theorem, which is essentially due to Shi and Tam [ST02].
Theorem 6.23 (Shi-Tam). Let $(\Sigma^2, \gamma, \eta)$ be Bartnik data with positive
Gauss curvature and η > 0. Then
\[ m_B(\Sigma, \gamma, \eta) \le m_{BY}(\Sigma, \gamma, \eta). \]
In particular, if we further assume that (Σ, γ) is the boundary of a Rie-
mannian region (Ω3 , g) with nonnegative scalar curvature such that HΣ = η,
then we have mBY (Σ, γ, η) ≥ 0, with equality only if (Ω3 , g) is isometric to
a region of Euclidean space.
A higher-dimensional version of this theorem also holds, under the
assumption that the appropriate embedding exists.
Proof. Consider the admissible extension constructed in Theorem 6.18, so
that the ADM mass $m_{ADM}$ of this extension is at least as large as $m_B$.
Combining this with Lemma 6.21, we see that mB ≤ mADM ≤ mBY . For
the second part of the theorem, if we are able to “fill in” the Bartnik data
with a region of nonnegative scalar curvature, then we can apply the positive
mass theorem (the singular version, Theorem 4.17) to the result of (Ω, g)
glued to the extension along their common boundary Σ. Therefore mBY ≥
mADM ≥ 0. If mBY = 0, then mADM = 0 also, and then the rigidity part
of Theorem 4.17 implies that the glued object is Euclidean space, and in
particular, (Ω, g) is a region of Euclidean space.
We can also relate Brown-York mass to Hawking mass.

Theorem 6.24. Let (Σ2 , γ, η) be Bartnik data with positive Gauss curvature
and η > 0. Then
mHaw (Σ, γ, η) ≤ mBY (Σ, γ, η).
This theorem is an immediate consequence of applying Lemma 6.21

and Theorem 4.53 (relating Hawking mass to ADM mass) to the admis-
sible extension constructed in Theorem 6.18. Theorem 4.53 applies because
the extension is topologically R3 minus a ball, and because the positive
mean curvature foliation of the extension guarantees that Σ will be outward-
minimizing. However, P. Miao observed that Theorem 6.24 is a fairly simple
consequence of some algebraic manipulations combined with the classical
Minkowski inequality, so that the proof just described is a rather heavy-
handed approach [Mia09].
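As a sanity check on Theorem 6.24, one can compare the two masses on coordinate spheres in the three-dimensional Schwarzschild space. The sketch below is only an illustration under stated assumptions: it takes the metric in the form (1 − 2m/r)^(−1) dr² + r² dΩ² with r > 2m, so that the sphere of radius r is round with η = (2/r)√(1 − 2m/r) and Euclidean mean curvature H = 2/r, and it uses the standard formula √(|Σ|/16π)(1 − (1/16π)∫η²) for the Hawking mass. The function names are hypothetical.

```python
import math

def brown_york_mass(r, m):
    """Brown-York mass of the coordinate sphere of radius r in 3D Schwarzschild.

    Assumes r > 2m and the metric (1 - 2m/r)^(-1) dr^2 + r^2 dOmega^2.
    """
    H_euclid = 2.0 / r                              # mean curvature of a round sphere in R^3
    eta = (2.0 / r) * math.sqrt(1.0 - 2.0 * m / r)  # mean curvature in Schwarzschild
    area = 4.0 * math.pi * r ** 2
    # H and eta are constant on the sphere, so the integral is a product.
    return (H_euclid - eta) * area / (8.0 * math.pi)

def hawking_mass(r, m):
    """Hawking mass of the same sphere (standard formula, assumed here)."""
    eta = (2.0 / r) * math.sqrt(1.0 - 2.0 * m / r)
    area = 4.0 * math.pi * r ** 2
    return math.sqrt(area / (16.0 * math.pi)) * (1.0 - eta ** 2 * area / (16.0 * math.pi))

if __name__ == "__main__":
    for r in (2.5, 10.0, 1000.0):
        print(r, hawking_mass(r, 1.0), brown_york_mass(r, 1.0))
```

In this example the Hawking mass is the constant m, while the Brown-York mass equals r(1 − √(1 − 2m/r)), which is at least m and tends to m as r grows.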
We close this section by noting that the concept of Brown-York quasi-
local mass can be adapted to the setting of 2-surfaces lying in four-dimen-
sional Lorentzian spacetimes (which are discussed in Chapter 7). This con-
cept, called the Wang-Yau quasi-local mass of a 2-surface, is defined in
terms of all possible isometric embeddings of the surface into Minkowski
4-space [WY09, LY06, CWY11]. This concept has been developed by
Po-Ning Chen, Mu-Tao Wang, and S.-T. Yau, among others. See the survey
paper [Wan15] for an introduction to the topic.
6.4. Bartnik data with η = 0

6.4.1. Static extensions. The following theorem (together with the lem-
ma following it) shows that unless the Bartnik data (Σn−1 , γ, 0) is a round
sphere, we cannot find a vacuum static extension, and consequently, by
Theorem 6.12, neither can we find a Bartnik minimizing extension (at least
for n = 3). This is why Conjecture 6.6 does not include the case η = 0.
Theorem 6.25 (Static uniqueness of Schwarzschild). Let (M n , g) be a com-

plete, one-ended vacuum static asymptotically flat manifold with nonempty
minimal boundary ∂M , such that the static potential f vanishes at ∂M , is
positive in the interior, and approaches 1 near infinity. Then there exists a
diffeomorphism $M \approx [(2m)^{\frac{1}{n-2}}, \infty) \times S^{n-1}$ such that $g = f^{-2}\,dr^2 + r^2\,d\Omega^2$
and $f = \sqrt{1 - \frac{2m}{r^{n-2}}}$ for some m > 0. In particular, (M, g) is isomorphic to
half of the Schwarzschild space.
Although we introduced this theorem in the context of the Bartnik prob-

lem, its true significance lies in its physical relevance. Roughly, it states that
the only “vacuum static black holes” are Schwarzschild. See Chapter 7 for
an explanation of how this fact fits into the larger study of black holes. The
theorem is generally credited to Werner Israel [Isr67], but a truly general
proof was first given by H. Müller zum Hagen, David C. Robinson, and
H. J. Seifert [MzHRS73] (see also [Rob77]). Their proof is nicely sum-
marized at the end of the survey paper [Rob09]. Later on, Gary Bunting
and A. K. M. Masood-ul-Alam discovered a fairly simple proof based on the
positive mass theorem [BMuA87]. This is the proof we will present, and
unlike the earlier results it works when ∂M has multiple components, which
is actually important for physical application of the theorem.
Proof. Since the vacuum static potential f is g-harmonic, Corollary A.38

implies that
\[
f = 1 + A|x|^{2-n} + O_2\left(|x|^{2-n-\gamma}\right)
\]
for some constant A and some γ > 0. We claim that A = −m, where
m is the ADM mass of g. To see this, we use our formula for mass from
Theorem 3.14 together with the vacuum static equations (Definition 6.7).

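\[
\begin{aligned}
m &= \lim_{r\to\infty} \frac{-1}{(n-1)(n-2)\omega_{n-1}} \int_{S_r} |x| \cdot \mathrm{Ric}(\nu, \nu) \, d\mu_{S_r} \\
&= \lim_{r\to\infty} \frac{-1}{(n-1)(n-2)\omega_{n-1}} \int_{S_r} |x| \cdot \frac{\nabla_\nu \nabla_\nu f}{f} \, d\mu_{S_r} \\
&= \lim_{r\to\infty} \frac{-1}{\omega_{n-1}} \int_{S_r} A|x|^{1-n} \, d\mu_{S_r} \\
&= -A.
\end{aligned}
\]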
Just as we did for our proof of Lemma 4.61, consider the two-ended
manifold $(\bar{M}, \bar{g})$ obtained by taking (M, g) and gluing it to an isometric
copy $(M', g')$ of itself along ∂M. That is, $(\bar{M}, \bar{g})$ is the “double” of (M, g),
obtained by reflecting it through its boundary. Since ∂M is totally geodesic
in (M, g), it follows that we can choose coordinates so that $\bar{g}$ is actually $C^2$
across the hypersurface where the gluing occurred. Let ω be the $\bar{g}$-harmonic
function that approaches 1 at the infinity of the original M and approaches
0 at the infinity of the new end $M'$. We use ω to conformally close the 0-end
by considering the metric $\tilde{g} = \omega^{\frac{4}{n-2}} \bar{g}$ on $\bar{M}$. The result is a new one-ended
asymptotically flat manifold $(\widetilde{M} = \bar{M} \cup \{\mathrm{pt}\}, \tilde{g})$. Note that $\bar{g}$ is scalar-flat
and ω is $\bar{g}$-harmonic, so $\tilde{g}$ is scalar-flat as well.
Moreover, by symmetry, we can compute ω directly in terms of f. If
$\Phi : M' \to M$ is the name of the aforementioned isometry, then we must
have $\omega = \frac{1}{2}(1 + f)$ on M and $\omega = \frac{1}{2}(1 - f \circ \Phi)$ on $M'$, since these formulas
give us a $\bar{g}$-harmonic function that satisfies the boundary conditions at the
two infinities. In particular, $\omega = 1 - \frac{m}{2}|x|^{2-n} + O_2(|x|^{1-n})$ in the M end, so
by Exercise 3.13, it follows that $m_{ADM}(\tilde{g}) = m_{ADM}(g) - m = 0$.
By rigidity of the positive mass theorem (Theorem 3.19), it follows that
$(\widetilde{M}, \tilde{g})$ must be Euclidean space. (See Remark 3.45 regarding the singular
point in $(\widetilde{M}, \tilde{g})$.) Reversing this construction, $(\bar{M}, \bar{g})$ is scalar-flat and globally
conformal to $\mathbb{R}^n \setminus \{0\}$ via a harmonic function. Since we know what
all of those harmonic functions are, it follows that $(\bar{M}, \bar{g})$ is Schwarzschild
of mass m, and (M, g) is one half of the Schwarzschild space of mass m.
Thus there exists a diffeomorphism $M \approx [(2m)^{\frac{1}{n-2}}, \infty) \times S^{n-1}$ such that
$g = \left(1 - \frac{2m}{r^{n-2}}\right)^{-1} dr^2 + r^2\,d\Omega^2$. Finally, by Exercise 6.8, we must have
$f = \sqrt{1 - \frac{2m}{r^{n-2}}}$ since it is the unique g-harmonic function approaching 1
at infinity and vanishing at $r = (2m)^{\frac{1}{n-2}}$.
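The harmonicity of this f can be checked numerically. For a metric of the form f^(−2) dr² + r² dΩ², the Laplacian of a radial function φ is f r^(1−n) (r^(n−1) f φ′)′, so f itself is harmonic exactly when the quantity r^(n−1) f f′ is constant in r. The following sketch (with hypothetical function names, and a central finite difference standing in for the derivative) evaluates that quantity, which should equal m(n − 2) up to discretization error.

```python
import math

def static_potential(r, m, n):
    """f(r) = sqrt(1 - 2m / r^(n-2)), defined for r^(n-2) > 2m."""
    return math.sqrt(1.0 - 2.0 * m / r ** (n - 2))

def radial_flux(r, m, n, h=1e-4):
    """Approximate r^(n-1) * f(r) * f'(r) using a central difference of step h."""
    df = (static_potential(r + h, m, n) - static_potential(r - h, m, n)) / (2.0 * h)
    return r ** (n - 1) * static_potential(r, m, n) * df

if __name__ == "__main__":
    for n in (3, 4, 5):
        # The printed values should be close to m * (n - 2) at every radius.
        print(n, [radial_flux(r, 1.0, n) for r in (3.0, 5.0, 10.0)])
```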
The applicability of the above theorem can be widened by the following

lemma.
Lemma 6.26. Let (M, g) be a complete, vacuum static asymptotically flat

manifold, possibly with boundary, and let Σ be an apparent horizon for one
of the ends. Assume that the vacuum static potential f is positive near the
infinity of that end. Then f vanishes at Σ and is strictly positive outside of
it.
In particular, this means that when η = 0, an extension as in the con-

clusion of Theorem 6.12 must satisfy the hypotheses of Theorem 6.25. Of
course, the lemma also applies if f is negative near infinity. Also, note that
since a static vacuum potential f is g-harmonic, if one assumes sublinear
growth of f at infinity, then f must approach a constant at infinity by
Corollary A.38, and as long as that constant is not zero, f will have a sign
near infinity.
Proof. Let Ω be a region enclosed by the apparent horizon Σ = ∂Ω. We

first claim that f ≥ 0 on M ∖ Ω. Suppose that f < 0 somewhere in
M ∖ Ω. Then Ω is properly contained in Ω1 := Ω ∪ {f < 0}. Since f is
positive near infinity, Ω1 is an enclosed region. Since the zero set {f = 0}
is smooth and totally geodesic by Lemma 6.10, it follows that ∂Ω1 is made
up of minimal pieces (coming from Σ and {f = 0}). As mentioned in the
proof of Theorem 4.7, ∂Ω1 has nonpositive mean curvature in a weak sense,
and therefore there exists a minimal hypersurface enclosing ∂Ω1 . But this
contradicts the outermost property of Σ, proving the claim.
We next consider the outward normal flow of hypersurfaces Σt with
speed f and initial condition Σ0 = Σ for a small amount of time. Suppose
that f > 0 on some part of Σ, so that this flow is nontrivial. Suppressing
the dependence on t in our notation, we can use Exercises 2.13 and 2.3 to
compute
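\[
\begin{aligned}
\frac{\partial H}{\partial t} &= DH|_{\Sigma_t}(f\nu) \\
&= -\Delta_{\Sigma_t} f - \left(|A|^2 + \mathrm{Ric}(\nu, \nu)\right) f \\
&= -\left(\Delta_g f - \nabla_\nu \nabla_\nu f + \langle \vec{H}, \nabla f \rangle\right) - \left(|A|^2 f + \nabla_\nu \nabla_\nu f\right) \\
&= -\langle \vec{H}, \nabla f \rangle - |A|^2 f,
\end{aligned}
\]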
where ν is the outward unit normal. Since f ≥ 0 outside Σ, it follows
from simple ODE analysis that HΣt ≤ 0 for all small t, which implies that
Σt has volume less than or equal to Σ, contradicting the strictly outward
minimizing property of Σ.
6.4.2. Calculation of Bartnik mass. We now turn to the calculation of

mB (Σ, γ, 0).
Proposition 6.27. Let n < 8. Given Bartnik data (Σn−1 , γ, 0), we have

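\[
m_B(\Sigma, \gamma, 0) \ge \frac{1}{2} \left( \frac{|\Sigma|}{\omega_{n-1}} \right)^{\frac{n-2}{n-1}} .
\]
Moreover, if (Σ, γ) is a round sphere, then we have equality, and the Schwarzschild
space of mass $\frac{1}{2} \left( \frac{|\Sigma|}{\omega_{n-1}} \right)^{\frac{n-2}{n-1}}$ is the unique Bartnik minimizing extension.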
Note that when n = 3, the right-hand side expression is the same as

mHaw (Σ, γ, 0).
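To make the right-hand side of Proposition 6.27 concrete, the following sketch evaluates it numerically. It assumes the standard formula ω_k = 2π^((k+1)/2) / Γ((k+1)/2) for the area of the unit k-sphere; the function names are hypothetical. For n = 3 the bound reduces to √(|Σ|/16π), and for the Schwarzschild horizon r = (2m)^(1/(n−2)) it returns m.

```python
import math

def unit_sphere_area(k):
    """Area omega_k of the unit k-sphere in R^(k+1)."""
    return 2.0 * math.pi ** ((k + 1) / 2.0) / math.gamma((k + 1) / 2.0)

def penrose_bound(area, n):
    """Lower bound (1/2) * (|Sigma| / omega_{n-1})^((n-2)/(n-1)) from Proposition 6.27."""
    omega = unit_sphere_area(n - 1)
    return 0.5 * (area / omega) ** ((n - 2) / (n - 1))

if __name__ == "__main__":
    m = 1.5
    for n in range(3, 8):
        horizon_radius = (2.0 * m) ** (1.0 / (n - 2))
        horizon_area = unit_sphere_area(n - 1) * horizon_radius ** (n - 1)
        print(n, penrose_bound(horizon_area, n))  # should reproduce m in each dimension
```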
Proof. This is a simple consequence of the Penrose inequality (Theorem

4.54). Let (M, g) be an admissible extension of (Σ, γ, 0). Then the mean
curvature of ∂M = Σ with respect to g is less than or equal to 0. By The-
orem 4.7, either Σ is already an apparent horizon in (M, g), or else there is
some other horizon enclosing it, but the latter would be a violation of condition
(5) of Definition 6.1. Therefore we can apply the Penrose inequality to
see that $m_{ADM}(g) \ge \frac{1}{2} \left( \frac{|\Sigma|}{\omega_{n-1}} \right)^{\frac{n-2}{n-1}}$. The result now follows since $m_B(\Sigma, \gamma, 0)$
is the infimum of $m_{ADM}(g)$ over all such extensions.
If (Σ, γ) happens to be a round sphere, then it is easy to see that the
Schwarzschild space of mass $m = \frac{1}{2} \left( \frac{|\Sigma|}{\omega_{n-1}} \right)^{\frac{n-2}{n-1}}$ is an admissible extension
of (Σ, γ, 0), and therefore $m_B(\Sigma, \gamma, 0) \le m$. Therefore we have equality,
and the Schwarzschild space of mass m is a Bartnik minimizer. Any other
minimizing extension would also have to have mass equal to $\frac{1}{2} \left( \frac{|\Sigma|}{\omega_{n-1}} \right)^{\frac{n-2}{n-1}}$
and Σ as its horizon, thus rigidity of the Penrose inequality tells us that no
other minimizers are possible.
Next we focus on the case n = 3. We have the following remarkable

theorem.
Theorem 6.28 (Mantoulidis-Schoen [MS15]). Let γ be a metric on the
sphere $S^2$ such that $\lambda_1(-\Delta_\gamma + K) > 0$, where K is the Gauss curvature of
γ. Then
\[
m_B(S^2, \gamma, 0) = \sqrt{\frac{|\Sigma|}{16\pi}} .
\]
By Remark 6.3, we also obtain $m_B(S^2, \gamma, \eta) \le \sqrt{\frac{|\Sigma|}{16\pi}}$ for all η ≥ 0. Note
that the eigenvalue condition holds automatically if K ≥ 0. The condition
λ1 (−Δγ + K) > 0 is a fairly natural one. As explained in the proof of
Proposition 6.27, if (Σ, γ, 0) admits any admissible extension (M, g), then Σ
is a horizon in that extension. Looking at formula (2.15), we see that
λ1 (−Δγ + K) ≥ λ1 (LΣ ) ≥ 0,
where the second inequality is the stability inequality, so that the only rel-
evant case that we are excluding is the borderline case λ1 (−Δγ + K) = 0.
As discussed earlier, we know that if (Σ2 , γ) is not round, then there
can be no Bartnik minimizing extension, so instead the theorem is proved
by constructing a minimizing sequence of extensions. It is easy to see that
the previous theorem follows immediately from the following construction,
which is itself interesting.
Proposition 6.29 (Mantoulidis-Schoen [MS15]). Given any sphere $(S^2, \gamma)$
with $\lambda_1(-\Delta_\gamma + K) > 0$, let $\Sigma_r$ denote the surface {r} × Σ in the space
M := [0, ∞) × Σ. Then for any $m > \sqrt{\frac{|\Sigma|}{16\pi}}$, there exists a metric g on M
such that
(1) $R_g \ge 0$,
(2) $\Sigma_0$ is minimal in M and its induced metric is isometric to γ,
(3) there exists $r_0$ such that over $(r_0, \infty) \times \Sigma$, g is identically equal to the
Schwarzschild metric with mass m (with r as a radial coordinate
for Schwarzschild),
(4) for r > 0, $\Sigma_r$ has positive mean curvature.
Note that the enumerated properties guarantee that (M, g) is an admis-

sible extension of (Σ, γ, 0).
Part 2
Initial data sets

Chapter 7
Introduction to general
relativity
We provide a brief introduction to general relativity, from the perspective of

Riemannian geometry. For an extensive treatment of general relativity, see
traditional physics textbooks such as [Wal84, MTW73]. For a more math-
ematical treatment of general relativity that is more closely related to the
materials presented here, see [CB15, O’N83, HE73]. For a good introduc-
tion to causality theory leading up to the Penrose incompleteness theorem
(Theorem 7.29), see [Gal14]. For an excellent survey of mathematical rela-
tivity, see [CGP10].
7.1. Spacetime geometry

7.1.1. Lorentzian geometry. First, the basic setting of general relativity
is Lorentzian geometry. Every symmetric bilinear form A on Rn can be
diagonalized in the sense that there exists a basis e1 , . . . , en such that if we
express arbitrary vectors u = ui ei and v = v i ei in that basis (using Einstein
notation), then A(u, v) = εij ui v j , where εij is a diagonal matrix in which
each diagonal entry is 1, −1, or 0. The number of 1’s, −1’s, and 0’s appearing
as diagonal entries is independent of the choice of e1 , . . . , en . (Prove these
facts.) The basis e1 , . . . , en generalizes the concept of orthonormal basis in
the case where A is an inner product and εij = δij is the identity matrix.
We say that A is nondegenerate if none of the diagonal entries of εij are
zero. The number of 1’s and −1’s that occur for a nondegenerate symmetric
bilinear form is referred to as its signature.
Definition 7.1. Let M be a manifold. We say that g is a pseudo-

Riemannian metric on M if it assigns to each point p a nondegenerate sym-
metric bilinear form on Tp M and does so in a way that smoothly depends
on p. Or in other words, $g \in C^\infty(T^*M \odot T^*M)$ and g is nondegenerate at
each point. We say that such a g is Lorentzian if its signature has exactly
one −1 in it.1
We will adopt the (unusual) convention of using a boldface g as our

default variable for a Lorentzian metric and use boldface for all quantities
computed using g. The most important Lorentzian metric is the Minkowski
metric η on Rn+1 . It is convenient to index the components of Rn+1 by the
numbers 0, 1, . . . , n. We define η to be
\[
\eta = -(dx^0)^2 + (dx^1)^2 + \cdots + (dx^n)^2 .
\]
Note that in addition to being a Lorentzian metric, we can also think of

η as a nondegenerate symmetric bilinear form. Explicitly, for any u ∈ Rn+1 ,
we can write uμ as its μth component in the standard basis. (We typically
use Greek letters for indices running from 0 up to n, whereas Latin letters
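continue to be used for indices running from 1 to n.) Then for any $u, v \in \mathbb{R}^{n+1}$,
we have $\eta(u, v) = \eta_{\mu\nu} u^\mu v^\nu$, where
\[
\eta_{\mu\nu} =
\begin{cases}
-1 & \text{for } \mu = \nu = 0, \\
1 & \text{for } \mu = \nu > 0, \\
0 & \text{for } \mu \ne \nu.
\end{cases}
\]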
Note that every Lorentzian metric g is locally modeled on the Minkowski
metric, in the sense that there always exists a basis at each point such that
gμν = ημν at that point. This is analogous to how every Riemannian metric
is locally modeled on the Euclidean metric and can always be written as δij
after choosing an orthonormal basis.
Since we usually think of x0 as a time coordinate, sometimes we write t in
place of x0 when the meaning is clear. In physics, we are primarily interested
in the case n = 3 (for three space dimensions plus one time dimension), but
in this book we will work in general dimension whenever possible.
Special relativity is essentially the principle that all physics in Rn+1
should respect the bilinear form η. Or more precisely, all physics should be
invariant under Lorentz transformations, which are defined to be elements
of GL(n + 1) that preserve η. They form the Lorentz group O(1, n). Of
course, the Lorentz group contains the orthogonal group O(n) acting on the
spatial components. It also contains the orientation-reversing time-reversal
symmetry that maps t → −t.
1 Some physics texts instead use the convention that Lorentzian signature has exactly one 1
in it.
There is another important type of Lorentz symmetry called boosts. For

every constant v ∈ (−1, 1), consider the map
\[
t \mapsto \frac{t - v x^1}{\sqrt{1 - v^2}}, \qquad
x^1 \mapsto \frac{x^1 - v t}{\sqrt{1 - v^2}}
\]
that leaves x2 , . . . , xn unchanged. This is a boost in the x1 direction with
velocity v, but we can analogously define boosts in any spatial direction.
Check that boosts preserve η. Less obvious is the fact that all of O(1, n) can
be generated by O(n), time-reversal, and boosts. The group O(1, n) breaks
into four connected components, one of which contains the identity, called
the restricted Lorentz group SO+ (1, n), one of which contains time-reversal,
one of which contains orientation-reversing symmetries of O(n), and the
last of which contains the product of time-reversal with a spatial reflection.
The union of the first and the fourth of these components forms the group
SO(1, n) of orientation-preserving Lorentz transformations, while the union
of the first and the third forms the group O+ (1, n) of orthochronous Lorentz
transformations.
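The claim that boosts preserve η can be spot-checked numerically before proving it by hand. The sketch below is an illustration only: vectors are plain tuples with the time component first, units are chosen so that c = 1, and the function names are hypothetical.

```python
import math

def eta(u, w):
    """Minkowski bilinear form: minus the time product plus the spatial dot product."""
    return -u[0] * w[0] + sum(a * b for a, b in zip(u[1:], w[1:]))

def boost_x1(u, v):
    """Apply the boost in the x^1 direction with velocity v, where |v| < 1."""
    if not -1.0 < v < 1.0:
        raise ValueError("velocity must satisfy |v| < 1")
    lorentz_factor = 1.0 / math.sqrt(1.0 - v * v)
    t, x1 = u[0], u[1]
    # Only t and x^1 are mixed; the remaining components are left unchanged.
    return (lorentz_factor * (t - v * x1), lorentz_factor * (x1 - v * t)) + tuple(u[2:])

if __name__ == "__main__":
    u = (2.0, 1.0, -3.0, 0.5)
    w = (0.7, -1.2, 4.0, 2.0)
    for v in (-0.9, 0.0, 0.3, 0.99):
        # The two printed numbers should agree up to rounding error.
        print(v, eta(u, w), eta(boost_x1(u, v), boost_x1(w, v)))
```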
In particular, we can count the dimension of O(1, n). For example, the
dimension of O(1, 3) is 6, with three dimensions coming from O(3) and the
other three coming from boosts. We can also consider all maps of Rn+1 that
preserve the Minkowski metric on Rn+1 . These maps form the Poincaré
group, which is a semidirect product of the Lorentz group and translations
of Rn+1 . (This is analogous to how the isometry group of the Euclidean
plane is a semidirect product of the orthogonal group and translations.)
Note that when n = 3, the Poincaré group has 10 dimensions.
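The dimension counts above follow a simple pattern: O(n) contributes n(n − 1)/2 dimensions, there is one boost for each of the n spatial directions, and the translations of R^(n+1) contribute n + 1 more. A minimal sketch of this arithmetic, with hypothetical function names:

```python
def lorentz_group_dim(n):
    """Dimension of O(1, n): rotations of the n spatial directions plus n boosts."""
    rotations = n * (n - 1) // 2
    boosts = n
    return rotations + boosts

def poincare_group_dim(n):
    """Dimension of the Poincare group: Lorentz transformations plus translations of R^(n+1)."""
    return lorentz_group_dim(n) + (n + 1)

if __name__ == "__main__":
    for n in (1, 2, 3, 4):
        print(n, lorentz_group_dim(n), poincare_group_dim(n))
```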
It is fairly intuitive that physical laws should be invariant under O(3)
and under spatial translations, as well as under time translation and time-
reversal. That is, if one performs a rotation or translation of the coordinate
system, one expects to be able to use the exact same equations governing
physical phenomena, as long as all quantities are rewritten in terms of the
new coordinate system. In classical physics we also expect that physics
should be invariant under a Galilean transformation that maps
t → t,
x1 → x1 − vt
and leaves x2 and x3 unchanged, where v is a constant. That is, if you

are moving at constant velocity v in the x1 direction and decide to use a
new coordinate system in which your position is fixed, then you should be
able to use the same equations in your coordinate system. This is especially
important when you consider that everything is always moving relative to

other things, so there should never be a preferred reference frame (that is,
choice of coordinate system for space and time), and therefore we need some
principle for moving between different reference frames. “Galilean relativity”
was once believed to accomplish this.
The starting point of special relativity was the observation that Max-
well’s equations for electromagnetism [Wik, Maxwell’s equations], in ad-
dition to being invariant under Euclidean isometries (and time translation
and time-reversal), were also invariant under boosts, where we use units in
which the speed of light c is equal to 1. Explicitly, if one includes the factors
of c, the boost above would be written as
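\[
t \mapsto \frac{t - \frac{v x^1}{c^2}}{\sqrt{1 - \left(\frac{v}{c}\right)^2}}, \qquad
x^1 \mapsto \frac{x^1 - v t}{\sqrt{1 - \left(\frac{v}{c}\right)^2}} .
\]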
In particular, Maxwell’s equations are not invariant under Galilean trans-
formations. Although the Lorentz invariance of Maxwell’s theory was well-
understood by others, one of Einstein’s insights was to take the Lorentz
invariance seriously as an underlying feature of reality. Note that for v ≪ c,
the Galilean transformation is a good approximation of the corresponding
boost. Using boosts, one can derive basic features of special relativity such
as time dilation and length contraction. One consequence of the Lorentz
invariance is that although time has a “distinguished status” compared to
the spatial directions, one can no longer meaningfully separate space and
time, since boosts “mix” the two in some sense.
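The two remarks above, that the Galilean transformation approximates a boost when v ≪ c and that boosts produce time dilation, can be illustrated with a short computation. This is a sketch under stated assumptions: it works in one spatial dimension, takes c in metres per second, and uses hypothetical function names.

```python
import math

SPEED_OF_LIGHT = 299792458.0  # metres per second

def boost_with_c(t, x1, v, c=SPEED_OF_LIGHT):
    """Boost in the x^1 direction with the factors of c written out."""
    denom = math.sqrt(1.0 - (v / c) ** 2)
    return ((t - v * x1 / c ** 2) / denom, (x1 - v * t) / denom)

def galilean(t, x1, v):
    """Galilean transformation to a frame moving with velocity v."""
    return (t, x1 - v * t)

if __name__ == "__main__":
    # At everyday speeds the two transformations are nearly indistinguishable.
    print(boost_with_c(1.0, 100.0, 30.0), galilean(1.0, 100.0, 30.0))
    # Time dilation: one tick of a clock at rest at x^1 = 0, seen from a frame
    # moving at 0.6c, takes 1 / sqrt(1 - 0.36) = 1.25 units of the new time.
    print(boost_with_c(1.0, 0.0, 0.6 * SPEED_OF_LIGHT)[0])
```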
In general relativity, we take things a step further by asking for physics
to be invariant under general coordinate transformations, but we still want
our geometry to be locally modeled on special relativity (that is, Minkowski
space). To that end, the setting of general relativity is a Lorentzian manifold
(Mn+1 , g). Given p ∈ M, a tangent vector v ∈ Tp M is said to be spacelike
if g(v, v) > 0, timelike if g(v, v) < 0, or null if g(v, v) = 0. Note that the
null vectors form a double-sided cone in Tp M, called the null cone or light
cone, which separates timelike vectors from spacelike ones. See Figure 7.1.
In a Lorentzian manifold, we say that a curve is spacelike, timelike, or null
if its tangent directions are always spacelike, timelike, or null, respectively.
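In Minkowski space, the trichotomy just described can be computed directly from the sign of η(v, v). The sketch below is illustrative only; it uses a small tolerance to absorb rounding error, the function names are hypothetical, and, following the definition literally, the zero vector is reported as null.

```python
def eta(u, w):
    """Minkowski bilinear form with the time component listed first."""
    return -u[0] * w[0] + sum(a * b for a, b in zip(u[1:], w[1:]))

def causal_character(vec, tol=1e-12):
    """Classify a vector of Minkowski space as spacelike, timelike, or null."""
    q = eta(vec, vec)
    if q > tol:
        return "spacelike"
    if q < -tol:
        return "timelike"
    return "null"

if __name__ == "__main__":
    for vec in ((1.0, 0.0, 0.0, 0.0), (0.0, 1.0, 0.0, 0.0), (1.0, 1.0, 0.0, 0.0)):
        print(vec, causal_character(vec))
```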
7.1.2. Causal structure and global hyperbolicity. A physical object

traces out a “worldline” in M, that is, a curve2 in the Lorentzian manifold.
2 In this section, we will assume our curves to be piecewise smooth unless stated otherwise.
Figure 7.1. The null cone in Tp M.
For massive objects, this curve is always timelike, and for massless objects
(such as photons), the curve is always null (meaning that the object is
traveling at the speed of light). Because of this, we use the word causal to
describe vectors or curves that can be timelike or null. Essentially, something
that happens at a certain point in M can affect another point in M only if
they can be joined by a causal curve. We would like to think of one of those
points as being in the future of the other. In order to do this, we must choose
which side of the null cone at each p ∈ M is the future and which is the past.
If such a choice can be made smoothly and consistently over all of M, that choice
is called a time-orientation of (M, g). Note that a choice of global timelike
vector field on M determines a time orientation. Technically, choosing a
time-orientation is equivalent to reducing the structure group from O(1, n)
down to O+ (1, n).
Definition 7.2. When we refer to a spacetime in this text, we will mean a

connected, time-oriented Lorentzian manifold.
Given a time orientation, we can separate the nontrivial timelike and null
vectors into those that are future pointing and those that are past pointing.
Given a point p ∈ M, we define J + (p) (respectively, J − (p)) to be the causal
future (respectively, causal past) of p, meaning the set of all points that can
be reached from p by future-pointing (respectively, past-pointing) causal
curves. The collection of sets J + (p) and J − (p) can be referred to as the
causal structure of a spacetime (M, g). We also define I + (p) (respectively,
I − (p)) to be the chronological future (respectively, chronological past) of
p, meaning the set of all points that can be reached from p by future-pointing
(respectively, past-pointing) timelike curves. Given a set S, we
define $J^\pm(S) = \bigcup_{p \in S} J^\pm(p)$ and $I^\pm(S) = \bigcup_{p \in S} I^\pm(p)$. As a convention, we
consider $p \in J^\pm(p)$ but $p \notin I^\pm(p)$.
It turns out that the causal structure of a spacetime is equivalent to
the conformal structure together with the time orientation. To see this,
note that the information contained in the causal structure is equivalent to
knowledge of what the time-oriented null cones are at each point. Next, a
simple argument (as in [HE73, p. 61], for example) shows that at any point
p ∈ M the null cone in Tp M determines the Lorentzian metric on Tp M up
to a constant (and vice versa, of course).
Given a causal curve γ : [a, b] −→ M, we can define its length to be
\[
L(\gamma) = \int_a^b \sqrt{-g(\gamma'(t), \gamma'(t))} \, dt .
\]
For a timelike curve, the length measures proper time, which is the amount
of time that the object experiences, or, in other words, how much time has
passed according to a clock traveling with the object. A timelike curve can
be parameterized by proper time. A test particle moving under the influence
of gravity and no other forces will travel along a geodesic (timelike if it is
massive and null if it is massless). Note that the Levi-Civita connection
and geodesics in a Lorentzian manifold are defined in the exact same way
as in Riemannian manifolds. (Raising and lowering of indices is also defined
the same way.) However, in contrast to Riemannian geometry, a timelike
geodesic locally maximizes length if and only if it is free of conjugate points.
In the rest of this subsection we study the existence of causal geodesics.
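The maximizing behavior of timelike geodesics can be seen already in Minkowski space, where geodesics are straight lines. The sketch below computes the length of a piecewise linear causal curve by summing √(−η(Δ, Δ)) over its segments, in units with c = 1; the function names are hypothetical. A straight worldline between two events has more proper time than a broken one with the same endpoints (the familiar "twin paradox").

```python
import math

def eta(u, w):
    """Minkowski bilinear form with the time component listed first."""
    return -u[0] * w[0] + sum(a * b for a, b in zip(u[1:], w[1:]))

def proper_time(events):
    """Length of the piecewise linear curve through the given events.

    Raises ValueError if some segment is spacelike, since the length is
    only defined here for causal curves.
    """
    total = 0.0
    for p, q in zip(events, events[1:]):
        delta = [b - a for a, b in zip(p, q)]
        s = -eta(delta, delta)
        if s < 0.0:
            raise ValueError("segment is spacelike")
        total += math.sqrt(s)
    return total

if __name__ == "__main__":
    straight = [(0.0, 0.0), (10.0, 0.0)]
    detour = [(0.0, 0.0), (5.0, 3.0), (10.0, 0.0)]
    print(proper_time(straight), proper_time(detour))
```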
Lemma 7.3. Given a spacetime (M, g), if γ is a causal curve from p to
q in M that is not null everywhere, then γ can be deformed to a timelike
curve from p to q.
We omit the proof, but note that it ultimately rests on the fact that it
holds in small balls covering the curve.
Proposition 7.4. Given a spacetime (M, g), let p, q ∈ M such that
$q \in J^+(p) \setminus I^+(p)$. Then there exists a future null geodesic from p to q.
Idea of the proof. Observe that Lemma 7.3 implies that if $q \in J^+(p) \setminus I^+(p)$,
then they are joined by a null curve. Suppose that null curve joining
p to q is not geodesic. In this case we can fairly explicitly deform the curve
to a causal curve from p to q that is timelike somewhere. Think about the
situation in Minkowski space in order to understand the intuition behind
this claim. (If a null curve is not geodesic, then it must “spiral” as it moves
upward in time.) See [O’N83, Proposition 46, HE73, Proposition 4.5.10] for
details. But if there is a causal curve from p to q that is timelike somewhere,
then Lemma 7.3 says that q ∈ I + (p), which is a contradiction.
We will also be interested in various types of hypersurfaces of (M, g).

We say that a hypersurface is timelike if its normal vector is always spacelike,
spacelike if its normal vector is always timelike, or null if its normal vector
is always null. Keep in mind that in the last case, that null normal vector
will actually be tangent to the hypersurface. Equivalently, a hypersurface
is timelike if g induces a Lorentzian metric on it, spacelike if g induces a
Riemannian metric on it, or null if g induces a degenerate bilinear form on
it.
While Proposition 7.4 can be used to establish the existence of null geo-
desics, it turns out that in order to construct maximizing timelike geodesics,
one usually needs the hypothesis of global hyperbolicity, which plays a role in
spacetime geometry similar to that of completeness in Riemannian geometry.
Completeness is often an undesirable assumption for spacetimes, because
important examples such as the Schwarzschild spacetime are incomplete.
Definition 7.5. Given a spacetime $(\mathbf{M}, \mathbf{g})$, we say that a smooth hypersurface
$M$ is a Cauchy hypersurface if every inextendible causal curve must pass
through $M$ exactly once. A spacetime that contains a Cauchy hypersurface
is called globally hyperbolic.
Here, an inextendible causal curve γ : I −→ M is simply one that cannot

be extended to a causal curve on a larger domain. Be aware that there is
another common definition of Cauchy hypersurface used in the literature
that is broader than this one. Interestingly, although we define a Cauchy
hypersurface to be a smooth hypersurface for simplicity, more generally one
can actually show that any set satisfying the Cauchy hypersurface condition
is a closed C 0 hypersurface in M. See [Gal14].
Proposition 7.6. Given a globally hyperbolic spacetime (M, g), let p, q ∈

M such that q ∈ I + (p). Then there exists a length-maximizing future-
pointing timelike geodesic from p to q.
We omit the proof, but as one might expect, it bears some similarity to
the proof that there exists a length-minimizing geodesic connecting any two
points in a Riemannian manifold. See [Gal14] or [HE73, Section 6.7] for
details. The global hyperbolicity is used because it allows us to execute the
needed compactness argument, though this is not at all obvious from the
way we defined global hyperbolicity. (There is another common definition
of global hyperbolicity that is known to be equivalent to ours [Ger70].)
The following proposition of R. Geroch [Ger70] gives us a simple picture of

globally hyperbolic spacetimes.
Proposition 7.7. If $M$ is a Cauchy hypersurface in $(\mathbf{M}, \mathbf{g})$, then there
exists a homeomorphism between $\mathbf{M}$ and $\mathbb{R} \times M$ that provides a foliation
of $\mathbf{M}$ by Cauchy hypersurfaces. Moreover, any Cauchy hypersurface in $\mathbf{M}$
must be homeomorphic to $M$.
See [Gal14, HE73, Section 6.6] for details. Observe that one simple
consequence of the above proposition is that globally hyperbolic spacetimes
have the desirable property that there are no closed causal curves (which
would violate the physical concept of “causality”).
7.1.3. Static spacetimes. We can construct simple examples of space-

times by simply taking a Lorentzian warped product of a Riemannian mani-
fold with a line. That is, given a Riemannian manifold (M, g) and a positive
warping function N on M , we consider the Lorentzian warped product met-
ric
\[
\mathbf{g} = -N^2 \, dt^2 + g \tag{7.1}
\]
on R × M . Since N and g are independent of the time coordinate t, this
may be regarded as a spacetime metric that “does not change in time,”
and consequently we say that g is static. Formally, we have the following
definition.
Definition 7.8. A spacetime is static if it admits a global, nonvanishing
timelike Killing field whose orthogonal distribution is involutive.
It is easy to see that a spacetime is static if and only if it can be locally

written in the form (7.1), where the nonvanishing Killing field is just ∂t ,
which is orthogonal to the level sets of t.
7.2. The Einstein field equations

To summarize the previous section, we view our universe as a manifold Mn+1
equipped with a Lorentzian metric g, which we regard as the “gravitational
field.” The data (M, g) determines how test particles move under the in-
fluence of gravity. This leads us to the natural question of what determines
the spacetime metric itself. The metric ought to somehow be determined
by the distribution of gravitational sources. Those sources are represented
by a symmetric (0, 2)-tensor T called the stress-energy tensor, rather than
a single mass density function as in Newtonian gravity. If we regard a
tangent vector v ∈ Tp M as an “observer” at the spacetime point p, then
T (v, ·) represents the energy-momentum density of the source as seen by the
observer v. Given this tensor T , the spacetime metric g must satisfy the
Einstein (field) equations,
\[
\mathrm{Ric} - \frac{1}{2} R g = (n - 1)\omega_{n-1} T,
\]
where the Riemann, Ricci, and scalar curvatures of a Lorentzian metric g
are defined exactly the same as for a Riemannian metric.3 Or we could
simply write G = (n − 1)ωn−1 T , where G is the Einstein tensor of g.
An important generalization of the Einstein equations is the Einstein
field equations with cosmological constant Λ,
\[
\mathrm{Ric} - \frac{1}{2} R g + \Lambda g = (n - 1)\omega_{n-1} T,
\]
where Λ is a constant. In this book, we will focus on the case Λ = 0.
7.2.1. Motivation for the Einstein equations. We will first give some
heuristic physical reasoning to motivate the Einstein equations. (Our discus-
sion here loosely follows that of [Ger13].) We have decided that in general
relativity, we want test particles to follow geodesics. We know that curvature
shows up when we look at variations of geodesics. That is, if $\gamma_s(t)$
is a family of geodesics through $\gamma_0 = \gamma$ and $X = \left.\frac{\partial\gamma}{\partial s}\right|_{s=0}$ is its first-order
variation, then the Jacobi equation [Wik, Jacobi field] states that
\[
\langle X'', Y \rangle = -\mathrm{Riem}(X, \gamma', Y, \gamma'), \tag{7.2}
\]
where Y is any other vector field along γ, and the primes denote (covariant)
differentiation in the t-variable.
Let us compare this to what happens in Newtonian gravity (in the n = 3
case). Recall (from Section 3.1.3 or otherwise) that in Newtonian gravity,
there is a gravitational potential function $V : \mathbb{R}^3 \to \mathbb{R}$ satisfying $\Delta V = 4\pi\rho$,
where ρ represents the mass density of all sources. The acceleration of
any test particle is −∇V. Therefore, if we imagine a family of Newtonian
trajectories $\gamma_s(t)$ through $\gamma_0 = \gamma$ with first-order variation $X = \left.\frac{\partial\gamma}{\partial s}\right|_{s=0}$,
then it is straightforward to derive that
\[
X'' \cdot Y = -\nabla_Y \nabla_X V,
\]
where Y is any vector field along γ and the left side is just dot product. In
Newtonian gravity, we know that the trace of the right side of the above
equation (over the X and Y slots) is −4πρ. If we take the trace of equation
(7.2) (noting that the full trace over X and Y will give the same result as
tracing over the spatial directions orthogonal to γ′), we see that the quantity
Ric(γ′, γ′) should correspond to 4πρ.
3 When n = 3, the constant $(n - 1)\omega_{n-1}$ is just 8π. For n > 3, there does not seem to
be a universally accepted convention for what this constant should be, but ours is chosen to be
consistent with our earlier definition of ADM mass (Definition 3.9). Unfortunately, Exercise 7.9
suggests a different convention based on physics, but we have decided not to use this convention
because it would require an extra factor of (n − 2) in many of our formulas.
Of course, $\mathrm{Ric}(\gamma', \gamma')$ depends on the observer γ while 4πρ does not.
This dependence on observer is where the stress-energy tensor T comes
in. As described above, the energy density of the sources, as seen by the
observer γ, is $T(\gamma', \gamma')$, suggesting that ρ should be identified with $T(\gamma', \gamma')$.
However, one could also reasonably identify ρ with $-\operatorname{tr} T$ in the Newtonian
limit. Compromising between these two candidates for a timelike observer
suggests that we take
$$\mathrm{Ric} = 4\pi\bigl(rT + (1-r)(\operatorname{tr} T)\,g\bigr)$$
for some constant r. From a physical perspective, we would like T to be a
conserved quantity, that is, div T = 0.
Exercise 7.9. Show that, in order for div T = 0 to be consistent with the
contracted second Bianchi identity (Exercise 1.10), we should choose r = 2,
and hence
$$\mathrm{Ric} - \tfrac{1}{2} R g = 8\pi T.$$
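One way to see where r = 2 comes from is to take the divergence of both sides of the proposed equation. The following is only a sketch for n = 3, so that the trace of g is 4; it assumes div T = 0 and the contracted second Bianchi identity in the form div Ric = ½ dR.

```latex
% Sketch for n = 3 (spacetime dimension 4, so tr_g g = 4).
% Assumptions: div T = 0 and the contracted Bianchi identity div Ric = (1/2) dR.
\begin{align*}
\mathrm{Ric} &= 4\pi\bigl(rT + (1-r)(\operatorname{tr}T)\,g\bigr), \\
R &= 4\pi\bigl(r + 4(1-r)\bigr)\operatorname{tr}T = 4\pi(4-3r)\operatorname{tr}T, \\
\operatorname{div}\mathrm{Ric} &= 4\pi(1-r)\,d(\operatorname{tr}T), \qquad
\tfrac12\,dR = 2\pi(4-3r)\,d(\operatorname{tr}T).
\end{align*}
% Equating the last two expressions (when d(tr T) is not identically zero):
\[
4\pi(1-r) = 2\pi(4-3r) \quad\Longrightarrow\quad r = 2 .
\]
% With r = 2 we get R = -8\pi \operatorname{tr} T, and therefore
\[
\mathrm{Ric} - \tfrac12 R\,g
  = 4\pi\bigl(2T - (\operatorname{tr}T)\,g\bigr) + 4\pi(\operatorname{tr}T)\,g
  = 8\pi T .
\]
```

Repeating the same computation with trace n + 1 in place of 4 gives r = (n − 1)/(n − 2), which is the source of the extra factor of (n − 2) mentioned in footnote 3.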
7.2.2. Lagrangian formulation and matter models. The simplest case
of the Einstein equations is the vacuum case when T is identically zero. This
means that there are no sources at all for the gravitational field. Minkowski
space is our basic model of empty space (with Λ = 0), but one of the most
important features of general relativity is that there is an incredibly rich
theory even in the vacuum case.
The vacuum Einstein field equations can also be derived from an action
principle. The Einstein-Hilbert action is the functional
$$S[g] = \int_M R \, d\mu_g.$$
(Note that for a Lorentzian metric, the expression for $d\mu_g$ with respect to a
local frame involves $\sqrt{-\det g}$ instead of $\sqrt{\det g}$, since $\det g$ will be
negative.) It is easy to see from a formal computation that the Euler-Lagrange
equations for the Einstein-Hilbert action are precisely the vacuum Einstein
field equations. That is, the variation of S at g in the direction of every
compactly supported variation of g vanishes if and only if G = 0. (Check
that this follows immediately from Exercise 1.18 and Proposition 1.3.) The
Einstein field equations with cosmological constant Λ arise from using the
action $S[g] = \int_M (R - 2\Lambda)\, d\mu_g$.
Finally, observe that the vacuum Einstein equations G = 0 imply that
R = 0, and thus the vacuum Einstein equations can be rewritten as
Ric = 0.
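The step from G = 0 to R = 0 is a one-line trace computation. As a sketch, it uses only that the spacetime has dimension n + 1, so that the trace of g is n + 1:

```latex
% Trace of the Einstein tensor G = Ric - (1/2) R g in spacetime dimension n + 1.
\[
\operatorname{tr}_g G = R - \tfrac12 R\,(n+1) = -\tfrac{n-1}{2}\,R .
\]
% Hence, for n >= 2, G = 0 forces R = 0, and then Ric = G + (1/2) R g = 0.
% Conversely, Ric = 0 gives R = 0 and therefore G = 0.
```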
More generally, if one wants to build a model of general relativity in
which other fields interact with gravity, then one would add extra terms to
the Einstein-Hilbert action for those fields, and then the resulting Euler-
Lagrange equations will determine what the stress-energy tensor T is, as
well as give equations governing those other fields: the Einstein equations
coupled with other field equations such as Maxwell, Klein-Gordon, Yang-
Mills, Vlasov, etc. These other fields are generally referred to as “matter
fields.”
As a quick example of how this works, if we wish to include the electro-
magnetic field, we introduce an electric potential 1-form A and define the
electromagnetic field to be F = dA. If we define an action
$$S[g, A] = \int_M \bigl(R - F_{\mu\nu}F^{\mu\nu}\bigr)\, d\mu_g,$$
then its Euler-Lagrange equations from the variation of g will be
$$G_{\mu\nu} = 2\Bigl(F_{\mu\alpha}F_{\nu}{}^{\alpha} - \tfrac{1}{4} F_{\alpha\beta}F^{\alpha\beta} g_{\mu\nu}\Bigr),$$
so that we can identify the right side as $8\pi T_{\mu\nu}$. The Euler-Lagrange
equations from the variation of A will be
$$\nabla_\mu F^{\mu\nu} = 0.$$
These equations are together known as the Einstein-Maxwell equations, and
solutions are called electrovacuum since there are no sources.
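A quick consequence worth recording is that the electromagnetic source term is trace-free when n = 3. The following sketch takes the trace of the right side of the Einstein-Maxwell equation in spacetime dimension n + 1; the only inputs are the index conventions already used above.

```latex
% Trace of the electromagnetic source term in spacetime dimension n + 1.
\[
g^{\mu\nu}\Bigl(F_{\mu\alpha}F_{\nu}{}^{\alpha}
   - \tfrac14 F_{\alpha\beta}F^{\alpha\beta}\, g_{\mu\nu}\Bigr)
 = \Bigl(1 - \tfrac{n+1}{4}\Bigr) F_{\alpha\beta}F^{\alpha\beta}
 = \tfrac{3-n}{4}\, F_{\alpha\beta}F^{\alpha\beta}.
\]
% For n = 3 this vanishes, so tr G = -R = 0, and the first equation reduces to
\[
R_{\mu\nu} = 2\Bigl(F_{\mu\alpha}F_{\nu}{}^{\alpha}
   - \tfrac14 F_{\alpha\beta}F^{\alpha\beta}\, g_{\mu\nu}\Bigr).
\]
```

In particular, for n = 3 every electrovacuum solution has vanishing scalar curvature, just as in the vacuum case.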
7.2.3. The Schwarzschild spacetime. Selecting a specific matter field
model is necessary if one wishes to solve the Einstein field equations.
Historically, the first and most important nontrivial solution of the vacuum
Einstein equations was the Schwarzschild metric. The Schwarzschild met-
ric comes about in a natural way by looking for a solution with a lot of
symmetry. Specifically, we can look for a solution that is both static and
spherically symmetric. Or in other words, we consider the ansatz
$$\mathbf{g} = -N(r)^2\, dt^2 + V(r)^{-1}\, dr^2 + r^2\, d\Omega^2, \tag{7.3}$$
where $d\Omega^2$ is the standard metric on a unit $S^{n-1}$ sphere, and we try to
find N and V such that Ric = 0. Observe that a constant t slice is totally
geodesic in the ambient spacetime metric $\mathbf{g}$. By the traced Gauss equation
(Corollary 2.7), it follows that the Riemannian metric $g = V(r)^{-1}\, dr^2 + r^2\, d\Omega^2$
is scalar-flat. By Exercise 3.2, it follows that $V(r) = 1 - \frac{2m}{r^{n-2}}$ for some
parameter m. Next we solve for N. We use the following general fact about
static metrics.
Exercise 7.10. Use Proposition 1.13 to show that the static metric
$\mathbf{g} = -N^2\, dt^2 + g$ solves the vacuum Einstein equations if and only if the pair
(g, N ) is vacuum static initial data in the sense of Definition 6.7 and N is
strictly positive. Note that this explains why we used the words vacuum
static in Definition 6.7.
By the exercise above, we just need to find N(r) such that $\mathrm{Hess}_g N =
N\,\mathrm{Ric}_g$, where $g = V(r)^{-1}\, dr^2 + r^2\, d\Omega^2$. In particular, $\Delta_g N = 0$. Writing this
out as a second-order ODE for N(r), it is fairly straightforward to see that
$N = \sqrt{V}$ is a solution. (It is also the unique one with the property that $\mathbf{g}$
approaches Minkowski space as $r \to \infty$.) Thus we have determined what N
and V must be.
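The claim that N = √V is harmonic can be checked directly. The sketch below assumes the standard coordinate formula for the Laplacian of a radial function, which is not derived in the text.

```latex
% For g = V(r)^{-1} dr^2 + r^2 d\Omega^2 we have \sqrt{\det g} \propto V^{-1/2} r^{n-1}
% and g^{rr} = V, so for a radial function f(r):
\[
\Delta_g f = \frac{V^{1/2}}{r^{n-1}}\,
   \frac{d}{dr}\Bigl(r^{n-1}\, V^{1/2}\, \frac{df}{dr}\Bigr).
\]
% With N = V^{1/2} and V = 1 - 2m\, r^{-(n-2)}:
\[
V^{1/2}\,\frac{dN}{dr} = \tfrac12 V'(r) = (n-2)\,m\,r^{-(n-1)},
\qquad
r^{n-1}\, V^{1/2}\,\frac{dN}{dr} = (n-2)\,m .
\]
% The last quantity is constant in r, and therefore \Delta_g N = 0.
```

This verifies only the trace of the static equation; the full equation is the content of Exercise 7.11.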
Exercise 7.11. Use Proposition 1.13 to check that with this choice of N
and V, we have $\mathrm{Hess}_g N = N\,\mathrm{Ric}_g$.
Thus we define the Schwarzschild spacetime metric of mass m to be
$$\mathbf{g}_m = -\left(1 - \frac{2m}{r^{n-2}}\right) dt^2 + \left(1 - \frac{2m}{r^{n-2}}\right)^{-1} dr^2 + r^2\, d\Omega^2. \tag{7.4}$$
This metric solves the vacuum Einstein equations in the region where r is
greater than $r_0 := (2m)^{\frac{1}{n-2}}$, which is called the Schwarzschild radius. Our
argument above proves that the Schwarzschild metric is the only spheri-
cally symmetric, static spacetime metric that solves the vacuum Einstein
equations, but more generally Birkhoff’s theorem states that it is the only
spherically symmetric solution of the vacuum Einstein equations [Bir23].
This result holds even locally.
In the following we will assume that m is positive (in order to avoid
an undesirable timelike singularity at r = 0). Just as was the case for
the Riemannian Schwarzschild metric, the singularity at r = r0 is only
a coordinate singularity rather than a true geometric singularity. To see
why this is the case, we rewrite the Schwarzschild metric in terms of null
coordinates. If we define a new coordinate $r^*$ by
$$(dr^*)^2 = \left(1 - \frac{2m}{r^{n-2}}\right)^{-2} dr^2$$
and then define
$$u = t - r^*, \qquad v = t + r^*,$$
then we have
$$\mathbf{g} = -\left(1 - \frac{2m}{r^{n-2}}\right) du\, dv + r^2\, d\Omega^2.$$
In the literature, $r^*$ is called the Regge-Wheeler radial coordinate or tortoise
coordinate, and u and v are called the outgoing (or retarded) and ingoing
(or advanced) Eddington-Finkelstein coordinates, respectively. Note that u
and v are null coordinates in the sense that $\partial_u$ and $\partial_v$ are null (when put
together with coordinates on $S^{n-1}$ to create a coordinate system).
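The passage to null coordinates is a short computation. In the sketch below, W is our own shorthand for the factor 1 − 2m/r^{n−2}; it is not notation used in the text.

```latex
% Shorthand (ours): W(r) = 1 - 2m / r^{n-2}.
% From (dr^*)^2 = W^{-2} dr^2 we get W^{-1} dr^2 = W (dr^*)^2, so
\begin{align*}
-W\,dt^2 + W^{-1}\,dr^2
  &= -W\bigl(dt^2 - (dr^*)^2\bigr) \\
  &= -W\,(dt - dr^*)(dt + dr^*) \\
  &= -W\,du\,dv .
\end{align*}
% Since the resulting metric has no du^2 or dv^2 term,
% g(\partial_u, \partial_u) = g(\partial_v, \partial_v) = 0,
% which is the sense in which u and v are null coordinates.
```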
Exercise 7.12. For n = 3, solve for $r^*$ explicitly. Let $U = -e^{-u/4m}$ and
$V = e^{v/4m}$. Show that we can choose the integration constant in the definition
of $r^*$ so that $UV = \left(1 - \frac{r}{2m}\right) e^{r/2m}$, and show that the Schwarzschild metric
becomes
$$\mathbf{g}_m = -\frac{32m^3}{r}\, e^{-r/2m}\, dU\, dV + r^2\, d\Omega^2,$$
where we now regard r as the function of U and V implicitly defined above.
Observe that the original region r > r0 corresponds to U < 0 and V > 0 in
the new coordinates. The coordinates U and V are called Kruskal-Szekeres
coordinates.
Note that in Kruskal-Szekeres coordinates, the metric is not singular
when U = 0 or V = 0. However, one can show that there really is a singu-
larity at r = 0 (because the curvature blows up there), which corresponds to
where U V = 1. This allows us to naturally extend the Schwarzschild metric
to the product $\{(U, V) \in \mathbb{R}^2 \mid UV < 1\} \times S^2$. See Figure 7.2. We will refer to
this vacuum spacetime as the Schwarzschild spacetime of mass m, though it
is sometimes called the Kruskal extension of Schwarzschild or the Kruskal-
Szekeres spacetime in the literature. This spacetime can be thought of as a
“maximal extension” of the metric (7.4) that was defined in the region r > r0
(region I in Figure 7.2), in the sense that it is a simply connected vacuum
extension such that every geodesic can either be extended to a complete
geodesic, or else it hits the singularity at U V = 1.
The construction generalizes to higher dimensions, with the main
difference being that one no longer has a simple formula for $r^*$. However, if
one defines $U = -\exp\left(\frac{-(n-2)u}{2(2m)^{1/(n-2)}}\right)$ and
$V = \exp\left(\frac{(n-2)v}{2(2m)^{1/(n-2)}}\right)$, then the
extension works out in essentially the same way. See [Chr15, Remark 1.2.6]
for details.
Observe that the region where U > 0 and V < 0 (region III in Figure 7.2)
is just an isometric copy of the “original” region where U < 0 and V > 0,
and thus the Schwarzschild spacetime should be thought of as having two
“asymptotically flat” ends since these two regions resemble Minkowski space
for large r. In particular, any negative constant U/V slice (including the
sphere at U = V = 0) of the Schwarzschild spacetime (which extends a
constant t slice of the original r > r0 region) is precisely the two-ended
Riemannian Schwarzschild space described in Chapter 3. The metric in the
regions where U > 0 and V > 0 (region II), and where U < 0 and V < 0
(region IV), which are isometric to each other, can be identified with the
metric (7.4) in the region 0 < r < r0 , where the t coordinate becomes
spacelike and the r coordinate becomes timelike.
[Figure 7.2: Kruskal diagram with axes U and V, showing regions I, II, III, and IV, the singularity r = 0 along UV = 1, and the level sets r = r0, t = 0, and t = ±∞.]
Figure 7.2. The Schwarzschild spacetime (aka the Kruskal-Szekeres
spacetime), with some of the level sets of r and t drawn in. Each point
in the diagram represents a sphere.
Another particularly important family of vacuum solutions, especially in
the n = 3 case, is the Kerr spacetime, which is axisymmetric and comes in a
family with two parameters: the mass m > 0 and the angular momentum
a with 0 ≤ a ≤ m. Axisymmetric here means that there is an $S^1 \cong SO(2)$
group of isometries. In contrast, the spherical symmetry of the Schwarzschild
spacetime with n = 3 means that there is a full SO(3) group of isometries.
Although the Kerr metric can be written down explicitly, we will not need
this formula. See [Wik, Kerr metric] for details. When a = 0, the formula
reduces to the one for the Schwarzschild metric in (7.4). The Kerr space-
time has the property that it is stationary, meaning that there exists a global
Killing field that is timelike near infinity. (Note that this is a much looser
condition than being static, but it only really makes sense as a global con-
dition since it asks us to look near infinity.) Thus, the Kerr spacetime has
a two-parameter family of isometries. There is a higher-dimensional version
of the Kerr spacetime known as the Myers-Perry spacetime [MP86]. The
Kerr family of vacuum solutions also generalizes to the Kerr-Newman fam-
ily [Wik, Kerr-Newman metric] of electrovacuum solutions, which is also
axisymmetric and stationary but carries an additional charge parameter.
As a special case of Kerr, the Schwarzschild spacetime is also stationary:
the Killing vector field $\partial_t$ for the metric (7.4) naturally extends across the set
where UV = 0 to a global Killing field on the entire Schwarzschild spacetime.
Explicitly, it is $\frac{n-2}{2(2m)^{1/(n-2)}}\,(-U\,\partial_U + V\,\partial_V)$. However, note that the Killing
field becomes spacelike in the region UV > 0 and is null where UV = 0.
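The explicit formula for the Killing field follows from the chain rule applied to the definitions of U and V. In the sketch below, c is our own shorthand for the constant appearing in the exponents; for n = 3 it equals 1/(4m), matching Exercise 7.12.

```latex
% Shorthand (ours): c := (n-2) / (2 (2m)^{1/(n-2)}), so that
% U = -e^{-cu}, V = e^{cv}, with u = t - r^*, v = t + r^*.
\[
\frac{\partial U}{\partial t} = c\,e^{-cu} = -cU, \qquad
\frac{\partial V}{\partial t} = c\,e^{cv} = cV,
\]
\[
\partial_t = \frac{\partial U}{\partial t}\,\partial_U
           + \frac{\partial V}{\partial t}\,\partial_V
           = c\,\bigl(-U\,\partial_U + V\,\partial_V\bigr).
\]
% Its norm, computed from (7.4), is
\[
\mathbf{g}_m(\partial_t, \partial_t) = -\Bigl(1 - \frac{2m}{r^{n-2}}\Bigr),
\]
% which is negative for r > r_0 (where UV < 0), zero at r = r_0 (where UV = 0),
% and positive for 0 < r < r_0 (where UV > 0).
```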
7.3. The Einstein constraint equations

7.3.1. The Einstein equations as an initial data problem. It is nat-
ural to want to formulate and solve an initial value problem for the vacuum
Einstein equations. That is, given knowledge of the metric at some fixed
time, we would like to know how the solution must develop as we move for-
ward in time. Since “fixed time” does not carry direct meaning in general
relativity, this is usually taken to mean a fixed spacelike slice of the space-
time M. This basic problem was essentially solved in the pioneering work of
Yvonne Choquet-Bruhat [FB52]. If one writes out the Einstein equations
in local coordinates, one can see that they do not quite form a hyperbolic
system of equations. The underlying reason why they cannot be hyperbolic
is that the Einstein equations are invariant under diffeomorphisms (or as
physicists might say, they are “gauge invariant”). Choquet-Bruhat discov-
ered that if one chooses to write the equations in “wave coordinates,” that
is, coordinates that solve the wave equation for the metric g, then the equa-
tions become hyperbolic, and therefore they could be solved (for a short
time) using existence theory for hyperbolic systems. In this formulation,
one does not require all of $\mathbf{g}$ and its time derivative at M, but rather only
the induced metric g and its time derivative. However, in order for the
“wave coordinate” condition to be preserved as one solves for $\mathbf{g}$ forward in
time, it is necessary for g and its time derivative to satisfy certain equations.
Explicitly, we have the following theorem.
Theorem 7.13 (Choquet-Bruhat [FB52]). Let $(M^n, g)$ be a Riemannian
manifold, and let k be a symmetric (0, 2)-tensor on M such that
$(g, k) \in W^{m+1,2} \times W^{m,2}$ for some m > n/2. Suppose that the following equations
hold:
$$R_g + (\operatorname{tr}_g k)^2 - |k|_g^2 = 0,$$
$$\operatorname{div}_g k - d(\operatorname{tr}_g k) = 0.$$
Then there exists a spacetime $(\mathbf{M}, \mathbf{g})$ solving the vacuum Einstein equations
such that (M, g) isometrically embeds into $(\mathbf{M}^{n+1}, \mathbf{g})$ as a Cauchy
hypersurface with second fundamental form k.
When we say that the second fundamental form is k, we mean that
if n is the future-pointing timelike unit normal to M, then $k = \langle A, -n\rangle$,
where A is the second fundamental form of M in $(\mathbf{M}, \mathbf{g})$, defined precisely
as in Chapter 2. This is essentially a short-time existence theorem in the
sense that $\mathbf{M}$ need only be a small neighborhood of M. It helps explain
the reason why we use the vocabulary Cauchy hypersurface. The following
theorem establishes existence of a unique solution that is maximal in some
sense and illustrates the importance of the globally hyperbolic condition.
Theorem 7.14 (Choquet-Bruhat–Geroch [CBG69]). Let $(M^n, g)$ be a
smooth Riemannian manifold, and let k be a smooth symmetric (0, 2)-tensor
on M. Suppose that the following equations hold:
$$R_g + (\operatorname{tr}_g k)^2 - |k|_g^2 = 0,$$
$$\operatorname{div}_g k - d(\operatorname{tr}_g k) = 0.$$
Then there exists a spacetime $(\mathbf{M}^{n+1}, \mathbf{g})$ solving the vacuum Einstein
equations such that (M, g) isometrically embeds into $(\mathbf{M}, \mathbf{g})$ as a Cauchy
hypersurface with second fundamental form k. Moreover, this solution is the
unique (up to isometry) maximal globally hyperbolic solution. By this we
mean that $(\mathbf{M}, \mathbf{g})$ does not sit inside any larger globally hyperbolic
spacetime.
The spacetime $(\mathbf{M}, \mathbf{g})$ is called the vacuum development of the initial
data (M, g, k).
Exercise 7.15. Let $(M^n, g)$ be a Riemannian manifold isometrically embedded
in a Lorentzian manifold $(\mathbf{M}^{n+1}, \mathbf{g})$ with second fundamental form k,
and let n be a unit normal vector to M. Show that along M, we have
$$G(n, n) = \tfrac{1}{2}\bigl(R_g + (\operatorname{tr}_g k)^2 - |k|_g^2\bigr),$$
$$G(n, \cdot) = \operatorname{div}_g k - d(\operatorname{tr}_g k),$$
where G is the Einstein tensor of $\mathbf{g}$ and the · input is a vector tangent to
M.
In light of the exercise above, it becomes clear why the assumptions on
g and k in Theorems 7.13 and 7.14 are necessary.
Given an observer whose worldline in a spacetime $(\mathbf{M}, \mathbf{g})$ has causal
tangent vector v, recall that the quantity $T_{\mu\nu} v^\mu$ may be regarded as the
energy-momentum density of the gravitational sources, as seen by the observer. We
say that $(\mathbf{M}, \mathbf{g})$ satisfies the dominant energy condition (or DEC) if for any
future-pointing causal vector v, the covector $T_{\mu\nu} v^\mu$ is always future-pointing
causal. This is a natural physical assumption to impose on the gravitational
sources that roughly corresponds to the assumption of nonnegative mass
densities in Newtonian gravity. Note that the dominant energy condition is
a general condition on a spacetime and has nothing to do with any particular
matter field, but many physical models will naturally satisfy the dominant
energy condition. This discussion motivates the following definition.
Definition 7.16. An initial data set $(M^n, g, k)$ is a Riemannian manifold
(M, g) equipped with a symmetric (0, 2)-tensor k. We define
$$\mu := \tfrac{1}{2}\bigl(R_g + (\operatorname{tr}_g k)^2 - |k|_g^2\bigr),$$
$$J := (\operatorname{div}_g k)^\sharp - \nabla(\operatorname{tr}_g k).$$
The quantity μ is called the energy density while J is called the current
density. Here we have defined J to be a vector quantity rather than a
1-form as some other texts do. (This is why we use the raising operator ♯.
Also note that for n = 3, our definition of energy density μ differs from the
Newtonian mass density ρ described earlier by a factor of 8π.) Together,
these equations are known as the (Einstein) constraint equations, and we will
refer to (μ, J) as the constraints of (g, k). They are called the constraints
because if one is given a stress-energy tensor T , this determines the pair
(μ, J), which constrains (but does not determine) the initial data (g, k)
according to the equations in Definition 7.16. In particular, in a vacuum
spacetime, T = 0, and consequently μ and J both vanish. More generally,
any initial data set with vanishing μ and J is said to satisfy the vacuum
(Einstein) constraint equations, that is, the same conditions appearing in the
hypotheses of Theorem 7.13. We say that (M, g, k) satisfies the dominant
energy condition (or DEC) whenever μ ≥ |J|g everywhere. We say that the
strict DEC holds if μ > |J|g everywhere.
We can emphasize the fact that J is a divergence by writing
$$J = \operatorname{div}_g \pi,$$
where π is the symmetric (2, 0) tensor defined by
$$\pi^{ij} := k^{ij} - (\operatorname{tr}_g k)\, g^{ij},$$
where the indices on k have been raised using g. Note that π contains the
same information as k since we can invert the relationship via
$$k_{ij} = \pi_{ij} - \frac{1}{n-1}(\operatorname{tr}_g \pi)\, g_{ij}.$$
We may sometimes (abusively) refer to the triple (M, g, π) as an initial data
set when the meaning is clear.
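Both the inversion formula and the identity J = div π can be checked in a couple of lines. The sketch below uses only that g is n-dimensional and parallel with respect to its own Levi-Civita connection.

```latex
% Taking the g-trace of \pi^{ij} = k^{ij} - (\operatorname{tr}_g k) g^{ij} on an n-manifold:
\[
\operatorname{tr}_g \pi = \operatorname{tr}_g k - n\,\operatorname{tr}_g k
   = -(n-1)\operatorname{tr}_g k
\quad\Longrightarrow\quad
\operatorname{tr}_g k = -\frac{1}{n-1}\operatorname{tr}_g \pi ,
\]
\[
k_{ij} = \pi_{ij} + (\operatorname{tr}_g k)\,g_{ij}
       = \pi_{ij} - \frac{1}{n-1}(\operatorname{tr}_g\pi)\,g_{ij}.
\]
% Divergence: since \nabla g = 0,
\[
(\operatorname{div}_g \pi)^j = \nabla_i k^{ij} - \nabla^j(\operatorname{tr}_g k) = J^j .
\]
```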
We say that (M, g, k) sits inside a spacetime $(\mathbf{M}, \mathbf{g})$ if M embeds into
$\mathbf{M}$ in such a way that $\mathbf{g}$ induces g, and k is the second fundamental form of
the embedding.
In light of Theorems 7.13 and 7.14, we see that an initial data set
(M, g, k) solving the vacuum constraints is the appropriate Cauchy data
for solving the vacuum Einstein equations. Building on the work of Theo-
rems 7.13 and 7.14, similar theorems can be established for Einstein equa-
tions with various matter fields. For a thorough discussion of these various
Cauchy problems, see the book [CB09].
The study of the Einstein equations using methods of hyperbolic PDE
is an extensive field of current research, but it is not the focus of this book.
However, we will mention a couple of the large, motivating problems in
the field. In four spacetime dimensions, it is conjectured that the Kerr
solutions are the only vacuum stationary spacetimes. (The analogous state-
ment for the Myers-Perry solutions in higher dimensions is known to be
false.) In fact, this conjecture is very close to being a known fact: S. Hawk-
ing [HE73] showed that any analytic vacuum stationary spacetime must
be axisymmetric, and then work of Brandon Carter [Car73] and David C.
Robinson [Rob75] shows that an axisymmetric vacuum stationary space-
time must either be static, or else it lies in the Kerr family. But if it is static,
then one can show that it must correspond, via Exercise 7.10, to a vacuum
static asymptotically flat manifold with minimal boundary. Theorem 6.25
then implies that it must be Schwarzschild, which is a special case of Kerr.
A mathematically rigorous version of this overall argument, drawing
on work of various contributors, appears in a paper by P. Chruściel and
J. Costa [CC08] (see also references cited therein). The analyticity
assumption is nearly removed by work of S. Alexakis, A. Ionescu, and
S. Klainerman [AIK10], which is strong enough to prove the uniqueness result for
small perturbations of Kerr. The general topic of black hole uniqueness the-
orems has grown in a number of directions. See [Rob09] for a survey of
developments.
The uniqueness of Kerr leads to a far more ambitious “final state con-
jecture” that all vacuum solutions of the Einstein equations should settle
down to a Kerr solution in the long-time limit. A more tractable piece of
this conjecture is simply the conjecture that the Kerr family is stable in the
sense that initial data that is a small perturbation away from Kerr initial
data will asymptotically settle down to a (possibly different) Kerr solution
in the long-time limit. This is a highly active area of research that was set
into motion by D. Christodoulou and S. Klainerman’s pioneering proof of
the stability of Minkowski space [CK93]. (H. Lindblad and I. Rodnianski
later gave an alternative proof under stronger hypotheses [LR10].) Lydia
Bieri extended the Christodoulou-Klainerman result by relaxing assump-
tions on both the regularity and decay [Bie09]. These results and questions
have natural analogs for the Einstein-Maxwell equations. Specifically, it is
also hoped that the Kerr-Newman family is stable and more generally that
any electrovacuum solution of the Einstein equations must settle down to a
Kerr-Newman solution. Nina Zipser proved that Minkowski space is stable
under evolution via the Einstein-Maxwell equations, thereby generalizing
the Christodoulou-Klainerman result [Zip09]. In fact, a much broader ver-
sion of the final state conjecture is that for various physical matter models,
all matter except electromagnetic and purely gravitational energy should
“radiate away” and leave us with a development that settles down to a
Kerr-Newman solution.
7.3.2. Asymptotically flat initial data sets. In this book we are mainly
interested in initial data sets. There is extensive literature on constructing
initial data sets solving the Einstein constraint equations (chiefly the
conformal method), but that is not our focus. We are primarily concerned with
general properties of asymptotically flat initial data sets that satisfy the
dominant energy condition.
Definition 7.17. Let n ≥ 3. An initial data set $(M^n, g, k)$ is said to be
asymptotically flat if there exists a bounded set K such that $M \setminus K$ is
a finite union of ends $M_1, \ldots, M_\ell$ such that for each $M_k$, there exists a
diffeomorphism
$$\Phi_k : M_k \to \mathbb{R}^n \setminus \bar{B}_1(0),$$
where $\bar{B}_1(0)$ is the standard closed unit ball, such that if we think of each $\Phi_k$
as a coordinate chart with coordinates $x^1, \ldots, x^n$, then in that coordinate
chart (which we will often call the asymptotically flat coordinate chart or
sometimes the exterior coordinate chart), we have
$$g_{ij}(x) = \delta_{ij} + O_2(|x|^{-q}),$$
$$k_{ij}(x) = O_1(|x|^{-q-1})$$
for some $q > \frac{n-2}{2}$. Moreover, we also require that μ and J are integrable
over M.
The case when k is identically zero is called the time-symmetric (or Rie-
mannian) case, and in this case the definition above reduces to the statement
that (M, g) is an asymptotically flat manifold. We reiterate that in the lit-
erature, the precise definition of asymptotic flatness can vary from paper to
paper.
Definition 7.18. Let $(M^n, g, k)$ be a smooth, asymptotically flat initial
data set. We define the ADM energy-momentum (E, P) of an end of M to
be
$$E = \lim_{\rho\to\infty} \frac{1}{2(n-1)\omega_{n-1}} \int_{S_\rho} \sum_{i,j=1}^{n} (g_{ij,i} - g_{ii,j})\, \bar{\nu}^j \, d\bar{\mu}_{S_\rho},$$
$$P_i = \lim_{\rho\to\infty} \frac{1}{(n-1)\omega_{n-1}} \int_{S_\rho} \sum_{j=1}^{n} \bigl(k_{ij} - (\operatorname{tr}_g k)\, g_{ij}\bigr)\, \bar{\nu}^j \, d\bar{\mu}_{S_\rho}$$
for i = 1, . . . , n, where the right sides are calculated in the asymptotically
flat coordinates of the chosen end, and barred quantities are calculated using
the Euclidean metric in the end. If (E, P) is future causal, then the ADM
mass of that end is defined to be $m = \sqrt{E^2 - |P|^2}$. Observe that we can also
write
$$P_i = \lim_{\rho\to\infty} \frac{1}{(n-1)\omega_{n-1}} \int_{S_\rho} \sum_{j=1}^{n} \pi_{ij}\, \bar{\nu}^j \, d\bar{\mu}_{S_\rho} \quad \text{for } i = 1, \ldots, n.$$
The definition above is due to Arnowitt, Deser, and Misner [ADM60,
ADM61, ADM62]. Note that this definition of ADM energy is the exact
same one we used for the ADM mass of an asymptotically flat manifold. This
apparent clash of nomenclature is fine because ADM mass and ADM energy
may be regarded as the same thing for a time-symmetric asymptotically flat
initial data set (M, g, k = 0).
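As a sanity check on the normalization, one can evaluate the energy integral on a time-symmetric Schwarzschild slice. The sketch below is for n = 3 only, and it assumes the standard conformally flat form of the Riemannian Schwarzschild metric, with the normal and area element taken with respect to the Euclidean metric; the symbol w is our own shorthand.

```latex
% Assumptions: n = 3, k = 0, and g_{ij} = w\,\delta_{ij} with w = (1 + m/(2|x|))^4.
% Here \omega_2 = 4\pi, so 2(n-1)\omega_{n-1} = 16\pi.
\begin{align*}
\sum_{i,j}(g_{ij,i} - g_{ii,j})\,\nu^j
  &= \sum_j (\partial_j w - 3\,\partial_j w)\,\nu^j = -2\,\partial_\nu w, \\
\partial_\nu w
  &= -\frac{2m}{|x|^2}\Bigl(1 + \frac{m}{2|x|}\Bigr)^{3}
   = -\frac{2m}{|x|^2} + O(|x|^{-3}), \\
E &= \lim_{\rho\to\infty}\frac{1}{16\pi}
     \int_{S_\rho}\Bigl(\frac{4m}{\rho^2} + O(\rho^{-3})\Bigr)\,d\mu_{S_\rho}
   = \frac{1}{16\pi}\cdot\frac{4m}{\rho^2}\cdot 4\pi\rho^2 = m .
\end{align*}
% Since k = 0 we also have P = 0, so the ADM mass \sqrt{E^2 - |P|^2} equals m.
```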
Exercise 7.19. Check that the ADM energy-momentum is well-defined for
asymptotically flat initial data.
Conjecture 7.20 ((Spacetime) positive mass conjecture). Let (M n , g, k) be
a complete asymptotically flat initial data set satisfying the dominant energy
condition. Then in each end the ADM energy-momentum vector (E, P ) is
future causal. Or, in other words, E ≥ |P |.
Conjecture 7.21 ((Spacetime) positive mass rigidity conjecture). Assume
the hypotheses of the previous conjecture, and suppose that we also have
E = |P | in some end. Then (M, g, k) sits inside Minkowski space.
We will discuss the known cases of these conjectures in greater detail in
the following chapter.
Noether’s Theorem [Wik, Noether’s theorem] states that every sym-
metry of a physical system gives rise to a conserved quantity. In classical
physics in flat space, spatial translation symmetry gives rise to conservation
of total linear momentum, while time translation symmetry gives rise to
conservation of total energy. Although an asymptotically flat initial data
set need not have any symmetries, one can think of it as having asymptotic
symmetries at infinity since the asymptotic flatness allows us to think of it
as being “close” to a slice of Minkowski space near infinity. Since Minkowski
space does have spacetime translation symmetries, it is possible to motivate
the above definitions for the ADM energy-momentum using heuristic
reasoning along these lines. The reader may recall that the Poincaré group of
symmetries of Minkowski space also contains spatial rotations and boosts.
The spatial rotations give rise to total angular momentum, while boosts give
rise to a concept of center of mass. (In nonrelativistic physics, the Galilean
transformations give rise to the usual concept of center of mass.) However,
in order to define these quantities, we require stronger decay assumptions
on (g, k) than asymptotic flatness gives us.
Definition 7.22. A smooth asymptotically flat initial data set $(M^n, g, k)$
is said to satisfy the Regge-Teitelboim conditions if we have
$$g^{\mathrm{odd}}_{ij}(x) = O_2(|x|^{-q-1}),$$
$$k^{\mathrm{even}}_{ij}(x) = O_2(|x|^{-q-2})$$
in the asymptotically flat coordinate chart, where $q > \frac{n-2}{2}$ and we define
$$g^{\mathrm{odd}}_{ij}(x) := g_{ij}(x) - g_{ij}(-x),$$
$$k^{\mathrm{even}}_{ij}(x) := k_{ij}(x) + k_{ij}(-x).$$
If (M, g, k) satisfies the Regge-Teitelboim conditions, we define the ADM
angular momentum
$$J_{\ell m} = \lim_{\rho\to\infty} \frac{1}{(n-1)\omega_{n-1}} \int_{S_\rho} \sum_{i,j=1}^{n} \bigl(k_{ij} - (\operatorname{tr}_g k)\, g_{ij}\bigr)\, Z_{\ell m}^{i}\, \bar{\nu}^j \, d\bar{\mu}_{S_\rho}$$
for 1 ≤ ℓ < m ≤ n, where $Z_{\ell m}$ is the vector field $x^\ell \partial_m - x^m \partial_\ell$ generating
rotations around the plane perpendicular to the $x^\ell x^m$-plane. In the case
n = 3, we instead simply write $J_k$ for the angular momentum around the
$x^k$-axis.
If we moreover know that the ADM energy E is nonzero, we define the
ADM center of mass to be
$$C^\ell = \lim_{\rho\to\infty} \frac{1}{2(n-1)\omega_{n-1} E} \int_{S_\rho} \Bigl[\, x^\ell \sum_{i,j=1}^{n} (g_{ij,i} - g_{ii,j})\, \bar{\nu}^j - \sum_{i=1}^{n} \bigl(g_{i\ell}\, \bar{\nu}^i - g_{ii}\, \bar{\nu}^\ell\bigr) \Bigr]\, d\bar{\mu}_{S_\rho}$$
for ℓ = 1, . . . , n.
Lan-Hsuan Huang [Hua09] showed that the center of mass can also be
written as
$$C^\ell = \lim_{\rho\to\infty} \frac{1}{2(n-1)(n-2)\omega_{n-1} E} \int_{S_\rho} G(Z^{(\ell)}, \nu)\, d\mu_{S_\rho},$$
where G is the Einstein tensor of g and $Z^{(\ell)} = \sum_{i=1}^{n} (|x|^2 \delta^{i\ell} - 2x^\ell x^i)\,\partial_i$
generates a conformal symmetry of Euclidean space. This can be proved along
similar lines as in the proof of Theorem 3.14. See [MT16].
7.4. Black holes and Penrose incompleteness

In this section we introduce the important concept of black holes. Unfortu-
nately, we will have to be a bit loose with this discussion to avoid getting
bogged down in technical definitions. Because we are using nonrigorous defi-
nitions, we will be careful not to state “official” theorems depending on them.
One reason why we take this vague approach is that the optimal theorems
can sometimes be quite sensitive to the precise definitions used, and another
is that the issue of what the “right” definitions are is itself often an inter-
esting research question. In any case, the purpose of our discussion of black
holes is only to provide some physical context behind the geometric ideas
to be studied later on. In particular, we would like to motivate the physical
relevance of marginally outer trapped surfaces and apparent horizons, the
latter of which we already introduced in Chapter 4 in the time-symmetric
case.
We are interested in spacetimes that are “asymptotically flat” in some
sense. This means that they should “look like” Minkowski space “near
infinity.” One simplifying assumption that we will adopt (for ease of presen-
tation) is to consider spacetimes that are conformally compactifiable, in the
same sense that Minkowski space is itself conformally compactifiable. Going
back to the discussion in Section 7.1.2, this compactified space is equivalent
to the original one as far as causality questions are concerned. Thus, the
compactification allows us to regard “infinity” as the boundary of the
compactified space, made up of future null infinity $\mathscr{I}^+$ and past null infinity $\mathscr{I}^-$,
which meet at spacelike infinity $i^0$, which is a single point for each end. For
example, any complete future-pointing null geodesic “ends” at a point on
$\mathscr{I}^+$, which is pronounced “scri plus.” (Actually, having a meaningful notion
of $\mathscr{I}^+$ is the main desirable property for the following discussion.) Using
the conformal compactification, one obtains a nice “picture” of the global
causal structure of the spacetime. When this picture is “sketched out” in
two dimensions, it is referred to as a Penrose diagram.
The black hole region of a spacetime $(\mathbf{M}, \mathbf{g})$ is all of the points in $\mathbf{M}$ that
can never reach $\mathscr{I}^+$ via future-pointing causal curves. The boundary of the
black hole region is called the black hole event horizon. Similarly, the white
hole region is all of the points that can never reach $\mathscr{I}^-$ via past-pointing
causal curves, and its boundary is the white hole event horizon. The points
that lie outside both the black hole and white hole horizons are considered
to be in the domain of outer communication (or d.o.c.). Thus, the black
hole event horizon serves as a “point of no return” because if one passes
from the domain of outer communication into the black hole region, then by
definition it is not possible to return to the domain of outer communication.
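[Figure 7.3: two Penrose diagrams. Left: Minkowski spacetime, bounded by $\mathscr{I}^+$ and $\mathscr{I}^-$ meeting at $i^0$. Right: Schwarzschild spacetime, with regions I, II, III, and IV, the two domains of outer communication (d.o.c.), the black hole and white hole regions, and the two singularities.]
Figure 7.3. Penrose diagrams for the Minkowski spacetime (left) and
the Schwarzschild spacetime (right).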
We examine these concepts for the simple case of a Schwarzschild
spacetime. The black hole region is where U > 0 and V > 0 in Kruskal-Szekeres
coordinates, while the black hole event horizon is its boundary, made up of
one piece U = 0, V ≥ 0 bordering one end and another piece U ≥ 0, V = 0
bordering the other. The white hole region is where U < 0 and V < 0. The
domain of outer communication is the region where U V < 0, which has two
components corresponding to the two infinite ends. Recall that each of these
components separately corresponds to the r > r0 region of (7.4). Although
one does not approach a singularity as r approaches r0 from above, one does
approach the event horizon (of the black hole and/or the white hole) as r
approaches r0 . Note that although our spacelike Schwarzschild constant t
slices (or more accurately, negative constant U/V slices) do pass through
the event horizon at U = V = 0, they do not actually intersect the interiors
of the black hole or white hole regions.
7.4.1. Geometry of null hypersurfaces. In general, it is not clear
whether an event horizon in $(\mathbf{M}, \mathbf{g})$ will be a smooth hypersurface of $\mathbf{M}$, but
wherever it is smooth, it must be a null hypersurface. To see why, suppose
H is a black hole event horizon which is a smooth hypersurface near p ∈ H.
If H is timelike at p, then there must exist past and future-pointing causal
vectors at p pointing toward both sides of H. In particular, this means that
both $J^+(p)$ and $J^-(p)$ intersect both the d.o.c. and the black hole region,
which contradicts the definitions. If H is spacelike at p, then either the black
hole region is on the past side of H while the d.o.c. is on the future side,
which is a clear contradiction, or else we have the opposite. In the latter case
$J^+(x)$ lies in the black hole region for all x in a small enough neighborhood
$U \subset H$ of p. This means that for a point q in the d.o.c. close enough to p, its
entire causal future will have to pass through U and consequently into the
black hole region, which is a contradiction. Hence H is null at p. A similar
argument works for white hole horizons.
For this reason and others, we will be interested in studying the geometry
of null hypersurfaces. In general, it is important to consider null
hypersurfaces of low regularity, but for our discussion here we will focus on
the smooth case. Let $H$ be a null hypersurface, and let $\ell$ be a nonvanishing
future-pointing null normal vector field. This is equivalent to asking for $\ell$ to
be a nonvanishing future-pointing null tangent vector field. Since there is no
natural normalization for null vectors, $\ell$ is only determined up to
multiplication by a positive function. That is, only its direction is naturally
defined. Consequently, the integral curves of $\ell$ are defined independently of
the choice of $\ell$, and in fact, after reparameterization, these integral curves
become null geodesics. To see why, let $p \in H$ and $X \in T_pH$, and then extend
$X$ in the direction of $\ell$ so that $X$ remains tangent to $H$ and $[\ell, X] = 0$. Then
$$
\begin{aligned}
\langle \nabla_\ell \ell, X\rangle &= \ell\langle \ell, X\rangle - \langle \ell, \nabla_\ell X\rangle \\
&= -\langle \ell, \nabla_X \ell\rangle \\
&= -\tfrac{1}{2} X\langle \ell, \ell\rangle \\
&= 0.
\end{aligned}
$$
Thus $\nabla_\ell \ell$ is normal to $T_pH$ and therefore points in the same direction as
$\ell$. This means that the integral curves of $\ell$ are geodesics after
reparameterization (or equivalently, after multiplying $\ell$ by an appropriate
positive function). These null geodesics are called (null) generators of $H$.
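The rescaling mentioned in the last sentences can be written out explicitly. The following is a sketch, writing $\ell$ for the null normal field; the functions $\kappa$ and $f$ are names introduced here for illustration only.

```latex
% Since \nabla_\ell \ell points in the direction of \ell, write
%   \nabla_\ell \ell = \kappa\,\ell   for some function \kappa on H.
% For a positive function f on H, the Leibniz rule gives
\begin{align*}
\nabla_{f\ell}(f\ell)
  &= f\,\ell(f)\,\ell + f^{2}\,\nabla_\ell \ell \\
  &= f\bigl(\ell(f) + \kappa f\bigr)\,\ell .
\end{align*}
% Along each integral curve of \ell, the linear ODE  \ell(f) = -\kappa f
% has a positive solution, and for such an f we get
%   \nabla_{f\ell}(f\ell) = 0,
% so the integral curves of f\ell are (affinely parameterized) null geodesics.
```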
Because of the degeneracy of the induced metric on $H$, we have to be
careful about how to study the geometry of $H$. In particular, defining the
second fundamental form is not quite straightforward. To do it correctly, we
must work modulo $\ell$. That is, we work with the quotient space $T_pH/\langle\ell\rangle$ at
each $p \in H$, where $\langle\ell\rangle$ is the subspace spanned by $\ell$, which is independent
of our choice of $\ell$. We will use bar notation to denote the quotient map
from $T_pH$ to $T_pH/\langle\ell\rangle$. The degenerate induced metric on $T_pH$ descends to
an inner product on $T_pH/\langle\ell\rangle$, which we will still abusively denote $\langle\cdot,\cdot\rangle$.
We define the null second fundamental form $A$ and null shape operator
(or null Weingarten map) $S$ of $H$ via
$$
A(\bar X, \bar Y) := \langle S(\bar X), \bar Y\rangle := \langle \nabla_X \ell, Y\rangle
$$
for any vector fields $X, Y$ tangent to $H$, where $\nabla$ is the Levi-Civita
connection of the ambient spacetime $(\mathbf{M}, \mathbf{g})$. Note that this definition is similar
to equation (2.1). More precisely, $A$ is a symmetric bilinear form and $S$ is
a symmetric operator on the space $T_pH/\langle\ell\rangle$ at each $p \in H$. The null mean
curvature or (null) expansion of $H$ is defined to be $\theta = \operatorname{tr} A = \operatorname{tr} S$, where
the trace is taken over the space $T_pH/\langle\ell\rangle$ at each $p \in H$.
One can show that multiplying $\ell$ by a positive function has the effect
of multiplying $A$, $S$, and $\theta$ by the same positive function. In particular,
although the sizes of these quantities have no invariant geometric meaning,
their signs do.
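The scaling claim follows from a one-line computation. As a sketch, write $\ell$ for the null normal, let $f$ be a positive function on $H$ (a name introduced here), and let $A_\ell$ denote the null second fundamental form computed with respect to $\ell$:

```latex
% X, Y are tangent to H, so <\ell, Y> = 0 because \ell is normal to H.
\begin{align*}
A_{f\ell}(\bar X, \bar Y)
  &= \langle \nabla_X (f\ell), Y \rangle \\
  &= X(f)\,\langle \ell, Y\rangle + f\,\langle \nabla_X \ell, Y\rangle \\
  &= f\,A_{\ell}(\bar X, \bar Y).
\end{align*}
% Since S is obtained from A via the inner product on the quotient space,
% and \theta is the trace of A, both scale by the same factor f.
```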
Next we would like to see how $A$, $S$, and $\theta$ evolve along a null generator $\gamma$.
Given a vector field $X$ tangent to $H$, one can see that $\nabla_{\gamma'} X$ must be tangent
to $H$. (Check this.) So we can define the covariant derivative of any
$\bar X \in C^\infty(TH/\langle\ell\rangle)$ along $\gamma$ by $\bar X' = \overline{\nabla_{\gamma'} X}$ for any $X \in C^\infty(TH)$ that projects to
$\bar X$. This is well-defined in the sense that it is independent of the choice of
$X$. With this definition, we can easily extend covariant differentiation along
$\gamma$ to $A$ and $S$ in the standard way.
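The independence of the choice of lift can be checked directly. The following sketch uses the null normal $\ell$ and a function $h$ along $\gamma$ (a name introduced here); it relies only on the fact, shown earlier, that $\nabla_\ell \ell$ is proportional to $\ell$.

```latex
% Two lifts X, X_1 of the same \bar X differ by a multiple of \ell:
%   X_1 - X = h\,\ell .
% Along the generator, \gamma' is a positive multiple of \ell, so
% \nabla_{\gamma'} \ell is proportional to \nabla_\ell \ell, hence to \ell.
\begin{align*}
\nabla_{\gamma'}(X_1 - X)
  &= \nabla_{\gamma'}(h\,\ell) \\
  &= h'\,\ell + h\,\nabla_{\gamma'}\ell
  \;\in\; \operatorname{span}(\ell).
\end{align*}
% Therefore \nabla_{\gamma'} X_1 and \nabla_{\gamma'} X project to the same
% element of the quotient, and \bar X' does not depend on the lift.
```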
Theorem 7.23 (Riccati equation for null generators). Let $\gamma$ be a null
generator for a null hypersurface $H$, and let $\ell = \gamma'$ be our choice of null normal
along $\gamma$. Then along $\gamma$, the null shape operator satisfies
$$
\langle S'(\bar X), \bar Y\rangle = -\langle S^2(\bar X), \bar Y\rangle - \mathrm{Riem}(X, \ell, Y, \ell)
$$
at any $p$ along $\gamma$, where $X, Y \in T_pH$. The null expansion satisfies
$$
\theta' = -\frac{1}{n-1}\theta^2 - |\mathring{S}|^2 - \mathrm{Ric}(\ell, \ell),
$$
where $\mathring{S}$ is the trace-free part of $S$. The equation for $\theta$ is usually referred to
as the Raychaudhuri equation in the literature, and $|\mathring{S}|$ is called the shear
scalar.
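Proof. First, we can extend $X, Y$ along $\gamma$ so that they remain tangent to
$H$ and $[\ell, X] = \nabla_\ell Y = 0$. Note that the curvature term becomes
$$
\begin{aligned}
\mathrm{Riem}(X, \ell, Y, \ell) &= \langle -\nabla_\ell \nabla_X \ell + \nabla_X \nabla_\ell \ell + \nabla_{[\ell,X]} \ell, Y\rangle \\
&= -\langle \nabla_\ell \nabla_\ell X, Y\rangle,
\end{aligned}
$$
where we used $\nabla_\ell \ell = 0$, $[\ell, X] = 0$, and $\nabla_X \ell = \nabla_\ell X$.
Note that this is just the Jacobi equation for $X$, which should be expected
when you observe that $H$ is ruled by null geodesic generators, and that $X$
is simply varying through them. Thus
$$
\begin{aligned}
\langle S'(\bar X), \bar Y\rangle &= \nabla_\ell \langle S(\bar X), \bar Y\rangle - \langle S(\bar X'), \bar Y\rangle - \langle S(\bar X), \bar Y'\rangle \\
&= \nabla_\ell \langle \nabla_X \ell, Y\rangle - \langle S(\overline{\nabla_\ell X}), \bar Y\rangle \\
&= \langle \nabla_\ell \nabla_\ell X, Y\rangle - \langle S(\overline{\nabla_X \ell}), \bar Y\rangle \\
&= \langle \nabla_\ell \nabla_\ell X, Y\rangle - \langle S(S(\bar X)), \bar Y\rangle \\
&= -\mathrm{Riem}(X, \ell, Y, \ell) - \langle S^2(\bar X), \bar Y\rangle,
\end{aligned}
$$
where we used the Jacobi equation for $X$ in the last line.
To prove the Raychaudhuri equation, we simply take the trace of the
Riccati equation for $S$ over the space $T_pH/\langle\ell\rangle$ and use linear algebra.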
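The linear algebra in the last step can be spelled out as follows. This is a sketch: the frame $e_1, \dots, e_{n-1}$ and the auxiliary null vector $\underline{\ell}$ are notation introduced here, $\ell$ is the null normal, and the sign convention for Ric is taken to be the one in which $\mathrm{Ric}(v,v)$ is the trace of $\mathrm{Riem}(\cdot, v, \cdot, v)$, which is the convention consistent with the statement of the theorem.

```latex
% Let e_1, ..., e_{n-1} in T_pH project to an orthonormal basis of the
% (n-1)-dimensional quotient of T_pH by the span of \ell.
%
% (1) Trace of S'.  The inner product on the quotient is parallel along
%     \gamma, so  tr(S') = (tr S)' = \theta'.
%
% (2) Trace of S^2.  Decompose S into its trace and trace-free parts:
\begin{align*}
S &= \mathring{S} + \frac{\theta}{n-1}\,\mathrm{Id},
\qquad \operatorname{tr}\mathring{S} = 0, \\
\operatorname{tr}(S^{2})
  &= |\mathring{S}|^{2} + \frac{2\theta}{n-1}\operatorname{tr}\mathring{S}
     + \frac{\theta^{2}}{(n-1)^{2}}\,(n-1)
   = |\mathring{S}|^{2} + \frac{\theta^{2}}{n-1},
\end{align*}
% using that S is symmetric, so tr(\mathring{S}^2) = |\mathring{S}|^2.
%
% (3) Trace of the curvature term.  Choose a null vector \underline{\ell}
%     orthogonal to every e_i with <\ell, \underline{\ell}> = -1.  Then
\begin{align*}
\mathrm{Ric}(\ell,\ell)
  &= \sum_{i=1}^{n-1} \mathrm{Riem}(e_i,\ell,e_i,\ell)
     - \mathrm{Riem}(\ell,\ell,\underline{\ell},\ell)
     - \mathrm{Riem}(\underline{\ell},\ell,\ell,\ell) \\
  &= \sum_{i=1}^{n-1} \mathrm{Riem}(e_i,\ell,e_i,\ell),
\end{align*}
% because Riem is antisymmetric in its first pair and in its last pair.
```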
The formulas in Theorem 7.23 may look familiar from Riemannian ge-
ometry. Although we can use the same reasoning above to prove the Rie-
mannian analog, it is interesting to note that it can also be seen as a special
case.
Corollary 7.24 (Riccati equation for parallel hypersurfaces). Let $\Sigma$ be a
hypersurface of a Riemannian manifold $(M, g)$ with unit normal $\nu$. For
each point $x \in \Sigma$, consider the geodesic $\gamma(t) = \exp_x t\nu$ emanating from $\Sigma$ with
normal vector $\nu$. Let $\Sigma_t$ be the image obtained from $\Sigma$ by flowing along
those geodesics for time $t$. (This $\Sigma_t$ is often called a parallel hypersurface.)
Then along each geodesic, as long as $\Sigma_t$ is smooth near the geodesic, the
shape operator $S = S_t$ of $\Sigma_t$ evolves according to
$$
\langle S'(X), Y\rangle = -\langle S^2(X), Y\rangle - \mathrm{Riem}(X, \nu, Y, \nu)
$$
along $\gamma$, where $X, Y \in T_p\Sigma_t$. The mean curvature satisfies
$$
H' = -\frac{1}{n-1} H^2 - |\mathring{S}|^2 - \mathrm{Ric}(\nu, \nu),
$$
where $\mathring{S}$ is the trace-free part of $S$.
Proof. Given the hypotheses, we can construct a situation that satisfies
the hypotheses of Theorem 7.23 as follows. Let $\mathbf{M} = \mathbb{R} \times M$ equipped
with the Lorentzian metric $\mathbf{g} = -dt^2 + g$, and isometrically embed $(M, g)$
in $(\mathbf{M}, \mathbf{g})$ as its zero slice $\{0\} \times M$. Let $\ell = \partial_t + \nu$ along $\Sigma$ in $\mathbf{M}$, and then
consider the null geodesics emanating from $\Sigma$ with initial tangent vector $\ell$.
These null geodesics will generate the null hypersurface $\{(t, x) \mid x \in \Sigma_t\}$
in $\mathbf{M}$. We can now apply Theorem 7.23 to this null hypersurface. All that
remains is to check that its null shape operator at $(t, x)$ is
essentially the same object as the shape operator $S_t$ of $\Sigma_t$ at $x \in \Sigma_t$, and
that $\mathrm{Riem}_{\mathbf{g}}(X, \ell, Y, \ell) = \mathrm{Riem}_g(X, \nu, Y, \nu)$, where $X, Y$ are tangent vectors to
$\Sigma_t$ that can also be thought of as tangent to $\{t\} \times \Sigma_t$ inside the null hypersurface.
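Two of the facts used implicitly in this construction are quick to verify. The sketch below uses the product metric $-dt^2 + g$ on $\mathbb{R} \times M$ and writes $\ell = \partial_t + \nu$; the curve name $c_x$ is introduced here for illustration.

```latex
% (1) \ell = \partial_t + \nu is null for the product metric -dt^2 + g:
\begin{align*}
(-dt^{2} + g)(\partial_t + \nu,\ \partial_t + \nu)
  &= -1 + g(\nu,\nu) = -1 + 1 = 0 .
\end{align*}
% (2) For x in \Sigma, consider the curve
%       c_x(s) = \bigl(s,\ \exp_x(s\nu)\bigr)  in  R x M.
%     Each factor is a geodesic (t = s is affine in R, and s -> exp_x(s\nu)
%     is a unit-speed geodesic in M), so c_x is a geodesic of the product
%     metric.  Its tangent is \partial_t plus a unit vector of M, hence null
%     by the computation in (1), and c_x'(0) = \ell.
%     At parameter s the curve lies in {s} x \Sigma_s, which is why these
%     null geodesics sweep out the set { (t, x) : x in \Sigma_t }.
```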
Exercise 7.25. Check those last two details in the proof above.
The Raychaudhuri equation is tremendously important in general relativity
because we often have control over $\mathrm{Ric}(\ell, \ell)$. We say that a spacetime
satisfies the null energy condition (or NEC) if $\mathrm{Ric}(v, v) \ge 0$ for all null
vectors $v$. Physical spacetimes typically satisfy this condition. In particular,
note that it is much weaker than the dominant energy condition.
Exercise 7.26. Let $(\mathbf{M}, \mathbf{g})$ be a spacetime satisfying the null energy
condition, and let $H$ be a smooth null hypersurface. If $\theta < 0$ at some point
$p \in H$, show that the null generator through $p$ cannot be future geodesically
complete. Hint: Use the Raychaudhuri equation to show that if the geodesic
exists for all parameter times $t > 0$, then $\theta$ has to blow up.
In light of this exercise, negative $\theta$ can be used to imply future geodesic
incompleteness, as we shall see below.
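The blow-up phenomenon can be observed numerically. The following is a toy sketch, not a proof: it treats the shear term and the Ricci term of the Raychaudhuri equation as nonnegative constants (`shear_sq` and `ric`, hypothetical parameters introduced here) and integrates the resulting scalar ODE with a hand-written Runge-Kutta step until the expansion drops below a large negative cutoff. The function name `time_to_cutoff` is likewise invented for this illustration.

```python
def time_to_cutoff(theta0, n, shear_sq=0.0, ric=0.0, dt=1e-4, cutoff=-1e6):
    """Integrate theta' = -theta**2/(n-1) - shear_sq - ric from theta(0) = theta0.

    Returns the first parameter time at which theta <= cutoff.
    Assumes theta0 < 0, shear_sq >= 0 and ric >= 0 (the last one is a
    constant stand-in for the null energy condition).
    """
    if theta0 >= 0:
        raise ValueError("this sketch only treats initially negative expansion")

    def rhs(theta):
        return -theta * theta / (n - 1) - shear_sq - ric

    t, theta = 0.0, theta0
    while theta > cutoff:
        # Classical fourth-order Runge-Kutta step.
        k1 = rhs(theta)
        k2 = rhs(theta + 0.5 * dt * k1)
        k3 = rhs(theta + 0.5 * dt * k2)
        k4 = rhs(theta + dt * k3)
        theta += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += dt
    return t


if __name__ == "__main__":
    n, theta0 = 3, -1.0
    bound = (n - 1) / abs(theta0)  # comparison time suggested by the hint
    print("comparison time (n-1)/|theta0|:", bound)
    print("no shear, Ric = 0:", time_to_cutoff(theta0, n))
    print("with shear and Ric:", time_to_cutoff(theta0, n, shear_sq=0.5, ric=0.5))
```

In the model case with no shear and vanishing Ricci term, the exact solution is $\theta(t) = \theta_0 / (1 + \theta_0 t/(n-1))$, so the numerical crossing time should sit next to $(n-1)/|\theta_0|$ up to discretization error; positive shear or Ricci terms only make the blow-up happen sooner.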
7.4.2. Trapped surfaces and the Penrose incompleteness theorem.
Let $(\mathbf{M}^{n+1}, \mathbf{g})$ be a spacetime, and let $\Sigma^{n-1}$ be a spacelike submanifold of
$\mathbf{M}$, meaning that $\mathbf{g}$ induces a Riemannian metric on $\Sigma$. We will refer to
$\Sigma^{n-1}$ as a “surface” since it is two-dimensional when $n = 3$. When $\mathbf{g}$ is
restricted to the normal space $N\Sigma$, its signature will have one $-1$ and one
$+1$, and therefore at each $p \in \Sigma$, $N_p\Sigma$ can be spanned by two future-pointing
null vectors, which we denote $\ell^+$ and $\ell^-$. If $N\Sigma$ is a trivial bundle over $\Sigma$,
then $\ell^+$ and $\ell^-$ can be defined globally over $\Sigma$. Once again, keep in mind
that since these vectors are null, there is no notion of “normalizing” this
basis in the way we can for orthogonal bases. We consider the components
of the second fundamental form and mean curvature of $\Sigma$ in $\mathbf{M}$ with respect
to these two null directions. For any $v, w \in T_p\Sigma$, we define
$$
\chi^\pm(v, w) = \mathbf{g}(\nabla_v \ell^\pm, w),
$$
and we define
$$
\theta^\pm = \operatorname{tr}_\Sigma \chi^\pm = \operatorname{div}_\Sigma \ell^\pm.
$$
We call the $\chi^\pm$ the null second fundamental forms and $\theta^\pm$ the null mean
curvatures, or null expansion scalars. Once again, keep in mind that since $\ell^\pm$
cannot be normalized, these quantities depend on the choice of $\ell^\pm$.
Specifically, multiplying $\ell^\pm$ by a positive function has the effect of multiplying $\chi^\pm$ and $\theta^\pm$
by the same function, so that only the signs of $\chi^\pm$ and $\theta^\pm$ have geometric
significance.
We can easily relate these quantities to the $A$ and $\theta$ that were defined
on null hypersurfaces above. If we define $H^\pm$ to be the null hypersurfaces
generated by the geodesics leaving $\Sigma$ with tangent vectors $\ell^\pm$, respectively,
it is easy to see from the definitions that
$$
\chi^\pm(X, Y) = A_{H^\pm}(\bar X, \bar Y)
$$
for any $X, Y$ tangent to $\Sigma$, and that $\theta^\pm$ for $\Sigma$ is equal to the null mean
curvature $\theta$ for $H^\pm$. Hence, our abusive choice to use the same notation for
both is reasonable.
Generally, we like to think of $\ell^+$ as being outgoing and $\ell^-$ as being
ingoing.
Exercise 7.27. Let $(M^n, g, k)$ be an initial data set sitting inside a
spacetime $(\mathbf{M}^{n+1}, \mathbf{g})$. Let $\Sigma^{n-1}$ be a surface in $M$. Let $\nu$ be a unit normal of $\Sigma$
in $M$, and let $n$ be the future-pointing unit normal to $M$ in $\mathbf{M}$. If we define
$\ell^\pm = n \pm \nu$, show that
$$
\theta^\pm = \operatorname{tr}_\Sigma k \pm H,
$$
where $H$ is the mean curvature scalar of $\Sigma$ in $M$ with respect to the normal $\nu$.

Consequently, we can use this formula to define $\theta^\pm$ for any surface $\Sigma$ in
an initial data set with a distinguished normal, even without specifying a
spacetime $(\mathbf{M}^{n+1}, \mathbf{g})$.
If $\Sigma$ is a compact boundary surface, then we take $\nu$ to be outward pointing
by convention.
Definition 7.28. Given a spacelike surface $\Sigma^{n-1}$, in either a spacetime
$(\mathbf{M}, \mathbf{g})$ or an initial data set $(M^n, g, k)$, we say that $\Sigma$ is
• outer trapped if $\theta^+ < 0$,
• weakly outer trapped if $\theta^+ \le 0$,
• outer untrapped if $\theta^+ > 0$,
• weakly outer untrapped if $\theta^+ \ge 0$,
• marginally outer trapped if $\theta^+ = 0$.
We often refer to a marginally outer trapped surface as a MOTS for
convenience. We have similar definitions with “inner” in place of “outer” if we
replace $\theta^+$ by $\theta^-$ on the right. A surface is called trapped if it is both outer
trapped and inner trapped.
These definitions make sense as long as we have a distinguished choice
of $\ell^+$ or $\nu$, and $\Sigma$ need not have an actual “outside” or “inside.”
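As a concrete illustration of these definitions, the sketch below evaluates $\theta^\pm = \operatorname{tr}_\Sigma k \pm H$ on round spheres in the time-symmetric ($k = 0$) slice of the 3+1 Schwarzschild spacetime. Several things here are assumptions of the sketch rather than notation from the text: the mass parameter `m`, the use of the areal radius `r` with slice metric $dr^2/(1 - 2m/r) + r^2 d\Omega^2$, the resulting mean curvature formula $H = (2/r)\sqrt{1 - 2m/r}$ with respect to the normal pointing toward infinity, and all function names.

```python
import math


def null_expansions(tr_k, H):
    """Return (theta_plus, theta_minus) = (tr_k + H, tr_k - H)."""
    return tr_k + H, tr_k - H


def classify_outer(theta_plus, tol=1e-12):
    """Strict classification of a surface with constant theta_plus.

    A marginally outer trapped surface is also weakly outer trapped and
    weakly outer untrapped; only the finest label is returned.
    """
    if abs(theta_plus) <= tol:
        return "marginally outer trapped"
    return "outer trapped" if theta_plus < 0 else "outer untrapped"


def is_trapped(theta_plus, theta_minus):
    """Trapped means both outer trapped and inner trapped."""
    return theta_plus < 0 and theta_minus < 0


def schwarzschild_sphere_H(r, m):
    """Mean curvature of the sphere of areal radius r >= 2m in the
    time-symmetric Schwarzschild slice (n = 3), outward normal."""
    return 2.0 * math.sqrt(1.0 - 2.0 * m / r) / r


if __name__ == "__main__":
    m = 1.0
    for r in (2.0, 3.0, 10.0):
        tp, tm = null_expansions(0.0, schwarzschild_sphere_H(r, m))
        print(f"r = {r}: theta+ = {tp:.4f}, theta- = {tm:.4f},",
              classify_outer(tp), "| trapped:", is_trapped(tp, tm))
```

In this time-symmetric example no sphere is trapped, since $\theta^+ = -\theta^- = H \ge 0$; a nonzero `tr_k` is needed to make both expansions negative, for instance `null_expansions(-1.0, 0.5)`.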
To get a sense for what the sign of $\theta^+$ means, recall from Proposition 2.10
that $\theta^\pm$ represents how the area form on $\Sigma$ is changing when we vary $\Sigma$ in
the $\ell^\pm$ direction (which is why it is called a “null expansion”). Meanwhile,
$\ell^+$ and $\ell^-$ represent the two most “extreme” directions that a light ray can
travel away from $\Sigma$. We can think of $\ell^+$ as shooting light outward from $\Sigma$ and
$\ell^-$ as shooting light inward. A trapped surface is one for which the following
is true. If you flow $\Sigma$ in the direction of any smooth family of light rays
emanating from $\Sigma$, this always has the effect of decreasing area. Intuitively,
this is to be expected if you shoot light rays inward (corresponding to $\theta^-$),
but it is not so expected when you shoot light rays outward (corresponding
to $\theta^+$). For example, it is not hard to see that any large coordinate sphere
in an asymptotically flat initial data set has $\theta^+ > 0$ and $\theta^- < 0$. The
physical intuition is that only a “strong gravitational field” can cause light
to be “trapped” in the sense that shooting light in any direction is area
decreasing.
The famous Penrose incompleteness theorem [Pen65] states that in a
spacetime satisfying the NEC, trapped surfaces force the formation of sin-
gularities in the spacetime (assuming there is a noncompact Cauchy hyper-
surface). We will instead state and prove a version appearing in [Gal14]
better suited to our interests.
Theorem 7.29 (Penrose incompleteness theorem for outer trapped surfaces).
Let $(\mathbf{M}^{n+1}, \mathbf{g})$ be a spacetime containing a noncompact Cauchy
hypersurface $M^n$ and satisfying the null energy condition. Suppose there exists
a precompact open subset $\Omega \subset M$ such that $\Sigma^{n-1} = \partial\Omega^n$ is outer trapped.
Then $(\mathbf{M}, \mathbf{g})$ is future null geodesically incomplete.
In Theorem 7.14, we saw that there always exists a maximal
globally hyperbolic vacuum development of a vacuum initial data set, but
it said nothing about future completeness. The Penrose incompleteness
theorem is significant because it says that if the initial data contains an outer
trapped surface, then its evolution cannot be future complete. The Penrose
incompleteness theorem is often called a “singularity” theorem, though that
might be slightly misleading since all it says is that there is a future null
geodesic that cannot be extended forward for infinite parameter time. It
does not mean that the curvature must blow up there, since there could be a
smooth spacetime extension in which $M$ ceases to be a Cauchy hypersurface.
Sketch of the proof. Let $(\mathbf{M}^{n+1}, \mathbf{g})$ be a spacetime containing a Cauchy
hypersurface $M^n$ and let $\Omega$ be an open subset of $M$ with $\Sigma^{n-1} := \partial\Omega^n$.
Define $\partial^{\mathrm{out}} J^+(\Sigma) := \partial J^+(\Omega) \setminus \Omega$. The notation on the left is chosen to
abbreviate the right-hand side in such a way that reminds us of what it is,
intuitively: it is supposed to represent the “outer boundary” of the causal
future of $\Sigma$.
Claim 1. Each $q \in \partial^{\mathrm{out}} J^+(\Sigma)$ lies on a null geodesic leaving $\Sigma$ with tangent
vector $\ell^+ = n + \nu$, where $n$ is the future unit timelike normal to $M$ and $\nu$ is
the outward normal of $\Sigma$ in $M$. Moreover, this geodesic arrives at $q$ before
it passes any conjugate points.
The proof of Claim 1 only uses the global hyperbolicity and does not use
any assumptions about compactness, trapping, or geodesic completeness. It
should perhaps be thought of as a background lemma from causality theory.
One can show that global hyperbolicity implies that $J^+(\Omega)$ is closed, that
is, $\partial J^+(\Omega) \subset J^+(\Omega)$. (Again, this is not so obvious from the way we defined
global hyperbolicity.) Since being timelike is an open condition, one can see
that $I^+(\Omega)$ is an open subset of $J^+(\Omega)$, and thus $\partial J^+(\Omega) \subset J^+(\Omega) \setminus I^+(\Omega)$.
By Proposition 7.4, it then follows that for any $q \in \partial^{\mathrm{out}} J^+(\Sigma)$ there exists
a null geodesic $\gamma$ starting at some $p \in \Omega$ and ending at $q$.
We now argue that the starting point p lies in Σ: if it started in Ω,
then we could move the starting point slightly to construct a causal curve
that is timelike near its new starting point in Ω. (Imagine the picture in
Minkowski space to see why this is clear.) By Lemma 7.3, q would lie in
I + (Ω), a contradiction.
Next we claim that $\gamma$ can be chosen so that $\gamma'(0) = \ell^+ = n + \nu$. Without
loss of generality, we can choose $\gamma$ so that $\gamma'(0) = n + v$ for some unit vector
$v \in T_pM$. If $v$ points into $\Omega$, then just as in the paragraph above, we
can move the starting point in the $v$ direction to find a causal curve that
is timelike near the new starting point in $\Omega$, which is a contradiction. If
$\langle v, \nu\rangle \ge 0$, but $v \ne \nu$, we can instead move the starting point of $\gamma$ in the
direction of $v - \langle v, \nu\rangle\nu \in T_p\Sigma$ in such a way that the new starting point
remains in $\Sigma$, and such that we can construct a causal curve that is timelike
near its new starting point. (Again, this should be intuitively clear in the
local picture where $\Omega$ is a half plane in the $t = 0$ slice of Minkowski space.)
Once again, Lemma 7.3 then implies that $q \in I^+(\Omega)$, a contradiction. Thus
$v = \nu$.
Finally, we argue that γ is free of conjugate points. If it did have con-
jugate points, we could use the corresponding Jacobi field to deform γ to
again obtain a broken null geodesic, which can then be deformed to a smooth
causal curve that is timelike somewhere, and again use Lemma 7.3 to obtain
a contradiction. This completes the proof of our Claim 1.
We now assume the full hypotheses of Theorem 7.29, and, in addition,
we assume that $(\mathbf{M}, \mathbf{g})$ is actually future null geodesically complete and
work toward a contradiction.
Claim 2. $\partial J^+(\Omega)$ is compact.
This part of the proof uses the Raychaudhuri equation, which lies at
the heart of the Penrose incompleteness theorem. By the assumption of
future null geodesic completeness, each null geodesic starting in $\Sigma$ with
null vector $\ell^+$ can be extended for infinite parameter time. Let $H^+$ be the
union of all of these null geodesics. This $H^+$ must be a smooth (possibly
immersed) null hypersurface away from the conjugate points. Since $\Sigma$ is
compact and outer trapped, there exists a constant $c > 0$ such that $\theta^+ < -c < 0$.
By the Raychaudhuri equation, $\theta$ must blow up along each generator before reaching
parameter time $T = (n - 1)/c$. (See Exercise 7.26.) In other words, the generator must
reach a point of nonsmoothness of $H^+$, which translates to saying that every
generator must reach a conjugate point of the generator before parameter
time $T$. By Claim 1, $\partial^{\mathrm{out}} J^+(\Sigma)$ lies in the part of $H^+$ with parameter
times lying in the closed interval $[0, T]$. Since this latter space is clearly
compact and $\partial^{\mathrm{out}} J^+(\Sigma)$ is a closed subset of it, we conclude that $\partial^{\mathrm{out}} J^+(\Sigma)$
is compact. Since $\Omega$ is assumed to be precompact, it follows that $\partial J^+(\Omega)$ is
compact. See Figure 7.4 for an “illustration” of this (impossible) situation.
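The value $T = (n-1)/c$ comes from a standard comparison argument for the Raychaudhuri inequality. The following is a sketch under the stated hypotheses (the NEC, and $\theta(0) < -c < 0$ on a generator parameterized by $t$), assuming for contradiction that $\theta$ stays finite on $[0, T]$:

```latex
% The NEC and |\mathring{S}|^2 >= 0 turn the Raychaudhuri equation into
%   \theta' \le -\frac{1}{n-1}\,\theta^{2} \le 0,
% so \theta is nonincreasing and stays negative as long as it is finite.
\begin{align*}
\Bigl(\frac{1}{\theta}\Bigr)' = -\frac{\theta'}{\theta^{2}} &\ge \frac{1}{n-1}, \\
\frac{1}{\theta(t)} &\ge \frac{1}{\theta(0)} + \frac{t}{n-1}
  > -\frac{1}{c} + \frac{t}{n-1}.
\end{align*}
% The left side is negative as long as \theta is finite, which forces
%   t < (n-1)/c = T.
% Hence \theta must diverge to -\infty at some parameter time before T.
```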
To complete the argument, we construct a map from $\partial J^+(\Omega)$ to $M$
as follows. By time-orientability, there exists a global timelike vector field
on $\mathbf{M}$ generating timelike integral curves. For each point in $\partial J^+(\Omega)$, we
follow one of these timelike integral curves to reach a point in $M$. By the
definition of a Cauchy hypersurface, this map must be well-defined. Arguing
as above (together with the fact that $\Omega$ lies on a Cauchy hypersurface),
one can see that no point in $\partial J^+(\Omega)$ can be in the chronological future of
another point of $\partial J^+(\Omega)$, and thus the map from $\partial J^+(\Omega)$ to $M$ is actually
injective and continuous. Finally, one can show that $\partial J^+(\Omega)$ is a Lipschitz
hypersurface of $\mathbf{M}$ without boundary as a manifold. (The fact that it is a
Lipschitz hypersurface is not obvious, but it is intuitively clear that $\partial J^+(\Omega)$
should have no manifold boundary since it is itself a boundary.) Putting
it all together, this gives us an injective continuous map from a compact
topological manifold to a noncompact manifold, which can be shown to be
impossible for topological reasons.

[Figure 7.4. On the left is an example of $\partial^{\mathrm{out}} J^+(\Sigma)$ in Minkowski space.
On the right, if $\Sigma$ is trapped, then the NEC together with the Raychaudhuri
equation and future null completeness forces $\partial^{\mathrm{out}} J^+(\Sigma)$ to “close
up,” but this is intuitively impossible because the timelike future of $\Omega$
has nowhere to go.]
The hypotheses of Theorem 7.29 can almost be relaxed to weakly outer
trapped surfaces. That is, it was shown in [EGP13] that, “generically,” if
$\Sigma$ is a MOTS, we obtain the same conclusion (where “generic” here means
that certain curvatures do not vanish identically along the null generators).
Theorem 7.29 can be used to show that topology can force incompleteness.
Theorem 7.30 (Gannon [Gan75], C. W. Lee [Lee76]). Let $(\mathbf{M}^{n+1}, \mathbf{g})$ be
a spacetime containing an asymptotically flat Cauchy hypersurface $M$ and
satisfying the null energy condition. If $M$ is not simply connected, then
$(\mathbf{M}, \mathbf{g})$ is future null geodesically incomplete.
Proof. Assume that $(\mathbf{M}, \mathbf{g})$ is future null geodesically complete, and we
work toward a contradiction. By Proposition 7.7, $\mathbf{M}$ is homeomorphic to
$\mathbb{R} \times M$. We will use the nonsimply connected hypothesis in a manner similar
to the way it was used in the proof of Theorem 4.11. Consider the universal
cover $\tilde{M}$ of $M$, which sits inside the universal cover $\tilde{\mathbf{M}} \cong \mathbb{R} \times \tilde{M}$ of $\mathbf{M}$,
which must also be future null geodesically complete. If $M$ is not simply
connected, then $\tilde{M}$ has at least two ends. If we take $\Omega$ to be one of the
infinite ends beyond a sphere of large enough radius, then it is easy to see
that $\partial\Omega$ is outer trapped with respect to the unit normal pointing out of
$\Omega$. We now run the same argument that was used in Theorem 7.29 with
the difference being that $\Omega$ is no longer compact, and therefore the space
$\partial J^+(\Omega)$ is not compact. However, we still obtain an injective continuous
map from $\partial J^+(\Omega)$ to $\tilde{M}$, where the former space is a topological manifold
with one noncompact end, while the latter is a topological manifold with at
least two noncompact ends. This is still impossible for topological reasons.

7.4.3. Discussion. At first glance, the Penrose incompleteness theorem
might look like bad news. It means that if we start with any initial data set
containing an outer trapped surface, then its maximal globally hyperbolic
development under the Einstein equations (using any matter model satis-
fying the NEC, including vacuum) must come to some sort of abrupt end.
(It is not too difficult to construct such initial data sets. See [SY83] for a
theorem explaining how concentrating a lot of matter in a small place can
force the existence of outer trapped surfaces.) In some sense, this suggests a
failure of the Einstein equations as a physical theory. As a response to this
problem, Penrose proposed the weak cosmic censorship hypothesis [Pen02].
Roughly, the weak cosmic censorship hypothesis is the conjecture that al-
though singularities may form, they always stay inside the black hole region.
Since the physically observable world exists in the domain of outer commu-
nication, weak cosmic censorship provides an elegant way for the theory to
save face by only failing in a way that will never affect us. A more
technical (but still vague) way to state the weak cosmic censorship conjecture
is that “generic” initial data gives rise to a maximal globally hyperbolic
development under the Einstein equations that admits a complete $\mathscr{I}^+$. Of
course, the development depends on the matter model, but even the
vacuum case is an important open problem. At the beginning of this section on
black holes, we essentially started with the assumption of a spacetime with
a complete $\mathscr{I}^+$ before we even defined the concept of a “black hole” (implicit
in our vague assumption that our spacetime was asymptotic to Minkowski
in some sense). This is one reason why we tried to state weak cosmic
censorship without making mention of black holes, and it also illustrates the
issue alluded to earlier about the trickiness involved in definitions.
One bit of evidence in favor of weak cosmic censorship is that while outer
trapped surfaces force future null incompleteness, they also indicate the
existence of black holes, which makes one hope that they go hand in hand.
More precisely, given an asymptotically flat spacetime satisfying global
hyperbolicity and the NEC, an outer trapped surface (with respect to a
particular end) cannot intersect that end’s domain of outer communication. We
present the basic argument, due to S. Hawking [HE73]. Suppose that some
part of an outer trapped surface $\Sigma = \partial\Omega$ lies outside the black hole region. This
means that $J^+(\Omega)$ reaches all the way out to $\mathscr{I}^+$. Let $q$ be a point in the
boundary of the intersection of $\mathscr{I}^+$ and $J^+(\Omega)$. Following similar reasoning
as in Theorem 7.29, $q$ must lie at the end of a null geodesic leaving $\Sigma$
with tangent vector $\ell^+$, and moreover it should be free of conjugate points
for all parameter times. But if $\theta^+ < 0$ at $\Sigma$, then the Raychaudhuri equation
(together with the NEC) forces the existence of a conjugate point in
finite parameter time (Exercise 7.26), which is a contradiction. In fact, this
reasoning can be extended to weakly outer trapped surfaces, because if the
null generator starts with $\theta^+ = 0$ and has no conjugate points, then the
Raychaudhuri equation implies that it must have $\theta^+ = 0$ for all parameter
times. But since this null generator eventually reaches $\mathscr{I}^+$, it passes through
a spacetime region that is close to Minkowski space, and there one can prove
that such an “outgoing” null hypersurface with $\theta^+ = 0$ is impossible.
However, in the argument above, note that we implicitly assumed that $\mathscr{I}^+$ was
complete in the construction of the point $q$, so in some sense the argument
rests upon the weak cosmic censorship hypothesis. Despite this, we tend to
think of outer trapped surfaces as indicating the presence of a black hole.
There is another famous conjecture of Penrose called strong cosmic
censorship, which is logically independent of the weak cosmic censorship
hypothesis. We will not discuss it here, except to note recent developments
by M. Dafermos and J. Luk [DL17], whose work implies that the
“$C^0$-inextendibility” formulation of the conjecture requires revision.
For many years, it was an open question whether a vacuum initial data
set free of outer trapped surfaces could develop outer trapped surfaces in
its vacuum development. This represents a black hole forming from pure
gravity, rather than from concentration of matter. Although it was generally
believed to be possible, the first examples were constructed in a celebrated
work of D. Christodoulou [Chr09].
Let us go back to considering a black hole event horizon $H$. Assuming
asymptotic flatness, global hyperbolicity, and the NEC, not only must weakly
outer trapped surfaces lie on the inner side of $H$, but a similar argument
shows that $H$ must have $\theta \ge 0$ wherever it is smooth [HE73]. (Again, we
note that event horizons need not be smooth, so it is important to have
proofs that work more generally. See [CDGH01].) We will summarize the
argument. Suppose there is a point $p \in H$ where $H$ is smooth and $\theta < 0$.
Consider a smooth spacelike surface $\Sigma$ in $H$ passing through $p$. Then $\theta^+ < 0$
at $p$, since the outgoing null normal $\ell^+$ of $\Sigma$ must be the null normal of $H$. We can
slightly push $\Sigma$ outward toward the d.o.c. near $p$ to obtain another surface
$\Sigma'$ such that a small part of $\Sigma'$ leaks into the d.o.c. with $\theta^+ < 0$, while the
rest of $\Sigma'$ is identically equal to $\Sigma$. As above, we can argue that there must
exist a null geodesic from $\Sigma'$ out to $\mathscr{I}^+$ which is free of conjugate points.
By definition of the d.o.c., this geodesic must originate from a point in the
intersection of $\Sigma'$ with the d.o.c., where $\theta^+ < 0$. (And the geodesic must be
outgoing, since an ingoing geodesic will surely enter the black hole region.)
But this contradicts the Raychaudhuri equation via Exercise 7.26. This fact
that the null expansion of a black hole horizon is nonnegative is usually
called the Hawking area theorem or the second law of black hole mechanics.
It is called the area theorem because if $\Sigma$ is a spacelike surface in $H$, then
$\theta^+_\Sigma \ge 0$ represents the rate of change of the area element of $\Sigma$ as it flows in
the direction $\ell^+$. In particular, if $\Sigma_1$ and $\Sigma_2$ are smooth closed surfaces on
a smooth part of $H$ with $\Sigma_2$ in the causal future of $\Sigma_1$, then $|\Sigma_1| \le |\Sigma_2|$.
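The passage from the pointwise inequality to the area comparison can be sketched as follows, under a simplifying assumption that is not made in the text: that the second surface is reached from the first through a smooth family of closed spacelike surfaces $\Sigma_s$, $s \in [1, 2]$, in the smooth part of the horizon, moving along the null generators with velocity $\phi\,\ell^+$ for some function $\phi \ge 0$ (the names $\Sigma_s$, $\phi$, and $d\mu_s$ are introduced here).

```latex
% d\mu_s denotes the area element of \Sigma_s.  The first variation of the
% area element in the direction \phi\,\ell^+ is \phi\,\theta^+ d\mu_s, so
\begin{align*}
\frac{d}{ds}\,|\Sigma_s|
  &= \int_{\Sigma_s} \phi\,\theta^{+}_{\Sigma_s}\, d\mu_s \;\ge\; 0, \\
|\Sigma_2| - |\Sigma_1|
  &= \int_{1}^{2} \frac{d}{ds}\,|\Sigma_s|\, ds \;\ge\; 0 .
\end{align*}
% Here \theta^+ \ge 0 on every \Sigma_s because the outgoing null normal of
% each \Sigma_s is the null normal of the horizon.
```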
A corollary of the Hawking area theorem is that under the same
assumptions as above we have the following. If the spacetime is stationary, then a
spacelike surface $\Sigma$ in $H$ has $\theta^+ = 0$, or, in other words, it is a marginally
outer trapped surface (or MOTS). Recall that stationarity means that there
is a Killing vector field and hence a family of spacetime isometries. These
isometries must preserve the event horizon and therefore they send $\Sigma$ to
another cross-section $\Sigma'$ of $H$ in its future, and $|\Sigma| = |\Sigma'|$. Because of the
Hawking area theorem, this is only possible if $\theta^+ = 0$ along $\Sigma$.
7.5. Marginally outer trapped surfaces

Although the event horizon is an important concept, it can be difficult to
study directly because it is determined by the global causal structure of the
entire spacetime. The preceding discussion motivates the study of MOTS
as a way of studying black holes using local geometry. Specifically, it allows
us to use an initial data perspective.
Definition 7.31. Let (M, g, k) be a complete initial data set with a distin-
guished noncompact end, let Σ be a smooth enclosing boundary, and take
ν to be the outward-pointing normal of Σ. We say that Σ is an outermost
MOTS if it cannot be enclosed by any other weakly outer trapped surfaces.
An outermost MOTS will also be referred to as an apparent horizon for that
end.
Be aware that, like many terms arising from physics, the phrase “appar-
ent horizon” does not always have a consistent technical definition in the
literature. Based on our earlier discussion, if (M, g, k) lies in an asymptot-
ically flat, globally hyperbolic spacetime satisfying the NEC, an apparent
horizon will always lie on the inside of the black hole event horizon (in-
cluding the horizon itself), and if the spacetime is stationary, the apparent
horizon will lie on the black hole event horizon. It is the closest we can come
to locating where the event horizon must intersect our initial data set. In
particular, the concept has applications to numerical relativity.
7.5.1. Stability of MOTS. Observe that for a time-symmetric initial data
set, a MOTS is just a minimal hypersurface. From a Riemannian geometry
perspective, the MOTS equation θ+ = 0 can be thought of as a general-
ization of the minimal hypersurface equation H = 0. Although it shares
many similarities with the minimal hypersurface equation, the most signif-
icant difference is that it does not arise from a variational principle. While
volume is an essential tool for studying minimal hypersurfaces, there is no
analogous quantity that can be used to study MOTS. Recall from Chapter 2
that the stability inequality (2.16) was a powerful tool for the study of scalar
curvature. Even without a variational principle, we can still generalize the
concept of stability from minimal hypersurfaces to the setting of MOTS.
Given a vector field X defined along a hypersurface Σ with unit normal ν,
we can decompose X into its normal and tangential components
X = ϕν + X̂.
Given X, we adopt this convention for the notation ϕ and X̂.
We compute the linearization of the outward null expansion θ+ .
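Proposition 7.32. Let $\Sigma^{n-1}$ be a hypersurface with unit normal $\nu$ in an
initial data set $(M^n, g, k)$. The linearization of $\theta^+$ on $\Sigma$ in the direction of
the vector field $X$ is given by
$$
\begin{aligned}
D\theta^+|_\Sigma(X) = {} & -\Delta_\Sigma \varphi + 2\langle W_\Sigma, \nabla\varphi\rangle + (\operatorname{div}_\Sigma W_\Sigma - |W_\Sigma|^2 + Q_\Sigma)\varphi \\
& + \tfrac{1}{2}\theta^+[\theta^- + 2k(\nu, \nu)]\varphi + \nabla_{\hat X}\theta^+_\Sigma,
\end{aligned}
$$
where
$$
Q_\Sigma := \tfrac{1}{2}R_\Sigma - \mu - \langle J, \nu\rangle - \tfrac{1}{2}|k_\Sigma + A_\Sigma|^2.
$$
Here $k_\Sigma$ denotes the restriction of $k$ to vectors tangent to $\Sigma$, and $W_\Sigma$ is the
tangential vector field on $\Sigma$ that is dual to the 1-form $k(\nu, \cdot)$ along $\Sigma$.
Note that the dominant energy condition implies that $Q_\Sigma \le \tfrac{1}{2}R_\Sigma$.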
Proof. As seen in Section 2.2, it is clear what the contribution from the
tangential component $\hat X$ must be, so it suffices to consider the case of a
normal variation $X = \varphi\nu$. Recall from (2.15) that we already know that the
variation of $H$ is
$$
DH|_\Sigma(\varphi\nu) = -\Delta_\Sigma \varphi + \tfrac{1}{2}(R_\Sigma - R_M - |A|^2 - H^2)\varphi.
$$
Therefore the only thing we have to compute is the variation of $\operatorname{tr}_\Sigma k =
\operatorname{tr}_g k - k(\nu, \nu)$. Since $g$ and $k$ are defined on the ambient space, the main
quantity we need to understand is $\frac{\partial}{\partial t}\Phi_t^*\nu_t\big|_{t=0}$, where $\Phi_t$ is the family of
diffeomorphisms generated by $X = \varphi\nu$ and $\nu_t$ is the unit normal of $\Sigma_t =
\Phi_t(\Sigma)$. As is customary, we will use the abbreviated notation $\frac{\partial}{\partial t}\nu := \frac{\partial}{\partial t}\Phi_t^*\nu_t$.
Exercise 7.33. Show that for any hypersurface $\Sigma$ deformed in the $X = \varphi\nu$
direction, $\frac{\partial}{\partial t}\nu = -\nabla\varphi$.
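So we can compute
$$
\begin{aligned}
\frac{\partial}{\partial t}(\operatorname{tr}_\Sigma k) &= \frac{\partial}{\partial t}\bigl(\operatorname{tr}_g k - k(\nu, \nu)\bigr) \\
&= \frac{\partial}{\partial t}(\operatorname{tr}_g k) - \Bigl(\frac{\partial}{\partial t}k\Bigr)(\nu, \nu) - 2k\Bigl(\frac{\partial}{\partial t}\nu, \nu\Bigr) \\
&= \varphi\nabla_\nu(\operatorname{tr}_g k) - \varphi(\nabla_\nu k)(\nu, \nu) + 2k(\nabla\varphi, \nu) \\
&= [\nabla_\nu(\operatorname{tr}_g k) - (\nabla_\nu k)(\nu, \nu)]\varphi + 2\langle W, \nabla\varphi\rangle,
\end{aligned}
$$
where $W = W_\Sigma$ is as defined in the statement of the proposition. This is
already enough to compute the variation of $\theta^+$, but we would like to put it
in a nicer form. Specifically, we would like to see how the quantities $\mu$ and
$J$ show up in the formula. Let $\tilde W$ be the vector field defined along $\Sigma$ that is
dual to $k(\nu, \cdot)$, so that $W$ is just the tangential part of $\tilde W$. We choose an
orthonormal frame $e_1, \dots, e_{n-1}$ for $\Sigma$ and compute (using Einstein notation)
$$
\begin{aligned}
(\nabla_\nu k)(\nu, \nu) &= (\operatorname{div}_g k - \operatorname{div}_\Sigma k)(\nu) \\
&= (\operatorname{div}_g k)(\nu) - (\nabla_{e_i} k)(\nu, e_i) \\
&= (\operatorname{div}_g k)(\nu) - [\nabla_{e_i} k(\nu)](e_i) + k(\nabla_{e_i}\nu, e_i) \\
&= (\operatorname{div}_g k)(\nu) - \operatorname{div}_\Sigma \tilde W + k(A_{ij}e_j, e_i) \\
&= (\operatorname{div}_g k)(\nu) - \operatorname{div}_\Sigma W + \langle \mathbf{H}, \tilde W^\perp\rangle + A_{ij}k_{ij} \\
&= (\operatorname{div}_g k)(\nu) - \operatorname{div}_\Sigma W - Hk(\nu, \nu) + \langle A, k_\Sigma\rangle,
\end{aligned}
$$
where we used (2.6) to simplify $\operatorname{div}_\Sigma \tilde W$. Putting the three previous
computations together, we have
$$
\begin{aligned}
D\theta^+|_\Sigma(\varphi\nu) = {} & -\Delta_\Sigma \varphi + 2\langle W, \nabla\varphi\rangle + \tfrac{1}{2}\bigl[R_\Sigma - R_M - |A|^2 - H^2 \\
& + 2\nabla_\nu(\operatorname{tr}_g k) - 2(\operatorname{div}_g k)(\nu) + 2\operatorname{div}_\Sigma W \\
& + 2Hk(\nu, \nu) - 2\langle A, k_\Sigma\rangle\bigr]\varphi.
\end{aligned}
$$
The rest of the computation is tedious but straightforward.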
Exercise 7.34. Complete the proof of Proposition 7.32. Also, what should
the formula for the linearization of θ− be?
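Definition 7.35. Let $\Sigma$ be a hypersurface with unit normal $\nu$ in an initial
data set $(M, g, k)$. By analogy with (2.15), we define the (MOTS) stability
operator $\mathrm{L}_\Sigma$ on $\Sigma$ to be
\[
\mathrm{L}_\Sigma\varphi := D\theta^+|_\Sigma(\varphi\nu) = \mathrm{L}^0_\Sigma\varphi + \tfrac12\theta^+[\theta^- + 2k(\nu,\nu)]\varphi,
\]
where
\[
\mathrm{L}^0_\Sigma\varphi := -\Delta_\Sigma\varphi + 2\langle W_\Sigma, \nabla\varphi\rangle + (\operatorname{div}_\Sigma W_\Sigma - |W_\Sigma|^2 + Q_\Sigma)\varphi
\]
for any smooth function $\varphi$ on $\Sigma$, where $W_\Sigma$ and $Q_\Sigma$ are as defined in
Proposition 7.32.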
Since the MOTS stability operator generalizes the stability operator we

already defined in Definition 2.16 in the time-symmetric case, we use the
same notation. Again, the upright letter in LΣ helps us distinguish the
stability operator from the conformal Laplacian. Of course, if Σ is a MOTS,
then LΣ and L0Σ are the same thing. Distinguishing these operators for
non-MOTS hypersurfaces will be convenient later on.
Recall from Definition 2.16 that in the time-symmetric case, a compact
minimal hypersurface (possibly with boundary) was defined to be stable
if and only if the operator LΣ is a nonnegative operator on smooth func-
tions vanishing at the boundary. Since LΣ is not self-adjoint in general,
this definition of stability is not appropriate, but L. Andersson, M. Mars,
and W. Simon proposed a concept of MOTS stability using the following
observation [AMS08].
Proposition 7.36. Let Σ be a compact hypersurface (possibly with bound-
ary) in an initial data set (M, g, k). There exists a (Dirichlet) eigenvalue of
the MOTS stability operator LΣ with minimal real part, which is called the
principal (Dirichlet) eigenvalue, denoted λ1 (LΣ ). Furthermore, this eigen-
value is real, and if Σ is connected, the corresponding eigenspace is a one-
dimensional space generated by a smooth principal eigenfunction that is pos-
itive on the interior of Σ. (The same is also true for L0Σ .)
If Σ is a MOTS and λ1 (LΣ ) ≥ 0, we say that Σ is a stable MOTS.
This proposition is a direct consequence of Theorem A.10, which in turn

relies on the Krein-Rutman Theorem [Wik, Krein-Rutman theorem].
We can “symmetrize” the stability condition to obtain the following
result.
Proposition 7.37 (Galloway-Schoen [GS06]). Let $\Sigma$ be a compact hypersurface
(possibly with boundary) in an initial data set $(M, g, k)$. Then
\[
\lambda_1(\mathrm{L}^0_\Sigma) \le \lambda_1(-\Delta_\Sigma + Q_\Sigma),
\]
where the right side denotes the principal (Dirichlet) eigenvalue of the self-adjoint
operator $-\Delta_\Sigma + Q_\Sigma$, where $Q_\Sigma$ was defined in Proposition 7.32. In
particular, if $\Sigma$ is a stable MOTS, then for any smooth function $u$ compactly
supported in the interior of $\Sigma$, we have
\[
\int_\Sigma \bigl(|\nabla u|^2 + Q_\Sigma u^2\bigr) \ge 0. \tag{7.5}
\]
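Proof. We drop the $\Sigma$ subscripts in the following. For any function $\varphi$ that
is positive in the interior of $\Sigma$,
\[
\begin{aligned}
\mathrm{L}^0_\Sigma\varphi &= -\Delta\varphi + 2\langle W, \varphi^{-1}\nabla\varphi\rangle\varphi + (\operatorname{div}W - |W|^2 + Q)\varphi \\
&= -\Delta\varphi + \bigl(|\varphi^{-1}\nabla\varphi|^2 + |W|^2 - |W - \varphi^{-1}\nabla\varphi|^2\bigr)\varphi + (\operatorname{div}W - |W|^2 + Q)\varphi \\
&= -\Delta\varphi + |\nabla\log\varphi|^2\varphi - |W - \nabla\log\varphi|^2\varphi + (\operatorname{div}W + Q)\varphi \\
&= -(\Delta\log\varphi)\varphi - |W - \nabla\log\varphi|^2\varphi + (\operatorname{div}W + Q)\varphi \\
&= [\operatorname{div}(W - \nabla\log\varphi)]\varphi - |W - \nabla\log\varphi|^2\varphi + Q\varphi.
\end{aligned}
\]
Now let $u$ be any smooth function compactly supported in the interior of $\Sigma$.
We multiply the above equation by $u^2\varphi^{-1}$ to obtain
\[
\begin{aligned}
u^2\varphi^{-1}\mathrm{L}^0_\Sigma\varphi &= [\operatorname{div}(W - \nabla\log\varphi)]u^2 - |W - \nabla\log\varphi|^2u^2 + Qu^2 \\
&= \operatorname{div}\bigl(u^2(W - \nabla\log\varphi)\bigr) - \langle W - \nabla\log\varphi, 2u\nabla u\rangle - |W - \nabla\log\varphi|^2u^2 + Qu^2 \\
&= \operatorname{div}\bigl(u^2(W - \nabla\log\varphi)\bigr) + |(W - \nabla\log\varphi)u|^2 + |\nabla u|^2 - |(W - \nabla\log\varphi)u + \nabla u|^2 \\
&\qquad - |W - \nabla\log\varphi|^2u^2 + Qu^2 \\
&= \operatorname{div}\bigl(u^2(W - \nabla\log\varphi)\bigr) + |\nabla u|^2 + Qu^2 - |(W - \nabla\log\varphi)u + \nabla u|^2.
\end{aligned}
\]
Integrating, we obtain
\[
\int_\Sigma u^2\varphi^{-1}\mathrm{L}^0_\Sigma\varphi \le \int_\Sigma \bigl(|\nabla u|^2 + Qu^2\bigr). \tag{7.6}
\]
By Proposition 7.36, we can choose $\varphi$ to be the principal eigenfunction of
$\mathrm{L}^0_\Sigma$, so that the left side of the inequality becomes $\lambda_1(\mathrm{L}^0_\Sigma)\int_\Sigma u^2$. The result
then follows from the Rayleigh quotient characterization of $\lambda_1(-\Delta_\Sigma + Q)$,
as explained in the proof of Theorem A.10.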
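For readers who want the last step spelled out, here is a short sketch. It assumes the usual Rayleigh quotient characterization of the principal Dirichlet eigenvalue of a self-adjoint operator, which the text attributes to the proof of Theorem A.10; the symbol $C_c^\infty(\operatorname{Int}\Sigma)$ is introduced here only for illustration.

```latex
% With \varphi the principal eigenfunction, L^0_\Sigma\varphi = \lambda_1(L^0_\Sigma)\varphi.
% The divergence term in the pointwise identity integrates to zero (u has compact support
% in the interior), and the dropped square term is nonpositive, so (7.6) reads
\[
\lambda_1(\mathrm{L}^0_\Sigma)\int_\Sigma u^2 \;\le\; \int_\Sigma \bigl(|\nabla u|^2 + Qu^2\bigr)
\qquad\text{for all } u \in C_c^\infty(\operatorname{Int}\Sigma).
\]
% Assumed Rayleigh quotient characterization of the self-adjoint eigenvalue:
\[
\lambda_1(-\Delta_\Sigma + Q)
= \inf_{u \not\equiv 0}\frac{\int_\Sigma \bigl(|\nabla u|^2 + Qu^2\bigr)}{\int_\Sigma u^2}
\;\ge\; \lambda_1(\mathrm{L}^0_\Sigma).
\]
% If \Sigma is a stable MOTS, then L_\Sigma = L^0_\Sigma and \lambda_1(L_\Sigma) \ge 0,
% so the numerator is nonnegative for every such u, which is (7.5).
```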
Moreover, using reasoning similar to that of Exercise 2.26, observe that
if the DEC holds on $(M, g, k)$, then for any hypersurface $\Sigma$, we have
\[
\lambda_1(-\Delta_\Sigma + Q_\Sigma) \le \tfrac12\lambda_1(L_h), \tag{7.7}
\]
where $L_h$ denotes the conformal Laplacian of the induced metric $h = g|_\Sigma$.
Moreover, if the strict DEC holds, then the inequality is strict.
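A possible route to (7.7) is sketched below. It rests on two assumptions that are not stated in this passage: that $\Sigma$ has dimension $m = n - 1 \ge 3$, and that the conformal Laplacian is normalized as $L_h = -\frac{4(m-1)}{m-2}\Delta_h + R_h$. The constant $c_m$ is notation introduced only for this sketch.

```latex
% Assumed normalization: \tfrac12 L_h = -c_m\Delta_h + \tfrac12 R_h,
% with c_m := \tfrac{2(m-1)}{m-2} > 1 for m \ge 3.
% The DEC gives Q_\Sigma \le \tfrac12 R_\Sigma = \tfrac12 R_h, so for every test function u
\[
\int_\Sigma \bigl(|\nabla u|^2 + Q_\Sigma u^2\bigr)
\;\le\; \int_\Sigma \bigl(c_m|\nabla u|^2 + \tfrac12 R_h u^2\bigr).
\]
% Taking the infimum over u with \int_\Sigma u^2 = 1 on both sides:
\[
\lambda_1(-\Delta_\Sigma + Q_\Sigma) \;\le\; \tfrac12\lambda_1(L_h).
\]
```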
7.5.2. Apparent horizons in initial data sets. Here we discuss some

theorems which are analogous to those presented in Section 4.1.1. One can
prove that a version of the strong comparison principle for mean curvature
(Corollary 4.2) also holds for θ+ , and the proof is fairly similar. See [AM09,
Proposition 2.4, AG05, Proposition 3.1] for details, as well as [Gal00] for
a related maximum principle for null hypersurfaces.
Proposition 7.38 (Strong maximum principle for $\theta^+$). Suppose we have
open sets $\Omega_1 \subset \Omega_2$ in an initial data set $(M, g, k)$, and smooth hypersurfaces
$\Sigma_1$ and $\Sigma_2$ (possibly with boundary) lie on $\partial\Omega_1$ and $\partial\Omega_2$, respectively, with
$\theta^+_{\Sigma_1} \le 0$ and $\theta^+_{\Sigma_2} \ge 0$, where these are computed using the outward-pointing
unit normal. If $\Sigma_1$ and $\Sigma_2$ touch at a point in their interiors, or are tangent to
each other at a common boundary point, then they must be identically equal
in a neighborhood of that point.
Consequently, a closed MOTS can never “penetrate” a foliation by weakly
outer untrapped surfaces (meaning $\theta^+ \ge 0$).
Next we would like to establish an existence result for apparent horizons,
generalizing Theorem 4.7, but we cannot produce a MOTS via a minimiza-
tion procedure, which is the standard technique used for producing minimal
hypersurfaces. However, we do have the following existence theorem due to
M. Eichmair [Eic09].
Theorem 7.39 (Existence theorem for MOTS). Let n < 8. Let (M n , g, k)
be an initial data set. Suppose Ω is an open subset of M whose compact
boundary ∂Ω can be divided into two smooth hypersurfaces with boundary
pieces ∂1 Ω and ∂2 Ω that meet along a smooth (n − 2)-dimensional subman-
ifold Γ.
Assume that ∂1 Ω is weakly outer untrapped (meaning θ+ ≥ 0) with re-
spect to the outward-pointing normal, and that ∂2 Ω is weakly outer trapped
(meaning θ+ ≤ 0) with respect to the inward-pointing normal. Then there
exists a smooth λ-minimizing stable MOTS Σ such that ∂Σ = Γ and Σ is
homologous to ∂1 Ω. Moreover, if (M, g, k) is asymptotically flat, then λ > 0
depends only on the geometry of (M, g, k).
The stability part of the conclusion was established in [EM16].

Briefly, a smooth boundary $\partial E$ in $\Omega$ is λ-minimizing if
$|\partial E \cap \Omega| \le |\partial^* F \cap \Omega| + \lambda|E \Delta F|$ for any open $F$ such that $E \Delta F \subset\subset \Omega$. (See [Eic09]
for details.) The main point here is that much like minimizing boundaries,
λ-minimizing boundaries have good regularity properties and come with use-
ful volume bounds. When n ≥ 8, one still obtains a solution Σ that satisfies
the MOTS equation in a weak sense, but it might have a singular set of
Hausdorff dimension at most n − 8. Eichmair’s proof of Theorem 7.39 fol-
lowed a suggestion of Schoen to produce MOTS via limits of solutions of the
regularized Jang equation—a phenomenon that was first observed by Schoen
and Yau in their proof of the spacetime positive energy theorem [SY81b].
Notice that in the time-symmetric case k = 0, this theorem reduces
to the previously known fact that one can always find a stable minimal
hypersurface with given boundary Γ, as long as Γ lies on the mean convex
boundary hypersurface. (A result of this type was alluded to in our proof
of Theorem 4.7.)
In Theorem 7.39, Γ could be empty, in which case ∂1 Ω and ∂2 Ω are made
up of components of ∂Ω and the theorem produces a closed MOTS. This case
(for n < 7) was also proved by Lars Andersson and Jan Metzger [AM09]
who implemented Schoen’s suggestion using different techniques. Using this,
we obtain the following generalization of Theorem 4.7, due to Andersson,
Eichmair, and Metzger [AM09, Eic09, AEM11].
Theorem 7.40 (Existence and uniqueness of apparent horizons). Let n < 8,

and let (M n , g, k) be a complete asymptotically flat initial data set (possibly
with boundary).
(1) If M has nonempty weakly outer trapped boundary and only one
end, then there exists a smooth apparent horizon.
(2) If an end of M has an apparent horizon, then it is unique, and
moreover both the horizon and the region outside the horizon are
orientable.
Sketch of the proof. The proof is similar to that of Theorem 4.7. We
start with the first statement and construct a stable MOTS homologous to
a large coordinate sphere. By asymptotic flatness, the mean curvature of the
coordinate sphere of radius $\rho$ is approximately $\frac{n-1}{\rho}$, while $k$ decays faster.
Therefore the end is foliated by outer untrapped hypersurfaces. Let $\Omega$ be
the region of $\operatorname{Int} M$ whose boundary is a large coordinate sphere $S_\rho$, so that
$\partial_1\Omega$ can be taken to be $S_\rho$ and $\partial_2\Omega$ can be taken to be $\partial M$ (with normal
vector pointing into $\Omega$), while $\Gamma = \emptyset$. We can now see that $\Omega$ satisfies the
hypotheses of Theorem 7.39, and therefore we obtain a closed stable MOTS
homologous to $\partial_1\Omega = S_\rho$.
From here, we can argue as we did in Theorem 4.7, though there are
some added technical issues to address. See [AEM11] for details.
We immediately obtain a generalization of Corollary 4.9.

Corollary 7.41. If (M n , g, k) is a complete initial data set with more than
one end, then there is an apparent horizon corresponding to each end. The
result still holds if M has a boundary, as long as that boundary is weakly
outer trapped.
Recall from Proposition 2.18 that every stable, two-sided closed minimal
surface in an orientable 3-manifold with positive scalar curvature must be a
sphere. The exact same reasoning leads to the following.
Exercise 7.42. Let (M 3 , g, k) be an orientable initial data set satisfying
the strict dominant energy condition, meaning that μ > |J|g everywhere.
Prove that every stable, two-sided closed MOTS is a sphere.
Using Proposition 7.37, we can control the topology of apparent horizons

as we did in the Riemannian case in Corollary 4.10.
Theorem 7.43 (Galloway-Schoen [GS06], Galloway [Gal18]). If Σ is an
apparent horizon in an initial data set (M n , g, k) satisfying the dominant
energy condition, then Σ is orientable and Yamabe positive. In particular,
if n = 3, then Σ is a union of spheres.
Proof. One critical observation is that an apparent horizon must be a stable
two-sided orientable MOTS. This follows from the construction and uniqueness
statement in Theorem 7.40, but it is instructive to see how the stability
follows directly from the outermost property. Deform $\Sigma$ in the direction $\varphi\nu$,
where $\varphi > 0$ is the principal eigenfunction of the MOTS stability operator
(Definition 7.35) with eigenvalue $\lambda$. If $\Sigma$ were not stable, then we would
have $\frac{\partial}{\partial t}\theta^+ = \mathrm{L}_\Sigma\varphi = \lambda\varphi < 0$, which would mean that these small deformations
would have $\theta^+ < 0$ and therefore be outer trapped. By Theorem 7.39,
we could then produce a MOTS strictly enclosing $\Sigma$, which would violate
the outermost property of $\Sigma$.
As we saw in Proposition 7.37 and inequality (7.7), stability and the
DEC imply that
\[
0 \le \lambda_1(\mathrm{L}_\Sigma) \le \lambda_1(-\Delta_\Sigma + Q_\Sigma) \le \tfrac12\lambda_1(L_h),
\]
where $L_h$ is the conformal Laplacian of $h = g|_\Sigma$. If any of these inequalities
is strict, then it follows that $\Sigma$ is Yamabe positive, by Corollary 2.27.
Suppose that all of the equalities above hold and that Σ is not Yamabe
positive. We will argue that this leads to a contradiction by constructing a
splitting of M near Σ, which will contradict the outermost MOTS property

of Σ. This part of the proof comes from [Gal18]. Note that we have already
seen much of this argument in our proof of Theorem 2.41.
We follow the steps of the proof of Theorem 2.38 in order to construct
a foliation with constant $\theta^+$. For each smooth function $u$ on $\Sigma$, we consider
the image hypersurface $\Sigma[u]$ of $\Sigma$ under the map $F_u(x) = \exp_x(u(x)\nu)$.
All hypersurfaces that are close to $\Sigma = \Sigma[0]$ in the smooth sense can be
parameterized by functions $u$ that are close to zero. For $\alpha \in (0, 1)$, consider
the map $\Psi$ from a small ball in $C^{2,\alpha}(\Sigma) \times \mathbb{R}$ to $C^{0,\alpha}(\Sigma) \times \mathbb{R}$, defined by
\[
\Psi(u, s) = \left(F_u^*\theta^+_{\Sigma[u]} - s,\ \frac{1}{|\Sigma|}\int_\Sigma u\,d\mu_\Sigma\right),
\]
where $F_u^*\theta^+_{\Sigma[u]}$ is the null expansion $\theta^+$ of the image surface $\Sigma[u]$, pulled
back to the original surface $\Sigma$. Then
\[
D\Psi|_{(0,0)}(u, s) = \left(\mathrm{L}_\Sigma u - s,\ \frac{1}{|\Sigma|}\int_\Sigma u\,d\mu_\Sigma\right).
\]
Since $\lambda_1(\mathrm{L}_\Sigma) = 0$, the kernel of $\mathrm{L}_\Sigma$ is spanned by a positive principal
eigenfunction, and then it is an exercise to conclude that the kernel of $\mathrm{L}^*_\Sigma$ is also
spanned by a positive function. From this one can see that $D\Psi|_{(0,0)}$ is an
isomorphism. By the inverse function theorem (Theorem A.43), there exist
$\epsilon > 0$ and a smooth map $(v, \theta^+) : (-\epsilon, \epsilon) \to C^{2,\alpha}(\Sigma) \times \mathbb{R}$ such that
$\Psi(v(t), \theta^+(t)) = (0, t)$ for all $t \in (-\epsilon, \epsilon)$. Just as in the proof of
Theorem 2.38, this means that in a neighborhood of $\Sigma$, we can write the metric
$g$ as
\[
g = h_t + \varphi_t^2\,dt^2,
\]
where $h_t$ is the induced metric on the hypersurface level sets $\Sigma_t := \Sigma \times \{t\}$,
and each level set has constant $\theta^+_{\Sigma_t} = \theta^+(t)$, and $t > 0$ is on the outside of
$\Sigma$.
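According to Proposition 7.32 and Definition 7.35, we have
\[
(\theta^+)'(t) = \mathrm{L}^0_{\Sigma_t}\varphi_t + \tfrac12\theta^+(t)\bigl[\theta^-_{\Sigma_t} + 2k(\nu_t, \nu_t)\bigr]\varphi_t.
\]
Choose a constant $C$ large enough so that $\tfrac12\bigl[\theta^-_{\Sigma_t} + 2k(\nu_t, \nu_t)\bigr]\varphi_t \le C$ for all
$t \in [0, \epsilon)$ and all points in $\Sigma_t$. We claim that $\theta^+(t) > 0$ for all $t \in (0, \epsilon)$. If
not, then we could use the existence theorem for MOTS (Theorem 7.39) to
construct a MOTS outside of $\Sigma$, but this contradicts the outermost property
of $\Sigma$, proving the claim. So we have
\[
(\theta^+)'(t) \le \mathrm{L}^0_{\Sigma_t}\varphi_t + C\theta^+(t),
\]
and thus
\[
\frac{d}{dt}\bigl(e^{-Ct}\theta^+(t)\bigr) \le e^{-Ct}\,\mathrm{L}^0_{\Sigma_t}\varphi_t.
\]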
Since $e^{-Ct}\theta^+(t)$ is positive for all $t \in (0, \epsilon)$ but zero at $t = 0$, it follows that
there exists a time $t = \tau$ such that the left side of the above inequality is
positive. Hence $\mathrm{L}^0_{\Sigma_\tau}\varphi_\tau > 0$, and thus $\varphi_\tau^{-1}\mathrm{L}^0_{\Sigma_\tau}\varphi_\tau \ge c > 0$ for some constant
$c$. Using this choice of $\Sigma_\tau$ and $\varphi_\tau$ in (7.6) and invoking (7.7), it follows that
\[
0 < c \le \lambda_1(-\Delta_{\Sigma_\tau} + Q_{\Sigma_\tau}) \le \tfrac12\lambda_1(L_{h_\tau}).
\]
Thus $\Sigma$ is Yamabe positive, which is a contradiction.
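The final steps of this proof compress an integrating-factor argument. A sketch of the details follows; the name $F$ is introduced here for convenience, and the compactness of $\Sigma_\tau$ (needed to bound $\varphi_\tau$ above) is assumed from the setting of closed apparent horizons.

```latex
% Integrating factor: set F(t) := e^{-Ct}\theta^+(t).  Then
\[
F'(t) = e^{-Ct}\bigl((\theta^+)'(t) - C\theta^+(t)\bigr) \;\le\; e^{-Ct}\,\mathrm{L}^0_{\Sigma_t}\varphi_t .
\]
% F(0) = 0 and F > 0 on (0,\epsilon), so by the mean value theorem F'(\tau) > 0 for some \tau.
% F'(\tau) is a number (constant along \Sigma_\tau), so the inequality holds pointwise:
\[
\mathrm{L}^0_{\Sigma_\tau}\varphi_\tau \;\ge\; e^{C\tau}F'(\tau) > 0 \ \text{ on } \Sigma_\tau,
\qquad
c := \frac{e^{C\tau}F'(\tau)}{\max_{\Sigma_\tau}\varphi_\tau} > 0 .
\]
% Inserting \varphi_\tau^{-1}\mathrm{L}^0_{\Sigma_\tau}\varphi_\tau \ge c into (7.6), for every test function u:
\[
c\int_{\Sigma_\tau}u^2 \;\le\; \int_{\Sigma_\tau}u^2\varphi_\tau^{-1}\mathrm{L}^0_{\Sigma_\tau}\varphi_\tau
\;\le\; \int_{\Sigma_\tau}\bigl(|\nabla u|^2 + Q_{\Sigma_\tau}u^2\bigr),
\]
% which gives c \le \lambda_1(-\Delta_{\Sigma_\tau} + Q_{\Sigma_\tau}).
```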
The n = 3 case of Theorem 7.43 was first observed by Hawking in the

case of stationary spacetimes [Haw72]. (Interestingly, this predates Schoen
and Yau’s work on scalar curvature described in Chapter 1.) When the
spacetime dimension is greater than 4, there do exist examples of apparent
horizons with nonspherical topology. Most famously, there are the “black
ring” stationary vacuum Einstein solutions of R. Emparan and H. Reall
in 4+1 dimensions [ER02, ER06], whose apparent horizons have S 2 × S 1
topology. See [Chr15, Chapter 2] for a mathematical exposition of the
Emparan-Reall black rings. Other examples of apparent horizons with non-
spherical topology have been constructed by Fernando Schwartz [Sch08],
Kunduri and Lucietti [KL14], and Mattias Dahl and Eric Larsson [DL16].
See also recent work of M. Khuri, Y. Matsumoto, G. Weinstein, and S. Yamada
on the topology of (4 + 1)-dimensional stationary bi-axisymmetric
black holes [KMWY18].
When n = 3, we have the following generalization of Theorem 4.11.
Theorem 7.44 (Eichmair-Galloway-Pollack [EGP13]). Let $(M^3, g, k)$ be
an asymptotically flat initial data set whose boundary is either empty or
a union of MOTS. Assume that $M$ contains no immersed MOTS in its
interior. Then $M$ is diffeomorphic to $\mathbb{R}^3$ minus a finite number of open
balls.
We already gave the proof as our proof of Theorem 4.11. The only
difference is that now we have to use Corollary 7.41 in place of Corollary 4.9.
7.6. The Penrose inequality

We consider the following generalization of Conjecture 4.12.
Conjecture 7.45 ((Spacetime) Penrose inequality conjecture). Let
$(M^n, g, k)$ be a complete asymptotically flat initial data set satisfying the
dominant energy condition, and suppose it contains an apparent horizon $\Sigma$
with respect to one of the ends. If $m$ is the ADM mass of that end and
$\Sigma'$ is the strictly minimizing hull of $\Sigma$, then
\[
m \ge \frac12\left(\frac{|\Sigma'|}{\omega_{n-1}}\right)^{\frac{n-2}{n-1}},
\]
and if equality holds, then the part of $(M, g, k)$ outside $\Sigma$ sits inside the
Schwarzschild spacetime of mass $m$.
Clearly, in the time-symmetric case k = 0, the inequality reduces to

the one in Conjecture 4.12. We will now discuss Penrose’s original physical
motivation for this conjecture, though it is important to note that the orig-
inal conjecture was only for n = 3 since that is the only case in which the
physical motivation is valid. However, given that the Riemannian Penrose
inequality is known to be true for n < 8, it seems reasonable to conjecture
that the spacetime version also holds in higher dimensions.
Assuming weak cosmic censorship and the more general version of the
final state conjecture, the initial data in Conjecture 7.45 will evolve under
some appropriate matter model to a spacetime that is approaching a Kerr-
Newman solution. By Hawking’s argument described in Section 7.4.3, the
apparent horizon Σ must lie inside some black hole event horizon H. By
the Hawking area theorem, the cross-sections of H must have monotone
nondecreasing area as we move forward in time. Meanwhile, since energy
can only radiate away to infinity, the mass must be monotone nonincreasing.
More precisely, it is the Trautman-Bondi mass [BvdBM62, Tra58, Sac62],
which we have not discussed, that is nonincreasing, and we know that the
Trautman-Bondi mass approaches the ADM mass in many situations (and
hope or expect that it does so generally). Since the inequality
\[
m_0 \ge \sqrt{\frac{|H \cap M_0|}{16\pi}}
\]
is known to be true for a standard slice $M_0$ of a Kerr-Newman spacetime of
mass $m_0$, it should then follow that the inequality also holds for the original
initial data set $M$. The final step is to replace $|H \cap M|$ by $|\Sigma'|$, which holds
because $\Sigma$ is enclosed by $H$. Note that since an apparent horizon need not
be outward-minimizing, we should not expect to be able to replace $|H \cap M|$
by $|\Sigma|$.
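For $n = 3$, the heuristic can be summarized as a single chain of inequalities. This is only a restatement of the argument above under its conjectural assumptions (weak cosmic censorship, the final state conjecture, and agreement of the limiting Trautman-Bondi mass with the ADM mass); the comments naming each step are ours.

```latex
% Step 1: mass is nonincreasing in time      (m \ge m_0)
% Step 2: inequality on the Kerr-Newman slice M_0
% Step 3: Hawking area theorem               (|H \cap M_0| \ge |H \cap M|)
% Step 4: H \cap M encloses \Sigma           (|H \cap M| \ge |\Sigma'|)
\[
m \;\ge\; m_0 \;\ge\; \sqrt{\frac{|H \cap M_0|}{16\pi}}
\;\ge\; \sqrt{\frac{|H \cap M|}{16\pi}}
\;\ge\; \sqrt{\frac{|\Sigma'|}{16\pi}} .
\]
% Consistency with Conjecture 7.45 when n = 3, where \omega_2 = 4\pi:
\[
\frac12\left(\frac{|\Sigma'|}{\omega_2}\right)^{1/2} = \sqrt{\frac{|\Sigma'|}{16\pi}} .
\]
```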
The cleverness of Penrose’s conjecture is that he took a complicated con-
jectural picture about the future development of an initial data set under
Einstein’s equations and used it to produce a highly nontrivial, nonobvious
conclusion about quantities that are well-defined in the initial data set itself.
The main purpose of this was to test the plausibility of weak cosmic censor-
ship. For a much longer and better discussion of the motivation behind the
Penrose inequality and its consequences, see the survey [Mar09].
As we have seen in Chapter 4, this conjecture has been proved in the
time-symmetric case. The general conjecture is essentially wide-open,
though we do have a result for the spherically symmetric case. For this pur-
pose we consider a boundary ∂M which is an “outermost MOTS/MITS,”
meaning that ∂M is a MOTS or a MITS (that is, a marginally inner trapped

surface) and there are no other MOTS nor MITS enclosing it.
Theorem 7.46 (Spherically symmetric Penrose inequality [MḾ94,
Hay96]). Let $(M, g, k)$ be a complete asymptotically flat initial data set
diffeomorphic to $[0, \infty) \times S^{n-1}$ which is spherically symmetric in the sense that
the metric can be expressed as
\[
g = ds^2 + r^2\,d\Omega^2
\]
for some smooth positive function $r(s)$, where $d\Omega^2$ is the standard metric on
the sphere, while $k$ can be written as
\[
k = k_{\nu\nu}\,ds^2 + \frac{1}{n-1}\kappa r^2\,d\Omega^2,
\]
where $k_{\nu\nu}$ and $\kappa$ are smooth functions of $s$. Note that $k_{\nu\nu} = k(\nu, \nu)$, where
$\nu$ is the outward normal of a symmetric sphere at $s$, while $\kappa$ is the trace of
$k$ over the symmetric sphere at $s$.
If $(g, k)$ satisfies the dominant energy condition and $\partial M$ is an outermost
MOTS/MITS, then
\[
m \ge \frac12\left(\frac{|\partial M|}{\omega_{n-1}}\right)^{\frac{n-2}{n-1}},
\]
where $m$ is the ADM mass of $(M, g)$. Moreover, if equality holds, then
$(M, g, k)$ lies inside the Schwarzschild spacetime of mass $m$.
Proof. This exposition is based on the proof appearing in [Mar09, Section 4].
We claim that $\partial M$ is outward-minimizing, which is equivalent to
saying that $\frac{dr}{ds} > 0$ in the interior of $M$. Since $\frac{dr}{ds} = \frac{1}{n-1}Hr$, this is also
equivalent to saying that there are no minimal spheres strictly enclosing
$\partial M$. If there were a minimal sphere strictly enclosing $\partial M$, then it would
have $H = 0$ there, and since $H > |\kappa|$ for large $r$, it follows from continuity
that there must be some sphere with $H = |\kappa|$ that strictly encloses $\partial M$. But
that would be a MOTS or a MITS, violating the outermost property of $\partial M$.
Note that this argument also shows that the statement in Conjecture 7.45
follows from Theorem 7.46 for spherically symmetric spaces, since in this
case the minimizing hull $\Sigma'$ of any MOTS $\Sigma$ is either $\Sigma$ itself or else $\Sigma'$ is
minimal. In either case, $|\Sigma'|$ must be less than or equal to the volume of the
outermost MOTS/MITS enclosing it.
Since $\frac{dr}{ds} > 0$ on the interior of $M$, we can perform a change of variables,
just as in the proof of Proposition 4.20, so that
\[
g = V^{-1}\,dr^2 + r^2\,d\Omega^2
\]
on $[r_0, \infty) \times S^{n-1}$, where $\sqrt{V} = \frac{dr}{ds}$ and $r_0$ satisfies $|\partial M| = \omega_{n-1}r_0^{n-1}$. We
consider the function
\[
\begin{aligned}
m(r) &:= \frac12 r^{n-2}\left[1 + \frac{1}{(n-1)^2}(\kappa^2 - H^2)r^2\right] \\
&= \frac12 r^{n-2}(1 - V) + \frac{1}{2(n-1)^2}\kappa^2r^n,
\end{aligned}
\]
where the second expression can be easily compared to the one used in the
proof of Proposition 3.20. In the $n = 3$ case, this can be written more
geometrically as
\[
m(r) = \sqrt{\frac{|S_r|}{16\pi}}\left(1 + \frac{1}{16\pi}\int_{S_r}\theta^+\theta^-\,d\mu_{S_r}\right),
\]
where the right side is the appropriate generalization of the Hawking mass
of the sphere $S_r$ to the initial data setting. In the spherically symmetric
setting, it is called the Misner-Sharp energy.
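The agreement of these expressions for $m(r)$ can be checked directly. The sketch below uses $\frac{dr}{ds} = \frac{1}{n-1}Hr$ and $\sqrt{V} = \frac{dr}{ds}$ from the proof; for the $n = 3$ formula it additionally assumes the convention $\theta^\pm = \kappa \pm H$ on symmetric spheres, which is our reading of the definitions of MOTS and MITS used here.

```latex
% From dr/ds = Hr/(n-1) and \sqrt{V} = dr/ds:
\[
V = \frac{H^2r^2}{(n-1)^2}
\quad\Longrightarrow\quad
\frac12 r^{n-2}\left[1 + \frac{(\kappa^2 - H^2)r^2}{(n-1)^2}\right]
= \frac12 r^{n-2}(1 - V) + \frac{\kappa^2r^n}{2(n-1)^2}.
\]
% Case n = 3: |S_r| = 4\pi r^2, and (assumed convention) \theta^+\theta^- = \kappa^2 - H^2, so
\[
\sqrt{\frac{|S_r|}{16\pi}} = \frac r2,
\qquad
\frac{1}{16\pi}\int_{S_r}\theta^+\theta^-\,d\mu_{S_r} = \frac{(\kappa^2 - H^2)r^2}{4},
\]
% which reproduces m(r) = \tfrac12 r\,[1 + \tfrac14(\kappa^2 - H^2)r^2].
```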
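Claim.
\[
m'(r) = \frac{1}{n-1}r^{n-1}\left(\mu - \frac{\kappa}{H}\langle J, \nu\rangle\right).
\]
In the proof of Proposition 3.20, we already saw that
\[
(n-1)^2\frac{d}{dr}\left[\frac12 r^{n-2}(1 - V)\right] = \frac{n-1}{2}r^{n-1}R. \tag{7.8}
\]
Meanwhile, looking at the $\kappa$ term in $m(r)$, we compute
\[
(n-1)^2\frac{d}{dr}\left[\frac{1}{2(n-1)^2}\kappa^2r^n\right] = r^{n-1}\left[r\kappa\frac{d\kappa}{dr} + \frac n2\kappa^2\right]. \tag{7.9}
\]
Exercise 7.47. Show that in the spherically symmetric case, the constraint
equations (Definition 7.16) reduce to
\[
\begin{aligned}
\mu &= \frac12\left[R + 2\kappa k_{\nu\nu} + \frac{n-2}{n-1}\kappa^2\right], \\
\langle J, \nu\rangle &= H\left[-\frac{1}{n-1}r\frac{d\kappa}{dr} + k_{\nu\nu} - \frac{1}{n-1}\kappa\right].
\end{aligned}
\]
Rearranging the result of the exercise above for inclusion in equations
(7.8) and (7.9),
\[
\begin{aligned}
\frac{n-1}{2}r^{n-1}R &= r^{n-1}\left[(n-1)\mu - (n-1)\kappa k_{\nu\nu} - \frac{n-2}{2}\kappa^2\right], \\
r^n\kappa\frac{d\kappa}{dr} &= r^{n-1}\left[-(n-1)\frac{\kappa}{H}\langle J, \nu\rangle + (n-1)\kappa k_{\nu\nu} - \kappa^2\right].
\end{aligned}
\]
Feeding these into (7.8) and (7.9), we obtain the Claim. Another way to
derive the formula for $m'(r)$ is to start with the general variation formula for
$\theta^+$ in Proposition 7.32 as well as the analogous formula for $\theta^-$ (Exercise 7.34)
and specialize to the case of spherical symmetry.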
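To see how the pieces combine, one can add (7.8) and (7.9) and substitute the two rearranged constraint identities. The following is a sketch of that bookkeeping only; it takes the displayed identities as given.

```latex
\[
\begin{aligned}
(n-1)^2\,m'(r)
&= \frac{n-1}{2}r^{n-1}R + r^n\kappa\frac{d\kappa}{dr} + \frac n2 r^{n-1}\kappa^2 \\
&= r^{n-1}\Bigl[(n-1)\mu - (n-1)\kappa k_{\nu\nu} - \tfrac{n-2}{2}\kappa^2
   - (n-1)\tfrac{\kappa}{H}\langle J,\nu\rangle + (n-1)\kappa k_{\nu\nu} - \kappa^2 + \tfrac n2\kappa^2\Bigr] \\
&= (n-1)\,r^{n-1}\Bigl[\mu - \tfrac{\kappa}{H}\langle J,\nu\rangle\Bigr].
\end{aligned}
\]
% The k_{\nu\nu} terms cancel, and the \kappa^2 coefficients sum to -\tfrac{n-2}{2} - 1 + \tfrac n2 = 0.
% Dividing by (n-1)^2 gives the Claim.
```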
Again, observe that since $H > |\kappa|$ for large $r$, any sphere with $H < |\kappa|$
will lead to the existence of a MOTS or MITS enclosing it. Therefore the
outermost property of $\partial M$ implies that we must have $H > |\kappa|$ in the interior
of $M$. From this, the dominant energy condition implies that $\mu \ge \frac{\kappa}{H}\langle J, \nu\rangle$, and hence
$m'(r) \ge 0$.
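The use of the dominant energy condition here amounts to the following short estimate, written out as a sketch under the assumption that the DEC reads $\mu \ge |J|_g$.

```latex
% In the interior of M we have H > |\kappa| \ge 0, hence |\kappa|/H < 1, and |\nu| = 1.
\[
\frac{\kappa}{H}\langle J,\nu\rangle
\;\le\; \frac{|\kappa|}{H}\,|J|_g
\;\le\; |J|_g
\;\le\; \mu ,
\qquad\text{so}\qquad
m'(r) = \frac{1}{n-1}r^{n-1}\Bigl(\mu - \frac{\kappa}{H}\langle J,\nu\rangle\Bigr) \;\ge\; 0 .
\]
```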
Finally, it is easy to check that since $\kappa^2 = H^2 = (n-1)^2V(r_0)r_0^{-2}$ at
$\partial M$, we have
\[
\frac12\left(\frac{|\partial M|}{\omega_{n-1}}\right)^{\frac{n-2}{n-1}} = \frac12 r_0^{n-2} = m(r_0) \le \lim_{r\to\infty}m(r) = E,
\]
where $E$ is the ADM energy. The last equality follows from Exercise 3.21,
since the $\kappa^2$ term in $m(r)$ decays too fast to contribute to the limit. Finally,
one can easily see that spherical symmetry implies that the ADM momentum
is zero (as you would expect), so that the ADM energy $E$ is the same as the
ADM mass $m$.
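The two equalities on the left of this chain can be verified as follows. This is a sketch using only the definitions of $r_0$ and $m(r)$ given in the proof.

```latex
% Definition of r_0: |\partial M| = \omega_{n-1}r_0^{n-1}, hence
\[
\left(\frac{|\partial M|}{\omega_{n-1}}\right)^{\frac{n-2}{n-1}} = \bigl(r_0^{\,n-1}\bigr)^{\frac{n-2}{n-1}} = r_0^{\,n-2}.
\]
% \partial M is a MOTS or a MITS, so H = |\kappa| there and \kappa^2 - H^2 = 0 at r = r_0:
\[
m(r_0) = \frac12 r_0^{\,n-2}\left[1 + \frac{(\kappa^2 - H^2)r_0^2}{(n-1)^2}\right] = \frac12 r_0^{\,n-2}.
\]
% The inequality m(r_0) \le \lim_{r\to\infty}m(r) is the monotonicity m'(r) \ge 0.
```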
Next we consider the case of equality, which is unfortunately a bit messy.
We will show that if equality holds, then $M$ sits inside the Schwarzschild
spacetime of mass $m$ as a graph over the $t = 0$ slice. To do this, let us first
make some calculations on graphical hypersurfaces. Consider the spacetime
Schwarzschild metric of mass $m$,
\[
g_m = -V_m\,dt^2 + V_m^{-1}\,dr^2 + r^2\,d\Omega^2,
\]
where $V_m(r) = 1 - 2mr^{2-n}$. Next consider the graph of a radial function
$f(r)$ over the $t = 0$ slice, that is, we look at the hypersurface defined by
$t = f(r)$. A quick calculation shows that if we write the induced metric $g^f$
on the graph as
\[
g^f = V_f^{-1}\,dr^2 + r^2\,d\Omega^2,
\]
then
\[
V_f^{-1} = V_m^{-1} - V_m(f')^2.
\]
We can also consider the second fundamental form $k^f$ of the graph in $g_m$ and
look at its normal-normal component $k^f_{\nu\nu}$ and its trace over the sphere $\kappa^f$
(both computed with respect to $g^f$). Some routine but somewhat involved
computations (try it) show that
\[
\kappa^f = \frac{n-1}{r}\sqrt{V_f - V_m}, \tag{7.10}
\]
\[
k^f_{\nu\nu} = \frac{d}{dr}\sqrt{V_f - V_m}. \tag{7.11}
\]
By spherical symmetry, these components determine all of $k^f$. Therefore,
in order to prove that some given spherically symmetric $(M, g, k)$ sits inside
the Schwarzschild spacetime of mass $m$, it is sufficient to find some $f$ such
that $V_f = V$, $\kappa^f = \kappa$, and $k^f_{\nu\nu} = k_{\nu\nu}$.
Now assume that $(M, g, k)$ satisfies equality in the Penrose inequality.
By our proof of that inequality, we have $m(r) = m$ for all $r \ge r_0$. This
translates into the statement that
\[
V - V_m = \left(\frac{\kappa r}{n-1}\right)^2. \tag{7.12}
\]
In particular, this means that it is possible to find some $f$ such that $V_f = V$.
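Equation (7.12) and the existence of $f$ can be unpacked as follows. The sketch assumes we are in the region where $V > 0$ and $V_m > 0$, so that the formula $V_f^{-1} = V_m^{-1} - V_m(f')^2$ can be solved for $f'$.

```latex
% m(r) = m, multiplied by 2r^{2-n}:
\[
1 - V + \frac{\kappa^2r^2}{(n-1)^2} = 2mr^{2-n} = 1 - V_m
\quad\Longrightarrow\quad
V - V_m = \left(\frac{\kappa r}{n-1}\right)^2 \;\ge\; 0 .
\]
% Setting V_f = V in V_f^{-1} = V_m^{-1} - V_m (f')^2 and solving for (f')^2:
\[
(f')^2 = \frac{V_m^{-1} - V^{-1}}{V_m} = \frac{V - V_m}{V\,V_m^{2}} \;\ge\; 0 ,
\]
% so a real-valued f' (and hence f, by integration) exists wherever V, V_m > 0.
```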
We now just need to show that with this choice of $f$, $\kappa^f = \kappa$ and $k^f_{\nu\nu} = k_{\nu\nu}$.
The equation for $\kappa^f$ follows directly from (7.10) and (7.12). To prove the
equation for $k^f_{\nu\nu}$, observe that since $m'(r) = 0$ for all $r \ge r_0$, our Claim above
says that $\mu = \frac{\kappa}{H}\langle J, \nu\rangle$. Recall that $\frac{|\kappa|}{H} < 1$ outside $\partial M$ by the outermost
MOTS/MITS assumption, so together with the DEC, this tells us that
$\mu = \langle J, \nu\rangle = 0$. By Exercise 7.47, it follows that
\[
k_{\nu\nu} = \frac{d}{dr}\left(\frac{\kappa r}{n-1}\right).
\]
Combining this with (7.11) and (7.12), we see that $k^f_{\nu\nu} = k_{\nu\nu}$, completing the
proof.
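Two steps in the equality case are stated without computation; a sketch of both follows. It assumes the DEC in the form $\mu \ge |J|_g$ and uses the second identity of Exercise 7.47 as displayed above.

```latex
% Step 1: \mu = \langle J,\nu\rangle = 0.  From \mu = (\kappa/H)\langle J,\nu\rangle and the DEC,
\[
\mu \;\le\; \frac{|\kappa|}{H}\,|J|_g \;\le\; \frac{|\kappa|}{H}\,\mu
\quad\Longrightarrow\quad
\Bigl(1 - \frac{|\kappa|}{H}\Bigr)\mu \le 0
\quad\Longrightarrow\quad
\mu = 0 \ \text{ and } \ |J|_g \le \mu = 0 .
\]
% Step 2: with \langle J,\nu\rangle = 0 and H > 0, the second identity of Exercise 7.47 gives
\[
k_{\nu\nu} = \frac{1}{n-1}\Bigl(r\frac{d\kappa}{dr} + \kappa\Bigr) = \frac{d}{dr}\left(\frac{\kappa r}{n-1}\right).
\]
```

Note that (7.10) and (7.12) determine $\kappa^f$ only through a square root, so matching the sign of $\kappa$ presumably relies on choosing the sign of $f'$ appropriately; that choice is not discussed in the text above.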
Getting back to the general Penrose conjecture (Conjecture 7.45),

H. Bray and M. Khuri have suggested an approach that involves using a
generalized version of the Jang equation used in [SY81b] and “coupling” it
with either inverse mean curvature flow or the Bray flow. This approach re-
duces the problem to one of solving a system of coupled PDEs. See [BK10].
In fact, even a positive mass theorem for initial data sets with MOTS
boundary, which would be the natural spacetime analog of Theorem 4.16,
does not easily follow from the version without boundary, since there is no
doubling trick for MOTS. However, we do have a result in the spin case. See
Theorem 8.29.
Chapter 8
The spacetime positive mass theorem
We now consider the positive mass theorem for initial data sets.
Theorem 8.1 (Spacetime positive mass theorem). Let n ≥ 3, and let

(M n , g, k) be a complete asymptotically flat initial data set satisfying the
dominant energy condition. Assume that n < 8 or that M is spin. Then
E ≥ |P | in each end, where (E, P ) denotes the ADM energy-momentum
vector of (g, k).
After Schoen and Yau proved the positive mass theorem in the Riemannian
setting, they next used the Jang equation to prove a positive energy
theorem ($E \ge 0$) for general three-dimensional initial data sets satisfying
the dominant energy condition [SY81b]. Meanwhile, Witten’s spinor
proof, whose time-symmetric version was presented in Chapter 5, was originally
written for general initial data sets. Many years later M. Eichmair
generalized Schoen and Yau’s $E \ge 0$ theorem to dimensions less than 8
[Eic13]. Schoen suggested that Schoen and Yau’s proof of Theorem 3.18
could be generalized to the initial data setting using MOTS in place of minimal
hypersurfaces, and this was finally carried out in [EHLS16].¹ Recently,
Lohkamp has treated the higher-dimensional cases [Loh16] using the theory
he developed for the Riemannian case [Loh15c, Loh15a, Loh15b].
¹ That paper assumed stronger decay of $(\mu, J)$ than in Definition 7.17, but that was not
necessary.
8.1. Proof for n < 8

We will essentially follow the exposition given in [EHLS16]. The proof given
there does not use a “compactification trick” as in Lemma 3.39, but rather it
instead follows Schoen and Yau’s original approach to the Riemannian posi-
tive mass theorem, with minimal hypersurfaces replaced by MOTS. It turns
out that there is also a clever compactification trick for the spacetime posi-
tive mass theorem, once again discovered by Lohkamp [Loh16, Section 2],
but it is significantly more complicated than Lemma 3.39 and relies upon
the so-called “boost theorem” of Christodoulou and O’Murchadha [CO81].
Since our proof here will mimic the original Schoen-Yau proof of Theo-
rem 3.18 in dimensions less than 8 (which we have not yet discussed), we will
summarize that proof here: it is a proof by induction on dimension and by
contradiction. Let 3 ≤ n < 8, assume that we have established the desired
result up to dimension n − 1 (unless n = 3), and suppose that (M n , g) is a
counterexample. We will show that there exists a counterexample one di-
mension lower (unless n = 3, in which case we use Gauss-Bonnet to obtain a
contradiction). First use Lemmas 3.31, 3.34, and 3.35 to assume without loss
of generality that our counterexample (M, g) is actually harmonically flat
outside a compact set. Next, the negative mass assumption and harmonic
flatness allow us to show that the coordinate planes $x^n = \pm\Lambda$ are barriers
for minimal hypersurfaces for large enough $\Lambda$. This allows us to construct a
complete minimal hypersurface Σ in (M, g) with a desirable strong stability
property. (This is in contrast with what was done in Section 3.3, where a
compact minimal hypersurface was constructed inside a compact manifold.)
This Σ will be asymptotically flat with zero mass, and the strong stability
property allows us to make a conformal change on Σ that both lowers the
mass (making it negative) and gives it nonnegative scalar curvature. This
contradicts the positive mass theorem in one dimension lower.
For our proof of the spacetime positive mass theorem, we will follow each
of the steps described above, except that some of the steps are significantly
more complicated.
8.1.1. Construction of a complete stable MOTS. First we need an

analog of harmonic flatness that makes sense for initial data sets. In our
use of harmonic flatness in Section 3.3, it was not so important that the
conformal factor was actually harmonic but rather that it had the same
asymptotics as a harmonic function. Recall from Definition 7.16 that we
can think of initial data sets in terms of (M, g, π) rather than (M, g, k).
Definition 8.2. Let $n \ge 3$. Let $(M^n, g, \pi)$ be an asymptotically flat initial
data set. We say that $(M, g, \pi)$ has harmonic asymptotics in a particular
end if there exists a coordinate chart $\mathbb{R}^n \setminus \bar B_1(0)$ for that end such that in
that coordinate chart, there exist a smooth function $u$, a smooth vector
field $Y$, and constants $a$, $b_i$ such that
\[
\begin{aligned}
u(x) &= 1 + a|x|^{2-n} + O_{2+\alpha}(|x|^{1-n}), \\
Y_i(x) &= b_i|x|^{2-n} + O_{2+\alpha}(|x|^{1-n}), \\
g_{ij} &= u^{\frac{4}{n-2}}\delta_{ij}, \\
\pi_{ij} &= u^{\frac{2}{n-2}}\left[(L_Y\delta)_{ij} - (\operatorname{div}_\delta Y)\delta_{ij}\right]
\end{aligned}
\]
for some $\alpha \in (0, 1)$, where $L_Y$ is the Lie derivative. The $O_{2+\alpha}(|x|^{1-n})$
indicates that the function lies in the weighted Hölder space $C^{2,\alpha}_{1-n}$ as in
Definition A.22.
We will see a bit later why this is a useful definition.

Theorem 8.3 (Density theorem for DEC [EHLS16]). Let $(M^n, g, \pi)$ be
a complete asymptotically flat initial data set satisfying the dominant energy
condition $\mu \ge |J|_g$, and let $p > n$ and $\frac{n-2}{2} < q < n - 2$ be such that $q$ is less
than the decay rate of $(g, \pi)$ in Definition 7.17.
Then for any $\epsilon > 0$, there exists initial data $(\tilde g, \tilde\pi)$ on $M$ also satisfying
the dominant energy condition such that $(\tilde g, \tilde\pi)$ has harmonic asymptotics in
each end, $(\tilde g, \tilde\pi)$ is $\epsilon$-close to $(g, \pi)$ in $W^{2,p}_{-q} \times W^{1,p}_{-q-1}$, and their constraints
$(\tilde\mu, \tilde J)$ are $\epsilon$-close to $(\mu, J)$ in $L^1$.
Furthermore, we can choose $(\tilde g, \tilde\pi)$ such that the strict dominant energy
condition holds. That is, $\tilde\mu > |\tilde J|_{\tilde g}$.
Alternatively, we can choose $(\tilde g, \tilde\pi)$ to be vacuum outside a compact set.
That is, $\tilde\mu = |\tilde J| = 0$ outside a compact set.
The alternative conclusion is the more natural generalization of
Lemma 3.48, but the first conclusion (obtaining strict DEC) is the one we
will actually use. The proof is quite a bit more complicated than that of
Lemma 3.48 because the DEC is much harder to preserve than nonnegative
scalar curvature. See Section 9.3 for the proof of Theorem 8.3.
By the following lemma, the initial data $(\tilde g, \tilde\pi)$ constructed in the previous
theorem can also be chosen to have ADM energy-momentum that is
$\epsilon$-close to the original.
Lemma 8.4. Suppose that $(g_i, \pi_i)$ is a sequence of asymptotically flat initial
data converging to limit data $(g, \pi)$ on $M$ in $W^{2,p}_{-q} \times W^{1,p}_{-q-1}$, where $p > n$
and $q > \frac{n-2}{2}$, and assume that $(\mu_i, J_i)$ converges to $(\mu, J)$ in $L^1$. Then on
each end, the ADM energy-momentum of $(g_i, \pi_i)$ converges to that of $(g, \pi)$.
Exercise 8.5. Prove Lemma 8.4 by arguing as in the proof of Lemma 3.35.
Now let us begin our proof of the spacetime positive mass theorem (The-
orem 8.1) in earnest. We argue by induction on dimension and by contra-
diction. Let 3 ≤ n < 8, assume that we have established the desired result
up to dimension n − 1 (unless n = 3), and assume that (M n , g, π) is a
counterexample. Our goal is to show that there exists a counterexample
to the Riemannian positive mass theorem one dimension lower (which is a
special case of our induction hypothesis), unless n = 3, in which case we use
Gauss-Bonnet to obtain a contradiction.
By Theorem 8.3 and Lemma 8.4, we can assume, without loss of gener-
ality, that our counterexample (M n , g, π) has harmonic asymptotics, strict
DEC, and E < |P |. We intend to construct a complete MOTS Σ in M with
a certain desirable strong stability property to be described later. Here is
where we see the value of the harmonic asymptotics assumption.
Lemma 8.6. Let $(M^n, g, k)$ be an initial data set with harmonic asymptotics, and assume that $E < |P|$ in a particular end. We may rotate coordinates so that, without loss of generality, $P = (0, \ldots, -|P|)$. Then, for sufficiently large $\Lambda$, we have $\theta^+_{\{x^n = \Lambda\}} > 0$ and $\theta^+_{\{x^n = -\Lambda\}} < 0$, where the expansion is computed with respect to the upward-pointing unit normal in both cases.
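Proof. Let $u$ and $Y$ be the functions as in Definition 8.2 so that in the specified end, we have
\[
\begin{aligned}
u(x) &= 1 + a|x|^{2-n} + O_2(|x|^{1-n}),\\
Y_i(x) &= b_i|x|^{2-n} + O_2(|x|^{1-n}),\\
g_{ij} &= u^{\frac{4}{n-2}}\delta_{ij},\\
\pi_{ij} &= u^{\frac{2}{n-2}}\left[(L_Y\delta)_{ij} - (\operatorname{div}_\delta Y)\delta_{ij}\right].
\end{aligned}
\]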
It follows from Exercise 3.13 that $a = E/2$. We claim that $b_i = -\frac{n-1}{n-2}P_i$. To see this, recall the definition of $P_i$ (Definition 7.18) and compute
\[
\begin{aligned}
(n-1)\omega_{n-1}P_i &= \lim_{r\to\infty}\int_{|x|=r}\sum_{j=1}^n \pi_{ij}\nu^j\,d\mu\\
&= \lim_{r\to\infty}\int_{|x|=r}\sum_{j=1}^n u^{\frac{2}{n-2}}\left(Y_{i,j} + Y_{j,i} - (\operatorname{div}_\delta Y)\delta_{ij}\right)\nu^j\,d\mu\\
&= \lim_{r\to\infty}\int_{|x|=r}\sum_{j=1}^n (2-n)|x|^{1-n}\left[b_i\nu^j + b_j\nu^i - b_k\nu^k\delta_{ij} + O(|x|^{-1})\right]\nu^j\,d\mu\\
&= -(n-2)\,b_i\,\omega_{n-1},
\end{aligned}
\]
noting that the $u^{\frac{2}{n-2}}$ factor does not contribute to the limit. With our assumption on the direction $P$ points, we have $b_n = \frac{n-1}{n-2}|P|$ and $b_i = 0$ for $i < n$.
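The contraction behind the third and fourth lines can be spelled out as follows. This is only a sketch in the notation of the proof, writing $\nu^j = x^j/|x|$ and using the shorthand $b\cdot\nu = \sum_k b_k\nu^k$, which is ours and not the book's.

```latex
\begin{align*}
Y_{i,j} &= (2-n)\,b_i\,|x|^{1-n}\,\nu^j + O(|x|^{-n}),
\qquad
\operatorname{div}_\delta Y = (2-n)\,|x|^{1-n}\,(b\cdot\nu) + O(|x|^{-n}),\\
\sum_{j=1}^n \bigl(b_i\nu^j + b_j\nu^i - (b\cdot\nu)\,\delta_{ij}\bigr)\nu^j
 &= b_i|\nu|^2 + (b\cdot\nu)\,\nu^i - (b\cdot\nu)\,\nu^i = b_i,\\
\lim_{r\to\infty}\int_{|x|=r} (2-n)\,|x|^{1-n}\,b_i\,d\mu
 &= (2-n)\,b_i\,\omega_{n-1}.
\end{align*}
```

The last line uses that the Euclidean area of the sphere $\{|x| = r\}$ is $\omega_{n-1}r^{n-1}$. Dividing by $(n-1)\omega_{n-1}$ recovers $b_i = -\frac{n-1}{n-2}P_i$.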
Next, we claim that
\[
\theta^+_{\{x^n=\Lambda\}} = (n-1)(|P| - E)\,\Lambda\,|x|^{-n} + O(|x|^{-n}). \tag{8.1}
\]
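Since $g$ is conformal to Euclidean space outside a compact set, we can use Exercise 2.14 to compute
\[
\begin{aligned}
H_{\{x^n=\Lambda\}} &= \frac{2(n-1)}{n-2}\,u^{\frac{-2}{n-2}-1}\,\partial_n u\\
&= \frac{2(n-1)}{n-2}\,u^{\frac{-2}{n-2}-1}\left[\frac{2-n}{2}\,E\,|x|^{-n}x^n + O(|x|^{-n})\right]\\
&= -(n-1)E\,|x|^{-n}\Lambda + O(|x|^{-n}),
\end{aligned}
\]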
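where we use the asymptotic expansion of $u$. To compute $\operatorname{tr}_{\{x^n=\Lambda\}}(k)$, first note that
\[
k_{ij} = u^{\frac{2}{n-2}}\left[(L_Y\delta)_{ij} - \frac{1}{n-1}(\operatorname{div}_\delta Y)\delta_{ij}\right].
\]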
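For readers who want to see where the factor $\frac{1}{n-1}$ comes from, here is a short derivation. It assumes the convention $\pi = k - (\operatorname{tr}_g k)\,g$ relating $\pi$ and $k$, which is not restated in this section, so treat that relation as an assumption of the sketch.

```latex
\begin{align*}
\operatorname{tr}_g\pi &= (1-n)\operatorname{tr}_g k
 \quad\Longrightarrow\quad
 k_{ij} = \pi_{ij} - \frac{1}{n-1}(\operatorname{tr}_g\pi)\,g_{ij},\\
\operatorname{tr}_g\pi &= u^{-\frac{4}{n-2}}\,\delta^{ij}\pi_{ij}
 = u^{-\frac{2}{n-2}}\,(2-n)\operatorname{div}_\delta Y,\\
k_{ij} &= u^{\frac{2}{n-2}}\Bigl[(L_Y\delta)_{ij} - (\operatorname{div}_\delta Y)\delta_{ij}
 + \frac{n-2}{n-1}(\operatorname{div}_\delta Y)\delta_{ij}\Bigr]
 = u^{\frac{2}{n-2}}\Bigl[(L_Y\delta)_{ij} - \frac{1}{n-1}(\operatorname{div}_\delta Y)\delta_{ij}\Bigr].
\end{align*}
```

The second line uses $\delta^{ij}(L_Y\delta)_{ij} = 2\operatorname{div}_\delta Y$ and $\delta^{ij}\delta_{ij} = n$.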
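Using the fact that $Y_i = O_1(|x|^{1-n})$ for $i < n$, we have
\[
\begin{aligned}
\operatorname{tr}_{\{x^n=\Lambda\}}(k) &= \sum_{i,j=1}^{n-1} g^{ij}k_{ij}\\
&= u^{\frac{-2}{n-2}}\sum_{i,j=1}^{n-1}\delta^{ij}\left[Y_{i,j} + Y_{j,i} - \frac{1}{n-1}(\operatorname{div}_\delta Y)\delta_{ij}\right]\\
&= u^{\frac{-2}{n-2}}\sum_{i,j=1}^{n-1}\delta^{ij}\left[\frac{-1}{n-1}Y_{n,n}\delta_{ij} + O(|x|^{-n})\right]\\
&= -Y_{n,n} + O(|x|^{-n})\\
&= (n-1)|P|\,|x|^{-n}\Lambda + O(|x|^{-n}),
\end{aligned}
\]
where we used the asymptotic expansion of $Y$ in the last step. Hence we have (8.1), which shows that for large enough $\Lambda$ one has $\theta^+_{\{x^n=\Lambda\}} > 0$. The proof that $\theta^+_{\{x^n=-\Lambda\}} < 0$ is similar.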
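The final step can be made slightly more explicit. The sketch below assumes that $\theta^+ = H + \operatorname{tr}(k)$ on the plane and that the $O(|x|^{-n})$ error is bounded by $C|x|^{-n}$ with a constant $C$ independent of $\Lambda$; the name $C$ is introduced here only for illustration.

```latex
\begin{align*}
\theta^+_{\{x^n=\Lambda\}}
 &= H_{\{x^n=\Lambda\}} + \operatorname{tr}_{\{x^n=\Lambda\}}(k)\\
 &= -(n-1)E\,\Lambda\,|x|^{-n} + (n-1)|P|\,\Lambda\,|x|^{-n} + O(|x|^{-n})\\
 &\ge |x|^{-n}\bigl[(n-1)(|P|-E)\,\Lambda - C\bigr] > 0
 \quad\text{whenever}\quad \Lambda > \frac{C}{(n-1)(|P|-E)}.
\end{align*}
```

On the plane $x^n = -\Lambda$ the factor $x^n$ in $\partial_n u$ and $Y_{n,n}$ changes sign, so the leading term becomes $-(n-1)(|P|-E)\Lambda|x|^{-n}$ and the same bound gives $\theta^+ < 0$. The hypothesis $E < |P|$ is what makes the leading coefficient positive.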
Lemma 8.6 gives us barriers for MOTS. More specifically, this would be
enough to invoke Theorem 7.39 if these barriers were actually compact. In
order to create a region with a compact boundary, for each large $\rho$, let $C_\rho$ be the region of $M$ whose boundary is the cylinder $\partial C_\rho := \{x \mid (x^1)^2 + \cdots + (x^{n-1})^2 = \rho^2\}$ in the asymptotically flat end. Then for each $\Lambda > 0$, define $C_{\rho,2\Lambda}$ to be the part of $C_\rho$ lying between the two planes $x^n = \pm 2\Lambda$. For each $h \in [-\Lambda, \Lambda]$, we can consider the sphere $\Gamma^{n-2}_{\rho,h}$ which is where $\partial C_{\rho,2\Lambda}$ intersects the plane $x^n = h$. For each choice of $h$, we divide the boundary $\partial C_{\rho,2\Lambda}$ into two pieces, $\partial_1^h C_{\rho,2\Lambda}$ and $\partial_2^h C_{\rho,2\Lambda}$, meeting along $\Gamma_{\rho,h}$, where the first piece is the part of $\partial C_{\rho,2\Lambda}$ lying above $x^n = h$ and the second piece is the part lying below it. We claim that we can apply Theorem 7.39 to construct a MOTS $\Sigma_{\rho,h}$ with boundary $\Gamma^{n-2}_{\rho,h}$, where the region $C_{\rho,2\Lambda}$ plays the role of $\Omega$ while $\partial_i^h C_{\rho,2\Lambda}$ plays the role of $\partial_i\Omega$.
To see this, note that the previous lemma shows that the planar caps
xn = ±2Λ have the desired sign on θ+ , and it is easy to see (from asymptotic
flatness) that the lateral part of ∂ih Cρ,2Λ also has the desired sign on θ+ for
large ρ. (Remember that the normal changes direction when computing θ+
for ∂1 Ω versus ∂2 Ω.) Hence we have the following.
Lemma 8.7. Assume the hypotheses of Lemma 8.6, and let Λ be large
enough so that the lemma holds. Then for any large enough ρ and any
h ∈ [−Λ, Λ], there exists a λ-minimizing stable MOTS Σρ,h whose boundary
is Γρ,h and lies inside Cρ,2Λ . The constant λ is independent of ρ and h.
There is a technical issue here because ∂Cρ,2Λ has corners, but intu-
itively, this should not be a problem since smoothing the corners creates
large positive mean curvature that only serves to help with the desired bar-
rier inequalities. (This smoothing of the corners is why we use 2Λ instead
of Λ.) Also, we assumed above that there was only one end, but if there are
other ends, we can just cut off our region Cρ,2Λ by large coordinate spheres
in those other ends, and as seen earlier those large coordinate spheres can
just be considered as part of ∂2h Cρ,2Λ since they have θ+ < 0 (with respect
to the appropriate normal).
Since what we really want is a complete MOTS rather than a MOTS
with boundary, we will take a limit as ρ → ∞.
Lemma 8.8. Assume the hypotheses of Lemma 8.6. For any choice of $\rho_j \to \infty$ and $h_j \in [-\Lambda, \Lambda]$, there exists a subsequence of $\Sigma_{\rho_j,h_j}$ that smoothly converges on compact subsets of $M$ to a complete properly embedded MOTS $\Sigma_\infty$. Moreover, there exists $\alpha \in (0,1)$ and a constant $c \in [-2\Lambda, 2\Lambda]$ such that outside a large compact subset of $M$, $\Sigma_\infty$ can be written as the Euclidean graph $\{x^n = f(x')\}$ of some function $f(x') = c + O_{3+\alpha}(|x'|^{3-n})$ in the $(x^1, \ldots, x^{n-1}, x^n) = (x', x^n)$ coordinate system. In particular, expressed in $x'$ coordinates, the metric $h$ on $\Sigma_\infty$ satisfies $h_{ij} - \delta_{ij} \in C^{2,\alpha}_{2-n}(\Sigma_\infty)$.
Sketch of the proof. The proof uses some standard geometric measure
theory arguments, as well as some facts about the prescribed mean curvature
equation for graphs. Because of this, the full argument lies outside the scope
of the book, but we outline the basic ideas.
Essentially, the $\lambda$-minimizing property is used to extract a subsequence converging to a limit space $\Sigma_\infty$ in a weak sense, where $\Sigma_\infty$ is also $\lambda$-minimizing. Allard regularity [All72, Sim83] is used to show that the convergence is smooth on compact sets. Since each $\Sigma_{\rho_j,h_j}$ lies between the planes $x^n = \pm 2\Lambda$, so must the limit $\Sigma_\infty$. In fact, one can argue using Allard regularity that the $\Sigma_{\rho_j,h_j}$ are graphical over the $x'$ coordinates for large $|x'|$, and consequently the same is true for the limit $\Sigma_\infty$.
Since the convergence is smooth, $\Sigma_\infty$ is a MOTS, and thus $H_{\Sigma_\infty} = -\operatorname{tr}_{\Sigma_\infty}(k)$. Combining this with harmonic asymptotics and Exercise 2.14, it follows that the Euclidean mean curvature of $\Sigma_\infty$ is
\[
\bar{H}_{\Sigma_\infty} = O(|x|^{1-n}).
\]
We can use this together with the $\lambda$-minimizing property of $\Sigma_\infty$ and Allard regularity to show that outside some compact set, $\Sigma_\infty$ is the graph of some function $f(x')$ such that $|f(x')| \le 2\Lambda$ and $f(x') = O_{1+\gamma}(1)$ for some $\gamma \in (0,1)$. Using this initial estimate $f = O_{1+\gamma}(1)$, one can show that $f$ satisfies a prescribed mean curvature equation of the form $H[f] = O_\gamma(|x'|^{1-n-\gamma})$, where $H[f]$ represents the mean curvature of the graph of $f(x')$ as a function of $x'$. Asymptotic analysis of the mean curvature equation as in [Mey63, Sch83] shows that there exists a constant $c \in [-2\Lambda, 2\Lambda]$ such that $f(x') = c + O_{2+\gamma}(|x'|^{3-n})$. (Compare this to Corollary A.38.) Repeating the above argument with this stronger decay estimate for $f$ shows that $f(x') = c + O_{3+\alpha}(|x'|^{3-n})$, as asserted, where $\alpha$ is the Hölder exponent occurring in Definition 8.2.
The last statement of the lemma is straightforward to check using the fact that the metric $h$ on $\Sigma_\infty$ in $x'$ coordinates is given by
\[
h_{ij} = u^{\frac{4}{n-2}}(\delta_{ij} + f_if_j).
\]
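To illustrate how the decay rate $2-n$ arises, here is a rough order-of-magnitude check at the level of $C^0$ decay only (the Hölder bookkeeping is omitted). It uses that $|x|$ and $|x'|$ are comparable on $\Sigma_\infty$ because $x^n$ stays bounded there.

```latex
\begin{align*}
u^{\frac{4}{n-2}} &= 1 + O(|x'|^{2-n}),
\qquad
f_i = \partial_i\bigl(c + O(|x'|^{3-n})\bigr) = O(|x'|^{2-n}),
\qquad
f_if_j = O(|x'|^{4-2n}),\\
h_{ij} - \delta_{ij} &= \bigl(u^{\frac{4}{n-2}} - 1\bigr)\delta_{ij} + u^{\frac{4}{n-2}}f_if_j
 = O(|x'|^{2-n}) + O(|x'|^{4-2n}) = O(|x'|^{2-n}).
\end{align*}
```

Here $4 - 2n \le 2 - n$ because $n \ge 3$. Since $\Sigma_\infty$ has dimension $n-1$, the rate $n-2$ is faster than the borderline rate $n-3$ for the ADM energy, which is consistent with the later claim that $h$ has ADM energy zero.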
The limit space Σ∞ is not just a MOTS, but it also enjoys a stability
property.
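Lemma 8.9. Assume the hypotheses of Lemma 8.6. Let $\Sigma_\infty$ be the complete MOTS described above. For any smooth $v \in W^{1,2}_{\frac{3-n}{2}}(\Sigma_\infty)$, we have
\[
\int_{\Sigma_\infty}\left(|\nabla v|^2 + Q_{\Sigma_\infty}v^2\right)d\mu_{\Sigma_\infty} \ge 0.
\]
In other words, $\Sigma_\infty$ inherits the symmetrized stability property.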
Proof. Using a standard approximation argument, it suffices to show that the inequality holds for any smooth compactly supported function $v \in C_c^\infty(\Sigma_\infty)$. Consider the vector field $v\nu_\infty$ defined along $\Sigma_\infty$, where $\nu_\infty$ is the upward unit normal, and extend it to some compactly supported vector field $X$ on $M$. For each $j$, let $v_j = \langle X, \nu_j\rangle$, where $\nu_j$ is the upward unit normal of $\Sigma_j$. For large enough $j$, $v_j$ will be compactly supported in $\Sigma_j$. The smooth convergence on compact sets implies that
\[
\int_{\Sigma_\infty}\left(|\nabla v|^2 + Q_{\Sigma_\infty}v^2\right)d\mu_{\Sigma_\infty}
= \lim_{j\to\infty}\int_{\Sigma_j}\left(|\nabla v_j|^2 + Q_{\Sigma_j}v_j^2\right)d\mu_{\Sigma_j} \ge 0,
\]
by stability of the MOTS $\Sigma_j$ together with Proposition 7.37.
8.1.2. Proof of the n = 3 case. The freedom to choose the heights hj

turns out to be important for n > 3, but the n = 3 case is significantly easier
and one can simply choose hj = 0 for all j. Let ρj → ∞, and let Σj := Σρj ,0
be the stable MOTS whose existence is guaranteed by Lemma 8.7, and then
pass to a subsequence that converges to Σ∞ as described by Lemma 8.8.
Technically, we do not know that Σ∞ is connected, but if it is not,
then we can throw away the compact components and only consider the one
noncompact component of Σ∞ , which we know is asymptotic to a plane.
Since Lemma 8.9 will still hold on this component, let us assume without
loss of generality that Σ∞ is connected. As we saw earlier, the symmetrized
stability property for a compact surface Σ allows us to find a conformal factor
giving Σ positive scalar curvature. We would like to do something similar
here for Σ∞ , but in order to do that, we need the symmetrized stability to
hold for functions v that are asymptotic to 1, whereas Lemma 8.9 only gives
the desired inequality for functions decaying to zero.
However, when n = 3, the surface Σ∞ is two-dimensional, and the gap
can be filled by a “logarithmic cut-off trick,” which was employed in Schoen
and Yau’s original paper on the positive mass theorem [SY79c].
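For each large radius $\sigma$, consider the compactly supported function $\phi_\sigma$ on $\Sigma_\infty$ defined by
\[
\phi_\sigma =
\begin{cases}
1 & \text{for } |x'| < \sigma,\\[2pt]
2 - \dfrac{\log|x'|}{\log\sigma} & \text{for } \sigma < |x'| < \sigma^2,\\[6pt]
0 & \text{for } |x'| > \sigma^2.
\end{cases}
\]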
Exercise 8.10. Assuming $n = 3$, prove that if $\Sigma_\infty$ is the surface described above (on which Lemma 8.9 applies), then
\[
\int_{\Sigma_\infty} Q_{\Sigma_\infty}\,d\mu_{\Sigma_\infty} \ge 0,
\]
by using $\phi_\sigma$ in place of $v$ and taking the limit as $\sigma \to \infty$.
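As a hint for why the logarithmic cutoff works only in two dimensions, one can compute the Dirichlet energy of $\phi_\sigma$ in the flat model. The sketch below assumes the Euclidean metric on $\mathbb{R}^2$ in place of $h$; for the actual asymptotically flat metric one would expect the same bound up to a factor tending to 1, but that step is left to the exercise.

```latex
\begin{align*}
|\nabla\phi_\sigma| &= \frac{1}{|x'|\log\sigma}
 \quad\text{on } \sigma < |x'| < \sigma^2,
 \qquad \nabla\phi_\sigma = 0 \text{ elsewhere},\\
\int_{\mathbb{R}^2}|\nabla\phi_\sigma|^2\,dx'
 &= \int_\sigma^{\sigma^2}\frac{2\pi r}{r^2(\log\sigma)^2}\,dr
 = \frac{2\pi\,(\log\sigma^2 - \log\sigma)}{(\log\sigma)^2}
 = \frac{2\pi}{\log\sigma}\;\longrightarrow\;0
 \quad\text{as } \sigma\to\infty.
\end{align*}
```

So the gradient term in the stability inequality disappears in the limit while $\phi_\sigma \to 1$ pointwise, leaving only the integral of $Q_{\Sigma_\infty}$.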
Together with the exercise above, the strict dominant energy condition then implies that
\[
\int_{\Sigma_\infty} K_{\Sigma_\infty}\,d\mu_{\Sigma_\infty} > 0, \tag{8.2}
\]
where $K_{\Sigma_\infty}$ denotes the Gauss curvature. However, one can see from the Gauss-Bonnet Theorem that this is inconsistent with the fact that the surface $\Sigma_\infty$ is asymptotically planar. More precisely, recalling from Lemma 8.8 that the induced metric $h$ satisfies $h_{ij}(x') = \delta_{ij} + O_1(|x'|^{-1})$, we can use the same argument that was used to prove Theorem 3.30 to obtain the desired contradiction. Therefore the spacetime positive mass theorem holds in dimension 3.
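The two steps above can be sketched as follows. This is our reading of the argument, not a quotation of Theorem 3.30: it uses the expression for $Q_{\Sigma_\infty}$ that appears later in this section (in the proof of Lemma 8.11), the identity $R_{\Sigma_\infty} = 2K_{\Sigma_\infty}$ for surfaces, and the notation $D_r = \Sigma_\infty \cap \{|x'| \le r\}$ and $\kappa_g$ (geodesic curvature), which are introduced here for illustration.

```latex
\begin{align*}
K_{\Sigma_\infty} &= Q_{\Sigma_\infty} + \mu + J(\nu_{\Sigma_\infty})
 + \tfrac12\,|k_{\Sigma_\infty} + A_{\Sigma_\infty}|^2
 \;\ge\; Q_{\Sigma_\infty} + \bigl(\mu - |J|_g\bigr) \;>\; Q_{\Sigma_\infty},\\
\int_{D_r} K_{\Sigma_\infty}\,d\mu
 &= 2\pi\,\chi(D_r) - \int_{\partial D_r}\kappa_g\,ds
 \;\longrightarrow\; 2\pi\bigl(\chi(D_r) - 1\bigr) \;\le\; 0
 \quad\text{as } r\to\infty.
\end{align*}
```

The first line gives (8.2) from Exercise 8.10. In the second line, $\int_{\partial D_r}\kappa_g\,ds \to 2\pi$ because $h_{ij} = \delta_{ij} + O_1(|x'|^{-1})$, and $\chi(D_r) \le 1$ because $D_r$ is a compact connected surface with one boundary circle for large $r$. The two lines are incompatible.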
8.1.3. Proof of the 3 < n < 8 case. When 3 < n < 8, we will construct a
complete (n − 1)-dimensional MOTS which can be conformally deformed to
provide a counterexample to the Riemannian positive mass theorem in n − 1
dimensions (Theorem 3.18), giving a contradiction. However, it is worth not-
ing that in this chapter we are also simultaneously reproving Theorem 3.18
as a special case of Theorem 8.1 via an induction argument.
As in the $n = 3$ case, we assume without loss of generality that $\Sigma_\infty$ is connected. Let $h$ be the induced metric on $\Sigma_\infty$. We seek a smooth positive function $w$ such that $w$ approaches 1 at infinity and $w^{\frac{4}{n-3}}h$ is scalar-flat. By equation (1.6), this means that we need to solve
\[
L_h w := -\frac{4(n-2)}{n-3}\Delta_{\Sigma_\infty}w + R_{\Sigma_\infty}w = 0,
\]
with boundary condition $w$ approaching 1 at infinity, where $L_h$ is the conformal Laplacian for the induced metric $h$ on $\Sigma_\infty$. Setting $v = w - 1$, this is equivalent to solving for $v$ in
\[
L_h v = -R_{\Sigma_\infty},
\]
with boundary condition $v$ approaching 0 at infinity. By decay of $R_{\Sigma_\infty}$, we will be able to solve this as long as
\[
L_h : W^{2,p}_{-q}(\Sigma_\infty) \longrightarrow L^p_{-q-2}(\Sigma_\infty)
\]
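is an isomorphism for some $p > n$, $\frac{n-3}{2} < q < n-3$. We will use the stability from Lemma 8.9 to prove injectivity. Suppose that $\phi \in W^{2,p}_{-q}$ solves $L_h\phi = 0$. By elliptic regularity, $\phi$ is smooth, and by weighted Sobolev embedding, $\phi \in C^1_{-q}$. We argue as in the proof of Proposition 2.25 that
\[
\begin{aligned}
0 &\le \int_{\Sigma_\infty}\left(|\nabla\phi|^2 + Q_{\Sigma_\infty}\phi^2\right)d\mu_{\Sigma_\infty}\\
&\le \int_{\Sigma_\infty}\left(|\nabla\phi|^2 + \frac12 R_{\Sigma_\infty}\phi^2\right)d\mu_{\Sigma_\infty}\\
&\le \frac12\int_{\Sigma_\infty}\left(\frac{4(n-2)}{n-3}|\nabla\phi|^2 + R_{\Sigma_\infty}\phi^2\right)d\mu_{\Sigma_\infty}\\
&= \frac12\int_{\Sigma_\infty}\phi\,L_h\phi\,d\mu_{\Sigma_\infty}\\
&= 0,
\end{aligned}
\]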
where the second inequality follows from the dominant energy condition, the third follows from the fact that $2 < \frac{4(n-2)}{n-3}$ for $n > 3$, and the final equality follows from integration by parts and the decay of $\phi$. Therefore the third inequality must be an equality, which implies that $\phi$ is constant and therefore identically zero. So $L_h : W^{2,p}_{-q}(\Sigma_\infty) \longrightarrow L^p_{-q-2}(\Sigma_\infty)$ is injective, and it has index zero by Corollary A.42; therefore it is an isomorphism.
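The elementary inequality and the rigidity step can be written out explicitly. The following is a sketch of the arithmetic only, in the notation of the chain above.

```latex
\begin{align*}
\frac{4(n-2)}{n-3} - 2 &= \frac{4(n-2) - 2(n-3)}{n-3} = \frac{2(n-1)}{n-3} > 0
 \quad\text{for } n > 3,\\
0 &= \frac12\int_{\Sigma_\infty}\Bigl(\frac{4(n-2)}{n-3}|\nabla\phi|^2 + R_{\Sigma_\infty}\phi^2\Bigr)d\mu
 - \int_{\Sigma_\infty}\Bigl(|\nabla\phi|^2 + \frac12 R_{\Sigma_\infty}\phi^2\Bigr)d\mu
 = \frac{n-1}{n-3}\int_{\Sigma_\infty}|\nabla\phi|^2\,d\mu.
\end{align*}
```

The second line holds because every inequality in the chain is forced to be an equality. Hence $\nabla\phi \equiv 0$, so $\phi$ is constant on the connected manifold $\Sigma_\infty$, and the decay $\phi \in C^1_{-q}$ forces the constant to be zero.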
Hence we have the desired $v$ and $w = v + 1$, which is smooth by elliptic regularity (Theorem A.4). By Lemma 8.8, the metric $h$ is asymptotically flat with ADM energy zero. By Corollary A.38 and Exercise 3.13, the conformal scalar-flat metric $\tilde h := w^{\frac{4}{n-3}}h$ on $\Sigma_\infty$ is also asymptotically flat.²
In order to obtain a contradiction to the Riemannian positive mass theorem in dimension $n-1$, all we need now is to show that $\tilde h$ has negative ADM energy. We use Exercise 3.12 to compute the change in mass as follows:
\[
\begin{aligned}
E(\tilde h) &= E(h) - \frac{2}{(n-3)\omega_{n-2}}\lim_{r\to\infty}\int_{|x'|=r} w\nabla_\nu w\,d\mu_{|x'|=r}\\
&= 0 - \frac{2}{(n-3)\omega_{n-2}}\lim_{r\to\infty}\int_{\partial(\Sigma_\infty\cap C_r)} w\nabla_\eta w\,d\mu_{\partial(\Sigma_\infty\cap C_r)}\\
&= -\frac{2}{(n-3)\omega_{n-2}}\int_{\Sigma_\infty}\left(|\nabla w|^2 + w\Delta_{\Sigma_\infty}w\right)d\mu_{\Sigma_\infty}\\
&= \frac{-2}{(n-3)\omega_{n-2}}\int_{\Sigma_\infty}\left(|\nabla w|^2 + \frac{n-3}{4(n-2)}R_{\Sigma_\infty}w^2\right)d\mu_{\Sigma_\infty},
\end{aligned}
\]
where the first line involves an integral over a coordinate sphere in $\mathbb{R}^{n-1}$ and $\nu$ is the Euclidean outward normal, while $\eta$ is the outward normal of $\Sigma_\infty \cap C_r$ in $\Sigma_\infty$. The second equality follows from the known asymptotic decay.

² Technically, we must check that $w$ is positive. Although the maximum principle stated in Theorem A.2 does not apply to $L_h$ since we do not have a sign on $R_{\Sigma_\infty}$, there is a version of the maximum principle that applies as long as there exists a positive subsolution, and we can construct such a subsolution by taking a limit of the principal eigenfunctions of the conformal Laplacians on $\Sigma_{\rho_j,h_j}$.
Therefore, in order to obtain the desired contradiction, we need the stability inequality in Lemma 8.9 to hold not just for compactly supported functions, or even ones in $W^{1,2}_{\frac{3-n}{2}}$, but also for the functions (such as $w$) obtained by adding 1 to those functions. When $n = 3$, this follows from the logarithmic cutoff trick, but when $n > 3$, this analysis does not work, and in fact we need an extra geometric idea. Here is where the luxury of choosing the height $h_j$ in the construction of $\Sigma_j := \Sigma_{\rho_j,h_j}$ is useful.
Lemma 8.11. There exist choices of $h_j$ in Lemma 8.8 such that the resulting limit hypersurface $\Sigma_\infty$ has the property that for any function $w$ on $\Sigma_\infty$ such that $w - 1 \in W^{1,2}_{\frac{3-n}{2}}(\Sigma_\infty)$, we have
\[
\int_{\Sigma_\infty}\left(|\nabla w|^2 + Q_{\Sigma_\infty}w^2\right)d\mu_{\Sigma_\infty} \ge 0. \tag{8.3}
\]
As described above, this lemma allows us to finish off the proof of the spacetime positive mass theorem (Theorem 8.1), so from now on we will focus on proving Lemma 8.11.
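For completeness, here is one way to see how (8.3) yields $E(\tilde h) < 0$. This is a hedged sketch: it uses the expression $Q_{\Sigma_\infty} = \frac12 R_{\Sigma_\infty} - \mu - J(\nu_{\Sigma_\infty}) - \frac12|k_{\Sigma_\infty} + A_{\Sigma_\infty}|^2$ quoted later in this section, assumes all integrals converge, and assumes that the conformal factor $w$ is admissible in (8.3). The abbreviation $c_n$ is ours.

```latex
\begin{align*}
c_n &:= \frac{n-3}{4(n-2)}, \qquad 1 - 2c_n = \frac{n-1}{2(n-2)} > 0,\\
\int_{\Sigma_\infty}\bigl(|\nabla w|^2 + c_nR_{\Sigma_\infty}w^2\bigr)d\mu
 &= (1-2c_n)\int_{\Sigma_\infty}|\nabla w|^2\,d\mu
 + 2c_n\int_{\Sigma_\infty}\bigl(|\nabla w|^2 + Q_{\Sigma_\infty}w^2\bigr)d\mu\\
 &\quad + c_n\int_{\Sigma_\infty}\bigl(2\mu + 2J(\nu_{\Sigma_\infty})
 + |k_{\Sigma_\infty} + A_{\Sigma_\infty}|^2\bigr)w^2\,d\mu .
\end{align*}
```

The first term is nonnegative, the second is nonnegative by (8.3), and the third is strictly positive by the strict dominant energy condition $\mu > |J|_g$ together with $w > 0$. Inserting this into the formula for $E(\tilde h)$ above gives $E(\tilde h) < 0$.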
8.1.4. The functional F . Recall that Lemma 8.9 ultimately came from
variations of the Σj fixing the boundary. The stability we desire corresponds
to moving the boundary in the vertical direction. Thus Lemma 8.11 can be
thought of as a statement of “vertical stability.”
First we will briefly describe how this was done by Schoen and Yau
in the time-symmetric case. Define Cρ , Cρ,2Λ , and Γρ,h as we did earlier
in Section 8.1.1. Using geometric measure theory (specifically, a version of
Theorem 2.22 with prescribed boundary), given large ρ and any h ∈ [−Λ, Λ],
one can construct a minimal hypersurface Σρ,h with boundary Γρ,h with the
property that it minimizes volume compared to every other hypersurface
with the same boundary. (For the general spacetime positive mass theorem,
we instead used Theorem 7.39 to construct the desired stable MOTS.) How-
ever, in order to achieve the “vertical stability” for each fixed ρ, Schoen and
Yau also minimized volume over all h ∈ [−Λ, Λ], and the barriers guaran-
tee that the minimizing h lies in the interior (−Λ, Λ). One can then show
that Σρ,hρ has the desired vertical stability property if hρ is the minimiz-
ing height. Since there is no direct analog of volume minimization in the
MOTS setting, this is an essential difficulty in generalizing the Schoen-Yau
approach.
In order to motivate the discussion, let us take a closer look at what
happens in the time-symmetric case. Suppose for a moment that the family $\{\Sigma_{\rho,h}\}_{|h|\le\Lambda}$ is actually a smooth foliation of minimal hypersurfaces with a first-order deformation vector field $X$ that is equal to $\partial_n$ at $\partial C_\rho$. As before, decompose $X = \phi\nu + \hat X$, where $\nu$ is the upward unit normal to $\Sigma_{\rho,h}$ and $\hat X$ is the tangential component. By the first variation formula including boundary term (Proposition 2.10), we know that
\[
\frac{d}{dh}|\Sigma_{\rho,h}| = \int_{\partial\Sigma_{\rho,h}}\langle\partial_n, \eta\rangle\,d\mu_{\partial\Sigma_{\rho,h}}, \tag{8.4}
\]
where $\eta$ is the outward-pointing normal to $\partial\Sigma_{\rho,h}$ tangent to $\Sigma_{\rho,h}$. At the minimizer $h_\rho$, the derivative (with respect to $h$) of the left side of the above
equation must be nonnegative. This suggests that in the general case (in
which we deal with MOTS rather than minimal hypersurfaces), we should
concentrate our attention on where the right side integral has nonnegative
derivative in h, since the left side is not going to be directly useful.
Definition 8.12. Let $\Sigma$ be a compact hypersurface in $M$ whose boundary lies on some coordinate cylinder $\partial C_\rho$. Then we define
\[
F(\Sigma) = \int_{\partial\Sigma}\langle\partial_n, \eta\rangle\,d\mu_{\partial\Sigma}, \tag{8.5}
\]
where $\eta$ is the outward unit normal of $\partial\Sigma$ tangent to $\Sigma$. Note that, using harmonic asymptotics, one can easily see that
\[
F(\Sigma) = \int_{\partial\Sigma} u^{\frac{2(n-1)}{n-2}}\,\bar\eta^n\,d\bar\mu_{\partial\Sigma}, \tag{8.6}
\]
where $\bar\eta^n$ is the $n$th component of the unit normal $\bar\eta$ computed using the Euclidean metric, and $d\bar\mu_{\partial\Sigma}$ denotes the volume measure of the metric induced by the Euclidean metric.
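The exponent in (8.6) can be checked by tracking conformal factors. The sketch below assumes that $\partial\Sigma$ lies in the region where $g_{ij} = u^{\frac{4}{n-2}}\delta_{ij}$, so that the $g$-unit conormal is $\eta = u^{-\frac{2}{n-2}}\bar\eta$.

```latex
\begin{align*}
\langle\partial_n, \eta\rangle_g
 &= u^{\frac{4}{n-2}}\,\delta\bigl(\partial_n,\; u^{-\frac{2}{n-2}}\bar\eta\bigr)
 = u^{\frac{2}{n-2}}\,\bar\eta^n,\\
d\mu_{\partial\Sigma}
 &= \bigl(u^{\frac{2}{n-2}}\bigr)^{n-2}\,d\bar\mu_{\partial\Sigma}
 = u^{2}\,d\bar\mu_{\partial\Sigma}
 \qquad(\dim\partial\Sigma = n-2),\\
\langle\partial_n, \eta\rangle_g\,d\mu_{\partial\Sigma}
 &= u^{\frac{2}{n-2}+2}\,\bar\eta^n\,d\bar\mu_{\partial\Sigma}
 = u^{\frac{2(n-1)}{n-2}}\,\bar\eta^n\,d\bar\mu_{\partial\Sigma}.
\end{align*}
```

In particular, since $u > 0$, the sign of $F(\Sigma)$ is controlled by the sign of $\bar\eta^n$ along $\partial\Sigma$, which is how (8.6) is used in Lemma 8.13.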
The barrier planes {xn = ±Λ} give us a sign on F (Σρ,±Λ ).

Lemma 8.13. Assume the hypotheses of Lemma 8.7, and let $\Lambda$, $\rho$, and $\Sigma_{\rho,h}$ be as described in that lemma. Then
\[
F(\Sigma_{\rho,-\Lambda}) < 0 < F(\Sigma_{\rho,\Lambda}).
\]
Proof. Recall from Lemma 8.6 that θ+ > 0 on the plane xn = Λ and all
planes above it. Then by the strong maximum principle for θ+ (Proposi-
tion 7.38), we know that Σρ,Λ lies below the plane {xn = Λ} in Cρ and
that they cannot meet tangentially at their common boundary Γρ,Λ . Hence
the Euclidean outward unit normal of ∂Σρ,Λ in Σρ,Λ satisfies η̄ n > 0. (See
Figure 8.1.) The inequality F (Σρ,Λ ) > 0 then follows from equation (8.6).
The proof that F (Σρ,−Λ ) < 0 is analogous.
By examining the proof of Theorem 7.39, one can show that the $\Sigma_{\rho,h}$ are ordered in $h$ in the sense that for $h_1 < h_2$, $\Sigma_{\rho,h_1}$ lies below $\Sigma_{\rho,h_2}$. More precisely we mean that if we regard the $\Sigma_{\rho,h}$'s as relative boundaries in $C_\rho$, then the regions they bound are nested. (See Figure 8.1.)

[Figure 8.1 (image not reproduced): the cylinder $\partial C_\rho$ between the planes $x^n = \Lambda$ and $x^n = -\Lambda$, showing $\Sigma_{\rho,\Lambda}$, $\Sigma_{\rho,-\Lambda}$, and two surfaces labeled $\Sigma_{\rho,h_0}$ sharing the boundary $\Gamma_{\rho,h_0}$.]
Figure 8.1. For any $h \in [-\Lambda, \Lambda]$, we can find at least one stable MOTS $\Sigma_{\rho,h}$ with prescribed boundary sphere $\Gamma_{\rho,h}$. The cylinder $C_\rho$ and the planes $x^n = \pm\Lambda$ serve as barriers. Uniqueness fails at jump heights, illustrated in the picture above by the height $h_0$. When $n > 3$, we must carefully choose a (nonjump) height $h_\rho$ in order to obtain the desired “vertical stability” property for $\Sigma_{\rho,h_\rho}$.
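Now let $h_0 \in (-\Lambda, \Lambda]$. One can show, using convergence arguments as in the proof of Lemma 8.8, that the upper envelope of $\{\Sigma_{\rho,h}\}_{h<h_0}$ is itself a MOTS $\underline{\Sigma}_{\rho,h_0}$ with boundary $\Gamma_{\rho,h_0}$. Moreover, $\lim_{h\to h_0^-}\Sigma_{\rho,h} = \underline{\Sigma}_{\rho,h_0}$ in the smooth sense. By convention we define $\underline{\Sigma}_{\rho,-\Lambda} := \Sigma_{\rho,-\Lambda}$. We similarly define $\overline{\Sigma}_{\rho,h_0}$ as the lower envelope of $\{\Sigma_{\rho,h}\}_{h>h_0}$ for $h_0 \in [-\Lambda, \Lambda)$ and $\overline{\Sigma}_{\rho,\Lambda} := \Sigma_{\rho,\Lambda}$ and note that analogous statements hold for these hypersurfaces.
We say that $h_0$ is a jump height if $\underline{\Sigma}_{\rho,h_0}$ does not equal $\overline{\Sigma}_{\rho,h_0}$. (See Figure 8.1.)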
Lemma 8.14. Assume the hypotheses of Lemma 8.7, and let $\Lambda$, $\rho$, and $\Sigma_{\rho,h}$ be as described in that lemma. The function $h \mapsto F(\Sigma_{\rho,h})$ is continuous at every $h_0 \in [-\Lambda, \Lambda]$ that is not a jump height. If $h_0 \in [-\Lambda, \Lambda]$ is a jump height, then
\[
\lim_{h\to h_0^-}F(\Sigma_{\rho,h}) \ge F(\Sigma_{\rho,h_0}) \ge \lim_{h\to h_0^+}F(\Sigma_{\rho,h}),
\]
where both limits exist, and at least one of the inequalities above is strict. In other words, there must be a downward jump discontinuity at every jump height.
Proof. Let $h_0 \in [-\Lambda, \Lambda]$. Since the lower and upper envelopes are smooth limits, $\lim_{h\to h_0^-}F(\Sigma_{\rho,h}) = F(\underline{\Sigma}_{\rho,h_0})$ and $\lim_{h\to h_0^+}F(\Sigma_{\rho,h}) = F(\overline{\Sigma}_{\rho,h_0})$. By definition, if $h_0$ is not a jump height, then both of these limits must equal $F(\Sigma_{\rho,h_0})$, so that continuity holds at all nonjump heights.
Let $h_0$ be a jump height. Since the family $\{\Sigma_{\rho,h}\}_{|h|\le\Lambda}$ is ordered, it is clear that $\underline{\Sigma}_{\rho,h_0}$ lies below $\Sigma_{\rho,h_0}$, which lies below $\overline{\Sigma}_{\rho,h_0}$ (where “below” does not necessarily mean strictly). And since they all share the common boundary $\Gamma_{\rho,h_0}$, we have an ordering of their corresponding quantities $\langle\partial_n, \eta\rangle$ along $\Gamma_{\rho,h_0}$. From the definition of $F$, the desired inequalities follow. Moreover, since $h_0$ is a jump height, $\underline{\Sigma}_{\rho,h_0} \ne \overline{\Sigma}_{\rho,h_0}$, so the strong maximum principle (Proposition 7.38) implies that at least one of the two inequalities is strict.
For better readability, we define a smooth vector field $Z$ on $M$ which is identically equal to $\partial_n$ outside a compact set, so that
\[
F(\Sigma) = \int_{\partial\Sigma}\langle Z, \eta\rangle\,d\mu_{\partial\Sigma}.
\]
Consider the decomposition
\[
Z = \varphi\nu + \hat Z,
\]
into normal and tangential components along Σ. We now compute the first
variation of F . In view of Lemmas 8.13 and 8.14 we may hope to find
hρ ∈ (−Λ, Λ) such that the derivative of h → F (Σρ,h ) at hρ (defined in a
suitably weak sense) is nonnegative.
Proposition 8.15. Let Σ be a compact hypersurface in (M, g) whose bound-
ary lies on some ∂Cρ , and let ν denote the upward unit normal of Σ. Let
X = ϕν + X̂ be a smooth vector field along Σ that is tangent to ∂Cρ at ∂Σ.
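Then the linearization of $F$ is
\[
DF|_\Sigma(X) = \int_{\partial\Sigma}\langle\varphi\nabla\phi + G(X), \eta\rangle\,d\mu_{\partial\Sigma}, \tag{8.7}
\]
where
\[
G(X) = \nabla_XZ - \nabla_{\hat Z}\hat X + (\phi H + \operatorname{div}_\Sigma\hat X)\hat Z - \varphi S(\hat X) - \phi S(\hat Z), \tag{8.8}
\]
and $S$ and $H$ are the shape operator and mean curvature of $\Sigma$ as a hypersurface in $(M, g)$ (and $Z = \varphi\nu + \hat Z = \partial_n$ was defined above).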
Proof. The main thing to observe is that the boundary integral in this
formula is similar to the one obtained in the second variation formula with
boundary (Theorem 2.19). This is because F is similar to the second term
appearing in the first variation formula with boundary (Proposition 2.10).
In fact, one could potentially use Theorem 2.19 as a starting point, but we
choose to argue from scratch instead.
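Let $e_1, \ldots, e_{n-2}$ be a local orthonormal frame for $T(\partial\Sigma)$. We can differentiate $Z$, $\eta$, and the induced measure on $\partial\Sigma$ to obtain
\[
\begin{aligned}
DF|_\Sigma(X) &= \int_{\partial\Sigma}\Bigl[\langle\nabla_XZ, \eta\rangle + \Bigl\langle Z,\ \langle\nabla_\eta X, \nu\rangle\nu - \sum_{i=1}^{n-2}\langle\nabla_{e_i}X, \eta\rangle e_i\Bigr\rangle + \langle Z, \eta\rangle\operatorname{div}_{\partial\Sigma}X\Bigr]d\mu_{\partial\Sigma}\\
&= \int_{\partial\Sigma}\Bigl[\langle\nabla_XZ, \eta\rangle + \langle Z, \nu\rangle\langle\nabla_\eta X, \nu\rangle - \sum_{i=1}^{n-2}\langle Z, e_i\rangle\langle\nabla_{e_i}X, \eta\rangle + \langle Z, \eta\rangle\operatorname{div}_{\partial\Sigma}X\Bigr]d\mu_{\partial\Sigma}.
\end{aligned}
\tag{8.9}
\]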
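The derivative of $\eta$ above was computed using the fact that $e_1, \ldots, e_{n-2}, \eta, \nu$ is an orthonormal basis and differentiating the orthogonality relations. The second term in the integrand of (8.9) is
\[
\begin{aligned}
\langle Z, \nu\rangle\langle\nabla_\eta X, \nu\rangle &= \varphi\,\langle\nabla_\eta(\phi\nu + \hat X), \nu\rangle\\
&= \varphi\left(\nabla_\eta\phi + \langle\nabla_\eta\hat X, \nu\rangle\right)\\
&= \langle\varphi\nabla\phi, \eta\rangle - \langle\varphi S(\hat X), \eta\rangle.
\end{aligned}
\tag{8.10}
\]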
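Combining the last two terms in the integrand of (8.9) gives
\[
\begin{aligned}
\langle Z, \eta\rangle\operatorname{div}_{\partial\Sigma}X &- \sum_{i=1}^{n-2}\langle Z, e_i\rangle\langle\nabla_{e_i}X, \eta\rangle\\
&= \langle Z, \eta\rangle\left(\operatorname{div}_\Sigma X - \langle\nabla_\eta X, \eta\rangle\right) - \sum_{i=1}^{n-2}\langle Z, e_i\rangle\langle\nabla_{e_i}X, \eta\rangle\\
&= \langle\hat Z, \eta\rangle\operatorname{div}_\Sigma X - \langle\hat Z, \eta\rangle\langle\nabla_\eta X, \eta\rangle - \sum_{i=1}^{n-2}\langle\hat Z, e_i\rangle\langle\nabla_{e_i}X, \eta\rangle\\
&= \langle(\phi H + \operatorname{div}_\Sigma\hat X)\hat Z, \eta\rangle - \langle\nabla_{\hat Z}X, \eta\rangle\\
&= \langle(\phi H + \operatorname{div}_\Sigma\hat X)\hat Z, \eta\rangle - \langle\nabla_{\hat Z}(\phi\nu + \hat X), \eta\rangle\\
&= \langle(\phi H + \operatorname{div}_\Sigma\hat X)\hat Z, \eta\rangle - \langle\phi\nabla_{\hat Z}\nu + \nabla_{\hat Z}\hat X, \eta\rangle\\
&= \langle(\phi H + \operatorname{div}_\Sigma\hat X)\hat Z - \phi S(\hat Z) - \nabla_{\hat Z}\hat X, \eta\rangle.
\end{aligned}
\tag{8.11}
\]
The result follows from combining the first term in the integrand of (8.9) with the computations (8.10) and (8.11).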
8.1.5. Height picking and stability. The following lemma finds a height
hρ which corresponds to the volume-minimizing height in the time-symmetric
case, as motivated by equation (8.4).
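Lemma 8.16. Assume the hypotheses of Lemma 8.7, and let $\Lambda$, $\rho$, and $\Sigma_{\rho,h}$ be as described in that lemma. Define $\Sigma'_{\rho,h}$ to be the component of $\Sigma_{\rho,h}$ containing its boundary $\Gamma_{\rho,h}$. There exists $h_\rho \in (-\Lambda, \Lambda)$ and a smooth vector field $X$ along $\Sigma'_{\rho,h_\rho}$ that is equal to $Z = \partial_n$ at $\partial\Sigma'_{\rho,h_\rho} = \Gamma_{\rho,h_\rho}$, such that $\phi = \langle X, \nu\rangle > 0$,
\[
D\theta^+|_{\Sigma'_{\rho,h_\rho}}(X) = 0, \tag{8.12}
\]
and
\[
DF|_{\Sigma'_{\rho,h_\rho}}(X) \ge 0. \tag{8.13}
\]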
Sketch of the proof. The actual proof of this lemma is somewhat com-
plicated, but in light of Lemmas 8.13 and 8.14, it is fairly intuitive and
expected. Viewed as a function of h, F (Σρ,h ) must go from negative (at
h = −Λ) to positive (at h = Λ), and its only discontinuities are jump dis-
continuities that jump down. Therefore there certainly ought to exist a value
hρ at which F (Σρ,h ) is increasing in h.
More precisely, we claim that $h_\rho = \inf\{h \in [-\Lambda, \Lambda] \mid F(\Sigma_{\rho,h}) > 0\}$ will give us the desired height. If this family $\Sigma_{\rho,h}$ were differentiable in $h$ at $h = h_\rho$, we could choose $X$ to be the first-order deformation of the family $\Sigma_{\rho,h}$ at $h = h_\rho$. Since $X$ is a variation through MOTS, (8.12) holds, and the definition of $h_\rho$ implies that
\[
\frac{d}{dh}\bigg|_{h=h_\rho}F(\Sigma_{\rho,h}) \ge 0,
\]
which translates to (8.13).
In general, although the path $h \mapsto \Sigma_{\rho,h}$ of hypersurfaces need not be differentiable in $h$ at $h_\rho$, Lemma 8.14 tells us that $h_\rho$ is not a jump height, so at least we have continuity of the map $h \mapsto \Sigma_{\rho,h}$ at $h = h_\rho$. When the MOTS stability operator $L_{\Sigma_{\rho,h_\rho}}$ (Definition 7.35) with Dirichlet boundary condition has no kernel, the inverse function theorem (Theorem A.43) can be used to show that the map $h \mapsto \Sigma_{\rho,h}$ really is $C^1$ in $h$ at $h = h_\rho$, and the result follows.
When the linearization does have kernel, the situation is more com-
plicated, but the inverse function theorem can still be used to find a useful
description of the Σρ,h for h near hρ , using the same basic idea that was used
in our proof of Theorem 2.38. (Fundamental work of Brian White [Whi87]
explained how to view spaces of minimal hypersurfaces with boundary as
smooth manifolds even when the stability operator has kernel.) Even though
Σρ,h need not be differentiable in h at h = hρ in this case, say, viewed as a
graph over Σρ,hρ , one can still extract a convergent subsequence of the dif-
ference quotient of the graph in order to find the desired X. For full details
of this proof, see [EHLS16].
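The elementary real-variable fact behind the choice of $h_\rho$ can be isolated as follows. This is a sketch under the conclusions of Lemmas 8.13 and 8.14 only; the abbreviations $f(h) := F(\Sigma_{\rho,h})$, $h_* := h_\rho$, and $f(h_*^\pm)$ for the one-sided limits are ours.

```latex
\begin{align*}
&\text{Hypotheses: } f(-\Lambda) < 0 < f(\Lambda), \qquad
 f(h_0^-) \ge f(h_0) \ge f(h_0^+) \ \text{for all } h_0,
 \ \text{with a strict inequality at jump heights.}\\
&h_* := \inf\{h \in [-\Lambda,\Lambda] \mid f(h) > 0\}
 \ \Longrightarrow\ f(h) \le 0 \text{ for } h < h_*,
 \ \text{so } f(h_*^-) \le 0;\\
&\text{if } f(h_*) \le 0 \text{, there exist } h_k \downarrow h_* \text{ with } f(h_k) > 0,
 \ \text{so } f(h_*^+) \ge 0;\quad
 \text{if } f(h_*) > 0 \text{, then } f(h_*^-) \ge f(h_*) > 0 \text{, a contradiction.}\\
&\text{Hence } 0 \ge f(h_*^-) \ge f(h_*) \ge f(h_*^+) \ge 0
 \ \Longrightarrow\ f(h_*) = 0 \text{ and } h_* \text{ is not a jump height},\\
&\frac{f(h_k) - f(h_*)}{h_k - h_*} > 0 \quad\text{for all } k.
\end{align*}
```

The endpoint cases are excluded in the same way: $h_* = -\Lambda$ would give $f(-\Lambda) \ge f(-\Lambda^+) \ge 0$, and $h_* = \Lambda$ would give $0 \ge f(\Lambda^-) \ge f(\Lambda) > 0$. The positive difference quotients along $h_k$ are the weak form of the derivative condition (8.13).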
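The previous lemma implies the desired “vertical stability” for $\Sigma_\rho$.
Lemma 8.17. Assume the hypotheses of Lemma 8.16, and choose $\Sigma_\rho := \Sigma_{\rho,h_\rho}$, $X$, and $\phi$ as in the conclusion of that lemma. Then for any smooth function $v$ on $\Sigma_\rho$ that is equal to $\varphi = \langle Z, \nu\rangle$ at $\partial\Sigma_\rho$, we have
\[
\int_{\Sigma_\rho}\left(|\nabla v|^2 + Qv^2\right)d\mu + \int_{\partial\Sigma_\rho}\langle\tilde G(X), \eta\rangle\,dA \ge 0, \tag{8.14}
\]
where
\[
\tilde G(X) = G(X) + \varphi\phi W. \tag{8.15}
\]
Here $W := W_{\Sigma_\rho}$ and $Q := Q_{\Sigma_\rho}$ are defined as in Proposition 7.32, and we use the abbreviations $d\mu := d\mu_{\Sigma_\rho}$ and $dA := d\mu_{\partial\Sigma_\rho}$.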
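Proof. Recall from the proof of Proposition 7.37 that
\[
v^2\phi^{-1}D\theta^+|_{\Sigma_\rho}(X) = \operatorname{div}\bigl(v^2(W - \nabla\log\phi)\bigr) + |\nabla v|^2 + Qv^2 - |(W - \nabla\log\phi)v + \nabla v|^2,
\]
where we have suppressed the $\Sigma_\rho$ subscripts. Together with equation (8.12), this implies
\[
0 \le |\nabla v|^2 + Qv^2 + \operatorname{div}\bigl(v^2(W - \nabla\log\phi)\bigr).
\]
Noting that $v = \varphi = \phi$ at $\partial\Sigma_\rho$, we can integrate this to obtain
\[
\begin{aligned}
0 &\le \int_{\Sigma_\rho}\left(|\nabla v|^2 + Qv^2\right)d\mu + \int_{\partial\Sigma_\rho}\langle v^2(W - \nabla\log\phi), \eta\rangle\,dA\\
&= \int_{\Sigma_\rho}\left(|\nabla v|^2 + Qv^2\right)d\mu + \int_{\partial\Sigma_\rho}\langle\varphi\phi W - \varphi\nabla\phi, \eta\rangle\,dA\\
&= \int_{\Sigma_\rho}\left(|\nabla v|^2 + Qv^2\right)d\mu + \int_{\partial\Sigma_\rho}\langle\tilde G(X), \eta\rangle\,dA - DF|_{\Sigma_\rho}(X)\\
&\le \int_{\Sigma_\rho}\left(|\nabla v|^2 + Qv^2\right)d\mu + \int_{\partial\Sigma_\rho}\langle\tilde G(X), \eta\rangle\,dA,
\end{aligned}
\]
where we used equation (8.7) and inequality (8.13) in the third and fourth lines, respectively.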
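The boundary manipulation in the second and third lines can be unpacked as follows. This is a sketch using only (8.7), (8.15), and the boundary condition $v = \varphi = \phi$ on $\partial\Sigma_\rho$.

```latex
\begin{align*}
v^2\,(W - \nabla\log\phi)\big|_{\partial\Sigma_\rho}
 &= \varphi\phi\,W - \phi^2\,\frac{\nabla\phi}{\phi}
 = \varphi\phi\,W - \varphi\,\nabla\phi,\\
\int_{\partial\Sigma_\rho}\langle\tilde G(X), \eta\rangle\,dA - DF|_{\Sigma_\rho}(X)
 &= \int_{\partial\Sigma_\rho}\bigl\langle G(X) + \varphi\phi W - \varphi\nabla\phi - G(X),\ \eta\bigr\rangle\,dA
 = \int_{\partial\Sigma_\rho}\langle\varphi\phi W - \varphi\nabla\phi, \eta\rangle\,dA .
\end{align*}
```

The first displayed inequality of the proof comes from discarding the nonpositive term $-|(W - \nabla\log\phi)v + \nabla v|^2$ and using $D\theta^+|_{\Sigma_\rho}(X) = 0$.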
By Lemma 8.8, there exists a sequence $\rho_j \to \infty$ such that $\Sigma_{\rho_j}$ converges smoothly on compact sets to some $\Sigma_\infty$ that has the properties described in Lemma 8.8. In order to transfer the vertical stability estimate Lemma 8.17 to the limit space $\Sigma_\infty$, we need to have uniform control over the $\Sigma_\rho$'s as in the following lemma.
Lemma 8.18. Assume the hypotheses of Lemma 8.16, and choose $\Sigma_\rho := \Sigma_{\rho,h_\rho}$ as in the conclusion of that lemma. As before, let $Z = \partial_n$ outside some compact set, and let $\varphi = \langle Z, \nu\rangle$ be its normal component. Then the following estimate holds uniformly in $\rho$:
\[
\int_{\Sigma_\rho\setminus C_r}\left(|\nabla\varphi|^2 + Q\varphi^2\right)d\mu + \int_{\partial(\Sigma_\rho\setminus C_r)}\langle\tilde G(Z), \eta\rangle\,dA = O(r^{-1}), \tag{8.16}
\]
using the same abbreviations as in Lemma 8.17.
Proof. The first step, which we omit, is to show that
\[
D\theta^+|_{\Sigma_\rho}(Z) = O(|x|^{-n}), \tag{8.17}
\]
\[
DF|_{\Sigma_\rho\cap C_r}(Z) = O(r^{-1}) \tag{8.18}
\]
hold uniformly in $\rho$. (You may wish to try this as an exercise.)
Next we vary $\Sigma_\rho$ in the direction $Z$, which is just vertical translation outside a compact set. We again use the computation in the proof of Proposition 7.37, suppressing $\Sigma_\rho$ subscripts, to see that
\[
\varphi\,D\theta^+|_{\Sigma_\rho}(Z) = \operatorname{div}(\varphi^2W - \varphi\nabla\varphi) + |\nabla\varphi|^2 + Q\varphi^2 - |W|^2\varphi^2.
\]
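Using the definition of $\tilde G$ and equation (8.7), we have
\[
\begin{aligned}
\int_{\Sigma_\rho\setminus C_r}&\left(|\nabla\varphi|^2 + Q\varphi^2\right)d\mu + \int_{\partial(\Sigma_\rho\setminus C_r)}\langle\tilde G(Z), \eta\rangle\,dA\\
&= \int_{\Sigma_\rho\setminus C_r}\left(|\nabla\varphi|^2 + Q\varphi^2\right)d\mu + \int_{\partial(\Sigma_\rho\setminus C_r)}\langle G(Z) + \varphi^2W, \eta\rangle\,dA\\
&= \int_{\Sigma_\rho\setminus C_r}\left(|\nabla\varphi|^2 + Q\varphi^2\right)d\mu + \int_{\partial(\Sigma_\rho\setminus C_r)}\langle\varphi^2W - \varphi\nabla\varphi, \eta\rangle\,dA\\
&\qquad + DF|_{\Sigma_\rho}(Z) - DF|_{\Sigma_\rho\cap C_r}(Z)\\
&= \int_{\Sigma_\rho\setminus C_r}\left(|\nabla\varphi|^2 + Q\varphi^2 + \operatorname{div}(\varphi^2W - \varphi\nabla\varphi)\right)d\mu + O(r^{-1})\\
&= \int_{\Sigma_\rho\setminus C_r}\left(\varphi\,D\theta^+|_{\Sigma_\rho}(Z) + |W|^2\varphi^2\right)d\mu + O(r^{-1})\\
&= O(r^{-1}),
\end{aligned}
\]
where the estimate (8.18) was used to eliminate the $DF$ terms, and the last line follows from the estimate (8.17), the decay of $W$, and volume control coming from the $\lambda$-minimizing property of $\Sigma_\rho$.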
We finally obtain the desired stability estimate on the complete MOTS $\Sigma_\infty$.
Proof of Lemma 8.11. Choose $h_j$ to be the $h_{\rho_j}$ from Lemma 8.16. We pass to a subsequence so that $\Sigma_j := \Sigma_{\rho_j,h_{\rho_j}}$ converges to some limit space $\Sigma_\infty$ as in Lemma 8.8. (Although $\Sigma_\infty$ need not be connected, we may assume that it is connected without loss of generality for the purpose of the following arguments.) Following the notation of Lemma 8.8 we use coordinates $x'$ on $\Sigma_\infty \setminus K$, where $K$ is a large compact subset of $M$. It is easy to see from Lemma 8.8 that
\[
Q_{\Sigma_\infty} = \frac12R_{\Sigma_\infty} - \mu - J(\nu_{\Sigma_\infty}) - \frac12|k_{\Sigma_\infty} + A_{\Sigma_\infty}|^2 = O(|x'|^{-n}).
\]
Since $\Sigma_\infty$ is asymptotically flat, we know that its volume grows like that of $\mathbb{R}^{n-1}$. One can then see that
\[
\int_{\Sigma_\infty}\left(|\nabla w|^2 + |Q_{\Sigma_\infty}|w^2\right)d\mu < \infty, \tag{8.19}
\]
whenever $w - 1 \in W^{1,2}_{\frac{3-n}{2}}(\Sigma_\infty)$.
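In the following computation we will use a $j$-subscript to denote quantities that depend on $\Sigma_j$, including $j = \infty$. Whenever $r < \rho_j$, we use $\eta_j$ to denote the outward unit normal of $\partial(\Sigma_j \cap C_r)$ inside $\Sigma_j$. (In what follows, we can restrict our attention to values of $r$ such that $\partial C_r$ is transverse to each $\Sigma_j$.) Fix a smooth vector field $Z$ on $M$ that agrees with $\partial_n$ outside a compact set. Let $\varphi_\infty = \langle\nu_{\Sigma_\infty}, Z\rangle$ and $\varphi_j = \langle\nu_{\Sigma_j}, Z\rangle$. Note that Lemma 8.8 implies that $\varphi_\infty - 1 = O_1(|x'|^{2-n}) \in W^{1,2}_{\frac{3-n}{2}}(\Sigma_\infty)$ and that $\tilde G_\infty(Z) = O(|x'|^{1-n})$, where $\tilde G_\infty$ was defined in (8.15). Therefore the surface integral of $\tilde G_\infty(Z)$ vanishes in the limit. So we have
\[
\begin{aligned}
\int_{\Sigma_\infty}\left(|\nabla\varphi_\infty|^2 + Q_\infty\varphi_\infty^2\right)d\mu
&= \lim_{r\to\infty}\int_{\Sigma_\infty\cap C_r}\left(|\nabla\varphi_\infty|^2 + Q_\infty\varphi_\infty^2\right)d\mu\\
&= \lim_{r\to\infty}\left[\int_{\Sigma_\infty\cap C_r}\left(|\nabla\varphi_\infty|^2 + Q_\infty\varphi_\infty^2\right)d\mu + \int_{\partial(\Sigma_\infty\cap C_r)}\langle\tilde G_\infty(Z), \eta_\infty\rangle\,dA\right]\\
&= \lim_{r\to\infty}\lim_{j\to\infty}\left[\int_{\Sigma_j\cap C_r}\left(|\nabla\varphi_j|^2 + Q_j\varphi_j^2\right)d\mu + \int_{\partial(\Sigma_j\cap C_r)}\langle\tilde G_j(Z), \eta_j\rangle\,dA\right]\\
&= \lim_{j\to\infty}\left[\int_{\Sigma_j}\left(|\nabla\varphi_j|^2 + Q_j\varphi_j^2\right)d\mu + \int_{\partial\Sigma_j}\langle\tilde G_j(Z), \eta_j\rangle\,dA\right]\\
&\ge 0,
\end{aligned}
\]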
where the third equality follows from the smooth convergence of $\Sigma_j$ to $\Sigma_\infty$ on compact sets, the fourth equality follows from Lemma 8.18, and the last inequality follows from Lemma 8.17. Hence (8.3) holds for $w = \varphi_\infty$. Moreover, since the argument works for any $Z$ that equals $\partial_n$ outside a compact set, inequality (8.3) holds for all test functions that agree with $\varphi_\infty$ outside a compact set.
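We now argue by density that the lemma holds for any function $w$ such that $w - \varphi_\infty \in W^{1,2}_{\frac{3-n}{2}}(\Sigma_\infty)$. Note that $C_c^\infty(\Sigma_\infty)$ is dense in $W^{1,2}_{\frac{3-n}{2}}(\Sigma_\infty)$. Let $w_i - \varphi_\infty$ be a sequence of functions in $C_c^\infty(\Sigma_\infty)$ that converges to $w - \varphi_\infty$ in $W^{1,2}_{\frac{3-n}{2}}(\Sigma_\infty)$. It is now straightforward to check that
\[
\begin{aligned}
0 &\le \liminf_{i\to\infty}\int_{\Sigma_\infty}\left(|\nabla w_i|^2 + Q_\infty w_i^2\right)d\mu\\
&= \liminf_{i\to\infty}\int_{\Sigma_\infty}\Bigl[\left(|\nabla w|^2 + Q_\infty w^2\right) + 2\langle\nabla w, \nabla(w_i - w)\rangle + 2Q_\infty w(w_i - w)\\
&\qquad\qquad\qquad + \left(|\nabla(w_i - w)|^2 + Q_\infty(w_i - w)^2\right)\Bigr]d\mu\\
&= \int_{\Sigma_\infty}\left(|\nabla w|^2 + Q_\infty w^2\right)d\mu,
\end{aligned}
\]
where the cross terms vanish because of (8.19). Finally, we note that $\varphi_\infty - 1 \in W^{1,2}_{\frac{3-n}{2}}$ to obtain the desired result.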
8.2. Spacetime positive mass rigidity

Unfortunately, the techniques described in the previous section do not yield a
rigidity result, mainly because the proof works by contradiction. Recall that
the same was true in the Riemannian case, in which a completely separate
argument was used to prove rigidity (Theorem 3.19). In Schoen and Yau’s
proof of the inequality E ≥ 0 using the Jang equation [SY81b], they proved
the following result, which was later generalized to dimensions less than 8
by M. Eichmair [Eic13].
Theorem 8.19 (Spacetime positive energy rigidity [SY81b, Eic13]). Let $n < 8$, and let $(M^n, g, k)$ be a complete asymptotically flat initial data set satisfying the dominant energy condition. If $n = 3$, assume further that $\operatorname{tr}_g k = O(|x|^{-\gamma})$ for some $\gamma > 2$. If $E = 0$, then $(M, g, k)$ sits inside Minkowski space.
The Jang equation proof of the spacetime positive energy theorem in-
volves reducing to the Riemannian positive mass theorem (Theorem 3.18),
and the rigidity result above similarly relies upon Riemannian positive mass
rigidity (Theorem 3.19).
A similar result in the spin case (Corollary 8.28) follows directly from
Witten’s proof of the positive mass theorem and will be discussed in the
next section.
The theorem above does not consider the more general case of equality
E = |P | > 0. It turns out that this is impossible.
Theorem 8.20 (Spacetime positive mass rigidity [BC96, HL17]). Let M
be a manifold on which the spacetime positive mass theorem holds, and let
(M, g, k) be a complete asymptotically flat initial data set satisfying the dom-
inant energy condition. If E = |P |, then E = |P | = 0.
This theorem was proved using spinors by P. Chruściel and R. Beig
[BC96] in three dimensions and for higher-dimensional spin manifolds by
Chruściel and D. Maerten [CM06]. They also directly showed that (M, g, k)
sits inside Minkowski space, significantly strengthening the original rigidity
argument of Witten described in Corollary 8.28. By replacing the part of
the argument in [BC96] that depended on spinors, Lan-Hsuan Huang and
the author were able to extend Theorem 8.20 to any manifold on which the
spacetime positive mass theorem holds [HL17].
8.3. Proof for spin manifolds

This proof is not much harder than the proof presented in Chapter 5 for the
time-symmetric case. We will follow the exact same steps that were taken
in Chapter 5, generalizing appropriately at every step along the way.
Let $M$ be a spin manifold, and let $(M, g, k)$ be an initial data set. Then we can make the following constructions. Consider the bundle $\mathbb{R} \times TM$, and let $e_0$ be the constant section $(1, 0)$, meaning that it is always 1 in the $\mathbb{R}$ component and zero in the vector field component. We can equip this bundle with a Lorentzian product $\mathbf{g}$ by declaring $\mathbf{g}(e_\mu, e_\nu) = \eta_{\mu\nu}$, where $e_1, \dots, e_n$ is any orthonormal basis of $TM$ and Greek indices run from 0 to $n$ as usual. (If we think of $(M^n, g, k)$ as sitting inside some spacetime $(\mathbf{M}^{n+1}, \mathbf{g})$, then this is just the pullback of the bundle $T\mathbf{M}$, but we choose to take a purely initial data point of view and avoid directly dealing with $\mathbf{M}$.)
The Clifford algebra construction described in Chapter 5 works perfectly well for products that are not positive definite, and the construction can be used as a bundle construction to obtain a Clifford bundle
\[ \mathrm{Cl}(\mathbb{R} \times TM) = \Bigl( \bigoplus_{r=0}^{\infty} \bigotimes\nolimits^{r} (\mathbb{R} \times TM) \Bigr) \Big/ I, \]
where $I$ is the ideal bundle generated by the relations $v \otimes v = -\mathbf{g}(v, v)$ for all $v \in \mathbb{R} \times T_pM$ at each $p \in M$, or, in other words,
\[ e_\mu e_\nu + e_\nu e_\mu = -2\eta_{\mu\nu}, \]
where $e_1, \dots, e_n$ is any orthonormal basis of $T_pM$.
Let $S(M)$ be a spinor bundle on $(M, g)$ carrying the structure of a real module over the Clifford bundle $\mathrm{Cl}(M) := \mathrm{Cl}(TM)$, exactly as in Chapter 5.
(Recall that the construction of this S(M ) is the step that requires M to
be spin.) Recall that there is an inner product on S(M ) with the property
that unit vectors in T M ⊂ Cl(M ) act orthogonally on S(M ), and more-
over all elements of T M act as skew-symmetries on S(M ). We would like
to augment S(M ) to obtain a bundle that Cl(R × T M ) can act on. We
define S̃(M ) = S(M ) ⊕ S(M ), with the inner product obtained from adding
the inner products on the two components, and we can define an action of
Cl(R × T M ) on S̃(M ) as follows. For any p ∈ M , any spinors ψ1 , ψ2 ∈
Sp (M ), and any vector v ∈ Tp M ,
v · (ψ1 , ψ2 ) = (v · ψ1 , −v · ψ2 ),
e0 · (ψ1 , ψ2 ) = (ψ2 , ψ1 ).
One can check that this respects the Clifford relations and therefore gives
a well-defined action of $\mathrm{Cl}(\mathbb{R} \times TM)$ on $\tilde S(M)$. However, note that unlike $e_1, \dots, e_n$, the section $e_0$ acts symmetrically rather than skew-symmetrically. (Once again, from a spacetime point of view, one could instead construct the desired $\tilde S(M)$ by starting with a spinor bundle on the ambient spacetime $(\mathbf{M}, \mathbf{g})$ and then pulling it back to $M$.) Also, the spin connection $\nabla$ on $S(M)$ extends to a connection on $\tilde S(M)$ that commutes with the action of $e_0$.
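As a quick check of the claim that this action respects the Clifford relations, the following sketch applies the definitions to a pair $(\psi_1, \psi_2)$. It assumes only that the $e_i$ satisfy $e_i e_j + e_j e_i = -2\delta_{ij}$ on $S(M)$ and that $\eta_{00} = -1$, which is the sign convention consistent with $e_\mu e_\nu + e_\nu e_\mu = -2\eta_{\mu\nu}$ and $e_0 e_0 = 1$.

```latex
\begin{align*}
e_0 e_0 \cdot (\psi_1, \psi_2) &= e_0 \cdot (\psi_2, \psi_1) = (\psi_1, \psi_2), \\
e_0 e_i \cdot (\psi_1, \psi_2) &= e_0 \cdot (e_i \psi_1, -e_i \psi_2) = (-e_i \psi_2,\ e_i \psi_1), \\
e_i e_0 \cdot (\psi_1, \psi_2) &= e_i \cdot (\psi_2, \psi_1) = (e_i \psi_2,\ -e_i \psi_1), \\
(e_i e_j + e_j e_i) \cdot (\psi_1, \psi_2)
  &= \bigl( (e_i e_j + e_j e_i)\psi_1,\ (e_i e_j + e_j e_i)\psi_2 \bigr)
   = -2\delta_{ij} (\psi_1, \psi_2).
\end{align*}
```

The second and third lines add to zero, so $e_0 e_i + e_i e_0 = 0 = -2\eta_{0i}$, while the first line gives $e_0 e_0 = 1 = -\eta_{00}$. The symmetry of $e_0$ can be read off from the first line together with the definition of the inner product on $\tilde S(M)$ as a sum over the two components.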
In this section we will adopt Einstein summation notation except when we say otherwise. We define a new connection $\tilde\nabla$ on $\tilde S(M)$ according to
\[ \tilde\nabla_i = \nabla_i + \tfrac{1}{2} k_{ij} e_j e_0. \]
(From a spacetime perspective, the connection $\tilde\nabla$ comes from the ambient Levi-Civita connection on the spacetime tangent bundle $T\mathbf{M}$, while $\nabla$ comes from the intrinsically defined Levi-Civita connection on $TM$.) We now define the hypersurface Dirac operator $\tilde D$ on $\tilde S(M)$ by
\begin{align*}
\tilde D &= e_i \cdot \tilde\nabla_i \\
&= D + \tfrac{1}{2} k_{ij} e_i e_j e_0 \\
&= D - \tfrac{1}{2} (\operatorname{tr} k) e_0,
\end{align*}
where $D$ is the usual Dirac operator on $\tilde S(M)$ as defined in Chapter 5, and we used symmetry considerations in the last line. (The trace of $k$ is computed with respect to $g$.) Next we obtain a version of the Schrödinger-Lichnerowicz formula (Theorem 5.10) for initial data sets.
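Theorem 8.21 (Witten). Let $(M, g, k)$ be a spin initial data set. For any $\psi \in C^\infty(\tilde S(M))$,
\[ \tilde D^2 \psi = \tilde\nabla_i^* \tilde\nabla_i \psi + \tfrac{1}{2} (\mu + J e_0) \cdot \psi, \]
where $\tilde\nabla^*$ is the formal adjoint of $\tilde\nabla$ on $\tilde S(M)$.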
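Proof. We will take advantage of the work we already did to prove Theorem 5.10 in Chapter 5. As usual, we choose an orthonormal basis $e_1, \dots, e_n$ that is parallel at the point where we are computing. For any $\psi \in C^\infty(\tilde S(M))$, we have
\begin{align*}
\tilde D^2 \psi &= D^2 \psi - \tfrac{1}{2} e_i \cdot \nabla_i [(\operatorname{tr} k) e_0 \cdot \psi] - \tfrac{1}{2} (\operatorname{tr} k) e_0 e_i \cdot \nabla_i \psi + \tfrac{1}{4} (\operatorname{tr} k)^2 \psi \\
&= \nabla^* \nabla \psi + \tfrac{1}{4} R \psi - \tfrac{1}{2} \nabla_i (\operatorname{tr} k) e_i e_0 \cdot \psi + \tfrac{1}{4} (\operatorname{tr} k)^2 \psi \\
&= \nabla^* \nabla \psi + \tfrac{1}{2} \Bigl[ \tfrac{1}{2} \bigl( R + (\operatorname{tr} k)^2 \bigr) - \nabla(\operatorname{tr} k) e_0 \Bigr] \cdot \psi,
\end{align*}
where we used Theorem 5.10 in the second line, together with the facts that $e_0$ anticommutes with each $e_i$ and that $e_0 e_0 = 1$. On the other hand, since the formal adjoint of $\tilde\nabla$ on $\tilde S(M)$ is
\[ \tilde\nabla_i^* = -\nabla_i + \tfrac{1}{2} k_{ij} e_j e_0, \]
we can also see that
\begin{align*}
\tilde\nabla_i^* \tilde\nabla_i \psi &= \nabla_i^* \nabla_i \psi - \tfrac{1}{2} \nabla_i (k_{ij} e_j e_0 \cdot \psi) + \tfrac{1}{2} k_{ij} e_j e_0 \cdot \nabla_i \psi + \tfrac{1}{4} k_{ij} e_j e_0 k_{i\ell} e_\ell e_0 \cdot \psi \\
&= \nabla_i^* \nabla_i \psi - \tfrac{1}{2} (\nabla_i k_{ij}) e_j e_0 \cdot \psi + \tfrac{1}{4} |k|^2 \psi \\
&= \nabla_i^* \nabla_i \psi + \tfrac{1}{2} \Bigl[ \tfrac{1}{2} |k|^2 - (\operatorname{div} k) e_0 \Bigr] \cdot \psi,
\end{align*}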
where we used symmetry considerations to obtain the |k|2 term. Putting
these two computations together with the definition of μ and J yields the
result.
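To make the final step explicit, here is a sketch of how the zero-order terms combine. It assumes the conventions $\mu = \frac{1}{2}(R + (\operatorname{tr} k)^2 - |k|^2)$ and $J = \operatorname{div} k - \nabla(\operatorname{tr} k)$ (these are defined elsewhere in the book and are restated here as assumptions), together with $e_0 e_0 = 1$ and $e_i e_0 = -e_0 e_i$.

```latex
\begin{align*}
k_{ij}\, e_i e_j &= \tfrac{1}{2} k_{ij} (e_i e_j + e_j e_i) = -\operatorname{tr} k, \\
k_{ij} k_{i\ell}\, e_j e_0 e_\ell e_0 &= -k_{ij} k_{i\ell}\, e_j e_\ell = |k|^2, \\
\tilde D^2 \psi - \tilde\nabla_i^* \tilde\nabla_i \psi
  &= \tfrac{1}{4} \bigl( R + (\operatorname{tr} k)^2 - |k|^2 \bigr) \psi
   + \tfrac{1}{2} \bigl( \operatorname{div} k - \nabla(\operatorname{tr} k) \bigr) e_0 \cdot \psi \\
  &= \tfrac{1}{2} (\mu + J e_0) \cdot \psi .
\end{align*}
```

The first two identities are the "symmetry considerations" used in the text: a symmetric coefficient contracted against $e_i e_j$ only sees the anticommutator $e_i e_j + e_j e_i = -2\delta_{ij}$.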
It is also worth pointing out that instead of using the result of Theo-
rem 5.10 as we did above, one could instead prove this formula by following
the same steps as in the proof of Theorem 5.10, except making all of the com-
putations in the spacetime setting and obtaining [G(e0 , e0 )+G(ei , e0 )ei e0 ]·ψ
as the zero order term.
Next we need a version of Corollary 5.13 for initial data sets.

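Corollary 8.22. Let $\Omega$ be a bounded open set with smooth boundary in a complete spin initial data set $(M, g, k)$. Then for any $\psi \in C^\infty(\tilde S(M))$,
\begin{align*}
\int_\Omega \Bigl( |\tilde\nabla \psi|^2 - |\tilde D \psi|^2 + \tfrac{1}{2} \langle \psi, (\mu + J e_0) \cdot \psi \rangle \Bigr)\, d\mu_M
&= \int_{\partial\Omega} \langle \psi, \tilde\nabla_\nu \psi + \nu \cdot \tilde D \psi \rangle\, d\mu_{\partial\Omega} \\
&= \int_{\partial\Omega} \langle \psi, \tilde L_i \psi \rangle \nu^i\, d\mu_{\partial\Omega},
\end{align*}
where $\tilde L_i = (\delta_{ij} + e_i e_j) \cdot \tilde\nabla_j$.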
Exercise 8.23. Prove the corollary above. Essentially, follow the proof of Corollary 5.13, except that before attempting to integrate by parts, keep in mind that $\nabla$ is compatible with the inner product on $\tilde S(M)$ while $\tilde\nabla$ is not. Part of your computation will be a proof that $\tilde D$ is formally self-adjoint.
Next we prove an initial data version of Proposition 5.14.

Proposition 8.24. Let $(M, g, k)$ be an asymptotically flat spin initial data set, and let $e_1, \dots, e_n$ be an orthonormal frame near infinity (of a particular end). There exists $\psi_0 \in C^\infty(\tilde S(M))$ which is constant with respect to this frame (meaning that each of its components in $C^\infty(S(M))$ is constant) such that
\[ \lim_{\rho\to\infty} \int_{S_\rho} \langle \psi_0, \tilde L_i \psi_0 \rangle \nu^i\, d\mu_{S_\rho} = \tfrac{1}{2} (n-1) \omega_{n-1} (E - |P|), \]
where $(E, P)$ is the ADM energy-momentum of the chosen end.
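Proof. First choose $\psi_0 \in C^\infty(\tilde S(M))$ to be any spinor which is constant with respect to the chosen frame. We can see that
\begin{align*}
\tilde L_i \psi_0 &= (\delta_{ij} + e_i e_j) \tilde\nabla_j \psi_0 \\
&= L_i \psi_0 + \tfrac{1}{2} (\delta_{ij} + e_i e_j) k_{j\ell} e_\ell e_0 \cdot \psi_0 \\
&= L_i \psi_0 + \tfrac{1}{2} \bigl( k_{i\ell} e_\ell e_0 - (\operatorname{tr} k) e_i e_0 \bigr) \cdot \psi_0 \\
&= L_i \psi_0 + \tfrac{1}{2} \bigl( k_{ij} - (\operatorname{tr} k) \delta_{ij} \bigr) e_j e_0 \cdot \psi_0, \tag{8.20}
\end{align*}
where $L_i$ is defined in Corollary 5.13, and we used symmetry of $k$ in the second equality. From Proposition 5.14, we know that the integral of the $L_i \psi_0$ term gives rise to the ADM energy term. Then
\begin{align*}
\lim_{\rho\to\infty} \int_{S_\rho} \bigl\langle \psi_0, \tfrac{1}{2} \bigl( k_{ij} - (\operatorname{tr} k) \delta_{ij} \bigr) e_j e_0 \cdot \psi_0 \bigr\rangle \nu^i\, d\mu_{S_\rho}
&= \lim_{\rho\to\infty} \tfrac{1}{2} \Bigl[ \int_{S_\rho} \bigl( k_{ij} - (\operatorname{tr} k) g_{ij} \bigr) \nu^i\, d\mu_{S_\rho} \Bigr] \langle \psi_0, e_j e_0 \cdot \psi_0 \rangle \\
&= \tfrac{1}{2} (n-1) \omega_{n-1} \langle \psi_0, P_j e_j e_0 \cdot \psi_0 \rangle,
\end{align*}
where we used the asymptotic decay of $g_{ij} - \delta_{ij}$, the definition of $P$, and the fact that $\psi_0$ is constant with respect to the chosen frame. Finally, we claim that we can choose a unit length $\psi_0$ so that
\[ \langle \psi_0, P_j e_j e_0 \cdot \psi_0 \rangle = -|P|. \]
Putting it all together yields the desired result.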
Exercise 8.25. Prove the claim at the end of the proof above by construct-
ing an appropriate unit length eigenspinor ψ0 of the action of Pj ej e0 on S̃.
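As a hint toward this claim, the following sketch (not the full construction asked for in the exercise) records why $\pm|P|$ are the only possible eigenvalues of the operator $A := P_j e_j e_0$ acting on $\tilde S$. The name $A$ and the test spinors $\phi, \tau$ are introduced here for illustration; the computation uses only the Clifford relations, the skew-symmetry of each $e_j$, and the symmetry of $e_0$.

```latex
\begin{align*}
A^2 &= P_j P_\ell\, e_j e_0 e_\ell e_0
     = -P_j P_\ell\, e_j e_\ell\, e_0 e_0
     = -P_j P_\ell\, e_j e_\ell
     = |P|^2, \\
\langle A\phi, \tau \rangle
    &= P_j \langle e_j e_0 \phi, \tau \rangle
     = -P_j \langle e_0 \phi, e_j \tau \rangle
     = -P_j \langle \phi, e_0 e_j \tau \rangle
     = P_j \langle \phi, e_j e_0 \tau \rangle
     = \langle \phi, A\tau \rangle .
\end{align*}
```

So $A$ is symmetric with $A^2 = |P|^2$. When $P \neq 0$, any spinor of the form $\phi - |P|^{-1} A\phi$ is either zero or an eigenspinor with eigenvalue $-|P|$, and it cannot vanish for every $\phi$ because $A$ anticommutes with $e_0$ and hence is not a multiple of the identity.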
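Corollary 8.26. Assume the hypotheses of Proposition 8.24, and choose $\psi_0$ as in its conclusion. Suppose that $\psi \in C^\infty(\tilde S(M))$ is such that $\psi - \psi_0 \in W^{1,2}_{-q}(\tilde S(M))$, where $q = \frac{n-2}{2}$. Then
\[ \lim_{\rho\to\infty} \int_{S_\rho} \langle \psi, \tilde L_i \psi \rangle \nu^i\, d\mu_{S_\rho} = \tfrac{1}{2} (n-1) \omega_{n-1} |\psi_0|^2 (E - |P|). \]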
The real work needed to prove this corollary was already taken care of
in Corollary 5.15. All that is left to do here is check that the k terms in the
discrepancy between L̃i and Li have enough decay that they do not matter,
and this is indeed the case. (Check this.)
Next we prove that the hypersurface Dirac operator is an isomorphism,
as we did for the ordinary Dirac operator in Proposition 5.16.
Proposition 8.27. Let $(M^n, g, k)$ be an asymptotically flat spin initial data set satisfying the dominant energy condition, and let $q = \frac{n-2}{2}$. The operator
\[ \tilde D : W^{1,2}_{-q}(\tilde S(M)) \longrightarrow L^2_{-q-1}(\tilde S(M)) \]
is an isomorphism. Note that $L^2_{-q-1} = L^2$ with this choice of $q$.
Proof. We already know from Proposition 5.16 that $D : W^{1,2}_{-q} \longrightarrow L^2_{-q-1}$ is an isomorphism. Since $k$ decays faster than $|x|^{-1}$, we see that $D$ can be continuously deformed to $\tilde D = D - \frac{1}{2} (\operatorname{tr} k) e_0$ in the strong operator topology. Hence $\tilde D$ is also a Fredholm operator with index zero [Wik, Fredholm operator], and it suffices to show that $\tilde D$ is injective.
Assume that $\psi \in W^{1,2}_{-q}(\tilde S(M))$ with $\tilde D \psi = 0$. By Corollary 8.22 and Corollary 8.26 with $\psi_0 = 0$, we see that
\[ \int_M \Bigl( |\tilde\nabla \psi|^2 + \tfrac{1}{2} \langle \psi, (\mu + J e_0) \cdot \psi \rangle \Bigr)\, d\mu_M = 0. \]
The dominant energy condition then implies that $\tilde\nabla \psi = 0$. That is, $\nabla_i \psi = -\frac{1}{2} k_{ij} e_j e_0 \cdot \psi$. By a simple bootstrapping argument, it follows that $\psi$ has to be smooth, and since $q > 0$, $\psi$ must approach zero at infinity.

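Consider the function $f = |\psi|^2$. Then
\begin{align*}
\nabla_i f &= \langle \nabla_i \psi, \psi \rangle + \langle \psi, \nabla_i \psi \rangle \\
&= \bigl\langle \bigl( \tilde\nabla_i - \tfrac{1}{2} k_{ij} e_j e_0 \bigr) \cdot \psi, \psi \bigr\rangle + \bigl\langle \psi, \bigl( \tilde\nabla_i - \tfrac{1}{2} k_{ij} e_j e_0 \bigr) \cdot \psi \bigr\rangle \\
&= -\tfrac{1}{2} \langle k_{ij} e_j e_0 \cdot \psi, \psi \rangle - \tfrac{1}{2} \langle \psi, k_{ij} e_j e_0 \cdot \psi \rangle \\
&= -\langle k_{ij} e_j e_0 \cdot \psi, \psi \rangle,
\end{align*}
where we used $\tilde\nabla \psi = 0$ in the third line and the symmetry of the action of $e_j e_0$ in the last line.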
Therefore, if we define $r$ to be a positive function on $M$ equal to $|x|$ in the exterior ends, then since $k$ decays faster than $|x|^{-q-1}$, we have $|\nabla f| \le C r^{-q-1} f$ for some constant $C$. So whenever $f$ is nonzero,
\[ |\nabla(\log f)| \le C r^{-q-1}. \]
Now suppose that $f = |\psi|^2$ is nonzero somewhere. Integrating the inequality above shows that $\psi$ must be nonzero everywhere, and furthermore, since $\int_1^\infty r^{-q-1}\, dr < \infty$, $\psi$ cannot approach zero along any radial ray going toward infinity, which is a contradiction.
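The integration step can be spelled out as follows. This is a sketch along a single radial ray $r \mapsto x(r)$ in an end, with arbitrary radii $\rho_1 < \rho_2$ (notation introduced here), and with the constant $C$ adjusted if necessary to absorb the comparison between $g$-arclength and $dr$ in the asymptotically flat region.

```latex
\begin{align*}
\bigl| \log f(x(\rho_2)) - \log f(x(\rho_1)) \bigr|
  &\le \int_{\rho_1}^{\rho_2} |\nabla(\log f)|\, dr
   \le C \int_{\rho_1}^{\rho_2} r^{-q-1}\, dr
   = \frac{C}{q} \bigl( \rho_1^{-q} - \rho_2^{-q} \bigr)
   \le \frac{C}{q}\, \rho_1^{-q}, \\
f(x(\rho_2)) &\ge f(x(\rho_1)) \exp\!\Bigl( -\frac{C}{q}\, \rho_1^{-q} \Bigr) > 0
  \quad \text{whenever } f(x(\rho_1)) > 0 .
\end{align*}
```

The lower bound is uniform in $\rho_2$, so $f$ stays bounded away from zero along the ray, which is incompatible with $\psi$ approaching zero at infinity.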
Now we can see that the spacetime positive mass theorem follows from the previous proposition, Corollary 8.26, and Corollary 8.22, exactly the way we concluded Theorem 5.12 in Chapter 5. Specifically, if $\psi$ is chosen to solve $\tilde D \psi = 0$ with $\psi - \psi_0 \in W^{1,2}_{-q}$, where $\psi_0$ is a constant spinor chosen as in Corollary 8.26, then
\[ E - |P| = \frac{2}{(n-1)\omega_{n-1}} \int_M \Bigl( |\tilde\nabla \psi|^2 + \tfrac{1}{2} \langle \psi, (\mu + J e_0) \cdot \psi \rangle \Bigr)\, d\mu_M \ge 0, \]
where the inequality follows from the dominant energy condition.
Corollary 8.28 (Spacetime positive energy rigidity for spin manifolds). Let $(M, g, k)$ be a complete asymptotically flat spin initial data set sitting inside some spacetime $(\mathbf{M}, \mathbf{g})$ satisfying the dominant energy condition, and suppose that $E = 0$ in some end. Then the ambient spacetime metric $\mathbf{g}$ is flat along $M$.
Note that the assumption of sitting inside a spacetime satisfying the DEC is stronger than merely assuming the initial data version of the DEC.
Proof. From the spacetime positive mass theorem, we know that $|P| \le E = 0$. In particular, we can choose $\psi_0$ to equal any constant spinor in the chosen end in order for Proposition 8.24 to hold. For convenience, we choose $\psi_0$ so that it is a diagonal element of $\tilde S = S \oplus S$ in the end, or equivalently $e_0 \psi_0 = \psi_0$ there. Then our proof of the spacetime positive mass theorem presented above tells us that we can find a spinor $\psi$ asymptotic to $\psi_0$ and satisfying $\tilde D \psi = 0$, and
\[ 0 = \int_M \Bigl( |\tilde\nabla \psi|^2 + \tfrac{1}{2} \langle \psi, (\mu + J e_0) \cdot \psi \rangle \Bigr)\, d\mu_M. \tag{8.21} \]
The dominant energy condition then implies that $\tilde\nabla \psi = 0$. Observe that this means that for any constant spinor, we can find a $\tilde\nabla$-parallel spinor asymptotic to it. For each $i = 1, \dots, n$, we define $\psi_i$ to be the $\tilde\nabla$-parallel spinor asymptotic to $e_i \cdot \psi_0$ and then define $V_i$ to be the vector field in the ambient spacetime $\mathbf{M}$ along $M$ with the property that
\[ \mathbf{g}(V_i, w) = -\langle e_0 w \cdot \psi, \psi_i \rangle \]
for each $w \in T_p\mathbf{M}$ at each $p \in M$. Here, $\psi$ is still the $\tilde\nabla$-parallel spinor asymptotic to $\psi_0$. We also define $V_0$ to be the vector field in $\mathbf{M}$ along $M$ with the property that
\[ \mathbf{g}(V_0, w) = -\langle e_0 w \cdot \psi, \psi \rangle \]
for each $w \in T_p\mathbf{M}$ at each $p \in M$.
A simple calculation shows that along $M$, we have
\[ \nabla_v \langle e_0 \cdot \phi, \tau \rangle = \langle e_0 \tilde\nabla_v \phi, \tau \rangle + \langle e_0 \phi, \tilde\nabla_v \tau \rangle \]
for any vector $v$ tangent to $M$ and any $\phi, \tau \in C^\infty(\tilde S(M))$. Using this, we can see that $\tilde\nabla_v V_\mu = 0$ for each $\mu = 0, 1, \dots, n$, where $\tilde\nabla$ here denotes the Levi-Civita connection of the ambient metric $\mathbf{g}$. From the construction of $V_\mu$, we can also see that $V_\mu$ is asymptotic to $e_\mu$ at infinity. Consequently, $V_0, V_1, \dots, V_n$ constitute a parallel global $\mathbf{g}$-frame for $\mathbf{M}$ along $M$. Since these are only known to be covariantly constant in the tangential directions, we do not immediately obtain flatness of $\mathbf{g}$ along $M$, but according to the definition of curvature, we do find that the ambient curvatures $R_{ij\mu\nu} = 0$, where the $i, j$ refer to tangential directions (not the $V_i$ directions, which need not be tangential and which we will no longer use). We would like to show that all components vanish. By symmetries of the curvature tensor, the only remaining components that we need to show vanish are the ones of the form $R_{i0j0}$. All others are already known to vanish. In particular, we have
\begin{align*}
G_{i0} &= \mathrm{Ric}_{i0} = 0, \\
\mathrm{Ric}_{00} &= R_{i0i0}, \\
R &= -2 R_{i0i0}, \\
G_{00} &= \mathrm{Ric}_{00} + \tfrac{1}{2} R = 0.
\end{align*}
We now invoke the dominant energy condition in the ambient spacetime, which can be seen to imply that $|G_{ij}| \le G_{00}$. Since we have seen that the latter is zero, it follows that $G_{ij}$ is zero. Consequently $G$ is identically zero, and hence $\mathrm{Ric}$ is identically zero also. In particular,
\[ \mathrm{Ric}_{ij} = -R_{i0j0} \]
is zero, completing the proof.
As alluded to earlier, Witten’s theorem can be generalized to allow a

MOTS boundary, generalizing Theorem 4.16. Partial results were first ob-
tained by Gibbons, Hawking, Horowitz, and Perry in [GHHP83], and a
complete rigorous mathematical proof was obtained by Herzlich [Her98].
Theorem 8.29 (Spacetime positive mass theorem with MOTS boundary).

Let (M, g, k) be a complete one-ended asymptotically flat spin initial data
set with a MOTS boundary, such that the dominant energy condition holds.
Then E ≥ |P |, where (E, P ) is the ADM energy-momentum.
Sketch of the proof. The basic idea is to find a solution $\psi$ of the hypersurface Dirac equation $\tilde D \psi = 0$ such that $\psi$ is asymptotic to the constant spinor $\psi_0$ at infinity (just as in the proof described earlier), with the additional boundary condition
\[ \nu e_0 \cdot \psi = \psi \]
at $\partial M$, where $\nu$ is the outward unit normal of $\partial M$ (pointing into $M$). The key point is that this boundary condition guarantees that the $\partial M$ term arising from Corollary 8.22 vanishes. Suppose for now that we can find the desired $\psi$. Then by Corollary 8.22 and Corollary 8.26,
\begin{align*}
\tfrac{1}{2} (n-1) \omega_{n-1} |\psi_0|^2 (E - |P|)
&= \int_M \Bigl( |\tilde\nabla \psi|^2 + \tfrac{1}{2} \langle \psi, (\mu + J e_0) \cdot \psi \rangle \Bigr)\, d\mu_M
 + \int_{\partial M} \sum_{i=1}^n \langle \psi, \tilde L_i \psi \rangle \nu^i\, d\mu_{\partial M} \\
&\ge \int_{\partial M} \sum_{i=1}^n \langle \psi, \tilde L_i \psi \rangle \nu^i\, d\mu_{\partial M},
\end{align*}
where the inequality follows from the dominant energy condition. We will show that the boundary condition $\nu e_0 \cdot \psi = \psi$ implies that this boundary integral vanishes. Recall from equation (8.20) that
\[ \tilde L_i \psi = L_i \psi + \tfrac{1}{2} \bigl( k_{ij} - (\operatorname{tr}_g k) \delta_{ij} \bigr) e_j e_0 \cdot \psi. \]
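In the following, we adopt Einstein summation notation for indices from 1 to $n-1$.
Claim. $\langle \psi, L_i \psi \rangle \nu^i = -\frac{1}{2} H |\psi|^2$ along $\partial M$.
Choose $e_1, \dots, e_{n-1}$ to be an orthonormal frame for $\partial M$, set $e_n = \nu$, and compute
\begin{align*}
\langle \psi, L_i \psi \rangle \nu^i &= \langle \psi, \nu e_j \cdot \nabla_j \psi \rangle \\
&= \langle \psi, \nu e_j \cdot \nabla_j (\nu e_0 \cdot \psi) \rangle \\
&= \bigl\langle \psi, \nu e_j (\nabla_j \nu) e_0 \cdot \psi + \nu e_j \nu e_0 \cdot \nabla_j \psi \bigr\rangle \\
&= \langle \psi, \nu e_j S(e_j) e_0 \cdot \psi \rangle + \langle \psi, e_j e_0 \cdot \nabla_j \psi \rangle \\
&= -H \langle \psi, \nu e_0 \cdot \psi \rangle - \langle e_0 \cdot \psi, e_j \cdot \nabla_j \psi \rangle \\
&= -H \langle \psi, \psi \rangle + \langle \nu \cdot \psi, e_j \cdot \nabla_j \psi \rangle \\
&= -H |\psi|^2 - \langle \psi, \nu e_j \cdot \nabla_j \psi \rangle \\
&= -H |\psi|^2 - \langle \psi, L_i \psi \rangle \nu^i,
\end{align*}
which proves the Claim. Note that we used symmetry of the shape operator $S$. Next we take care of the $k$ terms:
\begin{align*}
\bigl\langle \psi, \tfrac{1}{2} \bigl( k_{ij} - (\operatorname{tr}_g k) \delta_{ij} \bigr) e_j e_0 \cdot \psi \bigr\rangle \nu^i
&= \tfrac{1}{2} \bigl\langle \psi, [k(\nu, e_j) e_j - (\operatorname{tr}_g k) \nu] e_0 \cdot \psi \bigr\rangle \\
&= \tfrac{1}{2} \bigl\langle \psi, [k(\nu, e_j) e_j - (\operatorname{tr}_{\partial M} k) \nu] e_0 \cdot \psi \bigr\rangle \\
&= \tfrac{1}{2} \langle \psi, k(\nu, e_j) e_j e_0 \cdot \psi \rangle - \tfrac{1}{2} (\operatorname{tr}_{\partial M} k) \langle \psi, \nu e_0 \cdot \psi \rangle \\
&= -\tfrac{1}{2} \langle \psi, k(\nu, e_j) e_j \nu \cdot \psi \rangle - \tfrac{1}{2} (\operatorname{tr}_{\partial M} k) \langle \psi, \psi \rangle \\
&= -\tfrac{1}{2} (\operatorname{tr}_{\partial M} k) |\psi|^2,
\end{align*}
where the last line follows because $\langle \psi, e_j \nu \cdot \psi \rangle = 0$ when $j \neq n$. Altogether, we have shown that
\[ \sum_{i=1}^n \langle \psi, \tilde L_i \psi \rangle \nu^i = -\tfrac{1}{2} \theta^+ |\psi|^2, \]
which vanishes under the assumption that $\partial M$ is a MOTS.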
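The last identity is just the sum of the Claim and the computation of the $k$ terms. The sketch below assumes the convention $\theta^+ = H + \operatorname{tr}_{\partial M} k$ for the outward null expansion (this is defined elsewhere in the book and is restated here as an assumption), and writes $X := \langle \psi, L_i \psi \rangle \nu^i$ as a shorthand introduced for illustration.

```latex
\begin{align*}
X = -H|\psi|^2 - X \quad &\Longrightarrow \quad X = -\tfrac{1}{2} H |\psi|^2, \\
\sum_{i=1}^n \langle \psi, \tilde L_i \psi \rangle \nu^i
  &= X + \sum_{i=1}^n \bigl\langle \psi, \tfrac{1}{2} \bigl( k_{ij} - (\operatorname{tr}_g k)\delta_{ij} \bigr) e_j e_0 \cdot \psi \bigr\rangle \nu^i \\
  &= -\tfrac{1}{2} H |\psi|^2 - \tfrac{1}{2} (\operatorname{tr}_{\partial M} k) |\psi|^2 \\
  &= -\tfrac{1}{2} \theta^+ |\psi|^2 .
\end{align*}
```

In particular, $\theta^+ = 0$ on a MOTS makes the boundary integrand vanish pointwise, not merely after integration.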
The only thing left to do is establish the existence of the desired ψ.
To do this one first shows that the boundary condition νe0 · ψ = ψ is an
“elliptic boundary condition” in the sense that it satisfies the Lopatinski-
Shapiro conditions as in [Wlo87]. Once that is done, the main thing to
show is that the operator D̃ is injective on the space of spinors satisfying
the boundary condition and vanishing sufficiently fast at infinity. Recall that
this was the key fact underlying the proof of Proposition 8.27. Suppose that $\phi$ solves $\tilde D \phi = 0$ with boundary condition $\nu e_0 \cdot \phi = \phi$ at $\partial M$, such that $\phi$ vanishes at infinity in the sense that it lies in $W^{1,2}_{-q}$. Then by Corollary 8.22, Corollary 8.26, and the calculation above,
\[ 0 = \int_M \Bigl( |\tilde\nabla \phi|^2 + \tfrac{1}{2} \langle \phi, (\mu + J e_0) \cdot \phi \rangle \Bigr)\, d\mu_M. \]
Then by the DEC, it follows that $\tilde\nabla \phi = 0$ everywhere, and then by the proof of Proposition 8.27, it follows that $\phi$ is zero. Hence we have the desired injectivity, which can be used to complete the proof.
By considering a different boundary condition, Herzlich was able to make

some partial progress toward a spinor proof of the Riemannian Penrose
inequality [Her97], but a complete spinor proof has remained out of reach.
Chapter 9
Density theorems for the constraint equations

Our main goal for this chapter is to prove a density theorem for the DEC (Theorem 8.3), but we will also prove a density theorem for the vacuum constraints. First, let us settle some notation.
Assumption 9.1. Throughout this entire chapter, $M$ will be a manifold of dimension $n \ge 3$ equipped with a background metric $\bar g$ that is identically Euclidean on each noncompact end of $M$. The variables $p$ and $q$ are real numbers satisfying $p > n$ and $\frac{n-2}{2} < q < n - 2$. If a statement assumes that $(g, \pi)$ is asymptotically flat initial data, then we will assume that this $q$ is smaller than the assumed decay rate of $(g, \pi)$ in Definition 7.17. We also define $s := \frac{4}{n-2}$.
9.1. The constraint operator

Definition 9.2. Define $W^{2,p}_{-q}(\bar g)$ to be the space of continuous Riemannian metrics $g$ on $M$ such that $g - \bar g \in W^{2,p}_{-q}(T^*M \odot T^*M)$. Note that this space is an open subset of an affine copy of $W^{2,p}_{-q}(T^*M \odot T^*M)$, and in particular inherits its topology. We define the constraint operator
\[ \Phi : W^{2,p}_{-q}(\bar g) \times W^{1,p}_{-q-1}(TM \odot TM) \longrightarrow L^p_{-q-2}(M) \times L^p_{-q-2}(TM) \]
by the formula
\[ \Phi(g, \pi) := (2\mu, J) = \Bigl( R_g + \tfrac{1}{n-1} (\operatorname{tr}_g \pi)^2 - |\pi|_g^2,\ \operatorname{div}_g \pi \Bigr) \]
for all $(g, \pi) \in W^{2,p}_{-q}(\bar g) \times W^{1,p}_{-q-1}(TM \odot TM)$, where we use $2\mu$ instead of $\mu$ merely to avoid factors of 2 in our formulas. (Recall the relationship between $\pi$ and $k$ from Definition 7.16.)
For notational simplicity, we will often abbreviate notation by writing things like
\[ \Phi : W^{2,p}_{-q}(\bar g) \times W^{1,p}_{-q-1} \longrightarrow L^p_{-q-2}. \]
Observe that elements (g, π) of the domain of Φ do not necessarily sat-

isfy our definition of asymptotically flat initial data (Definition 7.17)—first,
they need not be smooth, but more importantly the assumed decay rates of
(g, π) do not guarantee that μ and J are integrable. The first failing is not
very important, since one can usefully define asymptotic flatness with only
Sobolev regularity, but the second failing is essential because integrability
of μ and J is necessary to define ADM energy-momentum.
Exercise 9.3. Check that for $(g, \pi) \in W^{2,p}_{-q}(\bar g) \times W^{1,p}_{-q-1}$, the constraints $\Phi(g, \pi)$ indeed lie in $L^p_{-q-2}$.
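Proposition 9.4 (Linearized constraints). The linearization of $\Phi$ at any element $(g, \pi) \in W^{2,p}_{-q}(\bar g) \times W^{1,p}_{-q-1}$ is the operator
\[ D\Phi|_{(g,\pi)} : W^{2,p}_{-q} \times W^{1,p}_{-q-1} \longrightarrow L^p_{-q-2}, \]
given by the formula
\begin{align*}
D\Phi|_{(g,\pi)}(h, w) = \Bigl( & -\Delta_g (\operatorname{tr}_g h) + \operatorname{div}_g (\operatorname{div}_g h) - \langle \mathrm{Ric}_g, h \rangle_g \\
& + \tfrac{2}{n-1} (\operatorname{tr}_g \pi) \pi^{ij} h_{ij} - 2 g_{k\ell} \pi^{ik} \pi^{j\ell} h_{ij} \\
& + \tfrac{2}{n-1} (\operatorname{tr}_g \pi)(\operatorname{tr}_g w) - 2 \langle \pi, w \rangle_g, \\
& (\operatorname{div}_g w)^i - \tfrac{1}{2} g^{ij} \pi^{k\ell} h_{k\ell;j} + g^{ij} \pi^{k\ell} h_{j\ell;k} + \tfrac{1}{2} \pi^{ij} (\operatorname{tr}_g h)_{,j} \Bigr)
\end{align*}
for all $(h, w) \in W^{2,p}_{-q} \times W^{1,p}_{-q-1}$, where the more complicated contractions have been written out in index notation with summation convention.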
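Moreover, the formal $L^2$ adjoint operator is given by the formula
\begin{align*}
D\Phi|^*_{(g,\pi)}(\xi, V) = \Bigl( & -(\Delta_g \xi) g + \operatorname{Hess} \xi - \mathrm{Ric}_g\, \xi + \bigl[ \tfrac{2}{n-1} (\operatorname{tr}_g \pi) \pi - 2 \operatorname{tr}_g (\pi \otimes \pi) \bigr]^\flat \xi \\
& + \tfrac{1}{2} \bigl[ \bigl( L_V \pi + (\operatorname{div} V) \pi - 2 V \odot \operatorname{div} \pi \bigr)^\flat - \bigl( \langle \nabla V, \pi \rangle_g - \langle V, \operatorname{div} \pi \rangle_g \bigr) g \bigr], \\
& -\tfrac{1}{2} (L_V g) + \bigl[ \tfrac{2}{n-1} (\operatorname{tr}_g \pi) g - 2 \pi^\flat \bigr] \xi \Bigr)
\end{align*}
for any function $\xi$ and vector field $V$, where $\flat$ represents lowering of indices, $\operatorname{tr}_g (\pi \otimes \pi)^{ij} = \pi^{ik} \pi^{j\ell} g_{k\ell}$, and $(a \odot b)^{ij} := a^i b^j + a^j b^i$.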
Exercise 9.5. Prove the proposition above. The toughest part of the com-
putation of DΦ was already done in Exercise 1.18. For the DJ component
of DΦ, one approach is to use g as a background metric and invoke Exer-
cise 1.11 after writing the divergence in terms of the W tensor. Computation
of the adjoint should be straightforward.
Although our formulas in Proposition 9.4 may seem quite complicated, from a PDE perspective, we can focus our attention on the top order parts of the expressions. Much like the linearization of the scalar curvature operator,
the operator DΦ|(g,π) is heavily underdetermined (and its adjoint is heavily
overdetermined). Naively, this suggests that the constraint equations should
be “easy” to solve.
We start by looking at a restricted class of deformations for which the
constraint equations become elliptic. For example, restricting the scalar
curvature operator to a conformal class has the effect of turning it into an
elliptic operator. We can perform a similar trick for the constraint operator.
Since there are n + 1 components of the constraint operator, we would like
to have n + 1 functions worth of freedom to deform the pair (g, π). Again,
we consider conformal deformations of the metric (one function worth of
freedom), but we have to try something a little different for the deformation
of π. We need n more functions, so naturally a vector field Y (or equivalently
a 1-form) comes to mind. Moreover, since Φ(g, π) is only first-order in π, it
makes sense to deform π using derivatives of Y if we want to build a second-
order elliptic operator. These concerns led J. Corvino and R. Schoen [CS06]
to the following definition.
Definition 9.6. Given a metric $g$ and a vector field $Y$ on a manifold $M$, we define
\[ \mathcal{L}_g Y := \bigl( L_Y g - (\operatorname{div}_g Y) g \bigr)^\sharp, \]
where $L_Y$ is the Lie derivative and $\sharp$ represents raising of indices.
Let $W^{2,p}_{-q}(1)$ denote the space of positive functions $u$ such that $u - 1 \in W^{2,p}_{-q}(M)$. Note that $W^{2,p}_{-q}(1)$ is a subset of an affine copy of $W^{2,p}_{-q}(M)$ and, in particular, inherits its topology. Given a fixed $(g, \pi) \in W^{2,p}_{-q}(\bar g) \times W^{1,p}_{-q-1}$, we define a map
\[ T : W^{2,p}_{-q}(1) \times W^{2,p}_{-q}(TM) \longrightarrow L^p_{-q-2}(M) \times L^p_{-q-2}(TM) \]
as follows. For any $(u, Y) \in W^{2,p}_{-q}(1) \times W^{2,p}_{-q}$, we define new initial data $(\tilde g, \tilde\pi)$ by
\begin{align*}
\tilde g &:= u^s g, \\
\tilde\pi &:= u^{s/2} (\pi + \mathcal{L}_g Y).
\end{align*}
We define
\[ T(u, Y) := \Phi(\tilde g, \tilde\pi) = (2\tilde\mu, \tilde J), \]
where $\tilde\mu$ and $\tilde J$ are the corresponding energy-momentum densities of the initial data $(\tilde g, \tilde\pi)$.
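Exercise 9.7. Let $(g, \pi) \in W^{2,p}_{-q}(\bar g) \times W^{1,p}_{-q-1}$, and define $T$ as above. Show that for any $(u, Y) \in W^{2,p}_{-q}(1) \times W^{2,p}_{-q}$,
\begin{align*}
T(u, Y) = \Bigl( & u^{-s} \Bigl[ u^{-1} L_g u + \tfrac{1}{n-1} (\operatorname{tr}_g \pi + \operatorname{tr}_g \mathcal{L}_g Y)^2 - \bigl( |\pi|_g^2 + 2 \langle \mathcal{L}_g Y, \pi \rangle + |\mathcal{L}_g Y|_g^2 \bigr) \Bigr], \\
& u^{-s/2} \Bigl[ (\operatorname{div}_g \mathcal{L}_g Y + \operatorname{div}_g \pi)^i + \tfrac{n-1}{n-2} (\pi + \mathcal{L}_g Y)^{ij} u^{-1} u_{,j} - \tfrac{1}{n-2} \operatorname{tr}_g (\pi + \mathcal{L}_g Y) g^{ij} u^{-1} u_{,j} \Bigr] \Bigr),
\end{align*}
where $L_g$ denotes the conformal Laplacian.
Also show that the linearization of $T$ at $(1, 0)$,
\[ DT|_{(1,0)} : W^{2,p}_{-q} \longrightarrow L^p_{-q-2}, \]
is given by the formula
\begin{align*}
DT|_{(1,0)}(v, Z) = \Bigl( & -\tfrac{4(n-1)}{n-2} \Delta_g v + \tfrac{2}{n-1} (\operatorname{tr}_g \pi)(\operatorname{div}_g Z) - 4 \langle \pi, \nabla Z \rangle - 2 s \mu v, \\
& (\operatorname{div}_g \mathcal{L}_g Z)^i + \tfrac{n-1}{n-2} \pi^{ij} v_{,j} - \tfrac{1}{n-2} (\operatorname{tr}_g \pi) \nabla^i v - s J^i v \Bigr)
\end{align*}
for all $(v, Z) \in W^{2,p}_{-q}$, where $(\mu, J)$ are the constraints of $(g, \pi)$.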
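One piece of this computation, the coefficient of $(\operatorname{tr}_g \pi)(\operatorname{div}_g Z)$ in the first component, can be checked as follows. This is a sketch that assumes $\pi$ is symmetric and that all traces and inner products are taken with respect to $g$; it linearizes only the terms that are quadratic in $\pi + \mathcal{L}_g Y$ along $Y = tZ$, and the $t$-independent term $-|\pi|_g^2$ is omitted.

```latex
\begin{align*}
\operatorname{tr}_g \mathcal{L}_g Z
  &= \operatorname{tr}_g (L_Z g) - n \operatorname{div}_g Z
   = (2 - n) \operatorname{div}_g Z, \\
\langle \mathcal{L}_g Z, \pi \rangle
  &= 2 \langle \pi, \nabla Z \rangle - (\operatorname{div}_g Z)(\operatorname{tr}_g \pi), \\
\frac{d}{dt}\Big|_{t=0} \Bigl[ \tfrac{1}{n-1} \bigl( \operatorname{tr}_g \pi + t \operatorname{tr}_g \mathcal{L}_g Z \bigr)^2
   - 2t \langle \mathcal{L}_g Z, \pi \rangle - t^2 |\mathcal{L}_g Z|_g^2 \Bigr]
  &= \tfrac{2(2-n)}{n-1} (\operatorname{tr}_g \pi)(\operatorname{div}_g Z)
   - 4 \langle \pi, \nabla Z \rangle + 2 (\operatorname{tr}_g \pi)(\operatorname{div}_g Z) \\
  &= \tfrac{2}{n-1} (\operatorname{tr}_g \pi)(\operatorname{div}_g Z) - 4 \langle \pi, \nabla Z \rangle .
\end{align*}
```

The remaining terms of the first component come from differentiating $u^{-s}$ and $u^{-1} L_g u$ at $u = 1$, which contribute $-2s\mu v$ and the Laplacian term respectively.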
Note the close relationship between Definition 9.6 and the definition of
harmonic asymptotics (Definition 8.2). The following lemma shows that the
asymptotics for u and Y that appear in Definition 8.2 actually follow from
strong decay of the constraints.
Lemma 9.8. Let $(M, g, \pi)$ be an asymptotically flat initial data set with constraints $(\mu, J)$. Assume that $(\mu, J) \in C^{0,\alpha'}_{-n-1-\delta}$ for some $0 < \alpha' < 1$ and $\delta > 0$.
Suppose there exists $(u, Y) \in W^{2,p}_{-q}(1) \times W^{2,p}_{-q}(TM)$ such that
\begin{align*}
g &= u^s \bar g, \\
\pi &= u^{s/2} \mathcal{L} Y
\end{align*}
outside some compact set, where $\mathcal{L} := \mathcal{L}_{\bar g}$. Then there exist constants $a, b_i \in \mathbb{R}$ and $\alpha \in (0, 1)$ such that
\begin{align*}
u(x) &= 1 + a |x|^{2-n} + O_{2+\alpha}(|x|^{1-n}), \\
Y^i(x) &= b_i |x|^{2-n} + O_{2+\alpha}(|x|^{1-n}).
\end{align*}
In other words, $(g, \pi)$ has harmonic asymptotics in the sense of Definition 8.2.
Proof. By Exercise 9.7, outside of a compact set we have
\begin{align*}
2 u^s \mu &= -\tfrac{4(n-1)}{n-2} u^{-1} \bar\Delta u + \tfrac{1}{n-1} [\operatorname{tr}_{\bar g} (\mathcal{L} Y)]^2 - |\mathcal{L} Y|_{\bar g}^2, \\
u^{s/2} J^i &= \bar\Delta Y^i + \tfrac{n-1}{n-2} (\mathcal{L} Y)^{ij} u^{-1} u_{,j} - \tfrac{1}{n-2} \operatorname{tr}_{\bar g} (\mathcal{L} Y) u^{-1} u_{,i},
\end{align*}
where the bars denote computations in terms of the Euclidean background metric $\bar g$. By weighted Sobolev embedding (Theorem A.25), $(u - 1, Y) = O_{1+\alpha}(|x|^{-q})$ for any $0 < \alpha \le 1 - \frac{n}{p}$. If we examine the assumed decay of all of the non-Laplacian terms above, we may conclude that
\[ \bar\Delta (u, Y) = O_\alpha(|x|^{\max(-2q-2,\, -n-1-\delta)}) \]
for any $0 < \alpha \le \min(1 - \frac{n}{p}, \alpha')$. Since $q > \frac{n-2}{2}$, we see that the decay rate $\max(-2q-2, -n-1-\delta)$ is less than $2 - n$. Therefore Corollary A.37 implies that
\begin{align*}
u &= 1 + a |x|^{2-n} + O_{2+\alpha}(|x|^{2-n-\gamma}), \\
Y^i &= b_i |x|^{2-n} + O_{2+\alpha}(|x|^{2-n-\gamma})
\end{align*}
for constants $a, b_i$, and some $\gamma > 0$. To complete the proof, we need to upgrade the decay rate $2 - n - \gamma$ to $1 - n$. Since we now know that $(u - 1, Y) = O_{2+\alpha}(|x|^{2-n})$, our equations give us the improved rate
\[ \bar\Delta (u, Y) = O_\alpha(|x|^{\max(2-2n,\, -n-1-\delta)}). \]
If $n > 3$, then the decay rate $\max(2 - 2n, -n-1-\delta)$ is less than $-n-1$, and then Corollary A.37 implies the desired result. The $n = 3$ proof is a bit trickier. See [EHLS16, Proposition 24] for the details of that case.
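The comparisons of decay rates used in this proof reduce to elementary inequalities, sketched here for convenience. The exponent $-2q-2$ arises from products of two first-derivative terms, each of order $|x|^{-q-1}$, and the exponent $2-2n$ arises in the same way once each factor is known to be of order $|x|^{1-n}$.

```latex
\begin{align*}
q > \tfrac{n-2}{2} &\iff -2q - 2 < -n
  \quad \Longrightarrow \quad \max(-2q-2,\ -n-1-\delta) < -n < 2 - n, \\
n > 3 &\iff 2 - 2n < -n - 1
  \quad \Longrightarrow \quad \max(2-2n,\ -n-1-\delta) < -n - 1 .
\end{align*}
```

When $n = 3$ one has $2 - 2n = -n - 1 = -4$, so the strict inequality fails; this is consistent with the text treating that case separately.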
Finally, we have the following initial data version of Corollary 6.11, due
to Corvino and Schoen [CS06, Proposition 3.1]. This theorem will allow us
to find solutions of the constraint equations nearby any known solution.
Theorem 9.9 (Surjectivity of the linearized constraint operator [CS06]). Let $(M, g, \pi)$ be a smooth asymptotically flat initial data set. Then the linearized constraint map
\[ D\Phi|_{(g,\pi)} : W^{2,p}_{-q} \times W^{1,p}_{-q-1} \longrightarrow L^p_{-q-2} \]
is surjective.
Proof. First, we claim that DΦ|(g,π) has closed range. To see this, observe
that the range contains the range of DT |(1,0) , as defined above. By Exer-
cise 9.7, we see that the top-order part of DT |(1,0) is just the Laplacian.
Therefore a systems version of Corollary A.42 (as in [LM83]) tells us that
the range of DT |(1,0) has finite codimension, and this implies the claim.
Since $D\Phi|_{(g,\pi)}$ has closed range, surjectivity of $D\Phi|_{(g,\pi)}$ is equivalent to injectivity of its formal $L^2$ adjoint map. Suppose $(\xi, V)$ is an element in the kernel of $D\Phi|^*_{(g,\pi)}$. That is, it lies in the dual space $L^{p^*}_{2+q-n}$ and solves the equation $D\Phi|^*_{(g,\pi)}(\xi, V) = (0, 0)$ in a weak sense. By taking the trace of
the first component of the formula for the adjoint in Proposition 9.4, we can
solve for Δξ in terms of everything else. Feeding that back into the untraced
equation, we see that
Hess ξ = “Ricg ∗ ξ + π ∗ π ∗ ξ + π ∗ LV g + ∇π ∗ V ”,
where the right side is supposed to represent the various kinds of terms
that will appear, with ∗ indicating some sort of generic contraction, and we
ignore the constant scalar factors. Similarly,
LV g = “π ∗ ξ”,
and therefore
(9.1) Hess ξ = “(Ricg + π ∗ π) ∗ ξ + ∇π ∗ V ”.

We can show that $V$ also satisfies a Hessian-type equation, as seen in [CH16]. Computing with respect to a coordinate basis, we have
\begin{align*}
& (L_V g)_{ij;k} + (L_V g)_{ki;j} - (L_V g)_{jk;i} \\
&\quad = (V_{i;jk} + V_{j;ik}) + (V_{k;ij} + V_{i;kj}) - (V_{j;ki} + V_{k;ji}) \\
&\quad = 2 V_{i;jk} + (V_{i;kj} - V_{i;jk}) + (V_{j;ik} - V_{j;ki}) + (V_{k;ij} - V_{k;ji}) \\
&\quad = 2 V_{i;jk} + (R_{\ell ikj} + R_{\ell jik} + R_{\ell kij}) V^\ell,
\end{align*}
where we used Exercise 1.6 in the first line and the definition of Riemann curvature in the last line. From this we can see that
\[ \operatorname{Hess} V = \text{``}\nabla\pi * \xi + \pi * \nabla\xi + \mathrm{Riem} * V\text{''}. \tag{9.2} \]
Taking the trace of (9.1) and using asymptotic flatness and our initial decay assumption $\xi, V \in L^{p^*}_{2+q-n}$, we have $\Delta_g \xi \in L^{p^*}_{q-n}$. So by weighted elliptic regularity (Theorem A.32), we have $\xi \in W^{2,p^*}_{2+q-n}$. In particular, $\nabla\xi \in L^{p^*}_{1+q-n}$, so we can now use the same argument on the trace of (9.2) to conclude that $V \in W^{2,p^*}_{2+q-n}$ also. Then by weighted Sobolev embedding (Theorem A.25), we have $\xi, V \in L^{\frac{np^*}{n-2p^*}}_{2+q-n}$. In summary, we have used elliptic estimates to improve from a Lebesgue space with exponent $p^*$ to one with exponent $\frac{np^*}{n-2p^*}$. We can repeat this process for further improvement. In fact, we can continue this enough times to eventually guarantee that $\xi, V \in C^1_{2+q-n}$. (This should be unsurprising since elliptic regularity guarantees smoothness of $\xi$ and $V$.)
The next step is to show that $(\xi, V)$ has improved decay, and here is where we use the full power of having Hessian equations and not just elliptic equations. Consider a coordinate chart for the asymptotically flat region and write (9.1) and (9.2) in local coordinates. Then we have
\begin{align*}
\partial_i \partial_j \xi &= A_{ij}\, \xi + B_{ij}\, V, \\
\partial_i \partial_j V^k &= C^k_{ij}\, V + D^k_{ij}\, \xi + E^k_{ij}\, \partial\xi, \tag{9.3}
\end{align*}
where asymptotic flatness guarantees that $A, B, C, D = O(|x|^{-q-2})$ and $E = O(|x|^{-q-1})$. We claim that this implies that $(\xi, V)$ vanishes to infinite order in the sense that $\xi, V, \partial\xi, \partial V$ are all $O(|x|^{-\tau})$ for any value of $\tau$. From the discussion above, we know that we have some starting level of decay $(\xi, V) = O(|x|^{-\tau})$ and $(\partial\xi, \partial V) = O(|x|^{-\tau-1})$ for $\tau = n - 2 - q > 0$. By (9.3), we have $(\partial\partial\xi, \partial\partial V) = O(|x|^{-\tau-q-2})$. Simply by integrating along rays out to infinity, it follows that $(\partial\xi, \partial V) = O(|x|^{-\tau-q-1})$, and integrating again yields $(\xi, V) = O(|x|^{-\tau-q})$. Hence we have improved our assumed decay rates by a fixed amount $q$ and this can be done repeatedly, proving the claim.
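The bootstrapping of decay can be summarized as a recursion on the exponent. The sequence $\tau_m$ below is notation introduced here for illustration; each implication is one application of (9.3) or one integration along rays.

```latex
\begin{align*}
\tau_0 &= n - 2 - q, \qquad \tau_{m+1} = \tau_m + q, \\
(\xi, V) = O(|x|^{-\tau_m}),\ (\partial\xi, \partial V) = O(|x|^{-\tau_m - 1})
  \ &\Longrightarrow\ (\partial\partial\xi, \partial\partial V) = O(|x|^{-\tau_m - q - 2}) \\
  &\Longrightarrow\ (\partial\xi, \partial V) = O(|x|^{-\tau_m - q - 1})
   = O(|x|^{-\tau_{m+1} - 1}) \\
  &\Longrightarrow\ (\xi, V) = O(|x|^{-\tau_{m+1}}), \\
\tau_m &= n - 2 - q + mq \ \longrightarrow\ \infty \quad \text{as } m \to \infty .
\end{align*}
```

Since $q > 0$, the exponents $\tau_m$ exceed any prescribed $\tau$ after finitely many steps, which is the infinite order vanishing claimed above.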
For the last step, we argue that infinite order vanishing implies that
(ξ, V ) vanishes identically. This follows from an elliptic argument as was
carried out in [CS06], but since we have the Hessian equations at our dis-
posal, we will use an ODE argument instead as explained in [HMM18].
Specifically, for any point p ∈ M , choose an arclength-parameterized curve
γ(t) starting at γ(1) = p and running off to infinity along a geodesic ray, and
note that t and |x| are uniformly bounded by each other at γ(t). Consider
the function
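\[
F(t) = t^2\bigl(|\nabla\xi|^2 + |\nabla V|^2\bigr) + \xi^2 + |V|^2,
\]
where each function is evaluated at $\gamma(t)$. Then
\[
|F'(t)| \le 2t\bigl(|\nabla\xi|^2 + |\nabla V|^2\bigr)
  + 2t^2\bigl(|\nabla\xi| \cdot |\operatorname{Hess}\xi| + |\nabla V| \cdot |\operatorname{Hess}V|\bigr)
  + 2|\xi| \cdot |\nabla\xi| + 2|V| \cdot |\nabla V|.
\]
We use (9.1) and (9.2) and asymptotic flatness to see that $|\operatorname{Hess}\xi|$ and
$|\operatorname{Hess}V|$ are both bounded by $\frac{C}{t^2}(|\xi| + |V|) + \frac{C}{t}|\nabla\xi|$. Therefore
\[
\begin{aligned}
|F'(t)| &\le 2t\bigl(|\nabla\xi|^2 + |\nabla V|^2\bigr)
  + 2t^2\bigl(|\nabla\xi| + |\nabla V|\bigr)\frac{C}{t^2}\bigl(|\xi| + |V|\bigr)\\
 &\quad + 2t^2\bigl(|\nabla\xi| + |\nabla V|\bigr)\frac{C}{t}|\nabla\xi|
  + 2|\xi| \cdot |\nabla\xi| + 2|V| \cdot |\nabla V|\\
 &\le 2t\bigl(|\nabla\xi|^2 + |\nabla V|^2\bigr)
  + 2C\Bigl(t|\nabla\xi|^2 + t|\nabla V|^2 + \frac{1}{t}|\xi|^2 + \frac{1}{t}|V|^2\Bigr)\\
 &\quad + Ct\bigl(3|\nabla\xi|^2 + |\nabla V|^2\bigr)
  + \frac{1}{t}|\xi|^2 + t|\nabla\xi|^2 + \frac{1}{t}|V|^2 + t|\nabla V|^2\\
 &\le \frac{5(1 + C)}{t}\,F(t).
\end{aligned}
\]
Using an integrating factor, this ODE inequality tells us that
\[
F(t) \ge F(1)\,t^{-5(1+C)}.
\]
But this contradicts the infinite order vanishing of $(\xi, V)$ and its gradient,
unless $F(1) = 0$. Hence $(\xi, V)$ vanishes at every point $p \in M$, completing
the proof of injectivity of $D\Phi|_{(g,\pi)}^*$.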
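The integrating factor step can be spelled out as follows. This is a sketch, with the abbreviation $a := 5(1+C)$ introduced here for convenience; it uses only the bound $|F'(t)| \le \frac{a}{t}F(t)$, the nonnegativity of $F$, and the fact that $t$ and $|x|$ are comparable along $\gamma$.

```latex
% Write a := 5(1+C). The bound |F'(t)| \le (a/t) F(t) gives F'(t) \ge -(a/t) F(t).
\begin{align*}
\frac{d}{dt}\bigl(t^{a} F(t)\bigr)
  &= t^{a}\Bigl(F'(t) + \frac{a}{t}\,F(t)\Bigr) \ge 0
  \quad\Longrightarrow\quad
  t^{a} F(t) \ge F(1) \text{ for all } t \ge 1,\\
0 \le F(1) &\le \lim_{t\to\infty} t^{a} F(t) = 0.
\end{align*}
% The last limit uses F(t) = O(t^{-N}) for every N, which is the
% infinite order vanishing of (\xi, V) and its gradient.
```

Because $F(1) \ge \xi(p)^2 + |V(p)|^2$, the conclusion $F(1) = 0$ forces $(\xi, V)$ to vanish at $p$.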

9.2. The density theorem for vacuum constraints

Corvino and Schoen proved that vacuum initial data can always be per-
turbed to vacuum initial data with harmonic asymptotics [CS06, Theorem
1]. This is the initial data analog of Lemma 3.34.
Theorem 9.10 (Density theorem for vacuum constraints [CS06]). Let
$(M^n, g, \pi)$ be a complete asymptotically flat initial data set satisfying the
vacuum constraints, and let $p > n$ and $\frac{n-2}{2} < q < n - 2$ such that $q$ is less
than the decay rate of $(g, \pi)$. Then for any $\epsilon > 0$, there exists vacuum
initial data $(\tilde g, \tilde\pi)$ on $M$ with harmonic asymptotics (in the sense of
Definition 8.2) such that $(\tilde g, \tilde\pi)$ is $\epsilon$-close to $(g, \pi)$ in
$W^{2,p}_{-q} \times W^{1,p}_{-q-1}$.
Note that by Lemma 8.4, the ADM energy-momentum of the sequence
$(g_i, \pi_i)$ constructed in the theorem will converge to that of $(g, \pi)$.
Proof. We start out following the basic steps set forth in the proof of
Lemma 3.34, except that we use the deformation of initial data described
in the previous section. Let χ be a smooth nonnegative cutoff function on
Rn that is equal to 1 on B1 and vanishes outside B2 . For λ ≥ 1, define
χλ (x) = χ(x/λ). For λ large enough, we can think of χλ as being defined
on M by extending it to be 1 on the compact region of M . Recall that ḡ is
equal to the Euclidean metric on the asymptotically flat end. Define
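\[
g_\lambda := \chi_\lambda g + (1 - \chi_\lambda)\bar g, \qquad
\pi_\lambda := \chi_\lambda \pi,
\]
so that $g_\lambda = \bar g$ and $\pi_\lambda = 0$ when $|x| > 2\lambda$. The asymptotic flatness of $(g, \pi)$
implies that $(g_\lambda, \pi_\lambda) \to (g, \pi)$ in the $W^{2,p}_{-q} \times W^{1,p}_{-q-1}$ sense as $\lambda \to \infty$. For
convenience, we define $(g_\infty, \pi_\infty) := (g, \pi)$.
Next, for any pair $(u, Y) \in W^{2,p}_{-q}(1) \times W^{2,p}_{-q}(TM)$, we define
\[
\tilde g := u^s g_\lambda, \qquad
\tilde\pi := u^{s/2}(\pi_\lambda + L_{g_\lambda} Y),
\]
where $s := \frac{4}{n-2}$, just as in Definition 9.6. If we can find a pair $(u, Y)$
such that $(\tilde g, \tilde\pi)$ satisfies the vacuum constraints, then it will follow from the
construction and Lemma 9.8 that $(\tilde g, \tilde\pi)$ has harmonic asymptotics. As in
Definition 9.6, we define
\[
T_\lambda(u, Y) := \Phi(\tilde g, \tilde\pi) = \Phi\bigl(u^s g_\lambda,\; u^{s/2}(\pi_\lambda + L_{g_\lambda} Y)\bigr),
\]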
where we use the λ subscript to denote the dependence of the operator Tλ
on λ. Then the vacuum constraint equations for (g̃, π̃) simply translate to
the system
Tλ (u, Y ) = 0.
Our basic strategy is clear: for large λ, we look for a solution (u, Y ), which
then gives us our desired vacuum initial data (g̃, π̃) with harmonic asymp-
totics. Moreover, as λ → ∞, the resulting initial data should converge to
the original data (g, π).
For the time being, let us assume that $DT_\infty|_{(1,0)}$ is invertible so that we
can use the inverse function theorem (Theorem A.43) to solve $T_\lambda(u, Y) = 0$.
For large $\lambda$, $T_\lambda$ is a small deformation of $T_\infty$, so it follows that $DT_\lambda|_{(1,0)}$
is also invertible for large enough $\lambda$, and moreover one can also see that
the relevant constants in Theorem A.43 are independent of $\lambda$. Therefore
Theorem A.43 tells us that there exists $C$ independent of $\lambda$, such that for
small enough $r > 0$, $T_\lambda$ has an inverse map from the ball of radius $r$ around
$T_\lambda(1, 0)$ in $L^p_{-q-2}$ into the ball of radius $Cr$ around $(1, 0)$ in $W^{2,p}_{-q}$. Note that
$T_\lambda(1, 0) = \Phi(g_\lambda, \pi_\lambda)$, which converges to $\Phi(g, \pi) = 0$ in $L^p_{-q-2}$ as $\lambda \to \infty$.
Putting all of this together, we see that we can choose a sequence $\lambda_k \to \infty$
and a corresponding pair $(u_k, Y_k)$ that solves $T_{\lambda_k}(u_k, Y_k) = 0$ such that
$(u_k, Y_k) \to (1, 0)$ in the $W^{2,p}_{-q}$ sense.1 In particular, we know that $u_k > 0$ for
large $k$. Therefore the corresponding initial data
\[
(g_k, \pi_k) := \bigl(u_k^s g_{\lambda_k},\; u_k^{s/2}(\pi_{\lambda_k} + L_{g_{\lambda_k}} Y_k)\bigr)
\]
converges to $(g, \pi)$ in $W^{2,p}_{-q} \times W^{1,p}_{-q-1}$ and also satisfies the vacuum constraints.
Elliptic regularity arguments show that $(g_k, \pi_k)$ is smooth. As mentioned
earlier, Lemma 9.8 guarantees that $(g_k, \pi_k)$ has harmonic asymptotics, completing
the main part of the proof.
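The way the uniform constants from the inverse function theorem produce the convergent sequence can be sketched as follows. The radii $r_k$ are a hypothetical choice made here for illustration; any sequence of admissible radii tending to zero would do.

```latex
% Hypothetical notation: r_k is a sequence of admissible radii with r_k \to 0.
\begin{align*}
&\text{choose } \lambda_k \text{ so large that }
  \|T_{\lambda_k}(1,0)\|_{L^p_{-q-2}}
  = \|\Phi(g_{\lambda_k},\pi_{\lambda_k})\|_{L^p_{-q-2}} < r_k;\\
&\text{then } 0 \text{ lies in the } r_k\text{-ball around } T_{\lambda_k}(1,0),
  \text{ so there is } (u_k,Y_k) \text{ with } T_{\lambda_k}(u_k,Y_k) = 0;\\
&\|(u_k,Y_k) - (1,0)\|_{W^{2,p}_{-q}} < C\,r_k \longrightarrow 0 .
\end{align*}
```

The point of the sketch is that $C$ does not depend on $\lambda$, so shrinking $r_k$ forces $(u_k, Y_k) \to (1, 0)$.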
In general, it is not so clear whether DT∞ |(1,0) is invertible. (Recall
that the analogous operator in the proof of Lemma 3.34 was just the Lapla-
cian.) However, it turns out that we can use the surjectivity of the linearized
constraints to get around this problem. As mentioned in the proof of The-
orem 9.9, we know that the range of DT∞ |(1,0) has finite codimension. By
surjectivity of $D\Phi|_{(g,\pi)}$ (Theorem 9.9), we can find a finite-dimensional
subspace $K_2 \subset W^{2,p}_{-q} \times W^{1,p}_{-q-1}$ of first-order initial data deformations $(h, w)$ such
that $D\Phi|_{(g,\pi)}(K_2)$ is a complementing subspace for the range of $DT_\infty|_{(1,0)}$.
Moreover, since the complementing subspace property is an open condition,
we may assume without loss of generality that the elements in $K_2$ are
compactly supported smooth functions by changing them by a small amount.
Meanwhile, let $K_1$ be a complementing subspace for the kernel of $DT_\infty|_{(1,0)}$
in $W^{2,p}_{-q} \times W^{1,p}_{-q-1}$, and now define
\[
\hat T_\lambda : [(1, 0) + K_1] \times K_2 \longrightarrow L^p_{-q-2}
\]
by the formula
\[
\hat T_\lambda(u, Y, h, w) = \Phi\bigl(u^s g_\lambda + h,\; u^{s/2}(\pi_\lambda + L_{g_\lambda} Y) + w\bigr).
\]

By construction of $K_1$ and $K_2$, we know that $D\hat T_\infty|_{(1,0,0,0)}$ is an
isomorphism. Since $D\hat T_\lambda|_{(1,0,0,0)}$ converges to $D\hat T_\infty|_{(1,0,0,0)}$ in the strong operator
topology, we see that $D\hat T_\lambda|_{(1,0,0,0)}$ is also an isomorphism for large $\lambda$, and we
can now apply the exact same inverse function theorem argument described
above. The upshot is that we obtain a sequence of vacuum initial data
\[
(g_k, \pi_k) = \bigl(u_k^s g_{\lambda_k} + h_k,\; u_k^{s/2}(\pi_{\lambda_k} + L_{g_{\lambda_k}} Y_k) + w_k\bigr),
\]
converging to $(g, \pi)$ in the $W^{2,p}_{-q} \times W^{1,p}_{-q-1}$ topology. The key point is that
although we do not know much about $(h_k, w_k)$, we do know that they are
compactly supported, and thus $(g_k, \pi_k)$ still has harmonic asymptotics.
1 Of course, the $k$ in $Y_k$ here is indexing the sequence and does not mean the $k$th component.
In the theorem above, we solved for vacuum constraints, but upon exam-
ining the proof, it is clear that we can do something similar while preserving
any fixed constraints. Indeed, we can even specify a perturbation of those
fixed constraints.
Proposition 9.11. Let (M, g, π) be a complete asymptotically flat initial
data set with constraints (μ, J). There exist δ > 0 and a constant C such
that the following holds.
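For any $(\xi, Z) \in L^p_{-q-2}(M) \times L^p_{-q-2}(TM)$ with $\|(\xi, Z)\|_{L^p_{-q-2}} < \delta$, there
exists $(\tilde g, \tilde\pi) \in W^{2,p}_{-q}(\bar g) \times W^{1,p}_{-q-1}$ such that its constraints $(\tilde\mu, \tilde J)$ satisfy
\[
(\tilde\mu, \tilde J) = (\mu + \xi, J + Z),
\]
and
\[
\|\tilde g - g\|_{W^{2,p}_{-q}} < C\|(\xi, Z)\|_{L^p_{-q-2}}, \qquad
\|\tilde\pi - \pi\|_{W^{1,p}_{-q-1}} < C\|(\xi, Z)\|_{L^p_{-q-2}}.
\]
Moreover, there exists $(u, Y) \in W^{2,p}_{-q}(1) \times W^{2,p}_{-q}(TM)$ such that
\[
\tilde g = u^s \bar g, \qquad
\tilde\pi = u^{s/2} L_{\bar g} Y
\]
outside some compact set. In particular, if $(\tilde\mu, \tilde J)$ decays fast enough to apply
Lemma 9.8, then $(\tilde g, \tilde\pi)$ has harmonic asymptotics.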
Exercise 9.12. Go through the proof of Theorem 9.10 and work out what
modifications need to be made in order to prove Proposition 9.11.
9.3. The density theorem for DEC (Theorem 8.3)

We first restate Theorem 8.3 for convenience.
Theorem 9.13 (Density theorem for DEC [EHLS16]). Let $(M^n, g, \pi)$ be
a complete asymptotically flat initial data set satisfying the dominant energy
condition $\mu \ge |J|_g$, and let $p > n$ and $\frac{n-2}{2} < q < n - 2$ such that $q$ is less
than the decay rate of $(g, \pi)$. Then for any $\epsilon > 0$, there exists initial data
$(\tilde g, \tilde\pi)$ on $M$ also satisfying the dominant energy condition such that
$(\tilde g, \tilde\pi)$ has harmonic asymptotics in each end, $(\tilde g, \tilde\pi)$ is $\epsilon$-close to
$(g, \pi)$ in $W^{2,p}_{-q} \times W^{1,p}_{-q-1}$, and their constraints $(\tilde\mu, \tilde J)$ are $\epsilon$-close to
$(\mu, J)$ in $L^1$.
Furthermore, we can choose $(\tilde g, \tilde\pi)$ such that the strict dominant energy
condition holds. That is, $\tilde\mu > |\tilde J|_{\tilde g}$. Simultaneously, $(\tilde\mu, \tilde J)$ may be chosen to
decay as fast as we like.
Alternatively, we can choose $(\tilde g, \tilde\pi)$ to be vacuum outside a compact set.
That is, $\tilde\mu = |\tilde J| = 0$ outside a compact set.
Note that by Lemma 8.4, the ADM energy-momentum may also be taken
to be $\epsilon$-close to that of $(g, \pi)$. Also observe that, compared to the statement
of Theorem 8.3, we added the conclusion that $(\tilde\mu, \tilde J)$ may be chosen to decay
as fast as we like. The meaning of this statement will be made explicit in
the proof.
According to Proposition 9.11, we can essentially prescribe the con-
straints we want, so naively it looks like Theorem 9.13 should follow easily.
However, the complication is that the dominant energy condition μ ≥ |J|g
depends on g, and when we prescribe constraints according to Proposi-
tion 9.11, we do not have control over the perturbed metric g̃. This problem
was solved in [EHLS16], and the underlying concept can be explained using
the following definition of Corvino and Lan-Hsuan Huang [CH16].
Definition 9.14. Given an initial data set $(M, g, \pi)$, the modified constraint
operator $\overline{\Phi}_{(g,\pi)}$ is an operator on other initial data $(\gamma, \tau)$ defined by
\[
\overline{\Phi}_{(g,\pi)}(\gamma, \tau) = \Phi(\gamma, \tau) + \Bigl(0,\; \tfrac12 (\gamma \cdot J)^\sharp\Bigr),
\]
where $\Phi$ is the usual constraint operator, $J$ is the current density of the
original initial data $(g, \pi)$, and the sharp operator is with respect to $g$, so
that $(\gamma \cdot J)^\sharp$ is the vector field with components $g^{ij}\gamma_{jk}J^k$.
The main usefulness of this definition comes from the following obser-
vation, which tells us that knowledge of the modified constraints gives us
control over |J|g .
Lemma 9.15. Let $(g, \pi)$ and $(\tilde g, \tilde\pi)$ be initial data, and assume that
\[
\overline{\Phi}_{(g,\pi)}(\tilde g, \tilde\pi) - \overline{\Phi}_{(g,\pi)}(g, \pi) = (2\psi, 0)
\]
for some function $\psi$, where $\overline{\Phi}_{(g,\pi)}$ is the modified constraint operator. Then
as long as $|\tilde g - g|_g \le 3$, we have
\[
|\tilde J|^2_{\tilde g} \le |J|^2_g,
\]
where $J$ and $\tilde J$ are the momentum densities of $(g, \pi)$ and $(\tilde g, \tilde\pi)$, respectively.
Proof. Adopting the notation in the statement of the lemma and setting
$h = \tilde g - g$, we can see that the main assumption reduces to the statement
\[
\tilde J^i + \tfrac12 (h \cdot J)^i = J^i.
\]
We compute
\[
\begin{aligned}
|\tilde J|^2_{\tilde g} &= \tilde g_{ij} \tilde J^i \tilde J^j\\
 &= (g_{ij} + h_{ij})\Bigl(J^i - \tfrac12 (h \cdot J)^i\Bigr)\Bigl(J^j - \tfrac12 (h \cdot J)^j\Bigr)\\
 &= (g_{ij} + h_{ij})\Bigl(J^i J^j - g^{ik} h_{k\ell} J^\ell J^j + \tfrac14 (h \cdot J)^i (h \cdot J)^j\Bigr)\\
 &= |J|^2_g - \tfrac34 |h \cdot J|^2_g + \tfrac14 h_{ij} (h \cdot J)^i (h \cdot J)^j\\
 &\le |J|^2_g,
\end{aligned}
\]
where the last inequality holds as long as $|h|_g \le 3$. Note the crucial
cancellation that occurs in the fourth equality above; this is the underlying
motivation for the definition of the modified constraint operator.
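For readers who want to see the cancellation term by term, here is a sketch of the expansion, pairing the bracket separately with $g_{ij}$ and with $h_{ij}$. It uses only the definition $(h \cdot J)^i = g^{ik}h_{k\ell}J^\ell$ and the symmetry of $g$ and $h$; nothing here goes beyond the computation in the proof.

```latex
% (h\cdot J)^i := g^{ik} h_{kl} J^l, so g_{ij}(h\cdot J)^i J^j = h_{ij}J^iJ^j
% and h_{ij}(h\cdot J)^i J^j = |h\cdot J|_g^2.
\begin{align*}
g_{ij}\Bigl(J^iJ^j - (h\cdot J)^iJ^j + \tfrac14 (h\cdot J)^i(h\cdot J)^j\Bigr)
  &= |J|_g^2 - h_{ij}J^iJ^j + \tfrac14\,|h\cdot J|_g^2,\\
h_{ij}\Bigl(J^iJ^j - (h\cdot J)^iJ^j + \tfrac14 (h\cdot J)^i(h\cdot J)^j\Bigr)
  &= h_{ij}J^iJ^j - |h\cdot J|_g^2 + \tfrac14\,h_{ij}(h\cdot J)^i(h\cdot J)^j,\\
\tfrac14\,h_{ij}(h\cdot J)^i(h\cdot J)^j
  &\le \tfrac14\,|h|_g\,|h\cdot J|_g^2 \le \tfrac34\,|h\cdot J|_g^2
  \quad\text{when } |h|_g \le 3 .
\end{align*}
```

Adding the first two lines, the terms $\mp h_{ij}J^iJ^j$ cancel, which is exactly why no first-order term in $h$ survives.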
The analysis of the modified constraint operator is nearly identical to
that of the original constraint operator, since we have only changed it by a
zero-order term. For notational convenience, we will denote the linearization
of $\overline{\Phi}_{(g,\pi)}$ at $(g, \pi)$ by $D\overline{\Phi}_{(g,\pi)} := D\overline{\Phi}_{(g,\pi)}\big|_{(g,\pi)}$. Clearly, we have
\[
D\overline{\Phi}_{(g,\pi)}(h, w) = D\Phi|_{(g,\pi)}(h, w) + \Bigl(0,\; \tfrac12 (h \cdot J)^\sharp\Bigr),
\]
where $(h, w)$ represents a first-order deformation of initial data.
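As a quick check of this formula, one can differentiate along the straight-line path $(g + th, \pi + tw)$. The sketch below writes $\overline{\Phi}_{(g,\pi)}$ for the modified constraint operator of Definition 9.14 and uses only that the added term is linear in its metric argument, with $J$ held fixed.

```latex
% (g\cdot J)^\sharp has components g^{ij} g_{jk} J^k = J^i, and J is fixed.
\begin{align*}
\overline{\Phi}_{(g,\pi)}(g+th,\pi+tw)
  &= \Phi(g+th,\pi+tw) + \Bigl(0,\tfrac12\bigl((g+th)\cdot J\bigr)^\sharp\Bigr)\\
  &= \Phi(g+th,\pi+tw) + \bigl(0,\tfrac12 J\bigr) + t\,\Bigl(0,\tfrac12 (h\cdot J)^\sharp\Bigr),\\
\frac{d}{dt}\Big|_{t=0}\overline{\Phi}_{(g,\pi)}(g+th,\pi+tw)
  &= D\Phi|_{(g,\pi)}(h,w) + \Bigl(0,\tfrac12 (h\cdot J)^\sharp\Bigr).
\end{align*}
```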
Theorem 9.16. Let $(M, g, \pi)$ be a smooth asymptotically flat initial data
set. The linearized modified constraint map
\[
D\overline{\Phi}_{(g,\pi)} : W^{2,p}_{-q} \times W^{1,p}_{-q-1} \longrightarrow L^p_{-q-2}
\]
is surjective.
The proof is the same as the proof of Theorem 9.9 with the only difference
being that one must check that the additional term $\tfrac12 (h \cdot J)^\sharp$ does not affect
the argument.
We can now explain how to perturb to the strict dominant energy con-
dition.
Lemma 9.17. Let $(M, g, \pi)$ be a complete asymptotically flat initial data
set satisfying the DEC. Then for all $\epsilon > 0$, there exists initial data $(\tilde g, \tilde\pi)$
that is $\epsilon$-close to $(g, \pi)$ in $W^{2,p}_{-q} \times W^{1,p}_{-q-1}$, such that its constraints $(\tilde\mu, \tilde J)$ are
$\epsilon$-close to $(\mu, J)$ in $L^1$ and satisfy the strict dominant energy condition
\[
\tilde\mu > (1 + \gamma)|\tilde J|_{\tilde g}
\]
for some $\gamma > 0$.
Proof. We employ some of the same reasoning as in the proof of Theorem 9.10,
except that we do not have to employ a cutoff (since we are not
yet dealing with trying to obtain harmonic asymptotics), and we replace the
constraint operator by the modified constraint operator. Choose a positive
function $f$ on $M$ that decays exponentially at infinity. Let $(M, g, \pi)$ be a
complete asymptotically flat initial data set satisfying $\mu \ge |J|_g$. For small
$t$, we attempt to solve the equation
\[
\overline{\Phi}_{(g,\pi)}(\tilde g, \tilde\pi) = \overline{\Phi}_{(g,\pi)}(g, \pi) + \bigl(t(f + |J|_g),\, 0\bigr).
\tag{9.4}
\]
If we can solve this, then we will have the desired inequality
\[
\tilde\mu = \mu + t(f + |J|_g) > \mu + t|J|_g \ge (1 + t)|J|_g \ge (1 + t)|\tilde J|_{\tilde g},
\]
where we used Lemma 9.15 for the last inequality.
So we focus on solving equation (9.4). To do this, consider a modified
version of our $T$ operator from Definition 9.6: for any pair
$(u, Y) \in W^{2,p}_{-q}(1) \times W^{2,p}_{-q}(TM)$, we define
\[
\tilde g := u^s g, \qquad
\tilde\pi := u^{s/2}(\pi + L_g Y),
\]
and the operator $\overline{T}$ by
\[
\overline{T}(u, Y) := \overline{\Phi}_{(g,\pi)}(\tilde g, \tilde\pi) = \overline{\Phi}_{(g,\pi)}\bigl(u^s g,\; u^{s/2}(\pi + L_g Y)\bigr).
\]
So we would like to solve
\[
\overline{T}(u, Y) = \overline{\Phi}_{(g,\pi)}(g, \pi) + \bigl(t(f + |J|_g),\, 0\bigr),
\]
but just as in the proof of Theorem 9.10, we cannot do this directly. Note
that the linearization of $\overline{T}$ is just
\[
D\overline{T}|_{(1,0)}(v, Z) = DT|_{(1,0)}(v, Z) + (0, svJ).
\]
So we see that $D\overline{T}|_{(1,0)}$ also has closed range (Corollary A.42), and by
surjectivity of $D\overline{\Phi}_{(g,\pi)}$ (Theorem 9.16), we can find a finite-dimensional subspace
$K_2 \subset W^{2,p}_{-q} \times W^{1,p}_{-q-1}$ of compactly supported smooth first-order initial data
deformations $(h, w)$ such that $D\overline{\Phi}_{(g,\pi)}(K_2)$ is a complementing subspace for
the image of $D\overline{T}|_{(1,0)}$. Let $K_1$ be a complementing subspace for the kernel
of $D\overline{T}|_{(1,0)}$ in $W^{2,p}_{-q} \times W^{1,p}_{-q-1}$, and now define
\[
\hat T : [(1, 0) + K_1] \times K_2 \longrightarrow L^p_{-q-2}
\]
by the formula
\[
\hat T(u, Y, h, w) = \overline{\Phi}_{(g,\pi)}\bigl(u^s g + h,\; u^{s/2}(\pi + L_g Y) + w\bigr),
\]
so that $D\hat T|_{(1,0,0,0)}$ is an isomorphism. By the inverse function theorem, $\hat T$
has a local inverse that maps a small $r$-ball around $\hat T(1, 0, 0, 0) = \overline{\Phi}_{(g,\pi)}(g, \pi)$
in $L^p_{-q-2}$ into a $Cr$-ball around $(1, 0, 0, 0)$ in $W^{2,p}_{-q}$. For small enough $t$,
$\overline{\Phi}_{(g,\pi)}(g, \pi) + (t(f + |J|_g), 0)$ lies in that $r$-ball, and therefore we obtain our
desired solution $(u, Y, h, w)$ such that
\[
\hat T(u, Y, h, w) = \overline{\Phi}_{(g,\pi)}(g, \pi) + \bigl(t(f + |J|_g),\, 0\bigr).
\]
Setting $(\tilde g, \tilde\pi) = (u^s g + h,\; u^{s/2}(\pi + L_g Y) + w)$, we obtain our desired $(\tilde g, \tilde\pi)$
solving equation (9.4). As usual, for small enough $t$, we can show that $u > 0$,
and elliptic regularity arguments show that $(u, Y)$ is smooth.
Finally, we check that for small enough $t$, $(\tilde\mu, \tilde J)$ is close to $(\mu, J)$ in $L^1$.
First, $\tilde\mu - \mu = t(f + |J|_g)$, which clearly approaches 0 in $L^1$ as $t \to 0$.
Second, $\tilde J - J = -\tfrac12 ((\tilde g - g) \cdot J)^\sharp$, which can also be taken to be small in $L^1$
since $\tilde g - g$ is small in $W^{2,p}_{-q} \subset C^0$.
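The $L^1$ smallness of $\tilde J - J$ can be sketched as a one-line estimate. The constant $c$ below is a hypothetical name for the constant of the embedding $W^{2,p}_{-q} \subset C^0$ (valid since $p > n$), and the sketch assumes $J \in L^1$, which is implicit in asking for $L^1$-closeness of the constraints.

```latex
% Assumes J \in L^1; c is a hypothetical embedding constant for W^{2,p}_{-q} \subset C^0.
\begin{align*}
|\tilde J - J|_g
  &= \tfrac12\,\bigl|((\tilde g - g)\cdot J)^\sharp\bigr|_g
   \le \tfrac12\,|\tilde g - g|_g\,|J|_g ,\\
\|\tilde J - J\|_{L^1}
  &\le \tfrac12\,\|\tilde g - g\|_{C^0}\,\|J\|_{L^1}
   \le \tfrac{c}{2}\,\|\tilde g - g\|_{W^{2,p}_{-q}}\,\|J\|_{L^1}.
\end{align*}
```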
Proof of Theorem 9.13. By Lemma 9.17, we may assume without loss
of generality that $(g, \pi)$ satisfies the strict dominant energy condition
$\mu > (1 + \gamma)|J|_g$ for some $\gamma > 0$.
We first work on the first version of the conclusion of Theorem 9.13.
That is, we want to find a small perturbation of $(g, \pi)$ that has harmonic
asymptotics, quickly decaying constraints, and the strict DEC. Choose any
smooth positive function $f$ on $\mathbb{R}^n \setminus B_1$ such that $f \le 1$ everywhere, $f(x) = 1$
for $|x| \le 2$, and $f$ decays at infinity. (This explains what we mean by
decaying “as fast as we like.”) Now define $f_k(x) := f(x/k)$. This $f_k$ can be
extended to all of $M$ by defining $f_k = 1$ in the compact part away from the
region $|x| > k$.
We solve for initial data $(g_k, \pi_k)$ such that $\Phi(g_k, \pi_k) = f_k \Phi(g, \pi)$. Since
the right-hand side converges to $\Phi(g, \pi)$ in $L^p_{-q-2}$, we can apply
Proposition 9.11 and Lemma 9.8 to obtain our desired solutions $(g_k, \pi_k)$ converging
to $(g, \pi)$ in $W^{2,p}_{-q} \times W^{1,p}_{-q-1}$ such that $(g_k, \pi_k)$ has harmonic asymptotics.2
Moreover, the constraints of $(g_k, \pi_k)$ decay like $f_k$ and converge to
the original constraints in $L^1$. The only thing left to check is that their
constraints $(\mu_k, J_k)$ satisfy the strict DEC:
\[
\begin{aligned}
\mu_k &= f_k \mu\\
 &> f_k (1 + \gamma)|J|_g\\
 &= f_k \bigl(|J|_g + \gamma|J|_g\bigr)\\
 &\ge f_k \bigl(|J|_{g_k} - |g_k - g|_g^{1/2} |J|_g + \gamma|J|_g\bigr)\\
 &= |J_k|_{g_k} + f_k |J|_g \bigl(-|g_k - g|_g^{1/2} + \gamma\bigr).
\end{aligned}
\]
By Sobolev embedding, we know that $g_k \to g$ in $C^0$, so for large enough $k$,
the second term is nonnegative.
For the second version of the conclusion of Theorem 9.13, we want
$(g_k, \pi_k)$ to be vacuum outside a compact set. The proof is exactly the same
as above, except that we use a cutoff function $\chi$ that vanishes for $|x| > 2$ in
place of $f_k$. The only difference is that we now lose the strict inequality in
the computation above.
2 To be more precise, we need $f_k \Phi(g, \pi) \in C^{0,\alpha}_{-n-1-\delta}$ for some $\alpha \in (0, 1)$ and $\delta > 0$ in order
to apply Lemma 9.8. Since $f$ is allowed any decay we want, the decay rate $-n - 1 - \delta$ is not a
problem. However, the Hölder decay is an issue since we have no assumed Hölder decay of $\Phi(g, \pi)$.
However, this can be fixed by slightly perturbing the prescribed constraint $f_k \Phi(g, \pi)$ to something
that does have Hölder decay but still preserves $L^p_{-q-2}$ and $L^1$ convergence to $\Phi(g, \pi)$.
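The comparison between $|J|_{g_k}$ and $|J|_g$ used in the strict DEC computation in the proof of Theorem 9.13 can be sketched as follows. It uses only the pointwise bound $|(g_k - g)_{ij}J^iJ^j| \le |g_k - g|_g\,|J|_g^2$ and the elementary inequality $\sqrt{1 + a} \le 1 + \sqrt{a}$ for $a \ge 0$.

```latex
\begin{align*}
|J|_{g_k}^2 &= g_{ij}J^iJ^j + (g_k - g)_{ij}J^iJ^j
   \le \bigl(1 + |g_k - g|_g\bigr)\,|J|_g^2,\\
|J|_{g_k} &\le \sqrt{1 + |g_k - g|_g}\;|J|_g
   \le \bigl(1 + |g_k - g|_g^{1/2}\bigr)\,|J|_g,\\
|J|_g &\ge |J|_{g_k} - |g_k - g|_g^{1/2}\,|J|_g .
\end{align*}
```

Multiplying the last line by $f_k$ and using $J_k = f_k J$ gives the step that introduces $|J_k|_{g_k}$.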
Appendix A
Some facts about second-order linear elliptic operators
In this Appendix, we assume basic familiarity with Sobolev spaces and

Hölder spaces, including the Hölder inequality, interpolation inequalities,
the Sobolev inequality, the Poincaré inequality, the Rellich-Kondrachov com-
pactness theorem, and other topics, as in [Eva10, Chapter 5]. It is also help-
ful to be familiar with basic elliptic energy estimates and their applications,
as in [Eva10, Chapter 6].
A.1. Basics
Assumption A.1. Let $(M, g)$ be a smooth Riemannian manifold (which
we assume is connected). Let $V \in C^\infty(TM)$, and let $q \in C^\infty(M)$. In this
section we will consider second-order elliptic linear operators of the form
\[
Lu := -\Delta_g u + \langle V, \nabla u\rangle + qu
\]
for any $u \in C^2(M)$.
Although many of the results in this section hold true for more general
operators (in particular, those with lower regularity of V , q, and even g),
this level of generality will be sufficient for most of our purposes. While this
keeps most of our hypotheses as simple as possible, the cost is that our results
are almost never stated with optimal regularity assumptions. Those optimal
regularity assumptions are actually quite important, not just for studying
low regularity phenomena, but also for studying nonlinear problems. The
reason why we get away with such a simplistic approach here is that we are
focusing on the case of smooth metrics, and this book generally stays away
from the intricacies of nonperturbative nonlinear problems.
Note that we have chosen to define this L with −Δg instead of Δg . This
is convenient for us because −Δg has a nonnegative spectrum, and because
the conformal Laplacian and the linearizations of mean curvature H and the
null expansion θ+ all satisfy Assumption A.1 with this sign (up to a positive
constant).
The following standard fact, sometimes called the Hopf maximum prin-
ciple, can be found in [GT01, Lemma 3.4, Theorem 3.5], for example.
Theorem A.2 (Strong maximum principle). Let (M, g) be a (connected)

Riemannian manifold, possibly with boundary, and consider $L$ as in
Assumption A.1, with the additional assumption that $q \ge 0$. Let
$u \in C^2(\operatorname{Int} M) \cap C^0(M)$, and assume that $Lu \le 0$ in $\operatorname{Int} M$. If $u$ attains a
nonnegative maximum value in $\operatorname{Int} M$, or if $u$ attains a nonnegative maximum at a
point in $\partial M$ where $\frac{\partial u}{\partial\nu} = 0$ and $\nu$ denotes the outward unit normal, then $u$ must be
constant on all of $M$.
If we replace the assumption Lu ≤ 0 by Lu ≥ 0, then we obtain the same
conclusions with “nonnegative maximum” replaced by “nonpositive mini-
mum.”
Although we assume familiarity with Sobolev spaces and Hölder spaces,

the use of these spaces on manifolds may be a bit less familiar to the reader.
However, extending their use to Riemannian manifolds is fairly straightfor-
ward. For example, the metric allows one to define Lp spaces on a manifold,
to replace regular derivatives by covariant ones with respect to a metric, and
to use the metric to determine the pointwise magnitude of tensors. This al-
lows us to define Sobolev norms and Sobolev spaces. (See Section A.2 for
details.)
We use the notation W k,p (M ) to denote the space of Lp functions on M
whose derivatives up to order k are all in Lp as well. Observe that if we have
a coordinate chart over which the metric gij is uniformly equivalent to δij ,
then the Sobolev norms defined by g (over the chart) will be equivalent to the
ones in the coordinate chart. In fact, on a compact manifold, any two metrics
yield Sobolev norms that are equivalent, so although these norms depend
on the metric, the Sobolev spaces themselves (and their topologies) do not
depend on the choice of metric. Indeed, if we want to work with varying
metrics, it is sometimes useful to define the Sobolev norms using a fixed
background metric. See the book [Heb96] for a more thorough discussion
of Sobolev spaces on Riemannian manifolds and an explanation for why
many properties (such as the Sobolev inequality, the Poincaré inequality,
etc.) carry over to the manifold setting. Similarly, one can define Hölder
spaces on Riemannian manifolds, and similar remarks apply.
A.1.1. Elliptic estimates. Essentially all of the good properties of elliptic

operators rest upon elliptic estimates, which are the most nontrivial part of
the linear theory. We first present the elliptic Lp estimate, often referred to
as a Calderón-Zygmund estimate [CZ52]. The proof can essentially be found
in [GT01, Theorems 9.15, 9.19]. See [Wan03] for an alternative proof for
the essential case of the Euclidean Laplacian. Technically, one must make
some additional arguments to apply those results to the manifold case, but
this is straightforward because the version for bounded regions of Rn already
requires proving the estimate on small balls and using a patching argument.
Theorem A.3 (Global elliptic $L^p$ estimate). Let $(M, g)$ be a smooth
compact Riemannian manifold, possibly with boundary, and consider $L$ as in
Assumption A.1. For every nonnegative integer $k$ and every $p > 1$, there
exists a constant $C$ such that for any $u \in W_0^{k+2,p}(M)$,
\[
\|u\|_{W^{k+2,p}(M)} \le C\bigl(\|Lu\|_{W^{k,p}(M)} + \|u\|_{L^p(M)}\bigr).
\]
Here we use $W_0^{k+2,p}(M)$ to denote the elements of $W^{k+2,p}(M)$ that vanish
at the boundary, in the trace sense.1
Closely related to the elliptic estimate is the concept of elliptic regularity.

Theorem A.4 (Interior elliptic regularity). Let $(M, g)$ be a Riemannian
manifold, and consider $L$ as in Assumption A.1. Given $u \in W^{2,p}_{\mathrm{loc}}(\operatorname{Int} M)$
with $p > 1$, if $Lu$ is smooth, then so is $u$.
In fact, one does not really need to start with W 2,p regularity of u.
Even if $u$ is only a distribution (as described in [Wik, Distribution
(mathematics)]) and $Lu$ is only defined in the weak sense, elliptic
regularity still holds. We say that $Lu = f$ in the weak sense if $\langle u, L^*\varphi\rangle = \langle f, \varphi\rangle$
for all test functions $\varphi$, where $L^*$ denotes the adjoint operator and the
angle brackets represent the pairing between distributions and test functions.
See [Fol95], for example, for an introduction to distributions and also to see
how to prove the following regularity theorem.
Theorem A.5 (Interior elliptic regularity for weak solutions). Let (M, g)
be a smooth Riemannian manifold, and consider L as in Assumption A.1.
Given $u$ in the dual space of $C_c^k(\operatorname{Int} M)$ for some $k \ge 2$, if $Lu$ is smooth,
then u is smooth on Int M .
1 Many authors use the notation W0k+2,p (M ) to denote the completion of Cc∞ (Int M ) in the
W k+2,p norm.
In the above, we say that a distribution is “smooth” if its action on

Cck (Int M ) can be represented by integration against a smooth function.
There is also an elliptic Hölder estimate similar to the one in Theo-
rem A.3, often called a Schauder estimate [Sch34]. See [GT01, Theo-
rem 6.6] for the classical proof using potential theory, or [Sim97] for an
alternative proof using scaling. Once again, some straightforward adjust-
ments must be made to transfer to the manifold setting.
Theorem A.6 (Global elliptic Hölder estimate). Let $(M, g)$ be a smooth
compact Riemannian manifold, possibly with boundary, and consider $L$ as
in Assumption A.1. For every nonnegative integer $k$ and every $\alpha \in (0, 1)$,
there exists a constant $C$ such that for any $u \in C_0^{k+2,\alpha}(M)$,
\[
\|u\|_{C^{k+2,\alpha}(M)} \le C\bigl(\|Lu\|_{C^{k,\alpha}(M)} + \|u\|_{C^0(M)}\bigr).
\]
Here we use $C_0^{k+2,\alpha}(M)$ to denote the elements of $C^{k+2,\alpha}(M)$ that vanish at
the boundary.
A.1.2. Laplacian on compact Riemannian manifolds. We present

some basic facts about the g-Laplacian.
Theorem A.7. Let $(M, g)$ be a smooth compact Riemannian manifold with
nontrivial boundary. Then for each nonnegative integer $k$ and every $p > 1$,
\[
\Delta_g : W_0^{k+2,p}(M) \longrightarrow W^{k,p}(M)
\]
is an isomorphism, where $W_0^{k+2,p}(M)$ denotes the elements of $W^{k+2,p}$ that
vanish at the boundary.
Moreover, for any $\alpha \in (0, 1)$,
\[
\Delta_g : C_0^{k+2,\alpha}(M) \longrightarrow C^{k,\alpha}(M)
\]
is also an isomorphism, where $C_0^{k+2,\alpha}(M)$ denotes the elements of
$C^{k+2,\alpha}(M)$ that vanish at the boundary.
Proof. For this proof we will write $\Delta$ for $\Delta_g$ for notational simplicity. We
claim that the operator $\Delta : W_0^{k+2,p} \longrightarrow W^{k,p}$ carries an injectivity estimate.
That is, there exists $C$ such that for all $u \in W_0^{k+2,p}$,
\[
\|u\|_{W^{k+2,p}} \le C\|\Delta u\|_{W^{k,p}}.
\tag{A.1}
\]
We also claim a similar estimate for $\Delta : C_0^{k+2,\alpha} \longrightarrow C^{k,\alpha}$. Of course, this is
a stronger statement than injectivity.
We will start by proving the simpler inequality
\[
\|u\|_{L^p} \le C\|\Delta u\|_{L^p}
\]
for all $u \in W_0^{2,p}$. Suppose there does not exist such a constant. Then
we can find a sequence $u_i$ with $\|u_i\|_{L^p} = 1$ and $\Delta u_i \to 0$ in $L^p$. By the
elliptic $L^p$ estimate (Theorem A.3), it follows that $u_i$ is uniformly bounded
in $W^{2,p}$. By the Rellich-Kondrachov compactness theorem [Wik,
Rellich-Kondrachov theorem], there exists a subsequence of $u_i$ converging to some
function $u$ weakly in $W^{2,p}$ and strongly in $L^p$. Then for any compactly
supported smooth test function $\psi$,
\[
\langle\Delta u, \psi\rangle = \langle u, \Delta\psi\rangle = \lim_{i\to\infty} \langle u_i, \Delta\psi\rangle = \lim_{i\to\infty} \langle\Delta u_i, \psi\rangle = 0,
\]
and therefore $\Delta u = 0$. By elliptic regularity (Theorem A.4), $u$ must be
smooth. Since it vanishes at the boundary, the maximum principle
(Theorem A.2) implies that $u$ is identically zero. But that is a contradiction since
we must have $\|u\|_{L^p} = 1$. Now that we have established $\|u\|_{L^p} \le C\|\Delta u\|_{L^p}$,
the corresponding injectivity result of the form $\|u\|_{W^{2,p}} \le C\|\Delta u\|_{L^p}$ follows
from the elliptic $L^p$ estimate (with a different constant $C$, of course).
Using the same argument, we obtain the desired injectivity estimate
(A.1) for the domains $W_0^{k+2,p}$ with $k > 0$. An injectivity estimate for $C_0^{k+2,\alpha}$
follows similarly: we just have to replace our use of Rellich-Kondrachov with
Arzela-Ascoli and replace the elliptic $L^p$ estimate with the elliptic Hölder
estimate (Theorem A.6).
Now we turn to surjectivity. We need to solve the Poisson equation
$\Delta u = f$. First suppose that $f$ is $C^\infty$. We find $u$ by minimizing the functional
\[
A(v) := \int_M \Bigl(\tfrac12 |\nabla v|^2 + f v\Bigr)\,d\mu_g
\]
over all $v \in W_0^{1,2}$. Together with the Poincaré inequality, the Lax-Milgram
Theorem [Wik, Weak formulation] can be used to show that a minimizer
$u \in W_0^{1,2}$ exists, and then it is straightforward to see that $u$ must solve
$\Delta u = f$ weakly.2 (See [Eva10, Section 6.2], for example, for details of this
argument.) By elliptic regularity, $u \in C_0^\infty$.
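To see why a minimizer of $A$ solves the Poisson equation, one can take the first variation. This is a sketch; the test function $\phi$ is notation introduced here, and the second line is only formal until $u$ is known to be regular enough to integrate by parts.

```latex
\begin{align*}
0 = \frac{d}{dt}\Big|_{t=0} A(u + t\phi)
  &= \int_M \bigl(\langle\nabla u, \nabla\phi\rangle + f\phi\bigr)\,d\mu_g
  \qquad\text{for all } \phi \in W_0^{1,2},\\
  &= \int_M \bigl(-\Delta u + f\bigr)\,\phi\,d\mu_g
  \qquad\text{(integrating by parts, for sufficiently regular } u\text{)}.
\end{align*}
```

The first line is the weak form of $\Delta u = f$, and the second shows that it agrees with the classical equation once $u$ is smooth.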
Now consider $\Delta : W_0^{k+2,p} \longrightarrow W^{k,p}$ and attempt to prove surjectivity.
Let $f \in W^{k,p}$. Then there exists a sequence $f_i$ in $C^\infty$ such that $f_i \to f$
in $W^{k,p}$. As observed above, we can find $u_i \in C_0^\infty$ solving $\Delta u_i = f_i$. By
the injectivity estimate described above, the fact that $f_i$ is Cauchy in $W^{k,p}$
implies that $u_i$ is Cauchy in $W^{k+2,p}$, and then its limit $u \in W_0^{k+2,p}$ must
solve $\Delta u = f$. The surjectivity proof for $\Delta : C_0^{k+2,\alpha} \longrightarrow C^{k,\alpha}$ proceeds in
the same way.
2 This is not quite as “weak” as our previously stated definition of a weak solution, so elliptic
regularity just follows from energy methods rather than the more sophisticated result, Theo-
rem A.5.
Theorem A.8. Let $(M, g)$ be a compact Riemannian manifold without
boundary. Then for each nonnegative integer $k$ and every $p > 1$, the
operator
\[
\Delta_g : W^{k+2,p}(M) \longrightarrow W^{k,p}(M)
\]
has a one-dimensional kernel spanned by constants, and $f$ lies in its range
if and only if $\int_M f\,d\mu = 0$. Moreover, for any $\alpha \in (0, 1)$, the same is true
for
\[
\Delta_g : C^{k+2,\alpha}(M) \longrightarrow C^{k,\alpha}(M).
\]
Proof. The proof is similar to the previous one, but we have to make some
small adjustments. First observe that the condition $\int_M v\,d\mu = 0$ on $v$ is a
closed condition in all of the spaces $W^{k,p}$ and $C^{k,\alpha}$. This fact will be used
implicitly in the following.
Obviously, the constants lie in the kernel of $\Delta$. Instead of the injectivity
estimate (A.1), we prove that there exists a constant $C$ such that for all
$u \in W^{k+2,p}$,
\[
\text{if } \int_M u\,d\mu = 0, \text{ then } \|u\|_{W^{k+2,p}} \le C\|\Delta u\|_{W^{k,p}}.
\tag{A.2}
\]
We can also prove a similar statement for $C^{k+2,\alpha}$. Note that (A.2) is a
stronger statement than the claim that the kernel is spanned by constants.
The proof of (A.2) is the same as the proof of (A.1), except that now when
we reach the point where $\Delta u = 0$, we can only conclude from the maximum
principle that $u$ is constant, not that it is zero. But this will still imply that
$u = 0$ as long as the $u_i$'s were chosen so that $\int_M u_i\,d\mu = 0$.
Now consider the range of $\Delta$. By the divergence theorem, it is clear that
the range of $\Delta$ is orthogonal to the constants. We just need to solve $\Delta u = f$
for any $f$ orthogonal to the constants. As in the proof of Theorem A.7, we
start with $f \in C^\infty$ with $\int_M f\,d\mu = 0$, and attempt to solve $\Delta u = f$. The
proof is similar to the surjectivity proof above, except that we minimize the
functional $A(v)$ over all $v \in W^{1,2}$ such that $\int_M v\,d\mu = 0$. Using the Poincaré
inequality (for functions with average equal to 0), we can show that
Lax-Milgram applies once again, and therefore we can find the desired minimizer
$u \in W^{1,2}$. However, this minimizer only has the property that $\Delta u - f$ is a
constant (not necessarily zero). But since the average value of $f$ is zero by
hypothesis, and $\int_M \Delta u\,d\mu = 0$ by the divergence theorem, that constant must
be zero, and hence $\Delta u = f$. The rest of the proof is identical.
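The vanishing of the constant can be checked by integrating over $M$. In this sketch, $c$ is a name introduced here for the constant $\Delta u - f$, and $\operatorname{vol}(M)$ denotes the volume of the closed manifold $M$.

```latex
% c is a hypothetical name for the constant \Delta u - f.
\begin{align*}
c\,\operatorname{vol}(M) = \int_M (\Delta u - f)\,d\mu
  = \int_M \Delta u\,d\mu - \int_M f\,d\mu = 0 - 0 = 0 .
\end{align*}
% The first integral vanishes by the divergence theorem (M has no boundary),
% the second by the hypothesis on f.
```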
Recall that a linear operator $L : X \longrightarrow Y$ between Banach spaces is
said to be Fredholm if it has closed range and finite-dimensional kernel and
cokernel. Recall that the cokernel is $Y/(\operatorname{im} L)$, whose dual space can be
identified with the annihilator $(\operatorname{im} L)^\perp \subset Y^*$. The index of a Fredholm
operator is defined to be the dimension of the kernel minus the dimension
of the cokernel (the latter of which is the codimension of the image).
Corollary A.9. Let $(M, g)$ be a compact Riemannian manifold, possibly
with boundary, and consider $L$ as in Assumption A.1. The operators
\[
L : W_0^{k+2,p}(M) \longrightarrow W^{k,p}(M)
\]
and
\[
L : C_0^{k+2,\alpha}(M) \longrightarrow C^{k,\alpha}(M)
\]
are Fredholm operators with index zero.
Proof. In either case, consider the path of operators $L_t := (t - 1)\Delta_g + tL$
from $L_0 = -\Delta_g$ to $L_1 = L$. It is easy to see that this path is continuous
in the strong operator topology, and then we invoke the fact that the
Fredholm property and its associated index are both preserved under continuous
deformation [Wik, Fredholm operator].
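The continuity of the path can be made concrete with a short computation. This is a sketch for the Sobolev case; the constant $C_{V,q}$ is a hypothetical name for a bound depending on finitely many derivatives of $V$ and $q$ on the compact manifold $M$.

```latex
% With L u = -\Delta_g u + \langle V,\nabla u\rangle + q u as in Assumption A.1:
\begin{align*}
L_t u &= (t-1)\Delta_g u + tLu
   = -\Delta_g u + t\,\langle V, \nabla u\rangle + t\,q\,u,\\
(L_t - L_s)u &= (t-s)\bigl(\langle V, \nabla u\rangle + q\,u\bigr),\\
\|(L_t - L_s)u\|_{W^{k,p}} &\le |t-s|\,C_{V,q}\,\|u\|_{W^{k+1,p}}
   \le |t-s|\,C_{V,q}\,\|u\|_{W^{k+2,p}} .
\end{align*}
```

The first line also shows that each $L_t$ is itself of the form in Assumption A.1 (with $V$ and $q$ replaced by $tV$ and $tq$), and the last line shows that the path is Lipschitz in $t$ with respect to the operator norm.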
A.1.3. Eigenfunctions. Here we provide proofs of some standard facts

about eigenfunctions.
Theorem A.10 (Principal eigenfunctions of elliptic operators). Let (M, g)
be a smooth Riemannian compact manifold, possibly with boundary, and
consider L as in Assumption A.1. Then there exists a simple (Dirichlet)
eigenvalue of L with a corresponding eigenfunction which is positive in the
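interior of $M$. To be more precise, there exist a $\varphi_1 \in C^\infty(M)$ and $\lambda_1 \in \mathbb{R}$
such that
(1) $\varphi_1 > 0$ in $\operatorname{Int} M$,
(2) $\varphi_1 = 0$ and $\frac{\partial\varphi_1}{\partial\nu} < 0$ at $\partial M$, where $\nu$ is the outward normal (if there
is a boundary),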
(3) Lϕ1 = λ1 ϕ1 , and
(4) the λ1 -eigenspace is one-dimensional.
We say that ϕ1 is a principal (Dirichlet) eigenfunction with principal
(Dirichlet) eigenvalue λ1 , which is sometimes denoted λ1 (L) for clarity.
It is also true that if $\lambda \ne \lambda_1$ is a (possibly complex) eigenvalue of $L$,
then $\lambda_1$ is strictly less than (the real part of) $\lambda$, but we will not need this
fact. See [Eva10, Section 6.5] for a proof of this fact (as well as a proof of
Theorem A.10). When V = 0, the operator is self-adjoint, and the proof
in this case is significantly easier. Although we will give the general proof,
we will first prove the self-adjoint case separately since it involves some
important ideas.
Proof of V = 0 case. The principal eigenfunction can be found using Rayleigh
quotients. By the definition of L in Assumption A.1, and by integration
by parts, for each $u \in C_c^\infty(\operatorname{Int} M)$, we have
\[ \langle Lu, u\rangle_{L^2} = \int_M \left(|\nabla u|^2 + q u^2\right) d\mu. \]
We define the right side to be a quadratic form A(u, u), which is well-defined
for all functions $u \in W_0^{1,2}(M)$. Next we attempt to minimize the Rayleigh
quotient
\[ \lambda_1 := \inf_{u \neq 0} \frac{A(u,u)}{\|u\|_{L^2}^2} \]
over nonzero functions in $W_0^{1,2}(M)$, which is the same as minimizing A(u, u)
over $u \in W_0^{1,2}(M)$ with the added restriction that $\|u\|_{L^2} = 1$. Note that
λ1 ≥ inf q > −∞. To find a minimizer, we choose a minimizing sequence for
the constrained problem. So we have a sequence $u_i \in W_0^{1,2}(M)$ such that
$\|u_i\|_{L^2} = 1$ and $A(u_i, u_i) \to \lambda_1$. One can see that this sequence is bounded
in $W^{1,2}$ and hence, by the Rellich-Kondrachov compactness theorem [Wik,
Rellich-Kondrachov theorem], we can extract a minimizing subsequence $u_i$
that converges in $L^2$. Then
\begin{align*}
A(u_i - u_j, u_i - u_j) &= 2A(u_i, u_i) + 2A(u_j, u_j) - A(u_i + u_j, u_i + u_j) \\
&\le 2A(u_i, u_i) + 2A(u_j, u_j) - \lambda_1 \|u_i + u_j\|_{L^2}^2.
\end{align*}
In the limit, for large i and j, the right side must approach zero, since
$\|u_i + u_j\|_{L^2}^2 \to 4$. We can
then use this to show that $\|u_i - u_j\|_{W^{1,2}}$ also approaches zero in the limit,
and thus $u_i$ converges to some ϕ1 in $W_0^{1,2}$. Clearly, this ϕ1 minimizes the
Rayleigh quotient, and we claim that this is the desired eigenfunction in the
statement of the theorem.
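By the minimizing property, we know that for any $u \in W_0^{1,2}$, we have
\begin{align*}
0 &= \frac{d}{dt}\bigg|_{t=0} \frac{A(\varphi_1 + tu, \varphi_1 + tu)}{\|\varphi_1 + tu\|_{L^2}^2} \\
&= 2A(\varphi_1, u) - 2\lambda_1 \langle \varphi_1, u\rangle.
\end{align*}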
Or in other words, we say that (L − λ1)ϕ1 = 0 in the weak sense. By elliptic
regularity (this only requires standard elliptic energy estimates as in
[Eva10, Section 6.3]), ϕ1 is a smooth solution of Lϕ1 = λ1ϕ1 which vanishes at ∂M.
Finally, we have to show that ϕ1 can be chosen to be positive in the
interior, and that the λ1-eigenspace is one-dimensional. To see this, let u be
any eigenfunction with eigenvalue λ1, normalized so that $\|u\|_{L^2} = 1$, and we
will show that u must have a sign in the interior (either strictly positive
there or strictly negative there). Observe that
\[ \lambda_1 = A(u, u) = A(u_+, u_+) + A(u_-, u_-) \ge \lambda_1 \|u_+\|_{L^2}^2 + \lambda_1 \|u_-\|_{L^2}^2 = \lambda_1, \]
where u± are the positive and negative parts of u, and the inequality follows
from the minimizing property of λ1 . Since the inequality is actually an
equality, u± are also eigenfunctions of L with eigenvalue λ1 . For large enough
C (specifically, C > sup |q− |), the maximum principle can be applied to
the operator L + C, and we can also choose C large enough so that λ1 +
C ≥ 0. Thus (L + C)u± = (λ1 + C)u± ≥ 0. Since u+ ≥ 0, the strong
maximum principle implies that either u+ > 0 in Int M and ∂u+/∂ν < 0 at
∂M , or else u+ = 0. The exact same statement is true for u− , and obviously
u+ and u− cannot be simultaneously positive, so one of them must vanish
identically, showing that u indeed has a sign in the interior (and the desired
normal derivative at the boundary). Since two positive functions can never
be orthogonal, it follows that the eigenspace is one-dimensional.
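The Rayleigh quotient characterization can be checked numerically in the simplest setting. The following is only a sketch under stated assumptions: M is taken to be the interval [0, 1], the operator is L u = −u'' + q u with q ≥ 0 and Dirichlet boundary values, and L is replaced by its standard second-difference discretization. Instead of a minimizing sequence, the sketch uses inverse iteration (a swapped-in technique), and then evaluates the discrete Rayleigh quotient. The names `solve_tridiagonal`, `apply_operator`, and `principal_eigenpair` are hypothetical helpers, not notation from the text.

```python
import math

def solve_tridiagonal(off, diag, rhs):
    """Thomas algorithm for a tridiagonal system whose sub- and
    super-diagonal entries all equal `off` (assumed diagonally dominant)."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = off / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - off * c[i - 1]
        c[i] = off / denom
        d[i] = (rhs[i] - off * d[i - 1]) / denom
    x = [0.0] * n
    x[n - 1] = d[n - 1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def apply_operator(u, diag, off):
    """Apply the tridiagonal matrix, with zero Dirichlet values outside."""
    n = len(u)
    out = []
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        out.append(off * left + diag[i] * u[i] + off * right)
    return out

def principal_eigenpair(q, n_interior=99, iterations=200):
    """Approximate (lambda_1, grid, phi_1) for L u = -u'' + q u on (0, 1).
    Assumes q >= 0 so that the discrete operator is positive definite."""
    h = 1.0 / (n_interior + 1)
    xs = [(i + 1) * h for i in range(n_interior)]
    diag = [2.0 / h**2 + q(x) for x in xs]
    off = -1.0 / h**2
    u = [1.0] * n_interior
    for _ in range(iterations):
        u = solve_tridiagonal(off, diag, u)             # u <- L^{-1} u
        norm = math.sqrt(h * sum(v * v for v in u))     # discrete L^2 norm
        u = [v / norm for v in u]
    Lu = apply_operator(u, diag, off)
    lam = h * sum(a * b for a, b in zip(Lu, u))         # A(u, u) with ||u|| = 1
    return lam, xs, u

lam, xs, phi = principal_eigenpair(lambda x: 0.0)
```

With q = 0 the computed value should be close to π², the first Dirichlet eigenvalue of −u'' on (0, 1), and the computed eigenvector should have a single sign, in line with Theorem A.10.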
Before we turn to the general case, in which L need not be self-adjoint,

we present the following classical estimate of the principal eigenvalue.
Theorem A.11 (Barta’s estimate). Let (M, g) be a smooth Riemannian
compact manifold, possibly with boundary, and consider L as in Assump-
tion A.1. Let v be a smooth function such that v > 0 on Int M , while v = 0
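at ∂M. Then
\[ \inf_{\operatorname{Int} M} \frac{Lv}{v} \le \lambda_1(L) \le \sup_{\operatorname{Int} M} \frac{Lv}{v}, \]
and if either of the two inequalities is an equality, then v must be a principal
eigenfunction. In particular,
\[ \sup_{v>0} \inf_{\operatorname{Int} M} \frac{Lv}{v} = \lambda_1(L) = \inf_{v>0} \sup_{\operatorname{Int} M} \frac{Lv}{v}. \]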
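Proof of V = 0 case. Let v be a smooth function as in the hypotheses.
The upper bound follows directly from Rayleigh quotients (which were
discussed in the previous proof) and integration by parts: without loss of
generality, assume $\|v\|_{L^2} = 1$. Then
\begin{align*}
\lambda_1 \le A(v, v) &= \int_M \left(|\nabla v|^2 + q v^2\right) \\
&= \int_{\partial M} v \frac{\partial v}{\partial \nu} + \int_M \left(-v \Delta v + q v^2\right) \\
&= \int_M (Lv) v \\
&\le \left(\sup_{\operatorname{Int} M} \frac{Lv}{v}\right) \int_M v^2 = \sup_{\operatorname{Int} M} \frac{Lv}{v},
\end{align*}
as desired. Moreover, if we have equality $\lambda_1 = \sup_{\operatorname{Int} M} \frac{Lv}{v}$, then
$\frac{Lv}{v} = \sup_{\operatorname{Int} M} \frac{Lv}{v} = \lambda_1$ on all of Int M, so v is an eigenfunction with eigenvalue
λ1.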
The less obvious lower bound is a consequence of the upper bound.
Again, let v be a smooth function as in the hypotheses, and set $\gamma = \inf_M \frac{Lv}{v}$.
We want to show that γ ≤ λ1. Let us assume the reverse inequality γ ≥ λ1.
Let ϕ1 be a principal eigenfunction of L as in Theorem A.10. Then there
exists $\varepsilon > 0$ such that $\varphi_1 - \varepsilon v > 0$ on Int M and vanishing at ∂M, so that
we can apply the upper bound
\begin{align*}
\lambda_1 &\le \sup_{\operatorname{Int} M} \frac{L(\varphi_1 - \varepsilon v)}{\varphi_1 - \varepsilon v} \\
&\le \sup_{\operatorname{Int} M} \frac{\lambda_1 \varphi_1 - \varepsilon \gamma v}{\varphi_1 - \varepsilon v} \\
&\le \lambda_1.
\end{align*}
Therefore $\varphi_1 - \varepsilon v$ achieves equality in the upper bound for λ1, and we showed
that this is only possible if $\varphi_1 - \varepsilon v$ itself is a principal eigenfunction. Thus v
is a principal eigenfunction with Lv = λ1v. But Lv ≥ γv by definition of γ,
giving us a contradiction unless γ = λ1. Tracing the logic, we have shown
that γ ≤ λ1 with equality only if v is a principal eigenfunction.
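Barta's estimate has a direct discrete counterpart that is easy to test. The sketch below assumes the same model setting as a one-dimensional toy problem: the second-difference discretization of L u = −u'' + q u on (0, 1) with Dirichlet boundary values, for which the principal eigenvalue with q = 0 is known in closed form, namely (4/h²) sin²(πh/2) with eigenvector sin(πx). The helper names `apply_L` and `barta_bounds` are hypothetical.

```python
import math

def apply_L(v, q, h):
    """Discrete operator (L v)_i = (-v_{i-1} + 2 v_i - v_{i+1}) / h^2 + q_i v_i,
    with zero Dirichlet values outside the list."""
    n = len(v)
    out = []
    for i in range(n):
        left = v[i - 1] if i > 0 else 0.0
        right = v[i + 1] if i < n - 1 else 0.0
        out.append((-left + 2.0 * v[i] - right) / h**2 + q[i] * v[i])
    return out

def barta_bounds(v, q, h):
    """Return (min, max) of (L v)_i / v_i for a strictly positive vector v."""
    ratios = [lv / vi for lv, vi in zip(apply_L(v, q, h), v)]
    return min(ratios), max(ratios)

n_interior = 99
h = 1.0 / (n_interior + 1)
xs = [(i + 1) * h for i in range(n_interior)]
q = [0.0] * n_interior

# Exact principal eigenvalue of the discrete operator when q = 0.
lam1 = (4.0 / h**2) * math.sin(math.pi * h / 2.0) ** 2

# A positive test function that is not an eigenfunction: v = x (1 - x).
lo, hi = barta_bounds([x * (1.0 - x) for x in xs], q, h)

# The principal eigenvector itself: both bounds collapse to lam1.
lo_eig, hi_eig = barta_bounds([math.sin(math.pi * x) for x in xs], q, h)
```

For v = x(1 − x) one has Lv = 2, so the lower bound is 2 / max v = 8, which indeed sits below π²; for the eigenvector, the infimum and supremum coincide, matching the equality case of the theorem.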
Observe that in the self-adjoint case, Barta’s estimate gives an alter-

native characterization of the principal eigenvalue, in place of the charac-
terization using Rayleigh quotients. This suggests that perhaps there is an
alternative proof of Theorem A.10 using this characterization, and hopefully
this proof does not use self-adjointness. Indeed, it turns out that this is the
case.
The general, non-self-adjoint case of Theorem A.10 follows fairly easily
from the Krein-Rutman Theorem, whose proof contains the central idea.
Theorem A.12 (Krein-Rutman Theorem [KR50]). Let C be a closed convex
cone in a Banach space X such that C ∩ (−C) = {0}. If T : C −→ C
is a compact linear operator that maps $C \setminus \{0\}$ into the interior of C, then
there exists a positive eigenvalue of T with a one-dimensional eigenspace
spanned by a vector in the interior of C.
It is also true that this real eigenvalue in the conclusion of the theorem
is equal to the spectral radius of T , but we will not need this fact. The proof
of Theorem A.10 given in [Eva10, Section 6.5] efficiently rolls together the
proof of Krein-Rutman with the reduction to Krein-Rutman, but instead we
will separate the two arguments.
General proof of Theorem A.10. First, observe that for any constant
c, the operators L and L + c have the exact same eigenfunctions, and the
eigenvalues are simply translated by c, so we may as well assume without
loss of generality that q > 0. In this case, we claim that
\[ L : C_0^{2,\alpha} \longrightarrow C^{0,\alpha} \]
is an isomorphism. We prove injectivity first. Suppose that Lu = 0. Our
assumption that q > 0 allows us to invoke the strong maximum principle
(Theorem A.2) to see that u must be constant. But since q > 0, a nonzero
constant cannot be a solution, and thus L is injective. By Corollary A.9,
we know that L is Fredholm with index zero, and thus injectivity implies
surjectivity.
Since L is an isomorphism, we can compose its inverse $L^{-1}$ with the
compact embedding $C_0^{2,\alpha} \subset C^{0,\alpha}$ to obtain the solution operator
\[ G : C^{0,\alpha} \longrightarrow C^{0,\alpha}, \]
which is a compact operator. We define the cone $C = \{u \in C^{0,\alpha} \mid u \ge 0\}$.
Observe that the strong maximum principle (Theorem A.2) implies that if
u ≥ 0 but not identically zero, then Gu > 0 in Int M, vanishes at ∂M, and
has negative normal derivative at ∂M. This tells us that G maps $C \setminus \{0\}$
into its interior. So we can apply the Krein-Rutman Theorem to see that
G has a positive real eigenvalue with a one-dimensional eigenspace spanned
by an element in the interior of C. But of course, this translates to having a
positive real eigenvalue of L with a one-dimensional eigenspace spanned by
an eigenfunction that is positive in the interior of M , and this eigenfunction
is smooth by elliptic regularity.
We now present a fairly elementary proof of the Krein-Rutman Theorem,

which is due to P. Takáč [Tak94].
Proof of the Krein-Rutman Theorem. Though our proof will work in

the generality stated, let us think of X = C 0 (M ) for some compact set M ,
and think of C as the cone of nonnegative functions in X in order to slightly
lessen the cognitive load. (It also reduces our need to define new notation.)
Then our main assumption is that T is a compact linear operator that maps
nonnegative functions (except 0) into positive functions.
In light of the previous proof where T = G and the upper bound
characterization of λ1(L) for self-adjoint operators in Theorem A.11, we might
hope that there is an eigenvalue of T given by a corresponding lower bound
characterization $\lambda = \sup_{v>0} \inf_{\operatorname{Int} M} \frac{Tv}{v}$. We express this concept without
using division as follows. Define
\begin{align*}
A &:= \{\gamma \mid Tu \ge \gamma u \text{ for some } u \in C \setminus \{0\}\}, \\
\lambda &:= \sup A.
\end{align*}
Observe that for any $u \in C \setminus \{0\}$, since Tu is strictly positive everywhere,
we can find γ > 0 such that Tu ≥ γu, so A ≠ ∅ and λ > 0 is well-defined.
We intend to show that λ is an eigenvalue of T with a positive eigenfunction.
We now take a moment to lay out our basic roadmap. As one might
expect, if we can show that sup A is attained (that is, sup A ∈ A), then it is
not hard to show that the corresponding u will be the desired eigenfunction.
To show that sup A is attained, we can choose a maximizing sequence for
sup A and try to extract a convergent subsequence from their corresponding
u’s. Since T is compact, this turns out to be easy to do. The only challenge
is to make sure that the limit we obtain from this process is not zero. This
motivates the following claim.
First, define K to be the closure of the T image of all the unit vectors
in C. Since T is a compact operator, it follows that K is compact.
Claim. If γ > 0 and $u \in C \setminus \{0\}$ such that Tu ≥ γu, then there exists
v ∈ K such that Tv ≥ γv and also $\|v\| \ge \gamma$.
We will construct the desired v by iterating T and normalizing in order
to keep it in K. Define $v_n = T\left(\frac{T^n u}{\|T^n u\|}\right) \in K$. Since we want to
extract a limit with norm bigger than γ, choose a subsequence of $v_n$ realizing
$\limsup_{n\to\infty} \|v_n\|$. Since K is compact, we can find a subsequence of the
subsequence we already chose that converges to some limit v ∈ K. Since
Tu − γu ≥ 0, our assumption on T implies that $T^{n+1}(Tu - \gamma u) \ge 0$ also,
which implies that $T v_n \ge \gamma v_n$. Taking the limit, we obtain Tv ≥ γv. The
only thing left to check is that $\|v\| \ge \gamma$, which is the harder part.
Observe that
\[ \|v\| = \limsup_{n\to\infty} \|v_n\| = \limsup_{n\to\infty} \frac{\|T^{n+1} u\|}{\|T^n u\|}. \]
Let us call this quantity $\rho^{-1}$ (including the case ρ = ∞ if $\|v\| = 0$). Then it
follows (by the ratio test) that the formal power series
\[ F(z) = \sum_{n=0}^{\infty} z^n T^n u \]
converges absolutely in X for all |z| < ρ. So for any z ∈ (0, ρ),
\begin{align*}
F(z) &= u + \sum_{n=0}^{\infty} z^{n+1} T^n (Tu) \\
&\ge u + \sum_{n=0}^{\infty} z^{n+1} T^n (\gamma u) \\
&= u + z\gamma F(z),
\end{align*}
where the inequality follows from the fact that $T^n(Tu - \gamma u) \ge 0$ for n ≥ 0.
So we have (1 − zγ)F(z) ≥ u. Since F(z) > 0 and u ≥ 0 is nontrivial, this
is only possible if 1 − zγ > 0. That is, $\gamma < z^{-1}$. Since this is true for all
z < ρ, it follows that $\gamma \le \rho^{-1} = \|v\|$, completing our proof of the Claim.
The result follows fairly easily from the Claim. We choose a sequence
of γi ∈ A such that γi → λ. By definition of A and the Claim, for each γi,
there is a wi ∈ K such that $\|w_i\| \ge \gamma_i$ and $T w_i \ge \gamma_i w_i$. Finally, since K is
compact, we can find a subsequence of wi converging to some ϕ ∈ K. Then
Tϕ ≥ λϕ, and the lower bound on $\|w_i\|$ guarantees that ϕ is nontrivial.
To prove that ϕ is the desired eigenfunction, suppose that Tϕ − λϕ is not
identically zero. Then our assumption on T implies the strict inequality
T(Tϕ − λϕ) > 0, so that for some small $\varepsilon > 0$, $T(T\varphi - \lambda\varphi) \ge \varepsilon T\varphi$, which
implies that if we set $u = T\varphi \in C \setminus \{0\}$, then $Tu \ge (\lambda + \varepsilon)u$. But that means
that $\lambda + \varepsilon \in A$, contradicting the definition of λ. Hence Tϕ = λϕ, and
ϕ ∈ Int C since it is in the image of T.
It only remains to show that this λ has a one-dimensional eigenspace.
Suppose that u ∈ X also has T u = λu. Let s > 0 be the largest number such
that both 0 ≤ ϕ + su and 0 ≤ ϕ − su, which clearly exists. If one of these
inequalities is an exact equality, then $u = \pm s^{-1}\varphi$ and we are done. Suppose
that neither one is, so that $\varphi + su,\ \varphi - su \in C \setminus \{0\}$. Then the positivity
property of T implies that both T (ϕ + su) and T (ϕ − su) are positive, which
means that ϕ + su and ϕ − su are strictly positive, since both functions are
λ-eigenfunctions. But this means there is enough wiggle room to contradict
the maximality of s.
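In finite dimensions the Krein-Rutman Theorem reduces to the Perron-Frobenius statement for matrices with positive entries, and the set A from the proof can be explored directly. The sketch below is an illustration only, under the assumption that X = R², C is the cone of vectors with nonnegative entries, and T is a 2 × 2 matrix with positive entries. For u > 0, the largest γ with Tu ≥ γu is the minimum of the componentwise ratios (Tu)_i / u_i, and replacing u by Tu can only increase it, because T preserves the inequality Tu − γu ≥ 0. The function names are hypothetical, and the normalization by the maximum norm is a convenience, not the normalization used in the proof.

```python
def apply(T, u):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(t * x for t, x in zip(row, u)) for row in T]

def best_gamma(T, u):
    """Largest gamma with T u >= gamma u componentwise (requires u > 0)."""
    return min(tu / x for tu, x in zip(apply(T, u), u))

def iterate(T, u, steps):
    """Repeatedly replace u by T u (rescaled), recording best_gamma each time."""
    gammas = []
    for _ in range(steps):
        gammas.append(best_gamma(T, u))
        Tu = apply(T, u)
        scale = max(abs(x) for x in Tu)
        u = [x / scale for x in Tu]
    return gammas, u

# A matrix with positive entries maps C \ {0} into the interior of C.
T = [[2.0, 1.0], [1.0, 3.0]]
gammas, phi = iterate(T, [1.0, 1.0], 60)
```

For this matrix the recorded values increase from 3 toward (5 + √5)/2, the larger eigenvalue of T, and the limiting vector has strictly positive entries.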
General proof of Theorem A.11. From our construction of the princi-

pal eigenfunction in the two previous proofs, the upper bound (and corre-
sponding sharpness statement) in Theorem A.11 is immediate. The lower
bound follows from the upper bound as before, since that argument did not
use self-adjointness.
Without too much more work in the self-adjoint case, we can construct
a complete orthonormal set of eigenfunctions.
Theorem A.13 (Spectral theorem for self-adjoint elliptic operators). Let
(M, g) be a Riemannian compact manifold, possibly with boundary, and con-
sider L as in Assumption A.1. We further assume that V = 0. Then
there exists a sequence of smooth (Dirichlet) eigenfunctions ϕ1 , ϕ2 , ϕ3 , . . . ∈
C ∞ (Int M )∩C 0 (M ) vanishing at ∂M with corresponding discrete (Dirichlet)
eigenvalues λ1 < λ2 ≤ λ3 ≤ · · · diverging to ∞ such that for each i,
Lϕi = λi ϕi ,
and the {ϕi } form a complete orthonormal set in L2 (M ).
Proof. We already constructed (ϕ1 , λ1 ) in Theorem A.10. In order to con-

struct (ϕ2 , λ2 ), we consider the orthogonal complement of ϕ1 in W01,2 (M ),
and then find ϕ2 using the same Rayleigh quotient argument from the proof
of Theorem A.10, except this time we minimize the Rayleigh quotient in
the orthogonal complement of ϕ1 to obtain a minimum value λ2 . As in the
proof of Theorem A.10, we will find that
\[ A(\varphi_2, u) = \lambda_2 \langle \varphi_2, u \rangle \]
for all $u \in W_0^{1,2}(M)$ orthogonal to ϕ1. By self-adjointness of L, it follows that
the equation holds over all of $W_0^{1,2}(M)$, which implies that ϕ2 is smooth and
Lϕ2 = λ2 ϕ2 . We can then inductively construct all of (ϕi , λi ) in a similar
manner.
We claim that λi → ∞ as i → ∞. Recall that the ϕi we constructed are
unit length in L2 . Suppose that λi is a bounded sequence. Then the sequence
Lϕi is also bounded in L2 . By an elliptic energy estimate (as in [Eva10,
Section 6.3]), it follows that the sequence ϕi is bounded in W 1,2 , and thus
by the Rellich-Kondrachov compactness theorem, it has a subsequence that
converges in L2 . But this is impossible since ϕi is orthonormal in L2 , proving
the claim.
The only thing left to prove is that the set of eigenfunctions is complete
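in $L^2$. Since $W_0^{1,2}$ is dense in $L^2$, it suffices to show that the span of the
eigenfunctions is dense in the space $W_0^{1,2}$ with respect to the $L^2$ norm. Let
$u \in W_0^{1,2}(M)$, and define
\[ u_k := \sum_{i=1}^{k} \langle u, \varphi_i \rangle \varphi_i \]
to be the partial sum expansion of u in the eigenfunctions. By construction,
we know that $u - u_k$ is orthogonal to ϕ1, ..., ϕk, and thus
\[ \lambda_{k+1} \le \frac{A(u - u_k, u - u_k)}{\|u - u_k\|_{L^2}^2}. \]
From here we can compute, for k large enough that $\lambda_{k+1} > 0$,
\[ \|u - u_k\|_{L^2}^2 \le \frac{1}{\lambda_{k+1}} A(u - u_k, u - u_k) = \frac{1}{\lambda_{k+1}} \left( A(u, u) - \sum_{i=1}^{k} \lambda_i \langle u, \varphi_i \rangle^2 \right). \]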
Since λi → ∞, and in particular there can only be finitely many negative

eigenvalues, it follows that the right-hand side of the above inequality ap-
proaches zero as k → ∞. In other words, uk converges to u in L2 , completing
the proof.
A.1.4. Harmonic expansions. In this section we wish to study harmonic

functions in Euclidean space Rn . Throughout this section, we will assume
that n ≥ 3 in order to simplify the discussion. This is because we will only
really need results for n ≥ 3; for n = 2 certain theorem statements come out
a bit differently because the fundamental solution of the Laplacian behaves
differently.
Consider spherical coordinates r, θ1, ..., θn−1, where r is the radial
coordinate, and θ = (θ1, ..., θn−1) represents coordinates on the sphere $S^{n-1}$.
Then the Euclidean metric is just $dr^2 + r^2 d\Omega^2$, where $d\Omega^2$ represents the
standard round unit sphere metric on $S^{n-1}$. In these coordinates, the
Euclidean Laplacian takes the form
\[ \Delta = \frac{\partial^2}{\partial r^2} + \frac{n-1}{r} \frac{\partial}{\partial r} + \frac{1}{r^2} \Delta_S, \]
where $\Delta_S$ is the Laplacian on the unit sphere $S^{n-1}$. If we consider
“separated” solutions Y(r, θ) = R(r)Θ(θ) of the equation ΔY = 0, one can easily
see that there must be a constant λ such that
\begin{align*}
\Delta_S \Theta + \lambda \Theta &= 0, \\
R'' + \frac{n-1}{r} R' - \frac{\lambda}{r^2} R &= 0.
\end{align*}
So in order for Y to be nontrivial, λ must be an eigenvalue of the spherical
Laplacian, Θ must be one of its eigenfunctions, and R(r) must be a linear
combination of $r^{\alpha_1}$ and $r^{\alpha_2}$, where α1 and α2 are the two roots of the
characteristic equation $\alpha^2 + (n-2)\alpha - \lambda = 0$. (Since $\Delta_S$ can only have nonnegative
eigenvalues and n ≠ 2, a double root cannot occur.) Eigenfunctions of the
spherical Laplacian are often called spherical harmonics.
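Definition A.14. Let $\mathcal{H}_k(\mathbb{R}^n)$ denote the space of homogeneous harmonic
polynomials of degree k on $\mathbb{R}^n$, and for any s ≥ 0,
\[ \mathcal{H}_{\le s}(\mathbb{R}^n) := \bigoplus_{k=0}^{\lfloor s \rfloor} \mathcal{H}_k(\mathbb{R}^n), \]
where $\lfloor s \rfloor$ is the greatest integer function, and $\mathcal{H}_{\le s}(\mathbb{R}^n) := \{0\}$ if s < 0.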
Since a homogeneous function is a separated function in the sense de-

scribed above, the above computation implies that the restriction of any ele-
ment of Hk (Rn ) to the unit sphere must be an eigenfunction of the spherical
Laplacian with eigenvalue k(k + n − 2). In particular, given two homoge-
neous harmonic polynomials of different degree, their restrictions to the unit
sphere are L2 -orthogonal to each other.
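The link between the eigenvalue k(k + n − 2) and the two radial exponents can be verified with exact integer arithmetic. The sketch below assumes λ = k(k + n − 2) and solves the characteristic equation α² + (n − 2)α − λ = 0 from the separation of variables computation; its discriminant is the perfect square (2k + n − 2)², so the roots are the integers k and 2 − n − k. The function name `radial_exponents` is hypothetical.

```python
import math

def radial_exponents(n, k):
    """Roots of alpha^2 + (n - 2) alpha - lam = 0 with lam = k (k + n - 2),
    returned as (larger root, smaller root)."""
    lam = k * (k + n - 2)
    disc = (n - 2) ** 2 + 4 * lam            # equals (2k + n - 2)^2
    root = math.isqrt(disc)
    if root * root != disc:
        raise ValueError("discriminant is not a perfect square")
    return (-(n - 2) + root) // 2, (-(n - 2) - root) // 2

# In R^3, degree k = 2 gives the growing solution r^2 and the decaying one r^(-3).
exponents = radial_exponents(3, 2)
```

The first root recovers the homogeneous polynomial growth r^k, while the second root gives the decaying solution of degree 2 − n − k on the complement of the origin.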
Using some elegant linear algebra, one can prove the following.
Theorem A.15. Given any polynomial p on Rn , there exists a harmonic

polynomial whose restriction to the unit sphere is the same as the restriction
of p to the unit sphere. Moreover,
\[ \dim \mathcal{H}_k(\mathbb{R}^n) = \binom{k+n-1}{n-1} - \binom{k+n-3}{n-1}. \]
See [ABR01, Chapter 5] for a proof.
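As a quick sanity check of the dimension formula, the sketch below evaluates it with exact binomial coefficients and compares with two familiar cases: in dimension n = 3 the count is 2k + 1 (the usual number of spherical harmonics of degree k), and in dimension n = 4 it is (k + 1)². The function name `dim_harmonic` is hypothetical, and the restriction to n ≥ 3 simply mirrors the standing assumption of this section.

```python
from math import comb

def dim_harmonic(n, k):
    """Dimension of the space of homogeneous harmonic polynomials of
    degree k on R^n, using the formula of Theorem A.15 (n >= 3, k >= 0)."""
    if n < 3 or k < 0:
        raise ValueError("expected n >= 3 and k >= 0")
    # comb(a, b) returns 0 when b > a, which handles k = 0 and k = 1.
    return comb(k + n - 1, n - 1) - comb(k + n - 3, n - 1)

dims_in_R3 = [dim_harmonic(3, k) for k in range(5)]
```

The subtracted term counts the polynomials of the form |x|² times a homogeneous polynomial of degree k − 2, which is why it vanishes for k = 0 and k = 1.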
Corollary A.16. Let $\mathcal{H}_k(S^{n-1})$ denote the space of the restrictions of
elements of $\mathcal{H}_k(\mathbb{R}^n)$ to the unit sphere. Then
\[ L^2(S^{n-1}) = \bigoplus_{k=0}^{\infty} \mathcal{H}_k(S^{n-1}). \]
Moreover, $\mathcal{H}_k(S^{n-1})$ is precisely the eigenspace of $\Delta_S$ with eigenvalue
k(k + n − 2), and these are all of the eigenvalues.
Proof. We already mentioned that the spaces Hk (S n−1 ) are L2 -orthogonal

to each other. Theorem A.15 tells us that every polynomial on the unit
sphere is actually equal to a harmonic polynomial on the sphere, and then
the Stone-Weierstrass Theorem [Wik, Stone–Weierstrass theorem] applied
to the unit sphere tells us that the harmonic polynomials on the sphere are
dense in C 0 (S n−1 ). Since C 0 (S n−1 ) is dense in L2 (S n−1 ), the first assertion
follows.
The second assertion follows from the first. As discussed earlier, we al-
ready know that each element of Hk (S n−1 ) is an eigenfunction of the spher-
ical Laplacian with eigenvalue k(k + n − 2). The first assertion then implies
that there cannot be any other eigenfunctions besides the ones coming from
Hk (S n−1 ).
The previous corollary says that if ϕ is an eigenfunction of the Laplacian
on the unit sphere, then the corresponding eigenvalue λ must equal
k(k + n − 2) for some nonnegative integer k, and then $|x|^k \varphi(x/|x|)$ is a
harmonic polynomial on $\mathbb{R}^n$. Moreover, our separation of variables calculation
tells us that $|x|^{2-n-k} \varphi(x/|x|)$ is also harmonic (on $\mathbb{R}^n \setminus \{0\}$). Indeed, we
see that all homogeneous harmonic functions must be obtained in this way.
Definition A.17. For n ≥ 3, we define the exceptional set
\[ \Lambda := \mathbb{Z} \setminus (2 - n, 0). \]
It consists precisely of all possible degrees of nontrivial homogeneous
harmonic functions on $\mathbb{R}^n \setminus \{0\}$. For each k ∈ Λ, we define $\mathcal{H}_k(\mathbb{R}^n \setminus \{0\})$ to be
the space of homogeneous harmonic functions of degree k.
Note that we have the redundant notation $\mathcal{H}_k(\mathbb{R}^n \setminus \{0\}) = \mathcal{H}_k(\mathbb{R}^n)$
when k ≥ 0, and that for k ≤ 2 − n, restriction to the unit sphere gives an
isomorphism from $\mathcal{H}_k(\mathbb{R}^n \setminus \{0\})$ to $\mathcal{H}_{2-n-k}(S^{n-1})$.
Recall that any harmonic function defined on a closed ball can be ex-
pressed in terms of its values on the boundary of that ball via the Poisson
integral formula. Explicitly, for a harmonic function u on the closed unit
ball, we have
\[ u(x) = \int_{\partial B_1(0)} K(x, \xi)\, u(\xi)\, d\mu(\xi), \]
where
\[ K(x, \xi) = \frac{1}{\omega_{n-1}} \frac{1 - |x|^2}{|\xi - x|^n} \]
is known as the Poisson kernel. For each fixed ξ ∈ ∂B1 , we can use Corol-
lary A.16 to expand the restriction of K(x, ξ) to ∂B1 in spherical harmonics
in the x-variable. Consequently, this leads to an expansion of K(x, ξ) in
terms of homogeneous harmonic polynomials in the x-variable,
\[ K(x, \xi) = \sum_{k=0}^{\infty} Z_k(x, \xi), \]
where the function $Z_k(\cdot, \xi) \in \mathcal{H}_k(\mathbb{R}^n)$ is sometimes called the zonal harmonic
with pole ξ. By axisymmetry considerations, we can see that $Z_k(x, \xi) =
|x|^k P_k(\hat{x} \cdot \xi)$ for some polynomial $P_k$, where $\hat{x} := x/|x|$. This generalizes the
well-known formula for n = 2,
\[ K(x, \xi) = \frac{1}{2\pi} \left( 1 + 2 \sum_{k=1}^{\infty} |x|^k \cos k\theta \right), \]
where θ is the angle between x and ξ. When n = 3, the polynomials Pk

are called the Legendre polynomials [Wik, Zonal spherical harmonics], up
to some constant. For general n, they fall within the family of ultraspherical
or Gegenbauer polynomials.
In the expansion of K(x, ξ) above, the convergence is absolute and uni-
form over all x in a compact subset of B1 . Using this expansion in zonal
harmonics (and scaling appropriately), one can show the following.
Theorem A.18. Let u be a harmonic function on $B_\rho$. Then there exist
$h_k \in \mathcal{H}_k(\mathbb{R}^n)$ such that
\[ u = \sum_{k=0}^{\infty} h_k, \]
where the convergence is absolute and uniform over any compact subset
of $B_\rho$.
For a full explanation of this whole story, see [ABR01, Chapter 5]. If
we only want the expansion to hold in a small neighborhood of the origin
(which is sometimes all that is needed), the proof is quite a bit simpler.
Note that since a partial derivative of a harmonic function is also harmonic,
one can see that the expansion can be differentiated, or, in other words, the
series converges in C m on compact subsets for any m.
We can use the expansion in Theorem A.18 to help us understand the
asymptotics of a harmonic function defined on an exterior region of $\mathbb{R}^n$.
Given such a harmonic function u on $\mathbb{R}^n \setminus \bar{B}_\rho$, we can use a Kelvin transform
to construct a new harmonic function $u^*$ defined on $B_{1/\rho} \setminus \{0\}$ via
\[ u^*(x) = |x|^{2-n} u(x/|x|^2). \]
If $\lim_{x\to\infty} u(x) = 0$, then $\lim_{x\to 0} |x|^{n-2} u^*(x) = 0$, and it follows that $u^*$ has
a removable singularity at the origin. Hence, we can apply Theorem A.18
to $u^*$, giving us the expansion $u = |x|^{2-n} \sum_{k=0}^{\infty} h_k(x/|x|^2)$ for some
$h_k \in \mathcal{H}_k(\mathbb{R}^n)$. So we have the following corollary.
Corollary A.19. Let ρ0 > 0, and let u be a harmonic function on
$\mathbb{R}^n \setminus B_{\rho_0}(0)$ such that $\lim_{x\to\infty} u(x) = 0$. Then there exist spherical harmonics
$\varphi_k \in \mathcal{H}_k(S^{n-1})$ such that
\[ u = \sum_{k=0}^{\infty} |x|^{2-n-k} \varphi_k(x/|x|), \]
where the convergence is absolute and uniform over |x| > ρ for any ρ > ρ0.
We also obtain convergence of any number of derivatives.
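The Kelvin transform behind this corollary can be tried on a concrete example. The sketch below assumes n = 3 and takes u(x) = x₁/|x|³, which is harmonic away from the origin and decays at infinity; its Kelvin transform works out to the linear polynomial x₁, matching the k = 1 term |x|^(2−n−k) ϕ_k(x/|x|) of the expansion. The names `kelvin_transform` and `dipole` are hypothetical. The sketch only evaluates the transform pointwise and does not check harmonicity.

```python
import math

def kelvin_transform(u, n):
    """Return u*(x) = |x|^(2-n) u(x / |x|^2) for a function u on R^n minus the origin."""
    def u_star(x):
        r2 = sum(c * c for c in x)                 # |x|^2, assumed nonzero
        y = [c / r2 for c in x]                    # inversion in the unit sphere
        return r2 ** ((2 - n) / 2) * u(y)
    return u_star

def dipole(x):
    """u(x) = x_1 / |x|^3, harmonic on R^3 minus the origin, decaying at infinity."""
    r = math.sqrt(sum(c * c for c in x))
    return x[0] / r**3

u_star = kelvin_transform(dipole, 3)
point = [0.3, -0.2, 0.5]
value = u_star(point)          # numerically equal to point[0]
```

Applying the transform a second time returns the original function, which reflects the fact that the inversion x ↦ x/|x|² is its own inverse.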
One can actually go further and cook up a Laurent expansion for any
harmonic function defined on an annulus [ABR01, Chapter 10]. In the next
section, we will prove a useful generalization of Corollary A.19.
A.2. Weighted spaces on asymptotically flat manifolds

This section assumes familiarity with the definition of asymptotically flat
manifolds in Definition 3.5. Please refer back to that definition as needed.
(Definition 3.5 requires the scalar curvature to be integrable, but it is worth
noting that this condition is not used for any of the analytic results on
asymptotically flat manifolds in this Appendix.)
One critical technical difficulty of working on a complete asymptotically flat
manifold rather than a compact one is that we do not have Theorem A.7, which
is what we use to solve Poisson-type equations. In order to obtain an analog
of Theorem A.7, we must introduce weighted versions of Sobolev and Hölder
spaces. The basic intuition is that if a function on Euclidean space has a
certain order of decay at infinity, then its Laplacian decays two orders faster.
Conversely, if we wish to solve for u in the Poisson equation Δu = f, we
would naturally look for a solution u that decays two orders slower than f.
Even for the case of Euclidean space, the theory is nontrivial.
With this motivation in mind, we briefly present some of the theory of
weighted spaces on asymptotically flat manifolds and the corresponding el-
liptic theory. The use of these weighted spaces to study elliptic problems
on Rn was first developed by L. Nirenberg and H. Walker [NW73] and
M. Cantor [Can74]. Much of what appears in this section can be found in
R. Bartnik’s paper [Bar86, Sections 1 and 2], but the current understanding
of this theory was built up over many years by contributions from N. Mey-
ers, Y. Choquet-Bruhat, A. Chaljub-Simon, R. McOwen, R. Lockhart, and
D. Christodoulou, among others. See [Mey63, CSCB79, McO79, McO80,
Loc81, Can81, CBC81, LM83].
Definition A.20. Let (M n , g) be a complete asymptotically flat manifold.
Slightly abusively, we will choose r to be a smooth positive function on M
such that r = |x| in each asymptotically flat coordinate chart. Given any
p ≥ 1, s ∈ R, we define the weighted Lebesgue space $L^p_s(M)$ to be the space
of all functions $u \in L^p_{\mathrm{loc}}(M)$ with finite weighted norm
\[ \|u\|_{L^p_s(M)} = \left( \int_M |u|^p\, r^{-sp-n}\, d\mu \right)^{1/p}. \]
We can extend this definition to include p = ∞ by taking
$\|u\|_{L^\infty_s(M)} = \|r^{-s} u\|_{L^\infty(M)}$.
Next, for each positive integer k, we define the weighted Sobolev space
$W^{k,p}_s(M)$ to be the space of all functions $u \in W^{k,p}_{\mathrm{loc}}(M)$ with finite norm
\[ \|u\|_{W^{k,p}_s(M)} = \sum_{i=0}^{k} \|\nabla^i u\|_{L^p_{s-i}(M)}. \]
We can also analogously define weighted Sobolev spaces for tensors and
spinors.
The convention for the dependence of these spaces on s is chosen so that
s corresponds to a decay rate, that is, $r^s \notin L^p_s$, but $r^{s-\delta} \in L^p_s$ for any δ > 0.
Note that $L^p = L^p_{-n/p}$. The Sobolev spaces are set up so that taking one
derivative of the function causes it to decay one order faster. These weighted
spaces are Banach spaces, just like their unweighted counterparts.
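The borderline behavior of the weight can be made concrete with a one-line radial integral. The sketch below assumes M is Euclidean space with r = |x| and only looks at the exterior shell 1 ≤ |x| ≤ R: for a pure power r^a, the integrand |r^a|^p r^(−sp−n) combined with the volume factor r^(n−1) is r^((a−s)p−1), so the dimension n drops out. The function name `exterior_mass` is hypothetical, and the returned value omits the constant area of the unit sphere.

```python
import math

def exterior_mass(a, s, p, R):
    """Integral of r^((a - s) p - 1) over 1 <= r <= R, i.e. the p-th power of the
    weighted L^p_s norm of r^a on the shell, divided by the area of S^(n-1)."""
    e = (a - s) * p
    if e == 0:
        return math.log(R)            # borderline case a = s: logarithmic divergence
    return (R**e - 1.0) / e

# Borderline rate a = s: grows like log R, so r^s is not in L^p_s.
borderline = exterior_mass(-1.0, -1.0, 2, 1e6)

# Slightly faster decay a = s - 1/2: stays bounded as R grows.
faster = exterior_mass(-1.5, -1.0, 2, 1e6)
```

The bounded value for any decay rate strictly better than s, together with the logarithmic growth exactly at rate s, is the sense in which the index s records a decay rate.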
Exercise A.21. As stated, the definitions of the norms and spaces depend
both on the exact choice of metric and the choice of r. Show that although
changing the asymptotically flat metric g and the function r will change the
norms, such a change will produce equivalent norms Lps , Ws1,p , and Ws2,p ,
and in particular membership in these spaces does not depend on the choice
of g or r. (In your answer, you may keep the choice of asymptotically flat
coordinate charts fixed.) What about Ws3,p ?
Just as there are weighted Sobolev spaces, there are also weighted Hölder
spaces.
Definition A.22. Let (M n , g) be a complete asymptotically flat manifold.

Slightly abusively, we will choose r to be a smooth positive function on M
such that r = |x| in each asymptotically flat coordinate chart. Given a
positive integer k, α ∈ (0, 1), s ∈ R, we define the weighted Hölder space
$C^{k,\alpha}_s(M)$ to be the space of all functions $u \in C^{k,\alpha}_{\mathrm{loc}}(M)$ with finite weighted
norm
\[ \|u\|_{C^{k,\alpha}_s(M)} = \sum_{i=0}^{k} \sup |r^{i-s} \nabla^i u| + \sup_{x \in M} r^{k+\alpha-s} [\nabla^k u]_{C^\alpha(B_{r/2}(x))}, \]
where
\[ [v]_{C^\alpha(U)} := \sup_{x, y \in U} \frac{|v(x) - v(y)|}{d(x, y)^\alpha} \]
denotes the Hölder coefficient of the tensor-valued function v over the set U.
The quantity |v(x) − v(y)| can be defined using parallel translation along
a minimizing geodesic connecting x to y. (We may take r/2 to be smaller
than the injectivity radius.)
In contrast with weighted Sobolev spaces, note that $r^s \in C^{0,\alpha}_s$, but
$r^{s+\delta} \notin C^{0,\alpha}_s$ for all δ > 0.
There are weighted versions of many of the inequalities studied in partial
differential equations.
Theorem A.23 (Weighted Hölder inequality). Let (M, g) be an asymptotically
flat manifold. Assume p, q > 1 such that $\frac{1}{p} + \frac{1}{q} = 1$ and $s_1 + s_2 = s$.
Then for any $u \in L^p_{s_1}$ and $v \in L^q_{s_2}$,
\[ \|uv\|_{L^1_s} \le \|u\|_{L^p_{s_1}} \cdot \|v\|_{L^q_{s_2}}. \]
Exercise A.24. Show that this is a simple consequence of the usual Hölder
inequality.
Theorem A.25 (Weighted Sobolev embeddings). Let $(M^n, g)$ be a complete
asymptotically flat manifold. Let k be a positive integer, p, q ≥ 1, and s ∈ R.
If k − n/p < 0 and q ≤ np/(n − kp), then $W^{k,p}_s(M) \subset L^q_s(M)$, and there
exists a constant C such that for all $u \in W^{k,p}_s$,
\[ \|u\|_{L^q_s} \le C \|u\|_{W^{k,p}_s}. \]
If k − n/p ≥ α and α ∈ (0, 1), then $W^{k,p}_s(M) \subset C^{0,\alpha}_s(M)$, and there
exists a constant C such that for all $u \in W^{k,p}_s$,
\[ \|u\|_{C^{0,\alpha}_s} \le C \|u\|_{W^{k,p}_s}. \]
To prove this theorem, one first proves weighted Sobolev inequalities

in Euclidean space, which follow from looking at the usual scale-invariant
Sobolev inequalities on annuli in Euclidean space. See [Bar86, Theorem 1.2].
Then one obtains the general case using a standard patching argument on
charts where the metric is close to Euclidean (as in [Heb96, Chapter 2], for
example). Asymptotic flatness allows us to choose a patch for each infinite
end that is nearly Euclidean, and here is where we use the weighted Sobolev
inequality in Euclidean space.
Theorem A.26 (Weighted Rellich-Kondrachov compactness theorem). Let
$(M^n, g)$ be a complete asymptotically flat manifold. Let k be a positive
integer, p, q ≥ 1, and s < t. If k − n/p < 0 and q < np/(n − kp), then the
embedding $W^{k,p}_s(M) \subset L^q_t(M)$ is a compact embedding.
Proof. This was first observed in [CBC81], but the proof is fairly
straightforward. First, the embedding itself follows from weighted Sobolev
embedding and the weighted Hölder inequality. Let $u_i$ be a bounded sequence in
$W^{k,p}_s(M)$. Then we can extract a subsequence (which we will also write as
$u_i$) that converges weakly to some $u \in W^{k,p}_s(M)$. We claim that a
subsequence of this $u_i$ converges to u in $L^q_t$. Each sufficiently large radius ρ > 0
divides M into an exterior region $E_\rho = \mathbb{R}^n \setminus \bar{B}_\rho$ and an interior compact set
$K_\rho = M \setminus E_\rho$ (we assume one end for simplicity).
Let $\varepsilon > 0$. Since the $L^q_t(E_\rho)$ norm is bounded by $\rho^{s-t}$ times the $L^q_s(E_\rho)$
norm, which in turn is bounded by the $W^{k,p}_s(E_\rho)$ norm (by weighted Sobolev
embedding), it is clear that we can choose ρ large enough so that
$\|u_i - u\|_{L^q_t(E_\rho)} < \varepsilon/2$ for all i. Meanwhile, $u_i$ is certainly bounded in $W^{k,p}_s(K_\rho)$,
and since $K_\rho$ is a compact manifold with boundary, the standard
Rellich-Kondrachov compactness theorem implies that a subsequence of $u_i$ converges
in $L^q_t(K_\rho)$, and the only thing it could converge to is the restriction of u to
$K_\rho$. So there exists some particular $i_1$ such that $\|u_{i_1} - u\|_{L^q_t(K_\rho)} < \varepsilon/2$, and
thus $\|u_{i_1} - u\|_{L^q_t(M)} < \varepsilon$. Since this can be done for any $\varepsilon > 0$, the result
follows.
Similar reasoning combined with the Arzela-Ascoli Theorem on Kρ leads

to the following.
Proposition A.27. Let (M n , g) be a complete asymptotically flat manifold.
Let 0 < β < α < 1 and s < t. Then the inclusion $C^{0,\alpha}_s \subset C^{0,\beta}_t$ is a compact
embedding.
Theorem A.28 (Weighted Poincaré inequality). Let (M n , g) be a complete

asymptotically flat manifold. Let p ≥ 1 and s < 0. Then there exists a
constant C such that for all $u \in W^{1,p}_s$,
\[ \|u\|_{L^p_s} \le C \|\nabla u\|_{L^p_{s-1}}. \]
Proof. Our proof is modeled on an argument in [Bar86, Theorem 1.3].

It suffices to show that the result holds for compactly supported smooth
functions u. Choose a smooth positive function σ that is equal to |x| for
large |x| on each asymptotically flat end. So this σ has the same properties
as the function r in Definition A.20, but it can be a different function. Note
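that the ratio between σ and r is controlled. Compute
\begin{align*}
\int_M \nabla(\sigma^{2-n}) \cdot \nabla(\sigma^{-sp} |u|^p)
&= \int_M (2-n) \sigma^{1-n} \nabla\sigma \cdot \left( -sp\, \sigma^{-sp-1} |u|^p \nabla\sigma + \sigma^{-sp} p |u|^{p-2} u \nabla u \right) \\
&\le \int_M -sp(2-n) |\nabla\sigma|^2 \sigma^{-sp-n} |u|^p + (n-2) p \int_M |\nabla\sigma|\, \sigma^{-sp+1-n} |u|^{p-1} |\nabla u| \\
&\le \int_M -sp(2-n) |\nabla\sigma|^2 \sigma^{-sp-n} |u|^p + C \left\| |u|^{p-1} \nabla u \right\|_{L^1_{sp-1}} \\
&\le \int_M -sp(2-n) |\nabla\sigma|^2 \sigma^{-sp-n} |u|^p + C \left\| |u|^{p-1} \right\|_{L^{p/(p-1)}_{sp-s}} \cdot \|\nabla u\|_{L^p_{s-1}} \\
&= \int_M -sp(2-n) |\nabla\sigma|^2 \sigma^{-sp-n} |u|^p + C \|u\|_{L^p_s}^{p-1} \cdot \|\nabla u\|_{L^p_{s-1}}
\end{align*}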
for some constant C depending on the choice of σ, where we used the
weighted Hölder inequality (Theorem A.23) in the second to last line. We
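compute the same quantity using integration by parts:
\[ \int_M \nabla(\sigma^{2-n}) \cdot \nabla(\sigma^{-sp} |u|^p) = \int_M -\Delta_g(\sigma^{2-n})\, \sigma^{-sp} |u|^p. \]
Combining the two computations, we have
\[ \int_M \left[ -\Delta_g(\sigma^{2-n}) + sp(2-n) |\nabla\sigma|^2 \sigma^{-n} \right] |u|^p \sigma^{-sp} \le C \|u\|_{L^p_s}^{p-1} \cdot \|\nabla u\|_{L^p_{s-1}}. \]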
We claim that we can select σ so that $-\Delta_g(\sigma^{2-n}) + sp(2-n)|\nabla\sigma|^2 \sigma^{-n}$ is
bounded below by some positive constant times $\sigma^{-n}$. Assuming the claim, the
left-hand side becomes an upper bound for $\|u\|_{L^p_s}^p$ times some small constant,
and then the result easily follows.
To establish the claim, note that our assumption s < 0 implies that the
term $sp(2-n)|\nabla\sigma|^2 \sigma^{-n} \ge 0$. We know that $|x|^{2-n}$ is harmonic with respect
to the Euclidean metric, so it is not hard to see that
\[ \Delta_g(|x|^{2-n}) < sp(2-n)\, |\nabla|x||^2\, |x|^{-n} \]
for sufficiently large |x|, by asymptotic flatness. Meanwhile, we would like
to have Δg (σ 2−n ) < 0 in the compact region. The construction of such a
function is a bit tedious but the basic idea is to “cap” off the function |x|2−n
by a superharmonic function in the compact region. The tricky part is to
do so smoothly. We omit the details.
We now consider elliptic operators on a complete asymptotically flat
manifold.
Assumption A.29. We make all of the same assumptions as in Assumption A.1,
but we also assume that $(M, g)$ is a complete asymptotically flat
manifold, and that⁵ $V = O(|x|^{-1-\delta})$ and $q = O(|x|^{-2-\delta})$ for some $\delta > 0$.
Assumption A.30. We make all of the same assumptions as in Assumption A.1,
but we also assume that $(M, g)$ is a complete asymptotically flat
manifold, and that $V \in C^{0,\alpha}_{-1-\delta}$ and $q \in C^{0,\alpha}_{-2-\delta}$ for some $\alpha \in (0, 1)$ and some
$\delta > 0$.
Exercise A.31. Check that for any $L$ as in Assumption A.29, for any $p > 1$
and $s \in \mathbb{R}$,
\[ L : W^{2,p}_s(M) \longrightarrow L^p_{s-2}(M) \]
is a bounded linear operator.
Check that for any $L$ and $\alpha$ as in Assumption A.30, for any $s \in \mathbb{R}$,
\[ L : C^{2,\alpha}_s(M) \longrightarrow C^{0,\alpha}_{s-2}(M) \]
is a bounded linear operator.
Given such an operator $L$, we obtain weighted versions of our $L^p$ and
Schauder estimates (Theorems A.3 and A.6).
Theorem A.32 (Weighted elliptic global $L^p$ estimate and regularity). Let
$(M, g)$ be a complete asymptotically flat manifold, and consider $L$ as in
Assumption A.29. Let $p > 1$ and $s \in \mathbb{R}$. There exists a constant $C$ such
that for any $u \in W^{2,p}_s(M)$,
\[ \|u\|_{W^{2,p}_s(M)} \le C\left( \|Lu\|_{L^p_{s-2}(M)} + \|u\|_{L^p_s(M)} \right). \]
Moreover, if $u \in L^p_s(M)$ and the weak object $Lu$ can be identified with a
function in $L^p_{s-2}(M)$, then $u \in W^{2,p}_s(M)$.
⁵ These assumptions are stronger than what we will actually need. See [Bar86, inequalities (1.18)] for more natural assumptions.

Theorem A.33 (Weighted elliptic global Hölder estimate and regularity).
Let $(M, g)$ be a complete asymptotically flat manifold, and consider $L$ and
$\alpha$ as in Assumption A.30. Let $s \in \mathbb{R}$. There exists a constant $C$ such that
for any $u \in C^{2,\alpha}_s(M)$,
\[ \|u\|_{C^{2,\alpha}_s(M)} \le C\left( \|Lu\|_{C^{0,\alpha}_{s-2}(M)} + \|u\|_{C^0_s(M)} \right). \]
Moreover, if $u \in C^0_s(M)$ and the weak object $Lu$ can be identified with a
function in $C^{0,\alpha}_{s-2}(M)$, then $u \in C^{2,\alpha}_s(M)$.
Both of these theorems are proved in essentially the same way.
Outline of the proof of Theorem A.33. The proof assumes some knowledge
of some of the precursors of Theorem A.6. Specifically, there exists an
interior elliptic Hölder estimate for a fixed annulus. That is, for annuli
$A' \subset\subset A$ lying in an asymptotically flat end, we have
\[ \|u\|_{C^{2,\alpha}(A')} \le C\left( \|Lu\|_{C^{0,\alpha}(A)} + \|u\|_{C^0(A)} \right), \tag{A.3} \]
where $C$ depends on $A$, $A'$, $\alpha$, and the coefficients of $L$. If we scale it up by
a factor of $\rho$, we obtain the scale-invariant estimate
\[ \|u\|^*_{C^{2,\alpha}(\rho A')} \le C\left( \rho^2\|Lu\|^*_{C^{0,\alpha}(\rho A)} + \|u\|_{C^0(\rho A)} \right), \tag{A.4} \]
where we define
\[ \|u\|^*_{C^{2,\alpha}(\rho A')} := \sum_{k=0}^{2} \rho^k \sup|D^k u| + \rho^{2+\alpha} \cdot \sup_{x,y\in\rho A'} \frac{|D^2u(x) - D^2u(y)|}{|x-y|^\alpha}, \]
and $\|Lu\|^*_{C^{0,\alpha}(\rho A)}$ is defined similarly. The point is that these scale-invariant
norms, together with our hypotheses on $g$, $V$, and $q$, allow us to choose the
constant $C$ in (A.4) to be independent of $\rho$. (The interior estimate and the
claimed dependence of $C$ on the coefficients of $L$ both follow from [GT01,
Theorem 6.2], for example.) Then of course we have
\[ \rho^{-s}\|u\|^*_{C^{2,\alpha}(\rho A')} \le C\left( \rho^{2-s}\|Lu\|^*_{C^{0,\alpha}(\rho A)} + \rho^{-s}\|u\|_{C^0(\rho A)} \right). \]
Next, observe that on the annulus $\rho A$, the constant $\rho$ and the function
$r(x) = |x|$ are uniformly bounded by each other. Therefore we can patch
these estimates together for an appropriate sequence of $\rho$’s (as well as patches
covering the compact part of $M$) in a standard way to obtain our desired
weighted global estimate on $M$.
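The factor $\rho^2$ in (A.4) can be traced through a short rescaling computation. The sketch below treats only the model case $L = \Delta$ (the Euclidean Laplacian) and uses the auxiliary function $u_\rho(x) := u(\rho x)$, a name introduced here for illustration. It also assumes that the starred $C^{0,\alpha}$ norm is $\sup|f| + \rho^\alpha$ times the Hölder seminorm, which is one reading of "defined similarly". For general $L$ one must also track how the coefficients rescale, which is where the decay hypotheses on $g$, $V$, and $q$ enter.

```latex
% Model case L = \Delta. Define u_\rho(x) := u(\rho x) for x in A (illustrative notation).
\begin{align*}
D^k u_\rho(x) &= \rho^k\,(D^k u)(\rho x), \qquad k = 0, 1, 2, \\
\Delta u_\rho(x) &= \rho^2\,(\Delta u)(\rho x), \\
\frac{|D^2 u_\rho(x) - D^2 u_\rho(y)|}{|x-y|^\alpha}
  &= \rho^{2+\alpha}\,\frac{|D^2 u(\rho x) - D^2 u(\rho y)|}{|\rho x - \rho y|^\alpha}.
\end{align*}
% Hence the unscaled norms of u_\rho on A', A are the starred norms of u on \rho A', \rho A:
\[
\|u_\rho\|_{C^{2,\alpha}(A')} = \|u\|^*_{C^{2,\alpha}(\rho A')}, \qquad
\|\Delta u_\rho\|_{C^{0,\alpha}(A)} = \rho^2\,\|\Delta u\|^*_{C^{0,\alpha}(\rho A)}, \qquad
\|u_\rho\|_{C^0(A)} = \|u\|_{C^0(\rho A)},
\]
% so applying (A.3) to u_\rho yields (A.4) with the same constant C.
```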
As for the regularity statement, standard interior elliptic regularity is
enough to guarantee that $u$ is locally $C^{2,\alpha}$ regular, so really we only need to
show that $\|u\|_{C^{2,\alpha}_s(M)} < \infty$, and the patching argument described above is
sufficient to do that.
Actually, we can say something even stronger.

Corollary A.34. Let $p \ge 1$ and $s \in \mathbb{R}$. Under Assumption A.30, if
$u \in L^p_s(M)$ and $Lu \in C^{0,\alpha}_{s-2}(M)$, then $u \in C^{2,\alpha}_s(M)$.
Proof. As mentioned in the previous proof, standard interior elliptic regularity
(such as in Theorem A.5) is enough to guarantee that $u$ is locally
$C^{2,\alpha}$ regular. We just have to show that $\|u\|_{C^{2,\alpha}_s(M)} < \infty$. As before, consider
annuli $A' \subset\subset A$. Observe that since $C^{2,\alpha}(A) \subset C^0(A)$ is a compact
embedding and $C^0(A) \subset L^p(A)$, a simple argument shows that there is an
interpolation inequality
\[ \|u\|_{C^0(A)} \le \epsilon\,\|u\|_{C^{2,\alpha}(A)} + C(\epsilon)\,\|u\|_{L^p(A)}. \]
(Prove this as an exercise.) One can combine this with the estimate (A.3)
and a standard (but nonobvious) PDE argument to show that we have
the estimate
\[ \|u\|_{C^{2,\alpha}(A')} \le C\left( \|Lu\|_{C^{0,\alpha}(A)} + \|u\|_{L^p(A)} \right). \]
Scaling this one up by $\rho$ leads to
\[ \rho^{-s}\|u\|^*_{C^{2,\alpha}(\rho A')} \le C\left( \rho^{2-s}\|Lu\|^*_{C^{0,\alpha}(\rho A)} + \rho^{-s-\frac{n}{p}}\|u\|_{L^p(\rho A)} \right), \]
where once again the $C$ will be independent of $\rho$. Consequently, we can see
that for some $C$, we have the inequality
\[ \|u\|_{C^{2,\alpha}_s(M)} \le C\left( \|Lu\|_{C^{0,\alpha}_{s-2}(M)} + \|u\|_{L^p_s(M)} \right), \]
which is valid for any $u$ satisfying the hypotheses.
We now present the appropriate analog of Theorem A.8 for the whole
Euclidean space $\mathbb{R}^n$.
Theorem A.35. Let $\Delta$ denote the Euclidean Laplacian on $\mathbb{R}^n$, and let $s$ be
any real number not in the exceptional set $\Lambda$ from Definition A.17.
Then for any $p > 1$, the map
\[ \Delta : W^{2,p}_s(\mathbb{R}^n) \longrightarrow L^p_{s-2}(\mathbb{R}^n) \]
is Fredholm. More precisely, the kernel is precisely $\mathcal{H}_{\le s}(\mathbb{R}^n)$ and the range
is precisely the closed subspace of $L^p_{s-2}$ annihilated by $\mathcal{H}_{\le 2-n-s}(\mathbb{R}^n)$, where the
pairing is just integration of the product. In particular, if $2 - n < s < 0$, the
map is an isomorphism.
We also have the exact same results for the map
\[ \Delta : C^{2,\alpha}_s(\mathbb{R}^n) \longrightarrow C^{0,\alpha}_{s-2}(\mathbb{R}^n) \]
for any $\alpha \in (0, 1)$.
Cantor first interpreted the results of Nirenberg and Walker [NW73,
Lemma 2.1] in terms of weighted Sobolev spaces in order to prove some of
the isomorphism cases for weighted Sobolev spaces [Can74, Theorem 2].
The full theorem for weighted Sobolev spaces was first proved in independent
works of McOwen [McO79, Theorem 0] and Lockhart [Loc81, Theorem 4.3].
The isomorphism cases for weighted Hölder spaces were observed
by Chaljub-Simon and Choquet-Bruhat [CSCB79] for $n = 3$, but the full
theorem seems difficult to pin down in the literature. See also closely related
results of Meyers [Mey63].
Remark A.36. To get a sense for why a problem occurs when s lies in the
exceptional set Λ, consider the example of a harmonic function of degree k
times log |x|.
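As a hedged illustration of the remark, take a homogeneous harmonic polynomial $h$ of degree $k \ge 0$ and compute the Laplacian of $h \log|x|$ away from the origin, using the product rule and Euler's identity $x \cdot \nabla h = kh$. This is a standard computation and is not taken from the text.

```latex
\begin{align*}
\Delta\big(h \log|x|\big)
  &= (\Delta h)\log|x| + 2\,\nabla h \cdot \nabla\log|x| + h\,\Delta\log|x| \\
  &= 0 + 2\,\frac{x \cdot \nabla h}{|x|^2} + h\,\frac{n-2}{|x|^2} \\
  &= (2k + n - 2)\,\frac{h}{|x|^2}.
\end{align*}
% The right-hand side is homogeneous of degree k - 2, which is the decay one expects
% from the Laplacian of a function of rate s = k. However, h log|x| itself grows like
% |x|^k log|x|, which is not O(|x|^k).
```

So at the rate $s = k$ the right-hand side looks admissible, while this particular solution misses the space of rate $k$ by a logarithm.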
Proof of the Hölder case. We will present only the Hölder case since it
seems to be underrepresented in the literature, and it is also simpler. The
computation of the kernel is fairly elementary: let $u \in C^{2,\alpha}_s(\mathbb{R}^n)$ be harmonic.
If $s < 0$, then the maximum principle implies that $u$ is identically
zero. If $s > 0$, gradient estimates for harmonic functions (as in [GT01, Theorem 2.10])
imply that taking any $\lfloor s \rfloor + 1$ partial derivatives of $u$ results in
a new harmonic function lying in $C^0_{s-\lfloor s \rfloor-1}(\mathbb{R}^n)$, which again vanishes
because of the maximum principle. Therefore $u$ must be a polynomial of
degree at most $\lfloor s \rfloor$, which is what we wanted to show. Note that the kernel
is finite dimensional.
Most of the work is in computing the range. First consider the case
$2 - n < s < 0$. In this case, we want to show that $\Delta : C^{2,\alpha}_s \longrightarrow C^{0,\alpha}_{s-2}$
is surjective. Given $f \in C^{0,\alpha}_{s-2}$, we can construct a solution to the Poisson
equation $\Delta u = f$ using the Newtonian potential in the usual way. That is,
we define
\[ u(x) := \frac{1}{(2-n)\omega_{n-1}} \int_{\mathbb{R}^n} |x-y|^{2-n} f(y)\,dy. \]
Note that the decay of $f$ guarantees that the integral is finite. For now,
assume that $f \in C^\infty_c$. Then it is well known that $\Delta u = f$ (e.g., [Eva10,
Section 2.2.1]).
Our main task is to check that this solution $u$ lies in $C^{2,\alpha}_s$. The first step
is to show that $u \in C^0_s(\mathbb{R}^n)$. We have
\[ |u(x)| \le \frac{1}{(n-2)\omega_{n-1}}\,\|f\|_{C^0_{s-2}} \int_{\mathbb{R}^n} |x-y|^{2-n}\,r(y)^{s-2}\,dy, \]
where we recall that $r(x)$ is a smooth positive function equal to $|x|$ outside
a compact set. For simplicity, let us choose the function $r \ge 1$ so that
$r(x) = 1$ for $|x| \le 1$ and $r(x) = |x|$ for $|x| \ge 2$. (Different choices for $r$ lead
to equivalent norms.)
Claim.
\[ \int_{\mathbb{R}^n} |x-y|^{2-n}\,r(y)^{s-2}\,dy \le C\,r(x)^s \]
for some constant $C$. (Throughout this proof, we will use $C$ as a generic
constant that can change from line to line.)
We will estimate the integral by breaking $\mathbb{R}^n$ into three pieces:
\begin{align*}
B_x &= B_{\frac12|x|}(0), \\
A_x &= B_{2|x|}(0) \setminus B_{\frac12|x|}(0), \\
E_x &= \mathbb{R}^n \setminus B_{2|x|}(0).
\end{align*}
Suppose that $|x| \ge 4$, so that $|x| = r(x)$. Then we have
\begin{align*}
\int_{B_x} |x-y|^{2-n}\,r(y)^{s-2}\,dy
  &\le \left(\tfrac12|x|\right)^{2-n} \int_{|y|<\frac12|x|} |y|^{s-2}\,dy \\
  &= C|x|^s,
\end{align*}
where we needed the fact that $s > 2 - n$ in order to integrate $|y|^{s-2}$, and
\begin{align*}
\int_{A_x} |x-y|^{2-n}\,r(y)^{s-2}\,dy
  &\le \int_{A_x} |x-y|^{2-n} \left(\tfrac12|x|\right)^{s-2} dy \\
  &\le \left(\tfrac12|x|\right)^{s-2} \int_{|x-y|<3|x|} |x-y|^{2-n}\,dy \\
  &= C|x|^s, \\
\int_{E_x} |x-y|^{2-n}\,r(y)^{s-2}\,dy
  &\le \int_{|y|>2|x|} \left(\tfrac12|y|\right)^{2-n} |y|^{s-2}\,dy \\
  &= C|x|^s,
\end{align*}
where we needed the fact that $s < 0$ in order to integrate $|y|^{-n+s}$. For the
case $|x| < 4$, we just have to estimate the integrals above by a constant in
order to complete our proof of the Claim. The first two are trivial, and we
leave the third as an exercise. The Claim implies that $u \in C^0_s$, and then we
can invoke the weighted elliptic Hölder estimate (Theorem A.33) to conclude
that
\[ \|u\|_{C^{2,\alpha}_s} \le C\,\|f\|_{C^{0,\alpha}_{s-2}}. \]
For the general case of $f \in C^{0,\alpha}_{s-2}$, we can approximate it by a sequence of
smooth compactly supported functions $f_i$ that converge to $f$ in $C^{0,\alpha}_{s-2}$ and
then solve $\Delta u_i = f_i$ as above. Then the estimate above implies that the
sequence $u_i$ is Cauchy in $C^{2,\alpha}_s$, and then its limit $u$ solves $\Delta u = f$. This completes
the proof in the case $2 - n < s < 0$.
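For the record, the three model integrals can be evaluated in polar coordinates. The sketch below assumes that $\omega_{n-1}$ denotes the area of the unit sphere in $\mathbb{R}^n$ (as the normalization of the Newtonian potential suggests) and takes $|x| \ge 4$. Its only purpose is to show where the conditions $s > 2 - n$ and $s < 0$ enter.

```latex
% B_x: convergence at the origin needs s + n - 2 > 0.
\[
\left(\tfrac12|x|\right)^{2-n} \int_{|y|<\frac12|x|} |y|^{s-2}\,dy
  = \left(\tfrac12|x|\right)^{2-n} \omega_{n-1} \int_0^{\frac12|x|} t^{s+n-3}\,dt
  = \frac{\omega_{n-1}}{s+n-2} \left(\tfrac12|x|\right)^{s}.
\]
% A_x: no condition on s is needed.
\[
\left(\tfrac12|x|\right)^{s-2} \int_{|x-y|<3|x|} |x-y|^{2-n}\,dy
  = \left(\tfrac12|x|\right)^{s-2} \omega_{n-1} \int_0^{3|x|} t\,dt
  = \tfrac92\,\omega_{n-1}\,|x|^2 \left(\tfrac12|x|\right)^{s-2}.
\]
% E_x: convergence at infinity needs s < 0.
\[
\int_{|y|>2|x|} \left(\tfrac12|y|\right)^{2-n} |y|^{s-2}\,dy
  = 2^{n-2}\,\omega_{n-1} \int_{2|x|}^{\infty} t^{s-1}\,dt
  = \frac{2^{n-2}\,\omega_{n-1}}{-s}\,(2|x|)^{s}.
\]
```

Each right-hand side is a constant multiple of $|x|^s$, in agreement with the Claim.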
Next we consider the case $1 - n < s < 2 - n$ (as a simple but representative
case). Let $u \in C^{2,\alpha}_s$, and observe that $\int_{\mathbb{R}^n} \Delta u\,dx = 0$ by the divergence
theorem, since $s < 2 - n$ guarantees that the boundary term at infinity vanishes.
Therefore the $\Delta$ image of $C^{2,\alpha}_s$ in $C^{0,\alpha}_{s-2}$ is annihilated by the constants,
which is the same thing as $\mathcal{H}_{\le 2-n-s}(\mathbb{R}^n) = \mathcal{H}_0(\mathbb{R}^n)$. We must now show
that if $f \in C^{0,\alpha}_{s-2}$ with $\int_{\mathbb{R}^n} f\,dx = 0$, then we can find $u \in C^{2,\alpha}_s$ such that
$\Delta u = f$. We use the same strategy as before, defining $u$ in the same way. As
we saw above, it will suffice to show that for any $f \in C^\infty_c$ with $\int_{\mathbb{R}^n} f\,dx = 0$,
we have
\[ \|u\|_{C^0_s} \le C\,\|f\|_{C^0_{s-2}}. \]
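The vanishing of the boundary term can be sketched as follows. The sketch assumes that membership in $C^{2,\alpha}_s$ gives the pointwise bounds $|\nabla u| \le C r^{s-1}$ and $|\Delta u| \le C r^{s-2}$ (which is what the weighted Hölder norm of Definition A.20, lying outside this passage, is presumed to provide), and it writes $\omega_{n-1}$ for the area of the unit sphere.

```latex
% Divergence theorem on the ball B_R, with outward unit normal \nu:
\[
\left| \int_{B_R} \Delta u\,dx \right|
  = \left| \int_{\partial B_R} \partial_\nu u\,dA \right|
  \le \omega_{n-1} R^{n-1} \cdot C R^{s-1}
  = C\,\omega_{n-1}\,R^{s+n-2} \longrightarrow 0
  \quad \text{as } R \to \infty,
\]
% because s < 2 - n. The same condition makes \Delta u integrable, since
% \int_1^\infty t^{s-2}\,t^{n-1}\,dt < \infty exactly when s + n - 2 < 0.
```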
However, this time we see that the estimate of the integral over $B_x$ fails
since we no longer have $s > 2 - n$. Instead, we make the observation that
\[ u(x) = \frac{1}{(2-n)\omega_{n-1}} \int_{\mathbb{R}^n} \left( |x-y|^{2-n} - r(x)^{2-n} \right) f(y)\,dy, \]
since the second term integrates to zero by hypothesis on $f$. Similar to
before, we will prove the estimate
\[ \int_{\mathbb{R}^n} \left| |x-y|^{2-n} - r(x)^{2-n} \right| r(y)^{s-2}\,dy \le C\,r(x)^s. \]
As before we only treat the case $|x| \ge 4$ and leave the $|x| < 4$ case as an
exercise:
\begin{align*}
\int_{B_x} \left| |x-y|^{2-n} - r(x)^{2-n} \right| r(y)^{s-2}\,dy
  &\le \int_{B_x} C\,|x-y|^{2-n}\,\frac{|y|}{|x|}\,|y|^{s-2}\,dy \\
  &\le C \left(\tfrac12|x|\right)^{1-n} \int_{|y|<\frac12|x|} |y|^{s-1}\,dy \\
  &= C|x|^s,
\end{align*}
where we used the fact that $s > 1 - n$. The first inequality just follows
from the binomial theorem and the fact that $\frac{|y|}{|x|} < \frac12$ in $B_x$. Estimating the
integral over $A_x$ causes no new problems, and for $E_x$, we can use the old
estimate together with an estimate of the new term:
\[ \int_{E_x} r(x)^{2-n}\,r(y)^{s-2}\,dy = |x|^{2-n} \int_{|y|>2|x|} |y|^{s-2}\,dy = C|x|^s, \]
where we used the fact that $s < 2 - n$. Therefore we have all of the estimates
needed to complete the case $1 - n < s < 2 - n$.
Now we can summarize how the argument works in general when $s < 2 - n$.
Using Green’s second identity (i.e., integration by parts), we can
see that the $\Delta$ image of $C^{2,\alpha}_s$ in $C^{0,\alpha}_{s-2}$ is annihilated by every element of
$\mathcal{H}_{\le 2-n-s}(\mathbb{R}^n)$, because the boundary terms vanish quickly enough. Conversely,
for any $f \in C^\infty_c$ annihilated by $\mathcal{H}_{\le 2-n-s}(\mathbb{R}^n)$, we define
\[ u(x) := \frac{1}{(2-n)\omega_{n-1}} \int_{\mathbb{R}^n} |x-y|^{2-n} f(y)\,dy, \]
as before and work to prove the estimate $\|u\|_{C^0_s} \le C\|f\|_{C^0_{s-2}}$. Recall that
the troublesome integral was over the region $B_x$, where $\frac{|y|}{|x|} < \frac12$. In this
region, we can expand the fundamental solution $|x-y|^{2-n}$ as a power series
in $\frac{|y|}{|x|}$. It is not hard to see that this power series must take the form
\[ |x-y|^{2-n} = |x|^{2-n} \sum_{k=0}^{\infty} \left( \frac{|y|}{|x|} \right)^k P_k(\hat{x} \cdot \hat{y}) \]
for some polynomials $P_k$, where $\hat{x} := x/|x|$. In fact, this power series is
precisely the harmonic expansion of $|x-y|^{2-n}$ over $B_x$ in the $y$-variable,
as described in Theorem A.18. Moreover, it is also the exterior harmonic
expansion of $|x-y|^{2-n}$ in the $x$-variable, as described in Corollary A.19. In
fact, it turns out that these $P_k$ are the same polynomials as the ones arising
from zonal harmonics in Section A.1.4, up to constants. We define
\[ \Gamma_s(x, y) := |x-y|^{2-n} - |r(x)|^{2-n} \sum_{k=0}^{\lfloor 2-n-s \rfloor} \left( \frac{|y|}{r(x)} \right)^k P_k(\hat{x} \cdot \hat{y}), \]
where we have replaced $|x|$ by $r(x)$ in order to avoid singular behavior at
$x = 0$ (since what we really care about is the large $|x|$ behavior anyway).
The important point is that $|y|^k P_k(\hat{x} \cdot \hat{y}) \in \mathcal{H}_k(\mathbb{R}^n)$ as a function of $y$, and
hence it annihilates $f$ if $k < 2 - n - s$. Therefore
\[ u(x) = \frac{1}{(2-n)\omega_{n-1}} \int_{\mathbb{R}^n} \Gamma_s(x, y) f(y)\,dy. \]
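For orientation, the power series expansion of $|x-y|^{2-n}$ used in the definition of $\Gamma_s$ can be written explicitly through the classical generating function for the Gegenbauer (ultraspherical) polynomials $C_k^{(\lambda)}$. This identification is standard background rather than something taken from the text, and the text's $P_k$ may be normalized differently. The symbols $z$ and $t$ below are introduced only for this sketch.

```latex
% Set z = |y|/|x| < 1 and t = \hat{x} \cdot \hat{y}. Then |x-y|^2 = |x|^2 (1 - 2tz + z^2), so
\[
|x-y|^{2-n} = |x|^{2-n} \left( 1 - 2tz + z^2 \right)^{-\frac{n-2}{2}}
            = |x|^{2-n} \sum_{k=0}^{\infty} C_k^{\left(\frac{n-2}{2}\right)}(t)\,z^k ,
\]
% by the generating function (1 - 2tz + z^2)^{-\lambda} = \sum_k C_k^{(\lambda)}(t) z^k.
% For n = 3 (\lambda = 1/2) these are the Legendre polynomials:
\[
P_0(t) = 1, \qquad P_1(t) = t, \qquad P_2(t) = \tfrac12 (3t^2 - 1).
\]
```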
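We seek to show that
\[ \int_{\mathbb{R}^n} |\Gamma_s(x, y)|\,r(y)^{s-2}\,dy \le C\,r(x)^s. \]
Again, we will treat the case $|x| \ge 4$.
If $y \in B_x$, then
\[ |\Gamma_s(x, y)| = |x|^{2-n} \left| \sum_{k=\lfloor 3-n-s \rfloor}^{\infty} \left( \frac{|y|}{|x|} \right)^k P_k(\hat{x} \cdot \hat{y}) \right| \le C|x|^{2-n} \left( \frac{|y|}{|x|} \right)^{\lfloor 3-n-s \rfloor}, \]
since $\frac{|y|}{|x|} \le \frac12$. So
\begin{align*}
\int_{B_x} |\Gamma_s(x, y)|\,r(y)^{s-2}\,dy
  &\le C|x|^{2-n} \int_{|y|<\frac12|x|} \left( \frac{|y|}{|x|} \right)^{\lfloor 3-n-s \rfloor} |y|^{s-2}\,dy \\
  &= C|x|^s,
\end{align*}
where the integral can be computed because
\[ \lfloor 3-n-s \rfloor + (s-2) > (2-n-s) + (s-2) = -n. \]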
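Estimating the integral over $A_x$ is straightforward. For $E_x$, the $|x-y|^{2-n}$
term is estimated just as before, while for $y \in E_x$, the other term in the integrand
can be estimated by
\begin{align*}
\left| \Gamma_s(x, y) - |x-y|^{2-n} \right|
  &= |x|^{2-n} \left| \sum_{k=0}^{\lfloor 2-n-s \rfloor} \left( \frac{|y|}{|x|} \right)^k P_k(\hat{x} \cdot \hat{y}) \right| \\
  &\le C|x|^{2-n} \left( \frac{|y|}{|x|} \right)^{\lfloor 2-n-s \rfloor},
\end{align*}
since $\frac{|y|}{|x|} \ge 2$. Therefore
\begin{align*}
\int_{E_x} \left| \Gamma_s(x, y) - |x-y|^{2-n} \right| r(y)^{s-2}\,dy
  &\le C|x|^{2-n} \int_{|y|>2|x|} \left( \frac{|y|}{|x|} \right)^{\lfloor 2-n-s \rfloor} |y|^{s-2}\,dy \\
  &= C|x|^s,
\end{align*}
where the integral can be computed because
\[ \lfloor 2-n-s \rfloor + (s-2) < (2-n-s) + (s-2) = -n. \]
This gives us all of the estimates needed for the $s < 2 - n$ case.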
All that remains is the $s > 0$ case. For this case, instead of expanding the
fundamental solution $|x-y|^{2-n}$ in spherical harmonics for $x$ in the exterior
and $y$ in the interior, we should swap their roles. That is, we write
\[ |x-y|^{2-n} = |y|^{2-n} \sum_{k=0}^{\infty} \left( \frac{|x|}{|y|} \right)^k P_k(\hat{x} \cdot \hat{y}), \]
and define
\[ \Gamma_s(x, y) := |x-y|^{2-n} - |r(y)|^{2-n} \sum_{k=0}^{\lfloor s \rfloor} \left( \frac{|x|}{r(y)} \right)^k P_k(\hat{x} \cdot \hat{y}). \]
Let $f \in C^\infty_c$ and this time we define
\[ u(x) := \frac{1}{(2-n)\omega_{n-1}} \int_{\mathbb{R}^n} \Gamma_s(x, y) f(y)\,dy. \]
Observe the new term that has been subtracted from $|x-y|^{2-n}$ is actually
a harmonic polynomial in $x$, and consequently we still have $\Delta u = f$. As
before, we want to show that $\|u\|_{C^{2,\alpha}_s} \le C\|f\|_{C^{0,\alpha}_{s-2}}$. Keep in mind that since
$\Delta$ now has a kernel, this solution $u$ will not be a unique solution. It is a
specially chosen solution.
For estimating the integrals, the roles of $B_x$ and $E_x$ are in some sense
swapped. In order to estimate $B_x$, we use the fact that for $y \in B_x$, we can
estimate
\[ \left| \Gamma_s(x, y) - |x-y|^{2-n} \right| \le C\,|r(y)|^{2-n} \left( \frac{|x|}{r(y)} \right)^{\lfloor s \rfloor}. \]
Again, the estimate for $A_x$ is straightforward, and the estimate for $y \in E_x$
uses
\[ |\Gamma_s(x, y)| \le C|y|^{2-n} \left( \frac{|x|}{|y|} \right)^{\lfloor s \rfloor + 1} \]
and the fact that $(2-n) - (\lfloor s \rfloor + 1) + (s-2) < -n$. The reader can fill in
the details.
The proof of the Sobolev version of the theorem is fairly similar. The
solution $u$ is defined in the same way, and the expansion of $|x-y|^{2-n}$ is
used in the same way, with the difference being that instead of proving $C^0$
estimates on $u$, one requires $L^p$ estimates on $u$. These are a bit trickier, and
this is what was supplied by Nirenberg and Walker in [NW73, Lemma 2.1].

The next two results go back to work of Meyers [Mey63], who used some
of the same main ideas that went into the proof of Theorem A.35 above.
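Corollary A.37. Let $\alpha \in (0, 1)$ and $\rho > 0$, and let $\tau < s$ be real numbers
not in the exceptional set $\Lambda$. Assume that $u \in C^{2,\alpha}_s(\mathbb{R}^n \setminus B_\rho(0))$ and
$\Delta u \in C^{0,\alpha}_{\tau-2}(\mathbb{R}^n \setminus B_\rho(0))$. Then there exists $h_k \in \mathcal{H}_k(\mathbb{R}^n \setminus \{0\})$ for
$k = \lfloor \tau \rfloor + 1, \ldots, \lfloor s \rfloor$ such that
\[ u - \sum_{k=\lfloor \tau \rfloor+1}^{\lfloor s \rfloor} h_k \in C^{2,\alpha}_\tau(\mathbb{R}^n \setminus B_\rho(0)), \]
where we simply ignore the sum as vacuous if $\lfloor \tau \rfloor + 1 > \lfloor s \rfloor$ and also ignore
terms coming from $k \notin \Lambda$.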
Proof. Consider $u$ as in the statement of the corollary, and let $f$ be any
smooth function on all of $\mathbb{R}^n$ that agrees with $\Delta u$ outside a compact set.
By Theorem A.35, it is clear that $f$ can be chosen to lie in the range of
$\Delta : C^{2,\alpha}_\tau(\mathbb{R}^n) \longrightarrow C^{0,\alpha}_{\tau-2}(\mathbb{R}^n)$ by altering it on a compact set. So there is
some $v \in C^{2,\alpha}_\tau(\mathbb{R}^n)$ with the property that $\Delta v = \Delta u$ outside a compact set.
Thus $u - v \in C^{2,\alpha}_s(\mathbb{R}^n \setminus B_\rho(0))$ and is also harmonic outside a compact set $K$.
If $s < 0$, we can expand $u - v$ outside $K$ as in Corollary A.19, completing the
proof by truncating all terms of degree less than $\tau$ in the expansion of $u - v$.
If $s > 0$, we simply have to subtract off a harmonic polynomial of degree
less than $s$ before potentially making the same argument (if $\tau < 0$).
The following limited version of the previous corollary for operators that
are asymptotic to Δ will be useful to us.
Corollary A.38. Let $\rho > 0$, and let $\tau < s$ be real numbers not in the
exceptional set $\Lambda$. Let $(M = \mathbb{R}^n \setminus B_\rho(0), g)$ be an asymptotically flat manifold
with boundary, and consider $L$ and $\alpha$ as in Assumption A.30.
Assume that $u \in C^{2,\alpha}_s(M)$ and $Lu \in C^{0,\alpha}_{\tau-2}(M)$.
• If $(\tau, s) \cap \Lambda = \emptyset$, then $u \in C^{2,\alpha}_\tau(M)$.
• Otherwise, if $k$ is the largest element of $(\tau, s) \cap \Lambda$, then there exists
$h \in \mathcal{H}_k(\mathbb{R}^n \setminus \{0\})$ and $\gamma > 0$ such that
\[ u - h \in C^{2,\alpha}_{k-\gamma}(M). \]
• Moreover, if $k$ is the largest element of $(\tau, s) \cap \Lambda$, and we further
assume that $\tau < k - 1$, and that the $\delta$ in Assumption A.30 and the
asymptotic decay rate of $g$ (as in Definition 3.5) are both greater
than 1, then we obtain the stronger result that
\[ u - h \in C^{2,\alpha}_{k-1}(M). \]
Remark A.39. We will most often use the case when $2 - n < s < 0$ and
$\tau < 2 - n$, so that the largest element of $(\tau, s) \cap \Lambda$ is $2 - n$, and thus the
conclusion will be that for some constant $A$ and some $\gamma > 0$,
\[ u(x) - A|x|^{2-n} \in C^{2,\alpha}_{2-n-\gamma}(M). \]
Proof. We will show that we can improve the initial rate $s$. Let $f = Lu \in C^{0,\alpha}_{\tau-2}$
so that we can rewrite the equation $Lu = f$ as
\[ \Delta u = (\Delta_g - \Delta)u + V^i\partial_i u + qu - f, \]
where $\Delta$ is the Euclidean Laplacian. Without loss of generality, let us assume
that the $\delta$ from Assumption A.30 is less than the asymptotic decay rate of
$g$. Then we can see that $(\Delta_g - \Delta)u + V^i\partial_i u + qu \in C^{0,\alpha}_{s-2-\delta}$, and consequently
we have for any $s < 0$,
\[ u \in C^{2,\alpha}_s(M) \implies \Delta u \in C^{0,\alpha}_{\max(s-2-\delta,\,\tau-2)}(M). \tag{A.5} \]
If we combine this implication with Corollary A.37, we see that we can easily
bootstrap our way to the first two assertions of Corollary A.38. However,
since Corollary A.37 will never give us better than $u \in C^{2,\alpha}_k(M)$, we see that
the bootstrapping argument must terminate here. But in this case, we can
still see that $\Delta u \in C^{0,\alpha}_{\max(k-2-\delta,\,\tau-2)}(M)$. By applying Corollary A.37 one last
time, we can see that specifically, the $\gamma$ in the conclusion of the corollary
can be chosen so that $k - \gamma$ is the maximum of $\tau$, $k - \delta$ (keeping in mind
that $\delta$ is less than the asymptotic decay rate), and the next largest element
of $\Lambda$ (which is $k - 1$, except when $k = 0$). In particular, if $\tau < k - 1$ and
$\delta > 1$, we see that the third assertion of Corollary A.38 follows.
Next we generalize Theorem A.35 to asymptotically flat manifolds.

Theorem A.40. Let $(M^n, g)$ be a complete asymptotically flat manifold.
Let $\Delta_g$ denote the $g$-Laplacian on $M$, and let $s$ be any real number not in
the exceptional set $\Lambda$ from Definition A.17.
Then for any $p > 1$, the map
\[ \Delta_g : W^{2,p}_s(M) \longrightarrow L^p_{s-2}(M) \]
is Fredholm. More precisely, the dimension of the kernel is $\dim \mathcal{H}_{\le s}(\mathbb{R}^n)$
and the dimension of the cokernel is $\dim \mathcal{H}_{\le 2-n-s}(\mathbb{R}^n)$. In particular, if
$2 - n < s < 0$, the map is an isomorphism.
We also have the exact same results for the map
\[ \Delta_g : C^{2,\alpha}_s(M) \longrightarrow C^{0,\alpha}_{s-2}(M) \]
for any $\alpha \in (0, 1)$.
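As a sanity check on the dimension count, the following sketch tabulates the case $n = 3$. It assumes that $\mathcal{H}_{\le s}(\mathbb{R}^n)$ is spanned by harmonic polynomials of degree at most $s$, which agrees with the kernel computation in the proof of Theorem A.35 (Definition A.17 itself lies outside this passage), and it uses the classical dimension formula for homogeneous harmonic polynomials of degree $k$.

```latex
% Classical dimension formula for homogeneous harmonic polynomials of degree k on R^n
% (the second binomial coefficient is read as 0 for k < 2):
\[
\dim \mathcal{H}_k(\mathbb{R}^n) = \binom{n+k-1}{k} - \binom{n+k-3}{k-2},
\qquad\text{so}\qquad \dim \mathcal{H}_k(\mathbb{R}^3) = 2k+1 .
\]
% Hence, for n = 3 and non-integer s,
\[
\dim \mathcal{H}_{\le s}(\mathbb{R}^3) = \sum_{k=0}^{\lfloor s \rfloor} (2k+1) = (\lfloor s \rfloor + 1)^2
\quad (s > 0), \qquad
\dim \mathcal{H}_{\le s}(\mathbb{R}^3) = 0 \quad (s < 0).
\]
% With 2 - n - s = -1 - s, the theorem then predicts:
\[
\begin{array}{c|c|c}
s \in & \dim\ker & \dim\operatorname{coker} \\ \hline
(-3,-2) & 0 & 4 \\
(-2,-1) & 0 & 1 \\
(-1,0)  & 0 & 0 \\
(0,1)   & 1 & 0 \\
(1,2)   & 4 & 0
\end{array}
\]
```

The middle row is the isomorphism range $2 - n < s < 0$, and the index changes each time $s$ crosses an integer.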
For the Sobolev case, Cantor [Can81, Corollary 6.5], Lockhart [Loc81,
Theorem 6.2], and McOwen [McO80], all saw how the Fredholm property
of the Laplacian could be transferred to other operators that are asymptotic
to the Euclidean Laplacian, which is the basic idea behind Theorem A.40,
and then the dimensions of the kernel and cokernel can be computed using
the self-adjointness. See also [Bar86, Proposition 2.2]. For the Hölder case,
as mentioned, the 2 − n < s < 0 case is treated in [CSCB79], but the full
result seems hard to find.
We will use the following lemma to prove Theorem A.40. It is an im-
provement of the estimates of Theorems A.32 and A.33 when the decay rate
s is nonexceptional, but note that it does not come with a corresponding
“regularity of decay” result as in Theorems A.32 and A.33.
Lemma A.41. Let $(M, g)$ be a complete asymptotically flat manifold, and
let $s$ be any real number not in the exceptional set $\Lambda$. For large $\rho > 1$, let
$K_\rho$ be the compact subset of $M$ enclosed by the sphere $|x| = \rho$.
For $p > 1$, there exists a constant $C$ and a $\rho > 1$ such that for any
$u \in W^{2,p}_s(M)$,
\[ \|u\|_{W^{2,p}_s(M)} \le C\left( \|\Delta_g u\|_{L^p_{s-2}(M)} + \|u\|_{L^1(K_\rho)} \right). \]
For $\alpha \in (0, 1)$, there exists a constant $C$ and a $\rho > 1$ such that for any
$u \in C^{2,\alpha}_s(M)$,
\[ \|u\|_{C^{2,\alpha}_s(M)} \le C\left( \|\Delta_g u\|_{C^{0,\alpha}_{s-2}(M)} + \|u\|_{L^1(K_\rho)} \right). \]
Proof. Again, we will only present the proof for Hölder spaces, which is essentially
the same as the proof for Sobolev spaces in [Bar86, Theorem 1.10].
As usual, we use a generic constant $C$ that can change from line to line.
First let us prove the result for the Euclidean Laplacian $\Delta$ on $\mathbb{R}^n$. For
$s < 0$, $\Delta$ is injective, and our proof of Theorem A.35 showed directly that
for all $u \in C^{2,\alpha}_s(\mathbb{R}^n)$,
\[ \|u\|_{C^{2,\alpha}_s(\mathbb{R}^n)} \le C\,\|\Delta u\|_{C^{0,\alpha}_{s-2}(\mathbb{R}^n)}. \]
So we consider the case $s > 0$. In this case we do not have the above bound
(since injectivity fails), but our proof of Theorem A.35 showed that for
every $u \in C^{2,\alpha}_s(\mathbb{R}^n)$, there exist $v, h \in C^{2,\alpha}_s(\mathbb{R}^n)$ such that $u = v + h$, $h$ is a harmonic
polynomial, and
\[ \|v\|_{C^{2,\alpha}_s(\mathbb{R}^n)} \le C\,\|\Delta u\|_{C^{0,\alpha}_{s-2}(\mathbb{R}^n)}. \]
Given any $\rho > 0$, we want to prove that there exists $C$ such that the bound
\[ \|u\|_{C^{2,\alpha}_s(\mathbb{R}^n)} \le C\left( \|\Delta u\|_{C^{0,\alpha}_{s-2}(\mathbb{R}^n)} + \|u\|_{L^1(B_\rho)} \right) \]
holds for all $u \in C^{2,\alpha}_s(\mathbb{R}^n)$.
Suppose it does not. Then we can find a sequence
$u_i \in C^{2,\alpha}_s(\mathbb{R}^n)$ such that $\|u_i\|_{C^{2,\alpha}_s(\mathbb{R}^n)} = 1$ while both $\|\Delta u_i\|_{C^{0,\alpha}_{s-2}(\mathbb{R}^n)}$ and
$\|u_i\|_{L^1(B_\rho)}$ converge to zero. Construct $v_i$ and $h_i$ as described above. Then
\[ \|v_i\|_{C^{2,\alpha}_s(\mathbb{R}^n)} \le C\,\|\Delta v_i\|_{C^{0,\alpha}_{s-2}(\mathbb{R}^n)} = C\,\|\Delta u_i\|_{C^{0,\alpha}_{s-2}(\mathbb{R}^n)} \to 0. \tag{A.6} \]
In particular, $v_i \to 0$ in $L^1(B_\rho)$, and thus $h_i \to 0$ in $L^1(B_\rho)$ as well. The key
point is that $\mathcal{H}_{\le s}(\mathbb{R}^n)$ is a finite-dimensional space such that every nontrivial
element is nonvanishing on $B_\rho$, so there must exist $C$ such that
\[ \|h\|_{C^{2,\alpha}_s(\mathbb{R}^n)} \le C\,\|h\|_{L^1(B_\rho)} \]
for all $h \in \mathcal{H}_{\le s}(\mathbb{R}^n)$. In particular, this implies that $h_i \to 0$ in $C^{2,\alpha}_s(\mathbb{R}^n)$.
Combining this with (A.6) contradicts the assumption that $\|u_i\|_{C^{2,\alpha}_s(\mathbb{R}^n)} = 1$.
Hence, we have our desired estimate
\[ \|u\|_{C^{2,\alpha}_s(\mathbb{R}^n)} \le C\left( \|\Delta u\|_{C^{0,\alpha}_{s-2}(\mathbb{R}^n)} + \|u\|_{L^1(B_\rho)} \right). \]
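Next we show how the estimate for $\Delta$ on $\mathbb{R}^n$ implies the desired estimate
for $\Delta_g$ on $M$. By the elliptic estimate (Theorem A.33), we have
\[ \|u\|_{C^{2,\alpha}_s(M)} \le C\left( \|\Delta_g u\|_{C^{0,\alpha}_{s-2}(M)} + \|u\|_{C^0_s(M)} \right) \]
for any $u \in C^{2,\alpha}_s(M)$. So to prove our desired result, it is sufficient to show
that
\[ \|u\|_{C^0_s(M)} \le \epsilon\,\|u\|_{C^{2,\alpha}_s(M)} + C\,\|\Delta_g u\|_{C^{0,\alpha}_{s-2}(M)} + C(\epsilon)\,\|u\|_{L^1(K_\rho)} \]
for any $u \in C^{2,\alpha}_s(M)$.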

For large $\rho > 1$, let $\chi_\rho$ be a nonnegative cutoff function that is 1 on $K_{\rho/2}$
and 0 outside $K_\rho$, and decompose
$u = u_0 + u_1$, where
\[ u_0 := \chi_\rho u, \qquad u_1 := (1 - \chi_\rho)u. \]
So we just need to bound $\|u_0\|_{C^0_s}$ and $\|u_1\|_{C^0_s}$. The $u_0$ bound follows immediately
from an interpolation inequality on the compact space $K_\rho$,
\[ \|u_0\|_{C^0_s(M)} \le C\,\|u\|_{C^0(K_\rho)} \le \epsilon\,\|u\|_{C^{2,\alpha}(K_\rho)} + C(\epsilon)\,\|u\|_{L^1(K_\rho)}, \]
where $C$ is allowed to depend on $\rho$.
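For the $u_1$ bound, we use the estimate we just proved on Euclidean
space, but for the fixed ball $B_1$. Since $u_1$ is supported outside that ball, we
have
\[ \|u_1\|_{C^{2,\alpha}_s(\mathbb{R}^n)} \le C\,\|\Delta u_1\|_{C^{0,\alpha}_{s-2}(\mathbb{R}^n)}, \]
with this $C$ being independent of $\rho$. Therefore (with changing $C$ from line
to line),
\begin{align*}
\|u_1\|_{C^{2,\alpha}_s(M)} &\le C\,\|u_1\|_{C^{2,\alpha}_s(\mathbb{R}^n)} \\
  &\le C\,\|\Delta u_1\|_{C^{0,\alpha}_{s-2}(\mathbb{R}^n)} \\
  &\le C\left( \|\Delta_g u_1\|_{C^{0,\alpha}_{s-2}(\mathbb{R}^n)} + \|(\Delta - \Delta_g)u_1\|_{C^{0,\alpha}_{s-2}(\mathbb{R}^n)} \right).
\end{align*}
The asymptotic flatness of $g$ implies that for sufficiently large $\rho$, the second
term on the right can be absorbed into the left side to obtain
\[ \|u_1\|_{C^{2,\alpha}_s(M)} \le C\,\|\Delta_g u_1\|_{C^{0,\alpha}_{s-2}(M)}. \]
Expand $\Delta_g u_1 = -(\Delta_g\chi_\rho)u - 2\langle\nabla\chi_\rho, \nabla u\rangle + (1 - \chi_\rho)\Delta_g u$. Since all derivatives
of $\chi_\rho$ are bounded in the annular region $K_\rho \setminus K_{\rho/2}$ and zero outside that
region, it is not hard to see that
\[ \|\Delta_g u_1\|_{C^{0,\alpha}_{s-2}(M)} \le \|\Delta_g u\|_{C^{0,\alpha}_{s-2}(M)} + C\,\|u\|_{C^{1,\alpha}(K_\rho)}. \]
By interpolation on the compact space $K_\rho$, we can bound
\[ \|u\|_{C^{1,\alpha}_s(K_\rho)} \le \epsilon\,\|u\|_{C^{2,\alpha}_s(K_\rho)} + C(\epsilon)\,\|u\|_{L^1(K_\rho)}. \]
Putting it all together yields the desired result.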
Proof of the Hölder case of Theorem A.40. First we will prove that
\[ \Delta_g : C^{2,\alpha}_s(M) \longrightarrow C^{0,\alpha}_{s-2}(M) \]
has closed range. Suppose we have a sequence $u_i \in C^{2,\alpha}_s$ such that $\Delta_g u_i$
converges to some function $f$ in $C^{0,\alpha}_{s-2}$. Choose $\rho$ large enough so that
Lemma A.41 holds, and define $K_\rho$ as we did there. By Arzelà-Ascoli, we
can easily find a subsequence of $u_i$ that converges in $L^1(K_\rho)$. Then by the
estimate in Lemma A.41, it follows that this subsequence is Cauchy in $C^{2,\alpha}_s$
and therefore converges to some $u$ in $C^{2,\alpha}_s$. Thus $\Delta_g u_i$ converges to $\Delta_g u$,
and hence $f = \Delta_g u$, proving that $f$ lies in the range.
Next we have to compute the dimensions of the kernel and cokernel.
Let $\ker(\Delta_g, s)$ and $\operatorname{im}(\Delta_g, s-2)$ denote the kernel and image of the operator
$\Delta_g : C^{2,\alpha}_s \longrightarrow C^{0,\alpha}_{s-2}$, respectively. (We choose this convention to help
remind us about the decay rates of the elements.) Let $(\operatorname{im}(\Delta_g, s-2))^\perp$ denote
the annihilator of $\operatorname{im}(\Delta_g, s-2)$ in the dual space $(C^{0,\alpha}_{s-2})^*$, so that the
codimension of $\operatorname{im}(\Delta_g, s-2)$ is just $\dim(\operatorname{im}(\Delta_g, s-2))^\perp$.
We will prove the theorem first by showing that for all $s \notin \Lambda$,
\[ \ker(\Delta_g, s) = (\operatorname{im}(\Delta_g, -n-s))^\perp, \]
where $\ker(\Delta_g, s)$ will be regarded as a subspace of $(C^{0,\alpha}_{-n-s})^*$ via integration
against test functions. Note that this is the same relationship between the
dimensions of the kernels and cokernels that we proved for the Euclidean
Laplacian in Theorem A.35. Then to complete the proof, we will show that
\[ \dim\ker(\Delta_g, s) = \dim \mathcal{H}_{\le s}(\mathbb{R}^n). \]
By Corollary A.38, every element of $\ker(\Delta_g, s)$ actually lies in $C^{2,\alpha}_k$ where
$k = \lfloor s \rfloor < s$, so indeed we have $\ker(\Delta_g, s) \subset (C^{0,\alpha}_{-n-s})^*$, and it is easy to see
using Green’s second identity that $\ker(\Delta_g, s)$ does annihilate $\operatorname{im}(\Delta_g, -n-s)$,
that is, $\ker(\Delta_g, s) \subset (\operatorname{im}(\Delta_g, -n-s))^\perp$. To prove the reverse inclusion, let
$f \in (\operatorname{im}(\Delta_g, -n-s))^\perp$. In particular, this means that $f$ annihilates $\Delta_g v$
for all $v \in C^\infty_c(M)$, that is, $\Delta_g f = 0$ in the weak sense. By interior elliptic
regularity (Theorem A.5), $f$ is represented by a smooth function. And since
$r^{-n-s} \in C^{0,\alpha}_{-n-s}$, it follows that $f \in L^1_s$. By Corollary A.34, it follows that
$f \in \ker(\Delta_g, s)$. Hence, $(\operatorname{im}(\Delta_g, -n-s))^\perp \subset \ker(\Delta_g, s)$.
Next, we compute the dimensions of the kernels. When $s < 0$, the
maximum principle implies injectivity, and we are done. For each nonnegative
integer $k$, we consider $s \in (k, k+1)$. If $u \in \ker(\Delta_g, s)$, then by
Corollary A.38, there exists $h \in \mathcal{H}_k$ such that $u - h \in C^{2,\alpha}_{k-\gamma}$ outside some
compact set for some $\gamma > 0$. Arguing inductively (on $k$), this can be used to
set up an injection from $\ker(\Delta_g, s)/\ker(\Delta_g, s-1)$ to $\mathcal{H}_k$, and consequently,
for all $s > 0$ (but not an integer), we have
\[ \dim\ker(\Delta_g, s) \le \dim \mathcal{H}_{\le s}. \]
For the reverse inequality, we use the same reasoning as above, but in reverse.
For each nonnegative integer $k$, we consider $s \in (k, k+1)$. For each $h \in \mathcal{H}_k$,
choose a function $h_0$ on $M$ such that $h_0$ equals $h$ outside a compact set.
Then $\Delta_g h_0 \in C^{0,\alpha}_{k-2-\gamma}(M)$ for some $\gamma > 0$. From our earlier arguments,
we know that $(\operatorname{im}(\Delta_g, k-2-\gamma))^\perp = \ker(\Delta_g, 2+\gamma-k-n) = 0$, which
gives us the surjectivity we need to produce $v \in C^{2,\alpha}_{k-\gamma}$ solving $\Delta_g v = \Delta_g h_0$.
Then setting $u = h_0 - v$ gives us an element of $\ker(\Delta_g, s)$. This procedure
going from $h$ to $u$ sets up an injection from $\mathcal{H}_k$ to $\ker(\Delta_g, s)/\ker(\Delta_g, s-1)$.
Inducting on $k$ gives the desired inequality.
By the same reasoning used to prove Corollary A.9, we have the following
immediate consequence of Theorem A.40.
Corollary A.42. Let $(M^n, g)$ be a complete asymptotically flat manifold,
and let $s$ be any real number not in the exceptional set $\Lambda$.
Consider $L$ as in Assumption A.29. Then for any $p > 1$,
\[ L : W^{2,p}_s(M) \longrightarrow L^p_{s-2}(M) \]
is a Fredholm operator whose index is the same as that of the Euclidean
Laplacian $\Delta : W^{2,p}_s(\mathbb{R}^n) \longrightarrow L^p_{s-2}(\mathbb{R}^n)$.
Consider $L$ and $\alpha$ as in Assumption A.30. Then
\[ L : C^{2,\alpha}_s(M) \longrightarrow C^{0,\alpha}_{s-2}(M) \]
is also a Fredholm operator with the same index as above.
A.3. Inverse function theorem and Lagrange multipliers

In this book we mostly avoid direct use of serious techniques of nonlinear
partial differential equations, but we do make use of the inverse function
theorem, which allows us to understand solutions of nonlinear problems
that are close to known solutions.
Theorem A.43 (Inverse function theorem). Let $X$ and $Y$ be Banach spaces,
and let $B_{r_0}(x_0)$ be a ball in $X$. Suppose that $F : B_{r_0}(x_0) \longrightarrow Y$ is differentiable
in the sense that the linearization $DF|_x : X \longrightarrow Y$ exists at each
$x \in B_{r_0}(x_0)$, and that $DF|_x$ is Lipschitz in $x$ with Lipschitz constant $C_L$.
Also assume that
\[ DF|_{x_0} : X \longrightarrow Y \]
is an isomorphism with injectivity estimate
\[ \|v\| \le C_I\,\|DF|_{x_0}(v)\| \]
for some constant $C_I$ independent of $v \in X$. Then there exists a $C^1$ inverse
function $F^{-1}$ defined on the ball of radius
\[ s = \min\left( (4C_I^2 C_L)^{-1},\ (2C_I)^{-1} r_0 \right) \]
around $y_0 = F(x_0)$, and its image lies in the ball of radius
\[ r = \min\left( (2C_I C_L)^{-1},\ r_0 \right) \]
around $x_0$.
Remark A.44. The derivative $DF$ need only be continuous at $x_0$ for the
result to hold. In other words, our version assumes that $F$ is $C^{1,1}$, but only
$C^1$ is needed. We state it here with a Lipschitz hypothesis so that we can
easily describe the size of the ball on which $F^{-1}$ exists. In general, it would
depend on the modulus of continuity of $DF$ at $x_0$. See the proof below to
see why.
Proof. Given $y \in B_s(y_0)$, we would like to solve the equation $F(x) = y$,
which we can rephrase as looking for a fixed point of the map
\[ G(x) = x + L^{-1}(y - F(x)), \tag{A.7} \]
where $L := DF|_{x_0}$. We will prove that it has a fixed point using the contraction
mapping principle [Wik, Banach fixed-point theorem]. (Recall that
the fixed point is found by simply iterating $G$ and extracting a limit. Essentially,
the proof of the inverse function theorem is just an implementation
of the usual Newton’s method taught in single-variable calculus.) Differentiating
the expression for $G$, we obtain
\[ DG|_x = I - L^{-1} DF|_x = L^{-1}(DF|_{x_0} - DF|_x). \]
By our assumptions, we have
\[ \|DG|_x\| \le C_I C_L \|x - x_0\| \le \tfrac12 \]
for $\|x - x_0\| \le (2C_I C_L)^{-1}$. By the mean value theorem on Banach spaces,
it follows that for $x_1, x_2 \in X$ in the ball of radius $r = \min\left( (2C_I C_L)^{-1}, r_0 \right)$
around $x_0$, we have
\[ \|G(x_1) - G(x_2)\| \le \tfrac12 \|x_1 - x_2\|. \tag{A.8} \]
Thus $G$ is a contraction mapping, but we also have to make sure that
$B_r(x_0)$ is preserved by $G$. If $y \in B_s(y_0)$, then $G(x_0) - x_0 = -L^{-1}(y_0 - y)$,
and therefore $G(x_0)$ lies in the ball of radius $C_I s$ around $x_0$. So if we take
$s = (2C_I)^{-1} r$, we see that $\|G(x_0) - x_0\| \le r/2$ and consequently, for each
$x \in B_r(x_0)$, we have
\[ \|G(x) - x_0\| \le \|G(x) - G(x_0)\| + \|G(x_0) - x_0\| \le r. \]
Therefore we can invoke the contraction mapping principle on $B_r(x_0)$ to find
the desired fixed point.
In summary, for each $y \in B_s(y_0)$, there exists $x \in B_r(x_0)$ such that
$F(x) = y$. To see why this inverse function $F^{-1}$ is continuous, we have
\[ G(x_1) - G(x_2) = x_1 - x_2 - L^{-1}(F(x_1) - F(x_2)) \]
for all $x_1, x_2 \in B_r(x_0)$. Invoking (A.8) and the definition of $C_I$, we have
\[ \|x_1 - x_2\| \le C_I\,\|F(x_1) - F(x_2)\| + \tfrac12 \|x_1 - x_2\|, \]
which is equivalent to
\[ \|x_1 - x_2\| \le 2C_I\,\|F(x_1) - F(x_2)\|. \]
In particular, this tells us that $\|F^{-1}(y_1) - F^{-1}(y_2)\| \le 2C_I \|y_1 - y_2\|$ for all
$y_1, y_2 \in B_s(y_0)$.
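To see the constants of Theorem A.43 in action, here is a one-dimensional sketch with $X = Y = \mathbb{R}$, $F(x) = x^2$, $x_0 = 1$, and $r_0 = 1$. This example is an illustration chosen here, not one from the text.

```latex
% Data: DF|_x = 2x, so C_L = 2; L = DF|_{x_0} = 2, so |v| <= C_I |2v| with C_I = 1/2.
\[
r = \min\left( (2 C_I C_L)^{-1},\, r_0 \right) = \min\left( \tfrac12,\, 1 \right) = \tfrac12,
\qquad
s = \min\left( (4 C_I^2 C_L)^{-1},\, (2 C_I)^{-1} r_0 \right) = \min\left( \tfrac12,\, 1 \right) = \tfrac12 .
\]
% The map (A.7) and its derivative:
\[
G(x) = x + \tfrac12\,(y - x^2), \qquad G'(x) = 1 - x, \qquad |G'(x)| \le \tfrac12 \ \text{ for } |x - 1| \le \tfrac12 .
\]
% Iterating from x_0 = 1 with y = 1.21 (whose square root is 1.1):
\[
G(1) = 1.105, \qquad G(1.105) = 1.105 + \tfrac12\,(1.21 - 1.221025) = 1.0994875, \qquad \ldots \to 1.1 .
\]
```

Consistently with the theorem, $F^{-1}(y) = \sqrt{y}$ maps the interval $|y - 1| < \tfrac12$ into $|x - 1| < \tfrac12$.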
Exercise A.45. Finish the proof above by showing that for each $y \in B_s(y_0)$,
$(DF^{-1})|_y = \left( DF|_{F^{-1}(y)} \right)^{-1}$ and depends continuously on $y$.
Next we have the local surjectivity theorem.

Theorem A.46 (Local surjectivity theorem). Let $X$ and $Y$ be Banach
spaces, and let $F$ be a $C^1$ map from a neighborhood of $x_0$ to $Y$. If $DF|_{x_0} : X \longrightarrow Y$
is surjective, then $F$ surjects every neighborhood of $x_0$ onto some
small neighborhood of $F(x_0)$.
Note that unlike the implicit function theorem in finite dimensions, this
theorem does not easily follow from the inverse function theorem, since
$\ker DF|_{x_0}$ need not have a complementing subspace. If it does (for example,
if $\ker DF|_{x_0}$ is finite dimensional), the result follows from applying the inverse
function theorem to the augmented map $(F, \Pi) : X \longrightarrow Y \times \ker DF|_{x_0}$,
where $\Pi$ is the projection onto $\ker DF|_{x_0}$.
Proof. Without loss of generality, let us take $x_0 = 0$ and $F(0) = 0$ for simplicity.
The hypotheses of the theorem say that $F$ is a $C^1$ map from $B_{r_0}(0)$
to $Y$ for some $r_0 > 0$, but we will present the proof with the stronger hypothesis
that $DF|_x$ is Lipschitz in $x$ on $B_{r_0}(0)$, that is, $F$ is $C^{1,1}$. We do this
partly so that we can more easily describe how large the neighborhoods are,
and partly so that the proof will be analogous to our proof of Theorem A.43,
which was also written with the $C^{1,1}$ hypothesis. The modifications needed
for the more general $C^1$ hypothesis are left as an exercise.
The basic idea of this proof is essentially the same as that of the inverse
function theorem, except that we set it up as an iteration scheme and instead
of applying the inverse map $(DF|_0)^{-1}$, we will need to choose an appropriate
preimage under $DF|_0$ at each step.
Since $DF|_0$ is surjective, it induces a Banach space isomorphism
$L : X/K \to Y$, where $K = \ker DF|_0$. (Recall that the norm of $X/K$ is defined
by taking the infimum of norms over the equivalence class.) Let $C_L$ be the
Lipschitz constant of $DF$ as in Theorem A.43, and let $C_I = \|L^{-1}\|$, where
$L^{-1} : Y \to X/K$ is the bounded inverse map of $L$. Define
\begin{align*}
r &= \min\big((2 C_I C_L)^{-1}, r_0\big), \\
s &= \min\big((16 C_I^2 C_L)^{-1}, (8 C_I)^{-1} r_0\big).
\end{align*}
Given $y \in B_s(0) \subset Y$, we would like to solve the equation $F(x) = y$ for
some $x \in B_r(0) \subset X$, which can now be rephrased as saying
\[
L^{-1}(y - F(x)) = 0.
\]
We will recursively construct a sequence of pairs $(x_n, P_n)$, where
$x_n \in B_r(0) \subset X$ and $P_n = x_n + K$ is its corresponding equivalence
class in $X/K$. We start with $(x_0, P_0) = (0, K)$, and given $(x_n, P_n)$, we
construct $(x_{n+1}, P_{n+1})$ as follows:
\[
P_{n+1} := P_n + L^{-1}(y - F(x_n)). \tag{A.9}
\]
(Compare this with (A.7).) Next we choose $x_{n+1}$ to be any element of the
class $P_{n+1}$ such that
\[
\|x_{n+1} - x_n\| \le 2\,\|P_{n+1} - P_n\|.
\]
The fact that such a choice can be made is a straightforward consequence of the
definition of the norm on $X/K$. (We will check later that $x_{n+1}$ stays in
$B_r(0)$.) Since $x_n \in P_n$, we have $P_n = L^{-1} DF|_0(x_n)$, so (A.9) can
be rewritten as
\[
P_{n+1} = L^{-1}\big(DF|_0(x_n) + y - F(x_n)\big),
\]
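and thus, for $n \ge 1$,
\begin{align*}
P_{n+1} - P_n &= L^{-1}\big[DF|_0(x_n - x_{n-1}) - (F(x_n) - F(x_{n-1}))\big] \\
&= L^{-1}\big[DF|_0(x_n - x_{n-1}) - DF|_u(x_n - x_{n-1})\big]
\end{align*}
for some $u$ on the segment between $x_{n-1}$ and $x_n$, by the mean value
theorem. By our Lipschitz assumption, $\|DF|_u - DF|_0\| \le C_L \|u\|$, so as
long as $x_{n-1}$ and $x_n$ lie in $B_{r/2}(0)$, we have
$\|DF|_u - DF|_0\| \le \tfrac{1}{2} C_L r$, and hence, for $n \ge 1$,
\begin{align*}
\|x_{n+1} - x_n\| &\le 2\,\|P_{n+1} - P_n\| \\
&= 2\,\big\|L^{-1}(DF|_0 - DF|_u)(x_n - x_{n-1})\big\| \\
&\le C_I C_L r\,\|x_n - x_{n-1}\| \\
&\le \tfrac{1}{2}\,\|x_n - x_{n-1}\|.
\end{align*}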
For the first step, we have
$\|x_1 - x_0\| \le 2\,\|P_1\| \le 2 C_I s \le \tfrac{1}{4} r$, so we can now verify
(by induction) that $x_n$ stays in $B_{r/2}(0)$ for all $n$. The above computation
also tells us that $x_n$ is a Cauchy sequence, and thus $x_n$ converges to some
$x \in B_r(0)$. Consequently, $P_n \to x + K$ as well, so that taking the limit
of (A.9) gives us $L^{-1}(y - F(x)) = 0$, as desired.
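The iteration (A.9) can be illustrated in a finite-dimensional setting. The sketch below is an assumption-laden toy, not the general Banach space argument: it takes X = R^2 and Y = R with Euclidean norms, uses the hypothetical map F(a, b) = a + 2b + ab, and introduces the helper names `min_norm_preimage` and `solve_by_lifting`. In this setting the kernel is of course complemented, so the example only shows the mechanics of choosing a preimage under DF|_0 at each step; the minimum-norm preimage attains the quotient norm, so the requirement that the step be at most twice the norm of P_{n+1} - P_n holds automatically.

```python
import math

def min_norm_preimage(row, t):
    """Return the minimum-Euclidean-norm v with row . v == t.

    `row` plays the role of DF|_0 viewed as a 1 x n matrix (assumed nonzero).
    """
    norm_sq = sum(c * c for c in row)
    return [t * c / norm_sq for c in row]

def solve_by_lifting(F, row, y, tol=1e-12, max_iter=200):
    """Iterate x_{n+1} = x_n + (a preimage of y - F(x_n) under DF|_0), from x_0 = 0."""
    x = [0.0] * len(row)
    for _ in range(max_iter):
        step = min_norm_preimage(row, y - F(x))
        x = [xi + si for xi, si in zip(x, step)]
        if math.sqrt(sum(si * si for si in step)) < tol:
            return x
    raise RuntimeError("iteration did not converge")

# Hypothetical example (not from the text): F(a, b) = a + 2b + ab,
# so DF|_0 = (1, 2) and ker DF|_0 is spanned by (2, -1).
def F(x):
    a, b = x
    return a + 2.0 * b + a * b

DF0 = [1.0, 2.0]
y = 0.05
x = solve_by_lifting(F, DF0, y)
print(x, F(x))
```

Every step is a multiple of (1, 2), so the computed solution is the one orthogonal to the kernel; adding small elements of the kernel and re-solving would produce other solutions of F(x) = y.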
Finally, we include a theorem on Lagrange multipliers.

Theorem A.47. Let $X$ and $Y$ be Banach spaces, and let $U$ be a neighborhood
of some $x_0 \in X$. Let $F : U \to \mathbb{R}$ and $G : U \to Y$ be $C^1$ maps.
Suppose that $F$ has a local extremum (minimum or maximum) at $x_0$ subject
to the constraint $G(x) = 0$, and that $DG|_{x_0}$ is surjective. Then:
(1) $DF|_{x_0}(v) = 0$ for all $v \in \ker DG|_{x_0}$.
(2) There exists $\lambda \in Y^*$ such that for all $v \in X$,
\[
DF|_{x_0}(v) = \lambda\big(DG|_{x_0}(v)\big).
\]
Proof. Without loss of generality, we may assume that $F(x_0)$ is a local
minimum subject to the constraint $G(x) = 0$. Define a $C^1$ map
$T : U \to \mathbb{R} \times Y$ by
\[
T(x) = (F(x), G(x)).
\]
We start by proving the first assertion. Suppose on the contrary that there
is $v \in \ker DG|_{x_0}$ so that $DF|_{x_0}(v) \neq 0$. Then
$DT|_{x_0} = (DF|_{x_0}, DG|_{x_0})$ is surjective because $DG|_{x_0}$ is
surjective: given $(t, y) \in \mathbb{R} \times Y$, choose $w$ with
$DG|_{x_0}(w) = y$ and add a suitable multiple of $v$ to $w$ to adjust the first
component to $t$. By the local surjectivity theorem above (Theorem A.46), for
any $\epsilon > 0$, there exists a $\delta > 0$ and $x \in B_\epsilon(x_0)$
such that $T(x) = (F(x_0) - \delta, 0)$. This contradicts our assumption that
$x_0$ is a local minimum of $F$ subject to the constraint $G(x) = 0$, proving the
first assertion of the theorem.
The first assertion can be rephrased as saying that $DF|_{x_0}$, as an element
in the dual space $X^*$, lies in the annihilator subspace
$(\ker DG|_{x_0})^\perp \subset X^*$ with respect to the natural pairing of $X$
and $X^*$. It is a standard Banach space fact that since $DG|_{x_0}$ has closed
range, it follows that $(\ker DG|_{x_0})^\perp = \operatorname{im} DG|_{x_0}^*$.
In other words, there is a $\lambda \in Y^*$ such that
\[
DF|_{x_0} = DG|_{x_0}^*(\lambda),
\]
as elements of $X^*$. This is equivalent to what the second claim says, after
unwinding notation.
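Both assertions of Theorem A.47 can be checked by hand in a small example. The sketch below assumes a hypothetical problem that is not in the text: minimize F(x, y) = x + y on X = R^2 subject to G(x, y) = x^2 + y^2 - 1 = 0, with Y = R, so that the multiplier is just a real number. The names `grad_F`, `grad_G`, and `dot` are illustrative helpers, and the constrained minimizer is supplied by hand rather than computed.

```python
import math

# Hypothetical example: F(x, y) = x + y, G(x, y) = x^2 + y^2 - 1.
def grad_F(p):
    return [1.0, 1.0]

def grad_G(p):
    return [2.0 * p[0], 2.0 * p[1]]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Constrained minimizer, found by hand: the point of the unit circle
# at which x + y is smallest.
x0 = [-1.0 / math.sqrt(2.0), -1.0 / math.sqrt(2.0)]

dF, dG = grad_F(x0), grad_G(x0)

# Assertion (1): DF vanishes on ker DG, which here is spanned by the
# tangent vector to the circle at x0.
tangent = [-dG[1], dG[0]]
dF_on_kernel = dot(dF, tangent)

# Assertion (2): solve DF = lam * DG for the scalar lam by least squares
# and record the residual, which should vanish.
lam = dot(dF, dG) / dot(dG, dG)
residual = [a - lam * b for a, b in zip(dF, dG)]
print(dF_on_kernel, lam, residual)
```

Here the multiplier works out to -1/sqrt(2), and the vanishing residual is the finite-dimensional form of the identity DF = DG^*(lambda) from the proof.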
Bibliography
[Ago13] Ian Agol, The virtual Haken conjecture, Doc. Math. 18 (2013), 1045–1087. With an
appendix by Agol, Daniel Groves, and Jason Manning. MR3104553
[AIK10] S. Alexakis, A. D. Ionescu, and S. Klainerman, Uniqueness of smooth stationary black
holes in vacuum: small perturbations of the Kerr spaces, Comm. Math. Phys. 299
(2010), no. 1, 89–127, DOI 10.1007/s00220-010-1072-1. MR2672799
[All72] William K. Allard, On the first variation of a varifold, Ann. of Math. (2) 95 (1972),
417–491, DOI 10.2307/1970868. MR0307015
[Alm66] F. J. Almgren Jr., Some interior regularity theorems for minimal surfaces and
an extension of Bernstein’s theorem, Ann. of Math. (2) 84 (1966), 277–292, DOI
10.2307/1970520. MR0200816
[Alm00] Frederick J. Almgren Jr., Almgren’s big regularity paper: Q-valued functions minimiz-
ing Dirichlet’s integral and the regularity of area-minimizing rectifiable currents up
to codimension 2, World Scientific Monograph Series in Mathematics, vol. 1, World
Scientific Publishing Co., Inc., River Edge, NJ, 2000. With a preface by Jean E.
Taylor and Vladimir Scheffer. MR1777737
[Amb15] Lucas C. Ambrozio, On perturbations of the Schwarzschild anti-de Sitter spaces of
positive mass, Comm. Math. Phys. 337 (2015), no. 2, 767–783, DOI 10.1007/s00220-
015-2360-6. MR3339162
[ACG08] Lars Andersson, Mingliang Cai, and Gregory J. Galloway, Rigidity and positivity of
mass for asymptotically hyperbolic manifolds, Ann. Henri Poincaré 9 (2008), no. 1,
1–33, DOI 10.1007/s00023-007-0348-2. MR2389888
[AD98] Lars Andersson and Mattias Dahl, Scalar curvature rigidity for asymptotically lo-
cally hyperbolic manifolds, Ann. Global Anal. Geom. 16 (1998), no. 1, 1–27, DOI
10.1023/A:1006547905892. MR1616570
[AEM11] Lars Andersson, Michael Eichmair, and Jan Metzger, Jang’s equation and its appli-
cations to marginally trapped surfaces, Complex analysis and dynamical systems IV.
Part 2, Contemp. Math., vol. 554, Amer. Math. Soc., Providence, RI, 2011, pp. 13–45,
DOI 10.1090/conm/554/10958. MR2884392
[AGH98] Lars Andersson, Gregory J. Galloway, and Ralph Howard, A strong maximum princi-
ple for weak solutions of quasi-linear elliptic equations with applications to Lorentzian
and Riemannian geometry, Comm. Pure Appl. Math. 51 (1998), no. 6, 581–624, DOI
10.1002/(SICI)1097-0312(199806)51:6581::AID-CPA23.3.CO;2-E. MR1611140
[AMS08] Lars Andersson, Marc Mars, and Walter Simon, Stability of marginally outer trapped
surfaces and existence of marginally outer trapped tubes, Adv. Theor. Math. Phys.
12 (2008), no. 4, 853–888. MR2420905
[AM09] Lars Andersson and Jan Metzger, The area of horizons and the trapped region,
Comm. Math. Phys. 290 (2009), no. 3, 941–972, DOI 10.1007/s00220-008-0723-y.
MR2525646
[ADM60] R. Arnowitt, S. Deser, and C. W. Misner, Energy and the criteria for radiation in
general relativity, Phys. Rev. (2) 118 (1960), 1100–1104. MR0127945
[ADM61] R. Arnowitt, S. Deser, and C. W. Misner, Coordinate invariance and energy expres-
sions in general relativity, Phys. Rev. (2) 122 (1961), 997–1006. MR0127946
[ADM62] R. Arnowitt, S. Deser, and C. W. Misner, The dynamics of general relativity, Grav-
itation: An introduction to current research, Wiley, New York, 1962, pp. 227–265.
MR0143629
[AH78] Abhay Ashtekar and R. O. Hansen, A unified treatment of null and spatial infin-
ity in general relativity. I. Universal structure, asymptotic symmetries, and con-
served quantities at spatial infinity, J. Math. Phys. 19 (1978), no. 7, 1542–1566, DOI
10.1063/1.523863. MR0503432
[AG05] Abhay Ashtekar and Gregory J. Galloway, Some uniqueness results for dynamical
horizons, Adv. Theor. Math. Phys. 9 (2005), no. 1, 1–30. MR2193368
[Aub70] Thierry Aubin, Métriques riemanniennes et courbure (French), J. Differential Geom-
etry 4 (1970), 383–424. MR0279731
[Aub76] Thierry Aubin, Équations différentielles non linéaires et problème de Yamabe con-
cernant la courbure scalaire, J. Math. Pures Appl. (9) 55 (1976), no. 3, 269–296.
MR0431287
[ABR01] Sheldon Axler, Paul Bourdon, and Wade Ramey, Harmonic function theory, 2nd
ed., Graduate Texts in Mathematics, vol. 137, Springer-Verlag, New York, 2001.
MR1805196
[Bar86] Robert Bartnik, The mass of an asymptotically flat manifold, Comm. Pure Appl.
Math. 39 (1986), no. 5, 661–693, DOI 10.1002/cpa.3160390505. MR849427
[Bar89] Robert Bartnik, New definition of quasilocal mass, Phys. Rev. Lett. 62 (1989), no. 20,
2346–2348, DOI 10.1103/PhysRevLett.62.2346. MR996396
[Bar93] Robert Bartnik, Quasi-spherical metrics and prescribed scalar curvature, J. Differ-
ential Geom. 37 (1993), no. 1, 31–71. MR1198599
[Bar05] Robert Bartnik, Phase space for the Einstein equations, Comm. Anal. Geom. 13
(2005), no. 5, 845–885. MR2216143
[BC96] Robert Beig and Piotr T. Chruściel, Killing vectors in asymptotically flat space-times.
I. Asymptotically translational Killing vectors and the rigid positive energy theorem,
J. Math. Phys. 37 (1996), no. 4, 1939–1961, DOI 10.1063/1.531497. MR1380882
[Bes08] Arthur L. Besse, Einstein manifolds, Classics in Mathematics, Springer-Verlag,
Berlin, 2008. Reprint of the 1987 edition. MR2371700
[Bie09] Lydia Bieri, Part I: Solutions of the Einstein vacuum equations, Extensions of the
stability theorem of the Minkowski space in general relativity, AMS/IP Stud. Adv.
Math., vol. 45, Amer. Math. Soc., Providence, RI, 2009, pp. 1–295. MR2537047
[Bir23] G. D. Birkhoff, Relativity and modern physics. With the cooperation of R. E. Langer,
Harvard University Press, Cambridge, MA, 1923.
[BK89] John Bland and Morris Kalka, Negative scalar curvature metrics on noncom-
pact manifolds, Trans. Amer. Math. Soc. 316 (1989), no. 2, 433–446, DOI
10.2307/2001356. MR987159
[BDGG69] E. Bombieri, E. De Giorgi, and E. Giusti, Minimal cones and the Bernstein problem,
Invent. Math. 7 (1969), 243–268, DOI 10.1007/BF01404309. MR0250205
[BvdBM62] H. Bondi, M. G. J. van der Burg, and A. W. K. Metzner, Gravitational waves in
general relativity. VII. Waves from axi-symmetric isolated systems, Proc. Roy. Soc.
Ser. A 269 (1962), 21–52, DOI 10.1098/rspa.1962.0161. MR0147276
[Bra97] Hubert Lewis Bray, The Penrose inequality in general relativity and volume com-
parison theorems involving scalar curvature, ProQuest LLC, Ann Arbor, MI, 1997.
Thesis (Ph.D.)–Stanford University. MR2696584
[Bra01] Hubert L. Bray, Proof of the Riemannian Penrose inequality using the positive mass
theorem, J. Differential Geom. 59 (2001), no. 2, 177–267. MR1908823
[BBEN10] H. Bray, S. Brendle, M. Eichmair, and A. Neves, Area-minimizing projective planes
in 3-manifolds, Comm. Pure Appl. Math. 63 (2010), no. 9, 1237–1247, DOI
10.1002/cpa.20319. MR2675487
[BBN10] Hubert Bray, Simon Brendle, and Andre Neves, Rigidity of area-minimizing two-
spheres in three-manifolds, Comm. Anal. Geom. 18 (2010), no. 4, 821–830, DOI
10.4310/CAG.2010.v18.n4.a6. MR2765731
[BF02] Hubert Bray and Felix Finster, Curvature estimates and the positive mass theorem,
Comm. Anal. Geom. 10 (2002), no. 2, 291–306, DOI 10.4310/CAG.2002.v10.n2.a3.
MR1900753
[BK10] Hubert L. Bray and Marcus A. Khuri, A Jang equation approach to the Pen-
rose inequality, Discrete Contin. Dyn. Syst. 27 (2010), no. 2, 741–766, DOI
10.3934/dcds.2010.27.741. MR2600688
[BL09] Hubert L. Bray and Dan A. Lee, On the Riemannian Penrose inequality in
dimensions less than eight, Duke Math. J. 148 (2009), no. 1, 81–106, DOI
10.1215/00127094-2009-020. MR2515101
[Bre97] Glen E. Bredon, Topology and geometry, Graduate Texts in Mathematics, vol. 139,
Springer-Verlag, New York, 1997. Corrected third printing of the 1993 original.
MR1700700
[BM11] Simon Brendle and Fernando C. Marques, Scalar curvature rigidity of geodesic balls
in $S^n$, J. Differential Geom. 88 (2011), no. 3, 379–394. MR2844438
[BMN11] Simon Brendle, Fernando C. Marques, and Andre Neves, Deformations of the hemi-
sphere that increase scalar curvature, Invent. Math. 185 (2011), no. 1, 175–197, DOI
10.1007/s00222-010-0305-4. MR2810799
[Bri59] Dieter R. Brill, On the positive definite mass of the Bondi-Weber-Wheeler time-
symmetric gravitational waves, Ann. Physics 7 (1959), 466–483, DOI 10.1016/0003-
4916(59)90055-7. MR0108340
[Bro89] Robert Brooks, A construction of metrics of negative Ricci curvature, J. Differential
Geom. 29 (1989), no. 1, 85–94. MR978077
[BY93] J. David Brown and James W. York Jr., Quasilocal energy and conserved charges
derived from the gravitational action, Phys. Rev. D (3) 47 (1993), no. 4, 1407–1419,
DOI 10.1103/PhysRevD.47.1407. MR1211109
[BMuA87] Gary L. Bunting and A. K. M. Masood-ul-Alam, Nonexistence of multiple black holes
in asymptotically Euclidean static vacuum space-time, Gen. Relativity Gravitation
19 (1987), no. 2, 147–154, DOI 10.1007/BF00770326. MR876598
[Cai02] Mingliang Cai, Volume minimizing hypersurfaces in manifolds of nonnegative scalar
curvature, Minimal surfaces, geometric analysis and symplectic geometry (Baltimore,
MD, 1999), Adv. Stud. Pure Math., vol. 34, Math. Soc. Japan, Tokyo, 2002, pp. 1–7.
MR1925731
[CG00] Mingliang Cai and Gregory J. Galloway, Rigidity of area minimizing tori in 3-
manifolds of nonnegative scalar curvature, Comm. Anal. Geom. 8 (2000), no. 3,
565–573, DOI 10.4310/CAG.2000.v8.n3.a6. MR1775139
[CZ52] A. P. Calderon and A. Zygmund, On the existence of certain singular integrals, Acta
Math. 88 (1952), 85–139, DOI 10.1007/BF02392130. MR0052553
[Can74] M. Cantor, Spaces of functions with asymptotic conditions on $\mathbb{R}^n$, Indiana Univ.
Math. J. 24 (1974/75), 897–902, DOI 10.1512/iumj.1975.24.24072. MR0365621
[Can81] Murray Cantor, Elliptic operators and the decomposition of tensor fields, Bull. Amer.
Math. Soc. (N.S.) 5 (1981), no. 3, 235–262, DOI 10.1090/S0273-0979-1981-14934-X.
MR628659
[CCE16] Alessandro Carlotto, Otis Chodosh, and Michael Eichmair, Effective versions of
the positive mass theorem, Invent. Math. 206 (2016), no. 3, 975–1016, DOI
10.1007/s00222-016-0667-3. MR3573977
[CS16] Alessandro Carlotto and Richard Schoen, Localizing solutions of the Einstein con-
straint equations, Invent. Math. 205 (2016), no. 3, 559–615, DOI 10.1007/s00222-
015-0642-4. MR3539922
[dC92] Manfredo Perdigão do Carmo, Riemannian geometry, Mathematics: Theory & Ap-
plications, Birkhäuser Boston, Inc., Boston, MA, 1992. Translated from the second
Portuguese edition by Francis Flaherty. MR1138207
[Car73] Brandon Carter, Black hole equilibrium states, Black holes/Les astres occlus (École
d’Été Phys. Théor., Les Houches, 1972), Gordon and Breach, New York, 1973, pp. 57–
214. MR0465047
[CSCB79] Alice Chaljub-Simon and Yvonne Choquet-Bruhat, Problèmes elliptiques du second
ordre sur une variété euclidienne à l’infini (French, with English summary), Ann.
Fac. Sci. Toulouse Math. (5) 1 (1979), no. 1, 9–25. MR533596
[Cha06] Isaac Chavel, Riemannian geometry: A modern introduction, 2nd ed., Cambridge
Studies in Advanced Mathematics, vol. 98, Cambridge University Press, Cambridge,
2006. MR2229062
[Che17] Bang-Yen Chen, Differential geometry of warped product manifolds and submanifolds,
World Scientific Publishing Co. Pte. Ltd., Hackensack, NJ, 2017. With a foreword by
Leopold Verstraelen. MR3699316
[CWY11] PoNing Chen, Mu-Tao Wang, and Shing-Tung Yau, Evaluating quasilocal energy and
solving optimal embedding equation at null infinity, Comm. Math. Phys. 308 (2011),
no. 3, 845–863, DOI 10.1007/s00220-011-1362-2. MR2855542
[CGG91] Yun Gang Chen, Yoshikazu Giga, and Shun’ichi Goto, Uniqueness and existence of
viscosity solutions of generalized mean curvature flow equations, J. Differential Geom.
33 (1991), no. 3, 749–786. MR1100211
[CEM18] Otis Chodosh, Michael Eichmair, and Vlad Moraru, A splitting theorem for scalar
curvature, arXiv:1804.01751 (2018).
[CK18] Otis Chodosh and Daniel Ketover, Asymptotically flat three-manifolds contain min-
imal planes, Adv. Math. 337 (2018), 171–192, DOI 10.1016/j.aim.2018.08.010.
MR3853048
[CB09] Yvonne Choquet-Bruhat, General relativity and the Einstein equations, Oxford
Mathematical Monographs, Oxford University Press, Oxford, 2009. MR2473363
[CB15] Yvonne Choquet-Bruhat, Introduction to general relativity, black holes, and cosmol-
ogy, Oxford University Press, Oxford, 2015. With a foreword by Thibault Damour.
MR3379262
[CBC81] Y. Choquet-Bruhat and D. Christodoulou, Elliptic systems in $H_{s,\delta}$ spaces on
manifolds which are Euclidean at infinity, Acta Math. 146 (1981), no. 1-2, 129–150, DOI
10.1007/BF02392460. MR594629
[CBG69] Yvonne Choquet-Bruhat and Robert Geroch, Global aspects of the Cauchy problem
in general relativity, Comm. Math. Phys. 14 (1969), 329–335. MR0250640
[CLN06] Bennett Chow, Peng Lu, and Lei Ni, Hamilton’s Ricci flow, Graduate Studies in
Mathematics, vol. 77, American Mathematical Society, Providence, RI; Science Press
Beijing, New York, 2006. MR2274812
[Chr09] Demetrios Christodoulou, The formation of black holes in general relativity, EMS
Monographs in Mathematics, European Mathematical Society (EMS), Zürich, 2009.
MR2488976
[CK93] Demetrios Christodoulou and Sergiu Klainerman, The global nonlinear stability of
the Minkowski space, Princeton Mathematical Series, vol. 41, Princeton University
Press, Princeton, NJ, 1993. MR1316662
[CM06] Piotr T. Chruściel and Daniel Maerten, Killing vectors in asymptotically flat space-
times. II. Asymptotically translational Killing vectors and the rigid positive energy
theorem in higher dimensions, J. Math. Phys. 47 (2006), no. 2, 022502, 10, DOI
10.1063/1.2167809. MR2208148
[CO81] D. Christodoulou and N. O’Murchadha, The boost problem in general relativity,
Comm. Math. Phys. 80 (1981), no. 2, 271–300. MR623161
[Chr86] Piotr Chruściel, Boundary conditions at spatial infinity from a Hamiltonian point of
view, Topological properties and global structure of space-time (Erice, 1985), NATO
Adv. Sci. Inst. Ser. B Phys., vol. 138, Plenum, New York, 1986, pp. 49–59. MR1102938
[Chr08] Piotr T. Chruściel, Mass and angular-momentum inequalities for axi-symmetric ini-
tial data sets. I. Positivity of mass, Ann. Physics 323 (2008), no. 10, 2566–2590, DOI
10.1016/j.aop.2007.12.010. MR2454698
[Chr15] Piotr T. Chruściel, The geometry of black holes, 2015.
http://homepage.univie.ac.at/piotr.chrusciel/teaching/Black~Holes/BlackHolesViennaJanuary2015.pdf.
[CC08] Piotr T. Chruściel and João Lopes Costa, On uniqueness of stationary vacuum black
holes (English, with English and French summaries), Astérisque 321 (2008), 195–265.
MR2521649
[CDGH01] P. T. Chruściel, E. Delay, G. J. Galloway, and R. Howard, Regularity of hori-
zons and the area theorem, Ann. Henri Poincaré 2 (2001), no. 1, 109–178, DOI
10.1007/PL00001029. MR1823836
[CGP10] Piotr T. Chruściel, Gregory J. Galloway, and Daniel Pollack, Mathematical general
relativity: a sampler, Bull. Amer. Math. Soc. (N.S.) 47 (2010), no. 4, 567–638, DOI
10.1090/S0273-0979-2010-01304-5. MR2721040
[CH03] Piotr T. Chruściel and Marc Herzlich, The mass of asymptotically hyperbolic
Riemannian manifolds, Pacific J. Math. 212 (2003), no. 2, 231–264, DOI
10.2140/pjm.2003.212.231. MR2038048
[CM11] Tobias Holck Colding and William P. Minicozzi II, A course in minimal surfaces,
Graduate Studies in Mathematics, vol. 121, American Mathematical Society, Provi-
dence, RI, 2011. MR2780140
[Cor00] Justin Corvino, Scalar curvature deformation and a gluing construction for the Ein-
stein constraint equations, Comm. Math. Phys. 214 (2000), no. 1, 137–189, DOI
10.1007/PL00005533. MR1794269
[Cor05] Justin Corvino, A note on asymptotically flat metrics on $\mathbb{R}^3$ which are scalar-flat
and admit minimal spheres, Proc. Amer. Math. Soc. 133 (2005), no. 12, 3669–3678,
DOI 10.1090/S0002-9939-05-07926-8. MR2163606
[Cor17] Justin Corvino, A note on the Bartnik mass, Nonlinear analysis in geometry and
applied mathematics, Harv. Univ. Cent. Math. Sci. Appl. Ser. Math., vol. 1, Int.
Press, Somerville, MA, 2017, pp. 49–75. MR3729084
[CH16] Justin Corvino and Lan-Hsuan Huang, Localized deformation for initial data sets
with the dominant energy condition, arXiv:1606.03078 (2016).
[CS06] Justin Corvino and Richard M. Schoen, On the asymptotics for the vacuum Einstein
constraint equations, J. Differential Geom. 73 (2006), no. 2, 185–217. MR2225517
[Cou37] Richard Courant, Plateau’s problem and Dirichlet’s principle, Ann. of Math. (2) 38
(1937), no. 3, 679–724, DOI 10.2307/1968610. MR1503362
[DL17] Mihalis Dafermos and Jonathan Luk, The interior of dynamical vacuum black holes
I: The $C^0$-stability of the Kerr Cauchy horizon, arXiv:1710.01722 (2017).
[DL16] Mattias Dahl and Eric Larsson, Outermost apparent horizons diffeomorphic to unit
normal bundles, arXiv:1606.0841 (2016).
[DM07] Xianzhe Dai and Li Ma, Mass under the Ricci flow, Comm. Math. Phys. 274 (2007),
no. 1, 65–80. MR2318848
[DG61] Ennio De Giorgi, Frontiere orientate di misura minima (Italian), Seminario di
Matematica della Scuola Normale Superiore di Pisa, 1960-61, Editrice Tecnico Scien-
tifica, Pisa, 1961. MR0179651
[DL16] Camillo De Lellis, The size of the singular set of area-minimizing currents, Surveys
in differential geometry 2016. Advances in geometry and mathematical physics, Surv.
Differ. Geom., vol. 21, Int. Press, Somerville, MA, 2016, pp. 1–83. MR3525093
[DLS14] Camillo De Lellis and Emanuele Spadaro, Regularity of area minimizing currents
I: gradient $L^p$ estimates, Geom. Funct. Anal. 24 (2014), no. 6, 1831–1884, DOI
10.1007/s00039-014-0306-3. MR3283929
[DS83] V. I. Denisov and V. O. Solovev, Energy defined in general relativity on the basis of
the traditional Hamiltonian approach has no physical meaning (Russian, with English
summary), Teoret. Mat. Fiz. 56 (1983), no. 2, 301–314. MR718105
[Dou31] Jesse Douglas, Solution of the problem of Plateau, Trans. Amer. Math. Soc. 33 (1931),
no. 1, 263–321, DOI 10.2307/1989472. MR1501590
[Eic09] Michael Eichmair, The Plateau problem for marginally outer trapped surfaces, J.
Differential Geom. 83 (2009), no. 3, 551–583. MR2581357
[Eic13] Michael Eichmair, The Jang equation reduction of the spacetime positive energy the-
orem in dimensions less than eight, Comm. Math. Phys. 319 (2013), no. 3, 575–593,
DOI 10.1007/s00220-013-1700-7. MR3040369
[EGP13] Michael Eichmair, Gregory J. Galloway, and Daniel Pollack, Topological censorship
from the initial data point of view, J. Differential Geom. 95 (2013), no. 3, 389–405.
MR3128989
[EHLS16] Michael Eichmair, Lan-Hsuan Huang, Dan A. Lee, and Richard Schoen, The space-
time positive mass theorem in dimensions less than eight, J. Eur. Math. Soc. (JEMS)
18 (2016), no. 1, 83–121, DOI 10.4171/JEMS/584. MR3438380
[EM13] Michael Eichmair and Jan Metzger, Large isoperimetric surfaces in initial data sets,
J. Differential Geom. 94 (2013), no. 1, 159–186. MR3031863
[EM16] Michael Eichmair and Jan Metzger, Jenkins-Serrin-type results for the Jang equation,
[EMW12] Michael Eichmair, Pengzi Miao, and Xiaodong Wang, Extension of a theorem of Shi
and Tam, Calc. Var. Partial Differential Equations 43 (2012), no. 1-2, 45–56, DOI
10.1007/s00526-011-0402-2. MR2860402
[Eis93] Jean Eisenstaedt, Lemaître and the Schwarzschild solution, The attraction of
gravitation: new studies in the history of general relativity (Johnstown, PA, 1991), Einstein
Stud., vol. 5, Birkhäuser Boston, Boston, MA, 1993, pp. 353–389. MR1735388
[ER02] Roberto Emparan and Harvey S. Reall, A rotating black ring solution in five
dimensions, Phys. Rev. Lett. 88 (2002), no. 10, 101101, 4,
DOI 10.1103/PhysRevLett.88.101101. MR1901280
[ER06] Roberto Emparan and Harvey S. Reall, Black rings, Classical Quantum Gravity 23
(2006), no. 20, R169–R197, DOI 10.1088/0264-9381/23/20/R01. MR2270099
[Eva10] Lawrence C. Evans, Partial differential equations, 2nd ed., Graduate Studies in Math-
ematics, vol. 19, American Mathematical Society, Providence, RI, 2010. MR2597943
[ES91] L. C. Evans and J. Spruck, Motion of level sets by mean curvature. I, J. Differential
Geom. 33 (1991), no. 3, 635–681. MR1100206
[Fed70] Herbert Federer, The singular sets of area minimizing rectifiable currents with codi-
mension one and of area minimizing flat chains modulo two with arbitrary codimen-
sion, Bull. Amer. Math. Soc. 76 (1970), 767–771, DOI 10.1090/S0002-9904-1970-
12542-3. MR0260981
[FF60] Herbert Federer and Wendell H. Fleming, Normal and integral currents, Ann. of
Math. (2) 72 (1960), 458–520, DOI 10.2307/1970227. MR0123260
[Fin09] Felix Finster, A level set analysis of the Witten spinor with applications
to curvature estimates, Math. Res. Lett. 16 (2009), no. 1, 41–55, DOI
10.4310/MRL.2009.v16.n1.a5. MR2480559
[FK02] Felix Finster and Ines Kath, Curvature estimates in asymptotically flat manifolds
of positive scalar curvature, Comm. Anal. Geom. 10 (2002), no. 5, 1017–1031, DOI
10.4310/CAG.2002.v10.n5.a6. MR1957661
[FM75] Arthur E. Fischer and Jerrold E. Marsden, Deformations of the scalar curvature,
Duke Math. J. 42 (1975), no. 3, 519–547. MR0380907
[FCS80] Doris Fischer-Colbrie and Richard Schoen, The structure of complete stable minimal
surfaces in 3-manifolds of nonnegative scalar curvature, Comm. Pure Appl. Math.
33 (1980), no. 2, 199–211, DOI 10.1002/cpa.3160330206. MR562550
[Fle62] Wendell H. Fleming, On the oriented Plateau problem, Rend. Circ. Mat. Palermo (2)
11 (1962), 69–90, DOI 10.1007/BF02849427. MR0157263
[Fol95] Gerald B. Folland, Introduction to partial differential equations, 2nd ed., Princeton
University Press, Princeton, NJ, 1995. MR1357411
[FB52] Y. Fourès-Bruhat, Théorème d’existence pour certains systèmes d’équations aux
dérivées partielles non linéaires (French), Acta Math. 88 (1952), 141–225, DOI
10.1007/BF02392131. MR0053338
[FS14] Alexandre Freire and Fernando Schwartz, Mass-capacity inequalities for conformally
flat manifolds with boundary, Comm. Partial Differential Equations 39 (2014), no. 1,
98–119, DOI 10.1080/03605302.2013.851211. MR3169780
[Gal00] Gregory J. Galloway, Maximum principles for null hypersurfaces and null splitting
theorems, Ann. Henri Poincaré 1 (2000), no. 3, 543–567, DOI 10.1007/s000230050006.
MR1777311
[Gal11] Gregory J. Galloway, Stability and rigidity of extremal surfaces in Riemannian ge-
ometry and general relativity, Surveys in geometric analysis and relativity, Adv. Lect.
Math. (ALM), vol. 20, Int. Press, Somerville, MA, 2011, pp. 221–239. MR2906927
[Gal14] Gregory J. Galloway, Notes on Lorentzian causality, 2014.
http://www.math.miami.edu/~galloway/vienna-course-notes.pdf.
[Gal18] Gregory J. Galloway, Rigidity of outermost MOTS: the initial data version, Gen.
Relativity Gravitation 50 (2018), no. 3, Art. 32, 7, DOI 10.1007/s10714-018-2353-9.
MR3768955
[GS06] Gregory J. Galloway and Richard Schoen, A generalization of Hawking’s black hole
topology theorem to higher dimensions, Comm. Math. Phys. 266 (2006), no. 2, 571–
576, DOI 10.1007/s00220-006-0019-z. MR2238889
[Gan75] Dennis Gannon, Singularities in nonsimply connected space-times, J. Mathematical
Phys. 16 (1975), no. 12, 2364–2367, DOI 10.1063/1.522498. MR0389141
[GY86] L. Zhiyong Gao and S.-T. Yau, The existence of negatively Ricci curved metrics on
three-manifolds, Invent. Math. 85 (1986), no. 3, 637–652, DOI 10.1007/BF01390331.
MR848687
[Ger70] Robert Geroch, Domain of dependence, J. Mathematical Phys. 11 (1970), 437–449,
DOI 10.1063/1.1665157. MR0270697
[Ger73] Robert Geroch, Energy extraction, Sixth Texas symposium on relativistic astro-
physics, 1973, pp. 108.
[Ger75] Robert Geroch, General relativity, Differential geometry (Proc. Sympos. Pure Math.,
Vol. XXVII, Part 2, Stanford Univ., Stanford, Calif., 1973), Amer. Math. Soc., Prov-
idence, R.I., 1975, pp. 401–414. MR0378703
[Ger13] Robert Geroch, General relativity: 1972 lecture notes, Minkowski Institute Press,
Montreal, 2013.
[GHHP83] G. W. Gibbons, S. W. Hawking, Gary T. Horowitz, and Malcolm J. Perry, Posi-
tive mass theorems for black holes, Comm. Math. Phys. 88 (1983), no. 3, 295–308.
MR701918
[GT01] David Gilbarg and Neil S. Trudinger, Elliptic partial differential equations of second
order, Classics in Mathematics, Springer-Verlag, Berlin, 2001. Reprint of the 1998
edition. MR1814364
[GT12] James D. E. Grant and Nathalie Tassotti, A positive mass theorem for low-regularity
metrics, arXiv:1205.1302 (2012).
[GM08] Jeremy Gray and Mario Micallef, About the cover: the work of Jesse Douglas on
minimal surfaces, Bull. Amer. Math. Soc. (N.S.) 45 (2008), no. 2, 293–302, DOI
10.1090/S0273-0979-08-01192-0. MR2383307
[Gre63] L. W. Green, Auf Wiedersehensflächen (German), Ann. of Math. (2) 78 (1963),
289–299, DOI 10.2307/1970344. MR0155271
[GL80a] Mikhael Gromov and H. Blaine Lawson Jr., Spin and scalar curvature in the pres-
ence of a fundamental group. I, Ann. of Math. (2) 111 (1980), no. 2, 209–230, DOI
10.2307/1971198. MR569070
[GL80b] Mikhael Gromov and H. Blaine Lawson Jr., The classification of simply connected
manifolds of positive scalar curvature, Ann. of Math. (2) 111 (1980), no. 3, 423–434,
DOI 10.2307/1971103. MR577131
[GL94] Pengfei Guan and Yan Yan Li, The Weyl problem with nonnegative Gauss curvature,
[HW09] Fengbo Hang and Xiaodong Wang, Rigidity theorems for compact manifolds with
boundary and positive Ricci curvature, J. Geom. Anal. 19 (2009), no. 3, 628–642.
MR2496569
[Har90] F. Reese Harvey, Spinors and calibrations, Perspectives in Mathematics, vol. 9, Aca-
demic Press, Inc., Boston, MA, 1990. MR1045637
[Hat02] Allen Hatcher, Algebraic topology, Cambridge University Press, Cambridge, 2002.
MR1867354
[Haw68] S. W. Hawking, Gravitational radiation in an expanding universe, J. Math. Phys. 9
(1968), 598–604.
[Haw72] S. W. Hawking, Black holes in general relativity, Comm. Math. Phys. 25 (1972),
152–166. MR0293962
[HE73] S. W. Hawking and G. F. R. Ellis, The large scale structure of space-time, Cambridge
Monographs on Mathematical Physics, No. 1, Cambridge University Press, London-
New York, 1973. MR0424186
[Hay96] Sean A. Hayward, Gravitational energy in spherical symmetry, Phys. Rev. D (3) 53
(1996), no. 4, 1938–1949, DOI 10.1103/PhysRevD.53.1938. MR1380012
[Heb96] Emmanuel Hebey, Sobolev spaces on Riemannian manifolds, Lecture Notes in Math-
ematics, vol. 1635, Springer-Verlag, Berlin, 1996. MR1481970
[HL16] Hans-Joachim Hein and Claude LeBrun, Mass in Kähler geometry, Comm. Math.
Phys. 347 (2016), no. 1, 183–221, DOI 10.1007/s00220-016-2661-4. MR3543182
[Hem76] John Hempel, 3-Manifolds, Princeton University Press, Princeton, N. J.; University
of Tokyo Press, Tokyo, 1976. Ann. of Math. Studies, No. 86. MR0415619
[Her70] Joseph Hersch, Quatre propriétés isopérimétriques de membranes sphériques ho-
mogènes (French), C. R. Acad. Sci. Paris Sér. A-B 270 (1970), A1645–A1648.
MR0292357
[Her97] Marc Herzlich, A Penrose-like inequality for the mass of Riemannian asymp-
totically flat manifolds, Comm. Math. Phys. 188 (1997), no. 1, 121–133, DOI
10.1007/s002200050159. MR1471334
[Her98] Marc Herzlich, The positive mass theorem for black holes revisited, J. Geom. Phys.
26 (1998), no. 1-2, 97–111, DOI 10.1016/S0393-0440(97)00040-5. MR1626060
[Hit74] Nigel Hitchin, Harmonic spinors, Advances in Math. 14 (1974), 1–55, DOI
10.1016/0001-8708(74)90021-8. MR0358873
[Hua09] Lan-Hsuan Huang, On the center of mass of isolated systems with general asymp-
totics, Classical Quantum Gravity 26 (2009), no. 1, 015012, 25, DOI 10.1088/0264-
9381/26/1/015012. MR2470255
[Hua12] Lan-Hsuan Huang, On the center of mass in general relativity, Fifth International
Congress of Chinese Mathematicians. Part 1, 2, AMS/IP Stud. Adv. Math., 51, pt.
1, vol. 2, Amer. Math. Soc., Providence, RI, 2012, pp. 575–591. MR2908093
[HL15] Lan-Hsuan Huang and Dan A. Lee, Stability of the positive mass theorem for graphical
hypersurfaces of Euclidean space, Comm. Math. Phys. 337 (2015), no. 1, 151–169,
DOI 10.1007/s00220-014-2265-9. MR3324159
[HL17] Lan-Hsuan Huang and Dan A. Lee, Rigidity of the spacetime positive mass theorem,
arXiv:1706.03732 (2017).
[HLS17] Lan-Hsuan Huang, Dan A. Lee, and Christina Sormani, Intrinsic flat stability of
the positive mass theorem for graphical hypersurfaces of Euclidean space, J. Reine
Angew. Math. 727 (2017), 269–299, DOI 10.1515/crelle-2015-0051. MR3652253
[HMM18] Lan-Hsuan Huang, Daniel Martin, and Pengzi Miao, Static potentials and area min-
imizing hypersurfaces, Proc. Amer. Math. Soc. 146 (2018), no. 6, 2647–2661, DOI
10.1090/proc/13936. MR3778165
[HW13] Lan-Hsuan Huang and Damin Wu, Hypersurfaces with nonnegative scalar curvature,
[HW15] Lan-Hsuan Huang and Damin Wu, The equality case of the Penrose inequality for
asymptotically flat graphs, Trans. Amer. Math. Soc. 367 (2015), no. 1, 31–47, DOI
10.1090/S0002-9947-2014-06090-X. MR3271252
[HI01] Gerhard Huisken and Tom Ilmanen, The inverse mean curvature flow and the
Riemannian Penrose inequality, J. Differential Geom. 59 (2001), no. 3, 353–437.
MR1916951
[HY96] Gerhard Huisken and Shing-Tung Yau, Definition of center of mass for isolated phys-
ical systems and unique foliations by stable spheres with constant mean curvature, In-
vent. Math. 124 (1996), no. 1-3, 281–311, DOI 10.1007/s002220050054. MR1369419
[Isr67] Werner Israel, Event horizons in static vacuum space-times, Phys. Rev. 164 (1967),
1776–1779.
[Jau13] Jeffrey L. Jauregui, Fill-ins of nonnegative scalar curvature, static metrics,
and quasi-local mass, Pacific J. Math. 261 (2013), no. 2, 417–444, DOI
10.2140/pjm.2013.261.417. MR3037574
[Jau18] Jeffrey L. Jauregui, On the lower semicontinuity of the ADM mass, Comm. Anal.
Geom. 26 (2018), no. 1, 85–111, DOI 10.4310/CAG.2018.v26.n1.a3. MR3761654
[Jau19] Jeffrey L. Jauregui, Smoothing the Bartnik boundary conditions and other re-
sults on Bartnik’s quasi-local mass, J. Geom. Phys. 136 (2019), 228–243, DOI
10.1016/j.geomphys.2018.11.005. MR3885243
[JL16] Jeffrey L. Jauregui and Dan A. Lee, Lower semicontinuity of mass under C^0 convergence
and Huisken’s isoperimetric mass, arXiv:1602.00732 (2016).
[JL19] Jeffrey L. Jauregui and Dan A. Lee, Lower semicontinuity of ADM mass under
intrinsic flat convergence, arXiv:1903.00916 (2019).
[Jos11] Jürgen Jost, Riemannian geometry and geometric analysis, 6th ed., Universitext,
Springer, Heidelberg, 2011. MR2829653
[Jos13] Jürgen Jost, Partial differential equations, 3rd ed., Graduate Texts in Mathematics,
vol. 214, Springer, New York, 2013. MR3012036
[KW75a] Jerry L. Kazdan and F. W. Warner, Existence and conformal deformation of metrics
with prescribed Gaussian and scalar curvatures, Ann. of Math. (2) 101 (1975), 317–
331, DOI 10.2307/1970993. MR0375153
[KW75b] Jerry L. Kazdan and F. W. Warner, Prescribing curvatures, Differential geometry
(Proc. Sympos. Pure Math., Vol. XXVII, Stanford Univ., Stanford, Calif., 1973),
Amer. Math. Soc., Providence, R.I., 1975, pp. 309–319. MR0394505
[KMWY18] Marcus Khuri, Yukio Matsumoto, Gilbert Weinstein, and Sumio Yamada, Plumbing
constructions and the domain of outer communication for 5-dimensional stationary
black holes, arXiv:1807.03452 (2018).
[Kle68] Felix Klein, Vorlesungen über höhere Geometrie (German), Dritte Auflage. Bear-
beitet und herausgegeben von W. Blaschke. Die Grundlehren der mathematischen
Wissenschaften, Band 22, Springer-Verlag, Berlin, 1968. MR0226476
[KR50] M. G. Kreı̆n and M. A. Rutman, Linear operators leaving invariant a cone in a
Banach space, Amer. Math. Soc. Translation 1950 (1950), no. 26, 128. MR0038008
[KH97] Marcus Kriele and Sean A. Hayward, Outer trapped surfaces and their apparent hori-
zon, J. Math. Phys. 38 (1997), no. 3, 1593–1604, DOI 10.1063/1.532010. MR1435684
[KL14] Hari K. Kunduri and James Lucietti, Supersymmetric black holes with lens-space
topology, Phys. Rev. Lett. 113 (2014), 211101, 5.
[Lam11] Mau-Kwong George Lam, The graph cases of the Riemannian positive mass and
Penrose inequalities in all dimensions, ProQuest LLC, Ann Arbor, MI, 2011. Thesis
(Ph.D.)–Duke University. MR2873434
[LM89] H. Blaine Lawson Jr. and Marie-Louise Michelsohn, Spin geometry, Princeton Math-
ematical Series, vol. 38, Princeton University Press, Princeton, NJ, 1989. MR1031992
[Lee76] C. W. Lee, A restriction on the topology of Cauchy surfaces in general relativity,
Comm. Math. Phys. 51 (1976), no. 2, 157–162. MR0426805
[Lee09] Dan A. Lee, On the near-equality case of the positive mass theorem, Duke Math. J.
148 (2009), no. 1, 63–80, DOI 10.1215/00127094-2009-021. MR2515100
[Lee13] Dan A. Lee, A positive mass theorem for Lipschitz metrics with small singular sets,
Proc. Amer. Math. Soc. 141 (2013), no. 11, 3997–4004, DOI 10.1090/S0002-9939-2013-11871-X.
MR3091790
[LL15] Dan A. Lee and Philippe G. LeFloch, The positive mass theorem for manifolds
with distributional curvature, Comm. Math. Phys. 339 (2015), no. 1, 99–120, DOI
10.1007/s00220-015-2414-9. MR3366052
[LN15] Dan A. Lee and André Neves, The Penrose inequality for asymptotically locally hyper-
bolic spaces with nonpositive mass, Comm. Math. Phys. 339 (2015), no. 2, 327–352,
DOI 10.1007/s00220-015-2421-x. MR3370607
[LS12] Dan A. Lee and Christina Sormani, Near-equality of the Penrose inequality for ro-
tationally symmetric Riemannian manifolds, Ann. Henri Poincaré 13 (2012), no. 7,
1537–1556, DOI 10.1007/s00023-012-0172-1. MR2982632
[LS14] Dan A. Lee and Christina Sormani, Stability of the positive mass theorem for ro-
tationally symmetric Riemannian manifolds, J. Reine Angew. Math. 686 (2014),
187–220, DOI 10.1515/crelle-2012-0094. MR3176604
[Lee97] John M. Lee, Riemannian manifolds: An introduction to curvature, Graduate Texts
in Mathematics, vol. 176, Springer-Verlag, New York, 1997. MR1468735
[LP87] John M. Lee and Thomas H. Parker, The Yamabe problem, Bull. Amer. Math. Soc.
(N.S.) 17 (1987), no. 1, 37–91, DOI 10.1090/S0273-0979-1987-15514-5. MR888880
[LS15] Philippe G. LeFloch and Christina Sormani, The nonlinear stability of rotationally
symmetric spaces with low regularity, J. Funct. Anal. 268 (2015), no. 7, 2005–2065,
DOI 10.1016/j.jfa.2014.12.012. MR3315585
[LW99] Yanyan Li and Gilbert Weinstein, A priori bounds for co-dimension one isometric
embeddings, Amer. J. Math. 121 (1999), no. 5, 945–965. MR1713298
[Li18] Yu Li, Ricci flow on asymptotically Euclidean manifolds, Geom. Topol. 22 (2018),
no. 3, 1837–1891, DOI 10.2140/gt.2018.22.1837. MR3780446
[Lic63] André Lichnerowicz, Spineurs harmoniques (French), C. R. Acad. Sci. Paris 257
(1963), 7–9. MR0156292
[Lin14] Chen-Yun Lin, Parabolic constructions of asymptotically flat 3-metrics of prescribed
scalar curvature, Calc. Var. Partial Differential Equations 49 (2014), no. 3-4, 1309–
1335, DOI 10.1007/s00526-013-0623-7. MR3168634
[LY02] Fanghua Lin and Xiaoping Yang, Geometric measure theory—an introduction, Ad-
vanced Mathematics (Beijing/Boston), vol. 1, Science Press Beijing, Beijing; Inter-
national Press, Boston, MA, 2002. MR2030862
[LR10] Hans Lindblad and Igor Rodnianski, The global stability of Minkowski space-time in
harmonic gauge, Ann. of Math. (2) 171 (2010), no. 3, 1401–1477,
DOI 10.4007/annals.2010.171.1401. MR2680391
[LY06] Chiu-Chu Melissa Liu and Shing-Tung Yau, Positivity of quasi-local mass. II, J.
Amer. Math. Soc. 19 (2006), no. 1, 181–204, DOI 10.1090/S0894-0347-05-00497-2.
MR2169046
[Loc81] Robert B. Lockhart, Fredholm properties of a class of elliptic operators on noncom-
pact manifolds, Duke Math. J. 48 (1981), no. 1, 289–312. MR610188
[LM83] Robert B. Lockhart and Robert C. McOwen, On elliptic systems in R^n, Acta Math.
150 (1983), no. 1-2, 125–135, DOI 10.1007/BF02392969. MR697610
[Loh94] Joachim Lohkamp, Metrics of negative Ricci curvature, Ann. of Math. (2) 140 (1994),
no. 3, 655–683, DOI 10.2307/2118620. MR1307899
[Loh99] Joachim Lohkamp, Scalar curvature and hammocks, Math. Ann. 313 (1999), no. 3,
385–407, DOI 10.1007/s002080050266. MR1678604
[Loh06] Joachim Lohkamp, The higher dimensional positive mass theorem I, arXiv:math/0608795
(2006).
[Loh15a] Joachim Lohkamp, Hyperbolic geometry and potential theory on minimal hypersur-
faces, arXiv:1512.08251 (2015).
[Loh15b] Joachim Lohkamp, Skin structures in scalar curvature geometry, arXiv:1512.08252
(2015).
[Loh15c] Joachim Lohkamp, Skin structures on minimal hypersurfaces, arXiv:1512.08249
(2015).
[Loh16] Joachim Lohkamp, The higher dimensional positive mass theorem II,
arXiv:1612.07505 (2016).
[LM17] Siyuan Lu and Pengzi Miao, Minimal hypersurfaces and boundary behavior of com-
pact manifolds with nonnegative scalar curvature, arXiv:1703.08164 (2017).
[MÓM94] Edward Malec and Niall Ó Murchadha, Trapped surfaces and the Penrose inequality
in spherically symmetric geometries, Phys. Rev. D (3) 49 (1994), no. 12, 6931–6934,
DOI 10.1103/PhysRevD.49.6931. MR1278625
[MM17] Christos Mantoulidis and Pengzi Miao, Total mean curvature, scalar curvature, and
a variational analog of Brown-York mass, Comm. Math. Phys. 352 (2017), no. 2,
703–718, DOI 10.1007/s00220-016-2767-8. MR3627410
[MS15] Christos Mantoulidis and Richard Schoen, On the Bartnik mass of apparent horizons,
Classical Quantum Gravity 32 (2015), no. 20, 205002, 16, DOI 10.1088/0264-9381/32/20/205002.
MR3406373
[MN12] Fernando C. Marques and André Neves, Rigidity of min-max minimal spheres
in three-manifolds, Duke Math. J. 161 (2012), no. 14, 2725–2752, DOI
10.1215/00127094-1813410. MR2993139
[MN14] Fernando C. Marques and André Neves, Min-max theory and the Willmore conjec-
ture, Ann. of Math. (2) 179 (2014), no. 2, 683–782, DOI 10.4007/annals.2014.179.2.6.
MR3152944
[Mar09] Marc Mars, Present status of the Penrose inequality, Classical Quantum Gravity 26
(2009), no. 19, 193001, 59, DOI 10.1088/0264-9381/26/19/193001. MR2545137
[MM84] Umberto Massari and Mario Miranda, Minimal surfaces of codimension one, North-
Holland Mathematics Studies, vol. 91, North-Holland Publishing Co., Amsterdam,
1984. Notas de Matemática [Mathematical Notes], 95. MR795963
[MS12] Donovan McFeron and Gábor Székelyhidi, On the positive mass theorem for
manifolds with corners, Comm. Math. Phys. 313 (2012), no. 2, 425–443, DOI
10.1007/s00220-012-1498-8. MR2942956
[McO79] Robert C. McOwen, The behavior of the Laplacian on weighted Sobolev spaces,
Comm. Pure Appl. Math. 32 (1979), no. 6, 783–795, DOI 10.1002/cpa.3160320604.
MR539158
[McO80] Robert C. McOwen, On elliptic operators in R^n, Comm. Partial Differential Equations
5 (1980), no. 9, 913–933, DOI 10.1080/03605308008820158. MR584101
[MSY82] William Meeks III, Leon Simon, and Shing Tung Yau, Embedded minimal surfaces,
exotic spheres, and manifolds with positive Ricci curvature, Ann. of Math. (2) 116
(1982), no. 3, 621–659, DOI 10.2307/2007026. MR678484
[MY80] William H. Meeks III and Shing Tung Yau, Topology of three-dimensional manifolds
and the embedding problems in minimal surface theory, Ann. of Math. (2) 112 (1980),
no. 3, 441–484, DOI 10.2307/1971088. MR595203
[Mey63] Norman Meyers, An expansion about infinity for solutions of linear elliptic equations,
J. Math. Mech. 12 (1963), 247–264. MR0149072
[Mia02] Pengzi Miao, Positive mass theorem on manifolds admitting corners along a hy-
persurface, Adv. Theor. Math. Phys. 6 (2002), no. 6, 1163–1182 (2003), DOI
10.4310/ATMP.2002.v6.n6.a4. MR1982695
[Mia04] Pengzi Miao, Variational effect of boundary mean curvature on ADM mass in gen-
eral relativity, Mathematical physics research on the leading edge, Nova Sci. Publ.,
Hauppauge, NY, 2004, pp. 145–171. MR2068577
[Mia09] Pengzi Miao, On a localized Riemannian Penrose inequality, Comm. Math. Phys.
292 (2009), no. 1, 271–284, DOI 10.1007/s00220-009-0834-0. MR2540078
[MT15] Pengzi Miao and Luen-Fai Tam, Static potentials on asymptotically flat manifolds,
Ann. Henri Poincaré 16 (2015), no. 10, 2239–2264, DOI 10.1007/s00023-014-0373-x.
MR3385979
[MT16] Pengzi Miao and Luen-Fai Tam, Evaluation of the ADM mass and center of mass
via the Ricci tensor, Proc. Amer. Math. Soc. 144 (2016), no. 2, 753–761, DOI
10.1090/proc/12726. MR3430851
[MM15] Mario Micallef and Vlad Moraru, Splitting of 3-manifolds and rigidity of area-
minimising surfaces, Proc. Amer. Math. Soc. 143 (2015), no. 7, 2865–2872, DOI
10.1090/S0002-9939-2015-12137-5. MR3336611
[MO89] Maung Min-Oo, Scalar curvature rigidity of asymptotically hyperbolic spin manifolds,
Math. Ann. 285 (1989), no. 4, 527–539, DOI 10.1007/BF01452046. MR1027758
[MTW73] Charles W. Misner, Kip S. Thorne, and John Archibald Wheeler, Gravitation, W. H.
Freeman and Co., San Francisco, Calif., 1973. MR0418833
[Mor16] Frank Morgan, Geometric measure theory: A beginner’s guide; Illustrated by James
F. Bredt, 5th ed., Elsevier/Academic Press, Amsterdam, 2016. MR3497381
[MT07] John Morgan and Gang Tian, Ricci flow and the Poincaré conjecture, Clay Math-
ematics Monographs, vol. 3, American Mathematical Society, Providence, RI; Clay
Mathematics Institute, Cambridge, MA, 2007. MR2334563
[MT14] John Morgan and Gang Tian, The geometrization conjecture, Clay Mathematics
Monographs, vol. 5, American Mathematical Society, Providence, RI; Clay Math-
ematics Institute, Cambridge, MA, 2014. MR3186136
[Mor48] Charles B. Morrey Jr., The problem of Plateau on a Riemannian manifold, Ann. of
Math. (2) 49 (1948), 807–851, DOI 10.2307/1969401. MR0027137
[MzHRS73] H. Müller zum Hagen, David C. Robinson, and H. J. Seifert, Black holes in static
vacuum space-times, General Relativity and Gravitation 4 (1973), 53–78, DOI
10.1007/bf00769760. MR0398432
[MP86] R. C. Myers and M. J. Perry, Black holes in higher-dimensional space-times, Ann.
Physics 172 (1986), no. 2, 304–347, DOI 10.1016/0003-4916(86)90186-7. MR868295
[Nir53] Louis Nirenberg, The Weyl and Minkowski problems in differential geometry in the
large, Comm. Pure Appl. Math. 6 (1953), 337–394, DOI 10.1002/cpa.3160060303.
MR0058265
[NW73] Louis Nirenberg and Homer F. Walker, The null spaces of elliptic partial differential
operators in R^n, J. Math. Anal. Appl. 42 (1973), 271–301, DOI 10.1016/0022-247X(73)90138-8.
Collection of articles dedicated to Salomon Bochner. MR0320821
[Nun13] Ivaldo Nunes, Rigidity of area-minimizing hyperbolic surfaces in three-manifolds,
J. Geom. Anal. 23 (2013), no. 3, 1290–1302, DOI 10.1007/s12220-011-9287-8.
MR3078354
[OW07] Todd A. Oliynyk and Eric Woolgar, Rotationally symmetric Ricci flow on asymptot-
ically flat manifolds, Comm. Anal. Geom. 15 (2007), no. 3, 535–568. MR2379804
[O’N83] Barrett O’Neill, Semi-Riemannian geometry: With applications to relativity, Pure
and Applied Mathematics, vol. 103, Academic Press, Inc. [Harcourt Brace Jovanovich,
Publishers], New York, 1983. MR719023
[PT82] Thomas Parker and Clifford Henry Taubes, On Witten’s proof of the positive energy
theorem, Comm. Math. Phys. 84 (1982), no. 2, 223–238. MR661134
[Pen65] Roger Penrose, Gravitational collapse and space-time singularities, Phys. Rev. Lett.
14 (1965), 57–59, DOI 10.1103/PhysRevLett.14.57. MR0172678
[Pen73] Roger Penrose, Naked singularities, Annals N. Y. Acad. Sci. 224 (1973), 125–134.
[Pen02] R. Penrose, Gravitational collapse: the role of general relativity, Gen. Relativity Grav-
itation 34 (2002), no. 7, 1141–1165, DOI 10.1023/A:1016578408204. Reprinted from
Rivista del Nuovo Cimento 1969, Numero Speziale I, 252–276. MR1915236
[Per02] Grisha Perelman, The entropy formula for the Ricci flow and its geometric applica-
tions, arXiv:math (2002).
[Per03a] Grisha Perelman, Finite extinction time for the solutions to the Ricci flow on certain
three-manifolds, arXiv:math (2003).
[Per03b] Grisha Perelman, Ricci flow with surgery on three-manifolds, arXiv:math (2003).
[Pet16] Peter Petersen, Riemannian geometry, 3rd ed., Graduate Texts in Mathematics,
vol. 171, Springer, Cham, 2016. MR3469435
[Pog52] A. V. Pogorelov, Regularity of a convex surface with given Gaussian curvature
(Russian), Mat. Sbornik N.S. 31(73) (1952), 88–103. MR0052807
[QT07] Jie Qing and Gang Tian, On the uniqueness of the foliation of spheres of constant
mean curvature in asymptotically flat 3-manifolds, J. Amer. Math. Soc. 20 (2007),
no. 4, 1091–1110, DOI 10.1090/S0894-0347-07-00560-7. MR2328717
[Rad30] Tibor Radó, On Plateau’s problem, Ann. of Math. (2) 31 (1930), no. 3, 457–469, DOI
10.2307/1968237. MR1502955
[RT74] Tullio Regge and Claudio Teitelboim, Role of surface integrals in the Hamiltonian
formulation of general relativity, Ann. Physics 88 (1974), 286–318,
DOI 10.1016/0003-4916(74)90404-7. MR0359663
[Rei60] E. R. Reifenberg, Solution of the Plateau Problem for m-dimensional surfaces of
varying topological type, Acta Math. 104 (1960), 1–92, DOI 10.1007/BF02547186.
MR0114145
[Rei73] Robert C. Reilly, Variational properties of functions of the mean curvatures for hy-
persurfaces in space forms, J. Differential Geometry 8 (1973), 465–477. MR0341351
[Rob75] D. C. Robinson, Uniqueness of the Kerr black hole, Phys. Rev. Lett. 34 (1975), 905–
906.
[Rob77] D. C. Robinson, A simple proof of the generalization of Israel’s theorem, General
Relativity and Gravitation 8 (August 1977), 695–698.
[Rob09] David C. Robinson, Four decades of black hole uniqueness theorems, The Kerr space-
time, Cambridge Univ. Press, Cambridge, 2009, pp. 115–143. MR2789145
[Ros07] Jonathan Rosenberg, Manifolds of positive scalar curvature: a progress report, Sur-
veys in differential geometry. Vol. XI, Surv. Differ. Geom., vol. 11, Int. Press,
Somerville, MA, 2007, pp. 259–294, DOI 10.4310/SDG.2006.v11.n1.a9. MR2408269
[RS01] Jonathan Rosenberg and Stephan Stolz, Metrics of positive scalar curvature and
connections with surgery, Surveys on surgery theory, Vol. 2, Ann. of Math. Stud.,
vol. 149, Princeton Univ. Press, Princeton, NJ, 2001, pp. 353–386. MR1818778
[Sac62] R. K. Sachs, Gravitational waves in general relativity. VIII. Waves in asymp-
totically flat space-time, Proc. Roy. Soc. Ser. A 270 (1962), 103–126, DOI
10.1098/rspa.1962.0206. MR0149908
[SU81] J. Sacks and K. Uhlenbeck, The existence of minimal immersions of 2-spheres, Ann.
of Math. (2) 113 (1981), no. 1, 1–24, DOI 10.2307/1971131. MR604040
[SU82] J. Sacks and K. Uhlenbeck, Minimal immersions of closed Riemann surfaces, Trans.
Amer. Math. Soc. 271 (1982), no. 2, 639–652, DOI 10.2307/1998902. MR654854
[Sch34] J. Schauder, Über lineare elliptische Differentialgleichungen zweiter Ordnung
(German), Math. Z. 38 (1934), no. 1, 257–282, DOI 10.1007/BF01170635.
MR1545448
[Sch83] Richard M. Schoen, Uniqueness, symmetry, and embeddedness of minimal surfaces,
J. Differential Geom. 18 (1983), no. 4, 791–809 (1984). MR730928
[Sch84] Richard Schoen, Conformal deformation of a Riemannian metric to constant scalar
curvature, J. Differential Geom. 20 (1984), no. 2, 479–495. MR788292
[Sch88] Richard M. Schoen, The existence of weak solutions with prescribed singular behavior
for a conformally invariant scalar equation, Comm. Pure Appl. Math. 41 (1988),
no. 3, 317–392, DOI 10.1002/cpa.3160410305. MR929283
[Sch89] Richard M. Schoen, Variational theory for the total scalar curvature functional for
Riemannian metrics and related topics, Topics in calculus of variations (Montecatini
Terme, 1987), Lecture Notes in Math., vol. 1365, Springer, Berlin, 1989, pp. 120–154,
DOI 10.1007/BFb0089180. MR994021
[SS81] Richard Schoen and Leon Simon, Regularity of stable minimal hypersurfaces,
Comm. Pure Appl. Math. 34 (1981), no. 6, 741–797, DOI 10.1002/cpa.3160340603.
MR634285
[SY79a] Richard M. Schoen and Shing Tung Yau, Complete manifolds with nonnegative scalar
curvature and the positive action conjecture in general relativity, Proc. Nat. Acad.
Sci. U.S.A. 76 (1979), no. 3, 1024–1025, DOI 10.1073/pnas.76.3.1024. MR524327
[SY79b] R. Schoen and Shing Tung Yau, Existence of incompressible minimal surfaces and
the topology of three-dimensional manifolds with nonnegative scalar curvature, Ann.
of Math. (2) 110 (1979), no. 1, 127–142, DOI 10.2307/1971247. MR541332
[SY79c] Richard Schoen and Shing Tung Yau, On the proof of the positive mass conjecture
in general relativity, Comm. Math. Phys. 65 (1979), no. 1, 45–76. MR526976
[SY79d] R. Schoen and S. T. Yau, On the structure of manifolds with positive scalar cur-
vature, Manuscripta Math. 28 (1979), no. 1-3, 159–183, DOI 10.1007/BF01647970.
MR535700
[SY81a] Richard Schoen and Shing Tung Yau, The energy and the linear momentum of space-
times in general relativity, Comm. Math. Phys. 79 (1981), no. 1, 47–51. MR609227
[SY81b] Richard Schoen and Shing Tung Yau, Proof of the positive mass theorem. II, Comm.
Math. Phys. 79 (1981), no. 2, 231–260. MR612249
[SY83] Richard Schoen and S. T. Yau, The existence of a black hole due to condensation of
matter, Comm. Math. Phys. 90 (1983), no. 4, 575–579. MR719436
[SY88] R. Schoen and S.-T. Yau, Conformally flat manifolds, Kleinian groups and scalar cur-
vature, Invent. Math. 92 (1988), no. 1, 47–71, DOI 10.1007/BF01393992. MR931204
[SY94] R. Schoen and S.-T. Yau, Lectures on differential geometry, Conference Proceedings
and Lecture Notes in Geometry and Topology, I, International Press, Cambridge,
MA, 1994. Lecture notes prepared by Wei Yue Ding, Kung Ching Chang [Gong Qing
Zhang], Jia Qing Zhong and Yi Chao Xu; Translated from the Chinese by Ding and S.
Y. Cheng; With a preface translated from the Chinese by Kaising Tso. MR1333601
[SY17] Richard Schoen and Shing-Tung Yau, Positive scalar curvature and minimal hyper-
surface singularities, arXiv:1704.05490 (2017).
[Sch08] Fernando Schwartz, Existence of outermost apparent horizons with product
of spheres topology, Comm. Anal. Geom. 16 (2008), no. 4, 799–817, DOI
10.4310/CAG.2008.v16.n4.a3. MR2471370
[SZ97] Ying Shen and Shunhui Zhu, Rigidity of stable minimal hypersurfaces, Math. Ann.
309 (1997), no. 1, 107–116, DOI 10.1007/s002080050105. MR1467649
[Shi89] Wan-Xiong Shi, Ricci deformation of the metric on complete noncompact Riemann-
ian manifolds, J. Differential Geom. 30 (1989), no. 2, 303–394. MR1010165
[ST02] Yuguang Shi and Luen-Fai Tam, Positive mass theorem and the boundary behaviors
of compact manifolds with nonnegative scalar curvature, J. Differential Geom. 62
(2002), no. 1, 79–125. MR1987378
[Sim68] James Simons, Minimal varieties in riemannian manifolds, Ann. of Math. (2) 88
(1968), 62–105, DOI 10.2307/1970556. MR0233295
[Sim83] Leon Simon, Lectures on geometric measure theory, Proceedings of the Centre for
Mathematical Analysis, Australian National University, vol. 3, Australian National
University, Centre for Mathematical Analysis, Canberra, 1983. MR756417
[Sim95] Leon Simon, Rectifiability of the singular sets of multiplicity 1 minimal surfaces and
energy minimizing maps, Surveys in differential geometry, Vol. II (Cambridge, MA,
1993), Int. Press, Cambridge, MA, 1995, pp. 246–305. MR1375258
[Sim97] Leon Simon, Schauder estimates by scaling, Calc. Var. Partial Differential Equations
5 (1997), no. 5, 391–407, DOI 10.1007/s005260050072. MR1459795
[Sim02] Miles Simon, Deformation of C^0 Riemannian metrics in the direction of
their Ricci curvature, Comm. Anal. Geom. 10 (2002), no. 5, 1033–1074, DOI
10.4310/CAG.2002.v10.n5.a7. MR1957662
[Sma93] Nathan Smale, Generic regularity of homologically area minimizing hypersurfaces
in eight-dimensional manifolds, Comm. Anal. Geom. 1 (1993), no. 2, 217–228, DOI
10.4310/CAG.1993.v1.n2.a2. MR1243523
[Smi82] Francis R. Smith, On the existence of embedded minimal 2-spheres in the 3-sphere,
endowed with an arbitrary Riemannian metric, 1982. Thesis (Ph.D.)–University of
Melbourne.
[SSA17] Christina Sormani and Iva Stavrov Allen, Geometrostatic manifolds of small ADM
mass, arXiv:1707.03008 (2017).
[Spi79] Michael Spivak, A comprehensive introduction to differential geometry. Vol. I, 2nd
ed., Publish or Perish, Inc., Wilmington, Del., 1979. MR532830
[Sto92] Stephan Stolz, Simply connected manifolds of positive scalar curvature, Ann. of Math.
(2) 136 (1992), no. 3, 511–540, DOI 10.2307/2946598. MR1189863
[Tak94] Peter Takáč, A short elementary proof of the Kreı̆n-Rutman theorem, Houston J.
Math. 20 (1994), no. 1, 93–98. MR1272563
[Tam84] I. Tamanini, On the sphericity of liquid droplets (English, with French summary),
Astérisque 118 (1984), 235–241. Variational methods for equilibrium problems of
fluids (Trento, 1983). MR761754
[Top59] V. A. Toponogov, Evaluation of the length of a closed geodesic on a convex surface
(Russian), Dokl. Akad. Nauk SSSR 124 (1959), 282–284. MR0102055
[Tra58] A. Trautman, Radiation and boundary conditions in the theory of gravitation, Bull.
Acad. Polon. Sci. Sér. Sci. Math. Astr. Phys. 6 (1958), 407–412. MR0097266
[Tru68] Neil S. Trudinger, Remarks concerning the conformal deformation of Riemannian
structures on compact manifolds, Ann. Scuola Norm. Sup. Pisa (3) 22 (1968), 265–
274. MR0240748
[TW14] Wilderich Tuschmann and Michael Wiemeler, Differentiable stability and sphere
theorems for manifolds and Einstein manifolds with positive scalar curvature,
arXiv:1408.3006 (2014).
[Won12] Willie Wai-Yeung Wong, A positive mass theorem for two spatial dimensions,
arXiv:1202.6279 (2012).
[Wal84] Robert M. Wald, General relativity, University of Chicago Press, Chicago, IL, 1984.
MR757180
[Wan03] Li He Wang, A geometric approach to the Calderón-Zygmund estimates, Acta
Math. Sin. (Engl. Ser.) 19 (2003), no. 2, 381–396, DOI 10.1007/s10114-003-0264-4.
MR1987802
[Wan15] Mu-Tao Wang, Four lectures on quasi-local mass, arXiv:1510.02931 (2015).
[WY09] Mu-Tao Wang and Shing-Tung Yau, Isometric embeddings into the Minkowski space
and new quasi-local mass, Comm. Math. Phys. 288 (2009), no. 3, 919–942, DOI
10.1007/s00220-009-0745-0. MR2504860
[Wan01] Xiaodong Wang, The mass of asymptotically hyperbolic manifolds, J. Differential
Geom. 57 (2001), no. 2, 273–299. MR1879228
[Whi87] Brian White, The space of m-dimensional surfaces that are stationary for a para-
metric elliptic functional, Indiana Univ. Math. J. 36 (1987), no. 3, 567–602, DOI
10.1512/iumj.1987.36.36031. MR905611
[Wik] Wikipedia contributors, Wikipedia, the free encyclopedia. [Online].
[Wil65] T. J. Willmore, Note on embedded surfaces (English, with Romanian and Russian
summaries), An. Şti. Univ. “Al. I. Cuza” Iaşi Secţ. I a Mat. (N.S.) 11B (1965),
493–496. MR0202066
[Wit81] Edward Witten, A new proof of the positive energy theorem, Comm. Math. Phys. 80
(1981), no. 3, 381–402. MR626707
[Wlo87] J. Wloka, Partial differential equations, Cambridge University Press, Cambridge,
1987. Translated from the German by C. B. Thomas and M. J. Thomas. MR895589
[Yam60] Hidehiko Yamabe, On a deformation of Riemannian structures on compact mani-
folds, Osaka Math. J. 12 (1960), 21–37. MR0125546
[Zip09] Nina Zipser, Part II: Solutions of the Einstein-Maxwell equations, Extensions of the
stability theorem of the Minkowski space in general relativity, AMS/IP Stud. Adv.
Math., vol. 45, Amer. Math. Soc., Providence, RI, 2009, pp. 297–491. MR2537048
Index
ADM energy-momentum, 225, 257
ADM mass, 68, 72–74, 91, 226
apparent horizon
  in initial data sets, 228, 240, 245–247
  Riemannian, 107, 109, 110, 112
asymptotically flat, 66
  initial data sets, 225
axisymmetric, 81, 220

background metric, 12
Bartnik mass, 182
Bianchi identities, 11, 12
black hole, 228
Bochner formula, see also Weitzenböck formula
Bondi mass, see also Trautman-Bondi mass
boost, 209
Bray flow, 142
Brown-York mass, 197

Cauchy hypersurface, 213
causal, 211
causal future, 211
causal structure, 211
Clifford algebra, 161
coframe, 4
conformal, 21
conformal Laplacian, 22
conformally flat, 77, 83
constraint equations, 221
constraint operator, 285
  modified, 296
cosmic censorship, 238

d.o.c., see also domain of outer communication
DEC, see also dominant energy condition
deformation vector field, 27
density theorem
  for DEC, 295
  for vacuum initial data, 292
  nonnegative scalar curvature case, 101
  scalar-flat case, 89
Dirac operator, 166
divergence, 9, 25
divergence theorem, 10
domain of outer communication, 228
dominant energy condition, 222, 223

Einstein constraint equations, see also constraint equations
Einstein equations, see also Einstein field equations
Einstein field equations, 214
Einstein tensor, 12
Einstein-Hilbert action, 216
elliptic estimates, 303, 304
enclosed region, 109
enclosing, 109
enclosing boundary, 109
exceptional set, 316

first variation of mean curvature, 32
first variation of volume, 28, 29, 32
frame, 4
Fredholm, 306
Fredholm index, see also index

Gauss curvature, 12
Gauss equation, 25, 26
Gauss-Bonnet Theorem, 17
Gauss-Codazzi equations, 25
Geroch monotonicity, 123, 132
globally hyperbolic, 213
graphical hypersurfaces, 78, 119

Hölder inequality, 320
harmonic functions, 10, 315
harmonic polynomial, 315
Hawking area theorem, 240
Hawking mass, 121
Hodge Laplacian, 186
Hopf maximum principle, see also maximum principle

index, 306
index form, 15
initial data set, 223
inverse mean curvature flow, 121, 123, 125
isotopic, 27

Kähler, 84
Kelvin transform, 318
Kerr spacetime, 220
Killing field, 8
Krein-Rutman Theorem, 310
Kruskal-Szekeres, 219

Laplace-Beltrami operator, 10
Laplacian, 10
Legendre polynomials, 317
Levi-Civita connection, 8
Lichnerowicz formula, see also Schrödinger-Lichnerowicz formula
Lie derivative, 7
linearization, 26
Lorentz transformations, 208
Lorentzian, 207

marginally outer trapped surface, 234, 240
maximum principle, 302
mean curvature, 24
min-max, 61
minimal, 29
minimizing hull, 109
Minkowski space, 208
MOTS, see also marginally outer trapped surface

NEC, see also null energy condition
null, 210
null energy condition, 232
null expansion, 230, 233
null generators, 230
null hypersurface, 213
null second fundamental form, 233

outermost minimal hypersurface, 109
outward minimizing, 109

Penrose incompleteness, 234
Penrose inequality, 113, 121, 249
perimeter, 109
Peterson-Codazzi-Mainardi equation, 25
Poincaré group, 209
Poisson kernel, 317
principal eigenfunction, 307
principal eigenvalue, 307

quasi-local mass, 121

Raychaudhuri equation, 231
Rayleigh quotients, 308
Rellich-Kondrachov compactness, 321
Riccati equation, 231, 232
Ricci curvature, 12
Ricci flow, 48, 97, 104, 117
Riemann curvature tensor, 11
Riemannian case, see also time-symmetric

scalar curvature, 12, 14, 17
Schrödinger-Lichnerowicz formula, 166, 277
Schwarzschild
  space, 63, 66
  spacetime, 217
second fundamental form, 23
  null, 230
second variation of volume, 31–33, 35
sectional curvature, 12
shape operator, 23
  null, 230
shear scalar, 231
Sobolev embedding, 320
spacelike, 210
spacelike hypersurface, 213
spacetime, 211
special relativity, 208
spectral theorem, 313
spherical harmonics, 315
spherically symmetric, 63, 77, 251
spinor, 164
spinors, 161
stability inequality, 33, 34
stability operator
  for minimal hypersurfaces, 33
  for MOTS, 243
stable
  minimal submanifold, 33
  MOTS, 243
static, 214
stationary, 220
stress-energy tensor, 214
strong maximum principle, see also maximum principle

time-symmetric, 225
timelike, 210
timelike hypersurface, 213
trapped surface, 234
Trautman-Bondi mass, 250
two-sided, 24

uniformization theorem, 21

vacuum, 216, 223
  static, 184, 218

Wang-Yau mass, 198
Weitzenböck formula, 48, 96, 185
Weyl tensor, 83
Willmore inequality, 122

Yamabe positive, 18
Yamabe problem, 21

zonal harmonic, 317

Selected Published Titles in This Series
201 Dan A. Lee, Geometric Relativity, 2019
199 Weinan E, Tiejun Li, and Eric Vanden-Eijnden, Applied Stochastic Analysis, 2019
198 Robert L. Benedetto, Dynamics in One Non-Archimedean Variable, 2019
197 Walter Craig, A Course on Partial Differential Equations, 2018
196 Martin Stynes and David Stynes, Convection-Diffusion Problems, 2018
195 Matthias Beck and Raman Sanyal, Combinatorial Reciprocity Theorems, 2018
194 Seth Sullivant, Algebraic Statistics, 2018
193 Martin Lorenz, A Tour of Representation Theory, 2018
192 Tai-Peng Tsai, Lectures on Navier-Stokes Equations, 2018
191 Theo Bühler and Dietmar A. Salamon, Functional Analysis, 2018
190 Xiang-dong Hou, Lectures on Finite Fields, 2018
189 I. Martin Isaacs, Characters of Solvable Groups, 2018
188 Steven Dale Cutkosky, Introduction to Algebraic Geometry, 2018
187 John Douglas Moore, Introduction to Global Analysis, 2017
186 Bjorn Poonen, Rational Points on Varieties, 2017
185 Douglas J. LaFountain and William W. Menasco, Braid Foliations in Low-Dimensional Topology, 2017
184 Harm Derksen and Jerzy Weyman, An Introduction to Quiver Representations, 2017
183 Timothy J. Ford, Separable Algebras, 2017
182 Guido Schneider and Hannes Uecker, Nonlinear PDEs, 2017
181 Giovanni Leoni, A First Course in Sobolev Spaces, Second Edition, 2017
180 Joseph J. Rotman, Advanced Modern Algebra: Third Edition, Part 2, 2017
179 Henri Cohen and Fredrik Strömberg, Modular Forms, 2017
178 Jeanne N. Clelland, From Frenet to Cartan: The Method of Moving Frames, 2017
177 Jacques Sauloy, Differential Galois Theory through Riemann-Hilbert Correspondence, 2016
176 Adam Clay and Dale Rolfsen, Ordered Groups and Topology, 2016
175 Thomas A. Ivey and Joseph M. Landsberg, Cartan for Beginners: Differential Geometry via Moving Frames and Exterior Differential Systems, Second Edition, 2016
174 Alexander Kirillov Jr., Quiver Representations and Quiver Varieties, 2016
173 Lan Wen, Differentiable Dynamical Systems, 2016
172 Jinho Baik, Percy Deift, and Toufic Suidan, Combinatorics and Random Matrix Theory, 2016
171 Qing Han, Nonlinear Elliptic Equations of the Second Order, 2016
170 Donald Yau, Colored Operads, 2016
169 András Vasy, Partial Differential Equations, 2015
168 Michael Aizenman and Simone Warzel, Random Operators, 2015
167 John C. Neu, Singular Perturbation in the Physical Sciences, 2015
166 Alberto Torchinsky, Problems in Real and Functional Analysis, 2015
165 Joseph J. Rotman, Advanced Modern Algebra: Third Edition, Part 1, 2015
164 Terence Tao, Expansion in Finite Simple Groups of Lie Type, 2015
163 Gérald Tenenbaum, Introduction to Analytic and Probabilistic Number Theory, Third Edition, 2015
162 Firas Rassoul-Agha and Timo Seppäläinen, A Course on Large Deviations with an Introduction to Gibbs Measures, 2015
For a complete list of titles in this series, visit the AMS Bookstore at www.ams.org/bookstore/gsmseries/.
Many problems in general relativity are essentially geometric in nature, in the sense that
they can be understood in terms of Riemannian geometry and partial differential equations.
This book is centered around the study of mass in general relativity using the techniques of
geometric analysis. Specifically, it provides a comprehensive treatment of the positive mass
theorem and closely related results, such as the Penrose inequality, drawing on a variety of
tools used in this area of research, including minimal hypersurfaces, conformal geometry,
inverse mean curvature flow, conformal flow, spinors and the Dirac operator, marginally
outer trapped surfaces, and density theorems. This is the first time these topics have been
gathered into a single place and presented with an advanced graduate student audience in
mind; several dozen exercises are also included.

The main prerequisite for this book is a working understanding of Riemannian geometry
and basic knowledge of elliptic linear partial differential equations, with only minimal prior
knowledge of physics required. The second part of the book includes a short crash course
on general relativity, which provides background for the study of asymptotically flat initial
data sets satisfying the dominant energy condition.
For additional information and updates on this book, visit
www.ams.org/bookpages/gsm-201
GSM/201
www.ams.org

