Calculus III - Notes of B. Tsirelson

Tel Aviv University, 2016 Analysis-III
1 Preliminaries
1a Conventions, notation, terminology etc.
1b Linear algebra
1c Topology
1d Differentiation
1e Textbooks to 1b, 1c, 1d
1f Change of basis
1a Conventions, notation, terminology etc.

$\mathbb{R}$ . . . the real line
$\mathbb{R}^n$ . . . $\{(x_1,\dots,x_n) : x_1,\dots,x_n \in \mathbb{R}\}$
Thus, $\mathbb{R}^{m+n} = \mathbb{R}^m \times \mathbb{R}^n$ up to canonical isomorphism.¹
$A \subset B$ . . . $\forall x \; (x \in A \implies x \in B)$
Thus, $(A \subset B) \wedge (B \subset A) \iff (A = B)$.²
$A \uplus B$ . . . just $A \cup B$ when $A \cap B = \emptyset$, otherwise undefined.
$(1,\dots,n)$ or $(x_1,\dots,x_n)$ . . . finite sequence
$(1,2,\dots)$ or $(x_1,x_2,\dots)$ . . . infinite sequence
$f : A \to B$ . . . $f \subset A \times B$ and $\forall x \in A \; \exists! y \in B \; (x,y) \in f$.³
$Tx$ . . . the same as $T(x)$ when a mapping $T$ is linear.
$|x|$ (for $x \in \mathbb{R}^n$) . . . $\sqrt{x_1^2 + \dots + x_n^2}$, Euclidean norm
$\langle x, y \rangle$ (for $x, y \in \mathbb{R}^n$) . . . $x_1 y_1 + \dots + x_n y_n$, scalar product
$A^\circ$, $\overline{A}$ (for $A \subset \mathbb{R}^n$) . . . the interior and the closure
near a point . . . in some neighborhood of the point
Index of terminology and notation is often available at the end of a section.
¹ "A rule of thumb: there is a canonical isomorphism between X and Y if and only if
you would feel comfortable writing X = Y." Reid Barton, see Mathoverflow, "What is
the definition of 'canonical'?"
² Why $\subset$ and $\subsetneq$ rather than $\subseteq$ and $\subset$? First, our textbooks do so; second, I
need $\subset$ several times a day, while $\subsetneq$ hardly once a month.
³ Here B is the codomain, generally not the image of f.
1b Linear algebra
Vector space (=linear space) (usually, over R)
Linear operator (=mapping=function) between vector spaces
Isomorphism of vector spaces: a linear bijection.
Basis of a vector space
Dimension of a finite-dimensional vector space: the number of vectors in
every basis.
Two finite-dimensional vector spaces are isomorphic if and only if their di-
mensions are equal.
Subspace of a vector space.
Inner product on a vector space: $\langle x, y \rangle$
A basis of a subspace, being a linearly independent system, can be extended
to a basis of the whole finite-dimensional vector space.
1c Topology
A sequence of points of $\mathbb{R}^n$; its convergence, limit
Mapping $\mathbb{R}^n \to \mathbb{R}^m$; continuity (at a point; on a set)
Cauchy criterion of convergence
Subsequence; Bolzano-Weierstrass theorem
Subset of $\mathbb{R}^n$, its limit points; closed set; bounded set
Compact set
Open set
Closure, boundary, interior
Open cover; Heine-Borel theorem
Open ball, closed ball, sphere
1c1 Exercise. Prove or disprove: a mapping $f : \mathbb{R}^2 \to \mathbb{R}$ is continuous if and
only if it is continuous in each coordinate separately; that is, $f(x, \cdot) : \mathbb{R} \to \mathbb{R}$
is continuous for every $x$, and $f(\cdot, y) : \mathbb{R} \to \mathbb{R}$ is continuous for every $y$.
1c2 Exercise. (a) Prove that a finite union of closed sets is closed, but a union
of countably many closed sets need not be closed; moreover, every open set
in $\mathbb{R}^n$ is such a union. However, an intersection of closed sets is always closed.
(b) Formulate and prove the dual statement (take the complement).
1c3 Exercise. Prove that a set $K \subset \mathbb{R}^n$ is compact if and only if every
continuous function $f : K \to \mathbb{R}$ is bounded.
1c4 Exercise. Prove that a continuous image of a compact set is compact,
but a continuous image of a bounded set need not be bounded, and a continuous
image of a closed set need not be closed; moreover, every open set in
$\mathbb{R}^n$ is a continuous image of a closed set.¹
1c5 Exercise. Prove that every decreasing sequence of nonempty compact
sets has a nonempty intersection. Does it hold for closed sets? for open sets?
1c6 Exercise. Let $X \subset \mathbb{R}^n$ be a closed set, $f : X \to \mathbb{R}^m$ a continuous
mapping. Prove that its graph $\Gamma_f = \{(x, f(x)) : x \in X\}$ is a closed subset
of $\mathbb{R}^{n+m}$. Is the converse true?
1c7 Exercise. Formulate accurately and prove: composition of two continuous
mappings is continuous.
1c8 Exercise. Prove existence of a bijection $f$ from the open unit ball
$\{x : |x| < 1\} \subset \mathbb{R}^n$ onto the whole $\mathbb{R}^n$ such that $f$ and $f^{-1}$ are continuous. (Such
mappings are called homeomorphisms.) What about the closed ball?
1c9 Exercise. Let $f : \mathbb{R} \to \mathbb{R}$ be a continuous bijection. Prove that
$f^{-1} : \mathbb{R} \to \mathbb{R}$ is continuous.
1c10 Exercise. Give an example of a continuous bijection
$f : [0, 1) \to S^1 = \{(x, y) : x^2 + y^2 = 1\} \subset \mathbb{R}^2$ such that $f^{-1} : S^1 \to [0, 1)$ fails to be continuous.
The same for $f : [0, \infty) \to S^1$.
1c11 Exercise. Give an example of a continuous bijection
$f : \mathbb{R} \to A = \{(x, y) : (|x| - 1)^2 + y^2 = 1\} \subset \mathbb{R}^2$
such that $f^{-1} : A \to \mathbb{R}$ fails to be continuous.
1c12 Exercise. Give an example of a continuous bijection
$f : \mathbb{R}^2 \to B = \{(x, y, z) : (\sqrt{x^2 + y^2} - 1)^2 + z^2 = 1\} \subset \mathbb{R}^3$
such that $f^{-1} : B \to \mathbb{R}^2$ fails to be continuous.²
¹ Hint: the closed set need not be connected.
² What about a continuous bijection $f : \mathbb{R}^n \to \mathbb{R}^n$? In fact, $f^{-1}$ is continuous, which
can be proved using powerful means of topology (the Brouwer invariance of domain
theorem); we'll return to this point later.
1d Differentiation
$f : \mathbb{R}^n \to \mathbb{R}^m$;
$f(x) = f(x_0) + A(x - x_0) + o(|x - x_0|)$, or
$f(x + h) = f(x) + Ah + o(|h|)$;
$A$ is a matrix, or a linear mapping $\mathbb{R}^n \to \mathbb{R}^m$.
$A = (Df)_x = Df(x) = df(x) = \text{(etc)} : \mathbb{R}^n \to \mathbb{R}^m$ (derivative, or differential)
$Ah = A(h) = (D_h f)_x = (Df)_x h = Df(x)h = df(x, h) = \text{(etc)} \in \mathbb{R}^m$ (derivative along vector)
$D_k f = D_{e_k} f \in \mathbb{R}^m$, $e_k = (0, \dots, 0, 1, 0, \dots, 0)$ (partial derivative)
$(Df)_x h = h_1 (D_1 f)_x + \dots + h_n (D_n f)_x \in \mathbb{R}^m$ since $h = h_1 e_1 + \dots + h_n e_n$
$(Df)_x = \big( (D_1 f)_x, \dots, (D_n f)_x \big)$ (columns of matrix)
$f(x) = \begin{pmatrix} f_1(x) \\ \vdots \\ f_m(x) \end{pmatrix}$; $(Df)_x = \begin{pmatrix} (Df_1)_x \\ \vdots \\ (Df_m)_x \end{pmatrix}$ (rows of matrix)
$(Df)_x = \big( (D_j f_i)_x \big)_{i=1,\dots,m,\; j=1,\dots,n}$ (elements of matrix)
$(D(f + g))_x = (Df)_x + (Dg)_x$, $(D(cf))_x = c(Df)_x$ (linearity of D)
$(D(g \circ f))_x = (Dg)_{f(x)} (Df)_x$ (chain rule)
For $m = 1$ only: $(D_h f)_x = \langle \nabla f(x), h \rangle$; $\nabla f(x) \in \mathbb{R}^n$ (gradient)
$\nabla (fg)(x) = f(x) \nabla g(x) + g(x) \nabla f(x)$ (product rule)
For $n = 1$ only: $(Df)_x h = h f'(x)$, $f'(x) \in \mathbb{R}^m$, $h \in \mathbb{R}$.
If $D_1 f, \dots, D_n f$ exist and are continuous, then $Df$ exists (and is continuous).¹
If $D_i D_j f$ and $D_j D_i f$ exist and are continuous, then $D_i D_j f = D_j D_i f$.²
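1d1 Exercise. Generalize the product rule³
(a) for the scalar product $\langle f(\cdot), g(\cdot) \rangle$ where $f, g : \mathbb{R}^n \to \mathbb{R}^m$;
(b) for the pointwise product $fg$ where $f : \mathbb{R}^n \to \mathbb{R}$ and $g : \mathbb{R}^n \to \mathbb{R}^m$.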
¹ Moreover, if $D_1 f, \dots, D_n f$ exist near $x_0$ and are continuous at $x_0$, then $Df$ exists at
$x_0$. (Zorich, Sect. 8.4.2, Th. 2.)
² Moreover, if $D_i D_j f$ exists near $x_0$ and is continuous at $x_0$, then $D_j D_i f$ exists at $x_0$,
and $(D_i D_j f)_{x_0} = (D_j D_i f)_{x_0}$. (Courant, Sect. 1.4d.)
³ More generally: Shurman Ex. 4.4.8, 4.4.9.
some clarifications
For $Df$ to be defined at $x$ it is necessary that $f$ is defined near $x$. If $f$ is
defined on a set with empty interior, we have no $Df$. For example, consider
the mapping from the cylinder $C = \{(x, y, z) : x^2 + y^2 = 1, \; -1 < z < 1\}$
to the sphere $S = \{(x, y, z) : x^2 + y^2 + z^2 = 1\}$, defined by
$f(x, y, z) = \big( x\sqrt{1 - z^2}, \; y\sqrt{1 - z^2}, \; z \big)$. As you'll see in Analysis-4, in this case $(Df)_x$ for
$x \in C$ is a linear operator¹ from the tangent plane $T_x C$ to $C$ to the tangent
plane $T_{f(x)} S$ to $S$. But it is not a 3 × 3 matrix, and is beyond Analysis-3.
Never mind (until you reach Analysis-4). But note that linear operators will
be more useful than matrices.²
For $f : \mathbb{R}^n \to \mathbb{R}^m$ we have $Df : \mathbb{R}^n \to (\mathbb{R}^n \to \mathbb{R}^m)$ (currying), in the
sense that $Df : x \mapsto \big( h \mapsto (Df)_x h \big)$. Sometimes we treat it as a function of
$x$, sometimes as a function of $h$. For example, it is usual to say that "if f
is linear then Df = f". Really?! For $m = n = 1$ we know that $(e^x)' = e^x$,
while $x' \ne x$, $x' = 1$ (a constant). What happens?
For $f(x) = e^x$ we have $(Df)_x = f'(x) = e^x$, but this $e^x$ is treated as a
1 × 1 matrix $(e^x)$, thus, the linear mapping $h \mapsto e^x h$;
$$D(x \mapsto e^x) : x \mapsto (h \mapsto e^x h) .$$
For $g(x) = x$ we have $(Dg)_x = g'(x) = 1 : h \mapsto 1 \cdot h$;
$$D(\underbrace{x \mapsto x}_{\mathrm{id}}) : \underbrace{x \mapsto (\underbrace{h \mapsto h}_{\mathrm{id}})}_{\mathrm{const}} .$$
In some sense this is id, and in another sense this is const.

It is also usual to say that "the differential of the composition is the
composition of differentials". Really?! For $m = n = 1$ we know that
$(e^{\sin x})' = e^{\sin x} \cos x \ne e^{\cos x}$. Yes, but one means that, given $f(x) = y$ and $g(y) = z$, we
have $(D(g \circ f))_x = (Dg)_y \circ (Df)_x$ (the chain rule); and it is usual to write
$AB$ rather than $A \circ B$ when $A, B$ are linear operators.
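The chain rule in its matrix form can be checked numerically. The following is only a sketch under stated assumptions: the mappings `f` and `g` are hypothetical examples (they do not come from these notes), and the matrices $(Df)_x$, $(Dg)_y$ are approximated by central differences rather than computed exactly.

```python
import math

def f(x):
    # hypothetical example mapping f : R^2 -> R^2
    x1, x2 = x
    return [x1 * x2, math.sin(x1) + x2 ** 2]

def g(y):
    # hypothetical example mapping g : R^2 -> R^2
    y1, y2 = y
    return [math.exp(y1) - y2, y1 * y2]

def jacobian(func, x, eps=1e-6):
    """Central-difference approximation of the matrix (D func)_x."""
    n = len(x)
    cols = []
    for k in range(n):
        xp, xm = list(x), list(x)
        xp[k] += eps
        xm[k] -= eps
        fp, fm = func(xp), func(xm)
        # k-th column of the matrix: the partial derivative D_k func
        cols.append([(a - b) / (2 * eps) for a, b in zip(fp, fm)])
    m = len(cols[0])
    return [[cols[k][i] for k in range(n)] for i in range(m)]

def matmul(A, B):
    """Matrix product AB, i.e. the composition of the two linear operators."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

x = [0.3, -0.7]
lhs = jacobian(lambda t: g(f(t)), x)                 # (D(g o f))_x
rhs = matmul(jacobian(g, f(x)), jacobian(f, x))      # (Dg)_{f(x)} (Df)_x
max_gap = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
print(max_gap)  # small: only finite-difference and rounding error remain
```

Note that `jacobian(g, f(x))` is evaluated at the point $y = f(x)$, not at $x$; this is exactly the point of the remark above.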
1d2 Exercise. Formulate accurately and prove the following two claims
about a differentiable mapping $f : \mathbb{R}^n \to \mathbb{R}^m$:
(a) f is linear if and only if Df = f ;
(b) f is linear if and only if f (0) = 0 and Df is constant.
¹ Not isometric, but preserves the area.
² Zorich requires f to be defined near x in Sect. 8.2.2 and later, but not in Sect. 8.2.1
(thus, Df need not be unique in 8.2.1).
1d3 Exercise.
Consider functions $f : \mathbb{R}^2 \setminus \{(0, 0)\} \to \mathbb{R}$ constant on all
rays from the origin; that is, $f(r \cos\theta, r \sin\theta) = h(\theta)$
for some $h : \mathbb{R} \to \mathbb{R}$, $h(\theta + 2\pi) = h(\theta)$. Assume that $h$
is continuous.
(a) Prove that the iterated limits
$$\lim_{x \to 0+} \lim_{y \to 0+} f(x, y) \quad \text{and} \quad \lim_{y \to 0+} \lim_{x \to 0+} f(x, y)$$
exist and are equal to $h(0)$ and $h(\pi/2)$ respectively.
(b) Prove that the "full" limit
$$\lim_{(x,y) \to (0,0),\; x > 0,\; y > 0} f(x, y)$$
exists if and only if $h$ is constant on $[0, \pi/2]$.
(c) It can happen that the two iterated limits exist and are equal, but the
"full" limit does not exist. Give an example.
(d) The same as (c) and in addition, f is a rational function (that is, the
ratio of two polynomials).¹
(e) Generalize all that to arbitrary (not just positive) x, y.
1d4 Exercise.
Consider functions $g : \mathbb{R}^2 \setminus \{(0, 0)\} \to \mathbb{R}$ of the form
$g(x, y) = f(x^2, y)$ where $f$ is as in 1d3.
(a) Prove that the limit
$$\lim_{t \to 0+} g(ta, tb)$$
exists for every $(a, b) \ne (0, 0)$; calculate the limit in terms of the function $h$
of 1d3.
(b) It can happen that the "full" limit
$$\lim_{(x,y) \to (0,0)} g(x, y)$$
does not exist. Give an example.

1d5 Exercise.² It can happen that $\frac{d}{dt}\big|_{t=0} f(x_0 + th)$ exists for all $h$ but is
not linear in $h$. (Of course, such $f$ cannot be differentiable at $x_0$.) Give an
example.³
1d6 Exercise.⁴ It can happen that $\frac{d}{dt}\big|_{t=0} f(x_0 + th)$ exists for all $h$ and is
linear in $h$ and nevertheless $f$ is not differentiable at $x_0$. Give an example.⁵
"The multivariate derivative is truly a pan-dimensional construct,
not just an amalgamation of cross sectional data."
(Shurman, p.156)
¹ Hint: try $x^2 + y^2$ in the denominator.
² Shurman, Ex. 4.8.10.
³ Hint: try $(x, y) \mapsto f(x, y) \sqrt{x^2 + y^2}$ for $f$ as in 1d3.
⁴ Shurman, Ex. 4.8.11.
⁵ Hint: try $(x, y) \mapsto f(x, y) \sqrt{x^2 + y^2}$ for $f$ as in 1d4.
1e Textbooks to 1b, 1c, 1d

R. Courant, F. John "Introduction to calculus and analysis" vol. 2, Springer 1989.
W. Fleming "Functions of several variables" Springer 1977.
J. Hubbard, B. Hubbard "Vector calculus, linear algebra, and differential forms" Prentice-Hall 2002.
S. Lang "Undergraduate analysis" Springer 1997.
T. Shifrin "Multivariable mathematics" Wiley 2005.
J. Shurman "Multivariable calculus" (online only).
V. Zorich "Mathematical analysis I" Springer 2004.

Textbook   linear algebra               topology                   differentiation
Courant    2.1–2.3                      1.1–1.3; A.1–A.3           1.4–1.7
Fleming    1.2–1.3                      1.4; 2.1–2.5; 2.8, 2.11    3.1–3.3; 4.1–4.4
Hubbard    1.4                          1.5–1.6                    1.7–1.9
Lang       6.1–6.3                      6.4–7.2; 8                 15.1–15.2; 17
Shifrin    1; 4.3; 5.1–5.3              2; 5.1                     3
Shurman    2.1–2.2; 3.1–3.2; 3.5–3.7    2.3–2.4                    4.1–4.7.1; 4.8
Zorich     8.1                          7                          8.2–8.4.4
1f Change of basis
linear algebra
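Let $V$ be an n-dimensional vector space, and $(\alpha_1, \dots, \alpha_n)$ a basis of $V$.
Then each $v \in V$ is $x_1 \alpha_1 + \dots + x_n \alpha_n$ for some $x_1, \dots, x_n \in \mathbb{R}$, uniquely
determined by $v$, and the mapping $L_\alpha : \mathbb{R}^n \to V$ defined by
$L_\alpha (x_1, \dots, x_n) = x_1 \alpha_1 + \dots + x_n \alpha_n$ is an isomorphism (of vector spaces). One says that these
$x_1, \dots, x_n$ are the coordinates of $v$ w.r.t. this basis, and $x = (x_1, \dots, x_n) \in \mathbb{R}^n$
is the coordinate vector of $v$ relative to this basis.
In particular, if $V = \mathbb{R}^n$ and $(\alpha_1, \dots, \alpha_n)$ is the standard basis $(e_1, \dots, e_n)$
of $\mathbb{R}^n$, then $L_\alpha = \mathrm{id}$, that is, $L_\alpha (x_1, \dots, x_n) = (x_1, \dots, x_n)$. In general,
$L_\alpha (e_i) = \alpha_i$ for $i = 1, \dots, n$.
Another basis $(\beta_1, \dots, \beta_n)$ of $V$ leads to another isomorphism $L_\beta : \mathbb{R}^n \to V$,
$L_\beta (e_i) = \beta_i$; and then we have the commutative diagram
$$\mathbb{R}^n \xrightarrow{\;L_\alpha\;} V, \qquad \mathbb{R}^n \xrightarrow{\;L_\beta\;} V, \qquad V \xrightarrow{\;L_\beta L_\alpha^{-1}\;} V; \qquad e_i \mapsto \alpha_i, \quad e_i \mapsto \beta_i, \quad \alpha_i \mapsto \beta_i .$$
That is, $L_\beta L_\alpha^{-1} : V \to V$, $L_\beta L_\alpha^{-1} \alpha_i = \beta_i$. This is the so-called active
transformation of $V$ that transforms $(\alpha_1, \dots, \alpha_n)$ to $(\beta_1, \dots, \beta_n)$. On the other hand
we have the commutative diagram
$$\mathbb{R}^n \xrightarrow{\;L_\alpha\;} V, \qquad \mathbb{R}^n \xrightarrow{\;L_\beta\;} V, \qquad \mathbb{R}^n \xrightarrow{\;L_\beta^{-1} L_\alpha\;} \mathbb{R}^n; \qquad x \mapsto v, \quad y \mapsto v, \quad x \mapsto y ;$$
$x_1 \alpha_1 + \dots + x_n \alpha_n = v = y_1 \beta_1 + \dots + y_n \beta_n$; $L_\beta^{-1} L_\alpha : \mathbb{R}^n \to \mathbb{R}^n$,
$L_\beta^{-1} L_\alpha (x_1, \dots, x_n) = (y_1, \dots, y_n)$. This is the so-called passive transformation of $\mathbb{R}^n$ that
transforms the coordinate vector (of arbitrary $v \in V$) relative to one basis into
the coordinate vector (of the same $v$) relative to the other basis.
Let $A = (a_{i,j})_{i,j}$ be the matrix of the operator $L_\beta^{-1} L_\alpha : \mathbb{R}^n \to \mathbb{R}^n$;
that is, $y_i = \sum_j a_{i,j} x_j$. Then
$$\sum_j x_j \alpha_j = v = \sum_i y_i \beta_i = \sum_{i,j} a_{i,j} x_j \beta_i = \sum_j x_j \sum_i a_{i,j} \beta_i ,$$
that is,
$$\alpha_j = \sum_i a_{i,j} \beta_i .$$
We see that $A$ describes both the passive transformation and the relation
between the two bases.¹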
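A small computation may make the two roles of the matrix $A$ concrete. The sketch below is an illustration only: the two bases of $\mathbb{R}^2$ are hypothetical (chosen here, not taken from the exercises), and the helper name `coords_in_basis` is introduced just for this example. Column $j$ of $A$ holds the coordinates of $\alpha_j$ relative to $(\beta_1, \beta_2)$, and the same $A$ converts coordinate vectors.

```python
from fractions import Fraction as F

# Two hypothetical bases of V = R^2 (illustration only).
alpha = [(F(1), F(1)), (F(1), F(-1))]
beta = [(F(2), F(1)), (F(1), F(1))]

def coords_in_basis(b1, b2, v):
    """Coordinates (y1, y2) with y1*b1 + y2*b2 = v, by Cramer's rule."""
    det = b1[0] * b2[1] - b2[0] * b1[1]
    y1 = (v[0] * b2[1] - b2[0] * v[1]) / det
    y2 = (b1[0] * v[1] - v[0] * b1[1]) / det
    return (y1, y2)

# Column j of A = coordinates of alpha_j relative to beta: alpha_j = sum_i a_ij beta_i.
cols = [coords_in_basis(beta[0], beta[1], a) for a in alpha]
A = [[cols[j][i] for j in range(2)] for i in range(2)]

def combine(coeffs, basis):
    """The vector coeffs[0]*basis[0] + coeffs[1]*basis[1]."""
    return tuple(sum(c * b[k] for c, b in zip(coeffs, basis)) for k in range(2))

# Passive transformation: the same v, two coordinate vectors x and y = A x.
x = (F(3), F(5))
y = tuple(sum(A[i][j] * x[j] for j in range(2)) for i in range(2))
v_from_x = combine(x, alpha)
v_from_y = combine(y, beta)
print(A, v_from_x, v_from_y)
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the two expressions for $v$ can be compared with `==` rather than up to rounding.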
1f1 Exercise.² Consider the 2-dimensional vector subspace $V = \{(x, y, z) :
x + y + z = 0\}$ of $\mathbb{R}^3$, and two bases:
1 = (1, 1, 0) , 1 = (0, 1, 1) ,
and
2 = (1, 0, 1) , 2 = (1, 1, 2) .
Find the change-of-basis matrix A.
1f2 Exercise. Consider the 3-dimensional vector space $V$ of all functions
$P : \mathbb{R} \to \mathbb{R}$ such that $\forall x \; P'''(x) = 0$, and two coordinate systems on $V$:
$P \mapsto \big( P(0), P'(0), P''(0) \big)$
and $P \mapsto \big( P(-1), P(0), P(1) \big)$.
Find the two bases of $V$ (that correspond to these coordinate systems), and
the change-of-basis matrix.
¹ See also: "Change of basis" and "Active and passive transformation" in Wikipedia;
Hubbard Sect. 2.6.
² Hubbard 2.6.17. A quote therefrom:
"Note that unlike $\mathbb{R}^3$, for which the 'obvious' basis is the standard basis vectors, the
subspace $V \subset \mathbb{R}^3$ in Example 2.6.17 does not come with a distinguished basis."
topology
We may transfer all topological notions from $\mathbb{R}^n$ to arbitrary n-dimensional
vector space. For example, consider the space $V$ of quadratic polynomials
(Exer. 1f2); given $P, P_k \in V$, we may interpret $P_k \to P$ as $P_k(0) \to P(0)$,
$P_k'(0) \to P'(0)$, $P_k''(0) \to P''(0)$. Or alternatively, as $P_k(-1) \to P(-1)$,
$P_k(0) \to P(0)$, $P_k(1) \to P(1)$. Is it the same? Yes, it is, as we'll see soon.
1f3 Exercise. (a) Every linear mapping $\mathbb{R}^n \to \mathbb{R}^m$ is continuous;
(b) every invertible linear mapping $\mathbb{R}^n \to \mathbb{R}^n$ is a homeomorphism (that
is, continuous invertible mapping with continuous inverse).
Prove it.
1f4 Exercise. Every homeomorphism $\varphi : \mathbb{R}^n \to \mathbb{R}^n$ preserves topological
notions; namely:
$x_k \to x \iff \varphi(x_k) \to \varphi(x)$;
$A$ is open $\iff \varphi(A)$ is open; and the same for "closed", and "compact";
$\varphi(A^\circ) = (\varphi(A))^\circ$; $\varphi(\overline{A}) = \overline{\varphi(A)}$; and $\varphi(\partial A) = \partial(\varphi(A))$ (the boundary,
$\partial A = \overline{A} \setminus A^\circ$).
Prove it.
We apply this, in particular, to $\varphi = L_\beta^{-1} L_\alpha$, and conclude.
Topological notions in $\mathbb{R}^n$ are insensitive to a change of basis.
Topological notions are well-defined in every n-dimensional vector space,
and preserved by isomorphisms of these spaces.
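1f5 Exercise. Every (vector) subspace of a finite-dimensional vector space
is closed (topologically).
Prove it.¹,²
¹ Hint: choose a basis.
² This claim fails in infinite dimension.
A mapping $f : \mathbb{R}^n \to \mathbb{R}^m$ relates two spaces; accordingly, we introduce
two homeomorphisms, $\varphi : \mathbb{R}^n \to \mathbb{R}^n$ and $\psi : \mathbb{R}^m \to \mathbb{R}^m$,
$$\mathbb{R}^n \xrightarrow{\;f\;} \mathbb{R}^m, \qquad \mathbb{R}^n \xrightarrow{\;\psi \circ f \circ \varphi^{-1}\;} \mathbb{R}^m \quad \text{(the columns being connected by } \varphi \text{ and } \psi\text{)},$$
getting a mapping $g = \psi \circ f \circ \varphi^{-1} : \mathbb{R}^n \to \mathbb{R}^m$.
1f6 Exercise. (a) $f$ is continuous $\iff$ $g$ is continuous;
(b) $\forall x \in \mathbb{R}^n$ ($f$ is continuous at $x$ $\iff$ $g$ is continuous at $\varphi(x)$);
(c) $\forall x \in \mathbb{R}^n$ ($f$ is continuous near $x$ $\iff$ $g$ is continuous near $\varphi(x)$).
Prove it.
Thus, when checking continuity of a given mapping, we may choose at
will a pair of bases. This applies to any pair of finite-dimensional vector
spaces. The case $m = n$ is not an exception; for $f : \mathbb{R}^n \to \mathbb{R}^n$ we still may
use two different bases, thus treating $f$ as a mapping between two copies of
$\mathbb{R}^n$.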
metric
A Euclidean metric on an n-dimensional vector space $V$ may be defined
equivalently as
* an inner product $x, y \mapsto \langle x, y \rangle$ on $V$;
* a norm $x \mapsto |x|$ on $V$ that corresponds to some inner product by
$|x|^2 = \langle x, x \rangle$; in this case the norm $|\cdot|$ is called Euclidean, and
$\langle x, y \rangle = \frac12 \big( |x + y|^2 - |x|^2 - |y|^2 \big) = \frac14 \big( |x + y|^2 - |x - y|^2 \big)$;
* distance function $x, y \mapsto |x - y|$ that corresponds to some Euclidean
norm $|\cdot|$.
On $\mathbb{R}^n$ we have the standard Euclidean metric, and the standard basis of
$\mathbb{R}^n$ is orthonormal in this metric.
An arbitrary basis $(\alpha_1, \dots, \alpha_n)$ of a vector space $V$ leads to the Euclidean
metric $|x_1 \alpha_1 + \dots + x_n \alpha_n| = \sqrt{x_1^2 + \dots + x_n^2}$, and is orthonormal in this (and
only this) metric. On the other hand, for arbitrary Euclidean metric on $V$
there exists an orthonormal basis (due to the orthogonalization process).
An n-dimensional vector space endowed with a Euclidean metric is called
n-dimensional Euclidean space.
Let $E$ be an n-dimensional Euclidean space. A basis $(\alpha_1, \dots, \alpha_n)$ of $E$ is
orthonormal if and only if the operator $L_\alpha : (x_1, \dots, x_n) \mapsto x_1 \alpha_1 + \dots + x_n \alpha_n$
is isometric, that is, $\forall x \in \mathbb{R}^n \; |L_\alpha x| = |x|$. By isomorphism of Euclidean
spaces we mean an isometric invertible linear operator. All n-dimensional
Euclidean spaces are isomorphic (to each other, and to $\mathbb{R}^n$).
For arbitrary (not just isometric) invertible linear operator $L : E_1 \to E_2$
between Euclidean spaces there exist $a, b \in (0, \infty)$ such that
(1f7) $\forall x \in E_1 \quad a|x| \le |Lx| \le b|x|$ .
Indeed, the ball $B = \{x \in E_1 : |x| \le 1\}$ is compact, therefore $L(B) \subset E_2$ is
compact, which gives $b < \infty$. The same argument applies to $L^{-1} : E_2 \to E_1$,
giving $1/a < \infty$.
It follows that two arbitrary Euclidean norms $|\cdot|_1$, $|\cdot|_2$ on an n-dimensional
vector space $V$ are equivalent:¹
(1f8) $\exists a, b \in (0, \infty) \;\; \forall x \in V \quad a|x|_1 \le |x|_2 \le b|x|_1$ .
Proof: apply 1f7 to $E_1 = (V, |\cdot|_1)$, $E_2 = (V, |\cdot|_2)$ and $L = \mathrm{id} : x \mapsto x$.
1f9 Exercise. Find an orthonormal basis in the space $V$ of 1f1 with the
standard Euclidean metric inherited from $\mathbb{R}^3$.
1f10 Exercise. Is it possible to endow V of 1f2 with a Euclidean metric

such that both bases (mentioned in 1f2) are orthonormal?
space of matrices or linear operators
1f11 Definition. The norm $\|A\|$ of a linear operator $A : E_1 \to E_2$ between
finite-dimensional Euclidean vector spaces $E_1$, $E_2$ is
$$\|A\| = \sup_{x \in E_1,\, x \ne 0} \frac{|Ax|}{|x|} .$$
Also,
$$\|A\| = \max_{|x| \le 1} |Ax|$$
(think, why); this is the maximum of a continuous function on a compact
set.
The operator norm $\|A\|$ of a matrix $A : \mathbb{R}^n \to \mathbb{R}^m$ is, by definition, the
norm of the corresponding operator.
1f12 Exercise. If a matrix $A = (a_{i,j})_{i,j}$ is diagonal then
$$\|A\| = \max_{i = 1, \dots, \min(m,n)} |a_{i,i}| .$$
Prove it.
¹ In fact, two norms (Euclidean or not) are always equivalent in finite dimension (but
not in infinite dimension).
The set $L(\mathbb{R}^n \to \mathbb{R}^m)$ of all matrices evidently is an mn-dimensional vector
space. Does the operator norm turn it to a Euclidean space? No, it does
not. Even if we restrict ourselves to $L(\mathbb{R}^2 \to \mathbb{R}^2)$, and even to its 2-dimensional
subspace of diagonal matrices, we get (by 1f12, up to isomorphism)
$\mathbb{R}^2$ with the norm
$$\|(s, t)\| = \max(|s|, |t|) ,$$
its unit ball $\{x : \|x\| \le 1\}$ being the square $[-1, 1] \times [-1, 1]$. This is not the
Euclidean plane! For two non-collinear vectors $a = (1, 1)$ and $b = (1, -1)$ we
have $\|a\| = 1$, $\|b\| = 1$ and $\|a + b\| = 2$, which never happens on the Euclidean
plane. Also, the parallelogram equality $|a - b|^2 + |a + b|^2 = 2|a|^2 + 2|b|^2$
holds for arbitrary vectors $a, b$ of a Euclidean space, but fails for the operator
norm.
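The failure of the parallelogram equality for the norm $\max(|s|, |t|)$ is easy to check directly. The sketch below is an illustration only; the function names are introduced here and the vectors are the pair $a = (1, 1)$, $b = (1, -1)$.

```python
import math

def max_norm(v):
    """The norm max(|s|, |t|) on R^2."""
    return max(abs(c) for c in v)

def euclid_norm(v):
    """The standard Euclidean norm on R^2."""
    return math.sqrt(sum(c * c for c in v))

def parallelogram_gap(norm, a, b):
    """|a-b|^2 + |a+b|^2 - 2|a|^2 - 2|b|^2; it vanishes for every Euclidean norm."""
    diff = [p - q for p, q in zip(a, b)]
    summ = [p + q for p, q in zip(a, b)]
    return norm(diff) ** 2 + norm(summ) ** 2 - 2 * norm(a) ** 2 - 2 * norm(b) ** 2

a, b = (1.0, 1.0), (1.0, -1.0)
gap_max = parallelogram_gap(max_norm, a, b)        # 4 + 4 - 2 - 2
gap_euclid = parallelogram_gap(euclid_norm, a, b)  # zero up to rounding
print(gap_max, gap_euclid)
```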
1f13 Exercise. Prove that $\|\cdot\|$ is a norm on $L(\mathbb{R}^n \to \mathbb{R}^m)$, that is,
$\|tA\| = |t| \, \|A\|$ for all $A \in L(\mathbb{R}^n \to \mathbb{R}^m)$, $t \in \mathbb{R}$;
$\|A + B\| \le \|A\| + \|B\|$ for all $A, B \in L(\mathbb{R}^n \to \mathbb{R}^m)$;
$\|A\| > 0$ whenever $A \ne 0$.
1f14 Exercise. Consider the composition $BA : E_1 \to E_3$ of two linear operators
$A : E_1 \to E_2$ and $B : E_2 \to E_3$ between Euclidean spaces $E_1, E_2, E_3$;
prove that $\|BA\| \le \|B\| \, \|A\|$.
Treating a matrix as just mn numbers, we have a Euclidean norm, the
so-called Hilbert-Schmidt norm $\|A\|_{HS}$ of a matrix $A = (a_{i,j})_{i,j}$:
$$\|A\|_{HS} = \Big( \sum_{i,j} a_{i,j}^2 \Big)^{1/2} .$$
1f15 Exercise. (a) $\|A\|_{HS} = \sqrt{\operatorname{trace}(A^* A)}$;
(b) $\|A\| \le \|A\|_{HS} \le \sqrt{n} \, \|A\|$.¹
Prove it.
Thus, the operator norm is equivalent to the Euclidean norm; both may
be used when dealing with topological notions in $L(\mathbb{R}^n \to \mathbb{R}^m)$.
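The two norms can be compared numerically for a 2 × 2 matrix. The sketch below rests on assumptions that are not stated in these notes: it uses the standard fact that the operator norm is the square root of the largest eigenvalue of $A^T A$ (written in closed form for the 2 × 2 case), and the matrix is a hypothetical example. A brute-force maximum of $|Ax|$ over the unit circle serves as a cross-check.

```python
import math

def hs_norm(A):
    """Hilbert-Schmidt norm: square root of the sum of squared entries."""
    return math.sqrt(sum(x * x for row in A for x in row))

def op_norm_2x2(A):
    """Operator norm of a 2x2 matrix via the largest eigenvalue of A^T A."""
    (a, b), (c, d) = A
    t = a * a + b * b + c * c + d * d      # trace(A^T A)
    det = a * d - b * c                    # det(A^T A) = det^2
    disc = max(t * t - 4 * det * det, 0.0)
    return math.sqrt((t + math.sqrt(disc)) / 2)

def op_norm_sampled(A, steps=100000):
    """Cross-check: maximum of |Ax| over sampled points of the unit circle."""
    best = 0.0
    for k in range(steps):
        phi = 2 * math.pi * k / steps
        x, y = math.cos(phi), math.sin(phi)
        best = max(best, math.hypot(A[0][0] * x + A[0][1] * y,
                                    A[1][0] * x + A[1][1] * y))
    return best

A = [[3.0, 1.0], [0.0, 2.0]]   # hypothetical example matrix
op, hs, sampled = op_norm_2x2(A), hs_norm(A), op_norm_sampled(A)
print(op, hs, sampled)
```

Here $n = 2$, so the expected chain is $\|A\| \le \|A\|_{HS} \le \sqrt{2}\,\|A\|$.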
1f16 Exercise. The following conditions on matrices $A, A_k \in L(\mathbb{R}^n \to \mathbb{R}^m)$
are equivalent:
(a) $A_k \to A$;
(b) all elements of $A_k$ converge to the corresponding elements of $A$; that
is, $(A_k)_{i,j} \to A_{i,j}$ as $k \to \infty$ for all $i, j$.
Prove it.
1f17 Exercise. In the situation of 1f14 prove that $BA$ is a continuous function
of $A, B$, in two ways (via 1f14, and via 1f16).
¹ Hint to $\|A\| \le \|A\|_{HS}$: denoting the rows of $A$ by $r_1, \dots, r_m \in \mathbb{R}^n$ we have
$Ax = \big( \langle r_1, x \rangle, \dots, \langle r_m, x \rangle \big)$. Hint to $\|A\|_{HS} \le \sqrt{n} \, \|A\|$: denoting the columns of $A$ by $c_1, \dots, c_n \in \mathbb{R}^m$ we
have $|c_j| \le \|A\|$ for each $j = 1, \dots, n$.
1f18 Exercise. (a) Determinant is a continuous function $A \mapsto \det A$ on
$L(\mathbb{R}^n \to \mathbb{R}^n)$;
(b) invertible operators are an open set;
(c) the mapping $A \mapsto A^{-1}$ is continuous on this open set.
Prove it.¹
1f19 Exercise. If $A \in L(\mathbb{R}^n \to \mathbb{R}^n)$ satisfies $\|A\| < 1$, then
(a) the series $\mathrm{id} - A + A^2 - A^3 + \dots$ converges in $L(\mathbb{R}^n \to \mathbb{R}^n)$;
(b) the sum $S$ of this series satisfies $(\mathrm{id} + A)S = \mathrm{id}$, $S(\mathrm{id} + A) = \mathrm{id}$; thus,
$\mathrm{id} + A$ is invertible;
(c) $\det(\mathrm{id} + A) > 0$.
Prove it.²
differentiation
Looking at the definition of $(Df)_x$ for $f : \mathbb{R}^n \to \mathbb{R}$,
$$f(x + h) = f(x) + (Df)_x h + o(|h|) ,$$
we observe that it does not involve any basis. True, it involves the Euclidean
norm; but the notion $o(|h|)$ is insensitive to the choice of a norm due to (1f8),
and we may write $o(h)$ instead of $o(|h|)$.
For $f : \mathbb{R}^n \to \mathbb{R}^m$, two norms appear:
$$\frac{|f(x + h) - f(x) - (Df)_x h|_{\mathbb{R}^m}}{|h|_{\mathbb{R}^n}} \to 0 \quad \text{as } h \to 0 ,$$
and still, (1f8) ensures that both norms do not matter.
When differentiating a given mapping, we may choose at will a pair of
bases. This applies to any pair of finite-dimensional vector spaces.
Here, by "differentiating" we mean checking differentiability and calculating
the differential (interpreted as a linear operator, not matrix).
In contrast, partial derivatives (elements of the matrix of the linear operator)
depend on the bases. Moreover, sometimes the partial derivatives exist
but the differential does not exist.
1f20 Exercise. It can happen that both partial derivatives of $f : \mathbb{R}^2 \to \mathbb{R}$ at
$(0, 0)$ vanish in the standard basis of $\mathbb{R}^2$, but do not vanish in another basis.
Give an example.³
¹ Hint: recall the algebraic formulas for $\det A$ and $A^{-1}$.
² Hint: (c) consider $\det(\mathrm{id} + tA)$ for $t \in [0, 1]$.
³ Hint: similar to 1d5.
Looking at the definition of the gradient,
$$\langle \nabla f(x), h \rangle = (D_h f)_x \quad \text{for } f : \mathbb{R}^n \to \mathbb{R} ,$$
we observe that it does not involve any basis, but involves the Euclidean
metric. And indeed, the gradient depends on the choice of the metric. It
is well-defined for differentiable real-valued functions on a Euclidean space.
Any orthonormal basis may be used equally well.
1f21 Exercise. On the space $V$ of 1f2 consider the function
$f : P \mapsto \int_{-1}^{1} P(t) \, dt$. Find $\nabla f(0)$ twice, in the two bases mentioned in 1f2 (that is,
relative to the two corresponding Euclidean metrics). Did you get two different
elements of $V$?
1f22 Definition. Let $U \subset \mathbb{R}^n$ be an open set. A differentiable mapping
$f : U \to \mathbb{R}^m$ is continuously differentiable if the mapping $Df$ is continuous
(from $U$ to $L(\mathbb{R}^n \to \mathbb{R}^m)$). The set of all continuously differentiable mappings
$U \to \mathbb{R}^m$ is denoted by $C^1(U \to \mathbb{R}^m)$. In particular, $C^1(U) = C^1(U \to \mathbb{R})$.
Here $\mathbb{R}^n$ and $\mathbb{R}^m$ may be replaced with finite-dimensional vector spaces.
Note that $C^1(U \to \mathbb{R}^m)$ is a vector space, and $C^1(U)$ is an algebra:
$fg \in C^1(U)$ for all $f, g \in C^1(U)$.
1f23 Exercise. For $f \in C^1(U \to \mathbb{R}^m)$ and $g \in C^1(\mathbb{R}^m \to \mathbb{R}^\ell)$ prove that
$g \circ f \in C^1(U \to \mathbb{R}^\ell)$.¹
1f24 Exercise. A mapping $f$ is continuously differentiable if and only if all
partial derivatives $D_i f_j$ exist and are continuous. (Here $f(x) = \big( f_1(x), \dots, f_m(x) \big)$.)
Prove it.
1f25 Exercise. (a) Let $f \in C^1(U)$ and $g \in C^1(U \to \mathbb{R}^m)$; prove that
$fg \in C^1(U \to \mathbb{R}^m)$ (pointwise product).
(b) Let $f, g \in C^1(U \to \mathbb{R}^m)$; prove that $\langle f(\cdot), g(\cdot) \rangle \in C^1(U)$ (scalar
product).²
¹ Hint: chain rule, 1c7 and 1f17.
² Hint: use 1d1.
Below, by "differentiate" I mean: (1) find the derivative at every point of
differentiability, and (2) prove non-differentiability at every other point.
1f26 Exercise. (a) Differentiate the mapping $\mathbb{R}^2 \ni (r, \theta) \mapsto (r \cos\theta, r \sin\theta) \in \mathbb{R}^2$.
(b) Differentiate the function $f : (0, \infty) \times \mathbb{R} \to \mathbb{R}$ defined by
$f(r, \theta) = g(r \cos\theta, r \sin\theta)$ for a given differentiable $g : \mathbb{R}^2 \to \mathbb{R}$.
(c) For $f, g$ as in (b) prove that
$$\Big( \frac{\partial g}{\partial x} \Big)^2 + \Big( \frac{\partial g}{\partial y} \Big)^2 = \Big( \frac{\partial f}{\partial r} \Big)^2 + \frac{1}{r^2} \Big( \frac{\partial f}{\partial \theta} \Big)^2$$
whenever $x = r \cos\theta$, $y = r \sin\theta$, $r > 0$.
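1f27 Exercise.¹ (a) Determinant is a continuously differentiable function
$f : A \mapsto \det A$ on $L(\mathbb{R}^n \to \mathbb{R}^n)$;
(b) $(Df)_{\mathrm{id}}(H) = \operatorname{tr}(H)$ for all $H \in L(\mathbb{R}^n \to \mathbb{R}^n)$;
(c) $(D \log |f|)_A (H) = \operatorname{tr}(A^{-1} H)$ for all $H \in L(\mathbb{R}^n \to \mathbb{R}^n)$ and all invertible
$A \in L(\mathbb{R}^n \to \mathbb{R}^n)$.
Prove it.
Thus,
$$\log |\det(A + H)| \approx \log |\det A| + \operatorname{tr}(A^{-1} H)$$
for small $H$.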
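The quality of this approximation can be observed numerically. The following sketch is an illustration under assumptions: the 2 × 2 matrices `A` and `H0` are hypothetical, and the helper names are introduced here. The gap between the exact increment of $\log|\det|$ and the linear term $\operatorname{tr}(A^{-1}H)$ should shrink roughly like the square of the size of $H$.

```python
import math

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def inv2(M):
    d = det2(M)
    return [[M[1][1] / d, -M[0][1] / d], [-M[1][0] / d, M[0][0] / d]]

def trace_of_product(P, Q):
    """tr(PQ) for 2x2 matrices."""
    return sum(P[i][k] * Q[k][i] for i in range(2) for k in range(2))

A = [[2.0, 1.0], [0.5, -3.0]]     # hypothetical invertible matrix, det = -6.5
H0 = [[0.3, -0.2], [0.1, 0.4]]    # hypothetical direction of the increment

errors = []
for eps in (1e-1, 1e-2, 1e-3):
    H = [[eps * h for h in row] for row in H0]
    AH = [[A[i][j] + H[i][j] for j in range(2)] for i in range(2)]
    exact = math.log(abs(det2(AH))) - math.log(abs(det2(A)))
    linear = trace_of_product(inv2(A), H)
    errors.append(abs(exact - linear))
print(errors)  # each entry is much smaller than the previous one
```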
1f28 Exercise. Let $f : \mathbb{R}^n \to \mathbb{R}^m$ be differentiable and symmetric in the
sense that $f(x_1, \dots, x_n)$ is insensitive to any permutation of $x_1, \dots, x_n$. Prove
that
(a) $(D_i f)_{(x_1, \dots, x_n)} = (D_j f)_{(x_1, \dots, x_n)}$ whenever $x_i = x_j$;
(b) the operator $(Df)_{(x_1, \dots, x_n)}$ cannot be one-to-one if some of $x_1, \dots, x_n$
are equal.
1f29 Exercise. Consider the vector space $V_{n+1} = \{f : f^{(n+1)}(\cdot) = 0\}$ and
the mapping $\varphi : \mathbb{R}^n \to V_{n+1}$,
$$\varphi(t_1, \dots, t_n) : t \mapsto (t - t_1) \dots (t - t_n) .$$
Prove that
(a) the operator $(D\varphi)_{(t_1, \dots, t_n)}$ cannot be invertible if some of $t_1, \dots, t_n$ are
equal;
(b) the operator $(D\varphi)_{(t_1, \dots, t_n)}$ is invertible whenever $t_1, \dots, t_n$ are pairwise
distinct;
(c) $\dim (D\varphi)_{(t_1, \dots, t_n)} (\mathbb{R}^n) = \#\{t_1, \dots, t_n\}$;
that is, the dimension of the image is equal to the number of distinct coordinates.
¹ Shurman: Ex. 4.4.9
from mean value to finite increment
Recall the 1-dimensional mean value theorem: if $f : [a, b] \to \mathbb{R}$ is continuous
on $[a, b]$ and differentiable on $(a, b)$, then $f(b) - f(a) = f'(t)(b - a)$ for
some $t \in (a, b)$.
Applying this to the function $t \mapsto f\big( a + t(b - a) \big)$ we get the n-dimensional
mean value theorem: if $G \subset \mathbb{R}^n$ is open, $f : \overline{G} \to \mathbb{R}$ is continuous on $\overline{G}$ and
differentiable on $G$, and $a, b \in \overline{G}$ are such that $a + t(b - a) \in G$ for all
$t \in (0, 1)$, then
$$f(b) - f(a) = (Df)_{a + t(b - a)} (b - a) = \langle \nabla f\big( a + t(b - a) \big), b - a \rangle$$
for some $t \in (0, 1)$; and therefore
(1f30) $$|f(b) - f(a)| \le |b - a| \sup_{t \in (0,1)} \|(Df)_{a + t(b - a)}\| = |b - a| \sup_{t \in (0,1)} |\nabla f\big( a + t(b - a) \big)| .$$
Given open $G \subset \mathbb{R}^n$; $a, b$ as before; and $f : \overline{G} \to \mathbb{R}^m$ continuous on $\overline{G}$
and differentiable on $G$, $f(x) = \big( f_1(x), \dots, f_m(x) \big)$, we may apply (1f30) to
$f_1$ and get
$$|f_1(b) - f_1(a)| \le |b - a| \, C , \qquad C = \sup_{t \in (0,1)} \|(Df)_{a + t(b - a)}\| ,$$
since $\|(Df_1)_x\| \le \|(Df)_x\|$, the rows of the matrix $(Df)_x$ being $(Df_1)_x, \dots, (Df_m)_x$.
The same holds for $f_2, \dots, f_m$,
which implies easily $|f(b) - f(a)| \le C \sqrt{m} \, |b - a|$; but we can get more,
(1f31) $|f(b) - f(a)| \le C |b - a|$ , finite increment theorem¹
just by changing the basis in $\mathbb{R}^m$ such that $f(b) - f(a)$ is proportional to the
first basis vector!
¹ Zorich vol. 2, Sect. 10.4.1, Th. 1.
Index
active transformation, 8
basis, 2
Bolzano-Weierstrass theorem, 2
boundary, 2
bounded, 2
Cauchy criterion, 2
chain rule, 4
closed, 2
closure, 2
compact, 2
continuity, 2
continuously differentiable, 14
convergence, 2
derivative, 4
derivative along vector, 4
differential, 4
dimension, 2
equivalent norms, 11
Euclidean metric, 10
Euclidean space, 10
gradient, 4
graph, 3
Heine-Borel theorem, 2
Hilbert-Schmidt, 12
homeomorphism, 3, 9
inner product, 2
interior, 2
isometric, 10
isomorphism, 2, 10
limit, 2
limit point, 2
linear operator, 2
linearity of D, 4
open, 2
operator norm, 11
partial derivative, 4
passive transformation, 8
product rule, 4
subspace, 2
vector space, 2
$C^1(U)$, 14
$C^1(U \to \mathbb{R}^m)$, 14
$(D_h f)_x$, 4
$(Df)_x h$, 4
$(Df)_x$, 4
$D_k f$, 4
$f'(x)$, 4
$L_\alpha$, 7
$L(\mathbb{R}^n \to \mathbb{R}^m)$, 11
$\nabla f(x)$, 4
$\|A\|$, 11
2 Equations, from linear to nonlinear
2a Introduction
2b Main results formulated and discussed
2c Proof, the easy part
2d Proof, the hard part
2a Introduction
Born: I should like to put to Herr Einstein a question, namely, how
quickly the action of gravitation is propagated in your theory. . .
Einstein: It is extremely simple to write down the equations for the
case when the perturbations that one introduces in the field are in-
finitely small. . . . The perturbations then propagate with the same
velocity as light.
Born: But for great perturbations things are surely very complicated?
Einstein: Yes, it is a mathematically complicated problem. It is espe-
cially difficult to find solutions of the equations, as the equations are
nonlinear. Discussion after lecture by Einstein in 1913.
...
The hardest part of differential calculus is determining when replacing
a nonlinear object by a linear one is justified.1
In other words, we want to know when the linear approximation
$$f(x_0 + h) \approx f(x_0) + (Df)_{x_0} h$$
may be trusted near $x_0$.
2a1 Example.²
$f : \mathbb{R} \to \mathbb{R}$, $f(x) = x + 3x^2 \sin\frac{1}{x}$ for $x \ne 0$, $f(0) = 0$, $x_0 = 0$.
[Figure: graph of $y = f(x)$ near 0; both axes are marked at 0.05.]
¹ Quoted from: Hubbard, Sect. 1.7, pp. 125–126.
² Hubbard, Example 1.9.4 on p. 157; Shifrin, Sect. 6.2, Example 1 on pp. 251–252.
This function is differentiable everywhere; its linear approximation near 0,
$f(x) \approx x$, is one-to-one. Nevertheless, $f$ fails to be one-to-one near 0 (and the
equation $f(x) = y$ has more than one solution).¹ The linear approximation
cheats. In fact,
$$\liminf_{x \to 0} f'(x) = -2 , \qquad \limsup_{x \to 0} f'(x) = 4$$
(think, why); $f$ is differentiable, but not continuously.
This is why throughout this section we require $f$ to be continuously
differentiable near $x_0$.
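The behaviour of $f'$ near 0 in Example 2a1 can be observed numerically. The sketch below is an illustration only; it uses the formula $f'(x) = 1 + 6x \sin\frac1x - 3\cos\frac1x$ for $x \ne 0$ (obtained by the usual rules) and evaluates it at points where $\cos\frac1x = \pm 1$. The particular sample points and the step `delta` are choices made here.

```python
import math

def f(x):
    return x + 3 * x * x * math.sin(1 / x) if x != 0 else 0.0

def fprime(x):
    # for x != 0: f'(x) = 1 + 6x sin(1/x) - 3 cos(1/x); f'(0) = 1 by definition
    return 1 + 6 * x * math.sin(1 / x) - 3 * math.cos(1 / x) if x != 0 else 1.0

for k in (10, 100, 1000):
    x_low = 1 / (2 * math.pi * k)          # cos(1/x) = 1, so f' is close to -2
    x_high = 1 / ((2 * k + 1) * math.pi)   # cos(1/x) = -1, so f' is close to 4
    print(k, fprime(x_low), fprime(x_high))

# Near a point with negative derivative f decreases, so f is not one-to-one near 0.
x_dip = 1 / (2 * math.pi * 10)
delta = 1e-5
left, right = f(x_dip - delta), f(x_dip + delta)
print(left > right)
```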
linear algebra
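2a2 Example. The matrix
$$A = \begin{pmatrix} 3 & 2 & 0 & 1 \\ 0 & 1 & 3 & -1 \\ 3 & 1 & -3 & 2 \end{pmatrix}$$
is of rank 2; it has (at least one) non-zero minor 2 × 2, but not 3 × 3, since the first row is the sum of
the two other rows. Treated as a linear operator $A : \mathbb{R}^4 \to \mathbb{R}^3$ it maps $\mathbb{R}^4$
onto a 2-dimensional subspace of $\mathbb{R}^3$, the image of $A$: $A(\mathbb{R}^4) = \{(z_1, z_2, z_3) :
z_1 - z_2 - z_3 = 0\}$. The kernel of $A$, $A^{-1}(\{0\}) = \{u \in \mathbb{R}^4 : Au = 0\}$,
is a 2-dimensional subspace of $\mathbb{R}^4$ spanned (for instance) by two vectors
$(-1, 1, 0, 1)$ and $(0, -1, 1, 2)$, according to two linear dependencies of the
columns:
$$-\begin{pmatrix} 3 \\ 0 \\ 3 \end{pmatrix} + \begin{pmatrix} 2 \\ 1 \\ 1 \end{pmatrix} + \begin{pmatrix} 1 \\ -1 \\ 2 \end{pmatrix} = 0 , \qquad -\begin{pmatrix} 2 \\ 1 \\ 1 \end{pmatrix} + \begin{pmatrix} 0 \\ 3 \\ -3 \end{pmatrix} + 2 \begin{pmatrix} 1 \\ -1 \\ 2 \end{pmatrix} = 0 .$$
It is convenient to denote a point of $\mathbb{R}^4$ by $(x_1, x_2, y_1, y_2)$; that is, $x \in \mathbb{R}^2$,
$y \in \mathbb{R}^2$, $(x, y) \in \mathbb{R}^4$. The equation $A \begin{pmatrix} x \\ y \end{pmatrix} = 0$ becomes
$\begin{pmatrix} 3 & 2 \\ 0 & 1 \end{pmatrix} x + \begin{pmatrix} 0 & 1 \\ 3 & -1 \end{pmatrix} y = 0$
(the third row is redundant, being a linear combination of other rows);
$y = -\begin{pmatrix} 0 & 1 \\ 3 & -1 \end{pmatrix}^{-1} \begin{pmatrix} 3 & 2 \\ 0 & 1 \end{pmatrix} x = -\begin{pmatrix} 1 & 1 \\ 3 & 2 \end{pmatrix} x$, since
$\begin{pmatrix} 0 & 1 \\ 3 & -1 \end{pmatrix}^{-1} = \frac13 \begin{pmatrix} 1 & 1 \\ 3 & 0 \end{pmatrix}$. Not unexpectedly,
$\begin{pmatrix} 0 \\ 1 \end{pmatrix} = -\begin{pmatrix} 1 & 1 \\ 3 & 2 \end{pmatrix} \begin{pmatrix} -1 \\ 1 \end{pmatrix}$ and
$\begin{pmatrix} 1 \\ 2 \end{pmatrix} = -\begin{pmatrix} 1 & 1 \\ 3 & 2 \end{pmatrix} \begin{pmatrix} 0 \\ -1 \end{pmatrix}$. The more general equation
$A(x, y) = z$ for a given $z \in A(\mathbb{R}^4)$ may be solved similarly;
$\begin{pmatrix} 3 & 2 \\ 0 & 1 \end{pmatrix} x + \begin{pmatrix} 0 & 1 \\ 3 & -1 \end{pmatrix} y = \tilde z$
(where $\tilde z \in \mathbb{R}^2$ is $(z_1, z_2)$ for $z = (z_1, z_2, z_3)$);
$y = \begin{pmatrix} 0 & 1 \\ 3 & -1 \end{pmatrix}^{-1} \Big( \tilde z - \begin{pmatrix} 3 & 2 \\ 0 & 1 \end{pmatrix} x \Big) = \frac13 \begin{pmatrix} 1 & 1 \\ 3 & 0 \end{pmatrix} \tilde z - \begin{pmatrix} 1 & 1 \\ 3 & 2 \end{pmatrix} x$.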
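¹ Bad news. . . But here is good news: all solutions of the equation $f(x) = y$ are close
(to each other and $y$), namely, $x = y + O(y^2)$.
In general, a matrix $A : \mathbb{R}^n \to \mathbb{R}^m$ has some rank $r \le \min(m, n)$. The
image is r-dimensional, the kernel is (n − r)-dimensional. We rearrange rows
and columns (if needed) such that the upper right r × r minor is not 0, denote
a point of $\mathbb{R}^n$ by $(x, y)$ where $x \in \mathbb{R}^{n-r}$, $y \in \mathbb{R}^r$, then the equation $A(x, y) = 0$
becomes $Bx + Cy = 0$, $B : \mathbb{R}^{n-r} \to \mathbb{R}^r$, $C : \mathbb{R}^r \to \mathbb{R}^r$, $\det C \ne 0$;
$$A = \begin{pmatrix} B & C \\ * & * \end{pmatrix} \quad \text{(block columns of widths } n - r \text{ and } r \text{; block rows of heights } r \text{ and } m - r \text{)} ;$$
the solution is $y = -C^{-1} Bx$. More generally, the equation $A(x, y) = z$ for a
given $z \in A(\mathbb{R}^n)$ becomes $Bx + Cy = \tilde z$ (where $\tilde z \in \mathbb{R}^r$ consists of the first $r$
coordinates of $z$); the solution: $y = C^{-1} (\tilde z - Bx)$.
Note existence of an r-dimensional subspace $E \subset \mathbb{R}^n$ such that the restriction
$A|_E$ is an invertible mapping from $E$ onto $A(\mathbb{R}^n)$.¹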
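The recipe $y = C^{-1}(\tilde z - Bx)$ can be tried on the matrix of Example 2a2. The sketch below is an illustration under an assumption: it takes that matrix to be the one with rows $(3, 2, 0, 1)$, $(0, 1, 3, -1)$, $(3, 1, -3, 2)$; the helper names `mat_vec`, `inv2` and `solve_y` are introduced here. Exact rational arithmetic is used so that results can be compared with `==`.

```python
from fractions import Fraction as F

# Assumed matrix of Example 2a2 (first row = sum of the other two rows).
A = [[3, 2, 0, 1],
     [0, 1, 3, -1],
     [3, 1, -3, 2]]

def mat_vec(M, v):
    """Matrix-vector product in exact arithmetic."""
    return [sum(F(a) * b for a, b in zip(row, v)) for row in M]

def inv2(M):
    """Inverse of a 2x2 matrix with nonzero determinant."""
    (a, b), (c, d) = M
    det = F(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

B = [row[:2] for row in A[:2]]   # acts on x = (x1, x2)
C = [row[2:] for row in A[:2]]   # acts on y = (y1, y2); det C = -3

def solve_y(x, z):
    """Given x and z in the image of A, return y with A(x, y) = z."""
    z_tilde = z[:2]              # the third equation is redundant
    Bx = mat_vec(B, x)
    rhs = [z_tilde[i] - Bx[i] for i in range(2)]
    return mat_vec(inv2(C), rhs)

kernel_y = solve_y([-1, 1], [0, 0, 0])   # kernel vector (-1, 1, y1, y2)
z = [11, 7, 4]                           # lies in the image: 11 - 7 - 4 = 0
x = [5, -1]
y = solve_y(x, z)
print(kernel_y, y, mat_vec(A, x + y))
```

The last product reproduces all three coordinates of $z$, including the third one, which was never used in `solve_y`; this works only because $z$ was taken from the image of $A$.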
Special cases:
$r = m \le n \iff A(\mathbb{R}^n) = \mathbb{R}^m$ (onto);
$r = n \le m \iff A^{-1}(\{0\}) = \{0\}$ (one-to-one); no $x$ variables;
$r = m = n \iff A$ is invertible (bijection).
Note that $A$ is onto if and only if its rows are linearly independent.
analysis
We turn to the equation $f(x) = y$ where $f : \mathbb{R}^n \to \mathbb{R}^m$ is continuously
differentiable near $x_0$, and introduce $A = (Df)_{x_0}$. We want to compare two
mappings, $f$ (nonlinear) and $x \mapsto f(x_0) + A(x - x_0)$ (linear),2 near $x_0$. Or,
equivalently, $h \mapsto Ah$ (linear) and $h \mapsto f(x_0 + h) - f(x_0)$ (nonlinear), near
0 (that is, for small $h$). The relevant properties of $f$, including its derivative
$A$, are insensitive to a change of the origin3 in $\mathbb{R}^n$ and $\mathbb{R}^m$, and therefore we
may assume WLOG4 that $x_0 = 0$ and $f(x_0) = 0$.
Also, all relevant properties of $f$ are insensitive to a change of basis, both
in $\mathbb{R}^n$ and $\mathbb{R}^m$; this argument will be used later.
Also, values of $f$ outside a neighborhood of 0 are irrelevant; we consider
$f$ near 0 only. This is why I often write just $f : \mathbb{R}^n \to \mathbb{R}^m$ rather than
$f : U \to \mathbb{R}^m$ where $U \subset \mathbb{R}^n$ is a neighborhood of 0.
The linear algebra gives us properties of $A : \mathbb{R}^n \to \mathbb{R}^m$, and we want to
prove the corresponding local properties of $f : \mathbb{R}^n \to \mathbb{R}^m$ near 0. Here are
some relevant definitions, global and local; the local definitions are formulated
for the case $x_0 = 0$, $f(0) = 0$; you can easily generalize them to arbitrary
$x_0 \in \mathbb{R}^n$ and $y_0 = f(x_0) \in \mathbb{R}^m$.
2a3 Definition. (a) Let $U, V \subset \mathbb{R}^n$ be open sets. A mapping $f : U \to V$
is a homeomorphism, if it is bijective, continuous, and $f^{-1} : V \to U$ is also
continuous.
(b) $f : \mathbb{R}^n \to \mathbb{R}^n$ is a local homeomorphism, if there exist open sets
$U, V \subset \mathbb{R}^n$ such that $0 \in U$, $0 \in V$, and $f$ is a homeomorphism $U \to V$.
1
Another proof: take a basis $\beta_1, \dots, \beta_r$ of $A(\mathbb{R}^n)$; choose $\alpha_1, \dots, \alpha_r \in \mathbb{R}^n$ such that
$A\alpha_1 = \beta_1, \dots, A\alpha_r = \beta_r$; consider $E$ spanned by $\alpha_1, \dots, \alpha_r$.
2
More exactly: affine.
3
It means, all points are changed (shifted), but vectors remain intact; x and y are
points, while h and Ah are vectors.
4
Without Loss Of Generality.
2a4 Definition. (a) Let $U, V \subset \mathbb{R}^n$ be open sets. A mapping $f : U \to V$
is a diffeomorphism,1 if it is bijective, continuously differentiable, and $f^{-1} :
V \to U$ is also continuously differentiable.
(b) $f : \mathbb{R}^n \to \mathbb{R}^n$ is a local diffeomorphism, if there exist open sets $U, V \subset
\mathbb{R}^n$ such that $0 \in U$, $0 \in V$, and $f$ is a diffeomorphism $U \to V$.
2a5 Exercise. For a linear $A : \mathbb{R}^n \to \mathbb{R}^m$ prove that the following conditions
are equivalent:
(a) A is invertible;
(b) A is a homeomorphism;
(c) A is a local homeomorphism;
(d) A is a diffeomorphism;
(e) A is a local diffeomorphism.
2a6 Definition. (a) Let $U \subset \mathbb{R}^n$ be an open set. A mapping $f : U \to \mathbb{R}^m$ is
open, if for every open subset $U_1 \subset U$ its image $f(U_1) \subset \mathbb{R}^m$ is open.
(b) $f : \mathbb{R}^n \to \mathbb{R}^m$ is open at 0, if for every neighborhood $U \subset \mathbb{R}^n$ of 0
there exists a neighborhood $V \subset \mathbb{R}^m$ of 0 such that $f(U) \supset V$. 2,3
2a7 Exercise. (a) Prove that $f$ is open at 0 if and only if for every sequence
$y_1, y_2, \dots \in \mathbb{R}^m$ such that $y_k \to 0$ there exists a sequence $x_1, x_2, \dots \in \mathbb{R}^n$ such
that $x_k \to 0$ and $f(x_k) = y_k$ for all $k$ large enough;
(b) generalize 2a6(b) to arbitrary $x_0$ and $y_0 = f(x_0)$;
(c) prove that $f : U \to \mathbb{R}^m$ is open if and only if $f$ is open at $x$ for every
$x \in U$.
2a8 Exercise. Prove or disprove: a continuous function $\mathbb{R} \to \mathbb{R}$ is open if
and only if it is strictly monotone.
2a9 Exercise. For a linear $A : \mathbb{R}^n \to \mathbb{R}^m$ prove that the following conditions
are equivalent:
(a) $A(\mathbb{R}^n) = \mathbb{R}^m$ (onto);
(b) $A$ is open at 0; 4
(c) $A$ is open.
1
This is a $C^1$ diffeomorphism (most important for this course); a $C^0$ diffeomorphism
is just a homeomorphism, and a $C^k$ diffeomorphism must be continuously differentiable $k$
times (both $f$ and $f^{-1}$).
2
This notion is seldom used; but see for instance Sect. 2.8 in Basic Complex Analysis
by G. De Marco.
3
In this form, we may interpret the phrase "$U$ is a neighborhood of 0" as "0 is an
interior point of $U$" or, equally well, as "$\exists \varepsilon > 0$ $U = \{x : |x| < \varepsilon\}$".
4
Hint: (b) use subspace E such that A|E is an invertible mapping from E onto A(Rn ).
2a10 Exercise. Consider the mapping $f : U \to \mathbb{R}^2$, where $U = (1, 2) \times
(-T, T) \subset \mathbb{R}^2$ (for a given $T \in (0, \infty)$) and $f(r, \theta) = (r \cos\theta, r \sin\theta)$. Denote
$V = f(U)$. For each of the following conditions (separately) find all $T$ such
that the condition is satisfied:1
(a) V is open;
(b) f is continuous;
(c) f is uniformly continuous;
(d) f is continuously differentiable;
(e) $f : U \to V$ is bijective;
(f) $f : U \to V$ is a homeomorphism;
(g) $f : U \to V$ is a homeomorphism and $f^{-1} : V \to U$ is uniformly
continuous;
(h) $f : U \to V$ is a diffeomorphism;
(i) f is a local homeomorphism near each point of U ;
(j) f is a local diffeomorphism near each point of U ;
(k) f is an open mapping.
In the linear case we may ignore the last $m - r$ (redundant) equations.
In the nonlinear case we cannot.
2a11 Example. Consider $f : \mathbb{R}^2 \to \mathbb{R}^2$, $f(x_1, x_2) = \big(x_1, x_1 + c(x_1^2 + x_2^2)\big)$ for
a given $c$. The linear approximation: $f(x_1, x_2) \approx (x_1, x_1)$; $A = \begin{pmatrix} 1 & 0 \\ 1 & 0 \end{pmatrix}$. The
equation $Ax = 0$ is satisfied by all $x = (0, x_2)$. However, the equation $f(x) =
0$ is satisfied by $x = (0, 0)$ only (unless $c = 0$). The linear approximation
cheats.
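A quick numerical look at this example, as a sketch only: the value `c = 1.0` and the sample points are arbitrary choices, and the function names are hypothetical.

```python
def f(x1, x2, c=1.0):
    """The map of Example 2a11; c = 1.0 is an arbitrary nonzero choice."""
    return (x1, x1 + c * (x1**2 + x2**2))

def linear_part(x1, x2):
    """The linear approximation A x = (x1, x1)."""
    return (x1, x1)

# Sample points of the form (0, x2), all of which lie in the kernel of A.
points = [(0.0, t / 10.0) for t in range(-5, 6)]
kernel_of_A = [p for p in points if linear_part(*p) == (0.0, 0.0)]
zeros_of_f = [p for p in points if f(*p) == (0.0, 0.0)]
```

All eleven sample points solve $Ax = 0$, while only the origin solves $f(x) = 0$.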
In fact, for every closed set $F \subset \mathbb{R}^n$ containing 0 there exists a continuously
differentiable function $f : \mathbb{R}^n \to \mathbb{R}$ such that $F = \{x : f(x) = 0\}$ and
$f(0) = 0$, $(Df)_0 = 0$. 2
In the linear approximation, $f(x) \approx 0$; $A = (0, \dots, 0)$; $Ax = 0$ for all $x$.
However, $f(x) = 0$ for $x \in F$ only.
1
Hint: (h) use arccos and arcsin (you really need both); (i), (j): generalize definitions
2a3(b), 2a4(b) to arbitrary x0 and y0 = f (x0 ).
2
Hint: cover the complement with a sequence of open balls and take the sum of an
appropriate series of functions positive inside these balls and vanishing outside.
The case $r < m$ is intractable. This is why we restrict ourselves to the
cases
$r = m < n$; $A(\mathbb{R}^n) = \mathbb{R}^m$ (onto, not one-to-one);
$r = m = n$; $A$ is invertible (bijection).
2b Main results formulated and discussed

First, the case r = m = n. Here is a theorem called the inverse function
theorem 1 or inverse mapping theorem.2
2b1 Theorem. Let $f : \mathbb{R}^n \to \mathbb{R}^n$ be continuously differentiable near 0,
$f(0) = 0$, and $(Df)_0 = A : \mathbb{R}^n \to \mathbb{R}^n$ be invertible.3 Then $f$ is a local
diffeomorphism, and $\big(D(f^{-1})\big)_0 = A^{-1}$.
2b2 Remark. The relation $\big(D(f^{-1})\big)_0 = A^{-1}$ is included for completeness.
It follows easily from the chain rule: $f^{-1} \circ f = \mathrm{id}$, therefore $\big(D(f^{-1})\big)_0 (Df)_0 =
\mathrm{id}$. (However, differentiability of $f^{-1}$ does not follow from the chain rule!)
Similarly, $\big(D(f^{-1})\big)_{f(x)} = \big((Df)_x\big)^{-1}$ for all $x$ near 0.
Second, the case r = m < n. Here is the implicit function theorem.4
2b3 Theorem. Let $f : \mathbb{R}^{n-m} \times \mathbb{R}^m \to \mathbb{R}^m$ be continuously differentiable5
near $(0, 0)$, $f(0, 0) = 0$, and $(Df)_{(0,0)} = A = ( B \;\; C )$, $B : \mathbb{R}^{n-m} \to \mathbb{R}^m$,
$C : \mathbb{R}^m \to \mathbb{R}^m$, with $C$ invertible. Then there exists $g : \mathbb{R}^{n-m} \to \mathbb{R}^m$,
continuously differentiable near 0, such that the two relations $f(x, y) = 0$
and $y = g(x)$ are equivalent for $(x, y)$ near $(0, 0)$; and $(Dg)_0 = -C^{-1}B$.
Clearly, $g(0) = 0$ (since $f(0, 0) = 0$).
2b4 Exercise. Deduce from 2b3 existence of $\varepsilon > 0$, $\delta > 0$ such that for
every $x$ satisfying $|x| < \delta$ there exists one and only one $y$ satisfying $|y| < \varepsilon$
and $f(x, y) = 0$; namely, $y = g(x)$.
Clearly, $f\big(x, g(x)\big) = 0$ for $x$ near 0.
2b5 Remark. The relation $(Dg)_0 = -C^{-1}B$ is included for completeness.
It follows easily from the chain rule: $f\big(x, g(x)\big) = 0$, that is, $f \circ \varphi = 0$ where
1
Fleming, Hubbard, Shifrin, Shurman, Zorich.
2
Lang.
3
Recall 2a5.
4
It is without question one of the most important theorems in higher mathematics
(Shifrin p. 255).
5
As a mapping Rn Rm .
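$\varphi : x \mapsto \begin{pmatrix} x \\ g(x) \end{pmatrix}$; we have $\varphi(0) = (0, 0)$ and $(D\varphi)_0 = \begin{pmatrix} \mathrm{id} \\ (Dg)_0 \end{pmatrix}$; thus,
$$0 = \big(D(f \circ \varphi)\big)_0 = (Df)_{(0,0)} (D\varphi)_0 = ( B \;\; C ) \begin{pmatrix} \mathrm{id} \\ (Dg)_0 \end{pmatrix} = B + C (Dg)_0 .$$
(However, differentiability of $g$ does not follow from the chain rule!) Similarly, for all $x$
near 0 holds $(Dg)_x = -C_x^{-1} B_x$ where $( B_x \;\; C_x ) = A_x = (Df)_{(x, g(x))}$.
In dimension $1 + 1 = 2$ we have
$$g'(x) = -\frac{(D_1 f)_{(x,y)}}{(D_2 f)_{(x,y)}} \quad \text{where } y = g(x) .$$
Less formally, $\frac{dy}{dx} = -\frac{\partial g / \partial x}{\partial g / \partial y}$ since $\frac{\partial g}{\partial x}\,dx + \frac{\partial g}{\partial y}\,dy = dg(x, y) = 0$.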
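The derivative formula in dimension $1 + 1$ can be checked numerically. This is a minimal sketch under stated assumptions: the function `f` below is a hypothetical example (not from the notes) with $f(0,0) = 0$ and $D_2 f(0,0) = 2 \ne 0$, and the implicit function is computed by Newton's method in $y$, which is merely a convenient numerical device here.

```python
def f(x, y):
    # Hypothetical example with f(0, 0) = 0 and D2 f(0, 0) = 2 != 0.
    return x + 2.0 * y + x * y + y**3

def d1f(x, y):
    return 1.0 + y                   # partial derivative in x

def d2f(x, y):
    return 2.0 + x + 3.0 * y**2      # partial derivative in y

def g(x, steps=50):
    """Solve f(x, y) = 0 for y near 0 by Newton's method in y."""
    y = 0.0
    for _ in range(steps):
        y -= f(x, y) / d2f(x, y)
    return y

x = 0.1
h = 1e-5
numeric = (g(x + h) - g(x - h)) / (2.0 * h)    # central difference for g'(x)
y = g(x)
formula = -d1f(x, y) / d2f(x, y)               # -(D1 f)/(D2 f) at (x, g(x))
```

The two values `numeric` and `formula` agree up to discretization error; at $x = 0$ the formula gives $-1/2$, that is, $-C^{-1}B$ with $B = 1$, $C = 2$.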
2b6 Exercise. Given $k \in \{1, 2, 3, \dots\}$, we define $f : \mathbb{R}^2 \to \mathbb{R}$ by $f(x, y) =
\operatorname{Im}\big((x + iy)^k\big)$ (where $i^2 = -1$ and $\operatorname{Im}(a + ib) = b$).
(a) Find all k such that f satisfies the assumptions of Theorem 2b3.
(b) Find all k such that f satisfies the conclusions of Theorem 2b3 (except
for the last equality).
2b7 Exercise. Let $f$ satisfy the assumptions of Theorem 2b3. Show that $f^2$
(pointwise square) violates the assumptions of Theorem 2b3 but still satisfies
its conclusions (except for the last equality).
It is not easy to prove these two theorems, but it is easy to derive one of
them from the other. First, 2b3 $\implies$ 2b1. The idea is simple: $x$ is implicitly a
function of $y$ according to the equation $\varphi(y, x) = f(x) - y = 0$.
Proof of the implication 2b3 $\implies$ 2b1. Given $f : \mathbb{R}^n \to \mathbb{R}^n$ as in 2b1, we define
$\varphi : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^n$ by $\varphi(y, x) = f(x) - y$ and check the conditions
of 2b3 for $2n, n, \varphi$ in place of $n, m, f$. We have $(D\varphi)_{(0,0)} = ( -\mathrm{id} \;\; (Df)_0 )$;
$C = (Df)_0$ is invertible. Theorem 2b3 gives $g : \mathbb{R}^n \to \mathbb{R}^n$, continuously
differentiable near 0, such that $f(x) - y = 0 \iff x = g(y)$ near $(0, 0)$.
We take $\varepsilon > 0$ such that both $f$ and $g$ are continuously differentiable on
$\{x : |x| < \varepsilon\}$, and $f(x) = y \iff x = g(y)$ whenever $|x| < \varepsilon$, $|y| < \varepsilon$. We
define $U = \{x : |x| < \varepsilon, |f(x)| < \varepsilon\}$ and $V = \{y : |y| < \varepsilon, |g(y)| < \varepsilon\}$. Both
$U$ and $V$ are open and contain 0.
If $x \in U$ and $y = f(x)$, then $|y| < \varepsilon$ and $x = g(y)$, therefore $y \in V$. We
see that $f(U) \subset V$ and $g \circ f = \mathrm{id}$ on $U$. Similarly, $g(V) \subset U$ and $f \circ g = \mathrm{id}$
on $V$. It means that $f|_U : U \to V$ and $g|_V : V \to U$ are mutually inverse;
thus, $f$ is a local diffeomorphism.
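Second, 2b1 $\implies$ 2b3. The idea is the diffeomorphism $\varphi : (x, y) \mapsto
\big(x, f(x, y)\big)$ and its inverse.
Proof of the implication 2b1 $\implies$ 2b3. Given $f : \mathbb{R}^{n-m} \times \mathbb{R}^m \to \mathbb{R}^m$ as in 2b3,
we define $\varphi : \mathbb{R}^{n-m} \times \mathbb{R}^m \to \mathbb{R}^{n-m} \times \mathbb{R}^m$ by $\varphi(x, y) = \big(x, f(x, y)\big)$ and check
the conditions of 2b1 for $n, \varphi$ in place of $n, f$. We have
$$(D\varphi)_{(0,0)} = \begin{pmatrix} \mathrm{id} & 0 \\ B & C \end{pmatrix} ;$$
an invertible matrix, since its determinant is $\det(\mathrm{id}) \det(C) = \det(C) \ne 0$. 1
By Theorem 2b1, $\varphi$ is a local diffeomorphism. Its inverse $\psi = \varphi^{-1}$ is also a
local diffeomorphism. Both $\varphi$ and $\psi$ do not change the first component $x$.
We define $g : \mathbb{R}^{n-m} \to \mathbb{R}^m$ by $\big(x, g(x)\big) = \psi(x, 0)$ and note that $g$ is continuously
differentiable near 0, since it is the composition of three mappings
$$x \overset{\text{linear}}{\longmapsto} \begin{pmatrix} x \\ 0 \end{pmatrix} \overset{\psi}{\longmapsto} \psi\begin{pmatrix} x \\ 0 \end{pmatrix} = \begin{pmatrix} x \\ g(x) \end{pmatrix} \overset{\text{linear}}{\longmapsto} g(x) .$$
Finally, $f(x, y) = 0 \iff \varphi(x, y) = (x, 0) \iff \psi(x, 0) = (x, y) \iff y =
g(x)$ for $(x, y)$ near $(0, 0)$.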
Having 2b1 $\iff$ 2b3, we need to prove only one of the two theorems.
Which one? Both options are in use. Some authors2 prove 2b3 by induction
in dimension, and then derive 2b1. Others3 prove 2b1 and then derive 2b3;
we do it this way, too.
2c Proof, the easy part

Given $f : \mathbb{R}^n \to \mathbb{R}^n$ as in 2b1, we may choose at will a pair of bases (as
explained in Sect. 1f and mentioned in Sect. 2a, Item "analysis"). We choose
bases $(\alpha_1, \dots, \alpha_n)$ and $(\beta_1, \dots, \beta_n)$ such that $A\alpha_1 = \beta_1, \dots, A\alpha_n = \beta_n$ (here
$A = (Df)_0$, as before), then $A$ becomes id. That is, we may (and will)
assume WLOG that $A = \mathrm{id}$. 4
2c1 Lemma. f is one-to-one near 0.
Proof. We take $\varepsilon > 0$ such that on the set $U = \{x : |x| < \varepsilon\} \subset \mathbb{R}^n$
the function $f$ is continuously differentiable and $\|Df - \mathrm{id}\| \le \frac12$. We have
1
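Alternatively, invertibility is easy to check with no determinant. The equation
$\begin{pmatrix} \mathrm{id} & 0 \\ B & C \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} u \\ v \end{pmatrix}$ for given $u, v$ becomes $x = u$, $Bx + Cy = v$, and clearly has
one and only one solution $x = u$, $y = C^{-1}(v - Bu)$.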
2
Courant, Zorich.
3
Fleming, Hubbard, Lang, Shifrin, Shurman.
4
See also: Fleming, Sect. XVIII.3, p. 515.
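$\|D(f - \mathrm{id})\| \le \frac12$ on the convex set $U$; by (1f31), $|(f(b) - b) - (f(a) -
a)| \le \frac12 |b - a|$ for all $a, b \in U$. Thus, $|(f(b) - f(a)) - (b - a)| \le \frac12 |b - a|$;
$|f(b) - f(a)| \ge |b - a| - \frac12 |b - a| = \frac12 |b - a|$; 1 $a \ne b \implies f(a) \ne f(b)$ whenever
$|a| < \varepsilon$, $|b| < \varepsilon$.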
Taking $U$ as above, we introduce $V = f(U)$ and $f^{-1} : V \to U$ (really,
this is $(f|_U)^{-1}$). The inequality $|f(b) - f(a)| \ge \frac12 |b - a|$ for $a, b \in U$ becomes
$|f^{-1}(y) - f^{-1}(z)| \le 2|y - z|$ for $y, z \in V$, which shows that $f^{-1}$ is continuous
on $V$. Also, taking $z = 0$ we get $|f^{-1}(y)| \le 2|y|$ for all $y \in V$.
2c2 Lemma. $f^{-1}(y) = y + o(y)$ for $y \in V$, $y \to 0$.
Proof. We use $\varepsilon$ and $U$ from the proof of 2c1, and generalize that argument
as follows. For every $\delta \in (0, \frac12]$ there exists $\varepsilon_\delta \in (0, \varepsilon]$ such that $\|Df -
\mathrm{id}\| \le \delta$ on the subset $U_\delta = \{x : |x| < \varepsilon_\delta\}$ of $U$, which implies (as before)
$|(f(b) - f(a)) - (b - a)| \le \delta |b - a|$ for all $a, b \in U_\delta$. We need only the special
case (for $a = 0$): $|f(b) - b| \le \delta |b|$ for $b \in U_\delta$. It is sufficient to check that
$|f^{-1}(y) - y| \le 2\delta |y|$ whenever $y \in V$, $|y| < \frac12 \varepsilon_\delta$.
Given such $y$, we consider $x = f^{-1}(y) \in U$, note that $x \in U_\delta$ (since $|x| \le
2|y| < \varepsilon_\delta$), thus $|f(x) - x| \le \delta |x|$, which gives $|y - f^{-1}(y)| \le \delta |x| \le 2\delta |y|$.
Did we prove differentiability of $f^{-1}$ at 0? Not yet. Is $f^{-1}$ defined near
0? That is, is 0 an interior point of $V$?
2c3 Theorem. Let $f : \mathbb{R}^n \to \mathbb{R}^n$ be continuously differentiable near 0,
$f(0) = 0$, and $(Df)_0 = A : \mathbb{R}^n \to \mathbb{R}^n$ be invertible. Then $f$ is open at 0.
The proof is postponed to Sect. 2d.
2c4 Lemma. Theorem 2c3 implies Theorem 2b1.
Proof. Given $f$ as in 2b1, we take $U$ as in the proof of 2c1; now 2c3 gives
$\delta > 0$ such that the set $V_1 = \{y : |y| < \delta\}$ satisfies $V_1 \subset f(U)$. The
set $U_1 = \{x \in U : f(x) \in V_1\}$ is open (since $f$ is continuous on $U$), and
$f(U_1) = V_1$. Taking into account that $f^{-1}$ is continuous on $f(U)$ we see that
$f|_{U_1}$ is a homeomorphism between open sets $U_1$ and $V_1$; thus, $f$ is a local
homeomorphism. By 2c2, $f^{-1}$ is differentiable at 0.
The same holds near every point of $U_1$ (the assumption $x_0 = 0$ was not
a loss of generality, see page 20). Thus, $f^{-1}$ is differentiable on $V_1$, and
$\big(D(f^{-1})\big)_y = \big((Df)_{f^{-1}(y)}\big)^{-1}$ (as explained in 2b2). Continuity of $D(f^{-1})$
follows by 1f18(c), and therefore $f$ is a local diffeomorphism.
1
The triangle inequality $|x + y| \le |x| + |y|$ implies $|x| \ge |x + y| - |y|$, that is, $|u + v| \ge
|u| - |v|$.
2d Proof, the hard part

What is the problem
We want to prove Theorem 2c3. As before, we may assume WLOG that
$A = \mathrm{id}$. We cannot use the theorems, but can use Lemma 2c1. We know
that $f : U \to V$ is a homeomorphism, where $U = \{x : |x| < \varepsilon\} \subset \mathbb{R}^n$ is an
open ball, and $\varepsilon$ is small enough; but we are not sure that $V$ is open. Clearly,
$0 \in V$ (since $f(0) = 0$). How to prove that 0 is an interior point of $V$?
In dimension 1 this is easy: $f(0) = 0$, $f'(0) = 1$; $f(\varepsilon) > 0$, $f(-\varepsilon) < 0$ (for
$\varepsilon$ small enough); 0 is an interior point of the interval $\big(f(-\varepsilon), f(\varepsilon)\big)$, and $V$
contains this interval.
How do we know that $V$ contains this interval? Being homeomorphic to
the interval $U = (-\varepsilon, \varepsilon)$, $V$ must be an interval. It is connected. Any hole
inside the interval would disconnect it.
In dimension 2, $V$ is homeomorphic to the disk $U = \{x : |x| < \varepsilon\} \subset \mathbb{R}^2$,
therefore, connected. So what? Could $V$ be like these?
True, V must contain paths through 0 in all directions (images of rays). So

what? This condition is also satisfied by these counterexamples.
Should we consider circles rather than rays? And what about higher dimen-
sion?
You see, n-dimensional topology is much more complicated than 1-dimen-
sional. A hole disconnects a line, but not a plane. Rather, a hole on a plane
disconnects the space of loops!
These two loops belong to different con-

nected components in the space of loops.
In R3 a hole does not disconnect the space of loops; rather, it disconnects

the space of. . . loops in the space of loops! And so on. Algebraic topology,
a long and hard way. . .
2d1 Remark. In fact, for every open $U \subset \mathbb{R}^n$, every continuous one-to-one
mapping $f : U \to \mathbb{R}^n$ is open (and therefore a homeomorphism between open
sets $U$ and $f(U)$). This is a well-known topological result, the Brouwer
invariance of domain theorem.1
2d2 Exercise. Prove invariance of domain in dimension one.2
Topology assumes only continuity of mappings, not differentiability. We
wonder, are differentiable mappings more tractable than (just) continuous
mappings?
Yes, fortunately, they are. Two analytical (rather than topological) proofs
of Theorem 2c3 are well-known. Some authors3 consider the minimizer $x_y$
of the function $x \mapsto |f(x) - y|^2$ (for a given $y$ near 0) and prove that $f(x_y)$
cannot differ from $y$, using invertibility of the operator $(Df)_{x_y}$. Others4 use
iteration (in other words, successive approximations), that is, construct a
sequence of approximate solutions x1 , x2 , . . . of the equation f (x) = y (for
a given y near 0) and prove that the sequence converges, and its limit is a
solution; we do it this way, too.
Here is the idea. First, the linear approximation $f(x) \approx x$ for $f$ leads to
the same linear approximation $f^{-1}(y) \approx y$ for $f^{-1}$ (see 2c2); thus, given $y$,
we consider $x_1 = y$ as the first approximation to the (hoped for) solution of
the equation $f(x) = y$. Alas, $y_1 = f(x_1)$ differs from $y$, and we seek a better
approximation $x_2$. The inequality $|(f(b) - f(a)) - (b - a)| \le \delta |b - a|$ (seen
in the proof of 2c2) suggests that $f(b) - f(a) \approx b - a$, and in particular,
$f(x_2) - f(x_1) \approx x_2 - x_1$. Seeking $f(x_2) \approx y$, that is, $f(x_2) - f(x_1) \approx y - y_1$,
we take $x_2 = x_1 + y - y_1$. And so on: $y_2 = f(x_2)$; $x_3 = x_2 + y - y_2$; ...
It appears that every constant $\delta \in (0, 1)$ ensures convergence $x_k \to x$,
and then $f(x) = y$. Thus, we'll use just $\delta = \frac12$ (as in the proof of 2c1).
Proof of Theorem 2c3. Given $f$ as in 2c3, we assume WLOG that $A = \mathrm{id}$
(as before) and take $U = \{x : |x| < \varepsilon\} \subset \mathbb{R}^n$ as in the proof of 2c1; we know
that $f|_U$ is a homeomorphism $U \to V = f(U)$, and
(2d3) $|(f(b) - f(a)) - (b - a)| \le \frac12 |b - a|$ for all $a, b \in U$.
First, we'll prove that 0 is an interior point of $V$ (and afterwards we'll prove
that the mapping $f$ is open at 0). To this end it is sufficient to prove that
$y \in V$ for all $y \in \mathbb{R}^n$ such that $|y| < \frac12 \varepsilon$.
1
By the way, it follows from the Brouwer invariance of domain theorem that an open
set in Rn+1 cannot be homeomorphic to any set in Rn (unless it is empty). Think, why.
2
Hint: recall 2a8.
3
Shurman; Zorich (alternative proof in Sect. 8.5.5, Exer. 4f).
4
Fleming, Hubbard, Lang, Shifrin; Courant (alternative proof in Sect. 3.3g).
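Given such $y$, we rewrite the equation $f(x) = y$ as
$$\varphi(x) = x \quad \text{(fixed point)}$$
where $\varphi : U \to \mathbb{R}^n$ is defined by
$$\varphi(x) = y + x - f(x) .$$
In order to prove that $y \in V$ we need existence of a fixed point.
By (2d3),
$$|\varphi(b) - \varphi(a)| \le \tfrac12 |b - a| \quad \text{for all } a, b \in U .$$
If $|x| < \varepsilon$, then $|\varphi(x)| \le |\varphi(x) - \varphi(0)| + |\varphi(0)| \le \frac12 |x - 0| + |y| < \frac12 \varepsilon + \frac12 \varepsilon = \varepsilon$.
We take $x_1 = y$, $x_2 = \varphi(x_1)$, $x_3 = \varphi(x_2)$ and so on; then $|x_k| < \varepsilon$, and
$$|x_2 - x_1| = |\varphi(y) - \varphi(0)| \le \tfrac12 |y| ;$$
$$|x_3 - x_2| = |\varphi(x_2) - \varphi(x_1)| \le \tfrac12 |x_2 - x_1| \le \tfrac14 |y| .$$
And so on;1 $x_k \in U$, $x_{k+1} = \varphi(x_k)$, $|x_{k+1} - x_k| \le 2^{-k} |y|$. The series $\sum_k |x_{k+1} -
x_k|$ converges, thus $x_k$ are a Cauchy sequence, therefore convergent: $x_k \to x$;
and $|x| \le 2|y| < \varepsilon$ (since $|x_k| \le 2|y|$ for all $k$), thus $x \in U$. By continuity of
$\varphi$, $\varphi(x) = \lim_k \varphi(x_k) = \lim_k x_{k+1} = \lim_k x_k = x$; a fixed point is found, and
so, 0 is an interior point of $V$.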
By 2a7(a) it remains to find, for given $y_k \to 0$, some $x_k \to 0$ such that
$f(x_k) = y_k$ for all $k$ large enough. This is immediate: we note that $y_k \in V$
(for large $k$), take $x_k = (f|_U)^{-1}(y_k)$ and use continuity of $(f|_U)^{-1}$.
All theorems are proved, since 2c3 implies 2b1 by 2c4, and 2b1 implies
2b3 as shown in Sect. 2b.
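The iteration used in the proof of 2c3 can be run directly. A minimal sketch, assuming a hypothetical map `f` on the plane with $f(0) = 0$ and $(Df)_0 = \mathrm{id}$, chosen so that $\|Df - \mathrm{id}\| < \frac12$ on the unit ball (so the radius is 1 here); the names `solve` and `norm` are not from the notes.

```python
import math

def f(x):
    # Hypothetical map with f(0) = 0 and (Df)_0 = id; on the unit ball
    # the norm of Df - id stays below 1/2.
    x1, x2 = x
    return (x1 + 0.2 * x2**2, x2 + 0.2 * x1 * x2)

def norm(v):
    return math.hypot(v[0], v[1])

def solve(y, iterations=60):
    """Iterate x_{k+1} = y + x_k - f(x_k), starting from x_1 = y."""
    xs = [y]
    for _ in range(iterations):
        x = xs[-1]
        fx = f(x)
        xs.append((y[0] + x[0] - fx[0], y[1] + x[1] - fx[1]))
    return xs

y = (0.3, -0.2)                      # |y| < 1/2, half of the radius 1
xs = solve(y)
x = xs[-1]
residual = norm((f(x)[0] - y[0], f(x)[1] - y[1]))
# steps[k-1] is |x_{k+1} - x_k|, to be compared with 2^{-k} |y|.
steps = [norm((b[0] - a[0], b[1] - a[1])) for a, b in zip(xs, xs[1:])]
```

The list `steps` lets one compare the observed increments with the bound $2^{-k}|y|$ from the proof, and `residual` measures how well the limit solves $f(x) = y$.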
2d4 Exercise. Assume that $f : \mathbb{R}^n \to \mathbb{R}$ is continuously differentiable
near the origin, and $(D_1 f)_0 \ne 0, \dots, (D_n f)_0 \ne 0$. Then the equation
$f(x_1, \dots, x_n) = 0$ locally defines $n$ functions $x_1(x_2, \dots, x_n)$, $x_2(x_1, x_3, \dots, x_n)$,
..., $x_n(x_1, \dots, x_{n-1})$. Find the product
$$\frac{\partial x_1}{\partial x_2} \cdot \frac{\partial x_2}{\partial x_3} \cdots \frac{\partial x_{n-1}}{\partial x_n} \cdot \frac{\partial x_n}{\partial x_1}$$
at the origin.2
1
More formally, we prove by induction in $k$ existence of $x_1, \dots, x_k \in U$ such that
$x_1 = y$, $x_2 = \varphi(x_1)$, ..., $x_k = \varphi(x_{k-1})$, $|x_k - x_{k-1}| \le 2^{-k} \cdot 2|y|$, and $|x_k| \le (1 - 2^{-k}) \cdot 2|y|$.
2
Hint: first, consider a linear f .
Using iteration in the proof only, we need not bother about the rate of convergence.
However, iteration is quite useful in computation. For fast convergence,
the transition from $x_k$ to $x_{k+1}$ is made via $A_k = (Df)_{x_k}$ rather than
$A = (Df)_0$.1
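The difference between the two transitions can be seen in dimension one. A minimal sketch, assuming a hypothetical $f(x) = x + x^2$ (so $f(0) = 0$, $f'(0) = 1$) and the target value $y = 0.2$; the function `count_steps` is an illustration, not a method taken from the notes.

```python
def f(x):
    return x + x * x          # hypothetical f with f(0) = 0, f'(0) = 1

def df(x):
    return 1.0 + 2.0 * x

def count_steps(y, use_current_derivative, tol=1e-12, max_steps=1000):
    """Return (x, number of steps) for solving f(x) = y, starting from x = y."""
    x = y
    for k in range(max_steps):
        r = y - f(x)
        if abs(r) < tol:
            return x, k
        # Either the derivative at the current point, or the fixed f'(0) = 1.
        slope = df(x) if use_current_derivative else 1.0
        x += r / slope
    raise RuntimeError("no convergence")

x_fixed, steps_fixed = count_steps(0.2, use_current_derivative=False)
x_newton, steps_newton = count_steps(0.2, use_current_derivative=True)
```

Both variants reach the same solution; the variant using the derivative at the current point needs noticeably fewer steps.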
On the other hand, using only A = (Df )0 we could hope for convergence
assuming just differentiability of f near 0 (rather than continuous differentia-
bility). Let us try it for f of Example 2a1. Some xn , shown here as functions
of y, are discouraging.
[Figure: three panels showing the iterates $x = x_2(y)$, $x = x_3(y)$, $x = x_5(y)$, $x = x_7(y)$ against the graph $y = f(x)$; axes $x$ and $y$, with ticks at 0.04 and 0.08.]
This is instructive. Never forget the word "continuously" in "continuously
differentiable"!2
True, the mean value theorem, and the finite increment theorem (1f31),
assume just differentiability. But this is a rare exception.
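In the linear case, according to Sect. 2a (Item "linear algebra"), not only
$A(x, y) = 0 \iff y = -C^{-1}Bx$, but also $A(x, y) = z \iff y = C^{-1}(z - Bx)$.
In the nonlinear case the situation is similar.
2d5 Theorem. Let $f : \mathbb{R}^{n-m} \times \mathbb{R}^m \to \mathbb{R}^m$ and $A, B, C$ be as in Th. 2b3.
Then there exists $g : \mathbb{R}^{n-m} \times \mathbb{R}^m \to \mathbb{R}^m$, continuously differentiable near
$(0, 0)$, such that the two relations $f(x, y) = z$ and $y = g(x, z)$ are equivalent
for $(x, y, z)$ near $(0, 0, 0)$; and $(Dg)_{(0,0)} = ( -C^{-1}B \;\; C^{-1} )$.
Proof. Similarly to the proof of the implication 2b1 $\implies$ 2b3 (in Sect. 2b) we
introduce the local diffeomorphism $\varphi$, its inverse $\psi$, define $g$ by $\big(x, g(x, z)\big) =
\psi(x, z)$, note that
$$\begin{pmatrix} x \\ z \end{pmatrix} \overset{\psi}{\longmapsto} \psi\begin{pmatrix} x \\ z \end{pmatrix} = \begin{pmatrix} x \\ g(x, z) \end{pmatrix} \overset{\text{linear}}{\longmapsto} g(x, z) ,$$
and finally, $f(x, y) = z \iff \varphi(x, y) = (x, z) \iff \psi(x, z) = (x, y) \iff y = g(x, z)$ for $(x, y, z)$
near $(0, 0, 0)$.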
1
If interested, see Hubbard, Sect. 2.7 "Newton's method" and 2.8 "Superconvergence".
2
Differentiable functions are generally monstrous! In particular, such a function can
be nowhere monotone. Did you know? Can you imagine it? See, for example, Sect. 9c of
my advanced course Measure and category.
Index
diffeomorphism, 21
homeomorphism, 20
implicit function theorem, 23
invariance of domain, 28
inverse function theorem, 23
iteration, 28
local diffeomorphism, 21
local homeomorphism, 20
open mapping, 21
rank, 19
WLOG, 20
3 Applications
3a Constrained optimization . . . . . . . . . . . . . . 32
3b Example: arithmetic, geometric, harmonic, and
more general means . . . . . . . . . . . . . . . . . 34
3c Example: Three points on a spheroid . . . . . . 38
3d Example: Singular value decomposition . . . . . 41
3e Sensitivity of optimum to parameters . . . . . . 43
3f Manifolds in Rn . . . . . . . . . . . . . . . . . . . . 44
3a Constrained optimization
One of the most brilliant and well-known achievements of differential
calculus is the collection of recipes it provides for finding the extrema
of functions. . . . Frequently a situation that is more complicated and
from the practical point of view even more interesting arises, in which
one seeks an extremum of a function under certain constraints . . . 1
Let $Z \subset \mathbb{R}^n$ be a set, $f : Z \to \mathbb{R}$ a function, and $x_0 \in Z$ a point. We say
that $x_0$ is a local maximum point of $f$ on $Z$, if $f(x) \le f(x_0)$ for all $x \in Z$
near $x_0$. (A local minimum point is defined similarly.)
In particular, if $Z = g^{-1}(\{0\}) = \{x : g(x) = 0\}$ for a given $g : \mathbb{R}^n \to \mathbb{R}^m$,
a local maximum point of $f$ on $Z$ is called a local maximum point of $f$
subject to the constraint $g(\cdot) = 0$. That is, subject to $g_1(\cdot) = \dots = g_m(\cdot) = 0$
where $\big(g_1(x), \dots, g_m(x)\big) = g(x)$. "Extremum" means either maximum or
minimum, of course.
3a1 Theorem. Assume that $x_0 \in \mathbb{R}^n$, $1 \le m \le n - 1$, functions $f, g_1, \dots, g_m :
\mathbb{R}^n \to \mathbb{R}$ are continuously differentiable near $x_0$, $g_1(x_0) = \dots = g_m(x_0) = 0$,
and the vectors $\nabla g_1(x_0), \dots, \nabla g_m(x_0)$ are linearly independent. If $x_0$ is a
local constrained extremum point of $f$ subject to $g_1(\cdot) = \dots = g_m(\cdot) = 0$,
then there exist $\lambda_1, \dots, \lambda_m \in \mathbb{R}$ such that
$$\nabla f(x_0) = \lambda_1 \nabla g_1(x_0) + \dots + \lambda_m \nabla g_m(x_0) .$$

1
Quoted from: Zorich, Sect. 8.7.3a, p. 527.
The numbers $\lambda_1, \dots, \lambda_m$ are called Lagrange multipliers.
A physicist could say: in equilibrium, the driving force is neutralized by
the constraints' reaction forces.
In practice, seeking local constrained extrema of $f$ on $Z = g^{-1}(\{0\})$ one
solves (that is, finds all solutions of) a system of $m + n$ equations
$$g_1(x) = \dots = g_m(x) = 0 , \quad (m \text{ equations})$$
$$\nabla f(x) = \lambda_1 \nabla g_1(x) + \dots + \lambda_m \nabla g_m(x) \quad (n \text{ equations})$$
for $m + n$ variables
$\lambda_1, \dots, \lambda_m$, ($m$ variables)
$x$. ($n$ variables)
For each solution $(\lambda_1, \dots, \lambda_m, x)$ one ignores $\lambda_1, \dots, \lambda_m$ and checks $f(x)$.1
In addition, one checks $f(x)$ for all points $x$ that violate the conditions of
3a1; that is, $\nabla g_1(x), \dots, \nabla g_m(x)$ are linearly dependent, or $f, g_1, \dots, g_m$ fail
to be continuously differentiable near $x$.
If the set Z is not compact, one checks all relevant limits of f .
If all that is feasible (which is not guaranteed!), one finally obtains the
infimum and supremum of f on Z.
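The recipe can be followed on a toy problem. A minimal sketch, assuming a hypothetical objective $f(x, y) = x + 2y$ and constraint $g(x, y) = x^2 + y^2 - 1 = 0$ (neither is from the notes); the Lagrange system is solved by hand in the comments, and a direct scan of the circle serves as an independent check.

```python
import math

def f(x, y):
    return x + 2.0 * y            # hypothetical objective

def g(x, y):
    return x * x + y * y - 1.0    # hypothetical constraint g = 0 (unit circle)

# grad f = (1, 2), grad g = (2x, 2y).  The system
#   g(x, y) = 0,  1 = 2*lam*x,  2 = 2*lam*y
# gives x = 1/(2*lam), y = 1/lam, lam**2 = 5/4.
candidates = []
for lam in (math.sqrt(5.0) / 2.0, -math.sqrt(5.0) / 2.0):
    x, y = 1.0 / (2.0 * lam), 1.0 / lam
    candidates.append((f(x, y), x, y, lam))

best = max(candidates)     # largest value of f among the solutions
worst = min(candidates)    # smallest value of f among the solutions

# Independent check: scan the circle directly.
scan = [f(math.cos(t), math.sin(t))
        for t in (2.0 * math.pi * k / 100000 for k in range(100000))]
```

The circle is compact and the gradient of $g$ does not vanish on it, so the two solutions of the system give the maximum and the minimum; the multipliers themselves are ignored at this stage.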
More formally: $\sup_{x \in Z} f(x) = \lim_k f(x_k) \in (-\infty, +\infty]$ for some $x_1, x_2, \dots \in
Z$. Choosing a subsequence we ensure either $x_k \to x$ for some $x \in \overline{Z}$ or
$|x_k| \to \infty$. In the case $x \in Z$ the point $x$ must solve the system above or violate the conditions of 3a1.
That is enough if $Z$ is compact. Otherwise, if $Z$ is bounded and not closed,
the case $x \in \overline{Z} \setminus Z$ must be examined. And if $Z$ is unbounded, the case
$|x_k| \to \infty$ must be examined.
In order to prove Th. 3a1 we first generalize Th. 2c3 as follows (recall
2a9).
3a2 Theorem. Let $f : \mathbb{R}^n \to \mathbb{R}^m$ be continuously differentiable near 0,
$f(0) = 0$, and $(Df)_0 = A : \mathbb{R}^n \to \mathbb{R}^m$ be onto. Then $f$ is open at 0.
Proof. We take an $m$-dimensional subspace $E \subset \mathbb{R}^n$ such that $A|_E$ is an
invertible mapping from $E$ onto $\mathbb{R}^m$ (this is possible, as explained in Sect. 2a,2
Item "linear algebra"). Then $\big(D(f|_E)\big)_0 = A|_E$ is invertible; by Th. 2b1,
$f|_E$ is a local diffeomorphism, and therefore,3 open at 0. It follows that $f$ is
open at 0.
1
Being ignored in this framework, $(\lambda_1, \dots, \lambda_m)$ are of interest in another framework,
see Sect. 3e.
2
Choosing a basis in $E$ we turn it to a copy of $\mathbb{R}^m$. Or, alternatively, $E$ may be chosen
to be spanned by some $m$ out of the $n$ standard basis vectors of $\mathbb{R}^n$.
3
Use 2a7(a), as in the proof of 2c3.
Proof of Theorem 3a1. WLOG, the extremum is maximum, $x_0 = 0$ and
$f(0) = 0$. Assume the contrary: $\nabla f(0)$ is not a linear combination
of $\nabla g_1(0), \dots, \nabla g_m(0)$. Then vectors $\nabla g_1(0), \dots, \nabla g_m(0), \nabla f(0)$ are linearly
independent. These vectors being the rows of $(D\varphi)_0$, where $\varphi(x) =
\big(g_1(x), \dots, g_m(x), f(x)\big)$, we see that $(D\varphi)_0 : \mathbb{R}^n \to \mathbb{R}^{m+1}$ is onto.1 By
Th. 3a2, $\varphi$ is open at 0.
We take a neighborhood $U \subset \mathbb{R}^n$ of 0 such that $f(x) \le f(x_0)$ for all
$x \in U \cap Z$ (where $Z = g^{-1}(\{0\})$), note that $\varphi(U)$ is a neighborhood of 0 in
$\mathbb{R}^{m+1}$, and therefore $\varphi(U)$ contains $(0, \dots, 0, \varepsilon)$ for $\varepsilon > 0$ small enough. That
is, $\varphi(x) = (0, \dots, 0, \varepsilon)$ for some $x \in U$. Then $x \in Z$ and $f(x) > f(0)$, which
is a contradiction.
Theorem 3a1, formulated in terms of gradients, involves a Euclidean metric
on $\mathbb{R}^n$. However, it is easy to reformulate it for vector spaces (with no
given metric), to be invariant under arbitrary change of basis (not just
orthonormal), as follows.
Assume that $V$ is an $n$-dimensional vector space, $x_0 \in V$, $1 \le m \le
n - 1$, functions $f, g_1, \dots, g_m : V \to \mathbb{R}$ are continuously differentiable near $x_0$,
$g_1(x_0) = \dots = g_m(x_0) = 0$, and the linear functions $(Dg_1)_{x_0}, \dots, (Dg_m)_{x_0} :
V \to \mathbb{R}$ are linearly independent. If $x_0$ is a local constrained extremum point
of $f$ subject to $g_1(\cdot) = \dots = g_m(\cdot) = 0$, then there exist $\lambda_1, \dots, \lambda_m \in \mathbb{R}$ such
that
$$(Df)_{x_0} = \lambda_1 (Dg_1)_{x_0} + \dots + \lambda_m (Dg_m)_{x_0} .$$
3b Example: arithmetic, geometric, harmonic, and more

general means
Here is an isoperimetric inequality for triangles on the plane:
$$\operatorname{area}(\Delta) \le \frac{1}{12\sqrt3} \big(\operatorname{perimeter}(\Delta)\big)^2 ,$$
and equality is attained for equilateral triangles and only for them. In other
words, among all triangles with the given perimeter, the equilateral one has
the largest area.2
1
Recall Sect. 2a, Item linear algebra.
2
Generally, $\operatorname{area}(G) \le \frac{1}{4\pi} \big(\operatorname{perimeter}(G)\big)^2$ for any $G$ on the plane, and equality is
attained for disks only. This is a famous deep fact. But I do not give an exact formulation
(nor a proof, of course).
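The proof is based on Heron's formula for the area $A$ of a triangle whose
side lengths are $x, y, z$ (and perimeter $L = x + y + z$):
$$A^2 = \frac{L}{2} \Big(\frac{L}{2} - x\Big) \Big(\frac{L}{2} - y\Big) \Big(\frac{L}{2} - z\Big) .$$
The sum of the three positive1 numbers $\frac{L}{2} - x$, $\frac{L}{2} - y$, $\frac{L}{2} - z$ is fixed (equal
to $\frac{3L}{2} - L = \frac{L}{2}$); their product is claimed to be maximal when these numbers
are equal (to $L/6$), and then $A^2 = \frac{L}{2} \big(\frac{L}{6}\big)^3 = \frac{L^4}{2^4 \cdot 3^3}$; $A = \frac{L^2}{2^2 \cdot 3\sqrt3}$.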
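The inequality can be probed numerically with Heron's formula. This is a sketch only: the random triangles, the seed, and the helper names are arbitrary illustrative choices, and such a test does not replace the proof.

```python
import math
import random

def heron_area(x, y, z):
    """Area of a triangle with side lengths x, y, z (Heron's formula)."""
    s = (x + y + z) / 2.0
    # max(..., 0.0) guards against a tiny negative product from rounding.
    return math.sqrt(max(s * (s - x) * (s - y) * (s - z), 0.0))

def isoperimetric_bound(x, y, z):
    """The bound perimeter^2 / (12 sqrt 3)."""
    return (x + y + z) ** 2 / (12.0 * math.sqrt(3.0))

random.seed(0)
triangles = []
while len(triangles) < 1000:
    x, y, z = (random.uniform(0.1, 1.0) for _ in range(3))
    if x < y + z and y < x + z and z < x + y:   # triangle inequality
        triangles.append((x, y, z))

gaps = [isoperimetric_bound(*t) - heron_area(*t) for t in triangles]
equilateral_gap = isoperimetric_bound(1.0, 1.0, 1.0) - heron_area(1.0, 1.0, 1.0)
```

All entries of `gaps` are nonnegative up to rounding, while `equilateral_gap` vanishes up to rounding, in line with the equality case.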

More generally, $\max\{x_1 \cdots x_n : x_1, \dots, x_n \ge 0, \; x_1 + \dots + x_n = c\}$ is
reached for $x_1 = \dots = x_n = c/n$ and is equal to $(c/n)^n$. Equivalently,
$\max\{(x_1 \cdots x_n)^{1/n} : x_1, \dots, x_n \ge 0, \; (x_1 + \dots + x_n)/n = c\}$ is reached for
$x_1 = \dots = x_n = c$ and is equal to $c$, which is the well-known inequality for
geometric mean and arithmetic mean,
(3b1) $(x_1 \cdots x_n)^{1/n} \le \frac1n (x_1 + \dots + x_n)$ for $n = 1, 2, \dots$ and $x_1, \dots, x_n \ge 0$.
It follows easily from concavity of the logarithm: the set $A = \{(x, y) : x \in
(0, \infty), \; y \le \ln x\}$ is convex, therefore the convex combination $\big(\frac1n (x_1 + \dots +
x_n), \frac1n (\ln x_1 + \dots + \ln x_n)\big)$ of points $(x_1, \ln x_1), \dots, (x_n, \ln x_n) \in A$ belongs to
$A$, which gives (3b1). And still, it is worth exercising Lagrange multipliers.
3b2 Exercise. Prove (3b1) via Lagrange multipliers.
By the way, the harmonic mean $h$ defined by $\frac1h = \frac1n \big(\frac{1}{x_1} + \dots + \frac{1}{x_n}\big)$ satisfies
$h \le (x_1 \cdots x_n)^{1/n}$; just apply (3b1) to $\frac{1}{x_1}, \dots, \frac{1}{x_n}$.

More generally, the Hölder mean (called also power mean) with exponent
$p \in (-\infty, 0) \cup (0, \infty)$ is
$$M_p(x_1, \dots, x_n) = \Big( \frac{x_1^p + \dots + x_n^p}{n} \Big)^{1/p} \quad \text{for } x_1, \dots, x_n > 0 .$$
In particular, $M_1$ is the arithmetic mean and $M_{-1}$ is the harmonic mean. For
$p \to 0$ L'Hôpital's rule gives
$$\ln \lim_{p \to 0} M_p(x_1, \dots, x_n) = \lim_{p \to 0} \frac1p \ln \frac{x_1^p + \dots + x_n^p}{n} = \lim_{p \to 0} \frac{x_1^p \ln x_1 + \dots + x_n^p \ln x_n}{x_1^p + \dots + x_n^p} = \frac{\ln x_1 + \dots + \ln x_n}{n} = \ln (x_1 \cdots x_n)^{1/n} ;$$
1
$\frac{L}{2} - x = \frac{x + y + z}{2} - x = \frac{y + z - x}{2} > 0$ by the triangle inequality.
accordingly, one defines
$$M_0(x_1, \dots, x_n) = (x_1 \cdots x_n)^{1/n} ,$$
and observes that $M_{-1}(x_1, \dots, x_n) \le M_0(x_1, \dots, x_n) \le M_1(x_1, \dots, x_n)$. For
$p \to +\infty$ we have
$$\frac1n \big(\max(x_1, \dots, x_n)\big)^p \le \frac{x_1^p + \dots + x_n^p}{n} \le \max(x_1^p, \dots, x_n^p) ,$$
therefore $M_p(x_1, \dots, x_n) \to \max(x_1, \dots, x_n)$; one writes
$M_{+\infty}(x_1, \dots, x_n) = \max(x_1, \dots, x_n)$; $M_{-\infty}(x_1, \dots, x_n) = \min(x_1, \dots, x_n)$
(the latter being similar to the former) and observes that $M_{-\infty}(x_1, \dots, x_n) \le
M_{-1}(x_1, \dots, x_n) \le M_0(x_1, \dots, x_n) \le M_1(x_1, \dots, x_n) \le M_{+\infty}(x_1, \dots, x_n)$.
That is interesting! Maybe $M_p \le M_q$ whenever $p \le q$?
We treat $M_p$ as a function on $(0, \infty)^n \subset \mathbb{R}^n$ and calculate its gradient
$\nabla M_p$, or rather, the direction of the vector $\nabla M_p$; indeed, we only need to
know when two vectors $\nabla M_p$, $\nabla M_q$ are linearly dependent, that is, collinear
(denote it $\parallel$). We have $\nabla M_p \parallel \nabla M_p^p \parallel \nabla(n M_p^p) \parallel (x_1^{p-1}, \dots, x_n^{p-1})$ for $p \ne
0$; however, this result holds for $p = 0$ as well, since $\nabla M_0 \parallel \nabla \ln M_0 \parallel
(x_1^{-1}, \dots, x_n^{-1})$. Thus, $\nabla M_p$, $\nabla M_q$ are collinear if and only if $\frac{x_1^{q-1}}{x_1^{p-1}} = \dots =
\frac{x_n^{q-1}}{x_n^{p-1}}$, that is, $x_1^{q-p} = \dots = x_n^{q-p}$, or just $x_1 = \dots = x_n$. In this case, evidently,
$M_p = M_q$. Does it prove that $M_p \le M_q$ always? Not yet. Functions $M_p$, $M_q$
are continuously differentiable on the open set $G = (0, \infty)^n$, and on the set
$Z_p = \{x \in G : M_p(x) = 1\}$1 the conditions of 3a1 are violated at one point
$(1, \dots, 1)$ only. This could not happen on a compact $Z_p$! Surely $Z_p$ is not
compact, and we must examine $\overline{Z}_p \setminus Z_p$ and/or $\infty$.
Case 1: $0 < p < q < \infty$. The set $Z_p$ is bounded, since $\max(x_1, \dots, x_n) \le
(x_1^p + \dots + x_n^p)^{1/p} = n^{1/p} M_p(x_1, \dots, x_n) = n^{1/p}$, but not closed.2 Functions
$M_p$, $M_q$ are continuous on $\overline{G} = [0, \infty)^n$. Maybe the (global) minimum of $M_q$
on $\overline{Z}_p = \{x \in \overline{G} : M_p(x) = 1\}$ is reached at some $x \in \overline{Z}_p \setminus Z_p$? In this case
at least one coordinate of $x$ vanishes. We use induction in $n$. For $n = 1$,
$M_p(x) = x = M_q(x)$. Having $M_p \le M_q$ in dimension $n - 1$ we get (assuming
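1
No need to consider $M_p(x) = c$, since $M_p(\lambda x) = \lambda M_p(x)$ for all $\lambda \in (0, \infty)$ and all $p$,
thus $\frac{M_q(\lambda x)}{M_p(\lambda x)}$ does not depend on $\lambda$.
2
For example, the point $(n^{1/p}, 0, \dots, 0)$ belongs to $\overline{Z}_p \setminus Z_p$.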
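$x_n = 0$)
$$\frac{M_q(x)}{M_p(x)} = \frac{\Big( \frac1n (x_1^q + \dots + x_{n-1}^q + 0^q) \Big)^{1/q}}{\Big( \frac1n (x_1^p + \dots + x_{n-1}^p + 0^p) \Big)^{1/p}} = \Big( \frac{n}{n-1} \Big)^{\frac1p - \frac1q} \frac{\Big( \frac{1}{n-1} (x_1^q + \dots + x_{n-1}^q) \Big)^{1/q}}{\Big( \frac{1}{n-1} (x_1^p + \dots + x_{n-1}^p) \Big)^{1/p}} \ge \Big( \frac{n}{n-1} \Big)^{\frac1p - \frac1q} > 1 ,$$
therefore $M_q > M_p$ on $\overline{Z}_p \setminus Z_p$.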
Case 2: $0 = p < q < \infty$. Follows from Case 1 via the limiting procedure
$p \to 0+$.
Case 3: $-\infty < p < q < 0$. Follows from Case 1 applied to $1/x_1, \dots, 1/x_n$,
since
$$1/M_p(x_1, \dots, x_n) = \Big( \frac{x_1^p + \dots + x_n^p}{n} \Big)^{-1/p} = M_{-p}(x_1^{-1}, \dots, x_n^{-1}) ;$$
$$M_p(x_1, \dots, x_n) = 1/M_{-p}(x_1^{-1}, \dots, x_n^{-1}) \le 1/M_{-q}(x_1^{-1}, \dots, x_n^{-1}) = M_q(x_1, \dots, x_n) .$$
Case 4: $-\infty < p < q = 0$. Follows from Case 3 via the limiting
procedure $q \to 0-$.
Case 5: $-\infty < p < 0 < q < \infty$. Follows from Cases 2 and 4: $M_p \le
M_0 \le M_q$.
So, $M_p \le M_q$ whenever $p \le q$.
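The monotonicity in $p$ is easy to observe numerically. A minimal sketch; the sample vector `xs` and the list of exponents are arbitrary illustrative choices, and `power_mean` is a hypothetical helper name.

```python
import math

def power_mean(xs, p):
    """Hoelder (power) mean M_p of positive numbers xs; p may be 0 or +-inf."""
    if p == 0:
        # Geometric mean, the limiting case p -> 0.
        return math.exp(sum(math.log(x) for x in xs) / len(xs))
    if p == math.inf:
        return max(xs)
    if p == -math.inf:
        return min(xs)
    return (sum(x**p for x in xs) / len(xs)) ** (1.0 / p)

xs = [1.0, 2.0, 4.0]
exponents = [-math.inf, -3.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0, 5.0, math.inf]
means = [power_mean(xs, p) for p in exponents]   # nondecreasing in p
```

For this sample the geometric mean is exactly 2, and evaluating `power_mean(xs, p)` for small nonzero `p` approaches that value, matching the definition of $M_0$ as a limit.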
Some practical advice.
The system of m + n equations proposed in Sect. 3a is only one way of
finding local constrained extrema. Not necessarily the simplest way.
No need to find $\nabla f$ when $f(\cdot) = \varphi\big(g(\cdot)\big)$; just find $\nabla g$ and note that $\nabla f$
is collinear to $\nabla g$.
In many cases there are alternatives to the Lagrange method. For example,
we could replace $\inf\{M_q(x) : M_p(x) = 1\}$ with $\inf\big\{\frac{M_q(x)}{M_p(x)} : M_1(x) = 1\big\}$,
substitute $x_n = n - (x_1 + \dots + x_{n-1})$ and optimize in $x_1, \dots, x_{n-1}$ without constraints.
Alternatively we could use convexity of the function $t \mapsto t^{q/p}$, that
is, convexity of the set $A = \{(t, u) : t \in (0, \infty), \; u \ge t^{q/p}\}$. The convex combination
$\big(\frac1n (x_1^p + \dots + x_n^p), \frac1n (x_1^q + \dots + x_n^q)\big)$ of points $(x_1^p, x_1^q), \dots, (x_n^p, x_n^q) \in A$
belongs to $A$, which gives $\big(\frac1n (x_1^p + \dots + x_n^p)\big)^{q/p} \le \frac1n (x_1^q + \dots + x_n^q)$, that is,
$M_p \le M_q$. Moreover, the same applies to the weighted mean
$$M_{p,w}(x) = (x_1^p w_1 + \dots + x_n^p w_n)^{1/p}$$
for given $w_1, \dots, w_n \ge 0$ satisfying $w_1 + \dots + w_n = 1$. In particular, $M_{1,w}(x) \le
M_{p,w}(x)$ for $p \ge 1$, that is, $x_1 w_1 + \dots + x_n w_n \le (x_1^p w_1 + \dots + x_n^p w_n)^{1/p}$.
Substituting $x_i = a_i b_i^{-q/p}$ and $w_i = b_i^q$ where $q$ is such that $\frac1p + \frac1q = 1$ we have
$\sum_i a_i b_i^{-q/p} b_i^q \le \big( \sum_i a_i^p b_i^{-q} b_i^q \big)^{1/p}$, that is, $\sum_i a_i b_i \le \big( \sum_i a_i^p \big)^{1/p}$ provided that
$\sum_i b_i^q = 1$. This leads easily to Hölder's inequality
$$\sum_i x_i y_i \le \Big( \sum_i |x_i|^p \Big)^{1/p} \Big( \sum_i |y_i|^q \Big)^{1/q}$$
for $p, q \in (1, \infty)$, $\frac1p + \frac1q = 1$, and arbitrary $x_i, y_i \in \mathbb{R}$. The right-hand side may
be rewritten as $n M_p(|x|) M_q(|y|)$, admitting $p, q \in [1, \infty]$. Note the special
cases $p = q = 2$ and $p = 1$, $q = \infty$.
However, the shown way to this inequality is rather tricky.
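A quick numerical sanity check of Hölder's inequality may help before proving it. The sketch below only evaluates both sides for sample data; the helper name `holder_sides` and the sample vectors are assumptions made for illustration.

```python
def holder_sides(xs, ys, p):
    """Return (|sum x_i y_i|, (sum |x_i|^p)^(1/p) * (sum |y_i|^q)^(1/q)) where 1/p + 1/q = 1."""
    q = p / (p - 1.0)  # conjugate exponent, requires p > 1
    lhs = abs(sum(x * y for x, y in zip(xs, ys)))
    rhs = (sum(abs(x) ** p for x in xs) ** (1.0 / p)
           * sum(abs(y) ** q for y in ys) ** (1.0 / q))
    return lhs, rhs

xs = [1.0, -2.0, 3.0]   # arbitrary sample data
ys = [4.0, 0.5, -1.0]
for p in (1.5, 2.0, 3.0):
    print(p, holder_sides(xs, ys, p))

# For p = q = 2 and proportional vectors the two sides coincide (Cauchy-Schwarz equality case).
print(holder_sides(xs, [2.0 * x for x in xs], 2.0))
```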
3b3 Exercise. Given $a_1, \dots, a_n > 0$, maximize $a_1 x_1 + \dots + a_n x_n$ on $\{x \in [0,\infty)^n : x_1^p + \dots + x_n^p = 1\}$ using the Lagrange method.1 Deduce Hölder's inequality.
Hölder's inequality persists in the case of countably many variables $x_i$ and $y_i$. If two series $\sum |x_i|^p$ and $\sum |y_i|^q$ converge (and $\frac1p + \frac1q = 1$), then the series $\sum x_i y_i$ also converges (and the inequality holds).
3b4 Exercise. Given $a, b, c, k > 0$, find the maximum of the function $f(x,y,z) = x^a y^b z^c$ where $x, y, z \in [0,\infty)$ and $x^k + y^k + z^k = 1$.
3b5 Exercise. Find the maximum of y over all points $(x,y) \in \mathbb{R}^2$ that satisfy the equation $x^2 + xy + y^2 = 27$.
3c Example: Three points on a spheroid

We consider an ellipsoid of revolution (in other words, spheroid)
$$ x^2 + y^2 + \alpha z^2 = 1 $$
for some $\alpha \in (0,1) \cup (1,\infty)$, and three points P, Q, R on this surface. We want to maximize $|PQ|^2 + |QR|^2 + |RP|^2$.
We'll see that the maximum is reached when P, Q, R are situated either
in the horizontal plane z = 0 or the vertical plane y = 0 (or another vertical
plane through the origin; they all are equivalent due to symmetry). Thus, the
three-dimensional problem boils down to a pair of two-dimensional problems
(not to be solved here).
1
Hint: induction in n is needed again.
We introduce 9 coordinates,
$$ P = (x_1, y_1, z_1) , \quad Q = (x_2, y_2, z_2) , \quad R = (x_3, y_3, z_3) $$
and 4 functions $f, g_1, g_2, g_3 : \mathbb{R}^9 \to \mathbb{R}$ of these coordinates,
$$ \begin{aligned}
f(x_1,\dots,z_3) &= (x_1-x_2)^2 + (y_1-y_2)^2 + (z_1-z_2)^2 \\
&\quad + (x_2-x_3)^2 + (y_2-y_3)^2 + (z_2-z_3)^2 \\
&\quad + (x_3-x_1)^2 + (y_3-y_1)^2 + (z_3-z_1)^2 ; \\
g_1(x_1,\dots,z_3) &= x_1^2 + y_1^2 + \alpha z_1^2 - 1 , \\
g_2(x_1,\dots,z_3) &= x_2^2 + y_2^2 + \alpha z_2^2 - 1 , \\
g_3(x_1,\dots,z_3) &= x_3^2 + y_3^2 + \alpha z_3^2 - 1 .
\end{aligned} $$
We use the approach of Sect. 3a with n = 9, m = 3. The functions $f, g_1, g_2, g_3$ are continuously differentiable on $\mathbb{R}^9$. The set $Z = Z_{g_1,g_2,g_3} \subset \mathbb{R}^9$ is compact. The gradients of $g_1, g_2, g_3$ do not vanish on Z (check it) and are linearly independent (and moreover, orthogonal).
We introduce Lagrange multipliers $\lambda_1, \lambda_2, \lambda_3$ corresponding to $g_1, g_2, g_3$ and consider a system of m + n = 12 equations for 12 unknowns. The first three equations are
$$ x_1^2 + y_1^2 + \alpha z_1^2 = 1 , \quad x_2^2 + y_2^2 + \alpha z_2^2 = 1 , \quad x_3^2 + y_3^2 + \alpha z_3^2 = 1 . $$
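Now, the partial derivatives. We have
$$ \frac{\partial f}{\partial x_1} = 2(x_1 - x_2) - 2(x_3 - x_1) = 4x_1 - 2x_2 - 2x_3 , $$
which is convenient to write as $6x_1 - 2(x_1 + x_2 + x_3)$; similarly,
$$ \begin{aligned}
\frac{\partial f}{\partial x_k} &= 6x_k - 2(x_1 + x_2 + x_3) , \\
\frac{\partial f}{\partial y_k} &= 6y_k - 2(y_1 + y_2 + y_3) , \\
\frac{\partial f}{\partial z_k} &= 6z_k - 2(z_1 + z_2 + z_3)
\end{aligned} $$
for k = 1, 2, 3. Also,
$$ \frac{\partial g_k}{\partial x_k} = 2x_k , \quad \frac{\partial g_k}{\partial y_k} = 2y_k , \quad \frac{\partial g_k}{\partial z_k} = 2\alpha z_k ; $$
other partial derivatives vanish. We get 9 more equations:
$$ \begin{aligned}
6x_k - 2(x_1 + x_2 + x_3) &= \lambda_k \cdot 2x_k , \\
6y_k - 2(y_1 + y_2 + y_3) &= \lambda_k \cdot 2y_k , \\
6z_k - 2(z_1 + z_2 + z_3) &= \lambda_k \cdot 2\alpha z_k
\end{aligned} $$
for k = 1, 2, 3. That is,
$$ \begin{aligned}
(3 - \lambda_k) x_k &= x_1 + x_2 + x_3 , \\
(3 - \lambda_k) y_k &= y_1 + y_2 + y_3 , \\
(3 - \alpha\lambda_k) z_k &= z_1 + z_2 + z_3 .
\end{aligned} $$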
We note that
$$ (x_1 + x_2 + x_3) y_k = (3 - \lambda_k) x_k y_k = (y_1 + y_2 + y_3) x_k $$
for k = 1, 2, 3.
Case 1: $x_1 + x_2 + x_3 \ne 0$ or $y_1 + y_2 + y_3 \ne 0$.
Then P, Q, R are situated on the vertical plane $\{(x,y,z) : (x_1+x_2+x_3) y = (y_1+y_2+y_3) x\}$.
Case 2: $x_1 + x_2 + x_3 = y_1 + y_2 + y_3 = 0$ and $(\lambda_1, \lambda_2, \lambda_3) \ne (3,3,3)$.
If $\lambda_1 \ne 3$ then $x_1 = y_1 = 0$; the three vectors $(x_1,y_1), (x_2,y_2), (x_3,y_3) \in \mathbb{R}^2$ (of zero sum!) are collinear; therefore P, Q, R are situated on a vertical plane (again). The same holds if $\lambda_2 \ne 3$ or $\lambda_3 \ne 3$.
Case 3: $x_1 + x_2 + x_3 = y_1 + y_2 + y_3 = 0$ and $\lambda_1 = \lambda_2 = \lambda_3 = 3$.
Then $z_1 = z_2 = z_3 = \frac{z_1 + z_2 + z_3}{3 - 3\alpha}$. Summing these three equalities gives $z_1 + z_2 + z_3 = \frac{z_1 + z_2 + z_3}{1 - \alpha}$, that is, $\alpha (z_1 + z_2 + z_3) = 0$; since $\alpha \ne 0$, therefore $z_1 = z_2 = z_3 = 0$; P, Q, R are situated on the horizontal plane $\{(x,y,z) : z = 0\}$.
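The formula for the partial derivatives of f, namely 6 times the coordinate minus twice the sum of the three corresponding coordinates, can be compared with a finite-difference gradient. This is a sketch only; the sample points and the helper names are chosen for illustration and need not lie on the spheroid, since the formula for the gradient of f does not depend on the constraints.

```python
def f(P, Q, R):
    """Sum of squared pairwise distances |PQ|^2 + |QR|^2 + |RP|^2."""
    d2 = lambda A, B: sum((a - b) ** 2 for a, b in zip(A, B))
    return d2(P, Q) + d2(Q, R) + d2(R, P)

def grad_f_formula(P, Q, R):
    """Gradient via the closed form: 6 * coordinate - 2 * (sum of that coordinate over P, Q, R)."""
    sums = [P[i] + Q[i] + R[i] for i in range(3)]
    return [[6 * A[i] - 2 * sums[i] for i in range(3)] for A in (P, Q, R)]

def grad_f_numeric(P, Q, R, h=1e-6):
    """Central finite differences in each of the 9 coordinates."""
    pts = [list(P), list(Q), list(R)]
    g = [[0.0] * 3 for _ in range(3)]
    for k in range(3):
        for i in range(3):
            old = pts[k][i]
            pts[k][i] = old + h
            f_plus = f(*pts)
            pts[k][i] = old - h
            f_minus = f(*pts)
            pts[k][i] = old
            g[k][i] = (f_plus - f_minus) / (2 * h)
    return g

P, Q, R = (0.3, -0.2, 0.5), (-0.7, 0.1, 0.4), (0.2, 0.9, -0.1)  # arbitrary sample points
exact = grad_f_formula(P, Q, R)
approx = grad_f_numeric(P, Q, R)
max_diff = max(abs(exact[k][i] - approx[k][i]) for k in range(3) for i in range(3))
print(max_diff)
```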
More practical advice.
If the Lagrange method does not solve a problem completely, it may still give useful information. Combine it with other methods as needed.
3c1 Exercise.1
Let $a, b \in \mathbb{R}^n$ be linearly independent, $|a| = 5$, $|b| = 10$. Functions $\varphi_a, \varphi_b$ on the sphere $S_1(0) = \{x : |x| = 1\} \subset \mathbb{R}^n$ are defined as follows: $\varphi_a(x)$ is the angular diameter of the sphere $S_1(a) = \{y : |y - a| = 1\}$ viewed from x; similarly, $\varphi_b(x)$ is the angular diameter of $S_1(b)$ from x.
[Figure: the point x, the sphere around a, and the angle $\varphi_a(x)$.]
Prove that every point of local extremum of the function $\varphi_a + \varphi_b$ on $S_1(0)$ is some linear combination of a, b.2
1
Exam of 26.01.14, Question 2.
2
Hint: show that $\sin \frac12 \varphi_a(x) = 1/|x - a|$; use the gradient.
3d Example: Singular value decomposition

3d1 Proposition. Every linear operator from one finite-dimensional Eu-
clidean vector space to another sends some orthonormal basis of the first
space into an orthogonal system in the second space.
This is called the Singular Value Decomposition.1 It may be reformulated
as follows.
3d2 Proposition. Every linear operator from an n-dimensional Euclidean vector space to an m-dimensional Euclidean vector space has a diagonal $m \times n$ matrix in some pair of orthonormal bases.
[Figure: the shape of a diagonal $m \times n$ matrix in the cases m < n, m = n, m > n.]
In particular, this holds for every linear operator $\mathbb{R}^n \to \mathbb{R}^n$. It does not
mean that every matrix is diagonalizable! Two bases give much more freedom
than one basis.
Do you think this is unrelated to constrained optimization? Wait a little.
Prop. 3d1 will be derived from Prop. 3d3 below.
3d3 Proposition. Every finite-dimensional vector space endowed with two
Euclidean metrics contains a basis orthonormal in the first metric and or-
thogonal in the second metric.
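Proof. Let an n-dimensional vector space V be endowed with two Euclidean metrics. This means two norms $|\cdot|$ and $|\cdot|_1$ corresponding to two inner products $\langle \cdot, \cdot \rangle$ and $\langle \cdot, \cdot \rangle_1$ by $|x|^2 = \langle x, x \rangle$ and $|x|_1^2 = \langle x, x \rangle_1$. We denote by E the Euclidean space $(V, |\cdot|)$ and define a mapping $A : E \to E$ by
$$ \forall x, y \in E \quad \langle x, y \rangle_1 = \langle Ax, y \rangle ; $$
it is well-defined, since the linear form $\langle x, \cdot \rangle_1$, as every linear form, is $\langle a, \cdot \rangle$ for some $a \in E$. It is easy to see that A is a linear operator, symmetric in the sense that
$$ \forall x, y \in E \quad \langle Ax, y \rangle = \langle x, Ay \rangle . $$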
1
See: Todd Will, Introduction to the Singular Value Decomposition,
http://websites.uwlax.edu/twill/svd/ Quote:
"The Singular Value Decomposition (SVD) is a topic rarely reached in undergraduate linear algebra courses and often skipped over in graduate courses.
Consequently relatively few mathematicians are familiar with what M.I.T. Professor Gilbert Strang calls 'absolutely a high point of linear algebra.'"
We want to maximize $|\cdot|_1^2$ on the sphere $S = \{x \in E : |x| = 1\}$. We have1
$$ \nabla |x|^2 = 2x , \qquad \nabla |x|_1^2 = 2Ax $$
by 1d1(a), or just by a very simple calculation:
$$ |x + h|^2 = |x|^2 + \langle x, h \rangle + \langle h, x \rangle + |h|^2 = |x|^2 + 2\langle x, h \rangle + o(|h|) , $$
$$ |x + h|_1^2 = |x|_1^2 + \langle x, h \rangle_1 + \langle h, x \rangle_1 + |h|_1^2 = |x|_1^2 + 2\langle Ax, h \rangle + o(|h|) . $$
These two gradients are collinear if and only if $Ax = \lambda x$; it means, x is an eigenvector of A, and $\lambda$ is the eigenvalue. Now we could use well-known results of linear algebra, but here is the analytic way.
By compactness, $|\cdot|_1^2$ reaches its maximum on S; by Theorem 3a1, a maximizer is an eigenvector. Existence of an eigenvector is thus proved. Denote it by $e_n$, and the eigenvalue by $\lambda_n$.
If $x \perp e_n$ then $Ax \perp e_n$ due to symmetry of A: $\langle Ax, e_n \rangle = \langle x, Ae_n \rangle = \langle x, \lambda_n e_n \rangle = \lambda_n \langle x, e_n \rangle = 0$. We consider a hyperplane (that is, (n − 1)-dimensional subspace)
$$ E_{n-1} = \{x \in E : x \perp e_n\} $$
and the restricted operator
$$ A_{n-1} : E_{n-1} \to E_{n-1} , \quad A_{n-1} x = Ax \ \text{ for } x \in E_{n-1} . $$
The Euclidean space $E_{n-1}$ is endowed with two Euclidean metrics $|\cdot|$ and $|\cdot|_1$ (restricted to $E_{n-1}$), and $\langle x, y \rangle_1 = \langle A_{n-1} x, y \rangle$ for $x, y \in E_{n-1}$.
Now we use induction in n. The case n = 1 is trivial. The claim for n − 1 applied to $E_{n-1}$ gives a basis $(e_1, \dots, e_{n-1})$ of $E_{n-1}$ orthonormal in $|\cdot|$ and orthogonal in $|\cdot|_1$. Thus, $(e_1, \dots, e_{n-1}, e_n)$ is a basis of E. We normalize $e_n$ to $|e_n| = 1$; now this basis is orthonormal in $|\cdot|$. It is also orthogonal in $|\cdot|_1$, since $\langle e_k, e_n \rangle_1 = \langle Ae_k, e_n \rangle = 0$ for $k = 1, \dots, n-1$.
3d4 Remark. Positivity of the quadratic form $x \mapsto |x|_1^2 = \langle x, x \rangle_1$ was not used. The same holds for arbitrary quadratic form on a Euclidean space. (In contrast, positivity of $|\cdot|^2$ was used.)
Proof of Prop. 3d1. We have two Euclidean spaces $E, E_2$ and a linear operator $T : E \to E_2$. First, assume in addition that T is one-to-one. Then T induces a second Euclidean metric on E:
$$ |x|_1 = |Tx| ; \qquad \langle x, y \rangle_1 = \langle Tx, Ty \rangle $$

1
All gradients are taken in $E = (V, |\cdot|)$, not $(V, |\cdot|_1)$!
(of course, $|Tx|$ is the norm in $E_2$). Prop. 3d3 gives an orthonormal basis $(e_1, \dots, e_n)$ of E, orthogonal in the second metric: $\langle e_k, e_l \rangle_1 = 0$ for $k \ne l$. That is, $\langle Te_k, Te_l \rangle = 0$, which shows that $(Te_1, \dots, Te_n)$ is an orthogonal system in $E_2$.
If T is not one-to-one, the same argument applies due to Remark 3d4.1
Prop. 3d2 follows immediately, and gives a diagonal matrix. Its diagonal elements can be made $\ge 0$ (changing signs of basis vectors as needed) and decreasing (renumbering basis vectors as needed); this way one gets the so-called singular values of the given operator T. They depend on T only, not on the choice of the pair of bases,2,3 and are the square roots of the eigenvalues of the operator $A = T^* T$. The highest singular value is the operator norm $\|T\|$ of T (think, why). The lowest singular value (if not 0) is $1/\|T^{-1}\|$.
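The argument can be traced by hand in the smallest nontrivial case. The sketch below takes a hypothetical 3 × 2 matrix T (an operator from R^2 to R^3, chosen only for illustration), forms the symmetric 2 × 2 matrix A with entries given by inner products of the columns of T, and maximizes the quadratic form on the unit circle. For a symmetric 2 × 2 matrix the maximizer has a closed form in the angle, which is used here instead of a general optimization routine.

```python
import math

# Hypothetical example operator T : R^2 -> R^3 (rows of a 3 x 2 matrix).
T = [[3.0, 1.0],
     [1.0, 2.0],
     [0.0, 1.0]]

def apply(M, x):
    """Matrix-vector product."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# A is defined by <x, y>_1 = <Tx, Ty> = <Ax, y>, so A[i][j] = <column i of T, column j of T>.
A = [[sum(T[r][i] * T[r][j] for r in range(len(T))) for j in range(2)] for i in range(2)]

# On the unit circle x = (cos t, sin t) the form <Ax, x> equals
# (a + d)/2 + ((a - d)/2) cos 2t + b sin 2t, which is maximal at 2t = atan2(2b, a - d).
theta = 0.5 * math.atan2(2 * A[0][1], A[0][0] - A[1][1])
e2 = [math.cos(theta), math.sin(theta)]    # maximizer of |Tx|^2 on the circle
e1 = [-math.sin(theta), math.cos(theta)]   # unit vector orthogonal to e2

lam = dot(apply(A, e2), e2)
residual = [ax - lam * x for ax, x in zip(apply(A, e2), e2)]  # A e2 - lam e2, should vanish
s_max = math.sqrt(dot(apply(T, e2), apply(T, e2)))            # largest singular value
s_min = math.sqrt(dot(apply(T, e1), apply(T, e1)))            # smallest singular value
overlap = dot(apply(T, e1), apply(T, e2))                     # <T e1, T e2>, should vanish

print(residual, overlap, s_max, s_min)
```

The maximizer is an eigenvector of A, the images of the two basis vectors are orthogonal, and the squares of the two norms are the eigenvalues of A.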
3e Sensitivity of optimum to parameters

When using a mathematical model one often bothers about sensitivity4 of
the result (the output of the model) to the assumptions (the input). Here is
one of such questions.5
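What happens if the restrictions $g_1(x) = \dots = g_m(x) = 0$ are replaced with $g_1(x) = c_1, \dots, g_m(x) = c_m$?
Assume that the system of m + n equations
$$ g_1(x) = c_1 , \ \dots , \ g_m(x) = c_m , \qquad \text{(m equations)} $$
$$ \nabla f(x) = \lambda_1 \nabla g_1(x) + \dots + \lambda_m \nabla g_m(x) \qquad \text{(n equations)} $$
for $(\lambda, x) \in \mathbb{R}^m \times \mathbb{R}^n$ has a solution $(\lambda(c), x(c))$ for all $c \in \mathbb{R}^m$ near 0, and the mapping $c \mapsto x(c)$ is differentiable at 0. Then, by the chain rule,
$$ \frac{\partial}{\partial c_k}\Big|_{c=0} f(x(c)) = \Big\langle \nabla f(x(0)), \frac{\partial}{\partial c_k}\Big|_{c=0} x(c) \Big\rangle \quad \text{for } k = 1, \dots, m . $$
On the other hand,
$$ \nabla f(x(0)) = \lambda_1(0) \nabla g_1(x(0)) + \dots + \lambda_m(0) \nabla g_m(x(0)) $$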
1
Alternatively, define $|x|_1^2 = |Tx|^2 + |x|^2$, $\langle x, y \rangle_1 = \langle Tx, Ty \rangle + \langle x, y \rangle$.
2
The only freedom in this choice (in addition to sign change and renumbering) is,
rotation within each eigenspace of dimension > 1 (if any).
3
On the space of operators, the Schatten norm is $\|T\|_p = \big( |s_1|^p + \dots + |s_n|^p \big)^{1/p}$ where $s_1, \dots, s_n$ are the singular values of T (and $1 \le p \le \infty$).
4
Closely related ideas: stability, robustness; uncertainty; elasticity, . . .
5
A more general one: g1 (x, c1 ) = 0, . . . , gm (x, cm ) = 0.
and
$$ \Big\langle \nabla g_1(x(0)), \frac{\partial}{\partial c_k}\Big|_{c=0} x(c) \Big\rangle = \frac{\partial}{\partial c_k}\Big|_{c=0} g_1(x(c)) = \begin{cases} 1, & \text{if } k = 1, \\ 0, & \text{otherwise} \end{cases} $$
(since $g_1(x(c)) = c_1$). The same holds for $g_2, \dots, g_m$. Therefore
$$ \frac{\partial}{\partial c_k}\Big|_{c=0} f(x(c)) = \lambda_k(0) . $$
It means that $\lambda_k = \lambda_k(0)$ is the sensitivity of the critical value to the level $c_k$ of the constraint $g_k(x) = c_k$. That is,
$$ f(x(c)) = f(x(0)) + \lambda_1(0) c_1 + \dots + \lambda_m(0) c_m + o(|c|) . $$
Does it mean that
$$ \sup_{Z_c} f = \sup_{Z_0} f + \lambda_1(0) c_1 + \dots + \lambda_m(0) c_m + o(|c|) \tag{3e1} $$
where $Z_c = \{x : g_1(x) = c_1, \dots, g_m(x) = c_m\}$? Not necessarily, for several reasons (possible non-compactness, non-differentiability, greater or equal value at another critical point when c = 0). But if $\sup_{Z_c} f = f(x(c))$ for all c near 0 then (3e1) holds.1
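A one-constraint toy case makes the sensitivity interpretation concrete. The example below is an assumption chosen for illustration, not taken from the notes: f(x, y) = x + y with the constraint x^2 + y^2 − 1 = c. Here the supremum and the multiplier are known in closed form, so the derivative of the supremum in c can be compared with the multiplier at c = 0.

```python
import math

def sup_f(c):
    """sup of f(x, y) = x + y on Z_c = {x^2 + y^2 - 1 = c}; attained at x = y = r / sqrt(2)."""
    r = math.sqrt(1.0 + c)
    return math.sqrt(2.0) * r

def multiplier(c):
    """lambda(c) from grad f = lambda * grad g at the maximizer: (1, 1) = lambda * (2x, 2y)."""
    x = math.sqrt(1.0 + c) / math.sqrt(2.0)
    return 1.0 / (2.0 * x)

h = 1e-5
slope = (sup_f(h) - sup_f(-h)) / (2 * h)      # numerical derivative of the supremum at c = 0
print(slope, multiplier(0.0))

# First-order prediction of (3e1) for a small shift of the constraint level.
c = 0.01
print(sup_f(c), sup_f(0.0) + multiplier(0.0) * c)
```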
3f Manifolds in Rn
Everyone knows what a curve is, until he has studied

enough mathematics. . . Felix Klein2
Image: (CC) Jonathan Johanson,

http://cliptic.wordpress.com
By a manifold (to be defined soon) we mean a differential k-dimensional submanifold of $\mathbb{R}^n$, of class $C^1$, without boundary.3 It is also called k-dimensional smooth surface in $\mathbb{R}^n$ or k-dimensional submanifold on $\mathbb{R}^n$,4 or smooth manifold in $\mathbb{R}^n$ 5 etc.
1
See also Sect. 13.2 in book: J. Cooper, Working analysis, Elsevier 2005.
2
Quoted from: Hubbard, Sect. 3.1 Manifolds.
3
"Generally, 'smooth' means 'as many times differentiable as is relevant to the problem at hand'. . . . (Some authors use 'smooth' to mean $C^\infty$: 'infinitely many times differentiable'. For our purposes this is overkill.)" Hubbard, Sect. 3.1, p. 293–294.
4
Zorich Sect. 8.7.1.
5
Hubbard Sect. 3.1.
Several equivalent definitions of a manifold are used: via equations;1 via

diffeomorphisms;2 via graphs of mappings;3 and via parametrizations (so-
called charts, to be treated in Analysis-4).
3f1 Theorem. The following conditions on a set $M \subset \mathbb{R}^n$, a point $x_0 \in M$ and a number $k \in \{1, 2, \dots, n-1\}$ are equivalent:
(a) there exists a mapping $f : \mathbb{R}^n \to \mathbb{R}^{n-k}$, continuously differentiable near $x_0$, such that $(Df)_{x_0} = A : \mathbb{R}^n \to \mathbb{R}^{n-k}$ is onto, and
$$ x \in M \iff f(x) = f(x_0) \quad \text{for all x near } x_0 ; $$
(b) there exists a local diffeomorphism $\varphi$ near $x_0$ such that
$$ x \in M \iff \varphi(x) \in \mathbb{R}^k \times \{0_{n-k}\} \quad \text{for all x near } x_0 ; $$
(c) there exists a permutation $(i_1, \dots, i_n)$ of $\{1, \dots, n\}$ and a mapping $g : \mathbb{R}^k \to \mathbb{R}^{n-k}$, continuously differentiable near $(x_{0,i_1}, \dots, x_{0,i_k})$, such that
$$ x \in M \iff g(x_{i_1}, \dots, x_{i_k}) = (x_{i_{k+1}}, \dots, x_{i_n}) \quad \text{for all x near } x_0 . $$
Proof. First, WLOG, $x_0 = 0$ (as usual).
Second, the three conditions are insensitive to permutations of the n coordinates of x.4 Indeed, in (a) we may change the order of arguments of f as needed; in (b) we may change the order of arguments of $\varphi$ as needed; and in (c) we may change the permutation $(i_1, \dots, i_n)$ as needed.
(a)$\Longrightarrow$(c): WLOG, f(0) = 0 and $A = ( B \mid C )$ with $B : \mathbb{R}^k \to \mathbb{R}^{n-k}$, $C : \mathbb{R}^{n-k} \to \mathbb{R}^{n-k}$, C invertible (using the fact that rank A = n − k). Theorem 2b3 (for n and n − k in place of n and m) gives $g : \mathbb{R}^k \to \mathbb{R}^{n-k}$ such that $g(x_1, \dots, x_k) = (x_{k+1}, \dots, x_n) \iff f(x_1, \dots, x_n) = 0 \iff x \in M$, which gives (c) for $(i_1, \dots, i_n) = (1, \dots, n)$.
(c)$\Longrightarrow$(b): WLOG, $(i_1, \dots, i_n) = (1, \dots, n)$. Similarly to the proof of "2b3$\Longrightarrow$2b1" (in Sect. 2a) we define $\varphi$ by $\varphi(u, v) = \big(u, g(u) - v\big)$ for $u \in \mathbb{R}^k$ and $v \in \mathbb{R}^{n-k}$; then, for x = (u, v), $\varphi(u,v) \in \mathbb{R}^k \times \{0_{n-k}\} \iff \varphi(u,v) = (u, 0) \iff g(u) = v \iff x \in M$.
(b)$\Longrightarrow$(a): we define $f(x) = (y_{k+1}, \dots, y_n)$ whenever $\varphi(x) = (y_1, \dots, y_n)$; then f(0) = 0 and $f(x) = 0 \iff \varphi(x) \in \mathbb{R}^k \times \{0_{n-k}\} \iff x \in M$.
3f2 Definition. A nonempty set $M \subset \mathbb{R}^n$ is a k-dimensional manifold, if the equivalent conditions 3f1(a,b,c) hold for every $x_0 \in M$.
1
Fleming; also Hubbard, Th. 3.1.10.
2
Lang, Zorich.
3
Hubbard.
4
I mean, coordinates of x, not of f(x) or $\varphi(x)$.
We may say that M is a k-manifold near x0 when 3f1(a,b,c) hold for M ,

x0 and k. Accordingly, M is a k-manifold when it is a k-manifold near every
point (of M ).
3f3 Exercise. Let $\varphi : \mathbb{R}^n \to \mathbb{R}^n$ be a diffeomorphism, and $M \subset \mathbb{R}^n$.
(a) If M is a k-manifold near $x_0$, then its image $\varphi(M)$ is a k-manifold near $\varphi(x_0)$;
(b) M is a k-manifold if and only if $\varphi(M)$ is a k-manifold.
Prove it.
This applies, in particular, to shifts, rotations, and all invertible affine

transformations of Rn .
3f4 Exercise. Let $M_1, M_2 \subset \mathbb{R}^n$ be k-dimensional manifolds, and $M = M_1 \cup M_2$.
(a) If $\overline{M_1} \cap M_2 = \emptyset$ and $M_1 \cap \overline{M_2} = \emptyset$, then M is a k-dimensional manifold.
Prove it.
(b) It can happen that $M_1 \cap M_2 = \emptyset$ but M is not a k-dimensional manifold. Give a counterexample.
3f5 Exercise. Let 0 < m < n, and $g_1, \dots, g_m \in C^1(\mathbb{R}^n \to \mathbb{R})$ be such that the vectors $\nabla g_1(x), \dots, \nabla g_m(x)$ are linearly independent for every $x \in M$ where $M = \{x : g_1(x) = \dots = g_m(x) = 0\}$. Then M is an (n − m)-dimensional manifold.
Prove it.
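3f6 Exercise. Which of the following subsets of $\mathbb{R}^2$ are 1-dimensional manifolds? Prove your answers, both affirmative and negative.
$M_1 = \mathbb{R} \times \{0\}$;
$M_2 = [0,1] \times \{0\}$;
$M_3 = (0,1) \times \{0\}$;
$M_4 = \{(0,0)\}$;
$M_5 = \mathbb{R} \times \{0, 1\}$;
$M_6 = \mathbb{R} \times \mathbb{Z}$;
$M_7 = \mathbb{R} \times \{1, \frac12, \frac13, \dots\}$;
$M_8 = M_7 \cup M_1$.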
3f7 Example. The sphere $S = \{x \in \mathbb{R}^n : |x| = 1\}$ is an (n − 1)-dimensional manifold (by 3f5 for m = 1 and $g(x) = |x|^2 - 1$).
Alternatively, we may prove that S is a manifold around just one point, say, $e_1 = (1, 0, \dots, 0)$, and then use rotation invariance: U(S) = S for every linear isometry $U : \mathbb{R}^n \to \mathbb{R}^n$, and each $x \in S$ is $U e_1$ for some U;1 use 3f3(a). Near $e_1$ the equality $x_1 = \sqrt{1 - x_2^2 - \dots - x_n^2}$ gives 3f1(c).
3f8 Example.2 Consider the set M of all $3 \times 3$ matrices A of the form
$$ A = \begin{pmatrix} a^2 & ab & ac \\ ba & b^2 & bc \\ ca & cb & c^2 \end{pmatrix} \quad \text{for } a, b, c \in \mathbb{R} , \ a^2 + b^2 + c^2 = 1 . $$
These are orthogonal projections to one-dimensional subspaces of $\mathbb{R}^3$, that is, straight lines through the origin. Note that each line contains two points of the sphere $S = \{(a,b,c) \in \mathbb{R}^3 : a^2 + b^2 + c^2 = 1\}$, which gives a 2-to-1 mapping $S \to M$. We treat M as a subset of the six-dimensional space of all symmetric $3 \times 3$ matrices.
The set M is invariant under transformations $A \mapsto U A U^{-1}$ where U runs over all orthogonal matrices (linear isometries); these are linear transformations of the six-dimensional space of matrices. If A corresponds to x = (a, b, c) then $U A U^{-1}$ corresponds to Ux. For arbitrary $A, B \in M$ there exists U such that $U A U^{-1} = B$ (transitive action).
Thus, M looks the same around all its points (homogeneous space). In order to prove that M is a 2-manifold (in $\mathbb{R}^6$) it is sufficient to prove this near a single point of M, say,
$$ A_1 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \in M , $$
that corresponds to (a, b, c) = (1, 0, 0) (but also (−1, 0, 0), of course). For $(a, b, c) \approx (1, 0, 0)$ we have in the linear approximation
$$ \begin{pmatrix} a^2 & ab & ac \\ ba & b^2 & bc \\ ca & cb & c^2 \end{pmatrix} \approx \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} + \begin{pmatrix} 0 & b & c \\ b & 0 & 0 \\ c & 0 & 0 \end{pmatrix} $$
(think, why). Thus, in the linear approximation all elements of A are functions of two of them. Returning to the nonlinear situation we want to express $a^2$, $b^2$, $c^2$ and bc in terms of ab and ac (locally, for (a, b, c) near (1, 0, 0)). We
1
Since x is the first vector of some orthogonal basis.
2
The projective plane in disguise.
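have
$$ (ab)^2 + (ac)^2 = a^2 (b^2 + c^2) = a^2 (1 - a^2) ; $$
$$ a^2 = \tfrac12 + \sqrt{\tfrac14 - (ab)^2 - (ac)^2} ; $$
$$ b^2 = \frac{(ab)^2}{\frac12 + \sqrt{\frac14 - (ab)^2 - (ac)^2}} ; \quad c^2 = \frac{(ac)^2}{\frac12 + \sqrt{\frac14 - (ab)^2 - (ac)^2}} ; \quad bc = \frac{(ab)(ac)}{\frac12 + \sqrt{\frac14 - (ab)^2 - (ac)^2}} ; $$
thus, M is a 2-manifold near $A_1$ according to 3f1(c).1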
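The local formulas expressing the remaining matrix entries through ab and ac are easy to test numerically. In the sketch below the function name and the sample point are illustrative assumptions; the point is taken on the unit sphere near (1, 0, 0), where the larger root for the square of a is the right one.

```python
import math

def entries_from_ab_ac(u, v):
    """Given u = ab and v = ac for (a, b, c) on the unit sphere near (1, 0, 0),
    return (a^2, b^2, c^2, bc) using the larger root of t (1 - t) = u^2 + v^2."""
    a2 = 0.5 + math.sqrt(0.25 - u * u - v * v)
    return a2, u * u / a2, v * v / a2, u * v / a2

b, c = 0.1, -0.2                      # small sample values
a = math.sqrt(1.0 - b * b - c * c)    # so that a^2 + b^2 + c^2 = 1 and a is near 1
recovered = entries_from_ab_ac(a * b, a * c)
expected = (a * a, b * b, c * c, b * c)
print(recovered)
print(expected)
```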

Interestingly, the part of M that corresponds to a spherical zone (symmetrical, around the equator), say $a^2 + b^2 + c^2 = 1$, $|c| < 1/2$, is homeomorphic to the Möbius strip2 (without the edge),
$$ M = \{h(s, \theta) : s \in (-1, 1), \ \theta \in [0, 2\pi]\} , $$
$$ h(s, \theta) = \begin{pmatrix} (R + rs \cos\frac{\theta}{2}) \cos\theta \\ (R + rs \cos\frac{\theta}{2}) \sin\theta \\ rs \sin\frac{\theta}{2} \end{pmatrix} , $$
for given R > r > 0. You see, a straight segment on the x, z plane rotates by $\theta/2$ (around the y axis) and at the same time it rotates (in the three dimensions) by $\theta$ around the z axis.
A point $h(s, \theta)$ of the Möbius strip corresponds to the point
$$ \Big( \sqrt{1 - \tfrac14 s^2} \, \cos\tfrac12\theta , \ \sqrt{1 - \tfrac14 s^2} \, \sin\tfrac12\theta , \ \tfrac12 s \Big) $$
on the sphere S, and the corresponding point of M. (Think, what happens for $\theta = 2\pi$.)
The rest of M is homeomorphic to a disk (not two disks), and this disk is glued to the Möbius strip in a way unthinkable in three dimensions.3
1
It is easy to check that, locally, every matrix that satisfies these equations belongs to
M.
2
Images from Wikipedia, "Möbius strip".
3
Dimension 6 can be reduced to dimension 4 by taking only $(a^2 - b^2, ab, ac, bc)$, see "Real projective plane" in Wikipedia.
Index
Hölder mean
Hölder's inequality
Lagrange multipliers
local maximum, minimum, extremum
Möbius strip
manifold
projective plane
singular value
subject to the constraint
$M_p$
4 Basics of integration
4a Introduction
4b Darboux sums
4c Integral
4d Volume
4e Normed space of equivalence classes
4f Approximation
4g Sandwich
4h Translation (shift) and scaling
4i The volume under a graph
Integral is a bridge between functions of point and functions of set.
4a Introduction
As already pointed out, many of the quantities of interest in contin-
uum mechanics represent extensive properties, such as mass, momen-
tum and energy. An extensive property assigns a value to each part of
the body. From the mathematical point of view, an extensive property
can be regarded as a set function, in the sense that it assigns a value to
each subset of a given set. Consider, for example, the case of the mass
property. Given a material body, this property assigns to each sub-
body its mass. Other examples of extensive properties are: volume,
electric charge, internal energy, linear momentum. Intensive proper-
ties, on the other hand, are represented by fields, assigning to each
point of the body a definite value. Examples of intensive properties
are: temperature, displacement, strain.
As the example of mass clearly shows, very often the extensive prop-
erties of interest are additive set functions, namely, the value assigned
to the union of two disjoint subsets is equal to the sum of the val-
ues assigned to each subset separately. Under suitable assumptions of
continuity, it can be shown that an additive set function is expressible
as the integral of a density function over the subset of interest. This
density, measured in terms of property per unit size, is an ordinary
pointwise function defined over the original set. In other words, the
density associated with a continuous additive set function is an inten-

sive property. Thus, for example, the mass density is a scalar field.
Marcelo Epstein1
We need a mathematical theory of the correspondence between set functions $\mathbb{R}^n \supset E \mapsto S(E) \in \mathbb{R}$ and (ordinary) functions $\mathbb{R}^n \ni x \mapsto f(x) \in \mathbb{R}$ via integration, $S(E) = \int_E f$. The theory should address (in particular) the following questions.
What are admissible sets E and functions f ? (Arbitrary sets are as
useless here as arbitrary functions.)
What is meant by disjoint?
What is meant by integral?
What are the general properties of the integral?
How to calculate the integral explicitly for given f and E ?
Many approaches coexist. Some authors2 start with Riemann sums (more natural for complex-valued and vector-valued integrands) and then proceed to Darboux sums. Others consider Darboux sums only; we do so, too.
Ultimately, all authors define $\int_E f$ as $\int_{\mathbb{R}^n} f_E$ where
$$ f_E(x) = \begin{cases} f(x) & \text{for } x \in E, \\ 0 & \text{otherwise.} \end{cases} \tag{4a1} $$
(Note that $f_E$ is generally discontinuous, even if f is continuous.) But initially one considers much simpler sets E. Most authors use products of intervals, called n-rectangles,3 coordinate parallelepipeds,4 compact boxes5 etc., and for these simple E define $\int_E f$ before $\int_{\mathbb{R}^n} f$. But some authors6 use dyadic cubes, called also pixels,7 for defining $\int_{\mathbb{R}^n} f$ (before $\int_E f$) for bounded f with bounded support. We follow this way, thus avoiding partitions, common refinements, and simple but nasty technicalities that some authors treat in detail8 and others leave to exercises.9 The cost is that the shift invariance (change of origin) needs a proof,10 similarly to rotation invariance (change of basis) that needs a proof in every approach.
1
The elements of continuum biomechanics, Wiley 2012. (See Sect. 2.2.1.)
2
Zorich.
3
Lang.
4
Zorich.
5
Shurman.
6
Hubbard.
7
Terry Tao.
8
For instance, Lang, p. 570 and 573.
9
For instance, Shifrin, p. 271.
10
Hubbard, Prop. 4.1.21.
In the one-dimensional theory, seeing $\int_a^b f(x)\,dx$, we do not ask, is this the integral over the open interval (a, b) or the closed interval [a, b]; we neglect the boundary {a, b} of the interval. Similarly, in higher dimension we want to neglect the boundary of E.
Two notions of "small" sets are used. One notion is called "volume zero"1 or "zero content";2 the other notion is called "measure zero".3 For compact sets these two notions coincide, but in general they are very different. Fortunately, the boundary $\partial E = \overline{E} \setminus E^\circ$ of a bounded set E is always compact; requiring it to be small (in either sense) we need not bother, whether the integral is taken over the open set $E^\circ$ or the closed set $\overline{E}$; and we may treat sets E, F as disjoint when they have no common interior points. In this case the equality
$$ S(E \cup F) = S(E) + S(F) \tag{4a2} $$
is additivity of the set function S; and the inequality
$$ \operatorname{vol}(E) \inf_{x \in E} f(x) \le S(E) \le \operatorname{vol}(E) \sup_{x \in E} f(x) \tag{4a3} $$
is the clue to the relation between f and S.
4b Darboux sums
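We consider a function $f : \mathbb{R}^n \to \mathbb{R}$ satisfying two conditions:4
(4b1) f is bounded; that is, $\sup_{x \in \mathbb{R}^n} |f(x)| < \infty$,
(4b2) f has bounded support; that is, $\sup_{x : f(x) \ne 0} |x| < \infty$.
First, recall dimension one (that is, n = 1). Assuming existence of the integral $\int_{-\infty}^{+\infty} f(x)\,dx$ and denoting it just $\int_{\mathbb{R}} f$, we may sandwich it as follows ($\mathbb{Z}$ is the set of integers, from $-\infty$ till $+\infty$):
$$ \sum_{k \in \mathbb{Z}} \inf_{x \in [k,k+1]} f(x) \le \int_{\mathbb{R}} f \le \sum_{k \in \mathbb{Z}} \sup_{x \in [k,k+1]} f(x) , $$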
1
Hubbard, Shifrin, Shurman; sometimes called negligible (Lang) which, however,
could be confused with the other notion.
2
Burkill.
3
Hubbard, Zorich.
4
If puzzled, why the bounded support, or why no continuity, look again at (4a1).
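since $\int_{\mathbb{R}} f = \sum_{k \in \mathbb{Z}} \int_k^{k+1} f(x)\,dx$ (additivity, see also (4a2)), and $\inf_{x \in [k,k+1]} f(x) \le \int_k^{k+1} f(x)\,dx \le \sup_{x \in [k,k+1]} f(x)$ (see also (4a3)) for each k. We write the integral over the whole $\mathbb{R}$ and the sum over the whole $\mathbb{Z}$, but only a bounded region contributes due to (4b2).
For a better sandwich we use a finer partition; here N = 0, 1, 2, . . . (and for N = 0 we get the case above):
$$ \frac{1}{2^N} \sum_{k \in \mathbb{Z}} \inf_{x \in [\frac{k}{2^N}, \frac{k+1}{2^N}]} f(x) \le \int_{\mathbb{R}} f \le \frac{1}{2^N} \sum_{k \in \mathbb{Z}} \sup_{x \in [\frac{k}{2^N}, \frac{k+1}{2^N}]} f(x) . $$
In dimension two (that is n = 2), paving the plane by squares, we hope to have, first,
$$ \sum_{k,\ell \in \mathbb{Z}} \inf_{\substack{x \in [k,k+1] \\ y \in [\ell,\ell+1]}} f(x,y) \le \int_{\mathbb{R}^2} f \le \sum_{k,\ell \in \mathbb{Z}} \sup_{\substack{x \in [k,k+1] \\ y \in [\ell,\ell+1]}} f(x,y) , $$
that is (using two-dimensional x and k),
$$ \sum_{k \in \mathbb{Z}^2} \inf_{x \in Q+k} f(x) \le \int_{\mathbb{R}^2} f \le \sum_{k \in \mathbb{Z}^2} \sup_{x \in Q+k} f(x) , $$
where $Q = [0,1]^2 = [0,1] \times [0,1]$ and $Q + k = \{x + k : x \in Q\}$; and more generally,
$$ \sum_{k \in \mathbb{Z}^2} \underbrace{2^{-2N} \inf_{x \in 2^{-N}(Q+k)} f(x)}_{L_{N,k}(f)} \le \int_{\mathbb{R}^2} f \le \sum_{k \in \mathbb{Z}^2} \underbrace{2^{-2N} \sup_{x \in 2^{-N}(Q+k)} f(x)}_{U_{N,k}(f)} , $$
where $2^{-N}(Q + k) = \{2^{-N}(x + k) : x \in Q\}$. In arbitrary dimension n we hope to have
$$ L_N(f) \le \int_{\mathbb{R}^n} f \le U_N(f) , $$
where $L_N(f)$ and $U_N(f)$ are the lower and upper Darboux sums defined by
$$ L_N(f) = \sum_{k \in \mathbb{Z}^n} L_{N,k}(f) , \quad L_{N,k}(f) = 2^{-nN} \inf_{x \in 2^{-N}(Q+k)} f(x) , \tag{4b3} $$
$$ U_N(f) = \sum_{k \in \mathbb{Z}^n} U_{N,k}(f) , \quad U_{N,k}(f) = 2^{-nN} \sup_{x \in 2^{-N}(Q+k)} f(x) ; \tag{4b4} $$
here $Q = [0,1]^n$.
Clearly, $L_N(f) \le U_N(f)$ and $L_N(-f) = -U_N(f)$ (think, why).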
4b5 Lemma. For every N,
$$ L_{N+1}(f) \ge L_N(f) , \qquad U_{N+1}(f) \le U_N(f) . $$
Proof. The cube $Q = [0,1]^n$ contains $2^n$ smaller cubes $\frac12 (Q + \ell)$, $\ell \in \{0,1\}^n$. Accordingly, the cube $2^{-N}(Q + k)$ contains $2^n$ smaller cubes $2^{-(N+1)}(Q + 2k + \ell)$, $\ell \in \{0,1\}^n$. Thus, $\sum_{\ell \in \{0,1\}^n} U_{N+1,2k+\ell}(f) \le U_{N,k}(f)$, whence
$$ U_{N+1}(f) = \sum_{k \in \mathbb{Z}^n} U_{N+1,k}(f) = \sum_{k \in \mathbb{Z}^n} \sum_{\ell \in \{0,1\}^n} U_{N+1,2k+\ell}(f) \le \sum_{k \in \mathbb{Z}^n} U_{N,k}(f) = U_N(f) . $$
Finally, $L_{N+1}(f) = -U_{N+1}(-f) \ge -U_N(-f) = L_N(f)$.
It follows that both sequences $\big(L_N(f)\big)_N$, $\big(U_N(f)\big)_N$ converge.
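The dyadic Darboux sums can be computed exactly in a simple one-dimensional case. The sketch below assumes the particular function f(x) = x for x in (0, 1) and f(x) = 0 otherwise, for which the infimum and supremum on each closed pixel are known explicitly; it uses exact rational arithmetic, and the function names are hypothetical.

```python
from fractions import Fraction

def inf_sup_on_pixel(k, N):
    """Exact inf and sup of f on the closed pixel [k/2^N, (k+1)/2^N],
    where f(x) = x for x in (0, 1) and f(x) = 0 otherwise."""
    lo, hi = Fraction(k, 2 ** N), Fraction(k + 1, 2 ** N)
    if hi <= 0 or lo >= 1:
        return Fraction(0), Fraction(0)   # f vanishes on the whole pixel
    # Otherwise the pixel lies inside [0, 1]; f is 0 at the endpoints 0 and 1 of [0, 1].
    inf = Fraction(0) if (lo == 0 or hi == 1) else lo
    sup = hi                              # attained at hi, or approached when hi = 1
    return inf, sup

def darboux_sums(N):
    """Lower and upper Darboux sums L_N(f), U_N(f) for the function above."""
    size = Fraction(1, 2 ** N)            # length of an N-pixel
    lower = upper = Fraction(0)
    for k in range(-1, 2 ** N + 1):       # the only pixels that can contribute
        inf, sup = inf_sup_on_pixel(k, N)
        lower += size * inf
        upper += size * sup
    return lower, upper

for N in range(6):
    print(N, *darboux_sums(N))
```

In this example the lower sums increase and the upper sums decrease with N, as Lemma 4b5 predicts.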
4c Integral
4c1 Definition. Lower and upper integrals of f are
$$ L(f) = \lim_{N \to \infty} L_N(f) , \qquad U(f) = \lim_{N \to \infty} U_N(f) . $$
Clearly, $-\infty < L(f) \le U(f) < \infty$.
4c2 Definition. A bounded function $f : \mathbb{R}^n \to \mathbb{R}$ with bounded support is called integrable, if L(f) = U(f). In this case their common value is the integral of f.
The integral is often denoted by1
$$ \int_{\mathbb{R}^n} f = \int_{\mathbb{R}^n} f(x)\,dx = \int_{\mathbb{R}^n} f(x_1, \dots, x_n)\,dx_1 \dots dx_n = \int \!\cdots\! \int_{\mathbb{R}^n} f(x_1, \dots, x_n)\,dx_1 \dots dx_n , $$
and sometimes by2 $\int_{\mathbb{R}^n} f\,dV = \int_{\mathbb{R}^n} f(x)\,dV_x$, or3 $\int_{\mathbb{R}^n} f\,|d^n x| = \int_{\mathbb{R}^n} f(x)\,|d^n x|$, or4 $I_{\mathbb{R}^n}(f)$.
1
Burkill, Lang, Shurman, Zorich.
2
Shifrin.
3
Hubbard.
4
Lang.
4c3 Exercise. Let
f(x) = 1, g(x) = 0 for all rational $x \in (0,1)$,
f(x) = 0, g(x) = 1 for all irrational $x \in (0,1)$,
f(x) = 0, g(x) = 0 for all $x \in \mathbb{R} \setminus (0,1)$.
Prove that
$$ L(af + bg) = \min(a, b) , \qquad U(af + bg) = \max(a, b) $$
for all $a, b \in \mathbb{R}$.
4c4 Exercise. Find $\int_0^1 x\,dx$ using only 4c1, 4c2. That is, $\int_{\mathbb{R}} f$ where f(x) = x for $x \in (0,1)$, otherwise f(x) = 0.1
4c5 Exercise. Let $f : \mathbb{R} \to [0,1)$ be defined via binary digits, by
$$ f(x) = \sum_{k=1}^\infty \frac{\beta_{2k}(x)}{2^k} \quad \text{for } x = \sum_{k=1}^\infty \frac{\beta_k(x)}{2^k} , \quad \beta_k(x) \in \{0,1\} , \quad \liminf_k \beta_k(x) = 0 , $$
and f(x) = 0 for $x \in \mathbb{R} \setminus (0,1)$. Prove that f is integrable, and find $\int_{\mathbb{R}} f$.2
A wonder: this integrable function has no intervals of continuity!
4c6 Proposition (linearity). All integrable functions $\mathbb{R}^n \to \mathbb{R}$ are a vector space, and the integral is a linear functional3 on this space.
That is, if $f, g : \mathbb{R}^n \to \mathbb{R}$ are integrable and $a, b \in \mathbb{R}$, then af + bg is integrable and $\int_{\mathbb{R}^n} (af + bg) = a \int_{\mathbb{R}^n} f + b \int_{\mathbb{R}^n} g$.
1
Hint: calculate LN (f ) and UN (f ).
2
Hint: calculate L2N (f ) and U2N (f ).
3
Functions on infinite-dimensional spaces are often called functionals.
Proof. For $a \ge 0$ we have $L_N(af) = a L_N(f)$ and $U_N(af) = a U_N(f)$ (think, why), hence L(af) = aL(f) and U(af) = aU(f). Thus, L(f) = U(f) implies L(af) = U(af), and in this case $\int (af) = a \int f$.
For $a \le 0$ we have $L_N(af) = a U_N(f)$ and $U_N(af) = a L_N(f)$ (think, why), hence L(af) = aU(f) and U(af) = aL(f). Still, L(f) = U(f) implies L(af) = U(af), and in this case $\int (af) = a \int f$.
It remains to consider the sum f + g. We have $L_N(f+g) \ge L_N(f) + L_N(g)$ and $U_N(f+g) \le U_N(f) + U_N(g)$ (think, why), hence $L(f+g) \ge L(f) + L(g)$ and $U(f+g) \le U(f) + U(g)$. Thus, L(f) = U(f) and L(g) = U(g) imply L(f + g) = U(f + g), and in this case $\int (f+g) = \int f + \int g$.
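4c7 Remark. Denoting the lower and upper integral by $\underline{\int}_{\mathbb{R}^n} f$ and $\overline{\int}_{\mathbb{R}^n} f$ we note some properties.
Monotonicity:
$$ \text{if } f(\cdot) \le g(\cdot) \text{ then } \underline{\int} f \le \underline{\int} g , \quad \overline{\int} f \le \overline{\int} g , $$
$$ \text{and for integrable } f, g , \quad \int f \le \int g . $$
(It can happen that $\overline{\int} f > \underline{\int} g$; find an example.)
Homogeneity:
$$ \underline{\int} cf = c \underline{\int} f , \quad \overline{\int} cf = c \overline{\int} f \quad \text{for } c \ge 0 ; $$
$$ \underline{\int} cf = c \overline{\int} f , \quad \overline{\int} cf = c \underline{\int} f \quad \text{for } c \le 0 ; $$
$$ \text{if f is integrable then cf is, and } \int cf = c \int f \text{ for all } c \in \mathbb{R} . $$
(Sub-, super-) additivity:
$$ \underline{\int} (f + g) \ge \underline{\int} f + \underline{\int} g ; $$
$$ \overline{\int} (f + g) \le \overline{\int} f + \overline{\int} g ; $$
$$ \text{if f, g are integrable then f + g is, and } \int (f + g) = \int f + \int g . $$
(It can happen that $\overline{\int} (f + g) < \overline{\int} f + \overline{\int} g$; find an example.)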
4d Volume
Given a set $E \subset \mathbb{R}^n$, its indicator (or characteristic) function, denoted $\mathbf{1}_E$ or $\chi_E$, is defined by
$$ \mathbf{1}_E(x) = \begin{cases} 1 & \text{for } x \in E, \\ 0 & \text{for } x \in \mathbb{R}^n \setminus E. \end{cases} $$
The integral of the indicator function (if exists) is called1 the volume, or2 n-dimensional volume, or3 content, or4 Jordan measure, and denoted v(E), $\operatorname{vol}_n(E)$, c(E). It exists if and only if $\mathbf{1}_E$ is integrable. In this case one says5 that E is admissible, or6 pavable, or7 has content.
4d1 Definition. (a) A bounded set $E \subset \mathbb{R}^n$ is admissible, if $\mathbf{1}_E$ is integrable.
(b) The volume $v(E) = \operatorname{vol}(E) = \operatorname{vol}_n(E)$ of an admissible set E is $\int_{\mathbb{R}^n} \mathbf{1}_E$.
(c) For arbitrary bounded E, $\overline{\int}_{\mathbb{R}^n} \mathbf{1}_E = v^*(E)$ is the outer volume of E, and $\underline{\int}_{\mathbb{R}^n} \mathbf{1}_E = v_*(E)$ is the inner volume of E.8
Note that $v^*(E) = \lim_N U_N(\mathbf{1}_E)$, and $U_N(\mathbf{1}_E)$ is the total volume of all N-pixels that intersect E. Also, $v_*(E) = \lim_N L_N(\mathbf{1}_E)$, and $L_N(\mathbf{1}_E)$ is the total volume of all N-pixels contained in E. And finally, E is admissible if and only if $v_*(E) = v^*(E)$; and in this case $v_*(E) = v(E) = v^*(E)$, of course.
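The description of the lower and upper sums of an indicator as pixel counts can be tried on a concrete set. The sketch below assumes E is the closed unit disk in the plane (an example chosen for illustration) and counts the N-pixels contained in E and the N-pixels intersecting E, using exact rational arithmetic for the tests; the function name is hypothetical.

```python
from fractions import Fraction
import math

def pixel_sums_for_disk(N):
    """Return (L_N, U_N) for the indicator of the closed unit disk in R^2:
    total area of N-pixels contained in the disk, and of N-pixels intersecting it."""
    side = Fraction(1, 2 ** N)
    inside = touching = 0
    for k in range(-2 ** N - 1, 2 ** N + 1):
        for l in range(-2 ** N - 1, 2 ** N + 1):
            x0, x1 = k * side, (k + 1) * side
            y0, y1 = l * side, (l + 1) * side
            # Point of the pixel nearest to the origin (clamp 0 into each interval).
            near_x = min(max(x0, 0), x1)
            near_y = min(max(y0, 0), y1)
            # Corner of the pixel farthest from the origin.
            far_x = max(abs(x0), abs(x1))
            far_y = max(abs(y0), abs(y1))
            if near_x ** 2 + near_y ** 2 <= 1:
                touching += 1     # pixel intersects the disk
            if far_x ** 2 + far_y ** 2 <= 1:
                inside += 1       # pixel is contained in the disk
    return inside * side ** 2, touching * side ** 2

for N in range(5):
    lower, upper = pixel_sums_for_disk(N)
    print(N, float(lower), float(upper), math.pi)
```

Both sequences squeeze the area of the disk from below and from above, which is the content of admissibility for this set.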
Later we'll see that a bounded E is admissible if and only if $v(\partial E) = 0$, but for now we do not need it. If $v^*(E) = 0$, then necessarily E (is admissible and) has volume zero. By monotonicity (recall 4c7), if E has volume zero, then every subset of E has volume zero. If E has volume zero, then $E^\circ = \emptyset$ (think, why); the converse does not hold (think, why).9
4d2 Exercise. The cube $[0,1]^n$ is admissible, and $v\big([0,1]^n\big) = 1$.
Prove it.10
Similarly, all dyadic cubes (pixels) are admissible, and $v(Q) = 2^{-nN}$ for every N-pixel Q.
1
Lang, Shurman.
2
Hubbard.
3
Burkill, Zorich.
4
Zorich.
5
Lang, Zorich.
6
Hubbard.
7
Burkill.
8
Or, inner and outer Jordan content, according to Burkill, Sect. 6.8, p. 182.
9
Moreover, a closed subset of [0, 1] with empty interior need not have volume zero ("fat Cantor set").
10
Hint: $L_N(\mathbf{1}_{[0,1]^n}) = 1$ and $U_N(\mathbf{1}_{[0,1]^n}) = 2^{-nN} (2^N + 2)^n$.
4d3 Lemma (additivity of volume). Let $E, F \subset \mathbb{R}^n$ be admissible, and $E \cap F$ have volume zero. Then $E \cup F$ is admissible, and $v(E \cup F) = v(E) + v(F)$.
Proof. We have $\mathbf{1}_{E \cup F} = \mathbf{1}_E + \mathbf{1}_F - \mathbf{1}_{E \cap F}$ (think, why). Also, $\mathbf{1}_{E \cap F}$ (is integrable and) has integral zero; by linearity (recall 4c6), $\mathbf{1}_{E \cup F}$ is integrable, and $\int \mathbf{1}_{E \cup F} = \int \mathbf{1}_E + \int \mathbf{1}_F$.
A box1 in $\mathbb{R}^n$ is the (Cartesian) product of intervals,
$$ B = [a_1, b_1] \times \dots \times [a_n, b_n] . $$
4d4 Exercise. Every box B is admissible, its interior $B^\circ$ is also admissible, and
$$ v(B) = (b_1 - a_1) \dots (b_n - a_n) = v(B^\circ) . $$
Prove it.2
Thus, every bounded "pixelated" set, that is, finite union of pixels, is admissible, and we know its volume.
4d5 Definition. Let $E \subset \mathbb{R}^n$ be an admissible set. A bounded function $f : E \to \mathbb{R}$ is integrable on E, if the corresponding function $f_E : \mathbb{R}^n \to \mathbb{R}$ (see (4a1)) is integrable (on $\mathbb{R}^n$). In this case, $\int_E f = \int_{\mathbb{R}^n} f_E$.
It is usual and convenient to write $f \mathbf{1}_E$ instead of $f_E$; accordingly,
$$ \int_E f = \int_{\mathbb{R}^n} f \mathbf{1}_E . $$
The same applies when f is defined on the whole $\mathbb{R}^n$, or on a set that contains E. Note that
$$ \int_E 1 = v(E) ; \qquad \int_E c = c\,v(E) \quad \text{for } c \in \mathbb{R} ; \tag{4d6} $$
$$ v(E) \inf_{x \in E} f(x) \le \int_E f \le v(E) \sup_{x \in E} f(x) ; \tag{4d7} $$
$$ v(E) = 0 \implies \int_E f = 0 . \tag{4d8} $$
Assuming $v(E) \ne 0$ one defines the mean value of f on E as
$$ \frac{1}{v(E)} \int_E f . $$
1
See Sect. 4a for other names. Some authors allow the degenerate case v(B) = 0 (Lang, Shurman); others disallow it explicitly (Burkill) or implicitly (Shifrin, Zorich), or do not bother (Hubbard). For now we need not bother, too. But in Sect. 4g we'll allow degeneration.
2
Hint: $U_N(\mathbf{1}_B) \le (b_1 - a_1 + 2 \cdot 2^{-N}) \dots (b_n - a_n + 2 \cdot 2^{-N})$ and $L_N(\mathbf{1}_{B^\circ}) \ge (b_1 - a_1 - 2 \cdot 2^{-N}) \dots (b_n - a_n - 2 \cdot 2^{-N})$.
4e Normed space of equivalence classes

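All bounded functions $\mathbb{R}^n \to \mathbb{R}$ with bounded support1 are a vector space.
On this space, the functional
$$ f \mapsto {}^*\!\!\int_{\mathbb{R}^n} |f| $$
(the upper integral) is a seminorm; that is, satisfies the first two conditions (recall 1f13),
$$ {}^*\!\!\int_{\mathbb{R}^n} |cf| = |c| \; {}^*\!\!\int_{\mathbb{R}^n} |f| , $$
$$ {}^*\!\!\int_{\mathbb{R}^n} |f+g| \le {}^*\!\!\int_{\mathbb{R}^n} |f| + {}^*\!\!\int_{\mathbb{R}^n} |g| $$
(think, why), but violates the third condition,
$$ {}^*\!\!\int_{\mathbb{R}^n} |f| > 0 \text{ whenever } f \ne 0 . \quad \text{(Wrong!)} $$
Functions $f$ such that ${}^*\!\!\int_{\mathbb{R}^n} |f| = 0$ will be called negligible. Functions $f, g$
such that $f - g$ is negligible will be called equivalent. For example, for each
pixel $Q$ the functions $\mathbf{1}_Q$ and $\mathbf{1}_{Q^\circ}$ are equivalent.2 The equivalence class of $f$ will
be denoted $[f]$.3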
4e1 Exercise. (a) Negligible functions are an infinite-dimensional vector
space.
(b) Equivalence classes are an infinite-dimensional vector space;4 the functional
$$ [f] \mapsto {}^*\!\!\int_{\mathbb{R}^n} |f| $$
is well-defined5 on this vector space, and is a norm.6
Prove it.
Thus, equivalence classes are a normed space, therefore also a metric
space:
$$ \rho([f],[g]) = \| [f] - [g] \| = {}^*\!\!\int_{\mathbb{R}^n} |f-g| ; $$
1
Each function separately.
2
Indeed, the equality $\int \mathbf{1}_{Q^\circ} = \int \mathbf{1}_Q$ follows easily from 4d4.
3
Zorich, Sect. 11.3.1.
4
The linear operations are c[f ] = [cf ] and [f ] + [g] = [f + g], of course.
5
That is, insensitive to the choice of a function within the given equivalence class.
6
In fact, every seminorm on a vector space leads to a normed space of equivalence
classes.
this metric will be called the integral metric, and the corresponding convergence
the integral convergence.
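4e2 Exercise. Functionals
$$ [f] \mapsto {}_*\!\!\int_{\mathbb{R}^n} f , \qquad [f] \mapsto {}^*\!\!\int_{\mathbb{R}^n} f $$
(the lower and the upper integral) on the normed space of equivalence classes are well-defined and continuous;
moreover,
$$ \Big| {}_*\!\!\int_{\mathbb{R}^n} f - {}_*\!\!\int_{\mathbb{R}^n} g \Big| \le \|f-g\| , \qquad \Big| {}^*\!\!\int_{\mathbb{R}^n} f - {}^*\!\!\int_{\mathbb{R}^n} g \Big| \le \|f-g\| . $$
Prove it.1
Here and henceforth we often write $\|f\|$ instead of $\| [f] \|$.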
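4e3 Remark. A function equivalent to an integrable function is integrable.
Proof: if $[f] = [g]$ then ${}_*\!\!\int_{\mathbb{R}^n} f = {}_*\!\!\int_{\mathbb{R}^n} g$ and ${}^*\!\!\int_{\mathbb{R}^n} f = {}^*\!\!\int_{\mathbb{R}^n} g$ by 4e2, thus
${}_*\!\!\int_{\mathbb{R}^n} f = {}^*\!\!\int_{\mathbb{R}^n} f$ implies ${}_*\!\!\int_{\mathbb{R}^n} g = {}^*\!\!\int_{\mathbb{R}^n} g$.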
4e4 Exercise. If bounded functions $f, g : \mathbb{R}^n \to \mathbb{R}$ with bounded support
differ only on a set of volume zero then they are equivalent.
Prove it.2,3,4
We may safely ignore values of integrands on sets of volume zero
(as long as the integrands are bounded). Likewise we may ignore sets of
volume zero when dealing with volume.
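4e5 Remark. If $f_1, f_2, \dots$ are integrable and $\|f_k - f\| \to 0$, then $f$ is
integrable. In other words:
The set of all (equivalence classes of) integrable functions is closed
(in the integral metric).
Proof: ${}_*\!\!\int_{\mathbb{R}^n} f_k \to {}_*\!\!\int_{\mathbb{R}^n} f$ and ${}^*\!\!\int_{\mathbb{R}^n} f_k \to {}^*\!\!\int_{\mathbb{R}^n} f$ by 4e2, thus ${}_*\!\!\int_{\mathbb{R}^n} f_k = {}^*\!\!\int_{\mathbb{R}^n} f_k$
for all $k$ implies ${}_*\!\!\int_{\mathbb{R}^n} f = {}^*\!\!\int_{\mathbb{R}^n} f$.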
1
Hint: ${}_*\!\!\int_{\mathbb{R}^n} f = -\,{}^*\!\!\int_{\mathbb{R}^n} (-f)$.
2
Hint: $|f - g| \le \mathrm{const} \cdot \mathbf{1}_E$.
3
"Sets of volume zero are small enough that they don't interfere with integration"
(Shurman, p.272).
4
The converse does not hold; see 4f12.
Any admissible set $E \subset \mathbb{R}^n$ may be used instead of the whole $\mathbb{R}^n$. Equivalence
classes of bounded functions $E \to \mathbb{R}$ are a normed space (infinite-dimensional
if $v(E) \ne 0$, but 0-dimensional if $v(E) = 0$).
4e6 Exercise. (a) Uniform convergence of bounded functions $E \to \mathbb{R}$ implies
integral convergence; prove it;
(b) the converse is generally wrong; find a counterexample.
4e7 Remark. Pointwise convergence (on $E$) does not imply integral convergence,
even if the functions are uniformly bounded.1 Here is a counterexample.
We take a sequence $(x_k)_k$ of pairwise different points $x_k \in (0,1)$ that
is dense in $(0,1)$ and consider dense countable sets $A_k = \{x_{k+1}, x_{k+2}, \dots\}$.
Clearly, $A_1 \supset A_2 \supset \dots$ and $\bigcap_k A_k = \emptyset$. Indicator functions $f_k = \mathbf{1}_{A_k}$ converge
to 0 pointwise (and monotonically). Nevertheless, ${}^*\!\!\int_{(0,1)} f_k = 1$ for all
$k$.
4e8 Remark. Integral convergence (on $E$) does not imply pointwise convergence,
even if the functions are continuous. Not even in most of the
points. Here is a counterexample on $E = [0,1] \subset \mathbb{R}$:
[Figure: graphs of $f_1$; $f_2, f_3$; $f_4, \dots, f_7$; $f_8, f_9, f_{10}, \dots$ (and so on).]
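The figure is not reproduced here. The following is a minimal sketch of one sequence compatible with its labelling, under the assumption (not taken from the notes) that for $k = 2^m + j$ the function $f_k$ is a tent of height 1 over the dyadic interval $[j 2^{-m}, (j+1) 2^{-m}]$. The names `level_and_offset`, `f` and `integral_norm` are hypothetical. Exact rational arithmetic shows that the integral norms tend to 0 while the values at $x = 1/3$ keep returning to $2/3$.

```python
from fractions import Fraction

def level_and_offset(k):
    """Write k = 2**m + j with 0 <= j < 2**m (assumed indexing of the figure)."""
    m = k.bit_length() - 1
    return m, k - (1 << m)

def f(k, x):
    """Tent of height 1 over [j/2**m, (j+1)/2**m], zero elsewhere (hypothetical choice)."""
    m, j = level_and_offset(k)
    width = Fraction(1, 1 << m)
    left = j * width
    if x < left or x > left + width:
        return Fraction(0)
    s = (x - left) / width          # position inside the interval, in [0, 1]
    return 1 - abs(2 * s - 1)       # 0 at the endpoints, 1 at the midpoint

def integral_norm(k):
    """||f_k|| = area of the tent = (base * height) / 2 = 2**(-m-1)."""
    m, _ = level_and_offset(k)
    return Fraction(1, 2 << m)

x = Fraction(1, 3)
values = [f(k, x) for k in range(2, 64)]              # levels m = 1, ..., 5
norms = [integral_norm(k) for k in (1, 2, 4, 8, 16, 32)]
print(norms)                 # halves at every level, so ||f_k|| -> 0
print(sorted(set(values)))   # only two values occur at x = 1/3
```

On every level exactly one tent covers $1/3$, so the sequence $f_k(1/3)$ contains both values infinitely often and does not converge.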
4f Approximation
It is usual and convenient to treat functions as equivalence classes when dealing
with integrals of discontinuous functions.
A box $B$ leads to the equivalence class $[\mathbf{1}_B] = [\mathbf{1}_{B^\circ}]$. Linear combinations2
of these are called step functions. Dealing with a step function we ignore its
values at discontinuity points (but still assume that the function is bounded).
All step functions are integrable.
1
It does, if the functions are integrable! But this fact is far beyond the basics of integration.
2
Finite, of course.
4f1 Exercise. (a) Every continuous $f : \mathbb{R}^n \to \mathbb{R}$ with bounded support is
integrable;
(b) every continuous function on a box is integrable on this box.
Prove it.1
4f2 Exercise. Let $f : (0,1)^n \to \mathbb{R}$ be continuous (on the open cube!) and
bounded. Then $f$ is integrable (on this open cube).
Prove it.2
For example, the function $f(x) = \sin \cot x$ on $(0,1)$ is integrable.
4f3 Proposition. Step functions are dense among integrable functions.
That is, for every integrable $f : \mathbb{R}^n \to \mathbb{R}$ and every $\varepsilon > 0$ there exists a step
function $g$ such that $\|f - g\| \le \varepsilon$.
Proof. We take $N$ such that $U_N(f) - L_N(f) \le \varepsilon$ and introduce step functions
$g, h$ by
$$ g(a) = \inf_{x\in Q} f(x) , \quad h(a) = \sup_{x\in Q} f(x) \quad \text{for } a \in Q^\circ $$
where $Q$ runs over all $N$-pixels. We have $\int_{\mathbb{R}^n} g = L_N(f)$, $\int_{\mathbb{R}^n} h = U_N(f)$
(think, why), and $g \le f \le h$ everywhere (except maybe a set of volume
zero). Thus,
$$ \|f - g\| = {}^*\!\!\int_{\mathbb{R}^n} |f - g| \le \int_{\mathbb{R}^n} (h - g) = U_N(f) - L_N(f) \le \varepsilon . $$
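As an illustration of the proof, here is a minimal sketch in dimension one for the tent function $x \mapsto \max(0, 1 - |x|)$, which is an illustrative choice and not an example from the notes. The helper names `tent` and `darboux_sums` are hypothetical. Since the tent is monotone on every dyadic interval, the infimum and supremum over a pixel are attained at its endpoints, so the sums can be computed exactly.

```python
from fractions import Fraction

def tent(x):
    """Continuous test function with support [-1, 1] (an illustrative choice)."""
    return max(Fraction(0), 1 - abs(x))

def darboux_sums(N):
    """Return (L_N, U_N) for the tent function, summing over N-pixels in [-1, 1].

    The tent is monotone on every dyadic interval, so its infimum and supremum
    over a pixel are attained at the two endpoints; pixels outside [-1, 1]
    contribute zero and are skipped.
    """
    width = Fraction(1, 2 ** N)
    lower = upper = Fraction(0)
    for k in range(-2 ** N, 2 ** N):
        a, b = k * width, (k + 1) * width
        lo, hi = sorted((tent(a), tent(b)))
        lower += width * lo   # integral over this pixel of the lower step function g
        upper += width * hi   # integral over this pixel of the upper step function h
    return lower, upper

for N in range(5):
    L, U = darboux_sums(N)
    print(N, L, U, U - L)     # the gap U_N - L_N bounds ||f - g||
```

The gap $U_N - L_N$ halves with each refinement, so any prescribed $\varepsilon$ is reached for $N$ large enough.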
4f4 Remark. In addition, $g$ can be chosen such that
$$ \inf f(\cdot) \le \inf g(\cdot) \le \sup g(\cdot) \le \sup f(\cdot) \quad\text{and}\quad \sup_{x : g(x) \ne 0} |x| \le \sup_{x : f(x) \ne 0} |x| + \varepsilon . $$
4f5 Remark. A function is integrable if and only if it is the limit of some
sequence of step functions (in the integral convergence), which follows from
4f3 and 4e5. In other words:
The set of all (equivalence classes of) integrable functions is the
closure of the set of all (equivalence classes of) step functions (in
the integral metric).
4f6 Exercise. There exist continuous $g_k : \mathbb{R}^n \to [0,1]$ with uniformly
bounded support such that $\|g_k - \mathbf{1}_{[0,1]^n}\| \to 0$.
Prove it.
1
Hint: uniform continuity, and approximation by step functions.
2
Hint: approximation by $f \cdot \mathbf{1}_{[\varepsilon, 1-\varepsilon]^n}$.
The same holds for every pixel; taking a linear combination and using 4f3
we get the following.
4f7 Corollary. For every integrable $f$ there exist continuous $g_k : \mathbb{R}^n \to \mathbb{R}$
with uniformly bounded support such that $\|g_k - f\| \to 0$. Thus:
The set of all (equivalence classes of) integrable functions is the
closure of the set of all (equivalence classes of) continuous functions
with bounded support (in the integral metric).
4f8 Lemma. If $f$ is integrable, then $f^2 : x \mapsto (f(x))^2$ is integrable.
Proof. Using 4f3 and 4f4 we take step functions1 $g_k$ and a number $M$ such
that $\|g_k - f\| \to 0$ and $|f(\cdot)| \le M$, $|g_k(\cdot)| \le M$. It remains to prove that
$\|g_k^2 - f^2\| \to 0$ (since $g_k^2$ are step functions). We have
$$ |g_k^2(x) - f^2(x)| = |g_k(x) + f(x)| \cdot |g_k(x) - f(x)| \le 2M |g_k(x) - f(x)| , $$
thus, $\|g_k^2 - f^2\| \le 2M \|g_k - f\| \to 0$.

4f9 Corollary. The (pointwise) product of two integrable functions is integrable.
Indeed, $fg = \frac14 \big( (f+g)^2 - (f-g)^2 \big)$.

4f10 Exercise. If $f$ is integrable, then $|f|$, $f^+ = \frac12 (f + |f|)$, $\sin f$, $1 - \cos f$,
and $e^f - 1$ are integrable. If $g$ is also integrable, then $\max(f,g)$ is integrable.
Prove it.2
4f11 Exercise. (a) If $f$ and $f_1$ are equivalent, then $f^2$ and $f_1^2$ are equivalent;
the same holds for $|f|$, $f^+$, $\sin f$, $1 - \cos f$, and $e^f - 1$.
(b) If $[f] = [f_1]$ and $[g] = [g_1]$, then $[fg] = [f_1 g_1]$ and $[\max(f,g)] =
[\max(f_1,g_1)]$.
Prove it.
4f12 Remark. It can happen that $f$ and $f_1$ are equivalent, but $\operatorname{sgn} f$ and
$\operatorname{sgn} f_1$ are not. Here is a counterexample. Let $(r_k)_k$ be an enumeration of all
rational numbers on $[0,1]$; consider $f$ such that $f(r_k) = c_k$ for all $k$, $f(x) = 0$
for irrational $x \in [0,1]$ and for all $x \in \mathbb{R} \setminus [0,1]$. If $c_k \to 0$, then $[f] = [0]$
(think, why); but if $c_k = 1$, then $[f] \ne [0]$ (think, why).
If two continuous functions are equal on a dense set, then they are equal
everywhere. This is not the case for integrable functions. But here is a
surprise.
1
Continuous functions may be used equally well.
2
Hint: consider $(g - f)^+$.
4f13 Exercise. If two integrable functions are equal on a dense set, then
they are equivalent.
Prove it.1
On the other hand, a function equal to an integrable function on a dense
set need not be integrable (think, why).
4f14 Proposition. If $E, F \subset \mathbb{R}^n$ are admissible sets, then the sets $E \cap F$,
$E \cup F$ and $E \setminus F$ are admissible.
Proof. First, $E \cap F$ is admissible since $\mathbf{1}_{E\cap F} = \mathbf{1}_E \mathbf{1}_F$ is integrable by 4f9.
Further, $\mathbf{1}_{E\cup F} = \mathbf{1}_E + \mathbf{1}_F - \mathbf{1}_{E\cap F}$ and $\mathbf{1}_{E\setminus F} = \mathbf{1}_E - \mathbf{1}_{E\cap F}$ are integrable.
4f15 Exercise. Give another proof of 4f14 using $\max(f,g)$ (and $\min(f,g)$)
rather than $fg$.
4f16 Proposition. (a) A function integrable on $\mathbb{R}^n$ is integrable on every
admissible set;
(b) a function integrable on an admissible set is integrable on every admissible
subset of the given set.
Proof. (a) $f \mathbf{1}_E$ is integrable by 4f9.
(b) Given $E \subset F$, the function $f \mathbf{1}_E = (f \mathbf{1}_F) \cdot \mathbf{1}_E$ is integrable by
4f9.
4g Sandwich
The Darboux sums $L_N(f) = -U_N(-f)$ and $U_N(f)$ defined by (4b3), (4b4)
may be thought of as integrals of step functions,
(4g1) $L_N(f) = \int_{\mathbb{R}^n} \ell_{N,f}$ , $U_N(f) = \int_{\mathbb{R}^n} u_{N,f}$ ,
(4g2) $u_{N,f}(a) = \underbrace{\sup_{x \in 2^{-N}(Q+k)} f(x)}_{2^{nN} U_{N,k}(f)}$ for $a \in 2^{-N}(Q^\circ + k)$
and $\ell_{N,f} = -u_{N,-f}$; here $Q = [0,1]^n$, again. In Sect. 4f we did not bother
about values of step functions at points of discontinuity. But sometimes we
need the inequality $\ell_{N,f} \le f \le u_{N,f}$ to hold everywhere (including pixel
boundaries). We can ensure this by taking
(4g3) $2^{-nN} u_{N,f} = \sum_{k \in \mathbb{Z}^n : U_{N,k}(f) > 0} U_{N,k}(f) \mathbf{1}_{2^{-N}(Q+k)} + \sum_{k \in \mathbb{Z}^n : U_{N,k}(f) < 0} U_{N,k}(f) \mathbf{1}_{2^{-N}(Q^\circ+k)}$
1
Hint: $L_N(|f - g|) = 0$.
and $\ell_{N,f} = -u_{N,-f}$ (again). The values of these step functions on pixel
boundaries are somewhat strange but harmless; we have (think, why)
(4g4) $\ell_{N,f} \le f \le u_{N,f}$ ,
(4g5) $2^n \inf f(\cdot) \le \inf \ell_{N,f}(\cdot) \le \sup u_{N,f}(\cdot) \le 2^n \sup f(\cdot)$ ;
the latter shows that $\ell_{N,f}$ and $u_{N,f}$ are bounded, uniformly in $N$.
A box was defined in Sect. 4d as $B = [a_1,b_1] \times \dots \times [a_n,b_n]$. Now we
clarify that $-\infty < a_i \le b_i < +\infty$ for $i = 1, \dots, n$; the degenerate case
$v(B) = 0$ is allowed. Further, we define a step function as a (finite) linear
combination of indicator functions of boxes. (On the level of equivalence
classes this definition conforms to Sect. 4f.)
Note that $\mathbf{1}_{B^\circ}$ is a step function; for a proof, open the brackets in
$$ \big( \mathbf{1}_{[a_1,b_1]}(x_1) - \mathbf{1}_{\{a_1\}}(x_1) - \mathbf{1}_{\{b_1\}}(x_1) \big) \dots \big( \mathbf{1}_{[a_n,b_n]}(x_n) - \mathbf{1}_{\{a_n\}}(x_n) - \mathbf{1}_{\{b_n\}}(x_n) \big) $$
(assuming $a_1 < b_1, \dots, a_n < b_n$, of course). It follows that $\ell_{N,f}$ and $u_{N,f}$ are
step functions.
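4g6 Proposition. For every bounded $f : \mathbb{R}^n \to \mathbb{R}$ with bounded support,
$$ {}_*\!\!\int_{\mathbb{R}^n} f = \sup \Big\{ \int_{\mathbb{R}^n} g : \text{step } g \le f \Big\} , \qquad {}^*\!\!\int_{\mathbb{R}^n} f = \inf \Big\{ \int_{\mathbb{R}^n} h : \text{step } h \ge f \Big\} . $$
Proof. It is sufficient to prove the latter; the former follows via $(-f)$.
"$\le$": $\int_{\mathbb{R}^n} h = {}^*\!\!\int_{\mathbb{R}^n} h \ge {}^*\!\!\int_{\mathbb{R}^n} f$ by 4c7 (monotonicity).
"$\ge$": taking $h = u_{N,f}$ we see that the infimum is $\le \int_{\mathbb{R}^n} u_{N,f} = U_N(f)$ for
all $N$.
Clearly, we have an equivalent definition of integrability and integral.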
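4g7 Corollary. For every bounded $f : \mathbb{R}^n \to \mathbb{R}$ with bounded support,
$$ {}_*\!\!\int_{\mathbb{R}^n} f = \sup \Big\{ \int_{\mathbb{R}^n} g : \text{integrable } g \le f \Big\} , \qquad {}^*\!\!\int_{\mathbb{R}^n} f = \inf \Big\{ \int_{\mathbb{R}^n} h : \text{integrable } h \ge f \Big\} . $$
4g8 Corollary. A function $f : \mathbb{R}^n \to \mathbb{R}$ is integrable if and only if for
every $\varepsilon > 0$ there exist step functions $g$ and $h$ such that $g \le f \le h$ and
$\int_{\mathbb{R}^n} h - \int_{\mathbb{R}^n} g \le \varepsilon$.
We see that an integrable function can be sandwiched between step functions.
Or, alternatively, between continuous functions, see 4g9.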
4g9 Exercise. (a) For every box $B \subset \mathbb{R}^n$ and $\varepsilon > 0$ there exist continuous
functions $g, h : \mathbb{R}^n \to [0,1]$ with bounded support such that $g \le \mathbf{1}_{B^\circ} \le \mathbf{1}_B \le h$
and $\int_{\mathbb{R}^n} h - \int_{\mathbb{R}^n} g \le \varepsilon$;
(b) for every step function $f : \mathbb{R}^n \to \mathbb{R}$ and $\varepsilon > 0$ there exist continuous
functions $g$ and $h$ with bounded support such that $g \le f \le h$ and
$\int_{\mathbb{R}^n} h - \int_{\mathbb{R}^n} g \le \varepsilon$;
(c) the same holds for every integrable $f$.
Prove it.1
4g10 Exercise. (a) Define ${}_*\!\!\int_E f$ and ${}^*\!\!\int_E f$ similarly to 4d5;
(b) prove additivity of the upper integral: ${}^*\!\!\int_{E \uplus F} f = {}^*\!\!\int_E f + {}^*\!\!\int_F f$, and
the same for the lower integral;2
(c) generalize (4d7) to lower and upper integrals.
Thus, if $f$ is not integrable, then the corresponding set function satisfying
(4a2) and (4a3) is not unique; we have at least two such set functions,
$E \mapsto {}_*\!\!\int_E f$ and $E \mapsto {}^*\!\!\int_E f$.
4h Translation (shift) and scaling

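As before, we assume that $f : \mathbb{R}^n \to \mathbb{R}$ is bounded, with bounded support.
Given a function $f$ and a vector $a \in \mathbb{R}^n$, we consider the shifted function
$f(\cdot + a) : x \mapsto f(x + a)$.
If $a \in \mathbb{Z}^n$, then $L_0(f(\cdot + a)) = L_0(f)$ and $U_0(f(\cdot + a)) = U_0(f)$ (think,
why).
Moreover, if $a \in 2^{-N} \mathbb{Z}^n$, then $L_{N+i}(f(\cdot + a)) = L_{N+i}(f)$ and
$U_{N+i}(f(\cdot + a)) = U_{N+i}(f)$ for $i = 0, 1, 2, \dots$, whence
${}_*\!\!\int_{\mathbb{R}^n} f(\cdot + a) = {}_*\!\!\int_{\mathbb{R}^n} f$ and ${}^*\!\!\int_{\mathbb{R}^n} f(\cdot + a) = {}^*\!\!\int_{\mathbb{R}^n} f$.
Our theory is invariant under binary-rational shifts.
What about arbitrary shifts?
4h1 Proposition. $f(\cdot + a)$ is integrable if and only if $f$ is integrable, and
in this case $\int_{\mathbb{R}^n} f(\cdot + a) = \int_{\mathbb{R}^n} f$.
Proof. First, if $f$ is the indicator function of a box, then the claim holds by
4d4.
Second, by linearity the claim holds for step functions.
We apply it to the step functions $g$ and $h$ of 4g6, note that $g \le f \iff
g(\cdot + a) \le f(\cdot + a)$ and $h \ge f \iff h(\cdot + a) \ge f(\cdot + a)$, and conclude that
$$ {}_*\!\!\int_{\mathbb{R}^n} f(\cdot + a) = {}_*\!\!\int_{\mathbb{R}^n} f , \qquad {}^*\!\!\int_{\mathbb{R}^n} f(\cdot + a) = {}^*\!\!\int_{\mathbb{R}^n} f ; $$
thus, ${}_*\!\!\int_{\mathbb{R}^n} f = {}^*\!\!\int_{\mathbb{R}^n} f \iff {}_*\!\!\int_{\mathbb{R}^n} f(\cdot + a) = {}^*\!\!\int_{\mathbb{R}^n} f(\cdot + a)$.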
1
Hint: (a) product of $n$ piecewise linear functions of one variable each; (a) $\Rightarrow$ (b) $\Rightarrow$ (c).
2
Hint: use 4g7.
4h2 Corollary. For every set $E \subset \mathbb{R}^n$ and vector $a \in \mathbb{R}^n$, the shifted set $E + a$
is admissible if and only if $E$ is admissible, and in this case $v(E + a) = v(E)$.
Consider now a linear operator $A : \mathbb{R}^n \to \mathbb{R}^n$ of the form $A(x_1, \dots, x_n) =
(a_1 x_1, \dots, a_n x_n)$ (that is, a diagonal matrix), and assume that $a_1 \ne 0, \dots, a_n \ne 0$
(that is, $A$ is invertible).
4h3 Exercise. $f \circ A$ is integrable if and only if $f$ is integrable, and in this
case $|a_1 \dots a_n| \int_{\mathbb{R}^n} f \circ A = \int_{\mathbb{R}^n} f$.
Prove it.1
4h4 Exercise. For every set $E \subset \mathbb{R}^n$, its image $A(E) = \{Ax : x \in E\}$
is admissible if and only if $E$ is admissible, and in this case $v(A(E)) =
|a_1 \dots a_n| v(E)$.
Prove it.
In particular,
(4h5) $|a|^n \int_{\mathbb{R}^n} f(ax) \, dx = \int_{\mathbb{R}^n} f$ ,
(4h6) $v(aE) = |a|^n v(E)$ .
The following fact is evident for continuous f but, surprisingly, does not
require continuity.
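4h7 Proposition. For every integrable $f : \mathbb{R}^n \to \mathbb{R}$ and $\varepsilon > 0$ there exists
$\delta > 0$ such that for all $a \in \mathbb{R}^n$
$$ |a| \le \delta \implies \|f(\cdot + a) - f\| \le \varepsilon . $$
Proof. First, assume in addition that $f$ is continuous. Then we take $M \in
(0, \infty)$ such that $\{x : f(x) \ne 0\} \subset [-M, M]^n$, and then, using uniform
continuity of $f$, we take $\delta$ such that $|a| \le \delta$ implies
$$ \forall x \quad |f(x + a) - f(x)| \le \frac{\varepsilon}{2^n (M + \delta)^n} . $$
Then $\{x : f(x + a) - f(x) \ne 0\} \subset [-(M + \delta), M + \delta]^n$ (think, why), whence
$$ \int_{\mathbb{R}^n} |f(\cdot + a) - f(\cdot)| \le \max_x |f(x + a) - f(x)| \cdot v\big( [-(M + \delta), M + \delta]^n \big) \le \frac{\varepsilon}{2^n (M + \delta)^n} \big( 2(M + \delta) \big)^n = \varepsilon , $$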
1
Hint: similarly to 4h1.
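that is, $\|f(\cdot + a) - f\| \le \varepsilon$.
Second, given an integrable $f$, by 4f7 there exists a continuous $g : \mathbb{R}^n \to
\mathbb{R}$ with bounded support such that $\|g - f\| \le \varepsilon/3$. We take $\delta$ such that
$|a| \le \delta$ implies $\|g(\cdot + a) - g\| \le \varepsilon/3$. Then, using the triangle inequality
and the shift invariance 4h1,
$$ \|f(\cdot + a) - f\| \le \|f(\cdot + a) - g(\cdot + a)\| + \|g(\cdot + a) - g\| + \|g - f\| \le \|f - g\| + \frac{\varepsilon}{3} + \|g - f\| \le \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon . $$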
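A small sanity check of 4h7 for a discontinuous function, as a sketch under stated assumptions: take $f = \mathbf{1}_{[0,1]}$ on the real line (an illustrative choice). Then $f(\cdot + a)$ is the indicator of $[-a, 1-a]$, and the integral norm of the difference is the length of the symmetric difference of the two intervals. The function name `shift_distance` is hypothetical.

```python
from fractions import Fraction

def shift_distance(a):
    """||1_[0,1](. + a) - 1_[0,1]|| for the indicator of [0, 1] on the real line.

    x -> 1_[0,1](x + a) is the indicator of [-a, 1 - a], so the norm is the
    length of the symmetric difference of [-a, 1 - a] and [0, 1].
    """
    a = Fraction(a)
    overlap = max(Fraction(0), min(Fraction(1), 1 - a) - max(Fraction(0), -a))
    return 2 - 2 * overlap    # both intervals have length 1

for a in (Fraction(1, 2), Fraction(1, 10), Fraction(-1, 100)):
    print(a, shift_distance(a))   # equals 2*|a| whenever |a| <= 1
```

So for this $f$ one may take $\delta = \varepsilon/2$, even though $f$ is not continuous.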
4i The volume under a graph

Here is a rich source of admissible sets.
4i1 Proposition. If a function $f : \mathbb{R}^n \to [0, \infty)$ is integrable, then the set
$$ E = \{(x,t) : 0 < t < f(x)\} \subset \mathbb{R}^n \times \mathbb{R} $$
is admissible, and $v_{n+1}(E) = \int_{\mathbb{R}^n} f$.
Proof. For $N = 0, 1, 2, \dots$ and $k \in \mathbb{Z}^n$ we introduce such boxes in $\mathbb{R}^{n+1}$:
$$ B_{N,k} = 2^{-N}(Q + k) \times [0, 2^{nN} U_{N,k}(f)] , \qquad C_{N,k} = 2^{-N}(Q + k) \times [0, 2^{nN} L_{N,k}(f)] $$
(here $Q = [0,1]^n$, as in Sect. 4b) and note that $\bigcup_k C_{N,k}^\circ \subset E \subset \bigcup_k B_{N,k}$,
therefore (recall (4b3), (4b4) and (4d4))
$$ L_N(f) = \sum_k \underbrace{v(C_{N,k})}_{L_{N,k}(f)} \le v_*(E) \le v^*(E) \le \sum_k \underbrace{v(B_{N,k})}_{U_{N,k}(f)} = U_N(f) $$
for all $N$.
4i2 Corollary. If functions $f, g : \mathbb{R}^n \to \mathbb{R}$ are integrable, then the set
$$ E = \{(x,t) : f(x) < t < g(x)\} \subset \mathbb{R}^n \times \mathbb{R} $$
is admissible.
Proof. We take a box $B \subset \mathbb{R}^n$ such that $f = g = 0$ on $\mathbb{R}^n \setminus B$, and a number
$M$ such that $|f| \le M$, $|g| \le M$ everywhere. Then
$$ E = \{(x,t) : x \in B, \, -M < t < g(x)\} \cap \{(x,t) : x \in B, \, f(x) < t < M\} $$
(think, why). By 4f14 it is sufficient to prove that these two sets are admissible.
The second set becomes similar to the first set after the reflection
$(x,t) \mapsto (x,-t)$ (recall 4h4). The first set is a shift (recall 4h2) by $(0, -M)$
of the set $\{(x,t) : x \in B, \, 0 < t < g(x) + M\}$, admissible by 4i1 applied to
$(g + M) \mathbf{1}_B$.
It is easy to guess that $v_{n+1}(E) = \int_{\mathbb{R}^n} (g - f)^+$. We could prove it now
with some effort.1 However, in the next section we'll get the same effortlessly.
4i3 Exercise. For $f$ as in 4i1, the set
$$ \{(x,t) : t = f(x) > 0\} \subset \mathbb{R}^n \times \mathbb{R} $$
is of volume zero.
Prove it.2
4i4 Exercise. Prove that
(a) the disk $\{x : |x| \le 1\} \subset \mathbb{R}^2$ is admissible;
(b) the ball $\{x : |x| \le 1\} \subset \mathbb{R}^n$ is admissible;
(c) for every $p > 0$ the set $E_p = \{(x_1, \dots, x_n) : |x_1|^p + \dots + |x_n|^p \le 1\} \subset \mathbb{R}^n$
is admissible;
(d) $v(E_p)$ is a strictly increasing function of $p$.
4i5 Exercise. For the balls $E_r = \{x : |x| \le r\} \subset \mathbb{R}^n$ prove that
(a) $v(E_r) = r^n v(E_1)$;
(b) $v(E_r) < e^{-n(1-r)} v(E_1)$ for $r < 1$.
A wonder: in high dimension the volume of a ball is concentrated near
the sphere!
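To get a feeling for the numbers, here is a minimal sketch that tabulates the ratio $v(E_r)/v(E_1) = r^n$ from 4i5(a) against the bound $e^{-n(1-r)}$ from 4i5(b) for $r = 0.99$. The function name `inner_fraction` and the chosen dimensions are illustrative assumptions.

```python
import math

def inner_fraction(n, r):
    """v(E_r) / v(E_1) = r**n, by 4i5(a)."""
    return r ** n

r = 0.99
for n in (2, 10, 100, 1000):
    frac = inner_fraction(n, r)
    bound = math.exp(-n * (1 - r))      # right-hand side of 4i5(b)
    print(n, frac, bound)
```

In dimension 1000 the ball of radius 0.99 carries less than a ten-thousandth of the volume of the unit ball, so almost all the volume lies in the thin shell $0.99 < |x| \le 1$.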
1
Hint: $\sum_k (L_{N,k}(g) - U_{N,k}(f))^+ \le v_*(E) \le v^*(E) \le \sum_k (U_{N,k}(g) - L_{N,k}(f))^+$, and
$(U_{N,k}(g) - L_{N,k}(f))^+ - (L_{N,k}(g) - U_{N,k}(f))^+ \le (U_{N,k}(g) - L_{N,k}(f)) - (L_{N,k}(g) - U_{N,k}(f)) =
(U_{N,k}(g) - L_{N,k}(g)) + (U_{N,k}(f) - L_{N,k}(f))$.
2
Hint: try $f(x) + \varepsilon$.
Index
additivity
additivity of volume
admissible
bounded support
box
convergence: integral, pointwise, uniform
Darboux sums
equivalence class
equivalent
homogeneity
indicator
inner volume
integrable
integrable on E
integral
integral convergence
integral metric
linearity
lower and upper integrals
mean value
monotonicity
negligible
outer volume
pixel
seminorm
set function
step function
subadditivity
superadditivity
volume
volume zero
Notation: $[f]$, $f_E$, $\mathbf{1}_E$, $\int_E f$, ${}_*\!\!\int_{\mathbb{R}^n} f$, ${}^*\!\!\int_{\mathbb{R}^n} f$, $L(f)$, $L_N(f)$, $\ell_{N,f}$, $\|\cdot\|$, $U(f)$, $U_N(f)$, $u_{N,f}$, $v(E)$, $v_*(E)$, $v^*(E)$
5 Iterated integral
5a Introduction
5b Simple cases
5c Some counterexamples
5d Integrable functions
5e Cavalieri's principle
The iterated integral is an indispensable tool for calculating multidimensional
integrals (in particular, volumes).
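5a Introduction
It is easy to see that
$$ \varepsilon^2 \sum_{k,l \in \mathbb{Z}} f(\varepsilon k, \varepsilon l) \to \int_{\mathbb{R}^2} f \quad \text{as } \varepsilon \to 0 $$
for every continuous $f : \mathbb{R}^2 \to \mathbb{R}$ with bounded support. The double summation
is evidently equivalent to iterated summation,
$$ \varepsilon^2 \sum_{k,l \in \mathbb{Z}} f(\varepsilon k, \varepsilon l) = \varepsilon \sum_{k \in \mathbb{Z}} \Big( \varepsilon \sum_{l \in \mathbb{Z}} f(\varepsilon k, \varepsilon l) \Big) , $$
which suggests that
$$ \int_{\mathbb{R}^2} f = \int_{\mathbb{R}} \Big( \int_{\mathbb{R}} f(x,y) \, dy \Big) dx , $$
(alternative notation: $\iint f(x,y) \, dx dy = \int dx \int dy \, f(x,y)$, and the like), that
is,
(5a1) $\int_{\mathbb{R}^2} f = \int_{\mathbb{R}} \Big( x \mapsto \int_{\mathbb{R}} f(x, \cdot) \Big)$ ,
where $f(x, \cdot) : \mathbb{R} \to \mathbb{R}$ is defined by $f(x, \cdot) : y \mapsto f(x,y)$.

It should be very useful to integrate with respect to one variable at a
time.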
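The heuristic above can be tried numerically. The following minimal sketch uses the test function $f(x,y) = \max(0, 1 - x^2 - y^2)$, an illustrative choice that is continuous with bounded support and has integral $\pi/2$. The function names and the grid step are assumptions made for the example.

```python
import math

def f(x, y):
    """Continuous with bounded support (illustrative choice); its integral is pi/2."""
    return max(0.0, 1.0 - x * x - y * y)

def double_sum(eps):
    """eps^2 times the sum of f(eps*k, eps*l) over all lattice points meeting the support."""
    n = int(1 / eps) + 1                       # |eps * k| <= 1 covers the support
    return eps ** 2 * sum(f(eps * k, eps * l)
                          for k in range(-n, n + 1)
                          for l in range(-n, n + 1))

def iterated_sum(eps):
    """eps times the sum over k of (eps times the sum over l of f(eps*k, eps*l))."""
    n = int(1 / eps) + 1

    def inner(k):
        return eps * sum(f(eps * k, eps * l) for l in range(-n, n + 1))

    return eps * sum(inner(k) for k in range(-n, n + 1))

eps = 0.02
print(double_sum(eps), iterated_sum(eps), math.pi / 2)
```

The two sums agree up to rounding, and both are close to $\pi/2$; the questions listed next concern what survives of this picture for general integrable functions.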
Related problems:
• does integrability of $f$ imply integrability of $f(x, \cdot)$ for every $x$?
• is the function $x \mapsto \int_{\mathbb{R}} f(x, \cdot)$ integrable?
• is the two-dimensional integral equal to the iterated integral?
• if the iterated integral is well-defined, does it follow that $f$ is integrable?
And, of course, we need a multidimensional theory; $\mathbb{R}^2$ is only the simplest
case.
Some authors1 impose on $f$ additional requirements. Others2 consider
all integrable functions $f$; we do so, too, in Sect. 5d, but first we consider
simpler cases (Sect. 5b) and counterexamples (Sect. 5c).
5b Simple cases
step functions
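First we consider a step function $f : \mathbb{R}^2 \to \mathbb{R}$, treated as in Sect. 4g: a
linear combination of indicator functions of boxes (and boxes of volume zero
are allowed).
Given $B = [a_1,b_1] \times [a_2,b_2]$ and $f = \mathbf{1}_B$, we have
$$ f(x, \cdot) = \mathbf{1}_{[a_1,b_1]}(x) \mathbf{1}_{[a_2,b_2]} ; \qquad \int_{\mathbb{R}} f(x, \cdot) = (b_2 - a_2) \mathbf{1}_{[a_1,b_1]}(x) ; $$
$$ \int_{\mathbb{R}} \Big( x \mapsto \int_{\mathbb{R}} f(x, \cdot) \Big) = (b_1 - a_1)(b_2 - a_2) = v(B) = \int_{\mathbb{R}^2} f . $$
(Alternative notation: $\int dy \, f(x,y) = (b_2 - a_2) \mathbf{1}_{[a_1,b_1]}(x)$; $\int dx \int dy \, f(x,y) =
v(B) = \iint f(x,y) \, dx dy$.)
Similarly, given a box $B \subset \mathbb{R}^{m+n}$, we have $B = B_1 \times B_2$ for some boxes
$B_1 \subset \mathbb{R}^m$, $B_2 \subset \mathbb{R}^n$; thus, $f(x, \cdot) = \mathbf{1}_{B_1}(x) \mathbf{1}_{B_2}$; $\int_{\mathbb{R}^n} f(x, \cdot) = v(B_2) \mathbf{1}_{B_1}(x)$;
$$ \int_{\mathbb{R}^m} \Big( x \mapsto \int_{\mathbb{R}^n} f(x, \cdot) \Big) = v(B_1) v(B_2) = v(B) = \int_{\mathbb{R}^{m+n}} f . $$
By linearity,
$$ \int_{\mathbb{R}^m} \Big( x \mapsto \int_{\mathbb{R}^n} f(x, \cdot) \Big) = \int_{\mathbb{R}^{m+n}} f $$
for every step function $f : \mathbb{R}^{m+n} \to \mathbb{R}$; in this case, all sections $f(x, \cdot)$ are step
functions, and the function $x \mapsto \int_{\mathbb{R}^n} f(x, \cdot)$ is also a step function. Similarly,
(5b1) $\int_{\mathbb{R}^n} \Big( y \mapsto \int_{\mathbb{R}^m} f(\cdot, y) \Big) = \int_{\mathbb{R}^{m+n}} f = \int_{\mathbb{R}^m} \Big( x \mapsto \int_{\mathbb{R}^n} f(x, \cdot) \Big)$ .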
1
Lang, Shifrin, Shurman.
2
Burkill, Hubbard, Zorich.
continuous functions
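Now we consider a continuous function $f : \mathbb{R}^{m+n} \to \mathbb{R}$ with bounded
support. Integrability of $f$ is ensured by 4f1(a), as well as integrability of
$f(x, \cdot)$.
If $x_k \to x$, then $f(x_k, \cdot) \to f(x, \cdot)$ uniformly (due to uniform continuity
of $f$), whence by 4e6 $\int_{\mathbb{R}^n} f(x_k, \cdot) \to \int_{\mathbb{R}^n} f(x, \cdot)$; we see that the function
$x \mapsto \int_{\mathbb{R}^n} f(x, \cdot)$ is continuous. Clearly it has a bounded support, and therefore
is integrable.
Now we use the sandwich. Given $\varepsilon > 0$, by 4g8 there exist step functions
$g, h$ such that $g \le f \le h$ and $\int h - \int g \le \varepsilon$.1 We have $g(x, \cdot) \le f(x, \cdot) \le
h(x, \cdot)$ everywhere; $\int_{\mathbb{R}^n} g(x, \cdot) \le \int_{\mathbb{R}^n} f(x, \cdot) \le \int_{\mathbb{R}^n} h(x, \cdot)$ for all $x$. On one
hand,
$$ \int_{\mathbb{R}^{m+n}} g = \int_{\mathbb{R}^m} \Big( x \mapsto \int_{\mathbb{R}^n} g(x, \cdot) \Big) \le \int_{\mathbb{R}^m} \Big( x \mapsto \int_{\mathbb{R}^n} f(x, \cdot) \Big) \le \int_{\mathbb{R}^m} \Big( x \mapsto \int_{\mathbb{R}^n} h(x, \cdot) \Big) = \int_{\mathbb{R}^{m+n}} h ; $$
on the other hand, $\int_{\mathbb{R}^{m+n}} g \le \int_{\mathbb{R}^{m+n}} f \le \int_{\mathbb{R}^{m+n}} h$. We see that
$$ \Big| \int_{\mathbb{R}^{m+n}} f - \int_{\mathbb{R}^m} \Big( x \mapsto \int_{\mathbb{R}^n} f(x, \cdot) \Big) \Big| \le \varepsilon , $$
since both numbers belong to the interval $[\int g, \int h]$ of length $\le \varepsilon$. We conclude
that (5b1) holds for every continuous $f$ with bounded support.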
5b2 Exercise. Prove that
$$ \int_{\mathbb{R}^{m+n}} f(x_1, \dots, x_m) g(y_1, \dots, y_n) \, dx_1 \dots dx_m \, dy_1 \dots dy_n = \Big( \int_{\mathbb{R}^m} f(x_1, \dots, x_m) \, dx_1 \dots dx_m \Big) \Big( \int_{\mathbb{R}^n} g(y_1, \dots, y_n) \, dy_1 \dots dy_n \Big) $$
for continuous functions $f : \mathbb{R}^m \to \mathbb{R}$, $g : \mathbb{R}^n \to \mathbb{R}$ with bounded support.
5b3 Exercise. Calculate each integral in two ways:
(a) $\int_0^1 dx \int_0^1 dy \, e^{x+y}$;
(b) $\int_0^1 dy \int_0^{\pi/2} dx \, xy \cos(x + y)$.
1
This argument applies to all integrable f , of course; but (for now) the continuity
ensures existence of the iterated integral.
5b4 Exercise. Calculate integrals
(a) $\int_{[0,1]^n} (x_1^2 + \dots + x_n^2) \, dx_1 \dots dx_n$;
(b) $\int_{[0,1]^n} (x_1 + \dots + x_n)^2 \, dx_1 \dots dx_n$.
5b5 Exercise. For every continuous function $f : \mathbb{R}^2 \to \mathbb{R}$ with bounded
support,
$$ \iint_{\mathbb{R}^2} f(x, y + \sin x) \, dx dy = \iint_{\mathbb{R}^2} f(x,y) \, dx dy . $$
Prove it.

5b6 Exercise. For every continuous function $f : \mathbb{R}^2 \to \mathbb{R}$ with bounded
support,
$$ \iint_{\mathbb{R}^2} f\Big( x^3 + x, \frac{y}{3x^2 + 1} \Big) \, dx dy = \iint_{\mathbb{R}^2} f(x,y) \, dx dy . $$
Prove it.

5c Some counterexamples
5c1 Example.1 Integrability of $f$ does not imply integrability of $f(x, \cdot)$ for
every $x$.
Define $f : \mathbb{R}^2 \to \mathbb{R}$ by
$$ f(x,y) = \begin{cases} 1 & \text{if } x = 0 \text{ and } y \in [0,1] \text{ is rational,} \\ 0 & \text{otherwise.} \end{cases} $$
Then $f = 0$ outside a set $\{0\} \times [0,1]$ of area 0, therefore $f$ (being negligible)
is integrable. However, $f(0, \cdot)$ is not integrable. On the other hand, $f(\cdot, y)$
(being negligible) is integrable for every $y$, and $\int_{\mathbb{R}} dy \int_{\mathbb{R}} dx \, f(x,y) = 0 =
\int_{\mathbb{R}^2} f$.
5c2 Example. Existence of the iterated integral2 does not imply boundedness
(let alone integrability) of $f$, even if $f$ is positive and symmetric
in the sense that $f(x,y) = f(y,x)$ (and therefore the iterated integrals
$\int dx \int dy \, f(x,y)$, $\int dy \int dx \, f(x,y)$ are both well-defined, and equal).
1
Shifrin, Example 5 on p. 281.
2
That is, integrability of $f(x, \cdot)$ for all $x$ and integrability of the function $x \mapsto \int f(x, \cdot)$.
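Define $f : \mathbb{R}^2 \to \mathbb{R}$ by
$$ f(x,y) = \begin{cases} \frac{1}{\sqrt{x+y}} & \text{if } x, y \in (0,1), \\ 0 & \text{otherwise} \end{cases} $$
and observe that
$$ \int_{\mathbb{R}} f(x, \cdot) = \int_0^1 \frac{dy}{\sqrt{x+y}} = 2\sqrt{x+y} \, \Big|_{y=0}^{y=1} = 2\sqrt{x+1} - 2\sqrt{x} $$
for $x \in (0,1)$, evidently an integrable function.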
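5c3 Example.1 Existence of both iterated integrals does not imply their
equality, even if $f$ is antisymmetric in the sense that $f(x,y) = -f(y,x)$.
Define $f : \mathbb{R}^2 \to \mathbb{R}$ by
$$ f(x,y) = \begin{cases} 1/y^2 & \text{if } 0 < x < y < 1, \\ -1/x^2 & \text{if } 0 < y < x < 1, \\ 0 & \text{otherwise;} \end{cases} $$
then
$$ \int_{\mathbb{R}} f(x, \cdot) = -\int_0^x \frac{1}{x^2} \, dy + \int_x^1 \frac{1}{y^2} \, dy = -\frac{1}{x^2} \cdot x - \frac{1}{y} \Big|_{y=x}^{y=1} = -\frac{1}{x} - 1 + \frac{1}{x} = -1 $$
for all $x \in (0,1)$. Thus, one iterated integral is negative ($-1$). By the
antisymmetry, the other iterated integral is positive ($+1$).
Or, alternatively,
$$ f(x,y) = \begin{cases} \frac{x-y}{(x+y)^3} & \text{if } x, y \in (0,1), \\ 0 & \text{otherwise;} \end{cases} $$
here
$$ \int_{\mathbb{R}} f(x, \cdot) = \int_0^1 \frac{x-y}{(x+y)^3} \, dy = \int_0^1 \frac{2x - (x+y)}{(x+y)^3} \, dy = 2x \int_0^1 \frac{dy}{(x+y)^3} - \int_0^1 \frac{dy}{(x+y)^2} $$
$$ = 2x \cdot \Big( -\frac12 \Big) \frac{1}{(x+y)^2} \Big|_{y=0}^{y=1} - (-1) \frac{1}{x+y} \Big|_{y=0}^{y=1} = -x \Big( \frac{1}{(x+1)^2} - \frac{1}{x^2} \Big) + \frac{1}{x+1} - \frac{1}{x} = \frac{-x + (x+1)}{(x+1)^2} = \frac{1}{(x+1)^2} $$
for all $x \in (0,1)$. Thus, one iterated integral is positive (in fact, $1/2$). By
the antisymmetry, the other iterated integral is negative ($-1/2$).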
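A quick numerical cross-check of the second example, as a sketch under stated assumptions: the inner integral is taken from the closed form $1/(x+1)^2$ (equivalently, from the antiderivative $y \mapsto y/(x+y)^2$ of $(x-y)/(x+y)^3$), and only the outer integral is approximated by the midpoint rule. The function names and the number of nodes are hypothetical choices.

```python
def inner_dy(x):
    """Integral over y in (0,1) of (x-y)/(x+y)**3, via the antiderivative y/(x+y)**2."""
    return 1.0 / (1.0 + x) ** 2

def inner_dx(y):
    """By the antisymmetry f(x,y) = -f(y,x), integrating over x first flips the sign."""
    return -inner_dy(y)

def midpoint(g, n=2000):
    """Midpoint rule for the integral of g over (0, 1)."""
    return sum(g((i + 0.5) / n) for i in range(n)) / n

dx_dy = midpoint(inner_dy)   # integral dx of (integral dy f): close to +1/2
dy_dx = midpoint(inner_dx)   # integral dy of (integral dx f): close to -1/2
print(dx_dy, dy_dx)
```

Note that a naive symmetric double Riemann sum would return 0 by antisymmetry and hide the effect; the order of the two limiting processes is exactly what matters here.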
1
Burkill, Exercise 9 on p. 265.
5c4 Remark. One may wonder, does existence of both iterated integrals
imply their equality if f is just bounded (but not necessarily integrable)?
Surprisingly, the answer is affirmative.1,2,3 It may be tempting to use this
fact for enlarging the two-dimensional integral. However, what about change
of variables then?
5c5 Example. Existence of the iterated integral does not imply integrability
of $f$ even if $f$ is bounded and symmetric (and therefore both iterated integrals
exist and are equal).
Here we use existence of a dense countable set $A \subset (0,1) \times (0,1)$, symmetric
(in the sense that $(x,y) \in A \iff (y,x) \in A$) and such that
$\{y : (x,y) \in A\}$ is finite for every $x$.
For instance,4 the set of all $\big( \frac{i}{q}, \frac{j}{q} \big) \in (0,1) \times (0,1)$ for natural $i, j$ and
prime $q$.
Or the set of all $\big( (2i-1)/2^n, (2j-1)/2^n \big) \in (0,1) \times (0,1)$.
Or the set of all $(x,y) \in (0,1) \times (0,1)$ such that $x\sqrt{2} + y$ and $x + y\sqrt{2}$
are (both) rational.
For every such $A$, its indicator function $f = \mathbf{1}_A$ satisfies $0 = {}_*\!\!\int_{\mathbb{R}^2} f <
{}^*\!\!\int_{\mathbb{R}^2} f = 1$ and $\int_{\mathbb{R}} f(x, \cdot) = 0$ for all $x$.
5c6 Exercise.5 Consider a function $f : \mathbb{R}^2 \to \mathbb{R}$ of the form $f(x,y) =
g(x) h(y)$ where $g, h : \mathbb{R} \to \mathbb{R}$ are bounded functions with bounded support.
(a) If $g$ is negligible, then $f$ is negligible. Prove it.6
(b) Integrability of $f$ does not imply that the set $\{x : f(x, \cdot) \text{ is not integrable}\}$
is of volume zero. Find a counterexample.7,8
5d Integrable functions
Recall that every integrable function is bounded, with bounded support.
1
In Riemann integration, of course. In Lebesgue integration the corresponding problem
is more complicated.
2
Lichtenstein 1911, Fichtenholz 1913; see Sect. 16.6 in the book "An interactive introduction
to mathematical analysis" by J.W. Lewin.
3
Amazingly, such $f$ need not be Lebesgue measurable. (Basically, Sierpinski 1920;
see the book "Measure theory" by V.I. Bogachev, vol. 1, Item 3.10.49 on page 232). I thank
Yonatan Shelah for this note.
4
Burkill, Exercise 8 on page 265; Shifrin, Example 7 on page 282.
5
Burkill, Exercise 6 on page 264.
6
Hint: $|g| \le$ (step function) whose integral is $\le \varepsilon$; $|h| \le C \mathbf{1}_{[-M,M]}$; then ${}^*\!\!\int |f| \le 2CM\varepsilon$.
7
Hint: recall 4f12, use both cases ($c_k \to 0$, and $c_k = 1$); use (a).
8
Contrary to: Hubbard, Corollary A16.3 on page 724. Do you see the error there in
the proof?
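5d1 Theorem. If a function $f : \mathbb{R}^{m+n} \to \mathbb{R}$ is integrable, then the iterated
integrals
$$ \int_{\mathbb{R}^m} dx \; {}_*\!\!\int_{\mathbb{R}^n} dy \, f(x,y) , \qquad \int_{\mathbb{R}^m} dx \; {}^*\!\!\int_{\mathbb{R}^n} dy \, f(x,y) , $$
$$ \int_{\mathbb{R}^n} dy \; {}_*\!\!\int_{\mathbb{R}^m} dx \, f(x,y) , \qquad \int_{\mathbb{R}^n} dy \; {}^*\!\!\int_{\mathbb{R}^m} dx \, f(x,y) $$
are well-defined and equal to
$$ \iint_{\mathbb{R}^{m+n}} f(x,y) \, dx dy . $$
Clarification. The claim that $\int dx \; {}_*\!\!\int dy \, f(x,y)$ is well-defined means that
the function $x \mapsto {}_*\!\!\int dy \, f(x,y)$ is integrable.
The equality
$$ \int \Big( x \mapsto {}_*\!\!\int f(x, \cdot) \Big) = \int \Big( x \mapsto {}^*\!\!\int f(x, \cdot) \Big) $$
implies integrability (with the same integral) of every function sandwiched
between the lower and upper integrals.1 It is convenient to interpret $x \mapsto
\int f(x, \cdot)$ as any such function and write, as before,
$$ \int_{\mathbb{R}^{m+n}} f = \int_{\mathbb{R}^m} \Big( x \mapsto \int_{\mathbb{R}^n} f(x, \cdot) \Big) $$
and
$$ \int dx \int dy \, f(x,y) = \iint f(x,y) \, dx dy = \int dy \int dx \, f(x,y) $$
even though $f_x = f(x, \cdot)$ may be non-integrable for some $x$.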

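Theorem 5d1 is proved via sandwiching (recall Sect. 4g), either by step
functions or by continuous functions. Let us use the former.
Proof. By (4g6), ${}^*\!\!\int_{\mathbb{R}^{m+n}} f = \inf_{h \ge f} \int_{\mathbb{R}^{m+n}} h$ where $h$ runs over all step functions.
For every such $h$, $\int_{\mathbb{R}^{m+n}} h = \int_{\mathbb{R}^m} \big( x \mapsto \int_{\mathbb{R}^n} h(x, \cdot) \big)$ by (5b1). We
have $\int_{\mathbb{R}^n} h(x, \cdot) = {}^*\!\!\int_{\mathbb{R}^n} h(x, \cdot) \ge {}^*\!\!\int_{\mathbb{R}^n} f(x, \cdot)$ (since $h(x, \cdot) \ge f(x, \cdot)$), thus,
$\int_{\mathbb{R}^{m+n}} h \ge {}^*\!\!\int_{\mathbb{R}^m} \big( x \mapsto {}^*\!\!\int_{\mathbb{R}^n} f(x, \cdot) \big)$ for all these $h$. Therefore
$$ {}^*\!\!\int_{\mathbb{R}^{m+n}} f \ge {}^*\!\!\int_{\mathbb{R}^m} \Big( x \mapsto {}^*\!\!\int_{\mathbb{R}^n} f(x, \cdot) \Big) . $$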
Rm+n Rm Rn
1
But not every bounded function that is equal to the integral whenever it exists! In
contrast to Lebesgue integration, here we cannot take 0 whenever the integral does not
exist; recall 5c6(b). See also Zorich, Sect. 11.4.3, Exercise 1(c).
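Similarly (or via $(-f)$),
$$ {}_*\!\!\int_{\mathbb{R}^{m+n}} f \le {}_*\!\!\int_{\mathbb{R}^m} \Big( x \mapsto {}_*\!\!\int_{\mathbb{R}^n} f(x, \cdot) \Big) . $$
Using integrability of $f$,
$$ \int_{\mathbb{R}^{m+n}} f \le {}_*\!\!\int_{\mathbb{R}^m} \Big( x \mapsto {}_*\!\!\int_{\mathbb{R}^n} f(x, \cdot) \Big) \le {}^*\!\!\int_{\mathbb{R}^m} \Big( x \mapsto {}^*\!\!\int_{\mathbb{R}^n} f(x, \cdot) \Big) \le \int_{\mathbb{R}^{m+n}} f , $$
therefore
$$ \int_{\mathbb{R}^{m+n}} f = {}_*\!\!\int_{\mathbb{R}^m} \Big( x \mapsto {}_*\!\!\int_{\mathbb{R}^n} f(x, \cdot) \Big) = {}^*\!\!\int_{\mathbb{R}^m} \Big( x \mapsto {}^*\!\!\int_{\mathbb{R}^n} f(x, \cdot) \Big) . $$
Integrability of the function $x \mapsto {}_*\!\!\int_{\mathbb{R}^n} f(x, \cdot)$ follows, since
$$ \int_{\mathbb{R}^{m+n}} f = {}_*\!\!\int_{\mathbb{R}^m} \Big( x \mapsto {}_*\!\!\int_{\mathbb{R}^n} f(x, \cdot) \Big) \le {}^*\!\!\int_{\mathbb{R}^m} \Big( x \mapsto {}_*\!\!\int_{\mathbb{R}^n} f(x, \cdot) \Big) \le {}^*\!\!\int_{\mathbb{R}^m} \Big( x \mapsto {}^*\!\!\int_{\mathbb{R}^n} f(x, \cdot) \Big) = \int_{\mathbb{R}^{m+n}} f . $$
Similarly, the function $x \mapsto {}^*\!\!\int_{\mathbb{R}^n} f(x, \cdot)$ is also integrable. Thus,
$$ \int_{\mathbb{R}^{m+n}} f = \int_{\mathbb{R}^m} \Big( x \mapsto {}_*\!\!\int_{\mathbb{R}^n} f(x, \cdot) \Big) = \int_{\mathbb{R}^m} \Big( x \mapsto {}^*\!\!\int_{\mathbb{R}^n} f(x, \cdot) \Big) . $$
The other two iterated integrals are treated similarly (or via $\tilde f(y,x) =
f(x,y)$).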
5d2 Exercise. Give another proof of 5d1, via sandwiching by continuous
functions.
5d3 Exercise. Generalize 5b2 to integrable functions
(a) assuming integrability of the function $(x,y) \mapsto f(x) g(y)$,
(b) deducing integrability of the function $(x,y) \mapsto f(x) g(y)$ from integrability
of $f$ and $g$ (via sandwich).
5d4 Exercise. For every integrable function $f : \mathbb{R}^2 \to \mathbb{R}$ the function $(x,y) \mapsto
f(x, y + \sin x)$ is also integrable, and
$$ \iint_{\mathbb{R}^2} f(x, y + \sin x) \, dx dy = \iint_{\mathbb{R}^2} f(x,y) \, dx dy . $$
Prove it.1
1
Hint: use 5b5.
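5d5 Exercise. For every integrable function $f : \mathbb{R}^2 \to \mathbb{R}$ the function $(x,y) \mapsto
f\big( x^3 + x, \frac{y}{3x^2+1} \big)$ is also integrable, and
$$ \iint_{\mathbb{R}^2} f\Big( x^3 + x, \frac{y}{3x^2 + 1} \Big) \, dx dy = \iint_{\mathbb{R}^2} f(x,y) \, dx dy . $$
Prove it.1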
5e Cavalieris principle
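5e1 Exercise. If $E_1 \subset \mathbb{R}^m$ and $E_2 \subset \mathbb{R}^n$ are admissible sets then the set
$E = E_1 \times E_2 \subset \mathbb{R}^{m+n}$ is admissible.
Prove it.
Applying Theorem 5d1 to a function $f \mathbf{1}_E$ and taking 4d5 into account we
get the following.
5e2 Corollary. Let $f : \mathbb{R}^{m+n} \to \mathbb{R}$ be integrable on every box, and $E \subset
\mathbb{R}^{m+n}$ an admissible set; then
$$ \int_E f = \int_{\mathbb{R}^m} \Big( x \mapsto \int_{E_x} f_x \Big) $$
where $E_x = \{y : (x,y) \in E\} \subset \mathbb{R}^n$ for $x \in \mathbb{R}^m$.
Clarification. First, note that $\{x : E_x \ne \emptyset\}$ is bounded, and $\int_\emptyset f_x =
0$. Second: it may happen that $\int_{E_x} f_x$ is ill-defined for some $x$; then it is
interpreted as anything between ${}_*\!\!\int f_x \mathbf{1}_{E_x}$ and ${}^*\!\!\int f_x \mathbf{1}_{E_x}$.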
In particular, taking $f(\cdot) = 1$ we get
(5e3) $v_{m+n}(E) = \int_{\mathbb{R}^m} v_n(E_x) \, dx$
where $v_k$ is the volume in $\mathbb{R}^k$. For instance, the volume of a 3-dimensional
geometric body is the 1-dimensional integral of the area of the 2-dimensional
section of the body.
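As a concrete instance of (5e3), here is a minimal sketch for the unit ball in $\mathbb{R}^3$: each section $E_x$ with $|x| < 1$ is a disk of radius $\sqrt{1 - x^2}$, so $v_2(E_x) = \pi (1 - x^2)$. The function names and the use of the midpoint rule are illustrative assumptions.

```python
import math

def section_area(x):
    """v_2(E_x) for the unit ball E in R^3: a disk of radius sqrt(1 - x^2)."""
    return math.pi * (1.0 - x * x) if abs(x) < 1 else 0.0

def volume_by_sections(n=1000):
    """Midpoint rule for the integral of v_2(E_x) over [-1, 1], as in (5e3)."""
    step = 2.0 / n
    return step * sum(section_area(-1.0 + (i + 0.5) * step) for i in range(n))

print(volume_by_sections(), 4 * math.pi / 3)
```

The one-dimensional integral of section areas reproduces the familiar value $4\pi/3$ up to the quadrature error.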
5e4 Corollary. If admissible sets $E, F \subset \mathbb{R}^3$ satisfy $v_2(E_x) = v_2(F_x)$ for all
$x$ then $v_3(E) = v_3(F)$.2
1
Hint: use 5b6.
2
It is sufficient to check the equality for all x of a dense subset of R (since two Riemann
integrable functions equal on a dense set must have equal integrals by 4f13).
This is a modern formulation of Cavalieri's principle:1,2

Suppose two regions in three-space (solids) are
included between two parallel planes. If ev-
ery plane parallel to these two planes intersects
both regions in cross-sections of equal area, then
the two regions have equal volumes.
Before emergence of the integral calculus, Cavalieri was able to calculate
some volumes by ingenious use of this principle. Here are two examples.
First, the volume of the upper half of a sphere is equal to the volume of a
cylinder minus volume of a cone:
Second, when a hole of length h is drilled straight through the center of a

sphere, the volume of the remaining material surprisingly does not depend
on the size of the sphere:
5e5 Exercise. Check the two results of Cavalieri noted above.

1
Bonaventura Francesco Cavalieri (in Latin, Cavalerius) (1598-1647), Italian mathematician.
2
Images (and some text) from Wikipedia, "Cavalieri's principle".
5e6 Exercise. Check a famous result of Archimedes:1,2 a sphere inscribed

within a cylinder has two thirds of the volume of the cylinder.
Moreover, show that the volumes of a cone, sphere and cylinder of the same
radius and height are in the ratio 1 : 2 : 3.
5e7 Exercise. For $f$, $g$ and $E$ as in 4i2 prove that
(a) $v_{n+1}(E) = \int_{\mathbb{R}^n} (g - f)^+$;
(b) $\int_E h = \int_{\mathbb{R}^n} dx \, \mathbf{1}_{f<g}(x) \int_{f(x)}^{g(x)} dt \, h(x,t)$ for every $h : E \to \mathbb{R}$ integrable
on $E$.
5e8 Remark. Here $\mathbf{1}_{f<g}$ is the indicator of the set $\{x : f(x) < g(x)\}$.
This set need not be admissible (it can be a dense countable set, recall
4f12).3 And nevertheless, the iterated integral is well-defined (according to
the clarifications. . . ).
5e9 Remark. Cavalieri's principle is about parallel planes. What about
parallel surfaces or curves? Applying 5d4 to $f = \mathbf{1}_E$ we get the following:
if admissible sets $E, F \subset \mathbb{R}^2$ satisfy $v_1(E_y) = v_1(F_y)$ for all $y$ then $v_2(E) =
v_2(F)$; here $E_y = \{x : (x, y + \sin x) \in E\}$ (and the same for $F_y$). But do not
think that $v_1(E_y)$ is the length of the sinusoid inside $E$; it is not.
1
Archimedes (c. 287-212 BC), a Greek mathematician, generally considered to be the
greatest mathematician of antiquity and one of the greatest of all time.
Cicero describes visiting the tomb of Archimedes, which was surmounted by a sphere in-
scribed within a cylinder. Archimedes . . . regarded this as the greatest of his mathematical
achievements.
2
Images (and some text) from Wikipedia, Volume (section Volume ratios for a
cone, sphere and cylinder of the same radius and height).
3
And even if f and g are continuously differentiable, still, this set is just open (not
necessarily admissible), see Sect. 2a, Footnote 2 on page 22.
Here is another case: $E_r = \{\theta \in [0, 2\pi) : (r\cos\theta, r\sin\theta) \in E\}$; now $r\,v_1(E_r)$ is the length of the part of the circle of radius $r$ that lies inside $E$; and in fact, the equality $v_1(E_r) = v_1(F_r)$ for all $r$ implies $v_2(E) = v_2(F)$.
Note that the parallel circles are equidistant; the parallel sinusoids are not. However, curvilinear integration is postponed to Analysis 4.
5e10 Exercise.1 Consider the set $E = \{(x,y,z) : 0 \le z \le 1 - x^2 - y^2\} \subset \mathbb{R}^3$.
(a) Find the volume of $E$ via $\int v_2(E^z)\,dz$.
(b) Using (a) and the equality $\int v_2(E^z)\,dz = \int v_1(E_{x,y})\,dx\,dy$, find the mean2 of the function $(x,y) \mapsto 1 - x^2 - y^2$ on the disk $\{(x,y) : x^2 + y^2 \le 1\} \subset \mathbb{R}^2$.
(c) Similarly to (a), (b), find the mean of the function $x \mapsto |x|^p$ on the ball $\{x : |x| \le 1\} \subset \mathbb{R}^n$ for $p \in (0,\infty)$.3
5e11 Exercise. Calculate the integral
$$\iiint_E (x_1^2 + x_2^2 + x_3^2)\, dx_1\,dx_2\,dx_3,$$
where $E = \{(x_1,x_2,x_3) \in [0,\infty)^3 : x_1 + x_2 + x_3 \le a\} \subset \mathbb{R}^3$.
Answer: $a^5/20$.
5e12 Exercise. Find the volume of the intersection of two solid cylinders in $\mathbb{R}^3$: $\{x_1^2 + x_2^2 \le 1\}$ and $\{x_1^2 + x_3^2 \le 1\}$.
Answer: $16/3$.
1
Exam of 26.01.14, Question 4.
2
Recall the end of Sect. 4d.
3
Hint: you do not need the volume of the ball (nor the area of the disk)! And of course, $|x|^p$ stands for $(x_1^2 + \dots + x_n^2)^{p/2}$.
5e13 Exercise. Find the volume of the solid in $\mathbb{R}^3$ under the paraboloid $\{x_1^2 + x_2^2 = x_3\}$ and above the square $[0,1]^2 \times \{0\}$.
Answer: $2/3$.
5e14 Exercise. Let $f : \mathbb{R} \to \mathbb{R}$ be a continuous function. Prove that
$$\int_0^x dx_1 \int_0^{x_1} dx_2 \dots \int_0^{x_{n-1}} dx_n\, f(x_n) = \int_0^x f(t)\, \frac{(x-t)^{n-1}}{(n-1)!}\, dt.$$
5e15 Example. Let us calculate the integral
$$\int_{[0,1]^n} \max(x_1,\dots,x_n)\, dx_1 \dots dx_n.$$
First of all, by symmetry, we assume that $1 \ge x_1 \ge x_2 \ge \dots \ge x_n \ge 0$, and multiply the answer by $n!$. Then $\max(x_1,\dots,x_n) = x_1$, and we get
$$n! \int_0^1 x_1\,dx_1 \int_0^{x_1} dx_2 \dots \int_0^{x_{n-1}} dx_n = n! \int_0^1 \frac{x_1^n\,dx_1}{(n-1)!} = \frac{n}{n+1}.$$
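The value $n/(n+1)$ can be cross-checked numerically. The following is a minimal sketch, assuming a plain midpoint Riemann sum on a uniform grid is accurate enough for small $n$; the name `mean_of_max` and the grid size are illustrative choices, not part of the notes.

```python
from itertools import product

def mean_of_max(n, m=40):
    """Midpoint Riemann sum of max(x_1, ..., x_n) over [0,1]^n on an m^n grid."""
    pts = [(i + 0.5) / m for i in range(m)]
    total = sum(max(x) for x in product(pts, repeat=n))
    return total / m ** n

if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, mean_of_max(n), n / (n + 1))
```

The cost grows like $m^n$, so this brute-force check is only practical in low dimension; the symmetry argument above is what handles general $n$.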
5e16 Exercise. Compute the integral $\int_{[0,1]^n} \min(x_1,\dots,x_n)\, dx_1 \dots dx_n$.
Answer: $\frac{1}{n+1}$.
5e17 Exercise. Find the volume of the $n$-dimensional simplex
$$\{x : x_1,\dots,x_n \ge 0,\ x_1 + \dots + x_n \le 1\}.$$
Answer: $\frac{1}{n!}$.
5e18 Exercise. Suppose the function $f$ depends only on the first coordinate. Then
$$\int_V f(x_1)\,dx = v_{n-1} \int_{-1}^1 f(x_1)(1 - x_1^2)^{(n-1)/2}\,dx_1,$$
where $V$ is the unit ball in $\mathbb{R}^n$, and $v_{n-1}$ is the volume of the unit ball in $\mathbb{R}^{n-1}$.
The next exercises examine further a very interesting phenomenon of concentration of high-dimensional volume touched before, in 4i5(b); it was seen there that in high dimension the volume of a ball concentrates near the sphere,1 and now we'll see that it also concentrates near a hyperplane!2
5e19 Exercise. Let $V$ be the unit ball in $\mathbb{R}^n$, and $P = \{x \in V : |x_1| < 0.01\}$. What is larger, $v_n(P)$ or $v_n(V \setminus P)$, if $n$ is sufficiently large?
1
See also 5e10(c).
2
Do you see a contradiction in these claims?
5e20 Exercise. Given $\varepsilon > 0$, show that the quotient
$$\frac{v_n(\{x \in V : |x_1| > \varepsilon\})}{v_n(V)}$$
tends to zero as $n \to \infty$.1
Could you find the asymptotic behavior of the quotient above as $n \to \infty$?
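To get a feeling for the concentration near a hyperplane, one may evaluate the quotient numerically. The sketch below assumes the one-dimensional formula from the hint to 5e20 (the quotient equals $\int_\varepsilon^1 (1-t^2)^{(n-1)/2}\,dt \big/ \int_0^1 (1-t^2)^{(n-1)/2}\,dt$) and a midpoint rule; the name `slab_quotient` is illustrative.

```python
def slab_quotient(n, eps, steps=20000):
    """Approximate v_n({x in V : |x_1| > eps}) / v_n(V) for the unit ball V in R^n."""
    power = (n - 1) / 2

    def integral(a, b):
        # Midpoint rule for the integral of (1 - t^2)^((n-1)/2) over [a, b].
        h = (b - a) / steps
        return h * sum((1.0 - (a + (i + 0.5) * h) ** 2) ** power for i in range(steps))

    return integral(eps, 1.0) / integral(0.0, 1.0)

if __name__ == "__main__":
    for n in (2, 10, 100, 1000):
        print(n, slab_quotient(n, 0.1))
```

For fixed $\varepsilon$ the printed values decrease with $n$; this is only numerical evidence, and the exercise still asks for a proof and for the asymptotics.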

Given an integrable $f : \mathbb{R}^n \to \mathbb{R}$ and a box $B \subset \mathbb{R}^n$ (of non-zero volume), we introduce $f_B : \mathbb{R}^n \to \mathbb{R}$ by
$$f_B(x) = \frac{1}{v(B)} \int_{B+x} f;$$
that is, $f_B(x)$ is the mean value of $f$ on the shifted box $B + x = \{b + x : b \in B\}$.
5e21 Exercise. Prove that $f_B$ is a continuous function.
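5e22 Exercise. (a) Let $n = 2$ and $B = [s_1,t_1] \times [s_2,t_2]$. For a continuous $f$ with bounded support, prove that $f_B \in C^1(\mathbb{R}^n)$ and
$$\frac{\partial}{\partial x_1} f_B(x_1,x_2) = \frac{1}{t_2 - s_2} \int_{[s_2,t_2]+x_2} \frac{f_{x_1+t_1} - f_{x_1+s_1}}{t_1 - s_1};$$
here $f_a$ denotes the section $y \mapsto f(a,y)$;
(b) generalize (a) to arbitrary $n$.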
5e23 Exercise. Prove that every continuous $f$ with bounded support is the limit of some uniformly convergent sequence of functions of $C^1(\mathbb{R}^n)$.2
1
Hint: the quotient equals $\int_\varepsilon^1 (1-t^2)^{(n-1)/2}\,dt \Big/ \int_0^1 (1-t^2)^{(n-1)/2}\,dt$.
2
Hint: consider $f_B$ for a small $B$ close to 0.
6 Lebesgue's criterion for Riemann integrability
6a Introduction
6b Integral of oscillation
6c Measure zero
6d Continuity almost everywhere
6a Introduction
Consider a bounded function $f : (0,1) \to \mathbb{R}$. If $f$ is continuous then it is integrable (even if it is not uniformly continuous, like $\sin(1/x)$). A step function is (generally) discontinuous, and still, integrable; its set of discontinuity points is finite. Non-integrable functions mentioned in 4c3 are very discontinuous, having intervals of discontinuity points. The function of 4c5 (or 4f12) has a dense set of discontinuity points, and still, is integrable. Can integrability be decided via the set of discontinuity points? An affirmative answer was given by Lebesgue; it involves the notion of Lebesgue measure zero (rather than volume zero).
"This aesthetically pleasing integrability criterion has little practical value" (Bichteler).1
Well, if you use it when proving simple facts, such as integrability of $f^3$ or $fg$ (for integrable $f$ and $g$), you may find far more elementary proofs. But here is a harder case. The so-called improper integral (to be treated later) may be applied to unbounded functions $f$ on $(0,1)$ such that the function
$$\operatorname{mid}(-M, f, M) : x \mapsto \begin{cases} -M & \text{when } f(x) \le -M, \\ f(x) & \text{when } -M \le f(x) \le M, \\ M & \text{when } M \le f(x) \end{cases}$$
is integrable for all $M > 0$. The sum of two such functions is also such a function. This fact follows easily from Lebesgue's criterion. You may discover another proof, but I doubt it will be simpler!
1
From the book "Integration - a functional approach" by Klaus Bichteler (1998); see Exercise 6.16 on p. 27.
A natural quantitative measure of non-integrability is the difference
$$A = {}^*\!\!\int_{(0,1)} f - {}_*\!\!\int_{(0,1)} f \in [0,\infty).$$
What about a natural quantitative measure of discontinuity of $f$? At a given point $x_0 \in (0,1)$ it is the oscillation,
$$\operatorname{Osc}_f(x_0) = \inf_{r>0} \operatorname{Osc}_f\big((x_0 - r, x_0 + r)\big),$$
where
$$(6a1)\qquad \operatorname{Osc}_f(U) = \operatorname{diam} f(U) = \sup_{x\in U} f(x) - \inf_{x\in U} f(x).$$
But it depends on $x_0$. In order to get a number we integrate the oscillation function:
$$B = {}^*\!\!\int_{(0,1)} \operatorname{Osc}_f.$$
We would be happy to know that $B = 0 \implies A = 0$, even happier to know that $B = 0 \iff A = 0$, but here is a surprise:
$$A = B.$$
Qualitatively,
$$(f \text{ is integrable}) \iff (\operatorname{Osc}_f \text{ is negligible}).$$
And of course, we need a multidimensional theory; $(0,1)$ is only the simplest case.
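It may seem that the equality $A = B$ is an easy matter, since $\operatorname{Osc}_f = f^* - f_*$ where
$$f_*(x_0) = \sup_{r>0} \inf_{|x-x_0|<r} f(x), \qquad f^*(x_0) = \inf_{r>0} \sup_{|x-x_0|<r} f(x),$$
and so, $B = \int \operatorname{Osc}_f = \int f^* - \int f_* = {}^*\!\!\int f - {}_*\!\!\int f = A$. However, $f_*$ and $f^*$ need not be integrable. In fact, ${}_*\!\!\int f_* = {}_*\!\!\int f$, ${}^*\!\!\int f^* = {}^*\!\!\int f$ (which is rather easy to see), and ${}^*\!\!\int (f^* - f_*) = {}^*\!\!\int f^* - {}_*\!\!\int f_*$, that is, ${}^*\!\!\int \big(f^* + (-f_*)\big) = {}^*\!\!\int f^* + {}^*\!\!\int (-f_*)$, which is not easy, and a surprise, since the upper integral is not linear!1 The equality $A = B$ will be proved, but not this way.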
6a2 Exercise. For the function $f$ of 4c5
(a) $\operatorname{Osc}_f(x) = 2^{-m}$ if $x = \frac{2k+1}{2^{2m+1}}$ for $m = 0, 1, \dots$ and $k = 0, \dots, 2^{2m} - 1$; otherwise $\operatorname{Osc}_f(x) = 0$;
(b) $\operatorname{Osc}_f$ is negligible.
Prove it (not using results of Sect. 6).2
1
In fact, ${}^*\!\!\int (f+g) = {}^*\!\!\int f + {}^*\!\!\int g$ when $f$ and $g$ are (bounded, with bounded support, and) upper semicontinuous, that is, $f = f^*$ and $g = g^*$.
2
Hint: (b) recall 4f12.
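6b Integral of oscillation
We consider a bounded function $f : \mathbb{R}^n \to \mathbb{R}$ with bounded support, and its oscillation function
$$(6b1)\qquad \operatorname{Osc}_f(x_0) = \inf_{r>0} \operatorname{Osc}_f\big(\{x : |x - x_0| < r\}\big),$$
where $\operatorname{Osc}_f(U)$ is still defined by (6a1).
6b2 Theorem.
$${}^*\!\!\int_{\mathbb{R}^n} f - {}_*\!\!\int_{\mathbb{R}^n} f = {}^*\!\!\int_{\mathbb{R}^n} \operatorname{Osc}_f.$$
Here is the easy part.
6b3 Proposition.
$${}^*\!\!\int_{\mathbb{R}^n} f - {}_*\!\!\int_{\mathbb{R}^n} f \ge {}^*\!\!\int_{\mathbb{R}^n} \operatorname{Osc}_f.$$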
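Proof. Similarly to 4g9 (or combining 4g9 with 4g7), given $\varepsilon > 0$, there exist continuous $g, h$ with bounded support such that $g \le f \le h$ and
$$\int_{\mathbb{R}^n} h \le \frac{\varepsilon}{2} + {}^*\!\!\int_{\mathbb{R}^n} f, \qquad \int_{\mathbb{R}^n} g \ge -\frac{\varepsilon}{2} + {}_*\!\!\int_{\mathbb{R}^n} f,$$
therefore
$$\int_{\mathbb{R}^n} (h - g) \le \varepsilon + {}^*\!\!\int_{\mathbb{R}^n} f - {}_*\!\!\int_{\mathbb{R}^n} f.$$
For arbitrary $U \subset \mathbb{R}^n$,
$$\operatorname{Osc}_f(U) = \sup_{x\in U} f(x) - \inf_{x\in U} f(x) \le \sup_{x\in U} h(x) - \inf_{x\in U} g(x);$$
by (6b1) and continuity of $g$ and $h$,
$$\operatorname{Osc}_f(x_0) \le h(x_0) - g(x_0)$$
for all $x_0$. Thus,
$${}^*\!\!\int_{\mathbb{R}^n} \operatorname{Osc}_f \le {}^*\!\!\int_{\mathbb{R}^n} (h - g) = \int_{\mathbb{R}^n} (h - g) \le \varepsilon + {}^*\!\!\int_{\mathbb{R}^n} f - {}_*\!\!\int_{\mathbb{R}^n} f$$
for all $\varepsilon > 0$.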
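Now, the hard part.
6b4 Proposition.
$${}^*\!\!\int_{\mathbb{R}^n} f - {}_*\!\!\int_{\mathbb{R}^n} f \le {}^*\!\!\int_{\mathbb{R}^n} \operatorname{Osc}_f.$$
6b5 Lemma (Lebesgue's covering number). Let $K \subset \mathbb{R}^n$ be a compact set, $U_1, \dots, U_m \subset \mathbb{R}^n$ open sets, and $K \subset U_1 \cup \dots \cup U_m$. Then1
$$\exists \delta > 0 \;\; \forall x \in K \;\; \exists i \in \{1,\dots,m\} \;\; \forall y \;\; \big( |y - x| < \delta \implies y \in U_i \big).$$
Proof. Assume the contrary: for every $k$ there exists $x_k \in K$ whose $\frac1k$-neighborhood is not covered by a single $U_i$. By compactness, there exists an accumulation point $x_0 \in K$ of the sequence $(x_k)_k$. We take $i$ such that $x_0 \in U_i$, and then $\varepsilon > 0$ such that $U_i$ contains the $2\varepsilon$-neighborhood of $x_0$. For all $k$ such that $\frac1k < \varepsilon$ we know that the $\varepsilon$-neighborhood of $x_k$ is not contained in $U_i$, and therefore $|x_k - x_0| \ge \varepsilon$; a contradiction.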
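Recall Sect. 4b (Darboux sums).
Proof of Prop. 6b4. We take a natural $M$ such that $\{x : f(x) \ne 0\} \subset (-2^M, 2^M)^n$, and introduce the compact set $K = [-2^M, 2^M]^n$.
Given $\varepsilon > 0$, we take a continuous $h$ with bounded support such that $\operatorname{Osc}_f \le h$ and $\int_{\mathbb{R}^n} h \le \varepsilon + {}^*\!\!\int_{\mathbb{R}^n} \operatorname{Osc}_f$.
For every $x_0 \in K$ there exists $\delta > 0$ such that the neighborhood $U = \{x : |x - x_0| < \delta\}$ satisfies
$$\operatorname{Osc}_f(U) \le \frac{\varepsilon}{2} + \operatorname{Osc}_f(x_0), \qquad \operatorname{Osc}_h(U) \le \frac{\varepsilon}{2},$$
then
$$(6b6)\qquad \operatorname{Osc}_f(U) \le \varepsilon + \inf_{x\in U} h(x)$$
(since $\operatorname{Osc}_f(x_0) \le h(x_0)$ and $h(x_0) - \inf_{x\in U} h(x) \le \operatorname{Osc}_h(U) \le \frac{\varepsilon}{2}$).
By compactness, $K \subset U_1 \cup \dots \cup U_m$ for some $U_i$ satisfying (6b6). Lemma 6b5 gives us Lebesgue's covering number $\delta$ for this covering; we take natural $N$ such that $\frac12 \sqrt{n}\, 2^{-N} < \delta$; then each pixel $2^{-N}(Q + k)$ (where $Q = [0,1]^n$ and $k \in \mathbb{Z}^n$) contained in $K$, being contained in the $\frac12 \sqrt{n}\, 2^{-N}$-neighborhood of its center, is contained in some $U_i$, and therefore, by (6b6),
$$\operatorname{Osc}_f\big(2^{-N}(Q + k)\big) \le \varepsilon + \inf_{x \in 2^{-N}(Q+k)} h(x),$$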
1
Note the quantifier complexity: $\exists\forall\exists\forall$ (and globally, $\forall\exists\forall\exists\forall$). Wow!
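that is,
$$U_{N,k}(f) - L_{N,k}(f) \le \varepsilon\, 2^{-nN} + L_{N,k}(h).$$
The sum over $k \in \mathbb{Z}^n \cap [-2^{M+N}, 2^{M+N} - 1]^n$ (these are $2^{n(M+N+1)}$ pixels) gives
$$U_N(f) - L_N(f) \le \varepsilon\, 2^{n(M+1)} + L_N(h),$$
whence, taking $N \to \infty$,
$${}^*\!\!\int_{\mathbb{R}^n} f - {}_*\!\!\int_{\mathbb{R}^n} f \le 2^{n(M+1)} \varepsilon + \int_{\mathbb{R}^n} h \le (2^{n(M+1)} + 1)\,\varepsilon + {}^*\!\!\int_{\mathbb{R}^n} \operatorname{Osc}_f$$
for all $\varepsilon > 0$.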

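6b7 Corollary. A bounded function $f : \mathbb{R}^n \to \mathbb{R}$ with bounded support is integrable if and only if $\operatorname{Osc}_f$ is negligible.
6b8 Exercise. For a set $E \subset \mathbb{R}^n$,
(a) $\operatorname{Osc}_{\mathbb{1}_E} = \mathbb{1}_{\partial E}$;
(b) $E$ is admissible if and only if $\partial E$ has volume 0;
(c) $v^*(E) - v_*(E) = v^*(\partial E)$;
(d) if $E$ is admissible, then $E^\circ$ and $\overline{E}$ are admissible, and $v(E^\circ) = v(E) = v(\overline{E})$.
Prove it.
6b9 Exercise. For sets $E, F \subset \mathbb{R}^n$,
(a) prove that $\partial(E \cup F) \subset \partial E \cup \partial F$, $\partial(E \cap F) \subset \partial E \cup \partial F$, $\partial(E \setminus F) \subset \partial E \cup \partial F$,
(b) give another proof of 4f14.
6b10 Exercise. For $f, g : \mathbb{R}^n \to [-M, M]$,
(a) prove that $\operatorname{Osc}_{fg} \le M(\operatorname{Osc}_f + \operatorname{Osc}_g)$;
(b) give another proof of 4f9.
6b11 Exercise. Give another proof of 4f10 and 4f16, via oscillation.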
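If $E$ is admissible, then integrability of $f$ on $E$ is well-defined (recall 4d5); it is integrability on $\mathbb{R}^n$ of the function
$$f \cdot \mathbb{1}_E : x \mapsto \begin{cases} f(x) & \text{for } x \in E, \\ 0 & \text{otherwise.} \end{cases}$$
By 6b7, this integrability is equivalent to negligibility of $\operatorname{Osc}_{f \mathbb{1}_E}$. Note that
$$\operatorname{Osc}_{f \mathbb{1}_E} = \begin{cases} \operatorname{Osc}_f & \text{on } E^\circ, \\ \text{something bounded} & \text{on } \partial E, \\ 0 & \text{outside } \overline{E}. \end{cases}$$
Taking into account that $\partial E$ is of volume zero by 6b8(b) we see that $\operatorname{Osc}_{f \mathbb{1}_E}$ is equivalent to $\operatorname{Osc}_f \cdot \mathbb{1}_{E^\circ}$. Thus,
$$(6b12)\qquad (f \text{ is integrable on } E) \iff (\operatorname{Osc}_f \text{ is negligible on } E^\circ).$$
If the set $\{x : \operatorname{Osc}_f(x) \ne 0\}$ is of volume zero, then $\operatorname{Osc}_f$ is negligible by (4d8), thus $f$ is integrable. However, an integrable function can be discontinuous on a dense set; for example, see 4c5 (or 4f12).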
6b13 Remark. It is tempting to invent an appropriate notion "negligible set" such that1
(a) $f$ is negligible if and only if $\{x : f(x) \ne 0\}$ is negligible,
and therefore
(b) $f$ is integrable if and only if $\{x : \operatorname{Osc}_f(x) \ne 0\}$ is negligible.
Is this possible? Yes and no. . .
Bad news: it can happen that $\{x : f(x) \ne 0\} = \{x : g(x) \ne 0\}$, $f$ is negligible, but $g$ is not.
Good news: it cannot happen that $\{x : \operatorname{Osc}_f(x) \ne 0\} = \{x : \operatorname{Osc}_g(x) \ne 0\}$, $f$ is integrable, but $g$ is not.
That is, (b) succeeds, but not due to (a). Rather, (b) succeeds in spite of the fact that (a) fails.2
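6c Measure zero
6c1 Definition. A set $Z \subset \mathbb{R}^n$ has measure 0 if for every $\varepsilon > 0$ there exist boxes $B_1, B_2, \dots \subset \mathbb{R}^n$ such that $Z \subset \bigcup_{k=1}^\infty B_k$ and $\sum_{k=1}^\infty v(B_k) \le \varepsilon$.
6c2 Proposition. A countable union of sets of measure 0 has measure 0.
Proof. Let $Z = Z_1 \cup Z_2 \cup \dots$ and each $Z_k$ has measure 0. Given $\varepsilon > 0$, we take $\varepsilon_1, \varepsilon_2, \dots > 0$ such that $\varepsilon_1 + \varepsilon_2 + \dots \le \varepsilon$ (for instance, $\varepsilon_k = 2^{-k}\varepsilon$), and for each $k$ we take boxes $B_{k,\ell}$ such that $Z_k \subset \bigcup_{\ell=1}^\infty B_{k,\ell}$ and $\sum_{\ell=1}^\infty v(B_{k,\ell}) \le \varepsilon_k$. We get $Z \subset \bigcup_{k,\ell} B_{k,\ell}$, and $\sum_{k,\ell} v(B_{k,\ell}) \le \varepsilon$. (And all pairs $(k,\ell)$ are a countable set, of course.)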
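The $\varepsilon_k = 2^{-k}\varepsilon$ trick can be made concrete in dimension 1 for a countable dense set. The sketch below is an illustration under stated assumptions: it enumerates the first few rationals of $[0,1]$ (in a hypothetical order chosen here, by increasing denominator) and covers the $k$-th one by an interval of length $2^{-k}\varepsilon$, using exact rational arithmetic.

```python
from fractions import Fraction

def rationals_in_unit_interval(count):
    """First `count` distinct rationals p/q in [0,1], enumerated by increasing denominator q."""
    seen, out, q = set(), [], 1
    while len(out) < count:
        for p in range(q + 1):
            x = Fraction(p, q)
            if x not in seen:
                seen.add(x)
                out.append(x)
                if len(out) == count:
                    break
        q += 1
    return out

def cover(points, eps):
    """Cover the k-th point (k = 1, 2, ...) by an interval of length eps * 2**-k centered at it."""
    boxes = []
    for k, x in enumerate(points, start=1):
        half = Fraction(eps) / 2 ** (k + 1)
        boxes.append((x - half, x + half))
    return boxes

if __name__ == "__main__":
    pts = rationals_in_unit_interval(50)
    boxes = cover(pts, Fraction(1, 100))
    print(sum(b - a for a, b in boxes) < Fraction(1, 100))
```

However many points are enumerated, the total length stays below $\varepsilon$, which is the whole content of the proof for one-point sets $Z_k$.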
Every set of volume 0 has measure 0 (think, why). Thus, a countable union of sets of volume 0 has measure 0 (even if dense in the whole $\mathbb{R}^n$). In particular, every countable set has measure 0. Also, many sets of cardinality continuum have measure 0 (see 4i3).3
1
Assuming that $f$ is bounded, with bounded support, of course.
2
Puzzled? Here is an explanation: $\operatorname{Osc}_f$ is not just a function; it is an upper semicontinuous function. For upper semicontinuous $f, g$ it cannot happen that $\{x : f(x) \ne 0\} = \{x : g(x) \ne 0\}$, $f$ is negligible, but $g$ is not.
3
In dimension 1 the Cantor set is such an example.
6c3 Proposition. A compact set has measure 0 if and only if it has volume 0.
Proof. "If": trivial. "Only if": let $K \subset \mathbb{R}^n$ be compact, of measure 0. Given $\varepsilon > 0$, we take boxes $B_k$ as in 6c1, and boxes $A_k$ such that $B_k \subset A_k^\circ$ and $v(A_k) \le 2v(B_k)$ (think, how). By compactness, $K \subset A_1^\circ \cup \dots \cup A_m^\circ$ for some $m$. Thus, $v^*(K) \le v(A_1) + \dots + v(A_m) \le 2v(B_1) + \dots + 2v(B_m) \le 2\varepsilon$.
6c4 Exercise. (a) If $Z$ has measure 0, then $Z^\circ = \emptyset$, and $v_*(Z) = 0$.
Prove it.1
However, $v^*(Z)$ need not be 0, of course.
6d Continuity almost everywhere
6d1 Definition. A function $f : \mathbb{R}^n \to \mathbb{R}$ is continuous almost everywhere, if its points of discontinuity are a set of measure 0.
For an example, recall 6a2.
More generally, a property of a point of $\mathbb{R}^n$ is said to hold almost everywhere if it holds except on a set of measure zero.
6d2 Theorem (Lebesgue's criterion). A bounded function $f : \mathbb{R}^n \to \mathbb{R}$ with bounded support is integrable if and only if it is continuous almost everywhere.
6d3 Lemma. Let $f : \mathbb{R}^n \to \mathbb{R}$ be a bounded function with bounded support. If $f$ is negligible then $f(\cdot) = 0$ almost everywhere.2
Proof. We consider sets $A = \{x : f(x) \ne 0\}$ and $A_i = \{x : |f(x)| \ge \frac1i\}$; $A = \bigcup_i A_i$. For each $i$ we have $\mathbb{1}_{A_i} \le i|f|$, thus $v^*(A_i) \le i\, {}^*\!\!\int |f| = 0$, which implies that $A_i$ has measure 0 and, by 6c2, $A$ has measure 0.
6d4 Lemma. The set $\{x : \operatorname{Osc}_f(x) \ge \varepsilon\}$ is compact, for every $\varepsilon > 0$.
Proof. Boundedness is evident. We'll prove that its complement, $\{x : \operatorname{Osc}_f(x) < \varepsilon\}$, is open. Given $\operatorname{Osc}_f(x_0) < \varepsilon$, we have $\operatorname{Osc}_f(U) < \varepsilon$ for some neighborhood $U$ of $x_0$. Thus, $\operatorname{Osc}_f(x) \le \operatorname{Osc}_f(U) < \varepsilon$ for all $x \in U$.
1
Hint: for $Z^\circ = \emptyset$ use 6c3; for $v_*(Z) = 0$ consider Darboux sums, or use 6b8(d).
2
The converse fails; try the indicator of a dense countable set.
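Proof of Theorem 6d2. By 6b7 it is sufficient to prove that the function $\varphi = \operatorname{Osc}_f$ is negligible if and only if $f$ is continuous almost everywhere, that is, $\varphi = 0$ almost everywhere.
"Only if": just by 6d3 applied to $\varphi$.
"If": for every $\varepsilon > 0$ the set $\{x : \varphi(x) \ge \varepsilon\}$ has measure 0; by 6d4 and 6c3, this set has volume 0. By (4d8), $\varphi$ is equivalent (recall 4e) to the function $\min(\varphi, \varepsilon) : x \mapsto \min(\varphi(x), \varepsilon)$. We take $M \in (0,\infty)$ such that $\{x : f(x) \ne 0\} \subset [-M, M]^n$, then also $\{x : \varphi(x) > 0\} \subset [-M, M]^n$, and we get
$${}^*\!\!\int_{\mathbb{R}^n} \varphi = {}^*\!\!\int_{\mathbb{R}^n} \min(\varphi, \varepsilon) \le (2M)^n \varepsilon$$
for all $\varepsilon > 0$.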
Index
almost everywhere
Lebesgue's covering number
Lebesgue's criterion
measure 0
oscillation
oscillation function
$\operatorname{Osc}_f(U)$
$\operatorname{Osc}_f(x)$
7 Linear change of variables
7a Admissible sets in vector spaces
7b Volume in Euclidean spaces
7c Linear change of variables
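7a Admissible sets in vector spaces
7a1 Proposition. Let $A : \mathbb{R}^n \to \mathbb{R}^n$ be an invertible linear operator. Then, for every $E \subset \mathbb{R}^n$,
$$A(E) \text{ is admissible} \iff E \text{ is admissible}.$$
7a2 Lemma. Let $A : \mathbb{R}^n \to \mathbb{R}^n$ be a linear operator. Then, for every bounded set $Z \subset \mathbb{R}^n$ of volume 0, the set $A(Z)$ has volume 0.
Proof. The image $A(Q)$ of the cube $Q = [0,1]^n$ is bounded (think, why). We take a box $B$ such that $A(Q) \subset B$ and get $v^*(A(Q)) \le v(B) < \infty$. Moreover, using 4h2 and (4h6) we get1 for all $N$ and $k \in \mathbb{Z}^n$
$$v^*\big(A(2^{-N}(Q + k))\big) \le M\, 2^{-nN}$$
where $M = v(B)$.
Using subadditivity of the outer volume,2
$$(7a3)\qquad v^*(E \cup F) \le v^*(E) + v^*(F),$$
we get for arbitrary bounded $E$,
$$v^*(A(E)) \le \sum_{k :\, 2^{-N}(Q+k) \cap E \ne \emptyset} v^*\big(A(2^{-N}(Q + k))\big) \le M\, 2^{-nN} \sum_{k :\, 2^{-N}(Q+k) \cap E \ne \emptyset} 1 = M\, U_N(\mathbb{1}_E);$$
for $N \to \infty$ it gives $v^*(A(E)) \le M v^*(E)$. Thus, $v^*(Z) = 0$ implies $v^*(A(Z)) = 0$.
7a4 Remark. Let $A$ be invertible. Then $Z$ has volume 0 if and only if $A(Z)$ has volume 0.
1
Since $A(2^{-N}(Q + k)) \subset 2^{-N}(B + A(k))$.
2
Indeed, ${}^*\!\!\int \mathbb{1}_{E \cup F} \le {}^*\!\!\int (\mathbb{1}_E + \mathbb{1}_F) \le {}^*\!\!\int \mathbb{1}_E + {}^*\!\!\int \mathbb{1}_F$ by 4c7.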
Proof of Prop. 7a1. We'll prove that $A(E)$ is admissible whenever $E$ is admissible (then, applying it to $A^{-1}$, we get the converse implication). By 6b8(b), $\partial E$ has volume 0; by 7a2, $A(\partial E)$ has volume 0; also, $A(\partial E) = \partial(A(E))$, since $A$ is a homeomorphism; thus, $\partial(A(E))$ has volume 0; by 6b8(b) (again), $A(E)$ is admissible.
Similarly to Sect. 1f we conclude.
The notion "admissible set" is insensitive to a change of basis.
This notion is well-defined in every $n$-dimensional vector space, and preserved by isomorphisms of these spaces.
The same holds for the notion "volume 0".
7a5 Exercise. (a) Every bounded subset of a vector subspace $V_1 \subsetneq V$ has volume 0;
(b) every vector subspace $V_1 \subsetneq V$ has measure 0.
Prove it.1
7b Volume in Euclidean spaces
7b1 Proposition. If a linear operator $A : \mathbb{R}^n \to \mathbb{R}^n$ preserves the Euclidean metric (that is, $|Ax| = |x|$ for all $x \in \mathbb{R}^n$), then it preserves volume (that is, $v(A(E)) = v(E)$ for all admissible $E \subset \mathbb{R}^n$).
Rotation invariance of volume, at last!
7b2 Lemma. Let $A : \mathbb{R}^n \to \mathbb{R}^n$ be an invertible linear operator. Then there exists $C \in (0,\infty)$ such that, for every admissible $E \subset \mathbb{R}^n$,
$$v(A(E)) = C\, v(E).$$
Proof. We take $C = v(A(Q))$ where $Q = [0,1]^n$ (admissibility of $A(Q)$ being ensured by 7a1, and $C \ne 0$ by 7a4). Similarly to the proof of 7a2, using 4h2 and (4h6) we get for all $N$ and $k \in \mathbb{Z}^n$
$$v\big(A(2^{-N}(Q + k))\big) = C\, 2^{-nN}.$$
For $k \ne \ell$ the set $A(2^{-N}(Q + k)) \cap A(2^{-N}(Q + \ell)) = A\big((2^{-N}(Q + k)) \cap (2^{-N}(Q + \ell))\big)$ has volume 0 by 7a2. Using additivity of volume 4d3, we get
$$v(A(E)) \le \sum_{k :\, 2^{-N}(Q+k) \cap E \ne \emptyset} v\big(A(2^{-N}(Q + k))\big) = C\, U_N(\mathbb{1}_E),$$
1
Hint: change of basis.
and similarly,
$$v(A(E)) \ge \sum_{k :\, 2^{-N}(Q+k) \subset E} v\big(A(2^{-N}(Q + k))\big) = C\, L_N(\mathbb{1}_E);$$
for $N \to \infty$ it gives $v(A(E)) = C\, v(E)$.
If $A$ is of the form $A(x_1,\dots,x_n) = (a_1x_1,\dots,a_nx_n)$ (that is, a diagonal matrix), then $C = |a_1 \dots a_n|$ by 4h4.
Proof of Prop. 7b1. The constant $C$ given by 7b2 is equal to 1, since the ball $E = \{x : |x| \le 1\}$ (admissible by 4i4, and of non-zero volume since $E^\circ \ne \emptyset$) satisfies $A(E) = E$.
Volume is insensitive to a change of orthonormal basis.
It is well-defined in every $n$-dimensional Euclidean space, and preserved by isomorphisms of these spaces.
Now we are in position to find the constant $C$ for arbitrary $A$.
7b3 Theorem. Let $A : \mathbb{R}^n \to \mathbb{R}^n$ be an invertible linear operator. Then, for every admissible $E \subset \mathbb{R}^n$,
$$v(A(E)) = |\det A|\, v(E).$$
Recall the singular value decomposition (Sect. 3d).1
Proof. By 3d2, some change of two orthonormal bases in $\mathbb{R}^n$ turns $A$ into a diagonal matrix whose diagonal elements are the singular values $s_1,\dots,s_n$ of $A$. The constant $C$, insensitive to this change of bases, is equal to $s_1 \dots s_n$ by 4h4. It remains to prove an algebraic fact: $|\det A| = s_1 \dots s_n$.
A change of orthonormal bases multiplies a matrix from the left and from the right by orthogonal matrices; an orthogonal matrix is a matrix $U$ such that $|Ux| = |x|$ for all $x$. It follows that $\langle x, y\rangle = \langle Ux, Uy\rangle = \langle U^*Ux, y\rangle$, thus $\mathrm{id} = U^*U$; $1 = \det(U^*U) = \det(U^*)\det U = (\det U)^2$; $\det U = \pm 1$. Therefore such multiplications do not change $|\det A|$, and $|\det A| = s_1 \dots s_n$.
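For a concrete $2 \times 2$ matrix both claims can be checked numerically: that $v(A(Q)) = |\det A|$ for the unit square $Q$, and that the product of singular values equals $|\det A|$. The sketch below is only an illustration; it assumes the matrix is given as nested tuples `((a, b), (c, d))`, counts grid cells whose centers fall in $A(Q)$ (in the spirit of the pixel sums above), and gets the singular values from the eigenvalues of $A^*A$ by the quadratic formula. The function names are hypothetical.

```python
import math

def image_area(A, m=400):
    """Approximate v(A(Q)), Q = [0,1]^2, by counting grid cells whose center lies in A(Q)."""
    (a, b), (c, d) = A
    det = a * d - b * c
    # Bounding box of the parallelogram A(Q), from the images of the four corners.
    xs = [0.0, a, b, a + b]
    ys = [0.0, c, d, c + d]
    x0, x1, y0, y1 = min(xs), max(xs), min(ys), max(ys)
    hx, hy = (x1 - x0) / m, (y1 - y0) / m
    count = 0
    for i in range(m):
        for j in range(m):
            x, y = x0 + (i + 0.5) * hx, y0 + (j + 0.5) * hy
            # Pull the point back by A^{-1} and test membership in Q.
            u = (d * x - b * y) / det
            v = (-c * x + a * y) / det
            if 0.0 <= u <= 1.0 and 0.0 <= v <= 1.0:
                count += 1
    return count * hx * hy

def singular_values(A):
    """Singular values of a 2x2 matrix: square roots of the eigenvalues of A^T A."""
    (a, b), (c, d) = A
    t = a * a + b * b + c * c + d * d          # trace of A^T A
    det = a * d - b * c                        # det(A^T A) = det^2
    disc = math.sqrt(max(t * t - 4.0 * det * det, 0.0))
    return math.sqrt((t + disc) / 2.0), math.sqrt(max((t - disc) / 2.0, 0.0))

if __name__ == "__main__":
    A = ((2.0, 1.0), (0.5, 3.0))               # det A = 5.5
    s1, s2 = singular_values(A)
    print(image_area(A), s1 * s2)
```

The grid count converges slowly (its error is proportional to the cell size), while the identity $s_1 s_2 = |\det A|$ holds up to rounding.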
If $|\cdot|_1$, $|\cdot|_2$ are two Euclidean norms on an $n$-dimensional vector space, then the ratio of norms $\frac{|\cdot|_1}{|\cdot|_2}$ varies between $\min(s_1,\dots,s_n)$ and $\max(s_1,\dots,s_n)$ (here $s_1,\dots,s_n$ are the singular values), depending on the direction of a vector; but the ratio of volumes $\frac{v_1(\cdot)}{v_2(\cdot)}$ is $s_1 \dots s_n$, invariably.
1
Some linear algebra is needed here. Many authors decompose an arbitrary matrix into the product of elementary matrices (of three types). But I prefer the singular value decomposition.
On an $n$-dimensional vector space the volume is ill-defined, but admissibility is well-defined, and the ratio $\frac{v(E_1)}{v(E_2)}$ of volumes is well-defined.
That is, the volume is well-defined up to a coefficient.
7b4 Exercise. Find the volume cut off from the unit ball by the plane $ax + by + cz = t$.
7b5 Exercise. Let vectors $h_1,\dots,h_n \in \mathbb{R}^n$ be linearly independent, and $C = |\det(h_1,\dots,h_n)|$.
(a) The parallelotope $E = \{u_1h_1 + \dots + u_nh_n : 0 \le u_1,\dots,u_n \le 1\}$ is admissible, and $v(E) = C$.
(b) The simplex $E = \{u_1h_1 + \dots + u_nh_n : u_1,\dots,u_n \ge 0,\ u_1 + \dots + u_n \le 1\}$ is admissible, and $v(E) = \frac{1}{n!} C$.
(c) The ellipsoid $E = \{u_1h_1 + \dots + u_nh_n : u_1^2 + \dots + u_n^2 \le 1\}$ is admissible, and $\frac{1}{C} v(E)$ is equal to the volume of the $n$-dimensional unit ball $\{x \in \mathbb{R}^n : |x| \le 1\}$.
Prove it.1
7c Linear change of variables
7c1 Theorem. Let $A : \mathbb{R}^n \to \mathbb{R}^n$ be an invertible linear operator. Then, for every bounded function $f : \mathbb{R}^n \to \mathbb{R}$ with bounded support,
$$|\det A|\; {}^*\!\!\int f \circ A = {}^*\!\!\int f \quad\text{and}\quad |\det A|\; {}_*\!\!\int f \circ A = {}_*\!\!\int f.$$
Thus, $f \circ A$ is integrable if and only if $f$ is integrable, and in this case
$$|\det A| \int f \circ A = \int f.$$
Proof. First, consider the indicator $f = \mathbb{1}_E$ of an admissible set $E \subset \mathbb{R}^n$. We have $f \circ A = \mathbb{1}_{A^{-1}(E)}$ (think, why); this function is integrable by 7a1, and
$$\int f \circ A = v(A^{-1}(E)) = |\det A^{-1}|\, v(E) = \frac{1}{|\det A|} \int f$$
by 7b3.
In particular, it holds for indicators of boxes. Taking linear combinations we see that the equality $|\det A| \int f \circ A = \int f$ holds for all step functions $f$.
1
Hint: (b) use 5e17.
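Now, the general case. Given $\varepsilon > 0$, 4g6 gives a step function $h \ge f$ such that $\int h \le \varepsilon + {}^*\!\!\int f$. We have
$$|\det A|\; {}^*\!\!\int f \circ A \le |\det A| \int h \circ A = \int h \le \varepsilon + {}^*\!\!\int f$$
for all $\varepsilon > 0$, thus,
$$|\det A|\; {}^*\!\!\int f \circ A \le {}^*\!\!\int f.$$
Applying it to $A^{-1}$ and $f \circ A$ we get $|\det A^{-1}|\; {}^*\!\!\int f \circ A \circ A^{-1} \le {}^*\!\!\int f \circ A$, that is, ${}^*\!\!\int f \le |\det A|\; {}^*\!\!\int f \circ A$. Thus, $|\det A|\; {}^*\!\!\int f \circ A = {}^*\!\!\int f$. Similarly (or using $(-f)$), $|\det A|\; {}_*\!\!\int f \circ A = {}_*\!\!\int f$.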
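As a numerical illustration of $|\det A| \int f \circ A = \int f$, here is a minimal sketch under stated assumptions: $n = 2$, the test function $f(x,y) = \max(0, 1 - x^2 - y^2)$ (continuous, supported on the unit disk, with $\int f = \pi/2$), the matrix $A$ with rows $(2, 1)$ and $(0, 1)$, and midpoint sums over the square $[-1,1]^2$, which contains the supports of both $f$ and $f \circ A$ for this particular $A$. All names are illustrative.

```python
def integral2d(g, box, m=400):
    """Midpoint Riemann sum of g over the rectangle box = ((x0, x1), (y0, y1))."""
    (x0, x1), (y0, y1) = box
    hx, hy = (x1 - x0) / m, (y1 - y0) / m
    total = 0.0
    for i in range(m):
        x = x0 + (i + 0.5) * hx
        for j in range(m):
            total += g(x, y0 + (j + 0.5) * hy)
    return total * hx * hy

def f(x, y):
    # Continuous function with bounded support (the unit disk).
    return max(0.0, 1.0 - x * x - y * y)

A = ((2.0, 1.0), (0.0, 1.0))
ABS_DET_A = abs(A[0][0] * A[1][1] - A[0][1] * A[1][0])

def f_after_A(x, y):
    # The composition f o A.
    return f(A[0][0] * x + A[0][1] * y, A[1][0] * x + A[1][1] * y)

def check_linear_change(m=400):
    """Return (|det A| * integral of f o A, integral of f) as midpoint sums."""
    box = ((-1.0, 1.0), (-1.0, 1.0))
    return ABS_DET_A * integral2d(f_after_A, box, m), integral2d(f, box, m)

if __name__ == "__main__":
    print(check_linear_change())
```

Both numbers should be close to $\pi/2 \approx 1.5708$; the agreement is approximate because the sums are finite.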
In the exercises below you may start with changing basis, or with opening brackets. When really needed, use iterated integral (and scaling). Sometimes 5e10(c) may help. Think, which way is shorter. A hint: in order to prove that an integral is equal to 0 it is sufficient to find a change of basis that flips the sign of the integral.
7c2 Exercise. If $a_1a_2 + b_1b_2 + c_1c_2 = 0$, then
$$\iiint_{x^2+y^2+z^2<1} (a_1x + b_1y + c_1z)(a_2x + b_2y + c_2z)\, dx\,dy\,dz = 0.$$
Prove it.
7c3 Exercise. Find the mean value of the function $(x,y,z) \mapsto (ax + by + cz)^2$ on the ball $\{(x,y,z) : x^2 + y^2 + z^2 < 1\}$.1
7c4 Exercise. Find the mean value of the function $(x,y,z) \mapsto (a_1x + b_1y + c_1z)(a_2x + b_2y + c_2z)$ on the ball $\{(x,y,z) : x^2 + y^2 + z^2 < 1\}$.2
7c5 Exercise. Let $h_1, h_2, h_3 \in \mathbb{R}^3$ and $t_1, t_2, t_3 \in \mathbb{R}$. Find the mean value of the function $x \mapsto (\langle h_1, x\rangle + t_1)(\langle h_2, x\rangle + t_2)(\langle h_3, x\rangle + t_3)$ on the ball $\{(x,y,z) : x^2 + y^2 + z^2 < 1\}$.3
1
Answer: $\frac15 (a^2 + b^2 + c^2)$.
2
Answer: $\frac15 (a_1a_2 + b_1b_2 + c_1c_2)$.
3
Answer: $\frac15 \big(\langle h_1, h_2\rangle t_3 + \langle h_1, h_3\rangle t_2 + \langle h_2, h_3\rangle t_1\big) + t_1t_2t_3$.
8 Nonlinear change of variables
8a Introduction
8b Examples
8c Measure 0 is preserved
8d Approximation from within
8e All we need is small volume
8f Small volume in the linear approximation
Change of variables is the most powerful tool for calculating multidimensional integrals (in particular, volumes). Integration, differentiation (diffeomorphism, its derivative) and linear algebra (the determinant) are all relevant.
8a Introduction
The area of a disk $\{(x,y) : x^2 + y^2 < 1\} \subset \mathbb{R}^2$ may be calculated by iterated integral,
$$\int_{-1}^{1} dx \int_{-\sqrt{1-x^2}}^{\sqrt{1-x^2}} dy = \int_{-1}^{1} 2\sqrt{1-x^2}\, dx = \dots$$
or alternatively, in polar coordinates,
$$\int_0^1 r\,dr \int_0^{2\pi} d\theta = \int_0^1 2\pi r\, dr = \pi;$$
the latter way is much easier! Note "$r\,dr$" rather than "$dr$" (otherwise we would get $2\pi$ instead of $\pi$).
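The two computations, and the effect of forgetting the factor $r$, can be compared numerically. This is a small sketch using midpoint sums; the function names and the flag `with_jacobian` are illustrative.

```python
import math

def disk_area_cartesian(m=2000):
    """Midpoint sum for the integral of 2*sqrt(1 - x^2) over [-1, 1]."""
    h = 2.0 / m
    return h * sum(2.0 * math.sqrt(1.0 - (-1.0 + (i + 0.5) * h) ** 2) for i in range(m))

def disk_area_polar(m=2000, with_jacobian=True):
    """Midpoint sum over r in (0, 1); the inner integral over theta in (0, 2*pi) equals 2*pi."""
    h = 1.0 / m
    total = 0.0
    for i in range(m):
        r = (i + 0.5) * h
        total += 2.0 * math.pi * (r if with_jacobian else 1.0)
    return h * total

if __name__ == "__main__":
    print(disk_area_cartesian())                # close to pi
    print(disk_area_polar())                    # pi, up to rounding
    print(disk_area_polar(with_jacobian=False)) # 2*pi: the wrong answer without the factor r
```

The polar sum is exact up to rounding because the integrand is linear in $r$, whereas the Cartesian sum converges more slowly near $x = \pm 1$.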
Why the factor $r$? In analogy to the one-dimensional theory we may expect something like $\frac{dx\,dy}{dr\,d\theta}$; is it $r$? Well, basically, it is $r$ because an infinitesimal rectangle $[r, r + dr] \times [\theta, \theta + d\theta]$ of area $dr \cdot d\theta$ on the $(r,\theta)$-plane corresponds to an infinitesimal rectangle of area $dr \cdot r\,d\theta$ on the $(x,y)$-plane.
Here we use the mapping $\varphi : (r,\theta) \mapsto (r\cos\theta, r\sin\theta)$, and $r$ is $\det(D\varphi)_{(r,\theta)}$ (see Exer. 8b2). Some authors1 denote $\det(D\varphi)$ by $J_\varphi$ and call it the Jacobian of $\varphi$. Some2 denote $\det(D\varphi)$ by $\Delta_\varphi$ and call it the Jacobian determinant (of the Jacobian matrix $J_\varphi$). Others3 leave $\det(D\varphi)$ as is. Here is a general result, to be proved in Sect. 8f.
8a1 Theorem. Let $U, V \subset \mathbb{R}^n$ be admissible open sets, $\varphi : U \to V$ a diffeomorphism, and $f : V \to \mathbb{R}$ a bounded function such that the function $(f \circ \varphi)|\det D\varphi| : U \to \mathbb{R}$ is also bounded. Then4
(a) ($f$ is integrable on $V$) $\iff$ ($f \circ \varphi$ is integrable on $U$) $\iff$ ($(f \circ \varphi)|\det D\varphi|$ is integrable on $U$);
(b) if they are integrable, then
$$\int_V f = \int_U (f \circ \varphi)\, |\det D\varphi|.$$
8a2 Remark. Applying Th. 8a1 to a linear $\varphi : \mathbb{R}^n \to \mathbb{R}^n$ we get Th. 7c1 (for integrable functions). On the other hand, Th. 7c1 is instrumental in the proof of Th. 8a1.
8a3 Remark. Applying Th. 8a1 to indicator functions $f$, we get:
(a) if $\det D\varphi$ is bounded, then $v(V) = \int_U |\det D\varphi|$;
(b) if $\det D\varphi$ is bounded on an admissible set $E \subset U$, then $\varphi(E)$ is admissible, and $v(\varphi(E)) = \int_E |\det D\varphi|$.
8a4 Remark. (a) If $\det D\varphi$ is bounded, then boundedness of $f$ implies boundedness of $(f \circ \varphi)|\det D\varphi|$;
(b) if $\det D\varphi$ is bounded away from 0, then boundedness of $(f \circ \varphi)|\det D\varphi|$ implies boundedness of $f$;
(c) $f$ has a compact support within $V$5 if and only if $(f \circ \varphi)|\det D\varphi|$ has a compact support within $U$, and in this case boundedness of $f$ is equivalent to boundedness of $(f \circ \varphi)|\det D\varphi|$ (since $\det D\varphi$ is bounded, and bounded away from 0, on the support).
Unbounded functions will be treated (in Sect. 9) by improper integral.
The proof of Theorem 8a1, rather complicated, occupies Sections 8c–8f. Some authors6 decompose an arbitrary diffeomorphism (locally) into the
1
Burkill.
2
Lang.
3
Hubbard, Shifrin, Shurman, Zorich.
4
Recall Def. 4d5.
5
It means existence of a compact $K \subset V$ such that $f(\cdot) = 0$ on $V \setminus K$.
6
Shurman, Zorich.
composition of diffeomorphisms that preserve a part of the coordinates, and use the iterated integral. Some1 introduce the derivative of a set function and prove that $|\det D\varphi|$ is the derivative of $E \mapsto v(\varphi(E))$. Others2 reduce the general case to indicators of small cubes and use the linear approximation. We do it this way, too.
8b Examples
In this section we take for granted Theorem 8a1 (to be proved in Sect. 8f).
8b1 Exercise. Show that 5d4 and 5d5 are special cases of 8a1.
8b2 Exercise (polar coordinates in $\mathbb{R}^2$). (a) Prove that
$$\int_{x^2+y^2<R^2} f(x,y)\, dx\,dy = \int_{0<r<R,\ 0<\theta<2\pi} f(r\cos\theta, r\sin\theta)\, r\, dr\,d\theta$$
for every integrable function $f$ on the disk $x^2 + y^2 < R^2$;3
(b) it can happen that the function $(r,\theta) \mapsto r f(r\cos\theta, r\sin\theta)$ is integrable on $(0,R) \times (0,2\pi)$, but $f$ is not integrable on the disk; find a counterexample;
(c) however, (b) cannot happen if $f$ is bounded on the disk; prove it.4
In particular, we have now the "curvilinear Cavalieri principle for concentric circles" promised in 5e9.
8b3 Exercise (spherical coordinates in $\mathbb{R}^3$). Consider the mapping $\Psi : \mathbb{R}^3 \to \mathbb{R}^3$, $\Psi(r,\varphi,\theta) = (r\cos\varphi\sin\theta, r\sin\varphi\sin\theta, r\cos\theta)$.
(a) Draw the images of the planes $r = \mathrm{const}$, $\varphi = \mathrm{const}$, $\theta = \mathrm{const}$, and of the lines $(\varphi,\theta) = \mathrm{const}$, $(r,\theta) = \mathrm{const}$, $(r,\varphi) = \mathrm{const}$.
(b) Show that $\Psi$ is surjective but not injective.
(c) Show that $\det D\Psi = -r^2\sin\theta$. Find the points $(r,\varphi,\theta)$, where the operator $D\Psi$ is invertible.
(d) Let $V = (0,\infty) \times (-\pi,\pi) \times (0,\pi)$. Prove that $\Psi|_V$ is injective. Find $U = \Psi(V)$.
1
Burkill.
2
Hubbard, Lang, Shifrin.
3
Do you use a diffeomorphism between $(0,R) \times (0,2\pi)$ and the disk? (Look closely!)
4
Do not forget: Theorem 8a1 is taken for granted.
8b4 Exercise. Compute the integral $\iiint_{x^2+y^2+(z-2)^2 \le 1} \frac{dx\,dy\,dz}{x^2+y^2+z^2}$.
Answer: $\pi\big(2 - \frac32 \log 3\big)$.1
8b5 Exercise. Compute the integral $\iint \frac{dx\,dy}{(1+x^2+y^2)^2}$ over one loop of the lemniscate $(x^2+y^2)^2 = x^2 - y^2$.2
8b6 Exercise. Compute the integral over the four-dimensional unit ball: $\iiiint_{x^2+y^2+u^2+v^2 \le 1} e^{x^2+y^2-u^2-v^2}\, dx\,dy\,du\,dv$.3
8b7 Exercise. Compute the integral $\iiint |xyz|\, dx\,dy\,dz$ over the ellipsoid $\{x^2/a^2 + y^2/b^2 + z^2/c^2 \le 1\}$.
Answer: $\frac{a^2b^2c^2}{6}$.4
The centroid5 of an admissible set $E \subset \mathbb{R}^n$ of non-zero volume is the point $C_E \in \mathbb{R}^n$ such that for every linear (or affine) $f : \mathbb{R}^n \to \mathbb{R}$ the mean of $f$ on $E$ (recall the end of Sect. 4d) is equal to $f(C_E)$. That is,
$$C_E = \frac{1}{v(E)} \Big( \int_E x_1\,dx, \dots, \int_E x_n\,dx \Big),$$
which is often abbreviated to $C_E = \frac{1}{v(E)} \int_E x\,dx$.
8b8 Exercise. Find the centroids of the following bodies in $\mathbb{R}^3$:
(a) The cone $\{(x,y,z) : h\sqrt{x^2+y^2} < z < h\}$ for a given $h > 0$.
(b) The tetrahedron bounded by the three coordinate planes and the plane $\frac{x}{a} + \frac{y}{b} + \frac{z}{c} = 1$.
(c) The hemispherical shell $\{a^2 \le x^2 + y^2 + z^2 \le b^2,\ z \ge 0\}$.
(d) The octant of the ellipsoid $\{x^2/a^2 + y^2/b^2 + z^2/c^2 \le 1,\ x, y, z \ge 0\}$.
The solid torus in $\mathbb{R}^3$ with minor radius $r$ and major radius $R$ (for $0 < r < R < \infty$) is the set
$$\tilde\Omega = \{(x,y,z) : (\sqrt{x^2+y^2} - R)^2 + z^2 \le r^2\} \subset \mathbb{R}^3$$
generated by rotating the disk
$$\Omega = \{(x,z) : (x - R)^2 + z^2 \le r^2\} \subset \mathbb{R}^2$$
1
Hint: $1 < r < 3$; $\cos\theta > \frac{r^2+3}{4r}$.
2
Hints: use polar coordinates; $-\frac{\pi}{4} < \theta < \frac{\pi}{4}$; $0 < r < \sqrt{\cos 2\theta}$; $1 + \cos 2\theta = 2\cos^2\theta$; $\frac{d\theta}{\cos^2\theta} = d\tan\theta$.
3
Hint: The integral equals $\iint_{x^2+y^2 \le 1} e^{x^2+y^2} \Big( \iint_{u^2+v^2 \le 1-(x^2+y^2)} e^{-(u^2+v^2)}\, du\,dv \Big)\, dx\,dy$. Now use the polar coordinates.
4
Hint: 4h3 can help.
5
In other words, the barycenter of (the uniform distribution on) $E$.
on the $(x,z)$ plane (with the center $(R,0)$ and radius $r$) about the $z$ axis. Interestingly, the volume $2\pi^2 R r^2$ of $\tilde\Omega$ is equal to the area $\pi r^2$ of $\Omega$ multiplied by the distance $2\pi R$ traveled by the center of $\Omega$. (Thus, it is also equal to the volume of the cylinder $\{(x,y,z) : (x,z) \in \Omega,\ y \in [0, 2\pi R]\}$.) Moreover, this is a special case of a general property of all solids of revolution.
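The torus volume can be cross-checked by slicing. The sketch below assumes horizontal slices: at height $z$ the slice of the solid torus is an annulus with radii $R \mp \sqrt{r^2 - z^2}$, whose area is integrated over $z \in (-r, r)$ by a midpoint rule and compared with area times distance traveled by the center. The function names are illustrative.

```python
import math

def torus_volume_by_slices(R, r, m=20000):
    """Integrate the area of horizontal annular slices of the solid torus over z in (-r, r)."""
    h = 2.0 * r / m
    total = 0.0
    for i in range(m):
        z = -r + (i + 0.5) * h
        w = math.sqrt(r * r - z * z)
        # Annulus with inner radius R - w and outer radius R + w.
        total += math.pi * ((R + w) ** 2 - (R - w) ** 2)
    return h * total

def torus_volume_by_centroid(R, r):
    """Area of the generating disk times the distance traveled by its center."""
    return (math.pi * r * r) * (2.0 * math.pi * R)

if __name__ == "__main__":
    print(torus_volume_by_slices(3.0, 1.0), torus_volume_by_centroid(3.0, 1.0))
```

Both values approximate $2\pi^2 R r^2$; the slice sum is slightly off because of the square-root behavior of the integrand near $z = \pm r$.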
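8b9 Proposition (the second Pappus's centroid theorem).1,2 Let $\Omega \subset (0,\infty) \times \mathbb{R} \subset \mathbb{R}^2$ be an admissible set and $\tilde\Omega = \{(x,y,z) : (\sqrt{x^2+y^2}, z) \in \Omega\} \subset \mathbb{R}^3$. Then $\tilde\Omega$ is admissible, and
$$v_3(\tilde\Omega) = v_2(\Omega) \cdot 2\pi x_{C_\Omega};$$
here $C_\Omega = (x_{C_\Omega}, z_{C_\Omega})$ is the centroid of $\Omega$.
8b10 Exercise. Prove Prop. 8b9.3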
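8c Measure 0 is preserved
8c1 Proposition. Let $U, V \subset \mathbb{R}^n$ be open sets, and $\varphi : U \to V$ a diffeomorphism. Then, for every set $Z \subset U$,
$$(Z \text{ has measure } 0) \iff (\varphi(Z) \text{ has measure } 0).$$
Recall Def. 6c1.
8c2 Lemma. The following three conditions on a set $Z \subset \mathbb{R}^n$ are equivalent:
(a) for every $\varepsilon > 0$ there exist pixels $Q_i = 2^{-N_i}([0,1]^n + k_i)$ such that $Z \subset \bigcup_{i=1}^\infty Q_i$ and $\sum_{i=1}^\infty v(Q_i) \le \varepsilon$;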
1
Pappus of Alexandria (c. 290–350) was one of the last great Greek mathematicians of Antiquity.
2
The first Pappus's centroid theorem, about surface area, has to wait for Analysis 4.
3
Hint: use cylindrical coordinates: $\Psi(r,\varphi,z) = (r\cos\varphi, r\sin\varphi, z)$.
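(b) $Z$ has measure 0;
(c) for every $\varepsilon > 0$ there exist admissible sets $E_1, E_2, \dots \subset \mathbb{R}^n$ such that $Z \subset \bigcup_{i=1}^\infty E_i$ and $\sum_{i=1}^\infty v(E_i) \le \varepsilon$.
Proof. Clearly, (a)$\implies$(b)$\implies$(c); we'll prove that (c)$\implies$(a).
First, recall Sect. 4d: for every admissible $E$ we have $v(E) = v^*(E) = \lim_N U_N(\mathbb{1}_E)$, and $U_N(\mathbb{1}_E)$ is the total volume of all $N$-pixels that intersect $E$. Given $\varepsilon > 0$, we take $N$ such that $U_N(\mathbb{1}_E) \le v(E) + \varepsilon$, denote the $N$-pixels that intersect $E$ by $Q_1, \dots, Q_j$ and get $E \subset Q_1 \cup \dots \cup Q_j$ and $v(Q_1) + \dots + v(Q_j) \le v(E) + \varepsilon$.
Now we prove that (c)$\implies$(a). Given $E_i$ as in (c) and $\varepsilon > 0$, we take $\varepsilon_i > 0$ such that $\sum_i \varepsilon_i \le \varepsilon$, and for each $i$ we take pixels $Q_{i,1}, \dots, Q_{i,j_i}$ such that $E_i \subset Q_{i,1} \cup \dots \cup Q_{i,j_i}$ and $v(Q_{i,1}) + \dots + v(Q_{i,j_i}) \le v(E_i) + \varepsilon_i$. Then $Z \subset \bigcup_i E_i \subset \bigcup_i (Q_{i,1} \cup \dots \cup Q_{i,j_i})$ and $\sum_i \big(v(Q_{i,1}) + \dots + v(Q_{i,j_i})\big) \le \sum_i (v(E_i) + \varepsilon_i) \le 2\varepsilon$. It remains to enumerate all these $Q_{i,j}$ by a single index.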
Euclidean metric is convenient when working with balls, not cubes. Another norm (called "cubical norm" or "sup-norm"),
$$\|x\|_\infty = \max(|x_1|, \dots, |x_n|) \quad\text{for } x = (x_1,\dots,x_n) \in \mathbb{R}^n,$$
becomes more convenient, since its ball $\{x : \|x\|_\infty \le r\}$ is a cube (of volume $(2r)^n$), and is equivalent to the Euclidean norm, since $\frac{1}{\sqrt n}|x| \le \|x\|_\infty \le |x|$. (Some authors1 use the cubic norm; others,2 using Euclidean norm, complain about "pesky $\sqrt n$".) The corresponding operator norm (recall 1f11),
$$\|A\|_\infty = \sup_{x \in \mathbb{R}^n \setminus \{0\}} \frac{\|Ax\|_\infty}{\|x\|_\infty} = \max_{\|x\|_\infty \le 1} \|Ax\|_\infty,$$
is also equivalent to the usual operator norm.
8c3 Exercise. Prove the cubical-norm counterpart of (1f31):3
$$\|f(b) - f(a)\|_\infty \le C\|b - a\|_\infty \quad\text{if } \|Df(\cdot)\|_\infty \le C \text{ on } [a,b].$$
Proof of Prop. 8c1. It is sufficient to prove "$\Longrightarrow$"; applied to $\varphi^{-1}$ it gives "$\Longleftarrow$".
We consider the pixels $Q = 2^{-N}([0,1]^n + k)$ for all $N$ and all $k \in \mathbb{Z}^n$ such that $Q \subset U$. They are a countable set,4 and their union is the whole $U$. Thus, $Z$ is the union of countably many sets $Z \cap Q$ of measure 0, and $\varphi(Z)$ is the
1
Shifrin, Sect. 7.6 (explicitly); Lang, p. 590 (implicitly).
2
Hubbard, after Prop. A19.3.
3
Surprisingly, this is simpler than (1f31).
4
Many of them are redundant, but this is harmless.
union of countably many sets (Z Q). By 6c2 it is sufficient to prove that

each (Z Q) has measure 0.
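By compactness, there exists $M$ such that $\|D\varphi(x)\| \le M$ for all $x \in Q$. By
8c3, $\|\varphi(x) - \varphi(y)\| \le M \|x - y\|$ for all $x, y \in Q$.
Given $\varepsilon > 0$, using 8c2 we take pixels $Q_i = 2^{-N_i}([0, 1]^n + k_i)$ such that
$Z \cap Q \subset \bigcup_i Q_i$ and $\sum_i v(Q_i) \le \varepsilon$. WLOG, $Q_i \subset Q$.
For all $x \in Q_i$ we have $\|\varphi(x) - \varphi(2^{-N_i} k_i)\| \le M \|x - 2^{-N_i} k_i\| \le 2^{-N_i} M$,
thus, $\varphi(Q_i)$ is contained in a cube of volume $(2 \cdot 2^{-N_i} M)^n = (2M)^n v(Q_i)$,$^1$
and therefore $\varphi(Z \cap Q)$ is contained in the union of cubes of total volume
$\le (2M)^n \varepsilon$, which shows that $\varphi(Z \cap Q)$ has measure 0.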
Here is a lemma needed (in addition to 8c1) in order to prove Th. 8a1(a).
8c4 Lemma. Let $E \subset \mathbb{R}^n$ be an admissible set, and $f : E \to \mathbb{R}$ a bounded
function. Then $f$ is integrable on $E$ if and only if the discontinuity points of
$f$ on $E^\circ$ are a set of measure 0.
Proof. Denote by $Z$ the set of all discontinuity points of $f \cdot \mathbb{1}_E$; then $Z \cap E^\circ$ is
the set of all discontinuity points of $f$ on $E^\circ$. The difference $Z \setminus (Z \cap E^\circ) \subset \partial E$
has volume 0 (see 6b8(b)), therefore, measure 0. Using Lebesgue criterion
6d2,
\[ (f \text{ is integrable on } E) \iff (Z \text{ has measure } 0) \iff (Z \cap E^\circ \text{ has measure } 0) . \]
Proof of Item (a) of Th. 8a1. Denote by $Z$ the set of all discontinuity points
of $f$ (on $V$); then $\varphi^{-1}(Z)$ is the set of all discontinuity points of $f \circ \varphi$ (on $U$),
since $\varphi$ is a homeomorphism, and of $(f \circ \varphi)\,|\det D\varphi|$ as well, since $|\det D\varphi|$ is
continuous and never 0. By 8c1, if one of these three functions is continuous
almost everywhere, then the other two are. It remains to apply 8c4.
8c5 Corollary. A set $E \subset U$ is admissible if and only if $\varphi(E) \subset V$ is admissible.
8d Approximation from within

Here we reduce Item (b) of Theorem 8a1 to such a special case (to be proved
later).
8d1 Proposition. Let $U, V, \varphi, f$ be as in Th. 8a1, and in addition, $f$ be
compactly supported within $V$. Then 8a1(b) holds.
1
Moreover, of volume $M^n v(Q_i)$; never mind.
8d2 Lemma. For every $\varepsilon > 0$ there exists admissible compact $K \subset U$ satisfying
\[ v(K) \ge v(U) - \varepsilon , \qquad v(\varphi(K)) \ge v(V) - \varepsilon . \]
Proof. Recall Sect. 4d:$^1$ $v(U) = v_*(U) = \lim_N L_N(\mathbb{1}_U)$, and $L_N(\mathbb{1}_U)$ is the
total volume of all $N$-pixels contained in $U$; denoting the union of these pixels
by $E_N$ we have $v(E_N) \to v(U)$, and each $E_N$ is an admissible compact subset
of $U$.
For every $\varepsilon > 0$ there exists admissible compact $E \subset U$ such that
$v(E) \ge v(U) - \varepsilon$. Similarly, there exists an admissible compact $F \subset V$ such that
$v(F) \ge v(V) - \varepsilon$. By 8c5, $\varphi^{-1}(F)$ and $\varphi(E)$ are admissible; we take
$K = E \cup \varphi^{-1}(F)$.
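Proof that Prop. 8d1 implies Th. 8a1(b). We take $M$ such that $|f(y)| \le M$
for all $y \in V$, and $|f(\varphi(x)) \det(D\varphi)_x| \le M$ for all $x \in U$.
We take $\varepsilon_i \to 0$; Lemma 8d2 gives $K_i$ for $\varepsilon_i$; we introduce functions
$f_i = f \cdot \mathbb{1}_{\varphi(K_i)}$, then $f_i \circ \varphi = (f \circ \varphi) \mathbb{1}_{K_i}$.

We use the integral norm (recall Sect. 4e):
\[ \|f - f_i\| = \int_V |f - f_i| \le M \int \mathbb{1}_{V \setminus \varphi(K_i)} \le M \varepsilon_i , \]
which gives the integral convergence: $f_i \to f$ as $i \to \infty$.
Similarly, $(f_i \circ \varphi)\,|\det D\varphi| \to (f \circ \varphi)\,|\det D\varphi|$.
We apply 8d1 to each $f_i$ and get 8a1(b) in the limit $i \to \infty$, since the
integral convergence implies convergence of integrals.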
8e All we need is small volume

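Now we reduce Proposition 8d1, getting rid of the function $f$.
8e1 Proposition. Let $U, V \subset \mathbb{R}^n$ be open sets, $\varphi : U \to V$ a diffeomorphism,
and $K \subset U$ a compact set. Then for every $\varepsilon > 0$ there exists $\delta > 0$ such that
for all $\lambda \in (0, \delta]$ and $h \in \mathbb{R}^n$, if $\lambda(Q + h) \cap K \ne \emptyset$, where $Q = [0, 1]^n$, then
$\lambda(Q + h) \subset U$ and
\[ \text{(8e2)} \qquad 1 - \varepsilon \le \frac{v(\varphi(\lambda(Q + h)))}{\lambda^n \, |\det(D\varphi)_x|} \le 1 + \varepsilon \quad\text{for all } x \in \lambda(Q + h) . \]
Note that $\varphi(\lambda(Q + h))$ is admissible by 8c5.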
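A concrete case may help to see what (8e2) says. The sketch below takes the polar-coordinates map $(r, \theta) \mapsto (r\cos\theta, r\sin\theta)$ as an illustrative diffeomorphism (an assumption made here, not an example from the notes), for which the image of a small square is an annular sector of known area and $|\det D\varphi| = r$. The function names are hypothetical.

```python
def polar_image_area(r0, lam):
    """Exact area of the image of [r0, r0+lam] x [t0, t0+lam] under (r, t) -> (r cos t, r sin t).

    The image is an annular sector; its area does not depend on the starting angle t0.
    """
    return 0.5 * ((r0 + lam) ** 2 - r0 ** 2) * lam

def ratio_range(r0, lam):
    """Smallest and largest value of v(phi(square)) / (lam^2 * |det D phi|) over the square.

    Here |det D phi| = r ranges over [r0, r0 + lam].
    """
    area = polar_image_area(r0, lam)
    return area / (lam ** 2 * (r0 + lam)), area / (lam ** 2 * r0)

eps = 0.01
for lam in (eps, eps / 2, eps / 10):
    print(lam, ratio_range(1.0, lam))
```

On the region $r \ge 1$ the ratio equals $(r_0 + \lambda/2)/r$, so the choice $\delta = \varepsilon$ is already (more than) enough there; smaller squares give ratios closer to 1.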

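Proof that Prop. 8e1 implies Prop. 8d1 (and therefore Th. 8a1). We have a compact
$K \subset U$ such that $f = 0$ on $V \setminus \varphi(K)$. Given $\varepsilon > 0$, we'll show that the two
integrals differ by a quantity that tends to 0 with $\varepsilon$. Prop. 8e1 gives $\delta$, and we take
$N$ such that $2^{-N} \le \delta$ and
$U_N((f \circ \varphi)\,|\det D\varphi|) - L_N((f \circ \varphi)\,|\det D\varphi|) \le \varepsilon$. By 8e1, for every $N$-pixel
$Q$ such that $Q \cap K \ne \emptyset$,
\[ 1 - \varepsilon \le \frac{v(\varphi(Q))}{v(Q)\,|\det(D\varphi)_x|} \le 1 + \varepsilon \quad\text{for all } x \in Q . \]
That is,
\[ (1 - \varepsilon)\, v(Q) \sup_{x \in Q} |\det(D\varphi)_x| \;\le\; v(\varphi(Q)) \;\le\; (1 + \varepsilon)\, v(Q) \inf_{x \in Q} |\det(D\varphi)_x| . \]
1
See also the proof of 8c2.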
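WLOG, $f \ge 0$ (otherwise, take $f = f^+ - f^-$). Denoting for convenience
$g(x) = f(\varphi(x))\,|\det(D\varphi)_x|$ we have (below, $Q$ runs over all $N$-pixels that intersect
$K$)
\begin{align*}
(1 - \varepsilon) L_N(g) &= (1 - \varepsilon) \sum_Q v(Q) \inf_{x \in Q} g(x)
 = (1 - \varepsilon) \sum_Q v(Q) \inf_{x \in Q} \bigl( f(\varphi(x))\,|\det(D\varphi)_x| \bigr) \\
&\le (1 - \varepsilon) \sum_Q v(Q) \Bigl( \inf_{x \in Q} f(\varphi(x)) \Bigr) \Bigl( \sup_{x \in Q} |\det(D\varphi)_x| \Bigr) \\
&\le \sum_Q v(\varphi(Q)) \inf_{y \in \varphi(Q)} f(y) \le \sum_Q \int_{\varphi(Q)} f = \int_V f
 \le \sum_Q v(\varphi(Q)) \sup_{y \in \varphi(Q)} f(y) \\
&\le (1 + \varepsilon) \sum_Q v(Q) \Bigl( \sup_{x \in Q} f(\varphi(x)) \Bigr) \Bigl( \inf_{x \in Q} |\det(D\varphi)_x| \Bigr) \\
&\le (1 + \varepsilon) \sum_Q v(Q) \sup_{x \in Q} \bigl( f(\varphi(x))\,|\det(D\varphi)_x| \bigr) = (1 + \varepsilon) U_N(g) .
\end{align*}
We see that $\int_V f \in [(1 - \varepsilon) L_N(g), (1 + \varepsilon) U_N(g)]$; also $\int_U g \in [L_N(g), U_N(g)]$;
thus,
\[ \Bigl| \int_U g - \int_V f \Bigr| \le (1 + \varepsilon) U_N(g) - (1 - \varepsilon) L_N(g)
 \le (1 + \varepsilon)(L_N(g) + \varepsilon) - (1 - \varepsilon) L_N(g) = 2\varepsilon L_N(g) + \varepsilon + \varepsilon^2 \to 0 \quad\text{as } \varepsilon \to 0 . \]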
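Now we reduce the proposition further, making it local, and formulated
in terms of the cubic norm.
For convenience we say that a cube $Q_0 \subset \mathbb{R}^n$ is $\varepsilon$-good, if $Q_0 \subset U$, and
every sub-cube $Q \subset Q_0$ satisfies
\[ \text{(8e3)} \qquad 1 - \varepsilon \le \frac{v(\varphi(Q))}{v(Q)\,|\det(D\varphi)_x|} \le 1 + \varepsilon \quad\text{for all } x \in Q . \]
Clearly, every sub-cube of an $\varepsilon$-good cube is also $\varepsilon$-good.
8e4 Proposition. Let $U, V \subset \mathbb{R}^n$ be open sets, $\varphi : U \to V$ a diffeomorphism,
and $x_0 \in U$. Then for every $\varepsilon > 0$ there exists $\delta > 0$ such that the cube
$Q_0 = \{x \in \mathbb{R}^n : \|x - x_0\| \le \delta\}$ is $\varepsilon$-good.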
Proof that Prop. 8e4 implies Prop. 8e1 (and therefore Th. 8a1). A compact set
$K \subset U$ is given, and $\varepsilon > 0$. For every $x_0 \in K$, 8e4 gives an $\varepsilon$-good cube $Q_0(x_0)$.
Open cubes $Q_0^\circ(x_0)$ cover $K$. Applying 6b5 (in the cubic norm, equivalent to
the Euclidean norm) to a finite subcovering we get a covering number, denote
it $2\delta$, such that for every $x_0 \in K$ the cube $Q_1(x_0) = \{y : \|y - x_0\| < 2\delta\}$ is
covered by a single $Q_0(x)$ and therefore is $\varepsilon$-good. For every $\lambda \in (0, \delta]$ every
cube $\lambda([0, 1]^n + h)$ that intersects $K$ at some $x_0$ is contained in $Q_1(x_0)$, which
proves 8e1.
8f Small volume in the linear approximation

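Now we prove Prop. 8e4. We have $\varphi : U \to V$, $x_0 \in U$, and $\varepsilon > 0$. We rewrite
(8e3), using the linear change of variables Th. 7b3:
\[ \text{(8f1)} \qquad 1 - \varepsilon \le \frac{v(\varphi(Q))}{v((D\varphi)_x(Q))} \le 1 + \varepsilon \quad\text{for all } x \in Q ; \]
here $(D\varphi)_x(Q) = \{(D\varphi)_x h : h \in Q\}$. Treating $\varphi : U \to \mathbb{R}^n$ as $\varphi : U \to W$
where $W$ is an $n$-dimensional vector space, we note that (8f1), being about
the ratio of two volumes in $W$, is insensitive to (arbitrary) change of basis in
$W$ (recall the framed phrase before (7b4)). Changing the basis (similarly to
Sect. 2c, 2d) we ensure, WLOG, that$^1$ $(D\varphi)_{x_0} = \mathrm{id}$.
Thus, $\det(D\varphi)_{x_0} = 1$. WLOG,
\[ \text{(8f2)} \qquad 1 - \varepsilon \le \det(D\varphi)_x \le 1 + \varepsilon \quad\text{for all } x \in U ; \]
otherwise we replace $U$ with a small neighborhood of $x_0$ (using continuity of
$x \mapsto \det(D\varphi)_x$).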
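Now we may replace (8f1) with
\[ \text{(8f3)} \qquad 1 - \varepsilon \le \frac{v(\varphi(Q))}{v(Q)} \le 1 + \varepsilon , \]
since
\[ \frac{v(\varphi(Q))}{v((D\varphi)_x(Q))} = \frac{v(\varphi(Q))}{v(Q)} \cdot \frac{v(Q)}{v((D\varphi)_x(Q))} = \frac{v(\varphi(Q))}{v(Q)} \cdot \frac{1}{\det(D\varphi)_x} , \]
and so (8f3) implies (by (8f2))
\[ \frac{1 - \varepsilon}{1 + \varepsilon} \le \frac{v(\varphi(Q))}{v((D\varphi)_x(Q))} \le \frac{1 + \varepsilon}{1 - \varepsilon} , \]
which is not quite (8f1), but we may change $\varepsilon$ accordingly.
1
Mind it: $(D\varphi)_{x_0}$, not $(D\varphi)_x$.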

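Similarly to (8f2), WLOG,
\[ \|(D\varphi)_x - \mathrm{id}\| \le \varepsilon \quad\text{for all } x \in U , \]
and in addition, $U$ is convex (just a ball or a cube). By 8c3,$^1$
\[ \text{(8f4)} \qquad \|(\varphi(b) - \varphi(a)) - (b - a)\| \le \varepsilon \|b - a\| \quad\text{for all } a, b \in U . \]
We take $\delta > 0$ such that, first, the cube $Q_0 = \{x \in \mathbb{R}^n : \|x - x_0\| \le \delta\}$ satisfies
$Q_0 \subset U$, and second, $\{y \in \mathbb{R}^n : \|y - y_0\| \le (1 + \varepsilon)\delta\} \subset V$, where $y_0 = \varphi(x_0)$;
this is possible, since $V$ is an (open) neighborhood of $y_0$.
It is sufficient to prove that
\[ \text{(8f5)} \qquad (1 - \varepsilon)^n \le \frac{v(\varphi(Q))}{v(Q)} \le (1 + \varepsilon)^n \quad\text{for every sub-cube } Q \subset Q_0 . \]
This is not quite (8f3), but again, we may change $\varepsilon$ accordingly.

Given such $Q$, WLOG, the center of $Q$ is 0, and $\varphi(0) = 0$ (since, as before,
we may shift the origins in both copies of $\mathbb{R}^n$). Thus,
\[ Q = \{x \in \mathbb{R}^n : \|x\| \le r\} \]
for some $r \in (0, \delta]$; it remains to prove that
\[ \text{(8f6)} \qquad (1 - \varepsilon) Q \subset \varphi(Q) \subset (1 + \varepsilon) Q . \]
By (8f4) for $a = 0$, $\|\varphi(x) - x\| \le \varepsilon \|x\|$ for all $x \in U$; thus,
$(1 - \varepsilon)\|x\| \le \|\varphi(x)\| \le (1 + \varepsilon)\|x\|$. For $x \in Q$ we get $\|\varphi(x)\| \le (1 + \varepsilon) r$, thus,
$\varphi(x) \in (1 + \varepsilon) Q$, which proves the inclusion $\varphi(Q) \subset (1 + \varepsilon) Q$. It remains to prove
the other inclusion, $(1 - \varepsilon) Q \subset \varphi(Q)$.
We note that $V \cap (1 - \varepsilon) Q \subset \varphi(Q)$, since
$\varphi(x) \in (1 - \varepsilon) Q \implies \|\varphi(x)\| \le (1 - \varepsilon) r \implies (1 - \varepsilon)\|x\| \le (1 - \varepsilon) r \implies \|x\| \le r \implies x \in Q$.
It remains to prove that $(1 - \varepsilon) Q \subset V$; we'll prove a bit more: that
$Q \subset \{y \in \mathbb{R}^n : \|y - y_0\| \le (1 + \varepsilon)\delta\}$ (and therefore $Q \subset V$).
The given inclusion $Q \subset Q_0$ means that $\|x_0\| + r \le \delta$ (think, why);
similarly, the needed inclusion becomes $\|y_0\| + r \le (1 + \varepsilon)\delta$. The latter
follows from the former:
\[ \|y_0\| + r = \|\varphi(x_0)\| + r \le (1 + \varepsilon)\|x_0\| + r \le (1 + \varepsilon)(\|x_0\| + r) \le (1 + \varepsilon)\delta , \]
which completes the proof of Prop. 8e4, and therefore Theorem 8a1, at last!
1
Recall the proof of 2c1 (and 2c3).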
9 Improper integral
9a Introduction
9b Positive integrands
9c Special functions gamma and beta
9d Change of variables
9e Iterated integral
9f Multidimensional beta integrals of Dirichlet
9g Non-positive (signed) integrands
Riemann integral and volume are generalized to unbounded functions and
sets.
9a Introduction
The $n$-dimensional unit ball in the $l_p$ metric,
\[ E = \{(x_1, \dots, x_n) : |x_1|^p + \dots + |x_n|^p \le 1\} , \]
is an admissible set, and its volume is a Riemann integral,
\[ v(E) = \int_{\mathbb{R}^n} \mathbb{1}_E , \]
of a bounded function with bounded support. In Sect. 9f we'll calculate it:
\[ v(E) = \frac{2^n \, \Gamma^n(\frac1p)}{p^n \, \Gamma(\frac np + 1)} \]
where $\Gamma$ is a function defined by
\[ \Gamma(t) = \int_0^\infty x^{t-1} e^{-x} \, dx \quad\text{for } t > 0 ; \]
here the integrand has no bounded support; and for $t = \frac1p < 1$ it is also unbounded
(near 0). Thus we need a more general, so-called improper integral,
even for calculating the volume of a bounded body!
In relatively simple cases the improper integral may be treated via ad hoc
limiting procedure adapted to the given function; for example,
\[ \int_0^\infty x^{t-1} e^{-x} \, dx = \lim_{k \to \infty} \int_{1/k}^{k} x^{t-1} e^{-x} \, dx . \]
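The ad hoc limiting procedure can be watched numerically. The following is a minimal sketch, assuming a plain midpoint rule for the proper integral over $[1/k, k]$ and using Python's `math.gamma` only as a reference value; the function name `truncated_gamma` is hypothetical.

```python
import math

def truncated_gamma(t, k, steps=100_000):
    """Midpoint-rule approximation of the proper integral of x^(t-1) e^(-x) over [1/k, k]."""
    a, b = 1.0 / k, float(k)
    h = (b - a) / steps
    total = 0.0
    for i in range(steps):
        x = a + (i + 0.5) * h          # midpoint of the i-th subinterval
        total += x ** (t - 1.0) * math.exp(-x)
    return total * h

# The truncated integrals increase with k and approach Gamma(t).
for k in (10, 20, 40):
    print(k, truncated_gamma(1.5, k), math.gamma(1.5))
```

For $t < 1$ the same limit holds, but convergence in $k$ is slower, since the part of the integral over $(0, 1/k)$ is of order $k^{-t}$.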
In more complicated cases it is better to have a theory able to integrate rather
general functions on rather general $n$-dimensional sets. Different functions
may tend to infinity on different subsets (points, lines, surfaces), and still,
we expect $\int (af + bg) = a \int f + b \int g$ (linearity) to hold, as well as change of
variables.$^1$
9b Positive integrands
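We consider an open set $G \subset \mathbb{R}^n$ and functions $f : G \to [0, \infty)$ continuous
almost everywhere.$^2$ We do not assume that $G$ is bounded. We also do not
assume that $G$ is admissible, even if it is bounded.$^3$ "Continuous almost
everywhere" means that the set $A \subset G$ of all discontinuity points of $f$ has
measure 0 (recall Sect. 6d). We can use the function $f \cdot \mathbb{1}_G$ equal $f$ on $G$
and 0 on $\mathbb{R}^n \setminus G$, but must be careful: $\mathbb{1}_G$ and $f \cdot \mathbb{1}_G$ need not be continuous
almost everywhere.
We define
\[ \text{(9b1)} \qquad \int_G f = \sup \Bigl\{ \int_{\mathbb{R}^n} g \;\Big|\; g : \mathbb{R}^n \to \mathbb{R} \text{ integrable},\ 0 \le g \le f \text{ on } G,\ g = 0 \text{ on } \mathbb{R}^n \setminus G \Bigr\} \in [0, \infty] . \]
The condition on $g$ may be reformulated as $0 \le g \le f \cdot \mathbb{1}_G$. If $f \cdot \mathbb{1}_G$ is
integrable (on $\mathbb{R}^n$), then clearly $\int_G f = \int_{\mathbb{R}^n} f \cdot \mathbb{1}_G$, which generalizes 4d5. This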
happens if and only if $f \cdot \mathbb{1}_G$ is bounded, with bounded support, and
\[ f(x) \to 0 \quad\text{as } G \ni x \to x_0 \]
for almost all $x_0 \in \partial G$ (think, why). (Void if $\partial G$ has measure 0.)
9b2 Exercise. (a) Without changing the supremum in (9b1) we may restrict
ourselves to continuous g with bounded support; or, alternatively, to step
functions g; and moreover, in both cases, WLOG, g has a compact support
inside G;
1
Additional literature (for especially interested):
M. Pascu (2006) "On the definition of multidimensional generalized Riemann integral",
Bul. Univ. Petrol LVIII:2, 9–16.
(Research level) D. Maharam (1988) "Jordan fields and improper integrals", J. Math.
Anal. Appl. 133, 163–194.
2
This condition will be used in 9b9.
3
A bounded open set need not be admissible, even if it is diffeomorphic to a disk.
(b) if $f$ is bounded (not necessarily a.e. continuous) and $G$ is bounded,
then $\int_G f = {}_*\!\!\int_{\mathbb{R}^n} f \cdot \mathbb{1}_G$ (the lower integral), and in particular, $\int_G 1 = v_*(G)$;$^1$
(c) if $f$ is bounded and $G$ is admissible, then the integral defined by (9b1)
is equal to the integral defined by 4d5.
Prove it.
There are many ways to treat the improper integral as the limit of
(proper) Riemann integrals; here are some ways.
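9b3 Exercise. Consider the case $G = \mathbb{R}^n$, and let $\|\cdot\|$ be a norm on $\mathbb{R}^n$ of
the form$^2$ $\|x\| = (|x_1|^p + \dots + |x_n|^p)^{1/p}$ for $x = (x_1, \dots, x_n)$; here $p \in [1, \infty]$ is a
parameter (and $\|x\| = \max(|x_1|, \dots, |x_n|)$ if $p = \infty$).
(a) Prove that
\[ \int_{\mathbb{R}^n} f = \lim_{k \to \infty} \int_{\|x\| < k} \min(f(x), k) \, dx . \]
(b) For a locally bounded$^3$ $f$ prove that
\[ \int_{\mathbb{R}^n} f = \lim_{k \to \infty} \int_{\|x\| < k} f(x) \, dx . \]
(c) Can it happen that $f$ is locally bounded, not bounded, and $\int_{\mathbb{R}^n} f < \infty$?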
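9b4 Example (Poisson). Consider
\[ I = \int_{\mathbb{R}^2} e^{-|x|^2} \, dx . \]
On one hand, by 9b3 for the Euclidean norm ($p = 2$),
\[ I = \lim_{k \to \infty} \iint_{x^2 + y^2 < k^2} e^{-(x^2 + y^2)} \, dx\,dy
 = \lim_{k \to \infty} \int_0^k r \, dr \int_0^{2\pi} e^{-r^2} \, d\theta
 = \pi \lim_{k \to \infty} \int_0^{k^2} e^{-u} \, du = \pi . \]
On the other hand, by 9b3 for $\|(x, y)\| = \max(|x|, |y|)$ ($p = \infty$),
\[ I = \lim_{k \to \infty} \iint_{|x| < k,\, |y| < k} e^{-(x^2 + y^2)} \, dx\,dy
 = \lim_{k \to \infty} \Bigl( \int_{-k}^{k} e^{-x^2} \, dx \Bigr) \Bigl( \int_{-k}^{k} e^{-y^2} \, dy \Bigr)
 = \Bigl( \int_{-\infty}^{+\infty} e^{-x^2} \, dx \Bigr)^2 , \]
and we obtain the celebrated Poisson formula:
\[ \int_{-\infty}^{+\infty} e^{-x^2} \, dx = \sqrt{\pi} . \]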
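The two exhaustions of the plane, by disks and by squares, can be compared numerically. The sketch below is an illustration only (hypothetical function names, midpoint rule for the one-dimensional integral); the disk integral uses the closed form $\pi(1 - e^{-k^2})$ that results from the polar-coordinates computation.

```python
import math

def gauss_1d(k, steps=20_000):
    """Midpoint-rule approximation of the integral of exp(-x^2) over [-k, k]."""
    h = 2.0 * k / steps
    return h * sum(math.exp(-(-k + (i + 0.5) * h) ** 2) for i in range(steps))

def disk_integral(k):
    """Integral of exp(-(x^2+y^2)) over the disk of radius k, via polar coordinates."""
    return math.pi * (1.0 - math.exp(-k * k))

def square_integral(k):
    """Integral of exp(-(x^2+y^2)) over the square |x| < k, |y| < k, as a product."""
    return gauss_1d(k) ** 2

for k in (1.0, 2.0, 5.0):
    print(k, disk_integral(k), square_integral(k), math.pi)
```

Since the disk of radius $k$ lies inside the square of half-side $k$, which lies inside the disk of radius $k\sqrt2$, the square integrals are sandwiched between two disk integrals, and both sequences tend to $\pi$.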

1
In fact, $v_*(G)$ is Lebesgue's measure of $G$.
2
But in fact, the same holds for arbitrary norm.
3
That is, bounded on every bounded subset of $\mathbb{R}^n$.
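9b5 Exercise. Consider
\[ I = \iint_{x > 0,\, y > 0} x^a y^b e^{-(x^2 + y^2)} \, dx\,dy \in [0, \infty] \]
for given $a, b \in \mathbb{R}$. Prove that, on one hand,
\[ I = \Bigl( \int_0^\infty r^{a+b+1} e^{-r^2} \, dr \Bigr) \Bigl( \int_0^{\pi/2} \cos^a\theta \, \sin^b\theta \, d\theta \Bigr) , \]
and on the other hand,
\[ I = \Bigl( \int_0^\infty x^a e^{-x^2} \, dx \Bigr) \Bigl( \int_0^\infty x^b e^{-x^2} \, dx \Bigr) . \]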
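9b6 Exercise. Consider $f : \mathbb{R}^2 \to [0, \infty)$ of the form $f(x) = g(|x|)$ for a given
$g : [0, \infty) \to [0, \infty)$.
(a) If $g$ is integrable, then $f$ is integrable and $\int_{\mathbb{R}^2} f = 2\pi \int_0^\infty g(r) \, r \, dr$.
(b) If $g$ is continuous on $(0, \infty)$, then $\int_{\mathbb{R}^2} f = 2\pi \int_0^\infty g(r) \, r \, dr \in [0, \infty]$.
Prove it.$^1$
9b7 Exercise. Let $\|\cdot\|$ be as in 9b3.$^2$ Consider $f : \mathbb{R}^n \to [0, \infty)$ of the form
$f(x) = g(\|x\|)$ for a given $g : [0, \infty) \to [0, \infty)$.
(a) If $g$ is integrable, then $f$ is integrable, and $\int_{\mathbb{R}^n} f = nV \int_0^\infty g(r) \, r^{n-1} \, dr$
where $V$ is the volume of $\{x : \|x\| < 1\}$.
(b) If $g$ is continuous on $(0, \infty)$, then $\int_{\mathbb{R}^n} f = nV \int_0^\infty g(r) \, r^{n-1} \, dr \in [0, \infty]$.
(c) Let $g$ be continuous on $(0, \infty)$ and satisfy
\[ g(r) \sim r^a \text{ for } r \to 0+ , \qquad g(r) \sim r^b \text{ for } r \to +\infty . \]
Then $\int f < \infty$ if and only if $b < -n < a$.
Prove it.$^3$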

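9b8 Example. $\int_{\mathbb{R}^n} e^{-\|x\|^2} \, dx = nV \int_0^\infty r^{n-1} e^{-r^2} \, dr$; in particular,
$\int_{\mathbb{R}^n} e^{-|x|^2} \, dx = nV_n \int_0^\infty r^{n-1} e^{-r^2} \, dr$ where $V_n$ is the volume of the (usual)
$n$-dimensional unit ball. On the other hand,
$\int_{\mathbb{R}^n} e^{-|x|^2} \, dx = \bigl( \int_{\mathbb{R}} e^{-x^2} \, dx \bigr)^n = \pi^{n/2}$. Therefore
\[ V_n = \frac{\pi^{n/2}}{n \int_0^\infty r^{n-1} e^{-r^2} \, dr} . \]
Not unexpectedly, $V_2 = \dfrac{\pi}{2 \int_0^\infty r e^{-r^2} \, dr} = \pi$.
1
Hint: (a) polar coordinates; (b) use (a).
2
But in fact, the same holds for arbitrary norm.
3
Hint: (a) first, $g = \mathbb{1}_{[0,a]}$, second, a step function $g$, and third, sandwich; also,
(a)$\Longrightarrow$(b)$\Longrightarrow$(c).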
Clearly, $\int_G cf = c \int_G f$ for $c \in (0, \infty)$.

9b9 Proposition. $\int_G (f_1 + f_2) = \int_G f_1 + \int_G f_2 \in [0, \infty]$ for all $f_1, f_2 \ge 0$ on $G$,
continuous almost everywhere.
Proof. The easy part: $\int_G (f_1 + f_2) \ge \int_G f_1 + \int_G f_2$.$^1$ Given integrable $g_1, g_2$
such that $0 \le g_1 \le f_1 \mathbb{1}_G$ and $0 \le g_2 \le f_2 \mathbb{1}_G$, we have
$\int g_1 + \int g_2 = \int (g_1 + g_2) \le \int_G (f_1 + f_2)$, since $g_1 + g_2$ is integrable and
$0 \le g_1 + g_2 \le (f_1 + f_2) \mathbb{1}_G$. The supremum in $g_1, g_2$ gives the claim.
The hard part: $\int_G (f_1 + f_2) \le \int_G f_1 + \int_G f_2$, that is, $\int g \le \int_G f_1 + \int_G f_2$ for
every integrable $g$ such that $0 \le g \le (f_1 + f_2) \mathbb{1}_G$. We introduce $g_1 = \min(f_1, g)$,
$g_2 = \min(f_2, g)$ (pointwise minimum on $G$; and 0 on $\mathbb{R}^n \setminus G$) and prove that
they are continuous almost everywhere (on $\mathbb{R}^n$, not just on $G$). For almost
every $x \in G$, both $f_1$ and $g$ are continuous at $x$ and therefore $g_1$ is continuous
at $x$. For almost every $x \notin G$, $g$ is continuous at $x$, which ensures continuity
of $g_1$ at $x$ (irrespective of continuity of $f_1$), since $g(x) = 0$ ($x \notin G$). Thus, $g_1$
is continuous almost everywhere; the same holds for $g_2$.
By Lebesgue's criterion 6d2, the functions $g_1, g_2$ are integrable. We have
$g_1 + g_2 \ge \min(f_1 + f_2, g) = g$, since generally, $\min(a, c) + \min(b, c) \ge \min(a + b, c)$
for all $a, b, c \in [0, \infty)$ (think, why). Thus, $\int g \le \int (g_1 + g_2) = \int g_1 + \int g_2 \le
\int_G f_1 + \int_G f_2$, since $0 \le g_1 \le f_1 \mathbb{1}_G$, $0 \le g_2 \le f_2 \mathbb{1}_G$.
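9b10 Proposition (exhaustion). For open sets $G, G_1, G_2, \dots \subset \mathbb{R}^n$,
\[ G_k \uparrow G \implies \int_{G_k} f \uparrow \int_G f \in [0, \infty] \]
for all $f : G \to [0, \infty)$ continuous almost everywhere.
Proof. First of all, $\int_{G_k} f \le \int_{G_{k+1}} f$ (since $0 \le g \le f \mathbb{1}_{G_k}$ implies
$0 \le g \le f \mathbb{1}_{G_{k+1}}$), and similarly, $\int_{G_k} f \le \int_G f$, thus $\int_{G_k} f \uparrow$ and
$\lim_k \int_{G_k} f \le \int_G f$. We have to prove that $\int_G f \le \lim_k \int_{G_k} f$.
We take an integrable $g$, compactly supported inside $G$ (recall 9b2(a)),
such that $g \le f$ on $G$. By compactness, there exists $k_0$ such that $g \le f \mathbb{1}_{G_{k_0}}$.
Then $\int g \le \int_{G_{k_0}} f \le \lim_k \int_{G_k} f$. The supremum in $g$ proves the claim.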
9b11 Corollary (monotone convergence for volume). For open sets
$G, G_1, G_2, \dots \subset \mathbb{R}^n$,$^2$
\[ G_k \uparrow G \implies v_*(G_k) \uparrow v_*(G) . \]
9b12 Remark. Let $G_1, G_2, \dots \subset \mathbb{R}^n$ be (pairwise) disjoint open balls. Then
\[ v_*(G_1 \uplus G_2 \uplus \dots) = v(G_1) + v(G_2) + \dots \]
even if the union is dense in $\mathbb{R}^n$ (which can happen; think, why).
1
Compare it with 4c7: ${}_*\!\!\int (f + g) \ge {}_*\!\!\int f + {}_*\!\!\int g$.
2
Really, this is easy to prove without 9b10 (try it).
9c Special functions gamma and beta

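The Euler gamma function is defined by$^1$
\[ \text{(9c1)} \qquad \Gamma(t) = \int_0^\infty x^{t-1} e^{-x} \, dx \quad\text{for } t \in (0, \infty) . \]
This integral is not proper for two reasons. First, the integrand is bounded
near 0 for $t \in [1, \infty)$ but unbounded for $t \in (0, 1)$. Second, the integrand has
no bounded support. In every case, using 9b10,
\[ \Gamma(t) = \lim_{k \to \infty} \int_{1/k}^{k} x^{t-1} e^{-x} \, dx < \infty , \]
since the integrand (for a given $t$) is continuous on $(0, \infty)$, is $O(x^{t-1})$ as
$x \to 0$, and (say) $O(e^{-x/2})$ as $x \to \infty$. Thus, $\Gamma : (0, \infty) \to (0, \infty)$.
Clearly, $\Gamma(1) = 1$. Integration by parts gives
\[ \int_{1/k}^{k} x^t e^{-x} \, dx = -x^t e^{-x} \Big|_{x=1/k}^{k} + t \int_{1/k}^{k} x^{t-1} e^{-x} \, dx ; \]
\[ \text{(9c2)} \qquad \Gamma(t + 1) = t \, \Gamma(t) \quad\text{for } t \in (0, \infty) . \]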
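In particular,
\[ \text{(9c3)} \qquad \Gamma(n + 1) = n! \quad\text{for } n = 0, 1, 2, \dots \]
We note that
\[ \text{(9c4)} \qquad \int_0^\infty x^a e^{-x^2} \, dx = \frac12 \Gamma\Bigl( \frac{a + 1}{2} \Bigr) \quad\text{for } a \in (-1, \infty) , \]
since $\int_0^\infty x^a e^{-x^2} \, dx = \int_0^\infty u^{a/2} e^{-u} \frac{du}{2\sqrt u}$. For $a = 0$ the Poisson formula (recall
9b4) gives
\[ \text{(9c5)} \qquad \Gamma\Bigl( \frac12 \Bigr) = \sqrt\pi . \]
Thus,
\[ \text{(9c6)} \qquad \Gamma\Bigl( \frac{2n + 1}{2} \Bigr) = \frac12 \cdot \frac32 \cdots \frac{2n - 1}{2} \sqrt\pi . \]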
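The volume $V_n$ of the $n$-dimensional unit ball (recall 9b8) is thus calculated:
\[ \text{(9c7)} \qquad V_n = \frac{\pi^{n/2}}{\frac n2 \Gamma(\frac n2)} . \]
Not unexpectedly,
\[ V_3 = \frac{\pi^{3/2}}{\frac32 \Gamma(\frac32)} = \frac{\pi^{3/2}}{\frac32 \cdot \frac12 \sqrt\pi} = \frac43 \pi . \]
1
This is rather $\int_{(0,\infty)}$.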
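Formula (9c7) is easy to evaluate with the standard-library gamma function. A minimal sketch (the function name is hypothetical), written in the equivalent form $2\pi^{n/2} / (n\,\Gamma(n/2))$:

```python
import math

def unit_ball_volume(n):
    """Volume of the Euclidean unit ball in R^n by (9c7): pi^(n/2) / ((n/2) * Gamma(n/2))."""
    return 2.0 * math.pi ** (n / 2.0) / (n * math.gamma(n / 2.0))

for n in range(1, 7):
    print(n, unit_ball_volume(n))
```

The first three values reproduce the familiar length 2, area $\pi$ and volume $\frac43\pi$.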
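By 9b5, $\frac12 \Gamma\bigl( \frac{a+b+2}{2} \bigr) \int_0^{\pi/2} \cos^a\theta \, \sin^b\theta \, d\theta
= \frac12 \Gamma\bigl( \frac{a+1}{2} \bigr) \cdot \frac12 \Gamma\bigl( \frac{b+1}{2} \bigr)$ for $a, b \in (-1, \infty)$;
that is,
\[ \text{(9c8)} \qquad \int_0^{\pi/2} \cos^{\alpha-1}\theta \, \sin^{\beta-1}\theta \, d\theta
 = \frac12 \, \frac{\Gamma(\frac\alpha2) \Gamma(\frac\beta2)}{\Gamma(\frac{\alpha+\beta}{2})} \quad\text{for } \alpha, \beta \in (0, \infty) . \]
In particular,
\[ \text{(9c9)} \qquad \int_0^{\pi/2} \sin^{\alpha-1}\theta \, d\theta = \int_0^{\pi/2} \cos^{\alpha-1}\theta \, d\theta
 = \frac{\sqrt\pi}{2} \, \frac{\Gamma(\frac\alpha2)}{\Gamma(\frac{\alpha+1}{2})} . \]
The trigonometric functions can be eliminated:
$\int_0^{\pi/2} \cos^{\alpha-1}\theta \, \sin^{\beta-1}\theta \, d\theta
= \frac12 \int_0^{\pi/2} \cos^{\alpha-2}\theta \, \sin^{\beta-2}\theta \cdot 2 \sin\theta \cos\theta \, d\theta
= \frac12 \int_0^1 (1 - u)^{\frac{\alpha-2}{2}} u^{\frac{\beta-2}{2}} \, du$; thus,
\[ \text{(9c10)} \qquad \int_0^1 x^{\alpha-1} (1 - x)^{\beta-1} \, dx = B(\alpha, \beta) \quad\text{for } \alpha, \beta \in (0, \infty) , \]
where
\[ \text{(9c11)} \qquad B(\alpha, \beta) = \frac{\Gamma(\alpha) \Gamma(\beta)}{\Gamma(\alpha + \beta)} \quad\text{for } \alpha, \beta \in (0, \infty) \]
is another special function, the beta function.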
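The identity (9c10) with (9c11) can be checked numerically. The sketch below is only an illustration with hypothetical function names; it uses a midpoint rule, which is adequate when $\alpha, \beta \ge 1$ (for smaller parameters the integrand is unbounded and a cruder tolerance or a better quadrature would be needed).

```python
import math

def beta(alpha, beta_):
    """Beta function via (9c11): Gamma(alpha) Gamma(beta) / Gamma(alpha + beta)."""
    return math.gamma(alpha) * math.gamma(beta_) / math.gamma(alpha + beta_)

def beta_integral(alpha, beta_, steps=200_000):
    """Midpoint-rule approximation of the left-hand side of (9c10)."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += x ** (alpha - 1.0) * (1.0 - x) ** (beta_ - 1.0)
    return total * h

print(beta_integral(2.5, 1.5), beta(2.5, 1.5), math.pi / 16.0)
```

The value $B(\frac52, \frac32) = \pi/16$ follows from (9c5), (9c6) and $\Gamma(4) = 3!$; the trigonometric form (9c8) can be tested the same way, for instance $\int_0^{\pi/2} \cos^2\theta \sin\theta \, d\theta = \frac13 = \frac12 B(\frac32, 1)$.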
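9c12 Exercise. Check that $B(x, x) = 2^{1-2x} B(x, \frac12)$.$^1$
9c13 Exercise. Check the duplication formula:$^2$
\[ \Gamma(2x) = \frac{2^{2x-1}}{\sqrt\pi} \, \Gamma(x) \, \Gamma\Bigl( x + \frac12 \Bigr) . \]
9c14 Exercise. Calculate $\int_0^1 x^4 \sqrt{1 - x^2} \, dx$.
Answer: $\frac{\pi}{32}$.
9c15 Exercise. Calculate $\int_0^\infty x^m e^{-x^n} \, dx$.
Answer: $\frac1n \Gamma\bigl( \frac{m+1}{n} \bigr)$.
9c16 Exercise. Calculate $\int_0^1 x^m (\ln x)^n \, dx$.
Answer: $\frac{(-1)^n \, n!}{(m+1)^{n+1}}$.
1
Hint: $\int_0^{\pi/2} (2 \sin\theta \cos\theta)^{2x-1} \, d\theta$.
2
Hint: use 9c12.
9c17 Exercise. Calculate $\int_0^{\pi/2} \frac{dx}{\sqrt{\cos x}}$.
Answer: $\frac{\Gamma^2(1/4)}{2\sqrt{2\pi}}$.
9c18 Exercise. Check that $\Gamma(t) \Gamma(1 - t) = \int_0^\infty \frac{x^{t-1}}{1 + x} \, dx$ for $0 < t < 1$.$^1$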
We mention without proof another useful formula
\[ \int_0^\infty \frac{x^{t-1}}{1 + x} \, dx = \frac{\pi}{\sin \pi t} \quad\text{for } 0 < t < 1 . \]
There is a simple proof that uses the residues theorem from the complex
analysis course. This formula yields that $\Gamma(t) \Gamma(1 - t) = \frac{\pi}{\sin \pi t}$ for $0 < t < 1$.
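The resulting reflection formula can be spot-checked against the standard-library gamma function. A minimal sketch (the helper names are hypothetical):

```python
import math

def reflection_lhs(t):
    """Gamma(t) * Gamma(1 - t) for 0 < t < 1."""
    return math.gamma(t) * math.gamma(1.0 - t)

def reflection_rhs(t):
    """pi / sin(pi t)."""
    return math.pi / math.sin(math.pi * t)

for t in (0.1, 0.25, 0.5, 0.9):
    print(t, reflection_lhs(t), reflection_rhs(t))
```

At $t = \frac12$ both sides equal $\pi$, in agreement with (9c5); at $t = \frac14$ the formula gives $\Gamma(\frac14)\Gamma(\frac34) = \pi\sqrt2$, which is the step that turns (9c9) into the answer of 9c17.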
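Is the function $\Gamma$ continuous?
For every compact interval $[t_0, t_1] \subset (0, \infty)$ the given function of two
variables $(t, x) \mapsto x^{t-1} e^{-x}$ is continuous on $[t_0, t_1] \times [\frac1k, k]$, therefore its integral
in $x$ is continuous in $t$ on $[t_0, t_1]$ (recall 4e6(a)). Also,
\[ \int_{1/k}^{k} x^{t-1} e^{-x} \, dx \to \Gamma(t) \quad\text{uniformly on } [t_0, t_1] , \]
since $\int_0^{1/k} x^{t-1} e^{-x} \, dx \le \int_0^{1/k} x^{t_0-1} \, dx \to 0$ as $k \to \infty$ and
$\int_k^\infty x^{t-1} e^{-x} \, dx \le \int_k^\infty x^{t_1-1} e^{-x} \, dx \to 0$ as $k \to \infty$. It follows that $\Gamma$ is continuous on arbitrary
$[t_0, t_1]$, therefore, on the whole $(0, \infty)$.
In particular, $t \Gamma(t) = \Gamma(t + 1) \to \Gamma(1) = 1$ as $t \to 0+$; that is,
\[ \Gamma(t) = \frac1t + o\Bigl( \frac1t \Bigr) \quad\text{as } t \to 0+ . \]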
9d Change of variables
9d1 Theorem (change of variables). Let $U, V \subset \mathbb{R}^n$ be open sets, $\varphi : U \to V$
a diffeomorphism, and $f : V \to [0, \infty)$. Then
(a) ($f$ is continuous almost everywhere on $V$) $\iff$
($f \circ \varphi$ is continuous almost everywhere on $U$) $\iff$
($(f \circ \varphi)\,|\det D\varphi|$ is continuous almost everywhere on $U$);
(b) if they are continuous almost everywhere, then
\[ \int_V f = \int_U (f \circ \varphi)\,|\det D\varphi| \in [0, \infty] . \]
Item (a) follows easily from 8c1 (similarly to the proof of 8a1(a) in Sect. 8c
but simpler: 8c4 is not needed now).
1
Hint: change $x$ to $y$ via $(1 + x)(1 - y) = 1$.
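9d2 Lemma. Let $U, V, \varphi, f$ be as in Th. 9d1, and in addition, $f$ be compactly
supported within $V$. Then 9d1(b) holds.
Proof. This is basically Prop. 8d1; there $U, V$ are admissible, since otherwise
the integrals over $U$ and $V$ are not defined by 4d5. Now they are defined
(see the paragraph after (9b1)): $\int_V f = \int_{\mathbb{R}^n} f \cdot \mathbb{1}_V$ (and similarly for $U$), and
the proof of 8d1 given in Sect. 8d applies (check it).
Proof of Th. 9d1(b). First, we prove that
\[ \text{(9d3)} \qquad \int_V f \le \int_U (f \circ \varphi)\,|\det D\varphi| . \]
Assume the contrary. By 9b2(a) there exists integrable $g$, compactly supported
within $V$, such that $g \le f$ on $V$ and $\int_V g > \int_U (f \circ \varphi)\,|\det D\varphi|$. By 9d2,
$\int_V g = \int_U (g \circ \varphi)\,|\det D\varphi| \le \int_U (f \circ \varphi)\,|\det D\varphi|$; this contradiction proves (9d3).
Second, we apply (9d3) to $\varphi_1 = \varphi^{-1} : V \to U$ and $f_1 = (f \circ \varphi)\,|\det D\varphi| :
U \to [0, \infty)$:
\[ \int_U f_1 \le \int_V (f_1 \circ \varphi^{-1})\,|\det D\varphi^{-1}| . \]
By the chain rule, $\varphi \circ \varphi^{-1} = \mathrm{id}_V$ implies $((D\varphi) \circ \varphi^{-1})(D\varphi^{-1}) = \mathrm{id}$, thus
$((\det D\varphi) \circ \varphi^{-1})(\det D\varphi^{-1}) = 1$. We get
\[ f_1 \circ \varphi^{-1} = (f \circ \varphi \circ \varphi^{-1})\,|(\det D\varphi) \circ \varphi^{-1}| = \frac{f}{|\det D\varphi^{-1}|} ; \]
\[ (f_1 \circ \varphi^{-1})\,|\det D\varphi^{-1}| = f ; \qquad \int_U (f \circ \varphi)\,|\det D\varphi| = \int_U f_1 \le \int_V f . \]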
9e Iterated integral
We consider an open set $G \subset \mathbb{R}^{m+n}$ and functions $f : G \to [0, \infty)$ continuous
almost everywhere. Similarly to Sect. 5d, the section $f(x, \cdot)$ of $f$ need not be
continuous almost everywhere on the section $G_x = \{y : (x, y) \in G\}$ of $G$; thus,
$\int_{G_x} f(x, \cdot)$ is generally ill-defined. Similarly to Th. 5d1 we need the lower
integral (but no upper integral this time).
We define the lower integral by (9b1) again, but this time $f : G \to [0, \infty)$
is arbitrary (rather than continuous almost everywhere). That is, for open
$G \subset \mathbb{R}^n$ (rather than $\mathbb{R}^{m+n}$, for now)
\[ \text{(9e1)} \qquad {}_*\!\!\int_G f = \sup \Bigl\{ \int_{\mathbb{R}^n} g \;\Big|\; g : \mathbb{R}^n \to \mathbb{R} \text{ integrable},\ 0 \le g \le f \text{ on } G,\ g = 0 \text{ on } \mathbb{R}^n \setminus G \Bigr\} \in [0, \infty] . \]
In particular, if $f$ is continuous almost everywhere on $G$, then ${}_*\!\!\int_G f = \int_G f$.
As before, the condition on $g$ may be reformulated as $0 \le g \le f \cdot \mathbb{1}_G$. Still,
9b2(a) applies (check it). And 9b2(b) becomes: if $f$ is bounded and $G$ is
bounded, then ${}_*\!\!\int_G f = {}_*\!\!\int_{\mathbb{R}^n} f \cdot \mathbb{1}_G$, the latter integral being proper, that is,
defined in Sect. 4c.
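Similarly to 9b10, for open sets $G, G_1, G_2, \dots \subset \mathbb{R}^n$,
\[ \text{(9e2)} \qquad G_k \uparrow G \implies {}_*\!\!\int_{G_k} f \uparrow {}_*\!\!\int_G f \in [0, \infty] \]
for arbitrary $f : G \to [0, \infty)$. Similarly to 9b3(a),
\[ G_k \uparrow G \implies {}_*\!\!\int_{G_k} \min(f, k) \uparrow {}_*\!\!\int_G f \in [0, \infty] . \]
If, in addition, $G_k$ are bounded, then we may rewrite it as
\[ \text{(9e3)} \qquad {}_*\!\!\int_{\mathbb{R}^n} \min(f, k) \, \mathbb{1}_{G_k} \uparrow {}_*\!\!\int_G f , \]
the left-hand side integral being proper.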

An increasing sequence of integrable functions can converge$^1$ to a function
that is not almost everywhere continuous (and moreover, is discontinuous
everywhere). Nevertheless, a limiting procedure is possible, as follows.
9e4 Proposition. If $g_1, g_2, \dots : \mathbb{R}^n \to [0, \infty)$ are integrable and
$g_k \uparrow f : \mathbb{R}^n \to [0, \infty)$, then $\int_{\mathbb{R}^n} g_k \uparrow {}_*\!\!\int_{\mathbb{R}^n} f$.$^2$
This claim follows easily from an important theorem (to be proved in
Appendix).
9e5 Theorem (monotone convergence for Riemann integral). If $g, g_1, g_2, \dots :
\mathbb{R}^n \to \mathbb{R}$ are integrable and $g_k \uparrow g$, then $\int_{\mathbb{R}^n} g_k \uparrow \int_{\mathbb{R}^n} g$.
Proof that Th. 9e5 implies Prop. 9e4. Clearly, $g_k \le f$ implies $\lim_k \int g_k \le {}_*\!\!\int f$;
we have to prove that ${}_*\!\!\int f \le \lim_k \int g_k$. Given an integrable $g \le f$, we
have $\min(g_k, g) \uparrow \min(f, g) = g$ and, by 9e5, $\int \min(g_k, g) \uparrow \int g$. Thus,
$\int g \le \lim_k \int g_k$; supremum in $g$ gives ${}_*\!\!\int f \le \lim_k \int g_k$.
We return to an open set $G \subset \mathbb{R}^{m+n}$ and its sections $G_x \subset \mathbb{R}^n$ for $x \in \mathbb{R}^m$.
1
Pointwise, not uniformly.
2
Do you think that ${}_*\!\!\int g_k \uparrow {}_*\!\!\int f$ for arbitrary (not integrable) $g_k$? No, this is wrong.
Recall $f_k$ of 4e7 and consider $1 - f_k$.
9e6 Theorem (iterated improper integral). If a function $f : G \to [0, \infty)$ is
continuous almost everywhere, then
\[ {}_*\!\!\int_{\mathbb{R}^m} dx \; {}_*\!\!\int_{G_x} dy \, f(x, y) = \iint_G f(x, y) \, dx\,dy \in [0, \infty] . \]
Unlike Th. 5d1, both integrals in the left-hand side are lower integrals.
The function $x \mapsto {}_*\!\!\int_{G_x} dy \, f(x, y)$ need not be almost everywhere continuous,
even if $G = \mathbb{R}^2$ and $f$ is continuous. Moreover, it can happen that
$x \mapsto \int_{\mathbb{R}} f(x, \cdot)$ is unbounded on every interval, even if $f : \mathbb{R}^2 \to [0, \infty)$ is bounded,
continuously differentiable, $\int_{\mathbb{R}^2} f < \infty$, and $f(x, y) \to 0$, $\nabla f(x, y) \to 0$ as
$x^2 + y^2 \to \infty$. (Can you find a counterexample? Hint: construct separately
$f|_{\mathbb{R} \times [2^k, 2^{k+1}]}$ for each $k$.)
It is easy to see (try it!) that $\int_G f$ does not exceed the iterated integral;
but the equality needs more effort.
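Proof. We take admissible open sets $G_k \subset \mathbb{R}^{m+n}$ such that $G_k \uparrow G$,$^1$ and
introduce $f_k = \min(f, k) \, \mathbb{1}_{G_k}$, that is,
\[ f_k(x, y) = \begin{cases}
 f(x, y), & \text{if } (x, y) \in G_k \text{ and } f(x, y) \le k, \\
 k, & \text{if } (x, y) \in G_k \text{ and } f(x, y) \ge k, \\
 0, & \text{if } (x, y) \notin G_k .
\end{cases} \]
By Lebesgue's criterion 6d2, each $f_k$ is integrable. By (9e3), $\int_{\mathbb{R}^{m+n}} f_k \uparrow \int_G f$,
the left-hand side integral being proper.
Given $x \in \mathbb{R}^m$, we apply the same argument to the sections $f_k(x, \cdot)$, $f(x, \cdot)$,
$(G_k)_x$, $G_x$, taking into account that $f_k(x, \cdot)$ need not be integrable, and we
get
\[ {}_*\!\!\int_{\mathbb{R}^n} f_k(x, \cdot) = {}_*\!\!\int_{\mathbb{R}^n} \min(f(x, \cdot), k) \, \mathbb{1}_{G_k}(x, \cdot) \uparrow {}_*\!\!\int_{G_x} f(x, \cdot) . \]
By Th. 5d1 (applied to $f_k$), the function $x \mapsto {}_*\!\!\int_{\mathbb{R}^n} f_k(x, \cdot)$ is integrable, and
its integral is equal to $\int_{\mathbb{R}^{m+n}} f_k$. Applying Prop. 9e4 to these functions we get
\[ \int_{\mathbb{R}^{m+n}} f_k = \int_{\mathbb{R}^m} \Bigl( x \mapsto {}_*\!\!\int_{\mathbb{R}^n} f_k(x, \cdot) \Bigr)
 \uparrow {}_*\!\!\int_{\mathbb{R}^m} \Bigl( x \mapsto {}_*\!\!\int_{G_x} f(x, \cdot) \Bigr) ; \]
but on the other hand, $\int_{\mathbb{R}^{m+n}} f_k \uparrow \int_G f$.

9e7 Corollary. The volume$^2$ of an open set $G \subset \mathbb{R}^{m+n}$ is equal to the lower
integral of the volume of $G_x$ (even if $G$ is not admissible).
1
For example, we may use the interior of the union of all $N$-pixels contained in
$G \cap [-N, N]^n$.
2
That is, $v_*(G)$ if $G$ is bounded; and $\int_G 1$ (in fact, the Lebesgue measure of $G$) in
general.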
9f Multidimensional beta integrals of Dirichlet

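9f1 Proposition.
\[ \int \!\cdots\! \int_{\substack{x_1, \dots, x_n > 0, \\ x_1 + \dots + x_n < 1}} x_1^{p_1 - 1} \cdots x_n^{p_n - 1} \, dx_1 \dots dx_n
 = \frac{\Gamma(p_1) \cdots \Gamma(p_n)}{\Gamma(p_1 + \dots + p_n + 1)} \]
for all $p_1, \dots, p_n > 0$.
For the proof, we denote
\[ I(p_1, \dots, p_n) = \int \!\cdots\! \int_{\substack{x_1, \dots, x_n > 0, \\ x_1 + \dots + x_n < 1}} x_1^{p_1 - 1} \cdots x_n^{p_n - 1} \, dx_1 \dots dx_n . \]
This integral is improper, unless $p_1, \dots, p_n \ge 1$.
9f2 Lemma. $I(p_1, \dots, p_n) = B(p_n, p_1 + \dots + p_{n-1} + 1) \, I(p_1, \dots, p_{n-1})$.
Proof. The change of variables $\xi = ax$ (that is, $\xi_1 = ax_1, \dots, \xi_n = ax_n$) gives
(by Theorem 9d1$^1$)
\[ \int \!\cdots\! \int_{\substack{\xi_1, \dots, \xi_n > 0, \\ \xi_1 + \dots + \xi_n < a}} \xi_1^{p_1 - 1} \cdots \xi_n^{p_n - 1} \, d\xi_1 \dots d\xi_n
 = a^{p_1 + \dots + p_n} I(p_1, \dots, p_n) \quad\text{for } a > 0 . \]
Thus, using 9e6 and (9c10),
\begin{align*}
I(p_1, \dots, p_n) &= \int_0^1 dx_n \, x_n^{p_n - 1} \int \!\cdots\! \int_{\substack{x_1, \dots, x_{n-1} > 0, \\ x_1 + \dots + x_{n-1} < 1 - x_n}}
 x_1^{p_1 - 1} \cdots x_{n-1}^{p_{n-1} - 1} \, dx_1 \dots dx_{n-1} \\
&= \int_0^1 x_n^{p_n - 1} (1 - x_n)^{p_1 + \dots + p_{n-1}} I(p_1, \dots, p_{n-1}) \, dx_n \\
&= I(p_1, \dots, p_{n-1}) \, B(p_n, p_1 + \dots + p_{n-1} + 1) .
\end{align*}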
Proof of Prop. 9f1.
Induction in the dimension $n$. For $n = 1$ the formula is obvious:
\[ \int_0^1 x_1^{p_1 - 1} \, dx_1 = \frac{1}{p_1} = \frac{\Gamma(p_1)}{\Gamma(p_1 + 1)} . \]
From $n - 1$ to $n$: using 9f2 (and (9c11)),
\[ I(p_1, \dots, p_n) = \frac{\Gamma(p_n) \, \Gamma(p_1 + \dots + p_{n-1} + 1)}{\Gamma(p_1 + \dots + p_n + 1)} \cdot
 \frac{\Gamma(p_1) \cdots \Gamma(p_{n-1})}{\Gamma(p_1 + \dots + p_{n-1} + 1)}
 = \frac{\Gamma(p_1) \cdots \Gamma(p_n)}{\Gamma(p_1 + \dots + p_n + 1)} . \]
1
But a linear change of variables does not really need 9d1; it is a simple generalization
of 7c1 or even (4h5).
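Proposition 9f1 can be compared with a brute-force computation in dimension 2. The following sketch is an illustration under stated assumptions: the function names are hypothetical, the grid sum counts only grid cells lying entirely inside the triangle (so it slightly underestimates), and it is meant for exponents $p_1, p_2 \ge 1$, where the integrand is bounded.

```python
import math

def dirichlet_formula(ps):
    """Right-hand side of 9f1: Gamma(p_1)...Gamma(p_n) / Gamma(p_1 + ... + p_n + 1)."""
    return math.prod(math.gamma(p) for p in ps) / math.gamma(sum(ps) + 1.0)

def simplex_midpoint_2d(p1, p2, m=300):
    """Midpoint sum of x^(p1-1) y^(p2-1) over grid cells lying inside {x, y > 0, x + y < 1}."""
    h = 1.0 / m
    total = 0.0
    for i in range(m):
        x = (i + 0.5) * h
        for j in range(m - 1 - i):      # cells with i + j + 1 < m are inside the triangle
            y = (j + 0.5) * h
            total += x ** (p1 - 1.0) * y ** (p2 - 1.0)
    return total * h * h

print(simplex_midpoint_2d(2.0, 3.0), dirichlet_formula([2.0, 3.0]))
```

For $p_1 = \dots = p_n = 1$ the formula returns $1/n!$, the volume of the standard simplex.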
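A seemingly more general formula,
\[ \int \!\cdots\! \int_{\substack{x_1, \dots, x_n > 0, \\ x_1^{\gamma_1} + \dots + x_n^{\gamma_n} < 1}} x_1^{p_1 - 1} \cdots x_n^{p_n - 1} \, dx_1 \dots dx_n
 = \frac{1}{\gamma_1 \cdots \gamma_n} \cdot \frac{\Gamma(\frac{p_1}{\gamma_1}) \cdots \Gamma(\frac{p_n}{\gamma_n})}{\Gamma(\frac{p_1}{\gamma_1} + \dots + \frac{p_n}{\gamma_n} + 1)} , \]
results from 9f1 by the (nonlinear!) change of variables $y_j = x_j^{\gamma_j}$.
A special case: $p_1 = \dots = p_n = 1$, $\gamma_1 = \dots = \gamma_n = p$;
\[ \int \!\cdots\! \int_{\substack{x_1, \dots, x_n > 0, \\ x_1^p + \dots + x_n^p < 1}} dx_1 \dots dx_n = \frac{\Gamma^n(\frac1p)}{p^n \, \Gamma(\frac np + 1)} . \]
We've found the volume of the unit ball in the metric $l_p$:
\[ v(B_p(1)) = \frac{2^n \, \Gamma^n(\frac1p)}{p^n \, \Gamma(\frac np + 1)} . \]
If $p = 2$, the formula gives us (again; see (9c7)) the volume of the standard
unit ball:
\[ V_n = v(B_2(1)) = \frac{2 \pi^{n/2}}{n \, \Gamma(\frac n2)} . \]
We also see that the volume of the unit ball in the $l_1$-metric equals $\frac{2^n}{n!}$.
Question: what does the formula give in the $p \to \infty$ limit?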
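The volume formula for the $l_p$ unit ball is a one-liner to evaluate, which makes the special cases and the behaviour for large $p$ easy to explore. A minimal sketch (hypothetical function name):

```python
import math

def lp_ball_volume(n, p):
    """Volume of the unit ball of the l_p metric in R^n: 2^n Gamma(1/p)^n / (p^n Gamma(n/p + 1))."""
    return 2.0 ** n * math.gamma(1.0 / p) ** n / (p ** n * math.gamma(n / p + 1.0))

for p in (1.0, 2.0, 4.0, 16.0, 1e6):
    print(p, lp_ball_volume(3, p))
```

Numerically the values for large $p$ approach $2^n$, the volume of the cube $[-1, 1]^n$, which is the unit ball of the sup-norm; this is consistent with $\Gamma(\frac1p)/p = \Gamma(1 + \frac1p) \to 1$.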
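9f3 Exercise. Show that
\[ \int \!\cdots\! \int_{\substack{x_1 + \dots + x_n < 1 \\ x_1, \dots, x_n > 0}} \varphi(x_1 + \dots + x_n) \, dx_1 \dots dx_n
 = \frac{1}{(n - 1)!} \int_0^1 \varphi(s) \, s^{n-1} \, ds \]
for every "good" function $\varphi : [0, 1] \to \mathbb{R}$ and, more generally,
\[ \int \!\cdots\! \int_{\substack{x_1 + \dots + x_n < 1 \\ x_1, \dots, x_n > 0}} \varphi(x_1 + \dots + x_n) \, x_1^{p_1 - 1} \cdots x_n^{p_n - 1} \, dx_1 \dots dx_n
 = \frac{\Gamma(p_1) \cdots \Gamma(p_n)}{\Gamma(p_1 + \dots + p_n)} \int_0^1 \varphi(u) \, u^{p_1 + \dots + p_n - 1} \, du . \]
Hint: consider
\[ \int_0^1 ds \, \varphi'(s) \int \!\cdots\! \int_{\substack{x_1 + \dots + x_n < s \\ x_1, \dots, x_n > 0}} x_1^{p_1 - 1} \cdots x_n^{p_n - 1} \, dx_1 \dots dx_n . \]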
9g Non-positive (signed) integrands

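We define
\[ \int_G (g - h) = \int_G g - \int_G h \]
whenever $g, h : G \to [0, \infty)$ are continuous almost everywhere and $\int_G g < \infty$,
$\int_G h < \infty$; this definition is correct, that is,
\[ \int_G g_1 - \int_G h_1 = \int_G g_2 - \int_G h_2 \quad\text{whenever } g_1 - h_1 = g_2 - h_2 , \]
due to 9b9:
\begin{align*}
g_1 - h_1 = g_2 - h_2 &\implies g_1 + h_2 = g_2 + h_1 \implies \int_G (g_1 + h_2) = \int_G (g_2 + h_1) \\
&\implies \int_G g_1 + \int_G h_2 = \int_G g_2 + \int_G h_1 \implies \int_G g_1 - \int_G h_1 = \int_G g_2 - \int_G h_2 .
\end{align*}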
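9g1 Lemma. The following two conditions on a function $f : G \to \mathbb{R}$ continuous
almost everywhere are equivalent:
(a) there exist $g, h : G \to [0, \infty)$, continuous almost everywhere, such that
$\int_G g < \infty$, $\int_G h < \infty$ and $f = g - h$;
(b) $\int_G |f| < \infty$.
Proof. (a)$\Longrightarrow$(b): $\int_G |g - h| \le \int_G (g + h) = \int_G g + \int_G h < \infty$.
(b)$\Longrightarrow$(a): we introduce the positive part $f^+$ and the negative part $f^-$ of
$f$,
\[ \text{(9g2)} \qquad \begin{gathered}
 f^+(x) = \max(0, f(x)) , \quad f^-(x) = \max(0, -f(x)) ; \\
 f^- = (-f)^+ ; \quad f = f^+ - f^- ; \quad |f| = f^+ + f^- ;
\end{gathered} \]
they are continuous almost everywhere (think, why); $\int_G f^+ \le \int_G |f| < \infty$,
$\int_G f^- \le \int_G |f| < \infty$; and $f^+ - f^- = f$.
We summarize:
\[ \text{(9g3)} \qquad \int_G f = \int_G f^+ - \int_G f^- \]
whenever $f : G \to \mathbb{R}$ is continuous almost everywhere and such that $\int_G |f| < \infty$.
Such functions will be called improperly integrable$^1$ (on $G$).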
9g4 Exercise. Prove linearity: $\int_G cf = c \int_G f$ for $c \in \mathbb{R}$, and
$\int_G (f_1 + f_2) = \int_G f_1 + \int_G f_2$.
Similarly to Sect. 4e, a function $f : G \to \mathbb{R}$ continuous almost everywhere
will be called negligible if $\int_G |f| = 0$. Functions $f, g$ continuous almost everywhere
and such that $f - g$ is negligible will be called equivalent. The
equivalence class of $f$ will be denoted $[f]$.
Improperly integrable functions $f : G \to \mathbb{R}$ are a vector space. On this
space, the functional $f \mapsto \int_G |f|$ is a seminorm. The corresponding equivalence
classes are a normed space (therefore also a metric space). The integral
is a continuous linear functional on this space.
If $G$ is admissible, then the space of improperly integrable functions on
$G$ is embedded into the space of improperly integrable functions on $\mathbb{R}^n$ by
$f \mapsto f \cdot \mathbb{1}_G$.
9g5 Proposition (exhaustion). For open sets $G, G_1, G_2, \dots \subset \mathbb{R}^n$,
\[ G_k \uparrow G \implies \int_{G_k} f \to \int_G f \in \mathbb{R} \]
for all improperly integrable $f : G \to \mathbb{R}$.
9g6 Theorem (change of variables). Let $U, V \subset \mathbb{R}^n$ be open sets, $\varphi : U \to V$
a diffeomorphism, and $f : V \to \mathbb{R}$. Then
(a) ($f$ is continuous almost everywhere on $V$) $\iff$
($f \circ \varphi$ is continuous almost everywhere on $U$) $\iff$
($(f \circ \varphi)\,|\det D\varphi|$ is continuous almost everywhere on $U$);
(b) if they are continuous almost everywhere, then
\[ \int_V |f| = \int_U |(f \circ \varphi) \det D\varphi| \in [0, \infty] ; \]
(c) and if the integrals in (b) are finite, then
\[ \int_V f = \int_U (f \circ \varphi)\,|\det D\varphi| \in \mathbb{R} . \]
1
In one dimension they are usually called absolutely (improperly) integrable.
9g7 Exercise. Prove 9g5 and 9g6.
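9g8 Exercise. If $0 < t_0 < t_1 < \infty$, then the function $(x, t) \mapsto x^{t-1} e^{-x} \ln x$ is
improperly integrable on $(0, \infty) \times (t_0, t_1)$, and
\[ \int_{t_0}^{t_1} dt \int_0^\infty dx \, x^{t-1} e^{-x} \ln x = \Gamma(t_1) - \Gamma(t_0) . \]
Prove it.$^1$

9g9 Exercise. (a) The function $t \mapsto \int_0^\infty x^{t-1} e^{-x} \ln x \, dx$ is continuous on
$(0, \infty)$;
(b) the gamma function is continuously differentiable on $(0, \infty)$, and
\[ \Gamma'(t) = \int_0^\infty x^{t-1} e^{-x} \ln x \, dx \quad\text{for } 0 < t < \infty ; \]
(c) the gamma function is convex on $(0, \infty)$.
Prove it.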
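The derivative formula of 9g9(b) can be tested numerically before proving it. The sketch below is an illustration only: the function names are hypothetical, the integral is truncated at an arbitrarily chosen upper limit 60 and computed by a midpoint rule (reasonable for $t \ge 2$, where the integrand is bounded), and the comparison value is a central difference quotient of `math.gamma`.

```python
import math

def gamma_prime_integral(t, upper=60.0, steps=200_000):
    """Midpoint-rule approximation of the integral of x^(t-1) e^(-x) ln x over (0, upper)."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += x ** (t - 1.0) * math.exp(-x) * math.log(x)
    return total * h

def gamma_prime_difference(t, h=1e-5):
    """Central difference quotient of the gamma function at t."""
    return (math.gamma(t + h) - math.gamma(t - h)) / (2.0 * h)

for t in (2.0, 3.0):
    print(t, gamma_prime_integral(t), gamma_prime_difference(t))
```

Convexity, item (c), can be spot-checked in the same spirit, for example $\Gamma(2.5) \le \frac12(\Gamma(2) + \Gamma(3))$.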
Index
beta function, 115
change of variables, 116, 123
equivalent, 123
exhaustion, 113, 123
gamma function, 114
improper integral
  signed, 122, 123
  unsigned, 110
improperly integrable, 123
iterated improper integral, 119
linearity, 123
lower integral, 117
monotone convergence
  for integral, 118
  for volume, 113
negligible, 123
Poisson formula, 111
volume of ball, 114, 121
$B$, 115
$[f]$, 123
$f \cdot \mathbb{1}_G$, 110
$f^+$, $f^-$, 122
$\Gamma$, 114
1
Hint: apply 9e6 twice, to $f^+$ and $f^-$.
2b3 Theorem (implicit function). Let f Rnm Rm Rm be continuously dif-
Topological notions in Rn are insensitive to a change of basis. ferentiable near (0, 0), f (0, 0) = 0, and (Df )(0,0) = A = ( B C ), B Rnm Rm ,
Topological notions are well-defined in every n-dimensional vector space, and pre- C Rm Rm , with C invertible. Then there exists g Rnm Rm , continuously differ-
served by isomorphisms of these spaces. entiable near 0, such that the two relations f (x, y) = 0 and y = g(x) are equivalent for
(x, y) near (0, 0); and (Dg)0 = C 1 B.
1f18 Exercise. (a) Determinant is a continuous function A det A on L(Rn Rn );
(b) invertible operators are an open set; Similarly, (Dg)x = Cx1 Bx , where ( Bx Cx ) = Ax = (Df )(x,g(x)) , for all x near 0.
(c) the mapping A A1 is continuous on this open set.
2c3 Theorem. Let f Rn Rn be continuously differentiable near 0, f (0) = 0, and
1f19 Exercise. If A L(R R ) satisfies A < 1, then
n n (Df )0 = A Rn Rn be invertible. Then f is open at 0.
(a) the series id A + A2 A3 + . . . converges in L(Rn Rn );
2d5 Theorem. Let f Rnm Rm Rm and A, B, C be as in Th. 2b3. Then there
(b) the sum S of this series satisfies (id +A)S = id, S(id +A) = id; thus, id +A is exists g Rnm Rm Rm , continuously differentiable near (0, 0), such that the two
invertible;
relations f (x, y) = z and y = g(x, z) are equivalent for (x, y, z) near (0, 0, 0); and
(c) det(id +A) > 0.
(Dg)(0,0) = ( C 1 B C 1 ).
When differentiating a given mapping, we may choose at will a pair of bases. This applies to any pair of finite-dimensional vector spaces.

1f23 Exercise. If $f \in C^1(U \to \mathbb{R}^m)$ and $g \in C^1(\mathbb{R}^m \to \mathbb{R}^\ell)$, then $g \circ f \in C^1(U \to \mathbb{R}^\ell)$.

1f24 Exercise. A mapping $f$ is continuously differentiable if and only if all partial derivatives $D_i f_j$ exist and are continuous. (Here $f(x) = (f_1(x), \dots, f_m(x))$.)

1f25 Exercise. (a) If $f \in C^1(U)$ and $g \in C^1(U \to \mathbb{R}^m)$, then $fg \in C^1(U \to \mathbb{R}^m)$ (pointwise product).
(b) If $f, g \in C^1(U \to \mathbb{R}^m)$, then $\langle f(\cdot), g(\cdot) \rangle \in C^1(U)$ (scalar product).

1f27 Exercise. (a) Determinant is a continuously differentiable function $f : A \mapsto \det A$ on $L(\mathbb{R}^n \to \mathbb{R}^n)$;
(b) $(Df)_{\mathrm{id}}(H) = \operatorname{tr}(H)$ for all $H \in L(\mathbb{R}^n \to \mathbb{R}^n)$;
(c) $(D \log |f|)_A(H) = \operatorname{tr}(A^{-1}H)$ for all $H \in L(\mathbb{R}^n \to \mathbb{R}^n)$ and all invertible $A \in L(\mathbb{R}^n \to \mathbb{R}^n)$.

(1f31) $|f(b) - f(a)| \le C\,|b - a|$, $\quad C = \sup_{t \in (0,1)} \|(Df)_{a+t(b-a)}\|$ (finite increment theorem)

3a1 Theorem (Lagrange multipliers). Assume that $x_0 \in \mathbb{R}^n$, $1 \le m \le n - 1$, functions $f, g_1, \dots, g_m : \mathbb{R}^n \to \mathbb{R}$ are continuously differentiable near $x_0$, $g_1(x_0) = \dots = g_m(x_0) = 0$, and the vectors $\nabla g_1(x_0), \dots, \nabla g_m(x_0)$ are linearly independent. If $x_0$ is a local constrained extremum point of $f$ subject to $g_1(\cdot) = \dots = g_m(\cdot) = 0$, then there exist $\lambda_1, \dots, \lambda_m \in \mathbb{R}$ such that $\nabla f(x_0) = \lambda_1 \nabla g_1(x_0) + \dots + \lambda_m \nabla g_m(x_0)$.

  $g_1(x) = \dots = g_m(x) = 0$ ($m$ equations)                                              $\lambda_1, \dots, \lambda_m$ ($m$ variables)
  $\nabla f(x) = \lambda_1 \nabla g_1(x) + \dots + \lambda_m \nabla g_m(x)$ ($n$ equations)     $x$ ($n$ variables)

3a2 Theorem. Let $f : \mathbb{R}^n \to \mathbb{R}^m$ be continuously differentiable near 0, $f(0) = 0$, and $(Df)_0 = A : \mathbb{R}^n \to \mathbb{R}^m$ be onto. Then $f$ is open at 0.

\[ M_p(x_1, \dots, x_n) = \Big( \frac{x_1^p + \dots + x_n^p}{n} \Big)^{1/p} \ \text{for } x_k > 0 ; \qquad M_p \le M_q \ \text{for } p \le q . \]

The system of $m + n$ equations proposed in Sect. 3a is only one way of finding local constrained extrema, and not necessarily the simplest one.
No need to find $\nabla f$ when $f(\cdot) = \varphi(g(\cdot))$; find $\nabla g$, and note that $\nabla f$ is collinear to $\nabla g$.
If the Lagrange method does not solve a problem completely, it may still give useful information. Combine it with other methods as needed.
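The inequality $M_p \le M_q$ for $p \le q$ between power means can be spot-checked numerically. The sketch below is a sanity check under assumptions made here, not a proof: the sample points and the list of exponents are arbitrary, and the exponent $p = 0$ is excluded because the formula as written does not cover it.

```python
def power_mean(xs, p):
    """M_p(x_1,...,x_n) = ((x_1^p + ... + x_n^p) / n)^(1/p) for positive x_k and p != 0."""
    n = len(xs)
    return (sum(x ** p for x in xs) / n) ** (1.0 / p)

xs = [0.5, 1.0, 2.0, 4.0]                        # illustrative positive numbers
exponents = [-2.0, -1.0, 0.5, 1.0, 2.0, 3.0]     # increasing, p = 0 excluded
means = [power_mean(xs, p) for p in exponents]

# M_p should be nondecreasing in p (small tolerance for rounding).
is_monotone = all(a <= b + 1e-12 for a, b in zip(means, means[1:]))
```

Here `power_mean(xs, 1.0)` is the arithmetic mean and `power_mean(xs, -1.0)` the harmonic mean, so the check includes the classical harmonic-arithmetic comparison.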
3d1 Proposition (singular value decomposition). Every linear operator from one finite-dimensional Euclidean vector space to another sends some orthonormal basis of the first space into an orthogonal system in the second space.

3d2 Proposition. Every linear operator from an $n$-dimensional Euclidean vector space to an $m$-dimensional Euclidean vector space has a diagonal $m \times n$ matrix in some pair of orthonormal bases.

3d3 Proposition. Every finite-dimensional vector space endowed with two Euclidean metrics contains a basis orthonormal in the first metric and orthogonal in the second metric.

(3e1) $\displaystyle \sup_{Z_c} f = \sup_{Z_0} f + \lambda_1(0) c_1 + \dots + \lambda_m(0) c_m + o(|c|)$.

2a5 Exercise. For a linear $A : \mathbb{R}^n \to \mathbb{R}^m$ the following conditions are equivalent:
(a) $A$ is invertible;
(b) $A$ is a homeomorphism;
(c) $A$ is a local homeomorphism;
(d) $A$ is a diffeomorphism;
(e) $A$ is a local diffeomorphism.

2a9 Exercise. For a linear $A : \mathbb{R}^n \to \mathbb{R}^m$ the following conditions are equivalent:
(a) $A(\mathbb{R}^n) = \mathbb{R}^m$ (onto);
(b) $A$ is open at 0;
(c) $A$ is open.

2b1 Theorem (inverse function). Let $f : \mathbb{R}^n \to \mathbb{R}^n$ be continuously differentiable near 0, $f(0) = 0$, and $(Df)_0 = A : \mathbb{R}^n \to \mathbb{R}^n$ be invertible. Then $f$ is a local diffeomorphism, and $(D(f^{-1}))_0 = A^{-1}$.

Similarly, $(D(f^{-1}))_{f(x)} = ((Df)_x)^{-1}$ for all $x$ near 0.
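3f1 Theorem. The following conditions on a set $M \subset \mathbb{R}^n$, a point $x_0 \in M$ and a number $k \in \{1, 2, \dots, n - 1\}$ are equivalent:
(a) there exists a mapping $f : \mathbb{R}^n \to \mathbb{R}^{n-k}$, continuously differentiable near $x_0$, such that $(Df)_{x_0} = A : \mathbb{R}^n \to \mathbb{R}^{n-k}$ is onto, and
\[ x \in M \iff f(x) = f(x_0) \quad \text{for all } x \text{ near } x_0 ; \]
(b) there exists a local diffeomorphism $\varphi$ near $x_0$ such that
\[ x \in M \iff \varphi(x) \in \mathbb{R}^k \times \{0_{n-k}\} \quad \text{for all } x \text{ near } x_0 ; \]
(c) there exists a permutation $(i_1, \dots, i_n)$ of $\{1, \dots, n\}$ and a mapping $g : \mathbb{R}^k \to \mathbb{R}^{n-k}$, continuously differentiable near $(x_{0,i_1}, \dots, x_{0,i_k})$, such that
\[ x \in M \iff g(x_{i_1}, \dots, x_{i_k}) = (x_{i_{k+1}}, \dots, x_{i_n}) \quad \text{for all } x \text{ near } x_0 . \]

A nonempty set $M \subset \mathbb{R}^n$ is a $k$-dimensional manifold, if the equivalent conditions 3f1(a,b,c) hold for every $x_0 \in M$.

3f3 Exercise. Let $\varphi : \mathbb{R}^n \to \mathbb{R}^n$ be a diffeomorphism, and $M \subset \mathbb{R}^n$.
(a) If $M$ is a $k$-manifold near $x_0$, then its image $\varphi(M)$ is a $k$-manifold near $\varphi(x_0)$;
(b) $M$ is a $k$-manifold if and only if $\varphi(M)$ is a $k$-manifold.
This applies, in particular, to shifts, rotations, and all invertible affine transformations of $\mathbb{R}^n$.

4c7 Remark. Here ${}_*\!\!\int$ and ${}^*\!\!\int$ denote the lower and the upper integral.
Monotonicity: if $f(\cdot) \le g(\cdot)$ then ${}_*\!\!\int f \le {}_*\!\!\int g$, ${}^*\!\!\int f \le {}^*\!\!\int g$, and for integrable $f, g$, $\int f \le \int g$.
Homogeneity: ${}_*\!\!\int cf = c\,{}_*\!\!\int f$, ${}^*\!\!\int cf = c\,{}^*\!\!\int f$ for $c \ge 0$;
${}_*\!\!\int cf = c\,{}^*\!\!\int f$, ${}^*\!\!\int cf = c\,{}_*\!\!\int f$ for $c \le 0$;
if $f$ is integrable then $cf$ is, and $\int cf = c \int f$ for all $c \in \mathbb{R}$.
(Sub-, super-) additivity: ${}^*\!\!\int (f + g) \le {}^*\!\!\int f + {}^*\!\!\int g$; ${}_*\!\!\int (f + g) \ge {}_*\!\!\int f + {}_*\!\!\int g$;
if $f, g$ are integrable then $f + g$ is, and $\int (f + g) = \int f + \int g$.

\[ v(E) = \int_{\mathbb{R}^n} \mathbb{1}_E ; \qquad \int_E f = \int_{\mathbb{R}^n} f\,\mathbb{1}_E . \]

(4d6) $\displaystyle \int_E 1 = v(E)$ ; $\displaystyle \int_E c = c\,v(E)$ for $c \in \mathbb{R}$ ;

(4d7) $\displaystyle v(E) \inf_{x \in E} f(x) \le \int_E f \le v(E) \sup_{x \in E} f(x)$ ;

(4d8) $\displaystyle v(E) = 0 \implies \int_E f = 0$ .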
$\displaystyle \frac{1}{v(E)} \int_E f$ is the mean value of $f$ on $E$.

$\displaystyle \rho([f], [g]) = \|[f] - [g]\| = \int_B |f - g|$ ; the integral metric.

We may safely ignore values of integrands on sets of volume zero (as far as they are bounded). Likewise we may ignore sets of volume zero when dealing with volume.

The set of all (equivalence classes of) integrable functions is closed (in the integral metric).

4f1 Exercise. (a) Every continuous $f : \mathbb{R}^n \to \mathbb{R}$ with bounded support is integrable;
(b) every continuous function on a box is integrable on this box.

4f3 Proposition. Step functions are dense among integrable functions (in the integral metric).

4f5 Remark. The set of all (equivalence classes of) integrable functions is the closure of the set of all (equivalence classes of) step functions (in the integral metric).

4f7 Corollary. The set of all (equivalence classes of) integrable functions is the closure of the set of all (equivalence classes of) continuous functions with bounded support (in the integral metric).

4f9 Corollary. The (pointwise) product of two integrable functions is integrable.

4f14 Proposition. If $E, F \subset \mathbb{R}^n$ are admissible sets, then the sets $E \cap F$, $E \cup F$ and $E \setminus F$ are admissible.

4f16 Proposition. (a) A function integrable on $\mathbb{R}^n$ is integrable on every admissible set;
(b) a function integrable on an admissible set is integrable on every admissible subset of the given set.

(4a2) $S(E \uplus F) = S(E) + S(F)$

(4a3) $\displaystyle \operatorname{vol}(E) \inf_{x \in E} f(x) \le S(E) \le \operatorname{vol}(E) \sup_{x \in E} f(x)$

(4b1) $f$ is bounded; that is, $\displaystyle \sup_{x \in \mathbb{R}^n} |f(x)| < \infty$,

(4b2) $f$ has bounded support; that is, $\displaystyle \sup_{x : f(x) \ne 0} |x| < \infty$.

(4b3) $\displaystyle L_N(f) = \sum_{k \in \mathbb{Z}^n} L_{N,k}(f)$ , $\quad \displaystyle L_{N,k}(f) = 2^{-nN} \inf_{x \in 2^{-N}(Q+k)} f(x)$ ,

(4b4) $\displaystyle U_N(f) = \sum_{k \in \mathbb{Z}^n} U_{N,k}(f)$ , $\quad \displaystyle U_{N,k}(f) = 2^{-nN} \sup_{x \in 2^{-N}(Q+k)} f(x)$ ;

here $Q = [0, 1]^n$. Clearly, $L_N(f) \le U_N(f)$ and $L_N(-f) = -U_N(f)$.

4b5 Lemma. For every $N$, $L_{N+1}(f) \ge L_N(f)$ , $U_{N+1}(f) \le U_N(f)$.

\[ L(f) = \lim_{N \to \infty} L_N(f) , \qquad U(f) = \lim_{N \to \infty} U_N(f) . \]

Clearly, $-\infty < L(f) \le U(f) < \infty$.

4c6 Proposition (linearity). All integrable functions $\mathbb{R}^n \to \mathbb{R}$ are a vector space, and the integral is a linear functional on this space.
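The dyadic sums $L_N(f)$ and $U_N(f)$ of (4b3) and (4b4), and the monotonicity stated in Lemma 4b5, can be illustrated in dimension $n = 1$. The sketch below rests on assumptions made here: the test function $x(1-x)$ on $[0,1]$ (zero elsewhere) is an arbitrary choice, and its concavity is used to get the exact infimum and supremum on each dyadic interval.

```python
def f(x):
    """Illustrative integrand: x(1-x) on [0,1], zero elsewhere (continuous, bounded support)."""
    return x * (1.0 - x) if 0.0 <= x <= 1.0 else 0.0

def inf_sup_on_interval(a, b):
    """Exact inf and sup of f on [a,b] inside [0,1]; f is concave there with maximum at 1/2."""
    lo = min(f(a), f(b))
    hi = f(0.5) if a <= 0.5 <= b else max(f(a), f(b))
    return lo, hi

def dyadic_sums(N):
    """L_N(f) and U_N(f) for n = 1; dyadic intervals outside [0,1] contribute zero."""
    h = 2.0 ** (-N)
    lower = upper = 0.0
    for k in range(2 ** N):
        lo, hi = inf_sup_on_interval(k * h, (k + 1) * h)
        lower += h * lo
        upper += h * hi
    return lower, upper

sums = [dyadic_sums(N) for N in range(0, 11)]   # pairs (L_N, U_N) for N = 0..10
```

For this function the lower sums increase and the upper sums decrease with $N$, and both approach the integral $1/6$.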
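4g6 Proposition. For every bounded $f : \mathbb{R}^n \to \mathbb{R}$ with bounded support,
\[ {}_*\!\!\int_{\mathbb{R}^n} f = \sup \Big\{ \int_{\mathbb{R}^n} g : \text{step } g \le f \Big\} , \qquad {}^*\!\!\int_{\mathbb{R}^n} f = \inf \Big\{ \int_{\mathbb{R}^n} h : \text{step } h \ge f \Big\} . \]

4g7 Corollary. For every bounded $f : \mathbb{R}^n \to \mathbb{R}$ with bounded support,
\[ {}_*\!\!\int_{\mathbb{R}^n} f = \sup \Big\{ \int_{\mathbb{R}^n} g : \text{integrable } g \le f \Big\} , \qquad {}^*\!\!\int_{\mathbb{R}^n} f = \inf \Big\{ \int_{\mathbb{R}^n} h : \text{integrable } h \ge f \Big\} . \]

4g8 Corollary. A function $f : \mathbb{R}^n \to \mathbb{R}$ is integrable if and only if for every $\varepsilon > 0$ there exist step functions $g$ and $h$ such that $g \le f \le h$ and $\int_{\mathbb{R}^n} h - \int_{\mathbb{R}^n} g \le \varepsilon$.

4g9 Exercise. (c) for every integrable function $f : \mathbb{R}^n \to \mathbb{R}$ and $\varepsilon > 0$ there exist continuous functions $g$ and $h$ with bounded support such that $g \le f \le h$ and $\int_{\mathbb{R}^n} h - \int_{\mathbb{R}^n} g \le \varepsilon$.

5c6 Exercise. Consider a function $f : \mathbb{R}^2 \to \mathbb{R}$ of the form $f(x, y) = g(x) h(y)$ where $g, h : \mathbb{R} \to \mathbb{R}$ are bounded functions with bounded support.
(a) If $g$ is negligible, then $f$ is negligible.
(b) Integrability of $f$ does not imply that the set $\{x : f(x, \cdot) \text{ is not integrable}\}$ is of volume zero.

5d1 Theorem. If a function $f : \mathbb{R}^{m+n} \to \mathbb{R}$ is integrable, then the iterated integrals
\[ \int_{\mathbb{R}^m} dx \; {}_*\!\!\int_{\mathbb{R}^n} dy \, f(x, y) , \qquad \int_{\mathbb{R}^m} dx \; {}^*\!\!\int_{\mathbb{R}^n} dy \, f(x, y) , \]
\[ \int_{\mathbb{R}^n} dy \; {}_*\!\!\int_{\mathbb{R}^m} dx \, f(x, y) , \qquad \int_{\mathbb{R}^n} dy \; {}^*\!\!\int_{\mathbb{R}^m} dx \, f(x, y) \]
are well-defined and equal to $\iint_{\mathbb{R}^{m+n}} f(x, y) \, dx\,dy$.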
Clarification. The claim that $\int dx \; {}_*\!\!\int dy \, f(x, y)$ is well-defined means that the function $x \mapsto {}_*\!\!\int dy \, f(x, y)$ is integrable.

The equality
\[ \int \Big( x \mapsto {}_*\!\!\int f(x, \cdot) \Big) = \int \Big( x \mapsto {}^*\!\!\int f(x, \cdot) \Big) \]
implies integrability (with the same integral) of every function sandwiched between the lower and upper integrals. It is convenient to interpret $x \mapsto \int f(x, \cdot)$ as any such function and write, as before,
\[ \int_{\mathbb{R}^{m+n}} f = \int_{\mathbb{R}^m} \Big( x \mapsto \int_{\mathbb{R}^n} f(x, \cdot) \Big) \]
and
\[ \int dx \int dy \, f(x, y) = \iint f(x, y) \, dx\,dy = \int dy \int dx \, f(x, y) \]
even though $f_x$ may be non-integrable for some $x$.

5d3 Exercise. 5b2 generalizes to integrable functions
(a) assuming integrability of the function $(x, y) \mapsto f(x) g(y)$,
(b) deducing integrability of this function from integrability of $f$ and $g$ (via sandwich).

4g10 Exercise. (b) additivity of the upper integral: ${}^*\!\!\int_{E \uplus F} f = {}^*\!\!\int_E f + {}^*\!\!\int_F f$, and the same for the lower integral.
(c) (4d7) holds for lower and upper integrals.

4h1 Proposition. $f(\cdot + a)$ is integrable if and only if $f$ is integrable, and in this case $\int_{\mathbb{R}^n} f(\cdot + a) = \int_{\mathbb{R}^n} f$.

4h2 Corollary. For every set $E \subset \mathbb{R}^n$ and vector $a \in \mathbb{R}^n$, the shifted set $E + a$ is admissible if and only if $E$ is admissible, and in this case $v(E + a) = v(E)$.

(4h5) $\displaystyle |a|^n \int_{\mathbb{R}^n} f(ax) \, dx = \int_{\mathbb{R}^n} f$ ,

(4h6) $v(aE) = |a|^n v(E)$ .

4h7 Proposition. For every integrable $f : \mathbb{R}^n \to \mathbb{R}$ and $\varepsilon > 0$ there exists $\delta > 0$ such that for all $a \in \mathbb{R}^n$
\[ |a| \le \delta \implies \int |f(\cdot + a) - f| \le \varepsilon . \]

4i1 Proposition. If a function $f : \mathbb{R}^n \to [0, \infty)$ is integrable, then the set $E = \{(x, t) : 0 < t < f(x)\} \subset \mathbb{R}^n \times \mathbb{R}$ is admissible, and $v_{n+1}(E) = \int_{\mathbb{R}^n} f$.
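4i2 Corollary. If functions $f, g : \mathbb{R}^n \to \mathbb{R}$ are integrable, then the set $E = \{(x, t) : f(x) < t < g(x)\} \subset \mathbb{R}^n \times \mathbb{R}$ is admissible.

4i3 Exercise. For $f$ as in 4i1, the set $\{(x, t) : t = f(x) > 0\} \subset \mathbb{R}^n \times \mathbb{R}$ is of volume zero.

(5b1) $\displaystyle \int_{\mathbb{R}^n} \Big( y \mapsto \int_{\mathbb{R}^m} f(\cdot, y) \Big) = \int_{\mathbb{R}^{m+n}} f = \int_{\mathbb{R}^m} \Big( x \mapsto \int_{\mathbb{R}^n} f(x, \cdot) \Big)$
for every step function $f : \mathbb{R}^{m+n} \to \mathbb{R}$. The same holds for every continuous $f$ with bounded support.

5b2 Exercise.
\[ \int_{\mathbb{R}^{m+n}} f(x_1, \dots, x_m) \, g(y_1, \dots, y_n) \, dx_1 \dots dx_m \, dy_1 \dots dy_n = \Big( \int_{\mathbb{R}^m} f(x_1, \dots, x_m) \, dx_1 \dots dx_m \Big) \Big( \int_{\mathbb{R}^n} g(y_1, \dots, y_n) \, dy_1 \dots dy_n \Big) \]
for continuous functions $f : \mathbb{R}^m \to \mathbb{R}$, $g : \mathbb{R}^n \to \mathbb{R}$ with bounded support.

5e1 Exercise. If $E_1 \subset \mathbb{R}^m$ and $E_2 \subset \mathbb{R}^n$ are admissible sets then the set $E = E_1 \times E_2 \subset \mathbb{R}^{m+n}$ is admissible.

5e2 Corollary. Let $f : \mathbb{R}^{m+n} \to \mathbb{R}$ be integrable on every box, and $E \subset \mathbb{R}^{m+n}$ an admissible set; then
\[ \int_E f = \int_{\mathbb{R}^m} \Big( x \mapsto \int_{E_x} f_x \Big) \quad \text{where } E_x = \{y : (x, y) \in E\} \subset \mathbb{R}^n \text{ for } x \in \mathbb{R}^m . \]

(5e3) $\displaystyle v_{m+n}(E) = \int_{\mathbb{R}^m} v_n(E_x) \, dx$ where $v_k$ is the volume in $\mathbb{R}^k$;
for instance, the volume of a 3-dimensional geometric body is the 1-dimensional integral of the area of the 2-dimensional section of the body.

5e7 Exercise. For $f$, $g$ and $E$ as in 4i2
(a) $v_{n+1}(E) = \int_{\mathbb{R}^n} (g - f)^+$ ;
(b) $\displaystyle \int_E h = \int_{\mathbb{R}^n} dx \, \mathbb{1}_{f<g}(x) \int_{f(x)}^{g(x)} dt \, h(x, t)$ for every $h : E \to \mathbb{R}$ integrable on $E$.

5e8 Remark. Here $\mathbb{1}_{f<g}$ is the indicator of the set $\{x : f(x) < g(x)\}$. This set need not be admissible. Nevertheless, the iterated integral is well-defined (according to the clarifications. . . ).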
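Formula (5e3), the volume as an integral of section volumes, can be illustrated on the unit ball in $\mathbb{R}^3$, whose section at first coordinate $x$ is a disk of area $\pi(1 - x^2)$. The sketch below is an illustration under assumptions made here: the midpoint rule and the step count are arbitrary choices, and the known value $4\pi/3$ is used only for comparison.

```python
import math

def section_area(x):
    """Area v_2(E_x) of the section of the unit ball in R^3 at first coordinate x."""
    return math.pi * (1.0 - x * x) if abs(x) < 1.0 else 0.0

def midpoint_integral(g, a, b, steps=20_000):
    """Midpoint-rule approximation of the one-dimensional integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))

# v_3(E) as the 1-dimensional integral of the areas of the 2-dimensional sections.
ball_volume = midpoint_integral(section_area, -1.0, 1.0)
```

The result agrees with $4\pi/3$ to well within the accuracy of the midpoint rule.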
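The notion "admissible set" is insensitive to a change of basis. This notion is well-defined in every $n$-dimensional vector space, and preserved by isomorphisms of these spaces.
The same holds for the notion "volume 0".

7b1 Proposition. If a linear operator $A : \mathbb{R}^n \to \mathbb{R}^n$ preserves the Euclidean metric, then it preserves volume.

Volume is insensitive to a change of orthonormal basis. It is well-defined in every $n$-dimensional Euclidean space, and preserved by isomorphisms of these spaces.

7b3 Theorem. Let $A : \mathbb{R}^n \to \mathbb{R}^n$ be an invertible linear operator. Then, for every admissible $E \subset \mathbb{R}^n$,
\[ v(A(E)) = |\det A| \, v(E) . \]

On an $n$-dimensional vector space the volume is ill defined, but admissibility is well defined, and the ratio $\frac{v(E_1)}{v(E_2)}$ of volumes is well defined. That is, the volume is well defined up to a coefficient.

7c1 Theorem. Let $A : \mathbb{R}^n \to \mathbb{R}^n$ be an invertible linear operator. Then, for every bounded function $f : \mathbb{R}^n \to \mathbb{R}$ with bounded support,
\[ |\det A| \; {}_*\!\!\int f \circ A = {}_*\!\!\int f \quad \text{and} \quad |\det A| \; {}^*\!\!\int f \circ A = {}^*\!\!\int f . \]
Thus, $f \circ A$ is integrable if and only if $f$ is integrable, and in this case
\[ |\det A| \int f \circ A = \int f . \]

8a1 Theorem. Let $U, V \subset \mathbb{R}^n$ be admissible open sets, $\varphi : U \to V$ a diffeomorphism, and $f : V \to \mathbb{R}$ a bounded function such that the function $(f \circ \varphi) \, |\det D\varphi| : U \to \mathbb{R}$ is also bounded. Then
(a) ($f$ is integrable on $V$) $\iff$ ($f \circ \varphi$ is integrable on $U$) $\iff$ ($(f \circ \varphi) \, |\det D\varphi|$ is integrable on $U$);
(b) if they are integrable, then $\displaystyle \int_V f = \int_U (f \circ \varphi) \, |\det D\varphi|$.

5e14 Exercise. Let $f : \mathbb{R} \to \mathbb{R}$ be a continuous function; then
\[ \int_0^x dx_1 \int_0^{x_1} dx_2 \dots \int_0^{x_{n-1}} dx_n \, f(x_n) = \int_0^x f(t) \, \frac{(x - t)^{n-1}}{(n - 1)!} \, dt . \]

5e18 Exercise. Suppose the function $f$ depends only on the first coordinate. Then
\[ \int_V f(x_1) \, dx = v_{n-1} \int_{-1}^{1} f(x_1) (1 - x_1^2)^{(n-1)/2} \, dx_1 , \]
where $V$ is the unit ball in $\mathbb{R}^n$, and $v_{n-1}$ is the volume of the unit ball in $\mathbb{R}^{n-1}$.

(6b1) $\displaystyle \operatorname{Osc}_f(x_0) = \inf_{r > 0} \operatorname{Osc}_f(\{x : |x - x_0| < r\})$ ,
where $\operatorname{Osc}_f(U) = \operatorname{diam} f(U) = \sup_{x \in U} f(x) - \inf_{x \in U} f(x)$.

6b2 Theorem.
\[ {}^*\!\!\int_{\mathbb{R}^n} f - {}_*\!\!\int_{\mathbb{R}^n} f = {}^*\!\!\int_{\mathbb{R}^n} \operatorname{Osc}_f . \]

6b5 Lemma (Lebesgue's covering number). Let $K \subset \mathbb{R}^n$ be a compact set, $U_1, \dots, U_m \subset \mathbb{R}^n$ open sets, and $K \subset U_1 \cup \dots \cup U_m$. Then
\[ \exists \delta > 0 \;\; \forall x \in K \;\; \exists i \in \{1, \dots, m\} \;\; \forall y \;\; \big( |y - x| < \delta \implies y \in U_i \big) . \]

6b7 Corollary. A bounded function $f : \mathbb{R}^n \to \mathbb{R}$ with bounded support is integrable if and only if $\operatorname{Osc}_f$ is negligible.

6b8 Exercise. For a set $E \subset \mathbb{R}^n$,
(a) $\operatorname{Osc}_{\mathbb{1}_E} = \mathbb{1}_{\partial E}$ ;
(b) $E$ is admissible if and only if $\partial E$ has volume 0;
(c) $v^*(E) - v_*(E) = v^*(\partial E)$ ;
(d) if $E$ is admissible, then $E^\circ$ and $\overline{E}$ are admissible, and $v(E^\circ) = v(E) = v(\overline{E})$.

6b9 Exercise. For sets $E, F \subset \mathbb{R}^n$,
(a) $\partial(E \cup F) \subset \partial E \cup \partial F$, $\partial(E \cap F) \subset \partial E \cup \partial F$, $\partial(E \setminus F) \subset \partial E \cup \partial F$.

(6b12) ($f$ is integrable on $E$) $\iff$ ($\operatorname{Osc}_f$ is negligible on $E^\circ$) .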
6c2 Proposition. Countable union of sets of measure 0 has measure 0.

6c3 Proposition. A compact set has measure 0 if and only if it has volume 0.

6c4 Exercise. (a) If $Z$ has measure 0, then $Z^\circ = \emptyset$, and $v_*(Z) = 0$.

6d2 Theorem (Lebesgue's criterion). A bounded function $f : \mathbb{R}^n \to \mathbb{R}$ with bounded support is integrable if and only if it is continuous almost everywhere.

6d3 Lemma. Let $f : \mathbb{R}^n \to \mathbb{R}$ be a bounded function with bounded support. If $f$ is negligible then $f(\cdot) = 0$ almost everywhere.

6d4 Lemma. The set $\{x : \operatorname{Osc}_f(x) \ge \varepsilon\}$ is compact, for every $\varepsilon > 0$.

7a1 Proposition. Let $A : \mathbb{R}^n \to \mathbb{R}^n$ be an invertible linear operator. Then, for every $E \subset \mathbb{R}^n$,
\[ A(E) \text{ is admissible} \iff E \text{ is admissible} . \]

7a2 Lemma. Let $A : \mathbb{R}^n \to \mathbb{R}^n$ be a linear operator. Then, for every set $Z \subset \mathbb{R}^n$ of volume 0, the set $A(Z)$ has volume 0.

8b2 Exercise (polar coordinates in $\mathbb{R}^2$). (a)
\[ \iint_{x^2 + y^2 < R^2} f(x, y) \, dx\,dy = \iint_{0 < r < R, \; 0 < \theta < 2\pi} f(r \cos\theta, r \sin\theta) \, r \, dr\,d\theta \]
for every integrable function $f$ on the disk $x^2 + y^2 < R^2$.

8b3 Exercise (spherical coord. in $\mathbb{R}^3$). $\Psi(r, \varphi, \theta) = (r \cos\varphi \sin\theta, \; r \sin\varphi \sin\theta, \; r \cos\theta)$;
(c) $|\det D\Psi| = r^2 \sin\theta$.

8b9 Proposition (the second Pappus's centroid theorem). Let $\Omega \subset (0, \infty) \times \mathbb{R} \subset \mathbb{R}^2$ be an admissible set and $\tilde\Omega = \{(x, y, z) : (\sqrt{x^2 + y^2}, z) \in \Omega\} \subset \mathbb{R}^3$. Then $\tilde\Omega$ is admissible, and $v_3(\tilde\Omega) = v_2(\Omega) \cdot 2\pi x_C$; here $C = (x_C, z_C)$ is the centroid of $\Omega$.

8c1 Proposition. Let $U, V \subset \mathbb{R}^n$ be open sets, and $\varphi : U \to V$ a diffeomorphism. Then, for every set $Z \subset U$,
\[ (Z \text{ has measure } 0) \iff (\varphi(Z) \text{ has measure } 0) . \]
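The polar-coordinate formula of Exercise 8b2 can be compared numerically with a direct sum over a Cartesian grid. The sketch below is a rough check under assumptions made here: the integrand $x^2 + y^2$, the radius $R = 1$, the grid sizes and the function names are all illustrative choices, and the Cartesian sum is only accurate up to an error from cells near the boundary of the disk.

```python
import math

def f(x, y):
    """Illustrative integrand (an assumption made for this check)."""
    return x * x + y * y

def cartesian_integral(R=1.0, n=400):
    """Midpoint sum of f over grid cells whose centers lie in the disk x^2 + y^2 < R^2."""
    h = 2.0 * R / n
    total = 0.0
    for i in range(n):
        x = -R + (i + 0.5) * h
        for j in range(n):
            y = -R + (j + 0.5) * h
            if x * x + y * y < R * R:
                total += f(x, y)
    return total * h * h

def polar_integral(R=1.0, nr=200, nt=200):
    """Midpoint sum of f(r cos t, r sin t) * r over 0 < r < R, 0 < t < 2*pi."""
    hr, ht = R / nr, 2.0 * math.pi / nt
    total = 0.0
    for i in range(nr):
        r = (i + 0.5) * hr
        for j in range(nt):
            t = (j + 0.5) * ht
            total += f(r * math.cos(t), r * math.sin(t)) * r   # note the Jacobian factor r
    return total * hr * ht

lhs = cartesian_integral()   # left-hand side of 8b2(a)
rhs = polar_integral()       # right-hand side of 8b2(a)
```

For this integrand both sides should be close to $\pi/2$, since $\int_0^{2\pi} \int_0^1 r^3 \, dr\,d\theta = \pi/2$.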
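8c4 Lemma. Let $E \subset \mathbb{R}^n$ be an admissible set, and $f : E \to \mathbb{R}$ a bounded function. Then $f$ is integrable on $E$ if and only if the discontinuity points of $f$ on $E$ are a set of measure 0.

8c5 Corollary. A set $E \Subset U$ is admissible if and only if $\varphi(E) \Subset V$ is admissible.

(9b1) $\displaystyle \int_G f = \sup \Big\{ \int_{\mathbb{R}^n} g \; : \; g : \mathbb{R}^n \to \mathbb{R} \text{ integrable}, \; 0 \le g \le f \text{ on } G, \; g = 0 \text{ on } \mathbb{R}^n \setminus G \Big\}$ .

(9b4) $\displaystyle \int_{-\infty}^{+\infty} e^{-x^2} \, dx = \sqrt{\pi}$ . Poisson formula

9b9 Proposition. $\int_G (f_1 + f_2) = \int_G f_1 + \int_G f_2 \in [0, \infty]$ for all $f_1, f_2 \ge 0$ on $G$, continuous almost everywhere.

9b10 Proposition (exhaustion). For open sets $G, G_1, G_2, \dots \subset \mathbb{R}^n$,
\[ G_k \uparrow G \implies \int_{G_k} f \uparrow \int_G f \in [0, \infty] \]
for all $f : G \to [0, \infty)$ continuous almost everywhere.

9b11 Corollary (monotone convergence for volume). For open sets $G, G_1, G_2, \dots \subset \mathbb{R}^n$,
\[ G_k \uparrow G \implies v_*(G_k) \uparrow v_*(G) . \]

9d1 Theorem (change of variables). Let $U, V \subset \mathbb{R}^n$ be open sets, $\varphi : U \to V$ a diffeomorphism, and $f : V \to [0, \infty)$. Then
(a) ($f$ is continuous almost everywhere on $V$) $\iff$ ($f \circ \varphi$ is continuous almost everywhere on $U$) $\iff$ ($(f \circ \varphi) \, |\det D\varphi|$ is continuous almost everywhere on $U$);
(b) if they are, then $\displaystyle \int_V f = \int_U (f \circ \varphi) \, |\det D\varphi| \in [0, \infty]$.

(9e1) $\displaystyle {}_*\!\!\int_G f = \sup \Big\{ \int_{\mathbb{R}^n} g \; : \; g : \mathbb{R}^n \to \mathbb{R} \text{ integrable}, \; 0 \le g \le f \text{ on } G, \; g = 0 \text{ on } \mathbb{R}^n \setminus G \Big\}$ .

In particular, if $f$ is continuous almost everywhere on $G$, then ${}_*\!\!\int_G f = \int_G f$.

9e4 Proposition. If $g_1, g_2, \dots : \mathbb{R}^n \to [0, \infty)$ are integrable and $g_k \uparrow f : \mathbb{R}^n \to [0, \infty)$, then $\int_{\mathbb{R}^n} g_k \uparrow {}_*\!\!\int_{\mathbb{R}^n} f$.

9e5 Theorem (monotone convergence for Riemann integral). If $g, g_1, g_2, \dots : \mathbb{R}^n \to \mathbb{R}$ are integrable and $g_k \uparrow g$, then $\int_{\mathbb{R}^n} g_k \uparrow \int_{\mathbb{R}^n} g$.

9e6 Theorem (iterated improper integral). If a function $f : G \to [0, \infty)$ is continuous almost everywhere, then
\[ {}_*\!\!\int_{\mathbb{R}^m} dx \; {}_*\!\!\int_{G_x} dy \, f(x, y) = \iint_G f(x, y) \, dx\,dy \in [0, \infty] . \]

9e7 Corollary. The volume of an open set $G \subset \mathbb{R}^{m+n}$ is equal to the lower integral of the volume of $G_x$ (even if $G$ is not admissible).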
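(9c1) $\displaystyle \Gamma(t) = \int_0^\infty x^{t-1} e^{-x} \, dx$ for $t \in (0, \infty)$ .

(9c2) $\Gamma(t + 1) = t\,\Gamma(t)$ for $t \in (0, \infty)$ .

(9c3) $\Gamma(n + 1) = n!$ for $n = 0, 1, 2, \dots$

(9c4) $\displaystyle \int_0^\infty x^a e^{-x^2} \, dx = \frac{1}{2} \Gamma\Big( \frac{a + 1}{2} \Big)$ for $a \in (-1, \infty)$ ,

(9c5) $\displaystyle \Gamma\Big( \frac{1}{2} \Big) = \sqrt{\pi}$ .

(9c6) $\displaystyle \Gamma\Big( \frac{2n + 1}{2} \Big) = \frac{1}{2} \cdot \frac{3}{2} \cdots \frac{2n - 1}{2} \, \sqrt{\pi}$ .

(9c7) $\displaystyle V_n = \frac{2 \pi^{n/2}}{n \, \Gamma(\frac{n}{2})}$ . volume of the $n$-dimensional unit ball

(9c8) $\displaystyle \int_0^{\pi/2} \cos^{\alpha - 1}\theta \, \sin^{\beta - 1}\theta \, d\theta = \frac{1}{2} \, \frac{\Gamma(\frac{\alpha}{2}) \Gamma(\frac{\beta}{2})}{\Gamma(\frac{\alpha + \beta}{2})}$ for $\alpha, \beta \in (0, \infty)$ .

(9c9) $\displaystyle \int_0^{\pi/2} \sin^{\alpha - 1}\theta \, d\theta = \int_0^{\pi/2} \cos^{\alpha - 1}\theta \, d\theta = \frac{\sqrt{\pi}}{2} \, \frac{\Gamma(\frac{\alpha}{2})}{\Gamma(\frac{\alpha + 1}{2})}$ .

(9c10) $\displaystyle \int_0^1 x^{\alpha - 1} (1 - x)^{\beta - 1} \, dx = B(\alpha, \beta)$ for $\alpha, \beta \in (0, \infty)$ ,

(9c11) $\displaystyle B(\alpha, \beta) = \frac{\Gamma(\alpha) \Gamma(\beta)}{\Gamma(\alpha + \beta)}$ for $\alpha, \beta \in (0, \infty)$ .

9f1 Proposition.
\[ \int \dots \int_{\substack{x_1, \dots, x_n > 0, \\ x_1 + \dots + x_n < 1}} x_1^{p_1 - 1} \dots x_n^{p_n - 1} \, dx_1 \dots dx_n = \frac{\Gamma(p_1) \dots \Gamma(p_n)}{\Gamma(p_1 + \dots + p_n + 1)} \quad \text{for all } p_1, \dots, p_n > 0 . \]

Volume of the unit ball in the metric $l_p$: $\displaystyle v(B_p(1)) = \frac{2^n \, \Gamma^n(\frac{1}{p})}{p^n \, \Gamma(\frac{n}{p} + 1)}$ .

For improperly integrable $f : G \to \mathbb{R}$ (that is, continuous a.e. and $\int_G |f| < \infty$):

(9g3) $\displaystyle \int_G f = \int_G f^+ - \int_G f^-$ .

9g4 Exercise. Linearity: $\int_G cf = c \int_G f$ for $c \in \mathbb{R}$, and $\int_G (f_1 + f_2) = \int_G f_1 + \int_G f_2$.

9g5 Proposition (exhaustion). For open sets $G, G_1, G_2, \dots \subset \mathbb{R}^n$,
\[ G_k \uparrow G \implies \int_{G_k} f \to \int_G f \in \mathbb{R} . \]

9g6 Theorem (change of variables). Let $U, V \subset \mathbb{R}^n$ be open sets, $\varphi : U \to V$ a diffeomorphism, and $f : V \to \mathbb{R}$. Then
(a) ($f$ is continuous almost everywhere on $V$) $\iff$ ($f \circ \varphi$ is continuous almost everywhere on $U$) $\iff$ ($(f \circ \varphi) \, |\det D\varphi|$ is continuous almost everywhere on $U$);
(b) if they are, then $\displaystyle \int_V |f| = \int_U |(f \circ \varphi) \det D\varphi| \in [0, \infty]$;
(c) and if the integrals in (b) are finite, then $\displaystyle \int_V f = \int_U (f \circ \varphi) \, |\det D\varphi| \in \mathbb{R}$.

9g7 Exercise. (a) The function $t \mapsto \int_0^\infty x^{t-1} e^{-x} \ln x \, dx$ is continuous on $(0, \infty)$;
(b) the gamma function is continuously differentiable on $(0, \infty)$, and
\[ \Gamma'(t) = \int_0^\infty x^{t-1} e^{-x} \ln x \, dx \quad \text{for } 0 < t < \infty ; \]
(c) the gamma function is convex on $(0, \infty)$.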
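Several of the gamma-function formulas above, namely (9c5), (9c7), (9c10) and (9c11), can be checked against the standard-library function `math.gamma`. The sketch below is a numerical sanity check under assumptions made here: the helper names, the sample arguments and the midpoint rule used for the beta integral are illustrative choices.

```python
import math

def unit_ball_volume(n):
    """V_n = 2 pi^(n/2) / (n Gamma(n/2)), formula (9c7)."""
    return 2.0 * math.pi ** (n / 2.0) / (n * math.gamma(n / 2.0))

def beta_via_gamma(a, b):
    """B(a, b) = Gamma(a) Gamma(b) / Gamma(a + b), formula (9c11)."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)

def beta_via_integral(a, b, steps=20_000):
    """Midpoint sum for the integral in (9c10); used here only with a, b >= 1,
    so that the integrand stays bounded on [0, 1]."""
    h = 1.0 / steps
    return h * sum(((i + 0.5) * h) ** (a - 1) * (1.0 - (i + 0.5) * h) ** (b - 1)
                   for i in range(steps))

gamma_half_squared = math.gamma(0.5) ** 2                   # (9c5): should equal pi
volumes = {n: unit_ball_volume(n) for n in (1, 2, 3)}       # expected 2, pi, 4*pi/3
```

The values for $n = 1, 2, 3$ reproduce the length of $[-1, 1]$, the area of the unit disk and the volume of the unit ball in $\mathbb{R}^3$.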

