
Ordinary Differential Equations

Peter Philip∗

Lecture Notes
Originally Created for the Class of Spring Semester 2012 at LMU Munich,
Revised and Extended for Several Subsequent Classes

November 24, 2017

Contents
1 Basic Notions
1.1 Types and First Examples
1.2 Equivalent Integral Equation
1.3 Patching and Time Reversion

2 Elementary Solution Methods
2.1 Geometric Interpretation, Graphing
2.2 Linear ODE, Variation of Constants
2.3 Separation of Variables
2.4 Change of Variables

3 General Theory
3.1 Equivalence Between Higher-Order ODE and Systems of First-Order ODE
3.2 Existence of Solutions
3.3 Uniqueness of Solutions
3.4 Extension of Solutions, Maximal Solutions
3.5 Continuity in Initial Conditions

E-Mail: philip@math.lmu.de


4 Linear ODE
4.1 Definition, Setting
4.2 Gronwall's Inequality
4.3 Existence, Uniqueness, Vector Space of Solutions
4.4 Fundamental Matrix Solutions and Variation of Constants
4.5 Higher-Order, Wronskian
4.6 Constant Coefficients
4.6.1 Linear ODE of Higher Order
4.6.2 Systems of First-Order Linear ODE

5 Stability
5.1 Qualitative Theory, Phase Portraits
5.2 Stability at Fixed Points
5.3 Constant Coefficients
5.4 Linearization
5.5 Limit Sets

A Differentiability

B $\mathbb{K}^n$-Valued Integration
B.1 $\mathbb{K}^n$-Valued Riemann Integral
B.2 $\mathbb{K}^n$-Valued Lebesgue Integral

C Metric Spaces
C.1 Distance in Metric Spaces
C.2 Compactness in Metric Spaces

D Local Lipschitz Continuity

E Maximal Solutions on Nonopen Intervals

F Paths in $\mathbb{R}^n$

G Operator Norms and Matrix Norms

H The Vandermonde Determinant



I Matrix-Valued Functions
I.1 Product Rule
I.2 Integration and Matrix Multiplication Commute

J Autonomous ODE
J.1 Equivalence of Autonomous and Nonautonomous ODE
J.2 Integral for ODE with Discontinuous Right-Hand Side

K Polar Coordinates

References
1 BASIC NOTIONS 4

1 Basic Notions

1.1 Types of Ordinary Differential Equations (ODE) and First Examples

A differential equation is an equation for some unknown function, involving one or more
derivatives of the unknown function. Here are some first examples:

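\begin{align*}
y' &= y, \tag{1.1a} \\
y^{(5)} &= (y')^2 + \pi x, \tag{1.1b} \\
(y')^2 &= c, \tag{1.1c} \\
\partial_t x &= e^{2\pi i t}\, x^2, \tag{1.1d} \\
x'' &= -3x + \begin{pmatrix} -1 \\ 1 \end{pmatrix}. \tag{1.1e}
\end{align*}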

One distinguishes between ordinary differential equations (ODE) and partial differential
equations (PDE). While ODE contain only derivatives with respect to one variable, PDE
can contain (partial) derivatives with respect to several different variables. In general,
PDE are much harder to solve than ODE. The equations in (1.1) all are ODE, and only
ODE are the subject of this class. We will see precise definitions shortly, but we can
already use the examples in (1.1) to get some first exposure to important ODE-related
terms and to discuss related issues.
As in (1.1), the notation for the unknown function varies in the literature, where the
two variants presented in (1.1) are probably the most common ones: In the first three
equations of (1.1), the unknown function is denoted y, usually assumed to depend on
a variable denoted x, i.e. $x \mapsto y(x)$. In the last two equations of (1.1), the unknown
function is denoted x, usually assumed to depend on a variable denoted t, i.e. $t \mapsto x(t)$.
So one has to use some care due to the different roles of the symbol x. The notation
$t \mapsto x(t)$ is typically favored in situations arising from physics applications, where t
represents time. In this class, we will mostly use the notation $x \mapsto y(x)$.
There is another, slightly more serious, notational issue that one commonly
encounters when dealing with ODE: Strictly speaking, the notation in (1.1b) and (1.1d)
is not entirely correct, as functions and function arguments are not properly
distinguished. Correctly written, (1.1b) and (1.1d) read
\begin{align*}
\underset{x \in D(y)}{\forall} \quad y^{(5)}(x) &= \bigl(y'(x)\bigr)^2 + \pi x, \tag{1.2a} \\
\underset{t \in D(x)}{\forall} \quad (\partial_t x)(t) &= e^{2\pi i t}\, \bigl(x(t)\bigr)^2, \tag{1.2b}
\end{align*}

where D(y) and D(x) denote the respective domains of the functions y and x. However,
one might also notice that the notation in (1.2) is more cumbersome and, perhaps,
harder to read. In any case, the type of slight abuse of notation present in (1.1b) and
(1.1d) is so common in the literature that one will have to live with it.

One speaks of first-order ODE if the equations involve only first derivatives such as in
(1.1a), (1.1c), and (1.1d). Otherwise, one speaks of higher-order ODE, where the precise
order is given by the highest derivative occurring in the equation, such that (1.1b) is
an ODE of fifth order and (1.1e) is an ODE of second order. We will see later in Th.
3.1 that ODE of higher order can be equivalently formulated and solved as systems of
ODE of first order, where systems of ODE obviously consist of several ODE to be solved
simultaneously. Such a system of ODE can, equivalently, be interpreted as a single ODE
in higher dimensions: For instance, (1.1e) can be seen as a single two-dimensional ODE
of second order or as the system

\begin{align*}
x_1'' &= -3x_1 - 1, \tag{1.3a} \\
x_2'' &= -3x_2 + 1 \tag{1.3b}
\end{align*}

of two one-dimensional ODE of second order.


One calls an ODE explicit if it has been solved explicitly for the highest-order deriva-
tive, otherwise implicit. Thus, in (1.1), all ODE except (1.1c) are explicit. In general,
explicit ODE are much easier to solve than implicit ODE (which include, e.g., so-called
differential-algebraic equations, cf. Ex. 1.4(g) below), and we will mostly consider ex-
plicit ODE in this class.
As the reader might already have noticed, without further information, none of the
equations in (1.1) makes much sense. Every function, in particular, every function
solving an ODE, needs a set as the domain where it is defined, and a set as the range it
maps into. Thus, for each ODE, one needs to specify the admissible domains as well as
the range of the unknown function. For an ODE, one usually requires a solution to be
defined on a nontrivial (bounded or unbounded) interval I ⊆ R. Prescribing the possible
range of the solution is an integral part of setting up an ODE, and it often completely
determines the ODE’s meaning and/or its solvability. For example for (1.1d), (a subset
of) C is a reasonable range. Similarly, for (1.1a)–(1.1c), one can require the range to
be either R or C, where requiring range R for (1.1c) immediately implies there is no
solution for c < 0. However, one can also specify (a subset of) Rn or Cn , n > 1, as
range for (1.1a), turning the ODE into an n-dimensional ODE (or a system of ODE),
where y now has n components (y1 , . . . , yn ) (note that, except in cases where we are
dealing with matrix multiplications, we sometimes denote elements of Rn as columns
and sometimes as rows, switching back and forth without too much care). A reasonable
range for (1.1e) is (a subset of) R2 or C2 .
One of the important goals regarding ODE is to find conditions under which one can
guarantee the existence of solutions. Moreover, if possible, one would like to find conditions
that guarantee the existence of a unique solution. Clearly, for each $a \in \mathbb{R}$, the function
$y : \mathbb{R} \to \mathbb{R}$, $y(x) = a\,e^x$, is a solution to (1.1a), showing one cannot expect uniqueness
without specifying further requirements. The most common additional conditions that
often (but not always) guarantee a unique solution are initial conditions (e.g. requiring
$y(x_0) = y_0$, with $x_0, y_0$ given) or boundary conditions (e.g. requiring $y(a) = y_a$, $y(b) = y_b$ for
$y : [a, b] \to \mathbb{C}^n$, with $y_a, y_b \in \mathbb{C}^n$ given).
Let us now proceed to mathematically precise definitions of the abovementioned notions.

Notation 1.1. We will write K in situations, where we allow K to be R or C.

Definition 1.2. Let k, n ∈ N.

(a) Given $U \subseteq \mathbb{R} \times \mathbb{K}^{(k+1)n}$ and $F : U \to \mathbb{K}^n$, call
\[ F(x, y, y', \dots, y^{(k)}) = 0 \tag{1.4} \]
an implicit ODE of kth order. A solution to this ODE is a k times differentiable function
\[ \phi : I \to \mathbb{K}^n, \tag{1.5} \]
defined on a nontrivial (bounded or unbounded, open or closed or half-open) interval
$I \subseteq \mathbb{R}$, satisfying the two conditions

(i) $\bigl\{ (x, \phi(x), \phi'(x), \dots, \phi^{(k)}(x)) \in I \times \mathbb{K}^{(k+1)n} : x \in I \bigr\} \subseteq U$.

(ii) $F(x, \phi(x), \phi'(x), \dots, \phi^{(k)}(x)) = 0$ for each $x \in I$.

Note that condition (i) is necessary so that one can even formulate condition (ii).

(b) Given $G \subseteq \mathbb{R} \times \mathbb{K}^{kn}$ and $f : G \to \mathbb{K}^n$, call
\[ y^{(k)} = f(x, y, y', \dots, y^{(k-1)}) \tag{1.6} \]
an explicit ODE of kth order. A solution to this ODE is a k times differentiable
function φ as in (1.5), defined on a nontrivial (bounded or unbounded, open or
closed or half-open) interval $I \subseteq \mathbb{R}$, satisfying the two conditions

(i) $\bigl\{ (x, \phi(x), \phi'(x), \dots, \phi^{(k-1)}(x)) \in I \times \mathbb{K}^{kn} : x \in I \bigr\} \subseteq G$.

(ii) $\phi^{(k)}(x) = f(x, \phi(x), \phi'(x), \dots, \phi^{(k-1)}(x))$ for each $x \in I$.

Again, note that condition (i) is necessary so that one can even formulate condition
(ii). Also note that φ is a solution to (1.6) if, and only if, φ is a solution to the
equivalent implicit ODE y (k) − f (x, y, y ′ , . . . , y (k−1) ) = 0.

Definition 1.3. Let k, n ∈ N.

(a) An initial value problem for (1.4) (resp. for (1.6)) consists of the ODE (1.4) (resp.
of the ODE (1.6)) plus the initial condition
\[ \underset{j = 0, \dots, k-1}{\forall} \quad y^{(j)}(x_0) = y_{0,j} \tag{1.7} \]
with given $x_0 \in \mathbb{R}$ and $y_{0,0}, \dots, y_{0,k-1} \in \mathbb{K}^n$. A solution φ to the initial value
problem is a k times differentiable function φ as in (1.5) that is a solution to the
ODE and that also satisfies (1.7) (with y replaced by φ); in particular, this requires
$x_0 \in I$.

(b) A boundary value problem for (1.4) (resp. for (1.6)) consists of the ODE (1.4) (resp.
of the ODE (1.6)) plus the boundary condition
\[ \underset{j \in J_a}{\forall} \quad y^{(j)}(a) = y_{a,j} \qquad \text{and} \qquad \underset{j \in J_b}{\forall} \quad y^{(j)}(b) = y_{b,j} \tag{1.8} \]
with given $a, b \in \mathbb{R}$, $a < b$; $J_a, J_b \subseteq \{0, \dots, k-1\}$, $y_{a,j} \in \mathbb{K}^n$ for each $j \in J_a$, and
$y_{b,j} \in \mathbb{K}^n$ for each $j \in J_b$. A solution φ to the boundary value problem is a k times
differentiable function φ as in (1.5) that is a solution to the ODE and that also
satisfies (1.8) (with y replaced by φ); in particular, this requires $[a, b] \subseteq I$.

Under suitable hypotheses, initial and boundary value problems for ODE have unique
solutions (for initial value problems, we will see some rather general results in Cor. 3.10
and Cor. 3.16 below). However, in general, they can have infinitely many solutions or
no solutions, as shown by Examples 1.4(b),(c),(e) below.

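Example 1.4. (a) Let $k \in \mathbb{N}$. The function $\phi : \mathbb{R} \to \mathbb{K}$, $\phi(x) = a\,e^x$, $a \in \mathbb{K}$, is a
solution to the kth order explicit initial value problem
\begin{align*}
y^{(k)} &= y, \tag{1.9a} \\
\underset{j = 0, \dots, k-1}{\forall} \quad y^{(j)}(0) &= a. \tag{1.9b}
\end{align*}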

We will see later (e.g., as a consequence of Th. 4.8 combined with Th. 3.1) that φ
is the unique solution to (1.9) on R.

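(b) Consider the one-dimensional explicit first-order initial value problem
\begin{align*}
y' &= \sqrt{|y|}, \tag{1.10a} \\
y(0) &= 0. \tag{1.10b}
\end{align*}
Then, for every $c \geq 0$, the function
\[ \phi_c : \mathbb{R} \to \mathbb{R}, \quad \phi_c(x) := \begin{cases} 0 & \text{for } x \leq c, \\ \frac{(x-c)^2}{4} & \text{for } x \geq c, \end{cases} \tag{1.11} \]
is a solution to (1.10): Clearly, $\phi_c(0) = 0$, $\phi_c$ is differentiable, and
\[ \phi_c' : \mathbb{R} \to \mathbb{R}, \quad \phi_c'(x) := \begin{cases} 0 & \text{for } x \leq c, \\ \frac{x-c}{2} & \text{for } x \geq c, \end{cases} \tag{1.12} \]
solving the ODE. Thus, (1.10) is an example of an initial value problem with
uncountably many different solutions, all defined on the same domain.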
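
The non-uniqueness in (b) can also be observed numerically. The following is a minimal sketch, not part of the original notes: the helper names `phi` and `residual`, the step size `h`, and the sample grid are illustrative assumptions. It estimates $\phi_c'$ by central differences and compares the result with $\sqrt{|\phi_c|}$ for several values of c.

```python
import math

def phi(c, x):
    """Candidate solution phi_c from (1.11): zero up to c, then a parabola."""
    return 0.0 if x <= c else (x - c) ** 2 / 4.0

def residual(c, x, h=1e-6):
    """Central-difference estimate of phi_c'(x) minus sqrt(|phi_c(x)|)."""
    derivative = (phi(c, x + h) - phi(c, x - h)) / (2.0 * h)
    return derivative - math.sqrt(abs(phi(c, x)))

# Sample points in [-2, 6]; the values of c are arbitrary illustrative choices.
grid = [i / 10.0 for i in range(-20, 61)]
max_residuals = {c: max(abs(residual(c, x)) for x in grid) for c in (0.0, 1.0, 2.5)}
```

Each entry of `max_residuals` should be small (of the order of the step size at most), while the three functions take different values at, say, x = 3, even though they share the initial value 0 at x = 0.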

(c) As mentioned before, the one-dimensional implicit first-order ODE (1.1c) has no
real-valued solution for $c < 0$. For $c \geq 0$, every function $\phi : \mathbb{R} \to \mathbb{R}$,
$\phi(x) := a \pm x\sqrt{c}$, $a \in \mathbb{R}$, is a solution to (1.1c). Moreover, for $c < 0$, every function
$\phi : \mathbb{R} \to \mathbb{C}$, $\phi(x) := a \pm x\,i\sqrt{-c}$, $a \in \mathbb{C}$, is a C-valued solution to (1.1c). The
one-dimensional implicit first-order ODE
\[ e^{y} = 0 \tag{1.13} \]
is an example of an ODE that does not even have a C-valued solution. It is an
exercise to find $f : \mathbb{R} \to \mathbb{R}$ such that the explicit ODE $y' = f(x)$ has no solution.

(d) Let n ∈ N and let a, c ∈ Kn . Then, on R, the function

φ : R −→ Kn , φ(x) := c + xa, (1.14)

is the unique solution to the n-dimensional explicit first-order initial value problem

y ′ = a, (1.15a)
y(0) = c. (1.15b)

This situation is a special case of Ex. 1.6 below.

(e) Let a, b ∈ R, a < b. We will see later in Example 4.12 that on [a, b] the 1-dimensional
explicit second-order ODE
y ′′ = −y (1.16)
has precisely the set of solutions
\[ L = \bigl\{ (c_1 \sin + c_2 \cos) : [a, b] \to \mathbb{K} \; : \; c_1, c_2 \in \mathbb{K} \bigr\}. \tag{1.17} \]

In consequence, the boundary value problem

y(0) = 0, y(π/2) = 1, (1.18a)

for (1.16) has the unique solution φ : [0, π/2] −→ K, φ(x) := sin x (using (1.18a)
and (1.17) implies c2 = 0 and c1 = 1); the boundary value problem

y(0) = 0, y(π) = 0, (1.18b)

for (1.16) has the infinitely many different solutions φc : [0, π] −→ K, φc (x) :=
c sin x, c ∈ K; and the boundary value problem

y(0) = 0, y(π) = 1, (1.18c)

for (1.16) has no solution (using (1.18c) and (1.17) implies the contradictory re-
quirements c2 = 0 and c2 = −1).
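
The three cases of (e) reduce to a 2×2 linear system for $(c_1, c_2)$, obtained by inserting the boundary conditions into (1.17). The sketch below makes this explicit for real boundary data; the function name `classify_bvp` and the tolerance `tol` are assumptions made for illustration only.

```python
import math

def classify_bvp(p, alpha, q, beta, tol=1e-12):
    """Classify y'' = -y, y(p) = alpha, y(q) = beta via the 2x2 system for (c1, c2).

    Rows of the system: c1*sin(p) + c2*cos(p) = alpha, c1*sin(q) + c2*cos(q) = beta.
    """
    a11, a12 = math.sin(p), math.cos(p)
    a21, a22 = math.sin(q), math.cos(q)
    det = a11 * a22 - a12 * a21
    if abs(det) > tol:
        # Cramer's rule yields the unique coefficients.
        c1 = (alpha * a22 - a12 * beta) / det
        c2 = (a11 * beta - a21 * alpha) / det
        return "unique", (c1, c2)
    # Singular system (rank 1): solvable iff the augmented matrix also has rank 1.
    consistent = (abs(a11 * beta - a21 * alpha) <= tol
                  and abs(a12 * beta - a22 * alpha) <= tol)
    return ("infinitely many" if consistent else "none"), None

case_a = classify_bvp(0.0, 0.0, math.pi / 2, 1.0)   # (1.18a)
case_b = classify_bvp(0.0, 0.0, math.pi, 0.0)       # (1.18b)
case_c = classify_bvp(0.0, 0.0, math.pi, 1.0)       # (1.18c)
```

Since sin(π) is only approximately zero in floating-point arithmetic, the tolerance decides whether the system is treated as singular; this is a numerical convenience and not part of the mathematical argument.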

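(f) Consider
\[ F : \mathbb{R} \times \mathbb{K}^2 \times \mathbb{K}^2 \to \mathbb{K}^2, \quad F\left( x, \begin{pmatrix} y_1 \\ y_2 \end{pmatrix}, \begin{pmatrix} z_1 \\ z_2 \end{pmatrix} \right) := \begin{pmatrix} z_2 \\ z_2 - 1 \end{pmatrix}. \tag{1.19a} \]
Clearly, the implicit $\mathbb{K}^2$-valued ODE
\[ F(x, y, y') = \begin{pmatrix} y_2' \\ y_2' - 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \tag{1.19b} \]
has no solution on any nontrivial interval.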

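(g) Consider
\[ F : \mathbb{R} \times \mathbb{C}^3 \times \mathbb{C}^3 \to \mathbb{C}^3, \quad F\left( x, \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix}, \begin{pmatrix} z_1 \\ z_2 \\ z_3 \end{pmatrix} \right) := \begin{pmatrix} z_2 + i y_3 - 2i \\ z_1 + y_2 - x^2 \\ y_1 - i e^{ix} \end{pmatrix}. \tag{1.20a} \]
It is an exercise to show the $\mathbb{C}^3$-valued implicit ODE
\[ F(x, y, y') = \begin{pmatrix} y_2' + i y_3 - 2i \\ y_1' + y_2 - x^2 \\ y_1 - i e^{ix} \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} \tag{1.20b} \]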

has a unique solution on R (note that, here, we do not need to provide initial
or boundary conditions to obtain uniqueness). The implicit ODE (1.20b) is an
example of a differential algebraic equation, since, read in components, only its first
two equations contain derivatives, whereas its third equation is purely algebraic.

1.2 Equivalent Integral Equation


It is often useful to rewrite a first-order explicit initial value problem as an equivalent
integral equation. We provide the details of this equivalence in the following theorem:
Theorem 1.5. If $G \subseteq \mathbb{R} \times \mathbb{K}^n$, $n \in \mathbb{N}$, and $f : G \to \mathbb{K}^n$ is continuous, then, for each
$(x_0, y_0) \in G$, the explicit n-dimensional first-order initial value problem
\begin{align*}
y' &= f(x, y), \tag{1.21a} \\
y(x_0) &= y_0, \tag{1.21b}
\end{align*}
is equivalent to the integral equation
\[ y(x) = y_0 + \int_{x_0}^{x} f\bigl(t, y(t)\bigr)\, dt, \tag{1.22} \]
in the sense that a continuous function $\phi : I \to \mathbb{K}^n$, with $x_0 \in I \subseteq \mathbb{R}$ being a nontrivial
interval, and φ satisfying
\[ \bigl\{ (x, \phi(x)) \in I \times \mathbb{K}^n : x \in I \bigr\} \subseteq G, \tag{1.23} \]
is a solution to (1.21) in the sense of Def. 1.3(a) if, and only if,
\[ \underset{x \in I}{\forall} \quad \phi(x) = y_0 + \int_{x_0}^{x} f\bigl(t, \phi(t)\bigr)\, dt, \tag{1.24} \]
i.e. if, and only if, φ is a solution to the integral equation (1.22).

Proof. Assume $I \subseteq \mathbb{R}$ with $x_0 \in I$ to be a nontrivial interval and $\phi : I \to \mathbb{K}^n$
to be a continuous function, satisfying (1.23). If φ is a solution to (1.21), then φ is
differentiable and the assumed continuity of f implies the continuity of $\phi'$. In other
words, each component $\phi_j$ of φ, $j \in \{1, \dots, n\}$, is in $C^1(I, \mathbb{K})$. Thus, the fundamental
theorem of calculus [Phi16, Th. 10.20(b)] applies, and [Phi16, (10.51b)] yields
\[ \underset{x \in I}{\forall} \;\; \underset{j \in \{1, \dots, n\}}{\forall} \quad \phi_j(x) = \phi_j(x_0) + \int_{x_0}^{x} f_j\bigl(t, \phi(t)\bigr)\, dt \overset{(1.21b)}{=} y_{0,j} + \int_{x_0}^{x} f_j\bigl(t, \phi(t)\bigr)\, dt, \tag{1.25} \]
proving φ satisfies (1.24). Conversely, if φ satisfies (1.24), then the validity of the initial
condition (1.21b) is immediate. Moreover, as f and φ are continuous, so is the integrand
function $t \mapsto f\bigl(t, \phi(t)\bigr)$ of (1.24). Thus, [Phi16, Th. 10.20(a)] applies to (the components
of) φ, proving $\phi'(x) = f\bigl(x, \phi(x)\bigr)$ for each $x \in I$, proving φ is a solution to (1.21). □
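
The equivalence of Th. 1.5 can be checked numerically for a concrete problem. The following sketch is an illustration under stated assumptions: the right-hand side f(x, y) = x·y, the initial data (x0, y0) = (0, 2), and the helper names `simpson` and `integral_residual` are chosen here and do not occur in the notes. The known solution $\phi(x) = 2 e^{x^2/2}$ of the initial value problem is inserted into (1.24), and the integral is approximated by the composite Simpson rule.

```python
import math

def simpson(g, a, b, n=200):
    """Composite Simpson rule for the integral of g from a to b (n even)."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3.0

def f(x, y):
    """Right-hand side of the ODE, chosen for illustration."""
    return x * y

x0, y0 = 0.0, 2.0

def phi(x):
    """Solution of y' = x*y with y(0) = 2."""
    return y0 * math.exp(x * x / 2.0)

def integral_residual(x):
    """phi(x) - y0 - integral from x0 to x of f(t, phi(t)) dt, cf. (1.24)."""
    return phi(x) - y0 - simpson(lambda t: f(t, phi(t)), x0, x)

residuals = [integral_residual(x) for x in (-1.5, -0.5, 0.0, 0.5, 1.5)]
```

The residuals vanish up to quadrature error, also for x < x0, where the integral runs backwards.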
Example 1.6. Consider the situation of Th. 1.5. In the particularly simple special
case, where f does not actually depend on y, but merely on x, the equivalence between
(1.21) and (1.22) can be directly exploited to actually solve the initial value problem:
If f : I −→ Kn , where I ⊆ R is some nontrivial interval with x0 ∈ I, then we obtain
φ : I −→ Kn to be a solution of (1.21) if, and only if,
Z x
∀ φ(x) = y0 + f (t) dt , (1.26)
x∈I x0

i.e. if, and only if, φ is the antiderivative of f that satisfies the initial condition. In
particular, in the present situation, φ as given by (1.26) is the unique solution to the
initial value problem. Of course, depending on f , it can still be difficult to carry out
the integral in (1.26).
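
When the integral in (1.26) has no convenient closed form, it can be approximated numerically. The sketch below assumes a midpoint rule and an illustrative $\mathbb{R}^2$-valued right-hand side f(t) = (cos t, 2t) with y0 = (1, 0), for which the exact solution $(1 + \sin x, x^2)$ is known; the function name `solve_by_quadrature` is hypothetical.

```python
import math

def solve_by_quadrature(f, x0, y0, x, n=2000):
    """Approximate phi(x) = y0 + integral from x0 to x of f(t) dt, componentwise,
    using the composite midpoint rule with n subintervals."""
    h = (x - x0) / n
    result = list(y0)
    for i in range(n):
        value = f(x0 + (i + 0.5) * h)
        for j in range(len(result)):
            result[j] += h * value[j]
    return result

def f(t):
    """Illustrative right-hand side depending on t only."""
    return (math.cos(t), 2.0 * t)

approx = solve_by_quadrature(f, 0.0, (1.0, 0.0), 2.0)
exact = (1.0 + math.sin(2.0), 4.0)
```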

1.3 Patching and Time Reversion


If solutions defined on different intervals fit together, then they can be patched to obtain
a solution on the union of the two intervals:
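Lemma 1.7 (Patching of Solutions). Let $k, n \in \mathbb{N}$. Given $G \subseteq \mathbb{R} \times \mathbb{K}^{kn}$ and
$f : G \to \mathbb{K}^n$, if $\phi : I \to \mathbb{K}^n$ and $\psi : J \to \mathbb{K}^n$ are both solutions to (1.6), i.e. to
\[ y^{(k)} = f(x, y, y', \dots, y^{(k-1)}), \]
such that $I = ]a, b]$, $J = [b, c[$, $a < b < c$, and such that
\[ \underset{j = 0, \dots, k-1}{\forall} \quad \phi^{(j)}(b) = \psi^{(j)}(b), \tag{1.27} \]
then
\[ \sigma : I \cup J \to \mathbb{K}^n, \quad \sigma(x) := \begin{cases} \phi(x) & \text{for } x \in I, \\ \psi(x) & \text{for } x \in J, \end{cases} \tag{1.28} \]
is also a solution to (1.6).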

Proof. Since φ and ψ both are solutions to (1.6),
\[ \bigl\{ (x, \sigma(x), \sigma'(x), \dots, \sigma^{(k-1)}(x)) \in (I \cup J) \times \mathbb{K}^{kn} : x \in I \cup J \bigr\} \subseteq G \tag{1.29} \]
must hold, where (1.27) guarantees that $\sigma^{(j)}(b)$ exists for each $j = 0, \dots, k-1$. Moreover,
σ is k times differentiable at each $x \in I \cup J$, $x \neq b$, and
\[ \underset{x \in I \cup J,\; x \neq b}{\forall} \quad \sigma^{(k)}(x) = f\bigl(x, \sigma(x), \sigma'(x), \dots, \sigma^{(k-1)}(x)\bigr). \tag{1.30} \]
However, at b, we also have (using the left-hand derivatives for φ and the right-hand
derivatives for ψ)
\begin{align*}
\phi^{(k)}(b) &= f\bigl(b, \phi(b), \phi'(b), \dots, \phi^{(k-1)}(b)\bigr) \\
&= f\bigl(b, \psi(b), \psi'(b), \dots, \psi^{(k-1)}(b)\bigr) = \psi^{(k)}(b), \tag{1.31}
\end{align*}
which shows σ is k times differentiable and the equality of (1.30) also holds at $x = b$,
completing the proof that σ is a solution. □

It is sometimes useful to apply what is known as time reversion:


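Definition 1.8. Let $k, n \in \mathbb{N}$, $G_f \subseteq \mathbb{R} \times \mathbb{K}^{kn}$, $f : G_f \to \mathbb{K}^n$, and consider the ODE
\[ y^{(k)} = f(x, y, y', \dots, y^{(k-1)}). \tag{1.32} \]
We call the ODE
\[ y^{(k)} = g(x, y, y', \dots, y^{(k-1)}), \tag{1.33} \]
where
\begin{align*}
g : G_g \to \mathbb{K}^n, \quad g(x, y) &:= (-1)^k f\bigl(-x, y_1, -y_2, \dots, (-1)^{k-1} y_k\bigr), \tag{1.34a} \\
G_g &:= \bigl\{ (x, y) \in \mathbb{R} \times \mathbb{K}^{kn} : \bigl(-x, y_1, -y_2, \dots, (-1)^{k-1} y_k\bigr) \in G_f \bigr\}, \tag{1.34b}
\end{align*}
the time-reversed version of (1.32).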


Lemma 1.9 (Time Reversion). Let k, n ∈ N, Gf ⊆ R × Kkn , and f : Gf −→ Kn .

(a) The time-reversed version of (1.33) is the original ODE, i.e. (1.32).

(b) If −∞ ≤ a < b ≤ ∞, then φ : ]a, b[−→ Kn is a solution to (1.32) if, and only if,

ψ : ] − b, −a[−→ Kn , ψ(x) := φ(−x), (1.35)

is a solution to the time-reversed version (1.33).


2 ELEMENTARY SOLUTION METHODS 12

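Proof. (a) is immediate from the definition of g in (1.34).

(b): Due to (a), it suffices to show if φ is a solution to (1.32), then ψ is a solution to
(1.33). Clearly, if $x \in ]-b, -a[$, then $-x \in ]a, b[$. Moreover, noting
\[ \underset{j = 0, \dots, k}{\forall} \;\; \underset{x \in ]-b,-a[}{\forall} \quad \psi^{(j)}(x) = (-1)^j \phi^{(j)}(-x), \tag{1.36a} \]
one has
\begin{align*}
\underset{x \in ]-b,-a[}{\forall} \quad & \bigl(-x, \phi(-x), \phi'(-x), \dots, \phi^{(k-1)}(-x)\bigr) \in G_f \\
\Rightarrow \quad & \bigl(x, \psi(x), \psi'(x), \dots, \psi^{(k-1)}(x)\bigr) \\
& = \bigl(x, \phi(-x), -\phi'(-x), \dots, (-1)^{k-1} \phi^{(k-1)}(-x)\bigr) \in G_g \tag{1.36b}
\end{align*}
and
\begin{align*}
\underset{x \in ]-b,-a[}{\forall} \quad \psi^{(k)}(x) &= (-1)^k f\bigl(-x, \phi(-x), \phi'(-x), \dots, \phi^{(k-1)}(-x)\bigr) \\
&= (-1)^k f\bigl(-x, \psi(x), -\psi'(x), \dots, (-1)^{k-1} \psi^{(k-1)}(x)\bigr) \\
&= g\bigl(x, \psi(x), \psi'(x), \dots, \psi^{(k-1)}(x)\bigr), \tag{1.36c}
\end{align*}
thereby establishing the case. □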
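
For k = 1, time reversion amounts to $g(x, y) = -f(-x, y)$ and $\psi(x) = \phi(-x)$. The following sketch checks Lem. 1.9(b) numerically for one illustrative first-order ODE; the choice f(x, y) = x + y, the constant c = 3, the sample points, and the finite-difference helper `derivative` are assumptions made here, not part of the notes.

```python
import math

def f(x, y):
    """Right-hand side of the original ODE y' = x + y (illustrative choice)."""
    return x + y

def g(x, y):
    """Time-reversed right-hand side from (1.34a) with k = 1: g(x, y) = -f(-x, y)."""
    return -f(-x, y)

def phi(x, c=3.0):
    """A solution of y' = x + y, namely c*e^x - x - 1."""
    return c * math.exp(x) - x - 1.0

def psi(x):
    """Reflected function psi(x) = phi(-x) from (1.35)."""
    return phi(-x)

def derivative(func, x, h=1e-5):
    """Central-difference approximation of func'(x)."""
    return (func(x + h) - func(x - h)) / (2.0 * h)

points = [-2.0, -0.5, 0.0, 1.0, 2.0]
residual_phi = max(abs(derivative(phi, x) - f(x, phi(x))) for x in points)
residual_psi = max(abs(derivative(psi, x) - g(x, psi(x))) for x in points)
```

Both residuals should be small, which is consistent with φ solving the original ODE and ψ solving its time-reversed version.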

2 Elementary Solution Methods for 1-Dimensional First-Order ODE

2.1 Geometric Interpretation, Graphing


Geometrically, in the 1-dimensional real-valued case, the ODE (1.21a) provides a slope
y ′ = f (x, y) for every point (x, y). In other words, it provides a field of directions. The
task is to find a differentiable function φ such that its graph has the prescribed slope in
each point it contains. In certain simple cases, drawing the field of directions can help
to guess the solutions of the ODE.
Example 2.1. Let G := R+ × R and f : G −→ R, f (x, y) := y/x, i.e. we consider the
ODE y ′ = y/x. Drawing the field of directions leads to the idea that the solutions are
functions whose graphs constitute rays, i.e. φc : R+ −→ R, y = φc (x) = c x with c ∈ R.
Indeed, one immediately verifies that each φc constitutes a solution to the ODE.
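
The field of directions of Example 2.1 can be tabulated on a grid before any drawing takes place. In the sketch below, the grid points, the ray slopes c, and the function names `slope` and `phi` are illustrative assumptions; the check confirms that, along each ray y = c x, the prescribed slope y/x equals c.

```python
def slope(x, y):
    """Direction field of y' = y/x on G = R^+ x R."""
    if x <= 0:
        raise ValueError("the field is only defined for x > 0")
    return y / x

def phi(c, x):
    """Ray through the origin with slope c."""
    return c * x

# Sample the field on a small grid (values could be passed on to a plotting tool).
field = {(x, y): slope(x, y) for x in (0.5, 1.0, 2.0) for y in (-1.0, 0.0, 1.0)}

# Along each ray, the slope prescribed by the field equals the slope c of the ray.
rays_match = all(
    abs(slope(x, phi(c, x)) - c) < 1e-12
    for c in (-2.0, 0.0, 0.5, 3.0)
    for x in (0.25, 1.0, 4.0)
)
```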

2.2 Linear ODE, Variation of Constants


Definition 2.2. Let I ⊆ R be an open interval and let a, b : I −→ K be continuous
functions. An ODE of the form
y ′ = a(x)y + b(x) (2.1)
is called a linear ODE of first order. It is called homogeneous if, and only if, b ≡ 0; it is
called inhomogeneous if, and only if, it is not homogeneous.

Theorem 2.3 (Variation of Constants). Let $I \subseteq \mathbb{R}$ be an open interval and let
$a, b : I \to \mathbb{K}$ be continuous. Moreover, let $x_0 \in I$ and $c \in \mathbb{K}$. Then the linear ODE (2.1)
has a unique solution $\phi : I \to \mathbb{K}$ that satisfies the initial condition $y(x_0) = c$. This
unique solution is given by
\[ \phi : I \to \mathbb{K}, \quad \phi(x) = \phi_0(x) \left( c + \int_{x_0}^{x} \phi_0(t)^{-1}\, b(t)\, dt \right), \tag{2.2a} \]
where
\[ \phi_0 : I \to \mathbb{K}, \quad \phi_0(x) = \exp\left( \int_{x_0}^{x} a(t)\, dt \right) = e^{\int_{x_0}^{x} a(t)\, dt}. \tag{2.2b} \]
Here, and in the following, $\phi_0^{-1}$ denotes $1/\phi_0$ and not the inverse function of $\phi_0$ (which
does not even necessarily exist).

Proof. We begin by noting that $\phi_0$ according to (2.2b) is well-defined since a is assumed
to be continuous, i.e., in particular, Riemann integrable on $[x_0, x]$. Moreover, the
fundamental theorem of calculus [Phi16, Th. 10.20(a)] applies, showing $\phi_0$ is differentiable
with
\[ \phi_0' : I \to \mathbb{K}, \quad \phi_0'(x) = a(x) \exp\left( \int_{x_0}^{x} a(t)\, dt \right) = a(x)\, \phi_0(x), \tag{2.3} \]
where Lem. A.1 of the Appendix was used as well. In particular, $\phi_0$ is continuous. Since
$\phi_0 \neq 0$ as well, $\phi_0^{-1}$ is also continuous. Moreover, as b is continuous by hypothesis,
$\phi_0^{-1} b$ is continuous and, thus, Riemann integrable on $[x_0, x]$. Once again, [Phi16, Th.
10.20(a)] applies, yielding φ to be differentiable with
\begin{align*}
\phi' : I \to \mathbb{K}, \quad \phi'(x) &= \phi_0'(x) \left( c + \int_{x_0}^{x} \phi_0(t)^{-1}\, b(t)\, dt \right) + \phi_0(x)\, \phi_0(x)^{-1}\, b(x) \\
&= a(x)\, \phi_0(x) \left( c + \int_{x_0}^{x} \phi_0(t)^{-1}\, b(t)\, dt \right) + b(x) = a(x)\, \phi(x) + b(x), \tag{2.4}
\end{align*}
where the product rule of [Phi16, Th. 9.7(c)] was used as well. Comparing (2.4) with
(2.1) shows φ is a solution to (2.1). The computation
\[ \phi(x_0) = \phi_0(x_0)\, (c + 0) = 1 \cdot c = c \tag{2.5} \]
verifies that φ satisfies the desired initial condition. It remains to prove uniqueness. To
this end, let $\psi : I \to \mathbb{K}$ be an arbitrary differentiable function that satisfies (2.1) as
well as the initial condition $\psi(x_0) = c$. We have to show $\psi = \phi$. Since $\phi_0 \neq 0$, we can
define $u := \psi/\phi_0$ and still have to verify
\[ \underset{x \in I}{\forall} \quad u(x) = c + \int_{x_0}^{x} \phi_0(t)^{-1}\, b(t)\, dt. \tag{2.6} \]
We obtain
\[ a\, \phi_0\, u + b = a\, \psi + b = \psi' = (\phi_0 u)' = \phi_0'\, u + \phi_0\, u' = a\, \phi_0\, u + \phi_0\, u', \tag{2.7} \]
implying $b = \phi_0 u'$ and $u' = \phi_0^{-1} b$. Thus, the fundamental theorem of calculus in the
form [Phi16, Th. 10.20(b)] implies
\[ \underset{x \in I}{\forall} \quad u(x) = u(x_0) + \int_{x_0}^{x} u'(t)\, dt = c + \int_{x_0}^{x} \phi_0(t)^{-1}\, b(t)\, dt, \tag{2.8} \]
thereby completing the proof. □

Corollary 2.4. Let $I \subseteq \mathbb{R}$ be an open interval and let $a : I \to \mathbb{K}$ be continuous.
Moreover, let $x_0 \in I$ and $c \in \mathbb{K}$. Then the homogeneous linear ODE (2.1) (i.e. with
$b \equiv 0$) has a unique solution $\phi : I \to \mathbb{K}$ that satisfies the initial condition $y(x_0) = c$.
This unique solution is given by
\[ \phi(x) = c \exp\left( \int_{x_0}^{x} a(t)\, dt \right) = c\, e^{\int_{x_0}^{x} a(t)\, dt}. \tag{2.9} \]

Proof. One immediately obtains (2.9) by setting $b \equiv 0$ in (2.2). □

Remark 2.5. The name variation of constants for Th. 2.3 can be understood from
comparing the solution (2.9) of the homogeneous linear ODE with the solution (2.2)
of the general inhomogeneous linear ODE: One obtains (2.2) from (2.9) by varying the
constant c, i.e. by replacing it with the function $x \mapsto c + \int_{x_0}^{x} \phi_0(t)^{-1}\, b(t)\, dt$.

Example 2.6. Consider the ODE

y ′ = 2xy + x3 (2.10)

with initial condition y(0) = c, c ∈ C. Comparing (2.10) with Def. 2.2, we observe we
are facing an inhomogeneous linear ODE with

a : R −→ R, a(x) := 2x, (2.11a)


b : R −→ R, b(x) := x3 . (2.11b)

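From Cor. 2.4, we obtain the solution $\phi_{0,c}$ to the homogeneous version of (2.10):
\[ \phi_{0,c} : \mathbb{R} \to \mathbb{C}, \quad \phi_{0,c}(x) = c \exp\left( \int_0^x a(t)\, dt \right) = c\, e^{x^2}. \tag{2.12} \]
The solution to (2.10) is given by (2.2a):
\begin{align*}
\phi : \mathbb{R} \to \mathbb{C}, \quad \phi(x) &= e^{x^2} \left( c + \int_0^x e^{-t^2}\, t^3\, dt \right) = e^{x^2} \left( c + \left[ -\frac{1}{2} (t^2 + 1)\, e^{-t^2} \right]_0^x \right) \\
&= e^{x^2} \left( c + \frac{1}{2} - \frac{1}{2} (x^2 + 1)\, e^{-x^2} \right) = \left( c + \frac{1}{2} \right) e^{x^2} - \frac{1}{2} (x^2 + 1). \tag{2.13}
\end{align*}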

2.3 Separation of Variables


If the ODE (1.21a) has the particular form

y ′ = f (x)g(y), (2.14)

with one-dimensional real-valued continuous functions f and g, and $g(y) \neq 0$, then it
can be solved by a method known as separation of variables:
Theorem 2.7. Let $I, J \subseteq \mathbb{R}$ be (bounded or unbounded) open intervals and suppose that
$f : I \to \mathbb{R}$ and $g : J \to \mathbb{R}$ are continuous with $g(y) \neq 0$ for each $y \in J$. For each
$(x_0, y_0) \in I \times J$, consider the initial value problem consisting of the ODE (2.14) together
with the initial condition
\[ y(x_0) = y_0. \tag{2.15} \]
Define the functions
\[ F : I \to \mathbb{R}, \quad F(x) := \int_{x_0}^{x} f(t)\, dt, \qquad G : J \to \mathbb{R}, \quad G(y) := \int_{y_0}^{y} \frac{dt}{g(t)}. \tag{2.16} \]

(a) Uniqueness: On each open interval $I' \subseteq I$ satisfying $x_0 \in I'$ and $F(I') \subseteq G(J)$, the
initial value problem consisting of (2.14) and (2.15) has a unique solution. This
unique solution is given by
\[ \phi : I' \to \mathbb{R}, \quad \phi(x) := G^{-1}\bigl(F(x)\bigr), \tag{2.17} \]
where $G^{-1} : G(J) \to J$ is the inverse function of G on G(J).

(b) Existence: There exists an open interval $I' \subseteq I$ satisfying $x_0 \in I'$ and $F(I') \subseteq G(J)$,
i.e. an $I'$ such that (a) applies.

Proof. (a): We begin by proving G has a differentiable inverse function $G^{-1} : G(J) \to J$.
According to the fundamental theorem of calculus [Phi16, Th. 10.20(a)], G is
differentiable with $G' = 1/g$. Since g is continuous and nonzero, G is even $C^1$. If
$G'(y_0) = 1/g(y_0) > 0$, then G is strictly increasing on J (due to the intermediate
value theorem [Phi16, Th. 7.57]; $g(y_0) > 0$, the continuity of g, and $g \neq 0$ imply that
$g > 0$ on J). Analogously, if $G'(y_0) = 1/g(y_0) < 0$, then G is strictly decreasing on J.
In each case, G has a differentiable inverse function on G(J) by [Phi16, Th. 9.9].
In the next step, we verify that (2.17) does, indeed, define a solution to (2.14) and
(2.15). The assumption $F(I') \subseteq G(J)$ and the existence of $G^{-1}$ as shown above provide
that φ is well-defined by (2.17). Verifying (2.15) is quite simple: $\phi(x_0) = G^{-1}(F(x_0)) =
G^{-1}(0) = y_0$. To see φ to be a solution of (2.14), notice that (2.17) implies $F = G \circ \phi$
on $I'$. Thus, we can apply the chain rule to obtain the derivative of $F = G \circ \phi$ on $I'$:
\[ \underset{x \in I'}{\forall} \quad f(x) = F'(x) = G'\bigl(\phi(x)\bigr)\, \phi'(x) = \frac{\phi'(x)}{g\bigl(\phi(x)\bigr)}, \tag{2.18} \]
showing φ satisfies (2.14).

We now proceed to show that each solution $\phi : I' \to \mathbb{R}$ to (2.14) that satisfies (2.15)
must also satisfy (2.17). Since φ is a solution to (2.14),
\[ \frac{\phi'(x)}{g\bigl(\phi(x)\bigr)} = f(x) \quad \text{for each } x \in I'. \tag{2.19} \]
Integrating (2.19) yields
\[ \int_{x_0}^{x} \frac{\phi'(t)}{g\bigl(\phi(t)\bigr)}\, dt = \int_{x_0}^{x} f(t)\, dt = F(x) \quad \text{for each } x \in I'. \tag{2.20} \]
Using the change of variables formula of [Phi16, Th. 10.25] in the left-hand side of (2.20)
allows one to replace $\phi(t)$ by the new integration variable u (note that each solution
$\phi : I' \to \mathbb{R}$ to (2.14) is in $C^1(I')$ since f and g are presumed continuous). Thus, we
obtain from (2.20):
\[ F(x) = \int_{\phi(x_0)}^{\phi(x)} \frac{du}{g(u)} = \int_{y_0}^{\phi(x)} \frac{du}{g(u)} = G\bigl(\phi(x)\bigr) \quad \text{for each } x \in I'. \tag{2.21} \]
Applying $G^{-1}$ to (2.21) establishes φ satisfies (2.17).

(b): During the proof of (a), we have already seen G to be either strictly increasing
or strictly decreasing. As $G(y_0) = 0$, this implies the existence of $\epsilon > 0$ such that
$]-\epsilon, \epsilon[ \subseteq G(J)$. The function F is differentiable and, in particular, continuous. Since
$F(x_0) = 0$, there is $\delta > 0$ such that, for $I' := ]x_0 - \delta, x_0 + \delta[$, one has
$F(I') \subseteq ]-\epsilon, \epsilon[ \subseteq G(J)$ as desired. □

Example 2.8. Consider the ODE
\[ y' = -\frac{y}{x} \quad \text{on } I \times J := \mathbb{R}^+ \times \mathbb{R}^+ \tag{2.22} \]
with the initial condition $y(1) = c$ for some given $c \in \mathbb{R}^+$. Introducing functions
\[ f : \mathbb{R}^+ \to \mathbb{R}, \quad f(x) := -\frac{1}{x}, \qquad g : \mathbb{R}^+ \to \mathbb{R}, \quad g(y) := y, \tag{2.23} \]
one sees that Th. 2.7 applies. To compute the solution $\phi = G^{-1} \circ F$, we first have to
determine F and G:
\begin{align*}
F : \mathbb{R}^+ \to \mathbb{R}, \quad F(x) &= \int_1^x f(t)\, dt = -\int_1^x \frac{dt}{t} = -\ln x, \tag{2.24a} \\
G : \mathbb{R}^+ \to \mathbb{R}, \quad G(y) &= \int_c^y \frac{dt}{g(t)} = \int_c^y \frac{dt}{t} = \ln \frac{y}{c}. \tag{2.24b}
\end{align*}
Here, we can choose $I' = I = \mathbb{R}^+$, because $F(\mathbb{R}^+) = \mathbb{R} = G(\mathbb{R}^+)$. That means φ is
defined on the entire interval I. The inverse function of G is given by
\[ G^{-1} : \mathbb{R} \to \mathbb{R}^+, \quad G^{-1}(t) = c\, e^t. \tag{2.25} \]
Finally, we get
\[ \phi : \mathbb{R}^+ \to \mathbb{R}, \quad \phi(x) = G^{-1}\bigl(F(x)\bigr) = c\, e^{-\ln x} = \frac{c}{x}. \tag{2.26} \]
The uniqueness part of Th. 2.7 further tells us the above initial value problem can have
no solution different from φ.
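
The construction of Example 2.8 translates directly into code: one function per ingredient F, G, and the inverse of G, composed as in (2.17). The following sketch is illustrative only; the value c = 2.5, the sample points, and the finite-difference check `ode_residual` are assumptions.

```python
import math

def F(x):
    """F from Example 2.8: integral of f(t) = -1/t from 1 to x."""
    return -math.log(x)

def G(y, c):
    """G from Example 2.8: integral of 1/g(t) = 1/t from c to y."""
    return math.log(y / c)

def G_inverse(t, c):
    """Inverse function of G."""
    return c * math.exp(t)

def phi(x, c):
    """Solution assembled as G^{-1}(F(x)); it should agree with c/x."""
    return G_inverse(F(x), c)

def ode_residual(x, c, h=1e-6):
    """Central-difference estimate of phi'(x) + phi(x)/x, which should vanish."""
    return (phi(x + h, c) - phi(x - h, c)) / (2.0 * h) + phi(x, c) / x

c = 2.5
xs = [0.5, 1.0, 2.0, 10.0]
values = [phi(x, c) for x in xs]
residuals = [ode_residual(x, c) for x in xs]
```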

The advantage of using Th. 2.7 as in the previous example, by computing the relevant
functions F , G, and G−1 , is that it is mathematically rigorous. In particular, one can be
sure one has found the unique solution to the ODE with initial condition. However, in
practice, it is often easier to use the following heuristic (not entirely rigorous) procedure.
In the end, in most cases, one can easily check by differentiation that the function found
is, indeed, a solution to the ODE with initial condition. However, one does not know
uniqueness without further investigations (general results such as Th. 3.15 below can
often help). One also has to determine on which interval the found solution is defined.
On the other hand, as one is usually interested in choosing the interval as large as
possible, the optimal choice is not always obvious when using Th. 2.7, either.
The heuristic procedure is as follows: Start with the ODE (2.14) written in the form
\[ \frac{dy}{dx} = f(x)\, g(y). \tag{2.27a} \]
Multiply by dx and divide by g(y) (i.e. separate the variables):
\[ \frac{dy}{g(y)} = f(x)\, dx. \tag{2.27b} \]
Integrate:
\[ \int \frac{dy}{g(y)} = \int f(x)\, dx. \tag{2.27c} \]
Change the integration variables and supply the appropriate upper and lower limits for
the integrals (according to the initial condition):
\[ \int_{y_0}^{y} \frac{dt}{g(t)} = \int_{x_0}^{x} f(t)\, dt. \tag{2.27d} \]

Solve this equation for y, set φ(x) := y, check by differentiation that φ is, indeed, a
solution to the ODE, and determine the largest interval I ′ such that x0 ∈ I ′ and such
that φ is defined on I ′ . The use of this heuristic procedure is demonstrated by the
following example:

Example 2.9. Consider the ODE

y ′ = −y 2 on I × J := R × R (2.28)
2 ELEMENTARY SOLUTION METHODS 18

with the initial condition y(x0 ) = y0 for given values x0 , y0 ∈ R. We manipulate (2.28)
according to the heuristic procedure described in (2.27) above:

dy
Z Z
2 −2 −2
= −y −y dy = dx − y dy = dx
dx
Z y Z x  y
−2 1 1 1
− t dt = dt = [t]xx0 − = x − x0
y0 x0 t y0 y y0
y0
φ(x) = y = . (2.29)
1 + (x − x0 ) y0

Clearly, φ(x0 ) = y0 . Moreover,

y02 2
φ′ (x) = − 2 = − φ(x) , (2.30)
1 + (x − x0 ) y0

i.e. φ does, indeed, provide a solution to (2.28). If y0 = 0, then φ ≡ 0 is defined
on the entire interval I = R. If y0 ≠ 0, then the denominator of φ(x) has a zero at
x = (x0 y0 − 1)/y0, and φ is not defined on all of R. In that case, if y0 > 0, then
x0 > (x0 y0 − 1)/y0 = x0 − 1/y0 and the maximal open interval for φ to be defined on
is I′ = ]x0 − 1/y0, ∞[; if y0 < 0, then x0 < (x0 y0 − 1)/y0 = x0 − 1/y0 and the maximal
open interval for φ to be defined on is I′ = ]−∞, x0 − 1/y0[. Note that the formula for φ
obtained by (2.29) works for y0 = 0 as well, even though not every previous expression
in (2.29) is meaningful for y0 = 0 and, also, Th. 2.7 does not apply to (2.28) for y0 = 0.
In the present example, the subsequent Th. 3.15 does, indeed, imply φ to be the unique
solution to the initial value problem on I ′ .
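
The final step of the heuristic procedure, checking the candidate by differentiation, can also be done numerically. The following is a minimal sketch, not part of the original notes: the function names, the sample values x0 = 0, y0 = 2, and the use of a central difference quotient in place of the exact derivative are all assumptions made for illustration.

```python
def phi(x, x0, y0):
    """Candidate solution from (2.29): y0 / (1 + (x - x0) * y0)."""
    return y0 / (1.0 + (x - x0) * y0)


def residual(x, x0, y0, h=1e-6):
    """Central-difference estimate of phi'(x) + phi(x)**2.

    For a solution of y' = -y**2 this should be close to zero.
    """
    dphi = (phi(x + h, x0, y0) - phi(x - h, x0, y0)) / (2.0 * h)
    return dphi + phi(x, x0, y0) ** 2


# Illustrative initial condition (hypothetical values): y(0) = 2.
x0, y0 = 0.0, 2.0
pole = x0 - 1.0 / y0  # zero of the denominator; I' = ]pole, oo[ since y0 > 0

for x in (-0.4, 0.0, 1.0, 10.0):  # sample points inside I'
    print(x, phi(x, x0, y0), residual(x, x0, y0))
```

Such a check does not replace the computation (2.30); it merely guards against algebra slips when solving (2.27d) for y.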

2.4 Change of Variables


To solve an ODE, it can be useful to transform it into an equivalent ODE, using a
so-called change of variables. If one already knows how to solve the transformed ODE,
then the equivalence allows one to also solve the original ODE. We first present the
following Th. 2.10, which constitutes the base for the change of variables technique,
followed by examples, where the technique is applied.

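Theorem 2.10. Let $G \subseteq \mathbb{R} \times \mathbb{K}^n$ be open, n ∈ N, $f : G \to \mathbb{K}^n$, and (x0, y0) ∈ G.
Define
\[ \underset{x \in \mathbb{R}}{\forall} \quad G_x := \{ y \in \mathbb{K}^n : (x, y) \in G \} \tag{2.31} \]
and assume the change of variables function $T : G \to \mathbb{K}^n$ is differentiable and such
that
\[ \underset{G_x \neq \emptyset}{\forall} \quad \Big( T_x := T(x, \cdot) : G_x \to T_x(G_x),\quad T_x(y) := T(x, y), \quad \text{is a diffeomorphism} \Big), \tag{2.32} \]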

i.e. Tx is invertible and both Tx and Tx−1 are differentiable. Then the first-order initial
value problems
y ′ = f (x, y), (2.33a)
y(x0 ) = y0 , (2.33b)
and
\begin{align*}
y' &= \big(DT_x^{-1}(y)\big)^{-1} f\big(x, T_x^{-1}(y)\big) + \partial_x T\big(x, T_x^{-1}(y)\big), \tag{2.34a} \\
y(x_0) &= T(x_0, y_0), \tag{2.34b}
\end{align*}


are equivalent in the following sense:

(a) A differentiable function $\phi : I \to \mathbb{K}^n$, where I ⊆ R is a nontrivial interval, is a
solution to (2.33a) if, and only if, the function
\[ \mu : I \to \mathbb{K}^n, \quad \mu(x) := (T_x \circ \phi)(x) = T\big(x, \phi(x)\big), \tag{2.35} \]
is a solution to (2.34a).
(b) A differentiable function φ : I −→ Kn , where I ⊆ R is a nontrivial interval, is a
solution to (2.33) if, and only if, the function of (2.35) is a solution to (2.34).

Proof. We start by noting that the assumption of G being open clearly implies each $G_x$,
x ∈ R, to be open as well, which, in turn, implies $T_x(G_x)$ to be open, even though this
is not as obvious¹. Next, for each x ∈ R such that $G_x \neq \emptyset$, we can apply the chain rule
[Phi15, Th. 2.28] to $T_x \circ T_x^{-1} = \mathrm{Id}$ to obtain
\[ \underset{y \in T_x(G_x)}{\forall} \quad DT_x\big(T_x^{-1}(y)\big) \circ DT_x^{-1}(y) = \mathrm{Id} \tag{2.36} \]
and, thus, each $DT_x^{-1}(y)$ is invertible with
\[ \underset{y \in T_x(G_x)}{\forall} \quad \big(DT_x^{-1}(y)\big)^{-1} = DT_x\big(T_x^{-1}(y)\big). \tag{2.37} \]

Consider φ and µ as in (a) and notice that (2.35) implies
\[ \underset{x \in I}{\forall} \quad \phi(x) = T_x^{-1}\big(\mu(x)\big). \tag{2.38} \]

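Moreover, the differentiability of φ and T implies the differentiability of µ by the chain rule,
which also yields
\[ \underset{x \in I}{\forall} \quad \mu'(x) = DT\big(x, \phi(x)\big) \begin{pmatrix} 1 \\ \phi'(x) \end{pmatrix} = DT_x\big(\phi(x)\big)\,\phi'(x) + \partial_x T\big(x, \phi(x)\big). \tag{2.39} \]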
¹ If $T_x$ is a continuously differentiable map, then this is related to the inverse function theorem (see,
e.g. [Phi15, Cor. C.9]); it is still true if $T_x$ is merely continuous and injective, but then it is the invariance
of domain theorem of algebraic topology [Oss09, 5.6.15], which is equivalent to the Brouwer fixed-point
theorem [Oss09, 5.6.10], and is much harder to prove.

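To prove (a), first assume $\phi : I \to \mathbb{K}^n$ to be a solution of (2.33a). Then, for each
x ∈ I,
\begin{align*}
\mu'(x) &\overset{(2.39),(2.33a)}{=} DT_x\big(\phi(x)\big)\, f\big(x, \phi(x)\big) + \partial_x T\big(x, \phi(x)\big) \\
&\overset{(2.38)}{=} DT_x\big(T_x^{-1}(\mu(x))\big)\, f\big(x, T_x^{-1}(\mu(x))\big) + \partial_x T\big(x, T_x^{-1}(\mu(x))\big) \\
&\overset{(2.37)}{=} \big(DT_x^{-1}(\mu(x))\big)^{-1} f\big(x, T_x^{-1}(\mu(x))\big) + \partial_x T\big(x, T_x^{-1}(\mu(x))\big), \tag{2.40}
\end{align*}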

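showing µ satisfies (2.34a). Conversely, assume µ to be a solution to (2.34a). Then, for
each x ∈ I,
\begin{align*}
&\big(DT_x^{-1}(\mu(x))\big)^{-1} f\big(x, T_x^{-1}(\mu(x))\big) + \partial_x T\big(x, T_x^{-1}(\mu(x))\big) \\
&\overset{(2.34a)}{=} \mu'(x) \overset{(2.39)}{=} DT_x\big(\phi(x)\big)\,\phi'(x) + \partial_x T\big(x, \phi(x)\big). \tag{2.41}
\end{align*}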

Using (2.38), one can subtract the second summand from (2.41). Multiplying the result
by $DT_x^{-1}(\mu(x))$ from the left and taking into account (2.37) then provides
\[ \underset{x \in I}{\forall} \quad \phi'(x) = f\big(x, T_x^{-1}(\mu(x))\big) \overset{(2.38)}{=} f\big(x, \phi(x)\big), \tag{2.42} \]
showing φ satisfies (2.33a).


It remains to prove (b). If φ satisfies (2.33), then µ satisfies (2.34a) by (a). Moreover,
$\mu(x_0) = T\big(x_0, \phi(x_0)\big) = T(x_0, y_0)$, i.e. µ satisfies (2.34b) as well. Conversely,
assume µ satisfies (2.34). Then φ satisfies (2.33a) by (a). Moreover, by (2.38),
$\phi(x_0) = T_{x_0}^{-1}\big(\mu(x_0)\big) = T_{x_0}^{-1}\big(T(x_0, y_0)\big) = y_0$, showing φ satisfies (2.33b) as well. ∎

As a first application of Th. 2.10, we prove the following theorem about so-called
Bernoulli differential equations:

Theorem 2.11. Consider the Bernoulli differential equation
\[ y' = f(x, y) := a(x)\,y + b(x)\,y^{\alpha}, \tag{2.43a} \]
where α ∈ R \ {0, 1}, the functions a, b : I −→ R are continuous and defined on an open
interval I ⊆ R, and $f : I \times \mathbb{R}^+ \to \mathbb{R}$. For (2.43a), we add the initial condition
\[ y(x_0) = y_0, \quad (x_0, y_0) \in I \times \mathbb{R}^+, \tag{2.43b} \]
and, furthermore, we also consider the corresponding linear initial value problem
\begin{align*}
y' &= (1 - \alpha)\big(a(x)\,y + b(x)\big), \tag{2.44a} \\
y(x_0) &= y_0^{1-\alpha}, \tag{2.44b}
\end{align*}
with its unique solution ψ : I −→ R given by Th. 2.3.



(a) Uniqueness: On each open interval I′ ⊆ I satisfying x0 ∈ I′ and ψ > 0 on I′, the
Bernoulli initial value problem (2.43) has a unique solution. This unique solution
is given by
\[ \phi : I' \to \mathbb{R}^+, \quad \phi(x) := \big(\psi(x)\big)^{\frac{1}{1-\alpha}}. \tag{2.45} \]

(b) Existence: There exists an open interval I′ ⊆ I satisfying x0 ∈ I′ and ψ > 0 on I′,
i.e. an I′ such that (a) applies.

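Proof. (b) is immediate from Th. 2.3, since $\psi(x_0) = y_0^{1-\alpha} > 0$ and ψ is continuous.
To prove (a), we apply Th. 2.10 with the change of variables
\[ T : I \times \mathbb{R}^+ \to \mathbb{R}^+, \quad T(x, y) := y^{1-\alpha}. \tag{2.46} \]
Then $T \in C^1(I \times \mathbb{R}^+, \mathbb{R})$ with $\partial_x T \equiv 0$ and $\partial_y T(x, y) = (1 - \alpha)\,y^{-\alpha}$. Moreover,
\[ \underset{x \in I}{\forall} \quad T_x = S, \quad S : \mathbb{R}^+ \to \mathbb{R}^+, \quad S(y) := y^{1-\alpha}, \tag{2.47} \]
which is differentiable with the differentiable inverse function $S^{-1} : \mathbb{R}^+ \to \mathbb{R}^+$,
$S^{-1}(y) = y^{\frac{1}{1-\alpha}}$, $DS^{-1}(y) = (S^{-1})'(y) = \frac{1}{1-\alpha}\, y^{\frac{\alpha}{1-\alpha}}$. Thus, (2.34a) takes the form
\begin{align*}
y' &= \big(DT_x^{-1}(y)\big)^{-1} f\big(x, T_x^{-1}(y)\big) + \partial_x T\big(x, T_x^{-1}(y)\big) \\
&= (1 - \alpha)\, y^{-\frac{\alpha}{1-\alpha}} \Big( a(x)\, y^{\frac{1}{1-\alpha}} + b(x)\, y^{\frac{\alpha}{1-\alpha}} \Big) + 0 \\
&= (1 - \alpha)\big(a(x)\,y + b(x)\big). \tag{2.48}
\end{align*}
Thus, if I′ ⊆ I is such that x0 ∈ I′ and ψ > 0 on I′, then Th. 2.10 says φ defined
by (2.45) must be a solution to (2.43) (note that the differentiability of ψ implies the
differentiability of φ). On the other hand, if $\lambda : I' \to \mathbb{R}^+$ is an arbitrary solution to
(2.43), then Th. 2.10 states $\mu := S \circ \lambda = \lambda^{1-\alpha}$ to be a solution to (2.44). The uniqueness
part of Th. 2.3 then yields $\lambda^{1-\alpha} = \psi{\restriction_{I'}} = \phi^{1-\alpha}$, i.e. λ = φ. ∎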
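
To see Th. 2.11 at work on a concrete case, the following sketch takes the (hypothetical) data a ≡ 1, b ≡ −1, α = 2, x0 = 0, y0 = 1/4, so that (2.43a) is the logistic equation y′ = y − y². The linear problem (2.44) then reads y′ = −(y − 1), y(0) = 4, whose solution is written out by hand in `psi`; the residual is estimated with a central difference quotient. None of these names or values appear in the notes.

```python
import math

ALPHA = 2.0          # Bernoulli exponent (assumed for this illustration)
X0, Y0 = 0.0, 0.25   # initial condition with Y0 > 0


def a(x):
    return 1.0


def b(x):
    return -1.0


def psi(x):
    """Solution of the linear problem (2.44) for the data above:
    y' = (1 - ALPHA) * (a*y + b) = -(y - 1),  y(X0) = Y0**(1 - ALPHA)."""
    return 1.0 + (Y0 ** (1.0 - ALPHA) - 1.0) * math.exp(-(x - X0))


def phi(x):
    """Candidate Bernoulli solution (2.45): psi(x)**(1 / (1 - ALPHA))."""
    return psi(x) ** (1.0 / (1.0 - ALPHA))


def bernoulli_residual(x, h=1e-6):
    """Central-difference estimate of phi'(x) - (a*phi + b*phi**ALPHA)."""
    dphi = (phi(x + h) - phi(x - h)) / (2.0 * h)
    return dphi - (a(x) * phi(x) + b(x) * phi(x) ** ALPHA)


for x in (-2.0, 0.0, 1.0, 3.0):
    print(x, psi(x), phi(x), bernoulli_residual(x))
```

In this particular case ψ > 0 on all of R, so I′ = R is admissible in Th. 2.11(a).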

Example 2.12. Consider the initial value problem
\begin{align*}
y' &= f(x, y) := i - \frac{1}{ix - y + 2}, \tag{2.49a} \\
y(1) &= i, \tag{2.49b}
\end{align*}
where f : G −→ C, G := {(x, y) ∈ R × C : ix − y + 2 ≠ 0} (G is open as the continuous
preimage of the open set C \ {0}). We apply the change of variables
\[ T : G \to \mathbb{C}, \quad T(x, y) := ix - y. \tag{2.50} \]
Then, $T \in C^1(G, \mathbb{C})$. Moreover, for each x ∈ R,
\[ G_x = \{ y \in \mathbb{C} : (x, y) \in G \} = \mathbb{C} \setminus \{ ix + 2 \} \tag{2.51} \]



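and we have the diffeomorphisms
\begin{align*}
T_x &: \mathbb{C} \setminus \{ ix + 2 \} \to \mathbb{C} \setminus \{ -2 \}, & T_x(y) &= ix - y, \tag{2.52a} \\
T_x^{-1} &: \mathbb{C} \setminus \{ -2 \} \to \mathbb{C} \setminus \{ ix + 2 \}, & T_x^{-1}(y) &= ix - y. \tag{2.52b}
\end{align*}
To obtain the transformed equation, we compute the right-hand side of (2.34a):
\[ \big(DT_x^{-1}(y)\big)^{-1} f\big(x, T_x^{-1}(y)\big) + \partial_x T\big(x, T_x^{-1}(y)\big) = (-1) \cdot \Big( i - \frac{1}{y + 2} \Big) + i = \frac{1}{y + 2}. \tag{2.53} \]
Thus, the transformed initial value problem is
\begin{align*}
y' &= \frac{1}{y + 2}, \tag{2.54a} \\
y(1) &= T(1, i) = i - i = 0. \tag{2.54b}
\end{align*}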

Using separation of variables, one finds the solution
\[ \mu : \,]{-1}, \infty[\, \to \,]{-2}, \infty[, \quad \mu(x) := \sqrt{2x + 2} - 2, \tag{2.55} \]
to (2.54). Then Th. 2.10 implies that
\[ \phi : \,]{-1}, \infty[\, \to \mathbb{C}, \quad \phi(x) := T_x^{-1}\big(\mu(x)\big) = ix - \sqrt{2x + 2} + 2, \tag{2.56} \]
is a solution to (2.49) (that φ is a solution to (2.49) can now also easily be checked
directly). It will become clear from Th. 3.15 below that φ and µ are also the unique
solutions to their respective initial value problems.
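
The direct check mentioned in Example 2.12 can be sketched numerically with Python's built-in complex numbers. The function names and the sample points are assumptions of this illustration, and the derivative is only approximated by a central difference quotient.

```python
import math


def f(x, y):
    """Right-hand side of (2.49a); y may be complex."""
    return 1j - 1.0 / (1j * x - y + 2.0)


def mu(x):
    """Solution (2.55) of the transformed problem, defined for x > -1."""
    return math.sqrt(2.0 * x + 2.0) - 2.0


def phi(x):
    """Candidate solution (2.56): T_x^{-1}(mu(x)) = i*x - mu(x)."""
    return 1j * x - mu(x)


def residual(x, h=1e-6):
    """Modulus of the central-difference estimate of phi'(x) - f(x, phi(x))."""
    dphi = (phi(x + h) - phi(x - h)) / (2.0 * h)
    return abs(dphi - f(x, phi(x)))


print(phi(1.0))  # should reproduce the initial value i
for x in (-0.5, 1.0, 5.0):
    print(x, residual(x))
```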

Finding a suitable change of variables to transform a given ODE such that one is in a
position to solve the transformed ODE is an art, i.e. it can be very difficult to spot a
useful transformation, and it takes a lot of practice and experience.
Remark 2.13. Somewhat analogous to the situation described in the paragraph before
(2.27) regarding the separation of variables technique, in practice, one frequently uses a
heuristic procedure to apply a change of variables, rather than appealing to the rigorous
Th. 2.10. For the initial value problem y′ = f(x, y), y(x0) = y0, this heuristic procedure
proceeds as follows:

(1) One introduces the new variable z := T(x, y) and then computes z′, i.e. the derivative
of the function x ↦ z(x) = T(x, y(x)).

(2) In the result of (1), one eliminates all occurrences of the variable y by first replacing
y′ by f(x, y) and then replacing y by $T_x^{-1}(z)$, where $T_x(y) := T(x, y) = z$ (i.e. one has
to solve the equation z = T(x, y) for y). One thereby obtains the transformed initial
value problem z′ = g(x, z), z(x0) = T(x0, y0), with a suitable function g.

(3) One solves the transformed initial value problem to obtain a solution µ, and then
$x \mapsto \phi(x) := T_x^{-1}\big(\mu(x)\big)$ yields a candidate for a solution to the original initial value
problem.

(4) One checks that φ is, indeed, a solution to y ′ = f (x, y), y(x0 ) = y0 .

Example 2.14. Consider
\[ f : \mathbb{R}^+ \times \mathbb{R} \to \mathbb{R}, \quad f(x, y) := 1 + \frac{y}{x} + \frac{y^2}{x^2}, \tag{2.57} \]
and the initial value problem
\[ y' = f(x, y), \quad y(1) = 0. \tag{2.58} \]
We introduce the change of variables z := T(x, y) := y/x and proceed according to the
steps of Rem. 2.13. According to (1), we compute, using the quotient rule,
\[ z'(x) = \frac{y'(x)\,x - y(x)}{x^2}. \tag{2.59} \]
According to (2), we replace y′(x) by f(x, y) and then replace y by $T_x^{-1}(z) = xz$ to
obtain the transformed initial value problem
\[ z' = \frac{1}{x}\Big( 1 + \frac{y}{x} + \frac{y^2}{x^2} \Big) - \frac{y}{x^2} = \frac{1}{x}\,(1 + z + z^2) - \frac{z}{x} = \frac{1 + z^2}{x}, \quad z(1) = 0/1 = 0. \tag{2.60} \]
According to (3), we next solve (2.60), e.g. by separation of variables, to obtain the
solution
\[ \mu : \,\big]e^{-\pi/2}, e^{\pi/2}\big[\, \to \mathbb{R}, \quad \mu(x) := \tan \ln x, \tag{2.61} \]
of (2.60), and
\[ \phi : \,\big]e^{-\pi/2}, e^{\pi/2}\big[\, \to \mathbb{R}, \quad \phi(x) := x\,\mu(x) = x \tan \ln x, \tag{2.62} \]
as a candidate for a solution to (2.58). Finally, according to (4), we check that φ is,
indeed, a solution to (2.58): Due to φ(1) = 1 · tan 0 = 0, φ satisfies the initial condition,
and due to
\begin{align*}
\phi'(x) &= \tan \ln x + x \cdot \frac{1}{x}\,(1 + \tan^2 \ln x) = 1 + \tan \ln x + \tan^2 \ln x \\
&= 1 + \frac{\phi(x)}{x} + \frac{\phi^2(x)}{x^2}, \tag{2.63}
\end{align*}
φ satisfies the ODE.
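
Step (4) of Rem. 2.13 for Example 2.14 can also be sketched in a few lines of code. This is only an illustration under stated assumptions: the names are made up here, the derivative is approximated by a central difference quotient, and the sample points are chosen inside the interval ]e^{−π/2}, e^{π/2}[ ≈ ]0.208, 4.81[.

```python
import math


def f(x, y):
    """Right-hand side (2.57), defined for x > 0."""
    return 1.0 + y / x + (y / x) ** 2


def mu(x):
    """Solution (2.61) of the transformed problem z' = (1 + z**2) / x, z(1) = 0."""
    return math.tan(math.log(x))


def phi(x):
    """Candidate (2.62) obtained by undoing z = y / x."""
    return x * mu(x)


def residual(x, h=1e-6):
    """Central-difference estimate of phi'(x) - f(x, phi(x))."""
    dphi = (phi(x + h) - phi(x - h)) / (2.0 * h)
    return dphi - f(x, phi(x))


lower, upper = math.exp(-math.pi / 2), math.exp(math.pi / 2)
for x in (0.5, 1.0, 2.0, 3.0):
    assert lower < x < upper
    print(x, phi(x), residual(x))
```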
3 GENERAL THEORY 24

3 General Theory

3.1 Equivalence Between Higher-Order ODE and Systems of First-Order ODE

It turns out that each one-dimensional kth-order ODE is equivalent to a system of k
first-order ODE; more generally, that each n-dimensional kth-order ODE is equivalent
to a kn-dimensional first-order ODE (i.e. to a system of kn one-dimensional first-order
ODE). Even though, in this class, we will mainly consider explicit ODE, we provide the
equivalence also for the implicit case, as the proof is essentially the same (the explicit
case is included as a special case).
Theorem 3.1. In the situation of Def. 1.2(a), i.e. $U \subseteq \mathbb{R} \times \mathbb{K}^{(k+1)n}$ and $F : U \to \mathbb{K}^n$,
plus $(x_0, y_{0,0}, \dots, y_{0,k-1}) \in \mathbb{R} \times \mathbb{K}^{kn}$, consider the kth-order initial value problem
\begin{align*}
& F(x, y, y', \dots, y^{(k)}) = 0, \tag{3.1a} \\
& \underset{j \in \{0, \dots, k-1\}}{\forall} \quad y^{(j)}(x_0) = y_{0,j}, \tag{3.1b}
\end{align*}
and the first-order initial value problem
\[
\begin{aligned}
y_1' - y_2 &= 0, \\
y_2' - y_3 &= 0, \\
&\;\;\vdots \\
y_{k-1}' - y_k &= 0, \\
F(x, y_1, \dots, y_k, y_k') &= 0,
\end{aligned}
\tag{3.2a}
\]
\[ y(x_0) = \begin{pmatrix} y_{0,0} \\ \vdots \\ y_{0,k-1} \end{pmatrix} \tag{3.2b} \]
(note that the unknown function y in (3.1) is $\mathbb{K}^n$-valued, whereas the unknown function
y in (3.2) is $\mathbb{K}^{kn}$-valued). Then both initial value problems are equivalent in the following
sense:

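(a) If $\phi : I \to \mathbb{K}^n$ is a solution to (3.1), then
\[ \Phi : I \to \mathbb{K}^{kn}, \quad \Phi := \begin{pmatrix} \phi \\ \phi' \\ \vdots \\ \phi^{(k-1)} \end{pmatrix}, \tag{3.3} \]
is a solution to (3.2).

(b) If $\Phi : I \to \mathbb{K}^{kn}$ is a solution to (3.2), then $\phi := \Phi_1$ (which is $\mathbb{K}^n$-valued) is a
solution to (3.1).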

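Proof. We rewrite (3.2a) as
\[ G(x, y, y') = 0, \tag{3.4} \]
where
\[
\begin{aligned}
& G : V \to \mathbb{K}^{kn}, \\
& V := \big\{ (x, y, z) \in \mathbb{R} \times \mathbb{K}^{kn} \times \mathbb{K}^{kn} : (x, y, z_k) \in U \big\} \subseteq \mathbb{R} \times \mathbb{K}^{kn} \times \mathbb{K}^{kn}, \\
& G_1(x, y, z) := z_1 - y_2, \\
& G_2(x, y, z) := z_2 - y_3, \\
& \qquad \vdots \\
& G_{k-1}(x, y, z) := z_{k-1} - y_k, \\
& G_k(x, y, z) := F(x, y, z_k).
\end{aligned}
\tag{3.5}
\]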

(a): As a solution to (3.1), φ is k times differentiable and Φ is well-defined. Then (3.1b)
implies (3.2b), since
\[ \Phi(x_0) = \begin{pmatrix} \phi(x_0) \\ \phi'(x_0) \\ \vdots \\ \phi^{(k-1)}(x_0) \end{pmatrix} \overset{(3.1b)}{=} \begin{pmatrix} y_{0,0} \\ \vdots \\ y_{0,k-1} \end{pmatrix}. \]

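Next, Def. 1.2(a)(i) for φ implies Def. 1.2(a)(i) for Φ, since
\begin{align*}
& \big\{ \big(x, \Phi(x), \Phi'(x)\big) \in I \times \mathbb{K}^{kn} \times \mathbb{K}^{kn} : x \in I \big\} \\
&\overset{(3.3)}{=} \big\{ \big(x, \phi(x), \phi'(x), \dots, \phi^{(k-1)}(x), \phi'(x), \dots, \phi^{(k)}(x)\big) \in I \times \mathbb{K}^{kn} \times \mathbb{K}^{kn} : x \in I \big\} \\
&\overset{(x, \phi(x), \dots, \phi^{(k)}(x)) \in U}{\subseteq} V. \tag{3.6}
\end{align*}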

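The definition of Φ in (3.3) implies
\[ \underset{j \in \{1, \dots, k-1\}}{\forall} \quad \Phi_j' = (\phi^{(j-1)})' = \phi^{(j)} = \Phi_{j+1}, \]
showing Φ satisfies the first k − 1 equations of (3.2a). As
\[ \Phi_k'(x) = (\phi^{(k-1)})'(x) = \phi^{(k)}(x) \]
and, thus,
\[ \underset{x \in I}{\forall} \quad G_k\big(x, \Phi(x), \Phi'(x)\big) = F\big(x, \phi(x), \phi'(x), \dots, \phi^{(k)}(x)\big) \overset{(3.1a)}{=} 0, \tag{3.7} \]
Φ also satisfies the last equation of (3.2a).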


(b): As Φ is a solution to (3.2), the first k − 1 equations of (3.2a) imply
\[ \underset{j \in \{1, \dots, k-1\}}{\forall} \quad \Phi_{j+1} = \Phi_j' \overset{\phi = \Phi_1}{=} \phi^{(j)}, \]

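i.e. φ is k times differentiable and Φ has, once again, the form (3.3) (note $\Phi_1 = \phi$ by
the definition of φ). Then, clearly, (3.2b) implies (3.1b), and Def. 1.2(a)(i) for Φ implies
Def. 1.2(a)(i) for φ:
\[ \big\{ \big(x, \Phi(x), \Phi'(x)\big) \in I \times \mathbb{K}^{kn} \times \mathbb{K}^{kn} : x \in I \big\} \subseteq V \]
and the definition of V in (3.5) imply
\[ \big\{ \big(x, \phi(x), \dots, \phi^{(k)}(x)\big) \in I \times \mathbb{K}^{(k+1)n} : x \in I \big\} \subseteq U. \]
Finally, from the last equation of (3.2a), one obtains
\[ \underset{x \in I}{\forall} \quad F\big(x, \phi(x), \dots, \phi^{(k)}(x)\big) \overset{(3.3),(3.5)}{=} G_k\big(x, \Phi(x), \Phi'(x)\big) = 0, \]
proving φ satisfies (3.1a). ∎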

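Example 3.2. The second-order initial value problem
\[ y'' = -y, \quad y(0) = 0, \quad y'(0) = r, \quad r \in \mathbb{R} \text{ given}, \tag{3.8} \]
is equivalent to the following system of two first-order ODE:
\[
\begin{aligned}
y_1' &= y_2, \\
y_2' &= -y_1,
\end{aligned}
\qquad y(0) = \begin{pmatrix} 0 \\ r \end{pmatrix}. \tag{3.9}
\]
The solution to (3.9) is
\[ \Phi : \mathbb{R} \to \mathbb{R}^2, \quad \Phi(x) = \begin{pmatrix} \Phi_1(x) \\ \Phi_2(x) \end{pmatrix} = \begin{pmatrix} r \sin x \\ r \cos x \end{pmatrix}, \tag{3.10} \]
and, thus, the solution to (3.8) is
\[ \phi : \mathbb{R} \to \mathbb{R}, \quad \phi(x) = r \sin x. \tag{3.11} \]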

As a consequence of Th. 3.1, one can carry out much of the general theory of ODE
(such as results regarding existence and uniqueness of solutions) for systems of first-
order ODE, obtaining the corresponding results for higher-order ODE as a corollary.
This is the strategy usually pursued in the literature and we will follow suit in this
class.
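
The reduction of Th. 3.1 is also how higher-order problems are typically fed to numerical solvers, which expect a first-order system. The following sketch treats the explicit scalar case (n = 1) and applies it to Example 3.2. Everything in it is an assumption made for illustration: the helper names, the value r = 1.5, and the use of the classical fourth-order Runge-Kutta scheme, which is not discussed in these notes and is used here only as a convenient integrator.

```python
import math


def as_first_order_system(f, k):
    """Turn y^(k) = f(x, y, y', ..., y^(k-1)) (scalar case) into Y' = F(x, Y),
    where Y = (y, y', ..., y^(k-1)), mirroring the structure of (3.2a)."""
    def F(x, Y):
        # The first k-1 components shift the vector, the last one applies f.
        return [Y[j + 1] for j in range(k - 1)] + [f(x, *Y)]
    return F


def rk4(F, x0, Y0, x_end, steps):
    """Classical Runge-Kutta integration of Y' = F(x, Y) from x0 to x_end."""
    h = (x_end - x0) / steps
    x, Y = x0, list(Y0)
    for _ in range(steps):
        k1 = F(x, Y)
        k2 = F(x + h / 2, [y + h / 2 * d for y, d in zip(Y, k1)])
        k3 = F(x + h / 2, [y + h / 2 * d for y, d in zip(Y, k2)])
        k4 = F(x + h, [y + h * d for y, d in zip(Y, k3)])
        Y = [y + h / 6 * (d1 + 2 * d2 + 2 * d3 + d4)
             for y, d1, d2, d3, d4 in zip(Y, k1, k2, k3, k4)]
        x += h
    return Y


# Example 3.2 with the hypothetical choice r = 1.5: y'' = -y, y(0) = 0, y'(0) = r.
r = 1.5
F = as_first_order_system(lambda x, y, dy: -y, k=2)
Y_end = rk4(F, 0.0, [0.0, r], x_end=2.0, steps=200)
print(Y_end, [r * math.sin(2.0), r * math.cos(2.0)])
```

The two printed vectors should agree closely, reflecting Φ = (r sin x, r cos x) from (3.10).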

3.2 Existence of Solutions


It is a rather remarkable fact that, under the very mild assumption that $f : G \to \mathbb{K}^n$ is
a continuous function defined on an open subset G of $\mathbb{R} \times \mathbb{K}^{kn}$ with $(x_0, y_{0,0}, \dots, y_{0,k-1}) \in$

G, every initial value problem (1.7) for the n-dimensional explicit kth-order ODE (1.6)
has at least one solution φ : I −→ Kn , defined on a, possibly very small, open interval.
This is the contents of the Peano Th. 3.8 below and its Cor. 3.10. From Example 1.4(b),
we already know that uniqueness of the solution cannot be expected without stronger
hypotheses.
The proof of the Peano theorem requires some work. One of the key ingredients is
the Arzelà-Ascoli Th. 3.7 that, under suitable hypotheses, guarantees a given sequence
of continuous functions to have a uniformly convergent subsequence (the formulation
in Th. 3.7 is suitable for our purposes – many different variants of the Arzelà-Ascoli
theorem exist in the literature).
We begin with some preliminaries from the theory of metric spaces. At this point, the
reader might want to review the definition of a metric, a metric space, and basic notions
on metric spaces, such as the notion of compactness and the notion of continuity of
functions between metric spaces. Also recall that every normed space is a metric space
via the metric induced by the norm (in particular, if we use metric notions on normed
spaces, they are always meant with respect to the respective induced metric). If you
are not sufficiently familiar with metrics and norms, you might want to consult the
relevant subsections of [Phi15, Sec. 1]; for compactness and some related results see,
e.g., Appendix C.2.

Notation 3.3. Let (X, d) be a metric space. Given x ∈ X and r ∈ R+ , let

Br (x) := {y ∈ X : d(x, y) < r}

denote the open ball with center x and radius r, also known as the r-ball with center x.

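Definition 3.4. Let $(X, d_X)$ and $(Y, d_Y)$ be metric spaces. We say a sequence of functions
$(f_m)_{m \in \mathbb{N}}$, $f_m : X \to Y$, converges uniformly to a function f : X −→ Y if, and
only if,
\[ \underset{\epsilon > 0}{\forall} \quad \underset{N \in \mathbb{N}}{\exists} \quad \underset{m \geq N,\; x \in X}{\forall} \quad d_Y\big(f_m(x), f(x)\big) < \epsilon. \]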

Theorem 3.5. Let (X, dX ) and (Y, dY ) be metric spaces. If the sequence (fm )m∈N of
continuous functions fm : X −→ Y converges uniformly to the function f : X −→ Y ,
then f is continuous as well.

Proof. We have to show that f is continuous at every ξ ∈ X. Thus, let ξ ∈ X and ε > 0.
Due to the uniform convergence, we can choose m ∈ N such that $d_Y\big(f_m(x), f(x)\big) < \epsilon/3$
for every x ∈ X. Moreover, as $f_m$ is continuous at ξ, there exists δ > 0 such that
$x \in B_\delta(\xi)$ implies $d_Y\big(f_m(\xi), f_m(x)\big) < \epsilon/3$. Thus, if $x \in B_\delta(\xi)$, then
\begin{align*}
d_Y\big(f(\xi), f(x)\big) &\leq d_Y\big(f(\xi), f_m(\xi)\big) + d_Y\big(f_m(\xi), f_m(x)\big) + d_Y\big(f_m(x), f(x)\big) \\
&< \frac{\epsilon}{3} + \frac{\epsilon}{3} + \frac{\epsilon}{3} = \epsilon,
\end{align*}
proving f is continuous at ξ. ∎

Definition 3.6. Let $(X, d_X)$ and $(Y, d_Y)$ be metric spaces and let F be a set of functions
from X into Y. Then the set F (or the functions in F) are said to be uniformly
equicontinuous if, and only if, for each ε > 0, there is δ > 0 such that
\[ \underset{x, \xi \in X}{\forall} \quad \Big( d_X(x, \xi) < \delta \;\Rightarrow\; \underset{f \in F}{\forall} \; d_Y\big(f(x), f(\xi)\big) < \epsilon \Big). \tag{3.12} \]

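Theorem 3.7 (Arzelà-Ascoli). Let n ∈ N, let ‖·‖ denote some norm on $\mathbb{K}^n$, and let
I ⊆ R be some bounded interval. If $(f_m)_{m \in \mathbb{N}}$ is a sequence of functions $f_m : I \to \mathbb{K}^n$
such that $\{f_m : m \in \mathbb{N}\}$ is uniformly equicontinuous and such that, for each x ∈ I, the
sequence $(f_m(x))_{m \in \mathbb{N}}$ is bounded, then $(f_m)_{m \in \mathbb{N}}$ has a uniformly convergent subsequence
$(f_{m_j})_{j \in \mathbb{N}}$, i.e. there exists $f : I \to \mathbb{K}^n$ such that
\[ \underset{\epsilon > 0}{\forall} \quad \underset{N \in \mathbb{N}}{\exists} \quad \underset{j \geq N,\; x \in I}{\forall} \quad \| f_{m_j}(x) - f(x) \| < \epsilon. \]
In particular, the limit function f is continuous.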

Proof. Let $(r_1, r_2, \dots)$ be an enumeration of the set of rational numbers in I, i.e. of
Q ∩ I. Inductively, we construct a sequence $(F_m)_{m \in \mathbb{N}}$ of subsequences of $(f_m)_{m \in \mathbb{N}}$,
$F_m = (f_{m,k})_{k \in \mathbb{N}}$, such that

(i) for each m ∈ N, $F_m$ is a subsequence of each $F_j$ with j ∈ {1, . . . , m},

(ii) for each m ∈ N, $F_m$ converges pointwise at each of the first m rational numbers
$r_j$; more precisely, there exists a sequence $(z_1, z_2, \dots)$ in $\mathbb{K}^n$ such that, for each
m ∈ N and each j ∈ {1, . . . , m}:
\[ \lim_{k \to \infty} f_{m,k}(r_j) = z_j. \]

Actually, we construct the (zm )m∈N inductively together with the (Fm )m∈N : Since the
sequence (fm (r1 ))m∈N is, by hypothesis, a bounded sequence in Kn , one can apply the
Bolzano-Weierstrass theorem (cf. [Phi15, Th. 1.16(b)]) to obtain z1 ∈ Kn and a sub-
sequence F1 = (f1,k )k∈N of (fm )m∈N such that limk→∞ f1,k (r1 ) = z1 . To proceed by
induction, we now assume to have already constructed F1 , . . . , FM and z1 , . . . , zM for
M ∈ N such that (i) and (ii) hold for each m ∈ {1, . . . , M }. Since the sequence
(fM,k (rM +1 ))k∈N is a bounded sequence in Kn , one can, once more, apply the Bolzano-
Weierstrass theorem to obtain zM +1 ∈ Kn and a subsequence FM +1 = (fM +1,k )k∈N of
FM such that limk→∞ fM +1,k (rM +1 ) = zM +1 . Since FM +1 is a subsequence of FM , it is
also a subsequence of all previous subsequences, i.e. (i) now also holds for m = M + 1.
In consequence, limk→∞ fM +1,k (rj ) = zj for each j = 1, . . . , M + 1, such that (ii) now
also holds for m = M + 1 as required.
Next, one considers the diagonal sequence (gm )m∈N , gm := fm,m , and observes that this
sequence converges pointwise at each rational number rj (limm→∞ gm (rj ) = zj ), since,
at least for m ≥ j, (gm )m∈N is a subsequence of every Fj (exercise) – in particular,
(gm )m∈N is also a subsequence of the original sequence (fm )m∈N .

In the last step of the proof, we show that $(g_m)_{m \in \mathbb{N}}$ converges uniformly on the entire
interval I to some $f : I \to \mathbb{K}^n$. To this end, fix ε > 0. Since $\{g_m : m \in \mathbb{N}\} \subseteq \{f_m : m \in \mathbb{N}\}$,
the assumed uniform equicontinuity of $\{f_m : m \in \mathbb{N}\}$ yields δ > 0 such that
\[ \underset{x, \xi \in I}{\forall} \quad \Big( |x - \xi| < \delta \;\Rightarrow\; \underset{m \in \mathbb{N}}{\forall} \; \| g_m(x) - g_m(\xi) \| < \frac{\epsilon}{3} \Big). \]
Since I is bounded, it has finite length and, thus, it can be covered with finitely many
intervals $I_1, \dots, I_N$, $I = \bigcup_{j=1}^{N} I_j$, N ∈ N, such that each $I_j$ has length less than δ.
Moreover, since Q is dense in R, for each j ∈ {1, . . . , N}, there exists k(j) ∈ N such
that $r_{k(j)} \in I_j$. Define M := max{k(j) : j = 1, . . . , N}. We note that each of the
finitely many sequences $(g_m(r_1))_{m \in \mathbb{N}}, \dots, (g_m(r_M))_{m \in \mathbb{N}}$ is a Cauchy sequence. Thus,
\[ \underset{K \in \mathbb{N}}{\exists} \quad \underset{k, l \geq K}{\forall} \quad \underset{\alpha = 1, \dots, M}{\forall} \quad \| g_k(r_\alpha) - g_l(r_\alpha) \| < \frac{\epsilon}{3}. \tag{3.13} \]
We now consider an arbitrary x ∈ I and k, l ≥ K. Let j ∈ {1, . . . , N} such that $x \in I_j$.
Then $r_{k(j)} \in I_j$, $|r_{k(j)} - x| < \delta$, and the estimate in (3.13) holds for α = k(j). In
consequence, we obtain the crucial estimate
\begin{align*}
\underset{k, l \geq K}{\forall} \quad \| g_k(x) - g_l(x) \| &\leq \| g_k(x) - g_k(r_{k(j)}) \| + \| g_k(r_{k(j)}) - g_l(r_{k(j)}) \| + \| g_l(r_{k(j)}) - g_l(x) \| \\
&< \frac{\epsilon}{3} + \frac{\epsilon}{3} + \frac{\epsilon}{3} = \epsilon. \tag{3.14}
\end{align*}

The estimate (3.14) shows $(g_m(x))_{m \in \mathbb{N}}$ is a Cauchy sequence for each x ∈ I, and we
can define
\[ f : I \to \mathbb{K}^n, \quad f(x) := \lim_{m \to \infty} g_m(x). \tag{3.15} \]
Since K in (3.14) does not depend on x ∈ I, passing to the limit k → ∞ in the estimate
of (3.14) implies
\[ \underset{l \geq K,\; x \in I}{\forall} \quad \| g_l(x) - f(x) \| \leq \epsilon, \]
proving uniform convergence of the subsequence $(g_m)_{m \in \mathbb{N}}$ of $(f_m)_{m \in \mathbb{N}}$ as desired. The
continuity of f is now a consequence of Th. 3.5. ∎

At this point, we have all preparations in place to state and prove the existence theorem.
Theorem 3.8 (Peano). If $G \subseteq \mathbb{R} \times \mathbb{K}^n$ is open, n ∈ N, and $f : G \to \mathbb{K}^n$ is continuous,
then, for each (x0, y0) ∈ G, the explicit n-dimensional first-order initial value problem
\begin{align*}
y' &= f(x, y), \tag{3.16a} \\
y(x_0) &= y_0, \tag{3.16b}
\end{align*}
has at least one solution. More precisely, given an arbitrary norm ‖·‖ on $\mathbb{K}^n$, (3.16)
has a solution $\phi : I \to \mathbb{K}^n$, defined on the open interval
\[ I := \,]x_0 - \alpha, x_0 + \alpha[, \tag{3.17} \]



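α = α(b) > 0, where b > 0 is such that
\begin{align*}
B &:= \big\{ (x, y) \in \mathbb{R} \times \mathbb{K}^n : |x - x_0| \leq b \text{ and } \|y - y_0\| \leq b \big\} \subseteq G, \tag{3.18} \\
M &:= M(b) := \max\{ \|f(x, y)\| : (x, y) \in B \} < \infty, \tag{3.19}
\end{align*}
and
\[ \alpha := \alpha(b) := \begin{cases} \min\{b, b/M\} & \text{for } M > 0, \\ b & \text{for } M = 0. \end{cases} \tag{3.20} \]
In general, the choice of the norm ‖·‖ on $\mathbb{K}^n$ will influence the possible sizes of α and,
thus, of I.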
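
To get a feeling for the size of the interval guaranteed by (3.17) to (3.20), consider the (hypothetical) example f(x, y) := 1 + y² on G = R², with (x0, y0) = (0, 0) and the absolute value as norm. On the box B of (3.18), one has M(b) = 1 + b², so α(b) = b/(1 + b²). The sketch below evaluates (3.20) for a few b; the function names are assumptions of this illustration, and M is supplied as a known formula rather than computed.

```python
def peano_alpha(max_norm_f_on_box, b):
    """alpha(b) according to (3.20), given a function b -> M(b) as in (3.19)."""
    M = max_norm_f_on_box(b)
    return b if M == 0 else min(b, b / M)


def M_example(b):
    """M(b) for f(x, y) = 1 + y**2 and (x0, y0) = (0, 0): the maximum of
    |1 + y**2| over |y| <= b is attained at |y| = b."""
    return 1.0 + b * b


alphas = {b: peano_alpha(M_example, b) for b in (0.5, 1.0, 2.0)}
print(alphas)
```

In this example, b = 1 gives the largest value α = 1/2, whereas the solution x ↦ tan x of y′ = 1 + y², y(0) = 0, exists on ]−π/2, π/2[. The interval provided by the theorem can thus be considerably shorter than the interval on which a solution actually exists.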

Proof. The proof will be conducted in several steps. In the first step, we check α =
α(b) > 0 is well-defined: Since G is open, there always exists b > 0 such that (3.18)
holds. Since B is a closed and bounded subset of the finite-dimensional space R × Kn ,
B is compact (cf. [Phi15, Cor. 3.5]). Since f and, thus, kf k is continuous (every norm
is even Lipschitz continuous due to the inverse triangle inequality), it must assume its
maximum on the compact set B (cf. [Phi15, Th. 3.8]), showing $M \in \mathbb{R}_0^+$ is well-defined
by (3.19) and α is well-defined by (3.20).
In the second step of the proof, we note that it suffices to prove (3.16) has a solution φ+ ,
defined on [x0 , x0 + α[: One can then apply the time reversion Lem. 1.9(b): The proof
providing the solution φ+ also provides a solution ψ+ : [−x0 , −x0 + α[−→ Kn to the
time-reversed initial value problem, consisting of y ′ = −f (−x, y) and y(−x0 ) = y0 (note
that the same M and α work for the time-reversed problem). Then, according to Lem.
1.9(b), φ− : ]x0 − α, x0 ] −→ Kn , φ− (x) := ψ+ (−x), is a solution to (3.16). According to
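Lem. 1.7, we can patch $\phi_-$ and $\phi_+$ together to obtain the desired solution
\[ \phi : I \to \mathbb{K}^n, \quad \phi(x) := \begin{cases} \phi_-(x) & \text{for } x \leq x_0, \\ \phi_+(x) & \text{for } x \geq x_0, \end{cases} \tag{3.21} \]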

defined on all of I. It is noted that one can also conduct the proof with the second step
omitted, but then one has to perform the following steps on all of I, which means one
has to consider additional cases in some places.
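In the third step of the proof, we will define a sequence $(\phi_m)_{m \in \mathbb{N}}$ of functions
\[ \phi_m : I_+ \to \mathbb{K}^n, \quad I_+ := [x_0, x_0 + \alpha], \tag{3.22} \]
that constitute approximate solutions to (3.16). To begin the construction of $\phi_m$, fix
m ∈ N. Since B is compact and f is continuous, we know f is even uniformly continuous
on B (cf. [Phi15, Th. 3.9]). In particular,
\[ \underset{\delta_m > 0}{\exists} \quad \underset{(x, y), (\tilde{x}, \tilde{y}) \in B}{\forall} \quad \Big( |x - \tilde{x}| < \delta_m,\ \|y - \tilde{y}\| < \delta_m \;\Rightarrow\; \| f(x, y) - f(\tilde{x}, \tilde{y}) \| < \frac{1}{m} \Big). \tag{3.23} \]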

We now form what is called a discretization of the interval $I_+$, i.e. a partition of $I_+$ into
sufficiently many small intervals: Let N ∈ N and
\[ x_0 < x_1 < \dots < x_{N-1} < x_N := x_0 + \alpha \tag{3.24} \]
such that
\[ \underset{j \in \{1, \dots, N\}}{\forall} \quad x_j - x_{j-1} < \beta := \begin{cases} \min\{\delta_m, \delta_m/M, 1/m\} & \text{for } M > 0, \\ \min\{\delta_m, 1/m\} & \text{for } M = 0 \end{cases} \tag{3.25} \]
(for example, one could make the equidistant choice $x_j := x_0 + jh$ with h = α/N and
N > α/β, but it does not matter how the $x_j$ are defined as long as (3.24) and (3.25)
both hold). Note that we get a different discretization of $I_+$ for each m ∈ N; however,
the dependence on m is suppressed in the notation for the sake of readability. We now
define recursively
\[
\begin{aligned}
& \phi_m : I_+ \to \mathbb{K}^n, \\
& \phi_m(x_0) := y_0, \\
& \phi_m(x) := \phi_m(x_j) + (x - x_j)\, f\big(x_j, \phi_m(x_j)\big) \quad \text{for each } x \in [x_j, x_{j+1}].
\end{aligned}
\tag{3.26}
\]
Note that there is no conflict between the two definitions given for x = xj with j ∈
{1, . . . , N − 1}. Each function φm defines a polygon in Kn . This construction is known
as Euler’s method and it can be used to obtain numerical approximations to the solution
of the initial value problem (while simple, this method is not very efficient, though). We
still need to verify that the definition (3.26) does actually make sense: We need to check
that f can, indeed, be applied to (xj , φm (xj )), i.e. we have to check (xj , φm (xj )) ∈ G.
We can actually show the stronger statement
\[ \underset{x \in I_+}{\forall} \quad \big(x, \phi_m(x)\big) \in B, \tag{3.27} \]
where B is as defined in (3.18). First, it is pointed out that (3.20) implies α ≤ b, such
that $x \in I_+$ implies $|x - x_0| \leq \alpha \leq b$ as required in (3.18). One can now prove (3.27)
by showing by induction on j ∈ {0, . . . , N − 1}:
\[ \underset{x \in [x_j, x_{j+1}]}{\forall} \quad \big(x, \phi_m(x)\big) \in B. \tag{3.28} \]

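To start the induction, note $\phi_m(x_0) = y_0$ and (x0, y0) ∈ B by (3.18). Now let
j ∈ {0, . . . , N − 1} and $x \in [x_j, x_{j+1}]$. We estimate
\begin{align*}
\| \phi_m(x) - y_0 \| &\leq \| \phi_m(x) - \phi_m(x_j) \| + \sum_{k=1}^{j} \| \phi_m(x_k) - \phi_m(x_{k-1}) \| \\
&\overset{(3.26)}{=} \big\| (x - x_j)\, f\big(x_j, \phi_m(x_j)\big) \big\| + \sum_{k=1}^{j} \big\| (x_k - x_{k-1})\, f\big(x_{k-1}, \phi_m(x_{k-1})\big) \big\| \\
&\overset{(*)}{\leq} (x - x_j)\, M + \sum_{k=1}^{j} (x_k - x_{k-1})\, M = (x - x_0)\, M \\
&\leq \alpha M \overset{(3.20)}{\leq} b, \tag{3.29}
\end{align*}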

where, at (∗), it was used that $(x_k, \phi_m(x_k)) \in B$ by induction hypothesis for each
k = 0, . . . , j, and, thus, $\| f(x_k, \phi_m(x_k)) \| \leq M$ by (3.19). Estimate (3.29) completes the
induction and the third step of the proof.
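
The recursive definition (3.26) translates almost literally into code. The following sketch constructs the Euler polygon for the scalar case with an equidistant grid and applies it to the (hypothetical) example f(x, y) := 1 + y², (x0, y0) = (0, 0), b = 1, for which M = 2 and α = 1/2. The function names are assumptions of this illustration, and the grid size N is simply chosen by hand instead of being derived from δ_m as in (3.25).

```python
import math


def euler_polygon(f, x0, y0, alpha, N):
    """Nodes (x_j, phi(x_j)) of the Euler polygon (3.26) on [x0, x0 + alpha]
    for the equidistant grid x_j = x0 + j * alpha / N (scalar case)."""
    h = alpha / N
    xs, ys = [x0], [y0]
    for j in range(N):
        ys.append(ys[-1] + h * f(xs[-1], ys[-1]))
        xs.append(x0 + (j + 1) * h)
    return xs, ys


def evaluate(xs, ys, x):
    """Value of the polygon at x; between two nodes, (3.26) is a straight line."""
    for j in range(len(xs) - 1):
        if xs[j] <= x <= xs[j + 1]:
            slope = (ys[j + 1] - ys[j]) / (xs[j + 1] - xs[j])
            return ys[j] + (x - xs[j]) * slope
    raise ValueError("x lies outside the interval covered by the polygon")


def f(x, y):
    return 1.0 + y * y


b, M, alpha = 1.0, 2.0, 0.5  # M = max of |f| on B, alpha = min(b, b / M)
errors = {}
for N in (10, 100, 1000):
    xs, ys = euler_polygon(f, 0.0, 0.0, alpha, N)
    # (3.27): the polygon must stay inside the box B.
    assert all(abs(y - 0.0) <= b for y in ys)
    errors[N] = abs(ys[-1] - math.tan(alpha))
print(errors)
```

For this example, the exact solution is x ↦ tan x, and the printed errors at the right end point of the interval shrink as the grid is refined.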
In the fourth step of the proof, we establish several properties of the functions $\phi_m$. The
first two properties are immediate from (3.26), namely that $\phi_m$ is continuous on $I_+$ and
differentiable at each $x \in \,]x_j, x_{j+1}[$, j ∈ {0, . . . , N − 1}, where
\[ \underset{j \in \{0, \dots, N-1\}}{\forall} \quad \underset{x \in ]x_j, x_{j+1}[}{\forall} \quad \phi_m'(x) = f\big(x_j, \phi_m(x_j)\big). \tag{3.30} \]

The next property to establish is
\[ \underset{s, t \in I_+}{\forall} \quad \| \phi_m(t) - \phi_m(s) \| \leq |t - s|\, M. \tag{3.31} \]

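To prove (3.31), we may assume s < t without loss of generality. If $s, t \in [x_j, x_{j+1}]$,
j ∈ {0, . . . , N − 1}, then
\begin{align*}
\| \phi_m(t) - \phi_m(s) \| &\overset{(3.26)}{=} \big\| \phi_m(x_j) + (t - x_j)\, f\big(x_j, \phi_m(x_j)\big) - \phi_m(x_j) - (s - x_j)\, f\big(x_j, \phi_m(x_j)\big) \big\| \\
&= |t - s|\, \big\| f\big(x_j, \phi_m(x_j)\big) \big\| \overset{(3.19)}{\leq} |t - s|\, M \tag{3.32a}
\end{align*}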

as desired. If s, t are not contained in the same interval $[x_j, x_{j+1}]$, then fix j < k such
that $s \in [x_j, x_{j+1}]$ and $t \in [x_k, x_{k+1}]$. Then (3.31) follows from an estimate analogous to
the one in (3.29):
\begin{align*}
\| \phi_m(t) - \phi_m(s) \| &\leq \| \phi_m(s) - \phi_m(x_{j+1}) \| + \sum_{l=j+1}^{k-1} \| \phi_m(x_l) - \phi_m(x_{l+1}) \| + \| \phi_m(x_k) - \phi_m(t) \| \\
&\overset{(3.32a)}{\leq} |s - x_{j+1}|\, M + \sum_{l=j+1}^{k-1} |x_l - x_{l+1}|\, M + |t - x_k|\, M \\
&= |t - s|\, M, \tag{3.32b}
\end{align*}

completing the proof of (3.31). The following property of the $\phi_m$ is the justification for
calling them approximate solutions to our initial value problem (3.16):
\[ \underset{j \in \{0, \dots, N-1\}}{\forall} \quad \underset{x \in ]x_j, x_{j+1}[}{\forall} \quad \big\| \phi_m'(x) - f\big(x, \phi_m(x)\big) \big\| < \frac{1}{m}. \tag{3.33} \]

Indeed, (3.33) is a consequence of (3.23), i.e. of the uniform continuity of f on B: First,
if M = 0, then $f \equiv \phi_m' \equiv 0$ and there is nothing to prove. So let M > 0. If $x \in \,]x_j, x_{j+1}[$,
then, according to (3.30), we have $\phi_m'(x) = f\big(x_j, \phi_m(x_j)\big)$. Thus, by (3.25),
\[ |x - x_j| < \beta \leq \min\{\delta_m, \delta_m/M\} \;\Rightarrow\; \| \phi_m(x) - \phi_m(x_j) \| \overset{(3.31)}{\leq} |x - x_j|\, M < \delta_m, \tag{3.34a} \]

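and
\[ \big\| \phi_m'(x) - f\big(x, \phi_m(x)\big) \big\| = \big\| f\big(x_j, \phi_m(x_j)\big) - f\big(x, \phi_m(x)\big) \big\| \overset{(3.34a),(3.23)}{<} \frac{1}{m}, \tag{3.34b} \]
proving (3.33).
The last property of the $\phi_m$ we need is
\begin{align*}
\underset{x \in I_+}{\forall} \quad \| \phi_m(x) \| &\leq \| \phi_m(x) - \phi_m(x_0) \| + \| \phi_m(x_0) \| \overset{(3.31)}{\leq} |x - x_0|\, M + \| \phi_m(x_0) \| \\
&\leq \alpha M + \| y_0 \|, \tag{3.35}
\end{align*}
which says that the $\phi_m$ are pointwise and even uniformly bounded.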
In the fifth and last step of the proof, we use the Arzelà-Ascoli Th. 3.7 to obtain a
function $\phi_+ : I_+ \to \mathbb{K}^n$, and we show that $\phi_+$ constitutes a solution to (3.16). According
to (3.31), the $\phi_m$ are uniformly equicontinuous (given ε > 0, condition (3.12) is satisfied
with δ := ε/M for M > 0 and arbitrary δ > 0 for M = 0), and according to (3.35)
the $\phi_m$ are bounded such that the Arzelà-Ascoli Th. 3.7 applies to yield a subsequence
$(\phi_{m_j})_{j \in \mathbb{N}}$ of $(\phi_m)_{m \in \mathbb{N}}$ converging uniformly to some continuous function $\phi_+ : I_+ \to \mathbb{K}^n$.
So it merely remains to verify that $\phi_+$ is a solution to (3.16).
As the uniform convergence of the $(\phi_{m_j})_{j \in \mathbb{N}}$ implies pointwise convergence, we have
$\phi_+(x_0) = \lim_{j \to \infty} \phi_{m_j}(x_0) = y_0$, showing $\phi_+$ satisfies the initial condition (3.16b).
Next,
\[ \underset{x \in I_+}{\forall} \quad \big(x, \phi_+(x)\big) = \lim_{j \to \infty} \big(x, \phi_{m_j}(x)\big) \in B, \]

since each $(x, \phi_{m_j}(x))$ is in B and B is closed. In particular, $f(x, \phi_+(x))$ is well-defined
for each $x \in I_+$. To prove that $\phi_+$ also satisfies the ODE (3.16a), by Th. 1.5, it suffices
to show
\[ \underset{x \in I_+}{\forall} \quad \phi_+(x) - \phi_+(x_0) - \int_{x_0}^{x} f\big(t, \phi_+(t)\big)\, dt = 0 \tag{3.36} \]

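(in particular, (3.36) implies $\phi_+$ to be differentiable). Fixing $x \in I_+$ and using the
triangle inequality for the umpteenth time, one obtains
\begin{align*}
& \Big\| \phi_+(x) - \phi_+(x_0) - \int_{x_0}^{x} f\big(t, \phi_+(t)\big)\, dt \Big\| \\
&\leq \| \phi_+(x) - \phi_{m_j}(x) \| + \Big\| \phi_{m_j}(x) - \phi_+(x_0) - \int_{x_0}^{x} f\big(t, \phi_{m_j}(t)\big)\, dt \Big\| \\
&\quad + \Big\| \int_{x_0}^{x} \Big( f\big(t, \phi_{m_j}(t)\big) - f\big(t, \phi_+(t)\big) \Big)\, dt \Big\|, \tag{3.37}
\end{align*}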

holding for every j ∈ N. We will conclude the proof by showing that all three summands
on the right-hand side of (3.37) tend to 0 for j → ∞. As already mentioned above,
the uniform convergence of the $(\phi_{m_j})_{j \in \mathbb{N}}$ implies pointwise convergence, implying the
convergence of the first summand. We tackle the third summand next, using
\[ \Big\| \int_{x_0}^{x} \Big( f\big(t, \phi_{m_j}(t)\big) - f\big(t, \phi_+(t)\big) \Big)\, dt \Big\| \leq \int_{x_0}^{x} \big\| f\big(t, \phi_{m_j}(t)\big) - f\big(t, \phi_+(t)\big) \big\|\, dt, \tag{3.38} \]

which holds for every norm (cf. Appendix B), but can easily be checked directly for
the 1-norm, where $\|(z_1, \dots, z_n)\|_1 := \sum_{j=1}^{n} |z_j|$ (exercise). Given ε > 0, the uniform
continuity of f on B provides δ > 0 such that $\| f(t, \phi_{m_j}(t)) - f(t, \phi_+(t)) \| < \epsilon/\alpha$ for
$\| \phi_{m_j}(t) - \phi_+(t) \| < \delta$. The uniform convergence of $(\phi_{m_j})_{j \in \mathbb{N}}$ then yields K ∈ N such
that $\| \phi_{m_j}(t) - \phi_+(t) \| < \delta$ for every j ≥ K and each $t \in I_+$. Thus,
\[ \underset{j \geq K}{\forall} \quad \int_{x_0}^{x} \big\| f\big(t, \phi_{m_j}(t)\big) - f\big(t, \phi_+(t)\big) \big\|\, dt \leq \frac{|x - x_0|\, \epsilon}{\alpha} \leq \epsilon, \]

thereby establishing the convergence of the third summand from the right-hand side
of (3.37). For the remaining second summand, we note that the fact that each $\phi_m$ is
continuous and piecewise differentiable (with piecewise constant derivative) allows one to
apply the fundamental theorem of calculus in the form [Phi16, Th. 10.20(b)] to obtain
\[ \underset{x \in I_+}{\forall} \quad \phi_m(x) = \phi_m(x_0) + \int_{x_0}^{x} \phi_m'(t)\, dt. \tag{3.39} \]

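Using (3.39) and $\phi_{m_j}(x_0) = y_0 = \phi_+(x_0)$ in the second summand of the right-hand side of (3.37) provides
\begin{align*}
\Big\| \phi_{m_j}(x) - \phi_+(x_0) - \int_{x_0}^{x} f\big(t, \phi_{m_j}(t)\big)\, dt \Big\| &\leq \int_{x_0}^{x} \big\| \phi_{m_j}'(t) - f\big(t, \phi_{m_j}(t)\big) \big\|\, dt \\
&\overset{(3.33)}{\leq} \int_{x_0}^{x} \frac{1}{m_j}\, dt \leq \frac{\alpha}{m_j},
\end{align*}
showing the convergence of the second summand, which finally concludes the proof. ∎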

Corollary 3.9. If $G \subseteq \mathbb{R} \times \mathbb{K}^n$ is open, n ∈ N, $f : G \to \mathbb{K}^n$ is continuous, and C ⊆ G
is compact, then there exists α > 0, such that, for each (x0, y0) ∈ C, the explicit
n-dimensional first-order initial value problem (3.16) has a solution $\phi : I \to \mathbb{K}^n$, defined
on the open interval I := ]x0 − α, x0 + α[, i.e. always on an interval of the same length
2α.

Proof. Exercise. ∎

Corollary 3.10. If $G \subseteq \mathbb{R} \times \mathbb{K}^{kn}$ is open, k, n ∈ N, and $f : G \to \mathbb{K}^n$ is continuous,
then, for each $(x_0, y_{0,0}, \dots, y_{0,k-1}) \in G$, the explicit n-dimensional kth-order initial value
problem consisting of (1.6) and (1.7), which, for convenience, we rewrite
\begin{align*}
& y^{(k)} = f\big(x, y, y', \dots, y^{(k-1)}\big), \tag{3.40a} \\
& \underset{j \in \{0, \dots, k-1\}}{\forall} \quad y^{(j)}(x_0) = y_{0,j}, \tag{3.40b}
\end{align*}
has at least one solution. More precisely, there exists an open interval I ⊆ R with
x0 ∈ I and $\phi : I \to \mathbb{K}^n$ such that φ is a solution to (3.40). If C ⊆ G is compact,
then there exists α > 0 such that, for each $(x_0, y_{0,0}, \dots, y_{0,k-1}) \in C$, (3.40) has a solution
$\phi : I \to \mathbb{K}^n$, defined on the open interval I := ]x0 − α, x0 + α[, i.e. always on an interval
of the same length 2α.

Proof. If f is continuous, then the right-hand side of the equivalent first-order system
(3.2a) (written in explicit form) is given by the continuous function
\[
  \tilde f : G \longrightarrow \mathbb K^{kn}, \qquad
  \tilde f(x, y_1, \dots, y_k) :=
  \begin{pmatrix} y_2 \\ y_3 \\ \vdots \\ y_k \\ f(x, y_1, \dots, y_k) \end{pmatrix}. \tag{3.41}
\]
Thus, Th. 3.8 provides a solution Φ : I −→ Kkn to (3.2) and, then, Th. 3.1(b) yields
φ := Φ1 to be a solution to (3.40). Moreover, if C ⊆ G is compact, then Cor. 3.9 provides
α > 0 such that, for each (x0 , y0,0 , . . . , y0,k−1 ) ∈ C, (3.2) has a solution Φ : I −→ Kkn ,
defined on the same open interval I :=]x0 − α, x0 + α[. In particular, φ := Φ1 , the
corresponding solution to (3.40) is also defined on the same I. 
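
The reduction used in this proof can be tried out numerically. The following Python sketch (for n = 1) builds the right-hand side of the first-order system from a given f in the manner of (3.41) and applies explicit Euler steps to the test problem y'' = −y, y(0) = 0, y'(0) = 1, whose solution is sin. The test problem, the step count, and the helper names `make_first_order_rhs` and `euler` are illustrative assumptions, not notation from these notes.

```python
import math

def make_first_order_rhs(f, k):
    """Return f_tilde(x, Y) = (Y[1], ..., Y[k-1], f(x, Y[0], ..., Y[k-1])) for scalar y."""
    def f_tilde(x, Y):
        assert len(Y) == k
        return list(Y[1:]) + [f(x, *Y)]
    return f_tilde

def euler(rhs, x0, Y0, x_end, steps):
    """Explicit Euler method for Y' = rhs(x, Y), Y(x0) = Y0; returns the value at x_end."""
    h = (x_end - x0) / steps
    x, Y = x0, list(Y0)
    for _ in range(steps):
        dY = rhs(x, Y)
        Y = [y + h * d for y, d in zip(Y, dY)]
        x += h
    return Y

# Second-order test problem y'' = -y, written as y'' = f(x, y, y').
f = lambda x, y, yp: -y
f_tilde = make_first_order_rhs(f, k=2)

Y_end = euler(f_tilde, 0.0, [0.0, 1.0], 1.0, 10000)
phi_end = Y_end[0]  # first component of the system solution approximates phi(1)
print(phi_end, math.sin(1.0))
```

Only the first component of the system solution is read off at the end, mirroring the step φ := Φ1 in the proof.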

While the Peano theorem is striking in its generality, it does have several drawbacks:
(a) the interval, where the existence of a solution is proved can be unnecessarily short;
(b) the selection of the subsequence using the Arzelà-Ascoli theorem makes the proof
nonconstructive; (c) uniqueness of solutions is not provided, even in cases, where unique
solutions exist; (d) it does not provide information regarding how the solution changes
with a change of the initial condition. We will subsequently address all these points,
namely (b) and (c) in Sec. 3.3 (we will see that the proof of the Peano theorem becomes
constructive in situations, where the solution is unique – in general, a constructive proof
is not available), (a) in Sec. 3.4, and (d) in Sec. 3.5.

3.3 Uniqueness of Solutions


Example 1.4(b) shows that the hypotheses of the Peano Th. 3.8 are not strong enough
to guarantee the initial value problem (3.16) has a unique solution, not even in some
neighborhood of x0 . The additional condition that will yield uniqueness is local Lipschitz
continuity of f with respect to y.
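
A standard illustration of non-uniqueness under mere continuity (chosen here as an assumption; it need not coincide with Example 1.4(b)) is the initial value problem y' = √|y|, y(0) = 0, whose right-hand side is continuous, but not locally Lipschitz with respect to y near y = 0. The following Python sketch checks on a grid in [0, 2] that both φ(x) = 0 and ψ(x) = x²/4 satisfy the equation and share the initial value; the function names are illustrative.

```python
import math

def f(x, y):
    """Right-hand side y' = sqrt(|y|): continuous, not Lipschitz in y near y = 0."""
    return math.sqrt(abs(y))

def phi(x):
    return 0.0            # trivial solution

def psi(x):
    return x * x / 4.0    # second solution for x >= 0

def dpsi(x):
    return x / 2.0        # exact derivative of psi

xs = [i / 10 for i in range(0, 21)]
residual_phi = max(abs(0.0 - f(x, phi(x))) for x in xs)
residual_psi = max(abs(dpsi(x) - f(x, psi(x))) for x in xs)
print(residual_phi, residual_psi, phi(0.0) == psi(0.0))
```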
Definition 3.11. Let m, n ∈ N, G ⊆ R × Km , and f : G −→ Kn .

(a) The function f is called (globally) Lipschitz continuous or just (globally) Lipschitz
with respect to y if, and only if,

\[
  \underset{L\ge 0}{\exists}\ \ \underset{(x,y),(x,\bar y)\in G}{\forall}\quad
  \| f(x,y) - f(x,\bar y) \| \le L\,\|y-\bar y\|. \tag{3.42}
\]

(b) The function f is called locally Lipschitz continuous or just locally Lipschitz with
respect to y if, and only if, for each (x0 , y0 ) ∈ G, there exists a (relative) open set
U ⊆ G such that (x0 , y0 ) ∈ U (i.e. U is a (relative) open neighborhood of (x0 , y0 ))
and f is Lipschitz continuous with respect to y on U , i.e. if, and only if,

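\[
  \underset{(x_0,y_0)\in G}{\forall}\ \ \underset{(x_0,y_0)\in U\subseteq G\ \text{open}}{\exists}\ \ \underset{L\ge 0}{\exists}\ \ \underset{(x,y),(x,\bar y)\in U}{\forall}\quad
  \| f(x,y) - f(x,\bar y) \| \le L\,\|y-\bar y\|. \tag{3.43}
\]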

The number L occurring in (a),(b) is called Lipschitz constant. The norms on Km and
Kn in (a),(b) are arbitrary. If one changes the norms, then one will, in general, change
L, but not the property of f being (locally) Lipschitz.

Caveat 3.12. It is emphasized that f : G −→ Kn , (x, y) ↦ f (x, y), being Lipschitz
with respect to y does not imply f to be continuous: Indeed, if I ⊆ R, ∅ ≠ A ⊆ Km ,
and g : I −→ Kn is an arbitrary discontinuous function, then f : I × A −→ Kn ,
f (x, y) := g(x) is not continuous, but satisfies (3.42) with L = 0.

While the local neighborhoods U , where a function locally Lipschitz (with respect to y)
is actually Lipschitz continuous (with respect to y) can be very small, we will now show
that a continuous function is locally Lipschitz (with respect to y) on G if, and only if,
it is Lipschitz continuous (with respect to y) on every compact set K ⊆ G.

Proposition 3.13. Let m, n ∈ N, G ⊆ R × Km , and f : G −→ Kn be continuous.


Then f is locally Lipschitz with respect to y if, and only if, f is (globally) Lipschitz with
respect to y on every compact subset K of G.

Proof. First, assume f is not locally Lipschitz with respect to y. Then there exists
(x0 , y0 ) ∈ G such that

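\[
  \underset{N\in\mathbb N}{\forall}\ \ \underset{(x_N,y_{N,1}),(x_N,y_{N,2})\,\in\, G\cap B_{1/N}(x_0,y_0)}{\exists}\quad
  \| f(x_N,y_{N,1}) - f(x_N,y_{N,2}) \| > N\,\|y_{N,1} - y_{N,2}\|. \tag{3.44}
\]

The set
\[
  K := \{(x_0,y_0)\} \cup \big\{ (x_N,y_{N,j}) : N\in\mathbb N,\ j\in\{1,2\} \big\}
\]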
is clearly a compact subset of G (e.g. by the Heine-Borel property of compact sets (see
Th. C.19), since every open set containing (x0 , y0 ) must contain all, but finitely many,
of the elements of K). Due to (3.44), f is not (globally) Lipschitz with respect to y on
the compact set K (so, actually, continuity of f was not used for this direction).
Conversely, assume f to be locally Lipschitz with respect to y, and consider a compact
subset K of G. Then, for each (x, y) ∈ K, there is some (relatively) open U(x,y) ⊆ G
with (x, y) ∈ U(x,y) and such that f is Lipschitz with respect to y in U(x,y) . By the
Heine-Borel property of compact sets (see Th. C.19), there are finitely many U1 :=
U(x1 ,y1 ) , . . . , UN := U(xN ,yN ) , N ∈ N, such that
\[
  K \subseteq \bigcup_{j=1}^{N} U_j. \tag{3.45}
\]

For each j = 1, . . . , N , let Lj denote the Lipschitz constant for f on Uj and set L′ :=
max{L1 , . . . , LN }. As f is assumed continuous and K is compact, we have

\[
  M := \max\{ \|f(x,y)\| : (x,y)\in K \} < \infty. \tag{3.46}
\]



Using the compactness of K once again, there exists a Lebesgue number δ > 0 for the
open cover (Uj )j∈{1,...,N } of K (cf. Th. C.21), i.e. δ > 0 such that
 
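\[
  \underset{(x,y),(x,\bar y)\in K}{\forall}\quad
  \Big( \|y-\bar y\| < \delta \ \Rightarrow\ \underset{j\in\{1,\dots,N\}}{\exists}\ \{(x,y),(x,\bar y)\} \subseteq U_j \Big). \tag{3.47}
\]

Define $L := \max\{L', 2M/\delta\}$. Then, for every $(x,y),(x,\bar y)\in K$:
\[
  \|y-\bar y\| < \delta \ \Rightarrow\ \|f(x,y)-f(x,\bar y)\| \le L_j\,\|y-\bar y\| \le L\,\|y-\bar y\|, \tag{3.48a}
\]
\[
  \|y-\bar y\| \ge \delta \ \Rightarrow\ \|f(x,y)-f(x,\bar y)\| \le 2M = \frac{2M\,\delta}{\delta} \le L\,\|y-\bar y\|, \tag{3.48b}
\]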
completing the proof that f is Lipschitz with respect to y on K. 
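
As a quick illustration of Prop. 3.13 (the function is chosen here only as an example), consider $f(x,y) := y^2$ on $G = \mathbb R^2$. On each compact set contained in $\{|y| \le R\}$, it is Lipschitz with respect to y with constant 2R, whereas no single constant works on all of G:

```latex
\[
  |f(x,y) - f(x,\bar y)| = |y+\bar y|\,|y-\bar y| \le 2R\,|y-\bar y|
  \quad\text{for } |y|,|\bar y| \le R,
\]
\[
  \frac{|f(x,N) - f(x,0)|}{|N-0|} = N \longrightarrow \infty
  \quad\text{for } N \to \infty .
\]
```

Thus, this f is locally Lipschitz with respect to y, but not globally Lipschitz with respect to y on G.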

While, in general, the assertion of Prop. 3.13 becomes false if the continuity of f is omitted,
for convex G, it does hold without the continuity assumption on f (see Appendix
D). The following Prop. 3.14 provides a useful sufficient condition for f : G −→ Kn ,
G ⊆ R × Km open, to be locally Lipschitz with respect to y:
Proposition 3.14. Let m, n ∈ N, let G ⊆ R × Km be open, and f : G −→ Kn . A
sufficient condition for f to be locally Lipschitz with respect to y is f being continuously
(real) differentiable with respect to y, i.e., f is locally Lipschitz with respect to y provided
that all partials $\partial_{y_k} f_l$; $k = 1,\dots,m$, $l = 1,\dots,n$ ($\partial_{y_{k,1}} f_l$, $\partial_{y_{k,2}} f_l$ for K = C) exist and are continuous.

Proof. We consider the case K = R; the case K = C is included by using the identifications
$\mathbb C^m \cong \mathbb R^{2m}$ and $\mathbb C^n \cong \mathbb R^{2n}$. Given $(x_0,y_0)\in G$, we have to show f is Lipschitz
with respect to y on some open set U ⊆ G with $(x_0,y_0)\in U$. As in the Peano Th. 3.8,
since G is open,
\[
  \underset{b>0}{\exists}\quad
  B := \big\{ (x,y)\in\mathbb R\times\mathbb R^m : |x-x_0|\le b \text{ and } \|y-y_0\|_1 \le b \big\} \subseteq G,
\]
where $\|\cdot\|_1$ denotes the 1-norm on $\mathbb R^m$. Since the $\partial_{y_k} f_l$, $(k,l)\in\{1,\dots,m\}\times\{1,\dots,n\}$,
are all continuous on the compact set B,
\[
  M := \max\big\{ |\partial_{y_k} f_l(x,y)| : (x,y)\in B,\ (k,l)\in\{1,\dots,m\}\times\{1,\dots,n\} \big\} < \infty. \tag{3.49}
\]
Applying the mean value theorem (cf. [Phi15, Th. 2.32]) to the n components of the
function
\[
  f_x : \{ y\in\mathbb R^m : (x,y)\in B \} \longrightarrow \mathbb R^n, \qquad f_x(y) := f(x,y),
\]
we obtain $\eta_1,\dots,\eta_n\in\mathbb R^m$ such that
\[
  f_l(x,y) - f_l(x,\bar y) = \sum_{k=1}^{m} \partial_{y_k} f_l(x,\eta_l)\,(y_k - \bar y_k), \tag{3.50}
\]

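and, thus,
\[
  \underset{(x,y),(x,\bar y)\in B}{\forall}\quad
  \begin{aligned}
  \| f(x,y) - f(x,\bar y) \|_1 &= \sum_{l=1}^{n} |f_l(x,y) - f_l(x,\bar y)| \\
  &\overset{(3.49),(3.50)}{\le} \sum_{l=1}^{n}\sum_{k=1}^{m} M\,|y_k-\bar y_k|
   = \sum_{l=1}^{n} M\,\|y-\bar y\|_1 = nM\,\|y-\bar y\|_1,
  \end{aligned} \tag{3.51}
\]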

i.e. f is Lipschitz with respect to y on B (where
\[
  \big\{ (x,y)\in\mathbb R\times\mathbb R^m : |x-x_0| < b \text{ and } \|y-y_0\|_1 < b \big\} \subseteq B
\]
is an open neighborhood of $(x_0,y_0)$), showing f is locally Lipschitz with respect to y.
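
The constant nM obtained in this proof can be checked on a concrete case. The following Python sketch uses the illustrative function f(x, y) = (sin(y1) + x·y2, y1·y2) with m = n = 2 on the set B centred at (0, 0) with b = 1; there, the four partials cos(y1), x, y2, y1 are bounded by M = 1, so the estimate predicts the Lipschitz constant nM = 2 in the 1-norm. The function, the sampling scheme, and all names are assumptions made for this sketch, and random sampling can only support the estimate, not prove it.

```python
import math
import random

def f(x, y):
    """Illustrative f : R x R^2 -> R^2 (an assumption, not taken from the notes)."""
    y1, y2 = y
    return (math.sin(y1) + x * y2, y1 * y2)

def norm1(v):
    return sum(abs(c) for c in v)

def sample_from_B(rng, b=1.0):
    """Draw (x, y) with |x| <= b and ||y||_1 <= b by rejection sampling."""
    while True:
        x = rng.uniform(-b, b)
        y = (rng.uniform(-b, b), rng.uniform(-b, b))
        if norm1(y) <= b:
            return x, y

# On B the partials cos(y1), x, y2, y1 are all bounded by 1 in absolute value.
M = 1.0
n = 2
lipschitz_bound = n * M

rng = random.Random(0)
worst_ratio = 0.0
for _ in range(10000):
    x, y = sample_from_B(rng)
    _, y_bar = sample_from_B(rng)  # same x, second point y_bar
    dist = norm1((y[0] - y_bar[0], y[1] - y_bar[1]))
    if dist > 0.0:
        fy, fy_bar = f(x, y), f(x, y_bar)
        diff = norm1((fy[0] - fy_bar[0], fy[1] - fy_bar[1]))
        worst_ratio = max(worst_ratio, diff / dist)

print(worst_ratio, "<=", lipschitz_bound)
```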


Theorem 3.15. If G ⊆ R × Kn is open, n ∈ N, and f : G −→ Kn is continuous and
locally Lipschitz with respect to y, then, for each (x0 , y0 ) ∈ G, the explicit n-dimensional
first-order initial value problem

y ′ = f (x, y), (3.52a)


y(x0 ) = y0 , (3.52b)

has a unique solution. More precisely, if I ⊆ R is an open interval and φ, ψ : I −→ Kn
are both solutions to (3.52a), then φ(x0 ) = ψ(x0 ) for one x0 ∈ I implies φ(x) = ψ(x)
for all x ∈ I:
\[
  \Big( \underset{x_0\in I}{\exists}\ \ \phi(x_0) = \psi(x_0) \Big)
  \ \Rightarrow\
  \Big( \underset{x\in I}{\forall}\ \ \phi(x) = \psi(x) \Big). \tag{3.53}
\]

Proof. We first show that φ and ψ must agree in a small neighborhood of x0 :

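\[
  \underset{\epsilon>0}{\exists}\ \ \underset{x\in\,]x_0-\epsilon,\,x_0+\epsilon[}{\forall}\quad \phi(x) = \psi(x). \tag{3.54}
\]

Since f is continuous and both φ and ψ are solutions to the initial value problem (3.52),
we can use Th. 1.5 to obtain
\[
  \underset{x\in I}{\forall}\quad
  \phi(x) - \psi(x) = \int_{x_0}^{x} \Big( f\big(t,\phi(t)\big) - f\big(t,\psi(t)\big) \Big)\,\mathrm{d}t. \tag{3.55}
\]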

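As f is locally Lipschitz with respect to y, there exists δ > 0 such that f is Lipschitz
with Lipschitz constant L ≥ 0 with respect to y on
\[
  U := \{ (x,y)\in G : |x-x_0| < \delta,\ \|y-y_0\| < \delta \},
\]
where we have chosen some arbitrary norm $\|\cdot\|$ on $\mathbb K^n$. The continuity of φ, ψ implies the
existence of $\tilde\epsilon > 0$ such that $\overline{B}_{\tilde\epsilon}(x_0) \subseteq I$, $\phi(B_{\tilde\epsilon}(x_0)) \subseteq B_\delta(y_0)$ and $\psi(B_{\tilde\epsilon}(x_0)) \subseteq B_\delta(y_0)$,
implying
\[
  \underset{x\in B_{\tilde\epsilon}(x_0)}{\forall}\quad
  \big\| f\big(x,\phi(x)\big) - f\big(x,\psi(x)\big) \big\| \le L\,\|\phi(x) - \psi(x)\|. \tag{3.56}
\]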

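Next, define
\[
  \epsilon := \min\{\tilde\epsilon,\ 1/(2L)\}
\]
(with $\epsilon := \tilde\epsilon$ in case $L = 0$) and, using the compactness of $\overline{B}_\epsilon(x_0) = [x_0-\epsilon, x_0+\epsilon]$ plus the continuity of φ, ψ,
\[
  M := \max\big\{ \|\phi(x) - \psi(x)\| : x\in\overline{B}_\epsilon(x_0) \big\} < \infty.
\]
From (3.55) and (3.56), we obtain
\[
  \underset{x\in B_\epsilon(x_0)}{\forall}\quad
  \|\phi(x) - \psi(x)\| \le L \left| \int_{x_0}^{x} \|\phi(t) - \psi(t)\|\,\mathrm{d}t \right| \le L\,|x-x_0|\,M \le \frac{M}{2} \tag{3.57}
\]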

(note that the integral in (3.57) can be negative for x < x0 ). The definition of M
together with (3.57) yields M ≤ M/2, i.e. M = 0, finishing the proof of (3.54).
To prove φ(x) = ψ(x) for each x ≥ x0 , let
s := sup{ξ ∈ I : φ(x) = ψ(x) for each x ∈ [x0 , ξ]}.
One needs to show s = sup I. If s = sup I does not hold, then there exists α > 0 such
that [s, s + α] ⊆ I. Then the continuity of φ, ψ implies φ(s) = ψ(s), i.e. φ and ψ satisfy
the same initial value problem at s such that (3.54) must hold with s instead of x0 , in
contradiction to the definition of s. Finally, φ(x) = ψ(x) for each x ≤ x0 follows in a
completely analogous fashion, which concludes the proof of the theorem.
Corollary 3.16. If G ⊆ R×Kkn is open, k, n ∈ N, and f : G −→ Kn is continuous and
locally Lipschitz with respect to y, then, for each (x0 , y0,0 , . . . , y0,k−1 ) ∈ G, the explicit
n-dimensional kth-order initial value problem consisting of (1.6) and (1.7), i.e.
\[
  y^{(k)} = f\big(x, y, y', \dots, y^{(k-1)}\big),
\]
\[
  \underset{j\in\{0,\dots,k-1\}}{\forall}\quad y^{(j)}(x_0) = y_{0,j},
\]
has a unique solution. More precisely, if I ⊆ R is an open interval and φ, ψ : I −→ Kn
are both solutions to (1.6), then
\[
  \underset{j\in\{0,\dots,k-1\}}{\forall}\quad \phi^{(j)}(x_0) = \psi^{(j)}(x_0) \tag{3.58}
\]

holding for one x0 ∈ I implies φ(x) = ψ(x) for all x ∈ I.

Proof. Exercise. 
Remark 3.17. According to Th. 3.15, the condition of f being continuous and locally
Lipschitz with respect to y is sufficient for each initial value problem (3.52) to have a
unique solution. However, this condition is not necessary: It is an exercise to show that
the continuous function
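\[
  f : \mathbb R^2 \longrightarrow \mathbb R, \qquad
  f(x,y) := \begin{cases} 1 & \text{for } y \le 0, \\ 1 + \sqrt{y} & \text{for } y \ge 0, \end{cases} \tag{3.59}
\]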
is not locally Lipschitz with respect to y, but that, for each (x0 , y0 ) ∈ R2 , the initial
value problem (3.52) still has a unique solution in the sense that (3.53) holds for each
solution φ to (3.52a). And one can (can you?) even find simple examples of f being
defined on an open domain such that f is discontinuous at every point in its domain
and every initial value problem (3.52) still has a unique solution.
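
The failure of the local Lipschitz condition for the function of (3.59) can be observed numerically. The following Python sketch evaluates the difference quotients |f(0, h) − f(0, 0)|/h for h = 10⁻¹, …, 10⁻⁸; they equal h^(−1/2) up to rounding and, hence, grow without bound as h ↓ 0, so that no Lipschitz constant can work on any neighborhood of a point (x0, 0). The choice of sample points is an assumption of this sketch, and the uniqueness part of the exercise is not addressed.

```python
import math

def f(x, y):
    """The function from (3.59): 1 for y <= 0 and 1 + sqrt(y) for y >= 0."""
    return 1.0 if y <= 0 else 1.0 + math.sqrt(y)

hs = [10.0 ** (-k) for k in range(1, 9)]
quotients = [abs(f(0.0, h) - f(0.0, 0.0)) / h for h in hs]

for h, q in zip(hs, quotients):
    print(h, q)
```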

At the end of Sec. 3.2, it was pointed out that the proof of the Peano Th. 3.8 is non-
constructive due to the selection of a subsequence. The following Th. 3.18 shows that,
whenever the initial value problem has a unique solution, it becomes unnecessary to
select a subsequence, and the construction procedure (namely Euler’s method) used in
the proof of Th. 3.8 becomes an effective (if not necessarily efficient) numerical approx-
imation procedure for the unique solution.

Theorem 3.18. Consider the situation of the Peano Th. 3.8. Under the additional
assumption that the solution to the explicit n-dimensional first-order initial value prob-
lem (3.16) is unique on some interval J ⊆ [x0 , x0 + α[, x0 ∈ J, and where α > 0 is
constructed as in Th. 3.8 (i.e. given by (3.18) – (3.20)), every sequence (φm )m∈N of func-
tions defined on J according to Euler’s method as in the proof of Th. 3.8 (i.e. defined
as in (3.26)) converges uniformly to the unique solution φ : J −→ Kn . An analogous
statement also holds for J ⊆]x0 − α, x0 ], x0 ∈ J.

Proof. Seeking a contradiction, assume $(\phi_m)_{m\in\mathbb N}$ does not converge uniformly to the
unique solution φ. Then there exist $\epsilon > 0$ and a subsequence $(\phi_{m_j})_{j\in\mathbb N}$ such that
\[
  \underset{j\in\mathbb N}{\forall}\quad
  \|\phi_{m_j} - \phi\|_{\sup} = \sup\big\{ \|\phi_{m_j}(x) - \phi(x)\| : x\in J \big\} \ge \epsilon. \tag{3.60}
\]
However, as a subsequence, $(\phi_{m_j})_{j\in\mathbb N}$ still has all the properties of the $(\phi_m)_{m\in\mathbb N}$ (namely
pointwise boundedness, uniform equicontinuity, piecewise differentiability, being approximate
solutions according to (3.33)) that guaranteed the existence of a subsequence
converging to a solution. Thus, since the solution is unique on J, $(\phi_{m_j})_{j\in\mathbb N}$ must, in turn,
have a subsequence converging uniformly to φ, which is in contradiction to (3.60). This
shows the assumption that $(\phi_m)_{m\in\mathbb N}$ does not converge uniformly to φ must have been
false. One obtains the proof of the analogous statement for J ⊆ ]x0 − α, x0 ], x0 ∈ J,
e.g., via time reversion (cf. the second step of the proof of Th. 3.8).
Remark 3.19. The argument used to prove Th. 3.18 is of a rather general nature: It
can be applied whenever a sequence is known to have a subsequence converging to some
solution of some equation (or some other problem), provided the same still holds for
every subsequence of the original sequence – in that case, the additional knowledge that
the solution is unique implies the convergence of the original sequence without the need
to select a subsequence.
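
The convergence of the full sequence of Euler approximations in the presence of uniqueness can be observed on a simple test problem. The following Python sketch applies the explicit Euler method with m equal steps to y' = y, y(0) = 1 on [0, 1], where the unique solution is exp, and records the maximal error at the grid points for m = 10, 100, 1000. The equidistant step size 1/m, the test problem, and the name `euler_polygon` are assumptions of this sketch; the construction in the proof of Th. 3.8 may differ in its details.

```python
import math

def euler_polygon(f, x0, y0, x_end, m):
    """Grid points and values of the explicit Euler method with m equal steps."""
    h = (x_end - x0) / m
    xs, ys = [x0], [y0]
    for _ in range(m):
        ys.append(ys[-1] + h * f(xs[-1], ys[-1]))
        xs.append(xs[-1] + h)
    return xs, ys

f = lambda x, y: y  # right-hand side of y' = y (globally Lipschitz in y)

errors = []
for m in (10, 100, 1000):
    xs, ys = euler_polygon(f, 0.0, 1.0, 1.0, m)
    errors.append(max(abs(y - math.exp(x)) for x, y in zip(xs, ys)))

print(errors)
```

In this run, the errors decrease as m grows, without any need to pass to a subsequence.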

3.4 Extension of Solutions, Maximal Solutions


The Peano Th. 3.8 and Cor. 3.10 show the existence of local solutions to explicit initial
value problems, i.e. the solution’s existence is proved on some, possibly small, interval
containing the initial point x0 . In the current section, we will address the question in
which circumstances such local solutions can be extended, we will prove the existence
of maximal solutions (solutions that can not be extended), and we will learn how such
maximal solutions can be identified.
Definition 3.20. Let φ : I −→ Kn , n ∈ N, be a solution to some ODE (such as (1.6)
or (1.4) in the most general case), defined on some open interval I ⊆ R.

(a) We say φ has an extension or continuation to the right (resp. to the left) if, and
only if, there exists a solution ψ : J −→ Kn to the same ODE, defined on some
open interval J ⊇ I such that ψ↾I = φ and

sup J > sup I (resp. inf J < inf I). (3.61)



An extension or continuation of φ is a function that is an extension to the right or
an extension to the left (or both).
(b) The solution φ is called a maximal solution if, and only if, it does not admit any
extensions in the sense of (a) (note that we require maximal solutions to be defined
on open intervals, cf. Appendix E).
Remark 3.21. As an immediate consequence of the time reversion Lem. 1.9(b), a
solution φ : I −→ Kn , n ∈ N, to (1.6), defined on some open interval I ⊆ R, has an
extension to the right (resp. to the left) if, and only if, ψ : (−I) −→ Kn , ψ(x) := φ(−x),
(solution to $y^{(k)} = (-1)^k f\big(-x, y, -y', \dots, (-1)^{k-1} y^{(k-1)}\big)$) has an extension to the left
(resp. to the right).

The existence of maximal solutions is not trivial – a priori it could be that every solution
had an extension (analogous to the fact that to every x ∈ [0, 1[ (or every x ∈ R) there
is some bigger element in [0, 1[ (respectively in R)).
Theorem 3.22. Every solution φ0 : I0 −→ Kn to (1.4) (resp. to (1.6)), defined on an
open interval I0 ⊆ R, can be extended to a maximal solution of (1.4) (resp. of (1.6)).

Proof. The proof is carried out for solutions to (1.4) (the implicit ODE) – the proof for
solutions to the explicit ODE (1.6) is analogous and can also be seen as a special case.
The idea is to apply Zorn’s lemma. To this end, define a partial order on the set
S := {(I0 , φ0 )} ∪ {(I, φ) : φ : I −→ Kn is solution to (1.4), extending φ0 } (3.62)
by letting
(I, φ) ≤ (J, ψ) :⇔ I ⊆ J, ψ↾I = φ. (3.63)
Every chain C, i.e. every totally ordered subset of S, has an upper bound, namely
$(I_C, \phi_C)$ with $I_C := \bigcup_{(I,\phi)\in C} I$ and $\phi_C(x) := \phi(x)$, where (I, φ) ∈ C is chosen such that
x ∈ I (since C is a chain, the value of $\phi_C(x)$ does not actually depend on the choice of
(I, φ) ∈ C and is, thus, well-defined).
Clearly, IC is an open interval, I0 ⊆ IC , and φC extends φ0 as a function; we still need to
see that φC is a solution to (1.4). For this, we, once again, use that x ∈ IC means there
exists (I, φ) ∈ C such that x ∈ I and φ is a solution to (1.4). Thus, using the notation
from Def. 1.2(a),
\[
  \big(x, \phi_C(x), \phi_C'(x), \dots, \phi_C^{(k)}(x)\big) = \big(x, \phi(x), \phi'(x), \dots, \phi^{(k)}(x)\big) \in U
\]
and
\[
  F\big(x, \phi_C(x), \phi_C'(x), \dots, \phi_C^{(k)}(x)\big) = F\big(x, \phi(x), \phi'(x), \dots, \phi^{(k)}(x)\big) = 0,
\]
showing φC is a solution to (1.4) as defined in Def. 1.2(a). In particular, (IC , φC ) ∈ S. To
verify (IC , φC ) is an upper bound for C, note that the definition of (IC , φC ) immediately
implies I ⊆ IC for each (I, φ) ∈ C and φC ↾I = φ for each (I, φ) ∈ C.
To conclude the proof, we note that all hypotheses of Zorn’s lemma have been verified
such that it yields the existence of a maximal element (Imax , φmax ) ∈ S, i.e. φmax :
Imax −→ Kn must be a maximal solution extending φ0 .

Proposition 3.23. Let k, n ∈ N. Given G ⊆ R × Kkn open and f : G −→ Kn
continuous, if φ : I −→ Kn is a solution to (1.6) such that I = ]a, b[, a < b, b < ∞
(resp. −∞ < a), then φ has an extension to the right (resp. to the left) if, and only if,
\[
  \underset{(b,\eta_0,\dots,\eta_{k-1})\in G}{\exists}\quad
  \lim_{x\uparrow b} \big(\phi(x), \phi'(x), \dots, \phi^{(k-1)}(x)\big) = (\eta_0,\dots,\eta_{k-1}), \tag{3.64a}
\]
\[
  \Big( \text{resp.}\ \underset{(a,\eta_0,\dots,\eta_{k-1})\in G}{\exists}\quad
  \lim_{x\downarrow a} \big(\phi(x), \phi'(x), \dots, \phi^{(k-1)}(x)\big) = (\eta_0,\dots,\eta_{k-1}) \Big). \tag{3.64b}
\]

Proof. That the respective part of (3.64) is necessary for the existence of the respective
extension is immediate from the fact that, for each solution to (1.6), the solution and
all its derivatives up to order k − 1 must exist and must be continuous.
We now prove that (3.64a) is also sufficient for the existence of an extension to the right
(the sufficiency of (3.64b) for the existence of an extension to the left is then immediate
from Rem. 3.21). So assume (3.64a) to hold and consider the initial value problem
consisting of (1.6) and the initial conditions
\[
  \underset{j=0,\dots,k-1}{\forall}\quad y^{(j)}(b) = \eta_j.
\]

By Cor. 3.10, there must exist $\epsilon > 0$ such that this initial value problem has a solution
$\psi : \,]b-\epsilon, b+\epsilon[\, \longrightarrow \mathbb K^n$. We now show that φ extended to b via (3.64a) is still a solution to
(1.6). First note the mean value theorem (cf. [Phi16, Th. 9.18]) yields that $\phi^{(j)}(b) = \eta_j$
exists for j = 1, . . . , k − 1 as a left-hand derivative. Moreover,
\[
  \lim_{x\uparrow b} \phi^{(k)}(x) = \lim_{x\uparrow b} f\big(x, \phi(x), \phi'(x), \dots, \phi^{(k-1)}(x)\big) = f(b, \eta_0, \dots, \eta_{k-1}),
\]
showing $\phi^{(k)}(b) = f\big(b, \phi(b), \phi'(b), \dots, \phi^{(k-1)}(b)\big)$ (again employing the mean value theorem),
which proves φ extended to b is a solution to (1.6). Finally, Lem. 1.7 ensures
\[
  \sigma : \,]a, b+\epsilon[\, \longrightarrow \mathbb K^n, \qquad
  \sigma(x) := \begin{cases} \phi(x) & \text{for } x \le b, \\ \psi(x) & \text{for } x \ge b, \end{cases}
\]
is a solution to (1.6) that extends φ to the right.
Proposition 3.24. Let k, n ∈ N, let G ⊆ R × Kkn be open, let f : G −→ Kn be
continuous, and let φ : I −→ Kn be a solution to (1.6) defined on the open interval I.
Consider x0 ∈ I and let gr+ (φ) (resp. gr− (φ)) denote the graph of (φ, . . . , φ(k−1) ) for
x ≥ x0 (resp. for x ≤ x0 ):
\[
  \operatorname{gr}_+(\phi) := \operatorname{gr}_+(\phi, x_0) := \big\{ \big(x, \phi(x), \dots, \phi^{(k-1)}(x)\big) \in G : x\in I,\ x \ge x_0 \big\}, \tag{3.65a}
\]
\[
  \operatorname{gr}_-(\phi) := \operatorname{gr}_-(\phi, x_0) := \big\{ \big(x, \phi(x), \dots, \phi^{(k-1)}(x)\big) \in G : x\in I,\ x \le x_0 \big\}. \tag{3.65b}
\]
If there exists a compact set K ⊆ G such that gr+ (φ) ⊆ K (resp. gr− (φ) ⊆ K), then φ
has an extension ψ : J −→ Kn to the right (resp. to the left) such that
\[
  \underset{\tilde x\in J}{\exists}\quad \big(\tilde x, \psi(\tilde x), \dots, \psi^{(k-1)}(\tilde x)\big) \notin K. \tag{3.66}
\]

The statement can be rephrased by saying that gr+ (φ) (resp. gr− (φ)) of each maximal
solution φ to (1.6) escapes from every compact subset of G when x approaches the right
(resp. the left) boundary of I (where the boundary of I can contain −∞ and/or +∞).

Proof. We conduct the proof for extensions to the right; extensions to the left can be
handled completely analogously (alternatively, one can apply the time reversion Lem.
1.9(b) as demonstrated in the last paragraph of the proof below). The proof for exten-
sions to the right is divided into three steps. Let K ⊆ G be compact.
Step 1: We show that gr+ (φ) ⊆ K implies φ has an extension to the right: Since K is
bounded, so is gr+ (φ), implying

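\[
  b := \sup I < \infty \tag{3.67}
\]
as well as
\[
  M_1 := \sup\big\{ \|\phi^{(j)}(x)\| : j\in\{0,\dots,k-1\},\ x\in[x_0,b[ \big\} < \infty.
\]
In the usual way, K compact and f continuous imply
\[
  M_2 := \max\big\{ \|f(x,y)\| : (x,y)\in K \big\} < \infty.
\]
Set
\[
  M := \max\{M_1, M_2\}.
\]
According to Prop. 3.23, we need to show (3.64a) holds. To this end, notice
\[
  \underset{j=0,\dots,k-1}{\forall}\ \ \underset{x,\bar x\in[x_0,b[}{\forall}\quad
  \|\phi^{(j)}(x) - \phi^{(j)}(\bar x)\| \le M\,|x-\bar x| : \tag{3.68}
\]
Indeed,
\[
  \underset{x,\bar x\in[x_0,b[}{\forall}\quad
  \|\phi^{(k-1)}(x) - \phi^{(k-1)}(\bar x)\| = \left\| \int_{\bar x}^{x} f\big(t, \phi(t), \dots, \phi^{(k-1)}(t)\big)\,\mathrm{d}t \right\| \le M\,|x-\bar x|,
\]
and, for 0 ≤ j < k − 1,
\[
  \underset{x,\bar x\in[x_0,b[}{\forall}\quad
  \|\phi^{(j)}(x) - \phi^{(j)}(\bar x)\| = \left\| \int_{\bar x}^{x} \phi^{(j+1)}(t)\,\mathrm{d}t \right\| \le M\,|x-\bar x|,
\]
proving (3.68). Since K is compact, there exists a sequence $(x_m)_{m\in\mathbb N}$ in $[x_0, b[$ such that
\[
  \underset{(b,\eta_0,\dots,\eta_{k-1})\in K}{\exists}\quad
  \lim_{m\to\infty} \big(x_m, \phi(x_m), \phi'(x_m), \dots, \phi^{(k-1)}(x_m)\big) = (b, \eta_0, \dots, \eta_{k-1}). \tag{3.69}
\]
Using $\bar x := x_m$ in (3.68) yields, for m → ∞,
\[
  \underset{j=0,\dots,k-1}{\forall}\ \ \underset{x\in[x_0,b[}{\forall}\quad
  \|\phi^{(j)}(x) - \eta_j\| \le M\,|x-b|,
\]
implying
\[
  \underset{j=0,\dots,k-1}{\forall}\quad \lim_{x\uparrow b} \phi^{(j)}(x) = \eta_j,
\]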

i.e. (3.64a) holds, completing the proof of Step 1.



Step 2: We show that gr+ (φ) ⊆ K implies φ can be extended to the right to I∪]x0 , b+α[,
where α > 0 does not depend on b := sup I: Since K is compact, Cor. 3.9 guarantees
every initial value problem
\[
  y^{(k)} = f\big(x, y, y', \dots, y^{(k-1)}\big), \tag{3.70a}
\]
\[
  \underset{j=0,\dots,k-1}{\forall}\quad y^{(j)}(\xi_0) = y_{0,j}, \qquad (\xi_0, y_0)\in K, \tag{3.70b}
\]

has a solution defined on ]ξ0 − α, ξ0 + α[ with the same α > 0. As shown in Step
1, the solution φ can be extended into b = sup I such that it satisfies (3.70b) with
(ξ0 , y0 ) = (b, η) ∈ K. Thus, using Lem. 1.7, it can be pieced together with the solution
to (3.70) given on [b, b + α[ by Cor. 3.9, completing the proof of Step 2.
Step 3: We finally show that gr+ (φ) ⊆ K implies φ has an extension ψ : J −→ Kn
to the right such that (3.66) holds: We set a := inf I and φ0 := φ. Then, by Step 2,
φ0 has an extension φ1 defined on ]a, b + α[. Inductively, for each m ≥ 1, either there
exists m0 ≤ m such that φm0 : ]a, b + m0 α[−→ Kn is an extension of φ that can be
used as ψ to conclude the proof (i.e. ψ := φm0 satisfies (3.66)) or φm can, once more,
be extended to ]a, b + (m + 1)α[. As K is bounded, {x ≥ x0 : (x, y) ∈ K} ⊆ R must
also be bounded, say by µ ∈ R. Thus, (3.66) must be satisfied for some ψ := φm with
1 ≤ m ≤ (µ − x0 )/α.
As mentioned above, one can argue completely analogously to the above proof to obtain
that gr− (φ) ⊆ K implies φ to have an extension to the left, satisfying (3.66). Here we
show how one, alternatively, can use the time reversion Lem. 1.9 to this end: Consider
the map
\[
  h : \mathbb R\times\mathbb K^{kn} \longrightarrow \mathbb R\times\mathbb K^{kn}, \qquad
  h(x, y_1, \dots, y_k) := \big(-x, y_1, \dots, (-1)^{k-1} y_k\big),
\]
which clearly constitutes an R-linear isomorphism. Noting (1.6) and (1.32) are the same,
we consider the time-reversed version (1.33) and observe $G_g = h(G)$ to be open, $h(K) \subseteq G_g$
to be compact, $g : G_g \longrightarrow \mathbb K^n$, $g = (-1)^k (f\circ h)$, to be continuous. If gr− (φ, x0 ) ⊆ K,
then gr+ (ψ, −x0 ) ⊆ h(K), where ψ is the solution to the time-reversed version (1.33),
given by Lem. 1.9(b). Then ψ has an extension ψ̃ to the right, satisfying (3.66) with ψ
replaced by ψ̃ and K replaced by h(K). Then, by Rem. 3.21, φ must have an extension
φ̃ to the left, satisfying (3.66) with ψ replaced by φ̃.

In Th. 3.28 below, we will show that, for continuous f : G −→ Kn , each maximal
solution to (1.6) must go to the boundary of G in the sense of the following definition.
Definition 3.25. Let k, n ∈ N, let G ⊆ R × Kkn be open, let f : G −→ Kn , and let
φ : ]a, b[−→ Kn , −∞ ≤ a < b ≤ ∞, be a solution to (1.6). We say that the solution φ
goes to the boundary of G for x → b (resp. for x → a) if, and only if,
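\[
  \underset{K\subseteq G\ \text{compact}}{\forall}\ \ \underset{x_0\in\,]a,b[}{\exists}\quad
  \operatorname{gr}_+(\phi, x_0)\cap K = \emptyset \quad \big(\text{resp. } \operatorname{gr}_-(\phi, x_0)\cap K = \emptyset\big), \tag{3.71}
\]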

where gr+ (φ, x0 ) and gr− (φ, x0 ) are defined as in (3.65) (with I =]a, b[). In other words,
φ goes to the boundary of G for x → b (resp. for x → a) if, and only if, the graph of
(φ, . . . , φ(k−1) ) escapes every compact subset K of G forever for x → b (resp. for x → a).

Proposition 3.26. In the situation of Def. 3.25, if the solution φ goes to the boundary
of G for x → b, then one of the following conditions must hold:

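(i) b = ∞,

(ii) b < ∞ and $L := \limsup_{x\uparrow b} \big\| \big(\phi(x), \dots, \phi^{(k-1)}(x)\big) \big\| = \infty$,

(iii) b < ∞, L < ∞ (L as defined in (ii)), $G \ne \mathbb R\times\mathbb K^{kn}$ (i.e. $\partial G \ne \emptyset$), and
\[
  \lim_{x\uparrow b} \operatorname{dist}\Big( \big(x, \phi(x), \dots, \phi^{(k-1)}(x)\big),\ \partial G \Big) = 0. \tag{3.72}
\]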

An analogous statement is valid for the solution φ going to the boundary of G for x → a.

Proof. The proof is carried out for x → b; the proof for x → a is analogous.
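Assume (i) – (iii) are all false. Choose c ∈ ]a, b[. Since (i) and (ii) are false,
\[
  \underset{0\le M<\infty}{\exists}\ \ \underset{x\in[c,b[}{\forall}\quad
  \big\| \big(\phi(x), \dots, \phi^{(k-1)}(x)\big) \big\| \le M.
\]
If (iii) is false because $G = \mathbb R\times\mathbb K^{kn}$, then $K := \{(x,y)\in\mathbb R\times\mathbb K^{kn} : x\in[c,b],\ \|y\|\le M\}$
is a compact subset of G that shows (3.71) does not hold. In the only remaining case,
(iii) must be false, since (3.72) does not hold. Thus,
\[
  \underset{\delta>0}{\exists}\ \ \underset{x_0\in\,]a,b[}{\forall}\ \ \underset{x_1\in\,]x_0,b[}{\exists}\quad
  \operatorname{dist}\Big( \big(x_1, \phi(x_1), \dots, \phi^{(k-1)}(x_1)\big),\ \partial G \Big) \ge \delta.
\]
Clearly, the set
\[
  A := \big\{ (x,y)\in G : \operatorname{dist}\big((x,y), \partial G\big) \ge \delta \big\}
\]
is closed (e.g. as the distance function $d : \mathbb R\times\mathbb K^{kn} \longrightarrow \mathbb R_0^+$, $d(\cdot) := \operatorname{dist}(\cdot, \partial G)$ is
continuous (see Th. C.4) and $A = d^{-1}([\delta,\infty[) \cap (G\cup\partial G)$). In consequence, K ∩ A with
K as defined above is a compact subset of G that shows (3.71) does not hold.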
Remark 3.27. (a) Examples such as the second ODE of Ex. 3.30(b) below show that
the lim sup in Prop. 3.26(ii) can not be replaced with a lim.
(b) If f : G −→ Kn is continuous, then the three conditions of Prop. 3.26 are also
sufficient for φ to go to the boundary of G (cf. Cor. 3.29 below).
(c) For discontinuous f : G −→ Kn , in general, (ii) of Prop. 3.26 is no longer sufficient
for φ to go to the boundary of G as is shown by simple examples, whereas (i) and (iii)
remain sufficient, even for discontinuous f (exercise). Similarly, simple examples
show Prop. 3.24 becomes false without the assumption of f being continuous; and
it can also happen that a maximal solution escapes every compact set, but still does
not go to the boundary of G (exercise).
Theorem 3.28. In the situation of Def. 3.25, if f : G −→ Kn is continuous and
φ : ]a, b[−→ Kn is a maximal solution to (1.6), then φ must go to the boundary of G for
both x → a and x → b, i.e., for both x → a and x → b, it must escape every compact
subset K of G forever and it must satisfy one of the conditions specified in Prop. 3.26
(and one of the analogous conditions for x → a).

Proof. We carry out the proof for x → b – the proof for x → a can be done analogously
or by applying the time reversion Lem. 1.9, as indicated at the end of the proof below.
Let φ : ]a, b[−→ Kn be a maximal solution to (1.6). Seeking a contradiction, we assume
φ does not go to the boundary of G for x → b, i.e. (3.71) does not hold and there exists
a compact subset K of G and a strictly increasing sequence (xm )m∈N in ]a, b[ such that
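$\lim_{m\to\infty} x_m = b < \infty$ and
\[
  \underset{m\in\mathbb N}{\forall}\quad \big(x_m, \phi(x_m), \dots, \phi^{(k-1)}(x_m)\big) \in K. \tag{3.73}
\]
We now define C to be another compact subset of G that is strictly between K and G,
i.e. $K \subsetneq C \subsetneq G$: More precisely, we choose r > 0 such that
\[
  C := \big\{ (x,y)\in\mathbb R\times\mathbb K^{kn} : \operatorname{dist}\big((x,y), K\big) \le r \big\} \subseteq G,
\]
where
\[
  \operatorname{dist}\big((x,y), K\big) = \inf\big\{ \|(x,y) - (\tilde x,\tilde y)\|_2 : (\tilde x,\tilde y)\in K \big\},
\]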
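$\|\cdot\|_2$ denoting the Euclidean norm on $\mathbb R^{kn+1}$ for K = R and the Euclidean norm on
$\mathbb R^{2kn+1}$ for K = C (this choice of norm is different from previous choices and will be
convenient later during the current proof). As φ is a maximal solution, Prop. 3.24
guarantees the existence of another strictly increasing sequence $(\xi_m)_{m\in\mathbb N}$ in ]a, b[ such
that $\lim_{m\to\infty}\xi_m = b < \infty$, $x_1 < \xi_1 < x_2 < \xi_2 < \dots$ (i.e. $x_m < \xi_m < x_{m+1}$ for each
m ∈ N) and such that
\[
  \underset{m\in\mathbb N}{\forall}\quad \big(\xi_m, \phi(\xi_m), \dots, \phi^{(k-1)}(\xi_m)\big) \notin C.
\]
Noting $\big(x_m, \phi(x_m), \dots, \phi^{(k-1)}(x_m)\big) \in K$ by (3.73) and K ⊆ C, define
\[
  \underset{m\in\mathbb N}{\forall}\quad
  s_m := \sup\big\{ s \ge x_m : \big(x, \phi(x), \dots, \phi^{(k-1)}(x)\big) \in C \text{ for each } x\in[x_m, s] \big\}.
\]
By the definition of $s_m$ as a sup, $s_m < x_{m+1} < b < \infty$, and by the continuity of the
distance function $d : \mathbb R\times\mathbb K^{kn} \longrightarrow \mathbb R_0^+$, $d(\cdot) := \operatorname{dist}(\cdot, K)$ (see Th. C.4), one obtains
\[
  \underset{m\in\mathbb N}{\forall}\quad
  \operatorname{dist}\Big( \big(s_m, \phi(s_m), \dots, \phi^{(k-1)}(s_m)\big),\ K \Big) = r,
\]
in particular,
\[
  \underset{x\in[x_m,s_m]}{\forall}\quad \big(x, \phi(x), \dots, \phi^{(k-1)}(x)\big) \in C \tag{3.74}
\]
and
\[
  \underset{m\in\mathbb N}{\forall}\quad
  \Big\| \big(x_m, \phi(x_m), \dots, \phi^{(k-1)}(x_m)\big) - \big(s_m, \phi(s_m), \dots, \phi^{(k-1)}(s_m)\big) \Big\|_2 \ge r. \tag{3.75}
\]
We use the boundedness of the compact set C and (3.74) to provide
\[
  M_1 := \sup\Big\{ \big\| \big(\phi(x), \dots, \phi^{(k-1)}(x)\big) \big\|_2 : x\in[x_m, s_m],\ m\in\mathbb N \Big\} < \infty,
\]
\[
  M_2 := \max\big\{ \|f(x,y)\|_2 : (x,y)\in C \big\} < \infty
\]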

(as C is compact and f continuous),
\[
  M := \max\{M_1, M_2\}.
\]
We now notice that each function
\[
  J_m : [x_m, s_m] \longrightarrow \mathbb R\times\mathbb K^{kn}, \qquad
  J_m(x) := \big(x, \phi(x), \dots, \phi^{(k-1)}(x)\big),
\]
is a continuously differentiable curve or path (using the continuity of f ), cf. Def. F.1
(for K = C, we consider $J_m$ as a path in $\mathbb R^{2kn+1}$). To finish the proof, we will have to
make use of the notion of arc length (cf. Def. F.5) of such a continuously differentiable
curve: Recall that each such continuously differentiable path is rectifiable, i.e. it has a
well-defined finite arc length $l(J_m)$ (cf. Th. F.7). Moreover, $l(J_m)$ satisfies
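\[
  \begin{aligned}
  \|J_m(x_m) - J_m(s_m)\|_2
  &\overset{(F.4)}{\le} l(J_m) \overset{(F.17)}{=} \int_{x_m}^{s_m} \|J_m'(x)\|_2 \,\mathrm{d}x \\
  &= \int_{x_m}^{s_m} \sqrt{1 + \sum_{j=1}^{k} \|\phi^{(j)}(x)\|_2^2}\ \mathrm{d}x \\
  &\le \int_{x_m}^{s_m} \sqrt{1 + \big\| \big(\phi(x), \dots, \phi^{(k-1)}(x)\big) \big\|_2^2 + \big\| f\big(J_m(x)\big) \big\|_2^2}\ \mathrm{d}x \\
  &\le \int_{x_m}^{s_m} \sqrt{1 + 2M^2}\ \mathrm{d}x,
  \end{aligned} \tag{3.76}
\]
where it was used that $\|\cdot\|_2$ was chosen to be the Euclidean norm. For each m ∈ N, we
estimate
\[
  \begin{aligned}
  0 < r &\overset{(3.75)}{\le} \Big\| \big(x_m, \phi(x_m), \dots, \phi^{(k-1)}(x_m)\big) - \big(s_m, \phi(s_m), \dots, \phi^{(k-1)}(s_m)\big) \Big\|_2 \\
  &= \|J_m(x_m) - J_m(s_m)\|_2 \overset{(3.76)}{\le} \int_{x_m}^{s_m} \sqrt{1 + 2M^2}\ \mathrm{d}x \\
  &= (s_m - x_m)\sqrt{1 + 2M^2}.
  \end{aligned} \tag{3.77}
\]
However, $\lim_{m\to\infty} (s_m - x_m)\sqrt{1+2M^2} = 0$ due to $\lim_{m\to\infty} s_m = \lim_{m\to\infty} x_m = b$, in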
contradiction to r > 0. This contradiction shows our initial assumption that φ does not
go to the boundary of G for x → b must have been wrong.
To obtain the remaining assertion that φ must go to the boundary of G for x → a,
one can proceed as in the last paragraph of the proof of Prop. 3.24, making use of the
function h defined there and of the time reversion Lem. 1.9: If K ⊆ G is a compact
set and ψ is the solution to the time-reversed version given by Lem. 1.9(b), then ψ
must be maximal as φ is maximal. Thus, for x → −a, ψ must escape the compact set
h(K) forever by the first part of the proof above, implying φ must escape K forever for
x → a.
Corollary 3.29. Let k, n ∈ N, let G ⊆ R × Kkn be open, and let f : G −→ Kn
be continuous. If φ : ]a, b[−→ Kn , a < b, is a solution to (1.6), then the following
statements are equivalent:

(i) φ is a maximal solution.

(ii) φ must go to the boundary of G for both x → a and x → b in the sense defined in
Def. 3.25.

(iii) φ satisfies one of the conditions specified in Prop. 3.26 and one of the analogous
conditions for x → a.

Proof. (i) implies (ii) by Th. 3.28, (ii) implies (iii) by Prop. 3.26, and it is an exercise
to show (iii) implies (i) (here, Prop. 3.23 is the clue). 
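
To make the phrase "escapes every compact set forever" concrete, the following Python sketch considers the initial value problem y' = y², y(−1) = 1 on G = R², with the solution φ(x) = −1/x on ]−∞, 0[ (used here as a standard blow-up example). For the compact sets K_R := [−1, 0] × [−R, R], R ≥ 1, one has φ(x) > R for each x ∈ ]−1/R, 0[, so the graph of φ has left K_R for good once x > −1/R. The sets K_R and the function names are assumptions made for this sketch.

```python
def phi(x):
    """Solution of y' = y^2, y(-1) = 1, on ]-inf, 0[."""
    return -1.0 / x

def dphi(x):
    """Exact derivative of phi."""
    return 1.0 / (x * x)

def exit_abscissa(R):
    """For R >= 1: phi(x) > R for every x in ]exit_abscissa(R), 0[."""
    return -1.0 / R

# phi satisfies the ODE and the initial condition (up to rounding).
sample_points = [-1.0, -0.5, -0.1, -0.01]
ode_residual = max(abs(dphi(x) - phi(x) ** 2) for x in sample_points)

# Beyond the exit abscissa, the graph lies outside K_R = [-1, 0] x [-R, R].
radii = [10.0, 100.0, 1000.0]
escapes = [phi(0.5 * exit_abscissa(R)) > R for R in radii]

print(ode_residual, escapes)
```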
Example 3.30. The following examples illustrate the different kinds of possible behavior
of maximal solutions listed in Prop. 3.26 (the different kinds of behavior can already
be seen for 1-dimensional ODE of first order):

(a) The initial value problem
\[ y' = 0, \qquad y(0) = -1, \]
has the maximal solution φ : R −→ R, φ(x) = −1; here we have
\[ G = \mathbb{R}^2, \qquad f : G \to \mathbb{R}, \qquad f(x,y) = 0, \]
solution interval I = R, b := sup I = ∞, i.e. we are in Case (i) of Prop. 3.26.

(b) The initial value problem
\[ y' = x^{-2}, \qquad y(-1) = 1, \]
has the maximal solution φ : ]−∞, 0[ −→ R, φ(x) = −x^{−1}; here we have
\[ G = \,]-\infty,0[\,\times\mathbb{R}, \qquad f : G \to \mathbb{R}, \qquad f(x,y) = x^{-2}, \]
solution interval I = ]−∞, 0[, b := sup I = 0, lim_{x↑0} |φ(x)| = ∞, i.e. we are in Case
(ii) of Prop. 3.26.
To obtain an example, where we are also in Case (ii) of Prop. 3.26, but where
lim_{x↑b} |φ(x)|, b := sup I, does not exist, consider the initial value problem
\[ y' = -\frac{1}{x^2}\sin\frac{1}{x} - \frac{1}{x^3}\cos\frac{1}{x}, \qquad y\Big(-\frac{1}{\pi}\Big) = 0, \]
which has the maximal solution φ : ]−∞, 0[ −→ R, φ(x) = x^{−1} sin(x^{−1}) (here
lim sup_{x↑0} |φ(x)| = ∞, but, as φ(−1/(kπ)) = 0 for each k ∈ N, lim_{x↑0} |φ(x)| does
not exist); here we have
\[ G = (\mathbb{R}\setminus\{0\})\times\mathbb{R}, \qquad f : G \to \mathbb{R}, \qquad f(x,y) = -\frac{1}{x^2}\sin\frac{1}{x} - \frac{1}{x^3}\cos\frac{1}{x}. \]

To obtain an example, where we are again in Case (ii) of Prop. 3.26, but where
G = R², consider the initial value problem
\[ y' = y^2, \qquad y(-1) = 1, \]
which, as in the first example of (b), has the maximal solution φ : ]−∞, 0[ −→ R,
φ(x) = −x^{−1}; here we have
\[ G = \mathbb{R}^2, \qquad f : G \to \mathbb{R}, \qquad f(x,y) = y^2. \]

(c) The initial value problem
\[ y' = -y^{-1}, \qquad y(-1) = 1, \]
has the maximal solution φ : ]−∞, −1/2[ −→ R, φ(x) = √(−2x − 1); here we have
\[ G = \mathbb{R}\times(\mathbb{R}\setminus\{0\}), \qquad f : G \to \mathbb{R}, \qquad f(x,y) = -y^{-1}, \]
solution interval I = ]−∞, −1/2[, b := sup I = −1/2, ∂G = R × {0},
\[ \lim_{x\uparrow b}\,(x,\phi(x)) = \Big(-\frac{1}{2},\,0\Big) \in \partial G, \]
i.e. we are in Case (iii) of Prop. 3.26.


An example, where we are in Case (iii) of Prop. 3.26, but where lim_{x↑b} (x, φ(x))
does not exist, is given by the initial value problem
\[ y' = -\frac{1}{x^2}\cos\frac{1}{x}, \qquad y\Big(-\frac{1}{\pi}\Big) = 0, \]
which has the maximal solution φ : ]−∞, 0[ −→ R, φ(x) = sin(1/x); here we have
\[ G = (\mathbb{R}\setminus\{0\})\times\mathbb{R}, \qquad f : G \to \mathbb{R}, \qquad f(x,y) = -\frac{1}{x^2}\cos\frac{1}{x}, \]
solution interval I = ]−∞, 0[, b := sup I = 0, ∂G = {0} × R,
\[ \lim_{x\uparrow 0}\operatorname{dist}\big((x,\phi(x)),\,\partial G\big) = \lim_{x\uparrow 0}|x| = 0. \]

As a final example, where we are again in Case (iii) of Prop. 3.26, reconsider the
initial value problem from (a), but this time with

G =] − 1, 1[×] − 3, 5[, f : G −→ R, f (x, y) = 0.

Now the maximal solution is φ : ]−1, 1[ −→ R, φ(x) = −1, solution interval I = ]−1, 1[,
b := sup I = 1, and lim_{x↑1} (x, φ(x)) = (1, −1) ∈ ∂G. This last example also
illustrates that, even though it is quite common to omit an explicit specification of
the domain G when writing an ODE (as we did in (a)) – where it is usually assumed
that the intended domain can be guessed from the context – the maximal solution
will typically depend on the specification of G.
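
The closed-form solutions in Example 3.30 can be checked numerically. The following is a minimal Python sketch, not part of the original notes: it compares a central finite-difference derivative of a candidate solution with the right-hand side of the ODE. The helper names `derivative` and `residual` are hypothetical, and the step size `h` is an ad hoc choice. It is applied to y′ = y² with φ(x) = −1/x and to y′ = −1/y with φ(x) = √(−2x − 1).

```python
import math

def derivative(phi, x, h=1e-6):
    """Central finite-difference approximation of phi'(x)."""
    return (phi(x + h) - phi(x - h)) / (2.0 * h)

def residual(phi, f, x):
    """|phi'(x) - f(x, phi(x))|; close to 0 if phi solves y' = f(x, y) at x."""
    return abs(derivative(phi, x) - f(x, phi(x)))

# Example (b), last part: y' = y^2, y(-1) = 1, phi(x) = -1/x on ]-inf, 0[
phi_b = lambda x: -1.0 / x
f_b = lambda x, y: y * y

# Example (c), first part: y' = -1/y, y(-1) = 1, phi(x) = sqrt(-2x - 1) on ]-inf, -1/2[
phi_c = lambda x: math.sqrt(-2.0 * x - 1.0)
f_c = lambda x, y: -1.0 / y

for x in (-3.0, -2.0, -1.0):
    print(x, residual(phi_b, f_b, x), residual(phi_c, f_c, x))

# Behavior near the right end b of the solution interval:
print(phi_b(-1e-6))          # |phi| is large close to b = 0 (Case (ii))
print(phi_c(-0.5 - 1e-12))   # phi is close to 0, i.e. the graph approaches R x {0} (Case (iii))
```

The residuals are only small up to discretization and rounding error, so such a check can support, but not replace, the verification by differentiation.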

Example 3.31. We have already seen examples of initial value problems that admit
more than one maximal solution – for instance, the initial value problem of Ex. 1.4(b)
had infinitely many different maximal solutions, all of them defined on all of R. The fol-
lowing example shows that an initial value problem can have maximal solutions defined
on different intervals: Let
\[ G := \mathbb{R}\times\,]-1,1[, \qquad f : G \to \mathbb{R}, \qquad f(x,y) := \frac{\sqrt{|y|}}{1-\sqrt{|y|}}, \]
and consider the initial value problem
\[ y' = f(x,y) = \frac{\sqrt{|y|}}{1-\sqrt{|y|}}, \qquad y(0) = 0. \tag{3.78} \]
An obvious maximal solution is
\[ \phi : \mathbb{R} \to \mathbb{R}, \qquad \phi \equiv 0. \]
However, another maximal solution (that can be found using separation of variables) is
\[
\psi : \,]-1,1[\, \to \mathbb{R}, \qquad
\psi(x) := \begin{cases} -\big(1-\sqrt{1+x}\big)^2 & \text{for } -1 < x \le 0,\\[2pt] \big(1-\sqrt{1-x}\big)^2 & \text{for } 0 \le x < 1. \end{cases}
\]
To confirm the maximality of the solution ψ, note lim_{x↓−1} (x, ψ(x)) = (−1, −1) ∈ ∂G
and lim_{x↑1} (x, ψ(x)) = (1, 1) ∈ ∂G.
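
As a plausibility check for Example 3.31, the following Python sketch (an illustration added here, with hypothetical helper names and an ad hoc step size) evaluates the residual |φ′(x) − f(x, φ(x))| for both the zero solution and the piecewise solution ψ, assuming the right-hand side f(x, y) = √|y| / (1 − √|y|) and ψ(x) = ±(1 − √(1 ∓ x))².

```python
import math

def f(x, y):
    """Right-hand side of (3.78), defined for |y| < 1."""
    r = math.sqrt(abs(y))
    return r / (1.0 - r)

def psi(x):
    """Nontrivial maximal solution on ]-1, 1[ with psi(0) = 0."""
    if x <= 0.0:
        return -(1.0 - math.sqrt(1.0 + x)) ** 2
    return (1.0 - math.sqrt(1.0 - x)) ** 2

def zero(x):
    """Trivial maximal solution, defined on all of R."""
    return 0.0

def residual(phi, x, h=1e-6):
    """|phi'(x) - f(x, phi(x))| with a central finite difference for phi'."""
    d = (phi(x + h) - phi(x - h)) / (2.0 * h)
    return abs(d - f(x, phi(x)))

for x in (-0.9, -0.5, 0.5, 0.9):
    print(x, residual(zero, x), residual(psi, x))
```

Both functions satisfy the same initial condition y(0) = 0, so small residuals for both illustrate the nonuniqueness, here with maximal solutions living on different intervals.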

3.5 Continuity in Initial Conditions


The goal of the present section is to show that, under suitable conditions, small changes
in the initial condition for an ODE result in small changes in the solution. As, in
situations of nonuniqueness, we can change the solution without having changed the
initial condition at all, ensuring unique solutions to initial value problems is a minimal
prerequisite for our considerations in this section.

Definition 3.32. Let G ⊆ R × K^{kn}, k, n ∈ N, and f : G −→ K^n. We say that the
explicit n-dimensional kth-order ODE (1.6), i.e.
\[ y^{(k)} = f\big(x, y, y', \dots, y^{(k-1)}\big), \tag{3.79a} \]
admits unique maximal solutions if, and only if, f is such that every initial value problem
consisting of (3.79a) and
\[ \underset{j\in\{0,\dots,k-1\}}{\forall}\quad y^{(j)}(\xi) = \eta_j \in \mathbb{K}^n, \tag{3.79b} \]
with (ξ, η) ∈ G, has a unique maximal solution φ_{(ξ,η)} : I_{(ξ,η)} −→ K^n (combining Cor.
3.16 with Th. 3.22 yields that G being open and f being continuous and locally Lipschitz

with respect to y is sufficient for (3.79a) to admit unique maximal solutions, but we
know from Rem. 3.17 that this condition is not necessary). If f is such that (3.79a)
admits unique maximal solutions, then

\[ Y : D_f \to \mathbb{K}^n, \qquad Y(x,\xi,\eta) := \phi_{(\xi,\eta)}(x), \tag{3.80} \]
defined on
\[ D_f := \big\{ (x,\xi,\eta) \in \mathbb{R}\times G :\ x \in I_{(\xi,\eta)} \big\}, \tag{3.81} \]
is called the global or general solution to (3.79a). Note that the domain Df of Y is
determined entirely by f , which is notationally emphasized by its lower index f .

Lemma 3.33. In the situation of Def. 3.32, the following holds:

(a) Y(ξ, ξ, η) = η₀ for each (ξ, η) ∈ G.

(b) If k = 1, then η = η₀ and Y(x, x̃, Y(x̃, ξ, η)) = Y(x, ξ, η) for each (x, ξ, η), (x̃, ξ, η) ∈ D_f.

(c) If k = 1, then Y(ξ, x, Y(x, ξ, η)) = η for each (x, ξ, η) ∈ D_f.

Proof. (a) holds as Y(·, ξ, η) satisfies (3.79b). For (b) note that (x̃, ξ, η) ∈ D_f implies
(x̃, Y(x̃, ξ, η)) ∈ G, i.e. (x̃, Y(x̃, ξ, η)) are admissible initial data. Moreover,
Y(·, x̃, Y(x̃, ξ, η)) and Y(·, ξ, η) are both maximal solutions for some initial value
problem for (3.79a). Since both solutions agree at x = x̃, both functions must be identical
by the assumed uniqueness of solutions. In particular, they are defined for the same x
and yield the same value at each x. Setting x := ξ in (b) and using (a) yields
Y(ξ, x̃, Y(x̃, ξ, η)) = Y(ξ, ξ, η) = η, which is (c) with x̃ renamed to x. □
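
To make Lemma 3.33 concrete, consider y′ = y² on G = R², for which the global solution can be written down explicitly. The following Python sketch is an illustration added here and rests on the assumption (easily checked by differentiation) that Y(x, ξ, η) = η / (1 − η (x − ξ)) with D_f = {(x, ξ, η) : 1 − η (x − ξ) > 0}; the function names are hypothetical.

```python
def in_domain(x, xi, eta):
    """True if (x, xi, eta) lies in D_f for y' = y^2 (assumed closed form)."""
    return 1.0 - eta * (x - xi) > 0.0

def Y(x, xi, eta):
    """Global solution of y' = y^2: value at x of the solution with y(xi) = eta."""
    if not in_domain(x, xi, eta):
        raise ValueError("x is outside the maximal interval I_(xi, eta)")
    return eta / (1.0 - eta * (x - xi))

xi, eta = -1.0, 1.0          # initial data of Example 3.30(b), where Y(x, xi, eta) = -1/x
x, x_tilde = -0.5, -2.0

print(Y(xi, xi, eta))                                      # Lemma 3.33(a)
print(Y(x, x_tilde, Y(x_tilde, xi, eta)), Y(x, xi, eta))   # Lemma 3.33(b)
print(Y(xi, x, Y(x, xi, eta)))                             # Lemma 3.33(c)
```

The second line prints the same number twice (restarting the solution at an intermediate point does not change it), and the third recovers the initial value η.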

The core of the proof of continuity in initial conditions as stated in Cor. 3.36 below is
the following Th. 3.34(a), which provides continuity in initial conditions locally. As a
byproduct, we will also obtain a version of the Picard-Lindelöf theorem in Th. 3.34(b),
which states the local uniform convergence of the so-called Picard iteration, a method for
obtaining approximate solutions that is quite different from the Euler method considered
above.

Theorem 3.34. Consider the situation of Def. 3.32 for first-order problems, i.e. with
k = 1, and with f being continuous and locally Lipschitz with respect to y on G open.
Fix an arbitrary norm ‖·‖ on K^n.

(a) For each (σ, ζ) ∈ G ⊆ R × Kn and each −∞ < a < b < ∞ such that [a, b] ⊆ I(σ,ζ)
(i.e., using the notation introduced in Def. 3.32, the maximal solution φ(σ,ζ) =
Y (·, σ, ζ) is defined on [a, b]), there exists δ > 0 satisfying:

(i) For every point (ξ, η) in the open set
\[ U_\delta(\sigma,\zeta) := \big\{ (\xi,\eta) \in G :\ \xi \in\, ]a,b[,\ \|\eta - Y(\xi,\sigma,\zeta)\| < \delta \big\}, \tag{3.82} \]
the maximal solution φ_{(ξ,η)} = Y(·, ξ, η) is defined on ]a, b[ (i.e. ]a, b[ ⊆ I_{(ξ,η)}).

(ii) The restriction of the global solution (x, ξ, η) ↦ Y(x, ξ, η) to the open set
\[ W := \,]a,b[\,\times U_\delta(\sigma,\zeta) \tag{3.83} \]

is continuous.

(b) (Picard-Lindelöf) For each (σ, ζ) ∈ G, there exists α > 0 such that the Picard
iteration, i.e. the sequence of functions (φ_m)_{m∈N₀}, φ_m : ]σ − α, σ + α[ −→ K^n, defined
recursively by
\[ \phi_0(x) := \zeta, \tag{3.84a} \]
\[ \underset{m\in\mathbb{N}_0}{\forall}\quad \phi_{m+1}(x) := \zeta + \int_\sigma^x f\big(t,\phi_m(t)\big)\,\mathrm{d}t, \tag{3.84b} \]
converges uniformly to the solution of the initial value problem (3.79) (with k = 1
and (ξ, η) := (σ, ζ)) on ]σ − α, σ + α[.
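
The Picard iteration (3.84) can be carried out exactly in simple cases. The following Python sketch is an illustration added here, not taken from the notes: it treats the scalar problem y′ = y, y(0) = 1 (so f(t, y) = y, σ = 0, ζ = 1 are assumptions of the example), represents each iterate as a list of polynomial coefficients, and integrates term by term. The function names are hypothetical.

```python
from fractions import Fraction
import math

def picard_step(coeffs, zeta):
    """One step of (3.84b) for f(t, y) = y and sigma = 0.

    coeffs[j] is the coefficient of x**j in phi_m; the result represents
    phi_{m+1}(x) = zeta + integral_0^x phi_m(t) dt.
    """
    integrated = [Fraction(0)] + [c / (j + 1) for j, c in enumerate(coeffs)]
    integrated[0] = Fraction(zeta)
    return integrated

def picard_iterates(zeta, steps):
    """Return [phi_0, ..., phi_steps] as coefficient lists, with phi_0 = zeta."""
    iterates = [[Fraction(zeta)]]
    for _ in range(steps):
        iterates.append(picard_step(iterates[-1], zeta))
    return iterates

def evaluate(coeffs, x):
    """Evaluate the polynomial with the given coefficients at x."""
    return sum(float(c) * x ** j for j, c in enumerate(coeffs))

iterates = picard_iterates(1, 10)
grid = [k / 10.0 for k in range(-10, 11)]
for m in (1, 2, 5, 10):
    err = max(abs(evaluate(iterates[m], x) - math.exp(x)) for x in grid)
    print(m, err)   # sup-distance to exp on [-1, 1], sampled on a grid
```

For this particular f, the m-th iterate is the Taylor polynomial of degree m of the exponential function, so the printed errors decrease rapidly, in line with the factorial bound that appears in the proof below.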

Proof. We will obtain (b) as an aside while proving (a). To simplify notation, we
introduce the function

ψ : [a, b] −→ Kn , ψ(x) := Y (x, σ, ζ).

Since [a, b] is compact and ψ is continuous,

γ := (Id, ψ)[a, b] = {(x, ψ(x)) ∈ R × Kn : x ∈ [a, b]}

is a compact subset of G (cf. C.14). Thus, γ has a positive distance from the closed set
(R × K^n) \ G, implying
\[ \underset{\delta_1>0}{\exists}\quad C := \big\{ (x,y) \in \mathbb{R}\times\mathbb{K}^n :\ x \in [a,b],\ \|y-\psi(x)\| \le \delta_1 \big\} \subseteq G. \tag{3.85} \]

Clearly, C is bounded and C is also closed (using the continuity of the distance function
d : R × K^n −→ R₀⁺, d(·) := dist(·, γ), the continuity of the projection to the first
component π₁ : R × K^n −→ R, and noting C = d^{−1}[0, δ₁] ∩ π₁^{−1}[a, b]). Thus, C is
compact, and the hypothesis of f being locally Lipschitz with respect to y implies f to
be globally Lipschitz with some Lipschitz constant L ≥ 0 on the compact set C by Prop.
3.13. We can now choose the number δ > 0 claimed to exist in (a) to be any number
\[ 0 < \delta < e^{-L(b-a)}\,\delta_1. \tag{3.86} \]
Since −L(b − a) ≤ 0, we have
\[ \delta < \delta_1. \tag{3.87} \]
Moreover, with d and π₁ as above, U_δ(σ, ζ) as defined in (3.82) can be written in the
form
\[ U_\delta(\sigma,\zeta) = d^{-1}[0,\delta[\ \cap\ \pi_1^{-1}\,]a,b[, \]
showing it is an open set ([0, δ[ is, indeed, open in R₀⁺).

Even though we are mostly interested in what happens on the open set W, it will be
convenient to define functions on the slightly larger compact set
\[
\overline{W} := [a,b]\times\overline{U}, \qquad
\overline{U} := \big\{ (x,y) \in \mathbb{R}\times\mathbb{K}^n :\ x \in [a,b],\ \|y-\psi(x)\| \le \delta \big\} = d^{-1}[0,\delta] \cap \pi_1^{-1}[a,b].
\]
To proceed with the proof, we now carry out a form of the Picard iteration, recursively
defining a sequence of functions (ψ_m)_{m∈N₀}, ψ_m : W̄ −→ K^n, by
\[ \psi_0(x,\xi,\eta) := \psi(x) + \eta - \psi(\xi), \tag{3.88a} \]
\[ \underset{m\in\mathbb{N}_0}{\forall}\quad \psi_{m+1}(x,\xi,\eta) := \eta + \int_\xi^x f\big(t,\psi_m(t,\xi,\eta)\big)\,\mathrm{d}t. \tag{3.88b} \]

The proof will be concluded if we can show the (ψ_m)_{m∈N₀} constitute a sequence of
continuous functions converging uniformly on W̄ to Y↾_{W̄}. As an intermediate step, we
establish the following properties of the ψ_m (simultaneously) by induction on m ∈ N₀:

(1) ψ_m is continuous for each m ∈ N₀.

(2) One has
\[ \underset{m\in\mathbb{N}_0,\ (x,\xi,\eta)\in\overline{W}}{\forall}\quad \|\psi_m(x,\xi,\eta) - \psi(x)\| < \delta_1 \quad \Big( \Rightarrow\ \big(x,\psi_m(x,\xi,\eta)\big) \in C \Big). \]
In particular, since C ⊆ G, this shows the ψ_m are well-defined by (3.88b).

(3) One has
\[ \underset{m\in\mathbb{N}_0,\ (x,\xi,\eta)\in\overline{W}}{\forall}\quad \|\psi_{m+1}(x,\xi,\eta) - \psi_m(x,\xi,\eta)\| \le \frac{L^{m+1}\,|x-\xi|^{m+1}\,\delta}{(m+1)!}. \]

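To start the induction proof, notice that the continuity of ψ implies the continuity of
ψ₀. Moreover, if (x, ξ, η) ∈ W̄, then
\[ \|\psi_0(x,\xi,\eta) - \psi(x)\| \overset{(3.88a)}{=} \|\eta - \psi(\xi)\| = \|\eta - Y(\xi,\sigma,\zeta)\| \le \delta < \delta_1. \tag{3.89} \]
Also, from ψ = Y(·, σ, ζ) = φ_{(σ,ζ)}, we know, for each x, ξ ∈ [a, b],
\[ \psi(x) - \psi(\xi) = \zeta + \int_\sigma^x f(t,\psi(t))\,\mathrm{d}t - \zeta - \int_\sigma^\xi f(t,\psi(t))\,\mathrm{d}t = \int_\xi^x f(t,\psi(t))\,\mathrm{d}t \]
and, for each (x, ξ, η) ∈ W̄,
\[
\begin{aligned}
\|\psi_1(x,\xi,\eta) - \psi_0(x,\xi,\eta)\|
&= \Big\| \int_\xi^x \Big( f\big(t,\psi_0(t,\xi,\eta)\big) - f\big(t,\psi(t)\big) \Big)\,\mathrm{d}t \Big\| \\
&\overset{f\ L\text{-Lip.}}{\le} L\,\Big| \int_\xi^x \|\psi_0(t,\xi,\eta) - \psi(t)\|\,\mathrm{d}t \Big| \\
&= L\,\Big| \int_\xi^x \|\eta - \psi(\xi)\|\,\mathrm{d}t \Big| \le L\,|x-\xi|\,\delta,
\end{aligned}
\]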

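completing the proof of (1)-(3) for m = 0. For the induction step, let m ∈ N₀.
It is left as an exercise to prove the continuity of ψ_{m+1}.
Using the triangle inequality, we estimate, for each (x, ξ, η) ∈ W̄,
\[
\begin{aligned}
\|\psi_{m+1}(x,\xi,\eta) - \psi(x)\|
&\le \sum_{j=0}^{m} \|\psi_{j+1}(x,\xi,\eta) - \psi_j(x,\xi,\eta)\| + \|\psi_0(x,\xi,\eta) - \psi(x)\| \\
&\overset{(3.89),\ \text{ind.hyp. for (3)}}{\le} \sum_{j=0}^{m} \frac{L^{j+1}\,|x-\xi|^{j+1}\,\delta}{(j+1)!} + \delta
\le e^{L|x-\xi|}\,\delta \overset{(3.86)}{<} e^{L(b-a)}\,e^{-L(b-a)}\,\delta_1 = \delta_1,
\end{aligned}
\]
establishing the estimate of (2) for m + 1. To prove the estimate in (3) for m replaced
by m + 1, one estimates, for each (x, ξ, η) ∈ W̄,
\[
\begin{aligned}
\|\psi_{m+2}(x,\xi,\eta) - \psi_{m+1}(x,\xi,\eta)\|
&\le \Big| \int_\xi^x \big\| f\big(t,\psi_{m+1}(t,\xi,\eta)\big) - f\big(t,\psi_m(t,\xi,\eta)\big) \big\|\,\mathrm{d}t \Big| \\
&\le L\,\Big| \int_\xi^x \|\psi_{m+1}(t,\xi,\eta) - \psi_m(t,\xi,\eta)\|\,\mathrm{d}t \Big| \\
&\overset{\text{ind.hyp.}}{\le} L\,\Big| \int_\xi^x \frac{L^{m+1}\,|t-\xi|^{m+1}\,\delta}{(m+1)!}\,\mathrm{d}t \Big|
= \frac{L^{m+2}\,|x-\xi|^{m+2}\,\delta}{(m+2)!},
\end{aligned}
\]
completing the induction proof of (1)-(3).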
As a consequence of (3), for each l, m ∈ N₀ such that m > l:
\[ \underset{(x,\xi,\eta)\in\overline{W}}{\forall}\quad \|\psi_m(x,\xi,\eta) - \psi_l(x,\xi,\eta)\| \le \sum_{j=l+1}^{m} \frac{L^j\,(b-a)^j}{j!}\,\delta. \tag{3.90} \]
The convergence of the exponential series, thus, implies that (ψ_m(x, ξ, η))_{m∈N₀} is a
Cauchy sequence for each (x, ξ, η) ∈ W̄, yielding pointwise convergence of the ψ_m to
some function ψ̃ : W̄ −→ K^n. Letting m tend to infinity in (3.90) then shows
\[ \underset{(x,\xi,\eta)\in\overline{W}}{\forall}\quad \|\tilde\psi(x,\xi,\eta) - \psi_l(x,\xi,\eta)\| \le \sum_{j=l+1}^{\infty} \frac{L^j\,(b-a)^j}{j!}\,\delta, \]
where the independence of the right-hand side with respect to (x, ξ, η) ∈ W̄ proves
ψ_m → ψ̃ uniformly on W̄. The uniform convergence together with (1) then implies ψ̃
to be continuous.
In the final step of the proof, we show ψ̃ = Y on W̄, i.e. ψ̃(·, ξ, η) solves (3.79) (with
k = 1). By Th. 1.5, we need to show
\[ \underset{(x,\xi,\eta)\in\overline{W}}{\forall}\quad \tilde\psi(x,\xi,\eta) = \eta + \int_\xi^x f\big(t,\tilde\psi(t,\xi,\eta)\big)\,\mathrm{d}t \tag{3.91} \]

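(then uniqueness of solutions implies ψ̃(·, ξ, η) = Y(·, ξ, η)). To verify (3.91), given
ε > 0, by the uniform convergence ψ_m → ψ̃, choose m ∈ N sufficiently large such that
\[ \underset{k\in\{m-1,m\}}{\forall}\ \ \underset{(x,\xi,\eta)\in\overline{W}}{\forall}\quad \|\tilde\psi(x,\xi,\eta) - \psi_k(x,\xi,\eta)\| < \epsilon \]
and estimate, for each (x, ξ, η) ∈ W̄,
\[
\begin{aligned}
\Big\| \tilde\psi(x,\xi,\eta) - \eta - \int_\xi^x f\big(t,\tilde\psi(t,\xi,\eta)\big)\,\mathrm{d}t \Big\|
&\le \|\tilde\psi(x,\xi,\eta) - \psi_m(x,\xi,\eta)\| + \Big| \int_\xi^x \big\| f\big(t,\psi_{m-1}(t,\xi,\eta)\big) - f\big(t,\tilde\psi(t,\xi,\eta)\big) \big\|\,\mathrm{d}t \Big| \\
&< \epsilon + L\,\Big| \int_\xi^x \|\psi_{m-1}(t,\xi,\eta) - \tilde\psi(t,\xi,\eta)\|\,\mathrm{d}t \Big| \le \epsilon + L\,\epsilon\,(b-a).
\end{aligned}
\tag{3.92}
\]
As (3.92) holds for every ε > 0, (3.91) must be true as well.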


It is noted that we have, indeed, proved (b) as a byproduct, since we know (for example
from the Peano Th. 3.8) that ψ must be defined on [σ − α, σ + α] for some α > 0 and
then φm = ψm (·, σ, ζ) on [σ − α, σ + α] for each m ∈ N0 . 

Theorem 3.35. As in Th. 3.34, consider the situation of Def. 3.32 for first-order
problems, i.e. with k = 1, and with f being continuous and locally Lipschitz with respect to
y on G open. Then the global solution (x, ξ, η) ↦ Y(x, ξ, η) as defined in Def. 3.32 is
continuous. Moreover, its domain D_f is open.

Proof. Let (x, σ, ζ) ∈ Df . Then, using the notation from Def. 3.32, x is in the domain of
the maximal solution φ(σ,ζ) , i.e. x ∈ I(σ,ζ) . Since I(σ,ζ) is open, there must be −∞ < a <
x < b < ∞ such that [a, b] ⊆ I(σ,ζ) and then Th. 3.34(a) implies the global solution Y to
be continuous on W , where W as defined in (3.83) is an open neighborhood of (x, σ, ζ).
In particular, (x, σ, ζ) is an interior point of Df and Y is continuous at (x, σ, ζ). As
(x, σ, ζ) was arbitrary, Df must be open and Y must be continuous. 

Corollary 3.36. Consider the situation of Def. 3.32 with f being continuous and locally
Lipschitz with respect to y on G open. Then the global solution (x, ξ, η) ↦ Y(x, ξ, η) as
defined in Def. 3.32 is continuous. Moreover, its domain D_f is open.

Proof. It was part of the exercise that proved Cor. 3.16 to show that the right-hand side
F of the first-order problem equivalent to (3.79) in the sense of Th. 3.1 is continuous
and locally Lipschitz with respect to y, provided f is continuous and locally Lipschitz
with respect to y. Thus, according to Th. 3.35, the equivalent first-order problem has
a continuous global solution Υ : DF −→ Kkn , defined on some open set DF . As a
consequence of Th. 3.1(b), Y = Υ1 : DF −→ Kn is the global solution to (3.79a). So
we have Df = DF and, as Υ is continuous, so is Y . 

It is sometimes interesting to consider situations where the right-hand side f depends


on some (vector of) parameters µ in addition to depending on x and y:

Definition 3.37. If G ⊆ R × K^{kn} × K^l with k, n, l ∈ N, and f : G −→ K^n is such that,
for each (ξ, η, µ) ∈ G, the explicit n-dimensional kth-order initial value problem
\[ y^{(k)} = f\big(x, y, y', \dots, y^{(k-1)}, \mu\big), \tag{3.93a} \]
\[ \underset{j\in\{0,\dots,k-1\}}{\forall}\quad y^{(j)}(\xi) = \eta_j \in \mathbb{K}^n, \tag{3.93b} \]
has a unique maximal solution φ_{(ξ,η,µ)} : I_{(ξ,η,µ)} −→ K^n, then
\[ Y : D_f \to \mathbb{K}^n, \qquad Y(x,\xi,\eta,\mu) := \phi_{(\xi,\eta,\mu)}(x), \tag{3.94} \]
defined on
\[ D_f := \big\{ (x,\xi,\eta,\mu) \in \mathbb{R}\times G :\ x \in I_{(\xi,\eta,\mu)} \big\}, \tag{3.95} \]
is called the global or general solution to (3.93a).
Corollary 3.38. Consider the situation of Def. 3.37 with f being continuous and locally
Lipschitz with respect to (y, µ) on G open. Then the global solution Y as defined in
(3.94) is continuous. Moreover, its domain D_f is open.

Proof. We consider k = 1 (i.e. (3.93a) is of first order); the case k > 1 can then, in the
usual way, be obtained by applying Th. 3.1. To apply Th. 3.35 to the present situation,
define the auxiliary function
\[
F : G \to \mathbb{K}^{n+l}, \qquad
F_j(x,y) := \begin{cases} f_j(x,y) & \text{for } j = 1,\dots,n,\\ 0 & \text{for } j = n+1,\dots,n+l. \end{cases}
\tag{3.96}
\]
Then, since f is continuous and locally Lipschitz with respect to (y, µ), F is continuous
and locally Lipschitz with respect to y, and we can apply Th. 3.35 to
\[ y' = F(x,y), \tag{3.97a} \]
\[ y(\xi) = (\eta,\mu), \tag{3.97b} \]
where (ξ, η, µ) ∈ G. According to Th. 3.35, the global solution Ỹ : D_F −→ K^{n+l} of
(3.97a) is continuous on the open set D_F. Moreover, by the definition of F in (3.96),
we have
\[ \underset{(x,\xi,\eta,\mu)\in D_F}{\forall}\quad \tilde Y(x,\xi,\eta,\mu) = \begin{pmatrix} Y(x,\xi,\eta,\mu) \\ \mu \end{pmatrix}, \]
where Y is as defined in (3.94). In particular, D_f = D_F and the continuity of Ỹ implies
the continuity of Y. □
Example 3.39. As a simple example of a parametrized ODE, consider f : R × K² −→
K, f(x, y, µ) := µy,
\[ y' = f(x,y,\mu) = \mu\,y, \qquad y(\xi) = \eta, \]
with the global solution
\[ Y : \mathbb{R}\times\mathbb{R}\times\mathbb{K}^2 \to \mathbb{K}, \qquad Y(x,\xi,\eta,\mu) = \eta\,e^{\mu(x-\xi)}. \]
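
The continuity of Y in the parameter µ asserted by Cor. 3.38 can be observed directly for Example 3.39. The following Python sketch is an added illustration restricted to real arguments; the helper `sup_distance`, the interval [−1, 1], and the sample values are hypothetical choices. It measures the sup-distance on a grid between the solutions for two nearby parameter values.

```python
import math

def Y(x, xi, eta, mu):
    """Global solution of y' = mu * y, y(xi) = eta (real case)."""
    return eta * math.exp(mu * (x - xi))

def sup_distance(mu1, mu2, xi=0.0, eta=1.0, a=-1.0, b=1.0, n=200):
    """Largest distance between Y(., xi, eta, mu1) and Y(., xi, eta, mu2) on a grid in [a, b]."""
    grid = [a + (b - a) * k / n for k in range(n + 1)]
    return max(abs(Y(x, xi, eta, mu1) - Y(x, xi, eta, mu2)) for x in grid)

for d in (1e-1, 1e-2, 1e-3):
    print(d, sup_distance(2.0, 2.0 + d))   # shrinks as the parameters get closer
```

On a compact x-interval the distance tends to 0 as the parameters approach each other; on all of R it would not, which is consistent with the purely local nature of the continuity statement.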

4 Linear ODE

4.1 Definition, Setting


In Sec. 2.2, we saw that the solution of one-dimensional first-order linear ODE was
particularly simple. One can now combine the general theory of ODE with some linear
algebra to obtain results for n-dimensional linear ODE and, equivalently, for linear ODE
of higher order.
Notation 4.1. For n ∈ N, let M(n, K) denote the set of all n × n matrices over K.
Definition 4.2. Let I ⊆ R be a nontrivial interval, n ∈ N, and let A : I −→ M(n, K)
and b : I −→ Kn be continuous. An ODE of the form

y ′ = A(x)y + b(x) (4.1)

is called an n-dimensional linear ODE of first order. It is called homogeneous if, and
only if, b ≡ 0; it is called inhomogeneous if, and only if, it is not homogeneous.

Using the notion of matrix norm (cf. Sec. G), it is not hard to show the right-hand side
of (4.1) is continuous and locally Lipschitz with respect to y and, thus, every initial
value problem for (4.1) has a unique maximal solution (exercise). However, to show
the maximal solution is always defined on all of I, we need some additional machinery,
which is developed in the next section.

4.2 Gronwall’s Inequality


In the current section, we will provide Gronwall’s inequality, which is also of interest
outside the field of ODE. Here, Gronwall’s inequality will allow us to prove the global ex-
istence of maximal solutions for ODE with linearly bounded right-hand side – a corollary
being that maximal solutions of (4.1) are always defined on all of I.
As an auxiliary tool on our way to Gronwall’s inequality, we will now briefly study
(one-dimensional) differential inequalities:
Definition 4.3. Given G ⊆ R × R = R², and f : G −→ R, call
\[ y' \le f(x,y) \tag{4.2} \]
a (one-dimensional) differential inequality (of first order). A solution to (4.2) is a
differentiable function w : I −→ R defined on a nontrivial interval I ⊆ R satisfying the two
conditions

(i) {(x, w(x)) ∈ I × R : x ∈ I} ⊆ G,

(ii) w′(x) ≤ f(x, w(x)) for each x ∈ I.




Proposition 4.4. Let G ⊆ R² be open, let f : G −→ R be continuous and locally
Lipschitz with respect to y, and let −∞ < a < b ≤ ∞. If w : [a, b[ −→ R is a solution
to the differential inequality (4.2) and φ : [a, b[ −→ R is a solution to the corresponding
ODE, then
\[ w(a) \le \phi(a) \quad\Rightarrow\quad \underset{x\in[a,b[}{\forall}\ \ w(x) \le \phi(x). \tag{4.3} \]

Proof. Consider the auxiliary function

g : G × R −→ R, g(x, y, µ) := f (x, y) + µ (4.4)

and the (parametrized) ODE

y ′ = g(x, y, µ) = f (x, y) + µ. (4.5)

Since f is continuous and locally Lipschitz with respect to y, g is continuous and locally
Lipschitz with respect to (y, µ). Thus, continuity in initial conditions as given by Cor.
3.38 applies, yielding the global solution Y : Dg −→ R, (x, ξ, η, µ) 7→ Y (x, ξ, η, µ), to
be continuous on the open set Dg .
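We now consider an arbitrary compact subinterval [a, c] ⊆ [a, b[ with a < c < b, noting
that it suffices to prove w ≤ φ on every such interval [a, c]. The set
\[ \gamma := (\mathrm{Id}, a, \phi(a), 0)[a,c] = \big\{ (x, a, \phi(a), 0) :\ x \in [a,c] \big\} \tag{4.6} \]
is a compact subset of D_g and, thus,
\[ \underset{\epsilon>0}{\exists}\quad \gamma_\epsilon := \big\{ (x,\xi,\eta,\mu) \in \mathbb{R}^4 :\ \operatorname{dist}\big((x,\xi,\eta,\mu),\gamma\big) < \epsilon \big\} \subseteq D_g. \tag{4.7} \]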

If we choose the distance in (4.7) to be meant with respect to the max-norm on R⁴ and
if 0 < µ < ε, then (x, a, φ(a), µ) ∈ γ_ε for each x ∈ [a, c], such that φ_µ := Y(·, a, φ(a), µ)
is defined on (a superset of) [a, c]. We proceed to prove w ≤ φ_µ on [a, c]: Seeking a
contradiction, assume there exists x₀ ∈ [a, c] such that w(x₀) > φ_µ(x₀). Due to the
continuity of w and φ_µ, w > φ_µ must then hold in an entire neighborhood of x₀. On the
other hand, w(a) ≤ φ(a) = φ_µ(a), such that, for
\[ x_1 := \inf\big\{ x < x_0 :\ w(t) > \phi_\mu(t) \text{ for each } t \in\, ]x, x_0] \big\}, \]
a ≤ x₁ < x₀ and w(x₁) = φ_µ(x₁). But then, for each sufficiently small h > 0,
\[ w(x_1+h) - w(x_1) > \phi_\mu(x_1+h) - \phi_\mu(x_1), \]
implying
\[
\begin{aligned}
w'(x_1) &= \lim_{h\to 0} \frac{w(x_1+h) - w(x_1)}{h} \ge \lim_{h\to 0} \frac{\phi_\mu(x_1+h) - \phi_\mu(x_1)}{h} = \phi_\mu'(x_1) \\
&= g\big(x_1, \phi_\mu(x_1), \mu\big) = f\big(x_1, \phi_\mu(x_1)\big) + \mu > f\big(x_1, \phi_\mu(x_1)\big) = f\big(x_1, w(x_1)\big),
\end{aligned}
\tag{4.8}
\]
in contradiction to w being a solution to (4.2).



Thus, w ≤ φ_µ on [a, c] holds for every 0 < µ < ε, and continuity of Y on D_g yields
\[ \underset{x\in[a,c]}{\forall}\quad w(x) \le \lim_{\mu\to 0} \phi_\mu(x) = \lim_{\mu\to 0} Y\big(x, a, \phi(a), \mu\big) = Y\big(x, a, \phi(a), 0\big) = \phi(x), \tag{4.9} \]

concluding the proof. 
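
A simple instance of Prop. 4.4 can be checked on a grid. The following Python sketch is an added illustration: the choice f(x, y) = y², the solution φ(x) = 1/(1 − x) of y′ = y², y(0) = 1 on [0, 1[, and the candidate w(x) = 1 + x are assumptions of the example, not taken from the notes.

```python
f = lambda x, y: y * y
phi = lambda x: 1.0 / (1.0 - x)      # solves y' = y^2, y(0) = 1 on [0, 1[
w = lambda x: 1.0 + x                # candidate solution of y' <= y^2
w_prime = lambda x: 1.0              # derivative of w

grid = [k / 1000.0 for k in range(0, 1000)]   # sample points in [0, 1[

# w solves the differential inequality (4.2) on the grid ...
is_subsolution = all(w_prime(x) <= f(x, w(x)) for x in grid)
# ... and w(0) <= phi(0), so Prop. 4.4 predicts w <= phi on [0, 1[.
stays_below = all(w(x) <= phi(x) for x in grid)

print(is_subsolution, w(0.0) <= phi(0.0), stays_below)
```

Here the conclusion can also be seen by hand, since 1 + x ≤ 1/(1 − x) is equivalent to 1 − x² ≤ 1 on [0, 1[.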


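Theorem 4.5 (Gronwall’s Inequality). Let I := [a, b[, where −∞ < a < b ≤ ∞. If
α, β, γ : I −→ R are continuous and β(x) ≥ 0 for each x ∈ I, then
\[ \underset{x\in I}{\forall}\quad \gamma(x) \le \alpha(x) + \int_a^x \beta(t)\,\gamma(t)\,\mathrm{d}t \tag{4.10} \]
implies
\[ \underset{x\in I}{\forall}\quad \gamma(x) \le \alpha(x) + \int_a^x \alpha(t)\,\beta(t)\,\exp\Big( \int_t^x \beta(s)\,\mathrm{d}s \Big)\,\mathrm{d}t. \tag{4.11} \]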

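Proof. Defining the auxiliary functions ψ, w : I −→ R,
\[ \psi(x) := \gamma(x) - \alpha(x), \tag{4.12a} \]
\[ w(x) := \int_a^x \beta(t)\,\gamma(t)\,\mathrm{d}t, \tag{4.12b} \]
(4.10) can be written as
\[ \underset{x\in I}{\forall}\quad \psi(x) \le w(x). \]
Moreover, as β ≥ 0, this implies
\[ \underset{x\in I}{\forall}\quad w'(x) = \beta(x)\,\gamma(x) = \beta(x)\,\big(\alpha(x) + \psi(x)\big) \le \beta(x)\,w(x) + \alpha(x)\,\beta(x), \]
showing w satisfies the (linear) differential inequality
\[ y' \le \beta(x)\,y + \alpha(x)\,\beta(x). \tag{4.13} \]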
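Continuously extending α and β to x < a (e.g. using the constant extensions α(x) := α(a)
and β(x) := β(a) for x < a), we can consider the linear ODE corresponding to (4.13) on
all of ]−∞, b[. Using the initial condition y(a) = w(a) = 0 yields the unique solution
(employing the variation of constants Th. 2.3)
\[
\begin{aligned}
\phi : \,]-\infty,b[\, \to \mathbb{R}, \qquad
\phi(x) &:= \exp\Big( \int_a^x \beta(s)\,\mathrm{d}s \Big) \int_a^x \exp\Big( -\int_a^t \beta(s)\,\mathrm{d}s \Big)\,\alpha(t)\,\beta(t)\,\mathrm{d}t \\
&= \int_a^x \alpha(t)\,\beta(t)\,\exp\Big( \int_t^x \beta(s)\,\mathrm{d}s \Big)\,\mathrm{d}t.
\end{aligned}
\tag{4.14}
\]
Finally, we apply Prop. 4.4 to conclude
\[ \underset{x\in I}{\forall}\quad \psi(x) \le w(x) \overset{(4.3)}{\le} \phi(x) \overset{(4.14)}{=} \int_a^x \alpha(t)\,\beta(t)\,\exp\Big( \int_t^x \beta(s)\,\mathrm{d}s \Big)\,\mathrm{d}t, \tag{4.15} \]
which, taking into account ψ = γ − α, establishes (4.11). □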



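Example 4.6. Let I := [a, b[, where −∞ < a < b ≤ ∞. If β, γ : I −→ R are
continuous, β(x) ≥ 0 for each x ∈ I, and C ∈ R, then
\[ \underset{x\in I}{\forall}\quad \gamma(x) \le C + \int_a^x \beta(t)\,\gamma(t)\,\mathrm{d}t \tag{4.16} \]
implies
\[ \underset{x\in I}{\forall}\quad \gamma(x) \le C\,\exp\Big( \int_a^x \beta(t)\,\mathrm{d}t \Big): \tag{4.17} \]
We apply Gronwall’s inequality of Th. 4.5 with α ≡ C together with the fundamental
theorem of calculus to obtain the estimate
\[
\begin{aligned}
\gamma(x) &\le C + C \int_a^x \beta(t)\,\exp\Big( \int_t^x \beta(s)\,\mathrm{d}s \Big)\,\mathrm{d}t \\
&= C - C \int_a^x \big(-\beta(t)\big)\,\exp\Big( \int_x^t -\beta(s)\,\mathrm{d}s \Big)\,\mathrm{d}t
= C - C \Big[ \exp\Big( \int_x^t -\beta(s)\,\mathrm{d}s \Big) \Big]_{t=a}^{t=x} \\
&= C\,\exp\Big( \int_a^x \beta(t)\,\mathrm{d}t \Big)
\end{aligned}
\tag{4.18}
\]
for each x ∈ I, proving (4.17).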
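
The estimate of Example 4.6 can be tested numerically on sample data. The following Python sketch is an added illustration: the data a = 0, C = 1, β ≡ 1, γ(x) = 1 + x and the quadrature helper `trapezoid` are assumptions of the example. It checks the hypothesis (4.16) and the conclusion (4.17) on a grid, allowing a small tolerance for rounding.

```python
import math

def trapezoid(g, a, x, n=2000):
    """Composite trapezoidal rule for the integral of g over [a, x]."""
    if x == a:
        return 0.0
    h = (x - a) / n
    s = 0.5 * (g(a) + g(x)) + sum(g(a + k * h) for k in range(1, n))
    return s * h

a, C = 0.0, 1.0
beta = lambda t: 1.0
gamma = lambda t: 1.0 + t

def hypothesis_holds(x, tol=1e-12):
    """(4.16): gamma(x) <= C + int_a^x beta(t) gamma(t) dt."""
    return gamma(x) <= C + trapezoid(lambda t: beta(t) * gamma(t), a, x) + tol

def gronwall_bound(x):
    """Right-hand side of (4.17): C exp(int_a^x beta(t) dt)."""
    return C * math.exp(trapezoid(beta, a, x))

grid = [k / 10.0 for k in range(0, 31)]
print(all(hypothesis_holds(x) for x in grid))
print(all(gamma(x) <= gronwall_bound(x) + 1e-12 for x in grid))
```

For these data, (4.17) reduces to the elementary inequality 1 + x ≤ eˣ for x ≥ 0.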


The following Th. 4.7 will be applied to show maximal solutions to linear ODE are
always defined on all of I (with I as in Def. 4.2). However, Th. 4.7 is often also useful
to obtain the domains of maximal solutions for nonlinear ODE.
Theorem 4.7. Let n ∈ N, let I ⊆ R be an open interval, and let f : I × K^n −→ K^n be
continuous. If there exist nonnegative continuous functions γ, β : I −→ R₀⁺ such that
\[ \underset{(x,y)\in I\times\mathbb{K}^n}{\forall}\quad \|f(x,y)\| \le \gamma(x) + \beta(x)\,\|y\|, \tag{4.19} \]
where ‖·‖ denotes some arbitrary norm on K^n, then every maximal solution to

y′ = f(x, y)

is defined on all of I.

Proof. Let c < d and φ : ]c, d[ −→ K^n be a solution to y′ = f(x, y). We prove that
d < b := sup I implies φ can be extended to the right and a := inf I < c implies φ can
be extended to the left. First, assume d < b and let x₀ ∈ ]c, d[. The idea is to apply
Example 4.6 on the interval [x₀, d[. To this end, we estimate, for each x ∈ [x₀, d[:
\[
\begin{aligned}
\|\phi(x)\| &= \Big\| \phi(x_0) + \int_{x_0}^x f\big(t,\phi(t)\big)\,\mathrm{d}t \Big\| \le \|\phi(x_0)\| + \int_{x_0}^x \big\| f\big(t,\phi(t)\big) \big\|\,\mathrm{d}t \\
&\overset{(4.19)}{\le} \|\phi(x_0)\| + \int_{x_0}^x \gamma(t)\,\mathrm{d}t + \int_{x_0}^x \beta(t)\,\|\phi(t)\|\,\mathrm{d}t.
\end{aligned}
\tag{4.20}
\]

Since the continuous function γ is uniformly bounded on the compact interval [x₀, d],
\[ \underset{C\ge 0}{\exists}\ \ \underset{x\in[x_0,d[}{\forall}\quad \|\phi(x)\| \le C + \int_{x_0}^x \beta(t)\,\|\phi(t)\|\,\mathrm{d}t. \]
Thus, Example 4.6 applies, providing
\[ \underset{x\in[x_0,d[}{\forall}\quad \|\phi(x)\| \le C\,\exp\Big( \int_{x_0}^x \beta(t)\,\mathrm{d}t \Big) \le C\,e^{M(d-x_0)}, \tag{4.21} \]
where M ≥ 0 is a uniform bound for the continuous function β on the compact interval
[x₀, d]. As (4.21) states that the graph
\[ \operatorname{gr}_+(\phi) = \big\{ \big(x,\phi(x)\big) \in G :\ x \in [x_0,d[ \big\} \]
is contained in the compact set
\[ K := [x_0,d] \times \big\{ y \in \mathbb{K}^n :\ \|y\| \le C\,e^{M(d-x_0)} \big\}, \]
Prop. 3.24 implies φ has an extension to the right.


Now assume a < c. The idea is to apply the time reversion Lem. 1.9(b): According to
Lem. 1.9(b), ψ : ] − d, −c[−→ Kn , ψ(x) = φ(−x), is a solution to y ′ = −f (−x, y) and
the first part of the proof above shows ψ to have an extension to the right. However,
then Rem. 3.21 tells us φ has an extension to the left. 

4.3 Existence, Uniqueness, Vector Space of Solutions


Theorem 4.8. Consider the setting of Def. 4.2 with an open interval I. Then every
initial value problem consisting of the linear ODE (4.1) and y(x0 ) = y0 , x0 ∈ I, y0 ∈ Kn ,
has a unique maximal solution φ : I −→ Kn (note that φ is defined on all of I).

Proof. It is an exercise to show the right-hand side of (4.1) is continuous and locally
Lipschitz with respect to y. Thus, every initial value problem has a unique maximal
solution by using Cor. 3.16 and Th. 3.22. That each maximal solution is defined on I
follows from Th. 4.7, as
\[ \underset{x\in I}{\forall}\quad \|A(x)\,y + b(x)\| \le \|b(x)\| + \|A(x)\|\,\|y\|, \]
where ‖A(x)‖ denotes the matrix norm of A(x) induced by the norm ‖·‖ on K^n (cf.
Appendix G). □

We will now proceed to study the solution spaces of linear ODE – as it turns out, these
solution spaces inherit the linear structure of the ODE.
Notation 4.9. Again, we consider the setting of Def. 4.2. Define L_i and L_h to be the
respective sets of solutions to (4.1) and its homogeneous version, i.e.
\[ L_i := \big\{ (\phi : I \to \mathbb{K}^n) :\ \phi' = A\phi + b \big\}, \tag{4.22a} \]
\[ L_h := \big\{ (\phi : I \to \mathbb{K}^n) :\ \phi' = A\phi \big\}. \tag{4.22b} \]

Lemma 4.10. Using Not. 4.9, we have
\[ \underset{\phi\in L_i}{\forall}\quad L_i = \phi + L_h = \{ \phi + \psi :\ \psi \in L_h \}, \tag{4.23} \]
i.e. one obtains all solutions to the inhomogeneous equation (4.1) by adding solutions
of the homogeneous equation to a particular solution to the inhomogeneous equation
(note that this is completely analogous to what occurs for solutions to linear systems of
equations in linear algebra).

Proof. Exercise. 

Theorem 4.11. Let I ⊆ R be a nontrivial interval, n ∈ N, and let A : I −→ M(n, K)


be continuous. Then the following holds:

(a) The set Lh defined in (4.22b) constitutes a vector space over K.

(b) For each k ∈ N and φ1 , . . . , φk ∈ Lh , the following statements are equivalent:

(i) The k functions φ1 , . . . , φk are linearly independent over K.


(ii) There exists x0 ∈ I such that the k vectors φ1 (x0 ), . . . , φk (x0 ) ∈ Kn are linearly
independent over K.
(iii) The k vectors φ1 (x), . . . , φk (x) ∈ Kn are linearly independent over K for every
x ∈ I.

(c) The dimension of Lh is n.

Proof. (a): Exercise.


(b): (iii) trivially implies (ii). That (ii) implies (i) can easily be shown by contraposition:
If (i) does not hold, then there is (λ₁, …, λ_k) ∈ K^k \ {0} such that ∑_{j=1}^k λ_j φ_j = 0, i.e.
∑_{j=1}^k λ_j φ_j(x) = 0 holds for each x ∈ I, i.e. (ii) does not hold. It remains to show (i)
implies (iii). Once again, we accomplish this via contraposition: If (iii) does not hold,
then there are (λ₁, …, λ_k) ∈ K^k \ {0} and x ∈ I such that ∑_{j=1}^k λ_j φ_j(x) = 0. But
then, since ∑_{j=1}^k λ_j φ_j ∈ L_h by (a), ∑_{j=1}^k λ_j φ_j = 0 (using uniqueness of solutions). In
consequence, φ₁, …, φ_k are linearly dependent and (i) does not hold.
(c): Let (b₁, …, b_n) be a basis of K^n and x₀ ∈ I. Let φ₁, …, φ_n ∈ L_h be the solutions
to the initial conditions y(x₀) = b₁, …, y(x₀) = b_n, respectively. Then the φ₁, …, φ_n
must be linearly independent by (b) (as they are linearly independent at x₀), proving
dim L_h ≥ n. On the other hand, if φ₁, …, φ_k ∈ L_h are linearly independent, k ∈ N,
then, once more by (b), φ₁(x), …, φ_k(x) ∈ K^n are linearly independent for each x ∈ I,
showing k ≤ n and dim L_h ≤ n. □

Example 4.12. In Example 1.4(e), we had claimed that the second-order ODE (1.16)
on [a, b], a < b, namely
\[ y'' = -y \]
had the set of solutions L as in (1.17), namely
\[ L = \big\{ (c_1 \sin + c_2 \cos) : [a,b] \to \mathbb{K} :\ c_1, c_2 \in \mathbb{K} \big\}. \]
We are now in a position to verify this claim: The second-order ODE (1.16) is equivalent
to the homogeneous linear first-order ODE
\[ \begin{pmatrix} y_1 \\ y_2 \end{pmatrix}' = \begin{pmatrix} y_2 \\ -y_1 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} \tag{4.24} \]
with the vector space of solutions L_h of dimension 2 over K. Clearly, φ₁, φ₂ ∈ L_h, where
φ₁, φ₂ : [a, b] −→ K² with
\[ \phi_1(x) := \begin{pmatrix} \sin x \\ \cos x \end{pmatrix}, \qquad \phi_2(x) := \begin{pmatrix} \cos x \\ -\sin x \end{pmatrix}. \tag{4.25} \]
Moreover, φ₁ and φ₂ are linearly independent (e.g. since φ₁(0) = (0, 1)ᵀ and φ₂(0) = (1, 0)ᵀ are
linearly independent, so are φ₁, φ₂ : R −→ K² by Th. 4.11(b), implying, again by Th.
4.11(b), the linear independence of φ₁(a), φ₂(a), finally implying the linear independence
of φ₁, φ₂ : [a, b] −→ K²). Thus,
\[ L_h = \big\{ (c_1 \phi_1 + c_2 \phi_2) : [a,b] \to \mathbb{K}^2 :\ c_1, c_2 \in \mathbb{K} \big\} \tag{4.26} \]
and, since, according to Th. 3.1 the solutions to (1.16) are precisely the first components
of solutions to (4.24), the representation (1.17) is verified.
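
The two claims used in Example 4.12, namely that φ₁ and φ₂ solve (4.24) and that they are linearly independent at every point, can be checked numerically. The following Python sketch is an added illustration with hypothetical helper names; derivatives are approximated by central finite differences with an ad hoc step size.

```python
import math

def A_times(v):
    """Apply the matrix A = [[0, 1], [-1, 0]] of (4.24) to v = (v1, v2)."""
    return (v[1], -v[0])

def phi1(x):
    return (math.sin(x), math.cos(x))

def phi2(x):
    return (math.cos(x), -math.sin(x))

def derivative(phi, x, h=1e-6):
    """Componentwise central finite-difference approximation of phi'(x)."""
    p, m = phi(x + h), phi(x - h)
    return tuple((pi - mi) / (2.0 * h) for pi, mi in zip(p, m))

def ode_residual(phi, x):
    """Max-norm of phi'(x) - A phi(x); close to 0 if phi solves (4.24) at x."""
    d, rhs = derivative(phi, x), A_times(phi(x))
    return max(abs(di - ri) for di, ri in zip(d, rhs))

def det_at(x):
    """Determinant of the 2x2 matrix with columns phi1(x), phi2(x)."""
    (a, c), (b, d) = phi1(x), phi2(x)
    return a * d - b * c

for x in (-1.0, 0.0, 2.5):
    print(x, ode_residual(phi1, x), ode_residual(phi2, x), det_at(x))
```

The determinant equals −sin²x − cos²x = −1 for every x, in agreement with Th. 4.11(b): linear independence at one point is equivalent to linear independence at all points.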

4.4 Fundamental Matrix Solutions and Variation of Constants


Definition and Remark 4.13. A basis (φ₁, …, φ_n), n ∈ N, of the n-dimensional
vector space L_h over K is called a fundamental system for the linear ODE (4.1). One
then also calls the matrix
\[ \Phi := \begin{pmatrix} \phi_{11} & \dots & \phi_{1n} \\ \vdots & & \vdots \\ \phi_{n1} & \dots & \phi_{nn} \end{pmatrix}, \tag{4.27} \]
where the kth column of the matrix consists of the component functions φ_{1k}, …, φ_{nk} of
φ_k, k ∈ {1, …, n}, a fundamental system or a fundamental matrix solution for (4.1). The
latter term is justified by the observation that Φ : I −→ M(n, K) can be interpreted as
a solution to the matrix-valued ODE
\[ Y' = A(x)\,Y: \tag{4.28} \]
Indeed,
\[ \Phi' = \big( \phi_1', \dots, \phi_n' \big) = \big( A(x)\,\phi_1, \dots, A(x)\,\phi_n \big) = A(x)\,\Phi. \]

Corollary 4.14. Let φ₁, …, φ_n ∈ L_h, n ∈ N, and let Φ be defined as in (4.27). Then
the following statements are equivalent:

(i) Φ is a fundamental system for (4.1).

(ii) There exists x₀ ∈ I such that det Φ(x₀) ≠ 0.

(iii) det Φ(x) ≠ 0 for every x ∈ I.

Proof. The equivalences are a direct consequence of the equivalences in Th. 4.11(b). 

Theorem 4.15 (Variation of Constants). Consider the setting of Def. 4.2. If Φ : I −→
M(n, K) is a fundamental system for (4.1), then the unique solution ψ : I −→ K^n of
the initial value problem consisting of (4.1) and y(x₀) = y₀, (x₀, y₀) ∈ I × K^n, is given
by
\[ \psi : I \to \mathbb{K}^n, \qquad \psi(x) = \Phi(x)\,\Phi^{-1}(x_0)\,y_0 + \Phi(x) \int_{x_0}^x \Phi^{-1}(t)\,b(t)\,\mathrm{d}t. \tag{4.29} \]

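Proof. The initial condition is easily verified:
\[ \psi(x_0) = \Phi(x_0)\,\Phi^{-1}(x_0)\,y_0 + 0 = \mathrm{Id}\,y_0 = y_0. \]
To check that ψ satisfies (4.1), one computes, for each x ∈ I,
\[
\begin{aligned}
\psi'(x) &\overset{(I.3)}{=} \Phi'(x)\,\Phi^{-1}(x_0)\,y_0 + \Phi'(x) \int_{x_0}^x \Phi^{-1}(t)\,b(t)\,\mathrm{d}t + \Phi(x)\,\Phi^{-1}(x)\,b(x) \\
&\overset{(4.28)}{=} A(x)\,\Phi(x)\,\Phi^{-1}(x_0)\,y_0 + A(x)\,\Phi(x) \int_{x_0}^x \Phi^{-1}(t)\,b(t)\,\mathrm{d}t + b(x) \\
&= A(x)\,\psi(x) + b(x),
\end{aligned}
\tag{4.30}
\]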

thereby establishing the case. 
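
Formula (4.29) can be evaluated numerically once a fundamental matrix is known. The following Python sketch is an added illustration under stated assumptions: it uses the system (4.24) with the fundamental matrix Φ built from (4.25), the inhomogeneity b(x) = (0, 1)ᵀ (corresponding to y″ = −y + 1, a choice made only for this example), and a composite Simpson rule for the integral. All function names are hypothetical.

```python
import math

def Phi(x):
    """Fundamental matrix of (4.24) with columns phi1, phi2 from (4.25)."""
    return [[math.sin(x), math.cos(x)], [math.cos(x), -math.sin(x)]]

def inv2(M):
    """Inverse of a 2x2 matrix with nonzero determinant."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matvec(M, v):
    return [M[0][0] * v[0] + M[0][1] * v[1], M[1][0] * v[0] + M[1][1] * v[1]]

def b(x):
    """Inhomogeneity chosen for this example."""
    return [0.0, 1.0]

def simpson_vec(g, a, x, n=200):
    """Composite Simpson rule (n even) for a vector-valued integrand g on [a, x]."""
    h = (x - a) / n
    total = [ga + gx for ga, gx in zip(g(a), g(x))]
    for k in range(1, n):
        weight = 4.0 if k % 2 else 2.0
        total = [s + weight * gk for s, gk in zip(total, g(a + k * h))]
    return [s * h / 3.0 for s in total]

def psi(x, x0, y0):
    """Evaluate (4.29): Phi(x) Phi(x0)^-1 y0 + Phi(x) int_{x0}^x Phi(t)^-1 b(t) dt."""
    homogeneous = matvec(Phi(x), matvec(inv2(Phi(x0)), y0))
    integral = simpson_vec(lambda t: matvec(inv2(Phi(t)), b(t)), x0, x)
    particular = matvec(Phi(x), integral)
    return [u + v for u, v in zip(homogeneous, particular)]

x = 1.3
print(psi(x, 0.0, [0.0, 0.0]), [1.0 - math.cos(x), math.sin(x)])
```

For y₀ = (0, 0)ᵀ the two printed vectors should agree up to quadrature error, since y = 1 − cos x solves y″ = −y + 1 with y(0) = y′(0) = 0; for y₀ = (1, 0)ᵀ the formula returns the constant solution (1, 0)ᵀ.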

Remark 4.16. The 1-dimensional variation of constants formula (2.2) is actually a
special case of (4.29): We note that, for n = 1 and A(x) = a(x), the solution Φ := φ₀
to the 1-dimensional homogeneous equation as defined in (2.2b), i.e.
\[ \phi_0 : I \to \mathbb{K}, \qquad \phi_0(x) = \exp\Big( \int_{x_0}^x a(t)\,\mathrm{d}t \Big) = e^{\int_{x_0}^x a(t)\,\mathrm{d}t}, \]
constitutes a fundamental matrix solution in the sense of Def. and Rem. 4.13 (since 1/φ₀
exists). Taking into account Φ(x₀) = φ₀(x₀) = 1, we obtain, for each x ∈ I,
\[
\begin{aligned}
\phi(x) &\overset{(4.29)}{=} \Phi(x)\,\Phi^{-1}(x_0)\,y_0 + \Phi(x) \int_{x_0}^x \Phi^{-1}(t)\,b(t)\,\mathrm{d}t \\
&= \phi_0(x) \Big( y_0 + \int_{x_0}^x \phi_0(t)^{-1}\,b(t)\,\mathrm{d}t \Big),
\end{aligned}
\tag{4.31}
\]
which is (2.2a).


In Sec. 4.6, we will study methods for actually finding fundamental matrix solutions in
cases where A is constant. However, in general, fundamental matrix solutions are often
not explicitly available. In such situations, the following Th. 4.17 can sometimes help
to extract information about solutions.
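Theorem 4.17 (Liouville’s Formula). Consider the setting of Def. 4.2 and recall the
trace of an n × n matrix A = (a_{kl}) is defined by
\[ \operatorname{tr} A := \sum_{k=1}^{n} a_{kk}. \]
If Φ : I −→ M(n, K) is a fundamental system for (4.1), then
\[ \underset{x_0,x\in I}{\forall}\quad \det\big(\Phi(x)\big) = \det\big(\Phi(x_0)\big)\,\exp\Big( \int_{x_0}^x \operatorname{tr}\big(A(t)\big)\,\mathrm{d}t \Big). \tag{4.32} \]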

Proof. Exercise. 
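
Liouville's formula can be checked on an example with nonconstant coefficients. The following Python sketch is an added illustration: the matrix A(x) = [[1, x], [0, 2]] and the fundamental matrix Φ with columns (eˣ, 0)ᵀ and ((x − 1)e²ˣ, e²ˣ)ᵀ are assumptions of the example (one verifies by differentiation that both columns solve y′ = A(x)y), and the integral of the trace is approximated by the midpoint rule.

```python
import math

def Phi(x):
    """Assumed fundamental matrix for y' = A(x) y with A(x) = [[1, x], [0, 2]]."""
    return [[math.exp(x), (x - 1.0) * math.exp(2.0 * x)],
            [0.0, math.exp(2.0 * x)]]

def trace_A(x):
    """tr A(x) = 1 + 2 for the matrix A(x) above."""
    return 1.0 + 2.0

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def liouville_rhs(x0, x, n=1000):
    """Right-hand side of (4.32); the integral is computed by the midpoint rule."""
    h = (x - x0) / n
    integral = sum(trace_A(x0 + (k + 0.5) * h) for k in range(n)) * h
    return det2(Phi(x0)) * math.exp(integral)

x0, x = -0.5, 1.0
print(det2(Phi(x)), liouville_rhs(x0, x))
```

Both numbers should agree up to rounding, since det Φ(x) = e³ˣ and the trace of A is constantly 3 in this example, although A itself is not constant.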

4.5 Higher-Order, Wronskian


In Th. 3.1, we saw that higher-order ODE are equivalent to systems of first-order ODE.
We can now combine Th. 3.1 with our findings regarding first-order linear ODE to help
with the solution of higher-order linear ODE.
Definition 4.18. Let $I \subseteq \mathbb{R}$ be a nontrivial interval, $n \in \mathbb{N}$. Let $b : I \to \mathbb{K}$ and $a_0, \dots, a_{n-1} : I \to \mathbb{K}$ be continuous functions. Then a (1-dimensional) linear ODE of nth order is an equation of the form
\[
y^{(n)} = a_{n-1}(x)\, y^{(n-1)} + \dots + a_1(x)\, y' + a_0(x)\, y + b(x). \tag{4.33}
\]
It is called homogeneous if, and only if, $b \equiv 0$; it is called inhomogeneous if, and only if, it is not homogeneous. Analogous to (4.22), define the respective sets of solutions
\begin{align*}
\mathcal{H}_i &:= \Big\{ (\phi : I \to \mathbb{K}) : \phi^{(n)} = b + \sum_{k=0}^{n-1} a_k \phi^{(k)} \Big\}, \tag{4.34a} \\
\mathcal{H}_h &:= \Big\{ (\phi : I \to \mathbb{K}) : \phi^{(n)} = \sum_{k=0}^{n-1} a_k \phi^{(k)} \Big\}. \tag{4.34b}
\end{align*}

Definition 4.19. Let $I \subseteq \mathbb{R}$ be a nontrivial interval, $n \in \mathbb{N}$. For each n-tuple of $(n-1)$ times differentiable functions $\phi_1, \dots, \phi_n : I \to \mathbb{K}$, define the Wronskian
\[
W(\phi_1, \dots, \phi_n) : I \to \mathbb{K}, \quad
W(\phi_1, \dots, \phi_n)(x) := \det \begin{pmatrix}
\phi_1(x) & \dots & \phi_n(x) \\
\phi_1'(x) & \dots & \phi_n'(x) \\
\vdots & & \vdots \\
\phi_1^{(n-1)}(x) & \dots & \phi_n^{(n-1)}(x)
\end{pmatrix}. \tag{4.35}
\]

Theorem 4.20. Consider the setting of Def. 4.18.

(a) If $\mathcal{H}_i$ and $\mathcal{H}_h$ are the sets defined in (4.34), then $\mathcal{H}_h$ is an n-dimensional vector space over $\mathbb{K}$ and, if $\phi \in \mathcal{H}_i$ is arbitrary, then
\[
\mathcal{H}_i = \phi + \mathcal{H}_h. \tag{4.36}
\]

(b) Let $\phi_1, \dots, \phi_n \in \mathcal{H}_h$. Then the following statements are equivalent:

(i) $\phi_1, \dots, \phi_n$ are linearly independent over $\mathbb{K}$ (i.e. $(\phi_1, \dots, \phi_n)$ forms a basis of $\mathcal{H}_h$).

(ii) There exists $x_0 \in I$ such that the Wronskian does not vanish:
\[
W(\phi_1, \dots, \phi_n)(x_0) \neq 0.
\]

(iii) The Wronskian never vanishes, i.e. $W(\phi_1, \dots, \phi_n)(x) \neq 0$ for every $x \in I$.

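Proof. According to Th. 3.1, (4.33) is equivalent to the first-order linear ODE
\[
y' = \begin{pmatrix}
0 & 1 & 0 & \dots & 0 & 0 \\
0 & 0 & 1 & \dots & 0 & 0 \\
\vdots & \vdots & \ddots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \dots & 1 & 0 \\
0 & 0 & 0 & \dots & 0 & 1 \\
a_0(x) & a_1(x) & a_2(x) & \dots & a_{n-2}(x) & a_{n-1}(x)
\end{pmatrix}
\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_{n-2} \\ y_{n-1} \\ y_n \end{pmatrix}
+ \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ 0 \\ b(x) \end{pmatrix}
=: \tilde{A}(x)\, y + \tilde{b}(x). \tag{4.37}
\]
Define
\begin{align*}
\mathcal{L}_i &:= \big\{ (\Phi : I \to \mathbb{K}^n) : \Phi' = \tilde{A}\Phi + \tilde{b} \big\}, \\
\mathcal{L}_h &:= \big\{ (\Phi : I \to \mathbb{K}^n) : \Phi' = \tilde{A}\Phi \big\}.
\end{align*}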

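(a): Let $\phi \in \mathcal{H}_i$ and define
\[
\Phi := \begin{pmatrix} \phi \\ \phi' \\ \vdots \\ \phi^{(n-1)} \end{pmatrix}.
\]
Then
\[
\mathcal{H}_h \overset{\text{Th. 3.1(a),(b)}}{=} \{ \Psi_1 : \Psi \in \mathcal{L}_h \}
\]
and
\begin{align*}
\mathcal{H}_i &\overset{\text{Th. 3.1(a),(b)}}{=} \{ \tilde{\Phi}_1 : \tilde{\Phi} \in \mathcal{L}_i \} \overset{(4.23)}{=} \{ \tilde{\Phi}_1 : \tilde{\Phi} \in \Phi + \mathcal{L}_h \} \\
&= \{ (\Phi + \Psi)_1 : \Psi \in \mathcal{L}_h \} = \phi + \mathcal{H}_h.
\end{align*}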

As a consequence of Th. 3.1, the map $J : \mathcal{L}_h \to \mathcal{H}_h$, $J(\Phi) := \Phi_1$, is a linear isomorphism, implying that $\mathcal{H}_h$, like $\mathcal{L}_h$, is an n-dimensional vector space over $\mathbb{K}$.

(b): For $\phi_1, \dots, \phi_n \in \mathcal{H}_h$, define $\Phi_{kl} := \phi_k^{(l-1)}$ for each $k, l \in \{1, \dots, n\}$ and
\[
\forall_{x \in I} \quad \Phi(x) := (\Phi_1(x), \dots, \Phi_n(x)) = (\Phi_{kl}(x)) = \begin{pmatrix}
\phi_1(x) & \dots & \phi_n(x) \\
\phi_1'(x) & \dots & \phi_n'(x) \\
\vdots & & \vdots \\
\phi_1^{(n-1)}(x) & \dots & \phi_n^{(n-1)}(x)
\end{pmatrix}
\]
such that $\det \Phi(x) = W(\phi_1, \dots, \phi_n)(x)$ for each $x \in I$. Since Th. 3.1 yields $\Phi_1, \dots, \Phi_n \in \mathcal{L}_h$ if, and only if, $\phi_1, \dots, \phi_n \in \mathcal{H}_h$, the equivalences of (b) follow from the equivalences of Cor. 4.14.

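Example 4.21. Consider $a_0, a_1 : \mathbb{R}^+ \to \mathbb{K}$, $a_1(x) := 1/(2x)$, $a_0(x) := -1/(2x^2)$, and
\[
y'' = a_1(x)\, y' + a_0(x)\, y = \frac{y'}{2x} - \frac{y}{2x^2}.
\]
One might be able to guess the solutions
\[
\phi_1, \phi_2 : \mathbb{R}^+ \to \mathbb{K}, \quad \phi_1(x) := x, \quad \phi_2(x) := \sqrt{x}.
\]
The Wronskian is
\[
W(\phi_1, \phi_2) : \mathbb{R}^+ \to \mathbb{K}, \quad
W(\phi_1, \phi_2)(x) = \det \begin{pmatrix} x & \sqrt{x} \\ 1 & 1/(2\sqrt{x}) \end{pmatrix} = \frac{\sqrt{x}}{2} - \sqrt{x} = -\frac{\sqrt{x}}{2} < 0,
\]
i.e. $\phi_1$ and $\phi_2$ span $\mathcal{H}_h$ according to Th. 4.20(b):
\[
\mathcal{H}_h = \{ c_1 \phi_1 + c_2 \phi_2 : c_1, c_2 \in \mathbb{K} \}.
\]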
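The claims of Example 4.21 are easy to spot-check numerically. The following minimal sketch (the function names and the sample points are hypothetical, chosen only for illustration) evaluates the ODE residual for both guessed solutions, using their hand-computed derivatives, and compares the Wronskian with $-\sqrt{x}/2$.

```python
import math

# Guessed solutions and their first two derivatives, computed by hand.
def phi1(x): return x
def dphi1(x): return 1.0
def ddphi1(x): return 0.0

def phi2(x): return math.sqrt(x)
def dphi2(x): return 0.5 / math.sqrt(x)
def ddphi2(x): return -0.25 * x ** -1.5

def residual(y, dy, ddy, x):
    # y'' - (y'/(2x) - y/(2x^2)); this vanishes exactly when y solves the ODE at x.
    return ddy(x) - (dy(x) / (2 * x) - y(x) / (2 * x ** 2))

def wronskian(x):
    # 2x2 Wronskian determinant phi1 * phi2' - phi2 * phi1'.
    return phi1(x) * dphi2(x) - phi2(x) * dphi1(x)

sample_points = (0.5, 1.0, 4.0)
for x in sample_points:
    print(x,
          residual(phi1, dphi1, ddphi1, x),
          residual(phi2, dphi2, ddphi2, x),
          wronskian(x),
          -math.sqrt(x) / 2)
```

Both residual columns should be zero up to rounding, and the last two columns should coincide and be negative on all of $\mathbb{R}^+$.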

4.6 Constant Coefficients


For 1-dimensional first-order linear ODE, we obtained a solution formula in Th. 2.3 in
terms of integrals (of course, in general, evaluating integrals can still be very difficult,
and one might need effective and efficient numerical methods). In the previous sections,
we have studied systems of first-order linear ODE as well as linear ODE of higher order.
Unfortunately, there are no general solution formulas for these situations (one can use
(4.29) if one knows a fundamental system, but the problem is the absence of a general
procedure to obtain such a fundamental system). However, there is a more satisfying
solution theory for the situation of so-called constant coefficients, i.e. if A in (4.1) and
the a0 , . . . , an−1 in (4.33) do not depend on x.

4.6.1 Linear ODE of Higher Order

Definition 4.22. Let $I \subseteq \mathbb{R}$ be a nontrivial interval, $n \in \mathbb{N}$. Let $b : I \to \mathbb{K}$ be continuous and $a_0, \dots, a_{n-1} \in \mathbb{K}$. Then a (1-dimensional) linear ODE of nth order with constant coefficients is an equation of the form
\[
y^{(n)} = a_{n-1}\, y^{(n-1)} + \dots + a_1\, y' + a_0\, y + b(x). \tag{4.38}
\]

In the present context, it is useful to introduce the following notation:


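Notation 4.23. Let $n \in \mathbb{N}_0$.

(a) Let $\mathcal{P}$ denote the set of all polynomials over $\mathbb{K}$, $\mathcal{P}_n := \{ P \in \mathcal{P} : \deg P \le n \}$. We will also write $\mathcal{P}[\mathbb{R}]$, $\mathcal{P}[\mathbb{C}]$, $\mathcal{P}_n[\mathbb{R}]$, $\mathcal{P}_n[\mathbb{C}]$ if we need to be specific about the field of coefficients.

(b) Let $I \subseteq \mathbb{R}$ be a nontrivial interval. Let $D^n(I) := D^n(I, \mathbb{K})$ denote the set of all n times differentiable functions $f : I \to \mathbb{K}$, and let
\[
\partial_x : D^1(I) \to \mathcal{F}(I, \mathbb{K}) := \{ f : I \to \mathbb{K} \}, \quad \partial_x f := f',
\]
and, for each $P \in \mathcal{P}_n$ with $P(x) = \sum_{j=0}^{n} a_j x^j$ ($a_0, \dots, a_n \in \mathbb{K}$), define the differential operator
\[
P(\partial_x) : D^n(I) \to \mathcal{F}(I, \mathbb{K}), \quad P(\partial_x) f := \sum_{j=0}^{n} a_j \partial_x^j f = \sum_{j=0}^{n} a_j f^{(j)}. \tag{4.39}
\]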

Remark 4.24. Using Not. 4.23(b), the ODE (4.38) can be written concisely as
\[
P(\partial_x)\, y = b(x), \quad \text{where} \quad P(x) := x^n - \sum_{j=0}^{n-1} a_j x^j. \tag{4.40}
\]

The following Prop. 4.25 implies that the differential operator P (∂x ) does not, actually,
depend on the representation of the polynomial P .
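Proposition 4.25. Let $P, P_1, P_2 \in \mathcal{P}$.

(a) If $P = P_1 + P_2$ and $n := \max\{\deg P_1, \deg P_2\}$, then
\[
\forall_{f \in D^n(I)} \quad P(\partial_x) f = P_1(\partial_x) f + P_2(\partial_x) f.
\]

(b) If $P = P_1 P_2$ and $n := \max\{\deg P, \deg P_1, \deg P_2\}$, then
\[
\forall_{f \in D^n(I)} \quad P(\partial_x) f = P_1(\partial_x) \big( P_2(\partial_x) f \big).
\]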

Proof. Exercise. 

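Lemma 4.26. Let $\lambda \in \mathbb{K}$ and
\[
f : \mathbb{R} \to \mathbb{K}, \quad f(x) := e^{\lambda x}. \tag{4.41}
\]
Then, for each $P \in \mathcal{P}$,
\[
P(\partial_x) f : \mathbb{R} \to \mathbb{K}, \quad P(\partial_x) f(x) = P(\lambda)\, e^{\lambda x}. \tag{4.42}
\]

Proof. There exist $n \in \mathbb{N}_0$ and $a_0, \dots, a_n \in \mathbb{K}$ such that $P(x) = \sum_{j=0}^{n} a_j x^j$. One computes
\[
P(\partial_x) f(x) = \sum_{j=0}^{n} a_j \partial_x^j e^{\lambda x} = \sum_{j=0}^{n} a_j \lambda^j e^{\lambda x} = e^{\lambda x} P(\lambda),
\]
proving (4.42).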
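Theorem 4.27. If $a_0, \dots, a_{n-1} \in \mathbb{K}$, $n \in \mathbb{N}$, and $P(x) = x^n - \sum_{j=0}^{n-1} a_j x^j$ has the distinct zeros $\lambda_1, \dots, \lambda_n \in \mathbb{K}$ (i.e. $P(\lambda_1) = \dots = P(\lambda_n) = 0$), then $(\phi_1, \dots, \phi_n)$, where
\[
\forall_{j \in \{1, \dots, n\}} \quad \phi_j : I \to \mathbb{K}, \quad \phi_j(x) := e^{\lambda_j x}, \tag{4.43}
\]
is a basis of the homogeneous solution space
\[
\mathcal{H}_h = \big\{ (\phi : I \to \mathbb{K}) : P(\partial_x)\phi = 0 \big\} \tag{4.44}
\]
to (4.38) (i.e. to (4.40)).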

Proof. It is immediate from (4.42) and $P(\lambda_j) = 0$ that each $\phi_j$ satisfies $P(\partial_x)\phi_j = 0$. From Th. 4.20(a), we already know $\mathcal{H}_h$ is an n-dimensional vector space over $\mathbb{K}$. Thus, it merely remains to compute the Wronskian. One obtains (cf. (4.35)):
\[
W(\phi_1, \dots, \phi_n)(0) = \det \begin{pmatrix}
1 & \dots & 1 \\
\lambda_1 & \dots & \lambda_n \\
\vdots & & \vdots \\
\lambda_1^{n-1} & \dots & \lambda_n^{n-1}
\end{pmatrix}
\overset{(H.2)}{=} \prod_{\substack{k, l = 1 \\ k > l}}^{n} (\lambda_k - \lambda_l) \neq 0,
\]
since the $\lambda_j$ are all distinct. We have used that the Wronskian, in the present case, turns out to be a Vandermonde determinant. The formula (H.2) for this type of determinant is provided and proved in Appendix H. We also used that the determinant of a matrix is the same as the determinant of its transpose: $\det A = \det A^t$. From $W(\phi_1, \dots, \phi_n)(0) \neq 0$ and Th. 4.20(b), we conclude that $(\phi_1, \dots, \phi_n)$ is a basis of $\mathcal{H}_h$.

Example 4.28. We consider the third-order linear ODE
\[
y''' = 2y'' - y' + 2y, \tag{4.45}
\]
which can be written as $P(\partial_x)\, y = 0$ with
\[
P(x) := x^3 - 2x^2 + x - 2 = (x^2 + 1)(x - 2) = (x - i)(x + i)(x - 2), \tag{4.46}
\]
i.e. $P$ has the distinct zeros $\lambda_1 = i$, $\lambda_2 = -i$, $\lambda_3 = 2$. Thus, according to Th. 4.27, the three functions
\[
\phi_1, \phi_2, \phi_3 : \mathbb{R} \to \mathbb{C}, \quad \phi_1(x) = e^{ix}, \quad \phi_2(x) = e^{-ix}, \quad \phi_3(x) = e^{2x}, \tag{4.47}
\]
form a basis of the $\mathbb{C}$-vector space $\mathcal{H}_h$. If we consider (4.45) as an ODE over $\mathbb{R}$, then we are interested in a basis of the $\mathbb{R}$-vector space $\mathcal{H}_h$. We can use linear combinations of $\phi_1$ and $\phi_2$ to obtain such a basis (cf. Rem. 4.33(b) below):
\[
\psi_1, \psi_2 : \mathbb{R} \to \mathbb{R}, \quad \psi_1(x) = \frac{e^{ix} + e^{-ix}}{2} = \cos x, \quad \psi_2(x) = \frac{e^{ix} - e^{-ix}}{2i} = \sin x. \tag{4.48}
\]
As explained in Rem. 4.33(b) below, as $(\phi_1, \phi_2, \phi_3)$ is a basis of $\mathcal{H}_h$ over $\mathbb{C}$, $(\psi_1, \psi_2, \phi_3)$ is a basis of $\mathcal{H}_h$ over $\mathbb{R}$.

By working a bit harder, one can generalize Th. 4.27 to the case where P has zeros of
higher multiplicity. We provide this generalization in Th. 4.32 below after recalling the
notion of zeros of higher multiplicity in Rem. and Def. 4.29, and after providing two
preparatory lemmas.
Remark and Definition 4.29. According to the fundamental theorem of algebra (cf. [Phi16, Th. 8.32, Cor. 8.33(b)]), for every polynomial $P \in \mathcal{P}_n$ with $\deg P = n$, $n \in \mathbb{N}$, there exist $r \in \mathbb{N}$ with $r \le n$, $k_1, \dots, k_r \in \mathbb{N}$ with $k_1 + \dots + k_r = n$, and distinct numbers $\lambda_1, \dots, \lambda_r \in \mathbb{C}$ such that
\[
P(x) = (x - \lambda_1)^{k_1} \cdots (x - \lambda_r)^{k_r}. \tag{4.49}
\]
Clearly, $\lambda_1, \dots, \lambda_r$ are precisely the distinct zeros of $P$, and $k_j$ is referred to as the multiplicity of the zero $\lambda_j$, $j = 1, \dots, r$.
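Lemma 4.30. Let $I \subseteq \mathbb{R}$ be a nontrivial interval, $\lambda \in \mathbb{K}$, $k \in \mathbb{N}_0$, and $f \in D^k(I)$. Then we have
\[
\forall_{x \in I} \quad (\partial_x - \lambda)^k \big( f(x)\, e^{\lambda x} \big) = f^{(k)}(x)\, e^{\lambda x}. \tag{4.50}
\]

Proof. The proof is carried out by induction. The case $k = 0$ is merely the identity $(\partial_x - \lambda)^0 \big( f(x)\, e^{\lambda x} \big) = f(x)\, e^{\lambda x}$. For the induction step, let $k \ge 1$ and compute, using the product rule,
\begin{align*}
(\partial_x - \lambda)^k \big( f(x)\, e^{\lambda x} \big) &\overset{\text{ind. hyp.}}{=} (\partial_x - \lambda) \big( f^{(k-1)}(x)\, e^{\lambda x} \big) \\
&= f^{(k)}(x)\, e^{\lambda x} + f^{(k-1)}(x)\, \lambda\, e^{\lambda x} - \lambda\, f^{(k-1)}(x)\, e^{\lambda x} \\
&= f^{(k)}(x)\, e^{\lambda x}, \tag{4.51}
\end{align*}
thereby establishing the case.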



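Lemma 4.31. Let $P \in \mathcal{P}$ and $\lambda \in \mathbb{K}$ such that $P(\lambda) \neq 0$. Then, for each $Q \in \mathcal{P}$ with $\deg Q = k$, $k \in \mathbb{N}_0$, it holds that
\[
\forall_{x \in \mathbb{R}} \quad P(\partial_x) \big( Q(x)\, e^{\lambda x} \big) = R(x)\, e^{\lambda x}, \tag{4.52}
\]
where $R \in \mathcal{P}$ is still a polynomial of degree $k$.

Proof. We can rewrite $P$ (cf. [Phi16, Th. 6.5(a)]) in the form
\[
P(x) = \sum_{j=0}^{n} b_j (x - \lambda)^j, \quad n \in \mathbb{N}_0, \tag{4.53}
\]
where $b_0 = P(\lambda) \neq 0$ and the remaining $b_j \in \mathbb{K}$ can also be calculated from the coefficients of $P$ according to [Phi16, (6.6)]. We compute
\[
P(\partial_x) \big( Q(x)\, e^{\lambda x} \big) \overset{(4.53)}{=} \sum_{j=0}^{n} b_j (\partial_x - \lambda)^j \big( Q(x)\, e^{\lambda x} \big) \overset{(4.50)}{=} \sum_{j=0}^{n} b_j Q^{(j)}(x)\, e^{\lambda x},
\]
i.e. (4.52) holds with $R := \sum_{j=0}^{k} b_j Q^{(j)}$, and $b_0 \neq 0$ implies $\deg R = \deg Q = k$.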
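Theorem 4.32. If $a_0, \dots, a_{n-1} \in \mathbb{K}$, $n \in \mathbb{N}$, and $P(x) = x^n - \sum_{j=0}^{n-1} a_j x^j$ has the distinct zeros $\lambda_1, \dots, \lambda_r \in \mathbb{K}$ with respective multiplicities $k_1, \dots, k_r \in \mathbb{N}$, then the set
\[
B := \big\{ (\phi_{jm} : I \to \mathbb{K}) : j \in \{1, \dots, r\},\ m \in \{0, \dots, k_j - 1\} \big\}, \tag{4.54a}
\]
where
\[
\forall_{j \in \{1, \dots, r\}} \ \forall_{m \in \{0, \dots, k_j - 1\}} \quad \phi_{jm} : I \to \mathbb{K}, \quad \phi_{jm}(x) := x^m e^{\lambda_j x}, \tag{4.54b}
\]
yields a basis of the homogeneous solution space
\[
\mathcal{H}_h = \big\{ (\phi : I \to \mathbb{K}) : P(\partial_x)\phi = 0 \big\}.
\]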

Proof. Since $k_1 + \dots + k_r = n$ implies $\# B = n$ and we know $\dim \mathcal{H}_h = n$, it suffices to show that $B \subseteq \mathcal{H}_h$ and the elements of $B$ are linearly independent. Let $\phi_{jm}$ be as in (4.54b). As $\lambda_j$ is a zero of multiplicity $k_j$ of $P$, we can write $P(x) = Q_j(x)(x - \lambda_j)^{k_j}$ with some $Q_j \in \mathcal{P}$. From the computation
\[
P(\partial_x)\phi_{jm}(x) = Q_j(\partial_x)(\partial_x - \lambda_j)^{k_j} \big( x^m e^{\lambda_j x} \big) \overset{(4.50)}{=} Q_j(\partial_x) \big( (\partial_x^{k_j} x^m)\, e^{\lambda_j x} \big) \overset{k_j > m}{=} 0,
\]
we gather $B \subseteq \mathcal{H}_h$. Linear independence of the $\phi_{jm}$ is verified by showing
\[
\left( \sum_{j=1}^{r} Q_j(x)\, e^{\lambda_j x} = 0 \ \wedge\ \forall_{j = 1, \dots, r} \ Q_j \in \mathcal{P}_{k_j - 1} \right) \ \Rightarrow\ \forall_{j = 1, \dots, r} \ Q_j \equiv 0. \tag{4.55}
\]
We prove (4.55) by induction on $r$. Since $e^{\lambda_j x} \neq 0$ for each $x \in \mathbb{R}$, the case $r = 1$ is immediate. For the induction step, let $r \ge 2$. If at least one $Q_j \equiv 0$, then the remaining $Q_j \equiv 0$ as well by the induction hypothesis. It only remains to consider the case that none of the $Q_j$ vanishes identically. In that case, we apply $(\partial_x - \lambda_r)^{k_r}$ to $\sum_{j=1}^{r} Q_j(x)\, e^{\lambda_j x} = 0$, obtaining
\[
\sum_{j=1}^{r-1} R_j(x)\, e^{\lambda_j x} = 0 \tag{4.56}
\]
with suitable $R_j \in \mathcal{P}$, since Lem. 4.30 yields $(\partial_x - \lambda_r)^{k_r} \big( Q_r(x)\, e^{\lambda_r x} \big) = Q_r^{(k_r)}(x)\, e^{\lambda_r x} = 0$ and, for $j < r$, Lem. 4.31 applies due to $(\lambda_j - \lambda_r)^{k_r} \neq 0$, also providing $\deg R_j = \deg Q_j$. Thus, none of the $R_j$ in (4.56) can vanish identically, violating the induction hypothesis. This finishes the proof of $Q_j \equiv 0$ for each $j = 1, \dots, r$ and the proof of the theorem.

As it can occur in Th. 4.32 that P ∈ P[R], but λj ∈ C \ R for some or all of the zeros λj ,
the question arises of how to obtain a basis of the R-vector space Hh from the basis of
the C-vector space Hh provided by Th. 4.32. The following Rem. 4.33(b) answers this
question.
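Remark 4.33. (a) If $\lambda_1, \lambda_2 \in \mathbb{C}$, then complex conjugation has the properties (cf. [Phi16, Def. and Rem. 5.5])
\[
\overline{\lambda_1 \pm \lambda_2} = \bar{\lambda}_1 \pm \bar{\lambda}_2, \quad \overline{\lambda_1 \lambda_2} = \bar{\lambda}_1 \bar{\lambda}_2.
\]
In consequence, if $P \in \mathcal{P}[\mathbb{R}]$, then $\overline{P(\lambda)} = P(\bar{\lambda})$ for each $\lambda \in \mathbb{C}$. In particular, if $P \in \mathcal{P}[\mathbb{R}]$ and $\lambda \in \mathbb{C} \setminus \mathbb{R}$ is a nonreal zero of $P$, then $\bar{\lambda} \neq \lambda$ is also a zero of $P$.

(b) Consider the situation of Th. 4.32 with $P \in \mathcal{P}[\mathbb{R}]$. Using (a), if $\phi_{jm} : I \to \mathbb{C}$, $\phi_{jm}(x) = x^m e^{\lambda_j x}$, $\lambda_j \in \mathbb{C} \setminus \mathbb{R}$, occurs in a basis for the $\mathbb{C}$-vector space $\mathcal{H}_h$ (with $m = 0$ in the special case of Th. 4.27), then $\phi_{\tilde{j}m} : I \to \mathbb{C}$, $\phi_{\tilde{j}m}(x) = x^m e^{\lambda_{\tilde{j}} x}$, with $\lambda_{\tilde{j}} = \bar{\lambda}_j$ will occur as well. Noting that, for each $x \in \mathbb{R}$ and each $\lambda \in \mathbb{C}$,
\begin{align*}
e^{\lambda x} &= e^{x(\operatorname{Re}\lambda + i \operatorname{Im}\lambda)} = e^{x \operatorname{Re}\lambda} \big( \cos(x \operatorname{Im}\lambda) + i \sin(x \operatorname{Im}\lambda) \big), \tag{4.57a} \\
e^{\bar{\lambda} x} &= e^{x(\operatorname{Re}\lambda - i \operatorname{Im}\lambda)} = e^{x \operatorname{Re}\lambda} \big( \cos(x \operatorname{Im}\lambda) - i \sin(x \operatorname{Im}\lambda) \big), \tag{4.57b} \\
\tfrac{1}{2} \big( e^{\lambda x} + e^{\bar{\lambda} x} \big) &= e^{x \operatorname{Re}\lambda} \cos(x \operatorname{Im}\lambda), \tag{4.57c} \\
\tfrac{1}{2i} \big( e^{\lambda x} - e^{\bar{\lambda} x} \big) &= e^{x \operatorname{Re}\lambda} \sin(x \operatorname{Im}\lambda), \tag{4.57d}
\end{align*}
one can define
\begin{align*}
\psi_{jm} : I \to \mathbb{R}, \quad \psi_{jm}(x) &:= \tfrac{1}{2} \big( \phi_{jm}(x) + \phi_{\tilde{j}m}(x) \big) = x^m e^{x \operatorname{Re}\lambda_j} \cos(x \operatorname{Im}\lambda_j), \tag{4.58a} \\
\psi_{\tilde{j}m} : I \to \mathbb{R}, \quad \psi_{\tilde{j}m}(x) &:= \tfrac{1}{2i} \big( \phi_{jm}(x) - \phi_{\tilde{j}m}(x) \big) = x^m e^{x \operatorname{Re}\lambda_j} \sin(x \operatorname{Im}\lambda_j). \tag{4.58b}
\end{align*}
If one replaces each pair $\phi_{jm}, \phi_{\tilde{j}m}$ in the basis for the $\mathbb{C}$-vector space $\mathcal{H}_h$ with the corresponding pair $\psi_{jm}, \psi_{\tilde{j}m}$, then one obtains a basis for the $\mathbb{R}$-vector space $\mathcal{H}_h$: This follows from
\[
\begin{pmatrix} \psi_{jm} \\ \psi_{\tilde{j}m} \end{pmatrix} = A \begin{pmatrix} \phi_{jm} \\ \phi_{\tilde{j}m} \end{pmatrix}
\quad \text{with} \quad
A := \begin{pmatrix} \frac{1}{2} & \frac{1}{2} \\ \frac{1}{2i} & -\frac{1}{2i} \end{pmatrix}, \quad \det A = -\frac{1}{2i} \neq 0. \tag{4.59}
\]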

Example 4.34. We consider the fourth-order linear ODE
\[
y^{(4)} = -8y'' - 16y, \tag{4.60}
\]
which can be written as $P(\partial_x)\, y = 0$ with
\[
P(x) := x^4 + 8x^2 + 16 = (x^2 + 4)^2 = (x - 2i)^2 (x + 2i)^2, \tag{4.61}
\]
i.e. $P$ has the zeros $\lambda_1 = 2i$, $\lambda_2 = -2i$, both with multiplicity 2. Thus, according to Th. 4.32, the four functions
\[
\phi_{10}, \phi_{11}, \phi_{20}, \phi_{21} : \mathbb{R} \to \mathbb{C}, \quad
\phi_{10}(x) = e^{2ix}, \quad \phi_{11}(x) = x\, e^{2ix}, \quad \phi_{20}(x) = e^{-2ix}, \quad \phi_{21}(x) = x\, e^{-2ix},
\]
form a basis of the $\mathbb{C}$-vector space $\mathcal{H}_h$. If we consider (4.60) as an ODE over $\mathbb{R}$, we can use (4.58) to obtain the basis $(\psi_{10}, \psi_{11}, \psi_{20}, \psi_{21})$ of the $\mathbb{R}$-vector space $\mathcal{H}_h$, where
\[
\psi_{10}, \psi_{11}, \psi_{20}, \psi_{21} : \mathbb{R} \to \mathbb{R}, \quad
\psi_{10}(x) = \cos(2x), \quad \psi_{11}(x) = x \cos(2x), \quad \psi_{20}(x) = \sin(2x), \quad \psi_{21}(x) = x \sin(2x).
\]
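The multiplicities used in Example 4.34 can be confirmed mechanically. The sketch below relies on the standard fact (not stated in these notes) that $\lambda$ is a zero of multiplicity $k$ of $P$ if, and only if, $P(\lambda) = P'(\lambda) = \dots = P^{(k-1)}(\lambda) = 0$ and $P^{(k)}(\lambda) \neq 0$. The helper names and the tolerance are hypothetical choices for illustration.

```python
def poly_eval(coeffs, z):
    # Horner evaluation of sum_j coeffs[j] * z**j.
    result = 0
    for c in reversed(coeffs):
        result = result * z + c
    return result

def poly_derivative(coeffs):
    # Coefficient list of the derivative polynomial.
    return [j * c for j, c in enumerate(coeffs)][1:]

def multiplicity(coeffs, z, tol=1e-9):
    # Count how many of P, P', P'', ... vanish at z before the first nonzero value.
    k = 0
    while coeffs and abs(poly_eval(coeffs, z)) < tol:
        k += 1
        coeffs = poly_derivative(coeffs)
    return k

# P(x) = x^4 + 8x^2 + 16 from (4.61), coefficients in ascending order.
P = [16, 0, 8, 0, 1]
print(multiplicity(P, 2j), multiplicity(P, -2j), multiplicity(P, 1))
```

For a zero of multiplicity $k_j$, Th. 4.32 then contributes the $k_j$ basis functions $x^m e^{\lambda_j x}$, $m = 0, \dots, k_j - 1$.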

If (4.38) is inhomogeneous, then one can use Th. 4.32 and, if necessary, Rem. 4.33(b),
to obtain a basis of the homogeneous solution space Hh , then using the equivalence with
systems of first-order linear ODE and variation of constants according to Th. 4.15 to
solve (4.38). However, if the function b in (4.38) is such that the following Th. 4.35
applies, then one can avoid using the above strategy to obtain a particular solution φ
to (4.38) (and, thus, the entire solution space via Hi = φ + Hh ).
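Theorem 4.35. Let $a_0, \dots, a_{n-1} \in \mathbb{K}$, $n \in \mathbb{N}$, and $P(x) = x^n - \sum_{j=0}^{n-1} a_j x^j$. Consider
\[
P(\partial_x)\, y = Q(x)\, e^{\mu x}, \quad Q \in \mathcal{P}, \quad \mu \in \mathbb{K}. \tag{4.62}
\]

(a) (no resonance): If $P(\mu) \neq 0$ and $m := \deg(Q) \in \mathbb{N}_0$, then there exists a polynomial $R \in \mathcal{P}$ such that $\deg(R) = m$ and
\[
\phi : \mathbb{R} \to \mathbb{K}, \quad \phi(x) := R(x)\, e^{\mu x}, \tag{4.63}
\]
is a solution to (4.62). Moreover, if $Q \equiv 1$, then one can choose $R \equiv 1/P(\mu)$.

(b) (resonance): If $\mu$ is a zero of $P$ with multiplicity $k \in \mathbb{N}$ and $m := \deg(Q) \in \mathbb{N}_0$, then there exists a solution to (4.62) of the following form:
\[
\phi : \mathbb{R} \to \mathbb{K}, \quad \phi(x) := R(x)\, e^{\mu x}, \quad R \in \mathcal{P}, \quad R(x) = \sum_{j=k}^{m+k} c_j x^j, \quad c_k, \dots, c_{m+k} \in \mathbb{K}. \tag{4.64}
\]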

The reason behind the terms no resonance and resonance will be explained in the follow-
ing Example 4.36.

Proof. Exercise. 
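Example 4.36. Consider the second-order linear ODE
\[
\frac{d^2 x}{dt^2} + \omega_0^2\, x = a \cos(\omega t), \quad \omega_0, \omega \in \mathbb{R}^+, \quad a \in \mathbb{R} \setminus \{0\}, \tag{4.65}
\]
which can be written as $P(\partial_t)\, x = a \cos(\omega t)$ with
\[
P(t) := t^2 + \omega_0^2 = (t - i\omega_0)(t + i\omega_0). \tag{4.66}
\]
Note that the unknown function is written as $x$ depending on the variable $t$ (instead of $y$ depending on $x$). This is due to the physical interpretation of (4.65), where $x$ represents the position of a so-called harmonic oscillator at time $t$, having angular frequency $\omega_0$ and being subjected to a periodic external force of angular frequency $\omega$ and amplitude $a$. We can find a particular solution $\phi$ to (4.65) by applying Th. 4.35 to
\[
P(\partial_t)\, x = a\, e^{i\omega t}. \tag{4.67}
\]
We have to distinguish two cases:

(a) Case $\omega \neq \omega_0$: In this case, one says that the oscillator and the external force are not in resonance, which explains the term no resonance in Th. 4.35(a). In this case, we can apply Th. 4.35(a) with $\mu := i\omega$ and $Q \equiv a$, yielding $R \equiv a/P(i\omega) = a/(\omega_0^2 - \omega^2)$, i.e.
\[
\phi_0 : \mathbb{R} \to \mathbb{C}, \quad \phi_0(t) := R(t)\, e^{\mu t} = \frac{a}{\omega_0^2 - \omega^2}\, e^{i\omega t}, \tag{4.68a}
\]
is a solution to (4.67) and
\[
\phi : \mathbb{R} \to \mathbb{R}, \quad \phi(t) := \operatorname{Re} \phi_0(t) = \frac{a}{\omega_0^2 - \omega^2} \cos(\omega t), \tag{4.68b}
\]
is a solution to (4.65).

(b) Case $\omega = \omega_0$: In this case, one says that the oscillator and the external force are in resonance, which explains the term resonance in Th. 4.35(b). In this case, we can apply Th. 4.35(b) with $\mu := i\omega$ and $Q \equiv a$, i.e. $m = 0$, $k = 1$, yielding $R(t) = ct$ for some $c \in \mathbb{C}$. To determine $c$, we plug $x(t) = R(t)\, e^{\mu t}$ into (4.67):
\begin{align*}
P(\partial_t) \big( ct\, e^{i\omega t} \big) &= \partial_t \big( c\, e^{i\omega t} + c i \omega t\, e^{i\omega t} \big) + \omega_0^2\, ct\, e^{i\omega t} \\
&= c i \omega\, e^{i\omega t} + c i \omega\, e^{i\omega t} - c \omega^2 t\, e^{i\omega t} + \omega_0^2\, ct\, e^{i\omega t} \\
&= 2 c i \omega\, e^{i\omega t} = a\, e^{i\omega t} \quad \Rightarrow \quad c = \frac{a}{2i\omega}. \tag{4.69}
\end{align*}
Thus,
\[
\phi_0 : \mathbb{R} \to \mathbb{C}, \quad \phi_0(t) := \frac{a}{2i\omega}\, t\, e^{i\omega t}, \tag{4.70a}
\]
is a solution to (4.67) and
\[
\phi : \mathbb{R} \to \mathbb{R}, \quad \phi(t) := \operatorname{Re} \phi_0(t) = \frac{a}{2\omega}\, t \sin(\omega t), \tag{4.70b}
\]
is a solution to (4.65).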
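Both particular solutions of Example 4.36 can be sanity-checked by inserting them into (4.65) numerically. This is only a sketch: the parameter values, the central-difference approximation of the second derivative, and the function names are assumptions made for illustration.

```python
import math

def second_derivative(f, t, h=1e-4):
    # Central finite-difference approximation of f''(t).
    return (f(t + h) - 2 * f(t) + f(t - h)) / h ** 2

def residual(phi, t, omega0, omega, a):
    # x'' + omega0^2 x - a cos(omega t), which should vanish for a solution of (4.65).
    return second_derivative(phi, t) + omega0 ** 2 * phi(t) - a * math.cos(omega * t)

a, omega0 = 1.5, 2.0        # hypothetical amplitude and eigenfrequency
omega_off = 3.0             # forcing frequency different from omega0

def phi_no_resonance(t):
    # Candidate from (4.68b): bounded oscillation with the forcing frequency.
    return a / (omega0 ** 2 - omega_off ** 2) * math.cos(omega_off * t)

def phi_resonance(t):
    # Candidate from (4.70b) with omega = omega0: amplitude grows linearly in t.
    return a / (2 * omega0) * t * math.sin(omega0 * t)

times = [0.25 * j for j in range(21)]   # sample times in [0, 5]
max_off = max(abs(residual(phi_no_resonance, t, omega0, omega_off, a)) for t in times)
max_res = max(abs(residual(phi_resonance, t, omega0, omega0, a)) for t in times)
print(max_off, max_res)
```

Both maxima should be small (of the size of the finite-difference error). The factor $t$ in the resonant solution is what makes its amplitude grow without bound.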

4.6.2 Systems of First-Order Linear ODE

Matrix Exponential Function


Definition 4.37. Let $I \subseteq \mathbb{R}$ be a nontrivial interval, $n \in \mathbb{N}$, $A \in \mathcal{M}(n, \mathbb{K})$ and $b : I \to \mathbb{K}^n$ be continuous. Then a linear ODE with constant coefficients is an equation of the form
\[
y' = A\, y + b(x), \tag{4.71}
\]
i.e. a linear ODE, where the matrix $A$ does not depend on $x$.

Recalling that the ordinary exponential function $\exp_a : \mathbb{R} \to \mathbb{C}$, $x \mapsto e^{ax}$, $a \in \mathbb{C}$, is precisely the solution to the initial value problem $y' = a\, y$, $y(0) = 1$, the following definition constitutes a natural generalization:
Definition 4.38. Given $n \in \mathbb{N}$, $A \in \mathcal{M}(n, \mathbb{C})$, define the matrix exponential function
\[
\exp_A : \mathbb{R} \to \mathcal{M}(n, \mathbb{C}), \quad x \mapsto e^{Ax}, \tag{4.72a}
\]
to be the solution to the matrix-valued initial value problem
\[
Y' = AY, \quad Y(0) = \mathrm{Id}, \tag{4.72b}
\]
i.e. as the fundamental matrix solution of $y' = A\, y$ that satisfies $Y(0) = \mathrm{Id}$ (sometimes called the principal matrix solution of $y' = A\, y$).

The previous definition of the matrix exponential function is further justified by the
following result:
Theorem 4.39. For each $A \in \mathcal{M}(n, \mathbb{C})$, $n \in \mathbb{N}$, it holds that
\[
\forall_{x \in \mathbb{R}} \quad e^{Ax} = \sum_{k=0}^{\infty} \frac{(Ax)^k}{k!} \tag{4.73}
\]
in the sense that the partial sums on the right-hand side converge pointwise to $e^{Ax}$ on $\mathbb{R}$, where the convergence is even uniform on every compact interval.

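Proof. By the equivalence of all norms on $\mathbb{C}^{n^2} \cong \mathcal{M}(n, \mathbb{C})$, we may choose a convenient norm on $\mathcal{M}(n, \mathbb{C})$. So we let $\|\cdot\|$ denote an arbitrary operator norm on $\mathcal{M}(n, \mathbb{C})$, induced by some norm $\|\cdot\|$ on $\mathbb{C}^n$. We first show that the partial sums $(A_m(x))_{m \in \mathbb{N}}$, $A_m(x) := \sum_{k=0}^{m} \frac{(Ax)^k}{k!}$, in (4.73) form a Cauchy sequence in $\mathcal{M}(n, \mathbb{C})$: For $M, N \in \mathbb{N}$, $N > M$, one estimates, for each $x \in \mathbb{R}$,
\[
\big\| A_N(x) - A_M(x) \big\| = \Big\| \sum_{k=M+1}^{N} \frac{(Ax)^k}{k!} \Big\| \overset{(G.10)}{\le} \sum_{k=M+1}^{N} \frac{\|A\|^k |x|^k}{k!}. \tag{4.74}
\]
Since the convergence $\lim_{m \to \infty} \sum_{k=0}^{m} \frac{\|A\|^k |x|^k}{k!} = e^{\|A\| |x|}$ is pointwise for $x \in \mathbb{R}$ and uniform on every compact interval, (4.74) shows each $(A_m(x))_{m \in \mathbb{N}}$ is a Cauchy sequence that converges to some $\Phi(x) \in \mathcal{M}(n, \mathbb{C})$ (by the completeness of $\mathcal{M}(n, \mathbb{C})$), pointwise for $x \in \mathbb{R}$ and uniformly on every compact interval. It remains to show $\Phi$ is the solution to (4.72b), i.e.
\[
\forall_{x \in \mathbb{R}} \quad \Phi(x) = \mathrm{Id} + \int_{0}^{x} A\Phi(t) \, dt. \tag{4.75}
\]
Using the identity
\[
A_m(x) = \mathrm{Id} + \sum_{k=1}^{m} \frac{(Ax)^k}{k!} = \mathrm{Id} + A \sum_{k=0}^{m-1} \frac{A^k x^{k+1}}{(k+1)!} = \mathrm{Id} + A \int_{0}^{x} \sum_{k=0}^{m-1} \frac{A^k t^k}{k!} \, dt,
\]
we estimate, for each $x \in \mathbb{R}$ and each $m \in \mathbb{N}$,
\begin{align*}
\Big\| \Phi(x) - \mathrm{Id} - \int_{0}^{x} A\Phi(t) \, dt \Big\|
&\le \big\| \Phi(x) - A_m(x) \big\| + \Big\| A_m(x) - \mathrm{Id} - \int_{0}^{x} A\Phi(t) \, dt \Big\| \\
&= \big\| \Phi(x) - A_m(x) \big\| + \Big\| \int_{0}^{x} \Big( A \sum_{k=0}^{m-1} \frac{A^k t^k}{k!} - A\Phi(t) \Big) dt \Big\| \\
&\le \big\| \Phi(x) - A_m(x) \big\| + \Big| \int_{0}^{x} \|A\| \, \big\| A_{m-1}(t) - \Phi(t) \big\| \, dt \Big| \to 0 \quad \text{for } m \to \infty,
\end{align*}
which proves (4.75) and establishes the case.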
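The partial sums $A_m(x)$ from the proof are straightforward to evaluate. The following minimal sketch (pure Python; the helper names, the truncation index `m=30`, and the test matrix are assumptions for illustration) compares $A_m(x)$ for the generator of plane rotations with the known closed form $e^{Ax} = \cos(x)\, \mathrm{Id} + \sin(x)\, A$, which holds here because $A^2 = -\mathrm{Id}$.

```python
import math

def matmul(P, Q):
    # Product of two square matrices given as nested lists.
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def exp_series(A, x, m=30):
    # Partial sum A_m(x) = sum_{k=0}^m (A x)^k / k! as in (4.73).
    n = len(A)
    term = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    total = [row[:] for row in term]
    Ax = [[a * x for a in row] for row in A]
    for k in range(1, m + 1):
        # term becomes (A x)^k / k! by multiplying the previous term with A x / k.
        term = [[t / k for t in row] for row in matmul(term, Ax)]
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total

A = [[0.0, 1.0], [-1.0, 0.0]]   # A^2 = -Id
x = 0.7
approx = exp_series(A, x)
exact = [[math.cos(x), math.sin(x)], [-math.sin(x), math.cos(x)]]
error = max(abs(approx[i][j] - exact[i][j]) for i in range(2) for j in range(2))
print(approx, error)
```

Truncating the series is adequate for small $\|A\| |x|$; for large arguments the terms first grow before they decay, so a practical implementation would typically use a more robust method.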

The matrix exponential function has some properties that are familiar from the case
n = 1 (see Prop. 4.40(a),(b)), but also some properties that are, perhaps, unexpected
(see Prop. 4.42(a),(b)).
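Proposition 4.40. Let $A \in \mathcal{M}(n, \mathbb{C})$, $n \in \mathbb{N}$.

(a) $e^{A(t+s)} = e^{At} e^{As}$ holds for each $s, t \in \mathbb{R}$.

(b) $(e^{Ax})^{-1} = e^{A(-x)} = e^{-Ax}$ holds for each $x \in \mathbb{R}$.

(c) For the transpose $A^t$, one has $e^{A^t x} = (e^{Ax})^t$ for each $x \in \mathbb{R}$.

Proof. (a): Fix $s \in \mathbb{R}$. The function $\Phi_s : \mathbb{R} \to \mathcal{M}(n, \mathbb{C})$, $\Phi_s(t) := e^{A(t+s)}$, is a solution to $Y' = AY$ (namely the one for the initial condition $Y(-s) = \mathrm{Id}$). Moreover, the function $\Psi_s : \mathbb{R} \to \mathcal{M}(n, \mathbb{C})$, $\Psi_s(t) := e^{At} e^{As}$, is also a solution to $Y' = AY$, since
\[
\partial_t \Psi_s(t) = \partial_t \big( e^{At} e^{As} \big) = A e^{At} e^{As} = A \Psi_s(t).
\]
Finally, since $\Psi_s(0) = e^{A0} e^{As} = \mathrm{Id}\, e^{As} = e^{As} = \Phi_s(0)$, the claimed $\Phi_s = \Psi_s$ follows by uniqueness of solutions.
(b) is an easy consequence of (a), since
\[
\mathrm{Id} = e^{A0} = e^{A(x - x)} \overset{(a)}{=} e^{Ax} e^{-Ax}.
\]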

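(c): Clearly, the map $A \mapsto A^t$ is continuous on $\mathcal{M}(n, \mathbb{C})$ (since $\lim_{k \to \infty} A_k = A$ implies $\lim_{k \to \infty} a_{k,\alpha\beta} = a_{\alpha\beta}$ for all components, which implies $\lim_{k \to \infty} A_k^t = A^t$), providing, for each $x \in \mathbb{R}$,
\[
e^{A^t x} = \lim_{m \to \infty} \sum_{k=0}^{m} \frac{(A^t x)^k}{k!} = \lim_{m \to \infty} \left( \sum_{k=0}^{m} \frac{(Ax)^k}{k!} \right)^{t} = \left( \lim_{m \to \infty} \sum_{k=0}^{m} \frac{(Ax)^k}{k!} \right)^{t} = (e^{Ax})^t,
\]
completing the proof.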

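Proposition 4.41. Let $A \in \mathcal{M}(n, \mathbb{C})$, $n \in \mathbb{N}$. Then
\[
\det e^{A} = e^{\operatorname{tr} A}.
\]

Proof. Applying Liouville's formula (4.32) to $\Phi(x) := e^{Ax}$, $x \in \mathbb{R}$, yields
\[
\det e^{Ax} = \det e^{A0} \exp\left( \int_{0}^{x} \operatorname{tr} A \, dt \right) = 1 \cdot e^{x \operatorname{tr} A}, \tag{4.76}
\]
and setting $x = 1$ in (4.76) proves the proposition.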

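Proposition 4.42. Let $A, B \in \mathcal{M}(n, \mathbb{C})$, $n \in \mathbb{N}$.

(a) $B e^{Ax} = e^{Ax} B$ holds for each $x \in \mathbb{R}$ if, and only if, $AB = BA$.

(b) $e^{(A+B)x} = e^{Ax} e^{Bx}$ holds for each $x \in \mathbb{R}$ if, and only if, $AB = BA$.

Proof. (a): If $B e^{Ax} = e^{Ax} B$ holds for each $x \in \mathbb{R}$, then differentiation yields $B A e^{Ax} = A e^{Ax} B$ for each $x \in \mathbb{R}$, and the case $x = 0$ provides $BA\, \mathrm{Id} = A\, \mathrm{Id}\, B$, i.e. $BA = AB$. For the converse, assume $BA = AB$ and define the auxiliary maps
\begin{align*}
f_B &: \mathcal{M}(n, \mathbb{C}) \to \mathcal{M}(n, \mathbb{C}), \quad f_B(C) := BC, \\
g_B &: \mathcal{M}(n, \mathbb{C}) \to \mathcal{M}(n, \mathbb{C}), \quad g_B(C) := CB.
\end{align*}
If $\|\cdot\|$ denotes an operator norm, then $\|BC_1 - BC_2\| \le \|B\| \|C_1 - C_2\|$ and $\|C_1 B - C_2 B\| \le \|B\| \|C_1 - C_2\|$, showing $f_B$ and $g_B$ to be (even Lipschitz) continuous. Thus,
\begin{align*}
B e^{Ax} = f_B(e^{Ax}) &= f_B\left( \lim_{m \to \infty} \sum_{k=0}^{m} \frac{(Ax)^k}{k!} \right) = \lim_{m \to \infty} f_B\left( \sum_{k=0}^{m} \frac{(Ax)^k}{k!} \right) \\
&= \lim_{m \to \infty} B \sum_{k=0}^{m} \frac{(Ax)^k}{k!} \overset{AB = BA}{=} \lim_{m \to \infty} \left( \sum_{k=0}^{m} \frac{(Ax)^k}{k!} \right) B = \lim_{m \to \infty} g_B\left( \sum_{k=0}^{m} \frac{(Ax)^k}{k!} \right) \\
&= g_B\left( \lim_{m \to \infty} \sum_{k=0}^{m} \frac{(Ax)^k}{k!} \right) = g_B(e^{Ax}) = e^{Ax} B,
\end{align*}
thereby establishing the case.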


(b): Exercise (hint: use (a)). 
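That the commutativity hypothesis in Prop. 4.42(b) cannot be dropped is visible already for $2 \times 2$ matrices. The sketch below is an illustration only: the matrices `A`, `B`, the series-based helper `exp_series`, and the truncation index are hypothetical choices. It compares $e^{A+B}$ with $e^{A} e^{B}$ for a non-commuting pair and for the commuting pair $A$, $2A$.

```python
import math

def matmul(P, Q):
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def exp_series(A, x, m=30):
    # Truncated exponential series sum_{k=0}^m (A x)^k / k!.
    n = len(A)
    term = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    total = [row[:] for row in term]
    Ax = [[a * x for a in row] for row in A]
    for k in range(1, m + 1):
        term = [[t / k for t in row] for row in matmul(term, Ax)]
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total

def max_diff(P, Q):
    return max(abs(P[i][j] - Q[i][j]) for i in range(len(P)) for j in range(len(P)))

A = [[0.0, 1.0], [0.0, 0.0]]
B = [[0.0, 0.0], [1.0, 0.0]]          # AB != BA
A_plus_B = [[0.0, 1.0], [1.0, 0.0]]

lhs = exp_series(A_plus_B, 1.0)                          # e^{A+B}
rhs = matmul(exp_series(A, 1.0), exp_series(B, 1.0))     # e^A e^B
gap_noncommuting = max_diff(lhs, rhs)

two_A = [[0.0, 2.0], [0.0, 0.0]]      # commutes with A
three_A = [[0.0, 3.0], [0.0, 0.0]]
gap_commuting = max_diff(exp_series(three_A, 1.0),
                         matmul(exp_series(A, 1.0), exp_series(two_A, 1.0)))
print(gap_noncommuting, gap_commuting)
```

Here $e^{A+B}$ has the entry $\cosh 1$ in the upper left corner, whereas $e^{A} e^{B}$ has the entry $2$ there, so the first gap is clearly nonzero while the second vanishes up to rounding.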

Eigenvalues and Jordan Normal Form


We will see that the solution theory of linear ODE with constant coefficients is related
to the eigenvalues of A. We recall the definition of this notion:

Definition 4.43. Let $n \in \mathbb{N}$ and $A \in \mathcal{M}(n, \mathbb{C})$. Then $\lambda \in \mathbb{C}$ is called an eigenvalue of $A$ if, and only if, there exists $0 \neq v \in \mathbb{C}^n$ such that
\[
Av = \lambda v. \tag{4.77}
\]
If (4.77) holds, then $v \neq 0$ is called an eigenvector for the eigenvalue $\lambda$.

Theorem 4.44. Let n ∈ N and A ∈ M(n, C).

(a) For each eigenvalue $\lambda \in \mathbb{C}$ of $A$ with eigenvector $v \in \mathbb{C}^n \setminus \{0\}$, the function
\[
\phi : I \to \mathbb{C}^n, \quad \phi(x) := e^{\lambda x}\, v, \tag{4.78}
\]
is a solution to the homogeneous version of (4.71).

(b) If $\{v_1, \dots, v_n\}$ is a basis of eigenvectors for $\mathbb{C}^n$, where $v_j$ is an eigenvector with respect to the eigenvalue $\lambda_j \in \mathbb{C}$ of $A$ for each $j \in \{1, \dots, n\}$, then $\phi_1, \dots, \phi_n$ with
\[
\forall_{j \in \{1, \dots, n\}} \quad \phi_j : I \to \mathbb{C}^n, \quad \phi_j(x) := e^{\lambda_j x}\, v_j, \tag{4.79}
\]
form a fundamental system for (4.71).

Proof. (a): One computes, for each x ∈ I,

φ′ (x) = λ eλx v = eλx Av = Aφ(x),

proving that φ solves the homogeneous version of (4.71).


(b): Without loss of generality, we may consider I = R. We already know from (a) that
each φj is a solution to the homogeneous version of (4.71). Thus, it merely remains
to check that φ1 , . . . , φn are linearly independent. As φ1 (0) = v1 , . . . , φn (0) = vn are
linearly independent by hypothesis, the linear independence of φ1 , . . . , φn is provided by
Th. 4.11(b). 

To proceed, we need a few more notions and results from linear algebra:

Theorem 4.45. Let n ∈ N and A ∈ M(n, C). Then the following statements (i) and
(ii) are equivalent:

(i) There exists a basis $B$ of eigenvectors for $\mathbb{C}^n$, i.e. there exist $v_1, \dots, v_n \in \mathbb{C}^n$ and $\lambda_1, \dots, \lambda_n \in \mathbb{C}$ such that $B = \{v_1, \dots, v_n\}$ is a basis of $\mathbb{C}^n$ and $A v_j = \lambda_j v_j$ for each $j = 1, \dots, n$ (note that the $v_j$ must all be distinct, whereas some (or all) of the $\lambda_j$ may be identical).

(ii) There exists an invertible matrix $W \in \mathcal{M}(n, \mathbb{C})$ such that
\[
W^{-1} A W = \begin{pmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{pmatrix}, \tag{4.80}
\]
i.e. $A$ is diagonalizable (if (4.80) holds, then the columns $v_1, \dots, v_n$ of $W$ must actually be the respective eigenvectors to the eigenvalues $\lambda_1, \dots, \lambda_n$).

Proof. See, e.g., [Koe03, Th. 8.3.1]. 

Unfortunately, not every matrix A ∈ M(n, C) is diagonalizable. However, every A ∈


M(n, C) can at least be transformed into Jordan normal form:

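Theorem 4.46 (Jordan Normal Form). Let $n \in \mathbb{N}$ and $A \in \mathcal{M}(n, \mathbb{C})$. There exists an invertible matrix $W \in \mathcal{M}(n, \mathbb{C})$ such that
\[
B := W^{-1} A W \tag{4.81}
\]
is in Jordan normal form, i.e. $B$ has block diagonal form
\[
B = \begin{pmatrix} B_1 & & 0 \\ & \ddots & \\ 0 & & B_r \end{pmatrix}, \tag{4.82}
\]
$1 \le r \le n$, where each block $B_j$ is a so-called Jordan matrix or Jordan block, i.e.
\[
B_j = (\lambda_j) \quad \text{or} \quad
B_j = \begin{pmatrix}
\lambda_j & 1 & 0 & \dots & 0 \\
 & \lambda_j & 1 & & \vdots \\
 & & \ddots & \ddots & 0 \\
 & 0 & & \lambda_j & 1 \\
 & & & & \lambda_j
\end{pmatrix}, \tag{4.83}
\]
where $\lambda_j$ is an eigenvalue of $A$.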

Proof. See, e.g., [Koe03, Th. 9.5.6] or [Str08, Th. 27.13]. 

The reason Th. 4.46 regarding the Jordan normal form is useful for solving linear ODE
with constant coefficients is the following theorem:

Theorem 4.47. Let $n \in \mathbb{N}$ and $A, W \in \mathcal{M}(n, \mathbb{C})$, where $W$ is assumed invertible.

(a) The following statements (i) and (ii) are equivalent:

(i) $\phi : I \to \mathbb{C}^n$ is a solution to $y' = Ay$.

(ii) $\psi := W^{-1}\phi : I \to \mathbb{C}^n$ is a solution to $y' = W^{-1} A W y$.

(b) $e^{W^{-1} A W x} = W^{-1} e^{Ax} W$ for each $x \in \mathbb{R}$.

Proof. (a): The equivalences
\[
\phi' = A\phi \quad \Leftrightarrow \quad W^{-1}\phi' = W^{-1} A \phi \quad \Leftrightarrow \quad \psi' = W^{-1} A W \psi
\]
establish the case.

(b): By definition, $x \mapsto e^{W^{-1} A W x}$ is the solution to the initial value problem $Y' = W^{-1} A W\, Y$, $Y(0) = \mathrm{Id}$. Thus, noting $W^{-1} e^{A0} W = \mathrm{Id}$ and
\[
\big( W^{-1} e^{Ax} W \big)' = W^{-1} A e^{Ax} W = W^{-1} A W\, W^{-1} e^{Ax} W
\]
shows $x \mapsto W^{-1} e^{Ax} W$ is a solution to the same initial value problem, establishing (b).
Remark 4.48. To obtain a fundamental system for (4.71) with $A \in \mathcal{M}(n, \mathbb{C})$, it suffices to obtain a fundamental system for $y' = By$, where $B := W^{-1} A W$ is in Jordan normal form and $W \in \mathcal{M}(n, \mathbb{C})$ is invertible: If $\phi_1, \dots, \phi_n$ are linearly independent solutions to $y' = By$, then $A = W B W^{-1}$, Th. 4.47(a), and $W$ being a linear isomorphism yield that $\psi_1 := W\phi_1, \dots, \psi_n := W\phi_n$ are linearly independent solutions to $y' = Ay$.
Moreover, since $B$ is in block diagonal form with each block being a Jordan matrix according to (4.82) and (4.83), it actually suffices to solve $y' = By$ assuming that
\[
B = \lambda\, \mathrm{Id} + N, \tag{4.84}
\]
where $\lambda \in \mathbb{C}$ and $N$ is a so-called canonical nilpotent matrix, i.e.
\[
N = 0 \ \text{(zero matrix)} \quad \text{or} \quad
N = \begin{pmatrix}
0 & 1 & 0 & \dots & 0 \\
 & 0 & 1 & & \vdots \\
 & & \ddots & \ddots & 0 \\
 & 0 & & 0 & 1 \\
 & & & & 0
\end{pmatrix}, \tag{4.85}
\]
where the case $N = 0$ is already covered by Th. 4.44. The remaining case is covered by the following Th. 4.49.
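Theorem 4.49. Let $\lambda \in \mathbb{C}$, $k \in \mathbb{N}$, $k \ge 2$, and assume $0 \neq N \in \mathcal{M}(k, \mathbb{C})$ is a canonical nilpotent matrix according to (4.85). Then
\[
\Phi : \mathbb{R} \to \mathcal{M}(k, \mathbb{C}), \quad
\Phi(x) := e^{\lambda x} \begin{pmatrix}
1 & x & \frac{x^2}{2} & \dots & \frac{x^{k-2}}{(k-2)!} & \frac{x^{k-1}}{(k-1)!} \\
0 & 1 & x & \dots & \frac{x^{k-3}}{(k-3)!} & \frac{x^{k-2}}{(k-2)!} \\
0 & 0 & 1 & \ddots & \vdots & \vdots \\
\vdots & \vdots & \vdots & \ddots & \ddots & \vdots \\
0 & 0 & 0 & \dots & 1 & x \\
0 & 0 & 0 & \dots & 0 & 1
\end{pmatrix}, \tag{4.86}
\]
is a fundamental matrix solution to
\[
Y' = (\lambda\, \mathrm{Id} + N)\, Y, \quad Y(0) = \mathrm{Id}, \tag{4.87}
\]
i.e.
\[
\forall_{x \in \mathbb{R}} \quad \Phi(x) = e^{(\lambda\, \mathrm{Id} + N) x}; \tag{4.88}
\]
in particular, the columns of $\Phi$ provide $k$ solutions to $y' = (\lambda\, \mathrm{Id} + N)\, y$ that are linearly independent.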

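Proof. $\Phi(0) = \mathrm{Id}$ is immediate from (4.86). Since $\Phi(x)$ is $e^{\lambda x}$ times an upper triangular matrix with all 1's on the diagonal, we obtain $\det \Phi(x) = e^{k\lambda x} \neq 0$ for each $x \in \mathbb{R}$, showing the columns of $\Phi$ are linearly independent. Let $\phi_{\alpha\beta} : \mathbb{R} \to \mathbb{C}$ denote the $\alpha$th component function of the $\beta$th column of $\Phi$, i.e.
\[
\forall_{\alpha, \beta \in \{1, \dots, k\}} \quad \phi_{\alpha\beta} : \mathbb{R} \to \mathbb{C}, \quad
\phi_{\alpha\beta}(x) := \begin{cases} e^{\lambda x}\, \frac{x^{\beta - \alpha}}{(\beta - \alpha)!} & \text{for } \alpha \le \beta, \\ 0 & \text{for } \alpha > \beta. \end{cases}
\]
It remains to show that
\[
\forall_{\alpha, \beta \in \{1, \dots, k\}} \quad
\phi_{\alpha\beta}' = \begin{cases} \lambda \phi_{\alpha\beta} + \phi_{\alpha+1, \beta} & \text{for } \alpha < \beta, \\ \lambda \phi_{\alpha\beta} & \text{for } \alpha = \beta, \\ 0 & \text{for } \alpha > \beta. \end{cases} \tag{4.89}
\]
One computes,
\[
\forall_{\alpha, \beta \in \{1, \dots, k\}} \quad
\phi_{\alpha\beta}'(x) = \begin{cases} \lambda\, e^{\lambda x}\, \frac{x^{\beta - \alpha}}{(\beta - \alpha)!} + e^{\lambda x}\, \frac{x^{\beta - (\alpha + 1)}}{(\beta - (\alpha + 1))!} & \text{for } \alpha < \beta, \\ \lambda\, e^{\lambda x}\, \frac{x^{\beta - \alpha}}{(\beta - \alpha)!} + 0 & \text{for } \alpha = \beta, \\ 0 & \text{for } \alpha > \beta, \end{cases}
\]
i.e. (4.89) holds, completing the proof.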
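As a plausibility check of (4.86) and (4.88), one can compare the closed form with a truncated exponential series for a small Jordan block. The sketch below uses $k = 3$; the value of `lam`, the evaluation point, the truncation index, and all helper names are assumptions made purely for illustration.

```python
import cmath
import math

def matmul(P, Q):
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def exp_series(A, x, m=40):
    # Truncated exponential series sum_{j=0}^m (A x)^j / j!.
    n = len(A)
    term = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    total = [row[:] for row in term]
    Ax = [[a * x for a in row] for row in A]
    for j in range(1, m + 1):
        term = [[t / j for t in row] for row in matmul(term, Ax)]
        total = [[total[r][c] + term[r][c] for c in range(n)] for r in range(n)]
    return total

def jordan_block(lam, k):
    # lam * Id + N with N the canonical nilpotent matrix of size k.
    return [[lam if i == j else (1.0 if j == i + 1 else 0.0) for j in range(k)]
            for i in range(k)]

def phi_closed_form(lam, k, x):
    # Entries e^{lam x} x^{b-a} / (b-a)! above and on the diagonal, 0 below, as in (4.86).
    return [[cmath.exp(lam * x) * x ** (b - a) / math.factorial(b - a) if a <= b else 0.0
             for b in range(k)] for a in range(k)]

lam, k, x = -0.5 + 2.0j, 3, 1.3
closed = phi_closed_form(lam, k, x)
series = exp_series(jordan_block(lam, k), x)
error = max(abs(closed[i][j] - series[i][j]) for i in range(k) for j in range(k))
print(error)
```

The printed error should be at the level of floating-point rounding, in line with $\Phi(x) = e^{(\lambda\, \mathrm{Id} + N)x}$.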

Example 4.50. For a 2-dimensional real system of linear ODE

y ′ = Ay, A ∈ M(2, R), (4.90)

there exist precisely the following three possibilities (i) – (iii):

(i) The matrix $A$ is diagonalizable with real eigenvalues $\lambda_1, \lambda_2 \in \mathbb{R}$ ($\lambda_1 = \lambda_2$ is possible), i.e. there is a basis $\{v_1, v_2\}$ of $\mathbb{R}^2$ such that $v_j$ is an eigenvector for $\lambda_j$, $j \in \{1, 2\}$. In this case, according to Th. 4.44(b), the two functions
\[
\phi_1, \phi_2 : \mathbb{R} \to \mathbb{K}^2, \quad \phi_1(x) := e^{\lambda_1 x}\, v_1, \quad \phi_2(x) := e^{\lambda_2 x}\, v_2, \tag{4.91}
\]
form a fundamental system for (4.90) (over $\mathbb{K}$).



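(ii) The matrix $A$ is diagonalizable with two complex conjugate eigenvalues $\lambda_1, \lambda_2 \in \mathbb{C} \setminus \mathbb{R}$, $\lambda_2 = \bar{\lambda}_1$. Analogous to (i), one has a basis $\{v_1, v_2\}$ of $\mathbb{C}^2$ such that $v_j$ is an eigenvector for $\lambda_j$, $j \in \{1, 2\}$, and the two functions in (4.91) still form a fundamental system for (4.90), but with $\mathbb{K}$ replaced by $\mathbb{C}$. However, one can still obtain a real-valued fundamental system as follows: We have
\[
\lambda_1 = \mu + i\omega, \quad \lambda_2 = \mu - i\omega, \quad \text{where } \mu \in \mathbb{R},\ \omega \in \mathbb{R} \setminus \{0\}. \tag{4.92}
\]
Thus, if $A v_1 = \lambda_1 v_1$ with $0 \neq v_1 = \alpha + i\beta$, where $\alpha, \beta \in \mathbb{R}^2$, then, letting $v_2 := \bar{v}_1 = \alpha - i\beta$, and taking complex conjugates (using that $A$ is real),
\[
A v_2 = A \bar{v}_1 = \overline{A v_1} = \overline{\lambda_1 v_1} = \bar{\lambda}_1 \bar{v}_1 = \lambda_2 v_2
\]
shows $v_2$ is an eigenvector with respect to $\lambda_2$. Thus, $\phi_2 = \bar{\phi}_1$ and, similar to the approach described in Rem. 4.33(b) above, we can let
\[
\begin{pmatrix} \psi_1 \\ \psi_2 \end{pmatrix} = \begin{pmatrix} \frac{1}{2} & \frac{1}{2} \\ \frac{1}{2i} & -\frac{1}{2i} \end{pmatrix} \begin{pmatrix} \phi_1 \\ \phi_2 \end{pmatrix},
\]
to obtain a fundamental system $\{\psi_1, \psi_2\}$ for (4.90) over $\mathbb{R}$, where $\psi_1, \psi_2 : \mathbb{R} \to \mathbb{R}^2$,
\begin{align*}
\psi_1(x) = \operatorname{Re}(\phi_1(x)) &= \operatorname{Re}\big( e^{(\mu + i\omega)x} (\alpha + i\beta) \big) \\
&= \operatorname{Re}\big( e^{\mu x} \big( \cos(\omega x) + i \sin(\omega x) \big) (\alpha + i\beta) \big) \\
&= e^{\mu x} \big( \alpha \cos(\omega x) - \beta \sin(\omega x) \big), \tag{4.93a} \\
\psi_2(x) = \operatorname{Im}(\phi_1(x)) &= e^{\mu x} \big( \alpha \sin(\omega x) + \beta \cos(\omega x) \big). \tag{4.93b}
\end{align*}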

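(iii) The matrix $A$ has precisely one eigenvalue $\lambda \in \mathbb{R}$ and the corresponding eigenspace is 1-dimensional. Then there is an invertible matrix $W \in \mathcal{M}(2, \mathbb{R})$ such that $B := W^{-1} A W$ is in (nondiagonal) Jordan normal form, i.e.
\[
B = W^{-1} A W = \begin{pmatrix} \lambda & 1 \\ 0 & \lambda \end{pmatrix}.
\]
According to Th. 4.49, the two functions
\[
\phi_1, \phi_2 : \mathbb{R} \to \mathbb{K}^2, \quad \phi_1(x) := e^{\lambda x} \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \quad \phi_2(x) := e^{\lambda x} \begin{pmatrix} x \\ 1 \end{pmatrix}, \tag{4.94}
\]
form a fundamental system for $y' = By$ (over $\mathbb{K}$). Thus, according to Th. 4.47, the two functions
\[
\psi_1, \psi_2 : \mathbb{R} \to \mathbb{K}^2, \quad \psi_1(x) := W\phi_1(x), \quad \psi_2(x) := W\phi_2(x), \tag{4.95}
\]
form a fundamental system for (4.90) (over $\mathbb{K}$).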


Remark 4.51. One way of finding a fundamental matrix solution for $y' = A\, y$, $A \in \mathcal{M}(n, \mathbb{C})$, is to obtain $e^{Ax}$, using the following strategy based on Jordan normal forms:

(i) Determine the distinct eigenvalues $\lambda_1, \dots, \lambda_s$, $1 \le s \le n$, of $A$, which are precisely the zeros of the characteristic polynomial $\chi_A(x) := \det(A - x\, \mathrm{Id})$ (the multiplicity of the zero $\lambda_j$ is called its algebraic multiplicity, the dimension of the eigenspace $\ker(A - \lambda_j\, \mathrm{Id})$ its geometric multiplicity).

(ii) Determine the Jordan normal form $B$ of $A$ and $W$ such that $B = W^{-1} A W$. In general, this means computing the (finitely many distinct) powers $(A - \lambda_j\, \mathrm{Id})^k$ and (suitable bases of) $\ker (A - \lambda_j\, \mathrm{Id})^k$. This is a somewhat involved procedure; we refer to [Mar04, Sections 4.2, 4.3] and [Str08, Sec. 27] for details. The lecture notes do not provide further details here, as they recommend using the Putzer algorithm described below instead.

(iii) For each Jordan block $B_j$ (as in (4.83)) of $B$, compute $e^{B_j x}$ as in (4.86).

As step (ii) above tends to be complicated in practice, it is usually easier to obtain $e^{Ax}$ using the Putzer algorithm described next.

Putzer Algorithm
The Putzer algorithm due to [Put66] is a procedure for computing $e^{Ax}$ that avoids the difficulty of determining the Jordan normal form of $A$, and, thus, is often more efficient to employ in practice than the procedure described in Rem. 4.51 above. The Putzer algorithm is provided by the following theorem:

Theorem 4.52. Let $A \in \mathcal{M}(n, \mathbb{C})$, $n \in \mathbb{N}$. If $\lambda_1, \dots, \lambda_n \in \mathbb{C}$ are precisely the eigenvalues of $A$ (not necessarily distinct, each eigenvalue occurring possibly repeatedly according to its multiplicity), then
\[
\forall_{x \in \mathbb{R}} \quad e^{Ax} = \sum_{k=0}^{n-1} p_{k+1}(x)\, M_k, \tag{4.96}
\]
where the functions $p_1, \dots, p_n : \mathbb{R} \to \mathbb{C}$ and matrices $M_0, \dots, M_{n-1} \in \mathcal{M}(n, \mathbb{C})$ are defined recursively by
\begin{align*}
p_1' &= \lambda_1 p_1, & p_1(0) &= 1, \tag{4.97a} \\
p_k' &= \lambda_k p_k + p_{k-1}, & p_k(0) &= 0 \quad \text{for } k = 2, \dots, n \tag{4.97b}
\end{align*}
(i.e. each $p_k$ is a solution to a (typically nonhomogeneous) 1-dimensional first-order linear ODE that can be solved using (2.2)) and
\begin{align*}
M_0 &:= \mathrm{Id}, \tag{4.98a} \\
M_k &:= M_{k-1} (A - \lambda_k\, \mathrm{Id}) \quad \text{for } k = 1, \dots, n - 1. \tag{4.98b}
\end{align*}

Proof. Note that (4.98) can be extended to k = n, yielding

\[ M_n = \prod_{k=1}^{n} (A - \lambda_k \operatorname{Id}) = (-1)^n \chi_A(A) = 0, \]

since each matrix annihilates its characteristic polynomial according to the Cayley-
Hamilton theorem (cf. [Koe03, Th. 8.4.6] or [Str08, Th. 26.6]). Also note

\[ \underset{k=0,\dots,n-1}{\forall} \quad A M_k = M_k (A - \lambda_{k+1} \operatorname{Id}) + \lambda_{k+1} M_k = M_{k+1} + \lambda_{k+1} M_k. \tag{4.99} \]

We have to show that x ↦ Φ(x) := Σ_{k=0}^{n−1} p_{k+1}(x) M_k solves the initial value problem
Y′ = AY, Y(0) = Id. The initial condition is satisfied, as Φ(0) = p1(0) M0 = Id, and
the ODE is satisfied, as, for each x ∈ R,

\[ \begin{aligned}
\Phi'(x) - A\Phi(x) &= \sum_{k=0}^{n-1} p_{k+1}'(x)\, M_k - A \sum_{k=0}^{n-1} p_{k+1}(x)\, M_k \\
&\overset{(4.97),\,(4.99)}{=} \lambda_1 p_1(x)\, M_0 + \sum_{k=1}^{n-1} \big( \lambda_{k+1} p_{k+1}(x) + p_k(x) \big) M_k - \sum_{k=0}^{n-1} p_{k+1}(x) \big( M_{k+1} + \lambda_{k+1} M_k \big) \\
&= -p_n(x)\, M_n = 0,
\end{aligned} \]

completing the proof. ∎
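The recursion of Th. 4.52 translates almost literally into code. The following Python sketch is an illustration only and not part of the original notes: it assumes the eigenvalues are supplied by the caller, it integrates (4.97) numerically with the classical Runge-Kutta scheme instead of solving it in closed form via (2.2), and all function names (`putzer_matrices`, `putzer_coefficients`, `putzer_exp`, `series_exp`) are hypothetical. The result is compared with a truncated power series for e^{Ax}.

```python
def mat_mul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def putzer_matrices(A, eigenvalues):
    """M_0 = Id, M_k = M_{k-1} (A - lambda_k Id) for k = 1, ..., n-1, cf. (4.98)."""
    n = len(A)
    Ms = [identity(n)]
    for k in range(1, n):
        lam = eigenvalues[k - 1]  # lambda_k in the 1-based numbering of the theorem
        shifted = [[A[i][j] - (lam if i == j else 0.0) for j in range(n)] for i in range(n)]
        Ms.append(mat_mul(Ms[-1], shifted))
    return Ms

def putzer_coefficients(eigenvalues, x, steps=2000):
    """Approximate [p_1(x), ..., p_n(x)] from (4.97) using classical RK4 on [0, x]."""
    n = len(eigenvalues)

    def rhs(p):
        # p[k] stores p_{k+1}; p_1' = lambda_1 p_1, p_{k+1}' = lambda_{k+1} p_{k+1} + p_k
        return [eigenvalues[k] * p[k] + (p[k - 1] if k > 0 else 0.0) for k in range(n)]

    p = [1.0] + [0.0] * (n - 1)  # initial conditions p_1(0) = 1, p_k(0) = 0 otherwise
    h = x / steps
    for _ in range(steps):
        k1 = rhs(p)
        k2 = rhs([p[i] + 0.5 * h * k1[i] for i in range(n)])
        k3 = rhs([p[i] + 0.5 * h * k2[i] for i in range(n)])
        k4 = rhs([p[i] + h * k3[i] for i in range(n)])
        p = [p[i] + h / 6.0 * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) for i in range(n)]
    return p

def putzer_exp(A, eigenvalues, x):
    """e^{Ax} = sum_{k=0}^{n-1} p_{k+1}(x) M_k, cf. (4.96)."""
    n = len(A)
    Ms = putzer_matrices(A, eigenvalues)
    p = putzer_coefficients(eigenvalues, x)
    return [[sum(p[k] * Ms[k][i][j] for k in range(n)) for j in range(n)] for i in range(n)]

def series_exp(A, x, terms=40):
    """Reference value: truncated power series sum_m (Ax)^m / m!."""
    n = len(A)
    result, term = identity(n), identity(n)
    for m in range(1, terms):
        term = mat_mul(term, [[A[i][j] * x / m for j in range(n)] for i in range(n)])
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

# A single Jordan block with the double eigenvalue 2.
A = [[2.0, 1.0], [0.0, 2.0]]
eigs = [2.0, 2.0]
E_putzer = putzer_exp(A, eigs, 1.0)
E_series = series_exp(A, 1.0)
max_diff = max(abs(E_putzer[i][j] - E_series[i][j]) for i in range(2) for j in range(2))
```

For this A, one has M1 = A − 2 Id, p1(x) = e^{2x}, and p2(x) = x e^{2x}, so both computations should agree with e^{2x}(Id + x M1) up to discretization error. Complex eigenvalues can be passed as Python complex numbers.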

5 Stability

5.1 Qualitative Theory, Phase Portraits


In the qualitative theory of ODE, which can be seen as part of the field of dynamical
systems, the idea is to understand the set of solutions to an ODE (or to a class of ODE),
if possible, without making use of explicit solution formulas, which, in most situations,
are not available anyway. Examples of qualitative questions are whether, and under which
conditions, solutions to an ODE are constant, periodic, unbounded, or approach some
limit (more generally, the solutions' asymptotic behavior), etc. One often thinks of the
solutions as depending on a time-like variable, and then qualitative theory typically
means disregarding the speed of change and focusing on the shape/geometry of
the solution's image instead.
The topic of stability takes continuity in initial conditions further and investigates the
behavior of solutions that are, at least initially, close to some given solution: Under
which conditions do nearby solutions approach each other or diverge from each
other, and do they show the same or different asymptotic behavior?
Even though the considerations described above are not limited to this situation, a natural
starting point is to consider first-order ODE where the right-hand side does not
depend on x. In the following, we will mostly be concerned with this type of ODE,
which has a special name:

Definition 5.1. If Ω ⊆ K^n, n ∈ N, and f : Ω −→ K^n, then the n-dimensional first-order
ODE

\[ y' = f(y) \tag{5.1} \]

is called autonomous and Ω is called the phase space.
Remark 5.2. In fact, nonautonomous ODE are not really more general than autonomous
ODE, due to the perhaps surprising Th. J.1 of the Appendix, which states
that every nonautonomous ODE is equivalent to an autonomous ODE. However, this fact
is of little practical relevance, since the autonomous ODE arising via Th. J.1 from
nonautonomous ODE can never have bounded solutions on unbounded intervals, whereas the
theory of autonomous ODE is most powerful and useful for ODE that admit bounded
solutions on unbounded intervals (such as constant or periodic solutions, or solutions
approaching constant or periodic functions).
Lemma 5.3. If, in the context of Def. 5.1, φ : I −→ K^n is a solution to (5.1), defined
on the interval I ⊆ R, then

\[ \underset{\xi \in \mathbb{R}}{\forall} \quad \phi_\xi : I - \xi \to \mathbb{K}^n, \quad \phi_\xi(x) := \phi(x + \xi), \quad \text{where } I - \xi := \{x - \xi \in \mathbb{R} : x \in I\}, \tag{5.2} \]

is another solution to (5.1). In consequence, if φ is a maximal solution, then so is φξ.

Proof. Clearly, I − ξ is an interval. Note x ∈ I − ξ ⇒ x + ξ ∈ I and, since φ is a
solution to (5.1), we have φ(I) ⊆ Ω, implying φξ(I − ξ) ⊆ Ω. Finally,

\[ \underset{x \in I - \xi}{\forall} \quad \phi_\xi'(x) = \phi'(x + \xi) = f\big(\phi(x + \xi)\big) = f\big(\phi_\xi(x)\big), \]

completing the proof that φξ is a solution. Since each extension of φ yields an extension
of φξ and vice versa, φ is a maximal solution if, and only if, φξ is a maximal solution. ∎
Lemma 5.4. If Ω ⊆ K^n, n ∈ N, and f : Ω −→ K^n is such that (5.1) admits unique
maximal solutions (f being locally Lipschitz on Ω open is sufficient, but not necessary,
cf. Def. 3.32), then the global solution Y : Df −→ K^n of (5.1) satisfies

(a) Y(x, ξ, η) = Y(x − ξ, 0, η) for each (x, ξ, η) ∈ Df.

(b) Y(x, 0, Y(x̃, 0, η)) = Y(x + x̃, 0, η) for each (x, x̃, η) ∈ R × R × K^n such that
(x̃, 0, η) ∈ Df and (x, 0, Y(x̃, 0, η)) ∈ Df.

Proof. (a): If ψ : I_{ξ,η} −→ K^n and φ : I_{0,η} −→ K^n denote the maximal solutions to
the initial data y(ξ) = η and y(0) = η, respectively, then (a) claims, using the notation
from Lem. 5.3, ψ = φ_{−ξ}. As a consequence of Lem. 5.3, φ_{−ξ} : I_{0,η} + ξ −→ K^n is some
maximal solution to (5.1) and, since φ_{−ξ}(ξ) = φ(0) = η = ψ(ξ), the assumed uniqueness
yields the claimed ψ = φ_{−ξ}, in particular, I_{ξ,η} = I_{0,η} + ξ.
(b): Let η̃ := Y(x̃, 0, η). If ψ : I_{0,η̃} −→ K^n and φ : I_{0,η} −→ K^n denote the maximal
solutions to the initial data y(0) = η̃ and y(0) = η, respectively, then (b) claims ψ = φ_{x̃}.
As a consequence of Lem. 5.3, φ_{x̃} : I_{0,η} − x̃ −→ K^n is some maximal solution to (5.1)
and, since φ_{x̃}(0) = φ(x̃) = η̃ = ψ(0), the assumed uniqueness yields the claimed ψ = φ_{x̃},
in particular, I_{0,η̃} = I_{0,η} − x̃. ∎
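Lemma 5.4(b) can be checked by hand on an ODE with an explicit flow. The following Python sketch is an illustration only: it uses the flow Y(x, 0, η) = η/(1 − ηx) of y′ = y², which is the formula stated in Example 5.27(c) below, and it uses exact rational arithmetic so that the identity is verified without rounding. The function name `flow` and the chosen sample values are hypothetical.

```python
from fractions import Fraction

def flow(x, eta):
    """Y(x, 0, eta) for y' = y^2; defined precisely while 1 - eta * x > 0."""
    denom = 1 - eta * x
    if denom <= 0:
        raise ValueError("x lies outside the maximal interval of existence")
    return eta / denom

eta = Fraction(1, 2)      # initial value at x = 0; the maximal interval is ]-inf, 2[
x_tilde = Fraction(1, 2)  # first flow time
x = Fraction(1)           # second flow time

eta_tilde = flow(x_tilde, eta)  # Y(x~, 0, eta)
lhs = flow(x, eta_tilde)        # Y(x, 0, Y(x~, 0, eta))
rhs = flow(x + x_tilde, eta)    # Y(x + x~, 0, eta)
```

Here `eta_tilde` equals 2/3, so the second flow step is admissible for x < 3/2, in agreement with I_{0,η̃} = I_{0,η} − x̃ from the proof above.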

Definition 5.5. Let I ⊆ R be an interval and φ : I −→ S (in principle, S can be
arbitrary).

(a) The image of I under φ, i.e.

\[ \mathcal{O}(\phi) := \phi(I) = \{\phi(x) : x \in I\} \subseteq S \tag{5.3} \]

is often referred to as the orbit of φ in the present context of qualitative ODE theory.

(b) φ : R −→ S (note I = R) is called periodic if, and only if, there exists a smallest
ω > 0 (called the period of φ) such that

\[ \underset{x \in \mathbb{R}}{\forall} \quad \phi(x + \omega) = \phi(x). \tag{5.4} \]

Since a smallest such ω > 0 is required, constant functions are not periodic in the
sense of this definition.
Lemma 5.6. Let φ : R −→ Kn , n ∈ N.

(a) If φ is continuous and (5.4) holds for some ω > 0, then φ is either constant or
periodic in the sense of Def. 5.5(b).

(b) (a) is false without the assumption of φ being continuous.

Proof. Exercise. 
Definition 5.7. Let Ω ⊆ Kn , n ∈ N, and f : Ω −→ Kn . In the context of the
autonomous ODE (5.1), the zeros of f are called the fixed points of the ODE (5.1) (cf.
Lem. 5.8 below). One then sometimes uses the notation

F := F_f := {η ∈ Ω : f(η) = 0} (5.5)

for the set of fixed points.


Lemma 5.8. Let Ω ⊆ Kn , n ∈ N, f : Ω −→ Kn , η ∈ Ω. Then the following statements
are equivalent:

(i) f (η) = 0, i.e. η is a fixed point of (5.1).

(ii) φ : R −→ Kn , φ ≡ η, is a solution to (5.1).

Proof. If f (η) = 0 and φ ≡ η, then φ′ (x) = 0 = f (φ(x)) for each x ∈ R, i.e. (i) implies
(ii). Conversely, if φ ≡ η is a solution to (5.1), then f (η) = f (φ(x)) = φ′ (x) = 0, i.e. (ii)
implies (i). 
Proposition 5.9. If Ω ⊆ Kn , n ∈ N, and f : Ω −→ Kn is such that (5.1) admits
unique maximal solutions (f being locally Lipschitz on Ω open is sufficient), then, for
maximal solutions φ1 : I1 −→ Kn , φ2 : I2 −→ Kn to (5.1), defined on open intervals
I1 , I2 , respectively, precisely one of the following two statements (i) and (ii) is true:

(i) O(φ1) ∩ O(φ2) = ∅, i.e. the solutions have disjoint orbits.

(ii) There exists ξ ∈ R such that

\[ I_2 = I_1 - \xi \quad \text{and} \quad \underset{x \in I_2}{\forall} \quad \phi_2(x) = \phi_1(x + \xi). \tag{5.6} \]

In particular, it follows in this case that O(φ1) = O(φ2), i.e. the solutions have
the same orbit.

Proof. Suppose (i) does not hold. Then there are x1 ∈ I1 and x2 ∈ I2 such that
φ1 (x1 ) = φ2 (x2 ). Define ξ := x1 − x2 and consider
φ : I1 − ξ −→ Kn , φ(x) := φ1 (x + ξ). (5.7)
Then φ is a maximal solution of (5.1) by Lem. 5.3 and φ(x2 ) = φ1 (x1 ) = φ2 (x2 ). By
uniqueness of maximal solutions, we obtain φ = φ2 , in particular, I2 = I1 − ξ, proving
(5.6). Clearly, (5.6) implies O(φ1 ) = O(φ2 ). 
Proposition 5.10. If Ω ⊆ Kn , n ∈ N, and f : Ω −→ Kn is such that (5.1) admits
unique maximal solutions (f being locally Lipschitz on Ω open is sufficient), then, for
each maximal solution φ : I −→ Kn to (5.1), defined on the open interval I, precisely
one of the following three statements is true:

(i) φ is injective.
(ii) I = R and φ is periodic.
(iii) I = R and φ is constant (in this case η := φ(0) is a fixed point of (5.1)).

Proof. Clearly, (i) – (iii) are mutually exclusive. Suppose (i) does not hold. Then there
exist x1 , x2 ∈ I, x1 < x2 , such that φ(x1 ) = φ(x2 ). Set ω := x2 − x1 . According to Lem.
5.3, ψ : I − ω −→ Kn , ψ(x) := φ(x + ω), must also be a maximal solution to (5.1).
Since ψ(x1 ) = φ(x1 + ω) = φ(x2 ) = φ(x1 ), uniqueness implies ψ = φ and I = I − ω.
As ω > 0, this means I = R and the validity of (5.4). As φ is also continuous, by Lem.
5.6(a), either (ii) or (iii) must hold. 
Corollary 5.11. If Ω ⊆ Kn , n ∈ N, and f : Ω −→ Kn is such that (5.1) admits unique
maximal solutions (f being locally Lipschitz on Ω open is sufficient), then the orbits
of maximal solutions to (5.1) partition the phase space Ω into disjoint sets. Moreover,
every point η ∈ Ω is either a fixed point, or it belongs to some periodic orbit, or it belongs
to the orbit of some injective solution.

Proof. The corollary merely summarizes Prop. 5.9 and Prop. 5.10. 
Definition 5.12. In the situation of Cor. 5.11, a phase portrait for (5.1) is a sketch
showing representative orbits. Thus, the sketch shows subsets of the phase space Ω,
including fixed points (if any) and representative periodic solutions (if any). Usually,
one also uses arrows to indicate the direction in which each drawn orbit is traced as the
variable x increases.

Example 5.13. Even though it is a main goal of qualitative theory to obtain phase
portraits without the need of explicit solution formulas, and we will study techniques
for accomplishing this below, we will make use of explicit solution formulas for our first
two examples of phase portraits.

(a) Consider the autonomous linear ODE

\[ \begin{pmatrix} y_1' \\ y_2' \end{pmatrix} = \begin{pmatrix} -y_2 \\ y_1 \end{pmatrix}. \tag{5.8} \]

Here we have Ω = R^2 and f : Ω −→ Ω, f(y1, y2) = (−y2, y1). The only fixed point
is (0, 0). Clearly, for each r > 0, φ : R −→ R^2, φ(x) := (r cos x, r sin x) is a solution
to (5.8) and its orbit is the circle with radius r around the origin. Since every point
of Ω other than the origin belongs to such a circle, every orbit is either the origin or
a circle around the origin. Thus, the phase portrait consists of such circles plus the
origin and arrows that indicate the circles are traversed counterclockwise.

(b) As compared to the previous one, the phase portrait of the autonomous linear ODE

\[ \begin{pmatrix} y_1' \\ y_2' \end{pmatrix} = \begin{pmatrix} y_2 \\ y_1 \end{pmatrix} \tag{5.9} \]

is more complicated: While (0, 0) is still the only fixed point, for each r > 0, all the
following functions φ1, φ2, φ3, φ4 : R −→ R^2 are solutions:

\[ \phi_1(x) := (r \cosh x,\; r \sinh x), \tag{5.10a} \]
\[ \phi_2(x) := (-r \cosh x,\; -r \sinh x), \tag{5.10b} \]
\[ \phi_3(x) := (r \sinh x,\; r \cosh x), \tag{5.10c} \]
\[ \phi_4(x) := (-r \sinh x,\; -r \cosh x), \tag{5.10d} \]

each type describing a hyperbolic orbit in some section of the plane R^2. These
sections are separated by rays, forming the orbits of the solutions φ5, φ6, φ7, φ8 :
R −→ R^2:

\[ \phi_5(x) := (e^{x},\; e^{x}), \tag{5.10e} \]
\[ \phi_6(x) := (-e^{x},\; -e^{x}), \tag{5.10f} \]
\[ \phi_7(x) := (e^{-x},\; -e^{-x}), \tag{5.10g} \]
\[ \phi_8(x) := (-e^{-x},\; e^{-x}). \tag{5.10h} \]

The two rays on {(y1, y1) : y1 ≠ 0} move away from the origin, whereas the two rays
on {(y1, −y1) : y1 ≠ 0} move toward the origin. The hyperbolic orbits asymptotically
approach the ray orbits and are traversed such that the flow direction agrees
between approaching orbits.
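That the eight functions in (5.10) solve (5.9) can be verified by differentiating; the following Python sketch performs a numerical sanity check instead. It is an illustration only: the derivative is approximated by central differences, the radius r = 1.5 and the sample points are arbitrary choices, and the names `solutions`, `ode_residual`, and `level` are hypothetical. The sketch also evaluates y1² − y2² along each solution, which identifies the section of the plane containing the respective orbit.

```python
import math

r = 1.5  # arbitrary radius parameter, r > 0

# Candidate solutions (5.10a)-(5.10h) of y1' = y2, y2' = y1.
solutions = {
    "phi1": lambda x: (r * math.cosh(x), r * math.sinh(x)),
    "phi2": lambda x: (-r * math.cosh(x), -r * math.sinh(x)),
    "phi3": lambda x: (r * math.sinh(x), r * math.cosh(x)),
    "phi4": lambda x: (-r * math.sinh(x), -r * math.cosh(x)),
    "phi5": lambda x: (math.exp(x), math.exp(x)),
    "phi6": lambda x: (-math.exp(x), -math.exp(x)),
    "phi7": lambda x: (math.exp(-x), -math.exp(-x)),
    "phi8": lambda x: (-math.exp(-x), math.exp(-x)),
}

def ode_residual(phi, x, step=1e-5):
    """Largest component of phi'(x) - f(phi(x)), with phi' from central differences."""
    plus, minus = phi(x + step), phi(x - step)
    d1 = (plus[0] - minus[0]) / (2.0 * step)
    d2 = (plus[1] - minus[1]) / (2.0 * step)
    y1, y2 = phi(x)
    return max(abs(d1 - y2), abs(d2 - y1))  # f(y1, y2) = (y2, y1)

sample_points = (-1.0, 0.0, 0.7)
worst_residual = max(ode_residual(phi, x)
                     for phi in solutions.values() for x in sample_points)

def level(phi, x):
    """Value of y1^2 - y2^2 at phi(x)."""
    y1, y2 = phi(x)
    return y1 ** 2 - y2 ** 2

levels = {name: level(phi, 0.7) for name, phi in solutions.items()}
```

In `levels`, the entries for φ1 and φ2 should be close to r², those for φ3 and φ4 close to −r², and those for the ray solutions φ5 to φ8 equal to 0.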



The next results will be useful to obtain new phase portraits from previously known
phase portraits in certain situations.
Proposition 5.14. Let Ω ⊆ K^n, n ∈ N, let I ⊆ R be some nontrivial interval, let
f : Ω −→ K^n, and let φ : I −→ K^n be a solution to

\[ y' = \gamma(x)\, f(y), \tag{5.11} \]

where γ : I −→ R is continuous. If φ′(x) ≠ 0 for each x ∈ I (if one thinks of x as
time, then one can think of φ′ as the velocity of φ), then there exists a continuously
differentiable bijective map λ : J −→ I, defined on some nontrivial interval J, such that
(φ ◦ λ) : J −→ K^n is a solution to y′ = f(y).

Proof. Since φ′(x) ≠ 0 for each x ∈ I, one has γ(x) ≠ 0 for each x ∈ I. As γ is also
continuous, it must be either always negative or always positive. In consequence, fixing
x0 ∈ I, Γ : I −→ R, Γ(x) := ∫_{x0}^{x} γ(t) dt, is continuous and either strictly increasing or
strictly decreasing. In particular, Γ is injective, J := Γ(I) is an interval, and Γ : I −→ J
is bijective. The desired function λ is λ := Γ^{−1} : J −→ I. Indeed, according to [Phi16,
Th. 9.9], λ is differentiable and its derivative is the continuous function

\[ \lambda' : J \to \mathbb{R}, \quad \lambda'(x) = \frac{1}{\gamma(\lambda(x))}, \]

implying, for each x ∈ J,

\[ (\phi \circ \lambda)'(x) = \phi'(\lambda(x))\, \lambda'(x) = \gamma(\lambda(x))\, f\big(\phi(\lambda(x))\big)\, \frac{1}{\gamma(\lambda(x))} = f\big(\phi(\lambda(x))\big), \]

showing φ ◦ λ is a solution to y′ = f(y) as required. ∎
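The construction in the proof of Prop. 5.14 can be followed explicitly in a simple case. The data below (I, f, γ, φ, and the base point x0 = 1) are chosen purely for illustration and do not come from the notes.

```latex
% Illustrative data (assumed): I := ]0,\infty[, f(y) := y, \gamma(x) := 2x, x_0 := 1.
\begin{align*}
  \phi(x) &:= e^{x^2}, &
  \phi'(x) &= 2x\, e^{x^2} = \gamma(x)\, f(\phi(x)) \neq 0 \quad \text{for each } x \in I, \\
  \Gamma(x) &= \int_1^x 2t \, dt = x^2 - 1, &
  J &= \Gamma(I) = {]-1, \infty[}, \\
  \lambda(s) &= \Gamma^{-1}(s) = \sqrt{s + 1}, &
  \lambda'(s) &= \frac{1}{2\sqrt{s+1}} = \frac{1}{\gamma(\lambda(s))}, \\
  (\phi \circ \lambda)(s) &= e^{s+1}, &
  (\phi \circ \lambda)'(s) &= e^{s+1} = f\big((\phi \circ \lambda)(s)\big).
\end{align*}
```

Thus φ ◦ λ solves y′ = y on J, and φ and φ ◦ λ have the same orbit ]1, ∞[; only the speed at which the orbit is traversed has changed.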


Proposition 5.15. Let Ω ⊆ Kn , n ∈ N, and f : Ω −→ Kn . Moreover, consider a
continuous function h : Ω −→ R with the property that either h > 0 everywhere on Ω
or h < 0 everywhere on Ω.

(a) If f has no zeros (i.e. F = ∅), then the ODE

\[ y' = f(y), \tag{5.12a} \]
\[ y' = h(y)\, f(y) \tag{5.12b} \]

have precisely the same orbits, i.e. every orbit of a solution to (5.12a) is an orbit
of a solution to (5.12b) and vice versa.

(b) If f and h are such that the ODE (5.12) admit unique maximal solutions, then the
ODE (5.12) have precisely the same orbits (even if F ≠ ∅).

Proof. (a): If φ : I −→ K^n is a solution to (5.12b), then γ := h ◦ φ is well-defined
and continuous. Since F = ∅ implies φ′ ≠ 0, we can apply Prop. 5.14 to obtain
the existence of a bijective λ1 : J1 −→ I such that φ ◦ λ1 is a solution to (5.12a).
Thus, O(φ) = O(φ ◦ λ1). Conversely, if ψ : I −→ K^n is a solution to (5.12a), i.e. to
y′ = (h(y)/h(y)) f(y), then γ := 1/(h ◦ ψ) is well-defined and continuous. Since F = ∅ implies
ψ′ ≠ 0, we can apply Prop. 5.14 (with f replaced by h f) to obtain the existence of a
bijective λ2 : J2 −→ I such that ψ ◦ λ2 is a solution to (5.12b). Thus, O(ψ) = O(ψ ◦ λ2).
(b): We are now in the situation of Prop. 5.10 and Cor. 5.11, and from (a) we know every
nonconstant orbit of (5.12a) is a nonconstant orbit of (5.12b) and vice versa. However,
since h > 0 or h < 0, both ODE in (5.12) have precisely the same constant solutions,
concluding the proof. ∎

Remark 5.16. We apply Prop. 5.15 to phase portraits (in particular, assume unique
maximal solutions). Prop. 5.15 says that overall multiplication with a continuous positive
function h does not change the phase portrait at all. Moreover, Prop. 5.15 also
states that overall multiplication with a continuous negative function h does not change
the partition of Ω into solution orbits. However, after multiplication with a negative h,
the orbits are clearly traversed in the opposite direction, i.e., for negative h, the arrows
in the phase portrait have to be reversed. For a general continuous h, this implies the
phase portrait remains the same in each region of Ω, where h > 0; it remains the same,
except for the arrows reversed, in each region of Ω, where h < 0; and the zeros of h
add additional fixed points, cutting some of the previous orbits. We summarize how to
obtain the phase portrait of (5.12b) from that of (5.12a):

(1) Start with the phase portrait of (5.12a).

(2) Add the zeros of h as additional fixed points (if any). Previous orbits are cut, where
fixed points are added.

(3) Reverse the arrows where h < 0.

Example 5.17. (a) Consider the ODE

\[ \begin{aligned}
y_1' &= -y_2 \big( (y_1 - 1)^2 + y_2^2 \big), \\
y_2' &= y_1 \big( (y_1 - 1)^2 + y_2^2 \big),
\end{aligned} \tag{5.13} \]

which comes from multiplying the right-hand side of (5.8) by h(y) = (y1 − 1)^2 + y2^2.
The phase portrait is the same as the one for (5.8), except for the added fixed point
at (1, 0).

(b) Consider the ODE

\[ \begin{aligned}
y_1' &= -y_1 y_2 + y_2^2, \\
y_2' &= -y_1 y_2 + y_1^2,
\end{aligned} \tag{5.14} \]

which comes from multiplying the right-hand side of (5.8) by h(y) = y1 − y2. The
phase portrait is obtained from that of (5.8), where additional fixed points are on
the line with y1 = y2. This line cuts each previously circular orbit into two segments.
The arrows have to be reversed for y2 > y1, that means above the y1 = y2 line.
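The effect described in Example 5.17(a) can be observed numerically. The following Python sketch is an illustration only: it integrates (5.8) and (5.13) from the same starting point (2, 0) with a hand-written classical Runge-Kutta scheme (the names `rk4`, `f`, `h`, `hf` are hypothetical, and the step count is an arbitrary choice) and checks that both approximate solutions stay on the circle of radius 2, even though they move along it at different speeds. It also checks at a sample point that (5.14) is indeed (y1 − y2) times the right-hand side of (5.8).

```python
def f(y):
    """Right-hand side of (5.8)."""
    return (-y[1], y[0])

def h(y):
    """Scaling factor of Example 5.17(a); it vanishes only at (1, 0)."""
    return (y[0] - 1.0) ** 2 + y[1] ** 2

def hf(y):
    """Right-hand side of (5.13), i.e. h(y) * f(y)."""
    s, fy = h(y), f(y)
    return (s * fy[0], s * fy[1])

def rk4(rhs, y, x_end, steps):
    """Classical Runge-Kutta integration of y' = rhs(y) from x = 0 to x = x_end."""
    dx = x_end / steps
    for _ in range(steps):
        k1 = rhs(y)
        k2 = rhs((y[0] + 0.5 * dx * k1[0], y[1] + 0.5 * dx * k1[1]))
        k3 = rhs((y[0] + 0.5 * dx * k2[0], y[1] + 0.5 * dx * k2[1]))
        k4 = rhs((y[0] + dx * k3[0], y[1] + dx * k3[1]))
        y = (y[0] + dx / 6.0 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
             y[1] + dx / 6.0 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))
    return y

start = (2.0, 0.0)
end_plain = rk4(f, start, 1.0, 2000)    # solution of (5.8) at x = 1
end_scaled = rk4(hf, start, 1.0, 2000)  # solution of (5.13) at x = 1
radius_sq_plain = end_plain[0] ** 2 + end_plain[1] ** 2
radius_sq_scaled = end_scaled[0] ** 2 + end_scaled[1] ** 2

# Consistency check for Example 5.17(b) at the sample point (0.3, -1.2).
y1, y2 = 0.3, -1.2
rhs_514 = (-y1 * y2 + y2 ** 2, -y1 * y2 + y1 ** 2)
scaled_58 = ((y1 - y2) * (-y2), (y1 - y2) * y1)
```

Both squared radii should remain close to 4, while `end_plain` and `end_scaled` are different points of the circle.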

Definition 5.18. Let Ω ⊆ R^n, n ∈ N, and f : Ω −→ R^n. A function E : Ω −→ R is
called an integral for the autonomous ODE (5.1), i.e. for y′ = f(y), if, and only if, E ◦ φ
is constant for every solution φ of (5.1).
Lemma 5.19. Let Ω ⊆ R^n be open, n ∈ N, and f : Ω −→ R^n such that each initial
value problem for (5.1) has at least one solution (f continuous is sufficient by Th. 3.8).
Then a differentiable function E : Ω −→ R is an integral for (5.1) if, and only if,

\[ \underset{y \in \Omega}{\forall} \quad (\nabla E)(y) \bullet f(y) = \sum_{j=1}^{n} \partial_j E(y)\, f_j(y) = 0. \tag{5.15} \]

Proof. Let φ : I −→ R^n be a solution to y′ = f(y). Then, by the chain rule,

\[ \underset{x \in I}{\forall} \quad (E \circ \phi)'(x) = (\nabla E)(\phi(x)) \bullet \phi'(x) = (\nabla E)(\phi(x)) \bullet f(\phi(x)). \tag{5.16} \]

The differentiable function E ◦ φ : I −→ R is constant on the interval I if, and only if,
(E ◦ φ)′ ≡ 0. Thus, by (5.16), E ◦ φ being constant for every solution φ is equivalent to
(∇E)(y) • f(y) = 0 for each y ∈ Ω such that at least one solution passes through y. ∎

Example J.2 of the Appendix, pointed out by Anton Sporrer, shows the hypothesis of
Lem. 5.19, that each initial value problem for (5.1) has at least one solution, cannot be
omitted. The following Prop. 5.20 makes use of integrals and applies to phase portraits
of 2-dimensional real ODE:
Proposition 5.20. Let Ω ⊆ R^2 be open, and let f : Ω −→ R^2 be continuous and
such that (5.1) admits unique maximal solutions (f being locally Lipschitz is sufficient).
Assume E : Ω −→ R to be a continuously differentiable integral for (5.1), i.e. for
y′ = f(y), satisfying ∇E(y) ≠ 0 for each y ∈ Ω. Then the following statements hold
for each maximal solution φ : I −→ R^2 of (5.1) (I ⊆ R some open interval):

(a) If (xm)m∈N is a sequence in I such that lim_{m→∞} φ(xm) = η ∈ Ω, then η ∈ F (i.e. η
is a fixed point) or η ∈ O(φ) (i.e. there exists ξ ∈ I with φ(ξ) = η).

(b) Let C ∈ R be such that E ◦ φ ≡ C (such a C exists, as E is an integral). If
E^{−1}{C} = {y ∈ Ω : E(y) = C} is compact and E^{−1}{C} ∩ F = ∅, then φ is
periodic.

Proof. Throughout the proof let C be as in (b), i.e. E ◦ φ ≡ C.

(a): The continuity of E yields E(η) = lim_{m→∞} E(φ(xm)) = C. Moreover, by hypothesis,
(ε1, ε2) := ∇E(η) ≠ (0, 0). We proceed with the proof for ε2 ≠ 0; if ε2 = 0
and ε1 ≠ 0, then the roles of the indices 1, 2 have to be switched in the following. We
apply the implicit function theorem [Phi15, Th. C.9] to the function f̃ : Ω −→ R,
f̃(y) := E(y) − C at its zero η = (η1, η2). By [Phi15, Th. C.9], there exist ε, δ > 0 and a
continuously differentiable map g : Ig −→ R, Ig := ]η1 − δ, η1 + δ[, such that g(η1) = η2,

\[ \underset{s \in I_g}{\forall} \quad E\big(s, g(s)\big) = C, \tag{5.17a} \]

and, having fixed some arbitrary norm ‖·‖ on R^2,

\[ \underset{y \in \Omega}{\forall} \quad \Big( \|y - \eta\| < \epsilon \;\wedge\; E(y) = C \;\Rightarrow\; \underset{s \in I_g}{\exists} \; y = \big(s, g(s)\big) \Big). \tag{5.17b} \]

We now assume η ∉ F and show η ∈ O(φ). If η ∉ F, then f(η) ≠ 0 and the continuity
of f and g imply there is δ̃ > 0, δ̃ ≤ δ, such that, for each s ∈ Ĩ := ]η1 − δ̃, η1 + δ̃[,
f(s, g(s)) ≠ 0. Define the auxiliary function ϕ : Ĩ −→ Ω, ϕ(s) = (s, g(s)). Since
E ◦ ϕ ≡ C, we can employ the chain rule to conclude

\[ \underset{s \in \tilde{I}}{\forall} \quad 0 = (E \circ \varphi)'(s) = (\nabla E)(\varphi(s)) \bullet \varphi'(s), \tag{5.18} \]

i.e. the two-dimensional vectors (∇E)(ϕ(s)) and ϕ′(s) are orthogonal with respect to
the Euclidean scalar product. As E is an integral, using (5.15), f(ϕ(s)) is another vector
orthogonal to (∇E)(ϕ(s)) and, since all vectors in R^2 orthogonal to (∇E)(ϕ(s)) form
a 1-dimensional subspace of R^2 (recalling (∇E)(ϕ(s)) ≠ 0), there exists γ(s) ∈ R such
that

\[ \varphi'(s) = \gamma(s)\, f(\varphi(s)) \tag{5.19} \]

(note f(ϕ(s)) ≠ 0 as s ∈ Ĩ). We can now apply Prop. 5.14, since (5.19) says ϕ is a
solution to (5.11), the function γ : Ĩ −→ R, s ↦ γ(s) = (ϕ′(s) • f(ϕ(s))) / (f(ϕ(s)) • f(ϕ(s)))
is continuous, and ϕ′(s) = (1, g′(s)) ≠ (0, 0) for each s ∈ Ĩ. Thus, Prop. 5.14 provides a
bijective λ : J −→ Ĩ such that ϕ ◦ λ is a solution to y′ = f(y).
As we assume lim_{m→∞} φ(xm) = η, there exists M ∈ N such that ‖φ(xm) − η‖ < ε for each
m ≥ M. Since E(φ(xm)) = C also holds, (5.17b) implies the existence of a sequence
(sm)m∈N in Ĩ such that φ(xm) = (sm, g(sm)) for each m ≥ M. Then, for each m ≥ M
and τm := λ^{−1}(sm), (ϕ ◦ λ)(τm) = ϕ(sm) = φ(xm). On the other hand, for τ0 := λ^{−1}(η1),
(ϕ ◦ λ)(τ0) = ϕ(η1) = η, showing φ(xm), η ∈ O(ϕ ◦ λ). Since φ(xm) ∈ O(φ) as well,
Prop. 5.9 implies O(ϕ ◦ λ) ⊆ O(φ), i.e. η ∈ O(φ), which proves (a). In preparation for
(b), we also observe that ‖φ(xm) − η‖ < ε for each m ≥ M implies the sm for m ≥ M
all are in some compact interval I1 with η1 ∈ I1, implying the τm to be in the compact
interval J1 := λ^{−1}[I1] with τ0 ∈ J1. We will use for (b) that J1 is bounded.
(b): As we have O(φ) ⊆ E^{−1}{C} according to the choice of C, the assumed compactness
of E^{−1}{C} and Prop. 3.24 show φ can only be maximal if it is defined on all of R (since
(x, φ(x)) must escape every compact [−m, m] × E^{−1}{C}, m ∈ N, on the left and on the
right). Using the compactness of E^{−1}{C} a second time, we obtain the existence of a
sequence (xm)m∈N in R such that lim_{m→∞} xm = ∞ and lim_{m→∞} φ(xm) = η ∈ E^{−1}{C}.
So we see that we are in the situation of (a). Let ψ be the maximal extension of the
solution ϕ ◦ λ constructed in the proof of (a). Then we know O(ψ) ∩ O(φ) ≠ ∅ from the
proof of (a) and, since ψ and φ both are maximal, Prop. 5.9 implies O(ψ) = O(φ) and,
more importantly for us here, there exists ξ ∈ R such that ψ(x) = φ(x + ξ) for each x ∈ R.
Let m ≥ M with M from the proof of (a). If ξ ≠ 0, then φ(xm) = ψ(τm) = φ(τm + ξ)
shows φ is not injective as soon as xm ≠ τm + ξ, which holds for all sufficiently large m,
since the τm are bounded, whereas the xm are unbounded. If ξ = 0, then φ = ψ and
φ(xm) = φ(τm). Since the τm are bounded, whereas the xm are unbounded, xm = τm
cannot be true for all m, again showing φ is not injective. Since E^{−1}{C} ∩ F = ∅, φ
cannot be constant, therefore it must be periodic by Prop. 5.10. ∎

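Example 5.21. Using condition (5.15), i.e. ∇E • f ≡ 0, one readily verifies that the
functions

\[ E : \mathbb{R}^2 \to \mathbb{R}, \quad E(y_1, y_2) := y_1^2 + y_2^2, \tag{5.20a} \]
\[ E : \mathbb{R}^2 \to \mathbb{R}, \quad E(y_1, y_2) := y_1^2 - y_2^2, \tag{5.20b} \]

are integrals for (5.8) and (5.9), i.e. for

\[ \begin{pmatrix} y_1' \\ y_2' \end{pmatrix} = \begin{pmatrix} -y_2 \\ y_1 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} y_1' \\ y_2' \end{pmatrix} = \begin{pmatrix} y_2 \\ y_1 \end{pmatrix}, \]

respectively, and we recover the respective phase portraits via the respective level curves
E(y1, y2) = C, C ∈ R.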
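Condition (5.15) is easy to test numerically. The following Python sketch is an illustration only: the gradient is approximated by central differences rather than computed symbolically, the sample points are random, and the helper names `grad` and `integral_defect` are hypothetical. A small defect supports (but of course does not prove) that E is an integral; the last line shows that pairing (5.20a) with the wrong ODE (5.9) yields a clearly nonzero defect.

```python
import random

def grad(E, y, step=1e-6):
    """Central-difference approximation of the gradient of E at y = (y1, y2)."""
    y1, y2 = y
    return ((E((y1 + step, y2)) - E((y1 - step, y2))) / (2.0 * step),
            (E((y1, y2 + step)) - E((y1, y2 - step))) / (2.0 * step))

def integral_defect(E, f, points):
    """Largest value of |grad E(y) . f(y)| over the sample points, cf. (5.15)."""
    worst = 0.0
    for y in points:
        g, v = grad(E, y), f(y)
        worst = max(worst, abs(g[0] * v[0] + g[1] * v[1]))
    return worst

f_58 = lambda y: (-y[1], y[0])         # right-hand side of (5.8)
f_59 = lambda y: (y[1], y[0])          # right-hand side of (5.9)
E_a = lambda y: y[0] ** 2 + y[1] ** 2  # (5.20a)
E_b = lambda y: y[0] ** 2 - y[1] ** 2  # (5.20b)

random.seed(0)  # fixed seed so that the sample is reproducible
points = [(random.uniform(-3.0, 3.0), random.uniform(-3.0, 3.0)) for _ in range(200)]

defect_a = integral_defect(E_a, f_58, points)         # expected to be near zero
defect_b = integral_defect(E_b, f_59, points)         # expected to be near zero
defect_mismatch = integral_defect(E_a, f_59, points)  # grad E_a . f_59 = 4 y1 y2
```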
Example 5.22. Consider the autonomous ODE

\[ \begin{pmatrix} y_1' \\ y_2' \end{pmatrix} = \begin{pmatrix} 2 y_1 y_2 \\ 1 - 2 y_1^2 \end{pmatrix}. \tag{5.21} \]

We claim that

\[ E : \mathbb{R}^2 \to \mathbb{R}, \quad E(y_1, y_2) := y_1\, e^{-(y_1^2 + y_2^2)}, \tag{5.22} \]

is an integral for (5.21) and intend to use Prop. 5.20 to establish (5.21) has orbits that
are fixed points, orbits that are periodic, and orbits that are neither. To verify E is an
integral, one computes, for each (y1, y2) ∈ R^2,

\[ \begin{aligned}
\nabla E(y_1, y_2) \bullet (2 y_1 y_2,\; 1 - 2 y_1^2)
&= \Big( e^{-(y_1^2 + y_2^2)} - 2 y_1^2\, e^{-(y_1^2 + y_2^2)},\; -2 y_1 y_2\, e^{-(y_1^2 + y_2^2)} \Big) \bullet (2 y_1 y_2,\; 1 - 2 y_1^2) \\
&= e^{-(y_1^2 + y_2^2)}\, (1 - 2 y_1^2,\; -2 y_1 y_2) \bullet (2 y_1 y_2,\; 1 - 2 y_1^2) = 0.
\end{aligned} \]

Clearly, the set of fixed points is

\[ F = \Big\{ \Big( -\tfrac{1}{\sqrt{2}},\, 0 \Big),\; \Big( \tfrac{1}{\sqrt{2}},\, 0 \Big) \Big\}. \]

The level set of 0 is E^{−1}{0} = {(0, y2) : y2 ∈ R}, i.e. it is the y2-axis. This is a
nonperiodic orbit (actually, the orbit of solutions of the form φ : R −→ R^2, φ(x) :=
(0, x + c), c ∈ R). Now consider the level set

\[ E^{-1}\{e^{-1}\} = \big\{ (y_1, y_2) : y_1 > 0,\; y_2^2 = \ln y_1 - y_1^2 + 1 \big\}. \]

Using g : R^+ −→ R, g(y1) = ln y1 − y1^2 + 1, and its derivative, it is not hard to show
g has precisely two zeros, namely λ1 = 1 and 0 < λ2 < 1, and g ≥ 0 precisely on the
compact interval J := [λ2, 1], implying E^{−1}{e^{−1}} = {(y1, ±√g(y1)) : y1 ∈ J}, showing
E^{−1}{e^{−1}} is compact. According to Prop. 5.20(b), E^{−1}{e^{−1}} must consist of one or
more periodic orbits.
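The quantities in Example 5.22 can be approximated numerically. The following Python sketch is an illustration only: it locates the zero λ2 of g by bisection on the bracket [0.1, 0.5] (an assumed bracket, justified by g(0.1) < 0 < g(0.5)), and it integrates (5.21) from the point (1, 0), which lies on E^{−1}{e^{−1}}, with a hand-written classical Runge-Kutta step. All helper names and the step size are hypothetical choices.

```python
import math

def g(y1):
    """g(y1) = ln(y1) - y1^2 + 1 from Example 5.22, defined for y1 > 0."""
    return math.log(y1) - y1 ** 2 + 1.0

def bisect(func, lo, hi, tol=1e-12):
    """Bisection for a zero of func, assuming func(lo) < 0 < func(hi)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if func(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

lambda2 = bisect(g, 0.1, 0.5)  # the zero of g in ]0, 1[

def E(y):
    """The integral (5.22)."""
    return y[0] * math.exp(-(y[0] ** 2 + y[1] ** 2))

def f(y):
    """Right-hand side of (5.21)."""
    return (2.0 * y[0] * y[1], 1.0 - 2.0 * y[0] ** 2)

def rk4_step(y, dx):
    """One classical Runge-Kutta step for y' = f(y)."""
    k1 = f(y)
    k2 = f((y[0] + 0.5 * dx * k1[0], y[1] + 0.5 * dx * k1[1]))
    k3 = f((y[0] + 0.5 * dx * k2[0], y[1] + 0.5 * dx * k2[1]))
    k4 = f((y[0] + dx * k3[0], y[1] + dx * k3[1]))
    return (y[0] + dx / 6.0 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
            y[1] + dx / 6.0 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))

y = (1.0, 0.0)  # E(1, 0) = e^{-1}, so this point lies on the level set
level = math.exp(-1.0)
max_drift, min_y1, max_y1 = 0.0, y[0], y[0]
for _ in range(5000):  # integrate over 0 <= x <= 5
    y = rk4_step(y, 0.001)
    max_drift = max(max_drift, abs(E(y) - level))
    min_y1, max_y1 = min(min_y1, y[0]), max(max_y1, y[0])
```

Up to discretization error, E stays at e^{−1} along the computed trajectory, and its first component stays inside J = [λ2, 1], as the description of the level set predicts.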

5.2 Stability at Fixed Points


Given an autonomous ODE with a fixed point p, we will investigate the question under
what conditions a solution φ(x) starting out near p will remain near p as x increases or
decreases.
To simplify notation, we will restrict ourselves to initial data y(0) = y0 , which, in light
of Lem. 5.4(b), is not an essential restriction.
Notation 5.23. Let Ω ⊆ K^n, n ∈ N, and f : Ω −→ K^n such that

\[ y' = f(y) \tag{5.23} \]

admits unique maximal solutions (f being locally Lipschitz on Ω open is sufficient). Let
Y : Df −→ K^n denote the general solution to (5.23) and define

\[ \begin{aligned}
& Y : D_{f,0} \to \mathbb{K}^n, \quad Y(x, \eta) := Y(x, 0, \eta), \\
& D_{f,0} := \{ (x, \eta) \in \mathbb{R} \times \mathbb{K}^n : (x, 0, \eta) \in D_f \}.
\end{aligned} \tag{5.24} \]
Definition 5.24. Let Ω ⊆ K^n, n ∈ N, and f : Ω −→ K^n such that (5.23) admits unique
maximal solutions (f being locally Lipschitz on Ω open is sufficient). Moreover, assume
the set of fixed points to be nonempty, F ≠ ∅, and let p ∈ F. The fixed point p is said
to be positively (resp. negatively) stable if, and only if, the following conditions (i) and
(ii) hold:

(i) There exists r > 0 such that, for each η ∈ Ω with ‖η − p‖ < r, the maximal
solution x ↦ Y(x, η) (cf. (5.24)) is defined on (a superset of) R₀⁺ (resp. R₀⁻).

(ii) For each ε > 0, there exists δ > 0 such that, for each η ∈ Ω,

\[ \|\eta - p\| < \delta \quad \Rightarrow \quad \underset{x \ge 0 \ (\text{resp. } x \le 0)}{\forall} \quad \|Y(x, \eta) - p\| < \epsilon. \tag{5.25} \]

The fixed point p is said to be positively (resp. negatively) asymptotically stable if, and
only if, (i) and (ii) hold plus the additional condition

(iii) There exists γ > 0 such that, for each η ∈ Ω,

\[ \|\eta - p\| < \gamma \quad \Rightarrow \quad \lim_{x \to \infty} Y(x, \eta) = p \quad \Big( \text{resp. } \lim_{x \to -\infty} Y(x, \eta) = p \Big). \tag{5.26} \]

The norm ‖·‖ on K^n used in (i) – (iii) above is arbitrary. Due to the equivalence of
norms on K^n, changing the norm does not change the defined stability properties, even
though, in general, it does change the sizes of r, δ, γ.
Remark 5.25. In the situation of Def. 5.24, consider the time-reversed version of (5.23),
i.e.

\[ y' = -f(y). \tag{5.27} \]

According to Lem. 1.9(b), (5.27) has the general solution

\[ \begin{aligned}
& \tilde{Y} : D_{-f,0} \to \mathbb{K}^n, \quad \tilde{Y}(x, \eta) := Y(-x, \eta), \\
& D_{-f,0} = \{ (x, \eta) \in \mathbb{R} \times \mathbb{K}^n : (-x, \eta) \in D_{f,0} \}.
\end{aligned} \tag{5.28} \]

(a) Clearly, for a fixed point p ∈ F, we have the following equivalences:

p is positively stable for (5.23) ⇔ p is negatively stable for (5.27),
p is negatively stable for (5.23) ⇔ p is positively stable for (5.27).

(b) Clearly, for a fixed point p ∈ F, we have the following equivalences:

p is pos. asympt. stable for (5.23) ⇔ p is neg. asympt. stable for (5.27),
p is neg. asympt. stable for (5.23) ⇔ p is pos. asympt. stable for (5.27).

Lemma 5.26. Consider the situation of Def. 5.24 with f : Ω −→ K^n continuous on
Ω ⊆ K^n open. Then the fixed point p is positively (resp. negatively) stable if, and only
if, for each ε > 0, there exists δ > 0 such that, for each η ∈ Ω,

\[ \|\eta - p\| < \delta \quad \Rightarrow \quad \underset{x \in I_{(0,\eta)} \cap \mathbb{R}_0^+ \ (\text{resp. } x \in I_{(0,\eta)} \cap \mathbb{R}_0^-)}{\forall} \quad \|Y(x, \eta) - p\| < \epsilon, \tag{5.29} \]

where I_{(0,η)} denotes the domain of the maximal solution Y(·, η).

Proof. Clearly, stability in the sense of Def. 5.24 implies (5.29), and it merely remains
to show that (5.29) implies Def. 5.24(i). As (5.29) holds, we can consider ε := 1 and
obtain a corresponding δ =: r. Then (5.29) states that, for each η ∈ Ω with ‖η − p‖ < r,
for x ≥ 0 (resp. for x ≤ 0), the maximal solution Y(x, η) remains in the compact set
B̄_1(p). Since f : Ω −→ K^n is continuous on Ω ⊆ K^n open, Th. 3.28 implies R₀⁺ ⊆ I_{(0,η)}
(resp. R₀⁻ ⊆ I_{(0,η)}), proving Def. 5.24(i). ∎

It is an exercise to show Lem. 5.26 becomes false if the hypothesis that f be continuous
is omitted.

Example 5.27. (a) Consider the 1-dimensional R-valued ODE

\[ y' = y(y - 1). \tag{5.30} \]

The set of fixed points is F = {0, 1}. Moreover, Y′(·, η) < 0 for 0 < η < 1 and
Y′(·, η) > 0 for η ∈ ]−∞, 0[ ∪ ]1, ∞[. It follows that, for p = 0, the positive stability
part of (5.29) holds (where, given ε > 0, one can choose δ := min{1, ε}). Moreover,
for η < 0 and 0 < η < 1, one has lim_{x→∞} Y(x, η) = 0. Thus, all three conditions of
Def. 5.24 are satisfied and 0 is positively asymptotically stable. Analogously, one
sees that 1 is negatively asymptotically stable.

(b) For the R^2-valued ODE of (5.8), (0, 0) is a fixed point that is positively and
negatively stable, but neither positively nor negatively asymptotically stable. For the
R^2-valued ODE of (5.9), (0, 0) is a fixed point that is neither positively nor
negatively stable.

(c) Consider the 1-dimensional R-valued ODE

\[ y' = y^2. \tag{5.31} \]

The only fixed point is 0, which is neither positively nor negatively stable. Indeed,
not even Def. 5.24(i) is satisfied: One obtains

\[ Y : D_{f,0} \to \mathbb{R}, \quad Y(x, \eta) := \frac{\eta}{1 - \eta x}, \]

where

\[ \begin{aligned}
D_{f,0} = {} & (\mathbb{R} \times \{0\}) \\
& \cup \big\{ (x, \eta) \in \mathbb{R}^2 : \eta > 0,\; x \in {]-\infty, 1/\eta[} \big\} \\
& \cup \big\{ (x, \eta) \in \mathbb{R}^2 : \eta < 0,\; x \in {]1/\eta, \infty[} \big\},
\end{aligned} \]

showing every neighborhood of 0 contains η such that Y(·, η) is not defined on all
of R₀⁺ and η such that Y(·, η) is not defined on all of R₀⁻.
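The claims of Example 5.27(a) and (c) can be explored numerically. The following Python sketch is an illustration only: it integrates (5.30) forward and backward in x with a hand-written classical Runge-Kutta scheme (the helper names, initial values, and step counts are hypothetical choices), and it checks by a central difference that the stated formula for Y in (c) satisfies (5.31) and becomes large as x approaches 1/η.

```python
def rk4_scalar(rhs, y, x_end, steps):
    """Classical Runge-Kutta for y' = rhs(y) from x = 0 to x = x_end (x_end may be negative)."""
    dx = x_end / steps
    for _ in range(steps):
        k1 = rhs(y)
        k2 = rhs(y + 0.5 * dx * k1)
        k3 = rhs(y + 0.5 * dx * k2)
        k4 = rhs(y + dx * k3)
        y += dx / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return y

def f_a(y):
    """Right-hand side of (5.30)."""
    return y * (y - 1.0)

# (a): solutions starting near 0 approach 0 as x increases ...
forward_from_half = rk4_scalar(f_a, 0.5, 20.0, 20000)
forward_from_neg = rk4_scalar(f_a, -0.5, 20.0, 20000)
# ... and solutions starting near 1 approach 1 as x decreases.
backward_from_0_9 = rk4_scalar(f_a, 0.9, -20.0, 20000)
backward_from_1_5 = rk4_scalar(f_a, 1.5, -20.0, 20000)

def Y_c(x, eta):
    """Flow of (5.31) as stated in Example 5.27(c); requires 1 - eta * x > 0."""
    return eta / (1.0 - eta * x)

# (c): the formula satisfies y' = y^2 and grows without bound as x approaches 1/eta.
eta, x, step = 2.0, 0.2, 1e-6
derivative = (Y_c(x + step, eta) - Y_c(x - step, eta)) / (2.0 * step)
ode_residual = abs(derivative - Y_c(x, eta) ** 2)
near_blow_up = Y_c(0.4999, eta)  # 1/eta = 0.5 is the right end of the domain
```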

Remark 5.28. There exist examples of autonomous ODE that show fixed points can
satisfy Def. 5.24(iii) without satisfying Def. 5.24(ii). For example, [Aul04, Ex. 7.4.16]
provides the following ODE in polar coordinates (r, ϕ):

\[ r' = r\,(1 - r), \tag{5.32a} \]
\[ \varphi' = \frac{1 - \cos \varphi}{2} = \sin^2 \frac{\varphi}{2}. \tag{5.32b} \]
Even though it is somewhat tedious, one can show that its fixed point (1, 0) satisfies Def.
5.24(iii) without satisfying Def. 5.24(ii) (see Claim 4 of Example K.2 in the Appendix).

We will now study a method that, in certain cases, allows one to determine the stability
properties of a fixed point without having to know the solutions to an ODE. The method
is known as Lyapunov’s method. The key ingredient to this method is a test function V ,
known as a Lyapunov function. Once a Lyapunov function is known, stability is often
easily tested. The catch, however, is that Lyapunov functions can be hard to find. From
the literature, it appears there is no definition for an all-purpose Lyapunov function, as
a suitable choice depends on the circumstances.

Definition 5.29. Let Ω0 ⊆ R^n be open, n ∈ N. A function V : Ω0 −→ R is said to
be positive (resp. negative) definite at p ∈ Ω0 if, and only if, the following conditions (i)
and (ii) hold:

(i) V (y) ≥ 0 (resp. V (y) ≤ 0) for each y ∈ Ω0 .

(ii) V (y) = 0 if, and only if, y = p.



Theorem 5.30 (Lyapunov). Consider the situation of Def. 5.24 with K = R, Ω ⊆ R^n
open, and f : Ω −→ R^n continuous. Let Ω0 be open with p ∈ Ω0 ⊆ Ω ⊆ R^n. Assume
V : Ω0 −→ R to be continuously differentiable and define

\[ \dot{V} : \Omega_0 \to \mathbb{R}, \quad \dot{V}(y) := (\nabla V)(y) \bullet f(y) = \sum_{j=1}^{n} \partial_j V(y)\, f_j(y). \tag{5.33} \]

If V is positive definite at p and V̇ ≤ 0 (resp. V̇ ≥ 0) on Ω0, then p is positively (resp.
negatively) stable. If, in addition, V̇ is negative (resp. positive) definite at p, then p is
positively (resp. negatively) asymptotically stable.

Proof. The proof is carried out for the case of positive (asymptotic) stability; the proof
for the case of negative (asymptotic) stability is then easily obtained by reversing time,
i.e. by using Rem. 5.25 together with noting that V̇ changes its sign when f is replaced
with −f. Fix an arbitrary norm ‖·‖ on R^n. Let r > 0 be such that B̄_r(p) = {y ∈ R^n :
‖y − p‖ ≤ r} ⊆ Ω0 (such an r > 0 exists, as Ω0 is open). Define

\[ k : {]0, r]} \to \mathbb{R}^+, \quad k(\epsilon) := \min\big\{ V(y) : \|y - p\| = \epsilon \big\}, \tag{5.34} \]

where k is well-defined, since the continuous function V assumes its min on compact
sets, and k(ε) > 0 by the positive definiteness of V. Given ε ∈ ]0, r], since V(p) = 0,
k(ε) > 0, and V continuous,

\[ \underset{0 < \delta(\epsilon) < \epsilon}{\exists} \quad \underset{y \in B_{\delta(\epsilon)}(p)}{\forall} \quad V(y) < k(\epsilon), \tag{5.35} \]

where we used Not. 3.3 to denote an open ball with center p with respect to ‖·‖.
We now claim that, for each η ∈ B_{δ(ε)}(p), the maximal solution x ↦ φ(x) := Y(x, η)
must remain inside B_ε(p) for each x ≥ 0 in its domain I_{(0,η)} (implying p to be positively
stable by Lem. 5.26). Seeking a contradiction, assume there exists ξ ≥ 0 such that
‖φ(ξ) − p‖ ≥ ε and let

\[ s := \sup\big\{ x \ge 0 : \phi(t) \in B_\epsilon(p) \text{ for each } t \in [0, x] \big\} \le \xi < \infty. \tag{5.36} \]

The continuity of φ then implies ‖φ(s) − p‖ = ε, i.e.

\[ V(\phi(s)) \ge k(\epsilon) \tag{5.37} \]

by the definition of k(ε). On the other hand, by the chain rule (V ◦ φ)′(x) = V̇(φ(x))
(cf. (5.16)), such that V̇ ≤ 0 implies

\[ V(\phi(s)) = V(\eta) + \int_0^s \dot{V}(\phi(x)) \, dx \le V(\eta) \overset{(5.35)}{<} k(\epsilon), \tag{5.38} \]

in contradiction to (5.37), proving φ(x) ∈ B_ε(p) for each x ∈ I_{(0,η)} ∩ R₀⁺ and the positive
stability of p.

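For the remaining part of the proof, we additionally assume V̇ to be negative definite
at p, while continuing to use the notation from above. Set γ := δ(r). We have to show
$\lim_{x\to\infty} Y(x, \eta) = p$ for each $\eta \in B_\gamma(p)$, i.e.

$$\underset{\epsilon \in ]0,r]}{\forall}\ \ \underset{\xi_\epsilon \ge 0}{\exists}\ \ \underset{x \ge \xi_\epsilon}{\forall}\quad \|Y(x, \eta) - p\| < \epsilon. \tag{5.39}$$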

So fix $\eta \in B_\gamma(p)$ and, as above, let φ(x) := Y (x, η). Given ε ∈ ]0, r], we first claim that
there exists $\xi_\epsilon \ge 0$ such that $\phi(\xi_\epsilon) \in B_{\delta(\epsilon)}(p)$, where δ(ε) is as in the first part of the
proof above. Indeed, seeking a contradiction, assume ‖φ(x) − p‖ ≥ δ(ε) for all x ≥ 0,
and set

$$\alpha := \max\{\dot V(y) : \delta(\epsilon) \le \|y - p\| \le r\}. \tag{5.40}$$

Then α < 0 due to the negative definiteness of V̇ at p. Moreover, due to the choice of
γ, we have δ(ε) ≤ ‖φ(x) − p‖ ≤ r for each x ≥ 0, implying

$$\underset{x \ge 0}{\forall}\quad 0 \le V(\phi(x)) = V(\eta) + \int_0^x \dot V(\phi(t))\,dt \le V(\eta) + \alpha x, \tag{5.41}$$

which is the desired contradiction, as α < 0 implies the right-hand side to go to −∞ for
x → ∞. Thus, we know the existence of $\xi_\epsilon$ such that $\eta_\epsilon := \phi(\xi_\epsilon) \in B_{\delta(\epsilon)}(p)$.
To finish the proof, we recall from the first part of the proof that $\|Y(x, \eta_\epsilon) - p\| < \epsilon$ for
each x ≥ 0. Using Lem. 5.4(a), we obtain

$$\underset{x \ge 0}{\forall}\quad \phi(\xi_\epsilon + x) = Y(\xi_\epsilon + x, \xi_\epsilon, \eta_\epsilon) = Y(\xi_\epsilon + x - \xi_\epsilon, \eta_\epsilon) = Y(x, \eta_\epsilon) \in B_\epsilon(p), \tag{5.42}$$

showing ‖φ(x) − p‖ < ε for each $x \ge \xi_\epsilon$ as needed. ∎

Example 5.31. Let k, m ∈ N and α, β > 0. We claim that (0, 0) is a positively
asymptotically stable fixed point for each R2 -valued ODE of the form

$$\begin{pmatrix} y_1' \\ y_2' \end{pmatrix} = \begin{pmatrix} -y_1^{2k-1} + \alpha y_1 y_2^2 \\ -y_2^{2m-1} - \beta y_1^2 y_2 \end{pmatrix}. \tag{5.43}$$

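Indeed, (0, 0) is clearly a fixed point, and we consider the Lyapunov function

$$V : \mathbb{R}^2 \to \mathbb{R}, \quad V(y_1, y_2) := \frac{y_1^2}{\alpha} + \frac{y_2^2}{\beta}, \tag{5.44a}$$

which is clearly positive definite at (0, 0). Since $\dot V : \mathbb{R}^2 \to \mathbb{R}$,

$$\begin{aligned}
\dot V(y_1, y_2) &= \nabla V(y_1, y_2) \bullet \bigl(-y_1^{2k-1} + \alpha y_1 y_2^2,\ -y_2^{2m-1} - \beta y_1^2 y_2\bigr) \\
&= (2y_1/\alpha,\ 2y_2/\beta) \bullet \bigl(-y_1^{2k-1} + \alpha y_1 y_2^2,\ -y_2^{2m-1} - \beta y_1^2 y_2\bigr) \\
&= -2\bigl(y_1^{2k}/\alpha + y_2^{2m}/\beta\bigr),
\end{aligned} \tag{5.44b}$$

is clearly negative definite at (0, 0), Th. 5.30 proves (0, 0) to be a positively asymptotically
stable fixed point.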
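
The cancellation of the mixed terms in (5.44b) can be checked numerically. The following is a minimal sketch, not part of the original notes: the function names (`f`, `v_dot`, `v_dot_closed_form`, `max_deviation`) and the sampling ranges are illustrative assumptions. It compares ∇V • f, computed directly from (5.43) and (5.44a), with the closed form −2(y1^{2k}/α + y2^{2m}/β) at random points.

```python
import random

def f(y1, y2, k, m, alpha, beta):
    """Right-hand side of (5.43)."""
    return (-y1 ** (2 * k - 1) + alpha * y1 * y2 ** 2,
            -y2 ** (2 * m - 1) - beta * y1 ** 2 * y2)

def v_dot(y1, y2, k, m, alpha, beta):
    """grad V . f for V(y) = y1^2/alpha + y2^2/beta, cf. (5.33) and (5.44a)."""
    f1, f2 = f(y1, y2, k, m, alpha, beta)
    return (2 * y1 / alpha) * f1 + (2 * y2 / beta) * f2

def v_dot_closed_form(y1, y2, k, m, alpha, beta):
    """Closed form stated in (5.44b)."""
    return -2 * (y1 ** (2 * k) / alpha + y2 ** (2 * m) / beta)

def max_deviation(samples=1000, seed=0):
    """Largest observed gap between the two expressions (illustrative ranges)."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(samples):
        k, m = rng.randint(1, 3), rng.randint(1, 3)
        alpha, beta = rng.uniform(0.5, 2.0), rng.uniform(0.5, 2.0)
        y1, y2 = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
        gap = abs(v_dot(y1, y2, k, m, alpha, beta)
                  - v_dot_closed_form(y1, y2, k, m, alpha, beta))
        worst = max(worst, gap)
    return worst
```

Such a check only supports the algebra at the sampled points; the negative definiteness itself follows from (5.44b).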

Theorem 5.32. Consider the situation of Def. 5.24 with K = R. Let Ω0 be open with
p ∈ Ω0 ⊆ Ω ⊆ Rn . Assume V : Ω0 −→ R to be continuously differentiable and assume
there is an open set U ⊆ Ω0 such that the following conditions (i) – (iii) are satisfied:

(i) p ∈ ∂U , i.e. p is in the boundary of U .

(ii) V > 0 and V̇ > 0 (resp. V̇ < 0) on U (where V̇ is defined as in (5.33)).

(iii) V (y) = 0 for each y ∈ Ω0 ∩ ∂U .

Then the fixed point p is not positively (resp. negatively) stable.

Proof. We assume V and V̇ are positive, proving p not to be positively stable; the
corresponding statement regarding p not being negatively stable is then, once again, easily
obtained by reversing time, i.e. by using Rem. 5.25 and noting that V̇ changes its
sign when f is replaced with −f .
Seeking a contradiction, assume p to be positively stable. Then there exists r > 0 such
that $\overline{B}_r(p) = \{y \in \mathbb{R}^n : \|y - p\| \le r\} \subseteq \Omega_0$ and $\eta \in B_r(p)$ implies Y (x, η) is defined
for each x ≥ 0. Moreover, positive stability and p ∈ ∂U also imply the existence of
$\eta \in U \cap B_r(p)$ such that $\phi(x) := Y(x, \eta) \in B_r(p)$ for all x ≥ 0 (note p ≠ η as p ∈ ∂U ).
Set

$$s := \sup\{x \ge 0 : \phi(t) \in U \text{ for each } t \in [0, x]\}. \tag{5.45}$$
If s < ∞, then the maximality of φ implies φ(s) to be defined. Moreover, φ(s) ∈ ∂U by
the definition of s, and $\phi(s) \in B_r(p) \subseteq \Omega_0$ by the choice of η. Thus, φ(s) ∈ Ω0 ∩ ∂U
and V (φ(s)) = 0. On the other hand, as V and V̇ are positive on U , we have

$$V(\phi(s)) = V(\eta) + \int_0^s \dot V(\phi(t))\,dt > V(\eta) > 0, \tag{5.46}$$

which is a contradiction to V (φ(s)) = 0, implying s = ∞ and φ(x) ∈ U as well as
V (φ(x)) > V (η) > 0 hold for each x > 0.
To conclude the proof, consider the compact set

$$C := \overline{B}_r(p) \cap \overline{U} \cap V^{-1}[V(\eta), \infty[\ \subseteq \Omega_0. \tag{5.47}$$

Then the choice of η guarantees φ(x) ∈ C for all x ≥ 0. If y ∈ C, then V (y) ≥ V (η) > 0.
If y ∈ Ω0 ∩ ∂U , then V (y) = 0, showing C ∩ ∂U = ∅, i.e. C ⊆ U and

$$\alpha := \min\{\dot V(y) : y \in C\} > 0. \tag{5.48}$$

Thus,

$$\underset{x \ge 0}{\forall}\quad V(\phi(x)) = V(\eta) + \int_0^x \dot V(\phi(t))\,dt \ge V(\eta) + \alpha x. \tag{5.49}$$

But this means that the continuous function V is unbounded on the compact set C and
this contradiction proves p is not positively stable. ∎

Example 5.33. Let h1 , h2 : Ω −→ R be continuously differentiable functions defined
on some open set Ω ⊆ R2 with (0, 0) ∈ Ω and h1 (0, 0) > 0, h2 (0, 0) > 0. We claim that
(0, 0) is not a positively stable fixed point for each R2 -valued ODE of the form

$$\begin{pmatrix} y_1' \\ y_2' \end{pmatrix} = \begin{pmatrix} y_2\, h_1(y_1, y_2) \\ y_1\, h_2(y_1, y_2) \end{pmatrix}. \tag{5.50}$$

Indeed, (0, 0) is clearly a fixed point, and we let Ω0 be some open neighborhood of (0, 0),
where both h1 and h2 are positive (such an Ω0 exists by continuity of h1 , h2 and h1 , h2
being positive at (0, 0)), and consider the Lyapunov function

$$V : \Omega_0 \to \mathbb{R}, \quad V(y_1, y_2) := y_1 y_2, \tag{5.51a}$$

with $\dot V : \Omega_0 \to \mathbb{R}$,

$$\begin{aligned}
\dot V(y_1, y_2) &= \nabla V(y_1, y_2) \bullet \bigl(y_2 h_1(y_1, y_2),\ y_1 h_2(y_1, y_2)\bigr) \\
&= (y_2, y_1) \bullet \bigl(y_2 h_1(y_1, y_2),\ y_1 h_2(y_1, y_2)\bigr) \\
&= y_2^2\, h_1(y_1, y_2) + y_1^2\, h_2(y_1, y_2) > 0 \quad \text{on } \Omega_0 \setminus \{(0, 0)\}.
\end{aligned} \tag{5.51b}$$

Letting U := Ω0 ∩ (R+ × R+ ), one has (0, 0) ∈ ∂U , both V and V̇ are positive on U ,
and V = 0 on Ω0 ∩ ∂U ⊆ ({0} × R) ∪ (R × {0}). Thus, Th. 5.32 applies, yielding that
(0, 0) is not positively stable.
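
The instability can be observed numerically in the simplest special case. The sketch below assumes h1 ≡ h2 ≡ 1 (an illustrative choice, not made in the text), so that (5.50) becomes y1′ = y2, y2′ = y1, and uses the explicit Euler method, which only approximates the true solution. The function name `first_exit_time` and all parameter values are hypothetical. Starting points (d, d) in the open first quadrant, however close to the origin, leave a fixed ball after finite time.

```python
import math

def first_exit_time(eta, radius=0.1, h=1e-3, x_max=50.0):
    """Explicit Euler for y1' = y2, y2' = y1 (assumption: h1 = h2 = 1).

    Returns the first grid time at which ||y||_2 >= radius,
    or None if this does not happen up to x_max.
    """
    y1, y2 = eta
    x = 0.0
    while x <= x_max:
        if math.hypot(y1, y2) >= radius:
            return x
        # One Euler step; both components are updated from the old values.
        y1, y2 = y1 + h * y2, y2 + h * y1
        x += h
    return None
```

For example, `first_exit_time((1e-6, 1e-6))` returns a finite time, whereas the fixed point itself, `first_exit_time((0.0, 0.0))`, never leaves the ball.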

Theorem 5.34. Let Ω ⊆ Rn be open, n ∈ N. Let F : Ω −→ R be C² and consider

$$y' = -\nabla F(y). \tag{5.52}$$

If p ∈ Ω is an isolated critical point of F (i.e. ∇ F (p) = 0 and there exists an open set
O with p ∈ O ⊆ Ω and ∇ F ≠ 0 on O \ {p}), then p is a fixed point of (5.52) that is
positively asymptotically stable, negatively asymptotically stable, or neither positively nor
negatively stable, according to whether p is a local minimum for F , a local maximum for F ,
or neither, respectively.

Proof. Note that F being C² implies ∇ F to be C¹ and, in particular, locally Lipschitz,
such that (5.52) admits unique maximal solutions. Suppose F has a local min at p.
As p is an isolated critical point, the local min at p must be strict, i.e. there exists an
open neighborhood Ω0 of p such that F (p) < F (y) for each y ∈ Ω0 \ {p}. Then the
Lyapunov function V : Ω0 −→ R, V (y) := F (y) − F (p), is clearly positive definite at p
and V̇ : Ω0 −→ R,

$$\dot V(y) = \nabla V(y) \bullet (-\nabla F(y)) = -\nabla F(y) \bullet \nabla F(y) = -\|\nabla F(y)\|_2^2, \tag{5.53}$$

is clearly negative definite at p. Thus, p is a positively asymptotically stable fixed point
by Th. 5.30. If F has a local max at p, then the proof is conducted analogously, using
V : Ω0 −→ R, V (y) := F (p) − F (y), or, alternatively, by using time reversion (if F
has a local max at p, then −F has a local min at p, i.e. p is positively asymptotically
stable for y ′ = ∇ F (y), i.e. p is negatively asymptotically stable for y ′ = − ∇ F (y) by
Rem. 5.25(b)).

If p is neither a local min nor max for F , then let Ω0 := O, and V : Ω0 −→ R, V (y) :=
F (p) − F (y), where O was chosen such that ∇ F ≠ 0 on O \ {p}, i.e. V̇ : Ω0 −→ R,
$\dot V(y) = \|\nabla F(y)\|_2^2$, is positive definite at p. Let U := {y ∈ Ω0 : F (y) < F (p)}. Then
U is open by the continuity of F , and p ∈ ∂U , as p is neither a local min nor max
for F . By the continuity of F , F (y) = F (p) for each y ∈ Ω0 ∩ ∂U , i.e. V = 0 on
Ω0 ∩ ∂U . Thus, Th. 5.32 applies, showing p is not positively stable. Analogously, using
U := {y ∈ Ω0 : F (y) > F (p)} and V (y) := F (y) − F (p) shows p is not negatively
stable. ∎
Example 5.35. (a) The function F : Rn −→ R, $F(y) = \|y\|_2^2$, has an isolated critical
point at 0, which is also a min for F . Thus, by Th. 5.34,

$$y' = -\nabla F(y) = (-2y_1, \dots, -2y_n) \tag{5.54}$$

has 0 as a fixed point that is positively asymptotically stable.

(b) The function F : R2 −→ R, $F(y) = e^{y_1 y_2}$, has an isolated critical point at 0, which
is neither a local min nor local max for F . Thus, by Th. 5.34,

$$y' = -\nabla F(y) = (-y_2\, e^{y_1 y_2},\ -y_1\, e^{y_1 y_2}) \tag{5.55}$$

has 0 as a fixed point that is neither positively nor negatively stable.
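
Both gradient flows of Example 5.35 can be explored with a few lines of code. The following sketch is illustrative only: it uses the explicit Euler method (an approximation chosen here, not a method from the text) with hypothetical helper names `grad_F_a`, `grad_F_b`, `euler_gradient_flow` and `norm2`. For (a), the numerical solution contracts to 0; for (b), a solution started close to 0 on the antidiagonal y2 = −y1 moves away from 0.

```python
import math

def grad_F_a(y):
    """Gradient of F(y) = ||y||_2^2 from part (a)."""
    return [2.0 * c for c in y]

def grad_F_b(y):
    """Gradient of F(y1, y2) = exp(y1 * y2) from part (b)."""
    e = math.exp(y[0] * y[1])
    return [y[1] * e, y[0] * e]

def euler_gradient_flow(grad, eta, x_end=10.0, h=0.01):
    """Explicit Euler approximation of y' = -grad F(y), y(0) = eta, at x_end."""
    y = list(eta)
    for _ in range(int(round(x_end / h))):
        g = grad(y)
        y = [c - h * gc for c, gc in zip(y, g)]
    return y

def norm2(y):
    """Euclidean norm of a vector given as a list."""
    return math.sqrt(sum(c * c for c in y))
```

Here `norm2(euler_gradient_flow(grad_F_a, [1.0, -1.0]))` is tiny, while `norm2(euler_gradient_flow(grad_F_b, [0.01, -0.01]))` is far larger than the norm of the starting point, in line with the two statements of the example.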

5.3 Constant Coefficients


The stability properties of systems of first-order linear ODE (cf. Sec. 4.6.2) are closely
related to the eigenvalues of the matrix A. As it turns out, the stability of the origin is
essentially determined by the sign of the real part of the eigenvalues of A (cf. Th. 5.38
below). We start with a preparatory lemma:
Lemma 5.36. Let n ∈ N and W ∈ M(n, K) be invertible. Moreover, let ‖·‖ be some
norm on M(n, K). Then

$$\|\cdot\|^W : \mathcal{M}(n, \mathbb{K}) \to \mathbb{R}_0^+, \quad \|A\|^W := \|W^{-1} A W\|, \tag{5.56}$$

also constitutes a norm on M(n, K).

Proof. If A = 0, then $\|A\|^W = \|W^{-1} 0 W\| = \|0\| = 0$. If $\|A\|^W = 0$, then $W^{-1} A W = 0$,
i.e. $A = W 0 W^{-1} = 0$, showing $\|\cdot\|^W$ is positive definite. Next,

$$\underset{\lambda \in \mathbb{K}}{\forall}\quad \|\lambda A\|^W = |\lambda|\, \|W^{-1} A W\| = |\lambda|\, \|A\|^W,$$

showing $\|\cdot\|^W$ is homogeneous of degree 1. Finally,

$$\underset{A, B \in \mathcal{M}(n, \mathbb{K})}{\forall}\quad \|A + B\|^W = \|W^{-1}(A + B)W\| = \|W^{-1} A W + W^{-1} B W\| \le \|A\|^W + \|B\|^W,$$

showing $\|\cdot\|^W$ satisfies the triangle inequality. ∎

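Remark and Definition 5.37. Let n ∈ N, A ∈ M(n, C), and let λ ∈ C be an
eigenvalue of A.

(a) Clearly, one has

$$\{0\} \subseteq \ker(A - \lambda\,\mathrm{Id}) \subseteq \ker(A - \lambda\,\mathrm{Id})^2 \subseteq \dots$$

and the inclusion can be strict for at most n times. Let

$$r(\lambda) := \min\bigl\{k \in \mathbb{N}_0 : \ker(A - \lambda\,\mathrm{Id})^k = \ker(A - \lambda\,\mathrm{Id})^{k+1}\bigr\}.$$

Then

$$\underset{k \in \mathbb{N}}{\forall}\quad \ker(A - \lambda\,\mathrm{Id})^{r(\lambda)} = \ker(A - \lambda\,\mathrm{Id})^{r(\lambda)+k}:$$

Indeed, otherwise, let $k_0 := \min\{k \in \mathbb{N} : \ker(A - \lambda\,\mathrm{Id})^{r(\lambda)} \subsetneq \ker(A - \lambda\,\mathrm{Id})^{r(\lambda)+k}\}$.
Then there exists $v \in \mathbb{C}^n$ such that $(A - \lambda\,\mathrm{Id})^{r(\lambda)+k_0} v = 0$, but $(A - \lambda\,\mathrm{Id})^{r(\lambda)+k_0-1} v \ne 0$.
However, that means $w := (A - \lambda\,\mathrm{Id})^{k_0-1} v \in \ker(A - \lambda\,\mathrm{Id})^{r(\lambda)+1}$, but
$w \notin \ker(A - \lambda\,\mathrm{Id})^{r(\lambda)}$, in contradiction to the definition of r(λ). The space

$$M(\lambda) := \ker(A - \lambda\,\mathrm{Id})^{r(\lambda)}$$

is called the generalized eigenspace corresponding to the eigenvalue λ.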


(b) Due to A(A − λ Id) = (A − λ Id)A, one has

$$\underset{k \in \mathbb{N}_0}{\forall}\quad A\bigl(\ker(A - \lambda\,\mathrm{Id})^k\bigr) \subseteq \ker(A - \lambda\,\mathrm{Id})^k,$$

i.e. all the kernels (in particular, the generalized eigenspace M (λ)) are invariant
subspaces for A.
(c) As already mentioned in Rem. 4.51 the algebraic multiplicity of λ, denoted ma (λ),
is its multiplicity as a zero of the characteristic polynomial χA (x) = det(A − x Id),
and the geometric multiplicity of λ is mg (λ) := dim ker(A − λ Id). We call the
eigenvalue λ semisimple if, and only if, its algebraic and geometric multiplicities
are equal. We then have the equivalence of the following statements (i) – (iv):
(i) λ is semisimple.
(ii) M (λ) = ker(A − λ Id).
(iii) A↾M (λ) is diagonalizable.
(iv) All the Jordan blocks corresponding to λ are trivial, i.e. they all have size 1
(i.e. there are dim ker(A − λ Id) such blocks).
Indeed, note that $m_a(\lambda) = \dim \ker(A - \lambda\,\mathrm{Id})^{m_a(\lambda)}$ (e.g., since, if A is in Jordan
normal form, then ma (λ) provides the size of the λ-block and, for A − λ Id, this
block is canonically nilpotent). This shows the equivalence between (i) and (ii).
Moreover, mg (λ) = ma (λ) means ker(A − λ Id) has a basis of ma (λ) eigenvectors
v1 , . . . , vma (λ) for the eigenvalue λ. The equivalence of (i),(ii) with (iii) and with
(iv) is then given by Th. 4.45 and Th. 4.46, respectively.

Theorem 5.38. Let n ∈ N and A ∈ M(n, C). Moreover, let ‖·‖ be some norm on
M(n, C) and let λ1 , . . . , λs ∈ C, 1 ≤ s ≤ n, be the distinct eigenvalues of A.

(a) The following statements (i) – (iii) are equivalent:

(i) There exists K > 0 such that $\|e^{Ax}\| \le K$ holds for each x ≥ 0 (resp. x ≤ 0).
(ii) Re λj ≤ 0 (resp. Re λj ≥ 0) for every j = 1, . . . , s and if Re λj = 0 occurs, then
λj is a semisimple eigenvalue (i.e. its algebraic and geometric multiplicities
are equal).
(iii) The fixed point 0 of y ′ = Ay is positively (resp. negatively) stable.

(b) The following statements (i) – (iii) are equivalent:

(i) There exist K, α > 0 such that $\|e^{Ax}\| \le K e^{-\alpha|x|}$ holds for each x ≥ 0 (resp.
x ≤ 0).
(ii) Re λj < 0 (resp. Re λj > 0) for every j = 1, . . . , s.
(iii) The fixed point 0 of y ′ = Ay is positively (resp. negatively) asymptotically
stable.

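Proof. Let $\|\cdot\|_{\max}$ denote the max-norm on $\mathbb{C}^{n^2} \cong \mathcal{M}(n, \mathbb{C})$, i.e.

$$\|(m_{kl})\|_{\max} := \max\bigl\{|m_{kl}| : k, l \in \{1, \dots, n\}\bigr\}$$

(caveat: for n > 1, this is not the operator norm induced by the max-norm on Cn ).
Moreover, using Th. 4.46, let W ∈ M(n, C) be invertible and such that $B := W^{-1} A W$
is in Jordan normal form. Then, according to Lem. 5.36, $\|M\|^W_{\max} := \|W^{-1} M W\|_{\max}$
also defines a norm on M(n, C). According to Th. 4.47(b),

$$\underset{x \in \mathbb{R}}{\forall}\quad \|e^{Ax}\|^W_{\max} = \|W^{-1} e^{Ax} W\|_{\max} = \|e^{W^{-1} A W x}\|_{\max} = \|e^{Bx}\|_{\max}. \tag{5.57}$$

According to Th. 4.44 and Th. 4.49, the entries $\beta_{kl}(x)$ of $(\beta_{kl}(x)) := e^{Bx}$ enjoy the
following property:

$$\underset{k, l \in \{1, \dots, n\}}{\forall}\ \ \underset{j \in \{1, \dots, s\}}{\exists}\ \ \underset{C > 0}{\exists}\ \ \underset{m \in \mathbb{N}_0}{\exists}\ \ \underset{x \in \mathbb{R}}{\forall}\quad |\beta_{kl}(x)| = C\, e^{\operatorname{Re}\lambda_j x}\, |x|^m. \tag{5.58}$$

Moreover,

$$\begin{align}
|\beta_{kl}(x)| = C\, e^{\operatorname{Re}\lambda_j x} |x|^m \ \wedge\ \operatorname{Re}\lambda_j < 0 \quad &\Rightarrow\quad \lim_{x \to \infty} |\beta_{kl}(x)| = 0, \tag{5.59a} \\
|\beta_{kl}(x)| = C\, e^{\operatorname{Re}\lambda_j x} |x|^m \ \wedge\ \operatorname{Re}\lambda_j > 0 \quad &\Rightarrow\quad \lim_{x \to \infty} |\beta_{kl}(x)| = \infty, \tag{5.59b} \\
|\beta_{kl}(x)| = C\, e^{\operatorname{Re}\lambda_j x} |x|^m \ \wedge\ \operatorname{Re}\lambda_j = 0 \ \wedge\ m = 0 \quad &\Rightarrow\quad |\beta_{kl}| \equiv C, \tag{5.59c} \\
|\beta_{kl}(x)| = C\, e^{\operatorname{Re}\lambda_j x} |x|^m \ \wedge\ \operatorname{Re}\lambda_j = 0 \ \wedge\ m > 0 \quad &\Rightarrow\quad \lim_{x \to \infty} |\beta_{kl}(x)| = \infty. \tag{5.59d}
\end{align}$$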

(a): We start with the equivalence between (i) and (ii): Suppose, Re λj ≤ 0 for every
j = 1, . . . , s and if Re λj = 0 occurs, then λj is a semisimple eigenvalue. Then, using
Rem. and Def. 5.37(c) and (5.58), we are either in situation (5.59a) or in situation
(5.59c). Thus, there exists K0 > 0 such that $|\beta_{kl}(x)| \le K_0$ for each x ≥ 0 and each
k, l = 1, . . . , n. Then there exists K1 > 0 such that

$$\underset{x \ge 0}{\forall}\quad \|e^{Ax}\| \le K_1 \|e^{Ax}\|^W_{\max} = K_1 \|e^{Bx}\|_{\max} \le K_1 K_0, \tag{5.60}$$

showing (i) holds with K := K1 K0 . Conversely, if there is j ∈ {1, . . . , s} such that
Re λj > 0, then there is βkl such that (5.59b) occurs; if there is j ∈ {1, . . . , s} such that
Re λj = 0 and λj is not semisimple, then, using Rem. and Def. 5.37(c), there is βkl such
that (5.59d) occurs. In both cases,

$$\lim_{x \to \infty} \|e^{Ax}\| = \lim_{x \to \infty} \|e^{Ax}\|^W_{\max} = \lim_{x \to \infty} \|e^{Bx}\|_{\max} = \infty, \tag{5.61}$$

i.e., the corresponding statement of (i) can not be true. The remaining case is handled
via time reversion: $\|e^{Ax}\| \le K$ holds for each x ≤ 0 if, and only if, $\|e^{-Ax}\| \le K$ holds
for each x ≥ 0, which holds if, and only if, Re(−λj ) ≤ 0 for every j = 1, . . . , s with
λj semisimple for Re(−λj ) = 0, which is equivalent to Re λj ≥ 0 for every j = 1, . . . , s
with λj semisimple for Re λj = 0.
We proceed to the equivalence between (i) and (iii): Fix some arbitrary norm ‖·‖ on
Cn , and let $\|\cdot\|_{\mathrm{op}}$ denote the induced operator norm on M(n, C). Let C1 , C2 > 0 be
such that $\|M\|_{\mathrm{op}} \le C_1 \|M\|$ and $\|M\| \le C_2 \|M\|_{\mathrm{op}}$ for each M ∈ M(n, C). Suppose
there exists K > 0 such that $\|e^{Ax}\| \le K$ holds for each x ≥ 0. Given ε > 0, choose
δ := ε/(C1 K). Then

$$\underset{\eta \in B_\delta(0)}{\forall}\ \ \underset{x \ge 0}{\forall}\quad \|Y(x, \eta) - 0\| = \|e^{Ax} \eta\| \le \|e^{Ax}\|_{\mathrm{op}}\, \|\eta\| < C_1 K\, \frac{\epsilon}{C_1 K} = \epsilon, \tag{5.62}$$

proving 0 is positively stable. Conversely, assume 0 to be positively stable. Then there
exists δ > 0 such that $\|Y(x, \eta)\| = \|e^{Ax} \eta\| < 1$ for each $\eta \in B_\delta(0)$ and each x ≥ 0.
Thus,

$$\underset{x \ge 0}{\forall}\quad \|e^{Ax}\|_{\mathrm{op}} \overset{(G.1)}{=} \sup\bigl\{\|e^{Ax} \eta\| : \eta \in \mathbb{C}^n,\ \|\eta\| = 1\bigr\} = \frac{2}{\delta} \sup\bigl\{\|e^{Ax} \eta\| : \eta \in \mathbb{C}^n,\ \|\eta\| = \delta/2\bigr\} \le \frac{2}{\delta}, \tag{5.63}$$

showing (i) holds with K := 2 C2 /δ. The remaining case is handled via time reversion:
$\|e^{Ax}\| \le K$ holds for each x ≤ 0 if, and only if, $\|e^{-Ax}\| \le K$ holds for each x ≥ 0, which
holds if, and only if, 0 is positively stable for y ′ = −Ay, which, by Rem. 5.25(a), holds
if, and only if, 0 is negatively stable for y ′ = Ay.
(b): As in (a), we start with the equivalence between (i) and (ii): Suppose, Re λj < 0
for every j = 1, . . . , s. We first show, using (5.58),

$$\underset{k, l = 1, \dots, n}{\forall}\ \ \underset{K_{kl}, \alpha_{kl} > 0}{\exists}\ \ \underset{x \ge 0}{\forall}\quad |\beta_{kl}(x)| \le K_{kl}\, e^{-\alpha_{kl} x}: \tag{5.64}$$

According to (5.58),

$$\underset{C_{kl} > 0}{\exists}\ \ \underset{x \ge 0}{\forall}\quad |\beta_{kl}(x)| = C_{kl}\, e^{\operatorname{Re}\lambda_j x/2}\, e^{\operatorname{Re}\lambda_j x/2}\, x^m.$$

Since Re λj < 0, one has $\lim_{x \to \infty} e^{\operatorname{Re}\lambda_j x/2} x^m = 0$, i.e. $e^{\operatorname{Re}\lambda_j x/2} x^m$ is uniformly bounded
on [0, ∞[ by some Mkl > 0. Thus, (5.64) holds with Kkl := Ckl Mkl and αkl := − Re λj /2.
In consequence, if K1 is chosen as in (5.60), then $\|e^{Ax}\| \le K e^{-\alpha|x|} \le K$ for each x ≥ 0
holds with K := K1 max{Kkl : k, l = 1, . . . , n} and α := min{αkl : k, l = 1, . . . , n}.
Conversely, if there is j ∈ {1, . . . , s} such that Re λj ≥ 0, then there is βkl such that
(5.59b) or (5.59c) or (5.59d) occurs. In each case, $\|e^{Ax}\| \not\to 0$ for x → ∞, since

$$\lim_{x \to \infty} \|e^{Ax}\|^W_{\max} = \lim_{x \to \infty} \|e^{Bx}\|_{\max} \in\ ]0, \infty], \tag{5.65}$$

i.e., the corresponding statement of (i) can not be true. The remaining case is handled
via time reversion: $\|e^{Ax}\| \le K e^{-\alpha|x|}$ holds for each x ≤ 0 if, and only if, $\|e^{-Ax}\| \le K e^{-\alpha|x|}$
holds for each x ≥ 0, which holds if, and only if, Re(−λj ) < 0 for every
j = 1, . . . , s, which is equivalent to Re λj > 0 for every j = 1, . . . , s.
It remains to consider the equivalence between (i) and (iii): Let $\|\cdot\|_{\mathrm{op}}$ and C1 , C2 > 0
be as in the proof of the equivalence between (i) and (iii) in (a). Suppose, there exist
K, α > 0 such that $\|e^{Ax}\| \le K e^{-\alpha|x|}$ holds for each x ≥ 0. Since $\|e^{Ax}\| \le K e^{-\alpha|x|} \le K$
for each x ≥ 0, 0 is positively stable by (a). Moreover,

$$\underset{\eta \in \mathbb{C}^n}{\forall}\ \ \underset{x \ge 0}{\forall}\quad \|Y(x, \eta)\| = \|e^{Ax} \eta\| \le \|e^{Ax}\|_{\mathrm{op}}\, \|\eta\| \le C_1 K\, e^{-\alpha|x|}\, \|\eta\| \to 0 \quad \text{for } x \to \infty, \tag{5.66}$$

showing 0 to be positively asymptotically stable. For the converse, we will actually
show (iii) implies (ii). If 0 is positively asymptotically stable, then, in particular, it
is positively stable, such that (ii) of (a) must hold. It merely remains to exclude the
possibility of a semisimple eigenvalue λ with Re λ = 0. If there were a semisimple
eigenvalue λ with Re λ = 0, then $e^{Bx}$ had a Jordan block of size 1 with entry $e^{\lambda x}$, i.e.
$\beta_{kk}(x) = e^{\lambda x}$ for some k ∈ {1, . . . , n}. Let ek be the corresponding standard unit vector
of Cn (all entries 0, except the kth entry, which is 1). Then, for η := W ek ,

$$\underset{\epsilon \in \mathbb{R}^+}{\forall}\ \ \underset{x \in \mathbb{R}}{\forall}\quad \|W^{-1} e^{Ax} (\epsilon\, \eta)\| = \epsilon\, \|W^{-1} e^{Ax} W e_k\| = \epsilon\, \|e^{Bx} e_k\| = \epsilon\, \|e^{\lambda x} e_k\| = \epsilon\, |e^{\lambda x}|\, \|e_k\| = \epsilon \cdot 1 \cdot \|e_k\| > 0, \tag{5.67}$$

showing 0 were not positively asymptotically stable (e.g., since $y \mapsto \|W^{-1} y\|$ defines a
norm on Cn ). The remaining case is, once again, handled via time reversion: $\|e^{Ax}\| \le K e^{-\alpha|x|}$
holds for each x ≤ 0 if, and only if, $\|e^{-Ax}\| \le K e^{-\alpha|x|}$ holds for each x ≥ 0,
which holds if, and only if, 0 is positively asymptotically stable for y ′ = −Ay, which, by
Rem. 5.25(b), holds if, and only if, 0 is negatively asymptotically stable for y ′ = Ay. ∎
Example 5.39. (a) The matrix

$$A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$$

has eigenvalues 1 and 3 and, thus, the fixed point 0 of y ′ = Ay is negatively
asymptotically stable, but not positively stable.

(b) The matrix

$$A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$$

has eigenvalue 0, which is not semisimple, i.e. the fixed point 0 of y ′ = Ay is neither
negatively nor positively stable.

(c) The matrix

$$A = \begin{pmatrix} i & 1 & 2 & 2 - 3i \\ 0 & -i & 5 & -17 \\ 0 & 0 & -1 + 3i & 0 \\ 0 & 0 & 0 & -5 \end{pmatrix}$$

has simple eigenvalues i, −i, −1 + 3i, −5, i.e. the fixed point 0 of y ′ = Ay is
positively stable (since all real parts are ≤ 0), but neither negatively stable nor
positively asymptotically stable (since there are eigenvalues with 0 real part).
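
For 2 × 2 matrices, the eigenvalue criteria of Th. 5.38 are easy to evaluate mechanically. The following is a minimal sketch under stated assumptions: it handles the 2 × 2 case only, the names `eigenvalues_2x2`, `is_semisimple` and `classify` are hypothetical, and comparisons with 0 use a floating-point tolerance `tol`, so borderline cases can be misclassified. It uses the fact that a double eigenvalue λ of a 2 × 2 matrix A is semisimple if, and only if, A = λ Id.

```python
import cmath

def eigenvalues_2x2(a):
    """Eigenvalues of a 2x2 matrix via its characteristic polynomial."""
    (p, q), (r, s) = a
    tr, det = p + s, p * s - q * r
    root = cmath.sqrt(tr * tr - 4 * det)
    return (tr + root) / 2, (tr - root) / 2

def is_semisimple(a, lam, tol=1e-12):
    """2x2 case only: a simple eigenvalue is always semisimple; a double
    eigenvalue lam is semisimple iff a equals lam * Id."""
    l1, l2 = eigenvalues_2x2(a)
    if abs(l1 - l2) > tol:
        return True
    (p, q), (r, s) = a
    return max(abs(p - lam), abs(s - lam), abs(q), abs(r)) <= tol

def classify(a, tol=1e-12):
    """Evaluate criterion (ii) of Th. 5.38(a) and (b) for y' = a y."""
    eigs = eigenvalues_2x2(a)

    def stable(sign):
        # sign = +1 tests positive stability, sign = -1 negative stability.
        return all(sign * lam.real < -tol
                   or (abs(lam.real) <= tol and is_semisimple(a, lam, tol))
                   for lam in eigs)

    def asymptotically_stable(sign):
        return all(sign * lam.real < -tol for lam in eigs)

    return {
        "positively stable": stable(+1),
        "negatively stable": stable(-1),
        "positively asymptotically stable": asymptotically_stable(+1),
        "negatively asymptotically stable": asymptotically_stable(-1),
    }
```

Applied to the matrices of (a) and (b), `classify([[2, 1], [1, 2]])` and `classify([[0, 1], [0, 0]])` reproduce the statements of the example.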

5.4 Linearization
If the right-hand side f of an autonomous ODE is differentiable and p is a fixed point (i.e.
f (p) = 0), then one can sometimes use its linearization, i.e. its derivative A := Df (p)
(which is an n × n matrix), to infer stability properties of y ′ = f (y) at p from those of
y ′ = Ay at 0 (see Th. 5.44 below). We start with some preparatory results:

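Lemma 5.40. Let n ∈ N and consider the bilinear function

$$\beta : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}, \quad \beta(y, z) := y \bullet (Bz) = y^t B z = \sum_{k,l=1}^{n} y_k\, b_{kl}\, z_l, \tag{5.68}$$

where B = (bkl ) ∈ M(n, R), “•” denotes the Euclidean scalar product, and elements of
Rn are interpreted as column vectors when involved in matrix multiplications.

(a) The function β is differentiable (it is even a polynomial, deg(β) ≤ 2, and, thus,
C ∞ ) and

$$\underset{k \in \{1, \dots, n\}}{\forall}\quad \partial_{y_k} \beta : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}, \qquad \underset{(y, z) \in \mathbb{R}^n \times \mathbb{R}^n}{\forall}\quad \partial_{y_k} \beta(y, z) = \sum_{l=1}^{n} b_{kl}\, z_l = (Bz)_k, \tag{5.69a}$$

$$\underset{l \in \{1, \dots, n\}}{\forall}\quad \partial_{z_l} \beta : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}, \qquad \underset{(y, z) \in \mathbb{R}^n \times \mathbb{R}^n}{\forall}\quad \partial_{z_l} \beta(y, z) = \sum_{k=1}^{n} y_k\, b_{kl} = (y^t B)_l, \tag{5.69b}$$

$$\underset{(y, z) \in \mathbb{R}^n \times \mathbb{R}^n}{\forall}\quad D\beta(y, z) = \nabla \beta(y, z) : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}, \qquad \nabla \beta(y, z)(u, v) = \beta(y, v) + \beta(u, z) = y^t B v + u^t B z. \tag{5.69c}$$

(b) The function

$$V : \mathbb{R}^n \to \mathbb{R}, \quad V(y) := \beta(y, y) = y \bullet (By) = y^t B y = \sum_{k,l=1}^{n} y_k\, b_{kl}\, y_l, \tag{5.70}$$

is differentiable (it is also even a polynomial, deg(V ) ≤ 2, and, thus, C ∞ ) and

$$\underset{k \in \{1, \dots, n\}}{\forall}\quad \partial_{y_k} V : \mathbb{R}^n \to \mathbb{R}, \qquad \underset{y \in \mathbb{R}^n}{\forall}\quad \partial_{y_k} V(y) = \sum_{l=1}^{n} y_l\, (b_{kl} + b_{lk}) = \bigl(y^t (B + B^t)\bigr)_k, \tag{5.71a}$$

$$\underset{y \in \mathbb{R}^n}{\forall}\quad DV(y) = \nabla V(y) : \mathbb{R}^n \to \mathbb{R}, \qquad \nabla V(y)(u) = \beta(y, u) + \beta(u, y) = y^t B u + u^t B y = y^t (B + B^t) u. \tag{5.71b}$$

Proof. (a): (5.69a) and (5.69b) are immediate from (5.68) and, then, imply (5.69c).
(b): (5.71a) is immediate from (5.70) and, then, implies (5.71b). ∎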

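Lemma 5.41. Let A, B ∈ M(n, R), n ∈ N, and V : Rn −→ R as in (5.70). Then

$$\underset{y \in \mathbb{R}^n}{\forall}\quad \nabla V(y) \bullet (Ay) = y^t (BA + A^t B) y. \tag{5.72}$$

Proof. We note

$$\underset{y \in \mathbb{R}^n}{\forall}\quad (y^t B^t) \bullet (Ay) = y^t B^t A y = \bigl(y^t B^t A y\bigr)^t = y^t A^t B y \tag{5.73}$$

and, thus, obtain

$$\underset{y \in \mathbb{R}^n}{\forall}\quad \nabla V(y) \bullet (Ay) \overset{(5.71b)}{=} \bigl(y^t (B + B^t)\bigr) \bullet (Ay) \overset{(5.73)}{=} y^t (BA + A^t B) y, \tag{5.74}$$

proving (5.72). ∎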
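
Identity (5.72) lends itself to a quick numerical plausibility check with random matrices. The sketch below is an illustration, not part of the original notes; the helper names (`transpose`, `mat_mul`, `mat_add`, `mat_vec`, `dot`, `lhs`, `rhs`, `max_gap`) are hypothetical, matrices are plain lists of rows, and the gradient is taken as (B + Bᵗ)y in accordance with (5.71b).

```python
import random

def transpose(m):
    return [list(col) for col in zip(*m)]

def mat_mul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def mat_add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def mat_vec(m, v):
    return [sum(x * y for x, y in zip(row, v)) for row in m]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def lhs(a, b, y):
    """grad V(y) . (A y), with grad V(y) = (B + B^t) y by (5.71b)."""
    grad = mat_vec(mat_add(b, transpose(b)), y)
    return dot(grad, mat_vec(a, y))

def rhs(a, b, y):
    """y^t (B A + A^t B) y, the right-hand side of (5.72)."""
    m = mat_add(mat_mul(b, a), mat_mul(transpose(a), b))
    return dot(y, mat_vec(m, y))

def max_gap(n=3, trials=200, seed=1):
    """Largest observed |lhs - rhs| over random A, B, y (illustrative ranges)."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        a = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
        b = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
        y = [rng.uniform(-1, 1) for _ in range(n)]
        worst = max(worst, abs(lhs(a, b, y) - rhs(a, b, y)))
    return worst
```

Note that B is not assumed symmetric here, matching the generality of the lemma.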

Definition 5.42. A matrix B ∈ M(n, R), n ∈ N, is called positive definite if, and only
if, the function V of (5.70) is positive definite at p = 0 in the sense of Def. 5.29.

Proposition 5.43. Let A ∈ M(n, R), n ∈ N. Then the following statements (i) – (iii)
are equivalent:

(i) There exist positive definite matrices B, C ∈ M(n, R), satisfying

$$BA + A^t B = -C. \tag{5.75}$$

(ii) Re λ < 0 holds for each eigenvalue λ ∈ C of A.

(iii) For each given positive definite (symmetric) C ∈ M(n, R), there exists a positive
definite (symmetric) B ∈ M(n, R), satisfying (5.75).

Proof. (iii) immediately implies (i) (e.g. by applying (iii) with C := Id).
For the proof that (i) implies (ii), let B, C ∈ M(n, R) be positive definite matrices,
satisfying (5.75). By Th. 5.38(b), it suffices to show 0 is a positively asymptotically
stable fixed point for y ′ = Ay. To this end, we apply Th. 5.30, using V : Rn −→ R of
(5.70) as the Lyapunov function. Then, by Def. 5.42, B being positive definite means
V being positive definite at 0. Since

$$\dot V : \mathbb{R}^n \to \mathbb{R}, \quad \dot V(y) = \nabla V(y) \bullet (Ay) \overset{(5.72)}{=} y^t (BA + A^t B) y \overset{(5.75)}{=} -y^t C y \tag{5.76}$$

and C is positive definite, V̇ is negative definite at 0, i.e. Th. 5.30 yields 0 to be a
positively asymptotically stable fixed point for y ′ = Ay as desired.
It remains to show that (ii) implies (iii). If all eigenvalues of A have negative real part,
then, as A and $A^t$ have the same eigenvalues, all eigenvalues of $A^t$ have negative real
part as well. Thus, according to Th. 5.38(b),

$$\underset{K, \alpha > 0}{\exists}\ \ \underset{x \ge 0}{\forall}\quad \Bigl( \|e^{Ax}\|_{\max} \le K e^{-\alpha x} \ \wedge\ \|e^{A^t x}\|_{\max} \le K e^{-\alpha x} \Bigr), \tag{5.77}$$

where we have chosen the norm in (5.77) to mean the max-norm on $\mathbb{R}^{n^2}$ (note that $e^{Ax}$
is real if A is real, e.g. due to the series representation (4.73)). Given C ∈ M(n, R),
define

$$B := \int_0^\infty e^{A^t x}\, C\, e^{Ax}\, dx. \tag{5.78}$$

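To verify that B ∈ M(n, R) is well-defined, note that each entry of the integrand matrix
of (5.78) constitutes an integrable function on [0, ∞[: Indeed,

$$\underset{M > 0}{\exists}\ \ \underset{x \ge 0}{\forall}\quad \bigl\|e^{A^t x}\, C\, e^{Ax}\bigr\|_{\max} \le M\, \|e^{A^t x}\|_{\max}\, \|C\|_{\max}\, \|e^{Ax}\|_{\max} \le M\, \|C\|_{\max}\, K^2 e^{-2\alpha x}, \tag{5.79}$$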

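which is integrable on [0, ∞[. Next, we compute

$$\begin{aligned}
-C &\overset{(5.79)}{=} \lim_{x \to \infty} \bigl(e^{A^t x}\, C\, e^{Ax}\bigr) - C = \lim_{x \to \infty} \int_0^x \partial_s \bigl(e^{A^t s}\, C\, e^{As}\bigr)\, ds \\
&= \int_0^\infty \partial_s \bigl(e^{A^t s}\, C\, e^{As}\bigr)\, ds \\
&\overset{(I.3)}{=} \int_0^\infty \bigl(A^t e^{A^t s}\, C\, e^{As} + e^{A^t s}\, C\, e^{As} A\bigr)\, ds \\
&\overset{(I.5),(I.6)}{=} A^t \left(\int_0^\infty e^{A^t s}\, C\, e^{As}\, ds\right) + \left(\int_0^\infty e^{A^t s}\, C\, e^{As}\, ds\right) A \\
&\overset{(5.78)}{=} A^t B + BA,
\end{aligned} \tag{5.80}$$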

showing (5.75) is satisfied. If C is positive definite and 0 ≠ y ∈ Rn , then $y^t C y > 0$,
implying

$$y^t B y = y^t \left(\int_0^\infty e^{A^t x}\, C\, e^{Ax}\, dx\right) y \overset{(I.5),(I.6)}{=} \int_0^\infty y^t e^{A^t x}\, C\, e^{Ax} y\, dx \overset{\text{Prop. 4.40(c)}}{=} \int_0^\infty (e^{Ax} y)^t\, C\, e^{Ax} y\, dx > 0, \tag{5.81}$$

showing B is positive definite as well. Finally, if C is symmetric, then

$$B^t = \left(\int_0^\infty e^{A^t x}\, C\, e^{Ax}\, dx\right)^{t} \overset{\text{Prop. 4.40(c)}}{=} \int_0^\infty e^{A^t x}\, C\, e^{Ax}\, dx = B, \tag{5.82}$$

showing B is symmetric as well. ∎
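
In small dimensions, (5.75) can also be solved directly as a linear system instead of evaluating the integral (5.78). The following sketch makes several assumptions that are not in the text: n = 2, C symmetric, B sought in the symmetric form [[b11, b12], [b12, b22]], and Cramer's rule as the solver; the names `det3` and `solve_lyapunov_2x2` are hypothetical. Comparing the entries of BA + AᵗB = −C yields three linear equations for b11, b12, b22, whose coefficient determinant equals 4 tr(A) det(A) and is therefore nonzero if both eigenvalues of A have negative real part.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def solve_lyapunov_2x2(a, c):
    """Solve B A + A^t B = -C for symmetric 2x2 B (C assumed symmetric)."""
    (p, q), (r, s) = a
    # Rows: entries (1,1), (1,2), (2,2) of B A + A^t B in terms of
    # the unknowns (b11, b12, b22).
    m = [[2 * p, 2 * r, 0.0],
         [q, p + s, r],
         [0.0, 2 * q, 2 * s]]
    rhs = [-c[0][0], -c[0][1], -c[1][1]]
    d = det3(m)
    if d == 0:
        raise ValueError("system is singular (tr(A) * det(A) == 0)")
    sol = []
    for j in range(3):
        # Cramer's rule: replace column j by the right-hand side.
        mj = [[rhs[i] if col == j else m[i][col] for col in range(3)]
              for i in range(3)]
        sol.append(det3(mj) / d)
    b11, b12, b22 = sol
    return [[b11, b12], [b12, b22]]
```

For A = [[−1, 1], [0, −2]] and C = Id, this yields B = [[1/2, 1/6], [1/6, 1/3]], which is symmetric with b11 > 0 and det B = 5/36 > 0, hence positive definite, as predicted by Prop. 5.43(iii).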


Theorem 5.44. Let Ω ⊆ Rn be open, n ∈ N, and f : Ω −→ Rn continuously differ-
entiable. Let p ∈ Ω be a fixed point (i.e. f (p) = 0) and A := Df (p) ∈ M(n, R) the
derivative of f at p. If all eigenvalues of A have negative (resp. positive) real parts, then
p is a positively (resp. negatively) asymptotically stable fixed point for y ′ = f (y).

Proof. Let all eigenvalues of A have negative real parts. We first consider the special
case p = 0, i.e. A = Df (0). By the equivalence between (ii) and (iii) of Prop. 5.43,
we can choose C := Id in (iii) to obtain the existence of a positive definite symmetric
matrix B ∈ M(n, R), satisfying

$$BA + A^t B = -\mathrm{Id}. \tag{5.83}$$

The idea is now to apply the Lyapunov Th. 5.30 with V of (5.70), i.e.

$$V : \Omega \to \mathbb{R}, \quad V(y) := y \bullet (By) = y^t B y = \sum_{k,l=1}^{n} y_k\, b_{kl}\, y_l. \tag{5.84}$$

We already know V to be continuously differentiable and positive definite. We will
conclude the proof of 0 being positively asymptotically stable by showing there exists
δ > 0, such that

$$\dot V : B_\delta(0) \to \mathbb{R}, \quad \dot V(y) = (\nabla V)(y) \bullet f(y), \tag{5.85}$$

is negative definite at 0, where we take $B_\delta(0)$ with respect to the 2-norm $\|\cdot\|_2$ on Rn .
The differentiability of f at 0 implies that (cf. [Phi15, Lem. 2.21])

$$r : \Omega \to \mathbb{R}^n, \quad r(y) := f(y) - Ay, \tag{5.86}$$

satisfies

$$\lim_{y \to 0} \frac{\|r(y)\|_2}{\|y\|_2} = 0. \tag{5.87}$$

Thus, we compute, for each y ∈ Ω,

$$\begin{aligned}
\dot V(y) = (\nabla V)(y) \bullet f(y) &\overset{(5.71b),(5.86)}{=} (\nabla V)(y) \bullet (Ay) + \bigl(y^t (B + B^t)\bigr) \bullet r(y) \\
&\overset{(5.72),\ B = B^t}{=} y^t (BA + A^t B) y + 2 y^t B\, r(y) \overset{(5.83)}{=} -\|y\|_2^2 + 2\, y \bullet B\, r(y).
\end{aligned} \tag{5.88}$$

We can estimate the second summand via the Cauchy-Schwarz inequality to obtain

$$|y \bullet B\, r(y)| \le \|y\|_2\, \|B\, r(y)\|_2 \le \|y\|_2\, \|B\|\, \|r(y)\|_2, \tag{5.89}$$

and, thus, using (5.87),

$$\lim_{y \to 0} \frac{y \bullet B\, r(y)}{\|y\|_2^2} = 0. \tag{5.90}$$

Now choose δ > 0 such that $B_\delta(0) \subseteq \Omega$ and such that

$$\underset{0 \ne y \in B_\delta(0)}{\forall}\quad \frac{|2\, y \bullet B\, r(y)|}{\|y\|_2^2} < \frac{1}{2}. \tag{5.91}$$

Then, for each $0 \ne y \in B_\delta(0)$,

$$\dot V(y) = -\|y\|_2^2 + 2\, y \bullet B\, r(y) \overset{(5.91)}{<} -\|y\|_2^2 + \frac{\|y\|_2^2}{2} = -\frac{\|y\|_2^2}{2} < 0, \tag{5.92}$$

showing V̇ to be negative definite at 0, and 0 to be positively asymptotically stable. If
p ≠ 0, then consider the ODE y ′ = g(y) := f (y + p), g : (Ω − p) −→ Rn . Then 0 is a
fixed point for y ′ = g(y), Dg(0) = Df (p) = A, i.e. 0 is positively asymptotically stable
for y ′ = g(y). But, since ψ is a solution to y ′ = g(y) if, and only if, φ = ψ +p is a solution
to y ′ = f (y), p must be positively asymptotically stable for y ′ = f (y). The remaining
case that all eigenvalues of A have positive real parts is now treated via time reversion:
If all eigenvalues of A have positive real parts, then all eigenvalues of −A = D(−f )(p)
have negative real parts, i.e. p is positively asymptotically stable for y ′ = −f (y), i.e., by
Rem. 5.25(b), p is negatively asymptotically stable for y ′ = f (y). 

Caveat 5.45. The following example shows that the converse of Th. 5.44 does not
hold: A fixed point p can be positively (resp. negatively) asymptotically stable without
A := Df (p) having only eigenvalues with negative (resp. positive) real parts. The same
example shows that, in general, one can not infer anything regarding the stability of the
fixed point p if A := Df (p) is merely stable, but not asymptotically stable: Consider

$$f : \mathbb{R}^2 \to \mathbb{R}^2, \quad f(y_1, y_2) := (y_2 + \mu y_1^3,\ -y_1 + \mu y_2^3), \quad \mu \in \mathbb{R}. \tag{5.93}$$

Then, independently of µ, (0, 0) is a fixed point and

$$Df(0, 0) = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \tag{5.94}$$

with complex eigenvalues i and −i. Thus, the linearized system is positively and negatively
stable, but not asymptotically stable, still independently of µ. However, we claim
that (0, 0) is a positively asymptotically stable fixed point for y ′ = f (y) if µ < 0 and
a negatively asymptotically stable fixed point for y ′ = f (y) if µ > 0. Indeed, this can
be seen by using the Lyapunov function V : R2 −→ R, $V(y_1, y_2) = y_1^2 + y_2^2$, which has
∇ V (y1 , y2 ) = (2y1 , 2y2 ) and

$$\dot V(y_1, y_2) = \nabla V(y_1, y_2) \bullet f(y_1, y_2) = 2\mu\, (y_1^4 + y_2^4). \tag{5.95}$$

Thus, V is positive definite at (0, 0) and V̇ is negative definite at (0, 0) for µ < 0 and
positive definite at (0, 0) for µ > 0.
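
The dependence on the sign of µ, which is invisible to the linearization, can be observed numerically. The sketch below is illustrative only: it integrates (5.93) with the classical fourth-order Runge-Kutta method (a standard scheme chosen here, not taken from the text), and the names `f`, `rk4` and `V` as well as the step sizes are assumptions. A negative step size h integrates backwards in x.

```python
def f(y, mu):
    """Right-hand side (5.93)."""
    y1, y2 = y
    return (y2 + mu * y1 ** 3, -y1 + mu * y2 ** 3)

def rk4(y, mu, h, steps):
    """Classical Runge-Kutta approximation after `steps` steps of size h."""
    for _ in range(steps):
        k1 = f(y, mu)
        k2 = f((y[0] + h / 2 * k1[0], y[1] + h / 2 * k1[1]), mu)
        k3 = f((y[0] + h / 2 * k2[0], y[1] + h / 2 * k2[1]), mu)
        k4 = f((y[0] + h * k3[0], y[1] + h * k3[1]), mu)
        y = (y[0] + h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
             y[1] + h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))
    return y

def V(y):
    """Lyapunov function V(y1, y2) = y1^2 + y2^2 of the caveat."""
    return y[0] ** 2 + y[1] ** 2
```

Starting at η = (0.5, 0), one finds that `V(rk4((0.5, 0.0), -1.0, 0.01, 10000))` has decayed well below V(η) = 0.25 (forward in x, µ = −1), and likewise `V(rk4((0.5, 0.0), 1.0, -0.01, 10000))` (backward in x, µ = 1). The decay is only algebraic: since $(y_1^2 + y_2^2)^2/2 \le y_1^4 + y_2^4 \le (y_1^2 + y_2^2)^2$, (5.95) with µ = −1 gives −2V² ≤ V̇ ≤ −V² along solutions.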

Example 5.46. Consider (x, y, z)′ = f (x, y, z) with

$$f : \mathbb{R}^3 \to \mathbb{R}^3, \quad f(x, y, z) = (-x \cos y,\ -y e^z,\ x^2 - 2z). \tag{5.96}$$

The derivative is

$$Df : \mathbb{R}^3 \to \mathcal{M}(3, \mathbb{R}), \quad Df(x, y, z) = \begin{pmatrix} -\cos y & x \sin y & 0 \\ 0 & -e^z & -y e^z \\ 2x & 0 & -2 \end{pmatrix}. \tag{5.97}$$

Clearly, (0, 0, 0) is a fixed point and Df (0, 0, 0) has eigenvalues −1 and −2. Thus,
(0, 0, 0) is a positively asymptotically stable fixed point for (x, y, z)′ = f (x, y, z) by Th.
5.44.
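
The linearization at the fixed point can be cross-checked by finite differences. The following sketch is an illustration under assumptions not made in the text: it approximates Df (0, 0, 0) by central differences with a hypothetical step size h, and the name `jacobian_at` is made up for this purpose. Since Df (0, 0, 0) turns out to be diagonal, its eigenvalues can be read off the diagonal.

```python
import math

def f(x, y, z):
    """Right-hand side (5.96)."""
    return (-x * math.cos(y), -y * math.exp(z), x * x - 2 * z)

def jacobian_at(point, h=1e-6):
    """Central-difference approximation of Df at `point` (list of rows)."""
    n = len(point)
    cols = []
    for j in range(n):
        plus, minus = list(point), list(point)
        plus[j] += h
        minus[j] -= h
        fp, fm = f(*plus), f(*minus)
        # Column j holds the partial derivatives with respect to variable j.
        cols.append([(a - b) / (2 * h) for a, b in zip(fp, fm)])
    return [[cols[j][i] for j in range(n)] for i in range(n)]
```

Evaluating `jacobian_at((0.0, 0.0, 0.0))` gives, up to rounding, the diagonal matrix with entries −1, −1, −2, in agreement with (5.97) evaluated at the origin.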

5.5 Limit Sets


Limit sets are important when studying the asymptotic behavior of solutions, i.e. φ(x)
for x → ∞ and for x → −∞. If a solution has a limit, then its corresponding limit set
consists of precisely one point. In general, the limit set of a solution is defined to consist
of all points that occur as limits of sequences taken along the solution’s orbit (of course,
the limit sets can also be empty):
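Definition 5.47. Let Ω ⊆ Kn , n ∈ N, and f : Ω −→ Kn be such that y ′ = f (y) admits
unique maximal solutions. For each η ∈ Ω, we define the omega limit set and the alpha
limit set of η as follows:

$$\omega(\eta) := \omega_f(\eta) := \Bigl\{ y \in \Omega : \underset{(x_k)_{k \in \mathbb{N}} \subseteq \mathbb{R}}{\exists} \Bigl( \lim_{k \to \infty} x_k = \infty \ \wedge\ \lim_{k \to \infty} Y(x_k, \eta) = y \Bigr) \Bigr\}, \tag{5.98a}$$

$$\alpha(\eta) := \alpha_f(\eta) := \Bigl\{ y \in \Omega : \underset{(x_k)_{k \in \mathbb{N}} \subseteq \mathbb{R}}{\exists} \Bigl( \lim_{k \to \infty} x_k = -\infty \ \wedge\ \lim_{k \to \infty} Y(x_k, \eta) = y \Bigr) \Bigr\}. \tag{5.98b}$$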

Remark 5.48. In the situation of Def. 5.47, consider the time-reversed version of y ′ =
f (y), i.e. y ′ = −f (y), with its general solution Ỹ (x, η) = Y (−x, η), cf. (5.28). Clearly,
for each η ∈ Ω,
ωf (η) = α−f (η), αf (η) = ω−f (η). (5.99)
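Proposition 5.49. In the situation of Def. 5.47, the following hold:

(a) If Y (·, η) is defined on all of $\mathbb{R}_0^+$, then

$$\omega(\eta) = \bigcap_{m=0}^{\infty} \overline{\{Y(x, \eta) : x \ge m\}}; \tag{5.100a}$$

and if Y (·, η) is defined on all of $\mathbb{R}_0^-$, then

$$\alpha(\eta) = \bigcap_{m=0}^{\infty} \overline{\{Y(x, \eta) : x \le -m\}}. \tag{5.100b}$$

(b) All points in the same orbit have the same omega and alpha limit sets, i.e.

$$\underset{x \in I_{0,\eta}}{\forall}\quad \Bigl( \omega(\eta) = \omega\bigl(Y(x, \eta)\bigr) \ \wedge\ \alpha(\eta) = \alpha\bigl(Y(x, \eta)\bigr) \Bigr).$$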

Proof. Due to Rem. 5.48, it suffices to prove the statements involving the omega limit
sets.
(a): Let y ∈ ω(η) and m ∈ N0 . Then there is a sequence (xk )k∈N in R such that
$\lim_{k \to \infty} x_k = \infty$ and $\lim_{k \to \infty} Y(x_k, \eta) = y$. Since, for sufficiently large k0 ∈ N, the
sequence $(Y(x_k, \eta))_{k \ge k_0}$ is in {Y (x, η) : x ≥ m}, the inclusion “⊆” of (5.100a) is proved.
Conversely, assume $y \in \overline{\{Y(x, \eta) : x \ge m\}}$ for each m ∈ N0 . Then,

$$\underset{k \in \mathbb{N}}{\forall}\ \ \underset{x_k \in [k, \infty[}{\exists}\quad \|Y(x_k, \eta) - y\| < \frac{1}{k},$$

providing a sequence (xk )k∈N in R such that $\lim_{k \to \infty} x_k = \infty$ and $\lim_{k \to \infty} Y(x_k, \eta) = y$,
proving y ∈ ω(η) and the inclusion “⊇” of (5.100a).
(b): Let y ∈ ω(η) and $x \in I_{0,\eta}$. Choose a sequence (xk )k∈N in R such that $\lim_{k \to \infty} x_k = \infty$
and $\lim_{k \to \infty} Y(x_k, \eta) = y$. Then $\lim_{k \to \infty} (x_k - x) = \infty$ and

$$\lim_{k \to \infty} Y\bigl(x_k - x, Y(x, \eta)\bigr) \overset{\text{Lem. 5.4(b)}}{=} \lim_{k \to \infty} Y(x_k, \eta) = y, \tag{5.101}$$

proving $\omega(\eta) \subseteq \omega\bigl(Y(x, \eta)\bigr)$. The reversed inclusion then also follows, since

$$Y\bigl(-x, Y(x, \eta)\bigr) \overset{\text{Lem. 5.4(b)}}{=} Y(0, \eta) = \eta, \tag{5.102}$$

concluding the proof. ∎

Example 5.50. Let Ω ⊆ Kn , n ∈ N, and f : Ω −→ Kn be such that y ′ = f (y) admits
unique maximal solutions.

(a) If η ∈ Ω is a fixed point, then ω(η) = α(η) = {η}. More generally, if η ∈ Ω
is such that $\lim_{x \to \infty} Y(x, \eta) = y \in \mathbb{K}^n$, then ω(η) = {y}; if η ∈ Ω is such that
$\lim_{x \to -\infty} Y(x, \eta) = y \in \mathbb{K}^n$, then α(η) = {y}.

(b) If A ∈ M(n, C) is such that the conditions of Th. 5.38(b) hold (all eigenvalues have
negative real parts, 0 is positively asymptotically stable), then ω(η) = {0} for each
η ∈ Cn and α(η) = ∅ for each η ∈ Cn \ {0}.

(c) If η ∈ Ω is such that the orbit O(φ) of φ := Y (·, η) is periodic, then ω(η) = α(η) =
O(φ). For example, for (5.8),

$$\underset{\eta \in \mathbb{R}^2}{\forall}\quad \omega(\eta) = \alpha(\eta) = S_{\|\eta\|_2}(0) = \bigl\{y \in \mathbb{R}^2 : \|y\|_2 = \|\eta\|_2\bigr\}. \tag{5.103}$$

Example 5.51. As an example with nonperiodic orbits that have limit sets consisting
of more than one point, consider

$$y_1' = y_2 + y_1 (1 - y_1^2 - y_2^2), \tag{5.104a}$$

$$y_2' = -y_1 + y_2 (1 - y_1^2 - y_2^2). \tag{5.104b}$$

We will show that, for each point except the origin (which is clearly a fixed point), the
omega limit set is the unit circle, i.e.

$$\underset{\eta \in \mathbb{R}^2 \setminus \{0\}}{\forall}\quad \omega(\eta) = S_1(0) = \{y \in \mathbb{R}^2 : y_1^2 + y_2^2 = 1\}. \tag{5.105}$$

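We first verify that the general solution is

$$Y : D_{f,0} \to \mathbb{R}^2, \quad Y(x, \eta_1, \eta_2) = \frac{(\eta_1 \cos x + \eta_2 \sin x,\ \eta_2 \cos x - \eta_1 \sin x)}{\sqrt{\eta_1^2 + \eta_2^2 + (1 - \eta_1^2 - \eta_2^2)\, e^{-2x}}}, \tag{5.106}$$

where, letting

$$\underset{\eta \in \{y \in \mathbb{R}^2 :\, \|y\|_2 > 1\}}{\forall}\quad x_\eta := \frac{1}{2} \ln\left(\frac{\|\eta\|_2^2 - 1}{\|\eta\|_2^2}\right), \tag{5.107}$$

$$D_{f,0} = \Bigl(\mathbb{R} \times \{\eta \in \mathbb{R}^2 : \|\eta\|_2 \le 1\}\Bigr) \cup \Bigl(\,]x_\eta, \infty[\ \times \{\eta \in \mathbb{R}^2 : \|\eta\|_2 > 1\}\Bigr): \tag{5.108}$$

For each (η1 , η2 ) ∈ R2 , Y (·, η1 , η2 ) satisfies the initial condition:

$$Y(0, \eta_1, \eta_2) = \frac{(\eta_1, \eta_2)}{\sqrt{\eta_1^2 + \eta_2^2 + (1 - \eta_1^2 - \eta_2^2)}} = (\eta_1, \eta_2). \tag{5.109}$$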

The following computations prepare the check that each Y (·, η1 , η2 ) satisfies (5.104):
The 2-norm squared of the numerator in (5.106) is

(η1 cos x + η2 sin x, η2 cos x − η1 sin x) 2



2
= η12 cos2 x + 2η1 η2 cos x sin x + η22 sin2 x + η22 cos2 x − 2η1 η2 cos x sin x + η12 sin2 x
= η12 + η22 = kηk22 . (5.110)

Thus,

\[ \bigl\| Y(x,\eta_1,\eta_2) \bigr\|_2 = \frac{\|\eta\|_2}{\sqrt{\|\eta\|_2^2 + (1 - \|\eta\|_2^2)e^{-2x}}} \tag{5.111} \]
and
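\begin{align*}
1 - Y_1^2(x,\eta_1,\eta_2) - Y_2^2(x,\eta_1,\eta_2) = 1 - \bigl\|Y(x,\eta_1,\eta_2)\bigr\|_2^2
&= \frac{\|\eta\|_2^2 + (1 - \|\eta\|_2^2)e^{-2x} - \|\eta\|_2^2}{\|\eta\|_2^2 + (1 - \|\eta\|_2^2)e^{-2x}} \\
&= \frac{(1 - \|\eta\|_2^2)e^{-2x}}{\|\eta\|_2^2 + (1 - \|\eta\|_2^2)e^{-2x}}. \tag{5.112}
\end{align*}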
In consequence,

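\begin{align*}
Y_1'(x,\eta_1,\eta_2)
&= \frac{(-\eta_1\sin x + \eta_2\cos x)\bigl(\|\eta\|_2^2 + (1 - \|\eta\|_2^2)e^{-2x}\bigr) + (\eta_1\cos x + \eta_2\sin x)(1 - \|\eta\|_2^2)e^{-2x}}{\bigl(\|\eta\|_2^2 + (1 - \|\eta\|_2^2)e^{-2x}\bigr)^{3/2}} \\
&= Y_2(x,\eta_1,\eta_2) + Y_1(x,\eta_1,\eta_2)\bigl(1 - Y_1^2(x,\eta_1,\eta_2) - Y_2^2(x,\eta_1,\eta_2)\bigr), \tag{5.113}
\end{align*}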
verifying (5.104a). Similarly,

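\begin{align*}
Y_2'(x,\eta_1,\eta_2)
&= \frac{(-\eta_2\sin x - \eta_1\cos x)\bigl(\|\eta\|_2^2 + (1 - \|\eta\|_2^2)e^{-2x}\bigr) + (\eta_2\cos x - \eta_1\sin x)(1 - \|\eta\|_2^2)e^{-2x}}{\bigl(\|\eta\|_2^2 + (1 - \|\eta\|_2^2)e^{-2x}\bigr)^{3/2}} \\
&= -Y_1(x,\eta_1,\eta_2) + Y_2(x,\eta_1,\eta_2)\bigl(1 - Y_1^2(x,\eta_1,\eta_2) - Y_2^2(x,\eta_1,\eta_2)\bigr), \tag{5.114}
\end{align*}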

verifying (5.104b).
For kηk2 ≤ 1, Y (·, η1 , η2 ) is maximal, as it is defined on R (the denominator in (5.106)
has no zero in this case). For kηk2 > 1, the denominator clearly has a zero at xη < 0,
where xη is defined as in (5.107). For x > xη , the expression under the square root
in (5.106) is positive. Since limx↓xη kY (x, η1 , η2 )k2 = ∞ for kηk2 > 1, Y (·, η1 , η2 ) is
maximal in this case as well, completing the verification of Y , defined as in (5.106) –
(5.108), being the general solution of (5.104).
It remains to prove (5.105). From (5.111), we obtain

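\[ \underset{\eta\in\mathbb{R}^2\setminus\{0\}}{\forall} \quad \lim_{x\to\infty} \bigl\|Y(x,\eta_1,\eta_2)\bigr\|_2 = 1, \tag{5.115} \]

which implies

\[ \underset{\eta\in\mathbb{R}^2\setminus\{0\}}{\forall} \quad \omega(\eta) \subseteq S_1(0). \tag{5.116} \]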

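Conversely, consider η = (η1, η2) ∈ R^2 \ {0} and y = (y1, y2) ∈ S_1(0). We will show
y ∈ ω(η): Since ‖y‖_2 = 1,

\[ \underset{\varphi_y\in[0,2\pi[}{\exists} \quad y = (\sin\varphi_y, \cos\varphi_y). \tag{5.117} \]

Analogously,

\[ \underset{\varphi_\eta\in[0,2\pi[}{\exists} \quad \eta = \|\eta\|_2\,(\sin\varphi_\eta, \cos\varphi_\eta) \tag{5.118} \]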

(the reader might note that, in (5.117) and (5.118), we have written y and η using their
polar coordinates, cf. [Phi15, Ex. 4.19]). Then, according to (5.106), we obtain, for each
x ≥ 0,

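\begin{align*}
Y(x,\eta_1,\eta_2)
&= \frac{\|\eta\|_2\,(\sin\varphi_\eta\cos x + \cos\varphi_\eta\sin x,\ \cos\varphi_\eta\cos x - \sin\varphi_\eta\sin x)}{\sqrt{\|\eta\|_2^2 + (1 - \|\eta\|_2^2)e^{-2x}}} \\
&= \frac{\|\eta\|_2\,\bigl(\sin(x + \varphi_\eta),\ \cos(x + \varphi_\eta)\bigr)}{\sqrt{\|\eta\|_2^2 + (1 - \|\eta\|_2^2)e^{-2x}}}. \tag{5.119}
\end{align*}

Define

\[ \underset{k\in\mathbb{N}}{\forall} \quad x_k := \varphi_y - \varphi_\eta + 2\pi k \in \mathbb{R}^+. \tag{5.120} \]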

Then limk→∞ xk = ∞ and

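\begin{align*}
\lim_{k\to\infty} Y(x_k,\eta_1,\eta_2)
&\overset{(5.119)}{=} \lim_{k\to\infty} \frac{\|\eta\|_2\,\bigl(\sin(\varphi_y - \varphi_\eta + 2\pi k + \varphi_\eta),\ \cos(\varphi_y - \varphi_\eta + 2\pi k + \varphi_\eta)\bigr)}{\sqrt{\|\eta\|_2^2 + (1 - \|\eta\|_2^2)e^{-2x_k}}} \\
&= \lim_{k\to\infty} \frac{\|\eta\|_2\,(\sin\varphi_y, \cos\varphi_y)}{\sqrt{\|\eta\|_2^2 + (1 - \|\eta\|_2^2)e^{-2x_k}}} = y, \tag{5.121}
\end{align*}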

showing y ∈ ω(η) and S1 (0) ⊆ ω(η).
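
The verification in Example 5.51 can also be checked numerically. The following is a minimal sketch, not part of the original notes: it evaluates the closed form (5.106), compares a central-difference approximation of its derivative with the right-hand side of (5.104), and evaluates the 2-norm at a large argument. The function names, the sample initial values, the sample points, and the step size `h` are assumptions chosen for illustration.

```python
import math


def f(y1, y2):
    """Right-hand side of (5.104)."""
    s = 1.0 - y1 * y1 - y2 * y2
    return (y2 + y1 * s, -y1 + y2 * s)


def Y(x, eta1, eta2):
    """Closed form (5.106); only valid while the radicand is positive."""
    n2 = eta1 * eta1 + eta2 * eta2
    radicand = n2 + (1.0 - n2) * math.exp(-2.0 * x)
    if radicand <= 0.0:
        raise ValueError("x lies outside the maximal interval of existence")
    d = math.sqrt(radicand)
    c, s = math.cos(x), math.sin(x)
    return ((eta1 * c + eta2 * s) / d, (eta2 * c - eta1 * s) / d)


def ode_residual(x, eta1, eta2, h=1e-5):
    """Max-norm of Y'(x) - f(Y(x)), with Y' taken from a central difference."""
    yp = Y(x + h, eta1, eta2)
    ym = Y(x - h, eta1, eta2)
    fy = f(*Y(x, eta1, eta2))
    return max(abs((yp[i] - ym[i]) / (2.0 * h) - fy[i]) for i in range(2))


def norm2(y):
    """Euclidean norm of a point in R^2."""
    return math.hypot(y[0], y[1])


# Hypothetical sample initial values: inside, inside, and outside the unit circle.
for eta in [(0.1, 0.0), (0.3, -0.4), (2.0, 1.0)]:
    worst = max(ode_residual(x, *eta) for x in (0.0, 0.5, 1.0, 3.0))
    print(eta, "max residual:", worst, "norm at x = 20:", norm2(Y(20.0, *eta)))
```

The residuals should be small (limited by the finite-difference error), and the norm at x = 20 should be close to 1, in line with (5.115). Such a check illustrates the computation but does not replace the proof.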

Proposition 5.52. In the situation of Def. 5.47, if f is locally Lipschitz, then orbits
that intersect an omega or alpha limit set, must entirely remain inside that same omega
or alpha limit set, i.e.
   
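\[ \Bigl( \underset{y\in\Omega\cap\omega(\eta)}{\forall}\ \underset{x\in I_{0,y}}{\forall} \quad Y(x,y)\in\omega(\eta) \Bigr) \ \wedge\ \Bigl( \underset{y\in\Omega\cap\alpha(\eta)}{\forall}\ \underset{x\in I_{0,y}}{\forall} \quad Y(x,y)\in\alpha(\eta) \Bigr). \tag{5.122} \]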

Proof. Due to Rem. 5.48, it suffices to prove the statement involving the omega limit set.
Let y ∈ Ω∩ω(η) and x ∈ I0,y . Choose a sequence (xk )k∈N in R such that limk→∞ xk = ∞
and limk→∞ Y (xk , η) = y. Then limk→∞ (xk + x) = ∞ and,
\[ \lim_{k\to\infty} Y(x_k + x, \eta) \overset{\text{Lem. 5.4(b)}}{=} \lim_{k\to\infty} Y\bigl(x, Y(x_k,\eta)\bigr) \overset{(*)}{=} Y(x,y), \tag{5.123} \]

proving Y (x, y) ∈ ω(η). At “(∗)”, we have used that, due to f being locally Lipschitz
by hypothesis, Y is continuous by Th. 3.35. 

Proposition 5.53. In the situation of Def. 5.47, let η ∈ Ω be such that there exists a
compact set K ⊆ Ω, satisfying

{Y (x, η) : x ≥ 0} ⊆ K resp. {Y (x, η) : x ≤ 0} ⊆ K . (5.124)

Then the following hold:

(a) ω(η) ≠ ∅ (resp. α(η) ≠ ∅).

(b) ω(η) (resp. α(η)) is compact.

(c) ω(η) (resp. α(η)) is a connected set, i.e. if O1, O2 are disjoint open subsets of K^n
such that ω(η) ⊆ O1 ∪ O2 (resp. α(η) ⊆ O1 ∪ O2), then ω(η) ∩ O1 = ∅ or ω(η) ∩ O2 = ∅
(resp. α(η) ∩ O1 = ∅ or α(η) ∩ O2 = ∅).

Proof. Due to Rem. 5.48, it suffices to prove the statements involving the omega limit
sets.
(a): Since, by hypothesis, (Y (k, η))k∈N is a sequence in the compact set K, it must have
a subsequence, converging to some limit y ∈ K. But then y ∈ ω(η), i.e. ω(η) ≠ ∅.
(b): According to (5.100a) and (5.124), ω(η) is a closed subset of the compact set K,
implying ω(η) to be compact as well.
(c): Seeking a contradiction, we suppose the assertion is false, i.e. there are disjoint
open subsets O1, O2 of K^n such that ω(η) ⊆ O1 ∪ O2, ω1 := ω(η) ∩ O1 ≠ ∅ and ω2 :=
ω(η) ∩ O2 ≠ ∅. Then ω1 and ω2 are disjoint since O1, O2 are disjoint. Moreover, ω1
and ω2 are both subsets of the compact set ω(η). Due to ω1 = ω(η) ∩ (K^n \ O2) and
ω2 = ω(η) ∩ (K^n \ O1), ω1 and ω2 are also closed, hence, compact. Then, according
to Prop. C.10, δ := dist(ω1, ω2) > 0. If y1 ∈ ω1 and y2 ∈ ω2, then there are numbers
0 < s1 < t1 < s2 < t2 < . . . such that lim_{k→∞} sk = lim_{k→∞} tk = ∞ and

\[ \underset{k\in\mathbb{N}}{\forall} \quad \bigl( Y(s_k,\eta)\in O_1 \ \wedge\ Y(t_k,\eta)\in O_2 \bigr). \tag{5.125} \]

Define

\[ \underset{k\in\mathbb{N}}{\forall} \quad \sigma_k := \sup\bigl\{ x \ge s_k : Y(t,\eta)\in O_1 \text{ for each } t\in[s_k,x] \bigr\}. \tag{5.126} \]

Then sk < σk < tk and the continuity of the (even differentiable) map Y (·, η) yields
ηk := Y (σk , η) ∈ ∂O1 . Thus, (ηk )k∈N is a sequence in the compact set K ∩ ∂O1 and,
therefore, must have a convergent subsequence, converging to some z ∈ K ∩ ∂O1 . But
then z ∈ ω(η), but not in O1 ∪ O2 , in contradiction to ω(η) ⊆ O1 ∪ O2 . 

Theorem 5.54 (LaSalle). Let Ω ⊆ R^n, n ∈ N, and f : Ω −→ R^n be such that y′ = f(y)
admits unique maximal solutions. Moreover, let Ω0 be an open subset of Ω, assume
V : Ω0 −→ R is continuously differentiable, K := {y ∈ Ω0 : V(y) ≤ r} is compact for
some r ∈ R, and V̇(y) ≤ 0 (resp. V̇(y) ≥ 0) for each y ∈ K, where V̇ is defined as in
(5.33). If η ∈ Ω0 is such that V(η) < r, then the following hold:

(a) Y(·, η) is defined on all of R_0^+ (resp. on all of R_0^−).

(b) One has ω(η) ⊆ K (resp. α(η) ⊆ K) and V is constant on ω(η) (resp. on α(η)).

(c) If f is locally Lipschitz, then, letting

\[ M := \bigl\{ y\in K : \dot V\bigl(Y(x,y)\bigr) = 0 \text{ for each } x\ge 0 \text{ (resp. for each } x\le 0) \bigr\}, \]

one has ω(η) ⊆ M (resp. α(η) ⊆ M ). In particular, V̇ (y) = 0 for each y ∈ ω(η)
(resp. for each y ∈ α(η)).

Proof. As usual, it suffices to prove the assertions for V̇ (y) ≤ 0, as the assertions for
V̇ (y) ≥ 0 then follow via time reversion.
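(a): We claim

\[ \underset{x\in I_{0,\eta}\cap\mathbb{R}_0^+}{\forall} \quad V\bigl(Y(x,\eta)\bigr) < r : \tag{5.127} \]

Indeed, if (5.127) does not hold, then let

\[ 0 < s := \sup\bigl\{ x\ge 0 : V\bigl(Y(t,\eta)\bigr) < r \text{ for each } t\in[0,x] \bigr\} \in I_{0,\eta}, \tag{5.128} \]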
and

\[ r = V\bigl(Y(s,\eta)\bigr) = V(\eta) + \int_0^s \dot V\bigl(Y(t,\eta)\bigr)\,dt \le V(\eta) < r, \tag{5.129} \]

which is impossible. Thus, (5.127) must hold. However, (5.127) implies R_0^+ ⊆ I_{0,η}, since
Y(·, η) is a maximal solution and K = {y ∈ Ω0 : V(y) ≤ r} is compact.
(b): Let φ := Y (·, η). During the proof of (a) above, we have shown φ(x) ∈ K for each
x ≥ 0. Since, then, (V ◦ φ)′ (x) = V̇ (φ(x)) ≤ 0 for each x ≥ 0, V ◦ φ is nonincreasing for
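x ≥ 0. Since V is bounded on the compact set K, V ◦ φ is also bounded on R_0^+; thus,

\[ \underset{c\in\mathbb{R}}{\exists} \quad c = \lim_{x\to\infty} V\bigl(\phi(x)\bigr). \tag{5.130} \]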

If y ∈ ω(η), then there exists a sequence (xk )k∈N in R such that limk→∞ xk = ∞ and
lim_{k→∞} φ(xk) = y. Thus, y ∈ K (since K is closed), and

\[ V(y) = \lim_{k\to\infty} V\bigl(\phi(x_k)\bigr) \overset{(5.130)}{=} c, \tag{5.131} \]

proving (b).
(c): Let y ∈ ω(η) and φ := Y (·, y). Since f is assumed to be locally Lipschitz, Prop.
5.52 applies and we obtain φ(x) = Y(x, y) ∈ ω(η) for each x ∈ R_0^+. Using (b), we know
V to be constant on ω(η), i.e. V ◦ φ must be constant on R_0^+ as well, implying

\[ \underset{x\in\mathbb{R}_0^+}{\forall} \quad \dot V\bigl(\phi(x)\bigr) = (V\circ\phi)'(x) = 0 \tag{5.132} \]

as claimed. 
Example 5.55. Let a < 0 < b and let h : ]a, b[ −→ R be continuously differentiable and
such that

\[ h(x) \begin{cases} < 0 & \text{for } x < 0, \\ = 0 & \text{for } x = 0, \\ > 0 & \text{for } x > 0. \end{cases} \tag{5.133} \]

Consider the autonomous ODE

\[ y_1' = y_2, \tag{5.134a} \]
\[ y_2' = -y_1^2\,y_2 - h(y_1). \tag{5.134b} \]

The right-hand side is defined on Ω :=]a, b[×R and is clearly C 1 , i.e. the ODE admits
unique maximal solutions. Due to (5.133), F = {(0, 0)}, i.e. the origin is the only fixed
point of (5.134). We will use Th. 5.54(c) to show (0, 0) is positively asymptotically
stable: We introduce
\[ H : \,]a,b[\, \longrightarrow \mathbb{R}, \quad H(x) := \int_0^x h(t)\,dt, \tag{5.135} \]

and the Lyapunov function

\[ V : \Omega \longrightarrow \mathbb{R}, \quad V(y_1,y_2) := H(y_1) + \frac{y_2^2}{2}. \tag{5.136} \]

Since H is positive definite at 0 (H is actually strictly decreasing on ]a, 0] and strictly
increasing on [0, b[), V is positive definite at (0, 0). We also obtain

\[ \dot V : \Omega \longrightarrow \mathbb{R}, \quad \dot V(y_1,y_2) = \bigl(h(y_1), y_2\bigr) \bullet \bigl(y_2, -y_1^2 y_2 - h(y_1)\bigr) = -y_1^2 y_2^2 \le 0. \tag{5.137} \]

Thus, from the Lyapunov Th. 5.30, we already know (0, 0) to be positively stable.
However, V̇ is not negative definite at (0, 0), i.e. we can not immediately conclude that
(0, 0) is positively asymptotically stable. Instead, as promised, we apply Th. 5.54(c):
To this end, using that H is continuous and positive definite at 0, we choose r > 0 and
c, d ∈ R, satisfying

a < c < 0 < d < b and H(c) = H(d) = r, (5.138)

and define

O := {(y1 , y2 ) ∈ Ω : V (y1 , y2 ) < r}, (5.139)


K := {(y1 , y2 ) ∈ Ω : V (y1 , y2 ) ≤ r}. (5.140)

Then O is open since V is continuous, and it suffices to show

\[ \underset{(\eta_1,\eta_2)\in O}{\forall} \quad \lim_{x\to\infty} Y(x,\eta_1,\eta_2) = (0,0). \tag{5.141} \]

Moreover, the continuity of V implies K to be closed. Since K ⊆ [c, d] × [−√(2r), √(2r)], it
is also bounded, i.e. compact. Thus, Th. 5.54 applies to each η ∈ O. So let η ∈ O. We
will show that M = {(0, 0)}, where M is the set of Th. 5.54(c) (then ω(η) = {(0, 0)} by
Th. 5.54(c), which implies (5.141) as desired). To verify M = {(0, 0)}, note V̇(y1, y2) < 0
for y1, y2 ≠ 0, showing (y1, y2) ∉ M. For y1 = 0, y2 ≠ 0, let φ := Y(·, y1, y2). Then
φ2(0) = y2 ≠ 0 and φ1′(0) = y2 ≠ 0, i.e. both φ1 and φ2 are nonzero on some interval
]0, ε[ with ε > 0, showing (y1, y2) ∉ M. Likewise, if y1 ≠ 0, y2 = 0, then let φ be as
before. This time φ1(0) = y1 ≠ 0 and φ2′(0) = −h(y1) ≠ 0, again showing both φ1 and
φ2 are nonzero on some interval ]0, ε[ with ε > 0, implying (y1, y2) ∉ M.
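
To illustrate Example 5.55 numerically, the following sketch integrates (5.134) for one concrete choice of h and records the Lyapunov function (5.136) along the computed trajectory. Everything specific here is an assumption made for illustration: the choice h(x) = x (which satisfies (5.133), for instance on ]a, b[ = ]−2, 2[), the initial value (1, 0), the classical fourth-order Runge-Kutta scheme, the step size, and the end time.

```python
def h(x):
    # Hypothetical choice satisfying (5.133).
    return x


def H(x):
    # H(x) = integral of h from 0 to x, cf. (5.135), evaluated for h(x) = x.
    return 0.5 * x * x


def V(y):
    # Lyapunov function (5.136).
    return H(y[0]) + 0.5 * y[1] * y[1]


def rhs(y):
    # Right-hand side of (5.134).
    return (y[1], -y[0] * y[0] * y[1] - h(y[0]))


def rk4_step(y, dt):
    """One step of the classical fourth-order Runge-Kutta method."""
    def shifted(k, s):
        return (y[0] + s * k[0], y[1] + s * k[1])

    k1 = rhs(y)
    k2 = rhs(shifted(k1, 0.5 * dt))
    k3 = rhs(shifted(k2, 0.5 * dt))
    k4 = rhs(shifted(k3, dt))
    return tuple(
        y[i] + dt / 6.0 * (k1[i] + 2.0 * k2[i] + 2.0 * k3[i] + k4[i])
        for i in range(2)
    )


def lyapunov_values(eta, t_end=50.0, dt=0.01):
    """Values of V along the numerically computed solution starting at eta."""
    y, values = eta, [V(eta)]
    for _ in range(int(round(t_end / dt))):
        y = rk4_step(y, dt)
        values.append(V(y))
    return values


values = lyapunov_values((1.0, 0.0))
print("V at x = 0:", values[0], " V at x = 50:", values[-1])
```

Up to discretization error, the recorded values should be nonincreasing, matching V̇ ≤ 0 from (5.137). The decay is expected to be slow, since V̇ vanishes whenever y1 = 0 or y2 = 0; this is the reason the example needs Th. 5.54(c) rather than a negative definite V̇.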

A Differentiability
We provide a lemma used in the variation of constants Th. 2.3.

Lemma A.1. Let O ⊆ R be open. If the function a : O −→ K is differentiable, then

\[ f : O \longrightarrow \mathbb{K}, \quad f(x) := e^{a(x)} \tag{A.1a} \]

is differentiable with

\[ f' : O \longrightarrow \mathbb{K}, \quad f'(x) := a'(x)\,e^{a(x)}. \tag{A.1b} \]


Proof. For K = R, the lemma is immediate from the chain rule of [Phi16, Th. 9.11].
It remains to consider the case K = C. Note that we can not apply the chain rule for
holomorphic (i.e. C-differentiable) functions, since a is only R-differentiable and it does
not need to have a holomorphic extension. However, we can argue as follows, merely
using the chain rule and the product rule for real-valued functions: Write a = b + ic
with differentiable functions b, c : O −→ R. Then

\[ f(x) = e^{a(x)} = e^{b(x)+ic(x)} = e^{b(x)} e^{ic(x)} = e^{b(x)}\bigl(\cos c(x) + i\sin c(x)\bigr). \tag{A.2} \]

Thus, one computes

\begin{align*}
f'(x) &= b'(x)\,e^{b(x)} e^{ic(x)} + e^{b(x)}\bigl(-c'(x)\sin c(x) + i c'(x)\cos c(x)\bigr) \\
&= b'(x)\,e^{a(x)} + i c'(x)\,e^{b(x)}\bigl(\cos c(x) + i\sin c(x)\bigr) = b'(x)\,e^{a(x)} + i c'(x)\,e^{b(x)} e^{ic(x)} \\
&= \bigl(b'(x) + i c'(x)\bigr)e^{a(x)} = a'(x)\,e^{a(x)}, \tag{A.3}
\end{align*}

proving (A.1b). 
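
As a plausibility check of (A.1b) in the case K = C, the following sketch compares the formula a′(x) e^{a(x)} with a central-difference approximation of f′. The concrete function a(x) = x² + i sin x, the sample points, and the step size are assumptions chosen purely for illustration.

```python
import cmath
import math


def a(x):
    # Hypothetical R-differentiable, complex-valued function a = b + ic
    # with b(x) = x^2 and c(x) = sin x.
    return complex(x * x, math.sin(x))


def a_prime(x):
    # Derivative of a, computed componentwise: b'(x) + i c'(x).
    return complex(2.0 * x, math.cos(x))


def f(x):
    # f(x) = e^{a(x)} as in (A.1a).
    return cmath.exp(a(x))


def f_prime_formula(x):
    # Right-hand side of (A.1b).
    return a_prime(x) * cmath.exp(a(x))


def f_prime_numeric(x, h=1e-6):
    # Central difference quotient of f with respect to the real variable x.
    return (f(x + h) - f(x - h)) / (2.0 * h)


for x in (-1.0, 0.0, 0.7):
    print(x, abs(f_prime_formula(x) - f_prime_numeric(x)))
```

The printed differences should be tiny compared to |f′(x)|, limited only by the finite-difference and rounding errors.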

B Kn-Valued Integration
During the course of this class, we frequently need K^n-valued integrals. In particular,
for f : I −→ K^n, I an interval in R, we make use of the estimate ‖∫_I f‖ ≤ ∫_I ‖f‖,

for example in the proof of the Peano Th. 3.8. As mentioned in the proof of Th. 3.8,
the estimate can easily be checked directly for the 1-norm on Kn , but it does hold for
every norm on Kn . To verify this result is the main purpose of the present section.
Throughout the class, it suffices to use Riemann integrals. However, some readers
might be more familiar with Lebesgue integrals, which is a more general notion (every
Riemann integrable function is also Lebesgue integrable). For convenience, the material
is presented twice, first using Riemann integrals and arguments that make specific use
of techniques available for Riemann integrals, then, second, using Lebesgue integrals
and corresponding techniques. For Riemann integrals, the norm estimate is proved in
Th. B.4, for Lebesgue integrals in Th. B.9.

B.1 Kn -Valued Riemann Integral


Definition B.1. Let a, b ∈ R, I := [a, b]. We call a function f : I −→ Kn , n ∈ N,
Riemann integrable if, and only if, each coordinate function fj = πj ◦ f : I −→ K,
j = 1, . . . , n, is Riemann integrable. Denote the set of all Riemann integrable functions
from I into Kn by R(I, Kn ). If f : I −→ Kn is Riemann integrable, then
\[ \int_I f := \Bigl( \int_I f_1, \dots, \int_I f_n \Bigr) \in \mathbb{K}^n \tag{B.1} \]

is the (Kn -valued) Riemann integral of f over I.


Remark B.2. The linearity of the K-valued integral implies the linearity of the Kn -
valued integral.
Theorem B.3. Let a, b ∈ R, a ≤ b, I := [a, b]. If f ∈ R(I, Kn ), n ∈ N, and φ :
f (I) −→ R is Lipschitz continuous, then φ ◦ f ∈ R(I, R).

Proof. If K = R, then φ ◦ f = ψ ◦ ι ◦ f, where ι : R^n −→ C^n is the canonical imbedding,
and ψ : C^n −→ R, ψ(z1, . . . , zn) := φ(Re z1, . . . , Re zn). Clearly, ι ◦ f ∈ R(I, C^n), and,
if φ is L-Lipschitz, L ≥ 0, then, due to

\begin{align*}
\underset{z,w\in\mathbb{C}^n}{\forall} \quad |\psi(z) - \psi(w)| &= |\phi(\operatorname{Re} z) - \phi(\operatorname{Re} w)| \le L\,\|\operatorname{Re} z - \operatorname{Re} w\| \\
&\overset{(*)}{\le} CL\,\|\operatorname{Re} z - \operatorname{Re} w\|_1 = CL\sum_{j=1}^n |\operatorname{Re} z_j - \operatorname{Re} w_j| \\
&\overset{\text{[Phi16, Th. 5.9(d)]}}{\le} CL\sum_{j=1}^n |z_j - w_j| = CL\,\|z - w\|_1 \overset{(**)}{\le} \tilde C C L\,\|z - w\|, \tag{B.2}
\end{align*}

where the estimate at (∗) holds with C ∈ R+, due to the equivalence of ‖·‖ and ‖·‖_1 on
R^n, and the estimate at (∗∗) holds with C̃ ∈ R+, due to the equivalence of ‖·‖_1 and ‖·‖
on C^n. Thus, by (B.2), ψ is Lipschitz as well, namely C̃CL-Lipschitz, and it suffices to
consider the case K = C, which we proceed to do next. Once again using the equivalence
of ‖·‖_1 and ‖·‖ on C^n, there exists c ∈ R+ such that ‖z‖ ≤ c‖z‖_1 for each z ∈ C^n.
Assume φ to be L-Lipschitz, L ≥ 0. If f ∈ R(I, Cn ), then Re f1 , . . . , Re fn ∈ R(I, R)
and Im f1 , . . . , Im fn ∈ R(I, R), i.e., given ǫ > 0, Riemann’s integrability criterion of
[Phi16, Th. 10.13] provides partitions ∆1 , . . . , ∆n of I and Π1 , . . . , Πn of I such that
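\[ \underset{j=1,\dots,n}{\forall} \quad R(\Delta_j, \operatorname{Re} f_j) - r(\Delta_j, \operatorname{Re} f_j) < \frac{\epsilon}{2ncL}, \qquad R(\Pi_j, \operatorname{Im} f_j) - r(\Pi_j, \operatorname{Im} f_j) < \frac{\epsilon}{2ncL}, \tag{B.3} \]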
where R and r denote upper and lower Riemann sums, respectively (cf. [Phi16, (10.7)]).
Letting ∆ be a joint refinement of the 2n partitions ∆1 , . . . , ∆n , Π1 , . . . , Πn , we have (cf.
[Phi16, Def. 10.8(a),(b)] and [Phi16, Th. 10.10(a)])
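\[ \underset{j=1,\dots,n}{\forall} \quad R(\Delta, \operatorname{Re} f_j) - r(\Delta, \operatorname{Re} f_j) < \frac{\epsilon}{2ncL}, \qquad R(\Delta, \operatorname{Im} f_j) - r(\Delta, \operatorname{Im} f_j) < \frac{\epsilon}{2ncL}. \tag{B.4} \]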
Recalling that, for each g : I −→ R and ∆ = (x0 , . . . , xN ) ∈ RN +1 , N ∈ N, a = x0 <
x1 < · · · < xN = b, Ik := [xk−1 , xk ], it is
\[ r(\Delta, g) = \sum_{k=1}^N m_k\,|I_k| = \sum_{k=1}^N m_k(g)\,(x_k - x_{k-1}), \tag{B.5a} \]
\[ R(\Delta, g) = \sum_{k=1}^N M_k\,|I_k| = \sum_{k=1}^N M_k(g)\,(x_k - x_{k-1}), \tag{B.5b} \]
where
mk (g) := inf{g(x) : x ∈ Ik }, Mk (g) := sup{g(x) : x ∈ Ik }, (B.5c)
we obtain, for each ξk , ηk ∈ Ik ,

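\begin{align*}
\bigl| (\phi\circ f)(\xi_k) - (\phi\circ f)(\eta_k) \bigr|
&\le L\,\bigl\| f(\xi_k) - f(\eta_k) \bigr\| \le cL\,\bigl\| f(\xi_k) - f(\eta_k) \bigr\|_1
= cL\sum_{j=1}^n \bigl| f_j(\xi_k) - f_j(\eta_k) \bigr| \\
&\overset{\text{[Phi16, Th. 5.9(d)]}}{\le} cL\sum_{j=1}^n \bigl| \operatorname{Re} f_j(\xi_k) - \operatorname{Re} f_j(\eta_k) \bigr| + cL\sum_{j=1}^n \bigl| \operatorname{Im} f_j(\xi_k) - \operatorname{Im} f_j(\eta_k) \bigr| \\
&\le cL\sum_{j=1}^n \bigl( M_k(\operatorname{Re} f_j) - m_k(\operatorname{Re} f_j) \bigr) + cL\sum_{j=1}^n \bigl( M_k(\operatorname{Im} f_j) - m_k(\operatorname{Im} f_j) \bigr). \tag{B.6}
\end{align*}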

Thus,
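\begin{align*}
R(\Delta, \phi\circ f) - r(\Delta, \phi\circ f)
&= \sum_{k=1}^N \bigl( M_k(\phi\circ f) - m_k(\phi\circ f) \bigr)\,|I_k| \\
&\overset{(B.6)}{\le} cL\sum_{k=1}^N\sum_{j=1}^n \bigl( M_k(\operatorname{Re} f_j) - m_k(\operatorname{Re} f_j) \bigr)\,|I_k|
+ cL\sum_{k=1}^N\sum_{j=1}^n \bigl( M_k(\operatorname{Im} f_j) - m_k(\operatorname{Im} f_j) \bigr)\,|I_k| \\
&= cL\sum_{j=1}^n \bigl( R(\Delta, \operatorname{Re} f_j) - r(\Delta, \operatorname{Re} f_j) \bigr) + cL\sum_{j=1}^n \bigl( R(\Delta, \operatorname{Im} f_j) - r(\Delta, \operatorname{Im} f_j) \bigr) \\
&\overset{(B.4)}{<} 2ncL\,\frac{\epsilon}{2ncL} = \epsilon. \tag{B.7}
\end{align*}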
Thus, φ ◦ f ∈ R(I, R) by [Phi16, Th. 10.13]. 
Theorem B.4. Let a, b ∈ R, a ≤ b, I := [a, b]. For each norm ‖·‖ on K^n, n ∈ N, and
each Riemann integrable f : I −→ K^n, it is ‖f‖ ∈ R(I, R), and the following holds:

\[ \Bigl\| \int_I f \Bigr\| \le \int_I \|f\|. \tag{B.8} \]

Proof. From Th. B.3, we obtain ‖f‖ ∈ R(I, R), as the norm ‖·‖ is 1-Lipschitz by the
inverse triangle inequality. Let ∆ be an arbitrary partition of I. Recalling that, for
each g : I −→ R and ∆ = (x0, . . . , xN) ∈ R^{N+1}, N ∈ N, a = x0 < x1 < · · · < xN = b,
Ik := [x_{k−1}, x_k], ξk ∈ Ik, the intermediate Riemann sums are

\[ \rho(\Delta, g) = \sum_{k=1}^N g(\xi_k)\,|I_k| = \sum_{k=1}^N g(\xi_k)\,(x_k - x_{k-1}), \tag{B.9} \]
we obtain, for ξk ∈ Ik ,
  
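\begin{align*}
&\Bigl\| \Bigl( \bigl(\rho(\Delta, \operatorname{Re} f_1), \rho(\Delta, \operatorname{Im} f_1)\bigr), \dots, \bigl(\rho(\Delta, \operatorname{Re} f_n), \rho(\Delta, \operatorname{Im} f_n)\bigr) \Bigr) \Bigr\| \\
&\quad= \Bigl\| \Bigl( \Bigl(\sum_{k=1}^N \operatorname{Re} f_1(\xi_k)\,|I_k|, \sum_{k=1}^N \operatorname{Im} f_1(\xi_k)\,|I_k|\Bigr), \dots, \Bigl(\sum_{k=1}^N \operatorname{Re} f_n(\xi_k)\,|I_k|, \sum_{k=1}^N \operatorname{Im} f_n(\xi_k)\,|I_k|\Bigr) \Bigr) \Bigr\| \\
&\quad= \Bigl\| \sum_{k=1}^N \Bigl( \bigl(\operatorname{Re} f_1(\xi_k)\,|I_k|, \operatorname{Im} f_1(\xi_k)\,|I_k|\bigr), \dots, \bigl(\operatorname{Re} f_n(\xi_k)\,|I_k|, \operatorname{Im} f_n(\xi_k)\,|I_k|\bigr) \Bigr) \Bigr\| \\
&\quad\le \sum_{k=1}^N \Bigl\| \Bigl( \bigl(\operatorname{Re} f_1(\xi_k), \operatorname{Im} f_1(\xi_k)\bigr), \dots, \bigl(\operatorname{Re} f_n(\xi_k), \operatorname{Im} f_n(\xi_k)\bigr) \Bigr) \Bigr\|\,|I_k| \\
&\quad= \sum_{k=1}^N \|f(\xi_k)\|\,|I_k| = \rho(\Delta, \|f\|). \tag{B.10}
\end{align*}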

Since the intermediate Riemann sums in (B.10) converge to the respective integrals by
[Phi16, (10.25b)], one obtains
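\begin{align*}
\Bigl\| \int_I f \Bigr\|
&= \lim_{|\Delta|\to 0} \Bigl\| \Bigl( \bigl(\rho(\Delta, \operatorname{Re} f_1), \rho(\Delta, \operatorname{Im} f_1)\bigr), \dots, \bigl(\rho(\Delta, \operatorname{Re} f_n), \rho(\Delta, \operatorname{Im} f_n)\bigr) \Bigr) \Bigr\| \\
&\overset{(B.10)}{\le} \lim_{|\Delta|\to 0} \rho(\Delta, \|f\|) = \int_I \|f\|, \tag{B.11}
\end{align*}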

proving (B.8). 
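
The mechanism behind (B.10) is that the triangle inequality already holds for every intermediate Riemann sum. The sketch below illustrates this for a sample function f : [0, 1] −→ R² and three norms on R². The integrand, the midpoint choice of the intermediate points ξ_k, and the number of subintervals are assumptions made for illustration only.

```python
import math


def f(t):
    # Hypothetical integrand f : [0, 1] -> R^2.
    return (math.cos(2.0 * math.pi * t), t - 0.5)


# Three norms on R^2; (B.8) is claimed for every norm.
NORMS = {
    "1-norm": lambda v: sum(abs(c) for c in v),
    "2-norm": lambda v: math.sqrt(sum(c * c for c in v)),
    "max-norm": lambda v: max(abs(c) for c in v),
}


def riemann_sums(norm, n=1000, a=0.0, b=1.0):
    """Return (norm of the Riemann sum of f, Riemann sum of norm(f)).

    Uses an equidistant partition with midpoints as intermediate points.
    """
    width = (b - a) / n
    total = [0.0, 0.0]
    total_of_norms = 0.0
    for k in range(n):
        value = f(a + (k + 0.5) * width)
        total[0] += value[0] * width
        total[1] += value[1] * width
        total_of_norms += norm(value) * width
    return norm(total), total_of_norms


for name, norm in NORMS.items():
    lhs, rhs = riemann_sums(norm)
    print(name, ":", lhs, "<=", rhs)
```

For this particular f, both components integrate to zero over [0, 1], so the left-hand side is close to 0 while the right-hand side is clearly positive; the inequality is far from sharp here.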

B.2 Kn -Valued Lebesgue Integral


Definition B.5. Let I ⊆ R be (Lebesgue) measurable, n ∈ N.

(a) A function f : I −→ K^n is called (Lebesgue) measurable (respectively, (Lebesgue)
integrable) if, and only if, each coordinate function fj = πj ◦ f : I −→ K, j =
1, . . . , n, is (Lebesgue) measurable (respectively, (Lebesgue) integrable), which, for
K = C, means if, and only if, each Re fj and each Im fj, j = 1, . . . , n, is (Lebesgue)
measurable (respectively, (Lebesgue) integrable).

(b) If f : I −→ K^n is integrable, then

\[ \int_I f := \Bigl( \int_I f_1, \dots, \int_I f_n \Bigr) \in \mathbb{K}^n \tag{B.12} \]

is the (Kn -valued) (Lebesgue) integral of f over I.


Remark B.6. The linearity of the K-valued integral implies the linearity of the Kn -
valued integral.
Theorem B.7. Let I ⊆ R be measurable, n ∈ N. Then f : I −→ K^n is measurable in
the sense of Def. B.5(a) if, and only if, f^{−1}(O) is measurable for each open subset O of
K^n.
Proof. Assume f^{−1}(O) is measurable for each open subset O of K^n. Let j ∈ {1, . . . , n}.
If Oj ⊆ K is open in K, then O := πj^{−1}(Oj) = {z ∈ K^n : zj ∈ Oj} is open in K^n.
Thus, fj^{−1}(Oj) = f^{−1}(O) is measurable, showing that each fj is measurable, i.e. f is
measurable. Now assume f is measurable, i.e. each fj is measurable. Since every open
O ⊆ K^n is a countable union of open sets of the form O = O1 × · · · × On with each Oj
being an open subset of K, it suffices to show that the preimages of such open sets are
measurable. So let O be as above. Then f^{−1}(O) = ⋂_{j=1}^{n} fj^{−1}(Oj), showing that f^{−1}(O)
is measurable.

Corollary B.8. Let I ⊆ R be measurable, n ∈ N. If f : I −→ K^n is measurable, then
‖f‖ : I −→ R is measurable.

Proof. If O ⊆ R is open, then ‖·‖^{−1}(O) is an open subset of K^n by the continuity of
the norm. In consequence, ‖f‖^{−1}(O) = f^{−1}(‖·‖^{−1}(O)) is measurable.



Theorem B.9. Let I ⊆ R be measurable, n ∈ N. For each norm ‖·‖ on K^n and each
integrable f : I −→ K^n, the following holds:

\[ \Bigl\| \int_I f \Bigr\| \le \int_I \|f\|. \tag{B.13} \]

Proof. First assume that B ⊆ I is measurable, y ∈ K^n, and f = y χ_B, where χ_B is the
characteristic function of B (i.e. the fj are yj on B and 0 on I \ B). Then

\[ \Bigl\| \int_I f \Bigr\| = \bigl\| \bigl( y_1\lambda(B), \dots, y_n\lambda(B) \bigr) \bigr\| = \lambda(B)\,\|y\| = \int_I \|f\|, \tag{B.14} \]

where λ denotes Lebesgue measure on R. Next, consider the case that f is a so-called
simple function, that means f takes only finitely many values y1, . . . , yN ∈ K^n, N ∈ N,
and each preimage Bj := f^{−1}({yj}) ⊆ I is measurable. Then

\[ f = \sum_{j=1}^N y_j\,\chi_{B_j}, \tag{B.15} \]

where, without loss of generality, we may assume that the Bj are pairwise disjoint. We
obtain

\begin{align*}
\Bigl\| \int_I f \Bigr\|
&\le \sum_{j=1}^N \Bigl\| \int_I y_j\,\chi_{B_j} \Bigr\| = \sum_{j=1}^N \int_I \bigl\| y_j\,\chi_{B_j} \bigr\| = \int_I \sum_{j=1}^N \bigl\| y_j\,\chi_{B_j} \bigr\| \\
&\overset{(*)}{=} \int_I \Bigl\| \sum_{j=1}^N y_j\,\chi_{B_j} \Bigr\| = \int_I \|f\|, \tag{B.16}
\end{align*}

where, at (∗), it was used that, as the Bj are disjoint, the integrands of the two integrals
are equal at each x ∈ I.
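Now, if f is integrable, then each Re fj and each Im fj is integrable (i.e. Re fj, Im fj ∈
L^1(I)) and there exist sequences of simple functions φ_{j,k} : I −→ R and ψ_{j,k} : I −→ R
such that lim_{k→∞} ‖φ_{j,k} − Re fj‖_{L^1(I)} = lim_{k→∞} ‖ψ_{j,k} − Im fj‖_{L^1(I)} = 0. In particular,

\[ 0 \le \lim_{k\to\infty} \Bigl| \int_I \phi_{j,k} - \int_I \operatorname{Re} f_j \Bigr| \le \lim_{k\to\infty} \|\phi_{j,k} - \operatorname{Re} f_j\|_{L^1(I)} = 0, \tag{B.17a} \]
\[ 0 \le \lim_{k\to\infty} \Bigl| \int_I \psi_{j,k} - \int_I \operatorname{Im} f_j \Bigr| \le \lim_{k\to\infty} \|\psi_{j,k} - \operatorname{Im} f_j\|_{L^1(I)} = 0, \tag{B.17b} \]

and also

\begin{align*}
0 &\le \lim_{k\to\infty} \|\phi_{j,k} + i\psi_{j,k} - f_j\|_{L^1(I)} \\
&\le \lim_{k\to\infty} \|\phi_{j,k} - \operatorname{Re} f_j\|_{L^1(I)} + \lim_{k\to\infty} \|\psi_{j,k} - \operatorname{Im} f_j\|_{L^1(I)} = 0. \tag{B.18}
\end{align*}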

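Thus, writing φk := (φ_{1,k}, . . . , φ_{n,k}) and ψk := (ψ_{1,k}, . . . , ψ_{n,k}), we obtain

\begin{align*}
\Bigl\| \int_I f \Bigr\|
&= \Bigl\| \Bigl( \int_I f_1, \dots, \int_I f_n \Bigr) \Bigr\| \\
&= \Bigl\| \Bigl( \lim_{k\to\infty}\int_I \phi_{1,k} + i\lim_{k\to\infty}\int_I \psi_{1,k}, \dots, \lim_{k\to\infty}\int_I \phi_{n,k} + i\lim_{k\to\infty}\int_I \psi_{n,k} \Bigr) \Bigr\| \\
&= \lim_{k\to\infty} \Bigl\| \int_I (\phi_k + i\psi_k) \Bigr\| \le \lim_{k\to\infty} \int_I \|\phi_k + i\psi_k\| \overset{(*)}{=} \int_I \|f\|, \tag{B.19}
\end{align*}

where the equality at (∗) holds due to lim_{k→∞} ‖ ‖φk + iψk‖ − ‖f‖ ‖_{L^1(I)} = 0, which,
in turn, is verified by

\begin{align*}
0 &\le \int_I \bigl| \|\phi_k + i\psi_k\| - \|f\| \bigr| \le \int_I \|\phi_k + i\psi_k - f\| \le C\int_I \|\phi_k + i\psi_k - f\|_1 \\
&= C\int_I \sum_{j=1}^n \bigl| \phi_{j,k} + i\psi_{j,k} - f_j \bigr| \to 0 \quad \text{for } k\to\infty, \tag{B.20}
\end{align*}

with C ∈ R+ since the norms ‖·‖ and ‖·‖_1 are equivalent on K^n.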

C Metric Spaces

C.1 Distance in Metric Spaces


Lemma C.1. The following law holds in every metric space (X, d):
|d(x, y) − d(x′ , y ′ )| ≤ d(x, x′ ) + d(y, y ′ ) for each x, x′ , y, y ′ ∈ X. (C.1)
In particular, (C.1) states the Lipschitz continuity of d : X^2 −→ R_0^+ (with Lipschitz
constant 1) with respect to the metric d1 on X^2 defined by

\[ d_1 : X^2\times X^2 \longrightarrow \mathbb{R}_0^+, \quad d_1\bigl((x,y),(x',y')\bigr) = d(x,x') + d(y,y'). \tag{C.2} \]
Further consequences are the continuity and even uniform continuity of d, and also the
continuity of d in both components.
Proof. First, note d(x, y) ≤ d(x, x′ ) + d(x′ , y ′ ) + d(y ′ , y), i.e.

d(x, y) − d(x′ , y ′ ) ≤ d(x, x′ ) + d(y ′ , y). (C.3a)

Second, d(x′ , y ′ ) ≤ d(x′ , x) + d(x, y) + d(y, y ′ ), i.e.

d(x′ , y ′ ) − d(x, y) ≤ d(x′ , x) + d(y, y ′ ). (C.3b)

Taken together, (C.3a) and (C.3b) complete the proof of (C.1). 


Definition C.2. Let (X, d) be a nonempty metric space. For each A, B ⊆ X define the
distance between A and B by

dist(A, B) := inf{d(a, b) : a ∈ A, b ∈ B} ∈ [0, ∞] (C.4)

and
∀ dist(x, B) := dist({x}, B) and dist(A, x) := dist(A, {x}). (C.5)
x∈X

Remark C.3. Clearly, for dist(A, B) as defined in (C.4), we have

dist(A, B) < ∞ ⇔ A ≠ ∅ and B ≠ ∅. (C.6)

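Theorem C.4. Let (X, d) be a nonempty metric space. If A ⊆ X and A ≠ ∅, then the
functions

\[ \delta, \tilde\delta : X \longrightarrow \mathbb{R}_0^+, \quad \delta(x) := \operatorname{dist}(x,A), \quad \tilde\delta(x) := \operatorname{dist}(A,x), \tag{C.7} \]
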
are both Lipschitz continuous with Lipschitz constant 1 (in particular, they are both
continuous and even uniformly continuous).

Proof. Since dist(x, A) = dist(A, x), it suffices to verify the Lipschitz continuity of δ.
We need to show
∀ | dist(x, A) − dist(y, A)| ≤ d(x, y). (C.8)
x,y∈X

To this end, let x, y ∈ X and a ∈ A be arbitrary. Then

dist(x, A) ≤ d(x, a) ≤ d(x, y) + d(y, a) (C.9)

and
dist(x, A) − d(x, y) ≤ d(y, a), (C.10)
implying
dist(x, A) − d(x, y) ≤ dist(y, A) (C.11)
and
dist(x, A) − dist(y, A) ≤ d(x, y). (C.12)
Since x, y ∈ X were arbitrary, (C.12) also yields

dist(y, A) − dist(x, A) ≤ d(x, y), (C.13)

where (C.12) and (C.13) together are precisely (C.8). 
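
The estimate (C.8) can be illustrated with a small experiment in the Euclidean plane. The sketch below samples random pairs of points and records the largest value of |dist(x, A) − dist(y, A)| − d(x, y), which by Th. C.4 should never be positive (up to rounding). The finite set `A`, the sampling box, the sample size, and the seed are assumptions chosen for illustration.

```python
import math
import random


def d(x, y):
    """Euclidean metric on R^2."""
    return math.hypot(x[0] - y[0], x[1] - y[1])


# Hypothetical nonempty finite set A in R^2.
A = [(0.0, 0.0), (1.0, 2.0), (-3.0, 0.5)]


def dist_to_set(x, points=A):
    """dist(x, A) as in (C.4), (C.5); the infimum is a minimum for finite A."""
    return min(d(x, a) for a in points)


def max_lipschitz_violation(samples=1000, seed=0):
    """Largest observed value of |dist(x,A) - dist(y,A)| - d(x,y)."""
    rng = random.Random(seed)
    worst = -math.inf
    for _ in range(samples):
        x = (rng.uniform(-5.0, 5.0), rng.uniform(-5.0, 5.0))
        y = (rng.uniform(-5.0, 5.0), rng.uniform(-5.0, 5.0))
        worst = max(worst, abs(dist_to_set(x) - dist_to_set(y)) - d(x, y))
    return worst


print("largest observed violation of (C.8):", max_lipschitz_violation())
```

A nonpositive result (up to floating-point rounding) is consistent with the Lipschitz constant 1; a random experiment of this kind can only support, never prove, the estimate.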


Definition C.5. Let (X, d) be a metric space, A ⊆ X, and ε ∈ R+. Define

\[ A_\epsilon := \{x\in X : d(x,A) < \epsilon\}, \tag{C.14a} \]
\[ \overline{A}_\epsilon := \{x\in X : d(x,A) \le \epsilon\}. \tag{C.14b} \]

We call A_ε the open ε-fattening of A, and Ā_ε the closed ε-fattening of A.

Lemma C.6. Let (X, d) be a metric space, A ⊆ X, and ε ∈ R+. Then A_ε, the open
ε-fattening of A, is, indeed, open, and Ā_ε, the closed ε-fattening of A, is, indeed, closed.

Proof. Since the distance function δ : X −→ R_0^+, δ(x) := dist(x, A), is continuous by
Th. C.4, A_ε = δ^{−1}[0, ε[ is open as the continuous preimage of an open set (note that [0, ε[
is, indeed, (relatively) open in R_0^+); Ā_ε = δ^{−1}[0, ε] is closed as the continuous preimage
of a closed set.

Lemma C.7. Let (X, d) be a metric space, A ⊆ X, and ε ∈ R+. If A is bounded, then
so are the fattenings A_ε and Ā_ε.

Proof. If A is bounded, then there exist x ∈ X and r > 0 such that A ⊆ B_r(x). Let
s := r + ε + 1. If y ∈ Ā_ε, then there exists a ∈ A such that d(a, y) < ε + 1. Thus,

d(x, y) ≤ d(x, a) + d(a, y) < r + ε + 1 = s, (C.15)

showing A_ε ⊆ Ā_ε ⊆ B_s(x), i.e. A_ε and Ā_ε are bounded.


Proposition C.8. Let (X, d) be a metric space, A ⊆ X, and 0 < ε1 < ε2.

(a) Then A ⊆ A_{ε1} ⊆ Ā_{ε1} ⊆ A_{ε2} ⊆ Ā_{ε2} always holds.

(b) If (X, ‖·‖) is a normed space with d being the induced metric, ∅ ≠ A ⊆ X, and there
exists x ∉ A, satisfying δ := d(x, A) ≥ ε2, then all the inclusions in (a) are strict:
A ⊊ A_{ε1} ⊊ Ā_{ε1} ⊊ A_{ε2} ⊊ Ā_{ε2}. Caveat: For general metric spaces X and A satisfying
all the hypotheses, the inclusions do not need to be strict (consider discrete metric
spaces for simple examples).

Proof. (a) is immediate from (C.14).

To prove (b), let a ∈ A and consider the maps

\[ \phi : [0,1] \longrightarrow X, \quad \phi(t) := t\,x + (1 - t)\,a, \tag{C.16a} \]
\[ f : [0,1] \longrightarrow \mathbb{R}, \quad f(t) := d\bigl(\phi(t), A\bigr). \tag{C.16b} \]

If (sn)_{n∈N} is a sequence in [0, 1] such that lim_{n→∞} sn = s ∈ [0, 1], then lim_{n→∞} φ(sn) =
sx + (1 − s)a = φ(s), i.e. φ is continuous. Then, using Th. C.4, f is also continuous. Thus,
since f(0) = d(a, A) = 0 and f(1) = d(x, A) = δ ≥ ε2, one can use the intermediate
value theorem [Phi16, Th. 7.57] to obtain, for each ε ∈ [0, ε2], some τ ∈ [0, 1], satisfying
f(τ) = ε. If ε > 0, then d(φ(τ), A) = f(τ) = ε > 0, i.e. φ(τ) ∈ Ā_ε \ A and φ(τ) ∈ Ā_ε \ A_ε,
showing A ⊊ A_{ε1}, A_{ε1} ⊊ Ā_{ε1}, and A_{ε2} ⊊ Ā_{ε2}. If ε := (ε1 + ε2)/2, then ε1 < ε = f(τ) =
d(φ(τ), A) < ε2, i.e. φ(τ) ∈ A_{ε2} \ Ā_{ε1}, showing Ā_{ε1} ⊊ A_{ε2}.
C.2 Compactness in Metric Spaces


Definition C.9. A subset C of a metric space X is called compact if, and only if, every
sequence in C has a subsequence that converges to some limit c ∈ C.
Proposition C.10. Let (X, d) be a metric space, C, A ⊆ X. If C is compact, A is
closed, and A ∩ C = ∅, then dist(C, A) > 0.

Proof. Proceeding by contraposition, we show that dist(C, A) = 0 implies A ∩ C ≠ ∅.
If dist(C, A) = 0, then there exists a sequence ((ck, ak))_{k∈N} in C × A such that

\[ \lim_{k\to\infty} d(c_k, a_k) = 0. \tag{C.17} \]

As C is compact, we may assume

\[ \lim_{k\to\infty} c_k = c \in C, \tag{C.18} \]

also implying

\[ \lim_{k\to\infty} a_k = c, \quad \text{since} \quad \underset{k\in\mathbb{N}}{\forall} \quad d(a_k, c) \le d(a_k, c_k) + d(c_k, c). \tag{C.19} \]

Since A is closed, (C.19) yields c ∈ A, i.e. c ∈ A ∩ C. 


Proposition C.11. Let (X, d) be a metric space and C ⊆ X.

(a) If C is compact, then C is closed and bounded.


(b) If C is compact and A ⊆ C is closed, then A is compact.

Proof. (a): Suppose C is compact. Let (xk )k∈N be a sequence in C that converges in
X, i.e. limk→∞ xk = x ∈ X. Since C is compact, (xk )k∈N must have a subsequence
that converges to some c ∈ C, implying x = c ∈ C and showing C is closed. If C
is not bounded, then, for each x ∈ X, there is a sequence (xk )k∈N in C such that
limk→∞ d(x, xk ) = ∞. If y ∈ X, then d(x, xk ) ≤ d(x, y) + d(y, xk ), i.e. d(y, xk ) ≥
d(x, xk ) − d(x, y), showing that limk→∞ d(y, xk ) = ∞ as well. Thus, y can not be a limit
of any subsequence of (xk )k∈N . As y was arbitrary, C can not be compact.
(b): If (xk )k∈N is a sequence in A, then (xk )k∈N is a sequence in C. Since C is compact,
it must have a subsequence that converges to some c ∈ C. However, as A is closed, c
must be in A, showing that (xk )k∈N has a subsequence that converges to some c ∈ A,
i.e. A is compact. 
Corollary C.12. A subset C of Kn , n ∈ N, is compact if, and only if, C is closed and
bounded.

Proof. Every compact set is closed and bounded by Prop. C.11(a). If C is closed and
bounded, and (xk )k∈N is a sequence in C, then the boundedness and the Bolzano-
Weierstrass theorem yield a subsequence that converges to some x ∈ Kn . However,
since C is closed, x ∈ C, showing that C is compact. 
The following examples show that, in general, sets can be closed and bounded without
being compact.
Example C.13. (a) If (X, d) is a noncomplete metric space, then it contains a Cauchy
sequence that does not converge. It is not hard to see that such a sequence can
not have a convergent subsequence, either. This shows that no noncomplete metric
space can be compact. Moreover, the closure of every bounded subset of X that
contains such a nonconvergent Cauchy sequence is an example of a closed and
bounded set that is noncompact. Concrete examples are given by Q ∩ [a, b] for each
a, b ∈ R with a < b (these sets are Q-closed, but not R-closed!) and ]a, b[ for each
a, b ∈ R with a < b, in each case endowed with the usual metric d(x, y) := |x − y|.

(b) There can also be closed and bounded sets in complete spaces that are not compact.
Consider the space X of all bounded sequences (x_n)_{n∈N} in K, endowed with the sup-
norm ‖(x_n)_{n∈N}‖_sup := sup{|x_n| : n ∈ N}. It is not too difficult to see that X with
the sup-norm is a Banach space: Let (x^k)_{k∈N} with x^k = (x^k_n)_{n∈N} be a Cauchy
sequence in X. Then, for each n ∈ N, (x^k_n)_{k∈N} is a Cauchy sequence in K, and,
thus, it has a limit y_n ∈ K. Let y := (y_n)_{n∈N}. Then

\[ \|x^k - y\|_{\sup} = \sup\{ |x^k_n - y_n| : n\in\mathbb{N} \}. \]

Let ε > 0. As (x^k)_{k∈N} is a Cauchy sequence with respect to the sup-norm, there is
N ∈ N such that ‖x^k − x^l‖_sup < ε for all k, l > N. Fix some l > N and some n ∈ N.
Then ε ≥ lim_{k→∞} |x^k_n − x^l_n| = |y_n − x^l_n|. Since this is valid for each n ∈ N,
we get ‖x^l − y‖_sup ≤ ε for each l > N, showing lim_{l→∞} x^l = y, i.e. X is complete
and a Banach space.
Now consider the sequence (e^k)_{k∈N} with

\[ e^k_n := \begin{cases} 1 & \text{for } k = n, \\ 0 & \text{otherwise}. \end{cases} \]

Then (e^k)_{k∈N} constitutes a sequence in X with ‖e^k‖_sup = 1 for each k ∈ N. In par-
ticular, (e^k)_{k∈N} is a sequence inside the closed unit ball B̄_1(0), and, hence, bounded.
However, if k, l ∈ N with k ≠ l, then ‖e^k − e^l‖_sup = 1. Thus, neither (e^k)_{k∈N} nor any
subsequence can be a Cauchy sequence. In particular, no subsequence can converge,
showing that the closed and bounded unit ball B̄_1(0) is not compact.
Note: There is an important result that shows that a normed vector space is finite-
dimensional if, and only if, the closed unit ball B 1 (0) is compact (see, e.g., [Str08,
Th. 28.14]).
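
The key computation in Example C.13(b), namely that distinct unit sequences have sup-norm distance 1, can be reproduced on finite truncations. In the sketch below, the truncation length `LENGTH` and the helper names are assumptions made for illustration; for k, l ≤ LENGTH, the truncated distance agrees with the distance of the full sequences, since all later entries of e^k and e^l vanish.

```python
def unit_sequence(k, length):
    """First `length` entries of e^k: entry n (1-based) is 1 if n == k, else 0."""
    return [1.0 if n == k else 0.0 for n in range(1, length + 1)]


def sup_norm(x):
    """Sup-norm of a finite list of numbers."""
    return max(abs(s) for s in x)


def sup_distance(x, y):
    """Sup-norm distance of two lists of equal length."""
    return max(abs(s - t) for s, t in zip(x, y))


LENGTH = 8  # hypothetical truncation length
E = [unit_sequence(k, LENGTH) for k in range(1, LENGTH + 1)]

# Set of all pairwise distances between distinct unit sequences.
pairwise = {
    sup_distance(E[i], E[j])
    for i in range(LENGTH)
    for j in range(LENGTH)
    if i != j
}
print(pairwise)  # prints {1.0}
```

Since all pairwise distances equal 1, no subsequence of (e^k) can be Cauchy, which is exactly the obstruction to compactness used in the example.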
Theorem C.14. If (X, dX ) and (Y, dY ) are metric spaces, C ⊆ X is compact, and
f : C −→ Y is continuous, then f (C) is compact.

Proof. If (y k )k∈N is a sequence in f (C), then, for each k ∈ N, there is some xk ∈ C


such that f (xk ) = y k . As C is compact, there is a subsequence (ak )k∈N of (xk )k∈N
with limk→∞ ak = a for some a ∈ C. Then (f (ak ))k∈N is a subsequence of (y k )k∈N and
the continuity of f yields limk→∞ f (ak ) = f (a) ∈ f (C), showing that (y k )k∈N has a
convergent subsequence with limit in f (C). We have therefore established that f (C) is
compact. 

Theorem C.15. If (X, d) is a metric space, C ⊆ X is compact, and f : C −→ R is


continuous, then f assumes its max and its min, i.e. there are xm ∈ C and xM ∈ C
such that f has a global min at xm and a global max at xM .

Proof. Since C is compact and f is continuous, f (C) ⊆ R is compact according to Th.


C.14. Then, by [Phi16, Lem. 7.53], f (C) contains a smallest element m and a largest
element M . This, in turn, implies that there are xm , xM ∈ C such that f (xm ) = m and
f (xM ) = M . 

Theorem C.16. If (X, dX ) and (Y, dY ) are metric spaces, C ⊆ X is compact, and
f : C −→ Y is continuous, then f is uniformly continuous.

Proof. If f is not uniformly continuous, then there must be some ǫ > 0 such that, for
each k ∈ N, there exist xk , y k ∈ C satisfying dX (xk , y k ) < 1/k and dY (f (xk ), f (y k )) ≥ ǫ.
Since C is compact, there is a ∈ C and a subsequence (ak )k∈N of (xk )k∈N such that
a = limk→∞ ak . Then there is a corresponding subsequence (bk )k∈N of (y k )k∈N such that
dX (ak , bk ) < 1/k and dY (f (ak ), f (bk )) ≥ ǫ for all k ∈ N. Using the compactness of C
again, there is b ∈ C and a subsequence (v k )k∈N of (bk )k∈N such that b = limk→∞ v k .
Now there is a corresponding subsequence (uk )k∈N of (ak )k∈N such that dX (uk , v k ) <
1/k and dY(f(u^k), f(v^k)) ≥ ε for all k ∈ N. Note that we still have a = lim_{k→∞} u^k.
Given α > 0, there is N ∈ N such that, for each k > N, one has dX(a, u^k) < α/3,
dX(b, v^k) < α/3, and dX(u^k, v^k) < 1/k < α/3. Thus, dX(a, b) ≤ dX(a, u^k) + dX(u^k, v^k) +
dX(b, v^k) < α, implying dX(a, b) = 0 and a = b. Finally, the continuity of f implies
f(a) = lim_{k→∞} f(u^k) = lim_{k→∞} f(v^k) in contradiction to dY(f(u^k), f(v^k)) ≥ ε.

Theorem C.17. If (X, dX ) and (Y, dY ) are metric spaces, C ⊆ X is compact, and
f : C −→ Y is continuous and one-to-one, then f −1 : f (C) −→ C is continuous.

Proof. Let (y^k)_{k∈N} be a sequence in f(C) such that lim_{k→∞} y^k = y ∈ f(C). Then there
is a sequence (xk )k∈N in C such that f (xk ) = y k for each k ∈ N. Let x := f −1 (y).
It remains to prove that limk→∞ xk = x. As C is compact, there is a ∈ C and a
subsequence (ak )k∈N of (xk )k∈N such that a = limk→∞ ak . The continuity of f yields
f (a) = limk→∞ f (ak ) = limk→∞ y k = y = f (x) since (f (ak ))k∈N is a subsequence of
(y k )k∈N . It now follows that a = x since f is one-to-one. The same argument shows
that every convergent subsequence of (xk )k∈N has to converge to x. If (xk )k∈N did not
converge to x, then there would have to be some ǫ > 0 such that infinitely many xk are not in
Bǫ (x). However, the compactness of C would provide a convergent subsequence whose
limit could not be x, in contradiction to x having to be the limit of all convergent
subsequences of (xk )k∈N . 

Definition C.18. A subset A of a metric space (X, d) is called precompact or totally
bounded if, and only if, for each ε > 0, A can be covered by finitely many ε-balls, i.e. if,
and only if, there exist finitely many points a1 , . . . , aN ∈ A, N ∈ N, such that
\[ A \subseteq \bigcup_{j=1}^{N} B_\epsilon(a_j). \tag{C.20} \]

Theorem C.19. For a subset C of a metric space (X, d), the following statements are
equivalent:

(i) C is compact as defined in Def. C.9.

(ii) C has the Heine-Borel property, i.e. every open cover of C has a finite subcover,
i.e. if (Oj )j∈I is a family of open sets Oj ⊆ X, satisfying
\[ C \subseteq \bigcup_{j \in I} O_j, \tag{C.21} \]
then there exist j1 , . . . , jN ∈ I, N ∈ N, such that $C \subseteq \bigcup_{k=1}^{N} O_{j_k}$.

(iii) C is precompact (i.e. totally bounded) as defined in Def. C.18 and complete, i.e.
every Cauchy sequence in C converges to a limit in C.

Proof. We show (i) ⇒ (iii) ⇒ (ii) ⇒ (i).


“(i) ⇒ (iii)”: Let (cn )n∈N be a Cauchy sequence in C. As C is compact, (cn )n∈N has a
subsequence $(c_{n_j})_{j\in\mathbb N}$ such that $\lim_{j\to\infty} c_{n_j} = c \in C$. Given ε > 0, choose K ∈ N such
that, for each m, n ≥ K, $d(c_m, c_n) < \epsilon/2$, and such that, for each nj ≥ K, $d(c_{n_j}, c) < \epsilon/2$.
Then, fixing some nj ≥ K,
\[ \underset{n \ge K}{\forall} \quad d(c_n, c) \le d(c_n, c_{n_j}) + d(c_{n_j}, c) < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon, \tag{C.22} \]
showing limn→∞ cn = c and the completeness of C. We now show C to be also totally
bounded. We proceed by contraposition and assume C not to be totally bounded, i.e.
there exists ǫ > 0 such that C is not contained in any finite union of ǫ-balls. Inductively,
we construct a sequence (cn )n∈N in C such that
\[ \underset{m,n \in \mathbb N,\ m \ne n}{\forall} \quad d(c_m, c_n) \ge \epsilon: \tag{C.23} \]

To start with, we note C ≠ ∅ and choose some arbitrary c1 ∈ C. Assuming c1 , . . . , ck ∈ C,
k ∈ N, have already been constructed such that d(cm , cn ) ≥ ε holds for each m, n ∈
{1, . . . , k} with m ≠ n, there must be
\[ c \in C \setminus \bigcup_{j=1}^{k} B_\epsilon(c_j). \tag{C.24} \]

Choosing ck+1 := c, (C.24) guarantees (C.23) now holds for each m, n ∈ {1, . . . , k + 1}.
Due to (C.23), no subsequence of (cn )n∈N can be a Cauchy sequence, i.e. (cn )n∈N does
not have a convergent subsequence, proving C is not compact.

“(iii) ⇒ (ii)”: Assume C to be precompact and complete. For each k ∈ N, the precompactness
yields points $c^k_1, \dots, c^k_{N_k} \in C$, Nk ∈ N, such that
\[ C \subseteq \bigcup_{j=1}^{N_k} B_{1/k}(c^k_j). \tag{C.25} \]

Seeking a contradiction, assume C does not have the Heine-Borel property, i.e. there
exists an open cover (Oj )j∈I of C which does not have a finite subcover. Inductively, we
construct a decreasing sequence of subsets Ck of C, C ⊇ C1 ⊇ C2 ⊇ . . . , such that no
Ck can be covered by a finite subcover of (Oj )j∈I and such that
\[ \underset{k \in \mathbb N}{\forall} \ \underset{j \in \{1,\dots,N_k\}}{\exists} \quad C_k \subseteq B_{1/k}(c^k_j): \tag{C.26} \]
To start out, we note that (C.25) implies at least one of the finitely many sets
$C \cap B_1(c^1_1), \dots, C \cap B_1(c^1_{N_1})$ cannot be covered by a finite subcover of (Oj )j∈I , say,
$C \cap B_1(c^1_{j_1})$. Define $C_1 := C \cap B_1(c^1_{j_1})$. Then, given C1 , . . . , Ck have already been
constructed for some k ∈ N, since Ck cannot be covered by a finite subcover of (Oj )j∈I and
\[ C_k \subseteq C \subseteq \bigcup_{j=1}^{N_{k+1}} B_{\frac{1}{k+1}}(c^{k+1}_j), \tag{C.27} \]
there exists $j_{k+1} \in \{1, \dots, N_{k+1}\}$ such that $C_k \cap B_{\frac{1}{k+1}}(c^{k+1}_{j_{k+1}})$ cannot be covered by a
finite subcover of (Oj )j∈I , either. Define $C_{k+1} := C_k \cap B_{\frac{1}{k+1}}(c^{k+1}_{j_{k+1}})$. For each k ∈ N,
choose some sk ∈ Ck (note Ck ≠ ∅, as it cannot be covered by finitely many Oj ). Given
ε > 0, there is K ∈ N such that 2/K < ε. If k, l ≥ K, then $s_k, s_l \in C_K \subseteq B_{1/K}(c^K_j)$ for some
suitable j ∈ {1, . . . , NK }. In particular, d(sk , sl ) < 2/K < ε, showing (sk )k∈N is a Cauchy
sequence. As (sk )k∈N is a Cauchy sequence in C and C is complete, there exists c ∈ C
such that limk→∞ sk = c. However, then there must exist some j ∈ I such that c ∈ Oj
and, since Oj is open, there is ε > 0 with Bε (c) ⊆ Oj , and Bε (c) must contain almost
all of the sk . Choose k sufficiently large such that 1/k < ε/4 and d(sk , c) < ε/2. Then, since
\[ s_k \in C_k \subseteq B_{1/k}(c^k_j), \tag{C.28} \]
one has
\[ \underset{x \in B_{1/k}(c^k_j)}{\forall} \quad d(x, c) \le d(x, s_k) + d(s_k, c) < \frac{2}{k} + \frac{\epsilon}{2} < \frac{2\epsilon}{4} + \frac{\epsilon}{2} = \epsilon, \tag{C.29} \]
showing $C_k \subseteq B_{1/k}(c^k_j) \subseteq B_\epsilon(c) \subseteq O_j$, in contradiction to Ck not being coverable by
finitely many Oj .
“(ii) ⇒ (i)”: Assume C has the Heine-Borel property. Seeking a contradiction, assume C
is not compact, that means there exists a sequence (cn )n∈N in C such that no subsequence
of (cn )n∈N converges to a limit in C. According to [Phi15, Prop. 1.38(d)], no c ∈ C can
be a cluster point of (cn )n∈N , i.e., for each c ∈ C, there exists εc > 0 such that $B_{\epsilon_c}(c)$
contains only finitely many of the cn . Since $C \subseteq \bigcup_{c \in C} B_{\epsilon_c}(c)$, the family $\big(B_{\epsilon_c}(c)\big)_{c \in C}$
constitutes an open cover of C. As C has the Heine-Borel property, there exist finitely
many points a1 , . . . , aN ∈ C, N ∈ N, such that $C \subseteq \bigcup_{j=1}^{N} B_{\epsilon_{a_j}}(a_j)$, i.e. C contains only
finitely many of the cn , in contradiction to (cn )n∈N being a sequence in C. ∎
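The inductive construction around (C.23) and (C.24) can be tried out on a finite sample set. The following is only a sketch under stated assumptions: the sample set (a grid in the unit square), the value of eps, and the helper name `greedy_separated_points` are hypothetical and not taken from the notes. The loop keeps a point only if it is at distance at least eps from all points kept so far; on a bounded set it has to stop, and the eps-balls around the kept points then cover the set as in (C.20).

```python
import math

def greedy_separated_points(points, eps):
    """Keep a point only if its distance to every point kept so far is >= eps
    (this mirrors the inductive choice of c_{k+1} in (C.24))."""
    centers = []
    for p in points:
        if all(math.dist(p, c) >= eps for c in centers):
            centers.append(p)
    return centers

# Hypothetical sample set: a finite grid in the unit square [0, 1]^2.
grid = [(i / 20, j / 20) for i in range(21) for j in range(21)]
eps = 0.25
centers = greedy_separated_points(grid, eps)

# The kept points are pairwise eps-separated, as in (C.23).
separated = all(
    math.dist(c, d) >= eps
    for i, c in enumerate(centers)
    for d in centers[i + 1:]
)
# Every sample point lies in the open eps-ball around some kept point,
# i.e. the balls around the kept points cover the sample set, as in (C.20).
covered = all(any(math.dist(p, c) < eps for c in centers) for p in grid)
```

A rejected point is, by construction, closer than eps to some kept point, which is why the covering check succeeds.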

Caveat C.20. In general topological spaces, one defines compactness via the Heine-
Borel property (a topological space C is defined to be compact if, and only if, C has
the Heine-Borel property). Moreover, a topological space C is defined to be sequentially
compact if, and only if, every sequence in C has a convergent subsequence. Using this
terminology, one can rephrase the equivalence between (i) and (ii) in Th. C.19 by stating
that a metric space is sequentially compact if, and only if, it is compact. However, in
general topological spaces, neither implication remains true ((iii) of Th. C.19 does not
even make sense in general topological spaces, as the concepts of boundedness, total
boundedness, and Cauchy sequences are, in general, not available): For an example
of a topological space that is compact, but not sequentially compact, see, e.g. [Pre75,
7.2.10(a)]; for an example of a topological space that is sequentially compact, but not
compact, see, e.g. [Pre75, 7.2.10(c)].
Theorem C.21 (Lebesgue Number). Let (X, d) be a metric space and C ⊆ X. If C is
compact and (Oj )j∈I is an open cover of C, then there exists a Lebesgue number δ for
the open cover, i.e. some δ > 0 such that, for each A ⊆ C with diam A < δ, there exists
j0 ∈ I, where A ⊆ Oj0 . Recall that
\[ \operatorname{diam} A = \begin{cases} 0 & \text{for } A = \emptyset, \\ \sup\{ d(x, y) : x, y \in A \} & \text{for } \emptyset \ne A. \end{cases} \tag{C.30} \]

Proof. Seeking a contradiction, assume there is no Lebesgue number for the open cover
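(Oj )j∈I . Then there are sequences (xk )k∈N in C and (Ak )k∈N in P(C) such that
\[ \underset{k \in \mathbb N}{\forall} \quad \Big( x_k \in A_k, \quad \operatorname{diam} A_k < \frac{1}{k}, \quad \text{and} \quad \underset{j \in I}{\forall} \ A_k \not\subseteq O_j \Big). \tag{C.31} \]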

As C is compact, we may assume (passing to a subsequence, if necessary) that
limk→∞ xk = c ∈ C. Then there must be Oj such that c ∈ Oj and ε > 0 such that
Bε (c) ⊆ Oj . If k ∈ N is such that 1/k < ε/2 and d(xk , c) < ε/2, then, for each a ∈ Ak ,
we have d(a, c) ≤ d(a, xk ) + d(xk , c) < ε/2 + ε/2 = ε,
implying the contradiction Ak ⊆ Bε (c) ⊆ Oj . ∎
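For a concrete feeling of what a Lebesgue number is, one can look at a cover of [0, 1] by two open intervals. The sketch below is an illustration under assumptions, not part of the notes: the cover, the sample grid, and the function names are hypothetical. It uses the observation that, if every point x of C has distance at least δ to the complement of some Oj, then each set A containing x with diam A < δ lies in that Oj.

```python
def distance_to_complement(x, interval):
    """Distance from x to the complement of the open interval ]a, b[
    (equal to 0 if x does not lie in the interval)."""
    a, b = interval
    return max(0.0, min(x - a, b - x))

def lebesgue_number_estimate(cover, samples):
    """Minimum over the sample points of the largest distance to a complement."""
    return min(
        max(distance_to_complement(x, interval) for interval in cover)
        for x in samples
    )

# Hypothetical open cover of C = [0, 1] by two intervals.
cover = [(-0.1, 0.6), (0.4, 1.1)]
# Sample grid on [0, 1]; the minimum is only taken over these points.
samples = [k / 1000 for k in range(1001)]
delta = lebesgue_number_estimate(cover, samples)
```

For this particular cover, the function being minimized is piecewise linear with kinks on the grid, so the grid minimum (approximately 0.1) should agree with the true value up to rounding; for general covers, a grid only yields an estimate.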

D Local Lipschitz Continuity


In Prop. 3.13, it was shown that a continuous function is locally Lipschitz with respect
to y if, and only if, it is globally Lipschitz with respect to y on every compact set.
The following Prop. D.1 shows that this equivalence holds even if f is not continuous,
provided that each projection Gx as in (D.1) below is convex. On the other hand, Ex.
D.2 shows that, in general, there exist discontinuous functions that are locally Lipschitz
with respect to y without being globally Lipschitz with respect to y on every compact
set.
Proposition D.1. Let m, n ∈ N, G ⊆ R × Km , and f : G −→ Kn . If G is such that
each projection
Gx := {y ∈ Km : (x, y) ∈ G}, x ∈ R, (D.1)

is convex (in particular, if G itself is convex), then f is locally Lipschitz with respect to
y if, and only if, f is (globally) Lipschitz with respect to y on every compact subset K
of G.

Proof. The proof of Prop. 3.13 shows, without making use of the continuity of f , that
(global) Lipschitz continuity with respect to y on every compact subset K of G implies
local Lipschitz continuity on G. Thus, assume f to be locally Lipschitz with respect to
y and assume each Gx to be convex. The proof of Prop. 3.13 shows, without making
use of the continuity of f , that, for each K ⊆ G compact,
\[ \underset{\delta > 0,\ L \ge 0}{\exists} \ \underset{(x,y),(x,\bar y) \in K}{\forall} \quad \Big( \|y - \bar y\| < \delta \ \Rightarrow \ \|f(x, y) - f(x, \bar y)\| \le L \|y - \bar y\| \Big). \tag{D.2} \]

If (x, y), (x, ȳ) ∈ K are arbitrary with y ≠ ȳ, then the convexity of Gx implies

{(x, (1 − t)y + tȳ) : t ∈ [0, 1]} ⊆ G. (D.3)

Choose N ∈ N such that N > 2‖y − ȳ‖/δ and set h := ‖y − ȳ‖/N . Then

h < δ/2. (D.4)

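Define
\[ \underset{k = 0, \dots, N}{\forall} \quad y_k := \frac{kh}{\|y - \bar y\|}\, y + \Big( 1 - \frac{kh}{\|y - \bar y\|} \Big) \bar y. \tag{D.5} \]
Then
\[ \underset{k = 0, \dots, N-1}{\forall} \quad \|y_{k+1} - y_k\| = \Big\| \frac{h}{\|y - \bar y\|}\, y - \frac{h}{\|y - \bar y\|}\, \bar y \Big\| = h < \delta \tag{D.6} \]
and
\[ \|f(x, y) - f(x, \bar y)\| \le \sum_{k=0}^{N-1} \|f(x, y_k) - f(x, y_{k+1})\| \overset{\text{(D.2)}}{\le} L \sum_{k=0}^{N-1} \|y_k - y_{k+1}\| = L\, N h = L\, \|y - \bar y\|, \tag{D.7} \]
showing f to be (globally) L-Lipschitz with respect to y on K. ∎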


Example D.2. We provide two examples that show that, in general, a discontinuous
function can be locally Lipschitz with respect to y without being globally Lipschitz with
respect to y on every compact set.

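(a) Consider
\[ G := {]-2, 2[} \times \big( {]-4, -1[} \cup {]1, 4[} \big) \tag{D.8} \]
and
\[ f : G \longrightarrow \mathbb R, \quad f(x, y) := \begin{cases} 1/x & \text{for } x \ne 0,\ y \in {]-4, -1[}, \\ 0 & \text{for } x = 0,\ y \in {]-4, -1[}, \\ 0 & \text{for } y \in {]1, 4[}. \end{cases} \tag{D.9} \]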

For the following open balls with respect to the max norm ‖(x, y)‖ := max{|x|, |y|},
one has B1 (x, y) ∩ G ⊆ ]−2, 2[ × ]−4, −1[ for y ∈ ]−4, −1[, and B1 (x, y) ∩ G ⊆
]−2, 2[ × ]1, 4[ for y ∈ ]1, 4[. Thus, f (x, ·) is constant on each set B1 (x, y) ∩ G (either
constantly equal to 1/x or constantly equal to 0), i.e. 0-Lipschitz with respect to y.
In particular, f is locally Lipschitz with respect to y. However, f is not Lipschitz
continuous with respect to y on the compact set
\[ K := [-1, 1] \times \big( [-3, -2] \cup [2, 3] \big): \tag{D.10} \]
For the sequence $((x_k, y_k, \bar y_k))_{k \in \mathbb N}$, where
\[ \underset{k \in \mathbb N}{\forall} \quad x_k := 1/k, \quad y_k := -2, \quad \bar y_k := 2, \tag{D.11} \]
one has
\[ \lim_{k \to \infty} \frac{|f(x_k, y_k) - f(x_k, \bar y_k)|}{|y_k - \bar y_k|} = \lim_{k \to \infty} \frac{k - 0}{2 - (-2)} = \infty, \tag{D.12} \]
showing f is not Lipschitz continuous with respect to y on K.


(b) If one increases the dimension by 1, then one can modify the example in (a) such
that the set G is even connected (this variant was pointed out by Anton Sporrer): Let
\[ A := \big( {]-4, -1[} \times {]-2, 2[} \big) \cup \big( {]-4, 4[} \times {]-2, 0[} \big) \cup \big( {]1, 4[} \times {]-2, 2[} \big) \subseteq \mathbb R^2. \tag{D.13} \]
Then A is open and connected (but not convex) and the same holds for
\[ G := {]-2, 2[} \times A \subseteq \mathbb R^3. \tag{D.14} \]
Define
\[ f : G \longrightarrow \mathbb R, \quad f(x, y_1, y_2) := \begin{cases} 1/x & \text{for } x \ne 0,\ y_1 \in {]-4, -1[},\ y_2 > 0, \\ 0 & \text{otherwise}. \end{cases} \tag{D.15} \]
Then everything works essentially as in (a) (it might be helpful to graphically
visualize the set A and the behavior of the function f ): For the following open balls
with respect to the max norm ‖(x, y1 , y2 )‖ := max{|x|, |y1 |, |y2 |}, one has
\[ \underset{(x, y_1, y_2) \in G,\ y_1 \in {]-4, -1[}}{\forall} \quad \Big( (\xi, \eta_1, \eta_2) \in B_1(x, y_1, y_2) \cap G \ \Rightarrow \ \eta_1 < -1 + 1 = 0 < 1 \Big). \tag{D.16} \]
Analogous to (a), f (x, ·) is constant on each set B1 (x, y1 , y2 ) ∩ G (either constantly
equal to 1/x or constantly equal to 0), i.e. 0-Lipschitz with respect to y. In particular,
f is locally Lipschitz with respect to y. However, f is not Lipschitz continuous
with respect to y on the compact set
\[ K := [-1, 1] \times \big( [-3, -2] \cup [2, 3] \big) \times [-1, 1]: \tag{D.17} \]
For the sequence $((x_k, y_{1,k}, \bar y_{1,k}, y_{2,k}))_{k \in \mathbb N}$ with
\[ \underset{k \in \mathbb N}{\forall} \quad x_k := 1/k, \quad y_{1,k} := -2, \quad \bar y_{1,k} := 2, \quad y_{2,k} := 0, \tag{D.18} \]
one has
\[ \lim_{k \to \infty} \frac{|f(x_k, y_{1,k}, y_{2,k}) - f(x_k, \bar y_{1,k}, y_{2,k})|}{\|(y_{1,k}, y_{2,k}) - (\bar y_{1,k}, y_{2,k})\|_{\max}} = \lim_{k \to \infty} \frac{k - 0}{\max\{4, 0\}} = \infty, \tag{D.19} \]
showing f is not Lipschitz continuous with respect to y on K.
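The blow-up of the difference quotient in part (a) can be checked numerically. The following sketch implements f from (D.9) and evaluates the quotient from (D.12) along the sequence (D.11); the function names and the selection of indices k (powers of 2, chosen so the floating-point arithmetic is exact) are assumptions made for the illustration.

```python
def f(x, y):
    """The function from (D.9); raises ValueError for points outside of G."""
    if not -2 < x < 2:
        raise ValueError("x outside ]-2, 2[")
    if -4 < y < -1:
        return 1.0 / x if x != 0 else 0.0
    if 1 < y < 4:
        return 0.0
    raise ValueError("y outside the union of ]-4, -1[ and ]1, 4[")

def difference_quotient(k):
    """Quotient from (D.12) for x_k = 1/k, y_k = -2, and the second point 2."""
    x_k, y_k, y_bar_k = 1.0 / k, -2.0, 2.0
    return abs(f(x_k, y_k) - f(x_k, y_bar_k)) / abs(y_k - y_bar_k)

# The quotient equals k/4 and, thus, is unbounded as k grows.
quotients = [difference_quotient(k) for k in (1, 2, 4, 8, 16)]
```

No single constant L can bound these quotients, which is the content of (D.12).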



E Maximal Solutions on Nonopen Intervals


In Def. 3.20, we required a maximal solution to an ODE to be defined on an open
interval. The following Ex. E.1 shows it can occur that such a maximal solution has an
extension to a larger nonopen interval. In such cases, one might want to call the solution
on the nonopen interval maximal rather than the solution on the smaller open interval.
However, this would make the treatment of maximal solutions more cumbersome in some
places, without adding any real substance, which is why we stick to our requirement for
maximal solutions to always be defined on an open interval.

Example E.1. (a) Let

G := [0, 1] × R, f : G −→ R, f (x, y) := 0. (E.1)

Then, for each (x0 , y0 ) ∈ G, the function

φ : [0, 1] −→ R, φ ≡ y0 , (E.2)

is a solution to the initial value problem

y ′ = f (x, y), y(x0 ) = y0 . (E.3)

However, the maximal solution of (E.3) according to Def. 3.20 is φ↾]0,1[ .

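(b) The following modification of (a) allows f to be defined on all of R2 : Let
\[ G := \mathbb R^2, \quad f : G \longrightarrow \mathbb R, \quad f(x, y) := \begin{cases} 0 & \text{for } x \in [0, 1], \\ 1 & \text{for } x \notin [0, 1]. \end{cases} \tag{E.4} \]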

Then, for each (x0 , y0 ) ∈ [0, 1] × R, the function φ of (E.2) is a solution to the initial
value problem (E.3), but, again, the maximal solution of (E.3) according to Def.
3.20 is φ↾]0,1[ .

F Paths in Rn
Definition F.1. A path or curve in Rn , n ∈ N, is a continuous map ψ : I −→ Rn , where
I ⊆ R is an interval. One calls the path differentiable, continuously differentiable, etc.
if, and only if, the function ψ has the respective property.

Definition F.2. If a, b ∈ R, a ≤ b, and I := [a, b], then we call

|I| := b − a = |a − b|, (F.1)

the length of I.

Definition F.3. Given a real interval I := [a, b] ⊆ R, a, b ∈ R, a < b, the (N + 1)-tuple
∆ := (x0 , . . . , xN ) ∈ RN +1 , N ∈ N, is called a partition of I if, and only if,
a = x0 < x1 < · · · < xN = b. The set of all partitions of I is denoted by Π(I) or by
Π[a, b]. Given a partition ∆ of I as above and letting Ij := [xj−1 , xj ], the number
\[ |\Delta| := \max\big\{ |I_j| : j \in \{1, \dots, N\} \big\}, \tag{F.2} \]

is called the mesh size of ∆.

Notation F.4. Given a, b ∈ R, a < b, a path ψ : [a, b] −→ Rn , n ∈ N, and a partition
∆ = (x0 , . . . , xN ), N ∈ N, of [a, b], we consider the approximation of ψ by the polygon
connecting the points ψ(x0 ), . . . , ψ(xN ), where we denote the polygon’s length by
\[ p_\psi(\Delta) := p_\psi(x_0, \dots, x_N) := \sum_{k=0}^{N-1} \|\psi(x_{k+1}) - \psi(x_k)\|_2, \tag{F.3} \]
using ‖ · ‖2 to denote the 2-norm on Rn , i.e. the Euclidean norm.

Definition F.5. Given a, b ∈ R, a ≤ b, for each path ψ : [a, b] −→ Rn , n ∈ N, define
\[ l(\psi) := \begin{cases} 0 & \text{for } a = b, \\ \sup\big\{ p_\psi(\Delta) : \Delta \in \Pi[a, b] \big\} \in [0, \infty] & \text{for } a < b. \end{cases} \tag{F.4} \]
The path ψ is called rectifiable with arc length l(ψ) if, and only if, l(ψ) < ∞.

Proposition F.6. Let a, b ∈ R, a < b, and let ψ : [a, b] −→ Rn be a path, n ∈ N.

(a) If ψ is affine, i.e. there exist y0 , y1 ∈ Rn such that
\[ \underset{x \in [a, b]}{\forall} \quad \psi(x) = y_0 + x\, y_1, \tag{F.5} \]
then ψ is rectifiable with arc length
\[ l(\psi) = \|y_1\|_2\, (b - a) = \|\psi(b) - \psi(a)\|_2. \tag{F.6} \]

(b) If the path ψ is L-Lipschitz with L ≥ 0, then ψ is rectifiable and
\[ l(\psi) \le L\, (b - a). \tag{F.7} \]

(c) If the paths φ, ψ : [a, b] −→ Rn are both rectifiable, then
\[ \big| l(\phi) - l(\psi) \big| \le l(\phi - \psi). \tag{F.8} \]

(d) For each ξ ∈ [a, b], it holds that
\[ l(\psi) = l(\psi{\restriction}_{[a, \xi]}) + l(\psi{\restriction}_{[\xi, b]}). \tag{F.9} \]

Proof. (a): For each partition (x0 , . . . , xN ), N ∈ N, of [a, b], we have
\[ \begin{aligned} p_\psi(x_0, \dots, x_N) &= \sum_{k=0}^{N-1} \|\psi(x_{k+1}) - \psi(x_k)\|_2 = \sum_{k=0}^{N-1} \|x_{k+1}\, y_1 - x_k\, y_1\|_2 \\ &= \|y_1\|_2 \sum_{k=0}^{N-1} (x_{k+1} - x_k) = \|y_1\|_2\, (b - a), \end{aligned} \tag{F.10} \]
proving (F.6).
(b): For each partition (x0 , . . . , xN ), N ∈ N, of [a, b], we have
\[ \begin{aligned} p_\psi(x_0, \dots, x_N) &= \sum_{k=0}^{N-1} \|\psi(x_{k+1}) - \psi(x_k)\|_2 \le L \sum_{k=0}^{N-1} |x_{k+1} - x_k| \\ &= L \sum_{k=0}^{N-1} (x_{k+1} - x_k) = L\, (b - a), \end{aligned} \tag{F.11} \]
proving (F.7).
(c): For each partition ∆ = (x0 , . . . , xN ), N ∈ N, of [a, b], we have
\[ \begin{aligned} \big| p_\phi(\Delta) - p_\psi(\Delta) \big| &= \Big| \sum_{k=0}^{N-1} \|\phi(x_{k+1}) - \phi(x_k)\|_2 - \sum_{k=0}^{N-1} \|\psi(x_{k+1}) - \psi(x_k)\|_2 \Big| \\ &\le \sum_{k=0}^{N-1} \Big| \|\phi(x_{k+1}) - \phi(x_k)\|_2 - \|\psi(x_{k+1}) - \psi(x_k)\|_2 \Big| \\ &\le \sum_{k=0}^{N-1} \Big\| \big( \phi(x_{k+1}) - \psi(x_{k+1}) \big) - \big( \phi(x_k) - \psi(x_k) \big) \Big\|_2 \\ &= p_{\phi - \psi}(\Delta), \end{aligned} \tag{F.12} \]
proving (F.8) (the last estimate in (F.12) holds true due to the inverse triangle inequality).
(d): If ξ = a or ξ = b, then there is nothing to prove. Thus, assume a < ξ < b. If
∆1 := (x0 , . . . , xN ) is a partition of [a, ξ] and ∆2 := (xN , . . . , xM ) is a partition of [ξ, b],
N, M ∈ N, M > N , then ∆ := (x0 , . . . , xM ) is a partition of [a, b]. Moreover,

pψ (∆) = pψ (∆1 ) + pψ (∆2 ) (F.13)

is immediate from (F.3), implying

l(ψ) ≥ l(ψ↾[a,ξ] ) + l(ψ↾[ξ,b] ). (F.14)

On the other hand, if ∆ = (x0 , . . . , xM ), M ∈ N, is a partition of [a, b], then either there
is 0 < N < M such that ξ = xN , in which case (F.13) holds once again, where ∆1 and ∆2
are defined as before, or there is N ∈ {0, . . . , M − 1} such that xN < ξ < xN +1 . In the
latter case, ∆1 := (x0 , . . . , xN , ξ) is a partition of [a, ξ] and ∆2 := (ξ, xN +1 , . . . , xM )
is a partition of [ξ, b]. Moreover,
\[ \begin{aligned} p_\psi(\Delta) &= \sum_{k=0}^{M-1} \|\psi(x_{k+1}) - \psi(x_k)\|_2 \\ &= \sum_{k=0}^{N-1} \|\psi(x_{k+1}) - \psi(x_k)\|_2 + \|\psi(x_{N+1}) - \psi(x_N)\|_2 + \sum_{k=N+1}^{M-1} \|\psi(x_{k+1}) - \psi(x_k)\|_2 \\ &\le p_\psi(\Delta_1) + p_\psi(\Delta_2), \end{aligned} \tag{F.15} \]
where the last estimate uses ‖ψ(xN +1 ) − ψ(xN )‖2 ≤ ‖ψ(xN +1 ) − ψ(ξ)‖2 + ‖ψ(ξ) − ψ(xN )‖2 , showing
l(ψ) ≤ l(ψ↾[a,ξ] ) + l(ψ↾[ξ,b] ) (F.16)
and concluding the proof. ∎
Theorem F.7. Given a, b ∈ R, a < b, each continuously differentiable path ψ : [a, b] −→
Rn , n ∈ N, is rectifiable with arc length
\[ l(\psi) = \int_a^b \|\psi'(x)\|_2 \, dx. \tag{F.17} \]

Proof. Since ψ is continuously differentiable, it follows from [Phi15, Th. C.3] that ψ is
Lipschitz continuous on [a, b], i.e. ψ is rectifiable by Prop. F.6(b) above. To prove (F.17),
according to the fundamental theorem of calculus [Phi16, Th. 10.20(b)], it suffices to
show the function
\[ \lambda : [a, b] \longrightarrow \mathbb R_0^+, \quad \lambda(x) := l(\psi{\restriction}_{[a, x]}), \tag{F.18} \]
is differentiable with derivative λ′ (x) = ‖ψ ′ (x)‖2 . To this end, first note the continuous
function ψ ′ is even uniformly continuous by Th. C.16. Thus,
\[ \underset{\epsilon > 0}{\forall} \ \underset{\delta > 0}{\exists} \ \underset{x_0, x \in [a, b]}{\forall} \quad \Big( |x_0 - x| < \delta \ \Rightarrow \ \|\psi'(x_0) - \psi'(x)\|_2 < \epsilon \Big). \tag{F.19} \]
Given ε > 0, let δ > 0 be as in (F.19). Fix x0 ∈ [a, b[ and consider x1 ∈]a, b[ such that
x0 < x1 < x0 + δ. Define the affine path

α : [x0 , x1 ] −→ Rn , α(x) := ψ(x0 ) + (x − x0 )ψ ′ (x0 ). (F.20)

According to Prop. F.6(a), we have

l(α) = ‖ψ ′ (x0 )‖2 (x1 − x0 ). (F.21)

Moreover, for the path ψ↾[x0 ,x1 ] −α, we have
\[ \underset{x \in [x_0, x_1]}{\forall} \quad \|\psi'(x) - \alpha'(x)\|_2 = \|\psi'(x) - \psi'(x_0)\|_2 \overset{\text{(F.19)}}{<} \epsilon. \tag{F.22} \]
Thus, it follows from [Phi15, Th. C.3] that ψ − α is ε-Lipschitz on [x0 , x1 ] and, then, Prop.
F.6(b) yields
l(ψ↾[x0 ,x1 ] −α) ≤ ε(x1 − x0 ). (F.23)
Prop. F.6(c), in turn, yields
\[ \big| l(\psi{\restriction}_{[x_0, x_1]}) - l(\alpha) \big| \le l(\psi{\restriction}_{[x_0, x_1]} - \alpha) \le \epsilon\, (x_1 - x_0). \tag{F.24} \]
Putting everything together, we obtain
\[ \begin{aligned} \Big| \frac{l(\psi{\restriction}_{[a, x_1]}) - l(\psi{\restriction}_{[a, x_0]})}{x_1 - x_0} - \|\psi'(x_0)\|_2 \Big| &\overset{\text{Prop. F.6(d), (F.21)}}{=} \Big| \frac{l(\psi{\restriction}_{[x_0, x_1]})}{x_1 - x_0} - \frac{l(\alpha)}{x_1 - x_0} \Big| \\ &\overset{\text{(F.24)}}{\le} \frac{\epsilon\, (x_1 - x_0)}{x_1 - x_0} = \epsilon, \end{aligned} \tag{F.25} \]
showing the function λ from (F.18) has a right-hand derivative at x0 and the value of
that right-hand derivative at x0 is the desired ‖ψ ′ (x0 )‖2 . Repeating the above argument
with x0 , x1 ∈]a, b] such that x0 − δ < x1 < x0 shows λ to have a left-hand derivative at
each x0 ∈]a, b] with value ‖ψ ′ (x0 )‖2 , which completes the proof. ∎
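The relation between the polygon lengths of (F.3) and the integral formula (F.17) can be observed numerically. The sketch below is only an illustration: the path (a quarter of the unit circle), the use of uniform partitions, and all function names are assumptions not taken from the notes. For this path, ‖ψ′(x)‖2 = 1, so (F.17) gives the arc length b − a = π/2.

```python
import math

def polygon_length(psi, partition):
    """p_psi(Delta) from (F.3): sum of the Euclidean distances between the
    images of consecutive partition points."""
    points = [psi(x) for x in partition]
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

def uniform_partition(a, b, n):
    """Partition of [a, b] into n subintervals of equal length."""
    return [a + (b - a) * k / n for k in range(n + 1)]

def quarter_circle(x):
    """Hypothetical sample path psi(x) = (cos x, sin x)."""
    return (math.cos(x), math.sin(x))

a, b = 0.0, math.pi / 2
lengths = [
    polygon_length(quarter_circle, uniform_partition(a, b, n))
    for n in (1, 10, 100, 1000)
]
# Value of the integral in (F.17), since the derivative has Euclidean norm 1.
arc_length = b - a
```

Each polygon length stays below the arc length, in line with the supremum in (F.4), and the values approach π/2 as the partition is refined.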
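Remark F.8. An example of a differentiable nonrectifiable path is given by (cf. [Wal02,
Ex. 5.14.6])
\[ \psi : [0, 1] \longrightarrow \mathbb R, \quad \psi(x) := \begin{cases} x^2 \cos \dfrac{\pi}{x^2} & \text{for } x \ne 0, \\ 0 & \text{for } x = 0. \end{cases} \tag{F.26} \]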
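That the polygon lengths for the path of Rem. F.8 are unbounded can be made plausible numerically. The sketch below reads (F.26) as the real-valued function x ↦ x² cos(π/x²) (with value 0 at x = 0) and uses the partition points 1/√k, where ψ takes the values (−1)^k /k; this choice of partition and the function names are assumptions made for the illustration.

```python
import math

def psi(x):
    """The function x^2 * cos(pi / x^2), extended by 0 at x = 0."""
    return x * x * math.cos(math.pi / (x * x)) if x != 0 else 0.0

def polygon_length_1d(path, partition):
    """Polygon length as in (F.3) for a real-valued path."""
    values = [path(x) for x in partition]
    return sum(abs(q - p) for p, q in zip(values, values[1:]))

def oscillation_partition(n):
    """Partition 0 < 1/sqrt(n) < ... < 1/sqrt(2) < 1 of [0, 1]."""
    return [0.0] + [1.0 / math.sqrt(k) for k in range(n, 0, -1)]

# The lengths grow roughly like 2 * log(n), i.e. without bound.
lengths = {
    n: polygon_length_1d(psi, oscillation_partition(n))
    for n in (10, 100, 1000)
}
```

Consecutive values of ψ at these points differ by 1/k + 1/(k + 1), so the polygon lengths behave like partial sums of the harmonic series.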

G Operator Norms and Matrix Norms


For the present ODE class, we are mostly interested in linear maps from Kn into itself.
However, introducing the relevant notions for linear maps between general normed vector
spaces does not provide much additional difficulty, and, hopefully, even some extra
clarity.
Definition G.1. Let A : X −→ Y be a linear map between two normed vector spaces
(X, k · kX ) and (Y, k · kY ) over K. Then A is called bounded if, and only if, A maps
bounded sets to bounded sets, i.e. if, and only if, A(B) is a bounded subset of Y for
each bounded B ⊆ X. The vector space of all bounded linear maps between X and Y
is denoted by L(X, Y ).
Definition G.2. Let A : X −→ Y be a linear map between two normed vector spaces
(X, ‖ · ‖X ) and (Y, ‖ · ‖Y ) over K. The number
\[ \|A\| := \sup\Big\{ \frac{\|Ax\|_Y}{\|x\|_X} : x \in X,\ x \ne 0 \Big\} = \sup\big\{ \|Ax\|_Y : x \in X,\ \|x\|_X = 1 \big\} \in [0, \infty] \tag{G.1} \]
is called the operator norm of A induced by ‖ · ‖X and ‖ · ‖Y (strictly speaking, the term
operator norm is only justified if the value is finite, but it is often convenient to use the
term in the generalized way defined here).
In the special case, where X = Kn , Y = Km , and A is given via a real m × n matrix,
the operator norm is also called matrix norm.

From now on, the space index of a norm will usually be suppressed, i.e. we write just
k · k instead of both k · kX and k · kY .
Theorem G.3. For a linear map A : X −→ Y between two normed vector spaces
(X, k · k) and (Y, k · k) over K, the following statements are equivalent:

(a) A is bounded.
(b) kAk < ∞.
(c) A is Lipschitz continuous.
(d) A is continuous.
(e) There is x0 ∈ X such that A is continuous at x0 .

Proof. Since every Lipschitz continuous map is continuous and since every continuous
map is continuous at every point, “(c) ⇒ (d) ⇒ (e)” is clear.
“(e) ⇒ (c)”: Let x0 ∈ X be such that A is continuous at x0 . Thus, for each ε > 0, there
is δ > 0 such that ‖x − x0 ‖ < δ implies ‖Ax − Ax0 ‖ < ε. As A is linear, for each x ∈ X
with ‖x‖ < δ, one has ‖Ax‖ = ‖A(x + x0 ) − Ax0 ‖ < ε, due to ‖x + x0 − x0 ‖ = ‖x‖ < δ.
Moreover, one has ‖(δx)/2‖ ≤ δ/2 < δ for each x ∈ X with ‖x‖ ≤ 1. Letting L := 2ε/δ,
this means that ‖Ax‖ = ‖A((δx)/2)‖/(δ/2) < 2ε/δ = L for each x ∈ X with ‖x‖ ≤ 1.
Thus, for each x, y ∈ X with x ≠ y, one has
\[ \|Ax - Ay\| = \|A(x - y)\| = \|x - y\| \, \Big\| A\Big( \frac{x - y}{\|x - y\|} \Big) \Big\| < L\, \|x - y\|. \tag{G.2} \]
Together with the fact that ‖Ax − Ay‖ ≤ L ‖x − y‖ is trivially true for x = y, this shows
that A is Lipschitz continuous.
“(c) ⇒ (b)”: As A is Lipschitz continuous, there exists $L \in \mathbb R_0^+$ such that ‖Ax − Ay‖ ≤
L ‖x − y‖ for each x, y ∈ X. Considering the special case y = 0 and ‖x‖ = 1 yields
‖Ax‖ ≤ L ‖x‖ = L, implying ‖A‖ ≤ L < ∞.
“(b) ⇒ (c)”: Let ‖A‖ < ∞. We will show
‖Ax − Ay‖ ≤ ‖A‖ ‖x − y‖ for each x, y ∈ X. (G.3)
For x = y, there is nothing to prove. Thus, let x ≠ y. One computes
\[ \frac{\|Ax - Ay\|}{\|x - y\|} = \Big\| A\Big( \frac{x - y}{\|x - y\|} \Big) \Big\| \le \|A\| \tag{G.4} \]
as $\big\| \frac{x - y}{\|x - y\|} \big\| = 1$, thereby establishing (G.3).
“(b) ⇒ (a)”: Let ‖A‖ < ∞ and let M ⊆ X be bounded. Then there is r > 0 such that
M ⊆ Br (0). Moreover, for each 0 ≠ x ∈ M :
\[ \frac{\|Ax\|}{\|x\|} = \Big\| A\Big( \frac{x}{\|x\|} \Big) \Big\| \le \|A\| \tag{G.5} \]
as $\big\| \frac{x}{\|x\|} \big\| = 1$. Thus ‖Ax‖ ≤ ‖A‖ ‖x‖ ≤ r‖A‖, showing that A(M ) ⊆ Br‖A‖ (0). Thus,
A(M ) is bounded, thereby establishing the case.
“(a) ⇒ (b)”: Since A is bounded, it maps the bounded set B1 (0) ⊆ X into some
bounded subset of Y . Thus, there is r > 0 such that A(B1 (0)) ⊆ Br (0) ⊆ Y . In
particular, ‖Ax‖ < r for each x ∈ X satisfying ‖x‖ = 1, showing ‖A‖ ≤ r < ∞. ∎
Remark G.4. For linear maps between finite-dimensional spaces, the equivalent prop-
erties of Th. G.3 always hold: Each linear map A : Kn −→ Km , (n, m) ∈ N2 , is
continuous (this follows, for example, from the fact that each such map is (trivially)
differentiable, and every differentiable map is continuous). In particular, each linear
map A : Kn −→ Km , has all the equivalent properties of Th. G.3.
Theorem G.5. Let X and Y be normed vector spaces over K.

(a) The operator norm does, indeed, constitute a norm on the set of bounded linear
maps L(X, Y ).
(b) If A ∈ L(X, Y ), then kAk is the smallest Lipschitz constant for A, i.e. kAk is a
Lipschitz constant for A and kAx − Ayk ≤ L kx − yk for each x, y ∈ X implies
kAk ≤ L.

Proof. (a): If A = 0, then, in particular, Ax = 0 for each x ∈ X with ‖x‖ = 1, implying
‖A‖ = 0. Conversely, ‖A‖ = 0 implies Ax = 0 for each x ∈ X with ‖x‖ = 1. But then
Ax = ‖x‖ A(x/‖x‖) = 0 for every 0 ≠ x ∈ X, i.e. A = 0. Thus, the operator norm is
positive definite. If A ∈ L(X, Y ), λ ∈ K, and x ∈ X, then
\[ \|(\lambda A)x\| = \|A(\lambda x)\| = \|\lambda (Ax)\| = |\lambda| \, \|Ax\|, \tag{G.6} \]
yielding
\[ \begin{aligned} \|\lambda A\| &= \sup\big\{ \|(\lambda A)x\| : x \in X,\ \|x\| = 1 \big\} = \sup\big\{ |\lambda| \, \|Ax\| : x \in X,\ \|x\| = 1 \big\} \\ &= |\lambda| \sup\big\{ \|Ax\| : x \in X,\ \|x\| = 1 \big\} = |\lambda| \, \|A\|, \end{aligned} \tag{G.7} \]
showing that the operator norm is homogeneous of degree 1. Finally, if A, B ∈ L(X, Y )
and x ∈ X, then
\[ \|(A + B)x\| = \|Ax + Bx\| \le \|Ax\| + \|Bx\|, \tag{G.8} \]
yielding
\[ \begin{aligned} \|A + B\| &= \sup\big\{ \|(A + B)x\| : x \in X,\ \|x\| = 1 \big\} \\ &\le \sup\big\{ \|Ax\| + \|Bx\| : x \in X,\ \|x\| = 1 \big\} \\ &\le \sup\big\{ \|Ax\| : x \in X,\ \|x\| = 1 \big\} + \sup\big\{ \|Bx\| : x \in X,\ \|x\| = 1 \big\} \\ &= \|A\| + \|B\|, \end{aligned} \tag{G.9} \]
showing that the operator norm also satisfies the triangle inequality, thereby completing
the verification that it is, indeed, a norm.
(b): That ‖A‖ is a Lipschitz constant for A was already shown in the proof of “(b) ⇒
(c)” of Th. G.3. Now let $L \in \mathbb R_0^+$ be such that ‖Ax − Ay‖ ≤ L ‖x − y‖ for each x, y ∈ X.
Specializing to y = 0 and ‖x‖ = 1 implies ‖Ax‖ ≤ L ‖x‖ = L, showing ‖A‖ ≤ L. ∎

Remark G.6. Even though it is beyond the scope of the present class, let us mention
as an outlook that one can show that L(X, Y ) with the operator norm is a Banach space
(i.e. a complete normed vector space) provided that Y is a Banach space (even if X is
not a Banach space).
Lemma G.7. If Id : X −→ X, Id(x) := x, is the identity map on a normed vector space
X over K, then k Id k = 1 (in particular, the operator norm of a unit matrix is always
1). Caveat: In principle, one can consider two different norms on X simultaneously,
and then the operator norm of the identity can differ from 1.

Proof. If kxk = 1, then k Id(x)k = kxk = 1. 


Lemma G.8. Let X, Y, Z be normed vector spaces and consider linear maps A ∈
L(X, Y ), B ∈ L(Y, Z). Then
kBAk ≤ kBk kAk. (G.10)

Proof. Let x ∈ X with ‖x‖ = 1. If Ax = 0, then ‖B(A(x))‖ = 0 ≤ ‖B‖ ‖A‖. If Ax ≠ 0,
then one estimates
\[ \|B(Ax)\| = \|Ax\| \, \Big\| B\Big( \frac{Ax}{\|Ax\|} \Big) \Big\| \le \|A\| \, \|B\|, \tag{G.11} \]
thereby establishing the case. ∎
Example G.9. Let m, n ∈ N and let A : Rn −→ Rm be the linear map given by the
m × n matrix (akl )(k,l)∈{1,...,m}×{1,...,n} . Then
\[ \|A\|_\infty := \max\Big\{ \sum_{l=1}^{n} |a_{kl}| : k \in \{1, \dots, m\} \Big\} \tag{G.12a} \]
is called the row sum norm of A, and
\[ \|A\|_1 := \max\Big\{ \sum_{k=1}^{m} |a_{kl}| : l \in \{1, \dots, n\} \Big\} \tag{G.12b} \]
is called the column sum norm of A. It is an exercise to show that ‖A‖∞ is the operator
norm induced if Rn and Rm are endowed with the ∞-norm, and ‖A‖1 is the operator
norm induced if Rn and Rm are endowed with the 1-norm.
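The two norms of (G.12) are easy to compute by hand or by machine. The following sketch, with hypothetical sample matrices and function names, evaluates the row sum norm and the column sum norm and checks the estimate (G.10) of Lem. G.8 for one product; it does not replace the exercise of showing that these are the induced operator norms.

```python
def row_sum_norm(matrix):
    """(G.12a): maximum over the rows of the sum of absolute values."""
    return max(sum(abs(entry) for entry in row) for row in matrix)

def column_sum_norm(matrix):
    """(G.12b): maximum over the columns of the sum of absolute values."""
    return max(
        sum(abs(row[l]) for row in matrix) for l in range(len(matrix[0]))
    )

def mat_mul(b, a):
    """Matrix product b * a for matrices given as lists of rows."""
    return [
        [sum(b[i][k] * a[k][j] for k in range(len(a))) for j in range(len(a[0]))]
        for i in range(len(b))
    ]

# Hypothetical sample matrices.
A = [[1, -2], [3, 4]]
B = [[0, 1], [-1, 2]]
BA = mat_mul(B, A)
```

Here the row sums of A are 3 and 7 and the column sums are 4 and 6, so the row sum norm is 7 and the column sum norm is 6.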

H The Vandermonde Determinant


Theorem H.1. Let n ∈ N and λ0 , λ1 , . . . , λn ∈ C. Moreover, let
\[ V := \begin{pmatrix} 1 & \lambda_0 & \dots & \lambda_0^n \\ 1 & \lambda_1 & \dots & \lambda_1^n \\ \vdots & \vdots & & \vdots \\ 1 & \lambda_n & \dots & \lambda_n^n \end{pmatrix} \tag{H.1} \]
be the corresponding Vandermonde matrix. Then its determinant, the so-called Vandermonde
determinant, is given by
\[ \det(V) = \prod_{\substack{k,l=0 \\ k>l}}^{n} (\lambda_k - \lambda_l). \tag{H.2} \]

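Proof. The proof can be conducted by induction with respect to n: For n = 1, we have
\[ \det(V) = \begin{vmatrix} 1 & \lambda_0 \\ 1 & \lambda_1 \end{vmatrix} = \lambda_1 - \lambda_0 = \prod_{\substack{k,l=0 \\ k>l}}^{1} (\lambda_k - \lambda_l), \tag{H.3} \]
showing (H.2) holds for n = 1. Now let n > 1. We know from Linear Algebra that the
value of a determinant does not change if we add a multiple of a column to a different
column. Adding the (−λ0 )-fold of the nth column to the (n + 1)st column, we obtain
in the (n + 1)st column
\[ \begin{pmatrix} 0 \\ \lambda_1^n - \lambda_1^{n-1} \lambda_0 \\ \vdots \\ \lambda_n^n - \lambda_n^{n-1} \lambda_0 \end{pmatrix}. \tag{H.4} \]
Next, one adds the (−λ0 )-fold of the (n − 1)st column to the nth column, and, successively,
the (−λ0 )-fold of the mth column to the (m + 1)st column. One finishes, in the
nth step, by adding the (−λ0 )-fold of the first column to the second column, obtaining
\[ \det(V) = \begin{vmatrix} 1 & \lambda_0 & \dots & \lambda_0^n \\ 1 & \lambda_1 & \dots & \lambda_1^n \\ \vdots & \vdots & & \vdots \\ 1 & \lambda_n & \dots & \lambda_n^n \end{vmatrix} = \begin{vmatrix} 1 & 0 & 0 & \dots & 0 \\ 1 & \lambda_1 - \lambda_0 & \lambda_1^2 - \lambda_1 \lambda_0 & \dots & \lambda_1^n - \lambda_1^{n-1} \lambda_0 \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & \lambda_n - \lambda_0 & \lambda_n^2 - \lambda_n \lambda_0 & \dots & \lambda_n^n - \lambda_n^{n-1} \lambda_0 \end{vmatrix}. \tag{H.5} \]
Applying the rule for determinants of block matrices to (H.5) yields
\[ \det(V) = 1 \cdot \begin{vmatrix} \lambda_1 - \lambda_0 & \lambda_1^2 - \lambda_1 \lambda_0 & \dots & \lambda_1^n - \lambda_1^{n-1} \lambda_0 \\ \vdots & \vdots & & \vdots \\ \lambda_n - \lambda_0 & \lambda_n^2 - \lambda_n \lambda_0 & \dots & \lambda_n^n - \lambda_n^{n-1} \lambda_0 \end{vmatrix}. \tag{H.6} \]
As we also know from Linear Algebra that determinants are linear in each row, for each
k ∈ {1, . . . , n}, we can factor out (λk − λ0 ) from the kth row of (H.6), arriving at
\[ \det(V) = \prod_{k=1}^{n} (\lambda_k - \lambda_0) \, \begin{vmatrix} 1 & \lambda_1 & \dots & \lambda_1^{n-1} \\ \vdots & \vdots & & \vdots \\ 1 & \lambda_n & \dots & \lambda_n^{n-1} \end{vmatrix}. \tag{H.7} \]
However, the determinant in (H.7) is precisely the Vandermonde determinant of the n
numbers λ1 , . . . , λn , which is given according to the induction hypothesis, implying
\[ \det(V) = \prod_{k=1}^{n} (\lambda_k - \lambda_0) \prod_{\substack{k,l=1 \\ k>l}}^{n} (\lambda_k - \lambda_l) = \prod_{\substack{k,l=0 \\ k>l}}^{n} (\lambda_k - \lambda_l), \tag{H.8} \]
completing the induction proof of (H.2). ∎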
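Formula (H.2) can be checked against a direct computation for small examples. The sketch below is an illustration under assumptions: the determinant is computed by Laplace expansion along the first row (a different route than the column operations used in the proof), the sample numbers are hypothetical, and integer arithmetic is used so that both sides can be compared exactly.

```python
from itertools import combinations

def determinant(matrix):
    """Determinant via Laplace expansion along the first row
    (adequate for small matrices only)."""
    if len(matrix) == 1:
        return matrix[0][0]
    total = 0
    for col, entry in enumerate(matrix[0]):
        minor = [row[:col] + row[col + 1:] for row in matrix[1:]]
        total += (-1) ** col * entry * determinant(minor)
    return total

def vandermonde(lams):
    """Vandermonde matrix as in (H.1): the row for lam is (1, lam, lam^2, ...)."""
    return [[lam ** power for power in range(len(lams))] for lam in lams]

def product_formula(lams):
    """Right-hand side of (H.2): product of (lam_k - lam_l) over all k > l."""
    result = 1
    for l, k in combinations(range(len(lams)), 2):  # pairs with l < k
        result *= lams[k] - lams[l]
    return result

# Hypothetical sample numbers lam_0, ..., lam_3.
lams = [2, 3, 5, 7]
det_value = determinant(vandermonde(lams))
formula_value = product_formula(lams)
```

For these numbers, the product in (H.2) is 1 · 3 · 5 · 2 · 4 · 2 = 240.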

I Matrix-Valued Functions
Notation I.1. Given m, n ∈ N, let M(m, n, K) denote the set of m × n matrices over
K.

I.1 Product Rule


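Proposition I.2. Let I ⊆ R be a nontrivial interval, let m, n, l ∈ N, and suppose
\[ A : I \longrightarrow \mathcal M(m, n, \mathbb K), \quad A(x) = \big( a_{\alpha\beta}(x) \big), \tag{I.1a} \]
\[ B : I \longrightarrow \mathcal M(n, l, \mathbb K), \quad B(x) = \big( b_{\alpha\beta}(x) \big), \tag{I.1b} \]
are differentiable. Then

C : I −→ M(m, l, K), C(x) := A(x)B(x), (I.2)

is differentiable, and one has the product rule
\[ \underset{x \in I}{\forall} \quad C'(x) = A'(x) B(x) + A(x) B'(x). \tag{I.3} \]

Proof. Writing $C(x) = \big( c_{\alpha\beta}(x) \big)$ and using the one-dimensional product rule together
with the definition of matrix multiplication, one computes, for each (α, β) ∈ {1, . . . , m}
× {1, . . . , l},
\[ \begin{aligned} c'_{\alpha\beta}(x) &= \Big( \sum_{\gamma=1}^{n} a_{\alpha\gamma}(x)\, b_{\gamma\beta}(x) \Big)' \\ &= \sum_{\gamma=1}^{n} a'_{\alpha\gamma}(x)\, b_{\gamma\beta}(x) + \sum_{\gamma=1}^{n} a_{\alpha\gamma}(x)\, b'_{\gamma\beta}(x) \\ &= \big( A'(x) B(x) \big)_{\alpha\beta} + \big( A(x) B'(x) \big)_{\alpha\beta}, \end{aligned} \tag{I.4} \]
proving the proposition. ∎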
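The product rule (I.3) can be sanity-checked with finite differences. In the sketch below, the matrix functions A and B, their hand-computed derivatives dA and dB, the evaluation point, and the step size are all hypothetical choices; the central difference quotient only approximates C′(x), so the comparison is up to a small error.

```python
import math

def mat_mul(a, b):
    """Matrix product a * b for matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

def mat_add(a, b):
    """Entrywise sum of two matrices of the same shape."""
    return [[s + t for s, t in zip(ra, rb)] for ra, rb in zip(a, b)]

# Hypothetical differentiable matrix-valued functions and their derivatives.
def A(x):
    return [[x, x * x], [1.0, math.sin(x)]]

def dA(x):
    return [[1.0, 2 * x], [0.0, math.cos(x)]]

def B(x):
    return [[math.exp(x), 0.0], [x, 1.0]]

def dB(x):
    return [[math.exp(x), 0.0], [1.0, 0.0]]

def C(x):
    return mat_mul(A(x), B(x))

def central_difference(F, x, h=1e-5):
    """Entrywise central difference quotient approximating F'(x)."""
    upper, lower = F(x + h), F(x - h)
    return [
        [(u - v) / (2 * h) for u, v in zip(ru, rv)]
        for ru, rv in zip(upper, lower)
    ]

x = 0.7
numeric = central_difference(C, x)
exact = mat_add(mat_mul(dA(x), B(x)), mat_mul(A(x), dB(x)))
max_error = max(
    abs(n - e) for rn, r_exact in zip(numeric, exact) for n, e in zip(rn, r_exact)
)
```

Note that the order of the factors in A′(x)B(x) + A(x)B′(x) matters, since matrix multiplication is not commutative.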

I.2 Integration and Matrix Multiplication Commute


Proposition I.3. Let m, n, p ∈ N, let I ⊆ R be measurable (e.g. an interval), let A :
I −→ M(m, n, K), x ↦ A(x) = (akl (x)), be integrable (i.e. all Re akl , Im akl : I −→ R
are integrable).

(a) If B = (bjk ) ∈ M(p, m, K), then
\[ B \int_I A(x) \, dx = \int_I B\, A(x) \, dx. \tag{I.5} \]

(b) If B = (blj ) ∈ M(n, p, K), then
\[ \Big( \int_I A(x) \, dx \Big) B = \int_I A(x)\, B \, dx. \tag{I.6} \]

Proof. (a): One computes, for each (j, l) ∈ {1, . . . , p} × {1, . . . , n},
\[ \begin{aligned} \Big( B \int_I A(x) \, dx \Big)_{jl} &= \sum_{k=1}^{m} b_{jk} \int_I a_{kl}(x) \, dx = \int_I \Big( \sum_{k=1}^{m} b_{jk}\, a_{kl}(x) \Big) dx \\ &= \int_I \big( B A(x) \big)_{jl} \, dx = \Big( \int_I B\, A(x) \, dx \Big)_{jl}, \end{aligned} \tag{I.7} \]
proving (I.5).
(b): One computes, for each (k, j) ∈ {1, . . . , m} × {1, . . . , p},
\[ \begin{aligned} \Big( \Big( \int_I A(x) \, dx \Big) B \Big)_{kj} &= \sum_{l=1}^{n} \Big( \int_I a_{kl}(x) \, dx \Big) b_{lj} = \int_I \Big( \sum_{l=1}^{n} a_{kl}(x)\, b_{lj} \Big) dx \\ &= \int_I \big( A(x) B \big)_{kj} \, dx = \Big( \int_I A(x)\, B \, dx \Big)_{kj}, \end{aligned} \tag{I.8} \]
proving (I.6). ∎
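Part (a) can be illustrated with a quadrature rule in place of the integral. The sketch below is an assumption-laden illustration: it uses the composite midpoint rule (not mentioned in the notes), a hypothetical matrix function A on I = [0, 1], and a hypothetical constant matrix B. Since the midpoint rule is itself linear, both sides of (I.5) agree up to rounding, and they approximate the exact integrals up to the quadrature error.

```python
def mat_mul(a, b):
    """Matrix product a * b for matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

def midpoint_integral(F, a, b, steps=1000):
    """Entrywise composite midpoint rule for a matrix-valued function F on [a, b]."""
    h = (b - a) / steps
    sample = F(a)
    total = [[0.0] * len(sample[0]) for _ in sample]
    for i in range(steps):
        value = F(a + (i + 0.5) * h)
        for r, row in enumerate(value):
            for c, entry in enumerate(row):
                total[r][c] += entry * h
    return total

# Hypothetical integrable matrix function with m = n = 2 on I = [0, 1].
def A(x):
    return [[x, 1.0], [x * x, 2 * x]]

# Hypothetical constant matrix with p = 3 rows and m = 2 columns.
B = [[1.0, 2.0], [0.0, -1.0], [3.0, 1.0]]

left = mat_mul(B, midpoint_integral(A, 0.0, 1.0))
right = midpoint_integral(lambda x: mat_mul(B, A(x)), 0.0, 1.0)
max_difference = max(
    abs(s - t) for rl, rr in zip(left, right) for s, t in zip(rl, rr)
)
```

For instance, the (1, 1) entry of both sides approximates 1 · 1/2 + 2 · 1/3 = 7/6.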

J Autonomous ODE

J.1 Equivalence of Autonomous and Nonautonomous ODE


Theorem J.1. Let G ⊆ R × Kn , n ∈ N, and f : G −→ Kn . Then the nonautonomous
ODE
y ′ = f (x, y) (J.1)
is equivalent to the autonomous ODE

y ′ = g(y), (J.2)

where
\[ g : G \longrightarrow \mathbb K^{n+1}, \quad g(y_1, \dots, y_{n+1}) := \big( 1, f(y_1, y_2, \dots, y_{n+1}) \big), \tag{J.3} \]
in the following sense:

(a) If φ : I −→ Kn is a solution to (J.1), then ψ : I −→ Kn+1 , ψ(x) := (x, φ(x)), is a
solution to (J.2).
(b) If ψ : I −→ Kn+1 is a solution to (J.2) with the property

∃ ψ1 (x0 ) = x0 , (J.4)
x0 ∈I

then φ : I −→ Kn , φ(x) := (ψ2 (x), . . . , ψn+1 (x)), is a solution to (J.1).

Proof. (a): If φ : I −→ Kn is a solution to (J.1) and ψ : I −→ Kn+1 , ψ(x) := (x, φ(x)),
then
\[ \underset{x \in I}{\forall} \quad \psi'(x) = \big( 1, \phi'(x) \big) = \big( 1, f(x, \phi(x)) \big) = g(x, \phi(x)) = g(\psi(x)), \tag{J.5} \]
showing ψ is a solution to (J.2).

(b): If ψ : I −→ Kn+1 is a solution to (J.2) with the property (J.4) and φ : I −→ Kn ,
φ(x) := (ψ2 (x), . . . , ψn+1 (x)), then ψ1′ ≡ 1 (the first component of (J.2)) together with
(J.4) implies ψ1 (x) = x for each x ∈ I and, thus,
\[ \underset{x \in I}{\forall} \quad \phi'(x) = \big( \psi_2'(x), \dots, \psi_{n+1}'(x) \big) = f\big( x, \psi_2(x), \dots, \psi_{n+1}(x) \big) = f(x, \phi(x)), \tag{J.6} \]
showing φ is a solution to (J.1). ∎

While Th. J.1 is somewhat striking and of theoretical interest, it has few useful applications
in practice, due to the unbounded first component of solutions to (J.2) (cf. Rem.
5.2).
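The construction (J.3) can be tried out with a simple numerical scheme. The sketch below is not part of the notes: the right-hand side f, the initial data, and the use of the explicit Euler method are assumptions made for the illustration. It applies the same scheme once to y′ = f (x, y) and once to the autonomous system y′ = g(y), and compares the results.

```python
def f(x, y):
    """Hypothetical right-hand side of the nonautonomous ODE (J.1) with n = 1."""
    return x + y

def g(z):
    """Right-hand side of the autonomous ODE as in (J.3): g(z1, z2) = (1, f(z1, z2))."""
    return (1.0, f(z[0], z[1]))

def euler_nonautonomous(x0, y0, h, steps):
    """Explicit Euler steps for y' = f(x, y), starting at (x0, y0)."""
    x, y = x0, y0
    for _ in range(steps):
        x, y = x + h, y + h * f(x, y)
    return x, y

def euler_autonomous(z0, h, steps):
    """Explicit Euler steps for z' = g(z), starting at z0."""
    z = z0
    for _ in range(steps):
        dz = g(z)
        z = tuple(zi + h * di for zi, di in zip(z, dz))
    return z

x0, y0 = 0.0, 1.0
direct = euler_nonautonomous(x0, y0, 0.01, 100)
lifted = euler_autonomous((x0, y0), 0.01, 100)
```

The first component of the autonomous iterate simply tracks x, mirroring ψ1 (x) = x in the proof of Th. J.1(b), and the remaining component reproduces the nonautonomous iterate.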

J.2 Integral for ODE with Discontinuous Right-Hand Side


The following Example J.2, provided by Anton Sporrer, shows Lem. 5.19 becomes false
if the hypothesis that every initial value problem for the considered ODE y ′ = f (y) has
at least one solution is omitted:
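Example J.2. Consider
\[ f : \mathbb R \longrightarrow \mathbb R, \quad f(y) := \begin{cases} 0 & \text{for } y \in \mathbb Q, \\ 1 & \text{for } y \in \mathbb R \setminus \mathbb Q, \end{cases} \tag{J.7} \]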

and the autonomous ODE y ′ = f (y). If (x0 , y0 ) ∈ R × Q, then the initial value problem
y(x0 ) = y0 has the unique solution φ : R −→ R, φ ≡ y0 ∈ Q. However, if (x0 , y0 ) ∈
R × (R \ Q), then the initial value problem y(x0 ) = y0 has no solution. Since y ′ = f (y)
has only constant solutions, every function E : R −→ R is an integral for this ODE
according to Def. 5.18. However, not every differentiable function E : R −→ R satisfies
(5.15): For example, if E(y) := y, then E ′ ≡ 1, i.e.

∀ E ′ (y)f (y) = 1 6= 0, (J.8)


y∈R\Q

showing that Lem. 5.19 does not hold for y ′ = f (y) with f according to (J.7).

K Polar Coordinates
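Recall the following functions, used in polar coordinates of the plane:
\[ r : \mathbb{R}^2 \setminus \{(0,0)\} \longrightarrow \mathbb{R}^+, \quad r(y_1, y_2) := \sqrt{y_1^2 + y_2^2}, \tag{K.1a} \]
\[
\varphi : \mathbb{R}^2 \setminus \{(0,0)\} \longrightarrow [0, 2\pi[, \quad
\varphi(y_1, y_2) := \begin{cases}
0 & \text{for } y_2 = 0,\ y_1 > 0, \\
\operatorname{arccot}(y_1 / y_2) & \text{for } y_2 > 0, \\
\pi & \text{for } y_2 = 0,\ y_1 < 0, \\
\pi + \operatorname{arccot}(y_1 / y_2) & \text{for } y_2 < 0.
\end{cases} \tag{K.1b}
\]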
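
The case distinction in (K.1b) translates directly into code. The following is a minimal sketch, assuming the branch of arccot with values in ]0, π[ (which is what makes the range of ϕ equal to [0, 2π[); the function names `arccot`, `r`, and `phi` are chosen here for illustration.

```python
import math

def arccot(t):
    """Arccotangent with values in ]0, pi[ (the branch assumed in this sketch)."""
    return math.pi / 2.0 - math.atan(t)

def r(y1, y2):
    """Radial coordinate as in (K.1a)."""
    return math.hypot(y1, y2)

def phi(y1, y2):
    """Angular coordinate as in (K.1b), with values in [0, 2*pi[."""
    if y1 == 0.0 and y2 == 0.0:
        raise ValueError("the origin has no polar angle")
    if y2 == 0.0:
        return 0.0 if y1 > 0.0 else math.pi
    if y2 > 0.0:
        return arccot(y1 / y2)
    return math.pi + arccot(y1 / y2)

for point in [(1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (-1.0, 0.0), (-1.0, -1.0), (1.0, -1.0)]:
    print(point, r(*point), phi(*point))
```

Away from the positive y1-axis, the result agrees with `math.atan2(y2, y1)` reduced modulo 2π, which gives an independent way to check the four cases.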

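Theorem K.1. Consider $f : \mathbb{R}^2 \setminus \{(0,0)\} \longrightarrow \mathbb{R}^2$ and the corresponding $\mathbb{R}^2$-valued autonomous ODE
\[ y_1' = f_1(y_1, y_2), \tag{K.2a} \]
\[ y_2' = f_2(y_1, y_2), \tag{K.2b} \]
together with its polar coordinate version
\[ r' = g_1(r, \varphi), \tag{K.3a} \]
\[ \varphi' = g_2(r, \varphi), \tag{K.3b} \]
where $g : \mathbb{R}^+ \times \mathbb{R} \longrightarrow \mathbb{R}^2$,
\[ g_1 : \mathbb{R}^+ \times \mathbb{R} \longrightarrow \mathbb{R}, \quad g_1(r, \varphi) := f_1(r\cos\varphi, r\sin\varphi)\cos\varphi + f_2(r\cos\varphi, r\sin\varphi)\sin\varphi, \tag{K.4a} \]
\[ g_2 : \mathbb{R}^+ \times \mathbb{R} \longrightarrow \mathbb{R}, \quad g_2(r, \varphi) := \frac{1}{r}\Bigl( f_2(r\cos\varphi, r\sin\varphi)\cos\varphi - f_1(r\cos\varphi, r\sin\varphi)\sin\varphi \Bigr). \tag{K.4b} \]
Let $\mu : I \longrightarrow \mathbb{R}^2$ be a solution to (K.3).

(a) Then
\[ \phi : I \longrightarrow \mathbb{R}^2, \quad \phi(x) := \bigl( \mu_1(x)\cos\mu_2(x),\ \mu_1(x)\sin\mu_2(x) \bigr), \tag{K.5} \]
is a solution to (K.2).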
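(b) If $\mu$ satisfies the initial condition
\[ \mu_1(0) = \rho, \quad \rho \in \mathbb{R}^+, \tag{K.6a} \]
\[ \mu_2(0) = \tau, \quad \tau \in \mathbb{R}, \tag{K.6b} \]
and if
\[ \eta_1 = \rho\cos\tau, \tag{K.7a} \]
\[ \eta_2 = \rho\sin\tau, \tag{K.7b} \]
then $\phi$ satisfies the initial condition
\[ \phi_1(0) = \eta_1, \tag{K.8a} \]
\[ \phi_2(0) = \eta_2. \tag{K.8b} \]
Note that $\rho > 0$ implies $(\eta_1, \eta_2) \neq (0, 0)$, and that, for $(\rho, \tau) \in \mathbb{R}^+ \times [0, 2\pi[$, (K.7) is equivalent to
\[ r(\eta_1, \eta_2) = \rho, \tag{K.9a} \]
\[ \varphi(\eta_1, \eta_2) = \tau \tag{K.9b} \]
(cf. the computations of $\phi^{-1} \circ \phi$ and of $\phi \circ \phi^{-1}$ in [Phi15, Ex. 4.19]).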

Proof. Exercise. 
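
The core of the exercise is a chain rule computation: if (µ1', µ2') = g(µ1, µ2), then differentiating (K.5) gives the velocity (g1 cos µ2 − µ1 g2 sin µ2, g1 sin µ2 + µ1 g2 cos µ2), which should coincide with f evaluated at φ. The following is a minimal sketch that checks this identity numerically at one point; the vector field `f` below is a hypothetical test field and `polar_rhs`, `cartesian_velocity` are names invented for this illustration.

```python
import math

def polar_rhs(f):
    """Build g = (g1, g2) from f = (f1, f2) following (K.4)."""
    def g(r, phi):
        c, s = math.cos(phi), math.sin(phi)
        f1, f2 = f(r * c, r * s)
        return (f1 * c + f2 * s, (f2 * c - f1 * s) / r)
    return g

def cartesian_velocity(g, r, phi):
    """Derivative of x -> (mu1 cos mu2, mu1 sin mu2) if (mu1', mu2') = g(mu1, mu2)."""
    g1, g2 = g(r, phi)
    c, s = math.cos(phi), math.sin(phi)
    return (g1 * c - r * g2 * s, g1 * s + r * g2 * c)

# Hypothetical test field, chosen only for illustration.
f = lambda y1, y2: (y1 * y1 - y2, y1 * y2 + 1.0)
g = polar_rhs(f)

r0, phi0 = 1.7, 2.3
v = cartesian_velocity(g, r0, phi0)                      # velocity predicted via (K.3)
w = f(r0 * math.cos(phi0), r0 * math.sin(phi0))          # right-hand side of (K.2)
print(v, w)
```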
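Example K.2. Consider the autonomous ODE (K.2) with
\[ f_1 : \mathbb{R}^2 \setminus \{(0,0)\} \longrightarrow \mathbb{R}, \quad f_1(y_1, y_2) := y_1 \bigl( 1 - r(y_1, y_2) \bigr) - \frac{y_2 \bigl( r(y_1, y_2) - y_1 \bigr)}{2\, r(y_1, y_2)}, \tag{K.10a} \]
\[ f_2 : \mathbb{R}^2 \setminus \{(0,0)\} \longrightarrow \mathbb{R}, \quad f_2(y_1, y_2) := y_2 \bigl( 1 - r(y_1, y_2) \bigr) + \frac{y_1 \bigl( r(y_1, y_2) - y_1 \bigr)}{2\, r(y_1, y_2)}, \tag{K.10b} \]
where $r$ is the radial polar coordinate function as defined in (K.1a). Using $g : \mathbb{R}^+ \times \mathbb{R} \longrightarrow \mathbb{R}^2$ as defined in (K.4), one obtains, for each $(\rho, \varphi) \in \mathbb{R}^+ \times \mathbb{R}$,
\begin{align*}
g_1(\rho, \varphi) &= \rho\cos\varphi\,(1 - \rho)\cos\varphi - \frac{\rho\sin\varphi\,(\rho - \rho\cos\varphi)\cos\varphi}{2\rho} \\
&\quad + \rho\sin\varphi\,(1 - \rho)\sin\varphi + \frac{\rho\cos\varphi\,(\rho - \rho\cos\varphi)\sin\varphi}{2\rho} \\
&= \rho\,(1 - \rho), \tag{K.11a} \\
g_2(\rho, \varphi) &= \sin\varphi\,(1 - \rho)\cos\varphi + \frac{\cos\varphi\,(\rho - \rho\cos\varphi)\cos\varphi}{2\rho} \\
&\quad - \cos\varphi\,(1 - \rho)\sin\varphi + \frac{\sin\varphi\,(\rho - \rho\cos\varphi)\sin\varphi}{2\rho} \\
&= \frac{1 - \cos\varphi}{2}, \tag{K.11b}
\end{align*}
such that the autonomous ODE
\[ r' = r\,(1 - r), \tag{K.12a} \]
\[ \varphi' = \frac{1 - \cos\varphi}{2} \overset{\text{[Phi16, (I.1c)]}}{=} \sin^2\frac{\varphi}{2}, \tag{K.12b} \]
is the polar coordinate version of (K.2) as defined in Th. K.1.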
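
The simplification from (K.10) to (K.11) can be cross-checked numerically. The following is a minimal sketch that evaluates g via (K.4) on a grid of sample points and compares it with the closed forms ρ(1 − ρ) and (1 − cos ϕ)/2; the grid and the helper name `max_deviation` are arbitrary choices made for this illustration.

```python
import math

def f(y1, y2):
    """Right-hand side (K.10)."""
    r = math.hypot(y1, y2)
    f1 = y1 * (1.0 - r) - y2 * (r - y1) / (2.0 * r)
    f2 = y2 * (1.0 - r) + y1 * (r - y1) / (2.0 * r)
    return f1, f2

def g(rho, phi):
    """Polar right-hand side obtained from f via (K.4)."""
    c, s = math.cos(phi), math.sin(phi)
    f1, f2 = f(rho * c, rho * s)
    return f1 * c + f2 * s, (f2 * c - f1 * s) / rho

def max_deviation(samples):
    """Largest deviation of g from the closed forms (K.11a) and (K.11b)."""
    worst = 0.0
    for rho, phi in samples:
        g1, g2 = g(rho, phi)
        worst = max(worst,
                    abs(g1 - rho * (1.0 - rho)),
                    abs(g2 - (1.0 - math.cos(phi)) / 2.0))
    return worst

samples = [(0.1 * i, 0.37 * j) for i in range(1, 30) for j in range(-20, 21)]
print(max_deviation(samples))
```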

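Claim 1. The general solution to
\[ r' = p(r), \quad p : \mathbb{R}^+ \longrightarrow \mathbb{R}, \quad p(r) := r\,(1 - r), \tag{K.13} \]
is
\[ Y_p : D_{p,0} \longrightarrow \mathbb{R}^+, \quad Y_p(x, \rho) := \frac{\rho}{\rho + (1 - \rho)\, e^{-x}}, \tag{K.14} \]
where
\[ D_{p,0} = \bigl( \mathbb{R} \times ]0, 1] \bigr) \cup \left\{ (x, \rho) : \rho > 1,\ x \in \left] -\ln\frac{\rho}{\rho - 1}, \infty \right[ \right\}. \tag{K.15} \]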

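Proof. The initial condition is satisfied, since
\[ \underset{\rho \in \mathbb{R}^+}{\forall} \quad Y_p(0, \rho) = \frac{\rho}{\rho + (1 - \rho)} = \rho. \tag{K.16} \]
The ODE is satisfied, since
\[ \underset{(x, \rho) \in D_{p,0}}{\forall} \quad Y_p'(x, \rho) = \frac{\rho\,(1 - \rho)\, e^{-x}}{\bigl( \rho + (1 - \rho)\, e^{-x} \bigr)^2} = Y_p(x, \rho) \bigl( 1 - Y_p(x, \rho) \bigr). \tag{K.17} \]
To verify the form of $D_{p,0}$, we note that the denominator in (K.14) is positive for each $x \in \mathbb{R}$ if $0 < \rho \le 1$. If $\rho > 1$, then the function $a : \mathbb{R} \longrightarrow \mathbb{R}$, $a(x) := \rho + (1 - \rho)\, e^{-x}$, is strictly increasing (note $a'(x) = (\rho - 1)\, e^{-x} > 0$) and has a unique zero at $x = -\ln\frac{\rho}{\rho - 1}$. Thus $\lim_{x \downarrow -\ln\frac{\rho}{\rho - 1}} Y_p(x, \rho) = \infty$, proving the maximality of $Y_p(\cdot, \rho)$. ▲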
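
Claim 1 lends itself to a quick numerical check. The following is a minimal sketch that evaluates the closed form (K.14), estimates the derivative by a central difference to test the ODE r' = r(1 − r), and evaluates the solution just to the right of the left endpoint from (K.15) for one ρ > 1; the helper names `blow_up_time` and `ode_residual` and the step size are assumptions of this illustration.

```python
import math

def Yp(x, rho):
    """Closed-form solution (K.14) of r' = r (1 - r) with r(0) = rho > 0."""
    return rho / (rho + (1.0 - rho) * math.exp(-x))

def blow_up_time(rho):
    """Left endpoint of the maximal interval for rho > 1, see (K.15)."""
    return -math.log(rho / (rho - 1.0))

def ode_residual(x, rho, h=1e-5):
    """Central-difference estimate of Yp'(x, rho) minus Yp (1 - Yp)."""
    dY = (Yp(x + h, rho) - Yp(x - h, rho)) / (2.0 * h)
    y = Yp(x, rho)
    return dY - y * (1.0 - y)

for rho in (0.2, 1.0, 3.0):
    print(rho, Yp(0.0, rho), ode_residual(0.5, rho))

# Just to the right of the endpoint the solution is already very large.
print(blow_up_time(3.0), Yp(blow_up_time(3.0) + 1e-6, 3.0))
```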

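Claim 2. Letting, for each $k \in \mathbb{Z}$,
\begin{align*}
x_0(\tau) &:= \frac{2\cos\tau + 2}{\sin\tau} \quad \text{for } \tau \in \mathbb{R} \setminus \{l\pi : l \in \mathbb{Z}\}, \tag{K.18a} \\
R_k &:= ]0, \pi[ + 2k\pi, \tag{K.18b} \\
L_k &:= ]-\pi, 0[ + 2k\pi, \tag{K.18c} \\
A_0 &:= \mathbb{R} \times \{2k\pi : k \in \mathbb{Z}\}, \tag{K.18d} \\
A_{0,k} &:= \mathbb{R}^- \times \{\pi + 2k\pi\}, \tag{K.18e} \\
B_{0,k} &:= \mathbb{R}^+ \times \{\pi + 2k\pi\}, \tag{K.18f} \\
A_{1,k} &:= \bigl\{ (x, \tau) \in \mathbb{R}^2 : x \in ]-\infty, x_0(\tau)[,\ \tau \in R_k \bigr\}, \tag{K.18g} \\
A_{2,k} &:= \bigl\{ (x, \tau) \in \mathbb{R}^2 : x \in ]x_0(\tau), \infty[,\ \tau \in L_k \bigr\}, \tag{K.18h} \\
B_{1,k} &:= \bigl\{ (x, \tau) \in \mathbb{R}^2 : x \in ]x_0(\tau), \infty[,\ \tau \in R_k \bigr\}, \tag{K.18i} \\
B_{2,k} &:= \bigl\{ (x, \tau) \in \mathbb{R}^2 : x \in ]-\infty, x_0(\tau)[,\ \tau \in L_k \bigr\}, \tag{K.18j} \\
C_{1,k} &:= \bigl\{ (x, \tau) \in \mathbb{R}^2 : x = x_0(\tau),\ \tau \in R_k \bigr\}, \tag{K.18k} \\
C_{2,k} &:= \bigl\{ (x, \tau) \in \mathbb{R}^2 : x = x_0(\tau),\ \tau \in L_k \bigr\}, \tag{K.18l}
\end{align*}
the general solution to
\[ \varphi' = q(\varphi), \quad q : \mathbb{R} \longrightarrow \mathbb{R}, \quad q(\varphi) := \frac{1 - \cos\varphi}{2} = \sin^2\frac{\varphi}{2}, \tag{K.19} \]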

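is
\[
Y_q : \mathbb{R}^2 \longrightarrow \mathbb{R}, \quad
Y_q(x, \tau) := \begin{cases}
\tau & \text{for } (x, \tau) \in A_0, \\
2\left( k\pi + \arctan\left( -\frac{2}{x} \right) \right) & \text{for } (x, \tau) \in A_{0,k},\ k \in \mathbb{Z}, \\
2\left( (k+1)\pi + \arctan\left( -\frac{2}{x} \right) \right) & \text{for } (x, \tau) \in B_{0,k},\ k \in \mathbb{Z}, \\
\pi + 2k\pi & \text{for } (x, \tau) = (0, \pi + 2k\pi),\ k \in \mathbb{Z}, \\
2\left( k\pi + \arctan\frac{2\sin\tau}{2\cos\tau - x\sin\tau + 2} \right) & \text{for } (x, \tau) \in A_{1,k} \cup A_{2,k},\ k \in \mathbb{Z}, \\
\pi + 2k\pi & \text{for } (x, \tau) \in C_{1,k},\ k \in \mathbb{Z}, \\
2\left( (k+1)\pi + \arctan\frac{2\sin\tau}{2\cos\tau - x\sin\tau + 2} \right) & \text{for } (x, \tau) \in B_{1,k},\ k \in \mathbb{Z}, \\
-\pi + 2k\pi & \text{for } (x, \tau) \in C_{2,k},\ k \in \mathbb{Z}, \\
2\left( (k-1)\pi + \arctan\frac{2\sin\tau}{2\cos\tau - x\sin\tau + 2} \right) & \text{for } (x, \tau) \in B_{2,k},\ k \in \mathbb{Z}.
\end{cases} \tag{K.20}
\]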

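Proof. One observes that $Y_q$ is well-defined, since
\[
\mathbb{R}^2 = A_0 \mathbin{\dot\cup} \dot{\bigcup_{k \in \mathbb{Z}}} \Bigl( \{(0, \pi + 2k\pi)\} \mathbin{\dot\cup} A_{0,k} \mathbin{\dot\cup} B_{0,k} \mathbin{\dot\cup} A_{1,k} \mathbin{\dot\cup} A_{2,k} \mathbin{\dot\cup} B_{1,k} \mathbin{\dot\cup} B_{2,k} \mathbin{\dot\cup} C_{1,k} \mathbin{\dot\cup} C_{2,k} \Bigr) \tag{K.21}
\]
and, introducing the auxiliary function
\[ \Delta : \mathbb{R}^2 \longrightarrow \mathbb{R}, \quad \Delta(x, \tau) := 2\cos\tau - x\sin\tau + 2, \tag{K.22} \]
one has
\[ \Delta(x, \tau) \neq 0 \quad \text{for each } (x, \tau) \in A_{1,k} \cup A_{2,k} \cup B_{1,k} \cup B_{2,k},\ k \in \mathbb{Z}. \tag{K.23} \]
It remains to show that, for each $\tau \in \mathbb{R}$, the function $x \mapsto Y_q(x, \tau)$ is differentiable on $\mathbb{R}$, satisfying
\[ \underset{x \in \mathbb{R}}{\forall} \quad Y_q'(x, \tau) = \frac{1 - \cos Y_q(x, \tau)}{2}, \tag{K.24} \]
and the initial condition
\[ Y_q(0, \tau) = \tau. \tag{K.25} \]
The initial condition (K.25) is satisfied, since
\[ \underset{\tau \in \{k\pi :\, k \in \mathbb{Z}\}}{\forall} \quad Y_q(0, \tau) = \tau, \tag{K.26a} \]
\[
\underset{\tau \in R_k \cup L_k,\ k \in \mathbb{Z}}{\forall} \quad
Y_q(0, \tau) = 2k\pi + 2\arctan\frac{2\sin\tau}{2\cos\tau + 2}
\overset{\text{[Phi16, (I.1d)]}}{=} 2k\pi + 2\arctan\tan\frac{\tau}{2}
= 2k\pi + 2\left( \frac{\tau}{2} - k\pi \right) = \tau. \tag{K.26b}
\]
Next, we show that, for each $\tau \in \mathbb{R}$, the function $x \mapsto Y_q(x, \tau)$ is differentiable on $\mathbb{R}$ and satisfies (K.24): For $\tau \in \{2k\pi : k \in \mathbb{Z}\}$, $x \mapsto Y_q(x, \tau)$ is constant, i.e. differentiability is clear, and
\[ \underset{x \in \mathbb{R}}{\forall} \quad \frac{1 - \cos Y_q(x, \tau)}{2} = \frac{1 - \cos(2k\pi)}{2} = 0 = Y_q'(x, \tau) \tag{K.27} \]
proves (K.24).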
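For each $\tau \in \{(2k+1)\pi : k \in \mathbb{Z}\}$, differentiability is clear in each $x \in \mathbb{R} \setminus \{0\}$. Moreover,
\[ \underset{x \in \mathbb{R} \setminus \{0\}}{\forall} \quad \left( 2\arctan\left( -\frac{2}{x} \right) \right)' = \frac{4}{x^2} \cdot \frac{1}{1 + \frac{4}{x^2}} = \frac{4}{4 + x^2}, \tag{K.28} \]
and, thus, for each $x \in \mathbb{R} \setminus \{0\}$,
\begin{align*}
\frac{1 - \cos Y_q(x, \tau)}{2}
&= \frac{1}{2} - \frac{1}{2}\cos\left( 2\arctan\left( -\frac{2}{x} \right) \right)
\overset{\text{[Phi16, (I.1e)]}}{=} \frac{1}{2} - \frac{1}{2} \cdot \frac{1 - \frac{4}{x^2}}{1 + \frac{4}{x^2}} \\
&= \frac{1}{2} - \frac{1}{2} \cdot \frac{x^2 - 4}{x^2 + 4} = \frac{8}{2\,(4 + x^2)} = \frac{4}{4 + x^2} \overset{\text{(K.28)}}{=} Y_q'(x, \tau), \tag{K.29}
\end{align*}
proving (K.24) for each $x \in \mathbb{R} \setminus \{0\}$. It remains to consider $x = 0$. One has, by L'Hôpital's rule [Phi16, Th. 9.26(a)],
\begin{align*}
\lim_{x \uparrow 0} \frac{Y_q(0, \tau) - Y_q(x, \tau)}{0 - x}
&= \lim_{x \uparrow 0} \frac{\pi + 2k\pi - 2\left( k\pi + \arctan\left( -\frac{2}{x} \right) \right)}{-x} \\
&\overset{\text{[Phi16, (9.29)], (K.28)}}{=} \lim_{x \uparrow 0} \frac{-\frac{4}{4 + x^2}}{-1} = 1 \tag{K.30}
\end{align*}
and
\begin{align*}
\lim_{x \downarrow 0} \frac{Y_q(0, \tau) - Y_q(x, \tau)}{0 - x}
&= \lim_{x \downarrow 0} \frac{\pi + 2k\pi - 2\left( (k+1)\pi + \arctan\left( -\frac{2}{x} \right) \right)}{-x} \\
&\overset{\text{[Phi16, (9.29)], (K.28)}}{=} \lim_{x \downarrow 0} \frac{-\frac{4}{4 + x^2}}{-1} = 1, \tag{K.31}
\end{align*}
showing $x \mapsto Y_q(x, \tau)$ to be differentiable in $x = 0$ with $Y_q'(0, \tau) = 1$. Due to
\[ \frac{1 - \cos(\pi + 2k\pi)}{2} = 1 = Y_q'(0, \pi + 2k\pi), \tag{K.32} \]
(K.24) also holds.
For each $\tau \in R_k \cup L_k$, the differentiability is clear in each $x \in \mathbb{R} \setminus \{x_0(\tau)\}$. Moreover, recalling $\Delta(x, \tau)$ from (K.22), one has, for each $x \in \mathbb{R} \setminus \{x_0(\tau)\}$,
\[ \left( 2\arctan\frac{2\sin\tau}{\Delta(x, \tau)} \right)' = \frac{4(\sin\tau)^2}{(\Delta(x, \tau))^2} \cdot \frac{1}{1 + \frac{4(\sin\tau)^2}{(\Delta(x, \tau))^2}} = \frac{4(\sin\tau)^2}{4(\sin\tau)^2 + (\Delta(x, \tau))^2}, \tag{K.33} \]
and, thus, for each $x \in \mathbb{R} \setminus \{x_0(\tau)\}$,
\begin{align*}
\frac{1 - \cos Y_q(x, \tau)}{2}
&= \frac{1}{2} - \frac{1}{2}\cos\left( 2\arctan\frac{2\sin\tau}{\Delta(x, \tau)} \right)
\overset{\text{[Phi16, (I.1e)]}}{=} \frac{1}{2} - \frac{1}{2} \cdot \frac{1 - \frac{4(\sin\tau)^2}{(\Delta(x, \tau))^2}}{1 + \frac{4(\sin\tau)^2}{(\Delta(x, \tau))^2}} \\
&= \frac{1}{2} - \frac{1}{2} \cdot \frac{(\Delta(x, \tau))^2 - 4(\sin\tau)^2}{(\Delta(x, \tau))^2 + 4(\sin\tau)^2} = \frac{8(\sin\tau)^2}{2 \bigl( 4(\sin\tau)^2 + (\Delta(x, \tau))^2 \bigr)} \\
&= \frac{4(\sin\tau)^2}{4(\sin\tau)^2 + (\Delta(x, \tau))^2} \overset{\text{(K.33)}}{=} Y_q'(x, \tau), \tag{K.34}
\end{align*}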

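proving (K.24) for each $x \in \mathbb{R} \setminus \{x_0(\tau)\}$. It remains to consider $x = x_0(\tau)$. For $\tau \in R_k$, we have $\sin\tau > 0$ and $x_0(\tau) > 0$, and, thus, by L'Hôpital's rule [Phi16, Th. 9.26(a)],
\begin{align*}
\lim_{x \uparrow x_0(\tau)} \frac{Y_q(x_0(\tau), \tau) - Y_q(x, \tau)}{x_0(\tau) - x}
&= \lim_{x \uparrow x_0(\tau)} \frac{\pi + 2k\pi - 2\left( k\pi + \arctan\frac{2\sin\tau}{2\cos\tau - x\sin\tau + 2} \right)}{x_0(\tau) - x} \\
&\overset{\text{[Phi16, (9.29)], (K.33)}}{=} \lim_{x \uparrow x_0(\tau)} \frac{-\frac{4(\sin\tau)^2}{4(\sin\tau)^2 + (\Delta(x, \tau))^2}}{-1} = 1 \tag{K.35}
\end{align*}
and
\begin{align*}
\lim_{x \downarrow x_0(\tau)} \frac{Y_q(x_0(\tau), \tau) - Y_q(x, \tau)}{x_0(\tau) - x}
&= \lim_{x \downarrow x_0(\tau)} \frac{\pi + 2k\pi - 2\left( (k+1)\pi + \arctan\frac{2\sin\tau}{2\cos\tau - x\sin\tau + 2} \right)}{x_0(\tau) - x} \\
&\overset{\text{[Phi16, (9.29)], (K.33)}}{=} \lim_{x \downarrow x_0(\tau)} \frac{-\frac{4(\sin\tau)^2}{4(\sin\tau)^2 + (\Delta(x, \tau))^2}}{-1} = 1, \tag{K.36}
\end{align*}
showing $x \mapsto Y_q(x, \tau)$ to be differentiable in $x = x_0(\tau)$ with $Y_q'(x_0(\tau), \tau) = 1$. Due to
\[ \frac{1 - \cos(\pi + 2k\pi)}{2} = 1 = Y_q'(x_0(\tau), \tau), \tag{K.37} \]
(K.24) also holds. For $\tau \in L_k$, we have $\sin\tau < 0$ and $x_0(\tau) < 0$, and, thus, by L'Hôpital's rule [Phi16, Th. 9.26(a)],
\begin{align*}
\lim_{x \uparrow x_0(\tau)} \frac{Y_q(x_0(\tau), \tau) - Y_q(x, \tau)}{x_0(\tau) - x}
&= \lim_{x \uparrow x_0(\tau)} \frac{-\pi + 2k\pi - 2\left( (k-1)\pi + \arctan\frac{2\sin\tau}{2\cos\tau - x\sin\tau + 2} \right)}{x_0(\tau) - x} \\
&\overset{\text{[Phi16, (9.29)], (K.33)}}{=} \lim_{x \uparrow x_0(\tau)} \frac{-\frac{4(\sin\tau)^2}{4(\sin\tau)^2 + (\Delta(x, \tau))^2}}{-1} = 1 \tag{K.38}
\end{align*}
and
\begin{align*}
\lim_{x \downarrow x_0(\tau)} \frac{Y_q(x_0(\tau), \tau) - Y_q(x, \tau)}{x_0(\tau) - x}
&= \lim_{x \downarrow x_0(\tau)} \frac{-\pi + 2k\pi - 2\left( k\pi + \arctan\frac{2\sin\tau}{2\cos\tau - x\sin\tau + 2} \right)}{x_0(\tau) - x} \\
&\overset{\text{[Phi16, (9.29)], (K.33)}}{=} \lim_{x \downarrow x_0(\tau)} \frac{-\frac{4(\sin\tau)^2}{4(\sin\tau)^2 + (\Delta(x, \tau))^2}}{-1} = 1, \tag{K.39}
\end{align*}
showing $x \mapsto Y_q(x, \tau)$ to be differentiable in $x = x_0(\tau)$ with $Y_q'(x_0(\tau), \tau) = 1$. Due to
\[ \frac{1 - \cos(-\pi + 2k\pi)}{2} = 1 = Y_q'(x_0(\tau), \tau), \tag{K.40} \]
(K.24) also holds. ▲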
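
The piecewise formula (K.20) can be exercised numerically in the generic case. The following is a minimal sketch, assuming that τ stays well away from integer multiples of π (so only the regions A1, A2, B1, B2, C1, C2 occur; floating-point values of τ near multiples of π are deliberately rejected). It uses the sign of ∆(x, τ) from (K.22) to decide on which side of x0(τ) the point lies; the function names and the tolerance are choices made for this illustration.

```python
import math

def Yq(x, tau):
    """Formula (K.20) for tau away from integer multiples of pi."""
    s = math.sin(tau)
    if abs(s) < 1e-12:
        raise ValueError("this sketch only covers tau away from multiples of pi")
    delta = 2.0 * math.cos(tau) - x * s + 2.0          # Delta(x, tau) from (K.22)
    if s > 0.0:                                        # tau in R_k
        k = math.floor(tau / (2.0 * math.pi))
        if delta > 0.0:                                # x < x0(tau): region A_{1,k}
            return 2.0 * (k * math.pi + math.atan(2.0 * s / delta))
        if delta == 0.0:                               # x = x0(tau): region C_{1,k}
            return math.pi + 2.0 * k * math.pi
        return 2.0 * ((k + 1) * math.pi + math.atan(2.0 * s / delta))   # B_{1,k}
    k = math.floor(tau / (2.0 * math.pi)) + 1          # tau in L_k
    if delta > 0.0:                                    # x > x0(tau): region A_{2,k}
        return 2.0 * (k * math.pi + math.atan(2.0 * s / delta))
    if delta == 0.0:                                   # x = x0(tau): region C_{2,k}
        return -math.pi + 2.0 * k * math.pi
    return 2.0 * ((k - 1) * math.pi + math.atan(2.0 * s / delta))       # B_{2,k}

def x0(tau):
    """The switching point (K.18a)."""
    return (2.0 * math.cos(tau) + 2.0) / math.sin(tau)

def ode_residual(x, tau, h=1e-5):
    """Central-difference estimate of Yq'(x, tau) minus (1 - cos Yq(x, tau)) / 2."""
    dY = (Yq(x + h, tau) - Yq(x - h, tau)) / (2.0 * h)
    return dY - (1.0 - math.cos(Yq(x, tau))) / 2.0

for tau in (1.0, -1.0, 4.0, 7.0):
    print(tau, Yq(0.0, tau), x0(tau), ode_residual(x0(tau) + 2.0, tau))
```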

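Claim 3. The general solution to (K.2) with $f_1, f_2$ according to (K.10) is
\[
Y : D_{f,0} \longrightarrow \mathbb{R}^2, \quad
Y(x, \eta_1, \eta_2) := \Bigl( Y_p\bigl( x, r(\eta_1, \eta_2) \bigr) \cos Y_q\bigl( x, \varphi(\eta_1, \eta_2) \bigr),\ Y_p\bigl( x, r(\eta_1, \eta_2) \bigr) \sin Y_q\bigl( x, \varphi(\eta_1, \eta_2) \bigr) \Bigr), \tag{K.41}
\]
where $r$ and $\varphi$ are given by (K.1), and
\[
D_{f,0} = \bigl( \mathbb{R} \times \{ \eta \in \mathbb{R}^2 : 0 < \|\eta\|_2 \le 1 \} \bigr)
\cup \left\{ (x, \eta) \in \mathbb{R} \times \mathbb{R}^2 : \|\eta\|_2 > 1,\ x \in \left] -\ln\frac{\|\eta\|_2}{\|\eta\|_2 - 1}, \infty \right[ \right\}. \tag{K.42}
\]

Proof. Since (K.12) is the polar coordinate version of (K.2) with $f_1, f_2$ according to (K.10), everything follows from combining Th. K.1 with Claims 1 and 2. ▲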
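Claim 4. The autonomous ODE (K.2) with $f_1, f_2$ according to (K.10) has $(1, 0)$ as its only fixed point, and $(1, 0)$ satisfies Def. 5.24(iii) for $x \to \infty$ (even for each $\eta \in \mathbb{R}^2 \setminus \{0\}$) without satisfying Def. 5.24(ii) (i.e. without being positively stable).

Proof. For each $\eta \in \mathbb{R}^2 \setminus \{0\}$, one has $r(\eta) > 0$ and, thus,
\[ \underset{\eta \in \mathbb{R}^2 \setminus \{0\}}{\forall} \quad \lim_{x \to \infty} Y_p\bigl( x, r(\eta) \bigr) = \lim_{x \to \infty} \frac{r(\eta)}{r(\eta) + \bigl( 1 - r(\eta) \bigr) e^{-x}} = 1. \tag{K.43} \]
Fix $\eta \in \mathbb{R}^2 \setminus \{0\}$. If $\varphi(\eta) = 0$, then
\[ \lim_{x \to \infty} Y_q(x, 0) = \lim_{x \to \infty} 0 = 0. \tag{K.44a} \]
If $\varphi(\eta) = \pi$, then
\[ \lim_{x \to \infty} Y_q(x, \pi) = \lim_{x \to \infty} 2\left( \pi + \arctan\left( -\frac{2}{x} \right) \right) = 2(\pi + 0) = 2\pi. \tag{K.44b} \]
If $0 < \varphi(\eta) < \pi$ or $\pi < \varphi(\eta) < 2\pi$, then $\sin\varphi(\eta) \neq 0$ and, thus,
\[ \lim_{x \to \infty} Y_q\bigl( x, \varphi(\eta) \bigr) = \lim_{x \to \infty} 2\left( \pi + \arctan\frac{2\sin\varphi(\eta)}{2\cos\varphi(\eta) - x\sin\varphi(\eta) + 2} \right) = 2(\pi + 0) = 2\pi. \tag{K.44c} \]
Using (K.43) and (K.44) in (K.41) yields
\[ \underset{\eta \in \mathbb{R}^2 \setminus \{0\}}{\forall} \quad \lim_{x \to \infty} Y(x, \eta) = (1, 0). \tag{K.45} \]
While $(1, 0)$ is clearly a fixed point for (K.2) with $f_1, f_2$ according to (K.10), (K.45) shows that no other $\eta \in \mathbb{R}^2 \setminus \{0\}$ can be a fixed point.

For each $\tau \in ]0, \pi[$ and $\eta := (\cos\tau, \sin\tau)$, one has $r(\eta) = 1$, $\varphi(\eta) = \tau$, and $Y_q(0, \varphi(\eta)) = \tau$. Thus, due to (K.44c) and the intermediate value theorem, the continuous function $Y_q(\cdot, \varphi(\eta))$ must attain every value between $\tau$ and $2\pi$. In particular, there is $x_\tau > 0$ such that $Y_q(x_\tau, \varphi(\eta)) = \pi$ and, since $Y_p(\cdot, 1) \equiv 1$, $Y(x_\tau, \eta) = (\cos\pi, \sin\pi) = (-1, 0)$. Since every neighborhood of $(1, 0)$ contains points $\eta = (\cos\tau, \sin\tau)$ with $\tau \in ]0, \pi[$, this shows that $(1, 0)$ does not satisfy Def. 5.24(ii) for $x \ge 0$. ▲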
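
The behavior described in Claim 4 can also be observed by integrating (K.2) with (K.10) directly in Cartesian coordinates. The following is a minimal sketch, assuming a classical Runge-Kutta scheme with fixed step size and the starting angle τ = 0.5 on the unit circle (any τ in ]0, π[ should show the same qualitative picture); the helper name `rk4_orbit` and all numerical parameters are assumptions of this illustration. The orbit is expected to travel around the circle past (−1, 0) before slowly approaching (1, 0) from below.

```python
import math

def f(y):
    """Right-hand side (K.10) of the planar ODE (K.2)."""
    y1, y2 = y
    r = math.hypot(y1, y2)
    return (y1 * (1.0 - r) - y2 * (r - y1) / (2.0 * r),
            y2 * (1.0 - r) + y1 * (r - y1) / (2.0 * r))

def rk4_orbit(y0, h, steps):
    """Classical Runge-Kutta steps for y' = f(y); returns all visited points."""
    def shifted(y, k, c):
        return (y[0] + c * k[0], y[1] + c * k[1])
    orbit = [y0]
    y = y0
    for _ in range(steps):
        k1 = f(y)
        k2 = f(shifted(y, k1, 0.5 * h))
        k3 = f(shifted(y, k2, 0.5 * h))
        k4 = f(shifted(y, k3, h))
        y = (y[0] + h / 6.0 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
             y[1] + h / 6.0 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))
        orbit.append(y)
    return orbit

tau = 0.5                                    # starting angle on the unit circle
orbit = rk4_orbit((math.cos(tau), math.sin(tau)), h=0.01, steps=20000)
leftmost = min(p[0] for p in orbit)          # how close the orbit gets to (-1, 0)
end = orbit[-1]                              # state at x = 200
print(leftmost, end)
```

The approach to (1, 0) is only algebraic in x, since the angular equation (K.12b) has a degenerate zero at multiples of 2π, so the final distance to the fixed point shrinks slowly as the integration time grows.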

References
[Aul04] Bernd Aulbach. Gewöhnliche Differenzialgleichungen, 2nd ed. Spektrum Akademischer Verlag, Heidelberg, Germany, 2004 (German).

[Koe03] Max Koecher. Lineare Algebra und analytische Geometrie, 4th ed. Springer-Verlag, Berlin, 2003 (German), 1st corrected reprint.

[Mar04] Nelson G. Markley. Principles of Differential Equations. Pure and Applied Mathematics, Wiley-Interscience, Hoboken, NJ, USA, 2004.

[Oss09] E. Ossa. Topologie. Vieweg+Teubner, Wiesbaden, Germany, 2009 (German).

[Phi15] P. Philip. Calculus II for Statistics Students. Lecture Notes, Ludwig-Maximilians-Universität, Germany, 2015, available in PDF format at http://www.math.lmu.de/~philip/publications/lectureNotes/calc2_forStatStudents.pdf.

[Phi16] P. Philip. Analysis I: Calculus of One Real Variable. Lecture Notes, Ludwig-Maximilians-Universität, Germany, 2015/2016, available in PDF format at http://www.math.lmu.de/~philip/publications/lectureNotes/analysis1.pdf.

[Pre75] G. Preuß. Allgemeine Topologie, 2nd ed. Springer-Verlag, Berlin, 1975 (German).

[Put66] E.J. Putzer. Avoiding the Jordan Canonical Form in the Discussion of Linear Systems with Constant Coefficients. The American Mathematical Monthly 73 (1966), No. 1, 2–7.

[Str08] Gernot Stroth. Lineare Algebra, 2nd ed. Berliner Studienreihe zur Mathematik, Vol. 7, Heldermann Verlag, Lemgo, Germany, 2008 (German).

[Wal02] Wolfgang Walter. Analysis 2, 5th ed. Springer-Verlag, Berlin, 2002 (German).